
PerfectLIF model with improved performance #975

Merged: 1 commit from perfect_lif into master, May 25, 2016
Conversation

@arvoelke (Contributor) commented Mar 3, 2016

This is in some ways a natural continuation of #511 (@hunse).

In the above PR, the given math works off the assumption that the input current J is piece-wise constant over each time-step. From this, we can use the discretized lowpass filter with time-constant tau_rc to compute the exact voltage update at t = dt:

v(t) = v(0) + (J - v(0)) * (1 - exp(-t/tau))

But why stop there? As suggested in #511, the way we currently compute overshoot uses a linear interpolation scheme, whereas the previous equation can be used to solve for the exact amount of overshoot (by substituting v(0) = 1 and rearranging for t).
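
Concretely, substituting v(0) = 1 into the first equation and solving for the time t* spent above threshold (where v is the end-of-step voltage):

t* = -tau_rc * log(1 - (v - 1) / (J - 1)) = -tau_rc * log1p(-(v - 1) / (J - 1))

so the spike occurred at time dt - t* within the step.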

But to make this work well, we must also deal with another issue: we currently handle partially remaining refractory periods by multiplicatively scaling the voltage, which is yet again a linear approximation. Instead, we can apply the same trick and substitute the amount of leftover time as t into the first equation (basically using a different time-step per neuron).

This makes the neuron model's spike rate invariant to dt, under the assumptions that dt <= tau_ref and the input is constant across each time-step (aside: we can in fact relax the first assumption with O(1) additional processing, although the approach becomes more complicated). We can thus think of it as a perfect discretization for digital hardware. By 'perfect', I mean the spike count differs from the expected count by < 1 at all times.
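
For concreteness, here is a minimal sketch of the resulting update, condensed from the pieces discussed in this thread (parameter names follow Nengo's LIF conventions; the actual PR code also clamps negative voltages, which is omitted here):

import numpy as np

def perfect_lif_step(dt, J, voltage, refractory_time, tau_rc=0.02, tau_ref=0.002):
    # each neuron integrates only over the non-refractory part of the
    # step, i.e. a different effective time-step per neuron
    refractory_time -= dt
    delta_t = (dt - refractory_time).clip(0, dt)

    # exact solution of tau_rc * dv/dt = J - v over delta_t, with J held
    # constant: v(t) = v(0) + (J - v(0)) * (1 - exp(-t / tau_rc))
    voltage -= (J - voltage) * np.expm1(-delta_t / tau_rc)

    spiked_mask = voltage > 1

    # invert the same equation with v(0) = 1 to recover the exact
    # threshold-crossing time within the step
    t_spike = dt + tau_rc * np.log1p(
        -(voltage[spiked_mask] - 1) / (J[spiked_mask] - 1))

    # reset, and start the refractory period at the true spike time
    voltage[spiked_mask] = 0
    refractory_time[spiked_mask] = tau_ref + t_spike
    return spiked_mask / dt  # Nengo scales spike trains by 1/dt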

Some testing reveals that this reduces the rmse(true_rates * T, sum_spikes) versus the current LIF by as much as 90% after T = 2 seconds of simulation! This is because the current model will consistently spike slightly faster or slightly slower to accumulate O(t) error (see Efficient and Accurate Time-Stepping Schemes for Integrate-and-Fire Neuronal Networks, JCNS 2001).

This improvement was discovered while identifying a somewhat serious problem with drift during integration (thanks to Wilten Nicola). Any small errors from maintaining the refractory_period will add up over time, causing systematic drift in one direction.

[Image: integrator_drift]

The improved neuron model reduces drift almost to zero, and this same qualitative result occurs over a wide range of seeds. So the main idea here is that this gives us the ability to run simulations at a higher time-step without sacrificing performance (normally you need to drop dt to see the same reduction in drift).
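
As a rough sketch of the kind of test behind the figure (a hypothetical integrator setup, not the exact script used):

import nengo

tau = 0.1
with nengo.Network(seed=0) as model:
    # kick the integrator up to ~0.5, then hold with zero input
    stim = nengo.Node(lambda t: 5.0 if t < 0.1 else 0.0)
    x = nengo.Ensemble(1000, 1)
    nengo.Connection(stim, x, transform=tau, synapse=tau)
    nengo.Connection(x, x, synapse=tau)  # principle-3 integrator
    p = nengo.Probe(x, synapse=0.05)

sim = nengo.Simulator(model, dt=0.001)
sim.run(10.0)

# with a perfect integrator the held value stays flat; systematic
# spike-count error shows up as a slow drift in one direction
values = sim.data[p][:, 0]
print("drift over the hold period:", values[-1] - values[1000])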

But one drawback that I am aware of is that this model can slow down the network by a non-negligible amount. Some tests on my machine with an integrator of 1000 neurons gave a slowdown of ~20%. I think this is due to the expm1 and log1p being applied to arrays. :(

If you are eager to see if this improves any of your models, as an interim solution you can make this the default neuron model by installing NengoLib and replacing nengo.Network() with nengolib.Network().

@celiasmith (Contributor)

Is there a reason we wouldn't just make this the default LIF? (It would then be used by AdaptiveLIF as well.) The slowdown isn't very large, and people can always increase dt to make that up pretty easily (in fact, you can probably increase dt a lot for the same accuracy).

@tbekolay (Member) commented Mar 3, 2016

I was going to ask the same question! 20% slowdown is pretty negligible, IMO.

@drasmuss (Member) commented Mar 3, 2016

It's closer to a 50% slowdown if you just look at the neuron update function, but for larger models (where you care about the simulation time), the neuron update function tends to be a negligible part of the overall run time (most of it is the weight multiplications). So I think it'd be worth it to make this the default LIF model, and then if someone is trying to optimize they can always use the less accurate version.

I also took a look to see if there were any obvious optimizations, but nothing significant (and most of the optimizations were equally applicable to the standard LIF).

PS here's the quick benchmarking script I was using:

import time
from cProfile import Profile
import pstats

import nengo
import numpy as np

N = 10000
neuron = nengo.neurons.LIF()

times = []
for _ in range(10):
    J = np.random.uniform(0, 1, size=N)
    spiked = np.zeros(N)
    voltage = np.zeros(N)
    refractory_time = np.zeros(N)

    # p = Profile()
    # p.enable()

    start = time.time()
    for _ in range(10000):
        neuron.step_math(0.001, J, spiked, voltage, refractory_time)
    times += [time.time() - start]

    # p.disable()

print(times)
print(np.mean(times))
print(np.min(times))

# ps = pstats.Stats(p)
# ps.strip_dirs().sort_stats('time').print_stats(20)

@hunse (Collaborator) commented Mar 3, 2016

So in #511 I said I looked at using the exponential interpolation, and found that it didn't help things much. I'm wondering what's different now? I guess the fact that we're looking at errors over time, when even small errors can accumulate into significant ones? Also, I can't remember exactly what I did at the time and it's possible I made a mistake. Either way, these results seem like a significant improvement.

Is the 20% slowdown for the whole model, or just for the LIF model? If it's for the whole model, I think that means the LIFs would be 2-3 times slower (just a guess), since they are usually a less important part of the model. (Thanks @drasmuss for checking this. Obviously it's not as bad as I guessed, but it still is something.)

My other concern is that this will be harder to put on other hardware (e.g. GPUs), or that it will result in a more considerable slowdown on other hardware. Of course, our models don't have to run exactly the same on all hardware, but it can be nice to have them be as close as possible. That would be one reason to keep them separate. I'd be fine making this one the default, renaming it to LIF and having the current one become FastLIF or something like that.

@arvoelke (Contributor, Author) commented Mar 3, 2016

@hunse: I'm guessing you would have changed the linear interpolation for the overshoot, but kept the refractory periods scaling the voltage multiplicatively (which is again linear; see description at top). All of these things have to work together in just the right way to prevent any error from accumulating.

The main issue for me was the slow-down. But if people are fine with that, and happy with the level of validation that I've done, I would also like this to be the new default!

That said, I am a little wary of this just because it's a rather significant core change. It would be nice if it could get some "battle testing" just to make sure this doesn't break existing large models that might have constants tweaked to work with the current model.

One other thing I'm thinking of now is if there are possible numerical problems with:

t_spike = dt + self.tau_rc * np.log1p( -(spiked_v - 1) / (J[spiked_mask] - 1))

I am not sure if J[spiked_mask] == 1 could occur, and if so whether this is problematic. A solution involves changing the log1p(x) to log(1 + x), turning the inside into a fraction, and then converting log(a/b) to log(a) - log(b), but this is likely even slower (and may even be numerically worse in some situations)? Edit: Actually, this doesn't help, because log(b) = -inf.
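
To make the hazard concrete, a throwaway check with made-up values:

import numpy as np

dt, tau_rc = 0.001, 0.02
v = np.float64(1.0 + 1e-15)  # end-of-step voltage just above threshold
for J in np.array([1.5, 1.0]):
    with np.errstate(divide='ignore', invalid='ignore'):
        t_spike = dt + tau_rc * np.log1p(-(v - 1) / (J - 1))
    print(J, t_spike)  # J == 1 divides by zero and yields nan

With the strict threshold voltage > 1 discussed in the review comment below, a spike implies J > 1 (the update keeps v between v(0) and J), so the division is always well defined.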

It's also worth mentioning that even though you can increase dt without hurting the spike rate, from the perspective of the post-synaptic neurons the "constant input across each time-step" assumption becomes more easily violated. In essence, what we are doing here is distributing each filtered spike evenly across the time-step (turning it into an aligned rectangle) as input to the next neuron, and then assuming that this is indistinguishable from a filtered impulse at the true spike time. This approximation gets worse as dt becomes larger, but I don't yet have an analytical handle on how bad this gets in terms of dt and the synaptic tau.

@tcstewar (Contributor) commented Mar 3, 2016

That said, I am a little wary of this just because it's a rather significant core change. It would be nice if it could get some "battle testing" just to make sure this doesn't break existing large models that might have constants tweaked to work with the current model.

I like the idea of battle-testing this a bit before making it the new default. :) I'm very curious what effects it has!

voltage -= (J - voltage) * np.expm1(-delta_t / self.tau_rc)

# determine which neurons spiked (set them to 1/dt, else 0)
spiked_mask = voltage >= 1
@arvoelke (Contributor, Author) commented on the diff:

Think this needs to be changed back to > 1. For example, when the intercept is 0, then the bias is 1, which means x = 0 => J = 1 => v = 1 while there should be no spiking by definition of the intercept. This may also fix the problem with L286, since a spike shouldn't occur unless J > 1. (Note: in theory it should still be >= 1, but numerically the lowpass filter will round it up to 1 eventually).

@hunse (Collaborator) replied:

Will the lowpass filter round up eventually? I just played around with it a bit, and I think it depends on dt and tau_rc. Specifically, if np.abs(np.expm1(-dt / tau_rc)) > 0.5, then it will round up (which is the case for dt / tau_rc > 0.66 or so), but otherwise it will never round up.

Anyway, changing it is fine. Theoretically, it doesn't make any difference, and numerically it should hardly make a difference either.
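
A quick way to reproduce this observation (a throwaway script):

import numpy as np

for dt, tau_rc in [(0.001, 0.02), (0.02, 0.02)]:  # dt/tau_rc = 0.05 and 1.0
    v, J = np.float64(0.0), np.float64(1.0)  # input exactly at threshold
    for _ in range(10 ** 5):
        v -= (J - v) * np.expm1(-dt / tau_rc)
        if v >= 1:
            break
    print("dt/tau_rc = %.2f -> reaches 1.0: %s" % (dt / tau_rc, bool(v >= 1)))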

@arvoelke (Contributor, Author) commented Mar 7, 2016

Renamed LIF -> FastLIF, and PerfectLIF -> LIF to make this the new default. Added some tests to show that the "spike count invariant" is maintained under 1ms and 2ms time-steps. Think I also resolved the potential numerical issue.

This should also slightly improve the AdaptiveLIF model. Note that further improvements should be possible by playing the same tricks with the adaptive filter (n += (dt / self.tau_n) * (self.inc_n * output - n)). Will open an issue for that once this is settled.
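
For reference, the exact update for that filter would look something like this (a sketch under the same constant-input assumption; not part of this PR, and the tau_n/inc_n defaults here are illustrative):

import numpy as np

def adapt_euler(n, output, dt, tau_n=1.0, inc_n=0.01):
    # current Euler update for tau_n * dn/dt = inc_n * output - n
    return n + (dt / tau_n) * (inc_n * output - n)

def adapt_zoh(n, output, dt, tau_n=1.0, inc_n=0.01):
    # exact zero-order-hold update of the same dynamics, holding the
    # spike output constant over the step (same trick as the voltage)
    return n - (inc_n * output - n) * np.expm1(-dt / tau_n)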

@hunse (Collaborator) commented Mar 7, 2016

The other option, and I'm actually preferring this now, would be to give LIF neurons an integration option, that sets how we solve the ODEs. Because these are theoretically all the same model, just with different ways of solving the equations. Then AdaptiveLIF will inherit this option, and we can do the proper integration there, too. If people think that sounds good, I can throw together a proposal pretty quick.

We had talked about having neurons have a make_step function, rather than a step_math. That could make some of what I'm suggesting here easier to implement, but it's a significant change, and definitely something for a later PR. There are a number of issues with that too, that we'd have to work out. Just want to keep it on people's radar, in case anybody has ideas or gets ambitious and wants to try it out.

@arvoelke (Contributor, Author) commented Mar 7, 2016

I'm guessing that would be for nengo_ocl reasons, so that the two adaptive implementations are consistent? I can't see any reason other than speed/implementation to go with the current model (although maybe the battle-testing will indicate otherwise). This is because given the "constant input across each time-step assumption" there should exist no better proxy for the LIFRate model, regardless of dt.

(*) I can however imagine another way to solve for the LIF equations, that better deals with the fact that the above assumption is more easily violated for larger dt and for smaller synaptic tau. You could assume the input is filtered whitenoise with mean J and variance given by the number of presynaptic neurons, and then do some math to find the expected mismatch in the integral due to rectification and spike lag. Accounting for this mismatch should then give a model that more accurately reflects the continuous case whenever it is receiving filtered spikes from another population (or from itself) with high dt and/or small synaptic tau.

That said, across all three LIF cases, the way we may want to include adaptation may differ, due to the interplay between the two filters. In the original case, we probably want to keep things how they are currently. In the perfect case, we probably need to use the computed t_spike times to advance each adapting filter the correct amount of time. In the case of (*), we probably need to use the results of the proposed analysis to do something different yet again. Therefore, at the moment it's not clear to me if such an abstraction can be done in a way that is convincingly going to be more useful in the future, versus what we have right now.

Considering the above, I'd rather make another AdaptiveLIF versus FastAdaptiveLIF distinction for the time being.

@hunse (Collaborator) commented Mar 7, 2016

I'm afraid I don't really follow you at all. I was just proposing a different way of arranging what we have in this PR. Rather than have two classes (LIF and FastLIF), have one LIF class with a fast parameter that allows using the old integration scheme. (I would maybe call the parameter something different, but nothing is jumping out right now...)

@tbekolay (Member) commented Mar 7, 2016

There are tons of integration methods (e.g., Runge-Kutta) so a scheme with more than two possibilities would be good. Not that I think we should bother with these other methods, but some people might want to.

@hunse (Collaborator) commented Mar 7, 2016

Yep, very true. I think @arvoelke is right and that there is no better integration scheme given our assumption of constant input across the time step. But there's no reason we have to assume this. If we made neurons have a make_step function, then they can handle their own state and store some of their previous inputs too, allowing them to interpolate and possibly get better integration. (A Runge-Kutta implementation would do this, I think.)

So do we want an integration enum parameter? What should we call our current options for the LIF neuron type?

@tbekolay (Member) commented Mar 7, 2016

I think that makes sense. I think I'd call what we have now 'euler' and the method introduced here as 'accurate'?

I wonder at what level the integration parameter should live? Probably for now it makes sense to have it just for LIF and its subclasses, but later on we could generalize it to NeuronType, or get rid of it in the frontend and enable it through some config flags in the backend (this integration scheme only works for the reference simulator at the moment right?)

@hunse (Collaborator) commented Mar 7, 2016

The method we have now is called 'zoh' (for zero-order hold) in Scipy (see cont2discrete).

The problem with putting it on NeuronType is that the available methods will differ based on type. I was wondering if we could actually do the integration at the level of the simulator, express all of our integration problems like dx/dt = f(t, x) and have the neuron type return f(t, x), so that the simulator can do any type of integration for that function. But I think this is difficult in the case of discontinuous ODEs like ours. So I'd keep the parameter at the LIF level for now, and probably for a while unless we have some big changes.
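
For example, the correspondence can be checked directly (the state-space form of the membrane equation here is my own):

import numpy as np
from scipy.signal import cont2discrete

tau_rc, dt = 0.02, 0.001
# membrane dynamics tau_rc * dv/dt = J - v as a state-space system
A, B, C, D = [[-1.0 / tau_rc]], [[1.0 / tau_rc]], [[1.0]], [[0.0]]
Ad, Bd, Cd, Dd, _ = cont2discrete((A, B, C, D), dt, method='zoh')
print(Ad[0, 0], np.exp(-dt / tau_rc))     # both exp(-dt / tau_rc)
print(Bd[0, 0], -np.expm1(-dt / tau_rc))  # both 1 - exp(-dt / tau_rc)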

@jgosmann (Collaborator) commented Mar 7, 2016

fyi: I submitted jobs to Sharcnet to test this with my n-back model.

@arvoelke (Contributor, Author) commented Mar 7, 2016

fyi: I submitted jobs to Sharcnet to test this with my n-back model.

Sweet, thanks!

I'm afraid I don't really follow you at all.

I guess what I am trying to say is that I can only imagine 3 different methods right now (fast, accurate, and dynamic (*)), and the third is somewhat speculative at the moment and would require changes to the builder. The reason we shouldn't include more is fairly subtle, but I've been thinking about this pretty hard for a few days now, and what I've come to realize should hopefully address these three remarks:

... there is no better integration scheme given our assumption of constant input across the time step. But there's no reason we have to assume this... they can handle their own state and store some of their previous inputs too, allowing them to interpolate and possibly get better integration.

There are tons of integration methods (e.g., Runge-Kutta)...

... express all of our integration problems like dx/dt = f(t, x) and have the neuron type return f(t, x), so that the simulator can do any type of integration for that function.

There are some details I'm probably glossing over to get to this point, but let's give this a shot. There is a reason we can't use any of these integration methods properly in Nengo. If you look at the article I cited originally, or the Wiki for RK4, all of these methods require that we evaluate f(t + dt, *). The problem is that we can only evaluate f(t, *), because we don't yet have the input for the next time-step! Euler's method does not suffer from this problem, since it only requires f(t, *). The way we would deal with this in practice is to wait one extra time-step (i.e. shift to previous inputs), which will only work correctly for feed-forward networks. As soon as you add recurrent connections, you've introduced an extra delay on the input to your neuron model.
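
To see the issue in code form (a hypothetical helper, not a proposed API):

def rk2_midpoint_step(v, J_now, J_mid, dt, tau_rc):
    # midpoint (RK2) step for tau_rc * dv/dt = J(t) - v: it requires
    # J_mid = J(t + dt/2), an input the simulator has not produced yet
    k1 = (J_now - v) / tau_rc
    k2 = (J_mid - (v + 0.5 * dt * k1)) / tau_rc
    return v + dt * k2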

One might feel tempted to use RK2 with the same input at both ends, but then this reduces to the same constant input assumption, implying that the second method should be used instead. It's also tempting to think that we can use a method that only requires f(t + a, *) for 0 <= a < dt (note the inequality is strict), but even this is not doable without either reasserting the same assumption, changing the effective dt, or propagating some extra information (which spikes from the presynaptic population occurred before a, and which occurred after). I would want to see an argument for the use of such methods versus just dropping dt before considering this further. There are also some points related to the synapse regarding why this shouldn't be done, but what I've said so far should suffice.

All-in-all, since these points are subtle, we probably shouldn't add any mechanisms to the codebase that would encourage someone to try something like this unless they are already in the situation of thinking about it very carefully.

If we want to handle recurrent networks accurately for larger dt we should do something similar in spirit to (*). I currently cannot imagine that we would want anything more than these three general cases (fast, accurate, and dynamic). I suspect most of the changes to neurons.py will be due to hardware, which IMO should be addressed as they arise rather than preemptively in this PR.

@jgosmann (Collaborator)

Preliminary results from the n-back model (some trials failed and need to be rerun, not sure if this influences the results):

[Image: perfect-lif]
(optimized = spaopt, default = no spaopt, perfect lif = spaopt + perfect lif, sd = subdimensions per ensemble)

It seems that the perfect LIF increases performance slightly and reduces variance.

@arvoelke (Contributor, Author)

Hmm.. neat! @jgosmann just wondering briefly how you included the model? Was it a global config change, applied to particular ensembles, or just used this branch?

@jgosmann (Collaborator)

I merged my n-back branch (some version of spaopt based on an older master) with perfect-lif. So all ensembles should have made use of perfect LIF neurons.

@jgosmann (Collaborator)

Updated plot with all trials (the default condition is removed since it is irrelevant for the comparison):

[Image: perfect-lif, updated]

@arvoelke (Contributor, Author)

Cool, thanks for checking this! Glad it didn't make anything worse.

However, playing the skeptic here: do you think there's any chance that merging this branch introduced some other nengo change or bugfix that could have caused the slight improvement? Possible candidates might include any changes to SPA?

Or did you just cherry-pick the two commits?

@jgosmann (Collaborator)

do you think there's any chance that merging this branch introduced some other nengo change or bugfix that could have caused the slight improvement?

I think it's unlikely. But I can push the relevant branches if you want to go through the diffs.

@arvoelke (Contributor, Author)

Nah don't worry about it. Just wondering if anything came to mind. This is a good sign! Thanks again.

@Seanny123 (Contributor)

Is this still being discussed or does it just need to be rebased/reviewed?

@arvoelke (Contributor, Author)

Also for consideration: I've been using this for over two months and haven't had any issues. In fact it's eliminated a relatively major source of error when applying principle 3 in spiking mode, leading to much better approximations when used in conjunction with the discretized version of principle 3.

@tbekolay (Member)

I've been using this for over two months and haven't had any issues

I wonder if we should move FastLIF to https://github.com/nengo/nengo_extras and just go with the updated LIF model for Nengo?

@hunse (Collaborator) commented May 24, 2016

I wonder if we should move FastLIF to https://github.com/nengo/nengo_extras and just go with the updated LIF model for Nengo?

I'd be fine with that for now, and get this merged. We can always add integration options later.

tbekolay added a commit to nengo/nengo-extras that referenced this pull request May 25, 2016
@tbekolay (Member)

OK, I've rebased this, copied much of the original post to the commit message, and moved FastLIF to nengo_extras (nengo/nengo-extras#10).

Will merge after merging #1046 and CIs are done unless there are objections.

@tbekolay force-pushed the perfect_lif branch 3 times, most recently from 4b8547f to d7b655d (May 25, 2016 16:13)
@tbekolay tbekolay merged commit a88aef3 into master May 25, 2016
@tbekolay tbekolay deleted the perfect_lif branch May 25, 2016 17:26
shaunren added a commit to shaunren/nengo_ocl that referenced this pull request Jul 31, 2016
This commit implements the more accurate LIF model, implemented in Nengo
2.1.1, in OpenCL.

A new boolean argument `fastlif' is also added in plan_lif, which
defaults to False.

See <nengo/nengo#975> for details regarding the
new LIF model.

Signed-off-by: Shaun Ren <shaun.ren@linux.com>
hunse pushed a commit to nengo/nengo-ocl that referenced this pull request Aug 17, 2016