PerfectLIF model with improved performance #975
Conversation
Is there a reason we wouldn't just make this the default LIF? (It would then be used by adapting LIF as well.) The increase in run time isn't very large, and people can always increase `dt` to make that up pretty easily (in fact, you can probably increase `dt` a lot for the same accuracy).
I was going to ask the same question! 20% slowdown is pretty negligible, IMO.
It's closer to a 50% slowdown if you just look at the neuron update function, but for larger models (where you care about the simulation time), the neuron update function tends to be a negligible part of the overall run time (most of it is the weight multiplications). So I think it'd be worth it to make this the default LIF model, and then if someone is trying to optimize they can always use the less accurate version. I also took a look to see if there were any obvious optimizations, but nothing significant (and most of the optimizations were equally applicable to the standard LIF). PS here's the quick benchmarking script I was using:

```python
import time
from cProfile import Profile
import pstats

import nengo
import numpy as np

N = 10000
neuron = nengo.neurons.LIF()

times = []
for _ in range(10):
    J = np.random.uniform(0, 1, size=N)
    spiked = np.zeros(N)
    voltage = np.zeros(N)
    refractory_time = np.zeros(N)

    # p = Profile()
    # p.enable()
    start = time.time()
    for _ in range(10000):
        neuron.step_math(0.001, J, spiked, voltage, refractory_time)
    times += [time.time() - start]
    # p.disable()

print(times)
print(np.mean(times))
print(np.min(times))

# ps = pstats.Stats(p)
# ps.strip_dirs().sort_stats('time').print_stats(20)
```
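To time this branch's model with the same harness, presumably only the constructor line changes. This swap is hypothetical and assumes the branch exposes the new class as `PerfectLIF` (going by the PR title):

```python
# hypothetical: point the benchmark at the branch's model instead
neuron = nengo.neurons.PerfectLIF()
```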
So in #511 I said I looked at using the exponential interpolation, and found that it didn't help things much. I'm wondering what's different now? I guess the fact that we're looking at errors over time, when even small errors can accumulate into significant ones? Also, I can't remember exactly what I did at the time and it's possible I made a mistake. Either way, these results seem like a significant improvement. Is the 20% slowdown for the whole model, or just for the LIF model? If it's for the whole model, I think that means the LIFs would be 2-3 times slower (just a guess), since they are usually a less important part of the model. (Thanks @drasmuss for checking this. Obviously it's not as bad as I guessed, but it still is something.) My other concern is that this will be harder to put on other hardware (e.g. GPUs), or that it will result in a more considerable slowdown on other hardware. Of course, our models don't have to run exactly the same on all hardware, but it can be nice to have them be as close as possible. That would be one reason to keep them separate. I'd be fine making this one the default, renaming it to
@hunse: I'm guessing you would have changed the linear interpolation for the overshoot, but kept the refractory periods scaling the voltage multiplicatively (which is again linear; see description at top). All of these things have to work together in just the right way to prevent any error from accumulating. The main issue for me was the slowdown. But if people are fine with that, and happy with the level of validation that I've done, I would also like this to be the new default! That said, I am a little wary of this just because it's a rather significant core change. It would be nice if it could get some "battle testing" just to make sure this doesn't break existing large models that might have constants tweaked to work with the current model. One other thing I'm thinking of now is whether there are possible numerical problems with:
I am not sure if It's also worth mentioning that even though you can increase
I like the idea of battle-testing this a bit before making it the new default. :) I'm very curious what effects it has!
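For concreteness, here is the exact overshoot solve being discussed, worked out from the equation in the description at top. This is my own illustration, not code from this PR:

```python
import numpy as np

# Setting v(t*) = 1 in v(t) = v0 + (J - v0) * (1 - exp(-t / tau_rc))
# and rearranging gives the exact threshold-crossing time (for J > 1):
#     t* = tau_rc * log((J - v0) / (J - 1))
tau_rc, J, v0 = 0.02, 1.5, 0.8
t_star = tau_rc * np.log((J - v0) / (J - 1))

# sanity check: plugging t* back into the voltage equation recovers 1
v = v0 + (J - v0) * -np.expm1(-t_star / tau_rc)
assert np.isclose(v, 1.0)
```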
```python
voltage -= (J - voltage) * np.expm1(-delta_t / self.tau_rc)

# determine which neurons spiked (set them to 1/dt, else 0)
spiked_mask = voltage >= 1
```
Think this needs to be changed back to `> 1`. For example, when the intercept is `0`, then the bias is `1`, which means `x = 0 => J = 1 => v = 1`, while there should be no spiking by definition of the intercept. This may also fix the problem with L286, since a spike shouldn't occur unless `J > 1`. (Note: in theory it should still be `>= 1`, but numerically the lowpass filter will round it up to 1 eventually.)
Will the lowpass filter round up eventually? I just played around with it a bit, and I think it depends on `dt` and `tau_rc`. Specifically, if `np.abs(np.expm1(-dt / tau_rc)) > 0.5`, then it will round up (which is the case for `dt / tau_rc > 0.66` or so), but otherwise it will never round up. Anyway, changing it is fine. Theoretically, it doesn't make any difference, and numerically it should hardly make a difference either.
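A quick standalone check of that claim (my own loop, not Nengo code): drive a single voltage toward `J = 1` with the exact update and see whether floating point ever rounds it up to exactly `1.0`.

```python
import numpy as np

def settles_to_one(dt, tau_rc, steps=100000):
    decay = np.expm1(-dt / tau_rc)  # exp(-dt/tau_rc) - 1, always negative
    v = 0.0
    for _ in range(steps):
        v -= (1.0 - v) * decay  # i.e., v += (1 - v) * (1 - exp(-dt/tau_rc))
        if v >= 1.0:
            return True
    return False

print(settles_to_one(dt=0.001, tau_rc=0.02))  # False: |decay| ~ 0.049 < 0.5
print(settles_to_one(dt=0.02, tau_rc=0.02))   # True:  |decay| ~ 0.632 > 0.5
```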
Renamed This should also slightly improve the
The other option, and I'm actually preferring this now, would be to give LIF neurons an integration option that sets how we solve the ODEs, because these are theoretically all the same model, just with different ways of solving the equations. Then We had talked about having neurons have a
I'm guessing that would be for (*) I can however imagine another way to solve for the LIF equations that better deals with the fact that the above assumption is more easily violated for larger That said, across all three LIF cases, the way we may want to include adaptation may differ, due to the interplay between the two filters. In the original case, we probably want to keep things how they are currently. In the perfect case, we probably need to use the computed Considering the above, I'd rather make another
I'm afraid I don't really follow you at all. I was just proposing a different way of arranging what we have in this PR. Rather than have two classes (
There are tons of integration methods (e.g., Runge-Kutta), so a scheme with more than two possibilities would be good. Not that I think we should bother with these other methods, but some people might want to.
Yep, very true. I think @arvoelke is right and that there is no better integration scheme given our assumption of constant input across the time step. But there's no reason we have to assume this. If we made neurons have a So do we want an
I think that makes sense. I think I'd call what we have now I wonder at what level the
The method we have now is called The problem with putting it on
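To make sure we're picturing the same proposal, here is roughly the shape I have in mind. This is entirely hypothetical; no `integration` argument exists in Nengo today:

```python
import nengo

# hypothetical API sketch: an `integration` option on the neuron type,
# selecting between the current fast method and the accurate one
with nengo.Network() as model:
    ens = nengo.Ensemble(100, dimensions=1,
                         neuron_type=nengo.LIF())  # what exists now
    # proposed (does not exist): nengo.LIF(integration='accurate')
```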
FYI: I submitted jobs to Sharcnet to test this with my n-back model.
Sweet, thanks!
I guess what I am trying to say is that I can only imagine 3 different methods right now (fast, accurate, and dynamic (*)), and the third is somewhat speculative at the moment and would require changes to the builder. The reason we shouldn't include more is fairly subtle, but I've been thinking about this pretty hard for a few days now, and what I've come to realize should hopefully address these three remarks:
There are some details I'm probably glossing over to get to this point, but let's give this a shot. There is a reason we can't use any of these integration methods properly in Nengo. If you look at the article I cited originally, or the Wiki for RK4, all of these methods require that we evaluate One might feel tempted to use RK2 with the same input at both ends, but then this reduces to the same constant-input assumption, implying that the second method should be used instead. It's also tempting to think that we can use a method that only requires All in all, since these points are subtle, we probably shouldn't add any mechanisms to the codebase that would encourage someone to try something like this unless they are already in the situation of thinking about it very carefully. If we want to handle recurrent networks accurately for larger
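To spell out the RK2 point with a worked example (my own illustration, not taken from the cited article):

```python
# RK2 (midpoint) for v' = f(t, v) = (J(t) - v) / tau_rc:
#   k1 = f(t, v)
#   k2 = f(t + dt/2, v + (dt/2) * k1)   # needs J at t + dt/2!
#   v_next = v + dt * k2
# Nengo only hands step_math the value of J at the start of the step,
# so k2 would silently reuse J(t) -- collapsing RK2 back to the
# constant-input assumption that the exact update already solves.
```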
Hmm... neat! @jgosmann, just wondering briefly: how did you include the model? Was it a global config change, applied to particular ensembles, or did you just use this branch?
I merged my
Cool, thanks for checking this! Glad it didn't make anything worse. However, playing the skeptic here: do you think there's any chance that merging this branch introduced some other nengo change or bugfix that could have caused the slight improvement? Possible candidates might include any changes to SPA. Or did you just cherry-pick the two commits?
I think it's unlikely. But I can push the relevant branches if you want to go through the diffs.
Nah, don't worry about it. Just wondering if anything came to mind. This is a good sign! Thanks again.
Is this still being discussed, or does it just need to be rebased/reviewed?
Also for consideration: I've been using this for over two months and haven't had any issues. In fact, it's eliminated a relatively major source of error when applying principle 3 in spiking mode, leading to much better approximations when used in conjunction with the discretized version of principle 3.
I wonder if we should move
I'd be fine with that for now, and with getting this merged. We can always add integration options later.
See nengo/nengo#975 for more details.
OK, I've rebased this, copied much of the original post to the commit message, and moved Will merge after merging #1046 and once CIs are done, unless there are objections.
Force-pushed from 4b8547f to d7b655d.
This commit implements the more accurate LIF model, introduced in Nengo 2.1.1, in OpenCL. A new boolean argument `fastlif` is also added in `plan_lif`, which defaults to `False`. See nengo/nengo#975 for details regarding the new LIF model. Signed-off-by: Shaun Ren <shaun.ren@linux.com>
This is in some ways a natural continuation of #511 (@hunse).

In the above PR, the given math works off the assumption that the input current `J` is piecewise constant over each time-step. From this, we can use the discretized lowpass filter with time-constant `tau_rc` to compute the exact voltage update at `t = dt`:

```
v(t) = v(0) + (J - v(0)) * (1 - exp(-t/tau))
```

But why stop there? As suggested in #511, the way we currently compute overshoot uses a linear interpolation scheme, whereas the previous equation can be used to solve for the exact amount of overshoot (by substituting `v(0) = 1` and rearranging for `t`).

But to make this work well, we must also deal with another issue: we currently handle partially remaining refractory periods by multiplicatively scaling the voltage, which is yet again a linear approximation. Instead, we can apply the same trick and substitute the amount of leftover time as `t` into the first equation (basically using a different time-step per neuron).

This makes the neuron model's spike rate invariant to `dt`, under the assumptions that `dt <= tau_ref` and the input is constant across each time-step (aside: we can in fact relax the first assumption with `O(1)` additional processing, although the approach becomes more complicated). We can thus think of it as a perfect discretization for digital hardware. By "perfect", I mean the spike count differs from the expected count by `< 1` at all times.

Some testing reveals that this reduces the `rmse(true_rates * T, sum_spikes)` versus the current `LIF` by as much as 90% after `T = 2` seconds of simulation! This is because the current model will consistently spike slightly faster or slightly slower, accumulating `O(t)` error (see "Efficient and Accurate Time-Stepping Schemes for Integrate-and-Fire Neuronal Networks", JCNS 2001).

This improvement was discovered while identifying a somewhat serious problem with drift during integration (thanks to Wilten Nicola). Any small errors from maintaining the `refractory_period` will add up over time, causing systematic drift in one direction. The improved neuron model reduces drift almost to zero, and this same qualitative result occurs over a wide range of seeds. So the main idea here is that this gives us the ability to run simulations at a higher time-step without sacrificing performance (normally you need to drop `dt` to see the same reduction in drift).

But one drawback that I am aware of is that this model can slow down the network by a non-negligible amount. Some tests on my machine with an integrator of `1000` neurons gave a slowdown of ~20%. I think this is due to the `expm1` and `log1p` being applied to arrays. 😞

If you are eager to see if this improves any of your models, as an interim solution you can make this the default neuron model by installing NengoLib and replacing `nengo.Network()` with `nengolib.Network()`.
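For readers skimming the thread, here is a compact sketch of the full update described above. It is reconstructed from this description (and is consistent with the more accurate LIF that the commit notes above say shipped in Nengo 2.1.1); treat it as illustrative rather than this PR's code verbatim:

```python
import numpy as np

def perfect_lif_step(dt, J, spiked, voltage, refractory_time,
                     tau_rc=0.02, tau_ref=0.002):
    # reduce all refractory times by dt
    refractory_time -= dt

    # effective integration time per neuron, clipped to [0, dt]:
    # neurons partway through their refractory period only integrate
    # for the leftover portion of the time-step
    delta_t = (dt - refractory_time).clip(0, dt)

    # exact update for constant J, since 1 - exp(x) == -expm1(x):
    # v(t) = v(0) + (J - v(0)) * (1 - exp(-t / tau_rc))
    voltage -= (J - voltage) * np.expm1(-delta_t / tau_rc)

    # determine which neurons spiked (set them to 1/dt, else 0)
    spiked_mask = voltage > 1
    spiked[:] = spiked_mask / dt

    # solve the same equation with threshold 1 for the exact spike time
    # within the step, instead of linearly interpolating the overshoot
    t_spike = dt + tau_rc * np.log1p(
        -(voltage[spiked_mask] - 1) / (J[spiked_mask] - 1))

    # reset spiked neurons and start their refractory periods, offset
    # by the exact spike time so the rate is invariant to dt
    voltage[voltage < 0] = 0
    voltage[spiked_mask] = 0
    refractory_time[spiked_mask] = tau_ref + t_spike
```

The per-neuron `delta_t` is what replaces the multiplicative refractory scaling, and the `log1p` line is the exact overshoot solve discussed above.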