[WIP] FFT based equilibration detection. #112
Right now I'm using the correlation function code from statsmodels. We can backport their code to avoid an extra dependency once we finalize the big-picture roadmap here.
What about the code for this contributed by @trendelkampschroer?
It was a major refactoring of our current code, which would require more work than the current PR.
Also, we had discussed evaluating only N ~ 50 time points as potential time origins.
Do you mean all lag times simultaneously, or time origins?
All lag times, not time origins. The time origin still needs to be searched manually.
Basically, the idea here is:
I thought we only needed to replace the I do like the idea of the binary search, but note that the computed effective number of samples isn't necessarily concave in
Also, I'm working on a short paper on the
The issue is that the current statistical inefficiency uses a heuristic for skipping lagtimes, which isn't as useful when you use the FFT approach, which gives you results for all lagtimes simultaneously. Also, https://github.com/trendelkampschroer/pymbar/blob/master/src/pymbar/timeseries.py has lots of extra stuff in there.
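For readers following along, here is a minimal sketch of what "results for all lagtimes simultaneously" can look like with plain NumPy. It is not the code in this PR or in statsmodels: the function names `autocorrelation_fft` and `statistical_inefficiency`, the unbiased normalization, and the truncation at the first non-positive correlation are all assumptions chosen for illustration.

```python
import numpy as np


def autocorrelation_fft(x):
    """Normalized autocorrelation C(tau) for every lag tau = 0..N-1 in one pass."""
    x = np.asarray(x, dtype=float)
    n = x.size
    dx = x - x.mean()
    # Zero-pad to a power of two >= 2N - 1 so the circular correlation
    # computed by the FFT does not wrap around.
    nfft = 1 << (2 * n - 1).bit_length()
    f = np.fft.rfft(dx, n=nfft)
    acov = np.fft.irfft(f * np.conj(f), n=nfft)[:n]
    # Lag tau only has N - tau overlapping pairs (assumed "unbiased" estimate).
    acov /= n - np.arange(n)
    if acov[0] <= 0:
        raise ValueError("series has zero variance")
    return acov / acov[0]


def statistical_inefficiency(x):
    """g = 1 + 2 * sum_tau (1 - tau/N) C(tau), summed up to the first C(tau) <= 0."""
    x = np.asarray(x, dtype=float)
    n = x.size
    c = autocorrelation_fft(x)
    nonpositive = np.flatnonzero(c[1:] <= 0)
    # Index j in c[1:] corresponds to lag j + 1; stop the sum just before it.
    cutoff = nonpositive[0] + 1 if nonpositive.size else n
    tau = np.arange(1, cutoff)
    g = 1.0 + 2.0 * np.sum((1.0 - tau / n) * c[1:cutoff])
    return max(g, 1.0)
```

Because every lag comes out of a single transform, there is nothing to gain from skipping lags, which is the point made above about the existing heuristic.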
I agree that our function is not concave, so my heuristic gives only a local maximum. However, we're already using heuristics, so it's not a deal-breaker.
I agree that FFT is better than the multiresolution approach I coded. There's extra stuff in @trendelkampschroer's timeseries.py, but we only need The Let's definitely try out the binary search of
So what about the Newton's method code? His code for statistical inefficiency calls a Newton optimizer. That's a big change from the current code.
Also, there are no tests.
I chose
I'm happy with the That sound reasonable?
It's just not possible for all uses to benefit, because the FFT approach only works for a single input observable. The current How about I have
Sounds great!
I think there were still some planned changes here. Specifically, you were going to rework
So I'm going to open a new PR for the FFT code.
One issue is that some of the previous "tricks" are ineffective (e.g. `fast=True`), as the FFT-based approach calculates all lagtimes simultaneously. For the search over the discard region `t`, I've elected to just do a binary search of the logarithmically-spaced lagtime grid.
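A rough sketch of how such a search might look, under several assumptions: "binary search" is read here as bisection on the sign of the discrete slope of the effective sample count, the grid has about 50 log-spaced candidate origins (the N ~ 50 mentioned earlier in the thread), and the names `bisect_peak`, `detect_equilibration`, and `min_tail` are hypothetical. As noted above, the effective sample count need not be concave, so this only finds a local maximum.

```python
import numpy as np


def statistical_inefficiency(x):
    """FFT-based g, summed up to the first non-positive autocorrelation (assumed convention)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    dx = x - x.mean()
    nfft = 1 << (2 * n - 1).bit_length()  # zero-pad to avoid wrap-around
    f = np.fft.rfft(dx, n=nfft)
    acov = np.fft.irfft(f * np.conj(f), n=nfft)[:n] / (n - np.arange(n))
    if acov[0] <= 0:
        raise ValueError("series has zero variance")
    c = acov / acov[0]
    nonpositive = np.flatnonzero(c[1:] <= 0)
    cutoff = nonpositive[0] + 1 if nonpositive.size else n
    tau = np.arange(1, cutoff)
    return max(1.0 + 2.0 * np.sum((1.0 - tau / n) * c[1:cutoff]), 1.0)


def bisect_peak(score, n_points):
    """Bisection on the discrete slope: returns the index of a local maximum of score."""
    lo, hi = 0, n_points - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if score(mid) < score(mid + 1):
            lo = mid + 1  # still climbing, the peak lies to the right
        else:
            hi = mid  # flat or descending, the peak is at mid or to the left
    return lo


def detect_equilibration(x, n_grid=50, min_tail=10):
    """Return (t0, g, n_effective) for a locally optimal discard time t0."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if n < 2 * min_tail:
        raise ValueError("series too short")
    # Log-spaced candidate origins, plus t0 = 0, leaving at least min_tail samples.
    log_grid = np.geomspace(1, n - min_tail, n_grid).astype(int)
    grid = np.unique(np.concatenate(([0], log_grid)))
    cache = {}

    def n_eff(i):
        # Effective sample count if everything before grid[i] is discarded.
        if i not in cache:
            tail = x[grid[i]:]
            cache[i] = tail.size / statistical_inefficiency(tail)
        return cache[i]

    best = bisect_peak(n_eff, grid.size)
    t0 = int(grid[best])
    n_effective = n_eff(best)
    return t0, (n - t0) / n_effective, n_effective
```

Calling `detect_equilibration(x)` costs on the order of log2(50) FFTs instead of one per candidate origin, which is the appeal of bisection here; an exhaustive scan over the same grid would be the safer fallback if the local maximum turns out to be a poor one.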