Speed up fitting in HyperSpy #488

Closed
francisco-dlp opened this issue Mar 20, 2015 · 5 comments · Fixed by #2422

@francisco-dlp
Member

Fitting in HyperSpy is slow, or at least not as fast as it could be. For multidimensional datasets, parallelisation can help in some cases and is in the works (see #242). A complementary approach is to speed up function evaluation, for example by using numexpr. Any other ideas?
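
To make the numexpr idea concrete, here is a minimal sketch that evaluates the same 1D Gaussian with plain NumPy and with `numexpr.evaluate`. The function names and parameter values are illustrative assumptions, not HyperSpy code, and the sketch falls back to NumPy if numexpr is not installed.

```python
import numpy as np

try:
    import numexpr as ne  # optional dependency
except ImportError:
    ne = None


def gaussian_numpy(x, A, centre, sigma):
    """Plain NumPy evaluation; each intermediate step allocates a temporary array."""
    return A * np.exp(-((x - centre) ** 2) / (2 * sigma ** 2))


def gaussian_numexpr(x, A, centre, sigma):
    """Same expression handed to numexpr as a string (hypothetical helper)."""
    if ne is None:
        # Fallback so the sketch still runs without numexpr.
        return gaussian_numpy(x, A, centre, sigma)
    return ne.evaluate(
        "A * exp(-((x - centre) ** 2) / (2 * sigma ** 2))",
        local_dict={"x": x, "A": A, "centre": centre, "sigma": sigma},
    )


x = np.linspace(-10.0, 10.0, 100_000)
y_np = gaussian_numpy(x, 2.0, 0.5, 1.5)
y_ne = gaussian_numexpr(x, 2.0, 0.5, 1.5)
```

Both calls should return the same values; any gain from numexpr depends on array size and the number of cores, so it is worth timing on realistic data before relying on it.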

@dnjohnstone
Contributor

This is exactly the problem I'm facing right now: fitting many 2D Gaussians in a stack of images, i.e. ~40 2D Gaussians in ~5000 images.

I have some code written in C that parallelises over the individual peaks and is a lot faster than anything I can write in HyperSpy right now. I guess there are two points here. First, we could perhaps achieve a speed-up by implementing some parts of the optimisation in C, but I guess this might go against some of the wider HyperSpy principles?

Secondly, an option to parallelise over components in the model may be of particular benefit in cases (like mine, and often in atomic-resolution images) where the peaks in each spectrum/image are well separated. In short, fitting 4 variables 40 times is probably a better bet than trying to fit 160 in one go.
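
A rough sketch of the "4 variables at a time" idea, assuming isotropic peaks, no background, and peaks far enough apart that their tails do not overlap inside a window. Each peak is fitted on its own in a small window around an initial guess. To keep the example short, the per-peak fit is a linear least-squares fit to the logarithm of the intensity rather than a general nonlinear optimiser, and every name here (`fit_peak_window`, `gaussian2d`, the window size) is hypothetical.

```python
import numpy as np


def gaussian2d(xx, yy, A, x0, y0, sigma):
    """Isotropic 2D Gaussian with 4 parameters: A, x0, y0, sigma."""
    return A * np.exp(-((xx - x0) ** 2 + (yy - y0) ** 2) / (2 * sigma ** 2))


def fit_peak_window(image, cx, cy, half=6, rel_threshold=0.05):
    """Fit one peak in a window centred on the integer guess (cx, cy).

    log(I) is linear in [1, u, v, u^2 + v^2] for an isotropic Gaussian,
    so the 4 parameters follow from one linear least-squares solve.
    """
    y_lo, y_hi = max(cy - half, 0), min(cy + half + 1, image.shape[0])
    x_lo, x_hi = max(cx - half, 0), min(cx + half + 1, image.shape[1])
    win = image[y_lo:y_hi, x_lo:x_hi]
    yy, xx = np.mgrid[y_lo:y_hi, x_lo:x_hi]
    mask = win > rel_threshold * win.max()  # skip pixels where log() is unreliable
    u = xx[mask] - cx  # coordinates relative to the guess, for conditioning
    v = yy[mask] - cy
    design = np.column_stack([np.ones(u.size), u, v, u ** 2 + v ** 2])
    c0, c1, c2, c3 = np.linalg.lstsq(design, np.log(win[mask]), rcond=None)[0]
    sigma2 = -1.0 / (2.0 * c3)
    du, dv = c1 * sigma2, c2 * sigma2
    A = np.exp(c0 + (du ** 2 + dv ** 2) / (2.0 * sigma2))
    return A, cx + du, cy + dv, np.sqrt(sigma2)


# Synthetic, noise-free test image with three well-separated peaks.
truth = [(1.0, 12.3, 15.7, 2.0), (0.8, 40.6, 14.2, 1.8), (1.2, 25.1, 45.8, 2.2)]
yy, xx = np.mgrid[0:64, 0:64]
image = sum(gaussian2d(xx, yy, *p) for p in truth)

# Each window is independent, so this loop could be handed to a worker pool.
guesses = [(int(round(x0)), int(round(y0))) for _, x0, y0, _ in truth]
fits = [fit_peak_window(image, cx, cy) for cx, cy in guesses]
```

With noise, a background, or overlapping tails, the log-linear shortcut becomes biased and a weighted or nonlinear per-window fit would be the safer choice.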

Any thoughts, or pointers on what needs doing, would be appreciated to guide that work!

@francisco-dlp
Member Author

#573 partially solves this issue because, when using numexpr, the speed should be very close to that of C code. However, a new Expression2D component will be needed for Model2D. Hopefully it will require only minor changes.

Regarding parallelization: since you have ~5000 images, I think that running the fit of each image in parallel (as implemented in #242), instead of parallelizing the fit within each individual image, should provide a similar or better boost in speed without compromising accuracy. Actually, I might be wrong about this, but if parallelizing the fit of an individual image means assuming that the contribution of all other peaks is negligible, then you may not need to fit at all: estimating the parameters of the 2D Gaussians analytically might be good enough for the purpose.
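
Both suggestions can be sketched together under strong assumptions: one isolated peak per image (or per cropped window), zero background, and a peak that sits fully inside the frame. The parameters are estimated from image moments with no iterative fit, and the images in the stack are processed in parallel with a thread pool. This is not how #242 is implemented; `estimate_gaussian_moments` and `make_image` are hypothetical helpers.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def estimate_gaussian_moments(image):
    """Estimate (A, x0, y0, sigma_x, sigma_y) from the image moments."""
    yy, xx = np.indices(image.shape)
    total = image.sum()
    x0 = (xx * image).sum() / total  # intensity-weighted centroid
    y0 = (yy * image).sum() / total
    sx = np.sqrt((((xx - x0) ** 2) * image).sum() / total)
    sy = np.sqrt((((yy - y0) ** 2) * image).sum() / total)
    A = total / (2.0 * np.pi * sx * sy)  # Gaussian volume = 2*pi*A*sx*sy
    return A, x0, y0, sx, sy


def make_image(shape, A, x0, y0, sx, sy):
    """Synthetic, noise-free axis-aligned 2D Gaussian (test data only)."""
    yy, xx = np.indices(shape)
    return A * np.exp(
        -((xx - x0) ** 2 / (2 * sx ** 2) + (yy - y0) ** 2 / (2 * sy ** 2))
    )


truth = [(1.0 + 0.1 * i, 14.0 + 0.5 * i, 17.0 - 0.3 * i, 2.0, 2.5) for i in range(8)]
stack = np.stack([make_image((32, 32), *p) for p in truth])

# One task per image; map() returns results in the same order as the stack.
with ThreadPoolExecutor(max_workers=4) as pool:
    estimates = list(pool.map(estimate_gaussian_moments, stack))
```

Threads keep the sketch simple; for heavier per-image work a process pool or a cluster scheduler may scale better. Any background offset biases the moment estimates, so it would need to be subtracted first.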

@tjof2
Contributor

tjof2 commented Feb 14, 2017

How much of this is fixed/solved with #1101?

@francisco-dlp
Member Author

#573, #1101 and #1321 should fully address this.

@tjof2
Contributor

tjof2 commented Jan 29, 2020

#573, #1101 and #1321 should fully address this.

@tjof2 tjof2 mentioned this issue Sep 9, 2020
@tjof2 tjof2 linked a pull request Sep 9, 2020 that will close this issue
@ericpre ericpre modified the milestones: Wish list, v1.7 Mar 30, 2022