Numba #21
It is not a straightforward issue, and I think a lot more profiling would be required to even decide whether it is worth it or not. Except for single offsets, spline is the fastest method, but because it interpolates it is out of the category of this comparison. For very small models, nothing beats plain …. However, I think it boils down to how well the different methods implement parallelization. In this method, …. The heavy calculations or bottlenecks boil down to …. So for the moment I leave it as it is.
For future reference, if I ever pick that up again: The issue is complex. The comparison shows various interesting things:
The last point shows that if there is any speed-up to be achieved, the only important functions are:
Now, two things could be done:
However, given that …
Running the above again but using only 1 thread for …. The best way to achieve this is, however, to parallelize the whole kernel, hence the calls to …. I am closing this issue. I won't implement …
After all the experience with …

Note that …
Replacing the whole kernel with numba should be the approach...
As a note: The difficulty comes from the many different scenarios, which yield different things:

So this is a problem that depends a lot on the use case, which makes it difficult. There might be absolutely no gain for some cases, and there might be a lot of gain for other cases.
Will be closed by #77
There are currently two branches which try to implement numba:

Roadmap:

- `wavenumber()`
- `greenfct()`
- `reflections()`
- `fields()`
- `parallel=True` and `nb.prange`
- `transform.?` have to be jitted
- `numexpr` (`empymod-asv`)

Below was the original commit from 2018.
Currently 13 of the heaviest computations are implemented twice, once with NumPy, once as strings for `numexpr` (switch `opt=None` or `opt='parallel'`). These are all in `empymod.kernel`, and are ….

Maybe these two implementations should be replaced by one single implementation using `numba`. See the notebooks in https://github.com/prisae/tmp-share.
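The dual-implementation pattern described above can be illustrated with a toy example. The expression below is made up for illustration; empymod's actual kernel expressions are more involved.

```python
import numpy as np
import numexpr

# Toy illustration of the dual-implementation pattern: the same elementwise
# expression once in plain NumPy, once as a string for numexpr, which
# evaluates it multi-threaded and without large intermediate temporaries.
lambd = np.linspace(0.1, 1.0, 1000)
gamma = 4.0

gam_np = np.sqrt(lambd**2 + gamma)                    # opt=None path (NumPy)
gam_ne = numexpr.evaluate("sqrt(lambd**2 + gamma)")   # opt='parallel' path
```

Maintaining every expression twice, once as NumPy code and once as a string, is exactly the duplication the single-numba-implementation idea would remove.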
It looks like a speed-up of up to 70% could be achieved, at the cost of readability. But then, given the first list, three numba functions might be enough: sqrt, exp, and the division.
Something to think about.