Making the code faster #19
Conversation
Python parallelization and `numba` do not necessarily play well together. Current experimental roadmap:

- Might actually make sense to parallelize on the interpolators themselves rather than in any loop.
It doesn't work on dom but it does work on my laptop. I really hate it when this happens.
Ok, bar anything that might have gone wrong (for instance, I might have been stupid and maybe I broke something... testing is needed), I get some important speed-up. These are the times for a test:

- Time with PR #9
- Time with this PR

(The measured values were in a table that was not captured here.)
Ok, that's quite impressive.
And it seems to work. And that's without parallelization yet!
@felixhekhorn I just enforced a much stricter integration error and the errors are back where they were before. Thanks for your input!

Ok, as of the last commit this is fully functional; now I am going to …
One comment I am writing here so I don't forget. Until profiling is done (or NLO is done) I won't know: if it is just the compilation being slower then it is fine (it will be compensated by more difficult integrands); if not, we can re-implement some of it.

One thing I've already mentioned several times is that one of the main performance penalties is simply the fact that we need to integrate many times. Reducing the number of integrations (for instance, integrating all splitting functions at once) would reduce the running time by whatever the reduction is.
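The batching idea above could be sketched, hypothetically, with `scipy.integrate.quad_vec` (scipy ≥ 1.4): a single vector-valued integrand evaluated once per quadrature point instead of one `quad` call per component. The kernel functions below are illustrative stand-ins, not the repo's actual splitting functions, and whether this wins in practice is exactly what profiling has to decide:

```python
import numpy as np
from scipy.integrate import quad_vec

def all_kernels(x):
    # stand-ins for four splitting-function integrands (illustrative only)
    return np.array([x, x**2, x**3, np.log(1.0 + x)])

# one integration pass produces all four results at once
values, error = quad_vec(all_kernels, 0.0, 1.0)
```

`values[i]` holds the i-th integral; the adaptive subdivision cost is paid once instead of four times.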
The only thing left is the …
Ok, not sure whether the test will work on Travis, but bar that this is now finished. Several things to note, @felixhekhorn.
Shall we then merge this PR to LO?
Yes, if you @felixhekhorn and @scarrazza don't have any comments we can merge, merge LO to master, and start thinking about the "next levels" (for instance, I think this might be a good moment to do an APFEL-EKO comparison test to make sure the results are compatible).
```python
ret["operator_errors"]["S_qg"] = op_s_qg_err
ret["operator_errors"]["S_gq"] = op_s_gq_err
ret["operator_errors"]["S_gg"] = op_s_gg_err
ret["operators"]["S_qq"] = output_array[:, :, 0, 0]
```
Should we explain somewhere how the indexes of `output_array` are sorted?
Well, it is explained in line 140 above :P
Personally I don't like passing and updating a dictionary, but I also don't see a better option right now. As a matter of principle, though, I don't like the output of this function. Ideally we would get two matrices of results directly from `quad`, but as we discovered the vectorized version of `quad` is much slower.
- `output_array`: well, that's internal and if Juan knows, fine ;-)
- I think we should join `_run_nonsinglet` and `_run_singlet` into a single function/loop, as they serve the same purpose
- updating dictionary: we have to replace it as soon as we start iterating over Q2; for now it was just the quickest fix. I will replace it by

  ```python
  ret = {}
  # assign
  return ret
  ```
> I think we should join `_run_nonsinglet` and `_run_singlet` into a single function/loop, as they serve the same purpose

Well, they are different functions in that one generates a matrix. But I think the correct thing would indeed be joining them, so that the output is an array `[nonsinglet, (singlet1, singlet2, singlet3, singlet4)]`.
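One possible merged interface could look like the hedged sketch below. The function name `run_all`, the callback signatures, and the 2x2 singlet layout are assumptions for illustration, not the repo's actual API:

```python
import numpy as np

def run_all(integrate_nonsinglet, integrate_singlet_entry):
    """Hypothetical merged driver: one call yields the non-singlet value
    together with the 2x2 singlet matrix [[qq, qg], [gq, gg]]."""
    nonsinglet = integrate_nonsinglet()
    singlet = np.empty((2, 2))
    for i in range(2):
        for j in range(2):
            singlet[i, j] = integrate_singlet_entry(i, j)
    return nonsinglet, singlet

# dummy integrands stand in for the real quad-based ones
ns, s = run_all(lambda: 1.0, lambda i, j: float(i + j))
```

The point is purely structural: a single loop and a single return shape, so callers no longer need to know which channel came from which helper.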
```python
return ret

if __name__ == "__main__":
```
Do we really need a main here?
It is useful for quickly testing that you didn't break anything, and it doesn't hurt anybody.
Ok, but then maybe we should consider moving this to the test folder?
I mean, this is more for convenience: you can do `python -i dglap.py`, and then it has run some stuff and you can play with the functions from `dglap`. It is not a test in the proper sense.
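The convenience pattern being described is roughly the following sketch (the demo function and its return value are hypothetical, not the actual contents of `dglap.py`):

```python
def run_dglap_demo():
    """Hypothetical stand-in for the quick end-to-end run in dglap.py."""
    return {"operators": {}, "operator_errors": {}}

if __name__ == "__main__":
    # `python -i dglap.py` executes this block and then drops into an
    # interactive session with the module's names already in scope
    ret = run_dglap_demo()
    print("demo run finished:", sorted(ret))
```

Imported as a module, the guard keeps the demo from running; executed directly with `-i`, it leaves `ret` ready to poke at.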
> It is useful for quickly testing

that's all said ;-)
```python
r = np.real(N)
i = np.imag(N)
out = np.empty(2)
c_digamma(
```
This call gives me

```
src/eko/tests/test_cfunctions.py::test_digamma
  /home/carrazza/repo/n3pdf/eko/src/eko/tests/test_cfunctions.py:25: UserWarning: implicit cast from 'char *' to a different pointer type: will be forbidden in the future (check that the types are as you expect; use an explicit ffi.cast() if they are correct)
    c_digamma(x, 0.0, _gsl_digamma.ffi.from_buffer(out))
```
Which version of `cffi` do you have? Mine is 1.13.2.
Same version for me.
btw same warning for travis https://travis-ci.com/N3PDF/eko/builds/139186641#L329
Strange, it happens only from `pytest`. I'll have a look.
Indeed, it only happens in the test file.
Ok, so the problem is that upon first call `ffi.from_buffer` defaults to `char *`; the correct call would be `c_digamma(x, 0.0, _gsl_digamma.ffi.from_buffer("double[]", out))`.

But then `numba` is not happy. `numba` maybe wants to be in control of the compilation, I don't know. So I'll let it be, since this is a problem within `cffi`. I hope that once they decide to deprecate it for good they will make `from_buffer` read the type of the buffer (they are already looking at the size of the buffer without me telling it anything).
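For context, a minimal sketch of the two `from_buffer` forms discussed above, assuming `cffi >= 1.12` (which added the optional C-type argument). This is a standalone illustration, not the eko test code:

```python
import numpy as np
import cffi

ffi = cffi.FFI()
out = np.empty(2)

# default form: cffi exposes the buffer as char[] (a byte view), which then
# needs the implicit char* -> double* cast when passed to the C function
buf_char = ffi.from_buffer(out)

# explicit form: request the typed view up front, so no implicit cast occurs
buf_double = ffi.from_buffer("double[]", out)
buf_double[0] = 1.5  # writes through to the underlying numpy array
```

The `char[]` view has one entry per byte (16 here), while the `double[]` view has one entry per element (2), which is why the implicit cast draws a warning.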
Just to let you know: I get two more warnings. The first is related to `numba`:

```
env/lib/python3.7/site-packages/numba-0.46.0-py3.7-linux-x86_64.egg/numba/types/containers.py:3
  /home/felix/Physik/N3PDF/EKO/eko/env/lib/python3.7/site-packages/numba-0.46.0-py3.7-linux-x86_64.egg/numba/types/containers.py:3: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3, and in 3.9 it will stop working
    from collections import Iterable
```

and there is somewhere a cast missing or an actual comparison to `0+0j`:

```
src/eko/tests/test_cfunctions.py::test_digamma
  /home/felix/Physik/N3PDF/EKO/eko/env/lib/python3.7/site-packages/numpy/testing/_private/utils.py:664: ComplexWarning: Casting complex values to real discards the imaginary part
    (actual, desired) = map(float, (actual, desired))
```
Neither of these is our problem, but the second is undesired (it's numpy's fault) and it can be worked around. I'll push a fix.
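One way to avoid the `ComplexWarning` above, sketched under assumptions (the value is just digamma(3) = 3/2 minus the Euler-Mascheroni constant, used as a stand-in): compare with `assert_allclose`, which handles complex dtypes natively instead of forcing the result through `float()`:

```python
import numpy as np

# complex-typed result with zero imaginary part, as returned by the wrapper
actual = np.complex128(0.9227843350984671 + 0.0j)
desired = 0.9227843350984671

# map(float, ...) on a complex scalar is what triggers the ComplexWarning;
# assert_allclose compares real and imaginary parts without any cast
np.testing.assert_allclose(actual, desired)
```

Whether this matches the fix actually pushed to the repo is not verified here; it only shows that the warning is avoidable on the test side.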
Looks pretty good; in particular the `self._compile` mechanism is clear and very non-intrusive.

Still haven't changed anything, because `interpolation.py` will need considerable changes in order to make things work the way they should, but I wanted to have a place to collect my thoughts where they can be monitored/audited.

`quad` works very nicely with `numba`, but not the other way around (obviously). This addresses the doubts @scarrazza had the other day: if the integrand is a numba-compiled object, `quad` will be very fast.