Speed improvements in loading time to Rubi Integration module #13339
Partially fixes sympy#13241. Pickled the Rubi rules using the `dill` module. Rubi now uses a .pickle file to load the rules directly instead of constructing the discrimination net of all the rules.
Has this resulted in a reduction of import time? You can paste the import timings, just like Ondřej did. BTW I am getting different timings from what @certik posted:
It is much faster on my computer now than before:

Without .pickle file

In [1]: %time from sympy.integrals.rubi.rubi import rubi_integrate
CPU times: user 3min 22s, sys: 1.32 s, total: 3min 23s
Wall time: 3min 26s

With .pickle file

In [1]: %time from sympy.integrals.rubi.rubi import rubi_integrate
CPU times: user 2.67 s, sys: 96.7 ms, total: 2.77 s
Wall time: 2.81 s
@Upabjojr, I have pickled the rules using We will need
OK, though I still believe that the best solution would be to make the lambdas full functions, so that we can avoid using dill.
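The point about lambdas can be illustrated with the standard library alone. The sketch below is not Rubi code: `is_picklable` is a hypothetical helper, and `operator.neg` merely stands in for any named function that lives at module level in an importable module. The standard `pickle` module stores such functions by reference (module plus qualified name), which is something a lambda does not have.

```python
import operator
import pickle


def is_picklable(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# A lambda has no importable name, so pickle cannot store a reference to it.
constraint_as_lambda = lambda x: x > 0

# A named, importable function is pickled as "module + qualified name".
# operator.neg is only a stand-in for a rule function written with `def`.
named_function = operator.neg

if __name__ == "__main__":
    print(is_picklable(constraint_as_lambda))
    print(is_picklable(named_function))
```

If the rule constraints were written as ordinary `def` functions in an importable module, the same by-reference mechanism would presumably apply to them, which is the motivation for dropping `dill`.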
How did you use As another issue: I'm strongly opposed to including binary files in SymPy. If needed, I would rather suggest including the code that generates the binary file, but not the binary file itself.
Furthermore, we still need to do some work on the rules; in many cases they give the wrong results. I believe we could have another GSoC next year to finish RUBI.
Including binary files makes it impossible to compare versions and is likely to bloat the development branch. I remember experiencing something similar in the past: I have seen a development branch requiring around 100 GB of space, 99.99% of it used by binary files regenerated for every commit.
I have used The code to generate the binary file is really trivial:

import dill
from sympy.integrals.rubi.rubi import rubi_object

# Build the Rubi object (slow), then serialize it with dill.
rubi = rubi_object()
file_name = 'pickled_rubi_rules.p'
with open(file_name, 'wb') as file_object:
    dill.dump(rubi, file_object)

Shall I add this to the documentation?
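One way to avoid committing the generated file is to build it on first use and reuse it afterwards. The following is a minimal sketch of that pattern, not the PR's code: `load_or_build`, `build_rules`, and the cache path are hypothetical names, and a plain dictionary stands in for the Rubi object. It uses the standard `pickle` module for the stand-in; the real rules contain lambdas, which is why the PR reaches for `dill` instead.

```python
import os
import pickle
import tempfile


def build_rules():
    """Stand-in for the slow step (building the discrimination net)."""
    return {"rule_%d" % i: i * i for i in range(1000)}


def load_or_build(cache_path, builder):
    """Load the object from cache_path, or build it and write the cache."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as cache_file:
            return pickle.load(cache_file)
    obj = builder()
    # Write to a temporary name first so an interrupted run never leaves
    # a truncated cache file behind.
    tmp_path = cache_path + ".tmp"
    with open(tmp_path, "wb") as cache_file:
        pickle.dump(obj, cache_file)
    os.replace(tmp_path, cache_path)
    return obj


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "rules_cache.p")
    first = load_or_build(path, build_rules)   # builds and writes the file
    second = load_or_build(path, build_rules)  # reads the file back
    print(first == second)
```

With this shape the cache file is a local artifact that can be listed in `.gitignore`, so only the generator lives in the repository.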
I was going to start working on testing the rules again after getting this to work. I wanted to improve loading time to make testing easier.
You can include the generator of the pickled file. There's no way we are going to include a binary pickle file.
You can add that code directly to the SymPy code, just use
By the way, in order to match
How big is the binary file? GitHub doesn't show the filesize.
You can try cloudpickle. I believe unlike dill, it generates a valid pickle format, so you only need it to serialize, not to deserialize.
I will not merge any PR containing binary files. The big problem is that every time you modify the rules, the binary file is likely to be noticeably different, and git will probably have to store a whole new blob.
Yes, please don't merge any binary files into the sympy git repository. For now, why not create a script that generates this binary blob, which users can (optionally) run themselves on their machines?
@parsoyaarihant a nice idea would be to generate the pickled file as a background process. I'd use

import multiprocessing

def function_generating_blob(...):
    ...

p = multiprocessing.Process(target=function_generating_blob, args=(arg1, arg2, ...))
p.start()

(I hope I wrote everything correctly.) This should start the function passed as target as a separate process in the background and let you continue using the current Python session. This could be a loading strategy.
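A runnable variant of the snippet above might look like this. It is only a sketch: `generate_blob` and the output path are hypothetical, and a small dictionary stands in for the Rubi rules. The `if __name__ == "__main__":` guard matters on platforms where `multiprocessing` starts child processes by re-importing the main module.

```python
import multiprocessing
import os
import pickle
import tempfile


def generate_blob(output_path):
    """Build the (stand-in) rules object and pickle it to output_path."""
    rules = {"rule_%d" % i: i for i in range(1000)}
    with open(output_path, "wb") as blob_file:
        pickle.dump(rules, blob_file)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "pickled_rules.p")
    worker = multiprocessing.Process(target=generate_blob, args=(path,))
    worker.start()  # returns immediately; the work runs in the child
    # ... the current session can keep working here ...
    worker.join()   # wait before relying on the file being complete
    print(os.path.exists(path))
```

Anything that reads the file should either call `join()` first or check that the write has finished, since a half-written pickle cannot be loaded.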
Another +1 for not checking any generated binaries into the repo. If it's useful in the source distribution, it could be generated by the
I think our aim is to generate the decision tree; pickled files can be useful for development purposes.
@Upabjojr I think you are working on something related to this PR. What's your view on this one? Is this work still needed?
No binary files into SymPy please. This PR should definitely be closed.
#13241
Rubi now uses a `.pickle` file to directly load the rules instead of constructing the discrimination net of all the rules (which was time consuming). The pickled file is placed at `sympy/integrals/rubi/`. I have pickled the rules using the `dill` module.