
Conversation

@ChrisRackauckas
Collaborator

This accelerates the test case by 36,000x and organizes it. The high-level scripts are pulled out of `src`, and coexist is made into a proper package (note: the module name should change, because Julia module naming conventions capitalize the first letter!). Then a code-generation scheme via ModelingToolkit is set up in `dydt_generator.jl`, which results in a 36,000x acceleration of the ODE solve. To demonstrate caching, I show that the generated code can be saved to a file, which is then benchmarked in `pyjulia_benchmark.jl`. Note, however, that saving to a file isn't required (`dydt_generator.jl` doesn't use the file-generated version at all), so generated fast forms can be used on the fly if needed.
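The PR's actual generator uses ModelingToolkit in Julia and isn't reproduced in this thread; as a rough analog of the generate-once, call-many-times idea, here is a SymPy sketch in Python (the toy two-equation system and all names are illustrative, not the coexist model):

```python
import sympy as sp

# Hypothetical 2-state toy system standing in for the coexist dydt;
# the real PR builds the full model symbolically via ModelingToolkit.
u1, u2 = sp.symbols("u1 u2")
beta, gamma = sp.symbols("beta gamma")

# Symbolic right-hand side (a toy SIR-like pair of equations).
rhs = [-beta * u1 * u2, beta * u1 * u2 - gamma * u2]

# "Code generation": compile the symbolic expressions into a fast
# numerical function once, then reuse it inside the solver loop.
dydt = sp.lambdify((u1, u2, beta, gamma), rhs, modules="numpy")

result = dydt(0.99, 0.01, 0.3, 0.1)
```

The same compiled function object can also be serialized to a file for caching, mirroring the save-then-benchmark step described above, but as the comment notes, the on-the-fly form is enough.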

Note that ModelingToolkit doesn't seem to be strictly necessary here; you can get a good portion (but not all) of the speedup by preallocating buffers and rewriting the einsum expressions. However, this demonstrates a way to use the code generators such that zero work is required to get the optimal expressions. The generated function is in a sparse form, so the amount of acceleration should keep increasing as the networks get larger.
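The preallocation-plus-einsum-rewriting alternative mentioned above can be sketched with NumPy (the contraction and names are illustrative, not the model's actual terms):

```python
import numpy as np

# Toy contact-network contraction standing in for one einsum term.
rng = np.random.default_rng(0)
C = rng.random((4, 4))   # contact matrix
x = rng.random(4)        # state vector

# Naive form: allocates a fresh output array on every call.
def step_alloc(C, x):
    return np.einsum("ij,j->i", C, x)

# Rewritten form: reuse a preallocated buffer via einsum's out=
# argument, so the hot loop performs zero allocations.
buf = np.empty(4)
def step_prealloc(C, x, out):
    np.einsum("ij,j->i", C, x, out=out)
    return out
```

Both forms compute the same result; the second just avoids the per-call allocation that dominates in a tight ODE loop.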

Note that the only "glut" in the system here is that the symbolic simplification for generating the fastest form of the function takes around 80 seconds. This is unnecessary work, since most of the values are zeros. We should open an issue about this and invite @YingboMa and @shashi to the repo to take a look at some accelerated simplification tools for this case.
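The simplification cost described here comes from touching every entry of a mostly-zero symbolic system. A minimal sketch of the obvious workaround, skipping structural zeros, in illustrative SymPy (not the proposed tooling):

```python
import sympy as sp

x = sp.Symbol("x")

# Mostly-zero symbolic Jacobian, standing in for the network model's
# sparse structure (size and entries are illustrative).
J = sp.zeros(50, 50)
J[0, 0] = sp.sin(x)**2 + sp.cos(x)**2  # the one entry worth simplifying

# Only simplify structurally nonzero entries instead of all 2500.
nonzero = {
    (i, j): sp.simplify(J[i, j])
    for i in range(J.rows)
    for j in range(J.cols)
    if J[i, j] != 0
}
```

Simplification time then scales with the number of nonzeros rather than the full dense entry count.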

@aa25desh
Collaborator

aa25desh commented Jun 7, 2020

Thank you very much! 😄
So the speedup is the effect of making it a proper package?

@ChrisRackauckas
Collaborator Author

No, it's from generating a non-allocating `dydt`.
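"Non-allocating" here means the RHS writes into a caller-provided buffer instead of allocating a new array on every call, in the du-first style DifferentialEquations.jl uses. A minimal Python/NumPy sketch of the pattern (toy two-state RHS, not the generated coexist code):

```python
import numpy as np

def dydt(du, u, p, t):
    """In-place RHS: fill the preallocated `du` buffer, allocate nothing."""
    beta, gamma = p
    du[0] = -beta * u[0] * u[1]
    du[1] = beta * u[0] * u[1] - gamma * u[1]

u = np.array([0.99, 0.01])
du = np.empty(2)          # allocated once, reused on every solver step
dydt(du, u, (0.3, 0.1), 0.0)
```

Because the solver calls the RHS millions of times, eliminating the per-call allocation is where the bulk of the speedup comes from.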

Contributor

@YingboMa YingboMa left a comment


Just a few stylistic comments.

ChrisRackauckas and others added 2 commits June 8, 2020 01:38
Co-authored-by: Yingbo Ma <mayingbo5@gmail.com>
Co-authored-by: Yingbo Ma <mayingbo5@gmail.com>
@vollmersj vollmersj merged commit 06cd597 into master Jun 8, 2020
@ChrisRackauckas ChrisRackauckas deleted the accelerate branch June 8, 2020 20:31
@aa25desh aa25desh mentioned this pull request Jun 19, 2020
11 tasks
