Feature/device capable codegen support#292
Closed
J-Simkin wants to merge 28 commits into ModellingWebLab:master
Conversation
Author comment:

On reflection, the changes I have suggested here could be implemented in a much cleaner way. Instead of modifying the existing ChastePrinter, we could make a new printer that implements the GPU changes, and then use the new device_mode context I suggested adding to switch between the device printer and the standard CPU printer where necessary. That way none of the CPU tests would need to change, there is no risk of breaking something that already works, the autogenerated code should be more readable, and we can avoid macro switching. I'll have to revisit this when I get the time.
Description
I have modified codegen to emit an extra file for standard BackwardEuler and Cvode models that provides GPU (device) kernels. These kernels will be used by future updates to Chaste in order to solve the models on device. These changes depend on the updates described in Chaste/Chaste#497, and hence should not be merged until Chaste itself has been updated.
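Because the same generated headers must still compile as regular C++ when no device toolchain is present, the device-only declarations are macro-guarded. As a rough sketch of the idea from the codegen side (the helper and the macro name here are hypothetical, not the PR's actual names):

```python
def guard_device_block(declarations, macro="CHASTE_ON_DEVICE"):
    """Wrap C++ declaration lines in a preprocessor guard so the emitted
    header still compiles as plain C++ when the macro is not defined."""
    lines = ["#if defined({})".format(macro)]
    lines.extend(declarations)
    lines.append("#endif  // {}".format(macro))
    return "\n".join(lines)

# Example: a device-only declaration that would break a host-only build
snippet = guard_device_block(["__device__ double GetVoltageDerivative(double t);"])
```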
CPU codegen changes
- Modified ChastePrinter to be device agnostic; see the description within Device capable codegen support Chaste/Chaste#497 as to how this works.
- Modified backward_euler_model.hpp and cvode_model.hpp to include the constants and declarations required for device solves, both guarded by a macro that ensures the headers still work when compiled as regular C++.
- Added an include in backward_euler_model.hpp for the file that provides the kernels that perform the backward Euler solve. This file is not included in the changes suggested in #497 and is saved for a future Chaste update. Also note this include could probably be brought to the top of the file in the future.

GPU codegen changes
- Added a context manager to ChasteModel to be used within a python with block, so that formatting functions such as _format_rY_lookup can be conditioned on the current context and emit the appropriate C++ code depending on whether we are currently generating for CPU or GPU.
- Added new device template fields: y_derivative_equations_device, rY_lookup_device, jacobian_equations_device, jacobian_entries_device, sparse_jacobian_entries, sparse_rowptr, sparse_colind and sparse_nnz.
- Added the new output files backward_euler_kernels.hpp and cvode_model_kernels.hpp.

Motivation and Context
These changes will allow us to run these models on GPU within Chaste.
Relates to Chaste/Chaste#497 and will be used by future changes to Chaste that provide this support. This is a breaking change in that Chaste will not work with this codegen until the previous issue has been resolved.
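The with-block approach described under "GPU codegen changes" can be pictured as a small context manager that flips the printer into device mode. This is a simplified sketch, not the actual chaste_codegen implementation, and the strided device array layout in the GPU branch is an assumption for illustration:

```python
from contextlib import contextmanager

class ChasteModel:
    """Minimal sketch of the device_mode idea; not the real chaste_codegen class."""

    def __init__(self):
        self._device = False

    @contextmanager
    def device_mode(self):
        # Formatting functions consult self._device to decide which C++ to emit.
        self._device = True
        try:
            yield self
        finally:
            self._device = False

    def _format_rY_lookup(self, index):
        if self._device:
            # Hypothetical per-cell strided layout for the GPU kernels.
            return "rY[{} * n_cells + cell_idx]".format(index)
        return "rY[{}]".format(index)

model = ChasteModel()
cpu_lookup = model._format_rY_lookup(0)   # plain CPU indexing
with model.device_mode():
    gpu_lookup = model._format_rY_lookup(0)  # device-indexed lookup
```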
Types of changes
Documentation checklist
I have written and updated the docstrings, which I believe covers the two checked points above.
Testing
To test my new additions, I have updated the reference models to match the new output, and I have modified test_codegen.py to tell the appropriate models to generate the kernels as well as the original files. I also modified conftest.py::compare_model_against_reference to check whether a model is generating kernels and behave accordingly. I also updated the txt files within chaste_codegen/data/tests to match the new codegen output and use CHASTE_MATH etc.
Finally, I updated the test_console_script_help.txt and test_console_script_usage.txt files to match my new command line argument.
Hence codegen can be tested automatically.
All tests within Chaste pass (not just the Continuous testpack).
Numerical validation of kernels
Although the kernels are currently unused in Chaste, I have tested their correctness by using them in my local GPU implementation and writing the results to file.
I then compared all state variables calculated in the GPU version against the equivalent CPU implementation at every single time point, and found the maximum absolute errors for both the BE and Cvode implementations, testing BE with both float and double on GPU.
I got the following results:

- LuoRudy1991BackwardEuler vs LuoRudy1991BackwardEulerGpuFloat: error < $10^{-2}$ for all variables at all time points
- LuoRudy1991BackwardEuler vs LuoRudy1991BackwardEulerGpuDouble: error < $10^{-13}$ for all variables at all time points
- LuoRudy1991Cvode vs LuoRudy1991CvodeGpuDoubleDense: error generally < $10^{-2}$ but there are large error spikes and abnormalities
- LuoRudy1991Cvode vs LuoRudy1991CvodeGpuDoubleSparse: error generally < $10^{-2}$ but there are large error spikes and abnormalities
The Cvode kernels are mathematically correct and safe to merge, but the solver itself requires some more tweaking to fix the described behaviour.
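The per-time-point comparison described above can be sketched as a small helper. This is a minimal illustration with toy data, not the actual validation script:

```python
def max_abs_error(cpu_states, gpu_states):
    """Maximum absolute error over all state variables and all time points.

    Each argument is a list of per-time-point state vectors; the GPU trace
    would be read back from the file the kernels wrote to.
    """
    return max(
        abs(c - g)
        for cpu_row, gpu_row in zip(cpu_states, gpu_states)
        for c, g in zip(cpu_row, gpu_row)
    )

# Toy trajectories standing in for real solver output:
cpu_trace = [[1.0, 2.0], [3.0, 4.0]]
gpu_trace = [[1.0, 2.001], [3.0005, 4.0]]
err = max_abs_error(cpu_trace, gpu_trace)  # ~1e-3
```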