(jitify2) Serialisation post-NVRTC #133
Comments
Thanks, I'll try that today. Looking at our current code it appears we go from |
As noted above, we were calling Thanks |
Within jitify2, the latest point of implemented serialisation appears to be that of `PreprocessedProgram`. Hence, after deserialisation it requires compilation via NVRTC. In contrast, jitify1 serialises the PTX blob output by NVRTC.

The jitify2 approach is presumably more portable; however, in our use case we are serialising to memory and/or `/tmp/`, so there's no requirement of portability. Furthermore, with our large compilation units, the serialised output is ~2x bigger and loading it is ~50x slower.

A quick test with CUDA 12.3
Is there a reason post-NVRTC serialisation is no longer present, or am I mistaken?
Is it on your roadmap, or would you be happy with me submitting a PR?
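
To make the request concrete, here is a minimal sketch of post-NVRTC caching to `/tmp/`. It is not jitify's API: `save_blob` and `load_blob` are hypothetical helper names, and the blob is assumed to be the PTX text that NVRTC produces. The sketch uses only the C++17 standard library and deliberately leaves out the NVRTC and CUDA driver calls.

```cpp
#include <fstream>
#include <iterator>
#include <optional>
#include <string>

// Hypothetical helper: write a compiled blob (e.g. PTX text from NVRTC)
// to a cache file. Returns true if the write succeeded.
bool save_blob(const std::string& path, const std::string& blob) {
  std::ofstream out(path, std::ios::binary | std::ios::trunc);
  if (!out) return false;
  out.write(blob.data(), static_cast<std::streamsize>(blob.size()));
  return static_cast<bool>(out.flush());
}

// Hypothetical helper: read a cached blob back.
// Returns std::nullopt if the file cannot be opened (a cache miss).
std::optional<std::string> load_blob(const std::string& path) {
  std::ifstream in(path, std::ios::binary);
  if (!in) return std::nullopt;
  return std::string(std::istreambuf_iterator<char>(in),
                     std::istreambuf_iterator<char>());
}
```

On a cache miss the caller would compile with NVRTC and pass the resulting PTX to `save_blob`; on a hit, the string returned by `load_blob` could be handed directly to the driver (for example via `cuModuleLoadData`), skipping NVRTC entirely.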