PyTorch 2.0 has a way to save compiled Triton kernels to disk and reload them.
Using that API should save a lot of time during warmup.
We just merged an upgrade to a recent version of PyTorch 2.0, so it should be quite doable to call their API to launch Triton kernels instead of the original Triton one.
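As a rough sketch of how that on-disk reuse can work in practice, PyTorch's Inductor backend reads the `TORCHINDUCTOR_CACHE_DIR` environment variable to decide where compiled artifacts live; pointing it at a persistent directory lets later runs reload kernels instead of recompiling. The path and the `train.py` script below are illustrative, not from this thread:

```shell
# Sketch: keep Inductor's compile cache in a persistent directory so
# Triton kernels compiled on the first run are reused on later runs.
# The cache path here is an arbitrary example.
export TORCHINDUCTOR_CACHE_DIR="$HOME/.cache/my-inductor-cache"

# First run compiles and populates the cache; subsequent runs of the
# same model should skip most Triton compilation during warmup.
python train.py
```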
However, a CUDA graph cannot be exported and reused across processes.
The main reason is that replaying the graph re-executes the captured kernels with their original parameters.
Those parameters include GPU memory addresses.
And it is very likely that your tensors' memory addresses will change from launch to launch (once you quit the Python session and the CUDA memory pool is freed).
Updating the graph's parameters is certainly possible in theory, but it seems very hard, so it is probably better to just rerun the warmup (without recompiling the Triton kernels, of course).
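The address problem can be illustrated with a small CPU-only analogy (plain `ctypes`, no GPU or PyTorch required; `capture_pointer` is a hypothetical stand-in for what graph capture records):

```python
import ctypes

def capture_pointer(buf):
    # Stand-in for CUDA graph capture: the graph bakes in the raw device
    # pointer of each kernel argument, just as we record a host address here.
    return ctypes.addressof(buf)

# "First session": allocate a buffer and capture its address.
first = (ctypes.c_float * 1024)()
baked_address = capture_pointer(first)

# "New session": the old allocation is gone; a fresh buffer appears,
# typically at a different address.
del first
second = (ctypes.c_float * 1024)()

# The baked address no longer necessarily matches the live buffer, so a
# replay that dereferenced it could read freed or unrelated memory.
print(baked_address, ctypes.addressof(second))
```

This is why the captured graph is only valid while the original allocations are alive, and why rerunning the (cheaper, compile-cache-backed) warmup in each new session is the pragmatic choice.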
I have not tried straight pickle, but I did not see any documentation about exporting the model.