[ENHANCEMENT] CUDA-Graph integration #53

Open
xysmlx opened this issue Sep 30, 2020 · 1 comment
Labels
enhancement New feature or request

Comments


xysmlx commented Sep 30, 2020

🚀 Feature

CUDA Graphs, introduced in CUDA 10, reduce kernel launch overhead. The feature matches NNFusion's current design, so it could be easily integrated into cuda_codegen to improve performance.

Motivation

Pitch

Add a stream to kernel_entry, then capture the kernel_entry function on that stream to initialize the CUDA graph.

Note that stream capture cannot be used on the default stream, and there must be no host-blocking API calls (e.g., cudaDeviceSynchronize) while the stream is being captured.
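A minimal sketch of what this could look like, not NNFusion's actual generated code. The `add_one` kernel, the `kernel_entry` signature, and the `run_captured` helper are hypothetical stand-ins for illustration. It assumes the five-argument form of `cudaGraphInstantiate` from CUDA 10/11; newer toolkits may prefer a different overload.

```cuda
#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

// Report a CUDA error and bail out of the enclosing bool function.
#define CHECK(call)                                                     \
    do {                                                                \
        cudaError_t err_ = (call);                                      \
        if (err_ != cudaSuccess) {                                      \
            std::fprintf(stderr, "%s failed: %s\n", #call,              \
                         cudaGetErrorString(err_));                     \
            return false;                                               \
        }                                                               \
    } while (0)

// Hypothetical stand-in for one generated kernel.
__global__ void add_one(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

// Hypothetical kernel_entry: every launch targets the caller's stream,
// and nothing in here blocks the host.
void kernel_entry(cudaStream_t stream, float* d_data, int n) {
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    add_one<<<blocks, threads, 0, stream>>>(d_data, n);
    add_one<<<blocks, threads, 0, stream>>>(d_data, n);
}

// Capture kernel_entry once, replay the graph `replays` times, and
// return true if every element ends up equal to 2 * replays.
bool run_captured(int n, int replays) {
    float* d_data = nullptr;
    CHECK(cudaMalloc(&d_data, n * sizeof(float)));
    CHECK(cudaMemset(d_data, 0, n * sizeof(float)));

    cudaStream_t stream;
    CHECK(cudaStreamCreate(&stream));  // explicit, non-default stream

    // Launches issued between Begin/EndCapture are recorded, not executed.
    cudaGraph_t graph;
    cudaGraphExec_t graph_exec;
    CHECK(cudaStreamBeginCapture(stream, cudaStreamCaptureModeGlobal));
    kernel_entry(stream, d_data, n);
    CHECK(cudaStreamEndCapture(stream, &graph));
    CHECK(cudaGraphInstantiate(&graph_exec, graph, nullptr, nullptr, 0));

    // Each replay submits both kernels with a single launch call.
    for (int i = 0; i < replays; ++i) {
        CHECK(cudaGraphLaunch(graph_exec, stream));
    }
    CHECK(cudaStreamSynchronize(stream));  // safe: capture has ended

    std::vector<float> host(n);
    CHECK(cudaMemcpy(host.data(), d_data, n * sizeof(float),
                     cudaMemcpyDeviceToHost));
    bool ok = true;
    for (float v : host) ok = ok && (v == 2.0f * replays);

    CHECK(cudaGraphExecDestroy(graph_exec));
    CHECK(cudaGraphDestroy(graph));
    CHECK(cudaStreamDestroy(stream));
    CHECK(cudaFree(d_data));
    return ok;
}
```

The capture is done once at initialization, and only `cudaGraphLaunch` sits on the per-inference path. Any synchronization stays outside the Begin/EndCapture pair.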

Alternatives

Additional context

https://developer.nvidia.com/blog/cuda-graphs/
https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#cuda-graphs

xysmlx added the enhancement label Sep 30, 2020

nnfbot commented Sep 30, 2020

Thanks for the report @xysmlx! I will look into it ASAP! (I'm a bot).

wenxcs mentioned this issue Oct 15, 2020