Real-time inference for Stable Diffusion - 0.88s latency. Covers AITemplate, nvFuser, TensorRT, FlashAttention. Join our Discord community: https://discord.com/invite/TgHXuSJEk6
Updated Dec 4, 2023 - Jupyter Notebook
Some DNN model optimization experiments and notebooks