A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper GPUs, to provide better performance with lower memory utilization in both training and inference.
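For reference, the basic usage pattern from the Transformer Engine README looks roughly like this (a minimal sketch; exact recipe arguments can differ between releases, so treat the specifics as illustrative):

```python
import torch
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

# Replace a standard PyTorch linear layer with Transformer Engine's FP8-capable one.
model = te.Linear(768, 3072, bias=True).cuda()
inp = torch.randn(4096, 768, device="cuda")

# FP8 scaling recipe; E4M3 format shown here, other options exist.
fp8_recipe = recipe.DelayedScaling(margin=0, fp8_format=recipe.Format.E4M3)

# Run the forward pass with FP8 autocasting enabled.
with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    out = model(inp)

out.sum().backward()
```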
It looks like the performance boost would only apply to RTX 40 series cards and above, since it relies on FP8 support in the tensor cores. Even then, it appears to be locked out in that repo, as 40 series owners are reporting that it isn't currently working for them, and doubly so on Windows.
Still, the boost couldn't hurt, but we would also need a --no-FP8 option to disable it.
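If this were integrated, gating it behind the proposed flag could be as simple as checking the GPU's compute capability (FP8 tensor cores need 8.9 Ada or 9.0 Hopper) and honouring an opt-out. A rough sketch, where the flag name and helper are hypothetical rather than anything that exists in the codebase today:

```python
import argparse
import torch

def fp8_supported() -> bool:
    """FP8 tensor cores require compute capability 8.9 (Ada / RTX 40) or 9.0 (Hopper)."""
    if not torch.cuda.is_available():
        return False
    return torch.cuda.get_device_capability() >= (8, 9)

parser = argparse.ArgumentParser()
# Hypothetical opt-out flag, following the suggestion above.
parser.add_argument("--no-fp8", action="store_true",
                    help="disable FP8 even on GPUs that support it")
args = parser.parse_args()

# Fall back silently on unsupported GPUs; respect the explicit opt-out otherwise.
use_fp8 = fp8_supported() and not args.no_fp8
print(f"FP8 enabled: {use_fp8}")
```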
Is there an existing issue for this?
What would your feature do?
Proposed workflow
N/A
Additional information
Crossposted on:
Other issues related to potential performance improvements: