
[Feature Request]: Explore NVIDIA/TransformerEngine for speed/efficiency #4721

Open
0xdevalias opened this issue Nov 14, 2022 · 1 comment
Labels
enhancement New feature or request

Comments

0xdevalias commented Nov 14, 2022

Is there an existing issue for this?

  • I have searched the existing issues and checked the recent builds/commits

What would your feature do?

  • https://github.com/NVIDIA/TransformerEngine
    • A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper GPUs, to provide better performance with lower memory utilization in both training and inference.
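For illustration, the memory/precision trade-off behind FP8 can be sketched in plain Python. This mimics rounding to the E4M3 format (4 exponent bits, 3 mantissa bits, max finite value 448) that Hopper tensor cores use for forward-pass activations; it is a simplified model for intuition, not TransformerEngine's actual API or kernel behavior:

```python
import math

def quantize_e4m3(x: float) -> float:
    """Round a float to the nearest representable FP8 E4M3 value.

    E4M3: 4 exponent bits (bias 7), 3 mantissa bits, largest finite
    value 448. Values beyond the range saturate, as FP8 training
    recipes typically do rather than overflowing to inf.
    """
    if x == 0.0 or math.isnan(x):
        return x
    sign = -1.0 if x < 0 else 1.0
    mag = abs(x)
    max_normal = 448.0
    if mag > max_normal:
        return sign * max_normal      # saturate out-of-range values
    e = math.floor(math.log2(mag))    # unbiased exponent of x
    e = max(e, -6)                    # below 2**-6, fall into subnormals
    step = 2.0 ** (e - 3)             # spacing given 3 mantissa bits
    return sign * round(mag / step) * step
```

Each stored value is one byte instead of two (FP16) or four (FP32), which is where the memory savings come from; the cost is visible in how coarsely nearby values collapse, e.g. `quantize_e4m3(0.3)` returns `0.3125`.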

Proposed workflow

N/A

Additional information

Crossposted on:

Other issues related to potential performance improvements:

78Alpha commented Nov 15, 2022

Looks like the performance boost would only go to the 40 series and above, since it relies on FP8 support in the tensor cores. Even then, it appears to be locked out in the repo, as all the 40-series owners are reporting that it isn't currently working for them, and doubly so on Windows.

Still, the boost couldn't hurt, but there would also need to be an option like --no-FP8
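An opt-out like the one suggested above could be wired up as a plain CLI flag. This is a hypothetical sketch: the flag name `--no-fp8` comes from the comment, and the webui does not actually expose such an option:

```python
import argparse

def fp8_enabled(argv):
    """Return True if FP8 kernels should be used, given CLI args.

    Hypothetical flag: --no-fp8 disables FP8 even on GPUs whose
    tensor cores support it, falling back to FP16/FP32 paths.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--no-fp8", action="store_true",
                        help="disable FP8 even on supported GPUs")
    args = parser.parse_args(argv)
    return not args.no_fp8

# fp8_enabled([]) -> True; fp8_enabled(["--no-fp8"]) -> False
```

Defaulting FP8 to on (with an explicit opt-out) matches the usual pattern for hardware-dependent accelerations, provided the code also falls back gracefully when the hardware lacks support.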
