Strange behavior when import torch after import te. #871

Open
GGGGGGXY opened this issue May 27, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@GGGGGGXY

Firstly, I would like to express my sincere gratitude for your dedication and significant contributions to the open-source community. Your work has been instrumental and greatly appreciated.

However, while utilizing transformer_engine, I have encountered some issues that I am unable to resolve.

When transformer_engine is imported before torch, it causes a runtime error:

image image

In my code, after importing transformer_engine, the process always tears down with:
image

transformer_engine v1.5 and below work fine.

My environment:
GPU: H800
torch v2.3.0
CUDA 12.4.1
cuDNN 8.9.7.29
transformer_engine release_v1.7
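
For anyone trying to reproduce this, here is a minimal sketch that runs each import order in a fresh interpreter and reports the exit code. The helper name `try_import_order` is hypothetical and not part of torch or transformer_engine. I am assuming a subprocess is the safest way to check, since a crash at teardown could otherwise take down the calling script.

```python
import subprocess
import sys


def try_import_order(modules):
    """Import `modules` in the given order inside a fresh Python process.

    Returns (returncode, stderr) so a crash or teardown error in the child
    process cannot kill the calling script.
    """
    code = "; ".join(f"import {name}" for name in modules)
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stderr


if __name__ == "__main__":
    # Compare the two orders discussed in this issue.
    orders = (
        ["torch", "transformer_engine"],
        ["transformer_engine", "torch"],
    )
    for order in orders:
        returncode, stderr = try_import_order(order)
        print(" -> ".join(order), "| exit code:", returncode)
        if returncode != 0:
            # Show only the tail of the traceback to keep the output short.
            print(stderr.strip()[-2000:])
```

If the problem depends only on import order, the second order should exit with a non-zero code while the first exits with 0. Until this is fixed, importing torch first may serve as a workaround.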

Thank you in advance for taking the time to read this issue and for any help you can provide. I look forward to hearing from you soon.

@ptrendx
Member

ptrendx commented May 28, 2024

Hmm, this is strange.
@pggPL Could you take a look? You should be able to use H100 as a proxy for H800.

ptrendx added the bug label on May 28, 2024