@naromero77amd commented Nov 13, 2025

In the ROCm fork of PyTorch 2.9, Inductor currently has codegen support for fast_tanhf. However, the original Triton implementation of fast_tanhf had some NaN issues.

Upstream Triton has an improved fast_tanhf in which the NaN issues are fixed. This upstream commit has been backported to the ROCm fork of Triton (see code comments).

A bump in the Triton commit is also needed.
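
For readers unfamiliar with how a fast tanh can produce NaN, here is a minimal sketch in plain Python. It is not the Triton implementation, and this PR does not state the actual cause of the NaNs. It only assumes one common failure mode: evaluating tanh as `(exp(2x) - 1) / (exp(2x) + 1)` overflows for large `x` and yields `inf / inf`. The function names `naive_tanh` and `stable_tanh` and the helper `_exp` are hypothetical and exist only for this illustration.

```python
import math


def _exp(x: float) -> float:
    """exp() that saturates to +inf on overflow, mimicking IEEE hardware exp.

    Python's math.exp raises OverflowError instead of returning inf,
    so the exception is translated here (illustrative helper only).
    """
    try:
        return math.exp(x)
    except OverflowError:
        return math.inf


def naive_tanh(x: float) -> float:
    # Assumed failure mode: exp(2x) overflows to inf for large x,
    # and (inf - 1) / (inf + 1) evaluates to NaN.
    e = _exp(2.0 * x)
    return (e - 1.0) / (e + 1.0)


def stable_tanh(x: float) -> float:
    # Rewrite in terms of exp(-2|x|), which lies in [0, 1] and cannot
    # overflow; the sign of x is restored afterwards.
    t = _exp(-2.0 * abs(x))
    return math.copysign((1.0 - t) / (1.0 + t), x)


if __name__ == "__main__":
    for value in (0.5, 1000.0, -1000.0):
        print(value, naive_tanh(value), stable_tanh(value))
```

With this formulation, large positive inputs saturate to 1.0 and large negative inputs to -1.0 instead of producing NaN. Whether the upstream Triton fix uses this rewrite or a different one (for example, clamping the input) should be checked against the backported commit.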

Other notes:

rocm-repo-management-api bot commented Nov 13, 2025

Jenkins build for 0b59f1c2c8cbe8aeb86ce9a5d6aa471f75e76091 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

@naromero77amd changed the title from "[release /2.9][ROCm][inductor] Improved fast_tanh code generation" to "[release/2.9][ROCm][inductor] Improved fast_tanh code generation" Nov 13, 2025
