Fix BF16 group_index_select_2d on AMD GPU #2321

Closed

zoranzhao
Member

Summary:
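As the title says: fix the BF16 `group_index_select_2d` build on AMD GPUs. The HIP build currently fails because `cuda_bf16.h` is not found when compiling `sparse_group_index.hip` for gfx90a: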

[zhuoran@devgpu003.snc8 /data/users/zhuoran/fbsource/fbcode (7932bb4ab|remote/fbsource/stable...)]$ HIP_VISIBLE_DEVICES=7 numactl --cpunodebind=1 --membind=1 buck2 run mode/{opt,amd-gpu} -c fbcode.triton_backend=amd -c fbcode.enable_gpu_sections=true //hammer/modules/sequential/encoders/tests:hstu_bench -- --enable-multi-stream=true --enable_profiler=true --num-streams=3 --num-workers=3
Watchman fresh instance: new mergebase, cleared graph state, cleared dep files
 ⚠  Python 3.8 is EOL, and is going away by the end of H1 2024. Upgrade //caffe2/tools/setup_helpers:gen_version_header to Python 3.10 now to avoid breakages. https://fburl.com/py38-sunsetting
 ⚠  Python 3.8 is EOL, and is going away by the end of H1 2024. Upgrade //caffe2:substitute to Python 3.10 now to avoid breakages. https://fburl.com/py38-sunsetting
 ⚠  Python 3.8 is EOL, and is going away by the end of H1 2024. Upgrade //caffe2/tools/amd_build:build_amd to Python 3.10 now to avoid breakages. https://fburl.com/py38-sunsetting
 ⚠  Python 3.8 is EOL, and is going away by the end of H1 2024. Upgrade //caffe2/torchgen:gen to Python 3.10 now to avoid breakages. https://fburl.com/py38-sunsetting
 ⚠  Python 3.8 is EOL, and is going away by the end of H1 2024. Upgrade //caffe2/tools/setup_helpers:generate_code to Python 3.10 now to avoid breakages. https://fburl.com/py38-sunsetting
Action failed: fbcode//deeplearning/fbgemm/fbgemm_gpu:sparse_ops_hip (hip_compile src/sparse_ops/sparse_group_index.hip (pic))
Remote command returned non-zero exit code 1
Reproduce locally: `frecli cas download-action f0569d85851723e287f08ed03c0bc831587c0a05f94c911fe0b204ddd7670d24:145`
stdout:
stderr:
buck-out/v2/gen/fbcode/2ab98e452e15a67d/deeplearning/fbgemm/fbgemm_gpu/__sparse_ops_hip_hipify_gen__/out/src/sparse_ops/sparse_group_index.hip:11:10: fatal error: 'cuda_bf16.h' file not found
#include <cuda_bf16.h>
         ^~~~~~~~~~~~~
1 error generated when compiling for gfx90a.
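
The diff itself is not reproduced on this page. The failure above is the usual symptom of a CUDA-only header surviving hipify, since ROCm does not ship `cuda_bf16.h`. Below is a minimal sketch of one common guard. It is an assumption about the shape of such a fix, not the merged change. `USE_ROCM` is the macro PyTorch defines for ROCm builds and `__HIP_PLATFORM_AMD__` is set by the HIP toolchain for AMD targets. `<hip/hip_bf16.h>` is assumed to be the ROCm counterpart and is probed with `__has_include` because older ROCm releases may lack it. `DEMO_BF16_BACKEND`, `bf16_backend()` and `has_gpu_bf16()` are hypothetical names used only for illustration.

```cpp
// Sketch only: pick a BF16 header per platform instead of including
// <cuda_bf16.h> unconditionally. __has_include is standard since C++17.
#include <cassert>
#include <cstring>

#if defined(USE_ROCM) || defined(__HIP_PLATFORM_AMD__)
// AMD build: cuda_bf16.h does not exist here, so never include it.
#  if __has_include(<hip/hip_bf16.h>)
#    include <hip/hip_bf16.h>
#    define DEMO_BF16_BACKEND "hip"
#  else
#    define DEMO_BF16_BACKEND "none"
#  endif
#elif defined(__CUDACC__)
// CUDA compilation: the CUDA BF16 header is available.
#  include <cuda_bf16.h>
#  define DEMO_BF16_BACKEND "cuda"
#else
// Plain host compile: no GPU BF16 header is pulled in.
#  define DEMO_BF16_BACKEND "none"
#endif

// Hypothetical helper: reports which preprocessor branch was taken.
inline const char* bf16_backend() {
  return DEMO_BF16_BACKEND;
}

// Hypothetical helper: true when a GPU BF16 header was included.
inline bool has_gpu_bf16() {
  const char* backend = bf16_backend();
  assert(backend != nullptr);
  return std::strcmp(backend, "none") != 0;
}
```

With a plain host compiler neither macro is set, so no GPU header is included and `bf16_backend()` reports `"none"`. Under a HIP toolchain targeting AMD the first branch should be taken and `cuda_bf16.h` is never requested.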

Reviewed By: nrsatish, sryap, htyu

Differential Revision: D53549323

netlify bot commented Feb 8, 2024

Deploy Preview for pytorch-fbgemm-docs ready!

Name Link
🔨 Latest commit 88a49c5
🔍 Latest deploy log https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/65c41f859185c600071e4f91
😎 Deploy Preview https://deploy-preview-2321--pytorch-fbgemm-docs.netlify.app

@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D53549323

@facebook-github-bot
Contributor

This pull request has been merged in 86ea895.
