Clarification on the HSA_FORCE_FINE_GRAIN_PCIE requirement #84

Closed
tmh97 opened this issue Jul 2, 2024 · 2 comments

Comments

tmh97 commented Jul 2, 2024

Hey folks, Tom here from Cornelis Networks.

We've begun using the rccl-tests suite to test the functionality of our libfabric provider (opx) + aws-rccl-plugin.

We successfully ran these tests with no issues:

  • all_gather_perf
  • all_reduce_perf
  • broadcast_perf
  • reduce_perf
  • reduce_scatter_perf

These tests all fail due to Out Of Memory:

  • alltoall_perf
  • alltoallv_perf
  • gather_perf
  • scatter_perf
  • sendrecv_perf

When I set HSA_FORCE_FINE_GRAIN_PCIE=1, all of the failing tests magically pass.
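For anyone reproducing this, here is a minimal Python sketch of launching a test with the variable set for that one process only. The helper names `env_with_fine_grain` and `run_with_fine_grain` are made up for illustration and are not part of rccl-tests; the only thing taken from this thread is the variable name and the value 1.

```python
import os
import subprocess
from typing import Dict, List, Mapping, Optional


def env_with_fine_grain(base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with HSA_FORCE_FINE_GRAIN_PCIE set to "1".

    The input mapping is not modified; os.environ is used when base is None.
    """
    env = dict(os.environ if base is None else base)
    env["HSA_FORCE_FINE_GRAIN_PCIE"] = "1"
    return env


def run_with_fine_grain(command: List[str]) -> int:
    """Run a command (hypothetical test invocation) with the variable set.

    Returns the exit code of the child process.
    """
    return subprocess.run(command, env=env_with_fine_grain()).returncode
```

A call might look like `run_with_fine_grain(["./alltoall_perf"])`, where the binary path is a placeholder for wherever your build puts the test executables.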

The docs say "The HSA_FORCE_FINE_GRAIN_PCIE environment variable will need to be set to 1 in order to run the unit tests which use fine-grained memory type"; however, I am running all of the tests with the "Default: Coarse" memory type.

I am hoping for some clarification on why this variable seems to improve behavior. Maybe some of the tests use the "fine-grained memory type" by default? Any input would be greatly appreciated, thanks in advance for any help!

@nusislam
Contributor

What AMD GPU and ROCm version are you using? This flag should not be required for ROCm versions >= 5.7.
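As a rough illustration of that rule of thumb, the following Python sketch compares a ROCm version string against the 5.7 threshold mentioned above. The function names are hypothetical, and how you obtain the version string on your system is left to you; the sketch only encodes the ">= 5.7" statement from this comment and does not say anything about why older versions behave differently.

```python
import re
from typing import Tuple

# Threshold quoted in this thread: the flag should not be required for ROCm >= 5.7.
FLAG_NOT_REQUIRED_FROM = (5, 7)


def parse_rocm_version(text: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '5.3.0'."""
    match = re.match(r"\s*(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognized ROCm version string: {text!r}")
    return int(match.group(1)), int(match.group(2))


def may_need_fine_grain_flag(version_text: str) -> bool:
    """True when the version is older than 5.7, so the flag may still be needed."""
    return parse_rocm_version(version_text) < FLAG_NOT_REQUIRED_FROM
```

Comparing the versions as integer tuples rather than as strings avoids ordering mistakes such as treating 5.10 as older than 5.7.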

tmh97 commented Jul 11, 2024

Ah, the ROCm version was my issue! I was using ROCm 5.3.0.

Thanks so much for the timely response!

tmh97 closed this as completed Jul 11, 2024