prov/efa: fix the gdrcopy_handle procedure #8902

Merged: 2 commits, May 9, 2023

shijin-aws (Contributor) commented May 7, 2023

This PR contains 2 commits that fix how efa passes the gdrcopy handle to shm.

  1. Use shm with the 1.19 API, since hmem_data was introduced in libfabric 1.19.
  2. Remove the version check in efa_mr.c: efa uses hmem_data internally, so it should not depend on which API version the application is using.

shijin-aws (Contributor, Author) commented May 8, 2023

After discussing with the team, this PR needs some major changes. We need to remove the FI_VERSION check that guards the hmem_data and gdrcopy handle flags in efa_mr.c, and use API version 1.19 when calling shm's fi_getinfo.

All of the changes above should be based on PR #8905, which updates the libfabric version to 1.19.

The hmem_data field in ofi_mr is ignored for API
versions < 1.19. To make shm use the gdrcopy
handle stored in the hmem_data field, we need to call
shm's fi_getinfo with version 1.19.

Signed-off-by: Shi Jin <sjina@amazon.com>
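The version gating described in this commit message can be modeled with a small sketch. It is not libfabric code: the `sk_` names, the struct, and both functions are hypothetical stand-ins, and only the major/minor packing mirrors libfabric's `FI_VERSION` macro. It assumes, as the message states, that a consumer ignores hmem_data when the requested API version is below 1.19.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same major/minor packing that libfabric's FI_VERSION uses, redefined
 * under a hypothetical SK_ prefix so the sketch builds without libfabric.
 * Packed values compare correctly as integers while minor < 65536. */
#define SK_VERSION(major, minor) (((uint32_t)(major) << 16) | (uint32_t)(minor))

/* Hypothetical stand-in for the MR attributes efa hands to shm. */
struct sk_mr_attr {
    uint64_t hmem_data; /* carries the gdrcopy handle */
};

/* Model of the consumer side: hmem_data is only honored when the caller
 * requested API version 1.19 or newer; older callers get 0 (ignored). */
uint64_t sk_shm_effective_handle(uint32_t api_version,
                                 const struct sk_mr_attr *attr)
{
    assert(attr != NULL);
    if (api_version < SK_VERSION(1, 19))
        return 0;
    return attr->hmem_data;
}

/* Model of the fix: efa asks shm for 1.19 regardless of the version
 * the application requested from efa. */
uint32_t sk_efa_version_for_shm(uint32_t app_api_version)
{
    (void)app_api_version; /* intentionally unused */
    return SK_VERSION(1, 19);
}
```

In this model, an application that requested 1.18 from efa would lose the handle if efa forwarded that version to shm, whereas routing the version through `sk_efa_version_for_shm` keeps the handle visible.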
efa uses hmem_data internally to pass the
gdrcopy handle to shm, so it should not care
which API version the application is using.
Since this code is already part of libfabric
1.19, there is no need for a version check.

Signed-off-by: Shi Jin <sjina@amazon.com>
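As a rough illustration of what removing the version check means, the sketch below contrasts a gated fill with an unconditional one. Everything here is an assumption made for illustration: the struct, the `SK_FLAG_GDRCOPY_HANDLE` bit, and both function names are hypothetical and do not reproduce the actual code in efa_mr.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers; only the version packing mirrors FI_VERSION. */
#define SK_VERSION(major, minor) (((uint32_t)(major) << 16) | (uint32_t)(minor))
#define SK_FLAG_GDRCOPY_HANDLE (1u << 0) /* hypothetical flag bit */

/* Hypothetical attributes efa prepares for the shm registration. */
struct sk_shm_mr_attr {
    uint64_t hmem_data; /* gdrcopy handle, when present */
    uint32_t flags;
};

/* Before (model): the handle is forwarded only when the application
 * itself requested API version 1.19 or newer. */
void sk_fill_attr_gated(uint32_t app_api_version, uint64_t gdrcopy_handle,
                        struct sk_shm_mr_attr *out)
{
    assert(out != NULL);
    out->hmem_data = 0;
    out->flags = 0;
    if (app_api_version >= SK_VERSION(1, 19)) {
        out->hmem_data = gdrcopy_handle;
        out->flags |= SK_FLAG_GDRCOPY_HANDLE;
    }
}

/* After (model): the handle is always forwarded, because the exchange is
 * internal to the provider and independent of the application's version. */
void sk_fill_attr(uint64_t gdrcopy_handle, struct sk_shm_mr_attr *out)
{
    assert(out != NULL);
    out->hmem_data = gdrcopy_handle;
    out->flags = SK_FLAG_GDRCOPY_HANDLE;
}
```

With the gated variant, an application built against 1.18 would silently lose the gdrcopy path; the unconditional variant avoids that dependency.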
shijin-aws changed the title from "prov/efa: fix the gdrcopy_handle flags" to "prov/efa: fix the gdrcopy_handle procedure" on May 9, 2023
wenduwan (Contributor) commented May 9, 2023

nit: the PR description should be updated.

shijin-aws (Contributor, Author) commented May 9, 2023

> nit: the PR description should be updated.

Thx, updated.

shijin-aws merged commit aff9ec2 into ofiwg:main on May 9, 2023
8 checks passed