@Engininja2
Contributor

The half2 overload of __shfl_xor() was added in ROCm 5.6. This PR implements it for earlier HIP versions.
Fixes #7242
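
One way such a fallback can work is to reinterpret the 32-bit half2 payload as an integer, shuffle it with the integer __shfl_xor(), and reinterpret the result back. The sketch below models that idea on the host in plain C++ so it can be compiled without a GPU. It is not the code from this PR: `Half2Bits`, `shfl_xor_u32`, `shfl_xor_half2`, and `kWarpSize` are hypothetical names, the model assumes a full 32-lane warp and ignores the `width` argument, and in real device code a fallback like this would presumably sit behind a HIP version check.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for half2: two 16-bit halves stored as raw bits.
struct Half2Bits {
    std::uint16_t x;
    std::uint16_t y;
};
static_assert(sizeof(Half2Bits) == sizeof(std::uint32_t),
              "a half2 payload must fit exactly in one 32-bit word");

// Assumption: a 32-lane warp. Some AMD GPUs use 64-lane wavefronts.
constexpr int kWarpSize = 32;

using WarpU32   = std::array<std::uint32_t, kWarpSize>;
using WarpHalf2 = std::array<Half2Bits, kWarpSize>;

// Host-side model of the integer xor shuffle: every lane reads the value
// held by the lane whose index is (lane ^ lane_mask).
WarpU32 shfl_xor_u32(const WarpU32& in, int lane_mask) {
    assert(lane_mask >= 0 && lane_mask < kWarpSize);
    WarpU32 out{};
    for (int lane = 0; lane < kWarpSize; ++lane) {
        out[lane] = in[lane ^ lane_mask];
    }
    return out;
}

// half2 shuffle built on the integer one: copy the bits into a 32-bit word,
// shuffle the word, then copy the bits back out.
WarpHalf2 shfl_xor_half2(const WarpHalf2& in, int lane_mask) {
    WarpU32 packed{};
    for (int lane = 0; lane < kWarpSize; ++lane) {
        std::memcpy(&packed[lane], &in[lane], sizeof(std::uint32_t));
    }
    const WarpU32 shuffled = shfl_xor_u32(packed, lane_mask);
    WarpHalf2 out{};
    for (int lane = 0; lane < kWarpSize; ++lane) {
        std::memcpy(&out[lane], &shuffled[lane], sizeof(std::uint32_t));
    }
    return out;
}
```

The model uses `std::memcpy` for the bit reinterpretation because that is well defined in standard C++. Device code could use a different mechanism for the same reinterpretation.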

@mofosyne added the labels "Nvidia GPU" (Issues specific to Nvidia GPUs) and "Review Complexity : Medium" (Generally require more time to grok but manageable by beginner to medium expertise level) on May 14, 2024

@JohannesGaessler
Collaborator

Based on static code analysis alone, I approve. Unfortunately, I do not have a system set up to test this PR. Ideally, the person who initially reported the issue would confirm that the fix works. @Engininja2, I would still be able to merge this PR without that confirmation if you pledge to take care of any follow-up issues that arise from it (just in case; I don't think there will be any).

@Engininja2
Contributor Author

I tested it with the 5.5 Windows HIP SDK and main compiles & runs okay.

@JohannesGaessler JohannesGaessler merged commit d233b50 into ggml-org:master May 18, 2024

Development

Successfully merging this pull request may close these issues.

Compilation error using HIP SDK on Windows