npmiller
Contributor

No description provided.

@npmiller npmiller requested a review from bader as a code owner July 30, 2021 14:52
bader previously approved these changes Jul 30, 2021
Contributor

@bader bader left a comment

Does it make sense to add sqrt for fp16 as well?

@bader bader added the libclc libclc project related issues label Jul 30, 2021
@npmiller
Contributor Author

> Does it make sense to add sqrt for fp16 as well?

Good point, I'll add it as well; it's just that I ran into an application using the double variant.

@bader
Contributor

bader commented Jul 30, 2021

> > Does it make sense to add sqrt for fp16 as well?

> Good point, I'll add it as well; it's just that I ran into an application using the double variant.

Considering that the typical built-in implementation for the amdgcn-amdhsa target is a simple wrapper around a compiler built-in, adding implementations for all the types seems like a good rule to follow.

bader previously approved these changes Jul 30, 2021
@npmiller
Contributor Author

> > > Does it make sense to add sqrt for fp16 as well?

> > Good point, I'll add it as well; it's just that I ran into an application using the double variant.

> Considering that the typical built-in implementation for the amdgcn-amdhsa target is a simple wrapper around a compiler built-in, adding implementations for all the types seems like a good rule to follow.

I tested this a bit further, and the current change actually breaks the build; adding fp16 support is more involved than I thought.

This is because the default build targets the tahiti architecture, I reckon as a lowest common denominator so that libclc works on as many GPUs as possible. That version of the ISA doesn't support fp16, so we'd need to update the target version or add a way to change it at build time. In addition, it seems that cl_khr_fp16 is defined anyway, so we can't use it in the code right now to skip the half variant for tahiti.

So we should probably leave out the half variant for now, until we can set up half support for AMD in libclc properly.
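The problem with the macro can be sketched in plain C. This is an illustration under stated assumptions, not the libclc build: the `#define` below only mimics the reported situation in which `cl_khr_fp16` is defined regardless of the target, and the `SKETCH_*` / `sketch_*` names are hypothetical.

```c
/* Assumption for illustration: mimic the reported situation by defining the
   extension macro unconditionally, regardless of the target architecture. */
#define cl_khr_fp16 1

/* The guard one would like to use to skip the half variant on tahiti. */
#ifdef cl_khr_fp16
#define SKETCH_HALF_VARIANT_BUILT 1
#else
#define SKETCH_HALF_VARIANT_BUILT 0
#endif

/* Reports whether the guarded half variant would have been compiled in. */
static int sketch_half_variant_built(void) {
    return SKETCH_HALF_VARIANT_BUILT;
}
```

Since the macro is always defined, the guard always selects the half variant, so it cannot be used to exclude it for an architecture without fp16 support.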

@npmiller
Contributor Author

I just force-pushed to remove the commit adding the fp16 variant, as it doesn't work. The other commit is untouched; see my previous comment for the reasoning.
