Add cl_half.h header #60
This now has the following three conversion routines:
I've tested all three in the |
I've finally had a chance to look at this. This is great work!
I've also opted to merge all the rounding modes into a single function
I think this is better. The only potential issue would be that the code might be a bit slower but since the function is going to be inlined and the rounding mode known at compilation time in the vast majority of cases, I don't think this is going to be a problem in practice.
I do not have any strong feelings about the function names,
FWIW, I think they're good.
There is quite a bit of code duplication between the float and double routines but there isn't a single obvious way of resolving this that I've been able to find. A few ideas that might be worth exploring:
- Introduce a common function that would take all the CL_{DBL,FLT}_ values as parameters. You'd need to either use 64-bit arithmetic in the float case or maybe come up with some way of moving to 32-bit arithmetic when possible.
- Introduce a few utility functions for common blocks of code (thinking of the overflow checks for example).
I'm not sure either of these ideas would make the code really better and I don't think it's strictly required to spend time on this for this change to go in.
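To make the second idea concrete, here is a minimal sketch of what shared overflow and underflow helpers could look like. All names (`sketch_rounding_mode`, `sketch_half_overflow`, `sketch_half_underflow`) are hypothetical and are not taken from the header under review. The sketch assumes the usual directed-rounding convention: a result that is too large becomes infinity only if the rounding mode moves away from zero for that sign, and otherwise clamps to the largest finite half value.

```c
#include <stdint.h>

/* Hypothetical rounding-mode enum, for illustration only. */
typedef enum {
    SKETCH_RTE, /* round to nearest, ties to even */
    SKETCH_RTZ, /* round toward zero */
    SKETCH_RTP, /* round toward +infinity */
    SKETCH_RTN  /* round toward -infinity */
} sketch_rounding_mode;

#define SKETCH_HALF_INF           0x7C00u /* exponent all ones, mantissa zero */
#define SKETCH_HALF_MAX           0x7BFFu /* largest finite half (65504) */
#define SKETCH_HALF_MIN_SUBNORMAL 0x0001u /* smallest subnormal half (2^-24) */

/* Result for a finite input whose magnitude is too large for half.
 * `sign` is either 0 or 0x8000. */
static inline uint16_t sketch_half_overflow(sketch_rounding_mode mode,
                                            uint16_t sign)
{
    if (mode == SKETCH_RTZ ||
        (mode == SKETCH_RTP && sign) ||   /* negative, rounding up   */
        (mode == SKETCH_RTN && !sign)) {  /* positive, rounding down */
        return (uint16_t)(sign | SKETCH_HALF_MAX);
    }
    return (uint16_t)(sign | SKETCH_HALF_INF);
}

/* Result for a nonzero input whose magnitude is below half of the
 * smallest subnormal, so that nearest rounding would give zero. */
static inline uint16_t sketch_half_underflow(sketch_rounding_mode mode,
                                             uint16_t sign)
{
    if ((mode == SKETCH_RTP && !sign) ||  /* positive, rounding up   */
        (mode == SKETCH_RTN && sign)) {   /* negative, rounding down */
        return (uint16_t)(sign | SKETCH_HALF_MIN_SUBNORMAL);
    }
    return sign; /* signed zero */
}
```

Because the helpers depend only on the sign and the rounding mode, both a float and a double conversion routine could call them unchanged once they have detected the out-of-range case.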
Thanks for the review! I agree with the comments about performance - I don't think it's a big issue due to inlining as you mention. I also benchmarked the first drop of this code against the routines in CTS and it was faster, FWIW. I don't think performance is a particular priority for this header however (IMO).
I've pulled out some common routines for handling overflow and underflow. I agree that there's still a bunch of similar code between the FP32 and FP64 routines which could potentially be addressed, but I've left that for future work if somebody thinks it's important.
My comments are mostly cosmetic. I don't think any should necessarily hold up merging the changes, but given that this is the first "header-only utility library" we're considering, I think they are worth considering.
The comment about Windows reminds me to mention that I have not tested this on Windows yet. It'd be great if somebody with a Windows platform is able to do this before we merge this.
Windows compiled but generated some warnings (tested VS2019):
I believe the shift warnings are because |
That's great to hear! Agree performance is not a primary concern.
Thanks! Looks good. On naming generally:
This seems like a reasonable solution too, if we would prefer to keep the |
I've fixed the Windows warnings (thanks @bashbaug!), and renamed the functions to I've left things in the |
Thanks - the latest changes work for me, and I confirm that the Windows warnings are fixed.
LGTM!
We've got two approvals from two different implementers. This can't break existing functionality. This has been discussed several times in the working group; nobody opposed the change. I'm suggesting we just merge this, any objections?
No objections, but do note that this file has the old license header and will need to be updated (see #76).
@jrprice With updated license headers, this could go in straight away ;).
Late to the party, but this looks to be a very useful addition to the headers. Thanks for doing this.
@jrprice Ping! This just needs a change of license headers and then we can pop the champagne :).
Sorry for the radio silence. The reason I've been holding off on doing this is because I'm still not clear that this header is best suited for this repository as opposed to the utility library in the OpenCL SDK, and we haven't had a chance to discuss this yet. Now that the fact we're doing an SDK is public maybe we can discuss this here. So, given that this is a header providing utility functions, shouldn't it be in the SDK's utility library instead? I'm not sure I understand @kpet's previous point that having these here means we could have automatic conversion for C++ - you'd still need to include an extra header (i.e. not the same one that provides the |
I had a suspicion that was the case :). The argument was that we could with further work do that provided the conversion routines live in the headers repo. There are a number of approaches that come to mind:
If we're giving up on trying to do that, then I agree that these routines are definitely better placed in the SDK. The question of the type remains though: would we want to provide a better definition for |
The part that hadn't clicked for me was that a different definition of I've updated the license and also added some test coverage (not at all comprehensive, but better than nothing). |
Thanks for adding the tests! Looks good for a first set but I'm guessing we'll want to increase coverage over time.
There's been agreement on this PR for a while, we now have tests and correct license headers. Merging.
This is going to address #57 when complete. I'm posting it now to get early feedback on the approach before I go on to do the half->float direction and FP64.
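For reference, the half->float direction mentioned above is the simpler of the two, since every FP16 value is exactly representable in FP32 and no rounding is needed. The following is only a sketch using integer operations; the name `sketch_half_to_float` is hypothetical and the code is not taken from this PR.

```c
#include <stdint.h>
#include <string.h>

/* Widen an IEEE 754 binary16 bit pattern to a float using only
 * integer operations. Hypothetical helper, for illustration only. */
static float sketch_half_to_float(uint16_t h)
{
    uint32_t sign = ((uint32_t)h & 0x8000u) << 16;
    uint32_t exp  = ((uint32_t)h >> 10) & 0x1Fu;
    uint32_t mant = (uint32_t)h & 0x03FFu;
    uint32_t bits;
    float f;

    if (exp == 0x1Fu) {
        /* Infinity or NaN: all-ones exponent, payload moved up. */
        bits = sign | 0x7F800000u | (mant << 13);
    } else if (exp != 0) {
        /* Normal number: rebias the exponent from 15 to 127. */
        bits = sign | ((exp + 112u) << 23) | (mant << 13);
    } else if (mant == 0) {
        /* Signed zero. */
        bits = sign;
    } else {
        /* Subnormal half: normalize, since it is a normal float.
         * 113 is the biased FP32 exponent of 2^-14. */
        uint32_t e = 113u;
        while ((mant & 0x0400u) == 0) {
            mant <<= 1;
            e--;
        }
        mant &= 0x03FFu; /* drop the now-implicit leading bit */
        bits = sign | (e << 23) | (mant << 13);
    }

    memcpy(&f, &bits, sizeof f); /* bit copy, avoids aliasing issues */
    return f;
}
```

For example, the bit pattern 0x3C00 widens to 1.0f and 0x7BFF to 65504.0f, the largest finite half value.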
This introduces a function to convert from FP32 to FP16. I've tested this by brute-forcing all FP32 values for all four rounding modes, and comparing against the functions in the CTS. I've also run the half suite from CTS using this function instead.
I've ended up essentially rewriting the function from scratch rather than copying the CTS versions, which is why I wanted to check with the group before pushing forwards. This new version does not depend on the hex float macros used in CTS, or on math.h. Also, while testing this I discovered that the versions in CTS do not always work correctly if certain compiler optimizations are enabled (due to the use of floating point arithmetic). This new version solely uses bitwise integer operations, so does not have this issue.
I've also opted to merge all the rounding modes into a single function, to avoid duplicating much of the code as is done in CTS. There's an enum to select the rounding mode. We could provide separate suffixed functions as well if that's desirable (e.g. cl_float_to_half_rte()).
I do not have any strong feelings about the function names, so feel free to suggest something different if desired.
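To illustrate the approach described here (a single conversion function, an enum selecting the rounding mode, and only bitwise integer operations), the following is a minimal sketch. It is not the code from this PR: the names `sketch_rounding_mode`, `sketch_round_up` and `sketch_float_to_half` are hypothetical, and the sketch favours readability over speed.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical rounding-mode enum, for illustration only. */
typedef enum {
    SKETCH_RTE, /* round to nearest, ties to even */
    SKETCH_RTZ, /* round toward zero */
    SKETCH_RTP, /* round toward +infinity */
    SKETCH_RTN  /* round toward -infinity */
} sketch_rounding_mode;

/* Should the truncated magnitude be bumped up by one unit? */
static int sketch_round_up(sketch_rounding_mode mode, int negative,
                           uint32_t kept_lsb, uint32_t dropped,
                           uint32_t half_point)
{
    switch (mode) {
    case SKETCH_RTE:
        return dropped > half_point || (dropped == half_point && kept_lsb);
    case SKETCH_RTP: return !negative && dropped != 0;
    case SKETCH_RTN: return negative && dropped != 0;
    case SKETCH_RTZ: break;
    }
    return 0;
}

static uint16_t sketch_float_to_half(float f, sketch_rounding_mode mode)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits); /* bit copy, no float arithmetic */

    uint16_t sign = (uint16_t)((bits >> 16) & 0x8000u);
    int negative = sign != 0;
    uint32_t exp = (bits >> 23) & 0xFFu;
    uint32_t mant = bits & 0x007FFFFFu;
    int32_t e = (int32_t)exp - 127 + 15; /* exponent rebiased for half */
    uint32_t h, dropped, half_point;

    if (exp == 0xFFu) { /* infinity, or NaN (quieted, top payload kept) */
        if (mant == 0) return (uint16_t)(sign | 0x7C00u);
        return (uint16_t)(sign | 0x7E00u | (mant >> 13));
    }
    if (e >= 31) { /* too large: infinity or largest finite, by mode */
        int to_inf = mode == SKETCH_RTE || (mode == SKETCH_RTP && !negative)
                     || (mode == SKETCH_RTN && negative);
        return (uint16_t)(sign | (to_inf ? 0x7C00u : 0x7BFFu));
    }
    if (e >= 1) { /* result is a normal half: drop 13 mantissa bits */
        h = ((uint32_t)e << 10) | (mant >> 13);
        dropped = mant & 0x1FFFu;
        half_point = 0x1000u;
    } else { /* result is subnormal or zero: shift further right */
        uint32_t m = exp ? (mant | 0x00800000u) : mant;
        int32_t shift = 14 - e;
        if (shift > 25) shift = 25; /* everything is dropped anyway */
        h = m >> shift;
        dropped = m & ((1u << shift) - 1u);
        half_point = 1u << (shift - 1);
    }
    /* A carry out of the mantissa correctly bumps the exponent,
     * including from the largest finite value up to infinity. */
    if (sketch_round_up(mode, negative, h & 1u, dropped, half_point)) h++;
    return (uint16_t)(sign | h);
}
```

With this sketch, 1.0f converts to 0x3C00 in every mode, while 65520.0f (exactly halfway between the largest finite half and the next power of two) becomes 0x7C00 with SKETCH_RTE and 0x7BFF with SKETCH_RTZ. When the mode argument is a compile-time constant and the function is inlined, a compiler can typically fold the mode checks away.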