Add cl_half.h header #60

jrprice · 2019-12-23T22:52:04Z

This is going to address #57 when complete. I'm posting it now to get early feedback on the approach before I go on to do the half->float direction and FP64.

This introduces a function to convert from FP32 to FP16. I've tested this by brute-forcing all FP32
values for all four rounding modes, and comparing against the functions in the CTS. I've also run the half suite from CTS using this function instead.

I've ended up essentially rewriting the function from scratch rather than copying the CTS versions, which is why I wanted to check with the group before pushing forwards. This new version does not depend on the hex float macros used in CTS, or on math.h. Also, while testing this I discovered that the versions in CTS do not always work correctly if certain compiler optimizations are enabled (due to the use of floating point arithmetic). This new version solely uses bitwise integer operations, so does not have this issue.

I've also opted to merge all the rounding modes into a single function, to avoid duplicating much of the code as is done in CTS. There's an enum to select the rounding mode. We could provide separate suffixed functions as well if that's desirable (e.g. cl_float_to_half_rte()).

I do not have any strong feelings about the function names, so feel free to suggest something different if desired.

jrprice · 2020-01-02T22:48:47Z

This now has the following three conversion routines:

FP32 -> FP16
FP16 -> FP32
FP64 -> FP16

I've tested all three in the half CTS suite on an NVIDIA GPU. I've also brute-forced all inputs for the first two, comparing bitwise to the same routines from CTS.

kpet

I've finally had a chance to look at this. This is great work!

I've also opted to merge all the rounding modes into a single function

I think this is better. The only potential issue would be that the code might be a bit slower but since the function is going to be inlined and the rounding mode known at compilation time in the vast majority of cases, I don't think this is going to be a problem in practice.

I do not have any strong feelings about the function names,

FWIW, I think they're good.

There is quite a bit of code duplication between the float and double routines but there isn't a single obvious way of resolving this that I've been able to find. A few ideas that might be worth exploring:

Introduce a common function that would take all the CL_{DBL,FLT}_ values as parameters. You'd need to either use 64-bit arithmetic in the float case or maybe come up with some way of moving to 32-bit arithmetic when possible.
Introduce a few utility functions for common blocks of code (thinking of the overflow checks for example).

I'm not sure either of these ideas would make the code really better and I don't think it's strictly required to spend time on this for this change to go in.

CL/cl_half.h

jrprice

Thanks for the review! I agree with the comments about performance - I don't think it's a big issue due to inlining as you mention. I also benchmarked the first drop of this code against the routines in CTS and it was faster, FWIW. I don't think performance is a particular priority for this header however (IMO).

I've pulled out some common routines for handling overflow and underflow. I agree that there's still a bunch of similar code between the FP32 and FP64 routines which could potentially be addressed, but I've left that for future work if somebody thinks it's important.

bashbaug

My comments are mostly cosmetic. I don't think any should necessarily hold up merging the changes, but given that this is the first "header-only utility library" we're considering, I think they are worth considering.

CL/cl_half.h

jrprice · 2020-01-09T22:37:13Z

The comment about Windows reminds me to mention that I have not tested this on Windows yet. It'd be great if somebody with a Windows platform is able to do this before we merge this.

bashbaug · 2020-01-09T23:37:29Z

I have not tested this on Windows yet.

Windows compiled but generated some warnings (tested VS2019):

CL/cl_half.h(266,59): warning C4293:  '<<': shift count negative or too big, undefined behavior
CL/cl_half.h(272,18): warning C4244:  'initializing': conversion from 'int64_t' to 'uint16_t', possible loss of data
CL/cl_half.h(280,23): warning C4244:  'initializing': conversion from 'uint64_t' to 'uint16_t', possible loss of data
CL/cl_half.h(316,43): warning C4293:  '<<': shift count negative or too big, undefined behavior
CL/cl_half.h(319,20): warning C4244:  '=': conversion from 'int64_t' to 'uint32_t', possible loss of data
CL/cl_half.h(328,19): warning C4244:  'initializing': conversion from 'uint64_t' to 'uint16_t', possible loss of data
CL/cl_half.h(331,42): warning C4334:  '<<': result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)

I believe the shift warnings are because unsigned long is still 32-bits on Windows, and can be fixed by replacing 1ul with (uint64_t)1. The remainder can be fixed with explicit casts.

CL/cl_half.h

kpet · 2020-01-10T12:33:14Z

I also benchmarked the first drop of this code against the routines in CTS and it was faster, FWIW. I don't think performance is a particular priority for this header however (IMO).

That's great to hear! Agree performance is not a primary concern.

I've pulled out some common routines for handling overflow and underflow.

Thanks! Looks good.

On naming generally:

Having said that I liked the function names, I'm now wondering whether we shouldn't prefix all of them with cl_half (or whatever we agree should be the prefix): cl_half_to_float, cl_half_from_double and cl_half_from_float.
@bashbaug I understand the concern but I really think this is functionality that we should ultimately provide in the cl namespace. I think what I'm saying is that I'd rather we document whatever additions we make (a spec appendix?) rather than move them to a separate namespace. One of the things I had in mind for future changes was to provide a better definition for cl_half for C++ applications where conversions and operations do the right thing automatically. If it looks like the discussions wouldn't converge reasonably quickly though, it might be better to just change the namespace for now (agree clu is reasonable) so we get forward progress. On the case, I think using snake case for these is a plus as it makes it clear these are not API functions.

bashbaug · 2020-01-10T16:05:34Z

what I'm saying is that I'd rather we document whatever additions we make (a spec appendix?) rather than move them to a separate namespace.

This seems like a reasonable solution too, if we would prefer to keep the cl prefix.

jrprice · 2020-01-13T22:04:22Z

I've fixed the Windows warnings (thanks @bashbaug!), and renamed the functions to cl_half_from_float() as suggested by @kpet .

I've left things in the cl namespace and using snake-case for now until we have consensus on the desired convention for these functions. I'm leaning (slightly) away from snake-case now since it makes them look a bit like the type definitions we have in the API. If we document them in a spec appendix, I'd be happy with clHalfToFloat(...) and clHalfFromFloat(...).

bashbaug · 2020-01-16T21:16:40Z

Thanks - the latest changes work for me, and I confirm that the Windows warnings are fixed.

kpet

LGTM!

kpet · 2020-03-20T15:26:24Z

We've got two approvals from two different implementers. This can't break existing functionatlity. This has been discussed several times in the working group; nobody opposed the change. I'm suggesting we just merge this, any objections?

bashbaug · 2020-03-22T22:38:25Z

any objections?

No objections, but do note that this file has the old license header and will need to be updated (see #76).

kpet · 2020-04-01T20:21:50Z

@jrprice With updated license headers, this could go in straight away ;).

Jeremy-Kemp

Late to the party, but this looks to be a very useful addition to the headers. Thanks for doing this.

kpet · 2020-04-28T13:38:27Z

@jrprice Ping! This just needs a change of license headers and then we can pop the champagne :).

jrprice · 2020-04-28T19:51:51Z

Sorry for the radio silence. The reason I've been holding off on doing this is because I'm still not clear that this header is best suited for this repository as opposed to the utility library in the OpenCL SDK, and we haven't had a chance to discuss this yet. Now that the fact we're doing an SDK is public maybe we can discuss this here.

So, given that this is a header providing utility functions, shouldn't it be in the SDK's utility library instead? I'm not sure I understand @kpet's previous point that having these here means we could have automatic conversion for C++ - you'd still need to include an extra header (i.e. not the same one that provides the cl_half definition), so surely that's no different from providing the functionality in the SDK instead?

kpet · 2020-04-28T20:17:01Z

I had a suspicion that was the case :). The argument was that we could with further work do that provided the conversion routines live in the headers repo. There are a number of approaches that come to mind:

Include cl_half.h from cl_platform.h, always or only for C++. Doesn't sound quite right but can be made fully backwards compatible.
Remove the existing definition in cl_platform.h and include cl_half.h from cl.h. Might break some code that includes cl_platform.h directly.
Any other?

If we're giving up on trying to do that, then I agree that these routines are definitely better placed in the SDK. The question of the type remains though: would we want to provide a better definition for cl_half in the SDK and make the one in headers opt-out? Would we give up altogether on trying to make the right conversions automatic?

CLAassistant · 2020-04-29T21:42:15Z

All committers have signed the CLA.

jrprice · 2020-04-29T21:44:46Z

The part that hadn't clicked for me was that a different definition of cl_half would be needed to provide these conversion functions, which is why it would need to be done here. I have no further objections :-)

I've updated the license and also added some test coverage (not at all comprehensive, but better than nothing).

kpet

Thanks for adding the tests! Looks good for a first set but I'm guessing we'll want to increase coverage over time.

kpet · 2020-05-12T11:04:24Z

There's been agreement on this PR for a while, we now have tests and correct license headers. Merging.

jrprice changed the title ~~Add cl_half.h header~~ [WIP] Add cl_half.h header Dec 23, 2019

jrprice force-pushed the cl-half-h branch from 6077bba to 1b260d4 Compare December 30, 2019 22:22

Add cl_half.h header

7d7f40c

jrprice force-pushed the cl-half-h branch from 1b260d4 to 7d7f40c Compare January 2, 2020 22:45

jrprice changed the title ~~[WIP] Add cl_half.h header~~ Add cl_half.h header Jan 2, 2020

jrprice requested a review from kpet January 2, 2020 22:46

kpet requested changes Jan 9, 2020

View reviewed changes

Make shared utility for handling {over,under}flow

e9cb804

jrprice commented Jan 9, 2020

View reviewed changes

bashbaug reviewed Jan 9, 2020

View reviewed changes

CL/cl_half.h Show resolved Hide resolved

CL/cl_half.h Show resolved Hide resolved

CL/cl_half.h Show resolved Hide resolved

Include stdint.h

5bc63a4

kpet reviewed Jan 10, 2020

View reviewed changes

CL/cl_half.h Outdated Show resolved Hide resolved

jrprice added 4 commits January 13, 2020 16:25

Prefix all new functions with cl_half_

943493d

Address Windows warnings

4dff599

Refactor lsb_pos calculation

f901f83

Avoid leading double-underscore in header guard

fcf3363

bashbaug approved these changes Jan 16, 2020

View reviewed changes

kpet approved these changes Feb 27, 2020

View reviewed changes

Jeremy-Kemp approved these changes Apr 17, 2020

View reviewed changes

jrprice added 3 commits April 29, 2020 16:33

Merge branch 'master' into cl-half-h

ab8d053

Update license

461e4e7

Add tests for cl_half.h

e6c705d

kpet approved these changes May 5, 2020

View reviewed changes

kpet merged commit 5cc337c into KhronosGroup:master May 12, 2020

This was referenced Aug 5, 2020

Use the float<->half conversion routines from the headers KhronosGroup/OpenCL-CTS#870

Closed

Provide conversion functions between cl_half and other types #57

Closed

Make conversions between cl_half and integer types automatic in C++ applications #102

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add cl_half.h header #60

Add cl_half.h header #60

jrprice commented Dec 23, 2019

jrprice commented Jan 2, 2020

kpet left a comment

jrprice left a comment

bashbaug left a comment

jrprice commented Jan 9, 2020

bashbaug commented Jan 9, 2020

kpet commented Jan 10, 2020

bashbaug commented Jan 10, 2020

jrprice commented Jan 13, 2020

bashbaug commented Jan 16, 2020

kpet left a comment

kpet commented Mar 20, 2020

bashbaug commented Mar 22, 2020

kpet commented Apr 1, 2020

Jeremy-Kemp left a comment

kpet commented Apr 28, 2020

jrprice commented Apr 28, 2020

kpet commented Apr 28, 2020

CLAassistant commented Apr 29, 2020 •

edited

Loading

jrprice commented Apr 29, 2020

kpet left a comment

kpet commented May 12, 2020

Add cl_half.h header #60

Add cl_half.h header #60

Conversation

jrprice commented Dec 23, 2019

jrprice commented Jan 2, 2020

kpet left a comment

Choose a reason for hiding this comment

jrprice left a comment

Choose a reason for hiding this comment

bashbaug left a comment

Choose a reason for hiding this comment

jrprice commented Jan 9, 2020

bashbaug commented Jan 9, 2020

kpet commented Jan 10, 2020

bashbaug commented Jan 10, 2020

jrprice commented Jan 13, 2020

bashbaug commented Jan 16, 2020

kpet left a comment

Choose a reason for hiding this comment

kpet commented Mar 20, 2020

bashbaug commented Mar 22, 2020

kpet commented Apr 1, 2020

Jeremy-Kemp left a comment

Choose a reason for hiding this comment

kpet commented Apr 28, 2020

jrprice commented Apr 28, 2020

kpet commented Apr 28, 2020

CLAassistant commented Apr 29, 2020 • edited Loading

jrprice commented Apr 29, 2020

kpet left a comment

Choose a reason for hiding this comment

kpet commented May 12, 2020

CLAassistant commented Apr 29, 2020 •

edited

Loading