Possible performance improvements to half float conversion #76

XMConvertHalfToFloat and XMConvertFloatToHalf both use a large number of integer ops when F16 intrinsics aren't available. It may be faster to do the conversion with floating point operations. XMConvertHalfToFloat also has a while loop to normalize denormals, which is particularly slow.

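Float-to-half conversion can use a trick: for positive numbers, f + max(f, 2^-14) produces a float whose exponent is at a fixed bias from the half float's exponent. It also handles denormals and zero, and only needs 2 ops. (2^-14 is the smallest normal half. For inputs at or above it the sum is 2f, which bumps the exponent by one; for inputs below it the sum lands in [2^-14, 2^-13), so the half denormal mantissa ends up in the top mantissa bits of the float.) Bit-exactness in this case is sensitive to how the dropped mantissa bits are handled in the denormal case, though.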
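A minimal sketch of the idea, to make the bit manipulation concrete. `FloatToHalfSketch` is a hypothetical name, not a DirectXMath function. The sketch clamps against 2^-14 (the smallest normal half), since that is the value at which the sum lands at a fixed exponent offset of 113 from the half encoding. It assumes a finite input whose magnitude fits in half range, handles the sign separately, and truncates the dropped mantissa bits for normal values, so it is not expected to match the rounding of XMConvertFloatToHalf bit for bit.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>

// Hypothetical helper for illustration only; not part of DirectXMath.
// Assumes |value| is finite and no larger than 65504 (the largest half).
inline uint16_t FloatToHalfSketch(float value)
{
    uint32_t bits;
    std::memcpy(&bits, &value, sizeof(bits));

    // Peel off the sign so the trick only sees a non-negative number.
    const uint32_t sign = (bits >> 16) & 0x8000u;
    bits &= 0x7FFFFFFFu;
    float f;
    std::memcpy(&f, &bits, sizeof(f));

    // The two floating point ops: max, then add.
    // f >= 2^-14: result is 2f (exponent + 1, mantissa unchanged).
    // f <  2^-14: result is in [2^-14, 2^-13), denormal mantissa on top.
    const float minNormalHalf = 1.0f / 16384.0f; // 2^-14, exact in float
    const float biased = f + std::max(f, minNormalHalf);

    uint32_t biasedBits;
    std::memcpy(&biasedBits, &biased, sizeof(biasedBits));

    // Float exponent field 113 now corresponds to half exponent field 0.
    // Shifting by 13 drops the low mantissa bits (truncation, no rounding).
    const uint32_t half = (biasedBits - (113u << 23)) >> 13;
    return static_cast<uint16_t>(sign | half);
}
```

Infinity, NaN, overflow, and round-to-nearest would still need extra handling on top of this.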

Half-to-float can handle denormals (and zero) by converting the 10-bit mantissa, taken as an integer, to float and multiplying it by 2^-24. That should be faster than the loop.
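A rough sketch of that denormal path, with the normal and Inf/NaN cases filled in by plain bit shifts so the example is complete. `HalfToFloatSketch` is a hypothetical name, not a DirectXMath function, and the branch layout is an assumption made for readability rather than a tuned implementation.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper for illustration only; not part of DirectXMath.
inline float HalfToFloatSketch(uint16_t h)
{
    const uint32_t sign = (h & 0x8000u) << 16;
    const uint32_t exponent = (h >> 10) & 0x1Fu;
    const uint32_t mantissa = h & 0x3FFu;

    uint32_t bits;
    if (exponent == 0)
    {
        // Denormal or zero: value is mantissa * 2^-24. One int-to-float
        // conversion and one multiply replace the normalization loop.
        const float magnitude =
            static_cast<float>(mantissa) * (1.0f / 16777216.0f);
        std::memcpy(&bits, &magnitude, sizeof(bits));
    }
    else if (exponent == 31)
    {
        // Infinity or NaN: all-ones float exponent, widened mantissa.
        bits = 0x7F800000u | (mantissa << 13);
    }
    else
    {
        // Normal: rebias the exponent (127 - 15 = 112), widen the mantissa.
        bits = ((exponent + 112u) << 23) | (mantissa << 13);
    }

    bits |= sign;
    float result;
    std::memcpy(&result, &bits, sizeof(result));
    return result;
}
```

The product is exact because the mantissa is at most 1023 and 2^-24 is a normal float, so the result does not depend on the float denormal mode.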
