Fix half to float giving wrong results on older x86_64 CPUs on Windows (#358)

The lzcnt instruction is not supported on some older x86_64 CPUs. On these
CPUs it will silently execute a bsr instruction instead, leading to wrong
results.

Instead use the bsr instruction and one additional subtraction, which should
have a negligible impact on performance. Additionally, this likely improves
performance on ARM.

Thanks to Ray Molenkamp for tracking down this bug.

Signed-off-by: Brecht Van Lommel <brecht@blender.org>
brechtvl authored and cary-ilm committed Jan 24, 2024
1 parent 7540434 commit 9a006c0
Showing 1 changed file with 8 additions and 2 deletions.
10 changes: 8 additions & 2 deletions src/Imath/half.h
@@ -327,8 +327,14 @@ imath_half_to_float (imath_half_bits_t h)
 // other compilers may provide count-leading-zeros primitives,
 // but we need the community to inform us of the variants
 uint32_t lc;
-# if defined(_MSC_VER) && (_M_IX86 || _M_X64)
-lc = __lzcnt (hexpmant);
+# if defined(_MSC_VER)
+// The direct intrinsic for this is __lzcnt, but that is not supported
+// on older x86_64 hardware or ARM. Instead use the bsr instruction
+// and one additional subtraction. This assumes hexpmant != 0; for 0,
+// bsr and lzcnt would behave differently.
+unsigned long bsr;
+_BitScanReverse (&bsr, hexpmant);
+lc = (31 - bsr);
 # elif defined(__GNUC__) || defined(__clang__)
 lc = (uint32_t) __builtin_clz (hexpmant);
 # else
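
The identity the patch relies on can be checked with a small portable sketch. The helpers below (bsr_index, clz_reference, clz_from_bsr) are hypothetical names for illustration and are not part of Imath; bsr_index is a software stand-in for _BitScanReverse, so the code builds without MSVC. It assumes a 32-bit input, as in the patched code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for _BitScanReverse: index (0..31) of the highest
   set bit. As with the bsr instruction, the result is only meaningful
   for x != 0; this version happens to return 0 for x == 0. */
static uint32_t
bsr_index (uint32_t x)
{
    uint32_t idx = 0;
    while (x >>= 1)
        ++idx;
    return idx;
}

/* Reference count-leading-zeros: scan down from bit 31 until a set bit is
   found. Returns 32 for x == 0, which is what lzcnt reports. */
static uint32_t
clz_reference (uint32_t x)
{
    uint32_t n = 0;
    for (uint32_t mask = 0x80000000u; mask != 0 && (x & mask) == 0; mask >>= 1)
        ++n;
    return n;
}

/* The replacement used in the patch: 31 minus the highest set bit index.
   Valid only for nonzero input, so the precondition is asserted. */
static uint32_t
clz_from_bsr (uint32_t x)
{
    assert (x != 0);
    return 31u - bsr_index (x);
}
```

For x = 0x400 (only bit 10 set), bsr_index gives 10 while the leading-zero count is 21. A CPU that silently runs bsr where lzcnt was intended therefore hands back 10 instead of 21, which is the wrong result this commit fixes; subtracting from 31 recovers the expected count.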
