Fix half to float giving wrong results on older x86_64 CPUs on Windows (#358)

The lzcnt instruction is not supported on some older x86_64 CPUs. On these CPUs it will silently execute a bsr instruction instead, leading to wrong results.

Instead use the bsr instruction and one additional subtraction, which should have a negligible impact on performance. Additionally, this likely improves performance on ARM.

Thanks to Ray Molenkamp for tracking down this bug.

Signed-off-by: Brecht Van Lommel <brecht@blender.org>
AcademySoftwareFoundation · Jan 24, 2024 · 9a006c0
1 parent 7540434
Showing 1 changed file with 8 additions and 2 deletions.
diff --git a/src/Imath/half.h b/src/Imath/half.h
@@ -327,8 +327,14 @@ imath_half_to_float (imath_half_bits_t h)
         // other compilers may provide count-leading-zeros primitives,
         // but we need the community to inform us of the variants
         uint32_t lc;
-#    if defined(_MSC_VER) && (_M_IX86 || _M_X64)
-        lc = __lzcnt (hexpmant);
+#    if defined(_MSC_VER)
+        // The direct intrinsic for this is __lzcnt, but that is not supported
+        // on older x86_64 hardware or ARM. Instead use the bsr instruction
+        // and one additional subtraction. This assumes hexpmant != 0; for 0,
+        // bsr and lzcnt would behave differently.
+        unsigned long bsr;
+        _BitScanReverse (&bsr, hexpmant);
+        lc = (31 - bsr);
 #    elif defined(__GNUC__) || defined(__clang__)
         lc = (uint32_t) __builtin_clz (hexpmant);
 #    else
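
The relationship the patch relies on can be checked outside MSVC with a small portable sketch. This is only an illustration, not code from Imath: `emulated_bsr`, `reference_clz`, and `clz_via_bsr` are hypothetical helper names, and `emulated_bsr` is a plain-C stand-in for what `_BitScanReverse` reports (the index of the highest set bit). For any nonzero 32-bit input, `31 - bsr` should equal the leading-zero count, which is also why a CPU that runs bsr where lzcnt was intended produces a different number.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical portable stand-in for the x86 bsr instruction / MSVC
   _BitScanReverse: returns the index (0..31) of the highest set bit.
   Like bsr, it is only meaningful for x != 0. */
static uint32_t
emulated_bsr (uint32_t x)
{
    uint32_t index = 0;
    assert (x != 0);
    while (x >>= 1)
        ++index;
    return index;
}

/* Reference leading-zero count, scanning from the top bit down.
   Returns 32 for x == 0, matching what lzcnt reports for zero. */
static uint32_t
reference_clz (uint32_t x)
{
    uint32_t count = 0;
    for (uint32_t mask = 0x80000000u; mask != 0 && (x & mask) == 0; mask >>= 1)
        ++count;
    return count;
}

/* Same arithmetic as the patched MSVC branch: lc = 31 - bsr.
   Assumes x != 0, as the comment in the patch notes. */
static uint32_t
clz_via_bsr (uint32_t x)
{
    return 31 - emulated_bsr (x);
}
```

For example, with the input `0x400` the highest set bit is at index 10, so the bsr result is 10 while the leading-zero count is 21. Code that expected lzcnt but received the bsr result would therefore use 10 where 21 was needed.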