[libclc] Reduce bithacking for INF/NAN values #129738

frasercrmck · 2025-03-04T16:41:12Z

This commit makes use of 'native' floating-point INF and NAN operations to implement certain patterns previously reliant on bithacking.

To aid in this, a 'gentype' version of INF and NAN were introduced to be used with users of clc/math/gentype.inc.

This commit makes use of 'native' floating-point INF and NAN operations to implement certain patterns previously reliant on bithacking. To aid in this, a 'gentype' version of INF and NAN were introduced to be used with users of clc/math/gentype.inc.

github-actions · 2025-03-04T16:44:36Z

✅ With the latest revision this PR passed the C/C++ code formatter.

arsenm · 2025-03-04T16:49:41Z

libclc/clc/lib/generic/math/clc_hypot.inc

@@ -86,13 +84,10 @@ _CLC_DEF _CLC_OVERLOAD __CLC_GENTYPE __clc_hypot(__CLC_GENTYPE x,
  r = c ? s : r;

  // Check for NaN
-  c = __clc_isnan(x) || __clc_isnan(y);
-  r = c ? __CLC_AS_GENTYPE((__CLC_ULONGN)QNANBITPATT_DP64) : r;
+  r = __clc_isnan(x) || __clc_isnan(y) ? __CLC_GENTYPE_NAN : r;


Doesn't matter much, but can fold this condition into one __builtin_isunordered (or just let instcombine do it)

Above should also be rewritten with frexp and ldexp

Doesn't matter much, but can fold this condition into one __builtin_isunordered (or just let instcombine do it)

Yeah I think I'll just let instcombine do it (which it does).

Above should also be rewritten with frexp and ldexp

Sorry, I'm not seeing which bit - could you explain? I note that libclc's frexp and ldexp are more involved than any of the bithacking here (possibly due to subnormal support?) so I'm not sure what the goal would be.

arsenm · 2025-03-04T16:57:38Z

libclc/clc/lib/generic/math/clc_hypot.inc

-  retval = (ux == PINFBITPATT_SP32 || uy == PINFBITPATT_SP32)
-               ? __CLC_AS_GENTYPE((__CLC_UINTN)PINFBITPATT_SP32)
-               : retval;
+  retval = __clc_isinf(x) || __clc_isinf(y) ? __CLC_GENTYPE_INF : retval;


Above can be rewritten with frexp and ldexp.

The output of the temporary result also implies one or the other were infinity, you can fold this to one isinf check on the maximum of fabs x and y (which is done as integer above)

aux and auy can be rewritten to float typed fabs(x),y. ux and uy are then applying a umax, which can be replaced by fmax. The fmax result then has the implied infiniteness, and you can check if that is infinity down here.

The bithacking below that is frexp, and the scaling applied to fx and fy can be done with ldexp from the integer exponent obtained from the frexp

Right yeah, I see. Thank you for the explanation.

I'm still pondering the naming of the frexp and ldexp helpers you propose. It's not fully-fledged frexp or ldexp as far as I can tell. If it is, then our implementations of frexp and ldexp are far too complicated. Are we talking about a 'native' frexp and ldexp? A 'fast' one? That's just going on existing OpenCL terminology, which may or may not confuse things. Should they handle NaNs, Infs, subnormals?

AMDGPU instructions names call them frexp_exp and frexp_mant. It's just the two parts of the frexp, split into 2 instructions.

ldexp is just ldexp, there's nothing special to it

I can't tell from the documentation: what do frexp_exp and frexp_mant do with subnormal numbers? For exp, D0.i32 = exponent(S0.f32) - 127 + 1 implies it returns -126, which wouldn't be what frexp returns for the same input. So frexp relies on extra processing of frexp_exp and frexp_mant?

We would want consistent results across all libclc targets for a hypothetical __clc_frexp_exp and __clc_frexp_mant. We'd need to define the various edge cases and subnormal handling.

For targets without native support for either of these operations I worry that using these functions would be less optimal than doing bit manipulation/scaling/etc directly inline.

Might we not just end up in a scenario where AMDGPU in fact wants a custom implementation of __clc_hypot, rather than a generic solution for all targets involving frexp/ldexp?

frasercrmck added the libclc label Mar 4, 2025

frasercrmck requested a review from arsenm March 4, 2025 16:41

fix formatting

bfa4550

arsenm reviewed Mar 4, 2025

View reviewed changes

arsenm added the floating-point label Mar 4, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[libclc] Reduce bithacking for INF/NAN values #129738

[libclc] Reduce bithacking for INF/NAN values #129738

frasercrmck commented Mar 4, 2025

github-actions bot commented Mar 4, 2025 •

edited

Loading

arsenm Mar 4, 2025

arsenm Mar 4, 2025

frasercrmck Mar 5, 2025

arsenm Mar 4, 2025

arsenm Mar 5, 2025

frasercrmck Mar 5, 2025

arsenm Mar 5, 2025

frasercrmck Mar 6, 2025

[libclc] Reduce bithacking for INF/NAN values #129738

Are you sure you want to change the base?

[libclc] Reduce bithacking for INF/NAN values #129738

Conversation

frasercrmck commented Mar 4, 2025

github-actions bot commented Mar 4, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

github-actions bot commented Mar 4, 2025 •

edited

Loading