Use a better error bound for `fp16` tests of the `rsum` microkernel. #6431

copybara-service · 2024-05-16T20:57:42Z

Use a better error bound for fp16 tests of the rsum microkernel.

PiperOrigin-RevId: 634519272

fbarchard · 2024-05-16T23:42:37Z

test/rsum-microkernel-tester.h

+      // We don't use the usual hard bound $\gamma_n = \frac{nu}{1 - nu}$ since
+      // for `fp16` $nu > 1$ already for $n > 2000$, rendering the bound
+      // meaningless.
+      const float fp16_u =


Make this one line.
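For context on the code comment in the hunk above, here is a minimal sketch of where the classical worst-case bound $\gamma_n = \frac{nu}{1 - nu}$ stops being usable. It assumes the `fp16` unit roundoff $u = 2^{-11}$ (about 4.88e-4). The helper name `HardGamma` is hypothetical and is not part of the tester.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper (not from the XNNPACK tester): the classical
// worst-case bound gamma_n = n*u / (1 - n*u). The bound is only defined
// while n*u < 1, so this sketch returns +infinity once n*u >= 1.
double HardGamma(double n, double u) {
  assert(n >= 0.0 && u > 0.0);
  const double nu = n * u;
  if (nu >= 1.0) {
    return std::numeric_limits<double>::infinity();
  }
  return nu / (1.0 - nu);
}
```

With $u = 2^{-11}$, `HardGamma(1024, u)` already equals 1.0, which allows a 100% relative error, and from $n = 2048$ onward the bound is undefined.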

fbarchard · 2024-05-16T23:45:38Z

test/rsum-microkernel-tester.h

+      const float fp16_u =
+          4.88281e-4;  // Half of the ULP, max relative rounding error.
+      const float n = batch_size() - 1;
+      const float gamma_n = std::expm1(5.0f * std::sqrt(n) * fp16_u +


This is really complicated and obscure, and it differs from all the other testers, which makes it hard to maintain.
Is rsum fp16 different from the other testers?
Assuming this is better, can we apply it to float and all the other testers?
Does it need to change for bf16 or qs8 comparisons?
It looks slow. Did you benchmark it?
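To make the questions about other data types concrete, here is a sketch that evaluates only the leading term visible in the hunk, `expm1(5 * sqrt(n) * u)`, for different unit roundoffs. The rest of that expression is cut off in the diff and is deliberately not reconstructed here, so `LeadingTermBound` (a hypothetical name) is not the tester's actual bound. The constants assume round-to-nearest: $u = 2^{-11}$ for fp16, $2^{-8}$ for bf16, and $2^{-24}$ for fp32.

```cpp
#include <cassert>
#include <cmath>

// Assumed unit roundoffs (half an ULP at 1.0) under round-to-nearest.
constexpr double kFp16U = 1.0 / 2048.0;      // 2^-11
constexpr double kBf16U = 1.0 / 256.0;       // 2^-8
constexpr double kFp32U = 1.0 / 16777216.0;  // 2^-24

// Hypothetical helper: only the leading term that is visible in the diff,
// expm1(lambda * sqrt(n) * u). The truncated remainder of the original
// expression is intentionally omitted.
double LeadingTermBound(double n, double u, double lambda = 5.0) {
  assert(n >= 0.0 && u > 0.0);
  return std::expm1(lambda * std::sqrt(n) * u);
}
```

For example, `LeadingTermBound(4096, kFp16U)` is roughly 0.169, a finite tolerance at a length where $nu \geq 1$ for fp16. The same call with `kFp32U` gives a value near 1.9e-5, and with `kBf16U` a much looser one, so whether the approach carries over to other types depends mainly on the chosen $u$.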

Use a better error bound for fp16 tests of the rsum microkernel.

a1a5016

PiperOrigin-RevId: 634519272

fbarchard reviewed May 16, 2024
