utilities for mixed-precision tests/benchmarks #352

Closed
ngc92 wants to merge 3 commits.

Conversation

ngc92 (Contributor) commented May 4, 2024

This allows us to compile a single executable that can serve as test/benchmark for f32, f16, and bf16 versions of the kernels. So far, I've updated only those test files which already defined a BF16 macro.

Caveat:
This will try to compile float, half, and bfloat16 versions into a single exe, so compilation fails if any of these is unavailable. This is something we need to improve at some point, once we have a general strategy in place for handling older hardware.
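To make the approach concrete, here is a minimal sketch of the single-executable idea, with the precision chosen at runtime. Everything here (run_test, the argv convention) is illustrative rather than the PR's actual code; only the type names come from the CUDA headers:

```cuda
// Sketch only: one binary containing float, half, and bfloat16 instantiations
// of a test, with the precision picked at runtime. run_test and the argv
// convention are hypothetical; the type names come from the CUDA headers.
#include <cstdio>
#include <cstring>
#include <cuda_fp16.h>
#include <cuda_bf16.h>

template <typename floatX>
void run_test() {
    // ... allocate floatX buffers, launch the kernel under test,
    // validate against an fp32 reference, time it ...
    printf("testing with sizeof(floatX) = %zu\n", sizeof(floatX));
}

int main(int argc, char** argv) {
    const char* prec = (argc > 1) ? argv[1] : "f32";
    if      (strcmp(prec, "f32")  == 0) run_test<float>();
    else if (strcmp(prec, "f16")  == 0) run_test<half>();
    else if (strcmp(prec, "bf16") == 0) run_test<__nv_bfloat16>();
    else { fprintf(stderr, "unknown precision: %s\n", prec); return 1; }
    return 0;
}
```

Because all three instantiations end up in one binary, a single unsupported type breaks the whole build, which is exactly the caveat above.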

ngc92 mentioned this pull request May 4, 2024
karpathy (Owner) commented May 5, 2024

This complicates dev/cuda quite a bit, with templates and macros, both a bit scary. What is the problem that it is trying to solve? Isn't it the case that our CI could just compile all the kernels separately for all precisions we care about and test them one by one?

ngc92 (Contributor, Author) commented May 6, 2024

It's less about automated testing and more about human testing and profiling, where I find it quite convenient not to have to recompile the tests for each precision. It's also about reducing duplication between the different kernel test files, so things don't drift out of sync.

Personally, I find the template solution much cleaner than moving the ifdefs into common.h and having floatX magically appear from there, but that would also be a solution to the problem.
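For contrast, the common.h alternative mentioned above would look roughly like this. A sketch: the BF16 macro is the one already used in the test files, while the FP16 spelling is an assumed counterpart:

```cuda
// Sketch of the common.h alternative: floatX is fixed per build via macros,
// so each precision is a separate compile (e.g. nvcc -DBF16 ...).
// BF16 is the macro mentioned above; FP16 is an assumed counterpart.
#include <cuda_fp16.h>
#include <cuda_bf16.h>

#if defined(BF16)
typedef __nv_bfloat16 floatX;
#elif defined(FP16)
typedef half floatX;
#else
typedef float floatX;
#endif
```

Each test file would then use floatX directly, at the cost of one build per precision instead of one binary covering all three.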

If you don't like the PR in its entirety, there are still individual pieces that should be merged; e.g., all the napkin math needs to be updated to actually reflect the size of the floatX type.
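A sketch of what that napkin-math fix looks like for a bandwidth estimate; the function name, parameters, and accounting convention here are hypothetical:

```cuda
// Sketch: estimate achieved bandwidth from the size of the element type
// actually under test, instead of hard-coding sizeof(float).
// All names and the accounting convention here are illustrative.
template <typename floatX>
float achieved_bandwidth_gbs(long num_elements, int accesses_per_element,
                             float elapsed_ms) {
    double bytes = (double)num_elements * accesses_per_element * sizeof(floatX);
    return (float)(bytes / elapsed_ms / 1e6); // bytes/ms -> GB/s
}
```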

karpathy (Owner) commented

will avoid for now, closing.

karpathy closed this May 10, 2024