Summary
Current test coverage for GPU kernel operations is incomplete. Most operations lack explicit NumPy comparison tests to verify correctness.
Current Coverage
✅ Tested (vs NumPy)
| Category | Operations | Test File |
|---|---|---|
| Elementwise | add, mul | `test_ops.py` |
| Matmul | matmul, matmul_tiled | `test_ops.py` |
| TF32 | matmul (TF32 mode) | `test_tf32_api.py` |
❌ Missing NumPy Tests
| Category | Operations |
|---|---|
| Elementwise | sub, div, add_inplace, mul_inplace, copy_to |
| Unary | exp, log, relu |
| Reduction | sum, mean, max, softmax |
| NN | gelu, silu, layernorm, rmsnorm |
| Matmul | batched_matmul, transpose, linear_bias_gelu |
| FP8/NVF4 | matmul_fp8*, gemv_*, quantize_* (SM-dependent) |
| Tensor | concat_axis0, repeat_interleave_axis1, transpose_3d_021, cast_* |
Proposed Test Structure
    import numpy as np
    import scipy.special

    # `gp` is the GPU library under test, imported as in the existing test files.


    class TestSubOperation:
        def test_sub_basic(self):
            a_np = np.random.rand(1024).astype(np.float32)
            b_np = np.random.rand(1024).astype(np.float32)
            a, b = gp.from_numpy(a_np), gp.from_numpy(b_np)
            result = gp.sub(a, b).to_numpy()
            np.testing.assert_array_almost_equal(result, a_np - b_np)


    class TestExpOperation:
        def test_exp_basic(self):
            x_np = np.random.rand(1024).astype(np.float32)
            x = gp.from_numpy(x_np)
            result = gp.exp(x).to_numpy()
            np.testing.assert_array_almost_equal(result, np.exp(x_np), decimal=5)


    class TestSoftmaxOperation:
        def test_softmax_1d(self):
            x_np = np.random.rand(128).astype(np.float32)
            x = gp.from_numpy(x_np)
            result = gp.softmax(x).to_numpy()
            expected = scipy.special.softmax(x_np)
            np.testing.assert_array_almost_equal(result, expected, decimal=5)

Acceptance Criteria
- All elementwise operations (`sub`, `div`, `add_inplace`, `mul_inplace`, `copy_to`) have NumPy tests
- All unary operations (`exp`, `log`, `relu`) have NumPy tests
- All reduction operations (`sum`, `mean`, `max`, `softmax`) have NumPy tests
- NN operations (`gelu`, `silu`, `layernorm`, `rmsnorm`) have NumPy/SciPy reference tests
- Tensor operations (`concat`, `transpose`, `cast`) have NumPy tests
- Tests use appropriate tolerances for floating-point comparison
- SM-dependent operations (FP8/NVF4) use `pytest.mark.skipif` for hardware availability
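
The NN criterion needs reference implementations to compare against. The following is a minimal sketch of what those could look like, not the project's actual helpers. The function names, the `eps=1e-5` default, the `gamma`/`beta` parameter names, and normalization over the last axis are all assumptions and should be matched to the real kernels. The GELU reference uses the exact erf form; if the GPU kernel uses the tanh approximation, the test would need a looser tolerance or a matching reference.

```python
import numpy as np
from scipy.special import erf, expit


def gelu_ref(x):
    """Exact (erf-based) GELU, computed in float64."""
    x = np.asarray(x, dtype=np.float64)
    return 0.5 * x * (1.0 + erf(x / np.sqrt(2.0)))


def silu_ref(x):
    """SiLU: x * sigmoid(x), computed in float64."""
    x = np.asarray(x, dtype=np.float64)
    return x * expit(x)


def layernorm_ref(x, gamma, beta, eps=1e-5):
    """Layer normalization over the last axis (eps value is an assumption)."""
    x = np.asarray(x, dtype=np.float64)
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)  # population variance (ddof=0)
    return (x - mean) / np.sqrt(var + eps) * gamma + beta


def rmsnorm_ref(x, gamma, eps=1e-5):
    """RMS normalization over the last axis (eps value is an assumption)."""
    x = np.asarray(x, dtype=np.float64)
    mean_sq = np.mean(x * x, axis=-1, keepdims=True)
    return x / np.sqrt(mean_sq + eps) * gamma
```

Computing the references in float64 keeps their own rounding error well below the float32 tolerance used in the comparison, so a failure points at the kernel rather than at the reference.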
Notes
- Use `np.testing.assert_array_almost_equal` with appropriate `decimal` parameter
- For operations like `softmax`, use `scipy.special.softmax` as reference
- For `layernorm`/`rmsnorm`, implement NumPy reference manually
- FP8/NVF4 tests should skip on unsupported hardware (SM < 90/120)
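
The hardware skip in the last note could be wired up roughly as follows. This is a sketch under stated assumptions: it reads "SM < 90/120" as FP8 needing SM 90 and NVF4 needing SM 120, and it detects the GPU by calling `nvidia-smi` because the library's own device-query API is not shown in this issue. The helper names (`parse_compute_cap`, `detect_sm_version`, `requires_sm90`, `requires_sm120`) and the placeholder test are hypothetical.

```python
import shutil
import subprocess

import pytest


def parse_compute_cap(text):
    """Convert a compute capability string such as '9.0' to an SM number (90)."""
    major, _, minor = text.strip().partition(".")
    return int(major) * 10 + int(minor or 0)


def detect_sm_version():
    """Return the SM version of the first GPU, or 0 if none can be queried."""
    if shutil.which("nvidia-smi") is None:
        return 0
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
            capture_output=True, text=True, check=True, timeout=10,
        ).stdout
        return parse_compute_cap(out.splitlines()[0])
    except (subprocess.SubprocessError, OSError, ValueError, IndexError):
        # Old drivers, missing GPUs, or unexpected output all count as "no support".
        return 0


SM_VERSION = detect_sm_version()

# Reusable markers; thresholds follow the assumed reading of "SM < 90/120".
requires_sm90 = pytest.mark.skipif(SM_VERSION < 90, reason="FP8 kernels need SM >= 90")
requires_sm120 = pytest.mark.skipif(SM_VERSION < 120, reason="NVF4 kernels need SM >= 120")


@requires_sm90
def test_fp8_op_placeholder():
    # Hypothetical placeholder: compare an FP8 kernel against a NumPy reference here.
    pytest.skip("placeholder test body")
```

If the library already exposes the device's compute capability, using that instead of `nvidia-smi` would avoid the subprocess call and keep the check consistent with the device the tests actually run on.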