feat: reorganize casting utilities and enhance CPU support #16

Merged
voltjia merged 3 commits into feat/dev-infra from feat/dev-multi-dtypes
Mar 11, 2026

@zhangyue207
Collaborator

  • Moved casting functions to separate CPU and CUDA headers for better organization.
  • Introduced a new Cast() function in the CPU implementation to handle type conversions, including support for custom types like float16_t and bfloat16_t.
  • Updated various operators to utilize the new casting utilities, ensuring consistent type handling across CPU and CUDA backends.
  • Enhanced test cases to cover additional data types and ensure compatibility with the new casting logic.
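
To make the second bullet concrete, here is a minimal sketch of what a CPU-side cast helper with custom 16-bit float support could look like. It is not the code from this PR: `Bf16`, `FloatToBf16`, `Bf16ToFloat`, and `CastSketch` are hypothetical names chosen for illustration, and the real `Cast()` and `bfloat16_t` may be structured differently. The sketch assumes C++17 and routes every custom-type conversion through `float`.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for bfloat16_t: the upper 16 bits of an IEEE-754
// binary32 value. The project's real type is not shown in this PR page.
struct Bf16 {
  std::uint16_t bits;
};

// float -> bfloat16 using round-to-nearest-even on the dropped 16 bits.
inline Bf16 FloatToBf16(float value) {
  std::uint32_t u;
  std::memcpy(&u, &value, sizeof(u));
  if ((u & 0x7fffffffu) > 0x7f800000u) {
    // NaN: set a mantissa bit so truncation cannot turn it into infinity.
    return Bf16{static_cast<std::uint16_t>((u >> 16) | 0x0040u)};
  }
  const std::uint32_t rounding_bias = 0x7fffu + ((u >> 16) & 1u);
  return Bf16{static_cast<std::uint16_t>((u + rounding_bias) >> 16)};
}

// bfloat16 -> float is exact: shift the stored bits back into place.
inline float Bf16ToFloat(Bf16 value) {
  const std::uint32_t u = static_cast<std::uint32_t>(value.bits) << 16;
  float out;
  std::memcpy(&out, &u, sizeof(out));
  return out;
}

// Single entry point: custom types go through float, native types use
// static_cast directly.
template <typename To, typename From>
To CastSketch(From from) {
  if constexpr (std::is_same_v<To, From>) {
    return from;
  } else if constexpr (std::is_same_v<From, Bf16>) {
    return static_cast<To>(Bf16ToFloat(from));
  } else if constexpr (std::is_same_v<To, Bf16>) {
    return FloatToBf16(static_cast<float>(from));
  } else {
    return static_cast<To>(from);
  }
}
```

With a helper of this shape, an operator kernel can write `CastSketch<float>(x)` before accumulating and `CastSketch<T>(acc)` when storing, without branching on the element type at each call site.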

Test

pytest -k cpu    # or: pytest -k cuda

@zhangyue207
Collaborator Author

(python3.10) zhangyue@server:~/InfiniOps$ pytest tests/ -k cuda
================================= test session starts ==================================
platform linux -- Python 3.10.19, pytest-9.0.2, pluggy-1.6.0
rootdir: /home/zhangyue/InfiniOps
configfile: pyproject.toml
plugins: xdist-3.8.0, cov-7.0.0
collected 3372 items / 1686 deselected / 1686 selected                                 

tests/test_add.py .............................................................. [  3%]
..............................................                                   [  6%]
tests/test_causal_softmax.py ..................                                  [  7%]
tests/test_gemm.py ............................................................. [ 11%]
................................................................................ [ 15%]
................................................................................ [ 20%]
................................................................................ [ 25%]
................................................................................ [ 30%]
................................................................................ [ 34%]
................................................................................ [ 39%]
................................................................................ [ 44%]
................................................................................ [ 49%]
................................................................................ [ 53%]
................................................................................ [ 58%]
................................................................................ [ 63%]
................................................................................ [ 68%]
................................................................................ [ 72%]
................................................................................ [ 77%]
................................................................................ [ 82%]
................................................................................ [ 87%]
................................................................................ [ 91%]
...............................................................................  [ 96%]
tests/test_rms_norm.py ....................................                      [ 98%]
tests/test_swiglu.py ........................                                    [100%]

======================== 1686 passed, 1686 deselected in 5.30s =========================

- Increased the tolerance for `bfloat16` from `1e-2` to `2e-2` to better accommodate numerical precision in tests.
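
As background for the tolerance change, the sketch below measures how much relative error a single round-trip through bfloat16 precision introduces. The PR does not give a derivation for the `2e-2` value, so this is only an illustration: `RoundTripBf16` and `MaxRelativeRoundingError` are hypothetical helpers, and the rounding mode (round-to-nearest-even) is an assumption. bfloat16 keeps 8 significant bits, so one rounding stays below 2^-8 (about 0.0039); a few chained operations can plausibly push the accumulated error toward `1e-2`.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Round a finite float to bfloat16 precision (round-to-nearest-even) and
// widen it back to float. Hypothetical helper, not the project's code.
inline float RoundTripBf16(float value) {
  std::uint32_t u;
  std::memcpy(&u, &value, sizeof(u));
  const std::uint32_t rounding_bias = 0x7fffu + ((u >> 16) & 1u);
  u = ((u + rounding_bias) >> 16) << 16;  // keep only the upper 16 bits
  float out;
  std::memcpy(&out, &u, sizeof(out));
  return out;
}

// Largest relative rounding error over `steps` evenly spaced samples of
// the interval [1, 2).
inline double MaxRelativeRoundingError(int steps) {
  double worst = 0.0;
  for (int i = 0; i < steps; ++i) {
    const float x = 1.0f + static_cast<float>(i) / static_cast<float>(steps);
    const double rounded = static_cast<double>(RoundTripBf16(x));
    const double err = std::fabs(rounded - static_cast<double>(x)) /
                       static_cast<double>(x);
    if (err > worst) worst = err;
  }
  return worst;
}
```

Calling `MaxRelativeRoundingError(1024)` gives a value just under 2^-8, which is the per-operation error budget that a bfloat16 test tolerance has to absorb several times over.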
@zhangyue207
Collaborator Author

(python3.10) zhangyue@server:~/InfiniOps$ pytest -k cpu
================================= test session starts ==================================
platform linux -- Python 3.10.19, pytest-9.0.2, pluggy-1.6.0
rootdir: /home/zhangyue/InfiniOps
configfile: pyproject.toml
testpaths: tests
plugins: xdist-3.8.0, cov-7.0.0
collected 3372 items / 1686 deselected / 1686 selected                                 

tests/test_add.py .............................................................. [  3%]
..............................................                                   [  6%]
tests/test_causal_softmax.py ..................                                  [  7%]
tests/test_gemm.py ............................................................. [ 11%]
................................................................................ [ 15%]
................................................................................ [ 20%]
................................................................................ [ 25%]
................................................................................ [ 30%]
................................................................................ [ 34%]
................................................................................ [ 39%]
................................................................................ [ 44%]
................................................................................ [ 49%]
................................................................................ [ 53%]
................................................................................ [ 58%]
................................................................................ [ 63%]
................................................................................ [ 68%]
................................................................................ [ 72%]
................................................................................ [ 77%]
................................................................................ [ 82%]
................................................................................ [ 87%]
................................................................................ [ 91%]
...............................................................................  [ 96%]
tests/test_rms_norm.py ....................................                      [ 98%]
tests/test_swiglu.py ........................                                    [100%]

================== 1686 passed, 1686 deselected in 243.35s (0:04:03) ===================

@zhangyue207
Collaborator Author

root@iluvatar:/workspace/InfiniOps# pytest -k cpu
==================================================== test session starts ====================================================
platform linux -- Python 3.10.18, pytest-9.0.2, pluggy-1.6.0
rootdir: /workspace/InfiniOps
configfile: pyproject.toml
testpaths: tests
plugins: anyio-4.9.0, cov-7.0.0, xdist-3.8.0, typeguard-4.4.4
collected 3372 items / 1686 deselected / 1686 selected                                                                      

tests/test_add.py ................................................................................................... [  5%]
.........                                                                                                             [  6%]
tests/test_causal_softmax.py ..................                                                                       [  7%]
tests/test_gemm.py .................................................................................................. [ 13%]
..................................................................................................................... [ 20%]
..................................................................................................................... [ 27%]
..................................................................................................................... [ 34%]
..................................................................................................................... [ 41%]
..................................................................................................................... [ 47%]
..................................................................................................................... [ 54%]
..................................................................................................................... [ 61%]
..................................................................................................................... [ 68%]
..................................................................................................................... [ 75%]
..................................................................................................................... [ 82%]
..................................................................................................................... [ 89%]
...................................................................................................................   [ 96%]
tests/test_rms_norm.py ....................................                                                           [ 98%]
tests/test_swiglu.py ........................                                                                         [100%]

===================================== 1686 passed, 1686 deselected in 241.49s (0:04:01) =====================================

@zhangyue207
Collaborator Author

root@iluvatar:/workspace/InfiniOps# ruff check . --exclude InfiniCore
All checks passed!

@voltjia merged commit 6e93d39 into feat/dev-infra on Mar 11, 2026
@voltjia deleted the feat/dev-multi-dtypes branch on March 11, 2026 03:10

3 participants