feat: reorganize casting utilities and enhance CPU support by zhangyue207 · Pull Request #16 · InfiniTensor/InfiniOps

zhangyue207 · 2026-03-10T05:44:47Z

- Moved casting functions to separate CPU and CUDA headers for better organization.
- Introduced a new `Cast()` function in the CPU implementation to handle type conversions, including support for custom types like `float16_t` and `bfloat16_t`.
- Updated various operators to utilize the new casting utilities, ensuring consistent type handling across CPU and CUDA backends.
- Enhanced test cases to cover additional data types and ensure compatibility with the new casting logic.
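
To make the `Cast()` item above more concrete, here is a minimal Python sketch of what converting a value to a 16-bit floating-point storage type involves at the bit level. It is not the PR's C++ implementation: the function names (`cast`, `to_bfloat16`, `to_float16`) and the string dtype tags are hypothetical, and round-to-nearest-even is an assumption about how such a cast typically behaves.

```python
import math
import struct


def to_bfloat16(x: float) -> float:
    """Round x to the nearest bfloat16 value (ties to even), returned as a float."""
    # Reinterpret the float32 encoding of x as a 32-bit unsigned integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    if math.isnan(x):
        # Truncate, then set a mantissa bit so the result is still a NaN.
        upper = (bits >> 16) | 0x0040
    else:
        # bfloat16 keeps the top 16 bits of float32. Adding this bias before
        # dropping the low 16 bits rounds to nearest, with ties going to even.
        upper = (bits + 0x7FFF + ((bits >> 16) & 1)) >> 16
    return struct.unpack("<f", struct.pack("<I", (upper & 0xFFFF) << 16))[0]


def to_float16(x: float) -> float:
    """Round x to the nearest IEEE 754 binary16 value using struct's 'e' format."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def cast(x: float, dtype: str) -> float:
    """Hypothetical dispatcher: round x as if it were stored in `dtype`."""
    if dtype == "float32":
        return struct.unpack("<f", struct.pack("<f", x))[0]
    if dtype == "float16":
        return to_float16(x)
    if dtype == "bfloat16":
        return to_bfloat16(x)
    raise ValueError(f"unsupported dtype: {dtype}")


if __name__ == "__main__":
    for dtype in ("float32", "float16", "bfloat16"):
        print(dtype, cast(0.1, dtype))
```

Routing every conversion through one dispatcher, as sketched here, is one way to keep type handling consistent between backends; the actual header layout and signatures in the PR may differ.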

Test

Run `pytest -k cpu` or `pytest -k cuda`.

zhangyue207 · 2026-03-10T05:47:33Z

(python3.10) zhangyue@server:~/InfiniOps$ pytest tests/ -k cuda
================================= test session starts ==================================
platform linux -- Python 3.10.19, pytest-9.0.2, pluggy-1.6.0
rootdir: /home/zhangyue/InfiniOps
configfile: pyproject.toml
plugins: xdist-3.8.0, cov-7.0.0
collected 3372 items / 1686 deselected / 1686 selected                                 

tests/test_add.py .............................................................. [  3%]
..............................................                                   [  6%]
tests/test_causal_softmax.py ..................                                  [  7%]
tests/test_gemm.py ............................................................. [ 11%]
................................................................................ [ 15%]
................................................................................ [ 20%]
................................................................................ [ 25%]
................................................................................ [ 30%]
................................................................................ [ 34%]
................................................................................ [ 39%]
................................................................................ [ 44%]
................................................................................ [ 49%]
................................................................................ [ 53%]
................................................................................ [ 58%]
................................................................................ [ 63%]
................................................................................ [ 68%]
................................................................................ [ 72%]
................................................................................ [ 77%]
................................................................................ [ 82%]
................................................................................ [ 87%]
................................................................................ [ 91%]
...............................................................................  [ 96%]
tests/test_rms_norm.py ....................................                      [ 98%]
tests/test_swiglu.py ........................                                    [100%]

======================== 1686 passed, 1686 deselected in 5.30s =========================

zhangyue207 · 2026-03-10T07:56:39Z

(python3.10) zhangyue@server:~/InfiniOps$ pytest -k cpu
================================= test session starts ==================================
platform linux -- Python 3.10.19, pytest-9.0.2, pluggy-1.6.0
rootdir: /home/zhangyue/InfiniOps
configfile: pyproject.toml
testpaths: tests
plugins: xdist-3.8.0, cov-7.0.0
collected 3372 items / 1686 deselected / 1686 selected                                 

tests/test_add.py .............................................................. [  3%]
..............................................                                   [  6%]
tests/test_causal_softmax.py ..................                                  [  7%]
tests/test_gemm.py ............................................................. [ 11%]
................................................................................ [ 15%]
................................................................................ [ 20%]
................................................................................ [ 25%]
................................................................................ [ 30%]
................................................................................ [ 34%]
................................................................................ [ 39%]
................................................................................ [ 44%]
................................................................................ [ 49%]
................................................................................ [ 53%]
................................................................................ [ 58%]
................................................................................ [ 63%]
................................................................................ [ 68%]
................................................................................ [ 72%]
................................................................................ [ 77%]
................................................................................ [ 82%]
................................................................................ [ 87%]
................................................................................ [ 91%]
...............................................................................  [ 96%]
tests/test_rms_norm.py ....................................                      [ 98%]
tests/test_swiglu.py ........................                                    [100%]

================== 1686 passed, 1686 deselected in 243.35s (0:04:03) ===================

src/cpu/add/add.h

tests/utils.py

tests/test_swiglu.py

tests/test_rms_norm.py

tests/test_causal_softmax.py

tests/test_gemm.py

tests/test_add.py

src/iluvatar/gemm/cublas.h

tests/test_gemm.py

zhangyue207 · 2026-03-11T02:30:23Z

root@iluvatar:/workspace/InfiniOps# pytest -k cpu
==================================================== test session starts ====================================================
platform linux -- Python 3.10.18, pytest-9.0.2, pluggy-1.6.0
rootdir: /workspace/InfiniOps
configfile: pyproject.toml
testpaths: tests
plugins: anyio-4.9.0, cov-7.0.0, xdist-3.8.0, typeguard-4.4.4
collected 3372 items / 1686 deselected / 1686 selected                                                                      

tests/test_add.py ................................................................................................... [  5%]
.........                                                                                                             [  6%]
tests/test_causal_softmax.py ..................                                                                       [  7%]
tests/test_gemm.py .................................................................................................. [ 13%]
..................................................................................................................... [ 20%]
..................................................................................................................... [ 27%]
..................................................................................................................... [ 34%]
..................................................................................................................... [ 41%]
..................................................................................................................... [ 47%]
..................................................................................................................... [ 54%]
..................................................................................................................... [ 61%]
..................................................................................................................... [ 68%]
..................................................................................................................... [ 75%]
..................................................................................................................... [ 82%]
..................................................................................................................... [ 89%]
...................................................................................................................   [ 96%]
tests/test_rms_norm.py ....................................                                                           [ 98%]
tests/test_swiglu.py ........................                                                                         [100%]

===================================== 1686 passed, 1686 deselected in 241.49s (0:04:01) =====================================

zhangyue207 · 2026-03-11T02:30:45Z

root@iluvatar:/workspace/InfiniOps# ruff check . --exclude InfiniCore
All checks passed!

fix: update bfloat16 test tolerance in test_rms_norm.py

ac2d81b

- Increased the tolerance for `bfloat16` from `1e-2` to `2e-2` to better accommodate numerical precision in tests.
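
For a sense of scale, the sketch below estimates how large a single `bfloat16` rounding error is compared with the old and new tolerances. It is an illustration under assumptions, not the test code from this PR: `round_to_bfloat16` is a hypothetical helper, round-to-nearest-even is assumed, and the commit does not say whether the tolerance is relative or absolute or which case was failing.

```python
import struct


def round_to_bfloat16(x: float) -> float:
    """Round a finite float to the nearest bfloat16 value (ties to even)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)  # bias for round-to-nearest-even
    # Keep only the top 16 bits, which is all that bfloat16 stores.
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]


# bfloat16 has 8 significant bits, so the gap above 1.0 is 2**-7 and the
# largest relative error of one rounding step is half of that.
UNIT_ROUNDOFF = 2.0 ** -8

# Values in [1, 2) that float32 represents exactly, so only the
# bfloat16 rounding contributes to the measured error.
samples = [1.0 + k / 1024 for k in range(1024)]
worst = max(abs(round_to_bfloat16(x) - x) / x for x in samples)

if __name__ == "__main__":
    print(f"worst relative rounding error: {worst:.6f}")
    print(f"1e-2 tolerance = {1e-2 / UNIT_ROUNDOFF:.2f} unit roundoffs")
    print(f"2e-2 tolerance = {2e-2 / UNIT_ROUNDOFF:.2f} unit roundoffs")
```

Under these assumptions one rounding step can consume close to 40% of a `1e-2` budget, so a computation that rounds a few times could plausibly exceed it while staying within `2e-2`.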

Ziminli requested changes Mar 10, 2026

src/cpu/add/add.h (outdated)

voltjia requested changes Mar 10, 2026

format: simplify type dispatching in Add operator and formating

dba18a5

voltjia merged commit 6e93d39 into feat/dev-infra Mar 11, 2026

voltjia deleted the feat/dev-multi-dtypes branch March 11, 2026 03:10

voltjia merged 3 commits into feat/dev-infra from feat/dev-multi-dtypes

3 participants
