feat: support casting and CPU bfloat16 and float16 by Ziminli · Pull Request #11 · InfiniTensor/InfiniOps

Ziminli · 2026-03-04T16:35:09Z

TL;DR: Adds a generic Cast() function for CPU and CUDA, plus CPU implementations of BFloat16 and Float16.

Key Changes

CPU bf16 and fp16:
- Implement custom CPU BFloat16 and Float16 types in data_type.h.
Generic Casting:
- Add the generic CPU casting function Cast() in src/common/cast.h.
- Add the generic CUDA casting function Cast() in src/common/cuda/cast.h. It is compatible with different CUDA-like platforms and is currently verified to dispatch hardware intrinsics correctly on both NVIDIA and MetaX.
  - Mechanism: uses priority-tag SFINAE dispatch to favor hardware intrinsics (e.g., __bfloat162int_rn), with an automatic fallback to a float-pivot conversion.
Style Correction:
- Rename the utility function indexToOffset() to IndexToOffset() to comply with the styling rules.
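The dispatch mechanism described above can be sketched in host-only C++. This is a minimal illustration of the priority-tag SFINAE pattern, not the code in src/common/cuda/cast.h: `FakeBF16`, `TypeTag`, `DirectCast`, `ToFloat`, `CastImpl`, and this `Cast` are all hypothetical names, and `DirectCast` merely stands in for a hardware intrinsic such as __bfloat162int_rn.

```cpp
#include <cmath>

// Hypothetical stand-in for a device bfloat16 type. It stores a float only
// so that the sketch runs on the host.
struct FakeBF16 {
  float value;
};

// Carries the destination type as a function argument.
template <typename T>
struct TypeTag {};

// Higher N is preferred: PriorityTag<1> converts to PriorityTag<0>, but an
// exact match wins overload resolution.
template <int N>
struct PriorityTag : PriorityTag<N - 1> {};
template <>
struct PriorityTag<0> {};

// Assumed "direct cast" mapping. Only FakeBF16 -> int is provided, and it
// rounds to nearest to mimic an _rn intrinsic.
inline int DirectCast(TypeTag<int>, FakeBF16 x) {
  return static_cast<int>(std::nearbyint(x.value));
}

// Conversion to the float pivot used by the fallback path.
inline float ToFloat(FakeBF16 x) { return x.value; }

// Preferred path: participates only if a matching DirectCast overload exists.
template <typename To, typename From>
auto CastImpl(From x, PriorityTag<1>) -> decltype(DirectCast(TypeTag<To>{}, x)) {
  return DirectCast(TypeTag<To>{}, x);
}

// Fallback path: convert through float, then to the destination type.
template <typename To, typename From>
To CastImpl(From x, PriorityTag<0>) {
  return static_cast<To>(ToFloat(x));
}

template <typename To, typename From>
To Cast(From x) {
  return CastImpl<To>(x, PriorityTag<1>{});
}
```

In this sketch, `Cast<int>(FakeBF16{2.75f})` takes the direct path and rounds to 3, while `Cast<long>(FakeBF16{2.75f})` has no direct mapping, falls back to the float pivot, and truncates to 2. Adding another `DirectCast` overload is enough to upgrade a type pair to the preferred path without touching `Cast` itself.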

Known Issues & Future Work:

More Testing: the current Cast() and the CPU BFloat16 and Float16 types have not been extensively tested across real operators or on platforms other than NVIDIA and MetaX.
Enrich CUDA Direct Casts: the CUDA Cast() currently covers only a subset of the hardware direct-cast intrinsics. The mapping should be extended in the future.

src/data_type.h

src/common/cuda/cast.h

…`Cast()` function
- add the CPU implementation of float16 and bfloat16 as `float16_t` and `bfloat16_t`
- add the CPU `Cast()` function that supports conversion between any two CPU-supported types, including the custom `float16_t` and `bfloat16_t`
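As a rough illustration of what a custom CPU bfloat16 type and a cast between arbitrary CPU types can look like, here is a minimal sketch. It is not the implementation in data_type.h or src/common/cast.h: `BF16Sketch`, `ToFloat`, `FromFloat`, and `CastSketch` are hypothetical names, and routing every conversion through float is an assumption made to keep the example short.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical bfloat16: the upper 16 bits of an IEEE-754 binary32 value.
struct BF16Sketch {
  std::uint16_t bits = 0;

  BF16Sketch() = default;

  explicit BF16Sketch(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof(u));
    if ((u & 0x7FFFFFFFu) > 0x7F800000u) {
      // NaN: keep the sign and set a mantissa bit so the result stays NaN.
      bits = static_cast<std::uint16_t>((u >> 16) | 0x0040u);
    } else {
      // Round to nearest, ties to even, on the 16 discarded bits.
      u += 0x7FFFu + ((u >> 16) & 1u);
      bits = static_cast<std::uint16_t>(u >> 16);
    }
  }

  float ToFloat() const {
    std::uint32_t u = static_cast<std::uint32_t>(bits) << 16;
    float f;
    std::memcpy(&f, &u, sizeof(f));
    return f;
  }
};

// Source side of the pivot: built-in types use static_cast, the custom type
// uses its own conversion (the non-template overload wins for BF16Sketch).
template <typename T>
float ToFloat(T x) {
  return static_cast<float>(x);
}
inline float ToFloat(BF16Sketch x) { return x.ToFloat(); }

// Destination side of the pivot.
template <typename To>
struct FromFloat {
  static To Apply(float f) { return static_cast<To>(f); }
};
template <>
struct FromFloat<BF16Sketch> {
  static BF16Sketch Apply(float f) { return BF16Sketch(f); }
};

// Converts between any two supported types via float.
template <typename To, typename From>
To CastSketch(From x) {
  return FromFloat<To>::Apply(ToFloat(x));
}
```

For example, `CastSketch<BF16Sketch>(3)` yields the bit pattern 0x4040 and `CastSketch<int>(BF16Sketch(2.0f))` yields 2. A float pivot loses precision for wide types such as double or 64-bit integers, so a production cast would likely convert built-in types to each other directly.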

voltjia · 2026-03-06T05:16:31Z

Ziminli self-assigned this Mar 4, 2026

voltjia requested changes Mar 6, 2026

src/data_type.h (outdated, resolved)

src/data_type.h (outdated, resolved)

src/common/cuda/cast.h (resolved)

src/common/cuda/cast.h (outdated, resolved)

Ziminli added 8 commits March 6, 2026 05:14

style: change indexToOffset() to IndexToOffset() to comply with the styling requirement

006392e

feat: add the CUDA Cast() function

9b846ff

refactor: refactor CUDA Cast utility with SFINAE-based hardware dispatch and move them into `common/cuda/cast.h`

2b3c87f

style: change the naming of some types in common/cast.h and `common/cuda/cast.h` to better comply with the naming rules

defe484

chore: remove unused header data_type.h in `common/cuda/kernel_commons.h`

5340edd

style: adjust comments for styling rule compliance

5ca25c2

style: change float_t and bfloat16_t to Float16 and BFloat16 and fix various styling issues.

2ab0e33

voltjia force-pushed the feat/support-cast-and-bf16-fp16 branch from 1b7c61b to 2ab0e33 Compare March 6, 2026 05:15

voltjia approved these changes Mar 6, 2026

voltjia merged commit 8442eff into feat/dev-infra Mar 6, 2026

voltjia deleted the feat/support-cast-and-bf16-fp16 branch March 6, 2026 05:17
