
feat: support casting and CPU bfloat16 and float16 #11

Merged
voltjia merged 8 commits into feat/dev-infra from feat/support-cast-and-bf16-fp16
Mar 6, 2026


@Ziminli
Collaborator

@Ziminli Ziminli commented Mar 4, 2026

TL;DR: Adds a generic Cast() function for both CPU and CUDA, and adds CPU implementations of BFloat16 and Float16.

Key Changes

  • CPU bf16 and fp16:

    • Implement CPU custom BFloat16 and Float16 types in data_type.h
  • Generic Casting

    • Add the CPU generic casting function Cast() in src/common/cast.h;

    • Add the CUDA generic casting function Cast() in src/common/cuda/cast.h. It is compatible with different CUDA-like platforms and is currently verified to dispatch hardware intrinsics correctly on both NVIDIA and MetaX;

      • Mechanism: Uses priority-tag SFINAE dispatching to favor hardware intrinsics (e.g., __bfloat162int_rn) with an automatic fallback to a float-pivot conversion.
  • Style Correction

    • Rename the utility function indexToOffset() to IndexToOffset() to comply with the style rules.
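
For reviewers unfamiliar with the priority-tag mechanism mentioned above, here is a minimal host-only sketch of the idea. It is not the code in src/common/cuda/cast.h: `FakeBf16`, `FakeFp16`, `TypeTag`, `DirectCast`, `ToFloat`, `FromFloat`, and `CastImpl` are hypothetical names, and `DirectCast` merely stands in for a hardware intrinsic such as __bfloat162int_rn. The two paths round differently on purpose so that the selected path is observable.

```cpp
#include <cmath>

// Hypothetical stand-ins for 16-bit types; here they simply wrap a float.
struct FakeBf16 { float v; };
struct FakeFp16 { float v; };

template <typename T> struct TypeTag {};

// Stand-in for a hardware intrinsic: only FakeBf16 -> int has a direct cast.
// It rounds to nearest, unlike the truncating fallback below.
inline int DirectCast(FakeBf16 x, TypeTag<int>) {
  return static_cast<int>(std::nearbyint(x.v));
}

// Float-pivot helpers: source -> float, then float -> destination.
inline float ToFloat(FakeBf16 x) { return x.v; }
inline float ToFloat(FakeFp16 x) { return x.v; }
inline float ToFloat(float x) { return x; }
inline float ToFloat(int x) { return static_cast<float>(x); }

inline int FromFloat(float f, TypeTag<int>) { return static_cast<int>(f); }
inline float FromFloat(float f, TypeTag<float>) { return f; }
inline FakeBf16 FromFloat(float f, TypeTag<FakeBf16>) { return FakeBf16{f}; }

// PriorityTag<1> converts to PriorityTag<0>, so an overload taking
// PriorityTag<1> is preferred whenever it survives substitution.
template <int N> struct PriorityTag : PriorityTag<N - 1> {};
template <> struct PriorityTag<0> {};

// Preferred path: participates only if a DirectCast overload exists (SFINAE).
template <typename Dst, typename Src>
auto CastImpl(Src x, PriorityTag<1>) -> decltype(DirectCast(x, TypeTag<Dst>{})) {
  return DirectCast(x, TypeTag<Dst>{});
}

// Fallback path: always available, pivots through float.
template <typename Dst, typename Src>
Dst CastImpl(Src x, PriorityTag<0>) {
  return FromFloat(ToFloat(x), TypeTag<Dst>{});
}

template <typename Dst, typename Src>
Dst Cast(Src x) {
  return CastImpl<Dst>(x, PriorityTag<1>{});
}
```

In this sketch, `Cast<int>(FakeBf16{2.7f})` takes the direct path, while `Cast<int>(FakeFp16{2.7f})` has no direct overload and falls back to the float pivot. Adding a new direct cast only requires adding another `DirectCast` overload.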

Known Issues & Future Work:

  • More Testing: Cast() and the CPU BFloat16 and Float16 types have not yet been extensively tested across real operators or on platforms other than NVIDIA and MetaX.

  • Enrich CUDA Direct Casts: the CUDA Cast() currently maps only a subset of the hardware direct-cast intrinsics. The mapping should be extended in the future.
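
As a possible starting point for the testing item, the sketch below shows a generic bfloat16 storage type with round-to-nearest-even conversion from float, which round-trip tests could be checked against. It is an assumption about how such a type is commonly built, not the implementation in data_type.h; `Bf16Sketch`, `FloatToBf16`, and `Bf16ToFloat` are hypothetical names.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical type: bfloat16 stored as the upper 16 bits of an IEEE-754 float.
struct Bf16Sketch {
  std::uint16_t bits;
};

inline Bf16Sketch FloatToBf16(float f) {
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof(u));  // well-defined bit reinterpretation
  // NaN: keep the upper bits and force a quiet-NaN mantissa bit so that
  // rounding cannot turn the value into infinity.
  if ((u & 0x7fffffffu) > 0x7f800000u) {
    return Bf16Sketch{static_cast<std::uint16_t>((u >> 16) | 0x0040u)};
  }
  // Round to nearest, ties to even: add 0x7fff plus the lowest kept bit.
  const std::uint32_t bias = 0x7fffu + ((u >> 16) & 1u);
  u += bias;
  return Bf16Sketch{static_cast<std::uint16_t>(u >> 16)};
}

inline float Bf16ToFloat(Bf16Sketch h) {
  // Widening is exact: the 16 stored bits become the upper half of a float.
  const std::uint32_t u = static_cast<std::uint32_t>(h.bits) << 16;
  float f;
  std::memcpy(&f, &u, sizeof(f));
  return f;
}
```

Values that are exactly representable in bfloat16 (for example 1.0f or -2.5f) should survive a float -> bfloat16 -> float round trip unchanged, and halfway cases should round to the neighbor with an even low bit.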

@Ziminli Ziminli self-assigned this Mar 4, 2026
Ziminli added 8 commits March 6, 2026 05:14
…`Cast()` function

- add the CPU implementation of float16 and bfloat16 as `float16_t` and `bfloat16_t`
- add the CPU `Cast()` function that supports conversion between any two CPU-supported types, including the custom `float16_t` and `bfloat16_t`
…patch and move them into `common/cuda/cast.h`
…/cuda/cast.h` to better comply with the naming rules
@voltjia voltjia force-pushed the feat/support-cast-and-bf16-fp16 branch from 1b7c61b to 2ab0e33 Compare March 6, 2026 05:15
@voltjia
Collaborator

voltjia commented Mar 6, 2026

image

@voltjia voltjia merged commit 8442eff into feat/dev-infra Mar 6, 2026
@voltjia voltjia deleted the feat/support-cast-and-bf16-fp16 branch March 6, 2026 05:17

2 participants