Introduce a BF16 DType #1734

@AdamGS

bf16 (aka bfloat16) is a floating-point number format introduced by Google to improve storage utilization and computation speed for machine learning models. It has roughly the same range as a standard IEEE 754 float32, since both use an 8-bit exponent, but much reduced precision (an 8-bit significand, counting the implicit leading bit, instead of 24). Due to its popularity, more and more hardware vendors now support specialized instructions for it, including recent AVX extensions on CPUs and a number of GPUs.
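To make the layout concrete, here is a minimal sketch of converting between float32 and bf16 bit patterns, assuming the usual definition of bf16 as the upper 16 bits of a float32 with round-to-nearest-even on the dropped bits. The function names `f32_to_bf16_bits` and `bf16_bits_to_f32` are made up for illustration and are not part of any existing API in this project.

```python
import struct


def f32_to_bf16_bits(x: float) -> int:
    """Return the 16-bit bf16 pattern for x (hypothetical helper).

    x is first narrowed to float32; struct raises OverflowError if a
    finite x does not fit in float32.
    """
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    # NaN: truncate and force a mantissa bit so the result stays a NaN.
    if (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF) != 0:
        return (bits >> 16) | 0x0040
    # Round to nearest, ties to even, on the 16 bits being dropped.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bf16_bits_to_f32(b: int) -> float:
    """Widen a bf16 bit pattern to float32 by zero-filling the low 16 bits."""
    (x,) = struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))
    return x
```

Widening is exact because every bf16 value is also a float32 value. Narrowing loses precision but keeps the exponent range: 3.14159 round-trips to 3.140625, while a value as large as 3e38 stays finite.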

Open questions in my mind are:

  • Is it an extension dtype?
  • How does it canonicalize into Arrow? Seems like there were efforts to introduce it as an "official" extension but nothing materialized as far as I can tell.
