Add bfloat16 data type #257

Open
wants to merge 1 commit into
base: main
8 changes: 6 additions & 2 deletions docs/v3/core/v3.0.rst
@@ -590,7 +590,7 @@ mandatory names:
The value must be a JSON number with no fraction or exponent part that is
within the representable range of the data type.

- IEEE 754 floating point numbers (``float{16,32,64}``)
+ Floating point numbers (``float{16,32,64}``, ``bfloat16``)
The value may be either:

- A JSON number, that will be rounded to the nearest representable value.
@@ -599,7 +599,7 @@ mandatory names:

- ``"Infinity"``, denoting positive infinity;
- ``"-Infinity"``, denoting negative infinity;
- - ``"NaN"``, denoting thenot-a-number (NaN) value where the sign bit is
+ - ``"NaN"``, denoting the not-a-number (NaN) value where the sign bit is
0 (positive), the most significant bit (MSB) of the mantissa is 1, and
all other bits of the mantissa are zero;
- ``"0xYYYYYYYY"``, specifying the byte representation of the floating
@@ -943,6 +943,10 @@ Core data types
- IEEE 754 single-precision floating point: sign bit, 8 bits exponent, 23 bits mantissa
* - ``float64``
- IEEE 754 double-precision floating point: sign bit, 11 bits exponent, 52 bits mantissa
+ * - ``bfloat16`` (optionally supported)
+ - `bfloat16 floating-point
+ format <https://en.wikipedia.org/wiki/Bfloat16_floating-point_format>`_:
+ sign bit, 8 bits exponent, 7 bits mantissa
* - ``complex64``
- real and complex components are each IEEE 754 single-precision floating point
* - ``complex128``
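As a reviewing aid for the ``bfloat16`` row and the ``"NaN"`` fill value wording, here is a minimal Python sketch of the bit layout. It assumes the usual bfloat16 definition (the upper 16 bits of an IEEE 754 single-precision value: 1 sign bit, 8 exponent bits, 7 mantissa bits) and round-to-nearest-even. The helper names `float32_to_bfloat16_bits` and `split_bfloat16` are hypothetical and are not part of the spec or of any Zarr implementation.

```python
import struct


def float32_to_bfloat16_bits(x: float) -> int:
    """Return a 16-bit bfloat16 pattern for x (hypothetical helper).

    Assumption: x is first rounded to float32 by struct, then the low
    16 bits are dropped with round-to-nearest-even.
    """
    (bits32,) = struct.unpack("<I", struct.pack("<f", x))
    is_nan = (bits32 & 0x7F800000) == 0x7F800000 and (bits32 & 0x007FFFFF) != 0
    if is_nan:
        # NaN pattern described in the spec text: sign bit 0,
        # mantissa MSB 1, all other mantissa bits 0.
        return 0x7FC0
    # Add 0x7FFF plus the lowest kept bit so that ties round to even.
    rounding_bias = 0x7FFF + ((bits32 >> 16) & 1)
    return ((bits32 + rounding_bias) >> 16) & 0xFFFF


def split_bfloat16(bits16: int) -> tuple[int, int, int]:
    """Split a 16-bit pattern into (sign, exponent, mantissa) fields."""
    sign = bits16 >> 15
    exponent = (bits16 >> 7) & 0xFF  # 8 exponent bits
    mantissa = bits16 & 0x7F         # 7 mantissa bits
    return sign, exponent, mantissa


print(hex(float32_to_bfloat16_bits(1.0)))           # 0x3f80
print(split_bfloat16(float32_to_bfloat16_bits(-2.0)))  # (1, 128, 0)
print(hex(float32_to_bfloat16_bits(float("nan"))))  # 0x7fc0
```

Under these assumptions, the ``"NaN"`` fill value would correspond to the pattern `0x7fc0` for ``bfloat16``, and ``"Infinity"`` / ``"-Infinity"`` to `0x7f80` / `0xff80`.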