C++ template library for floating-point operations
Half-precision floating-point mathematical constants.
Cube root of half-precision floating-point epsilon.
Size (in bytes) of a half-precision floating-point number.
Square root of half-precision floating-point epsilon.
Stage 3 IEEE 754 half-precision floating-point ponyfill
Large collection of number systems providing custom arithmetic for mixed-precision algorithm development and optimization for AI, Machine Learning, Computer Vision, Signal Processing, CAE, EDA, control, optimization, estimation, and approximation.
float16 provides IEEE 754 half-precision format (binary16) with correct conversions to/from float32
Fast SGEMM emulation on Tensor Cores
An implementation of the Subleq OISC using only linear operations on half-precision (16-bit) IEEE 754 floats (and a loop).
Conversion to/from half-precision floating-point formats
Half-precision 16-bit floating-point numbers
Fast half-precision floating-point operations for C++
Python module that finds the IEEE 754 representation of a floating-point number.
C++20 implementation of a 16-bit floating-point type mimicking most of the IEEE 754 behavior. Single file and header-only.
Swift Half-Precision Floating Point
Floating-Point Arithmetic Library for Z80
FP16 pseudo random number generator on GPU
The DYM Math Library for Graphics and Game Programming
Basic linear algebra routines implemented using the chop rounding function
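Several entries above concern binary16 conversions and constants (epsilon, its square and cube roots, the size in bytes). As a minimal sketch of what those quantities are, not the code of any listed project, the following uses Python's standard `struct` module, whose `e` format code packs IEEE 754 binary16. The helper names `to_half_bits` and `from_half_bits` and the constant names are hypothetical, chosen only for this illustration.

```python
import math
import struct


def to_half_bits(x: float) -> int:
    """Round x to the nearest binary16 value and return its 16-bit pattern."""
    return struct.unpack("<H", struct.pack("<e", x))[0]


def from_half_bits(bits: int) -> float:
    """Decode a 16-bit binary16 pattern into a Python float."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


# Size of a binary16 value in bytes.
HALF_SIZE_BYTES = struct.calcsize("<e")

# binary16 stores 10 explicit significand bits, so the gap between 1.0
# and the next representable value (machine epsilon) is 2**-10.
HALF_EPS = 2.0 ** -10

# Square root and cube root of epsilon.
HALF_SQRT_EPS = math.sqrt(HALF_EPS)
HALF_CBRT_EPS = HALF_EPS ** (1.0 / 3.0)

if __name__ == "__main__":
    print(hex(to_half_bits(1.0)))
    print(from_half_bits(to_half_bits(1.0) + 1) - 1.0 == HALF_EPS)
    print(HALF_SIZE_BYTES, HALF_SQRT_EPS, HALF_CBRT_EPS)
```

Incrementing the bit pattern of 1.0 by one yields the next representable half-precision value, which is how the epsilon above can be checked against the encoding itself. Note that `struct.pack` with the `e` code raises `OverflowError` for values too large for binary16 rather than returning infinity.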