Conversion to/from half-precision floating point formats
Updated Jul 31, 2024 · C++
Large collection of number systems providing custom arithmetic for mixed-precision algorithm development and optimization for AI, Machine Learning, Computer Vision, Signal Processing, CAE, EDA, control, optimization, estimation, and approximation.
C++17 templates for converting between [stl::vector | armadillo | eigen3 | ublas | blitz++] containers and HDF5 datasets
A memory-balanced, communication-efficient model-parallel implementation of a FullyConnected layer with CrossEntropyLoss in PyTorch
Round matrix elements to lower precision in MATLAB
float16 provides IEEE 754 half-precision format (binary16) with correct conversions to/from float32
Stage 3 IEEE 754 half-precision floating-point ponyfill
Half-float library for C and for z80
Convert CUDA programs from float data type to half or half2 with SIMDization
C++20 implementation of a 16-bit floating-point type mimicking most of IEEE 754's behavior. Single-file and header-only.
Python module that finds the IEEE 754 representation of a floating-point number.
Half-Precision Floating-Point for Delphi
Floating-Point Arithmetic Library for Z80
Optimised Caffe with OpenCL support for less powerful devices such as mobile phones
The DYM Math Library for Graphics and Game Programming
Swift Half-Precision Floating Point
Basic linear algebra routines implemented using the chop rounding function
A library that encodes floating-point numbers from 3 to 16 bits wide.
Half-precision 16-bit floating point numbers