Add half precision / float16 support in Paddle #4853

Closed
kexinzhao opened this issue Oct 17, 2017 · 2 comments
Currently, the half precision floating point (float16) data type is not supported in Paddle.

Adding the float16 data type could potentially:

  • reduce storage space
  • save memory bandwidth usage
  • speed up arithmetic where the hardware supports it

A brief survey of float16 support on different hardware:

ARM processor:

float16 storage and conversion to/from float32 are generally supported on armv7 and armv8.

However, float16 arithmetic is only supported starting with Armv8.2-A (quote: "IEEE754-2008 formatted half-precision floating point data processing is added to Armv8.2-A").

There are currently very few devices using CPUs with the Armv8.2-A architecture (the only one I found is the newly launched Cortex-A75, which will be used in the Qualcomm Snapdragon 845).
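
For illustration, a minimal sketch of what storage-plus-conversion support looks like with NEON intrinsics (this is not Paddle code, and the helper names are mine; it assumes a toolchain/target with NEON half-precision conversion support, e.g. -mfpu=neon-fp16 on armv7, or any AArch64 target):

#include <arm_neon.h>

// Widen four packed float16 values to float32 so arithmetic can be done in float32.
float32x4_t LoadHalf4(const float16_t* src) {
  float16x4_t h = vld1_f16(src);    // load 4 x float16 from memory
  return vcvt_f32_f16(h);           // convert to 4 x float32
}

// Narrow four float32 results back to float16 for storage.
void StoreHalf4(float16_t* dst, float32x4_t v) {
  float16x4_t h = vcvt_f16_f32(v);  // convert to 4 x float16
  vst1_f16(dst, h);                 // store 4 x float16 to memory
}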

x86/x64 CPU:

float16 is only supported as a storage type; intrinsics (the F16C instruction set) are available for conversion between float16 and float32.
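
A rough sketch of scalar conversion using the F16C intrinsics (helper names are mine, not Paddle's; requires a CPU and compiler with F16C enabled, e.g. -mf16c):

#include <immintrin.h>
#include <cstdint>

// Convert one float32 to the IEEE float16 bit pattern (round to nearest even).
uint16_t FloatToHalf(float f) {
  __m128i h = _mm_cvtps_ph(_mm_set_ss(f),
                           _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC);
  return static_cast<uint16_t>(_mm_extract_epi16(h, 0));
}

// Convert one IEEE float16 bit pattern back to float32.
float HalfToFloat(uint16_t h) {
  return _mm_cvtss_f32(_mm_cvtph_ps(_mm_set1_epi16(static_cast<short>(h))));
}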

Nvidia GPU:

float16 storage and arithmetic have been available since CUDA 7.5 on supported GPUs (e.g. Pascal GPUs).
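
A minimal CUDA C++ sketch of what native float16 arithmetic looks like (the kernel name is hypothetical; __half and __hadd come from cuda_fp16.h, and the half add instruction needs compute capability 5.3+, hence the fallback branch):

#include <cuda_fp16.h>

__global__ void HalfAdd(const __half* x, const __half* y, __half* out, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) {
#if __CUDA_ARCH__ >= 530
    out[i] = __hadd(x[i], y[i]);  // native fp16 add
#else
    // Older GPUs: widen to float32, add, and narrow back.
    out[i] = __float2half(__half2float(x[i]) + __half2float(y[i]));
#endif
  }
}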

@kexinzhao kexinzhao self-assigned this Oct 17, 2017
@kexinzhao (Contributor, Author) commented:

A brief survey of how float16 arithmetic works in the ARM ComputeLibrary:

https://github.com/ARM-software/ComputeLibrary/blob/master/SConstruct#L125

elif env['arch'] == 'arm64-v8.2-a':
    env.Append(CXXFLAGS = ['-march=armv8.2-a+fp16+simd'])
    env.Append(CPPDEFINES = ['ARM_COMPUTE_ENABLE_FP16'])

ARM_COMPUTE_ENABLE_FP16 is defined only when the target ARM processor is a 64-bit Armv8.2-A. All float16 arithmetic is used only when this flag is defined, such as this code.

We can follow a similar procedure to enable float16 arithmetic only on supported ARM processors; a rough sketch of such a guard is below.
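
A sketch of what the analogous guard could look like on the C++ side (the macro name PADDLE_WITH_NATIVE_FP16 is hypothetical; the build system would define it only when compiling for armv8.2-a with +fp16, mirroring ARM_COMPUTE_ENABLE_FP16 above, and the fallback assumes the NEON conversion support described in the survey):

#include <arm_neon.h>

#if defined(PADDLE_WITH_NATIVE_FP16)
// Only compiled when the build targets armv8.2-a+fp16: use the native
// half-precision vector add instruction.
inline float16x8_t AddFp16(float16x8_t a, float16x8_t b) {
  return vaddq_f16(a, b);
}
#else
// On other ARM targets, only storage/conversion support is available:
// widen to float32, add, and narrow back.
inline float16x4_t AddFp16(float16x4_t a, float16x4_t b) {
  return vcvt_f16_f32(vaddq_f32(vcvt_f32_f16(a), vcvt_f32_f16(b)));
}
#endif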

@Xreki Xreki added the mobile label Oct 17, 2017
@kexinzhao kexinzhao added this to Neon Optimize & Low Precision in Embedded and Mobile Deployment Oct 18, 2017
@kexinzhao kexinzhao reopened this Nov 16, 2017
@kexinzhao (Contributor, Author) commented Nov 16, 2017:

  • Add float16 data type (a rough sketch of such a type follows this list)
  • Update pybind/tensor_py.h to bind c++ float16 with numpy float16
  • Modify GetKernelType() method in framework/operator.h to make it compatible with float16
  • Create a type-casting operator that can convert the data type in tensor between float16 and other types
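
As a starting point, here is a self-contained sketch of what such a float16 type could look like (the struct name and layout are illustrative, not Paddle's actual implementation): it stores the raw IEEE 754 binary16 bits and does arithmetic by converting through float32 in software, so it works even where no fp16 hardware support exists.

#include <cstdint>
#include <cstring>

struct float16 {
  uint16_t x;  // raw IEEE 754 binary16 bits

  float16() : x(0) {}
  explicit float16(float f) : x(FloatToBits(f)) {}
  explicit operator float() const { return BitsToFloat(x); }

  // Arithmetic is done in float32 and the result narrowed back to float16.
  friend float16 operator+(float16 a, float16 b) {
    return float16(static_cast<float>(a) + static_cast<float>(b));
  }

  // float32 -> float16 bits, round to nearest even.
  static uint16_t FloatToBits(float f) {
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    uint16_t sign = static_cast<uint16_t>((bits >> 16) & 0x8000u);
    int32_t exp = static_cast<int32_t>((bits >> 23) & 0xffu) - 127 + 15;
    uint32_t man = bits & 0x7fffffu;
    if (((bits >> 23) & 0xffu) == 0xffu)                 // Inf or NaN
      return static_cast<uint16_t>(sign | 0x7c00u | (man ? 0x200u : 0u));
    if (exp >= 0x1f) return static_cast<uint16_t>(sign | 0x7c00u);  // overflow -> Inf
    if (exp <= 0) {                                      // subnormal or zero
      if (exp < -10) return sign;                        // underflow -> +/-0
      man |= 0x800000u;                                  // implicit leading 1
      int shift = 14 - exp;
      uint32_t hman = man >> shift;
      uint32_t rem = man & ((1u << shift) - 1u);
      uint32_t half = 1u << (shift - 1);
      if (rem > half || (rem == half && (hman & 1u))) ++hman;
      return static_cast<uint16_t>(sign | hman);
    }
    uint32_t hman = man >> 13;                           // normal number
    uint32_t rem = man & 0x1fffu;
    uint16_t h = static_cast<uint16_t>(sign | (exp << 10) | hman);
    // Rounding up may carry into the exponent; the result is still correct.
    if (rem > 0x1000u || (rem == 0x1000u && (hman & 1u))) ++h;
    return h;
  }

  // float16 bits -> float32 (exact).
  static float BitsToFloat(uint16_t h) {
    uint32_t sign = static_cast<uint32_t>(h & 0x8000u) << 16;
    uint32_t exp = (h >> 10) & 0x1fu;
    uint32_t man = h & 0x3ffu;
    uint32_t bits;
    if (exp == 0) {
      if (man == 0) {
        bits = sign;                                     // +/- zero
      } else {                                           // subnormal: renormalize
        exp = 113;
        while ((man & 0x400u) == 0) { man <<= 1; --exp; }
        bits = sign | (exp << 23) | ((man & 0x3ffu) << 13);
      }
    } else if (exp == 0x1f) {
      bits = sign | 0x7f800000u | (man << 13);           // Inf or NaN
    } else {
      bits = sign | ((exp + 112u) << 23) | (man << 13);
    }
    float f;
    std::memcpy(&f, &bits, sizeof(f));
    return f;
  }
};

With a plain 16-bit struct like this, binding to numpy's float16 in pybind/tensor_py.h should then mostly be a matter of mapping the 16-bit storage, since numpy's float16 uses the same IEEE binary16 layout.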
