
Add FP16 capability #44

Merged
merged 25 commits into etaler:master on Aug 1, 2019

Conversation

@marty1885 (Member) commented on Jul 19, 2019

WIP:

This PR aims to:

  • Add support for creating FP16 tensors
  • Add FP16 tensor operators
  • Optimize the HTM algorithms for FP16
  • Clean up how tensor properties are checked in the backends
    • Reduce LoC, more readable

These apply to both the CPU and OpenCL backends; a usage sketch follows below.
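A minimal sketch of what FP16 tensor creation and an element-wise operator could look like from the user's side. `et::ones`, `DType::Half`, and `Tensor::cast` are names assumed from Etaler's general API style, not confirmed signatures from this PR.

```cpp
// Sketch only: assumes et::ones, DType::Half and Tensor::cast exist as named here.
#include <Etaler/Etaler.hpp>
#include <iostream>

int main()
{
    using namespace et;

    Tensor a = ones({4, 4}, DType::Half);        // tensor created directly as FP16
    Tensor b = ones({4, 4}).cast(DType::Half);   // or an FP32 tensor cast down to FP16
    Tensor c = a + b;                            // element-wise operator on FP16 tensors
    std::cout << c << std::endl;
}
```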

And for OpenCL exclusively:

  • Check that the given OpenCL device can process FP16 (see the sketch below)
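For reference, a minimal sketch of how FP16 support is typically detected on an OpenCL device, by looking for the `cl_khr_fp16` extension in the device's extension string. The helper name is hypothetical and the actual check inside Etaler's OpenCL backend may differ.

```cpp
// Hypothetical helper: the real check inside Etaler's OpenCL backend may differ.
#define CL_HPP_MINIMUM_OPENCL_VERSION 120
#define CL_HPP_TARGET_OPENCL_VERSION 120
#include <CL/cl2.hpp>
#include <string>

bool deviceSupportsFp16(const cl::Device& device)
{
    // CL_DEVICE_EXTENSIONS is a space-separated list of supported extensions;
    // cl_khr_fp16 is the standard extension advertising half-precision support.
    std::string extensions = device.getInfo<CL_DEVICE_EXTENSIONS>();
    return extensions.find("cl_khr_fp16") != std::string::npos;
}
```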

Tests for the type system (see the test sketch after this list):

  • Check that types have the correct size
  • Check the resulting type of unary operations
  • Check the resulting type of binary operations
    • Except comparison ops (I'm lazy)
  • Check the resulting type of general tensor operations
  • Disable the FP16 tests when the GPU doesn't support FP16
  • A way to convert SP/TM from float to float16
    • Just cast the permanences!
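Roughly the shape such a test could take, written with Catch2-style assertions. Whether this matches Etaler's actual test framework, and the names `dtypeToSize`, `DType::Half`, `DType::Bool`, and the promotion rules asserted here, are all assumptions rather than confirmed behaviour from this PR.

```cpp
// Sketch of the kind of type-system checks listed above; names and rules are assumed.
#include <catch2/catch.hpp>
#include <Etaler/Etaler.hpp>
using namespace et;

TEST_CASE("FP16 type system")
{
    CHECK(dtypeToSize(DType::Half) == 2);    // a half should occupy 2 bytes

    Tensor a = ones({4}, DType::Half);
    CHECK((-a).dtype() == DType::Half);      // unary op preserves the dtype
    CHECK((a + a).dtype() == DType::Half);   // binary op on two halves stays half
    CHECK((a < a).dtype() == DType::Bool);   // comparison ops produce bool
}
```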

Serialize:

  • Save/load FP16 tensors (see the round-trip sketch below)
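A hedged sketch of round-tripping an FP16 tensor through serialization. `et::save`, `et::load`, and the `StateDict` map-of-`std::any` layout are assumed from Etaler's general design, not verified against this PR.

```cpp
// Sketch only: save/load names and the StateDict layout are assumptions.
#include <Etaler/Etaler.hpp>
#include <any>
#include <string>

void roundTripFp16(const std::string& path)
{
    using namespace et;

    Tensor t = ones({8, 8}, DType::Half);
    StateDict state = {{"tensor", t}};
    save(state, path);                                        // serialize the FP16 tensor

    StateDict loaded = load(path);
    Tensor back = std::any_cast<Tensor>(loaded.at("tensor")); // recover it as FP16
}
```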

Ref #12

@marty1885 marked this pull request as ready for review on August 1, 2019, 07:46
@marty1885 (Member, Author) commented:

Done. Merging!

@marty1885 merged commit d61c755 into etaler:master on Aug 1, 2019
marty1885 added a commit to marty1885/Etaler that referenced this pull request Jun 19, 2020