meta issue: fp16 support #4402

Open · 1 of 7 tasks
sklam opened this issue Aug 2, 2019 · 24 comments

Comments

@sklam (Member) commented Aug 2, 2019

This meta issue lists the concrete tasks needed to implement a half-precision floating-point type (#4395).

llvmlite:

  • add half type

numba:

  • add a float16 type and conversion to/from numpy.float16.
  • set up implicit casting rules (note: survey what NumPy does with float16 casting; see the probe sketched after this list).
  • implement basic arithmetic ops.
    • add, sub, mul
    • div, rem (separate item because CUDA PTX does not seem to provide these)
  • decide what to do with other math functions: implicitly cast to the float32 variants?
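
As a quick aid for the casting-rules survey mentioned above, NumPy can be asked directly what it does with float16 under its default promotion rules. This is just an illustrative probe of NumPy behaviour, not a statement of what Numba's rules should be:

>>> import numpy as np
>>> np.result_type(np.float16, np.float32)   # widens to the larger float
dtype('float32')
>>> np.result_type(np.float16, np.int8)      # int8 values all fit in float16
dtype('float16')
>>> np.result_type(np.float16, np.int16)     # int16 values do not, so NumPy widens
dtype('float32')
>>> np.can_cast(np.float16, np.float32)      # float16 -> float32 is a safe implicit cast
True
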
@seibert (Contributor) commented Aug 6, 2019

A related issue is how to handle bfloat16, which seems to be gaining traction: it will have limited AVX-512 support on very recent Intel CPU architectures and is also used on the Google TPU. I haven't seen any GPU hardware support for bfloat16 yet, although AMD's ROCm seems to have added software support.

@njwhite (Contributor) commented Sep 11, 2019

  • Nvidia to add support for f16 ("half") types in NVVM IR; otherwise we need to sprinkle llvm.convert.to.fp16.f32 / llvm.convert.from.fp16.f32 everywhere (and lower types.f16 in Numba code to i16 in the IR). A sketch of that boilerplate follows below.
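
For illustration, the kind of boilerplate that fallback implies can be sketched with llvmlite. This is only a sketch: the f16_add helper and the choice to store halves as i16 are assumptions for the example, while llvm.convert.to.fp16.f32 / llvm.convert.from.fp16.f32 are the intrinsics named in the bullet above.

from llvmlite import ir

mod = ir.Module(name="fp16_fallback_sketch")
i16, f32 = ir.IntType(16), ir.FloatType()

# Declare the conversion intrinsics mentioned above.
to_fp16 = ir.Function(mod, ir.FunctionType(i16, [f32]), name="llvm.convert.to.fp16.f32")
from_fp16 = ir.Function(mod, ir.FunctionType(f32, [i16]), name="llvm.convert.from.fp16.f32")

# A half-precision add lowered as i16 storage plus float32 arithmetic.
fn = ir.Function(mod, ir.FunctionType(i16, [i16, i16]), name="f16_add")
builder = ir.IRBuilder(fn.append_basic_block())
a = builder.call(from_fp16, [fn.args[0]])
b = builder.call(from_fp16, [fn.args[1]])
builder.ret(builder.call(to_fp16, [builder.fadd(a, b)]))
print(mod)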

@MurrayData commented:

Just a note that CuPy now supports float16 in user-defined kernels.

>>> import cupy as cp
>>> c = cp.random.random(1000000).astype(cp.float16)

>>> @cp.fuse()
... def squared_diff(x, y):
...     return (x - y) * (x - y)

>>> squared_diff(c, 0)  # squared difference from zero, i.e. c squared
array([0.848  , 0.637  , 0.05655, ..., 0.02043, 0.5337 , 0.8125 ],
      dtype=float16)

It would be good to have float16 in Numba kernels to prepare data for deep learning models and for RAPIDS' Numba-based UDFs. At present, RAPIDS converts float32 to float16 as a workaround.

@gmarkall (Member) commented:

I can't edit the issue to tick the box, but it looks like:

llvmlite:

  • add half type

is done in numba/llvmlite#509, and

  • Nvidia to add support for f16 ("half") types in NVVM IR; otherwise we need to sprinkle llvm.convert.to.fp16.f32 / llvm.convert.from.fp16.f32 everywhere (and lower types.f16 in Numba code to i16 in the IR).

is not complete, but there is a merged PR that adds support for fp16 intrinsics: numba/llvmlite#510
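
With numba/llvmlite#509 merged, the half type is available directly from llvmlite's IR layer. A minimal sketch (the half_add function name is made up for illustration):

from llvmlite import ir

half = ir.HalfType()  # the LLVM 'half' type added in numba/llvmlite#509
mod = ir.Module(name="half_sketch")
fn = ir.Function(mod, ir.FunctionType(half, [half, half]), name="half_add")
builder = ir.IRBuilder(fn.append_basic_block())
builder.ret(builder.fadd(fn.args[0], fn.args[1]))
print(mod)  # textual IR with a 'define half @"half_add"(half ..., half ...)' signature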

@seibert (Contributor) commented Aug 17, 2020

I ticked the half box for you.

@seibert (Contributor) commented Aug 17, 2020

It also looks like LLVM 11 adds bfloat16 support: llvm/llvm-project@8c24f33158d8

The type name seems to be bfloat.

@GuillaumeLeclerc commented:

Hello,

Any updates on this?

@gmarkall (Member) commented:

There is some support in CUDA from #7556 and #7460. This work is ongoing, though, and more PRs will be needed for full float16 support in CUDA.
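
For anyone who wants to try the CUDA support that has landed so far, something along these lines should work with a sufficiently recent Numba. This is a minimal sketch; the kernel name and launch configuration are arbitrary, and exactly what is covered depends on the version installed.

import numpy as np
from numba import cuda

@cuda.jit
def add_halves(x, y, out):
    i = cuda.grid(1)
    if i < out.size:
        out[i] = x[i] + y[i]  # float16 arithmetic lowered by the CUDA target

n = 1024
x = np.random.rand(n).astype(np.float16)
y = np.random.rand(n).astype(np.float16)
out = np.zeros(n, dtype=np.float16)
add_halves[(n + 127) // 128, 128](x, y, out)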

@calclavia commented:

It seems that with Numba right now it's possible to use FP16 inputs, but not FP16 shared memory. Does full fp16 support in CUDA mean we will be able to use fp16 shared memory?

Example (currently does not work):

cuda.shared.array(shape=(1,), dtype=nb.float16)

@lucidrains commented:

Any progress?

@Hjorthmedh commented:

Tag. Interested to hear if there is an update on this.

@dongrixinyu commented:

Float16 can be much faster than float32 and float64, so as a JIT tool for Python it is very important that Numba supports the float16 type.

@gmarkall (Member) commented:

Currently float16 is supported in CUDA (maybe not everything in the latest release, but on main at least). We are still working on adding float16 for CPU targets.

@seibert (Contributor) commented Jan 12, 2023

At this point, the only CPUs supporting native FP16 arithmetic are Intel Sapphire Rapids (which just came out) and ARMv8.2 and later, right? I don't think AMD has any FP16 support on CPU at all right now.

@dongrixinyu commented:

Currently float16 is supported in CUDA (maybe not everything in the latest release, but on main at least). We are still working on adding float16 for CPU targets.

Is the float16-on-CPU problem affected by the LLVM compiler not supporting float16?

@gmarkall (Member) commented:

Is the float16-on-CPU problem affected by the LLVM compiler not supporting float16?

LLVM supports it, as does llvmlite; support was added in numba/llvmlite#509.

For systems that don't have native float16, perhaps compiler-rt implementations need to be used (@testhound may know or recall what the thinking here was better than I do).
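
One rough way to check whether the host CPU advertises native fp16 support (and hence whether a compiler-rt fallback would be needed) is to ask llvmlite for the host features. The feature names checked here ('f16c', 'avx512fp16', 'fullfp16') are assumptions that vary by architecture and LLVM version:

from llvmlite import binding as llvm

llvm.initialize()
llvm.initialize_native_target()

features = llvm.get_host_cpu_features()  # mapping of feature name -> bool
fp16_flags = [f for f in ("f16c", "avx512fp16", "fullfp16") if features.get(f)]
print(llvm.get_host_cpu_name(), fp16_flags or "no native fp16 features detected")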

@testhound (Contributor) commented:

@dongrixinyu I am currently working on float16 support for the CPU, and @gmarkall is correct: support for LLVM's compiler-rt will be required for targets that do not support float16 instructions.

@dongrixinyu commented:

@dongrixinyu I am currently working on float16 support for the CPU, and @gmarkall is correct: support for LLVM's compiler-rt will be required for targets that do not support float16 instructions.

@testhound, really looking forward to your support for float16 on the CPU! Thanks!

@dongrixinyu commented:

@dongrixinyu I am currently working on float16 support for the CPU, and @gmarkall is correct: support for LLVM's compiler-rt will be required for targets that do not support float16 instructions.

If you cannot spare much time to finish float16, I could help.

@ashvardanian commented:

This would also affect our work on USearch. It supports f16, and it supports using Numba JIT to define the metrics, but not both together :) Here is an example.

@SkBlaz commented Sep 13, 2023

Hi! Great initiative. Is the checklist in the initial post up to date in terms of which items are finished?

@gmarkall (Member) commented:

It's supported on the CUDA target but not the CPU target, so either all or none of the boxes could be checked, depending on what we decide this issue is about.

@SkBlaz commented Sep 14, 2023

Great to hear that, thanks. I was indeed asking more about the CPU scope: is CPU support still a planned feature, or is it rather unlikely?

@gmarkall (Member) commented:

It's possible. There are some llvmlite PRs in flight in support of it: numba/llvmlite#979 and numba/llvmlite#986.
