[DataType] Initial support of fp8 (e4m3/e5m2) #14863
Conversation
Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.
Generated by tvm-bot
Thanks for such great work! LGTM except for some nits.
Thanks for the PR - I think adding support for fp8 is very positive.
I wonder if we should have an RFC, as this is adding a feature that may impact passes, schedules etc.
Adding fp8 is unlikely to impact passes or schedules in a significant way, as most passes can simply opt out of the dtype :) Having broad awareness is good; cross-linking https://discuss.tvm.apache.org/t/tvm-support-for-fp8-a-discussion/14656, we can bring more discussion there as well.
Thank you everyone, looking forward to more FP8 improvements. This technology is still early, so it is possible that we will need to iterate a few times. It is great to get timely support in place so the community can start trying it out and learning.
looks like something unexpected happens with the pass:

import tvm
from tvm.script import tir as T

M = 64
N = 64

@tvm.script.ir_module
class MyModule:
    @T.prim_func
    def main(a: T.handle, b: T.handle):
        T.func_attr({"global_symbol": "main"})
        A = T.match_buffer(a, (M, N))
        B = T.match_buffer(b, (M, N))
        for i, j in T.grid(M, N):
            with T.block("B"):
                vi, vj = T.axis.remap("SS", [i, j])
                B[vi, vj] = A[vi, vj] * 2.0
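The traceback below shows the script then builds the scheduled module for rocm; a hedged reconstruction of that missing step (only the tvm.build call appears verbatim in the traceback, the tvm.tir.Schedule wrapper is an assumption):

# Assumed: wrap the module in a schedule, since the traceback references sch.mod.
sch = tvm.tir.Schedule(MyModule)
# Taken verbatim from the traceback (memory_copy.py, line 37).
rocm_mod = tvm.build(sch.mod, target="rocm --host=llvm")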
Traceback (most recent call last):
File "memory_copy.py", line 37, in <module>
rocm_mod = tvm.build(sch.mod, target="rocm --host=llvm")
File "/home/aiscuser/v-leiwang3/tvm/python/tvm/driver/build_module.py", line 281, in build
rt_mod_host = _driver_ffi.tir_to_runtime(annotated_mods, target_host)
File "/home/aiscuser/v-leiwang3/tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 238, in __call__
raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
12: TVMFuncCall
11: tvm::runtime::PackedFuncObj::Extractor<tvm::runtime::PackedFuncSubObj<tvm::runtime::TypedPackedFunc<tvm::runtime::Module (tvm::runtime::Map<tvm::Target, tvm::IRModule, void, void> const&, tvm::Target)>::AssignTypedLambda<tvm::{lambda(tvm::runtime::Map<tvm::Target, tvm::IRModule, void, void> const&, tvm::Target)#6}>(tvm::{lambda(tvm::runtime::Map<tvm::Target, tvm::IRModule, void, void> const&, tvm::Target)#6}, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)#1}> >::Call(tvm::runtime::PackedFuncObj const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tvm::runtime::TVMRetValue)
10: tvm::TIRToRuntime(tvm::runtime::Map<tvm::Target, tvm::IRModule, void, void> const&, tvm::Target const&)
9: tvm::SplitMixedModule(tvm::IRModule, tvm::Target const&, tvm::Target const&)
8: tvm::ApplyPasses(tvm::IRModule, tvm::transform::Sequential)
7: tvm::transform::Pass::operator()(tvm::IRModule) const
6: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
5: tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
4: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
3: tvm::tir::transform::PrimFuncPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
2: _ZN3tvm7runtime13PackedFuncObj
1: tvm::runtime::TypedPackedFunc<tvm::tir::PrimFunc (tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)>::AssignTypedLambda<tvm::tir::transform::FP8StorageLegalize()::{lambda(tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)#1}>(tvm::tir::transform::FP8StorageLegalize()::{lambda(tvm::tir::PrimFunc, tvm::IRModule, tvm::transform::PassContext)#1})::{lambda(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)#1}::operator()(tvm::runtime::TVMArgs const, tvm::runtime::TVMRetValue) const
0: tvm::tir::StorageLegalizer::Legalize(tvm::tir::PrimFunc)
File "/home/aiscuser/v-leiwang3/tvm/src/tir/transforms/unsupported_dtype_legalize.cc", line 478
TVMError:
---------------------------------------------------------------
An error occurred during the execution of TVM.
For more information, please see: https://tvm.apache.org/docs/errors.html
---------------------------------------------------------------
Check failed: func->buffer_map.size() == 0 (2 vs. 0) : This pass must be called after MakePackedAPI
@LeiWang1999 it should have been fixed in #15102.
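For context, the failing check enforces that FP8StorageLegalize only runs on PrimFuncs whose buffer_map has already been lowered away by MakePackedAPI. A minimal sketch of applying the passes in that order by hand (the BindTarget step and the target string are assumptions for illustration; the default lowering pipeline normally takes care of this ordering):

import tvm
from tvm import tir

# Illustrative manual pipeline: bind a target so MakePackedAPI can run,
# flatten the buffer_map, and only then apply the fp8 storage legalizer
# added in this PR.
target = tvm.target.Target("rocm", host="llvm")
mod = tir.transform.BindTarget(target)(MyModule)
mod = tir.transform.MakePackedAPI()(mod)
mod = tir.transform.FP8StorageLegalize()(mod)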
Motivation
Recently, NVIDIA announced official support for the fp8 data types e4m3 and e5m2: the first has 4 exponent bits and 3 mantissa bits, while the second has 5 exponent bits and 2 mantissa bits. NVIDIA encourages using e4m3 for the forward pass and e5m2 (which has a larger dynamic range) for the backward pass. TVM currently has no support for these data types. As a first step towards fp8 support, this PR adds the new type codes e4m3_float8 and e5m2_float8, and implements the legalization passes FP8ComputeLegalize and FP8StorageLegalize so that the new types can be used on backends without native fp8 support.
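A minimal sketch of what the new type codes enable at the TE level (the round-trip compute below is illustrative and not taken from the PR; on targets without native fp8, the legalization passes are expected to rewrite it, mirroring the existing BF16 legalization):

import tvm
from tvm import te

# Illustrative only: cast a float16 input to the new e4m3 type and back.
n = te.var("n")
A = te.placeholder((n,), dtype="float16", name="A")
B = te.compute(
    (n,),
    lambda i: A[i].astype("e4m3_float8").astype("float16"),
    name="B",
)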
Future Work
Use the wgmma.mma_async.sync PTX assembly to target the fp8 tensor cores on Ada/Hopper.
Notes
Infinity and NaN are not handled in our legalization passes (this behavior matches our previous BF16 legalization implementation) because supporting them on the software side is costly. It is the user's responsibility to guarantee that the conversion is safe.
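One way to keep such conversions safe on the user side is to clamp values into the finite fp8 range before casting. A small NumPy sketch (the range constants are the standard finite maxima of the e4m3/e5m2 formats, not something this PR exposes):

import numpy as np

E4M3_MAX = 448.0     # largest finite e4m3 value
E5M2_MAX = 57344.0   # largest finite e5m2 value

def saturate_for_fp8(x: np.ndarray, max_val: float = E4M3_MAX) -> np.ndarray:
    # Clamp into the representable range so a later fp8 cast cannot
    # overflow to values the legalization pass does not handle.
    return np.clip(x, -max_val, max_val)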
cc @MasterJH5574 @masahi @tqchen @Hzfengsy