[CUDA] Various int8 fix (cublas, cutlass, etc) #10596

masahi · 2022-03-14T06:00:32Z

Fixed a funny bug where AutoTVM tensorcore dense on int8 is never selected by the op strategy. This is because it sits in the else branch of

tvm/python/tvm/relay/op/strategy/cuda.py

Lines 838 to 843 in 5c0ea30

    
    if (
        target.kind.name == "cuda"
        and data.dtype == "int8"
        and weights.dtype == "int8"
        and out_type.dtype == "int32"
    ):
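To illustrate why a strategy that lives only in the else branch can never be picked for int8, here is a minimal sketch in plain Python. It is not the code in cuda.py: the function names, the `tensorcore_ok` flag, and the implementation-name strings are hypothetical, and `select_fixed` shows only one possible restructuring, not necessarily the one this PR uses.

```python
# Hypothetical, simplified model of a dense op strategy. None of these
# names are TVM APIs; the strings only stand in for registered schedules.

def is_int8_dense(target_kind, data_dtype, weight_dtype, out_dtype):
    """Mirror of the condition quoted above."""
    return (
        target_kind == "cuda"
        and data_dtype == "int8"
        and weight_dtype == "int8"
        and out_dtype == "int32"
    )


def select_buggy(target_kind, data_dtype, weight_dtype, out_dtype, tensorcore_ok):
    """Tensorcore is registered only in the else branch."""
    if is_int8_dense(target_kind, data_dtype, weight_dtype, out_dtype):
        return ["dense_int8"]
    impls = ["dense_small_batch"]
    if tensorcore_ok:
        # Unreachable for int8 inputs: they always take the branch above.
        impls.append("dense_tensorcore")
    return impls


def select_fixed(target_kind, data_dtype, weight_dtype, out_dtype, tensorcore_ok):
    """Assumed restructuring: the tensorcore check runs for both branches."""
    if is_int8_dense(target_kind, data_dtype, weight_dtype, out_dtype):
        impls = ["dense_int8"]
    else:
        impls = ["dense_small_batch"]
    if tensorcore_ok:
        impls.append("dense_tensorcore")
    return impls


int8_args = ("cuda", "int8", "int8", "int32", True)
print(select_buggy(*int8_args))
print(select_fixed(*int8_args))
```

In the buggy variant the int8 call never lists the tensorcore candidate, no matter what `tensorcore_ok` says, which matches the symptom described above.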

Supported int8 -> int32 cublas batch_matmul
Fixed the "Too many predicates" error in the cutlass int8 + align1 case. See NVIDIA/cutlass#409 (comment): "Too many predicates" error from `PredicatedTileAccessIteratorPredicates`.

The first two fixes above make it possible to run int8 bert-base end to end via autotvm or cublas on tensorcore (cutlass already works without error).
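For readers unfamiliar with the int8 -> int32 convention mentioned above, the following is a reference sketch of what such a batch matmul computes: int8 inputs, with each dot product accumulated in a wider integer. It is pure Python and makes no cublas calls; the function name and the `[batch][M][K]` x `[batch][K][N]` layout are assumptions for illustration and may differ from the layout TVM's batch_matmul uses.

```python
INT8_MIN, INT8_MAX = -128, 127


def batch_matmul_int8_int32(a, b):
    """Reference semantics only (hypothetical helper, not a TVM or cublas API).

    a: nested lists shaped [batch][M][K] holding int8 values
    b: nested lists shaped [batch][K][N] holding int8 values
    Returns nested lists shaped [batch][M][N] with wide (int32-style) sums.
    """
    out = []
    for a_mat, b_mat in zip(a, b):
        m_dim, k_dim, n_dim = len(a_mat), len(b_mat), len(b_mat[0])
        res = [[0] * n_dim for _ in range(m_dim)]
        for i in range(m_dim):
            for j in range(n_dim):
                acc = 0  # accumulator is wider than the int8 inputs
                for k in range(k_dim):
                    x, y = a_mat[i][k], b_mat[k][j]
                    assert INT8_MIN <= x <= INT8_MAX
                    assert INT8_MIN <= y <= INT8_MAX
                    acc += x * y
                res[i][j] = acc
        out.append(res)
    return out


# One batch, a 1x4 row times a 4x1 column, all entries at the int8 maximum.
a = [[[127, 127, 127, 127]]]
b = [[[127], [127], [127], [127]]]
out = batch_matmul_int8_int32(a, b)
print(out[0][0][0])  # 4 * 127 * 127 = 64516, far outside the int8 range
```

Even a length-4 dot product overflows int8, which is why the output dtype is int32 rather than int8.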

@Laurawly @vinx13 @junrushao1994 @comaniac

jwfromm

Those tabs can be dangerous. Thanks for the fix Masa, LGTM!

* [CUTLASS] avoid tile size 256 for int8 + align1 case
* allow selecting int8 dense strategy for vulkan
* fixed cublas batch matmul for int8
* fixed int8 dense tensorcore strategy
* add cutlass conv align1 + int8 case
* support int8 mixed precision cublas bmm
* black

masahi added 7 commits March 14, 2022 14:59

[CUTLASS] avoid tile size 256 for int8 + align1 case

62fcc60

allow selecting int8 dense strategy for vulkan

ca5aabc

fixed cublas batch matmul for int8

c1ba427

fixed int8 dense tensorcore strategy

ca11df2

add cutlass conv align1 + int8 case

3f7b7b6

support int8 mixed precision cublas bmm

5eb487a

black

12989d6

masahi force-pushed the cutlass-int8-fix branch from 4879857 to 12989d6 Compare March 14, 2022 06:52

jwfromm approved these changes Mar 14, 2022

jwfromm merged commit 7d5ef84 into apache:main Mar 14, 2022

masahi mentioned this pull request Mar 15, 2022

Add benchmark models that are not easily accessible tlc-pack/TLCBench#5

Merged
