
Enable matmul for nvFuser #207

Merged: 3 commits merged into main from nvf_matmul on May 3, 2024
Conversation


@Priya2698 commented Apr 17, 2024

What does this PR do?

Enables matmul in nvFuser. Part of resolving NVIDIA/Fuser#2053
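
For reference, a minimal sketch (not from this PR) of the nvFuser frontend op being enabled here; it mirrors the fd.ops.matmul call in the repro further down:

import torch
from nvfuser import FusionDefinition, DataType

def matmul_fusion(fd: FusionDefinition) -> None:
    # Two bfloat16 matrices multiplied directly via the frontend matmul op.
    a = fd.define_tensor(shape=[-1, -1], contiguity=[True, True], dtype=DataType.BFloat16)
    b = fd.define_tensor(shape=[-1, -1], contiguity=[True, True], dtype=DataType.BFloat16)
    out = fd.ops.matmul(a, b)
    fd.add_output(out)

with FusionDefinition() as fd:
    matmul_fusion(fd)

inputs = [
    torch.randn(16, 16, dtype=torch.bfloat16, device='cuda'),
    torch.randn(16, 32, dtype=torch.bfloat16, device='cuda'),
]
outputs = fd.execute(inputs)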

@Priya2698 changed the title from "register matmul" to "Enable matmul for nvFuser" on Apr 17, 2024
@jjsjann123

Do you mind trying this in your branch?
#193

Note: you might want to remove the nv_enable_bookend option and replace it with enabling matmul, as sketched below.
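
A sketch of that swap, assuming both knobs are passed as thunder.jit compile options; nv_enable_matmul is an assumed name here, only nv_enable_bookend appears in this thread:

import torch
import thunder

def fn(a, b):
    return torch.matmul(a, b)

# Before (per #193): jfn = thunder.jit(fn, nv_enable_bookend=False)
# After: enable nvFuser matmul claiming instead (option name assumed).
jfn = thunder.jit(fn, nv_enable_matmul=True)

a = torch.randn(16, 16, device="cuda", dtype=torch.bfloat16)
b = torch.randn(16, 32, device="cuda", dtype=torch.bfloat16)
out = jfn(a, b)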

I'm getting an assert on the dtype being reduced-precision float. (Maybe add that to the check?! But why do we have that in the first place, wasn't it kicked to ATen?!)

It's also failing with:
RuntimeError: h.has_value() INTERNAL ASSERT FAILED at "/opt/pytorch/nvfuser/csrc/fusion_segmenter.cpp":3671, please report a bug with repro script to NVFuser at https://github.com/NVIDIA/Fuser/issues. Can not find a scheduler to schedule fusion segment

import torch
from nvfuser import FusionDefinition, DataType

def nvfuser_fusion_id0(fd : FusionDefinition) -> None :
    T0 = fd.define_tensor(shape=[-1, -1, -1], contiguity=[True, True, True], dtype=DataType.BFloat16, is_cpu=False, stride_order=[2, 1, 0])
    T1 = fd.define_tensor(shape=[-1, -1], contiguity=[True, True], dtype=DataType.BFloat16, is_cpu=False, stride_order=[1, 0])
    T2 = fd.ops.permute(T1, dims=[1, 0])
    T3 = fd.ops.permute(T0, dims=[2, 1, 0])
    S4 = fd.define_scalar(16, dtype=DataType.Int)
    S5 = fd.define_scalar(32, dtype=DataType.Int)
    V6 = fd.define_vector([S4, S5], dtype=DataType.Int)
    T7 = fd.ops.reshape(T3, new_shape=V6)
    T8 = fd.ops.matmul(T2, T7)
    S9 = fd.define_scalar(16, dtype=DataType.Int)
    S10 = fd.define_scalar(16, dtype=DataType.Int)
    S11 = fd.define_scalar(2, dtype=DataType.Int)
    V12 = fd.define_vector([S9, S10, S11], dtype=DataType.Int)
    T13 = fd.ops.reshape(T8, new_shape=V12)
    T14 = fd.ops.permute(T13, dims=[2, 1, 0])
    fd.add_output(T14)

with FusionDefinition() as fd:
    nvfuser_fusion_id0(fd)

inputs = [
    torch.randn((512,), dtype=torch.bfloat16, device='cuda:0').as_strided((2, 16, 16), (256, 16, 1)),
    torch.randn((256,), dtype=torch.bfloat16, device='cuda:0').as_strided((16, 16), (16, 1)),
]   
fd.execute(inputs)
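
For comparison, the same computation in eager PyTorch (a sketch to cross-check what the failing fusion computes):

import torch

a = torch.randn((512,), dtype=torch.bfloat16, device='cuda:0').as_strided((2, 16, 16), (256, 16, 1))
b = torch.randn((256,), dtype=torch.bfloat16, device='cuda:0').as_strided((16, 16), (16, 1))

# Mirrors the fusion above: transpose b, flatten the permuted a, matmul, reshape back.
t2 = b.permute(1, 0)                          # (16, 16)
t7 = a.permute(2, 1, 0).reshape(16, 32)       # (16, 32)
t8 = torch.matmul(t2, t7)                     # (16, 32)
ref = t8.reshape(16, 16, 2).permute(2, 1, 0)  # (2, 16, 16)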

@jjsjann123 left a comment


Sorry, this fell off my radar.

cc'ing @IvanYashchuk regarding the matmul checker rejecting vs. throwing an error. It makes more sense to just reject the matmul instead of throwing an error on an outdated nvFuser version (our stable release is still using an older nvFuser version).

Meanwhile, stamping to merge!
We can revisit Ivan's suggestion if he has a strong opinion on the exception; we can throw a warning if the concern is just about silently running with an outdated library.
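
A minimal sketch of that rejection path (all names and the version threshold below are illustrative, not the actual checker in this PR):

import warnings

from looseversion import LooseVersion
import nvfuser

# Hypothetical minimum version; the real cutoff would be whenever nvFuser added matmul support.
MATMUL_MIN_VERSION = LooseVersion("0.2.0")

def can_claim_matmul() -> bool:
    # Reject (with a warning) instead of raising, so a stable release pinned to
    # an older nvFuser simply falls back to another executor.
    if LooseVersion(str(nvfuser.version())) < MATMUL_MIN_VERSION:
        warnings.warn("Installed nvFuser is too old for matmul; not claiming it for fusion.")
        return False
    return True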

@t-vi left a comment


@t-vi merged commit 831d6d0 into main on May 3, 2024
35 of 39 checks passed
@t-vi deleted the nvf_matmul branch on May 3, 2024 at 11:45