torch._int_mm: fix triton kernel caching #99283
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/99283. Note: links to docs will display an error until the docs builds have been completed.
❗ 1 Active SEV: there is 1 currently active SEV. If your PR is affected, please view it below.
❌ 1 Failure: as of commit e11ce6b, the following jobs have new failures:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
I'm not convinced this is needed, since I recently changed the way things are cached in #98351. Tl;dr: the caching is no longer based on the default choice's name but rather is hardcoded into the autotuning call.
Here is the line using choice[0]'s name as the cache key: pytorch/torch/_inductor/select_algorithm.py, line 742 at c2fd198
I guess we can just use that instead of this hack then; I wasn't sure if there was some reason it had to be
Oh, good catch, we never actually pass the hardcoded name to the lookup function. I think if we just pass
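For readers skimming the thread, here is a minimal, generic sketch of the idea being discussed. The names (`pick_best`, `_autotune_cache`) are hypothetical and this is not Inductor's actual `select_algorithm` code; it only illustrates keying the autotune cache on a stable, caller-provided name instead of the first choice's generated kernel name, so that two structurally identical int8 matmuls reuse one benchmarked choice:
```
from typing import Callable, Dict, List, Tuple

# Cache keyed on (stable op name, input shape) rather than on whatever the
# first candidate kernel happens to be named.
_autotune_cache: Dict[Tuple[str, Tuple[int, ...]], str] = {}

def pick_best(name: str, shape: Tuple[int, ...],
              choices: List[Tuple[str, Callable[[], float]]]) -> str:
    """Return the label of the fastest choice, memoized by (name, shape)."""
    key = (name, shape)
    if key not in _autotune_cache:
        # Benchmark each candidate once; later calls with the same stable
        # key skip benchmarking (and, in Inductor, recompilation).
        _autotune_cache[key] = min(choices, key=lambda c: c[1]())[0]
    return _autotune_cache[key]

# Two call sites that would otherwise produce differently named kernels
# share one cache entry because they pass the same stable name.
pick_best("int_mm", (32, 32), [("triton_template", lambda: 1.0),
                               ("extern_aten", lambda: 2.0)])
pick_best("int_mm", (32, 32), [("triton_template", lambda: 1.0),
                               ("extern_aten", lambda: 2.0)])
```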
Force-pushed from 5cfea1d to 49cd98b.
@pytorchbot merge
Merge failed. Reason: Approval needed from one of the following:
@pytorchbot merge -f 'rule requiring approvers from an allowlist seems unrelated'
Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: Approval needed from one of the following:
I guess each failed merge unmasks 5 more reviewers, so I can keep failing to merge until I see a familiar name...
@pytorchmergebot merge -f 'give me 5 more names, bot'
I'm not sure of the full list, but Jason can definitely stamp this as he leads the autotuning work.
Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: Approval needed from one of the following:
@pytorchbot merge -f 'gimme 5 more names plz'
thx! I'll see who manages to stamp this first. I gave some feedback to the job maintainers so hopefully we can make it less cryptic for next time.
Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: Approval needed from one of the following:
@pytorchbot merge -f 'gimme 5 more names plz'
Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: Approval needed from one of the following:
a stamp
Summary: A hacky fix to ensure that kernels generated for `torch._int_mm` can be cached. We can remove this hack once eager mode `torch._int_mm` is better supported. Let me know if something more proper is needed instead of the hack.
Test plan:
```
# running the script below led to two compilations of the Triton
# int8,int8->int32 kernel before this PR, and to only one compilation
# (which is reused) after this PR
import torch
import torch.nn as nn

x = torch.randint(-128, 127, (32, 32), device='cuda', dtype=torch.int8)
y = torch.randint(-128, 127, (32, 32), device='cuda', dtype=torch.int8)

class M(nn.Module):
    def forward(self, x):
        x = torch._int_mm(x, y)
        x = x.to(torch.int8)
        x = torch._int_mm(x, y)
        return x

m = M().cuda().half()
m = torch.compile(m, options={"max-autotune": True})
z = m(x)
z = m(x)
```
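For context, `torch._int_mm` is the int8 x int8 -> int32 matrix multiply whose generated kernels are being cached here. A minimal eager-mode sketch (assuming a CUDA device and shapes the eager kernel accepts; the shapes mirror the test plan above):
```
import torch

# int8 x int8 matmul accumulated into int32, on CUDA.
a = torch.randint(-128, 127, (32, 32), device='cuda', dtype=torch.int8)
b = torch.randint(-128, 127, (32, 32), device='cuda', dtype=torch.int8)
c = torch._int_mm(a, b)
print(c.dtype, c.shape)  # torch.int32 torch.Size([32, 32])
```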
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA: 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: 1 job has failed: inductor / cuda11.8-py3.10-gcc7-sm86 / test (inductor, 1, 1, linux.g5.4xlarge.nvidia.gpu). Details for Dev Infra team: raised by workflow job.
@pytorchbot merge -f 'CI failure seems unrelated'
Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Summary:
A fix to ensure that kernels generated for `torch._int_mm` can be cached. We can remove this hack once eager mode `torch._int_mm` is better supported. Let me know if something more proper is needed instead of the hack.
Test plan: see the script in the summary above.
Fixes #ISSUE_NUMBER
cc @soumith @voznesenskym @penguinwu @anijain2305 @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @desertfire