
Conversation

@vkuzo (Contributor) commented Apr 17, 2023

Summary:

A fix to ensure that kernels generated for `torch._int_mm` can be cached. We can remove this hack once eager mode `torch._int_mm` is better supported.

Let me know if something more proper is needed instead of the hack.

Test plan:

```
# Running the script below led to two compilations of the triton
# int8,int8->int32 kernel before this PR, and to a single compilation,
# which is reused, after this PR.

import torch
import torch.nn as nn

x = torch.randint(-128, 127, (32, 32), device='cuda', dtype=torch.int8)
y = torch.randint(-128, 127, (32, 32), device='cuda', dtype=torch.int8)

class M(nn.Module):
    def forward(self, x):
        x = torch._int_mm(x, y)
        x = x.to(torch.int8)
        x = torch._int_mm(x, y)
        return x

m = M().cuda().half()
m = torch.compile(m, options={"max-autotune": True})

z = m(x)
z = m(x)
```
Fixes #ISSUE_NUMBER

cc @soumith @voznesenskym @penguinwu @anijain2305 @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @desertfire

@pytorch-bot bot commented Apr 17, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/99283

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 1 Failure

As of commit e11ce6b:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@nmacchioni (Contributor) left a comment

I'm not convinced this is needed, since I recently changed the way things are cached in #98351. TL;DR: the caching is no longer based on the default choice's name but rather is hardcoded into the autotuning call.

@vkuzo (Contributor, Author) commented Apr 17, 2023

> I'm not convinced this is needed, since I recently changed the way things are cached in #98351. TL;DR: the caching is no longer based on the default choice's name but rather is hardcoded into the autotuning call.

here is the line using `choices[0].name` as the cache key:

I guess we can just use that instead of this hack then; I wasn't sure if there was some reason it had to be `choices[0]`.

@nmacchioni (Contributor) commented

> I'm not convinced this is needed, since I recently changed the way things are cached in #98351. TL;DR: the caching is no longer based on the default choice's name but rather is hardcoded into the autotuning call.
>
> here is the line using `choices[0].name` as the cache key:
>
> I guess we can just use that instead of this hack then; I wasn't sure if there was some reason it had to be `choices[0]`.

Oh, good catch: we never actually pass the hardcoded name to the lookup function. I think if we just pass `name` here instead of `choices[0].name`, that would be a good fix. Thanks!
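
For illustration, a minimal sketch of the kind of cache-key change being discussed, assuming a much-simplified autotuner; `AUTOTUNE_CACHE`, `select_algorithm`, and `benchmark` are hypothetical stand-ins, not the actual torch._inductor internals:

```
# Hypothetical sketch, not the real torch._inductor code: it only
# illustrates keying the autotune cache on the caller-supplied op name
# rather than on the first choice's name.
AUTOTUNE_CACHE = {}

def select_algorithm(name, choices, benchmark):
    # Before the fix: key = choices[0].name, so the same op could miss
    # the cache (and be re-autotuned and recompiled) whenever the first
    # choice happened to differ between call sites.
    key = name  # after the fix: key on the hardcoded name the caller passes
    if key not in AUTOTUNE_CACHE:
        # Benchmark each candidate kernel once and remember the winner.
        AUTOTUNE_CACHE[key] = min(choices, key=benchmark)
    return AUTOTUNE_CACHE[key]
```

Under this sketch, the second `torch._int_mm` call in the test plan above would hit the cache instead of triggering a second triton compilation.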

@vkuzo force-pushed the int_mm_fix_cache branch 2 times, most recently from 5cfea1d to 49cd98b, April 17, 2023 23:01
@vkuzo requested a review from nmacchioni, April 17, 2023 23:01
@vkuzo (Contributor, Author) commented Apr 18, 2023

@pytorchbot merge

@pytorch-bot bot added the ciflow/trunk label (Trigger trunk jobs on your pull request), Apr 18, 2023
@pytorchmergebot (Collaborator) commented

Merge failed

Reason: Approval needed from one of the following:
jfix71, gqchen, tiandiao123, jerry39213gh, ananthsub, ...

Details for Dev Infra team: raised by workflow job

Failing merge rule: Core Maintainers

@vkuzo (Contributor, Author) commented Apr 18, 2023

@pytorchbot merge -f 'rule requiring approvers from an allowlist seems unrelated'

@pytorchmergebot (Collaborator) commented

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@pytorchmergebot (Collaborator) commented

Merge failed

Reason: Approval needed from one of the following:
shoumikhin, orionr, naveedgol, mikekgfb, jeanschmidt, ...

Details for Dev Infra team: raised by workflow job

Failing merge rule: Core Maintainers

@nmacchioni (Contributor) commented

@vkuzo I'm not on the list of maintainers, so you'll still need a stamp from one of them. @jansel can you review?

@vkuzo (Contributor, Author) commented Apr 18, 2023

I guess each failed merge unmasks 5 more reviewers, so I can keep failing to merge until I see a familiar name...

@vkuzo (Contributor, Author) commented Apr 18, 2023

> @vkuzo I'm not on the list of maintainers, so you'll still need a stamp from one of them. @jansel can you review?

Do you know what the list is? I didn't see anyone I know so far among the 5 people the job gives. Didn't want to ask a rando.

@vkuzo (Contributor, Author) commented Apr 18, 2023

@pytorchmergebot merge -f 'give me 5 more names, bot'

@nmacchioni (Contributor) commented

> @vkuzo I'm not on the list of maintainers, so you'll still need a stamp from one of them. @jansel can you review?
>
> Do you know what the list is? I didn't see anyone I know so far among the 5 people the job gives. Didn't want to ask a rando.

I'm not sure of the full list, but Jason can definitely stamp this as he leads the autotuning work.

@pytorchmergebot (Collaborator) commented

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@pytorchmergebot (Collaborator) commented

Merge failed

Reason: Approval needed from one of the following:
chauhang, xw285cornell, janeyx99, hyuen, 842974287, ...

Details for Dev Infra team: raised by workflow job

Failing merge rule: Core Maintainers

@vkuzo (Contributor, Author) commented Apr 18, 2023

@pytorchbot merge -f 'gimme 5 more names plz'

@vkuzo (Contributor, Author) commented Apr 18, 2023

> @vkuzo I'm not on the list of maintainers, so you'll still need a stamp from one of them. @jansel can you review?
>
> Do you know what the list is? I didn't see anyone I know so far among the 5 people the job gives. Didn't want to ask a rando.
>
> I'm not sure of the full list, but Jason can definitely stamp this as he leads the autotuning work.

Thx! I'll see who manages to stamp this first. I gave some feedback to the job maintainers, so hopefully we can make it less cryptic next time.

@pytorchmergebot (Collaborator) commented

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@pytorchmergebot (Collaborator) commented

Merge failed

Reason: Approval needed from one of the following:
mattjgalloway, jiayisuse, larryliu0820, banitag1, byterover, ...

Details for Dev Infra team: raised by workflow job

Failing merge rule: Core Maintainers

@vkuzo (Contributor, Author) commented Apr 18, 2023

@pytorchbot merge -f 'gimme 5 more names plz'

@pytorchmergebot (Collaborator) commented

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@pytorchmergebot (Collaborator) commented

Merge failed

Reason: Approval needed from one of the following:
xta0, zpao, naveedgol, malfet, pinaki-mukerji, ...

Details for Dev Infra team: raised by workflow job

Failing merge rule: Core Maintainers

@janeyx99 (Contributor) left a comment

a stamp

@vkuzo force-pushed the int_mm_fix_cache branch from 49cd98b to e11ce6b, April 18, 2023 17:31
@vkuzo (Contributor, Author) commented Apr 18, 2023

@pytorchbot merge

@pytorchmergebot (Collaborator) commented

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@pytorchmergebot (Collaborator) commented

Merge failed

Reason: 1 job has failed: inductor / cuda11.8-py3.10-gcc7-sm86 / test (inductor, 1, 1, linux.g5.4xlarge.nvidia.gpu)

Details for Dev Infra team: raised by workflow job

@vkuzo (Contributor, Author) commented Apr 18, 2023

@pytorchbot merge -f 'CI failure seems unrelated'

@pytorchmergebot (Collaborator) commented

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@github-actions bot deleted the int_mm_fix_cache branch, October 18, 2024 02:05