@oulgen oulgen commented Dec 15, 2023

Stack from ghstack (oldest at bottom):

This PR fixes two bugs:

  1. Constant folding a Triton kernel results in the kernel's inputs being returned without any modification. This PR disables constant folding for Triton kernels; the root cause needs more investigation.
  2. NoneLayout buffers should not be deleted, as they do not exist.
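
As a rough illustration of the first fix, the sketch below shows a toy constant-folding pass that refuses to fold nodes flagged as mutating their inputs. It is not Inductor's implementation: `Node`, `mutates_inputs`, and `constant_fold` are hypothetical names invented for this example, and the real pass operates on FX graphs rather than a flat list.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Node:
    name: str
    fn: Callable[..., object]
    inputs: Tuple[str, ...]
    # Hypothetical flag standing in for "this node is a user-defined kernel".
    mutates_inputs: bool = False


def constant_fold(
    nodes: List[Node], constants: Dict[str, object]
) -> Tuple[Dict[str, object], List[Node]]:
    """Fold nodes whose inputs are all known, skipping mutating nodes."""
    known = dict(constants)
    remaining: List[Node] = []
    for node in nodes:
        all_known = all(i in known for i in node.inputs)
        if all_known and not node.mutates_inputs:
            # Pure node: evaluate it now and record the result as a constant.
            known[node.name] = node.fn(*(known[i] for i in node.inputs))
        else:
            # Mutating node (or unknown inputs): leave it in the graph.
            remaining.append(node)
    return known, remaining


def scale_in_place(buf: List[int]) -> List[int]:
    for i in range(len(buf)):
        buf[i] *= 2
    return buf


nodes = [
    Node("doubled", lambda a: [v * 2 for v in a], ("a",)),
    Node("kernel", scale_in_place, ("a",), mutates_inputs=True),
]
known, remaining = constant_fold(nodes, {"a": [1, 2, 3]})
```

Here the pure node is folded away, while the mutating node stays in `remaining` and its input is left untouched at fold time.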

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @aakhundov @ColinPeppler

pytorch-bot bot commented Dec 15, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/115908

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 2d221e8 with merge base bf62511:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@oulgen oulgen mentioned this pull request Dec 15, 2023
@oulgen oulgen added the ciflow/trunk and topic: not user facing labels Dec 15, 2023
@aakhundov aakhundov left a comment

Thanks for the fix!

@oulgen oulgen requested review from aakhundov and bdhirsh December 15, 2023 02:57
… tracking bugs"


This PR fixes two bugs:
1) Constant folding a Triton kernel results in the kernel's inputs being returned without any modification. This change disables constant folding for Triton kernels; the root cause needs more investigation.
2) ExternKernelOut (matmul in this case) needs to track mutation of its inputs.
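
To make the second point concrete, here is a minimal sketch of why a scheduler needs to know which buffers an op mutates. It is a toy model under stated assumptions, not Inductor's scheduler: `Op`, `dependencies`, and the buffer names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Set


@dataclass(frozen=True)
class Op:
    name: str
    reads: FrozenSet[str] = frozenset()
    writes: FrozenSet[str] = frozenset()   # buffers this op creates
    mutates: FrozenSet[str] = frozenset()  # buffers this op modifies in place


def dependencies(ops: List[Op], track_mutations: bool = True) -> Dict[str, Set[str]]:
    """Map each op to the earlier ops it must be ordered after."""
    last_writer: Dict[str, str] = {}
    deps: Dict[str, Set[str]] = {}
    for op in ops:
        touched = op.reads | (op.mutates if track_mutations else frozenset())
        deps[op.name] = {last_writer[b] for b in touched if b in last_writer}
        for buf in op.writes:
            last_writer[buf] = op.name
        if track_mutations:
            # A mutation makes this op the latest producer of the buffer.
            for buf in op.mutates:
                last_writer[buf] = op.name
    return deps


ops = [
    Op("alloc_buf0", writes=frozenset({"buf0"})),
    Op("matmul_out", reads=frozenset({"a", "b"}), mutates=frozenset({"buf0"})),
    Op("consumer", reads=frozenset({"buf0"})),
]
tracked = dependencies(ops)
untracked = dependencies(ops, track_mutations=False)
```

With tracking enabled, `consumer` depends on `matmul_out`. Without it, `consumer` only depends on the allocation, so nothing prevents it from being ordered before the matmul has written its result.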



cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler

[ghstack-poisoned]
zou3519 pushed a commit that referenced this pull request Dec 15, 2023
oulgen commented Dec 17, 2023

This PR results in the matmul input buffers living longer.
(image: old on the left, new on the right)

The difference is:

OLD

x = empty()
y = matrix_multiply(x, ...)
z = reinterp(x)
del x

NEW

x = prev_z
del prev_x
y = matrix_multiply(some_old_value, ...)
z = empty()

So in the new version, z unnecessarily lives until the next matrix multiply.
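
One way to reason about this kind of regression is to measure how long each buffer stays allocated in a trace of the generated wrapper. The sketch below is a toy model and does not reproduce the actual wrapper code: the trace format, `live_spans`, and the two example traces are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# (action, buffer); only "alloc" and "del" affect lifetimes, other actions are ignored.
Event = Tuple[str, str]


def live_spans(trace: List[Event]) -> Dict[str, int]:
    """Return, for each buffer, the number of steps between its alloc and its del."""
    born: Dict[str, int] = {}
    spans: Dict[str, int] = {}
    for step, (action, buf) in enumerate(trace):
        if action == "alloc":
            born[buf] = step
        elif action == "del":
            spans[buf] = step - born.pop(buf)
    # Buffers never freed are counted as live until the end of the trace.
    for buf, start in born.items():
        spans[buf] = len(trace) - start
    return spans


# Hypothetical traces: the buffer is freed before vs. after the next matmul.
early_free = [("alloc", "z"), ("use", "z"), ("del", "z"), ("matmul", "y")]
late_free = [("alloc", "z"), ("use", "z"), ("matmul", "y"), ("del", "z")]

early = live_spans(early_free)
late = live_spans(late_free)
```

In the second trace `z` stays live across the matmul, which is the kind of extended lifetime described above.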

oulgen commented Dec 17, 2023

@aakhundov also discovered that triton_poi_fused__to_copy_add_mul_pow_tanh_backward_10 used to be:

def triton_(in_out_ptr0, in_ptr0, in_ptr1, xnumel, XBLOCK : tl.constexpr):
    …
    tmp0 = tl.load(in_out_ptr0 + (x0), None).to(tl.float32)
    …
    tl.store(in_out_ptr0 + (x0), tmp28, None)

and now it is:

def triton_(in_ptr0, in_ptr1, in_ptr2, out_ptr0, xnumel, XBLOCK : tl.constexpr):
    …
    tmp0 = tl.load(in_ptr0 + (x0), None).to(tl.float32)
    …
    tl.store(out_ptr0 + (x0), tmp28, None)
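
The two signatures differ in whether the kernel overwrites its input (`in_out_ptr0`) or writes to a separate output (`out_ptr0`). The plain-Python sketch below mirrors only that calling convention. The arithmetic is a placeholder, since the real kernel body is elided above, and both function names are hypothetical.

```python
from typing import List


def update_in_place(in_out: List[float], bias: List[float]) -> None:
    """Reads and overwrites the same buffer, like the in_out_ptr0 variant."""
    for i in range(len(in_out)):
        in_out[i] = in_out[i] * 2.0 + bias[i]  # placeholder arithmetic


def update_out_of_place(inp: List[float], bias: List[float], out: List[float]) -> None:
    """Reads one buffer and writes another, like the in_ptr0/out_ptr0 variant."""
    for i in range(len(inp)):
        out[i] = inp[i] * 2.0 + bias[i]  # placeholder arithmetic


bias = [0.5, 0.5]

a = [1.0, 2.0]
update_in_place(a, bias)  # a now holds the result; no extra buffer needed

c = [1.0, 2.0]
out = [0.0, 0.0]  # extra allocation required by the out-of-place form
update_out_of_place(c, bias, out)
```

Both forms compute the same values, but the out-of-place form needs a second buffer and leaves the input alive alongside it.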

oulgen commented Dec 18, 2023

Mutation tracking looks good

(image: kernel)

oulgen commented Dec 19, 2023

@pytorchbot merge

@pytorchmergebot
Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

dmenig pushed a commit to dmenig/pytorch that referenced this pull request Dec 21, 2023
…ugs (pytorch#115908)

This PR fixes two bugs
1) Constant folding a triton kernel results in the kernel's inputs to be returned back without any modification. Disable constant folding for triton kernels. Need more investigation
2) NoneLayout buffers should not be deleted as they do not exist

Pull Request resolved: pytorch#115908
Approved by: https://github.com/aakhundov, https://github.com/jansel
@facebook-github-bot facebook-github-bot deleted the gh/oulgen/42/head branch December 22, 2023 15:22