@MarouaneMaatouk MarouaneMaatouk commented Jan 31, 2024

@linux-foundation-easycla

linux-foundation-easycla bot commented Jan 31, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

@pytorch-bot

pytorch-bot bot commented Jan 31, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/118697

Note: Links to docs will display an error until the docs builds have been completed.

❌ 9 New Failures, 1 Unrelated Failure

As of commit ec1c622 with merge base 1adedc3 (image):

NEW FAILURES - The following jobs have failed:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@mlazos
Contributor

mlazos commented Jan 31, 2024

@MarouaneMaatouk I think you did Adamax actually (which was also needed!) but can you add tests here: https://github.com/pytorch/pytorch/pull/117912/files#diff-893b1eea27352f336f4cd832919e48d721e4e90186e63400b8596db6b82e7450 for the cudagraphs support?

cc @janeyx99 for the status on capturable testing with optiminfos, I think we might need to wait for #118326 to go in to make sure we're testing the capturable for Adamax properly.

@janeyx99
Contributor

No need to wait for me, but do make the Adamax-related changes in common_optimizer.py in the linked PR. I don’t want my change to block anything!

@MarouaneMaatouk
Author

@mlazos The Adamax tests pass, but RAdam fails and I wasn't able to make sense of the error; maybe I am missing something.
From logs:

[2024-02-01 21:02:21,116] [0/0] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION_KW 14 [SkipFilesVariable(), ListVariable(), ListVariable(), ListVariable(), ListVariable(), ListVariable(), ConstantVariable(float), ConstantVariable(float), ConstantVariable(float), ConstantVariable(float), ConstantVariable(float), ConstantVariable(bool), ConstantVariable(bool), ConstantVariable(bool), ConstantVariable(bool), TupleVariable()]
[2024-02-01 21:02:21,117] [0/0] torch._dynamo.symbolic_convert: [DEBUG] empty checkpoint
[2024-02-01 21:02:21,117] [0/0] torch._dynamo.symbolic_convert: [DEBUG] FAILED INLINING <code object radam at 0x7fc5c08e82f0, file "/<PATH>/pytorch/torch/optim/radam.py", line 223>
[2024-02-01 21:02:21,118] [0/0] torch._dynamo.symbolic_convert: [DEBUG] empty checkpoint
[2024-02-01 21:02:21,119] [0/0] torch._dynamo.symbolic_convert: [DEBUG] FAILED INLINING <code object step at 0x7fc5c08d7e10, file "/<PATH>/pytorch/torch/optim/radam.py", line 106>
[2024-02-01 21:02:21,119] [0/0] torch._dynamo.symbolic_convert: [DEBUG] empty checkpoint
File "<PATH>/pytorch/torch/_dynamo/symbolic_convert.py", line 1249, in CALL_FUNCTION_KW
  self.call_function(fn, args, kwargs)
File "<PATH>/pytorch/torch/_dynamo/symbolic_convert.py", line 651, in call_function
  self.push(fn.call_function(self, args, kwargs))
File "<PATH>/pytorch/torch/_dynamo/variables/misc.py", line 685, in call_function
  unimplemented(f"call torch._dynamo.disable() wrapped function {self.value}")
File "<PATH>/pytorch/torch/_dynamo/exc.py", line 190, in unimplemented
  raise Unsupported(msg)
torch._dynamo.exc.Unsupported: call torch._dynamo.disable() wrapped function <function _single_tensor_radam at 0x7f7d62edecb0>

@mlazos
Contributor

mlazos commented Feb 1, 2024

Remove this code: https://github.com/pytorch/pytorch/blob/923a7c757205a327e8f8a6d4f9dbda036ae531d3/torch/_dynamo/eval_frame.py#L1571C4-L1572C10

@pytorch-bot

pytorch-bot bot commented Feb 1, 2024

Please seek CI approval before scheduling CIFlow labels

@MarouaneMaatouk
Author

MarouaneMaatouk commented Feb 1, 2024

Thanks, done in the last commit. However, I am still having some issues due to this condition (https://github.com/pytorch/pytorch/pull/118697/files#diff-4e7620901810b83e6a28709cbb678170338937eae1e3949b1c1e295d803cca68R368); using a plain if statement directly wasn't working.
I'll look into this.

@mlazos
Contributor

mlazos commented Feb 2, 2024

Oh, if you look at the multi-tensor version at the bottom of the same file, I use torch.where; that will be more fitting for this use case. It selects elements from a left or right tensor based on elements in a bool tensor. cond is a little more general and experimental (it can't be lowered to GPU).
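To illustrate the torch.where suggestion: a plain Python if on a tensor-valued condition needs the value on the host, which is generally not possible while a CUDA graph is being captured. The sketch below is not the code from this PR; the names pick_update, rectified and unrectified, and the threshold of 5, are illustrative assumptions. It runs on CPU and only shows the element-wise selection.

```python
import torch

def pick_update(rho_t, rectified, unrectified, threshold=5.0):
    # Element-wise select driven by a bool tensor. No Python branch on
    # tensor data, so nothing has to be copied back to the host.
    return torch.where(rho_t > threshold, rectified, unrectified)

# Hypothetical values, chosen only to make the selection visible.
rho_t = torch.tensor([3.0, 7.0])
rectified = torch.tensor([10.0, 20.0])
unrectified = torch.tensor([1.0, 2.0])

# Picks the unrectified value where rho_t <= 5 and the rectified one otherwise.
result = pick_update(rho_t, rectified, unrectified)
```

Both branches are computed for every element and one is discarded, which is the usual trade-off for avoiding data-dependent control flow.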

@pytorch-bot

pytorch-bot bot commented Feb 3, 2024

Please seek CI approval before scheduling CIFlow labels

Contributor

@janeyx99 janeyx99 left a comment

Will look more closely sometime after my meetings today/tomorrow, but I noticed two things from a brief glance:

  optim_error_inputs_func=optim_error_inputs_func_adamax,
  supported_impls=("foreach", "differentiable"),
- only_supports_capturable_on_foreach=True, # Remove this line when #117836 is done!
+ only_supports_capturable_on_foreach=False,

Please delete the whole line!

And also remove the skipped tests for Adamax as well

@pytorch-bot

pytorch-bot bot commented Feb 6, 2024

Please seek CI approval before scheduling CIFlow labels

@zou3519 zou3519 added the triaged label Feb 8, 2024
@janeyx99 janeyx99 changed the title Add capturable single-tensor RAdam Add capturable single-tensor RAdam, Adamax Feb 13, 2024
Contributor

@janeyx99 janeyx99 left a comment

@MarouaneMaatouk It looks like there is failing CI currently. I am also noticing that the CUDA graph tests (the ones that test the point of capturable) are missing. Add in the single tensor variants here: https://github.com/pytorch/pytorch/blob/main/test/test_cuda.py#L2695-L2766

Let us know if you need any help!
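For readers unfamiliar with what a CUDA graph test of a capturable optimizer involves, here is a rough sketch of the capture-and-replay pattern. It is not the test in test_cuda.py. The function name run_graphed_steps is hypothetical, and torch.optim.Adam is used as the default only because its capturable flag is long-standing; passing Adamax or RAdam assumes a PyTorch build where their single-tensor capturable paths exist. The function returns None when no GPU is present.

```python
import torch

def run_graphed_steps(optim_cls=torch.optim.Adam, replays=3):
    """Capture one optimizer step in a CUDA graph and replay it.

    Returns the updated parameter on CPU, or None when CUDA is unavailable.
    """
    if not torch.cuda.is_available():
        return None

    param = torch.ones(4, device="cuda", requires_grad=True)
    param.grad = torch.full_like(param, 0.5)  # static gradient buffer
    opt = optim_cls([param], lr=0.1, capturable=True)

    # Warm up on a side stream so optimizer state exists before capture.
    side = torch.cuda.Stream()
    side.wait_stream(torch.cuda.current_stream())
    with torch.cuda.stream(side):
        for _ in range(3):
            opt.step()
    torch.cuda.current_stream().wait_stream(side)

    # Record a single step, then replay the recorded kernels.
    graph = torch.cuda.CUDAGraph()
    with torch.cuda.graph(graph):
        opt.step()
    for _ in range(replays):
        graph.replay()

    torch.cuda.synchronize()
    return param.detach().cpu()
```

A real test would additionally compare the replayed result against an eager, non-graphed run of the same optimizer.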

janeyx99 added a commit that referenced this pull request Mar 5, 2024
Finishes the work started in #118697. Thanks MarouaneMaatouk for the attempt, but due to inactivity I have opened this PR for Adamax. Note that the new capturable implementation is much simpler and I've modified the foreach capturable impl--it now calls fewer kernels and is more easily comparable to forloop.

Next steps:
* This PR discovered two bugs: #121178 and #121238.
* Move the now hefty graph optim tests in test_cuda to use OptimInfo.




cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler amjames desertfire chauhang

[ghstack-poisoned]
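As background for the commit message above, the sketch below shows one way a single-tensor Adamax step can be written so that the step count stays a tensor, which is the property a capturable implementation needs. This is an illustration of the standard Adamax update under stated assumptions, not the code merged in the follow-up PR; the function name adamax_step and the default hyperparameters are illustrative, and adding eps inside the max is an assumption made for numerical safety.

```python
import torch

def adamax_step(param, grad, exp_avg, exp_inf, step_t,
                lr=2e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    # step_t is a tensor, so the step count never needs to leave the device.
    step_t += 1
    # First moment: exp_avg = beta1 * exp_avg + (1 - beta1) * grad
    exp_avg.lerp_(grad, 1 - beta1)
    # Infinity norm: exp_inf = max(beta2 * exp_inf, |grad| + eps)
    exp_inf.copy_(torch.maximum(exp_inf * beta2, grad.abs() + eps))
    # Bias correction computed with tensor ops only (no .item() call).
    bias_correction = 1 - beta1 ** step_t
    # param -= lr * exp_avg / (bias_correction * exp_inf)
    param.addcdiv_(exp_avg, exp_inf * bias_correction, value=-lr)

param = torch.tensor([1.0])
grad = torch.tensor([0.5])
exp_avg = torch.zeros(1)
exp_inf = torch.zeros(1)
step_t = torch.tensor(0.0)
adamax_step(param, grad, exp_avg, exp_inf, step_t)
```

After this first step the bias-corrected ratio exp_avg / exp_inf is approximately 1, so the parameter moves by roughly lr.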
janeyx99 added a commit that referenced this pull request Mar 5, 2024
Implementation thanks to MarouaneMaatouk in #118697. Added tests and the cudagraph health check.




cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler amjames desertfire chauhang

[ghstack-poisoned]
pytorchmergebot pushed a commit that referenced this pull request Mar 7, 2024
Finishes the work started in #118697. Thanks @MarouaneMaatouk for the attempt, but due to inactivity I have opened this PR for Adamax. Note that the new capturable implementation is much simpler and I've modified the foreach capturable impl--it now calls fewer kernels and is more easily comparable to forloop.

Next steps:
* This PR discovered two bugs: #121178 and #121238.
* Move the now hefty graph optim tests in test_cuda to use OptimInfo.

Pull Request resolved: #121183
Approved by: https://github.com/albanD
pytorchmergebot pushed a commit that referenced this pull request Mar 8, 2024
Implementation thanks to @MarouaneMaatouk in #118697, though I've since cleaned it up a lot to save perf on the rect < 5 eager case. It also just looks better now :) Added tests and the cudagraph health check.

Pull Request resolved: #121260
Approved by: https://github.com/mlazos
@github-actions
Contributor

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

@janeyx99
Contributor

this has been done, closing

@janeyx99 janeyx99 closed this Apr 13, 2024
Labels

ciflow/inductor, module: dynamo, module: inductor, open source, release notes: optim, Stale, triaged

Development

Successfully merging this pull request may close these issues.

Add capturable single-tensor RAdam

5 participants