
Disable Autograd when calling RecordFunction callbacks #100727

Closed
soulitzer wants to merge 2 commits

Conversation


soulitzer (Contributor) commented May 5, 2023

Stack from ghstack (oldest at bottom):

This PR disables autograd for all callbacks that are registered to RecordFunction.

Context: RecordFunction interposes before autograd in at least a couple of places: custom Functions and detach{,_} in VariableTypeManual (why?), and it calls arbitrary user-defined callbacks that can be registered via addGlobalCallback (for example, to log infs/NaNs to Scuba).

Issue: if any callback calls torch operations, tensors may end up being saved for backward as a side effect (potentially problematic in the case of non-reentrant checkpointing, even if the graph/saved tensors are expected to die immediately).
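
To make the failure mode concrete, here is a hypothetical global callback in the spirit of the inf/NaN logging mentioned above. The `registerNanLogger` name and its body are illustrative, not code from this PR; `at::addGlobalCallback` and `at::RecordFunctionCallback` are the real registration APIs. Because the callback runs torch ops while grad mode is still enabled, those ops dispatch through autograd like any others:

```cpp
#include <ATen/ATen.h>
#include <ATen/record_function.h>

// Hypothetical sketch (not from this PR): a global observer that
// scans each op's tensor inputs for NaNs. With grad mode enabled,
// the at::isnan/any/item calls below go through autograd and can
// save tensors for backward as a side effect of mere logging.
void registerNanLogger() {
  at::addGlobalCallback(at::RecordFunctionCallback(
      [](const at::RecordFunction& fn)
          -> std::unique_ptr<at::ObserverContext> {
        for (const c10::IValue& input : fn.inputs()) {
          if (input.isTensor()) {
            const at::Tensor& t = input.toTensor();
            if (t.defined() && t.is_floating_point() &&
                at::isnan(t).any().item<bool>()) {
              // ... log to a monitoring backend here ...
            }
          }
        }
        return nullptr;
      }));
}
```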

Proposed solution: RecordFunction should primarily be used for profiling and logging, so it should probably live below autograd. Since actually moving where RecordFunction is called will take more work, though, this PR is a good stopgap.
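
Concretely, the change amounts to holding a grad-mode guard while the callbacks run. A minimal sketch of the pattern with simplified names (the real call sites live in the RecordFunction machinery, e.g. aten/src/ATen/record_function.cpp):

```cpp
#include <c10/core/GradMode.h>
#include <utility>

// Simplified sketch of the guard this PR wraps around callback
// invocation; the wrapper name and shape are stand-ins for the
// actual callback dispatch logic in record_function.cpp.
template <typename Callback>
void invokeCallbackWithoutAutograd(Callback&& cb) {
  // While this RAII guard is alive, grad mode is off: torch ops
  // run inside the callback will not build graph edges or save
  // tensors for backward.
  c10::AutoGradMode no_grad(/*enabled=*/false);
  std::forward<Callback>(cb)();
}
```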

Alternatives:

  • putting the guard on the specific callback rather than disabling autograd for all of them (safer)

Another detail:

  • we could use a guard that dispatches below autograd instead (sketched below)
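
For comparison, the below-autograd alternative would look roughly like this (again a sketch, not this PR's diff): rather than toggling grad mode, exclude the Autograd dispatch keys so ops invoked by the callback go straight to the kernels below autograd:

```cpp
#include <c10/core/DispatchKeySet.h>
#include <c10/core/impl/LocalDispatchKeySet.h>
#include <functional>

// Sketch of the alternative: an RAII guard that masks out all
// Autograd dispatch keys for the duration of the callback.
void invokeCallbackBelowAutograd(const std::function<void()>& cb) {
  c10::impl::ExcludeDispatchKeyGuard no_autograd(
      c10::autograd_dispatch_keyset);
  cb();
}
```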


pytorch-bot bot commented May 5, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/100727

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 5c676b8:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

soulitzer added a commit that referenced this pull request May 5, 2023
ghstack-source-id: ed41d174a87bd70a7d1bcd16e60c3145ca3db4f7
Pull Request resolved: #100727
soulitzer requested a review from albanD May 5, 2023 17:11
soulitzer added the release notes: profiler and topic: improvements labels May 5, 2023
soulitzer added a commit that referenced this pull request May 5, 2023
ghstack-source-id: c701b89e0184f84977c57722e52631b9713c5a65
Pull Request resolved: #100727
soulitzer closed this May 5, 2023
facebook-github-bot deleted the gh/soulitzer/204/head branch June 8, 2023 18:50