Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deeply rework WeakIdKeyDictionary #90825

Closed
wants to merge 6 commits into from

Conversation

ezyang
Copy link
Contributor

@ezyang ezyang commented Dec 14, 2022

Stack from ghstack (oldest at bottom):

In the prior patch, I just YOLOed a mutable mapping implementation.
Many edge cases were not handled correctly. In this PR, I just
copy paste the WeakKeyDictionary from CPython and the hacked it up
to use WeakIdRef instead of weakref.ref. You can see each line
I changed with the comment CHANGED; there aren't many.

Being exactly API compatible with WeakKeyDictionary means I can also
rob all of the tests from CPython, which I also did for
test/test_weak.py

How to review? You could either try taking the delta from CPython
(recommended), or review everything from scratch (not recommended).
Can post diff representing delta on request.

Signed-off-by: Edward Z. Yang ezyang@fb.com

In the prior patch, I just YOLOed a mutable mapping implementation.
Many edge cases were not handled correctly.  In this PR, I just
copy paste the WeakKeyDictionary from CPython and the hacked it up
to use WeakIdRef instead of weakref.ref.  You can see each line
I changed with the comment CHANGED; there aren't many.

Being exactly API compatible with WeakKeyDictionary means I can also
rob all of the tests from CPython, which I also did for
test/test_weak.py

How to review?  You could either try taking the delta from CPython
(recommended), or review everything from scratch (not recommended).
Can post diff representing delta on request.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

[ghstack-poisoned]
@pytorch-bot
Copy link

pytorch-bot bot commented Dec 14, 2022

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/90825

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit ea2a4d7:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

In the prior patch, I just YOLOed a mutable mapping implementation.
Many edge cases were not handled correctly.  In this PR, I just
copy paste the WeakKeyDictionary from CPython and the hacked it up
to use WeakIdRef instead of weakref.ref.  You can see each line
I changed with the comment CHANGED; there aren't many.

Being exactly API compatible with WeakKeyDictionary means I can also
rob all of the tests from CPython, which I also did for
test/test_weak.py

How to review?  You could either try taking the delta from CPython
(recommended), or review everything from scratch (not recommended).
Can post diff representing delta on request.

Signed-off-by: Edward Z. Yang <ezyangfb.com>

[ghstack-poisoned]
ezyang added a commit that referenced this pull request Dec 14, 2022
In the prior patch, I just YOLOed a mutable mapping implementation.
Many edge cases were not handled correctly.  In this PR, I just
copy paste the WeakKeyDictionary from CPython and the hacked it up
to use WeakIdRef instead of weakref.ref.  You can see each line
I changed with the comment CHANGED; there aren't many.

Being exactly API compatible with WeakKeyDictionary means I can also
rob all of the tests from CPython, which I also did for
test/test_weak.py

How to review?  You could either try taking the delta from CPython
(recommended), or review everything from scratch (not recommended).
Can post diff representing delta on request.

Signed-off-by: Edward Z. Yang <ezyangfb.com>

ghstack-source-id: d2498245435465c08cce6b9cbd584a4ca01271ae
Pull Request resolved: #90825
Copy link
Collaborator

@albanD albanD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This solves #81597 ?

Sounds good!

dict2 = WeakIdKeyDictionary(dict)
self.assertEqual(dict[o], 364)

def make_weak_keyed_dict(self):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not used?


# These tests are ported from cpython/Lib/test/test_weakref.py,
# but adapted to use tensor rather than object
class WeakTest(TestCase):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no test in there that ensures that we only hold a weak reference and not a strong one?
Also any test to ensure we don't introduce ref cycles (which would delay deletion in practice)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weak reference is indirectly tested, as we do test that entries are promptly removed from dictionary when they die (this is how I found _fix_weakref leak).

There were some ref cycle tests in cpython that seemed complicated, but I could try to intro them

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leaving this as follow up, I think prompt deletion also covers the ref cycle case

Copy link
Contributor

@eellison eellison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not the same as the prior implementation, because the prior implementation was weak on both key and value

del self.data[WeakIdRef(key)] # CHANGED

def __getitem__(self, key):
return self.data[WeakIdRef(key)] # CHANGED
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to do _fix_weakref here. You're calling it elsewhere but if we're going to expose this as a utility we shouldn't require users to call it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree. I never take out a strong reference to the key here, so I don't need to fix weakref

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. Was thinking that data was a WeakValueDictionary


def pop(self, key, *args):
self._dirty_len = True
return self.data.pop(WeakIdRef(key), *args) # CHANGED
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • 1, _fix_weakref, right ?

@ezyang
Copy link
Contributor Author

ezyang commented Dec 14, 2022

I don't think so, pretty sure old impl was strong on value

In the prior patch, I just YOLOed a mutable mapping implementation.
Many edge cases were not handled correctly.  In this PR, I just
copy paste the WeakKeyDictionary from CPython and the hacked it up
to use WeakIdRef instead of weakref.ref.  You can see each line
I changed with the comment CHANGED; there aren't many.

Being exactly API compatible with WeakKeyDictionary means I can also
rob all of the tests from CPython, which I also did for
test/test_weak.py

How to review?  You could either try taking the delta from CPython
(recommended), or review everything from scratch (not recommended).
Can post diff representing delta on request.

Signed-off-by: Edward Z. Yang <ezyangfb.com>

[ghstack-poisoned]
In the prior patch, I just YOLOed a mutable mapping implementation.
Many edge cases were not handled correctly.  In this PR, I just
copy paste the WeakKeyDictionary from CPython and the hacked it up
to use WeakIdRef instead of weakref.ref.  You can see each line
I changed with the comment CHANGED; there aren't many.

Being exactly API compatible with WeakKeyDictionary means I can also
rob all of the tests from CPython, which I also did for
test/test_weak.py

How to review?  You could either try taking the delta from CPython
(recommended), or review everything from scratch (not recommended).
Can post diff representing delta on request.

Signed-off-by: Edward Z. Yang <ezyangfb.com>

[ghstack-poisoned]
@ezyang
Copy link
Contributor Author

ezyang commented Dec 15, 2022

@pytorchbot merge -f "known master failure only"

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@pytorchmergebot
Copy link
Collaborator

Merge failed

Reason: Command git -C /home/runner/work/pytorch/pytorch cherry-pick -x b21ccd99a845d056d917a16aae97e44cbe7b2fc3 returned non-zero exit code 1

Auto-merging .lintrunner.toml
Auto-merging torch/_subclasses/fake_tensor.py
CONFLICT (content): Merge conflict in torch/_subclasses/fake_tensor.py
error: could not apply b21ccd99a8... Deeply rework WeakIdKeyDictionary
hint: After resolving the conflicts, mark them with
hint: "git add/rm <pathspec>", then run
hint: "git cherry-pick --continue".
hint: You can instead skip this commit with "git cherry-pick --skip".
hint: To abort and get back to the state before "git cherry-pick",
hint: run "git cherry-pick --abort".
Details for Dev Infra team Raised by workflow job

In the prior patch, I just YOLOed a mutable mapping implementation.
Many edge cases were not handled correctly.  In this PR, I just
copy paste the WeakKeyDictionary from CPython and the hacked it up
to use WeakIdRef instead of weakref.ref.  You can see each line
I changed with the comment CHANGED; there aren't many.

Being exactly API compatible with WeakKeyDictionary means I can also
rob all of the tests from CPython, which I also did for
test/test_weak.py

How to review?  You could either try taking the delta from CPython
(recommended), or review everything from scratch (not recommended).
Can post diff representing delta on request.

Signed-off-by: Edward Z. Yang <ezyangfb.com>

[ghstack-poisoned]
@ezyang
Copy link
Contributor Author

ezyang commented Dec 15, 2022

@pytorchbot merge

ezyang added a commit that referenced this pull request Dec 15, 2022
In the prior patch, I just YOLOed a mutable mapping implementation.
Many edge cases were not handled correctly.  In this PR, I just
copy paste the WeakKeyDictionary from CPython and the hacked it up
to use WeakIdRef instead of weakref.ref.  You can see each line
I changed with the comment CHANGED; there aren't many.

Being exactly API compatible with WeakKeyDictionary means I can also
rob all of the tests from CPython, which I also did for
test/test_weak.py

How to review?  You could either try taking the delta from CPython
(recommended), or review everything from scratch (not recommended).
Can post diff representing delta on request.

Signed-off-by: Edward Z. Yang <ezyangfb.com>

ghstack-source-id: 32b861c2462c3a686269882d4c6dae77ac5f0083
Pull Request resolved: #90825
@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@pytorchmergebot
Copy link
Collaborator

Merge failed

Reason: Command git -C /home/runner/work/pytorch/pytorch cherry-pick -x b21ccd99a845d056d917a16aae97e44cbe7b2fc3 returned non-zero exit code 1

Auto-merging .lintrunner.toml
Auto-merging torch/_subclasses/fake_tensor.py
CONFLICT (content): Merge conflict in torch/_subclasses/fake_tensor.py
error: could not apply b21ccd99a8... Deeply rework WeakIdKeyDictionary
hint: After resolving the conflicts, mark them with
hint: "git add/rm <pathspec>", then run
hint: "git cherry-pick --continue".
hint: You can instead skip this commit with "git cherry-pick --skip".
hint: To abort and get back to the state before "git cherry-pick",
hint: run "git cherry-pick --abort".
Details for Dev Infra team Raised by workflow job

In the prior patch, I just YOLOed a mutable mapping implementation.
Many edge cases were not handled correctly.  In this PR, I just
copy paste the WeakKeyDictionary from CPython and the hacked it up
to use WeakIdRef instead of weakref.ref.  You can see each line
I changed with the comment CHANGED; there aren't many.

Being exactly API compatible with WeakKeyDictionary means I can also
rob all of the tests from CPython, which I also did for
test/test_weak.py

How to review?  You could either try taking the delta from CPython
(recommended), or review everything from scratch (not recommended).
Can post diff representing delta on request.

Signed-off-by: Edward Z. Yang <ezyangfb.com>

[ghstack-poisoned]
ezyang added a commit that referenced this pull request Dec 15, 2022
In the prior patch, I just YOLOed a mutable mapping implementation.
Many edge cases were not handled correctly.  In this PR, I just
copy paste the WeakKeyDictionary from CPython and the hacked it up
to use WeakIdRef instead of weakref.ref.  You can see each line
I changed with the comment CHANGED; there aren't many.

Being exactly API compatible with WeakKeyDictionary means I can also
rob all of the tests from CPython, which I also did for
test/test_weak.py

How to review?  You could either try taking the delta from CPython
(recommended), or review everything from scratch (not recommended).
Can post diff representing delta on request.

Signed-off-by: Edward Z. Yang <ezyangfb.com>

ghstack-source-id: a2c7d6869804fc139233b8dd03e5974e5728adb0
Pull Request resolved: #90825
@ezyang
Copy link
Contributor Author

ezyang commented Dec 15, 2022

@pytorchbot merge -f "idk doesn't look conflicting to me"

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

Comment on lines +52 to +53
if hasattr(r, '_fix_weakref'):
r._fix_weakref() # type: ignore[union-attr]
Copy link
Contributor

@danthe3rd danthe3rd Dec 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class seems to still keep a strong reference to the object. Repro:

import torch
import gc
import weakref

x = torch.zeros([1])
y = torch.zeros([1])
refx = weakref.ref(x)
refy = weakref.ref(y)
WeakIdRef(x) == WeakIdRef(torch.ones([1]))
del x
del y
gc.collect()
print("refX: ", type(refx())) # refX:  <class 'torch.Tensor'>
print("refY: ", type(refy())) # refY:  <class 'NoneType'>

Commenting out "r._fix_weakref()" here fixes it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have the first PR in this stack? There was a memory leak that I fixed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah on master, fixing your example, it works

(/Users/ezyang/Dev/pytorch-cpu-env) ezyang-mbp:pytorch-cpu ezyang$ cat foo.py
import torch
import gc
import weakref
from torch.utils.weak import WeakIdRef

x = torch.zeros([1])
y = torch.zeros([1])
refx = weakref.ref(x)
refy = weakref.ref(y)
WeakIdRef(x) == WeakIdRef(torch.ones([1]))
del x
del y
gc.collect()
print("refX: ", type(refx())) # refX:  <class 'torch.Tensor'>
print("refY: ", type(refy())) # refY:  <class 'NoneType'>
(/Users/ezyang/Dev/pytorch-cpu-env) ezyang-mbp:pytorch-cpu ezyang$ python foo.py
refX:  <class 'NoneType'>
refY:  <class 'NoneType'>

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right, thanks a lot! I just imported the weak.py file (as I wanted it to work with older versions of pytorch as well). I'll look into the other PR. Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For old versions just get rid of the fix weakref call, it's super busted lol. I suppose if you are willing to write a little C extension code, manually decreffing the Tensor after fixing weakref would work too

@facebook-github-bot facebook-github-bot deleted the gh/ezyang/1655/head branch June 8, 2023 16:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ciflow/inductor ciflow/trunk Trigger trunk jobs on your pull request Merged release notes: composability release notes category topic: new features topic category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants