Fix freeze_module pass for sharedtype #42457
@kimishpatel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
💊 CI failures summary and remediations
As of commit e8bf0e2 (more details on the Dr. CI page):
🚧 3 fixed upstream failures: These were probably caused by upstream breakages that were already fixed.
Please rebase on the
Hmm, I'm probably not the best person to review this code properly. Can someone from JIT review?
Yeah, I put you here for FYI. I put Zino for review.
73ad43b to be7fbfb
@@ -11,6 +12,20 @@ namespace torch {
namespace jit {

namespace {
ModulePtr getModulePtrForGetAttrNode(
Can't we use findConstantAttr?
We could. It would require some refactoring of that function, because it does more than just follow the getAttr chain. If getModulePtrForGetAttrNode were too big I would probably do that. If you prefer, I can look into refactoring it.
I am okay with keeping them separate for now. There is some duplication of code, but not too much.
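To make "following the getAttr chain" concrete, here is a minimal Python sketch. It is a toy model, not the PyTorch C++ code under review: `Module`, `GetAttrNode`, and `resolve_module` are hypothetical names introduced only for illustration, and the real `getModulePtrForGetAttrNode` operates on TorchScript IR nodes rather than on these classes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Module:
    """Toy stand-in for a module object: a type name plus named attributes."""
    type_name: str
    attrs: Dict[str, object] = field(default_factory=dict)


@dataclass
class GetAttrNode:
    """Toy stand-in for a GetAttr node that reads `name` from `source`.

    `source` is another GetAttrNode, or None to mean the graph's `self`
    input (the root module). This encoding is an assumption of the sketch.
    """
    name: str
    source: Optional["GetAttrNode"] = None


def resolve_module(root: Module, node: GetAttrNode) -> object:
    """Walk the chain back to `self`, then replay the names from the root."""
    names: List[str] = []
    cur: Optional[GetAttrNode] = node
    while cur is not None:  # collect names from the node back to `self`
        names.append(cur.name)
        cur = cur.source
    obj: object = root
    for name in reversed(names):  # replay the names from the root outward
        obj = obj.attrs[name]
    return obj


# Two submodules share one type name but are distinct instances.
leaf_a = Module("Sub", {"weight": 1.0})
leaf_b = Module("Sub", {"weight": 2.0})
root = Module("Top", {"sub1": leaf_a, "sub2": leaf_b})

# The chain for `self.sub2` resolves to the second instance, not the first.
resolved = resolve_module(root, GetAttrNode("sub2"))
```

Because the lookup is driven by attribute names along the chain, two instances of the same type cannot be confused with each other.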
#include <torch/csrc/jit/ir/alias_analysis.h>
#include <torch/csrc/jit/passes/inliner.h>
#include <torch/csrc/jit/passes/quantization/helper.h>
Can we remove the quantization/helper.h dependency?
You are invoking findChildModule (a small routine that can be inlined).
be7fbfb to 226bb37
226bb37 to 0b69301
0b69301 to 86529f6
This seems to have broken the mac build (see https://circleci.com/gh/pytorch/pytorch/6664211?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link status on master). That matches one of the failing builds when this PR was submitted (https://circleci.com/gh/pytorch/pytorch/6620836?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link), so I'm reverting this PR.
Thanks. My bad. Let me check.
@kimishpatel merged this pull request in 4665f3f.
86529f6 to 52e1bce
bb7c6fe to de23d86
Summary: During the cleanup phase, recordReferencedAttrs records the attributes that are referenced and therefore kept. However, if two instances of the same type are preserved through the freezing process, as the added test case shows, then while recording the referenced attributes we iterate through the type instances seen so far and record against those. Thus, if there is another instance of the same type, we only look at the first instance in the list and record that instance. This PR fixes that by traversing the getattr chains and getting the actual instance behind the getattr output.
Test Plan: python test/test_jit.py TestFreezing
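The difference between the two lookups described in the summary can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the freeze_module pass itself: `Sub`, `Top`, `owner_by_type`, and `owner_by_chain` are hypothetical names, and plain Python objects stand in for TorchScript module instances.

```python
class Sub:
    """Toy submodule; both instances below share this one type."""
    def __init__(self, weight):
        self.weight = weight


class Top:
    """Toy parent module holding two instances of the same type."""
    def __init__(self):
        self.sub1 = Sub(1.0)
        self.sub2 = Sub(2.0)


def owner_by_type(seen_instances, wanted_type):
    """Type-keyed lookup: returns the first seen instance of the type."""
    for inst in seen_instances:
        if type(inst) is wanted_type:
            return inst
    return None


def owner_by_chain(root, chain):
    """Chain-following lookup: follows attribute names from the root."""
    owner = root
    for name in chain:
        owner = getattr(owner, name)
    return owner


top = Top()
seen = [top, top.sub1, top.sub2]

# Suppose the graph reads `self.sub2.weight`.
wrong = owner_by_type(seen, Sub)        # picks sub1, the first Sub seen
right = owner_by_chain(top, ["sub2"])   # picks sub2, the actual owner
```

In this toy setup the type-keyed lookup attributes the reference to `sub1`, so an attribute of `sub2` could be treated as unreferenced, which is the kind of mismatch the summary describes.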
de23d86 to e8bf0e2
Can we revert this? It's responsible for #45902.
@eellison btw this fails for me:
Failure:
I remember filing the original issue #42039 because I saw this on a toy model, and it was blocking writing a good test plan for an unrelated quantization PR. I don't have any data one way or the other on the impact of this on real models, but it seems like the original issue should be fixed in some way (whether that's this PR or something else).
I think this is fixing the issue in the wrong way, and it's also breaking freezing for OSS users. As far as I can tell the issue is in quantization, and we should fix it there. I wouldn't be surprised if that quant issue is also responsible for #46054.
Independent of the quantization issue, the bug fixed by this PR is real. Effectively, the previous code was looking at the first instance of the given type rather than finding the actual instance.
I'd love to learn more context on what is pointing to a problem in quantization code.
Yeah, maybe this is premature, but the test case used for the PR is quantization. If it's not a quant issue, it'd be nice to get an independent repro.
ah, ic, cool. Yeah, it seemed like an issue in freezing (since the quantization code used to surface it is super simple and has been stable for a while), but definitely let us know if something is fishy.
@kimishpatel could you provide a repro that doesn't depend on quantization?
I don't have a repro without quantization right now. It will take some time to produce one.
@eellison also, do you know more about the reason for the failure? As in, is it something specific to the model? I have seen this error before when we could not follow chains of getattr due to something else.
@kimishpatel if you look at the linked PR, the test case in this PR was fixed by #46250. The failure seems to be a quant bug of not updating the getAttr mapped type. Copying myself from the linked PR:
@eellison sounds good. I will try to repro without quantization and see if this still holds.
Summary: Fixes #45902 by reverting #42457. The test case introduced by #42457 was fixed by #46250, which I'm assuming is the real source of the bug. In the future it would be good to provide repros for freezing issues without including a quantization dependency; there was another issue in freezing (see: #46054) whose root cause was the same quantization issue, #46250. Pull Request resolved: #46285 Reviewed By: bdhirsh Differential Revision: D24288739 Pulled By: eellison fbshipit-source-id: b69ee8c713f749cd93d5eba370c3eafed86568bb