Rework requires_grad on DifferentiableGraphOp #57575
💊 CI failures summary and remediations
As of commit 8f02ee2 (more details on the Dr. CI page):
🕵️ 1 new failure recognized by patterns
The following CI failures do not appear to be due to upstream breakages:
pytorch_linux_xenial_py3_6_gcc5_4_test (1/1), step: "Run tests"
This PR does two things:
1. Reverts "Manual revert of D27369251 (#56080)" in commit 92a09fb.
2. Fixes DifferentiableGraph outputs that carry the wrong requires_grad flag.

The proper requires_grad flag for the outputs of a DifferentiableGraph is retrieved from profiling information. Previously, we retrieved the profiling information only from the first profile node among all uses of an output. However, when control flow is present, the first use may sit in an inactive code path, so we need to iterate over the uses until we find a profile node with profiling information available.

e.g.
```
graph(%0 : Tensor,
      %1 : Bool):
  ..., %2 : Tensor = prim::DifferentiableGraph_0(%0)
  %3 : Tensor = prim::If(%1)
    block0():
      %4 : Tensor = prim::DifferentiableGraph_1(%2)
      -> (%4)
    block1():
      %5 : Tensor = prim::DifferentiableGraph_2(%2)
      -> (%5)
  -> (%3)
with prim::DifferentiableGraph_0 = graph(%0 : Tensor):
  ...
  %out : Tensor = aten::operation(...)
  ...
  return (..., %out)
with prim::DifferentiableGraph_1 = graph(%0 : Tensor):
  %temp : Tensor = prim::profile[profiled_type=Tensor](%0)
  ...
with prim::DifferentiableGraph_2 = graph(%0 : Tensor):
  %temp : Tensor = prim::profile[profiled_type=Float(...)](%0)
  ...
```
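To make the difference between the old and new lookup concrete, here is a minimal toy model in plain Python. It is only a sketch of the idea described above, not the actual JIT code: `ProfileNode`, `first_use_only`, and `search_all_uses` are hypothetical names, and the recorded `requires_grad=False` value is an assumption chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProfileNode:
    """Toy stand-in for a prim::profile node (hypothetical, not a PyTorch class)."""
    # None models a node in a code path that never ran, so nothing was recorded
    # (the profiled_type=Tensor case in the example graph).
    requires_grad: Optional[bool] = None


def first_use_only(uses: List[ProfileNode]) -> Optional[bool]:
    """Old behavior as described: consult only the first profile node."""
    return uses[0].requires_grad if uses else None


def search_all_uses(uses: List[ProfileNode]) -> Optional[bool]:
    """New behavior as described: return the first recorded flag among all uses."""
    for node in uses:
        if node.requires_grad is not None:
            return node.requires_grad
    return None


# %2 is consumed by DifferentiableGraph_1 (inactive branch, nothing recorded)
# and by DifferentiableGraph_2 (active branch, profile recorded).
# The value False is assumed purely for illustration.
uses_of_out = [ProfileNode(None), ProfileNode(False)]

old_result = first_use_only(uses_of_out)
new_result = search_all_uses(uses_of_out)
print(old_result, new_result)  # prints: None False
```

In this toy setup the first-use lookup finds no information because the first use belongs to the branch that never executed, while the iterative search recovers the flag recorded in the active branch.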
@jjsjann123 let me do some extra testing against this stack internally. Fingers crossed.
@jjsjann123 could you please rebase?
@Krovatkin has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@Krovatkin merged this pull request in 9ad0de3.
Differential Revision: D29038773