Assigning a parameter to an indexed tensor that was produced by DDP no longer works in torch nightly (1.7) #46242
Comments
I was able to repro this as well.
@bdhirsh Don't forget to add a module label to bugs when you are triaging; this will make sure the correct person gets CC'ed.
Hi, after looking into this, here are a few comments:
Thanks for the quick response. Given a scenario where we want to replace some indices of an input tensor with a learnable embedding (determined by a boolean mask, which differs across batches and is sometimes all false), is there a recommended approach for this behavior, or is it just fundamentally incompatible with DDP after torch 1.6?
Hi, you just need to clone the input that was given to you before modifying it in place, and all will be good.
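A minimal sketch of the clone-before-inplace workaround suggested above. The tensor names, shapes, and mask here are illustrative, not taken from the original repro; the point is only that the in-place index assignment happens on a clone, which stays connected to the autograd graph:

```python
import torch

# A learnable embedding that should overwrite some rows of the input.
param = torch.nn.Parameter(torch.zeros(4))

# Stand-in for the tensor produced by the forward pass.
x = torch.randn(3, 4, requires_grad=True)

# Per-batch boolean mask (may be all False for some batches).
mask = torch.tensor([True, False, True])

# Clone first: the in-place write then targets the clone, not the
# original tensor, so autograd's in-place checks are satisfied.
x_safe = x.clone()
x_safe[mask] = param

loss = x_safe.sum()
loss.backward()

# Gradients flow to both the parameter and the original input.
assert param.grad is not None
assert x.grad is not None
```

If the mask is all false for a batch, the assignment is a no-op and `param` simply receives a zero gradient for that step, which is fine for DDP as long as the parameter participates in the graph consistently across ranks.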
Removing the 1.7 milestone, as the fix is fairly dangerous and would only change an internal assert into a nice error message. The release notes will explicitly mention that this case raises an internal error, to reduce user confusion when hitting it. Also note that the underlying autograd issue was already present in 1.6.
Putting back the 1.7.0 milestone while we investigate it from the DDP side.
#41567 changed the behavior of chunk and split, renaming the previous versions as unsafe_*. As a result, comm.scatter outputs become views, which leads to the regression reported in #46242. This commit reverts to using the previous versions of split and chunk. ghstack-source-id: 1676545927922b917c0aa55a4c88c669a781a291 Pull Request resolved: #46361
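A small sketch of the underlying autograd behavior the commit message refers to: once chunk/split outputs are views, an in-place write into such a view of a leaf tensor that requires grad is rejected by autograd. This example is not the DDP repro from the issue, just the core view/in-place interaction in isolation:

```python
import torch

# A leaf tensor that requires grad, standing in for a model output buffer.
base = torch.randn(2, 4, requires_grad=True)

# chunk now returns views into `base` rather than independent tensors.
view = base.chunk(2)[0]

err = None
try:
    # In-place write into a view of a leaf that requires grad:
    # autograd raises a RuntimeError here.
    view[0, 0] = 1.0
except RuntimeError as e:
    err = e

print("in-place write rejected:", err)
```

Before #41567, comm.scatter outputs were not views, so the same in-place assignment went through without tripping this check; that is why the regression only appeared once chunk/split became view-producing.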
The linked PR #46296 has not landed yet.
🐛 Bug
In torch 1.6, assigning a parameter to an indexed tensor successfully created a new tensor that was part of the forward graph; this behavior no longer works with the latest torch 1.7 nightly.
To Reproduce
Which returns the (truncated) output
Expected behavior
No error to occur, as in torch 1.6
Environment
Additional context
When running the same code with torch 1.6, the grad_fn on `x` after assigning the parameter is `grad_fn=<IndexPutBackward>`.
cc @ezyang @gchanan @zou3519 @bdhirsh @albanD @gqchen @pearu @nikitaved @pietern @mrshenli @pritamdamania87 @zhaojuanmao @satgera @rohan-varma @aazzolini @xush6528 @osalpekar @jiayisuse @agolynski @ejguan