
Conversation

@LysandreJik
Member

The internal torch method `_softmax_backward_data` changed API between 1.10 and 1.11: its last argument used to be the `self` tensor and is now that tensor's dtype.

This PR updates the concerned models so that they are correctly supported.

Torch 1.11: https://github.com/pytorch/pytorch/blame/e47a5a64bbf4d388b70397e3237f9d5710ee4c9c/tools/autograd/derivatives.yaml#L1861
Before: https://github.com/pytorch/pytorch/blame/768cfaa8f86bf7c7b0af441d1536f060274c27a0/tools/autograd/derivatives.yaml#L1704
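The incompatibility boils down to a version check on the installed torch. A minimal sketch of the idea using `packaging` (the function name and return strings are hypothetical, for illustration only, and the torch version is passed in as a string rather than read from `torch.__version__`):

```python
from packaging import version

def softmax_backward_signature(torch_version: str) -> str:
    """Report which last argument torch._softmax_backward_data expects
    for a given torch version string (illustrative helper, not real torch API)."""
    if version.parse(torch_version) < version.parse("1.11"):
        return "self tensor"  # pre-1.11: last argument is the input tensor
    return "input dtype"      # 1.11+: last argument is the tensor's dtype
```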

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Mar 10, 2022

The documentation is not available anymore as the PR was closed or merged.


logger = logging.get_logger(__name__)

convert_to_dtype = not version.parse(torch.__version__) < version.parse("1.11")
Member Author


Maybe a better name for that value would be something more specific to its purpose, but then the name starts getting long and @sgugger gets angry, wdyt?

Suggested change
convert_to_dtype = not version.parse(torch.__version__) < version.parse("1.11")
convert_softmax_tensor_to_dtype = not version.parse(torch.__version__) < version.parse("1.11")

Contributor


I'd maybe just add a `do_convert...` prefix to make it a bit clearer that it's a flag and not a function

Contributor


don't mind the name after :-)

Contributor

@patrickvonplaten patrickvonplaten left a comment


Thanks!

Collaborator

@sgugger sgugger left a comment


For the torch int div we wrote our own function that does the test internally. I think we should do this the same way and write our own `_softmax_backward_data` inside `pytorch_utils` which will do the test internally, then import this one here.
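The pattern described here (the same one used for the integer-division helper) decides the version once at import time and branches inside a shared wrapper, so call sites never change. A rough sketch, with hypothetical stand-ins for the two torch signatures and a hard-coded version string in place of `torch.__version__`:

```python
from packaging import version

TORCH_VERSION = "1.11.0"  # stand-in for torch.__version__

# Computed once at import time, as for the integer-division helper.
is_torch_less_than_1_11 = version.parse(TORCH_VERSION) < version.parse("1.11")

def _backward_pre_1_11(grad_output, output, dim, self_tensor):
    # stand-in for the pre-1.11 signature (last argument: the tensor itself)
    return "called-with-tensor"

def _backward_1_11(grad_output, output, dim, input_dtype):
    # stand-in for the 1.11+ signature (last argument: the tensor's dtype)
    return "called-with-dtype"

def softmax_backward_data(parent, grad_output, output, dim, self):
    # The version test lives inside the wrapper, so every call site stays identical.
    if is_torch_less_than_1_11:
        return _backward_pre_1_11(grad_output, output, parent.dim, self)
    return _backward_1_11(grad_output, output, parent.dim, getattr(self, "dtype", None))
```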

@LysandreJik LysandreJik requested a review from sgugger March 10, 2022 13:20
@LysandreJik
Member Author

Addressed your comment @sgugger, could you do a second review?

As seen with Sylvain offline, I've moved the `packaging.version.parse` operation out of the methods, as otherwise it would be called inside the methods themselves, which are called multiple times in forward passes. @patrickvonplaten could you check if that's fine with you?

Comment on lines +24 to +25
is_torch_less_than_1_8 = version.parse(torch.__version__) < version.parse("1.8.0")
is_torch_less_than_1_11 = version.parse(torch.__version__) < version.parse("1.11")
Member Author


The torch version cannot change during runtime, so this is harmless
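Since the installed torch cannot change mid-run, the flag can safely be computed once at module import instead of re-parsing the version string on every forward pass. A small sketch of the two variants (version string hard-coded for illustration):

```python
from packaging import version

TORCH_VERSION = "1.10.2"  # stand-in for torch.__version__

# Evaluated once at import time; safe because the installed torch cannot change mid-run.
is_torch_less_than_1_11 = version.parse(TORCH_VERSION) < version.parse("1.11")

def forward_step_cached():
    # cheap module-level lookup on every call
    return is_torch_less_than_1_11

def forward_step_uncached():
    # re-parses the version string on every call: wasteful inside a training loop
    return version.parse(TORCH_VERSION) < version.parse("1.11")
```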

return torch.div(tensor1, tensor2, rounding_mode="floor")


def softmax_backward_data(parent, grad_output, output, dim, self):
Member Author


The self comes from the signature of the PyTorch function which is identical

@LysandreJik LysandreJik merged commit e66743e into master Mar 10, 2022
@LysandreJik LysandreJik deleted the fix-softmax-backward-torch-1.11 branch March 10, 2022 14:01
@speediedan

@sgugger @LysandreJik Thanks for your awesome work on building this immensely valuable ecosystem (and community!).
I'm waiting to release a package that requires this post-4.17 commit and it would be great to avoid pointing to a specific commit for packaging purposes. Is a 4.17.1 patch release planned? I asked on Discord but this was indicated to be a better forum for the question. Thanks again for helping lead this great community!

@sgugger
Collaborator

sgugger commented Mar 24, 2022

I don't think we have a patch planned. We will have 4.18 released probably next week instead :-)

@speediedan

I don't think we have a patch planned. We will have 4.18 released probably next week instead :-)

AWESOME! looking forward to it!

@callmeBalloch

Note that this evaluates True for pre-releases such as `1.11.0a0+b6df043`, so the error is still present.
`is_torch_less_than_1_11 = version.parse(version.parse(torch.__version__).base_version) < version.parse("1.11")` may help.
Am I missing something?
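For reference, a minimal sketch of the `base_version` fix proposed in this comment (taking the version as a string argument for illustration rather than reading `torch.__version__`):

```python
from packaging import version

def is_torch_less_than_1_11(torch_version: str) -> bool:
    """Version check that ignores pre-release and local segments, so a
    1.11 pre-release like '1.11.0a0+b6df043' is treated as 1.11."""
    base = version.parse(version.parse(torch_version).base_version)
    return base < version.parse("1.11")
```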


7 participants