Do not attempt to pad nested tensors #2041
Conversation
I wasn't aware of nested tensors, thanks for bringing this to my attention. AFAICT, accelerate doesn't really take them into account for now, so I wouldn't be surprised if more errors can occur. It would probably require some unit tests to check this.
As for padding, I don't understand yet why it wouldn't work for nested tensors.
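For concreteness, here is a minimal illustration of the failure mode under discussion (the torch.nested constructor is PyTorch's nested-tensor API; the shape access mirrors what the padding helper does, not accelerate's actual code):

```python
import torch

# A nested tensor stores tensors of different lengths (here 2 and 3)
# without padding them to a common shape.
nt = torch.nested.nested_tensor([torch.tensor([1, 2]), torch.tensor([3, 4, 5])])

# The ragged dimension has no single size, so inspecting the shape raises
# a RuntimeError (the exact message varies across PyTorch versions).
try:
    print(len(nt.shape))
except RuntimeError as exc:
    print(f"shape inspection failed: {exc}")
```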
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
@frankier Could you please merge with/rebase on main? That should fix the issue with CI.
Force-pushed from 73ccf16 to 4bf502f
There are no more issues when I'm using accelerate from main.
The only test I know how to write at the moment is a smoke/unit test that checks that a nested_tensor is returned as-is when passed to pad_across_processes.
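A sketch of what that smoke test could look like (the test name and the identity assertion are assumptions, not the PR's actual test):

```python
import torch
from accelerate.utils import pad_across_processes

def test_pad_across_processes_returns_nested_tensor_as_is():
    nt = torch.nested.nested_tensor(
        [torch.tensor([1, 2]), torch.tensor([3, 4, 5])]
    )
    # Nested tensors cannot be padded generically, so the helper should
    # hand them back untouched rather than erroring.
    out = pad_across_processes(nt)
    assert out is nt
```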
The problem is that it uses in-band signalling. Whichever padding element you choose may actually occur in the array. Adding this as default behavior could cause confusing bugs for users. It's better to force them to choose how to deal with the nested_tensor.
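To illustrate the in-band signalling point with ordinary padding (the values here are arbitrary):

```python
import torch
import torch.nn.functional as F

seq = torch.tensor([5, 0, 7])          # the 0 here is genuine data
padded = F.pad(seq, (0, 2), value=0)   # tensor([5, 0, 7, 0, 0])
# Downstream code that strips the pad value cannot distinguish the real 0
# from the two padding 0s: the signal and the data share the same band.
```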
This might be from an old, possibly incorrect memory I have about the default behavior.
There could still be edge cases that you don't encounter but that other users may run into. That being said, there is no harm in accelerate getting out of the way if users happen to choose nested tensors, which is why I think this PR still makes sense.
I don't think it would be vacuous. You need to consider that someone else might come in the future and remove the check inadvertently. Having a test would alert us to this.
Not completely sure.
I see, that makes sense. However, I do wonder if we should give a warning in that situation, so that the user is aware that padding did not take place.
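A hedged sketch of where such a warning could go, assuming the guard sits at the top of the padding helper (the helper name, message, and control flow are illustrative, not the library's actual code):

```python
import warnings

def _nested_tensor_guard(tensor):
    # Hypothetical helper: skip padding for nested tensors, but tell the
    # user that no padding took place instead of staying silent.
    if getattr(tensor, "is_nested", False):
        warnings.warn(
            "pad_across_processes received a nested tensor; padding was "
            "skipped and the tensor is returned unchanged."
        )
        return tensor
    return None  # caller falls through to the regular padding path
```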
AFAICT, there is nothing in accelerate that requires this. A quick search didn't reveal anything in transformers either.
Agree with all of @BenjaminBossan's points
Yes indeed!
@frankier do you need any more direction on how to add the tests? :)
Force-pushed from 74c1d0c to 8f7ffcc
Please excuse the delay. I think I've addressed all this now.
@frankier Could you please run the code quality checks on your PR?
Force-pushed from 8f7ffcc to ca44605
Force-pushed from ca44605 to 3d19738
Whoops, thought I had(!) Done now.
Thanks, looks good to me.
Thanks!
What does this PR do?
Fixes pad_across_processes erroring on nested tensors when it reaches len(tensor.shape).
Fixes #2040
Fixes huggingface/transformers#26687
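For reference, a minimal sketch of the early-return guard this implies, assuming it sits just before the existing dimension check (a sketch only, not the PR's verbatim diff):

```python
def _pad_across_processes_sketch(tensor, dim=0, pad_index=0, pad_first=False):
    # Nested tensors have a ragged dimension with no single size, so
    # len(tensor.shape) below would raise; return them unchanged instead.
    if getattr(tensor, "is_nested", False):
        return tensor
    if dim >= len(tensor.shape):
        return tensor
    # ... gather sizes across processes and pad to the max size ...
```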
Looking for a bit of guidance/direction on whether other things are needed.
Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.
@pacman100