
Do not attempt to pad nested tensors #2041

Merged
merged 1 commit into huggingface:main from dont-pad-nested-tensor on Nov 17, 2023

Conversation

@frankier (Contributor) commented on Oct 9, 2023:

What does this PR do?

Fixes pad_across_processes erroring on nested tensors when it reaches len(tensor.shape).

Fixes #2040
Fixes huggingface/transformers#26687

Looking for a bit of guidance/direction on whether other things are needed.
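
To illustrate the idea (this is only a simplified sketch, not the actual diff; the function name and signature here are made up), the fix boils down to an early return for nested tensors:

```python
def _pad_across_processes_sketch(tensor, dim=0, pad_index=0):
    # Hypothetical guard, not the real accelerate code: nested tensors have no
    # regular shape, so the existing logic fails once it reaches
    # len(tensor.shape). The safest default is to hand them back untouched.
    if getattr(tensor, "is_nested", False):
        return tensor
    # ... the usual padding logic (which inspects len(tensor.shape)) would follow.
    return tensor
```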

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@pacman100

@BenjaminBossan (Member) commented:

I wasn't aware of nested tensors, thanks for bringing this to my attention. AFAICT, accelerate doesn't really take them into account for now, I wouldn't be surprised if more errors can occur. It would probably require some unit tests to check this.

> There is not an obvious default since numpy doesn't have an exact equivalent.

As for padding, I don't understand yet why to_padded_tensor cannot be used, what is the relation to numpy?

@HuggingFaceDocBuilderDev commented:

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@BenjaminBossan (Member) commented:

@frankier Could you please merge with/rebase on main? That should fix the issue with CI.

@frankier (Contributor, Author) commented:

> AFAICT, accelerate doesn't really take them into account for now, I wouldn't be surprised if more errors can occur.

There are no further issues when I'm using accelerate from the transformers Trainer, but there are issues in transformers if nested tensors are not converted in a custom preprocess_logits_for_metrics. Once this PR is merged, I will open another PR against transformers to improve the error messages there.

> It would probably require some unit tests to check this.

The test I know how to write at the moment is just a smoke/unit test that checks that a nested_tensor is returned as-is (round-tripped) when passed to pad_across_processes. I think this test would be a bit vacuous, but I can easily add it to the PR if you like. Could you please direct me to which file/class would be the best place for it?
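
Roughly, the test I have in mind would look something like this (assuming pad_across_processes is importable from accelerate.utils; the test name and placement are placeholders):

```python
import torch
from accelerate.utils import pad_across_processes

def test_pad_across_processes_returns_nested_tensor_unchanged():
    # Ragged nested tensor: there is no sensible way to pad it without extra
    # information, so pad_across_processes should return it as-is, not raise.
    nt = torch.nested.nested_tensor([torch.tensor([1, 2, 3]), torch.tensor([1])])
    assert pad_across_processes(nt) is nt
```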

> As for padding, I don't understand yet why to_padded_tensor cannot be used

The problem is that it uses in-band signalling: whichever padding element you choose may actually occur in the array. Adding this as default behavior could cause confusing bugs for users. It's better to force them to choose how to deal with the nested_tensor.
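
To make the in-band signalling point concrete, here is a small illustration (values chosen arbitrarily): with to_padded_tensor, the padding value is indistinguishable from real data.

```python
import torch

# A ragged nested tensor whose data already contains the value 0.0.
nt = torch.nested.nested_tensor([torch.tensor([0.0, 1.0]), torch.tensor([2.0])])

padded = torch.nested.to_padded_tensor(nt, padding=0.0)
# padded is tensor([[0., 1.],
#                   [2., 0.]])
# The trailing 0.0 in the second row is padding, but nothing distinguishes it
# from the genuine 0.0 in the first row.
```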

> what is the relation to numpy?

This might be from an old, possibly incorrect memory I have about the default behavior of preprocess_logits_for_metrics converting everything to numpy arrays.

@BenjaminBossan (Member) commented:

> There are no further issues when I'm using accelerate from the transformers Trainer, but there are issues in transformers if nested tensors are not converted in a custom preprocess_logits_for_metrics.

There could still be edge cases that you don't encounter but that other users may run into. E.g. the simple fact that tensor.shape may fail could easily lead to problems. Maybe accelerate dodges all of them, but we cannot know until we really test it. As is, I'd say we don't officially support nested tensors -- the docs mention they're at prototype stage, so it makes sense to wait.

That being said, there is no harm in accelerate getting out of the way if users happen to choose nested tensors, which is why I think this PR still makes sense.

> The test I know how to write at the moment is just a smoke/unit test that checks that a nested_tensor is returned as-is (round-tripped) when passed to pad_across_processes. I think this test would be a bit vacuous, but I can easily add it to the PR if you like.

I don't think it would be vacuous. You need to consider that someone else might come in the future and remove the check inadvertently. Having a test would alert us to this.

> Could you please direct me to which file/class would be the best place for it?

Not completely sure, test_utils.py seems most appropriate, @muellerzr WDYT?

> The problem is that it uses in-band signalling: whichever padding element you choose may actually occur in the array.

I see, that makes sense. However, I do wonder if we should give a warning in that situation, so that the user is aware that padding did not take place.
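
Something along these lines (just a suggestion, not claiming this is the exact wording or placement) inside the nested-tensor branch would be enough:

```python
import warnings

# Hypothetical warning at the point where a nested tensor is passed through
# unpadded.
warnings.warn(
    "pad_across_processes received a nested tensor and returned it unpadded; "
    "convert it explicitly (e.g. with torch.nested.to_padded_tensor) if padding "
    "is needed."
)
```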

> This might be from an old, possibly incorrect memory I have about the default behavior of preprocess_logits_for_metrics converting everything to numpy arrays.

AFAICT, there is nothing in accelerate that requires this. A quick search didn't reveal anything in transformers Trainer either, but I don't have experience with it, so I might be wrong.

@muellerzr (Collaborator) commented:

Agree with all of @BenjaminBossan's points

> Not completely sure, test_utils.py seems most appropriate, @muellerzr WDYT?

Yes indeed!

@muellerzr (Collaborator) commented:

@frankier do you need any more direction on how to add the tests? :)

@frankier force-pushed the dont-pad-nested-tensor branch 2 times, most recently from 74c1d0c to 8f7ffcc on November 9, 2023 at 13:20
@frankier (Contributor, Author) commented:

Please excuse the delay. I think I've addressed all this now.

@BenjaminBossan (Member) commented:

@frankier Could you please run make style and make quality?

@frankier (Contributor, Author) commented:

Whoops, I thought I had(!). Done now.

@BenjaminBossan (Member) left a comment:

Thanks, looks good to me.

@muellerzr (Collaborator) left a comment:

Thanks!

@muellerzr merged commit a5a7c03 into huggingface:main on Nov 17, 2023
24 checks passed