Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
This is weird, because we explicitly have this. Something must have changed for loading models.
That's true. I think I found the problem. |
Force-pushed 37e20b5 to 71a581f
I created a centralized `_build_dtype_plan_for_loading()` method.
| "batch_size": batch_size, | ||
| "max_cache_len": max_cache_length, | ||
| }, | ||
| eos_token_id=model.config.eos_token_id, |
I am not sure why Seq2SeqLMExportableModule doesn't use something similar to other places, like generation_config = model.generation_config.
(maybe @vasqu or @zucchini-nlp could say something here)
For the change here, I am also not sure whether model.config is the best thing to use, or whether it should use model.generation_config instead.
And one more thing: if we change this here, would other tests be affected? I guess not, as I only see the T5 test file using Seq2SeqLMExportableModule.
You're right. It's better to use model.generation_config; I'll do it.
Yes, only T5 uses Seq2SeqLMExportableModule.
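For illustration, a minimal sketch of building the GenerationConfig from model.generation_config instead of model.config. The model id, batch_size, and max_cache_length values are placeholders, and any kwargs not shown in the quoted diff above are assumptions about the surrounding call, not the actual Seq2SeqLMExportableModule code:

```python
import torch
from transformers import AutoModelForSeq2SeqLM, GenerationConfig

# Placeholder model and values, purely for illustration.
model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-small")
batch_size, max_cache_length = 1, 128

# The suggested change: take eos_token_id from model.generation_config
# rather than model.config; the cache kwargs mirror the quoted diff.
generation_config = GenerationConfig(
    use_cache=True,
    max_length=max_cache_length,
    cache_implementation="static",
    cache_config={
        "batch_size": batch_size,
        "max_cache_len": max_cache_length,
    },
    eos_token_id=model.generation_config.eos_token_id,
)
```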
Yes, it makes sense to set this via generation_config. Note that generate is always custom-overridden for executorch-related models; e.g. the encoder-decoder override is here:
transformers/src/transformers/integrations/executorch.py, lines 967 to 1000 at de306e8
It's kept as simple as possible, mimicking plain do_sample=False; the other variants seem to rely on max length, for example:
Note
transformers/src/transformers/integrations/executorch.py, lines 996 to 998 at de306e8
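For context, here is a minimal standalone sketch (not the actual executorch integration code) of a greedy, do_sample=False loop of that shape. `step_fn` is a hypothetical stand-in for the exported decoder step; the sketch shows why a missing eos_token_id makes generation run until max length:

```python
import torch

def greedy_generate(step_fn, prompt_ids, eos_token_id, max_length):
    """Greedy (do_sample=False) loop: stop at EOS or when max_length is reached."""
    tokens = list(prompt_ids)
    while len(tokens) < max_length:
        logits = step_fn(torch.tensor([tokens]))       # next-token logits, shape (1, vocab_size)
        next_token = int(torch.argmax(logits, dim=-1))
        tokens.append(next_token)
        # With eos_token_id=None this condition never fires, so the loop only
        # terminates at max_length -- the failure mode described in this PR.
        if eos_token_id is not None and next_token == eos_token_id:
            break
    return tokens

# Dummy step function for illustration: always predicts token id 1 (pretend EOS).
dummy_step = lambda ids: torch.nn.functional.one_hot(torch.tensor([1]), num_classes=10).float()
print(greedy_generate(dummy_step, [5, 7], eos_token_id=1, max_length=16))     # stops at EOS
print(greedy_generate(dummy_step, [5, 7], eos_token_id=None, max_length=16))  # runs to max_length
```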
What kind of failures do we have here? I mean, could we just adjust the test logic (if that makes sense) rather than change the logic in ...?
```python
mtf_score = -(labels.shape[-1] * loss.item())
```
```diff
- EXPECTED_SCORE = -59.0293
+ EXPECTED_SCORE = -40.1645
```
Let's try to use the Expectations class more here.
BTW, do you know why there is such a difference?
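A hedged sketch of what that could look like, assuming the Expectations helper from transformers.testing_utils keyed by (device type, major compute capability) tuples; the keys and values below are placeholders, not real per-device results:

```python
from transformers.testing_utils import Expectations

# Placeholder mapping: (None, None) acts as the default entry, and
# ("cuda", 8) would hold an Ampere-specific value if one were needed.
expected_scores = Expectations(
    {
        (None, None): -40.1645,
        ("cuda", 8): -40.1645,
    }
)
EXPECTED_SCORE = expected_scores.get_expectation()
```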
For the line below using self.assertTrue: let's avoid such usage and use torch.testing.assert_close instead.
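For example (the values below are placeholders, not taken from the real test):

```python
import torch

EXPECTED_SCORE = -40.1645
mtf_score = -40.1649  # placeholder for the computed score

# Instead of: self.assertTrue(abs(mtf_score - EXPECTED_SCORE) < 1e-3)
# assert_close reports both values and the mismatch when it fails.
torch.testing.assert_close(mtf_score, EXPECTED_SCORE, rtol=0.0, atol=1e-3)
```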
It seems to have been failing since Nov 14, 2025. Has it been failing since this test was added, or did some PR break it at some point while it passed before?
When we see or make such a big change, it's better to check and give more context, so we are sure we are not doing something wrong.
I think it’s difficult to track the origin of what led to this change. There are many factors that could cause the expected scores or text completions to change. However, I checked that the new value has been consistent for a good period of time.
The current logic in ...
View the CircleCI Test Summary for this PR: https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=43374&sha=030d5e
Re: keeping modules in fp32, see my comment #43374 (comment). I'm pretty sure this behavior is wrong and was only recently changed; IIRC @eustlb added the strict flag. Maybe cc @Cyrilvallez for model loading?
ArthurZucker left a comment
I think the change is just that dtype="auto" is now the default, which will follow the config (and if there is no config, IDK). We should try to just force fp32 in the test model loading, imo.
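A minimal sketch of pinning fp32 in the test loading, assuming a standard from_pretrained call (the model id is illustrative):

```python
import torch
from transformers import AutoModelForSeq2SeqLM

# Pin fp32 explicitly instead of relying on the dtype="auto" default,
# which follows whatever dtype the checkpoint config records.
model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-small", dtype=torch.float32)
assert model.dtype == torch.float32
```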
src/transformers/modeling_utils.py (outdated)
```python
def _build_dtype_plan_for_loading(self, dtype: torch.dtype) -> dict:
    """
    Builds dtype_plan based on target dtype and keep_in_fp32 module settings.
    """
    # Add _keep_in_fp32_modules only for FP16 loading
    if isinstance(self._keep_in_fp32_modules, list) and dtype == torch.float16:
        self.dtype_plan.update(dict.fromkeys(self._keep_in_fp32_modules, torch.float32))

    # Add _keep_in_fp32_modules_strict for both FP16 and BF16
    if isinstance(self._keep_in_fp32_modules_strict, list) and dtype in (torch.float16, torch.bfloat16):
        self.dtype_plan.update(dict.fromkeys(self._keep_in_fp32_modules_strict, torch.float32))

    return self.dtype_plan
```
I see @ArthurZucker
I actually added a function that correctly builds the dtype_plan during loading. Let me know what you think. Otherwise, I can revert it and just update the test.
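To illustrate what such a plan resolves to for T5, here is a standalone toy version (not the transformers code path), assuming T5 sets _keep_in_fp32_modules = ["wo"] and defines no strict list:

```python
import torch

_keep_in_fp32_modules = ["wo"]        # T5 setting, per this PR
_keep_in_fp32_modules_strict = None   # assumed: T5 defines no strict list

def build_dtype_plan(dtype):
    plan = {}
    if isinstance(_keep_in_fp32_modules, list) and dtype == torch.float16:
        plan.update(dict.fromkeys(_keep_in_fp32_modules, torch.float32))
    if isinstance(_keep_in_fp32_modules_strict, list) and dtype in (torch.float16, torch.bfloat16):
        plan.update(dict.fromkeys(_keep_in_fp32_modules_strict, torch.float32))
    return plan

print(build_dtype_plan(torch.float16))   # {'wo': torch.float32}: wo stays in fp32
print(build_dtype_plan(torch.bfloat16))  # {}: wo follows bf16, as the test expects
```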
Cyrilvallez left a comment
Opened #43683 (will merge asap when the tests finish running) for the dtype issue because I wanted to add very strict tests on this!
The rest LGTM, let's simply revert that part once you pull main! Sorry for the inconvenience 😬
Force-pushed 71d93cb to d0e73c2
[For maintainers] Suggested jobs to run (before merge): run-slow: t5
Cyrilvallez left a comment
Alright LGTM! Thanks a lot!
Force merging as failures are completely unrelated!
Fix all the remaining T5 failures.
- `test_export_t5_summarization` was failing because `eos_token_id` was not being passed to the `GenerationConfig` in `Seq2SeqLMExportableModule`, causing the generate loop to never stop at the end-of-sequence token and produce extra tokens.
- The `test_fp16_fp32_conversion` test was failing because T5's `_keep_in_fp32_modules = ["wo"]` was unconditionally forcing the `wo` layer to FP32 for both FP16 and BF16 loading, so I created a centralized `_build_dtype_plan_for_loading()` method that encapsulates all the `dtype_plan` building logic in one place, taking the target dtype as input and properly applying `_keep_in_fp32_modules` (FP16 only) and `_keep_in_fp32_modules_strict` (FP16 and BF16) based on the actual loading `dtype`.
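As a rough illustration of the behavior the second fix restores (the model id and module path below are illustrative and may not match the real test body):

```python
import torch
from transformers import T5ForConditionalGeneration

# fp16 load: "wo" is kept in fp32 via _keep_in_fp32_modules
model = T5ForConditionalGeneration.from_pretrained("google-t5/t5-small", dtype=torch.float16)
assert model.decoder.block[0].layer[-1].DenseReluDense.wo.weight.dtype == torch.float32

# bf16 load: "wo" is no longer forced to fp32 and follows the requested dtype
model = T5ForConditionalGeneration.from_pretrained("google-t5/t5-small", dtype=torch.bfloat16)
assert model.decoder.block[0].layer[-1].DenseReluDense.wo.weight.dtype == torch.bfloat16
```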