Skip to content

[tests] Improve ideogram4 tests#13862

Merged
sayakpaul merged 6 commits into
mainfrom
improve-ideogram4-tests
Jun 5, 2026
Merged

[tests] Improve ideogram4 tests#13862
sayakpaul merged 6 commits into
mainfrom
improve-ideogram4-tests

Conversation

@sayakpaul
Copy link
Copy Markdown
Member

What does this PR do?

New model tests must use pytest and newly added mixins.

@sayakpaul sayakpaul requested a review from dg845 June 4, 2026 06:00
@github-actions github-actions Bot added models tests size/M PR with diff < 200 LOC labels Jun 4, 2026
Comment on lines +146 to +149
# Skip: the non-persistent fp32 RoPE inv_freq buffer is truncated to fp16 by the in-memory
# .to(dtype) path but kept fp32 by from_pretrained, so the two outputs diverge well beyond any
# meaningful tolerance. Dtype preservation is already covered by test_from_save_pretrained_dtype
# and test_keep_in_fp32_modules.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems a bit concerning to me.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think an underlying issue with test_from_save_pretrained_dtype_inference is that the model.to(dtype) cast at

model.to(dtype).save_pretrained(tmp_path)

is not dtype-aware (it will cast everything to dtype), but from_pretrained(..., torch_dtype=dtype) is dtype-aware (it respects _keep_in_fp32_modules, etc.). This causes the behavior of the two to diverge in several scenarios:

  1. There are _keep_in_fp32_modules specified on the model (the more common case)
  2. A non-persistent buffer like inv_freq is created with an explicit dtype (which is the case here)

which leads to a divergence between the outputs of model and model_loaded.

@HuggingFaceDocBuilderDev
Copy link
Copy Markdown

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Copy link
Copy Markdown
Collaborator

@dg845 dg845 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

@sayakpaul sayakpaul merged commit 924cfb2 into main Jun 5, 2026
14 of 16 checks passed
@sayakpaul sayakpaul deleted the improve-ideogram4-tests branch June 5, 2026 02:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

models size/M PR with diff < 200 LOC tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants