
Flux2: Tensor tuples can cause issues for checkpointing #12777

Merged
dg845 merged 11 commits into huggingface:main from dxqb:flux2_tuples
Feb 19, 2026

Conversation

@dxqb
Contributor

@dxqb dxqb commented Dec 2, 2025

addresses #12776

What does this PR do?

This PR keeps the tuples, but moves the splitting of tensors into tuples of tensors into the transformer blocks, to avoid issues with checkpointing. When a tensor is passed directly, torch.utils.checkpoint() identifies it as a tensor and saves it accordingly, without running a backward pass through it multiple times.

This is a draft. If you agree with this change I can make it nicer. Among other things:

  • type hints are incorrect
  • the splitting might not be necessary anymore, because the resulting tensors are used immediately afterwards
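
The change described above can be illustrated with a minimal sketch. It is not the code from this PR: `block_split_inside`, `run`, the tensor shapes, and the simplified modulation math are all hypothetical stand-ins. The packed modulation tensor is passed to `torch.utils.checkpoint.checkpoint` as a plain tensor argument and is split into shift, scale, and gate inside the block, instead of being split into a tuple before the call. As far as the PyTorch documentation describes it, the reentrant checkpoint variant does not treat tensors inside nested structures such as tuples as participating in autograd, which is why a top-level tensor argument is the safer thing to pass.

```python
import torch
from torch.utils.checkpoint import checkpoint


def block_split_inside(hidden: torch.Tensor, mod: torch.Tensor) -> torch.Tensor:
    # Hypothetical stand-in for a transformer block: it receives the packed
    # modulation tensor and splits it into (shift, scale, gate) itself.
    shift, scale, gate = mod.chunk(3, dim=-1)
    return hidden + gate * (hidden * (1 + scale) + shift)


def run(use_checkpoint: bool) -> tuple[torch.Tensor, torch.Tensor]:
    # Fixed seed so the checkpointed and plain runs see identical inputs.
    torch.manual_seed(0)
    hidden = torch.randn(2, 4, requires_grad=True)
    cond = torch.randn(2, 8)
    mod_weight = torch.randn(8, 12, requires_grad=True)

    # The modulation tensor is computed once and shared by every block.
    mod = torch.tanh(cond @ mod_weight)

    out = hidden
    for _ in range(3):
        if use_checkpoint:
            # `mod` is a top-level tensor argument, so checkpoint can
            # detect it and handle it like any other tensor input.
            out = checkpoint(block_split_inside, out, mod, use_reentrant=True)
        else:
            out = block_split_inside(out, mod)

    out.sum().backward()
    return hidden.grad.clone(), mod_weight.grad.clone()
```

Comparing `run(True)` with `run(False)` should give matching gradients for both `hidden` and `mod_weight`, which is a cheap sanity check that the checkpointed path backpropagates through the shared modulation tensor correctly.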

Who can review?

@yiyixuxu and @asomoza

@github-actions
Contributor

github-actions bot commented Jan 9, 2026

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot added the stale label (Issues that haven't received updates) Jan 9, 2026
@yiyixuxu yiyixuxu requested a review from dg845 January 10, 2026 02:24
@dg845
Collaborator

dg845 commented Jan 13, 2026

Hi @dxqb, thanks for opening this PR and thanks for your patience! This change looks good to me. As mentioned in #12776 (comment), it would be nice to have a small script to reproduce/test this behavior.

@github-actions github-actions bot removed the stale label (Issues that haven't received updates) Jan 13, 2026
@dxqb
Contributor Author

dxqb commented Feb 1, 2026

> Hi @dxqb, thanks for opening this PR and thanks for your patience! This change looks good to me. As mentioned in #12776 (comment), it would be nice to have a small script to reproduce/test this behavior.

No repro code, but it's clear now why this happens. It's documented by PyTorch: #12776 (comment)

@dxqb dxqb marked this pull request as ready for review February 13, 2026 08:58
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Collaborator

@dg845 dg845 left a comment


Thanks for the changes! Can you solve the merge conflicts with main? I think they may be a result of #12524, which switches over to using Python 3.9+ style type hints without explicit typing imports, including in transformer_flux2.py.
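
For readers unfamiliar with the type hint style mentioned above, here is a small sketch of Python 3.9+ built-in generics, which need no `typing` import. The helper `split_evenly` is hypothetical and is not taken from transformer_flux2.py; it only shows `list[...]` and `tuple[...]` used where `typing.List` and `typing.Tuple` would have been used before.

```python
# Hypothetical helper for illustration only, not code from diffusers.
def split_evenly(values: list[float], parts: int) -> tuple[list[float], ...]:
    """Split `values` into `parts` equally sized chunks."""
    if parts <= 0 or len(values) % parts != 0:
        raise ValueError("len(values) must be divisible by a positive `parts`")
    size = len(values) // parts
    # Built-in `list` and `tuple` are subscriptable in annotations on 3.9+.
    return tuple(values[i * size : (i + 1) * size] for i in range(parts))
```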

@dxqb
Contributor Author

dxqb commented Feb 17, 2026

> Thanks for the changes! Can you solve the merge conflicts with main? I think they may be a result of #12524, which switches over to using Python 3.9+ style type hints without explicit typing imports, including in transformer_flux2.py.

Done, and tested using Nerogar/OneTrainer#1279

@dg845
Collaborator

dg845 commented Feb 19, 2026

@bot /style

@github-actions
Contributor

github-actions bot commented Feb 19, 2026

Style bot fixed some files and pushed the changes.

@dg845
Collaborator

dg845 commented Feb 19, 2026

Merging as the CI failure is unrelated.

Collaborator

@dg845 dg845 left a comment


Thanks!

@dg845 dg845 merged commit a577ec3 into huggingface:main Feb 19, 2026
10 of 11 checks passed
dxqb added a commit to Nerogar/OneTrainer that referenced this pull request Feb 19, 2026
@dxqb dxqb deleted the flux2_tuples branch February 19, 2026 17:36

3 participants