
Flux2: Tensor tuples can cause issues for checkpointing #12776

@dxqb

Description


Describe the bug

The modulations calculated here...

self.double_stream_modulation_img = Flux2Modulation(self.inner_dim, mod_param_sets=2, bias=False)

...return tuples of Tensors:

return tuple(mod_params[3 * i : 3 * (i + 1)] for i in range(self.mod_param_sets))

These tuples are passed from outside the transformer blocks into the checkpointed transformer blocks.
If the tensors inside the tuples require gradients, this can cause issues for the backward pass:

RuntimeError: Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved tensors after calling backward.

torch checkpointing doesn't look inside the tuples; only top-level tensor arguments are identified:
https://github.com/pytorch/pytorch/blob/d38164a545b4a4e4e0cf73ce67173f70574890b6/torch/utils/checkpoint.py#L252
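For illustration, here is a minimal sketch of that mechanism outside the model (the block and names below are hypothetical, and it assumes reentrant checkpointing): when a tuple of grad-requiring tensors is passed into two checkpointed calls, only the top-level tensor arguments get detached before recomputation, so each block's inner backward reaches back into the shared modulation graph and should hit the same error.

```python
import torch
from torch.utils.checkpoint import checkpoint

# Hypothetical stand-in for a transformer block that consumes a tuple of
# modulation tensors (shift, scale), similar in spirit to Flux2Modulation's output.
def block(hidden, mods):
    shift, scale = mods
    return hidden * (1 + scale) + shift

# Modulation tensors produced by a Linear, so they require grad and carry a grad_fn.
mod_proj = torch.nn.Linear(4, 8)
mod_params = mod_proj(torch.randn(2, 4))
mods = (mod_params[:, :4], mod_params[:, 4:])  # a tuple of tensors, as in the issue

x = torch.randn(2, 4, requires_grad=True)

# Two checkpointed blocks receive the same tuple. With reentrant checkpointing,
# only top-level tensor args are detached before recomputation; the tensors inside
# the tuple stay attached to the original modulation graph, so the first block's
# inner backward frees that shared graph and the second pass over it fails.
h = checkpoint(block, x, mods, use_reentrant=True)
h = checkpoint(block, h, mods, use_reentrant=True)

h.sum().backward()  # expected: RuntimeError "Trying to backward through the graph a second time ..."
```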

Reproduction

Isolated reproduction code is difficult because of the size of the model, but I'll post a draft PR in a minute.

Logs

System Info

torch 2.8, diffusers HEAD

Who can help?

@DN6 @yiyixuxu @sayakpaul
