Civit AI flux model razor-8step-rapid-real not working with diffusers single file #11127

### Describe the bug

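We want to load this CivitAI model, https://civitai.com/models/849864/razor-8step-rapid-real, with `FluxPipeline.from_single_file`, but loading fails with a `RuntimeError` while the Flux transformer checkpoint is being converted to the diffusers format.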

### Reproduction

1) Create a CivitAI API key: log in to CivitAI, go to https://civitai.com/user/account, and create a key in the "API Keys" section at the bottom of the page.
2) Download the checkpoint from a terminal, replacing `YOUR_TOKEN` with your key: `wget --show-progress -O model.safetensors "https://api.civitai.com/download/models/950841?token=YOUR_TOKEN"`
3) Run the following code:
```python
import torch
from diffusers import FluxPipeline

# "model.safetensors" is the file downloaded with wget in step 2.
# The error is raised inside this call, before any inference runs.
pipe = FluxPipeline.from_single_file(
    "model.safetensors",
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

prompt = "A cat holding a sign that says hello world"
image = pipe(
    prompt,
    height=1024,
    width=1024,
    guidance_scale=3.5,
    num_inference_steps=50,
    max_sequence_length=512,
    generator=torch.Generator("cuda").manual_seed(0),
).images[0]
image.save("flux.png")
```

### Logs

```shell
(3.7) user@c6dbd33b-904f-4d4e-bc4e-f68f78a80315:~/runware/Ali/sd-base-api$ python ali.py 
Fetching 16 files: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████| 16/16 [00:00<00:00, 24745.16it/s]
Loading pipeline components...:  57%|█████████████████████████████████████████████████████████▏                                          | 4/7 [00:16<00:12,  4.16s/it]You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
Loading pipeline components...:  71%|███████████████████████████████████████████████████████████████████████▍                            | 5/7 [00:16<00:06,  3.37s/it]
Traceback (most recent call last):
  File "/home/user/runware/Ali/sd-base-api/ali.py", line 7, in <module>
    pipe = FluxPipeline.from_single_file(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/anaconda3/envs/3.7/lib/python3.12/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/user/anaconda3/envs/3.7/lib/python3.12/site-packages/diffusers/loaders/single_file.py", line 509, in from_single_file
    loaded_sub_model = load_single_file_sub_model(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/anaconda3/envs/3.7/lib/python3.12/site-packages/diffusers/loaders/single_file.py", line 104, in load_single_file_sub_model
    loaded_sub_model = load_method(
                       ^^^^^^^^^^^^
  File "/home/user/anaconda3/envs/3.7/lib/python3.12/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/user/anaconda3/envs/3.7/lib/python3.12/site-packages/diffusers/loaders/single_file_model.py", line 343, in from_single_file
    diffusers_format_checkpoint = checkpoint_mapping_fn(
                                  ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/anaconda3/envs/3.7/lib/python3.12/site-packages/diffusers/loaders/single_file_utils.py", line 2255, in convert_flux_transformer_checkpoint_to_diffusers
    q, k, v, mlp = torch.split(checkpoint.pop(f"single_blocks.{i}.linear1.weight"), split_size, dim=0)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/anaconda3/envs/3.7/lib/python3.12/site-packages/torch/functional.py", line 207, in split
    return tensor.split(split_size_or_sections, dim)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/anaconda3/envs/3.7/lib/python3.12/site-packages/torch/_tensor.py", line 983, in split
    return torch._VF.split_with_sizes(self, split_size, dim)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: split_with_sizes expects split_sizes to sum exactly to 33030144 (input tensor's size at dimension 0), but got split_sizes=[3072, 3072, 3072, 12288]
```
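A note on the numbers in the error, in case it helps triage. The converter splits `single_blocks.{i}.linear1.weight` into `[3072, 3072, 3072, 12288]` rows, which sum to 21504, so it expects a dense `(21504, 3072)` matrix with 66060288 elements. The tensor in this file has 33030144 entries along dimension 0, which is exactly half of that. One possible reading, which is an assumption and not something confirmed here, is that the weight is stored flattened in a packed 4-bit format (two values per byte). The sketch below checks that arithmetic and prints what the file actually contains; `classify_linear1_shape` and `inspect_checkpoint` are hypothetical helper names, and the hidden size 3072 and MLP ratio 4 are inferred from the split sizes in the traceback.

```python
import math


def classify_linear1_shape(shape, hidden_size=3072, mlp_ratio=4):
    """Compare a stored linear1.weight shape with the dense layout the converter splits.

    hidden_size and mlp_ratio are inferred from split_sizes=[3072, 3072, 3072, 12288]
    in the traceback; they are assumptions, not values read from the checkpoint.
    """
    split_sizes = [hidden_size, hidden_size, hidden_size, hidden_size * mlp_ratio]
    dense_rows = sum(split_sizes)           # 21504 for the defaults
    dense_numel = dense_rows * hidden_size  # 66060288 for the defaults
    numel = math.prod(shape)

    if shape[0] == dense_rows:
        return "dense"
    if numel * 2 == dense_numel:
        # Half as many entries as the dense matrix: consistent with two 4-bit
        # values packed per byte. This is a guess, not a confirmed format.
        return "half-size (possibly packed 4-bit)"
    return "unknown"


def inspect_checkpoint(path, suffix="single_blocks.0.linear1.weight"):
    """Print shape, dtype and classification of matching tensors in a safetensors file."""
    from safetensors import safe_open  # imported lazily so the helper above has no dependencies

    with safe_open(path, framework="pt") as f:
        for key in f.keys():
            # Keys may carry a prefix in the original file, so match on the suffix.
            if key.endswith(suffix):
                tensor = f.get_tensor(key)
                shape = tuple(tensor.shape)
                print(key, shape, tensor.dtype, classify_linear1_shape(shape))


if __name__ == "__main__":
    # Dimension 0 comes from the error message; the trailing 1 is a guess.
    print(classify_linear1_shape((33030144, 1)))
    print(classify_linear1_shape((21504, 3072)))
```

Calling `inspect_checkpoint("model.safetensors")` on the downloaded file should show the stored dtype and shape, which would confirm or rule out the packed-weight guess.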

### System Info

- 🤗 Diffusers version: 0.33.0.dev0
- Platform: Linux-5.15.0-134-generic-x86_64-with-glibc2.35
- Running on Google Colab?: No
- Python version: 3.12.9
- PyTorch version (GPU?): 2.5.1+cu124 (True)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Huggingface_hub version: 0.27.1
- Transformers version: 4.47.1
- Accelerate version: 1.2.1
- PEFT version: 0.14.0
- Bitsandbytes version: not installed
- Safetensors version: 0.5.0
- xFormers version: not installed
- Accelerator: NVIDIA A100-SXM4-80GB, 81920 MiB
- Using GPU in script?: Yes
- Using distributed or parallel set-up in script?: No

### Who can help?

@sayakpaul 
