Describe the bug
After the model files finished downloading, I received two warnings:

1. You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
2. The config attributes {'qk_norm': 'rms_norm'} were passed to SD3Transformer2DModel, but are not expected and will be ignored. Please verify your config.json configuration file.

Then I received the error shown in the Logs section below.
Reproduction
```python
!pip install -q -U datasets diffusers transformers accelerate torch

from huggingface_hub import login
login("hf_token")  # replace with your own Hugging Face access token

import torch
from diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained("stabilityai/stable-diffusion-3.5-large").to("cuda")
```
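
For context, a minimal sketch of the workaround I would try, assuming the ignored `qk_norm` attribute means the installed diffusers release predates SD 3.5 support (the `>=0.31.0` pin and the `torch_dtype` choice below are my assumptions, not verified here):

```python
# Workaround sketch (my assumption): the ignored {'qk_norm': 'rms_norm'}
# config attribute suggests the installed diffusers predates SD 3.5 support,
# so upgrading diffusers first should let the transformer build its qk-norm
# layers. The >=0.31.0 pin is my guess at the first release with support.
!pip install -q -U "diffusers>=0.31.0"

import torch
from diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3.5-large",
    torch_dtype=torch.bfloat16,  # full fp32 weights may not fit on a Colab GPU
).to("cuda")
```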
Logs
You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
The config attributes {'qk_norm': 'rms_norm'} were passed to SD3Transformer2DModel, but are not expected and will be ignored. Please verify your config.json configuration file.
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-3-689f4e5ef958> in <cell line: 7>()
5 from diffusers import StableDiffusion3Pipeline
6
----> 7 pipe = StableDiffusion3Pipeline.from_pretrained("stabilityai/stable-diffusion-3.5-large").to("cuda")
/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py in _inner_fn(*args, **kwargs)
112 kwargs = smoothly_deprecate_use_auth_token(fn_name=fn.__name__, has_token=has_token, kwargs=kwargs)
113
--> 114 return fn(*args, **kwargs)
115
116 return _inner_fn # type: ignore
/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/pipeline_utils.py in from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
874 else:
875 # load sub model
--> 876 loaded_sub_model = load_sub_model(
877 library_name=library_name,
878 class_name=class_name,
/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/pipeline_loading_utils.py in load_sub_model(library_name, class_name, importable_classes, pipelines, is_pipeline_module, pipeline_class, torch_dtype, provider, sess_options, device_map, max_memory, offload_folder, offload_state_dict, model_variants, name, from_flax, variant, low_cpu_mem_usage, cached_folder)
698 # check if the module is in a subdirectory
699 if os.path.isdir(os.path.join(cached_folder, name)):
--> 700 loaded_sub_model = load_method(os.path.join(cached_folder, name), **loading_kwargs)
701 else:
702 # else load from the root directory
/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py in _inner_fn(*args, **kwargs)
112 kwargs = smoothly_deprecate_use_auth_token(fn_name=fn.__name__, has_token=has_token, kwargs=kwargs)
113
--> 114 return fn(*args, **kwargs)
115
116 return _inner_fn # type: ignore
/usr/local/lib/python3.10/dist-packages/diffusers/models/modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
772 force_hook = False
773 try:
--> 774 accelerate.load_checkpoint_and_dispatch(
775 model,
776 model_file if not is_sharded else index_file,
/usr/local/lib/python3.10/dist-packages/accelerate/big_modeling.py in load_checkpoint_and_dispatch(model, checkpoint, device_map, max_memory, no_split_module_classes, offload_folder, offload_buffers, dtype, offload_state_dict, skip_keys, preload_module_classes, force_hooks, strict)
611 if offload_state_dict is None and device_map is not None and "disk" in device_map.values():
612 offload_state_dict = True
--> 613 load_checkpoint_in_model(
614 model,
615 checkpoint,
/usr/local/lib/python3.10/dist-packages/accelerate/utils/modeling.py in load_checkpoint_in_model(model, checkpoint, device_map, offload_folder, dtype, offload_state_dict, offload_buffers, keep_in_fp32_modules, offload_8bit_bnb, strict)
1876 offload_weight(param, param_name, state_dict_folder, index=state_dict_index)
1877 else:
-> 1878 set_module_tensor_to_device(
1879 model,
1880 param_name,
/usr/local/lib/python3.10/dist-packages/accelerate/utils/modeling.py in set_module_tensor_to_device(module, tensor_name, device, value, dtype, fp16_statistics, tied_params_map)
334 new_module = getattr(module, split)
335 if new_module is None:
--> 336 raise ValueError(f"{module} has no attribute {split}.")
337 module = new_module
338 tensor_name = splits[-1]
ValueError: Attention(
(to_q): Linear(in_features=2432, out_features=2432, bias=True)
(to_k): Linear(in_features=2432, out_features=2432, bias=True)
(to_v): Linear(in_features=2432, out_features=2432, bias=True)
(add_k_proj): Linear(in_features=2432, out_features=2432, bias=True)
(add_v_proj): Linear(in_features=2432, out_features=2432, bias=True)
(add_q_proj): Linear(in_features=2432, out_features=2432, bias=True)
(to_out): ModuleList(
(0): Linear(in_features=2432, out_features=2432, bias=True)
(1): Dropout(p=0.0, inplace=False)
)
(to_add_out): Linear(in_features=2432, out_features=2432, bias=True)
) has no attribute norm_added_k.
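
My reading of the traceback (an assumption on my part): the checkpoint ships weights for a `norm_added_k` submodule that the `Attention` block only creates when the `qk_norm` config key is honored, which lines up with the warning above saying `qk_norm` was ignored. A quick sanity check of the environment:

```python
# Sketch: confirm which diffusers version the notebook actually imports,
# since the ignored qk_norm attribute points at a version mismatch.
import diffusers
print(diffusers.__version__)  # prints 0.30.3 in my environment
```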
System Info
Platform: Google Colab
Module versions:
diffusers - 0.30.3
transformers - 4.44.2
torch - 2.5.0+cu121
accelerate - 0.34.2