
DiffusionPipeline fails to load custom model #5553

@a-r-r-o-w

Description

Describe the bug

I was exploring this repository. It works well when using diffusers==0.20.2. To make it work with the latest diffusers release (0.21.4), you have to modify the pipeline file a bit, but that's not part of what's causing the issue.

The issue is mainly caused by the changes introduced in #5472, specifically line 1675 of pipeline_utils.py, although the failure there may itself be the result of other underlying bugs.
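
For context, here is a minimal sketch of how the failure arises (the loop and variable names are illustrative, not the actual diffusers code): if every value in model_index.json is treated as a [library, class] pair, the first element of ramping_coefficients, a float, ends up being concatenated with ".py".

```python
import os

# Illustrative subset of sudo-ai/zero123plus-v1.1's model_index.json
config = {
    "scheduler": ["diffusers", "EulerAncestralDiscreteScheduler"],  # [library, class] pair
    "ramping_coefficients": [0.0, 0.206, 0.187],  # plain constructor argument, not a component
}

for component, value in config.items():
    module_candidate = value[0]
    # Fine for "diffusers" + ".py", but raises
    # TypeError: unsupported operand type(s) for +: 'float' and 'str'
    # once module_candidate is the float 0.0 from ramping_coefficients.
    candidate_file = os.path.join(component, module_candidate + ".py")
```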

Reproduction

Note that the code below works with diffusers==0.20.2 (and with the latest release, diffusers==0.21.4, after minor modifications to the custom pipeline code, such as replacing `_encode_prompt()` with `encode_prompt()`; a sketch of that change follows the snippet), but not with pip install git+<DIFFUSERS_REPO_URL>.

import torch
from diffusers import DiffusionPipeline, EulerAncestralDiscreteScheduler

pipeline = DiffusionPipeline.from_pretrained(
    "sudo-ai/zero123plus-v1.1", custom_pipeline="sudo-ai/zero123plus-pipeline",
    torch_dtype=torch.float16
)
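
For completeness, the kind of call-site change involved in the `_encode_prompt()` to `encode_prompt()` migration looks roughly like this. It is only a sketch against the StableDiffusionPipeline API (where the private helper returned one concatenated tensor and the public method returns a tuple); the actual Zero123++ pipeline code and argument names may differ.

```python
import torch

def build_prompt_embeds(pipe, prompt, device, num_images_per_prompt=1,
                        do_classifier_free_guidance=True):
    # diffusers <= 0.20.x: the private helper returned a single tensor with
    # the negative and positive embeddings already concatenated:
    # prompt_embeds = pipe._encode_prompt(
    #     prompt, device, num_images_per_prompt, do_classifier_free_guidance
    # )

    # diffusers >= 0.21.x: encode_prompt() returns a tuple, so the caller
    # unpacks it and, for classifier-free guidance, concatenates the halves:
    prompt_embeds, negative_prompt_embeds = pipe.encode_prompt(
        prompt, device, num_images_per_prompt, do_classifier_free_guidance
    )
    if do_classifier_free_guidance:
        prompt_embeds = torch.cat([negative_prompt_embeds, prompt_embeds])
    return prompt_embeds
```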

Logs

With `git+<DIFFUSERS_REPO_URL>`:


Traceback (most recent call last):
  File "/workspace/server/load_zero123.py", line 13, in <module>
    pipeline = DiffusionPipeline.from_pretrained(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/ldm/lib/python3.11/site-packages/diffusers/pipelines/pipeline_utils.py", line 1074, in from_pretrained
    cached_folder = cls.download(
                    ^^^^^^^^^^^^^
  File "/opt/conda/envs/ldm/lib/python3.11/site-packages/diffusers/pipelines/pipeline_utils.py", line 1678, in download
    candidate_file = os.path.join(component, module_candidate + ".py")
                                             ~~~~~~~~~~~~~~~~~^~~~~~~
TypeError: unsupported operand type(s) for +: 'float' and 'str'

I added some print statements to the download() code to get more information about what is causing the issue:

dict_items([('_class_name', 'Zero123PlusPipeline'), ('_diffusers_version', '0.17.1'), ('feature_extractor_clip', ['transformers', 'CLIPImageProcessor']), ('feature_extractor_vae', ['transformers', 'CLIPImageProcessor']), ('ramping_coefficients', [0.0, 0.2060057818889618, 0.18684479594230652, 0.24342191219329834, 0.18507817387580872, 0.1703828126192093, 0.15628913044929504, 0.14174538850784302, 0.13617539405822754, 0.13569170236587524, 0.1269884556531906, 0.1200924888253212, 0.12816639244556427, 0.13058121502399445, 0.14201879501342773, 0.15004529058933258, 0.1620427817106247, 0.17207716405391693, 0.18534132838249207, 0.20002241432666779, 0.21657466888427734, 0.22996725142002106, 0.24613411724567413, 0.25141021609306335, 0.26613450050354004, 0.271847128868103, 0.2850190997123718, 0.285749226808548, 0.2813953757286072, 0.29509517550468445, 0.30109965801239014, 0.31370124220848083, 0.3134534955024719, 0.3108579218387604, 0.32147032022476196, 0.33548328280448914, 0.3301997184753418, 0.3254660964012146, 0.3514464199542999, 0.35993096232414246, 0.3510829508304596, 0.37661612033843994, 0.3913513123989105, 0.42122599482536316, 0.3954688012599945, 0.4260983467102051, 0.479139506816864, 0.4588979482650757, 0.4873477816581726, 0.5095643401145935, 0.5133851170539856, 0.520708441734314, 0.5363377928733826, 0.5661528706550598, 0.5859065651893616, 0.6207258701324463, 0.6560986638069153, 0.6379964351654053, 0.6777164340019226, 0.6589891910552979, 0.7574057579040527, 0.7446827292442322, 0.7695522308349609, 0.8163619041442871, 0.9502472281455994, 0.9918442368507385, 0.9398387670516968, 1.005432367324829, 0.9295969605445862, 0.9899859428405762, 1.044832706451416, 1.0427014827728271, 1.0829696655273438, 1.0062562227249146, 1.0966323614120483, 1.0550328493118286, 1.2108079195022583]), ('safety_checker', [None, None]), ('scheduler', ['diffusers', 'EulerAncestralDiscreteScheduler']), ('text_encoder', ['transformers', 'CLIPTextModel']), ('tokenizer', ['transformers', 'CLIPTokenizer']), ('unet', ['diffusers', 'UNet2DConditionModel']), ('vae', ['diffusers', 'AutoencoderKL']), ('vision_encoder', ['transformers', 'CLIPVisionModelWithProjection'])])
['feature_extractor_clip', 'feature_extractor_vae', 'ramping_coefficients', 'safety_checker', 'scheduler', 'text_encoder', 'tokenizer', 'unet', 'vae', 'vision_encoder']
<class 'str'> feature_extractor_clip transformers
<class 'str'> feature_extractor_vae transformers
<class 'float'> ramping_coefficients 0.0
Traceback (most recent call last):
  File "/workspace/server/load_zero123.py", line 13, in <module>
    pipeline = DiffusionPipeline.from_pretrained(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/ldm/lib/python3.11/site-packages/diffusers/pipelines/pipeline_utils.py", line 1074, in from_pretrained
    cached_folder = cls.download(
                    ^^^^^^^^^^^^^
  File "/opt/conda/envs/ldm/lib/python3.11/site-packages/diffusers/pipelines/pipeline_utils.py", line 1678, in download
    candidate_file = os.path.join(component, module_candidate + ".py")
                                             ~~~~~~~~~~~~~~~~~^~~~~~~
TypeError: unsupported operand type(s) for +: 'float' and 'str'

diffusers is trying to load the ramping_coefficients constructor parameter from https://huggingface.co/sudo-ai/zero123plus-v1.1/blob/main/model_index.json as a loadable module rather than as a plain list, which I believe is inconsistent with the behaviour of older diffusers versions.
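
One possible direction for a fix (just a sketch of the idea, not a patch against the actual download() implementation): only treat a model_index.json entry as a loadable component when its value is a two-element [library, class] pair of strings, and skip plain constructor arguments such as ramping_coefficients and null entries such as safety_checker.

```python
import os

def is_component_spec(value):
    # A loadable component entry looks like ["diffusers", "UNet2DConditionModel"];
    # lists of floats (ramping_coefficients) or [None, None] (safety_checker) are not.
    return (
        isinstance(value, (list, tuple))
        and len(value) == 2
        and all(isinstance(item, str) for item in value)
    )

config = {
    "scheduler": ["diffusers", "EulerAncestralDiscreteScheduler"],
    "ramping_coefficients": [0.0, 0.206, 0.187],
    "safety_checker": [None, None],
}

for component, value in config.items():
    if not is_component_spec(value):
        continue  # skip instead of crashing on float + ".py"
    module_candidate = value[0]
    candidate_file = os.path.join(component, module_candidate + ".py")
    # ...then check whether candidate_file exists in the repo, as download() does
```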

Note that, after making the minor modifications mentioned above to make Zero123++ compatible with diffusers==0.21.4, everything works; you can find the modified version in this Colab. However, I would like it to work with the latest main of diffusers, and I believe this is a bug in the recently introduced changes.

System Info

  • diffusers version: 0.22.0.dev0
  • Platform: Linux-5.4.0-144-generic-x86_64-with-glibc2.31
  • Python version: 3.11.0
  • PyTorch version (GPU?): 2.2.0.dev20231026+cu118 (True)
  • Huggingface_hub version: 0.18.0
  • Transformers version: 4.34.1
  • Accelerate version: 0.24.0
  • xFormers version: not installed
  • Using GPU in script?: True
  • Using distributed or parallel set-up in script?: False

Who can help?

@patrickvonplaten @sayakpaul

Labels

bug (Something isn't working)