Cannot load a pretrained CLIPTextModel while using SD2.1

I'm trying to write a Stable Diffusion pipeline on my own, and I've read some tutorials about it, such as https://huggingface.co/blog/stable_diffusion

That tutorial is based on "CompVis/stable-diffusion-v1-4". I tried it on Colab and it works well.

But when I changed the model to "stabilityai/stable-diffusion-2-1-base", the CLIPTextModel downloaded from the Hub

```python
from transformers import CLIPTextModel, CLIPTokenizer

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")
```

is still the one for SD1.4, with hidden_size 768, whereas SD2.1 requires hidden_size 1024, so the two don't match.

I also tried
```python
CLIPTextModel.from_pretrained("stabilityai/stable-diffusion-2-1-base")
``` 

It raises "stabilityai/stable-diffusion-2-1-base does not appear to have a file named config.json."

I don't know how to load the correct CLIPTextModel for SD2.1, and I don't want to use `StableDiffusionPipeline` for now.

Can you help me solve this problem?
Thanks a lot.
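
One thing that may be worth trying, as a sketch rather than a confirmed fix: the error above suggests the repository root has no `config.json`, which would be consistent with the usual diffusers layout where each component lives in its own subfolder. Assuming "stabilityai/stable-diffusion-2-1-base" has `tokenizer` and `text_encoder` subfolders, the `subfolder` argument of `from_pretrained` can point at them. The helper name `load_text_components` and the constant `SD21_REPO` are hypothetical, introduced only for illustration.

```python
SD21_REPO = "stabilityai/stable-diffusion-2-1-base"


def load_text_components(repo_id: str = SD21_REPO):
    """Hypothetical helper: load the tokenizer and text encoder from a diffusers-style repo."""
    # Imported inside the function so that defining the helper
    # does not itself require transformers to be installed.
    from transformers import CLIPTextModel, CLIPTokenizer

    # Assumption: the repo keeps each component in its own subfolder.
    tokenizer = CLIPTokenizer.from_pretrained(repo_id, subfolder="tokenizer")
    text_encoder = CLIPTextModel.from_pretrained(repo_id, subfolder="text_encoder")
    return tokenizer, text_encoder


# Usage (downloads files from the Hub on the first call):
# tokenizer, text_encoder = load_text_components()
# print(text_encoder.config.hidden_size)
```

If this works, `text_encoder.config.hidden_size` should report the size the SD2.1 UNet expects, which is a quick way to confirm the right encoder was loaded.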
