## Overview

Stable Diffusion models are available in different formats depending on the framework they were trained and saved with, and where you download them from. Converting these formats to 🤗 Diffusers lets you use all the features the library supports, such as swapping in different schedulers, and more.

This notebook shows how to convert Stable Diffusion formats so they are compatible with 🤗 Diffusers.

## PyTorch .ckpt

The checkpoint or `.ckpt` format is commonly used to store and save models. The `.ckpt` file contains the entire model and is typically several gigabytes in size. While you can load and use a `.ckpt` file directly with the `from_single_file()` method, it is generally better to convert the `.ckpt` file to 🤗 Diffusers so both formats are available.
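As a rough sketch of the direct-loading route mentioned above, the snippet below wraps `from_single_file()` in a small helper and optionally writes the result out in the 🤗 Diffusers layout with `save_pretrained()`. The helper name `load_ckpt_pipeline` and the paths in the usage comment are illustrative assumptions, and the exact loading behavior may differ between Diffusers versions.

```python
from typing import Optional


def load_ckpt_pipeline(ckpt_path: str, save_dir: Optional[str] = None):
    """Load a single-file checkpoint and optionally save it in the Diffusers layout.

    `load_ckpt_pipeline` is a hypothetical helper name used only for illustration.
    """
    # Imported lazily so that defining the helper does not require diffusers
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_single_file(ckpt_path)
    if save_dir is not None:
        # Writes the multi-folder Diffusers format next to the original .ckpt
        pipe.save_pretrained(save_dir)
    return pipe


# Example usage (placeholder paths, not files from this notebook):
# pipe = load_ckpt_pipeline("path/to/model.ckpt", save_dir="path/to/diffusers_model")
```

Saving once with `save_pretrained()` keeps the original `.ckpt` untouched, so both formats remain on disk.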

There are two options for converting a `.ckpt` file to 🤗 Diffusers:

### Convert with a Space

The easiest and most convenient way to convert a `.ckpt` file is to use the SD to Diffusers Space: just follow the instructions on the Space. This approach works well for basic models, but it may struggle with more customized models. You will know the Space failed if it returns an empty PR or an error. In that case, you can try converting the `.ckpt` file with a script.

### Convert with a script

The conversion script is [convert_original_stable_diffusion_to_diffusers.py](https://github.com/huggingface/diffusers/blob/main/scripts/convert_original_stable_diffusion_to_diffusers.py). Its most important arguments are:

* `checkpoint_path`: the path to the `.ckpt` file you want to convert.
* `original_config_file`: a YAML file defining the configuration of the original architecture. If you cannot find this file, try searching for the YAML file in the GitHub repo where you found the `.ckpt` file.
  * For example, you can take the `cldm_v15.yaml` file from the ControlNet repository because the TemporalNet model is an SD v1.5 and ControlNet model.
* `dump_path`: the path where the converted model is written.

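For example, the following command converts the TemporalNet checkpoint (the `--controlnet` flag is passed because it is a ControlNet model):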

In [None]:
!python ../diffusers/scripts/convert_original_stable_diffusion_to_diffusers.py --checkpoint_path temporalnetv3.ckpt --original_config_file cldm_v15.yaml --dump_path ./ --controlnet
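If you prefer to drive the conversion from Python, the command above can be assembled as an argument list. This is a minimal sketch: `build_convert_command` and `run_conversion` are hypothetical helper names, and only the flags that appear in the command above are used.

```python
import shlex
import subprocess
from typing import List


def build_convert_command(
    script_path: str,
    checkpoint_path: str,
    original_config_file: str,
    dump_path: str,
    controlnet: bool = False,
) -> List[str]:
    """Assemble the conversion command as an argument list (hypothetical helper)."""
    cmd = [
        "python", script_path,
        "--checkpoint_path", checkpoint_path,
        "--original_config_file", original_config_file,
        "--dump_path", dump_path,
    ]
    if controlnet:
        # Only added for ControlNet checkpoints such as TemporalNet
        cmd.append("--controlnet")
    return cmd


def run_conversion(cmd: List[str]) -> None:
    """Run the assembled command and raise if the script exits with an error."""
    subprocess.run(cmd, check=True)


cmd = build_convert_command(
    "../diffusers/scripts/convert_original_stable_diffusion_to_diffusers.py",
    "temporalnetv3.ckpt",
    "cldm_v15.yaml",
    "./",
    controlnet=True,
)
# Print the command for inspection; call run_conversion(cmd) to execute it
print(shlex.join(cmd))
```

Passing a list to `subprocess.run` avoids shell quoting issues when paths contain spaces.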

## A1111 LoRA files

A1111 is a popular web UI for Stable Diffusion that supports model sharing platforms like Civitai. Models trained with the LoRA technique are especially popular because they're fast to train and have a much smaller file size than a fully finetuned model. 🤗 Diffusers supports loading A1111 LoRA checkpoints with [`load_lora_weights()`](https://huggingface.co/docs/diffusers/v0.18.0/en/api/pipelines/stable_diffusion/depth2img#diffusers.StableDiffusionDepth2ImgPipeline.load_lora_weights).

In [2]:
# Pick a device: CUDA on Kaggle, MPS on macOS (Apple Silicon), otherwise CPU
import os
import platform

torch_device = 'cpu'

if 'kaggle' in os.environ.get('KAGGLE_URL_BASE', 'localhost'):
    torch_device = 'cuda'
else:
    torch_device = 'mps' if platform.system() == 'Darwin' else 'cpu'

In [6]:
# Let operations that the MPS backend does not implement fall back to the CPU
os.environ['PYTORCH_ENABLE_MPS_FALLBACK'] = '1'

In [None]:
torch_device
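The device check above relies on a Kaggle environment variable and the operating system name. As an alternative sketch, you can ask PyTorch directly which backends are usable. The helper names `pick_device` and `detect_device` are assumptions introduced here, not part of the original notebook.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Prefer CUDA, then MPS, then CPU (hypothetical helper)."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Query PyTorch for available backends; report 'cpu' if torch is missing."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # Older PyTorch builds may not ship the MPS backend module at all
    mps_backend = getattr(torch.backends, "mps", None)
    mps_available = bool(mps_backend is not None and mps_backend.is_available())
    return pick_device(torch.cuda.is_available(), mps_available)


print(detect_device())
```

Keeping the priority logic in `pick_device` separate from the PyTorch queries makes the choice easy to check without any accelerator present.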

In [3]:
from diffusers import DiffusionPipeline, UniPCMultistepScheduler
import torch

pipe = DiffusionPipeline.from_pretrained(
    'runwayml/stable-diffusion-v1-5', torch_dtype=torch.float16, safety_checker=None
).to(torch_device)

pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)

You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing `safety_checker=None`. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .


Download a LoRA checkpoint from Civitai; this example uses the FilmVelvia3 checkpoint (saved as `filmvelvia3.safetensors`), but feel free to try out any LoRA checkpoint.

In [None]:
# download the safetensors weights from Civitai
!wget https://civitai.com/api/download/models/112969 -O filmvelvia3.safetensors

Load the LoRA checkpoint into the pipeline with the `load_lora_weights()` method.

In [4]:
pipe.load_lora_weights('.', weight_name='filmvelvia3.safetensors')

In [7]:
prompt = "<lora:FilmVelvia3:0.6>, young 1girl with braided hair and fluffy cat ears, dressed in Off-Shoulder Sundress, standing in a rustic farm setting. She has a soft, gentle smile, expressive eyes and sexy cleavage. The background features a charming barn, fields of golden wheat, and a clear blue sky. The composition should be bathed in the warm, golden hour light, with a gentle depth of field and soft bokeh to accentuate the pastoral serenity. Capture the image as if it were taken on an old-school 35mm film for added charm, looking at viewer"
negative_prompt = "((worst quality, low quality), bad_pictures, negative_hand-neg:1.2),"

images = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=512,
    height=512,
    num_inference_steps=27,
    num_images_per_prompt=4,
    generator=torch.manual_seed(1793772152)
).images

The following part of your input was truncated because CLIP can only handle sequences up to 77 tokens: ['be bathed in the warm, golden hour light, with a gentle depth of field and soft bokeh to accentuate the pastoral serenity. capture the image as if it were taken on an old - school 3 5 mm film for added charm, looking at viewer']
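The prompt above carries an A1111-style `<lora:FilmVelvia3:0.6>` tag. It appears that 🤗 Diffusers passes the prompt text straight to the text encoder rather than interpreting such tags, so the tag also consumes part of the 77-token budget. The sketch below strips these tags and returns their weights; `split_lora_tags` is a hypothetical helper, and the regular expression only assumes the `<lora:name:weight>` shape seen in this notebook.

```python
import re
from typing import List, Tuple

# Matches tags shaped like <lora:NAME:WEIGHT>, e.g. <lora:FilmVelvia3:0.6>
LORA_TAG = re.compile(r"<lora:([^:>]+):([0-9]*\.?[0-9]+)>")


def split_lora_tags(prompt: str) -> Tuple[str, List[Tuple[str, float]]]:
    """Return the prompt without LoRA tags, plus a list of (name, weight) pairs."""
    tags = [(name, float(weight)) for name, weight in LORA_TAG.findall(prompt)]
    # Remove the tags, then trim the commas and spaces they leave at the edges
    cleaned = LORA_TAG.sub("", prompt).strip(" ,")
    return cleaned, tags


cleaned, tags = split_lora_tags("<lora:FilmVelvia3:0.6>, a rustic farm at golden hour")
print(cleaned)
print(tags)
```

In the Diffusers version this notebook targets, the extracted weight can then typically be applied at call time, for example `pipe(prompt=cleaned, cross_attention_kwargs={"scale": tags[0][1]})`; check the documentation of your installed version, since the way LoRA strength is set has changed over releases.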



Display the generated images in a grid:

In [None]:
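from PIL import Image

def image_grid(imgs, rows=2, cols=2):
    """Paste equally sized PIL images into a rows x cols grid, filling row by row."""
    w, h = imgs[0].size
    grid = Image.new('RGB', size=(cols * w, rows * h))

    for i, img in enumerate(imgs):
        # Column index is i % cols, row index is i // cols
        grid.paste(img, box=(i % cols * w, i // cols * h))
    return grid

# The pipeline call above returns 4 images, which fill the default 2x2 grid
image_grid(images)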