<a href="https://www.kaggle.com/code/aisuko/using-civitai-checkpoints-lora-with-diffusers?scriptVersionId=161773820" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# Overview

This notebook shows how to use custom checkpoints and LoRA adapters from CivitAI with diffusers.

In [None]:
!pip install diffusers==0.23.1
!pip install omegaconf==2.3.0
!pip install compel==2.0.2

First we configure the downloads: a custom merged checkpoint based on `StableDiffusion V1.5`, and a LoRA adapter.

In [None]:
import os
from kaggle_secrets import UserSecretsClient

user_secrets = UserSecretsClient()
civitai_key = user_secrets.get_secret("CIVITAI")

# https://education.civitai.com/civitais-guide-to-downloading-via-api/
os.environ["CHECKPOINT_MERGED"] = "meichidark_mix_v3.5.safetensors"
os.environ["CHECKPOINT_URL"] = "https://civitai.com/api/download/models/90778?token=" + civitai_key
os.environ["LORA_ADAPTER"] = "armor.safetensors"
os.environ["LORA_URL"] = "https://civitai.com/api/download/models/224823?token=" + civitai_key
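The download URLs above are built by string concatenation. As a minimal sketch, assuming the token may contain characters that need escaping, the same URL can be assembled with `urllib.parse.urlencode`. The helper name `build_civitai_download_url` and the `"dummy-token"` value are hypothetical; only the base URL and the model version id `90778` come from the cell above.

```python
from urllib.parse import urlencode

CIVITAI_DOWNLOAD_BASE = "https://civitai.com/api/download/models"


def build_civitai_download_url(model_version_id: int, token: str) -> str:
    """Return the download URL for one model version, with the token URL-encoded."""
    query = urlencode({"token": token})
    return f"{CIVITAI_DOWNLOAD_BASE}/{model_version_id}?{query}"


# "dummy-token" is a placeholder, not a real API key.
print(build_civitai_download_url(90778, "dummy-token"))
```

In the notebook, the second argument would be `civitai_key` read from the Kaggle secrets.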

# Downloading the custom checkpoint (merged)

We download the merged checkpoint first, then the LoRA adapter.

In [None]:
!wget -q -O "${CHECKPOINT_MERGED}" "${CHECKPOINT_URL}" --content-disposition

In [None]:
!wget -q -O "${LORA_ADAPTER}" "${LORA_URL}" --content-disposition
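Because `wget -q` is silent, a failed download (for example an error page returned for an invalid token) can go unnoticed until the pipeline fails to load. The following is a minimal sketch of a sanity check, relying on the safetensors layout of an 8-byte little-endian header length followed by a JSON header. The helper name `looks_like_safetensors` and the size cap are assumptions for illustration, not part of any library.

```python
import json
import struct


def looks_like_safetensors(path: str, max_header_bytes: int = 100_000_000) -> bool:
    """Heuristic check: 8-byte little-endian header length, then a JSON object."""
    with open(path, "rb") as f:
        prefix = f.read(8)
        if len(prefix) < 8:
            return False
        (header_len,) = struct.unpack("<Q", prefix)
        # An HTML error page decodes to an implausibly large header length.
        if header_len == 0 or header_len > max_header_bytes:
            return False
        header = f.read(header_len)
    if len(header) < header_len:
        return False
    try:
        return isinstance(json.loads(header.decode("utf-8")), dict)
    except ValueError:  # covers both UnicodeDecodeError and JSONDecodeError
        return False
```

In the notebook this could be called as `looks_like_safetensors(os.environ["CHECKPOINT_MERGED"])` before loading the pipeline.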

# Loading the custom checkpoint (merged)

In [None]:
from diffusers import StableDiffusionPipeline
import torch

# Loading the pretrained checkpoint(merged) from downloaded safetensors
pipe=StableDiffusionPipeline.from_single_file(
    os.getenv('CHECKPOINT_MERGED'),
    use_safetensors=True,
    torch_dtype=torch.float16 # For CUDA
)
pipe.enable_model_cpu_offload()
pipe

# Loading the custom LoRA

Here we use only one LoRA adapter, but multiple adapters can also be loaded and fused one after another, each with its own scale:

```python
# Using multiple LoRAs with different scaling factors.
# The file names below are placeholders; extend both lists as needed.
lora_dirs = ["lora1.safetensors", "lora2.safetensors"]
lora_scales = [0.7, 0.7]

for ldir, lsc in zip(lora_dirs, lora_scales):
    # Iteratively add a new LoRA.
    pipe.load_lora_weights(ldir)
    # Fuse it into the base weights with its own scale.
    pipe.fuse_lora(lora_scale=lsc)
```
```
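To give a feel for what fusing with a scale means, here is a toy sketch in plain Python: the low-rank update `up @ down` is multiplied by the scale and added to the base weight. This is a simplification under stated assumptions. It ignores the alpha/rank factor and the many layers a real adapter touches, and the names `matmul` and `fuse_lora_delta` are hypothetical, not diffusers APIs.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def fuse_lora_delta(weight, lora_up, lora_down, scale):
    """Return weight + scale * (lora_up @ lora_down) without modifying the inputs."""
    delta = matmul(lora_up, lora_down)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


weight = [[1.0, 0.0], [0.0, 1.0]]  # toy 2x2 base weight
lora_up = [[1.0], [2.0]]           # shape (2, 1): a rank-1 update
lora_down = [[0.5, 0.5]]           # shape (1, 2)

fused = fuse_lora_delta(weight, lora_up, lora_down, scale=0.7)
print(fused)
```

Fusing a second adapter applies the same step to the already fused weight, which is why the loop above can add adapters one at a time.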

In [None]:
pipe.load_lora_weights(os.getenv('LORA_ADAPTER'))
pipe

# Setting Clip_skip

In [None]:
clip_skip=2
# Drop the last `clip_skip` layers of the CLIP text encoder.
pipe.text_encoder.text_model.encoder.layers=pipe.text_encoder.text_model.encoder.layers[:-clip_skip]
# Disable the safety checker.
pipe.safety_checker=None
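One detail of the slice `[:-clip_skip]` is worth illustrating: with `clip_skip=0` it evaluates to `[:0]` and would remove every layer. The sketch below uses a plain list as a stand-in for the encoder layers, assuming a 12-layer text encoder as in Stable Diffusion v1.5. The helper name `drop_last_layers` is hypothetical.

```python
def drop_last_layers(layers, clip_skip):
    """Return the layers without the last `clip_skip` entries; 0 keeps them all."""
    if not 0 <= clip_skip < len(layers):
        raise ValueError("clip_skip must be between 0 and len(layers) - 1")
    return layers[: len(layers) - clip_skip]


toy_layers = [f"layer_{i}" for i in range(12)]  # stand-in for the encoder layers

kept = drop_last_layers(toy_layers, clip_skip=2)
print(len(kept), kept[-1])

# The naive slice removes everything when clip_skip is 0.
print(len(toy_layers[:-0]))
```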

# Loading scheduler

In [None]:
from diffusers import (EulerAncestralDiscreteScheduler,
                       EulerDiscreteScheduler,
                       DPMSolverMultistepScheduler)
# 'EDS' selects EulerDiscreteScheduler, 'EADS' selects EulerAncestralDiscreteScheduler;
# any other value falls back to DPMSolverMultistepScheduler.
scheduler='none'

if scheduler=='EDS':
    pipe.scheduler=EulerDiscreteScheduler.from_config(pipe.scheduler.config)
elif scheduler=='EADS':
    pipe.scheduler=EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
else:
    pipe.scheduler=DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
pipe.enable_model_cpu_offload()
pipe
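If more schedulers are added, the `if`/`elif` chain can be replaced by a lookup table. The following is a sketch only: the table, the helper names `resolve_scheduler_class_name` and `set_scheduler`, and the lazy import are illustrative choices, while the three class names and the `from_config` call are the ones used in the cell above.

```python
# Hypothetical lookup table: short keys from the cell above mapped to diffusers class names.
SCHEDULER_CLASS_NAMES = {
    "EDS": "EulerDiscreteScheduler",
    "EADS": "EulerAncestralDiscreteScheduler",
}
DEFAULT_SCHEDULER_CLASS_NAME = "DPMSolverMultistepScheduler"


def resolve_scheduler_class_name(key: str) -> str:
    """Map a short key to a scheduler class name, falling back to the default."""
    return SCHEDULER_CLASS_NAMES.get(key, DEFAULT_SCHEDULER_CLASS_NAME)


def set_scheduler(pipe, key: str):
    """Swap the pipeline's scheduler, reusing its current scheduler config."""
    import diffusers  # imported lazily so the lookup works without diffusers installed

    scheduler_cls = getattr(diffusers, resolve_scheduler_class_name(key))
    pipe.scheduler = scheduler_cls.from_config(pipe.scheduler.config)
    return pipe


print(resolve_scheduler_class_name("none"))
```

With the pipeline loaded, `set_scheduler(pipe, "EADS")` would have the same effect as the `elif` branch above.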

# Preprocess the prompt

We want to overcome the 77-token limit of the CLIP text encoder, so we use [compel](https://github.com/damian0815/compel) to embed the prompts into PyTorch tensors.

In [None]:
from compel import Compel


embeddings=True

prompt='((a female wearing a red hood)), fantasy theme, medieval, fierce look, ((dynamic poses)), tattoo on her arms, white ruffled sleeve shirt, choker,  gloves, corset, leather pants, belt, waist sash, cape,  (((masterpiece))),  ((best quality)), ((intricate detailed)), ((Hyperrealistic)), a woman with perfect body figure wearing cyberpunk cloth, pale skin, ((huge breast)),  highly detailed, illustration, perfect hands, detailed fingers, beautiful detailed eyes, red hair, black hair, multiple color hair, long hair, (fantasy:1.2),  armor, detailed background, tavern , night, light by candle, lens flare, tempting look, looking at the viewer, from above,  <lora:adventurers_v1:1>  <lora:sxz-niji-v2:0.6>'
negative_prompt='easynegative, badhandv4, (low quality, worst quality:1.4), poorly drawn hands, bad anatomy, monochrome, { long body }, bad anatomy, liquid body, malformed, mutated, anatomical nonsense, bad proportions, uncoordinated body, unnatural body, disfigured, ugly, gross proportions, mutation, disfigured, deformed, { mutation}, {poorlydrawn}, bad hand, mutated hand, bad fingers, mutated fingers,   badhandv4, liquid tongue, long neck, fused ears, bad ears, poorly drawn ears, extra ears, liquid ears, heavy ears, missing ears, fused animal ears, bad animal ears, poorly drawn animal ears, extra animal ears, liquid animal ears, heavy animal ears, missing animal ears, bad hairs, poorly drawn hairs, fused hairs, bad face, fused face, poorly drawn face, cloned face, big face, long face, bad eyes, fused eyes poorly drawn eyes, extra eyes, bad mouth, fused mouth, poorly drawn mouth, bad tongue, big mouth, bad perspective, bad objects placement, NSFW, bad weapon, fused weapon, extra weapons, poorly weapon, bad sword, poor sword'


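# truncate_long_prompts=False lets compel encode prompts longer than 77 tokens in chunks
# (with the default, long prompts are truncated).
compel=Compel(tokenizer=pipe.tokenizer, text_encoder=pipe.text_encoder, truncate_long_prompts=False)
con_embeds=compel([prompt])
neg_embeds=compel([negative_prompt])
# Pad both embeddings to the same length; the pipeline requires matching shapes.
[con_embeds, neg_embeds]=compel.pad_conditioning_tensors_to_same_length([con_embeds, neg_embeds])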

print(con_embeds.size())
print(neg_embeds.size())


# Warning
# The manual approach below can produce prompt and negative-prompt embeddings
# of different sizes (con_embeds vs. neg_embeds), so it is kept for reference only.

# max_length=pipe.tokenizer.model_max_length
# def get_prompt_embeddings(prompt, negative_prompt):
#     count_prompt=len(prompt.split(' '))
#     count_negative_prompt=len(negative_prompt.split(' '))
    
#     if count_prompt>=count_negative_prompt:
#         input_ids=pipe.tokenizer(prompt, 
#                                   truncation=False, 
#                                   return_tensors='pt').input_ids.to('cuda')
#         shape_max_length=input_ids.shape[-1]
        
#         negative_ids=pipe.tokenizer(negative_prompt, truncation=False, 
#                                     padding='max_length', 
#                                     max_length=shape_max_length, 
#                                     return_tensors='pt').input_ids.to('cuda')
#     else:
#         negative_ids=pipe.tokenizer(negative_prompt, truncation=False,
#                                    return_tensors='pt').input_ids.to('cuda')
#         shape_max_length=negative_ids.shape[-1]
#         input_ids=pipe.tokenizer(prompt, truncation=False,
#                                 padding='max_length',
#                                 max_length=shape_max_length,
#                                 return_tensors='pt').input_ids.to('cuda')
        
#     concat_embeds=[]
#     neg_embeds=[]
#     for i in range(0,shape_max_length, max_length):
#         concat_embeds.append(pipe.text_encoder(input_ids[:,i:i+max_length])[0])
#         neg_embeds.append(pipe.text_encoder(negative_ids[:,i:i+max_length])[0])
    
#     prompt_embeddings=torch.cat(concat_embeds, dim=1)
#     negative_prompt_embeddings=torch.cat(neg_embeds, dim=1)
    
#     return prompt_embeddings, negative_prompt_embeddings

# if embeddings:
#     prompt_embed, negative_embed=get_prompt_embeddings(prompt, negative_prompt)
# print(prompt_embed.size())
# print(negative_embed.size())
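The size mismatch mentioned in the warning comes down to the two token sequences not being padded to a common length before they are split into encoder-sized windows. The sketch below shows the idea on plain lists of token ids. It is a simplification under stated assumptions: it ignores the start and end tokens that real CLIP chunking adds to each window, `pad_id=0` is a placeholder rather than the tokenizer's actual pad id, and the helper name `pad_to_same_chunks` is hypothetical.

```python
import math


def pad_to_same_chunks(ids_a, ids_b, window=77, pad_id=0):
    """Pad both id lists to one common multiple of `window`, then split into windows."""
    longest = max(len(ids_a), len(ids_b), 1)
    target = math.ceil(longest / window) * window

    def pad(ids):
        return list(ids) + [pad_id] * (target - len(ids))

    def chunks(ids):
        return [ids[i:i + window] for i in range(0, target, window)]

    return chunks(pad(ids_a)), chunks(pad(ids_b))


# Toy ids: a 200-token prompt and a 90-token negative prompt.
prompt_chunks, negative_chunks = pad_to_same_chunks(list(range(200)), list(range(90)))
print(len(prompt_chunks), len(negative_chunks))
```

Measuring length in tokens rather than in space-separated words is what keeps the two results aligned.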

In [None]:
if embeddings:
    imgs=pipe(prompt_embeds=con_embeds,
             negative_prompt_embeds=neg_embeds,
             width=512,
             height=832,
             guidance_scale=7.0,
             num_inference_steps=35,
             num_images_per_prompt=1,
             generator=torch.manual_seed(93421173)).images
else:
    imgs=pipe(prompt=prompt,
              negative_prompt=negative_prompt,
              width=512,height=832,
              guidance_scale=12.0,num_inference_steps=50,
              num_images_per_prompt=1,
              generator=torch.manual_seed(0)).images
imgs[0]