<a href="https://colab.research.google.com/github/R3gm/stablepy/blob/main/stablepy_demo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Stablepy

Install dependencies

In [None]:
!pip install stablepy==0.3.0 -q

To use the version with the latest changes, you can install directly from the repository.

`pip install -q git+https://github.com/R3gm/stablepy.git`

Download the models and other files

In [None]:
%cd /content/

# Model
!wget https://huggingface.co/frankjoshua/toonyou_beta6/resolve/main/toonyou_beta6.safetensors

# VAE
!wget https://huggingface.co/fp16-guy/anything_kl-f8-anime2_vae-ft-mse-840000-ema-pruned_blessed_clearvae_fp16_cleaned/resolve/main/anything_fp16.safetensors

# LoRAs
!wget https://civitai.com/api/download/models/183149 --content-disposition
!wget https://civitai.com/api/download/models/97655 --content-disposition

# Embeddings
!wget https://huggingface.co/embed/negative/resolve/main/bad-hands-5.pt
!wget https://huggingface.co/embed/negative/resolve/main/bad-artist.pt

# Upscaler
!wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_x4plus.pth
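
Before loading anything, it can help to confirm that every download landed where the later cells expect it. The sketch below is a hypothetical helper, not part of stablepy. The two LoRA file names are the ones referenced by the later cells, and they assume that Civitai's `--content-disposition` names have not changed.

```python
from pathlib import Path

# File names taken from the download cell and from the paths used in later cells.
EXPECTED_FILES = [
    "toonyou_beta6.safetensors",           # model
    "anything_fp16.safetensors",           # VAE
    "EmptyEyes_Diffuser_v10.safetensors",  # LoRA (name assumed from a later cell)
    "FE_V2.safetensors",                   # LoRA (name assumed from a later cell)
    "bad-hands-5.pt",                      # embedding
    "bad-artist.pt",                       # embedding
    "RealESRGAN_x4plus.pth",               # upscaler
]


def find_missing(file_names, base_dir="."):
    """Return the names that do not exist as files inside base_dir."""
    base = Path(base_dir)
    return [name for name in file_names if not (base / name).is_file()]


missing = find_missing(EXPECTED_FILES, "/content")
if missing:
    print("Missing downloads:", ", ".join(missing))
else:
    print("All files are in place.")
```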

# Inference with Stable diffusion 1.5

First, we pass the path of the model we will use.

The default task is txt2img, but it can be changed to openpose, canny, mlsd, scribble, softedge, segmentation, depth, normalbae, lineart, shuffle, ip2p, img2img, or inpaint.

In [None]:
from stablepy import Model_Diffusers
import torch

model_path = "./toonyou_beta6.safetensors"
vae_path = "./anything_fp16.safetensors"

model = Model_Diffusers(
    base_model_id = model_path, # path to the model
    task_name = "canny", # task
    vae_model = vae_path, # path vae
)

To switch tasks, we can call `model.load_pipe()` and specify the new task or model. This will load the necessary components.

In [None]:
model.load_pipe(
    base_model_id = model_path, # path to the model
    task_name = "txt2img", # task
    vae_model = None, # Use default VAE
)

We will use a basic txt2img task in which we can specify common parameters such as LoRAs, embeddings, an upscaler, etc.

In [None]:
from IPython.display import display

lora1_path = "./EmptyEyes_Diffuser_v10.safetensors"
lora2_path = "./FE_V2.safetensors" # pixel art lora
upscaler_path = "./RealESRGAN_x4plus.pth"

images, image_list = model(
    prompt = "pixel art (masterpiece, best quality), 1girl, collarbone, wavy hair, looking at viewer, blurry foreground, upper body, necklace, contemporary, plain pants, ((intricate, print, pattern)), ponytail, freckles, red hair, dappled sunlight, smile, happy,",
    negative_prompt = "(worst quality, low quality, letterboxed), bad_artist_token, bad_hand_token",
    img_width = 513,
    img_height = 1022,
    num_images = 1,
    num_steps = 30,
    guidance_scale = 8.0,
    clip_skip = True, # Use the penultimate CLIP layer; equivalent to "clip skip 2" in other implementations.
    seed = -1, # random seed
    sampler="DPM++ SDE Karras",
    syntax_weights="Compel",  # (word)weight and (word)+ for prompts weights

    lora_A = lora1_path,
    lora_scale_A = 0.8,
    lora_B = lora2_path,
    lora_scale_B = 0.9,

    textual_inversion=[("bad_artist_token", "./bad-artist.pt"), ("bad_hand_token", "./bad-hands-5.pt")], # A list of tuples: [("<activation_token>", "<embedding_path>"), ...]

    upscaler_model_path = upscaler_path, # Upscale the image and Hires-fix
    upscaler_increases_size=1.5,
    hires_steps = 25,
    hires_denoising_strength = 0.35,
    hires_prompt = "", # If this is left as is, the main prompt will be used instead.
    hires_negative_prompt = "",
    hires_sampler = "Use same sampler",

    # By default, the generated images are saved in the 'images' folder inside the current directory.
    image_storage_location = "./images",

    # You can disable saving the images with this parameter.
    save_generated_images = False,
)

for image in images:
  display(image)
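
Stable Diffusion 1.5 works on latents that are 8 times smaller than the image, so pipelines generally expect a width and height divisible by 8. This notebook does not say whether stablepy adjusts other values on its own, so the helper below is only a hypothetical pre-check. It snaps a requested size to the nearest multiple of 8.

```python
def snap_to_multiple(value, multiple=8):
    """Round a positive integer to the nearest multiple (ties round up)."""
    snapped = (value + multiple // 2) // multiple * multiple
    # Never return 0: fall back to one full multiple for very small inputs.
    return max(multiple, snapped)


width, height = snap_to_multiple(513), snap_to_multiple(1022)
print(width, height)  # 512 1024
```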

## ControlNet

In [None]:
model.load_pipe(
    base_model_id = model_path,
    task_name = "canny",
    # Use default VAE
)

Select a control image

In [None]:
from PIL import Image

!wget https://huggingface.co/lllyasviel/sd-controlnet-canny/resolve/main/images/bird.png -q

control_image = "bird.png"
image = Image.open(control_image)
display(image)

Inference with canny

In [None]:
images, image_list = model(
    prompt = "(masterpiece, best quality), bird",
    negative_prompt = "(worst quality, low quality, letterboxed)",
    image = control_image,
    # preprocessor_name = "None", # canny does not need preprocessor_name; its preprocessor is active by default
    preprocess_resolution = 512, # Resolution of the image produced by the preprocessor.
    image_resolution = 768, # The equivalent resolution to be used for inference.
    controlnet_conditioning_scale = 1.0, # ControlNet Output Scaling in UNet
    control_guidance_start = 0.0, # ControlNet Start Threshold (%)
    control_guidance_end= 1.0, # ControlNet Stop Threshold (%)

    upscaler_model_path = upscaler_path,
    upscaler_increases_size=1.4,

    # By default, 'hires-fix' is applied whenever an upscaler is used; to deactivate it, set hires_steps to 0
    hires_steps = 0,
)

for image in images:
  display(image)

Valid `preprocessor_name` depending on the task:


| Task name    | Preprocessor name |
|--------------|-------------------|
| openpose     | "None", "Openpose" |
| scribble     | "None", "HED", "Pidinet" |
| softedge     | "None", "HED", "Pidinet", "HED safe", "Pidinet safe" |
| segmentation | "None", "UPerNet" |
| depth        | "None", "DPT", "Midas" |
| normalbae    | "None", "NormalBae" |
| lineart      | "None", "Lineart", "Lineart coarse", "None (anime)", "LineartAnime" |
| shuffle      | "None", "ContentShuffle" |
| canny        |                   |
| mlsd         |                   |
| ip2p         |                   |
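
The table can also be kept as a small lookup so that a typo in `preprocessor_name` is caught before a long inference run. This is a hypothetical helper, not part of stablepy; the values are copied from the table, and tasks with an empty cell are treated as taking no `preprocessor_name` at all.

```python
# Copied from the table above; an empty list means the task has no selectable preprocessor.
PREPROCESSORS = {
    "openpose": ["None", "Openpose"],
    "scribble": ["None", "HED", "Pidinet"],
    "softedge": ["None", "HED", "Pidinet", "HED safe", "Pidinet safe"],
    "segmentation": ["None", "UPerNet"],
    "depth": ["None", "DPT", "Midas"],
    "normalbae": ["None", "NormalBae"],
    "lineart": ["None", "Lineart", "Lineart coarse", "None (anime)", "LineartAnime"],
    "shuffle": ["None", "ContentShuffle"],
    "canny": [],
    "mlsd": [],
    "ip2p": [],
}


def check_preprocessor(task_name, preprocessor_name):
    """Return preprocessor_name if the table lists it for task_name, else raise ValueError."""
    if task_name not in PREPROCESSORS:
        raise ValueError(f"No table entry for task {task_name!r}")
    valid = PREPROCESSORS[task_name]
    if not valid:
        raise ValueError(f"Task {task_name!r} takes no preprocessor_name")
    if preprocessor_name not in valid:
        raise ValueError(f"{preprocessor_name!r} is not valid for {task_name!r}; choose from {valid}")
    return preprocessor_name


print(check_preprocessor("depth", "Midas"))  # Midas
```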


## Adetailer

In [None]:
model.load_pipe(
    base_model_id = model_path,
    task_name = "txt2img",
)

Adetailer gives good results only when its parameters are well matched. It also helps to keep `strength` in `adetailer_inpaint_params` at a low value, below 0.4.

In [None]:
# These are the default parameters for adetailer A; modify them if needed. The same applies to adetailer B.
adetailer_params_A = {
    "face_detector_ad" : True,
    "person_detector_ad" : True,
    "hand_detector_ad" : False,
    "prompt": "", # The main prompt will be used if left empty
    "negative_prompt" : "",
    "strength" : 0.35, # keep this value low
    "mask_dilation" : 4,
    "mask_blur" : 4,
    "mask_padding" : 32,
    "inpaint_only" : True, # better
    "sampler" : "Use same sampler",
}

images, image_list = model(
    prompt = "(masterpiece, best quality), 1girl, collarbone, wavy hair, looking at viewer, blurry foreground, upper body, necklace, contemporary, plain pants, ((intricate, print, pattern)), ponytail, freckles, red hair, dappled sunlight, smile, happy,",
    negative_prompt = "(worst quality, low quality, letterboxed)",
    img_width = 512,
    img_height = 1024,
    num_images = 1,
    num_steps = 30,
    guidance_scale = 8.0,
    clip_skip = True,
    seed = 33,
    sampler="DPM++ SDE Karras",

    FreeU=True, # Improves diffusion model sample quality at no extra cost.
    adetailer_A=True,
    adetailer_A_params=adetailer_params_A,

    adetailer_B=True, # If we don't pass adetailer_B_params, the default values are used.

    # By default, the upscaler will be deactivated if we don't pass a model to it.
    # It's also valid to use a url to the model, Lanczos or Nearest.
    #upscaler_model_path = "Lanczos",
)

for image in images:
  display(image)
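
Since the dictionary above is described as adetailer A's defaults, a second configuration can be derived from it by overriding only a few keys. The helper below is a hypothetical sketch, not part of stablepy. It rejects misspelled keys and prints a note when `strength` reaches the 0.4 limit recommended above.

```python
# Defaults copied from adetailer_params_A in the cell above.
ADETAILER_DEFAULTS = {
    "face_detector_ad": True,
    "person_detector_ad": True,
    "hand_detector_ad": False,
    "prompt": "",
    "negative_prompt": "",
    "strength": 0.35,
    "mask_dilation": 4,
    "mask_blur": 4,
    "mask_padding": 32,
    "inpaint_only": True,
    "sampler": "Use same sampler",
}


def make_adetailer_params(**overrides):
    """Return a copy of the defaults with the given keys replaced."""
    unknown = set(overrides) - set(ADETAILER_DEFAULTS)
    if unknown:
        raise KeyError(f"Unknown adetailer keys: {sorted(unknown)}")
    params = {**ADETAILER_DEFAULTS, **overrides}
    if params["strength"] >= 0.4:
        print("Note: strength values below 0.4 are recommended for adetailer.")
    return params


# Example: enable the hand detector and lower the strength slightly.
adetailer_params_B = make_adetailer_params(hand_detector_ad=True, strength=0.3)
print(adetailer_params_B["hand_detector_ad"], adetailer_params_B["strength"])  # True 0.3
```

The resulting dictionary could then be passed as `adetailer_B_params=adetailer_params_B` alongside `adetailer_B=True`.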

## Inpaint

In [None]:
model.load_pipe(
    base_model_id = model_path,
    task_name = "inpaint",
)

We can specify the path of our mask image, but we can also generate the mask interactively, which is what we'll do in this example.

You need a mouse to draw on this canvas.

In [None]:
images, image_list = model(
    image = control_image,
    # image_mask = "/mask.png",
    prompt = "a blue bird",
    strength = 0.5,
    negative_prompt = "(worst quality, low quality, letterboxed)",
    image_resolution = 768, # The equivalent resolution to be used for inference.
    sampler="DPM++ SDE Karras",
)

for image in images:
  display(image)

If you're using a device without a mouse or Jupyter Notebook outside of Colab, the function to create a mask automatically won't work correctly. Therefore, you'll need to specify the path of your mask image manually.
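
When the drawing canvas is not available, a simple mask can be produced in code. The sketch below is a hypothetical helper that computes a centred rectangle and, with Pillow, saves a black image with that rectangle filled in white. It assumes the common inpainting convention that white marks the region to repaint; this notebook does not state which convention stablepy expects, so check the result on a test image.

```python
def centered_box(size, fraction=0.5):
    """Return (left, top, right, bottom) of a centred box spanning `fraction` of each side."""
    width, height = size
    box_w, box_h = int(width * fraction), int(height * fraction)
    left, top = (width - box_w) // 2, (height - box_h) // 2
    return (left, top, left + box_w, top + box_h)


def save_rect_mask(size, box, path="mask.png"):
    """Save a black image of `size` with a white rectangle at `box`; return the path."""
    # Imported lazily so that centered_box can be used without Pillow installed.
    from PIL import Image, ImageDraw

    mask = Image.new("L", size, 0)                 # single-channel, all black
    ImageDraw.Draw(mask).rectangle(box, fill=255)  # white area (assumed: region to repaint)
    mask.save(path)
    return path


print(centered_box((768, 512)))  # (192, 128, 576, 384)
```

A mask for the bird image could then be created with `save_rect_mask(Image.open(control_image).size, centered_box(Image.open(control_image).size))`, and the returned path passed as `image_mask`.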

# Styles
Styles are additions to the prompt and negative prompt that apply a specific look to the generation. By default, only 9 styles are available; their names can be listed with:

In [None]:
model.STYLE_NAMES

To use other styles, we can load them from a JSON file, such as the one downloaded in the next cell.
Here are more JSON style files: [PromptStylers](https://github.com/wolfden/ComfyUi_PromptStylers), [sdxl_prompt_styler](https://github.com/ali1234/sdxl_prompt_styler/tree/main)

In [None]:
!wget https://raw.githubusercontent.com/ahgsql/StyleSelectorXL/main/sdxl_styles.json

In [None]:
model.load_style_file("sdxl_styles.json")

The file was loaded with 77 styles, which replace the previous ones. Now we can see the new names:

In [None]:
model.STYLE_NAMES

Now we can use the style in the inference.

In [None]:
# Image to Image task.
model.load_pipe(
    base_model_id = model_path,
    task_name = "img2img",
)

# We can also use multiple styles in a list ["Silhouette", "Kirigami"]
images, image_list = model(
    style_prompt = "Silhouette", # The style will be added to the prompt and negative prompt
    image = control_image,
    prompt = "a bird",
    negative_prompt = "worst quality",
    strength = 0.48,
    image_resolution = 512,
    sampler="DPM++ SDE Karras",
)

for image in images:
  display(image)
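
To make the idea of "additions to the prompt and negative prompt" concrete, here is a minimal sketch of how a style entry might be applied. It assumes the layout commonly used by SDXL style files: a list of objects with `name`, `prompt` (containing a `{prompt}` placeholder) and `negative_prompt`. The two entries are made up for illustration, and the way stablepy merges styles internally may differ.

```python
import json

# Two made-up entries in the assumed layout; real style files contain many more.
SAMPLE_STYLES = json.loads("""
[
  {"name": "Demo Neon", "prompt": "neon style {prompt}, glowing lights", "negative_prompt": "dull, flat"},
  {"name": "Demo Sketch", "prompt": "pencil sketch of {prompt}", "negative_prompt": "color"}
]
""")


def apply_style(styles, style_name, prompt, negative_prompt=""):
    """Insert the prompt into the style template and append the style's negative prompt."""
    by_name = {entry["name"]: entry for entry in styles}
    entry = by_name[style_name]
    styled_prompt = entry["prompt"].replace("{prompt}", prompt)
    extra_negative = entry.get("negative_prompt", "")
    styled_negative = ", ".join(part for part in (negative_prompt, extra_negative) if part)
    return styled_prompt, styled_negative


print([entry["name"] for entry in SAMPLE_STYLES])  # ['Demo Neon', 'Demo Sketch']
print(apply_style(SAMPLE_STYLES, "Demo Sketch", "a bird", "worst quality"))
# ('pencil sketch of a bird', 'worst quality, color')
```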

# Verbosity Level
To change the verbosity level, you can use the logger from stablepy.


In [None]:
import logging
from stablepy import logger

logging_level_mapping = {
    'DEBUG': logging.DEBUG,
    'INFO': logging.INFO,
    'WARNING': logging.WARNING,
    'ERROR': logging.ERROR,
    'CRITICAL': logging.CRITICAL
}

Verbosity_Level = "WARNING" # Messages INFO and DEBUG will not be printed

logger.setLevel(logging_level_mapping.get(Verbosity_Level, logging.INFO))
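
If the quieter level is only wanted for a few calls, a context manager can restore the previous level automatically. This is a sketch using only the standard `logging` module. `demo_logger` is a stand-in so that the example runs on its own; in the notebook, the `logger` imported from stablepy would be passed instead.

```python
import logging
from contextlib import contextmanager


@contextmanager
def temporary_level(log, level):
    """Set `log` to `level` inside the with-block, then restore the previous level."""
    previous = log.level
    log.setLevel(level)
    try:
        yield log
    finally:
        log.setLevel(previous)


# Stand-in logger (hypothetical name) so the sketch runs without stablepy installed.
demo_logger = logging.getLogger("demo")
demo_logger.setLevel(logging.INFO)

with temporary_level(demo_logger, logging.WARNING):
    print(logging.getLevelName(demo_logger.level))  # WARNING
print(logging.getLevelName(demo_logger.level))      # INFO
```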

# LCM

Latent Consistency Models (LCM) can generate images in a few steps. When the 'LCM' sampler is selected, the model automatically loads the LCM_LoRA for the task. Generally, `guidance_scale` is set to 1.0, or at most 2.0, with 4 to 8 steps.

In [None]:
# Generating an image with txt2img
model.load_pipe(
    base_model_id = model_path,
    task_name = "txt2img",
)
images, image_list = model(
    prompt = "(masterpiece, best quality), 1girl, collarbone, wavy hair, looking at viewer, blurry foreground, upper body, necklace, contemporary, plain pants, ((intricate, print, pattern)), ponytail, freckles, red hair, dappled sunlight, smile, happy,",
    negative_prompt = "(worst quality, low quality, letterboxed)",
    num_images = 1,
    num_steps = 7,
    guidance_scale = 1.0,
    sampler="LCM",
    syntax_weights="Classic", # (word:weight) and (word) for prompts weights
    disable_progress_bar = True,
    save_generated_images = False,
    display_images = True,
)

# Using the image generated in img2img
# If we use the same model and VAE, we can switch tasks quickly
model.load_pipe(
    base_model_id = model_path,
    task_name = "img2img",
)
images_i2i, image_list = model(
    prompt = "masterpiece, sunlight",
    image = images[0], # only one image
    style_prompt = "Disco", # Apply a style
    strength = 0.70,
    num_steps = 4,
    guidance_scale = 1.0,
    sampler="LCM",
    disable_progress_bar = True,
    save_generated_images = False,
    display_images = True,
)
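
The recommended ranges for the LCM sampler are easy to exceed when reusing settings from a regular sampler. The hypothetical helper below, not part of stablepy, simply reports values that fall outside the ranges mentioned in this section.

```python
def lcm_warnings(num_steps, guidance_scale):
    """Return a list of messages for settings outside the ranges suggested for LCM."""
    warnings = []
    if not 4 <= num_steps <= 8:
        warnings.append(f"num_steps={num_steps} is outside the suggested 4 to 8 range")
    if not 1.0 <= guidance_scale <= 2.0:
        warnings.append(f"guidance_scale={guidance_scale} is outside the suggested 1.0 to 2.0 range")
    return warnings


print(lcm_warnings(7, 1.0))        # []
print(len(lcm_warnings(30, 8.0)))  # 2
```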

In [None]:
logger.setLevel(logging.INFO) # restore the INFO level

# Inference with SDXL

If you are using Colab with a T4, you might encounter OOM (Out Of Memory) issues when loading an SDXL safetensors model. However, you can still use it in the following way:

```
from stablepy import Model_Diffusers

model_path = "./my_sdxl_model.safetensors"

model = Model_Diffusers(
    base_model_id = model_path,
    task_name = "txt2img",
    sdxl_safetensors = True
)
```

If you only change tasks, you don't need to pass `sdxl_safetensors = True` again; it is needed only when you change the model.
```
second_model_path = "./second_sdxl_model.safetensors"

model.load_pipe(
    base_model_id = second_model_path,
    task_name = "img2img",
    sdxl_safetensors = True
)
```



Currently, SDXL models in fp16 Diffusers format can be used in Colab with a T4. You only need to specify the repository name to load the model from Hugging Face. You can find compatible models by searching for [XL FP16 on Hugging Face](https://huggingface.co/models?search=-xl-fp16).

In [None]:
repo = "SG161222/RealVisXL_V2.0"

model.load_pipe(
    base_model_id = repo,
    task_name = "txt2img",
)

At the moment, SDXL is compatible with the following tasks:


* txt2img
* inpaint
* img2img
* sdxl_canny
* sdxl_sketch
* sdxl_lineart
* sdxl_depth-midas
* sdxl_openpose
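
Because the task names differ between SD 1.5 and SDXL, a quick membership check can catch a mismatch before `load_pipe` is called. The sets below are copied from the two task lists in this notebook. The helper itself is hypothetical and not part of stablepy.

```python
# Task names listed for Stable Diffusion 1.5 near the top of this notebook.
SD15_TASKS = {
    "txt2img", "openpose", "canny", "mlsd", "scribble", "softedge", "segmentation",
    "depth", "normalbae", "lineart", "shuffle", "ip2p", "img2img", "inpaint",
}

# Task names listed for SDXL above.
SDXL_TASKS = {
    "txt2img", "inpaint", "img2img", "sdxl_canny", "sdxl_sketch",
    "sdxl_lineart", "sdxl_depth-midas", "sdxl_openpose",
}


def supports_task(task_name, is_sdxl):
    """Return True if task_name appears in the list for the chosen model family."""
    return task_name in (SDXL_TASKS if is_sdxl else SD15_TASKS)


print(supports_task("sdxl_depth-midas", is_sdxl=True))  # True
print(supports_task("canny", is_sdxl=True))             # False
```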


In [None]:
# Example sdxl_depth-midas
model.load_pipe(
    base_model_id = repo, # sdxl repo
    task_name = "sdxl_depth-midas",
)

# We can also use multiple styles in a list ["Silhouette", "Kirigami"]
images, image_list = model(
    image = control_image,
    prompt = "a green bird",
    negative_prompt = "worst quality",

    # If we want to use the preprocessor
    t2i_adapter_preprocessor = True,
    preprocess_resolution = 1024,

    # Relative resolution
    image_resolution = 1024,

    sampler="DPM++ 2M SDE Lu", # Specific variant for SDXL. "DPM++ 2M SDE Ef" is also available and uses Euler at the final step.

    t2i_adapter_conditioning_scale = 1.0,
    t2i_adapter_conditioning_factor = 1.0,

    display_images = True,
)

In [None]:
# For more details about the parameters
help(model.__call__)