## DreamBooth Stable Diffusion Training
Adapted from [bilibili 秋葉aaaki's dreambooth autodl training script](https://github.com/Akegarasu/dreambooth-autodl)
Adapted from [Nyanko Lepsoni's Colab notebook](https://colab.research.google.com/drive/17yM4mlPVOFdJE_81oWBz5mXH9cxvhmz8)

### Global variables

In [None]:
import sys
py310 = sys.executable

!$py310 --version
PATH = "/home/studio-lab-user/dreambooth-SageMaskerLab"
%cd $PATH

TRAINER = "train_dreambooth.py"
CONVERTER = "convert_v2.py"
BACK_CONVERTER = "back_convert.py"
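# This path looks specific to the original environment; adjust it if accelerate is installed elsewhere (for example, use the output of `which accelerate`)
ACCELERATE_BIN = "/root/miniconda3/envs/diffusers/bin/accelerate"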

SRC_PATH = "~/tmp/model-sd"
MODEL_NAME = "./model-hf"

### Download the animefull-final-pruned model and the vae.pt file

In [None]:
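# Create the source-model directory if it does not exist yet, then download into it
!mkdir -p $SRC_PATH
%cd $SRC_PATH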
!curl -Lo model.ckpt https://huggingface.co/a1079602570/animefull-final-pruned/resolve/main/model-001.ckpt
!curl -Lo config.yaml https://huggingface.co/a1079602570/animefull-final-pruned/resolve/main/config.yaml
!curl -Lo animevae.pt https://huggingface.co/a1079602570/animefull-final-pruned/resolve/main/animevae.pt

### Convert the ckpt file

In [None]:
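# This step is slow and may take a few minutes
# Return to the project directory first: the converter script and MODEL_NAME are relative paths
%cd $PATH
vae_arg = f"--vae_path {SRC_PATH}/animevae.pt"
!$py310 $CONVERTER --checkpoint_path $SRC_PATH/model.ckpt --original_config_file $SRC_PATH/config.yaml $vae_arg --dump_path $MODEL_NAME --scheduler_type ddim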

### Configure the DreamBooth prompts

In [None]:
# Instance prompt and instance image directory
INSTANCE_PROMPT = "masterpiece, best quality, bocchitherock 1girl"
INSTANCE_DIR = "./instance-images"

# Class image settings
CLASS_PROMPT = "masterpiece, best quality, 1girl"
CLASS_NEGATIVE_PROMPT = "lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry"
CLASS_DIR = "./class-images"

# Preview (sample) image tag settings
SAVE_SAMPLE_PROMPT = "masterpiece, best quality, bocchitherock 1girl, looking at viewer"
SAVE_SAMPLE_NEGATIVE_PROMPT = "lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry"

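# Model output path. Expand "~" here, because the shell does not expand it inside --output_dir=...
import os
OUTPUT_DIR = os.path.expanduser("~/tmp2")
print(f"[*] The model will be saved to {OUTPUT_DIR}")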
!mkdir -p $OUTPUT_DIR

### Training monitoring dashboard

In [None]:
use_tensorboard = True 
use_wandb = False
save_weights_to_wandb = False
wandb_apikey = ""

if use_wandb:
  if wandb_apikey == "":
    raise ValueError('Invalid wandb.ai APIKey')
  !$py310 -m wandb login $wandb_apikey

if use_tensorboard:
  !rm -rf /tmp/.tensorboard-info/
  %load_ext tensorboard
  %tensorboard --logdir $OUTPUT_DIR/logs

### Configure accelerate

In [None]:
%%bash

mkdir -p ~/.cache/huggingface/accelerate

cat > ~/.cache/huggingface/accelerate/default_config.yaml <<- EOM
compute_environment: LOCAL_MACHINE
deepspeed_config: {}
distributed_type: 'NO'
downcast_bf16: 'no'
fsdp_config: {}
machine_rank: 0
main_process_ip: null
main_process_port: null
main_training_function: main
mixed_precision: fp16
num_machines: 1
num_processes: 1
use_cpu: false
EOM

### Set the training parameters

### max_train_steps
Total number of training steps.
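
To get a feel for what a given step count means, the sketch below converts steps into an approximate number of passes over the instance images. It assumes that one step is one optimizer update and that each update consumes `train_batch_size * gradient_accumulation_steps` instance images; the function name and the figure of 30 images are hypothetical and chosen only for illustration.

```python
def approx_epochs(max_train_steps, train_batch_size,
                  gradient_accumulation_steps, num_instance_images):
    """Roughly how many times each instance image is seen during training."""
    samples_seen = max_train_steps * train_batch_size * gradient_accumulation_steps
    return samples_seen / num_instance_images


# 3000 steps at batch size 1 over a hypothetical set of 30 instance images
print(approx_epochs(3000, 1, 1, 30))
```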

### learning_rate
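Learning rate.
The value 5e-6 used here is in scientific notation (5 times 10 to the power of -6). This value is usually fine. If it turns out to be too large, try a smaller one such as 3e-6; if that is still too large, go down to 1e-6 or even 5e-7.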
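
As a quick check of the notation, the snippet below prints the candidate values mentioned above in both scientific and decimal form. The list name `candidates` is only illustrative.

```python
import math

# Learning rates mentioned above, from largest to smallest
candidates = [5e-6, 3e-6, 1e-6, 5e-7]

# 5e-6 is just another way of writing 5 * 10**-6
assert math.isclose(5e-6, 5 * 10**-6)

for lr in candidates:
    print(f"{lr:.1e} = {lr:.7f}")
```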

### lr_scheduler
Learning rate schedule.
For lr_scheduler, cosine or cosine_with_restarts is usually sufficient.
To learn more about learning rate schedulers, see this [Zhihu answer](https://www.zhihu.com/question/315772308/answer/2384958925)
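
To make the shape of such a schedule concrete, here is a minimal sketch of a cosine schedule with linear warmup and hard restarts, written in plain Python. It follows the commonly used formula for this kind of schedule; whether the trainer script computes it in exactly this way is an assumption, and the function name is hypothetical.

```python
import math


def cosine_with_restarts_multiplier(step, warmup_steps, total_steps, num_cycles=1):
    """Factor in [0, 1] that the base learning rate is multiplied by at `step`."""
    if step < warmup_steps:
        # Linear warmup from 0 up to the base learning rate
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    if progress >= 1.0:
        return 0.0
    # Each cycle decays from 1 towards 0 along a half cosine, then restarts at 1
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * ((num_cycles * progress) % 1.0))))


base_lr = 5e-6
for step in (0, 50, 100, 1550, 3000):
    print(step, base_lr * cosine_with_restarts_multiplier(step, 100, 3000))
```

With `num_cycles=1` this reduces to a plain cosine decay after warmup; larger values make the learning rate jump back to its base value at the start of each cycle.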

### batch_size
Usually 1. I recommend not going above 3. Changing batch_size also requires adjusting the learning rate.
For details, see the video by bilibili 秋葉aaaki: [BV1A8411775m](https://www.bilibili.com/video/BV1A8411775m/)
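
One common way to couple the learning rate to the batch size is to scale it by the effective batch size, either linearly or by its square root. The sketch below shows both rules. The notebook exposes `scale_lr` and `scale_lr_sqrt` flags, but how the trainer applies them is not shown here, so treat the formulas and the function name as assumptions based on common practice.

```python
import math


def scaled_learning_rate(base_lr, train_batch_size,
                         gradient_accumulation_steps=1, num_processes=1, sqrt=False):
    """Scale base_lr by the effective batch size (linear rule or square-root rule)."""
    factor = train_batch_size * gradient_accumulation_steps * num_processes
    return base_lr * (math.sqrt(factor) if sqrt else factor)


print(scaled_learning_rate(5e-6, 2))             # linear rule, batch size 2
print(scaled_learning_rate(5e-6, 4, sqrt=True))  # square-root rule, batch size 4
```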

### num_class_images
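Number of class images. If the class-images folder contains fewer images than this value, the model generates the missing images automatically.
This parameter has no effect if with_prior_preservation (below) is disabled.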
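
The check described above can be sketched as follows: count the images already present in the class folder and work out how many would still need to be generated. The helper name and the set of file extensions are assumptions; the trainer may decide which files count as images differently.

```python
from pathlib import Path

# Assumed set of image extensions; the trainer's own filter may differ
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp", ".bmp"}


def missing_class_images(class_dir, num_class_images):
    """Number of class images that would still have to be generated."""
    folder = Path(class_dir).expanduser()
    if not folder.is_dir():
        return num_class_images
    existing = sum(
        1 for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
    return max(0, num_class_images - existing)


print(missing_class_images("./class-images", 20))
```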

### with_prior_preservation
When this parameter is disabled, training no longer uses class images and becomes native training. Disable it when training an art style.

For more in-depth details, see the [stable-diffusion-book](https://stable-diffusion-book.vercel.app/train/DreamBooth)
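
The effect of prior preservation on the objective can be sketched with plain numbers: the loss on the instance images is combined with a loss on the class images, weighted by `prior_loss_weight`. This is a simplified illustration of the general DreamBooth formulation using lists of floats in place of tensors, not the trainer's actual code; both function names are hypothetical.

```python
def mse(pred, target):
    """Mean squared error between two equally long sequences of floats."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def dreambooth_loss(instance_pred, instance_target, class_pred, class_target,
                    prior_loss_weight=1.0, with_prior_preservation=True):
    """Instance loss, plus the weighted class (prior) loss when enabled."""
    loss = mse(instance_pred, instance_target)
    if with_prior_preservation:
        loss += prior_loss_weight * mse(class_pred, class_target)
    return loss


print(dreambooth_loss([1.0, 2.0], [0.0, 0.0], [1.0, 1.0], [0.0, 0.0]))
```

With `with_prior_preservation=False` only the first term remains, which corresponds to the native training mode described above.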

In [None]:
# Common parameters
## Maximum number of training steps
max_train_steps = 3000
## Learning rate
learning_rate = 5e-6
## Learning rate schedule, one of:
## ["linear", "cosine", "cosine_with_restarts", "polynomial", "constant", "constant_with_warmup", "cosine_with_restarts_mod", "cosine_mod"]
lr_scheduler = "cosine_with_restarts"
lr_warmup_steps = 100
# Batch size
train_batch_size = 1
# Number of class images
num_class_images = 20

with_prior_preservation = True

# Read the prompt from the image file name
read_prompt_from_filename = False
# Read the prompt from a txt file
read_prompt_from_txt = False
append_prompt = "instance"
# Checkpoint save interval
save_interval = 500
# Use DeepDanbooru
use_deepdanbooru = False

# Advanced parameters
resolution = 512
gradient_accumulation_steps = 1
seed = 1337
log_interval = 10
clip_skip = 1
sample_batch_size = 4
prior_loss_weight = 1.0
use_aspect_ratio_bucket = False
scale_lr = False
scale_lr_sqrt = False
gradient_checkpointing = True
pad_tokens = False
debug_arb = False
debug_prompt = False
use_ema = False
train_text_encoder = False
# Only takes effect with the *_mod schedulers
restart_cycle = 1
last_epoch = -1

To resume training, change this path to the model folder you want to continue from, then run this cell.
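
If you would rather not edit the path by hand, a small helper can fall back to the converted base model when no previous checkpoint exists. This is a hypothetical convenience and not part of the original scripts; the folder name `checkpoint_last` and the default `./model-hf` are taken from the cells in this notebook.

```python
import os


def pick_model_dir(output_dir, base_model="./model-hf", checkpoint="checkpoint_last"):
    """Return the checkpoint folder if it exists, otherwise the base model path."""
    candidate = os.path.join(os.path.expanduser(output_dir), checkpoint)
    return candidate if os.path.isdir(candidate) else base_model


print(pick_model_dir("~/tmp2"))
```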

In [None]:
MODEL_NAME = f"{OUTPUT_DIR}/checkpoint_last"

### Start training

In [None]:
ema_arg = "--use_ema" if use_ema else ""
da_arg = "--debug_arb" if debug_arb else ""
db_arg = "--debug_prompt" if debug_prompt else ""
pd_arg = "--pad_tokens" if pad_tokens else ""
gdc_arg = "--gradient_checkpointing" if gradient_checkpointing else ""
dp_arg = "--deepdanbooru" if use_deepdanbooru else "" 
scale_lr_arg = "--scale_lr" if scale_lr else ""
wandb_arg = "--wandb" if use_wandb else ""
extra_prompt_arg = "--read_prompt_txt" if read_prompt_from_txt else ""
arb_arg = "--use_aspect_ratio_bucket" if use_aspect_ratio_bucket else ""
tte_arg = "--train_text_encoder" if train_text_encoder else ""
ppl_arg = f"--with_prior_preservation --prior_loss_weight={prior_loss_weight}" if with_prior_preservation else ""

if scale_lr_sqrt:
  scale_lr_arg = "--scale_lr_sqrt"

if read_prompt_from_filename:
  extra_prompt_arg = "--read_prompt_filename"

if save_weights_to_wandb:
  wandb_arg = "--wandb --wandb_artifact"

import os
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

%cd $PATH
!mkdir -p $OUTPUT_DIR

!$ACCELERATE_BIN launch $TRAINER \
  --mixed_precision="fp16" \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --instance_prompt="{INSTANCE_PROMPT}" \
  --class_prompt="{CLASS_PROMPT}" \
  --class_negative_prompt="{CLASS_NEGATIVE_PROMPT}" \
  --save_sample_prompt="{SAVE_SAMPLE_PROMPT}" \
  --save_sample_negative_prompt="{SAVE_SAMPLE_NEGATIVE_PROMPT}" \
  --seed=$seed \
  --resolution=$resolution \
  --train_batch_size=$train_batch_size \
  --gradient_accumulation_steps=$gradient_accumulation_steps \
  --learning_rate=$learning_rate \
  --lr_scheduler=$lr_scheduler \
  --lr_warmup_steps=$lr_warmup_steps \
  --num_class_images=$num_class_images \
  --sample_batch_size=$sample_batch_size \
  --max_train_steps=$max_train_steps \
  --save_interval=$save_interval \
  --log_interval=$log_interval \
  --clip_skip $clip_skip \
  --num_cycle=$restart_cycle \
  --last_epoch=$last_epoch \
  --append_prompt=$append_prompt \
  --use_8bit_adam $da_arg $db_arg $ema_arg --xformers \
  $ppl_arg $wandb_arg $extra_prompt_arg $gdc_arg $arb_arg $tte_arg $scale_lr_arg $dp_arg $pd_arg

# disabled: --not_cache_latents

### Test the training results
**The output differs somewhat from a local webui, so treat it as a reference only**

In [None]:
import os

import gradio as gr
import torch
from diffusers import StableDiffusionPipeline

# Which checkpoint folder under OUTPUT_DIR to load
use_checkpoint = 'checkpoint_last'
ckpt_model_path = os.path.join(os.path.expanduser(OUTPUT_DIR), use_checkpoint)

pipe = StableDiffusionPipeline.from_pretrained(ckpt_model_path, torch_dtype=torch.float16).to("cuda")
# Optional torch.Generator for reproducible sampling; None means a random seed on every run
g_cuda = None

def inference(prompt, negative_prompt, num_samples, height=512, width=512, num_inference_steps=50, guidance_scale=7.5):
    with torch.autocast("cuda"), torch.inference_mode():
        return pipe(
                prompt, height=int(height), width=int(width),
                negative_prompt=negative_prompt,
                num_images_per_prompt=int(num_samples),
                num_inference_steps=int(num_inference_steps), guidance_scale=guidance_scale,
                generator=g_cuda
            ).images

with gr.Blocks() as demo:
    with gr.Row():
        with gr.Column():
            prompt = gr.Textbox(label="Tags", value="masterpiece, best quality,")
            negative_prompt = gr.Textbox(label="Negative tags", value="lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry")
            num_inference_steps = gr.Slider(label="Steps", value=28)
            with gr.Row():
                width = gr.Slider(minimum=64, maximum=2048, step=64, label="Width", value=512)
                height = gr.Slider(minimum=64, maximum=2048, step=64, label="Height", value=512)
            with gr.Row():
                num_samples = gr.Number(label="Batch size", value=1)
                guidance_scale = gr.Number(label="Guidance Scale", value=7)

        with gr.Column():
            run = gr.Button(value="Generate")
            gallery = gr.Gallery()

    run.click(inference, inputs=[prompt, negative_prompt, num_samples, height, width, num_inference_steps, guidance_scale], outputs=gallery)

demo.launch(share=True)

### Convert the trained model to a ckpt file

Set model_folder_name to the checkpoint you want to convert, for example
checkpoint_1000
checkpoint_2000
Enter the name of whichever model you want to convert.
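
To see which folders are available before picking one, a sketch like the following lists the checkpoint folders in the output directory, ordered by step number. It assumes the `checkpoint_<step>` and `checkpoint_last` naming used in this notebook; the helper name is hypothetical.

```python
import os
import re


def list_checkpoints(output_dir):
    """Return checkpoint folder names, numbered ones first in ascending step order."""
    root = os.path.expanduser(output_dir)
    if not os.path.isdir(root):
        return []
    names = [
        n for n in os.listdir(root)
        if n.startswith("checkpoint_") and os.path.isdir(os.path.join(root, n))
    ]

    def sort_key(name):
        match = re.fullmatch(r"checkpoint_(\d+)", name)
        # Numbered checkpoints sort by step; anything else (e.g. checkpoint_last) goes last
        return (0, int(match.group(1)), name) if match else (1, 0, name)

    return sorted(names, key=sort_key)


print(list_checkpoints("~/tmp2"))
```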

In [None]:
model_folder_name = "checkpoint_last"
convert_model_path = f"{OUTPUT_DIR}/{model_folder_name}"
ckpt_path = f'{OUTPUT_DIR}/model.ckpt'
save_half = False
use_alt = False

ckpt_convert_half_arg = "--half" if save_half else ""
back_converter_arg = "back_convert_alt.py" if use_alt else "back_convert.py"

!$py310 $back_converter_arg --model_path $convert_model_path --checkpoint_path $ckpt_path $ckpt_convert_half_arg
print(f"[*] The converted model is saved at {ckpt_path}")