# SDXL LoRA Trainer

This notebook is based on [Kohya-ss](https://github.com/kohya-ss/sd-scripts) and [Linaqruf](https://github.com/Linaqruf/kohya-trainer), and above all on the notebook by [HollowStrawberry](https://github.com/hollowstrawberry/kohya-colab/blob/main/Lora_Trainer_XL.ipynb). Special thanks to them.

## Disclaimer
For academic purposes only.

This notebook is derived from [HollowStrawberry](https://github.com/hollowstrawberry/kohya-colab/blob/main/Lora_Trainer_XL.ipynb)'s notebook, so if you want a trainer to use, please go to his version. It uses the [Dataset Maker](https://github.com/hollowstrawberry/kohya-colab/blob/main/Dataset_Maker.ipynb) provided by HollowStrawberry himself.


In [None]:
import os
import re
import toml
import pathlib
from time import time
from huggingface_hub import HfFileSystem
from IPython.display import Markdown, display

# These carry information from past executions
if "model_url" in globals():
  old_model_url = model_url
else:
  old_model_url = None
if "dependencies_installed" not in globals():
  dependencies_installed = False
if "model_file" not in globals():
  model_file = None

# These may be set by other cells, some are legacy
if "custom_dataset" not in globals():
  custom_dataset = None
if "override_dataset_config_file" not in globals():
  override_dataset_config_file = None
if "continue_from_lora" not in globals():
  continue_from_lora = ""
if "override_config_file" not in globals():
  override_config_file = None

COLAB = True
SOURCE = "https://github.com/qaneel/kohya-trainer"
COMMIT = None
BETTER_EPOCH_NAMES = True
LOAD_TRUNCATED_IMAGES = True
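try:
  # MemTotal is reported in kB; flag machines with less than 15 GiB of RAM
  with open('/proc/meminfo') as meminfo:
    LOWRAM = int(next(line.split()[1] for line in meminfo if "MemTotal" in line)) / (1024**2) < 15
except Exception:
  LOWRAM = False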

#@title ## 🚩 The trainer has a UI for ease of use, thanks to HollowStrawberry. The parameters are explained later.

#@markdown ### ▶️ Setup
#@markdown Project name
project_name = "WLOP" #@param {type:"string"}
project_name = project_name.strip()
#@markdown Folder structure
folder_structure = "Organize by project (MyDrive/Loras/project_name/dataset)" #@param ["Organize by category (MyDrive/lora_training/datasets/project_name)", "Organize by project (MyDrive/Loras/project_name/dataset)"]
#@markdown Base model (all options are SDXL)
training_model = "Illustrious XL 0.1" #@param ["Pony Diffusion V6 XL", "Illustrious XL 0.1", "Animagine XL V3", "Stable Diffusion XL 1.0 base"]
#@markdown You can also supply a custom model of your choice
optional_custom_training_model = "" #@param {type:"string"}
load_diffusers = True #@param {type:"boolean"}
#@markdown Use wandb if you want to visualize the progress of your training over time.

if optional_custom_training_model:
  model_url = optional_custom_training_model
elif "Pony" in training_model:
  if load_diffusers:
    model_url = "https://huggingface.co/hollowstrawberry/67AB2F"
  else:
    model_url = "https://civitai.com/api/download/models/290640"
  model_file = "/content/ponyDiffusionV6XL.safetensors"
elif "Animagine" in training_model:
  if load_diffusers:
    model_url = "https://huggingface.co/cagliostrolab/animagine-xl-3.0"
  else:
    model_url = "https://civitai.com/api/download/models/293564"
  model_file = "/content/animagineXLV3.safetensors"
elif "Illustrious" in training_model:
  if load_diffusers:
    model_url = "https://huggingface.co/OnomaAIResearch/Illustrious-xl-early-release-v0"
  else:
    model_url = "https://huggingface.co/OnomaAIResearch/Illustrious-xl-early-release-v0/resolve/main/Illustrious-XL-v0.1.safetensors"
else:
  if load_diffusers:
    model_url = "https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/"
  else:
    model_url = "https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors"

if load_diffusers:
  vae_file= "stabilityai/sdxl-vae"
else:
  vae_url = "https://huggingface.co/stabilityai/sdxl-vae/resolve/main/sdxl_vae.safetensors"
  vae_file = "/content/sdxl_vae.safetensors"

model_url = model_url.strip()

resolution = 1024
caption_extension = ".txt" #@param [".txt", ".caption"]

shuffle_tags = True #@param {type:"boolean"}
shuffle_caption = shuffle_tags
activation_tags = "1" #@param [0,1,2,3]
keep_tokens = int(activation_tags)

#@markdown ### Step settings

num_repeats = 1 #@param {type:"number"}

preferred_unit = "Epochs" #@param ["Epochs", "Steps"]
how_many = 10 #@param {type:"number"}
max_train_epochs = how_many if preferred_unit == "Epochs" else None
max_train_steps = how_many if preferred_unit == "Steps" else None

save_every_n_epochs = 1 #@param {type:"number"}
keep_only_last_n_epochs = 5 #@param {type:"number"}
if not save_every_n_epochs:
  save_every_n_epochs = max_train_epochs
if not keep_only_last_n_epochs:
  keep_only_last_n_epochs = max_train_epochs


unet_lr = 1e-4 #@param {type:"number"}
text_encoder_lr = 5e-5 #@param {type:"number"}

lr_scheduler = "cosine_with_restarts" #@param ["constant", "cosine", "cosine_with_restarts", "constant_with_warmup", "linear", "polynomial"]
lr_scheduler_number = 3 #@param {type:"number"}

lr_warmup_ratio = 0.05 #@param {type:"slider", min:0.0, max:0.2, step:0.01}
lr_warmup_steps = 0

min_snr_gamma = 8.0 #@param {type:"slider", min:0.0, max:16.0, step:0.5}

multinoise = True #@param {type:"boolean"}




network_dim = 16 #@param {type:"slider", min:1, max:32, step:1}
network_alpha = 8 #@param {type:"slider", min:1, max:32, step:1}

network_module = "networks.lora"
network_args = None

train_batch_size = 1 #@param {type:"slider", min:1, max:16, step:1}

cross_attention = "sdpa" #@param ["sdpa", "xformers"]

mixed_precision = "fp16" #@param ["bf16", "fp16"]

cache_latents = True #@param {type:"boolean"}
cache_latents_to_drive = True #@param {type:"boolean"}

cache_text_encoder_outputs  = False  # @param {type:"boolean"}


optimizer = "AdamW8bit" #@param ["AdamW8bit", "Prodigy", "DAdaptation", "DadaptAdam", "DadaptLion", "AdamW", "Lion", "SGDNesterov", "SGDNesterov8bit", "AdaFactor"]
optimizer_args = "weight_decay=0.1 betas=[0.9,0.99]" #@param {type:"string"}
optimizer_args = [a.strip() for a in optimizer_args.split(' ') if a]
recommended_values = True #@param {type:"boolean"}

if any(opt in optimizer.lower() for opt in ["dadapt", "prodigy"]):
  if recommended_values:
    unet_lr = 0.75
    text_encoder_lr = 0.75
    network_alpha = network_dim
elif "CAME" in optimizer:
  optimizer = "CAME"
  lr_scheduler = "REX"
  if recommended_values:
    unet_lr = 1e-4
    text_encoder_lr = 1e-6
    for i in range(len(optimizer_args)):
      if "betas" in optimizer_args[i]:
        optimizer_args.pop(i)
        break

lr_scheduler_num_cycles = lr_scheduler_number if lr_scheduler == "cosine_with_restarts" else 0
lr_scheduler_power = lr_scheduler_number if lr_scheduler == "polynomial" else 0

root_dir = "/content" if COLAB else pathlib.Path.home() / "Loras"
deps_dir = os.path.join(root_dir, "deps")
repo_dir = os.path.join(root_dir, "kohya-trainer")

if "/Loras" in folder_structure:
  main_dir      = os.path.join(root_dir, "drive/MyDrive/Loras") if COLAB else root_dir
  log_folder    = os.path.join(main_dir, "_logs")
  config_folder = os.path.join(main_dir, project_name)
  images_folder = os.path.join(main_dir, project_name, "dataset")
  output_folder = os.path.join(main_dir, project_name, "output")
else:
  main_dir      = os.path.join(root_dir, "drive/MyDrive/lora_training") if COLAB else root_dir
  images_folder = os.path.join(main_dir, "datasets", project_name)
  output_folder = os.path.join(main_dir, "output", project_name)
  config_folder = os.path.join(main_dir, "config", project_name)
  log_folder    = os.path.join(main_dir, "log")

config_file = os.path.join(config_folder, "training_config.toml")
dataset_config_file = os.path.join(config_folder, "dataset_config.toml")
accelerate_config_file = os.path.join(repo_dir, "accelerate_config/config.yaml")

def install_dependencies():
  os.chdir(root_dir)
  !git clone {SOURCE} {repo_dir}
  os.chdir(repo_dir)
  if COMMIT:
    !git reset --hard {COMMIT}
  !wget https://raw.githubusercontent.com/hollowstrawberry/kohya-colab/main/train_network_xl_wrapper.py -q -O train_network_xl_wrapper.py
  !wget https://raw.githubusercontent.com/hollowstrawberry/kohya-colab/main/dracula.py -q -O dracula.py

  !apt -y update -qq
  !apt -y install aria2 -qq
  !pip install torch==2.5.0+cu121 accelerate==0.19.0 transformers==4.30.2 diffusers==0.18.2 \
    bitsandbytes==0.40.0.post4 opencv-python jax==0.4.23 jaxlib==0.4.23 \
    pytorch-lightning==1.9.0 voluptuous==0.13.1 toml==0.10.2 ftfy==6.1.1 einops==0.6.0 \
    safetensors pygments huggingface-hub wandb invisible-watermark==0.2.0 open-clip-torch==2.20.0 \
    dadaptation==3.1 prodigyopt==1.0 lion-pytorch==0.1.2
  !pip install -e .
  if cross_attention == "xformers":
    !pip install -q xformers==0.0.26.dev778
  if "CAME" in optimizer:
    !pip install came-pytorch
    !wget https://raw.githubusercontent.com/hollowstrawberry/kohya-colab/main/train_util.py -q -O library/train_util.py

  # patch kohya for minor stuff
  if LOWRAM:
    !sed -i "s@cpu@cuda@" library/model_util.py
  if LOAD_TRUNCATED_IMAGES:
    !sed -i 's/from PIL import Image/from PIL import Image, ImageFile\nImageFile.LOAD_TRUNCATED_IMAGES=True/g' library/train_util.py # fix truncated jpegs error
  if BETTER_EPOCH_NAMES:
    !sed -i 's/{:06d}/{:02d}/g' library/train_util.py # make epoch names shorter
    !sed -i 's/"." + args.save_model_as)/"-{:02d}.".format(num_train_epochs) + args.save_model_as)/g' train_network.py # name of the last epoch will match the rest

  from accelerate.utils import write_basic_config
  if not os.path.exists(accelerate_config_file):
    write_basic_config(save_location=accelerate_config_file)

  os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"
  os.environ["BITSANDBYTES_NOWELCOME"] = "1"
  os.environ["SAFETENSORS_FAST_GPU"] = "1"

def validate_dataset():
  global lr_warmup_steps, lr_warmup_ratio, caption_extension, keep_tokens, model_url
  supported_types = (".png", ".jpg", ".jpeg", ".webp", ".bmp")

  if model_url.startswith("/content/drive/") and not os.path.exists(model_url):
    print("Error: The custom training model you specified was not found in your Google Drive.")
    return

  print("\nChecking dataset...")
  if not project_name.strip() or any(c in project_name for c in " .()\"'\\/"):
    print("Error: Please choose a valid project name.")
    return

  # Find the folders and files
  if custom_dataset:
    try:
      datconf = toml.loads(custom_dataset)
      datasets = list(datconf["datasets"][0]["subsets"])
    except Exception:
      print("Error: Your custom dataset is invalid or contains an error! Please check the original template.")
      return
    reg = [d.get("image_dir") for d in datasets if d.get("is_reg", False)]
    datasets_dict = {d["image_dir"]: d["num_repeats"] for d in datasets}
    folders = datasets_dict.keys()
    # Check that every folder exists before listing it; os.listdir raises on a missing path
    for folder in folders:
      if not os.path.exists(folder):
        print(f"Error: The folder {folder.replace('/content/drive/', '')} doesn't exist.")
        return
    files = [f for folder in folders for f in os.listdir(folder)]
    images_repeats = {folder: (len([f for f in os.listdir(folder) if f.lower().endswith(supported_types)]), datasets_dict[folder]) for folder in folders}
  else:
    reg = []
    folders = [images_folder]
    files = os.listdir(images_folder)
    images_repeats = {images_folder: (len([f for f in files if f.lower().endswith(supported_types)]), num_repeats)}

  # Validation
  for folder in folders:
    if not os.path.exists(folder):
      print(f"Error: The folder {folder.replace('/content/drive/', '')} doesn't exist.")
      return
  for folder, (img, rep) in images_repeats.items():
    if not img:
      print(f"Error: Your {folder.replace('/content/drive/', '')} folder is empty.")
      return
  test_files = []
  for f in files:
    if not f.lower().endswith((caption_extension, ".npz")) and not f.lower().endswith(supported_types):
      print(f"Error: Invalid file in dataset: \"{f}\". Aborting.")
      return
    for ff in test_files:
      if f.endswith(supported_types) and ff.endswith(supported_types) \
          and os.path.splitext(f)[0] == os.path.splitext(ff)[0]:
        print(f"Error: The files {f} and {ff} cannot have the same name. Aborting.")
        return
    test_files.append(f)

  if caption_extension and not [txt for txt in files if txt.lower().endswith(caption_extension)]:
    caption_extension = ""
  if continue_from_lora and not (continue_from_lora.endswith(".safetensors") and os.path.exists(continue_from_lora)):
    print(f"Error: Invalid path to existing Lora. Example: /content/drive/MyDrive/Loras/example.safetensors")
    return

  # Pretty stuff

  pre_steps_per_epoch = sum(img*rep for (img, rep) in images_repeats.values())
  steps_per_epoch = pre_steps_per_epoch/train_batch_size
  total_steps = max_train_steps or int(max_train_epochs*steps_per_epoch)
  estimated_epochs = int(total_steps/steps_per_epoch)
  lr_warmup_steps = int(total_steps*lr_warmup_ratio)

  for folder, (img, rep) in images_repeats.items():
    print(folder.replace("/content/drive/", "") + (" (Regularization)" if folder in reg else ""))
    print(f"Found {img} images with {rep} repeats, equaling {img*rep} steps.")
  print(f"Divide {pre_steps_per_epoch} steps by {train_batch_size} batch size to get {steps_per_epoch} steps per epoch.")
  if max_train_epochs:
    print(f"There will be {max_train_epochs} epochs, for around {total_steps} total training steps.")
  else:
    print(f"There will be {total_steps} steps, divided into {estimated_epochs} epochs and then some.")

  if total_steps > 10000:
    print("Error: Your total steps are too high. You probably made a mistake. Aborting...")
    return

  return True

def create_config():
  global dataset_config_file, config_file, model_file

  if override_config_file:
    config_file = override_config_file
    print(f"\nUsing custom config file {config_file}")
  else:
    config_dict = {
      "network_arguments": {
        "unet_lr": unet_lr,
        "text_encoder_lr": text_encoder_lr if not cache_text_encoder_outputs else 0,
        "network_dim": network_dim,
        "network_alpha": network_alpha,
        "network_module": network_module,
        "network_args": network_args,
        "network_train_unet_only": text_encoder_lr == 0 or cache_text_encoder_outputs,
        "network_weights": continue_from_lora if continue_from_lora else None
      },
      "optimizer_arguments": {
        "learning_rate": unet_lr,
        "lr_scheduler": lr_scheduler,
        "lr_scheduler_num_cycles": lr_scheduler_num_cycles if lr_scheduler == "cosine_with_restarts" else None,
        "lr_scheduler_power": lr_scheduler_power if lr_scheduler == "polynomial" else None,
        "lr_warmup_steps": lr_warmup_steps if lr_scheduler != "constant" else None,
        "optimizer_type": optimizer,
        "optimizer_args": optimizer_args if optimizer_args else None,
      },
      "training_arguments": {
        "pretrained_model_name_or_path": model_file,
        "vae": vae_file,
        "max_train_steps": max_train_steps,
        "max_train_epochs": max_train_epochs,
        "train_batch_size": train_batch_size,
        "seed": 42,
        "max_token_length": 225,
        "xformers": cross_attention == "xformers",
        "sdpa": cross_attention == "sdpa",
        "min_snr_gamma": min_snr_gamma if min_snr_gamma > 0 else None,
        "lowram": LOWRAM,
        "no_half_vae": True,
        "gradient_checkpointing": True,
        "gradient_accumulation_steps": 1,
        "max_data_loader_n_workers": 8,
        "persistent_data_loader_workers": True,
        "mixed_precision": mixed_precision,
        "full_bf16": mixed_precision == "bf16",
        "cache_latents": cache_latents,
        "cache_latents_to_disk": cache_latents_to_drive,
        "cache_text_encoder_outputs": cache_text_encoder_outputs,
        "min_timestep": 0,
        "max_timestep": 1000,
        "prior_loss_weight": 1.0,
        "multires_noise_iterations": 6 if multinoise else None,
        "multires_noise_discount": 0.3 if multinoise else None,
      },
      "saving_arguments": {
        "save_precision": "fp16",
        "save_model_as": "safetensors",
        "save_every_n_epochs": save_every_n_epochs,
        "save_last_n_epochs": keep_only_last_n_epochs,
        "output_name": project_name,
        "output_dir": output_folder,
        "log_prefix": project_name,
        "logging_dir": log_folder,
        "wandb_api_key": None,
        "log_with": None,
      }
    }

    for key in config_dict:
      if isinstance(config_dict[key], dict):
        config_dict[key] = {k: v for k, v in config_dict[key].items() if v is not None}

    with open(config_file, "w") as f:
      f.write(toml.dumps(config_dict))
    print(f"\nConfig saved to {config_file}")

  if override_dataset_config_file:
    dataset_config_file = override_dataset_config_file
    print(f"Using custom dataset config file {dataset_config_file}")
  else:
    dataset_config_dict = {
      "general": {
        "resolution": resolution,
        "shuffle_caption": shuffle_caption and not cache_text_encoder_outputs,
        "keep_tokens": keep_tokens,
        "flip_aug": False,
        "caption_extension": caption_extension,
        "enable_bucket": True,
        "bucket_no_upscale": False,
        "bucket_reso_steps": 64,
        "min_bucket_reso": 256,
        "max_bucket_reso": 4096,
      },
      "datasets": toml.loads(custom_dataset)["datasets"] if custom_dataset else [
        {
          "subsets": [
            {
              "num_repeats": num_repeats,
              "image_dir": images_folder,
              "class_tokens": None if caption_extension else project_name
            }
          ]
        }
      ]
    }

    for key in dataset_config_dict:
      if isinstance(dataset_config_dict[key], dict):
        dataset_config_dict[key] = {k: v for k, v in dataset_config_dict[key].items() if v is not None}

    with open(dataset_config_file, "w") as f:
      f.write(toml.dumps(dataset_config_dict))
    print(f"Dataset config saved to {dataset_config_file}")

def download_model():
  global old_model_url, model_url, model_file, vae_url, vae_file
  real_model_url = model_url  # There was a reason for having a separate variable but I forgot what it was.

  if real_model_url.startswith("/content/drive/"):
    # Local model, already checked to exist
    model_file = real_model_url
    print(f"Using local model file: {model_file}")
    # Validation
    if model_file.lower().endswith(".safetensors"):
      from safetensors.torch import load_file as load_safetensors
      try:
        test = load_safetensors(model_file)
        del test
      except:
        return False
    elif model_file.lower().endswith(".ckpt"):
      from torch import load as load_ckpt
      try:
        test = load_ckpt(model_file)
        del test
      except:
        return False
    return True

  else:
    # Downloadable model
    if load_diffusers:
      if 'huggingface.co' in real_model_url:
          match = re.search(r'huggingface\.co/([^/]+)/([^/]+)', real_model_url)
          if match:
              username = match.group(1)
              model_name = match.group(2)
              model_file = f"{username}/{model_name}"
              fs = HfFileSystem()
              existing_folders = set(fs.ls(model_file, detail=False))
              necessary_folders = [ "scheduler", "text_encoder", "text_encoder_2", "tokenizer", "tokenizer_2", "unet", "vae" ]
              if all(f"{model_file}/{folder}" in existing_folders for folder in necessary_folders):
                print("Diffusers model identified.")  # Will be handled by kohya
                return True
      raise ValueError("Failed to load Diffusers model. If this model is not Diffusers, have you tried turning it off at the top of the colab?")

    # Define local filename
    if not model_file:
      if real_model_url.lower().endswith((".ckpt", ".safetensors")):
        model_file = f"/content{real_model_url[real_model_url.rfind('/'):]}"
      else:
        model_file = "/content/downloaded_model.safetensors"
        if os.path.exists(model_file):
          !rm "{model_file}"

    # HuggingFace
    if m := re.search(r"(?:https?://)?(?:www\.)?huggingface\.co/[^/]+/[^/]+/blob", real_model_url):
      # Only rewrite the path segment, not a "blob" that happens to appear in a file name
      real_model_url = real_model_url.replace("/blob/", "/resolve/", 1)
    # Civitai
    elif m := re.search(r"(?:https?://)?(?:www\.)?civitai\.com/models/([0-9]+)(/[A-Za-z0-9_-]+)?", real_model_url):
      if m.group(2):
        model_file = f"/content{m.group(2)}.safetensors"
      if m := re.search(r"modelVersionId=([0-9]+)", real_model_url):
        real_model_url = f"https://civitai.com/api/download/models/{m.group(1)}"
      else:
        raise ValueError("optional_custom_training_model contains a civitai link, but the link doesn't include a modelVersionId. You can also right click the download button to copy the direct download link.")

    # Download checkpoint
    !aria2c "{real_model_url}" --console-log-level=warn -c -s 16 -x 16 -k 10M -d / -o "{model_file}"

    # Download VAE
    if not os.path.exists(vae_file):
      !aria2c "{vae_url}" --console-log-level=warn -c -s 16 -x 16 -k 10M -d / -o "{vae_file}"

    # Validation

    if model_file.lower().endswith(".safetensors"):
      from safetensors.torch import load_file as load_safetensors
      try:
        test = load_safetensors(model_file)
        del test
      except:
        new_model_file = os.path.splitext(model_file)[0]+".ckpt"
        !mv "{model_file}" "{new_model_file}"
        model_file = new_model_file
        print(f"Renamed model to {os.path.splitext(model_file)[0]}.ckpt")

    if model_file.lower().endswith(".ckpt"):
      from torch import load as load_ckpt
      try:
        test = load_ckpt(model_file)
        del test
      except:
        return False

  return True

def main():
  global dependencies_installed

  if COLAB and not os.path.exists('/content/drive'):
    from google.colab import drive
    print("Connecting to Google Drive...")
    drive.mount('/content/drive')

  # "folder" instead of "dir" so the builtin dir() is not shadowed
  for folder in (main_dir, deps_dir, repo_dir, log_folder, images_folder, output_folder, config_folder):
    os.makedirs(folder, exist_ok=True)

  if not validate_dataset():
    return

  if not dependencies_installed:
    print("\nInstalling dependencies...\n")
    t0 = time()
    install_dependencies()
    t1 = time()
    dependencies_installed = True
    print(f"\nInstallation finished in {int(t1-t0)} seconds.")
  else:
    print("\nDependencies already installed.")

  if old_model_url != model_url or not model_file or not os.path.exists(model_file):
    print("\nGetting model...")
    if not download_model():
      print("\nError: The model you specified is invalid or corrupted."
            "\nIf you're using an URL, please check that the model is accessible without being logged in."
            "\nYou can try civitai or huggingface URLs, or a path in your Google Drive starting with /content/drive/MyDrive")
      return
    print()
  else:
    print("\nModel already downloaded.\n")

  create_config()

  print("\nStarting trainer...\n")
  os.chdir(repo_dir)

  !accelerate launch --quiet --config_file={accelerate_config_file} --num_cpu_threads_per_process=1 train_network_xl_wrapper.py --dataset_config={dataset_config_file} --config_file={config_file}

  if not get_ipython().__dict__['user_ns']['_exit_code']:
    display(Markdown("### Done! [Go download your Lora from Google Drive](https://drive.google.com/drive/my-drive)\n"
                     "### There will be several files, you should try the latest version (the file with the largest number next to it)"))

main()


📂 Connecting to Google Drive...
Mounted at /content/drive

💿 Checking dataset...
📁MyDrive/Loras/WLOP/dataset
📈 Found 309 images with 1 repeats, equaling 309 steps.
📉 Divide 309 steps by 1 batch size to get 309.0 steps per epoch.
🔮 There will be 10 epochs, for around 3090 total training steps.

🏭 Installing dependencies...

Cloning into '/content/kohya-trainer'...
remote: Enumerating objects: 2441, done.
remote: Counting objects: 100% (1045/1045), done.
remote: Compressing objects: 100% (273/273), done.
remote: Total 2441 (delta 907), reused 772 (delta 772), pack-reused 1396 (from 1)
Receiving objects: 100% (2441/2441), 4.13 MiB | 18.16 MiB/s, done.
Resolving deltas: 100% (1632/1632), done.
52 packages can be upgraded. Run 'apt list --upgradable' to see them.
W: Skipping acquire of configured file 'main/source/Sources' as repository 'https://r2u.stat.illinois.edu/ubuntu jammy InRelease' does not seem to provide it (sources.list entry misspelt?)
The following a

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


🍃 Diffusers model identified.


📄 Config saved to /content/drive/MyDrive/Loras/WLOP/training_config.toml
📄 Dataset config saved to /content/drive/MyDrive/Loras/WLOP/dataset_config.toml

⭐ Starting trainer...

Loading settings from /content/drive/MyDrive/Loras/WLOP/training_config.toml...
/content/drive/MyDrive/Loras/WLOP/training_config
prepare tokenizers
vocab.json: 100% 961k/961k [00:00<00:00, 4.10MB/s]
merges.txt: 100% 525k/525k [00:00<00:00, 2.50MB/s]
special_tokens_map.json: 100% 389/389 [00:00<00:00, 2.67MB/s]
tokenizer_config.json: 100% 905/905 [00:00<00:00, 6.24MB/s]
vocab.json: 100% 862k/862k [00:00<00:00, 1.80MB/s]
merges.txt: 100% 525k/525k [00:00<00:00, 2.50MB/s]
special_tokens_map.json: 100% 389/389 [00:00<00:00, 2.32MB/s]
tokenizer_config.json: 100% 904/904 [00:00<00:00, 4.86MB/s]
update token length: 225
Loading dataset config from /content/drive/MyDrive/Loras/WLOP/dataset_config.toml
prepare images.
found directory /content/drive/MyDrive/Loras/WLOP/dataset contains 309 
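
The step counts printed in the log above can be reproduced by hand. The following is a minimal sketch that mirrors the arithmetic in `validate_dataset`; the function name `estimate_steps` is made up for illustration and is not part of the notebook.

```python
def estimate_steps(images_repeats, train_batch_size, max_train_epochs=None,
                   max_train_steps=None, lr_warmup_ratio=0.05):
    """Mirror the step arithmetic used by validate_dataset (illustrative helper).

    images_repeats: list of (image_count, num_repeats) pairs, one per folder.
    """
    # Each image counts once per repeat
    pre_steps_per_epoch = sum(img * rep for img, rep in images_repeats)
    # A batch of N images is consumed in a single step
    steps_per_epoch = pre_steps_per_epoch / train_batch_size
    # An explicit step budget wins; otherwise derive it from the epoch count
    total_steps = max_train_steps or int(max_train_epochs * steps_per_epoch)
    return {
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "lr_warmup_steps": int(total_steps * lr_warmup_ratio),
    }

# Values from the logged run: 309 images, 1 repeat, batch size 1, 10 epochs
print(estimate_steps([(309, 1)], train_batch_size=1, max_train_epochs=10))
```

With the logged values this gives 309.0 steps per epoch and 3090 total steps, matching the log. At the default `lr_warmup_ratio` of 0.05 the warmup comes to 154 steps.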


| **Parameter** | **Effect** | **Example** |
|---------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------|
| `shuffle_tags` | Randomly shuffles the tags in each caption during training, which helps avoid overfitting. | `True` or `False` |
| `activation_tags` | Number of tags at the start of each caption file that stay in place when tags are shuffled (passed to the trainer as `keep_tokens`). These leading tags act as the activation tags for the LoRA. | `1` (for instance, a caption that starts with `WLOP`) |
| `num_repeats` | Number of times each image is repeated per epoch. Useful for giving more weight to important or scarce data. | 3 repeats × 100 images = 300 images per epoch |
| `unet_lr` | Learning rate for the UNet, the denoising network at the core of Stable Diffusion. This value affects how accurately the LoRA learns. | `1e-4` |
| `text_encoder_lr` | Learning rate for the text encoder, the part of the model that encodes the prompt. | `5e-5` |
| `lr_scheduler` | Algorithm that adjusts the learning rate over time. | `cosine` lowers the learning rate gradually along a cosine curve; `linear` lowers it along a straight line. |
| `lr_scheduler_number` | Number of cycles for `cosine_with_restarts`, or the power for `polynomial`. The other schedulers ignore it. | `3` with `cosine_with_restarts` runs the cosine schedule 3 times, spaced evenly over training. |
| `lr_warmup_ratio` | Fraction of the total steps spent warming up, during which the learning rate ramps up gradually so the model is not hit with the full rate at once. Not applied with the `constant` scheduler. | `0.05` (5% of the total steps) |
| `min_snr_gamma` | Min-SNR loss weighting: caps the loss weight of low-noise timesteps so they do not dominate training. Lower values apply a stronger cap; `0` disables it in this notebook. | `5.0` or `8.0` |
| `multinoise` | Enables multi-resolution noise, which adds noise at several scales during training (this notebook uses 6 iterations with a 0.3 discount). | `True` or `False` |
| `network_dim` | The rank of the LoRA matrices, which sets the capacity of the LoRA. A larger rank needs more resources. This notebook caps it at 32. | `16`, `32` |
| `network_alpha` | Scaling factor for the LoRA weight updates; the effective scale is `network_alpha / network_dim`, so it controls how fast the learned weights change. | `8` (half of a `network_dim` of 16) |
| `train_batch_size` | Number of samples processed together in one step. Larger batches reduce the number of steps per epoch but need more resources. With a batch size of n, the model learns from n images at once. | `1`, `4`, `8` |
| `cross_attention` | Which implementation of the attention mechanism to use during training. | `"sdpa"` or `"xformers"` |
| `optimizer` | Algorithm used to update the weights. `AdamW8bit` keeps its optimizer state in 8 bits, so it uses less memory than `AdamW`. | `"AdamW8bit"`, `"Prodigy"`, `"AdaFactor"` |
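
To make the relationship between `network_dim` and `network_alpha` concrete, here is a small sketch. It assumes the usual LoRA convention, used by kohya-ss `networks.lora`, where the low-rank update is multiplied by `alpha / dim`. The helper names and the 1280×1280 layer size are hypothetical, chosen only for illustration.

```python
def lora_scale(network_dim, network_alpha):
    """Multiplier applied to the low-rank update: alpha / rank."""
    return network_alpha / network_dim

def lora_param_count(in_features, out_features, network_dim):
    """Trainable parameters for one linear layer.

    The down matrix is (dim x in_features) and the up matrix is
    (out_features x dim), so the total is dim * (in + out).
    """
    return network_dim * (in_features + out_features)

# Notebook defaults: network_dim=16, network_alpha=8
print("scale:", lora_scale(16, 8))

# Hypothetical 1280x1280 linear layer, compared with training the full matrix
print("LoRA params:", lora_param_count(1280, 1280, 16))
print("full params:", 1280 * 1280)
```

With the defaults the update is scaled by 0.5. When `recommended_values` is on and the optimizer is Prodigy or a DAdapt variant, the notebook sets `network_alpha = network_dim`, which makes the scale 1.0.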