**DREAMBOOTH TRAINING**


This notebook sets up and launches a **DreamBooth fine-tuning process** using Hugging Face’s official diffusers implementation (*train_dreambooth.py*).

DreamBooth is a technique for fine-tuning text-to-image diffusion models (like Stable Diffusion) on a small set of images representing a specific subject or concept. Once trained, the model can generate novel images of that subject using a custom textual prompt.

In this project, DreamBooth is used to teach the model how to generate more realistic samples of crops and weeds, using class-specific images and instance prompts.

This notebook defines the necessary configuration and launches the training via the accelerate CLI.

The following cell writes a default configuration file for Hugging Face’s accelerate library. The `accelerate launch` command reads this file when starting training jobs, including distributed ones.

In [1]:
from accelerate.utils import write_basic_config
write_basic_config()



PosixPath('/home/pasquale/.cache/huggingface/accelerate/default_config.yaml')

The following block launches the DreamBooth training process using the Stable Diffusion v1.5 model. The fine-tuning is performed on a small set of images with a specific instance prompt. This prompt includes a unique identifier token, **"sks"** in this case, that acts as a placeholder for the learned concept. By training the model to associate *"sks crop"* (or *"sks weeds"* for the weeds model) with the visual features of the target plant (e.g., sugar beet crop), we can later generate new variations by invoking that token in different prompts.
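
As a small illustration of invoking the token in different prompts, the sketch below composes a few prompt variants around the identifier. The helper name `build_prompts` and the context phrases are hypothetical and chosen only for illustration; they are not part of the training script or of this project's data.

```python
# Hypothetical helper: compose inference prompts around the learned identifier.
IDENTIFIER = "sks"  # identifier token used in the instance prompt


def build_prompts(class_noun, contexts):
    """Return the bare instance prompt plus one variant per context phrase."""
    base = f"{IDENTIFIER} {class_noun}"
    return [base] + [f"{base} {context}" for context in contexts]


# The context phrases below are assumptions, not taken from the project.
prompts = build_prompts("crop", ["seen from above", "on dry soil"])
for p in prompts:
    print(p)
```

The first entry is the exact instance prompt used in training; the others keep the token and class noun intact and only append context.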

Unlike general-purpose fine-tuning, this training is **intentionally overfitted** to a single class (sugar beet crops or weeds, *depending on the instance directory*) under very specific visual conditions (e.g., background soil, plant morphology). The goal is not to generalize to other plants, but to allow the model to **reproduce the same plant in varied configurations**, such as different leaf arrangements or positions in the field.

To support this goal:
- Class prior preservation (used to reduce overfitting by mixing in generic class images) has been disabled.
- Class images, i.e. additional generic images of the broader category (e.g., other types of plants), have also been excluded from training.
- A relatively high number of training steps (4000) pushes the model to memorize the fine visual details of the target rather than an abstract notion of the class.
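
To make the first two points concrete, the sketch below shows which arguments would be appended to the launch command if prior preservation were turned back on. The flag names follow the diffusers `train_dreambooth.py` argument parser as we understand it, so check `--help` on your copy of the script. The function name, the `class_images` directory, the class prompt, and the default values are assumptions for illustration only.

```python
def prior_preservation_args(enabled, class_dir="class_images",
                            class_prompt="a photo of a plant",
                            num_class_images=200, loss_weight=1.0):
    """Extra CLI arguments for class prior preservation (empty when disabled)."""
    if not enabled:
        # Configuration used in this notebook: no class images, no prior loss.
        return []
    return [
        "--with_prior_preservation",
        f"--prior_loss_weight={loss_weight}",
        "--class_data_dir", class_dir,
        "--class_prompt", class_prompt,
        f"--num_class_images={num_class_images}",
    ]


print(prior_preservation_args(False))
print(prior_preservation_args(True))
```

With `enabled=False` nothing is added, which corresponds to the setup described above.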

Key points:
- Uses fp16 mixed precision for efficiency.
- Trains both the UNet and text encoder.
- Enables gradient checkpointing to save memory.
- Saves a checkpoint every 1000 steps for traceability.
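
The training settings imply some simple arithmetic, sketched below with the values from this notebook's training command. The instance-set size of 20 images is a made-up placeholder, and the `checkpoint-<step>` directory naming is an assumption about how the script lays out its output directory.

```python
train_batch_size = 1
gradient_accumulation_steps = 1
max_train_steps = 4000
checkpointing_steps = 1000
num_instance_images = 20  # placeholder; use the real count in the instance directory

# Per-process estimate; multiply by the number of processes for multi-GPU runs.
effective_batch_size = train_batch_size * gradient_accumulation_steps
samples_seen = effective_batch_size * max_train_steps
passes_per_image = samples_seen / num_instance_images

# Steps at which a checkpoint is written, and the assumed directory names.
checkpoint_steps = list(range(checkpointing_steps, max_train_steps + 1, checkpointing_steps))
checkpoint_dirs = [f"checkpoint-{step}" for step in checkpoint_steps]

print(effective_batch_size, samples_seen, passes_per_image)
print(checkpoint_dirs)
```

Under these placeholder numbers each image is revisited many times, which is consistent with the intent to overfit to the target class.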

In [2]:
import subprocess

MODEL_NAME = "runwayml/stable-diffusion-v1-5"
# instance_dir is the directory containing images of the subject to train on
INSTANCE_DIR = "subset_crops"
# output_dir is the directory where the trained model will be saved
OUTPUT_DIR = "model_crops"

command = [
    "accelerate", "launch", "train_dreambooth.py",
    "--pretrained_model_name_or_path", MODEL_NAME,
    "--instance_data_dir", INSTANCE_DIR,
    "--output_dir", OUTPUT_DIR,
    "--instance_prompt", "sks crop",
    "--resolution", "512",
    "--train_batch_size=1",
    "--gradient_accumulation_steps=1",
    "--learning_rate=1e-6",
    "--lr_scheduler=constant",
    "--lr_warmup_steps=0",
    "--max_train_steps=4000",
    "--train_text_encoder",
    "--mixed_precision=fp16",
    "--gradient_checkpointing",
    "--checkpointing_steps=1000"
]

subprocess.run(command)

/home/pasquale/projects/dreamAug/.venv/bin/python3: can't open file '/home/pasquale/projects/dreamAug/dreambooth/training/train_dreambooth.py': [Errno 2] No such file or directory
E1215 10:50:41.158000 3005196 torch/distributed/elastic/multiprocessing/api.py:882] failed (exitcode: 2) local_rank: 0 (pid: 3005302) of binary: /home/pasquale/projects/dreamAug/.venv/bin/python3
Traceback (most recent call last):
  File "/home/pasquale/projects/dreamAug/.venv/bin/accelerate", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/pasquale/projects/dreamAug/.venv/lib/python3.11/site-packages/accelerate/commands/acceler

CompletedProcess(args=['accelerate', 'launch', 'train_dreambooth.py', '--pretrained_model_name_or_path', 'runwayml/stable-diffusion-v1-5', '--instance_data_dir', 'subset_crops', '--output_dir', 'model_crops', '--instance_prompt', 'sks crop', '--resolution', '512', '--train_batch_size=1', '--gradient_accumulation_steps=1', '--learning_rate=1e-6', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--max_train_steps=4000', '--train_text_encoder', '--mixed_precision=fp16', '--gradient_checkpointing', '--checkpointing_steps=1000'], returncode=1)
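
The log above shows the launch failing because `train_dreambooth.py` was not found in the working directory. A minimal pre-flight check, sketched below, would catch this before `accelerate launch` is invoked. The helper name `preflight_problems` is hypothetical. The script itself is distributed in the diffusers repository under `examples/dreambooth/`, so it has to be copied next to the notebook or referenced by its full path.

```python
from pathlib import Path


def preflight_problems(script, instance_dir):
    """Return a list of human-readable problems; an empty list means OK to launch."""
    problems = []
    if not Path(script).is_file():
        problems.append(f"training script not found: {script}")
    data_dir = Path(instance_dir)
    if not data_dir.is_dir():
        problems.append(f"instance directory not found: {instance_dir}")
    elif not any(data_dir.iterdir()):
        problems.append(f"instance directory is empty: {instance_dir}")
    return problems


# Paths as used in the training cell, resolved against the current directory.
issues = preflight_problems("train_dreambooth.py", "subset_crops")
for issue in issues:
    print(issue)
```

Launching only when the returned list is empty avoids the multi-line elastic traceback and points directly at the missing file or folder.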

After fine-tuning, we can generate new images by using the same custom token (**"sks"**) from training in the prompt. This lets us visually inspect how well the model has learned the appearance of the target plant class and how it can reproduce it in varied but consistent ways.

In [None]:
from diffusers import StableDiffusionPipeline
import torch

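# Path to the fine-tuned model. The training cell saves to OUTPUT_DIR
# ("model_crops"); adjust this path if the model was not moved under "models/".
pipe = StableDiffusionPipeline.from_pretrained(
    "models/model_crops",
    torch_dtype=torch.float16,
).to("cuda")

# Same text as the instance prompt used in training
prompt = "sks crop"

# Note: training used a resolution of 512, while generation here is at 1024x1024.
images = pipe(
    prompt=prompt,
    num_inference_steps=300,
    guidance_scale=7.5,
    num_images_per_prompt=1,
    height=1024,
    width=1024,
).images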

# Show all images
for i, img in enumerate(images):
    img.show(title=f"Crop {i+1}")

# Save generated images
for i, img in enumerate(images):
    img.save(f"generated_crop_{i+1}.png")