## DreamBooth on Stable Diffusion 2.1

DreamBooth is a technique for personalizing latent diffusion models such as Stable Diffusion: the model is fine-tuned on a small set of images so that it can generate new images of a specific subject.

### Step 0: Environment configuration

This cell installs the project dependencies listed in `../requirements.txt`, which include the `diffusers` library (version 0.32.2 in the run shown here). The `diffusers` library is used to work with latent diffusion models such as Stable Diffusion, and provides tools for image generation, fine-tuning, and related tasks.


In [2]:
!pip install -r ../requirements.txt


Collecting diffusers (from -r ../requirements.txt (line 1))
  Downloading diffusers-0.32.2-py3-none-any.whl.metadata (18 kB)
Downloading diffusers-0.32.2-py3-none-any.whl (3.2 MB)
Installing collected packages: diffusers
Successfully installed diffusers-0.32.2


### Configuration and Secrets Loading
In this section, we load configuration parameters and API keys from separate YAML files. This separation helps maintain security by keeping sensitive information (API keys) separate from configuration settings.

- **config.yaml**: Contains non-sensitive configuration parameters like model sources and URLs
- **secrets.yaml**: Contains sensitive API keys for services like Galileo and Hugging Face


In [1]:
import os
import sys

# Add the src directory to the path to import utils
sys.path.append(os.path.abspath(os.path.join(os.getcwd(), "../..")))
from src.utils import load_config_and_secrets

config_path = "../../configs/config.yaml"
secrets_path = "../../configs/secrets.yaml"

config, secrets = load_config_and_secrets(config_path, secrets_path)
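
Later cells read entries such as `secrets["Galileo"]`, so it can help to fail early when an expected key is missing. The sketch below is a hypothetical helper (`require_keys` is not part of `src.utils`), and the key name in the example is an assumption based on the files described above.

```python
from typing import Iterable, Mapping


def require_keys(data: Mapping, required: Iterable[str], source: str = "secrets.yaml") -> None:
    """Raise a KeyError naming every required key that is missing or empty."""
    missing = [key for key in required if not data.get(key)]
    if missing:
        raise KeyError(f"{source} is missing required entries: {', '.join(missing)}")


# In-memory dict standing in for the parsed YAML file (hypothetical values).
example_secrets = {"Galileo": "dummy-key"}
require_keys(example_secrets, ["Galileo"])  # passes silently
```

In the notebook this could be called as `require_keys(secrets, ["Galileo"])` right after loading, assuming `secrets` is a plain dictionary.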

In [2]:
from core.custom_metrics.image_metrics_scorers import entropy_scorer, complexity_scorer, set_custom_image_path
from core.deploy.deploy_image_generation import deploy_model

from huggingface_hub import snapshot_download
import promptquality as pq
import glob

2025-03-31 10:47:09.908677: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2025-03-31 10:47:09.921524: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:467] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1743418029.936686    5260 cuda_dnn.cc:8579] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1743418029.941002    5260 cuda_blas.cc:1407] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
W0000 00:00:1743418029.953506    5260 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking 

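### Download the model locally

This cell uses the `snapshot_download` function from the `huggingface_hub` library (imported above) to download the latest revision of the `stabilityai/stable-diffusion-2-1` repository into a local directory (`local_model_path`). `resume_download=True` asks for an interrupted download to be resumed (recent `huggingface_hub` releases resume by default and mark this argument as deprecated), and `etag_timeout=60` allows up to 60 seconds when fetching each file's ETag metadata from the Hub. In the run recorded here, the download hit repeated network errors and ended with a `LocalEntryNotFoundError`.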

In [8]:

# Download the snapshot directly to the local directory
local_model_path = os.path.join("..", "..", "..", "local", "stable-diffusion-2-1")

# Downloading the latest revision of the "stabilityai/stable-diffusion-2-1" model
snapshot_download(
    repo_id="stabilityai/stable-diffusion-2-1", 
    local_dir=local_model_path,
    resume_download=True,
    etag_timeout=60  
)

Fetching 28 files:   0%|          | 0/28 [00:00<?, ?it/s]




Error while downloading from https://cas-bridge.xethub.hf.co/xet-bridge-us/638f7ae36c25af4071044105/91bc1dcc5e3b904ec97a6f6820714e50c9ae1d15909c83e9dce93eec1b1105fb?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&X-Amz-Credential=cas%2F20250331%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20250331T081117Z&X-Amz-Expires=3600&X-Amz-Signature=3fcea9c28213b8626083f45d457eb1c731e90e965c9f0857c41eb12b9ae4457e&X-Amz-SignedHeaders=host&X-Xet-Cas-Uid=public&response-content-disposition=inline%3B+filename*%3DUTF-8%27%27model.fp16.safetensors%3B+filename%3D%22model.fp16.safetensors%22%3B&x-id=GetObject&Expires=1743412277&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTc0MzQxMjI3N319LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2FzLWJyaWRnZS54ZXRodWIuaGYuY28veGV0LWJyaWRnZS11cy82MzhmN2FlMzZjMjVhZjQwNzEwNDQxMDUvOTFiYzFkY2M1ZTNiOTA0ZWM5N2E2ZjY4MjA3MTRlNTBjOWFlMWQxNTkwOWM4M2U5ZGNlOTNlZWMxYjExMDVmYioifV19&Signature=Vtb0XQvXgf0gg5OTw3AVCdyj4fHAMIEpGuu2tdT-ZRa

Error while downloading from https://cas-bridge.xethub.hf.co/xet-bridge-us/638f7ae36c25af4071044105/4355e13ec34e20cb1b3f4598d3725c757b9fd9b75294a770d9d2893bb45c94d4?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&X-Amz-Credential=cas%2F20250331%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20250331T081120Z&X-Amz-Expires=3600&X-Amz-Signature=34efeee6c4ffefb783fc6fddbe6b76ca01229227a272025a6537a1fc67fa937c&X-Amz-SignedHeaders=host&X-Xet-Cas-Uid=public&response-content-disposition=inline%3B+filename*%3DUTF-8%27%27diffusion_pytorch_model.bin%3B+filename%3D%22diffusion_pytorch_model.bin%22%3B&response-content-type=application%2Foctet-stream&x-id=GetObject&Expires=1743412280&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTc0MzQxMjI4MH19LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2FzLWJyaWRnZS54ZXRodWIuaGYuY28veGV0LWJyaWRnZS11cy82MzhmN2FlMzZjMjVhZjQwNzEwNDQxMDUvNDM1NWUxM2VjMzRlMjBjYjFiM2Y0NTk4ZDM3MjVjNzU3YjlmZDliNzUyOTRhNzcwZDlkMjg5M2JiNDVjOTRkNCo

Error while downloading from https://cas-bridge.xethub.hf.co/xet-bridge-us/638f7ae36c25af4071044105/aad6782caa56dceef3e3d4ecb22860950719c144edc4acce5758d81651d44aca?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&X-Amz-Credential=cas%2F20250331%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20250331T081121Z&X-Amz-Expires=3600&X-Amz-Signature=b4308b10a689f1098fcec6195fa2e9984eac49b06ddcec98d24f7234b5784296&X-Amz-SignedHeaders=host&X-Xet-Cas-Uid=public&response-content-disposition=inline%3B+filename*%3DUTF-8%27%27diffusion_pytorch_model.fp16.bin%3B+filename%3D%22diffusion_pytorch_model.fp16.bin%22%3B&response-content-type=application%2Foctet-stream&x-id=GetObject&Expires=1743412281&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTc0MzQxMjI4MX19LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2FzLWJyaWRnZS54ZXRodWIuaGYuY28veGV0LWJyaWRnZS11cy82MzhmN2FlMzZjMjVhZjQwNzEwNDQxMDUvYWFkNjc4MmNhYTU2ZGNlZWYzZTNkNGVjYjIyODYwOTUwNzE5YzE0NGVkYzRhY2NlNTc1OGQ4MTY1M

Error while downloading from https://cas-bridge.xethub.hf.co/xet-bridge-us/638f7ae36c25af4071044105/478a56be169a76980e1f5b9d4c58a4fd1d2b0d03b8610a43fb4d5461af33bc65?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&X-Amz-Credential=cas%2F20250331%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20250331T080300Z&X-Amz-Expires=3600&X-Amz-Signature=bf8f59b42db6043d32b22a0ffe4a30929aa8daf816443e1899b1f83cf87b0e80&X-Amz-SignedHeaders=host&X-Xet-Cas-Uid=public&response-content-disposition=inline%3B+filename*%3DUTF-8%27%27model.safetensors%3B+filename%3D%22model.safetensors%22%3B&x-id=GetObject&Expires=1743411780&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTc0MzQxMTc4MH19LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2FzLWJyaWRnZS54ZXRodWIuaGYuY28veGV0LWJyaWRnZS11cy82MzhmN2FlMzZjMjVhZjQwNzEwNDQxMDUvNDc4YTU2YmUxNjlhNzY5ODBlMWY1YjlkNGM1OGE0ZmQxZDJiMGQwM2I4NjEwYTQzZmI0ZDU0NjFhZjMzYmM2NSoifV19&Signature=mphODV0kyW-50D9MD0aBImSgUHIjADqbivDlN0VRfPXiXAW8l7bz5

Error while downloading from https://cas-bridge.xethub.hf.co/xet-bridge-us/638f7ae36c25af4071044105/c47332e4f0ac5fcb697197774d67bfcb17ee6b1e7e8359ac6f383f45fdc2c769?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&X-Amz-Credential=cas%2F20250331%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20250331T082359Z&X-Amz-Expires=3600&X-Amz-Signature=ee13925a40072bf023daeef1c488afdfab17d6b24c7087425b2dba55406fa9d9&X-Amz-SignedHeaders=host&X-Xet-Cas-Uid=public&response-content-disposition=inline%3B+filename*%3DUTF-8%27%27v2-1_768-ema-pruned.safetensors%3B+filename%3D%22v2-1_768-ema-pruned.safetensors%22%3B&x-id=GetObject&Expires=1743413039&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTc0MzQxMzAzOX19LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2FzLWJyaWRnZS54ZXRodWIuaGYuY28veGV0LWJyaWRnZS11cy82MzhmN2FlMzZjMjVhZjQwNzEwNDQxMDUvYzQ3MzMyZTRmMGFjNWZjYjY5NzE5Nzc3NGQ2N2JmY2IxN2VlNmIxZTdlODM1OWFjNmYzODNmNDVmZGMyYzc2OSoifV19&Signature=Qmx8QwZqsGMcv4oxhWRYsboUZ

LocalEntryNotFoundError: An error happened while trying to locate the file on the Hub and we cannot find the requested files in the local cache. Please check your connection and try again or make sure your Internet connection is on.
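
The run above ended with a `LocalEntryNotFoundError` after several download errors. One possible way to make this step more tolerant of transient network failures is to retry the call with an increasing delay. The following is a minimal sketch, not the notebook's own method: `retry_with_backoff` and `download_model` are hypothetical helpers, and the retry count and delays are arbitrary assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(fn: Callable[[], T], attempts: int = 3, base_delay: float = 5.0) -> T:
    """Call fn(); after a failure wait base_delay * 2**i seconds, then try again."""
    for i in range(attempts):
        try:
            return fn()
        except Exception as exc:  # broad on purpose: network errors vary by library
            if i == attempts - 1:
                raise
            delay = base_delay * (2 ** i)
            print(f"Attempt {i + 1} failed ({exc!r}); retrying in {delay:.0f}s")
            time.sleep(delay)
    raise ValueError("attempts must be at least 1")


def download_model(local_dir: str) -> str:
    """Download the repository snapshot, retrying on errors (hypothetical wrapper)."""
    from huggingface_hub import snapshot_download  # imported lazily

    return retry_with_backoff(
        lambda: snapshot_download(
            repo_id="stabilityai/stable-diffusion-2-1",
            local_dir=local_dir,
            etag_timeout=60,
        )
    )
```

`snapshot_download` also accepts `allow_patterns` and `ignore_patterns`, which could be used to skip weight formats that are not needed; which files can safely be skipped depends on how the pipeline is loaded, so that choice is left open here.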

### Step 1: Load the Model
We load the pre-trained Stable Diffusion 2.1 model, move it to the GPU, and run a baseline inference with a simple prompt before any fine-tuning.

In [3]:
from core.local_inference.inference import StableDiffusionPipelineOutput, load_config, run_inference

config = load_config()

run_inference(
    prompt="A beautiful landscape",
    height=768,
    width=768,
    num_images=1,
    num_inference_steps=60
)


Detected 1 GPU, using ../data/config/default_config_one-gpu.yaml
GPU 0: 9GB of available memory.


Taking `'Attention' object has no attribute 'key'` while using `accelerate.load_checkpoint_and_dispatch` to mean ../../../local/stable-diffusion-2-1/vae was saved with deprecated attention block weight names. We will load it with the deprecated attention block names and convert them on the fly to the new attention block format. Please re-save the model after this conversion, so we don't have to do the on the fly renaming in the future. If the model is from a hub checkpoint, please also re-upload it or open a PR on the original repository.


Average Inference Time: 340.27 seconds
Median Inference Time: 340.27 seconds
Min Inference Time: 340.27 seconds
Max Inference Time: 340.27 seconds
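
The timing summary above can be reproduced with the standard library. This is a sketch of one way such statistics might be collected; the implementation inside `run_inference` is not shown in this notebook, so `time_runs` and `summarize` are hypothetical names.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_runs(fn: Callable[[], object], runs: int = 1) -> List[float]:
    """Return the wall-clock duration in seconds of each call to fn()."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations: List[float]) -> Dict[str, float]:
    """Average, median, min and max of a non-empty list of durations."""
    return {
        "average": statistics.mean(durations),
        "median": statistics.median(durations),
        "min": min(durations),
        "max": max(durations),
    }


# A single measurement, as in the run above.
for name, value in summarize([340.27]).items():
    print(f"{name.capitalize()} Inference Time: {value:.2f} seconds")
```

With only one generated image there is a single measurement, so all four statistics coincide, which matches the identical values in the output above.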


## Step 3: Training DreamBooth

This Bash cell uses PyTorch to count the available GPUs, selects the multi-GPU or single-GPU `accelerate` configuration file accordingly, and launches the DreamBooth training script for Stable Diffusion with `accelerate launch` and the parameters shown. It also records start and end timestamps to compute the training duration in seconds.

In [None]:
%%bash
NUM_GPUS=$(python3 -c "import torch; print(torch.cuda.device_count())")

if [ "$NUM_GPUS" -ge 2 ]; then
  CONFIG_FILE="../data/config/default_config_multi-gpu.yaml"
  echo "Detected $NUM_GPUS GPUs, using $CONFIG_FILE"
else
  CONFIG_FILE="../data/config/default_config_one-gpu.yaml"
  echo "Detected $NUM_GPUS GPU, using $CONFIG_FILE"
fi

START=$(date +%s)

accelerate launch --config_file "$CONFIG_FILE" core/train/train_dreambooth_aistudio.py \
  --pretrained_model_name_or_path="stabilityai/stable-diffusion-2-1"  \
  --instance_data_dir="../data/img" \
  --output_dir="./dreambooth/" \
  --instance_prompt="A modern laptop on a sandy beach with the ocean in the background, sunlight reflecting off the screen" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=400 \
  --logging_dir="/phoenix/tensorboard/tensorlogs" \
  --report_to="tensorboard" \
  --validation_prompt="A photo of an HP laptop on the sand with a sunset over the ocean in the background." \
  --num_validation_images=1 \
  --validation_steps=100

END=$(date +%s)
DIFF=$(( END - START ))
echo "Training took $DIFF seconds"


Detected 1 GPU, using ../data/config/default_config_one-gpu.yaml


2025-03-31 11:07:46.923189: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2025-03-31 11:07:46.942979: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:467] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1743419266.960826    5937 cuda_dnn.cc:8579] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1743419266.966299    5937 cuda_blas.cc:1407] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
W0000 00:00:1743419266.980923    5937 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking 

[2025-03-31 11:19:16,394] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)


df: /home/jovyan/.triton/autotune: No such file or directory
03/31/2025 11:20:21 - INFO - __main__ - ***** Running training *****
03/31/2025 11:20:21 - INFO - __main__ -   Num examples = 4
03/31/2025 11:20:21 - INFO - __main__ -   Num batches each epoch = 4
03/31/2025 11:20:21 - INFO - __main__ -   Num Epochs = 100
03/31/2025 11:20:21 - INFO - __main__ -   Instantaneous batch size per device = 1
03/31/2025 11:20:21 - INFO - __main__ -   Total train batch size (w. parallel, distributed & accumulation) = 1
03/31/2025 11:20:21 - INFO - __main__ -   Gradient Accumulation steps = 1
03/31/2025 11:20:21 - INFO - __main__ -   Total optimization steps = 400
03/31/2025 11:20:21 - INFO - __main__ - Checking validation condition: step 0, validation every 100 steps




Steps:  25%|██▌       | 100/400 [37:08<1:31:06, 18.22s/it, loss=0.069, lr=5e-6]
03/31/2025 11:57:30 - INFO - __main__ - Running validation at step 100.
03/31/2025 11:57:30 - INFO - __main__ - Starting validation process.
03/31/2025 11:57:30 - INFO - __main__ - Generating 1 images with prompt: A photo of an HP laptop on the sand with a sunset over the ocean in the background..

Fetching 3 files: 100%|██████████| 3/3 [00:02<00:00,  1.12it/s]
{'image_encoder'} was not found in config. Values will be initialized to default values.

Loading pipeline components...:  33%|███▎      | 2/6 [00:06<00:13,  3.41s/it]
{'sample_max_value', 'dynamic_thresholding_ratio', 'timestep_spacing', 'rescale_betas_zero_snr', 'clip_sample_range', 'thresholdin
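
The training log reports 4 examples, a batch size of 1, 1 gradient accumulation step, 400 optimization steps, and 100 epochs. The arithmetic linking these numbers can be sketched as follows. This mirrors the calculation commonly found in Hugging Face example training scripts and is assumed, not verified, to match `train_dreambooth_aistudio.py`; `epochs_for_steps` is a hypothetical name.

```python
import math


def epochs_for_steps(num_examples: int, batch_size: int, grad_accum_steps: int, max_train_steps: int) -> int:
    """Number of passes over the data needed to reach max_train_steps optimizer updates."""
    batches_per_epoch = math.ceil(num_examples / batch_size)
    updates_per_epoch = math.ceil(batches_per_epoch / grad_accum_steps)
    return math.ceil(max_train_steps / updates_per_epoch)


print(epochs_for_steps(num_examples=4, batch_size=1, grad_accum_steps=1, max_train_steps=400))  # 100

# Rough duration estimate from the 18.22 s/it shown in the progress bar.
estimated_hours = 400 * 18.22 / 3600
print(f"Estimated training time: about {estimated_hours:.1f} hours")
```

The estimate ignores the time spent in validation every 100 steps, so the real run would likely take somewhat longer.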

## Inference with the Local Model

This cell imports helper functions from the `inference_dreambooth` module, loads the configuration, and runs inference to generate images. It uses a single prompt to create three images at a resolution of 768x768 pixels, with 100 inference steps per image.

In [None]:
from core.dreambooth_inference.inference_dreambooth import StableDiffusionPipelineOutput, load_config, run_inference

config = load_config()

run_inference(
    prompt="A high-quality photo of an HP laptop placed on the sand at the beach, with a sunset over the ocean in the background.", 
    height=768, 
    width=768, 
    num_images=3, 
    num_inference_steps=100  
)


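## Galileo Evaluate Custom Metrics
Galileo GenAI Studio supports Custom Metrics (programmatic or GPT-based) for all your Evaluate and Observe projects. Here, two programmatic scorers, `entropy_scorer` and `complexity_scorer`, are applied to a generated image.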

In [None]:

import yaml

#########################################
# To connect to Galileo, create a secrets.yaml file in the same folder as this notebook.
# The file should contain an entry called Galileo holding your personal Galileo API key.
# Galileo API keys can be created at https://console.hp.galileocloud.io/settings/api-keys
#########################################

with open('secrets.yaml') as file:
    secrets = yaml.safe_load(file)

os.environ['GALILEO_API_KEY'] = secrets["Galileo"]
os.environ['GALILEO_CONSOLE_URL'] = "https://console.hp.galileocloud.io/"

pq.login(os.environ['GALILEO_CONSOLE_URL'])

In [None]:

#########################################
# Returns the path of the most recently modified image that matches the specified pattern.
#########################################

def get_latest_generated_image(directory: str = "./", prefix: str = "local_model_result_", ext: str = ".png") -> str:
    files = glob.glob(os.path.join(directory, f"{prefix}*{ext}"))
    if not files:
        raise FileNotFoundError("No generated images found.")
    latest_file = max(files, key=os.path.getmtime)
    return latest_file
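
To see how the helper behaves, the self-contained sketch below repeats its logic and exercises it in a temporary directory. The file names `local_model_result_0.png` and `local_model_result_1.png` are hypothetical; only the prefix and extension come from the function's defaults.

```python
import glob
import os
import tempfile


def get_latest_generated_image(directory: str = "./", prefix: str = "local_model_result_", ext: str = ".png") -> str:
    """Return the most recently modified file matching prefix*ext in directory."""
    files = glob.glob(os.path.join(directory, f"{prefix}*{ext}"))
    if not files:
        raise FileNotFoundError("No generated images found.")
    return max(files, key=os.path.getmtime)


with tempfile.TemporaryDirectory() as tmp:
    older = os.path.join(tmp, "local_model_result_0.png")
    newer = os.path.join(tmp, "local_model_result_1.png")
    for path in (older, newer):
        open(path, "wb").close()  # empty placeholder files
    # Set explicit modification times so the ordering does not depend on timing.
    os.utime(older, (1_000, 1_000))
    os.utime(newer, (2_000, 2_000))
    latest = get_latest_generated_image(directory=tmp)
    print(os.path.basename(latest))  # local_model_result_1.png
```

Because selection is based on modification time, an older image that is touched or re-saved later would be picked instead of the newest generation, which is worth keeping in mind when several runs write to the same directory.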

In [None]:

config = load_config()

prompt_text = ("A high-quality photo of an HP laptop placed on the sand at the beach, "
               "with a sunset over the ocean in the background.")

run_inference(
    prompt=prompt_text, 
    height=768, 
    width=768, 
    num_images=1, 
    num_inference_steps=100  
)

generated_image_path = get_latest_generated_image()

set_custom_image_path(generated_image_path)

template = prompt_text

result_custom = pq.run(template=template, scorers=[entropy_scorer, complexity_scorer])
print("Result:", result_custom)
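
The implementation of `entropy_scorer` is not shown in this notebook. As an illustration of what an entropy-style image metric might compute, the sketch below calculates the Shannon entropy of pixel intensities. It is an assumption about the general idea, not the project's actual scorer, and `shannon_entropy` is a hypothetical name.

```python
import math
from collections import Counter
from typing import Iterable


def shannon_entropy(pixels: Iterable[int]) -> float:
    """Entropy in bits of the empirical distribution of pixel values."""
    counts = Counter(pixels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("pixels must not be empty")
    # p * log2(1/p) summed over every distinct pixel value.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


flat = [128] * 16        # uniform grey patch: a single value, no uncertainty
checker = [0, 255] * 8   # two equally likely values
print(shannon_entropy(flat))     # 0.0
print(shannon_entropy(checker))  # 1.0
```

Under this definition, a flat image scores 0 bits and an image whose 256 grey levels are all equally frequent scores 8 bits, so higher values indicate more varied pixel content.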

## Model Service

Using MLflow, we save and load the model in an integrated way, which supports traceability and reproducibility of experiments. MLflow also helps with model versioning, monitoring, and deployment.

In [None]:
deploy_model()