# Fine Tune Stable Diffusion

Fine-tuning Stable Diffusion on Pokemon. 
For more details, see the [Lambda Labs examples repo](https://github.com/LambdaLabsML/examples). 

We recommend using a multi-GPU machine, for example an instance from [Lambda GPU Cloud](https://lambdalabs.com/service/gpu-cloud). If running on Colab, this notebook is likely to need a GPU with >16GB of VRAM and a high-RAM runtime, which will almost certainly require Colab Pro or Pro+. (If you get errors such as `Killed` or `CUDA out of memory`, then one of these is not sufficient.)

## Check GPU
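
One way to see which GPUs are visible and how much memory each has is to query `nvidia-smi`. The sketch below assumes `nvidia-smi` is on the PATH and returns an empty list otherwise. The helper names `parse_gpu_csv` and `query_gpus` are hypothetical, and the 16 GB threshold is taken from the recommendation above rather than from any hard requirement.

```python
import shutil
import subprocess

# Assumed threshold, based on the ">16GB of VRAM" recommendation above.
MIN_VRAM_MIB = 16 * 1024


def parse_gpu_csv(text):
    """Parse 'name, memory_mib' lines into (name, memory_mib) tuples."""
    gpus = []
    for line in text.strip().splitlines():
        name, _, mem = line.rpartition(",")
        try:
            gpus.append((name.strip(), int(mem.strip())))
        except ValueError:
            # Skip lines that do not end in an integer memory value.
            continue
    return gpus


def query_gpus():
    """Return GPU info from nvidia-smi, or [] if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,memory.total",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=False,
        )
    except OSError:
        return []
    if result.returncode != 0:
        return []
    return parse_gpu_csv(result.stdout)


for name, mem_mib in query_gpus():
    status = "ok" if mem_mib > MIN_VRAM_MIB else "may be too small"
    print(f"{name}: {mem_mib} MiB ({status})")
```

If the loop prints nothing, either no NVIDIA driver is installed or `nvidia-smi` could not be run.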

In [1]:
!ls

DreamBooth_Stable_Diffusion4pro.ipynb  main.py
LICENSE				       models
README.md			       my-dataset
assets				       nft-finetune.ipynb
configs				       notebook_helpers.py
data				       notebooks
data_check.ipynb		       pokemon_finetune.ipynb
examples			       requirements.txt
im-examples			       scripts
latent_diffusion.egg-info	       setup.py
ldm				       src


In [9]:
!git clone https://github.com/Jumabek/stable-diffusion.git

Cloning into 'stable-diffusion'...
remote: Enumerating objects: 1631, done.
remote: Counting objects: 100% (28/28), done.
remote: Compressing objects: 100% (24/24), done.
remote: Total 1631 (delta 3), reused 15 (delta 2), pack-reused 1603
Receiving objects: 100% (1631/1631), 73.93 MiB | 19.00 MiB/s, done.
Resolving deltas: 100% (982/982), done.


In [1]:

!pip install --upgrade pip
!pip install -r requirements.txt

!pip install --upgrade keras # on lambda stack we need to upgrade keras
!pip uninstall -y torchtext # on colab we need to remove torchtext

Defaulting to user installation because normal site-packages is not writeable
Collecting pip
  Downloading pip-22.3-py3-none-any.whl (2.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.1/2.1 MB 115.6 MB/s eta 0:00:00
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 22.2.2
    Uninstalling pip-22.2.2:
      Successfully uninstalled pip-22.2.2
Successfully installed pip-22.3
Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/cu113
Obtaining taming-transformers from git+https://github.com/CompVis/taming-transformers.git@master#egg=taming-transformers (from -r requirements.txt (line 24))
  Cloning https://github.com/CompVis/taming-transformers.git (to revision master) to ./src/taming-transformers
  Running command git clone --filter=blob:none --quiet https://github.com/CompVis/taming-tr

In [2]:
!ls

DreamBooth_Stable_Diffusion4pro.ipynb  main.py
LICENSE				       models
README.md			       my-dataset
assets				       nft-finetune.ipynb
configs				       notebook_helpers.py
data				       notebooks
data_check.ipynb		       pokemon_finetune.ipynb
examples			       requirements.txt
im-examples			       scripts
latent_diffusion.egg-info	       setup.py
ldm				       src


To get the weights, you'll need to [go to the model card](https://huggingface.co/CompVis/stable-diffusion-v1-4-original), read the license, and tick the checkbox if you agree.

In [2]:
!pip install huggingface_hub
from huggingface_hub import notebook_login

notebook_login()

Defaulting to user installation because normal site-packages is not writeable


VBox(children=(HTML(value='<center> <img\nsrc=https://huggingface.co/front/assets/huggingface_logo-noborder.sv…

In [3]:
from huggingface_hub import hf_hub_download
ckpt_path = hf_hub_download(repo_id="CompVis/stable-diffusion-v-1-4-original", filename="sd-v1-4-full-ema.ckpt", use_auth_token=True)

Downloading:   0%|          | 0.00/7.70G [00:00<?, ?B/s]

Set your parameters below depending on your GPU setup. The settings below were used for training on a 2xA6000 machine (the A6000 has 48GB of VRAM). On this setup, good results are achieved in around 6 hours.

You can make up for using smaller batches or fewer GPUs by accumulating batches:

`total batch size = batch size * n gpus * accumulate batches`
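
As a worked example of the formula above, the sketch below computes the total batch size and, going the other way, the number of accumulated batches needed to reach a target total on smaller hardware. The function names are hypothetical, and the example values (batch size 4 on 2 GPUs) are illustrative, taken from the 2xA6000 setup described above.

```python
def total_batch_size(batch_size, n_gpus, accumulate_batches):
    """total batch size = batch size * n gpus * accumulate batches"""
    return batch_size * n_gpus * accumulate_batches


def accumulate_batches_needed(target_total, batch_size, n_gpus):
    """Smallest accumulation count whose total reaches target_total."""
    per_step = batch_size * n_gpus
    # Ceiling division, with a floor of 1 (no accumulation).
    return max(1, -(-target_total // per_step))


# Reference setup: batch size 4 on 2 GPUs, no accumulation.
reference_total = total_batch_size(batch_size=4, n_gpus=2, accumulate_batches=1)

# A single GPU that only fits batch size 1 would need this many
# accumulated batches to match the reference total.
single_gpu_accumulate = accumulate_batches_needed(reference_total, batch_size=1, n_gpus=1)

print(reference_total, single_gpu_accumulate)
```

Accumulation trades wall-clock time for memory: each optimizer step then takes several forward and backward passes.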

In [13]:
# Settings originally chosen for 2xA6000; N_GPUS is set to 8 for this run:
BATCH_SIZE = 4
N_GPUS = 8
ACCUMULATE_BATCHES = 1

gpu_list = ",".join((str(x) for x in range(N_GPUS))) + ","
print(f"Using GPUs: {gpu_list}")
gpu_list

Using GPUs: 0,1,2,3,4,5,6,7,


'0,1,2,3,4,5,6,7,'

# To adjust CelebA-HQ / FFHQ for our purposes
- Close up
- Zoomed in
# Prepare dataset
- More variation in captions
- Use FFHQ and caption
- Scrape passport-size images

## Fine tuning 

In [7]:
# Run training
%cd /home/ubuntu/stable-diffusion
!(python main.py \
    -t \
    --base configs/stable-diffusion/nft.yaml \
    --gpus '0,1,2,3,4,5,6,7,' \
    --scale_lr False \
    --num_nodes 1 \
    --check_val_every_n_epoch 10 \
    --finetune_from "$ckpt_path" \
    data.params.batch_size=2 \
    lightning.trainer.accumulate_grad_batches=1 \
    data.params.validation.params.n_gpus=1 \
)

/home/ubuntu/stable-diffusion
Global seed set to 23
Running on GPUs 0,1,2,3,4,5,6,7,
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Keeping EMAs of 688.
making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla' with 512 in_channels
Some weights of the model checkpoint at openai/clip-vit-large-patch14 were not used when initializing CLIPTextModel: ['vision_model.encoder.layers.15.self_attn.v_proj.weight', 'vision_model.encoder.layers.13.mlp.fc1.bias', 'vision_model.encoder.layers.1.layer_norm1.weight', 'vision_model.encoder.layers.19.layer_norm1.weight', 'vision_model.encoder.layers.0.mlp.fc1.bias', 'vision_model.encoder.layers.16.mlp.fc2.weight', 'vision_model.encoder.layers.16.self_attn.v_proj.bias', 'vision_model.encoder.layers.10.self_attn.out_proj.weight', 'vision_model.encoder.layers.7.layer_norm1.weight', 'vision_model.encoder.layers.9.mlp.fc2.weight',

/home/ubuntu/stable-diffusion/logs/2022-10-20T14-28-21_nft/checkpoints

## Inference

In [16]:
# %cd /home/ubuntu/stable-diffusion/logs/2022-10-20T14-28-21_nft/checkpoints
!ls

In [12]:
# Run the model
%cd /home/ubuntu/stable-diffusion/
!(python scripts/txt2img.py \
    --prompt 'donald trump' \
    --outdir 'outputs/generated_nft' \
    --H 512 --W 512 \
    --n_samples 4 \
    --config 'configs/stable-diffusion/nft.yaml' \
    --ckpt 'logs/2022-10-20T14-28-21_nft/checkpoints/last.ckpt')

/home/ubuntu/stable-diffusion
Global seed set to 42
Loading model from logs/2022-10-20T14-28-21_nft/checkpoints/last.ckpt
Traceback (most recent call last):
  File "scripts/txt2img.py", line 285, in <module>
    main()
  File "scripts/txt2img.py", line 194, in main
    model = load_model_from_config(config, f"{opt.ckpt}")
  File "scripts/txt2img.py", line 27, in load_model_from_config
    pl_sd = torch.load(ckpt, map_location="cpu")
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/serialization.py", line 699, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/serialization.py", line 230, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/serialization.py", line 211, in __init__
    super(_open_file, self).__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: 'logs/2022-10-20T14-28-21_nft/checkpoints/last
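
The traceback above shows `txt2img.py` failing because the checkpoint file does not exist at the given path. One way to avoid this is to look up the checkpoint before running inference. The sketch below assumes the `logs/<run>/checkpoints/last.ckpt` layout seen in the paths above. `find_latest_checkpoint` is a hypothetical helper and is not part of the repository.

```python
from pathlib import Path


def find_latest_checkpoint(logs_dir="logs", name="last.ckpt"):
    """Return the most recently modified <run>/checkpoints/<name>, or None."""
    candidates = sorted(
        Path(logs_dir).glob(f"*/checkpoints/{name}"),
        key=lambda p: p.stat().st_mtime,
    )
    return candidates[-1] if candidates else None


ckpt = find_latest_checkpoint()
if ckpt is None:
    print("No checkpoint found; training may not have saved one yet.")
else:
    print(f"Using checkpoint: {ckpt}")
```

If no checkpoint is found, it is worth checking whether training ran long enough to save one before debugging the inference script itself.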