<a href="https://colab.research.google.com/github/p3bozuric/headshot_generator/blob/main/headshot_generator_finetuning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# No-code fine-tuning for a headshot generator AI model

This notebook is built to run in Google Colab.

An A100 GPU is the best fit for this training. Depending on the settings you choose, a run takes a couple of hours, so keep the session running until it finishes.

## Giving access to Google Drive

After you run the next cell, you must manually approve the Google Drive access prompt.




In [None]:
from google.colab import drive
from google.colab import userdata

drive.mount('/content/drive')
hf_token = userdata.get('HF_TOKEN')

## Preparing AI toolkit environment

In [None]:
!git clone https://github.com/ostris/ai-toolkit
!cd ai-toolkit && git submodule update --init --recursive && pip install -r requirements.txt

Cloning into 'ai-toolkit'...
remote: Enumerating objects: 3769, done.
remote: Counting objects: 100% (2323/2323), done.
remote: Compressing objects: 100% (192/192), done.
remote: Total 3769 (delta 2216), reused 2144 (delta 2129), pack-reused 1446 (from 1)
Receiving objects: 100% (3769/3769), 29.64 MiB | 35.13 MiB/s, done.
Resolving deltas: 100% (2870/2870), done.


## Hugging Face token preparation (HF_TOKEN)

The first code cell of this notebook reads the `HF_TOKEN` secret, so create it before you run that cell.

How to get and prepare HF_TOKEN:
1. Log in to Hugging Face.
2. Create a token here: https://huggingface.co/settings/tokens
3. In Colab, click the key icon in the left sidebar and store your token there under the name 'HF_TOKEN'.
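
As a minimal sketch, a small guard can catch an empty or missing token before it is exported to the environment. `require_token` is a hypothetical helper introduced here for illustration; it is not part of Colab or the Hugging Face libraries, and it only checks that the value is a non-empty string.

```python
def require_token(value, name="HF_TOKEN"):
    """Return the token stripped of surrounding whitespace, or raise if it is empty."""
    if value is None or not str(value).strip():
        raise ValueError(
            f"{name} is empty. Store it in the Colab secrets panel "
            "and grant this notebook access to it."
        )
    return str(value).strip()

# Hypothetical usage in the cell below, assuming hf_token was read earlier:
# os.environ['HF_TOKEN'] = require_token(hf_token)
```

Failing early here gives a clearer message than an authentication error in the middle of the model download.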

In [None]:
import os

# Set the environment variable
os.environ['HF_TOKEN'] = hf_token

print("HF_TOKEN environment variable has been set.")

HF_TOKEN environment variable has been set.


## Importing packages


In [None]:
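import os
import sys
from collections import OrderedDict

from PIL import Image

# Make the cloned ai-toolkit repository importable
sys.path.append('/content/ai-toolkit')
from toolkit.job import run_job

# Enable accelerated Hugging Face Hub downloads (hf_transfer)
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"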

## Dataset preparation

1. The dataset must be in a folder named **flux_dataset** inside the output directory you set in the next code cell.

2. The dataset should contain 20-30 images of your face, taken from different angles, against different backgrounds, in varied situations and with varied facial expressions.

3. Name the images like this: image001.jpg

4. Name the matching .txt files that describe each image the same way: image001.txt

The .txt files are optional, but fine-tuning works better when each image has a written description.
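
The naming rules above can be checked before training with a short sketch like the following. `check_dataset` is a hypothetical helper, not part of ai-toolkit; the extension list is taken from the config comment later in this notebook (jpg, jpeg, png), and the caption extension is assumed to be `.txt`.

```python
from pathlib import Path

# Extensions the config comments in this notebook list as supported
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}

def check_dataset(folder):
    """Return (images, missing) where missing lists image names without a .txt caption."""
    folder = Path(folder)
    images = sorted(p for p in folder.iterdir() if p.suffix.lower() in IMAGE_EXTS)
    missing = [p.name for p in images if not p.with_suffix(".txt").exists()]
    return images, missing

# Hypothetical usage once output_dir is defined:
# images, missing = check_dataset(f"{output_dir}/flux_dataset")
# print(f"{len(images)} images, {len(missing)} without captions: {missing}")
```

Missing captions are reported rather than treated as errors, since captions are optional.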

# Hyperparameter setup

Fill out the form and run the cell.


In [None]:
#@markdown ---
#@markdown ## **Project Configuration**
#@markdown Title of your project
project_name = 'professional_headshot_generator' # @param {type:"string"}
#@markdown Model you'll be fine-tuning. This code is optimized for FLUX.1-dev.
model_name = 'black-forest-labs/FLUX.1-dev' # @param ["black-forest-labs/FLUX.1-dev"]
#@markdown Where the model will be saved. Keep the "/content/drive/MyDrive/" prefix unchanged.
output_dir = '/content/drive/MyDrive/headshot-generator' # @param {type:"string"}
#@markdown The instance prompt is a unique token that refers to you when you prompt the generator at inference time.
instance_prompt = "pa3k" # @param {type:"string"}

#@markdown ---
#@markdown ## **Training Configuration**
#@markdown These parameters control the training process of your model.

#@markdown Number of images processed in one iteration.
batch_size = 2 # @param {type:"integer"}

#@markdown Total number of training iterations. 1000-4000 is a good range.
total_steps = 2000 # @param {type:"integer"}

#@markdown Rate at which the model learns. Higher values may lead to faster learning but potential instability.
learning_rate = 1e-4 # @param {type:"number"}

#@markdown Image resolutions to use during training. FLUX model benefits from multiple resolutions.
resolution = [512, 768, 1024] # @param {type:"raw"}

#@markdown Whether to train the main denoising network. The toolkit calls this flag train_unet; for FLUX this is the transformer. Usually kept True.
train_unet = True # @param {type:"boolean"}

#@markdown Whether to train the text encoder. Usually False for FLUX models.
train_text_encoder = False # @param {type:"boolean"}

#@markdown Use 8-bit Adam optimizer for reduced memory usage. Recommended if your GPU supports it.
use_8bit_adam = True # @param {type:"boolean"}

#@markdown Saves memory by recomputing activations during the backward pass instead of storing them. Needed unless you have a lot of VRAM.
use_gradient_checkpointing = True # @param {type:"boolean"}

#@markdown Use Exponential Moving Average for more stable training. Recommended to leave on.
use_ema = True # @param {type:"boolean"}

#@markdown EMA decay rate. Higher values give more weight to the accumulated average and less to the most recent iteration. Best to leave at 0.99.
ema_decay = 0.99 # @param {type:"number"}

#@markdown Use bfloat16 precision. Speeds up training if your GPU supports it.
use_bf16 = True # @param {type:"boolean"}

#@markdown ---
#@markdown ## **Sampling Configuration**
#@markdown These settings control the generation of test images during the training process.

#@markdown Generate sample images every this many steps.
sample_every = 500 # @param {type:"integer"}

#@markdown Width of the generated sample images.
sample_width = 1024 # @param {type:"integer"}

#@markdown Height of the generated sample images.
sample_height = 1024 # @param {type:"integer"}

#@markdown How closely the image adheres to the prompt. Higher values = closer adherence.
guidance_scale = 4 # @param {type:"number"}

#@markdown Number of denoising steps in image generation. More steps = potentially higher quality but slower.
sample_steps = 40 # @param {type:"integer"}

from collections import OrderedDict

job_to_run = OrderedDict([
    ('job', 'extension'),
    ('config', OrderedDict([
        # this name will be the folder and filename name
        ('name', project_name),
        ('process', [
            OrderedDict([
                ('type', 'sd_trainer'),
                # root folder to save training sessions/samples/weights
                ('training_folder', output_dir),
                # log performance stats in the terminal every N steps
                ('performance_log_every', 500),
                ('device', 'cuda:0'),
                # if a trigger word is specified, it will be added to captions of training data if it does not already exist
                # alternatively, in your captions you can add [trigger] and it will be replaced with the trigger word
                ('trigger_word', instance_prompt),
                ('network', OrderedDict([
                    ('type', 'lora'),
                    ('linear', 16),
                    ('linear_alpha', 16)
                ])),
                ('save', OrderedDict([
                    ('dtype', 'float16'),  # precision to save
                    ('save_every', 500),  # save every this many steps
                    ('max_step_saves_to_keep', 4)  # how many intermediate saves to keep
                ])),
                ('datasets', [
                    # datasets are a folder of images. captions need to be txt files with the same name as the image
                    # for instance image2.jpg and image2.txt. Only jpg, jpeg, and png are supported currently
                    # images will automatically be resized and bucketed into the resolution specified
                    OrderedDict([
                        ('folder_path', f'{output_dir}/flux_dataset'),
                        ('caption_ext', 'txt'),
                        ('caption_dropout_rate', 0.05),  # will drop out the caption 5% of time
                        ('shuffle_tokens', False),  # shuffle caption order, split by commas
                        ('cache_latents_to_disk', True),  # leave this true unless you know what you're doing
                        ('resolution', resolution)  # flux enjoys multiple resolutions
                    ])
                ]),
                ('train', OrderedDict([
                    ('batch_size', batch_size),
                    ('steps', total_steps),  # total number of steps to train 500 - 4000 is a good range
                    ('gradient_accumulation_steps', 2),
                    ('train_unet', train_unet),
                    ('train_text_encoder', train_text_encoder),  # probably won't work with flux
                    ('content_or_style', 'content'),  # content, style, balanced
                    ('gradient_checkpointing', use_gradient_checkpointing),  # need this on unless you have a ton of vram
                    ('noise_scheduler', 'flowmatch'),  # for training only
                    ('optimizer', 'adamw8bit' if use_8bit_adam else 'adamw'),
                    ('lr', learning_rate),
                    # skips the sample normally generated before training starts; comment out to generate it
                    ('skip_first_sample', True),

                    # ema will smooth out learning, but could slow it down. Recommended to leave on.
                    ('ema_config', OrderedDict([
                        ('use_ema', use_ema),
                        ('ema_decay', ema_decay)
                    ])),

                    # will probably need this if gpu supports it for flux, other dtypes may not work correctly
                    ('dtype', 'bf16' if use_bf16 else 'float32')
                ])),
                ('model', OrderedDict([
                    # huggingface model name or path
                    ('name_or_path', model_name),
                    ('is_flux', True),
                    ('quantize', True),  # run 8bit mixed precision
                    #('low_vram', True),  # uncomment this if the GPU is connected to your monitors. It will use less vram to quantize, but is slower.
                ])),
                ('sample', OrderedDict([
                    ('sampler', 'flowmatch'),  # must match train.noise_scheduler
                    ('sample_every', sample_every),  # sample every this many steps
                    ('width', sample_width),
                    ('height', sample_height),
                    ('prompts', [
                        # you can add [trigger] to the prompts here and it will be replaced with the trigger word
                        f'professional headshot of {instance_prompt} in a suit, studio lighting, neutral background',
                        f'business portrait of {instance_prompt} smiling, office setting, soft lighting',
                        f'corporate headshot of {instance_prompt} with confident expression, blurred office background',
                        f'professional profile picture of {instance_prompt} in business casual attire, outdoors',
                        f'LinkedIn profile photo of {instance_prompt} with friendly expression, solid color background',
                        f'{instance_prompt} giving a presentation in a conference room, professional attire',
                        f'close-up portrait of {instance_prompt} for company website, modern office background',
                        f'{instance_prompt} in a casual business meeting, gesturing while speaking, natural light'
                    ]),
                    ('neg', ''),  # not used on flux
                    ('seed', 42),
                    ('walk_seed', True),
                    ('guidance_scale', guidance_scale),
                    ('sample_steps', sample_steps)
                ]))
            ])
        ])
    ])),
    # you can add any additional meta info here. [name] is replaced with config name at top
    ('meta', OrderedDict([
        ('name', project_name),
        ('version', '1.0')
    ]))
])
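
To make the `ema_decay` setting concrete, here is a minimal sketch of the textbook exponential moving average update, `ema = decay * ema + (1 - decay) * value`. It is an illustration only; the toolkit's own EMA implementation is not shown in this notebook and may differ in details such as warm-up.

```python
def ema_update(ema, value, decay):
    """One EMA step: keep `decay` of the running average, blend in the new value."""
    return decay * ema + (1.0 - decay) * value

def run_ema(values, decay):
    """Fold a sequence of values into a single EMA, starting from the first value."""
    ema = values[0]
    for v in values[1:]:
        ema = ema_update(ema, v, decay)
    return ema

# A value that jumps from 0.0 to 1.0 and then stays there for 10 steps
values = [0.0] + [1.0] * 10

slow = run_ema(values, decay=0.99)  # about 0.096: the average barely follows the jump
fast = run_ema(values, decay=0.5)   # about 0.999: the average follows the jump almost fully
print(f"decay=0.99 -> {slow:.3f}, decay=0.5 -> {fast:.3f}")
```

A decay close to 1 therefore smooths more strongly and reacts more slowly to recent steps.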

# Start the training with the cell below when you're ready

This might take a while. Keep the session running while training.
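
A rough way to judge how long the session must stay alive is sketched below. It assumes timings similar to the run logged in the output of this notebook (roughly 3 s per training step and 36 s per sample image) and that samples are generated every `sample_every` steps, including the last one. `estimate_hours` is a hypothetical helper; the estimate ignores model download and latent caching, and timings will differ on other GPUs and settings.

```python
def estimate_hours(total_steps, sec_per_step, sample_every, n_prompts, sec_per_image):
    """Estimate wall-clock hours for training plus periodic sample generation."""
    train_sec = total_steps * sec_per_step
    sample_rounds = total_steps // sample_every
    sample_sec = sample_rounds * n_prompts * sec_per_image
    return (train_sec + sample_sec) / 3600

# Default form values: 2000 steps, samples every 500 steps, 8 sample prompts
hours = estimate_hours(
    total_steps=2000,
    sec_per_step=3.0,
    sample_every=500,
    n_prompts=8,
    sec_per_image=36.0,
)
print(f"Estimated run time: {hours:.1f} h")  # Estimated run time: 2.0 h
```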

In [None]:
run_job(job_to_run)

The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.


0it [00:00, ?it/s]

  check_for_updates()
  return register_model(fn_wrapper)
  return register_model(fn_wrapper)
  return register_model(fn_wrapper)
  return register_model(fn_wrapper)
  return register_model(fn_wrapper)
  self.scaler = torch.cuda.amp.GradScaler()


{
    "type": "sd_trainer",
    "training_folder": "/content/drive/MyDrive/headshot-generator",
    "performance_log_every": 500,
    "device": "cuda:0",
    "trigger_word": "pa3k",
    "network": {
        "type": "lora",
        "linear": 16,
        "linear_alpha": 16
    },
    "save": {
        "dtype": "float16",
        "save_every": 500,
        "max_step_saves_to_keep": 4
    },
    "datasets": [
        {
            "folder_path": "/content/drive/MyDrive/headshot-generator/flux_dataset_v2",
            "caption_ext": "txt",
            "caption_dropout_rate": 0.05,
            "shuffle_tokens": false,
            "cache_latents_to_disk": true,
            "resolution": [
                512,
                768,
                1024
            ]
        }
    ],
    "train": {
        "batch_size": 2,
        "steps": 3000,
        "gradient_accumulation_steps": 2,
        "train_unet": true,
        "train_text_encoder": false,
        "content_or_style": "content",
      

transformer/config.json:   0%|          | 0.00/378 [00:00<?, ?B/s]

(…)ion_pytorch_model.safetensors.index.json:   0%|          | 0.00/121k [00:00<?, ?B/s]

(…)pytorch_model-00001-of-00003.safetensors:   0%|          | 0.00/9.98G [00:00<?, ?B/s]

(…)pytorch_model-00002-of-00003.safetensors:   0%|          | 0.00/9.95G [00:00<?, ?B/s]

(…)pytorch_model-00003-of-00003.safetensors:   0%|          | 0.00/3.87G [00:00<?, ?B/s]

Quantizing transformer


scheduler/scheduler_config.json:   0%|          | 0.00/274 [00:00<?, ?B/s]

Loading vae


vae/config.json:   0%|          | 0.00/774 [00:00<?, ?B/s]

diffusion_pytorch_model.safetensors:   0%|          | 0.00/168M [00:00<?, ?B/s]

Loading t5


tokenizer_2/tokenizer_config.json:   0%|          | 0.00/20.8k [00:00<?, ?B/s]

spiece.model:   0%|          | 0.00/792k [00:00<?, ?B/s]

tokenizer_2/tokenizer.json:   0%|          | 0.00/2.42M [00:00<?, ?B/s]

tokenizer_2/special_tokens_map.json:   0%|          | 0.00/2.54k [00:00<?, ?B/s]

You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers


text_encoder_2/config.json:   0%|          | 0.00/782 [00:00<?, ?B/s]

(…)t_encoder_2/model.safetensors.index.json:   0%|          | 0.00/19.9k [00:00<?, ?B/s]

Downloading shards:   0%|          | 0/2 [00:00<?, ?it/s]

model-00001-of-00002.safetensors:   0%|          | 0.00/4.99G [00:00<?, ?B/s]

model-00002-of-00002.safetensors:   0%|          | 0.00/4.53G [00:00<?, ?B/s]

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Quantizing T5
Loading clip


text_encoder/config.json:   0%|          | 0.00/613 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/246M [00:00<?, ?B/s]

tokenizer/tokenizer_config.json:   0%|          | 0.00/705 [00:00<?, ?B/s]

tokenizer/vocab.json:   0%|          | 0.00/1.06M [00:00<?, ?B/s]

tokenizer/merges.txt:   0%|          | 0.00/525k [00:00<?, ?B/s]

tokenizer/special_tokens_map.json:   0%|          | 0.00/588 [00:00<?, ?B/s]

making pipe
preparing
create LoRA network. base dim (rank): 16, alpha: 16
neuron dropout: p=None, rank dropout: p=None, module dropout: p=None
create LoRA for Text Encoder: 0 modules.
create LoRA for U-Net: 494 modules.
enable LoRA for U-Net
Dataset: /content/drive/MyDrive/headshot-generator/flux_dataset_v2
  -  Preprocessing image dimensions


 13%|█▎        | 5/38 [00:01<00:11,  2.83it/s]



100%|██████████| 38/38 [00:13<00:00,  2.80it/s]


  -  Found 38 images
Bucket sizes for /content/drive/MyDrive/headshot-generator/flux_dataset_v2:
256x896: 1 files
320x768: 1 files
320x704: 3 files
640x384: 9 files
384x576: 3 files
256x960: 1 files
384x640: 3 files
576x448: 6 files
512x512: 5 files
448x576: 4 files
256x1024: 1 files
448x512: 1 files
12 buckets made
Caching latents for /content/drive/MyDrive/headshot-generator/flux_dataset_v2
 - Saving latents to disk


Caching latents to disk:  11%|█         | 4/38 [00:02<00:18,  1.80it/s]

unexpected values: w=1808, h=3216, file_item.scale_to_width=684, file_item.scale_to_height=384, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_023.jpg
unexpected values: w=1440, h=960, file_item.scale_to_width=384, file_item.scale_to_height=576, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_003.jpg


Caching latents to disk:  21%|██        | 8/38 [00:04<00:20,  1.49it/s]

unexpected values: w=3000, h=4000, file_item.scale_to_width=598, file_item.scale_to_height=449, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_002.jpg


Caching latents to disk:  29%|██▉       | 11/38 [00:05<00:12,  2.15it/s]

unexpected values: w=1808, h=3216, file_item.scale_to_width=684, file_item.scale_to_height=384, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_018.jpg


Caching latents to disk:  32%|███▏      | 12/38 [00:06<00:11,  2.19it/s]

unexpected values: w=2208, h=2944, file_item.scale_to_width=598, file_item.scale_to_height=448, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_015.jpg


Caching latents to disk:  37%|███▋      | 14/38 [00:06<00:09,  2.65it/s]

unexpected values: w=1808, h=3216, file_item.scale_to_width=684, file_item.scale_to_height=384, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_012.jpg
unexpected values: w=6120, h=8160, file_item.scale_to_width=598, file_item.scale_to_height=448, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_024.jpg


Caching latents to disk:  42%|████▏     | 16/38 [00:09<00:13,  1.58it/s]

unexpected values: w=1808, h=3216, file_item.scale_to_width=684, file_item.scale_to_height=384, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_017.jpg


Caching latents to disk:  45%|████▍     | 17/38 [00:09<00:11,  1.79it/s]

unexpected values: w=1808, h=3216, file_item.scale_to_width=684, file_item.scale_to_height=384, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_014.jpg


Caching latents to disk:  47%|████▋     | 18/38 [00:09<00:10,  1.90it/s]

unexpected values: w=1808, h=3216, file_item.scale_to_width=684, file_item.scale_to_height=384, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_007.jpg


Caching latents to disk:  50%|█████     | 19/38 [00:10<00:09,  1.96it/s]

unexpected values: w=1808, h=3216, file_item.scale_to_width=684, file_item.scale_to_height=384, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_009.jpg


Caching latents to disk:  53%|█████▎    | 20/38 [00:10<00:09,  1.94it/s]

unexpected values: w=1808, h=3216, file_item.scale_to_width=684, file_item.scale_to_height=384, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_025.jpg


Caching latents to disk:  61%|██████    | 23/38 [00:12<00:08,  1.82it/s]

unexpected values: w=3000, h=4000, file_item.scale_to_width=598, file_item.scale_to_height=449, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_001.jpg


Caching latents to disk:  63%|██████▎   | 24/38 [00:12<00:07,  1.94it/s]

unexpected values: w=2208, h=2944, file_item.scale_to_width=598, file_item.scale_to_height=448, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_010.jpg


Caching latents to disk:  66%|██████▌   | 25/38 [00:13<00:06,  2.02it/s]

unexpected values: w=2208, h=2944, file_item.scale_to_width=598, file_item.scale_to_height=448, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_022.jpg


Caching latents to disk:  68%|██████▊   | 26/38 [00:13<00:05,  2.14it/s]

unexpected values: w=1808, h=3216, file_item.scale_to_width=684, file_item.scale_to_height=384, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_016.jpg


Caching latents to disk: 100%|██████████| 38/38 [00:15<00:00,  2.51it/s]


Dataset: /content/drive/MyDrive/headshot-generator/flux_dataset_v2
  -  Preprocessing image dimensions


100%|██████████| 38/38 [00:00<00:00, 43476.15it/s]

  -  Found 38 images
Bucket sizes for /content/drive/MyDrive/headshot-generator/flux_dataset_v2:
384x1344: 1 files
448x1152: 1 files
512x1024: 2 files
960x512: 9 files
576x832: 3 files
384x1472: 1 files
576x896: 1 files
832x640: 6 files
768x768: 5 files
640x768: 2 files
384x1536: 1 files
576x960: 1 files
448x576: 1 files
640x832: 2 files
512x1088: 2 files
15 buckets made
Caching latents for /content/drive/MyDrive/headshot-generator/flux_dataset_v2
 - Saving latents to disk



Caching latents to disk:  11%|█         | 4/38 [00:00<00:05,  6.65it/s]

unexpected values: w=1808, h=3216, file_item.scale_to_width=960, file_item.scale_to_height=540, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_023.jpg
unexpected values: w=1440, h=960, file_item.scale_to_width=576, file_item.scale_to_height=864, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_003.jpg


Caching latents to disk:  18%|█▊        | 7/38 [00:01<00:05,  6.06it/s]

unexpected values: w=3000, h=4000, file_item.scale_to_width=854, file_item.scale_to_height=640, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_002.jpg


Caching latents to disk:  29%|██▉       | 11/38 [00:01<00:04,  5.46it/s]

unexpected values: w=1808, h=3216, file_item.scale_to_width=960, file_item.scale_to_height=540, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_018.jpg


Caching latents to disk:  32%|███▏      | 12/38 [00:02<00:04,  5.22it/s]

unexpected values: w=2208, h=2944, file_item.scale_to_width=854, file_item.scale_to_height=640, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_015.jpg


Caching latents to disk:  37%|███▋      | 14/38 [00:02<00:04,  5.88it/s]

unexpected values: w=1808, h=3216, file_item.scale_to_width=960, file_item.scale_to_height=540, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_012.jpg
unexpected values: w=6120, h=8160, file_item.scale_to_width=854, file_item.scale_to_height=640, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_024.jpg


Caching latents to disk:  42%|████▏     | 16/38 [00:03<00:08,  2.68it/s]

unexpected values: w=1808, h=3216, file_item.scale_to_width=960, file_item.scale_to_height=540, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_017.jpg


Caching latents to disk:  45%|████▍     | 17/38 [00:04<00:06,  3.03it/s]

unexpected values: w=1808, h=3216, file_item.scale_to_width=960, file_item.scale_to_height=540, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_014.jpg


Caching latents to disk:  47%|████▋     | 18/38 [00:04<00:05,  3.37it/s]

unexpected values: w=1808, h=3216, file_item.scale_to_width=960, file_item.scale_to_height=540, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_007.jpg


Caching latents to disk:  50%|█████     | 19/38 [00:04<00:05,  3.61it/s]

unexpected values: w=1808, h=3216, file_item.scale_to_width=960, file_item.scale_to_height=540, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_009.jpg


Caching latents to disk:  53%|█████▎    | 20/38 [00:04<00:04,  3.90it/s]

unexpected values: w=1808, h=3216, file_item.scale_to_width=960, file_item.scale_to_height=540, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_025.jpg


Caching latents to disk:  61%|██████    | 23/38 [00:05<00:04,  3.54it/s]

unexpected values: w=3000, h=4000, file_item.scale_to_width=854, file_item.scale_to_height=640, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_001.jpg


Caching latents to disk:  63%|██████▎   | 24/38 [00:05<00:03,  3.76it/s]

unexpected values: w=2208, h=2944, file_item.scale_to_width=854, file_item.scale_to_height=640, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_010.jpg


Caching latents to disk:  66%|██████▌   | 25/38 [00:06<00:03,  3.92it/s]

unexpected values: w=2208, h=2944, file_item.scale_to_width=854, file_item.scale_to_height=640, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_022.jpg


Caching latents to disk:  68%|██████▊   | 26/38 [00:06<00:02,  4.15it/s]

unexpected values: w=1808, h=3216, file_item.scale_to_width=960, file_item.scale_to_height=540, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_016.jpg


Caching latents to disk: 100%|██████████| 38/38 [00:07<00:00,  5.10it/s]


Dataset: /content/drive/MyDrive/headshot-generator/flux_dataset_v2
  -  Preprocessing image dimensions


100%|██████████| 38/38 [00:00<00:00, 43666.73it/s]

  -  Found 38 images
Bucket sizes for /content/drive/MyDrive/headshot-generator/flux_dataset_v2:
512x1856: 1 files
640x1536: 1 files
704x1408: 2 files
1344x768: 9 files
832x1216: 1 files
512x1920: 1 files
768x1280: 1 files
1152x832: 6 files
1024x1024: 5 files
896x1088: 1 files
512x2048: 1 files
768x1344: 1 files
960x1088: 1 files
448x576: 1 files
832x1152: 2 files
896x1152: 1 files
704x1472: 2 files
640x960: 1 files
18 buckets made
Caching latents for /content/drive/MyDrive/headshot-generator/flux_dataset_v2
 - Saving latents to disk



Caching latents to disk:  11%|█         | 4/38 [00:00<00:07,  4.71it/s]

unexpected values: w=1808, h=3216, file_item.scale_to_width=1367, file_item.scale_to_height=768, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_023.jpg


Caching latents to disk:  13%|█▎        | 5/38 [00:00<00:06,  5.26it/s]

unexpected values: w=1440, h=960, file_item.scale_to_width=832, file_item.scale_to_height=1248, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_003.jpg


Caching latents to disk:  18%|█▊        | 7/38 [00:01<00:07,  4.22it/s]

unexpected values: w=3000, h=4000, file_item.scale_to_width=1152, file_item.scale_to_height=864, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_002.jpg


Caching latents to disk:  29%|██▉       | 11/38 [00:02<00:06,  4.02it/s]

unexpected values: w=1808, h=3216, file_item.scale_to_width=1367, file_item.scale_to_height=768, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_018.jpg


Caching latents to disk:  32%|███▏      | 12/38 [00:02<00:06,  3.86it/s]

unexpected values: w=2208, h=2944, file_item.scale_to_width=1152, file_item.scale_to_height=864, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_015.jpg


Caching latents to disk:  37%|███▋      | 14/38 [00:03<00:05,  4.21it/s]

unexpected values: w=1808, h=3216, file_item.scale_to_width=1367, file_item.scale_to_height=768, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_012.jpg
unexpected values: w=6120, h=8160, file_item.scale_to_width=1152, file_item.scale_to_height=864, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_024.jpg


Caching latents to disk:  42%|████▏     | 16/38 [00:04<00:10,  2.09it/s]

unexpected values: w=1808, h=3216, file_item.scale_to_width=1367, file_item.scale_to_height=768, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_017.jpg


Caching latents to disk:  45%|████▍     | 17/38 [00:05<00:08,  2.40it/s]

unexpected values: w=1808, h=3216, file_item.scale_to_width=1367, file_item.scale_to_height=768, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_014.jpg


Caching latents to disk:  47%|████▋     | 18/38 [00:05<00:07,  2.69it/s]

unexpected values: w=1808, h=3216, file_item.scale_to_width=1367, file_item.scale_to_height=768, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_007.jpg


Caching latents to disk:  50%|█████     | 19/38 [00:05<00:06,  2.92it/s]

unexpected values: w=1808, h=3216, file_item.scale_to_width=1367, file_item.scale_to_height=768, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_009.jpg


Caching latents to disk:  53%|█████▎    | 20/38 [00:05<00:05,  3.13it/s]

unexpected values: w=1808, h=3216, file_item.scale_to_width=1367, file_item.scale_to_height=768, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_025.jpg


Caching latents to disk:  58%|█████▊    | 22/38 [00:06<00:05,  3.01it/s]

unexpected values: w=3000, h=4000, file_item.scale_to_width=1152, file_item.scale_to_height=864, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_001.jpg


Caching latents to disk:  63%|██████▎   | 24/38 [00:07<00:04,  3.03it/s]

unexpected values: w=2208, h=2944, file_item.scale_to_width=1152, file_item.scale_to_height=864, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_010.jpg


Caching latents to disk:  66%|██████▌   | 25/38 [00:07<00:04,  3.21it/s]

unexpected values: w=2208, h=2944, file_item.scale_to_width=1152, file_item.scale_to_height=864, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_022.jpg


Caching latents to disk:  68%|██████▊   | 26/38 [00:07<00:03,  3.34it/s]

unexpected values: w=1808, h=3216, file_item.scale_to_width=1367, file_item.scale_to_height=768, file_item.path=/content/drive/MyDrive/headshot-generator/flux_dataset_v2/pa3k_headshot_016.jpg


Caching latents to disk: 100%|██████████| 38/38 [00:09<00:00,  3.95it/s]


Skipping first sample due to config setting


  self.pid = os.fork()
  with device_autocast_ctx, torch.cpu.amp.autocast(**cpu_autocast_kwargs), recompute_context:  # type: ignore[attr-defined]
pa3k_professional_headshot_generator_v2:  17%|█▋        | 499/3000 [25:07<2:23:00,  3.43s/it, lr: 1.0e-04 loss: 3.004e-01]
Generating Images:   0%|          | 0/8 [00:00<?, ?it/s]
Generating Images:  12%|█▎        | 1/8 [00:36<04:13, 36.19s/it]
Generating Images:  25%|██▌       | 2/8 [01:12<03:36, 36.05s/it]
Generating Images:  38%|███▊      | 3/8 [01:48<03:00, 36.01s/it]
Generating Images:  50%|█████     | 4/8 [02:24<02:23, 35.99s/it]
Generating Images:  62%|██████▎   | 5/8 [03:00<01:47, 35.98s/it]
Generating Images:  75%|███████▌  | 6/8 [03:35<01:11, 35.97s/it]
Generating Images:  88%|████████▊ | 7/8 [04:11<00:35, 35.96s/it]
Generating Images: 100%|██████████| 8/8 [04:47<00:00, 35.96s/it]
pa3k_professional_headshot_generator_v2:  17%|█▋        | 499/3000 [25:07<2:23:00,  3.43s/it, lr: 1.0e-04 loss: 3.004e-01]

Saving at step 500


pa3k_professional_headshot_generator_v2:  17%|█▋        | 499/3000 [25:10<2:23:00,  3.43s/it, lr: 1.0e-04 loss: 3.004e-01]

Saved to /content/drive/MyDrive/headshot-generator/pa3k_professional_headshot_generator_v2/optimizer.pt

Timer 'pa3k_professional_headshot_generator_v2 Timer':
 - 2.9640s avg - train_loop, num = 10
 - 1.8270s avg - backward, num = 10
 - 0.8477s avg - predict_unet, num = 10
 - 0.2617s avg - reset_batch, num = 6
 - 0.1349s avg - optimizer_step, num = 10
 - 0.1129s avg - calculate_loss, num = 10
 - 0.0683s avg - encode_prompt, num = 10
 - 0.0029s avg - preprocess_batch, num = 10
 - 0.0027s avg - get_batch, num = 10
 - 0.0014s avg - prepare_noise, num = 10
 - 0.0005s avg - batch_cleanup, num = 10
 - 0.0005s avg - prepare_latents, num = 10
 - 0.0001s avg - grad_setup, num = 10
 - 0.0000s avg - scheduler_step, num = 10
 - 0.0000s avg - prepare_prompt, num = 10
 - 0.0000s avg - log_to_tensorboard, num = 5



pa3k_professional_headshot_generator_v2:  33%|███▎      | 999/3000 [50:00<1:41:43,  3.05s/it, lr: 1.0e-04 loss: 2.787e-01]
Generating Images:   0%|          | 0/8 [00:00<?, ?it/s]
Generating Images:  12%|█▎        | 1/8 [00:36<04:12, 36.01s/it]
Generating Images:  25%|██▌       | 2/8 [01:11<03:35, 35.98s/it]
Generating Images:  38%|███▊      | 3/8 [01:47<02:59, 35.97s/it]
Generating Images:  50%|█████     | 4/8 [02:23<02:23, 35.96s/it]
Generating Images:  62%|██████▎   | 5/8 [02:59<01:47, 35.96s/it]
Generating Images:  75%|███████▌  | 6/8 [03:35<01:11, 35.96s/it]
Generating Images:  88%|████████▊ | 7/8 [04:11<00:35, 35.96s/it]
Generating Images: 100%|██████████| 8/8 [04:47<00:00, 35.96s/it]

Saving at step 1000


pa3k_professional_headshot_generator_v2:  33%|███▎      | 999/3000 [50:03<1:41:43,  3.05s/it, lr: 1.0e-04 loss: 2.787e-01]

Saved to /content/drive/MyDrive/headshot-generator/pa3k_professional_headshot_generator_v2/optimizer.pt

Timer 'pa3k_professional_headshot_generator_v2 Timer':
 - 2.7078s avg - train_loop, num = 10
 - 1.6551s avg - backward, num = 10
 - 0.7910s avg - predict_unet, num = 10
 - 0.2702s avg - reset_batch, num = 7
 - 0.1347s avg - optimizer_step, num = 10
 - 0.0947s avg - calculate_loss, num = 10
 - 0.0714s avg - encode_prompt, num = 10
 - 0.0031s avg - get_batch, num = 10
 - 0.0028s avg - preprocess_batch, num = 10
 - 0.0014s avg - prepare_noise, num = 10
 - 0.0006s avg - batch_cleanup, num = 10
 - 0.0004s avg - prepare_latents, num = 10
 - 0.0001s avg - grad_setup, num = 10
 - 0.0000s avg - prepare_prompt, num = 10
 - 0.0000s avg - scheduler_step, num = 10
 - 0.0000s avg - log_to_tensorboard, num = 5



pa3k_professional_headshot_generator_v2:  50%|████▉     | 1499/3000 [1:15:05<1:05:40,  2.63s/it, lr: 1.0e-04 loss: 2.625e-01]
Generating Images:   0%|          | 0/8 [00:00<?, ?it/s]
Generating Images:  12%|█▎        | 1/8 [00:36<04:12, 36.00s/it]
Generating Images:  25%|██▌       | 2/8 [01:11<03:35, 35.97s/it]
Generating Images:  38%|███▊      | 3/8 [01:47<02:59, 35.95s/it]
Generating Images:  50%|█████     | 4/8 [02:23<02:23, 35.95s/it]
Generating Images:  62%|██████▎   | 5/8 [02:59<01:47, 35.95s/it]
Generating Images:  75%|███████▌  | 6/8 [03:35<01:11, 35.95s/it]
Generating Images:  88%|████████▊ | 7/8 [04:11<00:35, 35.95s/it]
Generating Images: 100%|██████████| 8/8 [04:47<00:00, 35.95s/it]

Saving at step 1500


pa3k_professional_headshot_generator_v2:  50%|████▉     | 1499/3000 [1:15:07<1:05:40,  2.63s/it, lr: 1.0e-04 loss: 2.625e-01]

Saved to /content/drive/MyDrive/headshot-generator/pa3k_professional_headshot_generator_v2/optimizer.pt

Timer 'pa3k_professional_headshot_generator_v2 Timer':
 - 3.2264s avg - train_loop, num = 10
 - 1.9928s avg - backward, num = 10
 - 0.9288s avg - predict_unet, num = 10
 - 0.2655s avg - reset_batch, num = 7
 - 0.1294s avg - optimizer_step, num = 10
 - 0.1240s avg - calculate_loss, num = 10
 - 0.0687s avg - encode_prompt, num = 10
 - 0.0031s avg - preprocess_batch, num = 10
 - 0.0030s avg - get_batch, num = 10
 - 0.0015s avg - prepare_noise, num = 10
 - 0.0005s avg - prepare_latents, num = 10
 - 0.0004s avg - batch_cleanup, num = 10
 - 0.0001s avg - grad_setup, num = 10
 - 0.0000s avg - prepare_prompt, num = 10
 - 0.0000s avg - scheduler_step, num = 10
 - 0.0000s avg - log_to_tensorboard, num = 5



pa3k_professional_headshot_generator_v2:  67%|██████▋   | 1999/3000 [1:40:01<50:22,  3.02s/it, lr: 1.0e-04 loss: 4.477e-01]
Generating Images:   0%|          | 0/8 [00:00<?, ?it/s]
Generating Images:  12%|█▎        | 1/8 [00:36<04:12, 36.00s/it]
Generating Images:  25%|██▌       | 2/8 [01:11<03:35, 35.97s/it]
Generating Images:  38%|███▊      | 3/8 [01:47<02:59, 35.96s/it]
Generating Images:  50%|█████     | 4/8 [02:23<02:23, 35.96s/it]
Generating Images:  62%|██████▎   | 5/8 [02:59<01:47, 35.96s/it]
Generating Images:  75%|███████▌  | 6/8 [03:35<01:11, 35.96s/it]
Generating Images:  88%|████████▊ | 7/8 [04:11<00:35, 35.96s/it]
Generating Images: 100%|██████████| 8/8 [04:47<00:00, 35.96s/it]

Saving at step 2000


pa3k_professional_headshot_generator_v2:  67%|██████▋   | 1999/3000 [1:40:04<50:22,  3.02s/it, lr: 1.0e-04 loss: 4.477e-01]

Saved to /content/drive/MyDrive/headshot-generator/pa3k_professional_headshot_generator_v2/optimizer.pt

Timer 'pa3k_professional_headshot_generator_v2 Timer':
 - 2.9723s avg - train_loop, num = 10
 - 1.8011s avg - backward, num = 10
 - 0.8371s avg - predict_unet, num = 10
 - 0.2673s avg - reset_batch, num = 7
 - 0.1379s avg - optimizer_step, num = 10
 - 0.1102s avg - calculate_loss, num = 10
 - 0.0724s avg - encode_prompt, num = 10
 - 0.0067s avg - get_batch, num = 10
 - 0.0034s avg - preprocess_batch, num = 10
 - 0.0015s avg - prepare_noise, num = 10
 - 0.0006s avg - batch_cleanup, num = 10
 - 0.0005s avg - prepare_latents, num = 10
 - 0.0002s avg - grad_setup, num = 10
 - 0.0000s avg - prepare_prompt, num = 10
 - 0.0000s avg - scheduler_step, num = 10
 - 0.0000s avg - log_to_tensorboard, num = 5



pa3k_professional_headshot_generator_v2:  83%|████████▎ | 2499/3000 [2:05:07<23:53,  2.86s/it, lr: 1.0e-04 loss: 2.919e-01]
Generating Images:   0%|          | 0/8 [00:00<?, ?it/s]
Generating Images:  12%|█▎        | 1/8 [00:35<04:11, 36.00s/it]
Generating Images:  25%|██▌       | 2/8 [01:11<03:35, 35.96s/it]
Generating Images:  38%|███▊      | 3/8 [01:47<02:59, 35.95s/it]
Generating Images:  50%|█████     | 4/8 [02:23<02:23, 35.96s/it]
Generating Images:  62%|██████▎   | 5/8 [02:59<01:47, 35.96s/it]
Generating Images:  75%|███████▌  | 6/8 [03:35<01:11, 35.96s/it]
Generating Images:  88%|████████▊ | 7/8 [04:11<00:35, 35.96s/it]
Generating Images: 100%|██████████| 8/8 [04:47<00:00, 35.96s/it]

Saving at step 2500


pa3k_professional_headshot_generator_v2:  83%|████████▎ | 2499/3000 [2:05:10<23:53,  2.86s/it, lr: 1.0e-04 loss: 2.919e-01]

Saved to /content/drive/MyDrive/headshot-generator/pa3k_professional_headshot_generator_v2/optimizer.pt
Removing old save: /content/drive/MyDrive/headshot-generator/pa3k_professional_headshot_generator_v2/pa3k_professional_headshot_generator_v2_000000500.safetensors

Timer 'pa3k_professional_headshot_generator_v2 Timer':
 - 2.8598s avg - train_loop, num = 10
 - 1.7714s avg - backward, num = 10
 - 0.8108s avg - predict_unet, num = 10
 - 0.2687s avg - reset_batch, num = 6
 - 0.1208s avg - optimizer_step, num = 10
 - 0.1088s avg - calculate_loss, num = 10
 - 0.0687s avg - encode_prompt, num = 10
 - 0.0029s avg - get_batch, num = 10
 - 0.0028s avg - preprocess_batch, num = 10
 - 0.0014s avg - prepare_noise, num = 10
 - 0.0005s avg - batch_cleanup, num = 10
 - 0.0004s avg - prepare_latents, num = 10
 - 0.0001s avg - grad_setup, num = 10
 - 0.0000s avg - scheduler_step, num = 10
 - 0.0000s avg - prepare_prompt, num = 10
 - 0.0000s avg - log_to_tensorboard, num = 5



pa3k_professional_headshot_generator_v2: 100%|█████████▉| 2999/3000 [2:30:01<00:03,  3.00s/it, lr: 1.0e-04 loss: 4.236e-01]



Saved to /content/drive/MyDrive/headshot-generator/pa3k_professional_headshot_generator_v2/optimizer.pt
