# FLUX.1 Training Notebook

- Script customized by Trelis Research.
- Based on an original script from AI Toolkit by Ostris [here](https://github.com/ostris/ai-toolkit/tree/main/notebooks).

Model License:
- FLUX Schnell is openly licensed and training works fine.
- FLUX Dev may give slightly better quality, but it can only be used for non-commercial purposes. To use FLUX Dev, sign in to Hugging Face and accept the model access conditions at [black-forest-labs/FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev), then update the model names loaded below. [Get a READ token from Hugging Face](https://huggingface.co/settings/tokens/new?) and paste it into the token prompt cell below when you run it (the prompt lines are commented out by default).

## Installation

In [1]:
!nvidia-smi

Tue Jan 14 12:40:13 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03              Driver Version: 560.35.03      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA A100 80GB PCIe          On  |   00000000:01:00.0 Off |                    0 |
| N/A   29C    P0             44W /  300W |       1MiB /  81920MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
                                                

In [2]:
!git clone https://github.com/ostris/ai-toolkit
!mkdir -p /workspace/dataset # or !mkdir -p /content/dataset # for colab

Cloning into 'ai-toolkit'...
remote: Enumerating objects: 4067, done.
remote: Counting objects: 100% (2485/2485), done.
remote: Compressing objects: 100% (252/252), done.
remote: Total 4067 (delta 2390), reused 2233 (delta 2233), pack-reused 1582 (from 4)
Receiving objects: 100% (4067/4067), 29.81 MiB | 44.05 MiB/s, done.
Resolving deltas: 100% (3080/3080), done.


Put your image dataset in the `/workspace/dataset` folder (or `/content/dataset` on Colab).
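The training config later in this notebook expects each image to have a caption in a `.txt` file with the same base name (for instance `image2.jpg` and `image2.txt`), and notes that only jpg, jpeg, and png are supported. A minimal sketch for spotting images without a caption before training starts; the helper name `find_missing_captions` is hypothetical and not part of AI Toolkit:

```python
from pathlib import Path

# Extensions the config comments list as supported.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def find_missing_captions(folder, caption_ext="txt"):
    """Return names of image files in `folder` with no same-named caption file."""
    missing = []
    for image in sorted(Path(folder).iterdir()):
        if image.suffix.lower() not in IMAGE_EXTS:
            continue  # skip captions and any other non-image files
        if not image.with_suffix(f".{caption_ext}").exists():
            missing.append(image.name)
    return missing


if __name__ == "__main__":
    dataset = Path("/workspace/dataset")  # or /content/dataset on Colab
    if dataset.is_dir():
        print("Images without captions:", find_missing_captions(dataset))
```

An empty list means every image has a matching caption file.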

In [3]:
!cd ai-toolkit && git submodule update --init --recursive && pip install -r requirements.txt -qU

Submodule 'repositories/batch_annotator' (https://github.com/ostris/batch-annotator) registered for path 'repositories/batch_annotator'
Submodule 'repositories/ipadapter' (https://github.com/tencent-ailab/IP-Adapter.git) registered for path 'repositories/ipadapter'
Submodule 'repositories/leco' (https://github.com/p1atdev/LECO) registered for path 'repositories/leco'
Submodule 'repositories/sd-scripts' (https://github.com/kohya-ss/sd-scripts.git) registered for path 'repositories/sd-scripts'
Cloning into '/workspace/ai-toolkit/repositories/batch_annotator'...
Cloning into '/workspace/ai-toolkit/repositories/ipadapter'...
Cloning into '/workspace/ai-toolkit/repositories/leco'...
Cloning into '/workspace/ai-toolkit/repositories/sd-scripts'...
Submodule path 'repositories/batch_annotator': checked out '420e142f6ad3cc14b3ea0500affc2c6c7e7544bf'
Submodule 'repositories/controlnet' (https://github.com/lllyasviel/ControlNet-v1-1-nightly.git) registered for path 'repositories/batch_annotator/r

In [4]:
import getpass
import os

# # Prompt for the token
# hf_token = getpass.getpass('Enter your HF access token and press enter: ')

# # Set the environment variable
# os.environ['HF_TOKEN'] = hf_token

# print("HF_TOKEN environment variable has been set.")

In [5]:
import os
import sys
sys.path.append('/workspace/ai-toolkit')  # or '/content/ai-toolkit' on Colab
from toolkit.job import run_job
from collections import OrderedDict
from PIL import Image
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "0"  # disable the optional hf_transfer downloader

## Inference

Beware: if you run inference before fine-tuning, the model stays loaded on the GPU and you will probably run out of VRAM during training.

So, after running inference, restart the kernel, comment out the inference cells, and run the notebook from the top (skipping the installs, which are already done).
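To confirm whether a pipeline is still holding VRAM before you start training, you can query `nvidia-smi` directly. The sketch below is one way to do that; the helper names and the `needed_mib` threshold are assumptions for illustration, so pick a threshold that matches your own GPU and config.

```python
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'used, total' rows (values in MiB) into a list of (used, total) tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total))
    return rows


def query_gpu_memory():
    """Ask nvidia-smi for per-GPU memory usage in MiB (requires an NVIDIA driver)."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_memory(result.stdout)


def has_free_vram(used_mib, total_mib, needed_mib):
    """True if at least `needed_mib` MiB is currently unused on the GPU."""
    return total_mib - used_mib >= needed_mib
```

In a notebook cell, `used, total = query_gpu_memory()[0]` followed by `has_free_vram(used, total, needed_mib=...)` gives a quick yes/no before calling `run_job`. If the answer is no, restarting the kernel as described above is the reliable fix.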

In [6]:
!pip install -U diffusers


# from diffusers import AutoPipelineForText2Image
# import torch

import torch
from diffusers import FluxPipeline

pipeline = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16)

# ## For running on Colab or Kaggle with a T4:
# # This will work, but doesn't use the GPU very well and is a bit slow.
# pipeline.enable_model_cpu_offload()  # Save VRAM by offloading the model to CPU. You can uncomment ONLY this if you want to cut VRAM but your GPU does support bfloat16
# pipeline.enable_sequential_cpu_offload()
# pipeline.vae.enable_slicing()
# pipeline.vae.enable_tiling()
# pipeline.to(torch.float16)

## For running on a GPU with at least 32 GB of VRAM
pipeline = pipeline.to("cuda")

# pipeline.load_lora_weights('output/ronantrelis_A40', weight_name='ronantrelis_A40.safetensors') # update the output dir and weight name here

[notice] A new release of pip is available: 23.3.1 -> 24.3.1
[notice] To update, run: python -m pip install --upgrade pip


The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.


0it [00:00, ?it/s]

model_index.json:   0%|          | 0.00/536 [00:00<?, ?B/s]

Fetching 23 files:   0%|          | 0/23 [00:00<?, ?it/s]

scheduler/scheduler_config.json:   0%|          | 0.00/274 [00:00<?, ?B/s]

tokenizer/special_tokens_map.json:   0%|          | 0.00/588 [00:00<?, ?B/s]

tokenizer/tokenizer_config.json:   0%|          | 0.00/705 [00:00<?, ?B/s]

tokenizer/vocab.json:   0%|          | 0.00/1.06M [00:00<?, ?B/s]

tokenizer_2/special_tokens_map.json:   0%|          | 0.00/2.54k [00:00<?, ?B/s]

text_encoder_2/config.json:   0%|          | 0.00/782 [00:00<?, ?B/s]

tokenizer_2/tokenizer.json:   0%|          | 0.00/2.42M [00:00<?, ?B/s]

(…)t_encoder_2/model.safetensors.index.json:   0%|          | 0.00/19.9k [00:00<?, ?B/s]

tokenizer_2/tokenizer_config.json:   0%|          | 0.00/20.8k [00:00<?, ?B/s]

transformer/config.json:   0%|          | 0.00/321 [00:00<?, ?B/s]

tokenizer/merges.txt:   0%|          | 0.00/525k [00:00<?, ?B/s]

text_encoder/config.json:   0%|          | 0.00/613 [00:00<?, ?B/s]

(…)ion_pytorch_model.safetensors.index.json:   0%|          | 0.00/121k [00:00<?, ?B/s]

spiece.model:   0%|          | 0.00/792k [00:00<?, ?B/s]

vae/config.json:   0%|          | 0.00/774 [00:00<?, ?B/s]

model-00002-of-00002.safetensors:   0%|          | 0.00/4.53G [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/246M [00:00<?, ?B/s]

(…)pytorch_model-00001-of-00003.safetensors:   0%|          | 0.00/9.96G [00:00<?, ?B/s]

(…)pytorch_model-00002-of-00003.safetensors:   0%|          | 0.00/9.95G [00:00<?, ?B/s]

(…)pytorch_model-00003-of-00003.safetensors:   0%|          | 0.00/3.87G [00:00<?, ?B/s]

model-00001-of-00002.safetensors:   0%|          | 0.00/4.99G [00:00<?, ?B/s]

diffusion_pytorch_model.safetensors:   0%|          | 0.00/168M [00:00<?, ?B/s]

Loading pipeline components...:   0%|          | 0/7 [00:00<?, ?it/s]

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers


In [7]:
image = pipeline('PrateekPani A man playing a game of championship snooker, close-up').images[0]
image.save("snooker_closeup_base.png")

# image = pipeline('PrateekPani A man surrounded by wind turbines and solar panels, close-up, wind-swept', height=1024, width=1824).images[0]
# image.save("windturbine5.png")

# image = pipeline('PrateekPani A man taking a photograph', height=1024, width=1824).images[0]
# image.save("photograph5.png")

image = pipeline('PrateekPani Professor at a chalkboard, with a large empty speech bubble', height=1024, width=1824).images[0]
image.save("chalkboard5.png")

  0%|          | 0/28 [00:00<?, ?it/s]

  0%|          | 0/28 [00:00<?, ?it/s]

## Fine-tuning Setup

This is the training config, documented inline. Normally you would write it as a YAML file, but in a notebook (for example on Colab) a Python `OrderedDict` works just as well. It runs as-is without modification, but feel free to edit it.

In [8]:
from collections import OrderedDict

job_to_run = OrderedDict([
    ('job', 'extension'),
    ('config', OrderedDict([
        # this name is used for the output folder and file names
        ('name', 'PrateekPani'),
        ('process', [
            OrderedDict([
                ('type', 'sd_trainer'),
                # root folder to save training sessions/samples/weights
                ('log_dir', 'logs'),  # log directory
                ('log_config', OrderedDict([
                    ('log_interval', '10')  # log interval
                ])),
                ('training_folder', '/workspace/output'),
                # print performance stats in the terminal every N steps (comment out to disable)
                ('performance_log_every', 50),
                ('device', 'cuda:0'),
                # if a trigger word is specified, it will be added to captions of training data if it does not already exist
                # alternatively, in your captions you can add [trigger] and it will be replaced with the trigger word
                ('trigger_word', 'PrateekPani'),
                ('network', OrderedDict([
                    ('type', 'lora'),
                    ('linear', 32),
                    ('linear_alpha', 32)
                ])),
                ('save', OrderedDict([
                    ('dtype', 'bfloat16'),  # precision to save
                    ('save_every', 500),  # save every this many steps
                    ('max_step_saves_to_keep', 3),  # how many intermediate saves to keep
                    ('push_to_hub', True),  # upload the trained weights to the Hugging Face Hub
                    ('hf_repo_id', 'prtk1729/PrateekPani'), # your model slug on hf
                    ('hf_private', True),
                ])),
                ('datasets', [
                    # datasets are a folder of images. captions need to be txt files with the same name as the image
                    # for instance image2.jpg and image2.txt. Only jpg, jpeg, and png are supported currently
                    # images will automatically be resized and bucketed into the resolution specified
                    OrderedDict([
                        ('folder_path', '/workspace/dataset'),
                        ('caption_ext', 'txt'),
                        ('caption_dropout_rate', 0.05),  # will drop out the caption 5% of time
                        ('shuffle_tokens', False),  # shuffle caption order, split by commas
                        ('cache_latents_to_disk', True),  # leave this true unless you know what you're doing
                        ('resolution', [512, 768, 1024])  # flux enjoys multiple resolutions
                    ])
                ]),
                ('train', OrderedDict([
                    ('batch_size', 1),
                    ('steps', 1000),  # total number of training steps; 500 - 4000 is a good range
                    ('gradient_accumulation_steps', 1),
                    ('train_unet', True),
                    ('train_text_encoder', False),  # probably won't work with flux
                    ('gradient_checkpointing', True),  # leave this on unless you have a ton of VRAM
                    ('noise_scheduler', 'flowmatch'),  # for training only
                    ('optimizer', 'adamw8bit'),
                    ('lr', 1e-4),

                    # skip the pre-training sample (comment this out to generate it)
                    ('skip_first_sample', True),

                    # uncomment to completely disable sampling
                    # ('disable_sampling', True),

                    # uncomment to use new bell-curved weighting. Experimental but may produce better results
                    # ('linear_timesteps', True),

                    # ema will smooth out learning, but could slow it down. Recommended to leave on.
                    ('ema_config', OrderedDict([
                        ('use_ema', True),
                        ('ema_decay', 0.99)
                    ])),

                    # will probably need this if gpu supports it for flux, other dtypes may not work correctly
                    ('dtype', 'bf16')
                ])),
                ('model', OrderedDict([
                    # huggingface model name or path
                    ('name_or_path', 'black-forest-labs/FLUX.1-schnell'),
                    ('cache_dir', 'cache_dir'),
                    ('assistant_lora_path', 'ostris/FLUX.1-schnell-training-adapter'), # Required for flux schnell training
                    ('is_flux', True),
                    ('quantize', True),  # run 8bit mixed precision
                    # low_vram is painfully slow to fuse in the adapter avoid it unless absolutely necessary
                    #('low_vram', True),  # uncomment this if the GPU is connected to your monitors. It will use less vram to quantize, but is slower.
                ])),
                ('sample', OrderedDict([
                    ('sampler', 'flowmatch'),  # must match train.noise_scheduler
                    ('sample_every', 500),  # sample every this many steps
                    ('width', 1024),
                    ('height', 1024),
                    ('prompts', [
                        # you can add [trigger] to the prompts here and it will be replaced with the trigger word
                        # '[trigger] holding a sign that says \'I LOVE PROMPTS!\'',
                        "[trigger] A man holding a sign that says 'I LOVE PROMPTS!'",
                        "[trigger] A man playing chess at the park, bomb going off in the background",
                        "[trigger] A man holding a coffee cup, in a beanie, sitting at a cafe",
                        "[trigger] A man is a DJ at a night club, fish eye lens, smoke machine, laser lights, holding a martini",
                        "[trigger] A man showing off his cool new t shirt at the beach, a shark is jumping out of the water in the background",
                    ]),
                    ('neg', ''),  # not used on flux
                    ('seed', 42),
                    ('walk_seed', True),
                    ('guidance_scale', 1), # schnell does not do guidance
                    ('sample_steps', 4) # 1 - 4 works well
                ]))
            ])
        ])
    ])),
    # you can add any additional meta info here. [name] is replaced with config name at top
    ('meta', OrderedDict([
        ('name', '[name]'),
        ('version', '1.0')
    ]))
])
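The config comments mention a few cross-field constraints, for example that `sample.sampler` must match `train.noise_scheduler`. A small sanity check along those lines is sketched below. `check_job` is a hypothetical helper, not part of AI Toolkit, and the rules it applies are only the ones stated in the comments above plus the assumption that pushing to the Hub needs a repo id.

```python
def check_job(job):
    """Return a list of warnings for a job dict shaped like `job_to_run`."""
    proc = job["config"]["process"][0]
    train = proc.get("train", {})
    sample = proc.get("sample", {})
    save = proc.get("save", {})

    problems = []
    # Stated in the config comments: the sampler must match the training scheduler.
    if sample.get("sampler") != train.get("noise_scheduler"):
        problems.append("sample.sampler should match train.noise_scheduler")
    # Assumption: pushing to the Hub is pointless without a target repo.
    if save.get("push_to_hub") and not save.get("hf_repo_id"):
        problems.append("save.push_to_hub is set but save.hf_repo_id is empty")
    if train.get("steps", 0) <= 0:
        problems.append("train.steps should be a positive integer")
    return problems


# Minimal stand-in with the same nesting as the notebook's config.
example_job = {
    "config": {
        "process": [
            {
                "train": {"noise_scheduler": "flowmatch", "steps": 1000},
                "sample": {"sampler": "flowmatch"},
                "save": {"push_to_hub": True, "hf_repo_id": "user/repo"},
            }
        ]
    }
}
print(check_job(example_job))
```

In the notebook itself you would call `check_job(job_to_run)`; indexing works the same way on an `OrderedDict`.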

In [9]:
!nvidia-smi

Tue Jan 14 12:53:14 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03              Driver Version: 560.35.03      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA A100 80GB PCIe          On  |   00000000:01:00.0 Off |                    0 |
| N/A   35C    P0             66W /  300W |   41611MiB /  81920MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
                                                

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
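The warning above appears because `!nvidia-smi` forks the notebook process after the tokenizers library has already used parallelism. As the message suggests, setting the variable explicitly silences it. A minimal sketch, assuming you run it in an early cell before any tokenizer is loaded:

```python
import os

# Only set the variable if the user has not already chosen a value.
os.environ.setdefault("TOKENIZERS_PARALLELISM", "false")
```

Using `setdefault` keeps any value you exported in the shell or set in an earlier cell.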


## Run Fine-tuning

Check the file browser on the left for results. They are written under `/workspace/output` (this run saves to `/workspace/output/PrateekPani`). The samples folder holds the periodic samples; previewing them doesn't work great in Colab.
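Once training has written something, a short helper can list the saved weights and samples instead of browsing by hand. This is a sketch under two assumptions: that the run directory is `training_folder/name` (as the save path in the log suggests) and that samples sit in a `samples` subfolder of it. The function name is hypothetical.

```python
from pathlib import Path


def list_training_outputs(training_folder, name):
    """List .safetensors weights and sample files for one training run."""
    run_dir = Path(training_folder) / name
    if not run_dir.is_dir():
        return {"weights": [], "samples": []}

    weights = sorted(p.name for p in run_dir.glob("*.safetensors"))

    samples_dir = run_dir / "samples"  # assumed location of periodic samples
    samples = sorted(p.name for p in samples_dir.iterdir()) if samples_dir.is_dir() else []
    return {"weights": weights, "samples": samples}


if __name__ == "__main__":
    print(list_training_outputs("/workspace/output", "PrateekPani"))
```

The weight file names returned here are what you would pass as `weight_name` to `pipeline.load_lora_weights` in the inference cell.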

In [10]:
run_job(job_to_run)

  check_for_updates()
  return register_model(fn_wrapper)
  self.scaler = torch.cuda.amp.GradScaler()


{
    "type": "sd_trainer",
    "log_dir": "logs",
    "log_config": {
        "log_interval": "10"
    },
    "training_folder": "/workspace/output",
    "performance_log_every": 50,
    "device": "cuda:0",
    "trigger_word": "PrateekPani",
    "network": {
        "type": "lora",
        "linear": 32,
        "linear_alpha": 32
    },
    "save": {
        "dtype": "bfloat16",
        "save_every": 500,
        "max_step_saves_to_keep": 3,
        "push_to_hub": true,
        "hf_repo_id": "prtk1729/PrateekPani",
        "hf_private": true
    },
    "datasets": [
        {
            "folder_path": "/workspace/dataset",
            "caption_ext": "txt",
            "caption_dropout_rate": 0.05,
            "shuffle_tokens": false,
            "cache_latents_to_disk": true,
            "resolution": [
                512,
                768,
                1024
            ]
        }
    ],
    "train": {
        "batch_size": 1,
        "steps": 1000,
        "gradient_accumula

Fetching 3 files:   0%|          | 0/3 [00:00<?, ?it/s]

Grabbing lora from the hub: ostris/FLUX.1-schnell-training-adapter


pytorch_lora_weights.safetensors:   0%|          | 0.00/451M [00:00<?, ?B/s]

Fusing in LoRA
Quantizing transformer
Loading vae
Loading t5


Downloading shards:   0%|          | 0/2 [00:00<?, ?it/s]

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Quantizing T5
Loading clip
making pipe
preparing
Loading assistant lora
Loading assistant adapter from /root/.cache/huggingface/hub/models--ostris--FLUX.1-schnell-training-adapter/snapshots/2715e8057d640acb4519b99f1d138ed3f1ac227c/pytorch_lora_weights.safetensors
create LoRA network. base dim (rank): 42, alpha: 42
neuron dropout: p=None, rank dropout: p=None, module dropout: p=None
create LoRA for Text Encoder: 0 modules.
create LoRA for U-Net: 494 modules.
enable LoRA for U-Net
Missing keys: []
create LoRA network. base dim (rank): 32, alpha: 32
neuron dropout: p=None, rank dropout: p=None, module dropout: p=None
create LoRA for Text Encoder: 0 modules.
create LoRA for U-Net: 494 modules.
enable LoRA for U-Net
Dataset: /workspace/dataset
  -  Preprocessing image dimensions


100%|██████████| 4/4 [00:00<00:00, 146.20it/s]

  -  Found 4 images
Bucket sizes for /workspace/dataset:
192x448: 1 files
448x512: 1 files
576x384: 1 files
448x576: 1 files
4 buckets made
Caching latents for /workspace/dataset
 - Saving latents to disk



Caching latents to disk: 100%|██████████| 4/4 [00:00<00:00, 24.01it/s]


Dataset: /workspace/dataset
  -  Preprocessing image dimensions


100%|██████████| 4/4 [00:00<00:00, 31714.96it/s]

  -  Found 4 images
Bucket sizes for /workspace/dataset:
192x448: 1 files
640x768: 1 files
832x576: 1 files
640x832: 1 files
4 buckets made
Caching latents for /workspace/dataset
 - Saving latents to disk



Caching latents to disk: 100%|██████████| 4/4 [00:00<00:00, 18.81it/s]


Dataset: /workspace/dataset
  -  Preprocessing image dimensions


100%|██████████| 4/4 [00:00<00:00, 31895.85it/s]

  -  Found 4 images
Bucket sizes for /workspace/dataset:
192x448: 1 files
896x1088: 1 files
1216x832: 1 files
832x1152: 1 files
4 buckets made
Caching latents for /workspace/dataset
 - Saving latents to disk



Caching latents to disk: 100%|██████████| 4/4 [00:00<00:00, 11.70it/s]


Skipping first sample due to config setting


PrateekPani:   5%|▍         | 49/1000 [01:35<27:23,  1.73s/it, lr: 1.0e-04 loss: 5.345e-01]


Timer 'PrateekPani Timer':
 - 2.1883s avg - train_loop, num = 10
 - 1.2875s avg - backward, num = 10
 - 0.6399s avg - predict_unet, num = 10
 - 0.2604s avg - reset_batch, num = 4
 - 0.0797s avg - calculate_loss, num = 10
 - 0.0736s avg - optimizer_step, num = 10
 - 0.0422s avg - encode_prompt, num = 10
 - 0.0025s avg - get_batch, num = 10
 - 0.0015s avg - preprocess_batch, num = 10
 - 0.0009s avg - prepare_noise, num = 10
 - 0.0003s avg - prepare_latents, num = 10
 - 0.0003s avg - batch_cleanup, num = 10
 - 0.0000s avg - prepare_prompt, num = 10
 - 0.0000s avg - grad_setup, num = 10
 - 0.0000s avg - scheduler_step, num = 10



PrateekPani:  10%|▉         | 99/1000 [03:10<32:41,  2.18s/it, lr: 1.0e-04 loss: 5.481e-01]


Timer 'PrateekPani Timer':
 - 1.8308s avg - train_loop, num = 10
 - 1.0456s avg - backward, num = 10
 - 0.5518s avg - predict_unet, num = 10
 - 0.2604s avg - reset_batch, num = 4
 - 0.0633s avg - optimizer_step, num = 10
 - 0.0620s avg - calculate_loss, num = 10
 - 0.0421s avg - encode_prompt, num = 10
 - 0.0021s avg - get_batch, num = 10
 - 0.0014s avg - preprocess_batch, num = 10
 - 0.0008s avg - prepare_noise, num = 10
 - 0.0003s avg - prepare_latents, num = 10
 - 0.0002s avg - batch_cleanup, num = 10
 - 0.0003s avg - log_to_tensorboard, num = 1
 - 0.0000s avg - prepare_prompt, num = 10
 - 0.0000s avg - grad_setup, num = 10
 - 0.0000s avg - scheduler_step, num = 10



PrateekPani:  15%|█▍        | 149/1000 [04:45<34:58,  2.47s/it, lr: 1.0e-04 loss: 5.335e-01]


Timer 'PrateekPani Timer':
 - 2.0747s avg - train_loop, num = 10
 - 1.2264s avg - backward, num = 10
 - 0.6026s avg - predict_unet, num = 10
 - 0.2574s avg - reset_batch, num = 4
 - 0.0695s avg - calculate_loss, num = 10
 - 0.0684s avg - optimizer_step, num = 10
 - 0.0418s avg - encode_prompt, num = 10
 - 0.0021s avg - get_batch, num = 10
 - 0.0013s avg - preprocess_batch, num = 10
 - 0.0008s avg - prepare_noise, num = 10
 - 0.0003s avg - prepare_latents, num = 10
 - 0.0002s avg - batch_cleanup, num = 10
 - 0.0000s avg - grad_setup, num = 10
 - 0.0000s avg - prepare_prompt, num = 10
 - 0.0000s avg - scheduler_step, num = 10



PrateekPani:  20%|█▉        | 199/1000 [06:17<22:00,  1.65s/it, lr: 1.0e-04 loss: 3.753e-01]


Timer 'PrateekPani Timer':
 - 1.7060s avg - train_loop, num = 10
 - 0.9700s avg - backward, num = 10
 - 0.5076s avg - predict_unet, num = 10
 - 0.2573s avg - reset_batch, num = 4
 - 0.0620s avg - optimizer_step, num = 10
 - 0.0553s avg - calculate_loss, num = 10
 - 0.0448s avg - encode_prompt, num = 10
 - 0.0021s avg - get_batch, num = 10
 - 0.0013s avg - preprocess_batch, num = 10
 - 0.0009s avg - prepare_noise, num = 10
 - 0.0002s avg - prepare_latents, num = 10
 - 0.0002s avg - batch_cleanup, num = 10
 - 0.0000s avg - prepare_prompt, num = 10
 - 0.0003s avg - log_to_tensorboard, num = 1
 - 0.0000s avg - grad_setup, num = 10
 - 0.0000s avg - scheduler_step, num = 10



PrateekPani:  25%|██▍       | 249/1000 [07:51<23:52,  1.91s/it, lr: 1.0e-04 loss: 4.675e-01]


Timer 'PrateekPani Timer':
 - 1.8126s avg - train_loop, num = 10
 - 1.0847s avg - backward, num = 10
 - 0.5430s avg - predict_unet, num = 10
 - 0.2549s avg - reset_batch, num = 4
 - 0.0643s avg - calculate_loss, num = 10
 - 0.0640s avg - optimizer_step, num = 10
 - 0.0401s avg - encode_prompt, num = 10
 - 0.0016s avg - get_batch, num = 10
 - 0.0011s avg - preprocess_batch, num = 10
 - 0.0006s avg - prepare_noise, num = 10
 - 0.0003s avg - prepare_latents, num = 10
 - 0.0002s avg - batch_cleanup, num = 10
 - 0.0000s avg - prepare_prompt, num = 10
 - 0.0000s avg - scheduler_step, num = 10
 - 0.0000s avg - grad_setup, num = 10



PrateekPani:  30%|██▉       | 299/1000 [09:28<24:06,  2.06s/it, lr: 1.0e-04 loss: 2.227e-01]


Timer 'PrateekPani Timer':
 - 2.0373s avg - train_loop, num = 10
 - 1.2025s avg - backward, num = 10
 - 0.5872s avg - predict_unet, num = 10
 - 0.2568s avg - reset_batch, num = 5
 - 0.0719s avg - optimizer_step, num = 10
 - 0.0714s avg - calculate_loss, num = 10
 - 0.0415s avg - encode_prompt, num = 10
 - 0.0023s avg - get_batch, num = 10
 - 0.0012s avg - preprocess_batch, num = 10
 - 0.0007s avg - prepare_noise, num = 10
 - 0.0003s avg - prepare_latents, num = 10
 - 0.0003s avg - batch_cleanup, num = 10
 - 0.0005s avg - log_to_tensorboard, num = 1
 - 0.0000s avg - prepare_prompt, num = 10
 - 0.0000s avg - grad_setup, num = 10
 - 0.0000s avg - scheduler_step, num = 10



PrateekPani:  35%|███▍      | 349/1000 [11:02<16:10,  1.49s/it, lr: 1.0e-04 loss: 4.399e-01]


Timer 'PrateekPani Timer':
 - 1.8298s avg - train_loop, num = 10
 - 1.0513s avg - backward, num = 10
 - 0.5459s avg - predict_unet, num = 10
 - 0.2581s avg - reset_batch, num = 4
 - 0.0644s avg - optimizer_step, num = 10
 - 0.0595s avg - calculate_loss, num = 10
 - 0.0420s avg - encode_prompt, num = 10
 - 0.0024s avg - get_batch, num = 10
 - 0.0012s avg - preprocess_batch, num = 10
 - 0.0008s avg - prepare_noise, num = 10
 - 0.0003s avg - prepare_latents, num = 10
 - 0.0002s avg - batch_cleanup, num = 10
 - 0.0000s avg - grad_setup, num = 10
 - 0.0000s avg - prepare_prompt, num = 10
 - 0.0000s avg - scheduler_step, num = 10



PrateekPani:  40%|███▉      | 399/1000 [12:35<17:43,  1.77s/it, lr: 1.0e-04 loss: 5.374e-01]


Timer 'PrateekPani Timer':
 - 1.8239s avg - train_loop, num = 10
 - 1.0526s avg - backward, num = 10
 - 0.5379s avg - predict_unet, num = 10
 - 0.2570s avg - reset_batch, num = 4
 - 0.0639s avg - optimizer_step, num = 10
 - 0.0610s avg - calculate_loss, num = 10
 - 0.0419s avg - encode_prompt, num = 10
 - 0.0021s avg - get_batch, num = 10
 - 0.0012s avg - preprocess_batch, num = 10
 - 0.0008s avg - prepare_noise, num = 10
 - 0.0003s avg - prepare_latents, num = 10
 - 0.0002s avg - batch_cleanup, num = 10
 - 0.0003s avg - log_to_tensorboard, num = 1
 - 0.0000s avg - prepare_prompt, num = 10
 - 0.0000s avg - grad_setup, num = 10
 - 0.0000s avg - scheduler_step, num = 10



PrateekPani:  45%|████▍     | 449/1000 [14:11<18:11,  1.98s/it, lr: 1.0e-04 loss: 1.461e-01]


Timer 'PrateekPani Timer':
 - 2.1608s avg - train_loop, num = 10
 - 1.2794s avg - backward, num = 10
 - 0.6234s avg - predict_unet, num = 10
 - 0.2587s avg - reset_batch, num = 4
 - 0.0785s avg - calculate_loss, num = 10
 - 0.0744s avg - optimizer_step, num = 10
 - 0.0418s avg - encode_prompt, num = 10
 - 0.0023s avg - get_batch, num = 10
 - 0.0013s avg - preprocess_batch, num = 10
 - 0.0008s avg - prepare_noise, num = 10
 - 0.0003s avg - prepare_latents, num = 10
 - 0.0002s avg - batch_cleanup, num = 10
 - 0.0000s avg - grad_setup, num = 10
 - 0.0000s avg - prepare_prompt, num = 10
 - 0.0000s avg - scheduler_step, num = 10



PrateekPani:  50%|████▉     | 499/1000 [15:46<16:54,  2.02s/it, lr: 1.0e-04 loss: 3.548e-01]

Unloading assistant lora



Generating Images:   0%|          | 0/5 [00:00<?, ?it/s]
Generating Images:  20%|██        | 1/5 [00:04<00:19,  4.78s/it]
Generating Images:  40%|████      | 2/5 [00:09<00:14,  4.75s/it]
Generating Images:  60%|██████    | 3/5 [00:14<00:09,  4.74s/it]
Generating Images:  80%|████████  | 4/5 [00:18<00:04,  4.73s/it]
Generating Images: 100%|██████████| 5/5 [00:23<00:00,  4.72s/it]

Loading assistant lora


PrateekPani:  50%|████▉     | 499/1000 [15:46<16:54,  2.02s/it, lr: 1.0e-04 loss: 3.548e-01]

Saving at step 500


PrateekPani:  50%|████▉     | 499/1000 [15:59<16:54,  2.02s/it, lr: 1.0e-04 loss: 3.548e-01]

Saved to /workspace/output/PrateekPani/optimizer.pt

Timer 'PrateekPani Timer':
 - 2.0870s avg - train_loop, num = 10
 - 1.2221s avg - backward, num = 10
 - 0.6124s avg - predict_unet, num = 10
 - 0.2585s avg - reset_batch, num = 4
 - 0.0728s avg - calculate_loss, num = 10
 - 0.0706s avg - optimizer_step, num = 10
 - 0.0419s avg - encode_prompt, num = 10
 - 0.0023s avg - get_batch, num = 10
 - 0.0013s avg - preprocess_batch, num = 10
 - 0.0008s avg - prepare_noise, num = 10
 - 0.0003s avg - prepare_latents, num = 10
 - 0.0003s avg - batch_cleanup, num = 10
 - 0.0000s avg - prepare_prompt, num = 10
 - 0.0003s avg - log_to_tensorboard, num = 1
 - 0.0000s avg - grad_setup, num = 10
 - 0.0000s avg - scheduler_step, num = 10



PrateekPani:  55%|█████▍    | 549/1000 [17:20<15:25,  2.05s/it, lr: 1.0e-04 loss: 4.244e-01]


Timer 'PrateekPani Timer':
 - 2.0100s avg - train_loop, num = 10
 - 1.1979s avg - backward, num = 10
 - 0.6160s avg - predict_unet, num = 10
 - 0.2732s avg - reset_batch, num = 4
 - 0.0693s avg - optimizer_step, num = 10
 - 0.0671s avg - calculate_loss, num = 10
 - 0.0427s avg - encode_prompt, num = 10
 - 0.0014s avg - get_batch, num = 10
 - 0.0013s avg - preprocess_batch, num = 10
 - 0.0007s avg - prepare_noise, num = 10
 - 0.0003s avg - prepare_latents, num = 10
 - 0.0002s avg - batch_cleanup, num = 10
 - 0.0000s avg - prepare_prompt, num = 10
 - 0.0000s avg - scheduler_step, num = 10
 - 0.0000s avg - grad_setup, num = 10



PrateekPani:  60%|█████▉    | 599/1000 [18:54<11:35,  1.73s/it, lr: 1.0e-04 loss: 4.604e-01]


Timer 'PrateekPani Timer':
 - 1.9953s avg - train_loop, num = 10
 - 1.1508s avg - backward, num = 10
 - 0.5936s avg - predict_unet, num = 10
 - 0.2784s avg - reset_batch, num = 5
 - 0.0699s avg - optimizer_step, num = 10
 - 0.0681s avg - calculate_loss, num = 10
 - 0.0419s avg - encode_prompt, num = 10
 - 0.0022s avg - get_batch, num = 10
 - 0.0013s avg - preprocess_batch, num = 10
 - 0.0008s avg - prepare_noise, num = 10
 - 0.0003s avg - prepare_latents, num = 10
 - 0.0002s avg - batch_cleanup, num = 10
 - 0.0004s avg - log_to_tensorboard, num = 1
 - 0.0000s avg - prepare_prompt, num = 10
 - 0.0000s avg - grad_setup, num = 10
 - 0.0000s avg - scheduler_step, num = 10



PrateekPani:  65%|██████▍   | 649/1000 [20:30<11:05,  1.90s/it, lr: 1.0e-04 loss: 4.495e-01]


Timer 'PrateekPani Timer':
 - 1.9280s avg - train_loop, num = 10
 - 1.1201s avg - backward, num = 10
 - 0.5698s avg - predict_unet, num = 10
 - 0.2702s avg - reset_batch, num = 4
 - 0.0650s avg - optimizer_step, num = 10
 - 0.0618s avg - calculate_loss, num = 10
 - 0.0426s avg - encode_prompt, num = 10
 - 0.0022s avg - get_batch, num = 10
 - 0.0013s avg - preprocess_batch, num = 10
 - 0.0008s avg - prepare_noise, num = 10
 - 0.0002s avg - prepare_latents, num = 10
 - 0.0002s avg - batch_cleanup, num = 10
 - 0.0000s avg - prepare_prompt, num = 10
 - 0.0000s avg - grad_setup, num = 10
 - 0.0000s avg - scheduler_step, num = 10



PrateekPani:  70%|██████▉   | 699/1000 [22:04<08:13,  1.64s/it, lr: 1.0e-04 loss: 6.318e-01]


Timer 'PrateekPani Timer':
 - 1.9235s avg - train_loop, num = 10
 - 1.1200s avg - backward, num = 10
 - 0.5643s avg - predict_unet, num = 10
 - 0.2646s avg - reset_batch, num = 4
 - 0.0677s avg - optimizer_step, num = 10
 - 0.0638s avg - calculate_loss, num = 10
 - 0.0433s avg - encode_prompt, num = 10
 - 0.0023s avg - get_batch, num = 10
 - 0.0014s avg - preprocess_batch, num = 10
 - 0.0009s avg - prepare_noise, num = 10
 - 0.0003s avg - prepare_latents, num = 10
 - 0.0002s avg - batch_cleanup, num = 10
 - 0.0004s avg - log_to_tensorboard, num = 1
 - 0.0000s avg - prepare_prompt, num = 10
 - 0.0000s avg - grad_setup, num = 10
 - 0.0000s avg - scheduler_step, num = 10



PrateekPani:  75%|███████▍  | 749/1000 [23:38<08:37,  2.06s/it, lr: 1.0e-04 loss: 3.640e-01]


Timer 'PrateekPani Timer':
 - 1.7202s avg - train_loop, num = 10
 - 0.9806s avg - backward, num = 10
 - 0.5158s avg - predict_unet, num = 10
 - 0.2762s avg - reset_batch, num = 4
 - 0.0608s avg - optimizer_step, num = 10
 - 0.0502s avg - calculate_loss, num = 10
 - 0.0428s avg - encode_prompt, num = 10
 - 0.0023s avg - get_batch, num = 10
 - 0.0013s avg - preprocess_batch, num = 10
 - 0.0008s avg - prepare_noise, num = 10
 - 0.0003s avg - prepare_latents, num = 10
 - 0.0002s avg - batch_cleanup, num = 10
 - 0.0000s avg - prepare_prompt, num = 10
 - 0.0000s avg - grad_setup, num = 10
 - 0.0000s avg - scheduler_step, num = 10



PrateekPani:  80%|███████▉  | 799/1000 [25:13<07:24,  2.21s/it, lr: 1.0e-04 loss: 6.853e-02]


Timer 'PrateekPani Timer':
 - 1.8730s avg - train_loop, num = 10
 - 1.0867s avg - backward, num = 10
 - 0.5479s avg - predict_unet, num = 10
 - 0.2598s avg - reset_batch, num = 4
 - 0.0652s avg - optimizer_step, num = 10
 - 0.0645s avg - calculate_loss, num = 10
 - 0.0423s avg - encode_prompt, num = 10
 - 0.0020s avg - get_batch, num = 10
 - 0.0013s avg - preprocess_batch, num = 10
 - 0.0009s avg - prepare_noise, num = 10
 - 0.0003s avg - prepare_latents, num = 10
 - 0.0002s avg - batch_cleanup, num = 10
 - 0.0003s avg - log_to_tensorboard, num = 1
 - 0.0000s avg - grad_setup, num = 10
 - 0.0000s avg - prepare_prompt, num = 10
 - 0.0000s avg - scheduler_step, num = 10



PrateekPani:  85%|████████▍ | 849/1000 [26:49<05:10,  2.06s/it, lr: 1.0e-04 loss: 4.762e-01]


Timer 'PrateekPani Timer':
 - 1.9599s avg - train_loop, num = 10
 - 1.1666s avg - backward, num = 10
 - 0.6023s avg - predict_unet, num = 10
 - 0.2622s avg - reset_batch, num = 4
 - 0.0683s avg - optimizer_step, num = 10
 - 0.0639s avg - calculate_loss, num = 10
 - 0.0429s avg - encode_prompt, num = 10
 - 0.0014s avg - get_batch, num = 10
 - 0.0012s avg - preprocess_batch, num = 10
 - 0.0007s avg - prepare_noise, num = 10
 - 0.0003s avg - prepare_latents, num = 10
 - 0.0002s avg - batch_cleanup, num = 10
 - 0.0000s avg - prepare_prompt, num = 10
 - 0.0000s avg - scheduler_step, num = 10
 - 0.0000s avg - grad_setup, num = 10



PrateekPani:  90%|████████▉ | 899/1000 [28:24<03:15,  1.93s/it, lr: 1.0e-04 loss: 5.526e-01]


Timer 'PrateekPani Timer':
 - 2.2104s avg - train_loop, num = 10
 - 1.2987s avg - backward, num = 10
 - 0.6488s avg - predict_unet, num = 10
 - 0.2582s avg - reset_batch, num = 5
 - 0.0804s avg - calculate_loss, num = 10
 - 0.0731s avg - optimizer_step, num = 10
 - 0.0430s avg - encode_prompt, num = 10
 - 0.0023s avg - get_batch, num = 10
 - 0.0013s avg - preprocess_batch, num = 10
 - 0.0008s avg - prepare_noise, num = 10
 - 0.0003s avg - batch_cleanup, num = 10
 - 0.0003s avg - prepare_latents, num = 10
 - 0.0000s avg - grad_setup, num = 10
 - 0.0000s avg - prepare_prompt, num = 10
 - 0.0003s avg - log_to_tensorboard, num = 1
 - 0.0000s avg - scheduler_step, num = 10



PrateekPani:  95%|█████████▍| 949/1000 [29:58<01:47,  2.11s/it, lr: 1.0e-04 loss: 4.439e-01]


Timer 'PrateekPani Timer':
 - 1.9070s avg - train_loop, num = 10
 - 1.0895s avg - backward, num = 10
 - 0.5747s avg - predict_unet, num = 10
 - 0.2624s avg - reset_batch, num = 4
 - 0.0681s avg - optimizer_step, num = 10
 - 0.0638s avg - calculate_loss, num = 10
 - 0.0449s avg - encode_prompt, num = 10
 - 0.0023s avg - get_batch, num = 10
 - 0.0013s avg - preprocess_batch, num = 10
 - 0.0008s avg - prepare_noise, num = 10
 - 0.0003s avg - prepare_latents, num = 10
 - 0.0002s avg - batch_cleanup, num = 10
 - 0.0000s avg - prepare_prompt, num = 10
 - 0.0000s avg - grad_setup, num = 10
 - 0.0000s avg - scheduler_step, num = 10



PrateekPani: 100%|█████████▉| 999/1000 [31:29<00:01,  1.89s/it, lr: 1.0e-04 loss: 2.105e-01]


Unloading assistant lora


                                                                

Loading assistant lora

Saved to /workspace/output/PrateekPani/optimizer.pt



Fine-grained tokens added complexity to the permissions, making it irrelevant to check if a token has 'write' access.



    _|    _|  _|    _|    _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|_|_|_|    _|_|      _|_|_|  _|_|_|_|
    _|    _|  _|    _|  _|        _|          _|    _|_|    _|  _|            _|        _|    _|  _|        _|
    _|_|_|_|  _|    _|  _|  _|_|  _|  _|_|    _|    _|  _|  _|  _|  _|_|      _|_|_|    _|_|_|_|  _|        _|_|_|
    _|    _|  _|    _|  _|    _|  _|    _|    _|    _|    _|_|  _|    _|      _|        _|    _|  _|        _|
    _|    _|    _|_|      _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|        _|    _|    _|_|_|  _|_|_|_|



Enter your token (input will not be visible):  ········
Add token as git credential? (Y/n)  Y


Token has not been saved to git credential helper.


Cannot authenticate through git-credential as no helper is defined on your machine.
You might have to re-authenticate when pushing to the Hugging Face Hub.
Run the following command in your terminal in case you want to set the 'store' credential helper as default.

git config --global credential.helper store

Read https://git-scm.com/book/en/v2/Git-Tools-Credential-Storage for more details.


Upload 2 LFS files:   0%|          | 0/2 [00:00<?, ?it/s]

PrateekPani.safetensors:   0%|          | 0.00/344M [00:00<?, ?B/s]

PrateekPani_000000500.safetensors:   0%|          | 0.00/344M [00:00<?, ?B/s]

Output files and samples are now in your output directory.
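
The timer blocks in the training output show where each step's time goes. As a rough aid for reading them, here is a minimal sketch that parses one block (copied from the output above, truncated to its five largest entries) and expresses each stage as a share of `train_loop`. The helper names `parse_timer_block` and `share_of_loop` are hypothetical, and the sketch assumes every stage runs inside `train_loop`, which the log itself does not confirm.

```python
import re

# One timer block copied from the training output above (around step 649),
# truncated to its five largest entries.
TIMER_BLOCK = """\
Timer 'PrateekPani Timer':
 - 1.9280s avg - train_loop, num = 10
 - 1.1201s avg - backward, num = 10
 - 0.5698s avg - predict_unet, num = 10
 - 0.2702s avg - reset_batch, num = 4
 - 0.0650s avg - optimizer_step, num = 10
"""

# Matches lines such as " - 1.1201s avg - backward, num = 10".
LINE_RE = re.compile(r"^\s*-\s*([\d.]+)s avg - (\w+), num = (\d+)\s*$")


def parse_timer_block(text):
    """Return {stage: (avg_seconds, num_samples)} for each timer line."""
    stages = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            avg, name, num = match.groups()
            stages[name] = (float(avg), int(num))
    return stages


def share_of_loop(stages, loop_name="train_loop"):
    """Express each stage's average as a fraction of the loop average.

    Assumption: each stage is nested inside the loop. Stages with a
    smaller sample count (e.g. reset_batch) make the shares approximate.
    """
    loop_avg = stages[loop_name][0]
    return {
        name: avg / loop_avg
        for name, (avg, _) in stages.items()
        if name != loop_name
    }


stages = parse_timer_block(TIMER_BLOCK)
shares = share_of_loop(stages)
for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name:>15}: {share:6.1%}")
```

On this block the backward pass accounts for roughly 58% of the average step time, so it is the first place to look if steps feel slow.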

### Start Tensorboard to view progress (optional)

To do this, SSH into the machine using VSCode/Cursor (or another client), open a terminal, and `cd` into the `workspace` directory.

Then run:
```
pip install tensorboard
tensorboard --logdir logs/
```
VSCode should forward the server's port to your localhost so you can view the TensorBoard results in a browser. Note that nothing will appear until the first data has been logged, which happens after 50 steps with the settings above.

## Push to Hub

In [4]:
from huggingface_hub import HfApi, create_repo

# Initialize HfApi object
api = HfApi()

# Hardcoded local folder path to upload
local_folder_path = "output/loraRonan"  # Replace with your actual folder path

# Hardcoded repository details
repo_id = "Trelis/loraRonan"  # Replace with your actual repo ID
repo_type = "model"  # Change to "space" or "dataset" if needed

# Create the repository if it does not exist yet
try:
    # With exist_ok=True, create_repo does not raise if the repository already exists
    create_repo(repo_id, repo_type=repo_type, exist_ok=True)
    print(f"Repository '{repo_id}' created or already exists.")
except Exception as e:
    print(f"Error creating repository: {e}")

# Optional: ignore patterns and/or allow patterns for filtering files
# ignore_patterns = "**/logs/*.txt"  # Ignore all text log files, adjust as needed
allow_patterns = None  # You can define patterns for allowed files if necessary

# Upload the folder to the repository
try:
    api.upload_folder(
        folder_path=local_folder_path,
        repo_id=repo_id,
        repo_type=repo_type,
        # ignore_patterns=ignore_patterns,  # Adjust or remove based on needs
        allow_patterns=allow_patterns     # Adjust or remove based on needs
    )
    print(f"Successfully uploaded the folder: {local_folder_path} to repo: {repo_id}")
except Exception as e:
    print(f"Error uploading folder: {e}")

Repository 'Trelis/loraRonan' created or already exists.


loraRonan.safetensors:   0%|          | 0.00/344M [00:00<?, ?B/s]

loraRonan_000000500.safetensors:   0%|          | 0.00/344M [00:00<?, ?B/s]

Upload 4 LFS files:   0%|          | 0/4 [00:00<?, ?it/s]

optimizer.pt:   0%|          | 0.00/346M [00:00<?, ?B/s]

loraRonan_000001000.safetensors:   0%|          | 0.00/344M [00:00<?, ?B/s]

Successfully uploaded the folder: output/loraRonan to repo: Trelis/loraRonan
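
The upload above includes `optimizer.pt` (346M), which is only needed to resume training. If you would rather push just the LoRA weights, `allow_patterns` or `ignore_patterns` can narrow the upload. The sketch below is a local preview only: `preview_upload` is a hypothetical helper, and it assumes the patterns behave like standard `fnmatch` globs, so treat the result as an approximation of what `upload_folder` would select. The file list is taken from the upload progress bars above; the real folder may contain other files (for example configs or samples).

```python
from fnmatch import fnmatch


def preview_upload(filenames, allow_patterns=None, ignore_patterns=None):
    """Return the filenames that would survive the given glob filters.

    A file is kept if it matches at least one allow pattern (when any are
    given) and matches no ignore pattern (when any are given).
    """
    kept = []
    for name in filenames:
        if allow_patterns is not None and not any(
            fnmatch(name, pattern) for pattern in allow_patterns
        ):
            continue
        if ignore_patterns is not None and any(
            fnmatch(name, pattern) for pattern in ignore_patterns
        ):
            continue
        kept.append(name)
    return kept


# The four LFS files shown in the upload output above.
files = [
    "loraRonan.safetensors",
    "loraRonan_000000500.safetensors",
    "loraRonan_000001000.safetensors",
    "optimizer.pt",
]

weights_only = preview_upload(files, allow_patterns=["*.safetensors"])
without_optimizer = preview_upload(files, ignore_patterns=["optimizer.pt"])
print(weights_only)
print(without_optimizer)
```

Once the preview looks right, the same list can be passed as `allow_patterns=["*.safetensors"]` in the `api.upload_folder(...)` call above.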
