# AI Toolkit by Ostris
## FLUX.1-dev Training


In [1]:
!nvidia-smi

Sun Jun  8 12:52:32 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  Tesla T4                       Off |   00000000:00:04.0 Off |                    0 |
| N/A   60C    P8             10W /   70W |       0MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

In [2]:
!git clone https://github.com/ostris/ai-toolkit
!mkdir -p /content/dataset

Cloning into 'ai-toolkit'...
remote: Enumerating objects: 6317, done.
remote: Counting objects: 100% (562/562), done.
remote: Compressing objects: 100% (151/151), done.
remote: Total 6317 (delta 520), reused 411 (delta 411), pack-reused 5755 (from 4)
Receiving objects: 100% (6317/6317), 31.18 MiB | 24.88 MiB/s, done.
Resolving deltas: 100% (4413/4413), done.


Put your image dataset in the `/content/dataset` folder
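The dataset comments in the config further down say that captions are `.txt` files with the same name as the image, and that only jpg, jpeg, and png images are supported. A minimal sketch for listing images that have no caption file yet could look like the following. The helper name `find_missing_captions` is hypothetical and not part of ai-toolkit.

```python
from pathlib import Path

# Image extensions named in the config comments below
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def find_missing_captions(folder, caption_ext="txt"):
    """Return the names of images in `folder` that have no matching caption file.

    Hypothetical helper for illustration; it only inspects file names.
    """
    missing = []
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() not in IMAGE_EXTS:
            continue  # skip captions and any other non-image files
        if not path.with_suffix(f".{caption_ext}").exists():
            missing.append(path.name)
    return missing
```

Calling `find_missing_captions('/content/dataset')` after uploading your images returns an empty list when every image has a caption next to it.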

In [4]:
!cd ai-toolkit && git submodule update --init --recursive && pip install -r requirements.txt


Collecting git+https://github.com/jaretburkett/easy_dwpose.git (from -r requirements.txt (line 5))
  Cloning https://github.com/jaretburkett/easy_dwpose.git to /tmp/pip-req-build-1dwkhg7x
  Running command git clone --filter=blob:none --quiet https://github.com/jaretburkett/easy_dwpose.git /tmp/pip-req-build-1dwkhg7x
  Resolved https://github.com/jaretburkett/easy_dwpose.git to commit 417b34de004d80b9181ff96059f5ae0793abd235
  Installing build dependencies ... canceled
ERROR: Operation cancelled by user
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/dist-packages/pip/_internal/cli/base_command.py", line 179, in exc_logging_wrapper
    status = run_func(*args)
             ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/pip/_internal/cli/req_command.py", line 67, in wrapper
    return func(self, options, args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/pip/_internal/commands/ins

## Model License
Training currently only works with FLUX.1-dev, which means anything you train inherits its non-commercial license. It is also a gated model, so you must accept the license on Hugging Face (HF) before using it; otherwise, training will fail. Here are the steps required to set up access.

Sign in to HF and accept the model access here: [black-forest-labs/FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev)

[Get a READ key from Hugging Face](https://huggingface.co/settings/tokens/new?), run the next cell, and paste the key into the prompt it shows.

In [5]:
import getpass
import os

# Prompt for the token
hf_token = getpass.getpass('Enter your HF access token and press enter: ')

# Set the environment variable
os.environ['HF_TOKEN'] = hf_token

print("HF_TOKEN environment variable has been set.")

Enter your HF access token and press enter: ··········
HF_TOKEN environment variable has been set.


In [6]:
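import os
import sys
from collections import OrderedDict

from PIL import Image

# make the cloned repo importable before importing from it
sys.path.append('/content/ai-toolkit')
from toolkit.job import run_job

# enable hf_transfer for faster downloads from the Hugging Face Hub
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"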

## Setup

This is your config, documented inline. Normally you would write it as a YAML file, but for Colab a Python `OrderedDict` works. It will run as is without modification, but feel free to edit it as you want.

In [7]:
from collections import OrderedDict

job_to_run = OrderedDict([
    ('job', 'extension'),
    ('config', OrderedDict([
        # this name will be used for the output folder and the filenames
        ('name', 'me_flux_lora_v1'),
        ('process', [
            OrderedDict([
                ('type', 'sd_trainer'),
                # root folder to save training sessions/samples/weights
                ('training_folder', '/content/output'),
                # uncomment to see performance stats in the terminal every N steps
                #('performance_log_every', 1000),
                ('device', 'cuda:0'),
                # if a trigger word is specified, it will be added to captions of training data if it does not already exist
                # alternatively, in your captions you can add [trigger] and it will be replaced with the trigger word
                ('trigger_word', 'fotie'),
                ('network', OrderedDict([
                    ('type', 'lora'),
                    ('linear', 32), # how complex, the higher the number the more complex it will be
                    ('linear_alpha', 32) # 32 for capturing something as complex as a face
                ])),
                ('save', OrderedDict([
                    ('dtype', 'float16'),  # precision to save
                    ('save_every', 250),  # save every this many steps
                    ('max_step_saves_to_keep', 4)  # how many intermittent saves to keep
                ])),
                ('datasets', [
                    # datasets are a folder of images. captions need to be txt files with the same name as the image
                    # for instance image2.jpg and image2.txt. Only jpg, jpeg, and png are supported currently
                    # images will automatically be resized and bucketed into the resolution specified
                    OrderedDict([
                        ('folder_path', '/content/dataset'),
                        ('caption_ext', 'txt'),
                        ('caption_dropout_rate', 0.05),  # will drop out the caption 5% of time
                        ('shuffle_tokens', False),  # shuffle caption order, split by commas
                        ('cache_latents_to_disk', True),  # leave this true unless you know what you're doing
                        ('resolution', [512, 768, 1024])  # flux enjoys multiple resolutions
                    ])
                ]),
                ('train', OrderedDict([
                    ('batch_size', 1),
                    ('steps', 2000),  # total number of steps to train 500 - 4000 is a good range
                    ('gradient_accumulation_steps', 1),
                    ('train_unet', True),
                    ('train_text_encoder', False),  # probably won't work with flux
                    ('content_or_style', 'balanced'),  # content, style, balanced
                    ('gradient_checkpointing', True),  # need this on unless you have a ton of vram
                    ('noise_scheduler', 'flowmatch'),  # for training only
                    ('optimizer', 'adamw8bit'),
                    ('lr', 1e-4),

                    # uncomment this to skip the pre training sample
                    # ('skip_first_sample', True),

                    # uncomment to completely disable sampling
                    # ('disable_sampling', True),

                    # uncomment to use new bell curved weighting. Experimental but may produce better results
                    # ('linear_timesteps', True),

                    # ema will smooth out learning, but could slow it down. Recommended to leave on.
                    ('ema_config', OrderedDict([
                        ('use_ema', True),
                        ('ema_decay', 0.99)
                    ])),

                    # will probably need this if gpu supports it for flux, other dtypes may not work correctly
                    ('dtype', 'bf16')
                ])),
                ('model', OrderedDict([
                    # huggingface model name or path
                    ('name_or_path', 'black-forest-labs/FLUX.1-dev'),
                    ('is_flux', True),
                    ('quantize', True),  # run 8bit mixed precision
                    #('low_vram', True),  # uncomment this if the GPU is connected to your monitors. It will use less vram to quantize, but is slower.
                ])),
                ('sample', OrderedDict([
                    ('sampler', 'flowmatch'),  # must match train.noise_scheduler
                    ('sample_every', 200),  # sample every this many steps
                    ('width', 1024),
                    ('height', 1024),
                    ('prompts', [
                        # you can add [trigger] to the prompts here and it will be replaced with the trigger word
                        '[trigger] holding a sign that says \'I LOVE PROMPTS!\'',
                        '[trigger] with red hair, playing chess at the park, bomb going off in the background',
                        '[trigger] holding a coffee cup, in a beanie, sitting at a cafe',
                        '[trigger] is a DJ at a night club, fish eye lens, smoke machine, lazer lights, holding a martini',
                        '[trigger] showing off his cool new t shirt at the beach, a shark is jumping out of the water in the background',
                        '[trigger] building a log cabin in the snow covered mountains',
                        '[trigger] playing the guitar, on stage, singing a song, laser lights, punk rocker',
                        '[trigger] with a beard, building a chair, in a wood shop',
                        'photo of [trigger], white background, medium shot, modeling clothing, studio lighting, white backdrop',
                        '[trigger] holding a sign that says, \'this is a sign\'',
                        '[trigger], in a post apocalyptic world, with a shotgun, in a leather jacket, in a desert, with a motorcycle'
                    ]),
                    ('neg', ''),  # not used on flux
                    ('seed', 42),
                    ('walk_seed', True),
                    ('guidance_scale', 4),
                    ('sample_steps', 20)
                ]))
            ])
        ])
    ])),
    # you can add any additional meta info here. [name] is replaced with config name at top
    ('meta', OrderedDict([
        ('name', '[name]'),
        ('version', '1.0')
    ]))
])
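The config comments describe two uses of the trigger word: `[trigger]` is replaced with the trigger word in captions and prompts, and the trigger word is added to captions that do not already contain it. A rough sketch of that behavior is below. The function `apply_trigger` is hypothetical, and where the toolkit actually inserts a missing trigger word is not stated in the comments, so prepending it is an assumption.

```python
def apply_trigger(text, trigger_word, add_if_missing=True):
    """Illustrative sketch only, not the toolkit's implementation."""
    # Replace the [trigger] placeholder wherever it appears
    out = text.replace("[trigger]", trigger_word)
    # For captions: add the trigger word if it is still absent.
    # ASSUMPTION: it is prepended; the real placement may differ.
    if add_if_missing and trigger_word not in out:
        out = f"{trigger_word}, {out}" if out else trigger_word
    return out


# Sample prompt: only the placeholder is replaced
print(apply_trigger("[trigger] holding a coffee cup", "fotie", add_if_missing=False))
# Caption without the trigger word: it gets added
print(apply_trigger("a man sitting at a cafe", "fotie"))
```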


## Run it

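The cell below does all the magic. Check your folders in the file browser on the left. With the config above, outputs are written to `/content/output/me_flux_lora_v1` (the `training_folder` plus the config `name`). The samples folder holds the periodic samples. Previewing them doesn't work great in Colab, so look for them under `/content/output`.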

In [8]:
run_job(job_to_run)


RuntimeError: Failed to import diffusers.schedulers.scheduling_ddpm because of the following error (look up to see its traceback):
numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
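A `numpy.dtype size changed` error generally points to a compiled extension that was built against a different NumPy version than the one currently installed. The install cell above was also cancelled partway, so the requirements may not be fully installed. As a first diagnostic, a small sketch like the one below reports which versions are present. The helper name `installed_versions` is hypothetical, and the package list is just the two packages named in the error message.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Packages taken from the error message above
for name, ver in installed_versions(["numpy", "diffusers"]).items():
    print(f"{name}: {ver or 'not installed'}")
```

If a package is missing or its version looks unexpected, re-running the install cell to completion is a reasonable next step.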

## Done

Check your output dir and get your LoRA


In [None]:
from huggingface_hub import upload_folder

# upload the whole output folder (final LoRA, intermediate checkpoints, optimizer state) to HF
upload_folder(
    folder_path="output/me_flux_lora_v1",
    repo_id="fotiecodes/me-flux-lora-v1",
    repo_type="model",
    # ignore_patterns=["*.gguf"], # ignore all .gguf
)

  0%|          | 0/6 [00:00<?, ?it/s]

me_flux_lora_v1.safetensors:   0%|          | 0.00/344M [00:00<?, ?B/s]

me_flux_lora_v1_000001000.safetensors:   0%|          | 0.00/344M [00:00<?, ?B/s]

me_flux_lora_v1_000001250.safetensors:   0%|          | 0.00/344M [00:00<?, ?B/s]

me_flux_lora_v1_000001500.safetensors:   0%|          | 0.00/344M [00:00<?, ?B/s]

me_flux_lora_v1_000001750.safetensors:   0%|          | 0.00/344M [00:00<?, ?B/s]

optimizer.pt:   0%|          | 0.00/350M [00:00<?, ?B/s]

CommitInfo(commit_url='https://huggingface.co/fotiecodes/me-flux-lora-v1/commit/d6a29010662c7ac34a0ae2cca27b8f80c1d5828f', commit_message='Upload folder using huggingface_hub', commit_description='', oid='d6a29010662c7ac34a0ae2cca27b8f80c1d5828f', pr_url=None, repo_url=RepoUrl('https://huggingface.co/fotiecodes/me-flux-lora-v1', endpoint='https://huggingface.co', repo_type='model', repo_id='fotiecodes/me-flux-lora-v1'), pr_revision=None, pr_num=None)
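The checkpoint names in the upload listing follow from the `save` settings in the config: `steps` is 2000, `save_every` is 250, and `max_step_saves_to_keep` is 4. The sketch below reproduces that arithmetic. The function `expected_checkpoints` is hypothetical, and its rules are inferred from the listing above rather than from the toolkit's code: it assumes intermediate saves happen at multiples of `save_every` below the final step, only the most recent ones are kept, and the final weights are saved without a step suffix.

```python
def expected_checkpoints(name, steps, save_every, max_keep):
    """Illustrative sketch of which .safetensors files remain after training."""
    # ASSUMPTION: intermediate saves occur at multiples of save_every below `steps`
    intermediate = list(range(save_every, steps, save_every))
    # Keep only the most recent `max_keep` intermediate saves
    kept = intermediate[-max_keep:] if max_keep > 0 else []
    files = [f"{name}_{step:09d}.safetensors" for step in kept]
    # ASSUMPTION: the final weights are saved under the plain config name
    files.append(f"{name}.safetensors")
    return files


for filename in expected_checkpoints("me_flux_lora_v1", steps=2000, save_every=250, max_keep=4):
    print(filename)
```

With these settings the sketch lists the checkpoints for steps 1000, 1250, 1500, and 1750 plus `me_flux_lora_v1.safetensors`, matching the `.safetensors` files in the upload above.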