This repository contains the code needed both:
- to train (or fine-tune) diffusion models
- to perform image-to-image class translation with pretrained models
To install the required dependencies, run:

```bash
mamba env create -f environment.yaml
```

or:

```bash
conda env create -f environment.yml
```
The only supported experiment tracker for now is `wandb`. You will need a configured Weights & Biases environment to log information for both kinds of experiments.
Training or fine-tuning diffusion models is in principle performed by running the following command:

```bash
accelerate launch {accelerate args} train.py {script args}
```
where:
- `accelerate` must be configured for your training setup, either with `accelerate config` beforehand or by passing the appropriate flags to `accelerate launch` in place of `{accelerate args}` (see the accelerate documentation for more details)
- some arguments are required by the training script in lieu of `{script args}` (see the `src/args_parser.py` file for the full list of possible and required training script arguments; you can also call `python train.py --help` in the terminal, but it takes quite some time to print)
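As an illustration, a single-node, two-GPU launch could look like the sketch below. The `accelerate` flags shown are standard ones; the `train.py` arguments are purely hypothetical, so check `src/args_parser.py` for the actual names:

```bash
# Minimal sketch: accelerate is configured inline via flags instead of `accelerate config`.
# The train.py arguments below are illustrative only; the real ones are defined in src/args_parser.py.
accelerate launch --multi_gpu --num_processes 2 --mixed_precision fp16 \
    train.py \
    --train_data_dir path/to/train/data \
    --output_dir path/to/experiment/folder
```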
Some examples of commands launching a training experiment can be found in the `examples/examples_training_scripts` folder. They consist of `bash`/`zsh` scripts handling the configuration of both `accelerate` and the training script `train.py`. They can be called directly from the command line.
Two examples:
- The script `./examples/examples_training_scripts/launch_script_DDIM.sh` will train a DDIM model from scratch on the data located at `path/to/train/data`.
- The script `./examples/examples_training_scripts/launch_script_SD.sh` will fine-tune the UNet of `stabilityai/stable-diffusion-2-1` (plus a custom class embedding) on the data located at `path/to/train/data`.
Configure these example launchers to your needs!
Note: Future versions will probably use Hydra to handle the training configuration.
The `SLURM_launch_script_<xxx>.sh` files demonstrate how to adapt these bash scripts to a SLURM cluster. They are meant to launch a series of runs at different training-data sizes on the A100 partition of the Jean Zay CNRS cluster. They load a custom Python environment located at `${SCRATCH}/micromamba/envs/diffusion-experiments` with `micromamba`; adapt them to your needs!
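A rough sketch of such an adaptation is given below. The SLURM directives (partition, resources, walltime) are assumptions and must be tuned to your cluster; refer to the actual `SLURM_launch_script_<xxx>.sh` files for the real setup:

```bash
#!/bin/bash
# Hypothetical SLURM adaptation sketch: resource requests below are assumptions only.
#SBATCH --job-name=diffusion-training
#SBATCH --nodes=1
#SBATCH --gres=gpu:2
#SBATCH --cpus-per-task=10
#SBATCH --time=20:00:00

# Activate the micromamba environment mentioned above.
eval "$(micromamba shell hook -s bash)"
micromamba activate "${SCRATCH}/micromamba/envs/diffusion-experiments"

accelerate launch {accelerate args} train.py {script args}
```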
Image-to-image class transfer experiments are performed with the `img2img_comparison_launcher.py` script, which additionally handles the configuration of `accelerate` and possibly submits the jobs to the SLURM manager. It can be called as:

```bash
python img2img_comparison_launcher.py {hydra overrides} &
```
The image-to-image class transfer experiments are configured with Hydra. Example configuration files can be found in the `examples/example_img2img_comparison_conf` folder.

The `img2img_comparison_launcher.py` script expects a configuration folder named `my_img2img_comparison_conf` to be located in the directory where it is called, and a file named `general_config.yaml` inside this configuration folder. These defaults can be overridden with the `--config-path` and `--config-name` Hydra arguments.
Note: To prevent the experiment config from being modified between job submission and job launch (which can typically take quite some time when submitting to SLURM), the entire config is copied to the experiment folder and the submitted job will pull its config from there.
TODO
All experiments (either training or class transfer) output artifacts following the project/run organization of wandb:

```
- exp_parent_folder
| - project
| | - run_name
```

where `exp_parent_folder` is any base path on the system, and `project` and `run_name` are both the names of the folders created by the scripts on the system and the project and run names on wandb (one-to-one correspondence).
When Hydra is used, an additional timestamped sub-folder hierarchy is also created under the `run_name` folder:

```
- run_name
| - day
| | - hour
```

This is especially important when doing sweeps, so that runs with the same name but different hyperparameters do not overwrite each other.
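For example, a single Hydra-launched run would then write its artifacts under a path like the following (project name, run name, and timestamp are purely illustrative):

```
exp_parent_folder/project/run_name/2024-06-01/14-30-07/
```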