Official implementation of Flow Matching Adversarial Imitation Learning (FAIL) for Image Generation.
FAIL minimizes policy-expert divergence through adversarial training without explicit rewards or pairwise comparisons. We provide two algorithms:
- FAIL-PD (Pathwise Derivative): Backpropagates discriminator gradients through the ODE solver
- FAIL-PG (Policy Gradient): Policy gradient alternative using Flow Policy Optimization (FPO)
Please see [Paper] for more information.
Yeyao Ma¹, Chen Li², Xiaosong Zhang³, Han Hu³, and Weidi Xie¹. FAIL: Flow Matching Adversarial Imitation Learning for Image Generation. arXiv, 2026.
¹Shanghai Jiao Tong University, ²Xi'an Jiaotong University, ³Tencent
bash env_setup.sh
mkdir -p ./data/flux ./data/Qwen3-VL-2B-Instruct

- FLUX.1-dev → ./data/flux
- Qwen3-VL-2B-Instruct → ./data/Qwen3-VL-2B-Instruct
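Before moving on, it may be worth confirming that both model directories are populated. The snippet below is a minimal sketch and not part of the repository: the function name `check_model_dirs` is hypothetical, and it only checks that each directory exists and contains at least one entry, not that the weights are complete or valid.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the FAIL repo): verify that each given
# directory exists and is non-empty. Returns non-zero if any check fails.
check_model_dirs() {
  local rc=0 dir
  for dir in "$@"; do
    if [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]; then
      echo "ok: $dir"
    else
      echo "missing or empty: $dir" >&2
      rc=1
    fi
  done
  return "$rc"
}

# Example call, using the paths from the setup step above:
# check_model_dirs ./data/flux ./data/Qwen3-VL-2B-Instruct
```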
The expert data consists of:
- gemini_13k.parquet: 13K prompts with metadata (uuid, content, etc.)
- Expert images: one image per prompt, organized by uuid
Download from HuggingFace:
hf download HansPolo/FAIL-expert-data --repo-type dataset --local-dir ./data
unzip ./data/FAIL_train.zip -d ./data

Directory structure after unzip:
./data/gemini_13k.parquet
./data/Gemini2K/{uuid}/sample_0.png
Each {uuid} folder corresponds to a row in gemini_13k.parquet, and sample_0.png is the expert image for that prompt.
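The layout described above can be sanity-checked after unzipping. The following is a minimal sketch under stated assumptions, not a script shipped with the repository: `check_expert_images` is a hypothetical name, and it only walks the `{uuid}` folders looking for `sample_0.png`; it does not read the parquet file.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the FAIL repo): report every {uuid} folder
# under the image root that lacks sample_0.png.
check_expert_images() {
  local root="${1:-./data/Gemini2K}" total=0 missing=0 dir
  for dir in "$root"/*/; do
    [ -d "$dir" ] || continue   # the glob matched nothing
    total=$((total + 1))
    if [ ! -f "${dir}sample_0.png" ]; then
      echo "missing: ${dir}sample_0.png" >&2
      missing=$((missing + 1))
    fi
  done
  echo "checked $total folders, $missing without sample_0.png"
  # Succeed only if at least one folder exists and none is missing its image.
  [ "$total" -gt 0 ] && [ "$missing" -eq 0 ]
}

# Example call, using the default path from the directory structure above:
# check_expert_images ./data/Gemini2K
```

Cross-checking the folder names against the uuid column of gemini_13k.parquet would need a parquet reader and is left out of this sketch.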
Extract FLUX text embeddings for all prompts in the parquet file:
bash scripts/preprocess/preprocess_flux_rl_embeddings.sh

First, initialize the policy via Supervised Fine-Tuning (SFT) on expert demonstrations for one epoch:
bash scripts/finetune/finetune_flux_sft.sh

Then run FAIL training with the SFT checkpoint (set --pretrained_transformer_path in the script):
# FAIL-PD
bash scripts/finetune/finetune_flux_fail_pd.sh
# FAIL-PG
bash scripts/finetune/finetune_flux_fail_pg.sh

Multi-node (e.g., 4 nodes):
# On each node, set WORLD_SIZE, RANK, MASTER_ADDR
WORLD_SIZE=4 RANK=0 MASTER_ADDR=<master_ip> bash scripts/finetune/finetune_flux_fail_pd.sh # node 0
WORLD_SIZE=4 RANK=1 MASTER_ADDR=<master_ip> bash scripts/finetune/finetune_flux_fail_pd.sh # node 1
WORLD_SIZE=4 RANK=2 MASTER_ADDR=<master_ip> bash scripts/finetune/finetune_flux_fail_pd.sh # node 2
WORLD_SIZE=4 RANK=3 MASTER_ADDR=<master_ip> bash scripts/finetune/finetune_flux_fail_pd.sh # node 3

Generate images using Ray-based distributed inference:
# Set CHECKPOINT_PATH in the script to load the trained model
bash scripts/visualization/sample_flux_ray.sh

This repo is built upon these amazing works:
@article{ma2026fail,
title={FAIL: Flow Matching Adversarial Imitation Learning for Image Generation},
author={Ma, Yeyao and Li, Chen and Zhang, Xiaosong and Hu, Han and Xie, Weidi},
journal={arXiv preprint arXiv:2602.12155},
year={2026}
}