FAIL: Flow Matching Adversarial Imitation Learning for Image Generation

Official implementation of Flow Matching Adversarial Imitation Learning (FAIL) for Image Generation.

FAIL minimizes policy-expert divergence through adversarial training without explicit rewards or pairwise comparisons. We provide two algorithms:

  • FAIL-PD (Pathwise Derivative): Backpropagates discriminator gradients through the ODE solver
  • FAIL-PG (Policy Gradient): A policy-gradient alternative based on Flow Policy Optimization (FPO)
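To build intuition for the adversarial objective (this is a 1-D toy illustration, not the repo's training code): a "policy" shifts its samples toward an "expert" distribution using only a logistic discriminator's feedback, with the policy update differentiated pathwise through the sample in the spirit of FAIL-PD. All names and hyperparameters below are invented for the sketch.

```python
import math
import random

random.seed(0)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy "expert": samples from N(2.0, 0.5). The policy generates
# x = theta + noise and never sees a reward, only the discriminator.
EXPERT_MEAN, SIGMA = 2.0, 0.5
theta = 0.0          # policy parameter (generator mean)
w, b = 0.0, 0.0      # logistic discriminator D(x) = sigmoid(w * x + b)
lr_d, lr_g, batch = 0.05, 0.05, 64

for step in range(3000):
    expert = [random.gauss(EXPERT_MEAN, SIGMA) for _ in range(batch)]
    fake = [theta + random.gauss(0.0, SIGMA) for _ in range(batch)]

    # Discriminator step: maximize log D(expert) + log(1 - D(fake)).
    gw = gb = 0.0
    for xe, xf in zip(expert, fake):
        se, sf = sigmoid(w * xe + b), sigmoid(w * xf + b)
        gw += (se - 1.0) * xe + sf * xf   # grad of the negated objective
        gb += (se - 1.0) + sf
    w -= lr_d * gw / batch
    b -= lr_d * gb / batch

    # Policy step (pathwise): maximize log D(theta + noise).
    # The gradient flows through the sample itself (dx/dtheta = 1),
    # analogous to backpropagating through the ODE solver in FAIL-PD.
    gt = sum((sigmoid(w * xf + b) - 1.0) * w for xf in fake)
    theta -= lr_g * gt / batch

print(theta)  # drifts from 0.0 toward the expert mean
```

Without any explicit reward, the policy mean converges toward the expert mean purely from the discriminator's judgment, which is the divergence-minimization idea FAIL scales up to image generation.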

Please see [Paper] for more information.

Yeyao Ma1, Chen Li2, Xiaosong Zhang3, Han Hu3, and Weidi Xie1. FAIL: Flow Matching Adversarial Imitation Learning for Image Generation. arXiv, 2026.

1Shanghai Jiao Tong University, 2Xi'an Jiaotong University, 3Tencent

Installation

bash env_setup.sh

Data Preparation

1. Download Checkpoints

Create the target directories, then place the FLUX and Qwen3-VL-2B-Instruct checkpoints inside them:

mkdir -p ./data/flux ./data/Qwen3-VL-2B-Instruct

2. Prepare Expert Data

The expert data consists of:

  • gemini_13k.parquet: 13K prompts with metadata (uuid, content, etc.)
  • Expert images: one image per prompt, organized by uuid

Download from HuggingFace:

hf download HansPolo/FAIL-expert-data --repo-type dataset --local-dir ./data
unzip ./data/FAIL_train.zip -d ./data

Directory structure after unzip:

./data/gemini_13k.parquet
./data/Gemini2K/{uuid}/sample_0.png

Each {uuid} folder corresponds to a row in gemini_13k.parquet, and sample_0.png is the expert image for that prompt.
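Assuming the layout above, a quick sanity check that every {uuid} folder contains its expert image can be written with the standard library (check_expert_images is a hypothetical helper, not part of the repo):

```python
from pathlib import Path

def check_expert_images(root):
    """Return the uuids under root/Gemini2K that are missing sample_0.png.

    Assumes root/Gemini2K exists and contains one folder per uuid,
    matching the directory structure produced by unzipping FAIL_train.zip.
    """
    missing = []
    for folder in sorted(Path(root, "Gemini2K").iterdir()):
        if folder.is_dir() and not (folder / "sample_0.png").is_file():
            missing.append(folder.name)
    return missing
```

Running check_expert_images("./data") after the unzip should return an empty list; any uuids it reports will have no expert image during training.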

3. Preprocess Text Embeddings

Extract FLUX text embeddings for all prompts in the parquet file:

bash scripts/preprocess/preprocess_flux_rl_embeddings.sh

Training

Step 1: Cold Start with SFT

First, initialize the policy with one epoch of supervised fine-tuning (SFT) on the expert demonstrations:

bash scripts/finetune/finetune_flux_sft.sh

Step 2: FAIL Training

Then run FAIL training with the SFT checkpoint (set --pretrained_transformer_path in the script):

# FAIL-PD
bash scripts/finetune/finetune_flux_fail_pd.sh

# FAIL-PG
bash scripts/finetune/finetune_flux_fail_pg.sh

Multi-node (e.g., 4 nodes):

# On each node, set WORLD_SIZE, RANK, MASTER_ADDR
WORLD_SIZE=4 RANK=0 MASTER_ADDR=<master_ip> bash scripts/finetune/finetune_flux_fail_pd.sh  # node 0
WORLD_SIZE=4 RANK=1 MASTER_ADDR=<master_ip> bash scripts/finetune/finetune_flux_fail_pd.sh  # node 1
WORLD_SIZE=4 RANK=2 MASTER_ADDR=<master_ip> bash scripts/finetune/finetune_flux_fail_pd.sh  # node 2
WORLD_SIZE=4 RANK=3 MASTER_ADDR=<master_ip> bash scripts/finetune/finetune_flux_fail_pd.sh  # node 3

Sampling

Generate images using Ray-based distributed inference:

# Set CHECKPOINT_PATH in the script to load the trained model
bash scripts/visualization/sample_flux_ray.sh

Acknowledgement

This repo builds upon several amazing open-source works.

Citation

@article{ma2026fail,
  title={FAIL: Flow Matching Adversarial Imitation Learning for Image Generation},
  author={Ma, Yeyao and Li, Chen and Zhang, Xiaosong and Hu, Han and Xie, Weidi},
  journal={arXiv preprint arXiv:2602.12155},
  year={2026}
}
