1Hong Kong University of Science and Technology
*Corresponding Author
Summary: We present the Bi-Anchor Interpolation Solver (BA-solver), a framework that bridges the gap between computationally expensive training-free solvers and resource-intensive training-based acceleration methods. By introducing a lightweight SideNet, we endow frozen flow-matching backbones with bidirectional temporal perception, enabling efficient high-order numerical integration. BA-solver achieves state-of-the-art generation quality with as few as 5 to 10 NFEs, matching the performance of standard Euler solvers that require 100+ NFEs.
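As a rough intuition for why velocity information at both ends of an integration interval enables high-order steps at low NFE counts, the toy sketch below compares a first-order Euler solver against a two-anchor (Heun/trapezoidal) solver on a simple ODE. The velocity field here is purely illustrative and is not the paper's model; the sketch only demonstrates the generic numerical principle.

```python
import numpy as np

def velocity(x, t):
    # Toy time-dependent velocity field (illustrative only,
    # NOT the BA-solver's learned flow-matching field).
    return -x * (1.0 - t)

def euler_solve(x, steps):
    # First-order solver: one velocity evaluation per step.
    ts = np.linspace(0.0, 1.0, steps + 1)
    for i in range(steps):
        h = ts[i + 1] - ts[i]
        x = x + h * velocity(x, ts[i])
    return x

def heun_solve(x, steps):
    # Second-order solver using velocities at BOTH interval anchors
    # (t_i and t_{i+1}), analogous in spirit to a bi-anchor update.
    ts = np.linspace(0.0, 1.0, steps + 1)
    for i in range(steps):
        h = ts[i + 1] - ts[i]
        v0 = velocity(x, ts[i])
        x_pred = x + h * v0               # predictor (Euler) step
        v1 = velocity(x_pred, ts[i + 1])  # velocity at the far anchor
        x = x + 0.5 * h * (v0 + v1)       # trapezoidal corrector
    return x
```

With the same small step budget, the two-anchor update tracks the exact solution far more closely than Euler, which is the basic reason few-NFE high-order solvers can match many-NFE first-order ones.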
conda create -n BA_solver python=3.11
conda activate BA_solver
pip install -r requirements.txt

The experiments use ImageNet. You can place the data wherever you like and specify the path via the --data-dir argument (e.g., "../train") in the training scripts. We do not apply a preprocessing step to extract latents in our experiments.
export WANDB_API_KEY="YOUR_API_KEY"
export WANDB_ENTITY="WANDB_ENTITY"
export WANDB_PROJECT="WANDB_PROJECT"
accelerate launch --num_processes 1 train_BA_side.py \
--data-dir <DATA_PATH> \
--pretrained-ckpt <PRETRAINED_MODEL_PATH> \
--resolution 256 \
--model SiT-XL/2 \
--exp-name <EXP_NAME> \
--batch-size 168 \
--learning-rate 0.0001 \
--SideNet-depth 4 \
--cfg-scale 4.0 \
--sampling-steps 1000 \
--checkpointing-steps 1000

All experiments are conducted with the SiT-XL/2 backbone. You can adjust the configuration using the following arguments:
- --SideNet-depth: Depth of the SideNet module.
- --SideNet-in-channels: Number of input channels for SideNet.
- --SideNet-base-channels: Base channel width for SideNet.
- --SideNet-h-emb-dim: Embedding dimension for the offset parameter $h$.
- --data-dir: Path to the ImageNet dataset directory.
- --pretrained-ckpt: Path to the pre-trained SiT-REPA checkpoint. You can download it here.
- --resolution: Input image resolution (e.g., 256 or 512).
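For reference, a minimal argparse sketch of the flags listed above. The flag names follow the list, but all defaults and required/optional choices here are illustrative assumptions, not the repository's actual parser.

```python
import argparse

def build_parser():
    # Hypothetical reconstruction of the SideNet training CLI.
    # Flag names mirror the README list; defaults are assumptions only.
    p = argparse.ArgumentParser(description="BA-solver SideNet training (sketch)")
    p.add_argument("--data-dir", type=str, required=True,
                   help="Path to the ImageNet dataset directory")
    p.add_argument("--pretrained-ckpt", type=str, required=True,
                   help="Path to the pre-trained SiT-REPA checkpoint")
    p.add_argument("--resolution", type=int, default=256, choices=[256, 512],
                   help="Input image resolution")
    p.add_argument("--SideNet-depth", type=int, default=4,
                   help="Depth of the SideNet module")
    p.add_argument("--SideNet-in-channels", type=int, default=4,
                   help="Number of input channels for SideNet")
    p.add_argument("--SideNet-base-channels", type=int, default=128,
                   help="Base channel width for SideNet")
    p.add_argument("--SideNet-h-emb-dim", type=int, default=256,
                   help="Embedding dimension for the offset parameter h")
    return p
```

Note that argparse maps hyphenated flags to underscored attributes, so `--SideNet-depth` is read as `args.SideNet_depth`.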
For ImageNet 512x512, please use the following script:
export WANDB_API_KEY="YOUR_API_KEY"
export WANDB_ENTITY="WANDB_ENTITY"
export WANDB_PROJECT="WANDB_PROJECT"
accelerate launch --num_processes 1 train_BA_side.py \
--data-dir <DATA_PATH> \
--pretrained-ckpt <PRETRAINED_MODEL_PATH> \
--resolution 512 \
--model SiT-XL/2 \
--exp-name <EXP_NAME> \
--batch-size 96 \
--learning-rate 0.0001 \
--SideNet-depth 8 \
--cfg-scale 4.0 \
--sampling-steps 1000 \
--checkpointing-steps 1000

Using a trained SideNet, you can generate ImageNet-256 images (the resulting .npz file can be used with the ADM evaluation suite) through the following script:
torchrun --nnodes=1 --nproc_per_node=8 generate.py \
--model SiT-XL/2 \
--resolution 256 \
--num-fid-samples 50000 \
--per-proc-batch-size 32 \
--base-ckpt <PRETRAINED_MODEL_PATH> \
--side-ckpt <SIDENET_MODEL_PATH> \
--SideNet-depth 4 \
--num-steps <STEPS> \
--sample-dir <OUTPUT_DIR> \
--cfg-scale <CFG> \
--cfg-interval-start <CFG_START>

You can also generate ImageNet-512 images through the following script:
torchrun --nnodes=1 --nproc_per_node=8 generate.py \
--model SiT-XL/2 \
--resolution 512 \
--num-fid-samples 50000 \
--per-proc-batch-size 10 \
--base-ckpt <PRETRAINED_MODEL_PATH> \
--side-ckpt <SIDENET_MODEL_PATH> \
--SideNet-depth 8 \
--num-steps <STEPS> \
--sample-dir <OUTPUT_DIR> \
--cfg-scale <CFG> \
--cfg-interval-start <CFG_START>

We provide SideNet checkpoints here for ImageNet-256 and -512 generation.
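The `--cfg-interval-start` flag suggests that classifier-free guidance is applied only over part of the sampling trajectory. A minimal sketch of that pattern is below; the gating condition, direction of time, and blend formula are assumptions for illustration, not the repository's exact implementation.

```python
def guided_velocity(v_cond, v_uncond, t, cfg_scale, cfg_interval_start):
    # Hypothetical interval-gated classifier-free guidance:
    # before cfg_interval_start, fall back to the conditional velocity;
    # afterwards, apply the standard CFG blend
    #   v = v_uncond + cfg_scale * (v_cond - v_uncond).
    if t < cfg_interval_start:
        return v_cond
    return v_uncond + cfg_scale * (v_cond - v_uncond)
```

Restricting guidance to an interval of the trajectory is a common trick for keeping sample diversity while still sharpening class conditioning at the noise levels where it matters most.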
Here are some visual samples on ImageNet-512 with only 7 NFEs.
This code is mainly built upon DiT, SiT, edm2, RCG, and REPA repositories.
@article{chen2026bi,
title={Bi-Anchor Interpolation Solver for Accelerating Generative Modeling},
author={Chen, Hongxu and Li, Hongxiang and Wang, Zhen and Chen, Long},
journal={arXiv preprint arXiv:2601.21542},
year={2026}
}
