RATLIP: Generative Adversarial CLIP Text-to-Image Synthesis Based on Recurrent Affine Transformations
Requirements: at least one 24 GB GPU (e.g. a single RTX 3090) for training; sampling runs on CPU only.
- Environment

```shell
conda create -n RATLIP python=3.9
conda activate RATLIP
```
- Clone this repo

```shell
git clone https://github.com/OxygenLu/RATLIP.git
```
- Install the requirements

```shell
cd RATLIP
pip install -r requirements.txt
```
- Install CLIP

```shell
cd ../
git clone https://github.com/openai/CLIP.git
pip install ./CLIP
```
- Train / Test

```shell
cd RATLIP/code
bash scripts/train.sh ./cfg/bird.yml
bash scripts/test.sh ./cfg/bird.yml
```
To resume training from a checkpoint, set `state_epoch` and the corresponding weight file in the config.
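For example, the resume settings might look like the fragment below. The key names and the checkpoint path here are illustrative guesses based on the description above, not the repo's actual schema; check `cfg/bird.yml` for the real keys.

```yaml
# Illustrative only -- verify the actual key names in cfg/bird.yml
state_epoch: 300                                     # epoch to resume from
checkpoint: ./saved_models/bird/state_epoch_300.pth  # matching weight file (hypothetical path)
```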
Training results are stored as TensorBoard logs under `./logs`. View them with:

```shell
tensorboard --logdir your_path --port 8166
```
Use `sample.ipynb` to sample images from a trained model.
Comparison of RATLIP with state-of-the-art models on FID (lower is better).
Model | CUB | CelebA-tiny |
---|---|---|
AttnGAN | 23.98 | 125.98 |
LAFITE | 14.58 | - |
DF-GAN | 14.81 | 137.6 |
GALIP | 10.00 | 94.45 |
Ours | 13.28 | 81.48 |
Comparison of RATLIP with state-of-the-art models on CLIP score (higher is better).
Model | CUB | CelebA-tiny | Oxford |
---|---|---|---|
AttnGAN | - | 21.15 | - |
LAFITE | 31.25 | - | - |
DF-GAN | 29.20 | 26.67 | 24.41 |
GALIP | 31.60 | 31.77 | 27.95 |
Ours | 32.03 | 31.94 | 28.91 |
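As a quick sanity check on the two tables above, a small script can pick the best model per dataset column (FID: lower is better; CLIP score: higher is better). The numbers are copied verbatim from the tables; the dictionary layout is just for illustration, and missing entries ("-") are omitted.

```python
# Scores copied from the comparison tables above.
fid = {  # lower is better
    "AttnGAN": {"CUB": 23.98, "CelebA-tiny": 125.98},
    "LAFITE": {"CUB": 14.58},
    "DF-GAN": {"CUB": 14.81, "CelebA-tiny": 137.6},
    "GALIP": {"CUB": 10.00, "CelebA-tiny": 94.45},
    "RATLIP": {"CUB": 13.28, "CelebA-tiny": 81.48},
}
clip_score = {  # higher is better
    "AttnGAN": {"CelebA-tiny": 21.15},
    "LAFITE": {"CUB": 31.25},
    "DF-GAN": {"CUB": 29.20, "CelebA-tiny": 26.67, "Oxford": 24.41},
    "GALIP": {"CUB": 31.60, "CelebA-tiny": 31.77, "Oxford": 27.95},
    "RATLIP": {"CUB": 32.03, "CelebA-tiny": 31.94, "Oxford": 28.91},
}

def best(table, dataset, lower_is_better):
    """Return (model, score) with the best score on the given dataset."""
    entries = [(m, s[dataset]) for m, s in table.items() if dataset in s]
    pick = min if lower_is_better else max
    return pick(entries, key=lambda e: e[1])

print(best(fid, "CelebA-tiny", True))   # RATLIP has the lowest FID on CelebA-tiny
print(best(clip_score, "CUB", False))   # RATLIP has the highest CLIP score on CUB
```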
- Citation

```bibtex
@misc{lin2024ratlip,
      title={RATLIP: Generative Adversarial CLIP Text-to-Image Synthesis Based on Recurrent Affine Transformations},
      author={Chengde Lin and Xijun Lu and Guangxi Chen},
      year={2024},
      eprint={2405.08114},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```