CPGAN

PyTorch implementation for reproducing the results of CPGAN in the paper CPGAN: Full-Spectrum Content-Parsing Generative Adversarial Networks for Text-to-Image Synthesis by Jiadong Liang, Wenjie Pei, and Feng Lu.

Getting Started

  • Create anaconda virtual environment
conda create -n CPGAN python=2.7
conda activate CPGAN
conda install pytorch=1.0.1 torchvision=0.2.1 cudatoolkit=10.0.130
  • Install dependencies with pip
pip install python-dateutil easydict pandas torchfile nltk scikit-image h5py pyyaml
  • Clone this repo:
git clone https://github.com/dongdongdong666/CPGAN.git
  • Download train2014-text.zip from here and unzip it to data/coco/text/.

  • Unzip val2014-text.zip in data/coco/text/.

  • Download memory features for each word from here and put the features in memory/.

  • Download the Inception Encoder, Generator, and Text Encoder, and put these models in models/ (see the layout check after this list).
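
After these downloads, the working tree should contain populated data/coco/text/, memory/, and models/ directories. A minimal sanity check, assuming only the directory names given above (the exact file names inside each folder depend on the downloads):

```python
# Hypothetical sanity check: the folder names follow the steps above;
# nothing is assumed about the files inside each folder.
import os

expected_dirs = [
    "data/coco/text",  # train2014-text.zip and val2014-text.zip contents
    "memory",          # per-word memory features
    "models",          # Inception Encoder, Generator, Text Encoder weights
]

for d in expected_dirs:
    ok = os.path.isdir(d) and len(os.listdir(d)) > 0
    print("%-16s %s" % (d, "ok" if ok else "MISSING or EMPTY"))
```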

Test

Sampling

  • Set B_VALIDATION: False in code/cfg/eval_coco.yml.
  • Run python eval.py --cfg cfg/eval_coco.yml --gpu 0 to generate examples from captions listed in "data/coco/example_captions.txt". Results are saved to "outputs/Inference_Images/example_captions".
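
To sample your own sentences, edit data/coco/example_captions.txt before running eval.py. The one-caption-per-line format is an assumption carried over from AttnGAN, on which CPGAN builds, and the example sentences here are made up:

```python
# Hypothetical: overwrite the captions file read by eval.py.
# One caption per line is assumed (AttnGAN convention).
captions = [
    "a group of people standing on a beach next to the ocean",
    "a red double decker bus driving down a city street",
]
with open("data/coco/example_captions.txt", "w") as f:
    f.write("\n".join(captions) + "\n")
```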

Validation

  • Set B_VALIDATION: True in code/cfg/eval_coco.yml.
  • Run python eval.py --cfg cfg/eval_coco.yml --gpu 0 to generate examples for all captions in the validation dataset. Results are saved to "outputs/Inference_Images/single".
  • Compute the Inception Score for the model trained on COCO (a sketch of both metrics follows this list).
python caculate_IS.py --dir ../outputs/Inference_Images/single/
  • Compute R-precision for the model trained on COCO.
python caculate_R_precison.py --cfg cfg/coco.yml --gpu 0 --valid_dir ../outputs/Inference_Images/single/
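
For reference, a minimal sketch of what the two metrics compute; the array shapes and variable names are illustrative assumptions, and the bundled scripts additionally handle batching, the Inception-v3 forward pass, and the 10-split mean/std reported below:

```python
import numpy as np

# Inception Score: IS = exp( E_x [ KL( p(y|x) || p(y) ) ] ), where
# probs is an (N, C) array of Inception-v3 softmax outputs over the
# N generated images.
def inception_score(probs, eps=1e-12):
    p_y = probs.mean(axis=0, keepdims=True)                 # marginal p(y)
    kl = probs * (np.log(probs + eps) - np.log(p_y + eps))  # per-image KL
    return float(np.exp(kl.sum(axis=1).mean()))

# R-precision: for each generated image, rank its ground-truth caption
# against 99 mismatched candidates by cosine similarity and report the
# top-1 rate. img_emb: (N, D) L2-normalized image embeddings;
# txt_emb: (N, 100, D) L2-normalized text embeddings, ground truth at index 0.
def r_precision(img_emb, txt_emb):
    sims = np.einsum("nd,nkd->nk", img_emb, txt_emb)  # cosine similarities
    return float((sims.argmax(axis=1) == 0).mean())
```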

Train

TBA

Examples generated by CPGAN

  • Qualitative comparison between different modules of our model for the ablation study; the results of AttnGAN are also provided for reference.

  • Qualitative comparison between our CPGAN and other classical models for text-to-image synthesis.

Performance

Note that after cleaning and refactoring the code of the paper, the results differ slightly. We use the PyTorch implementation to measure the Inception Score and R-precision.

| Model     | R-precision ↑ | IS ↑         |
|-----------|---------------|--------------|
| Reed      | -             | 7.88 ± 0.07  |
| StackGAN  | -             | 8.45 ± 0.03  |
| Infer     | -             | 11.46 ± 0.09 |
| SD-GAN    | -             | 35.69 ± 0.50 |
| MirrorGAN | -             | 26.47 ± 0.41 |
| SEGAN     | -             | 27.86 ± 0.31 |
| DMGAN     | 88.56%        | 30.49 ± 0.57 |
| AttnGAN   | 82.98%        | 25.89 ± 0.47 |
| objGAN    | 91.05%        | 30.29 ± 0.33 |
| CPGAN     | 93.59%        | 52.73 ± 0.61 |

Citing CPGAN

If you find CPGAN useful in your research, please consider citing:

@misc{liang2019cpgan,
    title={CPGAN: Full-Spectrum Content-Parsing Generative Adversarial Networks for Text-to-Image Synthesis},
    author={Jiadong Liang and Wenjie Pei and Feng Lu},
    year={2019},
    eprint={1912.08562},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}