- Goal: generate images that correspond to a given text description.
- TensorFlow implementation for the Kaggle contest Reverse Image Caption.
- Technique: uses the GAN-CLS algorithm from the paper Generative Adversarial Text-to-Image Synthesis and the StackGAN++ model from StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks.
- Download the image files, captions, and pre-trained model from Google Drive. Put the images and captions into the `./text-to-image` directory, and put the pre-trained model into the `./checkpoint` directory.
- Example captions:
- the flower shown has yellow anther red pistil and bright red petals.
- this flower has petals that are yellow, white and purple and has dark lines
- the petals on this flower are white with a yellow center
- this flower has a lot of small round pink petals.
- this flower is orange in color, and has petals that are ruffled and rounded.
- the flower has yellow petals and the center of it is brown
- this flower has petals that are blue and white.
- these white flowers have petals that start off white in color and end in a white towards the tips.
- StackGAN++ gives better results, although the improved WGAN with skip-thought vectors can also produce satisfying ones.
- Use more captions per image, i.e., randomly choose 3~5 captions for each picture (choosing all captions makes the training set so large that it takes a long time to train); a minimal sampling sketch is given after this list.
- Train the discriminator on wrong images with a false label: given a caption, pair it with a random image from the dataset and label the pair as false, so that D learns whether an image matches the caption. (This is the GAN-CLS matching-aware trick; see the loss sketch below.)
- Random distortion of the images (e.g., brightness changes and different angles) did not work as well as I expected; an example of such an augmentation appears at the end of this list.
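As a reference for the caption subsampling mentioned above, here is a minimal sketch; the dict-of-lists data layout, function name, and seed are assumptions for illustration, not the repo's actual loader:

```python
import random

def subsample_captions(captions_per_image, k_min=3, k_max=5, seed=0):
    """Randomly keep k_min~k_max captions per image to shrink the training set.

    captions_per_image: dict mapping image filename -> list of caption strings
    (an assumed layout; adapt to the actual data loader).
    """
    rng = random.Random(seed)
    subsampled = {}
    for image_file, captions in captions_per_image.items():
        k = min(len(captions), rng.randint(k_min, k_max))
        subsampled[image_file] = rng.sample(captions, k)
    return subsampled
```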
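The wrong-image trick corresponds to the matching-aware discriminator loss of GAN-CLS. Below is a sketch of that loss, assuming a discriminator that returns logits for (image, caption embedding) pairs; the function and argument names here are hypothetical:

```python
import tensorflow as tf

def gan_cls_d_loss(d_real_logits, d_wrong_logits, d_fake_logits):
    """Matching-aware discriminator loss (GAN-CLS).

    d_real_logits : D(real image, matching caption)    -> label 1
    d_wrong_logits: D(random real image, same caption) -> label 0
    d_fake_logits : D(generated image, same caption)   -> label 0
    """
    bce = tf.nn.sigmoid_cross_entropy_with_logits
    loss_real = tf.reduce_mean(bce(labels=tf.ones_like(d_real_logits), logits=d_real_logits))
    loss_wrong = tf.reduce_mean(bce(labels=tf.zeros_like(d_wrong_logits), logits=d_wrong_logits))
    loss_fake = tf.reduce_mean(bce(labels=tf.zeros_like(d_fake_logits), logits=d_fake_logits))
    # The mismatched and synthetic terms are averaged, following the GAN-CLS paper.
    return loss_real + 0.5 * (loss_wrong + loss_fake)
```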
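For context on the distortion experiment, the kind of pipeline in question might look like the sketch below (TensorFlow 2 API; the parameter values are placeholders, not the settings actually used):

```python
import tensorflow as tf

def random_distort(image):
    """Random brightness and angle changes; in practice this augmentation
    did not improve results as much as expected."""
    image = tf.image.random_brightness(image, max_delta=0.2)  # placeholder delta
    image = tf.image.random_flip_left_right(image)
    # Rotate by a random multiple of 90 degrees as a cheap "different angle".
    k = tf.random.uniform([], minval=0, maxval=4, dtype=tf.int32)
    image = tf.image.rot90(image, k=k)
    return image
```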
- StackGAN++ PyTorch paper code
- Generative Adversarial Text to Image Synthesis