
Text-to-Image Synthesis using Generative Adversarial Networks

Introduction

This project is inspired by the Generative Adversarial Text-to-Image Synthesis paper [1] and is implemented in PyTorch. We train a conditional generative adversarial network, conditioned on text captions, to generate images that correspond to those captions. The network architecture is based on DCGAN.

Architecture diagram credit: [1]
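The conditioning described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the 1024-dim sentence embedding and its 128-dim projection follow the paper's description, the upsampling stack follows DCGAN, and names like `TextConditionedGenerator` are hypothetical.

```python
import torch
import torch.nn as nn

class TextConditionedGenerator(nn.Module):
    """Sketch of a DCGAN-style generator conditioned on a text embedding."""

    def __init__(self, noise_dim=100, embed_dim=1024, proj_dim=128, ngf=64):
        super().__init__()
        # Compress the sentence embedding before conditioning, as in the paper.
        self.projection = nn.Sequential(
            nn.Linear(embed_dim, proj_dim),
            nn.LeakyReLU(0.2, inplace=True),
        )
        # DCGAN upsampling stack: 1x1 -> 4x4 -> 8x8 -> 16x16 -> 32x32 -> 64x64.
        self.net = nn.Sequential(
            nn.ConvTranspose2d(noise_dim + proj_dim, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8), nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4), nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2), nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf), nn.ReLU(True),
            nn.ConvTranspose2d(ngf, 3, 4, 2, 1, bias=False),
            nn.Tanh(),
        )

    def forward(self, noise, text_embedding):
        cond = self.projection(text_embedding)   # (B, proj_dim)
        z = torch.cat([noise, cond], dim=1)      # concatenate noise + condition
        z = z.unsqueeze(-1).unsqueeze(-1)        # (B, C, 1, 1) spatial input
        return self.net(z)                       # (B, 3, 64, 64) image
```

The generated image depends on both the random noise (sample diversity) and the caption embedding (content).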

Datasets

We used the hdf5 format of these datasets (birds_hdf5 and flowers_hdf5), which were converted from the Caltech-UCSD Birds 200 and Oxford Flowers datasets.

We used the text embeddings provided by the authors of the paper [1].
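Reading a sample from one of these hdf5 files might look like the sketch below. The group and dataset names (`train`, `embeddings`) and the 1024-dim embedding size are assumptions for illustration; inspect the actual birds_hdf5/flowers_hdf5 files with h5py to see their real layout.

```python
import h5py
import numpy as np

# Build a tiny stand-in file with a *hypothetical* layout: one group per
# split, each holding an 'embeddings' dataset. The real hdf5 files may use
# different group/dataset names.
with h5py.File("toy.hdf5", "w") as f:
    f.create_dataset("train/embeddings",
                     data=np.random.rand(4, 1024).astype("float32"))

# Read a single text embedding back.
with h5py.File("toy.hdf5", "r") as f:
    emb = f["train/embeddings"][0]

print(emb.shape)
```

h5py datasets are read lazily, so indexing `f["train/embeddings"][0]` loads only one embedding rather than the whole dataset into memory.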

Requirements

  • PyTorch
  • h5py
  • EasyDict
  • PIL (Pillow)
  • NumPy

This implementation only supports running on GPUs.

To install all the dependencies, run:
$ pip install -r requirements.txt

Training

To train the model, run:
$ git clone https://github.com/Rakshith-Manandi/text-to-image-using-GAN.git
$ cd ./text-to-image-using-GAN
$ python -u runtime.py

Inputs to the model for training/prediction:

  • dataset : Dataset to use (birds | flowers).
  • split : An integer indicating which split to use (0: train | 1: valid | 2: test).
  • save_path : Path for saving the models and results.
  • pre_trained_disc : Path to a pre-trained discriminator model, used to initialize training or continue from a checkpoint.
  • pre_trained_gen : Path to a pre-trained generator model, used to initialize training or continue from a checkpoint.
  • cls : Boolean flag indicating whether to train with the GAN-CLS (matching-aware discriminator) algorithm from [1].
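As an illustration of how the options above might be wired up as command-line flags, here is a small argparse sketch. The actual flag names, types, and defaults are defined in runtime.py, so everything below is an assumption for illustration only.

```python
import argparse

def build_parser():
    """Hypothetical argparse mapping of the training/prediction options."""
    p = argparse.ArgumentParser(description="Text-to-image GAN training")
    p.add_argument("--dataset", choices=["birds", "flowers"], default="flowers",
                   help="which dataset to use")
    p.add_argument("--split", type=int, choices=[0, 1, 2], default=0,
                   help="0: train | 1: valid | 2: test")
    p.add_argument("--save_path", default="./checkpoints",
                   help="where to save models and results")
    p.add_argument("--pre_trained_disc", default=None,
                   help="path to a discriminator checkpoint")
    p.add_argument("--pre_trained_gen", default=None,
                   help="path to a generator checkpoint")
    p.add_argument("--cls", action="store_true",
                   help="train with the GAN-CLS algorithm")
    return p

# Example: train on the birds training split with GAN-CLS enabled.
args = build_parser().parse_args(["--dataset", "birds", "--split", "0", "--cls"])
```

Checkpoint paths default to None so a fresh run starts from scratch, while passing both `--pre_trained_disc` and `--pre_trained_gen` resumes training.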

Demo

To get a glimpse of the generated results, first make sure you have installed all the dependencies listed in the Requirements section and that you have GPU access. Then:
$ git clone https://github.com/Rakshith-Manandi/text-to-image-using-GAN.git
$ cd ./text-to-image-using-GAN
$ jupyter notebook GAN_demo.ipynb (i.e. open the 'GAN_demo.ipynb' notebook)

Results

Here are a few examples of the images generated by our model:



References

[1] S. Reed et al., "Generative Adversarial Text-to-Image Synthesis," ICML 2016. https://arxiv.org/abs/1605.05396
[2] https://github.com/reedscot/icml2016 (the authors' implementation)
