Generative Adversarial Text to Image Synthesis in Tensorflow

Tensorflow implementation of Generative Adversarial Text to Image Synthesis for MNIST handwritten digit dataset.

Prerequisites

This code requires Tensorflow and OpenCV. The MNIST dataset is stored in the 'MNIST_data' directory. The files will be automatically downloaded if the dataset does not exist.

If you want to use --show_progress True option, you need to install python package progress.

$ pip install progress

Usage

To train a vanilla GAN (GAN_CLS) with z dimension 20 and generator input code dimension 2, run the following command:

$ python GAN_CLS_main.py --z_dim 20 --c_dim 2

To see all training options, run:

$ python GAN_CLS_main.py --help

which will print:

usage: GAN_CLS_main.py [-h] [--X_dim X_DIM] [--nwords NWORDS]
                   [--vocab_size VOCAB_SIZE] [--z_dim Z_DIM]
                   [--c_dim C_DIM] [--e_dim E_DIM] [--d_update D_UPDATE]
                   [--niter NITER] [--batch_size BATCH_SIZE] [--lr LR]
                   [--checkpoint_dir CHECKPOINT_DIR]
                   [--image_dir IMAGE_DIR] [--use_adam [USE_ADAM]]
                   [--nouse_adam]

optional arguments:
  -h, --help            show this help message and exit
  --X_dim X_DIM         dimension of the original image [784]
  --nwords NWORDS       number of words in the input sentence (e.g. "thin
                        number one with left skew") [6]
  --vocab_size VOCAB_SIZE
                        size of the vocabulary [19]
  --z_dim Z_DIM         dimension of the generator input noise variable z [20]
  --c_dim C_DIM         dimension of input code variable c [2]
  --e_dim E_DIM         dimension of the word embedding phi [20]
  --d_update D_UPDATE   update the discriminator weights [d_update] times per
                        generator update [5]
  --niter NITER         number of iterations to use during training [5500]
  --batch_size BATCH_SIZE
                        batch size to use during training [128]
  --lr LR               learning rate of the optimizer to use during training
                        [0.001]
  --checkpoint_dir CHECKPOINT_DIR
                        checkpoint directory [./checkpoints]
  --image_dir IMAGE_DIR
                        directory to save generated images to [./images]
  --use_adam [USE_ADAM]
                        if True, use Adam optimizer; otherwise, use SGD [True]
  --nouse_adam

To train a GAN with manifold interpolation (GAN-INT) with z dimension 20 and generator code input dimension 2, run the following command:

$ python GAN_INT_main.py --z_dim 20 --c_dim 2

To see all training options, run:

$ python GAN_INT_main.py --help

which will print:

usage: GAN_INT_main.py [-h] [--X_dim X_DIM] [--nwords NWORDS]
                   [--vocab_size VOCAB_SIZE] [--z_dim Z_DIM]
                   [--c_dim C_DIM] [--e_dim E_DIM] [--d_update D_UPDATE]
                   [--niter NITER] [--batch_size BATCH_SIZE] [--beta BETA]
                   [--lr LR] [--checkpoint_dir CHECKPOINT_DIR]
                   [--image_dir IMAGE_DIR] [--use_adam [USE_ADAM]]
                   [--nouse_adam]

optional arguments:
  -h, --help            show this help message and exit
  --X_dim X_DIM         dimension of the original image [784]
  --nwords NWORDS       number of words in the input sentence (e.g. "thin
                        number one with left skew") [6]
  --vocab_size VOCAB_SIZE
                        size of the vocabulary [19]
  --z_dim Z_DIM         dimension of the generator input noise variable z [20]
  --c_dim C_DIM         dimension of input code variable c [2]
  --e_dim E_DIM         dimension of the word embedding phi [20]
  --d_update D_UPDATE   update the discriminator weights [d_update] times per
                        generator update [1]
  --niter NITER         number of epochs to use during training [5500]
  --batch_size BATCH_SIZE
                        batch size to use during training [128]
  --beta BETA           variable that parameterizes the amount of
                        interpolation between two text embeddings [0.5]
  --lr LR               learning rate of the optimizer to use during training
                        [0.001]
  --checkpoint_dir CHECKPOINT_DIR
                        checkpoint directory [./checkpoints]
  --image_dir IMAGE_DIR
                        directory to save generated images to [./images]
  --use_adam [USE_ADAM]
                        if True, use Adam optimizer; otherwise, use SGD [True]
  --nouse_adam

Notes

The Annotated_MNIST.py is a thickness and skew labeler for MNIST handwritten digit dataset, intended to be used for toy text to image generation tasks or specific classification tasks. More details can be found in this repository.

This GAN was not able to learn proper associations between input sentences and image features due to the lack of depth. Any modifications to the model structure/configuration is welcome.

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
MNIST_data		MNIST_data
.gitignore		.gitignore
Annotated_MNIST.py		Annotated_MNIST.py
GAN_CLS.py		GAN_CLS.py
GAN_CLS_main.py		GAN_CLS_main.py
GAN_INT.py		GAN_INT.py
GAN_INT_main.py		GAN_INT_main.py
README.md		README.md
Text to Image GAN.ipynb		Text to Image GAN.ipynb
ops.py		ops.py
ops2.py		ops2.py
utils.py		utils.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

MNIST_data

MNIST_data

.gitignore

.gitignore

Annotated_MNIST.py

Annotated_MNIST.py

GAN_CLS.py

GAN_CLS.py

GAN_CLS_main.py

GAN_CLS_main.py

GAN_INT.py

GAN_INT.py

GAN_INT_main.py

GAN_INT_main.py

README.md

README.md

Text to Image GAN.ipynb

Text to Image GAN.ipynb

ops.py

ops.py

ops2.py

ops2.py

utils.py

utils.py

Repository files navigation

Generative Adversarial Text to Image Synthesis in Tensorflow

Prerequisites

Usage

Notes

About

Releases

Packages

Languages

1202kbs/Generative-Adversarial-Text-to-Image-Synthesis-Tensorflow

Folders and files

Latest commit

History

Repository files navigation

Generative Adversarial Text to Image Synthesis in Tensorflow

Prerequisites

Usage

Notes

About

Resources

Stars

Watchers

Forks

Languages