
videoGAN

My small project implementing a GAN that generates video, with an LSTM encoding the transformation between video frames in noise space.

Datasets:

Tasks:

  • Generator
    • Cleanup
    • Tested
    • The recurrent part of the generator has difficulty picking up gradient
    • Use Adam
  • Discriminator
    • Cleaned up
    • Tested
    • Too strong compared to the generator
    • Use plain SGD; not much better than Adam
  • Add training ops
    • Gradient clipping. Clipping by GLOBAL norm makes the gradient of the recurrent unit go to 0 as the gradient of the video output part increases; clip by individual norm instead (see the sketch after this list)
    • Add supervised pretraining?
    • Added additional white Gaussian noise to the input of the generator. Will this help make the generator weaker? (http://www.inference.vc/instance-noise-a-trick-for-stabilising-gan-training/)
    • Currently it seems not enough gradient information is passed to the recurrent module of the generator
    • Try training each network until its loss falls under a threshold: implemented
  • Add Glorot Initialization
  • Training Pipeline
    • Add command line
    • Add Save/Load
  • Add visualization
    • Tensorboard Visualization
    • GIFs
  • Parameters Search
  • Evaluation
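
A minimal sketch (TF 1.x style) of the per-tensor clipping mentioned in the training-ops item above; the function name and the clip_norm value are illustrative, not the repo's actual code:

```python
import tensorflow as tf

def clipped_train_op(loss, params, optimizer, clip_norm=5.0):
    # Gradients of the loss w.r.t. all trainable parameters.
    grads = tf.gradients(loss, params)
    # tf.clip_by_global_norm would rescale every gradient by one shared factor,
    # so a large video-output gradient also shrinks the already-small LSTM
    # gradient toward 0. Clipping each tensor by its own norm avoids that.
    clipped = [g if g is None else tf.clip_by_norm(g, clip_norm) for g in grads]
    return optimizer.apply_gradients(zip(clipped, params))
```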

Network structure:

The generator consists of two parts:

  • An LSTM that transforms the noise for the current frame into the noise for the next frame
  • A DCGAN generator to generate the frame from the noise.
  • Layer Normalization is used in place of Batch Normalization

The discriminator is from the paper "Generating Videos with Scene Dynamics" (http://web.mit.edu/vondrick/tinyvideo/paper.pdf). No Batch Normalization is used in the discriminator.
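
A rough sketch of how the two generator parts fit together (TF 1.x style). `dcgan_generator` stands in for the repo's frame generator, the scope name and shape conventions are assumptions, and the actual code may differ:

```python
import tensorflow as tf

def generate_video(z0, num_frames, noise_dim, dcgan_generator):
    # One LSTM cell walks the noise vector from frame to frame in latent space.
    cell = tf.nn.rnn_cell.LSTMCell(noise_dim)
    state = cell.zero_state(tf.shape(z0)[0], tf.float32)
    z, frames = z0, []
    with tf.variable_scope("generator", reuse=tf.AUTO_REUSE):
        for _ in range(num_frames):
            z, state = cell(z, state)
            # The DCGAN generator (with layer normalization inside) decodes
            # each latent code into one frame.
            frames.append(dcgan_generator(z))
    return tf.stack(frames, axis=1)  # [batch, time, height, width, channels]
```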

GAN training:

Training is done using Wasserstein GAN with Gradient Penalty (WGAN-GP), lambda = 200.0. The basic loss and the alternative loss for GAN training are also available.
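
A sketch of the WGAN-GP critic loss with lambda = 200.0 (TF 1.x style), assuming 5-D video batches [batch, time, height, width, channels] and a discriminator function that shares variables across calls; names are illustrative:

```python
import tensorflow as tf

def wgan_gp_critic_loss(discriminator, real, fake, lam=200.0):
    # Evaluate the gradient penalty at random points between real and fake videos.
    eps = tf.random_uniform([tf.shape(real)[0], 1, 1, 1, 1], 0.0, 1.0)
    interp = eps * real + (1.0 - eps) * fake
    grads = tf.gradients(discriminator(interp), [interp])[0]
    slopes = tf.sqrt(tf.reduce_sum(tf.square(grads), axis=[1, 2, 3, 4]))
    penalty = tf.reduce_mean(tf.square(slopes - 1.0))  # push gradient norm toward 1
    # Wasserstein critic loss plus the penalty term weighted by lambda.
    return (tf.reduce_mean(discriminator(fake))
            - tf.reduce_mean(discriminator(real))
            + lam * penalty)
```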

The Adam optimizer is used for the discriminator and the DCGAN part of the generator.

Stochastic Gradient Descent with momentum is used for the LSTM part of the generator.
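
A sketch of splitting the generator update across the two optimizers; the variable scope names and the Adam learning rate are illustrative (the SGD values are the ones from the notes below):

```python
import tensorflow as tf

def build_generator_train_op(g_loss):
    # Scope names are assumptions; the repo may organize variables differently.
    lstm_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope="generator/lstm")
    dcgan_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope="generator/dcgan")
    adam = tf.train.AdamOptimizer(learning_rate=1e-4)                     # lr illustrative
    sgd = tf.train.MomentumOptimizer(learning_rate=0.001, momentum=0.95)
    # One joint train op: Adam updates the DCGAN part, SGD+momentum the LSTM part.
    return tf.group(adam.minimize(g_loss, var_list=dcgan_vars),
                    sgd.minimize(g_loss, var_list=lstm_vars))
```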

Things I learned from training with Wasserstein GAN with Gradient Penalty (WGAN-GP):

About the three losses I tried (written out in the sketch after this list):

  • The basic version of the GAN loss doesn't work for this model: the LSTM part of the generator didn't receive much gradient to learn from
  • The alternative -log(D(G(noise))) version of the GAN loss provides large, unstable gradients to the LSTM part and learns quickly at first, but cannot go further than some of the moving dynamics and some vague digit shapes. After that it diverges and video quality gets worse over time. It also suffers from extreme mode collapse.
  • WGAN-GP learns quicker than the alternative loss and its gradients are stable. At first the discriminator loss goes down, but after a few thousand iterations it rises steadily, and video quality improves as the discriminator loss goes up. The generator loss doesn't mean much. No mode collapse was observed.
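
The three generator losses compared above, written out as TF 1.x ops; `d_fake_logits` is an illustrative name for the discriminator's raw (pre-sigmoid) output on generated videos, and for the WGAN version it is read directly as the critic score:

```python
import tensorflow as tf

def generator_losses(d_fake_logits):
    # 1) Basic minimax loss: minimize log(1 - D(G(z))) -- saturates early,
    #    which matches the weak gradient seen in the LSTM part.
    basic = tf.reduce_mean(tf.log(1.0 - tf.sigmoid(d_fake_logits) + 1e-8))
    # 2) Alternative non-saturating loss: minimize -log(D(G(z))).
    alternative = tf.reduce_mean(
        tf.nn.sigmoid_cross_entropy_with_logits(
            labels=tf.ones_like(d_fake_logits), logits=d_fake_logits))
    # 3) WGAN generator loss: minimize -E[critic(G(z))].
    wgan = -tf.reduce_mean(d_fake_logits)
    return basic, alternative, wgan
```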

Some notes about training WGAN-GP:

  • Without gradient clipping, the gradients of some parameters can go up to 25-30. Those big values don't seem to have any effect, though.
  • Using layer normalization in the discriminator seems to make it harder for the discriminator to learn (shown as the generator's gradients going to 0 for some iterations) and makes everything worse (confirmed after a 2nd try).
  • The discriminator should be trained more than the generator so it can reach its optimum. Following the original Wasserstein-GAN GitHub repository, for the first few iterations the discriminator should be trained for 100 iterations instead of 5 per generator iteration (see the schedule sketch after this list).
  • For the LSTM one should not use Adam; use SGD + momentum instead (learning rate 0.001, momentum 0.95 seems to work well)
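
A sketch of that critic-heavy schedule, loosely following the rule of thumb in the original WGAN code (100 critic steps for the earliest generator steps and periodically afterwards, 5 otherwise); the session, train ops, iteration count, and batch helpers are placeholders for whatever the training script provides:

```python
# d_train_op / g_train_op: discriminator and generator update ops (placeholders).
for gen_iter in range(num_generator_iters):
    d_iters = 100 if gen_iter < 25 or gen_iter % 500 == 0 else 5
    for _ in range(d_iters):
        sess.run(d_train_op, feed_dict=next_real_batch())
    sess.run(g_train_op, feed_dict=next_noise_batch())
```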
