Image-Generation-Using-VQVAE

Overview of VQVAE Structure

VQ-VAE is a type of variational autoencoder that uses vector quantisation to obtain a discrete latent representation. It differs from VAEs in two key ways:

The encoder network outputs discrete, rather than continuous, codes.
The prior is learnt rather than static.

To learn a discrete latent representation, the model incorporates ideas from vector quantisation (VQ). Using the VQ method allows the model to circumvent posterior collapse, where the latents are ignored when they are paired with a powerful autoregressive decoder, an issue typically observed in the VAE framework. When these representations are paired with an autoregressive prior, the model can generate high-quality images, videos, and speech, as well as perform high-quality speaker conversion and unsupervised learning of phonemes.

VQVAE-MODEL

QUANTIZATION MODULE
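
The quantisation step described above can be illustrated with a minimal NumPy sketch. It is not the code from this repository's notebook: the function name `quantize`, the array shapes, and the toy codebook are assumptions chosen for illustration. It shows only the forward lookup, where each continuous encoder vector is replaced by its nearest codebook entry; how gradients are passed through this step during training is not covered.

```python
import numpy as np

def quantize(z_e, codebook):
    """Map each encoder vector to its nearest codebook entry.

    z_e:      (N, D) array of continuous encoder outputs (assumed shape).
    codebook: (K, D) array of embedding vectors (assumed shape).
    Returns (indices, z_q): the (N,) integer codes and the (N, D) quantised vectors.
    """
    # Squared Euclidean distance between every encoder vector and every code,
    # expanded as |a|^2 - 2 a.b + |b|^2 so it broadcasts to shape (N, K).
    dists = (
        np.sum(z_e ** 2, axis=1, keepdims=True)
        - 2.0 * z_e @ codebook.T
        + np.sum(codebook ** 2, axis=1)
    )
    indices = np.argmin(dists, axis=1)  # discrete latent codes
    z_q = codebook[indices]             # quantised vectors fed to the decoder
    return indices, z_q

# Toy example: three 2-D encoder outputs and a codebook with three entries.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]])
z_e = np.array([[0.1, -0.2], [0.9, 1.2], [-0.8, 1.7]])
indices, z_q = quantize(z_e, codebook)
```

The integer `indices` are the discrete codes that the prior is later trained on, while `z_q` is what the decoder receives.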

Training process

First, the VQVAE is trained on an image reconstruction task to learn discrete features from the images.
Next, the discrete latent codes are collected from the trained VQVAE and a prior is trained on top of them.
Here GPT is chosen as the prior model; it predicts the next token based on the previously predicted tokens.
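
As a rough illustration of the second stage, the sketch below turns a grid of discrete codes into input and target sequences for next-token prediction. The helper name `make_next_token_pairs`, the raster-scan flattening order, and the use of an extra start token with index equal to the codebook size are all assumptions made for this example; the notebook may organise its data differently.

```python
import numpy as np

def make_next_token_pairs(code_grids, start_token):
    """Build (inputs, targets) for next-token prediction.

    code_grids:  (B, H, W) integer array of codes from the trained VQVAE (assumed shape).
    start_token: hypothetical extra token id placed at the start of every sequence.
    Returns two (B, H*W) arrays, where targets[:, t] is the token to predict
    after seeing inputs[:, :t + 1].
    """
    batch = code_grids.shape[0]
    seqs = code_grids.reshape(batch, -1)  # flatten each grid in raster-scan order
    start = np.full((batch, 1), start_token, dtype=seqs.dtype)
    inputs = np.concatenate([start, seqs[:, :-1]], axis=1)  # shifted right by one
    targets = seqs
    return inputs, targets

# Toy example: one 2x2 grid of codes from a codebook of size 4, start token id 4.
codes = np.array([[[3, 1], [0, 2]]])
inputs, targets = make_next_token_pairs(codes, start_token=4)
```

A GPT-style model would then be trained with a cross-entropy loss between its prediction at each position of `inputs` and the matching entry of `targets`.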

Discrete latent codes from the trained VQVAE:

Training the GPT prior on a next-token prediction task:

Reconstructions of VQVAE Model

Generated images using the trained VQVAE decoder and GPT prior
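
Generation with a learnt prior can be sketched as an autoregressive sampling loop: tokens are drawn one at a time, reshaped into a code grid, and then passed to the decoder. The sketch below is a hedged illustration only. `sample_codes` and `uniform_prior` are hypothetical names, and the uniform stand-in replaces the trained GPT so that the example runs on its own; the decoder call is omitted because its interface is not described here.

```python
import numpy as np

def sample_codes(next_token_logits, seq_len, start_token, rng):
    """Sample a sequence of discrete codes one token at a time.

    next_token_logits: callable mapping the tokens so far to a 1-D array of
                       logits over the codebook (stand-in for a trained prior).
    """
    tokens = [start_token]
    for _ in range(seq_len):
        logits = next_token_logits(np.array(tokens))
        probs = np.exp(logits - logits.max())  # numerically stable softmax
        probs /= probs.sum()
        tokens.append(int(rng.choice(len(probs), p=probs)))
    return np.array(tokens[1:])  # drop the start token

def uniform_prior(tokens, num_codes=8):
    """Hypothetical placeholder prior: equal logits for every code."""
    return np.zeros(num_codes)

rng = np.random.default_rng(0)
codes = sample_codes(uniform_prior, seq_len=16, start_token=8, rng=rng)
code_grid = codes.reshape(4, 4)  # grid of indices to look up in the codebook and decode
```

With a trained prior in place of `uniform_prior`, each entry of `code_grid` would be replaced by its codebook vector and the result passed through the VQVAE decoder to produce an image.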

BhanuPrakashPebbeti/Image-Generation-Using-VQVAE
