Accepted to IEEE Robotics and Automation Letters (RA-L) April 2024
An educational project dedicated to text-to-image generation with neural networks. BPE tokenization and a VQ-VAE are used to learn embeddings of text and images, respectively. A transformer-based model is then trained to predict the next token in the concatenated sequence of text and image tokens, and is used for generation.
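The core of the VQ-VAE mentioned above is the vector-quantization step: each continuous encoder output vector is replaced by its nearest entry in a learned codebook, yielding the discrete image tokens the transformer is trained on. A minimal NumPy sketch of that nearest-code lookup (the codebook size and vectors here are illustrative toy values, not from any of the listed repos):

```python
import numpy as np

def quantize(z, codebook):
    """Replace each row of z with its nearest codebook entry.

    z        : (n, d) array of encoder output vectors
    codebook : (k, d) array of learned code vectors
    returns  : (indices, quantized), where indices has shape (n,)
               and quantized has shape (n, d)
    """
    # Squared Euclidean distance from every vector to every code.
    dists = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    indices = dists.argmin(axis=1)       # discrete token id per vector
    return indices, codebook[indices]    # ids + quantized vectors

# Toy example: 2 encoder outputs, codebook of 3 codes in 2-D.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
z = np.array([[0.9, 1.1], [1.8, 2.1]])
idx, zq = quantize(z, codebook)
```

In a full VQ-VAE this lookup is paired with a straight-through gradient estimator and codebook/commitment losses so the codebook can be learned end to end; the integer `idx` values are what get concatenated with the text tokens for the transformer.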
Compression via Vector Quantization in PyTorch
Official code for the NeurIPS 2022 paper "Posterior Matching for Arbitrary Conditioning".
Improving Semantic Control in Discrete Latent Spaces with Transformer Quantized Variational Autoencoders
VQGAN from LDM without the dependency hell
Large-Scale Bidirectional Training for Zero-Shot Image Captioning
TensorFlow implementation of "Theory and Experiments on Vector Quantized Autoencoders"
VQ-VAE/GAN implementation in pytorch-lightning
OmniTokenizer: one model and one weight for image-video joint tokenization.
Language Quantized AutoEncoders
Inverse DALL-E for Optical Character Recognition
Voice conversion (VC) investigation using three variants of VAE
Experimental implementation for a sparse-dictionary based version of the VQ-VAE2 paper
Demo of robust semantic communication against semantic noise
Fast and scalable search of whole-slide images via self-supervised deep learning - Nature Biomedical Engineering