This project uses the VQGAN+CLIP image generation pipeline, complemented by ESRGAN upscaling, to create high-quality images. VQGAN+CLIP combines two separately trained models. VQGAN (Vector Quantized Generative Adversarial Network) generates images by decoding latent codes drawn from a learned, discrete codebook (the vector quantization step). CLIP (Contrastive Language-Image Pre-training) embeds images and text in a shared space, so it can score how well a generated image matches the input prompt; that score is used to iteratively steer VQGAN's latent codes toward the text description.
Together, the two models complement each other: VQGAN supplies the image prior and CLIP supplies the text guidance, producing detailed images that closely follow the prompt. ESRGAN then upscales the final output, preserving fine detail while increasing the resolution.
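To make the interplay concrete, here is a minimal sketch of the CLIP-guidance loop. It is a deliberate simplification: a raw pixel tensor stands in for VQGAN's latent codes, whereas in the real pipeline the optimized variable is the VQGAN latent and the scored image is the decoder's output. Only PyTorch and the `clip` package (installed from the CLIP repository cloned below) are assumed; the prompt is a placeholder.

```python
# Simplified sketch of the CLIP-guidance loop behind VQGAN+CLIP.
# A raw pixel tensor stands in for VQGAN's latents; in the real pipeline
# the optimizer would update a VQGAN latent z and score vqgan.decode(z).
import torch
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)
model = model.float()  # avoid fp16/fp32 mismatches when backpropagating

text = clip.tokenize(["a watercolor painting of a lighthouse"]).to(device)
text_features = model.encode_text(text).detach()
text_features = text_features / text_features.norm(dim=-1, keepdim=True)

# The variable being optimized: 224x224 to match CLIP's input resolution.
image = torch.rand(1, 3, 224, 224, device=device, requires_grad=True)
optimizer = torch.optim.Adam([image], lr=0.05)

for step in range(200):
    optimizer.zero_grad()
    image_features = model.encode_image(image.clamp(0, 1))
    image_features = image_features / image_features.norm(dim=-1, keepdim=True)
    loss = -(image_features * text_features).sum()  # maximize cosine similarity
    loss.backward()
    optimizer.step()
```

Practical implementations, including Katherine Crowson's original one credited below, typically also score several random crops ("cutouts") of the image per step to stabilize the optimization.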
To install the necessary dependencies, follow these steps:
- Create the conda environment from the `environment.yml` file with Anaconda (e.g. `conda env create -f environment.yml`).
- Clone the `Real-ESRGAN`, `CLIP`, and `taming-transformers` git repositories by running the following commands:

  ```bash
  git clone https://github.com/sberbank-ai/Real-ESRGAN
  git clone https://github.com/openai/CLIP.git
  git clone https://github.com/CompVis/taming-transformers.git
  ```

- Download the VQGAN model by going into the `taming-transformers` directory and running the following commands:

  ```bash
  mkdir -p checkpoints
  wget 'https://heibox.uni-heidelberg.de/d/a7530b09fed84f80a887/files/?p=%2Fckpts%2Flast.ckpt&dl=1' -O 'checkpoints/vqgan_imagenet_f16_16384.ckpt'
  wget 'https://heibox.uni-heidelberg.de/d/a7530b09fed84f80a887/files/?p=%2Fconfigs%2Fmodel.yaml&dl=1' -O 'checkpoints/vqgan_imagenet_f16_16384.yaml'
  ```

  A sketch of loading this checkpoint follows the list.

- Download the RealESRGAN model by going into the `Real-ESRGAN` directory and running the following command:

  ```bash
  gdown https://drive.google.com/uc?id=1SGHdZAln4en65_NQeQY9UjchtkEF9f5F -O weights/RealESRGAN_x4.pth &> /dev/null
  ```

  A sketch of upscaling with these weights follows the list.

- Run the `generate_image.py` file or the `gen.ipynb` notebook to generate images based on your input.
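For reference, below is a minimal sketch of how the downloaded VQGAN checkpoint and config can be loaded with `taming-transformers`. The paths match the `wget` commands above; the exact loading code used by this project may differ.

```python
# Sketch: loading the downloaded VQGAN checkpoint with taming-transformers.
# Paths match the wget commands above; run from the taming-transformers directory.
import torch
from omegaconf import OmegaConf
from taming.models.vqgan import VQModel

config = OmegaConf.load("checkpoints/vqgan_imagenet_f16_16384.yaml")
model = VQModel(**config.model.params)
model.init_from_ckpt("checkpoints/vqgan_imagenet_f16_16384.ckpt")
model.eval().requires_grad_(False)  # inference only; guidance updates the latents, not the weights

# decode() maps a latent tensor back to image space
z = torch.randn(1, 256, 16, 16)  # latent shape of the f16 model for a 256x256 output
with torch.no_grad():
    image = model.decode(z)
print(image.shape)  # -> torch.Size([1, 3, 256, 256])
```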
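And a sketch of 4x upscaling with the downloaded weights, based on the `RealESRGAN` class from the `sberbank-ai/Real-ESRGAN` repository cloned above (API assumed from that repository's documentation; the input and output file names are placeholders):

```python
# Sketch: 4x upscaling a generated image with the weights downloaded above.
# Assumes the sberbank-ai/Real-ESRGAN package is importable; file names are placeholders.
import torch
from PIL import Image
from RealESRGAN import RealESRGAN

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = RealESRGAN(device, scale=4)
model.load_weights("weights/RealESRGAN_x4.pth")

lr_image = Image.open("generated.png").convert("RGB")  # e.g. a VQGAN+CLIP output
sr_image = model.predict(lr_image)
sr_image.save("generated_4x.png")
```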
This project was inspired by Katherine Crowson's original implementation of the VQGAN+CLIP model.
This project is licensed under the MIT License - see the LICENSE.md file for details.