CoverGAN is a set of tools and machine learning models designed to generate good-looking album covers based on users' audio tracks and emotions. Resulting covers are generated in vector graphics format (SVG).
Available emotions:
- Anger
- Comfortable
- Fear
- Funny
- Happy
- Inspirational
- Joy
- Lonely
- Nostalgic
- Passionate
- Quiet
- Relaxed
- Romantic
- Sadness
- Serious
- Soulful
- Surprise
- Sweet
- Wary
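The list above can double as a whitelist when preparing input for the command line tool. The following is a minimal sketch, not part of the project: `AVAILABLE_EMOTIONS` and `parse_emotions` are hypothetical names, and the lowercase spelling is an assumption based on the `--emotions=joy,relaxed` example used with eval.py.

```python
from __future__ import annotations

# The 19 emotion names from the list above. Lowercasing them is an assumption
# based on the eval.py example (--emotions=joy,relaxed).
AVAILABLE_EMOTIONS = frozenset({
    "anger", "comfortable", "fear", "funny", "happy", "inspirational",
    "joy", "lonely", "nostalgic", "passionate", "quiet", "relaxed",
    "romantic", "sadness", "serious", "soulful", "surprise", "sweet", "wary",
})


def parse_emotions(raw: str) -> list[str]:
    """Split a comma-separated string and reject names not in the list."""
    names = [part.strip().lower() for part in raw.split(",") if part.strip()]
    unknown = [name for name in names if name not in AVAILABLE_EMOTIONS]
    if unknown:
        raise ValueError(f"Unknown emotions: {', '.join(unknown)}")
    return names
```

Validating the names before starting a generation run surfaces typos early instead of after audio feature extraction has already started.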
The service is available at http://81.3.154.178:5001/covergan.
- Generation of music covers by analyzing music and emotions
- Several GAN models
- SVG format
- Possibility of rasterization
- Insertion of readable captions
- A large number of different fonts
- Insertion of different color filters
- SVG editor
- Convenient change of colors
- Style transfer from provided image
- Saving images in any resolution
- The pretrained weights can be downloaded from here
- These weights should be placed into the ./weights folder
- See this README for training details.
Two types of generator are available in this service:
- The first creates covers with abstract lines.
- The second draws closed forms.
It is also possible to use one of two algorithms for applying inscriptions to the cover:
- The first algorithm uses the captioner model
- The second is a deterministic algorithm which searches for a suitable location
The service uses pretrained weights. See this section.
- Specify the PyTorch version to install in Dockerfile.
- Build the image by running the docker_build_covergan_service.sh file.
- Start the container by running the docker_run_covergan_service.sh file.

Go to http://localhost:5001 in the browser and enjoy!
- Install a suitable PyTorch version:
pip install torch torchvision torchaudio
- Install DiffVG
- Install dependencies from this file
- Run
python3 ./eval.py \
--audio_file="test.mp3" \
--emotions=joy,relaxed \
--track_artist="Cool Band" \
--track_name="New Song"
- By default, the resulting .svg covers are saved to the ./gen_samples folder.
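To generate covers for several tracks, the eval.py invocation above can be wrapped in a small script. This is a sketch under stated assumptions: the helper names (`build_eval_command`, `generate_cover`, `list_covers`) are hypothetical, only the four flags shown above are used, and the script is assumed to run from the repository root so that ./eval.py and ./gen_samples resolve.

```python
import subprocess
from pathlib import Path


def build_eval_command(audio_file, emotions, track_artist, track_name):
    """Assemble the eval.py invocation shown above as an argument list."""
    return [
        "python3", "./eval.py",
        f"--audio_file={audio_file}",
        f"--emotions={','.join(emotions)}",
        f"--track_artist={track_artist}",
        f"--track_name={track_name}",
    ]


def generate_cover(audio_file, emotions, track_artist, track_name):
    """Run eval.py for one track; raises CalledProcessError on failure."""
    subprocess.run(
        build_eval_command(audio_file, emotions, track_artist, track_name),
        check=True,
    )


def list_covers(out_dir="./gen_samples"):
    """Return the .svg files found in the output folder, sorted by name."""
    return sorted(Path(out_dir).glob("*.svg"))
```

Passing the arguments as a list avoids shell quoting problems with artist or track names that contain spaces, such as "Cool Band".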
See this examples folder.
- captions/: a network that predicts aesthetically matching colors and positions for the captions (artist and track names).
- colorer/: a network that predicts palettes for music covers.
- docs/: folder with instructions on how to start training or testing models.
- examples/: folder with simple music tracks, their generated covers, and examples of the original and clean datasets.
- fonts/: folder with fonts downloaded from Google Fonts.
- outer/: the primary GAN that generates vector graphics descriptions from audio files and user-specified emotions.
- utils/: code implementing various independent pieces of functionality, separated for convenient reuse.
- weights/: folder where the best models are saved.
- captioner_train.py: an entry point to trigger the Captioner network training.
- covergan_train.py: an entry point to trigger the CoverGAN training.
- eval.py: an entry point to trigger the primary flow as a command line tool.
- service.py: the primary code flow for album cover generation.
- audio/: default folder with music tracks (.flac or .mp3 format) for CoverGAN training.
- checkpoint/: default folder where checkpoints and other intermediate files are stored while training the CoverGAN and Captioner networks.
- clean_covers/: default folder with covers from which the captions were removed.
- original_covers/: default folder with original covers.
- plots/: folder where intermediate plots are saved during training.
- emotions.json: file with emotion markup for the training dataset.
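Before starting a training run, it can help to confirm that the input folders listed above are in place. The sketch below is illustrative only: `check_training_layout` is a hypothetical helper, the `root` argument is an assumption (the text does not state where the default folders live), and checkpoint/ and plots/ are skipped because they are written during training.

```python
from pathlib import Path

# Input entries taken from the list above. Treating all of them as required
# for every training run is an assumption.
EXPECTED_DIRS = ("audio", "clean_covers", "original_covers")
EXPECTED_FILES = ("emotions.json",)
AUDIO_SUFFIXES = {".flac", ".mp3"}


def check_training_layout(root="."):
    """Return a list of problems; an empty list means the layout looks complete."""
    root = Path(root)
    problems = []
    for name in EXPECTED_DIRS:
        if not (root / name).is_dir():
            problems.append(f"missing folder: {name}/")
    for name in EXPECTED_FILES:
        if not (root / name).is_file():
            problems.append(f"missing file: {name}")
    audio_dir = root / "audio"
    if audio_dir.is_dir():
        # Only .flac and .mp3 tracks are mentioned as supported formats.
        tracks = [p for p in audio_dir.iterdir() if p.suffix.lower() in AUDIO_SUFFIXES]
        if not tracks:
            problems.append("audio/ contains no .flac or .mp3 tracks")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report everything that is missing in a single pass.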
- The machine learning models rely on the popular PyTorch framework.
- Differentiable vector graphics rendering is provided by diffvg, which needs to be built from source.
- Audio feature extraction is based on Essentia; prebuilt pip packages are available.
- Other Python library dependencies include Pillow, Matplotlib, SciPy, and Kornia.
The full dataset consists of:
- Audio tracks
- Original covers
- Cleaned covers
- Fonts
- Marked up emotions
- Marked up rectangles for captioner model training
The dataset can be downloaded from here
- Build the image by running docker_build.sh
- See these docs for more details about the options available when training the networks.
- Specify the training command in covergan_training_command.sh
- Start the container by running docker_run.sh
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.