Duck and Cover

Source: http://cbarks.dk/Digital/seraa196208.JPG

Duck and Cover lets you create your own album covers based on additional information such as the genre of the LP, the release year, and the artist name and title of your fictional album.

Duck and Cover uses data from more than 600,000 covers by over 120,000 top Spotify artists across 3,254 genres to learn the structure and appearance of, say, a thrash metal album cover from 1988.

Data Gathering

Data gathering consists of two steps:

1. Create a .env file with your SPOTIPY_CLIENT_ID and SPOTIPY_CLIENT_SECRET (Spotify API client ID and secret). Read more on how to get your Spotify client ID and secret here.
2. Run the data collection script, which iteratively collects the top 50 artists for each of the genres listed in this file, together with their related artists, and removes duplicate artists. The script then builds a table containing the genre and release date of each album released by these artists, the artist and album names, and URLs for downloading a 300x300 and a 64x64 image of the cover. Finally, the covers are downloaded from these URLs and saved to a uniform, identifiable file structure.
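The deduplication and table-building part of step 2 can be sketched roughly as follows. This is a minimal illustration, not the repository's actual script: it assumes artist and album records shaped like Spotify Web API objects (an `id` and `name`, plus an `images` list with `url`, `height` and `width` for albums), and the function and column names are hypothetical.

```python
from typing import Dict, Iterable, List, Optional


def deduplicate_artists(artists: Iterable[dict]) -> List[dict]:
    """Keep the first occurrence of every artist ID, preserving order."""
    seen = set()
    unique = []
    for artist in artists:
        if artist["id"] not in seen:
            seen.add(artist["id"])
            unique.append(artist)
    return unique


def cover_url(album: dict, size: int) -> Optional[str]:
    """Return the URL of the square cover with the requested edge length, if any."""
    for image in album.get("images", []):
        if image.get("height") == size and image.get("width") == size:
            return image["url"]
    return None


def album_row(artist: dict, genre: str, album: dict) -> Dict[str, Optional[str]]:
    """Build one table row; the column names here are hypothetical."""
    return {
        "artist_id": artist["id"],
        "artist_name": artist["name"],
        "genre": genre,
        "album_name": album["name"],
        "release_date": album["release_date"],
        "cover_url_300": cover_url(album, 300),
        "cover_url_64": cover_url(album, 64),
    }
```

Keeping these helpers free of network calls makes them easy to check against hand-written records before running the full collection.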

All final and intermediate results of the tasks in step 2 are saved in a temporary dictionary so that data collection can be split across several runs if you reach the quota limit of the Spotify API (which usually does not happen) or run into other trouble.
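One way to make such a collection resumable is to cache each task's result and skip tasks that already finished. The sketch below assumes the results are JSON-serializable and persisted to disk; the helper name `run_with_checkpoint` and the one-file-per-task layout are hypothetical and not taken from the repository.

```python
import json
from pathlib import Path
from typing import Any, Callable


def run_with_checkpoint(name: str, task: Callable[[], Any], cache_dir: Path) -> Any:
    """Run `task` once and cache its JSON-serializable result under `cache_dir`."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    checkpoint = cache_dir / f"{name}.json"
    if checkpoint.exists():
        # A previous run already finished this task: reuse its result.
        return json.loads(checkpoint.read_text(encoding="utf-8"))
    result = task()
    # Write to a temporary file first so an interrupted run leaves no partial checkpoint.
    tmp = checkpoint.with_name(checkpoint.name + ".tmp")
    tmp.write_text(json.dumps(result), encoding="utf-8")
    tmp.replace(checkpoint)
    return result
```

If a run stops at the API quota, restarting it repeats only the tasks that have no checkpoint file yet.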

Networks and results

The first network is a simple Deep Convolutional GAN (DCGAN). The results aren't really satisfying, since a DCGAN cannot capture the manifold variations in an album cover and collapses early in training.

Switching from a plain binary cross-entropy loss for both the discriminator and the combined model to a GAN trained with Wasserstein loss combined with a gradient penalty yields much better results than the DCGAN.

Optimizing the Wasserstein loss results in more stable gradients, which leads to a steady learning phase, while the gradient penalty prevents vanishing gradients. The generated images therefore show a detectable structure, and they even hint at artist or album names at the top or bottom of the covers.
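For reference, the commonly used formulation of the Wasserstein loss with gradient penalty is given below. It is the standard textbook form, not necessarily the exact variant implemented in this repository. Here $D$ is the critic, $G$ the generator, $P_r$ and $P_g$ the real and generated distributions, and $\lambda$ the penalty weight, whose value in this project is not stated.

```latex
\hat{x} = \epsilon x + (1 - \epsilon)\,\tilde{x}, \qquad
x \sim P_r,\quad \tilde{x} \sim P_g,\quad \epsilon \sim U[0, 1]

L_D = \mathbb{E}_{\tilde{x} \sim P_g}\big[D(\tilde{x})\big]
    - \mathbb{E}_{x \sim P_r}\big[D(x)\big]
    + \lambda\, \mathbb{E}_{\hat{x}}\Big[\big(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\big)^2\Big]

L_G = -\,\mathbb{E}_{\tilde{x} \sim P_g}\big[D(\tilde{x})\big]
```

The penalty term pushes the norm of the critic's gradient towards 1 on points interpolated between real and generated images, which keeps the gradients reaching the generator in a usable range.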

Now let's have a look at the results of the ProGAN. One can clearly see how the model is built up from a very small resolution to a final resolution of 512x512 pixels. This lets the network learn the structure of the image little by little and produces images with clearer edges on those structures. The results are far from perfect, but much better than those of the Deep Convolutional GAN and the Wasserstein GAN.
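The progressive growing idea can be illustrated with a small sketch. It assumes a starting resolution of 4x4 that doubles at every stage (the starting size is an assumption, only the final 512x512 comes from the text above) and a linear fade-in between stages. Both function names are hypothetical, and plain lists stand in for image tensors.

```python
from typing import List


def resolution_schedule(start: int = 4, final: int = 512) -> List[int]:
    """Edge lengths visited when the resolution doubles at every growth stage."""
    if start <= 0 or final < start:
        raise ValueError("need 0 < start <= final")
    resolutions = [start]
    while resolutions[-1] < final:
        resolutions.append(resolutions[-1] * 2)
    if resolutions[-1] != final:
        raise ValueError("final must be start times a power of two")
    return resolutions


def fade_in(alpha: float, old_upsampled: List[float], new: List[float]) -> List[float]:
    """Blend the upsampled output of the previous stage with the new stage's output.

    alpha = 0 uses only the previous stage, alpha = 1 only the new stage.
    """
    return [(1.0 - alpha) * o + alpha * n for o, n in zip(old_upsampled, new)]
```

During training, alpha would typically be ramped from 0 to 1 while a new stage is introduced, so the higher-resolution layers take over gradually instead of disturbing what the lower-resolution layers have already learned.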

Next Steps:

Train ProGAN + Genre
Integrate Artist Name
Integrate Album Name
Migrate W-GAN to PyTorch Lightning
Migrate DCGAN to PyTorch Lightning

mcschmitz/duck_and_cover

Folders and files

Latest commit

History

Repository files navigation

Duck and Cover

Data Gathering

Networks and results

Next Steps:

About

Resources

Stars

Watchers

Forks

Languages