3 part AI creates pop song w/ lyrics, vocals, & music.

nrm33n/SongGenAI

SongGenAI

This is a pop music generator that uses transformer networks and consists of 3 parts:

  • a lyric generator using the GPT-2 transformer model
  • singing voice synthesis using the DiffSinger model, with a HiFi-GAN deep-learning vocoder that translates the synthesis model's control data into audio. The specific vocoder is from Kong, J., Kim, J., & Bae, J. (2020), "HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis". It also uses an LJS speech model.
  • a music generator using the Music-VAE auto-encoder model.

Music-VAE takes in a 2- or 16-bar musical sequence with one or more parts (e.g. a single melody, or a 3-part trio of bass, melody, and drums) and encodes it into a 256- or 512-dimensional latent vector. You can analyse the type of sound Music-VAE generates using the midi_analyser.py and midi_note_extractor.py files in the analyse-and-mix folder.
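To make the sizes above concrete, here is a minimal sketch of the arithmetic. It assumes 4/4 time at sixteenth-note resolution (16 steps per bar), which is not stated in this README, and `SequenceSpec` is a hypothetical helper, not part of Music-VAE or this repository. Which bar length pairs with which latent size depends on the checkpoint, so the values used with it are only illustrative.

```python
from dataclasses import dataclass

# Assumption: 4/4 time quantised to sixteenth notes, i.e. 16 steps per bar.
STEPS_PER_BAR = 16


@dataclass(frozen=True)
class SequenceSpec:
    """Hypothetical description of one input sequence and its latent size."""

    bars: int        # 2 or 16, as listed in the README
    parts: int       # 1 for a single melody, 3 for a trio
    latent_dim: int  # 256 or 512, as listed in the README

    def __post_init__(self):
        # Reject combinations the README does not mention.
        if self.bars not in (2, 16):
            raise ValueError("bars must be 2 or 16")
        if self.parts not in (1, 3):
            raise ValueError("parts must be 1 (melody) or 3 (trio)")
        if self.latent_dim not in (256, 512):
            raise ValueError("latent_dim must be 256 or 512")

    @property
    def steps_per_part(self) -> int:
        """Number of quantised time steps in each part."""
        return self.bars * STEPS_PER_BAR

    @property
    def total_slots(self) -> int:
        """Time steps summed over all parts."""
        return self.steps_per_part * self.parts
```

Under these assumptions a 2-bar melody spans 32 steps and a 16-bar trio spans 256 steps per part, each summarised by a single latent vector.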

The kind of output it produces looks like this:

[Image: midi_analyser output]
[Image: midi_note_extractor output]
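For a rough idea of what a note-level analysis can involve, the sketch below summarises a list of notes. It is not the code in midi_analyser.py or midi_note_extractor.py: `note_name` and `summarise` are hypothetical helpers, the notes are assumed to be already extracted as `(pitch, start_seconds, end_seconds)` tuples, and octave numbers follow the convention where MIDI pitch 60 is C4.

```python
from collections import Counter

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def note_name(pitch: int) -> str:
    """Convert a MIDI pitch number (0-127) to a name such as 'C4'."""
    if not 0 <= pitch <= 127:
        raise ValueError("MIDI pitch must be in 0..127")
    octave = pitch // 12 - 1  # assumption: MIDI 60 is labelled C4
    return f"{NOTE_NAMES[pitch % 12]}{octave}"


def summarise(notes):
    """Summarise notes given as (pitch, start_seconds, end_seconds) tuples."""
    if not notes:
        return {"count": 0, "lowest": None, "highest": None,
                "mean_duration": 0.0, "pitch_classes": {}}
    pitches = [pitch for pitch, _, _ in notes]
    durations = [end - start for _, start, end in notes]
    # Count how often each pitch class (C, C#, ...) occurs, ignoring octave.
    histogram = Counter(NOTE_NAMES[pitch % 12] for pitch in pitches)
    return {
        "count": len(notes),
        "lowest": note_name(min(pitches)),
        "highest": note_name(max(pitches)),
        "mean_duration": sum(durations) / len(durations),
        "pitch_classes": dict(histogram),
    }
```

A pitch-class histogram like this gives a quick view of which notes dominate a generated sequence, and the pitch range shows how wide the melody spreads.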

The actual song output at the end is mix.wav in the analyse-and-mix folder.
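How the repository combines vocals and music into mix.wav is not described here. As one possible approach, the sketch below sums two mono 16-bit PCM tracks sample by sample, clips the result, and writes a WAV file with the standard library. `mix_pcm16` and `write_wav` are hypothetical helpers, and the 22050 Hz sample rate is an assumption.

```python
import array
import sys
import wave


def mix_pcm16(track_a, track_b, gain_a=1.0, gain_b=1.0):
    """Mix two sequences of 16-bit samples, padding the shorter with silence."""
    length = max(len(track_a), len(track_b))
    mixed = array.array("h")
    for i in range(length):
        sample_a = track_a[i] if i < len(track_a) else 0
        sample_b = track_b[i] if i < len(track_b) else 0
        value = round(sample_a * gain_a + sample_b * gain_b)
        # Clip to the signed 16-bit range to avoid overflow.
        mixed.append(max(-32768, min(32767, value)))
    return mixed


def write_wav(target, samples, sample_rate=22050):
    """Write mono 16-bit samples to a path or a binary file-like object."""
    data = array.array("h", samples)
    if sys.byteorder == "big":
        data.byteswap()  # WAV stores PCM samples as little-endian
    with wave.open(target, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(data.tobytes())
```

Lowering `gain_a` or `gain_b` before mixing reduces how often clipping occurs when both tracks are loud at the same time.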
