SongGenAI

This is a pop music generator using transformer networks. It consists of 3 parts:

a lyric generator using the GPT-2 transformer model
singing voice synthesis using the DiffSinger model, with a HiFi-GAN deep-learning-based vocoder that translates the control data from the synthesis model into audio. The specific vocoder used is from Kong, J., Kim, J., & Bae, J. (2020), HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis. It also uses an LJS speech model.
a music generator using the Music-VAE auto-encoder model.
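As an illustration of the first part, here is a minimal sketch of lyric generation with GPT-2. It assumes the Hugging Face `transformers` library and the base `gpt2` checkpoint; the repository may well use a different library or a fine-tuned model, and the function name `generate_lyrics` and its parameters are hypothetical.

```python
def generate_lyrics(prompt, generator=None, max_new_tokens=60):
    """Return generated lyric lines that continue `prompt`.

    `generator` is any callable with the same call shape as a Hugging Face
    text-generation pipeline. If omitted, the base "gpt2" checkpoint is
    loaded (an assumption; it is not necessarily the model this repo uses).
    """
    if generator is None:
        # Imported lazily so the function can also run with a stand-in generator.
        from transformers import pipeline
        generator = pipeline("text-generation", model="gpt2")

    outputs = generator(
        prompt,
        max_new_tokens=max_new_tokens,
        do_sample=True,  # sample rather than decode greedily, for variety
        top_k=50,
    )
    text = outputs[0]["generated_text"]
    # Split into non-empty, trimmed lines so each one can be sung as a phrase.
    return [line.strip() for line in text.splitlines() if line.strip()]


# Usage (downloads the model on first run):
# for line in generate_lyrics("Tonight we run away"):
#     print(line)
```

Passing the generator in as a parameter keeps the sketch easy to try with a different checkpoint, or with a stub when no model is available.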

The Music-VAE encoder takes in a 2- or 16-bar musical sequence with one or more parts (e.g. 1 part: melody; 3 parts: a bass, melody and drums trio) and encodes it to a 256- or 512-dimensional latent vector. You can analyse the type of sound Music-VAE generates using the midi_analyser.py and midi_note_extractor.py files in the analyse-and-mix folder.
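To give a feel for what analysing the generated notes can involve, here is a small sketch that summarises a list of notes. It is not the code in midi_analyser.py or midi_note_extractor.py: the `(midi_pitch, start_sec, end_sec)` tuple format and the `summarise_notes` function are assumptions made for illustration, and reading the notes out of an actual MIDI file would need a MIDI library such as `mido` or `pretty_midi`.

```python
from collections import Counter

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def summarise_notes(notes):
    """Summarise notes given as (midi_pitch, start_sec, end_sec) tuples.

    The tuple format is hypothetical; adapt it to whatever the extractor emits.
    """
    if not notes:
        raise ValueError("no notes to analyse")

    pitches = [pitch for pitch, _, _ in notes]
    durations = [end - start for _, start, end in notes]
    # Fold every pitch into one octave to see which note names dominate.
    histogram = Counter(PITCH_CLASSES[pitch % 12] for pitch in pitches)

    return {
        "note_count": len(notes),
        "lowest_pitch": min(pitches),
        "highest_pitch": max(pitches),
        "mean_duration": sum(durations) / len(durations),
        "pitch_classes": dict(histogram),
    }


# A four-note C major arpeggio as example input.
melody = [(60, 0.0, 0.5), (64, 0.5, 1.0), (67, 1.0, 2.0), (72, 2.0, 2.5)]
summary = summarise_notes(melody)
```

A summary like this makes it easier to compare samples, for example whether the melody part stays in a singable range.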

The kind of output it produces looks like this:

The actual song output at the end is mix.wav in the analyse-and-mix folder.
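How the vocals and the backing track are combined into mix.wav is not described here. As a rough sketch, assuming both stems are 16-bit PCM WAV files with the same channel count and sample rate, a mix can be produced with the Python standard library alone. The function name, the gain parameters and the input file names are hypothetical.

```python
import array
import sys
import wave


def _read_samples(wav):
    """Read all frames of an open 16-bit WAV file as native-endian ints."""
    samples = array.array("h", wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    return samples


def mix_wavs(vocal_path, music_path, out_path, vocal_gain=1.0, music_gain=0.6):
    """Sum two 16-bit WAV files sample by sample and write the result."""
    with wave.open(vocal_path, "rb") as v, wave.open(music_path, "rb") as m:
        v_format = (v.getnchannels(), v.getsampwidth(), v.getframerate())
        m_format = (m.getnchannels(), m.getsampwidth(), m.getframerate())
        if v_format != m_format:
            raise ValueError("inputs must share channels, sample width and rate")
        if v.getsampwidth() != 2:
            raise ValueError("only 16-bit PCM is handled in this sketch")
        params = v.getparams()
        vocal = _read_samples(v)
        music = _read_samples(m)

    length = max(len(vocal), len(music))
    mixed = array.array("h", [0]) * length
    for i in range(length):
        # The shorter stem is treated as silence once it runs out.
        a = vocal[i] * vocal_gain if i < len(vocal) else 0
        b = music[i] * music_gain if i < len(music) else 0
        # Clip to the 16-bit range instead of letting the sum wrap around.
        mixed[i] = max(-32768, min(32767, int(a + b)))

    if sys.byteorder == "big":
        mixed.byteswap()
    with wave.open(out_path, "wb") as out:
        out.setparams(params)
        out.writeframes(mixed.tobytes())


# Usage (file names are placeholders):
# mix_wavs("vocals.wav", "music.wav", "mix.wav")
```

Lowering `music_gain` relative to `vocal_gain` is one simple way to keep the sung lyrics audible over the accompaniment.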

nrm33n/SongGenAI
