Unconditional Audio Generation with GAN and Cycle Regularization
This repository contains the code and samples for our paper "Unconditional Audio Generation with GAN and Cycle Regularization". The goal is to unconditionally generate singing voices, speech, and instrument sounds with GAN.
The model is implemented with PyTorch.
pip install -r requirements.txt
Download pretrained parameters
The pretrained parameters can be downloaded here: Pretrained parameters
Unzip it so that the models folder is in the current folder.
Or use the following script
Display the options
python generate.py -h
Generate singing voices
The following commands are equivalent.
python generate.py
python generate.py -data_type singing -arch_type hc --duration 10 --num_samples 5
python generate.py -d singing -a hc --duration 10 -ns 5
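The equivalence of these three commands follows from two things: each option has a long and a short spelling, and the bare command falls back to defaults. The sketch below shows how such a parser could be declared with `argparse`. It is a hypothetical illustration, not the repository's actual `generate.py`. The flag names and default values are taken from the commands above; the `build_parser` helper, the `float`/`int` types, and the list of allowed data types are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown in this README."""
    parser = argparse.ArgumentParser(description="Sketch of the generate.py options")
    # Long and short spellings of an option write to the same attribute.
    parser.add_argument("-data_type", "-d", dest="data_type", default="singing",
                        choices=["singing", "speech", "piano", "violin"])
    parser.add_argument("-arch_type", "-a", dest="arch_type", default="hc")
    # The numeric types below are assumptions made for this sketch.
    parser.add_argument("--duration", type=float, default=10.0)
    parser.add_argument("--num_samples", "-ns", dest="num_samples", type=int, default=5)
    return parser


parser = build_parser()
bare = parser.parse_args([])
long_form = parser.parse_args(
    "-data_type singing -arch_type hc --duration 10 --num_samples 5".split())
short_form = parser.parse_args("-d singing -a hc --duration 10 -ns 5".split())

# All three invocations resolve to the same set of option values.
print(bare == long_form == short_form)
```

Run `python generate.py -h` to see the options the script actually defines.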
Generate speech
python generate.py -d speech
Generate piano sounds
python generate.py -d piano
Generate violin sounds
python generate.py -d violin
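To generate samples for every data type in one pass, the per-type commands can be assembled in a short script. This is a minimal sketch: it only uses the `-d` option shown in this README, and the `build_commands` helper is a hypothetical name introduced for illustration. It prints the commands rather than running them.

```python
import shlex

# The four data types used in the commands in this README.
DATA_TYPES = ["singing", "speech", "piano", "violin"]


def build_commands(data_types=DATA_TYPES):
    """Return one generate.py command (as an argument list) per data type."""
    return [["python", "generate.py", "-d", data_type] for data_type in data_types]


commands = build_commands()
for command in commands:
    # Print only. To execute, pass the list to subprocess.run(command, check=True)
    # from the repository root, with the pretrained models in place.
    print(shlex.join(command))
```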
We use MelGAN as the vocoder. The trained vocoders are included in the
For singing, piano, and violin, we modified MelGAN to include a GRU in the vocoder architecture, which we found improves audio quality. For speech, we directly use the trained LJ vocoder from MelGAN.
Train your own model
One may use the following steps to train their own models.
Some generated audio samples can be found in: