
Implementation of WaveGAN using PyTorch to generate artificially-made human voices uttering digits "0" to "9".


lukysummer/WaveGAN-Speech-Synthesis


WaveGAN (PyTorch)

This is my own PyTorch implementation of WaveGAN, which was introduced in this paper. The main task was to synthesize raw audio of drum sounds and of human voices articulating the digits 0 to 9.

Results

✿ Take a look at an example of synthesized audio HERE! ✿

Additional Notes

  • While building the model, I used the hyperparameters suggested in the paper, except for the model size (d):

  • I reduced the model size (d) from 64 (the value suggested in the paper) to 32 and obtained recognizable synthesis as early as epoch 30.

  • During training, I did not use any quantitative stopping criterion (as in the paper). I relied on a qualitative check of the synthesized audio.
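To make the effect of the model size (d) concrete, here is a minimal sketch that tallies the generator's layer shapes and parameter counts for d = 64 and d = 32. It assumes the generator layout described in the WaveGAN paper (a dense projection of a 100-dimensional latent vector to 16 time steps with 16d channels, followed by five transposed 1D convolutions with kernel length 25 and stride 4). It is not taken from this repository's code, and the function names `generator_layers` and `total_params` are hypothetical.

```python
# Sketch only: the layer layout is assumed from the WaveGAN paper's generator,
# not read from this repository's code.

def generator_layers(d, c=1, latent_dim=100, kernel=25, stride=4):
    """Return (name, out_channels, out_length, n_params) for each generator layer."""
    channels, length = 16 * d, 16
    # Dense layer projects the latent vector to a (16d channels, 16 steps) feature map.
    layers = [("dense", channels, length,
               latent_dim * channels * length + channels * length)]
    # Five transposed convolutions, each upsampling the time axis by `stride`.
    for i, out_ch in enumerate([8 * d, 4 * d, 2 * d, d, c], start=1):
        length *= stride
        params = channels * out_ch * kernel + out_ch  # weights + biases
        layers.append((f"upconv{i}", out_ch, length, params))
        channels = out_ch
    return layers


def total_params(d):
    """Sum the parameter counts over all generator layers for model size d."""
    return sum(p for _, _, _, p in generator_layers(d))


if __name__ == "__main__":
    for d in (64, 32):
        print(f"d={d}: {total_params(d):,} generator parameters")
        for name, ch, length, _ in generator_layers(d):
            print(f"  {name:8s} -> {ch:4d} channels x {length:5d} samples")
```

Under these assumptions the output length stays at 16384 samples regardless of d, while most convolution weights scale with d squared, so halving d shrinks the generator by a bit less than a factor of four.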

Datasets

The WaveGAN model was trained on the following datasets (containing .wav files):

Sources
