This is my own PyTorch implementation of WaveGAN, introduced in this paper. The main task was to synthesize raw audio of drum sounds and of human voices articulating the digits 0 through 9.
✿ Take a look at an example of synthesized audio HERE! ✿
While building the model, I used the hyperparameters suggested in the paper, EXCEPT for the model size (d):
I reduced the model size d from 64 (the value suggested in the paper) to 32, and obtained recognizable synthesis as early as epoch 30.
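To illustrate what the model size d controls, here is a minimal sketch of a WaveGAN-style generator in PyTorch: a dense layer followed by a stack of stride-4 transposed 1-D convolutions, where d scales the channel widths. The class name, padding arithmetic, and layer plumbing are my own illustrative choices, not the exact code in this repo.

```python
import torch
import torch.nn as nn

class WaveGANGenerator(nn.Module):
    """Sketch of a WaveGAN-style generator; `d` scales all channel widths."""
    def __init__(self, d=32, latent_dim=100):
        super().__init__()
        self.d = d
        # project latent vector to a (16*d channels, length 16) feature map
        self.fc = nn.Linear(latent_dim, 16 * 16 * d)

        def up(cin, cout):
            # kernel 25, stride 4; padding/output_padding chosen so each
            # layer upsamples the length by exactly 4x
            return nn.ConvTranspose1d(cin, cout, kernel_size=25,
                                      stride=4, padding=11, output_padding=1)

        self.net = nn.Sequential(
            up(16 * d, 8 * d), nn.ReLU(),
            up(8 * d, 4 * d), nn.ReLU(),
            up(4 * d, 2 * d), nn.ReLU(),
            up(2 * d, d), nn.ReLU(),
            up(d, 1), nn.Tanh(),  # waveform scaled to [-1, 1]
        )

    def forward(self, z):
        x = self.fc(z).view(-1, 16 * self.d, 16)
        return self.net(x)

# With d=32, a batch of latent vectors maps to 16384-sample waveforms
g = WaveGANGenerator(d=32)
audio = g(torch.randn(2, 100))
print(audio.shape)  # torch.Size([2, 1, 16384])
```

Halving d from 64 to 32 roughly quarters the parameter count of each conv layer, which is why training gets noticeably cheaper.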
During training, I did not use any quantitative stopping criterion (as in the paper); instead, I qualitatively checked the synthesized audio by listening to it.
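For this kind of listen-and-judge workflow, a small helper that dumps generated waveforms to .wav files after each epoch is handy. The sketch below uses only the standard-library wave module plus NumPy; the function name and file-naming scheme are illustrative, not taken from this repo.

```python
import wave
import numpy as np

def save_samples(audio_batch, epoch, sample_rate=16000):
    """Write each generated waveform (floats in [-1, 1]) to 16-bit mono WAV."""
    for i, samples in enumerate(audio_batch):
        pcm = (np.clip(samples, -1.0, 1.0) * 32767).astype("<i2")
        with wave.open(f"epoch{epoch:03d}_sample{i}.wav", "wb") as f:
            f.setnchannels(1)        # mono
            f.setsampwidth(2)        # 16-bit PCM
            f.setframerate(sample_rate)
            f.writeframes(pcm.tobytes())

# demo: save a 0.1 s 440 Hz sine tone and verify it round-trips
t = np.linspace(0, 0.1, 1600, endpoint=False)
save_samples([0.5 * np.sin(2 * np.pi * 440 * t)], epoch=0)
with wave.open("epoch000_sample0.wav", "rb") as f:
    n_frames, rate = f.getnframes(), f.getframerate()
print(n_frames, rate)  # 1600 16000
```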
The WaveGAN model was trained on the following datasets (of .wav files):