Text-to-Speech Synthesis by Generating Spectrograms using Generative Adversarial Network

ligaoliang/Conditional-SpecGAN-Tensorflow

 
 


Conditional SpecGAN

A (conditional) generative adversarial network for audio synthesis, implemented in TensorFlow. The network generates spectrograms, which are then converted into raw waveforms.
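As a rough illustration of the spectrogram representation mentioned above, the following sketch computes a log-magnitude spectrogram with NumPy only. The function name, the frame length of 256, the hop of 128, and the 16 kHz sample rate are assumptions chosen for illustration and are not taken from this repository's code. Going the other way, from a magnitude spectrogram back to a waveform, additionally requires estimating the phase, which this sketch does not cover.

```python
import numpy as np

def log_magnitude_spectrogram(wave, frame_len=256, hop=128, eps=1e-6):
    """Return a (frames, bins) log-magnitude spectrogram of a 1-D signal.

    Hypothetical helper: the parameter values are illustrative assumptions.
    """
    wave = np.asarray(wave, dtype=np.float64)
    n_frames = 1 + (len(wave) - frame_len) // hop
    # Index matrix that slices the signal into overlapping frames.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = wave[idx] * np.hanning(frame_len)
    # One-sided FFT magnitude; drop the Nyquist bin to keep a power-of-two width.
    mag = np.abs(np.fft.rfft(frames, axis=1))[:, :-1]
    return np.log(mag + eps)

# One second of a 440 Hz tone at an assumed 16 kHz sample rate.
sr = 16000
t = np.arange(sr) / sr
spec = log_magnitude_spectrogram(np.sin(2 * np.pi * 440 * t))
```

Each row of `spec` is one time frame and each column one frequency bin, so the array can be treated as a single-channel image by an image-style GAN.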

Requirements:

  • TensorFlow r1.10.1
  • Python 3.6
  • numpy 1.14.5
  • librosa 0.6.2
  • tqdm 4.26.0
  • matplotlib 2.2.3

Introduction

This project performs text-to-speech synthesis by generating spectrograms with a generative adversarial network. It is based on the original implementation of SpecGAN, and I further explore conditioning the SpecGAN training. Additionally, an energy-based data preprocessing scheme is applied, which improves audio quality.

The result of the preprocessing is shown in the following visualization:
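The README does not spell out the energy-based scheme, so the following is only a guess at one plausible form of it: keep the fixed-length window of each clip that contains the most signal energy, and zero-pad clips that are too short. The function name `crop_highest_energy` and its behavior are assumptions for illustration and may differ from what './src/utils/preprocess_data.py' actually does.

```python
import numpy as np

def crop_highest_energy(wave, length):
    """Return the `length`-sample window of `wave` with the largest energy.

    Hypothetical helper, not the repository's implementation.
    """
    wave = np.asarray(wave, dtype=np.float64)
    if len(wave) <= length:
        # Short clips are padded with trailing zeros to the target length.
        return np.pad(wave, (0, length - len(wave)), mode="constant")
    # Prefix sums of squared samples give every window's energy in O(n).
    csum = np.concatenate(([0.0], np.cumsum(wave ** 2)))
    energy = csum[length:] - csum[:-length]
    start = int(np.argmax(energy))
    return wave[start:start + length]

# A silent clip with a short burst in the middle.
clip = np.zeros(100)
clip[40:50] = 1.0
cropped = crop_highest_energy(clip, 10)
```

Selecting by energy rather than cropping from the start of the file keeps the spoken content centered in the training example and discards leading or trailing silence.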

Build Dataset

  • Download training data: here

  • Run './src/utils/preprocess_data.py' to process data or download the processed data: here

  • Run './src/utils/visualize_wav.py' to visualize the processed clean data or download the results: here

  • Run './src/utils/make_tfrecord.py' to convert the .wav files into training-ready .tfrecord files, or download the processed data: here

  • If you downloaded the .tgz file in step 4, extract it and place its contents in the path given by args.data_dir in './src/config.py':

data_dir='../data/sc09_preprocess_energy'

This default path can be modified by changing the '--data_dir' option in './src/config.py'.
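The reference to args.data_dir suggests that the configuration is parsed with argparse, although that is an assumption. A minimal sketch of how such an option could be declared follows. Only the option name and its default value come from the text above; `build_parser` and the help string are hypothetical.

```python
import argparse

def build_parser():
    """Build a parser with a --data_dir option (hypothetical layout)."""
    parser = argparse.ArgumentParser(description="SpecGAN configuration sketch")
    parser.add_argument(
        "--data_dir",
        default="../data/sc09_preprocess_energy",
        help="directory containing the .tfrecord training files",
    )
    return parser

# With no arguments, the default path is used.
args = build_parser().parse_args([])
# Passing the option overrides the default.
custom = build_parser().parse_args(["--data_dir", "/tmp/my_data"])
```

With this layout the path can be changed either by editing the default in the file or by passing the option when the parser reads the command line.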

Usage

  • Train a new SpecGAN model, or resume training an existing one, with the following command:
python3 ./src/runner.py train
  • To run inference and generate audio from a trained SpecGAN model, use the following command:
python3 ./src/runner.py generate
  • To train or generate from a conditional SpecGAN, use the following commands (note: this feature is still being implemented and is not complete):
python3 ./src/runner.py train --conditional
python3 ./src/runner.py generate --conditional
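Since the conditional mode is not finished, the following is not this repository's method. It is a sketch of one common way to condition a GAN generator: append a one-hot class label to the latent vector. The function name, the latent size of 100, and the 10 classes (assuming the ten spoken digits that the 'sc09' directory name hints at) are all assumptions.

```python
import numpy as np

def conditional_latent(labels, num_classes=10, latent_dim=100, seed=0):
    """Concatenate random latent vectors with one-hot labels.

    Hypothetical helper illustrating label conditioning in general.
    """
    rng = np.random.RandomState(seed)
    labels = np.asarray(labels)
    # One latent vector per requested label, sampled uniformly in [-1, 1).
    z = rng.uniform(-1.0, 1.0, size=(len(labels), latent_dim))
    # Rows of the identity matrix are one-hot encodings of the labels.
    one_hot = np.eye(num_classes)[labels]
    return np.concatenate([z, one_hot], axis=1)

# Generator inputs requesting the classes 3 and 7.
batch = conditional_latent([3, 7])
```

The generator would then receive a 110-dimensional input per example, and the discriminator is usually given the same label so that it can penalize samples that do not match the requested class.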
