A repository containing code for generating noisy speech data from clean data using deep learning methods

shashankshirol/GeneratingNoisySpeechData

Generating Noisy Speech Data using Deep Learning methods

A repository containing code for generating noisy speech data from clean data in the frequency domain using deep learning methods.

We explore two architectures: one uses a style transfer method and the other uses an image-to-image translation model.

Architectures:

Style-Transfer Method:

The code uses the official SinGAN implementation to generate noisy spectrograms of audio data, specifically through SinGAN's Paint2Image task.

Image-to-Image Translation:

This repo houses a modified version of CUT (Contrastive Unpaired Translation), a GAN that we use to learn a mapping from clean to noisy spectrograms. We have tuned the model to work on spectrograms and produce reconstructable audio. The code is heavily derived from the official CUT implementation.
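Both architectures operate on spectrograms rather than raw waveforms, so audio has to be converted to an image-like array and back. The following is a minimal sketch of that round trip using only NumPy, not the repository's actual preprocessing code. The function names (`stft`, `istft`, `to_log_magnitude`, `from_log_magnitude`) and the parameters (`n_fft=256`, `hop=64`, Hann window) are assumptions chosen for illustration. It keeps the original phase so that the log-magnitude array, which is the part a model would see, can be turned back into audio.

```python
import numpy as np

def stft(signal, n_fft=256, hop=64):
    """Short-time Fourier transform; returns an array of shape (freq_bins, frames)."""
    window = np.hanning(n_fft)
    # Reflect-pad so the first and last samples sit in the middle of a frame.
    padded = np.pad(signal, n_fft // 2, mode="reflect")
    n_frames = 1 + (len(padded) - n_fft) // hop
    frames = np.stack(
        [padded[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1).T

def istft(spec, n_fft=256, hop=64, length=None):
    """Inverse STFT by windowed overlap-add, normalised by the summed squared window."""
    window = np.hanning(n_fft)
    frames = np.fft.irfft(spec.T, n=n_fft, axis=1) * window
    n_frames = frames.shape[0]
    out = np.zeros(n_fft + hop * (n_frames - 1))
    norm = np.zeros_like(out)
    for i in range(n_frames):
        out[i * hop : i * hop + n_fft] += frames[i]
        norm[i * hop : i * hop + n_fft] += window ** 2
    out /= np.maximum(norm, 1e-8)
    out = out[n_fft // 2 :]  # drop the padding added in stft()
    return out if length is None else out[:length]

def to_log_magnitude(spec):
    """Split a complex spectrogram into a log-magnitude image (dB) and its phase."""
    magnitude_db = 20.0 * np.log10(np.maximum(np.abs(spec), 1e-10))
    return magnitude_db, np.angle(spec)

def from_log_magnitude(magnitude_db, phase):
    """Recombine a log-magnitude image and a phase array into a complex spectrogram."""
    return 10.0 ** (magnitude_db / 20.0) * np.exp(1j * phase)

if __name__ == "__main__":
    sample_rate = 16000  # assumed sample rate for the demo signal
    t = np.arange(sample_rate) / sample_rate
    clean = 0.5 * np.sin(2 * np.pi * 440.0 * t)  # one second of a 440 Hz tone

    magnitude_db, phase = to_log_magnitude(stft(clean))
    restored = istft(from_log_magnitude(magnitude_db, phase), length=len(clean))
    print("spectrogram shape:", magnitude_db.shape)
    print("max reconstruction error:", np.max(np.abs(clean - restored)))
```

In this sketch a model would modify only `magnitude_db`. Reusing the clean signal's phase for the inverse transform is one simple option; phase estimation methods such as Griffin-Lim are an alternative when no suitable phase is available.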

Note

Refer to the directories for the two architectures to learn more and to test them out for yourself!

Consolidated list of important links

Documentation
