bill9800/Speech-denoise-Autoencoder

Speech-denoising-Autoencoder

Speech denoising systems usually enhance only the magnitude spectrum and leave the phase spectrum untouched. This system tries to improve denoising performance with a denoising autoencoder neural network. The clean audio estimate is computed with a complex ideal ratio mask, so that phase information is enhanced as well.

Structure

Input: audio data in the mel-frequency domain

Output: complex ratio mask (cRM)[1]
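
As a rough illustration of what a complex ratio mask is, the sketch below computes the ideal mask from a clean and a noisy complex spectrogram, following the definition in [1], and applies it back to the noisy spectrogram. It is a minimal NumPy sketch, not the code of this repository: the function names, the `eps` stabilizer, and the random test spectrograms are assumptions made for illustration.

```python
import numpy as np


def complex_ideal_ratio_mask(noisy_stft, clean_stft, eps=1e-12):
    """Return the complex mask M such that M * noisy_stft ~= clean_stft.

    Real and imaginary parts follow the definition in [1]:
        M_r = (Y_r S_r + Y_i S_i) / (Y_r^2 + Y_i^2)
        M_i = (Y_r S_i - Y_i S_r) / (Y_r^2 + Y_i^2)
    where Y is the noisy spectrogram and S the clean one.
    eps is a hypothetical stabilizer against division by zero.
    """
    y_r, y_i = noisy_stft.real, noisy_stft.imag
    s_r, s_i = clean_stft.real, clean_stft.imag
    denom = y_r ** 2 + y_i ** 2 + eps
    m_r = (y_r * s_r + y_i * s_i) / denom
    m_i = (y_r * s_i - y_i * s_r) / denom
    return m_r + 1j * m_i


def apply_mask(noisy_stft, mask):
    """Complex multiplication corrects magnitude and phase together."""
    return mask * noisy_stft


# Toy data standing in for real spectrograms (2049 frequency bins, 4 frames).
rng = np.random.default_rng(0)
shape = (2049, 4)
clean = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
noise = 0.3 * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))
noisy = clean + noise

mask = complex_ideal_ratio_mask(noisy, clean)
estimate = apply_mask(noisy, mask)
```

In training, the network would predict the real and imaginary parts of such a mask from the noisy input; the ideal mask above only serves as the target.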

The model is built in a linear shape (2049-500-180) without weight lock [2].
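
To make the layer sizes concrete, here is a small forward-pass sketch in plain NumPy. Several things in it are assumptions rather than facts from this repository: "without weight lock" is read as untied weights (the decoder has its own matrices instead of reusing the transposed encoder matrices), the decoder is assumed to mirror the encoder (180-500-2049), and the tanh activation and initialization are placeholders. The actual model is written in Keras and may differ.

```python
import numpy as np

LAYER_SIZES = [2049, 500, 180]  # encoder sizes stated in this README


def init_untied_autoencoder(layer_sizes, seed=0):
    """Create separate (weight, bias) pairs for every encoder and decoder layer.

    The mirrored decoder (180 -> 500 -> 2049) is an assumption.
    """
    rng = np.random.default_rng(seed)
    full = layer_sizes + layer_sizes[-2::-1]  # 2049, 500, 180, 500, 2049
    params = []
    for n_in, n_out in zip(full[:-1], full[1:]):
        w = rng.standard_normal((n_in, n_out)) * np.sqrt(1.0 / n_in)
        b = np.zeros(n_out)
        params.append((w, b))
    return params


def forward(params, x):
    """Run a batch through all layers; tanh on hidden layers is an assumption."""
    h = x
    for i, (w, b) in enumerate(params):
        h = h @ w + b
        if i < len(params) - 1:
            h = np.tanh(h)
    return h


params = init_untied_autoencoder(LAYER_SIZES)
batch = np.zeros((4, 2049))  # 4 hypothetical input frames
output = forward(params, batch)
```

With tied weights, the last decoder matrix would simply be the transpose of the first encoder matrix; here each layer is initialized and would be trained independently.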

Source

youtube-dl : a command-line program to download videos from YouTube.com and a few more sites

SoX : a cross-platform command-line utility to convert audio files between various formats

FFmpeg : a complete, cross-platform solution to record, convert and stream audio and video

librosa : a Python package for music and audio analysis

Reference

[1] D. S. Williamson, Y. Wang, and D. Wang, "Complex Ratio Masking for Monaural Speech Separation," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 24, no. 3, March 2016

[2] Zhenzhou Wu, "Speech Synthesis with Deep Denoising Autoencoder"

About

Speech denoiser model using Keras