Speech-denoising-Autoencoder

Speech denoising systems usually enhance only the magnitude spectrum and leave the phase spectrum unchanged. This system tries to improve denoising performance with a denoising autoencoder neural network. The clean audio is estimated with a complex ideal ratio mask, which also enhances the phase information.
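
To illustrate why the phase matters, the following minimal sketch compares a magnitude-only mask (which reuses the noisy phase) with a complex ratio mask. The spectra are random toy data, not output of this repository, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy complex spectra (frequency bins x frames); real use would take these from an STFT.
clean = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))
noise = 0.5 * (rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3)))
noisy = clean + noise

# Magnitude-only enhancement: ideal magnitude ratio, noisy phase reused unchanged.
mag_mask = np.abs(clean) / np.abs(noisy)
mag_only = mag_mask * noisy

# Complex ratio mask: one complex gain per bin, correcting magnitude and phase.
complex_mask = clean / noisy
complex_est = complex_mask * noisy

# Largest per-bin deviation from the clean spectrum for each approach.
mag_only_error = float(np.max(np.abs(mag_only - clean)))
complex_error = float(np.max(np.abs(complex_est - clean)))
print(mag_only_error, complex_error)
```

Even with the ideal magnitude mask, the magnitude-only estimate keeps a residual error that comes entirely from the noisy phase, while the ideal complex mask recovers the clean spectrum up to floating-point precision.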

Structure

Input: audio data in the mel-frequency domain

Output: complex ratio mask (cRM) [1]
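
As a rough sketch of the mask in [1], the ideal complex mask M satisfies M * Y = S for a noisy spectrum Y and a clean spectrum S, so its real and imaginary parts follow from complex division. The function names and the `eps` guard below are assumptions made for illustration and are not taken from this repository.

```python
import numpy as np


def complex_ideal_ratio_mask(clean, noisy, eps=1e-12):
    """Return the real and imaginary parts of M such that M * noisy ~= clean."""
    yr, yi = noisy.real, noisy.imag
    sr, si = clean.real, clean.imag
    # eps is an assumption added here to avoid division by zero in silent bins.
    denom = yr ** 2 + yi ** 2 + eps
    mask_real = (yr * sr + yi * si) / denom
    mask_imag = (yr * si - yi * sr) / denom
    return mask_real, mask_imag


def apply_mask(mask_real, mask_imag, noisy):
    """Complex multiplication of the mask with the noisy spectrum."""
    return (mask_real + 1j * mask_imag) * noisy


# Two hand-picked time-frequency bins as toy data.
clean = np.array([1.0 + 2.0j, -0.5 + 0.3j])
noisy = np.array([2.0 - 1.0j, 0.1 + 0.4j])

mask_real, mask_imag = complex_ideal_ratio_mask(clean, noisy)
estimate = apply_mask(mask_real, mask_imag, noisy)
```

In a trained system the network would predict `mask_real` and `mask_imag` from the noisy input, and the same multiplication would produce the enhanced spectrum.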

The model is built as a stack of linear layers (2049-500-180) without weight lock [2].
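
The layer sizes above can be sketched as a plain NumPy forward pass. This is only an illustration under several assumptions that the README does not state: "without weight lock" is read as untied encoder and decoder weights, the decoder mirrors the encoder (180-500-2049), and tanh is used as the hidden activation. How the network output is mapped to the real and imaginary mask components is also not specified here.

```python
import numpy as np

LAYER_SIZES = (2049, 500, 180)  # encoder sizes as listed in the README


def init_untied_autoencoder(sizes=LAYER_SIZES, seed=0):
    """Create independent encoder and decoder weights (no weight sharing)."""
    rng = np.random.default_rng(seed)
    enc_dims = list(zip(sizes[:-1], sizes[1:]))         # (2049, 500), (500, 180)
    dec_dims = [(o, i) for i, o in reversed(enc_dims)]  # (180, 500), (500, 2049)

    def layer(n_in, n_out):
        # Scaled Gaussian initialization; the scheme is an assumption.
        return rng.standard_normal((n_in, n_out)) * np.sqrt(1.0 / n_in), np.zeros(n_out)

    return {
        "encoder": [layer(*d) for d in enc_dims],
        "decoder": [layer(*d) for d in dec_dims],
    }


def forward(params, x):
    """Return the 180-dimensional code and the 2049-dimensional output."""
    h = x
    for w, b in params["encoder"]:
        h = np.tanh(h @ w + b)
    code = h
    layers = params["decoder"]
    for k, (w, b) in enumerate(layers):
        h = h @ w + b
        if k < len(layers) - 1:  # keep the final layer linear
            h = np.tanh(h)
    return code, h


params = init_untied_autoencoder()
batch = np.zeros((8, 2049))  # 8 hypothetical input frames
code, out = forward(params, batch)
print(code.shape, out.shape)
```

With tied weights the decoder matrices would be the transposes of the encoder matrices; here they are drawn independently, so each layer is free to learn its own mapping.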

Source

youtube-dl : a command-line program to download videos from YouTube.com and a few more sites

SoX : a cross-platform command-line utility to convert various formats of audio files into other formats

FFmpeg : a complete, cross-platform solution to record, convert and stream audio and video

librosa : a Python package for music and audio analysis

Reference

[1] D. Williamson et al., "Complex Ratio Masking for Monaural Speech Separation," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 24, no. 3, March 2016

[2] Zhenzhou Wu, "Speech Synthesis with Deep Denoising Autoencoder"
