convolutional-autoencoder-for-raw-waveform-reconstruction

A convolutional autoencoder for raw waveform reconstruction, intended to replace the classic STFT. I call it the short-time AE transform (STAET).

For now, it can reconstruct raw audio waveforms. The network applies convolution + pooling, then deconvolution + upsampling.
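The convolution + pooling, then deconvolution + upsampling data flow can be sketched as a forward pass in plain NumPy. This is only an illustration of the shapes involved, not the code in `time_convAD_v4.py`: the frame length (256), kernel size (9), number of filters (8), pooling factor (2), `tanh` activation and the random, untrained weights are all assumptions, and the decoder here uses upsampling followed by an ordinary convolution as a stand-in for the deconvolution step.

```python
import numpy as np

def conv1d_same(x, w, b):
    """1-D convolution with zero padding so the output keeps length T.
    x: (T, C_in), w: (K, C_in, C_out), b: (C_out,) -> (T, C_out)."""
    k = w.shape[0]
    left = (k - 1) // 2
    xp = np.pad(x, ((left, k - 1 - left), (0, 0)))
    # Stack K shifted views of the input: shape (T, K, C_in).
    windows = np.stack([xp[i:i + x.shape[0]] for i in range(k)], axis=1)
    return np.tensordot(windows, w, axes=([1, 2], [0, 1])) + b

def max_pool1d(x, size):
    """Non-overlapping max pooling over time; T must be divisible by size."""
    t, c = x.shape
    return x.reshape(t // size, size, c).max(axis=1)

def upsample1d(x, size):
    """Nearest-neighbour upsampling: repeat every time step `size` times."""
    return np.repeat(x, size, axis=0)

def autoencode(frame, enc_w, enc_b, dec_w, dec_b, pool=2):
    """Encode one waveform frame to a pooled code, then decode it back."""
    h = np.tanh(conv1d_same(frame[:, None], enc_w, enc_b))  # (T, F)
    code = max_pool1d(h, pool)                              # (T / pool, F)
    h = upsample1d(code, pool)                              # (T, F)
    recon = conv1d_same(h, dec_w, dec_b)[:, 0]              # (T,)
    return code, recon

# Hypothetical sizes; the weights are random, so the output is not yet
# a good reconstruction. Training would minimise mean((recon - frame) ** 2).
rng = np.random.default_rng(0)
enc_w, enc_b = 0.1 * rng.standard_normal((9, 1, 8)), np.zeros(8)
dec_w, dec_b = 0.1 * rng.standard_normal((9, 8, 1)), np.zeros(1)

# One 256-sample frame of a 440 Hz tone sampled at 16 kHz.
frame = np.sin(2 * np.pi * 440 * np.arange(256) / 16000)
code, recon = autoencode(frame, enc_w, enc_b, dec_w, dec_b)
print(code.shape, recon.shape)  # (128, 8) (256,)
```

The pooled `code` plays the role that an STFT frame would otherwise play: a short-time representation from which the waveform frame is rebuilt.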

Pixels work very well as features for image processing, so why not raw waveforms for audio or speech processing? I know Tara Sainath at Google has some work using CLDNNs (convolutional LSTM networks) to model raw waveforms for speech recognition. It is still very difficult, though, especially on small datasets, because all of the FIR filters have to be learned from data.
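To make the FIR remark concrete: each kernel of a 1-D convolutional layer acts on the waveform as a finite impulse response filter, so what the network has to learn is the filter taps themselves. The sketch below uses hand-picked taps (an assumption for illustration; in the network they would be trained) and checks that causal FIR filtering matches the sliding dot product a convolutional layer computes with the time-reversed kernel.

```python
import numpy as np

def fir_filter(x, taps):
    """Causal FIR filter: y[n] = sum_k taps[k] * x[n - k]."""
    return np.convolve(x, taps)[: len(x)]

def conv_layer_1d(x, kernel):
    """Sliding dot product (cross-correlation), as deep learning
    'convolution' layers compute it, with causal zero padding."""
    xp = np.pad(x, (len(kernel) - 1, 0))
    return np.correlate(xp, kernel, mode="valid")

taps = np.array([0.5, 0.3, 0.2])  # hand-picked, not learned
impulse = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])

# The impulse response of an FIR filter is just its taps.
y_fir = fir_filter(impulse, taps)

# A conv layer whose kernel is the reversed taps gives the same output.
y_conv = conv_layer_1d(impulse, taps[::-1])

print(y_fir)
print(np.allclose(y_fir, y_conv))
```

With a spectrogram front end the analysis filters are fixed in advance; on raw waveforms every such kernel starts from random values, which is one way to see why small datasets make the problem hard.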

Contact: yx0001@surrey.ac.uk

Repository files (7 commits): README.md, config.py, time_convAD_test.py, time_convAD_v4.py

yongxuUSTC/convolutional-autoencoder-for-raw-waveform-reconstruction
