Norse - Non Recurrent Speech Enhancement

My master's thesis project, exploring various ways to enhance speech signals in a fully convolutional fashion.

Current features:

The work is inspired by the SEGAN and DEMUCS architectures, which both follow a similar U-Net-like design:

Input -> Encoder -> Bottleneck ops (RNN, noise sampling, etc.) -> Decoder -> Output <--> Loss function (discriminator D in the case of SEGAN, MSE in DEMUCS)

The idea is to explore which bottleneck operations, loss functions, network depths, upsampling methods, skip operations, and so on work best.
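Depth and skip operations interact through simple length arithmetic: each strided encoder layer shortens the signal, and the mirrored decoder only restores the original length for certain input sizes. The sketch below illustrates this bookkeeping. It is not code from this repository; the function names and the example settings (kernel 8, stride 4, depth 5, no padding) are assumptions chosen for illustration.

```python
import math


def round_trip(length, depth, kernel, stride):
    """Track signal length through `depth` unpadded strided convs and
    the mirrored transposed convs. Returns (encoder_lengths, decoder_lengths).
    Hypothetical helper, for illustration only."""
    enc, dec = [], []
    n = length
    for _ in range(depth):
        n = (n - kernel) // stride + 1  # strided conv, no padding
        enc.append(n)
    for _ in range(depth):
        n = (n - 1) * stride + kernel   # transposed conv, no padding
        dec.append(n)
    return enc, dec


def valid_length(length, depth, kernel, stride):
    """Smallest length >= `length` that the encoder/decoder pair
    reproduces exactly, so skip connections line up at every level."""
    n = length
    for _ in range(depth):
        n = max(math.ceil((n - kernel) / stride) + 1, 1)
    for _ in range(depth):
        n = (n - 1) * stride + kernel
    return n


if __name__ == "__main__":
    one_second = 16000  # assumed: 1 s of audio at 16 kHz
    print(round_trip(one_second, depth=5, kernel=8, stride=4))
    padded = valid_length(one_second, depth=5, kernel=8, stride=4)
    print(padded, round_trip(padded, depth=5, kernel=8, stride=4))
```

With these assumed settings, an unpadded 16000-sample input comes back shorter than it went in, and the encoder and decoder feature lengths disagree at every level. Padding the input up to `valid_length(...)` first is one way to make additive or concatenating skips line up.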

I really liked the design of SEGAN, but GANs are rather unstable to train, and deconv (transposed convolution) layers lead to checkerboard artifacts (heard as buzzing in 1D audio).
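One common explanation for these artifacts is uneven overlap: when the kernel size is not a multiple of the stride, some output samples receive more kernel contributions than others. The following sketch counts those contributions for a 1D transposed convolution. It is a toy illustration with a hypothetical helper name, not code from this project.

```python
def transposed_conv_coverage(n_in, kernel, stride):
    """Count how many (input sample, kernel tap) pairs land on each
    output position of an unpadded 1D transposed convolution."""
    out = [0] * ((n_in - 1) * stride + kernel)
    for i in range(n_in):
        for k in range(kernel):
            out[i * stride + k] += 1
    return out


if __name__ == "__main__":
    # Kernel 3, stride 2: the interior alternates between 1 and 2
    # contributions, which is the 1D analogue of a checkerboard.
    print(transposed_conv_coverage(4, kernel=3, stride=2))
    # Kernel 4, stride 2: the interior is covered evenly.
    print(transposed_conv_coverage(4, kernel=4, stride=2))
```

Choosing a kernel size divisible by the stride evens out the overlap, although the learned weights can still produce periodic patterns. Another commonly used alternative is to upsample first (for example nearest-neighbour or linear interpolation) and then apply an ordinary stride-1 convolution.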

DEMUCS performs really well, but it uses an LSTM module as its bottleneck operation, and LSTMs are slow to train because they process time steps sequentially.
