Source separation (speech enhancement) in Python 3.7

Separates a noisy recording into clean speech and noise using non-negative matrix factorization (NMF) algorithms.

This code is an unofficial implementation of the paper:
K. Kwon, J. W. Shin and N. S. Kim, "NMF-based source separation utilizing prior knowledge on encoding vector," 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Shanghai, 2016, pp. 479-483, doi: 10.1109/ICASSP.2016.7471721.
(Document: https://sapl.gist.ac.kr/wp-content/uploads/2017/01/NMF-based-source-separation-utilizing-prior-knowledge-on-encoding-vector.pdf)

NMF: standard NMF with Kullback-Leibler divergence (KLD) and multiplicative update rules (MuR)
NMF_g: NMF using gamma distribution with KLD and MuR
NMF_e: NMF using exponential distribution with KLD and MuR
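
As a point of reference for the standard NMF mode, the following is a minimal NumPy sketch of KLD-based NMF with multiplicative updates. It is not taken from this repository: the function names `nmf_kl` and `kl_divergence`, the random initialization, and the `eps` guard are assumptions made for illustration.

```python
import numpy as np


def kl_divergence(V, V_hat, eps=1e-12):
    """Generalized Kullback-Leibler divergence between V and its estimate."""
    return np.sum(V * np.log((V + eps) / (V_hat + eps)) - V + V_hat)


def nmf_kl(V, n_basis, n_iter=100, eps=1e-12, seed=0):
    """Factorize a non-negative matrix V (freq x frames) as W @ H.

    W (freq x n_basis) holds the basis vectors and H (n_basis x frames)
    holds the encoding vectors. Both are updated with the multiplicative
    rules for the KL divergence.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, n_basis)) + eps
    H = rng.random((n_basis, n_frames)) + eps
    ones = np.ones_like(V)
    for _ in range(n_iter):
        # H <- H * (W^T (V / WH)) / (W^T 1)
        H *= (W.T @ (V / (W @ H + eps))) / (W.T @ ones + eps)
        # W <- W * ((V / WH) H^T) / (1 H^T)
        W *= ((V / (W @ H + eps)) @ H.T) / (ones @ H.T + eps)
    return W, H
```

Because the updates only multiply by non-negative ratios, W and H stay non-negative as long as they are initialized that way.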

Speech enhancement based on NMF

The figure above shows the general workflow of NMF-based speech and noise separation. The encoding matrix of the training data, H_train, is usually discarded even though it carries useful information.
This project adds penalty terms to the separation phase of NMF-based source separation, based on prior knowledge about H.
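
The separation phase described above can be sketched as follows. This is an illustration under stated assumptions, not the repository's code: the function name `separate`, the argument names, and the Wiener-type gain formula are hypothetical, and the penalty is modeled as a single constant added to the denominator of the H update (the form a linear penalty on H would take). The paper's actual penalty terms are not reproduced here.

```python
import numpy as np


def separate(V_noisy, W_speech, W_noise, n_iter=30, penalty=0.0, p=2, eps=1e-12):
    """Split a noisy magnitude spectrogram using pre-trained, fixed bases.

    Only the encoding matrix H is updated. `penalty` is an assumed stand-in
    for the prior-based penalty terms: a constant added to the denominator
    of the multiplicative update.
    """
    W = np.hstack([W_speech, W_noise])       # bases stay fixed in this phase
    k_speech = W_speech.shape[1]
    rng = np.random.default_rng(0)
    H = rng.random((W.shape[1], V_noisy.shape[1])) + eps
    denom = W.T @ np.ones_like(V_noisy) + penalty + eps
    for _ in range(n_iter):
        H *= (W.T @ (V_noisy / (W @ H + eps))) / denom
    S = W_speech @ H[:k_speech]              # speech estimate
    N = W_noise @ H[k_speech:]               # noise estimate
    gain = S**p / (S**p + N**p + eps)        # Wiener-type gain with power p
    return gain * V_noisy, (1.0 - gain) * V_noisy
```

The two returned spectrograms always sum to the noisy input, so the gain only decides how the observed energy is divided between speech and noise.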

NMF_g:

NMF_e:

How to run?

The project directory is structured as follows:

├── datasets
│   ├── train
│   │   ├── speech
│   │   └── noise
│   └── test
│       └── speech
├── output
│   ├── merged_audio
│   ├── test_noisy_audio
│   ├── enhanced_audio
│   └── plot
├── work_module
│   └── nmf
└── util

The output directory is created automatically.
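
For readers who want to reproduce this layout in their own scripts, a minimal sketch is shown below. The helper name `ensure_output_dirs` is hypothetical and the repository may create the directories differently. Only the subdirectory names come from the tree above.

```python
import os

# Subdirectory names taken from the directory tree above.
OUTPUT_SUBDIRS = ("merged_audio", "test_noisy_audio", "enhanced_audio", "plot")


def ensure_output_dirs(root="output"):
    """Create the output subdirectories if they do not exist yet."""
    paths = [os.path.join(root, name) for name in OUTPUT_SUBDIRS]
    for path in paths:
        # exist_ok=True makes repeated runs harmless.
        os.makedirs(path, exist_ok=True)
    return paths
```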

Run on mini data as default option

Run run_main.py in an editor, or enter python run_main.py at the command prompt.
(The datasets directory contains audio files for testing. They are taken from the TIMIT and NOISEX-92 datasets.)

[Default Option]

Algorithm: NMF_e (using exponential distribution)
Sampling rate: 16 kHz
FFT size: 512
Window type: Hamming
Window size: 256
Overlap size: 192 (75%)
Max number of training iterations: 100
Max number of test iterations: 30
Number of the basis of speech: 128
Number of the basis of noise: 128
Threshold to check convergence: 0.5
Penalty rate for the penalty term: 0.005
Power of Wiener gain: 2
Plotting results
Save results as image files (\output\plot)
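
To show how the default analysis parameters fit together, here is a minimal NumPy sketch of a magnitude STFT. The function name `stft_magnitude` is hypothetical and the repository may rely on a library routine instead. The sketch assumes the input is at least one window long and drops any trailing samples that do not fill a frame.

```python
import numpy as np


def stft_magnitude(x, n_fft=512, win_size=256, overlap=192):
    """Magnitude spectrogram with a Hamming window and zero-padded FFT."""
    hop = win_size - overlap                  # 256 - 192 = 64 samples
    window = np.hamming(win_size)
    n_frames = 1 + (len(x) - win_size) // hop
    frames = np.stack(
        [x[i * hop: i * hop + win_size] * window for i in range(n_frames)],
        axis=1,
    )
    # Each 256-sample frame is zero-padded to 512 points before the FFT,
    # which gives n_fft // 2 + 1 = 257 frequency bins.
    return np.abs(np.fft.rfft(frames, n=n_fft, axis=0))
```

With these defaults, one second of audio at 16 kHz yields a 257 x 247 matrix, which is the non-negative input that NMF factorizes.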

User mode

To replace the mini data with your own data, put your datasets in the datasets directory or edit the directory paths in run_main.py.
To change the experiment parameters, either edit the default argument values in run_main.py or pass the parameters at the command prompt.
To list all adjustable parameters and their usage, run: python run_main.py --help
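
The relationship between default values and command-line overrides can be illustrated with a small argparse sketch. Only the flag names (`--nmf_mode`, `--penalty`, `--visualize`, `-v`) and the defaults come from this README. The `choices` lists, help strings, and the `build_parser` helper are assumptions, and the real run_main.py defines more parameters than shown here.

```python
import argparse


def build_parser():
    """Illustrative subset of the command-line options."""
    parser = argparse.ArgumentParser(
        description="NMF-based speech enhancement (illustrative sketch)"
    )
    parser.add_argument("--nmf_mode", default="NMF_e",
                        choices=["NMF", "NMF_g", "NMF_e"],
                        help="which NMF variant to run")
    parser.add_argument("--penalty", type=float, default=0.005,
                        help="penalty rate for the penalty term")
    parser.add_argument("-v", "--visualize", type=int, default=1,
                        choices=[0, 1],
                        help="1 to plot the results, 0 to skip plotting")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Any option left off the command line keeps its default, so editing a `default=` value and passing the flag explicitly are two routes to the same setting.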

Example usage in the command prompt:

Running the program with standard NMF algorithm:
```
python run_main.py --nmf_mode NMF 
```
Running the program with the penalty rate of 0.5:
```
python run_main.py --penalty 0.5
```

Running the program without plotting the results:

    python run_main.py --visualize 0

or

    python run_main.py -v 0

Results

Reference

TODO:

The speed of the NMF_g algorithm should be improved (estimating the distribution parameter takes too long).

Repository: jeongHwarr/Speech_Enhancement_NMF
