A PyTorch implementation of "TasNet: Surpassing Ideal Time-Frequency Masking for Speech Separation"
Latest commit f3333b0 Nov 14, 2019


ConvTasNet

A PyTorch implementation of "TasNet: Surpassing Ideal Time-Frequency Masking for Speech Separation".

Requirements

See requirements.txt.

Usage

  • separate
./nnet/separate.py /path/to/checkpoint --input /path/to/mix.scp --gpu 0 > separate.log 2>&1 &
  • evaluate
./nnet/compute_si_snr.py /path/to/ref_spk1.scp,/path/to/ref_spk2.scp /path/to/inf_spk1.scp,/path/to/inf_spk2.scp
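
The evaluation step scores each separated signal against its reference with scale-invariant SNR (SI-SNR). The following is a minimal, standard-library sketch of that metric for the two-speaker case. It is not the code in nnet/compute_si_snr.py. The function names `si_snr` and `best_permutation_si_snr` are hypothetical, and the search over speaker assignments is an assumption about how estimates are matched to references, not something this README states.

```python
import math
from itertools import permutations


def si_snr(estimate, reference, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length sample sequences."""
    if len(estimate) != len(reference):
        raise ValueError("estimate and reference must have the same length")
    n = len(reference)
    # Remove the mean so a DC offset does not affect the score.
    est_mean = sum(estimate) / n
    ref_mean = sum(reference) / n
    est = [x - est_mean for x in estimate]
    ref = [x - ref_mean for x in reference]
    # Project the estimate onto the reference to get the target component.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref) + eps
    scale = dot / ref_energy
    target = [scale * r for r in ref]
    # Whatever is left after removing the target counts as noise.
    noise = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(v * v for v in noise)
    return 10.0 * math.log10((target_energy + eps) / (noise_energy + eps))


def best_permutation_si_snr(estimates, references):
    """Return (mean SI-SNR, permutation) for the best speaker assignment.

    perm[i] is the index of the reference matched to estimates[i].
    Assumption: outputs may come out in any speaker order.
    """
    best_score, best_perm = -math.inf, None
    for perm in permutations(range(len(references))):
        score = sum(
            si_snr(estimates[i], references[j]) for i, j in enumerate(perm)
        ) / len(perm)
        if score > best_score:
            best_score, best_perm = score, perm
    return best_score, best_perm


if __name__ == "__main__":
    # Two synthetic "speakers": sine waves at different frequencies.
    spk1 = [math.sin(2 * math.pi * 5 * t / 800) for t in range(800)]
    spk2 = [math.sin(2 * math.pi * 13 * t / 800) for t in range(800)]
    # Estimates arrive in swapped order and with a different gain.
    score, perm = best_permutation_si_snr([[0.5 * x for x in spk2], spk1], [spk1, spk2])
    print(f"mean SI-SNR: {score:.2f} dB, assignment: {perm}")
```

Plain lists keep the sketch dependency-free. For real audio, the same arithmetic would normally be vectorized over arrays loaded from the files listed in the .scp inputs.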

Results (on the best configuration in the paper)

| ID | Settings | Causal | Norm | Param | Loss | Si-SDR (dB) |
|----|----------|--------|------|-------|------|-------------|
| 0 | adam/lr:1e-3/wd:1e-5/32-batch/2gpu | N | BN/relu | 8.75M | -17.59/-15.45 | 14.63 |
| 1 | adam/lr:1e-2/wd:1e-5/20-batch/2gpu | N | gLN/relu | - | -16.09/-15.21 | 14.58 |
| 2 | adam/lr:1e-3/wd:1e-5/20-batch/2gpu | N | gLN/relu | - | -17.91/-16.54 | 15.87 |
| 3 | adam/lr:1e-2/wd:1e-5/32-batch/2gpu | N | BN/sigmoid | - | -14.51/-13.40 | 12.62 |
| 4 | adam/lr:1e-2/wd:1e-5/32-batch/2gpu | N | BN/relu | - | -17.20/-15.38 | 14.58 |
| 5 | adam/lr:1e-3/wd:1e-5/20-batch/2gpu | N | gLN/sigmoid | - | -17.20/-16.11 | 15.55 |
| 6 | adam/lr:1e-3/wd:1e-5/32-batch/2gpu | Y | BN/relu | - | -15.25/-12.47 | 11.42 |
| 7 | adam/lr:1e-3/wd:1e-5/24-batch/2gpu | N | cLN/relu | - | -18.72/-16.17 | 15.25 |

Reference

Luo Y, Mesgarani N. TasNet: Surpassing Ideal Time-Frequency Masking for Speech Separation[J]. arXiv preprint arXiv:1809.07454, 2018.
