Add dynamic mixing in the speech separation task. #4387
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
|
||
encoder: stft | ||
encoder_conf: | ||
n_fft: 512 | ||
hop_length: 128 | ||
decoder: stft | ||
decoder_conf: | ||
n_fft: 512 | ||
hop_length: 128 | ||
separator: rnn | ||
separator_conf: | ||
rnn_type: blstm | ||
num_spk: 2 | ||
nonlinear: relu | ||
layer: 1 | ||
unit: 128 | ||
dropout: 0.2 | ||
|
||
Can you add a comment about this configuration?
Done.
||
# dynamic_mixing related | ||
# dynamic_mixing_gain_db: | ||
# The maximum random gain (in dB) for each source before the mixing. | ||
# The gain (in dB) of each source is uniformly sampled in | |
# [-dynamic_mixing_gain_db, dynamic_mixing_gain_db] | ||
dynamic_mixing: True | ||
dynamic_mixing_gain_db: 2.0 | ||
|
||
criterions: | ||
# The first criterion | ||
- name: mse | ||
conf: | ||
compute_on_mask: True | ||
mask_type: PSM^2 | ||
# the wrapper for the current criterion | ||
# for the single-talker case, we simply use the fixed_order wrapper | |
wrapper: fixed_order | ||
wrapper_conf: | ||
weight: 1.0 |
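The `dynamic_mixing_gain_db` comment in this configuration describes the per-source gain sampling in words. A minimal Python sketch of that idea follows; it is not the code from this PR. The function names `db_to_linear` and `dynamic_mix` are hypothetical, and the sketch assumes the gain is an amplitude gain (a factor of `10 ** (dB / 20)`), which the configuration comment does not state.

```python
import math
import random


def db_to_linear(gain_db):
    """Convert a gain in dB to a linear factor (assumes amplitude gain)."""
    return 10.0 ** (gain_db / 20.0)


def dynamic_mix(sources, max_gain_db, rng):
    """Mix equal-length sources after applying a random gain to each one.

    Each gain (in dB) is drawn uniformly from [-max_gain_db, max_gain_db],
    mirroring the comment on dynamic_mixing_gain_db. Hypothetical helper.
    Returns the mixture, the rescaled sources, and the sampled gains.
    """
    gains_db = [rng.uniform(-max_gain_db, max_gain_db) for _ in sources]
    scaled = [
        [db_to_linear(g) * x for x in src]
        for g, src in zip(gains_db, sources)
    ]
    # The mixture is the sample-wise sum of the rescaled sources.
    mixture = [sum(samples) for samples in zip(*scaled)]
    return mixture, scaled, gains_db


rng = random.Random(0)
src_a = [0.1, -0.2, 0.3, 0.0]
src_b = [0.05, 0.05, -0.1, 0.2]
mixture, scaled, gains_db = dynamic_mix([src_a, src_b], max_gain_db=2.0, rng=rng)
```

With `dynamic_mixing_gain_db: 2.0`, each source is scaled by a factor between roughly 0.79 and 1.26 under the amplitude-gain assumption. The rescaled sources, not the originals, would then serve as training targets for the mixture.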
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,77 @@ | ||
init: xavier_uniform | ||
max_epoch: 150 | ||
batch_type: folded | ||
# When dynamic mixing is enabled, the actual batch_size will | ||
# be (batch_size / num_spk) | ||
batch_size: 8 | ||
iterator_type: chunk | ||
chunk_length: 16000 | ||
num_workers: 2 | ||
optim: adamw | ||
optim_conf: | ||
lr: 1.0e-03 | ||
eps: 1.0e-06 | ||
weight_decay: 0 | ||
patience: 20 | ||
grad_clip: 5.0 | ||
val_scheduler_criterion: | ||
- valid | ||
- loss | ||
best_model_criterion: | ||
- - valid | ||
- si_snr | ||
- max | ||
- - valid | ||
- loss | ||
- min | ||
keep_nbest_models: 10 | ||
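The comment on `batch_size` says the effective batch size becomes `batch_size / num_spk` under dynamic mixing. One way to read this, sketched below as an assumption rather than as the PR's implementation, is that a batch of single-source utterances is split into groups of `num_spk`, and each group yields one mixture. The helper names are hypothetical, and the divisibility check is an added assumption.

```python
def effective_batch_size(batch_size, num_spk):
    """Number of mixtures per batch when sources are mixed on the fly."""
    # Assumption: the loaded batch splits evenly into groups of num_spk.
    if batch_size % num_spk != 0:
        raise ValueError("batch_size should be divisible by num_spk")
    return batch_size // num_spk


def group_sources(utterance_ids, num_spk):
    """Split a batch of single-source utterances into groups of num_spk."""
    return [
        utterance_ids[i : i + num_spk]
        for i in range(0, len(utterance_ids), num_spk)
    ]


batch = [f"utt{i}" for i in range(8)]
groups = group_sources(batch, num_spk=2)
```

For `batch_size: 8` and `num_spk: 2`, this gives 4 mixtures per batch, each built from 2 utterances.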
|
||
scheduler: steplr | ||
scheduler_conf: | ||
step_size: 2 | ||
gamma: 0.97 | ||
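To see what `step_size: 2` and `gamma: 0.97` imply for the learning rate, the sketch below computes a step-decay schedule in plain Python. It assumes `steplr` follows the usual step-decay rule (multiply the rate by `gamma` once every `step_size` epochs), as PyTorch's `StepLR` does; the function `step_lr` is a hypothetical helper, not part of the PR.

```python
def step_lr(base_lr, gamma, step_size, epoch):
    """Learning rate at a given epoch under step decay.

    The rate is multiplied by gamma once every step_size epochs.
    """
    return base_lr * gamma ** (epoch // step_size)


# Values taken from the configuration: lr=1.0e-03, step_size=2, gamma=0.97.
schedule = [step_lr(1.0e-3, 0.97, 2, epoch) for epoch in range(151)]
```

Under this assumption the rate stays at 1.0e-3 for epochs 0 and 1, drops to 9.7e-4 at epoch 2, and ends at about one tenth of its initial value if training runs for the full `max_epoch: 150`.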
|
||
Ditto.
Done.
||
# dynamic_mixing related | ||
# dynamic_mixing_gain_db: | ||
# The maximum random gain (in dB) for each source before the mixing. | ||
# The gain (in dB) of each source is uniformly sampled in | |
# [-dynamic_mixing_gain_db, dynamic_mixing_gain_db] | ||
dynamic_mixing: True | ||
dynamic_mixing_gain_db: 2.0 | ||
|
||
encoder: conv | ||
encoder_conf: | ||
channel: 64 | ||
kernel_size: 2 | ||
stride: 1 | ||
decoder: conv | ||
decoder_conf: | ||
channel: 64 | ||
kernel_size: 2 | ||
stride: 1 | ||
separator: skim | ||
separator_conf: | ||
causal: False | ||
num_spk: 2 | ||
layer: 6 | ||
nonlinear: relu | ||
unit: 128 | ||
segment_size: 250 | ||
dropout: 0.05 | ||
mem_type: hc | ||
seg_overlap: True | ||
|
||
# A list for criterions | ||
# The overall loss in multi-task learning will be: | |
# loss = weight_1 * loss_1 + ... + weight_N * loss_N | ||
# The default `weight` for each sub-loss is 1.0 | ||
criterions: | ||
# The first criterion | ||
- name: si_snr | ||
conf: | ||
eps: 1.0e-6 | ||
wrapper: pit | ||
wrapper_conf: | ||
weight: 1.0 | ||
independent_perm: True | ||
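The comment above the `criterions` list gives the overall loss as a weighted sum of the sub-losses. A small sketch of that formula follows; `combine_losses` is a hypothetical name, and the default weight of 1.0 is taken from the configuration comment.

```python
def combine_losses(losses, weights=None):
    """Return weight_1 * loss_1 + ... + weight_N * loss_N.

    When weights is omitted, every sub-loss gets the default weight 1.0.
    """
    if weights is None:
        weights = [1.0] * len(losses)
    if len(weights) != len(losses):
        raise ValueError("need exactly one weight per loss")
    return sum(w * l for w, l in zip(weights, losses))


# A single criterion with weight 1.0, as in this configuration.
single = combine_losses([-12.5])
# Two criteria with illustrative (made-up) loss values and weights.
multi = combine_losses([-12.5, 0.5], weights=[1.0, 0.5])
```

With only one criterion and `weight: 1.0`, the overall loss is just that criterion's value.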
|
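The `si_snr` criterion with the `pit` wrapper pairs a scale-invariant SNR loss with permutation invariant training. The sketch below shows the general technique in plain Python under several assumptions: it uses one common SI-SNR definition without mean removal, places `eps` where it conveniently avoids division by zero (real implementations differ), and searches permutations exhaustively. The function names are hypothetical, and the behaviour of `independent_perm` is not modelled.

```python
import itertools
import math


def si_snr(estimate, reference, eps=1.0e-6):
    """Scale-invariant SNR in dB between two equal-length signals."""
    dot = sum(e * r for e, r in zip(estimate, reference))
    ref_energy = sum(r * r for r in reference)
    # Project the estimate onto the reference to get the target component.
    scale = dot / (ref_energy + eps)
    target = [scale * r for r in reference]
    noise = [e - t for e, t in zip(estimate, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(n * n for n in noise)
    return 10.0 * math.log10((target_energy + eps) / (noise_energy + eps))


def pit_si_snr_loss(estimates, references, eps=1.0e-6):
    """Lowest mean negative SI-SNR over all speaker permutations.

    perm[i] is the index of the estimate assigned to reference i.
    """
    best_loss, best_perm = None, None
    for perm in itertools.permutations(range(len(references))):
        loss = -sum(
            si_snr(estimates[p], references[i], eps)
            for i, p in enumerate(perm)
        ) / len(perm)
        if best_loss is None or loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


refs = [[1.0, 0.0, -1.0, 0.0], [0.0, 1.0, 0.0, -1.0]]
# Estimates are rescaled copies of the references in swapped order.
ests = [[0.0, 2.0, 0.0, -2.0], [0.5, 0.0, -0.5, 0.0]]
loss, perm = pit_si_snr_loss(ests, refs)
```

In this toy case the search picks the swapped assignment, and the rescaling of each estimate does not hurt the score because the measure is scale invariant.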
Maybe it is better to echo some message here to warn the user that dynamic mixing is being used.
I agree. Can you add a comment, @LiChenda?
Done.