
Add dynamic mixing in the speech separation task. #4387

Merged
merged 13 commits into espnet:master from LiChenda:dynamic_mixing on Aug 10, 2022

Conversation

@LiChenda (Contributor) commented May 23, 2022

This PR adds a simple dynamic mixing (DM) option for the speech separation task.
The current version supports mixing a variable number of speakers.
To enable DM, set the related options in the training config file:

model_conf:
    dynamic_mixing: True
    dynamic_mixing_gain_db: 2.0

and pass dynamic_mixing=true to enh.sh.

dynamic_mixing_gain_db is the maximum random gain (in dB) applied to each source before mixing.
The gain (in dB) of each source is uniformly sampled from [-dynamic_mixing_gain_db, dynamic_mixing_gain_db]. The default value is 0.0.
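For illustration only, here is a minimal sketch of the sampling described above (a uniform gain in dB, converted to a linear factor before summing the sources). The function name mix_sources and the use of NumPy are assumptions for this sketch, not the actual ESPnet2 preprocessor code:

    import numpy as np

    def mix_sources(sources, dynamic_mixing_gain_db=2.0, rng=None):
        """Mix source waveforms after applying a random gain to each source.

        sources: list of 1-D numpy arrays of equal length, one per speaker.
        dynamic_mixing_gain_db: maximum absolute gain in dB (0.0 disables it).
        """
        rng = rng if rng is not None else np.random.default_rng()
        scaled = []
        for src in sources:
            # Sample the gain uniformly in [-g, +g] dB and convert it to a
            # linear amplitude factor.
            gain_db = rng.uniform(-dynamic_mixing_gain_db, dynamic_mixing_gain_db)
            scaled.append(src * 10.0 ** (gain_db / 20.0))
        # The mixture is the sum of the rescaled sources; the rescaled sources
        # serve as the separation references.
        return np.sum(scaled, axis=0), scaled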

When applying DM, wav.scp and spk{2~N}.scp are no longer needed as they are in a regular training set (only the source file list is needed). So far I haven't figured out how to unify the data preparation stages (stages 1~5) for datasets with and without DM.
Also, even when DM is used for training, the validation and test sets should be prepared without DM.

So, in the current version of DM, users need to manually collect all the source files into spk1.scp for the training set.
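As an aside (not part of this PR), a hypothetical helper for that manual step might look like the sketch below; the directory layout, the utterance-ID suffixing, and the name collect_sources are assumptions for illustration only:

    from pathlib import Path

    def collect_sources(data_dir, num_spk, out_scp):
        """Merge spk1.scp ... spk{num_spk}.scp into a single source list."""
        lines = []
        for n in range(1, num_spk + 1):
            for line in Path(data_dir, f"spk{n}.scp").read_text().splitlines():
                utt_id, wav_path = line.split(maxsplit=1)
                # Suffix the utterance ID so that entries coming from different
                # spk*.scp files stay unique in the merged list.
                lines.append(f"{utt_id}_src{n} {wav_path}")
        out = Path(out_scp)
        out.parent.mkdir(parents=True, exist_ok=True)
        out.write_text("\n".join(lines) + "\n")

    # e.g. collect_sources("data/tr", num_spk=2, out_scp="data/tr_dm/spk1.scp")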

@mergify mergify bot added the ESPnet2 label May 23, 2022

codecov bot commented May 23, 2022

Codecov Report

Merging #4387 (d0117f7) into master (f2778f7) will increase coverage by 0.06%.
The diff coverage is 56.98%.

@@            Coverage Diff             @@
##           master    #4387      +/-   ##
==========================================
+ Coverage   82.40%   82.46%   +0.06%     
==========================================
  Files         481      487       +6     
  Lines       41238    42112     +874     
==========================================
+ Hits        33982    34729     +747     
- Misses       7256     7383     +127     
Flag                        Coverage Δ
test_integration_espnet1    66.36% <ø> (-0.02%) ⬇️
test_integration_espnet2    48.27% <56.98%> (-0.90%) ⬇️
test_python                 69.65% <23.65%> (+0.25%) ⬆️
test_utils                  23.29% <ø> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown.

Impacted Files                                      Coverage Δ
espnet2/enh/espnet_model.py                         85.48% <ø> (-9.85%) ⬇️
espnet2/train/preprocessor.py                       37.87% <43.47%> (+1.15%) ⬆️
espnet2/tasks/enh.py                                99.33% <95.83%> (+0.10%) ⬆️
espnet2/utils/kwargs2args.py                        21.42% <0.00%> (-71.43%) ⬇️
espnet2/enh/separator/fasnet_separator.py           90.00% <0.00%> (-6.16%) ⬇️
espnet2/enh/separator/skim_separator.py             92.10% <0.00%> (-4.77%) ⬇️
espnet/nets/pytorch_backend/e2e_asr_maskctc.py      87.40% <0.00%> (-4.45%) ⬇️
espnet2/train/trainer.py                            76.59% <0.00%> (-2.51%) ⬇️
espnet2/enh/loss/criterions/time_domain.py          95.07% <0.00%> (-1.49%) ⬇️
... and 28 more



mergify bot commented Jun 6, 2022

This pull request is now in conflict :(

@mergify mergify bot added the conflicts label Jun 6, 2022
@mergify mergify bot removed the conflicts label Jun 8, 2022
@Emrys365 Emrys365 added the New Features and SE (Speech enhancement) labels Jun 8, 2022
espnet2/tasks/enh.py (two outdated review threads, resolved)
@Emrys365 (Collaborator) left a comment

LGTM in general.

@Emrys365 (Collaborator) commented:

BTW, could you add an integration test for this new preprocessor? You can refer to https://github.com/espnet/espnet/blob/master/ci/test_integration_espnet2.sh#L87

@mergify mergify bot added the CI (Travis, Circle CI, etc) label Jul 20, 2022
egs2/TEMPLATE/enh1/enh.sh (outdated review thread, resolved)
done

else
# prepare train and valid data parameters
Collaborator

Maybe it is better to echo some message here to warn the user that dynamic mixing is being used.

Contributor

I agree. Can you add a comment, @LiChenda?

Contributor Author

Done.

@Emrys365 (Collaborator) left a comment

LGTM

Co-authored-by: Wangyou Zhang <C0me_On@163.com>
@sw005320 (Contributor) left a comment

LGTM.
I have some minor comments.


layer: 1
unit: 128
dropout: 0.2

Contributor

Can you add a comment about this configuration?

Contributor Author

done.

scheduler_conf:
step_size: 2
gamma: 0.97

Contributor

ditto

Contributor Author

Done.

@sw005320 (Contributor) commented Aug 4, 2022

@LiChenda, can you add some comments?
After that, I’ll merge this PR.

@LiChenda (Contributor, Author) commented Aug 4, 2022

> @LiChenda, can you add some comments? After that, I’ll merge this PR.

Oh, I forgot to handle it. I'll do it now.

@sw005320 sw005320 added the auto-merge (Enable auto-merge) label Aug 4, 2022
@sw005320 sw005320 added this to the v.202209 milestone Aug 4, 2022
@sw005320 (Contributor) commented Aug 4, 2022

Thanks!

@mergify mergify bot merged commit 24b12f8 into espnet:master Aug 10, 2022
@LiChenda LiChenda deleted the dynamic_mixing branch May 19, 2023 06:23
Labels: auto-merge (Enable auto-merge), CI (Travis, Circle CI, etc), ESPnet2, New Features, SE (Speech enhancement)

3 participants