Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed CHiME-7 DASR adding evaluation inference + adding support to use diarization baseline "pre-computed" JSONs (new PR) #5228

Merged
merged 3 commits into from Jun 14, 2023

Conversation

popcornell
Copy link
Contributor

@popcornell popcornell commented Jun 12, 2023

new PR from #5183 which had problems due to error of applying black.
@simpleoier I made another PR and closed the other one.

@codecov
Copy link

codecov bot commented Jun 12, 2023

Codecov Report

Merging #5228 (a959fce) into master (096e2bb) will decrease coverage by 0.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #5228      +/-   ##
==========================================
- Coverage   74.43%   74.43%   -0.01%     
==========================================
  Files         642      642              
  Lines       57611    57605       -6     
==========================================
- Hits        42885    42880       -5     
+ Misses      14726    14725       -1     
Flag Coverage Δ
test_integration_espnet1 66.28% <ø> (ø)
test_integration_espnet2 47.52% <ø> (ø)
test_python 65.14% <ø> (-0.01%) ⬇️
test_utils 23.28% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

see 5 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@popcornell
Copy link
Contributor Author

popcornell commented Jun 14, 2023

@simpleoier can you take a look ? This is important for the challenge to merge ASAP.
Also now I am modifying only challenge related files so it should be good to go

Copy link
Collaborator

@simpleoier simpleoier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the late action. Please ping me on slack.

asr_tt_set="kaldi/chime6/dev/gss kaldi/dipco/dev/gss/ kaldi/mixer6/dev/gss/"
elif
[ $decode_only == "eval" ]; then
# apply gss only on dev
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment should be updated as eval, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

@@ -80,9 +80,15 @@ asr_batch_size=$(calc_int 128*$ngpu) # reduce 128 bsz if you get OOMs errors
asr_max_lr=$(calc_float $ngpu/10000.0)
asr_warmup=$(calc_int 40000.0/$ngpu)

if [ $decode_only -eq 1 ]; then
if [ $decode_only == "dev" ]; then
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will you add a check if decode_only is neither dev nor eval?
I feel decode_only is a bit confusing. How about using skip_train (default: false) to control whether to process the training set? And using test_sets to let users define the gss and asr_tt sets? For example

test_sets="chime6_dev dipco_dev mixer6_dev"
gss_dsets+="${test_sets}"

test_sets_list=(${test_sets// / })
asr_tt_sets=
for dset in ${test_sets_list[@]}; do
    asr_tt_sets+=$(echo "kaldi/${dset}/gss " | tr "_" "/") 

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is nice but will require too much change in the codebase at this stage.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, indeed. We may not need to update this at the moment.

@@ -26,7 +26,7 @@ background_snrs="20:10:15:5:0"
gss_dsets=$(echo $gss_dsets | tr "," " ") # split by commas


if [ $decode_only == 1 ]; then
if [ -n "$decode_only" ]; then
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can use ${skip_train} and ${test_sets} instead of decode_only.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as before, we cannot change extensively the recipe at this point. I added the check for the argument.
Note that this will also require to change all two README.md and notify participants.

you can run:
```bash
./run.sh --chime7-root YOUR_PATH_TO_CHiME7_ROOT --stage 2 --ngpu YOUR_NUMBER_OF_GPUs \
--use-pretrained popcornell/chime7_task1_asr1_baseline \
--decode-only 1 --gss-max-batch-dur 30-360-DEPENDING_ON_GPU_MEM \
--decode-only dev --gss-max-batch-dur 30-360-DEPENDING_ON_GPU_MEM \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

skip_train and test_sets instead of decode-only.

@popcornell
Copy link
Contributor Author

I added the check for checking the decode_only argument but we cannot make extensive changes at this point.
This is fine the way it is and making changes is too much hassle, we would need to retest everything and will confuse participants.
We should merge this recipe

@simpleoier simpleoier added the auto-merge Enable auto-merge label Jun 14, 2023
@mergify mergify bot merged commit 2839fb7 into espnet:master Jun 14, 2023
24 of 25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants