Ngram scorer update #1992

Merged
merged 4 commits into from
Jun 3, 2020
qmpzzpmq (Contributor) commented Jun 1, 2020

  1. Glitch fix.
  2. Added an option to run the ngram scorer as a full or part scorer, to speed up decoding.
  3. Transformed its score to a local score.
  4. Updated RESULT: test CER drops from 6.7% to 6.6%.

| ngram scorer | dev set decoding time (s) |
| --- | --- |
| full | 9860 |
| part | 4090 |

@mergify mergify bot added the README label Jun 1, 2020
codecov bot commented Jun 1, 2020

Codecov Report

Merging #1992 into develop will decrease coverage by 0.25%.
The diff coverage is 0.00%.

@@             Coverage Diff             @@
##           develop    #1992      +/-   ##
===========================================
- Coverage    62.24%   61.98%   -0.26%     
===========================================
  Files          258      258              
  Lines        21907    21997      +90     
===========================================
  Hits         13635    13635              
- Misses        8272     8362      +90     
| Impacted Files | Coverage Δ |
| --- | --- |
| espnet/asr/pytorch_backend/recog.py | 0.00% <0.00%> (ø) |
| espnet/bin/asr_recog.py | 0.00% <0.00%> (ø) |
| espnet/nets/scorers/ngram.py | 0.00% <0.00%> (ø) |
| espnet/tts/pytorch_backend/tts.py | 0.00% <0.00%> (ø) |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e6f818c...e5810be. Read the comment docs.

qmpzzpmq (Contributor Author) commented Jun 1, 2020

@sw005320
Hi Shinji, could you give me feedback on this Codecov issue? I am not familiar with it. But I cleaned up my commit log this time.

sw005320 (Contributor) commented Jun 1, 2020

Cool!
Thanks for the update!

> @sw005320
> Hi Shinji, could you give me feedback on this Codecov issue? I am not familiar with it. But I cleaned up my commit log this time.

You can ignore this.
This check will pass once you add some new classes or functions with appropriate documentation. These types of modifications are not important.

sw005320 (Contributor) left a comment

LGTM

type=str,
default="full",
choices=("full", "part"),
help="ngram scorer choices",
Contributor

Could you explain a bit more about the full and part scorer options?

Contributor Author

If the ngram is set as a part scorer, then, like the CTC scorer, it scores only the top candidate IDs.
If the ngram is set as a full scorer, it scores all IDs.
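
To make the distinction concrete, here is a minimal sketch. It is not the ESPnet implementation: the toy vocabulary size, the hypothetical `ngram_logprob` function standing in for the language model query, and the hand-picked `top_ids` list are all assumptions made for illustration.

```python
import math

VOCAB_SIZE = 6  # toy vocabulary size (assumption for illustration)


def ngram_logprob(history, token_id):
    """Hypothetical stand-in for an n-gram LM query; returns a log-probability."""
    # Toy model: favour token IDs close to the last token in the history.
    last = history[-1] if history else 0
    weights = [1.0 / (1 + abs(t - last)) for t in range(VOCAB_SIZE)]
    return math.log(weights[token_id] / sum(weights))


def score_full(history):
    """Full scoring: query the LM for every token ID in the vocabulary."""
    return {t: ngram_logprob(history, t) for t in range(VOCAB_SIZE)}


def score_partial(history, candidate_ids):
    """Partial scoring: query the LM only for the pre-selected candidate IDs."""
    return {t: ngram_logprob(history, t) for t in candidate_ids}


history = [2]
top_ids = [1, 2, 3]  # e.g. the best candidates according to the other scorers
full = score_full(history)
part = score_partial(history, top_ids)
```

In this sketch both variants assign identical scores to the candidates they share; the partial variant only makes fewer language model queries per step, which would be consistent with the shorter decoding time reported in this PR.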

Contributor

Oh sorry, I meant to say that it’s better to add such descriptions. Probably in the help message?

Contributor Author

good idea, I will add it.
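
As an illustration of the suggestion in this thread, the option could carry the explanation in its help text roughly as follows. This is only a sketch: the flag name `--ngram-scorer` and the help wording are assumptions, since the excerpt under review shows only the keyword arguments.

```python
import argparse

parser = argparse.ArgumentParser()
# "--ngram-scorer" is an assumed flag name; the reviewed excerpt shows only
# the keyword arguments (type, default, choices, help).
parser.add_argument(
    "--ngram-scorer",
    type=str,
    default="full",
    choices=("full", "part"),
    help=(
        "ngram scorer mode: 'full' scores all token IDs, "
        "'part' scores only the top candidate IDs, like the CTC scorer"
    ),
)

args = parser.parse_args(["--ngram-scorer", "part"])
print(args.ngram_scorer)
```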

| Sum/Avg | 14326 205341 | 94.1 5.7 0.2 0.1 6.0 41.7 |
exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/decode_test_decode_pytorch_transformer_lm0.7_4gram_0.3/result.txt
| SPKR | # Snt # Wrd | Corr Sub Del Ins Err S.Err |
| Sum/Avg | 7176 104765 | 93.5 6.3 0.2 0.1 6.6 44.6 |
```
- only e2e model
exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/decode_dev_decode_pytorch_transformer/result.txt
| SPKR | # Snt # Wrd | Corr Sub Del Ins Err S.Err |
| Sum/Avg | 14326 205341 | 93.6 6.2 0.2 0.1 6.5 45.6 |
exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/decode_test_decode_pytorch_transformer/result.txt

Contributor

These lines should be in a code block. Add "```", i.e.,

exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/decode_dev_decode_pytorch_transformer/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub        Del        Ins        Err      S.Err   |
|   Sum/Avg  |   14326       205341 |   93.6        6.2        0.2        0.1        6.5      45.6    |
exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/decode_test_decode_pytorch_transformer/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub        Del        Ins        Err      S.Err   |
|   Sum/Avg  |   7176       104765  |   92.7        7.1        0.2        0.1        7.4      49.8    |

@ShigekiKarita ShigekiKarita added this to the v.0.8.0 milestone Jun 2, 2020
@ShigekiKarita ShigekiKarita added the Enhancement label Jun 2, 2020
sw005320 (Contributor) commented Jun 2, 2020

Sorry to ask many things.
Is it possible to add a partial scorer test as well?

qmpzzpmq (Contributor Author) commented Jun 2, 2020

@sw005320 I will try

qmpzzpmq (Contributor Author) commented Jun 2, 2020

@sw005320 done
The ngram full and part scorers give the same result, but in my test the full scorer takes 9860 s to decode the dev set, while the part scorer takes only 4090 s.
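
For reference, the two timings above work out as follows. This is plain arithmetic on the reported numbers; the variable names are illustrative only.

```python
full_seconds = 9860  # reported dev set decoding time with the full scorer
part_seconds = 4090  # reported dev set decoding time with the part scorer

speedup = full_seconds / part_seconds
time_saved = 1 - part_seconds / full_seconds

print(f"speedup: {speedup:.2f}x, time saved: {time_saved:.1%}")
# prints "speedup: 2.41x, time saved: 58.5%"
```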

espnet/bin/asr_recog.py (outdated review thread, resolved)
@sw005320 sw005320 mentioned this pull request Jun 2, 2020
10 tasks
@sw005320 sw005320 merged commit cca79e6 into espnet:develop Jun 3, 2020
sw005320 (Contributor) commented Jun 3, 2020

Many thanks!
