Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add e-branchformer result for tedlium3 and add checker for text output length #5130

Merged
merged 8 commits into from May 2, 2023

Conversation

Some-random
Copy link
Contributor

No description provided.

@codecov
Copy link

codecov bot commented Apr 20, 2023

Codecov Report

Merging #5130 (cac7f59) into master (2219358) will decrease coverage by 0.01%.
The diff coverage is 50.00%.

@@            Coverage Diff             @@
##           master    #5130      +/-   ##
==========================================
- Coverage   74.99%   74.99%   -0.01%     
==========================================
  Files         618      618              
  Lines       55586    55588       +2     
==========================================
+ Hits        41688    41689       +1     
- Misses      13898    13899       +1     
Flag Coverage Δ
test_integration_espnet1 66.28% <ø> (-0.01%) ⬇️
test_integration_espnet2 47.61% <0.00%> (-0.01%) ⬇️
test_python 65.45% <50.00%> (-0.01%) ⬇️
test_utils 23.28% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
espnet2/train/preprocessor.py 29.16% <50.00%> (+0.06%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@sw005320
Copy link
Contributor

Thanks!

  • Can you upload a model to huggingface hub and add a link to READEME.md?
  • Can you move the e-branchformer results to the top of READEME.md?

@sw005320 sw005320 added ASR Automatic speech recogntion Recipe labels Apr 20, 2023
@sw005320 sw005320 added this to the v.202303 milestone Apr 20, 2023
@sw005320
Copy link
Contributor

@Some-random, any update?

@kan-bayashi kan-bayashi modified the milestones: v.202303, v.202307 May 1, 2023
@Some-random
Copy link
Contributor Author

@Some-random, any update?

So sorry for the delay, so how I missed the feedback on this PR. I've put the results from E-Branchformer in the front and added links for model on HuggingFace.

@sw005320 sw005320 added the auto-merge Enable auto-merge label May 2, 2023
@mergify mergify bot merged commit 33aa097 into espnet:master May 2, 2023
24 of 25 checks passed
@Some-random Some-random deleted the tedlium3 branch May 2, 2023 15:57
Comment on lines +335 to +340
if len(text_ints) > 100:
logging.warning(
"The length of the text output exceeds 100, "
"which may cause OOM on the GPU."
"Please ensure that the data processing is correct and verify it."
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sw005320 @Some-random Sorry, really late on that (I didn't sync w/ master in a while) but why was it added and how was the limit defined?

I was a bit confused when I saw that thousand of warnings were displayed in my training logs!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it was added to a warning of the large memory consumption, but 100 is probably too small.
So, we can change it to 500 or more.
Sorry about it.

Copy link
Member

@b-flo b-flo Jun 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks and no worries, I'll open a PR!

Edit: You're too fast for me Professor! 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ASR Automatic speech recogntion auto-merge Enable auto-merge ESPnet2 README Recipe
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants