Recipe for African Accented French #2813

Open
wants to merge 36 commits into
base: master

Conversation

johnjosephmorgan
Contributor

This is a recipe to build an ASR system with the African Accented French corpus. It follows the mini_librispeech pattern.

# Num-params 5270002


#| model | dev tgsmall | test tgsmall | devtest tgsmall | dev tgmed | test tgmed | devtest tgmed | dev tglarge | test tglarge | devtest tglarge |
Contributor

Thanks. Would you mind putting these GMM numbers in a RESULTS file instead? It would make it easier to find for those used to the usual structure. And please don't forget to include some kind of command in the RESULTS that would enable you to obtain those numbers (even if not in the exact same format). That will make it easier for others, after running it, to verify that their results are similar to yours.
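As an illustration of the kind of command a RESULTS file could carry, here is a minimal sketch. It assumes the usual Kaldi scoring layout (files named `wer_*` inside `exp/<model>/decode*`, each holding a `%WER` line); the `best_wer` helper is hypothetical and written only for this example. Many existing RESULTS files pipe `grep WER` through `utils/best_wer.sh` instead, which may be preferable inside the Kaldi tree.

```shell
#!/usr/bin/env bash
# Sketch: print the lowest-WER line for every decode directory.
# Assumption: scoring wrote files named wer_* under exp/<model>/decode*,
# each containing a line such as "%WER 24.02 [ 1280 / 5329, ... ]".

best_wer() {
  # $1 is one decode directory. grep -H prefixes each match with its file
  # name, so the WER value becomes the second space-separated field.
  grep -H '%WER' "$1"/wer_* 2>/dev/null | LC_ALL=C sort -t ' ' -k2,2n | head -n 1
}

for d in exp/*/decode*; do
  if [ -d "$d" ]; then
    best_wer "$d"
  fi
done
```

Putting a command like this at the top of RESULTS lets anyone who reruns the recipe regenerate the numbers and compare them with the ones reported here.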

… version on openslr.org. My scripts were using those new files. I changed my scripts to use the files that are currently on openslr.org. Later when we get transcripts for the answers, I will ask Yenda to update the corpus on openslr.org and I will update my scripts. I also changed to use iconv instead of uconv.
…slr.org is bad. The transcripts are actually the questions instead of the answers. I am including a temporary transcription that was obtained from a decoding run until we get the good transcripts.
| tri2b | 33.02 | 19.01 | 3.85 | 33.26 | 26.36 | 9.91 | 31.48 | 21.27 | 5.53 |
| tri3b | 26.91 | 18.85 | 3.49 | 25.90 | 24.51 | 8.51 | 23.83 | 20.01 | 4.37 |
| chain tdnn-f | 24.02 | 17.20 | 1.96 | 22.30 | 33.66 | 16.17 | 20.14 | 18.69 | 3.33 |
| chain tdnn-f online | 24.21 | 17.23 | 1.96 | 22.26 | 33.72 | 16.14 | 19.10 | 32.07 | 14.74 |
Contributor

Did something go wrong with the last 2 results on this line? The chain tdnnf-online with tglarge rescoring?

You don't seem to be getting as much improvement from the GMM to TDNN phase as I would normally expect.
Since there is more data, you could try increasing the bottleneck dimension from 96 to 128, reducing the l2-regularize values from 0.03 and 0.015 to 0.02 and 0.01, and reducing num-epochs from 20 to 15 (or maybe even 10, but test it).
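A rough sketch of how those three changes might look in the chain training script, assuming it follows the mini_librispeech layout. All variable names are hypothetical, `dim=768` is only a placeholder, and mapping 0.03 to the hidden layers and 0.015 to the output layer is an assumption based on that layout rather than something stated in this PR.

```shell
#!/usr/bin/env bash
# Sketch only: variable names are illustrative and not taken from the PR.

bottleneck_dim=128   # was 96
l2_hidden=0.02       # was 0.03 (assumed to apply to the TDNN-F layers)
l2_output=0.01       # was 0.015 (assumed to apply to the output layers)
num_epochs=15        # was 20; 10 may also be worth testing

tdnnf_opts="l2-regularize=$l2_hidden"
output_opts="l2-regularize=$l2_output"

# One xconfig line of the kind the network definition would contain.
# dim=768 is a placeholder, not a value from this recipe.
tdnnf_line="tdnnf-layer name=tdnnf2 $tdnnf_opts dim=768 bottleneck-dim=$bottleneck_dim time-stride=1"

# Option as it would be passed to steps/nnet3/chain/train.py.
epochs_flag="--trainer.num-epochs=$num_epochs"

echo "$tdnnf_line"
echo "output-layer options: $output_opts"
echo "$epochs_flag"
```

Keeping the values in variables at the top of the script makes it easier to rerun with 10 epochs and compare the two settings.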

stale bot commented Jun 19, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale Stale bot on the loose label Jun 19, 2020

stale bot commented Jul 19, 2020

This issue has been automatically closed by a bot strictly because of inactivity. This does not mean that we think that this issue is not important! If you believe it has been closed hastily, add a comment to the issue and mention @kkm000, and I'll gladly reopen it.

@stale stale bot closed this Jul 19, 2020
@kkm000 kkm000 reopened this Jul 19, 2020
@stale stale bot removed the stale Stale bot on the loose label Jul 19, 2020

stale bot commented Sep 17, 2020

This issue has been automatically marked as stale by a bot solely because it has not had recent activity. Please add any comment (simply 'ping' is enough) to prevent the issue from being closed for 60 more days if you believe it should be kept open.

@stale stale bot added the stale Stale bot on the loose label Sep 17, 2020
3 participants