
Conversation

@SylvesterDuah

No description provided.

jtrmal and others added 30 commits August 18, 2022 03:42
* [infra] Build Docker images automatically using GitHub Actions
* minor change
The example for `post-to-tacc` fails, but with the correction to
`ark:- |` there is no piping error.
Co-authored-by: Jonghwan Hyeon <hyeon0145@gmail.com>
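For illustration, a hedged sketch in the spirit of the fixed usage
example (the alignment and output file names are hypothetical):
"ark:-" reads from stdin or writes to stdout, and, as the commit
notes, the space in "ark:- |" is what avoids the piping error.

```sh
# Convert an alignment to posteriors on stdout, then pipe them into
# post-to-tacc, which reads them from stdin ("ark:-"). Note the space
# before the pipe.
ali-to-post ark:1.ali ark:- | \
  post-to-tacc ark:- 1.tacc
```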
* Update run_blstm.sh

Fix a bug in the aspire run_blstm.sh script.

* Update egs/aspire/s5/local/nnet3/run_blstm.sh

Co-authored-by: Cy 'kkm' Katsnelson <kkm@pobox.com>
* Remove unused variable.

* cudadecoder: Make word alignment optional.

For CTC models using word pieces or graphemes, there is not enough
positional information to use the word alignment.

I tried marking every unit as "singleton" in word_boundary.txt, but
this very often explodes the state space. See:

nvidia-riva/riva-asrlib-decoder#3

With the "_" character in CTC models predicting word pieces, we at the
very least know which word pieces begin a word and which ones are
either in the middle or at the end of a word, but the algorithm would
still need to be rewritten, especially since "blank" is not a silence
phoneme (it can appear between units within a word).

I did look into using the lexicon-based word alignment. I don't have a
specific complaint about it, but I did get a weird error where it
couldn't create a final state at all in the output lattice, which
caused Connect() to output an empty lattice. This may be because I
wasn't quite sure how to handle the blank token. I treat it as its own
phoneme, because of limitations in TransitionInformation, but this
doesn't really make any sense.

Needless to say, while the CTM outputs of the cuda decoder will be
correct from a WER point of view, their time stamps won't be correct,
but they probably never were in the first place, for CTC models.
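For context, Kaldi's word_boundary.txt assigns each phone one of the
position labels begin, internal, end, singleton, or nonword. A hedged
sketch of the all-singleton marking described above, with hypothetical
CTC unit names and the blank treated as nonword (an assumption):

```
<blk> nonword
_the singleton
_cat singleton
ing singleton
```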
Fix "glossaries_opt" variable name at line number 39. It's misspelled due to which words in the glossaries weren't reserved while creating BPE.
This is to fix a CI error.

It appears that this comes from using "ubuntu-latest" in the CI
workflow: it was automatically upgraded to Ubuntu 22.04, which doesn't
ship python2.7 by default.
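A hedged sketch of the kind of workflow change this implies (a
hypothetical job excerpt; the repository's actual fix may differ):
pinning the runner image instead of tracking "ubuntu-latest" keeps an
environment that still ships python2.7.

```yaml
jobs:
  build:
    # Pin the image: "ubuntu-latest" silently became 22.04, which no
    # longer ships python2.7 by default.
    runs-on: ubuntu-20.04
```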
danijel3 and others added 28 commits June 2, 2024 23:11
Fix missing FLT_MAX in some CUDA installation scenarios.
Fix reported issues w.r.t. python2.7 and some Apple silicon quirks
Support for both OpenFST 1.7.3 and 1.8.2
* Upload the error logs to the artifact repository

* Fix a YAML error
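A hedged sketch of the log-upload step using the standard
actions/upload-artifact action (the artifact name and log paths are
hypothetical):

```yaml
- name: Upload error logs
  if: failure()                    # only publish logs when the job fails
  uses: actions/upload-artifact@v4
  with:
    name: error-logs
    path: tools/**/*.log           # assumed location of the build logs
```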
* Upgrade the checkout action in the build pipeline
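The upgrade itself is typically a one-line version bump in each
workflow (a sketch; the exact versions are not stated in the commit):

```yaml
- uses: actions/checkout@v4   # previously an older major, e.g. @v2
```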
Make it buildable under Fedora 41 using its on-board tools
@jtrmal

jtrmal commented Jan 27, 2025

Probably a mistake.

jtrmal closed this Jan 27, 2025