Prediction on new audio file #23
Comments
Yes, the current version of the toolkit is mainly designed for off-line speech recognition. If you would like to switch to on-line speech recognition, what you could do is redirect the posterior probabilities (currently saved into an *.ark file) to the standard output and read them with the Kaldi script for on-line decoding. This can be done, but it is not implemented yet...
Mirco
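The redirection idea above could be sketched roughly like this: a minimal, hypothetical Python helper that writes one utterance's per-frame posterior matrix to standard output in Kaldi's text-archive format, so a downstream on-line decoding script could consume it through a pipe. The function name and the choice of the text (rather than binary) ark format are assumptions for illustration, not part of pytorch-kaldi.

```python
import sys

def write_text_ark(utt_id, posteriors, out=sys.stdout):
    """Write one utterance's posterior matrix in Kaldi text-archive format.

    posteriors: list of rows, one row of per-class posteriors per frame.
    """
    out.write(f"{utt_id}  [\n")
    for i, row in enumerate(posteriors):
        line = "  " + " ".join(f"{p:.6f}" for p in row)
        # Kaldi text matrices close with "]" on the last row's line.
        out.write(line + (" ]\n" if i == len(posteriors) - 1 else "\n"))

# Example: two frames, three classes, written to stdout for a piped decoder.
write_text_ark("utt1", [[0.1, 0.7, 0.2], [0.05, 0.9, 0.05]])
```

In practice you would pipe this output into a Kaldi on-line decoding command instead of saving it to a file first.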
On Thu, Nov 29, 2018 at 1:30 PM, Parcollet Titouan wrote:
Hi!
For now, the only solution is to first train your PyTorch model, and then call run_exp.py with a modified conf file in which the number of epochs is set to 0 (and with a specific [dataset] section that serves as your testing dataset). We are aware that this is not optimal for real production use cases, and we are currently working on a side script that one can call to simply decode .wav files with a previously trained PyTorch model. Nonetheless, you can dig a bit into the run_exp.sh script to better understand how you can easily build your own script (if you are in a hurry).
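The epochs-to-zero trick amounts to editing the experiment config along these lines. This is only a rough sketch: the exact field names vary between pytorch-kaldi versions, so check them against your own cfg file before relying on this.

```ini
[exp]
# Skip training entirely; run_exp.py then only runs the
# forward/decoding phase on the evaluation dataset(s).
N_epochs_tr = 0

[dataset3]
# An extra dataset section pointing at the new audio you want to
# transcribe, used as the "test" set. (Section name and fields are
# illustrative; mirror an existing test-set section from your cfg.)
data_name = my_new_audio
```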
I worked a bit with Kaldi in a production environment (with automatic transcription of uploaded audio files). Nonetheless, and as you mention, the decoding time can be a problem. One of the solutions we found is to use speaker diarization, so we can split the decoding across multiple threads, with one thread per speaker.
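The one-thread-per-speaker idea can be sketched as follows. `decode_segment` is a hypothetical stand-in for the actual decoder call, and the speaker-to-segment mapping is assumed to come from a diarization front-end; neither is part of pytorch-kaldi.

```python
from concurrent.futures import ThreadPoolExecutor

def decode_segment(speaker, segment):
    # Placeholder: in a real system this would invoke your trained
    # model / Kaldi decoder on the speaker's audio segment.
    return f"[{speaker}] transcript of {segment}"

def decode_by_speaker(segments_by_speaker):
    """Decode each speaker's audio in its own thread (one thread per speaker)."""
    with ThreadPoolExecutor(max_workers=len(segments_by_speaker)) as pool:
        futures = {
            spk: pool.submit(decode_segment, spk, seg)
            for spk, seg in segments_by_speaker.items()
        }
        return {spk: fut.result() for spk, fut in futures.items()}

# Example: segments produced by a diarization step.
transcripts = decode_by_speaker({"spk1": "utt_0001.wav", "spk2": "utt_0002.wav"})
```

Note that if the decoding itself runs in pure Python it is CPU-bound and CPython threads contend for the GIL, so a `ProcessPoolExecutor` (or separate decoder processes) may parallelize better; the structure is the same.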
Hi,
How can I use pytorch-kaldi in a production environment after training the model?
I have models ready, which I generated with core Kaldi. The problem I am facing is that the decoding/prediction phase takes a lot of time.
So please let me know how to use this tool in a live environment.
Also, if you have useful suggestions for Kaldi deployments, please share them.
I am also planning to integrate a Kaldi model into one of our applications, which is live, so your suggestions will be very useful to me.
--
thanks
Nisar