
Offline inference as service #537

Closed
mailong25 opened this issue Feb 14, 2020 · 10 comments

@mailong25

I'm looking to make predictions on a single wav file without having to load the pre-trained AM and LM models every single time. These models should only be loaded once at the beginning.

I'm not referring to online decoding (real-time decoding). I've read the Python bindings examples and simple_streaming_asr_example, and it doesn't seem to be possible. Should I write my own code to do this?

mailong25 changed the title from "Offline inference pipeline" to "Offline inference as service" on Feb 14, 2020
@optimusfzco

Thank you for asking this, I am very interested as well.

@mailong25
Author

@vineelpratap @avidov any suggestions?

@avidov
Contributor

avidov commented Feb 18, 2020

Trying to understand the ask here.
If I understand you correctly, you are asking for:

  1. a process that loads the models and stays up continuously,
  2. so that at any later time you can feed input files into this continuously running process.

Is this correct?

If so:
How do you want to feed the input when the process is running?

The examples can be modified to do something like that. I can give you some suggestions if you explain in more detail what you need.

@avidov
Contributor

avidov commented Feb 18, 2020

At some point we'll probably release a small library for plugging wav2letter into a service (e.g. a web service or website).
Would this cover your needs?

@mailong25
Author

Yes, you're right. I'm building a Python-based ASR application and trying to integrate ASR into our system. So I need either:

  1. A web service as you mentioned above. That way I can feed audio (in binary format) to the process and get the transcription result. The Python code would look like this:
    import Wav2LetterClient
    model = Wav2LetterClient(port = 'xxx', ip = 'xxx')
    audio = open('sample.wav','rb')
    model.transcribe(audio)

  2. Python bindings that allow me to load an acoustic model (e.g. conv_glu, TDS), make predictions on a single wav file, and get back emission and transition scores.
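
For illustration, here is a rough sketch of the kind of bindings API I have in mind. None of these names exist in wav2letter today; the module, class, and method names below are purely hypothetical placeholders:

    # Hypothetical bindings sketch -- module/class/method names are placeholders,
    # not an existing wav2letter API. The idea is: load the acoustic model once,
    # then score individual wav files and get raw emission/transition scores back.
    from wav2letter_bindings import AcousticModel  # hypothetical module

    am = AcousticModel(am_path='acoustic_model.bin', tokens_path='tokens.txt')  # load once
    emissions, transitions = am.forward('sample.wav')  # per-file inference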

@optimusfzco

Hello,
I am working on an ASR app as well; I am more concerned with feeding audio live from a microphone to achieve live transcription.

@avidov
Contributor

avidov commented Feb 19, 2020

After thinking about it with the team I see the following:

  1. Creating ASR services is supported by https://github.com/facebookresearch/wav2letter/blob/master/inference/inference/examples/AudioToWords.h
    with a usage example in:
    https://github.com/facebookresearch/wav2letter/blob/master/inference/inference/examples/MultithreadedStreamingASRExample.cpp#L273-L280

  2. For feeding audio live from a microphone we have:
    https://github.com/facebookresearch/wav2letter/blob/master/inference/inference/examples/AudioToWords.h#L23-L29

  3. For quick on-the-fly testing we suggest adding an interactive executable with a tiny shell.
    You can enter a file name at the shell and it will dump the transcription. It will look something like:

$ ./interactive_streaming_asr_example

/some/file/name.wav
.... transcription
/some/other/file/name.wav
.... transcription

I think this covers what you need. Please correct me if it doesn't.

@mailong25
Author

Thank you for the instructions, but where does feature_extractor.bin come from?

https://github.com/facebookresearch/wav2letter/blob/master/inference/inference/examples/MultithreadedStreamingASRExample.cpp#L76

@avidov
Contributor

avidov commented Feb 20, 2020

Added interactive_streaming_asr_example (commit 45110ba)

Interactive mode loads the models once and then waits for command line requests. It has a tiny command line shell that supports:

  1. Transcribing audio files:
    input=[full path to audio file]
  2. Redirecting output to a file:
    output=[full path to output text file]
  3. Redirecting output to stdout:
    output=stdout
  4. Convenient use from a Python script/shell using popen(). tlikhomanenko@ will release a tutorial for that soon.
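
In the meantime, here is a minimal popen()-style sketch (not the upcoming tutorial) of driving the interactive example from Python. It assumes the binary is built as ./interactive_streaming_asr_example and that you append whatever model-loading flags your build requires; the input=/output= command strings are the ones listed above:

    import subprocess

    # Launch the interactive example once; append the model-loading flags your
    # build requires (omitted here) so the models are loaded a single time.
    proc = subprocess.Popen(
        ["./interactive_streaming_asr_example"],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )

    # Send the tiny-shell commands described above; more input= lines can be
    # written before closing stdin to transcribe additional files.
    proc.stdin.write("output=stdout\n")
    proc.stdin.write("input=/full/path/to/sample.wav\n")
    proc.stdin.close()

    # Print whatever the example writes to stdout (the transcription).
    for line in proc.stdout:
        print(line, end="")
    proc.wait()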

Will add a tutorial soon at:
https://github.com/facebookresearch/wav2letter/wiki/Inference-Run-Examples

Hope that this will cover your needs. Please let me know.

Regarding:

Thank you for the instructions, but where does feature_extractor.bin come from?

https://github.com/facebookresearch/wav2letter/blob/master/inference/inference/examples/MultithreadedStreamingASRExample.cpp#L76

Does this comment belong to this thread?

@mailong25
Author

Thanks, the changes cover all my needs.
Regarding:

Thank you for the instructions, but where does feature_extractor.bin come from?

https://github.com/facebookresearch/wav2letter/blob/master/inference/inference/examples/MultithreadedStreamingASRExample.cpp#L76
Does this comment belong to this thread?

No, it doesn't. I'll double-check that. Any concerns will be opened in another thread.
