Not clear how to do a simple speech recognition #102
Comments
Can you add more detail about which steps didn't work for you? |
What I'm trying to do is speech-to-text on publicly available content. How to replicate:
The output of the "deepspeech" command is....
|
It seems that you are using it correctly, but you are not using our model for Italian, right? |
Actually... I should be... I forgot these 2 lines, which I had run:
# Download and unpack the files for the Italian model
curl -LO https://github.com/MozillaItalia/DeepSpeech-Italian-Model/releases/download/2020.08.07/model_tensorflow_it.tar.xz
Just in case, this is the ls
|
So you are using the model without transfer learning from English. That one is trained purely on Italian, but since we don't have many hours of training data it is not very good. So the procedure you are using is right; it is just the model that does not work well. I suggest you try the one with transfer learning from English, which is more accurate because it builds on over 7000 hours of English compared to about 250 hours of Italian (it is on the release page like the other one). |
Hello there, I'm checking it right now. That mp3 file is a stereo file. During conversion, please convert it to mono. You can also try the same model trained with transfer learning from English. You can find it in the release or at this URL: |
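For anyone hitting the same stereo problem, here is a minimal sketch of the mono downmix in Python, using only the standard library. It assumes the mp3 has already been decoded to a 16-bit PCM WAV file; the function name `stereo_to_mono` and the file names are made up for illustration and are not part of this project. It does not resample, so the sample rate of the input is kept as-is.

```python
import array
import sys
import wave


def stereo_to_mono(src_path, dst_path):
    """Downmix a 16-bit PCM WAV file to mono by averaging its channels.

    Returns the sample rate, which is copied unchanged to the output.
    """
    with wave.open(src_path, "rb") as src:
        n_channels = src.getnchannels()
        sample_width = src.getsampwidth()
        frame_rate = src.getframerate()
        frames = src.readframes(src.getnframes())

    if sample_width != 2:
        raise ValueError("this sketch only handles 16-bit PCM")

    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    # Samples are interleaved (L0 R0 L1 R1 ...): average each frame.
    mono = array.array(
        "h",
        (
            sum(samples[i:i + n_channels]) // n_channels
            for i in range(0, len(samples), n_channels)
        ),
    )
    if sys.byteorder == "big":
        mono.byteswap()

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(frame_rate)
        dst.writeframes(mono.tobytes())
    return frame_rate
```

Calling `stereo_to_mono("input_stereo.wav", "input_mono.wav")` (hypothetical file names) writes a single-channel copy that can then be passed to `--audio`.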
Ok it works. The disclaimer voice transcription:
The first 10 seconds of speaker transcription (I cannot understand what he's saying btw)
|
Ok! Way better with mono. A good candidate for README :) Original message is "Produzioni radio maria. Tutti i diritti sono riservati... blabla..." ("Radio Maria productions. All rights reserved...").
Transfer learning model
(venv) luca@DESKTOP-QTQGTR0:/mnt/c/Users/luca/dati/voice/it$ deepspeech --model transfer_model_tensorflow_it/output_graph.pbmm --scorer scorer --audio commento-al-messaggio-del-25-.short.wav
Normal model
(venv) luca@DESKTOP-QTQGTR0:/mnt/c/Users/luca/dati/voice/it$ deepspeech --model output_graph.pbmm --scorer scorer --audio commento-al-messaggio-del-25-.short.wav |
README updated. As for quality, as of today the model is not very good anyway. If you want to help and contribute to the project, I invite you to join us on Telegram via @mozitabot and then in the Developers group, where we discuss. |
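To compare the two models on the same clip without retyping the command, something like the following sketch could work. It only wraps the `deepspeech` CLI with the `--model`, `--scorer` and `--audio` flags used above; the helper names (`build_command`, `transcribe`, `compare_models`) are invented here, and it assumes the transcript is what the CLI prints on stdout.

```python
import subprocess


def build_command(model, scorer, audio):
    """Build the same deepspeech CLI invocation used in this thread."""
    return ["deepspeech", "--model", model, "--scorer", scorer, "--audio", audio]


def transcribe(model, scorer, audio):
    """Run the CLI and return whatever it prints on stdout (assumed transcript)."""
    result = subprocess.run(
        build_command(model, scorer, audio),
        capture_output=True,
        text=True,
        check=True,  # raise if deepspeech exits with an error
    )
    return result.stdout.strip()


def compare_models(models, scorer, audio):
    """Transcribe one audio file with several models; returns {label: transcript}."""
    return {label: transcribe(path, scorer, audio) for label, path in models.items()}
```

With the paths from this thread, the call would be `compare_models({"transfer": "transfer_model_tensorflow_it/output_graph.pbmm", "normal": "output_graph.pbmm"}, "scorer", "commento-al-messaggio-del-25-.short.wav")`, run from the directory that contains those files.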
It would be great if the instructions in the README were dumb-proof.
I just tried to follow them and the results were nonsensical.
It may well be due to an error on our side or to the environment (WSL), but looking at the release, I suspect that some data is missing (I strictly followed what's in the README).