Add Silero Speech-To-Text models #153


Merged
merged 7 commits into from
Sep 22, 2020

snakers4
Contributor

Please kindly review our Speech-To-Text models

@netlify

netlify bot commented Sep 12, 2020

Deploy preview for pytorch-hub-preview ready!

Built with commit 78f39ca

https://deploy-preview-153--pytorch-hub-preview.netlify.app

@snakers4
Contributor Author

Maybe I did something wrong, but I cannot see our model's page here

@snakers4
Contributor Author

Hi, @ailzhang @bertmaher @wconstab could you please help with assigning a correct person to review the submission?
Thanks!

Contributor

@ailzhang ailzhang left a comment

@snakers4
Thanks for contributing!
If you move your md file out of docs you should be able to see it on the webpage.
Also, once you do that, the code snippets in the markdown will be executed in the CI, so you will probably also need a working example there (e.g. test_files).
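
To illustrate what "executed in the CI" can mean for a markdown model card, here is a minimal sketch that pulls the Python fenced blocks out of a markdown string and runs them in one shared namespace. This is an assumption about the general approach, not the actual pytorch/hub CI script; `extract_python_blocks` and `run_markdown_snippets` are hypothetical names.

```python
import re

# Build the fence marker programmatically so this snippet can itself
# live inside a fenced block without closing it early.
FENCE_MARK = "`" * 3

# Matches a block opened by the fence marker plus "python" and closed by
# the next line that starts with the fence marker.
FENCE = re.compile(
    r"^" + FENCE_MARK + r"python[ \t]*\n(.*?)^" + FENCE_MARK + r"[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def extract_python_blocks(markdown_text):
    """Return the bodies of all python fenced blocks, in document order."""
    return [match.group(1) for match in FENCE.finditer(markdown_text)]


def run_markdown_snippets(markdown_text):
    """Execute every extracted block in one shared namespace (hypothetical CI step)."""
    namespace = {}
    for block in extract_python_blocks(markdown_text):
        exec(compile(block, "<markdown snippet>", "exec"), namespace)
    return namespace
```

Because every block runs for real, a snippet that reads local audio files would fail in such a setup unless the example also downloads its own test data.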

@snakers4
Contributor Author

Hi,

If you move your md file out of docs you should be able to see it on the webpage.

Moved it to the root folder

Also, once you do that, the code snippets in the markdown will be executed in the CI, so you will probably also need a working example there (e.g. test_files).

Added a small validation dataset download so that it works end-to-end, tested it locally

@snakers4
Contributor Author

fixed the imports and path issues
hopefully it will be fine this time

@snakers4
Contributor Author

@ailzhang

looks like all is fixed
please take a look

Member

@soumith soumith left a comment

great great addition.

I have one important comment around dependencies (so that the hub example works in Google Colab) and another comment which is a suggestion.
Please take a look

# see https://github.com/snakers4/silero-models for utils and more examples

device = torch.device('cpu') # gpu also works, but our models are fast enough for CPU
model, decoder, utils = torch.hub.load(github='snakers4/silero-models',
Member

would this require omegaconf and torchaudio to be installed?

if so, you should add a cell like in https://github.com/pytorch/hub/blob/master/nvidia_deeplearningexamples_waveglow.md#example
which pip installs these extra packages.

that will make the thing instantly run in Google Colab, and is really valuable!

Contributor Author

Hi,

would this require omegaconf and torchaudio to be installed?

Yeah, this would
That is why I needed to include them in your CI environment above
Actually in colab I also needed to install soundfile, as we are using it as a backend for TorchAudio

if so, you should add a cell like in
that will make the thing instantly run in Google Colab, and is really valuable!

yeah, this totally makes sense
by the way, you can see a more extended colab version here

I will add this shortly
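
A rough sketch of how a notebook cell could check for the three extra packages named in this thread (omegaconf, torchaudio, soundfile) before loading the model. The helper names `missing_packages` and `pip_command` are hypothetical, and the printed install line is only an illustration of what a Colab setup cell might run; it is not taken from the merged model card.

```python
import importlib.util

# Packages mentioned in this thread; soundfile is used as the torchaudio backend.
REQUIRED = ("omegaconf", "torchaudio", "soundfile")


def missing_packages(names=REQUIRED):
    """Return the names that cannot be imported in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def pip_command(names):
    """Build a pip install line for the given packages, or '' if there are none."""
    if not names:
        return ""
    return "pip install -q " + " ".join(names)


if __name__ == "__main__":
    missing = missing_packages()
    print(pip_command(missing) or "all extra dependencies are present")
```

In a notebook the resulting line would typically be run in its own cell before the `torch.hub.load` call.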

read_audio,
prepare_model_input) = utils # see function signature for details

torch.hub.download_url_to_file('http://www.openslr.org/resources/83/midlands_english_female.zip',
Member

instead of downloading a zip file, extracting it, and then running a batch of wav files through, it would be much nicer and more illustrative to download a single wav file and process it.

like:

torch.hub.download_url_to_file('some download url for speech.wav', dst='speech.wav')

input = prepare_model_input(glob('midlands_english_female/*.wav'), device=device)

output = model(input)

Contributor Author

one of the reasons why I am downloading the whole validation dataset and running a batch is to demonstrate that the models are actually quite fast on CPU as well as on GPU

but I see your point
I will add a default example with one file, but I would nevertheless keep the utils
because for a first time user batching may be an issue, so I would like to solve that

I will add the changes shortly
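
For readers wondering what the batching concern looks like in practice, here is a small sketch of splitting a list of wav paths into fixed-size batches. `chunk_paths` is a hypothetical helper written for illustration; it is not the utility shipped in snakers4/silero-models, and the default batch size is an arbitrary assumption.

```python
def chunk_paths(paths, batch_size=10):
    """Split a list of file paths into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    return [paths[start:start + batch_size]
            for start in range(0, len(paths), batch_size)]
```

Each resulting batch could then be read and passed through the model in one call, while the single-file example stays the default for first-time users.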

@soumith soumith merged commit 9146b39 into pytorch:master Sep 22, 2020
@soumith
Member

soumith commented Sep 22, 2020

thanks for the great contribution. it should go live sometime today, maybe in a couple of hours.

@snakers4
Contributor Author

checked here, I guess it is not there yet; looking forward to the release!

on a side note, as a small self-funded team we feel extremely proud to have made something worthy of including in this hub
speech has been asking for some care for quite some time

@soumith
Member

soumith commented Sep 22, 2020

apparently we've moved to a once-per-day update of the site now. it should be live by tomorrow morning.

@snakers4
Contributor Author

it is live now
