Add Speech to text notebook 211 #271
This notebook does not follow the contribution guide. Please review the guide. We need things like a header/title and a layout that is similar to other notebooks. A few other points:
Is there any way for us to take ANY .wav file and make sure it meets these requirements? (Without adding additional dependencies to the requirements.txt?)
librosa seems to have the ability to load OGG and MP3. Can we use it for this? https://librosa.org/doc/main/generated/librosa.load.html
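One way to check a .wav file without adding dependencies is the standard-library `wave` module. The sketch below is only an illustration: the thread does not spell out the actual requirements, so the target values (16 kHz, mono, 16-bit PCM) and the helper name `check_wav` are assumptions.

```python
import wave

# Assumed target format; the thread does not state the real requirements.
TARGET_RATE = 16000        # samples per second
TARGET_CHANNELS = 1        # mono
TARGET_SAMPLE_WIDTH = 2    # bytes per sample, i.e. 16-bit PCM


def check_wav(path, rate=TARGET_RATE, channels=TARGET_CHANNELS,
              sample_width=TARGET_SAMPLE_WIDTH):
    """Return a list of mismatches; an empty list means the file conforms."""
    problems = []
    with wave.open(path, "rb") as wav_file:
        actual_rate = wav_file.getframerate()
        actual_channels = wav_file.getnchannels()
        actual_width = wav_file.getsampwidth()

    if actual_rate != rate:
        problems.append(f"sample rate is {actual_rate} Hz, expected {rate} Hz")
    if actual_channels != channels:
        problems.append(f"{actual_channels} channel(s), expected {channels}")
    if actual_width != sample_width:
        problems.append(f"sample width is {actual_width} byte(s), expected {sample_width}")
    return problems
```

This only validates; it does not convert. If librosa is already a dependency, `librosa.load(path, sr=16000, mono=True)` resamples and downmixes while loading, which may make a separate check unnecessary.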
Please make the following changes before approval to merge
@Debskij
"id": "b7e9d9b9",
"metadata": {},
"source": [
"### Run Decoding and Print Output."
Suggested change:
- "### Run Decoding and Print Output."
+ "### Run Decoding and Print Output"
"id": "a566de49",
"metadata": {},
"source": [
"### Do Inference!\n",
Suggested change:
- "### Do Inference!\n",
+ "### Do Inference\n",
Co-authored-by: Ryan Loney <ryanloney@gmail.com>
Co-authored-by: Adrian Boguszewski <adekboguszewski@gmail.com>
@Debskij please clear the outputs from the Jupyter cells and commit a clean version.
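For reference, clearing outputs can be scripted with the standard library alone. This is a minimal sketch, assuming the notebook uses the nbformat 4 JSON layout (a top-level `cells` list whose code cells carry `outputs` and `execution_count`); the function name `clear_outputs` is hypothetical.

```python
import json


def clear_outputs(notebook_path):
    """Remove outputs and execution counts from every code cell, in place."""
    with open(notebook_path, encoding="utf-8") as f:
        notebook = json.load(f)

    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None

    # Jupyter writes notebooks with a one-space indent; match it to keep diffs small.
    with open(notebook_path, "w", encoding="utf-8") as f:
        json.dump(notebook, f, indent=1, ensure_ascii=False)
        f.write("\n")
```

From the command line, `jupyter nbconvert --clear-output --inplace <notebook>.ipynb` should achieve the same result.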
@helena-intel I'm fine with merging this tomorrow, as long as you are.
Just for fun, I tried transcribing a CSPAN video (public domain) to see how long it would take. This is a 17-minute video, and the model processed it in 4 s on my laptop's Tiger Lake CPU and in only 2 s on the iGPU. https://youtu.be/Wp-WiNXH6hI This is the output. I'm impressed.
Thanks @Debskij! This is a great notebook - so many possibilities. I love that this allows you to run speech-to-text for private data that you do not want to upload to some cloud server.
I will approve and merge this now. I have one non-blocking change request: there is a `DeprecationWarning` about `waveplot`. The replacement (according to the docstring) is `waveshow`. I tried to simply replace the method but that did not work. Can you make a separate PR that changes `waveplot` to `waveshow`?