create a transcription output #106
Comments
I did similar experiments with IBM Bluemix, using videos of recorded presentations (not videos with high-quality narration). The results were abysmal. It only got the basic English words right, like stop words and some other simple words; all the words that mattered it got wrong. My intention was to make videos full-text searchable based on what was said in them, but stop words are ignored by the search engine anyway, so I gave up. Can you elaborate a bit on that "90%" number and the nature and quality of the audio?
Hey @peterbe, I ran some tests with some nytimes videos, including some with accents, and the results were really good. Example: http://flv.io/41857_1_02sa-elections_wg_360p.mp4
{
"results": [{
"alternatives": [{
"confidence": 0.84931809,
"transcript": "NC is only an obstacle we have to move them out of the way so we can fight the number one present yet South Africa which is drug test"
}]
}, {
"alternatives": [{
"confidence": 0.84143984,
"transcript": " why does ANC might be experiencing its huge internal weaknesses the institutionalization of this party and its infrastructure and resources still has death"
}]
}]
}
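For anyone who wants to work with a response like the one above, here is a minimal sketch of pulling the text and an average confidence out of it. It assumes the first entry in each `alternatives` list is the top-ranked one, and the function name `extract_transcript` and the `min_confidence` parameter are made up for illustration; they are not part of snickers or of any Google client library. The sample transcripts are truncated copies of the ones above.

```python
import json

# Same shape as the response above; transcripts truncated for brevity.
SAMPLE_RESPONSE = json.loads("""
{
  "results": [
    {"alternatives": [{"confidence": 0.84931809,
                       "transcript": "NC is only an obstacle"}]},
    {"alternatives": [{"confidence": 0.84143984,
                       "transcript": " why does ANC might be experiencing"}]}
  ]
}
""")


def extract_transcript(response, min_confidence=0.0):
    """Join the first alternative of each result.

    Returns (text, mean_confidence). Results whose first alternative
    falls below min_confidence are skipped. Assumption: the first
    alternative is the most likely one.
    """
    pieces = []
    confidences = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if not alternatives:
            continue
        top = alternatives[0]
        confidence = top.get("confidence", 0.0)
        if confidence < min_confidence:
            continue
        pieces.append(top.get("transcript", "").strip())
        confidences.append(confidence)
    mean = sum(confidences) / len(confidences) if confidences else 0.0
    return " ".join(pieces), mean


if __name__ == "__main__":
    text, mean = extract_transcript(SAMPLE_RESPONSE)
    print(text)
    print(round(mean, 4))
```

Raising `min_confidence` is one cheap way to keep low-quality chunks out of whatever consumes the text, though a high confidence score clearly does not guarantee the important words are right.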
Hmm... I'm impressed but not impressed :) Your results with Google's Speech API are certainly better than mine from IBM Bluemix, but I'm still unsure this transcript is good enough to put in front of users. My plan was to use the automated transcript for my search engine, "Find videos by words uttered" (to extend beyond searching metadata text), but people are more likely to type in "ANC" than "the number one". Having said that, I'm going to go back and re-investigate Google as an option for my videos with really clear and crisp sound. Perhaps the outcome here is not to automate it, but to guide and document how you'd go about doing it if interested. You know, to avoid snickers being too tightly bundled to vendors like Google.
Yes @peterbe, you are right. We don't plan to automatically generate subtitles with this, but to add the transcript to the videos' metadata to help with the search engine and personalization.
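As a rough illustration of the metadata idea, a transcript could be reduced to a list of search keywords by dropping stop words, which the search engine ignores anyway. This is only a sketch: the `transcript_keywords` helper and the tiny `STOP_WORDS` set are hypothetical, and a real search backend would bring its own tokenizer and stop-word list.

```python
import re

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = frozenset({
    "a", "an", "and", "be", "can", "does", "has", "have", "in", "is",
    "it", "its", "of", "only", "so", "the", "them", "this", "to", "we",
    "which", "why",
})


def transcript_keywords(transcript, stop_words=STOP_WORDS):
    """Return unique non-stop-word terms in order of first appearance."""
    seen = set()
    keywords = []
    # Lowercase, then split on anything that is not a letter or digit.
    for word in re.findall(r"[a-z0-9]+", transcript.lower()):
        if word in stop_words or word in seen:
            continue
        seen.add(word)
        keywords.append(word)
    return keywords


if __name__ == "__main__":
    print(transcript_keywords("The ANC is only an obstacle, the ANC"))
```

The output is only as good as the transcript: if the recognizer hears "NC" instead of "ANC", that is what ends up in the metadata.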
Hey @peterbe, maybe Polly is more clever than the Google API?
Recently I played with @google's speech API and it seems they have a pretty accurate speech-to-text feature. I tested by extracting the audio of some @nytimes videos using
and sent it to the Speech API. I got ~90% accuracy.
It would be a blast if we had this transcription generation as a feature of snickers.