
Issue #130: user changes to examples/audio_transcribe #133

Open
wants to merge 24 commits into base: master

Conversation


@combinatorist combinatorist commented Jun 29, 2016

#130

The point of my changes was to:

  • only use pocketsphinx
  • pass in file names from the command line (I'm aiming for a for loop or a file list)

It works for .aif files (extracted from inside a GarageBand .band media file) that are only 5 seconds long, but my next smallest is at least a minute, and it fails on that.

@combinatorist
Author

Ok, there are instructions in audio_transcribe about how to run this from the command line, but first you'd need to download long_interview_example.aif from my public Dropbox link and put it in examples/.

@combinatorist
Author

FYI: There are actually two copies of the short interview: one in examples/, the other deep inside examples/short_interview.band/, just to show where I grabbed it from the GarageBand project.

@combinatorist
Author

Ok, I tried listen instead of record and got a transcription of 810 characters out of the 41-minute "long" interview. So it seems I only got the first chunk, but that's definitely the most promising result so far! Do I need to wrap listen in a loop?

@combinatorist
Author

combinatorist commented Jul 13, 2016

Sorry, my participant changed their mind and I had to remove the file, so the dropbox link won't work anymore. I'll try to generate a replacement soon.

Meanwhile, do you have any idea why listen would produce such a short transcription?

@Uberi
Owner

Uberi commented Jul 14, 2016

listen only transcribes the first phrase, so you'll need to use a loop around that. It's a bit of a hacky workaround; I'll post an example here in a bit.
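Roughly this shape (a library-free sketch of the loop; WaitTimeoutError and listen_once here are stand-ins for speech_recognition.WaitTimeoutError and a recognizer.listen-plus-recognize call, so this isn't the library's actual API):

```python
class WaitTimeoutError(Exception):
    """Stand-in for speech_recognition.WaitTimeoutError."""

def transcribe_all(listen_once):
    """Call listen_once() repeatedly, one phrase per call, and stop
    once the source runs dry and WaitTimeoutError is raised."""
    phrases = []
    while True:
        try:
            phrases.append(listen_once())
        except WaitTimeoutError:
            break
    return phrases

# Usage with a fake source that yields two phrases, then times out:
_queue = ["first phrase", "second phrase"]
def fake_listen():
    if not _queue:
        raise WaitTimeoutError()
    return _queue.pop(0)

print(transcribe_all(fake_listen))  # ['first phrase', 'second phrase']
```

With the real library, listen_once would be a call to recognizer.listen on the audio source followed by one of the recognize_* methods.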

No worries about the Dropbox link, by the way. I got the issue to manifest with a long-ish podcast, so there's a good starting point.

@combinatorist
Author

I just got this to work with a while loop over listen that stops when it catches the sr.WaitTimeoutError raised via the timeout argument to listen.

I'm not sure this is reliable, particularly because I don't really know that it will always raise this timeout error at the end of the file (and never before), but at least it worked, and I'm really grateful for your help.

I'd love to dig deeper and do some better testing to improve this example or create a new one to your design standards, @Uberi (if you think that would be valuable).

@Uberi
Owner

Uberi commented Jul 23, 2016

Glad you got it working! I'd definitely like to include this, so when there's time I'll do a proper review and merge it. Ideally, we'd want to exclude the line ending changes and split out the long examples into their own files.

There's a small issue with listen() that will be fixed in the next release (it's actually done, but still needs to be packaged up and published) - that should get things working pretty robustly. I'll be sure to update in this thread.

@combinatorist
Author

Sorry, @Uberi, I'm really eager to make this useful, but I can't figure out what you mean by "the line ending changes": are you saying you'd like me to break the long transcription up into multiple lines, or that I messed up some line endings that used to be there?

Similarly, when you say "split out the long examples into their own files", do you mean write each loop of a transcription into its own file, or move the source code for long transcriptions into a separate file (from audio_transcribe.py) ... or something else?

I think I'll have some time next week (probably this Tuesday) to do some tidying I'd like to do anyway. I'd love it if you gave me a little direction:

  1. Should I make a separate "example" to demonstrate long transcriptions?
  2. Should I make the long transcription loop work on all the APIs? (I'd need keys.)
  3. Should I break up the resulting transcription somehow (line breaks, separate files, etc.)?
  4. Anything else?

Thanks!

@Uberi
Owner

Uberi commented Aug 1, 2016

Hey @combinatorist,

If you check out the diff for this PR, you'll notice that there are about 3500 lines changed, but the actual number of changes is somewhat less. That seems to be due to the line endings being changed from CRLF to LF. As for those questions:

  1. Ideally, we'd want to have a separate example for that (maybe called long_transcriptions.py).
  2. Just one API (Google's speech recognition maybe, since it doesn't require installing Sphinx or an API key) is totally fine.
  3. Sure, either way works!
  4. Nope, not at the moment. Note that I unfortunately won't have much time to look at it until exams are over.

@combinatorist
Author

Hi @Uberi, sorry, I was going to work on this during a flight, but had terrible wifi, so I put it off.

I think I might have some spare time later this month, so for what it's worth:

  1. Agreed
  2. I haven't really used the Google Speech Recognition side of the package, but I noticed it gives better results than Sphinx in the microphone example. I'm not sure what complications might arise from longer segments, but I bet it's worth it.
  3. Ok, I'll go with line breaks.

I'll also fix the line endings. Let me know if you happen to think of anything else!
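For the line-breaks option, I'm picturing the writing side looking roughly like this (a sketch; io.StringIO stands in for the real output file, and the phrase strings are made up):

```python
import io

def write_phrases(phrases, out):
    """Write each recognized phrase on its own line, skipping empty
    results, instead of appending everything into one run-on blob."""
    for phrase in phrases:
        phrase = phrase.strip()
        if phrase:
            out.write(phrase + "\n")

buf = io.StringIO()
write_phrases(["first phrase", "   ", "second phrase"], buf)
print(buf.getvalue())  # two lines, the blank result dropped
```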

@combinatorist
Author

Actually, when I look at the diff online, I don't see a difference in line endings. Maybe it's displaying differently, but I think what you're seeing is just where I commented out a lot of code that wasn't relevant to my specific use case.

Regardless, I'm going to start over with my code in a separate example (as you suggested), so I won't be commenting anything out anymore.

@combinatorist
Author

Hmm, for some reason I'm getting various networking errors with Google, but notice that the shortest phrase worked. I've tried requesting really short phrases in case it's a timeout issue, but that didn't appear to help.

(speech)[517] examples% python long_transcriptions.py long_interview_example.aif 
time: 04:02.54, loop_count: 1
google error; recognition connection failed: [Errno 32] Broken pipe
time: 04:06.56, loop_count: 2
google error; recognition request failed: Bad Gateway
time: 04:11.00, loop_count: 3
google error; recognition request failed: Bad Gateway
time: 04:15.03, loop_count: 4
google error; recognition connection failed: [Errno 32] Broken pipe
time: 04:19.06, loop_count: 5
google error; recognition connection failed: [Errno 32] Broken pipe
time: 04:23.08, loop_count: 6
number 10 I don't anticipate any reason I would need to withdraw from the study if you choose to withdraw yourself at some point you can do that no problem whenever you want to do
time: 04:23.14, loop_count: 7
google error; recognition request failed: Bad Gateway
time: 04:27.17, loop_count: 8
google error; recognition connection failed: [Errno 32] Broken pipe
time: 04:31.19, loop_count: 9
^C
Traceback (most recent call last):
  File "long_transcriptions.py", line 63, in <module>
    f.write(' ' + text)
TypeError: cannot concatenate 'str' and 'exceptions.KeyboardInterrupt' objects
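Looks like that TypeError happened because an exception object (the KeyboardInterrupt) ended up in text and was then passed to f.write. A defensive shape for the loop body (hypothetical names; recognize stands in for the real recognizer call) would keep failures out of the output file:

```python
import io

def transcribe_chunks(chunks, recognize, out):
    """Recognize each chunk, writing only successful string results;
    failures are collected instead of reaching the output file."""
    failures = []
    for count, chunk in enumerate(chunks, start=1):
        try:
            text = recognize(chunk)
        except Exception as exc:  # e.g. Broken pipe, Bad Gateway
            failures.append((count, exc))
            continue
        out.write(" " + text)
    return failures

# Usage with a fake recognizer that fails on odd-numbered chunks:
def flaky(chunk):
    if chunk % 2:
        raise IOError("Bad Gateway")
    return "phrase %d" % chunk

buf = io.StringIO()
errs = transcribe_chunks([1, 2, 3, 4], flaky, buf)
print(buf.getvalue())  # ' phrase 2 phrase 4'
print(len(errs))       # 2
```

Note that catching Exception (rather than BaseException) deliberately lets Ctrl-C interrupt the run instead of being written into the transcript.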

@combinatorist
Author

OK, FWIW, I downloaded Google's example docs and managed to get a 40-minute clip to transcribe in 4 minutes, but I had to use a URI in Google Cloud Storage (which required an account, etc.).

I just need to add this to the python module and we're set!
