Get text from audio #38

Closed
walchko opened this issue May 21, 2016 · 13 comments

Comments

@walchko

walchko commented May 21, 2016

Can you please write a complete library? Please include a function for speech (link to your API) passed as an audio file. Basically it does this (per your docs):

  $ curl -XPOST 'https://api.wit.ai/speech?v=20141022' \
   -i -L \
   -H "Authorization: Bearer $TOKEN" \
   -H "Content-Type: audio/wav" \
   --data-binary "@sample.wav"
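
For comparison, the curl call above maps onto a few lines of standard-library Python. This is only a minimal sketch, not pywit's implementation: `build_speech_request` is a hypothetical helper name, and the endpoint, version string, and headers are copied from the curl example.

```python
import urllib.request

WIT_SPEECH_URL = "https://api.wit.ai/speech"  # endpoint taken from the curl example


def build_speech_request(token, audio_bytes, content_type="audio/wav",
                         version="20141022"):
    """Build (but do not send) a POST request equivalent to the curl call.

    Hypothetical helper for illustration; it is not part of pywit.
    """
    url = "{}?v={}".format(WIT_SPEECH_URL, version)
    return urllib.request.Request(
        url,
        data=audio_bytes,  # raw file contents, like --data-binary "@sample.wav"
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": content_type,
        },
        method="POST",
    )
```

To send it, read the file in binary mode (`open("sample.wav", "rb").read()`), build the request, and pass it to `urllib.request.urlopen`.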
@oplatek

oplatek commented Jun 14, 2016

Speech API would be nice!
Any update on this?

I know I can hack it: submit a speech request to your speech API (or any other), then submit the 1-best hypothesis to your converse API.
However, as your speech API is quite slow, the latency is not trivial, and the user experience is horrible
just because I need to submit two requests instead of one.
If you provided a converse API that accepts speech directly, it would speed things up considerably.
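
The two-step workaround described above can be sketched as follows, with timing around each round trip to show where the latency accumulates. Both `transcribe` and `converse` are hypothetical caller-supplied callables standing in for the two HTTP requests; nothing here is an actual Wit API.

```python
import time


def speech_then_converse(audio_bytes, transcribe, converse):
    """Run the two-step workaround and time each round trip.

    `transcribe` and `converse` are caller-supplied callables (hypothetical
    stand-ins for the two HTTP requests); this function does no networking.
    """
    start = time.perf_counter()
    text = transcribe(audio_bytes)      # request 1: audio -> 1-best text
    after_speech = time.perf_counter()
    reply = converse(text)              # request 2: text -> converse response
    end = time.perf_counter()
    timings = {
        "speech_s": after_speech - start,
        "converse_s": end - after_speech,
        "total_s": end - start,
    }
    return reply, timings
```

Because the second request cannot start until the first has returned, the total latency is the sum of both round trips, which is the cost a single speech-to-converse endpoint would avoid.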

@jhoelzl
Contributor

jhoelzl commented Jun 14, 2016

+1

@goose121

I also think that this would be great; after all, there's not much of a point to natural speech if you can't actually speak

@lowdev

lowdev commented Sep 21, 2016

+1

@milindaj

milindaj commented Oct 1, 2016

+1 converse API through speech directly is a great feature to have

@andehr

andehr commented Oct 26, 2016

+1

@Accentrix

This feature would make the Pywit library perfect! still waiting.... :/

@blandinw
Contributor

blandinw commented Nov 2, 2016

Hi everybody, apologies for the lack of responsiveness here and thanks for keeping this issue alive.
We used to have audio recording + streaming in the first versions of the library, but it was a constant source of pain, as it involved a lot of platform specific code.

Regarding audio recording (from a microphone device), I don't think it makes sense to add that to pywit, as it's highly platform specific and does not make sense for server-side use cases.

Regarding the network streaming part, we'd be open to adding back a .speech() method to the client that takes a "stream of bytes" (what's the idiomatic way to represent that?), uploads it to Wit and returns the response object. We'd need to come up with a solution that works on both Python 2 and 3. We may come around to doing that, but we're working on some other awesome things at the moment. Contributions welcome!
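
On the "stream of bytes" question, one common convention on both Python 2 and 3 is a binary file-like object: anything with a `read(size)` method that returns bytes, such as an `open(path, "rb")` handle or an `io.BytesIO`. A minimal sketch of normalizing input to that convention follows; `iter_audio_chunks` is a hypothetical name, not an existing pywit function.

```python
import io


def iter_audio_chunks(source, chunk_size=4096):
    """Yield successive byte chunks from bytes or a binary file-like object.

    Hypothetical helper: accepts raw bytes/bytearray or any object with a
    read(size) method, so callers can pass a file, BytesIO, or in-memory data.
    """
    if isinstance(source, (bytes, bytearray)):
        source = io.BytesIO(bytes(source))
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # an empty read means end of stream
            break
        yield chunk
```

A generator like this can be handed to an HTTP client that supports chunked uploads; the `requests` library, for example, accepts a generator as the `data` argument and sends it with chunked transfer encoding.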

@walchko
Author

walchko commented Nov 2, 2016

You might want to actually read what I was asking for ... I never asked you to capture audio. Just make python as complete as your http api so I can send an audio file for you to interpret ... it is simple!

You also might want to check your pull requests ... "Method added to upload voice commands" (#67) already does this. I independently implemented a very similar solution long ago, but was far too lazy to submit a pull request. @willywongi, however, did, so please take a look at his work and consider committing it.

@blandinw
Contributor

blandinw commented Nov 3, 2016

I commented on the PR, hopefully @willywongi can get around to implementing the last bit soon. We'll merge then.

@willywongi
Contributor

"Good news everyone!" I pushed the correction @blandinw was asking for: I had forgotten to allow users to set the correct Content-Type header.

@blandinw
Contributor

blandinw commented Nov 3, 2016

Thank you @willywongi!
I merged your PR + bumped Wit to 4.2.0 on PyPI.
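
With that release, sending a file should look roughly like the sketch below. It assumes the merged method is exposed as `client.speech(audio_file, headers=...)`, which matches the discussion in this thread but is not verified here against the 4.2.0 source; `transcribe_file` is a hypothetical wrapper.

```python
def transcribe_file(client, path, content_type="audio/wav"):
    """Send an audio file through a client's speech() method.

    Assumption: `client` has a speech(audio_file, headers=...) method, as
    described in this thread. The file is opened in binary mode so the bytes
    are uploaded unchanged.
    """
    with open(path, "rb") as audio_file:
        return client.speech(audio_file, headers={"Content-Type": content_type})
```

Assuming the package is installed from PyPI as `wit`, the client would be created along the lines of `Wit(access_token=TOKEN)` and passed in as `client`.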

blandinw closed this as completed Nov 3, 2016
@sergios-ferreira

sergios-ferreira commented Nov 14, 2021

Can you please write a complete library? Please include a function for speech (link to your API) passed as an audio file. Basically it does this (per your docs):

  $ curl -XPOST 'https://api.wit.ai/speech?v=20141022' \
   -i -L \
   -H "Authorization: Bearer $TOKEN" \
   -H "Content-Type: audio/wav" \
   --data-binary "@sample.wav"

curl -XPOST "https://api.wit.ai/speech?v=20211113" \
-i -L \
-H "Authorization: Bearer [YOUR_TOKEN]" \
-H "Content-Type: audio/raw;encoding=signed-integer;bits=16;rate=44100;endian=little" \
--data-binary "@[YOUR_AUDIO].wav"

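Remember: the @ in front of [YOUR_AUDIO] is important; it tells curl to read the request body from that file rather than sending the file name as a literal string.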
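
The raw-audio Content-Type above packs several parameters into one header value. A small sketch of assembling it in Python, using only the parameter names that appear in the curl example; `raw_audio_content_type` is a hypothetical helper, and the set of accepted values is not checked against Wit's documentation.

```python
def raw_audio_content_type(encoding="signed-integer", bits=16, rate=44100,
                           endian="little"):
    """Build an audio/raw Content-Type value like the one in the curl call."""
    params = [
        ("encoding", encoding),
        ("bits", bits),
        ("rate", rate),
        ("endian", endian),
    ]
    return "audio/raw;" + ";".join("{}={}".format(k, v) for k, v in params)
```

The defaults reproduce the header from the curl example; the values should describe how the audio was actually recorded (for instance `rate=16000` for a 16 kHz recording).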
