Unsupported content-type #145

Closed
Prateek13727 opened this issue Jun 2, 2020 · 5 comments

@Prateek13727

Prateek13727 commented Jun 2, 2020

hey fellas,

Thank you for creating Wit :) It's helping me get started with my first NLP project.

The use case
I am reading a WAV file and sending it to the Wit speech API, as shown in the code at the bottom.

I was initially getting a 400 Bad Request, so I cloned the repo and followed the steps mentioned in #126 to get a more specific error message (shown below).

The error trace

Traceback (most recent call last):
  File "wit_speech.py", line 23, in <module>
    text =  RecognizeSpeech('recordings/mav_abs_1.wav', 4)
  File "wit_speech.py", line 12, in RecognizeSpeech
    resp = client.speech(f, headers={'Content-Type':'audio/wav'})
  File "/home/maverick/maverick/myGit/orador/pywit/wit/wit.py", line 90, in speech
    data=audio_file, headers=headers)
  File "/home/maverick/maverick/myGit/orador/pywit/wit/wit.py", line 46, in req
    raise WitError('Wit responded with an error: ' + json['error'])
wit.wit.WitError: Wit responded with an error: Unsupported content-type

Am I missing something here?

The code

import requests
import json
from wit import Wit

wit_access_token = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'

def RecognizeSpeech(AUDIO_FILENAME, num_seconds = 5):
    # num_seconds is not used yet
    client = Wit(wit_access_token)
    with open(AUDIO_FILENAME, 'rb') as f:
        # Pass the headers by keyword, as in the traceback above
        resp = client.speech(f, headers={'Content-Type': 'audio/wav'})

    return resp

if __name__ == "__main__":
    text = RecognizeSpeech('recordings/mav_abs_1.wav', 4)
    print("\nYou said: {}".format(text))

App-id: 1531619810342334

Any help would be appreciated. Please let me know if any clarification is needed.

cheers
Prateek

@patapizza
Member

Hi @Prateek13727,

How is the WAV file encoded? What is the output of sox --i recordings/mav_abs_1.wav?

@Prateek13727
Author

hey @patapizza

the output of sox --i recordings/mav_abs_1.wav is

Input File : 'mav_abs_1.wav'
Channels : 2
Sample Rate : 44100
Precision : 16-bit
Duration : 00:01:07.82 = 2990784 samples = 5086.37 CDDA sectors
File Size : 12.0M
Bit Rate : 1.41M
Sample Encoding: 16-bit Signed Integer PCM

@patapizza
Member

From the API docs:

At this time, Wit.ai is only able to process mono so you must make sure to send mono and not stereo to the API.

Also make sure you provide the right headers for sample rates, precision, etc.

Hope this helps.
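
For anyone without sox installed, the same check can be sketched in Python with the standard-library `wave` module. This is only a sketch: `describe_wav` and `mono_problems` are hypothetical helper names, and the only rule enforced is the mono requirement quoted from the API docs above.

```python
import wave


def describe_wav(path):
    """Return basic PCM parameters of a WAV file, similar to `sox --i`."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_rate": wav.getframerate(),
            "bits": wav.getsampwidth() * 8,
            "duration_s": wav.getnframes() / wav.getframerate(),
        }


def mono_problems(info):
    """List reasons the file may be rejected (assumption: only the mono rule is checked)."""
    problems = []
    if info["channels"] != 1:
        problems.append("expected mono, found {} channels".format(info["channels"]))
    return problems
```

Running `mono_problems(describe_wav('recordings/mav_abs_1.wav'))` on the file described above should report the two channels before any request is sent.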

@Prateek13727
Author

Prateek13727 commented Jun 7, 2020

Thanks for your reply, @patapizza. I will try with mono audio and get back to you.

The WAV file created with the command below (from the API docs) works fine with the API.
sox -d -b 16 -c 1 -r 16k sample.wav
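
An existing stereo recording can also be downmixed instead of re-recorded. Below is a rough standard-library sketch, assuming 16-bit signed PCM stereo input like the file described earlier. `stereo_to_mono` is a hypothetical helper name, and unlike the sox command it keeps the original sample rate (no resampling to 16k).

```python
import array
import sys
import wave


def stereo_to_mono(src_path, dst_path):
    """Average the left and right channels of a 16-bit stereo WAV into a mono WAV."""
    with wave.open(src_path, "rb") as src:
        if src.getnchannels() != 2 or src.getsampwidth() != 2:
            raise ValueError("expected 16-bit stereo PCM")
        rate = src.getframerate()
        frames = src.readframes(src.getnframes())

    samples = array.array("h")  # signed 16-bit samples, interleaved L, R, L, R, ...
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    mono = array.array(
        "h", ((left + right) // 2 for left, right in zip(samples[0::2], samples[1::2]))
    )
    if sys.byteorder == "big":
        mono.byteswap()

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(rate)  # sample rate is left unchanged
        dst.writeframes(mono.tobytes())
```

Something like `stereo_to_mono('recordings/mav_abs_1.wav', 'recordings/mav_abs_1_mono.wav')` would then produce a file to pass to `client.speech`; the output filename here is just an example.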

@Prateek13727
Author

Hey @patapizza

I have a quick question about the Wit speech use case. I intend to use the wit.ai speech API for pure transcription, and I am looking for a response like the one shown below (I need the timestamps of individual words). This is basic, level-1 transcription without any training. Is this possible with the current Wit speech API? If I understand correctly, right now I only get timestamps for the entities that are extracted from the speech after training.

Cheers
Prateek

{
  "items": [
    {
      "start_time": "2.23",
      "end_time": "2.78",
      "alternatives": [
        {
          "confidence": "0.9582",
          "content": "morning"
        }
      ],
      "type": "pronunciation"
    },
    {
      "alternatives": [
        {
          "confidence": "0.0",
          "content": "."
        }
      ],
      "type": "punctuation"
    },
    {
      "start_time": "2.79",
      "end_time": "2.91",
      "alternatives": [
        {
          "confidence": "0.861",
          "content": "Who"
        }
      ],
      "type": "pronunciation"
    },
    {
      "start_time": "2.91",
      "end_time": "3.04",
      "alternatives": [
        {
          "confidence": "0.8081",
          "content": "would"
        }
      ],
      "type": "pronunciation"
    },
    {
      "alternatives": [
        {
          "confidence": "0.0",
          "content": "?"
        }
      ],
      "type": "punctuation"
    }
  ]
}
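
To make the request concrete, here is a small Python sketch of how a response in the shape above would be consumed. It assumes the structure is wrapped in a JSON object with an "items" key; this is the format being asked for, not something the Wit API is known to return, and `words_with_times` is a hypothetical helper name.

```python
import json


def words_with_times(transcript):
    """Return (word, start, end) tuples for spoken items, skipping punctuation."""
    words = []
    for item in transcript["items"]:
        if item.get("type") != "pronunciation":
            continue  # punctuation items carry no timestamps
        # Pick the alternative with the highest confidence
        best = max(item["alternatives"], key=lambda alt: float(alt["confidence"]))
        words.append((best["content"], float(item["start_time"]), float(item["end_time"])))
    return words


# Two items taken from the desired response format above
sample = json.loads("""
{
  "items": [
    {"start_time": "2.23", "end_time": "2.78",
     "alternatives": [{"confidence": "0.9582", "content": "morning"}],
     "type": "pronunciation"},
    {"alternatives": [{"confidence": "0.0", "content": "."}],
     "type": "punctuation"}
  ]
}
""")

print(words_with_times(sample))
```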
