Getting Bad request error for enrolment - Speaker Recognition #66

Closed
rajagopal28 opened this issue May 17, 2016 · 18 comments

@rajagopal28

Hi,
I've been trying to enroll a voice file for a created profile using the Python API.
I was able to create a profile and list all profiles successfully. But when I try to enroll a voice (.wav) file containing a simple "hello world" phrase for the created profile, I get the error 'ERROR:root:Error enrolling profile.', and the trace shows 'Exception: Error enrolling profile: Bad Request'. I can attach the stack trace if needed. Can you help me get started with this?

@rajagopal28 rajagopal28 changed the title Getting Bad request error for enrolment - Speech Recogntioon Getting Bad request error for enrolment - Speech Recogntion May 17, 2016
@rajagopal28 rajagopal28 changed the title Getting Bad request error for enrolment - Speech Recogntion Getting Bad request error for enrolment - Speech Recognition May 17, 2016
@rajagopal28 rajagopal28 changed the title Getting Bad request error for enrolment - Speech Recognition Getting Bad request error for enrolment - Speaker Recognition May 17, 2016
@rajagopal28
Author

It seems like an API problem. I've tried hitting the actual endpoint with a POST request, using the parameters and headers specified in the documentation. I get the response { 'status': 'Bad request', 'message': 'Not a valid WAVE file - No RIFF header' }. I've tried multipart/form-data with a file input from the Postman REST client. I've also tried the API endpoint in the console provided by Microsoft (which has no way to pass the WAVE file as a file input) by encoding the audio file into a string (which starts with data:audio/wav;base64..). Can anyone from Microsoft answer this? I know it's in the preview stage, but it should have understandable instructions and parameter details.
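
The "No RIFF header" message is consistent with the request body not starting with the bytes of a WAVE container. A minimal sketch of a local check, assuming the service simply inspects the leading bytes (that is an assumption, as is the helper name `looks_like_wave`); it shows why a base64 data-URI string would not pass while the raw bytes would:

```python
import base64


def looks_like_wave(payload: bytes) -> bool:
    """Return True if the payload starts with a RIFF/WAVE container header."""
    # A WAVE file begins with b"RIFF", a 4-byte size field, then b"WAVE".
    return len(payload) >= 12 and payload[:4] == b"RIFF" and payload[8:12] == b"WAVE"


# A fake 16-byte header for illustration only (not a playable file).
raw = b"RIFF" + (36).to_bytes(4, "little") + b"WAVE" + b"fmt "

# The same bytes wrapped as a base64 data URI, as pasted into the console.
data_uri = b"data:audio/wav;base64," + base64.b64encode(raw)

raw_ok = looks_like_wave(raw)            # raw bytes keep the RIFF header
data_uri_ok = looks_like_wave(data_uri)  # text starts with b"data", not b"RIFF"
```

Sending the file's raw bytes as the request body, rather than an encoded string, is what keeps the header intact.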

@momohs
Member

momohs commented May 18, 2016

Hi @rajagopal28,
Thanks for your comments. Can you please attach the *.wav file used for enrollment?

@rajagopal28
Author

rajagopal28 commented May 18, 2016

I've used three files; I'm attaching all three.
Archive.zip

@momohs I see that you are from Microsoft. In the API console link for enrolling and verifying, there are text fields for sending the audio file. In what format should it be sent? I used base64-encoded text (as mentioned above) and got the same error. Can you please clarify this? Thanks for your comment.

@cthrash

cthrash commented May 19, 2016

It looks like the enrollment audio is too short. The audio file should be at least 20 seconds long and no longer than 5 minutes. The minimum amount of total speech needed for enrollment, after removing silence, is 60 seconds.
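
A rough local pre-check of the file length can catch this before uploading. The sketch below uses Python's standard `wave` module; the 20-second and 5-minute limits are taken from the comment above and may differ from the service's current limits, and the helper names are hypothetical. It measures total file duration only, not the amount of speech after silence removal.

```python
import io
import wave


def wav_duration_seconds(source) -> float:
    """Return the duration of a WAV file (path or file object) in seconds."""
    with wave.open(source, "rb") as reader:
        return reader.getnframes() / reader.getframerate()


def within_limits(duration: float, minimum: float = 20.0, maximum: float = 300.0) -> bool:
    """Check a duration against the limits quoted in this thread (assumed values)."""
    return minimum <= duration <= maximum


def make_silence(seconds: int, rate: int = 16000) -> io.BytesIO:
    """Build an in-memory mono 16-bit WAV of silence, for demonstration."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)
        writer.setframerate(rate)
        writer.writeframes(b"\x00\x00" * rate * seconds)
    buf.seek(0)
    return buf


short_clip = wav_duration_seconds(make_silence(2))   # a "hello world"-sized clip
long_clip = wav_duration_seconds(make_silence(25))
```

With these values, `within_limits(short_clip)` is false and `within_limits(long_clip)` is true.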

@momohs - one improvement to consider is to include the response body in the exception. In this case it would have made the error much more obvious: { "error": { "code": "BadRequest", "message": "Audio too short" } }
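
One way this suggestion could look, as a sketch only: `EnrollmentError` and `extract_error_message` are hypothetical names, not part of the actual Python wrapper, and the body layout is assumed to match the JSON shown in this thread.

```python
import json


def extract_error_message(body) -> str:
    """Pull error.message out of a JSON response body; return "" if absent."""
    try:
        parsed = json.loads(body)
    except (TypeError, ValueError):
        return ""  # body was missing or not JSON
    if not isinstance(parsed, dict):
        return ""
    error = parsed.get("error")
    if isinstance(error, dict):
        return str(error.get("message", ""))
    return ""


class EnrollmentError(Exception):
    """Hypothetical exception that keeps both the HTTP reason and the service message."""

    def __init__(self, reason: str, body=None):
        self.reason = reason
        self.detail = extract_error_message(body)
        text = f"Error enrolling profile: {reason}"
        if self.detail:
            text += f" ({self.detail})"
        super().__init__(text)


body = '{ "error": { "code": "BadRequest", "message": "Audio too short" } }'
with_detail = str(EnrollmentError("Bad Request", body))
without_detail = str(EnrollmentError("Bad Request", "<html>oops</html>"))
```

Falling back to the bare reason when the body is not JSON keeps the wrapper from raising a second, unrelated error while reporting the first.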

@rajagopal28
Author

@cthrash Thank you so much. It worked: I enrolled a voice phrase for the created profile.
It would be better if the Python wrapper log showed the message ('Not a valid WAVE file - No RIFF header' or 'Audio too short'). It only shows the code ('BadRequest'), which is not very helpful in identifying the issue.

@momohs
Member

momohs commented May 19, 2016

@rajagopal28 I tried the files you sent and completed some successful enrollments with them. However, the file "password.wav" has an incorrect sampling rate and thus gave me an "incorrect sampling rate" error. I used a REST client for this.

Regarding the Python wrapper, the enrollments were successful, but I received a "Bad request" for the file "password.wav". Indeed, the exception needs to be handled better in the Python wrapper.

Using the console, I am not sure how to attach the file to the request. I am in contact with the team responsible for that. I'll get back to you once it is sorted out.

@cthrash The "Audio Too Short" exception message is currently thrown by the server if the audio is too short. At the moment, the audio should be from 1 to 15 seconds long (as mentioned in the API documentation).

@cthrash

cthrash commented May 19, 2016

1-15 seconds, IIUC, is for Speaker Verification. In the Stack Overflow Post, @rajagopal28 is asking (despite the title) about Speaker Identification, as you can see from the call stack.

@jjsuarez

Hello, I am also having problems enrolling an audio file in the API testing console. Could you please answer the question that @rajagopal28 asked: what format should be used in the Request body field? I am getting the same error:
{
"error": {
"code": "BadRequest",
"message": "Invalid Audio Format: Not a WAVE file - no RIFF header"
}
}

My file is recorded according to the required parameter values of format and length. Any help would be greatly appreciated. Thanks a lot.

@momohs
Member

momohs commented Jun 26, 2016

Thanks for your feedback @jjsuarez!
We are aware of the issue with uploading audio files using the API Testing Console and we are still sorting it out! Meanwhile, I urge you to use the Python sample code or the C# sample code or the Online demos to test the Speaker Recognition service.

@margaretmz

This issue was moved to microsoft/Cognitive-SpeakerRecognition-Python#2

@taunkankur

I am getting the following response:

{
"error": {
"code": "BadRequest",
"message": "InvalidPhrase"
}
}

@soso-maitha

I am getting "InvalidPhrase" as well. What could be the cause?

@khilscher

Also getting "InvalidPhrase", regardless of the audio length.

@EasonWang01

EasonWang01 commented Oct 30, 2018

Hey guys, I also encountered this InvalidPhrase issue before.
Eventually, I found out that we can only say what Azure asks us to say.

Use the following API to list all supported verification phrases:
https://westus.dev.cognitive.microsoft.com/docs/services/563309b6778daf02acc0a508/operations/5652c0801984551c3859634d


@kiranmahto

I used the Python sample code, but it gives the error "message": "Invalid Audio Format: Require Mono"
or "message": "Invalid Audio Format: Require PCM"

@soso-maitha

@kiranmahto Use the "Audacity" software to convert the audio file to the required format. For the Speaker Verification service, the audio file must be in a specific format, e.g. mono channel (not dual), a particular sampling rate, etc. You will find these in the API documentation; I can share the link tomorrow.
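
If installing a separate tool is not an option, the stereo-to-mono step alone can be sketched with Python's standard library. This is an illustration under stated assumptions, not a replacement for a real converter: it only handles 16-bit stereo PCM input, it averages the two channels, and it does not resample. The function name `downmix_to_mono` is hypothetical.

```python
import array
import io
import wave


def downmix_to_mono(src, dst) -> None:
    """Average the left and right channels of a 16-bit stereo WAV into a mono WAV."""
    with wave.open(src, "rb") as reader:
        if reader.getnchannels() != 2 or reader.getsampwidth() != 2:
            raise ValueError("expected 16-bit stereo PCM input")
        rate = reader.getframerate()
        frames = reader.readframes(reader.getnframes())
    # Samples are interleaved: left, right, left, right, ...
    samples = array.array("h", frames)
    mono = array.array(
        "h",
        ((samples[i] + samples[i + 1]) // 2 for i in range(0, len(samples), 2)),
    )
    with wave.open(dst, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)
        writer.setframerate(rate)  # sampling rate is left unchanged
        writer.writeframes(mono.tobytes())


# Demonstration with a tiny in-memory stereo file: left=100, right=300.
src = io.BytesIO()
with wave.open(src, "wb") as writer:
    writer.setnchannels(2)
    writer.setsampwidth(2)
    writer.setframerate(16000)
    writer.writeframes(array.array("h", [100, 300] * 4).tobytes())
src.seek(0)

dst = io.BytesIO()
downmix_to_mono(src, dst)
dst.seek(0)
with wave.open(dst, "rb") as check:
    mono_channels = check.getnchannels()
    mono_samples = array.array("h", check.readframes(check.getnframes())).tolist()
```

A file with the wrong sampling rate would still need resampling in a tool such as Audacity.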

@kiranmahto

kiranmahto commented Jan 15, 2019 via email

@soso-maitha

soso-maitha commented Jan 16, 2019

All audio files in the dataset should be stored in the WAV (RIFF) audio format.
The audio must have a sampling rate of 8 kilohertz (kHz) or 16 kHz, and the sample values should be stored as uncompressed, pulse-code modulation (PCM) 16-bit signed integers (shorts).
Only single-channel (mono) audio files are supported.
You will find these requirements for most Microsoft cognitive services that deal with sound files.
Reference: https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/how-to-customize-acoustic-models
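
The requirements listed above can be checked locally before uploading. A minimal sketch using Python's standard `wave` module; the helper names are hypothetical, and the checks mirror only the requirements quoted in this comment, not necessarily everything the service validates.

```python
import io
import wave

ALLOWED_RATES = (8000, 16000)  # Hz, per the requirements quoted above


def format_problems(source) -> list:
    """Return a list of human-readable problems; an empty list means the file passes."""
    try:
        reader = wave.open(source, "rb")
    except (wave.Error, EOFError) as exc:
        # The wave module raises for non-RIFF data and for encodings it cannot read.
        return [f"not a readable PCM WAVE file: {exc}"]
    problems = []
    with reader:
        if reader.getnchannels() != 1:
            problems.append(f"expected mono, found {reader.getnchannels()} channels")
        if reader.getsampwidth() != 2:
            problems.append(f"expected 16-bit samples, found {8 * reader.getsampwidth()}-bit")
        if reader.getframerate() not in ALLOWED_RATES:
            problems.append(f"expected 8000 or 16000 Hz, found {reader.getframerate()} Hz")
    return problems


def make_wav(channels: int, rate: int, seconds: int = 1) -> io.BytesIO:
    """Build an in-memory 16-bit silent WAV file, for demonstration."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as writer:
        writer.setnchannels(channels)
        writer.setsampwidth(2)
        writer.setframerate(rate)
        writer.writeframes(b"\x00\x00" * channels * rate * seconds)
    buf.seek(0)
    return buf


good = format_problems(make_wav(channels=1, rate=16000))
bad = format_problems(make_wav(channels=2, rate=44100))
not_wave = format_problems(io.BytesIO(b"data:audio/wav;base64,AAAA"))
```

Here `good` is an empty list, `bad` reports the channel count and the sampling rate, and `not_wave` reports that the input could not be read as a WAVE file.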

Also, search Pluralsight (the website or the app) for the Microsoft speech and Speaker Recognition course; it is explained there step by step.
