Sample no longer works with Custom Speech service after //BUILD 2018 product updates #81

mikebranstein · 2018-05-18T12:17:50Z

The JavaScript SDK only works with Bing Speech API endpoints. Custom Speech endpoints need to be supported. PR incoming.

mikebranstein · 2018-05-18T15:41:55Z

PR #82 submitted.

mageshpurpleslate · 2018-05-23T14:22:36Z

Hi,

I used the sample and changed the URI to wss://westus.stt.speech.microsoft.com in speechConnectionFactory.js, but I keep getting a 403 Forbidden error.

May I know what the URI should be?

Thanks in advance.

mikebranstein · 2018-05-23T14:37:46Z

@mageshpurpleslate it depends. If you're using the Bing Speech service, nothing needs to change. If you're going to use the Custom Speech Service, you need to append an endpoint ID to the URI. Check out PR #82 for the details on everything that needs to change.

mageshpurpleslate · 2018-05-24T03:34:47Z

Mike,

Thanks a lot for your help. Works like a charm. Pretty nicely done.

Is it acceptable to send the API subscription key as a query parameter? Are you planning any changes to that?

Regards,

Magesh

mikebranstein · 2018-05-24T23:09:54Z

@mageshpurpleslate I'm glad this worked for you - I recall starting out with Bing and Custom Speech ~ a year ago and the samples were pretty rough.

There are two ways to authenticate to the speech services with WebSockets. The first is the query string format. It's acceptable to send the key that way because the connection is encrypted (WSS is WebSocket over TLS, just as HTTPS is HTTP over TLS). The second is to pre-authenticate with an HTTP POST to the Cognitive Services secure token service. This returns a bearer token that is added to the WebSocket connection header. Docs on how to do this are here.

mraguraman3 · 2018-07-09T07:58:57Z

@mikebranstein - I tried the custom speech implementation with your proposed code changes, but I am getting a "403 Forbidden" error on the WSS call. The path I copied from the F12 dev tools looks like:

wss://westus.stt.speech.microsoft.com/speech/recognition/interactive/cognitiveservices/v1?cid=https://westus.api.cognitive.microsoft.com/sts/v1.0&format=simple&language=en-US&Ocp-Apim-Subscription-Key=<...... key......>&X-ConnectionId=<..... connection id .... >

Is this a valid path? Have you ever hit a 403 error during your testing?

mikebranstein · 2018-07-09T12:31:55Z

@mraguraman3 I believe you are placing the entire endpoint URL in the "Custom Speech Endpoint ID" textbox. Instead, use the Endpoint ID, which is a GUID. You can find the endpoint ID on the custom speech portal.
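To make the distinction concrete, here is a minimal JavaScript sketch of how the Custom Speech WebSocket URL could be assembled with the endpoint ID (a GUID) in the `cid` parameter. The helper names `isEndpointId` and `buildCustomSpeechUrl` are hypothetical and not part of the SDK. The host, path, and query parameter names are copied from the URLs posted in this thread, and the `X-ConnectionId` parameter that the SDK appends is left out.

```javascript
// Hypothetical helpers, not part of the SDK. Host, path and parameter names
// are taken from the URLs posted in this thread.

// A Custom Speech endpoint ID is a GUID: 8-4-4-4-12 hex digits.
const GUID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function isEndpointId(value) {
  return typeof value === "string" && GUID_PATTERN.test(value.trim());
}

function buildCustomSpeechUrl({
  region,           // e.g. "westus"
  mode,             // e.g. "interactive" or "conversation"
  endpointId,       // the GUID from the custom speech portal
  subscriptionKey,  // sent in the query string, as the sample does
  language = "en-US",
  format = "simple",
}) {
  if (!isEndpointId(endpointId)) {
    throw new Error("cid must be the endpoint ID (a GUID), not the full endpoint URL");
  }
  const url = new URL(
    `wss://${region}.stt.speech.microsoft.com/speech/recognition/${mode}/cognitiveservices/v1`
  );
  url.searchParams.set("cid", endpointId.trim());
  url.searchParams.set("format", format);
  url.searchParams.set("language", language);
  url.searchParams.set("Ocp-Apim-Subscription-Key", subscriptionKey);
  return url.toString();
}
```

Passing a full URL such as https://westus.api.cognitive.microsoft.com/sts/v1.0 as `endpointId` makes this sketch throw, which is the mistake behind the 403 described above.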

mraguraman3 · 2018-07-10T07:44:45Z

Thanks a lot @mikebranstein, it's working now after entering the endpoint ID.

You also mentioned that we can use token-based authentication. In that case, do we not need to pass the endpoint ID in the HTTP POST header? Is the subscription key alone enough to generate the token?

mikebranstein · 2018-07-10T12:11:06Z

@mraguraman3 - yes, token-based auth is also available. I did not use token-based auth because the original solution used the query string parameter auth. I wanted to augment the solution in a specific way for this PR. A different PR would be necessary to change the auth.

mraguraman3 · 2018-07-12T13:05:24Z

Thanks @mikebranstein.

Anyway, I can confirm that token-based auth is not working with the custom speech implementation.

Although the token is generated using the subscription key, I am getting 401 Unauthorized when hitting the WebSocket endpoint. Below is the WSS call format:

wss://westus.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?cid=#endpointID#&format=simple&language=en-US&Authorization=#token#

mikebranstein · 2018-07-12T13:16:04Z

@mraguraman3 token based auth does work, but it's tricky. I have a C# SDK I had to roll for the Custom Speech Service websocket speech protocol before Microsoft released their own.

mraguraman3 · 2018-07-13T05:21:09Z

Great @mikebranstein, but I am using a JavaScript Node app to generate the token. Anyway, all I want to know is whether this is a valid WSS call format, or am I missing something?

wss://westus.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?cid=#endpointID#&format=simple&language=en-US&Authorization=#token#

mikebranstein · 2018-07-13T10:45:25Z

@mraguraman3 the C# SDK was released after //BUILD this year. You can find it on NuGet: https://www.nuget.org/packages/Microsoft.CognitiveServices.Speech/.

mraguraman3 · 2018-07-16T08:01:48Z

Thanks @mikebranstein, but I don't think there is an option in this SDK to provide the endpoint ID for custom speech.

Only EndpointURL is supported, which I believe is the actual HTTP host for the speech service. Here is the documentation of the supported properties in the C# SDK:

https://docs.microsoft.com/en-gb/dotnet/api/microsoft.cognitiveservices.speech.speechfactory?view=azure-dotnet

Do you have any plans to support token auth in "SpeechToText-WebSockets-Javascript" for custom speech?

mikebranstein · 2018-07-16T14:02:34Z

@mraguraman3 from what I understand, the EndpointURL property is part of the URL. So, the EndpointURL for custom speech could be wss://westus.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?cid=#endpointID#.

Underneath the SDK, the supported streaming protocol is the Speech Service WebSocket protocol, outlined here: https://docs.microsoft.com/en-us/azure/cognitive-services/speech/api-reference-rest/websocketprotocol.

If you were going to implement the speech protocol yourself, you'd have to request an auth token using your subscription key (like the code snippet below).

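// Requires: using System; using System.Net.Http; using System.Threading.Tasks;
private async Task<string> FetchToken()
{
  using (var client = new HttpClient())
  {
    // The subscription key authenticates the token request.
    client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "<Subscription Key>");
    var tokenUri = new Uri("https://westus.api.cognitive.microsoft.com/sts/v1.0/issueToken");

    // POST with an empty body; the response body is the token itself.
    var result = await client.PostAsync(tokenUri, null);
    result.EnsureSuccessStatusCode();
    return await result.Content.ReadAsStringAsync();
  }
}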

When you have that token, you can use a ClientWebSocket and set the Authorization bearer token on the web socket connection. Assuming _cws is the client web socket:

var authToken = await FetchToken();
_cws.Options.SetRequestHeader("Authorization", $"Bearer {authToken}");
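For a Node app, a rough JavaScript counterpart of the two C# snippets above might look like the following. This is a sketch under assumptions, not SDK code: the function names `buildTokenRequest`, `fetchToken`, and `bearerHeader` are hypothetical, a runtime with a global `fetch` (Node 18+ or a browser) is assumed, and the token URL and header name are the ones used in the C# snippet.

```javascript
// Sketch only: mirrors the C# FetchToken() above. Function names are hypothetical.

// Describe the token request without sending it, so it is easy to inspect.
function buildTokenRequest(region, subscriptionKey) {
  return {
    url: `https://${region}.api.cognitive.microsoft.com/sts/v1.0/issueToken`,
    init: {
      method: "POST",
      headers: { "Ocp-Apim-Subscription-Key": subscriptionKey },
    },
  };
}

// Send the request; the response body is treated as the token,
// exactly as the C# snippet does with ReadAsStringAsync().
async function fetchToken(region, subscriptionKey) {
  const { url, init } = buildTokenRequest(region, subscriptionKey);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`Token request failed with HTTP ${response.status}`);
  }
  return response.text();
}

// Header to attach to the WebSocket upgrade request.
function bearerHeader(token) {
  return { Authorization: `Bearer ${token}` };
}
```

Note that the browser `WebSocket` constructor does not let you set request headers, so the header approach applies to server-side clients. In Node, the `ws` package accepts a `headers` option when opening a connection, which is one place the result of `bearerHeader` could go.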

On review, the JavaScript SDK supports auth token connections. The sample HTML does not use them, but you can modify the sample code slightly to take advantage of the auth token approach. See SpeechToText-WebSockets-Javascript/samples/browser/Sample.html, line 166 in 477067f:

var useTokenAuth = false;

I believe you can change this value to true and your solution would use the auth token approach.

mageshpurpleslate · 2018-08-02T07:20:53Z

Hi @mikebranstein, I have one more question for you. Is there a way to save the audio clip while it is also being sent for recognition? We are trying to save it for auditing purposes.

mikebranstein · 2018-08-02T12:37:56Z

@mageshpurpleslate there's no native SDK way of doing this (to my knowledge), so you'd have to write the code to do this. For example, you could write a middle layer that collects the audio from a microphone, then funnels it to your desired location, then writes the same stream to the Speech SDK. If you don't want to do that client-side with JavaScript, then you could host your own WebSocket app that uses the C# Speech SDK. Your websocket app would act as the middle layer, intercepting the audio stream. I have a solution that does this that is hosted as a Service Fabric Web Socket app in Azure.
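As a rough illustration of the client-side middle layer described above, the sketch below copies every audio chunk to an audit sink before handing it on. Everything here is hypothetical: `createAuditingForwarder` is not an SDK function, and the two sinks are plain callbacks standing in for "write to storage" and "feed the speech service".

```javascript
// Hypothetical middle layer: every audio chunk is copied to an audit sink
// before being handed on to whatever feeds the speech service.
function createAuditingForwarder(auditSink, speechSink) {
  let bytesSeen = 0;
  return {
    write(chunk) {
      // Copy the bytes so later mutation of the buffer cannot alter the audit record.
      auditSink(Uint8Array.from(chunk));
      bytesSeen += chunk.length;
      speechSink(chunk);
    },
    get bytesSeen() {
      return bytesSeen;
    },
  };
}

// Usage with in-memory sinks standing in for a file and the SDK's audio input.
const audited = [];
const sent = [];
const forwarder = createAuditingForwarder(
  (copy) => audited.push(copy),
  (chunk) => sent.push(chunk)
);
forwarder.write(new Uint8Array([1, 2, 3]));
forwarder.write(new Uint8Array([4, 5]));
console.log(forwarder.bytesSeen); // 5
```

The audit copy is taken before forwarding so that the saved audio matches what was sent, even if the downstream consumer reuses the buffer.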

mikebranstein · 2018-08-02T12:39:38Z

@mageshpurpleslate After thinking for a few more minutes, the C# SDK has a custom audio source/stream you can create. You could create one that audits the audio bytes as they are being fed to the service via the SDK.

mageshpurpleslate · 2018-08-02T16:13:54Z

@mikebranstein thank you. This would help. I will try it out.

hellowonders · 2018-12-11T10:39:02Z

wss://westus.stt.speech.microsoft.com works with the latest Speech API.
