"Get result as an in-memory stream" sample plays sound out loud #298

lee-borlace · 2020-12-23T01:05:47Z

Hi there

I followed the "Get result as an in-memory stream" example from this page: https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/get-started-text-to-speech?tabs=script%2Cwindowsinstall&pivots=programming-language-javascript

I'm using Chrome to test this out. Even when I pass undefined as the audio config (so that I can process the resulting ArrayBuffer manually), the result is still played through the speakers. I'm processing the ArrayBuffer with howler.js (the only way I can find to use this with iOS; see the related issue I raised), so the end result is that the text gets spoken twice.

lee-borlace · 2021-01-04T08:23:54Z

I believe this line is the issue: it always tries to speak the result regardless of the AudioConfig. I'd open a PR, but I don't understand enough of the related inner workings.

https://github.com/microsoft/cognitive-services-speech-sdk-js/blob/master/src/sdk/SpeechSynthesizer.ts#L428

glharper · 2021-01-04T18:04:52Z

@leeroy79 Can you try passing in a PushAudioOutputStream that does nothing with the output, something like:

const stream = sdk.PushAudioOutputStream.create(null);
const audioConfig = sdk.AudioConfig.fromStreamOutput(stream);

glharper · 2021-01-15T19:05:00Z

@leeroy79 any update for this issue, or were you able to obtain the results you were looking for?

lee-borlace · 2021-01-16T07:59:04Z

@glharper sorry for the late reply. Below is an extract of my current code for setting up the synthesizer and speaking, based on what you suggested above. Neither of the console.log() calls I put in the success and error callbacks of speakTextAsync() is hit (the success callback is where I'd get the ArrayBuffer and pass it on to the code that plays it), yet the call itself runs without throwing an exception. So I'm not sure what's happening at this point, or whether I'm using the API as intended.

init(languageCode: string, voice: string) {
    this.speechConfig = SpeechConfig.fromSubscription(this.SUB_KEY, this.REGION);
    this.speechConfig.speechSynthesisLanguage = languageCode;
    this.speechConfig.speechSynthesisVoiceName = voice;

    const stream = PushAudioOutputStream.create(null);
    this.audioConfig = AudioConfig.fromStreamOutput(stream);

    this.synthesizer = new SpeechSynthesizer(this.speechConfig, this.audioConfig);
    this.logService.log(`SpeechSynthesizerServiceCustom.init(${languageCode},${voice})`);
}

private getSpeechFromServer(text: string, callback: Function): void {
    try {
        this.synthesizer.speakTextAsync(
            text,
            result => {
                if (result) {
                    console.log("TTS OK : " + JSON.stringify(result));
                    callback(result.audioData);
                }
                else {
                    console.log("TTS no result");
                    callback(null);
                }
            },
            error => {
                console.log("TTS error : " + error);
                callback(null);
            });
    }
    catch (ex) {
        console.log(ex);
    }
}

glharper · 2021-01-20T00:00:30Z

@leeroy79 Using the above code, I get the following output:

Canceled:  websocket error code: 1006
TTS OK : {"privResultId":"[...]","privReason":1,"privErrorDetails":" websocket error code: 1006","privProperties":{"privKeys":["CancellationErrorCode"],"privValues":["ConnectionFailure"]}}

Perhaps you could instead direct the audio that comes back to a file you never use:

const filename = "foo.wav";
const audioConfig = sdk.AudioConfig.fromAudioFileOutput(filename);
[...]
   this.synthesizer.speakTextAsync(
        text,
        result => {
            if (result) {
                console.log("TTS OK : " + JSON.stringify(result));
                callback(result.audioData);
            }
            else {
                console.log("TTS no result");
                callback(null);
            }
        },
        error => {
            console.log("TTS error : " + error);
            callback(null);
        },
        filename);
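
A side note on the log above: the success callback of speakTextAsync() fired even though synthesis was canceled ("TTS OK" is printed next to privReason 1 and a connection failure), so a truthy result is not enough to treat the call as successful. Below is a minimal sketch of a stricter check. `SynthesisResultLike` and `extractAudio` are hypothetical names, and the sketch assumes only that the result exposes `audioData` and `errorDetails` (mirroring the privErrorDetails field in the log). In a real app, comparing result.reason against the SDK's ResultReason enum may be more direct.

```typescript
// Hypothetical minimal shape of a synthesis result: only the fields used here.
interface SynthesisResultLike {
    audioData?: ArrayBuffer;
    errorDetails?: string;
}

// Returns the audio only when the result carries data and no error details;
// returns null for missing, canceled, or empty results.
function extractAudio(result: SynthesisResultLike | undefined): ArrayBuffer | null {
    if (!result) {
        return null;
    }
    if (result.errorDetails && result.errorDetails.trim().length > 0) {
        return null;
    }
    if (!result.audioData || result.audioData.byteLength === 0) {
        return null;
    }
    return result.audioData;
}
```

Inside the success callback, this would be used as callback(extractAudio(result)), so canceled results reach the caller as null instead of as an empty buffer.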

lee-borlace · 2021-01-21T08:45:11Z

Thanks @glharper

This may not be your area, but I get a new error when I switch to that code. I probably didn't provide enough info initially: this is an Angular app running in the browser, and I'm not sure whether that affects compatibility.

ERROR Error: Uncaught (in promise): TypeError: fs__WEBPACK_IMPORTED_MODULE_0__.openSync is not a function

I'm not sure I'm going to be able to get to the bottom of this new error if it's not a known incompatibility.
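
The error is consistent with file output depending on Node's fs module, which a webpack browser bundle does not really provide. A minimal sketch of guarding against that is below: decide up front whether file output is an option and fall back to a stream otherwise. `isNodeRuntime` and `pickOutputKind` are hypothetical helpers, not part of the SDK, and the Node check (looking for process.versions.node) is a common heuristic rather than a guarantee.

```typescript
type OutputKind = "file" | "stream";

// True when the given global object looks like a Node.js runtime,
// i.e. it exposes process.versions.node as a string.
function isNodeRuntime(globalObj: unknown = globalThis): boolean {
    const proc = (globalObj as { process?: { versions?: { node?: string } } }).process;
    return typeof proc?.versions?.node === "string";
}

// File output needs the fs module, so only choose it under Node;
// in a browser bundle, fall back to stream output.
function pickOutputKind(globalObj: unknown = globalThis): OutputKind {
    return isNodeRuntime(globalObj) ? "file" : "stream";
}
```

The optional parameter exists only so the check can be exercised with a stand-in global object; normal callers would use pickOutputKind() with no arguments.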

Perhaps, to close off this ticket if I'm not able to resolve it: do you think the docco for this SDK needs to be updated? I.e. is it still true that passing a null audio config should stop the text from being spoken out loud, or does that need to be amended in the docco?

glharper · 2021-01-21T16:44:29Z

@leeroy79 I've repro-ed this, seems like a bug, will investigate further. Good find!

glharper · 2021-01-21T19:21:59Z

@leeroy79 Thanks for your patience. I've tested this code in browser and it should produce neither audio output nor errors:

        const stream = AudioOutputStream.createPullStream();
        this.audioConfig = AudioConfig.fromStreamOutput(stream);
        this.synthesizer = new SpeechSynthesizer(this.speechConfig, this.audioConfig);
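
With the output redirected like this, the audio is still delivered through result.audioData in the success callback. A minimal sketch of wrapping that callback API in a Promise follows, so the ArrayBuffer can be handed to a player such as howler.js. `SynthesizerLike`, `SpeakResult`, and `speakToBuffer` are hypothetical names; the interface is a structural stand-in that matches only the speakTextAsync(text, success, error) call shape used earlier in this thread.

```typescript
// Hypothetical stand-ins for the parts of the SDK types used here.
interface SpeakResult {
    audioData: ArrayBuffer;
}

interface SynthesizerLike {
    speakTextAsync(
        text: string,
        cb: (result: SpeakResult) => void,
        err: (error: string) => void
    ): void;
}

// Resolves with the synthesized audio, or rejects with the reported error.
function speakToBuffer(synth: SynthesizerLike, text: string): Promise<ArrayBuffer> {
    return new Promise<ArrayBuffer>((resolve, reject) => {
        synth.speakTextAsync(
            text,
            result => resolve(result.audioData),
            error => reject(new Error(error))
        );
    });
}
```

A caller would then write something like speakToBuffer(this.synthesizer, text).then(buffer => play(buffer)), where play is whatever playback routine the app already has.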

lee-borlace · 2021-01-26T06:29:10Z

Awesome, that did the job. Thanks @glharper, cheers for that. Should I close this issue in favor of a new issue about correcting the docco for not speaking out loud when synthesizing?

trrwilson · 2021-01-27T18:34:54Z

@leeroy79 I'm glad that worked! I agree with you that a documentation update or behavioral clarification in the code is in order and I've filed a work item on the team's backlog to track it. No need to open a separate issue for the doc update as we'd have converted that to the same work item.

I'll close this issue to keep things tidy, but please stay in touch and let us know what you find!

glharper self-assigned this Jan 4, 2021

glharper added the in review Acknowledged and being looked at now label Jan 19, 2021

glharper added the bug Something isn't working label Jan 21, 2021

trrwilson closed this as completed Jan 27, 2021

glharper mentioned this issue Apr 20, 2021: Throw more descriptive error when fs.openSync is not defined #363 (Merged)
