"Get result as an in-memory stream" sample plays sound out loud #298

Closed
lee-borlace opened this issue Dec 23, 2020 · 10 comments
Labels: bug (Something isn't working), in review (Acknowledged and being looked at now)

@lee-borlace

Hi there

I followed the "Get result as an in-memory stream" example from this page: https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/get-started-text-to-speech?tabs=script%2Cwindowsinstall&pivots=programming-language-javascript

I'm testing in Chrome. Even when I pass an undefined audio context (so that I can process the resulting ArrayBuffer manually), the synthesizer still plays the speech through the speakers. I process the ArrayBuffer with howler.js (the only way I can find to make this work on iOS; see the related issue I raised), so the text ends up being spoken twice.

@lee-borlace
Author

I believe the line below is the issue: it always tries to speak the result, regardless of the audioconfig. I'd open a PR, but I don't understand enough about the related inner workings.

https://github.com/microsoft/cognitive-services-speech-sdk-js/blob/master/src/sdk/SpeechSynthesizer.ts#L428

@glharper
Member

glharper commented Jan 4, 2021

@leeroy79 Can you try passing in a PushAudioOutputStream that does nothing with the output, something like:

const stream = sdk.PushAudioOutputStream.create(null);
const audioConfig = sdk.AudioConfig.fromStreamOutput(stream);

@glharper self-assigned this Jan 4, 2021
@glharper
Member

@leeroy79 any update for this issue, or were you able to obtain the results you were looking for?

@lee-borlace
Author

lee-borlace commented Jan 16, 2021

@glharper sorry for the late reply. Below is an extract of my current code, which sets up the synthesizer and tries to speak using the approach you suggested above. The speakTextAsync() call runs without throwing an exception, but neither console.log() in its success and error callbacks is reached, so I never get the array buffer to pass on to the code that plays it. I'm not sure what's happening at this point or whether I'm using it as expected.

init(languageCode: string, voice: string) {
    this.speechConfig = SpeechConfig.fromSubscription(this.SUB_KEY, this.REGION);
    this.speechConfig.speechSynthesisLanguage = languageCode;
    this.speechConfig.speechSynthesisVoiceName = voice;

    const stream = PushAudioOutputStream.create(null);
    this.audioConfig = AudioConfig.fromStreamOutput(stream);

    this.synthesizer = new SpeechSynthesizer(this.speechConfig, this.audioConfig);
    this.logService.log(`SpeechSynthesizerServiceCustom.init(${languageCode},${voice})`);
}

private getSpeechFromServer(text: string, callback: Function): void {
    try {
        this.synthesizer.speakTextAsync(
            text,
            result => {
                if (result) {
                    console.log("TTS OK : " + JSON.stringify(result));
                    callback(result.audioData);
                }
                else {
                    console.log("TTS no result");
                    callback(null);
                }
            },
            error => {
                console.log("TTS error : " + error);
                callback(null);
            });
    }
    catch (ex) {
        console.log(ex);
    }
}

@glharper added the "in review" (Acknowledged and being looked at now) label Jan 19, 2021
@glharper
Member

@leeroy79 Using the above code, I get the return of

Canceled:  websocket error code: 1006
TTS OK : {"privResultId":"[...]","privReason":1,"privErrorDetails":" websocket error code: 1006","privProperties":{"privKeys":["CancellationErrorCode"],"privValues":["ConnectionFailure"]}}

Perhaps instead, you could just create an unused file for the audio that comes back,

const filename = "foo.wav";
const audioConfig = sdk.AudioConfig.fromAudioFileOutput(filename);
[...]
this.synthesizer.speakTextAsync(
    text,
    result => {
        if (result) {
            console.log("TTS OK : " + JSON.stringify(result));
            callback(result.audioData);
        }
        else {
            console.log("TTS no result");
            callback(null);
        }
    },
    error => {
        console.log("TTS error : " + error);
        callback(null);
    },
    filename);
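
A side note on the log output quoted in this comment: it prints "TTS OK" even though the result carries a cancellation error (websocket error code: 1006), because the success callback treats any truthy result as a success. The following is a minimal sketch, not the SDK's own code, of checking the result's reason before handing the audio on. `SynthesisResultLike` and `audioOrNull` are hypothetical names introduced here, and the sketch assumes the real result object exposes `reason`, `errorDetails`, and `audioData`.

```typescript
// Hypothetical structural view of the result fields this sketch reads.
// Assumption: the SDK's synthesis result exposes these three properties.
interface SynthesisResultLike {
  reason: number;
  errorDetails?: string;
  audioData?: ArrayBuffer;
}

// Returns the audio only when the result reports the expected "completed" reason.
// The caller supplies that reason value, so no enum numbers are hard-coded here.
function audioOrNull(
  result: SynthesisResultLike | undefined,
  completedReason: number
): ArrayBuffer | null {
  if (!result) {
    console.log("TTS no result");
    return null;
  }
  if (result.reason !== completedReason) {
    // Canceled or otherwise incomplete: log the details instead of reporting success.
    console.log("TTS did not complete : " + (result.errorDetails ?? "no details"));
    return null;
  }
  return result.audioData ?? null;
}
```

Inside the success callback this could be used as `callback(audioOrNull(result, sdk.ResultReason.SynthesizingAudioCompleted));`, assuming the SDK is imported as `sdk` like in the snippets in this thread.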

@lee-borlace
Author

Thanks @glharper

It's not your area but I've started getting a new error when I change to use that code. I probably didn't provide enough info initially but this is an Angular app running in the browser, not sure if that affects compatibility.

ERROR Error: Uncaught (in promise): TypeError: fs__WEBPACK_IMPORTED_MODULE_0__.openSync is not a function

I'm not sure I'm going to be able to get to the bottom of this new error if it's not a known incompatibility.

Perhaps to close off this ticket if I'm not able to resolve, do you think the docco for this SDK needs to be updated? I.e. is it still true that passing null audio context should work to not speak the text out loud, or does that need to be amended in the docco?
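
For context on the error above: the message shows the file-output path calling `fs.openSync`, a Node.js API that a webpack browser bundle does not provide. The following is a rough sketch of guarding file output behind a runtime check. `GlobalLike` and `canUseFileOutput` are hypothetical helpers written for this note, not part of the SDK.

```typescript
// Hypothetical minimal shape of the global object, as far as this check needs it.
interface GlobalLike {
  process?: { versions?: { node?: string } };
}

// True only when the runtime reports a Node.js version string.
// Browser bundles typically have no `process`, or one without `versions.node`.
function canUseFileOutput(g: GlobalLike): boolean {
  return typeof g.process?.versions?.node === "string";
}

// Usage idea (not executed here): pass the runtime's global object, and only call
// AudioConfig.fromAudioFileOutput(filename) when the check returns true.
```

In a browser app such as the Angular one described here, the check would return false, which suggests using a stream-based output rather than a file.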

@glharper added the "bug" (Something isn't working) label Jan 21, 2021
@glharper
Member

@leeroy79 I've repro-ed this, seems like a bug, will investigate further. Good find!

@glharper
Member

glharper commented Jan 21, 2021

@leeroy79 Thanks for your patience. I've tested this code in browser and it should produce neither audio output nor errors:

        const stream = AudioOutputStream.createPullStream();
        this.audioConfig = AudioConfig.fromStreamOutput(stream);
        this.synthesizer = new SpeechSynthesizer(this.speechConfig, this.audioConfig);
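
As an illustration of how a synthesizer built this way might be used, here is a small sketch that requests speech once, releases the synthesizer in both callbacks, and hands the audio to the caller. `SynthesizerLike`, `AudioResultLike`, and `speakOnce` are hypothetical names for this example; the sketch assumes the real synthesizer offers `speakTextAsync(text, cb, err)` and `close()` with compatible shapes.

```typescript
// Hypothetical structural types covering only what this sketch calls.
interface AudioResultLike {
  audioData?: ArrayBuffer;
}

interface SynthesizerLike {
  speakTextAsync(
    text: string,
    cb?: (result: AudioResultLike) => void,
    err?: (error: string) => void
  ): void;
  close(): void;
}

// Synthesizes `text` once and passes the audio (or null on failure) to `onAudio`.
// The synthesizer is closed in both callbacks, so it cannot be reused afterwards.
function speakOnce(
  synthesizer: SynthesizerLike,
  text: string,
  onAudio: (audio: ArrayBuffer | null) => void
): void {
  synthesizer.speakTextAsync(
    text,
    result => {
      synthesizer.close();
      onAudio(result.audioData ?? null);
    },
    error => {
      synthesizer.close();
      console.log("TTS error : " + error);
      onAudio(null);
    });
}
```

With the pull-stream configuration from this comment, `speakOnce(this.synthesizer, text, audio => { /* play with howler.js */ })` would deliver the ArrayBuffer for manual playback. Because the synthesizer is closed after each call, a new one would need to be created before speaking again.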

@lee-borlace
Author

lee-borlace commented Jan 26, 2021

Awesome, that did the job, thanks @glharper. Should I close this issue in favor of a new one about correcting the docco on not speaking out loud when synthesizing?

@trrwilson
Member

@leeroy79 I'm glad that worked! I agree with you that a documentation update or behavioral clarification in the code is in order and I've filed a work item on the team's backlog to track it. No need to open a separate issue for the doc update as we'd have converted that to the same work item.

I'll close this issue to keep things tidy, but please stay in touch and let us know what you find!
