Chrome has a time limit for individual SpeechSynthesisUtterances #60

Closed
zepumph opened this issue Feb 11, 2022 · 17 comments

Comments

@zepumph
Member

zepumph commented Feb 11, 2022

@terracoda and I found this while reviewing the overview button in Ratio and Proportion (phetsims/ratio-and-proportion#363).

Basically, at 1x speed, the overview button stops part way through speaking. This is the text:

The Discover Screen changes as you play with it.

A tall space contains two interactive hands, a Left Hand and a Right Hand. You can move the Left Hand and the Right Hand up and down individually, or activate Both Hands to move both hands at the same time. As you move the hands up and down, you will get closer to or farther from the challenge ratio, and closer or farther from a dark green screen.

Tick mark options allow you to explore the vertical space with evenly-spaced horizontal lines. 

There are different challenges to explore, and a button to reset the sim.

At 1x, it stops around "As you move the hands up and down".

@terracoda tested this on Safari and I tested on Firefox; both browsers spoke the whole thing. I would like to talk to @jessegreenberg about how to proceed here.

@jessegreenberg
Contributor

jessegreenberg commented Feb 11, 2022

I haven't observed this myself, but I think I noticed a reference to this while investigating phetsims/number-play#138! I think it is another Chrome bug. https://bugs.chromium.org/p/chromium/issues/detail?id=679437

@jessegreenberg
Contributor

That thread suggests a workaround where you call pause and resume intermittently to get the engine running again. Reports say it works in Chrome but breaks in other browsers.

@jessegreenberg
Contributor

I ran into this while working on phetsims/quadrilateral#75.

@jessegreenberg
Contributor

I'm concerned that the pause/resume workaround may send end events through the Announcer/UtteranceQueue, which would dispose of the Utterance. I haven't done any testing.

@zepumph
Member Author

zepumph commented Mar 17, 2022

Perhaps we can solve this purely on the announcer side, because we can manage the end event there and conditionally "restart" the speech or send things back to the queue. I'm just floating ideas though. I'm not sure.

@jessegreenberg
Contributor

jessegreenberg commented Apr 1, 2022

Some notes:

  • In https://bugs.chromium.org/p/chromium/issues/detail?id=679437 it was mentioned that this is only an issue for the "Google" voices, and I confirmed that to be the case. I see the problem for "Google US English" but not for "Microsoft David".
  • The last comment in the report says that the pause/resume workaround works for desktop Chrome but breaks for Android Chrome. We should check various platforms if we use this.
  • According to https://developer.mozilla.org/en-US/docs/Web/API/SpeechSynthesisUtterance, I don't expect pause/resume to trigger start or end events since they have their own events called pause and resume. So the workaround may not interfere with listeners currently on our SpeechSynthesisUtterance.
  • The pause/resume workaround is working great in Windows 10 Chrome.
  • Confirmed that the pause/resume workaround breaks in Android Chrome. Speech stops and it does not resume. We never get a cancel or end event.
  • I tried to wrap the synth.resume in a setTimeout for 1 and 15 ms to fix on Android Chrome but it did nothing.
  • A thread about this issue: https://stackoverflow.com/questions/21947730/chrome-speech-synthesis-with-longer-texts/23808155#23808155
    • Recommends "unfreezing" the synth after the bug happens with a synth.cancel(). I don't want to remove all SpeechSynthesisUtterances though; we would never hear anything in the queue.
    • One answer recommends resume() without a cancel(). A response says it used to work but doesn't anymore. I confirmed it does not work in Win 10 Chrome.
    • Several answers recommend chunking the utterance into multiple smaller utterances, ideally around punctuation.
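
The chunking idea from those answers could look roughly like the sketch below. It is only an illustration, not code from utterance-queue: `chunkText` and `maxLength` are hypothetical names, the 200-character default is arbitrary, and breaking on ASCII punctuation is an assumption that will not hold for every language.

```typescript
// Hypothetical helper: split text into chunks no longer than maxLength characters.
// It prefers to break just after punctuation, then at a space, and hard-cuts as a last resort.
function chunkText( text: string, maxLength = 200 ): string[] {
  if ( maxLength < 1 ) {
    throw new Error( 'maxLength must be at least 1' );
  }

  const chunks: string[] = [];
  let remaining = text.trim();

  while ( remaining.length > maxLength ) {
    const head = remaining.slice( 0, maxLength );

    // Greedy match up to and including the last punctuation mark inside the head.
    const punctuationMatch = head.match( /^[\s\S]*[.!?;,]/ );
    let breakIndex = punctuationMatch ? punctuationMatch[ 0 ].length : 0;

    // No punctuation in range: fall back to the last space.
    if ( breakIndex === 0 ) {
      breakIndex = head.lastIndexOf( ' ' );
    }

    // No space either: cut at maxLength.
    if ( breakIndex <= 0 ) {
      breakIndex = maxLength;
    }

    chunks.push( remaining.slice( 0, breakIndex ).trim() );
    remaining = remaining.slice( breakIndex ).trim();
  }

  if ( remaining.length > 0 ) {
    chunks.push( remaining );
  }
  return chunks;
}
```

Each returned string would then become its own SpeechSynthesisUtterance. A character limit is only a proxy for speaking time, so a real limit would have to account for the voice rate.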

Have to stop for tonight, but maybe breaking it up into multiple SpeechSynthesisUtterances wouldn't be too bad. Alternatively, I am NOT seeing this issue on my Android Chrome device, so maybe a platform specific if( platform.chromium && !platform.android )?

@jessegreenberg
Contributor

jessegreenberg commented Apr 1, 2022

@zepumph and I discussed and agreed to go with the quicker pause/resume workaround for now with a platform check. We agreed that breaking up the SpeechSynthesisUtterance into several could work, but it would require a lot of code and there are still some open questions about how to get it working:

  • How would we break up long utterances? Punctuation isn't good enough; we could still have very long sentences that hit the limit. Punctuation might not work well for i18n.
  • How would we manage events on multiple SpeechSynthesisUtterances?
  • What if Utterance.priorityProperty changes?
  • Breaking it up into multiple SpeechSynthesisUtterances would create a pause in the speech output. Is that acceptable?
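
To make the event question concrete, one possible shape is to treat the chunks as a single logical utterance that reports one start and one end. The sketch below is hypothetical and is not how utterance-queue works: `speakChunks`, `SpeakChunk`, and the listener names are invented for illustration. A real `speakChunk` would presumably wrap `speechSynthesis.speak` and call `onEnd` from the chunk's `end` listener. It does not attempt to handle priority changes.

```typescript
// Hypothetical: speaks one chunk and calls onEnd when that chunk has finished.
type SpeakChunk = ( text: string, onEnd: () => void ) => void;

type ChunkedSpeechHandle = {

  // Stops requesting further chunks. In this sketch no end event is reported after cancel;
  // a real version would likely need a separate cancel or interrupt notification.
  cancel: () => void;
};

function speakChunks(
  chunks: string[],
  speakChunk: SpeakChunk,
  listeners: { onStart: () => void; onEnd: () => void }
): ChunkedSpeechHandle {
  let cancelled = false;
  let index = 0;

  const speakNext = (): void => {
    if ( cancelled ) {
      return;
    }
    if ( index >= chunks.length ) {

      // A single end event for the whole logical utterance.
      listeners.onEnd();
      return;
    }
    const text = chunks[ index ];
    index++;
    speakChunk( text, speakNext );
  };

  if ( chunks.length > 0 ) {

    // A single start event, before the first chunk is requested.
    listeners.onStart();
    speakNext();
  }

  return {
    cancel: () => { cancelled = true; }
  };
}
```

Listeners outside this function would see one start and one end regardless of how many chunks were spoken, which is the property the UtteranceQueue would need. The audible gap between chunks remains an open question.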

After this fix, we should make sure that everything still works on all Chromium browsers (including Edge!!) as well as all browsers that can be installed on a Chromebook.

@jessegreenberg
Contributor

jessegreenberg commented Apr 8, 2022

I was just about ready to commit this when I removed the workaround to verify that behavior was improved, only to find that I can no longer reproduce the problem in Chrome with any voice. I am on version 100.0.4896.75 (Official Build) (64-bit). Did Chrome fix this in the last 7 days?

Here is the patch I was about to commit:

Index: js/SpeechSynthesisAnnouncer.ts
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/js/SpeechSynthesisAnnouncer.ts b/js/SpeechSynthesisAnnouncer.ts
--- a/js/SpeechSynthesisAnnouncer.ts	(revision fa66202e844102e1e85129029849a6b17e57f5dc)
+++ b/js/SpeechSynthesisAnnouncer.ts	(date 1649449568272)
@@ -26,6 +26,7 @@
 import utteranceQueueNamespace from './utteranceQueueNamespace.js';
 import { ResolvedResponse } from './ResponsePacket.js';
 import stepTimer from '../../axon/js/stepTimer.js';
+import platform from '../../phet-core/js/platform.js';
 
 // If a polyfill for SpeechSynthesis is requested, try to initialize it here before SpeechSynthesis usages. For
 // now this is a PhET specific feature, available by query parameter in initialize-globals. QueryStringMachine
@@ -45,6 +46,11 @@
 // to handle accordingly.
 const PENDING_UTTERANCE_DELAY = 5000;
 
+// In Windows Chromium, long utterances with the Google voices simply stop after 15 seconds and we never get end or
+// cancel events. The workaround proposed in https://bugs.chromium.org/p/chromium/issues/detail?id=679437 is
+// to pause/resume the utterance at an interval.
+const PAUSE_RESUME_WORKAROUND_INTERVAL = 10000;
+
 // In ms. In Safari, the `start` and `end` listener do not fire consistently, especially after interruption
 // with cancel. But speaking behind a timeout/delay improves the behavior significantly. Timeout of 125 ms was
 // determined with testing to be a good value to use. Values less than 125 broke the workaround, while larger
@@ -88,6 +94,11 @@
   // fast on Chromebooks, see documentation around ENGINE_WAKE_INTERVAL.
   private timeSinceWakingEngine: number;
 
+  // In ms, how long since we have applied the "pause/resume" workaround for long utterances in Chromium. Very
+  // long SpeechSynthesisUtterances (longer than 15 seconds) get cut on Chromium and we never get "end" or "cancel"
+  // events due to a platform bug, see https://bugs.chromium.org/p/chromium/issues/detail?id=679437.
+  private timeSincePauseResume: number;
+
   // In ms, how long it has been since we requested speech of a new utterance and when
   // the synth has successfully started speaking it. It is possible that the synth will fail to speak so if
   // this timer gets too high we handle the failure case.
@@ -183,6 +194,7 @@
     this.hasSpoken = false;
 
     this.timeSinceWakingEngine = 0;
+    this.timeSincePauseResume = 0;
 
     this.timeSincePendingUtterance = 0;
 
@@ -329,6 +341,22 @@
         this.readyToAnnounce = true;
       }
 
+      // SpeechSynthesisUtterances longer than 15 seconds will get interrupted on Chrome and fail to stop with
+      // end or error events. https://bugs.chromium.org/p/chromium/issues/detail?id=679437 suggests a workaround
+      // that uses pause/resume like this. The workaround is needed for desktop Chrome when using `localService: false`
+      // voices. It does not appear on any Microsoft Edge voices. It breaks SpeechSynthesis on android. In this check we
+      // only use this workaround where needed.
+      if ( platform.chromium && !platform.android && ( this.voiceProperty.value && !this.voiceProperty.value.localService ) ) {
+
+        // If we are not speaking, we don't apply the pause/resume workaround, it is for long SpeechSynthesisUtterances
+        this.timeSincePauseResume = synth.speaking ? this.timeSincePauseResume + dt : 0;
+        if ( this.timeSincePauseResume > PAUSE_RESUME_WORKAROUND_INTERVAL ) {
+          this.timeSincePauseResume = 0;
+          synth.pause();
+          synth.resume();
+        }
+      }
+
       // A workaround to keep SpeechSynthesis responsive on Chromebooks. If there is a long enough interval between
       // speech requests, the next time SpeechSynthesis is used it is very slow on Chromebook. We think the browser
       // turns "off" the synthesis engine for performance. If it has been long enough since using speech synthesis and

@zepumph @BLFiedler or @terracoda can you please try the toolbar buttons in ratio-and-proportion in Chrome and see if you are still seeing this? If you do, can you please report your Chrome version? I heard continuous speech for ~30 seconds in ratio-and-proportion with all available voices.
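
For anyone who wants to try the timing logic from the patch outside of SpeechSynthesisAnnouncer, here is a minimal standalone sketch. It approximates the patch rather than reproducing the committed code: `SynthLike`, `PauseResumeKeepAlive`, and `needsWorkaround` are hypothetical names. In a browser, `window.speechSynthesis` would be passed as the `synth`, and the selected SpeechSynthesisVoice as the `voice`.

```typescript
// The subset of the SpeechSynthesis interface that this sketch relies on.
interface SynthLike {
  readonly speaking: boolean;
  pause(): void;
  resume(): void;
}

// Mirrors the condition in the patch: Chromium, not Android, and a voice that is not a local service.
function needsWorkaround(
  platform: { chromium: boolean; android: boolean },
  voice: { localService: boolean } | null
): boolean {
  return platform.chromium && !platform.android && voice !== null && !voice.localService;
}

class PauseResumeKeepAlive {
  private readonly synth: SynthLike;
  private readonly intervalMs: number;
  private timeSincePauseResume = 0;

  constructor( synth: SynthLike, intervalMs = 10000 ) {
    this.synth = synth;
    this.intervalMs = intervalMs;
  }

  // Call on every step with the elapsed time in ms. Returns true when pause/resume was applied.
  step( dt: number ): boolean {

    // Only accumulate time while the synth is speaking; otherwise reset the timer.
    this.timeSincePauseResume = this.synth.speaking ? this.timeSincePauseResume + dt : 0;

    if ( this.timeSincePauseResume > this.intervalMs ) {
      this.timeSincePauseResume = 0;
      this.synth.pause();
      this.synth.resume();
      return true;
    }
    return false;
  }
}
```

The 10000 ms default matches PAUSE_RESUME_WORKAROUND_INTERVAL in the patch, which keeps the interval under the roughly 15 second cutoff reported in this issue.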

@zepumph
Member Author

zepumph commented Apr 11, 2022

I can consistently reproduce this with "Google US English", but I cannot reproduce it with any Microsoft voice (Mark, David, Zira). When it occurs, it is pretty much right at 15 seconds. I reproduced it with voice rates of .75, 1, and 1.25.

@jessegreenberg
Contributor

OK thanks @zepumph! Can you please post your Chrome version?

@brettfiedler
Member

Good catch, @zepumph ! I have discovered it's the same for me. Google US English cuts out at 15 seconds, but all other voices play out the full overview just fine.

My Chrome version (on Win10):
Version 100.0.4896.75 (Official Build) (64-bit)

@jessegreenberg
Contributor

OK thanks for checking! For some reason I am still not seeing this with 100.0.4896.75 (Official Build) (64-bit) with the "Google US English" voice. I timed it speaking for 29 seconds and it never cut out. I'll try applying the workaround and check in again with @zepumph or @BLFiedler to see if it is fixed.

@zepumph
Member Author

zepumph commented Apr 22, 2022

@jessegreenberg can you still not reproduce this?

@jessegreenberg
Contributor

I just tested again and cannot reproduce this. Using 'Google US English' I heard speech for 21 seconds.

@jessegreenberg
Contributor

I committed the change proposed in #60 (comment). @samreid helped me test since I haven't been able to reproduce this bug. On his Mac with Chrome we observed the problem before the fix and verified that the problem was gone after 6d5b3c1.

@zepumph can you please also verify the fix and review the change?

@zepumph
Member Author

zepumph commented May 9, 2022

That is so interesting that you couldn't reproduce it. I added a console log inside the code block that pauses and resumes speech. I then selected the Google US English voice at .75x speed. I saw that the first screen overview button in Ratio and Proportion used the workaround 2+ times to complete the overview, but it did complete it. I then used a Microsoft voice with the same parameters and saw it complete the whole overview without entering the code block (no console logs).

I also tested from an unbuilt sim on my Android phone, and the single available voice also completed the overview on the first screen of RAP.

This is superb. Thank you @jessegreenberg and @samreid.

"It breaks SpeechSynthesis on android"

It took me a couple of back-and-forths to come to the conclusion that "It" is the workaround, not the bug. I'm still not positive though. Can you fill in that pronoun?

@jessegreenberg
Contributor

Yes, I can see how that was confusing. Is that better?
