speech-api/SpeechSynthesis-speak-ownership.html is incomplete; renders inaccurate results; should be removed from WPT pending specification clarification [type:untestable] #23097

guest271314 · 2020-04-19T18:32:51Z

https://github.com/web-platform-tests/wpt/blob/master/speech-api/SpeechSynthesis-speak-ownership.html links to WICG/speech-api#8, which does not settle precisely what the concept of "ownership" means in the Web Speech API specification, where the term is used exactly once:

4.2.2. SpeechSynthesis Methods
speak(utterance) method

The SpeechSynthesis object takes exclusive ownership of the SpeechSynthesisUtterance object. Passing it as a speak() argument to another SpeechSynthesis object should throw an exception. (For example, two frames may have the same origin and each will contain a SpeechSynthesis object.)

with no accompanying definition of the term or algorithm clearly describing how "ownership" is set or thereafter verified programmatically.

The language used is "should throw an exception", not "MUST throw an exception".

Nor is there any explanation as to why an exception "should" be thrown in that case.
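To make the gap concrete, here is a minimal sketch of one rule an implementation could adopt for "ownership". Everything in it is an assumption: the specification defines neither the bookkeeping nor the exception type, and ModelSynthesis, ModelUtterance and speakInBoth are hypothetical names for a plain JavaScript model, not browser APIs.

```javascript
// Hypothetical model of ONE possible "ownership" rule; nothing here is a browser API.
const owners = new WeakMap(); // utterance -> the synthesizer that first accepted it

class ModelUtterance {
  constructor(text) {
    this.text = text;
  }
}

class ModelSynthesis {
  constructor(name) {
    this.name = name;
    this.queue = [];
  }

  speak(utterance) {
    const owner = owners.get(utterance);
    if (owner !== undefined && owner !== this) {
      // Assumption: the spec names no exception type, so a plain Error is used.
      throw new Error(`utterance is already owned by ${owner.name}`);
    }
    owners.set(utterance, this);
    this.queue.push(utterance);
  }
}

// Mirrors the shape of the WPT test: speak in one instance, then in another.
function speakInBoth(first, second, utterance) {
  let threw = false;
  first.speak(utterance);
  try {
    second.speak(utterance);
  } catch (e) {
    threw = true;
  }
  return threw;
}

const frameSynth = new ModelSynthesis("iframe");
const topSynth = new ModelSynthesis("top");
const threw = speakInBoth(frameSynth, topSynth, new ModelUtterance("1"));
console.log({ threw });
```

Under this assumed rule threw is true, but only because the model invents a per-utterance owner record and an exception type, which is exactly what the specification leaves unstated.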

In practice, whether speechSynthesis instances can be told apart depends on the implementation, not the specification. speechSynthesis.speak() calls could all be added to the same global queue if, for example, the implementer routes every speak() call to a single local speech synthesis application over a socket connection (as in browsers where a speech synthesis engine is not shipped by default with the source code, e.g., Chromium). In that case it is possible to call cancel() from a different window than the (now closed) window that initiated a speak() call, and thereby clear the queue in the socket connection of all pending speak() calls, for every window. Implementations are only partly consistent because the specification is ambiguous in certain places. The specification does not state that calling cancel() should only affect a single queue:

This method removes all utterances from the queue.

Note the singular "the queue".

For example, if speechSynthesis.pause() is executed at Chromium before calling speak(), neither iframe.contentWindow.speechSynthesis.speak() nor speechSynthesis.speak() will output audio, and iframe.contentWindow.speechSynthesis.pending and speechSynthesis.pending will always be false. Nightly does output audio for iframe.contentWindow.speechSynthesis.speak() (because that "instance" of speechSynthesis was not paused) and sets speechSynthesis.pending to true.
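The difference between the two browsers can be sketched with a toy model of the two readings: one engine shared by every window, or one engine per window. This is a model of the behaviour reported above, not of either browser's actual code; Engine, ModelSynthesis and run are hypothetical names, and "audible" simply means the model's paused flag was not set when speak() was called.

```javascript
// Hypothetical model only: Engine and ModelSynthesis are not browser APIs.
class Engine {
  constructor() {
    this.paused = false;
    this.queue = [];
  }
}

class ModelSynthesis {
  constructor(engine) {
    this.engine = engine;
  }

  pause() {
    this.engine.paused = true;
  }

  cancel() {
    this.engine.queue.length = 0;
  }

  // Queues the text and reports whether audio would start right away.
  speak(text) {
    this.engine.queue.push(text);
    return !this.engine.paused;
  }
}

// shared = true: both windows talk to one engine (one queue, one paused flag).
// shared = false: each window gets its own engine.
function run(shared) {
  const topEngine = new Engine();
  const frameEngine = shared ? topEngine : new Engine();
  const top = new ModelSynthesis(topEngine);
  const frame = new ModelSynthesis(frameEngine);

  top.pause(); // pause() is called on the top-level window first
  return {
    frameAudible: frame.speak("1"),
    topAudible: top.speak("1"),
  };
}

const sharedResult = run(true);
const separateResult = run(false);
console.log({ sharedResult, separateResult });
```

With a shared engine neither call is audible, which matches the Chromium observation; with separate engines only the iframe call is audible, which matches the Nightly observation. Both models are compatible with the wording "the queue", which is the ambiguity at issue.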

The test will always fail because there is no documentation, algorithm, or implementation of the concept of "ownership" of a SpeechSynthesisUtterance object, so threw is always false and the test results are inaccurate. What specific language in the specification is the current test relying on to determine "ownership" of a SpeechSynthesisUtterance by a particular call or reference to speechSynthesis, and exactly how is that verification done in the test?

Presently, it is impossible to programmatically determine the "ownership" of a SpeechSynthesisUtterance instance because no such concept exists in practice. Therefore the test is incomplete and inaccurate and should be removed from WPT until the specification is clear, as it is currently impossible to verify "ownership" of a SpeechSynthesisUtterance instance relying solely on the language in the specification.


guest271314 · 2020-04-19T20:27:31Z

Re the ambiguity of language in the current Web Speech API specification, and why it is presently impossible to rely on that draft as the primary source document for clarity, consider

cancel() method
This method removes all utterances from the queue. If an utterance is being spoken, speaking ceases immediately. This method does not change the paused state of the global SpeechSynthesis instance.

and

pause() method
This method puts the global SpeechSynthesis instance into the paused state. If an utterance was being spoken, it pauses mid-utterance. (If called when the SpeechSynthesis instance was already in the paused state, it does nothing.)

where the definite article "the" is used ("the queue", "the global SpeechSynthesis instance"), unlike the language at speak()

to another SpeechSynthesis object

which does not occur in the Chromium implementation, as there is really only one speechSynthesis object or instance and one queue, where the socket connection session could survive even after the browser is closed.

To demonstrate, we can reverse the order of execution for some parts of the code: execute iframe.contentWindow.speechSynthesis.pause() first, then attempt iframe.contentWindow.speechSynthesis.speak(), then window.speechSynthesis.speak()

<!DOCTYPE html>

<html>
  <head>
    <link rel="stylesheet" href="lib/style.css" />
    <script src="lib/script.js"></script>
  </head>

  <body>
    <iframe></iframe>
    <script>
      // using an utterance for different SpeechSynthesis instances should throw
      // the utterance is short to make the test faster
      speechSynthesis.cancel();
      const utter = new SpeechSynthesisUtterance('1');
      const iframe = document.querySelector('iframe');
      iframe.contentWindow.speechSynthesis.cancel();
      alert(window.speechSynthesis === iframe.contentWindow.speechSynthesis);
      alert(
        JSON.stringify(
          {
            'speechSynthesis.pending': speechSynthesis.pending,
            'iframe.contentWindow.speechSynthesis.pending':
              iframe.contentWindow.speechSynthesis.pending,
          },
          null,
          2
        )
      );

      // the spec doesn't say what exception to throw:
      // https://github.com/w3c/speech-api/issues/8
      let threw = false;
      alert(
        JSON.stringify(
          {
            'speechSynthesis.pending': speechSynthesis.pending,
            'iframe.contentWindow.speechSynthesis.pending':
              iframe.contentWindow.speechSynthesis.pending,
          },
          null,
          2
        )
      );
      try {
        iframe.contentWindow.speechSynthesis.pause();
        iframe.contentWindow.speechSynthesis.speak(utter);
        alert(
          JSON.stringify(
            {
              'speechSynthesis.pending': speechSynthesis.pending,
              'iframe.contentWindow.speechSynthesis.pending':
                iframe.contentWindow.speechSynthesis.pending,
            },
            null,
            2
          )
        );
      } catch (e) {
        threw = true;
      } finally {
        speechSynthesis.speak(utter);
        alert(
          JSON.stringify(
            {
              'speechSynthesis.pending': speechSynthesis.pending,
              'iframe.contentWindow.speechSynthesis.pending':
                iframe.contentWindow.speechSynthesis.pending,
            },
            null,
            2
          )
        );
        alert(JSON.stringify({ threw }, null, 2));
        console.log(utter, speechSynthesis, iframe.contentWindow.speechSynthesis);
      }
    </script>
  </body>
</html>

where the result at Chromium 84 is that neither call outputs audio. There is evidently only one speechSynthesis object or instance, and only one "global" queue, irrespective of any "same object" check performed in JavaScript at the browser, because the utterance could have already been sent as a socket message to the local (native) speech synthesis application or engine itself, which could have only one queue and one automatically spawned socket connection. That connection, again, particularly at Chromium, could outlive the "lifetime" of the tab or browser instance.

The question is: is the Chromium implementation in conformance or not in conformance with the Web Speech API specification re "ownership" of a SpeechSynthesisUtterance object and the "global" speechSynthesis instance?

What section of the Web Speech API specification unambiguously establishes that Chromium, by allowing an iframe to call pause() (or cancel()) in a way that affects the queue of the "global" window.speechSynthesis object, is not in conformance with the specification?

In the above case, how did the iframe.contentWindow.speechSynthesis "instance" gain control of the window.speechSynthesis "instance"?

Re "ownership": is there one queue ("the queue") or multiple queues? Is there one global speechSynthesis instance, or potentially multiple speechSynthesis instances, each spawning a new native socket connection that can be paused or canceled independently by the caller using JavaScript at the browser? Or is there only one socket connection, out of the control of the Web Speech API, and thus, again, untestable?

Are the answers immediately clear from the current language of the specification?

guest271314 · 2020-04-19T20:44:06Z

For completeness, consider the possibility that Chromium is actually following the specification: that it outputs no audio because the implementation somehow, without any basis in the specification, recognizes utter as a SpeechSynthesisUtterance that was previously passed to iframe.contentWindow.speechSynthesis.speak() and flagged as "owned" by some other speechSynthesis "instance". To check, we can still pass a new SpeechSynthesisUtterance

speechSynthesis.speak(new SpeechSynthesisUtterance('2'));

which still does not output audio.

Is the Firefox implementation out of conformance for outputting audio at window.speechSynthesis.speak() after the SpeechSynthesisUtterance was passed to iframe.contentWindow.speechSynthesis.speak(), if it is in fact possible for there to be more than one speechSynthesis instance and "ownership" of utter was conferred on iframe.contentWindow.speechSynthesis even though that "instance" is in the paused state?

Due to the above unanswered questions, testing for "ownership" of a SpeechSynthesisUtterance object is futile.

…nership" of a SpeechSynthesisUtterance (#23098) Fixes #23097

…e is not defined by the specification, a=testonly Automatic update from web-platform-tests The specification does not define the concept of or algorithm for "ownership" of a SpeechSynthesisUtterance (#23098) Fixes web-platform-tests/wpt#23097 -- wpt-commits: a12b280bc38ae366807ad9324e1a6fc3d8b5e82e wpt-pr: 23098

…e is not defined by the specification, a=testonly Automatic update from web-platform-tests The specification does not define the concept of or algorithm for "ownership" of a SpeechSynthesisUtterance (#23098) Fixes web-platform-tests/wpt#23097 -- wpt-commits: a12b280bc38ae366807ad9324e1a6fc3d8b5e82e wpt-pr: 23098 UltraBlame original commit: e797ade7daf3bdd783f829c36190e6d13f6d6438

…e is not defined by the specification, a=testonly Automatic update from web-platform-tests The specification does not define the concept of or algorithm for "ownership" of a SpeechSynthesisUtterance (#23098) Fixes web-platform-tests/wpt#23097 -- wpt-commits: a12b280bc38ae366807ad9324e1a6fc3d8b5e82e wpt-pr: 23098 UltraBlame original commit: f478b06a1f1be1b03dd39da5fa0a41077d66b9ae


guest271314 added a commit to guest271314/wpt that referenced this issue Apr 19, 2020

The specification does not define the concept of or algorithm for "ow…

1dcc609

…nership" of a SpeechSynthesisUtterance Fixes web-platform-tests#23097

stephenmcgruer mentioned this issue Apr 20, 2020

What exception should speechSynthesis.speak() throw for reused SpeechSynthesis? WICG/speech-api#8

Open

stephenmcgruer closed this as completed in #23098 Apr 20, 2020

stephenmcgruer pushed a commit that referenced this issue Apr 20, 2020

The specification does not define the concept of or algorithm for "ow…

a12b280

…nership" of a SpeechSynthesisUtterance (#23098) Fixes #23097
