speech-api/SpeechSynthesis-speak-ownership.html is incomplete; renders inaccurate results; should be removed from WPT pending specification clarification [type:untestable] #23097

Closed
guest271314 opened this issue Apr 19, 2020 · 2 comments · Fixed by guest271314/wpt#2 or #23098

Comments

@guest271314
Contributor

https://github.com/web-platform-tests/wpt/blob/master/speech-api/SpeechSynthesis-speak-ownership.html links to WICG/speech-api#8, which does not settle precisely what the concept of "ownership" is in the Web Speech API specification, where the term is used exactly once

4.2.2. SpeechSynthesis Methods
speak(utterance) method

The SpeechSynthesis object takes exclusive ownership of the SpeechSynthesisUtterance object. Passing it as a speak() argument to another SpeechSynthesis object should throw an exception. (For example, two frames may have the same origin and each will contain a SpeechSynthesis object.)

with no accompanying definition of the term or algorithm clearly describing how "ownership" is set or thereafter verified programmatically.

The language used is should throw an exception, not MUST throw an exception.

Nor is there any explanation as to why an exception "should" be thrown in that case.

In practice, whether speechSynthesis instances can be differentiated depends on the implementation, not the specification. speechSynthesis.speak() calls could all be added to the same global queue if, for example, the implementer routes every speak() call to a single local speech synthesis application over a socket connection (as in browsers where a speech synthesis engine is not shipped by default with the source code, e.g., Chromium). In that case it is possible to call cancel() from one window after a different, since-closed window initiated a speak() call, and thereby clear the queue in the socket connection of all pending speak() calls, for every window. Implementations are only loosely consistent because the specification is ambiguous in places. The specification does not state that calling cancel() should only affect a single queue

This method removes all utterances from the queue.

Note the singular "the queue".
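The single-queue scenario described above can be sketched as a small model. This is not browser code: `SharedEngine` and `ModelSynthesis` are hypothetical names, and `pending` is simplified to "the queue is non-empty". It only illustrates how a cancel() issued from one window would clear utterances queued by another if both front ends feed one backend queue.

```javascript
// Hypothetical model, not a browser implementation: all names are illustrative.
class SharedEngine {
  constructor() {
    this.queue = []; // one queue shared by every window
  }
}

class ModelSynthesis {
  constructor(engine, label) {
    this.engine = engine;
    this.label = label;
  }
  speak(text) {
    this.engine.queue.push({ from: this.label, text });
  }
  cancel() {
    // "This method removes all utterances from the queue."
    this.engine.queue.length = 0;
  }
  get pending() {
    // Simplification: any queued utterance counts as pending.
    return this.engine.queue.length > 0;
  }
}

const engine = new SharedEngine();
const topWindow = new ModelSynthesis(engine, 'window');
const frame = new ModelSynthesis(engine, 'iframe');

topWindow.speak('1');
topWindow.speak('2');
frame.cancel(); // issued from a different "window"

console.log(topWindow.pending); // false: the iframe emptied the shared queue
```

Under a per-window queue the same sequence would leave `topWindow.pending` true; the quoted sentence about "the queue" does not say which reading is intended.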

For example, if speechSynthesis.pause() is executed at Chromium before calling speak(), neither iframe.contentWindow.speechSynthesis.speak() nor speechSynthesis.speak() will output audio, and iframe.contentWindow.speechSynthesis.pending and speechSynthesis.pending will always be false. Nightly does output audio for iframe.contentWindow.speechSynthesis.speak() (because that "instance" of speechSynthesis was not paused) and sets speechSynthesis.pending to true.
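The two observed outcomes correspond to two readings of "the global SpeechSynthesis instance". The sketch below models both; `makeSynthPair`, `run`, and the `sharedState` flag are hypothetical names, and nothing here claims to reflect how either browser is actually implemented. It only shows that one shared paused state silences both callers, while separate state lets the un-paused caller speak.

```javascript
// Hypothetical model of shared vs. per-window synthesis state.
function makeSynthPair({ sharedState }) {
  const makeState = () => ({ paused: false, queue: [], spoken: [] });
  const topState = makeState();
  const frameState = sharedState ? topState : makeState();

  const wrap = (state) => ({
    pause() {
      state.paused = true;
    },
    speak(text) {
      state.queue.push(text);
      // Only "speak" the utterance when this state is not paused.
      if (!state.paused) state.spoken.push(state.queue.shift());
    },
    get spoken() {
      return state.spoken;
    },
  });

  return { top: wrap(topState), frame: wrap(frameState) };
}

function run(sharedState) {
  const { top, frame } = makeSynthPair({ sharedState });
  top.pause(); // pause() is called before any speak()
  frame.speak('1');
  top.speak('2');
  return { frame: [...frame.spoken], top: [...top.spoken] };
}

console.log(run(true));  // shared state: neither call is spoken
console.log(run(false)); // separate state: only the iframe's '1' is spoken
```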

The test will always fail because there is no documentation, algorithm, or implementation of the concept of "ownership" of a SpeechSynthesisUtterance object, so threw is always false and the test results are inaccurate. What specific language in the specification is the current test relying on for how to determine "ownership" of a SpeechSynthesisUtterance by a particular call or reference to speechSynthesis, and exactly how is that verification done in the test?

Presently, it is impossible to programmatically determine the "ownership" of a SpeechSynthesisUtterance instance because no such concept exists in practice. Therefore the test is incomplete and inaccurate and, without clarity in the specification, should be removed from the WPT tests until the specification is clear: it is currently impossible to verify "ownership" of a SpeechSynthesisUtterance instance relying solely on the language in the specification.
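To make concrete what is missing, here is a minimal sketch of what an "ownership" algorithm might look like if the specification defined one. Everything in it is an assumption: the specification describes no such bookkeeping, `OwnedSynthesis` and `owners` are hypothetical names, a plain object stands in for a SpeechSynthesisUtterance, and the exception type is a guess because the specification names none.

```javascript
// Hypothetical: the specification defines no algorithm like this.
const owners = new WeakMap(); // utterance -> the synthesis object that owns it

class OwnedSynthesis {
  constructor() {
    this.queue = [];
  }
  speak(utterance) {
    const owner = owners.get(utterance);
    if (owner !== undefined && owner !== this) {
      // The exception type is an assumption; the specification names none.
      throw new TypeError('utterance is owned by another synthesis object');
    }
    owners.set(utterance, this); // take "exclusive ownership"
    this.queue.push(utterance);
  }
}

const first = new OwnedSynthesis();
const second = new OwnedSynthesis();
const utterance = { text: '1' }; // stand-in for a SpeechSynthesisUtterance

first.speak(utterance);
let threw = false;
try {
  second.speak(utterance);
} catch (e) {
  threw = true;
}
console.log(threw); // true in this model
```

A test for "ownership" presumes some step equivalent to the `owners.get()` check; without one in the specification there is nothing for an implementation to conform to and nothing for the test to verify.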

guest271314 added a commit to guest271314/wpt that referenced this issue Apr 19, 2020
@guest271314
Contributor Author

Re the ambiguity of language in the current Web Speech API specification, and why it is presently impossible to refer to that draft for clarity as the primary source document, consider

cancel() method
This method removes all utterances from the queue. If an utterance is being spoken, speaking ceases immediately. This method does not change the paused state of the global SpeechSynthesis instance.

and

pause() method
This method puts the global SpeechSynthesis instance into the paused state. If an utterance was being spoken, it pauses mid-utterance. (If called when the SpeechSynthesis instance was already in the paused state, it does nothing.)

where the definite article "the" is used ("the queue", "the global SpeechSynthesis instance"), unlike the language at speak()

to another SpeechSynthesis object

which does not occur in the Chromium implementation, as there is really only one speechSynthesis object or instance and one queue, where the socket connection session could survive even after the browser is closed.

To demonstrate, we can reorder parts of the test code to execute iframe.contentWindow.speechSynthesis.pause() first, then attempt iframe.contentWindow.speechSynthesis.speak(), then window.speechSynthesis.speak()

<!DOCTYPE html>

<html>
  <head>
    <link rel="stylesheet" href="lib/style.css" />
    <script src="lib/script.js"></script>
  </head>

  <body>
    <iframe></iframe>
    <script>
      // using an utterance for different SpeechSynthesis instances should throw
      // the utterance is short to make the test faster
      speechSynthesis.cancel();
      const utter = new SpeechSynthesisUtterance('1');
      const iframe = document.querySelector('iframe');
      iframe.contentWindow.speechSynthesis.cancel();
      alert(window.speechSynthesis === iframe.contentWindow.speechSynthesis);
      alert(
        JSON.stringify(
          {
            'speechSynthesis.pending': speechSynthesis.pending,
            'iframe.contentWindow.speechSynthesis.pending':
              iframe.contentWindow.speechSynthesis.pending,
          },
          null,
          2
        )
      );

      // the spec doesn't say what exception to throw:
      // https://github.com/w3c/speech-api/issues/8
      let threw = false;
      alert(
        JSON.stringify(
          {
            'speechSynthesis.pending': speechSynthesis.pending,
            'iframe.contentWindow.speechSynthesis.pending':
              iframe.contentWindow.speechSynthesis.pending,
          },
          null,
          2
        )
      );
      try {
        iframe.contentWindow.speechSynthesis.pause();
        iframe.contentWindow.speechSynthesis.speak(utter);
        alert(
          JSON.stringify(
            {
              'speechSynthesis.pending': speechSynthesis.pending,
              'iframe.contentWindow.speechSynthesis.pending':
                iframe.contentWindow.speechSynthesis.pending,
            },
            null,
            2
          )
        );
      } catch (e) {
        threw = true;
      } finally {
        speechSynthesis.speak(utter);
        alert(
          JSON.stringify(
            {
              'speechSynthesis.pending': speechSynthesis.pending,
              'iframe.contentWindow.speechSynthesis.pending':
                iframe.contentWindow.speechSynthesis.pending,
            },
            null,
            2
          )
        );
        alert(JSON.stringify({ threw }, null, 2));
        console.log(utter, speechSynthesis, iframe.contentWindow.speechSynthesis);
      }
    </script>
  </body>
</html>

where the result at Chromium 84 is that neither call outputs audio: there is evidently only one speechSynthesis object or instance, and only one "global" queue, irrespective of any "same object" check performed in JavaScript at the browser, because the utterance could have already been sent as a socket message to the local (native) speech synthesis processing application or engine itself, which could have only one queue and one automatically spawned socket connection, which again, particularly at Chromium, could outlive the "lifetime" of the tab or browser instance.
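The page above repeats the same alert(JSON.stringify(...)) call four times. As a readability aid only, that reporting could be factored into one helper; `pendingSnapshot` is a hypothetical name, and the plain objects passed to it below are stand-ins so the sketch runs outside a browser.

```javascript
// Hypothetical helper: accepts any two objects that expose a `pending` property.
function pendingSnapshot(topSynth, frameSynth) {
  return JSON.stringify(
    {
      'speechSynthesis.pending': topSynth.pending,
      'iframe.contentWindow.speechSynthesis.pending': frameSynth.pending,
    },
    null,
    2
  );
}

// Stand-in objects instead of real SpeechSynthesis instances.
console.log(pendingSnapshot({ pending: false }, { pending: false }));
```

In the page itself, each report would then read alert(pendingSnapshot(speechSynthesis, iframe.contentWindow.speechSynthesis)).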

The question is: is the Chromium implementation in conformance or not in conformance with the Web Speech API specification re "ownership" of a SpeechSynthesisUtterance object and the "global" speechSynthesis instance?

What section of the Web Speech API specification unambiguously establishes that Chromium, by allowing an iframe to call pause() (or cancel()) and thereby affect the queue of the "global" window.speechSynthesis object, is not in conformance with the specification?

In the above case, how did the iframe.contentWindow.speechSynthesis "instance" gain control of the window.speechSynthesis "instance"?

Re "ownership": is there one queue ("the queue") or multiple queues? Is there one global speechSynthesis instance, or potential for multiple speechSynthesis instances, each spawning a new native socket connection that can be paused or canceled independently by the caller using JavaScript at the browser? Or is there only one socket connection, out of the control of the Web Speech API, and thus, again, untestable?

Are the answers immediately clear from the current language of the specification?

@guest271314
Contributor Author

For completeness, consider the possibility that Chromium is actually following the specification: that it does not output audio because the implementation somehow, without any basis in the specification, recognizes utter as a SpeechSynthesisUtterance instance that was previously passed to iframe.contentWindow.speechSynthesis.speak() and flagged as under the "ownership" of some other speechSynthesis "instance". To check, we can pass a new SpeechSynthesisUtterance

speechSynthesis.speak(new SpeechSynthesisUtterance('2'));

which still does not output audio.

Is the Firefox implementation not in conformance for outputting audio at window.speechSynthesis.speak() after the SpeechSynthesisUtterance was passed to iframe.contentWindow.speechSynthesis.speak(), if in fact there can be more than one speechSynthesis instance, and "ownership" of utter was conferred to iframe.contentWindow.speechSynthesis even though that "instance" is in the paused state?

Due to the above unanswered questions, testing for "ownership" of a SpeechSynthesisUtterance object is futile.

stephenmcgruer pushed a commit that referenced this issue Apr 20, 2020
moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this issue May 1, 2020
…e is not defined by the specification, a=testonly

Automatic update from web-platform-tests
The specification does not define the concept of or algorithm for "ownership" of a SpeechSynthesisUtterance (#23098)

Fixes web-platform-tests/wpt#23097
--

wpt-commits: a12b280bc38ae366807ad9324e1a6fc3d8b5e82e
wpt-pr: 23098