<pre class='metadata'>
Title: Open Screen Protocol
Shortname: openscreenprotocol
Level: 1
Status: w3c/ED
ED: https://webscreens.github.io/openscreenprotocol/
Canonical URL: ED
Editor: Mark Foltz, Google, https://github.com/mfoltzgoogle, w3cid 68454
Repository: webscreens/openscreenprotocol
Abstract: The Open Screen Protocol is a suite of network protocols that allow user agents to implement the [[PRESENTATION-API|Presentation API]] and the [[REMOTE-PLAYBACK|Remote Playback API]] in an interoperable fashion.
Group: Second Screen Community Group
Mailing List: public-webscreens@w3.org
Mailing List Archives: https://lists.w3.org/Archives/Public/public-webscreens/
Markup Shorthands: markdown yes, dfn yes, idl yes
</pre>
<p boilerplate="copyright">
<a href="http://www.w3.org/Consortium/Legal/ipr-notice#Copyright">Copyright</a> © [YEAR] the Contributors to the [TITLE] Specification, published by the <a href="https://www.w3.org/community/webscreens/">Second Screen Community Group</a> under the <a href="https://www.w3.org/community/about/agreements/cla/">W3C Community Contributor License Agreement (CLA)</a>.
A human-readable <a href="http://www.w3.org/community/about/agreements/cla-deed/">summary</a> is available.
</p>
<!-- TODO: Add short names to Presentation API spec, so that BS autolinking works as designed. -->
<!-- TODO: Can autolinks to HTML51 be automatically generated? -->
<pre class="anchors">
urlPrefix: https://w3c.github.io/presentation-api/#dfn-; type: dfn; spec: PRESENTATION-API
text: available presentation display
text: controller
text: controlling user agent
text: controlling browsing context
text: presentation
text: presentation display
text: presentation display availability
text: presentation id
text: presentation request url
text: receiver
text: receiving browsing context
text: receiving user agent
urlPrefix: https://w3c.github.io/presentation-api/; type: interface; spec: PRESENTATION-API
text: PresentationConnection
urlPrefix: https://w3c.github.io/remote-playback/#dfn-; type: dfn; spec: REMOTE-PLAYBACK
text: remote playback device
urlPrefix: https://www.w3.org/TR/html51/single-page.html; type: dfn; spec: HTML51
text: media element
</pre>
<h2 class='no-num no-toc no-ref' id='status'>Status of this document</h2>
This specification was published by the [Second Screen Community
Group](https://www.w3.org/community/webscreens/). It is not a W3C Standard nor
is it on the W3C Standards Track. It should not be viewed as a stable
specification, and may change in substantial ways at any time. A future version
of this document will be published as a Community Group Report.
Please note that under the [W3C Community Contributor License Agreement
(CLA)](https://www.w3.org/community/about/agreements/cla/) there is a limited
opt-out and other conditions apply.
Learn more about [W3C Community and Business
Groups](http://www.w3.org/community/).
Introduction {#introduction}
============================
The Open Screen Protocol connects browsers to devices capable of rendering Web
content for a shared audience. Typically, these are devices like
Internet-connected TVs, HDMI dongles, or "smart" speakers.
The protocol is a suite of subsidiary network protocols that enable two user
agents to implement the [[PRESENTATION-API|Presentation API]] and
[[REMOTE-PLAYBACK|Remote Playback API]] in an interoperable fashion. This means
that a user can expect these APIs to work as intended when connecting two
devices from independent implementations of the Open Screen Protocol.
The Open Screen Protocol is a specific implementation of these two APIs, meaning
that it does not handle all possible ways that browsers and presentation
displays could support these APIs. The Open Screen Protocol specifically
supports browsers and displays that are connected via the same local area
network, and that initiate presentation or remote playback by sending a URL
from the browser to the target display.
The Open Screen Protocol is intended to be extensible, so that additional
capabilities can be added over time. This may include new implementations of
existing APIs, or new APIs.
Terminology {#terminology}
--------------------------
We borrow terminology from the [[PRESENTATION-API|Presentation API]] and
[[REMOTE-PLAYBACK|Remote Playback API]] for terms used in this document. These
terms are summarized here.
We call the browser that is used to discover and initiate presentation of Web
content on another device the [=controlling user agent=]. We call the
user agent on the device rendering the Web content the
[=receiving user agent=], or *receiver* for short. We use
the term [=presentation display=] to refer to the entire platform
responsible for implementing the *receiver*, including browser, OS, networking,
audio and graphics.
For the [[PRESENTATION-API|Presentation API]], presentation of Web content is
initiated at the request of a [=controlling browsing context=] (or
*controller*), which creates a [=receiving browsing context=] (or
*presentation*) to load a [=presentation request URL=] and exchange messages
with the resulting document.
Before this can happen, the [=controlling user agent=] must determine which
[=receivers=], if any, are compatible with the [=presentation request URL=]. This
happens by determining the [=presentation display availability=] for the
presentation request URL.
For the [[REMOTE-PLAYBACK|Remote Playback API]], the device responsible for
rendering the content of a [=media element=] when remote playback is connected
is called the [=remote playback device=].
For additional terms and idioms specific to the [[PRESENTATION-API|Presentation API]] or
Remote Playback API, please consult the respective specifications.
We also use the term "agent" to mean any implementation of this protocol,
browser, device, or otherwise, acting as a controller or a receiver.
Requirements {#requirements}
============================
Presentation API Requirements {#requirements-presentation-api}
--------------------------------------------------------------
1. A controlling user agent must be able to discover the presence of a
presentation display connected to the same IPv4 or IPv6 subnet and reachable
by IP multicast.
2. A controlling user agent must be able to obtain the IPv4 or IPv6 address of
the display, a friendly name for the display, and an IP port number for
establishing a network transport to the display.
3. A controlling user agent must be able to determine if the receiver is
reasonably capable of rendering a specific [=presentation request URL=].
4. A controlling user agent must be able to start a new presentation on a receiver given a
[=presentation request URL=] and [=presentation ID=].
5. A controlling user agent must be able to create a new
{{PresentationConnection}} to an existing presentation on the
receiver, given its [=presentation request URL=] and [=presentation ID=].
6. It must be possible to close a {{PresentationConnection}} between a
controller and a presentation, and signal both parties with the
reason why the connection was closed.
7. Multiple controllers must be able to connect to a single presentation
simultaneously, possibly from one or more [=controlling user agents=].
8. Messages sent by the controller must be delivered to the presentation (or
vice versa) in a reliable and in-order fashion.
9. If a message cannot be delivered, then the controlling user agent must be
able to signal the receiver (or vice versa) that the connection should be
closed with reason `error`.
10. The controller and presentation must be able to send and receive `DOMString`
messages (represented as `string` type in ECMAScript).
11. The controller and presentation must be able to send and receive binary
messages (represented as `Blob` objects in HTML5, or `ArrayBuffer` or
`ArrayBufferView` types in ECMAScript).
12. The controlling user agent must be able to signal to the receiver to
terminate a presentation, given its [=presentation request URL=] and [=presentation
ID=].
13. The receiver must be able to signal all connected controlling user agents
when a presentation is terminated.
Remote Playback API Requirements {#requirements-remote-playback}
----------------------------------------------------------------
Issue(3): Requirements for Remote Playback API
Non-Functional Requirements {#requirements-non-functional}
----------------------------------------------------------
1. It should be possible to implement an Open Screen presentation display using
modest hardware requirements, similar to what is found in a low end
smartphone, smart TV or streaming device. See the [Device
Specifications](device_specs.md) document for expected presentation display
hardware specifications.
2. It should be possible to implement an Open Screen controlling user agent on a
low-end smartphone. See the [Device Specifications](device_specs.md) document
for expected controlling user agent hardware specifications.
3. The discovery and connection protocols should minimize power consumption,
especially on the controlling user agent which is likely to be battery
powered.
4. The protocol should minimize the amount of information provided to a passive
network observer about the identity of the user, activity on the controlling
user agent and activity on the receiver.
5. The protocol should prevent passive network eavesdroppers from learning
presentation URLs, presentation IDs, or the content of presentation messages
passed between controllers and presentations.
6. The protocol should prevent active network attackers from impersonating a
display and observing or altering data intended for the controller or
presentation.
7. The controlling user agent should be able to discover quickly when a
presentation display becomes available or unavailable (i.e., when it connects
or disconnects from the network).
8. The controlling user agent should present sensible information to the user
when a protocol operation fails. For example, if a controlling user agent is
unable to start a presentation, it should be possible to report in the
controlling user agent interface if it was a network error, authentication
error, or the presentation content failed to load.
9. The controlling user agent should be able to remember authenticated
presentation displays. This means it is not required for the user to
intervene and re-authenticate each time the controlling user agent connects
to a pre-authenticated display.
10. Message latency between the controller and a presentation should be minimized
to permit interactive use. For example, it should be comfortable to type in
a form in the controller and have the text appear in the presentation in real
time. Real-time latency for gaming or mouse use is ideal, but not a
requirement.
11. The controlling user agent initiating a presentation should communicate its
preferred locale to the receiver, so it can render the presentation content
in that locale.
12. It should be possible to extend the control protocol (above the discovery and
transport levels) with optional features not defined explicitly by the
specification, to facilitate experimentation and enhancement of the base
APIs.
Discovery with mDNS {#discovery}
===============================
Agents may discover one another using [[RFC6763|DNS-SD]] over [[RFC6762|mDNS]].
To do so, agents must use the service name "_openscreen._udp.local".
Advertising agents must use an instance name that is a prefix of the agent's
display name. If the instance name is not the complete display name (if it has
been truncated), it must be terminated by a null character. It is a prefix so
that the name displayed to the user pre-verification can be verified later. It
is terminated by a null character in the case of truncation so that the
listening agent knows it has been truncated. This complexity is necessary to
allow for display names that exceed the size allowed in an instance name and for
such (possibly truncated) display names to be visible to the user sooner
(before a QUIC connection is made). Listening agents must treat instance names
as unverified and must verify that the instance name is a prefix of the verified
display name before showing the user a verified display name.
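
The prefix rule above can be sketched roughly as follows. This is a minimal
illustration, assuming the 63-byte limit on DNS-SD instance names from
[[RFC6763|DNS-SD]]; the function names `make_instance_name` and
`instance_name_matches` are hypothetical and not defined by this specification.

```python
MAX_INSTANCE_NAME_BYTES = 63  # assumed DNS-SD instance name limit (RFC 6763)


def make_instance_name(display_name: str) -> str:
    """Return the display name, or a null-terminated prefix if it is too long."""
    encoded = display_name.encode("utf-8")
    if len(encoded) <= MAX_INSTANCE_NAME_BYTES:
        return display_name
    # Reserve one byte for the terminating null character.
    prefix = encoded[:MAX_INSTANCE_NAME_BYTES - 1]
    # Drop any partial UTF-8 sequence left at the cut point.
    return prefix.decode("utf-8", errors="ignore") + "\0"


def instance_name_matches(instance_name: str, verified_display_name: str) -> bool:
    """Check an unverified instance name against the verified display name."""
    if instance_name.endswith("\0"):
        # Truncated: everything before the null must be a prefix.
        return verified_display_name.startswith(instance_name[:-1])
    # Not truncated: the instance name is the complete display name.
    return instance_name == verified_display_name
```

A listening agent would call `instance_name_matches` only after it has
obtained the verified display name over an authenticated connection.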
Advertising agents must include DNS TXT records with the following
keys and values:
- key "fp" with value of the certificate fingerprint of the advertising agent.
The format of the fingerprint is defined by [RFC 8122 section
5](https://tools.ietf.org/html/rfc8122#section-5), excluding the
"fingerprint:" prefix and including the hash function, space, and hex-encoded
fingerprint. The fingerprint value also functions as an ID for the agent.
All agents must support the following hash functions: "sha-256", "sha-512".
Agents must not support the following hash functions: "md2", "md5".
<!-- TODO: include cross references to the specs for these hash functions. -->
- key "mv" with an unsigned integer value that indicates that
metadata has changed. Whenever its metadata changes, the advertising agent
must update it to a greater value. This signals to the listening agent that
it should connect to the advertising agent to discover updated metadata.
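
As a hedged illustration of the two TXT keys above, the following Python sketch
builds "fp" and "mv" values from a DER-encoded certificate. It assumes sha-256
is the chosen hash function and that "mv" is written as a decimal string; the
text above does not fix the textual encoding of "mv", and the function names
are hypothetical.

```python
import hashlib


def format_fingerprint(certificate_der: bytes) -> str:
    """Format 'sha-256 XX:XX:...' in the RFC 8122 style, without the prefix."""
    digest = hashlib.sha256(certificate_der).digest()
    return "sha-256 " + ":".join(f"{byte:02X}" for byte in digest)


def make_txt_records(certificate_der: bytes, metadata_version: int) -> dict:
    """Build the TXT key/value pairs an advertising agent would publish."""
    if metadata_version < 0:
        raise ValueError("mv must be an unsigned integer")
    return {
        "fp": format_fingerprint(certificate_der),
        # Assumption: the integer is published in decimal text form.
        "mv": str(metadata_version),
    }
```

An advertising agent would republish the records with a larger
`metadata_version` each time its metadata changes.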
<!-- TODO: Add examples of sample mDNS records. -->
Future extensions to this QUIC-based protocol can use the same metadata
discovery process to indicate support for those extensions, through a
capabilities mechanism to be determined. If a future version of the Open Screen
Protocol uses mDNS but breaks compatibility with the metadata discovery process,
it should change the DNS-SD service name to a new value, indicating a new
mechanism for metadata discovery.
Transport and metadata discovery with QUIC {#transport}
=======================================================
If a listening agent wants to connect to an advertising agent, or to
learn further metadata about it, it initiates a [[!QUIC]] connection to
the IP and port from the SRV record. Prior to authentication, messages may be
exchanged (such as further metadata), but such information should be treated as
unverified (for example, by indicating to a user that a display name of an
unauthenticated agent is unverified).
To learn further metadata, an agent may send an agent-info-request
message (see [[#appendix-a]]) and receive back an agent-info-response message. The
messages may contain the following information with the following meaning:
- display-name: (required) The display name of the responding agent intended
to be displayed to a user by the requesting agent. If the responding agent
is not yet authenticated, the requesting agent should make UI affordance for
indicating to the user that the display name is not yet verified. If the
responding agent changes its display name, the requesting agent should
make UI affordance for indicating to the user that the display name has
changed.
- model-name: (optional) If the agent is a hardware device, the model name of
the device. This is used mainly for debugging purposes, but may be
displayed to the user of the requesting agent.
<!-- TODO: Add device type and/or capabilities -->
Listening agents act as QUIC clients. Advertising agents act as QUIC servers.
If a listening agent wishes to receive messages from an advertising agent or an
advertising agent wishes to send messages to a listening agent, it may wish to
keep the QUIC connection alive. Once neither side needs to keep the connection
alive for the purposes of sending or receiving messages, the connection should
be closed with an error code of 5139. In order to keep a QUIC connection alive, an
agent may send an agent-status-request message, and any agent that receives an
agent-status-request message should send an agent-status-response message. Such
messages should be sent more frequently than the QUIC idle_timeout transport
parameter (see section 18 of [[!QUIC]]) and QUIC PING
frames should not be used. An idle_timeout transport parameter of 25 seconds is
recommended. The agent should behave as though a timer less than the
idle_timeout were reset every time a message is sent on a QUIC stream. If the
timer expires, an agent-status-request message should be sent.
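
The keep-alive behavior described above can be modeled with a simple timer.
This is only a sketch: the class name and the 5 second safety margin are
assumptions made for illustration, and the caller is expected to supply the
current time and to send the actual agent-status-request message.

```python
class KeepAliveTimer:
    """Tracks when an agent-status-request is due on an idle connection."""

    def __init__(self, idle_timeout_s: float = 25.0, margin_s: float = 5.0):
        # The timer interval must be shorter than the QUIC idle_timeout.
        self.interval_s = idle_timeout_s - margin_s
        if self.interval_s <= 0:
            raise ValueError("margin must be smaller than the idle timeout")
        self.deadline = None

    def on_message_sent(self, now: float) -> None:
        """Reset the timer every time a message is sent on a QUIC stream."""
        self.deadline = now + self.interval_s

    def status_request_due(self, now: float) -> bool:
        """Return True once the timer has expired without being reset."""
        return self.deadline is not None and now >= self.deadline
```

Sending the agent-status-request itself counts as a sent message, so the agent
would call `on_message_sent` again after sending it.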
If a client agent wishes to send messages to a server agent, the client
agent can connect to the server agent "on demand"; it does not need to
keep the connection alive.
The agent-info-response message and agent-status-response
messages may be extended to include additional information not defined
in this spec. If done ad-hoc by applications and not in future specs,
keys should be chosen to avoid collision, such as by choosing large
integers or long strings. Agents must ignore keys in the
agent-info-message that they do not understand to allow agents
to easily extend this message.
Message delivery using CBOR and QUIC streams {#control}
========================================================
Messages are serialized using [[!RFC7049|CBOR]]. To
send a group of messages in order, that group of messages must be sent in one
QUIC stream. Independent groups of messages (with no ordering dependency
across groups) should be sent in different QUIC streams. In order to put
multiple CBOR-serialized messages into the same QUIC stream, the following
is used.
For each message, the sender must write to the QUIC stream the following:
1. A type key representing the type of the message, encoded as a variable-length
integer (see [[#appendix-a]] for type keys)
2. The message length encoded as a variable-length integer
3. The message encoded as CBOR (whose length must match the value in step 2)
If an agent receives a message for which it does not recognize a
type key, it must close the QUIC connection with an application error
code of 404 and should include the unknown type key in the reason phrase
(see [[!QUIC]] section 19.4).
Variable-length integers are encoded in the same format as defined by [QUIC
transport section
16](https://tools.ietf.org/html/draft-ietf-quic-transport-16#section-16).
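
The framing and variable-length integer rules above can be sketched in Python
as follows. The sketch treats the CBOR body as already-serialized bytes, so no
CBOR library is needed; the function names are hypothetical, and the type key
value 10 in the usage note is a placeholder rather than a key taken from
[[#appendix-a]].

```python
from typing import Tuple


def encode_varint(value: int) -> bytes:
    """Encode an integer using the QUIC variable-length integer format."""
    if value < 0:
        raise ValueError("varints are unsigned")
    # The two most significant bits of the first byte give the total length.
    for prefix, length in ((0b00, 1), (0b01, 2), (0b10, 4), (0b11, 8)):
        usable_bits = 8 * length - 2
        if value < (1 << usable_bits):
            return ((prefix << usable_bits) | value).to_bytes(length, "big")
    raise ValueError("value does not fit in 62 bits")


def decode_varint(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Return (value, next_offset) for the varint starting at offset."""
    if offset >= len(data):
        raise ValueError("truncated varint")
    length = 1 << (data[offset] >> 6)
    if offset + length > len(data):
        raise ValueError("truncated varint")
    raw = int.from_bytes(data[offset:offset + length], "big")
    return raw & ((1 << (8 * length - 2)) - 1), offset + length


def frame_message(type_key: int, cbor_body: bytes) -> bytes:
    """Write type key, body length, then the CBOR-encoded body."""
    return encode_varint(type_key) + encode_varint(len(cbor_body)) + cbor_body


def read_message(data: bytes, offset: int = 0) -> Tuple[int, bytes, int]:
    """Return (type_key, cbor_body, next_offset) for one framed message."""
    type_key, offset = decode_varint(data, offset)
    length, offset = decode_varint(data, offset)
    if offset + length > len(data):
        raise ValueError("truncated message body")
    return type_key, data[offset:offset + length], offset + length
```

For example, `frame_message(10, b"\xa0")` frames an empty CBOR map under the
placeholder type key 10. A receiver that does not recognize the type key
returned by `read_message` would close the connection as described above.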
Many messages are requests and responses, so a common format is defined for
those. Each request and response includes a request ID, which is an unsigned
integer chosen by the requester. Responses must include the request ID of the
request they are associated with.
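
One way a requester might pair responses with requests is sketched below. The
`RequestTracker` class is hypothetical, and a simple counter is only one
possible way to choose request IDs.

```python
class RequestTracker:
    """Assigns request IDs and matches incoming responses to them."""

    def __init__(self):
        self._next_request_id = 1
        self._pending = {}  # request ID -> caller-supplied context

    def start_request(self, context):
        """Pick an unsigned request ID and remember what it was for."""
        request_id = self._next_request_id
        self._next_request_id += 1
        self._pending[request_id] = context
        return request_id

    def complete_request(self, request_id):
        """Return the context for a response, or None if the ID is unknown."""
        return self._pending.pop(request_id, None)
```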
Authentication {#authentication}
================================
In order for one agent (the challenger) to authenticate another (the responder),
the challenger may send an authentication-request message and expect an
authentication-response message to be sent back from the responder. To
mutually authenticate, this mechanism is used twice, once by each side acting as
the challenger. This mechanism assumes the agents share a low-entropy secret,
such as a number or a short password that could be entered by a user on a
keyboard or TV remote control.
For all messages and objects defined in this section, see Appendix A for the full
CDDL definitions.
The challenger sends an authentication-request message with the following values:
- mechanism: The authentication mechanism being used. This standard only
defines the mechanism hkdf-of-scrypt-of-psk but this field gives a place for
other mechanisms to be specified.
- salt: 32 random bytes. This salt is used in HKDF, so see
https://tools.ietf.org/html/rfc5869#section-3.1 for more details on how this
value should be generated.
- cost: log base 2 of the cost parameter (N) for scrypt defined in [RFC
7914 section 2](https://tools.ietf.org/html/rfc7914#section-2). It must be
greater than or equal to 14 (to avoid being too weak) and less than or equal
to 128 (the limit defined by scrypt). A value of 15 is recommended (an
scrypt N of 2^15 or 32768).
The responder replies with an authentication-response message with the following values:
- result: Whether the responder was able to calculate proof of possession of the
shared secret, and if it failed, why it failed.
- proof: The result of running the authentication mechanism. The steps for
hkdf-of-scrypt-of-psk are described below.
The challenger verifies the proof and sends the responder an
authentication-result message with the following values:
- result: Whether the challenger was able to authenticate the responder,
and if not, why not.
The challenger must limit the time the responder has to send a response to 60
seconds (to avoid the possibility of brute-force attacks).
For hkdf-of-scrypt-of-psk, the proof is calculated using the following steps:
1. Let secret be the pre-shared secret.
2. Let N be 2 to the power of the cost from the authentication-request
message.
3. Let r be 8.
4. Let p be 1.
5. Let keyLength be 32.
6. Let scryptResult be the result of running
[scrypt](https://tools.ietf.org/html/rfc7914) on secret with cost parameter N,
block size r, parallelization parameter p, and derived key length of
keyLength.
7. Let hashFunction be sha-256.
8. Let salt be the salt from the authentication-request message.
9. Let info be a CBOR-serialized certificate-fingerprint-pair object (CDDL
defined in Appendix A) with the following values:
- challenger-fingerprint: The result of running sha-256 on the
Distinguished Encoding Rules (DER) form (see
https://tools.ietf.org/html/rfc8122#section-5) of the certificate used by
the challenger in the QUIC crypto handshake during connection establishment.
- responder-fingerprint: The result of running sha-256 on the
Distinguished Encoding Rules (DER) form (see
https://tools.ietf.org/html/rfc8122#section-5) of the certificate used by
the responder in the QUIC crypto handshake during connection establishment.
10. Let proof be the result of running
[\HKDF](https://tools.ietf.org/html/rfc5869) on scryptResult with
both the extract and expand steps, hash function hashFunction, salt salt,
application-specific info, and output key length keyLength.
To verify that the responder's proof is correct, the challenger makes the same
calculation of the proof and compares the result. If the results are the same,
the challenger considers the responder authenticated, and considers it
unauthenticated otherwise.
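
The proof calculation and its verification can be sketched with the Python
standard library as shown below. Several points are assumptions made for
illustration: the steps above do not name a salt for scrypt, so the sketch
passes an empty one; `info` is taken as bytes that have already been
CBOR-serialized; and the function names are hypothetical. The salt from the
authentication-request message is used as the HKDF salt.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int) -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract step followed by expand step."""
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def compute_proof(secret: bytes, cost: int, salt: bytes, info: bytes) -> bytes:
    """Compute the hkdf-of-scrypt-of-psk proof for the given inputs."""
    if not 14 <= cost <= 128:
        raise ValueError("cost must be between 14 and 128")
    n, r, p, key_length = 2 ** cost, 8, 1, 32
    # ASSUMPTION: an empty scrypt salt, since the steps above do not name one.
    scrypt_result = hashlib.scrypt(
        secret, salt=b"", n=n, r=r, p=p,
        maxmem=min(256 * n * r, 2 ** 31 - 1), dklen=key_length)
    return hkdf_sha256(scrypt_result, salt, info, key_length)


def verify_proof(proof: bytes, secret: bytes, cost: int,
                 salt: bytes, info: bytes) -> bool:
    """Recompute the proof and compare it in constant time."""
    return hmac.compare_digest(proof, compute_proof(secret, cost, salt, info))
```

A challenger would generate `salt` with a cryptographically secure source such
as `os.urandom(32)`. The constant-time comparison is a design choice of this
sketch, not a requirement stated above.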
Note: the values of 32 above (for salt length, keyLength) are based on the
output size of sha-256. If a different hash mechanism is used in the future,
these values should be updated as well.
Control Protocols {#control-protocols}
============================
Presentation Protocol {#presentation-protocol}
---------------------------------------------
This section defines the use of the Open Screen Protocol for starting,
stopping, and controlling presentations as defined by
[[PRESENTATION-API|Presentation API]]. A subsequent section will
define how APIs in [[PRESENTATION-API|Presentation API]] map to the
protocol messages defined in this section.
For all messages defined in this section, see [[#appendix-a]] for the full
CDDL definitions.
<!-- TODO: Add a capability that indicates support for the
presentation protocol.
See https://github.com/webscreens/openscreenprotocol/issues/123 -->
To learn which receivers are [=available presentation displays=] for a
particular URL or set of URLs, the controller may send a
presentation-url-availability-request message with the following values:
- urls: A list of presentation URLs. Must not be empty.
- watch-duration: The period of time that the controller is interested in
receiving updates about the URLs, should the availability change.
- watch-id: An identifier the receiver may use when sending updates about URL
availability so that the controller knows which URLs the receiver is referring
to.
In response, the receiver should send one presentation-url-availability-response
message with the following values:
- url-availabilities: A list of URL availability states (available,
unavailable, or invalid). Each state must correspond to the matching URL
from the request by list index.
The receiver should later (up to the current time plus request
watch-duration) send presentation-url-availability-event messages if
URL availabilities change. Such events contain the following values:
- watch-id: The watch-id given in the presentation-url-availability-request,
used to refer to the presentation URLs whose availability has changed.
- url-availabilities: A list of URL availability states (available,
unavailable, or invalid). Each state must correspond to the URLs from the
request referred to by the watch-id.
Note that these messages are not broadcast to all controllers. They are sent
individually to controllers that have requested availability for the URLs that
have changed in availability state within the watch duration of the original
availability request.
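
The receiver-side bookkeeping this implies might look like the following
sketch. The `AvailabilityWatcher` class, its method names, and the use of plain
dictionaries for messages are assumptions for illustration; the "invalid" state
is left out for brevity.

```python
class AvailabilityWatcher:
    """Tracks URL availability watches requested by controllers."""

    def __init__(self):
        self._available = set()  # URLs currently considered available
        self._watches = {}       # (controller, watch-id) -> (urls, expiry)

    def _states(self, urls):
        return ["available" if url in self._available else "unavailable"
                for url in urls]

    def add_watch(self, controller, watch_id, urls, watch_duration_s, now):
        """Register a watch and return the availability response values."""
        if not urls:
            raise ValueError("urls must not be empty")
        self._watches[(controller, watch_id)] = (list(urls),
                                                 now + watch_duration_s)
        return {"url-availabilities": self._states(urls)}

    def set_availability(self, url, available, now):
        """Record a change and return (controller, event) pairs to send."""
        if available == (url in self._available):
            return []  # nothing changed, so no events
        if available:
            self._available.add(url)
        else:
            self._available.discard(url)
        events = []
        for key, (urls, expiry) in list(self._watches.items()):
            if now > expiry:
                del self._watches[key]  # the watch duration has elapsed
            elif url in urls:
                controller, watch_id = key
                events.append((controller, {
                    "watch-id": watch_id,
                    "url-availabilities": self._states(urls),
                }))
        return events
```

Only controllers with an unexpired watch that covers the changed URL receive
an event, which matches the note above that events are not broadcast.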
To save power, the controller may disconnect the QUIC connection and
later reconnect to send availability requests and receive availability
responses and updates.
To start a presentation, the controller may send a
presentation-start-request message to the receiver with the following
values:
- presentation-id: the presentation identifier
- url: the selected presentation URL
- headers: headers that the receiver should use to fetch the
presentation URL. For example, section 6.6.1 of
[[PRESENTATION-API|Presentation API]] says that the Accept-Language
header should be provided.
The presentation ID must follow the restrictions defined by
[[PRESENTATION-API|Presentation API]] section 6.1, in that it must
consist of at least 16 ASCII characters.
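
A minimal check of this restriction could look like the sketch below. The
function names are hypothetical, and generating the ID from random hex digits
is only one possible approach.

```python
import secrets


def is_valid_presentation_id(presentation_id: str) -> bool:
    """Return True if the ID has at least 16 characters, all of them ASCII."""
    return len(presentation_id) >= 16 and presentation_id.isascii()


def new_presentation_id() -> str:
    """Generate a random ID of 32 ASCII hex characters."""
    return secrets.token_hex(16)
```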
When the receiver receives the presentation-start-request, it should send back a
presentation-start-response message after either the presentation URL has been
fetched and loaded, or the receiver has failed to do so. If it has failed, it
must respond with the appropriate result (such as invalid-url or timeout). If
it has succeeded, it must reply with a success result. Additionally, the
response must include the following:
- connection-id: An ID that both agents can use to send connection messages
to each other. It is chosen by the receiver for ease of implementation: if
the message receiver chooses the connection-id, it may keep the ID unique
across connections, thus making message demuxing/routing easier.
<!-- TODO: Add optional HTTP response code to the response? -->
To send a presentation message, the controller or receiver may send a
presentation-connection-message with the following values:
- connection-id: The ID from the presentation-start-response or
presentation-connection-open-response messages.
- message: the presentation message data.
To terminate a presentation, the controller may send a
presentation-termination-request message with the following values:
- presentation-id: The ID of the presentation to terminate.
- reason: The reason the presentation is being terminated.
When a receiving agent receives a presentation-termination-request, it should
send back a presentation-termination-response message to the requesting
agent. It should also notify other controllers about the termination by sending
a presentation-termination-event message. It may also send the same message if
it terminates a presentation without a request from a controller to do so. This
message contains the following values:
- presentation-id: The ID of the presentation that was terminated.
- reason: The reason the presentation was terminated.
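
The termination flow above might be handled on the receiver roughly as in the
following sketch. The `terminate` function, the `presentations` mapping from
presentation ID to connected controllers, and the dictionary message layout
(including the "type" key) are all assumptions for illustration.

```python
def terminate(presentations, presentation_id, reason,
              requester=None, request_id=None):
    """Return (recipient, message) pairs to send when terminating.

    requester is None when the receiver terminates on its own initiative.
    """
    controllers = presentations.pop(presentation_id, set())
    outgoing = []
    if requester is not None:
        # Answer the controller that asked for the termination.
        outgoing.append((requester, {
            "type": "presentation-termination-response",
            "request-id": request_id,
        }))
    # Notify every other connected controller with an event.
    for controller in sorted(controllers - {requester}):
        outgoing.append((controller, {
            "type": "presentation-termination-event",
            "presentation-id": presentation_id,
            "reason": reason,
        }))
    return outgoing
```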
<!-- TODO: Split up reason into reason and whether it was triggered by the user
or not? -->
To accept incoming connection requests from controllers, a receiver
must receive and process the presentation-connection-open-request
message which contains the following values:
- presentation-id: The ID of the presentation to connect to.
- url: The URL of the presentation to connect to.
The receiver should, upon receipt of a
presentation-connection-open-request message, send back a
presentation-connection-open-response message which contains the
following values:
- result: a code indicating success or failure, and the reason for the failure
- connection-id: An ID that both agents can use to send connection messages
to each other. It is chosen by the receiver for ease of implementation (if
the message receiver chooses the connection-id, it may keep the ID unique
across connections, thus making message demuxing/routing easier).
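
The following sketch shows why letting the receiver choose the connection-id
simplifies routing: one counter keeps IDs unique across all connections, so a
single lookup table can route incoming messages. The `ConnectionRegistry`
class and the "success" and "failure" result values are placeholders, not
values taken from [[#appendix-a]].

```python
import itertools


class ConnectionRegistry:
    """Receiver-side table of presentations and their connections."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._presentations = {}  # presentation-id -> presentation URL
        self._connections = {}    # connection-id -> presentation-id

    def add_presentation(self, presentation_id, url):
        self._presentations[presentation_id] = url

    def open_connection(self, request):
        """Build response values for a presentation-connection-open-request."""
        url = self._presentations.get(request["presentation-id"])
        if url is None or url != request["url"]:
            return {"result": "failure"}  # placeholder result code
        connection_id = next(self._next_id)
        self._connections[connection_id] = request["presentation-id"]
        return {"result": "success", "connection-id": connection_id}

    def route(self, connection_id):
        """Return the presentation-id a connection message belongs to."""
        return self._connections.get(connection_id)
```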
A controller may terminate a connection without terminating the presentation by
sending a presentation-connection-close-request message with the following
values:
- connection-id: The ID of the connection to close.
The receiver should, upon receipt of a presentation-connection-close-request,
send back a presentation-connection-close-response message with the following
values:
- result: Whether the close succeeded or failed and, if it failed, why.
The receiver may also close a connection without a request from the controller
to do so and without terminating a presentation. If it does so, it should send
a presentation-connection-close-event to the controller with the following
values:
- connection-id: The ID of the connection that was closed
- reason: The reason the connection was closed
- error-message: A debug message suitable for a log or perhaps presented to
the user with more explanation as to why it was closed.
<!-- TODO: Why does the Presentation API spec not mention the use of the close
message? -->
<!-- TODO: Specify message ordering groups. -->
Presentation API {#presentation-api}
---------------------------------------------
This section defines how [[PRESENTATION-API|Presentation API]] uses
the messages defined in the previous sections.
Non-browser agents can also send and receive the same messages defined here.
If so, a non-browser agent must follow the same restrictions for the
presentation-id as a browser does, as defined by [[PRESENTATION-API|Presentation
API]] section 6.1 (at least 16 ASCII characters).
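
A non-browser agent could check this restriction with something like the
following Python sketch. The function names are hypothetical, and generating the
ID with `secrets.token_urlsafe` is only one possible choice assumed here for
illustration. This spec does not mandate a generation method.

```python
import secrets


def is_valid_presentation_id(presentation_id: str) -> bool:
    """Check the restriction quoted above: at least 16 ASCII characters."""
    return len(presentation_id) >= 16 and presentation_id.isascii()


def new_presentation_id() -> str:
    """Generate an ID that satisfies the restriction (illustrative only).

    16 random bytes encode to 22 URL-safe ASCII characters.
    """
    return secrets.token_urlsafe(16)
```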
When [[PRESENTATION-API|Presentation API]] [section
6.4.2](https://www.w3.org/TR/presentation-api/#the-list-of-available-presentation-displays)
says "This list of presentation displays ... is populated based on an
implementation specific discovery mechanism", the [=controlling user
agent=] may use the mDNS, QUIC, agent-info-request, and
presentation-url-availability-request messages defined previously in
this spec to discover receivers.
When [[PRESENTATION-API|Presentation API]] [section
6.4.2](https://www.w3.org/TR/presentation-api/#the-list-of-available-presentation-displays)
says "To further save power, ... implementation specific discovery of
presentation displays can be resumed or suspended.", the [=controlling
user agent=] may use the power saving mechanism defined in the
previous section.
When [[PRESENTATION-API|Presentation API]] [section
6.3.4](https://www.w3.org/TR/presentation-api/#starting-a-presentation-connection)
says "Using an implementation specific mechanism, tell U to create a
receiving browsing context with D, presentationUrl, and I as
parameters.", U (the [=controlling user agent=]) may send a
presentation-start-request message to D (the receiver), with I for the
presentation identifier and presentationUrl for the selected
presentation URL.
<!-- TODO: Once the Presentation API has text about reconnecting via an
implementation specific mechanism, quote that here and map it to a message -->
When [[PRESENTATION-API|Presentation API]] [section
6.5.2](https://www.w3.org/TR/presentation-api/#sending-a-message-through-presentationconnection)
says "Using an implementation specific mechanism, transmit the
contents of messageOrData as the presentation message data and
messageType as the presentation message type to the destination
browsing context", the [=controlling user agent=] may send a
presentation-connection-message with messageOrData for the
presentation message data. Note that the messageType is embedded in
the encoded CBOR type and does not need an additional value in the
message.
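
To illustrate how the message type can ride along in the CBOR encoding, the
sketch below encodes a Python `str` as a CBOR text string (major type 3) and
`bytes` as a CBOR byte string (major type 2), then recovers the type from the
initial byte. The function names are hypothetical. The sketch handles only
definite-length strings and is not a complete CBOR implementation.

```python
def encode_cbor_string(value):
    """Encode str as CBOR major type 3 (text) or bytes as major type 2 (bytes)."""
    if isinstance(value, str):
        major, payload = 3, value.encode("utf-8")
    elif isinstance(value, (bytes, bytearray)):
        major, payload = 2, bytes(value)
    else:
        raise TypeError("presentation message data must be str or bytes")
    n = len(payload)
    # The low 5 bits of the initial byte hold the length, or a marker (24..27)
    # saying that the length follows in 1, 2, 4, or 8 big-endian bytes.
    if n < 24:
        head = bytes([(major << 5) | n])
    elif n < 1 << 8:
        head = bytes([(major << 5) | 24, n])
    elif n < 1 << 16:
        head = bytes([(major << 5) | 25]) + n.to_bytes(2, "big")
    elif n < 1 << 32:
        head = bytes([(major << 5) | 26]) + n.to_bytes(4, "big")
    else:
        head = bytes([(major << 5) | 27]) + n.to_bytes(8, "big")
    return head + payload


def message_type_of(encoded):
    """Recover the message type from the major type in the initial byte."""
    major = encoded[0] >> 5
    return {2: "binary", 3: "text"}.get(major)
```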
When [[PRESENTATION-API|Presentation API]] [section
6.5.6](https://www.w3.org/TR/presentation-api/#terminating-a-presentation-in-a-controlling-browsing-context)
says "Send a termination request for the presentation to its receiving user
agent using an implementation specific mechanism", the [=controlling user
agent=] may send a presentation-termination-request message.
When [[PRESENTATION-API|Presentation API]] [section
6.7.1](https://www.w3.org/TR/presentation-api/#monitoring-incoming-presentation-connections)
says "it MUST listen to and accept incoming connection requests from a
controlling browsing context using an implementation specific
mechanism", the [=receiving user agent=] must receive and process the
presentation-connection-open-request.
When [[PRESENTATION-API|Presentation API]] [section
6.7.1](https://www.w3.org/TR/presentation-api/#monitoring-incoming-presentation-connections)
says "Establish the connection between the controlling and receiving browsing
contexts using an implementation specific mechanism.", the [=receiving user
agent=] must send a presentation-connection-open-response message.
Remote Playback API Protocol {#remote-playback}
-----------------------------------------------
Issue(12): Propose control protocol for Remote
Playback API.
Security and Privacy {#security-privacy}
====================
The Open Screen Protocol allows two networked agents to discover each other
and exchange user and application data. As such, its security and privacy
considerations should be closely examined. We first evaluate the protocol
itself using the W3C [[SECURITY-PRIVACY-QUESTIONNAIRE|Security and Privacy
Questionnaire]]. We then examine whether the security and privacy guidelines
recommended by the [[PRESENTATION-API|Presentation API]] and the
[[REMOTE-PLAYBACK|Remote Playback API]] are met. Finally, we discuss recommended
mitigations that agents can use to meet these security and privacy
requirements.
Threat Models {#threat-models}
--------------------------------
### Passive Network Attackers ### {#passive-network-attackers}
The Open Screen Protocol should assume that all parties that are connected to
the same LAN, either through a wired connection or through WiFi, are able to
observe all data flowing between Open Screen Protocol agents.
These parties will be able to collect any data exposed through unencrypted
messages, such as mDNS records and the QUIC handshakes.
These parties may attempt to learn cryptographic parameters by observing data
flows on the QUIC connection, or by observing cryptographic timing.
### Active Network Attackers ### {#active-network-attackers}
Active attackers, such as compromised routers, will be able to manipulate data
exchanged between agents. They can inject traffic into existing QUIC
connections and attempt to initiate new QUIC connections. These abilities can
be used to attempt the following:
* Impersonate an agent, whether newly discovered or already trusted by the
user, in an attempt to convince the user to authenticate to it.
* Connect to an agent and query its capabilities.
* Connect to and control a presentation or remote playback, or extract data
from the application state of the presentation or remote playback.
One particular attack of concern is misconfigured or compromised routers that
expose local network devices (such as Open Screen Protocol agents) to the
Internet. This vector of attack has been used by malicious parties to take
control of printers and smart TVs by connecting to local network services that
would normally be inaccessible from the Internet.
### Denial of Service ### {#denial-of-service}
Parties connected to the LAN may attempt to deny access to Open Screen
Protocol agents. For example, an attacker may attempt to open
a large number of QUIC connections to an agent in an attempt to block
legitimate connections or exhaust the agent's system resources. They may
also multicast spurious DNS-SD records in an attempt to exhaust the cache
capacity for mDNS listeners, or to get listeners to open a large number of bogus
QUIC connections.
### Same-Origin Policy Violations ### {#same-origin-policy-violations}
The Presentation API allows cross-origin communication between controlling pages
and presentations with the consent of each origin (through their use of the
API). This is similar to cross-origin communication via
{{Window/postMessage()}} with a target origin of `*`. However, the Presentation
API does not convey source origin information with each message. Therefore, the
Open Screen Protocol does not convey origin information between its agents.
The [=presentation ID=] carries some protection against unrestricted
cross-origin access, but rigorous authentication of the parties connected by a
{{PresentationConnection}} must be done at the application level.
Open Screen Protocol Security and Privacy Considerations {#security-privacy-questions}
-----------------------------------
### Personally Identifiable Information & High-Value Data ### {#personally-identifiable-information}
The following data exchanged by the protocol can be personally identifiable
and/or high value data:
1. Presentation URLs and availability results
1. Presentation IDs
1. Presentation connection IDs
1. Presentation connection messages
1. Remote playback URLs
1. Remote playback commands and status messages
Presentation IDs are considered high value data because they can be used in
conjunction with a Presentation URL to connect to a running presentation.
Presentation display friendly names, model names, and capabilities, while not
considered personally identifiable, should still be protected so that an
attacker cannot change them or substitute other values during the discovery
and authentication process.
The following data cannot be reasonably made confidential and should be
considered public and untrusted data:
1. IP addresses and ports used by the Open Screen Protocol.
1. Data advertised through mDNS, including the display name prefix, the
certificate fingerprint, and the metadata version.
### Cross Origin State Considerations ### {#cross-origin-state}
Access to origin state across browsing sessions is possible through the
Presentation API by reconnecting to a presentation that was started by a
previous session. This scenario is addressed in
[[PRESENTATION-API#cross-origin-access]].
Presentation display availability and remote playback device availability are
states that are available cross-origin depending on the user's network
context. Exposure of this data to the Web is also discussed in
[[PRESENTATION-API#personally-identifiable-information]] and
[[REMOTE-PLAYBACK#personally-identifiable-information]].
### Origin Access to Other Devices ### {#origin-access-devices}
By design, the Open Screen Protocol allows access to presentation displays and
remote playback devices from the Web. By implementing the protocol, these
devices are knowingly making themselves available to the Web and should be
designed accordingly.
Below, we discuss mitigation steps to prevent malicious use of these devices.
### Incognito Mode ### {#incognito-mode}
The Open Screen Protocol does not distinguish between the user agent's normal
browsing and incognito modes, and agents that follow the specification
behave identically regardless of which mode is in use.
It's recommended that user agents use separate authentication contexts and QUIC
connections for normal and incognito profiles from the same user agent instance.
This prevents Open Screen agents from correlating activity among profiles
belonging to the same user (both normal and incognito).
### Persistent State ### {#persistent-state}
An agent is likely to persist the identity of agents that have successfully
completed [[#authentication]]. This may include the public key fingerprints,
metadata versions, and metadata for those parties.
However, this data is not normally exposed to the Web; it is visible only
through the native UI of the user agent during the display selection or display
authentication
process. It can be an implementation choice whether the user agent clears or
retains this data when the user clears browsing data.
Issue(132): [Privacy] Fate of metadata / authentication history when clearing
browsing data
### Other Considerations ### {#other-considerations}
The Open Screen Protocol does not grant to the Web additional access to the
following:
* New script loading mechanisms
* Access to the user's location
* Access to device sensors
* Access to the user's local computing environment
* Control over the user agent's native UI
* Security characteristics of the user agent
Presentation API Considerations {#presentation-api-considerations}
-------------------------------
[[PRESENTATION-API#security-and-privacy-considerations]] place these
requirements on the Open Screen Protocol:
1. Presentation URLs and presentation IDs should remain private among the
parties that are allowed to connect to a presentation, per the
cross-origin access guidelines.
1. Controllers and receivers should be notified when connections representing
multiple user agent profiles have been made to a presentation, per the user
interface guidelines.
1. Messaging between controllers and receivers should be authenticated and
confidential, per the guidelines for messaging between presentation
connections.
The Open Screen Protocol addresses these considerations by:
1. Requiring mutual authentication and a TLS-secured QUIC connection before
presentation URLs, IDs, or messages are exchanged.
1. Adding explicit messages and connection IDs for individual
{{PresentationConnection|PresentationConnections}} so that agents can track
the number of active connections.
Remote Playback API Considerations {#remote-playback-considerations}
----------------------------------
The [[REMOTE-PLAYBACK#security-and-privacy-considerations]] also state that
messaging between local and remote playback devices should be authenticated
and confidential.
This consideration is handled by requiring mutual authentication and a
TLS-secured QUIC connection before any remote playback related messages are
exchanged.
Mitigation Strategies {#security-mitigations}
--------------------------------------------
### Local passive network attackers ### {#local-passive-mitigations}
Local passive attackers may attempt to harvest data about user activities and
device capabilities using the Open Screen Protocol. The main strategy to address
this is data minimization, by only exposing opaque public key fingerprints
before user-mediated authentication takes place.
Passive attackers may also attempt timing attacks to learn the
cryptographic parameters of the TLS 1.3 QUIC connection.
Issue(130): [Security] Review attack and mitigation considerations for TLS 1.3
### Local active network attackers ### {#local-active-mitigations}
Local active attackers may attempt to impersonate a presentation display the
user would normally trust. The [[#authentication]] step of the Open Screen
Protocol prevents a man-in-the-middle who lacks knowledge of the shared secret
from impersonating an agent. However, it is possible for an attacker to
impersonate an existing, trusted display or a newly discovered display that is
not yet authenticated and try to convince the user to authenticate it.
This can be addressed through a combination of techniques. The first is flagging:
* Flag an advertised display whose public key fingerprint collides with that
from an already-trusted display that is concurrently being advertised.
* Flag an advertised display whose friendly name differs from the one previously
advertised under a public key fingerprint.
* Flag already-trusted displays whose metadata has changed.
* Flag agents that fail the authentication challenge a certain number of times.
Flagging means that the user is notified, or in some cases they are required to
re-authenticate to the presentation display to verify its identity.
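
The flagging rules could be sketched as follows in Python. The sketch covers the
fingerprint collision, friendly name change, and repeated failure conditions.
The `Advertisement` fields, the reason strings, and the default threshold of
three failures are assumptions made for illustration and are not defined by
this spec.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advertisement:
    """Hypothetical view of one advertised display."""
    fingerprint: str
    friendly_name: str
    address: str


def flags_for(ad, trusted_ads, known_names, failure_count, max_failures=3):
    """Return the reasons, if any, to flag an advertised display.

    trusted_ads: already-trusted displays that are currently being advertised.
    known_names: friendly name last seen for each fingerprint.
    failure_count: failed authentication challenges seen from this agent.
    """
    reasons = []
    # Same fingerprint advertised concurrently from a different address.
    if any(t.fingerprint == ad.fingerprint and t.address != ad.address
           for t in trusted_ads):
        reasons.append("fingerprint-collision")
    # Friendly name differs from the one previously seen for this fingerprint.
    previous = known_names.get(ad.fingerprint)
    if previous is not None and previous != ad.friendly_name:
        reasons.append("name-changed")
    # Too many failed authentication challenges.
    if failure_count >= max_failures:
        reasons.append("repeated-auth-failure")
    return reasons
```

An empty list means that no flag applies. How each reason maps to a user
notification or a forced re-authentication is left to the user agent.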
The second is through management of the shared secret during mutual
authentication:
* Rotate the shared secret to prevent brute force attacks.
* Use an increasing backoff to respond to authentication challenges, also to
prevent brute force attacks.
* Use a cryptographically sound source of entropy to generate the shared secret.
* Require the end user to manually type the shared secret - shown only on the
display - to prevent the user from blindly clicking through this step.
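
A minimal Python sketch of the secret management techniques above follows. The
six-digit numeric format, the one-second base delay, and the 300-second cap are
illustrative assumptions, and the function names are hypothetical.

```python
import secrets


def new_shared_secret(digits=6):
    """Generate a numeric secret from the operating system's CSPRNG.

    Calling this again after a failed or completed attempt rotates the secret.
    """
    return "".join(secrets.choice("0123456789") for _ in range(digits))


def backoff_seconds(failed_attempts, base=1.0, cap=300.0):
    """Delay before answering the next challenge, doubling with each failure."""
    if failed_attempts <= 0:
        return 0.0
    return min(cap, base * 2 ** (failed_attempts - 1))
```

The `secrets` module is used rather than `random` because the latter is not
suitable for generating security-sensitive values.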
The active attacker may also attempt to disrupt data exchanged over the QUIC
connection by injecting or modifying traffic. These attacks should be mitigated
by a correct implementation of TLS 1.3.
Issue(130): [Security] Review attack and mitigation considerations for TLS 1.3
### Remote active network attackers ### {#remote-active-mitigations}
Unfortunately, we cannot rely on network devices to fully protect Open Screen
Protocol agents from traffic from the broader Internet. Open Screen Protocol
agents that are only intended to work on the LAN should filter packets
from non-local IP addresses. Agents can also use the ARP cache to
detect attempts to spoof local network IP addresses.
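
The source address filter could look like the following Python sketch, which
uses the standard `ipaddress` module. The function name is hypothetical, and
the agent is assumed to already know the prefixes of its attached networks.
Obtaining those prefixes is platform specific and is not shown.

```python
import ipaddress


def is_on_local_network(source, local_networks):
    """Return True if the source address falls inside one of the local prefixes.

    source: textual IPv4 or IPv6 address of the packet's sender.
    local_networks: prefixes in CIDR notation, e.g. "192.168.1.0/24".
    """
    addr = ipaddress.ip_address(source)
    # A membership test between an IPv4 address and an IPv6 network (or the
    # reverse) is simply False, so mixed lists are safe.
    return any(addr in ipaddress.ip_network(net) for net in local_networks)
```

A filter like this only rejects packets that carry a non-local source address.
It does not detect spoofed local addresses.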
Issue(131): [Security] Mitigations for remote network attackers
### Denial of service ### {#denial-of-service-mitigations}
It will be difficult to completely prevent denial of service attacks that
originate on the user's local area network. Open Screen Protocol agents can
refuse new connections, rate limit messages from existing
connections, or limit the number of mDNS records cached from a specific
responder in an attempt to allow existing activities to continue in spite of
such an attack.
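
One common way to rate limit messages on a connection is a token bucket, shown
below as a Python sketch. This spec does not prescribe an algorithm. The class
name and parameters are hypothetical, and the caller supplies timestamps (for
example from `time.monotonic()`).

```python
class TokenBucket:
    """Allow bursts of up to `capacity` messages, refilled at `rate` per second."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)  # start with a full bucket
        self.updated = now

    def allow(self, now):
        """Return True if a message arriving at time `now` should be processed."""
        # Refill for the time elapsed since the last call; ignore clock regressions.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = max(self.updated, now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

An agent could keep one bucket per connection and drop, or delay, messages for
which `allow()` returns `False`.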
### Malicious input ### {#malicious-input-mitigations}
Open Screen Protocol agents should be robust against malicious input that
attempts to compromise the target device by exploiting parsing vulnerabilities.
CBOR is intended to be less vulnerable to such attacks relative to alternatives
like JSON and XML. Still, agents should be thoroughly tested using approaches
like [fuzz testing](https://en.wikipedia.org/wiki/Fuzzing).