[10] A [DFN[surrogate pair]] is a combination of two [[16-bit code units]] in [[UTF-16]]
that together represent a single [[Unicode code point]].
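The arithmetic behind this definition can be sketched as follows. This is a minimal Python illustration of the standard UTF-16 mapping, not code from any of the sources cited on this page; the function names `to_surrogate_pair` and `from_surrogate_pair` are hypothetical.

```python
from typing import Tuple


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a supplementary code point (U+10000..U+10FFFF) into (high, low)."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only supplementary code points need a surrogate pair")
    offset = code_point - 0x10000        # 20-bit value
    high = 0xD800 + (offset >> 10)       # top 10 bits -> high surrogate
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits -> low surrogate
    return high, low


def from_surrogate_pair(high: int, low: int) -> int:
    """Combine a high and a low surrogate code unit into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a high/low surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
```

For example, U+1F600 splits into the code units 0xD83D and 0xDE00, and combining those two units gives back U+1F600.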
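[11] The [[code points]] that correspond to the [[16-bit code units]] used in [[surrogate pairs]] (U+D800 through U+DFFF)
are not to be used as characters (regardless of [[encoding scheme]]).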
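As a rough illustration of this rule, the sketch below classifies code points and checks how Python's strict UTF-8 encoder treats the surrogate range. The helper names (`is_surrogate`, `is_scalar_value`, `utf8_rejects`) are hypothetical, and the behaviour shown is that of CPython's built-in codec, not of any particular specification cited here.

```python
def is_surrogate(code_point: int) -> bool:
    """True for code points in the surrogate range U+D800..U+DFFF."""
    return 0xD800 <= code_point <= 0xDFFF


def is_scalar_value(code_point: int) -> bool:
    """True for any Unicode code point that is not a surrogate."""
    return 0 <= code_point <= 0x10FFFF and not is_surrogate(code_point)


def utf8_rejects(code_point: int) -> bool:
    """True if Python's strict UTF-8 encoder refuses this code point."""
    try:
        chr(code_point).encode("utf-8")
    except UnicodeEncodeError:
        return True
    return False
```

Under these assumptions, `utf8_rejects` agrees with `is_surrogate` for every code point: a lone surrogate cannot be written out as UTF-8, while a supplementary code point such as U+1F600 can.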
* Encoding schemes
[12] [[Surrogate pairs]] have been used in the following [[encoding schemes]].
[FIG(short list)[
- [[UTF-16]]
- [[CESU-8]]
- [[UTF-32S]]
]FIG]
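To make the difference between the first two entries concrete, here is a minimal sketch of a CESU-8 encoder in Python. It assumes the usual description of CESU-8 (each UTF-16 code unit, including each half of a surrogate pair, is encoded separately as a UTF-8-style byte sequence); `encode_cesu8` is a hypothetical helper, and Python has no built-in codec of that name.

```python
import struct


def encode_cesu8(text: str) -> bytes:
    """Encode text as CESU-8: UTF-16 code units, each written UTF-8 style."""
    # Strict UTF-16 encoding first; this raises on lone surrogates in the input.
    utf16 = text.encode("utf-16-be")
    units = struct.unpack(f">{len(utf16) // 2}H", utf16)
    # "surrogatepass" lets each surrogate code unit become a 3-byte sequence.
    return b"".join(chr(u).encode("utf-8", "surrogatepass") for u in units)
```

For U+10400, whose surrogate pair is 0xD801 0xDC00, this yields the six bytes ED A0 81 ED B0 80, whereas UTF-8 encodes the same code point as the four bytes F0 90 90 80. For text within the Basic Multilingual Plane the two encodings produce identical bytes.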
* History
[1] [CITE@en[I'm not a Klingon : UTF-16, UTF-8 & UTF-32 update to conform with Unicode 5.0's security concerns.]]
(version of [TIME[2007-07-27 23:26:23 +09:00]])
<http://blogs.msdn.com/shawnste/archive/2007/07/23/utf-16-utf-8-utf-32-update-to-conform-with-unicode-5-0-s-security-concerns.aspx>
[2] [CITE@en[Web Applications 1.0 r7084 Make WebSocket silently convert isolated surrogated to U+FFFD rather than throwing an exception. This will result in data corruption when a user types in astral-plane characters that get truncated by naiive script half-way through, rather than crashing the application.]]
(version of [TIME[2012-05-03 05:06:00 +09:00]])
<http://html5.org/tools/web-apps-tracker?from=7083&to=7084>
[3] [CITE@en[Notifications API: minor change]]
(by [[Anne van Kesteren]], version of [TIME[2012-11-30 00:03:40 +09:00]])
<http://lists.w3.org/Archives/Public/public-web-notification/2012Nov/0010.html>
[4] [CITE[IRC logs: freenode / #whatwg / 20130915]]
(version of [TIME[2013-09-16 22:06:01 +09:00]])
<http://krijnhoetmer.nl/irc-logs/whatwg/20130915#l-241>
[5] [CITE[IRC logs: freenode / #whatwg / 20101111]]
(version of [TIME[2010-11-19 23:02:05 +09:00]])
<http://krijnhoetmer.nl/irc-logs/whatwg/20101111#l-579>
[6] [CITE@en[Web Applications 1.0 r6184 Try to clean up the stuff about Unicode characters.]]
(version of [TIME[2011-06-04 04:40:00 +09:00]])
<http://html5.org/tools/web-apps-tracker?from=6183&to=6184>
[7] [CITE[IRC logs: freenode / #whatwg / 20140329]]
(version of [TIME[2014-03-31 13:06:30 +09:00]])
<http://krijnhoetmer.nl/irc-logs/whatwg/20140329>
[8] [CITE[IRC logs: freenode / #whatwg / 20140516]]
(version of [TIME[2014-05-21 16:29:26 +09:00]])
<http://krijnhoetmer.nl/irc-logs/whatwg/20140516>
[9] [CITE@en['''['''CSSWG''']''' Minutes Seoul F2F 2014-05-19 Part V: Counter Styles, CSS Formatting for Books, Font Load Events, Future F2F Meetings, CSS Syntax - Unpaired Surrogates, MQ Listener]]
(by [[Dael Jackson]], version of [TIME[2014-06-09 08:42:57 +09:00]])
<http://lists.w3.org/Archives/Public/www-style/2014Jun/0060.html>
[13] [CITE@en[Define JavaScript string and scalar value string]]
(by [[annevk]], [TIME[2017-03-27 16:01:49 +09:00]])
<https://github.com/whatwg/infra/commit/f1be763cfba23d2fc780b35403074c599e69616e>
[14] [CITE@en['''['''c''']''' (2) Disallow surrogates in the input stream; make the syntax sect…]]
(by [[Hixie]], [TIME[2013-09-14 06:27:11 +09:00]])
<https://github.com/whatwg/html/commit/6dfaa1a826fae1dd50695710498434d201e543f6>
[15] [CITE@en['''['''''']''' (0) Catch unpaired surrogates before trying to convert them to UTF-8.]]
(by [[Hixie]], [TIME[2009-06-13 07:33:53 +09:00]])
<https://github.com/whatwg/html/commit/53f640d41e2aadfde9cada86d3046d5912ecc818>
[16] [CITE@en['''['''ct''']''' (2) Make surrogates in UTF-8 and character references turn into …]]
(by [[Hixie]], [TIME[2009-09-16 18:22:01 +09:00]])
<https://github.com/whatwg/html/commit/6db21943d024e774d2aa52573981c130767034e9>
[17] [CITE@en['''['''t''']''' (0) Remove the requirement that the parser deal with raw surrogat…]]
(by [[Hixie]], [TIME[2011-02-09 09:29:12 +09:00]])
<https://github.com/whatwg/html/commit/3accfd8a1893d91cb3cdbae62b6d8980e456dda6>
[18] [CITE@en['''['''giow''']''' (0) Fix the UTF-8 decoder error handling to handle a few error…]]
(by [[Hixie]], [TIME[2011-03-04 11:56:49 +09:00]])
<https://github.com/whatwg/html/commit/74e3b6cb761ee8a79b3a1a44d029c128fd0a201f>
[19] [CITE@en['''['''giow''']''' (0) Unpaired surrogates should throw an exception in close, li…]]
(by [[Hixie]], [TIME[2011-06-22 08:01:17 +09:00]])
<https://github.com/whatwg/html/commit/226e15ebd3d557a67bedcfc043e165d24e4182c1>
[20] [CITE@en['''['''giow''']''' (1) Make WebSocket silently convert isolated surrogated to U+F…]]
(by [[Hixie]], [TIME[2012-05-03 05:06:23 +09:00]])
<https://github.com/whatwg/html/commit/a817b04f4c262645ef996a5176b4a3f0a3a11928>
[21] [CITE@en['''['''c''']''' (2) Disallow surrogates in the input stream; make the syntax sect…]]
(by [[Hixie]], [TIME[2013-09-14 06:27:11 +09:00]])
<https://github.com/whatwg/html/commit/6dfaa1a826fae1dd50695710498434d201e543f6>
[22] [CITE@en['''['''cssom''']''' Add IDL `CSSOMString`, typedef of either USVString or DOMString]]
(by [[SimonSapin]], [TIME[2017-04-21 12:28:09 +09:00]])
<https://github.com/w3c/csswg-drafts/commit/830ae19ffd9a6fa6eb60aa21549d334cb18fb706>
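[23] Supporters of [[surrogate pairs]] (that is, of [[UTF-16]]) have a stock reply to critics:
"Handling [[surrogate pairs]] is easier than handling [[combining characters]],
and since [[combining characters]] have to be handled anyway, it is not much of a problem."
This has been the standard response for well over a decade now (lol).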
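[24] Mixing the [[character encoding]] layer with the [[character]] processing layer strikes me as a 20th-century software development technique, though.
(It is much the same as saying that [[Shift JIS]] makes display processing easy because [[halfwidth]] and [[fullwidth]] line up with the [[byte count]].)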
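The two layers in this argument can be seen side by side in a small sketch. The function names below are hypothetical, and `base_characters` is only a crude stand-in for real text segmentation: it counts code points whose canonical combining class is 0, which is an assumption made for illustration and not a full grapheme-cluster algorithm.

```python
import unicodedata


def utf16_code_units(text: str) -> int:
    """Encoding layer: number of 16-bit code units in the UTF-16 form."""
    return len(text.encode("utf-16-le")) // 2


def code_points(text: str) -> int:
    """Number of Unicode code points (a Python str is indexed by code point)."""
    return len(text)


def base_characters(text: str) -> int:
    """Character layer, roughly: code points that are not combining marks."""
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)
```

For U+10400 the counts are 2 code units, 1 code point and 1 base character: the surrogate pair disappears as soon as the text is decoded. For "e" followed by U+0301 COMBINING ACUTE ACCENT the counts are 2 code units, 2 code points and 1 base character: the combining sequence is still there after decoding, so it has to be dealt with at the character layer whatever the encoding.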
[25] [CITE@en['''['''css-syntax''']''' Remove 'code point' and 'surrogate code point' in favor …]]
(by [[tabatkins]], [TIME[2017-06-10 03:59:51 +09:00]])
<https://github.com/w3c/csswg-drafts/commit/320a990184a331057a56a17cdf627fee81bdc5d3>