Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non-normatively visualize the indexes #89

Merged
merged 12 commits into from
Jan 20, 2017
6 changes: 6 additions & 0 deletions deploy.sh
Original file line number Diff line number Diff line change
Expand Up @@ -54,6 +54,8 @@ curl https://api.csswg.org/bikeshed/ -f -F file=@$INPUT_FILE -F md-status=LS-COM
> $COMMIT_DIR/index.html;
cp *.txt $COMMIT_DIR/;
cp *.json $COMMIT_DIR/;
cp *.css $COMMIT_DIR/;
python visualize.py $COMMIT_DIR/;
echo "Commit snapshot output to $WEB_ROOT/$COMMITS_DIR/$SHA"
echo ""

Expand All @@ -68,6 +70,8 @@ if [ $BRANCH != "master" ] ; then
> $BRANCH_DIR/index.html;
cp *.txt $BRANCH_DIR/;
cp *.json $BRANCH_DIR/;
cp *.css $BRANCH_DIR/;
python visualize.py $BRANCH_DIR/;
echo "Branch snapshot output to $WEB_ROOT/$BRANCHES_DIR/$BRANCH"
else
# Living standard, if master
Expand All @@ -76,6 +80,8 @@ else
> $WEB_ROOT/index.html
cp *.txt $WEB_ROOT/;
cp *.json $WEB_ROOT/;
cp *.css $WEB_ROOT/;
python visualize.py $WEB_ROOT/;
echo "Living standard output to $WEB_ROOT"
fi

Expand Down
111 changes: 77 additions & 34 deletions encoding.bs
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@ Markup Shorthands: css off
Translate IDs: dictdef-textdecoderoptions textdecoderoptions,dictdef-textdecodeoptions textdecodeoptions
</pre>

<link rel=stylesheet href=visualization-colors.css>
<script src=https://resources.whatwg.org/file-issue.js async></script>
<script src=https://resources.whatwg.org/commit-snapshot-shortcut-key.js async></script>
<script src=https://resources.whatwg.org/dfn.js defer></script>
Expand Down Expand Up @@ -653,33 +654,72 @@ changed, so has the <a>index</a>.
<var>code point</var> in <var>index</var>, or null if
<var>code point</var> is not in <var>index</var>.

<div class=note id=visualization>
<p>There is a non-normative visualization for each index other than <a>index gb18030 ranges</a>.
<a>index jis0208</a> also has an alternative <a>Shift_JIS</a> visualization. Additionally, there is
visualization of the Basic Multilingual Plane coverage of each index other than
<a>index gb18030 ranges</a>.

<p>The legend for the visualizations is:

<ul class=visualizationlegend>
<li class=unmapped>Unmapped
<li class=mid>Two bytes in UTF-8
<li class="mid contiguous">Two bytes in UTF-8, code point follows immediately the code point of
previous pointer
<li class=upper>Three bytes in UTF-8 (non-PUA)
<li class="upper contiguous">Three bytes in UTF-8 (non-PUA), code point follows immediately the
code point of previous pointer
<li class=pua>Private Use
<li class="pua contiguous">Private Use, code point follows immediately the code point of previous
pointer
<li class=astral>Four bytes in UTF-8
<li class="astral contiguous">Four bytes in UTF-8, code point follows immediately the code point
of previous pointer
<li class=duplicate>Duplicate code point already mapped at an earlier index
<li class=compatibility>CJK Compatibility Ideograph
<li class=ext>CJK Unified Ideographs Extension A
</ul>
</div>

<p>These are the <a lt=index>indexes</a> defined by this
specification, excluding <a>index single-byte</a>, which have their own table:

<table>
<tbody><tr><th colspan=2><a>Index</a><th>Notes
<tbody><tr><th colspan=4><a>Index</a><th>Notes
<tr>
<td><dfn export>index Big5</dfn>
<td><a href=index-big5.txt>index-big5.txt</a>
<td><a href=big5.html>index Big5 visualization</a>
<td><a href=big5-bmp.html>index Big5 BMP coverage</a>
<td>This matches the Big5 standard in combination with the
Hong Kong Supplementary Character Set and other common extensions.
<tr>
<td><dfn export>index EUC-KR</dfn>
<td><a href=index-euc-kr.txt>index-euc-kr.txt</a>
<td>This matches the KS X 1001 standard and the Unified Hangul Code, more
commonly known together as Windows Codepage 949.
<td><a href=euc-kr.html>index EUC-KR visualization</a>
<td><a href=euc-kr-bmp.html>index EUC-KR BMP coverage</a>
<td>This matches the KS X 1001 standard and the Unified Hangul Code, more commonly known together
as Windows Codepage 949. It covers the Hangul Syllables block of Unicode in its entirety. The
Hangul block whose top left corner in the visualization is at pointer 9026 is in the Unicode
order. Taken separately, the rest of the Hangul syllables in this index are in the Unicode order,
too.
<tr>
<td><dfn export>index gb18030</dfn>
<td><a href=index-gb18030.txt>index-gb18030.txt</a>
<td><a href=gb18030.html>index gb18030 visualization</a>
<td><a href=gb18030-bmp.html>index gb18030 BMP coverage</a>
<td>This matches the GB18030-2005 standard for code points encoded as two bytes, except for
0xA3 0xA0 which maps to U+3000 to be compatible with deployed content.
0xA3 0xA0 which maps to U+3000 to be compatible with deployed content. This index covers the
CJK Unified Ideographs block of Unicode in its entirety. Entries from that block that are above or
to the left of (the first) U+3000 in the visualization are in the Unicode order.
<!-- https://bugzilla.mozilla.org/show_bug.cgi?id=131837
https://bugs.webkit.org/show_bug.cgi?id=17014
https://www.w3.org/Bugs/Public/show_bug.cgi?id=25396
https://github.com/whatwg/encoding/issues/17 -->
<tr>
<td><dfn export>index gb18030 ranges</dfn>
<td><a href=index-gb18030-ranges.txt>index-gb18030-ranges.txt</a>
<td colspan=3><a href=index-gb18030-ranges.txt>index-gb18030-ranges.txt</a>
<td>This <a>index</a> works different from all others. Listing all code points would result
in over a million items whereas they can be represented neatly in 207 ranges combined with trivial
limit checks. It therefore only superficially matches the GB18030-2005 standard for code points
Expand All @@ -688,12 +728,16 @@ specification, excluding <a>index single-byte</a>, which have their own table:
<tr>
<td><dfn export>index jis0208</dfn>
<td><a href=index-jis0208.txt>index-jis0208.txt</a>
<td><a href=jis0208.html>index jis0208 visualization</a>, <a href=shift_jis.html>Shift_JIS visualization</a>
<td><a href=jis0208-bmp.html>index jis0208 BMP coverage</a>
<td>This is the JIS X 0208 standard including formerly proprietary
extensions from IBM and NEC.
<!-- NEC = Nippon Electronics Corporation -->
<tr>
<td><dfn export>index jis0212</dfn>
<td><a href=index-jis0212.txt>index-jis0212.txt</a>
<td><a href=jis0212.html>index jis0212 visualization</a>
<td><a href=jis0212-bmp.html>index jis0212 BMP coverage</a>
<td>This is the JIS X 0212 standard. It is only used by the <a>EUC-JP decoder</a>
due to lack of widespread support elsewhere.
<!--
Expand Down Expand Up @@ -1442,36 +1486,35 @@ depends on the <a>single-byte encoding</a> in use. All but two
unique <a>index</a>.

<table>
<tbody><tr><th><a>Name</a><th><a>Index</a>
<tr><td><dfn export>IBM866</dfn><td><a href=index-ibm866.txt>index-ibm866.txt</a>
<tr><td><dfn export>ISO-8859-2</dfn><td><a href=index-iso-8859-2.txt>index-iso-8859-2.txt</a>
<tr><td><dfn export>ISO-8859-3</dfn><td><a href=index-iso-8859-3.txt>index-iso-8859-3.txt</a>
<tr><td><dfn export>ISO-8859-4</dfn><td><a href=index-iso-8859-4.txt>index-iso-8859-4.txt</a>
<tr><td><dfn export>ISO-8859-5</dfn><td><a href=index-iso-8859-5.txt>index-iso-8859-5.txt</a>
<tr><td><dfn export>ISO-8859-6</dfn><td><a href=index-iso-8859-6.txt>index-iso-8859-6.txt</a>
<tr><td><dfn export>ISO-8859-7</dfn><td><a href=index-iso-8859-7.txt>index-iso-8859-7.txt</a>
<tr><td><dfn export>ISO-8859-8</dfn><td rowspan=2><a href=index-iso-8859-8.txt>index-iso-8859-8.txt</a>
<tr><td><dfn export>IBM866</dfn><td><a href=index-ibm866.txt>index-ibm866.txt</a><td><a href=ibm866.html>index IBM866 visualization</a><td><a href=ibm866-bmp.html>index IBM866 BMP coverage</a>
<tr><td><dfn export>ISO-8859-2</dfn><td><a href=index-iso-8859-2.txt>index-iso-8859-2.txt</a><td><a href=iso-8859-2.html>index ISO-8859-2 visualization</a><td><a href=iso-8859-2-bmp.html>index ISO-8859-2 BMP coverage</a>
<tr><td><dfn export>ISO-8859-3</dfn><td><a href=index-iso-8859-3.txt>index-iso-8859-3.txt</a><td><a href=iso-8859-3.html>index ISO-8859-3 visualization</a><td><a href=iso-8859-3-bmp.html>index ISO-8859-3 BMP coverage</a>
<tr><td><dfn export>ISO-8859-4</dfn><td><a href=index-iso-8859-4.txt>index-iso-8859-4.txt</a><td><a href=iso-8859-4.html>index ISO-8859-4 visualization</a><td><a href=iso-8859-4-bmp.html>index ISO-8859-4 BMP coverage</a>
<tr><td><dfn export>ISO-8859-5</dfn><td><a href=index-iso-8859-5.txt>index-iso-8859-5.txt</a><td><a href=iso-8859-5.html>index ISO-8859-5 visualization</a><td><a href=iso-8859-5-bmp.html>index ISO-8859-5 BMP coverage</a>
<tr><td><dfn export>ISO-8859-6</dfn><td><a href=index-iso-8859-6.txt>index-iso-8859-6.txt</a><td><a href=iso-8859-6.html>index ISO-8859-6 visualization</a><td><a href=iso-8859-6-bmp.html>index ISO-8859-6 BMP coverage</a>
<tr><td><dfn export>ISO-8859-7</dfn><td><a href=index-iso-8859-7.txt>index-iso-8859-7.txt</a><td><a href=iso-8859-7.html>index ISO-8859-7 visualization</a><td><a href=iso-8859-7-bmp.html>index ISO-8859-7 BMP coverage</a>
<tr><td><dfn export>ISO-8859-8</dfn><td rowspan=2><a href=index-iso-8859-8.txt>index-iso-8859-8.txt</a><td><a href=iso-8859-8.html>index ISO-8859-8 visualization</a><td><a href=iso-8859-8-bmp.html>index ISO-8859-8 BMP coverage</a>
<tr><td><dfn export>ISO-8859-8-I</dfn>
<tr><td><dfn export>ISO-8859-10</dfn><td><a href=index-iso-8859-10.txt>index-iso-8859-10.txt</a>
<tr><td><dfn export>ISO-8859-13</dfn><td><a href=index-iso-8859-13.txt>index-iso-8859-13.txt</a>
<tr><td><dfn export>ISO-8859-14</dfn><td><a href=index-iso-8859-14.txt>index-iso-8859-14.txt</a>
<tr><td><dfn export>ISO-8859-15</dfn><td><a href=index-iso-8859-15.txt>index-iso-8859-15.txt</a>
<tr><td><dfn export>ISO-8859-16</dfn><td><a href=index-iso-8859-16.txt>index-iso-8859-16.txt</a>
<tr><td><dfn export>KOI8-R</dfn><td><a href=index-koi8-r.txt>index-koi8-r.txt</a>
<tr><td><dfn export>KOI8-U</dfn><td><a href=index-koi8-u.txt>index-koi8-u.txt</a>
<tr><td><dfn export>macintosh</dfn><td><a href=index-macintosh.txt>index-macintosh.txt</a>
<tr><td><dfn export>windows-874</dfn><td><a href=index-windows-874.txt>index-windows-874.txt</a>
<tr><td><dfn export>windows-1250</dfn><td><a href=index-windows-1250.txt>index-windows-1250.txt</a>
<tr><td><dfn export>windows-1251</dfn><td><a href=index-windows-1251.txt>index-windows-1251.txt</a>
<tr><td><dfn export>windows-1252</dfn><td><a href=index-windows-1252.txt>index-windows-1252.txt</a>
<tr><td><dfn export>windows-1253</dfn><td><a href=index-windows-1253.txt>index-windows-1253.txt</a>
<tr><td><dfn export>windows-1254</dfn><td><a href=index-windows-1254.txt>index-windows-1254.txt</a>
<tr><td><dfn export>windows-1255</dfn><td><a href=index-windows-1255.txt>index-windows-1255.txt</a>
<tr><td><dfn export>windows-1256</dfn><td><a href=index-windows-1256.txt>index-windows-1256.txt</a>
<tr><td><dfn export>windows-1257</dfn><td><a href=index-windows-1257.txt>index-windows-1257.txt</a>
<tr><td><dfn export>windows-1258</dfn><td><a href=index-windows-1258.txt>index-windows-1258.txt</a>
<tr><td><dfn export>x-mac-cyrillic</dfn><td><a href=index-x-mac-cyrillic.txt>index-x-mac-cyrillic.txt</a>
</table>
<tr><td><dfn export>ISO-8859-10</dfn><td><a href=index-iso-8859-10.txt>index-iso-8859-10.txt</a><td><a href=iso-8859-10.html>index ISO-8859-10 visualization</a><td><a href=iso-8859-10-bmp.html>index ISO-8859-10 BMP coverage</a>
<tr><td><dfn export>ISO-8859-13</dfn><td><a href=index-iso-8859-13.txt>index-iso-8859-13.txt</a><td><a href=iso-8859-13.html>index ISO-8859-13 visualization</a><td><a href=iso-8859-13-bmp.html>index ISO-8859-13 BMP coverage</a>
<tr><td><dfn export>ISO-8859-14</dfn><td><a href=index-iso-8859-14.txt>index-iso-8859-14.txt</a><td><a href=iso-8859-14.html>index ISO-8859-14 visualization</a><td><a href=iso-8859-14-bmp.html>index ISO-8859-14 BMP coverage</a>
<tr><td><dfn export>ISO-8859-15</dfn><td><a href=index-iso-8859-15.txt>index-iso-8859-15.txt</a><td><a href=iso-8859-15.html>index ISO-8859-15 visualization</a><td><a href=iso-8859-15-bmp.html>index ISO-8859-15 BMP coverage</a>
<tr><td><dfn export>ISO-8859-16</dfn><td><a href=index-iso-8859-16.txt>index-iso-8859-16.txt</a><td><a href=iso-8859-16.html>index ISO-8859-16 visualization</a><td><a href=iso-8859-16-bmp.html>index ISO-8859-16 BMP coverage</a>
<tr><td><dfn export>KOI8-R</dfn><td><a href=index-koi8-r.txt>index-koi8-r.txt</a><td><a href=koi8-r.html>index KOI8-R visualization</a><td><a href=koi8-r-bmp.html>index KOI8-R BMP coverage</a>
<tr><td><dfn export>KOI8-U</dfn><td><a href=index-koi8-u.txt>index-koi8-u.txt</a><td><a href=koi8-u.html>index KOI8-U visualization</a><td><a href=koi8-u-bmp.html>index KOI8-U BMP coverage</a>
<tr><td><dfn export>macintosh</dfn><td><a href=index-macintosh.txt>index-macintosh.txt</a><td><a href=macintosh.html>index macintosh visualization</a><td><a href=macintosh-bmp.html>index macintosh BMP coverage</a>
<tr><td><dfn export>windows-874</dfn><td><a href=index-windows-874.txt>index-windows-874.txt</a><td><a href=windows-874.html>index windows-874 visualization</a><td><a href=windows-874-bmp.html>index windows-874 BMP coverage</a>
<tr><td><dfn export>windows-1250</dfn><td><a href=index-windows-1250.txt>index-windows-1250.txt</a><td><a href=windows-1250.html>index windows-1250 visualization</a><td><a href=windows-1250-bmp.html>index windows-1250 BMP coverage</a>
<tr><td><dfn export>windows-1251</dfn><td><a href=index-windows-1251.txt>index-windows-1251.txt</a><td><a href=windows-1251.html>index windows-1251 visualization</a><td><a href=windows-1251-bmp.html>index windows-1251 BMP coverage</a>
<tr><td><dfn export>windows-1252</dfn><td><a href=index-windows-1252.txt>index-windows-1252.txt</a><td><a href=windows-1252.html>index windows-1252 visualization</a><td><a href=windows-1252-bmp.html>index windows-1252 BMP coverage</a>
<tr><td><dfn export>windows-1253</dfn><td><a href=index-windows-1253.txt>index-windows-1253.txt</a><td><a href=windows-1253.html>index windows-1253 visualization</a><td><a href=windows-1253-bmp.html>index windows-1253 BMP coverage</a>
<tr><td><dfn export>windows-1254</dfn><td><a href=index-windows-1254.txt>index-windows-1254.txt</a><td><a href=windows-1254.html>index windows-1254 visualization</a><td><a href=windows-1254-bmp.html>index windows-1254 BMP coverage</a>
<tr><td><dfn export>windows-1255</dfn><td><a href=index-windows-1255.txt>index-windows-1255.txt</a><td><a href=windows-1255.html>index windows-1255 visualization</a><td><a href=windows-1255-bmp.html>index windows-1255 BMP coverage</a>
<tr><td><dfn export>windows-1256</dfn><td><a href=index-windows-1256.txt>index-windows-1256.txt</a><td><a href=windows-1256.html>index windows-1256 visualization</a><td><a href=windows-1256-bmp.html>index windows-1256 BMP coverage</a>
<tr><td><dfn export>windows-1257</dfn><td><a href=index-windows-1257.txt>index-windows-1257.txt</a><td><a href=windows-1257.html>index windows-1257 visualization</a><td><a href=windows-1257-bmp.html>index windows-1257 BMP coverage</a>
<tr><td><dfn export>windows-1258</dfn><td><a href=index-windows-1258.txt>index-windows-1258.txt</a><td><a href=windows-1258.html>index windows-1258 visualization</a><td><a href=windows-1258-bmp.html>index windows-1258 BMP coverage</a>
<tr><td><dfn export>x-mac-cyrillic</dfn><td><a href=index-x-mac-cyrillic.txt>index-x-mac-cyrillic.txt</a><td><a href=x-mac-cyrillic.html>index x-mac-cyrillic visualization</a><td><a href=x-mac-cyrillic-bmp.html>index x-mac-cyrillic BMP coverage</a>
</table>

<p class=note><a>ISO-8859-8</a> and <a>ISO-8859-8-I</a> are
distinct <a for=/>encoding</a> <a for=encoding>names</a>, because
Expand Down
91 changes: 91 additions & 0 deletions visualization-colors.css
Original file line number Diff line number Diff line change
@@ -0,0 +1,91 @@
/* Any copyright is dedicated to the Public Domain.
* http://creativecommons.org/publicdomain/zero/1.0/ */

/*
* Color scheme from http://mkweb.bcgsc.ca/biovis2012/
* for color blindness a11y.
*/

.visualizationlegend li {
padding: 0.3em 0.5em;
}

.visualizationlegend {
padding-bottom: 1em;
padding-right: 4.5em;
}

.unmapped {
background-color: #920000;
color: white;
}

.astral {
background-color: #6db6fe;
color: black;
}

.mid {
background-color: #ffff6d;
color: black;
}

.upper, .byte {
background-color: #d4d5db;
color: black;
}

.pua {
background-color: #db6d00;
color: black;
}

.mid.contiguous {
background-color: #009292;
color: black;
}

.upper.contiguous {
background-color: #24fe23;
color: black;
}

.pua.contiguous {
background-color: #924900;
color: white;
}

.astral.contiguous {
background-color: #480091;
color: white;
}

.duplicate {
border-style: solid ;
border-color: #920000;
color: black;
}

.compatibility {
border-style: dashed;
border-color: black;
}

.duplicate.compatibility {
border-style: dashed;
border-color: #920000;
}

.ext {
border-style: dotted;
border-color: black;
}

.ext.compatibility {
border-style: dotted;
border-color: #920000;
}

.surrogate {
background-color: #b4b4b6;
}
92 changes: 92 additions & 0 deletions visualization.css
Original file line number Diff line number Diff line change
@@ -0,0 +1,92 @@
@import 'https://resources.whatwg.org/fonts/faces.css';
@import 'visualization-colors.css';

/* Any copyright is dedicated to the Public Domain.
* http://creativecommons.org/publicdomain/zero/1.0/ */


/* Common styling from standard.css */
h1 {
color: #3c790a; /* WHATWG Green */
}

@media screen {
:link, :visited { text-decoration: none; }
:link:hover, :visited:hover, :link:focus, :visited:focus { text-decoration: underline; }
:link { color: #00C; }
:visited { color: #609; }
:link:active, :visited:active { color: #C00; }
}

.note { position: relative; color: green; background: #DDFFDD; font-style: italic; margin-left: 2em; padding-left: 2em; }
.note::before { content: 'Note'; background: green; color: white; padding: 0.15em 0.25em; font-style: normal; position: absolute; top: -0.2em; left: -1.5em; transform: rotate(-5deg); }
/* End copypasta from standard.css */

html {
font-family: 'Source Sans Pro', 'Noto Naskh Arabic', 'Noto Sans Hebrew', 'Noto Sans Thai', sans-serif;
}

th, h1, {
font-weight: 700;
}

th {
text-align: right;
border: 2px solid transparent;
}

td {
text-align: center;
border: 2px solid transparent;
padding: 0;
}

thead th {
text-align: center;
}

li {
border: 2px solid transparent;
}

dl, dd, dt {
margin: 0;
padding: 0;
border: 0;
}
dt, dd + dd {
font-weight: 300;
font-size: 9px;
letter-spacing: .15em;
}

.astral dd {
letter-spacing: normal;
}

dd {
line-height: 1.1;
font-size: 18px;
}

table {
table-layout: fixed;
border-spacing: 1px;
border-collapse: separate;
}

:lang(ja) {
font-family: 'Source Han Sans JP';
}

:lang(ko) {
font-family: 'Source Han Sans KR';
}

:lang(zh-cn) {
font-family: 'Source Han Sans CN';
}

:lang(zh-tw) {
font-family: 'Source Han Sans TW';
}