Non-normatively visualize the indexes #89

Merged 12 commits on Jan 20, 2017
4 changes: 4 additions & 0 deletions deploy.sh
@@ -68,6 +68,8 @@ if [ $BRANCH != "master" ] ; then
> $BRANCH_DIR/index.html;
cp *.txt $BRANCH_DIR/;
cp *.json $BRANCH_DIR/;
cp *.css $BRANCH_DIR/;
python visualize.py $BRANCH_DIR/
Member
Was it intentional to skip commit snapshots?

Member Author
No. I just didn't understand the purpose of that directory correctly.

echo "Branch snapshot output to $WEB_ROOT/$BRANCHES_DIR/$BRANCH"
else
# Living standard, if master
@@ -76,6 +78,8 @@ else
> $WEB_ROOT/index.html
cp *.txt $WEB_ROOT/;
cp *.json $WEB_ROOT/;
cp *.css $WEB_ROOT/;
python visualize.py $WEB_ROOT/
echo "Living standard output to $WEB_ROOT"
fi
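
visualize.py itself is not shown in this excerpt. As a rough sketch of the first step such a script would need, the following reads the text of an index file into a pointer-to-code-point mapping. The line format (pointer, tab, `0x`-prefixed code point, optional tab and description, `#` for comments) and the name `parse_index` are assumptions made for illustration, not taken from this PR.

```python
def parse_index(text):
    """Parse index-file text into {pointer: code_point}.

    Assumed line format (not confirmed by this PR):
    pointer <TAB> 0xHEX [<TAB> description], with '#' comment lines.
    """
    index = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        pointer = int(fields[0])
        code_point = int(fields[1], 16)  # int() accepts the 0x prefix in base 16
        index[pointer] = code_point
    return index


sample = "# hypothetical sample\n\n  0\t0x0080\tfirst\n  1\t0x0081\n"
print(parse_index(sample))
```

Pointers that have no line in the file are simply absent from the mapping, which corresponds to the "Unmapped" entry in the legend.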

101 changes: 69 additions & 32 deletions encoding.bs
@@ -18,6 +18,7 @@ Markup Shorthands: css off
Translate IDs: dictdef-textdecoderoptions textdecoderoptions,dictdef-textdecodeoptions textdecodeoptions
</pre>

<style>@import 'visualization-colors.css';</style>
Member
According to Tab we can use <link> and Bikeshed will move it correctly in the output.

Member Author
OK.

<script src=https://resources.whatwg.org/file-issue.js async></script>
<script src=https://resources.whatwg.org/commit-snapshot-shortcut-key.js async></script>
<script src=https://resources.whatwg.org/dfn.js defer></script>
@@ -653,33 +654,66 @@ changed, so has the <a>index</a>.
<var>code point</var> in <var>index</var>, or null if
<var>code point</var> is not in <var>index</var>.

<div class=note id=visualization>
<p>There is a non-normative visualization for each index other than index gb18030 ranges.
Member
Since this is inside a <div> it needs to be indented by one.

Member
We should also xref the indexes here I think.

Member Author
OK

index jis0208 also has an alternative Shift_JIS visualization.
Additionally, there is visualization of the Basic Multilingual Plane coverage of each index other than index gb18030 ranges.

<p>The legend for the visualizations is:
<ul class=visualizationlegend>
<li class=unmapped>Unmapped
Member
These have too much indentation. <div> + <ul> means two spaces.

Member Author
OK

<li class=mid>Two bytes in UTF-8
<li class="mid contiguous">Two bytes in UTF-8, code point follows immediately the code point of previous pointer
<li class=upper>Three bytes in UTF-8 (non-PUA)
<li class="upper contiguous">Three bytes in UTF-8 (non-PUA), code point follows immediately the code point of previous pointer
<li class=pua>Private Use
<li class="pua contiguous">Private Use, code point follows immediately the code point of previous pointer
<li class=astral>Four bytes in UTF-8
<li class="astral contiguous">Four bytes in UTF-8, code point follows immediately the code point of previous pointer
<li class=duplicate>Duplicate code point already mapped at an earlier index
<li class=compatibility>CJK Compatibility Ideograph
<li class=ext>CJK Unified Ideographs Extension A
</ul>
</div>

<p>These are the <a lt=index>indexes</a> defined by this
specification, excluding <a>index single-byte</a>, which have their own table:

<table>
<tbody><tr><th colspan=2><a>Index</a><th>Notes
<tbody><tr><th colspan=4><a>Index</a><th>Notes
<tr>
<td><dfn export>index Big5</dfn>
<td><a href=index-big5.txt>index-big5.txt</a>
<td><a href=big5.html>Visualization</a>
Member
We should add a title here saying "index Big5 visualization". Even better would be to make each link text unique somehow since that affects screen readers.

Member Author
I did the latter.

<td><a href=big5-bmp.html>BMP coverage</a>
<td>This matches the Big5 standard in combination with the
Hong Kong Supplementary Character Set and other common extensions.
<tr>
<td><dfn export>index EUC-KR</dfn>
<td><a href=index-euc-kr.txt>index-euc-kr.txt</a>
<td><a href=euc-kr.html>Visualization</a>
<td><a href=euc-kr-bmp.html>BMP coverage</a>
<td>This matches the KS X 1001 standard and the Unified Hangul Code, more
commonly known together as Windows Codepage 949.
commonly known together as Windows Codepage 949. This index covers the Hangul Syllables
Member
s/This index/It/

Member Author
OK.

block of Unicode in its entirety. The Hangul block whose top left corner in the visualization
is at pointer 9026 is in the Unicode order. Taken separately, the rest of the Hangul syllables
in this index are in the Unicode order, too.
<tr>
<td><dfn export>index gb18030</dfn>
<td><a href=index-gb18030.txt>index-gb18030.txt</a>
<td><a href=gb18030.html>Visualization</a>
<td><a href=gb18030-bmp.html>BMP coverage</a>
<td>This matches the GB18030-2005 standard for code points encoded as two bytes, except for
0xA3 0xA0 which maps to U+3000 to be compatible with deployed content.
0xA3 0xA0 which maps to U+3000 to be compatible with deployed content. This index covers the
CJK Unified Ideographs block of Unicode in its entirety. Entries from that block that are
above or to the left of (the first) U+3000 in the visualization are in the Unicode order.
<!-- https://bugzilla.mozilla.org/show_bug.cgi?id=131837
https://bugs.webkit.org/show_bug.cgi?id=17014
https://www.w3.org/Bugs/Public/show_bug.cgi?id=25396
https://github.com/whatwg/encoding/issues/17 -->
<tr>
<td><dfn export>index gb18030 ranges</dfn>
<td><a href=index-gb18030-ranges.txt>index-gb18030-ranges.txt</a>
<td colspan=3><a href=index-gb18030-ranges.txt>index-gb18030-ranges.txt</a>
<td>This <a>index</a> works different from all others. Listing all code points would result
in over a million items whereas they can be represented neatly in 207 ranges combined with trivial
limit checks. It therefore only superficially matches the GB18030-2005 standard for code points
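
The range representation described in this note can be sketched as follows: store only the first pointer and first code point of each range, find the last range whose start is at or below the pointer, and add the offset. The sample ranges and the name `ranges_code_point` are illustrative assumptions, and the limit checks the note mentions are deliberately left out.

```python
import bisect

# Illustrative (first pointer, first code point) pairs.
# NOT copied from index-gb18030-ranges.txt.
SAMPLE_RANGES = [(0, 0x0080), (36, 0x00A5), (38, 0x00A9)]


def ranges_code_point(pointer, ranges=SAMPLE_RANGES):
    """Map a pointer to a code point using sorted range starts.

    The limit checks the note calls "trivial" are omitted here.
    """
    starts = [start for start, _ in ranges]
    # Index of the last range whose start is <= pointer.
    i = bisect.bisect_right(starts, pointer) - 1
    if i < 0:
        return None  # pointer lies before the first range
    start, code_point = ranges[i]
    return code_point + (pointer - start)


print(hex(ranges_code_point(37)))
```

With this layout a lookup is a binary search over a couple of hundred entries rather than a table with one entry per code point.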
@@ -688,12 +722,16 @@ specification, excluding <a>index single-byte</a>, which have their own table:
<tr>
<td><dfn export>index jis0208</dfn>
<td><a href=index-jis0208.txt>index-jis0208.txt</a>
<td><a href=jis0208.html>Visualization</a>, <a href=shift_jis.html>Shift_JIS visualization</a>
<td><a href=jis0208-bmp.html>BMP coverage</a>
<td>This is the JIS X 0208 standard including formerly proprietary
extensions from IBM and NEC.
<!-- NEC = Nippon Electronics Corporation -->
<tr>
<td><dfn export>index jis0212</dfn>
<td><a href=index-jis0212.txt>index-jis0212.txt</a>
<td><a href=jis0212.html>Visualization</a>
<td><a href=jis0212-bmp.html>BMP coverage</a>
<td>This is the JIS X 0212 standard. It is only used by the <a>EUC-JP decoder</a>
due to lack of widespread support elsewhere.
<!--
@@ -1442,35 +1480,34 @@ depends on the <a>single-byte encoding</a> in use. All but two
unique <a>index</a>.

<table>
<tbody><tr><th><a>Name</a><th><a>Index</a>
<tr><td><dfn export>IBM866</dfn><td><a href=index-ibm866.txt>index-ibm866.txt</a>
<tr><td><dfn export>ISO-8859-2</dfn><td><a href=index-iso-8859-2.txt>index-iso-8859-2.txt</a>
<tr><td><dfn export>ISO-8859-3</dfn><td><a href=index-iso-8859-3.txt>index-iso-8859-3.txt</a>
<tr><td><dfn export>ISO-8859-4</dfn><td><a href=index-iso-8859-4.txt>index-iso-8859-4.txt</a>
<tr><td><dfn export>ISO-8859-5</dfn><td><a href=index-iso-8859-5.txt>index-iso-8859-5.txt</a>
<tr><td><dfn export>ISO-8859-6</dfn><td><a href=index-iso-8859-6.txt>index-iso-8859-6.txt</a>
<tr><td><dfn export>ISO-8859-7</dfn><td><a href=index-iso-8859-7.txt>index-iso-8859-7.txt</a>
<tr><td><dfn export>ISO-8859-8</dfn><td rowspan=2><a href=index-iso-8859-8.txt>index-iso-8859-8.txt</a>
<tr><td><dfn export>IBM866</dfn><td><a href=index-ibm866.txt>index-ibm866.txt</a><td><a href=ibm866.html>Visualization</a><td><a href=ibm866-bmp.html>BMP coverage</a>
<tr><td><dfn export>ISO-8859-2</dfn><td><a href=index-iso-8859-2.txt>index-iso-8859-2.txt</a><td><a href=iso-8859-2.html>Visualization</a><td><a href=iso-8859-2-bmp.html>BMP coverage</a>
<tr><td><dfn export>ISO-8859-3</dfn><td><a href=index-iso-8859-3.txt>index-iso-8859-3.txt</a><td><a href=iso-8859-3.html>Visualization</a><td><a href=iso-8859-3-bmp.html>BMP coverage</a>
<tr><td><dfn export>ISO-8859-4</dfn><td><a href=index-iso-8859-4.txt>index-iso-8859-4.txt</a><td><a href=iso-8859-4.html>Visualization</a><td><a href=iso-8859-4-bmp.html>BMP coverage</a>
<tr><td><dfn export>ISO-8859-5</dfn><td><a href=index-iso-8859-5.txt>index-iso-8859-5.txt</a><td><a href=iso-8859-5.html>Visualization</a><td><a href=iso-8859-5-bmp.html>BMP coverage</a>
<tr><td><dfn export>ISO-8859-6</dfn><td><a href=index-iso-8859-6.txt>index-iso-8859-6.txt</a><td><a href=iso-8859-6.html>Visualization</a><td><a href=iso-8859-6-bmp.html>BMP coverage</a>
<tr><td><dfn export>ISO-8859-7</dfn><td><a href=index-iso-8859-7.txt>index-iso-8859-7.txt</a><td><a href=iso-8859-7.html>Visualization</a><td><a href=iso-8859-7-bmp.html>BMP coverage</a>
<tr><td><dfn export>ISO-8859-8</dfn><td rowspan=2><a href=index-iso-8859-8.txt>index-iso-8859-8.txt</a><td><a href=iso-8859-8.html>Visualization</a><td><a href=iso-8859-8-bmp.html>BMP coverage</a>
<tr><td><dfn export>ISO-8859-8-I</dfn>
<tr><td><dfn export>ISO-8859-10</dfn><td><a href=index-iso-8859-10.txt>index-iso-8859-10.txt</a>
<tr><td><dfn export>ISO-8859-13</dfn><td><a href=index-iso-8859-13.txt>index-iso-8859-13.txt</a>
<tr><td><dfn export>ISO-8859-14</dfn><td><a href=index-iso-8859-14.txt>index-iso-8859-14.txt</a>
<tr><td><dfn export>ISO-8859-15</dfn><td><a href=index-iso-8859-15.txt>index-iso-8859-15.txt</a>
<tr><td><dfn export>ISO-8859-16</dfn><td><a href=index-iso-8859-16.txt>index-iso-8859-16.txt</a>
<tr><td><dfn export>KOI8-R</dfn><td><a href=index-koi8-r.txt>index-koi8-r.txt</a>
<tr><td><dfn export>KOI8-U</dfn><td><a href=index-koi8-u.txt>index-koi8-u.txt</a>
<tr><td><dfn export>macintosh</dfn><td><a href=index-macintosh.txt>index-macintosh.txt</a>
<tr><td><dfn export>windows-874</dfn><td><a href=index-windows-874.txt>index-windows-874.txt</a>
<tr><td><dfn export>windows-1250</dfn><td><a href=index-windows-1250.txt>index-windows-1250.txt</a>
<tr><td><dfn export>windows-1251</dfn><td><a href=index-windows-1251.txt>index-windows-1251.txt</a>
<tr><td><dfn export>windows-1252</dfn><td><a href=index-windows-1252.txt>index-windows-1252.txt</a>
<tr><td><dfn export>windows-1253</dfn><td><a href=index-windows-1253.txt>index-windows-1253.txt</a>
<tr><td><dfn export>windows-1254</dfn><td><a href=index-windows-1254.txt>index-windows-1254.txt</a>
<tr><td><dfn export>windows-1255</dfn><td><a href=index-windows-1255.txt>index-windows-1255.txt</a>
<tr><td><dfn export>windows-1256</dfn><td><a href=index-windows-1256.txt>index-windows-1256.txt</a>
<tr><td><dfn export>windows-1257</dfn><td><a href=index-windows-1257.txt>index-windows-1257.txt</a>
<tr><td><dfn export>windows-1258</dfn><td><a href=index-windows-1258.txt>index-windows-1258.txt</a>
<tr><td><dfn export>x-mac-cyrillic</dfn><td><a href=index-x-mac-cyrillic.txt>index-x-mac-cyrillic.txt</a>
<tr><td><dfn export>ISO-8859-10</dfn><td><a href=index-iso-8859-10.txt>index-iso-8859-10.txt</a><td><a href=iso-8859-10.html>Visualization</a><td><a href=iso-8859-10-bmp.html>BMP coverage</a>
<tr><td><dfn export>ISO-8859-13</dfn><td><a href=index-iso-8859-13.txt>index-iso-8859-13.txt</a><td><a href=iso-8859-13.html>Visualization</a><td><a href=iso-8859-13-bmp.html>BMP coverage</a>
<tr><td><dfn export>ISO-8859-14</dfn><td><a href=index-iso-8859-14.txt>index-iso-8859-14.txt</a><td><a href=iso-8859-14.html>Visualization</a><td><a href=iso-8859-14-bmp.html>BMP coverage</a>
<tr><td><dfn export>ISO-8859-15</dfn><td><a href=index-iso-8859-15.txt>index-iso-8859-15.txt</a><td><a href=iso-8859-15.html>Visualization</a><td><a href=iso-8859-15-bmp.html>BMP coverage</a>
<tr><td><dfn export>ISO-8859-16</dfn><td><a href=index-iso-8859-16.txt>index-iso-8859-16.txt</a><td><a href=iso-8859-16.html>Visualization</a><td><a href=iso-8859-16-bmp.html>BMP coverage</a>
<tr><td><dfn export>KOI8-R</dfn><td><a href=index-koi8-r.txt>index-koi8-r.txt</a><td><a href=koi8-r.html>Visualization</a><td><a href=koi8-r-bmp.html>BMP coverage</a>
<tr><td><dfn export>KOI8-U</dfn><td><a href=index-koi8-u.txt>index-koi8-u.txt</a><td><a href=koi8-u.html>Visualization</a><td><a href=koi8-u-bmp.html>BMP coverage</a>
<tr><td><dfn export>macintosh</dfn><td><a href=index-macintosh.txt>index-macintosh.txt</a><td><a href=macintosh.html>Visualization</a><td><a href=macintosh-bmp.html>BMP coverage</a>
<tr><td><dfn export>windows-874</dfn><td><a href=index-windows-874.txt>index-windows-874.txt</a><td><a href=windows-874.html>Visualization</a><td><a href=windows-874-bmp.html>BMP coverage</a>
<tr><td><dfn export>windows-1250</dfn><td><a href=index-windows-1250.txt>index-windows-1250.txt</a><td><a href=windows-1250.html>Visualization</a><td><a href=windows-1250-bmp.html>BMP coverage</a>
<tr><td><dfn export>windows-1251</dfn><td><a href=index-windows-1251.txt>index-windows-1251.txt</a><td><a href=windows-1251.html>Visualization</a><td><a href=windows-1251-bmp.html>BMP coverage</a>
<tr><td><dfn export>windows-1252</dfn><td><a href=index-windows-1252.txt>index-windows-1252.txt</a><td><a href=windows-1252.html>Visualization</a><td><a href=windows-1252-bmp.html>BMP coverage</a>
<tr><td><dfn export>windows-1253</dfn><td><a href=index-windows-1253.txt>index-windows-1253.txt</a><td><a href=windows-1253.html>Visualization</a><td><a href=windows-1253-bmp.html>BMP coverage</a>
<tr><td><dfn export>windows-1254</dfn><td><a href=index-windows-1254.txt>index-windows-1254.txt</a><td><a href=windows-1254.html>Visualization</a><td><a href=windows-1254-bmp.html>BMP coverage</a>
<tr><td><dfn export>windows-1255</dfn><td><a href=index-windows-1255.txt>index-windows-1255.txt</a><td><a href=windows-1255.html>Visualization</a><td><a href=windows-1255-bmp.html>BMP coverage</a>
<tr><td><dfn export>windows-1256</dfn><td><a href=index-windows-1256.txt>index-windows-1256.txt</a><td><a href=windows-1256.html>Visualization</a><td><a href=windows-1256-bmp.html>BMP coverage</a>
<tr><td><dfn export>windows-1257</dfn><td><a href=index-windows-1257.txt>index-windows-1257.txt</a><td><a href=windows-1257.html>Visualization</a><td><a href=windows-1257-bmp.html>BMP coverage</a>
<tr><td><dfn export>windows-1258</dfn><td><a href=index-windows-1258.txt>index-windows-1258.txt</a><td><a href=windows-1258.html>Visualization</a><td><a href=windows-1258-bmp.html>BMP coverage</a>
<tr><td><dfn export>x-mac-cyrillic</dfn><td><a href=index-x-mac-cyrillic.txt>index-x-mac-cyrillic.txt</a><td><a href=x-mac-cyrillic.html>Visualization</a><td><a href=x-mac-cyrillic-bmp.html>BMP coverage</a>
</table>

<p class=note><a>ISO-8859-8</a> and <a>ISO-8859-8-I</a> are
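
The coverage statements added in this change (for example, that index EUC-KR covers the Hangul Syllables block in its entirety) can be checked mechanically. The following is a minimal sketch, assuming an index is available as a pointer-to-code-point mapping; the function name and the toy data are hypothetical and do not come from the index files.

```python
def missing_from_block(index, first, last):
    """Return the code points in [first, last] that the index never maps to.

    `index` maps pointers to code points; None marks an unmapped pointer.
    """
    mapped = {cp for cp in index.values() if cp is not None}
    return [cp for cp in range(first, last + 1) if cp not in mapped]


# Assigned Hangul syllables run from U+AC00 to U+D7A3
# (the block itself extends to U+D7AF).
HANGUL_SYLLABLES = (0xAC00, 0xD7A3)

# Toy index for demonstration only, not real EUC-KR data.
toy_index = {0: 0xAC00, 1: 0xAC01, 2: None, 3: 0xAC03}
print([hex(cp) for cp in missing_from_block(toy_index, 0xAC00, 0xAC03)])
```

An empty result for `missing_from_block(index, *HANGUL_SYLLABLES)` would be consistent with the "in its entirety" claim for that index.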
91 changes: 91 additions & 0 deletions visualization-colors.css
@@ -0,0 +1,91 @@
/* Any copyright is dedicated to the Public Domain.
* http://creativecommons.org/publicdomain/zero/1.0/ */

/*
* Color scheme from http://mkweb.bcgsc.ca/biovis2012/
* for color blindness a11y.
*/

.visualizationlegend li {
padding: 0.3em 0.5em;
}

.visualizationlegend {
padding-bottom: 1em;
padding-right: 4.5em;
}

.unmapped {
background-color: #920000;
color: white;
}

.astral {
background-color: #6db6fe;
color: black;
}

.mid {
background-color: #ffff6d;
color: black;
}

.upper, .byte {
background-color: #d4d5db;
color: black;
}

.pua {
background-color: #db6d00;
color: black;
}

.mid.contiguous {
background-color: #009292;
color: black;
}

.upper.contiguous {
background-color: #24fe23;
color: black;
}

.pua.contiguous {
background-color: #924900;
color: black;
}

.astral.contiguous {
background-color: #480091;
color: white;
}

.duplicate {
border-style: solid ;
border-color: #920000;
color: black;
}

.compatibility {
border-style: dashed;
border-color: black;
}

.duplicate.compatibility {
border-style: dashed;
border-color: #920000;
}

.ext {
border-style: dotted;
border-color: black;
}

.ext.compatibility {
border-style: dotted;
border-color: #920000;
}

.surrogate {
background-color: #b4b4b6;
}
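
The background-color classes above follow the legend added to encoding.bs: UTF-8 byte length, Private Use, and whether a code point directly follows the one at the previous pointer. The following is a sketch of how a generator might choose those classes. It is not the logic of visualize.py, which this excerpt does not show. The `ascii` label is hypothetical (the legend has no entry for one-byte code points), treating only U+E000 to U+F8FF as Private Use is an assumption, and the border classes (`duplicate`, `compatibility`, `ext`) are left out.

```python
def legend_classes(code_point, previous=None):
    """Return legend class names for one cell.

    `code_point` is None for an unmapped pointer; `previous` is the code
    point at the preceding pointer, if any.
    """
    if code_point is None:
        return ["unmapped"]
    if code_point < 0x80:
        classes = ["ascii"]   # hypothetical label, not in the legend
    elif code_point < 0x800:
        classes = ["mid"]     # two bytes in UTF-8
    elif 0xE000 <= code_point <= 0xF8FF:
        classes = ["pua"]     # BMP Private Use Area (assumed scope)
    elif code_point < 0x10000:
        classes = ["upper"]   # three bytes in UTF-8, non-PUA
    else:
        classes = ["astral"]  # four bytes in UTF-8
    if previous is not None and code_point == previous + 1:
        classes.append("contiguous")
    return classes


print(legend_classes(0x00E9, previous=0x00E8))
```

The returned names can be joined with spaces to form a `class` attribute that matches selectors such as `.mid.contiguous`.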
92 changes: 92 additions & 0 deletions visualization.css
@@ -0,0 +1,92 @@
@import 'https://resources.whatwg.org/fonts/faces.css';
@import 'visualization-colors.css';

/* Any copyright is dedicated to the Public Domain.
* http://creativecommons.org/publicdomain/zero/1.0/ */


/* Common styling from standard.css */
h1 {
color: ##3c790a; /* WHATWG Green */
Member
Double #

Member Author
Thanks. Fixed in the latest push.

}

@media screen {
:link, :visited { text-decoration: none; }
:link:hover, :visited:hover, :link:focus, :visited:focus { text-decoration: underline; }
:link { color: #00C; }
:visited { color: #609; }
:link:active, :visited:active { color: #C00; }
}

.note { position: relative; color: green; background: #DDFFDD; font-style: italic; margin-left: 2em; padding-left: 2em; }
.note::before { content: 'Note'; background: green; color: white; padding: 0.15em 0.25em; font-style: normal; position: absolute; top: -0.2em; left: -1.5em; transform: rotate(-5deg); }
/* End copypasta from standard.css */

html {
font-family: 'Source Sans Pro', 'Noto Naskh Arabic', 'Noto Sans Hebrew', 'Noto Sans Thai', sans-serif;
}

th, h1 {
font-weight: 700;
}

th {
text-align: right;
border: 2px solid transparent;
}

td {
text-align: center;
border: 2px solid transparent;
padding: 0;
}

thead th {
text-align: center;
}

li {
border: 2px solid transparent;
}

dl, dd, dt {
margin: 0;
padding: 0;
border: 0;
}
dt, dd + dd {
font-weight: 300;
font-size: 9px;
letter-spacing: .15em;
}

.astral dd {
letter-spacing: normal;
}

dd {
line-height: 1.1;
font-size: 18px;
}

table {
table-layout: fixed;
border-spacing: 1px;
border-collapse: separate;
}

:lang(ja) {
font-family: 'Source Han Sans JP';
}

:lang(ko) {
font-family: 'Source Han Sans KR';
}

:lang(zh-cn) {
font-family: 'Source Han Sans CN';
}

:lang(zh-tw) {
font-family: 'Source Han Sans TW';
}