Non-normatively visualize the indexes #89

hsivonen · 2017-01-16T13:39:58Z

Closes #78.

hsivonen · 2017-01-16T17:23:07Z

Oops. I forgot to actually call my aria() function.

annevk · 2017-01-17T08:41:18Z

Travis fails because the visualization resources do not pass the HTML checker. Some are not NFC and some use code points U+0080 to U+009F. We'll either need to exclude them from HTML checking or address that somehow.
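
For anyone wanting to reproduce the two complaints locally, here is a minimal sketch. The function name `checker_concerns` is hypothetical; it is not part of visualize.py or of the HTML checker, and it only approximates the two conditions mentioned above (text that is not NFC, and code points in the range U+0080 to U+009F).

```python
import unicodedata


def checker_concerns(text):
    """Return the reasons a string might trip the two warnings discussed above."""
    concerns = []
    # A string is in NFC if normalizing it to NFC leaves it unchanged.
    if unicodedata.normalize("NFC", text) != text:
        concerns.append("not NFC")
    # U+0080 to U+009F are the C1 control code points.
    if any(0x80 <= ord(ch) <= 0x9F for ch in text):
        concerns.append("C1 control")
    return concerns


print(checker_concerns(u"\u212B"))  # ANGSTROM SIGN, a singleton that NFC maps to U+00C5
print(checker_concerns(u"\u0085"))  # a C1 control
print(checker_concerns(u"A"))       # plain ASCII, no concerns
```

The comparison against `unicodedata.normalize` is deliberately simple; the real checker may apply further rules that this sketch does not model.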

annevk

Couple nits. Looks good to me overall. I wonder if @inexorabletash would also be willing to review.

annevk · 2017-01-17T08:43:00Z

deploy.sh

@@ -68,6 +68,8 @@ if [ $BRANCH != "master" ] ; then
         > $BRANCH_DIR/index.html;
    cp *.txt $BRANCH_DIR/;
    cp *.json $BRANCH_DIR/;
+    cp *.css $BRANCH_DIR/;
+    python visualize.py $BRANCH_DIR/


Was it intentional to skip commit snapshots?

No. I just didn't understand the purpose of that directory correctly.

annevk · 2017-01-17T08:43:31Z

encoding.bs

@@ -18,6 +18,7 @@ Markup Shorthands: css off
 Translate IDs: dictdef-textdecoderoptions textdecoderoptions,dictdef-textdecodeoptions textdecodeoptions
 </pre>

+<style>@import 'visualization-colors.css';</style>


According to Tab we can use <link> and Bikeshed will move it correctly in the output.

annevk · 2017-01-17T08:43:50Z

encoding.bs

@@ -653,33 +654,66 @@ changed, so has the <a>index</a>.
 <var>code point</var> in <var>index</var>, or null if
 <var>code point</var> is not in <var>index</var>.

+<div class=note id=visualization>
+<p>There is a non-normative visualization for each index other than index gb18030 ranges.


Since this is inside a <div> it needs to be indented by one.

We should also xref the indexes here I think.

annevk · 2017-01-17T08:44:44Z

encoding.bs

+
+<p>The legend for the visualizations is:
+<ul class=visualizationlegend>
+    <li class=unmapped>Unmapped


These have too much indentation. <div> + <ul> means two spaces.

annevk · 2017-01-17T08:46:37Z

encoding.bs

 <tr>
  <td><dfn export>index Big5</dfn>
  <td><a href=index-big5.txt>index-big5.txt</a>
+  <td><a href=big5.html>Visualization</a>


We should add a title here saying "index Big5 visualization". Even better would be to make each link text unique somehow since that affects screen readers.

I did the latter.

annevk · 2017-01-17T08:47:53Z

encoding.bs

  <td>This matches the KS X 1001 standard and the Unified Hangul Code, more
-  commonly known together as Windows Codepage 949.
+  commonly known together as Windows Codepage 949. This index covers the Hangul Syllables


s/This index/It/

hsivonen · 2017-01-17T11:54:15Z

Some are not NFC

Prepended with space.

and some use code points U+0080 to U+009F.

Replaced with SVG.

Addressed all review comments.

zcorpan · 2017-01-17T12:52:25Z

visualization.css

+
+/* Common styling from standard.css */
+h1 {
+  color: ##3c790a; /* WHATWG Green */


Thanks. Fixed in the latest push.

zcorpan · 2017-01-17T13:04:17Z

visualize.py

+        # HTML prohibits C1 controls
+        # TODO draw some fancy SVG hex inside the square
+        return "<svg><rect x=1 y=1 width=14 height=14 stroke=black stroke-width=2 fill=none /></svg>"
+    as_str = unichr(code_point)


I get:

Traceback (most recent call last):
  File "visualize.py", line 258, in <module>
    format_index(name, row_length, lang, bmp, duplicates, byte_rule)
  File "visualize.py", line 206, in format_index
    out_file.write((u"<td class='%s %s%s%s' aria-label='%s'><dl><dt>%d<dd lang=%s>%s<dd>U+%04X</dl>" % ("contiguous" if contiguous else "discontiguous", classify(code_point), " duplicate" if duplicate else "", check_compatibility(code_point), aria(code_point, contiguous, duplicate), pointer, lang, format_code_point(code_point), code_point)).encode("utf-8"))
  File "visualize.py", line 124, in format_code_point
    as_str = unichr(code_point)
ValueError: unichr() arg not in range(0x10000) (narrow Python build)

...with Python 2.7.10

Python's sad Unicode support strikes again. How did you get your copy of Python? Apple?

The script works on Python 2.7.12 as shipped on Ubuntu 16.04. Debian and, therefore, Ubuntu ship wide Python builds. It seems that Travis also runs Ubuntu, so it should be OK there.

@annevk, Is making the script work on e.g. Apple-shipped Python a blocker for merging?

I think I got it from Apple, but I don't really know.

Apple and some OS X package managers ship narrow builds. If the issue is literally just unichr, then it is at least easily worked around.

If the issue is literally just unichr, then it is at least easily worked around.

Yeah, I guess tomorrow I'll be writing the wrapper (Unicode string for code point regardless of 16 vs. 32-bit units) that Python should have had the good sense to put in the standard library.

It now works with Apple-shipped Python, too. (Actually tested on a Mac and diffed the output with Ubuntu's wide Python output.)
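
For reference, one way such a wrapper could look. This is a sketch under assumptions, not the code that landed in visualize.py: the names `code_point_to_text` and `surrogate_pair` are made up here, and the first function assumes that the UTF-32 codec yields a surrogate pair on a narrow Python 2 build and a single character elsewhere.

```python
import struct


def code_point_to_text(code_point):
    """Return a text string for code_point without calling unichr() on astral values."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point: %r" % (code_point,))
    # Pack the code point as one little-endian 32-bit unit and let the codec decode it.
    return struct.pack("<I", code_point).decode("utf-32-le")


def surrogate_pair(code_point):
    """Return the UTF-16 (lead, trail) surrogates for an astral code point."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not an astral code point: %r" % (code_point,))
    offset = code_point - 0x10000
    # The high 10 bits select the lead surrogate, the low 10 bits the trail surrogate.
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


print(repr(code_point_to_text(0x41)))
print([hex(unit) for unit in surrogate_pair(0x1F600)])  # ['0xd83d', '0xde00']
```

Going through a codec avoids branching on the build type; the explicit surrogate arithmetic is shown only to make clear what a narrow build has to store for code points above U+FFFF.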

zcorpan · 2017-01-18T11:38:30Z

Builds for me now for ./deploy.sh --local. 👍

annevk · 2017-01-20T10:17:39Z

https://travis-ci.org/whatwg/encoding/builds/193654621 still reports PUA and NFC warnings. I think the NFC warnings we have to accept per conversation on IRC. Does the same go for PUA? It would be good to mention in the commit message what is expected there.

annevk · 2017-01-20T10:27:07Z

One other problem I noticed is that the wrapping is not following 100 columns which is what we're starting to adopt everywhere (though no linebreaks inside "inline" elements). I'm happy to fix that once you consider this ready.

hsivonen · 2017-01-20T10:28:11Z

Does the same go for PUA?

As noted on IRC, yes. For example, on Ubuntu, showing PUA as PUA gives the insight that the PUA code points in the left part of the last row of gb18030 show up as radicals that fit reasonably between the non-PUA radical code points around them. Then one can go read Lunde and learn why this is so: the PUA radicals weren't mapped in Unicode at the time gb18030 was first specced.
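
To make "PUA" concrete, a small sketch of a range test for the three Private Use areas defined by Unicode. The names `PRIVATE_USE_RANGES` and `is_private_use` are hypothetical, and the classification that visualize.py actually performs may differ.

```python
# The three Private Use ranges defined by the Unicode Standard.
PRIVATE_USE_RANGES = (
    (0xE000, 0xF8FF),      # Private Use Area in the BMP
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B
)


def is_private_use(code_point):
    """Return True if code_point falls in one of the Private Use ranges."""
    return any(first <= code_point <= last for first, last in PRIVATE_USE_RANGES)


print(is_private_use(0xE000))  # True
print(is_private_use(0x4E00))  # False, a CJK Unified Ideograph
```

Working on integer code points rather than characters keeps the test independent of whether the Python build is narrow or wide.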

One other problem I noticed is that the wrapping is not following 100 columns which is what we're starting to adopt everywhere (though no linebreaks inside "inline" elements). I'm happy to fix that once you consider this ready.

I think this is ready. Thanks.

annevk · 2017-01-20T10:30:33Z

Okay, I'll go fix that. Could you write a commit message that explains the change and the warnings? Just post it as a comment and I'll use it when squash and merging.

hsivonen · 2017-01-20T10:34:55Z

Add visualizations for the indexes

PUA and NFC validation warnings are expected. Exposing the PUA code points as PUA in HTML is useful for seeing what those code points map to (if to anything) in system fonts, which may give insights about their usage. Exposing singletons (compatibility ideographs and scientific units) without normalizing them is useful for having browser "Find" functionality match whatever you get as output of a converter you might be developing.

annevk · 2017-01-20T10:46:31Z

Thanks @hsivonen, they look great and will surely help folks understand how these encodings work!

domenic · 2017-01-20T16:33:16Z

May I suggest a WHATWG blog post explaining this space and the new visualizations to people? :)

hsivonen added 2 commits January 16, 2017 15:38

Non-normatively visualize the indexes

24e43a3

Closes #78

Close </dl> for U+FFFD cells.

6536875

annevk previously requested changes Jan 17, 2017

hsivonen added 3 commits January 17, 2017 12:59

Actually use the aria() function

6f873eb

Make the validator happier about the visualizations

dba60fd

Address review comments

5acd5aa

zcorpan reviewed Jan 17, 2017

hsivonen added 3 commits January 17, 2017 15:55

Fix CSS syntax for WHATWG Green

148d01f

Support narrow Python

1f96d68

Add aria-label to unmapped cells

10c81ef

hsivonen added 3 commits January 18, 2017 14:34

Increase contrast between text and background for contiguous PUA

751ad1d

Fix Shift_JIS lead byte value row headings.

a7ba8e2

Restore accidentally-deleted svg intrinsic dimensions.

c6232f7

slight rewrapping

98acf1f

annevk merged commit 7696b7a into master Jan 20, 2017

annevk deleted the visualization branch January 20, 2017 10:46
