Normative: Require the latest available Unicode version instead of a fixed version number #620

Merged
merged 1 commit into tc39:master on Jul 28, 2016

7 participants

@mathiasbynens
Contributor
mathiasbynens commented Jun 23, 2016 edited

As of June 21st, Unicode 9.0.0 is the latest version.

Update July 27: This PR has been updated to refer to the latest available Unicode version rather than v9.0.0 specifically, as per the July 27 meeting.

@bterlson
Member

Should probably do a review of the changes before committing to this. Due diligence and all of that. Can you comment on whether/which changes are relevant?

@domenic
Member
domenic commented Jun 23, 2016

No due diligence, I want my Unicode power symbol now!!

@mathiasbynens
Contributor

@bterlson Sure.

  • Space_Separator hasn’t changed between Unicode 8 and 9.
  • Unicode 8 has 2,518 ID_Start symbols; Unicode 9 has 2,669, i.e. 151 more.
  • Unicode 8 has 109,830 ID_Continue symbols; Unicode 9 has 117,007, i.e. 7,177 more.
  • Unicode 8 defines 1,245 simple case mappings (1,217 C + 28 S); Unicode 9 defines 1,325 (1,297 C + 28 S), i.e. 80 more.

Did I miss anything?

@bterlson
Member

@mathiasbynens Additions don't seem worrying. Anything by way of "breaking changes" there? Removals from ID_Start/ID_Continue and the like?

@mathiasbynens
Contributor
mathiasbynens commented Jun 23, 2016 edited

No removals from ID_Start, ID_Continue, or C or S case-fold mappings.

I’ve checked all of the above using the Unicode data files directly, but it can all be verified quite easily by running npm install unicode-8.0.0 unicode-9.0.0 and writing some quick Node.js scripts à la:

// Look for removals in `ID_Continue`:
const a = require('unicode-8.0.0/Binary_Property/ID_Continue/code-points.js');
const b = new Set(require('unicode-9.0.0/Binary_Property/ID_Continue/code-points.js'));
const diff = new Set(a.filter(x => !b.has(x)));
console.log(diff);
// → Set { }

Once we’ve established there are no removals, we can easily count the number of new symbols:

const a = require('unicode-8.0.0/Binary_Property/ID_Continue/code-points.js').length;
const b = require('unicode-9.0.0/Binary_Property/ID_Continue/code-points.js').length;
console.log(b - a);
// 7339

The same goes for other properties, e.g.:

// Look for removals in `S` case folding:
const a = Object.keys(require('unicode-8.0.0/Case_Folding/S/code-points.js'));
const b = new Set(Object.keys(require('unicode-9.0.0/Case_Folding/S/code-points.js')));
const diff = new Set(a.filter(x => !b.has(x)));
console.log(diff);
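
The removal check and the count of additions shown above can be folded into one helper. The following is only a minimal sketch: `diffCodePoints` is a hypothetical name, and the sample arrays are made-up stand-ins rather than real Unicode property data, so that the snippet runs without installing anything.

```javascript
// Hypothetical helper (not part of any unicode-* package): compare two
// arrays of code points and report what was removed and what was added.
function diffCodePoints(oldPoints, newPoints) {
  const oldSet = new Set(oldPoints);
  const newSet = new Set(newPoints);
  return {
    // Present in the old version but missing from the new one.
    removed: oldPoints.filter((cp) => !newSet.has(cp)),
    // Present in the new version but not in the old one.
    added: newPoints.filter((cp) => !oldSet.has(cp)),
  };
}

// Made-up sample data for illustration only, not real property contents.
const sampleOld = [0x41, 0x42, 0x43];
const sampleNew = [0x41, 0x42, 0x43, 0x1F600];

const sampleDiff = diffCodePoints(sampleOld, sampleNew);
console.log(sampleDiff.removed.length, sampleDiff.added.length);
// → 0 1
```

With the packages installed, the two arguments would presumably be the arrays loaded from `unicode-8.0.0/Binary_Property/ID_Continue/code-points.js` and its `unicode-9.0.0` counterpart, as in the snippets above. An empty `removed` array is the "no breaking changes" signal asked about earlier in the thread.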
@bterlson
Member

Good point! I will explore some and report back.

@allenwb
Member
allenwb commented Jun 23, 2016

Unicode now seems to be on a yearly update schedule that slightly lags ECMA-262.

If our intent is to update these references every year, wouldn't it be better to use an open-ended reference to the current Unicode standard? In standards documents, a normative reference to another standard that does not include a specific version or date qualifier means the "current version".

@mathiasbynens
Contributor

@allenwb That’s what I proposed three years ago: https://bugs.ecmascript.org/show_bug.cgi?id=2071#c0

@rwaldron
Contributor

Whichever update process is used, Ecma-402 will need to be updated as well.

@bterlson
Member

@allenwb I think that is what we had consensus for as well. An open-ended reference seems fine but I was thinking a specific reference at least cued us to do the due diligence of looking for potential issues. I worry without that (and @mathiasbynens's expertise) we'd grow complacent :-P Happy to update to an open-ended reference though.

@littledan
Contributor

When I proposed upgrading to Unicode 8.0, I had no idea that an open-ended reference was in the cards as a legal possibility for a spec to do, and didn't realize that we got consensus on that. I thought the consensus was annual bumps like this. I like Allen's idea. Let's keep doing the due diligence, but I don't think we need explicit bump commits to enforce that.

@bterlson
Member
bterlson commented Jun 24, 2016 edited

Going through the notes I see that we had consensus for "8 or greater" which doesn't actually imply that we can use an unversioned reference. For now I won't take this PR, and will add to the agenda for Redmond that we discuss the unversioned reference.

@ljharb ljharb added a commit to tc39/agendas that referenced this pull request Jul 1, 2016
@ljharb ljharb Adding Unicode PR per tc39/ecma262#620 (comment) 1060132
@mathiasbynens mathiasbynens referenced this pull request in shapesecurity/shift-parser-js Jul 5, 2016
Open

upgrade to Unicode 8.0.0 #263

@mathiasbynens mathiasbynens Normative: Require the latest available Unicode version f3541c9
@mathiasbynens
Contributor

PR updated to refer to the latest available Unicode version, as per the July 27 meeting.
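
One practical consequence of an unversioned reference is that the set of valid identifiers depends on the Unicode data an engine ships. As a rough sketch, assuming a Node.js build with ICU (which exposes `process.versions.unicode`) and support for Unicode property escapes in regular expressions, a script can report the bundled Unicode version and test a string against a simplified IdentifierName pattern. `isIdentifierName` is a hypothetical helper; it ignores `\u` escape sequences and reserved words.

```javascript
// Unicode version bundled with this Node.js build, if the build exposes it.
const unicodeVersion = process.versions.unicode || 'unknown';

// Simplified IdentifierName pattern: `$`, `_`, or an ID_Start code point,
// followed by any number of `$`, ZWNJ (U+200C), ZWJ (U+200D), or
// ID_Continue code points. Unicode escape sequences are not handled.
const identifierName = /^[$_\p{ID_Start}][$\u200C\u200D\p{ID_Continue}]*$/u;

// Hypothetical helper for illustration only.
function isIdentifierName(text) {
  return identifierName.test(text);
}

console.log(unicodeVersion);
console.log(isIdentifierName('café'), isIdentifierName('1abc'));
// second line → true false
```

Characters that only gained ID_Start or ID_Continue in a newer Unicode version would be accepted by this check only on engines whose data already includes that version.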

@mathiasbynens mathiasbynens changed the title from Normative: Require Unicode 9.0.0 to Normative: Require the latest available Unicode version instead of a fixed version number Jul 28, 2016
@bterlson
Member

@mathiasbynens Looks great, thanks so much!

@bterlson bterlson merged commit 0eb8b2f into tc39:master Jul 28, 2016
@caridy
Contributor
caridy commented Aug 26, 2016

@mathiasbynens do you plan to update 402 as well?

@mathiasbynens mathiasbynens added a commit to mathiasbynens/ecma402 that referenced this pull request Aug 30, 2016
@mathiasbynens mathiasbynens Normative: Require the latest available Unicode version instead of a fixed version number

Ref. tc39/ecma262#620.
a0bb34e
@mathiasbynens mathiasbynens added a commit to mathiasbynens/ecma262 that referenced this pull request Aug 30, 2016
@mathiasbynens mathiasbynens Normative: Point to the latest version of UTR15
Instead of referring to a version snapshot, link to the latest version of UTR15.

Ref. #620.
0c97a2d
@bterlson bterlson added a commit that referenced this pull request Sep 2, 2016
@mathiasbynens @bterlson mathiasbynens + bterlson Normative: Point to the latest version of UTR15 (#681)
Instead of referring to a version snapshot, link to the latest version of UTR15.

Ref. #620.
8ac4a31
@caridy caridy added a commit to tc39/ecma402 that referenced this pull request Sep 27, 2016
@mathiasbynens @caridy mathiasbynens + caridy Normative: Require the latest available Unicode version instead of a fixed version number

Ref. tc39/ecma262#620.
fbbce48