Date and Number formatting #348

Closed
wants to merge 3 commits into
from

8 participants

@ashiksmd

Added date and number formatting APIs. New modules datatype-date-advanced-format, datatype-number-advanced-format, datatype-date-timezone and intl-format.

Created new topic branch and cherry-picked changes as recommended by ericf (#347)

@rgrove

-1.

The size of this pull request makes it virtually impossible to review thoroughly, but based on what I was able to review of it, I see several problems.

First, the sheer amount of code and number of files added to the library: over 700,000 lines of code! That's massive, and all for a relatively small subset of functionality that many people will never use. The metadata alone for these new modules adds a lot of weight even when the modules aren't being loaded. The additional files and testing surface will significantly increase the time required to build and test the library. For people who self-host the library, all of this is a big deal.

Second, there doesn't appear to be any non-API documentation, and the pull request description doesn't give any details about what all this new code does. Sure, it formats dates and numbers, but for a 700,000+ line commit, I'd like to know a little more than that. What does it do that justifies its size and the significant complexity it adds to the library?

Third, I notice that at least one file includes a distinct Yahoo! copyright message (separate from the copyright already in YUI's license) that also mentions the code is based on code owned by VMWare, but it doesn't indicate what VMWare's license is. This is potentially problematic, since VMWare's ownership of the original code may mean that derivative works can't be included in YUI. I assume this means this code was derived from Zimbra, but someone from Yahoo! probably needs to clarify the licensing situation here.

This seems like a very good candidate for the gallery, especially given Dav's recent gallery improvements, but I'm against adding this to the core in its current state.

@allenrabinovich

@rgrove -- I would just like to shed some light here and give you a context, if I may.

First, most of the lines that you describe as "lines of code" are language packs for number and date formatting. Let's be fair: that's not code, and there is nothing there that is within our area of qualification to review (well, I can review the Russian files for correctness, and you can review the English ones, and we can test JSON for correct syntax, and that's about it). So the only code to review is in src/date/js, src/number/js and src/intl/js -- much less code than you are implying there is.

Second, all this code has been approved by our open source working group for release, and while it's true that portions of it are under a different license, that license will be added to the overall license file, just like the relevant license for Handlebars was approved and added at your behest. I think that license file update got lost somewhere along the way, but that's just an accidental omission -- it of course needs to be added.

Third, the purpose of this code is to add proper and standard methods for formatting numbers, dates (including timezone information and relative dates and times) as well as messages containing numbers and dates, in a large range of languages. This code has been put forward by the internationalization team at Yahoo!, and I believe there is quite a bit of demand for it, at least based on the requests I've seen from the community. I'll reach out to the committer to put up the examples he has shown to me, to make the purpose of the code clearer.

I think your concerns about the metadata size are perfectly valid, and we should discuss that at length. There's a lot more to discuss there, too, including the right places for specific portions of the code, as well as the need for more documentation and examples. I believe that we're opening up our pull request discussion meetings, so this will be up for a group discussion soon. I do think that this code and data are actually quite valuable to the library (believe it or not, there are many companies working in non-English language markets!), so I hope you can join these discussions as they are scheduled.

Incidentally, if you have thoughts on how to best make the pull request more convenient to parse, given the large number of language packs in it (multiple pull requests?), that would be helpful too.

@rgrove

@allenrabinovich To be clear, my vote is a -1 for this pull request, not for this functionality. I'm undecided on the functionality, though I feel that, logistically, there are a lot of problems with putting it in the core library.

Whether you call them "language packs" or code, there are over 700,000 lines of something in files with .js extensions, which will be parsed, interpreted, and executed by a browser when loaded. I call this code. In any case, these lines need to be reviewable, and this pull request is far too large to review or comment on. I mean that literally: GitHub's diff and line-commenting UI literally will not work on this pull request.

If some of this code is generated, then I suggest breaking this up into separate requests so we can at least review and comment on the hand-written code. We should also review whatever code was used to generate the generated code, since it may need to be regenerated in the future.

I see that license info has been added now, so that's good. There's still the problem of metadata size and the additional weight this adds to the library. Metadata size is an issue for everyone; file size in general is an issue for those of us who self-host the library.

The additional files we can probably live with if this functionality is really that much in demand (although in my tests it does increase build time by 25 seconds, or 16%), but making everyone who uses YUI pay the metadata penalty for this functionality seems unreasonable to me. That's why I suggested putting it in the gallery, where the metadata penalty is optional.

Given the ongoing work to make gallery modules both more autonomous and more integrated with the core modules throughout the website, the gallery really does feel like a great place for something like this to live. It's a large piece of non-essential functionality that, while useful, probably will not actually be used by 90% of the people who use YUI. Those who do need it could easily load it from the gallery, and maintaining it in the gallery keeps the core metadata from ballooning and also avoids increasing build time, test time, and the self-hosting cost of the core library.

@davglass
YUI Library member

👎

I agree with @rgrove on this, this is a huge addition with little payoff. I don't see a reason to pull this into core. This should live in the gallery. (or some happy medium of that which we can discuss during a hangout).

@allenrabinovich

@ashiksmd -- I hate to do this to you again, but can you create a separate branch off of dev-3.x, then cherry-pick only the non-generated code (that is, only the code you wrote and not the language packs) into that branch and send another pull request? The language packs do make it basically impossible to comment directly on your code.

Also, is there additional code that was used to generate these language packs? We'd probably like to have it included as well, since these may need to be regenerated in the future (if there are any changes in locale data, etc.).

Because ballooning metadata with hundreds of language packs is a serious issue, we'll bring this up for discussion with the team this week and think about how to best distribute this code between the core and the gallery.

@sdesai - can you also join this conversation? I'd like to hear your thoughts as the architect of the lang infrastructure.

@ericf
YUI Library member
@sdesai

I'll look at the pull request details and comment. Hopefully by Thu EOD.

@sdesai

[edit, get the formatting right]

This is probably going to repeat a lot of what was said above, which I mostly agree with, and a couple of things I don't:

1) New feature sets of this size need to have a design document covering the high level use cases/demand/API/pros/cons/numbers on meta-data size increase etc.

I think we should start there, before doing anything else with this pull request. So we can see what the API implications are, and what the numbers look like.

By the time we get back to the pull request, it should mostly be about implementation details and JS cleanup.

2) I believe there's a demand for this functionality, and it should ideally exist in core.

As a framework which people use to build large applications, I think localization in general, and date/number localization in particular, is a key component. For example, Calendar/Charts/DataTable etc. should all be leveraging this layer for Date/Number formatting/parsing and get localization support for free.

That is, it's an important lower-level feature which warrants the testing and long term stability which being in core (vs. gallery) implies.

3) The reason we're really shying away from putting it in core, is largely unrelated to the actual feature. It's a more generic loading/meta-data problem which we need to address regardless of this feature. Stuff Dav is working on independently or we've discussed in the past for generic meta-data growth/core growth reasons.

Off the top of my head:

Metadata K-Weight Optimization

  • Trying to share the array of languages a module supports, across multiple components, so it's not repeated. I'm not sure we have a way to get rid of the explicit "I support these languages" definition completely, but if so, even better.

  • Maybe split the extended lang meta-data out, and include it conditionally - 2 step loading for people leveraging lang/localization support.

  • Allowing users to prune the meta-data to the exact set of modules/features they're using.

Load/Use Run-Time Cost Optimization

  • Optimizing loader run-time costs by not exploding language packs as first class modules, unless someone is using a localizable module/feature.

File System/Clone Cost Optimization

  • Along the lines of the meta-data pruning, have the same tool also prune the file system of unused modules (and variants -debug, -coverage etc). In addition to being a build tool for self-hosting, this comes up a lot when deploying to devices for offline applications.

  • If cloning cost/size of the repo is a real concern, we can consider hosting non-default language packs in a separate repo. So, out of the box, localizable components have a default language pack in the yui3 repo - and "additional" languages are available from another repo (gallery or otherwise).

4) Regarding aligning to the ECMAScript Internationalization Spec, sure seems like it's worth looking at, before we commit to an API. The only thing I'd recommend there is to make sure there's actual traction and stability before stating that it's an absolute requirement for getting this pull request in (Norbert is the original author of the Intl and current localizable Date formatting/parsing infrastructure)

@ericf
YUI Library member

Regarding aligning to the ECMAScript Internationalization Spec, sure seems like it's worth looking at, before we commit to an API. The only thing I'd recommend there is to make sure there's actual traction and stability before stating that it's an absolute requirement for getting this pull request in (Norbert is the original author of the Intl and current localizable Date formatting/parsing infrastructure)

I was talking to Norbert about this, this week. I was trying to push for the resource packs for Date, Number, and currency be a part of the Internationalization spec in some way which the browsers/envs would have these resources built in. The other concept is having a infrastructure in the library which enables localized strings to be loaded for the correct locale. This is something we already have, but we could to better at it so that it's more deterministic (two-phase loading process) and not penalize.

For old browsers, YUI will still need these standardized resource packs (for Date, Number, etc.). But this seems like something we should provide in a separate project from YUI. Ideally all libraries would integrate a single poly-filling project until everyone has browsers with that implement the Internationalization spec and have the standard resource packs built in.

@ericf
YUI Library member
@NorbertLindenberg

The ECMAScript Internationalization API, edition 1.0, is stable; we expect it to get approved as an Ecma standard in about 10 days. It has significant traction: The Chrome implementation is in beta, the Firefox implementation is in progress, an Explorer implementation exists, although releasing the latter may take a while. We don't know what's up with Safari, Opera, or regional browsers. On the other hand, even if they all shipped today, it would take a few years before developers can take the API for granted in the then A-grade browsers.

The API covers the basics of string comparison, number, and date and time formatting, but especially for date and time formatting there's room for advanced functionality.

Together, this means that there's still room for libraries: poly-fills for the ECMAScript Internationalization API, libraries providing functionality in areas not covered by the API, and libraries providing advanced functionality in the areas covered by the API. The sponsors of this library will have to explain where it fits in. In the past, that would have been me, but I had to erase that part of my brain when I left Yahoo last year.

It's essential that the tools used to generate the resource bundles data are included with the commit and reviewed. I assume the bundles are derived from CLDR, and CLDR is continuously updated and has new releases about twice a year. Time zone data changes even more often – the IANA time zone database sees 10-20 releases per year. There's no reasonable way to review the generated data; you have to assume that the CLDR data and IANA time zone database are OK, and review how they are converted.

The list of languages supported should also be reviewed: There's no need to support POSIX on the web; Yugoslavia no longer exists, I can't find a country for country code RH, and it's not clear that there's a community of English speakers in Vietnam that's large and unique enough that it needs its own locale. Also, the Intl module has a fallback mechanism, so where a bundle for a language-country combination has the same data as the language-only bundle, it can be omitted.

For number formatting, few applications will need names and symbols for all currencies in the world (the currency converter in Yahoo Finance may be one such application). For most applications it's probably adequate to support for each locale the currencies of the country itself, its neighbors, its top ten trading partners, and the top ten economies in the world, and fall back to the ISO currency code for others.

@jenny
YUI Library member

I agree with others that the functionality is important and should ideally be in core. Here are the open issues I see that remain to be resolved:

  • High level design overview
  • Separate pull requests into the features vs resource bundles so we can continue code review
  • User guide documentation and examples
  • Resolution of metadata impact
  • Additional pull requests for the tools used to generate the resource bundles (here or in separate repo?)

/cc @ashiksmd
/cc @triptych

@davglass
YUI Library member

Closing along with #362

@davglass davglass closed this Jan 16, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment