Permalink
Find file
Fetching contributors…
Cannot retrieve contributors at this time
270 lines (203 sloc) 11.5 KB
### Intro
Hello everyone,
Thanks for having me. It's a pleasure to be here again. My name is Rafael
Xavier, I'm the project lead for Globalize, which is one of the jQuery
Foundation projects, specifically the one about internationalization.
I'm also part of the ECMA-402 working group, which is the web specification
thing, alongside with Steven Loomis and John Emmons which are here at the
conference.
I've also made a couple of other related but smaller things, like having created the
mailing list that today is ECMA-402 official discussion list and created JavaScript
libraries for making it easier to develop using CLDR.
~~~ Why is internationalization important?
Skipped due to be among globalization experts...
~~~ What is it about?
Skipped too...
When I talk to developers in general they think it's about translations to
non-English languages, which is true but not in its entirety. So, I argue that
i18n is about "talking" in general, it's about making machine language
understandable for humans. Every applicaion at some point has to "talk" to its
users (I hope). This is, to display information to them, i.e., messages that may
require pluralization support, gender inflection, date formatting, number
formatting, currency formatting, etc. Even in English, it might be tricky to get
that done properly.
### Application
~~~ ICU
~~~ (stop) Application using crazy regexps to format them
On programming languages like C, C++, Java and others we have ICU. But, on JavaScript
we have (crazy regexps)... I wasn't convinced by this answer and keep trying
until I found...
### Develop using Intl
~~~ Application using Intl
... we have Intl, which is the browsers' native API. As of today, it supports
number formatting, currency formatting, date formatting, and collation. It's
exactly what we are looking for, a function that we can call and pass the
locales that we want support for and say what we want displayed and it just do
the right thing for us.
~~~ Intl.NumberFormat
~~~ Intl.DateTimeFormat
~~~ Intl.Collator
IE/Edge Firefor Chrome Safari Opera Android Browser Node.js
Yes IE>=11 Yes Yes No Yes >=4.4 Yes
No iOS But mobile *English only by default
Intl would be perfect if it wasn't for some problems. The drawbacks of using
native Intl as of today are:
Limited browser support: it isn't supported on all browsers and environments.
According to GlobalStats, Intl is supported for 67.82% of the global users,
which is a decent coverage. But, it also means it leaves 1/3 behind - including
iOS users which is pretty significant. Intl is supported on Node.js because of
the great work done by Steven Loomis. The only restriction is that Intl comes
with support for English only by default. If you need different locales, you
need to re-compile it.
Beyond browser support, another drawback is reduced functionalities. There are
functionalities NOT supported by Intl at all, they have NOT even been
specified. Common missing ones are message formatting, relative time formatting,
unit formatting, abbreviation; and even text segmentation, address formatting,
postal codes, phone formatting, and timezones.
ECMA-402 2nd edition is supposed to bring some new functionalities in. For example,
there's a draft made by Caridy, from Yahoo, for plural forms. But, it's hard to
say when all those missing functionalities are going to become part of the
specification and then be implemented and supported by the browsers.
~~~ MessageFormat
~~~ - Pluralization
~~~ - Gender Inflection
~~~ RelativeTime
~~~ UnitFormat
~~~ Abbreviation
~~~ Text Segmentation
extra
~~~ Address Format
~~~ Postal Codes
~~~ Phone Format
~~~ Timezones
### JavaScript libraries
To fill in the gap there are JavaScript libraries and here is where the fun
begins.
Each library has its own strenghts and weakness, which I and the lead developer of
several other projects tried to summarize in this link. Feel free to explore it
and pick the one the best suite your needs.
Today, I want to share with you some of the experience I had during this time
leading Globalize where I had to opportunity to discuss this topic with several
smart engineers from various different companies including a team at Twitter.
~~~ 1st CLDR as a peer dependency
~~~ - wrong data issue -> Ideal: resource loading + cldr as peer dependency.
Differently from Intl, where the implementation is already in the client, using
JavaScript library applicaions have to provide that support, they have to ship
that support to the client if it's a client-side application. Directly or
indirectly it has to ship i18n data.
Thanks to Unicode we have CLDR. But as we know, CLDR in its full form is very
large, especially for client side applicaions, (hundreds of MB) which means
that no JavaScript library includes it in its entirety. Usually, they specify
what data they need for their supported features and allow applications to
cherry-pick that data from the locales they need.
Some libraries choose to ship a preprocessed mix of data and code for each of
its supported locales, so we can include it in a script tag and use it. Its file
structure looks something like this and is usually: version X of the library
has version Y of the data, whenever there's new data available, a new library
release must happen, so the updates are made available to the end user. If
developer finds a problem with the data, he needs to patch the library. As soon
as a new update pops up and he downloads it. He loses his previous fixes. Unless
he keeps a list of patches that he can re-apply over and over and, hopefully,
not get any conflicts while doing so. It's a nightmare for maintenance point of
view.
~~~ package.json w/ cldr-data
Instead of that we believe a better approach is having library code
completely separate from i18n data. Data should be a peer-dependency of the
library as simple as that.
cldr-data is a module for npm and bower that holds the official CLDR in the JSON
format.
~~~ 2nd declarative code
We made Globalize to be really flexible, to be modular, to work in several
different environments, and to allow developers to control data loading, which
requires an imperative programming like in this example: require the modules you
need, load the data, set the locale, then use the library. But, most of it can
be usually automated. So, developer has a declarative programming experience like
he does using Intl, which is just use the library. By no means the idea is to
drop the original Globalize flexibility, but we suggest how to configure tools
like webpack to provide such automation.
The end goal is to make usage easier.
~~~ 3rd performance -> Runtime modules
~~~ - Large in its full form -> ideal: minimum set of code, minimum set of data.
Finally, being only easy to use on development is worthless if the final product
is bad. So, we've also managed to avoid performance impacts. There are two major
contributing factors for performance here.
One is load performance, which is especially important for client side applications
where you have to ship code. As I mentioned earlier, the data is considerably
large and even the small slices that libraries make may include a lot of data
that your application won't need. For example, a currency formatter includes all
currency codes for a certain locale. If your application uses US Dollars only,
all other loaded currency data is unnecessary. Similarly, a date formatter
includes all the localized translation names in several lengths (e.g.,
"January", "Jan", "J"). If your application displays dates using numeric form
only, all the loaded translations are unnecessary. We're talking about tens to
hundreds of wasted KB.
The other is related to runtime performance: generating the formatters are an
expensive operation.
On Globalize, we wanted to do something analogous to templating engines, which
can precompile templates during build time for optimal performance at runtime.
So, applications can get the smallest and fastest custom tailored bundle.
By smallest, I mean avoiding the inclusion of unnecessary code and data. No
unnecessary translation messages, no unnecessary CLDR data, so no bandwidth is
wasted.
By fastest, I mean having all formatters (number, currency, datetime, relative
time, etc) generated at build time. So, no CPU clock cycles are wasted on the
client.
### Solution
We've worked to get tools to do all that automatically. So, let me show you an
example application using Globalize and its static compiler configured on
webpack.
Please, note Globalize is unopinionated and its goal is to empower users to use
it the way they want. But, for obvious reasons, this example is very
opinionated. If you don't use webpack, don't worry. You should be able to
reproduce this in the tool of your preference. Globalize and its compiler are
pure JavaScript, so am sure it's possible to wrap it on a different tool. Also,
note that this example uses CommonJS but Globalize also supports AMD.
~~~ examples/app-npm-webpack
In this example, we've created two running environments.
For development, our goal was that coding was
- declarative, the default locale is set automatically, the data is loaded
automatically.
- optimized for fast live reload, use plaing Globalize where formatters are
generated dynamically, no need for compiling formatters here, so we get faster
refreshs.
For production, our goal was to generate precompiled code for optimal
performance. It runs the Globalize compiler on our source code to figure out
the minimum set of data and modules we need and it generates the precompiled
formatters for the supported locales.
In the setup we're using here, it outputs one JavaScript file for all the
external libraries, one for all our application, and one for the precompiled
formatters.
Again, this is a setup we wanted for the example, this could be completely
customized for your own needs.
Show built sizes, it's not even gzipped. Show one of the precompiled JS.
The precompiled stuff: it's smart enough to reuse formatters. For example,
dateformatter and currencyformatter reuses numberformatter and so on.
~~~ Example - BTC as currency code
Remember the bitcoin case I've mentioned, let's see what it would take in
order to get that working
~~~ Example - React.js
~~~ Example - ECMA-402 + Globalize
~~~ Translation Table
~~~ How easy it is to create new features / new functionalities
~~~ - allow to amend or enrich data, BTC.
Allowing the developer to control the data loading has an interesting
consequence. It makes it so easy to amend any content, or even enrich it.
For example, I have recently heard that Kraken is using Globalize, although I
don't know any of the specifics. All I know that Kraken is a bitcoin trade and
bitcoin is the popular cryptocurrency and one usecase that came into my mind is
that unfortunately for them, bitcoin isn't an available currency in CLDR, so
they would not be able to format bitcoin currencies. But, it's so easy to enrich
CLDR with the missing data, so the library can re-use all of the existing
localized currency rules: for example: whether there is a space between the
symbol and the number and where symbol should be positioned (before or after the
number).
### Good bye
~~~ Call for action:
~~~ - use it (let us know if docs are unclear)
~~~ - contribute
I welcome you give it Globalize a try, let us know if the docs are unclear. If
you are already using it and needs additional support, let us know, also feel
free to jump in and contribute. Recently, Andrew Lunny from Twitter just added a
new module in Globalize, the unit formatting. So, it's made to be extended.
Help us to improve it.
Thanks