Internationalization #26

@JayPanoz

Note: I’ll use “i18n” instead of “internationalization”.

So let’s be honest, this issue will quite probably stand as “The Readium CSS issue” since it is roadmap-blocking, is impacting other parts of Readium 2 (streamer, navigator, API, apps developed by EDRLab), and will need a lot of documentation. In other words, it’s a project on its own, nested in the Readium CSS project.

I’ve spent the last 3 weeks documenting this issue, and you can think of it as a summary of the research that has been done. I won’t list everything here, only what is critical to provide implementers and readers with a solid baseline. I’m willing to make this baseline the best we can get (say, bulletproof although rough around the edges) but it’s worth noting we’ll need help from experts in those various languages and their typography to turn it into an excellent user experience.

Roadmap

First and foremost, I’d be in favor of updating our current roadmap. Vertical writing is indeed blocking and I’d like to move it to the beta version.

What it means is that we could ship support for horizontal-tb CJK, Right to Left and Indic languages in the alpha version relatively quickly, and then focus on vertical-writing, since our work on the a11y baseline has been ahead of schedule since the beginning of this project.

I believe we would all agree that the prototype has proved a solid-enough bedrock (of course we have edge cases to deal with, but it’s fine for the vast majority of content), and pushing the small and easy wins for RTL/horizontal-tb CJK/Indic on the develop branch would allow us to release an alpha on the master branch in early 2018. This would probably send a good signal too, since the proto was released 3 months ago already.

On a related note, we’ll start documenting columns handling (e.g. page progression) in January 2018, so it would make sense to prioritize LTR/RTL (horizontal-tb), especially as we’ll be able to document vertical-writing immediately after. Quite frankly, this will be critical since there are conceptual changes to take into account.

Global needs outscoping Readium CSS

Obviously, Readium CSS won’t be able to fly in autopilot mode there. It needs either flags it can target or smart handling of its resources depending on the publication.

Minimal set of features

What we’ll need:

  • checking the page-progression-direction for the spine (streamer);
  • checking the language, as there can be multiple <meta> (streamer);
  • loading specific stylesheets based on those previous indications (API);
  • appending xml:lang and/or lang attribute if it’s missing in XHTML documents (API);
  • appending dir="rtl" attribute if it’s missing in XHTML documents (API);
  • loading specific fonts’ lists for user settings, based on language (Apps);
  • adding/removing specific user settings, based on language (Apps);
  • having the toc and at least some pieces of user settings (e.g. text-align) with a rtl direction for RTL languages (Apps);
  • page-progression from right to left (navigator).
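
As a rough illustration of the two “appending … if it’s missing” items above, here is a minimal sketch in TypeScript. Everything in it is hypothetical (the `patchHtmlAttributes` name, the `DocumentHints` shape, the idea of working on a plain attribute map rather than a DOM): it only shows the “never override what the author declared” rule, and it assumes the caller has already decided whether `dir` applies.

```typescript
// Hypothetical helper: none of these names come from Readium.
type HtmlAttributes = Partial<Record<string, string>>;

interface DocumentHints {
  lang?: string; // main language tag from the publication metadata, if any
  dir?: "rtl";   // set by the caller only when the publication is right to left
}

// Returns a copy of the <html> attributes where lang, xml:lang and dir
// are filled in only when the document does not already declare them.
function patchHtmlAttributes(attrs: HtmlAttributes, hints: DocumentHints): HtmlAttributes {
  const patched: HtmlAttributes = { ...attrs };

  // Prefer what the document already declares; fall back to the publication language.
  const lang = attrs["xml:lang"] ?? attrs["lang"] ?? hints.lang;
  if (lang !== undefined) {
    if (patched["lang"] === undefined) patched["lang"] = lang;
    if (patched["xml:lang"] === undefined) patched["xml:lang"] = lang;
  }

  // Only add dir when the caller asked for it and the author did not set one.
  if (hints.dir !== undefined && patched["dir"] === undefined) {
    patched["dir"] = hints.dir;
  }
  return patched;
}
```

For instance, a document with no attributes in an Arabic publication would get `lang`, `xml:lang` and `dir="rtl"`, while a document that already declares `lang="fr"` keeps it.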

A longer-term issue will be localization, should you want to get this need covered in the apps, as implementers might want an easy way to translate strings, etc. But it’s up to EDRLab, obviously.

Writing-mode and RTL mapping

For writing mode, these are the writing-mode values we should apply based on the language and page-progression-direction:

Language                IANA tag   page-progression-direction   Writing-mode
Chinese                 zh         LTR / Default / None         horizontal-tb
Chinese                 zh         RTL                          vertical-rl
Chinese (Simplified)    zh-Hans    DNA (?)                      horizontal-tb
Chinese (Traditional)   zh-Hant    DNA (?)                      vertical-rl
Chinese (Taiwan)        zh-TW      LTR / Default / None         horizontal-tb
Chinese (Taiwan)        zh-TW      RTL                          vertical-rl
Chinese (Hong Kong)     zh-HK      LTR / Default / None         horizontal-tb
Chinese (Hong Kong)     zh-HK      RTL                          vertical-rl
Korean                  ko         LTR / Default / None         horizontal-tb
Korean                  ko         RTL                          vertical-rl
Japanese                ja         LTR / Default / None         horizontal-tb
Japanese                ja         RTL                          vertical-rl
Mongolian               mn-Cyrl    LTR / Default / None         horizontal-tb
Mongolian               mn-Mong    DNA                          vertical-lr

I propose we simplify this model for Chinese and rely on page-progression-direction with an extra check for language (zh), and not bother with all those variants.

It’s worth noting we should not add dir="rtl" there, for the CJK languages.
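
To make the simplified model concrete, here is a minimal TypeScript sketch of the lookup. The function name and types are hypothetical, and it deliberately follows the simplification proposed above: script and region variants of Chinese are ignored, so only the primary language subtag and the page-progression-direction matter (with traditional Mongolian as the one special case).

```typescript
// Hypothetical names; the logic follows the simplified mapping proposed above.
type PageProgression = "ltr" | "rtl" | "default";
type WritingMode = "horizontal-tb" | "vertical-rl" | "vertical-lr";

function resolveWritingMode(languageTag: string, progression: PageProgression): WritingMode {
  const subtags = languageTag.toLowerCase().split("-");
  const primary = subtags[0];

  // Traditional Mongolian script is vertical and left to right,
  // whatever the page progression says.
  if (primary === "mn" && subtags.indexOf("mong") !== -1) {
    return "vertical-lr";
  }

  // CJK only goes vertical when the spine progresses right to left.
  const isCjk = primary === "zh" || primary === "ja" || primary === "ko";
  if (isCjk && progression === "rtl") {
    return "vertical-rl";
  }

  // Everything else, including RTL scripts such as Arabic, stays horizontal.
  return "horizontal-tb";
}
```

Note that with this simplification a `zh-Hant` publication without an RTL page progression resolves to `horizontal-tb`, which differs from the full table.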

For Right to Left, we can simply rely on page-progression-direction if the language is not CJK (or Mongolian), but here is a mapping of languages you might encounter, just for your information:

Language          IANA tag   page-progression-direction   dir attribute
Arabic            ar         RTL                          rtl
Farsi (Persian)   fa         RTL                          rtl
Hebrew            he         RTL                          rtl
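
A minimal sketch of that rule in TypeScript, with hypothetical names: `dir="rtl"` is derived from the page-progression-direction alone, unless the language is CJK or Mongolian, in which case no dir attribute is added.

```typescript
// Hypothetical helper illustrating the rule above; not an existing API.
type PageProgression = "ltr" | "rtl" | "default";

// Primary language subtags for which an RTL page progression
// signals vertical writing rather than a right-to-left script.
const NO_DIR_LANGUAGES = ["zh", "ja", "ko", "mn"];

function resolveDir(languageTag: string, progression: PageProgression): "rtl" | undefined {
  const primary = languageTag.toLowerCase().split("-")[0];
  if (progression !== "rtl") return undefined;
  if (NO_DIR_LANGUAGES.indexOf(primary) !== -1) return undefined;
  return "rtl";
}
```

So `resolveDir("ar", "rtl")` yields `"rtl"`, while `resolveDir("ja", "rtl")` yields `undefined` since Japanese with an RTL progression is handled through writing-mode instead.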

Right to Left

This shouldn’t be a huge issue in Readium CSS, as we only need a few adjustments, specific base and default styles, and typefaces.

Hopefully, this doesn’t impact our views (paged and scrolled) since columns will behave as expected.

Our pagination model is the following:

 _________________    _________________
|                 |  |                 |
|                 |  |                 |
|                 |  |                 |
|                 |  |                 |
|                 |  |                 |
|      Col 1      |  |      Col 2      |
|                 |  |                 |
|                 |  |                 |
|                 |  |                 |
|                 |  |                 |
|                 |  |                 |
|                 |  |                 |
 —————————————————    ————————————————— 

CSS Multicol in horizontal-tb (x-axis)

When the dir attribute is set on html, it becomes:

 _________________    _________________
|                 |  |                 |
|                 |  |                 |
|                 |  |                 |
|                 |  |                 |
|                 |  |                 |
|      Col 2      |  |      Col 1      |
|                 |  |                 |
|                 |  |                 |
|                 |  |                 |
|                 |  |                 |
|                 |  |                 |
|                 |  |                 |
 —————————————————    ————————————————— 

CSS Multicol in horizontal-tb + dir="rtl" (x-axis)
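
For implementers paginating by shifting the content, the mirrored layout mostly means the sign of the offset flips. Here is a minimal TypeScript sketch of that geometry; the function name and the idea of paginating with a horizontal translation are assumptions for illustration, not how any Readium navigator necessarily works.

```typescript
// Hypothetical helper: computes the horizontal translation (in px) to apply
// to the content so that column `index` (0-based) lines up with the viewport.
type Direction = "ltr" | "rtl";

function columnTranslateX(
  index: number,
  columnWidth: number,
  columnGap: number,
  dir: Direction
): number {
  const distance = index * (columnWidth + columnGap);
  if (distance === 0) return 0;
  // LTR: later columns sit to the right, so the content moves left (negative).
  // RTL: later columns sit to the left, so the content moves right (positive).
  return dir === "rtl" ? distance : -distance;
}
```

With 300px columns and a 20px gap, the third column (index 2) needs a translation of -640px in LTR and +640px with dir="rtl".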

Our main CSS-related concern there should be typefaces, as we’ll need outstanding fonts to deal with typography requirements (ligatures, multi-baseline levels, joining rules, etc.).

CJK (horizontal-tb)

Similar to RTL: we only need a few adjustments, specific base and default styles, and typefaces.

This should already cover the vast majority of content in Chinese (vertical writing is not used in mainland China, only in Taiwan, Hong Kong and Macao) and in Korean.

Chinese, Japanese and Korean share a lot in terms of typography. Although the differences are quite minor, having a few adjustments for each language would be a plus.

Other languages

For the time being, we’re only focusing on Devanagari, which should not have a huge impact. Once again, we’ll need a few adjustments, with the main focus being typefaces.

Vertical Writing

This is by far our biggest issue in Readium CSS since we can’t necessarily manage it well across platforms.

We don’t have anything to force the column-axis in CSS, which means that our spread model (two columns next to each other)

 _________________    _________________
|                 |  |                 |
|                 |  |                 |
|                 |  |                 |
|                 |  |                 |
|                 |  |                 |
|      Col 1      |  |      Col 2      |
|                 |  |                 |
|                 |  |                 |
|                 |  |                 |
|                 |  |                 |
|                 |  |                 |
|                 |  |                 |
 —————————————————    ————————————————— 

CSS Multicol in horizontal-tb (x-axis)

Will automatically become the following in vertical-rl:

 _____________________________________
|                                     |
|                                     |
|                Col 1                |
|                                     |
|                                     |
 —————————————————————————————————————
 _____________________________________
|                                     |
|                                     |
|                Col 2                |
|                                     |
|                                     |
 ————————————————————————————————————— 

CSS Multicol in vertical-* (y-axis)

So the best we can do right now is a fragmented scrolled-view:

  _____________________________________
 |                                     |
 |                                     |
 |                                     |
 |                                     |
 |                                     |
 |                Col 1                |
 |                                     |
 |                                     |
 |                                     |
 |                                     |
 |                                     |
 |                                     |
  ————————————————————————————————————— 
- - - - - - - - - - - - - - - - - - - - - (Overflow begins here)
  _____________________________________
 |                                     |
 |                                     |
 |                                     |
 |                                     |
 |                                     |
 |              Overflowed             |
 |                 Col                 |
 |                                     |
 |                                     |
 |                                     |
 |                                     |
 |                                     |
  ————————————————————————————————————— 

New fragmented scrolled-view for vertical-* (y-axis)

In other words, one column with overflowed columns on the y-axis, which 1) will force implementers to map left/right (swipe/buttons) on bottom/top and 2) won’t allow them to have page-transition animations.
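
The left/right to bottom/top mapping could look like the following TypeScript sketch. The names are hypothetical, and it assumes that with an RTL page progression the left button (or a swipe revealing the left side) means “next”, which then becomes a scroll toward the bottom by one viewport height.

```typescript
// Hypothetical mapping for the fragmented scrolled-view in vertical writing.
type Button = "left" | "right";
type PageProgression = "ltr" | "rtl";

// Returns the vertical scroll delta in px: positive scrolls toward the bottom
// (next fragment), negative scrolls toward the top (previous fragment).
function verticalScrollDelta(
  button: Button,
  progression: PageProgression,
  viewportHeight: number
): number {
  // Assumption: "forward" is left in RTL progression and right in LTR.
  const forward = progression === "rtl" ? button === "left" : button === "right";
  return forward ? viewportHeight : -viewportHeight;
}
```

For example, in a vertical-rl publication (RTL progression) with an 800px viewport, the left button scrolls down by 800px and the right button scrolls up by 800px.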

Note: The only alternative to solve those issues at the moment would be writing a renderer in JavaScript. It’s worth noting that if you’re only targeting iOS, there is a solution in pure CSS though.

What’s even worse is that the same typefaces can’t necessarily be used (proportional/fixed-width depending on writing-mode), and I’ll have to make adjustments for quotes and other details in the base and default stylesheets based on writing-mode…

Note: We won’t try to manage horizontal-tb documents in vertical-rl publications in a smart way for the time being. This use case is indeed not defined in the EPUB spec. Besides, we’ve got nothing at the OPF level to deal with it, and checking the writing-mode at runtime would hurt performance in extreme ways, e.g. 15 seconds to render some XHTML files… which would be worse than not supporting this use case in terms of UX.

Longer-term issues include:

  • polyfilling -epub-properties for web apps;
  • support for alternate stylesheets (which is critical if the implementer wants to offer a horizontal/vertical-writing user setting);
  • support for rendition: align-x-center;
  • support for ibooks:respect-image-size-class (gaiji) and ibooks:scroll-axis metas (see EPUB Compat doc);
  • user settings (some like letter- and word-spacing might have to be removed, and not only for CJK);
  • rendition: flow of scrolled-doc.

Out of scope

There are some typography and layout issues which are not our responsibility but rendering engines’. Those issues include:

  • line-adjustment and justification (RTL and CJK);
  • run-in headings (display: run-in), which is popular in CJK;
  • ruby and its styling;
  • bidi;
  • Kashida Elongation (Arabic);
  • joining forms (Arabic);
  • single-letter styling (Arabic).

Documentation

In theory, I would only have to document the new fragmented scrolled-view model for vertical-writing, and adjustments for user settings.

In practice, I’m willing to go the extra mile and will document typographic and layout concepts, and make glossaries, so that Western implementers have everything at hand to deal with requirements and issues in CJK and languages they might not be familiar with.

This will obviously take time but will fix a huge pain point.

Overarching Issues

  • We don’t have any text layout requirements for Hebrew; we’ll need help
  • Hangul, Chinese, Arabic and Indic Text Layout requirements are incomplete; we’ll have to keep track of updates and changes
  • We have not defined a precise baseline for i18n, it may be wise to do it
  • It would be useful to have data about writing-mode, RTL, typefaces used/expected, etc.

Resources
