Translations need to support down to IE8 #881

Closed
nlhkabu opened this Issue Dec 30, 2015 · 15 comments

8 participants

@nlhkabu
Member
nlhkabu commented Dec 30, 2015

I'm planning on making the new design degrade gracefully all the way down to IE8. We still have enough users on this browser to justify the extra work.

One problem that @dstufft has just raised is translation: we are using l20n.js, which is great (and has a lot of cool features) but only supports evergreen browsers. (Note: I couldn't actually find the reference for supported browsers. @dstufft, are you able to point me in the right direction?)

Given that the majority of our legacy IE users are not located in the English speaking world, this is a bit of a problem.

@dstufft and I discussed a few options:

  1. Using gettext instead
  2. Porting l20n.js to Python
  3. Helping l20n.js support older browsers

After a tiny bit of research, it looks like there is already a Python port available at https://github.com/l20n/python-l20n

Advice and feedback on this issue is warmly welcomed. We'd particularly like to hear from any developers who have experience rolling out l20n to a site of this scale.

@dstufft
Member
dstufft commented Dec 30, 2015

l20n/l20n.js#102 and l20n/l20n.js#64 are likely relevant here.

@dstufft
Member
dstufft commented Dec 31, 2015

After talking to the #l20n IRC channel it appears they are planning on supporting back to IE11 but no further than that. They've said that if we manage to port it to IE8 they might be able to help maintain the branch that does that, but are unlikely to merge it back into mainline. They also said they looked at IE8 support previously and there were missing things that just simply couldn't be polyfilled. There's a good chance getting l20n.js working is going to be harder than it's worth for as long as we need to support older versions of IE.

I'm going to poke at it to see how hard it would be, but I have a feeling it's likely to be a no go.

That leaves us with two remaining options (unless someone comes up with more!).

Switch (back) to gettext

  • + This is the "standard" format that most projects use, so translators are likely to be more familiar with it.
  • + It already exists and it wouldn't be very hard to switch back to it.
  • - It only has limited support for variants (like plural forms) and has no support for things like variants based on gender.
  • - It uses the English text as the message ID, so non-semantic changes to the English text can force translators to re-translate it. This could possibly be worked around though.
  • - Doesn't understand HTML, which is most obvious when you're trying to use some sort of inline HTML element like a link or a span with a class. It requires a translation string like This is a <a href="%(url)s">translated link</a>. The contents are inserted verbatim, so any classes, rel's, etc. need to be inside every single translation.

Port L20n to Python

  • + Supports variants, so translators can change the translation based on any data that is passed into the translation engine, like the user's gender.
  • + Requires the English text to be specified separately from the message ID, so tweaks to the English wording don't affect other translations.
  • + Natively understands HTML and uses a merging algorithm that allows things like classes, href's, etc. to be in the source document while still allowing markup in the translated text. The translation strings look something like This is a <a>translated link</a>. A downside is that there is no support for re-ordering elements, like two links in a translation string, because the merging is order sensitive.
  • - Doesn't exist, will require us to write it.
  • - Is a newer format so it doesn't have a lot of tool support currently. There are some Mozilla based tools but it's unknown if they'd work with server side via a Python port or not.
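
The order-sensitive merging mentioned above can be sketched roughly as follows. This is not L20n's actual algorithm, just a simplified illustration of the idea: the attributes of the Nth opening tag in the source document are copied onto the Nth opening tag of the translation. The function name merge_markup and the sample strings are hypothetical.

```python
import re

# Matches opening tags only: "</a>" is skipped because "/" is not a
# word character.
OPENING_TAG = re.compile(r"<(\w+)([^>]*)>")


def merge_markup(source_html, translated):
    """Copy attributes from the Nth source tag onto the Nth translated tag."""
    source_tags = iter(OPENING_TAG.findall(source_html))

    def fill(match):
        name = match.group(1)
        source = next(source_tags, None)
        if source is None or source[0] != name:
            raise ValueError("translation markup does not line up with the source")
        # Keep the translator's element, take the attributes from the source.
        return "<%s%s>" % (name, source[1])

    return OPENING_TAG.sub(fill, translated)


source = 'Read the <a class="external" href="/help">help</a>.'
merged = merge_markup(source, "Lisez l'<a>aide</a>.")
print(merged)
```

Because tags are paired purely by position, a translation that swaps two links would silently receive the wrong href on each, which is the re-ordering limitation described in the list.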

All in all, I think the main question is whether gettext's lack of variants is a big hindrance. Sadly this is way outside of my wheelhouse since I only speak English. Here's an example from the L20n documentation:

<brandShortName "Boot2Gecko"
 _gender: "neutral"
>
<crashBanner[brandShortName::_gender] {
  masculine: "{{brandShortName}} uległ awarii",
  feminine: "{{brandShortName}} uległa awarii",
  neutral: "{{brandShortName}} uległo awarii"
}>

This would be specified like <div data-l20n-id="crashBanner"></div> and it would pick the masculine, feminine, or neutral form of the crash banner based on the gender of the brandShortName variable.

The other major benefit/difference is whether we care about the native understanding of HTML. I think it'd probably be a nice feature, but @nlhkabu pointed out that translators are probably used to needing to insert whole HTML. Either way, it doesn't affect what can be translated to what, the way the variants support does.
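
As a rough idea of what a server-side port would have to do for this example, here is a small Python sketch of the variant lookup. It assumes the two entities have already been parsed into a dictionary (the ENTITIES layout and the resolve function are hypothetical, not part of any existing library), and it uses single braces so that str.format can do the substitution.

```python
# Hypothetical in-memory form of the two entities above; a real port
# would parse them from an .l20n file.
ENTITIES = {
    "brandShortName": {"value": "Boot2Gecko", "_gender": "neutral"},
    "crashBanner": {
        "index": ("brandShortName", "_gender"),
        "variants": {
            "masculine": "{brandShortName} uległ awarii",
            "feminine": "{brandShortName} uległa awarii",
            "neutral": "{brandShortName} uległo awarii",
        },
    },
}


def resolve(entity_id):
    entity = ENTITIES[entity_id]
    if "variants" not in entity:
        return entity["value"]
    # Look up the attribute named by the index to choose a variant.
    target, attribute = entity["index"]
    template = entity["variants"][ENTITIES[target][attribute]]
    # Fill in references to plain-valued entities.
    values = {name: e["value"] for name, e in ENTITIES.items() if "value" in e}
    return template.format(**values)


banner = resolve("crashBanner")
print(banner)
```

The translator controls both the _gender attribute and the variants, so the calling code only ever asks for crashBanner.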

@jezdez
jezdez commented Jan 4, 2016

I'd like to suggest ignoring IE8 completely (by disabling l20n.js for it) given its relatively low usage numbers globally. I know it's probably bigger in some regions, but if you fall back to English you would simplify the implementation drastically and make PyPI future proof.

@jezdez
jezdez commented Jan 4, 2016

Ah, of course that implies you use proper English strings as the source language instead of technical IDs ("A button label" vs. "a-button-label").
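
The point about source strings can be illustrated with Python's gettext module rather than l20n.js, since the fallback behaves the same way in principle: when no translation is available, the user sees whatever the message ID is. The domain name "warehouse" and the empty locale directory are assumptions made for this sketch.

```python
import gettext
import tempfile

# An empty directory stands in for a locale tree with no zh_CN catalog.
empty_localedir = tempfile.mkdtemp()

# fallback=True returns a NullTranslations object instead of raising
# when no catalog is found. "warehouse" is a hypothetical domain name.
translations = gettext.translation(
    "warehouse", localedir=empty_localedir, languages=["zh_CN"], fallback=True
)
_ = translations.gettext

readable = _("A button label")   # falls back to usable English
technical = _("a-button-label")  # falls back to a raw identifier
print(readable, technical)
```

With English source strings the fallback is still readable; with technical IDs the untranslated page would show identifiers instead of text.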

@dstufft
Member
dstufft commented Jan 4, 2016

I'm not sure if there is a better way to make this data public, so I'll just share some screen shots. These should all be from Dec 4th, 2015 to Jan 3rd, 2016. This is from Google Analytics so it of course doesn't capture people who don't have JS enabled.

language-stats

language-stats2

That's the language vs. browser numbers. I don't see a way to drill down further into browser versions and include the version numbers. However, here are the version numbers of browsers:

browser-stats

I can also break down specifically IE versions by language:

ie-lang-stats

@dstufft
Member
dstufft commented Jan 4, 2016

I will say, if this were some packaging feature, I wouldn't have a problem breaking it for the number of requests in a month that are showing up here. I mainly don't feel qualified to make a decision: I've never needed a major hub of information that wasn't available in my native language (which is also the only language I speak), so I have no reference for how big of a deal it would be for the ~2600 sessions with language set to zh-cn to get PyPI in English instead of Chinese. I'm not even sure whether we should expect that share to go up if we translate, because perhaps people are turned away right now because PyPI isn't in their native language.

Of course, no matter what we do the bulk of the actual content of PyPI is still going to be in English, as well as the documentation and all that jazz so perhaps the translations aren't a massive big deal and are more just a way to provide a smoother experience where possible.

@nlhkabu
Member
nlhkabu commented Jan 4, 2016

Perhaps members of the Python community in China (where the majority of our IE8/9 users are located) might be able to provide some insight? Any idea of who to ping for this?

@dstufft
Member
dstufft commented Jan 26, 2016

Ok, I've sent a message to Twitter and to distutils-sig to see if we can get anyone familiar with the affected group of users to tell us how big of a deal this is. Unless someone comes in and makes a compelling case for needing to support back that far, I think we'll just go with supporting translations in whatever browsers l20n.js supports, and otherwise people will degrade to the current behavior of English only.

@rsyring
rsyring commented Jan 26, 2016

From Distutils sig reply:

As for your question, though, I would expect some of the less proficient
English speakers to also have outdated hardware or software installs,
especially in poor countries or very humble social environments.

This was my initial thought as well. IMO, using a translation system that requires modern hardware/software is not ideal.

@encukou
encukou commented Jan 26, 2016

Hi! I'm a native Czech speaker, so I know a bit about grammatical numbers and genders. I also run a beginners' Python course in Czech.

I don't think grammatical genders are that much of a deal. If the thing you're describing is constant (e.g. "PyPI" in "Welcome to PyPI"), you can just translate the sentence accordingly. If it's dynamic (e.g. "Django" in "Download Django 1.8", or "user234" in "Logged in as user234"), you actually have to know the gender of the word in each language you're translating to. For a site where most dynamic stuff is user-generated, that's pretty much impossible. And needing to know the gender of users, with a limited range of options (male, female, and maybe neuter/inanimate), is a whole different can of worms.
Czech has a workaround: you can say Download the package "Django" or Logged in as user "user234", where the dynamic term doesn't affect the rest of the sentence, so you don't need to know the gender. I expect it might be similar in other languages as well.

As for numbers, Gettext (and everything that came after it) should have adequate support for them, at least for use in UI strings. Its major limitation – one number per message – hits English just as much as other languages. So the website author might need to be a bit creative at some points, but translators won't be stuck with messages they can't translate well.
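
A small sketch of the "one number per message" limitation, using Python's standard gettext module. NullTranslations applies the English rule only, and the release_summary function and its wording are hypothetical.

```python
import gettext

# NullTranslations applies the English rule: singular when n == 1.
# A real catalog would supply the plural rule for its language.
ngettext = gettext.NullTranslations().ngettext


def release_summary(files, downloads):
    # Each ngettext call can vary on only one number, so a sentence with
    # two counts is assembled from two separately translated messages.
    files_part = ngettext("%(n)d file", "%(n)d files", files) % {"n": files}
    downloads_part = ngettext(
        "%(n)d download", "%(n)d downloads", downloads
    ) % {"n": downloads}
    return "%s, %s" % (files_part, downloads_part)


summary = release_summary(1, 2)
print(summary)
```

Splitting the sentence this way is the kind of creativity the site author needs; each piece stays fully translatable on its own.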

From my experience running a beginners' course: it's really helpful if the basics are translated, but the more advanced/exotic things become, the more people are willing to accept English (or, to invest in learning it). Making it possible to translate some packages, even if all others will stay English, should enable local communities to be more welcoming to their newbies by translating the packages used in tutorials.

@uranusjr

I’m Taiwanese, so this observation may not be 100% accurate, but I think most Chinese people don’t even expect PyPI to have a Chinese translation. It would be an awesome surprise if it did, but a user visiting PyPI likely would not turn away because it only comes in English. I’m all for supporting translation when it’s at all possible (Python resources in Chinese have always been very scarce), but this will not likely be a deciding issue for Chinese-speaking Python users.

(Edit: Typos)

@wong2
wong2 commented Jan 27, 2016

I'm +1 for @uranusjr

As a side note: in China, students start taking English lessons in junior high school at the latest (nowadays, kindergarten!), and they have to reach a specified English level to get their college degree. For developers, the ability to read English is even more important, as many learning resources and websites (e.g. GitHub) are in English. So the lack of a Chinese translation isn't so scary.

@pekkaklarck

FWIW, I've worked with many Chinese developers and none of them had problems reading English. That doesn't mean translations wouldn't help there or in Finland where I'm based, but spending any time to support translations on legacy browsers is a waste. Now that MS is actively dropping support for old browsers, their market share is going to go down anyway.

@pekkaklarck

Related feature request: I hope that there's a way to globally turn off translations regardless of the browser settings. I use a Finnish locale but would generally prefer to have all content in English rather than some in Finnish and most in English.

@nlhkabu
Member
nlhkabu commented Mar 6, 2016

Given the status of #979 (not supporting ie8), I am going to close this issue for now. Should we have any significant negative community feedback at launch, we can look into a solution then.

@nlhkabu nlhkabu closed this Mar 6, 2016