Add Localization #1453

Closed
di opened this issue Sep 8, 2016 · 13 comments

Comments

@di
Member

@di di commented Sep 8, 2016

An initial attempt at localization was removed in #1335, but eventually we may want to bring translation back to make PyPI more accessible to a wider audience.

Previously we used http://l20n.org/, but this may or may not be the best tool for the job.

Additional tools that might be worth exploring:

@brainwane
Member

@brainwane brainwane commented Dec 17, 2017

I am very much in favor of PyPI being localized, but since legacy PyPI did not support localization, I need to make the hard call and say that it's not on our critical path for launching Warehouse.

@di di added the Low priority label Jan 18, 2018
@brainwane brainwane added the i18n label Feb 13, 2018
@brainwane
Member

@brainwane brainwane commented Feb 13, 2018

Per Heidi Waterhouse's overview and appreciating the discussion in #402, there are some prerequisite steps we ought to do even if we don't have time right now to gather volunteer translations via Transifex or TranslateWiki or fully implement a localization tool like l20n.

  • Figure out how to use Pyramid's existing localization/internationalization support so we can avoid hard-coding messages in our templates. For instance, warehouse/templates/includes/edit-project-button.html has the plain hardcoded English message "Edit Project" rather than something like {{ edit-project-msg }} (I don't know the exact syntax), which would have the right message interpolated for the user's locale, which (at launch) we'd default to en.
  • Confirm extended character support so people's names can include, e.g., Chinese characters.
  • Check our screens for fixed-width elements.
@di
Member Author

@di di commented Feb 13, 2018

Figure out how to use Pyramid's existing localization/internationalization support so we can avoid hard-coding messages in our templates.

The actual marking of strings for translation can be done pretty easily with pyramid.i18n which provides a request.localizer.translate function, and pyramid-jinja2 which provides the same function for our Jinja2 templates.

The marked strings can then be extracted and compiled for translation with Babel.
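
As a rough illustration of the mark-then-translate flow described above, here is a minimal sketch. It uses the standard library's `gettext` module as a stand-in rather than `pyramid.i18n`, so `DictTranslations`, `CATALOGS`, and `render_edit_button` are hypothetical names, and the French string is only an example. In Warehouse the lookup would presumably go through `request.localizer.translate` instead.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """In-memory catalog; a real setup would load compiled .mo files."""

    def __init__(self, catalog):
        super().__init__()
        self._dict_catalog = catalog

    def gettext(self, message):
        # Fall back to the original (English) message when untranslated.
        return self._dict_catalog.get(message, message)


# Hypothetical catalogs keyed by locale name; "en" needs no entries.
CATALOGS = {
    "en": DictTranslations({}),
    "fr": DictTranslations({"Edit Project": "Modifier le projet"}),
}


def render_edit_button(locale_name):
    # Unknown locales fall back to English, the default at launch.
    _ = CATALOGS.get(locale_name, CATALOGS["en"]).gettext
    # Wrapping the literal in _() is what marks it for extraction.
    return "<button>{}</button>".format(_("Edit Project"))
```

Calling `render_edit_button("fr")` looks the message up in the French catalog, while any locale without a catalog gets the English source string back.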

The hard parts here as I see it are:

  • identifying all the strings which require localization
  • updating them (and all the tests that this will break, which will be a lot)
  • figuring out how this will affect caching
  • actually getting the strings translated
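
One way to picture the caching concern: once a page can render in several languages, a cache keyed only on the URL would serve one locale's output to everyone. A minimal sketch, assuming a hypothetical in-process `PageCache` and `render` callback (neither reflects Warehouse's actual caching layer):

```python
class PageCache:
    """Toy cache illustrating why the locale must be part of the cache key."""

    def __init__(self):
        self._store = {}
        self.renders = 0  # counts cache misses, for illustration

    def get(self, path, locale_name, render):
        key = (path, locale_name)  # keying on path alone would mix languages
        if key not in self._store:
            self.renders += 1
            self._store[key] = render(path, locale_name)
        return self._store[key]


def render(path, locale_name):
    # Stand-in for rendering a template with a localizer.
    greeting = {"en": "Welcome", "fr": "Bienvenue"}.get(locale_name, "Welcome")
    return f"{greeting}: {path}"
```

For an HTTP-level cache, the analogous step would be varying the cached response on whatever request header or cookie selects the language.
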
@dstufft
Member

@dstufft dstufft commented Feb 13, 2018

figuring out how this will affect caching

This is a large part of why, when I disabled localization, I ripped it out completely. I also asked a number of people, and the responses I got were mixed, ranging from "absolutely, translate this, it will help non-native English speakers use PyPI" to "It would be nice, but unless you've got the infrastructure to ensure that they stay up to date, and the bulk of the messages stay translated, it'll probably hurt more than it helps" to "don't bother, it's basically impossible to program Python without knowing English anyway".

I am a native English speaker (and speak only English), so I can't really decide between these.

My biggest concern with localization is really just logistics. We're a volunteer project, so we can't pay to have someone ready and able to translate new or changed strings; we have to rely on the community to do it. That likely means that, at best, we end up with partial translations over time: people help and translate a large portion of the text, then lose interest or no longer have the time to contribute, and the translation for that language starts to bitrot. This would apply both to new strings (which would need to be translated fresh) and to modified strings (which would need a double check to ensure they still make sense with the modifications).

The other issue is just the quality of the translation itself. From my dabbling in this before, I've noticed that people who know two or more languages are often eager to help out and submit translations, but either their proficiency in one of the languages is lacking, or translating takes a different skillset than just comprehension/fluency, and they end up submitting technically correct but very poor or confusing translations. These translations are next to impossible for me to review (and possibly for anyone currently on the team? I'm going to guess everyone currently working on the team is a native English speaker), so it's unlikely to be something any of us are capable of reviewing.

So at a minimum, I think if we're going to offer localization we're going to need someone to take ownership of each language we add. This someone would need to have experience doing localization and a strong background in both English and the target language. By taking ownership they'd effectively be making a commitment to ensuring that their language stays translated with high quality translations (that doesn't mean that they have to personally do it; they could recruit other people, for instance, and that's fine).

Until we have such a person (or persons), I don't think it makes sense to bother worrying about the technical side of making localization possible.

@nlhkabu
Member

@nlhkabu nlhkabu commented Feb 14, 2018

I agree with Donald on this one. I think the first step here is to set up the social infrastructure to support i18n. IMO, we should look for an "owner" for each language: someone who could either edit/review translations, or write them. Each owner should understand that they are committing to the project and that if they want to walk away, they will need to find someone to hand over to (of course, we can help them with this).

We might even want to recruit someone to manage the whole subject - someone to notify each language owner when new translations are needed, to help recruit translators and generally ensure that we don't end up with translation rot.

I am currently working with a translations specialist at PeopleDoc (my day job). One thing that has been really strongly emphasised is the importance of providing context to the translators. Basically, our translation specialist has asked that we annotate each string in the template to describe where it is and what it is doing. This helps the translators understand things such as whether a word is a verb, an adjective or a noun. A good example is the word "complete", which could be either an action or a status. I think that doing this would address the vast majority of the quality concerns Donald raised earlier.
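
To make the "complete" example concrete, here is a minimal sketch of context-tagged messages. It relies on the `pgettext(context, message)` convention from the standard library's `gettext` module (Python 3.8+) rather than on anything Warehouse-specific; `ContextTranslations`, the context names `"button"` and `"status"`, and the French strings are all illustrative assumptions.

```python
import gettext


class ContextTranslations(gettext.NullTranslations):
    """In-memory catalog keyed by (context, message)."""

    def __init__(self, catalog):
        super().__init__()
        self._ctx_catalog = catalog

    def pgettext(self, context, message):
        # Return the message unchanged when no translation exists.
        return self._ctx_catalog.get((context, message), message)


french = ContextTranslations({
    ("button", "Complete"): "Terminer",  # verb: the action of completing
    ("status", "Complete"): "Terminé",   # adjective: the state of being done
})

button_label = french.pgettext("button", "Complete")
status_label = french.pgettext("status", "Complete")
```

The same English source string resolves to two different translations because the context travels with the message, which is the kind of information a translator would otherwise have to guess.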

If we are going to kick this off, may I suggest French as the first candidate? I work for a French company and have good connections with the French Python community, so it could be a good starting point. May I also suggest we leave RTL languages for last? We should be able to convert our CSS to RTL thanks to https://www.npmjs.com/package/postcss-rtl, but I'd rather tackle this down the line.

@brainwane
Member

@brainwane brainwane commented May 2, 2018

I think French as a first candidate is wise!

RTL as a later standalone project makes some sense to me -- perhaps under the Google Summer of Code umbrella (I'm influenced by Moriel Schottlender here).

I'll note here, for future readers, that Warehouse is seeking grants or other sponsorship-type funding to work on researching and implementing l10n (as noted on distutils-sig).

(Another note: came across https://medium.com/@thejameskyle/the-language-of-programming-7983b8f6910d which recommends Crowdin.)

@0101011

@0101011 0101011 commented Jan 5, 2019

Signing in here for updates:

Ready to contribute to and own the Russian localization of PyPI infrastructure.

@brainwane
Member

@brainwane brainwane commented Jun 8, 2019

Localization and internationalization work for Warehouse has now been funded by Open Technology Fund, so we plan to have @nlhkabu and @woodruffw work on this task. We'll also get help from volunteers for the bits of work that require fluency in non-English languages or that can easily be split up into small tasks, especially for new contributors.

Our criteria for success:

  • Warehouse developers have access to a localization framework allowing for specifying text localization strings and messages in templates and backend code that can be translated.
  • Chosen localization platform is integrated with PyPI's service or build pipeline to include translations as they are available.
  • Existing text strings in templates, views, and email messages that can be localized are set to use the chosen localization framework.
  • Documentation is added which indicates how Administrators utilize the chosen Internationalization framework and Internationalization service.

Issues we need to resolve to get there:

  • Choose localization framework (for string interpolation with Pyramid templates) #5982
  • Systematically find English-hardcoded UI messages #5983
  • Replace hardcoded UI strings with localizable messages #5984
  • Evaluate and choose localization/translation platform #5985
  • Integrate Warehouse with translation/localization platform service #5986
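
For the "systematically find English-hardcoded UI messages" step, a first pass could be a simple heuristic scan. The sketch below is an assumption about how one might start, not the approach taken in #5983: it flags text sitting between HTML tags that is not already wrapped in a Jinja2 `{% trans %}` block. The names `find_hardcoded_text`, `TEXT_BETWEEN_TAGS`, and `TRANS_BLOCK` are hypothetical, and the regexes will miss cases such as `title=` or `placeholder=` attributes.

```python
import re

# Text that sits between tags (">...<") and contains at least one letter.
TEXT_BETWEEN_TAGS = re.compile(r">([^<>{}]*[A-Za-z][^<>{}]*)<")
# Jinja2 i18n blocks whose contents are already marked for translation.
TRANS_BLOCK = re.compile(
    r"\{%-?\s*trans\b.*?%\}.*?\{%-?\s*endtrans\s*-?%\}", re.DOTALL
)


def find_hardcoded_text(template_source):
    """Return candidate untranslated strings in a template (heuristic only)."""
    # Drop already-marked blocks so their contents are not reported.
    unmarked = TRANS_BLOCK.sub("", template_source)
    return [m.group(1).strip() for m in TEXT_BETWEEN_TAGS.finditer(unmarked)]
```

Run over each file under the templates directory, this would produce a worklist of candidate strings to review by hand.
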
@brainwane
Member

@brainwane brainwane commented Jun 23, 2019

Once we finish those 5 issues above, given that this work is funded by Open Technology Fund, we can probably get volunteer translator help from OTF's Localization Lab, which is "looking for translators and projects!"

@brainwane
Member

@brainwane brainwane commented Jun 24, 2019

And potentially help from the PSF's Translation WG which aims to do translations for Python & Python documentation in general.

@brainwane
Member

@brainwane brainwane commented Sep 2, 2019

Update from a meeting last week: we are on track to deliver tooling throughout September, and will probably start recruiting our first translators in mid-September so that we can aim for at least one complete or nearly-complete translation set (possibly French) by the end of September.

@woodruffw
Collaborator

@woodruffw woodruffw commented Sep 11, 2019

For tracking purposes: #6535 contains the localization skeleton and initial tagging of translatable strings in views.

@nlhkabu
Member

@nlhkabu nlhkabu commented Sep 20, 2019

Closing as per #6535. I will open a new ticket for reaching out to community members for translations.

@nlhkabu nlhkabu closed this Sep 20, 2019
7 participants
@dstufft @di @brainwane @woodruffw @nlhkabu @0101011 and others