Integrate Internationalization Add-On Into Core; Add interface for translation of site interface #1505

Closed
10 tasks
aembler opened this issue Nov 11, 2014 · 28 comments

@aembler
Member

aembler commented Nov 11, 2014

This has a number of components.

  • Create an entire new section in the Dashboard > System and Settings area for "Multilingual"
  • Move existing sections from the current Internationalization add-on into here. Make sure they save their data. Don't worry about how they work yet.
  • Swap out functions that were performed by Zend Locale in favor of Punic, since Zend Locale is no longer included in the core.
  • Create new files in src/Localization for certain helpers and classes that were in helpers, models and libraries in the old add-on.
  • Create hooks in the application bootstrap that run, instead of using the events system to detect pages
  • Migrate the choose language block. Make sure it uses new cookie classes, etc...
  • Create a new dashboard page /dashboard/system/multilingual/translate_interface.
  • Create site parsing code that builds site.po files (like we used to) and then imports this content into a web interface for translating. Make sure this code parses all application files and packages
  • Add localizable items from mlocati's localizer add-on to this list as well. These things are found in the database but there's no way to translate them in the core currently. Things like group names, etc... Make sure the web-based .po editor has access to these strings too.
  • Do final pass and ensure that multilingual functionality works completely, for default language, language detection based on browser string, etc...

Unknowns

  • Is there a decent free .po editor we can bundle with the core? Do we have to roll our own? Can we parse .po files or is this a db-backed solution?
  • Does anything need to happen to Punic in order for this to be possible?

Pull Requests

  • Please push pull requests to the feature/integrate-multilingual branch
@aembler aembler self-assigned this Nov 11, 2014
@aembler aembler added this to the 5.7.3 milestone Nov 11, 2014
@mlocati
Contributor

mlocati commented Nov 11, 2014

Create site parsing code that builds site.po file

Here it's not enough to call xgettext, since there are also translatable strings taken from .xml files and from file names (e.g. the block custom template names).
The script that currently extracts all these strings and pushes them to Transifex is my i18n.php

EDIT: the relevant part that extracts translatable strings is https://github.com/mlocati/concrete5-build/blob/master/i18n.php#L1422-L1568

@aembler
Member Author

aembler commented Nov 11, 2014

Yeah, I figured we'd need a hybrid script that did a few things, including get the items for the core we know we need, and parses site files.

@mlocati
Contributor

mlocati commented Nov 12, 2014

Another relevant aspect is the language detection.
In case of multilingual sites, we may have both a user language and a page language, and it would be great if concrete5 could offer an option (global or user-specific or both) to specify which one should come first.
Indeed:

  1. sometimes we may prefer to give precedence to the user language, for instance when there are pages in a language that the user really does not understand (when I see Japanese pages I really do not understand anything and I really need to see the c5 interface in my own language)
  2. sometimes we may prefer to give precedence to the site language, for instance to see the blocks messages in the page language

Ideally, we should handle two "current" languages, one for the page and one for the user, using the first for page-only stuff (block output, for instance) and the second for user-only stuff (the block editing interface, for instance). Since that would be really messy, I'd suggest using that option to specify which language should come first.
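
A minimal sketch of what such a precedence option could look like, written in Python for brevity rather than as concrete5 code. The function name `resolve_locale`, the `precedence` values and the `en_US` default are all assumptions made for illustration; nothing here is taken from the core.

```python
from typing import Optional


def resolve_locale(page_locale: Optional[str],
                   user_locale: Optional[str],
                   precedence: str = "page",
                   default: str = "en_US") -> str:
    """Pick the active locale from a page locale and a user locale.

    `precedence` is the hypothetical global or per-user option:
    "page" tries the page locale first, "user" tries the user locale first.
    """
    if precedence not in ("page", "user"):
        raise ValueError("precedence must be 'page' or 'user'")
    if precedence == "page":
        ordered = (page_locale, user_locale)
    else:
        ordered = (user_locale, page_locale)
    # Use the first locale that is actually set, else the site default.
    for candidate in ordered:
        if candidate:
            return candidate
    return default
```

With `precedence="user"`, a Japanese page viewed by a user whose language is `it_IT` would resolve to `it_IT`; with `precedence="page"` it would resolve to `ja_JP`.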

@ahukkanen
Contributor

Very good points from @mlocati above; this is exactly the aspect that makes me think integrating the multilingual functionality into the core is a major decision for the better. To add to that: currently, if you edit a form block in 5.6 and the interface language differs from the site language, the form block uses the c5 interface language instead of the selected multilingual section when it updates its view. This causes the translated texts to show up in the wrong language, which confuses editors (e.g. if you have written "* Required field" at the top of the form).

I think it should be separated into the c5 interface locale and the site locale, but a bit more specifically than is done currently. By this I mean that when the site's view template is loaded, it would use the site-specific locale, but when anything related to the c5 interface is loaded, it would use the interface locale.

One more point regarding language detection when users arrive at the site: when the language is selected based on user attributes (currently only the browser language in the multilingual add-on), there is no check of whether the user has permission to view that language section. This might be problematic when the language versions are not published at the same time and some languages are still being worked on. It would be great if it checked:

  1. Whether the user has access to the primarily selected language front page
    and
  2. Whether there are any other language sections with the user's language

Then the fallback would be to the default language.
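
The check order described above could be sketched roughly like this (Python used for brevity). `pick_section`, the `can_view` callback and the use of plain locale strings as section identifiers are assumptions for illustration, not the add-on's actual API.

```python
from typing import Callable, Sequence


def pick_section(browser_locales: Sequence[str],
                 sections: Sequence[str],
                 can_view: Callable[[str], bool],
                 default_section: str) -> str:
    """Choose a language section for a visitor.

    `sections` holds the locale of each language section, e.g. "en_US".
    `can_view` is a hypothetical permission check for a section's front page.
    """
    for wanted in browser_locales:
        # 1. The primarily selected section, if the user may view it.
        if wanted in sections and can_view(wanted):
            return wanted
        # 2. Any other viewable section in the same language.
        language = wanted.split("_")[0]
        for section in sections:
            if section.split("_")[0] == language and can_view(section):
                return section
    # Nothing matched: fall back to the default language section.
    return default_section
```

For example, if `en_US` is still unpublished for the visitor but `en_FI` is viewable, a browser asking for `en_US` would land on `en_FI` instead of a page the visitor cannot see.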

@KorvinSzanto
Member

I created a branch feature/internationalization, please make your pull requests against that.

Thanks!

Korvin Szanto

On November 25, 2014 at 2:09:11 PM, Antti Hukkanen (notifications@github.com) wrote:

Should we open a new branch for this or should we commit into the develop branch?

I think I can reserve some time for this tomorrow.



@ahukkanen
Contributor

Thanks! Yeah, I noticed, and removed my comment at the same time.

I'll be working on this tomorrow, so if anyone else has similar plans, keep this issue posted. I'll make the pull requests to the feature/integrate-multilingual branch.

@KorvinSzanto
Member

My bad, looks like we already have a branch. Everyone reading this, use feature/integrate-multilingual, forget feature/internationalization ever existed :)

Korvin Szanto


@aembler
Member Author

aembler commented Nov 25, 2014

Yeah, we do – although I don't think much work's been done in it yet. That would definitely be the branch to use.


@aembler aembler modified the milestones: 5.7.3, 5.7.4 Dec 5, 2014
@mlocati
Contributor

mlocati commented Dec 10, 2014

@aembler Sorry for not being able to give some help in this period (I'll be very busy until the first days of the new year). In the meanwhile, just a hint: what about adding an option to specify the precedence of the language?
I mean, we may have a page language (set by multilingual) and a user language (set by the uDefaultLanguage field of the Users table). We often see people who want to give precedence to the first one or to the second one...

@aembler
Member Author

aembler commented Dec 10, 2014

We're trying to get this out the door before Christmas, so right now we're just focused on getting the existing functionality working in 5.7.3.

I do have a question though – right now I've gotten the setup dashboard screen working for multilingual in the feature/integrate-multilingual branch, and I've got a list of languages (through Punic) but the list of languages is pretty low, and there are a large number of countries that say that they speak English. Is this right? It's a pretty different list than what was in Zend_Locale.

Also, does Punic have any built-in ability to guess a locale based on browser?

Thanks again


@ahukkanen
Contributor

Those are the same issues I tried to figure out when playing with this. I've also done some work but never had the time to finish it up and make a pull request. I'll also be looking into the feature/integrate-multilingual branch in the upcoming weeks to see if I can help.

@aembler We've also had cases where we needed to set a language for a country that does not officially speak that language. E.g. on one site the client needed one English version targeted at Finnish people and another English version targeted at the rest of the world. While it was possible, we couldn't do it through the UI; we needed to touch the DB by hand.

I think we would be better off with two separate dropdowns: in the first one you select the language and in the second one you select the country. I've also seen some other systems provide functionality like that.

@ahukkanen
Contributor

And while on this, I might also throw in another idea regarding the implementation that I've had in my mind. I've also briefly mentioned this in the forums but don't know if anyone has noticed / given any notice to it.

IMO, it would be quite useful if we could define several "language roots", like we currently have the "Home" root page. And a single locale would only be limited within a single "root", so you could have several "en_US" languages in your sitemap, as long as they are under different "roots".

This would be especially useful in some bigger cases where the client might manage multiple "branches" within a single installation. It would add some complexity to the implementation but as a feature it would be very useful.

@aembler
Member Author

aembler commented Dec 10, 2014

We could certainly provide two separate dropdowns: one which gives you a language and one which gives you a locale (which would populate your flag icon). When you had your custom setup, what was the underlying locale? en_FI? Right now we use the flag icon that we selected to populate msLocale, which we probably use for any number of things.

@ahukkanen
Contributor

Yes, the locale was en_FI in that case.

@aembler
Member Author

aembler commented Dec 10, 2014

Interesting. So maybe it makes the most sense to do this. For each section:

  1. Choose a language
  2. Choose a country
  3. The flag icon is chosen from the country selection.
  4. msLocale is auto computed from these two fields.

Something like that?

Also I like the idea of multiple language roots – would be interested in seeing something like that down the line.
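
A rough sketch of the four steps listed above, in Python for brevity. `build_locale` and `flag_icon_name` are hypothetical helpers, and the `.png` file naming is an assumption; only the idea of deriving the locale and flag from a language plus a country comes from this thread.

```python
def build_locale(language: str, country: str) -> str:
    """Compute a locale identifier such as "en_FI" from two selections."""
    language = language.strip().lower()
    country = country.strip().upper()
    # Language codes are two or three letters, country codes two letters.
    if not (language.isalpha() and 2 <= len(language) <= 3):
        raise ValueError("invalid language code: %r" % language)
    if not (country.isalpha() and len(country) == 2):
        raise ValueError("invalid country code: %r" % country)
    return "%s_%s" % (language, country)


def flag_icon_name(country: str) -> str:
    """Derive a flag icon file name from the country (naming is assumed)."""
    return country.strip().lower() + ".png"
```

This keeps combinations like English for Finland possible: `build_locale("en", "FI")` gives `en_FI`, with the Finnish flag coming from the country selection alone.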


@ahukkanen
Contributor

That sounds feasible for me.

About the flag icons, I was also thinking is there any composer library that we could use to get those easily? I briefly played out with this one which provides SVG flags:
https://packagist.org/packages/components/flag-icon-css

But the flag images in that looked like crap: the proportions were off in quite a few flags and the colors were somewhat too bright (maybe because of the SVG implementation).

@ahukkanen
Contributor

And by the way, if we could also indicate the country + language combination in the sitemap, it would help in those kinds of cases, if it's possible to make it look good somehow. On that site we had three pages with the Finnish icon next to them, so it was a bit confusing, although you could also see the front page name next to the icon.

@ahukkanen
Contributor

Maybe something like this?
[icon] Front (en_US)
[icon] Etusivu (fi_FI)
etc.

@mlocati
Contributor

mlocati commented Dec 11, 2014

@aembler

the list of languages is pretty low

Yes, by default Punic includes a subset of all the CLDR data (we use json.zip and not json-full.zip of http://unicode.org/Public/cldr/26/ ). Including all the data would lead to a much bigger size of Punic. In my to-do list I have to implement a solution for this (aggregate duplicated data), but I think I won't be able to accomplish it by the end of this year.

there are a large number of countries that say that they speak English

I took a look at the getLanguageCountries method of Concrete\Core\Localization\Service\LanguageList, and the way it finds the countries where a language is spoken is correct but incomplete: it finds only the countries for which Punic has data for a specific language.
For instance, Punic comes with data for Italian (Italy), so getLanguageCountries returns that Italian is spoken only in Italy. But the CLDR data itself (in the supplemental/languageData.json file) says that Italian is spoken in Italy, Switzerland and San Marino (and that's more complete).
I'm going to implement a new method to get the countries where a language is spoken shortly.

does Punic have any built-in ability to guess a locale based on browser?

Not right now, but I'm going to implement it shortly.

EDIT: Including all the CLDR data with the current approach of Punic would raise the Punic installation size from 4MB to 60MB. I really need to find a solution 😉

@mlocati
Contributor

mlocati commented Dec 11, 2014

@aembler I just added new stuff that concrete5 could use in the getLanguageCountries method: Punic\Territory::getTerritoriesForLanguage(). It returns the list of countries where a language is spoken. concrete5 should switch to Punic 1.2.x in order to use this new method

EDIT: see https://github.com/punic/punic/releases/tag/1.2.0

@mlocati
Contributor

mlocati commented Dec 11, 2014

@aembler I also added Punic\Misc::getBrowserLocales to detect the browser locales (see https://github.com/punic/punic/releases/tag/1.2.1 )
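
For readers unfamiliar with browser locale detection, here is a minimal Python sketch of the general technique: parsing the `Accept-Language` request header and ordering the entries by their `q` weight. It is not Punic's implementation; `parse_accept_language` and `normalize_tag` are hypothetical names, and the normalization keeps only a language and an optional two-letter region.

```python
from typing import List, Tuple


def normalize_tag(tag: str) -> str:
    """Turn a tag like "en-us" into "en_US"; keep only language and region."""
    parts = tag.replace("_", "-").split("-")
    language = parts[0].lower()
    for part in parts[1:]:
        if len(part) == 2 and part.isalpha():
            return "%s_%s" % (language, part.upper())
    return language


def parse_accept_language(header: str) -> List[str]:
    """Return locales from an Accept-Language header, best match first."""
    entries: List[Tuple[float, int, str]] = []
    for position, item in enumerate(header.split(",")):
        pieces = [piece.strip() for piece in item.split(";")]
        tag = pieces[0]
        if not tag or tag == "*":
            continue
        quality = 1.0  # a missing q parameter means full weight
        for param in pieces[1:]:
            if param.lower().startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0
        if quality <= 0:
            continue  # q=0 means "not acceptable"
        # Sort by descending quality, then by original header order.
        entries.append((-quality, position, normalize_tag(tag)))
    entries.sort()
    return [tag for _, _, tag in entries]
```

The resulting list can then be matched against the site's language sections in order, falling back to the default language when nothing matches.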

@aembler
Member Author

aembler commented Dec 12, 2014

Awesome. I have updated composer and I will check this out soon.


@mlocati
Contributor

mlocati commented Dec 12, 2014

@aembler I just added the new Punic\Language::getAll() method to the new Punic 1.2.2 version.
Although the whole set of languages is not supported by the current Punic data (for instance, we don't have date formatting for the Afar language), you may want to use that new method to retrieve the selectable languages.
To better understand this new method you can also take a look at the tests in https://github.com/punic/punic/blob/master/tests/Language/LanguageTest.php

@aembler
Member Author

aembler commented Dec 12, 2014

I've updated the get language method to use this, and have integrated the browser locales. Thanks!

@Hypocrite

Regarding the unknowns about the .po editor, this might be useful:
http://pootle.translatehouse.org/index.html

@aembler
Member Author

aembler commented Dec 17, 2014

Version 1 of this is done and now integrated into develop. I rolled our own (very basic) po editor that uses https://github.com/oscarotero/Gettext for its backend.

@mlocati
Contributor

mlocati commented Dec 17, 2014

Great!!!! I'm very sorry for not having time to give some help, it's a really busy period (but it's going to finish, fortunately).

@aembler
Member Author

aembler commented Dec 18, 2014

No problem! Thanks for taking a look.

