Accept different country variants #11

Closed
rolblo12 opened this issue Apr 18, 2015 · 45 comments

@rolblo12

Firefox sends the language code, including any country code, entirely in lower case. Localeurls uses this code as-is to set the Yii language.

This is ok for language only. However, if you have country specific language variants, Yii specifies them as upper case (ISO-3166). Also ICU requires the country codes as upper case. Furthermore, if you use the JUI Datepicker, its language specific javascript files have the country code in upper case in their name.
So I suppose converting any country code to upper case before it goes into the Yii language would help.
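A minimal sketch of such a conversion, assuming codes consist only of a language and an optional country part. `normalizeLanguageCode` is a hypothetical helper for illustration, not part of Localeurls:

```php
<?php
// Hypothetical helper (not part of Localeurls): lower-cases the language
// part and upper-cases the country part, e.g. "de-at" becomes "de-AT".
function normalizeLanguageCode(string $code): string
{
    // Accept "-" or "_" as separator and split into at most two parts.
    $parts = preg_split('/[-_]/', trim($code), 2);
    $language = strtolower($parts[0]);
    if (!isset($parts[1]) || $parts[1] === '') {
        return $language; // language only, no country part
    }
    return $language . '-' . strtoupper($parts[1]);
}

echo normalizeLanguageCode('de-at'), "\n"; // de-AT
echo normalizeLanguageCode('EN'), "\n";    // en
```

Codes with a script subtag (for example zh-Hans-CN) are outside the scope of this sketch.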

Another issue is that Localeurls does not fall back from an unsupported language variant to the language without variant. So if I support the 'de' translation, I get a page-not-found error if the request has 'de-at' as the language code. It could silently switch to 'de' instead.

@mikehaertl
Collaborator

So this only affects the enableLanguageDetection feature, right? This is where the language is set according to the Accept-Language header (see getAcceptedLanguage()).

I would not extend this to the URL language detection (like http://example.com/de-at/some/page). This case could be solved via a mapping in the $languages configuration.

@rolblo12
Author

My simple web site supports en (the default en-US) and de, es, and pt languages (homewebtest.com). For a quick (ugly) workaround I have now added ALL variants of those 4 languages in lower case AND upper case to the $languages array. If you don't add all variants of for example 'en', then someone who has his browser language set to 'en-au' will get a 404 error page from Yii.

If you replace a supported language (for example 'es-ES') with a not supported language (which is not included in $languages, for example 'fr'), you get an url like 'es-ES/fr/...' which gives you another 404 error page. I would prefer to get the page in the default language without error instead.

My simple web site does not yet use the Yii2 formatter. I fear the formatter might need the country part of the locale in upper case, as documented in the ICU description (userguide.icu-project.org/locale#TOC-The-Locale-Concept).

@rolblo12
Author

Hmm.. I obviously clicked the wrong button. I did not intend to close the whole issue..

@rolblo12 rolblo12 reopened this Apr 21, 2015
@mikehaertl
Collaborator

Not sure if I understand, but why don't you configure only the country and map it to the full code:

'at' => 'de-AT',
'pt' => 'pt-PT',
'br' => 'pt-BR',

It seems we are now mixing several issues into one here:

  • Browser sends de-at but we need de-AT
  • Fallback if de-at is sent, but only de is configured
  • Fallback if fr is sent, but fr is not configured at all.

@rolblo12
Author

The mapping 'at' => 'de-AT' etc. might be a better solution than my workaround. Let me try it...

I agree with the 3 cases.

@rolblo12
Author

Sorry to inform you that the mapping ['en' => 'en-US', 'de' => 'de-DE', 'es' => 'es-ES', 'pt' => 'pt-PT'] is not enough: the language variants are not recognized.
For example: test.homewebtest.com/en-au becomes: test.homewebtest.com/-au, which gives a fat 404..

@mikehaertl
Collaborator

For example: test.homewebtest.com/en-au becomes: test.homewebtest.com/-au, which gives a fat 404..

If you configure 'en' => 'en-US' then you can't use a URL like /en-au. You should instead use a URL like /en and this should give you the language en-US. It's possible that there's still a bug, though. But your example doesn't make sense to me.

@rolblo12
Author

Actually, it was your suggestion to map the languages like that, wasn't it?

I have some friends Down Under and their browser just sends 'en-au' as the acceptable language...

@mikehaertl
Collaborator

Yes, but why do you still use something like en-au in your URL? You should only use au and add a mapping like 'au' => 'en-AU' to your languages.

@rolblo12
Author

Well, if I do that for all the language variants, then I have the same long list as my current workaround.

@mikehaertl
Collaborator

That's unavoidable. The extension needs to know which languages are supported. So you need to list them explicitly.

@mikehaertl
Collaborator

Maybe I still don't completely understand your problem. Could you maybe provide some simple examples? Each with the following information:

  • What is the URL being called?
  • What is the Accept-Language header (if it matters at all)
  • What would you expect to happen?

@mikehaertl
Collaborator

@rolblo12 I've added a little fix that could improve the situation. Say you have the following language configuration:

'pt',
'br' => 'pt-BR'

And someone sends a request with the following header:

Accept-Language: pt-pt

Now it will search the following combinations:

  • pt-pt
  • pt-PT
  • pt

And use the first one that is found. Maybe you could help test this? It's on master.
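The lookup order above could be sketched roughly as follows. This is an illustration under assumptions, not the code on master; `findConfiguredLanguage` is a hypothetical name, and the sketch only returns the matching language value, ignoring URL aliases:

```php
<?php
// Hypothetical sketch of the search order: the code as sent, then the code
// with an upper-case country, then the bare language.
function findConfiguredLanguage(string $accepted, array $languages): ?string
{
    $parts = explode('-', $accepted, 2);
    $language = strtolower($parts[0]);

    $candidates = [$accepted];
    if (isset($parts[1])) {
        $candidates[] = $language . '-' . strtoupper($parts[1]);
    }
    $candidates[] = $language;

    foreach ($candidates as $candidate) {
        // in_array() checks the values, so both plain entries ('pt')
        // and alias targets ('br' => 'pt-BR') are considered.
        if (in_array($candidate, $languages, true)) {
            return $candidate;
        }
    }
    return null; // nothing configured for this header value
}

$languages = ['pt', 'br' => 'pt-BR'];
echo findConfiguredLanguage('pt-pt', $languages), "\n"; // pt
echo findConfiguredLanguage('pt-br', $languages), "\n"; // pt-BR
```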

@rolblo12
Author

Thanks for your fix.

Test

I uploaded the fix to my test site, but left the configuration as is (all language variants (upper and lower case) of en, de, es, and pt are still in the $languages list).
My browser's acceptable languages are de-at, en-au, de, en. Then I deleted all the site's cookies in Firefox.

  1. Loading the base url test.homewebtest.com gives the url test.homewebtest.com/de. Why not de-AT?
  2. When I edit the url in Firefox to .../en-us and load it, test.homewebtest.com/en-us is the result. Only when I edit the url to ../en-US, the locale vanishes from the url. Still the upper/lower case issue.
  3. Loading test.homewebtest.com/fr leaves the url as is and shows the 404 error page. I would rather have the page in the default language instead (no error page).
  4. After loading test.homewebtest.com/es-ES/site/index, I edit the locale in the url line of Firefox to the unsupported fr. After loading that, the browser shows test.homewebtest.com/es-ES/fr/site/index and the 404 error page. I would rather have the page in the default language instead (no error page).

Database language

I am working on another webapp that stores timestamps in MySQL integer columns. I have to convert these integers to human-readable dates. Especially for filter criteria, I now use FROM_UNIXTIME() to compare them with the user's filter value.

This only works correctly, if I run the SQL statement SET lc_time_names = :dblang at the beginning of request processing. Alas, MySQL does not accept the language code only, it needs a full language variant for :dblang:
So en does not work, but en_AU does work.

Therefore I will need a fallback from the language code alone to a reasonable language variant as well. Or maybe the possibility to forbid locales without country variant..

@mikehaertl
Collaborator

About your questions:

  1. Well, that's expected. As I've said before, it will only set one of the languages that you have defined. If de-AT is not in your $languages, it will never be set. I could change that, but I'm not really sure about the implications. Is it really a good idea to set de-AT, even if you didn't configure that language?
  2. I will have to look into this. I've also written some test cases now that check for the common cases.
  3. I can not fix this. Say you have the URL /ok. How should I know, if this is a valid page URL or if this is a language code? If the language code is found in $languages everything is fine. But if not, it could be a valid page URL. Unfortunately I can not know at that point.
  4. Not sure if I understand. Could you please show the full configuration of your $languages?

About your DB problem: I think you need to solve this in your code. Like: Check the value of Yii::$app->language and convert it to a full code somehow. This is out of the scope of this extension.
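One way such a conversion might look in application code is sketched below. `toMysqlLocale` and the `$defaults` map are hypothetical and the default countries are illustrative assumptions; in a real application the input would come from Yii::$app->language:

```php
<?php
// Hypothetical application-side helper (outside the extension): turns a
// Yii language such as "de" or "en-au" into a MySQL-style locale name.
// $defaults maps a bare language to the country to assume (an assumption
// made for this example only).
function toMysqlLocale(string $yiiLanguage, array $defaults): ?string
{
    $parts = explode('-', $yiiLanguage, 2);
    $language = strtolower($parts[0]);

    if (isset($parts[1]) && $parts[1] !== '') {
        // MySQL locale names use an underscore, e.g. en_AU.
        return $language . '_' . strtoupper($parts[1]);
    }
    // Bare language: fall back to the configured default country, if any.
    return isset($defaults[$language])
        ? $language . '_' . $defaults[$language]
        : null;
}

$defaults = ['en' => 'US', 'de' => 'DE', 'es' => 'ES', 'pt' => 'PT'];
echo toMysqlLocale('en-au', $defaults), "\n"; // en_AU
echo toMysqlLocale('de', $defaults), "\n";    // de_DE
```

The result could then be bound as :dblang in the SET lc_time_names statement. The sketch does not check whether MySQL actually knows the resulting locale.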

@rolblo12
Author

Regarding 1: You are right, de-AT is not in the list, but de-at is:
'languages' => ['en-US', 'en-us', 'en-gb', 'en-au', 'en-bz', 'en-ca', 'en-ie', 'en-nz', 'en-ph', 'en-za', 'en-tt', 'en-zw', 'de-DE', 'de-de', 'de-at', 'de-li', 'de-lu', 'de-ch', 'es-ES', 'es-es', 'es-ar', 'es-bo', 'es-cl', 'es-co', 'es-cr', 'es-do', 'es-ec', 'es-sv', 'es-gt', 'es-hn', 'es-mx', 'es-ni', 'es-pa', 'es-py', 'es-pe', 'es-pr', 'es-uy', 'es-ve', 'pt-PT', 'pt-pt', 'pt-br', 'en', 'de', 'es', 'pt'],

Nevertheless, thank you very much for your extension.
I will modify it for my other project as needed. However, the modified one might not be compatible with yours any more.

@mikehaertl
Collaborator

I will have another look at this. So unless you are in a hurry, you can wait for a fix. I just don't want to rush it; I want to consider all the possible implications before I work on a fix.

@mikehaertl
Collaborator

How about this:

We add two new options: acceptAnyLanguage and acceptAnyCountry (the latter is only in effect if acceptAnyLanguage is true). They do pretty much what they promise. It will still first check $languages (e.g. to check for aliases), but if the language is not found there and acceptAnyLanguage is true, it will check for a 2-letter language code. Or, if acceptAnyCountry is also set, it will also accept ll-CC types of codes.

The same will be in effect for autodetecting the language from browser settings.

What do you think, would this solve your problem?

@mikehaertl
Collaborator

After more thinking maybe this is a better idea:

We allow de-* in the languages configuration, like 'languages' => ['en', 'es-*', 'de']. This means that for es we also accept any country codes in the URL.

Regarding the fallback to the default language I'll open another issue, as this is something different and I want to keep the discussion a bit more focussed on one problem at a time :)

@rolblo12
Author

The possibility to use 'de-*' is a great idea. It should match 'de', 'de-at', and 'de-AT' (upper/lower case) though.

Regarding a fall-back, I could imagine to have 'de-DE', 'de-*' in the list. Then you can always search for the first entry starting with 'de-' to obtain the fall-back for the database and/or the formatter.
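A rough sketch of that search, assuming the list holds plain entries like 'de-DE' next to wildcard entries like 'de-*' (`firstVariantFor` is a hypothetical name used only for this example):

```php
<?php
// Hypothetical helper: returns the first concrete variant configured for a
// language, e.g. 'de-DE' for 'de', to use for the database or formatter.
function firstVariantFor(string $language, array $languages): ?string
{
    $prefix = strtolower($language) . '-';
    foreach ($languages as $entry) {
        // Wildcard entries such as 'de-*' name no concrete country.
        if (substr($entry, -2) === '-*') {
            continue;
        }
        // Case-insensitive prefix match, so 'de-at' and 'de-AT' both count.
        if (stripos($entry, $prefix) === 0) {
            return $entry;
        }
    }
    return null;
}

echo firstVariantFor('de', ['en-US', 'de-DE', 'de-*']), "\n"; // de-DE
```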

Easier to configure would be to have only 'de-DE' in the list and let the software allow 'de-*' implicitly. Hmm, maybe only if a configuration option is set.

@mikehaertl
Collaborator

Sorry, what do you mean by fallback? Is that what I have now moved to #12?

The de-* would work as you said: It accepts de and any de-?? country. If the country is in uppercase (de-AT), it would redirect to the lowercase (de-at) URL. Internally it would still set de-AT in this case.

So in your case you would only have a couple of xy-* entries left in your list: 'de-*', 'en-*', 'es-*'.

@rolblo12
Author

Yes, the less configuration, the better. Actually you have the default 'translation' texts anyways, so 'en-*' (or whatever the default language is) could be in the list without explicit configuration..

@mikehaertl
Collaborator

@rolblo12 I've pushed the updated version. Maybe you can help testing again?

@rolblo12
Author

Thank you for the update. The Home Web Test Team is at your service ;-)

Have grabbed the latest LocaleUrls.php and uploaded to test.homewebtest.com.
Have updated the configuration to 'languages' => [ 'en-*', 'de-*', 'es-*', 'pt-*' ],.
Firefox accepts en-au, de-at, en, de.
Removed all site specific cookies.

Test

  1. Loaded test.homewebtest.com -> Firefox shows the same URL (without any language part), the index page shows up in English. -> OK
  2. Loaded http://test.homewebtest.com/site/index -> Firefox shows the new URL (without any language part), the index page shows up in English. -> OK
  3. Clicked link to contact page -> Firefox shows the URL http://test.homewebtest.com/site/contact (without any language part), the contact form shows up in English. -> OK
  4. Modified the URL in Firefox to contain de-at -> Firefox shows the URL http://test.homewebtest.com/de-at/site/contact, but it shows the 404 error page (in German). -> NOT OK
  5. Modified the URL in Firefox to contain es-es -> Firefox shows the URL http://test.homewebtest.com/es-es/site/contact, but it shows the 404 error page (in Spanish). -> NOT OK
  6. Modified the URL in Firefox to contain es-ES -> Firefox shows the URL http://test.homewebtest.com/es-ES/site/contact, but it shows the 404 error page (in Spanish). -> NOT OK
  7. Modified the URL in Firefox to contain fr (not supported) -> Firefox shows the URL http://test.homewebtest.com/es-/fr/site/contact, but it shows the 404 error page (in Spanish). -> NOT OK (English expected)

The language switch in the index page only shows the English choice (because it relies on the entries 'en-US', 'de-DE', 'es-ES', 'pt-PT'). So I added those entries to the beginning of the $languages array ('languages' => [ 'en-US', 'de-DE', 'es-ES', 'pt-PT', 'en-*', 'de-*', 'es-*', 'pt-*' ],).
Removed all site specific cookies again.

  1. Loaded test.homewebtest.com -> Firefox shows the same URL (without any language part), the index page shows up in English. All language choices are available -> OK
  2. Switched to português -> Firefox shows the URL http://test.homewebtest.com/pt-PT/site/index and the Portuguese index page. -> OK
  3. Clicked Link to contact page -> Firefox shows the URL http://test.homewebtest.com/pt-PT/site/contact, the form page is in Portuguese. -> OK.
  4. Clicked on the logo to go back to the index page -> Firefox shows the URL http://test.homewebtest.com/pt-PT and the Portuguese index page. -> OK
  5. Modified the URL in Firefox to http://test.homewebtest.com/de-at -> Firefox shows the same URL, but it shows the 404 error page (in German). -> NOT OK

Set Firefox to accept de-at, en-au, de, en.
Removed all site specific cookies.

  1. Loaded test.homewebtest.com -> Firefox shows the same URL (without any language part), the index page shows up in English. All language choices are available -> NOT OK (German expected)

@mikehaertl
Collaborator

Ok, another update. Your points 4-6 above should be fixed now. The error in 7) was in spanish, because you probably still had the language in your session / cookie.

@mikehaertl mikehaertl added this to the 1.0.5 milestone Apr 23, 2015
@mikehaertl
Collaborator

Just pushed another little fix, so I hope, you didn't test yet :).

Note: I've also very much improved the test cases now. They should cover pretty much every possible aspect of the component. All the tests succeed. So maybe also take a look there if you think something is still not behaving correctly.

@rolblo12
Author

No, I'm starting now.. We have siesta here in Spain ;-)

@mikehaertl
Collaborator

Crossing fingers here (and may have a beer soon, as is usual in Germany ;)).

BTW thanks for your help so far and for the inspiration to this new feature. It also motivated me to finally write some thorough tests for this extension which already helped me to locate a couple of not so nice bugs. So after all I feel much safer now, when it comes to the quality and reliability of the component.

@rolblo12
Author

  • Have downloaded the latest LocaleUrls.php and uploaded it to test.homewebtest.com.
  • Still have in the configuration 'languages' => [ 'en-US', 'de-DE', 'es-ES', 'pt-PT', 'en-*', 'de-*', 'es-*', 'pt-*' ],.
  • Firefox accepts de-at, en-au, de, en, ca (added Catalan that is not supported by the webapp).
  • Removed all site specific cookies.

Test

  1. Loaded test.homewebtest.com -> Firefox shows the same URL (without any language part), the index page shows up in English. -> NOT OK, I expected it would switch to German (../de-AT)
  2. Loaded test.homewebtest.com/site/index -> Firefox shows the new URL (without any language part), the index page shows up in English. -> OK
  3. Switched to German using the language switch on index page -> Firefox shows the URL test.homewebtest.com/de-de/site/index in German -> OK The language switch has the ?language=de-DE query parameter
  4. Clicked link to the contact page -> Firefox shows the URL test.homewebtest.com/de-de/site/contact, the contact form shows up in German. -> OK
  5. Modified the URL in Firefox to contain de-at -> Firefox shows the URL test.homewebtest.com/de-at/site/contact, the contact form shows in German. -> OK
  6. Modified the URL in Firefox to contain es-es -> Firefox shows the URL test.homewebtest.com/es-es/site/contact, the contact form is Spanish. -> OK
  7. Modified the URL in Firefox to contain es-ES -> Firefox shows the URL test.homewebtest.com/es-es/site/contact, the contact form is Spanish. -> OK
  8. Modified the URL in Firefox to contain fr (not supported) instead of es-es -> Firefox shows the URL test.homewebtest.com/es-es/fr/site/contact, but it shows the 404 error page (in Spanish). -> NOT OK default language (English) expected, no error page
  • Set Firefox to accept ca, de-at, en-au, de, en.
  • Removed all site specific cookies.
  1. Loaded test.homewebtest.com -> Firefox shows the URL test.homewebtest.com/ the index page is English -> OK although it could switch to de-at, because this is the second language in the browser's list
  2. Loaded test.homewebtest.com/fr -> Firefox shows the URL http://test.homewebtest.com/fr, but it is a 404 error page (in English) -> NOT OK no error page expected

@rolblo12
Author

I have seen your unit tests. In PHP I have not yet tried to write unit tests, although in Java I was quite keen to do so ;-)
Nevertheless, it is very good practice to have a set of unit tests, especially for some reusable component. So I think I will try some unit tests for my own stuff now as well ;-)

mikehaertl added a commit that referenced this issue Apr 23, 2015
@mikehaertl
Collaborator

Unbelievable how many bugs there can be in such a simple piece of code. Added more test cases and fixed more issues. I start to wonder how all this worked at all before. :D

Problem 1) should also be fixed now.

About your issue 8) and 2) of your second run: I still consider them invalid use cases, but please keep the discussion for them in #12. They are really a different story. Let's try to cleanly solve this part of the problems here first.

Apart from that I think you should change your $languages configuration. You don't need both de-DE and de-*. The latter de-* really will match anything now: de, de-AT, de-??.

So maybe you're in the mood for another test run. I'm confident that we're close to solving this issue here.

mikehaertl added a commit that referenced this issue Apr 23, 2015
@rolblo12
Author

Actually I had both de-DE and de-* in the list, because my language switcher widget just takes the item from the $languages array, checks whether there is a flag image file with that name and if yes it shows the link to switch the language.
However, I could test for now without the language switcher. I just switch the language by editing the browser URL ;-)

@mikehaertl
Collaborator

Not sure, how this is related to the language switcher. :) The important part is really, that you start with a clean $languages array. I'm not sure about side effects if you configure both, de and de-* and maybe even de-AT.

@rolblo12
Author

Well, it is my own hand-crafted language switcher ;-) It just shows country flags and optionally the language's name. I use that in the index page only.

Nevertheless it would be good to have some unobtrusive dropdown button for the main menu bar to switch languages.

@mikehaertl
Collaborator

Well, as I've said in the README this is too specific. It can vary a lot, how such a dropdown looks, which languages it lists, how they are called etc. That's why I only gave a simple example there.

So for now, could we please focus on the localeUrls component? You should really only add languages there, that you want to support. If all you need is to let users with a browser language of de-at see the de version, then you don't even need the de-* anymore. This is fixed now, so that de-at will fall back to de automatically. So either try:

$languages = ['de-*', 'es-*', 'en-*', 'pt-*'];

or maybe even only:

$languages = ['de', 'es', 'en', 'pt'];

The ll-* only makes sense, if you really provide different translations for, say pt-PT and pt-BR.

@rolblo12
Author

Ah, so I could just use my 'flag'-languages 'languages' => [ 'en-US', 'de-DE', 'es-ES', 'pt-PT' ],. That is great!!!

@rolblo12
Author

  • Have downloaded the latest yii2-localeurls (trunk) and uploaded it to test.homewebtest.com.
  • The configuration is 'languages' => [ 'en-US', 'de-DE', 'es-ES', 'pt-PT' ],.
  • Firefox accepts ca, de-at, en-au, de, en (added Catalan that is not supported by the webapp).
  • Removed all site specific cookies.

Test

  1. Loaded test.homewebtest.com -> Firefox shows the URL http://test.homewebtest.com,
    the index page shows up in English. -> NOT OK I would have expected /de-at because it is
    the second accepted language

@mikehaertl
Collaborator

Ah, so I could just use my 'flag'-languages 'languages' => [ 'en-US', 'de-DE', 'es-ES', 'pt-PT' ],. That is great!!!

No, you misunderstood. If you list de-DE then only de-de or de-DE is accepted. Only if you configure de or de-* then de-at and de-AT will be accepted.

@rolblo12
Author

  • Have downloaded the latest yii2-localeurls (trunk) and uploaded it to test.homewebtest.com.
  • The configuration is 'languages' => [ 'en-*', 'de-*', 'es-*', 'pt-*' ],.
  • Firefox accepts ca, de-at, en-au, de, en (added Catalan that is not supported by the webapp).
  • Removed all site specific cookies.

Test

  1. Loaded test.homewebtest.com -> Firefox shows the URL test.homewebtest.com/de-at, the index page shows up in German. -> OK

@mikehaertl
Collaborator

I'm still a bit confused, what you want to achieve in the end. I think, you would not need the xx-* wildcards at all in your case. If you configure de and the browser sends de-at you will get de, it's that simple.

So can you provide two lists:

  • The full list of languages (either language only or language + country) that your application supports
  • A list, how the languages that browsers accept should be mapped to the entries in the first list.

@rolblo12
Author

continue test

  1. Clicked link to Kontakt -> Firefox shows the URL test.homewebtest.com/de-at/site/contact, the contact page shows up in German -> OK
  2. Edited browser URL to test.homewebtest.com/es/site/contact -> Firefox shows the same URL and the contact page is Spanish now -> OK
  3. Edited browser URL to test.homewebtest.com/es-aa/site/contact -> Firefox shows the same URL and the contact page is still Spanish -> OK. Although es-aa is not a valid language variant, this is fine for most situations. However, MySQL would not accept it (https://dev.mysql.com/doc/refman/5.0/en/locale-support.html).
  4. Edited browser URL to test.homewebtest.com/pt-PT/site/contact -> Firefox makes it lower case test.homewebtest.com/pt-pt/site/contact, the contact page is Portuguese now. -> OK

BTW: are the language codes internally pt-PT or pt-pt ?

  1. Edited browser URL to test.homewebtest.com/fr/site/contact -> Firefox shows the URL test.homewebtest.com/pt-pt/fr/site/contact, and the 404 error page. -> NOT OK It should just fall back to the default language

Well, I suppose to catch the last case, you have to forbid controller ids that can be confused with language codes.

@rolblo12
Author

If you configure de and the browser sends de-at you will get de, it's that simple.

I am aware, that the pure language switch is that simple, because I just provide the translations for the base languages. There is no separate translation for 'de-DE' and 'de-AT' in my webapp.

However, other things that come with the country specification make me prefer to keep the full country variant internally. Just think about different date representations: de-DE has Januar, de-AT has Jänner. You get those things for free, no additional translation needed.
Besides, MySQL does not accept a simple 'de' either.

@mikehaertl
Collaborator

I think, we have to come to an end with this issue. The extension now supports many different things, but still it can't do magic. I'll try to sum up all that is possible now with a complex example:

$languages = [
  'en',
  'at' => 'de-AT',
  'de',
  'pt-*'
];

This configuration will do the following:

  • Browser languages like en, en-us, en-US, en-uk, ... will all lead to a URL of /en and set en as the Yii language
  • A browser language of de-at or de-AT will lead to a URL of /at and set de-AT as the Yii language
  • A browser language of de, de-de, de-DE, de-ch, ... will lead to a URL of /de and set de as the Yii language
  • A browser language of pt-BR or pt-br will lead to the URL /pt-br and set pt-BR as the Yii language
  • A browser language of pt-PT or pt-pt will lead to the URL /pt-pt and set pt-PT as the Yii language
  • A browser language of pt will lead to the URL /pt and set pt as the Yii language

This should give you now many ways how to configure things. Apart from that I don't think there's more we can do here. So I'll close this issue here for now. If you still think, we can improve things, then maybe come up with a similar list as above and explain exactly what accepted language should lead to which URL and set which language in Yii.
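For readers who want to reason about such a list, the mapping described above can be approximated with a small stand-alone sketch. `resolveAcceptedLanguage` is a hypothetical function written for illustration; it mirrors the bullet list, not the extension's actual code:

```php
<?php
// Hypothetical resolver mirroring the rules listed above: exact entry or
// alias first, then an "ll-*" wildcard, then the bare language.
function resolveAcceptedLanguage(string $accepted, array $languages): ?array
{
    $parts = explode('-', $accepted, 2);
    $language = strtolower($parts[0]);
    // Normalize to "ll" or "ll-CC".
    $code = isset($parts[1]) ? $language . '-' . strtoupper($parts[1]) : $language;

    // 1. Exact entry or alias target, e.g. 'at' => 'de-AT'.
    $key = array_search($code, $languages, true);
    if ($key !== false) {
        $url = is_string($key) ? $key : strtolower($code);
        return ['url' => '/' . $url, 'language' => $code];
    }
    // 2. Wildcard entry such as 'pt-*': any country is accepted.
    if (in_array($language . '-*', $languages, true)) {
        return ['url' => '/' . strtolower($code), 'language' => $code];
    }
    // 3. Fall back to the bare language, e.g. de-ch becomes de.
    if (in_array($language, $languages, true)) {
        return ['url' => '/' . $language, 'language' => $language];
    }
    return null; // language not supported
}

$languages = ['en', 'at' => 'de-AT', 'de', 'pt-*'];
foreach (['en-uk', 'de-at', 'de-ch', 'pt-br', 'pt'] as $accepted) {
    $r = resolveAcceptedLanguage($accepted, $languages);
    echo $accepted, ' => ', $r['url'], ' (', $r['language'], ")\n";
}
// en-uk => /en (en)
// de-at => /at (de-AT)
// de-ch => /de (de)
// pt-br => /pt-br (pt-BR)
// pt => /pt (pt)
```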

@rolblo12
Author

I think you have provided a very useful extension. Thank you very much!

@mikehaertl
Collaborator

Ok, just released version 1.0.5 with all the above changes. Thanks again for your help.
