bug 1461350: Reimplement locale-prefixed URLs to mirror Django #4790

jwhitlock · 2018-05-16T15:00:54Z

This replaces the LocaleURLMiddleware ("Based on zamboni.amo.middleware") with one based on Django's LocaleMiddleware, plus additional middlewares for Kuma-specific functionality. This replaces PR #4766, which was the work-in-progress version of this code.

The Django code is from release 1.8.19, and has several customizations to handle Kuma features like mixed-case locale codes (en-US vs Django's en-us). It also removes some features we don't use, such as storing a language preference in the session (adding a Vary: cookie to all responses), and some flexibility needed in the Django framework code.

The widest code change is the splitting of URL patterns into locale-prefixed patterns and no-locale patterns. In apps, these are grouped into lang_urlpatterns and urlpatterns, and they are applied in kuma/urls.py with a customized i18n_patterns. The patterns now include the active locale, and the active locale needs to be temporarily changed (with language.override(locale)) to generate URLs to other locales. The default URLs contain the locale, so Kuma is less likely to generate "generic" URLs.

Another subtle change is the handling of region-specific languages in the Accept-Language header. Previously, a header of fr-FR, de;q=0.5 would select de, the first exact match. After this change, the Django-based algorithm picks fr, the general locale variant of fr-FR. More discussion is at #4766 (comment).
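
To make the difference concrete, here is a minimal sketch of the two selection strategies. It is not the Kuma or Django code: the SUPPORTED list, the simplified header parser, and both function names are assumptions for illustration, and Django's real algorithm also considers other country-specific variants.

```python
# Hypothetical supported locales, for illustration only.
SUPPORTED = ["en-US", "de", "fr"]


def parse_accept_language(header):
    """Return (language, q) pairs sorted by descending q (simplified parser)."""
    parsed = []
    for piece in header.split(","):
        parts = piece.strip().split(";")
        lang = parts[0].strip()
        if not lang:
            continue
        q = 1.0  # A missing q parameter means full preference.
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name == "q":
                q = float(value)
        parsed.append((lang, q))
    # sorted() is stable, so entries with equal q keep their header order.
    return sorted(parsed, key=lambda pair: pair[1], reverse=True)


def pick_first_exact(header):
    """Old behaviour as described above: the first exact match wins."""
    for lang, _ in parse_accept_language(header):
        if lang in SUPPORTED:
            return lang
    return None


def pick_with_generic_fallback(header):
    """Django-style behaviour: try each entry, then its generic variant."""
    for lang, _ in parse_accept_language(header):
        if lang in SUPPORTED:
            return lang
        generic = lang.split("-")[0]
        if generic in SUPPORTED:
            return generic
    return None
```

With the header from the description, pick_first_exact skips fr-FR and lands on de, while pick_with_generic_fallback stops at fr before it ever looks at de.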

There is some known follow-on work that is not included, because this monster is scary enough:

Change Zone-remap redirects from permanent to temporary redirects, and refactor tests
Simplify Kuma's reverse to drop support for legacy parameters.
Simplify test code for the default of locale-prefixed URLs
(Maybe) Convert the locale prefix from a special parameter of Kuma's reverse to a standard pattern argument.

Copy code from Django 1.8.19 which needs to be modified to work with Kuma language codes, legacy locale support, and 404s:

* from django.conf.urls:
  - i18n_patterns
* from django.core.urlresolvers:
  - LocaleRegexURLResolver (extended)
* from django.middleware.locale:
  - LocaleMiddleware
* from django.utils.translation.trans_real:
  - get_language
  - get_supported_language_variant
  - get_language_from_path
  - get_language_from_request

No modifications yet, to make it easier to review changes in future Django versions.

Modify the Django 1.8.19 locale code to work with Kuma:

* Return Kuma language codes, with overrides from settings.LOCALE_ALIASES, from get_supported_language_variant.
* Don't check the session for a language override.
* Don't add header Vary: Accept-Language
* Handle alternate code paths unused in Kuma, by converting to asserts, or commenting and skipping code coverage.

Add middlewares for Kuma-specific locale features, previously provided in LocaleURLMiddleware or elsewhere, to keep LocaleMiddleware close to Django's version. LangSelectorMiddleware handles locale redirects based on the ?lang query parameter. LocaleStandardizerMiddleware redirects to standardize locale prefix case, legacy locales, and overly specific locales, rather than return 404s.

jwhitlock · 2018-05-16T15:06:17Z

kuma/wiki/tests/test_middleware.py


+    With the Django 1.8-derived middleware, these are combined into a single
+    302 temporary redirect, which may also help with retiring DocumentZones.


Oops - this comment should have been removed. Rebase coming.

Replace LocaleURLMiddleware with LangSelectorMiddleware, LocaleStandardizerMiddleware, and a modified version of Django's LocaleMiddleware. Simplify Kuma's reverse function, and drop code that supports the old locale framework.

One large change is that urlpatterns are split into those with a language prefix (lang_urlpatterns), and those without (urlpatterns). The customized LocaleRegexURLResolver and i18n_patterns ensure that Kuma's language codes (such as en-US) are used instead of Django's language codes (such as en-us).

Kuma's previous locale tools (reverse, Jinja's wiki_url) defaulted to no locale prefix. With the new framework, the default is the currently active locale, and it is retained in request.path_info. These changes require some adjustments, such as in the DocumentZone middleware and tests.

Overly-specific locales in the Accept-Language header are processed differently. Previously, this header:

Accept-Language: fr-FR, de;q=0.5

would result in 'de' being selected as the first exact match. The code now uses Django's behaviour, where 'fr' is chosen as the generic locale match for the first parameter.

escattone · 2018-05-16T17:02:00Z

kuma/core/i18n.py

+    selecting a more generic variant. Raises LookupError if nothing found.
+
+    If `strict` is False (the default), the function will look for an alternative
+    country-specific variant when the currently checked is not found.


Super nit. This portion on strict seems out of place now that it has been removed from the function signature.

escattone · 2018-05-16T17:05:54Z

kuma/core/i18n.py

+
+        # Check for known override
+        if lang_code in settings.LOCALE_ALIASES:
+            return settings.LOCALE_ALIASES[lang_code]


❤️ I love that the LOCALE_ALIASES are incorporated here. They can be handled in a much cleaner, more straightforward fashion.

Yes, it is a much better fit than the "fix after the fact" method from the last PR.

escattone · 2018-05-16T17:32:30Z

kuma/core/i18n.py

+    out the main language.
+
+    If check_path is True, the URL path prefix will be checked for a language
+    code, otherwise this is skipped for backwards compatibility.


Super nit. This check_path portion seems out of place now that it has been removed from the function signature.

escattone · 2018-05-16T17:36:43Z

kuma/core/i18n.py

+        if not language_code_re.search(accept_lang):  # pragma: no cover
+            # Check added with a security fix:
+            # https://www.djangoproject.com/weblog/2007/oct/26/security-fix/
+            # It is unclear how to trigger this branch, so skipping coverage.


Nice research!

I agree. It seems to me that Django's parse_accept_lang_header will only return values that would satisfy the language_code_re.search call.

I'm going to replace it with an assertion, and we'll let the internet tell us what string we haven't thought of.

escattone · 2018-05-16T18:39:50Z

kuma/core/middleware.py

-            query = dict((smart_str(k), v) for
+        """Redirect if ?lang query parameter is valid."""
+        query_lang = request.GET.get('lang')
+        if query_lang not in dict(settings.LANGUAGES):


Nit. I was thinking that since this is run for almost every request, and that most requests will not have a lang query parameter, we could short circuit the need for a dictionary lookup with:

if not (query_lang and (query_lang in dict(settings.LANGUAGES))):

We have a few cases where we'd like a quick check of the supported Kuma language code, so I'll go a little farther down this path and avoid creating a dict every time.
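
A minimal sketch of that idea, assuming a module-level set built once from the language list. LANGUAGES here is a stand-in for settings.LANGUAGES, and LANGUAGE_CODES and is_supported_lang_param are hypothetical names, not the ones used in the PR.

```python
# Stand-in for settings.LANGUAGES: (code, name) pairs. Values are illustrative.
LANGUAGES = [("en-US", "English (US)"), ("de", "German"), ("fr", "French")]

# Built once at import time, instead of calling dict(LANGUAGES) per request.
LANGUAGE_CODES = frozenset(code for code, _ in LANGUAGES)


def is_supported_lang_param(query_lang):
    """Return True only for a non-empty, supported ?lang value."""
    # The truthiness check short-circuits the common case of no ?lang at all.
    return bool(query_lang) and query_lang in LANGUAGE_CODES
```

The middleware would then return early whenever is_supported_lang_param(request.GET.get('lang')) is False, which covers both a missing parameter and an unsupported one.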

escattone · 2018-05-16T19:12:20Z

kuma/core/i18n.py

+        possible_lang_codes.append(generic_lang_code)
+        raw_supported_lang_codes = get_languages()
+        supported_lang_codes = [kuma_language_code_to_django(lang)
+                                for lang in raw_supported_lang_codes]


Nit. I was thinking that when we create the supported_lang_codes list, it's too bad we lose the OrderedDict (from get_languages()) and its efficiency (for the following code in supported_lang_codes check). What do you think of either one of the following instead?

raw_supported_lang_codes = get_languages()
supported_lang_codes = OrderedDict((kuma_language_code_to_django(lang), None) for lang in raw_supported_lang_codes)

or, skipping the call to get_languages altogether:

supported_lang_codes = OrderedDict((kuma_language_code_to_django(lang), None) for lang, _ in settings.LANGUAGES)

with both options assuming, of course, that we add a from collections import OrderedDict.

I'm not sure why an OrderedDict would be more efficient to iterate over than a list.

I'm not as sensitive to efficiency here, because of the lru_cache, but it appears we could use similar code in a few places, so it makes sense to calculate it once and skip get_languages (or call it get_kuma_languages).

What I meant to say is that the OrderedDict is more efficient for the later code in supported_lang_codes check within the loop. Good point about the lru_cache.

Ah I see. We want the set-lookup behavior for testing inclusion, and the ordered-list behavior when iterating through the list. OrderedDict does both.
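
As a rough illustration of why the OrderedDict fits both uses, here is a sketch with a stand-in language list. The body of kuma_language_code_to_django (plain lowercasing) and the find_variant helper are assumptions, not the code under review.

```python
from collections import OrderedDict

# Stand-in for settings.LANGUAGES; values are illustrative.
LANGUAGES = [
    ("en-US", "English (US)"),
    ("fr", "French"),
    ("pt-BR", "Portuguese (Brazil)"),
    ("pt-PT", "Portuguese (Portugal)"),
]


def kuma_language_code_to_django(code):
    """Assumed behaviour: Django-style codes are the lowercased Kuma codes."""
    return code.lower()


# Keys keep the order of LANGUAGES and also give hash-based membership tests.
supported_lang_codes = OrderedDict(
    (kuma_language_code_to_django(code), None) for code, _ in LANGUAGES)


def find_variant(lang_code):
    """Return a supported Django-style code for lang_code, or raise LookupError."""
    generic = lang_code.split("-")[0]
    # Inclusion tests: constant-time lookups, like a set.
    for candidate in (lang_code, generic):
        if candidate in supported_lang_codes:
            return candidate
    # Ordered iteration: the first supported country-specific variant wins.
    for supported in supported_lang_codes:
        if supported.startswith(generic + "-"):
            return supported
    raise LookupError(lang_code)
```

Here find_variant("pt") returns "pt-br" rather than "pt-pt" only because pt-BR comes first in the list, which is the ordering guarantee a plain set would lose.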

escattone · 2018-05-16T21:49:27Z

kuma/core/middleware.py

+            fixed_locale = settings.LOCALE_ALIASES[literal_from_path.lower()]
+        elif not match and literal_from_path.startswith(language_from_path):
+            # Language code is a specific locale (fr vs fr-FR)
+            fixed_locale = language_from_path


👍 Nice, I verified that kuma/core/tests/test_locale_middleware.py covers all three of these cases.
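
For readers skimming the thread, the three cases (prefix case, legacy alias, overly specific locale) can be sketched as a pure function. This is a simplified stand-in, not the middleware itself: SUPPORTED, the LOCALE_ALIASES entry, and standardized_prefix are hypothetical.

```python
# Hypothetical values for illustration; the real ones live in Kuma's settings.
SUPPORTED = ["en-US", "fr", "zh-CN"]
LOCALE_ALIASES = {"cn": "zh-CN"}

# Map lowercased codes back to their canonical mixed-case form.
_BY_LOWER = {code.lower(): code for code in SUPPORTED}


def standardized_prefix(path):
    """Return the corrected locale prefix, or None if no redirect is needed."""
    literal = path.lstrip("/").split("/")[0]
    if not literal or literal in SUPPORTED:
        return None  # No prefix, or already canonical.
    lowered = literal.lower()
    if lowered in _BY_LOWER:
        return _BY_LOWER[lowered]  # Case 1: wrong case (en-us vs en-US).
    if lowered in LOCALE_ALIASES:
        return LOCALE_ALIASES[lowered]  # Case 2: legacy locale alias.
    generic = lowered.split("-")[0]
    if "-" in lowered and generic in _BY_LOWER:
        return _BY_LOWER[generic]  # Case 3: overly specific (fr-FR vs fr).
    return None
```

A middleware built on this would redirect to the same path with the corrected prefix whenever the function returns a value, and pass the request through otherwise.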

safwanrahman · 2018-05-16T22:09:12Z

kuma/core/i18n.py

+    lang_code = request.COOKIES.get(settings.LANGUAGE_COOKIE_NAME)
+
+    try:
+        return get_supported_language_variant(lang_code)


This language code should always be an MDN locale, unless the user modified the cookie!
I think we only need to check whether the locale is supported.

No, I think it is a bad idea to avoid get_supported_language_variant here.

lang_code is None if there is no request cookie, and get_supported_language_variant will raise a LookupError.

get_supported_language_variant checks that the cookie-provided locale is supported, and other stuff. This is where we'll do any locale switching if we remove a previously supported language, or it will also raise a LookupError and we'll fallback to the Accept-Language header.

safwanrahman · 2018-05-16T22:10:51Z

kuma/core/i18n.py

+    except LookupError:
+        pass
+
+    accept = request.META.get('HTTP_ACCEPT_LANGUAGE', '')


I don't know if there are any tests for the ordering of the locale choice!
Like, first the path locale, then the cookie, then the header!
Maybe worth writing one if none exists!

There's a headless test, but probably not a unit test.

There's a few tests of each piece of this. I couldn't find an explicit test that the cookie beat the Accept-Language header, so I updated test_locale_middleware_language_cookie to include a header.
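
The precedence being tested (path, then cookie, then Accept-Language, then the default) can be sketched as below. This is a simplification under stated assumptions: SUPPORTED and DEFAULT are hypothetical, get_supported_language_variant only does exact matches here, and the header is read in order without honouring q-values.

```python
# Hypothetical values; LANGUAGE_COOKIE_NAME stands in for the Django setting.
SUPPORTED = frozenset(["en-US", "de", "fr"])
DEFAULT = "en-US"
LANGUAGE_COOKIE_NAME = "django_language"


def get_supported_language_variant(lang_code):
    """Simplified stand-in: exact match, or LookupError (including for None)."""
    if lang_code in SUPPORTED:
        return lang_code
    raise LookupError(lang_code)


def language_from_request(path, cookies, accept_language):
    """Pick a locale: path prefix first, then cookie, then header, then default."""
    prefix = path.lstrip("/").split("/")[0]
    if prefix in SUPPORTED:
        return prefix

    try:
        # A missing cookie yields None, which raises LookupError and falls through.
        return get_supported_language_variant(cookies.get(LANGUAGE_COOKIE_NAME))
    except LookupError:
        pass

    for piece in accept_language.split(","):
        lang = piece.split(";")[0].strip()
        try:
            return get_supported_language_variant(lang)
        except LookupError:
            continue
    return DEFAULT
```

A test along the lines of the updated test_locale_middleware_language_cookie would pass both a cookie and a conflicting header and assert that the cookie wins.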

safwanrahman · 2018-05-16T22:12:55Z

kuma/core/i18n.py

+        return lang_code
+
+    lang_code = request.COOKIES.get(settings.LANGUAGE_COOKIE_NAME)
+


This should be inside an if block.

if lang_code: ...

It's unclear which line you are saying should be in an if block. I've tried to determine a line that should be in an if block, and none seem appropriate to me.

safwanrahman · 2018-05-16T22:15:17Z

kuma/core/i18n.py

+    # Django supports a case when LANGUAGE_CODE is not in LANGUAGES
+    # (see https://github.com/django/django/pull/824). but our LANGUAGE_CODE is
+    # always in LANGUAGES.
+    assert settings.LANGUAGE_CODE == settings.LANGUAGES[0][0]


I don't know if we should check it on every request. Maybe add a test where it's checked?

It's indirectly checked here:

https://github.com/mozilla/kuma/blob/cdfce6d69f6740d2061c8795cbfc7b12ba05b92d/kuma/settings/common.py#L236-L237

I think it makes sense to add a test and remove the assertion here.

escattone · 2018-05-16T22:27:59Z

kuma/wiki/middleware.py

        # Skip slugs that don't have locales, and won't be in a zone
-        request_slug = request.path_info.lstrip('/')
        if any(request_slug.startswith(slug)


Nit. It seems more efficient to move lines 46-48 immediately after the definition of request_slug, and then start the maybe_lang and path stuff.

escattone · 2018-05-16T22:34:18Z

kuma/wiki/middleware.py

+        path = request.path_info
+        request_slug = path.lstrip('/')
+        maybe_lang = request_slug.split(u'/')[0]
+        if maybe_lang in settings.ENABLED_LOCALES:


Nit. Since the code is run for almost every request, it'd be nice to make this more efficient by using a set or frozenset that we create once and use thereafter, so perhaps adding something like ENABLED_LOCALES_SET = frozenset(ENABLED_LOCALES) within kuma/settings/common.py and then using that here?
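
A small sketch of that suggestion, assuming the set is defined next to the list. ENABLED_LOCALES here is a stand-in for the setting, and split_locale_prefix is a hypothetical helper rather than code from the PR.

```python
# Stand-in for settings.ENABLED_LOCALES; values are illustrative.
ENABLED_LOCALES = ["en-US", "de", "fr"]

# Created once, so each request pays for a hash lookup instead of a list scan.
ENABLED_LOCALES_SET = frozenset(ENABLED_LOCALES)


def split_locale_prefix(path_info):
    """Return (locale, rest_of_path), or (None, path_info) without a locale prefix."""
    request_slug = path_info.lstrip("/")
    maybe_lang, _, rest = request_slug.partition("/")
    if maybe_lang in ENABLED_LOCALES_SET:
        return maybe_lang, "/" + rest
    return None, path_info
```

Paths without a locale prefix, such as health checks, fall out on the first membership test and skip the rest of the zone handling.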

escattone · 2018-05-16T22:52:09Z

kuma/wiki/urls.py

@@ -29,6 +29,9 @@
    url(r'^\$files$',
        edit_attachment,
        name='attachments.edit_attachment'),
+    url(r'^\$edit/(?P<revision_id>\d+)$',
+        views.edit.edit,
+        name='wiki.new_revision_based_on'),


I'm not sure how this wiki.new_revision_based_on url made its way back into our lives, but it needs to go! 😄

Nice catch!

escattone · 2018-05-17T00:01:18Z

kuma/core/middleware.py

+        language_from_path = get_language_from_path(request.path_info)
+        if response.status_code == 404 and not language_from_path:
+            language_path = '/%s%s' % (language, request.path_info)
+            path_valid = is_valid_path(request, language_path)


This fails to redirect locale-free zone URLs (e.g., http://localhost:8000/Firefox gets a 404 instead of redirecting to the active locale set from get_language_from_request on the way in), which causes a number of the tests within tests/headless/test_cdn.py to fail. It seems that is_valid_path has to consider zones as well? 🤦‍♂️ I'm tempted to say we should take this opportunity to dump zones now, but that probably is a bridge too far.

Which tests fail?

Sorry, all of the zone-related slugs (all that start with /Firefox) with the test_locale_selection_cached and test_locale_selection_not_cached tests.

I posted my actual results in the main thread below.

To summarize the results and discussion below:

Tests against /es/Firefox and /fr/Firefox failed because only the English page is in the sample database. This code returns a 404, where the previous code returned a 302. This is probably an improvement, but means those tests should be skipped or marked XFAIL in local development, until the sample database is updated.

The test for ?lang=de succeeded with a 302, even though the page is not in the sample database. The quick 404 would be nice, but it isn't a regression, so I'm not considering a change for this PR.

Thanks @jwhitlock. Sorry for my confusion around zones yesterday. I realized it later last night. I don't think we have to change anything for the ?lang=de case, even in the future. I like the simplicity of the LangSelectorMiddleware as it is.

escattone

This is really nice. I really like this approach. You can do with the nits as you please, and the re-emergence of the wiki.new_revision_based_on endpoint is an easy fix. However, just when I was thinking we were home free, I noticed that some of the headless CDN tests were failing when run against my local dev instance. I think all of the failures are due to the fact that is_valid_path does not take zones into account (oh man, do I hate zones!).

escattone · 2018-05-17T00:23:11Z

Here is the result of the headless tests run against my local dev instance (ignore test_xx, which is a test I added to do some debugging):

(mdntest) rjohnson-25186:kuma rjohnson$ pytest --base-url http://localhost:8000 tests/headless/test_cdn.py
=============================================== test session starts ===============================================
platform darwin -- Python 2.7.12, pytest-3.5.1, py-1.5.3, pluggy-0.6.0
sensitiveurl: .* *** WARNING: sensitive url matches http://localhost:8000 ***
metadata: {'Python': '2.7.12', 'Driver': None, 'Capabilities': {}, 'Base URL': 'http://localhost:8000', 'Platform': 'Darwin-17.5.0-x86_64-i386-64bit', 'kuma': {u'services': {u'search': {u'available': True, u'count': 0, u'populated': False}, u'kumascript': {u'available': True, u'revision': u'5f676637045a77129a5009a6d15ba3eaba8e632c'}, u'test_accounts': {u'available': True}, u'database': {u'available': True, u'document_count': 964, u'populated': True}}, u'version': 1, u'request': {u'url': u'http://localhost:8000/_kuma_status.json', u'is_secure': False, u'host': u'localhost:8000', u'scheme': u'http'}, 'response': {'headers': {'Content-Length': '791', 'x-xss-protection': '1; mode=block', 'x-content-type-options': 'nosniff', 'Content-Language': 'en-US', 'Expires': 'Wed, 16 May 2018 23:43:39 GMT', 'Server': 'meinheld/0.6.1', 'Last-Modified': 'Wed, 16 May 2018 23:43:39 GMT', 'Connection': 'Keep-Alive', 'ETag': '"9cf33bb911d422d9be82180d93a52b82"', 'Cache-Control': 'no-cache, no-store, must-revalidate, max-age=0', 'Date': 'Wed, 16 May 2018 23:43:39 GMT', 'X-Frame-Options': 'DENY', 'Content-Type': 'application/json'}}, u'settings': {u'ATTACHMENT_HOST': u'localhost:8000', u'PROTOCOL': u'http://', u'INTERACTIVE_EXAMPLES_BASE': u'https://interactive-examples.mdn.mozilla.net', u'MAINTENANCE_MODE': False, u'STATIC_URL': u'/static/', u'SITE_URL': u'http://localhost:8000', u'ATTACHMENT_ORIGIN': u'localhost:8000', u'DEBUG': True, u'ALLOWED_HOSTS': [u'*'], u'REVISION_HASH': u'undefined', u'LEGACY_HOSTS': []}}, 'Plugins': {'variables': '1.7.1', 'selenium': '1.11.4', 'xdist': '1.16.0', 'rerunfailures': '2.1.0', 'html': '1.16.1', 'base-url': '1.4.1', 'metadata': '1.5.1'}, 'Packages': {'py': '1.5.3', 'pytest': '3.5.1', 'pluggy': '0.6.0'}}
baseurl: http://localhost:8000
rootdir: /Users/rjohnson/repos/kuma, inifile: pytest.ini
plugins: xdist-1.16.0, variables-1.7.1, selenium-1.11.4, rerunfailures-2.1.0, metadata-1.5.1, html-1.16.1, base-url-1.4.1
collected 349 items

tests/headless/test_cdn.py ..........................................x.....xx............................xx....s.xx.............................................................F.F.F.F.F.F.F.F........................................................................................................F.........................................F.F.F.F.F.F.F.F.F.F.F.F.F.F.F.F.F.F.F.F

==================================================== FAILURES =====================================================
________________________________ test_locale_selection_cached[/Firefox-fr-cookie] _________________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 428, in test_locale_selection_cached
    response = assert_cached(url, 302, is_behind_cdn, **request_kwargs)
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 84, in assert_cached
    assert response.status_code == expected_status_code
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
____________________________________ test_locale_selection_cached[/Firefox-es] ____________________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 428, in test_locale_selection_cached
    response = assert_cached(url, 302, is_behind_cdn, **request_kwargs)
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 84, in assert_cached
    assert response.status_code == expected_status_code
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
______________________________ test_locale_selection_cached[/Firefox$json-fr-cookie] ______________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 428, in test_locale_selection_cached
    response = assert_cached(url, 302, is_behind_cdn, **request_kwargs)
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 84, in assert_cached
    assert response.status_code == expected_status_code
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
_________________________________ test_locale_selection_cached[/Firefox$json-es] __________________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 428, in test_locale_selection_cached
    response = assert_cached(url, 302, is_behind_cdn, **request_kwargs)
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 84, in assert_cached
    assert response.status_code == expected_status_code
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
____________________________ test_locale_selection_cached[/Firefox$history-fr-cookie] _____________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 428, in test_locale_selection_cached
    response = assert_cached(url, 302, is_behind_cdn, **request_kwargs)
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 84, in assert_cached
    assert response.status_code == expected_status_code
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
________________________________ test_locale_selection_cached[/Firefox$history-es] ________________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 428, in test_locale_selection_cached
    response = assert_cached(url, 302, is_behind_cdn, **request_kwargs)
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 84, in assert_cached
    assert response.status_code == expected_status_code
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
____________________________ test_locale_selection_cached[/Firefox$children-fr-cookie] ____________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 428, in test_locale_selection_cached
    response = assert_cached(url, 302, is_behind_cdn, **request_kwargs)
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 84, in assert_cached
    assert response.status_code == expected_status_code
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
_______________________________ test_locale_selection_cached[/Firefox$children-es] ________________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 428, in test_locale_selection_cached
    response = assert_cached(url, 302, is_behind_cdn, **request_kwargs)
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 84, in assert_cached
    assert response.status_code == expected_status_code
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
_____________________________________________________ test_xx _____________________________________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 459, in test_xx
    response = assert_cached(url, 302, is_behind_cdn, **request_kwargs)
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 84, in assert_cached
    assert response.status_code == expected_status_code
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
____________________________ test_locale_selection_not_cached[/Firefox$edit-fr-cookie] ____________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
_______________________________ test_locale_selection_not_cached[/Firefox$edit-es] ________________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
____________________________ test_locale_selection_not_cached[/Firefox$move-fr-cookie] ____________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
_______________________________ test_locale_selection_not_cached[/Firefox$move-es] ________________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
___________________________ test_locale_selection_not_cached[/Firefox$files-fr-cookie] ____________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
_______________________________ test_locale_selection_not_cached[/Firefox$files-es] _______________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
___________________________ test_locale_selection_not_cached[/Firefox$purge-fr-cookie] ____________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
_______________________________ test_locale_selection_not_cached[/Firefox$purge-es] _______________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
___________________________ test_locale_selection_not_cached[/Firefox$delete-fr-cookie] ___________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
______________________________ test_locale_selection_not_cached[/Firefox$delete-es] _______________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
_________________________ test_locale_selection_not_cached[/Firefox$translate-fr-cookie] __________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
_____________________________ test_locale_selection_not_cached[/Firefox$translate-es] _____________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
________________________ test_locale_selection_not_cached[/Firefox$quick-review-fr-cookie] ________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
___________________________ test_locale_selection_not_cached[/Firefox$quick-review-es] ____________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
_________________________ test_locale_selection_not_cached[/Firefox$subscribe-fr-cookie] __________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
_____________________________ test_locale_selection_not_cached[/Firefox$subscribe-es] _____________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
_____________________ test_locale_selection_not_cached[/Firefox$subscribe_to_tree-fr-cookie] ______________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
_________________________ test_locale_selection_not_cached[/Firefox$subscribe_to_tree-es] _________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
_______________________ test_locale_selection_not_cached[/Firefox$revert/1358677-fr-cookie] _______________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
__________________________ test_locale_selection_not_cached[/Firefox$revert/1358677-es] ___________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
============================================= short test summary info =============================================
SKIP [1] tests/headless/test_cdn.py:297: unconditional skip
XFAIL tests/headless/test_cdn.py::test_cached[/en-US/dashboards/macros]
  reason: search is not available and populated
XFAIL tests/headless/test_cdn.py::test_cached[/diagrams/workflow/workflow.svg]
  reason: legacy files are typically not served from a local development instance
XFAIL tests/headless/test_cdn.py::test_cached[/presentations/microsummaries/index.html]
  reason: legacy files are typically not served from a local development instance
XFAIL tests/headless/test_cdn.py::test_cached_301[/files/2767/hut.jpg]
  reason: attachments are typically not served from a local development instance
XFAIL tests/headless/test_cdn.py::test_cached_301[/@api/deki/files/3613/=hut.jpg]
  reason: attachments are typically not served from a local development instance
XFAIL tests/headless/test_cdn.py::test_documents_with_cookie_and_param[/en-US/docs/Web/HTML]
  reason: the sg_task_completion waffle flag is not enabled by default in the sample database
XFAIL tests/headless/test_cdn.py::test_documents_with_cookie_and_param[/en-US/Firefox]
  reason: the sg_task_completion waffle flag is not enabled by default in the sample database
========================== 29 failed, 312 passed, 1 skipped, 7 xfailed in 24.49 seconds ===========================

jwhitlock · 2018-05-17T00:32:11Z

@escattone see if http://localhost:8000/es/Firefox is in your local dev environment, and if it isn't, run ./manage.py scrape_document https://developer.mozilla.org/es/Firefox and see if that fixes that half of the failing tests. The new code is more likely to return a 404 directly, rather than a 30x redirect to a 404.

escattone · 2018-05-17T00:39:46Z

@jwhitlock You're right on. All of the headless tests passed after I did the following:

docker-compose exec web ./manage.py scrape_document https://developer.mozilla.org/es/Firefox
docker-compose exec web ./manage.py scrape_document https://developer.mozilla.org/fr/Firefox

escattone · 2018-05-17T00:41:45Z

That's good news! So it seems all that's needed is to xfail those headless tests.

jwhitlock · 2018-05-17T01:05:05Z

Also: the de test passes, and I guess it shouldn't. I may save that for a future bug.

codecov-io · 2018-05-17T14:19:40Z

Codecov Report

Merging #4790 into master will increase coverage by 0.02%.
The diff coverage is 99.48%.

@@            Coverage Diff             @@
##           master    #4790      +/-   ##
==========================================
+ Coverage   95.81%   95.84%   +0.02%     
==========================================
  Files         270      270              
  Lines       24489    24560      +71     
  Branches     1748     1754       +6     
==========================================
+ Hits        23465    23539      +74     
+ Misses        810      808       -2     
+ Partials      214      213       -1

Impacted Files	Coverage Δ
kuma/wiki/tests/test_helpers.py	`100% <ø> (ø)`	⬆️
kuma/core/tests/test_middleware.py	`100% <ø> (ø)`	⬆️
kuma/wiki/tests/test_middleware.py	`100% <ø> (ø)`	⬆️
kuma/settings/common.py	`90.97% <ø> (ø)`	⬆️
kuma/core/urlresolvers.py	`100% <100%> (+1.31%)`	⬆️
kuma/core/tests/test_pagination.py	`100% <100%> (ø)`	⬆️
kuma/landing/tests/test_views.py	`100% <100%> (ø)`	⬆️
kuma/dashboards/urls.py	`100% <100%> (ø)`	⬆️
kuma/search/urls.py	`100% <100%> (ø)`	⬆️
kuma/core/i18n.py	`100% <100%> (ø)`	⬆️
... and 12 more

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update d5a21e6...b63ec6b.

Use settings.LANGUAGES order to pick region-specific locale prefixes, mirroring Django's algorithm. The first region-specific locale prefix is chosen for the generic language, such as zh-CN for zh, and en-US for en. This is identical to the previous order with the notable exception of pt-PT, not pt-BR, being used for pt (Portuguese).

This affects other settings:
* ACCEPTED_LOCALES is in resolution order, and includes any candidate locales.
* ENABLED_LOCALES is also in resolution order
* FIRST_LOCALE is removed
* LOCALE_ALIASES drops locales that can be derived from resolution order. Only the Chinese customizations (cn, zh-Hans) remain.
* SORTED_LANGUAGES (new) contains the locales in preferred order for display and for forms (en-US, then the others in alphabetical order)

Consistency checks for locale settings are in the kuma.core.tests.test_settings, avoiding runtime assertions.

New functions kuma.core.i18n.get_django_languages and .get_kuma_languages, modeled after Django's get_languages(), return cached OrderedDicts of LANGUAGES, either with Django (en-us) or Kuma (en-US) language codes. These are efficient for both set (lang_code in get_kuma_languages()) and list (for lang_code in get_kuma_languages()) operations, and are now used in a few places.
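For illustration, here is a minimal sketch of how a cached OrderedDict in resolution order can serve both the membership check and the "first region-specific locale wins" lookup. It is not the code from this PR: the LANGUAGES list is a shortened, hypothetical stand-in for settings.LANGUAGES, the get_kuma_languages body is a guess at the shape described above, and resolve_locale is an invented helper name.

```python
from collections import OrderedDict

# Hypothetical, shortened stand-in for settings.LANGUAGES, in resolution order.
LANGUAGES = [
    ("en-US", "English (US)"),
    ("fr", "French"),
    ("pt-PT", "Portuguese (Portugal)"),
    ("pt-BR", "Portuguese (Brazil)"),
    ("zh-CN", "Chinese (Simplified)"),
    ("zh-TW", "Chinese (Traditional)"),
]

_languages_cache = None


def get_kuma_languages():
    """Return a cached OrderedDict of locale code -> display name."""
    global _languages_cache
    if _languages_cache is None:
        _languages_cache = OrderedDict(LANGUAGES)
    return _languages_cache


def resolve_locale(lang_code):
    """Return the supported locale for lang_code, or None (hypothetical helper)."""
    languages = get_kuma_languages()
    if lang_code in languages:  # exact match, a dict lookup
        return lang_code
    generic = lang_code.split("-")[0]
    if generic in languages:  # generic variant, e.g. fr-FR -> fr
        return generic
    for candidate in languages:  # iteration follows resolution order
        if candidate.startswith(generic + "-"):
            return candidate  # first region-specific locale wins
    return None
```

With this ordering, resolve_locale("pt") gives pt-PT because it is listed before pt-BR, and resolve_locale("fr-FR") falls back to fr.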

Previously, a request without a locale prefix was always redirected, whether or not the redirect target was a 404. Now, the LocaleMiddleware returns a 404 if the destination with a locale prefix is not valid. This affects the tests for /fr/Firefox and /es/Firefox, which are not yet in the sample database.
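A rough sketch of that decision, assuming a caller-supplied is_valid_path predicate (Django has a function of that name, but here it is just a parameter) and a hypothetical helper name; the real middleware works on request and response objects rather than bare strings:

```python
def redirect_or_404(path, language, is_valid_path):
    """Decide the response for a path that has no locale prefix.

    Returns (status_code, location). Hypothetical helper, not Kuma's API.
    """
    candidate = "/%s%s" % (language, path)
    if is_valid_path(candidate):
        # The locale-prefixed destination exists: temporary redirect to it.
        return 302, candidate
    # The destination would be a 404 anyway, so answer 404 right away.
    return 404, None


# Stand-in for a sample database that has the en-US page but not the es one.
known_paths = {"/en-US/Firefox"}
print(redirect_or_404("/Firefox", "en-US", known_paths.__contains__))
print(redirect_or_404("/Firefox", "es", known_paths.__contains__))
```

The second call mirrors the headless failures above: without /es/Firefox in the database, the request gets a 404 instead of a 302.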

jwhitlock · 2018-05-17T18:37:48Z

I believe I've addressed the review comments.

I made a significant change to settings.LANGUAGES, which now contains locales in resolution order rather than "sorted" order (English, then sorted by locale code). This allowed further simplifications, and the use of an OrderedDict for efficiently validating locale codes (locale in get_kuma_languages()) and iterating (for locale in get_kuma_languages()), similar to how Django's get_languages is used. This big change is in 0567690.

escattone

Thanks for the nice adjustments as well as all of the painstaking work that you've put into this! This is really nicely done. Thanks @jwhitlock!

escattone · 2018-05-17T21:06:44Z

kuma/core/tests/test_settings.py

+     ('sr', 'sr-Latn'),
+     ('zh-CN', 'zh-TW'),
+     ))
+def test_preferred_locale_codes(primary, secondary):


👍 Really nice idea to formalize in a test!
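The body of test_preferred_locale_codes is not shown in this excerpt. One way such a check could look, assuming it asserts that the primary locale precedes the secondary one in resolution order (the LANGUAGE_CODES list and the helper name are made up for the sketch):

```python
# Hypothetical resolution order; the real list comes from settings.LANGUAGES.
LANGUAGE_CODES = ["en-US", "pt-PT", "pt-BR", "sr", "sr-Latn", "zh-CN", "zh-TW"]

# Pairs of (primary, secondary); the last two appear in the diff above.
PREFERRED_PAIRS = (
    ("pt-PT", "pt-BR"),
    ("sr", "sr-Latn"),
    ("zh-CN", "zh-TW"),
)


def is_preferred(primary, secondary, codes=LANGUAGE_CODES):
    """True if primary comes before secondary in resolution order."""
    return codes.index(primary) < codes.index(secondary)


for primary, secondary in PREFERRED_PAIRS:
    assert is_preferred(primary, secondary), (primary, secondary)
```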

escattone · 2018-05-17T21:07:40Z

kuma/core/tests/test_settings.py

+from django.conf import settings
+
+
+def test_accepted_locales():


Another great test addition, thanks!

jwhitlock added 3 commits May 16, 2018 09:19

jwhitlock requested a review from escattone May 16, 2018 15:00

jwhitlock commented May 16, 2018

jwhitlock force-pushed the locale-middleware-1461350 branch from eb3654d to 9f7ba92 Compare May 16, 2018 15:09

jwhitlock mentioned this pull request May 16, 2018

Reimplement locale-prefixed URLs mdn/sprints#14

Closed

9 tasks

escattone reviewed May 16, 2018

safwanrahman reviewed May 16, 2018

escattone reviewed May 16, 2018

escattone reviewed May 17, 2018

escattone suggested changes May 17, 2018

bug 1461350: Update docstrings for customizations

e7e4d24

Update the docstrings for locale functions taken from Django to more accurately describe the behaviour and where it deviates from Django's behaviour.

jwhitlock added 3 commits May 17, 2018 09:05

bug 1461350: Drop revision_based_on

23e5547

This was removed in PR mdn#4769, and accidentally re-added when copying from an earlier version of this pull request.

bug 1461350: Skip non-locale paths earlier

9c0c368

In the DocumentZoneMiddleware, skip non-locale paths (such as health checks) as early as possible, before the string manipulation needed for zoned URLs.
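The idea can be sketched as an early return before any path rewriting. This is not the DocumentZoneMiddleware code: the function name, the locales set, and the zone_remaps pairs are all assumptions for the example.

```python
def zone_redirect(path, locales, zone_remaps):
    """Return a zoned URL for path, or None if no remap applies.

    Hypothetical helper: zone_remaps is a list of
    (original_prefix, zoned_prefix) pairs without the locale.
    """
    # Cheap check first: health checks and other non-locale paths exit here.
    first_segment = path[1:].partition("/")[0]
    if first_segment not in locales:
        return None
    # Only locale-prefixed paths pay for the string manipulation below.
    rest = path[len(first_segment) + 1:]
    for original, zoned in zone_remaps:
        if rest == original or rest.startswith(original + "/"):
            return "/" + first_segment + zoned + rest[len(original):]
    return None
```

For example, with locales {"en-US"} and a remap from /docs/Mozilla/Firefox to /Firefox, the path /healthz returns None immediately, while /en-US/docs/Mozilla/Firefox/Releases maps to /en-US/Firefox/Releases.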

bug 1461350: Convert duplicate check to assertion

b63ec6b

jwhitlock added 3 commits May 17, 2018 12:23

bug 1461350: Test that language cookie wins

66779ca

Update test_locale_middleware_language_cookie to include an Accept-Language header, which is overridden by the cookie.
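As a sketch of the precedence this test exercises, assuming the cookie name is Django's default LANGUAGE_COOKIE_NAME ("django_language") and using simplified, hypothetical helpers in place of the real get_language_from_request:

```python
def parse_accept_language(header):
    """Return language tags from an Accept-Language header, highest q first."""
    choices = []
    for piece in header.split(","):
        tag, _, params = piece.strip().partition(";")
        tag = tag.strip()
        if not tag:
            continue
        quality = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        choices.append((quality, tag))
    # sort() is stable, so tags with equal q keep their header order.
    choices.sort(key=lambda choice: choice[0], reverse=True)
    return [tag for _, tag in choices]


def pick_language(cookies, accept_language, supported,
                  default="en-US", cookie_name="django_language"):
    """Cookie first, then Accept-Language, then the default locale."""
    cookie_code = cookies.get(cookie_name)
    if cookie_code in supported:
        return cookie_code
    for tag in parse_accept_language(accept_language):
        if tag in supported:
            return tag
    return default
```

Given supported locales en-US, fr, and de, a request with the cookie set to de and an Accept-Language of fr,de;q=0.5 resolves to de; without the cookie it resolves to fr.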

escattone approved these changes May 17, 2018

escattone merged commit ccb2bdc into mdn:master May 17, 2018

jwhitlock deleted the locale-middleware-1461350 branch May 21, 2018 20:56


		With the Django 1.8-derived middleware, these are combined into a single
		302 temporary redirect, which may also help with retiring DocumentZones.

		return lang_code

		lang_code = request.COOKIES.get(settings.LANGUAGE_COOKIE_NAME)
