Skip to content
This repository has been archived by the owner on Aug 26, 2022. It is now read-only.

bug 1461350: Reimplement locale-prefixed URLs to mirror Django #4790

Merged
merged 11 commits into from
May 17, 2018

Conversation

jwhitlock
Copy link
Contributor

This replaces the LocaleURLMiddleware ("Based on zamboni.amo.middleware") with one based on Django's LocaleMiddleware, plus additional middlewares for Kuma-specific functionality. This replaces PR #4766, which was the work-in-progress version of this code.

The Django code is from release 1.8.19, and has several customizations to handle Kuma features like mixed-case locale codes (en-US vs Django's en-us). It also removes some features we don't use, such as storing a language preference in the session (adding a Vary: cookie to all responses), and some flexibility needed in the Django framework code.

The widest code change is the splitting of URL patterns into locale-prefixed patterns and no-locale patterns. In apps, these are grouped into lang_urlpatterns and urlpatterns, and they are applied in kuma/urls.py with a customized i18n_patterns. The patterns now include the active locale, and the active locale needs to be temporarily changed (with language.override(locale)) to generate URLs to other locales. The default URLs contain the locale, so Kuma is less likely to generate "generic" URLs.

Another subtle change is the handling of region-specific languages in the Accept-Language header. Previously, a header of fr-FR; de=0.5 would select de, the first exact match. After this change, the Django-based algorithm picks fr, the general locale variant of fr-FR. More discussion is at #4766 (comment).

There is some known follow-on work that is not included, because this monster is scary enough:

  • Change Zone-remap redirects from permanent to temporary redirects, and refactor tests
  • Simplify Kuma's reverse to drop support for legacy parameters.
  • Simplify test code for the default of locale-prefixed URLs
  • (Maybe) Convert the locale prefix from a special parameter of Kuma's reverse to a standard pattern argument.

Copy code from Django 1.8.19 which needs to be modified to work with
Kuma language codes, legacy locale support, and 404s:

* from django.conf.urls:
  - i18n_patterns
* from django,core.urlresolvers:
  - LocaleRegexURLResolver (extended)
* from django.middleware.locale:
  - LocaleMiddleware
* from django.utils.translation.trans_real:
  - get_langugage
  - get_supported_language_variant
  - get_language_from_path
  - get_language_from_request

No modifications yet, to make it easier to review changes in future
Django versions.
Modify the Django 1.8.19 locale code to work with Kuma:

* Return Kuma language codes, with overrides from settings.LOCALE_ALIASES,
  from get_supported_language_variant.
* Don't check the session for a language override.
* Don't add header Vary: Accept-Language
* Handle alternate code paths unused in Kuma, by converting to asserts,
  or commenting and skipping code coverage.
Add middlewares for Kuma-specific locale features, provided in to
LocaleURLMiddleware or elsewhere, to keep LocaleMiddleware close to
Django's version.

LangSelectorMiddleware handles locale redirects based on the ?lang query
parameter.

LocaleStandardizerMiddleware redirects to standardize local prefix
cases, legacy locales, and overly specific locales, rather than return
404s.
@jwhitlock jwhitlock requested a review from escattone May 16, 2018 15:00

With the Django 1.8-derived middleware, these are combined into a single
302 temporary redirect, which may also help with retiring DocumentZones.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops - this comment should have been removed. Rebase coming.

Replace LocaleURLMiddleware with LangSelectorMiddleware,
LocaleStandardizerMiddleware, and Django's modified LocaleMiddleware.
Simplify Kuma's reverse function, and drop code that supports the old
locale framework.

One large change is that urlpatterns are split into those with a
language prefix (lang_urlpatterns), and those without (urlpatterns). The
customized LocaleRegexURLResolver and i18n_patterns ensure that Kuma's
language codes (such as en-US) are used instead of Django's language
codes (such as en-us).

Kuma's previous locale tools (reverse, Jinja's wiki_url) defaulted to no
locale prefix. With the new framework, the default is the currently
active locale, and it is retained in request.path_info. These changes
require some adjustments, such as in the DocumentZone middleware and
tests.

Overly-specific locales in the Accept-Language header are processed
differenty. Previously, this header:

  Accept-Language: fr-FR, de;q=0.5

would results in 'de' being selected as the first exact match. The code
now uses Django's behaviour, where 'fr' is chosen as the generic locale
match for the first parameter.
selecting a more generic variant. Raises LookupError if nothing found.

If `strict` is False (the default), the function will look for an alternative
country-specific variant when the currently checked is not found.
Copy link
Contributor

@escattone escattone May 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super nit. This portion on strict seems out of place now that it has been removed from the function signature.


# Check for known override
if lang_code in settings.LOCALE_ALIASES:
return settings.LOCALE_ALIASES[lang_code]
Copy link
Contributor

@escattone escattone May 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️ I love that the LOCALE_ALIASES are incorporated here. They can be handled in a much cleaner, more straightforward fashion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is a much better fit than the "fix after the fact" method from the last PR.

out the main language.

If check_path is True, the URL path prefix will be checked for a language
code, otherwise this is skipped for backwards compatibility.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super nit. This check_path portion seems out of place now that it has been removed from the function signature.

if not language_code_re.search(accept_lang): # pragma: no cover
# Check added with a security fix:
# https://www.djangoproject.com/weblog/2007/oct/26/security-fix/
# It is unclear how to trigger this branch, so skipping coverage.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice research!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. It seems to me that Django's parse_accept_lang_header will only return values that would satisfy the language_code_re.search call.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to replace it with an assertion, and we'll let the internet tell us what string we haven't thought of.

query = dict((smart_str(k), v) for
"""Redirect if ?lang query parameter is valid."""
query_lang = request.GET.get('lang')
if query_lang not in dict(settings.LANGUAGES):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit. I was thinking that since this is run for almost every request, and that most requests will not have a lang query parameter, we could short circuit the need for a dictionary lookup with:

if not (query_lang and (query_lang in dict(settings.LANGUAGES))):

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a few cases where we'd like a quick check of the supported Kuma language code, so I'll go a little farther down this path and avoid creating a dict every time.

possible_lang_codes.append(generic_lang_code)
raw_supported_lang_codes = get_languages()
supported_lang_codes = [kuma_language_code_to_django(lang)
for lang in raw_supported_lang_codes]
Copy link
Contributor

@escattone escattone May 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit. I was thinking that when we create the supported_lang_codes list, it's too bad we lose the OrderedDict (from get_languages()) and its efficiency (for the following code in supported_lang_codes check). What do you think of either one of the following instead?

raw_supported_lang_codes = get_languages()
supported_lang_codes = OrderedDict((kuma_language_code_to_django(lang), None)
                                   for lang in raw_supported_lang_codes)

or, skipping the call to get_languages altogether:

supported_lang_codes = OrderedDict((kuma_language_code_to_django(lang), None)
                                   for lang, _ in settings.LANGUAGES)

with both options assuming, of course, that we add a from collections import OrderedDict.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why an OrderedDict would be more efficient to iterate over than a list.

I'm not as sensitive to efficiency here, because of the lru_cache, but it appears we could use similar code in a few places, so it makes sense to calculate it once and skip get_languages (or call it get_kuma_languages).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I meant to say is that the OrderedDict is more efficient for the later code in supported_lang_codes check within the loop. Good point about the lru_cache.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. We want the set-lookup behavior for testing inclusion, and the ordered-list behavior when iterating through the list. OrderedDict does both.

fixed_locale = settings.LOCALE_ALIASES[literal_from_path.lower()]
elif not match and literal_from_path.startswith(language_from_path):
# Language code is a specific locale (fr vs fr-FR)
fixed_locale = language_from_path
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Nice, I verified that kuma/core/tests/test_locale_middleware.py covers all three of these cases.

lang_code = request.COOKIES.get(settings.LANGUAGE_COOKIE_NAME)

try:
return get_supported_language_variant(lang_code)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This language code should always be mdn locale other than user modify the cookie!
I think we can only check if the locale is supported

Copy link
Contributor Author

@jwhitlock jwhitlock May 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I think it is a bad idea to avoid get_supported_language_variant here.

lang_code is None if there is no request cookie, and get_supported_language_variant will raise a LookupError.

get_supported_language_variant checks that the cookie-provided locale is supported, and other stuff. This is where we'll do any locale switching if we remove a previously supported language, or it will also raise a LookupError and we'll fallback to the Accept-Language header.

except LookupError:
pass

accept = request.META.get('HTTP_ACCEPT_LANGUAGE', '')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont know if there are any test for the ordering of locale choice!
Like, first path locale, then cookie, then header!
maybe worth writing one if not exists!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a headless test, but probably not a unit test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a few tests of each piece of this. I couldn't find an explicit test that the cookie beat the Accept-Language header, so I updated test_locale_middleware_language_cookie to include a header.

return lang_code

lang_code = request.COOKIES.get(settings.LANGUAGE_COOKIE_NAME)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be inside a if block.

if lang_code:
...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unclear which line you are saying should be in an if block. I've tried to determine a line that should be in an if block, and none seem appropriate to me.

# Django supports a case when LANGUAGE_CODE is not in LANGUAGES
# (see https://github.com/django/django/pull/824). but our LANGUAGE_CODE is
# always in LANGUAGES.
assert settings.LANGUAGE_CODE == settings.LANGUAGES[0][0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont know if we should check it in every request. maybe add a test where its checked?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's indirectly checked here:

https://github.com/mozilla/kuma/blob/cdfce6d69f6740d2061c8795cbfc7b12ba05b92d/kuma/settings/common.py#L236-L237

I think it makes sense to add a test and remove the assertion here.

# Skip slugs that don't have locales, and won't be in a zone
request_slug = request.path_info.lstrip('/')
if any(request_slug.startswith(slug)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit. It seems more efficient to move lines 46-48 immediately after the definition of request_slug, and then start the maybe_lang and path stuff.

path = request.path_info
request_slug = path.lstrip('/')
maybe_lang = request_slug.split(u'/')[0]
if maybe_lang in settings.ENABLED_LOCALES:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit. Since the code is run for almost every request, it'd be nice to make this more efficient by using a set or frozenset that we create once and use thereafter, so perhaps adding something like ENABLED_LOCALES_SET = frozenset(ENABLED_LOCALES) within kuma/settings/common.py and then using that here?

@@ -29,6 +29,9 @@
url(r'^\$files$',
edit_attachment,
name='attachments.edit_attachment'),
url(r'^\$edit/(?P<revision_id>\d+)$',
views.edit.edit,
name='wiki.new_revision_based_on'),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how this wiki.new_revision_based_on url made its way back into our lives, but it needs to go! 😄

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch!

language_from_path = get_language_from_path(request.path_info)
if response.status_code == 404 and not language_from_path:
language_path = '/%s%s' % (language, request.path_info)
path_valid = is_valid_path(request, language_path)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fails to redirect locale-free zone URL's (e.g., http://localhost:8000/Firefox gets a 404 instead of redirecting to the active locale set from get_language_from_request on the way in), which causes a number of the tests within tests/headless/test_cdn.py to fail. It seems that is_valid_path has to consider zones as well? 🤦‍♂️ I'm tempted to say we should take this opportunity to dump zones now, but that probably is a bridge too far.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which tests fail?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, all of the zone-related slugs (all that start with /Firefox) with the test_locale_selection_cached and test_locale_selection_not_cached tests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I posted my actual results in the main thread below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To summarize the results and discussion below:

  • Tests against /es/Firefox and /fr/Firefox failed because only the English page is in the sample database. This code returns a 404, where the previous code returned a 302. This is probably an improvement, but means those tests should be skipped or marked XFAIL in local development, until the sample database is updated.
  • The test for ?lang=de succeeded with a 302, even though the page is not in the sample database. The quick 404 would be nice, but it isn't a regression, so I'm not considering a change for this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jwhitlock. Sorry for my confusion around zones yesterday. I realized it later last night. I don't think we have to change anything for the ?lang=de case, even in the future. I like the simplicity of the LangSelectorMiddleware as it is.

Copy link
Contributor

@escattone escattone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really nice. I really like this approach. The nits you can do with as you please, and the re-emergence of the wiki.new_revision_based_on endpoint is an easy fix. However, just when I was thinking we were home free, I noticed that some of the headless CDN tests were failing when run against my local dev instance. I think all of the failures are due to the fact that is_valid_path does not take zones into account (oh man, do I hate zones!).

@escattone
Copy link
Contributor

escattone commented May 17, 2018

Here is the result of the headless tests run against my local dev instance (ignore test_xx, which is a test I added to do some debugging):

(mdntest) rjohnson-25186:kuma rjohnson$ pytest --base-url http://localhost:8000 tests/headless/test_cdn.py
=============================================== test session starts ===============================================
platform darwin -- Python 2.7.12, pytest-3.5.1, py-1.5.3, pluggy-0.6.0
sensitiveurl: .* *** WARNING: sensitive url matches http://localhost:8000 ***
metadata: {'Python': '2.7.12', 'Driver': None, 'Capabilities': {}, 'Base URL': 'http://localhost:8000', 'Platform': 'Darwin-17.5.0-x86_64-i386-64bit', 'kuma': {u'services': {u'search': {u'available': True, u'count': 0, u'populated': False}, u'kumascript': {u'available': True, u'revision': u'5f676637045a77129a5009a6d15ba3eaba8e632c'}, u'test_accounts': {u'available': True}, u'database': {u'available': True, u'document_count': 964, u'populated': True}}, u'version': 1, u'request': {u'url': u'http://localhost:8000/_kuma_status.json', u'is_secure': False, u'host': u'localhost:8000', u'scheme': u'http'}, 'response': {'headers': {'Content-Length': '791', 'x-xss-protection': '1; mode=block', 'x-content-type-options': 'nosniff', 'Content-Language': 'en-US', 'Expires': 'Wed, 16 May 2018 23:43:39 GMT', 'Server': 'meinheld/0.6.1', 'Last-Modified': 'Wed, 16 May 2018 23:43:39 GMT', 'Connection': 'Keep-Alive', 'ETag': '"9cf33bb911d422d9be82180d93a52b82"', 'Cache-Control': 'no-cache, no-store, must-revalidate, max-age=0', 'Date': 'Wed, 16 May 2018 23:43:39 GMT', 'X-Frame-Options': 'DENY', 'Content-Type': 'application/json'}}, u'settings': {u'ATTACHMENT_HOST': u'localhost:8000', u'PROTOCOL': u'http://', u'INTERACTIVE_EXAMPLES_BASE': u'https://interactive-examples.mdn.mozilla.net', u'MAINTENANCE_MODE': False, u'STATIC_URL': u'/static/', u'SITE_URL': u'http://localhost:8000', u'ATTACHMENT_ORIGIN': u'localhost:8000', u'DEBUG': True, u'ALLOWED_HOSTS': [u'*'], u'REVISION_HASH': u'undefined', u'LEGACY_HOSTS': []}}, 'Plugins': {'variables': '1.7.1', 'selenium': '1.11.4', 'xdist': '1.16.0', 'rerunfailures': '2.1.0', 'html': '1.16.1', 'base-url': '1.4.1', 'metadata': '1.5.1'}, 'Packages': {'py': '1.5.3', 'pytest': '3.5.1', 'pluggy': '0.6.0'}}
baseurl: http://localhost:8000
rootdir: /Users/rjohnson/repos/kuma, inifile: pytest.ini
plugins: xdist-1.16.0, variables-1.7.1, selenium-1.11.4, rerunfailures-2.1.0, metadata-1.5.1, html-1.16.1, base-url-1.4.1
collected 349 items

tests/headless/test_cdn.py ..........................................x.....xx............................xx....s.xx.............................................................F.F.F.F.F.F.F.F........................................................................................................F.........................................F.F.F.F.F.F.F.F.F.F.F.F.F.F.F.F.F.F.F.F

==================================================== FAILURES =====================================================
________________________________ test_locale_selection_cached[/Firefox-fr-cookie] _________________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 428, in test_locale_selection_cached
    response = assert_cached(url, 302, is_behind_cdn, **request_kwargs)
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 84, in assert_cached
    assert response.status_code == expected_status_code
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
____________________________________ test_locale_selection_cached[/Firefox-es] ____________________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 428, in test_locale_selection_cached
    response = assert_cached(url, 302, is_behind_cdn, **request_kwargs)
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 84, in assert_cached
    assert response.status_code == expected_status_code
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
______________________________ test_locale_selection_cached[/Firefox$json-fr-cookie] ______________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 428, in test_locale_selection_cached
    response = assert_cached(url, 302, is_behind_cdn, **request_kwargs)
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 84, in assert_cached
    assert response.status_code == expected_status_code
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
_________________________________ test_locale_selection_cached[/Firefox$json-es] __________________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 428, in test_locale_selection_cached
    response = assert_cached(url, 302, is_behind_cdn, **request_kwargs)
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 84, in assert_cached
    assert response.status_code == expected_status_code
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
____________________________ test_locale_selection_cached[/Firefox$history-fr-cookie] _____________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 428, in test_locale_selection_cached
    response = assert_cached(url, 302, is_behind_cdn, **request_kwargs)
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 84, in assert_cached
    assert response.status_code == expected_status_code
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
________________________________ test_locale_selection_cached[/Firefox$history-es] ________________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 428, in test_locale_selection_cached
    response = assert_cached(url, 302, is_behind_cdn, **request_kwargs)
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 84, in assert_cached
    assert response.status_code == expected_status_code
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
____________________________ test_locale_selection_cached[/Firefox$children-fr-cookie] ____________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 428, in test_locale_selection_cached
    response = assert_cached(url, 302, is_behind_cdn, **request_kwargs)
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 84, in assert_cached
    assert response.status_code == expected_status_code
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
_______________________________ test_locale_selection_cached[/Firefox$children-es] ________________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 428, in test_locale_selection_cached
    response = assert_cached(url, 302, is_behind_cdn, **request_kwargs)
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 84, in assert_cached
    assert response.status_code == expected_status_code
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
_____________________________________________________ test_xx _____________________________________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 459, in test_xx
    response = assert_cached(url, 302, is_behind_cdn, **request_kwargs)
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 84, in assert_cached
    assert response.status_code == expected_status_code
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
____________________________ test_locale_selection_not_cached[/Firefox$edit-fr-cookie] ____________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
_______________________________ test_locale_selection_not_cached[/Firefox$edit-es] ________________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
____________________________ test_locale_selection_not_cached[/Firefox$move-fr-cookie] ____________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
_______________________________ test_locale_selection_not_cached[/Firefox$move-es] ________________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
___________________________ test_locale_selection_not_cached[/Firefox$files-fr-cookie] ____________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
_______________________________ test_locale_selection_not_cached[/Firefox$files-es] _______________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
___________________________ test_locale_selection_not_cached[/Firefox$purge-fr-cookie] ____________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
_______________________________ test_locale_selection_not_cached[/Firefox$purge-es] _______________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
___________________________ test_locale_selection_not_cached[/Firefox$delete-fr-cookie] ___________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
______________________________ test_locale_selection_not_cached[/Firefox$delete-es] _______________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
_________________________ test_locale_selection_not_cached[/Firefox$translate-fr-cookie] __________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
_____________________________ test_locale_selection_not_cached[/Firefox$translate-es] _____________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
________________________ test_locale_selection_not_cached[/Firefox$quick-review-fr-cookie] ________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
___________________________ test_locale_selection_not_cached[/Firefox$quick-review-es] ____________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
_________________________ test_locale_selection_not_cached[/Firefox$subscribe-fr-cookie] __________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
_____________________________ test_locale_selection_not_cached[/Firefox$subscribe-es] _____________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
_____________________ test_locale_selection_not_cached[/Firefox$subscribe_to_tree-fr-cookie] ______________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
_________________________ test_locale_selection_not_cached[/Firefox$subscribe_to_tree-es] _________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
_______________________ test_locale_selection_not_cached[/Firefox$revert/1358677-fr-cookie] _______________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
__________________________ test_locale_selection_not_cached[/Firefox$revert/1358677-es] ___________________________
Traceback (most recent call last):
  File "/Users/rjohnson/repos/kuma/tests/headless/test_cdn.py", line 517, in test_locale_selection_not_cached
    assert response.status_code == 302
AssertionError: assert 404 == 302
 +  where 404 = <Response [404]>.status_code
============================================= short test summary info =============================================
SKIP [1] tests/headless/test_cdn.py:297: unconditional skip
XFAIL tests/headless/test_cdn.py::test_cached[/en-US/dashboards/macros]
  reason: search is not available and populated
XFAIL tests/headless/test_cdn.py::test_cached[/diagrams/workflow/workflow.svg]
  reason: legacy files are typically not served from a local development instance
XFAIL tests/headless/test_cdn.py::test_cached[/presentations/microsummaries/index.html]
  reason: legacy files are typically not served from a local development instance
XFAIL tests/headless/test_cdn.py::test_cached_301[/files/2767/hut.jpg]
  reason: attachments are typically not served from a local development instance
XFAIL tests/headless/test_cdn.py::test_cached_301[/@api/deki/files/3613/=hut.jpg]
  reason: attachments are typically not served from a local development instance
XFAIL tests/headless/test_cdn.py::test_documents_with_cookie_and_param[/en-US/docs/Web/HTML]
  reason: the sg_task_completion waffle flag is not enabled by default in the sample database
XFAIL tests/headless/test_cdn.py::test_documents_with_cookie_and_param[/en-US/Firefox]
  reason: the sg_task_completion waffle flag is not enabled by default in the sample database
========================== 29 failed, 312 passed, 1 skipped, 7 xfailed in 24.49 seconds ===========================

@jwhitlock
Copy link
Contributor Author

@escattone see if http://localhost:8000/es/Firefox is in your locale dev environment, and if it isn't, ./manage.py scrape_document https://developer.mozilla.org/es/Firefox and see if it fixes that half of the failing tests. The new code is more likely to return a 404 rather than a 30x to a 404.

@escattone
Copy link
Contributor

@jwhitlock You're right on. All of the headless tests passed after I did the following:

docker-compose exec web ./manage.py scrape_document https://developer.mozilla.org/es/Firefox
docker-compose exec web ./manage.py scrape_document https://developer.mozilla.org/fr/Firefox

@escattone
Copy link
Contributor

That's good news! So it seems all that's needed is to xfail those headless tests.

@jwhitlock
Copy link
Contributor Author

Also: the de test passes, and I guess it shouldn't. I may save that for a future bug.

Update the docstrings for locale functions taken from Django to more
accurately describe the behaviour and where it deviates from Django's
behaviour.
This was removed in PR mdn#4769, and accidently re-added when copying from
an earlier version of this pull request.
In the DocumentZoneMiddleware, skip non-locale paths (such as health
checks) as early as possible, before the string manipulation for zoned
URL manipulations.
@codecov-io
Copy link

Codecov Report

Merging #4790 into master will increase coverage by 0.02%.
The diff coverage is 99.48%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #4790      +/-   ##
==========================================
+ Coverage   95.81%   95.84%   +0.02%     
==========================================
  Files         270      270              
  Lines       24489    24560      +71     
  Branches     1748     1754       +6     
==========================================
+ Hits        23465    23539      +74     
+ Misses        810      808       -2     
+ Partials      214      213       -1
Impacted Files Coverage Δ
kuma/wiki/tests/test_helpers.py 100% <ø> (ø) ⬆️
kuma/core/tests/test_middleware.py 100% <ø> (ø) ⬆️
kuma/wiki/tests/test_middleware.py 100% <ø> (ø) ⬆️
kuma/settings/common.py 90.97% <ø> (ø) ⬆️
kuma/core/urlresolvers.py 100% <100%> (+1.31%) ⬆️
kuma/core/tests/test_pagination.py 100% <100%> (ø) ⬆️
kuma/landing/tests/test_views.py 100% <100%> (ø) ⬆️
kuma/dashboards/urls.py 100% <100%> (ø) ⬆️
kuma/search/urls.py 100% <100%> (ø) ⬆️
kuma/core/i18n.py 100% <100%> (ø) ⬆️
... and 12 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d5a21e6...b63ec6b. Read the comment docs.

Use settings.LANGUAGES order to pick region-specific locale prefixes,
mirroring Django's algorithm. The first region-specific locale prefix is
chosen for the generic language, such as zh-CN for zh, and en-US for en.
This is identical to the previous order with the notable exception of
pt-PT, not pt-BR, being used for pt (Portuguese).

This affects other settings:

* ACCEPTED_LOCALES is in resolution order, and includes any candidate
  locales.
* ENABLED_LOCALES is also in resolution order
* FIRST_LOCALE is removed
* LOCALE_ALIASES drops locales that can be derived from resolution
  order. Only the Chinese customizations (cn, zh-Hans) remain.
* SORTED_LANGUAGES (new) contains the locales in preferred order for
  display and for forms (en-US, then the others in alphabetical order)

Consistency checks for locale settings are in the
kuma.core.tests.test_settings, avoiding runtime assertions.

New functions kuma.core.i18n.get_django_languages and
.get_kuma_languages, modeled after Django's get_languages(), return
cached OrderedDicts of LANGUAGES, either with Django (en-us) or Kuma
(en-US) language codes. These are efficient for both set
(lang_code in get_kuma_languages()) and list (for lang_code in
get_kuma_languages()) operations, and are now used in a few places.
Update test_locale_middleware_language_cookie to include an
Accept-Language header, which is overridden by the cookie.
Previously, a no-locale prefix would get a redirect, whether or not the
next link was a 404. Now, the LocaleMiddleware will 404 if the
destination with a locale prefix is not valid. This affects the tests
for /fr/Firefox and /es/Firefox, which are not yet in the sample
database.
@jwhitlock
Copy link
Contributor Author

I believe I've addressed the review comments.

I made a significant change to settings.LANGUAGES, which now contains locales in resolution order rather than "sorted" order (English, then sorted by locale code. This allowed further simplifications, and the use of an OrderedDict for efficiently validating locale codes (locale in get_kuma_languages()) and iterating (for locale in get_kuma_languages()), similar to how Django's get_languages is used. This big change is in 0567690.

Copy link
Contributor

@escattone escattone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the nice adjustments as well as all of the painstaking work that you've put into this! This is really nicely done. Thanks @jwhitlock!

('sr', 'sr-Latn'),
('zh-CN', 'zh-TW'),
))
def test_preferred_locale_codes(primary, secondary):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Really nice idea to formalize in a test!

from django.conf import settings


def test_accepted_locales():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another great test addition, thanks!

@escattone escattone merged commit ccb2bdc into mdn:master May 17, 2018
@jwhitlock jwhitlock deleted the locale-middleware-1461350 branch May 21, 2018 20:56
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants