Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch Bedrock to using Django's own i18n mechanism #14327

Merged
merged 53 commits into from
Apr 17, 2024

Conversation

stevejalim
Copy link
Collaborator

@stevejalim stevejalim commented Mar 18, 2024

Please do not merge - let @stevejalim do it at the ideal point

One-line summary

Along with SUMO and other sites, Bedrock has historically used an in-house internationalization (i18n) mechanism to detect/select the appropriate language for a user and to 'activate' the relevant language for translation, so that the localization (l10n) logic can pick the appropriate Fluent files, etc, to render a page in the language expected by the user.

This has worked fine for a long time, but the time has come to change Bedrock to instead use Django's core i18n machinery. Why? We'll soon be using Wagtail CMS within Bedrock and Wagtail's own l10n system (wagtail-localize) is build to need Django's i18n machinery. (We've tried it with ours and it just doesn't work.) Also, the closer Bedrock is to 'vanilla Django' the better (though even after this PR lands, there's plenty that's special about Bedrock, such as our Fluent l10n mechanism, which is not removed or changed as part of this work).

Significant changes and points to review

There's a lot changed here. We've dropped our custom prefixer and the urlresolver that used it, switching to core django behaviour using i18n_patterns to generate language-code-prefixed URL routes, combined with some custom middleware that does things like ensuring we retain the mixed-case langauge codes of our original system, as well as handle some corner cases such as our special locale-selection root page that only robots see.

Please don't hold back with questions when reviewing this code - I've learned a lot about our and Django's internals along the way, and it may be that there are better ways to do some things.

I'm adding a QA checklist below, which I'd value each reviewer ticking the relevant parts of as they test things out.

Issue / Bugzilla link

Resolves #14256

Testing

In a nutshell, nothing should be operationally different compared to production Bedrock. Please take some time to explore both the Mozilla and Pocket sites on the URLs below, switching between languages and poking at areas that you understand well - eg attribution, analytics, locale-selection, downloads, etc

Also, please can y'all double-check these, which I've tested as part of this work, but can't be too careful. Please tick off any you test

  • Mozorg mode on https://www-demo7.allizom.org
    - [ ] Check the redirect from all lower case country codes to uppercase (example: en-us will redirect automatically to en-US. Ideally check for all settings.PROD_LANGUAGES
    - [ ] Check that explicit fallbacks/canonical languages are still in place (example: ja-JP-mac -> 'jp'; sv -> sv-SE; zh-HK -> zh-TW
    - [ ] Check that the language-switcher in the footer works, with and without JS
    - [ ] Check that the Accept-Language header value (ie Browser language) is successfully respected and Bedrock uses the given lang code if there is none in the path
    - [ ] Also check that picking a specific lang via the path (eg /de/about/) will get a German page, even if the Browser default language is set to something other than German
    - [ ] Confirm that three-letter lang codes (eg sco) work fine
    • Check special 404 Locale picker page is served at / when there is no accept-language header - @robhudson
    • Check sensitive lang-based redirects work (e.g. zh-Hant-TW -> zh-TW)
    • Check newsletter signup and management works normally
      • Check locale changes are reflected in the Newsletter form
      • Check confirming subscription and updating prefs in a non-en-US local works fine
    • Check VPN signup is OK
    • Check releasenotes pages
      • Desktop
      • Developer/beta/
      • Nightly
      • Android
      • iOS
    • Check Contentful-sourced pages are OK - /products/vpn/resource-center/
    • Check WP-blog-sourced pages are OK - eg downpage on /careers there should be some blog posts pulled in
    • Check FF download is OK (but note that the bouncer/download URL on www-demo is 404ing if you use a non en-US lang code, but that's unrelated to these chanegs)
    • Check sitemaps are happy at /sitemap.xml and /{lang-code}/sitemap.xml
  • Pocket mode on https://www-demo5.tekcopteg.com

Copy link

codecov bot commented Mar 18, 2024

Codecov Report

Attention: Patch coverage is 97.26027% with 6 lines in your changes are missing coverage. Please review.

Project coverage is 76.45%. Comparing base (b8ee881) to head (1d3b288).

❗ Current head 1d3b288 differs from pull request most recent head f968cb9. Consider uploading reports for the commit f968cb9 to get more accurate results

Files Patch % Lines
bedrock/settings/__init__.py 0.00% 2 Missing ⚠️
bedrock/urls/mozorg_mode.py 60.00% 2 Missing ⚠️
bedrock/urls/pocket_mode.py 75.00% 1 Missing ⚠️
lib/l10n_utils/translation.py 94.11% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #14327      +/-   ##
==========================================
+ Coverage   75.72%   76.45%   +0.72%     
==========================================
  Files         145      147       +2     
  Lines        7878     7971      +93     
==========================================
+ Hits         5966     6094     +128     
+ Misses       1912     1877      -35     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@stevejalim stevejalim force-pushed the 14256-refactor-i18n-mechanism branch 7 times, most recently from 7617bf1 to 8a19328 Compare March 24, 2024 12:50
@stevejalim stevejalim added the Review: M Code review time: 1-2 hours label Mar 25, 2024
@stevejalim stevejalim changed the title WIP: Refactor i18n mechanism Switch Bedrock to using Django's own i18n mechanism, not our own one Mar 25, 2024
@stevejalim stevejalim changed the title Switch Bedrock to using Django's own i18n mechanism, not our own one Switch Bedrock to using Django's own i18n mechanism Mar 25, 2024
return locale
else:
return f"{parts[0]}-{parts[1].upper()}"
def _fix_case(lang_code):
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: This could maybe be merged with bedrock.base.i18n.normalize_langauge (which does more than this function) as long as we add a flag/option to limit behaviour to be the same as _fix_case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in support of that. The less code we have that does similar things the easier to understand.

@@ -187,7 +187,7 @@
url_test("/{m,{firefox/,}mobile}/faq/", "https://support.mozilla.org/products/mobile"),
# bug 885799, 952429
url_test("/projects/calendar/holidays.html", "https://www.thunderbird.net/calendar/holidays/"),
url_test("/en-US/projects/calendar/random/stuff/", "https://www.thunderbird.net/calendar/"),
url_test("/projects/calendar/random/stuff/", "https://www.thunderbird.net/calendar/"),
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: This is the only test I had to hack to make it fit. I think it's OK, particularly given the lower proportion of visits it probably gets

if path.lstrip("/").split("/")[0] in settings.SUPPORTED_NONLOCALES:
return False

return True
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: It would be nice to make this dynamically check the mozorg nonlocale_urls.py (and to move any other similar routes into a relevant nonlocale_urls.py file.

Note: this comment will make more sense after you've seen the changes to urls/mozorg_mode.py

LocalePrefixPattern(prefix_default_language=prefix_default_language),
list(urls),
)
]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: This is also how SUMO does it

# them to be for our lookup
lang, _partition, classifier = language.partition("-")
lang = lang.lower() # this part is _always_ lowercase

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: What follows could be refactored into something more easily readable, but waiting on feedback. Supporting the special cases could at least be moved into a small helper func

Return a locale code we support on the site using the
user's Accept-Language header to determine which is best. This
mostly follows the RFCs but read bug 439568 for details.
"""
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: (to self): reading this now, maybe we can just use Django's own helpers for this

2) If no redirect is needed, sets request.locale to be the normalized
lang code we've got from the URL

Querystrings are preserved in GET redirects.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick (non-blocking): This docstring should probably also mention the support for querystring-based language switching

with normalized_get_language():
with simplified_check_for_language():
return super().process_response(request, response)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise: Credit for the patcing approach goes to @escattone, who worked out how to do it for SUMO

bedrock/base/middleware.py Show resolved Hide resolved
@stevejalim stevejalim force-pushed the 14256-refactor-i18n-mechanism branch from 8a19328 to 72c281e Compare March 25, 2024 18:26
@stevejalim stevejalim marked this pull request as ready for review March 25, 2024 19:18
@alexgibson
Copy link
Member

alexgibson commented Mar 26, 2024

I have not pulled this locally or looked at any of the code, but I did do some spot checking on demo7 in mozorg mode, and generally everything seemed fine. Nice work! I did not test the full VPN sign up flow, but that seems perhaps a bit much for this PR as I'm not sure there would be any effect there from these lang changes?

  • Check the redirect from all lower case country codes to uppercase (example: en-us will redirect automatically to en-US. Ideally check for all settings.PROD_LANGUAGES
- [x] Check that explicit fallbacks/canonical languages are still in place (example: ja-JP-mac -> 'jp'; sv -> sv-SE; zh-HK -> zh-TW
- [x] Check that the language-switcher in the footer works, with and without JS
- [x] Check that the Accept-Language header value (ie Browser language) is successfully respected and Bedrock uses the given lang code if there is none in the path
- [x] Also check that picking a specific lang via the path (eg /de/about/) will get a German page, even if the Browser default language is set to something other than German
- [x] Confirm that three-letter lang codes (eg sco) work fine
  • Check special 404 Locale picker page is served at / when there is no accept-language header - @robhudson
  • Check sensitive lang-based redirects work (e.g. zh-Hant-TW -> zh-TW)
  • Check newsletter signup and management works normally
    • Check locale changes are reflected in the Newsletter form
    • Check confirming subscription and updating prefs in a non-en-US local works fine
  • Check VPN signup is OK
    • This is a lengthy process that requires a credit card, but I can’t imagine what this PR would have changed that effects anything here?
  • Check releasenotes pages
    • Desktop
    • Developer/beta/
    • Nightly
    • Android
    • iOS
  • Check Contentful-sourced pages are OK - /products/vpn/resource-center/
  • Check WP-blog-sourced pages are OK - eg downpage on /careers there should be some blog posts pulled in
  • Check FF download is OK (but note that the bouncer/download URL on www-demo is 404ing if you use a non en-US lang code, but that's unrelated to these chanegs)
  • Check sitemaps are happy at /sitemap.xml and /{lang-code}/sitemap.xml

Copy link
Member

@robhudson robhudson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots of good stuff here. I want to re-read when I have a chance and now drowning in Braze work. I should have more time tomorrow.

bedrock/base/middleware.py Show resolved Hide resolved
bedrock/base/middleware.py Show resolved Hide resolved
bedrock/mozorg/tests/__init__.py Show resolved Hide resolved
return locale
else:
return f"{parts[0]}-{parts[1].upper()}"
def _fix_case(lang_code):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in support of that. The less code we have that does similar things the easier to understand.

@stevejalim stevejalim force-pushed the 14256-refactor-i18n-mechanism branch from 72c281e to 95c5451 Compare April 3, 2024 14:01
Copy link
Member

@robhudson robhudson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work on a difficult change.

bedrock/base/i18n.py Outdated Show resolved Hide resolved
# Handle the non-JS language selection as a priority. Once we're switched
# the subsequent checks will clean up things more, if needed.
# Fixes https://github.com/mozilla/bedrock/issues/5931
lang_via_querystring = request.GET.get("lang", None)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would None be the default here, assuming request.GET quacks like a dict?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly, it doesn't quite quack like that. Querydict.get works the same way as QueryDict.__getitem__ which will blow up if no default is passed (docs)

bedrock/base/middleware.py Show resolved Hide resolved
bedrock/base/urlresolvers.py Show resolved Hide resolved
bin/run-tests.sh Show resolved Hide resolved
@stevejalim stevejalim force-pushed the 14256-refactor-i18n-mechanism branch from 95c5451 to 342335f Compare April 10, 2024 20:14
@stevejalim
Copy link
Collaborator Author

Thank you @robhudson and @alexgibson for the reviews. I've rebased on recent mains, including the change to Django 4.2 and we're still looking good, so I'm hoping to merge this as soon as we're back after the weekend

@stevejalim stevejalim force-pushed the 14256-refactor-i18n-mechanism branch 2 times, most recently from 2883809 to 1d3b288 Compare April 16, 2024 09:08
@a-kyne
Copy link

a-kyne commented Apr 16, 2024

I was able to check a few things and did not spot any problems:
Mozorg mode on https://www-demo7.allizom.org/
- [X] Check the redirect from all lower case country codes to uppercase (example: en-us will redirect automatically to en-US. Ideally check for all settings.PROD_LANGUAGES
- [ ] Check that explicit fallbacks/canonical languages are still in place (example: ja-JP-mac -> 'jp'; sv -> sv-SE; zh-HK -> zh-TW
- [X] Check that the language-switcher in the footer works, with and without JS
- [X] Check that the Accept-Language header value (ie Browser language) is successfully respected and Bedrock uses the given lang code if there is none in the path
- [X] Also check that picking a specific lang via the path (eg /de/about/) will get a German page, even if the Browser default language is set to something other than German
- [ ] Confirm that three-letter lang codes (eg sco) work fine
[X] Check special 404 Locale picker page is served at / when there is no accept-language header - @robhudson

  [X] Check Contentful-sourced pages are OK - /products/vpn/resource-center/
    Check WP-blog-sourced pages are OK - eg downpage on /careers there should be some blog posts pulled in

 [x] Check sitemaps are happy at /sitemap.xml and /{lang-code}/sitemap.xml

…zorg.urls

This includes splitting mozorg.urls into two groups so we only use
i18n_patterns with paths that need it. We've kept the main, localed, group's
name to be urls.py to reduce the chance of merge conflicts, as this piece
of work is likely to need rebasing against main quite a lot until it's ready.

The decision about which get prefixed and which do not is based on the
settings SUPPORTED_NONLOCALES and SUPPORTED_LOCALE_IGNORE that were used
in our former i18n machinery. At the time of committing, they still exist,
but both of should be able to be removed once this work is done (or at least
moved to power some tests)
…ector - we don't need it unless we explicitly ask for it
* Contentful-related tests needed the dev-urls config wrapping in URLpatterns
to make it work as per the real site; if we just specced dev_urls they did not
work because they were not i18n-patterned
…ow up and the former matches original behaviour better
…ng place for us

By default check_for_language() seeks gettext-compatible files in locale/ directories
in the project, but we don't have those because our l10n is handled using Fluent
strings in the www-l10n-team directory, cloned from their git repo.

So, instead, we just perform a light check to see whether a given lang is among the
langs that shold be available. The actual decision about whether we have have (enough)
Fluent strings available for that locale happens in lib.l10n_utils.render()
…nt route to making check_for_language work; this was the wrong direction
…header is present

This changeset ensures we retain the existing behaviour where a robot/spider
going to / with no accept-language header gets a 200 OK response but with a
locale-picker page, NOT the homepage.

The fix here was to avoid running Django's LocaleMiddleware when (and only
when) we detect the special case of a request for the root path and no
Accept-Language header - which is basically a call for the root page with
no way to determine the client's locale. (We don't use sessions in Bedrock
so Django will never find a language code in a session anyway)
@stevejalim stevejalim force-pushed the 14256-refactor-i18n-mechanism branch from 1d3b288 to f968cb9 Compare April 17, 2024 13:08
@stevejalim stevejalim merged commit 66ce747 into main Apr 17, 2024
9 checks passed
@stevejalim stevejalim deleted the 14256-refactor-i18n-mechanism branch April 17, 2024 15:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Review: M Code review time: 1-2 hours
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Refactor legacy, custom i18n mechanism to use core django i18n
4 participants