Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add full i18n support #1192

Merged
merged 20 commits into from
Feb 23, 2023
Merged

feat: Add full i18n support #1192

merged 20 commits into from
Feb 23, 2023

Conversation

jpmckinney
Copy link
Contributor

@jpmckinney jpmckinney commented Feb 16, 2023

I don't understand how packaging works now, but babel.cfg was previously included in MANIFEST.in.

It's only needed in development, so maybe no action is needed.

Fix #257

@jpmckinney jpmckinney mentioned this pull request Feb 16, 2023
3 tasks
@jpmckinney
Copy link
Contributor Author

@choldgraf Ready for review

Copy link
Collaborator

@choldgraf choldgraf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me - in my opinion we should try to merge it in relatively quickly, since it is a nice iterative improvement (and note we have docs about this setup at the PR below)

Three quick questions but I don't think any of them should block merge.

  • Could we .gitignore the .mo files and then generate them at package build times? Since those are programmatic ones, we could not check them in to our project, and put a build for them in our "build the documentation" contributor docs. That way the only files in the repository are the ones that humans edit.
  • If we do small things like change our HTML template structure, what do we do?. I noticed the docs mention that if you update the text you need to re-generate the .po files for each language. But those files also have hand-edited translations in them as well as specific line numbers. Are those updated automatically when you re-run pybabel?
  • Could we add jobs to our noxfile for the generation steps? I feel like that would be a nice way to make this easier for folks: https://pydata-sphinx-theme--1178.org.readthedocs.build/en/1178/community/i18n.html#changing-natural-language-text
  • Should we provide guidance for whether to include non-translateable text in translations? Given that it is common to have non-translateable words (like brand words such as GitHub) in elements, it might be helpful to explicitly note how to do this. For example you seem to use something like %(provider) around here but I don't know what this means or how to use it.

cc also @12rambau who I believe has also done a bit of translation work in Sphinx.

@12rambau
Copy link
Collaborator

Could we .gitignore the .mo files and then generate them at package build times?

I actually don't know pybabel very well so I don't know how it generate the .po file but we should not keep the .mo in the github for sure.

If we do small things like change our HTML template structure, what do we do?

The best way to proceed is to use an external tool to make the translation (I'm a big fan of crowdinbut Transifexalso has an open-source support). That way the translation platform keep tracks of what you have already translated and automatically update the other locales. here is an example of something I'm working on: https://crowdin.com/project/sepal-doc

Could we add jobs to our noxfile for the generation steps?

that would be neat. Even better would be a pre-commit hook.

Should we provide guidance for whether to include non-translateable text in translations

You can use parameters in the translated messages (i.e. "display (%number) image" ) and add rules for plurals. If you think it's needed then yes documentation would be welcome (I think a link to the .po formatting would be sufficient)

@jpmckinney
Copy link
Contributor Author

jpmckinney commented Feb 17, 2023

The MO files must be in the package for translation to work. Right now they will be - if you later write a way to build them at the packaging step, even better. But, as is, there is no harm in having them in version control.

If you change non-translatable text, you can run pybabel and it will update the line numbers. However, this step is optional - the line numbers are to inform the human translator, not to perform the translation.

If you change translatable text, pybabel will usually try to fuzzy match the new text with an existing translation. If so it’ll add a comment like #, fuzzy. Otherwise, the new text will simply be untranslated in the file.

I’ve used Transifex, but for a small project with a few dozen strings, it’s fine to just hand-edit the file. Up to the maintainers.

When you do some Jinja like {% trans %}Hello {{ name }}{% endtrans %}, it’s extracted as the plain Python format string Hello %(name)s, such that it can be used in both Jinja templates and Python code. The translated text must contain the %(name)s verbatim.

Full sentences and clauses must always be a single translatable string. Otherwise, you get “next page” translated as “suivant page” instead of “page suivante” etc (a real bug before this PR)

@choldgraf
Copy link
Collaborator

That is super helpful knowledge to share, thanks @jpmckinney . Any chance you could work that into the i18n documentation? If it's too hard to find the right place, feel free to just create a "Miscellaneous tips" section at the end and just copy/paste your response above into it. I was worried that the line numbers would mean we'd have to hand-edit them every time something shifted by one line.

re: .mo files being checked into git, I'm fine just tracking that in a separate issue if @12rambau doesn't have strong opinions 👍

@jpmckinney
Copy link
Contributor Author

The docs from the other PR can remove the warning at the top about things not being implemented - otherwise, all good!

@jpmckinney
Copy link
Contributor Author

Any chance you could work that into the i18n documentation?

Can we create a new issue? I might not get around to it right away, and it’s better to merge the feature. The docs aren’t specific to this project - just helpful tips on how pybabel and gettext work.

@choldgraf
Copy link
Collaborator

choldgraf commented Feb 17, 2023

OK I decided to give a quick pass at this, so just pushed a commit with a few updates:

  • Our noxfile.py now has a command for translate, which supports an extra verb for each of the commands we describe here. I documented that too.
  • I removed the .mo files from git history, and added a script to our webpack configuration to generate the .mo files when the theme is compiled. This ensures that they'll be included any time the theme is built (similar to how we treat other static assets). This also means adding pybabel as a dependency which I think is reasonable?
  • I also added sudo apt-get install gettext to our publish job, because I found in the book theme that for some reason this wasn't installed normally and this bit us in the book theme.
  • I added to the documentation from @jpmckinney 's information above.

I'd suggest that @12rambau takes a quick pass, and then we cut a release of the theme, then merge this quickly so that we have time to play around with it.

@choldgraf
Copy link
Collaborator

I'm not sure why the macos and the windows builds aren't working. I suspect that this is because of a dependency that isn't installed somewhere, or a path issue. But I can't figure out what it could be. @12rambau any ideas?

Given that we always build the package for this theme in linux, we could skip the translation test on windows + macosx so that this doesn't become a nagging problem...wdyt?

Copy link
Contributor Author

@jpmckinney jpmckinney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docs fixes.

docs/community/topics/i18n.rst Outdated Show resolved Hide resolved
docs/community/topics/i18n.rst Outdated Show resolved Hide resolved
docs/community/topics/i18n.rst Outdated Show resolved Hide resolved
docs/community/topics/i18n.rst Outdated Show resolved Hide resolved
docs/community/topics/i18n.rst Outdated Show resolved Hide resolved
docs/community/topics/i18n.rst Outdated Show resolved Hide resolved
docs/community/topics/i18n.rst Outdated Show resolved Hide resolved
docs/community/topics/i18n.rst Outdated Show resolved Hide resolved
@jpmckinney
Copy link
Contributor Author

The windows and macos tests were passing in my last commit - not sure why they are failing now (other than the MO files not being committed).

babel.cfg Outdated Show resolved Hide resolved
Comment on lines +2 to +5
[jinja2: **.html]
encoding = utf-8
ignore_tags = script,style
include_attrs = alt title summary placeholder
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

current (6.1.x) sphinx has this:

[jinja2: **/themes/**.html]
encoding = utf-8
ignore_tags = script,style
include_attrs = alt title summary

IDK though which is correct for us?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have strong opinions on this, I suggest we just pick one and see if it has any tangible impact, then change it later if we want. IMO unless we have a strong rationale for excluding placeholder, we should just keep it 🤷

src/pydata_sphinx_theme/locale/en/LC_MESSAGES/sphinx.po Outdated Show resolved Hide resolved
src/pydata_sphinx_theme/locale/es/LC_MESSAGES/sphinx.po Outdated Show resolved Hide resolved
src/pydata_sphinx_theme/locale/fr/LC_MESSAGES/sphinx.po Outdated Show resolved Hide resolved
Co-authored-by: Daniel McCloy <dan@mccloy.info>
Co-authored-by: James McKinney <26463+jpmckinney@users.noreply.github.com>
Copy link
Collaborator

@choldgraf choldgraf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I swapped the spanish / french sentences that @drammock mentioned. I suspect that was a one-off mistake because it was the same sentence in both languages.

I also accepted all of @jpmckinney's suggested docs edits.

Given that the translation compilation will always happen via Linux for the forseeable future, I suggest that we:

  1. Skip the HTML translation test on MacOS and Windows jobs
  2. Make a release of this theme
  3. Merge this PR and iterate from there

What do others think?

Comment on lines +2 to +5
[jinja2: **.html]
encoding = utf-8
ignore_tags = script,style
include_attrs = alt title summary placeholder
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have strong opinions on this, I suggest we just pick one and see if it has any tangible impact, then change it later if we want. IMO unless we have a strong rationale for excluding placeholder, we should just keep it 🤷

@jpmckinney
Copy link
Contributor Author

Skip the HTML translation test on MacOS and Windows jobs

Tests pass now.

Copy link
Collaborator

@choldgraf choldgraf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice ! OK so looking back over the code, given that this PR is roughly strictly an enhancement / bugfix of previously broken translation infrastructure, and given that it doesn't change anything in the HTML other than providing translation strings, what do @drammock or @12rambau of getting this in the current release cycle instead of waiting for the next? I'm +1 on merge either way (before or after the next release)

Copy link
Collaborator

@12rambau 12rambau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm +1 merging it whenever. that will be an excellent base to work on.

I have one last comment to make though. I realized that we are using nox in the test only for the translation build. My guess is that we avoid it because it make a virtual env within a very controlled env. What about simply adding the code run in nox instead ?

@choldgraf
Copy link
Collaborator

To answer part of your question, the only reason I avoid using nox in the GitHub actions tests themselves is because I haven't figured out how to cache the environment properly between builds, but i know that some projects do re-use nox for all their cicd. That way when you run it locally you are almost exactly replicating the environment that is used in the actions. IMO there's nothing inherently wrong with using nox inside of actions (but I'm fine on whatever implementation we want to go with here if others prefer to repeat the code by hand in the action)

Copy link
Collaborator

@12rambau 12rambau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then let's go with that and open a new issues about moving toward nox, I think it would make our lives easier if everything is managed the same way from one file (I think we can take inspiration from here: https://github.com/pypa/packaging/blob/main/.github/workflows/lint.yml)

@drammock
Copy link
Collaborator

getting this in the current release cycle

Ok with me.

@choldgraf
Copy link
Collaborator

Ok cool - then I'll merge this and a couple minor fix PRs and cut another RC (because why not and people continue to find tiny bugfixes etc so I figure it's a reasonable thing to keep with the pre-releases as long as that's happening 😅)

@12rambau
Copy link
Collaborator

Instead of doing rc why don't we use patches (0.13.xx) ? so that everyone relying on "pydata-sphinx-theme" get the latest devs

@choldgraf
Copy link
Collaborator

choldgraf commented Feb 23, 2023

@12rambau you mean just make a full release now? I would be fine with that as well, just felt like an RC is safer if we keep adding new things. I also want to double-check that the translation files are actually in the .whl file, because we've run into issues in the book theme where that didn't happen and had to be debugged. I'll make an RC now but if you'd like to go for a full release, I am +1 on it!

@choldgraf choldgraf merged commit 8598864 into pydata:main Feb 23, 2023
12rambau added a commit to 12rambau/pydata-sphinx-theme that referenced this pull request Mar 2, 2023
* Fix extra whitespace in sidebars (pydata#1115)

* Fix extra whitespace in sidebars

* Searchbox

* Update src/pydata_sphinx_theme/__init__.py

Co-authored-by: Daniel McCloy <dan@mccloy.info>

* make test pass

* Fix template filter to remove empty files

* ABlog in template test

* Move clear search button to primary sidebar

* Move search clear button to top of article

Co-authored-by: Daniel McCloy <dan@mccloy.info>

* FIX: Use logo_url instead deprecated logo in theme (pydata#1094) (pydata#1097)

resolves pydata#1094

* ENH/MAINT: avoid overwriting the HtmlTranslator (pydata#1105)

Co-authored-by: Chris Holdgraf <choldgraf@gmail.com>
Fix pydata#143
Fix pydata#94

* fix: align sidebar sliding with the buttons (pydata#1123)

* fix: aline the sidebar sliding with the buttons

* build: force test to run on all platform
if one platform is failing we cannot see if it's platform related as the other were closed

* fix: use correct path for documentation logo

* MAINT: Improve font sizing (pydata#1129)

Fix pydata#1001

* MAINT: Refactor workflows to reduce test dependencies (pydata#1136)

* MAINT: update prerelease workflow (pydata#1140)

* ABlog: Updates for new HTML structure (pydata#1118)

* ABlog: Updates for new HTML structure

* Update templates for latest release

* Bump to dev0

* Standardize logo image behavior between Sphinx and this theme (pydata#1132)

Co-authored-by: Daniel McCloy <dan@mccloy.info>
Co-authored-by: Chris Holdgraf <choldgraf@gmail.com>
Co-authored-by: Chris Holdgraf <choldgraf@berkeley.edu>

* 0.13.0rc1

* Build(deps): Bump http-cache-semantics from 4.1.0 to 4.1.1 (pydata#1154)

* DOC: Use only shield.io for badges in README (pydata#1152)

* Copyright semicolon (pydata#1160)

* FIX: Flex behavior should shrink header items instead of brand (pydata#1158)

Co-authored-by: Chris Holdgraf <choldgraf@gmail.com>
Fixes pydata#1143

* STYLE: lint the documentation with Doc8 (pydata#1150)

Fix pydata#1139

* Add test for internationalization and translations (pydata#1138)

* FIX: Javascript incorrect check for variable (pydata#1166)

* MAINT: update pypi classifiers (pydata#1153)

Fix pydata#1106

* remove emoji from landing page (pydata#1151)

* add fa icons instead of emoji

* remove fix for emojis

* use markup for readme emojis

* use pst-color-primary instead of sd-text-primary

* make our semantic colors available as classes

* try again

---------

Co-authored-by: Daniel McCloy <dan@mccloy.info>

* FIX: Narrow scope of style rule for GitHub & GitLab link shortening (pydata#1167)

Fixes pydata#1156

* ENH: Add breadcrumbs to article header (pydata#1142)

* ENH: Add breadcrumbs to article header

* Update src/pydata_sphinx_theme/theme/pydata_sphinx_theme/components/breadcrumbs.html

Co-authored-by: Tania Allard <taniar.allard@gmail.com>

* More fixes

* Improving nested page behavior

* Documenting breadcrumbs

* Update src/pydata_sphinx_theme/assets/styles/components/_breadcrumbs.scss

Co-authored-by: Rambaud Pierrick <12rambau@users.noreply.github.com>

* Breacrumbs have link color

---------

Co-authored-by: Tania Allard <taniar.allard@gmail.com>
Co-authored-by: Rambaud Pierrick <12rambau@users.noreply.github.com>

* Degrade gracefully when JavaScript is disabled (pydata#1146)

* Fix header vertical spacing and jupyter-sphinx cells (pydata#1164)

Fixes undefined

* RLS: v0.13.0rc2 (pydata#1170)

* DOCS: admonition customization (pydata#1155)

* first draft of the admonition customization

* typo in doc link

* flesh out admon. customization example; DRY

* use :code:rst instead of :literal:

* Update docs/_static/custom.css

---------

Co-authored-by: Daniel McCloy <dan@mccloy.info>

* Fix article header CSS (pydata#1171)

* “Edit this page” → “Edit on GitHub/GitLab/Bitbucket” (pydata#1177)

* “Edit this page” → “Edit on GitHub/GitLab/Bitbucket”

Fixes pydata#1172

* Add tests

* Fix typo

* Properly handle default_mode=auto when writing logos (pydata#1183)

We used to only defaulting to the light version when `default_mode` was
undefined, not when it was explicitly set to `auto`. We also need to
handle the latter, as the new test shows.

Closes pydata#1180

Co-authored-by: Jérémy Bobbio (Lunar) <lunar@softwareheritage.org>

* fix: correctly add DOM listeners (pydata#1179)

fix adding DOM listeners

* maint: update GitLab URL tests (pydata#1186)

Co-authored-by: JoerivanEngelen <joerivanengelen@hotmail.com>

* Standardize template structure in more sections (pydata#1184)

* Standardize template structure in all sections

* Fixing footer behavior

* Update docs/user_guide/layout.rst

Co-authored-by: Daniel McCloy <dan@mccloy.info>

* Remove use of id= as much as possible

---------

Co-authored-by: Daniel McCloy <dan@mccloy.info>

* maint: remove sphinx-panels support; remove deprecated config shims (pydata#1188)

* Minor style improvements to ablog (pydata#1185)

* RLS: v0.13.0rc3

* dev0

* FIX: Some style bugs (pydata#1191)

* FIX: Some style bugs

* Move link word wrap to global rule

* DOCS: Add internationalization instructions (pydata#1178)

Co-authored-by: James McKinney <26463+jpmckinney@users.noreply.github.com>

* Refactor contributing docs to be more modular (pydata#1173)

* Dev0

* Fix github gitlab brand (pydata#1194)

* RLS: v0.13.0rc4

* FIX: Make wide equations scroll (pydata#1196)

* Fix math scrollbars for realz (pydata#1198)

* Fix math scrollbars for realz

* Update src/pydata_sphinx_theme/assets/styles/content/_math.scss

* Update src/pydata_sphinx_theme/assets/styles/content/_math.scss

* copy_logo_images: do not render dynamic Sphinx template content (pydata#1204)

* copy_logo_images: do not render dynamic Sphinx template content when copying logo image files

* Update src/pydata_sphinx_theme/__init__.py

---------

Co-authored-by: Chris Holdgraf <choldgraf@gmail.com>

* Add conditional check for last-updated template (pydata#1201)

* Add conditional check for last-updated template

* Whitespace

* Properly set configuration with app.builder.theme_options (pydata#1199)

* Properly set configuration

* Dict to values

* Making it explicit in a function

* Better name

* Fix test

* Foot

* Revert complex config set

* Clarify docs

* Use CSS transform for skip link (pydata#1206)

* feat: Add full i18n support (pydata#1192)

Co-authored-by: Daniel McCloy <dan@mccloy.info>
Co-authored-by: James McKinney <26463+jpmckinney@users.noreply.github.com>
Co-authored-by: Chris Holdgraf <choldgraf@berkeley.edu>
Co-authored-by: Chris Holdgraf <choldgraf@gmail.com>

* Dev0

* FIX: Remove icon links component when no icon links given (pydata#1209)

* RLS: 0.13.0rc5

* dev0

* FIX: Get theme options in a more robust way (pydata#1214)

* RLS: v0.13.0rc6

* Make heading-style use the font-weight-heading value (pydata#1213)

* Make heading-style use the font-weight-heading value

* Separate font-weight setting for content headers and admonitions

* Flip var to be consistent with --pst-font-weight-heading instead

* RLS: v0.13.0

* bump: dev0

* DOCS: Remove <p> from announcement sample text (pydata#1223)

---------

Co-authored-by: Chris Holdgraf <choldgraf@berkeley.edu>
Co-authored-by: Daniel McCloy <dan@mccloy.info>
Co-authored-by: Nico Albers <nico.albers@aboutyou.com>
Co-authored-by: Chris Holdgraf <choldgraf@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Brendan Heberlein <bheberlein@wisc.edu>
Co-authored-by: Tania Allard <taniar.allard@gmail.com>
Co-authored-by: Lunar <lunar@debian.org>
Co-authored-by: Jean Abou-Samra <jean@abou-samra.fr>
Co-authored-by: Jérémy Bobbio (Lunar) <lunar@softwareheritage.org>
Co-authored-by: JoerivanEngelen <joerivanengelen@hotmail.com>
Co-authored-by: James McKinney <26463+jpmckinney@users.noreply.github.com>
Co-authored-by: James Addison <55152140+jayaddison@users.noreply.github.com>
Co-authored-by: Veronica Berglyd Olsen <1619840+vkbo@users.noreply.github.com>
@12rambau
Copy link
Collaborator

@jpmckinney I'm asking here because I'm no expert on pybabel and I would need some insight from your side before doing a small refactoring of our actions (related to #1292).

  • in the update command, you set up domain to "sphinx", any reason not to keep the default "message" ?
  • in the extract function you have added multiple keywords (_, __, l_ and lazy_gettext) what are they used for ? why cannot we rely on the defaults ?

@jpmckinney jpmckinney deleted the babelv2 branch April 14, 2023 16:36
@jpmckinney
Copy link
Contributor Author

I just copied Sphinx - we can probably set the domain to anything we want: https://github.com/sphinx-doc/sphinx/blob/2329fdef8c20c6c75194f5d842b8f62ebad5c79d/setup.cfg#L22


Unless --no-default-keywords is specified, then keywords specified with -k are appended: https://github.com/python-babel/babel/blob/56071c90116e6b9ebbb62ac072bcf032fc6987cb/babel/messages/frontend.py#L415-L417

The default keywords are here: https://github.com/python-babel/babel/blob/56071c90116e6b9ebbb62ac072bcf032fc6987cb/babel/messages/extract.py#L77-L88

pydata-sphinx-theme doesn't use __, l_ or lazy_gettext, and _ is already a default keyword, so we can remove this configuration.

I had simply copied Sphinx's configuration: https://github.com/sphinx-doc/sphinx/blob/2329fdef8c20c6c75194f5d842b8f62ebad5c79d/setup.cfg#L18

@12rambau
Copy link
Collaborator

perfect, thanks a lot.

I think I'll move it toward default behaviour to reduce the complexity of the command call

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Translations for page elements
4 participants