
Draft: Internationalization for fortran-lang #201

Closed
wants to merge 25 commits into from

Conversation

awvwgk
Member

@awvwgk awvwgk commented Feb 9, 2021

This patch introduces the infrastructure for localization support of the webpage using the jekyll-multiple-languages-plugin. You can check out my i18n branch locally and add a new language as described in #197 (comment) to play around with the feature. Feedback on this patch is more than welcome.

  • allow replacing the page body with a localized version
  • have markup only in main tree while translations provide content for page bodies
  • localize webpage title
  • localize all keywords in the main tree
  • localize all keywords in _include
  • localize all keywords in _layouts
  • localize permalinks
  • localize page names
  • avoid copying assets in every language subtree
  • include language navigation in footer
  • add documentation for translators

This PR does not add localization to any language yet, a build preview will only show if nothing is broken.
This pull request includes the Spanish translation provided by @aslozada and the French translation provided by @vmagnin. The site can be previewed at https://fortran-lang.org/pr/201/.

@awvwgk awvwgk added the i18n Related to internationalisation and translations label Feb 9, 2021
@certik
Member

certik commented Feb 9, 2021

How does the translation work? I couldn't figure it out. Do you just translate the text, or do you need to copy the HTML for each language, which would be redundant?

@awvwgk
Member Author

awvwgk commented Feb 10, 2021

@certik I added a brief guide here describing the basic process; it still needs a bit of polishing, though.

The HTML body, including the markup, is currently just copied to have something to start working with and to try out the plugin. I'll look into separating the markup from the translation later today.

Member

@arjenmarkus arjenmarkus left a comment


By the looks of it, the file and folder names of the language should follow the ordinary abbreviations, like "de" for German, "fr" for French etc. Perhaps this can be stressed.

@certik
Member

certik commented Feb 10, 2021

In the SymPy webpage, only the text itself is translated, all the markup is reused from the parent. So that way you are free to change the layout and all translations will still work. Furthermore, if you change a text in the English version, the translated version will simply show the English text until it is translated. So translations are always up to date in both layout and text (although parts of the text might not always be translated until somebody updates it).

Unless we can figure out something similar, it will become a nightmare to make any changes to the English page, as they will not be reflected in the translations, in either layout or text.

@vmagnin
Member

vmagnin commented Feb 11, 2021

I have translated the keywords in my _i18n/fr.yml file.
As far as I can test, everything is OK except:

  • the packages: section has no effect on the "Packages" page.
  • The sentence "RSS clients can follow the RSS feed" (homepage) cannot be translated in either the fr.yml file or the fr/index.html file.

@awvwgk
Member Author

awvwgk commented Feb 11, 2021

@vmagnin Thanks for testing, I noticed a few missing keywords while skimming through the code yesterday, but found no time to push another patch before midnight.

One RSS client sentence was difficult to localize because it uses Liquid templating internally to generate some links, so the general setup of this sentence and its link generation needs reconsideration first. This might have been the one.

@vmagnin
Member

vmagnin commented Feb 12, 2021

@awvwgk,
The "Localize leftover keywords" commit fixed the RSS problem.

Concerning the "Packages" page, the packages: section works for the subpages (for example packages/fpm), but not for the main packages page.

@awvwgk
Member Author

awvwgk commented Feb 12, 2021

This might be partly due to the page titles, which are not localized yet but taken directly from the front matter of the main tree. They have to be handled differently for some reason.

@vmagnin
Member

vmagnin commented Feb 14, 2021

Hi Sebastian @awvwgk

My translations of the homepage (_i18n/fr/index.html) and of the keywords (_i18n/fr.yml) are ready. I have not yet committed them to my fork. Before I do, please let me know:

  • Should I commit them directly in my _i18n branch or create a _i18n_fr branch?
  • Should I make a Pull Request? Or should I wait?
  • In the first case, should I make a Pull Request to your fork, or to the original project https://github.com/fortran-lang/fortran-lang.org ?

I will now begin translating the other main sub-pages, at a slower pace.

@LKedward
Member

Thanks for spearheading this @awvwgk and @vmagnin, this is a big project and is looking promising! However I want to second Ondřej's concerns about layout duplication and the subsequently increased workload for future changes — have you worked out a way to avoid this at all?

@awvwgk
Member Author

awvwgk commented Feb 14, 2021

@vmagnin First, that's really amazing. It also means I better hurry up here ;).

Let's try to keep things simple. Git handles feature branches built on unmerged feature branches well enough, but the GitHub UI is not great for reviewing this kind of PR. Since your French translation lives in a separate subtree, though, this might work out just fine.

If you are fine with rebasing your changes on my i18n branch as I continue building it up, I see no problem with creating a separate (draft) PR on the fortran-lang repo, which is probably more visible to the community than my fork. This would allow us to gather feedback on the translation and the infrastructure separately, provided we coordinate correctly here.

@awvwgk
Member Author

awvwgk commented Feb 14, 2021

> However I want to second Ondřej's concerns about layout duplication and the subsequently increased workload for future changes. Have you worked out a way to avoid this at all?

Yes, I have found a way to separate the content from the markup, just have to push the patch here.

@awvwgk
Member Author

awvwgk commented Feb 14, 2021

The patch is in. Basically, we can exploit the fact that no document in the _i18n tree generates a page, and scatter the content across separate files that are included with the translate_file function at the correct places.

@awvwgk
Member Author

awvwgk commented Feb 14, 2021

I think I got everything now, but more eyes will help to catch oversights and mistakes. Let's have a #build_preview for this PR.

@awvwgk awvwgk marked this pull request as ready for review February 14, 2021 15:54
@github-actions

This PR has been built with Jekyll and can be previewed at: https://fortran-lang.org/pr/201/

@vmagnin
Member

vmagnin commented Feb 14, 2021

Thanks @awvwgk
It seems you did a great job in your last commits! I will git rebase tomorrow and take my time to carefully merge my work with yours, and then complete the translations of the keywords in the .yml file.
The "preview" feature is very interesting!

@aslozada
Member

> I think I got everything now, but more eyes will help to catch oversights and mistakes. Let's have a #build_preview for this PR.

I hope to help with these eyes. I think it's great that information about Fortran can be released in other languages.

The discourse tag in the global section seems to be broken. The tags mailing_list, rss_feed and the others on the right side of the webpage work correctly.

@github-actions

This PR has been built with Jekyll and can be previewed at: https://fortran-lang.org/pr/201/

@awvwgk
Member Author

awvwgk commented Feb 20, 2021

I would suggest keeping the JavaScript changes for a separate patch.

Member

@LKedward LKedward left a comment


Fantastic effort, everyone! The proposed internationalisation mechanism looks good and appears to work well. I spotted just one broken link: the news archive button at the bottom of the news page.

Finally, what are the guidelines for future PRs that change existing content or add new pages?

@awvwgk
Member Author

awvwgk commented Feb 21, 2021

> Finally, what are the guidelines for future PRs that change existing content or add new pages?

If localized content is changed, this has to be documented somehow. If one of the content blocks of the main page is changed, all localized versions should be updated. I suggest opening an issue with the i18n label that shows the previous and the new version and asks for updates of the translated content (maybe with a checklist for all available languages to keep track of the progress).

Added content is less problematic, because in the translate_file / tf case it is filled in from the English version by default. It only requires attention when new keywords are added to the language file (translate / t case): the other language files then need the English version or automatically translated stubs filled in (perhaps together with a FIXME or TODO note).
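
As an illustration of the keyword case, the following minimal Python sketch lists the keys that exist in the English keyword table but are still missing from a translation. The nested dictionaries stand in for the parsed contents of files such as _i18n/en.yml and _i18n/fr.yml; the keys, the values, and the helper names flatten_keys and missing_keys are hypothetical and not part of this PR.

```python
from typing import Any, Dict, List, Set


def flatten_keys(tree: Dict[str, Any], prefix: str = "") -> Set[str]:
    """Return the dotted paths of all leaf entries in a nested mapping."""
    keys: Set[str] = set()
    for name, value in tree.items():
        path = f"{prefix}.{name}" if prefix else name
        if isinstance(value, dict):
            keys |= flatten_keys(value, path)
        else:
            keys.add(path)
    return keys


def missing_keys(reference: Dict[str, Any], translation: Dict[str, Any]) -> List[str]:
    """Keys present in the reference language but absent from the translation."""
    return sorted(flatten_keys(reference) - flatten_keys(translation))


# Hypothetical excerpts of the English and French keyword tables.
en = {"nav": {"learn": "Learn", "packages": "Packages"}, "footer": {"rss": "RSS feed"}}
fr = {"nav": {"learn": "Apprendre"}, "footer": {"rss": "Flux RSS"}}

print(missing_keys(en, fr))  # ['nav.packages']
```

A check along these lines could produce the list of stubs that need a FIXME or TODO note in each language file.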

@awvwgk awvwgk linked an issue Feb 21, 2021 that may be closed by this pull request
@aslozada
Member

@awvwgk Would the changes be added by pull request or by a procedure like that used for the French and Spanish preliminary versions?


I suggest adding the JS components (labels) and the data in the _data directory to the checklist of the localization infrastructure patch.

@awvwgk
Member Author

awvwgk commented Feb 21, 2021

> @awvwgk Would the changes be added by pull request or by a procedure like that used for the French and Spanish preliminary versions?

Either way would work; I can't tell which will be best in practice.

> I suggest adding the JS components (labels) and the data in the _data directory to the checklist of the localization infrastructure patch.

I opened #211 for the _data entries and #212 for the JavaScript localization.

@aslozada
Member

aslozada commented Feb 21, 2021

> ... maybe with a checklist for all available languages to keep track of the progress ...

Maybe something like this:

Checklist to keep track of the translation into name_language

Every section corresponds to an item of the navigation menu.

  • Main page
  • Features
  • FAQ
  • Make Fortran Better
  • Join Us!
  • Learn
  • Getting Started
  • Mini-book Tutorials
  • Other Resources
  • Fortran Compilers
  • Open Source Compilers
  • Commercial Compilers
  • Discontinued
  • Note
  • Community
  • Fortran-lang Community Projects
  • Get Involved
  • Fortran-lang Contributors
  • Packages
  • Find a Package
  • Browse Packages by Category

@awvwgk
Member Author

awvwgk commented Feb 22, 2021

How do we want to proceed with this patch? I would love to see it merged sooner rather than later, since it touches basically every file in the repository and is therefore very likely to collide with most other patches and PRs.

@certik
Member

certik commented Feb 22, 2021

I think we might be able to use this work to reach the state I think we should be in. However, I am afraid we are not there yet. For example, I can see a file like this in this PR:

<h3> <i data-feather="edit"></i>
  Guía del colaborador</h3>
<p>
¿Desea contribuir con código y contenido?
Consulte las guías para colaboradores en cada repositório para 
obtener información sobre el flujo del trabajo y de las
prácticas recomendadas.
</p>
<ul>
  <li> <a href="https://github.com/fortran-lang/stdlib/blob/master/WORKFLOW.md" target="_blank" rel="noopener">Guía del colaborador para stdlib</a> </li>
  <li> <a href="https://github.com/fortran-lang/fpm/blob/master/CONTRIBUTING.md" target="_blank" rel="noopener">Guía del colaborador para fpm</a> </li>
  <li> <a href="https://github.com/fortran-lang/fortran-lang.org/blob/master/CONTRIBUTING.md" target="_blank" rel="noopener">Guía del colaborador para fortran-lang.org</a> </li>
</ul>

And that, I strongly believe, is not the way to do it, because it replicates the markup in every translation. The minute we change the layout in the English version, all the translations will be wrong. It should not be the job of translators to fiddle with markup and try to reproduce the English one. Neither should it be the job of contributors who improve the layout to do so in 30+ languages.

I strongly believe the way to do it is to translate just the sentences / paragraphs / words and use an automated system that immediately shows, in each translation, which sentences got updated and must be translated. Markup should be correct right away.

Note: You can look at the SymPy webpage where we do exactly what I described.
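
The behaviour described above can be sketched as a toy model in Python. This is not how the SymPy website is implemented; it only assumes that each translated paragraph is stored under a digest of the English paragraph it was translated from, so that an edited English paragraph no longer matches and is shown in English until someone retranslates it. The function names digest and render and all sample texts are hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple


def digest(text: str) -> str:
    """Stable fingerprint of an English source paragraph."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def render(english: List[str], translations: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Return the page text and the English paragraphs still awaiting translation.

    `translations` maps digest(english_paragraph) to the translated paragraph.
    """
    page: List[str] = []
    pending: List[str] = []
    for paragraph in english:
        translated = translations.get(digest(paragraph))
        if translated is None:
            page.append(paragraph)      # fall back to the English text
            pending.append(paragraph)   # and report it to the translators
        else:
            page.append(translated)
    return page, pending


# Hypothetical content: the French catalog was made from the old English text.
old = ["Fortran is fast.", "Join us!"]
fr = {digest(old[0]): "Fortran est rapide.", digest(old[1]): "Rejoignez-nous !"}

# The first English paragraph is edited afterwards.
new = ["Fortran is fast and portable.", "Join us!"]
page, todo = render(new, fr)
print(page)  # ['Fortran is fast and portable.', 'Rejoignez-nous !']
print(todo)  # ['Fortran is fast and portable.']
```

Because the layout lives only in the English page and the catalog holds plain text, a layout change never touches the translations in this model.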

@awvwgk
Member Author

awvwgk commented Feb 22, 2021

SymPy is using a gettext based solution, which I would prefer as well. There are two discontinued jekyll plugins (https://github.com/Stonelinks/jekyll-gettext-plugin and https://github.com/ruby-gettext/jekyll-task-i18n) which can use po-files, but I'm not too keen to use discontinued software here.

@LKedward
Member

> And that I strongly believe is not the way to do it, because it is replicating the markup in every translation.

I agree; it looks to me as though the duplicated markup here is isolated to a few tf translation files. Presumably we can eliminate duplicated markup from this PR simply by removing markup from the tf files or by replacing them with translation expressions as used everywhere else. Is this not correct @awvwgk, or am I missing something? I'd like to understand before we write off this PR to search for other solutions.

@awvwgk
Member Author

awvwgk commented Feb 22, 2021

The main issue with the plugin used here is that the t command gives us no safe fallback, only a silent failure. The tf option is safer, as it falls back to the default language, yet it requires separate files. The currently chosen structure is the smallest acceptable chunk size in my opinion, because the default language, English, is handled like any other translated language. Moving all the content to the translation file is also unacceptable in my opinion, because we would end up with a single yml file for the complete content of the English main tree.
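
The difference between the two lookups can be pictured with a toy model in Python. It is not the plugin's code: the tables, the paths, and the functions t and tf below are hypothetical stand-ins, and the empty string returned for a missing key is only an assumption used to represent the silent failure mentioned above, not a statement about what the plugin actually renders.

```python
DEFAULT_LANG = "en"

# Hypothetical keyword tables, standing in for _i18n/<lang>.yml.
KEYWORDS = {
    "en": {"title": "The Fortran Programming Language", "news": "News"},
    "fr": {"title": "Le langage de programmation Fortran"},
}

# Hypothetical content files, standing in for _i18n/<lang>/<path>.
FILES = {
    "en": {"index/intro.md": "Fortran is a high-performance language."},
    "fr": {},
}


def t(key: str, lang: str) -> str:
    """Keyword lookup without fallback: a missing key silently yields nothing."""
    return KEYWORDS.get(lang, {}).get(key, "")


def tf(path: str, lang: str) -> str:
    """File lookup that falls back to the default language when untranslated."""
    localized = FILES.get(lang, {})
    if path in localized:
        return localized[path]
    return FILES[DEFAULT_LANG][path]


print(repr(t("news", "fr")))       # '' : the page silently loses the word
print(tf("index/intro.md", "fr"))  # Fortran is a high-performance language.
```

In this model every new keyword must be added to all language tables at once, whereas a new content file only needs to exist in the default language.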

gettext has the great advantage that we are actually replacing strings with a safe fallback. It allows writing the page in the default language as usual and adding a translation on top afterwards with a po-file, in an automatic and safe way. But we don't have actively maintained gettext support for Jekyll.
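
The safe fallback of gettext can be seen with the gettext module from the Python standard library, used here only as a stand-in since no Jekyll integration is assumed. The domain name "website" and the sample sentence are hypothetical. With fallback=True, gettext.translation returns a NullTranslations object when no compiled catalog is found, and that object returns every message unchanged.

```python
import gettext
import tempfile

# An empty directory guarantees that no French catalog can be found.
with tempfile.TemporaryDirectory() as empty_localedir:
    catalog = gettext.translation(
        "website",                  # hypothetical domain name
        localedir=empty_localedir,
        languages=["fr"],
        fallback=True,              # return NullTranslations instead of raising
    )

message = "Fortran is a high-performance language."

# Without a translation, the original string is returned as is.
print(catalog.gettext(message))  # Fortran is a high-performance language.
```

The source string doubles as the lookup key, so an untranslated or newly edited sentence simply appears in the default language.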

@LKedward
Member

Thanks for explaining; I see why the gettext approach would be preferable. It is a shame that those plugins have been discontinued.

@awvwgk
Member Author

awvwgk commented Feb 22, 2021

I would claim this patch is as far as we can get with Jekyll, unless we want to start maintaining a plugin ourselves for this purpose or switch to a different static site generator with better internationalization support.

I already checked Hugo as a simple replacement for Jekyll with built-in internationalization, but it basically offers the same solution we have in the patch here. See gohugoio/hugo#1744 (comment) for why Hugo is not using gettext and the reasoning behind this decision.

@certik
Member

certik commented Feb 22, 2021

Does Jekyll allow translating sentence by sentence? It will be really tough for translators to figure out which parts of the English version got updated.

@awvwgk
Member Author

awvwgk commented Mar 5, 2021

I think this PR is stuck at the moment. I won't pursue this particular approach further, which doesn't mean I will stop searching for a solution for the localization of the webpage. The translations of @aslozada and @vmagnin won't be lost with this PR, merely delayed until I have figured out a better solution.

Anyone participating in the discussion here is invited to join the search for a suitable static site generator with better localization support at #89.

@awvwgk awvwgk added the wontfix This will not be worked on label Mar 5, 2021
@vmagnin
Member

vmagnin commented Mar 22, 2021

I have just committed the French translation of the compilers.md page in my fork, corresponding to the latest 2021-03-06 English version:
https://github.com/vmagnin/fortran-lang.org/commits/i18n/_i18n/fr/compilers.md
It was quite difficult to translate, not only because there are a lot of technical details but also because most paragraphs are written in a commercial style.

@certik certik changed the title Internationalization for fortran-lang Draft: Internationalization for fortran-lang Apr 18, 2021
@awvwgk awvwgk closed this Dec 1, 2021
@awvwgk
Member Author

awvwgk commented Dec 1, 2021

I guess it is time to close this topic. We can revive this branch from 1fb9900 to get the wonderful translations already contributed here. For now I'll focus on building the fpm documentation to explore how far we can get with Sphinx. Maybe we can transfer some of the knowledge and experience from there to rebuild our webpage with support for translations in the future. Having a multilingual Fortran homepage is still high on my wish list.

@awvwgk awvwgk deleted the i18n branch December 1, 2021 21:43
@awvwgk
Member Author

awvwgk commented Dec 17, 2021

#delete_preview

@github-actions

The preview build for this PR has now been deleted.

Labels
i18n Related to internationalisation and translations wontfix This will not be worked on
Development

Successfully merging this pull request may close these issues.

Localization of the webpage
6 participants