Option to write the default locale to a subdirectory also. #60

Closed
pserwylo opened this issue Jun 26, 2017 · 11 comments

@pserwylo
Contributor

pserwylo commented Jun 26, 2017

Problem

The webserver should be able to serve the appropriate language based on the client's Accept-Language header. This is difficult when polyglot renders the default_lang directly to the webroot, rather than to a LANG/ subdirectory.
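
To make the goal concrete, here is a minimal sketch of the kind of choice the webserver makes from an Accept-Language header. It is a simplification for illustration only, not how Apache mod_negotiation is actually implemented; the function name `pick_language` and its parameters are hypothetical.

```python
def pick_language(accept_language, available, fallback="en"):
    """Pick the best available language for an Accept-Language header value.

    Simplified sketch: ranks entries by q-value (ties keep header order) and
    falls back from a regional tag such as "fr-CH" to its primary tag "fr".
    """
    ranked = []
    for position, item in enumerate(accept_language.split(",")):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip().lower()
        if not tag:
            continue
        quality = 1.0  # A tag without an explicit q-value counts as q=1.
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        ranked.append((-quality, position, tag))

    # Map normalized tags ("es-ar") back to the site's own names ("es_AR").
    by_lower = {lang.lower().replace("_", "-"): lang for lang in available}
    for neg_quality, _, tag in sorted(ranked):
        if neg_quality == 0:
            continue  # q=0 means the client does not accept this language.
        if tag in by_lower:
            return by_lower[tag]
        primary = tag.split("-")[0]
        if primary in by_lower:
            return by_lower[primary]
    return fallback


print(pick_language("fr-CH, fr;q=0.9, en;q=0.8", ["en", "de", "fr"]))
```

With this layout in mind, every language, including the default one, needs to be addressable in the same way, which is why a top-level default_lang build gets in the way.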

Case study

We are currently internationalizing the F-Droid website using polyglot. After chatting to the folks at Apache, we took our inspiration from the way the Apache webserver documentation is translated and served: content negotiation automatically gives each user the correct language.

Specifically:

Here is their config for reference.

Required webroot layout

For the purpose of connecting Apache mod_negotiation (or equivalent) and its support for the Accept-Language header sent by the client, it is helpful if polyglot can output the following:

  • webroot/
    • en/
      • index.html
      • about.html
    • de/
      • index.html
      • about.html
    • fr/
      • index.html
      • about.html

Of course, the top level of the webroot now contains no pages of its own, so we then run a script which iterates over each language directory and outputs the following:

  • webroot/
    • index.html.en -> en/index.html
    • index.html.de -> de/index.html
    • index.html.fr -> fr/index.html
    • index.html
    • (and the same for about.html and every other .html file)

Where the top level index.html file is actually an Apache TypeMap with the following:

URI: index.html.en
Content-language: en
Content-type: text/html

URI: index.html.de
Content-language: de
Content-type: text/html

URI: index.html.fr
Content-language: fr
Content-type: text/html
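
A script along those lines could look roughly like the sketch below. This is not the actual F-Droid script, only an illustration of the layout described above; the function names `typemap_text` and `write_negotiation_files` are hypothetical, and it assumes a POSIX filesystem where symlinks are available.

```python
import os
from pathlib import Path


def typemap_text(page_name, langs):
    """Return the TypeMap body for one page: one stanza per language."""
    return "\n".join(
        f"URI: {page_name}.{lang}\n"
        f"Content-language: {lang}\n"
        "Content-type: text/html\n"
        for lang in langs
    )


def write_negotiation_files(webroot, langs):
    """Create PAGE.LANG symlinks and a TypeMap for each localized .html page."""
    webroot = Path(webroot)

    # Collect every page (relative to its language directory) and the
    # languages in which it exists.
    pages = {}
    for lang in langs:
        lang_dir = webroot / lang
        if not lang_dir.is_dir():
            continue
        for html in sorted(lang_dir.rglob("*.html")):
            pages.setdefault(html.relative_to(lang_dir), []).append(lang)

    for rel, available in pages.items():
        out = webroot / rel
        out.parent.mkdir(parents=True, exist_ok=True)
        for lang in available:
            # e.g. webroot/index.html.en -> en/index.html
            link = out.with_name(f"{out.name}.{lang}")
            target = os.path.relpath(webroot / lang / rel, out.parent)
            if link.is_symlink() or link.exists():
                link.unlink()
            link.symlink_to(target)
        # The unsuffixed file is the TypeMap itself.
        out.write_text(typemap_text(out.name, available), encoding="utf-8")
```

Calling `write_negotiation_files("webroot", ["en", "de", "fr"])` after the Jekyll build would populate the top level. Apache still has to be told to treat those unsuffixed files as type maps, as in the Apache documentation config referenced above.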

Work around

To prevent the default_lang from populating the top level of the webroot, we're currently using this in our Jekyll site's _config.yml, but it feels a bit hacky:

languages: [ "en", "ast", "bo", "de", "es", "es_AR", "fa", "fr", "sc", "tr", "zh_Hans", "zh_Hant" ]
default_lang: "None. This forces jekyll polyglot to generate a en/ subdirectory too (with correct links)."

Originally I tried iterating over each file in fr/, finding the corresponding default version in the webroot, and moving it into en/ manually. This was not a solution though, because all of the links in the en/ pages point to the top level of the website, not the en/ subdirectory. With default_lang set to nonsense, the en/ directory gets created the same way as every other language directory. That is, all of its links are also prefixed with en/ as required.

@untra
Owner

untra commented Jun 27, 2017

Using that workaround, do you notice any problems? The way polyglot is designed, default_lang is never modified, and is used specifically to make the default locale exist at the _site webroot instead of in a subdirectory.

I would suggest you keep using that workaround. It makes sense to me that it gets the job done effectively.

@pserwylo
Contributor Author

I haven't had any problems with it yet, to be honest; it just felt a bit hacky. However, I'm happy for you to close this, and at least people can search for and perhaps find this issue if they want to figure out more about using Content Negotiation with polyglot.

@pserwylo
Contributor Author

In the future if our script that writes out typemaps becomes unmaintainable, I might see about submitting a PR which includes a config option to generate Apache2 mod_negotiation TypeMaps in the webroot. But that is a story for another day and out of scope of this issue.

@untra
Owner

untra commented Jun 28, 2017

I hope this solution continues to work out. It's a very smart approach that I want to look into more, and see where else it can be leveraged.

@untra untra closed this as completed Jun 28, 2017
@pserwylo
Contributor Author

Okay, so after continuing experimentation with our site, I've come across a bit of a blocker for the approach I outlined above. If I mark a file/directory as exclude_from_localization: ..., then under normal circumstances it would get written to the (non-localized) root of the Jekyll site. However, when I have no meaningful default_lang, polyglot has no root-level build to write it to. I would prefer it to still be written to the webroot, not the en/ subdirectory.

I thought I'd toy with the idea of a PR which still writes these files to the webroot, but it isn't something which I can see how to do cleanly. Specifically, with my "no default_lang" hack, the process_orig function never gets called and we only ever end up in process_active_language. At that point we've already lost all ability to distinguish between the main Jekyll site root, and the active_lang directory which is being written to.

@pserwylo
Contributor Author

Hmm, now I think about it, perhaps this would work:

  • process_active_language should still @exclude += @exclude_from_localization as it does now.
  • The outer process method (which iterates over @languages) can:
    • Check if default_lang is present in the available @languages or not.
    • If so, don't do anything special, because process_orig will correctly output the stuff from exclude_from_localization, and process_language will correctly exclude them.
    • If not, manually copy each file/dir from exclude_from_localization to the main site root.
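
The last two branches of that list can be sketched as follows. This is written in Python purely for illustration (polyglot itself is a Ruby plugin) and is not the code that was eventually merged; the function `copy_unlocalized` and all of its parameters are hypothetical stand-ins for the site source directory, the destination root, and the config values.

```python
import shutil
from pathlib import Path


def copy_unlocalized(source, site_root, languages, default_lang,
                     exclude_from_localization):
    """Copy unlocalized paths to the site root when no real default_lang exists.

    Returns the list of destination paths that were written.
    """
    if default_lang in languages:
        # Normal case: the default-language build already writes these
        # paths to the site root, so nothing special is needed.
        return []

    source, site_root = Path(source), Path(site_root)
    written = []
    for rel in exclude_from_localization:
        src = source / rel
        dst = site_root / rel
        if src.is_dir():
            shutil.copytree(src, dst, dirs_exist_ok=True)
        elif src.is_file():
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)
        else:
            continue  # Skip entries that do not exist in the source tree.
        written.append(dst)
    return written
```

The membership check is what keeps the behaviour unchanged for sites that use a real default_lang; only the "nonsense default_lang" setup takes the copy branch.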

I'd be happy to prepare a MR with this functionality. However, note that it may seem strange to users who don't realise that it is a requirement for people wanting to use Apache2 + mod_negotiation in the way that we (and the Apache2 folks themselves) are using it.

Does this seem sane and worthy of a PR?

@untra
Owner

untra commented Jul 18, 2017

How does #65 look? If you can, please clone down the polyglot repo and switch to the branch, and run the make.sh script to install the latest 1.3.0 build. That should now build exclude_from_localization directories into your site root 👍

@untra untra reopened this Jul 18, 2017
@pserwylo
Contributor Author

That works a treat. Thanks very much for the quick response and fix.

@untra
Owner

untra commented Jul 23, 2017

#65 has been merged in, and the feature is now available in polyglot v1.3.0.

Thanks for your help in making this release happen, @pserwylo! I left your name in the credits on the polyglot website. Cheers!

@untra untra closed this as completed Jul 23, 2017
@krzysztofmajewski

I'm using polyglot for a bilingual (French and English) site. Currently the default_lang is en. I would like the site to auto-detect the language from the browser's locale (via the Accept-Language header, I guess). I'm running Nginx, not Apache. Has there been any progress on this front? Or should I try to adapt @pserwylo's Apache solution to Nginx?

@untra
Owner

untra commented Aug 2, 2020

@krzysztofmajewski here is the nginx documentation for configuring response to the Accept-Language header, for serving different pages:
https://www.nginx.com/resources/wiki/modules/accept_language/

but you might want to open a new issue to discuss this :)
