Creating multi-locale sites with Jekyll without relying on i18n plugins is not a straightforward task, and comes with tradeoffs. This approach requires somewhat complex setup and configuration, in return for simple content management. Also, it works with GitHub Pages! This branch is actually a GH Pages site.
Minimum required Jekyll version is 3.7.0, or 3.8.0 if using collections_dir.
The jekyll-4-proposal branch builds on this work to explore how native i18n support could hypothetically work in Jekyll 4.0.
Locales are configured in _config.yml:
locales:
  default:
    baseurl: ""
    lang: en-US
    name: English
  pt:
    baseurl: /pt
    lang: pt-PT
    name: Português
The default locale is special, assumed to be the primary one and to be output to the site root.
Collections are mirrored for each locale. Collections in the default locale are named normally, while other locales take the locale’s label as a suffix:
collections:
  photos:
    output: true
    permalink: /photos/:path/
  photos_pt:
    output: true
    permalink: /pt/fotografias/:path/
Permalinks need to be set manually for each collection to match the locale’s baseurl.
A few front matter defaults need to be configured (* glob patterns are helpful here).
To match localized collections to each other, collection_basename is used. All photos collections across any locales must have collection_basename set to photos:
defaults:
  - scope:
      path: "_photos*/"
    values:
      collection_basename: photos
And to associate collections with their respective locales, some more defaults are needed:
  - scope:
      path: "*_pt/"
    values:
      locale: pt
      lang: pt-PT
      collection_suffix: _pt
  - scope:
      path: ""
    values:
      locale: default
      lang: en-US
Setting lang in this way allows plugins like jekyll-seo-tag to pick it up. collection_suffix is used in conjunction with collection_basename to match collections between locales. Documents in the default locale don’t have a collection_suffix.
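The naming convention can be sketched in a few lines of Python. This only illustrates how collection_basename and collection_suffix combine into collection labels; it is not the Liquid that the include uses, and the function name and the LOCALE_SUFFIXES table are hypothetical.

```python
# Hypothetical table: locale label -> collection_suffix ("" for the default locale).
LOCALE_SUFFIXES = {"default": "", "pt": "_pt"}


def localized_collection_labels(basename, existing_labels):
    """Return the labels of every existing collection that shares `basename`.

    A collection matches when its label equals basename + suffix for one of
    the configured locales and a collection with that label exists.
    """
    candidates = [basename + suffix for suffix in LOCALE_SUFFIXES.values()]
    return [label for label in candidates if label in existing_labels]


site_collections = {"photos", "photos_pt", "posts"}
print(localized_collection_labels("photos", site_collections))
print(localized_collection_labels("posts", site_collections))
```

The second call returns only the default collection, which is the asymmetric case: a locale without a matching collection is simply left out.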
For pages, pages_LOCALE collections need to be created as well. The default locale’s pages can be moved to a pages collection or left as normal pages. Because collections don’t quite act the same way as regular pages, permalinks for index documents won’t work as expected; when using pretty permalinks they will output to /index/index.html. This can only be avoided by setting permalink manually in front matter. This can be done in the documents themselves, or through more defaults:
  - scope:
      path: "_pages_pt/index.*"
    values:
      permalink: "/pt/"
The site’s content should look something like this:
├── about.markdown
├── blog.markdown
├── index.markdown
├── photos.markdown
├── _pages_pt
│   ├── about.markdown
│   ├── blog.markdown
│   ├── index.markdown
│   └── photos.markdown
├── _photos
│   ├── photo-1.markdown
│   └── photo-2.markdown
├── _photos_pt
│   ├── photo-1.markdown
│   └── photo-2.markdown
├── _posts
│   └── 2018-01-01-hello.markdown
└── _posts_pt
    └── 2018-01-01-hello.markdown
To localize permalinks, different filenames can be set for each locale. We’ll add front matter to match them up later.
├── _photos
│   ├── photo-1.markdown
│   └── photo-2.markdown
├── _photos_pt
│   ├── fotografia-1.markdown
│   └── fotografia-2.markdown
Collection labels themselves are never localized.
It’s not necessary for each locale to have a copy of every collection, or every file — content can be asymmetric.
Document matching is automatic when file and folder names are exactly the same. In this example both documents represent the same content:
├── _collection
│   └── folder
│       └── document.markdown
└── _collection_pt
    └── folder
        └── document.markdown
To match them, the i18n include (explained below) generates a variable called document_id, based on each document’s path relative to the collection folder, and excluding the file extension.
Both documents in this example would return the same document_id, so they would match automatically:
folder/document
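As a rough illustration of that rule, here is a Python sketch that derives the same string from a source path. The real work happens in Liquid inside the include; the document_id function and its parameters below are assumptions made for the example.

```python
from pathlib import PurePosixPath


def document_id(relative_path, collection_dir):
    """Derive an ID from a document's source path.

    `relative_path` is the path relative to the site source and
    `collection_dir` is the collection folder, e.g. "_collection_pt".
    The result is the path inside that folder without the file extension.
    """
    inside = PurePosixPath(relative_path).relative_to(collection_dir)
    return str(inside.with_suffix(""))


default_id = document_id("_collection/folder/document.markdown", "_collection")
pt_id = document_id("_collection_pt/folder/document.markdown", "_collection_pt")
print(default_id, pt_id, default_id == pt_id)
```

Because the collection folder is stripped before comparing, the _pt suffix never ends up in the ID, which is why identical file and folder names are enough for a match.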
When filenames don’t match, document_id can be set manually via YAML front matter. Example:
├── _collection
│   └── folder
│       └── document.markdown
└── _collection_pt
    └── pasta
        └── documento.markdown
By adding this front matter to _collection_pt/pasta/documento.markdown, they’d match:
---
document_id: folder/document
---
This is preferable to using matching filenames and setting URL localizations through page.permalink, because document_id is derived from the document’s path in the site source rather than from the compiled site. That means the global permalink style can be changed without requiring changes to documents.
The i18n include does most of the heavy lifting for making i18n manageable in Liquid. It defines the following variables:
Variable | Description |
---|---|
locale | Contains the attributes of the current locale, as defined in _config.yml. The same as site.locales[page.locale]. |
localized_collections | An array of labels for all collections matched through page.collection_basename. |
localized_pages | An array of objects for all documents matched through document_id, based on localized_collections. |
default_page | The page object of a matched default locale version of the current document. Useful for falling back to the default locale when dealing with unlocalized content. The same as {{ localized_pages \| where: "locale", "default" \| first }}. |
strings | Contains any localized text strings defined in _data. More on that below. |
document_id | Returns page.document_id or automatically generates a document_id string for the page. |
i18n must be included at the top of every layout requiring i18n features:
{% include i18n/i18n %}
When it’s useful to get i18n variables for a document other than the current one, the obj parameter can be passed to the include. In this case, all variables will be prefixed with obj_, to avoid overwriting the current page’s variables. Example:
{% include i18n/i18n %}
<h2>Collection Document IDs</h2>
{% for document in collection %}
{% include i18n/i18n obj=document %}
<p>{{ obj_document_id }}</p>
{% endfor %}
<h2>Current Page ID</h2>
<p>{{ document_id }}</p>
Since getting document_id is the most common use case, a smaller document_id include that only assigns that variable can be used:
{% include i18n/document_id %}
<p>{{ document_id }}</p>
{% assign document = site.collection | first %}
{% include i18n/document_id obj=document %}
<p>{{ obj_document_id }}</p>
default_page can be used to fall back to content in the default locale if it’s not localized:
{{ page.image | default: default_page.image }}
Using these fallbacks means that each localized file only needs to contain localizable content. The about.markdown file in the default locale might look like this:
---
title: About
email: hello@example.com
phone: +99 123 456 789
image: logo.png
---
We are a cool company.
But the localized sobre.markdown in the pt locale can include just the content that requires localization:
---
title: Sobre
document_id: about
---
Somos uma empresa fixe.
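To make the fallback concrete, here is a small Python sketch using the front matter of the two files above. It roughly mirrors what the Liquid default filter does (use the fallback when the value is missing, false, or an empty string); the lookup helper is hypothetical and only exists for this example.

```python
def lookup(key, page, default_page):
    """Return the localized value, or the default locale's value if unset."""
    value = page.get(key)
    if value is None or value is False or value == "":
        return default_page.get(key)
    return value


# Front matter from about.markdown (default locale) and sobre.markdown (pt).
about = {
    "title": "About",
    "email": "hello@example.com",
    "phone": "+99 123 456 789",
    "image": "logo.png",
}
sobre = {"title": "Sobre", "document_id": "about"}

print(lookup("title", sobre, about))  # localized value wins
print(lookup("image", sobre, about))  # falls back to the default locale
```

Only title is overridden in the pt file, so every other key resolves to the default locale's value.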
Text strings are defined in _data: one YAML file for each locale, named strings_LOCALE.yml. The default locale file is named strings.yml.
└── _data
    ├── strings_pt.yml
    └── strings.yml
Any keys set in these files can be accessed directly through the strings variable:
{{ page.locale }}: {{ strings.hello }}
default: Hello
pt: Olá
A special include deals with dates: i18n/date.
Locale-specific date formats can be set in strings.yml data files using the date_formats key:
date_formats:
  full: "%A, %e %B %Y"
  long: "%B %e, %Y"
  short: "%m/%d/%Y"
Month and weekday translations can also be set:
date_formats:
  full: "%A, %e de %B de %Y"
  long: "%e %B %Y"
  short: "%d-%m-%Y"
months:
  January: Janeiro
  February: Fevereiro
  March: Março
  ...
weekdays:
  Sunday: Domingo
  Monday: Segunda-feira
  Tuesday: Terça-feira
  ...
The include can then be used by passing date and format parameters. If no format is specified, %Y-%m-%d will be used.
{% include i18n/date date=page.date format="full" %}
default: Thursday, 15 September 2016
pt: Quinta-feira, 15 de Setembro de 2016
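One plausible way to picture what happens here is: format the date with English names, then swap in the translations from the months and weekdays maps. The Python sketch below does that for a small subset of strftime directives. It is an assumption about the approach, not a port of the include, and localized_date is a made-up name; the translation maps only contain the two entries needed for the sample date.

```python
from datetime import date

# English names are spelled out so the result does not depend on the system locale.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def localized_date(value, fmt="%Y-%m-%d", months=None, weekdays=None):
    """Expand a few strftime directives, translating month and weekday names."""
    weekday = WEEKDAYS[value.weekday()]
    month = MONTHS[value.month - 1]
    fields = {
        "%A": (weekdays or {}).get(weekday, weekday),  # untranslated names pass through
        "%B": (months or {}).get(month, month),
        "%e": f"{value.day:2d}",   # day of month, space-padded
        "%d": f"{value.day:02d}",  # day of month, zero-padded
        "%m": f"{value.month:02d}",
        "%Y": f"{value.year:04d}",
    }
    for directive, text in fields.items():
        fmt = fmt.replace(directive, text)
    return fmt


day = date(2016, 9, 15)
print(localized_date(day, "%A, %e %B %Y"))
print(localized_date(day, "%A, %e de %B de %Y",
                     months={"September": "Setembro"},
                     weekdays={"Thursday": "Quinta-feira"}))
```

Names without a translation are left in English, so a partially filled months or weekdays map degrades gracefully instead of failing.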
Build performance can be a problem. Glob pattern defaults are not fast, and generating document_id in Liquid is more expensive still. The clearest example of how inefficient it can be is the site-nav.html include, which creates a menu from a list of document_id strings:
{% for document_id in site.navigation %}
  {% for page in pages %}
    {% include i18n/document_id obj=page %}
    {% if obj_document_id == document_id %}
      <li><a href="{{ page.url | prepend: site.baseurl }}">{{ page.title | escape }}</a></li>
      {% break %}
    {% endif %}
  {% endfor %}
{% endfor %}
This nested loop runs the document_id include up to once per page for every navigation entry, so build time gets out of hand quickly as the site grows. Because document_id needs to be generated using complex Liquid, the where filter can't be used instead. The only solution would be to set document_id in front matter for all documents, but that goes against the initial objective of making content management easy.
A native implementation of document_id in Jekyll would single-handedly fix most performance problems with this solution.
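To show why a precomputed document_id helps, here is a Python sketch of the same menu built with a single pass over the pages and a dictionary lookup per navigation entry, instead of a nested loop. It assumes every page already carries a document_id, which is exactly what Liquid can't provide cheaply today. The page data, URLs, and function names are invented for the example.

```python
def build_index(pages):
    """Map each document_id to its page in one pass over the pages."""
    return {page["document_id"]: page for page in pages}


def navigation(ids, pages):
    """Resolve navigation IDs to pages, skipping IDs with no localized page."""
    index = build_index(pages)
    return [index[doc_id] for doc_id in ids if doc_id in index]


# Hypothetical pt pages with document_id already available.
pages = [
    {"document_id": "about", "url": "/pt/sobre/", "title": "Sobre"},
    {"document_id": "blog", "url": "/pt/blog/", "title": "Blog"},
    {"document_id": "index", "url": "/pt/", "title": "Início"},
]

for page in navigation(["index", "about", "photos"], pages):
    print(page["url"], page["title"])
```

The work is proportional to the number of pages plus the number of navigation entries, rather than their product, and a missing localization ("photos" here) is skipped just as the Liquid version would skip it.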
I’ve tested the most relevant GitHub-whitelisted plug-ins:
- jekyll-sitemap includes every page without a problem, but they aren’t marked up using rel="alternate" links.
- jekyll-feed will only generate a feed for the default posts collection. Version 0.11.0 will resolve this by adding the ability to generate feeds for any collection, but it hasn't been whitelisted yet. Liquid layouts like jekyll-rss-feeds can be used in the meantime.
- jekyll-seo-tag works fine, as long as lang is properly set for every document (using front matter defaults), or else everything will have og:locale set to en-US (or whatever site.lang is set to). Doesn’t create rel="alternate" hreflang="x" or og:locale:alternate tags, but those can be added manually.