[localize] Add <x equiv-text> XLIFF placeholder style and use by default (#2275)

See #2271 for more context.

XLIFF is the XML format we use to represent extracted templates/strings that need translation. XLIFF specifies multiple ways of encoding placeholders (for representing HTML markup and dynamic expressions). The differences according to the spec are a bit confusing:

- <ph>: "Placeholder - The <ph> element is used to delimit a sequence of native stand-alone codes in the translation unit."
- <x>: "Generic placeholder - The <x/> element is used to replace any code of the original document."

Previously we were using <ph>, because the spec seemed to match what we need, and I was primed by XLB (Google's very similar format), which also uses <ph> tags for this purpose. However, I found that in practice translation tools seem to have much better support for the XLIFF <x> tag (I have tested Crowdin, Phrase, and Lokalise). Additionally, <x> is the approach used by Angular (see https://angular.io/guide/i18n-example), so it is likely that translation tools/services have already been tested with Angular-style message extraction.

This is possibly breaking because it changes the default from <ph> to <x>, but both source and translated messages will be upgraded to <x> automatically the next time the user runs lit-localize extract. The ability to use <ph> tags is retained through the new interchange.placeholderStyle config file setting.
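
As a rough illustration of the two encodings (not the formatter's actual code, which builds DOM nodes), the sketch below serializes one placeholder in each style. The names `renderPlaceholder` and `escapeXml` are hypothetical and exist only for this example.

```typescript
type PlaceholderStyle = 'x' | 'ph';

// Escape the characters that are special in XML text and attribute values.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: serialize one untranslatable span in the given style.
// "x" stores the original code in the equiv-text attribute of an empty
// element; "ph" stores it as the element's text content.
function renderPlaceholder(
  style: PlaceholderStyle,
  id: number,
  untranslatable: string
): string {
  const escaped = escapeXml(untranslatable);
  return style === 'x'
    ? `<x id="${id}" equiv-text="${escaped}"/>`
    : `<ph id="${id}">${escaped}</ph>`;
}

console.log(renderPlaceholder('x', 0, '<b>'));
console.log(renderPlaceholder('ph', 0, '<b>'));
```

For the placeholder `<b>`, the `x` style yields `<x id="0" equiv-text="&lt;b&gt;"/>` and the `ph` style yields `<ph id="0">&lt;b&gt;</ph>`.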

Fixes #2271
aomarks committed Nov 8, 2021
1 parent 8bb33c8 commit 97f4a3f
Showing 55 changed files with 812 additions and 107 deletions.
18 changes: 18 additions & 0 deletions .changeset/slimy-brooms-happen.md
@@ -0,0 +1,18 @@
---
'@lit/localize-tools': minor
---

**BREAKING** Placeholders containing HTML markup and dynamic expressions are now
represented in XLIFF as `<x>` tags instead of `<ph>` tags.

To preserve the previous behavior of using `<ph>` tags, update your JSON config
file and set `interchange.placeholderStyle` to `"ph"`:

```json
{
  "interchange": {
    "format": "xliff",
    "placeholderStyle": "ph"
  }
}
```
8 changes: 8 additions & 0 deletions packages/localize-tools/config.schema.json
@@ -114,6 +114,14 @@
],
"type": "string"
},
"placeholderStyle": {
"description": "How to represent placeholders containing HTML markup and dynamic\nexpressions. Different localization tools and services have varying support\nfor placeholder syntax.\n\nDefaults to \"x\". Options:\n\n- \"x\": Emit placeholders using <x> tags. See\n http://docs.oasis-open.org/xliff/v1.2/os/xliff-core.html#x\n\n- \"ph\": Emit placeholders using <ph> tags. See\n http://docs.oasis-open.org/xliff/v1.2/os/xliff-core.html#ph",
"enum": [
"ph",
"x"
],
"type": "string"
},
"xliffDir": {
"description": "Directory on disk to read/write .xlf XML files. For each target locale,\nthe file path \"<xliffDir>/<locale>.xlf\" will be used.",
"type": "string"
51 changes: 41 additions & 10 deletions packages/localize-tools/src/formatters/xliff.ts
@@ -11,7 +11,7 @@ import type {Config} from '../types/config.js';
import type {XliffConfig} from '../types/formatters.js';
import type {Locale} from '../types/locale.js';
import {Formatter} from './index.js';
-import {KnownError} from '../error.js';
+import {KnownError, unreachable} from '../error.js';
import {
Bundle,
Message,
@@ -108,6 +108,15 @@ export class XliffFormatter implements Formatter {
const child = target.childNodes[c];
if (child.nodeType === doc.TEXT_NODE) {
contents.push(child.nodeValue || '');
} else if (
child.nodeType === doc.ELEMENT_NODE &&
child.nodeName === 'x'
) {
const phText = getNonEmptyAttributeOrThrow(
child as Element,
'equiv-text'
);
contents.push({untranslatable: phText});
} else if (
child.nodeType === doc.ELEMENT_NODE &&
child.nodeName === 'ph'
@@ -118,7 +127,9 @@ export class XliffFormatter implements Formatter {
!phText ||
phText.nodeType !== doc.TEXT_NODE
) {
-throw new KnownError(`Expected <ph> to have exactly one text node`);
+throw new KnownError(
+`Expected <${child.nodeName}> to have exactly one text node`
+);
}
contents.push({untranslatable: phText.nodeValue || ''});
} else {
@@ -205,8 +216,8 @@ export class XliffFormatter implements Formatter {
// TODO The spec requires the source filename in the "original" attribute,
// but we don't currently track filenames.
file.setAttribute('original', 'lit-localize-inputs');
-// Plaintext seems right, as opposed to HTML, since our translatable
-// message text is just text, and all HTML markup is encoded into <ph>
+// Plaintext seems right, as opposed to HTML, since our translatable message
+// text is just text, and all HTML markup is encoded into <x> or <ph>
// elements.
file.setAttribute('datatype', 'plaintext');
indent(file);
@@ -271,14 +282,34 @@ export class XliffFormatter implements Formatter {
if (typeof content === 'string') {
nodes.push(doc.createTextNode(content));
} else {
-const {untranslatable} = content;
-// https://docs.oasis-open.org/xliff/v1.2/os/xliff-core.html#ph
-const ph = doc.createElement('ph');
-ph.setAttribute('id', String(phIdx++));
-ph.appendChild(doc.createTextNode(untranslatable));
-nodes.push(ph);
+nodes.push(this.createPlaceholder(doc, String(phIdx++), content));
}
}
return nodes;
}

private createPlaceholder(
doc: Document,
id: string,
{untranslatable}: Placeholder
): HTMLElement {
const style = this.xliffConfig.placeholderStyle ?? 'x';
if (style === 'x') {
// https://docs.oasis-open.org/xliff/v1.2/os/xliff-core.html#x
const el = doc.createElement('x');
el.setAttribute('id', id);
el.setAttribute('equiv-text', untranslatable);
return el;
} else if (style === 'ph') {
// https://docs.oasis-open.org/xliff/v1.2/os/xliff-core.html#ph
const el = doc.createElement('ph');
el.setAttribute('id', id);
el.appendChild(doc.createTextNode(untranslatable));
return el;
} else {
throw new Error(
`Internal error: unknown xliff placeholderStyle: ${unreachable(style)}`
);
}
}
}
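
Since the reader accepts both `<x>` and `<ph>`, a file written in either style produces the same parsed contents. The sketch below illustrates that with a regular expression over a `<target>` body. This is a simplification chosen to keep the example self-contained, whereas the formatter above walks DOM child nodes. `parseTarget`, `unescapeXml`, and the `Part` type are hypothetical names, and the regex performs no validation.

```typescript
type Part = string | {untranslatable: string};

// Reverse the basic XML escapes. "&amp;" goes last so that "&amp;lt;"
// decodes to the literal text "&lt;" rather than "<".
function unescapeXml(text: string): string {
  return text
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&amp;/g, '&');
}

// Hypothetical reader: split a target body into plain text and placeholders,
// taking the placeholder text from equiv-text for <x/> and from the text
// content for <ph>.
function parseTarget(xml: string): Part[] {
  const pattern =
    /<x\b[^>]*?\bequiv-text="([^"]*)"[^>]*?\/>|<ph\b[^>]*>([^<]*)<\/ph>|([^<]+)/g;
  const parts: Part[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(xml)) !== null) {
    const [, xText, phText, text] = match;
    if (xText !== undefined) {
      parts.push({untranslatable: unescapeXml(xText)});
    } else if (phText !== undefined) {
      parts.push({untranslatable: unescapeXml(phText)});
    } else {
      parts.push(unescapeXml(text));
    }
  }
  return parts;
}

console.log(parseTarget('Hola <x id="0" equiv-text="&lt;b&gt;"/>Mundo'));
console.log(parseTarget('Hola <ph id="0">&lt;b&gt;</ph>Mundo'));
```

Both calls produce the same three parts: the text `Hola `, a placeholder for `<b>`, and the text `Mundo`.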
6 changes: 3 additions & 3 deletions packages/localize-tools/src/modes/runtime.ts
@@ -220,9 +220,9 @@ function makeMessageString(
// many ${} expressions, so the index of the _placeholder_ is not the same as
// the index of the _expression_:
//
-// <ph>&lt;a href="http://example.com/"></ph>
-// <ph>&lt;a href="${/*0*/ url}"></ph>
-// <ph>&lt;a href="${/*1*/ url}/${/*2*/ path}"></ph>
+// <x equiv-text="&lt;a href='http://example.com/'&gt;"/>
+// <x equiv-text="&lt;a href='${/*0*/ url}'&gt;"/>
+// <x equiv-text="&lt;a href='${/*1*/ url}/${/*2*/ path}'&gt;"/>
const placeholderOrder = new Map<string, number>();

const placeholderOrderKey = (
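
The distinction drawn in the comment above between the index of a placeholder and the index of an expression can be sketched as follows. This is a simplified illustration, not the logic in `makeMessageString`: `indexExpressions` is a hypothetical helper, and it assumes no expression contains a `}` character.

```typescript
// Hypothetical helper: number each ${...} expression with a running counter
// that spans all placeholders, so a placeholder with zero expressions does
// not consume an index and one with two expressions consumes two.
function indexExpressions(placeholders: string[]): string[] {
  let exprIdx = 0;
  return placeholders.map((text) =>
    text.replace(/\$\{[^}]*\}/g, () => '${' + String(exprIdx++) + '}')
  );
}

const indexed = indexExpressions([
  '<b>',
  '</b>',
  '<a href="${urlBase}/${urlPath}">',
  '</a>',
]);
console.log(indexed);
```

Here the third placeholder receives expression indexes 0 and 1, even though it is the placeholder at position 2.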
@@ -0,0 +1,12 @@
/**
* @license
* Copyright 2021 Google LLC
* SPDX-License-Identifier: BSD-3-Clause
*/

import {e2eGoldensTest} from './e2e-goldens-test.js';

e2eGoldensTest('build-runtime-xliff-ph', [
'--config=lit-localize.json',
'build',
]);
@@ -0,0 +1,12 @@
/**
* @license
* Copyright 2020 Google LLC
* SPDX-License-Identifier: BSD-3-Clause
*/

import {e2eGoldensTest} from './e2e-goldens-test.js';

e2eGoldensTest('extract-xliff-fresh-ph', [
'--config=lit-localize.json',
'extract',
]);
15 changes: 15 additions & 0 deletions packages/localize-tools/src/types/formatters.d.ts
@@ -44,4 +44,19 @@ export interface XliffConfig {
* the file path "<xliffDir>/<locale>.xlf" will be used.
*/
xliffDir: string;

/**
* How to represent placeholders containing HTML markup and dynamic
* expressions. Different localization tools and services have varying support
* for placeholder syntax.
*
* Defaults to "x". Options:
*
* - "x": Emit placeholders using <x> tags. See
* http://docs.oasis-open.org/xliff/v1.2/os/xliff-core.html#x
*
* - "ph": Emit placeholders using <ph> tags. See
* http://docs.oasis-open.org/xliff/v1.2/os/xliff-core.html#ph
*/
placeholderStyle?: 'x' | 'ph';
}
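
Because `placeholderStyle` is a closed union with a documented default, a consumer can resolve the default once and then branch exhaustively. A minimal sketch follows. `XliffConfigSketch` and `placeholderTagName` are hypothetical names, and the `unreachable` helper here is an assumed stand-in for the one imported from `../error.js`, whose body is not shown in this diff.

```typescript
type PlaceholderStyle = 'x' | 'ph';

// Hypothetical, trimmed-down version of the config interface above.
interface XliffConfigSketch {
  xliffDir: string;
  placeholderStyle?: PlaceholderStyle;
}

// Assumed helper: the parameter type is `never`, so a call only type-checks
// when every member of the union has already been handled.
function unreachable(value: never): never {
  throw new Error(`Unhandled placeholderStyle: ${String(value)}`);
}

function placeholderTagName(config: XliffConfigSketch): 'x' | 'ph' {
  // An omitted setting falls back to the new default, "x".
  const style = config.placeholderStyle ?? 'x';
  switch (style) {
    case 'x':
      return 'x';
    case 'ph':
      return 'ph';
    default:
      // Adding a third style to the union makes this line a compile error.
      return unreachable(style);
  }
}

console.log(placeholderTagName({xliffDir: 'xliff/'}));
console.log(placeholderTagName({xliffDir: 'xliff/', placeholderStyle: 'ph'}));
```

With this shape, a config that omits the setting resolves to `x`, and one that opts in to the previous behavior resolves to `ph`.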

This file was deleted.

This file was deleted.

@@ -10,6 +10,7 @@
},
"interchange": {
"format": "xliff",
-"xliffDir": "xliff/"
+"xliffDir": "xliff/",
+"placeholderStyle": "ph"
}
}
@@ -10,6 +10,7 @@
},
"interchange": {
"format": "xliff",
-"xliffDir": "xliff/"
+"xliffDir": "xliff/",
+"placeholderStyle": "ph"
}
}
@@ -0,0 +1,60 @@
/**
* @license
* Copyright 2020 Google LLC
* SPDX-License-Identifier: BSD-3-Clause
*/

import {html} from 'lit';
import {msg, str} from '@lit/localize';

const user = 'Friend';
const url = 'https://www.example.com/';

// Plain string
msg('Hello World!');

// Plain string with expression
msg(str`Hello ${user}!`);

// Lit template
msg(html`Hello <b>World</b>!`);

// Lit template with variable expression (one placeholder)
msg(html`Hello <b>${user}</b>!`);

// Lit template with variable expression (two placeholders)
msg(html`Click <a href=${url}>here</a>!`);

// Lit template with string expression
//
// TODO(aomarks) The "SALT" text is here because we have a check to make sure
// that two messages can't have the same ID unless they have identical template
// contents. After https://github.com/lit/lit/issues/1621 is
// implemented, add a "meaning" parameter instead.
msg(html`[SALT] Click <a href="${'https://www.example.com/'}">here</a>!`);

// Lit template with nested msg expression
msg(html`[SALT] Hello <b>${msg('World')}</b>!`);

// Lit template with comment
msg(html`Hello <b><!-- comment -->World</b>!`);

// Lit template with expression order inversion
msg(html`a:${'A'} b:${'B'} c:${'C'}`);

// Custom ID
msg('Hello World', {id: 'myId'});

// Description
msg('described 0', {desc: 'Description of 0'});

// This example has 4 <ph> placeholders. The 3rd (index 2) has two expressions,
// and the rest have 0 expressions. Ensure that we index these expressions as
// [0, 1] by counting _expressions_, instead of [2, 2] by counting
// _placeholders_ (see https://github.com/lit/lit/issues/1896).
const urlBase = 'http://example.com/';
const urlPath = 'foo';
msg(html`<b>Hello</b>! Click <a href="${urlBase}/${urlPath}">here</a>!`);

// Escaped markup characters should remain escaped
msg(html`&lt;Hello<b>&lt;World &amp; Friends&gt;</b>!&gt;`);
@@ -0,0 +1,21 @@
{
"$schema": "../../../config.schema.json",
"sourceLocale": "en",
"targetLocales": ["es-419", "zh_CN"],
"tsConfig": "tsconfig.json",
"output": {
"mode": "runtime",
"outputDir": "tsout",
"localeCodesModule": "locale-codes.ts"
},
"interchange": {
"format": "xliff",
"xliffDir": "xliff/",
"placeholderStyle": "ph"
},
"patches": {
"es-419": {
"lit": [{"before": "Mundo", "after": "Galaxia"}]
}
}
}
@@ -0,0 +1,18 @@
// Do not modify this file by hand!
// Re-generate this file by running lit-localize.

/**
* The locale code that templates in this source code are written in.
*/
export const sourceLocale = `en`;

/**
* The other locale codes that this application is localized into. Sorted
* lexicographically.
*/
export const targetLocales = [`es-419`, `zh_CN`] as const;

/**
* All valid project locale codes. Sorted lexicographically.
*/
export const allLocales = [`en`, `es-419`, `zh_CN`] as const;
@@ -0,0 +1,11 @@
{
"compilerOptions": {
"strict": true,
"noUnusedLocals": true,
"noUnusedParameters": true,
"preserveConstEnums": true,
"forceConsistentCasingInFileNames": true,
"rootDir": "./"
},
"include": ["**/*.ts"]
}
@@ -0,0 +1,25 @@
// Do not modify this file by hand!
// Re-generate this file by running lit-localize

import {html} from 'lit';
import {str} from '@lit/localize';

/* eslint-disable no-irregular-whitespace */
/* eslint-disable @typescript-eslint/no-explicit-any */

export const templates = {
h02c268d9b1fcb031: html`&lt;Hola<b>&lt;Mundo &amp; Amigos&gt;</b>!&gt;`,
h349c3c4777670217: html`[SALT] Hola <b>${0}</b>!`,
h3c44aff2d5f5ef6b: html`Hola <b>Mundo</b>!`,
h82ccc38d4d46eaa9: html`Hola <b>${0}</b>!`,
h8d70dfec810d1eae: html`<b>Hola</b>! Clic <a href="${0}/${1}">aquí</a>!`,
h99e74f744fda7e25: html`Clic <a href="${0}">aquí</a>!`,
hbe936ff3da20ffdf: html`Hola <b><!-- comment -->Mundo</b>!`,
hc1c6bfa4414cb3e3: html`[SALT] Clic <a href="${0}">aquí</a>!`,
hf979404a36e879cb: html`c:${2} a:${0} b:${1}`,
myId: `Hola Mundo`,
s00ad08ebae1e0f74: str`Hola ${0}!`,
s03c68d79ad36e8d4: `described 0`,
s0f19e6c4e521dd53: `Mundo`,
s8c0ec8d1fb9e6e32: `Hola Mundo!`,
};
@@ -0,0 +1,25 @@
// Do not modify this file by hand!
// Re-generate this file by running lit-localize

import {html} from 'lit';
import {str} from '@lit/localize';

/* eslint-disable no-irregular-whitespace */
/* eslint-disable @typescript-eslint/no-explicit-any */

export const templates = {
h3c44aff2d5f5ef6b: html`你好 <b>世界</b>!`,
s8c0ec8d1fb9e6e32: `你好,世界!`,
s00ad08ebae1e0f74: str`Hello ${0}!`,
h82ccc38d4d46eaa9: html`Hello <b>${0}</b>!`,
h99e74f744fda7e25: html`Click <a href="${0}">here</a>!`,
hc1c6bfa4414cb3e3: html`[SALT] Click <a href="${0}">here</a>!`,
h349c3c4777670217: html`[SALT] Hello <b>${0}</b>!`,
s0f19e6c4e521dd53: `World`,
hbe936ff3da20ffdf: html`Hello <b><!-- comment -->World</b>!`,
hf979404a36e879cb: html`a:${0} b:${1} c:${2}`,
myId: `Hello World`,
s03c68d79ad36e8d4: `described 0`,
h8d70dfec810d1eae: html`<b>Hello</b>! Click <a href="${0}/${1}">here</a>!`,
h02c268d9b1fcb031: html`&lt;Hello<b>&lt;World &amp; Friends&gt;</b>!&gt;`,
};
