Skip to content

Commit

Permalink
[localize-tools] Improve handling of duplicate msg id (#2405)
Browse files Browse the repository at this point in the history
* [localize-tools] Improve handling of duplicate msg id

- For messages to be considered same:
  - No longer check for expressions to be identical
  - Do check that desc be identical
- Displays a more helpful aggregate error message for all messages with the same id
- For XLB format, add placeholder index to name attribute of <ph> tags
- Utilize the placeholder indexes embedded in tags for determining placeholder locations during transform mode

Co-authored-by: Alexander Marks <aomarks@google.com>
  • Loading branch information
augustjk and aomarks committed Jan 25, 2022
1 parent 7453e36 commit 4a4afa7
Show file tree
Hide file tree
Showing 15 changed files with 286 additions and 136 deletions.
7 changes: 7 additions & 0 deletions .changeset/many-singers-complain.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
---
'@lit/localize-tools': minor
---

**BREAKING** Update analysis to consider messages with same id **and** description to be identical (but no longer checks for expressions to be same) and improve error message on finding incompatible duplicates.

`lit-localize extract` will now error if multiple messages had the same text but different `desc` option. Be sure to add the same `desc` option for these messages to be considered the same translatable message or add different `id` options to differentiate them.
7 changes: 7 additions & 0 deletions .changeset/modern-seas-jog.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
---
'@lit/localize-tools': minor
---

**BREAKING** (XLB format only) Add index to `name` attribute for `<ph>` tags for tracking placeholder locations.

XLB format users should run `lit-localize extract` to regenerate the `.xlb` file for the source locale and make sure the `<ph>` tags in other locale files have matching `name` attribute values to that of the newly generated source file.
7 changes: 6 additions & 1 deletion packages/localize-tools/src/formatters/xlb.ts
Original file line number Diff line number Diff line change
Expand Up @@ -90,7 +90,10 @@ class XlbFormatter implements Formatter {
) {
throw new KnownError(`Expected <ph> to have exactly one text node`);
}
contents.push({untranslatable: phText.nodeValue || ''});
const index = Number(
getNonEmptyAttributeOrThrow(child as Element, 'name')
);
contents.push({untranslatable: phText.nodeValue || '', index});
} else {
throw new KnownError(
`Unexpected node in <msg>: ${child.nodeType} ${child.nodeName}`
Expand Down Expand Up @@ -127,12 +130,14 @@ class XlbFormatter implements Formatter {
}
indent(messagesNode, 2);
messagesNode.appendChild(messageNode);
let phIdx = 0;
for (const content of contents) {
if (typeof content === 'string') {
messageNode.appendChild(doc.createTextNode(content));
} else {
const {untranslatable} = content;
const ph = doc.createElement('ph');
ph.setAttribute('name', String(phIdx++));
ph.appendChild(doc.createTextNode(untranslatable));
messageNode.appendChild(ph);
}
Expand Down
10 changes: 8 additions & 2 deletions packages/localize-tools/src/formatters/xliff.ts
Original file line number Diff line number Diff line change
Expand Up @@ -116,7 +116,10 @@ export class XliffFormatter implements Formatter {
child as Element,
'equiv-text'
);
contents.push({untranslatable: phText});
const index = Number(
getNonEmptyAttributeOrThrow(child as Element, 'id')
);
contents.push({untranslatable: phText, index});
} else if (
child.nodeType === doc.ELEMENT_NODE &&
child.nodeName === 'ph'
Expand All @@ -131,7 +134,10 @@ export class XliffFormatter implements Formatter {
`Expected <${child.nodeName}> to have exactly one text node`
);
}
contents.push({untranslatable: phText.nodeValue || ''});
const index = Number(
getNonEmptyAttributeOrThrow(child as Element, 'id')
);
contents.push({untranslatable: phText.nodeValue || '', index});
} else {
throw new KnownError(
`Unexpected node in <trans-unit>: ${child.nodeType} ${child.nodeName}`
Expand Down
1 change: 1 addition & 0 deletions packages/localize-tools/src/messages.ts
Original file line number Diff line number Diff line change
Expand Up @@ -66,6 +66,7 @@ export interface Bundle {
*/
export interface Placeholder {
untranslatable: string;
index: number;
// TODO(aomarks) Placeholders can also have names and examples, to help the
// translator understand the meaning of the placeholder. We could
// automatically add names for common markup patterns like START_BOLD and
Expand Down
12 changes: 1 addition & 11 deletions packages/localize-tools/src/modes/runtime.ts
Original file line number Diff line number Diff line change
Expand Up @@ -228,17 +228,7 @@ function makeMessageString(
const placeholderOrderKey = (
placeholder: Placeholder,
placeholderRelativeExpressionIdx: number
) =>
JSON.stringify([
// TODO(aomarks) For XLIFF files, we have a unique numeric ID for each
// placeholder that would be preferable to use as the key here over the
// placeholder text itself. However, we don't currently have that ID for
// XLB. To add it to XLB, we need to do some research into the correct XML
// representation, and then make a breaking change. See
// https://github.com/lit/lit/issues/1897.
placeholder.untranslatable,
placeholderRelativeExpressionIdx,
]);
) => JSON.stringify([placeholder.index, placeholderRelativeExpressionIdx]);

let absIdx = 0;
for (const content of canon.contents) {
Expand Down
129 changes: 89 additions & 40 deletions packages/localize-tools/src/modes/transform.ts
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@
* SPDX-License-Identifier: BSD-3-Clause
*/

import {Message, makeMessageIdMap} from '../messages.js';
import {Message, Placeholder, makeMessageIdMap} from '../messages.js';
import {writeLocaleCodesModule} from '../locales.js';
import type {Locale} from '../types/locale.js';
import type {Config} from '../types/config.js';
Expand Down Expand Up @@ -331,8 +331,8 @@ class Transformer {
if (templateResult.error) {
throw new Error(stringifyDiagnostics([templateResult.error]));
}
const {tag} = templateResult.result;
let {template} = templateResult.result;
const {tag, contents, template} = templateResult.result;
let newTemplate = template;

const optionsResult = extractOptions(optionsArg, this.sourceFile);
if (optionsResult.error) {
Expand All @@ -341,59 +341,108 @@ class Transformer {
const options = optionsResult.result;
const id = options.id ?? generateMsgIdFromAstNode(template, tag === 'html');

const sourceExpressions = new Map<string, ts.Expression>();
if (ts.isTemplateExpression(template)) {
for (const span of template.templateSpans) {
// TODO(aomarks) Support less brittle/more readable placeholder keys.
const key = this.sourceFile.text.slice(
span.expression.pos,
span.expression.end
);
sourceExpressions.set(key, span.expression);
}
}

// If translations are available, replace the source template from the
// second argument with the corresponding translation.
if (this.translations !== undefined) {
const translation = this.translations.get(id);
if (translation !== undefined) {
// If translations are available, replace the source template from the
// second argument with the corresponding translation.

// Maps from <translation absolute expression index> to
// <[source placeholder index, placeholder-relative expression index]>.
const transExprToSourcePosition = new Map<number, [number, number]>();

// Maps from <source placeholder index> to <the number of expressions in
// that placeholder>.
const placeholderExpressionCounts = new Map<number, number>();

// The absolute position of each expression within the translated
// message.
let absTransExprIdx = 0;

// Maps source placeholder to their index.
const placeholdersByIndex = new Map<number, Placeholder>();
for (let i = 0, phIdx = 0; i < contents.length; i++) {
const content = contents[i];
if (typeof content === 'object') {
placeholdersByIndex.set(phIdx++, content);
}
}

const templateLiteralBody = translation.contents
.map((content) =>
typeof content === 'string'
? escapeTextContentToEmbedInTemplateLiteral(content)
: content.untranslatable
)
.map((content) => {
if (typeof content === 'string') {
return escapeTextContentToEmbedInTemplateLiteral(content);
}
const sourcePlaceholderIdx = content.index;
const matchingPlaceholder =
placeholdersByIndex.get(sourcePlaceholderIdx);
if (matchingPlaceholder === undefined) {
throw new Error(
`Placeholder from translation does not appear in source.` +
`\nLocale: ${this.locale}` +
`\nPlaceholder: ${content.untranslatable}`
);
}
const parsedPlaceholder = parseStringAsTemplateLiteral(
matchingPlaceholder.untranslatable
);
if (ts.isTemplateExpression(parsedPlaceholder)) {
placeholderExpressionCounts.set(
sourcePlaceholderIdx,
parsedPlaceholder.templateSpans.length
);
for (let i = 0; i < parsedPlaceholder.templateSpans.length; i++) {
const placeholderRelativeExprIdx = i;
transExprToSourcePosition.set(absTransExprIdx++, [
sourcePlaceholderIdx,
placeholderRelativeExprIdx,
]);
}
}

return matchingPlaceholder.untranslatable;
})
.join('');

template = parseStringAsTemplateLiteral(templateLiteralBody);
if (ts.isTemplateExpression(template)) {
const newParts = [];
newParts.push(template.head.text);
for (const span of template.templateSpans) {
const expressionKey = templateLiteralBody.slice(
span.expression.pos - 1,
span.expression.end - 1
);
const sourceExpression = sourceExpressions.get(expressionKey);
if (sourceExpression === undefined) {
newTemplate = parseStringAsTemplateLiteral(templateLiteralBody);
if (ts.isTemplateExpression(newTemplate)) {
const newParts: Array<string | ts.Expression> = [];
newParts.push(newTemplate.head.text);
for (let i = 0; i < newTemplate.templateSpans.length; i++) {
const span = newTemplate.templateSpans[i];
const srcPos = transExprToSourcePosition.get(i);
if (srcPos === undefined) {
const expressionText = templateLiteralBody.slice(
span.expression.pos - 1,
span.expression.end - 1
);
throw new Error(
`Expression in translation does not appear in source.` +
`\nLocale: ${this.locale}` +
`\nExpression: ${expressionKey}`
`\nExpression: ${expressionText}`
);
}
newParts.push(sourceExpression);
const [sourcePlaceholderIdx, placeholderRelativeExprIdx] = srcPos;
let absSourceExprIdx = placeholderRelativeExprIdx;
for (let j = 0; j < sourcePlaceholderIdx; j++) {
// Offset by the length of all preceding placeholder indexes.
absSourceExprIdx += placeholderExpressionCounts.get(j) ?? 0;
}
if (!ts.isTemplateExpression(template)) {
throw new Error('Internal error');
}
const sourceExpression = template.templateSpans[absSourceExprIdx];
newParts.push(sourceExpression.expression);
newParts.push(span.literal.text);
}
template = makeTemplateLiteral(newParts);
newTemplate = makeTemplateLiteral(newParts);
}
}
// TODO(aomarks) Emit a warning that a translation was missing.
}

// Nothing more to do with a simple string.
if (ts.isStringLiteral(template)) {
if (ts.isStringLiteral(newTemplate)) {
if (tag === 'html') {
throw new KnownError(
'Internal error: string literal cannot be html-tagged'
Expand All @@ -407,10 +456,10 @@ class Transformer {
//
// Given: html`Hello <b>${"World"}</b>`
// Generate: html`Hello <b>World</b>`
template = makeTemplateLiteral(
this.recursivelyFlattenTemplate(template, tag === 'html')
newTemplate = makeTemplateLiteral(
this.recursivelyFlattenTemplate(newTemplate, tag === 'html')
);
return tag === 'html' ? tagLit(template) : template;
return tag === 'html' ? tagLit(newTemplate) : newTemplate;
}

/**
Expand Down

0 comments on commit 4a4afa7

Please sign in to comment.