Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

i18n: placeholders #9114

Merged
merged 77 commits into from
Jul 24, 2019
Merged

i18n: placeholders #9114

merged 77 commits into from
Jul 24, 2019

Conversation

exterkamp
Copy link
Member

@exterkamp exterkamp commented Jun 4, 2019

Add $placeholder$ syntax to i18n. This will allow us to escape all [mark](down) and {ICU} syntax so that it cannot be translated! Check out the README.md for a good summary of the new process.

Tested with tc/ and we are looking pretty good afaict.

Additional fun:

  • No-op to locales/ files

Follow ups:

  • multi-line descriptions
  • @description required
  • type getMessageInstanceIdFn param values with Record<string, string | number>
  • make test to fail when adding new @meanings

Related Issues/PRs
More talk: #6945
Evidence of need: #9197

@exterkamp exterkamp added the i18n internationalization thangs label Jun 4, 2019
Copy link
Member

@brendankenny brendankenny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only took a quick look, but here are some first thoughts.

Should also do a string with a plural replacement and a {timeInMs, number, milliseconds} replacement to explore how those could be handled

content: '[',
},
link_end: {
content: '](https://developers.google.com/web/tools/lighthouse/audits/minify-css)',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a pretty unfortunate syntax. Maybe we want to come up with something that expands to this in collect-strings.js?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we want to come up with something that expands to this in collect-strings.js?

Just to be clear, I meant this as a super low priority idea that we can think about while we do everything else first and implement it later if there's a nice fix. It kind of sucks to look at/edit but we could live with it if need be (we may need a test to make sure a typo in one of the contents doesn't accidentally break a link)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was also my thought. This is super heavyweight to have in each .js file. I can try to experiment.

Copy link
Collaborator

@connorjclark connorjclark Jun 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

something that expands to this in collect-strings.js

Let's just keep it as [some text {name} or even placeholders](url) and expand it to additional placeholders in collect strings.

Copy link
Member Author

@exterkamp exterkamp Jun 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like auto convert [ and ](url) into link_start link_end in collect strings?

I worry about making special cases, adding unseen complexity behind the scenes. I think part of the positives of a change like this is that it removes all ambiguity.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah - I think that'd be dope

const key = entry[0];
const value = entry[1];
//use key and value here
let regexStr = `\$${key}\$`;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we'll definitely want a collect-strings test that verifies that all the placeholders are in the string (would we also want the stricter check that there aren't any escaped values that don't have placeholders?)

//use key and value here
let regexStr = `\$${key}\$`;
console.log(key, value, regexStr);
localeMessage = localeMessage.replace(regexStr, value.content);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe to avoid the issue with positional replacements ($1 etc), we could use use the named replacements in values if value.content is \$\d+ or whatever? We'll just have to figure out how to do the mapping (just use key?)

'other {# errors found}}',
/** [ICU Syntax] Some gendered (ICU select) explanation... */
explanationGender: {
message: 'Someone minified this, {direct_replace_name}. {static_replacement} {person, select, ' +
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This contains a 1. direct replacement in {ICU string replace} a static_replacement that is covered by a placeholder, and "Google valid" ICU select syntax.

},
/** [ICU Syntax] Some gendered (ICU select) explanation... */
explanationGender2: {
message: 'Someone minified this, {name}. {static_replacement} {person, select, ' +
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This example uses placeholders to allow us to make more perfectly formed translation strings, but stutters with {name} in the message, and the content being the same {name} since that is the replacement we want to fill it with. The advantage here is that we can give the translators an example value of the replacement, e.g. here it is a name, so the example like "Karen" would give more context to the translators.

@exterkamp
Copy link
Member Author

Started some docs on explaining this for future development

@exterkamp
Copy link
Member Author

Update: Having a lot of trouble on the backend infra side converting these .json messages into .xmb's and roundtripping them back to .json intact.

@exterkamp
Copy link
Member Author

Okay, fixed the roundtrip issues 👍 But design question @paulirish @brendankenny @patrickhulce @hoten :
The intial messages.json format in en-US looks like this (fake-example):

  "lighthouse-core/audits/accessibility/accesskeys.js | failureTitle": {
    "message": "$ACCESS_KEY$ $VALUES$ are not unique",
    "placeholders": {
      "ACCESS_KEY": {
        "content": "`[accesskey]`"
      },
      "VALUES": {
        "example": "12",
        "content": "{value, number}"
      }
    },
    "description": "Title of an accesibility audit that evaluates if the ARIA HTML attributes are misaligned with the aria-role HTML attribute specificed on the element, such mismatches are invalid. This title is descriptive of the failing state and is shown to users when there is a failure that needs to be addressed."
  },

And after a roundtrip it looks like this (in another language here its en-XA):

  "lighthouse-core/audits/accessibility/accesskeys.js | failureTitle": {
    "message": "$ACCESS_KEY$ $VALUES$ åŕé ñöţ ûñîqûé",
    "placeholders": {
      "ACCESS_KEY": {
        "content": "`[accesskey]`"
      },
      "VALUES": {
        "content": "{value, number}"
      }
    }
  },

Notice the small variations, the roundtrip loses

  • examples
  • descriptions

Which is desired since they aren't really needed post translation.

But it also brings along all the static placeholder text. Which is a lot of extra bytes. Each language will have all the placeholder data with it. Thats a lot of copies of the strings. Suggestion:

  1. Keep all the strings, it makes each locale.json file totally self-contained
  2. Make a backfill mechanism to fallback to en-US to lookup the contents.

Thoughts?

@connorjclark
Copy link
Collaborator

But it also brings along all the static placeholder text. Which is a lot of extra bytes. Each language will have all the placeholder data with it. Thats a lot of copies of the strings. Suggestion:

could collect-strings save off a placeholders.json for just the icuMessageId -> placeholders data here, and just load placeholders.json here? Then delete placeholders from each locale file. I don't think having them self contained is important.

But these extra bytes don't end up in the report (just the bundle), so maybe it's not worth the effort.

Copy link
Member

@brendankenny brendankenny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

last round! Looks great

lighthouse-core/lib/i18n/README.md Show resolved Hide resolved
lighthouse-core/lib/i18n/i18n.js Outdated Show resolved Hide resolved
}

/**
* Take a series of messages and apply ĥât̂ markers to the translatable portions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 👍 🎩

lighthouse-core/scripts/i18n/collect-strings.js Outdated Show resolved Hide resolved
lighthouse-core/scripts/i18n/collect-strings.js Outdated Show resolved Hide resolved
psuedoLocalizedString.push(useHatForAccentMark ? `\u0302` : `\u0301`);
useHatForAccentMark = !useHatForAccentMark;
// Stop empty placeholders from being added to every message
if (Object.entries(converted.placeholders).length > 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could move this into convertMessageToPlaceholders, which would keep it encapsulated there (they just come out with no placeholders prop if there are no placeholders) and allow you to do the tests as you were in 125cb1d

something like this should be fine:

if (Object.entries(icuDefn.placeholders).length > 0) {
  icuDefn.placeholders = undefined;
}

then in here it would be

/** @type {ICUMessageDefn} */
const icuDefn = {
  message: converted.message,
  description,
  placeholders: converted.placeholders,
};

and while that's technically copying over undefined to the new object, this is bound for json which will strip out undefined values anyways

Copy link
Member Author

@exterkamp exterkamp Jul 23, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added, but I had to add a //@ts-ignore dunno if that was intended 😮 Unless you want to make placeholders | undefined in ts.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added, but I had to add a //@ts-ignore dunno if that was intended 😮 Unless you want to make placeholders | undefined in ts.

yeah, this is terrible it had to do that, but I wanted to do a follow up cleaning up some of these types, so I can handle it there.

@brendankenny
Copy link
Member

needs a rebase, too

Copy link
Member

@brendankenny brendankenny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

🎉 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes i18n internationalization thangs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants