Automatically generate UK English and bb_BB locale language files #3

Open
bobbingwide opened this issue Sep 12, 2017 · 4 comments

bobbingwide commented Sep 12, 2017

In bobbingwide/oik#9 we've changed the default language for translatable strings from UK English ( en_GB ) to US English ( en_US ).
For testing of oik and the shared libraries in oik-libs we use two locales - en_GB and bb_BB ( the bbboing language ).
We need to automatically generate the UK English files using l10n.php, which calls la_CY.php.
The bb_BB locale is generated using bb_BB.php.

Proposed solution

  • For UK English we'll use a language variants file, which contains different spellings or words for each target locale.
  • Some variants will always be used, regardless of context. e.g. Behaviour for Behavior.
  • Others will require checking against the context. e.g. Cheque for Check when used in a banking context.
  • When the context can be determined automatically we should be able to automate the logic to deliver the correct variant without prompting.

bobbingwide commented Sep 13, 2017

The current implementation for UK English uses multiple language variants files.

The first file loaded is WordPress.org Shared English Variants Translation Glossary - Variants.csv

  • This file was created by downloading the variants from wordpress.org and exporting to a .csv file.
  • See https://en-gb.wordpress.org/translations/
  • This file contains multiple variants for different English locales.
  • After the variants for each locale it contains additional columns:
  • pos ( Part of Speech ) - indicates noun, verb, adjective, etc.
  • description - indicates when we may need to take context into account.

Then we load a locale specific file that overrides the 'standard' variants.

  • This is to take into account the context ( msgctxt in the .pot file ) for the string.
  • The file name is source_locale-target_locale.csv. e.g. en_US-en_GB.csv
  • The format of the file is source,target,context
  • e.g. check,cheque,bank
  • The mapping takes into account the context, which must be set using the appropriate source code.
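
The override file format described above can be sketched as follows. This is a Python illustration only, not the l10n.php code; the function names `load_overrides` and `lookup` are hypothetical, and the sample rows are assumed contents for en_US-en_GB.csv. It assumes a row may leave out the context column, in which case the mapping applies in any context.

```python
import csv
import io

# Assumed sample contents of en_US-en_GB.csv, in source,target,context order.
OVERRIDES_CSV = """check,cheque,bank
colored,coloured
"""

def load_overrides(text):
    """Return {(source, context): target}; context is '' when the row omits it."""
    overrides = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        source = row[0].strip().lower()
        target = row[1].strip()
        context = row[2].strip() if len(row) > 2 else ""
        overrides[(source, context)] = target
    return overrides

def lookup(overrides, word, context=""):
    """Prefer a context-specific mapping, then a context-free one, else None."""
    key = word.lower()
    return overrides.get((key, context), overrides.get((key, "")))

overrides = load_overrides(OVERRIDES_CSV)
print(lookup(overrides, "check", "bank"))
print(lookup(overrides, "check", "examine"))
```

In this sketch the context passed to `lookup` stands in for the msgctxt of the string being translated.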

Variant examples

| US English | Context | UK English |
| --- | --- | --- |
| Check color | - | Check colour |
| Check color | examine | Check colour |
| Check color | bank | Cheque colour |

Note:

  • The logic attempts to match the capitalization in the source text.
  • The logic does not cater for plurals.
  • The override need not take the context into account.
  • It can therefore be used to add further variants. e.g. 'colored,coloured'
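
A minimal sketch of the capitalization matching, again in Python rather than the actual PHP. The names `match_case` and `translate` and the two small dictionaries are hypothetical; they hold just enough data to reproduce the variant examples. Like the real logic, it does not cater for plurals.

```python
import re

# Illustrative data only: "color" always maps, "check" maps only in a bank context.
ALWAYS = {"color": "colour"}
BY_CONTEXT = {("check", "bank"): "cheque"}

def match_case(source, target):
    """Copy the capitalization pattern of source onto target."""
    if source.isupper():
        return target.upper()
    if source[:1].isupper():
        return target[:1].upper() + target[1:]
    return target

def translate(text, context=""):
    """Replace each word that has a variant, keeping its capitalization."""
    def swap(match):
        word = match.group(0)
        key = word.lower()
        target = BY_CONTEXT.get((key, context), ALWAYS.get(key))
        return match_case(word, target) if target else word
    return re.sub(r"[A-Za-z]+", swap, text)

for ctx in ("", "examine", "bank"):
    print(ctx or "-", "=>", translate("Check color", ctx))
```

Whole words are looked up as they stand, so "colors" is left alone unless a separate row such as colors,colours is added.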

bobbingwide commented Nov 25, 2017

It was pointed out to me that each language/locale may have its own glossary file. e.g. locale-en-gb-glossary.csv for UK English.

This can be downloaded from GlotPress https://translate.wordpress.org/locale/en-gb/default/glossary
using the Export as CSV link https://translate.wordpress.org/locale/en-gb/default/glossary/-export

The default file name for en_GB is --locale-en-gb-glossary.csv.

As it was rather tricky to use a file with this name on my Windows machine, I have called my local version locale-en-gb-glossary.csv.

Similarly the en-au file contains the Australian English glossary.

The oik-i18n code should be changed to use the default glossary file in preference to the variants file, and then the local override if the translation is still not right.
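
One possible reading of that lookup order, sketched in Python. This is an assumption about the intended precedence, not the oik-i18n code: the local override is consulted first so that it can correct anything that is still wrong, then the glossary, then the variants file. The function name `resolve` and all three dictionaries are made-up illustrations.

```python
def resolve(word, glossary, variants, overrides):
    """Return (target, layer) from the highest-priority layer that knows word."""
    key = word.lower()
    layers = (
        ("override", overrides),  # local fix-ups win
        ("glossary", glossary),   # locale glossary preferred over variants
        ("variants", variants),   # shared variants file as the fallback
    )
    for name, mapping in layers:
        if key in mapping:
            return mapping[key], name
    return None, None

# Illustrative entries only; the real files are the CSV exports named above.
glossary = {"color": "colour"}
variants = {"color": "colour", "colored": "coloured"}
overrides = {"howdy": "hi"}

for word in ("color", "colored", "Howdy", "check"):
    print(word, "=>", resolve(word, glossary, variants, overrides))
```

Returning the layer name alongside the target makes it easy to see which file supplied a translation when checking the generated en_GB output.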

bobbingwide commented Nov 25, 2017

The US English 'Howdy,' replacement string is not being translated into the UK English 'Hi,' replacement string.

This was because the variants file was in mixed case. Adding howdy,hi to the en_US-en_GB.csv file resolves the issue.
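
A small Python sketch of how mixed case can defeat a lookup. It assumes that source words are lower-cased before being looked up, which is a guess at the mechanism and not confirmed from the PHP code; `find_variant`, `normalize_keys` and the sample row are hypothetical.

```python
def find_variant(variants, word):
    """Exact-key lookup on the lower-cased word (the assumed behaviour)."""
    return variants.get(word.lower())

def normalize_keys(variants):
    """Lower-case every key so mixed-case rows can still be matched."""
    return {source.lower(): target for source, target in variants.items()}

mixed_case = {"Howdy": "Hi"}  # hypothetical mixed-case row

print(find_variant(mixed_case, "Howdy"))                  # no match on the mixed-case key
print(find_variant(normalize_keys(mixed_case), "Howdy"))  # match after normalizing keys
print(find_variant({"howdy": "hi"}, "Howdy"))             # the lower-case override row
```

Normalizing the keys when the variants file is loaded would be an alternative to adding a lower-case row to the override file.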

bobbingwide commented

The en_GB and bb_BB locale files are automatically generated but the en_GB version isn't using the latest glossary files for UK English or the other English variants.

Leaving open until that's done.
