Write a script that does a best guess at calculating locale translation percentages #550

Closed
pdehaan opened this issue Oct 22, 2018 · 7 comments

Comments

@pdehaan
Collaborator

pdehaan commented Oct 22, 2018

Not sure if there's much value in a local script that calculates translation percentages, since we can get that information from https://pontoon.mozilla.org/projects/firefox-monitor-website/, but it may be useful if we want to script the decision of which locales count as "production ready".

I took a stab at it, and the results seem to be similar to the Pontoon site (give or take some minor rounding variations).

#!/usr/bin/env node

/* eslint-disable no-console, no-process-exit, node/shebang, strict */

const {spawnSync} = require("child_process");
const {readdirSync} = require("fs");

const arg = require("arg");

main();

function main() {
  const flags = cli();

  if (flags.help) {
    return showHelp();
  }

  if (flags.version) {
    return showVersion();
  }

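  // Each entry under ./locales is treated as a locale code.
  const locales = readdirSync("./locales");
  // Ask compare-locales for a JSON report on stdout ("--json -").
  const args = ["l10n.toml", ".", ...locales, "--json", "-"];
  const {stdout} = spawnSync("compare-locales", args);
  const {summary} = JSON.parse(stdout)[0];
  const DEFAULT_LOCALE = "en";
  // The default locale's "unchanged" count serves as the total string count.
  const totalStrings = summary[DEFAULT_LOCALE].unchanged;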

  const res = Object.keys(summary)
    .map(locale => {
      const {missing, missingInFiles} = summary[locale];
      const missingStrings = (missing || 0) + (missingInFiles || 0);
      const progress = ((1 - (missingStrings / totalStrings)) * 100);
      return {locale, progress};
    })
    .filter(({progress}) => progress >= flags.threshold)
    .sort(sortLocales);

  if (flags.json) {
    console.log(JSON.stringify(res.map(r => r.locale), null, 2));
  } else {
    console.log(`total "${DEFAULT_LOCALE}" strings: ${totalStrings}`);
    res.forEach(language => console.log(`  - ${language.locale}: ${Math.round(language.progress)}%`));
  }
}

// Sort by progress %, then by locale name.
function sortLocales(localeA, localeB) {
  const diff = localeB.progress - localeA.progress;
  // If translation % is the same, sort by locale name.
  if (diff === 0) {
    if (localeA.locale > localeB.locale) {
      return 1;
    }
    if (localeB.locale > localeA.locale) {
      return -1;
    }
    return 0;
  }
  return diff;
}

function cli() {
  const args = arg({
    "--threshold": Number,
    "-t": "--threshold",
    "--help": Boolean,
    "-h": "--help",
    "--version": Boolean,
    "-v": "--version",
    "--json": Boolean
  });
  return {
    threshold: args["--threshold"] || 0,
    json: args["--json"] || false,
    help: args["--help"] || false,
    version: args["--version"] || false,
  };
}

function showHelp() {
  console.log("Loop through the locales and show the translation %.");
  console.log("\nUSAGE: `node l10n-lint [--threshold NN]`\n");
  console.log("NOTE: The optional `--threshold` (or `-t`) flag filters out locales that are below the specified translation percentage.");
  console.log("For example, `$ node l10n-lint -t 50` will only show locales that have at least 50% of the strings translated.");
}

function showVersion() {
  const {version} = require("./package.json");
  console.log(version);
}

Output:

$ node l10n-lint.js

total "en" strings: 261
  - en = 100%
  - en-CA = 100%
  - de = 100%
  - zh-TW = 85%
  - sl = 47%
  - pt-PT = 42%
  - cy = 22%
  - sv-SE = 18%
  - fr = 14%
  - zh-CN = 10%
  - it = 3%
  - es-MX = 2%
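
For anyone who wants to unit test the percentage math without shelling out to compare-locales, here is a minimal sketch that pulls the calculation into a pure function. The `progressFromSummary` name is hypothetical, and the sample `summary` object is made up for illustration; the only fields assumed are the ones the script above already reads (`unchanged`, `missing`, `missingInFiles`).

```javascript
// Hypothetical helper: same arithmetic as the script above, minus the I/O.
function progressFromSummary(summary, defaultLocale = "en") {
  // The default locale's "unchanged" count is used as the total string count.
  const totalStrings = summary[defaultLocale].unchanged;
  return Object.keys(summary).map(locale => {
    const {missing, missingInFiles} = summary[locale];
    const missingStrings = (missing || 0) + (missingInFiles || 0);
    const progress = (1 - (missingStrings / totalStrings)) * 100;
    return {locale, progress};
  });
}

// Made-up sample data, not real compare-locales output.
const sampleSummary = {
  en: {unchanged: 200},
  de: {missing: 50},
  fr: {missing: 100, missingInFiles: 50},
};
const sampleProgress = progressFromSummary(sampleSummary);
sampleProgress.forEach(({locale, progress}) => console.log(`${locale}: ${progress}%`));
```

Keeping the math separate from `spawnSync` would also make it easier to swap in a different data source later.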

UPDATE: Updated the code above to allow for an optional --threshold (or -t) argument which lets you set a minimum translation %:

$ node l10n-lint.js -t 85

total "en" strings: 261
  - de = 100%
  - en = 100%
  - en-CA = 100%
  - zh-TW = 85%

UPDATE: Refactored and added a couple more CLI flags. Now you can specify --json which will return an array of locales (which may play a bit nicer w/ l10n.toml):

$ ./l10n-lint.js --threshold 50 --json

[
  "cy",
  "de",
  "en",
  "en-CA",
  "zh-TW",
  "sl"
]

@groovecoder
Member

I'd actually prefer to make this part of LocaleUtils.init() so we could potentially only push a locale into availableLanguages when its translation % reaches a certain threshold.
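
A rough sketch of what that threshold check could look like. The `pickAvailableLanguages` function and the shape of its input are assumptions for illustration only; this is not the actual `LocaleUtils` API, and the percentages in the example are invented.

```javascript
// Hypothetical helper, not part of LocaleUtils: given a map of
// locale code -> translation %, keep the locales at or above the threshold.
// The default locale is always kept, since it is the source language.
function pickAvailableLanguages(progressByLocale, threshold, defaultLocale = "en") {
  const availableLanguages = [];
  for (const [locale, progress] of Object.entries(progressByLocale)) {
    if (locale === defaultLocale || progress >= threshold) {
      availableLanguages.push(locale);
    }
  }
  return availableLanguages;
}

// Invented percentages, for illustration only.
const picked = pickAvailableLanguages({en: 100, de: 100, "zh-TW": 85, fr: 14}, 85);
console.log(picked);
```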

@groovecoder
Member

@flodolo is there any kind of Pontoon API we could query to get the translation % for a project per locale?

@flodolo
Collaborator

flodolo commented Oct 25, 2018

Can you take a look at this doc and ping @mathjazz if you get stuck?

@mathjazz
Contributor

mathjazz commented Oct 25, 2018

This API call should give you the data you need. The query looks like this:

query {
    project(slug: "firefox-monitor-website") {
        name
        localizations {
            locale {
                code
            },
            approvedStrings,
            stringsWithWarnings,
            totalStrings,
        }
    }
}

Completion is calculated as (approvedStrings + stringsWithWarnings) / totalStrings.
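
As a small sketch, the completion formula could be written in JavaScript like this. The `completionPercent` name and the zero-strings guard are my own assumptions, and the sample counts are invented rather than taken from Pontoon.

```javascript
// Hypothetical helper: completion % from one entry of the "localizations" list.
function completionPercent({approvedStrings, stringsWithWarnings, totalStrings}) {
  // Assumption: treat a project with no strings as 0% rather than dividing by zero.
  if (!totalStrings) {
    return 0;
  }
  return ((approvedStrings + stringsWithWarnings) / totalStrings) * 100;
}

// Invented counts, for illustration only.
const pct = completionPercent({approvedStrings: 200, stringsWithWarnings: 22, totalStrings: 261});
console.log(`${Math.round(pct)}%`);
```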

You can also have a look at the blog post to learn more about the API.

@groovecoder groovecoder added this to the l10n Launch milestone Oct 25, 2018
@pdehaan
Collaborator Author

pdehaan commented Oct 25, 2018

Awesome, I was looking for an excuse to poke around w/ graphql.

This is about as close as I got. Oddly, I'm not seeing our default locale ("en") in the results:

const got = require("got");

const PROJECT_SLUG = "firefox-monitor-website";
const query = `{project(slug:"${PROJECT_SLUG}"){name,localizations{locale{code,name}totalStrings,approvedStrings,stringsWithWarnings,missingStrings}}}`;

async function main() {
  const {body} = await got("https://pontoon.mozilla.org/graphql", {json: true, query: `query=${query}`});
  const localizations = body.data.project.localizations
    .map(locale => {
      locale.progress = (1 - (locale.missingStrings / locale.totalStrings)) * 100;
      return locale;
    })
    .filter(locale => locale.progress > 0)
    .sort((localeA, localeB) => localeB.progress - localeA.progress);

  localizations.forEach(locale => console.log(`${locale.locale.code} \t => ${Math.round(locale.progress)}%`));
}

main()
  .catch(err => console.error(err));

$ node graphql-test.js

zh-CN    => 100%
de       => 100%
cy       => 100%
zh-TW    => 100%
en-CA    => 100%
cs       => 95%
fr       => 58%
sl       => 54%
pt-PT    => 41%
id       => 38%
es-AR    => 32%
az       => 25%
sv-SE    => 21%
sk       => 14%
kab      => 12%
es-CL    => 9%
ur       => 9%
it       => 7%
ru       => 7%
es-MX    => 6%
ja       => 3%

UPDATE: Or this one, which combines the .map() and .filter() into a single .reduce() and shims the default "en" locale into the response:

const got = require("got");

async function main(slug) {
  const query = {query: `{project(slug:"${slug}"){name,localizations{locale{code,name}totalStrings,approvedStrings,stringsWithWarnings,missingStrings}}}`};
  const {body} = await got("https://pontoon.mozilla.org/graphql", {json: true, query});

  // TODO: Add a couple more guards around blindly hoping "body.data.project.localizations"
  // is a thing.
  const localizations = body.data.project.localizations
    .reduce((arr, locale) => {
      locale.progress = (locale.approvedStrings + locale.stringsWithWarnings) / locale.totalStrings * 100;
      if (locale.progress > 0) {
        arr.push(locale);
      }
      return arr;
    }, [])
    .sort((localeA, localeB) => localeB.progress - localeA.progress);

  localizations.unshift({locale: {code: "en"}, progress: 100});
  localizations.forEach(locale => console.log(`${locale.locale.code} \t => ${Math.round(locale.progress)}%`));
}

main("firefox-monitor-website")
  .catch(err => console.error(err));
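
One possible way to handle the TODO about guarding `body.data.project.localizations`, as a sketch only. The `getLocalizations` name is hypothetical, and it assumes (unverified) that an unknown slug or an error response leaves `data` or `project` missing or null.

```javascript
// Hypothetical guard: return the localizations array if the response has the
// expected shape, otherwise an empty array so the caller can carry on safely.
function getLocalizations(body) {
  const project = body && body.data && body.data.project;
  if (!project || !Array.isArray(project.localizations)) {
    return [];
  }
  return project.localizations;
}

// Hand-written response shapes, for illustration only.
console.log(getLocalizations({data: {project: null}}).length);
console.log(getLocalizations({data: {project: {localizations: [{locale: {code: "de"}}]}}}).length);
```

Returning an empty array keeps the `.reduce()` and `.sort()` chain working unchanged; throwing an error instead would be just as reasonable if a missing project should be treated as a failure.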

UPDATE: Pretty boring, but I pushed a version of this to my GitHub repo as a Node module (but haven't published to npm): https://github.com/pdehaan/pontoonql

@groovecoder
Member

Pulling this out of l10n launch milestone. We're going to launch with a manually-configured list of supported locales in an environment variable. (See #563)
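
For reference, reading a manually configured list could look something like the sketch below. The `SUPPORTED_LOCALES` variable name, the comma-separated format, and the `parseSupportedLocales` helper are all assumptions for illustration; the real configuration is whatever #563 settles on.

```javascript
// Hypothetical parser for a comma-separated list of locale codes.
function parseSupportedLocales(value, fallback = ["en"]) {
  if (!value) {
    return fallback;
  }
  const locales = value
    .split(",")
    .map(locale => locale.trim())
    .filter(Boolean);
  return locales.length > 0 ? locales : fallback;
}

// SUPPORTED_LOCALES is an assumed name, not the project's actual variable.
const supportedLocales = parseSupportedLocales(process.env.SUPPORTED_LOCALES);
console.log(supportedLocales);
```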

@groovecoder groovecoder removed this from the l10n Launch milestone Oct 30, 2018
@EMMLynch
Collaborator

EMMLynch commented Mar 6, 2024

Closing since we've redesigned the site and functionality since this was created. If you feel that this is still needed, please let me know.

@EMMLynch EMMLynch closed this as completed Mar 6, 2024