Fix for string patterns that were erroneously translated. #18

Closed
5 tasks
kathy-phet opened this issue Dec 16, 2022 · 12 comments

@kathy-phet

kathy-phet commented Dec 16, 2022

In issue phetsims/rosetta#329, we note that some translations have "translated" the placeholder names inside string patterns. These placeholders must remain exactly as they appear in the English strings for the code to work correctly. They are effectively variable names, not text that needs to be translated.

We need to find the translations where the code pattern was mistakenly translated and fix them. We will do this by hand for the Molecule Polarity strings. Over in phetsims/rosetta#329, we are investigating whether babel still allows this error to happen or already has a fix in place.

@oliver-phet - Can you please go through the existing translations and inspect this row:

pattern.dipoleDirection {{from}} → {{to}}

If it reads any differently from the English, replace it with the English version:
{{from}} → {{to}}

And can you do so without changing the credits?
@zepumph - I think you mentioned you might be able to make a quick script to see which translations needed this hand fix?

Here is a short list CM started:

  • da (1)
  • fr (1)
  • ko (2) -- I think CM means that this impacted 2 such code patterns within Molecule Polarity, so keep an eye out for that!
  • vi (1)
  • zh_CN (2) -- I think CM means that this impacted 2 such code patterns within Molecule Polarity, so keep an eye out for that!
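
To illustrate why these placeholders behave like variable names, here is a minimal sketch of how a pattern such as `{{from}} → {{to}}` might be filled in at runtime. `fillIn` is a hypothetical stand-in written for this example, not PhET's actual implementation. The point is only that values are looked up by placeholder name, so a renamed placeholder is never substituted.

```javascript
// Hypothetical fill-in helper (an assumption for illustration, not PhET's code).
// Replaces each {{name}} with values[ name ]; unknown names are left untouched.
function fillIn( template, values ) {
  return template.replace( /\{\{([^{}]+)\}\}/g, ( match, name ) =>
    Object.prototype.hasOwnProperty.call( values, name ) ? String( values[ name ] ) : match
  );
}

// The code supplies values keyed by the English placeholder names.
const values = { from: 'A', to: 'B' };

console.log( fillIn( '{{from}} → {{to}}', values ) ); // A → B
console.log( fillIn( '{{fra}} → {{til}}', values ) ); // {{fra}} → {{til}} (placeholders were translated, so nothing is filled in)
```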
@zepumph
Member

zepumph commented Dec 16, 2022

Oops sorry, I'll repeat phetsims/rosetta#329 over here:

Hello! I spent 15 minutes and wrote a script that should show every translation in which the template variables in the current value of a translated string do not match those of the English version in master. I think it will at least be a good starting point, but it is worth noting that there may be a disconnect between published sims and master. Please also note that this does not look at any values in the babel translation "history", just the current value.

The script
const fs = require( 'fs' );
const _ = require( 'lodash' );
const x = fs.readdirSync( '../babel' );


const templateVarRegex = /{\{?[\w\d]+\}\}?/g;
const openRegex = /{/g;
const closeRegex = /}/g;

function getTemplatedVars( string ) {

  const matches = string.match( templateVarRegex ) || [];

  // Discard matches with unbalanced braces (e.g. {HI}}), whatever the reason for them may be
  return matches.filter( templateVar => templateVar.match( openRegex ).length === templateVar.match( closeRegex ).length );
}

// @returns map from string key to the list of template vars found in its English value
function getTemplateVarsInEnglishStringFile( repo ) {

  const englishStringFileName = `../${repo}/${repo}-strings_en.json`;
  const englishStrings = JSON.parse( fs.readFileSync( englishStringFileName ).toString() );

  // Record<stringKey, Array<templateVarString>>
  const keyTemplateMap = {};
  Object.keys( englishStrings ).forEach( key => {
    const stringValue = englishStrings[ key ].value;
    if ( stringValue ) {
      const templateVars = getTemplatedVars( stringValue );
      if ( templateVars.length > 0 ) {
        keyTemplateMap[ key ] = templateVars;
      }
    }
  } );

  return keyTemplateMap;
}

x.forEach( dir => {
  const babelRepoDir = `../babel/${dir}`;
  if ( !dir.startsWith( '_' ) && !dir.startsWith( '.' ) && fs.statSync( babelRepoDir ).isDirectory() ) {


    const repo = dir;
    const templateVarsMap = getTemplateVarsInEnglishStringFile( repo );

    fs.readdirSync( babelRepoDir ).forEach( translatedStringFileContents => {
      const translatedStrings = JSON.parse( fs.readFileSync( `../babel/${dir}/${translatedStringFileContents}` ).toString() );
      Object.keys( templateVarsMap ).forEach( stringKey => {

        if ( translatedStrings[ stringKey ] ) {

          const translatedStringValue = translatedStrings[ stringKey ].value;
          const translatedTemplateVars = getTemplatedVars( translatedStringValue );
          const englishTemplatedVars = templateVarsMap[ stringKey ];

          // Set because order doesn't matter
          if ( !_.isEqual( new Set( englishTemplatedVars ), new Set( translatedTemplateVars ) ) ) {
            console.log( `${translatedStringFileContents}@${stringKey}:\n`, 'english vars:', englishTemplatedVars, '\n translated vars:', translatedTemplateVars, '\n' );
          }
        }
      } );
    } );
  }
} );
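
The comparison at the heart of the script can be tried in isolation. The sketch below mirrors the `new Set( ... )` comparison using plain JavaScript instead of lodash so that it runs without dependencies; `sameTemplateVars` is a name made up for this example. One consequence of comparing sets is worth noting: order and repetition are ignored, so a translation that drops one of two identical placeholders would not be reported.

```javascript
// Returns true when both lists contain the same distinct template vars.
// Order and duplicates are ignored, as in the Set-based check in the script.
function sameTemplateVars( englishVars, translatedVars ) {
  const english = new Set( englishVars );
  const translated = new Set( translatedVars );
  return english.size === translated.size &&
         [ ...english ].every( templateVar => translated.has( templateVar ) );
}

console.log( sameTemplateVars( [ '{{from}}', '{{to}}' ], [ '{{to}}', '{{from}}' ] ) ); // true (order does not matter)
console.log( sameTemplateVars( [ '{{from}}', '{{to}}' ], [ '{{fra}}', '{{til}}' ] ) ); // false (placeholders were translated)
console.log( sameTemplateVars( [ '{0}', '{0}' ], [ '{0}' ] ) );                       // true (duplicates collapse in a Set)
```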
Results

beers-law-lab-strings_ru.json@pattern.0label:
english vars: [ '{0}' ]
translated vars: []

beers-law-lab-strings_ru.json@pattern.0value.1units:
english vars: [ '{0}', '{1}' ]
translated vars: []

beers-law-lab-strings_ru.json@pattern.0percent:
english vars: [ '{0}' ]
translated vars: []

beers-law-lab-strings_ru.json@pattern.0formula.1name:
english vars: [ '{0}', '{1}' ]
translated vars: []

charges-and-fields-strings_gu.json@pattern.0value.1units:
english vars: [ '{0}', '{1}' ]
translated vars: []

circuit-construction-kit-common-strings_nb.json@resistanceOhmsSymbol:
english vars: [ '{{resistance}}' ]
translated vars: []

circuit-construction-kit-common-strings_ta.json@resistanceOhmsValuePattern:
english vars: [ '{{resistance}}' ]
translated vars: []

expression-exchange-strings_ko.json@numberCentsPattern:
english vars: [ '{{number}}' ]
translated vars: []

expression-exchange-strings_ko.json@levelNumberPattern:
english vars: [ '{{levelNumber}}' ]
translated vars: []

expression-exchange-strings_zh_CN.json@numberCentsPattern:
english vars: [ '{{number}}' ]
translated vars: []

expression-exchange-strings_zh_CN.json@levelNumberPattern:
english vars: [ '{{levelNumber}}' ]
translated vars: []

fluid-pressure-and-flow-strings_gu.json@readoutFeet:
english vars: [ '{0}' ]
translated vars: []

fluid-pressure-and-flow-strings_gu.json@massLabelPattern:
english vars: [ '{0}' ]
translated vars: []

fluid-pressure-and-flow-strings_gu.json@readoutMeters:
english vars: [ '{0}' ]
translated vars: []

fluid-pressure-and-flow-strings_gu.json@valueWithUnitsPattern:
english vars: [ '{0}', '{1}' ]
translated vars: []

gene-expression-essentials-strings_es_ES.json@gene:
english vars: [ '{{geneID}}' ]
translated vars: []

gene-expression-essentials-strings_fa.json@gene:
english vars: [ '{{geneID}}' ]
translated vars: []

gene-expression-essentials-strings_ja.json@gene:
english vars: [ '{{geneID}}' ]
translated vars: []

gene-expression-essentials-strings_mk.json@gene:
english vars: [ '{{geneID}}' ]
translated vars: []

gene-expression-essentials-strings_ta.json@gene:
english vars: [ '{{geneID}}' ]
translated vars: []

joist-strings_nl.json@credits.qualityAssurance:
english vars: [ '{0}' ]
translated vars: []

masses-and-springs-strings_ig.json@dampingEqualsZero:
english vars: [ '{{equalsZero}}' ]
translated vars: []

masses-and-springs-strings_ig.json@gravityValue:
english vars: [ '{{gravity}}' ]
translated vars: []

masses-and-springs-strings_ig.json@massValue:
english vars: [ '{{mass}}' ]
translated vars: []

molarity-strings_fa.json@pattern.parentheses.0text:
english vars: [ '{0}' ]
translated vars: []

molecule-polarity-strings_bs.json@pattern.dipoleDirection:
english vars: [ '{{from}}', '{{to}}' ]
translated vars: [ '{{od}}', '{{do}}' ]

molecule-polarity-strings_da.json@pattern.dipoleDirection:
english vars: [ '{{from}}', '{{to}}' ]
translated vars: [ '{{fra}}', '{{til}}' ]

molecule-polarity-strings_fa.json@pattern.atomName:
english vars: [ '{{name}}' ]
translated vars: []

molecule-polarity-strings_ko.json@pattern.atomName:
english vars: [ '{{name}}' ]
translated vars: []

molecule-polarity-strings_ko.json@pattern.symbolName:
english vars: [ '{{symbol}}', '{{name}}' ]
translated vars: []

molecule-polarity-strings_ko.json@pattern.dipoleDirection:
english vars: [ '{{from}}', '{{to}}' ]
translated vars: []

molecule-polarity-strings_vi.json@pattern.dipoleDirection:
english vars: [ '{{from}}', '{{to}}' ]
translated vars: []

molecule-polarity-strings_zh_CN.json@pattern.atomName:
english vars: [ '{{name}}' ]
translated vars: []

molecule-polarity-strings_zh_CN.json@pattern.symbolName:
english vars: [ '{{symbol}}', '{{name}}' ]
translated vars: []

molecule-polarity-strings_zh_CN.json@pattern.dipoleDirection:
english vars: [ '{{from}}', '{{to}}' ]
translated vars: []

number-play-strings_ht.json@wordLanguage:
english vars: [ '{{language}}' ]
translated vars: [ '{{langue}}' ]

pendulum-lab-strings_nb.json@gravitationalAccelerationPattern:
english vars: [ '{{gravity}}' ]
translated vars: [ '{{gravitasjon}}' ]

pendulum-lab-strings_nb.json@degreesPattern:
english vars: [ '{{degrees}}' ]
translated vars: [ '{{grader}}' ]

pendulum-lab-strings_nb.json@secondsPattern:
english vars: [ '{{seconds}}' ]
translated vars: [ '{{sekunder}}' ]

proportion-playground-strings_eu.json@pricePattern:
english vars: [ '{{price}}' ]
translated vars: [ '{{prezioa}}' ]

proportion-playground-strings_ko.json@pricePattern:
english vars: [ '{{price}}' ]
translated vars: []

proportion-playground-strings_vi.json@pricePattern:
english vars: [ '{{price}}' ]
translated vars: []

proportion-playground-strings_zh_CN.json@pricePattern:
english vars: [ '{{price}}' ]
translated vars: []

scenery-phet-strings_ht.json@keyboardHelpDialog.grabOrReleaseHeadingPattern:
english vars: [ '{{thing}}' ]
translated vars: [ '{{chose}}' ]

scenery-phet-strings_ht.json@keyboardHelpDialog.grabOrReleaseLabelPattern:
english vars: [ '{{thing}}' ]
translated vars: [ '{{chose}}' ]

scenery-phet-strings_ht.json@measuringTapeReadoutPattern:
english vars: [ '{{distance}}', '{{units}}' ]
translated vars: [ '{{distance}}' ]

scenery-phet-strings_tg.json@measuringTapeReadoutPattern:
english vars: [ '{{distance}}', '{{units}}' ]
translated vars: []

trig-tour-strings_iw.json@valueUnitPattern:
english vars: [ '{0}', '{1}' ]
translated vars: []

trig-tour-strings_iw.json@numberPiPattern:
english vars: [ '{0}', '{1}' ]
translated vars: []

vegas-strings_ht.json@pattern.0yourBest:
english vars: [ '{0}' ]
translated vars: []

vegas-strings_mt.json@label.level:
english vars: [ '{0}' ]
translated vars: []

@zepumph
Member

zepumph commented Dec 16, 2022

I think we should figure out whether and how all of these should be changed, rather than handling only Molecule Polarity. Perhaps this is a more common issue?

It wasn't clear to me whether we should do this via the Rosetta interface (which will likely help by triggering a rebuild of the production version) or via something faster and more behind the scenes. Over to you @kathy-phet to see the scope of the issue (beyond molecule-polarity) and to recommend who should lead the assault!

@zepumph zepumph assigned kathy-phet and unassigned zepumph Dec 16, 2022
@pixelzoom
Contributor

Transferring this issue to babel. It's not specific to molecule-polarity, it's a problem in many sims, as demonstrated by the "Results" in #18.

@pixelzoom pixelzoom changed the title Fix for string pattern {{from}} {{to}} in existing Molecule Polarity translations Fix for string patterns that were erroneously translated. Dec 16, 2022
@pixelzoom pixelzoom transferred this issue from phetsims/molecule-polarity Dec 16, 2022
@oliver-phet
Contributor

@kathy-phet can you please prioritize this?

@kathy-phet
Author

@oliver-phet - If you can take care of Molecule Polarity by next Tuesday, that would be great. Since I don't think it will take a lot of time, it would be ideal to finish off the rest of the list before you leave for the holiday and then close this issue. Since we know Rosetta won't let this happen again, we can just fix these and be done. Thanks!

@zepumph
Member

zepumph commented Dec 17, 2022

@kathy-phet, @oliver-phet, and I spoke again today. We are most worried about phetsims/rosetta#329 and whether translators can continue to make this mistake. When it comes time to fix things, though, here is a more accurate script that compares the release branch en strings to babel, instead of using master. I also fixed a bug where non-ASCII chars were not being printed in the results:

The script (run from `./perennial/script.js`)
const fs = require( 'fs' );
const _ = require( 'lodash' );
const axios = require( 'axios' );
const gitCheckout = require( '../perennial/js/common/gitCheckout.js' );

let output = '';

// Match template vars containing anything except braces (e.g. supports '{{名字}}')
const templateVarRegex = /{\{?[^{}]+\}\}?/g;
const openRegex = /{/g;
const closeRegex = /}/g;

function getTemplatedVars( string ) {

  const matches = string.match( templateVarRegex ) || [];

  // Discard matches with unbalanced braces (e.g. {HI}}), whatever the reason for them may be
  return matches.filter( templateVar => templateVar.match( openRegex ).length === templateVar.match( closeRegex ).length );
}

let metadata = null;


const metadataURL = 'https://phet.colorado.edu/services/metadata/1.3/simulations?format=json&type=html&summary';

async function getMetadata() {
  return ( await axios.get( metadataURL ) ).data.projects;
}

async function getLatestReleaseBranch( repo ) {

  if ( !metadata ) {
    metadata = await getMetadata();
  }
  const simProject = _.find( metadata, entry => entry.name === `html/${repo}` );
  if ( !simProject ) {
    return 'master';
  }
  const simProjectVersion = simProject.version;
  return `${simProjectVersion.major}.${simProjectVersion.minor}`;
}

// @returns map from string key to the list of template vars found in its English value
async function getTemplateVarsInEnglishStringFile( repo ) {
  console.log( repo );
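  await gitCheckout( repo, await getLatestReleaseBranch( repo ) );
  let englishStrings;
  try {
    const englishStringFileName = `../${repo}/${repo}-strings_en.json`;
    englishStrings = JSON.parse( fs.readFileSync( englishStringFileName ).toString() );
  }
  finally {
    // Always restore master, even if the string file is missing or malformed on the release branch.
    await gitCheckout( repo, 'master' );
  }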


  // Record<stringKey, Array<templateVarString>>
  const keyTemplateMap = {};
  Object.keys( englishStrings ).forEach( key => {
    const stringValue = englishStrings[ key ].value;
    if ( stringValue ) {
      const templateVars = getTemplatedVars( stringValue );
      if ( templateVars.length > 0 ) {
        keyTemplateMap[ key ] = templateVars;
      }
    }
  } );

  return keyTemplateMap;
}

const babelRepoDir = '../babel';

async function checkForRepos( repos ) {
  for ( const repo of repos ) {

    const templateVarsMap = await getTemplateVarsInEnglishStringFile( repo );

    fs.readdirSync( `${babelRepoDir}/${repo}` ).forEach( translatedStringFileContents => {
      const translatedStrings = JSON.parse( fs.readFileSync( `../babel/${repo}/${translatedStringFileContents}` ).toString() );
      Object.keys( templateVarsMap ).forEach( stringKey => {

        if ( translatedStrings[ stringKey ] ) {

          const translatedStringValue = translatedStrings[ stringKey ].value;
          const translatedTemplateVars = getTemplatedVars( translatedStringValue );
          const englishTemplatedVars = templateVarsMap[ stringKey ];

          // Set because order doesn't matter
          if ( !_.isEqual( new Set( englishTemplatedVars ), new Set( translatedTemplateVars ) ) ) {
            output += `${translatedStringFileContents}:\n` + ' english vars: ' + JSON.stringify( englishTemplatedVars ) + '\n translated vars: ' +
                      JSON.stringify( translatedTemplateVars ) + '\n';
          }
        }
      } );
    } );
  }
}

( async () => {

  const babelRepos = fs.readdirSync( babelRepoDir ).filter( dir => {
    return !dir.startsWith( '_' ) && !dir.startsWith( '.' ) && fs.statSync( `${babelRepoDir}/${dir}` ).isDirectory();
  } );

  await checkForRepos( babelRepos );
  fs.writeFileSync( 'output.txt', output );
  console.log( 'written to output.txt' );
} )();
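
The release-branch lookup in `getLatestReleaseBranch` can be exercised without a network call. The sketch below assumes only the metadata shape the script itself relies on (a `projects` array whose entries have `name` and `version.major`/`version.minor`). The function name `releaseBranchFor`, the sample entry, and its version numbers are invented for illustration.

```javascript
// Same lookup as in the script, but taking the projects array as a parameter
// so it can be tried with sample data instead of the live metadata service.
function releaseBranchFor( projects, repo ) {
  const simProject = projects.find( entry => entry.name === `html/${repo}` );
  if ( !simProject ) {
    // Repos that are not published sims (e.g. common-code repos) fall back to master.
    return 'master';
  }
  return `${simProject.version.major}.${simProject.version.minor}`;
}

// Made-up sample data; the real service returns many more fields and entries.
const sampleProjects = [
  { name: 'html/molecule-polarity', version: { major: 1, minor: 2 } }
];

console.log( releaseBranchFor( sampleProjects, 'molecule-polarity' ) ); // 1.2
console.log( releaseBranchFor( sampleProjects, 'scenery-phet' ) );      // master
```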
Results
beers-law-lab-strings_ru.json:
 english vars: ["{0}"]
 translated vars: []
beers-law-lab-strings_ru.json:
 english vars: ["{0}","{1}"]
 translated vars: []
beers-law-lab-strings_ru.json:
 english vars: ["{0}"]
 translated vars: []
beers-law-lab-strings_ru.json:
 english vars: ["{0}","{1}"]
 translated vars: []
charges-and-fields-strings_gu.json:
 english vars: ["{0}","{1}"]
 translated vars: []
circuit-construction-kit-common-strings_nb.json:
 english vars: ["{{resistance}}"]
 translated vars: []
circuit-construction-kit-common-strings_ta.json:
 english vars: ["{{resistance}}"]
 translated vars: []
expression-exchange-strings_ko.json:
 english vars: ["{{number}}"]
 translated vars: ["{{수)}}"]
expression-exchange-strings_ko.json:
 english vars: ["{{levelNumber}}"]
 translated vars: []
expression-exchange-strings_zh_CN.json:
 english vars: ["{{number}}"]
 translated vars: ["{{数字}}"]
expression-exchange-strings_zh_CN.json:
 english vars: ["{{levelNumber}}"]
 translated vars: ["{{水平数字}}"]
fluid-pressure-and-flow-strings_gu.json:
 english vars: ["{0}"]
 translated vars: []
fluid-pressure-and-flow-strings_gu.json:
 english vars: ["{0}"]
 translated vars: []
fluid-pressure-and-flow-strings_gu.json:
 english vars: ["{0}"]
 translated vars: []
fluid-pressure-and-flow-strings_gu.json:
 english vars: ["{0}","{1}"]
 translated vars: []
gene-expression-essentials-strings_es_ES.json:
 english vars: ["{{geneID}}"]
 translated vars: []
gene-expression-essentials-strings_fa.json:
 english vars: ["{{geneID}}"]
 translated vars: []
gene-expression-essentials-strings_ja.json:
 english vars: ["{{geneID}}"]
 translated vars: []
gene-expression-essentials-strings_mk.json:
 english vars: ["{{geneID}}"]
 translated vars: []
gene-expression-essentials-strings_ta.json:
 english vars: ["{{geneID}}"]
 translated vars: []
joist-strings_nl.json:
 english vars: ["{0}"]
 translated vars: []
masses-and-springs-strings_ig.json:
 english vars: ["{{equalsZero}}"]
 translated vars: []
masses-and-springs-strings_ig.json:
 english vars: ["{{gravity}}"]
 translated vars: []
masses-and-springs-strings_ig.json:
 english vars: ["{{mass}}"]
 translated vars: []
molarity-strings_fa.json:
 english vars: ["{0}"]
 translated vars: []
molecule-polarity-strings_bs.json:
 english vars: ["{{from}}","{{to}}"]
 translated vars: ["{{od}}","{{do}}"]
molecule-polarity-strings_da.json:
 english vars: ["{{from}}","{{to}}"]
 translated vars: ["{{fra}}","{{til}}"]
molecule-polarity-strings_fa.json:
 english vars: ["{{name}}"]
 translated vars: []
molecule-polarity-strings_ko.json:
 english vars: ["{{name}}"]
 translated vars: ["{{이름}}"]
molecule-polarity-strings_ko.json:
 english vars: ["{{symbol}}","{{name}}"]
 translated vars: ["{{기호}}","{{이름}}"]
molecule-polarity-strings_ko.json:
 english vars: ["{{from}}","{{to}}"]
 translated vars: ["{{에서}}","{{로}}"]
molecule-polarity-strings_vi.json:
 english vars: ["{{from}}","{{to}}"]
 translated vars: ["{{từ}}","{{đến}}"]
molecule-polarity-strings_zh_CN.json:
 english vars: ["{{name}}"]
 translated vars: ["{{名字}}"]
molecule-polarity-strings_zh_CN.json:
 english vars: ["{{symbol}}","{{name}}"]
 translated vars: ["{{标志}}","{{名字}}"]
molecule-polarity-strings_zh_CN.json:
 english vars: ["{{from}}","{{to}}"]
 translated vars: ["{{从}}","{{到}}"]
number-play-strings_ht.json:
 english vars: ["{{language}}"]
 translated vars: ["{{langue}}"]
pendulum-lab-strings_nb.json:
 english vars: ["{{gravity}}"]
 translated vars: ["{{gravitasjon}}"]
pendulum-lab-strings_nb.json:
 english vars: ["{{degrees}}"]
 translated vars: ["{{grader}}"]
pendulum-lab-strings_nb.json:
 english vars: ["{{seconds}}"]
 translated vars: ["{{sekunder}}"]
proportion-playground-strings_eu.json:
 english vars: ["{{price}}"]
 translated vars: ["{{prezioa}}"]
proportion-playground-strings_ko.json:
 english vars: ["{{price}}"]
 translated vars: ["{{가격}}"]
proportion-playground-strings_vi.json:
 english vars: ["{{price}}"]
 translated vars: ["{{giá}}"]
proportion-playground-strings_zh_CN.json:
 english vars: ["{{price}}"]
 translated vars: ["{{单价}}"]
scenery-phet-strings_ht.json:
 english vars: ["{{thing}}"]
 translated vars: ["{{chose}}"]
scenery-phet-strings_ht.json:
 english vars: ["{{thing}}"]
 translated vars: ["{{chose}}"]
scenery-phet-strings_ht.json:
 english vars: ["{{distance}}","{{units}}"]
 translated vars: ["{{distance}}","{{unités}}"]
scenery-phet-strings_tg.json:
 english vars: ["{{distance}}","{{units}}"]
 translated vars: ["{{масофа}}","{{адад}}"]
trig-tour-strings_iw.json:
 english vars: ["{0}","{1}"]
 translated vars: []
trig-tour-strings_iw.json:
 english vars: ["{0}","{1}"]
 translated vars: []
vegas-strings_ht.json:
 english vars: ["{0}"]
 translated vars: []
vegas-strings_mt.json:
 english vars: ["{0}"]
 translated vars: []
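
The difference between the two result sets comes down to the placeholder regex. As a small sketch, the snippet below runs both regexes from the scripts above against a string with non-ASCII placeholder names. The sample value is an assumption based on the variables reported for zh_CN `pattern.dipoleDirection`, and `getTemplatedVars` takes the regex as a second parameter here purely for the comparison. With `[\w\d]+`, which matches only ASCII word characters, nothing is found, which is consistent with the empty `translated vars: []` entries in the first run.

```javascript
const asciiOnlyRegex = /{\{?[\w\d]+\}\}?/g;  // regex from the first script
const anyContentRegex = /{\{?[^{}]+\}\}?/g;  // regex from the second script

// Same logic as getTemplatedVars in the scripts, with the regex passed in.
function getTemplatedVars( string, regex ) {
  const matches = string.match( regex ) || [];

  // Keep only matches whose opening and closing brace counts agree.
  return matches.filter( templateVar => templateVar.match( /{/g ).length === templateVar.match( /}/g ).length );
}

const english = '{{from}} → {{to}}';
const translated = '{{从}} → {{到}}'; // assumed sample value

console.log( getTemplatedVars( english, asciiOnlyRegex ) );     // [ '{{from}}', '{{to}}' ]
console.log( getTemplatedVars( translated, asciiOnlyRegex ) );  // []
console.log( getTemplatedVars( translated, anyContentRegex ) ); // [ '{{从}}', '{{到}}' ]
```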

@zepumph
Member

zepumph commented Jan 4, 2023

QA ran into this for the scenery-phet ht strings (listed above) over in phetsims/friction#315. @oliver-phet what is the current status on this issue? Can we prioritize that fix? Should this be blocking the publication of friction?

@oliver-phet
Contributor

oliver-phet commented Jan 4, 2023

> QA ran into this for the scenery-phet ht strings (listed above) over in phetsims/friction#315. @oliver-phet what is the current status on this issue? Can we prioritize that fix? Should this be blocking the publication of friction?

I don't have any updates on this. I'm still on hold to do any manual fixes until either

I can do the manual fixes whenever, but it seems that until the pattern validation is fixed this issue can continue to occur.
@kathy-phet I think this is your call. Do you want (probably @jbphet @liammulh) to try to fix the validation issue in Rosetta 1.0? Wait until Rosetta 2.0 is published? Or just do manual clean up now (and possibly again)?

@kathy-phet
Author

@oliver-phet - I discussed with JB before break. I think you should do the manual fixes now. There will not likely be very many additional breaks, since we have not figured out how to even reproduce it.
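
For reference, a hand fix of this kind could in principle be scripted. The following is only a sketch under stated assumptions, not what was actually done for this issue. `restorePlaceholders` is a hypothetical helper, and it assumes the translator kept the placeholders in the same order as the English string, which will not hold for every language. When the placeholder counts differ, it returns `null` so that a person can review the string. The English values used in the examples are assumptions based on the variables listed in the results.

```javascript
// Matches {{name}} and {0}-style placeholders with any non-brace content.
const placeholderRegex = /\{\{[^{}]+\}\}|\{[^{}]+\}/g;

// Hypothetical helper: puts the English placeholders back into a translated
// value by position. Returns null when the counts differ (needs human review).
function restorePlaceholders( englishValue, translatedValue ) {
  const englishVars = englishValue.match( placeholderRegex ) || [];
  const translatedVars = translatedValue.match( placeholderRegex ) || [];
  if ( englishVars.length !== translatedVars.length ) {
    return null;
  }
  let index = 0;
  return translatedValue.replace( placeholderRegex, () => englishVars[ index++ ] );
}

console.log( restorePlaceholders( '{{from}} → {{to}}', '{{fra}} → {{til}}' ) );              // {{from}} → {{to}}
console.log( restorePlaceholders( '{{distance}} {{units}}', '{{distance}} {{unités}}' ) );   // {{distance}} {{units}}
console.log( restorePlaceholders( '{{distance}} {{units}}', '{{distance}}' ) );              // null
```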

@zepumph
Member

zepumph commented Jan 23, 2023

@oliver-phet, I see 8fb649d came in. Could you please give an update for this issue? Perhaps it was all covered over in phetsims/rosetta#329 (now closed?)

@oliver-phet
Contributor

Sorry, I forgot there were two issues about this. All the pattern fixes (identified by @jbphet's script output) were made in phetsims/rosetta#329.

If there's something else for me on this issue, let me know!

@oliver-phet oliver-phet assigned zepumph and unassigned oliver-phet Jan 23, 2023
@zepumph
Member

zepumph commented Jan 24, 2023

Awesome! I confirmed my specific bug is fixed on master with phetsims/friction#315, and will be picked up in the next rc/production deploy.

@zepumph zepumph closed this as completed Jan 24, 2023