Address some potential bugs in translations #22

Open
3 of 4 tasks
zepumph opened this issue Apr 13, 2023 · 5 comments
@zepumph
Member

zepumph commented Apr 13, 2023

Over in phetsims/number-play#226, QA reported some spots in translations that looked like obvious mistakes, for example where a parenthesis is opened but not closed, or where a template variable has a third curly brace. We spoke to @jbphet, and he said that manually fixing these is totally fine as long as we feel very confident that it is a translator error.

So we did 98db64e

That raises a couple of more general questions (note that searching in _generated_development_strings covers only the current values of strings, not their history):

  • 2 usages of {{{
  • 2 usages of }}}
  • 86 usages of ".*\([^)\n]*" (match an open paren that isn't closed), though we need to be careful about rtl languages here
  • 24 usages of "[^(\n]*\).*" (match a closed paren without an open), again worry about rtl languages and manually inspect things
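As a rough illustration of the searches listed above, here is a minimal Python sketch. It keeps the triple-brace patterns but swaps the parenthesis regexes for a simple depth counter, which is a different technique from the editor search described in this comment. The `SAMPLE_STRINGS` mapping, its keys, and the function names are hypothetical; the real string files may be shaped differently. Anything it flags is only a candidate for manual inspection, especially for rtl languages.

```python
import re

# Hypothetical sample data: string key -> translated value. The real
# files may be shaped differently; this mapping is only for illustration.
SAMPLE_STRINGS = {
    "title": "Número (total)",
    "hint": "Arrastra {{{count}} objetos",
    "label": "Suma (máximo {{max}}",
    "note": "Listo)",
}

TRIPLE_OPEN = re.compile(r"\{\{\{")
TRIPLE_CLOSE = re.compile(r"\}\}\}")


def paren_problems(text):
    """Return 'unclosed', 'unopened', or None, using a depth counter."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            if depth == 0:
                # A closing paren appeared before any matching open paren.
                return "unopened"
            depth -= 1
    return "unclosed" if depth > 0 else None


def find_suspects(strings):
    """Collect (key, problem) pairs that a person should inspect by hand."""
    suspects = []
    for key, value in strings.items():
        if TRIPLE_OPEN.search(value):
            suspects.append((key, "triple open brace"))
        if TRIPLE_CLOSE.search(value):
            suspects.append((key, "triple close brace"))
        problem = paren_problems(value)
        if problem is not None:
            suspects.append((key, problem + " paren"))
    return suspects


if __name__ == "__main__":
    for key, problem in find_suspects(SAMPLE_STRINGS):
        print(key, "->", problem)
```

The counter reads characters in stored order, so it says nothing about how a string renders; that is one reason the results should be treated as a review list rather than a list of confirmed bugs.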

@jbphet, does this seem like a worthwhile investigation?

@jbphet
Contributor

jbphet commented Apr 13, 2023

> @jbphet, does this seem like a worthwhile investigation?

Potentially, sure. Historically we haven't proactively reviewed and fixed up translations, we've mostly just left this up to the translators and fixed obvious problems if we happened to stumble across them. However, with the Global Initiative up and running, we might consider searching for and fixing problems like those identified above in order to generally improve the quality of our translated offerings.

I'd guess it would take a couple of hours to look these over, make the changes, and trigger the rebuilds. And of course there's always a risk that something would come up and it would take longer, or that we accidentally undo something that was intended by a translator.

It's not my call to make as to whether this is sufficiently important to assign to someone. @zepumph - I'll hand this back to you to see if you roughly agree with my estimate, then I think you should assign to @kathy-phet and @RVieyra to get their input on whether this should be considered for our next sprint. Also tagging @liammulh so that he is aware that this is being discussed.

@jbphet jbphet assigned zepumph and unassigned jbphet Apr 13, 2023
@zepumph
Member Author

zepumph commented Apr 13, 2023

Yeah, I would love it if @RVieyra could chime in a bit about what the global team thinks about taking some time to improve translations like this.

To catch up, we have always known there are likely mistakes in translations, but we haven't spent any time fixing them. There are cases where a mistake is obvious enough that we may just want to manually fix it for the translation. For example from phetsims/number-play#226:

Likely wanting a closing parenthesis:
[Screenshot 2023-04-03 at 9 10 39 PM]

Closing parenthesis replaced with a curly bracket:
[Screenshot 2023-04-08 at 3 25 44 PM]

The first comment lists a few ways to find these sorts of problems in the sim translation data.

@RVieyra do you think of this as a priority, and something you would like us to spend some time on?

@zepumph zepumph assigned RVieyra and unassigned zepumph Apr 13, 2023
@RVieyra

RVieyra commented Apr 16, 2023

Two questions (and bringing in @solaolateju , who is more directly involved with hearing out translator issues):

(1) Is this parenthesis/curly bracket issue exclusively a matter of punctuation? I want to verify -- is there no issue with the actual translations?

(2) How frequent are these issues across PhET's sims?

@zepumph zepumph self-assigned this Aug 18, 2023
@zepumph
Member Author

zepumph commented Aug 18, 2023

Hello! I just saw this again while working on phetsims/build-a-nucleus#76. Sorry about that. Please assign me if you want my input, because I'm really bad at keeping track of things otherwise. @solaolateju and I met today and were able to discuss this issue and fix a couple of the problems (mostly as an investigation of the more general issue).

(1) Yes, the translations are working well, just displaying these punctuation problems.
(2) @solaolateju reviewed 3 of the 4 checkboxes in the first comment, and found that 14 spots could be bug-fixed. The last checkbox has 86 items to check on, and my guess is that about 20 or so could be fixed. That isn't really the question though, because these are just some of the potential bugs that may be occurring. It certainly isn't a complete list. @solaolateju and I discussed this and feel that it is important for devs to let the global team know when things like this come up, and also that translation quality could generally be improved by having @solaolateju lead some more manual inspections/audits of translations. That said, we don't necessarily recommend doing this eagerly or completely; we are just noting that it would likely be part of the solution.

We also discussed having rosetta automatically notice some of the above problems, but most likely it wouldn't be too helpful for parentheses, as there are as many false positives as bugs.

From here, I will go through the last bullet, and bring @solaolateju a list of changes to review, as well as a list of unsure-if-they-are-problems to have him confirm with translators (if applicable).

A couple more potential ways to check for problems:

  • (0 cases) "[^\n]*[^\n{]\{[^{\n][^\n}]*" - single open curly brace with no closed one
  • (1 case) "[^\n{]*[^\n}]\}[^}\n][^\n]*" - single close curly brace with no open
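A possible alternative to the two single-brace patterns above, sketched in Python: remove every well-formed placeholder first, then report any curly brace that is left over. This is not the search used in this thread, and it rests on a loud assumption that the only legitimate placeholder forms are `{{name}}` and `{0}`; the function name `stray_braces` is hypothetical.

```python
import re

# Assumption: well-formed placeholders look like {{name}} or {0}.
# Any other placeholder syntax would need its own pattern here.
WELL_FORMED = re.compile(r"\{\{\w+\}\}|\{\d+\}")


def stray_braces(text):
    """Return the braces left over once well-formed placeholders are removed."""
    remainder = WELL_FORMED.sub("", text)
    return [ch for ch in remainder if ch in "{}"]


if __name__ == "__main__":
    samples = [
        "Total: {{count}}",     # well formed
        "Total: {{{count}}",    # third open brace
        "Sum (max {{max}}}",    # close paren replaced by a brace
    ]
    for sample in samples:
        print(repr(sample), stray_braces(sample))
```

Because the well-formed placeholders are stripped before anything is reported, a string with several correct placeholders and one stray brace is still flagged, which a whole-line pattern might miss.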

@zepumph
Member Author

zepumph commented Aug 31, 2023

I went through and looked at the remaining cases. Most seemed buggy, but I have no idea how all other languages treat parentheses, so I wouldn't want to just make these changes myself. Over to @solaolateju.

  • (1 case) "[^\n{]*[^\n}]\}[^}\n][^\n]*" - single close curly brace with no open

  • 86 usages of ".*\([^)\n]*" (match an open paren that isn't closed), though we need to be careful about rtl languages here

@zepumph zepumph assigned solaolateju and unassigned zepumph Aug 31, 2023
4 participants