Sync i18n without merge #23706
Merged
Also going to spot-check the course 1-4 translations before merging.
tanyaparker
approved these changes
Jul 16, 2018
Context
We want to ensure that when a source string is updated, we don't lose translation data for existing strings. Since most source string changes are minor tweaks like typo fixes or curriculum adjustments, this lets us make those changes without a cost in translation saturation.
However! We are unfortunately doing this work twice: once as part of the sync up to Crowdin, which we do with the `update_as_unapproved` option (which just switches the translations from "approved" to "unapproved" when we update the source string, but doesn't actually discard them), and again during the sync out from Crowdin, where we use merge-translation.rb, in which we decline to update a given string if the translation data received from Crowdin is identical to the English source string.
Unfortunately, what this means is that when we update English source strings, any languages for which that string has not been translated will retain the old source string. This leads to quite a bit of badness, some of which is significant:
Unfortunately, there are also a handful of places where we have translation data that only exists in these files:
Which means that by merging this, we will indeed lose some translation data.
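To make the tradeoff concrete, here is a minimal sketch of the two behaviors described above. It is not the actual contents of merge-translation.rb: the method names, arguments, and sample strings are all hypothetical, and it assumes that Crowdin returns the current English source text for any string that has no translation.

```ruby
# Hypothetical illustration only; not the real merge-translation.rb.

# Old behavior (with merge): if the text received from Crowdin is identical
# to the current English source, keep whatever is already in the local file.
def merge_translation(english, existing, incoming)
  return existing if incoming == english && !existing.nil?
  incoming
end

# New behavior (without merge): always take what Crowdin returns.
def sync_translation(_english, _existing, incoming)
  incoming
end

old_english = 'Clik run'   # made-up source string with a typo
new_english = 'Click run'  # the corrected source string

# Case 1: untranslated locale. The local file still holds the old English
# text, and Crowdin (by assumption) returns the new English source.
stale = merge_translation(new_english, old_english, new_english)
fresh = sync_translation(new_english, old_english, new_english)

# Case 2: a translation that exists only in the local file, never in Crowdin.
local_only = 'Fai clic su Esegui' # made-up example value
kept = merge_translation(new_english, local_only, new_english)
lost = sync_translation(new_english, local_only, new_english)
```

In this sketch, `stale` keeps the outdated English text while `fresh` picks up the fix, which is the problem this PR addresses. `kept` versus `lost` shows the cost: a translation that lives only in the local file survives the merge but is overwritten by the plain sync.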
Process
These changes were originally generated by running a sync with the following code change:
Changes were then added in commits that break everything up by content type. When reviewing this (massive) PR, I recommend checking commit-by-commit. The Instructions and Markdown Instructions commits are likely going to be too big to even view in GitHub, but the rest of them should give you at least an overview of what kind of changes are going to happen.
I manually reviewed changes in a handful of files and documented my findings here. I was able to identify a couple of instances in which we loaded translations directly into the project, and we're working on making sure all the data from those changes gets into Crowdin.
At Tanya's suggestion, I also specifically examined the Minecraft tutorials and /learn in our top ten languages. I found a total of two translations (one in Italian, one in German) which would be lost by these changes, but no more (also added those strings to Crowdin).
Conclusion
This PR is going to fix a lot of bad English strings in other locales throughout our system, and also cause us to lose some translation data. But the balance of good to bad is strongly in favor of good, and with our spot-checks of critical/risky content, I'm convinced that we are unlikely to lose any high-profile translation data.
Next steps