Object dump import patch cleaning tool #119

AaronVanGeffen · 2020-07-10T18:52:59Z

The object string dumper and loader scripts introduced in #27 have greatly simplified the workflow for object translations. However, the barrier to entry is still too high for most translators. Recent work by @tupaschoal aims to automate the workflow, bridging the Objects and Localisation repositories. The holdup has been that Python's JSON serialisation produces output that differs from the formatting already present (#93). Rather than writing our own encoder, I propose the following solution.

I have written a simple Python class, PatchCleaner, that uses the unidiff package to discard the patch changes that arise purely from the formatting differences mentioned above. In short, it only picks up the changes we need. I have also added a heuristic for cases where a trailing comma was added to a line belonging to an irrelevant language. This needs some field testing, but in my brief testing it has worked well.
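To illustrate the trailing-comma heuristic, here is a minimal sketch on plain strings. It is not the PatchCleaner implementation, which works on unidiff hunks. It assumes that string lines look like `"nl-NL": "Some text",` and that a comma-only change on another language's line exists to accommodate a newly appended entry. All function names here are hypothetical, and lines are passed without their leading `+` or `-` diff markers.

```python
import re
from typing import Optional

# Assumed line shape inside an object's strings table:   "nl-NL": "Some text",
LANGUAGE_LINE = re.compile(r'^\s*"([a-z]{2}-[A-Z]{2})"\s*:')


def language_of(line: str) -> Optional[str]:
    """Return the language code a line starts with, or None if it has none."""
    match = LANGUAGE_LINE.match(line)
    return match.group(1) if match else None


def is_comma_only_change(removed: str, added: str) -> bool:
    """True when the new line is the old line plus a trailing comma."""
    return added.rstrip() == removed.rstrip() + ","


def keep_change(removed: Optional[str], added: str, target_language: str) -> bool:
    """Decide whether a changed line belongs in the cleaned patch.

    `removed` is None for a pure addition.
    """
    # Changes to the language being translated are always kept.
    if language_of(added) == target_language:
        return True
    # Hypothetical heuristic: a line for another language that only gained
    # a trailing comma is kept, since a new entry was likely appended after it.
    return removed is not None and is_comma_only_change(removed, added)
```

With nl-NL as the target, a new `"nl-NL"` line and a comma appended to the preceding `"fr-FR"` line are both kept, while a whitespace-only rewrite of the `"fr-FR"` line is dropped.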

The workflow is still suboptimal, and the script doesn't yet take the language and patch file as arguments:

./language_dump.py -l nl-NL -d nl-NL_dump.json
$EDITOR nl-NL_dump.json
./language_load.py -l nl-NL -i nl-NL_dump.json
git diff > patch.diff
git stash
./language_clean_patch.py | git apply    # no arguments yet, sorry
git commit -m 'Updated object strings for nl-NL'

Again, this removes any need for cherry-picking the diff manually. This cherry-picking gets particularly cumbersome for rides and large scenery objects. Indeed, with this script, this workflow can now be completely automated!

As an aside: personally, I like to order the dump files before working on translations. We can avoid complicating language_dump.py by using jq to this end:

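# Write to a temporary file first; redirecting straight back to the
# input would truncate it before jq reads it.
jq 'to_entries | sort_by(.value."reference-name") | from_entries' \
    < nl-NL_dump.json \
    > nl-NL_dump.sorted.json \
    && mv nl-NL_dump.sorted.json nl-NL_dump.json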
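
Where jq is not available, the same ordering can be sketched in Python. This assumes the dump is a JSON object whose values each carry a "reference-name" key, as the jq filter implies. The function names and the 4-space indentation are assumptions rather than part of the existing scripts, so the output formatting may differ from what language_dump.py writes.

```python
import json


def sort_dump_by_reference_name(dump):
    """Return a copy of the dump ordered by each entry's "reference-name"."""
    # sorted() is stable, so entries with equal reference names keep their order.
    return dict(sorted(dump.items(), key=lambda item: item[1]["reference-name"]))


def sort_dump_file(path):
    """Sort a dump file in place; the file is read fully before it is rewritten."""
    with open(path, encoding="utf-8") as handle:
        dump = json.load(handle)
    with open(path, "w", encoding="utf-8") as handle:
        # indent=4 is an assumed style, not necessarily the dump's own.
        json.dump(sort_dump_by_reference_name(dump), handle,
                  ensure_ascii=False, indent=4)
        handle.write("\n")
```

Calling `sort_dump_file("nl-NL_dump.json")` would reorder the dump before editing; because the file is loaded completely before being reopened for writing, it does not hit the truncation problem of redirecting to the input file.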

tupaschoal

This looks like a very good start, easy enough to follow the code and, from a quick glance, might actually do the trick. Nice job!

Some notes:

I've added it to the project for proper tracking
It doesn't seem hard at all to make this support target language, target file and input file

language_clean_patch.py

tupaschoal

Looks great to me

tupaschoal · 2020-07-20T00:50:02Z

language_clean_patch.py

+SUPPORTED_LANGUAGES = ["ar-EG", "ca-ES", "cs-CZ", "da-DK", "de-DE", "en-GB", "en-US", "es-ES",\
+                       "fi-FI", "fr-FR", "hu-HU", "it-IT", "ja-JP", "ko-KR", "nb-NO", "nl-NL",\
+                       "pl-PL", "pt-BR", "ru-RU", "sv-SE", "tr-TR", "zh-CN", "zh-TW"]


In the future we should probably have all scripts source the languages from a single place.

AaronVanGeffen · 2020-07-20

Yes, definitely. I was also thinking we should probably move the language*.py tools into a Python package of some sort. Another item for the to-do list…
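
One way the single-source idea could look is sketched below. It assumes a hypothetical shared module that each language*.py script imports; the helper names and the `--language` long option are illustrative only. The `-l` flag and the language list are taken from the thread above.

```python
import argparse

# Copied from the diff under review; in this sketch it would live in one
# shared module instead of being repeated in every script.
SUPPORTED_LANGUAGES = [
    "ar-EG", "ca-ES", "cs-CZ", "da-DK", "de-DE", "en-GB", "en-US", "es-ES",
    "fi-FI", "fr-FR", "hu-HU", "it-IT", "ja-JP", "ko-KR", "nb-NO", "nl-NL",
    "pl-PL", "pt-BR", "ru-RU", "sv-SE", "tr-TR", "zh-CN", "zh-TW",
]


def add_language_argument(parser: argparse.ArgumentParser) -> None:
    """Attach the shared -l/--language option to a script's parser."""
    parser.add_argument(
        "-l", "--language",
        choices=SUPPORTED_LANGUAGES,
        required=True,
        help="target language code, e.g. nl-NL",
    )


def build_parser(description: str) -> argparse.ArgumentParser:
    """Create a parser that already knows the supported languages."""
    parser = argparse.ArgumentParser(description=description)
    add_language_argument(parser)
    return parser
```

Each script would then call `build_parser(...)` and add its own options, so adding a language means editing one list.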

AaronVanGeffen added 2 commits July 10, 2020 20:40

Initial version of the language patch cleaner script.

ac89e8f

Rework initial cleaning function into a class.

27e5639

tupaschoal reviewed Jul 14, 2020

AaronVanGeffen requested a review from tupaschoal July 19, 2020 13:51

AaronVanGeffen added 2 commits July 19, 2020 15:52

Split logic off to is_accommodating_change.

93993d8

Add support for command-line arguments.

d503251

AaronVanGeffen force-pushed the language-clean-patch branch from d59dda0 to d503251 Compare July 19, 2020 13:56

tupaschoal approved these changes Jul 20, 2020

AaronVanGeffen merged commit e991be9 into master Jul 20, 2020

tupaschoal deleted the language-clean-patch branch July 20, 2020 21:33

tupaschoal mentioned this pull request Jul 20, 2020

Improve language_load.py JSON encoder #93

Closed

AaronVanGeffen mentioned this pull request Oct 16, 2020

Tool for finding missing translations #128

Open
