Showing text diff and old translation (tagged as needs revision) for changes made in synchronized translation #502

Open
jnagler-git opened this issue Jan 7, 2022 · 3 comments

Comments

@jnagler-git

Current behaviour: when a field (e.g. a paragraph) is changed in the default language, the old translation of that field is deleted after synchronization, and the new text gives no hint of what was changed. The field therefore has to be translated again from scratch.

It would be nice to have a diff feature that keeps the old translation instead of deleting it and shows it together with a text diff of the default-language source. Often only a few words or a single sentence change in a paragraph, so this would save translators a lot of time.

So as not to just make a request, I looked around a bit and found that this might be done with Google's diff_match_patch library, along these lines:

import diff_match_patch
dmp = diff_match_patch.diff_match_patch()
diffs = dmp.diff_main('This is my original paragraph.', 'This is my revised paragraph with new words.')
# semantic cleanup seems important
dmp.diff_cleanupSemantic(diffs)
print(diffs)

Output: [(0, 'This is my '), (-1, 'original'), (1, 'revised'), (0, ' paragraph'), (1, ' with new words'), (0, '.')]

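This could then be rendered inline as "This is my original revised paragraph with new words.", with "original" marked as deleted (e.g. struck through) and "revised" and " with new words" marked as inserted. Keeping the old translation available next to this diff makes it much easier to revise only what changed instead of translating from scratch.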
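As a rough illustration of that rendering step, here is a minimal sketch that turns the diff tuples from the output above into HTML with `<del>` and `<ins>` tags. The function name `render_diff_html` is hypothetical and not part of wagtail-localize or diff_match_patch; it only assumes the `(op, text)` tuple format shown above, where -1 is a deletion, 1 an insertion, and 0 unchanged text.

```python
from html import escape

# Operation codes as they appear in the diff output quoted above.
DELETE, EQUAL, INSERT = -1, 0, 1


def render_diff_html(diffs):
    """Render (op, text) diff tuples as HTML (hypothetical helper)."""
    parts = []
    for op, text in diffs:
        safe = escape(text)  # avoid injecting markup from the source text
        if op == DELETE:
            parts.append(f"<del>{safe}</del>")
        elif op == INSERT:
            parts.append(f"<ins>{safe}</ins>")
        else:
            parts.append(safe)
    return "".join(parts)


# The tuples below are copied from the output quoted above.
diffs = [
    (0, "This is my "),
    (-1, "original"),
    (1, "revised"),
    (0, " paragraph"),
    (1, " with new words"),
    (0, "."),
]
print(render_diff_html(diffs))
# This is my <del>original</del><ins>revised</ins> paragraph<ins> with new words</ins>.
```

The diff_match_patch library also ships its own `diff_prettyHtml` method, which may be enough if its default styling is acceptable.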

Would be wonderful to have this feature considered for wagtail-localize.

@saevarom

I've had this issue too, and it is quite annoying. For example, when someone just inserts a line break in the original text, the original text block becomes two entirely new text blocks, each with its own content hash. The translation of the original text is lost to the end user, even though it still exists in the database. To confirm this, download a .po file: the original text is still in there, but marked as an obsolete message.

I would propose giving the end user access to "obsolete" messages, so that they are only a couple of clicks away from restoring the original translation.
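To make the "obsolete message" idea concrete: in the PO format, obsolete entries are kept as comment lines prefixed with `#~`. The following is a minimal sketch, not how wagtail-localize handles this internally, that pulls such entries out of PO text. The helper name `obsolete_entries` and the sample strings (including the German translation) are made up for illustration, and the parser assumes single-line `msgid`/`msgstr` values without escaped quotes.

```python
import re

# Matches one obsolete line such as: #~ msgid "Hello"
_OBSOLETE_LINE = re.compile(r'^#~\s*(msgid|msgstr)\s+"(.*)"\s*$')


def obsolete_entries(po_text):
    """Return (msgid, msgstr) pairs for obsolete entries (hypothetical helper).

    Assumes single-line msgid/msgstr values without escaped quotes.
    """
    entries = []
    current_id = None
    for line in po_text.splitlines():
        match = _OBSOLETE_LINE.match(line.strip())
        if not match:
            continue  # not an obsolete msgid/msgstr line
        keyword, value = match.groups()
        if keyword == "msgid":
            current_id = value
        elif current_id is not None:
            entries.append((current_id, value))
            current_id = None
    return entries


# Made-up sample: one active, untranslated entry and one obsolete entry.
SAMPLE_PO = '''
msgid "This is my revised paragraph with new words."
msgstr ""

#~ msgid "This is my original paragraph."
#~ msgstr "Dies ist mein ursprünglicher Absatz."
'''

for source, translation in obsolete_entries(SAMPLE_PO):
    print(f"{source!r} -> {translation!r}")
```

For real files, a dedicated PO parser would be a safer choice than a regular expression, since it handles multi-line strings and escapes.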

I have a client with almost 400 pages, all of them translated to another language, many of them with a lot of text. It is quite a pain to lose translations just because someone introduced a linebreak in the original text!

Any thoughts here @zerolab ?

@th3hamm0r
Contributor

FYI: there is also a related discussion in #394.
It's really a pain to always have to back up the original translated content. It is very easy to lose content, even if it is probably still in the database somewhere...

@zerolab
Collaborator

zerolab commented Nov 16, 2022

I agree this is annoying. Unfortunately I do not have much capacity in the next month or so. Any PRs fixing this are most welcome ;)
