Colorization of current vs. previous msgid diff for fuzzy translations #230

Open
sebras opened this issue Dec 16, 2015 · 10 comments

@sebras

sebras commented Dec 16, 2015

These are two wishlist items.

I'm currently using poedit 1.8.6 for doing translations into Swedish. For fuzzy translations you can currently see the dequoted string of the current msgid in the source text field, while the old msgid is shown in the sidebar as a quoted string.

First of all, I'd prefer the old msgid to also be a dequoted string (without the preceding msgid and " " for each line).

Secondly, attempting to figure out what part of a long msgid has changed is currently quite difficult. Having some form of color coding for the parts that changed in a fuzzy string would be useful. I'm thinking of something along the lines of red for removed parts (in the sidebar) and green for added parts (in the source text field), similar to the deep green and deep red in the micro version number in this commit here on GitHub. It is quite likely I'm asking you to implement diff(1) with coloring, which might be a bit difficult. I guess if the differences are too big then poedit could just claim that the entire old msgid was removed and that the entire current msgid was the replacement. It seems like the current implementation uses wxControl::SetText(), while wxControl::SetLabelMarkup() could probably be used to achieve the coloring. Do you think this idea is feasible, and is it something you would consider?
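
To make the "differences are too big" fallback concrete, here is a minimal sketch in C++. It is not Poedit code: the function names, the similarity measure (2 × LCS length divided by the combined length) and the 0.5 default threshold are all assumptions chosen for illustration. It also compares bytes, so real UTF-8 msgids would need to be decoded to code points first.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Length of the longest common subsequence of a and b (two-row DP table).
std::size_t lcs_length(const std::string& a, const std::string& b)
{
    std::vector<std::size_t> prev(b.size() + 1, 0), cur(b.size() + 1, 0);
    for (std::size_t i = 1; i <= a.size(); ++i)
    {
        for (std::size_t j = 1; j <= b.size(); ++j)
        {
            cur[j] = (a[i - 1] == b[j - 1])
                         ? prev[j - 1] + 1
                         : std::max(prev[j], cur[j - 1]);
        }
        std::swap(prev, cur);  // the row just filled becomes the previous row
    }
    return prev[b.size()];
}

// Similarity in [0, 1]; two empty strings count as identical.
double similarity(const std::string& a, const std::string& b)
{
    const std::size_t total = a.size() + b.size();
    if (total == 0)
        return 1.0;
    return 2.0 * static_cast<double>(lcs_length(a, b)) / static_cast<double>(total);
}

// Hypothetical decision helper: show a fine-grained diff only when the two
// msgids are similar enough; otherwise treat the change as a full replacement.
// The default threshold is an arbitrary illustrative value.
bool should_show_inline_diff(const std::string& old_msgid,
                             const std::string& new_msgid,
                             double threshold = 0.5)
{
    return similarity(old_msgid, new_msgid) >= threshold;
}
```

With this measure, a one-letter change in a sentence stays far above the threshold, while two unrelated strings score close to zero and would be shown as "all removed, all added".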

poedit is an excellent tool and it has certainly helped me a lot when doing translations!
Thanks for the effort you put into the project! Even if you may decide to skip this wishlist item. :)

@vslavik
Owner

vslavik commented Dec 17, 2015

These are two wishlist items.

[ Please don't mix unrelated things in a single issue — you’re making it impossible to handle it sanely (when do you close such an issue? with what commit?) ]

Secondly, attempting to figure out what part of a long msgid has changed is currently quite difficult.

Your problem is not the lack of msgid diffing in Poedit. Your problem is that you have too long msgids, which isn’t normal for software localization — I suspect gettext misuse for something like freeform text, am I right?

You’re not the only one asking about something like this, so I’m leaving this open. But honestly, it’s a rather obscure requirement that doesn’t normally arise in localization. I’m not likely to address it in the immediate (or even not so immediate) future. A PR (based e.g. on https://code.google.com/p/google-diff-match-patch/) would go a long way.

@sebras
Author

sebras commented Dec 17, 2015

[That's a fair comment, I'll separate out the dequoting into a separate issue. Sorry for the inconvenience. :)]

Your problem is that you have too long msgids, which isn’t normal for software localization — I suspect
gettext misuse for something like freeform text, am I right?

Well, you are right of course. It is used for translations such as the GNOME documentation, where each paragraph from the online help text is its own string. Whether that is considered misuse or not can be debated, but I at least want to explain the case where I see a use for a feature like this. Oh, and I do agree that I seldom need this feature for normal software translation!

Alright, I'll dust off my C++ knowledge and look into the google-diff implementation to see if I can manage to build something resembling the feature I'm asking for. If so, I'll definitely come back with a PR.

@lanurmi
Contributor

lanurmi commented Jan 11, 2016

Your problem is not the lack of msgid diffing in Poedit. Your problem is that you have too long msgids, which isn’t normal for software localization — I suspect gettext misuse for something like freeform text, am I right?

I don't know what you consider a "too long" msgid, but it can be hard to spot differences even in single-line messages if the difference is only about changing one letter, or whitespace.

As for misusing gettext, that is something the translator often cannot do much about, not even in open source projects in practice. So a translation tool should adapt to the reality, even when the reality is not ideal.

vslavik changed the title from "Colorization of current/new msgid for fuzzy translations" to "Colorization of current vs. previous msgid diff for fuzzy translations" on Nov 4, 2016
vslavik added a commit that referenced this issue Nov 4, 2016
Always use the --previous flag when updating catalogs, to preserve the
msgid before msgmerge marked a translation as fuzzy. This is often
useful in identifying what changes need to be done.

A very small first step towards #230.
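
For context, msgmerge's --previous option keeps the old msgid in the PO file as comment lines starting with `#| `. The following is a rough C++ sketch of how such lines could be turned back into a dequoted string. It is not taken from Poedit: the helper names are hypothetical, only a few common escapes are handled, and `#| msgctxt` / `#| msgid_plural` are simply skipped.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Undo a few C-style escapes found in PO strings (newline, tab, quote,
// backslash). Other escapes are passed through as the escaped character.
std::string unescape_po(const std::string& s)
{
    std::string out;
    for (std::size_t i = 0; i < s.size(); ++i)
    {
        if (s[i] == '\\' && i + 1 < s.size())
        {
            const char c = s[++i];
            switch (c)
            {
                case 'n': out += '\n'; break;
                case 't': out += '\t'; break;
                default:  out += c;    break;  // escaped quote or backslash
            }
        }
        else
        {
            out += s[i];
        }
    }
    return out;
}

// Text between the first and the last double quote, or "" if there is none.
std::string quoted_part(const std::string& line)
{
    const std::size_t first = line.find('"');
    const std::size_t last = line.rfind('"');
    if (first == std::string::npos || last <= first)
        return "";
    return line.substr(first + 1, last - first - 1);
}

// Collect the dequoted previous msgid from the lines of a single PO entry.
std::string previous_msgid(const std::vector<std::string>& entry_lines)
{
    std::string result;
    bool in_prev_msgid = false;
    for (const std::string& line : entry_lines)
    {
        if (line.compare(0, 3, "#| ") != 0)
        {
            in_prev_msgid = false;  // not a "previous" comment line
            continue;
        }
        const std::string rest = line.substr(3);
        if (rest.compare(0, 6, "msgid ") == 0)
            in_prev_msgid = true;   // start of the previous msgid
        else if (rest.empty() || rest[0] != '"')
            in_prev_msgid = false;  // some other keyword, e.g. msgctxt
        if (in_prev_msgid)
            result += unescape_po(quoted_part(rest));
    }
    return result;
}
```

Given an entry whose previous msgid spans several `#| "..."` continuation lines, `previous_msgid` returns the concatenated text without the keyword or the surrounding quotes.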
Repository owner deleted a comment from osamuaoki Aug 28, 2018
@Fat-Zer

Fat-Zer commented Feb 19, 2021

Hi, I've looked into this one for a bit.

As far as I can see, to implement this we need two things:

  • A diff algorithm implementation.
  • A widget to display color-coded text.

A diff algorithm implementation.

I've managed to find a couple of suitable diff implementations in the form of C++ header-only libraries:

  • cubicdaiya/dtl
    • License: BSD3
    • Well-documented with examples etc
    • Active upstream (at least it has some reasonably recent commits)
  • leea/diff-cpp
    • License: MIT
    • The repository has been inactive for 7 years
    • Somewhat sloppy code, e.g. everything is in the global namespace

The google/diff-match-patch mentioned above is off the table, since its only C++ implementation is bound to Qt.

Also, of course, we could provide our own implementation: the Myers LCS is not that complicated.
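
To give an idea of the size of a hand-rolled implementation, here is a minimal sketch of a character-level diff. Note that it uses the textbook O(n·m) LCS table rather than Myers' O(ND) algorithm, which is faster but longer to write down. The `Op`, `Chunk` and `diff_chars` names are made up for this example and do not come from dtl, diff-cpp or Poedit; the code compares bytes, not Unicode code points.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

enum class Op { Equal, Delete, Insert };

struct Chunk
{
    Op op;
    std::string text;
};

// Character-level diff of a (old) against b (new) using a full LCS table.
std::vector<Chunk> diff_chars(const std::string& a, const std::string& b)
{
    const std::size_t n = a.size(), m = b.size();

    // lcs[i][j] = LCS length of the suffixes a[i..] and b[j..].
    std::vector<std::vector<std::size_t>> lcs(
        n + 1, std::vector<std::size_t>(m + 1, 0));
    for (std::size_t i = n; i-- > 0;)
        for (std::size_t j = m; j-- > 0;)
            lcs[i][j] = (a[i] == b[j])
                            ? lcs[i + 1][j + 1] + 1
                            : std::max(lcs[i + 1][j], lcs[i][j + 1]);

    std::vector<Chunk> out;
    // Append one character, merging it into the last chunk if the op matches.
    auto emit = [&out](Op op, char c) {
        if (!out.empty() && out.back().op == op)
            out.back().text += c;
        else
            out.push_back({op, std::string(1, c)});
    };

    // Walk the table from the start, preferring deletions on ties.
    std::size_t i = 0, j = 0;
    while (i < n && j < m)
    {
        if (a[i] == b[j])
        {
            emit(Op::Equal, a[i]);
            ++i;
            ++j;
        }
        else if (lcs[i + 1][j] >= lcs[i][j + 1])
        {
            emit(Op::Delete, a[i]);
            ++i;
        }
        else
        {
            emit(Op::Insert, b[j]);
            ++j;
        }
    }
    while (i < n) emit(Op::Delete, a[i++]);
    while (j < m) emit(Op::Insert, b[j++]);
    return out;
}
```

The table needs (n+1)·(m+1) entries, which is fine for typical msgids but is one reason a library such as dtl may be preferable for paragraph-sized strings.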

A widget to display color-coded text.

For now, poedit uses a wxStaticText-derived widget, which doesn't support colouring. I suppose we could replace it with a read-only wxTextCtrl.


So, some questions:

  • Would it be ok to use a BSD3-licensed dependency in poedit?
  • Would you prefer to add a submodule for a header-only library to the dependencies, or just copy it into the deps dir?
  • Are you ok with reimplementing the diff algorithm?
  • Would it be ok to use the text edit area instead of a label in the GUI?

@vslavik
Owner

vslavik commented Feb 20, 2021

As far as I can see, to implement this we need two things:

You're forgetting "just" the decision logic @sebras discusses and gluing it all together in the UI, but even "only" basic UI (e.g. not coloring the source text and showing the diff just in the previous msgid part, without worrying about the threshold) would be a great and much appreciated contribution.

Would it be ok to use a BSD3-licensed dependency in poedit?
Are you ok with reimplementing the diff algorithm?

It's compatible, so yes. dtl also has the added benefit of being packaged by Debian and thus Ubuntu and seems a good choice.

Would you prefer to add a submodule for a header-only library to the dependencies, or just copy it into the deps dir?

Definitely a submodule.

Would it be ok to use the text edit area instead of a label in the GUI?
For now, poedit uses a wxStaticText-derived widget, which doesn't support colouring.

wxStaticText does support rich text, see SetLabelMarkup(). On Windows, it is available through wxGenericStaticText, natively elsewhere. Using that seems both better (the control shows up as static text in a11y) and simpler to change to me.
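
As an illustration of what would be handed to SetLabelMarkup(), the sketch below builds a markup string for the previous msgid from a list of diff chunks. The `Op`/`Chunk` types and both function names are hypothetical, and the `<span foreground="...">` tag and the five escaped entities reflect my understanding of the wxWidgets markup syntax, so check them against the wxWidgets documentation before relying on this. The code itself is plain C++ with no wxWidgets dependency.

```cpp
#include <string>
#include <vector>

enum class Op { Equal, Delete, Insert };

struct Chunk
{
    Op op;
    std::string text;
};

// Escape characters that would otherwise be parsed as markup.
std::string escape_markup(const std::string& s)
{
    std::string out;
    for (char c : s)
    {
        switch (c)
        {
            case '&':  out += "&amp;";  break;
            case '<':  out += "&lt;";   break;
            case '>':  out += "&gt;";   break;
            case '"':  out += "&quot;"; break;
            case '\'': out += "&apos;"; break;
            default:   out += c;        break;
        }
    }
    return out;
}

// Markup for the previous msgid: unchanged text as-is, removed text in red.
std::string previous_msgid_markup(const std::vector<Chunk>& chunks)
{
    std::string out;
    for (const Chunk& c : chunks)
    {
        if (c.op == Op::Equal)
            out += escape_markup(c.text);
        else if (c.op == Op::Delete)
            out += "<span foreground=\"red\">" + escape_markup(c.text) + "</span>";
        // Op::Insert text belongs to the current msgid, so it is skipped here.
    }
    return out;
}
```

The resulting string would then be converted to a wxString and passed to the label's SetLabelMarkup(); a mirror-image function could mark inserted text in green for the source text field.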

Fat-Zer added a commit to Fat-Zer/poedit that referenced this issue Feb 21, 2021
@Fat-Zer

Fat-Zer commented Feb 21, 2021

OK, getting a diff in markup form and stuffing it into a wxStaticText was the easy part...

The hard part is to get it displayed correctly. The god damn label (at least with wxGTK) wraps lines by itself fine enough, but it doesn't provide any freaking way to query how much space it needs in pixels. All the Get*Size() calls just return some bogus wish-to-be sizes instead of the actual one...

Is that the reason you used manual line wrapping in the first place? Is there any sane way to do it without hacking the wrapping code to support markup? I would probably need some help with it... I don't have enough experience with wxWidgets and I'm already starting to hate it...


The current code can be found on the fork, branch named diff : https://github.com/Fat-Zer/poedit/commits/diff

And a teaser:

[Screenshot: scr34]

@vslavik
Owner

vslavik commented Feb 25, 2021

The current code can be found on the fork, branch named diff : https://github.com/Fat-Zer/poedit/commits/diff

Could you make this into a PR (marked as draft), so that we could move the discussion to it and so that I could commit to it too?

Is that the reason you used manual line wrapping in the first place?

As usual, reasons exist, yes. Specifically, that would be Windows native controls that don't have automatic wrapping.

All the Get*Size() calls just return some bogus wish-to-be sizes instead of the actual one…

You're probably looking at the sizes at a time when the window itself doesn't have a meaningful size (e.g. before showing it).

The auto-wrapping control, which Poedit has, handles this, but doesn't have support for markup yet, that's true. Perhaps feel free to ignore it, and I'll add this part?

@Fat-Zer

Fat-Zer commented Feb 25, 2021

Done, see PR #687.

It would be better if I left the interaction with wxWidgets internals to you, because I clearly have much less experience with it and am struggling to do some basic things... I would definitely mess it up.

By the way, the docs say that Windows doesn't support multiline markup text, so it will probably be problematic.

I may implement some of the other useful stuff I mentioned in the PR once this one is sorted out.

@vslavik
Owner

vslavik commented Feb 26, 2021

By the way, the docs say that Windows doesn't support multiline markup text, so it will probably be problematic.

Oh crap. Sounds like you were right all along and only a read-only wxTextCtrl is realistic on Windows.

@vslavik
Owner

vslavik commented Feb 23, 2023

Sounds like you were right all along and only a read-only wxTextCtrl is realistic on Windows.

That would also help with #783 by making it possible to have these sections of the sidebar individually scrollable.
