Duplicated entries when merging po file with update pot #68

Closed
izimobil opened this issue Oct 4, 2015 · 4 comments

izimobil commented Oct 4, 2015

Originally reported by: Luigi Toscano (Bitbucket: tosky, GitHub: tosky)


When merging a fairly complex po file with the updated po template, the generated file contains various duplicated entries and is invalid according to msgfmt.

I tested with the current translations of digikam.po (both Italian and Ukrainian, which are quite complete) against the current pot, using a simple program that is almost identical to test_merge. The result is:

$ LANG=C msgfmt -o /dev/null --statistics digikam_it_new.po
digikam_it_guineapig.po:23014: duplicate message definition...
digikam_it_guineapig.po:11517: ...this is the location of the first definition
[...]

I investigated the source code, and the merge method in particular, but so far I have not identified where the duplicated entries are added.



izimobil commented Oct 4, 2015

Original comment by Luigi Toscano (Bitbucket: tosky, GitHub: tosky):


Upon further investigation, the problem seems to be linked to the obsolete entries in the current po. If I remove them, the generated file is correct and the content matches the result of gettext msgmerge. Still, it shouldn't fail.


izimobil commented Oct 4, 2015

Original comment by Luigi Toscano (Bitbucket: tosky, GitHub: tosky):


I think the problem is in the merge method, here:
self_entries = dict((entry.msgid, entry) for entry in self)
The issue is that entry.msgid is not unique, so entries sharing a msgid overwrite one another in this dict. I suspect that the obsolete entry wins. Moreover, this means that all entries with the same msgid but a different context (which could call for a different translation) are going to end up with the same translation.
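
The collision described above can be reproduced without polib. This is only a minimal sketch: the Entry tuple and the sample strings are hypothetical stand-ins, not polib's POEntry class or data from digikam.po. It builds the dict the same way as the quoted line and shows that three entries collapse into one, with the last one (here the obsolete one) winning.

```python
from collections import namedtuple

# Hypothetical stand-in for a PO entry; this is NOT polib's POEntry class.
Entry = namedtuple("Entry", "msgid msgctxt msgstr obsolete")

# Made-up sample data: same msgid, different contexts, plus an obsolete entry.
entries = [
    Entry("Open", "menu", "Apri", False),
    Entry("Open", "state", "Aperto", False),
    Entry("Open", None, "Apri (old)", True),
]

# Same construction as the quoted line: the key is msgid alone,
# so each later entry silently replaces the earlier one.
by_msgid = dict((entry.msgid, entry) for entry in entries)

print(len(entries), "entries,", len(by_msgid), "key(s)")
print("survivor:", by_msgid["Open"])
```

Because dict construction keeps the last value seen for a key, whichever entry comes last in the file is the one the merge would consult for every occurrence of that msgid.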


izimobil commented Dec 2, 2015

Original comment by fyrestone NA (Bitbucket: fyrestone, GitHub: fyrestone):


Maybe it would be better to use the context and msgid together as the merge key.
Borrowing code from Django's trans_real.py:

# magic gettext number to separate context from message
CONTEXT_SEPARATOR = "\x04"
msg_with_ctxt = "%s%s%s" % (context, CONTEXT_SEPARATOR, message)

Also, gettext.py in the Python standard library uses the context and msgid together as the gettext() key.
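
As a rough sketch of that suggestion, the snippet below builds a lookup key from context plus msgid using the same separator. The merge_key helper and the sample tuples are hypothetical, introduced only for illustration; this is not the code from the polib commit that closed the issue.

```python
# Separator gettext uses between context and message (from the comment above).
CONTEXT_SEPARATOR = "\x04"


def merge_key(msgid, msgctxt=None):
    """Hypothetical helper: return a key that keeps contexts apart."""
    if msgctxt is None:
        # Entries without a context keep the bare msgid as their key.
        return msgid
    return "%s%s%s" % (msgctxt, CONTEXT_SEPARATOR, msgid)


# Made-up sample data as (msgid, msgctxt, msgstr) tuples.
samples = [
    ("Open", "menu", "Apri"),
    ("Open", "state", "Aperto"),
    ("Open", None, "Apri"),
]

index = {merge_key(msgid, ctxt): msgstr for msgid, ctxt, msgstr in samples}

print(len(index), "distinct keys")
print(index[merge_key("Open", "state")])
```

With this key the three entries stay distinct. An obsolete entry that shares both context and msgid with an active one would still collide, so a real merge would presumably need to handle obsolete entries separately.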


Original comment by David Jean Louis (Bitbucket: izi, GitHub: izi):


This should be fixed by https://bitbucket.org/izi/polib/commits/d0fcec9991c231015244e7e8cd6cac4f9bfbb0d0.
Thank you both for the suggestions.
