Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Message uniqueness #48

Closed
mlocati opened this issue Jan 20, 2015 · 11 comments
Closed

Message uniqueness #48

mlocati opened this issue Jan 20, 2015 · 11 comments

Comments

@mlocati
Copy link
Member

mlocati commented Jan 20, 2015

Let's say we have a test.po file like this:

msgid ""
msgstr ""
"Project-Id-Version: \n"
"POT-Creation-Date: \n"
"PO-Revision-Date: \n"
"Language: it\n"
"Language-Team: \n"
"Last-Translator: \n"
"Plural-Forms: nplurals=2; plural=(n != 1);\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"

msgid "1 child"
msgstr ""

msgid "1 child"
msgid_plural "2 children"
msgstr[0] ""
msgstr[1] ""

If we compile it with msgfmt we have:

test.po:17: duplicate message definition...
test.po:15: ...this is the location of the first definition
msgfmt: found 1 fatal error

So, we may not have in .po files two strings with same context+msgid but different plurals.

I think that we should apply two changes:

  1. Revert Fix mo generation #46
  2. Add an alternative method to Translations::find that does not take in account the plural form
@mlocati
Copy link
Member Author

mlocati commented Jan 20, 2015

Another problem is with Translations::mergeWith. When working with gettext files we should merge entries with the same context+single-form (even if they have different plurals)

@oscarotero
Copy link
Member

I see. I'm not agree with this behaviour, but the library should be compatible with gettext (as its name says) so this change is necessary, but this affects to different things:

  • What happens on scan a po/phpcode/etc source and two entries with the same id but different plural are found? Should it throw an exception? or overwrite it?
  • The same behaviour that above should be applied to Translations::find
  • Translations::find should remove the plural argument, or make it optional (only take in account if its defined). But if plurals are not part of the id, I vote for remove it. Maybe an array as third argument with different properties to filter, for example:
//Search only by context + msgid
$t = $translations->find(null, 'comment');

//Search by context + msgid + other non-unique parameters
$t = $translations->find(null, 'comment', array(
    'plural' => 'comments',
    'translation' => 'comentario',
    'plural_translation' => 'comentarios'
));

@mlocati
Copy link
Member Author

mlocati commented Jan 20, 2015

What happens on scan a po/phpcode/etc source and two entries with the same id but different plural are found? Should it throw an exception? or overwrite it?

I did a test with these two files (I don't write here the headers to be more clear):
singular.po:

msgid "1 child"
msgstr ""

plural.po:

msgid "1 child"
msgid_plural "2 children"
msgstr[0] ""
msgstr[1] ""

Running msgcat singular.po plural.po we have:

#, fuzzy
msgid "1 child"
msgstr ""
"#-#-#-#-#  singular.po  #-#-#-#-#\n"
"#-#-#-#-#  plural.po  #-#-#-#-#\n"

Running msgcat plural.po singular.po we have:

#, fuzzy
msgid "1 child"
msgid_plural "2 children"
msgstr[0] ""
"#-#-#-#-#  plural.po  #-#-#-#-#\n"
"#-#-#-#-#  singular.po  #-#-#-#-#\n"
msgstr[1] "#-#-#-#-#  plural.po  #-#-#-#-#\n"

The msgstr are quite ugly. I'd avoid such a behaviour. The relevant result is that the form of the first file given is kept. I'd keep this same behaviour...


Translations::find should remove the plural argument, or make it optional (only take in account if its defined). But if plurals are not part of the id, I vote for remove it. Maybe an array as third argument with different properties to filter

The problem is that in gettext files we may have entries that are univocally identified only by context+msgid.

@oscarotero
Copy link
Member

So if we have two po files like:

#: singular.po
msgid "1 child"
msgstr "1 fillo"
#: plural.po
msgid "1 child"
msgid_plural "2 children"
msgstr[0] "1 fillo"
msgstr[1] "2 fillos"

And we merge them, as they have the same id, they wont merged

$singular->mergeWith($plural); //this has not effect
$plural->mergeWith($singular); //this neither

@mlocati
Copy link
Member Author

mlocati commented Jan 20, 2015

And we merge them, as they have the same id, they wont merged

Yes, I tend to agree... With one exception:
if $singular does not have translation and $plural yes, for $singular->mergeWith($plural); I'd copy the msgstr[0] from $plural to $singular...
offiicial gettext msgcat merges them in quite an ugly way (see my example above), I'd avoid that behaviour...

@oscarotero
Copy link
Member

Ok, I'm agree.

@oscarotero
Copy link
Member

In the commit f9d1854 I've made the following changes:

  • Removed plurals from Translations::find() and Translations::is()
  • Changed the method Translation::mergeWith() to have the same api than Translations::mergeWith (using bitwise constants) here
  • More control on insert new translations. Now on insert a new value in the Translations class, it check if it's a instance of Translation class and if it's duplicated or not. If it's duplicated, merge it.

So this allows scan multiple files with duplicated entries or even a po file with duplicated values with no problems, merged silently the duplicated values.
The merge method can be configurated creating more constants. Now it replace the values just if the previous is empty.

@mlocati
Copy link
Member Author

mlocati commented Jan 20, 2015

Great! IMHO, since you added some breaking change, the next gettext version should be a major (2.x) release (see semver rules).

@oscarotero
Copy link
Member

I was thinking in 2.4 but maybe you're right and 3.0 is better.

@mlocati
Copy link
Member Author

mlocati commented Jan 21, 2015

Great! IMHO, since you added some breaking change, the next gettext version should be a major (2.x) release (see semver rules).

As you guessed, I meant 3.x and not 2.x 😉

@oscarotero
Copy link
Member

done

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants