(This patch is mostly intended as a discussion)
This is a contrib back from Drupal to make PO file parsing better and hopefully use the translation component within Drupal into the near future.
Reading PO Headers
Multi-line source / translation
See more @ clemens-tolboom@69b7d25#L1L56
Fixed example code.
Merge remote branch 'upstream/master'
Merge branch 'master' of github.com:symfony/symfony
Initial stab at multiline po.
It currently has coding style 'problems' as part of the code is copied verbatim from Drupal to make sure those keep in sync for now.
Problems from a Drupal perspective are:
The POLoader produces a parser log of errors for which there is no way to read.
It still needs a test for parsing a header successfully.
The header info is not used anywhere within the system.
I'm not sure where to test the multiline parsing.
I want to make sure the texts are joined correctly by the PoFileLoader.
This pull request passes (merged 69b7d25 into e54f4e4).
@clemens-tolboom this class indeed does not exist. You probably forgot a use statement
I will let someone else review it. I don't know gettext enough to know if the implementation is right, and my brain's parser gives me too much warnings when reading the code...
@clemens-tolboom First of all you should follow Symfony2 CS.
@stloyd he wrote that the CS is copied from Drupal and will be fixed later. So no need to tell him again... (especially since its intended as RFC)
@stof : how many symfony people use PO files? Drupal only uses PO so is this worth the effort?!?
@stloyd & @Tobion : I somehow expected travis to kick my CS ass ... Netbeans does not support two PHP CS formats at once afaik :(
Is the problem about batch loading PO file solvable within the symfony framework?
I don't want to crush the party, but this code is GPL licensed.
As @beberlei mentioned, the biggest problem is the license of the code... Symfony being licensed under the MIT license.
@clemens-tolboom Travis is about running tests, not fixing CS.
Looks like Symfony already supported multiline string parsing (storing the string value in the buffer when reading the string). So that does not seem to be new with this (mentioned above as "Multi-line source / translation").
I would suspect anything ported from Drupal might be hopelessly entangled with the GPL and be uncovertable to another license.
drak: interesting observation. The code posted above is not yet in Drupal and never was. It is from a work in progress patch from the Drupal issue queues. It is based on code from Drupal 7 but is mostly rewritten from the ground up (I think Clemens could vouch for a better estimate on how much different it is). The Drupal 7 Gettext parser is hopelessly ugly (even though it does work well :).
@clemens-tolboom how much of this code was already in Drupal previously (and so subject to the GPL) ? For the new code, licensing is not an issue as it can be submitted as MIT
Wow ... thanks for the quick feedback :)
And darn ... you guys are right about license (why didn't I think of that)
Doing a diff on prepared files (locale.inc from Drupal 7 and the PoFileLoader.php) to only check the readLine code the following stats show we have license problems.
Lines that differ ... mostly by having a $this-> in front of state variables
$ diff --suppress-common-lines -F 100 -y -E -b PoFileLoader.php locale-d7.inc | wc -l
where total lines counts are
Almost 50% are common. Most of the rest has a $this-> and some lines are new related to Batch/state management.
I think I should retract this pull request right?
Sorry for wasting your time :(
can you figure out the original contributors and ask them for permission?
This part of the Drupal codebase has a history of about 9 years. It would likely be impossible to track all the people down who wrote it originally and then fixed bugs and added new features to them.
I see .. in that case there are 2 options:
1) you simply leave this file inside the Drupal code base, but try to make it easy for users to grab standalone
2) you write the code from scratch, while using the original code as inspiration .. this is perfectly legal as we are only talking about copyright and not patents here.
We're trying to do option 1 already as part of internal cleanup. Long road, but hopefully. :-) That still keeps it GPL, though.
If clemens is up for option 2, I would have no problem backing us using that in place of the existing code provided it is in fact technically better than what we have now. We'd have to work out how we deal with "Drupal-created 3rd party dependencies", though. That's a discussion better had over on Drupal.org than here.
@Crell If we go for option 2 we from Drupal only have to add another symfony component 'Translation' to the plate right?
Gabor, let me challenge that. 9 years or not, we have git blame and this code , as far as I can remember is not a particularly often changed. The likely culprits are you and Gerhard anyways and both of you are still within easy reach. Someone needs to run a bunch of git blames to figure this out. I think it's doable.
git blame is not enough, whitespace changes or coding standard adjustments move the blame away from the copyright holders of the code. You have to walk the history of files back, plus handling potential moves of the file or parts of the code. Its really not that easy :)
i just clicked through https://github.com/drupal/drupal/commits/7.x/includes/locale.inc?page=8 .. seems like mostly the same names .. though many commits were done on behalf of someone else .. but as @chx points out .. its only a portion of the file ..
It really is not that hard. We need someone with more time than me to verify this completely but I think only 57c9a13e, 30c2e89c, 9841e29a, 941139bf are responsible. At least that's when the import one was introduced. This suggests the same names as I suspected originally plus kkaefer/timcn.
Well, for that, we should assume that the commit message surely includes everybody who contributed to the patches involved. Is that legally possible to assume? Eg. how do you parse "New locale module thanks to Gerhard, Goba, Marco, Kristjan and others." for the complete list of people involved? From http://drupalcode.org/project/drupal.git/commit/1831e1b690f02d7f551d38ef88a0ba200f786497 Then how do you dig up the discussion history from 8 years ago (2004) on that commit to figure out who were the "others"?
Do we have code from that commit? If yes, that's unfortunate.
Well, you might be surprised how little changed in the actual .po file / plural formula parsing code since then. For example, the much dreaded eval() for formula evaluation was introduced with this 2004 patch and is still used in the 2011 released Drupal 7. But this is just the big early commit. There are more than likely intermediate commits with "and others" mentioned from times where we did not have an issue queue based process to track people down retroactively.
Puzzled by the multiline problem and other I did another #4249
@ #drupal-i18n we discussed our effort and decided to follow two tracks of which another is to try to use as much as possible of the Translation component:
I'm not sure whether we should close this pull for now. I'm willing to write a new parser when a sponsor is available.
As one of the original authors of the currently used implementation. I'd also be interested in any developments regarding plural rule parsing and evaluation without using evil eval(). I remember CakePHP wasn't using eval for that task. You may also want to look if anything has changed/improved since you last looked on the implementation.
Full disclosure: I'm part of the Lithium team and CakePHP alumni.
However: I just discovered you're using the implementation and didn't know it before - hope it does a good job.
http://drupal.org/node/1273968 rips out the eval from Drupal. This code, however, was written by me, and as such, unless someone convinces me otherwise, is GPL.
Than it can't be used in non-GPL projects :(
But nice work!
I retract this pull request as it is be replaced by #4249
From a Symfony perspective we have PluralizationRules which can be extended by ones own pluralization rule.
I hope @chx can interface with that for Drupal as my Drupal patch http://drupal.org/node/1570346 (needs lots of work still) is using MessageSelector already.
@davidpersson tnx for the pointers.
@davidpersson - there is nothing evil about eval in this particular circumstance: please see #2630