Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Further improve comment extraction #166

Merged
merged 7 commits into from
Feb 22, 2018
Merged

Further improve comment extraction #166

merged 7 commits into from
Feb 22, 2018

Conversation

swissspidy
Copy link
Contributor

@swissspidy swissspidy commented Feb 22, 2018

Follow-up to #164:

* Case insenstive comment prefix matching
* Find comments on the same line
* Find multi-line comments
@oscarotero
Copy link
Member

Hi. I think is better to have a pull request per feature 😃
Anyway, I want to keep as close as possible to the gettext behaviour, in order to have high compatibility. Case insensitive prefixes for comments breaks this compatibility, so I prefer do not include this.

Not sure whether comment in the same line is valid in xgettext. The gettext docs says "preceding keyword lines". 🤔

And about multiline comments, I think this logic should be placed inside the ParsedComment class.

So, instead:

$lineNumber = $value[2] + substr_count($value[1], "\n");

You could do simply:

$lineNumber = $comment->getLastLine();

Or even better:

if ($comment->isRelatedWith($bufferFunctions[0])) {
    $bufferFunctions[0]->addComment($comment);
}

This leaves the parser cleaner and move the responsability of how to handle comments to the ParsedComment class.

@swissspidy
Copy link
Contributor Author

Anyway, I want to keep as close as possible to the gettext behaviour, in order to have high compatibility. Case insensitive prefixes for comments breaks this compatibility, so I prefer do not include this.

Fair enough. An alternative would be to pass the various case variations to the functions scanner, so that's not a big deal. I'll undo it.

Not sure whether comment in the same line is valid in xgettext. The gettext docs says "preceding keyword lines".

It is.

Example code:

<?php

/* allowed1: boo */ /* allowed2: only this should get extracted. */ /* some other comment */ $bar = strtolower( __( 'Foo' ) );

Result of xgettext --add-comments=allowed2 --keyword=__:1 --add-location example.php:

#. allowed2: only this should get extracted.
#. some other comment
#: example.php:2
msgid "Foo"
msgstr ""

Whereas with this PR the library does this:

#. allowed2: only this should get extracted.
#: ./example.php:2
msgid "Foo"
msgstr ""

So I'd even say this library does it better than xgettext because some other comment is not a valid translator comment in that case :-)

And about multiline comments, I think this logic should be placed inside the ParsedComment class.

I'll try to come up with something :-)

@swissspidy
Copy link
Contributor Author

It would probably be a bit less complex if parsePhpComment() would return a ParsedComment 🤔🤷‍♀️

@oscarotero
Copy link
Member

It would probably be a bit less complex if parsePhpComment() would return a ParsedComment

Yes, good idea.

And about the last line of the comment, I meant to let the ParsedComment instance to calculate the latest line, instead the parsePhpComment(), because the only things needed is the comment string and the first line number, that were passed as constructor arguments.

@swissspidy
Copy link
Contributor Author

because the only things needed is the comment string and the first line number, that were passed as constructor arguments.

After parsing, the comment is on a single line though, so the ending line number can't be calculated from that.

@oscarotero
Copy link
Member

Ok, so what do you think about moving all this logic to a static factory of ParsedComment?. For example:

protected function parsePhpComment($value)
{
    if ($this->extractComments === false) {
        return;
    }

    //this returns a comment or null
    $comment = ParsedComment::create($value);

    if ($comment && $comment->checkPrefix($this->extractComments)) {
        return $comment;
    }
}

@swissspidy
Copy link
Contributor Author

Agreed, makes sense. It looks much clearer with that.

* @param array $prefixes An array of prefixes to check.
* @return bool Whether the comment matches the prefixes or not.
*/
public function checkPrefixes($prefixes)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This argument could be typed (checkPrefixes(array $prefixes))

@oscarotero
Copy link
Member

Thanks!

@oscarotero oscarotero merged commit 37882dc into php-gettext:master Feb 22, 2018
@swissspidy swissspidy deleted the comment-extraction-2 branch February 22, 2018 18:25
schlessera pushed a commit to wp-cli/i18n-command that referenced this pull request Jan 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants