Skip to content

Conversation

jesseoverright
Copy link
Contributor

Adds a function_exists check for mb_strlen() and will use php's strlen() in place of mb_strlen(). Addresses issues with PHP installations that do not have the non-default mbstring extension enabled.

lib/cli/cli.php Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's follow WP coding standards here.

@jesseoverright
Copy link
Contributor Author

Thanks for the feedback @danielbachhuber . I've updated based on your comments.

lib/cli/cli.php Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How exactly is using plain substr() encoding-safe?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I.e. if it was encoding-safe, people wouldn't have used mb_substr() in the first place, no?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Demonstration:

<?php

$str = 'sécurité';
echo substr( $str, 0, 2 ) . "\n";
echo mb_substr( $str, 0, 2, mb_detect_encoding( $str ) ) . "\n";

Output:

s�
sé

@jesseoverright
Copy link
Contributor Author

@scribu I see your point. Plain substr()as written would not be encoding safe. mb_substr() and mb_strlen() would definitely be the preferred method, but if that extension is not enabled on a PHP installation then it results in a fatal error.

Would it be safe to assume UTF-8 encoding in cases where mb_substr() is not installed? If that were the case then:

<?php
$str = 'sécurité';
echo utf8_encode( substr( utf8_decode( $str ), 0, 2 ) ) . "\n";`

Outputs:

and could be used as a fallback for mb_substr() to avoid the fatal error. Would that be acceptable?

@scribu
Copy link
Member

scribu commented Sep 5, 2014

Would it be safe to assume UTF-8 encoding in cases where mb_substr is not installed?

No, I don't think that's a safe assumption.

@scribu
Copy link
Member

scribu commented Sep 5, 2014

However, I think mb_substr() usage in the search-replace command is an optimization. If mb_substr() isn't available, you could make the command fall back to the slow method.

Edit: Oh, it seems mb_strlen() is used in php-cli-tools: #53. Welp.

@danielbachhuber
Copy link
Member

What I'd like to have for php-cli-tools is:

  1. Use mb_strlen() if it exists.
  2. If not 1, use strlen(). But, try and detect if there is encoding in the string, and trigger a PHP notice if there is.

For WP-CLI, we can fall back to strlen()

@danielbachhuber
Copy link
Member

For WP-CLI, we can fall back to strlen()

I spoke too soon. We don't use mb_strlen() or mb_substr() in WP-CLI. Your original fatal error is because of php-cli-tools

@scribu
Copy link
Member

scribu commented Sep 5, 2014

If not 1, use strlen(). But, try and detect if there is encoding in the string, and trigger a PHP notice if there is.

Since these functions are just used for displaying data, as opposed to manipulating data that will be sent to a database, I think that's acceptable. The function descriptions are still incorrect, though.

@jesseoverright
Copy link
Contributor Author

The function descriptions are still incorrect, though.

I'll try to add that clarification in an update of the pull request that also includes @danielbachhuber recommendation for the mb_strlen() fallback

@danielbachhuber
Copy link
Member

Thanks @jesseoverright. I'd love to get this sorted out in the next couple of days so we can ship 0.17.0

@jesseoverright
Copy link
Contributor Author

Sounds good. I'll submit an updated pull request tomorrow.

On Sep 6, 2014, at 1:36 PM, Daniel Bachhuber notifications@github.com wrote:

Thanks @jesseoverright. I'd love to get this sorted out in the next couple of days so we can ship 0.17.0


Reply to this email directly or view it on GitHub.

@jesseoverright
Copy link
Contributor Author

Function descriptions are updated. Also added iconv() (available by default in PHP) to trigger a PHP notice if non-ascii encoding is present (in cases where mb_substr() & mb_strlen() are missing). mb_detect_encoding() would of course be preferred, but wouldn't be available in this edge case.

Also included tests for \cli\safe_subst()

@danielbachhuber danielbachhuber added this to the next milestone Sep 7, 2014
danielbachhuber added a commit that referenced this pull request Sep 7, 2014
Adds fallback for mb_strlen, fixes #61
@danielbachhuber danielbachhuber merged commit 0fa663f into wp-cli:master Sep 7, 2014
@danielbachhuber
Copy link
Member

👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants