New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add --regex-limit option. #62

Merged
merged 23 commits into from Jun 18, 2018

Conversation

4 participants
@miya0001
Member

miya0001 commented Jan 27, 2018

Related: #61

miya0001 added some commits Jan 27, 2018

@@ -22,7 +23,6 @@ class Search_Replace_Command extends WP_CLI_Command {
private $log_prefixes = array( '< ', '> ' );
private $log_colors;
private $log_encoding;
private $log_run_data = array();

This comment has been minimized.

@miya0001

miya0001 Jan 27, 2018

Member

It is an unused variable.

@miya0001 miya0001 requested a review from wp-cli/committers Jan 27, 2018

miya0001 added some commits Jan 27, 2018

@@ -177,6 +177,16 @@ Feature: Do global search/replace
"""
world, Hello
"""
When I run `wp search-replace 'o' 'O' --regex --regex-limit=1`

This comment has been minimized.

@gitlost

gitlost Feb 11, 2018

Contributor

I think --regex-limit deserves its own scenario(s) with targetted data rather than piggy-backing on existing ones.

@@ -190,6 +200,16 @@ Feature: Do global search/replace
kppr
"""
And the return code should be 1
When I run `wp search-replace 'o' 'O' --regex --regex-limit=1`

This comment has been minimized.

@gitlost

gitlost Feb 11, 2018

Contributor

See above.

@@ -880,6 +900,17 @@ Feature: Do global search/replace
Content_ab\1z__baz_1234567890_eb\1z__bez_1234567890_ib\1z__biz_1234567890_ob\1z__boz_1234567890_ub\1z__buz_
"""
When I run `wp search-replace '_{2}' '-' wp_posts --regex --regex-limit=2 --log --before_context=11 --after_context=11`

This comment has been minimized.

@gitlost

gitlost Feb 11, 2018

Contributor

See above.

@@ -108,6 +108,9 @@ class Search_Replace_Command extends WP_CLI_Command {
* [--regex-delimiter=<regex-delimiter>]
* : The delimiter to use for the regex. It must be escaped if it appears in the search string. The default value is the result of `chr(1)`.
*
* [--regex-limit=<regex-limit>]
* : The maximum possible replacements for each pattern in each subject string. Defaults to `-1` (no limit).

This comment has been minimized.

@gitlost

gitlost Feb 11, 2018

Contributor

I see you copied this from http://php.net/manual/en/function.preg-replace.php but it needs adapting here, eg there's only one subject string. Maybe The maximum possible replacements for the regex per row (or per unserialized data bit per row). Defaults... or something (need to mention the special serialized data case I think).

@@ -165,10 +168,14 @@ public function __invoke( $args, $assoc_args ) {
$total = 0;
$report = array();
$this->dry_run = \WP_CLI\Utils\get_flag_value( $assoc_args, 'dry-run' );
$php_only = \WP_CLI\Utils\get_flag_value( $assoc_args, 'precise' );
$php_only = \WP_CLI\Utils\get_flag_value( $assoc_args, 'precise' );

This comment has been minimized.

@gitlost

gitlost Feb 11, 2018

Contributor

Not a fan of manually aligning assignments that happen to follow each other (it's so awkward to maintain and fundamentally random - when do you start/stop?!) so personally would just remove but ignore me.

@@ -801,13 +808,16 @@ private function log_bits( $search_regex, $old_data, $old_matches, $new ) {
$diff += strlen( $new ) - strlen( $old_matches[0][ $i ][0] );
$i++;
return $new;
}, $old_data );
}, $old_data, $this->regex_limit );
$old_bits = $new_bits = array();
$append_next = false;
$last_old_offset = $last_new_offset = 0;
$match_cnt = count( $old_matches[0] );

This comment has been minimized.

@gitlost

gitlost Feb 11, 2018

Contributor

I think the easiest thing is just to do

$match_cnt = count( $new_matches[0] ); // `$new_matches[0]` may be limited by `regex_limit` so use instead of `$old_matches[0]`.

here which means the empty() checks following aren't needed.

This comment has been minimized.

@alpipego

alpipego Feb 17, 2018

Contributor

@gitlost why not use the $count preg_replace_callback provides?

This comment has been minimized.

@gitlost

gitlost Feb 17, 2018

Contributor

Er, well one could, but it's not important and I don't think reviewing a review is going to work to be honest! If you'd like to contribute please create an idea or an issue or a PR.

This comment has been minimized.

@gitlost

gitlost Feb 17, 2018

Contributor

On second thoughts you're right @alpipego that is a nicer way to do it so as suggested in #70 @miya0001 you can just do }, $old_data, $this->regex_limit, $match_cnt ); at line 811 instead.

This comment has been minimized.

@alpipego

alpipego Feb 17, 2018

Contributor

don't think reviewing a review is going to work

Of course you're right. The context is, that I am going through this with @miya0001 at my side at WordCamp Bangkok Contributer Day with the goal of fixing his PR.

This comment has been minimized.

@gitlost

gitlost Feb 17, 2018

Contributor

Ah I see! Sorry, I thought the PR was being hi-jacked! Ignore my previous comments!

$old_bits = $new_bits = array();
$append_next = false;
$last_old_offset = $last_new_offset = 0;
$match_cnt = count( $old_matches[0] );
for ( $i = 0; $i < $match_cnt; $i++ ) {
if ( empty( $old_matches[0][ $i ] ) || empty( $new_matches[0][ $i ] ) ) {

This comment has been minimized.

@gitlost

gitlost Feb 11, 2018

Contributor

So this change isn't needed.

$new_after = substr( $new_after, 0, $new_matches[0][ $i + 1 ][1] - $new_end_offset );
$after_shortened = true;
// On the next iteration, will append with no before context.
if ( ! empty( $old_matches[0][ $i + 1 ] ) && ! empty( $new_matches[0][ $i + 1 ] ) ) {

This comment has been minimized.

@gitlost

gitlost Feb 11, 2018

Contributor

Nor this change.

$this->recurse_objects = \WP_CLI\Utils\get_flag_value( $assoc_args, 'recurse-objects', true );
$this->verbose = \WP_CLI\Utils\get_flag_value( $assoc_args, 'verbose' );
$this->format = \WP_CLI\Utils\get_flag_value( $assoc_args, 'format' );
$this->regex_limit = \WP_CLI\Utils\get_flag_value( $assoc_args, 'regex-limit', -1 );
if ( 0 === intval( $this->regex_limit ) ) {
WP_CLI::error( '`--regex-limit` expects integer.' );

This comment has been minimized.

@gitlost

gitlost Feb 11, 2018

Contributor

Need a test for this anyway.

@@ -405,7 +412,7 @@ private function php_export_table( $table, $old, $new ) {
'chunk_size' => $chunk_size
);
$replacer = new \WP_CLI\SearchReplacer( $old, $new, $this->recurse_objects, $this->regex, $this->regex_flags, $this->regex_delimiter );
$replacer = new \WP_CLI\SearchReplacer( $old, $new, $this->recurse_objects, $this->regex, $this->regex_flags, $this->regex_delimiter, $this->regex_limit );

This comment has been minimized.

@gitlost

gitlost Feb 11, 2018

Contributor

See below.

@gitlost

This comment has been minimized.

Contributor

gitlost commented Feb 11, 2018

Not sure why the review comments are listed in a weird order...?!

@gitlost

This comment has been minimized.

Contributor

gitlost commented Feb 17, 2018

Closing this in favor of #70

@gitlost gitlost closed this Feb 17, 2018

@gitlost

This comment has been minimized.

Contributor

gitlost commented Feb 17, 2018

Re #70 (comment) okay I'll leave both open and step away for a while!

@gitlost gitlost reopened this Feb 17, 2018

miya0001 added some commits Feb 18, 2018

@miya0001

This comment has been minimized.

Member

miya0001 commented Feb 18, 2018

@gitlost

Thanks for review.
I merged #70 into this PR for now and I am going to work to improve this patch.
Please wait a couple of days. 😊

@alpipego

alpipego suggested changes Feb 19, 2018 edited

@miya0001 the default should of course be -1 (and not 1).

private $recurse_objects;
private $regex;
private $regex_flags;
private $regex_delimiter;
private $regex_limit = 1;

This comment has been minimized.

@alpipego

alpipego Feb 19, 2018

Contributor

@miya0001 the default should of course be -1 (instead of 1).

This comment has been minimized.

@miya0001

miya0001 Mar 6, 2018

Member

thanks 👍

miya0001 added some commits Mar 6, 2018

@miya0001

This comment has been minimized.

Member

miya0001 commented Mar 8, 2018

@gitlost
Please review.

@gitlost

Thanks @miya0001 , just some extra tests needed and some fiddling with the arg checking and a few doc and indentation nitpicks.

When I run `wp search-replace --regex "ap{2}le" "orange" --regex-limit=1 --log`
Then STDOUT should contain:
"""

This comment has been minimized.

@gitlost

gitlost Mar 8, 2018

Contributor

Could you indent the PyString by 2 spaces here please (to match standard indentation).

I have a pen, I have an orange. Pen, pine-apple, apple-pen.
"""

This comment has been minimized.

@gitlost

gitlost Mar 8, 2018

Contributor

And remove the extra newline here,

When I run `wp search-replace --regex "ap{2}le" "orange" --regex-limit=2 --log`
Then STDOUT should contain:
"""

This comment has been minimized.

@gitlost

gitlost Mar 8, 2018

Contributor

Again indent by standard 2 spaces please.

Then STDOUT should contain:
"""
I have a pen, I have an orange. Pen, pine-orange, apple-pen.
"""

This comment has been minimized.

@gitlost

gitlost Mar 8, 2018

Contributor

Could you add a scenario to test for incorrect args here, eg

  Scenario: Regex search/replace with incorrect or default `--regex-limit`
    Given a WP install
    When I try `wp search-replace '(Hello)\s(world)' '$2, $1' --regex --regex-limit=asdf`
    Then STDERR should be:
      """
      Error: `--regex-limit` expects a non-zero positive integer or -1.
      """
    When I try `wp search-replace '(Hello)\s(world)' '$2, $1' --regex --regex-limit=0`
    Then STDERR should be:
      """
      Error: `--regex-limit` expects a non-zero positive integer or -1.
      """
    When I try `wp search-replace '(Hello)\s(world)' '$2, $1' --regex --regex-limit=-2`
    Then STDERR should be:
      """
      Error: `--regex-limit` expects a non-zero positive integer or -1.
      """
    When I run `wp search-replace '(Hello)\s(world)' '$2, $1' --regex --regex-limit=-1`
    Then STDOUT should contain:
      """
      Success:
      """

(See below for disallowing zero.)

@@ -108,6 +108,9 @@ class Search_Replace_Command extends WP_CLI_Command {
* [--regex-delimiter=<regex-delimiter>]
* : The delimiter to use for the regex. It must be escaped if it appears in the search string. The default value is the result of `chr(1)`.
*
* [--regex-limit=<regex-limit>]
* : The maximum possible replacements for the regex in each unserialized data bit per row. Defaults to `-1` (no limit).

This comment has been minimized.

@gitlost

gitlost Mar 8, 2018

Contributor

The serialized case is one case only and secondary, so I think The maximum possible replacements for the regex per row (or per unserialized data bit per row). Defaults to -1 (no limit). is better.

@@ -197,6 +200,12 @@ public function __invoke( $args, $assoc_args ) {
}
WP_CLI::error( $msg );
}
if ( ( $regex_limit = (int) \WP_CLI\Utils\get_flag_value( $assoc_args, 'regex-limit', - 1 ) ) > - 1 ) {

This comment has been minimized.

@gitlost

gitlost Mar 8, 2018

Contributor

This check should be moved to the previous block where the other options are checked (after line 183) and should not allow zero as it doesn't make sense to me. So I'd prefer:

			if ( null !== ( $regex_limit = \WP_CLI\Utils\get_flag_value( $assoc_args, 'regex-limit' ) ) ) {
				if ( ! preg_match( '/^(?:[0-9]+|-1)$/', $regex_limit ) || 0 === (int) $regex_limit ) {
					WP_CLI::error( '`--regex-limit` expects a non-zero positive integer or -1.' );
				}
				$this->regex_limit = (int) $regex_limit;
			}
@@ -828,8 +836,8 @@ private function log_bits( $search_regex, $old_data, $old_matches, $new ) {
$new_after = \cli\safe_substr( substr( $new_data, $new_end_offset ), 0, $this->log_after_context, false /*is_width*/, $encoding );
// To lessen context duplication in output, shorten the after context if it overlaps with the next match.
if ( $i + 1 < $match_cnt && $old_end_offset + strlen( $old_after ) > $old_matches[0][ $i + 1 ][1] ) {
$old_after = substr( $old_after, 0, $old_matches[0][ $i + 1 ][1] - $old_end_offset );
$new_after = substr( $new_after, 0, $new_matches[0][ $i + 1 ][1] - $new_end_offset );
$old_after = substr( $old_after, 0, $old_matches[0][ $i + 1 ][1] - $old_end_offset );

This comment has been minimized.

@gitlost

gitlost Mar 8, 2018

Contributor

These 2 inner indentation changes are unnecessary

@@ -20,15 +21,17 @@ class SearchReplacer {
* @param bool $regex Whether `$from` is a regular expression.
* @param string $regex_flags Flags for regular expression.
* @param string $regex_delimiter Delimiter for regular expression.
* @param integer $regex_limit The maximum possible replacements for each pattern in each subject string.

This comment has been minimized.

@gitlost

gitlost Mar 8, 2018

Contributor

This should appear last after the $logging arg now.

@schlessera schlessera removed this from the 1.3.0 milestone Apr 21, 2018

miya0001 added some commits Apr 28, 2018

@miya0001 miya0001 requested a review from schlessera Jun 14, 2018

@miya0001

This comment has been minimized.

Member

miya0001 commented Jun 14, 2018

@schlessera

I forgot to work this ticket. You can review it now. :)

@schlessera schlessera added this to the 1.4.0 milestone Jun 18, 2018

@schlessera schlessera merged commit 9947e9e into master Jun 18, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@schlessera schlessera deleted the patch-61 branch Jun 18, 2018

@schlessera

This comment has been minimized.

Member

schlessera commented Jun 18, 2018

Thanks for the PR, @miya0001 !

@schlessera

This comment has been minimized.

Member

schlessera commented Jun 18, 2018

Also thanks to @alpipego for the collaboration, as his original PR is included in this as well.

@CodeProKid CodeProKid referenced this pull request Jul 20, 2018

Closed

Add --regex-limit #61

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment