Permalink
Browse files

RT #72151: 5.14 re pragma and

PCP::RegularExpressions::RequireDotMatchAnything,
::RequireExtendedFormatting, ::RequireLineBoundaryMatching

THIS COMMIT REQUIRES PPix::Regexp 0.022

The requester asked that Perl::Critic honor the Perl 5.14 'use re
/modifiers' pragma. The implementation is divided between Perl::Critic
(which figures out which default modifiers are in-scope) and
PPIx::Regexp (which figures out what modifiers are actually in effect
based on the modifiers actually asserted, and the in-scope default
modifiers if any).

The Perl::Critic portion involved:
* Adding method element_is_in_lexical_scope_after_statement_containing()
  to Perl::Critic::Document. There is no current reason for this code to
  be here rather than in one of the utility packages, but this way
  caching of scope objects can be done without changing the interface.
* Modified Perl::Critic::Document method ppix_regexp_from_element() to
  make use of the above to find all the default modifier pragmas
  in-scope and pass them to PPIx::Regexp->new(). The PPIx::Regexp
  objects are already cached, the overhead of finding the pragmas should
  only be incurred once for a given regexp.
* Converting the relevant policies to use the new PPIx::Regexp
  modifier_asserted() method (which takes the defaults into account)
  rather than using the PPIx::Regexp modifiers() method (to return the
  object representing the modifiers present on the regexp) or the
  PPI::Token::Regexp get_modifiers method (ditto). This included:
  - ControlStructures::ProhibitMutatingListFunctions
  - RegularExpressions::RequireExtendedFormatting *
  - RegularExpressions::ProhibitUnusedCapture
  - RegularExpressions::ProhinitFixedStringMatches
  - RegularExpressions::RequireDotMatchAnything *
  = RegularExpressions::RequireLineBoundaryMatching *
  The starred policies had their t/*.run files updated as well.
* Boosting the version of PPIx::Regexp required to 0.022.

The so-called 'extra credit' portion of the ticket (handling
Regexp::DefaultFlags as well) is not in this commit. Or anywhere else at
the moment.


git-svn-id: http://perlcritic.tigris.org/svn/perlcritic/trunk/distributions/Perl-Critic@4095 7ac2fd03-a204-0410-b7b1-b9f6acf132f9
  • Loading branch information...
1 parent df2f0d7 commit 5b04e2c1562b36b5ca94639b52fa84ac9dc7c8f7 wyant committed Nov 26, 2011
View
@@ -17,6 +17,10 @@ Next release, whenever it is:
* RegularExpressions::ProhibitUnusedCaptures now looks into lists
and blocks in the replacement portion of the regular expression if
/e is asserted. RT #72086.
+ * RegularExpressions::RequireDotMatchAnything,
+ RegularExpressions::RequireExtendedFormatting and
+ RegularExpressions::RequireLineBoundaryMatching now honor defaults
+ set with 'use re "/modifiers"'. RT #72151.
* Subroutines::ProhibitManyArgs now recognizes '+' as a prototype
character.
Other Changes:
@@ -58,7 +58,7 @@ sub required_module_versions {
'PPI::Node' => '1.215',
'PPI::Token::Quote::Single' => '1.215',
'PPI::Token::Whitespace' => '1.215',
- 'PPIx::Regexp' => '0.019', # RT 67273
+ 'PPIx::Regexp' => '0.022', # RT 72151
'PPIx::Utilities::Node' => '1.001',
'PPIx::Utilities::Statement' => '1.001',
'Perl::Tidy' => 0,
@@ -242,12 +242,93 @@ sub ppix_regexp_from_element {
return $self->{_ppix_regexp_from_element}{$addr}
if exists $self->{_ppix_regexp_from_element}{$addr};
return ( $self->{_ppix_regexp_from_element}{$addr} =
- PPIx::Regexp->new( $element ) );
+ PPIx::Regexp->new( $element,
+ default_modifiers =>
+ $self->_find_use_re_modifiers_in_scope_from_element(
+ $element ),
+ ) );
} else {
return PPIx::Regexp->new( $element );
}
}
+sub _find_use_re_modifiers_in_scope_from_element {
+ my ( $self, $elem ) = @_;
+ my @found;
+ foreach my $use_re ( @{ $self->find( 'PPI::Statement::Include' ) || [] } )
+ {
+ 're' eq $use_re->module()
+ or next;
+ $self->element_is_in_lexical_scope_after_statement_containing(
+ $elem, $use_re )
+ or next;
+ my $prefix = 'no' eq $use_re->type() ? q{-} : $EMPTY;
+ push @found,
+ map { "$prefix$_" }
+ grep { m{ \A / }smx }
+ map {
+ $_->isa( 'PPI::Token::Quote' ) ? $_->string() :
+ $_->isa( 'PPI::Token::QuoteLike::Words' ) ? $_->literal() :
+ $_->content() }
+ $use_re->schildren();
+ }
+ return \@found;
+}
+
+#-----------------------------------------------------------------------------
+
+# This got hung on the Perl::Critic::Document, rather than living in
+# Perl::Critic::Utils::PPI, because of the possibility that caching of scope
+# objects would turn out to be desirable.
+
+sub element_is_in_lexical_scope_after_statement_containing {
+ my ( $self, $inner_elem, $outer_elem ) = @_;
+
+ # If the outer element defines a scope, we're true if and only if
+ # the outer element contains the inner element.
+ $outer_elem->scope()
+ and return $inner_elem->descendant_of( $outer_elem );
+
+ # In the more general case:
+
+ # The last element of the statement containing the outer element
+ # must be before the inner element. If not, we know we're false,
+ # without walking the parse tree.
+
+ my $stmt = $outer_elem->statement()
+ or return;
+ my $last_elem = $stmt->last_element()
+ or return;
+
+ my $stmt_loc = $last_elem->location()
+ or return;
+
+ my $inner_loc = $inner_elem->location()
+ or return;
+
+ $stmt_loc->[0] > $inner_loc->[0]
+ and return;
+ $stmt_loc->[0] == $inner_loc->[0]
+ and $stmt_loc->[1] > $inner_loc->[1]
+ and return;
+
+ # Since we know the inner element is after the outer element, find
+ # the element that defines the scope of the statement that contains
+ # the outer element.
+
+ my $parent = $stmt;
+ while ( ! $parent->scope() ) {
+ $parent = $stmt->parent()
+ or return;
+ }
+
+ # We're true if and only if the scope of the outer element contains
+ # the inner element.
+
+ return $inner_elem->descendant_of( $parent );
+
+}
+
#-----------------------------------------------------------------------------
sub filename {
@@ -693,6 +774,19 @@ just returns the results of C<< PPIx::Regexp->new() >>. In either case,
it returns C<undef> unless the argument is something that
L<PPIx::Regexp|PPIx::Regexp> actually understands.
+=item C<< element_is_in_lexical_scope_after_statement_containing( $inner, $outer ) >>
+
+Is the C<$inner> element in lexical scope after the statement containing
+the C<$outer> element?
+
+In the case where C<$outer> is itself a scope-defining element, returns true
+if C<$outer> contains C<$inner>. In any other case, C<$inner> must be
+after the last element of the statement containing C<$outer>, and the
+innermost scope for C<$outer> also contains C<$inner>.
+
+This is not the same as asking whether C<$inner> is visible from
+C<$outer>.
+
=item C<< filename() >>
@@ -103,7 +103,7 @@ sub violates {
# Only the block form of list functions can be analyzed.
return if not my $first_arg = first_arg( $elem );
return if not $first_arg->isa('PPI::Structure::Block');
- return if not _has_topic_side_effect( $first_arg );
+ return if not $self->_has_topic_side_effect( $first_arg, $doc );
# Must be a violation
return $self->violation( $DESC, $EXPL, $elem );
@@ -112,15 +112,15 @@ sub violates {
#-----------------------------------------------------------------------------
sub _has_topic_side_effect {
- my $node = shift;
+ my ( $self, $node, $doc ) = @_;
# Search through all significant elements in the block,
# testing each element to see if it mutates the topic.
my $tokens = $node->find( 'PPI::Token' ) || [];
for my $elem ( @{ $tokens } ) {
next if not $elem->significant();
return 1 if _is_assignment_to_topic( $elem );
- return 1 if _is_topic_mutating_regex( $elem );
+ return 1 if $self->_is_topic_mutating_regex( $elem, $doc );
return 1 if _is_topic_mutating_func( $elem );
return 1 if _is_topic_mutating_substr( $elem );
}
@@ -149,14 +149,19 @@ sub _is_assignment_to_topic {
#-----------------------------------------------------------------------------
sub _is_topic_mutating_regex {
- my $elem = shift;
+ my ( $self, $elem, $doc ) = @_;
return if ! ( $elem->isa('PPI::Token::Regexp::Substitute')
|| $elem->isa('PPI::Token::Regexp::Transliterate') );
# Exempt PPI::Token::Regexp::Transliterate objects IF the replacement
# string is empty AND neither the /d or /s flags are specified, OR the
# replacement string equals the match string AND neither the /c or /s
# flags are specified. RT 44515.
+ #
+ # NOTE that, at least as of 5.14.2, tr/// does _not_ participate in the
+ # 'use re /modifiers' mechanism. And a good thing, too, since the
+ # modifiers that _are_ common (/s and /d) mean something completely
+ # different in tr///.
if ( $elem->isa( 'PPI::Token::Regexp::Transliterate') ) {
my $subs = $elem->get_substitute_string();
my %mods = $elem->get_modifiers();
@@ -174,8 +179,10 @@ sub _is_topic_mutating_regex {
# is no version check.
if ( $elem->isa( 'PPI::Token::Regexp::Substitute' ) ) {
- my %mods = $elem->get_modifiers();
- $mods{r} and return;
+ my $re = $doc->ppix_regexp_from_element( $elem )
+ or return;
+ $re->modifier_asserted( 'r' )
+ and return;
}
# If the previous sibling does not exist, then
@@ -40,7 +40,7 @@ sub applies_to { return qw(PPI::Token::Regexp::Match
#-----------------------------------------------------------------------------
sub violates {
- my ( $self, $elem, undef ) = @_;
+ my ( $self, $elem, $doc ) = @_;
my $re = $elem->get_match_string();
@@ -55,8 +55,9 @@ sub violates {
# If it's a multiline match, then end-of-line anchors don't represent the whole string
if ($front_anchor eq q{^} || $end_anchor eq q{$}) {
- my %mods = $elem->get_modifiers();
- return if $mods{m};
+ my $regexp = $doc->ppix_regexp_from_element( $elem )
+ or return;
+ return if $regexp->modifier_asserted( 'm' );
}
# check for grouping and optional alternation. Grouping may or may not capture
@@ -83,8 +83,7 @@ sub violates {
# Look for references to the capture in the regex itself
return if _enough_uses_in_regexp( $re, \@captures, \%named_captures, $doc );
- my $mod = $re->modifier();
- if ($mod and $mod->asserts( 'g' )
+ if ( $re->modifier_asserted( 'g' )
and not _check_if_in_while_condition_or_block( $elem ) ) {
$ncaptures = $NUM_CAPTURES_FOR_GLOBAL;
$#captures = $ncaptures - 1;
@@ -36,12 +36,13 @@ sub applies_to { return qw<PPI::Token::Regexp::Match
#-----------------------------------------------------------------------------
sub violates {
- my ( $self, $elem, undef ) = @_;
+ my ( $self, $elem, $doc ) = @_;
+
+ my $re = $doc->ppix_regexp_from_element( $elem )
+ or return;
+ $re->modifier_asserted( 's' )
+ or return $self->violation( $DESC, $EXPL, $elem );
- my %modifiers = $elem->get_modifiers();
- if ( not $modifiers{s} ) {
- return $self->violation( $DESC, $EXPL, $elem );
- }
return; #ok!;
}
@@ -58,16 +58,16 @@ sub applies_to {
#-----------------------------------------------------------------------------
sub violates {
- my ( $self, $elem, undef ) = @_;
+ my ( $self, $elem, $doc ) = @_;
my $match = $elem->get_match_string();
return if length $match <= $self->{_minimum_regex_length_to_complain_about};
return if not $self->{_strict} and $match =~ m< \A [\s\w]* \z >xms;
- my %mods = $elem->get_modifiers();
- if ( not $mods{x} ) {
- return $self->violation( $DESC, $EXPL, $elem );
- }
+ my $re = $doc->ppix_regexp_from_element( $elem )
+ or return;
+ $re->modifier_asserted( 'x' )
+ or return $self->violation( $DESC, $EXPL, $elem );
return; # ok!;
}
@@ -35,12 +35,13 @@ sub applies_to { return qw(PPI::Token::Regexp::Match
#-----------------------------------------------------------------------------
sub violates {
- my ( $self, $elem, undef ) = @_;
+ my ( $self, $elem, $doc ) = @_;
+
+ my $re = $doc->ppix_regexp_from_element( $elem )
+ or return;
+ $re->modifier_asserted( 'm' )
+ or return $self->violation( $DESC, $EXPL, $elem );
- my %mods = $elem->get_modifiers();
- if ( ! $mods{m} ) {
- return $self->violation( $DESC, $EXPL, $elem );
- }
return; #ok!;
}
@@ -69,6 +69,47 @@ my $string =~ y{[A-Z]}{[a-z]};
my $string =~ tr/[A-Z]/[a-z]/cd;
my $string =~ y/[A-Z]/[a-z]/cd;
+#-----------------------------------------------------------------------------
+
+## name use re '/s' - RT #72151
+## failures 0
+## cut
+
+use re '/s';
+my $string =~ m{pattern.};
+
+#-----------------------------------------------------------------------------
+
+## name use re qw{ /s } - RT #72151
+## failures 0
+## cut
+
+use re qw{ /s };
+my $string =~ m{pattern.};
+
+#-----------------------------------------------------------------------------
+
+## name use re qw{ /s } not in scope - RT #72151
+## failures 1
+## cut
+
+{
+ use re qw{ /s };
+}
+my $string =~ m{pattern.};
+
+#-----------------------------------------------------------------------------
+
+## name no re qw{ /s } - RT #72151
+## failures 1
+## cut
+
+use re qw{ /smx };
+{
+ no re qw{ /s };
+ my $string =~ m{pattern.};
+}
+
##############################################################################
# $URL$
# $Date$
@@ -126,6 +126,47 @@ my $string =~ m/foobar/;
my $string =~ s/foobar/foo bar/;
#-----------------------------------------------------------------------------
+
+## name use re '/x' - RT #72151
+## failures 0
+## cut
+
+use re '/x';
+my $string =~ m{pattern.};
+
+#-----------------------------------------------------------------------------
+
+## name use re qw{ /x } - RT #72151
+## failures 0
+## cut
+
+use re qw{ /x };
+my $string =~ m{pattern.};
+
+#-----------------------------------------------------------------------------
+
+## name use re qw{ /x } not in scope - RT #72151
+## failures 1
+## cut
+
+{
+ use re qw{ /x };
+}
+my $string =~ m{pattern.};
+
+#-----------------------------------------------------------------------------
+
+## name no re qw{ /x } - RT #72151
+## failures 1
+## cut
+
+use re qw{ /smx };
+{
+ no re qw{ /x };
+ my $string =~ m{pattern.};
+}
+
+#-----------------------------------------------------------------------------
# Local Variables:
# mode: cperl
# cperl-indent-level: 4
Oops, something went wrong.

0 comments on commit 5b04e2c

Please sign in to comment.