Skip to content

Commit

Permalink
Browse files Browse the repository at this point in the history
Forbid Bare C<\b> in Regexes
Implementation to close RT#128986
  • Loading branch information
zoffixznet committed Aug 24, 2016
1 parent ce29255 commit b14828b
Showing 1 changed file with 38 additions and 0 deletions.
38 changes: 38 additions & 0 deletions v6d.pod
Expand Up @@ -127,4 +127,42 @@ a variable that defaults to C<42>.

Zoffix Znet


=head1 Forbid Bare C<\b> in Regexes

=head2 Current Behaviour

The backspace character is silently attempted to be matched:

$ perl6 -e 'say "foo" ~~ /"foo" \b/'
Nil
$ perl6 -e 'say "foo\b" ~~ /"foo" \b/'
「fo」

=head2 New Behaviour

Attempting to use bare C<\b> in regexes will throw:

$ perl6 -e 'say "foo" ~~ /"foo" \b/'
==SORRY==
Cannot use bare \b in regexes. For literal backspace use "\b",
for word boundaries use << or >> for left/right boundaries, respectively

$ perl6 -e 'say "foo\b" ~~ /"foo" "\b"/'
「fo」

=head2 Rationale

It's very easy to accidentally use C<\b> for word boundaries, since that's
the syntax in Perl/GNU/.NET regex. It's also uncommon to want to match
a literal C<\b>, so requiring explicit quotes (or use of \c8 or \x8),
is justifiable and won't be a burden.

Implementing this will resolve
L<RT#128986|https://rt.perl.org/Ticket/Display.html?id=128986>

=head2 Steakholder

Zoffix Znet

=cut

8 comments on commit b14828b

@pmichaud
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed? S05 already says that \b is forbidden... the problem appears to be that Rakudo isn't implementing it properly.

https://github.com/perl6/specs/blame/master/S05-regex.pod#L2269

@coke
Copy link
Contributor

@coke coke commented on b14828b Aug 25, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pm, what's the objection to marking it as something for 6.d? That we could slip the change into a 6.c versioned compiler without waiting for a spec version bump?

@pmichaud
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. To my thinking... unless there's a test in roast (v 6.c) that expects \b to match literal 'b' (I didn't find one), there ought to be no problem with adding the exception to Rakudo now.

@pmichaud
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I'm having difficulty understanding the meaning of "This can't be added in 6.c" in RT #128986. Perhaps I should reply there.

@zoffixznet
Copy link
Contributor Author

@zoffixznet zoffixznet commented on b14828b Aug 25, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I'm having difficulty understanding the meaning of "This can't be added in 6.c" in RT #128986.

My reasoning was this is a breaking change to the set of values used by a feature. Thinking from a user's perspective who used a bare \b to match a backspace, after this change their code will explode and if they ask why, our response would be bare \b is not in the roast. Thereby, we say that users who want stability must check the roast not only for specific methods and features, but also for the values those features accept. <after "♥"> works right now, but the ♥ is not in the spec. Can I still use it safely?

@pmichaud
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I replied to the RT ticket, but to summarize here: Yes, the answer is "bare \b is not in the roast". In this particular instance it's even a little stronger than that, because we can also point to "S05 says that \b is illegal, and has done so for some time."

Most importantly, I think we really really really want to avoid saying that an implementation's specific behavior is an implied part of the language specification.

@zoffixznet
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the section in 151d791

Commented on the ticket, saying it can be fixed right now.

@coke
Copy link
Contributor

@coke coke commented on b14828b Aug 25, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think pointing to S05 makes the argument stronger - if the behavior is tested in roast, we should not change it; if it's not, we can consider it. It's already hard enough for a user to know what the spec is (by searching the tests in roast) - implying that items in the Syn can also inform the spec is too much, IMO, and against what we've already decided.

I think that we DO want to have a discussion about behavior that may change how code works in the wild, even if, by the strict consideration of my previous sentence, we could do it. That said, I've no strong opinions about whether this change is too much, so I've no issues with it going into 6.c

... but on the gripping hand; we can put it into rakudo, but put the test for it on roast/master, and then even though it's in the implementation, it's still not officially 6.c. (confusing)

Please sign in to comment.