Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Work String#starts_with? with regex #5485

Merged

Conversation

makenowjust
Copy link
Contributor

This feature is imported from Ruby 2.5.0.

See: https://bugs.ruby-lang.org/issues/13712

This PR does not implement String#ends_with? for Regex because Ruby does not and it is hard to implement. You should see above link and ruby's discussion.

@ysbaddaden
Copy link
Contributor

What's the benefit over specifying ^ or \A in the regular expression?

"foobar" =~ /^foo/

@makenowjust
Copy link
Contributor Author

Readability.

@@ -3969,6 +3969,10 @@ class String
false
end

def starts_with?(re : Regex)
!!($~ = re.match_at_byte_index(self, 0, Regex::Options::ANCHORED))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a fairly confusing definition, how about:

def starts_with?(re : Regex)
  match_data = $~ = re.match_at_byte_index(self, 0, Regex::Options::ANCHORED)

  match_data != nil
end

Copy link
Contributor Author

@makenowjust makenowjust Dec 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your code has another confusability. It seems match_data is redundant and last statement can be replaced with $~ != nil, unfortunately it cannot. It is hard to explain this reason, so I cannot accept your suggestion. Please approve.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just think that this way is easier to understand. Perhaps:

match_data = re.match_at_byte_index(self, 0, Regex::Options::ANCHORED)

$~ = match_data
match_data != nil

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is such a small method that I think the original code is fine.

@ysbaddaden
Copy link
Contributor

IMHO this is redundant and we'd expect ends_with? to also accept a regular expression, but it won't :/

My vote is "not fond but do whatever".

@makenowjust
Copy link
Contributor Author

/^foo|bar/ =~ "foobar" is not same as /^(?:foo|bar)/ =~ "foobar". A regex beginner causes such a mistaken sometimes. It is prevented when String#starts_with? accept regex. I think it is valuable enough.

@asterite
Copy link
Member

asterite commented Jan 1, 2018

I think the asymmetry with ends_with is a good reason no to include this. It can be easily achieved with ^ in front of a regex.

@makenowjust
Copy link
Contributor Author

For example, there is a method requiring a prefix:

def on_prefix(prefix)
  if gets.starts_with? prefix
    # process on given prefix is here.
  end
end

To specify two or more prefixes to this, we have to fix this method. However we don't need to fix when starts_with? accepts regex.

And now, I fixed ends_with? accepts regex. Yeah, it's perfect!

@asterite
Copy link
Member

asterite commented Jan 2, 2018

I agree with the use case. But the implementation of ends_with?(Regex) will create a new regex each time it's called. Not very efficient.

I don't know, maybe it's not that bad just having starts_with?(Regex), specially if Ruby implemented this. Probably others should decide if this belongs in the std.

@makenowjust
Copy link
Contributor Author

IMO starts_with?(Regex) is useful and it prevents a bug from regex, so I don't let go due to asymmetry.

It is really hard to implement ends_with?(Regex) efficiently. I think current implementation is the only way.

@@ -3991,6 +3995,10 @@ class String
true
end

def ends_with?(re : Regex)
!!($~ = /#{re}\z/.match(self))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this require /(?:#{re})\z/?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needless, re.to_s yields enclosed string.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, cool.

@makenowjust
Copy link
Contributor Author

ping

Copy link
Member

@jhass jhass left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do this, @asterite's own comment pointed out a good reason for this, the common confusion between ^ and \A.

@RX14 RX14 added this to the Next milestone Apr 9, 2018
@RX14 RX14 merged commit 4f4240b into crystal-lang:master Apr 9, 2018
@makenowjust makenowjust deleted the feature/string/starts_with_with_regex branch April 9, 2018 10:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants