Skip to content

Conversation

@makenowjust
Copy link
Contributor

For example, Ruby's String#scan is:

"hello world".scan(/\w+|(?= )/) # => ["hello", "", "world"]

But Crystal's is:

"hello world".scan(/\w+|(?= )/).map &.[0] # => ["hello", ""]

This pull request fixes it by continuing to scan when match is empty.

For example, Ruby's String#scan is:

    "hello world".scan(/\w+|(?= )/) # => ["hello", "", "world"]

But Crystal's is:

    "hello world".scan(/\w+|(?= )/).map &.[0] # => ["hello", ""]

This commit fixes it by continuing to scan when match is empty.
@bcardiff bcardiff added kind:bug A bug in the code. Does not apply to documentation, specs, etc. topic:stdlib labels Jan 12, 2017
@bcardiff
Copy link
Member

The change seems good.

I wanted to see if fixing this could cause some inconsistency.
This PR changes thing introduced in 20c8570 . From there I though that checking the behavior of String#split was a good idea. The change from that method has changed in master since 20c857 so the same patch does not longer work.

Currently

In Ruby

"hello world".split(/\w+|(?= )/) # => ["", " "]

In Crystal

"hello world".split(/\w+|(?= )/) # => ["", "", " ", ""]

So, split might be worth of changing as well. Since one is the dual of the other I would vote for changing them at the same time.

@makenowjust
Copy link
Contributor Author

makenowjust commented Jan 13, 2017

@bcardiff I fixed String#split behavior, now it works like Ruby and other languages (i.e. JavaScript). But I didn't implement positive and negative limit feature in Ruby because I am always confusing this behavior. (And I specify -1 to limit every time.) Reference

@bcardiff
Copy link
Member

The split sample:

assert { "hello world".split(/\w+|(?= )/).should eq(["", " ", ""]) }

differs from Ruby but matches Javascript. I like that result more than the one I extract from Ruby in #3877 (comment).

LGTM

@bcardiff bcardiff merged commit ce1b6d6 into crystal-lang:master Jan 16, 2017
@bcardiff bcardiff added this to the Next milestone Jan 16, 2017
@makenowjust makenowjust deleted the fix/string/scan-regex-behavior branch January 17, 2017 00:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kind:bug A bug in the code. Does not apply to documentation, specs, etc. topic:stdlib

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants