Use memchr for performance #3067

Merged
matz merged 1 commit into mruby:master from ksss:use-memchr on Dec 31, 2015

@ksss
Contributor

ksss commented Dec 31, 2015

Use memchr for performance.

This code was copied from https://github.com/ruby/ruby/blob/2f2a5c3ae9f9e7a4d09b1500de8d864f48d69cad/re.c#L264-L269

benchmark

```
$ uname -v
Darwin Kernel Version 15.2.0: Fri Nov 13 19:56:56 PST 2015; root:xnu-3248.20.55~2/RELEASE_X86_64
```

```ruby
s = "b"
str = ("a" * 100 + s)

t = Time.now
str.index(s)
puts Time.now - t
```

before => 0.000788
after  => 0.000508

---

```ruby
s = "b"
str = ("a" * 100 * 1024 * 1024 + s)

t = Time.now
str.index(s)
puts Time.now - t
```

before => 0.225474
after  => 0.008658
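
A single `Time.now` sample on a 101-byte string is easily dominated by noise, so the small-string numbers above are hard to compare. The following is only a sketch of a steadier measurement, not what was used in this PR: `best_time` is a hypothetical helper, and the run counts (5 runs of 1000 calls) are arbitrary assumptions.

```ruby
# Hypothetical helper (not part of mruby or this PR): run the block `runs`
# times and return the fastest wall-clock time in seconds.
def best_time(runs)
  best = nil
  runs.times do
    t = Time.now
    yield
    elapsed = Time.now - t
    best = elapsed if best.nil? || elapsed < best
  end
  best
end

s = "b"
str = ("a" * 100 + s)

result = nil
# Repeat the call many times per run so the measured interval is not tiny.
elapsed = best_time(5) { 1000.times { result = str.index(s) } }
puts "index=#{result}, best of 5 runs (1000 calls each): #{elapsed}"
```

Taking the minimum over several runs tends to filter out one-off delays; absolute numbers will still vary by machine and build.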

matz added a commit that referenced this pull request Dec 31, 2015

@matz matz merged commit 31b8469 into mruby:master Dec 31, 2015

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed

@ksss ksss deleted the ksss:use-memchr branch Dec 31, 2015

@mattn

Contributor

mattn commented Jan 5, 2016

@matz This should work well because it is a binary (byte-wise) comparison. Do you expect mruby-string-ext to work with multi-byte strings? For example, "あ".index("\x81") should be 0 if the string is UTF-8.
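
To make the example concrete, here is a small sketch of the byte layout involved. It assumes CRuby (for `String#b`) and a UTF-8 source file; it illustrates only the byte-wise view and is not mruby's implementation.

```ruby
text = "あ"

# UTF-8 encodes this character as three bytes.
hex_bytes = text.bytes.map { |b| b.to_s(16) }
puts hex_bytes.inspect  # => ["e3", "81", "82"]

# A purely byte-wise search (done here on binary copies) finds 0x81 at
# byte offset 1, which lies in the middle of the character.
byte_offset = text.b.index("\x81".b)
puts byte_offset        # => 1
```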

matz added a commit that referenced this pull request Jan 5, 2016

bytes2chars() conversion to fail if target byte offset is not on the character boundary; ref #3067

That means String#index matches the first byte of a multi-byte character. This behavior differs from CRuby, but it is a compromise for mruby, which has no encoding support.
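
The behavior described in this commit message can be modeled roughly in Ruby. This is a sketch under assumptions, not mruby's C code: `bytes_to_chars` and `bytewise_index` are hypothetical names, the model relies on CRuby's `String#b`, and it looks only at the first byte-wise match.

```ruby
# Hypothetical Ruby model of the conversion; mruby's real bytes2chars() is C.
# Returns the character index for byte_offset, or nil when the offset falls
# inside a multi-byte character.
def bytes_to_chars(str, byte_offset)
  pos = 0
  str.each_char.with_index do |ch, i|
    return i if pos == byte_offset
    return nil if pos > byte_offset
    pos += ch.bytesize
  end
  pos == byte_offset ? str.length : nil
end

# Hypothetical byte-wise search followed by the boundary check.
# Only the first byte-wise match is considered in this model.
def bytewise_index(str, sub)
  byte_offset = str.b.index(sub.b)
  byte_offset && bytes_to_chars(str, byte_offset)
end

puts bytewise_index("あ", "\xE3").inspect  # => 0   (first byte of the character)
puts bytewise_index("あ", "\x81").inspect  # => nil (offset 1 is mid-character)
puts bytewise_index("aあ", "あ").inspect   # => 1
```

In this model a match on the leading byte of a character succeeds, while a match that starts mid-character is rejected, which mirrors the compromise stated in the commit message.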