Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change single char str patterns to chars #52646

Merged
merged 1 commit into from
Jul 24, 2018

Conversation

ljedrz
Copy link
Contributor

@ljedrz ljedrz commented Jul 23, 2018

A char is faster.

@rust-highfive
Copy link
Collaborator

r? @michaelwoerister

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 23, 2018
@petrochenkov
Copy link
Contributor

A char is faster

I wouldn't be so sure, see #41993
str is "search bytes in bytes", while char is "convert char to bytes, then search bytes in bytes".

@ljedrz
Copy link
Contributor Author

ljedrz commented Jul 23, 2018

Oh, ok; in that case I guess this might call for an update in clippy, as its single_char_pattern lint suggests that it's the other way round.

cc @oli-obk

@ljedrz
Copy link
Contributor Author

ljedrz commented Jul 23, 2018

@petrochenkov I just did a comparison using godbolt and the char variant produces half the str assembly; perhaps that issue is limited to starts_with? I'm not an expert, but this is confusing.

@killercup
Copy link
Member

A char is faster.

I don't see any benchmarks? :)

@ljedrz
Copy link
Contributor Author

ljedrz commented Jul 23, 2018

@killercup see the related clippy lint.

@kennytm
Copy link
Member

kennytm commented Jul 23, 2018

@petrochenkov #41993 only concerns about .starts_with(). The split() algorithm for char uses memchr() for a char and is way more efficient than that for a &str (which degenerates to naive search).

@killercup
Copy link
Member

killercup commented Jul 23, 2018

@ljedrz a piece of documentation is still no benchmark :)

Seriously, if we don't have one, we should totally write one and add it. This is easily decidable once we have the facts.

Edit: Maybe we can use the ones @kennytm wrote last year:
#41993 (comment). The most significant change to land since then was probably #46735. And if nothing else, we can link the results in the clippy docs.

Edit 2: We are currently linking to libc::memchr on most platforms. Adding a SIMD-aware Rust version might allow for better inlining. (But again, without benchmarks this is just speculation.)

@kennytm
Copy link
Member

kennytm commented Jul 23, 2018

Since I just happened to have benchmarks from rust-lang/rfcs#2500, here's the relevant parts comparing .starts_with() and .split():

.starts_with()
test pat3_starts_with_ascii_char::long_lorem_ipsum      ... bench:         174 ns/iter (+/- 12)
test pat3_starts_with_ascii_char::short_ascii           ... bench:         175 ns/iter (+/- 21)
test pat3_starts_with_ascii_char::short_mixed           ... bench:         174 ns/iter (+/- 9)
test pat3_starts_with_ascii_char::short_pile_of_poo     ... bench:         175 ns/iter (+/- 26)
test pat3_starts_with_ascii_string::long_lorem_ipsum    ... bench:         136 ns/iter (+/- 20)
test pat3_starts_with_ascii_string::short_ascii         ... bench:         138 ns/iter (+/- 26)
test pat3_starts_with_ascii_string::short_mixed         ... bench:         137 ns/iter (+/- 13)
test pat3_starts_with_ascii_string::short_pile_of_poo   ... bench:         136 ns/iter (+/- 15)
test std_starts_with_ascii_char::long_lorem_ipsum       ... bench:         339 ns/iter (+/- 29)
test std_starts_with_ascii_char::short_ascii            ... bench:         338 ns/iter (+/- 21)
test std_starts_with_ascii_char::short_mixed            ... bench:         844 ns/iter (+/- 42)
test std_starts_with_ascii_char::short_pile_of_poo      ... bench:         930 ns/iter (+/- 91)
test std_starts_with_ascii_string::long_lorem_ipsum     ... bench:         306 ns/iter (+/- 41)
test std_starts_with_ascii_string::short_ascii          ... bench:         307 ns/iter (+/- 63)
test std_starts_with_ascii_string::short_mixed          ... bench:         252 ns/iter (+/- 122)
test std_starts_with_ascii_string::short_pile_of_poo    ... bench:         222 ns/iter (+/- 193)

Speed ups:

Method Long ASCII Short ASCII Short Mixed Short Unicode
.starts_with('/') from libstd (baseline) 0% 0% 0% 0%
.starts_with("/") from libstd -10% -9% -70% -76%
.starts_with('/') from RFC 2500 -49% -48% -79% -81%
.starts_with("/") from RFC 2500 -60% -59% -84% -85%
.split()
test pat3_split_a_char::long_lorem_ipsum                ... bench:       3,084 ns/iter (+/- 517)
test pat3_split_a_char::short_ascii                     ... bench:         158 ns/iter (+/- 15)
test pat3_split_a_char::short_mixed                     ... bench:         102 ns/iter (+/- 27)
test pat3_split_a_char::short_pile_of_poo               ... bench:          13 ns/iter (+/- 0)
test pat3_split_a_str::long_lorem_ipsum                 ... bench:       5,592 ns/iter (+/- 692)
test pat3_split_a_str::short_ascii                      ... bench:         180 ns/iter (+/- 23)
test pat3_split_a_str::short_mixed                      ... bench:         160 ns/iter (+/- 17)
test pat3_split_a_str::short_pile_of_poo                ... bench:         109 ns/iter (+/- 9)
test std_split_a_char::long_lorem_ipsum                 ... bench:       3,003 ns/iter (+/- 231)
test std_split_a_char::short_ascii                      ... bench:         159 ns/iter (+/- 11)
test std_split_a_char::short_mixed                      ... bench:         108 ns/iter (+/- 2)
test std_split_a_char::short_pile_of_poo                ... bench:          23 ns/iter (+/- 0)
test std_split_a_str::long_lorem_ipsum                  ... bench:       6,176 ns/iter (+/- 693)
test std_split_a_str::short_ascii                       ... bench:         194 ns/iter (+/- 7)
test std_split_a_str::short_mixed                       ... bench:         170 ns/iter (+/- 17)
test std_split_a_str::short_pile_of_poo                 ... bench:         109 ns/iter (+/- 16)

Speed ups:

Method Long ASCII Short ASCII Short Mixed Short Unicode
.split('a').count() from libstd (baseline) 0% 0% 0% 0%
.split("a").count() from libstd +106% +22% +57% +374%
.split('a').count() from RFC 2500 +3% -1% -6% -43%
.split("a").count() from RFC 2500 +86% +13% +48% +374%

While .starts_with('/') is much slower than .starts_with("/"), we see that the situation is reversed that .split('a').count() is much faster than .split("a").count(). This is because we did not optimize for single-byte string searching at all, but we do optimize for ASCII char searching.

@michaelwoerister
Copy link
Member

Thanks for the PR, @ljedrz!

@kennytm's numbers suggest that this should be a win (although none of the modified code is really on the critical path).

@bors r+

@bors
Copy link
Contributor

bors commented Jul 24, 2018

📌 Commit 49c8ba9 has been approved by michaelwoerister

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 24, 2018
@bors
Copy link
Contributor

bors commented Jul 24, 2018

⌛ Testing commit 49c8ba9 with merge a2af866...

bors added a commit that referenced this pull request Jul 24, 2018
Change single char str patterns to chars

A `char` is faster.
@bors
Copy link
Contributor

bors commented Jul 24, 2018

☀️ Test successful - status-appveyor, status-travis
Approved by: michaelwoerister
Pushing a2af866 to master...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants