Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use memchr to speed up [u8]::contains 3x #46713

Merged
merged 4 commits into from Dec 31, 2017
Merged

Conversation

Manishearth
Copy link
Member

No description provided.

@Manishearth
Copy link
Member Author

Manishearth commented Dec 13, 2017

related: #46704

#![feature(test)]

extern crate test;
use test::Bencher;

#[bench]
fn find_byte(b: &mut Bencher) {
    let x = test::black_box("Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.").as_bytes();
    b.iter(|| test::black_box(x.contains(&0xff)));
}

#[bench]
fn find_byte_old(b: &mut Bencher) {
    let x = test::black_box("Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.").as_bytes();
    b.iter(|| test::black_box(x.iter().any(|x| *x == 0xff)));
}
running 2 tests
test find_byte  ... bench:          58 ns/iter (+/- 26)
test find_byte_old ... bench:         159 ns/iter (+/- 29)

test result: ok. 0 passed; 0 failed; 0 ignored; 2 measured; 0 filtered out

Apparently our memchr implementation is actually slower than glibc, but @mystor is looking into that.

I also get similar numbers if I implement find_byte_old as something using [U8]::contains where U8 is struct U8(u8); so that it forces the fallback contains()

@Manishearth
Copy link
Member Author

I'll next be working to speed up #46693

@Manishearth
Copy link
Member Author

r? @bluss
cc @alexcrichton

@Manishearth Manishearth changed the title Use memchr to speed up [u8]::contains 2x Use memchr to speed up [u8]::contains 3x Dec 13, 2017
x.wrapping_sub(LO_USIZE) & !x & HI_USIZE != 0
}

#[cfg(target_pointer_width = "32")]
Copy link

@ghost ghost Dec 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this going to work on 16-bit platforms as well? I'm asking because I already broke rustc once by neglecting those. :) #40832

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. It looks like it should work. But idk.

@kennytm kennytm added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Dec 13, 2017
@Manishearth
Copy link
Member Author

This PR is contained in #46735 (but feel free to land separately, this is a simpler change)

@nagisa
Copy link
Member

nagisa commented Dec 15, 2017

How implausible it would be to start depending on memchr crate rather than including it verbatim, now that we have ability to depend on crates?

@hanna-kruppe
Copy link
Contributor

If I'm up to date, there are no third party crates pulled in by libstd. All the crates.io dependencies are in compiler crates and tools. Not sure if there's any reason for what or if it just never came up before.

@Manishearth
Copy link
Member Author

Manishearth commented Dec 15, 2017 via email

@@ -12,4 +12,4 @@
// Copyright 2015 Andrew Gallant, bluss and Nicolas Koch

// Fallback memchr is fastest on windows
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Side note, this statement (Fallback memchr is fastest on windows) can be revisited by someone with the right development environment.

fn slice_contains(&self, x: &[Self]) -> bool {
memchr::memchr(*self, x).is_some()
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could add for i8 here too. Reinterpreting &[i8] as &[u8] is problem free

@bluss
Copy link
Member

bluss commented Dec 18, 2017

I'm glad to hear that the loopiness of Rust's bootstrapping does have a limit :-)

This PR looks good to me. The drawback is to add an unstable API (memchr function) to core with no intention of stabilizing it, that's normally not so popular.

@Manishearth
Copy link
Member Author

Manishearth commented Dec 18, 2017

This PR looks good to me. The drawback is to add an unstable API (memchr function) to core with no intention of stabilizing it, that's normally not so popular.

We already have a bunch of these APIs, like char_internals, whose sole raison d'être is that core needs that code but std needs to use it too.

Can I land this, r=you? This PR was split out so that it's easy to land independently if we want to.

@Manishearth
Copy link
Member Author

Oh, hold on, adding an i8 impl

@Manishearth
Copy link
Member Author

Done. r?

bors added a commit that referenced this pull request Dec 21, 2017
Use memchr for str::find(char)

This is a 10x improvement for searching for characters.

This also contains the patches from #46713 . Feel free to land both separately or together.

cc @mystor @alexcrichton

r? @bluss

fixes #46693
Copy link
Member

@bluss bluss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for being so slow :/ The review comments are about that transmute is a very convenient interface, but we prefer everything but transmute.


impl SliceContains for i8 {
fn slice_contains(&self, x: &[Self]) -> bool {
let byte: u8 = unsafe { mem::transmute::<i8, u8>(*self) };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be a cast with as.

impl SliceContains for i8 {
fn slice_contains(&self, x: &[Self]) -> bool {
let byte: u8 = unsafe { mem::transmute::<i8, u8>(*self) };
let bytes: &[u8] = unsafe { mem::transmute::<&[i8], &[u8]>(x) };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This with slice::from_raw_parts

@Manishearth
Copy link
Member Author

cast with as is wrong, that is a semantic cast not a bitwise cast, and we need a bitwise cast for the search to happen of the same bits.

@hanna-kruppe
Copy link
Contributor

hanna-kruppe commented Dec 31, 2017

as casts don't do overflow checks, they wrap, so they effectively reinterpret the bits when casting between uN and iN.

@Manishearth
Copy link
Member Author

Manishearth commented Dec 31, 2017 via email

@hanna-kruppe
Copy link
Contributor

UB?!? No.

@Manishearth
Copy link
Member Author

Manishearth commented Dec 31, 2017 via email

@bluss
Copy link
Member

bluss commented Dec 31, 2017

In Rust the u8 to i8 cast using as is equivalent to the transmute in the sense that the input and output are bitwise equivalent.

@Manishearth
Copy link
Member Author

Manishearth commented Dec 31, 2017 via email

@nagisa
Copy link
Member

nagisa commented Dec 31, 2017

as casts may only be UB for float-to-integer and integer-to-float involving numbers of certain magnitudes. Casting integers of the same bit-width is a bitwise-reinterpretation. Casting to a wider integer will do either sign or zero extension depending on signedness of the source integer. Casting to a less wide integer will truncate the MSB bits.

@Manishearth
Copy link
Member Author

Updated

@bluss
Copy link
Member

bluss commented Dec 31, 2017

@bors r+

@bors
Copy link
Contributor

bors commented Dec 31, 2017

📌 Commit 4ef6847 has been approved by bluss

@kennytm kennytm added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 31, 2017
@bors
Copy link
Contributor

bors commented Dec 31, 2017

⌛ Testing commit 4ef6847 with merge 8c59418...

bors added a commit that referenced this pull request Dec 31, 2017
Use memchr to speed up [u8]::contains 3x

None
@bors
Copy link
Contributor

bors commented Dec 31, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: bluss
Pushing 8c59418 to master...

@bors bors merged commit 4ef6847 into rust-lang:master Dec 31, 2017
@Manishearth Manishearth deleted the memchr branch January 1, 2018 04:34
bors added a commit that referenced this pull request Jan 1, 2018
Use memchr for str::find(char)

This is a 10x improvement for searching for characters.

This also contains the patches from #46713 . Feel free to land both separately or together.

cc @mystor @alexcrichton

r? @bluss

fixes #46693
@avl
Copy link

avl commented Jan 4, 2018

Is this, which was mentioned above: "Casting to a less wide integer will truncate the LSB bits." really true? Isn't it the MSB which are truncated? 257u32 as u8 is 1u8, right?

@nagisa
Copy link
Member

nagisa commented Jan 4, 2018

Ah, indeed. Overworked me switches the terms. Happens all the time.

Edited the comment to reflect the reality for future prosperity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants