@Veetaha Veetaha commented Jun 27, 2020

No description provided.


fn find_cr(src: &[u8]) -> Option<usize> {
src.iter().enumerate().find_map(|(idx, &b)| if b == b'\r' { Some(idx) } else { None })
src.iter().zip(src.iter().skip(1)).position(|it| it == (&b'\r', &b'\n'))
Contributor

I think the original idea here was that find_cr should auto-vectorise easily (call into memchr, really), but apparently that's not the case :-(

https://godbolt.org/z/JH8yaK

For max performance, we should pull memchr from crates.io here, but we don't need max performance now

bors r+
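
For reference, a minimal std-only sketch of the two searches discussed in this thread. `find_crlf` is a hypothetical helper name used only for illustration, and whether either function is vectorised depends on the compiler version (the godbolt link above is the evidence for the original code, not for this sketch).

```rust
/// Index of the first `\r` byte, if any.
/// Same result as the `enumerate().find_map(..)` form, written with `position`.
fn find_cr(src: &[u8]) -> Option<usize> {
    src.iter().position(|&b| b == b'\r')
}

/// Index of the first `\r\n` pair, if any (hypothetical helper name).
/// `windows(2)` yields every adjacent byte pair, like the `zip`/`skip(1)` form.
fn find_crlf(src: &[u8]) -> Option<usize> {
    src.windows(2).position(|w| w == b"\r\n")
}
```

Neither version calls memchr explicitly; pulling the `memchr` crate, as suggested above, would be the way to guarantee that.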

Contributor

https://godbolt.org/z/2j3urJ -- using a &str gives us memchr, as that is hard-coded in the stdlib.
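
A small sketch of the `&str` route mentioned above, assuming the input is already valid UTF-8. The function names are hypothetical; `str::find` returns a byte offset, so the result is directly comparable to an index into the underlying bytes.

```rust
/// Byte offset of the first `\r` in a string slice, if any.
/// Per the comment above, a `char` search on `&str` is what ends up in memchr.
fn find_cr_str(src: &str) -> Option<usize> {
    src.find('\r')
}

/// Byte offset of the first `\r\n` in a string slice, if any.
fn find_crlf_str(src: &str) -> Option<usize> {
    src.find("\r\n")
}
```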

Contributor Author

I suppose the slice iterator should override the default implementations of find/position and friends, but (I didn't check) it most probably uses the default impl with try_fold() trickery under the hood, which rustc is not powerful enough to optimize away...
Btw, this function is a heuristic, since I suppose there might be wide Unicode characters whose encoded bytes include the byte value of \r
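
A quick way to test the concern above, as a sketch with hypothetical helper names. It assumes the bytes are UTF-8, where every byte of a multi-byte sequence has its high bit set, so the exhaustive check is expected to find no character other than `\r` that contains `0x0D`. The second helper shows that the concern does apply to UTF-16, where for example U+0D0A encodes as the bytes `0D 0A`.

```rust
/// True if any char other than '\r' has 0x0D among its UTF-8 bytes.
/// Exhaustive over all Unicode scalar values.
fn cr_byte_inside_other_char() -> bool {
    (0..=0x10FFFFu32)
        .filter_map(char::from_u32) // skips surrogates, which are not chars
        .filter(|&c| c != '\r')
        .any(|c| {
            let mut buf = [0u8; 4];
            c.encode_utf8(&mut buf).as_bytes().contains(&b'\r')
        })
}

/// Big-endian UTF-16 bytes of a char, to show the contrast with UTF-8.
fn utf16_be_bytes(c: char) -> Vec<u8> {
    let mut buf = [0u16; 2];
    c.encode_utf16(&mut buf)
        .iter()
        .flat_map(|unit| unit.to_be_bytes())
        .collect()
}
```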

bors bot commented Jun 28, 2020

@bors bors bot merged commit 0e0fb81 into rust-lang:master Jun 28, 2020
matklad commented Jun 28, 2020 via email
