Add SIMD accelerated multiple pattern search. #231

Merged (1 commit) on May 18, 2016

@BurntSushi
Member

BurntSushi commented May 15, 2016

This uses the "Teddy" algorithm, as learned from the Hyperscan regular
expression library.

This support is optional, subject to the following:

  1. A nightly compiler.
  2. Enabling the simd-accel feature.
  3. Adding RUSTFLAGS="-C target-feature=+ssse3" when compiling.
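
A minimal sketch of how code can tell whether requirement 3 was met for a given build. It relies only on the standard `cfg!(target_feature = "ssse3")` check; the function names and the `"teddy-ssse3"` / `"fallback"` labels are hypothetical and are not taken from this PR or from the regex crate's API.

```rust
/// Reports whether this build was compiled with SSSE3 enabled, for example
/// via RUSTFLAGS="-C target-feature=+ssse3".
fn ssse3_enabled_at_compile_time() -> bool {
    cfg!(target_feature = "ssse3")
}

/// Hypothetical selector (illustrative names only): use the SIMD searcher
/// when SSSE3 was enabled at compile time, otherwise fall back.
fn pick_searcher() -> &'static str {
    if ssse3_enabled_at_compile_time() {
        "teddy-ssse3"
    } else {
        "fallback"
    }
}

fn main() {
    println!("SSSE3 at compile time: {}", ssse3_enabled_at_compile_time());
    println!("searcher: {}", pick_searcher());
}
```

Built without the flag, this should report the fallback; built with `RUSTFLAGS="-C target-feature=+ssse3"`, it should report the SIMD path.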

@BurntSushi
Member Author

Note that this PR is blocked on a new release of simd making its way to crates.io. :-)

cc @huonw @alexcrichton

@killercup I may be able to carve out some interesting projects worth mentoring from this. I left quite a number of TODOs. Feel like learning SIMD? :-)

@BurntSushi
Member Author

Relevant benchmarks:

name                                   rust.master ns/iter   rust.simd ns/iter       diff ns/iter   diff %
sherlock::name_alt3                    1,153,246 (515 MB/s)  187,304 (3,176 MB/s)        -965,942  -83.76%
sherlock::name_alt4_nocase             1,223,618 (486 MB/s)  293,523 (2,026 MB/s)        -930,095  -76.01%
sherlock::name_alt5                    319,736 (1,860 MB/s)  182,599 (3,258 MB/s)        -137,137  -42.89%
sherlock::name_alt5_nocase             1,223,311 (486 MB/s)  726,282 (819 MB/s)          -497,029  -40.63%
sherlock::name_holmes_nocase           1,108,772 (536 MB/s)  258,606 (2,300 MB/s)        -850,166  -76.68%
sherlock::name_sherlock_holmes_nocase  1,159,518 (513 MB/s)  239,155 (2,487 MB/s)        -920,363  -79.37%
sherlock::name_sherlock_nocase         1,160,342 (512 MB/s)  235,768 (2,523 MB/s)        -924,574  -79.68%
sherlock::the_nocase                   1,643,616 (361 MB/s)  461,669 (1,288 MB/s)      -1,181,947  -71.91%
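
For readers checking the table, the two diff columns can be reproduced from the ns/iter columns. This is a small sketch with hypothetical helper names, assuming diff is "new minus old" and diff % is relative to the old timing; the inputs are the `sherlock::name_alt3` row above.

```rust
/// Difference in ns/iter (new minus old); a negative value means faster.
fn diff_ns(old_ns: i64, new_ns: i64) -> i64 {
    new_ns - old_ns
}

/// Relative change as a percentage of the old timing.
fn diff_percent(old_ns: i64, new_ns: i64) -> f64 {
    diff_ns(old_ns, new_ns) as f64 / old_ns as f64 * 100.0
}

fn main() {
    // sherlock::name_alt3: rust.master vs rust.simd, in ns/iter.
    let (old_ns, new_ns) = (1_153_246, 187_304);
    // prints: -965942 ns/iter, -83.76%
    println!(
        "{} ns/iter, {:.2}%",
        diff_ns(old_ns, new_ns),
        diff_percent(old_ns, new_ns)
    );
}
```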

@killercup
Member

Cool! Thank you for thinking of me! I'll have a look at this :)

@killercup
Member

killercup commented May 16, 2016

Fiddling with this a bit, I noticed (aside from the stuff I did in #232) that there seems to be a genuine underflow exposed by the fowler::match_basic_81 test case: in src/simd_accel/teddy128.rs:573, you call verify_128 with pos - 2, but pos is a usize and doesn't appear to be guaranteed to be at least 2 here.

Edit: I added a simple check to prevent the underflow in #232.
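
To illustrate the kind of guard meant here: subtracting from a `usize` panics in debug builds and wraps around in release builds when the result would be negative. The sketch below uses `usize::checked_sub` from the standard library; `verify_start` is a hypothetical helper, not the actual code in teddy128.rs or the check added in #232.

```rust
/// Hypothetical helper: computes the offset `back` bytes before `pos`,
/// returning None instead of underflowing when `pos < back`.
fn verify_start(pos: usize, back: usize) -> Option<usize> {
    pos.checked_sub(back)
}

fn main() {
    // Enough room before `pos`: the subtraction is safe.
    assert_eq!(verify_start(5, 2), Some(3));
    // `pos` is smaller than 2: a plain `pos - 2` would underflow here.
    assert_eq!(verify_start(1, 2), None);
    // Alternative when clamping to the start of the haystack is acceptable.
    assert_eq!(1usize.saturating_sub(2), 0);
}
```

Whether to skip the candidate (`checked_sub`) or clamp to zero (`saturating_sub`) depends on what the caller expects at the start of the haystack.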

@BurntSushi
Member Author

@killercup Thanks! I've fixed that in this PR. (I didn't see your edit.)

@BurntSushi BurntSushi force-pushed the simd-teddy branch 6 times, most recently from 36faa2c to 6f2bb0f on May 18, 2016 14:24
This uses the "Teddy" algorithm, as learned from the Hyperscan regular
expression library: https://01.org/hyperscan

This support is optional, subject to the following:

1. A nightly compiler.
2. Enabling the `simd-accel` feature.
3. Adding `RUSTFLAGS="-C target-feature=+ssse3"` when compiling.
@BurntSushi
Member Author

@llogiq FYI, this PR impacts the regex-dna benchmark. It makes the multithreaded version slightly faster (0.69s down to 0.63s on my system), but it makes the single-threaded version twice as fast (2.55s down to 1.23s). We'll have to wait for simd on stable to get this into the benchmark game though!
