Counting occurrences of a given byte or UTF-8 characters in a slice of memory – fast
llogiq [Experimental] Convert benchmarks to criterion (#38)
convert benchmarks to use criterion

This moves the benchmarks to criterion, which offers a much nicer API. The user can now use a `COUNTS` env var with comma-separated integer values to set the benchmark byte slice sizes.

We also reduce benchmark cases to avoid CI timeout (because criterion is quite a bit heavier than bencher by default), only build release and reduce iterations, warmup & measurement time if the `CI` env var is set.
Latest commit 731280f Apr 11, 2018

README.md

bytecount

Counting bytes really fast

License: Apache 2.0/MIT

This uses the "hyperscreamingcount" algorithm by Joshua Landau to count bytes faster than anything else. The newlinebench repository has further benchmarks.

To use bytecount in your crate, if you have cargo-edit, just type cargo add bytecount in a terminal with the crate root as the current path. Otherwise, manually edit your Cargo.toml and add bytecount = "0.1.4" to the [dependencies] section.

In your crate root (lib.rs or main.rs, depending on whether you are writing a library or an application), add extern crate bytecount;. Now you can simply use bytecount::count as follows:

extern crate bytecount;

fn main() {
    let mytext = "some potentially large text, perhaps read from disk?";
    let spaces = bytecount::count(mytext.as_bytes(), b' ');
    // use the count, e.g. print it
    println!("found {} spaces", spaces);
}
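To make the semantics of the call above concrete, here is a minimal std-only sketch that should produce the same result for the sample text. The name reference_count is hypothetical and is not part of bytecount; it only illustrates that the count operates on raw bytes rather than on characters.

```rust
/// Hypothetical reference helper (not part of bytecount): counts the bytes
/// in `haystack` that are equal to `needle`, using only the standard library.
fn reference_count(haystack: &[u8], needle: u8) -> usize {
    haystack.iter().filter(|&&b| b == needle).count()
}

fn main() {
    let mytext = "some potentially large text, perhaps read from disk?";
    let spaces = reference_count(mytext.as_bytes(), b' ');
    println!("found {} spaces", spaces);

    // Counting works on bytes, not characters: 'é' is encoded as the two
    // bytes 0xC3 0xA9 in UTF-8, so the needle 0xC3 is found exactly once.
    let accented = "é";
    assert_eq!(accented.as_bytes().len(), 2);
    assert_eq!(reference_count(accented.as_bytes(), 0xC3), 1);
}
```

A helper like this can also serve as a baseline in tests: for any input, bytecount::count is expected to agree with it.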

bytecount supports two features that make use of modern CPU features to speed up counting considerably. To allow your users to enable them, add the following to your Cargo.toml:

[features]
avx-accel = ["bytecount/avx-accel"]
simd-accel = ["bytecount/simd-accel"]

Now your users can compile with SSE support (available on most modern x86_64 processors) using:

cargo build --release --features simd-accel

Or even with AVX support (which likely requires compiling for the native target CPU):

RUSTFLAGS="-C target-cpu=native" cargo build --release --features "simd-accel avx-accel"

The algorithm is explained in depth here.

Note that for very short slices, data parallelism is unlikely to yield much of a performance gain. In those cases, a naive count with a 32-bit counter may be the better choice, as long as you are not counting really large byte slices.
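A naive count with a 32-bit counter could look roughly like the sketch below. The function name naive_count_u32 is hypothetical and chosen for illustration; it assumes the number of matches fits in a u32, which is presumably why this approach is not suited to really large slices.

```rust
/// Hypothetical helper for illustration: a plain loop over the slice that
/// accumulates matches in a 32-bit counter and widens the result at the end.
/// Assumes fewer than 2^32 matches; beyond that the u32 counter overflows.
fn naive_count_u32(haystack: &[u8], needle: u8) -> usize {
    haystack
        .iter()
        .fold(0u32, |n, &b| n + (b == needle) as u32) as usize
}

fn main() {
    // A short slice, where the SIMD setup cost is unlikely to pay off.
    let line = b"key = value";
    println!("{} spaces", naive_count_u32(line, b' '));
}
```

Whether this actually beats the accelerated path depends on the slice length and the target CPU, so it is worth benchmarking with your own data before switching.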

License

Licensed under either of the following, at your discretion: