No obvious warm-up effect across the reps, but pre-iterating over the data first is slightly faster:
- No pre-iteration: 1.3ms; 1.4ms
- Pre-iterate: 1.2ms; 1.3ms
Using:

    if items.iter().any(|i| i.id[0] >= u64::MAX) {
        return ([0, 0, 0, 0], Duration::MAX);
    }

2sd mean; p99 (m3)
- debug: 4.3ms; 4.6ms
- release: 1.3ms; 1.4ms
- release + abort on panic: 1.3ms; 1.4ms
- release + optimised profile + RUSTFLAGS="-C target-cpu=native": 1.3ms; 1.4ms
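For reference, a rough sketch of how the two columns could be computed from raw samples. This assumes "2sd mean" means the mean of the samples within two standard deviations of the raw mean, and that p99 is the nearest-rank 99th percentile; the notes do not spell either out, and the function names `mean_within_2sd` and `p99` are made up here.

```rust
use std::time::Duration;

/// Mean of the samples within two standard deviations of the raw mean.
/// ASSUMPTION: this is one reading of "2sd mean", not confirmed by the notes.
fn mean_within_2sd(samples: &[Duration]) -> Option<Duration> {
    if samples.is_empty() {
        return None;
    }
    let secs: Vec<f64> = samples.iter().map(Duration::as_secs_f64).collect();
    let n = secs.len() as f64;
    let mean = secs.iter().sum::<f64>() / n;
    // Population standard deviation of the raw samples.
    let sd = (secs.iter().map(|s| (s - mean).powi(2)).sum::<f64>() / n).sqrt();
    // Drop anything further than 2 sd from the raw mean.
    let kept: Vec<f64> = secs
        .iter()
        .copied()
        .filter(|s| (s - mean).abs() <= 2.0 * sd)
        .collect();
    let trimmed = if kept.is_empty() {
        mean
    } else {
        kept.iter().sum::<f64>() / kept.len() as f64
    };
    Some(Duration::from_secs_f64(trimmed))
}

/// Nearest-rank 99th percentile: the sample at rank ceil(0.99 * n).
fn p99(samples: &[Duration]) -> Option<Duration> {
    if samples.is_empty() {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort();
    let rank = (sorted.len() * 99 + 99) / 100; // integer ceil, always >= 1
    Some(sorted[rank - 1])
}
```

Trimming at 2 sd keeps a single slow rep from dragging the mean, while p99 still reports the tail.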
2sd mean; p99 (m3)
- Including init data and destruction: 83ms; 93ms
- Destruction only: 1.4ms; 1.8ms
- Neither: 1.4ms; 1.6ms
- Single allocation: 1.3ms; 1.4ms
2sd mean; p99 (m3)
- for_each: 1.3ms; 1.4ms
- for i in items: 1.3ms; 1.4ms
- for i in 0..items.len(), for i in 0..SIZE, etc.: 1.3ms; 1.4ms
- and with black_box: 2.7ms; 3.5ms
Hard to force the bounds checks!
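A minimal sketch of the kind of comparison behind these numbers, assuming the data is a plain slice of `u64` (the real item type is different, and both function names are invented for illustration). With a visible `0..len` range the optimiser can usually prove the index is in bounds; routing the index through `std::hint::black_box` is one way to hide that, although `black_box` is only a best-effort hint.

```rust
use std::hint::black_box;

/// Iterator version: no indexing, so there is no bounds check to remove.
fn sum_iter(values: &[u64]) -> u64 {
    values.iter().fold(0u64, |acc, &v| acc.wrapping_add(v))
}

/// Indexed version with an opaque index. The compiler can no longer assume
/// `i < values.len()`, so the bounds check is expected to stay in the loop.
fn sum_indexed_opaque(values: &[u64]) -> u64 {
    let mut total = 0u64;
    for i in 0..values.len() {
        total = total.wrapping_add(values[black_box(i)]);
    }
    total
}
```

Both functions return the same sum; only the generated code should differ, and `black_box` probably also blocks vectorisation, so the slowdown is not purely the cost of the check.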
Manual overflow checks didn't make much difference
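As an illustration of what a manual overflow check can look like, here is a sketch over a plain `u64` slice. The function names are hypothetical and the benchmarked code may have checked overflow differently.

```rust
/// Wrapping sum: overflow wraps around silently, nothing is checked.
fn sum_wrapping(values: &[u64]) -> u64 {
    values.iter().fold(0u64, |acc, &v| acc.wrapping_add(v))
}

/// Checked sum: stops and returns `None` at the first addition that overflows.
fn sum_checked(values: &[u64]) -> Option<u64> {
    values.iter().try_fold(0u64, |acc, &v| acc.checked_add(v))
}
```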
Using atomic u64s for result shows the same times.
- Single thread: 1.3ms; 1.4ms
- 10 threads, bad atomics: 19.9ms; 26.3ms
- 10 threads, good atomics: 0.9ms; 0.9ms
- 2 threads, good atomics: 0.9ms; 1.0ms
- 10 threads, interleaved: 1.3ms; 1.7ms
- Rayon (par_iter): 43.1ms; 53.6ms
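The notes do not define "bad" and "good" atomics. One plausible reading, sketched below over a plain `u64` slice, is that the bad version does an atomic add on a shared counter for every element, while the good version accumulates in a thread-local variable and touches the atomic once per thread. The function names and the chunking scheme are assumptions.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// Chunk length so that at most `threads` chunks are produced (never zero).
fn chunk_len(len: usize, threads: usize) -> usize {
    let threads = threads.max(1);
    ((len + threads - 1) / threads).max(1)
}

/// "Bad" (assumed): every element hits the shared atomic, so all threads
/// contend on the same cache line.
fn sum_contended(values: &[u64], threads: usize) -> u64 {
    let total = AtomicU64::new(0);
    thread::scope(|s| {
        for part in values.chunks(chunk_len(values.len(), threads)) {
            let total = &total;
            s.spawn(move || {
                for &v in part {
                    total.fetch_add(v, Ordering::Relaxed);
                }
            });
        }
    });
    total.load(Ordering::Relaxed)
}

/// "Good" (assumed): each thread sums locally and publishes once.
fn sum_local(values: &[u64], threads: usize) -> u64 {
    let total = AtomicU64::new(0);
    thread::scope(|s| {
        for part in values.chunks(chunk_len(values.len(), threads)) {
            let total = &total;
            s.spawn(move || {
                let local = part.iter().fold(0u64, |acc, &v| acc.wrapping_add(v));
                total.fetch_add(local, Ordering::Relaxed);
            });
        }
    });
    total.load(Ordering::Relaxed)
}
```

`thread::scope` joins every spawned thread before returning, so the final `load` sees all the additions. Each thread here gets one contiguous chunk; an interleaved split would hand each thread every Nth element instead.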
- OO: 1.3ms; 1.4ms
- split ids from values: 1.1ms; 1.2ms
- split ids: 0.8ms; 0.9ms
On the data-oriented layout, indexing and iterators are roughly equivalent, though indexing seems to give a small but consistent benefit (0.81ms vs 0.83ms).
With black_box on indices, mean is 0.97ms.
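To make the layouts concrete, a rough sketch of an "OO" layout next to a split one, with the iterator and indexed loops on the split version. The `Item` and `Items` types, the `value` field and the four-element id are assumptions for illustration, and the notes' "split ids" variant may go further than this (for example, one vector per id component).

```rust
/// Hypothetical "OO" layout: id and value live together in each item.
struct Item {
    id: [u64; 4],
    value: u64,
}

/// Hypothetical data-oriented layout: ids and values in separate vectors,
/// so a loop over values never pulls ids into cache.
struct Items {
    ids: Vec<[u64; 4]>,
    values: Vec<u64>,
}

impl Items {
    fn from_items(items: &[Item]) -> Self {
        Items {
            ids: items.iter().map(|i| i.id).collect(),
            values: items.iter().map(|i| i.value).collect(),
        }
    }

    /// Iterator loop over the dense values vector.
    fn sum_iter(&self) -> u64 {
        self.values.iter().fold(0u64, |acc, &v| acc.wrapping_add(v))
    }

    /// Indexed loop over the same vector.
    fn sum_indexed(&self) -> u64 {
        let mut total = 0u64;
        for i in 0..self.values.len() {
            total = total.wrapping_add(self.values[i]);
        }
        total
    }
}

/// The same sum on the "OO" layout, striding over whole items.
fn sum_oo(items: &[Item]) -> u64 {
    items.iter().fold(0u64, |acc, i| acc.wrapping_add(i.value))
}
```

All three functions compute the same result; the difference is how much unrelated data each loop drags through the cache.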