nrc/simple-perf-egs

Very simple performance experiment

Results

Warm-up

No obvious warm-up effect across the reps, but iterating over the items once before timing helps slightly:

  • No pre-iteration: 1.3ms; 1.4ms
  • Pre-iterate: 1.2ms; 1.3ms

The pre-iteration pass used:

    if items.iter().any(|i| i.id[0] >= u64::MAX) {
        return ([0, 0, 0, 0], Duration::MAX);
    }
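
A minimal sketch of how such a pre-iteration pass might sit in front of the timed loop. The `Item` layout, the `run` function, and the summing workload are hypothetical; only the `any` check and the `([u64; 4], Duration)` return shape come from the snippet above.

```rust
use std::time::{Duration, Instant};

// Hypothetical item layout; the real struct in the repository may differ.
pub struct Item {
    pub id: [u64; 2],
    pub value: u64,
}

// Touch every item once before starting the clock, then time the loop.
pub fn run(items: &[Item]) -> ([u64; 4], Duration) {
    // Pre-iteration: reads every item before timing starts. The check can
    // only succeed for an id equal to u64::MAX, but the compiler cannot
    // assume that, so the pass is not optimised away.
    if items.iter().any(|i| i.id[0] >= u64::MAX) {
        return ([0, 0, 0, 0], Duration::MAX);
    }

    let start = Instant::now();
    let mut result = [0u64; 4];
    for item in items {
        // Hypothetical workload: bucket values by the low bits of the id.
        let slot = (item.id[0] % 4) as usize;
        result[slot] = result[slot].wrapping_add(item.value);
    }
    (result, start.elapsed())
}
```

Returning a sentinel from the pre-iteration branch gives the pass an observable effect, which is presumably why it is written as a check rather than a bare loop.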

Build

2sd mean; p99 (m3)

  • debug: 4.3ms; 4.6ms
  • release: 1.3ms; 1.4ms
  • release + abort on panic: 1.3ms; 1.4ms
  • release + optimised profile + RUSTFLAGS="-C target-cpu=native": 1.3ms; 1.4ms

Timing the right thing

2sd mean; p99 (m3)

  • Including init data and destruction: 83ms; 93ms
  • Destruction only: 1.4ms; 1.8ms
  • Neither: 1.4ms; 1.6ms
  • Single allocation: 1.3ms; 1.4ms
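
The difference between the first and third rows comes down to where the clock starts and stops. A minimal sketch, assuming a hypothetical `timed_sum` workload rather than the repository's actual benchmark:

```rust
use std::time::{Duration, Instant};

// Hypothetical workload: sum a vector of values, timing only the loop.
pub fn timed_sum(size: usize) -> (u64, Duration) {
    // Initialisation happens before the clock starts.
    let items: Vec<u64> = (0..size as u64).collect();

    let start = Instant::now();
    let sum: u64 = items.iter().sum();
    let elapsed = start.elapsed();

    // Destruction happens after the clock has stopped.
    drop(items);
    (sum, elapsed)
}
```

Moving the `collect` below `Instant::now()` or the `drop` above `start.elapsed()` would fold allocation or deallocation cost into the measurement.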

Loop style

2sd mean; p99 (m3)

  • for_each: 1.3ms; 1.4ms
  • for i in items: 1.3ms; 1.4ms
  • for i in 0..items.len(), for i in 0..SIZE, etc.: 1.3ms; 1.4ms
  • and with black_box: 2.7ms; 3.5ms

It is hard to force the compiler to keep the bounds checks!
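
The loop styles compared above might look like the following sketch. The function names and the summing workload are hypothetical, and where exactly `black_box` was applied in the measured code is an assumption (here it wraps the index).

```rust
use std::hint::black_box;

// Closure-based iteration.
pub fn sum_for_each(items: &[u64]) -> u64 {
    let mut total = 0u64;
    items.iter().for_each(|v| total = total.wrapping_add(*v));
    total
}

// Plain `for` loop over the iterator.
pub fn sum_for_in(items: &[u64]) -> u64 {
    let mut total = 0u64;
    for v in items {
        total = total.wrapping_add(*v);
    }
    total
}

// Indexed loop; the optimiser can usually prove `i < items.len()`.
pub fn sum_indexed(items: &[u64]) -> u64 {
    let mut total = 0u64;
    for i in 0..items.len() {
        total = total.wrapping_add(items[i]);
    }
    total
}

// Indexed loop with the index hidden behind `black_box`, which should make
// it much harder for the optimiser to elide the bounds check.
pub fn sum_indexed_black_box(items: &[u64]) -> u64 {
    let mut total = 0u64;
    for i in 0..items.len() {
        total = total.wrapping_add(items[black_box(i)]);
    }
    total
}
```

All four return the same value; only the last one is expected to keep a per-element bounds check in an optimised build.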

Overflow checks

Manual overflow checks didn't make much difference.
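
What "manual overflow checks" looked like in the measured code is not shown, so the following is only an illustrative sketch using `checked_add` against a wrapping baseline. Both function names are hypothetical.

```rust
// Baseline: wraps silently on overflow, never panics.
pub fn sum_wrapping(values: &[u64]) -> u64 {
    values.iter().fold(0u64, |acc, v| acc.wrapping_add(*v))
}

// Manual check: stops and returns None as soon as an addition overflows.
pub fn sum_checked(values: &[u64]) -> Option<u64> {
    values.iter().try_fold(0u64, |acc, v| acc.checked_add(*v))
}
```

For example, `sum_checked(&[u64::MAX, 2])` returns `None`, while `sum_wrapping` on the same input wraps around to `1`.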

Parallelism

Using atomic u64s for result shows the same times.

  • Single thread: 1.3ms; 1.4ms
  • 10 threads, bad atomics: 19.9ms; 26.3ms
  • 10 threads, good atomics: 0.9ms; 0.9ms
  • 2 threads, good atomics: 0.9ms; 1.0ms
  • 10 threads, interleaved: 1.3ms; 1.7ms
  • Rayon (par_iter): 43.1ms; 53.6ms
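
The notes do not say what separates "bad" from "good" atomics. One plausible reading, sketched here as an assumption, is that the bad version updates a shared atomic for every element while the good version accumulates locally and touches the atomic once per thread. All names below are hypothetical.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

// Chunk length so that at most `threads` chunks are produced (never zero).
fn chunk_len(len: usize, threads: usize) -> usize {
    let threads = threads.max(1);
    ((len + threads - 1) / threads).max(1)
}

// Assumed "bad" pattern: one atomic read-modify-write per element, so all
// threads contend on the same cache line.
pub fn sum_contended(values: &[u64], threads: usize) -> u64 {
    let total = AtomicU64::new(0);
    thread::scope(|s| {
        for part in values.chunks(chunk_len(values.len(), threads)) {
            let total = &total;
            s.spawn(move || {
                for v in part {
                    total.fetch_add(*v, Ordering::Relaxed);
                }
            });
        }
    });
    total.load(Ordering::Relaxed)
}

// Assumed "good" pattern: each thread sums into a local variable and
// publishes its partial result with a single atomic add.
pub fn sum_local_then_publish(values: &[u64], threads: usize) -> u64 {
    let total = AtomicU64::new(0);
    thread::scope(|s| {
        for part in values.chunks(chunk_len(values.len(), threads)) {
            let total = &total;
            s.spawn(move || {
                let local = part.iter().fold(0u64, |acc, v| acc.wrapping_add(*v));
                total.fetch_add(local, Ordering::Relaxed);
            });
        }
    });
    total.load(Ordering::Relaxed)
}
```

Both functions return the same sum; the difference is only in how often the shared counter is written.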

Data-oriented approach

  • OO: 1.3ms; 1.4ms
  • split ids from values: 1.1ms; 1.2ms
  • split ids: 0.8ms; 0.9ms
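
As a rough illustration of the layouts being compared, the sketch below contrasts an array-of-structs ("OO") layout with one that keeps ids and values in separate vectors. The field names, the id width, and the `Items` type are assumptions, and the further "split ids" variant is not shown because its exact shape is not described.

```rust
// "OO" layout: each item carries its id and value together.
pub struct Item {
    pub id: [u64; 2],
    pub value: u64,
}

// Data-oriented layout: ids and values live in separate contiguous vectors.
pub struct Items {
    pub ids: Vec<[u64; 2]>,
    pub values: Vec<u64>,
}

impl Items {
    // Build the split layout from the OO one.
    pub fn from_items(items: &[Item]) -> Self {
        Items {
            ids: items.iter().map(|i| i.id).collect(),
            values: items.iter().map(|i| i.value).collect(),
        }
    }

    // Reads only the values vector, so no id bytes are pulled into cache.
    pub fn sum_values(&self) -> u64 {
        self.values.iter().fold(0u64, |acc, v| acc.wrapping_add(*v))
    }
}

// Same computation on the OO layout; each value read strides over an id.
pub fn sum_values_oo(items: &[Item]) -> u64 {
    items.iter().fold(0u64, |acc, i| acc.wrapping_add(i.value))
}
```

A loop that needs only one field benefits most from the split, since every cache line it loads is filled with data it actually uses.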

Iteration style

Based on the data-oriented version, indexing and iterators are roughly equivalent, though there seems to be a small but consistent benefit to indexing (0.81ms vs 0.83ms). With black_box on the indices, the mean is 0.97ms.

About

Simple examples of performance measurement and optimisation
