low-overhead timing for x86_64.
it uses serialized rdtsc reads for TimePoint, provides a lock-free histogram for latency distributions, and includes simple timer helpers for manual and scoped measurement.
use latency::{Histogram, TimePoint, Timer};
fn main() {
    latency::init();
    let start = TimePoint::now();
    do_work();
    println!("elapsed: {}ns", start.elapsed_ns());
    let timer = Timer::start();
    do_work();
    println!("elapsed: {}ns", timer.elapsed_ns());
    let histogram = Histogram::new();
    histogram.record(42);
    println!("{}", histogram.stats().format());
}
fn do_work() {}

- TimePoint::now() and elapsed_ns() for direct timing
- Timer and TimerGuard for manual or scoped timing
- ScopedTimer for closure-based timing
- Histogram for min, max, mean, and percentile summaries
- time_block! and time_if_enabled! for lightweight instrumentation
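to illustrate how a lock-free histogram can produce min, max, mean, and percentile summaries, here is a minimal sketch built only on std atomics. it is not the crate's actual Histogram: the type SketchHistogram, its power-of-two bucket layout, and the method percentile_upper_bound are all assumptions made for this example.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical sketch, NOT the crate's `Histogram`.
/// Bucket `i` counts samples whose value needs exactly `i` bits,
/// so bucket 0 holds 0 and bucket i >= 1 holds [2^(i-1), 2^i - 1].
pub struct SketchHistogram {
    buckets: [AtomicU64; 65],
    count: AtomicU64,
    sum: AtomicU64,
    min: AtomicU64,
    max: AtomicU64,
}

impl SketchHistogram {
    pub fn new() -> Self {
        Self {
            buckets: std::array::from_fn(|_| AtomicU64::new(0)),
            count: AtomicU64::new(0),
            sum: AtomicU64::new(0),
            min: AtomicU64::new(u64::MAX),
            max: AtomicU64::new(0),
        }
    }

    /// Records one sample through `&self`; every update is a single
    /// atomic read-modify-write, so no lock is taken.
    pub fn record(&self, value_ns: u64) {
        let idx = (64 - value_ns.leading_zeros()) as usize;
        self.buckets[idx].fetch_add(1, Ordering::Relaxed);
        self.sum.fetch_add(value_ns, Ordering::Relaxed); // wraps on overflow
        self.min.fetch_min(value_ns, Ordering::Relaxed);
        self.max.fetch_max(value_ns, Ordering::Relaxed);
        self.count.fetch_add(1, Ordering::Relaxed);
    }

    pub fn count(&self) -> u64 {
        self.count.load(Ordering::Relaxed)
    }

    pub fn min(&self) -> Option<u64> {
        (self.count() > 0).then(|| self.min.load(Ordering::Relaxed))
    }

    pub fn max(&self) -> Option<u64> {
        (self.count() > 0).then(|| self.max.load(Ordering::Relaxed))
    }

    pub fn mean(&self) -> Option<f64> {
        let n = self.count();
        (n > 0).then(|| self.sum.load(Ordering::Relaxed) as f64 / n as f64)
    }

    /// Upper bound of the bucket containing the p-th percentile (0..=100).
    /// Fields are read independently, so under concurrent writers this is
    /// an approximate snapshot rather than an exact one.
    pub fn percentile_upper_bound(&self, p: f64) -> Option<u64> {
        let total = self.count();
        if total == 0 {
            return None;
        }
        let rank = ((p.clamp(0.0, 100.0) / 100.0) * total as f64).ceil() as u64;
        let rank = rank.clamp(1, total);
        let mut seen = 0u64;
        for (i, bucket) in self.buckets.iter().enumerate() {
            seen += bucket.load(Ordering::Relaxed);
            if seen >= rank {
                return Some(match i {
                    0 => 0,
                    64 => u64::MAX,
                    _ => (1u64 << i) - 1,
                });
            }
        }
        Some(u64::MAX)
    }
}
```

power-of-two buckets keep record() to a handful of relaxed atomic operations at the cost of coarse percentile resolution. a real implementation may use finer buckets or a different layout.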
ScopedTimer threshold warnings are debug-only, so release builds do not write to stderr from the measured path.
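one way to get debug-only warnings like this is to gate the stderr write on cfg!(debug_assertions). the sketch below is an assumption about the general technique, not the crate's ScopedTimer: the function time_with_threshold is hypothetical, and it uses std::time::Instant in place of the crate's tsc-based TimePoint.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper, not part of the `latency` crate.
/// Runs `f`, returns its result and the elapsed time, and warns on stderr
/// only in builds where debug assertions are enabled.
pub fn time_with_threshold<T>(
    label: &str,
    threshold: Duration,
    f: impl FnOnce() -> T,
) -> (T, Duration) {
    let start = Instant::now();
    let out = f();
    let elapsed = start.elapsed();

    // `cfg!(debug_assertions)` is a compile-time constant, so in a default
    // release build this branch is dead code and the optimizer can drop it.
    if cfg!(debug_assertions) && elapsed > threshold {
        eprintln!("[{label}] took {elapsed:?}, threshold {threshold:?}");
    }

    (out, elapsed)
}
```

a call such as time_with_threshold("parse", Duration::from_micros(5), || parse(input)), where parse and input are placeholders, would return the closure's result together with the measured duration.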
the example benchmark compares latency::TimePoint against solana_measure::measure::Measure across an empty measurement and a bounded work sweep.
on 7950x, repeated pinned-core runs showed:
- empty measurement: latency is clearly lower overhead
- small work around 80ns to 300ns: latency is still measurably faster
- medium to larger work around 0.58us to 2.33us: both are roughly equal, with latency a few nanoseconds lower
run it with:
cargo run --release --example compare_timing

timing is enabled by default. to compile the crate without active timing code:
[features]
default = []

real tsc-based timing requires x86_64. on other targets, the raw tsc helpers return 0, so TimePoint-based measurement is effectively disabled.
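the platform split described above can be sketched with cfg attributes. this is a hedged illustration, not the crate's source: the name read_tsc_serialized is hypothetical, and fencing rdtsc with lfence on both sides is just one common way to serialize the read (the crate may use rdtscp, cpuid, or something else).

```rust
/// Hypothetical helper, not the crate's API.
/// On x86_64, reads the time-stamp counter between two LFENCE instructions
/// so earlier loads complete before the read and later instructions do not
/// start before it.
#[cfg(target_arch = "x86_64")]
pub fn read_tsc_serialized() -> u64 {
    use core::arch::x86_64::{_mm_lfence, _rdtsc};
    // SSE2 (required by LFENCE) is part of the x86_64 baseline, and RDTSC is
    // available on every x86_64 CPU, so these calls are sound here. RDTSC can
    // still fault if the OS has disabled it for user space.
    unsafe {
        _mm_lfence();
        let ticks = _rdtsc();
        _mm_lfence();
        ticks
    }
}

/// Fallback for other targets: always 0, so any elapsed-time computation
/// built on top of it also yields 0.
#[cfg(not(target_arch = "x86_64"))]
pub fn read_tsc_serialized() -> u64 {
    0
}
```

the value is in tsc ticks, not nanoseconds. converting it requires a calibrated tick frequency, which is presumably part of what latency::init() sets up, though the source does not say so.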