Performance counters may be helpful for benchmarking:

- They can provide more information, such as cache misses and branch mispredictions.
- The number of cycles may be more stable than wall-clock time.

This may be hard to do in a portable way, but supporting it only on platforms where counters are available would still be nice.

@aclements