
CI benchmarks: provide a way to generate wall-time measurements #1487

@aochagavia

Description


Our current benchmarking setup measures executed CPU instructions, which, according to this blog post, correlate well enough with wall-time to reliably identify the speed difference between two versions of rustls:

It correlates fairly well with wall-time. You’d be mad to use it to compare the speed of two different programs, but it is very useful for comparing two slightly different versions of the same program, which are likely to have a similar instruction mix.

IMO we can safely assume most PRs propose changes that are "slightly different" from the existing version of the code, meaning that the instruction count is a reliable metric for judging their performance impact.

What about bigger changes? Talking to @nnethercote (author of the quote above), he mentioned that the correlation between instruction counts and wall-time decreases when there are bigger changes, or when they significantly affect memory layout. It will probably take some time to develop an intuition of which changes fall under this category (maybe #1448 is an instance), but it is clear that we need a setup to obtain wall-time measurements in those cases.

I'm currently thinking of adding flags to the bench runner that allow:

  1. Measuring wall-time instead of icounts (running each benchmark multiple times);
  2. Comparing the resulting time distributions between two runs.

The idea would be to manually trigger a wall-time bench run when a reviewer considers it necessary.
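To make the two flags concrete, here is a minimal sketch of what the wall-time mode could look like. Everything here is hypothetical (the workload, the sample count, and comparing only medians stand in for whatever the bench runner actually does); a real implementation would compare the full distributions, e.g. with a significance test:

```rust
use std::time::Instant;

/// Run `f` repeatedly and collect one wall-time sample (in nanoseconds)
/// per run, per point 1 above.
fn sample_wall_times<F: FnMut()>(mut f: F, runs: usize) -> Vec<u128> {
    (0..runs)
        .map(|_| {
            let start = Instant::now();
            f();
            start.elapsed().as_nanos()
        })
        .collect()
}

/// Median of a sample; sorting in place is fine for a sketch.
fn median(samples: &mut [u128]) -> u128 {
    samples.sort_unstable();
    samples[samples.len() / 2]
}

fn main() {
    // Hypothetical CPU-bound workload standing in for a rustls benchmark.
    let workload = || {
        let mut acc = 0u64;
        for i in 0..100_000u64 {
            acc = acc.wrapping_add(i * i);
        }
        std::hint::black_box(acc);
    };

    // Point 2: compare the resulting distributions between two runs.
    // Here we only compare medians; a real runner should look at the
    // whole distribution to account for noise.
    let mut baseline = sample_wall_times(workload, 30);
    let mut candidate = sample_wall_times(workload, 30);

    let (m_base, m_cand) = (median(&mut baseline), median(&mut candidate));
    let diff = (m_cand as f64 - m_base as f64) / m_base as f64 * 100.0;
    println!("median baseline: {m_base} ns, candidate: {m_cand} ns, diff: {diff:+.2}%");
}
```

Even on dedicated hardware the samples will be noisy, which is why each benchmark is run multiple times and the comparison happens between distributions rather than single numbers.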

By the way, this all assumes we can run the benchmarks on dedicated, properly configured hardware. I'm currently arranging an OVH bare-metal machine sponsored by ISRG (let me know if you'd prefer something else).
