Below is an explanation of the current database schema. This schema is duplicated across the (currently) two database backends we support: sqlite and postgres.
In general, the database is used to track four groups of things:
- Performance run statistics (e.g., instruction count) for compile time benchmarks on a per benchmark, profile, and scenario basis.
- Performance run statistics (e.g., instruction count) for runtime benchmarks on a per benchmark basis.
- Self profile data gathered with
-Zself-profile. - State when running GitHub bots and the performance runs (e.g., how long it took for a performance suite to run, errors encountered along the way, etc.)
Below are some diagrams showing the basic layout of the database schema for these three uses:
Here is the diagram for compile-time benchmarks:
┌────────────┐ ┌───────────────┐ ┌────────────┐
│ benchmark │ │ collection │ │ artifact │
├────────────┤ ├───────────────┤ ├────────────┤
┌►│ name * │ │ id * │◄┐│ id * │◄┐
│ │ stabilized │ │ perf_commit │ ││ name │ │
│ │ │ │ │ ││ date │ │
│ │ │ │ │ ││ type │ │
│ └────────────┘ └───────────────┘ │└────────────┘ │
│ │ │
│ │ │
│ ┌───────────────┐ ┌──────────┐ │ │
│ │ pstat_series │ │ pstat │ │ │
│ ├───────────────┤ ├──────────┤ │ │
│ │ id * │◄┐│ id * │ │ │
└─┤ crate │ └┤ series │ │ │
│ profile │ │ aid ├───┼───────────────┘
│ scenario │ │ cid │ │
│ backend │ │ value ├───┘
│ metric │ └──────────┘
└───────────────┘
For runtime benchmarks the schema very similar, but there are different table names:
pstat=>runtime_pstatpstat_series=>runtime_pstat_series- There are different attributes here,
benchmarkandmetric.
- There are different attributes here,
A description of a rustc compiler artifact being benchmarked.
This description includes:
- name: usually a commit sha or a tag like "1.51.0" but is free-form text so can be anything.
- date: the date associated with this compiler artifact (usually only when the name is a commit)
- type: currently one of "master" (i.e., we're testing a merge commit), "try" (someone is testing a PR), and "release" (usually a release candidate - though local compilers also get labeled like this).
sqlite> select * from artifact limit 1;
id name date type
---------- ---------- ---------- -------
1 LOCAL_TEST release
Records the size of individual components (like librustc_driver.so or libLLVM.so) of a single
artifact.
This description includes:
- component: normalized name of the component (hashes are removed)
- size: size of the component in bytes
sqlite> select * from artifact_size limit 1;
aid component size
---------- ---------- ----------
1 libLLVM.so 177892352
A "collection" of benchmarks tied only differing by the statistic collected.
This corresponds to a test result.
This is a way to collect statistics together signifying that they belong to the same logical benchmark run.
Currently, the collection also marks the git sha of the currently running collector binary.
sqlite> select * from collection limit 1;
id perf_commit
---------- ----------------------------------------
1 d9fd96f409a15429757030f225b082744a72516c
Keeps track of the collector's start and finish time as well as which step it's currently on.
sqlite> select * from collector_progress limit 1;
aid step start end
---------- ---------- ---------- ----------
1 helloworld 1625829961 1625829965
Records how long benchmarking takes in seconds.
sqlite> select * from artifact_collection_duration limit 1;
aid date_recorded duration
---------- ------------- ----------
1 1625829965 4
The different types of compile-time benchmarks that are run.
The table stores the name of the benchmark, whether it is capable of being run using the stable compiler, and its category. The benchmark name is used as a foreign key in many of the other tables.
Category is either primary (real-world benchmark) or secondary (stress test).
Stable benchmarks have category set to primary and stabilized set to 1.
sqlite> select * from benchmark limit 1;
name stabilized category
---------- ---------- ----------
helloworld 0 primary
Describes the parametrization of a compile-time benchmark. Contains a unique combination of a crate, profile, scenario and the metric being collected.
- crate (aka
benchmark): the benchmarked crate which might be a crate from crates.io or a crate made specifically to stress some part of the compiler. - profile: what type of compilation is happening - check build, optimized build (a.k.a. release build), debug build, or doc build.
- scenario: describes how much of the incremental cache is full. An empty incremental cache means that the compiler must do a full build.
- backend: codegen backend used for compilation.
- metric: the type of metric being collected.
This corresponds to a statistic description.
There is a separate table for this collection to avoid duplicating crates, profiles, scenarios etc.
many times in the pstat table.
sqlite> select * from pstat_series limit 1;
id crate profile scenario backend metric
---------- ---------- ---------- ---------- ------- ------------
1 helloworld check full llvm task-clock:u
A measured value of a compile-time metric that is unique to a pstat_series, artifact and a collection.
Each measured combination of a collection, rustc artifact, benchmarked crate, profile, scenario and a metric has its own unique entry in this table.
sqlite> select * from pstat limit 1;
series aid cid value
---------- ---------- ---------- ----------
1 1 1 24.93
Describes the parametrization of a runtime benchmark. Contains a unique combination of a benchmark and the metric being collected.
This table exists to avoid duplicating crates, profiles, scenarios etc. many times in the runtime_pstat table.
sqlite> select * from runtime_pstat_series limit 1;
id benchmark metric
---------- --------- --------------
1 nbody-10k instructions:u
A measured value of a runtime metric that is unique to a runtime_pstat_series, artifact and a collection.
Each measured combination of a collection, rustc artifact, benchmark and a metric has its own unique entry in this table.
sqlite> select * from runtime_pstat limit 1;
series aid cid value
---------- ---------- ---------- ----------
1 1 1 24.93
Records the duration of compiling a rustc crate for a given artifact and collection.
sqlite> select * from rustc_compilation limit 1;
aid cid crate duration
--- --- ---------- --------
1 42 rustc_mir_transform 28.096
Records that a given combination of artifact, collection, benchmark, profile and scenario has a self profile archive available. This profile is then downloaded through an endpoint - it is not stored in the database directly.
sqlite> select * from raw_self_profile limit 1;
aid cid crate profile cache
--- --- ----- ------- -----
1 42 hello-world debug full
Records a pull request commit that is waiting in a queue to be benchmarked.
First a merge commit is queued, then its artifacts are built by bors, and once the commit is attached to the entry in this table, it can be benchmarked.
- bors_sha: SHA of the commit that should be benchmarked
- pr: number of the PR
- parent_sha: SHA of the parent commit, to which will the PR be compared
- complete: bool specifying whether this commit has been already benchmarked or not
- requested: when was the commit queued
- include: which benchmarks should be included (corresponds to the
--includebenchmark parameter) - exclude: which benchmarks should be excluded (corresponds to the
--excludebenchmark parameter) - runs: how many iterations should be used by default for the benchmark run
- commit_date: when was the commit created
- backends: the codegen backends to use for the benchmarks (corresponds to the
--backendsbenchmark parameter)
sqlite> select * from pull_request_build limit 1;
bors_sha pr parent_sha complete requested include exclude runs commit_date backends
---------- -- ---------- -------- --------- ------- ------- ---- ----------- --------
1w0p83... 42 fq24xq... true <timestamp> 3 <timestamp>
Records a compilation or runtime error for an artifact and a benchmark.
sqlite> select * from error limit 1;
aid benchmark error
---------- --- -----
1 syn-1.0.89 Failed to compile...