[BOLT] Add parser for pre-aggregated perf data
Summary:
Normally, perf2bolt aggregates profile data by reading perf output
directly. However, if the data comes from a database instead of perf,
one can write a query that produces a pre-aggregated file. This commit
adds a parser for that case.

The pre-aggregated file contains aggregated LBR data, but without binary
knowledge. BOLT will parse it and, using information from the
disassembled binary, augment it with fall-through edge frequency
information. After this step is finished, this data can be either
written to disk to be consumed by BOLT later, or can be used by BOLT
immediately if kept in memory.
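
As a rough illustration of what "augmenting with fall-through edge frequency" can look like, the sketch below takes one aggregated trace and adds its count to every block-to-block fall-through edge the trace crosses. It is not BOLT's recordTrace(); the names `Edge` and `addTraceCounts` and the representation of a function as a sorted list of basic block start offsets are assumptions made for this example only.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical edge key: (source block start, destination block start).
using Edge = std::pair<uint64_t, uint64_t>;

// BlockStarts holds the sorted start offsets of one function's basic blocks.
// A trace [From, To] that was observed Count times executes straight through
// every block boundary it spans, so each spanned fall-through edge gets
// Count added to it. Returns false if the trace cannot belong to the function.
bool addTraceCounts(const std::vector<uint64_t> &BlockStarts, uint64_t From,
                    uint64_t To, uint64_t Count,
                    std::map<Edge, uint64_t> &EdgeCounts) {
  if (BlockStarts.empty() || From > To || From < BlockStarts.front())
    return false;

  // Block containing From: the last block whose start is <= From.
  auto It = std::upper_bound(BlockStarts.begin(), BlockStarts.end(), From);
  --It; // safe: From >= BlockStarts.front()

  // Walk consecutive blocks until the next block starts past To.
  for (auto Next = It + 1; Next != BlockStarts.end() && *Next <= To;
       ++It, ++Next)
    EdgeCounts[{*It, *Next}] += Count;
  return true;
}
```

Passing the aggregated count once, instead of replaying the same trace Count times, is the same idea as the `Count` parameter this commit adds to getFallthroughsInTrace().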

File format syntax:
{B|F|f} [<start_id>:]<start_offset> [<end_id>:]<end_offset> <count> [<mispred_count>]

B - indicates an aggregated branch
F - an aggregated fall-through (trace)
f - an aggregated fall-through with external origin - used to disambiguate
between a return hitting a basic block head and a regular internal
jump to the block

<start_id> - build id of the object containing the start address. We can
skip it for the main binary and use "X" for an unknown object. This will
save some space and facilitate human parsing.

<start_offset> - hex offset from the object base load address (0 for the
main executable unless it's PIE) to the start address.

<end_id>, <end_offset> - same for the end address.

<count> - total aggregated count of the branch or a fall-through.

<mispred_count> - the number of times the branch was mispredicted.
Omitted for fall-throughs.

Example:
F 41be50 41be50 3
F 41be90 41be90 4
f 41be90 41be90 7
B 4b1942 39b57f0 3 0
B 4b196f 4b19e0 2 0
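
A minimal sketch of reading one line of this format follows. It is not the parser added by this commit: the `Location` and `AggregatedEntry` types and the `parseLine` function are hypothetical. It also assumes that counts are decimal, and it chooses to reject a fifth field on F/f lines and any trailing fields.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical types for illustration; these are not BOLT's own classes.
struct Location {
  std::string BuildID; // empty: main binary; "X": unknown object
  uint64_t Offset = 0; // hex offset from the object's base load address
};

struct AggregatedEntry {
  char Kind = 0; // 'B', 'F' or 'f'
  Location From, To;
  uint64_t Count = 0;
  uint64_t Mispreds = 0; // only meaningful for 'B' entries
};

// Parse an unsigned number in base 10 or 16 made of digits only.
static bool parseNumber(const std::string &S, int Base, uint64_t &Out) {
  const size_t MaxDigits = Base == 16 ? 16 : 19; // always fits in 64 bits
  if (S.empty() || S.size() > MaxDigits)
    return false;
  for (unsigned char C : S)
    if (Base == 16 ? !std::isxdigit(C) : !std::isdigit(C))
      return false;
  Out = std::stoull(S, nullptr, Base);
  return true;
}

// Parse "[<id>:]<hex_offset>".
static bool parseLocation(const std::string &Token, Location &Loc) {
  assert(!Token.empty() && "tokens read with operator>> are never empty");
  std::string Hex = Token;
  const size_t Colon = Token.find(':');
  if (Colon != std::string::npos) {
    Loc.BuildID = Token.substr(0, Colon);
    Hex = Token.substr(Colon + 1);
  }
  return parseNumber(Hex, 16, Loc.Offset);
}

// Parse one line; Out is only written when the line is well formed.
bool parseLine(const std::string &Line, AggregatedEntry &Out) {
  std::istringstream In(Line);
  std::string Kind, From, To, Count, Extra;
  if (!(In >> Kind >> From >> To >> Count))
    return false;
  if (Kind != "B" && Kind != "F" && Kind != "f")
    return false;

  AggregatedEntry E;
  E.Kind = Kind[0];
  if (!parseLocation(From, E.From) || !parseLocation(To, E.To) ||
      !parseNumber(Count, 10, E.Count))
    return false;

  // Optional mispredict count, accepted for branches only.
  if (In >> Extra) {
    if (E.Kind != 'B' || !parseNumber(Extra, 10, E.Mispreds))
      return false;
    if (In >> Extra) // anything after the mispredict count is an error
      return false;
  }
  Out = E;
  return true;
}
```

For the last example line above, `parseLine("B 4b196f 4b19e0 2 0", E)` fills `E` with a branch from offset 0x4b196f to 0x4b19e0 in the main binary, taken 2 times with 0 mispredictions.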

(cherry picked from FBD8887182)
rafaelauler authored and maksfb committed Jul 18, 2018
1 parent 27f3032 commit ddfcf4f
Showing 9 changed files with 519 additions and 59 deletions.
3 changes: 2 additions & 1 deletion bolt/src/BinaryFunction.h
@@ -2138,7 +2138,8 @@ class BinaryFunction {
   /// Return a vector of offsets corresponding to a trace in a function
   /// (see recordTrace() above).
   Optional<SmallVector<std::pair<uint64_t, uint64_t>, 16>>
-  getFallthroughsInTrace(const LBREntry &First, const LBREntry &Second);
+  getFallthroughsInTrace(const LBREntry &First, const LBREntry &Second,
+                         uint64_t Count = 1);
 
   /// Returns an estimate of the function's hot part after splitting.
   /// This is a very rough estimate, as with C++ exceptions there are
5 changes: 3 additions & 2 deletions bolt/src/BinaryFunctionProfile.cpp
@@ -414,10 +414,11 @@ void BinaryFunction::postProcessProfile() {
 
 Optional<SmallVector<std::pair<uint64_t, uint64_t>, 16>>
 BinaryFunction::getFallthroughsInTrace(const LBREntry &FirstLBR,
-                                       const LBREntry &SecondLBR) {
+                                       const LBREntry &SecondLBR,
+                                       uint64_t Count) {
   SmallVector<std::pair<uint64_t, uint64_t>, 16> Res;
 
-  if (!recordTrace(FirstLBR, SecondLBR, 1, &Res))
+  if (!recordTrace(FirstLBR, SecondLBR, Count, &Res))
     return NoneType();
 
   return Res;
