[trace] Introduce Hierarchical Trace Representation (HTR) and add command for Intel PT trace visualization

This diff introduces Hierarchical Trace Representation (HTR) and creates the `thread trace export ctf -f <filename> -t <thread_id>` command to export an Intel PT trace's HTR to Chrome Trace Format (CTF) for visualization.

See `lldb/docs/htr.rst` for context/documentation on HTR.

**Overview of Changes**

- Add HTR documentation (see `lldb/docs/htr.rst`)
- Add HTR structures (layer, block, block metadata)
- Implement "Basic Super Block" HTR pass
- Add 'thread trace export ctf' command to export the HTR of an Intel PT trace to Chrome Trace Format (CTF)

As this diff is the first iteration of HTR and trace visualization, future diffs will build on this work by generalizing the internal design of HTR and implementing new HTR passes that provide better trace summarization/visualization.

See attached video for an example of Intel PT trace visualization: {F17851042}

Original Author: jj10306
Submitted by: wallace
Reviewed By: wallace, clayborg
Differential Revision: https://reviews.llvm.org/D105741
llvm · Jul 28, 2021 · commit d52ba48
1 parent e12e02d
Showing 12 changed files with 1,119 additions and 8 deletions.
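The commit message describes exporting an HTR to Chrome Trace Format. As a rough illustration of what such an export could look like, here is a minimal sketch that serializes a list of blocks as Chrome Trace "complete" events (`"ph": "X"`). `BlockSpan`, `EscapeJson`, and `ToChromeTrace` are hypothetical names, not lldb APIs, and using the instruction index as the timestamp is an assumption made only because this sketch has no timing data.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical summary of one HTR block: a label plus the range of
// instructions it covers. None of these names come from lldb.
struct BlockSpan {
  std::string name;       // e.g. a function name or a block id
  std::size_t first_insn; // index of the first instruction in the block
  std::size_t num_insns;  // number of instructions in the block
};

// Escape quotes and backslashes, which is enough for plain ASCII labels.
std::string EscapeJson(const std::string &s) {
  std::string out;
  for (char c : s) {
    if (c == '"' || c == '\\')
      out += '\\';
    out += c;
  }
  return out;
}

// Serialize blocks as a JSON array of Chrome Trace "complete" events.
// Assumption: the instruction index stands in for the timestamp ("ts")
// and the instruction count for the duration ("dur").
std::string ToChromeTrace(const std::vector<BlockSpan> &blocks, int tid) {
  std::ostringstream os;
  os << "[";
  for (std::size_t i = 0; i < blocks.size(); ++i) {
    const BlockSpan &b = blocks[i];
    if (i != 0)
      os << ",";
    os << "{\"name\":\"" << EscapeJson(b.name) << "\",\"ph\":\"X\""
       << ",\"ts\":" << b.first_insn << ",\"dur\":" << b.num_insns
       << ",\"pid\":0,\"tid\":" << tid << "}";
  }
  os << "]";
  return os.str();
}
```

Writing the returned string to a file should give something a Chrome Trace viewer can load, with each block drawn as one bar on the row for thread `tid`.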
diff --git a/lldb/docs/htr.rst b/lldb/docs/htr.rst
@@ -0,0 +1,47 @@
+Hierarchical Trace Representation (HTR)
+=======================================
+Processor traces, such as those obtained with Intel PT, contain far too much data to be digestible by humans in raw form. It is therefore useful to summarize these massive traces by extracting the relevant information. Hierarchical Trace Representation (HTR) is the way lldb represents a summarized trace internally. HTR stores trace data efficiently and allows it to be transformed in a way akin to compiler passes.
+
+Concepts
+--------
+**Block:** One or more contiguous units of the trace. At minimum, the unit of a trace is the load address of an instruction.
+
+**Block Metadata:** Metadata associated with each *block*. For processor traces, some metadata examples are the number of instructions in the block or information on what functions are called in the block.
+
+**Layer:** The representation of trace data between passes. For Intel PT there are two types of layers:
+
+ **Instruction Layer:** Composed of the load addresses of the instructions in the trace. In an effort to save space,
+ metadata is only stored for instructions that are of interest, not every instruction in the trace. HTR contains a
+ single instruction layer.
+
+ **Block Layer:** Composed of blocks - a block in *layer n* refers to a sequence of blocks in *layer n - 1*. A block in 
+ *layer 1* refers to a sequence of instructions in *layer 0* (the instruction layer). Metadata is stored for each block in 
+ a block layer. HTR contains one or more block layers.
+
+**Pass:** A transformation applied to a *layer* that generates a new *layer* that is a more summarized, consolidated representation of the trace data.
+A pass merges instructions/blocks based on its specific purpose - for example, a pass designed to summarize a processor trace by function calls would merge all the blocks of a function into a single block representing the entire function.
+
+The image below illustrates the transformation of a trace's representation (HTR):
+
+.. image:: media/htr-example.png
+
+Passes
+------
+A *pass* is applied to a *layer* to extract useful information (summarization) and compress the trace representation into a new *layer*. The idea is to have a series of passes where each pass specializes in extracting certain information about the trace. Examples of potential passes include identifying functions, identifying loops, or something more general such as identifying long sequences of instructions that are repeated (i.e. Basic Super Block). Below you will find a description of each pass currently implemented in lldb.
+
+**Basic Super Block Reduction**
+
+A "basic super block" is the longest sequence of blocks that always occurs in the same order. (The concept is akin to "Basic Block" in compiler theory, but refers to dynamic occurrences rather than CFG nodes.)
+
+The image below shows the "basic super blocks" of the sequence. Each unique "basic super block" is marked with a different color:
+
+.. image:: media/basic_super_block_pass.png
+
+*Procedure to find all super blocks:*
+
+- For each block, compute the number of distinct predecessor and successor blocks.
+
+ - **Predecessor** - the block that occurs directly before (to the left of) the current block
+ - **Successor** - the block that occurs directly after (to the right of) the current block
+
+- A block with more than one distinct successor is always the start of a super block. The super block continues until the next block with more than one distinct predecessor or successor.
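The procedure above can be sketched in a few lines of C++. This is one possible reading of the rule, not lldb's implementation in `TraceHTR.cpp`: it assumes a layer is a plain sequence of integer block ids, and it cuts before any block that has more than one distinct predecessor and after any block that has more than one distinct successor. `BlockId` and `FindSuperBlocks` are hypothetical names.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <vector>

using BlockId = int; // hypothetical: a small integer identifying a block

// Split a layer (a sequence of block ids) into runs that could each become
// one block in the next layer.
std::vector<std::vector<BlockId>>
FindSuperBlocks(const std::vector<BlockId> &layer) {
  // Step 1: collect the distinct predecessors and successors of each block.
  std::map<BlockId, std::set<BlockId>> preds, succs;
  for (std::size_t i = 0; i + 1 < layer.size(); ++i) {
    succs[layer[i]].insert(layer[i + 1]);
    preds[layer[i + 1]].insert(layer[i]);
  }

  // Step 2: walk the layer and cut wherever the order can vary.
  std::vector<std::vector<BlockId>> super_blocks;
  for (std::size_t i = 0; i < layer.size(); ++i) {
    bool cut = i == 0 ||
               preds[layer[i]].size() > 1 ||   // reached from several blocks
               succs[layer[i - 1]].size() > 1; // previous block branches
    if (cut)
      super_blocks.emplace_back();
    super_blocks.back().push_back(layer[i]);
  }
  return super_blocks;
}
```

For the layer `1 2 3 1 2 3 4`, block `3` has two distinct successors (`1` and `4`), so the sketch yields the runs `1 2 3`, `1 2 3`, and `4`. The repeated run `1 2 3` is what a pass could then merge into a single block of the next layer.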
diff --git a/lldb/source/Plugins/TraceExporter/CMakeLists.txt b/lldb/source/Plugins/TraceExporter/CMakeLists.txt
@@ -1 +1,2 @@
+add_subdirectory(common)
 add_subdirectory(ctf)
diff --git a/lldb/source/Plugins/TraceExporter/common/CMakeLists.txt b/lldb/source/Plugins/TraceExporter/common/CMakeLists.txt
@@ -0,0 +1,7 @@
+add_lldb_library(lldbPluginTraceExporterCommon
+  TraceHTR.cpp
+
+  LINK_LIBS
+    lldbCore
+    lldbTarget
+  )