Skip to content

Commit

Permalink
[trace][intelpt] Add task timer classes
Browse files Browse the repository at this point in the history
I'm adding two new classes that can be used to measure the duration of long
tasks as process and thread level, e.g. decoding, fetching data from
lldb-server, etc. In this first patch, I'm using it to measure the time it takes
to decode each thread, which is printed out with the `dump info` command. In a
later patch I'll start adding process-level tasks and I might move these
classes to the upper Trace level, instead of having them in the intel-pt
plugin. I might need to do that anyway in the future when we have to
measure HTR. For now, I want to keep the impact of this change minimal.

With it, I was able to generate the following info of a very big trace:

```
(lldb) thread trace dump info                                                                                                            Trace technology: intel-pt

thread #1: tid = 616081
  Total number of instructions: 9729366

  Memory usage:
    Raw trace size: 1024 KiB
    Total approximate memory usage (excluding raw trace): 123517.34 KiB
    Average memory usage per instruction (excluding raw trace): 13.00 bytes

  Timing:
    Decoding instructions: 1.62s

  Errors:
    Number of TSC decoding errors: 0
```

As seen above, it took 1.62 seconds to decode 9.7M instructions. This is great
news, as we don't need to do any optimization work in this area.

Differential Revision: https://reviews.llvm.org/D123357
  • Loading branch information
walter-erquinigo committed Apr 12, 2022
1 parent 9f52605 commit bdf3e7e
Show file tree
Hide file tree
Showing 9 changed files with 163 additions and 24 deletions.
1 change: 1 addition & 0 deletions lldb/source/Plugins/Trace/intel-pt/CMakeLists.txt
Expand Up @@ -16,6 +16,7 @@ lldb_tablegen(TraceIntelPTCommandOptions.inc -gen-lldb-option-defs
add_lldb_library(lldbPluginTraceIntelPT PLUGIN
CommandObjectTraceStartIntelPT.cpp
DecodedThread.cpp
TaskTimer.cpp
LibiptDecoder.cpp
ThreadDecoder.cpp
TraceCursorIntelPT.cpp
Expand Down
2 changes: 2 additions & 0 deletions lldb/source/Plugins/Trace/intel-pt/DecodedThread.h
Expand Up @@ -239,6 +239,8 @@ class DecodedThread : public std::enable_shared_from_this<DecodedThread> {
llvm::Optional<size_t> m_raw_trace_size;
/// All occurrences of libipt errors when decoding TSCs.
LibiptErrors m_tsc_errors;
/// Total amount of time spent decoding.
std::chrono::milliseconds m_total_decoding_time{0};
};

using DecodedThreadSP = std::shared_ptr<DecodedThread>;
Expand Down
22 changes: 22 additions & 0 deletions lldb/source/Plugins/Trace/intel-pt/TaskTimer.cpp
@@ -0,0 +1,22 @@
#include "TaskTimer.h"

using namespace lldb;
using namespace lldb_private;
using namespace lldb_private::trace_intel_pt;
using namespace llvm;

void ThreadTaskTimer::ForEachTimedTask(
std::function<void(const std::string &event,
std::chrono::milliseconds duration)>
callback) {
for (const auto &kv : m_timed_tasks) {
callback(kv.first, kv.second);
}
}

ThreadTaskTimer &TaskTimer::ForThread(lldb::tid_t tid) {
auto it = m_thread_timers.find(tid);
if (it == m_thread_timers.end())
it = m_thread_timers.try_emplace(tid, ThreadTaskTimer{}).first;
return it->second;
}
75 changes: 75 additions & 0 deletions lldb/source/Plugins/Trace/intel-pt/TaskTimer.h
@@ -0,0 +1,75 @@
//===-- TaskTimer.h ---------------------------------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef LLDB_SOURCE_PLUGINS_TRACE_INTEL_PT_TASKTIMER_H
#define LLDB_SOURCE_PLUGINS_TRACE_INTEL_PT_TASKTIMER_H

#include "lldb/lldb-types.h"

#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/StringRef.h"

#include <chrono>
#include <functional>
#include <unordered_map>

namespace lldb_private {
namespace trace_intel_pt {

/// Class used to track the duration of long running tasks related to a single
/// thread for reporting.
class ThreadTaskTimer {
public:
/// Execute the given \p task and record its duration.
///
/// \param[in] name
/// The name used to identify this task for reporting.
///
/// \param[in] task
/// The task function.
///
/// \return
/// The return value of the task.
template <class R> R TimeTask(llvm::StringRef name, std::function<R()> task) {
auto start = std::chrono::steady_clock::now();
R result = task();
auto end = std::chrono::steady_clock::now();
std::chrono::milliseconds duration =
std::chrono::duration_cast<std::chrono::milliseconds>(end - start);
m_timed_tasks.insert({name.str(), duration});
return result;
}

/// Executive the given \p callback on each recorded task.
///
/// \param[in] callback
/// The first parameter of the callback is the name of the recorded task,
/// and the second parameter is the duration of that task.
void ForEachTimedTask(std::function<void(const std::string &name,
std::chrono::milliseconds duration)>
callback);

private:
std::unordered_map<std::string, std::chrono::milliseconds> m_timed_tasks;
};

/// Class used to track the duration of long running tasks for reporting.
class TaskTimer {
public:
/// \return
/// The timer object for the given thread.
ThreadTaskTimer &ForThread(lldb::tid_t tid);

private:
llvm::DenseMap<lldb::tid_t, ThreadTaskTimer> m_thread_timers;
};

} // namespace trace_intel_pt
} // namespace lldb_private

#endif // LLDB_SOURCE_PLUGINS_TRACE_INTEL_PT_TASKTIMER_H
25 changes: 15 additions & 10 deletions lldb/source/Plugins/Trace/intel-pt/ThreadDecoder.cpp
Expand Up @@ -30,15 +30,20 @@ DecodedThreadSP ThreadDecoder::Decode() {
}

DecodedThreadSP ThreadDecoder::DoDecode() {
DecodedThreadSP decoded_thread_sp =
std::make_shared<DecodedThread>(m_thread_sp);

Error err = m_trace.OnThreadBufferRead(
m_thread_sp->GetID(), [&](llvm::ArrayRef<uint8_t> data) {
DecodeTrace(*decoded_thread_sp, m_trace, data);
return Error::success();
return m_trace.GetTimer()
.ForThread(m_thread_sp->GetID())
.TimeTask<DecodedThreadSP>("Decoding instructions", [&]() {
DecodedThreadSP decoded_thread_sp =
std::make_shared<DecodedThread>(m_thread_sp);

Error err = m_trace.OnThreadBufferRead(
m_thread_sp->GetID(), [&](llvm::ArrayRef<uint8_t> data) {
DecodeTrace(*decoded_thread_sp, m_trace, data);
return Error::success();
});

if (err)
decoded_thread_sp->AppendError(std::move(err));
return decoded_thread_sp;
});
if (err)
decoded_thread_sp->AppendError(std::move(err));
return decoded_thread_sp;
}
27 changes: 21 additions & 6 deletions lldb/source/Plugins/Trace/intel-pt/TraceIntelPT.cpp
Expand Up @@ -117,19 +117,32 @@ void TraceIntelPT::DumpTraceInfo(Thread &thread, Stream &s, bool verbose) {
size_t insn_len = decoded_trace_sp->GetInstructionsCount();
size_t mem_used = decoded_trace_sp->CalculateApproximateMemoryUsage();

s.Format(" Raw trace size: {0} KiB\n", *raw_size / 1024);
s.Format(" Total number of instructions: {0}\n", insn_len);
s.Format(" Total approximate memory usage: {0:2} KiB\n",
(double)mem_used / 1024);

s.PutCString("\n Memory usage:\n");
s.Format(" Raw trace size: {0} KiB\n", *raw_size / 1024);
s.Format(
" Total approximate memory usage (excluding raw trace): {0:2} KiB\n",
(double)mem_used / 1024);
if (insn_len != 0)
s.Format(" Average memory usage per instruction: {0:2} bytes\n",
s.Format(" Average memory usage per instruction (excluding raw trace): "
"{0:2} bytes\n",
(double)mem_used / insn_len);

s.PutCString("\n Timing:\n");
GetTimer()
.ForThread(thread.GetID())
.ForEachTimedTask(
[&](const std::string &name, std::chrono::milliseconds duration) {
s.Format(" {0}: {1:2}s\n", name, duration.count() / 1000.0);
});

s.PutCString("\n Errors:\n");
const DecodedThread::LibiptErrors &tsc_errors =
decoded_trace_sp->GetTscErrors();
s.Format("\n Number of TSC decoding errors: {0}\n", tsc_errors.total_count);
s.Format(" Number of TSC decoding errors: {0}\n", tsc_errors.total_count);
for (const auto &error_message_to_count : tsc_errors.libipt_errors) {
s.Format(" {0}: {1}\n", error_message_to_count.first,
s.Format(" {0}: {1}\n", error_message_to_count.first,
error_message_to_count.second);
}
}
Expand Down Expand Up @@ -358,3 +371,5 @@ Error TraceIntelPT::OnThreadBufferRead(lldb::tid_t tid,
OnBinaryDataReadCallback callback) {
return OnThreadBinaryDataRead(tid, "threadTraceBuffer", callback);
}

TaskTimer &TraceIntelPT::GetTimer() { return m_task_timer; }
6 changes: 6 additions & 0 deletions lldb/source/Plugins/Trace/intel-pt/TraceIntelPT.h
Expand Up @@ -9,6 +9,7 @@
#ifndef LLDB_SOURCE_PLUGINS_TRACE_INTEL_PT_TRACEINTELPT_H
#define LLDB_SOURCE_PLUGINS_TRACE_INTEL_PT_TRACEINTELPT_H

#include "TaskTimer.h"
#include "ThreadDecoder.h"
#include "TraceIntelPTSessionFileParser.h"
#include "lldb/Utility/FileSpec.h"
Expand Down Expand Up @@ -150,6 +151,10 @@ class TraceIntelPT : public Trace {
/// return \a nullptr.
Process *GetLiveProcess();

/// \return
/// The timer object for this trace.
TaskTimer &GetTimer();

private:
friend class TraceIntelPTSessionFileParser;

Expand Down Expand Up @@ -184,6 +189,7 @@ class TraceIntelPT : public Trace {
std::map<lldb::tid_t, std::unique_ptr<ThreadDecoder>> m_thread_decoders;
/// Error gotten after a failed live process update, if any.
llvm::Optional<std::string> m_live_refresh_error;
TaskTimer m_task_timer;
};

} // namespace trace_intel_pt
Expand Down
15 changes: 11 additions & 4 deletions lldb/test/API/commands/trace/TestTraceDumpInfo.py
Expand Up @@ -38,9 +38,16 @@ def testDumpRawTraceSize(self):
substrs=['''Trace technology: intel-pt
thread #1: tid = 3842849
Raw trace size: 4 KiB
Total number of instructions: 21
Total approximate memory usage: 0.27 KiB
Average memory usage per instruction: 13.00 bytes
Number of TSC decoding errors: 0'''])
Memory usage:
Raw trace size: 4 KiB
Total approximate memory usage (excluding raw trace): 0.27 KiB
Average memory usage per instruction (excluding raw trace): 13.00 bytes
Timing:
Decoding instructions: ''', '''s
Errors:
Number of TSC decoding errors: 0'''],
patterns=["Decoding instructions: \d.\d\ds"])
14 changes: 10 additions & 4 deletions lldb/test/API/commands/trace/TestTraceLoad.py
Expand Up @@ -36,12 +36,18 @@ def testLoadTrace(self):
self.expect("thread trace dump info", substrs=['''Trace technology: intel-pt
thread #1: tid = 3842849
Raw trace size: 4 KiB
Total number of instructions: 21
Total approximate memory usage: 0.27 KiB
Average memory usage per instruction: 13.00 bytes
Number of TSC decoding errors: 0'''])
Memory usage:
Raw trace size: 4 KiB
Total approximate memory usage (excluding raw trace): 0.27 KiB
Average memory usage per instruction (excluding raw trace): 13.00 bytes
Timing:
Decoding instructions: ''', '''s
Errors:
Number of TSC decoding errors: 0'''])

def testLoadInvalidTraces(self):
src_dir = self.getSourceDir()
Expand Down

0 comments on commit bdf3e7e

Please sign in to comment.