Skip to content

[LLDB][Telemetry]Define DebuggerTelemetryInfo and related methods #127696

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 29 commits into from
Mar 3, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
29 commits
Select commit Hold shift + click to select a range
24e9f78
[LLDB][Telemetry]Define DebuggerTelemetryInfo and related methods
oontvoo Feb 18, 2025
496c370
Update lldb/include/lldb/Core/Telemetry.h
oontvoo Feb 19, 2025
11efc09
Update lldb/source/Core/Telemetry.cpp
oontvoo Feb 19, 2025
5aec284
adddressed review comments:
oontvoo Feb 20, 2025
4b6a4bd
remove unused fields
oontvoo Feb 24, 2025
83bb7c4
Merge branch 'main' into debugger_info
oontvoo Feb 26, 2025
4dd0782
addressed review comments
oontvoo Feb 26, 2025
7ad8360
format
oontvoo Feb 26, 2025
d7b1275
try reporting the crash from a crash handler
oontvoo Feb 26, 2025
302552a
revert crash handler changes
oontvoo Feb 27, 2025
c00cd54
Update lldb/source/Core/Telemetry.cpp
oontvoo Feb 27, 2025
597185f
remove unused headers
oontvoo Feb 27, 2025
41f6649
remove miscinfo kindtype
oontvoo Feb 27, 2025
169449b
Init debugger_id to LLDB_INVALID_UID
oontvoo Feb 27, 2025
7420811
addressed review comment
oontvoo Feb 27, 2025
0fb078d
Merge branch 'debugger_info' of github.com:oontvoo/llvm-project into …
oontvoo Feb 27, 2025
68f95ab
removed unused
oontvoo Feb 27, 2025
d586956
add smoke check
oontvoo Feb 27, 2025
8b7afd6
remove unused headers
oontvoo Feb 27, 2025
21b8bdf
added test
oontvoo Feb 27, 2025
7776ea1
formatting
oontvoo Feb 27, 2025
c70bd5d
Update lldb/include/lldb/Core/Telemetry.h
oontvoo Feb 27, 2025
bcd793e
change template to Info
oontvoo Feb 27, 2025
4ba0431
formatting
oontvoo Feb 28, 2025
6027a69
Update lldb/source/Core/Debugger.cpp
oontvoo Feb 28, 2025
c011531
address review comments
oontvoo Feb 28, 2025
a9fd8ad
Merge branch 'debugger_info' of github.com:oontvoo/llvm-project into …
oontvoo Feb 28, 2025
17ced2e
formatting
oontvoo Mar 1, 2025
d07dc38
Merge branch 'main' into debugger_info
oontvoo Mar 3, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
81 changes: 79 additions & 2 deletions lldb/include/lldb/Core/Telemetry.h
Original file line number Diff line number Diff line change
Expand Up @@ -11,8 +11,10 @@

#include "lldb/Core/StructuredDataImpl.h"
#include "lldb/Interpreter/CommandReturnObject.h"
#include "lldb/Utility/LLDBLog.h"
#include "lldb/Utility/StructuredData.h"
#include "lldb/lldb-forward.h"
#include "llvm/ADT/FunctionExtras.h"
#include "llvm/ADT/StringExtras.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/JSON.h"
Expand All @@ -22,13 +24,20 @@
#include <memory>
#include <optional>
#include <string>
#include <unordered_map>

namespace lldb_private {
namespace telemetry {

// We expect each (direct) subclass of LLDBTelemetryInfo to
// have an LLDBEntryKind in the form 0b11xxxxxxxx
// Specifically:
// - Length: 8 bits
// - First two bits (MSB) must be 11 - the common prefix
// If any of the subclass has descendents, those descendents
// must have their LLDBEntryKind in the similar form (ie., share common prefix)
struct LLDBEntryKind : public ::llvm::telemetry::EntryKind {
static const llvm::telemetry::KindType BaseInfo = 0b11000;
static const llvm::telemetry::KindType BaseInfo = 0b11000000;
static const llvm::telemetry::KindType DebuggerInfo = 0b11000100;
};

/// Defines a convenient type for timestamp of various events.
Expand All @@ -41,6 +50,7 @@ struct LLDBBaseTelemetryInfo : public llvm::telemetry::TelemetryInfo {
std::optional<SteadyTimePoint> end_time;
// TBD: could add some memory stats here too?

lldb::user_id_t debugger_id = LLDB_INVALID_UID;
Debugger *debugger;

// For dyn_cast, isa, etc operations.
Expand All @@ -56,26 +66,93 @@ struct LLDBBaseTelemetryInfo : public llvm::telemetry::TelemetryInfo {
void serialize(llvm::telemetry::Serializer &serializer) const override;
};

struct DebuggerInfo : public LLDBBaseTelemetryInfo {
std::string lldb_version;

bool is_exit_entry = false;

DebuggerInfo() = default;

llvm::telemetry::KindType getKind() const override {
return LLDBEntryKind::DebuggerInfo;
}

static bool classof(const llvm::telemetry::TelemetryInfo *T) {
// Subclasses of this is also acceptable
return (T->getKind() & LLDBEntryKind::DebuggerInfo) ==
LLDBEntryKind::DebuggerInfo;
}

void serialize(llvm::telemetry::Serializer &serializer) const override;
};

/// The base Telemetry manager instance in LLDB.
/// This class declares additional instrumentation points
/// applicable to LLDB.
class TelemetryManager : public llvm::telemetry::Manager {
public:
llvm::Error preDispatch(llvm::telemetry::TelemetryInfo *entry) override;

const llvm::telemetry::Config *GetConfig();

virtual llvm::StringRef GetInstanceName() const = 0;

static TelemetryManager *GetInstance();

static TelemetryManager *GetInstanceIfEnabled();

protected:
TelemetryManager(std::unique_ptr<llvm::telemetry::Config> config);

static void SetInstance(std::unique_ptr<TelemetryManager> manger);

private:
std::unique_ptr<llvm::telemetry::Config> m_config;
// Each instance of a TelemetryManager is assigned a unique ID.
const std::string m_id;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we store this as a UUID (and convert it to a string in the serializer)? I think having a static method in the UUID class that generates a random UUID might be generally useful.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What use case do you have in mind? Right now this field is read-only (copied onto multiple entry objects).

So it seems jsut storing a string directly seems more straightforward.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Putting the generation code into the UUID class seems reasonable. I'm fairly ambivalent about the storage format. One reason for storing it in the UUID form might be if we think that someone could be interested in storing/transmitting the UUID in a binary form (because it's more compact?).

(One thing complicating that is the fact that you're currently appending the timestamp to the string. I don't think that's necessary because the random data should be unique enough, and the (unlikely) case of a missing random source could be handled by initializing a pseudo-random generator with the timestamp and then using that to generate a UUID)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One reason for storing it in the UUID form might be if we think that someone could be interested in storing/transmitting the UUID in a binary form

This is the TelemetryManager - it is never transmitted or serialised anywhere. So why does compactness matter?

If you mean storing the id as UUID in the TelemetryInfo object, then we'd need to update the llvm base interface. However, the reason the id was a string in the base interface was because people wanted flexibility in what the id could be in each implementation.
Also the downside of storing it as UUID object is that you may have different telemetry::Serializer objects generating different values for the same UUID (ie.,different separators?). That's potentially problematic because ultimately, the point of this field is for grouping entries from the same session. We need the key to be identical across entries and not "accidentally" differ because of serialisation methods.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you mean storing the id as UUID in the TelemetryInfo object, then we'd need to update the llvm base interface. However, the reason the id was a string in the base interface was because people wanted flexibility in what the id could be in each implementation.

I mean that, but yes, the fact that this is in the llvm base class make this tricky. I think this is fine then.


static std::unique_ptr<TelemetryManager> g_instance;
};

/// Helper RAII class for collecting telemetry.
template <typename Info> struct ScopedDispatcher {
// The debugger pointer is optional because we may not have a debugger yet.
// In that case, caller must set the debugger later.
ScopedDispatcher(llvm::unique_function<void(Info *info)> callback,
Debugger *debugger = nullptr)
: m_callback(std::move(callback)) {
// Start the timer.
m_start_time = std::chrono::steady_clock::now();
m_info.debugger = debugger;
}

void SetDebugger(Debugger *debugger) { m_info.debugger = debugger; }

~ScopedDispatcher() {
// If Telemetry is disabled (either at buildtime or runtime),
// then don't do anything.
TelemetryManager *manager = TelemetryManager::GetInstanceIfEnabled();
if (!manager)
return;

m_info.start_time = m_start_time;
// Populate common fields that we can only set now.
m_info.end_time = std::chrono::steady_clock::now();
// The callback will set the remaining fields.
m_callback(&m_info);
// And then we dispatch.
if (llvm::Error er = manager->dispatch(&m_info)) {
LLDB_LOG_ERROR(GetLog(LLDBLog::Object), std::move(er),
"Failed to dispatch entry of type: {0}", m_info.getKind());
}
}

private:
SteadyTimePoint m_start_time;
llvm::unique_function<void(Info *info)> m_callback;
Info m_info;
};

} // namespace telemetry
} // namespace lldb_private
#endif // LLDB_CORE_TELEMETRY_H
17 changes: 17 additions & 0 deletions lldb/source/Core/Debugger.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@
#include "lldb/Core/PluginManager.h"
#include "lldb/Core/Progress.h"
#include "lldb/Core/StreamAsynchronousIO.h"
#include "lldb/Core/Telemetry.h"
#include "lldb/DataFormatters/DataVisualization.h"
#include "lldb/Expression/REPL.h"
#include "lldb/Host/File.h"
Expand Down Expand Up @@ -52,6 +53,7 @@
#include "lldb/Utility/State.h"
#include "lldb/Utility/Stream.h"
#include "lldb/Utility/StreamString.h"
#include "lldb/Version/Version.h"
#include "lldb/lldb-enumerations.h"

#if defined(_WIN32)
Expand All @@ -62,13 +64,15 @@
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/ADT/iterator.h"
#include "llvm/Config/llvm-config.h"
#include "llvm/Support/DynamicLibrary.h"
#include "llvm/Support/FileSystem.h"
#include "llvm/Support/Process.h"
#include "llvm/Support/ThreadPool.h"
#include "llvm/Support/Threading.h"
#include "llvm/Support/raw_ostream.h"

#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <cstring>
Expand Down Expand Up @@ -761,7 +765,14 @@ void Debugger::InstanceInitialize() {

DebuggerSP Debugger::CreateInstance(lldb::LogOutputCallback log_callback,
void *baton) {
lldb_private::telemetry::ScopedDispatcher<
lldb_private::telemetry::DebuggerInfo>
helper([](lldb_private::telemetry::DebuggerInfo *entry) {
entry->lldb_version = lldb_private::GetVersion();
});
DebuggerSP debugger_sp(new Debugger(log_callback, baton));
helper.SetDebugger(debugger_sp.get());
Comment on lines +768 to +774
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is definitely better, but lambdas tend to add some boilerplate of their own. They may be useful in some cases (if you need to set some field at the very end) but I think that in many cases (this one included) we could just set the field directly, so I want to float this idea: I think this would be shorter if we had the dispatcher object expose the telemetry entry it is managing directly:

Suggested change
lldb_private::telemetry::ScopedDispatcher<
lldb_private::telemetry::DebuggerInfo>
helper([](lldb_private::telemetry::DebuggerInfo *entry) {
entry->lldb_version = lldb_private::GetVersion();
});
DebuggerSP debugger_sp(new Debugger(log_callback, baton));
helper.SetDebugger(debugger_sp.get());
telemetry::ScopedDispatcher<telemetry::DebuggerInfo> helper;
DebuggerSP debugger_sp(new Debugger(log_callback, baton));
if (telemetry::DebuggerInfo *info = helper.GetEntry()) {// returns NULL if telemetry is disabled
info->debugger = debugger_sp.get();
info->lldb_version = lldb_private::GetVersion();
}

(and if we do need a callback, I don't see a reason why this couldn't coexist with that)

Any thoughts @JDevlieghere ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer not exposing the managed TelemetryInfo because in the next PR, i'm going to move it to being a local var in the execute function. (Reasons being I need to dispatch/execute more than one callbacks - one at the beginning and one at the end of the function. )

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a second thought, how about we make a "NoOp" TelemetryManager that does not do anything in case when Telemetry is disabled? That way, we wouldn't need to check for the flag/ptr.


if (g_debugger_list_ptr && g_debugger_list_mutex_ptr) {
std::lock_guard<std::recursive_mutex> guard(*g_debugger_list_mutex_ptr);
g_debugger_list_ptr->push_back(debugger_sp);
Expand Down Expand Up @@ -987,6 +998,12 @@ void Debugger::Clear() {
// static void Debugger::Destroy(lldb::DebuggerSP &debugger_sp);
// static void Debugger::Terminate();
llvm::call_once(m_clear_once, [this]() {
telemetry::ScopedDispatcher<telemetry::DebuggerInfo> helper(
[this](lldb_private::telemetry::DebuggerInfo *info) {
assert(this == info->debugger);
info->is_exit_entry = true;
},
this);
ClearIOHandlers();
StopIOHandlerThread();
StopEventHandlerThread();
Expand Down
63 changes: 47 additions & 16 deletions lldb/source/Core/Telemetry.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -7,8 +7,10 @@
//===----------------------------------------------------------------------===//
#include "lldb/Core/Telemetry.h"
#include "lldb/Core/Debugger.h"
#include "lldb/Core/Telemetry.h"
#include "lldb/Utility/LLDBLog.h"
#include "lldb/Utility/UUID.h"
#include "lldb/Version/Version.h"
#include "lldb/lldb-enumerations.h"
#include "lldb/lldb-forward.h"
#include "llvm/ADT/StringRef.h"
Expand All @@ -30,6 +32,26 @@ static uint64_t ToNanosec(const SteadyTimePoint Point) {
return std::chrono::nanoseconds(Point.time_since_epoch()).count();
}

// Generate a unique string. This should be unique across different runs.
// We build such string by combining three parts:
// <16 random bytes>_<timestamp>
// This reduces the chances of getting the same UUID, even when the same
// user runs the two copies of binary at the same time.
static std::string MakeUUID() {
uint8_t random_bytes[16];
std::string randomString = "_";
if (auto ec = llvm::getRandomBytes(random_bytes, 16)) {
LLDB_LOG(GetLog(LLDBLog::Object),
"Failed to generate random bytes for UUID: {0}", ec.message());
} else {
randomString = UUID(random_bytes).GetAsString();
}

return llvm::formatv(
"{0}_{1}", randomString,
std::chrono::steady_clock::now().time_since_epoch().count());
}

void LLDBBaseTelemetryInfo::serialize(Serializer &serializer) const {
serializer.write("entry_kind", getKind());
serializer.write("session_id", SessionId);
Expand All @@ -38,38 +60,47 @@ void LLDBBaseTelemetryInfo::serialize(Serializer &serializer) const {
serializer.write("end_time", ToNanosec(end_time.value()));
}

[[maybe_unused]] static std::string MakeUUID(Debugger *debugger) {
uint8_t random_bytes[16];
if (auto ec = llvm::getRandomBytes(random_bytes, 16)) {
LLDB_LOG(GetLog(LLDBLog::Object),
"Failed to generate random bytes for UUID: {0}", ec.message());
// Fallback to using timestamp + debugger ID.
return llvm::formatv(
"{0}_{1}", std::chrono::steady_clock::now().time_since_epoch().count(),
debugger->GetID());
}
return UUID(random_bytes).GetAsString();
void DebuggerInfo::serialize(Serializer &serializer) const {
LLDBBaseTelemetryInfo::serialize(serializer);

serializer.write("lldb_version", lldb_version);
serializer.write("is_exit_entry", is_exit_entry);
}

TelemetryManager::TelemetryManager(std::unique_ptr<Config> config)
: m_config(std::move(config)) {}
: m_config(std::move(config)), m_id(MakeUUID()) {}

llvm::Error TelemetryManager::preDispatch(TelemetryInfo *entry) {
// Do nothing for now.
// In up-coming patch, this would be where the manager
// attach the session_uuid to the entry.
// Assign the manager_id, and debugger_id, if available, to this entry.
LLDBBaseTelemetryInfo *lldb_entry = llvm::cast<LLDBBaseTelemetryInfo>(entry);
lldb_entry->SessionId = m_id;
if (Debugger *debugger = lldb_entry->debugger)
lldb_entry->debugger_id = debugger->GetID();
return llvm::Error::success();
}

const Config *TelemetryManager::GetConfig() { return m_config.get(); }

std::unique_ptr<TelemetryManager> TelemetryManager::g_instance = nullptr;
TelemetryManager *TelemetryManager::GetInstance() {
if (!Config::BuildTimeEnableTelemetry)
return nullptr;
return g_instance.get();
}

TelemetryManager *TelemetryManager::GetInstanceIfEnabled() {
// Telemetry may be enabled at build time but disabled at runtime.
if (TelemetryManager *ins = TelemetryManager::GetInstance()) {
if (ins->GetConfig()->EnableTelemetry)
return ins;
}

return nullptr;
}

void TelemetryManager::SetInstance(std::unique_ptr<TelemetryManager> manager) {
g_instance = std::move(manager);
if (Config::BuildTimeEnableTelemetry)
g_instance = std::move(manager);
}

} // namespace telemetry
Expand Down
Loading
Loading