Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC][LLDB] Telemetry in LLDB #87815

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

Conversation

oontvoo
Copy link
Member

@oontvoo oontvoo commented Apr 5, 2024

Implement telemetry in LLDB.

(See previous discussions at https://discourse.llvm.org/t/rfc-lldb-telemetry-metrics/64588)

@oontvoo oontvoo requested review from labath and cmtice April 5, 2024 18:31
@llvmbot llvmbot added the lldb label Apr 5, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Apr 5, 2024

@llvm/pr-subscribers-lldb

Author: Vy Nguyen (oontvoo)

Changes

Implement telemetry in LLDB.

(See previous discussions at https://discourse.llvm.org/t/rfc-lldb-telemetry-metrics/64588)


Patch is 48.73 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/87815.diff

12 Files Affected:

  • (modified) lldb/include/lldb/API/SBDebugger.h (+4)
  • (modified) lldb/include/lldb/Core/Debugger.h (+10)
  • (added) lldb/include/lldb/Core/Telemetry.h (+237)
  • (modified) lldb/include/lldb/Target/Process.h (+3)
  • (modified) lldb/source/API/SBDebugger.cpp (+9)
  • (modified) lldb/source/Core/CMakeLists.txt (+1)
  • (modified) lldb/source/Core/Debugger.cpp (+31-1)
  • (added) lldb/source/Core/Telemetry.cpp (+620)
  • (modified) lldb/source/Interpreter/CommandInterpreter.cpp (+34-10)
  • (modified) lldb/source/Target/Process.cpp (+6-1)
  • (modified) lldb/source/Target/Target.cpp (+16-1)
  • (modified) lldb/tools/lldb-dap/DAP.cpp (+13-5)
diff --git a/lldb/include/lldb/API/SBDebugger.h b/lldb/include/lldb/API/SBDebugger.h
index 62b2f91f5076d5..c4b51fc925f094 100644
--- a/lldb/include/lldb/API/SBDebugger.h
+++ b/lldb/include/lldb/API/SBDebugger.h
@@ -9,10 +9,12 @@
 #ifndef LLDB_API_SBDEBUGGER_H
 #define LLDB_API_SBDEBUGGER_H
 
+#include <chrono>
 #include <cstdio>
 
 #include "lldb/API/SBDefines.h"
 #include "lldb/API/SBPlatform.h"
+#include "third_party/llvm/llvm-project/lldb/include/lldb/API/SBStructuredData.h"
 
 namespace lldb_private {
 class CommandPluginInterfaceImplementation;
@@ -243,6 +245,8 @@ class LLDB_API SBDebugger {
 
   lldb::SBTarget GetDummyTarget();
 
+  void SendTelemetry(SBStructuredData *entry);
+
   // Return true if target is deleted from the target list of the debugger.
   bool DeleteTarget(lldb::SBTarget &target);
 
diff --git a/lldb/include/lldb/Core/Debugger.h b/lldb/include/lldb/Core/Debugger.h
index 418c2403d020f4..a3ecb7e4724374 100644
--- a/lldb/include/lldb/Core/Debugger.h
+++ b/lldb/include/lldb/Core/Debugger.h
@@ -19,6 +19,7 @@
 #include "lldb/Core/FormatEntity.h"
 #include "lldb/Core/IOHandler.h"
 #include "lldb/Core/SourceManager.h"
+#include "lldb/Core/Telemetry.h"
 #include "lldb/Core/UserSettingsController.h"
 #include "lldb/Host/HostThread.h"
 #include "lldb/Host/StreamFile.h"
@@ -38,6 +39,7 @@
 #include "lldb/lldb-private-enumerations.h"
 #include "lldb/lldb-private-types.h"
 #include "lldb/lldb-types.h"
+#include "third_party/llvm/llvm-project/lldb/include/lldb/Utility/StructuredData.h"
 
 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/StringMap.h"
@@ -145,6 +147,12 @@ class Debugger : public std::enable_shared_from_this<Debugger>,
 
   lldb::StreamFileSP GetErrorStreamSP() { return m_error_stream_sp; }
 
+  std::shared_ptr<TelemetryLogger> GetTelemetryLogger() {
+    return m_telemetry_logger;
+  }
+
+  void SendClientTelemetry(lldb_private::StructuredData::Object *entry);
+
   File &GetInputFile() { return *m_input_file_sp; }
 
   File &GetOutputFile() { return m_output_stream_sp->GetFile(); }
@@ -737,6 +745,7 @@ class Debugger : public std::enable_shared_from_this<Debugger>,
   uint32_t m_interrupt_requested = 0; ///< Tracks interrupt requests
   std::mutex m_interrupt_mutex;
 
+  std::shared_ptr<TelemetryLogger> m_telemetry_logger;
   // Events for m_sync_broadcaster
   enum {
     eBroadcastBitEventThreadIsListening = (1 << 0),
@@ -749,6 +758,7 @@ class Debugger : public std::enable_shared_from_this<Debugger>,
 
   Debugger(const Debugger &) = delete;
   const Debugger &operator=(const Debugger &) = delete;
+  TelemetryEventStats stats;
 };
 
 } // namespace lldb_private
diff --git a/lldb/include/lldb/Core/Telemetry.h b/lldb/include/lldb/Core/Telemetry.h
new file mode 100644
index 00000000000000..f021437772aa09
--- /dev/null
+++ b/lldb/include/lldb/Core/Telemetry.h
@@ -0,0 +1,237 @@
+#ifndef LLDB_CORE_TELEMETRY_H
+#define LLDB_CORE_TELEMETRY_H
+
+#include <chrono>
+#include <ctime>
+#include <memory>
+#include <optional>
+#include <string>
+
+#include "lldb/Interpreter/CommandReturnObject.h"
+#include "lldb/Utility/StructuredData.h"
+#include "lldb/lldb-forward.h"
+#include "llvm/ADT/StringExtras.h"
+#include "llvm/ADT/StringRef.h"
+
+namespace lldb_private {
+
+using SteadyTimePoint = std::chrono::time_point<std::chrono::steady_clock>;
+
+struct TelemetryEventStats {
+  // REQUIRED: Start time of event
+  SteadyTimePoint m_start;
+  // OPTIONAL: End time of event - may be empty if not meaningful.
+  std::optional<SteadyTimePoint> m_end;
+
+  // TBD: could add some memory stats here too?
+
+  TelemetryEventStats() = default;
+  TelemetryEventStats(SteadyTimePoint start) : m_start(start) {}
+  TelemetryEventStats(SteadyTimePoint start, SteadyTimePoint end)
+      : m_start(start), m_end(end) {}
+
+  std::optional<std::chrono::nanoseconds> Duration() const {
+    if (m_end.has_value())
+      return *m_end - m_start;
+    else
+      return std::nullopt;
+  }
+};
+
+struct LoggerConfig {
+  // If true, loggings will be enabled.
+  bool enable_logging;
+
+  // Additional destinations to send the logged entries.
+  // Could be stdout, stderr, or some local paths.
+  // Note: these are destinations are __in addition to__ whatever the default
+  // destination(s) are, as implemented by vendors.
+  std::vector<std::string> additional_destinations;
+};
+
+// The base class contains the basic set of data.
+// Downstream implementations can add more fields as needed.
+struct BaseTelemetryEntry {
+  // A "session" corresponds to every time lldb starts.
+  // All entries emitted for the same session will have
+  // the same session_uuid
+  std::string session_uuid;
+
+  TelemetryEventStats stats;
+
+  // Counting number of entries.
+  // (For each set of entries with the same session_uuid, this value should
+  // be unique for each entry)
+  size_t counter;
+
+  virtual ~BaseTelemetryEntry() = default;
+  virtual std::string ToString() const;
+};
+
+struct ExitDescription {
+  int exit_code;
+  std::string description;
+};
+
+struct DebuggerInfoEntry : public BaseTelemetryEntry {
+  std::string username;
+  std::string lldb_git_sha;
+  std::string lldb_path;
+  std::string cwd;
+
+  // The debugger exit info. Not populated if this entry was emitted for startup
+  // event.
+  std::optional<ExitDescription> lldb_exit;
+
+  std::string ToString() const override;
+};
+
+struct TargetInfoEntry : public BaseTelemetryEntry {
+  // All entries emitted for the same SBTarget will have the same
+  // target_uuid.
+  std::string target_uuid;
+  std::string file_format;
+
+  std::string binary_path;
+  size_t binary_size;
+
+  // The process(target) exit info. Not populated of this entry was emitted for
+  // startup event.
+  std::optional<ExitDescription> process_exit;
+
+  std::string ToString() const override;
+};
+
+// Entry from client (eg., SB-API)
+struct ClientTelemetryEntry : public BaseTelemetryEntry {
+  std::string request_name;
+  std::string error_msg;
+  std::string ToString() const override;
+};
+
+struct CommandInfoEntry : public BaseTelemetryEntry {
+  // If the command is/can be associated with a target entry,
+  // this field contains that target's UUID.
+  // <EMPTY> otherwise.
+  std::string target_uuid;
+  std::string command_uuid;
+
+  // Eg., "breakpoint set"
+  std::string command_name;
+
+  // !!NOTE!!: The following fields may be omitted due to PII risk.
+  // (Configurable via the LoggerConfig struct)
+  std::string original_command;
+  std::string args;
+
+  ExitDescription exit_description;
+
+  std::string ToString() const override;
+};
+
+// The "catch-all" entry to store a set of custom/non-standard
+// data.
+struct MiscInfoEntry : public BaseTelemetryEntry {
+  // If the event is/can be associated with a target entry,
+  // this field contains that target's UUID.
+  // <EMPTY> otherwise.
+  std::string target_uuid;
+
+  // Set of key-value pairs for any optional (or impl-specific) data
+  std::unordered_map<std::string, std::string> meta_data;
+
+  std::string ToString() const override;
+};
+
+// Where/how to send the logged entries.
+class TelemetryDestination {
+public:
+  virtual ~TelemetryDestination() = default;
+  virtual Status EmitEntry(const BaseTelemetryEntry *entry) = 0;
+  virtual std::string name() const = 0;
+};
+
+// The logger itself!
+class TelemetryLogger {
+public:
+  static std::shared_ptr<TelemetryLogger> CreateInstance(Debugger *);
+
+  virtual ~TelemetryLogger() = default;
+
+  // Invoked upon lldb startup
+  virtual void LogStartup(llvm::StringRef lldb_path,
+                          TelemetryEventStats stats) = 0;
+
+  // Invoked upon lldb exit.
+  virtual void LogExit(llvm::StringRef lldb_path,
+                       TelemetryEventStats stats) = 0;
+
+  // Invoked upon process exit
+  virtual void LogProcessExit(int status, llvm::StringRef exit_string,
+                              TelemetryEventStats stats,
+                              Target *target_ptr) = 0;
+
+  // Invoked upon loading the main executable module
+  // We log in a fire-n-forget fashion so that if the load
+  // crashes, we don't lose the entry.
+  virtual void LogMainExecutableLoadStart(lldb::ModuleSP exec_mod,
+                                          TelemetryEventStats stats) = 0;
+  virtual void LogMainExecutableLoadEnd(lldb::ModuleSP exec_mod,
+                                        TelemetryEventStats stats) = 0;
+
+  // Invoked for each command
+  // We log in a fire-n-forget fashion so that if the command execution
+  // crashes, we don't lose the entry.
+  virtual void LogCommandStart(llvm::StringRef uuid,
+                               llvm::StringRef original_command,
+                               TelemetryEventStats stats,
+                               Target *target_ptr) = 0;
+  virtual void LogCommandEnd(llvm::StringRef uuid, llvm::StringRef command_name,
+                             llvm::StringRef command_args,
+                             TelemetryEventStats stats, Target *target_ptr,
+                             CommandReturnObject *result) = 0;
+
+  virtual std::string GetNextUUID() = 0;
+
+  // For client (eg., SB API) to send telemetry entries.
+  virtual void
+  LogClientTelemetry(lldb_private::StructuredData::Object *entry) = 0;
+
+  virtual void AddDestination(TelemetryDestination *destination) = 0;
+};
+
+/*
+
+Logger configs: LLDB users can also supply their own configs via:
+$XDG_CONFIG_HOME/.lldb_telemetry_config
+
+
+We can propose simple syntax: <field_name><colon><value>
+Eg.,
+enable_logging:true
+destination:stdout
+destination:stderr
+destination:/path/to/some/file
+
+The allowed field_name values are:
+ * enable_logging
+       If the fields are specified more than once, the last line will take precedence
+       If enable_logging is set to false, no logging will occur.
+ * destination.
+      This is allowed to be specified multiple times - it will add to the default
+      (ie, specified by vendor) list of destinations.
+      The value can be either of
+         + one of the two magic values "stdout" or "stderr".
+         + a path to a local file
+
+!!NOTE!!: We decided to use a separate file instead of the existing settings
+         file because that file is parsed too late in the process and by the
+         there might have been lots of telemetry-entries that need to be
+         sent already.
+         This approach avoid losing log entries if LLDB crashes during init.
+
+*/
+LoggerConfig *GetLoggerConfig();
+
+} // namespace lldb_private
+#endif // LLDB_CORE_TELEMETRY_H
diff --git a/lldb/include/lldb/Target/Process.h b/lldb/include/lldb/Target/Process.h
index 2f3a3c22422efe..c9a7f3db2eb0bb 100644
--- a/lldb/include/lldb/Target/Process.h
+++ b/lldb/include/lldb/Target/Process.h
@@ -28,6 +28,7 @@
 #include "lldb/Core/LoadedModuleInfoList.h"
 #include "lldb/Core/PluginInterface.h"
 #include "lldb/Core/SourceManager.h"
+#include "lldb/Core/Telemetry.h"
 #include "lldb/Core/ThreadSafeValue.h"
 #include "lldb/Core/ThreadedCommunication.h"
 #include "lldb/Core/UserSettingsController.h"
@@ -3249,6 +3250,8 @@ void PruneThreadPlans();
 
   Process(const Process &) = delete;
   const Process &operator=(const Process &) = delete;
+
+  TelemetryEventStats m_event_stats;
 };
 
 /// RAII guard that should be acquired when an utility function is called within
diff --git a/lldb/source/API/SBDebugger.cpp b/lldb/source/API/SBDebugger.cpp
index fbcf30e67fc1cd..6078ab59a20afc 100644
--- a/lldb/source/API/SBDebugger.cpp
+++ b/lldb/source/API/SBDebugger.cpp
@@ -34,6 +34,7 @@
 #include "lldb/API/SBTypeNameSpecifier.h"
 #include "lldb/API/SBTypeSummary.h"
 #include "lldb/API/SBTypeSynthetic.h"
+#include <iostream>
 
 #include "lldb/Core/Debugger.h"
 #include "lldb/Core/DebuggerEvents.h"
@@ -965,6 +966,14 @@ SBTarget SBDebugger::GetDummyTarget() {
   return sb_target;
 }
 
+void SBDebugger::SendTelemetry(SBStructuredData *entry) {
+  if (lldb_private::Debugger *debugger = this->get()) {
+    debugger->SendClientTelemetry(entry->m_impl_up->GetObjectSP().get());
+  } else {
+    std::cerr << " --- cannot log dap request - debugger is null\n";
+  }
+}
+
 bool SBDebugger::DeleteTarget(lldb::SBTarget &target) {
   LLDB_INSTRUMENT_VA(this, target);
 
diff --git a/lldb/source/Core/CMakeLists.txt b/lldb/source/Core/CMakeLists.txt
index 10525ac39e6ef5..4ce9e5d81679f5 100644
--- a/lldb/source/Core/CMakeLists.txt
+++ b/lldb/source/Core/CMakeLists.txt
@@ -54,6 +54,7 @@ add_lldb_library(lldbCore
   SourceLocationSpec.cpp
   SourceManager.cpp
   StreamAsynchronousIO.cpp
+  Telemetry.cpp
   ThreadedCommunication.cpp
   UserSettingsController.cpp
   Value.cpp
diff --git a/lldb/source/Core/Debugger.cpp b/lldb/source/Core/Debugger.cpp
index ebd112110e5f2d..929b22e1df311d 100644
--- a/lldb/source/Core/Debugger.cpp
+++ b/lldb/source/Core/Debugger.cpp
@@ -17,6 +17,7 @@
 #include "lldb/Core/PluginManager.h"
 #include "lldb/Core/Progress.h"
 #include "lldb/Core/StreamAsynchronousIO.h"
+#include "lldb/Core/Telemetry.h"
 #include "lldb/DataFormatters/DataVisualization.h"
 #include "lldb/Expression/REPL.h"
 #include "lldb/Host/File.h"
@@ -733,12 +734,23 @@ void Debugger::InstanceInitialize() {
 
 DebuggerSP Debugger::CreateInstance(lldb::LogOutputCallback log_callback,
                                     void *baton) {
+  SteadyTimePoint start_time = std::chrono::steady_clock::now();
   DebuggerSP debugger_sp(new Debugger(log_callback, baton));
+  debugger_sp->stats.m_start = start_time;
   if (g_debugger_list_ptr && g_debugger_list_mutex_ptr) {
     std::lock_guard<std::recursive_mutex> guard(*g_debugger_list_mutex_ptr);
     g_debugger_list_ptr->push_back(debugger_sp);
   }
   debugger_sp->InstanceInitialize();
+  TelemetryEventStats init_stats(start_time, std::chrono::steady_clock::now());
+
+  // TODO: we could split up the logging:
+  // - LogStartup_start: called at the beginning
+  // - LogStartup_end: called at the end
+  // This way if the init crashes, we still have some logging.
+  debugger_sp->m_telemetry_logger->LogStartup(
+      HostInfo::GetProgramFileSpec().GetPathAsConstString().GetCString(),
+      std::move(init_stats));
   return debugger_sp;
 }
 
@@ -847,7 +859,8 @@ Debugger::Debugger(lldb::LogOutputCallback log_callback, void *baton)
       m_sync_broadcaster(nullptr, "lldb.debugger.sync"),
       m_broadcaster(m_broadcaster_manager_sp,
                     GetStaticBroadcasterClass().AsCString()),
-      m_forward_listener_sp(), m_clear_once() {
+      m_forward_listener_sp(), m_clear_once(),
+      m_telemetry_logger(TelemetryLogger::CreateInstance(this)) {
   // Initialize the debugger properties as early as possible as other parts of
   // LLDB will start querying them during construction.
   m_collection_sp->Initialize(g_debugger_properties);
@@ -939,6 +952,18 @@ void Debugger::Clear() {
   //     static void Debugger::Destroy(lldb::DebuggerSP &debugger_sp);
   //     static void Debugger::Terminate();
   llvm::call_once(m_clear_once, [this]() {
+    // Log the "quit" event.
+    // Note: this session_stats include the time since LLDB starts till quit
+    // (now).
+    // TBD: we could also record stats for *just* the quit action, if needed?
+    //      (ie., how long it takes to run all these cleanup functions?)
+    TelemetryEventStats session_stats{
+        /*start_session*/ stats.m_start,
+        /*end_session*/ std::chrono::steady_clock::now()};
+    m_telemetry_logger->LogExit(
+        HostInfo::GetProgramFileSpec().GetPathAsConstString().GetCString(),
+        session_stats);
+
     ClearIOHandlers();
     StopIOHandlerThread();
     StopEventHandlerThread();
@@ -2198,6 +2223,11 @@ Status Debugger::RunREPL(LanguageType language, const char *repl_options) {
   return err;
 }
 
+void Debugger::SendClientTelemetry(
+    lldb_private::StructuredData::Object *entry) {
+  m_telemetry_logger->LogClientTelemetry(entry);
+}
+
 llvm::ThreadPoolInterface &Debugger::GetThreadPool() {
   assert(g_thread_pool &&
          "Debugger::GetThreadPool called before Debugger::Initialize");
diff --git a/lldb/source/Core/Telemetry.cpp b/lldb/source/Core/Telemetry.cpp
new file mode 100644
index 00000000000000..2f8bca1da582b8
--- /dev/null
+++ b/lldb/source/Core/Telemetry.cpp
@@ -0,0 +1,620 @@
+
+//===-- Telemetry.cpp -----------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+#include "lldb/Core/Telemetry.h"
+
+#include <memory>
+#include <stdbool.h>
+#include <sys/auxv.h>
+
+#include <chrono>
+#include <cstdlib>
+#include <ctime>
+#include <fstream>
+#include <iostream>
+#include <string>
+#include <typeinfo>
+#include <utility>
+#include <vector>
+
+#include "lldb/API/SBDebugger.h"
+#include "lldb/API/SBProcess.h"
+#include "lldb/Core/Debugger.h"
+#include "lldb/Core/Module.h"
+#include "lldb/Host/FileSystem.h"
+#include "lldb/Host/HostInfo.h"
+#include "lldb/Interpreter/CommandInterpreter.h"
+#include "lldb/Target/Process.h"
+#include "lldb/Target/Statistics.h"
+#include "lldb/Utility/ConstString.h"
+#include "lldb/Utility/FileSpec.h"
+#include "lldb/Utility/UUID.h"
+#include "lldb/Version/Version.h"
+#include "lldb/lldb-enumerations.h"
+#include "lldb/lldb-forward.h"
+#include "third_party/llvm/llvm-project/llvm/include/llvm/ADT/StringRef.h"
+#include "llvm/ADT/SmallString.h"
+#include "llvm/ADT/Twine.h"
+#include "llvm/Support/Chrono.h"
+#include "llvm/Support/FileSystem.h"
+#include "llvm/Support/LineIterator.h"
+#include "llvm/Support/MemoryBuffer.h"
+#include "llvm/Support/Path.h"
+#include "llvm/Support/RandomNumberGenerator.h"
+
+#include "clang/Basic/Version.h"
+
+namespace lldb_private {
+
+std::string BaseTelemetryEntry::ToString() const {
+  return "[BaseTelemetryEntry]\n" + ("  session_uuid:" + session_uuid + "\n") +
+         ("  start timestamp: " +
+          std::to_string(stats.m_start.time_since_epoch().count()) + "\n") +
+         (" counter: " + std::to_string(counter));
+}
+
+std::string DebuggerInfoEntry::ToString() const {
+  auto duration = stats.Duration();
+  return BaseTelemetryEntry::ToString() + "\n" + ("[DebuggerInfoEntry]\n") +
+         ("  username: " + username + "\n") +
+         ("  lldb_git_sha: " + lldb_git_sha + "\n") +
+         ("  lldb_path: " + lldb_path + "\n") + ("  cwd: " + cwd + "\n") +
+         ("  lldb startup/session duration: " +
+          (duration.has_value() ? std::to_string(duration->count())
+                                : "<empty>") +
+          "(nanosec)\n") +
+         (lldb_exit.has_value()
+              ? ("lldb_exit_code: " + std::to_string(lldb_exit->exit_code) +
+                 ", lldb_exit_description: " + lldb_exit->description + "\n  ")
+              : (""));
+}
+
+std::string ClientTelemetryEntry::ToString() const {
+  return BaseTelemetryEntry::ToString() + "\n" + ("[DapRequestInfoEntry]\n") +
+         ("  request_name: " + request_name + "\n") +
+         ("  request_duration: " + std::to_string(stats.Duration()->count()) +
+          "(nanosec)\n") +
+         ("  error_msg: " + error_msg + "\n");
+}
+
+std::string TargetInfoEntry::ToString() const {
+  std::string exit_or_load_desc;
+  if (process_exit.has_value()) {
+    // If this entry was emitted for an exit
+    exit_or_load_desc =
+        "  process_duration: " + std::to_string(stats.Duration()->count()) +
+        "(nanosec)" +
+        "\n"
+        "  exit_code: " +
+        std::to_string(process_exit->exit_code) +
+        ", exit description: " + process_exit->description + "\n";
+  } else {
+    // This was emitted for a load event.
+    // See if it was the start-load or end-load entry
+    if (stats.m_end.has_value()) {
+      exit_or_load_desc = "  startup_init_duration: " +
+                          std::to_string(stats.Duration()->count()) +
+                          "(nanosec)" + "\n";
+    } else {
+      exit_or_load_desc = " startup_init_start\n";
+    }
+  }
+  return BaseTelemetryEntry::ToString() + "\n" + ("[TargetInfoEntry]\n") +
+         ("  target_uuid: " + target_uuid + "\n") +
+         ("  file_format: " + file_format + "\n") +
+         ("  binary_path: " + binary_path + "\n") +
+         ("  binary_size: " + std::to_string(binary_size) + "\n") +
+         exit_or_load_desc;...
[truncated]

Copy link

github-actions bot commented Apr 5, 2024

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 2606c87788153bf33d854fa5c3a03e16d544c5d7 7321a9073f1ec326f53ada9f40e258c038037af2 -- lldb/include/lldb/Core/Telemetry.h lldb/source/Core/Telemetry.cpp lldb/include/lldb/API/SBDebugger.h lldb/include/lldb/Core/Debugger.h lldb/include/lldb/Target/Process.h lldb/source/API/SBDebugger.cpp lldb/source/Core/Debugger.cpp lldb/source/Interpreter/CommandInterpreter.cpp lldb/source/Target/Process.cpp lldb/source/Target/Target.cpp lldb/tools/lldb-dap/DAP.cpp
View the diff from clang-format here.
diff --git a/lldb/include/lldb/Core/Telemetry.h b/lldb/include/lldb/Core/Telemetry.h
index f021437772..4b9dc87d58 100644
--- a/lldb/include/lldb/Core/Telemetry.h
+++ b/lldb/include/lldb/Core/Telemetry.h
@@ -215,12 +215,12 @@ destination:/path/to/some/file
 
 The allowed field_name values are:
  * enable_logging
-       If the fields are specified more than once, the last line will take precedence
-       If enable_logging is set to false, no logging will occur.
+       If the fields are specified more than once, the last line will take
+precedence If enable_logging is set to false, no logging will occur.
  * destination.
-      This is allowed to be specified multiple times - it will add to the default
-      (ie, specified by vendor) list of destinations.
-      The value can be either of
+      This is allowed to be specified multiple times - it will add to the
+default (ie, specified by vendor) list of destinations. The value can be either
+of
          + one of the two magic values "stdout" or "stderr".
          + a path to a local file
 

Copy link
Member

@JDevlieghere JDevlieghere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bunch of time has passed since the original RFC. It would be great, and help with reviewing the PR, to have an overview of the currently proposed architecture, the different pieces and how it all fits together. I left some inline comments, but I'm not sure if it's worth fixing those before we're all on the same page about the higher level stuff.

#include <cstdio>

#include "lldb/API/SBDefines.h"
#include "lldb/API/SBPlatform.h"
#include "third_party/llvm/llvm-project/lldb/include/lldb/API/SBStructuredData.h"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this was added by your IDE. There's a few instances of these.

@@ -243,6 +245,8 @@ class LLDB_API SBDebugger {

lldb::SBTarget GetDummyTarget();

void SendTelemetry(SBStructuredData *entry);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems odd to allow clients to send their own telemetry. I'd expect all telemetry you'd want to collect to originate from within LLDB. This also seems like a pretty privacy concern because now you're opening the door to report arbitrary data on behalf of LLDB.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify your privacy model?

We need this API because we want to be able to collect performance stats for our IDE debugging layer (eg., how long each DAP packet takes, success vs failed vs timedout counts, etc). While it's possible that the clients could have their own telemetry outside of LLDB, it is helpful to be able to combine both client(s)' and server's telemetry data for each session

As for privacy concerns, at least, from our POV, we ensure that the collected data is stored with restricted and auditable access. The Destination class, which takes care of forwarding the collected telemetry entries to its final destination, is vendor-specific. So you could choose to ignore client telemetry if it does not fit your privacy model.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to combine telemetry from LLDB and clients together, why not invert the relationship? Instead of having LLDB ingest client telemetry data and report them together, we could instead have LLDB hand its telemetry over to clients and have the client combine them.


using SteadyTimePoint = std::chrono::time_point<std::chrono::steady_clock>;

struct TelemetryEventStats {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great to have comments explaining the purpose of these classes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this class differ from the existing Timer class?

}
};

struct LoggerConfig {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logging is a pretty overloaded term. Unless this is actually related to the existing Log infrastructure, this should have a different name, something like TelemetryConfig?

// the same session_uuid
std::string session_uuid;

TelemetryEventStats stats;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does every telemetry entry need a start and end time? I think we discussed this in the RFC, but I think there's a lot of telemetry that doesn't have to be time based. For example, TargetInfoEntry does that really need a start and stop time?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1: duration-style events are just one kind of event in our downstream telemetry system. It doesn't make sense for most events to have a duration though, so we have many other kinds of events (e.g. feature usage events).

virtual std::string name() const = 0;
};

// The logger itself!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't add much.

};

// The logger itself!
class TelemetryLogger {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TelemetryEmitter?


virtual ~TelemetryLogger() = default;

// Invoked upon lldb startup
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing period. A few more instances of that below.

TelemetryEventStats stats) = 0;

// Invoked upon lldb exit.
virtual void LogExit(llvm::StringRef lldb_path,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of having separate methods for the different events, which all take a TelemetryEventStats, it seems like it would be simpler to have a generic Emit method and then the individual events know how to emit themselves?

@@ -1868,8 +1874,28 @@ bool CommandInterpreter::HandleCommand(const char *command_line,
LazyBool lazy_add_to_history,
CommandReturnObject &result,
bool force_repeat_command) {
std::cout << "HandleCommand: " << command_line << "\n";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stray debug line?

using SteadyTimePoint = std::chrono::time_point<std::chrono::steady_clock>;

struct TelemetryEventStats {
// REQUIRED: Start time of event
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you convert all the comments in header files to Doxygen comments?
///

virtual std::string ToString() const;
};

struct ExitDescription {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And add top-level doxygen comments to each new datatype?

virtual void AddDestination(TelemetryDestination *destination) = 0;
};

/*
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you follow the LLVM coding style in your comments?
I.e., this should be a doxygen comment /// instead of a C-style comment.

std::string exit_or_load_desc;
if (process_exit.has_value()) {
// If this entry was emitted for an exit
exit_or_load_desc =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be more efficient to use an llvm stream https://llvm.org/doxygen/classllvm_1_1raw__string__ostream.html for all these string concatenations, throughout the file

Copy link
Member

@bulbazord bulbazord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read through the RFC briefly and skimmed the PR. Left some comments, but I'm sure this PR will change over time.

+1 to all of Jonas's and Adrian's comments.

@@ -0,0 +1,237 @@
#ifndef LLDB_CORE_TELEMETRY_H
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file needs a copyright header.

// OPTIONAL: End time of event - may be empty if not meaningful.
std::optional<SteadyTimePoint> m_end;

// TBD: could add some memory stats here too?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add more detail to this comment or remove it. As-is, this comment isn't actionable. Future contributors (yourself included) may not know what "some memory stats" actually means.

if (m_end.has_value())
return *m_end - m_start;
else
return std::nullopt;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for else here.

if (lldb_private::Debugger *debugger = this->get()) {
debugger->SendClientTelemetry(entry->m_impl_up->GetObjectSP().get());
} else {
std::cerr << " --- cannot log dap request - debugger is null\n";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::cerr is not the right place to write this, please log it instead. LLDB is often used as a library and there would be no way for clients to opt-out of this behavior.


#include <memory>
#include <stdbool.h>
#include <sys/auxv.h>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

auxv is not available on every platform, you'll have to go through the host layer for whatever you're trying to do with it.

}

llvm::StringRef parse_value(llvm::StringRef str, llvm::StringRef label) {
return str.substr(label.size()).trim();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is 'value'?

#include <limits>
#include <memory>
#include <optional>
#include <string>
#include <typeinfo>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LLVM and LLDB disable RTTI, what is this for?

if (log)
*log << "error: unhandled command \"" << command.data() << "\""
<< std::endl;
// auto end_time = std::chrono::steady_clock::now();
// TODO: fill in the SBStructuredData and send it.
// debugger.SendTelemetry(...);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leftover debugging code?

#include "DAP.h"
#include "LLDBUtils.h"
#include "llvm/ADT/StringExtras.h"
#include "llvm/Support/FormatVariadic.h"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why were these removed?

@@ -243,6 +245,8 @@ class LLDB_API SBDebugger {

lldb::SBTarget GetDummyTarget();

void SendTelemetry(SBStructuredData *entry);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to combine telemetry from LLDB and clients together, why not invert the relationship? Instead of having LLDB ingest client telemetry data and report them together, we could instead have LLDB hand its telemetry over to clients and have the client combine them.

Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I think I mentioned in the original RFC, I'm not convinced the core functionality for this belongs in LLDB. In my view, LLDB is a client that should make use of a common telemetry library that can be used by other parts of the wider LLVM toolchain, or indeed other non-LLVM tools. From a point of view of separation of concerns, it also doesn't make sense: LLDB is for debugging, whereas telemetry is a more generic thing that isn't specific to debugging. Certainly, there may be some specific aspects that are tied to LLDB in particular, but I'd expect those to be about the "what" is reported, not the "how" it is reported.

As others have mentioned, I think it would be really helpful to have an overview of the design with the different key parts described (by class name ideally), preferably with the aid of a diagram, since a picture is often a clearer way of describing structure than words are.


namespace lldb_private {

using SteadyTimePoint = std::chrono::time_point<std::chrono::steady_clock>;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Speaking from experience: steady_clock is only useful for measuring durations, since it isn't relative to a user's (or indeed any) fixed point in time - try converting it to a year/month/day etc format and you'll see what I mean. If you want an actual point in time that can be understood by the telemetry consumer, you'll need something to act as the base point of time that the steady times are relative to. In our internal telemetry implementation, our duration events therefore actually use both steady and system clock time points, with the reported duration being the difference between two steady time points, and the system clock time point being to indicate when the event was triggered.

// the same session_uuid
std::string session_uuid;

TelemetryEventStats stats;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1: duration-style events are just one kind of event in our downstream telemetry system. It doesn't make sense for most events to have a duration though, so we have many other kinds of events (e.g. feature usage events).

@oontvoo
Copy link
Member Author

oontvoo commented Apr 10, 2024

A bunch of time has passed since the original RFC. It would be great, and help with reviewing the PR, to have an overview of the currently proposed architecture, the different pieces and how it all fits together. I left some inline comments, but I'm not sure if it's worth fixing those before we're all on the same page about the higher level stuff.

I've replied to the Discourse thread just now with an updated doc

PTAL. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants