Move probe expansion into codegen

The previous probe expansion approach tried to minimize the number of LLVM functions generated by emitting a single function for all probe matches in most cases. While this was efficient, it came with a couple of drawbacks:

- In some cases it is still necessary to generate a separate LLVM function for each match (e.g. when the 'probe' builtin is used). This leads to two very similar loops for iterating matches, one in BPFtrace::add_probe and one in CodegenLLVM::visit(Probe), which is confusing and hard to maintain.
- libbpf needs one BPF program (i.e. one LLVM function) per probe, so if we want to delegate program loading (and possibly attachment) to libbpf (which we do), we cannot use this approach. See [1] for more details.

This refactors probe expansion by moving most of it into codegen. Overall, we now distinguish two types of probe expansion:

- Full expansion - A separate LLVM function is generated for each match. This is used for most expansions now.
- Multi expansion - Used for k(u)probes when k(u)probe_multi is available. Generates one LLVM function and one BPF program for all matches and attaches the expanded functions via bpf_link_create_opts.

This allows us to drop a lot of duplicated code. The expansion for "full" is done in CodegenLLVM::visit(Probe), the expansion for "multi" is done in BPFtrace::add_probe.

A drawback of this approach is that we generate substantially larger ELF objects for expansions of probe types which do not support multi-probes (e.g. kfuncs and tracepoints), as we generate duplicate LLVM functions. This is something we can live with for now since multi-attachment is not the main use-case for these probe types (e.g. attaching to many kfuncs is very slow) and there is usually the alternative of using multi-kprobes.

One particular area where this refactoring caused problems is the unit tests in tests/bpftrace.cpp. Previously, it was sufficient to generate a simple ast::Probe and pass it to BPFtrace::add_probe since that was where most of the expansion was done. Now that the expansion has moved to codegen, we need to run the full parser -> field analyser -> clang parser -> semantic analyser -> codegen sequence.

With this change, some tests had to be dropped, especially the tests with a single wildcard for the uprobe/USDT target. The reason is that the semantic analyser expands these wildcards by searching all paths on the system, which cannot be mocked and therefore should not be run in unit tests (e.g. it prevents running the unit tests as non-root).

USDT probes pose another problem, as it is not possible to easily mock USDTHelper, which is a fully static class. Since we need to override AttachPoint::usdt::num_locations from tests, we allow doing that via a new internal env variable BPFTRACE_TEST_USDT_NUM_LOCATIONS.

[1] #3005
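
To make the distinction between the two expansion kinds concrete, the kprobe case can be sketched roughly as below. This is a simplified illustration, not bpftrace's actual code: `ExpansionKind`, `contains_star` and `choose_kprobe_expansion` are hypothetical names, the `has_kprobe_multi` parameter stands in for the feature check, and the two boolean flags from the diff are collapsed into a single value with full expansion taking priority (codegen checks `need_full_expansion` first).

```cpp
#include <string>

// Hypothetical stand-in for the need_full_expansion / need_multi_expansion
// flags on AttachPoint.
enum class ExpansionKind { None, Full, Multi };

// Simplified wildcard check, assumed here to mean "contains '*'".
// bpftrace's own has_wildcard helper may recognize more patterns.
bool contains_star(const std::string &s)
{
  return s.find('*') != std::string::npos;
}

// Decide how a kprobe attach point would be expanded.
ExpansionKind choose_kprobe_expansion(const std::string &module,
                                      const std::string &func,
                                      bool has_kprobe_multi)
{
  // A wildcarded module always forces full expansion, since kprobe_multi
  // does not support the "module:function" syntax.
  if (contains_star(module))
    return ExpansionKind::Full;

  // A wildcarded function uses multi expansion when the kernel supports it
  // and falls back to one LLVM function per match otherwise.
  if (contains_star(func))
    return has_kprobe_multi ? ExpansionKind::Multi : ExpansionKind::Full;

  // No wildcard: a single program for a single function.
  return ExpansionKind::None;
}
```

Under these assumptions, `choose_kprobe_expansion("", "do_f*", true)` yields `ExpansionKind::Multi`, while the same call with `false` yields `ExpansionKind::Full`.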
bpftrace · May 10, 2024 · 10f6f74
1 parent a2e86ff
Showing 14 changed files with 384 additions and 629 deletions.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -34,6 +34,8 @@ and this project adheres to
   - [#3060](https://github.com/bpftrace/bpftrace/pull/3060)
 - Disable func builtin for kretprobes and uretprobes when `get_func_ip` feature is not available
   - [#2645](https://github.com/bpftrace/bpftrace/pull/2645)
+- Move probe expansion into codegen
+  - [#3155](https://github.com/bpftrace/bpftrace/pull/3155)
 #### Deprecated
 #### Removed
 #### Fixed

diff --git a/src/ast/ast.h b/src/ast/ast.h
@@ -627,7 +627,15 @@ class AttachPoint : public Node {
   uint64_t len = 0;   // for watchpoint probes, the width of watched addr
   std::string mode;   // for watchpoint probes, the watch mode
   bool async = false; // for watchpoint probes, if it's an async watchpoint
-  bool need_expansion = false;
+
+  // There are 2 kinds of attach point expansion:
+  // - full expansion  - separate LLVM function is generated for each match
+  // - multi expansion - one LLVM function and BPF program is generated for all
+  //                     matches, the list of expanded functions is attached to
+  //                     the BPF program using the k(u)probe.multi mechanism
+  bool need_full_expansion = false;
+  bool need_multi_expansion = false;
+
   uint64_t address = 0;
   uint64_t func_offset = 0;
   bool ignore_invalid = false;
@@ -639,6 +647,11 @@ class AttachPoint : public Node {
   int index() const;
   void set_index(int index);
 
+  bool need_expansion()
+  {
+    return need_full_expansion || need_multi_expansion;
+  }
+
 private:
   AttachPoint(const AttachPoint &other) = default;
 

diff --git a/src/ast/attachpoint_parser.cpp b/src/ast/attachpoint_parser.cpp
@@ -340,9 +340,18 @@ AttachPointParser::State AttachPointParser::kprobe_parser(bool allow_offset)
     ap_->func = parts_[func_idx];
   }
 
-  if (ap_->func.find('*') != std::string::npos ||
-      ap_->target.find('*') != std::string::npos)
-    ap_->need_expansion = true;
+  // kprobe_multi does not support the "module:function" syntax so in case of
+  // a wildcarded module, always use full expansion
+  if (has_wildcard(ap_->target))
+    ap_->need_full_expansion = true;
+
+  if (has_wildcard(ap_->func)) {
+    if (bpftrace_.feature_->has_kprobe_multi()) {
+      ap_->need_multi_expansion = true;
+    } else {
+      ap_->need_full_expansion = true;
+    }
+  }
 
   return OK;
 }
@@ -440,9 +449,17 @@ AttachPointParser::State AttachPointParser::uprobe_parser(bool allow_offset,
       ap_->func = func;
   }
 
-  if (ap_->target.find('*') != std::string::npos ||
-      ap_->func.find('*') != std::string::npos)
-    ap_->need_expansion = true;
+  // As the C++ language supports function overload, a given function name
+  // (without parameters) could have multiple matches even when no
+  // wildcards are used.
+  if (has_wildcard(ap_->func) || has_wildcard(ap_->target) ||
+      ap_->lang == "cpp") {
+    if (bpftrace_.feature_->has_uprobe_multi()) {
+      ap_->need_multi_expansion = true;
+    } else {
+      ap_->need_full_expansion = true;
+    }
+  }
 
   return OK;
 }
@@ -477,10 +494,10 @@ AttachPointParser::State AttachPointParser::usdt_parser()
     ap_->func = parts_[3];
   }
 
-  if (ap_->target.find('*') != std::string::npos ||
-      ap_->ns.find('*') != std::string::npos || ap_->ns.empty() ||
-      ap_->func.find('*') != std::string::npos || bpftrace_.pid())
-    ap_->need_expansion = true;
+  // Always fully expand USDT probes as they may access args
+  if (has_wildcard(ap_->target) || has_wildcard(ap_->ns) || ap_->ns.empty() ||
+      has_wildcard(ap_->func) || bpftrace_.pid())
+    ap_->need_full_expansion = true;
 
   return OK;
 }
@@ -505,7 +522,7 @@ AttachPointParser::State AttachPointParser::tracepoint_parser()
 
   if (ap_->target.find('*') != std::string::npos ||
       ap_->func.find('*') != std::string::npos)
-    ap_->need_expansion = true;
+    ap_->need_full_expansion = true;
 
   return OK;
 }
@@ -627,7 +644,7 @@ AttachPointParser::State AttachPointParser::watchpoint_parser(bool async)
 
     ap_->func = func_arg_parts[0];
     if (ap_->func.find('*') != std::string::npos)
-      ap_->need_expansion = true;
+      ap_->need_full_expansion = true;
 
     if (func_arg_parts[1].size() <= 3 || func_arg_parts[1].find("arg") != 0) {
       errs_ << "Invalid function argument" << std::endl;
@@ -696,7 +713,7 @@ AttachPointParser::State AttachPointParser::kfunc_parser()
 
   if (ap_->func.find('*') != std::string::npos ||
       ap_->target.find('*') != std::string::npos)
-    ap_->need_expansion = true;
+    ap_->need_full_expansion = true;
 
   return OK;
 }
@@ -714,7 +731,7 @@ AttachPointParser::State AttachPointParser::iter_parser()
 
   if (parts_[1].find('*') != std::string::npos) {
     if (listing_) {
-      ap_->need_expansion = true;
+      ap_->need_full_expansion = true;
     } else {
       if (ap_->ignore_invalid)
         return SKIP;
@@ -744,7 +761,7 @@ AttachPointParser::State AttachPointParser::raw_tracepoint_parser()
   ap_->func = parts_[1];
 
   if (has_wildcard(ap_->func))
-    ap_->need_expansion = true;
+    ap_->need_full_expansion = true;
 
   return OK;
 }

diff --git a/src/ast/passes/codegen_llvm.cpp b/src/ast/passes/codegen_llvm.cpp
@@ -2550,6 +2550,51 @@ void CodegenLLVM::generateProbe(Probe &probe,
         func_type, name, current_attach_point_->address, index);
 }
 
+void CodegenLLVM::add_probe(AttachPoint &ap,
+                            Probe &probe,
+                            const std::string &name,
+                            FunctionType *func_type)
+{
+  current_attach_point_ = &ap;
+  probefull_ = ap.name();
+  if (probetype(ap.provider) == ProbeType::usdt) {
+    auto usdt = USDTHelper::find(bpftrace_.pid(), ap.target, ap.ns, ap.func);
+    if (!usdt.has_value()) {
+      // Unfortunately, it is not easy to mock USDTHelper in tests as it's fully
+      // static. Since we only need to override usdt.num_locations, pass the
+      // value via env variable for testing purposes.
+      auto test_usdt = std::getenv("BPFTRACE_TEST_USDT_NUM_LOCATIONS");
+      if (test_usdt)
+        ap.usdt.num_locations = std::stoi(test_usdt);
+      else
+        LOG(FATAL) << "Failed to find usdt probe: " << probefull_;
+    } else
+      ap.usdt = *usdt;
+
+    // A "unique" USDT probe can be present in a binary in multiple
+    // locations. One case where this happens is if a function
+    // containing a USDT probe is inlined into a caller. So we must
+    // generate a new program for each instance. We _must_ regenerate
+    // because argument locations may differ between instance locations
+    // (eg arg0. may not be found in the same offset from the same
+    // register in each location)
+    auto reset_ids = create_reset_ids();
+    current_usdt_location_index_ = 0;
+    for (int i = 0; i < ap.usdt.num_locations; ++i) {
+      reset_ids();
+
+      std::string full_func_id = name + "_loc" + std::to_string(i);
+      generateProbe(probe, full_func_id, probefull_, func_type, i);
+      bpftrace_.add_probe(ap, probe, i);
+      current_usdt_location_index_++;
+    }
+  } else {
+    generateProbe(probe, name, probefull_, func_type);
+    bpftrace_.add_probe(ap, probe);
+  }
+  current_attach_point_ = nullptr;
+}
+
 void CodegenLLVM::visit(Subprog &subprog)
 {
   std::vector<llvm::Type *> arg_types;
@@ -2642,54 +2687,19 @@ void CodegenLLVM::createRet(Value *value)
 void CodegenLLVM::visit(Probe &probe)
 {
   FunctionType *func_type = FunctionType::get(b_.getInt64Ty(),
-                                              { b_.GET_PTR_TY() }, // struct
-                                                                   // pt_regs
-                                                                   // *ctx
+                                              { b_.GET_PTR_TY() }, // ctx
                                               false);
 
-  // Probe has at least one attach point (required by the parser)
-  auto &attach_point = (*probe.attach_points)[0];
-
-  // All usdt probes need expansion to be able to read arguments
-  if (probetype(attach_point->provider) == ProbeType::usdt)
-    probe.need_expansion = true;
-
-  bool generated = false;
-  current_attach_point_ = attach_point;
-  inside_subprog_ = false;
-
-  /*
-   * Most of the time, we can take a probe like kprobe:do_f* and build a
-   * single BPF program for that, called "s_kprobe:do_f*", and attach it to
-   * each wildcard match. An exception is the "probe" builtin, where we need
-   * to build different BPF programs for each wildcard match that contains an
-   * ID for the match. Those programs will be called "s_kprobe:do_fcntl" etc.
-   */
-  if (probe.need_expansion == false) {
-    // build a single BPF program pre-wildcards
-    probefull_ = probe.name();
-    if (probe.index() == 0)
-      probe.set_index(getNextIndexForProbe());
-    generateProbe(probe, probefull_, probefull_, func_type);
-    generated = true;
-  } else {
-    /*
-     * Build a separate BPF program for each wildcard match.
-     * We begin by saving state that gets changed by the codegen pass, so we
-     * can restore it for the next pass (printf_id_, time_id_).
-     */
-    auto reset_ids = create_reset_ids();
-
-    for (auto attach_point : *probe.attach_points) {
-      current_attach_point_ = attach_point;
-
-      std::set<std::string> matches;
-      if (attach_point->provider == "BEGIN" ||
-          attach_point->provider == "END") {
-        matches.insert(attach_point->provider);
-      } else {
-        matches = bpftrace_.probe_matcher_->get_matches_for_ap(*attach_point);
-      }
+  // We begin by saving state that gets changed by the codegen pass, so we
+  // can restore it for the next pass (printf_id_, time_id_).
+  auto reset_ids = create_reset_ids();
+  for (auto *attach_point : *probe.attach_points) {
+    reset_ids();
+    current_attach_point_ = attach_point;
+    if (probe.need_expansion || attach_point->need_full_expansion) {
+      // Do expansion - generate a separate LLVM function for each match
+      auto matches = bpftrace_.probe_matcher_->get_matches_for_ap(
+          *attach_point);
 
       probe_count_ += matches.size();
       uint64_t max_bpf_progs = bpftrace_.config_.get(
@@ -2704,54 +2714,24 @@ void CodegenLLVM::visit(Probe &probe)
                       "environment variable.";
       }
 
-      tracepoint_struct_ = "";
-      for (const auto &m : matches) {
+      for (auto &match : matches) {
         reset_ids();
-        std::string match = m;
-        generated = true;
-
         if (attach_point->index() == 0)
           attach_point->set_index(getNextIndexForProbe());
 
-        AttachPoint match_ap = attach_point->create_expansion_copy(match);
-        probefull_ = match_ap.name();
-        current_attach_point_ = &match_ap;
-
-        if (probetype(attach_point->provider) == ProbeType::usdt) {
-          // Set the probe identifier so that we can read arguments later
-          auto usdt = USDTHelper::find(
-              bpftrace_.pid(), match_ap.target, match_ap.ns, match_ap.func);
-          if (!usdt.has_value())
-            LOG(BUG) << "Failed to find usdt probe: " << probefull_;
-          match_ap.usdt = *usdt;
-
-          // A "unique" USDT probe can be present in a binary in multiple
-          // locations. One case where this happens is if a function containing
-          // a USDT probe is inlined into a caller. So we must generate a new
-          // program for each instance. We _must_ regenerate because argument
-          // locations may differ between instance locations (eg arg0. may not
-          // be found in the same offset from the same register in each
-          // location)
-          current_usdt_location_index_ = 0;
-          for (int i = 0; i < match_ap.usdt.num_locations; ++i) {
-            reset_ids();
-
-            std::string full_func_id = match + "_loc" + std::to_string(i);
-            generateProbe(probe, full_func_id, probefull_, func_type, i);
-            current_usdt_location_index_++;
-          }
-        } else {
-          generateProbe(probe, match, probefull_, func_type);
-        }
+        auto match_ap = attach_point->create_expansion_copy(match);
+        add_probe(match_ap, probe, match, func_type);
       }
+      if (matches.empty()) {
+        generateProbe(probe, "dummy", "dummy", func_type, std::nullopt, true);
+      }
+    } else {
+      if (probe.index() == 0)
+        probe.set_index(getNextIndexForProbe());
+      add_probe(*attach_point, probe, attach_point->name(), func_type);
     }
-
-    if (!generated)
-      generateProbe(probe, "dummy", "dummy", func_type, std::nullopt, true);
   }
 
-  if (generated)
-    bpftrace_.add_probe(probe);
   current_attach_point_ = nullptr;
 }
 

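The USDT branch of `CodegenLLVM::add_probe` above does two things: it determines how many locations the probe has (with a test-only fallback to an env variable) and it generates one function per location with a `_loc<i>` suffix. A minimal standalone sketch of that logic, assuming hypothetical helper names `resolve_num_locations` and `location_function_names` that do not exist in bpftrace, might look like this:

```cpp
#include <cstdlib>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper: return the number of USDT locations. `found` models
// the result of the real lookup; when it is empty, fall back to the
// test-only env variable named in the commit message.
std::optional<int> resolve_num_locations(std::optional<int> found)
{
  if (found.has_value())
    return found;

  const char *test_usdt = std::getenv("BPFTRACE_TEST_USDT_NUM_LOCATIONS");
  if (test_usdt)
    return std::stoi(test_usdt); // throws if the value is not a number

  return std::nullopt; // the code in the diff logs a fatal error here
}

// Hypothetical helper: build one function name per USDT location, using the
// same "_loc<i>" suffix as the diff.
std::vector<std::string> location_function_names(const std::string &name,
                                                 int num_locations)
{
  std::vector<std::string> names;
  for (int i = 0; i < num_locations; ++i)
    names.push_back(name + "_loc" + std::to_string(i));
  return names;
}
```

The sketch returns an empty optional instead of aborting so that it can be exercised in isolation. In the diff the equivalent condition is treated as fatal.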
diff --git a/src/ast/passes/codegen_llvm.h b/src/ast/passes/codegen_llvm.h
@@ -152,6 +152,12 @@ class CodegenLLVM : public Visitor {
                      std::optional<int> usdt_location_index = std::nullopt,
                      bool dummy = false);
 
+  // Generate a probe and register it to the BPFtrace class.
+  void add_probe(AttachPoint &ap,
+                 Probe &probe,
+                 const std::string &name,
+                 FunctionType *func_type);
+
   [[nodiscard]] ScopedExprDeleter accept(Node *node);
   [[nodiscard]] std::tuple<Value *, ScopedExprDeleter> getMapKey(Map &map);
   AllocaInst *getMultiMapKey(Map &map, const std::vector<Value *> &extra_keys);

diff --git a/src/ast/passes/field_analyser.cpp b/src/ast/passes/field_analyser.cpp
@@ -165,7 +165,7 @@ void FieldAnalyser::resolve_args(Probe &probe)
         probe_type != ProbeType::uprobe)
       continue;
 
-    if (ap->need_expansion) {
+    if (ap->need_expansion()) {
       std::set<std::string> matches;
 
       // Find all the matches for the wildcard..

diff --git a/src/ast/passes/semantic_analyser.cpp b/src/ast/passes/semantic_analyser.cpp
@@ -424,7 +424,7 @@ void SemanticAnalyser::visit(Builtin &builtin)
       ProbeType type = probetype(attach_point->provider);
 
       if (type == ProbeType::tracepoint) {
-        probe->need_expansion = true;
+        attach_point->need_full_expansion = true;
         builtin_args_tracepoint(attach_point, builtin);
       }
     }