usdt: DTrace userspace side
This is implemented almost entirely along with the pid provider, which is
reassuringly similar to how it was done in the in-kernel days. They're
really very closely-related beasts, and the same uprobe-based code can
handle both easily enough.  To reflect this, dt_prov_pid.c is renamed to
dt_prov_uprobe.c.

This does several things that are sufficiently intertwined that putting
them in one commit seems most readable:

 - implements USDT probe discovery, ripping out a lot of old ioctl stuff
   and obsolete code handling stuff like structure-copying thunks in the
   Solaris C library and a bunch of obsolete functions around DOF
   acquisition (keeping one which might well be revived in the next
   phase), and adding dt_pid_create_usdt_probes, which scans the
   systemwide uprobe list and creates DTrace-side USDT probes (and their
   associated underlying uprobe-based probes) for any that are relevant
   (see below), using an sscanf-based parser: the uprobe naming scheme
   was designed so that it works with the limitations of such parsers.
   Thanks to the %m conversion specifier, there is no risk of buffer
   overruns if the name components are unexpectedly long.

   Right now this can only create probes for specific processes (those
   named on the command line in probe names, as usual), but in future
   it'll grow the ability to make probes for everything dtprobed has
   spotted probes for.  Because it is driven by the systemwide uprobe
   list, it can create probes for processes that started before DTrace
   did, just like the old in-kernel model.

 - rejigs the pid provider support in dt_prov_uprobe.c (formerly
   dt_prov_pid.c) to use the new uprobe_create mechanism to make pid
   probes, with names consistent with uprobes created by dtprobed for
   USDT probes; the uprobes have names like dt_pid/[pr]_$dev_$ino_$addr.
   USDT probes pass their name components down as encoded uprobe
   arguments, viz:

   p:dt_pid/dt_2d_231d_7fbe933165fd /tmp/runtest.17112/usdt-dlclose1.123064.9010/livelib.so:0x15fd Ptest_prov=\1 Mlivelib__2eso=\2 Fgo=\3 Ngo=\4

   (The [PMFN] prefix is added and stripped off automatically by the
   name en/decoder, and makes sure that no two "args" have the same
   name, even if the probe component is the same, as above.)

 - provides provide_usdt_probe to be called at USDT probe discovery
   time to create USDT probes and their underlying uprobe-based probes.
   The creation of the underlying uprobe-based probes is split off into
   a new function create_underlying that is also used for pid probes.
   Probe naming is designed to ensure that USDT probes and pid probes
   that are created at the same offset are associated with the same
   underlying probe.

   The struct dt_uprobe attached to the underlying probe gains a device
   number (which it should always have had) and keeps track of the
   underlying uprobe name from create_uprobe or USDT probe discovery
   and remembers whether or not DTrace created it (if dtprobed created
   it, dtrace must not delete it).

   USDT probes can be associated with more than one underlying probe
   (if the probe appears repeatedly in a program).  Repeated calls to
   provide_usdt_probe for the same probe description but with different
   offsets will cause the USDT probe to be chained into the appropriate
   underlying probes (creating them as needed).

 - enabling gets a little more complex.  We intern both the overlying
   (pid and USDT) probe *and* the associated underlying probes in the
   enablings list: ->enable for the pid/USDT probe walks its list of
   underlying probes and enables all of them, and also adds the
   overlying probe itself to the enablings list.

 - Trampoline generation has to adapt to this, but also has to use a
   less kludgy way of figuring out the pids the trampoline applies to:
   rather than parsing the name apart on the spot, we ask dt_pid, which
   already has code to *properly* parse apart both pid and usdt names
   and extract the pid from them.

 - stack arg handling needs a bit of a tweak.  USDT probes can take a
   set of arguments and are implemented as a fake function call.
   The underlying uprobe is placed right after the argument setup, so
   stack arguments should be retrieved without applying
   PT_REGS_ARGSTKBASE on platforms where PT_REGS_ARGSTKBASE > 0.  The
   underlying probe gets the PP_IS_FUNCALL flag set to indicate this.
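
As an illustration of the sscanf-based parsing described above, here is a minimal sketch that picks apart the example uprobe line from this message. It is not the code in this commit: the struct uprobe_line type, the parse_uprobe_line and free_uprobe_line helpers, and the exact format string are assumptions made for this example, and the name components are left in their encoded form (e.g. livelib__2eso). It relies on the POSIX.1-2008 %m modifier, so sscanf allocates each string itself and long components cannot overrun a fixed buffer.

```c
#define _POSIX_C_SOURCE 200809L
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for the pieces of one uprobe definition line. */
typedef struct uprobe_line {
	uint64_t dev, ino, addr;	/* from the uprobe name */
	uint64_t off;			/* offset within the mapped object */
	char *path;			/* object path (malloc'ed by %m) */
	char *prv, *mod, *fun, *prb;	/* still-encoded name components */
} uprobe_line_t;

/* Release everything %m may have allocated; free(NULL) is a no-op. */
static void free_uprobe_line(uprobe_line_t *u)
{
	free(u->path);
	free(u->prv);
	free(u->mod);
	free(u->fun);
	free(u->prb);
	memset(u, 0, sizeof(*u));
}

/*
 * Parse a line shaped like the example in the commit message.
 * Returns 0 on success, -1 if the line does not match.  A path that
 * contains ':' is not handled by this sketch, and the trailing "=\4"
 * is not verified because sscanf only counts conversions.
 */
static int parse_uprobe_line(const char *line, uprobe_line_t *u)
{
	int n;

	memset(u, 0, sizeof(*u));
	n = sscanf(line,
		   "p:dt_pid/dt_%" SCNx64 "_%" SCNx64 "_%" SCNx64
		   " %m[^:]:0x%" SCNx64
		   " P%m[^=]=\\1 M%m[^=]=\\2 F%m[^=]=\\3 N%m[^=]=\\4",
		   &u->dev, &u->ino, &u->addr, &u->path, &u->off,
		   &u->prv, &u->mod, &u->fun, &u->prb);
	if (n != 9) {
		free_uprobe_line(u);	/* drop any partial allocations */
		return -1;
	}
	return 0;
}
```

A caller would pass each line read from the systemwide uprobe list to parse_uprobe_line, skip lines for which it returns -1, and call free_uprobe_line once done with a successfully parsed entry.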

Signed-off-by: Nick Alcock <nick.alcock@oracle.com>
Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com>
Reviewed-by: Nick Alcock <nick.alcock@oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com>
nickalcock authored and kvanhees committed Oct 27, 2022
1 parent a28e369 commit 95ac5e2
Showing 13 changed files with 849 additions and 491 deletions.
18 changes: 13 additions & 5 deletions include/dtrace/pid.h
@@ -2,7 +2,7 @@
* Licensed under the Universal Permissive License v 1.0 as shown at
* http://oss.oracle.com/licenses/upl.
*
* Copyright (c) 2009, 2021, Oracle and/or its affiliates. All rights reserved.
* Copyright (c) 2009, 2022, Oracle and/or its affiliates. All rights reserved.
*/

/*
@@ -13,6 +13,7 @@
#ifndef _DTRACE_PID_H
#define _DTRACE_PID_H

#include <sys/types.h>
#include <dirent.h>
#include <dtrace/universal.h>

@@ -26,16 +27,23 @@ typedef enum pid_probetype {
} pid_probetype_t;

typedef struct pid_probespec {
pid_t pps_pid; /* task PID */
pid_probetype_t pps_type; /* probe type */
const char *pps_prv; /* provider (without pid) */
char *pps_mod; /* probe module (object) */
char pps_fun[DTRACE_FUNCNAMELEN]; /* probe function */
ino_t pps_ino; /* object inode */
char *pps_prb; /* probe name (if provided) */
dev_t pps_dev; /* object device node */
ino_t pps_inum; /* object inode */
char *pps_fn; /* object full filename */
uint64_t pps_pc; /* probe address */
uint64_t pps_off; /* probe offset (in object) */

/*
* Fields below this point do not apply to probes of type
* DTPPT_UNDERLYING.
*/
pid_t pps_pid; /* task PID */
uint64_t pps_vaddr; /* object base address */
uint64_t pps_size; /* function size (in bytes) */
uint8_t pps_glen; /* glob pattern length */
char pps_gstr[1]; /* glob pattern string */
} pid_probespec_t;

5 changes: 3 additions & 2 deletions libdtrace/Build
@@ -48,10 +48,10 @@ libdtrace-build_SOURCES = dt_aggregate.c \
dt_program.c \
dt_prov_dtrace.c \
dt_prov_fbt.c \
dt_prov_pid.c \
dt_prov_profile.c \
dt_prov_sdt.c \
dt_prov_syscall.c \
dt_prov_uprobe.c \
dt_provider.c \
dt_provider_tp.c \
dt_regset.c \
@@ -84,13 +84,14 @@ dt_consume.c_CFLAGS := -Wno-pedantic
dt_debug.c_CFLAGS := -Wno-prio-ctor-dtor
dt_cg.c_CFLAGS := -Wno-pedantic
dt_dis.c_CFLAGS := -Wno-pedantic
dt_pid.c_CFLAGS := -Wno-pedantic
dt_proc.c_CFLAGS := -Wno-pedantic
dt_prov_dtrace.c_CFLAGS := -Wno-pedantic
dt_prov_fbt.c_CFLAGS := -Wno-pedantic
dt_prov_pid.c_CFLAGS := -Wno-pedantic
dt_prov_profile.c_CFLAGS := -Wno-pedantic
dt_prov_sdt.c_CFLAGS := -Wno-pedantic
dt_prov_syscall.c_CFLAGS := -Wno-pedantic
dt_prov_uprobe.c_CFLAGS := -Wno-pedantic
dt_debug.c_CFLAGS := -Wno-prio-ctor-dtor
drti.c_CFLAGS := -Wno-prio-ctor-dtor

12 changes: 8 additions & 4 deletions libdtrace/dt_cg.c
@@ -310,12 +310,15 @@ dt_cg_tramp_copy_regs(dt_pcb_t *pcb, int rp)

/*
* Copy arguments from a dt_pt_regs structure referenced by the 'rp' argument.
* If 'called' is nonzero, the registers are laid out as when inside the
* function: if zero, they are laid out as at the call instruction, before the
* function is called (as is done for e.g. usdt).
*
* The caller must ensure that %r7 contains the value set by the
* dt_cg_tramp_prologue*() functions.
*/
void
dt_cg_tramp_copy_args_from_regs(dt_pcb_t *pcb, int rp)
dt_cg_tramp_copy_args_from_regs(dt_pcb_t *pcb, int rp, int called)
{
dtrace_hdl_t *dtp = pcb->pcb_hdl;
dt_irlist_t *dlp = &pcb->pcb_ir;
@@ -361,13 +364,13 @@ dt_cg_tramp_copy_args_from_regs(dt_pcb_t *pcb, int rp)
* rc = bpf_probe_read[_user](dctx->mst->argv[i],
* sizeof(uint64_t),
* &sp[i - PT_REGS_ARGC +
* PT_REGS_ARGSTKBASE]);
* (called ? PT_REGS_ARGSTKBASE : 0)]);
* // mov %r1, %r7
* // add %r1, DMST_ARG(i)
* // mov %r2, sizeof(uint64_t)
* // lddw %r3, [%rp + PT_REGS_SP]
* // add %r3, (i - PT_REGS_ARGC +
* PT_REGS_ARGSTKBASE) *
* (called ? PT_REGS_ARGSTKBASE : 0)) *
* sizeof(uint64_t)
* // call bpf_probe_read[_user]
* if (rc != 0)
@@ -387,7 +390,8 @@ dt_cg_tramp_copy_args_from_regs(dt_pcb_t *pcb, int rp)
emit(dlp, BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, DMST_ARG(i)));
emit(dlp, BPF_MOV_IMM(BPF_REG_2, sizeof(uint64_t)));
emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_3, rp, PT_REGS_SP));
emit(dlp, BPF_ALU64_IMM(BPF_ADD, BPF_REG_3, (i - PT_REGS_ARGC + PT_REGS_ARGSTKBASE) * sizeof(uint64_t)));
emit(dlp, BPF_ALU64_IMM(BPF_ADD, BPF_REG_3, (i - PT_REGS_ARGC +
(called ? PT_REGS_ARGSTKBASE : 0)) * sizeof(uint64_t)));
emit(dlp, BPF_CALL_HELPER(dtp->dt_bpfhelper[BPF_FUNC_probe_read_user]));
emit(dlp, BPF_BRANCH_IMM(BPF_JEQ, BPF_REG_0, 0, lbl_ok));
emit(dlp, BPF_STORE_IMM(BPF_DW, BPF_REG_7, DMST_ARG(i), 0));
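
To make the effect of the new 'called' argument concrete, the following sketch computes the byte offset from the stack pointer that the hunk above adds to %r3 for argument i. The EX_REGS_ARGC and EX_REGS_ARGSTKBASE values are illustrative stand-ins chosen for this example and are not taken from any architecture header in the tree; stack_arg_offset is likewise a hypothetical helper.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative values only (assumptions for this example). */
#define EX_REGS_ARGC		6	/* arguments passed in registers */
#define EX_REGS_ARGSTKBASE	1	/* stack slots skipped once inside the callee */

/*
 * Byte offset from the saved stack pointer to argument i, for
 * i >= EX_REGS_ARGC.  When 'called' is zero (the probe sits at the
 * call site, as for USDT), the extra slots are not skipped.
 */
static ptrdiff_t stack_arg_offset(int i, int called)
{
	return (i - EX_REGS_ARGC + (called ? EX_REGS_ARGSTKBASE : 0)) *
	       (ptrdiff_t)sizeof(uint64_t);
}
```

With these example values, the first stack argument (i = 6) is read at offset 8 inside a called function but at offset 0 for a probe placed at the call site, which is the case the PP_IS_FUNCALL flag is described as marking.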
2 changes: 1 addition & 1 deletion libdtrace/dt_cg.h
@@ -24,7 +24,7 @@ extern void dt_cg_tramp_prologue_act(dt_pcb_t *pcb, dt_activity_t act);
extern void dt_cg_tramp_prologue(dt_pcb_t *pcb);
extern void dt_cg_tramp_clear_regs(dt_pcb_t *pcb);
extern void dt_cg_tramp_copy_regs(dt_pcb_t *pcb, int rp);
extern void dt_cg_tramp_copy_args_from_regs(dt_pcb_t *pcb, int rp);
extern void dt_cg_tramp_copy_args_from_regs(dt_pcb_t *pcb, int rp, int called);
extern void dt_cg_tramp_copy_rval_from_regs(dt_pcb_t *pcb, int rp);
extern void dt_cg_tramp_call_clauses(dt_pcb_t *pcb, const dt_probe_t *prp,
dt_activity_t act);
1 change: 1 addition & 0 deletions libdtrace/dt_impl.h
@@ -319,6 +319,7 @@ struct dtrace_hdl {

dt_htab_t *dt_provs; /* hash table of dt_provider_t's */
const struct dt_provider *dt_prov_pid; /* PID provider */
const struct dt_provider *dt_prov_usdt; /* USDT provider */
dt_proc_hash_t *dt_procs; /* hash table of grabbed process handles */
dt_intdesc_t dt_ints[6]; /* cached integer type descriptions */
ctf_id_t dt_type_func; /* cached CTF identifier for function type */
2 changes: 1 addition & 1 deletion libdtrace/dt_open.c
@@ -66,10 +66,10 @@ const dt_version_t _dtrace_versions[] = {
static const dt_provimpl_t *dt_providers[] = {
&dt_dtrace,
&dt_fbt,
&dt_pid,
&dt_profile,
&dt_sdt,
&dt_syscall,
&dt_uprobe,
};

/*