Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unify and improve ignoring of missing probes #3246

Merged
merged 6 commits into from
Jun 20, 2024

Conversation

viktormalik
Copy link
Contributor

@viktormalik viktormalik commented Jun 17, 2024

For kprobes and uprobes, when a probe is attaching to multiple functions, we do not fail when some functions do not exist. The reason is that k(u)probes are dynamic events which regularly change between versions of kernel/binaries and we want to allow the tools to attach to all potential events while expecting some attachments to fail.

The problem is that we're not doing the ignoring consistently. For kprobes, we always show a warning if an attachment fails. For uprobes, bpftrace currently fails due to #3235 but even without the bug, we would just silently ignore the missing functions.

This PR unifies the approaches by always showing the warning. In addition, it adds a new config option warn_missing_probes, by default set to true, which allows to turn the warnings off. This is done for all the shipped tools.

Fixes #3235.

The first commit also fixes another attachment issue caused by a combination of kprobe_multi and progfd caching. See the commit message for details.

Note: one potential problem with ignoring missing probe warnings is that when all attach points of a probe are missing, the probe will not be attached at all without any warning. Normally, this is IMHO fine as the user must have requested to ignore warnings so he must be prepared that this may happen. The only problem is the shipped xfsdist.bt tool which behaves like this when the XFS module is not loaded but I wouldn't expect people to try to trace XFS calls without XFS being present on their system.

Checklist
  • Language changes are updated in man/adoc/bpftrace.adoc
  • User-visible and non-trivial changes updated in CHANGELOG.md
  • The new behaviour is covered by tests

@viktormalik
Copy link
Contributor Author

There are several user-facing changes here so I ended up adding changelog entries for almost each commit (except for the second one as that fixes an unreleased bug).

Copy link
Member

@danobi danobi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall lgtm

if (error_on_missing_symbol) {
throw FatalUserException(msg);
} else {
LOG(WARNING) << msg << ", skipping.";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be more clear to say "skipping probe." rather than just "skipping."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I updated the error/warning messages.

src/attached_probe.cpp Show resolved Hide resolved
@@ -1155,7 +1163,7 @@ void AttachedProbe::attach_uprobe(int pid, bool safe_mode)
return;
}

if (!resolve_offset_uprobe(safe_mode))
if (!resolve_offset_uprobe(safe_mode, probe_.orig_name == probe_.name))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So basically if multiple attachpoints are on a probe, we allow missing attachpoint with a warning. But if only one attachpoint is specified, it's an error.

I think this makes sense

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exactly. That's also how it works for kprobes.

@viktormalik viktormalik force-pushed the probe-warnings branch 2 times, most recently from 423fb76 to b778fa2 Compare June 18, 2024 07:49
man/adoc/bpftrace.adoc Show resolved Hide resolved
src/attached_probe.cpp Show resolved Hide resolved
tools/gethostlatency.bt Outdated Show resolved Hide resolved
@ajor
Copy link
Member

ajor commented Jun 18, 2024

I'm thinking that it might be useful to add the option to have missing probes trigger an error instead of a warning too. e.g. in some kind of automated tracing system it's unlikely that people will see the warnings - an error would stop them going unnoticed.

We don't have to implement that now, but it'd be good to make the config option future-proof, e.g. maybe with options like:

missing_probes = "ignore"
missing_probes = "warn"
missing_probes = "error"

@jordalgo
Copy link
Contributor

+1 to @ajor 's suggestion for making the config an enum of sorts

@viktormalik
Copy link
Contributor Author

I'm thinking that it might be useful to add the option to have missing probes trigger an error instead of a warning too. e.g. in some kind of automated tracing system it's unlikely that people will see the warnings - an error would stop them going unnoticed.

We don't have to implement that now, but it'd be good to make the config option future-proof, e.g. maybe with options like:

missing_probes = "ignore"
missing_probes = "warn"
missing_probes = "error"

Great idea! I implemented it with the above proposed options.

@viktormalik
Copy link
Contributor Author

@jordalgo please check 8113d59, it's a bit of hacky but I didn't come with another solution which wouldn't make the code too complicated. I used the same approach for the new config option.

@ajor ajor self-assigned this Jun 19, 2024
Copy link
Member

@ajor ajor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the warnings / errors for kprobes are a different code path than for uprobes - would it make sense to add tests for the new config option with kprobes as well?

tools/biostacks.bt Outdated Show resolved Hide resolved
@@ -23,13 +24,11 @@ class AttachedProbe {
AttachedProbe(Probe &probe,
BpfProgram &&prog,
bool safe_mode,
BPFfeature &feature,
BTF &btf);
BPFtrace &bpftrace);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this was just to add access to Config, I'd prefer to pass bpftrace.config_ into AttachedProbe along with feature and btf to make it simpler to understand what AttachedProbe needs specifically - BPFtrace is a monstrosity that's too big for its own good!

is_traceable_func() complicates things though :(

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you just described my line of thought - I started with passing just Config and moving to BPFtrace once I realized that I need is_traceable_func, too.

Some editors keep removing the trailing whitespaces on saving so better
remove them in a separate commit rather than merging the change with
another manual update.
When an invalid value is passed to the "cache_user_symbols" config
option, we print an error but do not fail b/c the error is not printed
to ConfigAnalyser::err_. Since the method for setting the value is
called from other places than ConfigAnalyser (when setting the option
from env), let's keep the current behaviour but add an extra empty

    LOG(ERROR, assignment.expr->loc, err_);

to ConfigAnalyser. This will make ConfigAnalyser fail and also print the
error location:

    ERROR: Invalid value for cache_user_symbols: valid values are PER_PID, PER_PROGRAM, and NONE.
    stdin:1:33-41: ERROR:
    config = { cache_user_symbols = "PERPID" } u:/bin/bash:readline { exit(); }

This a bit of a hack and the output is not super-nice but it's the
cleanest thing code-wise.
We cannot use cached fds (pre-loaded BPF programs) for kprobes in
presence of kprobe_multi b/c the BPF_TRACE_KPROBE_MULTI attach type must
be passed upon program loading. If we use prog fd caching and we have a
probe with both wildcarded and non-wildcarded attach points, attachment
will fail based on the order of probes.

For example, for the following program:

    # bpftrace -e 'k:vfs_open,k:vfs_read* {}'

it depends on which probe is loaded first. If it's 'k:vfs_open', then
the BPF_TRACE_KPROBE_MULTI will be missing and when the other probe
tries to reuse the BPF program but attach it using kprobe_multi link,
the attachment will fail.

This fixes the issue by forbidding progfd reuse in presence of
kprobe_multi. It doesn't bring much advantage in such a case anyways.
When a probe is attached to multiple attach points, do not fail hard
when some of the attach points are missing as it may be intentional.
Just show a warning as we do for kprobes.

Previously, we would silently ignore such probes but commit 056005a
("Move probe expansion into codegen") changed that behaviour so take the
opportunity here and unify the behaviour for kprobes and uprobes.
@viktormalik
Copy link
Contributor Author

It looks like the warnings / errors for kprobes are a different code path than for uprobes - would it make sense to add tests for the new config option with kprobes as well?

Yes, good idea, added.

@viktormalik
Copy link
Contributor Author

@jordalgo @danobi want to have a look once more after the latest changes?

Otherwise I'll go ahead, merge this, and craft the 0.21 release.

@jordalgo
Copy link
Contributor

@jordalgo please check 8113d59, it's a bit of hacky but I didn't come with another solution which wouldn't make the code too complicated. I used the same approach for the new config option.

This is just to get the location in the code for the error output right? If so, that makes sense and doesn't seem too hacky.

man/adoc/bpftrace.adoc Outdated Show resolved Hide resolved
man/adoc/bpftrace.adoc Outdated Show resolved Hide resolved
Copy link
Contributor

@jordalgo jordalgo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few minor adoc nits. JC, do you want to make the config an env var as well like the others?

@ajor
Copy link
Member

ajor commented Jun 19, 2024

Few minor adoc nits. JC, do you want to make the config an env var as well like the others?

I don't think we need to do that for every new config option. I see env vars as mainly being for backwards compatibility

@jordalgo
Copy link
Contributor

I don't think we need to do that for every new config option.

I will say that the one benefit is that they go into the --help output, which is nice - but maybe we should just add the config options instead of env vars to the help output.

Add a new config option "missing_probes" which allows the user to choose
how to handle missing k(u)probes for probes which have multiple attach
points.

The possible options are:
- "error"  - print an error and terminate,
- "warn"   - (default) print a warning but continue execution,
- "ignore" - silently ignore missing probes.

Note that this only applies to probes with multiple attach points. For
probes with a single attachpoint, a missing probe always triggers an
error.
Many tools try to attach the same probe to multiple k(u)probes as
different attach points may be available on different systems. For such
tools, use the newly introduced config option to disable warnings about
the missing probes as they are often expected.
@viktormalik
Copy link
Contributor Author

I don't think we need to do that for every new config option.

Agreed with @ajor here, no need to add more env vars.

I will say that the one benefit is that they go into the --help output, which is nice - but maybe we should just add the config options instead of env vars to the help output.

Putting config options to help instead of env vars sounds good to me. Let's discuss in office hours.

@viktormalik viktormalik merged commit 5b6013d into bpftrace:master Jun 20, 2024
17 checks passed
@viktormalik viktormalik deleted the probe-warnings branch June 28, 2024 05:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Attaching to non-existing uprobes fails
4 participants