Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor wildcard matching #1549

Merged
merged 7 commits into from
Feb 11, 2021
Merged

Conversation

viktormalik
Copy link
Contributor

@viktormalik viktormalik commented Oct 8, 2020

The main goal of this work is to unify code used for wildcard matching into one place. Currently, probe listing uses different code to resolve wildcards than probe attachment does, which results in possible discrepancies (such as in #565). The idea is that if some probes are listed, they can be attached with the same syntax.

There are two primary changes done:

  • all code related to probe matching is moved into a new class ProbeMatcher,
  • when listing probes, we use the standard Bison parser and AttachPointParser to parse the listing query. This required a change in the parser grammar: different rule for attach points for listing (when attach point is a listing query) and for probe attachement.

Some details worth mentioning:

  • Listing allows to use wildcards in probe type, while attachment originally doesn't. To cope with this, AttachPointParser tries to expand the probe type and when multiple types are matched, it creates new AttachPoint objects that replace the original AttachPoint object created by the Bison parser.
  • On the other hand, this PR allows to attach a single program to multiple attach points from different providers using a wildcard (e.g. k*:vfs_open). This should not break anything, though, since it was possible before by just simply enumerating the attach points (kprobe:vfs_open,kfunc:vfs_open).
  • One round of semantic analyser is run when listing probes to make the errors consistent for listing and attachement.
  • All usages of regular expressions should be now removed.

Resolves #565, resolves #1489, fixes #1581.

Checklist
  • Language changes are updated in docs/reference_guide.md
  • User-visible and non-trivial changes updated in CHANGELOG.md
  • The new behaviour is covered by tests

@@ -1,5 +1,5 @@
NAME "usdt probes - list probes by file"
RUN bpftrace -l 'usdt:./testprogs/usdt_test'
RUN bpftrace -l 'usdt:./testprogs/usdt_test:*'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the normal syntax to "attach to all probes for that binary". Since I wanted to unify attachment and listing syntax, we either have to require the explicit wildcard for listing or make the wildcard implicit for attachment. I prefer the first option. Also, we would probably have to make the wildcard implicit for all probe types, which I don't really like.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We try to avoid breaking compatibility with existing documentation (books) when possible. But the current behaviour is inconsistent too:

$ sudo bpftrace_master  -l 'u:/bin/bash' | head -n2
uprobe:/bin/bash:__libc_csu_fini
uprobe:/bin/bash:__libc_csu_init
$ sudo bpftrace_master  -l 'k:' | head -n2
$

So I'm ok with breaking it as long as all probe types end up having consistent behaviour. @brendangregg how do you feel about it.

But I do not like the error this throws currently, its confusing:

$ sudo bpftrace -l 'u:/bin/bash'
stdin:1:13-14: ERROR: syntax error, unexpected {
u:/bin/bash {}
            ~

And can even segfault:

$ sudo bpftrace -l 'k:'
Segmentation fault

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We try to avoid breaking compatibility with existing documentation (books) when possible.
...
So I'm ok with breaking it as long as all probe types end up having consistent behaviour.

The current behaviour in this PR is that every probe part should be specified and non-empty (at least * should be given). The only part that can be empty/skipped is namespace for usdt (per documentation) and binary name for uprobe/usdt if PID is specified.

On a second thought, I would agree with empty part of a probe behaving as *. That would keep the backwards compatibility (at least for the above cases). Also, AFAIK, DTrace uses a similar syntax.

But I do not like the error this throws currently, its confusing:
...
And can even segfault:

I'm currently working on this, this is why the PR is in draft state. I'm thinking about running one iteration of semantic analyser, which already has good error messages (and they would be consistent, too).

@viktormalik viktormalik marked this pull request as draft October 9, 2020 04:42
@viktormalik viktormalik force-pushed the refactor-wildcards branch 3 times, most recently from 7e9b227 to e468af3 Compare October 20, 2020 09:55
@viktormalik
Copy link
Contributor Author

viktormalik commented Oct 20, 2020

The PR is ready for review.

The remaining thing to decide is: how do we treat empty probe parts (both for listing and attachment)? Currently, probe listing allows empty probe parts (by treating them as *) but probe attachment doesn't. Since we want to unify this, we have to choose one of the approaches. I'd prefer treating empty probe parts as *, which will keep the compatibility and make the attach point/listing specs shorter. @fbs, @brendangregg WDYT?

@viktormalik viktormalik marked this pull request as ready for review October 20, 2020 10:27
@mmisono
Copy link
Collaborator

mmisono commented Oct 20, 2020

I don't review the code yet but it seems something wrong with struct listing.

before:

% time sudo ./src/bpftrace -lv 'struct path'
BTF: using data from /sys/kernel/btf/vmlinux
struct path {
        struct vfsmount *mnt;
        struct dentry *dentry;
};
0.13s user 0.33s system 92% cpu 0.505 total

after:

% time sudo ./src/bpftrace -lv 'struct path'
BTF: using data from /sys/kernel/btf/vmlinux
struct path {
        struct vfsmount *mnt;
        struct dentry *dentry;
};
struct path_cond {
        kuid_t uid;
        umode_t mode;
};
11.08s user 0.21s system 85% cpu 13.137 total  // too slow

@mmisono
Copy link
Collaborator

mmisono commented Oct 21, 2020

Anther issuse:

% sudo ASAN_OPTIONS=detect_leaks=0 ./src/bpftrace -e 't*a:a{}'
FATAL: program type invalid\
zsh: abort      sudo ASAN_OPTIONS=detect_leaks=0 ./src/bpftrace -e 't*a:a{}'

The program shouldn't abort, should print an error.

@viktormalik
Copy link
Contributor Author

I don't review the code yet but it seems something wrong with struct listing.

before:

% time sudo ./src/bpftrace -lv 'struct path'
BTF: using data from /sys/kernel/btf/vmlinux
struct path {
        struct vfsmount *mnt;
        struct dentry *dentry;
};
0.13s user 0.33s system 92% cpu 0.505 total

after:

% time sudo ./src/bpftrace -lv 'struct path'
BTF: using data from /sys/kernel/btf/vmlinux
struct path {
        struct vfsmount *mnt;
        struct dentry *dentry;
};
struct path_cond {
        kuid_t uid;
        umode_t mode;
};
11.08s user 0.21s system 85% cpu 13.137 total  // too slow

Thanks for noticing, this should be fixed now.

@viktormalik
Copy link
Contributor Author

Anther issuse:

% sudo ASAN_OPTIONS=detect_leaks=0 ./src/bpftrace -e 't*a:a{}'
FATAL: program type invalid\
zsh: abort      sudo ASAN_OPTIONS=detect_leaks=0 ./src/bpftrace -e 't*a:a{}'

The program shouldn't abort, should print an error.

Yes, the problem here is that t*a doesn't match any probe type, which I don't handle properly (hence the abort). But this reveals another inconsistence in bpftrace - how should we handle wildcarded attach points that have no matches?

From the current master:

# bpftrace -e 't:*:nonsense {}'
stdin:1:1-13: ERROR: tracepoints not found: *:nonsense
t:*:nonsense {}
~~~~~~~~~~~~

but on the other hand:

bpftrace -e 'k:nonsense* {}'
No probes to attach

I'd say that this should be unified - either always print an error or always ignore that attach point.

@mmisono
Copy link
Collaborator

mmisono commented Oct 24, 2020

I agree with you, it's better if the behavior is unified. But I guess the reason currently trying to attach nonexistent kprobe is allowed is to make the script work as much as possible because kernel function names can change from version to version. For example, in my environment (kernel 5.9), one of the kprobe in biostacks.bt does not exist.

% sudo ./src/bpftrace ../tools/biostacks.bt
Attaching 5 probes...
cannot attach kprobe, probe entry may not exist
WARNING: could not attach probe kprobe:blk_start_request, skipping.

or more simple example is

% sudo ./src/bpftrace -e "k:a,k:f {}"
Attaching 2 probes...
cannot attach kprobe, probe entry may not exist
WARNING: could not attach probe kprobe:a, skipping.

We also want no to break existing scripts as much as possible. So I think we should keep the current behavior, that means error if wildcard matches any probe type, error if wildcard attach point exists except kprobe (I'm not sure about other than kprobe and tracepoint, so we need check other probes)

@viktormalik
Copy link
Contributor Author

We also want no to break existing scripts as much as possible. So I think we should keep the current behavior, that means error if wildcard matches any probe type, error if wildcard attach point exists except kprobe (I'm not sure about other than kprobe and tracepoint, so we need check other probes)

This sounds like a good way to go. I'd say that we shoudn't fail on non-existing probes for probe types that are not stable: kprobe, kfunc, uprobe, usdt. To sum it up:

  • error if probe type doesn't exist/match,
  • ignore non-existing attach points for non-stable probe types (kprobe, kfunc, uprobe, usdt),
  • error if attach point doesn't exist/match for all other probe types.

@@ -96,6 +96,12 @@ EXPECT kretfunc
REQUIRES_FEATURE btf
TIMEOUT 1

NAME listing with wildcarded probe type
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to make sure, these newly added tests are worked as the same as before? or some behavior are changed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one works the same as before, but e.g. listing userspace probes by wildcarded probe type didn't work.

@mmisono
Copy link
Collaborator

mmisono commented Nov 9, 2020

Looks like this works before: sudo ./src/bpftrace -lv "*:*:*"

$$ = new ast::Probe($1, nullptr, nullptr);
else
{
error(@$, "unexpected end of file, expected {");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

% sudo ./src/bpftrace -l 'k:vfs_open // {}'
stdin:1:12-13: ERROR: syntax error, unexpected end predicate, expecting {
k:vfs_open // {}
           ~

In this case, "unexpected listing query format" is more appropriate?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, but since this is an automatic error message by bison, I haven't yet found a way to implement it in a short way. And I don't like the idea of making large changes to the parser (potentially introducing errors) just for the sake of fixing one error message.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I also don't familiar with bison, and the current behavior is OK to me.

src/btf.cpp Outdated Show resolved Hide resolved
@mmisono
Copy link
Collaborator

mmisono commented Nov 9, 2020

I realized that now

% sudo ./src/bpftrace -l 'k:vfs_write,k:vfs_open'
kprobe:vfs_write
kprobe:vfs_open

works. It is good thing, but it seems -v option does not work. ./src/bpftrace -lv 'k:vfs_write,k:vfs_open' gives the same output as the above.

@mmisono
Copy link
Collaborator

mmisono commented Nov 9, 2020

I'll check the remaining code when I have time.

@viktormalik
Copy link
Contributor Author

I realized that now

% sudo ./src/bpftrace -l 'k:vfs_write,k:vfs_open'
kprobe:vfs_write
kprobe:vfs_open

works. It is good thing, but it seems -v option does not work. ./src/bpftrace -lv 'k:vfs_write,k:vfs_open' gives the same output as the above.

That's expected, kprobes do not list arguments with -v (neither they did before). It's kfuncs doing that:

# src/bpftrace -lv 'f:vfs_write,f:vfs_open'
kfunc:vfs_write
    struct file * file
    [...]
kfunc:vfs_open
    const struct path * path
    [...]

@mmisono
Copy link
Collaborator

mmisono commented Nov 9, 2020

I realized that now

% sudo ./src/bpftrace -l 'k:vfs_write,k:vfs_open'
kprobe:vfs_write
kprobe:vfs_open

works. It is good thing, but it seems -v option does not work. ./src/bpftrace -lv 'k:vfs_write,k:vfs_open' gives the same output as the above.

That's expected, kprobes do not list arguments with -v (neither they did before). It's kfuncs doing that:

# src/bpftrace -lv 'f:vfs_write,f:vfs_open'
kfunc:vfs_write
    struct file * file
    [...]
kfunc:vfs_open
    const struct path * path
    [...]

my bad, I forgot about it.

@viktormalik
Copy link
Contributor Author

Looks like this works before: sudo ./src/bpftrace -lv "*:*:*"

Right, thanks. This should now work as before.

@mmisono
Copy link
Collaborator

mmisono commented Nov 9, 2020

% sudo ./src/bpftrace -l "k*:vfs_open"
kfunc:vfs_open
kprobe:vfs_open
kretfunc:vfs_open

this is ok but

% sudo  ./src/bpftrace -l "f*:vfs_open"
stdin:1:1-12: ERROR: No probe type matched
f*:vfs_open
~~~~~~~~~~~

% sudo  ./src/bpftrace -lv "*:vfs_open"
stdin:1:1-1: ERROR: vfs_open is not a hardware probe
*:vfs_open

stdin:1:1-1: ERROR: vfs_open is not a software probe
*:vfs_open

@viktormalik
Copy link
Contributor Author

% sudo  ./src/bpftrace -l "f*:vfs_open"
stdin:1:1-12: ERROR: No probe type matched
f*:vfs_open
~~~~~~~~~~~

This is intentional since f* does not match any probe type. I improved the error message, though:

stdin:1:1-12: ERROR: No probe type matched for f*

% sudo ./src/bpftrace -lv "*:vfs_open"
stdin:1:1-1: ERROR: vfs_open is not a hardware probe
*:vfs_open

stdin:1:1-1: ERROR: vfs_open is not a software probe
*:vfs_open

This is now fixed:

# src/bpftrace -lv "*:vfs_open"
kfunc:vfs_open
    const struct path * path
    [...]
kprobe:vfs_open

I also removed kretfunc as it is just a duplicate of kfunc (similarly, kretprobe are not listed).

@mmisono
Copy link
Collaborator

mmisono commented Nov 10, 2020

OK^

It seems listing can only work without other queries. This is not supported previously, but I think it would be useful. What do you think?

sudo  ./src/bpftrace -lv "f:vfs_open, struct path"
stdin:1:12-12: ERROR: Unrecognized probe type:
f:vfs_open, struct path

or

sudo ./src/bpftrace -lv "struct path,struct file"

@viktormalik
Copy link
Contributor Author

It seems listing can only work without other queries. This is not supported previously, but I think it would be useful. What do you think?

Might be useful, but I'd rather not extend this PR with additional features, it's quite large already. We can add this later in a separate PR.

Copy link
Collaborator

@mmisono mmisono left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good and good improvements to me, but definitely more reviewers are needed because of its size. Also, I wonder if we can have tests for ProbeMatcher.

};

// clang-format off
const std::vector<ProbeListItem> SW_PROBE_LIST = {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we need to use a singleton to thread-safe initialization. something like: https://github.com/iovisor/bpftrace/blob/77a25cf9eb76f1107240ee26cca8b7223bb5184f/src/ast/semantic_analyser.cpp#L22-L34
@danobi probably know this.

@viktormalik
Copy link
Contributor Author

Overall looks good and good improvements to me, but definitely more reviewers are needed because of its size. Also, I wonder if we can have tests for ProbeMatcher.

I already thought about this, but then I realized that matching is already tested in tests/bpftrace.cpp, tests/parser.cpp, and in runtime tests. It might make sense to group these into a separate test class for ProbeMatcher (although I would, again, postpone it to another PR).

@fbs
Copy link
Contributor

fbs commented Nov 20, 2020

Overal it looks good to me, nice cleanup.

I'm not that happy with the way this hooks into semantic analysis. It's another thing bolted into the semantic analyser that makes it a bit more complex. A separate simpeler pass seems more appropriate to me. But as its a nice cleanup we shouldn't block on it imo

@fbs
Copy link
Contributor

fbs commented Dec 4, 2020

@danobi any feedback?

I want to merge this soon but it needs a rebase @viktormalik .

@viktormalik
Copy link
Contributor Author

Rebased, ready to merge.

There's still an unresolved issue - how to treat empty probe parts? See this review and this comment for details. Currently, this PR requires that each probe part is specified, which is not backwards compatible with listing (previously, listing allowed to skip probe parts). I think that this is not a big issue (since it's only listing), I'd suggest merging this PR and then maybe adding a possibility to omit empty probe parts as another PR.

@viktormalik
Copy link
Contributor Author

One embedded build test time-outed, but it passes on my fork branch (check here). I guess a restart could solve this.

@viktormalik
Copy link
Contributor Author

Is this waiting for me? I think that the CI fail should be resolved by a restart and otherwise the PR should be ready for merge.

@fbs
Copy link
Contributor

fbs commented Feb 4, 2021

Looks ok to me :)

@danobi
Copy link
Member

danobi commented Feb 9, 2021

Needs a rebase

Refactor probe attachent and probe listing so that the same code is used
for matching wildcards. This consists of two primary changes:
- all code related to probe matching is moved into a new class
  ProbeMatcher
- when listing probes, we use the standard Bison parser and
  AttachPointParser to parse the listing query

The problem with using AttachPointParser for parsing list queries is
that listing (unlike attachement) allows to specify multiple probe types
at once (using wildcard). To cope with this, AttachPointParser tries to
expand the probe type and when multiple types are matched, it creates
new AttachPoint objects that replace the original AttachPoint object
created by the Bison parser.

This also makes attachement of a single program to different probe types
using a wildcard possible. Since attachement to multiple probe types was
possible before (by enumerating them), this should not break anything.

Functions in BTF are changed so that they do not print
functions/params/structs directly, but instead return them in sets that
are processed in ProbeMatcher.

Tests mock ProbeMatcher (in the class MockProbeMatcher) since it
contains methods from BPFtrace that were originally mocked.

This change should ensure that if some probes are listed, they can be
attached with the same syntax.

All usages of regular expressions should be now removed.
This makes sure that the list query (attach point specification) is
checked for semantic errors (such as missing part for the given probe
type) and that the reported errors are meaningful and consistent with
errors reported for probe attachement.
If wildcarded probe type matches multiple actual probe types, ignore
those probe types for which the rest of the probe specification has
incorrect number of parts (instead of throwing an error). Also ignore
non-existing hardware and software probes in such case.

E.g.:
  bpftrace -l '*:*:*'
will list all probes that have 3 parts.
If a probe type doesn't exist or no probe type is matched when a
wildcard is used, terminate with error.
Do not require the trailing colon for path in an attach point, since it
may not be present and then the parser fails with strange error.

Leading colon must always be present as path must follow the probe type
specifier (e.g. "uprobe:/bin/bash").
Add new tests for various usages of wildcards, especially for wildcards
occurring in probe type.
@viktormalik
Copy link
Contributor Author

Rebased

@fbs fbs merged commit af06198 into bpftrace:master Feb 11, 2021
@viktormalik viktormalik deleted the refactor-wildcards branch March 10, 2021 10:27
@ajor ajor mentioned this pull request Apr 12, 2024
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants