Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework the matchBinaries selector implementation #1731

Merged
merged 11 commits into from
Nov 15, 2023

Conversation

mtardy
Copy link
Member

@mtardy mtardy commented Nov 8, 2023

This PR reimplements the matchBinaries selector. Instead of using a unique map and do the binary filtering early at the execve level (see more details here), these changes make Tetragon stores the execve binary path in the execve_map which stores the state of all processes running on the machine. Thus we can retrieve this binary path when doing the matchBinaries later and perform a hash lookup at a later stage.

It has two main benefits:

  • Events triggered by process started before Tetragon started are now detected by matchBinaries correctly, I added a test for this that is using the perfring test machinery that can check if something is missing in a timely manner.
  • Prepare the way to add the Prefix and NotPrefix operators. And after Postfix and NotPostfix.

It also has some limitations:

  • It might be slower than before (but this change is needed to get prefix/postfix later on)
  • The first implementation was reusing string matchArgs machinery that is more CPU efficient but was too complex for 4.19 limited BPF program size (it increased the BPF progs size by ~1500 instructions). This might be improved in the future: using the string matchArgs variable size maps for >4.19. Note that we can virtually decrease the max size of the binary path to 128 instead of 256 to reduce the overhead.
  • This is not tied to this PR, but the binary path is just the arg passed to execve and thus can be a relative path (this explains many of the users issues). A future patch is needed to read the absolute path of the task_struct (as we do on userspace side with /proc to fill the initial state of the execve_map) to make this feature complete.

Note that these changes are generally reducing the number of instructions in BPF programs, see the veristat report.

@mtardy mtardy added the release-note/bug This PR fixes an issue in a previous release of Tetragon. label Nov 8, 2023
Copy link

netlify bot commented Nov 8, 2023

Deploy Preview for tetragon ready!

Name Link
🔨 Latest commit fc29fa0
🔍 Latest deploy log https://app.netlify.com/sites/tetragon/deploys/654e560952f1aa000881135e
😎 Deploy Preview https://deploy-preview-1731--tetragon.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@mtardy mtardy force-pushed the pr/mtardy/match-binaries-rework branch from 0421686 to 59673ba Compare November 9, 2023 10:10
bpf/process/bpf_fork.c Outdated Show resolved Hide resolved
@mtardy mtardy force-pushed the pr/mtardy/match-binaries-rework branch from 59673ba to 0483ad4 Compare November 9, 2023 10:14
@mtardy mtardy marked this pull request as ready for review November 9, 2023 10:18
@mtardy mtardy requested a review from a team as a code owner November 9, 2023 10:18
@mtardy mtardy requested a review from jrfastab November 9, 2023 10:18
@mtardy mtardy added kind/enhancement This improves or streamlines existing functionality area/bpf This is related to BPF code labels Nov 9, 2023
This new struct will store information about the binary path from the
moment we persist the execve information in the execve_map. We tried to
reduce at maximum the size of what we store, ending up with 256 bytes
instead of the theoretical maximum MAX_PATH 4096 bytes (+ metadata).

This will be useful when doing the matchBinary at a later stage and
retrieving the information about the process from the execve_map.

Signed-off-by: Mahe Tardy <mahe.tardy@gmail.com>
This commits introduces changes on the proc reader part, which scans
/proc at startup and initialize/fill execve_map with information of
processes that were started before tetragon.

Also it moves the part that is trimming the p.args if the size of the
process information would not fit in the allocated buffer. We were
previously doing it in the part that parses /proc, which was too early
because in the case of execve_map initialization, it's not needed (and
we now need at least 255 bytes of the binary path guaranted), while it's
needed for pushing the execve event, where it was moved.

We also needed the 'exe' value at execve_map initialization, which was
already merged with 'cmdline' early at /proc parsing since it was not
necessary previously. Now we merge 'exe' and 'cmdline' on demand at a
later stage (again when pushing the execve event).

Signed-off-by: Mahe Tardy <mahe.tardy@gmail.com>
@mtardy mtardy force-pushed the pr/mtardy/match-binaries-rework branch from 0483ad4 to fc29fa0 Compare November 10, 2023 16:10
Copy link
Member

@tpapagian tpapagian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This is a good improvement! I have done a first review and added some comments.

This is not tied to this PR, but the binary path is just the arg passed to execve and thus can be a relative path (this explains many of the users issues). A future patch is needed to read the absolute path of the task_struct (as we do on userspace side with /proc to fill the initial state of the execve_map) to make this feature complete.

I agree that this is not tied to this PR but this would be also a great improvement in matchBinaries for a follow-up PR. I believe what you do here will help to have the full binary path ready from the kernel. Maybe we need to create an issue to keep track of this (if we do not already have one).

pkg/sensors/exec/procevents/proc_reader.go Outdated Show resolved Hide resolved
pkg/sensors/tracing/selectors.go Outdated Show resolved Hide resolved
pkg/selectors/selectors.go Outdated Show resolved Hide resolved
bpf/lib/process.h Show resolved Hide resolved
pkg/selectors/kernel.go Outdated Show resolved Hide resolved
pkg/selectors/selectors.go Outdated Show resolved Hide resolved
pkg/sensors/exec/execvemap/execve.go Outdated Show resolved Hide resolved
@mtardy mtardy force-pushed the pr/mtardy/match-binaries-rework branch from fc29fa0 to a80de98 Compare November 13, 2023 14:26
@mtardy
Copy link
Member Author

mtardy commented Nov 13, 2023

Thanks! This is a good improvement! I have done a first review and added some comments.

This is not tied to this PR, but the binary path is just the arg passed to execve and thus can be a relative path (this explains many of the users issues). A future patch is needed to read the absolute path of the task_struct (as we do on userspace side with /proc to fill the initial state of the execve_map) to make this feature complete.

I agree that this is not tied to this PR but this would be also a great improvement in matchBinaries for a follow-up PR. I believe what you do here will help to have the full binary path ready from the kernel. Maybe we need to create an issue to keep track of this (if we do not already have one).

Cool thanks for the careful review, I think everything was whether already fixed or I fixed it in last push. Thanks for the insight on the clone thing 🙏 !

Copy link
Member

@tpapagian tpapagian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking the time to apply my proposed fixes!

Copy link
Contributor

@kkourt kkourt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! LGTM!

Let's replace the Fatal with a warning, unless there is a reason not to do so.
Everything else is a nit so feel free to opt out.

pkg/sensors/exec/procevents/proc_reader.go Outdated Show resolved Hide resolved
pkg/sensors/exec/procevents/proc_reader.go Outdated Show resolved Hide resolved
bpf/process/types/basic.h Show resolved Hide resolved
A new binary struct was added BPF side to store a part of the binary
path inside the execve_map values (to do comparison at a later stage).

Signed-off-by: Mahe Tardy <mahe.tardy@gmail.com>
This copies the information to persist in the execve_map.

Signed-off-by: Mahe Tardy <mahe.tardy@gmail.com>
This commits introduces the new "tg_mb_sel_opts", that stores the
matchBinaries selector options on userspace, to use on BPF side. It
adapts the code to parse the selector and to populate the map with the
options at progam loading time.

Signed-off-by: Mahe Tardy <mahe.tardy@gmail.com>
Signed-off-by: Mahe Tardy <mahe.tardy@gmail.com>
This commit introduces a new implementation for matchBinaries using the
stored binary path from the execve information to match against a hash
maps of the matchBinaries paths for In and NotIn operators. This hash
map is in a map of maps containing potentially a hash map per
matchBinaries selector.

Note that a first iteration was made using the strings machinery, using
multiple map strings and thus reducing CPU cycles, but it proved to be
too complex for 4.19.

This also remove old unnecessary fields and code for the old
matchBinaries implementation.

Signed-off-by: Mahe Tardy <mahe.tardy@gmail.com>
This cleans up the old implementation and add the userspace side for the
new implementation: parsing the the matchBinaries selectors and
populating the map at program loading time with the paths.

Signed-off-by: Mahe Tardy <mahe.tardy@gmail.com>
Signed-off-by: Mahe Tardy <mahe.tardy@gmail.com>
This test is a bit more advanced than the previous usual ones
(TestKprobeMatchBinaries) since it can check for the absence of the
filtered event from the output of the perfring.

Signed-off-by: Mahe Tardy <mahe.tardy@gmail.com>
Previous matchBinaries selector implementation would skip events
triggered by process started before Tetragon.

Signed-off-by: Mahe Tardy <mahe.tardy@gmail.com>
@mtardy mtardy force-pushed the pr/mtardy/match-binaries-rework branch from a80de98 to 753507f Compare November 15, 2023 11:06
@mtardy
Copy link
Member Author

mtardy commented Nov 15, 2023

Thank you both, Anastasios I tried to fix all issues related to that #1731 (comment), and Kornilios I changed the switch and the Fatal log.

@mtardy mtardy merged commit d3a0a07 into main Nov 15, 2023
31 checks passed
@mtardy mtardy deleted the pr/mtardy/match-binaries-rework branch November 15, 2023 12:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/bpf This is related to BPF code kind/enhancement This improves or streamlines existing functionality release-note/bug This PR fixes an issue in a previous release of Tetragon.
Projects
None yet
3 participants