Rework the matchBinaries selector implementation #1731
This new struct stores information about the binary path from the moment we persist the execve information in the execve_map. We tried to minimize the size of what we store, ending up with 256 bytes instead of the theoretical maximum MAX_PATH of 4096 bytes (+ metadata). This will be useful when evaluating matchBinaries at a later stage and retrieving the information about the process from the execve_map. Signed-off-by: Mahe Tardy <mahe.tardy@gmail.com>
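As a rough illustration of the fixed-size storage described in this commit message, here is a minimal userspace sketch in Go. The type name `Binary`, its field names, and the helper `NewBinary` are hypothetical and are not taken from the Tetragon sources; only the 256-byte size comes from the commit message.

```go
package main

import "fmt"

// binaryPathMax is the storage size quoted in the commit message (256 bytes).
const binaryPathMax = 256

// Binary is a hypothetical mirror of the per-process binary record.
// The field names are assumptions made for this sketch.
type Binary struct {
	PathLength uint32
	Path       [binaryPathMax]byte
}

// NewBinary copies at most binaryPathMax bytes of path into a fixed-size
// record; anything beyond that is silently dropped.
func NewBinary(path string) Binary {
	var b Binary
	n := copy(b.Path[:], path)
	b.PathLength = uint32(n)
	return b
}

// String returns the stored (possibly truncated) path.
func (b Binary) String() string {
	return string(b.Path[:b.PathLength])
}

func main() {
	b := NewBinary("/usr/bin/cat")
	fmt.Println(b.String(), b.PathLength)
}
```

A fixed-size array keeps every execve_map value the same size, at the cost of truncating paths longer than the chosen limit.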
This commit introduces changes to the proc reader, which scans /proc at startup and initializes the execve_map with information about processes that were started before Tetragon. It also moves the code that trims p.args when the process information would not fit in the allocated buffer. We previously did this while parsing /proc, which was too early: it is not needed for execve_map initialization (where we now need at least 255 bytes of the binary path guaranteed), but it is needed when pushing the execve event, which is where it was moved. We also need the 'exe' value at execve_map initialization, but it was merged with 'cmdline' early during /proc parsing since it was not needed separately before. Now we merge 'exe' and 'cmdline' on demand at a later stage (again, when pushing the execve event). Signed-off-by: Mahe Tardy <mahe.tardy@gmail.com>
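The "merge and trim on demand" idea from this commit message can be sketched as follows. This is only an illustration under assumptions: the type `procInfo`, the function `eventPayload`, and the NUL-separated layout are hypothetical and do not reflect Tetragon's actual event encoding.

```go
package main

import "fmt"

// procInfo is a hypothetical, simplified view of what a /proc reader collects.
// Exe and Args stay separate until an event is actually built.
type procInfo struct {
	Exe  string // e.g. resolved from /proc/<pid>/exe
	Args string // e.g. read from /proc/<pid>/cmdline
}

// eventPayload merges exe and args on demand and trims only the args so the
// result fits in size bytes. The exe part is never shortened here.
func eventPayload(p procInfo, size int) string {
	room := size - len(p.Exe) - 1 // bytes left for args after exe and separator
	if room < 0 {
		room = 0
	}
	args := p.Args
	if len(args) > room {
		args = args[:room]
	}
	return p.Exe + "\x00" + args
}

func main() {
	p := procInfo{Exe: "/bin/ls", Args: "-l /tmp"}
	fmt.Printf("%q\n", eventPayload(p, 64)) // fits: args kept whole
	fmt.Printf("%q\n", eventPayload(p, 10)) // too small: args trimmed
	fmt.Printf("%q\n", p.Exe)               // the untrimmed exe is still available
}
```

Because trimming happens only when the event is built, the untouched `Exe` value remains available for initializing the execve_map.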
Thanks! This is a good improvement! I have done a first review and added some comments.
This is not tied to this PR, but the binary path is just the arg passed to execve and thus can be a relative path (this explains many of the issues users report). A future patch is needed to read the absolute path of the task_struct (as we do on userspace side with /proc to fill the initial state of the execve_map) to make this feature complete.
I agree that this is not tied to this PR, but it would also be a great improvement to matchBinaries for a follow-up PR. I believe what you do here will help to have the full binary path ready from the kernel. Maybe we need to create an issue to keep track of this (if we do not already have one).
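To make the relative-path limitation concrete, here is a tiny sketch. The path set `allowed` and the helper `matchIn` are hypothetical; the point is only that an exact-match lookup on the string passed to execve cannot recognize a relative invocation of the same binary.

```go
package main

import "fmt"

// allowed is a hypothetical matchBinaries path set used with the In operator.
var allowed = map[string]struct{}{"/usr/bin/cat": {}}

// matchIn reports whether the path recorded at execve time is in the set.
func matchIn(recorded string) bool {
	_, ok := allowed[recorded]
	return ok
}

func main() {
	fmt.Println(matchIn("/usr/bin/cat")) // absolute path: exact match
	fmt.Println(matchIn("./cat"))        // same binary started from /usr/bin: no match
}
```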
Cool, thanks for the careful review. I think everything was either already fixed or I fixed it in the last push. Thanks for the insight on the clone thing 🙏 !
Thanks for taking the time to apply my proposed fixes!
Thanks! LGTM!
Let's replace the Fatal with a warning, unless there is a reason not to do so.
Everything else is a nit so feel free to opt out.
A new binary struct was added on the BPF side to store part of the binary path inside the execve_map values (to do the comparison at a later stage). Signed-off-by: Mahe Tardy <mahe.tardy@gmail.com>
This copies the information to persist in the execve_map. Signed-off-by: Mahe Tardy <mahe.tardy@gmail.com>
This commit introduces the new "tg_mb_sel_opts" map, which stores the matchBinaries selector options set from userspace for use on the BPF side. It adapts the code to parse the selector and to populate the map with the options at program loading time. Signed-off-by: Mahe Tardy <mahe.tardy@gmail.com>
Signed-off-by: Mahe Tardy <mahe.tardy@gmail.com>
This commit introduces a new implementation of matchBinaries that uses the binary path stored with the execve information to match against a hash map of the matchBinaries paths for the In and NotIn operators. This hash map lives in a map of maps, which can contain one hash map per matchBinaries selector. Note that a first iteration used the strings machinery, with multiple string maps to reduce CPU cycles, but it proved to be too complex for 4.19. This also removes unnecessary fields and code from the old matchBinaries implementation. Signed-off-by: Mahe Tardy <mahe.tardy@gmail.com>
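The map-of-maps lookup described in this commit message can be modeled in userspace roughly as follows. This is a sketch under assumptions, not the BPF implementation: the names `matchBinaries`, `selectorPaths`, `pathKey`, and the behavior for a selector without an entry are all hypothetical.

```go
package main

import "fmt"

// pathLen mirrors the 256-byte stored binary path mentioned in the PR.
const pathLen = 256

// pathKey is a fixed-size key, as a hash map keyed on a byte array would be.
type pathKey [pathLen]byte

// toKey zero-pads (or truncates) a path into a fixed-size key.
func toKey(path string) pathKey {
	var k pathKey
	copy(k[:], path)
	return k
}

type operator int

const (
	opIn operator = iota
	opNotIn
)

// selectorPaths is the inner hash map for one matchBinaries selector.
type selectorPaths struct {
	op    operator
	paths map[pathKey]struct{}
}

// matchBinaries plays the role of the outer map: selector index -> inner map.
type matchBinaries map[uint32]selectorPaths

// add registers the paths of one selector (hypothetical helper).
func (m matchBinaries) add(idx uint32, op operator, paths ...string) {
	set := make(map[pathKey]struct{}, len(paths))
	for _, p := range paths {
		set[toKey(p)] = struct{}{}
	}
	m[idx] = selectorPaths{op: op, paths: set}
}

// match does a single hash lookup of the stored binary path.
// Assumption: a selector with no entry does not filter anything.
func (m matchBinaries) match(idx uint32, binary string) bool {
	sel, ok := m[idx]
	if !ok {
		return true
	}
	_, found := sel.paths[toKey(binary)]
	if sel.op == opNotIn {
		return !found
	}
	return found
}

func main() {
	m := matchBinaries{}
	m.add(0, opIn, "/usr/bin/cat")
	m.add(1, opNotIn, "/usr/bin/cat")
	fmt.Println(m.match(0, "/usr/bin/cat"), m.match(0, "/usr/bin/ls"))
	fmt.Println(m.match(1, "/usr/bin/cat"), m.match(1, "/usr/bin/ls"))
}
```

With this layout, In and NotIn share the same lookup and differ only in how the result is interpreted.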
This cleans up the old implementation and adds the userspace side of the new implementation: parsing the matchBinaries selectors and populating the map with the paths at program loading time. Signed-off-by: Mahe Tardy <mahe.tardy@gmail.com>
Signed-off-by: Mahe Tardy <mahe.tardy@gmail.com>
This test is a bit more advanced than the previous usual ones (TestKprobeMatchBinaries) since it can check for the absence of the filtered event from the output of the perfring. Signed-off-by: Mahe Tardy <mahe.tardy@gmail.com>
The previous matchBinaries selector implementation would skip events triggered by processes started before Tetragon. Signed-off-by: Mahe Tardy <mahe.tardy@gmail.com>
Thank you both. Anastasios, I tried to fix all issues related to that #1731 (comment), and Kornilios, I changed the switch and the Fatal log.
This PR reimplements the matchBinaries selector. Instead of using a single map and doing the binary filtering early at the execve level (see more details here), these changes make Tetragon store the execve binary path in the execve_map, which holds the state of all processes running on the machine. We can thus retrieve this binary path later, when evaluating matchBinaries, and perform a hash lookup.
It has two main benefits:
- [...] Prefix and NotPrefix operators. And after, Postfix and NotPostfix.
It also has some limitations:
- [...] task_struct (as we do on userspace side with /proc to fill the initial state of the execve_map) to make this feature complete.
Note that these changes generally reduce the number of instructions in BPF programs; see the veristat report.