eBPF control plane signals #3237

Merged
NDStrahilevitz merged 5 commits into aquasecurity:main from control_plane_events
Jun 26, 2023

NDStrahilevitz
Collaborator

@NDStrahilevitz NDStrahilevitz commented Jun 14, 2023

Add a control plane package that attaches specialized, shorter eBPF probes to guarantee that lifecycle events reach Tracee.
Currently, only container lifecycle events are handled.

Author: Nadav Strahilevitz <nadav.strahilevitz@aquasec.com>
Date:   Sun Jun 4 09:04:16 2023 +0000

    ebpf: refactor arg buffer to type

Author: Nadav Strahilevitz <nadav.strahilevitz@aquasec.com>
Date:   Wed Jun 7 12:14:55 2023 +0000

    bufferdecoder: add DecodeArguments method

Author: Nadav Strahilevitz <nadav.strahilevitz@aquasec.com>
Date:   Wed Jun 7 12:29:37 2023 +0000

    events/parse: refactor parse.ArgVal

    Take arguments array directly instead of the entire event.

Author: Nadav Strahilevitz <nadav.strahilevitz@aquasec.com>
Date:   Wed Jun 7 12:31:05 2023 +0000

    ebpf: add control plane

    The Control Plane's intention is to process "signal programs" which
    would previously depend on the event submission success.

    Instead, a separate buffer and slimmer type is used for these critical
    event processes.

Resolve #3086
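
For readers who want a picture of the idea, here is a minimal Go sketch of a control plane that consumes lifecycle signals from its own channel rather than from the regular event pipeline. It is not the code from this PR: `Signal`, `SignalID`, `Controller`, and the `containers` map are hypothetical names chosen for illustration, and the channel merely stands in for the separate buffer described in the commit message.

```go
package main

import "fmt"

// SignalID identifies a control plane signal. The values are hypothetical.
type SignalID uint32

const (
	SignalCgroupMkdir SignalID = iota + 1
	SignalCgroupRmdir
)

// Signal is a slimmer type than a full event: an ID plus a few arguments.
type Signal struct {
	ID   SignalID
	Args map[string]uint64
}

// Controller reads signals from its own channel, so a congested event
// pipeline cannot delay or drop lifecycle updates.
type Controller struct {
	signals    <-chan Signal
	containers map[uint64]bool // cgroup ID -> currently known
}

// NewController builds a Controller that reads from the given channel.
func NewController(signals <-chan Signal) *Controller {
	return &Controller{signals: signals, containers: make(map[uint64]bool)}
}

// Run processes signals until the channel is closed.
func (c *Controller) Run() {
	for sig := range c.signals {
		if err := c.handle(sig); err != nil {
			fmt.Println("control plane:", err)
		}
	}
}

// handle updates the container bookkeeping for a single signal.
func (c *Controller) handle(sig Signal) error {
	id, ok := sig.Args["cgroup_id"]
	if !ok {
		return fmt.Errorf("signal %d: missing cgroup_id", sig.ID)
	}
	switch sig.ID {
	case SignalCgroupMkdir:
		c.containers[id] = true
	case SignalCgroupRmdir:
		delete(c.containers, id)
	default:
		return fmt.Errorf("unknown signal %d", sig.ID)
	}
	return nil
}

func main() {
	signals := make(chan Signal, 4)
	ctrl := NewController(signals)

	signals <- Signal{ID: SignalCgroupMkdir, Args: map[string]uint64{"cgroup_id": 42}}
	signals <- Signal{ID: SignalCgroupMkdir, Args: map[string]uint64{"cgroup_id": 43}}
	signals <- Signal{ID: SignalCgroupRmdir, Args: map[string]uint64{"cgroup_id": 42}}
	close(signals)

	ctrl.Run()
	fmt.Println("tracked cgroups:", len(ctrl.containers)) // prints "tracked cgroups: 1"
}
```

The point of the separation is that the handler's state stays correct even when regular events are filtered out or lost, which is the dependency on "event submission success" that the commit message wants to remove.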

@NDStrahilevitz
Collaborator Author

@AlonZivony FYI, if this is merged before the process tree, it might be a good idea to connect it to this instead of the regular event buffer. If you see any limitations for that LMK here so I can address them.

@yanivagman @rafaeldtinoco Currently we automatically pass cgroup_mkdir/rmdir in all scopes with no filter check, WDYT about removing the bypass in this PR?

@NDStrahilevitz NDStrahilevitz force-pushed the control_plane_events branch 4 times, most recently from 7d1ddd4 to 29a8fed on June 15, 2023 12:28

@yanivagman
Collaborator

@yanivagman @rafaeldtinoco Currently we automatically pass cgroup_mkdir/rmdir in all scopes with no filter check, WDYT about removing the bypass in this PR?

Yes please.
Going to review it next.

pathDecoder := New(trimmedPath)
sunPath, err = readStringVarFromBuff(pathDecoder, 108)
}
sunPath, err := readStringVarFromBuff(ebpfMsgDecoder, 108)
Collaborator

There were reasons for the above additions like having the extra NULL bytes, why do you think it is ok to remove those?

Collaborator Author

As I saw it, doing it in another buffer was unnecessary overhead (mostly in terms of readability). Moving the cursor has the same effect on the buffer.

Collaborator Author

@yanivagman About this and the below comment, this is a one time used function for this specific flow, I don't think it's necessary to formalize it for this PR. Maybe in another one, but it's a very niche function not used for anything else.

res = append(res, byte(char))
err = decoder.DecodeInt8(&char)
if err != nil {
return "", errfmt.Errorf("error reading null terminated string: %v", err)
}
}
res = bytes.TrimLeft(res[:], "\000")
decoder.cursor += max - count // move cursor to end of buffer
Collaborator

We never access the cursor directly in this file but always using the DecodeXXX decoder methods.
Maybe we should consider converting this function to be a decoder method?

Collaborator Author

Well then the package boundary may be wrong. I saw these functions as utils which shouldn't be in the decoder itself but do have access to it. But maybe it should be added to the decoder as you suggest.
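
To make the suggestion concrete, here is a rough Go sketch of what reading a fixed-width, NUL-terminated field as a decoder method could look like. It is an illustration only: `Decoder`, `NewDecoder`, and `DecodeFixedString` are hypothetical stand-ins, not the `bufferdecoder` API, and the sketch ignores the leading-NUL case that the reviewed code handles with `bytes.TrimLeft`.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// Decoder is a hypothetical, stripped-down buffer decoder with a cursor.
type Decoder struct {
	buffer []byte
	cursor int
}

// NewDecoder wraps a raw buffer, starting at offset 0.
func NewDecoder(buf []byte) *Decoder {
	return &Decoder{buffer: buf}
}

// DecodeFixedString reads a NUL-terminated string from a field that is
// exactly max bytes wide. The cursor always ends up at the end of the
// field, wherever the terminator was found, so the next Decode call starts
// at the right offset. Only the method touches the cursor, which keeps
// cursor handling inside the decoder.
func (d *Decoder) DecodeFixedString(max int) (string, error) {
	if max < 0 || d.cursor+max > len(d.buffer) {
		return "", errors.New("fixed string field exceeds buffer")
	}
	field := d.buffer[d.cursor : d.cursor+max]
	end := bytes.IndexByte(field, 0)
	if end < 0 {
		end = max // no terminator: the string fills the whole field
	}
	d.cursor += max // skip the padding after the terminator
	return string(field[:end]), nil
}

func main() {
	// An 8-byte path field padded with NUL bytes, followed by one more byte.
	buf := append([]byte("/tmp/s\x00\x00"), 0x2a)
	d := NewDecoder(buf)

	s, err := d.DecodeFixedString(8)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%q next=%d\n", s, d.buffer[d.cursor]) // prints "/tmp/s" next=42
}
```

Advancing by the full field width has the same effect as the `decoder.cursor += max - count` line under review, without a second buffer.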

pkg/bufferdecoder/protocol_test.go (outdated, resolved)
pkg/bufferdecoder/protocol.go (outdated, resolved)
pkg/ebpf/c/common/buffer.h (outdated, resolved)
pkg/ebpf/c/tracee.bpf.c (resolved)
pkg/ebpf/c/types.h (resolved)
@@ -89,7 +89,7 @@ func (t *Tracee) enrichContainerEvents(ctx gocontext.Context, in <-chan *trace.E
 	cgroupId := uint64(event.CgroupID)
 	// CgroupMkdir: pick EventID from the event itself
 	if eventID == events.CgroupMkdir {
-		cgroupId, _ = parse.ArgVal[uint64](event, "cgroup_id")
+		cgroupId, _ = parse.ArgVal[uint64](event.Args, "cgroup_id")
 	}
 	// CgroupRmdir: clean up remaining events and maps
 	if eventID == events.CgroupRmdir {
Collaborator

Is the below code related to event enrichment of cgroup mkdir/rmdir or should we move it to be handled in the control plane?

Collaborator Author

We could add an enrich call at the end of the control plane logic so the code here could take less time (since enrichment would already be done), that's a good idea.
But the change done here is just to align with the parse.ArgVal function signature change.
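
As an illustration of the signature change, here is a small Go sketch of a generic lookup that takes the arguments slice directly instead of the whole event. The `Argument` type and this `ArgVal` body are simplified assumptions made for the example; they are not copied from the `parse` package.

```go
package main

import "fmt"

// Argument is a hypothetical stand-in for a single event argument.
type Argument struct {
	Name  string
	Value interface{}
}

// ArgVal finds the argument with the given name and asserts it to type T.
// Taking the slice rather than the whole event means the same helper works
// for anything that carries arguments, including slimmer signal types.
func ArgVal[T any](args []Argument, name string) (T, error) {
	var zero T
	for _, arg := range args {
		if arg.Name != name {
			continue
		}
		val, ok := arg.Value.(T)
		if !ok {
			return zero, fmt.Errorf("argument %q has type %T, not %T", name, arg.Value, zero)
		}
		return val, nil
	}
	return zero, fmt.Errorf("argument %q not found", name)
}

func main() {
	args := []Argument{{Name: "cgroup_id", Value: uint64(42)}}

	id, err := ArgVal[uint64](args, "cgroup_id")
	fmt.Println(id, err) // prints "42 <nil>"
}
```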

Contributor

By starting the EnrichCgroupInfo sooner (in the control plane), we will help the pipeline (which might receive related events for the same cgroup ID a bit later because of its channel buffer). I think it's a good idea.

Collaborator Author

Added now, will test to see if it works right, please review it as well @rafaeldtinoco.

Collaborator

So if we now enrich using the control plane messages, do we still need to run the code below? (The lines below are currently not changed in this PR.)

pkg/ebpf/events_processor.go (resolved)
pkg/events/parse/params_test.go (resolved)
@rafaeldtinoco
Contributor

I want to review this one pls.

pkg/ebpf/c/maps.h (outdated, resolved)
Contributor

@rafaeldtinoco rafaeldtinoco left a comment

Left some minor comments but LGTM, it's a nice change, thank you!

@NDStrahilevitz NDStrahilevitz force-pushed the control_plane_events branch 4 times, most recently from ab31b88 to 748e303 on June 25, 2023 10:39

@NDStrahilevitz
Collaborator Author

@yanivagman @rafaeldtinoco I believe I've addressed the problems raised in the PR that are relevant to its scope, please re-review.

Collaborator

@yanivagman yanivagman left a comment

LGTM,
we can potentially remove more logic from the events pipeline

@yanivagman
Collaborator

Thanks @NDStrahilevitz, well done!
I think the control plane will be a very good addition to Tracee

@rafaeldtinoco
Contributor

@NDStrahilevitz I'm re-checking with your changes, and will +1 soon.

@rafaeldtinoco
Contributor

@NDStrahilevitz needs rebasing.

@NDStrahilevitz NDStrahilevitz merged commit 676893a into aquasecurity:main Jun 26, 2023
25 checks passed
@NDStrahilevitz NDStrahilevitz deleted the control_plane_events branch June 26, 2023 17:58
@itaysk
Collaborator

itaysk commented Jun 27, 2023

Is there a reason the last commit is separate (fix id override)?

@NDStrahilevitz
Collaborator Author

Is there a reason the last commit is separate (fix id override)?

Because it was a bug that I think existed before the features introduced here, and I just found it while testing it with container enrichment. So I decided to include it in a separate commit and fix it opportunistically as part of the PR.

@itaysk
Collaborator

itaysk commented Jun 27, 2023

makes sense, thanks

@itaysk
Collaborator

itaysk commented Jun 27, 2023

In this case the PR should have closed the bug report issue in addition to the feature issue (it's OK to close two issues with one PR), and that bug issue should have been included in the milestone. Not a huge deal, just taking the learning opportunity.
cc @yanivagman

Development

Successfully merging this pull request may close these issues.

Separate control plane buffer
4 participants