New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ProcessTree: add core process tree logic (1/4) #1236
Conversation
I have not fully digested the event timestamp rolling buffer, but I just wanted to point out there would still be consistency issues even if we were using a single EndpointSecurity client. Events come from ES serially, however Santa quickly dispatches most events to an async queue. While you could keep the process tree up to date easier, making decisions based on that tree in those async events wouldn't be consistent. Enforcing Santa to make decisions using the process tree before it dispatches events to an async queue isn't practical either. I assume the event timestamp rolling buffer helps with this type of consistency? |
Correct, but the tree mutations will happen while still in the serial callback from ES, so events affecting the tree are processed in order. Each process once added to the tree is ~immutable, so as long as the tree is updated in a logically consistent order, a downstream user of the tree can see the results of the mutations (e.g. annotations propagated to a newly forked pid) as long as that pid is kept alive in the tree which is where the ProcessToken comes in use. The place where this could get messy is if client A manually sets an annotation and then expects client B to be able to see that. Since there's no serialization between clients, whether B sees the annotation is nondeterministic. This is an intended trade-off though, as annotation logic is meant to be written into annotator classes which are all run once when the event is first processed by the tree. This means that a client could possibly see an annotation set "in the future" (if a sequence of events causes an annotation to be set, and this client is processing those events after another client already has) but I've yet to think of any cases where this would actually be an issue. |
Cool, that addresses my consistency concern. Last night I too came to the conclusion about "in the future" annotations being okay. Changing topics a bit. What events types will be used to fill the process tree and make annotations? I assume only notify events will be used. Is that correct? Auth events can be denied by other ES clients even if Santa allows them. Is it accurate to say that the notify events will populate the tree and annotations, while auth events will be a reader of the tree and annotations? |
Yep, that's one of the two main features this enables. And in the case of auth events that clients are already subscribed to process lifecycle (e.g. the main binary authorizer), the tree can be informed off of those auth events directly to avoid the need for redundant notify subscriptions. The other main feature comes from the Recorder also being a client of the tree to allow for annotation export in produced (protobuf) events - this is the primary reason for the |
Some high level convention comments: File names: https://google.github.io/styleguide/objcguide.html#file-names
E.g.: Also, namespaces: https://google.github.io/styleguide/cppguide.html#Namespace_Names
E.g. Also, struct names should be capitalized: https://google.github.io/styleguide/cppguide.html#Type_Names |
Noted, will change.
This I kept at a top level as this is going to be used outside of just Santa, and is meant to be thought of more of a standalone library that Santa happens to use, similar to
Huh, I thought I remembered the opposite where structs were lower/snakecase and classes were upper. Will change. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a good start.
If this is a library Santa makes use of it should be a separate repo that we include as a dependency. Otherwise Matt's suggestion holds. |
Updated. May revisit/rework as this gets ported to other internal agents, but don't want this blocking for now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm good to go with this first PR. Will need to see how certain things turn out as they're adopted, but good to press on from here!
This is the first PR in a chain adding process trees to Santa, allowing for client-side annotating of processes, "context-aware" rules (e.g. #293), and more.
Most of the tree is relatively straight forward, with two notable exceptions:
"Deduplication" of incoming events
Santa uses multiple, independent EndpointSecurity clients to receive different events for different "subsystems". While we could maintain a ProcessTree per ES client, this is not efficient. For a single tree to work however, it must only be updated for a given event on the system once, otherwise previously processed events could be "undone" by events reprocessed later (e.g. a fork could be processed by client A then by client B which would causing the forked process created by A to be overridden in the tree). To get around this, a rolling buffer of event timestamps is kept, and incoming event timestamps are compared against this set (see
ProcessTree::Step
). Events coming from ES occasionally have timestamps which go backwards, so we have to keep a list and not just the most recent timestamp we've seen.Process cleanup
Removal from the tree's "active set" (map of pid to Process object) is also complicated by Santa's asynchronous processing of events from multiple, independent EndpointSecurity clients. When an event is processed which may remove a Process from the map (e.g. an exit), the removal is actually delayed until the event's timestamp falls off of the list described above, which means every client of the tree is likely to have processed the event by then. Since accesses to the tree about processes referenced in the event may happen asynchronously, even after processing of the next event has started, the tree also implements a basic refcount system managed by a ProcessToken object. This object can be embedded in the Message as it's processed by Santa, and as long as the ProcessToken is alive, the processes referenced by the token will remain in the tree so that, at any time, the client can call
tree->Get(pid)
and retrieve the process.