ProcessTree: add core process tree logic (1/4) #1236

kallsyms · 2023-11-16T19:05:34Z

This is the first PR in a chain adding process trees to Santa, allowing for client-side annotating of processes, "context-aware" rules (e.g. #293), and more.

Most of the tree is relatively straightforward, with two notable exceptions:

"Deduplication" of incoming events

Santa uses multiple, independent EndpointSecurity clients to receive different events for different "subsystems". While we could maintain a ProcessTree per ES client, this is not efficient. For a single tree to work, however, it must be updated only once for any given event on the system; otherwise previously processed events could be "undone" by events reprocessed later (e.g. a fork could be processed by client A and then by client B, which would cause the forked process created by A to be overridden in the tree). To get around this, a rolling buffer of event timestamps is kept, and incoming event timestamps are compared against this set (see ProcessTree::Step). Events coming from ES occasionally have timestamps which go backwards, so we have to keep a list and not just the most recent timestamp we've seen.
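As a rough illustration of the deduplication idea described above, here is a minimal C++ sketch. The class name `TimestampDeduper`, the capacity of 4, and the plain `uint64_t` timestamp are assumptions made for illustration; this is not the code in this PR, only a model of what ProcessTree::Step is described as doing.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical sketch of a rolling buffer of recently seen event timestamps.
// Not Santa's actual implementation.
class TimestampDeduper {
 public:
  // Returns true if this timestamp has not been seen recently, meaning the
  // tree should be mutated for the event. Returns false for a duplicate
  // delivered by another client.
  bool Step(uint64_t timestamp) {
    auto end = seen_.begin() + size_;
    if (std::find(seen_.begin(), end, timestamp) != end) {
      return false;  // Some other client already applied this event.
    }
    // Record the timestamp, overwriting the oldest entry once full.
    seen_[next_] = timestamp;
    next_ = (next_ + 1) % seen_.size();
    size_ = std::min(size_ + 1, seen_.size());
    return true;
  }

 private:
  static constexpr std::size_t kCapacity = 4;  // Assumed size, for illustration.
  std::array<uint64_t, kCapacity> seen_{};
  std::size_t size_ = 0;  // Number of valid entries in seen_.
  std::size_t next_ = 0;  // Index of the slot to overwrite next.
};
```

Because the check is a membership test rather than a comparison against the newest timestamp, an event whose timestamp goes backwards is still accepted the first time it is seen. Once a timestamp is evicted from the buffer it is forgotten, so the buffer has to be large enough that every client has processed an event before its timestamp falls off.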

Process cleanup

Removal from the tree's "active set" (map of pid to Process object) is also complicated by Santa's asynchronous processing of events from multiple, independent EndpointSecurity clients. When an event is processed which may remove a Process from the map (e.g. an exit), the removal is actually delayed until the event's timestamp falls off of the list described above, which means every client of the tree is likely to have processed the event by then. Since accesses to the tree about processes referenced in the event may happen asynchronously, even after processing of the next event has started, the tree also implements a basic refcount system managed by a ProcessToken object. This object can be embedded in the Message as it's processed by Santa, and as long as the ProcessToken is alive, the processes referenced by the token will remain in the tree so that, at any time, the client can call tree->Get(pid) and retrieve the process.
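The refcount scheme described above might look roughly like the following C++ sketch. Everything here is a simplified assumption: the `Tree` interface (`Add`, `Retain`, `Release`, `Tombstone`), the `int` pid key, and the absence of locking are illustrative only, and `Tombstone` stands in for the point at which an exit event's timestamp falls off the rolling buffer.

```cpp
#include <map>
#include <memory>
#include <utility>
#include <vector>

struct Process {
  int pid;
};

// Hypothetical, single-threaded model of the tree's "active set".
class Tree {
 public:
  void Add(int pid) {
    Entry e;
    e.proc = std::make_shared<const Process>(Process{pid});
    entries_[pid] = std::move(e);
  }

  std::shared_ptr<const Process> Get(int pid) const {
    auto it = entries_.find(pid);
    return it == entries_.end() ? nullptr : it->second.proc;
  }

  void Retain(const std::vector<int>& pids) {
    for (int pid : pids) {
      auto it = entries_.find(pid);
      if (it != entries_.end()) ++it->second.refcnt;
    }
  }

  void Release(const std::vector<int>& pids) {
    for (int pid : pids) {
      auto it = entries_.find(pid);
      if (it == entries_.end()) continue;
      if (it->second.refcnt > 0) --it->second.refcnt;
      // A tombstoned process is removed once its last reference is dropped.
      if (it->second.refcnt == 0 && it->second.tombstoned) entries_.erase(it);
    }
  }

  // Marks a process as exited; it is erased now only if nothing holds it.
  void Tombstone(int pid) {
    auto it = entries_.find(pid);
    if (it == entries_.end()) return;
    it->second.tombstoned = true;
    if (it->second.refcnt == 0) entries_.erase(it);
  }

 private:
  struct Entry {
    std::shared_ptr<const Process> proc;
    int refcnt = 0;
    bool tombstoned = false;
  };
  std::map<int, Entry> entries_;
};

// RAII handle: the referenced pids stay in the tree while the token lives.
class ProcessToken {
 public:
  ProcessToken(Tree& tree, std::vector<int> pids)
      : tree_(&tree), pids_(std::move(pids)) {
    tree_->Retain(pids_);
  }
  ProcessToken(const ProcessToken&) = delete;
  ProcessToken& operator=(const ProcessToken&) = delete;
  ~ProcessToken() { tree_->Release(pids_); }

 private:
  Tree* tree_;
  std::vector<int> pids_;
};
```

In this model, a token created when an event arrives keeps `Get(pid)` working even if the process is tombstoned while the event is still being handled asynchronously; the entry disappears only after the token is destroyed.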

tburgin · 2023-11-16T21:06:52Z

I have not fully digested the event timestamp rolling buffer, but I just wanted to point out there would still be consistency issues even if we were using a single EndpointSecurity client. Events come from ES serially, however Santa quickly dispatches most events to an async queue. While you could keep the process tree up to date more easily, making decisions based on that tree in those async events wouldn't be consistent. Forcing Santa to make decisions using the process tree before it dispatches events to an async queue isn't practical either.

I assume the event timestamp rolling buffer helps with this type of consistency?

kallsyms · 2023-11-17T17:58:32Z

> Events come from ES serially, however Santa quickly dispatches most events to an async queue.

Correct, but the tree mutations will happen while still in the serial callback from ES, so events affecting the tree are processed in order. Each process, once added to the tree, is ~immutable, so as long as the tree is updated in a logically consistent order, a downstream user of the tree can see the results of the mutations (e.g. annotations propagated to a newly forked pid) as long as that pid is kept alive in the tree, which is where the ProcessToken comes in.

The place where this could get messy is if client A manually sets an annotation and then expects client B to be able to see that. Since there's no serialization between clients, whether B sees the annotation is nondeterministic. This is an intended trade-off though, as annotation logic is meant to be written into annotator classes which are all run once when the event is first processed by the tree. This means that a client could possibly see an annotation set "in the future" (if a sequence of events causes an annotation to be set, and this client is processing those events after another client already has) but I've yet to think of any cases where this would actually be an issue.

tburgin · 2023-11-17T18:38:01Z

Cool, that addresses my consistency concern. Last night I too came to the conclusion about "in the future" annotations being okay.

Changing topics a bit. What event types will be used to fill the process tree and make annotations? I assume only notify events will be used. Is that correct? Auth events can be denied by other ES clients even if Santa allows them.

Is it accurate to say that the notify events will populate the tree and annotations, while auth events will be a reader of the tree and annotations?

kallsyms · 2023-11-17T18:54:39Z

> Is it accurate to say that the notify events will populate the tree and annotations, while auth events will be a reader of the tree and annotations?

Yep, that's one of the two main features this enables. And in cases where a client is already subscribed to auth events for process lifecycle (e.g. the main binary authorizer), the tree can be informed from those auth events directly to avoid the need for redundant notify subscriptions.

The other main feature comes from the Recorder also being a client of the tree to allow for annotation export in produced (protobuf) events - this is the primary reason for the Proto() method on the Annotation abstract class. One of the initial use cases is an annotation like "descendant of a shell" which can be highly informative for anyone analyzing the logs. Of course this is possible to do today by reconstructing process ancestry, but embedding simple tags like this into the output event stream makes it significantly easier/cheaper to use.
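To make the "descendant of a shell" example concrete, here is a small C++ sketch of how such a tag could propagate as fork and exec events are applied. The names (`MiniTree`, `HandleFork`, `HandleExec`, `kDescendantOfShell`) and the hard-coded shell paths are assumptions for illustration and do not reflect the Annotation or annotator interfaces in this PR.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical tag name, for illustration only.
const char kDescendantOfShell[] = "descendant_of_shell";

struct Proc {
  std::string executable;
  std::set<std::string> tags;
};

// Assumed list of shell paths; a real annotator would decide this differently.
inline bool IsShell(const std::string& path) {
  static const std::set<std::string> kShells = {"/bin/sh", "/bin/bash",
                                                "/bin/zsh"};
  return kShells.count(path) > 0;
}

class MiniTree {
 public:
  // The child inherits the parent's executable and tags. If the parent is
  // itself a shell, the child additionally gains the descendant tag.
  void HandleFork(int parent_pid, int child_pid) {
    Proc child;
    auto it = procs_.find(parent_pid);
    if (it != procs_.end()) {
      child = it->second;
      if (IsShell(it->second.executable)) {
        child.tags.insert(kDescendantOfShell);
      }
    }
    procs_[child_pid] = child;
  }

  // Exec changes the program image but not ancestry, so tags are preserved.
  void HandleExec(int pid, const std::string& executable) {
    procs_[pid].executable = executable;
  }

  bool HasTag(int pid, const std::string& tag) const {
    auto it = procs_.find(pid);
    return it != procs_.end() && it->second.tags.count(tag) > 0;
  }

 private:
  std::map<int, Proc> procs_;
};
```

With the tag computed once as events are applied, a log consumer only needs to check for the tag on an exported event instead of rebuilding the ancestry chain.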

mlw · 2023-11-17T21:04:15Z

Some high level convention comments:

File names: https://google.github.io/styleguide/objcguide.html#file-names

File names should reflect the name of the class implementation that they contain—including case.

E.g.: base.h --> Annotator.h, but these all should be fixed up to match the project.

Also, namespaces: https://google.github.io/styleguide/cppguide.html#Namespace_Names

Namespace names are all lower-case, with words separated by underscores. Top-level namespace names are based on the project name.

E.g. namespace process_tree --> namespace santa::santad::process_tree::<class>

Also, struct names should be capitalized: https://google.github.io/styleguide/cppguide.html#Type_Names

kallsyms · 2023-11-17T22:05:11Z

> File names: https://google.github.io/styleguide/objcguide.html#file-names
>
> E.g.: base.h --> Annotator.h, but these all should be fixed up to match the project.

Noted, will change.

> Also, namespaces: https://google.github.io/styleguide/cppguide.html#Namespace_Names
>
> E.g. namespace process_tree --> namespace santa::santad::process_tree::<class>

This I kept at the top level as this is going to be used outside of just Santa, and is meant to be thought of more as a standalone library that Santa happens to use, similar to fsspool.

> Also, struct names should be capitalized: https://google.github.io/styleguide/cppguide.html#Type_Names

Huh, I thought I remembered the opposite where structs were lower/snakecase and classes were upper. Will change.

pmarkowsky

This looks like a good start.

Source/santad/ProcessTree/process.h

Source/santad/ProcessTree/tree.cc

Source/santad/ProcessTree/tree.h

russellhancox · 2023-11-17T23:35:56Z

> Also, namespaces: https://google.github.io/styleguide/cppguide.html#Namespace_Names
> E.g. namespace process_tree --> namespace santa::santad::process_tree::<class>
>
> This I kept at a top level as this is going to be used outside of just Santa, and is meant to be thought of more of a standalone library that Santa happens to use, similar to fsspool.

If this is a library Santa makes use of it should be a separate repo that we include as a dependency. Otherwise Matt's suggestion holds.

kallsyms · 2023-11-20T16:14:37Z

> Otherwise Matt's suggestion holds.

Updated. May revisit/rework as this gets ported to other internal agents, but don't want this blocking for now.

Source/santad/ProcessTree/process_tree.cc

Source/santad/ProcessTree/process_tree.h

mlw

I'm good to go with this first PR. Will need to see how certain things turn out as they're adopted, but good to press on from here!

kallsyms added 3 commits November 16, 2023 13:33

ProcessTree: add core process tree logic

183dd16

make Step implicitly called by Handle* methods

deb3431

lint

ecdb358

kallsyms requested a review from a team as a code owner November 16, 2023 19:05

pmarkowsky added the process annotations label Nov 16, 2023

pmarkowsky changed the title ~~ProcessTree: add core process tree logic~~ ProcessTree: add core process tree logic (1/4) Nov 16, 2023

kallsyms mentioned this pull request Nov 16, 2023

ProcessTree: add macOS specific loader and ES adapter (2/4) #1237

Merged

pmarkowsky reviewed Nov 17, 2023

kallsyms added 7 commits November 17, 2023 17:19

naming convention

8144ee3

widen pidversion to be generic

47e8e3a

move os specific backfill to os specific impl

0ec4143

simplify ts checking

9d9ba5c

retain/release a whole vec of pids

bd8acb8

document processtoken

a795fb0

lint

bc9568f

namespace

934ee11

kallsyms added 6 commits November 20, 2023 12:03

add process tree to project-wide unit test target

4bb08fa

case change annotations

505bd5b

case change annotations

62bd517

remove stray comment

b078b63

default initialize seen_timestamps

8b00187

fix missing initialization of refcnt and tombstoned

7cb9b0b

mlw requested changes Dec 8, 2023

kallsyms added 4 commits December 19, 2023 11:53

reshuffle pb namespace

0855dbe

pr review

019cf51

move annotation registration to tree construction

36ab25c

use factory function for tree construction

9dce653

mlw approved these changes Feb 2, 2024

Merge branch 'main' into pt-1

a063281

kallsyms merged commit e8db89c into google:main Feb 5, 2024
9 checks passed

kallsyms deleted the pt-1 branch February 5, 2024 19:30

mlw added this to the 2024.2 milestone Feb 8, 2024
