
New Event Structure #2870

Open
itaysk opened this issue Mar 14, 2023 · 35 comments

Comments


itaysk commented Mar 14, 2023

#2355 changed the primary user experience of Tracee to be event oriented (previously events were considered internal and hidden from the user). Therefore:

  1. The event schema needs to be formalized and stabilized, since it is no longer internal.
  2. The event structure needs to be generalized, since events will now be used for detections, captures, and more.

Following is the updated event schema based on the comments below:

  1. timestamp
  2. name
  3. id - machine-readable id (integer). Note: the current event id is unsuitable since it is architecture-specific
  4. // version - use semver, where major is a breaking change to the event (e.g. one of the event's fields under data was changed or removed), minor is a non-breaking change (e.g. a new field was added to the event under data), and patch is a bug fix. Since this data is static, we may remove this or make it optional
  5. // tags - since this data is static, we may remove this or make optional
  6. labels - doesn't exist. For future use.
  7. policies
    1. matched
    2. actions - doesn't exist, for future use - list of actions taken (currently the only action we have is print).
  8. context
    1. process
      1. executable
        1. path
        2. name - the binary name (basename of the path) - doesn't exist, consider adding (in another issue)
      2. uniqueId - unique id of the process
      3. pid
      4. hostPid
      5. executionTime - time of last exec. Doesn't exist, consider adding (in another issue)
      6. realUser
        1. id
        2. name - doesn't exist, consider adding (in another issue)
      7. user - effective user. Doesn't exist, consider adding (in another issue)
        1. id
        2. name
      8. ancestors - process ancestors array. Only the direct parent will be populated by default, with the following fields:
        1. uniqueId
        2. pid
        3. hostPid
        4. Other ancestor fields may be populated by threat detection events
      9. thread
        1. startTime
        2. name (aka "comm")
        3. tid
        4. hostTid
        5. capabilities - doesn't exist, consider adding (in another issue)
        6. syscall - the syscall that triggered this event
        7. compat - boolean. moved from flags.compat
        8. userStackTrace - if enabled, will be here
    2. container
      1. id
      2. name
      3. image
        1. id
        2. repoDigest
        3. name
      4. isRunning - boolean. moved from flags
      5. startTime - timestamp of container start time. Doesn't exist; will replace started
      6. pid - entrypoint's pid. Doesn't exist, consider adding
    3. k8s
      1. pod
        1. name
        2. uid
        3. labels
      2. namespace
        1. name
  9. data
    1. Any relevant field (per-event schema)
    2. returnValue (if relevant will appear here)
    3. triggeredBy (will appear on threat detection events)
      1. name
      2. id
      3. data
  10. threat (if relevant will appear here) - static data about threats (can be omitted)
    1. description
    2. mitre
      1. tactic
        1. name
      2. technique
        1. name
        2. id
    3. severity

We also discussed versioning the event schema, but not including the version with each event, for efficiency.


itaysk commented Mar 14, 2023

While this represents the complete event schema, I also think we should try to emit minimal events. For example, I suppose most people won't need the full description with every event, or the full stack trace. We already allow conditionally including some elements (stack trace, syscall, exec-env), and I think we need to expand this control to all event fields. I'll create another issue to discuss this.


NDStrahilevitz commented Mar 16, 2023

@itaysk Do you think we should maybe use the protocol.Event feature for this (see types/protocol)? We could add a ContentType header to differentiate events. Or do we want to stick with trace.Event with split up context fields?

I've had this thought as well in relation to the recent Metadata field we added for the finding events: it could have fit into the protocol.EventHeader, or Findings could have had a different payload body while still being considered events through the protocol.


itaysk commented Mar 20, 2023

Actually, I'm not very familiar with how protocol is used now. I thought it was supposed to become obsolete in the unified binary approach. Can you elaborate on your proposal, or perhaps add an example?

@NDStrahilevitz

The protocol is more a way to differentiate different payloads going into the rules engine.
For example, in CNDR it is used to define the ability to receive events sent directly from CNDR into the signature (for example, to set data storage in the signature).
The same way the protocol can define a different payload for signature inputs, it can be used to define different event payloads with different shapes (for example, before we changed detections to be trace.Event, we could have defined them to be a different payload in the protocol.Event instead).
I am not sure if we want to currently support different payloads for events going into the engine (since the engine is now supposed to be internal).

@NDStrahilevitz

May I suggest we move the event definition to a proto file while we're at it?
Should make it easier to integrate tracee into gRPC systems later.


itaysk commented Mar 21, 2023

The protocol is more a way to differentiate different payloads going into the rules engine.
For example, in CNDR it is used to define the ability to receive events sent directly from CNDR into the signature (for example, to set data storage in the signature).
The same way the protocol can define a different payload for signature inputs, it can be used to define different event payloads with different shapes (for example, before we changed detections to be trace.Event, we could have defined them to be a different payload in the protocol.Event instead).
I am not sure if we want to currently support different payloads for events going into the engine (since the engine is now supposed to be internal).

I'm still not sure there's a use case for this, or perhaps I didn't fully understand it, but regardless it seems we reached the same conclusion: the "rule engine" is gone now.

+1 for proto


rafaeldtinoco commented Mar 22, 2023

Has the event type serialization, using protobufs, ever been tested? I'm particularly concerned about:

Protocol buffers tend to assume that entire messages can be loaded into memory at once and are not larger than an object graph. For data that exceeds a few megabytes, consider a different solution; when working with larger data, you may effectively end up with several copies of the data due to serialized copies, which can cause surprising spikes in memory usage.

I mean, we will not serialize eBPF events, for sure, but if we're considering converting the Tracee type to protobuf at a high rate, coming from an external source, then we should do some measurements before making the decision, IMO.


NDStrahilevitz commented Mar 22, 2023

Has the event type serialization, using protobufs, ever been tested? I'm particularly concerned about:

Protocol buffers tend to assume that entire messages can be loaded into memory at once and are not larger than an object graph. For data that exceeds a few megabytes, consider a different solution; when working with larger data, you may effectively end up with several copies of the data due to serialized copies, which can cause surprising spikes in memory usage.

I've actually written a PR (#2070) once where I added ebpf -> rules serialization with protobuf. From my measurements in that PR, protobuf serialization was quicker than both json and gob ONLY if the struct was already in protobuf form - the conversion from trace.Event to proto.Event was the overhead in that printer format.
When taking conversion time into consideration, it was slower than both.

@rafaeldtinoco

Good to know that.

@yanivagman

I updated the struct to have process-related context grouped together (pids, tids, comm, start time, namespaces, cgroup, uid).
Also moved processorId to be part of the context.

@rafaeldtinoco rafaeldtinoco changed the title New Event Strucutre New Event Structure Apr 10, 2023
@josedonizetti josedonizetti modified the milestones: v0.14.0, v0.15.0 Apr 24, 2023

itaysk commented Apr 27, 2023

Adding this here for future reference and consideration: https://www.elastic.co/blog/ecs-elastic-common-schema-otel-opentelemetry-faq

@yanivagman yanivagman modified the milestones: v0.15.0, v0.16.0 May 18, 2023
NDStrahilevitz added a commit to NDStrahilevitz/tracee that referenced this issue May 24, 2023
The field changes were causing compatibility issues for integrated
products.
We will re-introduce these fields as part of the new Context field in
the new event structure
(see aquasecurity#2870)
NDStrahilevitz added a commit to NDStrahilevitz/tracee that referenced this issue May 24, 2023
The field changes were causing compatibility issues for integrated
products.
We will re-introduce these fields as part of the new Context field in
the new event structure
(see aquasecurity#2870)

itaysk commented May 29, 2023

After talking with @yanivagman we thought about some slight changes:

  1. rename args to fields
  2. rename kubernetes to pod (and remove the pod prefix from internal fields)
  3. remove parent information from process, replaced by the ancestors field below
  4. new ancestors field, which is an array of process entries (not all fields will be populated, so they need to be omitempty)
  5. move flags.containerStarted to container.started
  6. move flags.isCompat to process.compat

TBD:

  1. severity - location and name
  2. version - maybe move to root

I've updated the description to reflect this

@yanivagman

Let's also take the opportunity to expose only the fields mentioned here to the user. Any other (internal) fields should not be part of trace.Event.
We can do this by embedding the trace.Event struct into an internal struct used only by the pipeline, where we can also add some extra fields (e.g. matchedActions, matchedPolicies bitmap, cgroup id, etc.)


yanivagman commented Jun 8, 2023

More detailed struct with some comments added:

  1. timestamp
  2. id - better if this were an integer, but we also need to keep backwards compatibility with e.g. TRC-XXX ids. Maybe call this differently so we can add an integer ID in the future?
  3. name
  4. metadata
    1. description
    2. severity (or priority for non signature events?)
    3. tags
    4. version (this version describes the event fields version, and not the schema of the event structure)
    5. misc - map[string]interface{}
  5. context
    1. process - all process related context
      1. executionTime - doesn't exist, consider adding
      2. name
      3. id
      4. namespaceId
      5. userId
      6. thread
        1. startTime
        2. id
        3. namespaceId
        4. mountNamespaceId - consider removing
        5. pidNamespaceId - consider removing
        6. utsName - consider removing
        7. syscall - moved here
        8. stackTrace - if enabled, will be here
        9. compat - moved from flags.compat
    2. ancestors - array of all process ancestors, in a structure similar to process. Element 0 is parent.
    3. container - all container related context
      1. id
      2. name
      3. image
      4. imageDigest
      5. started - moved from flags - consider setting this as a timestamp of container start time and not boolean
    4. pod - all pod related context (from kubernetes)
      1. name
      2. namespace
      3. uid
      4. sandbox
    5. processorId - moved here
  6. fields - renamed from args (or should we call it data?)
    1. every event attribute
    2. returnValue (if relevant will appear here)
  7. matchedPolicies


itaysk commented Jun 10, 2023

id - better if this was an integer

why?

executionTime - doesn't exist, consider adding

agree, but let's discuss in a separate issue as it's about adding an entirely new feature? (and won't break the event structure)

thread

what's the motivation for adding this layer?

mountNamespaceId - consider removing
pidNamespaceId - consider removing
utsName - consider removing

+1

ancestors

since we're discussing the "event structure", having a field at the root implies it's an "event" field. In this case I'm reading this as "event ancestors", but the meaning is "process ancestors". Should we relocate it under process, or prefix it with process, to clarify?

started - moved from flags - consider setting this as a timestamp of container start time and not boolean

agree, but let's discuss in a separate issue as it's about adding an entirely new feature?

pod - all pod related context (from kubernetes)

if this is a kubernetes thing, should we add kubernetes to the name? If it is not, are we sure the structure will work for other pod implementations?

sandbox

What does this mean? From reading the code, I gather it looks at the container label io.kubernetes.docker.type==sandbox, but I don't understand what/why/how it's related to the kubernetes pod. @NDStrahilevitz can you please explain?

matchedPolicies

This is the only field in the event that is Tracee-specific and not meaningful by itself. For example, if I'm streaming all events to Splunk and someone else at another time sees this event there, all the other fields would make sense since they describe what happened, but to understand this field they would need to understand Tracee and how it was configured and started.
I won't turn this into a debate about the usefulness of this field, but at the least I'd suggest prefixing it with "tracee". Or even better, if we intend to add more tracee-specific info in the future (#3153), then it's better to put it under a "tracee" level.

@josedonizetti josedonizetti modified the milestones: v0.18.0, v0.19.0 Sep 26, 2023
@josedonizetti

Moved it to the next release; the event structure is finished and merged, and what remains is integrating it into tracee's internals.


mcherny commented Nov 22, 2023

A few comments:

  1. Threat is static information, and hence should be available outside of the event (including in the grpc interface)
  2. The event header should also include the "text ID" of the event (ART-X, TRC-X, etc.)
  3. To be able to use numeric IDs, consumers need to be able to use the id definitions without a dependency on the tracee ecosystem
    3.1 The id definitions must be managed in a backward-compatible manner (i.e. only additions, no edits, no deletions)


mcherny commented Nov 22, 2023

Data we need that is not currently in the context:

  • under the process section, the full process command line is missing
  • under the k8s pod section, deployment and type are missing


NDStrahilevitz commented Nov 22, 2023

  • under process section missing full process commandline

This is available in the sched_process_exec event and (I think) in the process tree and its data source.
This isn't to claim we shouldn't put it in the event context, rather that it is already available through alternative means.

  • under k8s pod section missing deployment and type

@itaysk we should be able to add these through container labels like other kubernetes data we already get in container enrichment. I can open an issue for this if agreed.

@rafaeldtinoco

This is available in the sched_process_exec event and (I think) in the process tree and its data source.

When process tree was being created there was a specific discussion about COMM versus BINARY PATH and where the info should go (https://github.com/aquasecurity/tracee/pull/3364/files#diff-773e2917cb050cc42ce31d36b08db6c9e3da89ab6dff75f8a9b3eba5171316d3R12).

Alon and I agreed that COMM would be part of the TaskInfo structure (for the Process or Thread), and the Binary Path would be part of the File Info structure (for the Binary and the Interpreter of the Binary).

The COMM (process name for the process tree) does not pick up the args (it's basically the procfs comm field); those would come together with the argv array (which is an argument of the exec event).

We can introduce COMM + ARGS in the process tree if needed (or something else), no problem.

@josedonizetti

Hey @mcherny, thank you for the questions:

Few comments:

  1. Threat is static information, and hence should be available outside of event (including grpc interface)

Do you have an example of how this would be used? Because the Threat information is part of specific events (behaviour events), I'm considering that it should be part of the event definition as optional, and not have its own API. Thoughts?

  1. The event header should also include the "text ID" of event (ART-X, TRC-X etc.)

We didn't add those because they are internal to Aqua; our plan was actually to remove them altogether for the open source signatures, and let the internal project handle the translation between those ids and tracee ids.

  1. To be able to use numeric IDs, consumers need to be able to use the id definitions without a dependency on the tracee ecosystem
    3.1 The id definitions must be managed in a backward-compatible manner (i.e. only additions, no edits, no deletions)

Agree! Right now the ids are stable for base events but dynamic for signatures; I need to look into how we can make them always stable.
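One possible shape for stable ids, sketched below (all event names and numbers are hypothetical): an append-only table in which base events and signatures share a single integer space, which a consumer could vendor without depending on the tracee ecosystem:

```go
package main

import "fmt"

// EventID is a stable, machine-readable event identifier.
// The table below is append-only: new events get new ids at the end,
// and existing entries are never edited or deleted, so consumers can
// pin a copy of this table without tracking every tracee release.
type EventID int32

const (
	// hypothetical base events
	SchedProcessExec EventID = 1
	SecurityFileOpen EventID = 2
	// hypothetical signature events continue the same id space
	AntiDebugging EventID = 6001
)

var names = map[EventID]string{
	SchedProcessExec: "sched_process_exec",
	SecurityFileOpen: "security_file_open",
	AntiDebugging:    "anti_debugging",
}

// Name resolves an id; unknown ids (e.g. from a newer tracee) are
// reported rather than dropped, keeping old consumers forward-safe.
func Name(id EventID) string {
	if n, ok := names[id]; ok {
		return n
	}
	return fmt.Sprintf("unknown_event_%d", id)
}

func main() {
	fmt.Println(Name(SchedProcessExec)) // sched_process_exec
	fmt.Println(Name(9999))             // unknown_event_9999
}
```

The same append-only rule would apply to a generated proto enum if the definition moves to a proto file, where field and enum numbers are likewise never reused.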


mcherny commented Nov 23, 2023

  • under process section missing full process commandline

This is available in the sched_process_exec event and (I think) in the process tree and its data source. This isn't to claim we shouldn't put it in the event context, rather that it is already available through alternative means.

Assuming we need this on each event where it may be relevant (e.g. file open), how would I consume it by alternative means if I need it outside of the tracee process, that is, at a gRPC client?


NDStrahilevitz commented Nov 23, 2023

Assuming we need this on each event where it may be relevant (e.g. file open), how would I consume it by alternative means if I need it outside of the tracee process, that is, at a gRPC client?

In general, if someone needs info on each event of its kind for a particular use case, a signature/derived event is probably called for.


yanivagman commented Nov 24, 2023

Few comments:

  1. Threat is static information, and hence should be available outside of event (including grpc interface)

That's why we wrote:

threat (if relevant will appear here) - static data about threats (can be omitted)

There is an advantage to including this information in the OSS project when a threat is detected (not so frequent), so the user can get information about the threat without consulting the documentation

  1. The event header should also include the "text ID" of event (ART-X, TRC-X etc.)

Following on from the first point, this is also static data, and can be added to the threat section by an internal mapping if required by some project

  1. To be able to use numeric IDs, consumers need to be able to use the id definitions without a dependency on the tracee ecosystem
    3.1 The id definitions must be managed in a backward-compatible manner (i.e. only additions, no edits, no deletions)

Agree. We have an old issue opened for that #1098

@yanivagman

  • under process section missing full process commandline

This is available in the sched_process_exec event and (I think) in the process tree and its data source. This isn't to claim we shouldn't put it in the event context, rather that it is already available through alternative means.

Assuming we need this on each event where it may be relevant (e.g. file open), how would I consume it by alternative means if I need it outside of the tracee process, that is, at a gRPC client?

Sounds like a reasonable addition to tracee, and it should be optional. We should discuss such additions in a separate issue and keep this one for an event structure that supports tracee's already-existing features


NDStrahilevitz commented May 2, 2024

I've been doing some work in the capture area. Considering we want to merge it into the "everything is an event" scheme at some point, shouldn't we also include an artifact field in the structure (I recall @yanivagman making this sort of suggestion some time ago)? I assume that otherwise, each capture will be its own event.

@yanivagman

I've been doing some work in the capture area. Considering we want to merge it into the "everything is an event" scheme at some point, shouldn't we also include an artifact field in the structure (I recall @yanivagman making this sort of suggestion some time ago)? I assume that otherwise, each capture will be its own event.

Eventually, capture will be an action taken by some event. In that case, the data about it will be part of policies->actions.


7 participants