Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Policies pointer #3938

Closed
wants to merge 5 commits into from
Closed

Policies pointer #3938

wants to merge 5 commits into from

Conversation

geyslan
Copy link
Member

@geyslan geyslan commented Mar 27, 2024

Depends on #3939 and #3940

1. Explain what the PR does

6465221 chore: go.mod (types) bump
609df3b perf: pass policies pointer in the event

609df3b perf: pass policies pointer in the event

When policies version was implemented, the initial idea was to pass
the policies pointer in the event, so that the correct policies version
could be reused at each stage of the pipeline without having to
retrieve it from the snapshots every time. However, this was not
possible at the time since the gob library does not support
unsafe.Pointer types.

After #3939, Tracee no longer uses gob, so the policies pointer can be
passed in the event as originally intended.

2. Explain how to test it

3. Other comments

Kubernetes Kubernetes `json:"kubernetes,omitempty"`
EventID int `json:"eventId,string"`
EventName string `json:"eventName"`
Policies unsafe.Pointer `json:"-"`
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This commit is gonna be in other PR.

@geyslan geyslan marked this pull request as draft March 27, 2024 16:49
@geyslan
Copy link
Member Author

geyslan commented Mar 27, 2024

This optimization is still blocked since #3841 only removed gob from tracee. tracee-rules still uses it.


EDIT: this #3939 unblocks it.

geyslan added 4 commits March 27, 2024 20:54
This finishes the removal of gob support after aquasecurity#3841.
Instead of holding the version of the policies, hold a pointer to the
policies object.
When policies version was implemented, the initial idea was to pass
the policies pointer in the event, so that the correct policies version
could be reused at each stage of the pipeline without having to
retrieve it from the snapshots every time. However, this was not
possible at the time since the gob library does not support
unsafe.Pointer types.

After aquasecurity#3939, Tracee no longer uses gob, so the policies pointer can be
passed in the event as originally intended.
@geyslan geyslan marked this pull request as ready for review March 28, 2024 00:59
Copy link
Contributor

@AlonZivony AlonZivony left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't fully understand the pros of this change.
We are removing the information about the policies version from the event, making it hard for the user to understand it.
From what I do understand, this removes the need to search in the snapshots for the version of the policies each time queried, which is nice in idea because you no longer need to have an access to the policies snapshots to know about the policies relevant for the event. But won't any code that will need to go over the policies have a pointer to the policies to begin with? Can you give an example for where you need this direct access?
I am just a bit afraid of coupling the events with the policies.

t.eventsPool.Put(evt)
continue
}
evt.Policies = unsafe.Pointer(policies)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you using the unsafe.Pointer instead of just a pointer to the policies?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

types would require importing the policy package to know about *Policies, which isn't the intention.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it is what you sort of do, no? Making the event to depend on the policies implementation.

@geyslan
Copy link
Member Author

geyslan commented Mar 28, 2024

But won't any code that will need to go over the policies have a pointer to the policies to begin with?

Only through the Snapshots Get*() since we'll be working based on a specific version.

Can you give an example for where you need this direct access?

The idea is that if you get a event from the pipeline, anywhere in the codebase, you can get the version information directly from the Policies pointer. I think that keeping the version field available can induce the user to look for the Policies pointer in the Snapshots again.

I am just a bit afraid of coupling the events with the policies.

I see you. Let's consider that we have had already coupled it having the Policies Version before. Now we just swapping it with a proper reference for the sake of performance.

@geyslan
Copy link
Member Author

geyslan commented Mar 28, 2024

From what I do understand, this removes the need to search in the snapshots for the version of the policies each time queried,

Yes. In the future, we will have pruning working in Snapshots. It may happen that in a sequential pipeline stage this version of the policy is already pruned (timed out) and inaccessible via Snapshots, therefore keeping the pointer from the beginning will give us a greater guarantee of output from that event.

@AlonZivony
Copy link
Contributor

The idea is that if you get a event from the pipeline, anywhere in the codebase, you can get the version information directly from the Policies pointer. I think that keeping the version field available can induce the user to look for the Policies pointer in the Snapshots again.

But why does using the snapshots is so bad?
The motivation here is not clear to me.
It feels to me like keeping a pointer to the process tree object of the process instead of the entity id.
If you can clarify it here on in an issue then it would be great.

Let's consider that we have had already coupled it having the Policies Version before. Now we just swapping it with a proper reference for the sake of performance.

Its not making it proper, its making it a direct dependency instead of an indirect one.
I would say that it makes sense to do it if the policy is a read-only object that only contain information and the performance is super crucial, but otherwise I think we should consider to keep an identifier of the policy instead (currently in the form of version).

@geyslan
Copy link
Member Author

geyslan commented Mar 28, 2024

@AlonZivony #3940 (comment)

I'll be changing this, changing types won't be necessary anymore.

@geyslan geyslan marked this pull request as draft March 28, 2024 15:17
@geyslan
Copy link
Member Author

geyslan commented Apr 1, 2024

I was enveloping trace.Event type through the pipeline...

image

but reached a point (sig engine) that would require to change types again... as it's not the ideal currently, I'm postponing it.

@geyslan geyslan closed this Apr 1, 2024
@geyslan geyslan deleted the policies-pointer branch June 28, 2024 18:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants