Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ebpf: pipeline: reduce iteration over policies #3209

Merged
merged 3 commits into from
Jun 14, 2023

Conversation

geyslan
Copy link
Member

@geyslan geyslan commented Jun 7, 2023

1. Explain what the PR does

5bce483 policies: replace "userspace" with "userland" (2023/jun/08) <Geyslan Gregório>

0a01581 docs: change "bitmask" mentions to "bitmap" (2023/jun/07) <Geyslan Gregório>

229ce1c ebpf: pipeline: reduce iteration over policies (2023/jun/07) <Geyslan Gregório>

This creates a policy bitmap with filters that need to be checked in the
userland. This bitmap is used to avoid iterating the entire map when
there is no policy match in the kernel that requires userland filtering.

When there is a policy that requires userland filtering, the iteration
must be done using FilterableInUserSpaceMap(), which returns a possible
reduced map of policies.

2. Explain how to test it

3. Other comments

This creates a policy bitmap with filters that need to be checked in the
userland. This bitmap is used to avoid iterating the entire map when
there is no policy match in the kernel that requires userland filtering.

When there is a policy that requires userland filtering, the iteration
must be done using FilterableInUserSpaceMap(), which returns a possible
reduced map of policies.
@rafaeldtinoco
Copy link
Contributor

rafaeldtinoco commented Jun 9, 2023

These:

This commit updates the codebase in events_pipeline.go and policies.go
to replace occurrences of "userspace" with "userland" in the context of
policies filtering.

By using the term "userland", the code becomes more accessible to
developers familiar with standard operating system concepts and promotes
better comprehension of the code's functionality related to policies
filtering in the user-level environment.

and

Bitmap is the most correct term for the data structure used to represent
a set of bits for Policies, while bitmask is a term used for bitwise
operations on bitmaps.

This is a first step towards making the codebase more consistent with
the terminology used, although it only changes the comments.

A future commit will possibly change the names of types and variables
to use the term bitmap instead of bitmask, or just add the bitmap term
to policy related code.

P.S.: it's also does a bit of comment cosmetics.

is WAYYY to descriptive for such a simple changes.

Git logs should be extensive if mandatory for the understanding of what patch is doing. Otherwise they should be one-liners or similar:

policies: reword userpace to userland

no need for more reasoning (the issue can give more in-depth if ever needed) IMO.

@rafaeldtinoco
Copy link
Contributor

Third PR git log seems right for me. I'll check it out locally once Im done with some of my coding, and give a +1 soon. Cheers!

@geyslan
Copy link
Member Author

geyslan commented Jun 12, 2023

no need for more reasoning (the issue can give more in-depth if ever needed) IMO.

We don't have an issue for this, as it's a chore and it has been mentioned in the past that chores shouldn't have open issues.

So, without more information to reason the changes, I believe that well-written comments when changing the naming in code are valuable for a few reasons. First, they act as documentation and help future contributors understand the purpose and reasoning behind the name change. They also provide context, reducing confusion and making the changes easier to understand. Well-written comments reduce cognitive load by providing additional context, making it easier for developers to work with the code in the future. Lastly, reasoned comments prevent further discussions such as "why name it this way" or "I think it's better to name it that way." Do you agree?

@rafaeldtinoco
Copy link
Contributor

Lastly, reasoned comments prevent further discussions such as "why name it this way" or "I think it's better to name it that way." Do you agree?

Nope, I think this is bike shedding (both, the git log and your answer). Its a simple change, it should be stated in a simple oneliner git log.

rafaeldtinoco

This comment was marked as off-topic.

@rafaeldtinoco rafaeldtinoco dismissed their stale review June 14, 2023 12:20

nope, wrong comment

Comment on lines +83 to +84
// TODO: make sure the stores are also atomic (an atomic load is only protecting
// the read from context switches, not from CPU cache coherency issues).
Copy link
Contributor

@rafaeldtinoco rafaeldtinoco Jun 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that simple (about atomics). If you did something like this:

diff --git a/pkg/policy/policies.go b/pkg/policy/policies.go
index 3b363a7d..0f72a4d3 100644
--- a/pkg/policy/policies.go
+++ b/pkg/policy/policies.go
@@ -13,30 +13,49 @@ type Policies struct {
        filterEnabledPoliciesMap  map[*Policy]int      // stores only enabled policies
        filterUserlandPoliciesMap map[*Policy]int      // stores a reduced map with only userland filterable policies
 
-       uidFilterMin            uint64
-       uidFilterMax            uint64
-       pidFilterMin            uint64
-       pidFilterMax            uint64
-       uidFilterableInUserland bool
-       pidFilterableInUserland bool
-
-       filterableInUserland    uint64 // bitmap of policies that must be filtered in userland
-       containerFiltersEnabled uint64 // bitmap of policies that have at least one container filter type enabled
+       uidFilterMin *atomic.Uint64
+       uidFilterMax *atomic.Uint64
+       pidFilterMin *atomic.Uint64
+       pidFilterMax *atomic.Uint64
+
+       uidFilterableInUserland *atomic.Bool
+       pidFilterableInUserland *atomic.Bool
+
+       filterableInUserland    *atomic.Uint64 // bitmap of policies that must be filtered in userland
+       containerFiltersEnabled *atomic.Uint64 // bitmap of policies that have at least one container filter type enabled
 }
 
 func NewPolicies() *Policies {
+       uidFilterMin := &atomic.Uint64{}
+       uidFilterMax := &atomic.Uint64{}
+       pidFilterMin := &atomic.Uint64{}
+       pidFilterMax := &atomic.Uint64{}
+       uidFilterableInUserland := &atomic.Bool{}
+       pidFilterableInUserland := &atomic.Bool{}
+       FilterableInUserland := &atomic.Uint64{}
+       containerFiltersEnabled := &atomic.Uint64{}
+
+       uidFilterMin.Store(filters.MinNotSetUInt)
+       uidFilterMax.Store(filters.MaxNotSetUInt)
+       pidFilterMin.Store(filters.MinNotSetUInt)
+       pidFilterMax.Store(filters.MaxNotSetUInt)
+       uidFilterableInUserland.Store(false)
+       pidFilterableInUserland.Store(false)
+       FilterableInUserland.Store(0)
+       containerFiltersEnabled.Store()
+
        return &Policies{
                policiesArray:             [MaxPolicies]*Policy{},
                filterEnabledPoliciesMap:  map[*Policy]int{},
                filterUserlandPoliciesMap: map[*Policy]int{},
-               uidFilterMin:              filters.MinNotSetUInt,
-               uidFilterMax:              filters.MaxNotSetUInt,
-               pidFilterMin:              filters.MinNotSetUInt,
-               pidFilterMax:              filters.MaxNotSetUInt,
-               uidFilterableInUserland:   false,
-               pidFilterableInUserland:   false,
-               filterableInUserland:      0,
-               containerFiltersEnabled:   0,
+               uidFilterMin:              uidFilterMin,
+               uidFilterMax:              uidFilterMax,
+               pidFilterMin:              pidFilterMin,
+               pidFilterMax:              pidFilterMax,
+               uidFilterableInUserland:   uidFilterableInUserland,
+               pidFilterableInUserland:   pidFilterableInUserland,
+               filterableInUserland:      FilterableInUserland,
+               containerFiltersEnabled:   containerFiltersEnabled,
        }
 }
 
@@ -45,45 +64,39 @@ func (ps *Policies) Count() int {
 }
 
 func (ps *Policies) UIDFilterMin() uint64 {
-       return ps.uidFilterMin
+       return ps.uidFilterMin.Load()
 }
 
 func (ps *Policies) UIDFilterMax() uint64 {
-       return ps.uidFilterMax
+       return ps.uidFilterMax.Load()
 }
 
 func (ps *Policies) PIDFilterMin() uint64 {
-       return ps.pidFilterMin
+       return ps.pidFilterMin.Load()
 }
 
 func (ps *Policies) PIDFilterMax() uint64 {
-       return ps.pidFilterMax
+       return ps.pidFilterMax.Load()
 }
 
 func (ps *Policies) UIDFilterableInUserland() bool {
-       return ps.uidFilterableInUserland
+       return ps.uidFilterableInUserland.Load()
 }
 
 func (ps *Policies) PIDFilterableInUserland() bool {
-       return ps.pidFilterableInUserland
+       return ps.pidFilterableInUserland.Load()
 }
 
-// ContainerFilterEnabled returns a bitmap of policies that have at least
-// one container filter type enabled.
-//
-// TODO: make sure the stores are also atomic (an atomic load is only protecting
-// the read from context switches, not from CPU cache coherency issues).
+// ContainerFilterEnabled returns a bitmap of policies that have at least one container filter type
+// enabled.
 func (ps *Policies) ContainerFilterEnabled() uint64 {
-       return atomic.LoadUint64(&ps.containerFiltersEnabled)
+       return ps.containerFiltersEnabled.Load()
 }
 
-// FilterableInUserland returns a bitmap of policies that must be filtered in userland
-// (ArgFilter, RetFilter, ContextFilter, UIDFilter and PIDFilter).
-//
-// TODO: make sure the stores are also atomic (an atomic load is only protecting
-// the read from context switches, not from CPU cache coherency issues).
+// FilterableInUserland returns a bitmap of policies that must be filtered in userland (ArgFilter,
+// RetFilter, ContextFilter, UIDFilter and PIDFilter).
 func (ps *Policies) FilterableInUserland() uint64 {
-       return atomic.LoadUint64(&ps.filterableInUserland)
+       return ps.filterableInUserland.Load()
 }
 
 // Compute recalculates values, updates flags, fills the reduced userland map,
@@ -98,7 +111,9 @@ func (ps *Policies) Compute() {
        ps.updateContainerFilterEnabled()
 
        userlandMap := make(map[*Policy]int)
-       ps.filterableInUserland = 0
+
+       ps.filterableInUserland.Store(0)
+
        for p := range ps.filterEnabledPoliciesMap {
                if p.ArgFilter.Enabled() ||
                        p.RetFilter.Enabled() ||

And encapsulated the Load/Stores for the uint64s, you would still have a problem requiring a mutex protecting those uint64s (and not only atomics). So atomics might not even be needed for them (if they're protected by a mutex).

That happens because you use "utils.SetBit()" in that variable, and the SetBit does:

  • load
  • shift
  • store

Which are 3 operations that can't be protected by a single underlaying HW atomic operation. So you need a mutex (with an atomic primitive under the hood) to protect the load/change/store (as you can't do a "compare and exchange" atomic HW operation for bitwise).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its better if you add a "big lock" (single mutex) and hold that mutex for all operations on your uint64s for now (it will be safer and less error prone). For bools you can use atomics with SetTrue() SetFalse() methods, for uint64s that are simply Load/Store you can also have Set/Get methods, but for uint64s with bits being shifted, you will need a lock if no underlaying atomic instruction/method/operation exists (which I believe is the case here).

@rafaeldtinoco
Copy link
Contributor

@geyslan IMO we should "protect" the Policies class now with atomics and mutexes (for non-atomic changes). If changes are interdependent (like changing 2 variables simultaneously such as uidFilterableInUserland and filterableInUserland we should have a mutex, not to incur in errors such as changing one and having the other changed back by concurrency of multiple accesses).

@geyslan
Copy link
Member Author

geyslan commented Jun 14, 2023

Yeah, we have discussed that on https://github.com/aquasecurity/tracee/pull/3183/files#diff-84bf110f6f537355b16c6e40683f306bb5bdac12e5b254b4aaba47fbdbcb5396R69-R71 effort. Let's make that concurrency control in a different issue. I'll raise it. Thanks.


#3239

@rafaeldtinoco rafaeldtinoco self-requested a review June 14, 2023 14:45
Copy link
Contributor

@rafaeldtinoco rafaeldtinoco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not from this PR, the entire interface will need sync work at #3239. Thanks for opening that @geyslan! LGTM this one

@geyslan geyslan merged commit d79107c into aquasecurity:main Jun 14, 2023
25 checks passed
@geyslan geyslan deleted the user-bitmap branch July 31, 2023 22:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants