feature(proctree): Introduce a process tree #3364

Merged
merged 31 commits into from Sep 18, 2023
31 commits
e834db7
feature(events): task unique identifier murmur hashing
rafaeldtinoco Aug 21, 2023
317bc6c
feature(controlplane): create signal events to be used by controlplane
rafaeldtinoco Aug 22, 2023
4d36a79
chore(ebpf): move functions with inline asm
rafaeldtinoco Aug 22, 2023
a8470d3
chore(parse): tell argument type on argument parsing errors
rafaeldtinoco Aug 22, 2023
c18492b
feature(controller): create processes lifecycle functions
rafaeldtinoco Aug 22, 2023
6a6696d
chore(controlplane): rename local variable to ctrl
rafaeldtinoco Aug 25, 2023
c28f848
feature(utils/proc): add stat and status file parsing
rafaeldtinoco Aug 30, 2023
66819e2
feature(changelog): introduce the changelog package
rafaeldtinoco Sep 8, 2023
cc361ba
feature(proctree): start the process tree data structure
rafaeldtinoco Aug 28, 2023
f07277e
chore(controlplane): rename containers file and eventID var
rafaeldtinoco Sep 11, 2023
9526cd0
chore(controlplane): comment how to debug proctree
rafaeldtinoco Sep 11, 2023
acf2961
feature(proctree): read single PID from procfs
rafaeldtinoco Sep 11, 2023
82a9087
chore(proctree) adjust the place for process tree initialization
rafaeldtinoco Sep 12, 2023
048e0cd
chore(proctree): do not update parent and leader when they exist
rafaeldtinoco Sep 12, 2023
a034d98
debug(proctree): display proctree cache status in debug mode
rafaeldtinoco Sep 12, 2023
4cc22b6
feature(proctree): add name to processes and threads
rafaeldtinoco Sep 12, 2023
e7a3894
feature(proctree): add changelog support to fileinfo
rafaeldtinoco Sep 12, 2023
03971cd
feature(proctree): enable async procfs updates to proctree
rafaeldtinoco Sep 12, 2023
407d580
feature(proctree): create config setting for proctree cache size
rafaeldtinoco Sep 13, 2023
4e428de
chore(proctree): turn async procfs channel writes non blocking
rafaeldtinoco Sep 13, 2023
f5eae6d
chore(proctree): move anon procfs functions to real ones
rafaeldtinoco Sep 13, 2023
ba76a0a
debug(proctree): split evictions from procs and threads
rafaeldtinoco Sep 13, 2023
2f190af
chore(changelog): change values if given timestamp exists
rafaeldtinoco Sep 13, 2023
265ffc3
chore(time): centralize timing functions
rafaeldtinoco Sep 11, 2023
c100dc4
debug(proctree): remove debug information
rafaeldtinoco Sep 13, 2023
1524014
chore(proctree): make proctree disabled by default ...
rafaeldtinoco Sep 14, 2023
562513d
chore(changelog): comparable types only so debug can work
rafaeldtinoco Sep 14, 2023
c8a63f2
fix(events): re-enable sched events to fix tests
rafaeldtinoco Sep 14, 2023
03093f4
chore(proctree): adjust cache cmdline for other features...
rafaeldtinoco Sep 18, 2023
07ca4d0
fix(proctree): proctree events only when proctree enabled
rafaeldtinoco Sep 18, 2023
5321b26
fix(proctree): interp and interpreter as diff things
rafaeldtinoco Sep 18, 2023
6 changes: 6 additions & 0 deletions cmd/tracee-ebpf/main.go
@@ -91,6 +91,12 @@ func main() {
Value: cli.NewStringSlice("none"),
Usage: "control event caching queues. run '--cache help' for more info.",
},
&cli.StringSliceFlag{
Name: "proctree",
Aliases: []string{"t"},
Value: cli.NewStringSlice("none"),
Usage: "process tree options. run '--proctree help' for more info.",
},
&cli.StringSliceFlag{
Name: "crs",
Usage: "define connected container runtimes. run '--crs help' for more info.",
11 changes: 11 additions & 0 deletions cmd/tracee/cmd/root.go
@@ -223,6 +223,17 @@ func initCmd() error {
return errfmt.WrapError(err)
}

rootCmd.Flags().StringArrayP(
"proctree",
"t",
[]string{"none"},
"[process|thread]\t\t\tControl process tree options",
)
err = viper.BindPFlag("proctree", rootCmd.Flags().Lookup("proctree"))
if err != nil {
return errfmt.WrapError(err)
}

// Server flags

rootCmd.Flags().Bool(
3 changes: 3 additions & 0 deletions examples/config/global_config.json
@@ -3,6 +3,9 @@
"cache": [
"none"
],
"proctree": [
"none"
],
"capabilities": [],
"containers": false,
"crs": [],
2 changes: 2 additions & 0 deletions examples/config/global_config.yaml
@@ -1,6 +1,8 @@
blob-perf-buffer-size: 1024
cache:
- none
proctree:
- none
capabilities: []
containers: false
crs: []
159 changes: 159 additions & 0 deletions pkg/changelog/changelog.go
Collaborator

I like the implementation, looks good and intuitive.
One question - in your implementation you can't set a default value, you get the zero value of your type if you ask about a time before the first item.
There are times that you know that something has a default value.
It is possible to envelope this struct to return the default value in that case, but it will be hard to know if the given value is a default one or a value in the log.
Maybe expose a way to save a default value.
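
One possible shape for the default-value idea, as a rough sketch only: the lookup returns a second boolean that says whether the value came from the log or is the default. The names `DefaultLog`, `NewDefaultLog`, `GetAt`, and `at` are hypothetical and are not part of this PR. This sketch also treats a change recorded exactly at the queried time as already in effect, which is a choice made here for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// entry is one recorded change (hypothetical; loosely mirrors the PR's item type).
type entry[T any] struct {
	timestamp time.Time
	value     T
}

// DefaultLog is a hypothetical changelog variant that carries a default value.
type DefaultLog[T any] struct {
	def     T
	changes []entry[T] // kept sorted by timestamp
}

// NewDefaultLog creates a log that answers with def before the first change.
func NewDefaultLog[T any](def T) *DefaultLog[T] {
	return &DefaultLog[T]{def: def}
}

// firstAfter returns the index of the first change strictly after t.
func (l *DefaultLog[T]) firstAfter(t time.Time) int {
	return sort.Search(len(l.changes), func(i int) bool {
		return l.changes[i].timestamp.After(t)
	})
}

// Set records value at time t, keeping the slice sorted.
func (l *DefaultLog[T]) Set(value T, t time.Time) {
	idx := l.firstAfter(t)
	l.changes = append(l.changes, entry[T]{})
	copy(l.changes[idx+1:], l.changes[idx:])
	l.changes[idx] = entry[T]{timestamp: t, value: value}
}

// GetAt returns the value in effect at time t. The boolean is false when t
// precedes every recorded change, in which case the default is returned.
func (l *DefaultLog[T]) GetAt(t time.Time) (T, bool) {
	idx := l.firstAfter(t)
	if idx == 0 {
		return l.def, false
	}
	return l.changes[idx-1].value, true
}

// at builds a fixed timestamp so the example is deterministic.
func at(sec int64) time.Time { return time.Unix(sec, 0) }

func main() {
	l := NewDefaultLog("unknown")
	l.Set("bash", at(100))

	v, ok := l.GetAt(at(50))
	fmt.Println(v, ok) // before the first change: the default, not from the log
	v, ok = l.GetAt(at(150))
	fmt.Println(v, ok) // after the first change: the logged value
}
```

Returning the boolean keeps the "is this a default?" question answerable without wrapping the struct, which is the difficulty raised in the comment above.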

Contributor Author

BTW, I'll add credits to you in both the changelog and the datasource (I'm clearly using things you've done in your PR). FYI, I will add a Signed-off-by in both commits (you and me).

@@ -0,0 +1,159 @@
package changelog

import (
"time"

"github.com/aquasecurity/tracee/pkg/logger"
)

type comparable interface {
~int | ~float64 | ~string
}

type item[T comparable] struct {
timestamp time.Time // timestamp of the change
value T // value of the change
}

// The changelog package provides a changelog data structure. It is a list of changes, each with a
// timestamp. The changelog can be queried for the value at a given time.
Comment on lines +18 to +19
Collaborator

@geyslan this abstraction of a changelog and package can be used for policy updates as well, wdyt?
A possible addition to this changelog is the ability to prune old changes. This can be done, for example, by pruning all changes before some given time (which can be given as a parameter to this package).
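
The pruning suggestion could look roughly like the sketch below. It is not code from this PR: `change`, `pruneBefore`, and `sec` are hypothetical names, and the decision to keep the newest entry at or before the cutoff (so the value in effect at the cutoff stays answerable) is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// change is a hypothetical stand-in for one changelog entry.
type change struct {
	timestamp time.Time
	value     int
}

// pruneBefore drops changes older than cutoff from a slice sorted by
// timestamp. It keeps the newest change at or before cutoff so that the
// value in effect at cutoff can still be answered.
func pruneBefore(changes []change, cutoff time.Time) []change {
	// Index of the first change strictly after cutoff.
	idx := sort.Search(len(changes), func(i int) bool {
		return changes[i].timestamp.After(cutoff)
	})
	if idx <= 1 {
		return changes // nothing older than the entry in effect at cutoff
	}
	// Copy into a fresh slice so the pruned entries can be garbage collected.
	kept := make([]change, len(changes)-(idx-1))
	copy(kept, changes[idx-1:])
	return kept
}

// sec builds a fixed timestamp so the example is deterministic.
func sec(s int64) time.Time { return time.Unix(s, 0) }

func main() {
	entries := []change{{sec(10), 1}, {sec(20), 2}, {sec(30), 3}}
	entries = pruneBefore(entries, sec(25))
	for _, c := range entries {
		fmt.Println(c.timestamp.Unix(), c.value)
	}
}
```

With a cutoff of 25 the entry at 10 is dropped, while the entries at 20 and 30 remain.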

Member

Sorry for the delay. The initial policies version implementation used a rough circular buffer keyed by timestamps. At that time we decided to key it with a version number instead. In fact, removing the policies version object should be done on a timeout or when a buffer round is complete, before replacing the stale policies version.

Collaborator

@geyslan not sure I understand - do you think it can be used for the policy updates or not?

Member

@yanivagman No, as it stands now it cannot be used for versioned policies (u16). The changelog uses logic around timestamps, while policy snapshots are closely tied to the version number.


// ATTENTION: You should use Changelog within a struct and provide methods to access it,
// coordinating access through your struct mutexes. DO NOT EXPOSE the changelog object directly to
// the outside world as it is not thread-safe.

type Changelog[T comparable] struct {
changes []item[T] // list of changes
timestamps map[time.Time]struct{} // set of timestamps (used to avoid duplicates)
}

// NewChangelog creates a new changelog.
func NewChangelog[T comparable]() *Changelog[T] {
return &Changelog[T]{
changes: []item[T]{},
timestamps: map[time.Time]struct{}{},
}
}

// Getters

// GetCurrent: Observation on single element changelog.
//
// If there's one element in the changelog, after the loop, left would be set to 1 if the single
// timestamp is before the targetTime, and 0 if it's equal or after.
//
// BEFORE: If the single timestamp is before the targetTime, when we return
// clv.changes[left-1].value, returns clv.changes[0].value, which is the expected behavior.
//
// AFTER: If the single timestamp is equal to, or after the targetTime, the current logic would
// return a "zero" value because of the condition if left == 0.
//
// We need to find the last change that occurred before or exactly at the targetTime. The binary
// search loop finds the position where a new entry with the targetTime timestamp would be inserted
// to maintain chronological order:
//
// This position is stored in "left".
//
// So, to get the last entry that occurred before the targetTime, we need to access the previous
// position, which is left-1.
//
// GetCurrent returns the latest value of the changelog.
func (clv *Changelog[T]) GetCurrent() T {
if len(clv.changes) == 0 {
return returnZero[T]()
}

return clv.changes[len(clv.changes)-1].value
}

// Get returns the value of the changelog at the given time.
func (clv *Changelog[T]) Get(targetTime time.Time) T {
if len(clv.changes) == 0 {
return returnZero[T]()
}

idx := clv.findIndex(targetTime)
if idx == 0 {
var zero T
return zero
}

return clv.changes[idx-1].value
}

// GetAll returns all the values of the changelog.
func (clv *Changelog[T]) GetAll() []T {
values := make([]T, len(clv.changes))
for i, entry := range clv.changes {
values[i] = entry.value
}
return values
}

// Setters

// SetCurrent sets the latest value of the changelog.
func (clv *Changelog[T]) SetCurrent(value T) {
clv.setAt(value, time.Now())
}

// Set sets the value of the changelog at the given time.
func (clv *Changelog[T]) Set(value T, targetTime time.Time) {
clv.setAt(value, targetTime)
}

// private

// setAt sets the value of the changelog at the given time.
func (clv *Changelog[T]) setAt(value T, targetTime time.Time) {
// If the timestamp is already set, update that value only.
_, ok := clv.timestamps[targetTime]
if ok {
index := clv.findIndex(targetTime)
if !clv.changes[index].timestamp.Equal(targetTime) { // sanity check only (time exists already)
logger.Debugw("changelog internal error: timestamp mismatch")
return
}
if clv.changes[index].value != value {
logger.Debugw("changelog error: value mismatch for same timestamp")
}
clv.changes[index].value = value
return
}

entry := item[T]{
timestamp: targetTime,
value: value,
}

// Insert the new entry in the changelog, keeping the list sorted by timestamp.
idx := clv.findIndex(entry.timestamp)
clv.changes = append(clv.changes, item[T]{})
copy(clv.changes[idx+1:], clv.changes[idx:])
clv.changes[idx] = entry

// Mark the timestamp as set.
clv.timestamps[targetTime] = struct{}{}
}

// findIndex returns the index of the first item in the changelog whose timestamp is equal to or
// after the given time (the position where an entry with that timestamp would be inserted).
func (clv *Changelog[T]) findIndex(target time.Time) int {
left, right := 0, len(clv.changes)

for left < right {
middle := (left + right) / 2
if clv.changes[middle].timestamp.Before(target) {
left = middle + 1
} else {
right = middle
}
}

return left
}

// returnZero returns the zero value of the type T.
func returnZero[T any]() T {
var zero T
return zero
}
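
The ATTENTION note in this file asks callers to keep the changelog inside a struct and guard it with that struct's mutex. A minimal sketch of that pattern follows. `nameLog` is a tiny stand-in written for this example rather than the PR's `Changelog` type, and `TaskInfo`, `SetName`, and `Name` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// nameLog is a tiny stand-in for Changelog[string]. Like the original, it is
// not safe for concurrent use on its own.
type nameLog struct {
	times  []time.Time
	values []string
}

func (n *nameLog) setCurrent(v string) {
	n.times = append(n.times, time.Now())
	n.values = append(n.values, v)
}

func (n *nameLog) getCurrent() string {
	if len(n.values) == 0 {
		return ""
	}
	return n.values[len(n.values)-1]
}

// TaskInfo (hypothetical) owns the log and serializes access through its mutex.
type TaskInfo struct {
	mu   sync.RWMutex
	name nameLog // unexported, so callers cannot bypass the lock
}

// SetName records a new name under the write lock.
func (t *TaskInfo) SetName(name string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.name.setCurrent(name)
}

// Name reads the latest name under the read lock.
func (t *TaskInfo) Name() string {
	t.mu.RLock()
	defer t.mu.RUnlock()
	return t.name.getCurrent()
}

func main() {
	var info TaskInfo
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			info.SetName(fmt.Sprintf("worker-%d", i))
		}(i)
	}
	wg.Wait()
	fmt.Println(info.Name() != "") // some writer's name is now current
}
```

Keeping the log field unexported is what enforces the "do not expose" part of the note: every read and write has to go through a method that takes the lock.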
82 changes: 82 additions & 0 deletions pkg/changelog/changelog_test.go
@@ -0,0 +1,82 @@
package changelog

import (
"reflect"
"testing"
"time"
)

func TestChangelog(t *testing.T) {
cl := NewChangelog[int]()

// Test GetCurrent on an empty changelog
if cl.GetCurrent() != 0 {
t.Errorf("GetCurrent on empty changelog should return 0")
}

// Test SetCurrent and GetCurrent
cl.SetCurrent(42)
if cl.GetCurrent() != 42 {
t.Errorf("GetCurrent after SetCurrent should return 42")
}

// Test Get on an empty changelog

cl = NewChangelog[int]()

if cl.Get(time.Now()) != 0 {
t.Errorf("Get on empty changelog should return 0")
}

// Test 2 second interval among changes

cl = NewChangelog[int]()

cl.SetCurrent(1)
time.Sleep(2 * time.Second)
cl.SetCurrent(2)
time.Sleep(2 * time.Second)
cl.SetCurrent(3)

now := time.Now()

if cl.Get(now.Add(-4*time.Second)) != 1 {
t.Errorf("Get on changelog should return 1")
}
if cl.Get(now.Add(-2*time.Second)) != 2 {
t.Errorf("Get on changelog should return 2")
}
if cl.Get(now) != 3 {
t.Errorf("Get on changelog should return 3")
}

// Test 100 milliseconds interval among changes
// NOTE: If this test becomes flaky we can change/remove it.

cl = NewChangelog[int]()

cl.SetCurrent(1)
time.Sleep(100 * time.Millisecond)
cl.SetCurrent(2)
time.Sleep(100 * time.Millisecond)
cl.SetCurrent(3)

now = time.Now()

if cl.Get(now.Add(-200*time.Millisecond)) != 1 {
t.Errorf("Get on changelog should return 1")
}
if cl.Get(now.Add(-100*time.Millisecond)) != 2 {
t.Errorf("Get on changelog should return 2")
}
if cl.Get(now) != 3 {
t.Errorf("Get on changelog should return 3")
}

// Test getting all values at once

expected := []int{1, 2, 3}
if !reflect.DeepEqual(cl.GetAll(), expected) {
t.Errorf("GetAll should return %v, but got %v", expected, cl.GetAll())
}
}
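
Since the test above notes that the sleep-based cases may become flaky, one alternative is to drive the lookups with explicit timestamps instead of the wall clock. The sketch below is only an illustration: `miniLog` is a compact stand-in that mirrors the insert and lookup logic shown in the diff, it is not the PR's generic `Changelog`, and `at` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

type item struct {
	timestamp time.Time
	value     int
}

// miniLog is a compact stand-in that mirrors the lookup logic in the diff.
type miniLog struct{ changes []item }

// findIndex returns the index of the first item whose timestamp is not
// before target, which is also the sorted insert position for target.
func (m *miniLog) findIndex(target time.Time) int {
	return sort.Search(len(m.changes), func(i int) bool {
		return !m.changes[i].timestamp.Before(target)
	})
}

// Set inserts value at time t, keeping the slice sorted by timestamp.
func (m *miniLog) Set(value int, t time.Time) {
	idx := m.findIndex(t)
	m.changes = append(m.changes, item{})
	copy(m.changes[idx+1:], m.changes[idx:])
	m.changes[idx] = item{timestamp: t, value: value}
}

// Get returns the value of the last change strictly before t, or 0 if none.
func (m *miniLog) Get(t time.Time) int {
	idx := m.findIndex(t)
	if idx == 0 {
		return 0
	}
	return m.changes[idx-1].value
}

// at builds a fixed timestamp so no sleeping is needed.
func at(sec int) time.Time { return time.Unix(int64(sec), 0) }

func main() {
	m := &miniLog{}
	// Insert out of order to exercise the sorted insert.
	m.Set(3, at(40))
	m.Set(1, at(0))
	m.Set(2, at(20))

	fmt.Println(m.Get(at(10))) // between the first and second change
	fmt.Println(m.Get(at(30))) // between the second and third change
	fmt.Println(m.Get(at(50))) // after the third change
	fmt.Println(m.Get(at(0)))  // exactly at the first change's timestamp
}
```

This prints 1, 2, 3, and then 0. The last line shows a boundary worth a dedicated test case: with a `Before`-based search, a lookup at exactly a change's timestamp resolves to the previous entry, or to the zero value when there is none.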
8 changes: 8 additions & 0 deletions pkg/cmd/cobra/cobra.go
@@ -101,6 +101,14 @@ func GetTraceeRunner(c *cobra.Command, version string) (cmd.Runner, error) {
logger.Debugw("Cache", "type", cfg.Cache.String())
}

// Process Tree command line flags

procTree, err := flags.PrepareProcTree(viper.GetStringSlice("proctree"))
if err != nil {
return runner, err
}
cfg.ProcTree = procTree

// Capture command line flags - via cobra flag

captureFlags, err := c.Flags().GetStringArray("capture")
3 changes: 3 additions & 0 deletions pkg/cmd/flags/help.go
@@ -13,6 +13,7 @@ func PrintAndExitIfHelp(ctx *cli.Context) {
keys := []string{
"crs",
"cache",
"proctree",
"capture",
"scope",
"events",
@@ -51,6 +52,8 @@ func GetHelpString(key string) string {
return containersHelp()
case "cache":
return cacheHelp()
case "proctree":
return procTreeHelp()
case "capture":
return captureHelp()
case "scope", "events":