-
Notifications
You must be signed in to change notification settings - Fork 408
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proctree improvements (RSS/Performance) #4261
Conversation
Public proctree methods already make use of the mutex of the inner lru cache. This commit removes the outer mutex and adds concurrency tests to ensure that the proctree is safe to use concurrently.
hashicorp LRU increases RSS over time. This change uses the proctree lru cache to an expirable one, which will remove entries after a certain time. Two flags/options were added to the proctree command to control the cache TTLs. The default is 120 seconds. flags: --proctree process-cache-ttl=60 --proctree thread-cache-ttl=60 yaml config: cache-ttl: process: 60 thread: 60
comm had the same inner address as the original string which it was created from (basename). This caused the original underlying bytes to be preserved and not garbage collected. Converting it to a byte slice and then back to a string guarantees that a new string is created releasing the original reference.
double checked the benchmark in my env, good improvement!
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, congrats for the good job :)
1. Explain what the PR does
4b89620 fix: create a new fresh string for comm
cb89e2f perf: use proctree expirable lru cache
976704f perf: remove mutex, add concurrency tests
36ca022 chore: benchmark proctree
4b89620 fix: create a new fresh string for comm
cb89e2f perf: use proctree expirable lru cache
976704f perf: remove mutex, add concurrency tests
2. Explain how to test it
Run tracee
sudo ./dist/tracee -e execve --proctree source=both --proctree process-cache-ttl=60 --proctree thread-cache-ttl=60 -o none
Stress it
i=0; while ((i < 300000)); do ls >/dev/null; ((i++)) ; done
Measurements
We consider looking at only the
000000c0000000000
segment, narrowing down to where the proctree allocates memory. Not doing this would give us the full RSS which contains the perf buffer segments that are allocated per cpu.*A reduction of 78008 KB (after 6min) but without further decrease.
*A reduction of 345624 KB (after stress and cache expiring).
976704f perf: remove mutex, add concurrency tests
That reduced lock contention.
/home/gg/.goenv/versions/1.22.4/bin/go test -benchmem -run=^$ -tags ebpf -bench ^BenchmarkProcessTree$ github.com/aquasecurity/tracee/pkg/proctree -benchtime=1000000x -race
Measurements Conclusion
With this PR, after stress and cache expiring, we have a decrease around 267616KB (310056KB [old final cache] - 48580KB [new final cache]) in the size of proctree and significant improvement in CPU time of the public methods of proctree.
3. Other comments