New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
new(libsinsp): sinsp_repair_state_sc_set
ppm sc API
#826
new(libsinsp): sinsp_repair_state_sc_set
ppm sc API
#826
Conversation
c4b9f5b
to
3cf9d19
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you Melissa! This looks really promising, i think this is what we all had in mind actually :)
There are some minor things to think about:
- We need an event/syscall flag to mark a syscall as
heavy
- then, Falco/consumers will need to check if the adaptive set contains any of the heavy syscalls (perhaps a new API for this that return the subset of the ppm_sc_set cotnaining all the heavy ones?)
- in case Falco is not run with
-A
, we will need to print a warning and remove the heavy syscalls from the set - if Falco is run with -A, we run adaptive regardless of heavy syscalls
But that can be done in subsequent PRs; let's keep it simple :D
userspace/libsinsp/sinsp_ppm_sc.cpp
Outdated
|
||
} | ||
|
||
void sinsp::set_filter_evttypes_ppm_sc_set(std::unordered_set<std::string> m_filter_evttypes_names, std::unordered_set<uint32_t>& ppm_sc_set) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this API?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's see, thought this will likely be refactored anyways after first getting the Falco approach in place. Consider the current approach to extract evttypes from the filter a hack
(I don't yet have a full understanding of those parts of the code base) -> for now idea was to have a complete enough POC for testing.
userspace/libsinsp/sinsp_ppm_sc.cpp
Outdated
std::unordered_set<uint32_t> ppm_sc_set_filter_evttypes = ppm_sc_set; | ||
|
||
/* Enforce syscalls solely needed for minimal sinsp state built-up and life-cycle management (Falco use case). */ | ||
std::unordered_set<uint32_t> base_thread_state_set = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we start from enforce_sinsp_state_ppm_sc
?
In that case, we can be sure that any syscall needed by sinsp enrichment is actually captured.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Primary goal of this PR is to provide an alternative to enforce_sinsp_state_ppm_sc
with the --optimized-syscalls
option (opt-in, default to false) 🙃 .
Thinking behind this is that current approach seems to add too many unnecessary syscalls (because of too generic UF_NEVER_DROP
or EF_MODIFIES_STATE
flags).
Consider the following (attempting to brain dump my current understanding):
- For example, consider a scenario where Falco rules are only around
spawned_process
andopen*
syscalls -> hence no need to tap into any network related syscalls at all or other fds. - Many
EF_MODIFIES_STATE
doEF_CREATES_FD
, but consider a scenario where this syscall isn't part of a Falco rule, hence don't need to add those fds -> for the most part most fd related syscalls contain the needed fd info in the actual live event. Only for example forconnect
,accept*
andlisten
syscalls absolutely need to also tracesocket
syscall etc to get the ip tuples. However, interested in learning about concrete examples where this assessment is false -> then extend new alternativeenforce_optimized_ppm_sc_set
option. - Very carefully reviewed the parser and the current sets is what came out, but I planned more detailed analyses. The intended new minimum set should truly only address adding or updating fields that are used to create logs when another syscall triggers (e.g. opened a file, but also need current uid of the process).
[The removing and destroying parts should be sufficiently addressed (it's only the proc exit sched events and close syscall)]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, i'd say:
- either we enforce the sinsp state ppm_sc
- or we DO NOT enforce them, and expect the consumer to specify additional syscalls to be tracked, aside from the ruleset ones
If you see, all APIs in that source file but enforce_simple_ppm_sc
do not enforce the sinsp state ppm sc!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course, Falco will eventually enforce the sinsp_state_ppm_sc
though :/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that sooner or later we will need to expose a config option to let users choose the syscalls to be additionally tracked, like
additional_syscalls: [ "open", "close", "execve" ]
and when the option is set, we should avoid enforcing the sinsp state ppm sc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@FedeDP summarized all new suggestions in Falco PR falcosecurity/falco#2361.
- Re
additional_syscalls
option, I like it, gives more control and even more options to end users - Re sinsp state enforcement, obviously since I opened this would say we really need this new approach (once finalized and proven to be 100% technically sound), but we can support both: (1) a more efficient sinsp state enforcement option (this one) and (2) the more conservative option (the existing API). Conservative one can be default and the new one opt-in via a flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note however that UF_NEVER_DROP
are likely to be dropped from the enforced set (i think we will eventually drop both the UF_NEVER_DROP
and UF_ALWAYS_DROP
flags sooner or later).
In the end, our enforced state ppm sc will just be the list of EF_MODIFIES_STATE
events, not that big!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@FedeDP I like the heavy
syscall addition as safety guardrails (💯 ) and agreed, could rebase this PR after a new API has been added in a separate PR.
Additional suggestions for consideration:
- Let's simulate everything in
sinsp-example
, hence also add-A
flag or any other new flag tosinsp-example
to be able to implement moree2e
tests? - Would it also be acceptable to add a flag
--allow-heavy-syscalls
instead? That way-A
has one clear purpose (no ambiguity), else if you define heavy syscalls in Falco rules, but don't set--allow-heavy-syscalls
they would be rejected similar to how you described it. Suggesting this to achieve aZERO complaints, all configurations supported approach
, because end users may complain again and say, hey I only want to add this one heavy syscall, let's saymmap
, why can't I just do that and why do I have to pay again the cost of tracing ALL syscalls (and keep in mind this cost increases as we add more syscalls)
userspace/libsinsp/sinsp_ppm_sc.cpp
Outdated
|
||
} | ||
|
||
void sinsp::set_filter_evttypes_ppm_sc_set(std::unordered_set<std::string> m_filter_evttypes_names, std::unordered_set<uint32_t>& ppm_sc_set) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's see, thought this will likely be refactored anyways after first getting the Falco approach in place. Consider the current approach to extract evttypes from the filter a hack
(I don't yet have a full understanding of those parts of the code base) -> for now idea was to have a complete enough POC for testing.
userspace/libsinsp/sinsp_ppm_sc.cpp
Outdated
std::unordered_set<uint32_t> ppm_sc_set_filter_evttypes = ppm_sc_set; | ||
|
||
/* Enforce syscalls solely needed for minimal sinsp state built-up and life-cycle management (Falco use case). */ | ||
std::unordered_set<uint32_t> base_thread_state_set = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Primary goal of this PR is to provide an alternative to enforce_sinsp_state_ppm_sc
with the --optimized-syscalls
option (opt-in, default to false) 🙃 .
Thinking behind this is that current approach seems to add too many unnecessary syscalls (because of too generic UF_NEVER_DROP
or EF_MODIFIES_STATE
flags).
Consider the following (attempting to brain dump my current understanding):
- For example, consider a scenario where Falco rules are only around
spawned_process
andopen*
syscalls -> hence no need to tap into any network related syscalls at all or other fds. - Many
EF_MODIFIES_STATE
doEF_CREATES_FD
, but consider a scenario where this syscall isn't part of a Falco rule, hence don't need to add those fds -> for the most part most fd related syscalls contain the needed fd info in the actual live event. Only for example forconnect
,accept*
andlisten
syscalls absolutely need to also tracesocket
syscall etc to get the ip tuples. However, interested in learning about concrete examples where this assessment is false -> then extend new alternativeenforce_optimized_ppm_sc_set
option. - Very carefully reviewed the parser and the current sets is what came out, but I planned more detailed analyses. The intended new minimum set should truly only address adding or updating fields that are used to create logs when another syscall triggers (e.g. opened a file, but also need current uid of the process).
[The removing and destroying parts should be sufficiently addressed (it's only the proc exit sched events and close syscall)]
@incertum note that the Basically, we will have:
|
I agree with the solution proposed by @FedeDP, since it basically preserves the previous behaviors from a functional point of view, but at the same time, allows optimization. 👍 |
Minor note: instead of a new
Most of the "heavy" syscalls are IO related. And this would save us yet another flag. Basically we could use the set provided by https://github.com/falcosecurity/libs/blob/master/userspace/libsinsp/sinsp_ppm_sc.cpp#L167 |
ACK, thanks @FedeDP and @leogr for clarifying @FedeDP smart re using In summary:
|
3cf9d19
to
e782db3
Compare
@FedeDP @leogr @Andreagit97 @jasondellaluce this has been re-designed from the ground up thanks to recent refactors. |
e782db3
to
264345d
Compare
90fd58c
to
34e8732
Compare
Created a new Falco ticket falcosecurity/falco#2433 to track this effort and also posted on slack asking for community help to further build this feature out. Thanks. |
34e8732
to
bc26347
Compare
sinsp_repair_state_sc_set
ppm sc API
See falcosecurity/falco#2433 (comment) Co-authored-by: Jason Dellaluce <jasondellaluce@gmail.com> Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>
Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>
bc26347
to
98cf6c0
Compare
Rebased and removed changes from This would also clean up the current potentially verbose event names print out ...
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR LGTM left some minor comments :)
A more general question, if I understood correctly now to libs consumer we are offering 3 possible ways:
simple_sc_set
repair_sc_set
- YOLO mode (crazy scenarios with just one syscall enabled)
is it true?
When opening the engine with Falco we will always call this repair
method using syscalls provided by rulesets, is it true?
/* These syscalls are used to build up or modify info of the basic process (tinfo) struct. | ||
* Consistent enforcement regardless of the input ppm_sc_set. | ||
*/ | ||
libsinsp::events::set<ppm_sc_code> repaired_sinsp_state_sc_set = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this could be the right place to enforce PPM_SC_SCHED_PROCESS_EXIT
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes! Nice, since now we can do it, left a comment stating that it is ok if this enforcement is redundant in case we decide to also enforce PPM_SC_SCHED_PROCESS_EXIT
elsewhere.
@jasondellaluce when we refactor sc_names_to_sc_set
and/or create event_names_to_sc_set
need to also include the tp now ...
…sp_repair_state_sc_set see also falcosecurity#963 (comment) Co-authored-by: Andrea Terzolo <andrea.terzolo@polito.it> Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>
* merge filter ppm_sc set when returning sinsp_repair_state_sc_set * optimize bind condition Co-authored-by: Andrea Terzolo <andrea.terzolo@polito.it> Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>
Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>
Tried to wordsmith the description. We likely need more edits to come up with a clear final description.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/approve
LGTM label has been added. Git tree hash: aba2e88251a1e3e658253a631c899a3a97d18c46
|
Makes sense! Thank you for the summary :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: Andreagit97, FedeDP, incertum The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/milestone 0.11.0 |
Signed-off-by: Melissa Kilby melissa.kilby.oss@gmail.com
What type of PR is this?
/kind cleanup
/kind design
/kind feature
Any specific area of the project related to this PR?
/area libsinsp
Does this PR require a change in the driver versions?
What this PR does / why we need it:
Edited 2023-02-16:
Thanks to refactors in #877 and #854 stars have aligned to build out the adaptive base syscalls enforcement option based on the event types defined in the filter.
Why is this option needed?
Clients can now enable more resourceful kernel tracing in that only syscalls that are needed for sinsp state build-up and life-cycle management are subscribed to in addition to the syscalls defined in the filter string. In short, this new option is sinsp state compliant - conditioned by filter events / current configuration. This option can accomplish large cost savings, especially when the client needs to feature different modes of monitoring according to a cost budget. All configurations are supported, e.g. the most minimal configuration possible is solely monitoring spawned processes nothing else.
Integration proposal:
Example:
Which issue(s) this PR fixes:
Inability to customize libs clients and run agents in a more cost saving mode.
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?: