Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

new(app_actions)!: adjust base_syscalls option, add base_syscalls.repair #2457

Merged
merged 10 commits into from Mar 30, 2023

Conversation

incertum
Copy link
Contributor

What type of PR is this?

Uncomment one (or more) /kind <> lines:

/kind bug

/kind cleanup

/kind design

/kind documentation

/kind failing-test

/kind feature

/kind release

Any specific area of the project related to this PR?

Uncomment one (or more) /area <> lines:

/area build

/area engine

/area tests

/area proposals

/area CI

What this PR does / why we need it:

Finalize #2433.
Fixes missing PPM_SC_SCHED_PROCESS_EXIT safety enforcement when user_positive_sc_set non empty.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

new(app_actions)!: adjust base_syscalls option, add base_syscalls.repair

@@ -444,66 +444,86 @@ metadata_download:
watch_freq_sec: 1


# base_syscalls ! Use with caution !
# base_syscalls ! Use with caution, read carefully !
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried to make the description easier to follow, we can plan for some final touch up closer to Falco 0.35 release.


/* Hidden safety enforcement for `base_syscalls.custom_set` user input override option -> sched_process_exit trace point activation
* (procexit event) is necessary for continuous state engine cleanup, else memory would grow rapidly and linearly over time. */
base_sc_set.insert(PPM_SC_SCHED_PROCESS_EXIT);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was missing for the user override scenario ...

@@ -47,7 +47,7 @@ static std::string s_sample_ruleset = "sample-ruleset";
static std::string s_sample_source = falco_common::syscall_source;

static strset_t s_sample_filters = {
"evt.type=connect or evt.type=accept",
"evt.type=connect or evt.type=accept or evt.type=accept4 or evt.type=umount2",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small touch up on tests to include some of the overlapping syscalls already. Noticed during testing that init_module was troublesome ... but it may be a test setup issue, as when I ran the Falco binary with debug log level init_module was correctly included everywhere.

As we refactor the ppm sc API one more time with a subsequent libs bump we should include a generic syscalls in the test mock rules as well as this critical test coverage is currently missing. CC @jasondellaluce

@incertum
Copy link
Contributor Author

Seeing that there are issues with the test cases, because of extending them a bit. Let's investigate, locally they passed, but I posted in another comment that when I tried to add init_module into the mock up rules it was troublesome. Happy to revert test extensions a bit and we can address them in a separate PR after @jasondellaluce refactors final aspects of ppm sc API.

@Andreagit97 Andreagit97 added this to the 0.35.0 milestone Mar 27, 2023
@jasondellaluce
Copy link
Contributor

@incertum I had a list of suggestions to propose with where impossible-ish to report through the GH UI, so I proceeded by pushing a commit on your branch with my review suggestion. Please drop it if you feel like it's wrong.

The topic is to fix a couple of logic/preference flaws, and strengthen the test cases (also re-including your interim test enhancements). Hopefully this will fix the CI on GH as well, since it works for me locally too.

@@ -89,11 +89,9 @@ static void select_event_set(falco::app::state& s, const libsinsp::events::set<p
auto user_positive_sc_set = libsinsp::events::names_to_sc_set(user_positive_names);
auto user_negative_sc_set = libsinsp::events::names_to_sc_set(user_negative_names);

if (!user_positive_sc_set.empty() || s.config->m_base_syscalls_repair)
if (!user_positive_sc_set.empty())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: we were wrongly accepting totally-empty positive set, which we didn't allow before and I'm a bit contrary on.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would be an alternative to allow bypassing default sinsp state enforcement, but use .repair option, but maybe still want one negative syscall?

Either way you can always just add execve in there and that way bypass, so I would always find a way to get the behavior I want :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My points are:

  • We need a way for users to satisfy the case "all default sinsp set syscalls, except the negated ones", which as of today we interpret the empty positive set for. If you want to really have an empty custom positive set, what is your proposal to support the above? We do the same kind of reasoning for enabled event sources, for example
  • I'm not sure that explicitly allowing user to pass-in totally-empty base sets is the best UX, and as you said you still have means to circumvent it by simply defining a syscall as positive and negative at the same time. I'm not strong on this, but what is the concrete use case that you foresee on having a totally-empty positive base set?

Copy link
Contributor Author

@incertum incertum Mar 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see suggestion in one of the latest commits. Perhaps having the condition (repair is specified, but custom_set empty) as separate logic block clears things up? Intent of this condition is to be able to use automated state enforcement, but not the default one and instead the repair one as the repair one is conditioned by rules and the more correct solution. But since it is still experimental we cannot default to it.

Use cases: Busy servers may not be able to tolerate too verbose / unneeded state enforcement as currently done with the default option. But those end users may also appreciate more automation.

Worst case as you also confirmed I can bypass it for our use cases, but I think supporting this configuration makes sense for the community as a whole. WDYT?

userspace/falco/app/actions/configure_interesting_sets.cpp Outdated Show resolved Hide resolved
userspace/falco/app/actions/configure_interesting_sets.cpp Outdated Show resolved Hide resolved
@@ -301,9 +301,10 @@ TEST(ConfigureInterestingSets, selection_custom_base_set)
// note: `accept` is not included even though it is matched by the rules,
// which means that the custom negation base set has precedence over the
// final selection set as a whole
"connect", "open", "ptrace", "mmap", "execve", "read", "syncfs", "sched_process_exit"
// todo(jasondellaluce): add "accept4" once names_to_sc_set is polished on the libs side
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When disabling accept, Falco also disables accept4. That's still expected because I still have to clean up names_to_sc_set on libsinsp.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, yes!

@incertum
Copy link
Contributor Author

Thanks a lot for fixing up the tests 🙏 will address the other changes shortly when I have some time.

incertum and others added 9 commits March 29, 2023 05:12
…syscalls set

Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>
See falcosecurity#2433

Co-authored-by: Jason Dellaluce <jasondellaluce@gmail.com>
Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>
….custom_set

Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>
Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>
Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>
Co-authored-by: Melissa Kilby <melissa.kilby.oss@gmail.com>
Signed-off-by: Jason Dellaluce <jasondellaluce@gmail.com>
Co-authored-by: Jason Dellaluce <jasondellaluce@gmail.com>
Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>
Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>
Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>
@incertum
Copy link
Contributor Author

@jasondellaluce I think we still have a some problems with the tests. They can interfere with each other and then some of them can fail. It appears to be a known issue, was reading up on https://chromium.googlesource.com/external/github.com/google/googletest/+/refs/tags/release-1.8.0/googletest/docs/FAQ.md -> we may need better SetUp and TearDown mechanisms for these ConfigureInterestingSets tests?

How to replicate issues: This is the interesting part, first all worked again locally, changing order of tests made it then fail for 2 of them. And then without changing anything after rebasing some tests failed, but for example if you comment all other tests out the one that failed will surely pass. Hope one of these conditions helps you to simulate these type of apparently a bit random interferences.

Comment on lines 134 to 144
/* REPLACE DEFAULT STATE, nothing else. */
if (s.config->m_base_syscalls_repair && s.config->m_base_syscalls_custom_set.empty())
{
/* If `base_syscalls.repair` is specified, but `base_syscalls.custom_set` is empty we are replacing
* the default `sinsp_state_sc_set()` enforcement with the alternative `sinsp_repair_state_sc_set`.
* This approach only activates additional syscalls Falco needs beyond the
* syscalls defined in each Falco rule that are absolutely necessary based
* on the current rules configuration. */

// returned set already has rules_sc_set merged
s.selected_sc_set = libsinsp::events::sinsp_repair_state_sc_set(rules_sc_set);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/* REPLACE DEFAULT STATE, nothing else. */
if (s.config->m_base_syscalls_repair && s.config->m_base_syscalls_custom_set.empty())
{
/* If `base_syscalls.repair` is specified, but `base_syscalls.custom_set` is empty we are replacing
* the default `sinsp_state_sc_set()` enforcement with the alternative `sinsp_repair_state_sc_set`.
* This approach only activates additional syscalls Falco needs beyond the
* syscalls defined in each Falco rule that are absolutely necessary based
* on the current rules configuration. */
// returned set already has rules_sc_set merged
s.selected_sc_set = libsinsp::events::sinsp_repair_state_sc_set(rules_sc_set);
/* REPLACE DEFAULT STATE, nothing else. */
if (s.config->m_base_syscalls_repair && user_positive_sc_set.empty())
{
/* If `base_syscalls.repair` is specified, but `base_syscalls.custom_set` defines no positive syscall we are replacing
* the default `sinsp_state_sc_set()` enforcement with the alternative `sinsp_repair_state_sc_set`.
* This approach only activates additional syscalls Falco needs beyond the
* syscalls defined in each Falco rule that are absolutely necessary based
* on the current rules configuration. */
base_sc_set.clear()

Agreed on your proposal. I think this should happen only when user_positive_sc_set is empty, so that users can still negate some syscall if they need to (potentially used by the rules for maximum control). Also, this needs to be moved at line 110 as an else branch of the current if (!user_positive_sc_set.empty()), so that negated syscall checks can be applied below.

What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK, since sinsp_repair_state_sc_set already merges in the rules_sc_set moving it to line 115 may be better. I think it's ok to just re-assign / override s.selected_sc_set again, but we could add more if else, elseif etc behavior as well.

@jasondellaluce
Copy link
Contributor

we may need better SetUp and TearDown mechanisms for these ConfigureInterestingSets tests?

I think all the tests should be "stateless", and they just access the static globals for reading. It can be that the static global variables are not initialized in the proper execution order maybe, in which case SetUp and TearDown would definitely help. Since they are passing so far, I would leve this test flakyness improvements for another PR, WDYT?

Co-authored-by: Jason Dellaluce <jasondellaluce@gmail.com>
Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>
Copy link
Contributor

@jasondellaluce jasondellaluce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

@poiana
Copy link

poiana commented Mar 29, 2023

LGTM label has been added.

Git tree hash: 70810f8df1c10e271f13dd53c90294a2aef24110

Copy link
Member

@leogr leogr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Thank you!

@poiana
Copy link

poiana commented Mar 30, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: incertum, jasondellaluce, leogr

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [jasondellaluce,leogr]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@poiana poiana merged commit 0b6e243 into falcosecurity:master Mar 30, 2023
16 checks passed
@incertum incertum deleted the repair-base-syscalls branch April 25, 2023 09:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

None yet

5 participants