cleanup(libsinsp): Extension of ppm sc API for corner cases (e.g. event_set_to_names and names_to_sc_set) #915

Merged
poiana merged 14 commits into falcosecurity:master from fixes-ppm-sc-api3
Mar 6, 2023

Conversation

incertum
Contributor

@incertum incertum commented Feb 21, 2023

What type of PR is this?

Uncomment one (or more) /kind <> lines:

/kind bug

/kind cleanup

/kind design

/kind documentation

/kind failing-test

/kind feature

Any specific area of the project related to this PR?

Uncomment one (or more) /area <> lines:

/area API-version

/area build

/area CI

/area driver-kmod

/area driver-bpf

/area driver-modern-bpf

/area libscap-engine-bpf

/area libscap-engine-gvisor

/area libscap-engine-kmod

/area libscap-engine-modern-bpf

/area libscap-engine-nodriver

/area libscap-engine-noop

/area libscap-engine-source-plugin

/area libscap-engine-savefile

/area libscap-engine-udig

/area libscap

/area libpman

/area libsinsp

/area tests

/area proposals

Does this PR require a change in the driver versions?

/version driver-API-version-major

/version driver-API-version-minor

/version driver-API-version-patch

/version driver-SCHEMA-version-major

/version driver-SCHEMA-version-minor

/version driver-SCHEMA-version-patch

What this PR does / why we need it:

Updated:

Performed a fresh inspection for corner cases throughout:

  • event_set_to_names: my initial assessment that it was "wonky" turned out to be wrong. Nonetheless, I propose adopting a clearer style here, since the inner loop was very hard to follow. The results are equivalent, so there is no regression. Added new unit tests and, per @jasondellaluce's suggestion, a bool that selects whether event codes map to the event table names or resolve to the true syscall names. Because of the information loss, the latter oversubscribes, especially for generic events (they map to about 234 generic syscalls). That way all future use cases should be covered.
  • names_to_sc_set: I had totally overlooked all the corner cases and special snowflakes here. Given that we now scan Falco rules for evt.type ppm_sc codes and this method is used for the translation, this refactor is highly relevant for Falco correctness (e.g. accept in rules is now guaranteed to activate both the accept and accept4 syscalls, just like before).
  • Some more cleanup, plus a few other minor things I noticed along the way.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

cleanup(libsinsp): Extension of ppm sc API for corner cases (e.g. event_set_to_names and names_to_sc_set) 

incertum and others added 5 commits February 26, 2023 19:24
…vent_set, all_non_sc_event_set to ppm sc API

Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>
…_sc_set in ppm sc API

Extra back and forth mapping to resolve overloaded event <-> sc names, e.g. accept -> accept, accept4
Plus account for variants that share event codes, e.g. eventfd, eventfd2 share PPME_SYSCALL_EVENTFD_E, PPME_SYSCALL_EVENTFD_X
Plus handle special snowflakes, e.g. "umount" event string maps to PPME_SYSCALL_UMOUNT_E, PPME_SYSCALL_UMOUNT_X, but
in actuality applies for "umount2" syscall as "umount" syscall is a generic event -> end result is activating both umount, umount2

Since names_to_event_set would resolve generic sc events, we only apply these extra lookups for non generic sc event codes

New tests added as well.
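
To make the lookup order described in this commit message easier to follow, here is a minimal, self-contained sketch. It does not use the libsinsp tables: `toy_event`, `sc_table`, `event_names` and `toy_names_to_sc_set` are hypothetical stand-ins, the mappings are hand-written from the examples above, and the function returns syscall names rather than ppm_sc codes to keep it readable.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy stand-ins for the real tables. Every name and mapping here is
// illustrative and hand-written, not taken from libsinsp.
enum toy_event { EV_GENERIC, EV_ACCEPT, EV_ACCEPT4, EV_EVENTFD, EV_UMOUNT };

struct toy_sc {
    std::string name;  // syscall name
    toy_event event;   // event code the syscall is reported as
};

static const std::vector<toy_sc> sc_table = {
    {"accept", EV_ACCEPT},   {"accept4", EV_ACCEPT4},
    {"eventfd", EV_EVENTFD}, {"eventfd2", EV_EVENTFD},
    {"umount", EV_GENERIC},  {"umount2", EV_UMOUNT},
    {"alarm", EV_GENERIC},
};

// Name -> event codes. One name can be shared by several event codes, and
// generic syscall names resolve to the generic event.
static const std::multimap<std::string, toy_event> event_names = {
    {"accept", EV_ACCEPT},   {"accept", EV_ACCEPT4},
    {"eventfd", EV_EVENTFD},
    {"umount", EV_UMOUNT},   {"umount", EV_GENERIC},
    {"alarm", EV_GENERIC},
};

std::set<std::string> toy_names_to_sc_set(const std::set<std::string>& names) {
    std::set<std::string> out;

    // Pass 1: direct match on syscall names.
    for (const auto& sc : sc_table) {
        if (names.count(sc.name)) out.insert(sc.name);
    }

    // Pass 2: name -> event codes -> every syscall sharing those codes.
    // The generic event is skipped, otherwise one generic name would
    // activate every generic syscall.
    std::set<toy_event> events;
    for (const auto& n : names) {
        auto range = event_names.equal_range(n);
        for (auto it = range.first; it != range.second; ++it) {
            if (it->second != EV_GENERIC) events.insert(it->second);
        }
    }
    for (const auto& sc : sc_table) {
        if (events.count(sc.event)) out.insert(sc.name);
    }
    return out;
}
```

With these toy tables, "accept" yields accept and accept4, "eventfd" yields eventfd and eventfd2, and "umount" yields umount and umount2, while a generic syscall such as "alarm" yields only itself because the generic event code is skipped in the second pass.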

Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>
Refactor so that event_set_to_names is more ppm sc API native and easier to audit.
New method achieves equivalent results, no regression.

Extend unit tests.

Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>
…c API

Have the option to convert event_set to names as defined in the event_table
without proper sc resolution.

Co-authored-by: Jason Dellaluce <jasondellaluce@gmail.com>
Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>
Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>
Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>
Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>
@poiana poiana added size/XL and removed size/L labels Feb 26, 2023
@incertum incertum changed the title from "fix(libsinsp): fix event_set_to_names in ppm sc API" to "cleanup(libsinsp): Extension of ppm sc API for corner cases (e.g. event_set_to_names and names_to_sc_set)" Feb 26, 2023
Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>
*/
std::unordered_set<std::string> event_set_to_names(const set<ppm_event_code>& events_set);
std::unordered_set<std::string> event_set_to_names(const set<ppm_event_code>& events_set, bool resolve_sc = true);
Contributor

Suggested change
std::unordered_set<std::string> event_set_to_names(const set<ppm_event_code>& events_set, bool resolve_sc = true);
std::unordered_set<std::string> event_set_to_names(const set<ppm_event_code>& events_set, bool resolve_generic = true);

Why resolve the converted SC set, instead of just resolving generic events? I feel like we're creating an equivalent of sc_set_to_names. What I would expect from this function is to treat generics as syscall or as the set of actual generic event names, depending on the boolean. WDYT?
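
A minimal sketch of the semantics proposed in this comment, assuming a three-entry toy event table. `toy_event`, `toy_event_name`, `toy_generic_syscalls` and `toy_event_set_to_names` are hypothetical names, and the generic syscall list is a tiny placeholder for the real one. Only the generic event is affected by the flag; every other event keeps its event-table name.

```cpp
#include <set>
#include <string>
#include <unordered_set>

// Hypothetical miniature of the event table: code -> event-table name.
enum toy_event { EV_GENERIC, EV_OPEN, EV_ACCEPT };

static const char* toy_event_name(toy_event e) {
    switch (e) {
        case EV_GENERIC: return "syscall";  // assumed name of the generic event
        case EV_OPEN:    return "open";
        case EV_ACCEPT:  return "accept";
    }
    return "";
}

// Placeholder list of syscalls that have no dedicated event code.
static const std::set<std::string> toy_generic_syscalls = {"alarm", "pause", "umount"};

std::unordered_set<std::string> toy_event_set_to_names(
    const std::set<toy_event>& events, bool resolve_generic = true) {
    std::unordered_set<std::string> names;
    for (toy_event e : events) {
        if (e == EV_GENERIC && resolve_generic) {
            // Expand the generic event into the syscalls it stands for.
            names.insert(toy_generic_syscalls.begin(), toy_generic_syscalls.end());
        } else {
            // Non-generic events, or generics left unresolved, keep the table name.
            names.insert(toy_event_name(e));
        }
    }
    return names;
}
```

Calling it on `{EV_OPEN, EV_GENERIC}` with `resolve_generic = false` gives the two table names, while the default expands the generic event into the placeholder syscall list, which is where the oversubscription mentioned in the PR description comes from.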

Contributor Author

Incorporated :)

/*!
\brief Get the ppm_event for each non sc events.
*/
set<ppm_event_code> all_non_sc_event_set();
Contributor

libsinsp::events::all_event_set().filter([&](ppm_event_code e) { return !libsinsp::events::is_syscall_event(e); });

Isn't this equivalent? Is it worth adding and maintaining API methods if we already have filtering functionality? This applies to the ones above too. WDYT?
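
The point of this comment can be sketched without libsinsp: a generic filter plus a predicate already expresses "all non-syscall events", so a dedicated accessor adds little. Everything below is a hypothetical stand-in (`toy_event`, `is_toy_syscall_event`, `toy_filter`, `all_toy_events`), and the filter is a free function over `std::set` rather than a method on the library's own set type.

```cpp
#include <functional>
#include <set>

// Hypothetical event codes: two syscall events and two non-syscall events.
enum toy_event { EV_OPEN, EV_ACCEPT, EV_CONTAINER, EV_PROCEXIT };

// Stand-in for a predicate such as is_syscall_event.
bool is_toy_syscall_event(toy_event e) {
    return e == EV_OPEN || e == EV_ACCEPT;
}

std::set<toy_event> all_toy_events() {
    return {EV_OPEN, EV_ACCEPT, EV_CONTAINER, EV_PROCEXIT};
}

// Generic filter: keep only the elements for which pred returns true.
std::set<toy_event> toy_filter(const std::set<toy_event>& in,
                               const std::function<bool(toy_event)>& pred) {
    std::set<toy_event> out;
    for (toy_event e : in) {
        if (pred(e)) out.insert(e);
    }
    return out;
}
```

A caller that needs the non-syscall events writes `toy_filter(all_toy_events(), [](toy_event e) { return !is_toy_syscall_event(e); })` inline, and the complementary set only requires dropping the negation.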

Contributor Author

Incorporated :)

incertum and others added 3 commits February 27, 2023 15:34
Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>
Do not add additional sinsp APIs and instead use one liner filters

Co-authored-by: Jason Dellaluce <jasondellaluce@gmail.com>
Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>
event_set_to_names -> adjust option to resolve to sc names,
but only for generic events to not duplicate sc_set_to_names

Co-authored-by: Jason Dellaluce <jasondellaluce@gmail.com>
Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>
jasondellaluce
jasondellaluce previously approved these changes Feb 28, 2023
Contributor

@jasondellaluce jasondellaluce left a comment

/approve

Comment on lines +254 to +258
auto all_non_generic_sc_event_set = libsinsp::events::all_event_set().filter([&](ppm_event_code e) { return libsinsp::events::is_syscall_event(e); })\
.diff(libsinsp::events::set<ppm_event_code>{PPME_GENERIC_E, PPME_GENERIC_X});
auto tmp_event_set = all_non_generic_sc_event_set.intersect(libsinsp::events::names_to_event_set(syscalls));
auto tmp_sc_set = libsinsp::events::event_set_to_sc_set(tmp_event_set);
return ppm_sc_set.merge(tmp_sc_set);
Contributor

I'm ok with this because I see the reason behind it, but I would prefer fixing the tables (e.g. the umount corner case) and iterate back here in the future to make this function simpler and less defensive. Just leaving a TODO for us.

Contributor Author

@incertum incertum Feb 28, 2023

Hi @jasondellaluce thank you for your additional input.

The primary reasoning behind these changes, which follow the principle of right-sized incremental progress, is as follows:

  • We get a complete and technically correct solution without an additional major refactor. A major refactor always tends to carry unknowns.
  • We now have time to reflect on what is best for the long-term project roadmap, as per our project's governance, without blocking crucial testing of the new Falco feature update: improve support and tests for live-capture event selection falco#2432. While a new table may look like the better way to cover these corner cases right now, we could later discover why it wasn't done that way in the first place. These two tables have been around almost since the beginning of the project, so I would vote for gathering more context first. In summary, allowing more time to think seems a fair choice if we want to make the right decisions.

See also #889 (comment)

Member

not sure this is what we want, I need to double-check, my brain with all this PPM stuff is like -> 🤯

Contributor Author

Falco rule calls "accept" -> should apply filter to both accept, accept4
Falco rule calls "eventfd" -> should apply filter to both eventfd, eventfd2

This is the behavior this change ensures without requiring a table refactor.

Member

ok, I got the point thank you for that!

@poiana
Contributor

poiana commented Feb 28, 2023

LGTM label has been added.

Git tree hash: b0e01948c793d748111f9e92291397bb015aafbd

Member

@Andreagit97 Andreagit97 left a comment

I need to double-check the names to ppm_sc conversion

userspace/libsinsp/events/sinsp_events.cpp (resolved)
userspace/libsinsp/events/sinsp_events.h (outdated, resolved)
* @param ppm_sc_set set of `ppm_sc` from which you want to obtain information
* @return set of events associated with the provided `ppm_sc` set.
/*!
\brief When you want to retrieve the events associated with a particular `ppm_event` you have to
Member

Suggested change
\brief When you want to retrieve the events associated with a particular `ppm_event` you have to
\brief When you want to retrieve all `sc` associated with a particular `ppm_event` you have to

userspace/libsinsp/events/sinsp_events.h (outdated, resolved)
userspace/libsinsp/events/sinsp_events.h (resolved)
incertum and others added 2 commits March 2, 2023 17:41
Co-authored-by: Andrea Terzolo <andrea.terzolo@polito.it>
Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>
Co-authored-by: Jason Dellaluce <jasondellaluce@gmail.com>
Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>
After a fresh look optimize event_set_to_names given we
sequentially adjusted approach in one PR.

Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>
Member

@Andreagit97 Andreagit97 left a comment

/approve

@poiana
Contributor

poiana commented Mar 6, 2023

LGTM label has been added.

Git tree hash: 98ecc5da7b140455f78e3818ec854e09dff0e9a6

Contributor

@jasondellaluce jasondellaluce left a comment

/approve

@poiana poiana merged commit c2e2276 into falcosecurity:master Mar 6, 2023
@poiana
Contributor

poiana commented Mar 6, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Andreagit97, incertum, jasondellaluce

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [Andreagit97,incertum,jasondellaluce]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@LucaGuerra
Contributor

/milestone 0.11.0

@poiana poiana added this to the 0.11.0 milestone May 16, 2023
@incertum incertum deleted the fixes-ppm-sc-api3 branch December 8, 2023 20:44
5 participants