[PROPOSAL] Formalize container engine testing framework similar to the new kernel version testing framework #1298

incertum · 2023-08-15T21:27:01Z

Motivation

The implementation of the formal kernel version testing framework (https://github.com/falcosecurity/libs/blob/master/proposals/20230530-driver-kernel-testing-framework.md) has had a highly positive impact on the overall progress and stability of The Falco Project.

I am proposing a similar testing framework for container engines, with a specific focus on maintaining the expected functionality of each engine for a particular container runtime.

This testing would be crucial not only for catching regressions but also for demonstrating how reliable the container engine actually is, since nobody expects it to be flawless. At the same time, we need to understand the scenarios and conditions in which we may fail to retrieve container information. That understanding will help establish a form of Service Level Objective (SLO) for adopters. For instance, in edge-case race conditions we might provide weaker guarantees, whereas a container that runs for 30 days without its information ever becoming available points to an opportunity to make the engine more robust. Since perfection is unattainable, a data-driven approach will help set escalation thresholds for reported container engine issues.

Feature

Set up a testbed to evaluate the following:

Test accurate and reliable container information enrichment for two scenarios: (1) the container was already running before the agent launched, and (2) the container launches after the agent has started.
The above should include verifying the accuracy of each supported field, similar to the existing test/drivers unit tests.
Assess each officially supported container engine, prioritizing certain container runtimes as P1 (e.g., containerd, cri-o, docker) and labeling the others "best effort".
Perform semi-realistic tests on a Kubernetes server featuring multiple pods. These tests aim to observe continuous enrichment of container information over an extended period (e.g., several hours), covering stable pods as well as pods that are repeatedly created and terminated. Apply upper limits as per https://kubernetes.io/docs/setup/best-practices/cluster-large/. However, reaching 110 pods per node with multiple containers within each pod is unlikely. A more realistic expectation would be a maximum of around 100-150 containers per node.

Note: Parallel testing may be applicable to certain runtimes, while for others, individual assessments are required.
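
To make the testbed dimensions above more concrete, here is a minimal Python sketch of how the engine/scenario matrix and the per-field checks could be modeled. It is an illustration only, not an existing harness in falcosecurity/libs: the names EngineCase, build_matrix, check_fields, and enrichment_rate are hypothetical, the "example-other-engine" entry is a placeholder for any best-effort runtime, and the field names in the usage example are just sample keys.

```python
from dataclasses import dataclass
from itertools import product

# P1 engines are the ones named in the proposal; the last entry is a
# hypothetical placeholder for any runtime covered on a best-effort basis.
ENGINES = {
    "containerd": "P1",
    "cri-o": "P1",
    "docker": "P1",
    "example-other-engine": "best-effort",
}

# The two scenarios from the proposal (identifiers are illustrative).
SCENARIOS = ("container_before_agent", "container_after_agent")


@dataclass(frozen=True)
class EngineCase:
    """One cell of the test matrix: an engine exercised in one scenario."""
    engine: str
    priority: str
    scenario: str


def build_matrix(engines=ENGINES, scenarios=SCENARIOS):
    """Return every engine/scenario combination as an EngineCase."""
    return [
        EngineCase(engine, priority, scenario)
        for (engine, priority), scenario in product(engines.items(), scenarios)
    ]


def check_fields(expected, observed):
    """Compare expected container fields against what the agent reported.

    Returns {field: (expected, observed)} for every field that is missing,
    empty, or different; an empty dict means the container was fully enriched.
    """
    failures = {}
    for field, want in expected.items():
        got = observed.get(field)
        if got in (None, "") or got != want:
            failures[field] = (want, got)
    return failures


def enrichment_rate(per_container_failures):
    """Fraction of containers whose check_fields() result was empty."""
    if not per_container_failures:
        return 0.0
    ok = sum(1 for failures in per_container_failures if not failures)
    return ok / len(per_container_failures)


if __name__ == "__main__":
    for case in build_matrix():
        print(case)
    expected = {"container.id": "abc123", "container.name": "nginx"}
    observed = {"container.id": "abc123", "container.name": ""}
    print(check_fields(expected, observed))
```

A rate computed this way over a multi-hour run, split by engine and scenario, is one possible input for the SLO and escalation thresholds discussed in the motivation; how the observed values are collected from the agent is left open here.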

CC @falcosecurity/core-maintainers

incertum · 2023-12-01T22:44:32Z

@jasondellaluce, @Andreagit97, and others: it may be time for better container engine testing. We keep breaking it; see the latest oversight (that was on me) in #1535.

leogr · 2023-12-05T17:03:26Z

Hey @incertum

This is really interesting. Have we already collected a list of regressions we have encountered? 🤔 It would be useful to understand which aspects to focus on more.

poiana · 2024-03-04T21:49:31Z

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

incertum · 2024-03-05T00:09:47Z

/remove-lifecycle stale

Some new e2e test efforts are a WIP, @therealbobo.

poiana · 2024-06-03T03:53:24Z

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

poiana · 2024-07-03T03:54:19Z

Stale issues rot after 30d of inactivity.

Mark the issue as fresh with /remove-lifecycle rotten.

Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle rotten

incertum added the kind/feature New feature or request label Aug 15, 2023

incertum mentioned this issue Aug 15, 2023

[TRACKING] Re-audit container engines for empty container info values (Initial focus on CRI for Kubernetes) falcosecurity/falco#2708

Open

Andreagit97 added this to the TBD milestone Sep 4, 2023

incertum mentioned this issue Dec 4, 2023

new(libsinsp/test): Start dedicated container engine unit testsuite w/ mock CRI API response #1544

Merged

incertum mentioned this issue Dec 16, 2023

Container Engine Refactor (CRI) #1589

Open

incertum mentioned this issue Jan 26, 2024

[TRACKING] End-to-End Test Contribution Proposal #1650

Open

5 tasks

poiana added the lifecycle/stale label Mar 4, 2024

poiana removed the lifecycle/stale label Mar 5, 2024

poiana added the lifecycle/stale label Jun 3, 2024

poiana added lifecycle/rotten and removed lifecycle/stale labels Jul 3, 2024
