What is your proposal:
Koordinator should support Linux Core Scheduling for containers, so that containers belonging to different users can be isolated at the SMT level.
Why is this needed:
In security-sensitive scenarios, Core Scheduling helps mitigate side-channel attacks (SCAs) at the L1 or L2 cache level, while applications of the same user can still share SMT siblings.
Is there a suggested solution, if so, please add it:
- Phase 1: Provide the fundamental Linux Core Scheduling for the containers.
- 1.1 API: Define the pod-level API for core scheduling. PR: apis: add core sched apis #1720
- 1.2 Koordlet: Support core scheduling cookie management for containers. PR: koordlet: support core sched cookie management #1722
- 1.2.1 The current Anolis kernel version (5.10.136-16.1) has a compatibility problem between Group Identity and Core Scheduling that can cause a kernel hard lockup when both features are enabled. The koordlet should mitigate this kernel bug with a userspace conflict check: it verifies that Group Identity is disabled before creating a new Core Sched cookie. PR: koordlet: fix core sched conflicts with GI and revise API #1829
- 1.2.2 Koordlet: Split the core sched Prometheus metrics into a new metrics registry. Done by PR: metrics: seperater metrics as internal and external for slo-controller and koordlet #1807
- 1.3 Koord-Manager: Support mutating the core scheduling group ID. PR: colocationprofile: support mutating pod labels and annotations with mapping #1781
- Phase 2: Improve the performance of the Core Scheduling feature.
- 2.1 Koordlet: Export more metrics to verify that the core sched operations are working correctly.
- 2.2 Koordlet: Once the Anolis kernel fixes the compatibility problem and provides a more stable interface, remove the workaround that applies while the node's CPU QoS policy is migrating from Group Identity to Core Scheduling (no longer traverse the cgroup trees; add container-level reconcile for GI).
- 2.3 Koord-scheduler: Improve the scheduling score according to the density of the cookies.
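
To make items 1.2 and 1.2.1 more concrete, here is a minimal Go sketch of group-based cookie management with the Group Identity conflict check. It is not the koordlet implementation. The `kernelOps` interface, `cookieManager`, and `fakeKernel` are hypothetical names introduced for illustration. A real implementation would presumably back `kernelOps` with `prctl(PR_SCHED_CORE, ...)` and the Group Identity cgroup interface. The sketch also assumes the check runs on every assignment, which is stricter than "before creating a new cookie".

```go
package main

import (
	"errors"
	"sync"
)

// kernelOps is a hypothetical abstraction of the node-level operations needed
// by this sketch. It is not a koordlet or kernel API.
type kernelOps interface {
	// GroupIdentityEnabled reports whether Group Identity is still active for pid.
	GroupIdentityEnabled(pid int) (bool, error)
	// CreateCookie assigns a fresh core sched cookie to pid and returns it.
	CreateCookie(pid int) (uint64, error)
	// ShareCookie copies the cookie held by fromPID to toPID.
	ShareCookie(fromPID, toPID int) error
}

var errGroupIdentityEnabled = errors.New("group identity is enabled; refusing core sched cookie operation")

// groupEntry records the cookie of a group and one PID known to hold it.
type groupEntry struct {
	cookie uint64
	owner  int
}

// cookieManager keeps one cookie per core sched group ID.
type cookieManager struct {
	mu     sync.Mutex
	kernel kernelOps
	groups map[string]groupEntry
}

func newCookieManager(k kernelOps) *cookieManager {
	return &cookieManager{kernel: k, groups: map[string]groupEntry{}}
}

// Assign places pid into groupID. The first PID of a group gets a new cookie;
// later PIDs share it. The call fails while Group Identity is enabled for pid
// (assumption: checked conservatively on every assignment).
func (m *cookieManager) Assign(groupID string, pid int) (uint64, error) {
	m.mu.Lock()
	defer m.mu.Unlock()

	enabled, err := m.kernel.GroupIdentityEnabled(pid)
	if err != nil {
		return 0, err
	}
	if enabled {
		return 0, errGroupIdentityEnabled
	}
	if e, ok := m.groups[groupID]; ok {
		if err := m.kernel.ShareCookie(e.owner, pid); err != nil {
			return 0, err
		}
		return e.cookie, nil
	}
	cookie, err := m.kernel.CreateCookie(pid)
	if err != nil {
		return 0, err
	}
	m.groups[groupID] = groupEntry{cookie: cookie, owner: pid}
	return cookie, nil
}

// fakeKernel is an in-memory stand-in so the sketch can be exercised on a
// machine without core scheduling support.
type fakeKernel struct {
	giEnabled map[int]bool   // PIDs that still have Group Identity enabled
	cookies   map[int]uint64 // PID -> cookie
	next      uint64
}

func (f *fakeKernel) GroupIdentityEnabled(pid int) (bool, error) {
	return f.giEnabled[pid], nil
}

func (f *fakeKernel) CreateCookie(pid int) (uint64, error) {
	f.next++
	f.cookies[pid] = f.next
	return f.next, nil
}

func (f *fakeKernel) ShareCookie(fromPID, toPID int) error {
	c, ok := f.cookies[fromPID]
	if !ok {
		return errors.New("source PID has no cookie")
	}
	f.cookies[toPID] = c
	return nil
}
```

With the fake backend, two PIDs assigned to the same group ID end up with the same cookie, PIDs in different groups get different cookies, and a PID that still has Group Identity enabled is rejected with `errGroupIdentityEnabled`.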
The work of phase 1 is planned to be released in v1.5.0.
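
Item 1.3 relies on mapping existing pod labels to the core scheduling group ID. The Go sketch below shows one way such a mapping could behave. It is not the ClusterColocationProfile implementation. The label keys, the function name `mutateLabels`, and the rule that an already-set target label is left untouched are all assumptions made for illustration.

```go
package main

// Hypothetical label keys used only in this sketch; the real keys are defined
// by the API from item 1.1.
const (
	exampleTenantLabel  = "example.com/tenant"
	exampleGroupIDLabel = "example.com/core-sched-group-id"
)

// mutateLabels returns a copy of podLabels in which, for every
// source -> target pair in mapping, the value of the source key is copied to
// the target key. A target key that is already set is kept as-is
// (assumption: explicit user settings take precedence).
func mutateLabels(podLabels, mapping map[string]string) map[string]string {
	out := make(map[string]string, len(podLabels)+len(mapping))
	for k, v := range podLabels {
		out[k] = v
	}
	for src, dst := range mapping {
		v, ok := podLabels[src]
		if !ok {
			continue // source label absent: nothing to map
		}
		if _, exists := out[dst]; exists {
			continue // do not overwrite an existing target value
		}
		out[dst] = v
	}
	return out
}
```

For example, with the mapping `exampleTenantLabel -> exampleGroupIDLabel`, a pod labeled with tenant `team-a` would receive `team-a` as its group ID, so all pods of that tenant could share one cookie on a node.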