UPSTREAM: <carry>: disable load balancing on created cgroups when managed is enabled

Previously, CRI-O disabled cpu load balancing by manually changing the sched_domain of cpus in sysfs.
However, RHEL 9 dropped support for this knob, instead requiring that it be changed in cgroups directly.

To disable cpu load balancing on cgroupv1, the specified cgroup must have cpuset.sched_load_balance set to 0, as must
all of that cgroup's parents and all of the cgroups that contain a subset of the cpus that load balancing is disabled for.

By default, every cpuset inherits its cpus from its parent and has sched_load_balance set to 1. Since we need to keep the cpus that need
load balancing disabled in the root cgroup, all slices will inherit the full cpuset.

Rather than rebalancing every cgroup whenever a new guaranteed cpuset cgroup is created, the approach this PR takes is to
disable load balancing for all slices. Since slices by definition don't have any processes in them, disabling load balancing there won't
affect the actual scheduling decisions of the kernel. All it does is give CRI-O the opportunity to actually disable load balancing
for containers that request it.

Signed-off-by: Peter Hunt <pehunt@redhat.com>
haircommander committed Apr 4, 2023
1 parent ff8ecbc commit fa81b46
Showing 1 changed file with 15 additions and 0 deletions.
15 changes: 15 additions & 0 deletions pkg/kubelet/cm/cgroup_manager_linux.go
@@ -26,6 +26,7 @@ import (
 	"sync"
 	"time"

+	"github.com/opencontainers/runc/libcontainer/cgroups"
 	libcontainercgroups "github.com/opencontainers/runc/libcontainer/cgroups"
 	"github.com/opencontainers/runc/libcontainer/cgroups/fscommon"
 	"github.com/opencontainers/runc/libcontainer/cgroups/manager"
@@ -37,6 +38,7 @@ import (
 	utilruntime "k8s.io/apimachinery/pkg/util/runtime"
 	"k8s.io/apimachinery/pkg/util/sets"
 	cmutil "k8s.io/kubernetes/pkg/kubelet/cm/util"
+	"k8s.io/kubernetes/pkg/kubelet/managed"
 	"k8s.io/kubernetes/pkg/kubelet/metrics"
 )

@@ -472,6 +474,19 @@ func (m *cgroupManagerImpl) Create(cgroupConfig *CgroupConfig) error {
 		utilruntime.HandleError(fmt.Errorf("cgroup manager.Set failed: %w", err))
 	}

+	// Disable cpuset.sched_load_balance for all cgroups Kubelet creates when kubelet has workloads to manage.
+	// This way, CRI can disable sched_load_balance for pods that must have load balance
+	// disabled, but the slices can contain all cpus (as the guaranteed cpus are known dynamically).
+	if managed.IsEnabled() && !libcontainercgroups.IsCgroup2UnifiedMode() {
+		path := manager.Path("cpuset")
+		if path == "" {
+			return fmt.Errorf("Failed to find cpuset for newly created cgroup")
+		}
+		if err := cgroups.WriteFile(path, "cpuset.sched_load_balance", "0"); err != nil {
+			return err
+		}
+	}
+
 	return nil
 }
