Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

don't set default cgroup option if reserved-cpus provided via kubelet… #1405

Merged
merged 10 commits into from
Aug 30, 2023
15 changes: 13 additions & 2 deletions files/bootstrap.sh
Original file line number Diff line number Diff line change
Expand Up @@ -222,6 +222,13 @@ ENABLE_LOCAL_OUTPOST="${ENABLE_LOCAL_OUTPOST:-}"
CLUSTER_ID="${CLUSTER_ID:-}"
LOCAL_DISKS="${LOCAL_DISKS:-}"

##allow --reserved-cpus options via kubelet arg directly and not set default cgroup option
USE_RESERVED_CPUS=false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should invert the semantics of this variable, to disable the cgroup settings (e.g. USE_RESERVED_CGROUPS) instead of enabling reserved CPU's. Other than that, this PR LGTM.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense, this way if we have some other usecases which needs disabling of cgroup, then we can reuse the same variable. let me push the changes,

if [[ ${KUBELET_EXTRA_ARGS} == *'--reserved-cpus'* ]]; then
USE_RESERVED_CPUS=true
log "INFO: received reserved-cpus options via kubelet arg: ${KUBELET_EXTRA_ARGS} .cgroup settings will not be set"
raghs-aws marked this conversation as resolved.
Show resolved Hide resolved
fi

if [[ ! -z ${LOCAL_DISKS} ]]; then
setup-local-disks "${LOCAL_DISKS}"
fi
Expand Down Expand Up @@ -565,8 +572,12 @@ if [[ "$CONTAINER_RUNTIME" = "containerd" ]]; then
sudo sed -i s,SANDBOX_IMAGE,$PAUSE_CONTAINER,g /etc/eks/containerd/containerd-config.toml

echo "$(jq '.cgroupDriver="systemd"' "${KUBELET_CONFIG}")" > "${KUBELET_CONFIG}"
echo "$(jq '.systemReservedCgroup="/system"' "${KUBELET_CONFIG}")" > "${KUBELET_CONFIG}"
echo "$(jq '.kubeReservedCgroup="/runtime"' "${KUBELET_CONFIG}")" > "${KUBELET_CONFIG}"
##allow --reserved-cpus options via kubelet arg directly and not set default cgroup option
if [[ "${USE_RESERVED_CPUS}" = false ]]; then
log "INFO: skipping setting of .systemReservedCgroup and .kubeReservedCgroup!"
raghs-aws marked this conversation as resolved.
Show resolved Hide resolved
echo "$(jq '.systemReservedCgroup="/system"' "${KUBELET_CONFIG}")" > "${KUBELET_CONFIG}"
echo "$(jq '.kubeReservedCgroup="/runtime"' "${KUBELET_CONFIG}")" > "${KUBELET_CONFIG}"
fi

# Check if the containerd config file is the same as the one used in the image build.
# If different, then restart containerd w/ proper config
Expand Down
98 changes: 98 additions & 0 deletions test/cases/reserved-cpus-kubelet-arg.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,98 @@
#!/usr/bin/env bash
set -euo pipefail

echo "-> Should not set systemReservedCgroup and kubeReservedCgroup when --reserved-cpus is set with containerd"
exit_code=0
export KUBELET_VERSION=v1.24.15-eks-ba74326
runtime="containerd"
/etc/eks/bootstrap.sh \
--b64-cluster-ca dGVzdA== \
--apiserver-endpoint http://my-api-endpoint \
--kubelet-extra-args '--node-labels=cnf=cnf1 --reserved-cpus=0-3 --cpu-manager-policy=static' \
test || exit_code=$?

if [[ ${exit_code} -ne 0 ]]; then
echo "❌ Test Failed: expected a non-zero exit code but got '${exit_code}'"
exit 1
fi

retCode=$(
grep -q systemReservedCgroup /etc/kubernetes/kubelet/kubelet-config.json
echo $?
)
if [[ ${retCode} -eq 0 ]]; then
raghs-aws marked this conversation as resolved.
Show resolved Hide resolved
echo "❌ Test Failed: expected systemReservedCgroup to be absent from /etc/kubernetes/kubelet/kubelet-config.json. expected: 1 Received: ${retCode}"
exit 1
fi

retCode=$(
grep -q kubeReservedCgroup /etc/kubernetes/kubelet/kubelet-config.json
echo $?
)
if [[ ${retCode} -eq 0 ]]; then
echo "❌ Test Failed: expected kubeReservedCgroup to be absent from /etc/kubernetes/kubelet/kubelet-config.json. expected: 1 Received: ${retCode}"
exit 1
fi

echo "-> Should set systemReservedCgroup and kubeReservedCgroup when --reserved-cpus is not set with containerd"
exit_code=0
export KUBELET_VERSION=v1.24.15-eks-ba74326
runtime="containerd"
/etc/eks/bootstrap.sh \
--b64-cluster-ca dGVzdA== \
--apiserver-endpoint http://my-api-endpoint \
test || exit_code=$?

if [[ ${exit_code} -ne 0 ]]; then
echo "❌ Test Failed: expected a non-zero exit code but got '${exit_code}'"
exit 1
fi

retCode=$(
grep -q systemReservedCgroup /etc/kubernetes/kubelet/kubelet-config.json
echo $?
)
if [[ ${retCode} -ne 0 ]]; then
echo "❌ Test Failed: expected systemReservedCgroup to be retCode in /etc/kubernetes/kubelet/kubelet-config.json. expected: /system Received: $(grep -q systemReservedCgroup /etc/kubernetes/kubelet/kubelet-config.json) "
exit 1
fi

retCode=$(
grep -q kubeReservedCgroup /etc/kubernetes/kubelet/kubelet-config.json
echo $?
)
if [[ ${retCode} -ne 0 ]]; then
echo "❌ Test Failed: expected kubeReservedCgroup to be absent from /etc/kubernetes/kubelet/kubelet-config.json. expected: /runtime Received: $(grep -q kubeReservedCgroup /etc/kubernetes/kubelet/kubelet-config.json) "
exit 1
fi

echo "-> Should set systemReservedCgroup and kubeReservedCgroup when --reserved-cpus is set with dockerd"
exit_code=0
export KUBELET_VERSION=v1.23.15-eks-ba74326
/etc/eks/bootstrap.sh \
--b64-cluster-ca dGVzdA== \
--apiserver-endpoint http://my-api-endpoint \
test || exit_code=$?

if [[ ${exit_code} -ne 0 ]]; then
echo "❌ Test Failed: expected a non-zero exit code but got '${exit_code}'"
exit 1
fi

retCode=$(
grep -q systemReservedCgroup /etc/kubernetes/kubelet/kubelet-config.json
echo $?
)
if [[ ${retCode} -ne 0 ]]; then
echo "❌ Test Failed: expected systemReservedCgroup to be retCode in /etc/kubernetes/kubelet/kubelet-config.json. expected: /system Received: $(grep -q systemReservedCgroup /etc/kubernetes/kubelet/kubelet-config.json) "
exit 1
fi

retCode=$(
grep -q kubeReservedCgroup /etc/kubernetes/kubelet/kubelet-config.json
echo $?
)
if [[ ${retCode} -ne 0 ]]; then
echo "❌ Test Failed: expected kubeReservedCgroup to be absent from /etc/kubernetes/kubelet/kubelet-config.json. expected: /runtime Received: $(grep -q kubeReservedCgroup /etc/kubernetes/kubelet/kubelet-config.json) "
exit 1
fi