Re-enable stalld now that it is in CoreOS
Open up realtime CPU usage

Allow optionally disabling stalld via an environment variable

Update the bash script and configure the 4.10-to-4.11 upgrade to disable stalld

Fix a bash syntax error and add some output to the if statement

Switch to the 4.9-to-4.10 upgrade instead of 4.10-to-4.11
jeff-roche committed Nov 29, 2022
1 parent 9448a98 commit 93929f4
Showing 3 changed files with 18 additions and 1 deletion.
@@ -74,6 +74,7 @@ tests:
steps:
cluster_profile: gcp-openshift-gce-devel-ci-2
env:
STALLD_ENABLED: "false"
TEST_TYPE: upgrade-conformance
workflow: openshift-upgrade-gcp-ovn-rt
- as: e2e-azure-ovn-upgrade
@@ -7,6 +7,8 @@ set -o pipefail
node_role=${APPLY_NODE_ROLE:=worker}
max_cpu=8
isolated_cpu=${COMPUTE_NODE_ISOLATED_CPU:-4}
sched_rt_runtime_us=-1
stalld_service="service.stalld=start,enable"
gcp_pattern="[n|c|m|a]{1}[1-9]{1}d?-(standard|highcpu|highmem|highgpu){1}-([0-9]+)"

# Currently RT is only supported on GCP
@@ -24,6 +26,13 @@ if [[ "$isolated_cpu" == "$max_cpu" ]]; then
echo "max and isolated cpu are equal, setting isolated CPU to $isolated_cpu"
fi

if [ ${STALLD_ENABLED:="true"} != "true" ]
then
echo "disabling stalld and setting default realtime timeout"
sched_rt_runtime_us=950000
stalld_service=""
fi

echo "Creating new realtime tuned profile on cluster"
oc create -f - <<EOF
apiVersion: tuned.openshift.io/v1
@@ -48,6 +57,9 @@ spec:
energy_perf_bias=performance
min_perf_pct=100
[service]
$stalld_service
[vm]
transparent_hugepages=never
@@ -63,7 +75,7 @@ spec:
[sysctl]
kernel.hung_task_timeout_secs = 600
kernel.nmi_watchdog = 0
-kernel.sched_rt_runtime_us = 950000
+kernel.sched_rt_runtime_us = $sched_rt_runtime_us
kernel.timer_migration = 0
kernel.numa_balancing=0
net.core.busy_read=50
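
The script changes above boil down to one toggle: STALLD_ENABLED selects both the stalld service line and the value of kernel.sched_rt_runtime_us that are substituted into the tuned profile heredoc. The sketch below isolates that logic so it can be run without a cluster. The function names `resolve_stalld_settings` and `render_profile_fragment` are hypothetical and not part of the commit, and the rendered fragment contains only the two affected sections, not the full profile.

```bash
# Hypothetical helper (not in the commit): sets the two values the profile
# template consumes, based on STALLD_ENABLED.
resolve_stalld_settings() {
  # Defaults as in the commit: -1 removes the RT runtime limit and stalld is started.
  sched_rt_runtime_us=-1
  stalld_service="service.stalld=start,enable"

  # Quoting the expansion keeps the test well-formed for any value.
  if [[ "${STALLD_ENABLED:-true}" != "true" ]]; then
    echo "disabling stalld and setting default realtime timeout" >&2
    sched_rt_runtime_us=950000
    stalld_service=""
  fi
}

# Hypothetical helper: renders only the [service] and [sysctl] pieces that
# the commit parameterizes. The unquoted EOF delimiter lets the shell
# expand the variables inside the heredoc.
render_profile_fragment() {
  resolve_stalld_settings
  cat <<EOF
[service]
$stalld_service
[sysctl]
kernel.sched_rt_runtime_us = $sched_rt_runtime_us
EOF
}
```

Calling `STALLD_ENABLED=false render_profile_fragment` leaves an empty line under [service] and restores the 950000 limit, which is the state the upgrade job in the first hunk asks for.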
@@ -21,5 +21,9 @@ ref:
Isolated cores to use for RT configuration
4 vCPU is the current default for workload isolation, this is mirrored in our configs for RT
https://github.com/openshift/enhancements/blob/master/enhancements/workload-partitioning/management-workload-partitioning.md#goals
- name: STALLD_ENABLED
default: "true"
documentation: |-
To use stalld to prevent thread starvation
documentation: |-
The configure-realtime-tuned-profile step applies realtime tuned profile to cluster workers.
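
The step declares STALLD_ENABLED with a default of "true", and the script also falls back to "true" through a `:=` expansion, while other variables in the same script use `:-`. As a small illustration of how those two bash fallback forms behave, and of how an explicit job-level value takes precedence, the following can be run in any bash shell. The variable names other than STALLD_ENABLED are made up for the example.

```bash
# Both forms fall back to "true" when the variable is unset or empty.
unset STALLD_ENABLED
via_dash="${STALLD_ENABLED:-true}"          # substitutes the default only
state_after_dash="${STALLD_ENABLED+set}"    # empty: the variable is still unset

via_assign="${STALLD_ENABLED:=true}"        # substitutes the default and assigns it
state_after_assign="${STALLD_ENABLED+set}"  # "set": the variable now exists

# An explicit value, such as the "false" set on the upgrade job, wins over the fallback.
STALLD_ENABLED="false"
explicit="${STALLD_ENABLED:=true}"

printf '%s %s %s %s %s\n' \
  "$via_dash" "${state_after_dash:-unset}" \
  "$via_assign" "$state_after_assign" "$explicit"
# prints: true unset true set false
```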
