chart: Update settings based on the instance managers consolidation
- Add the setting added in longhorn/longhorn-manager#1731 in the helm chart
- Related to longhorn#5208

Signed-off-by: Yarden Shoham <git@yardenshoham.com>
yardenshoham committed Aug 6, 2023
1 parent 1e8bd45 commit 81e509b
Showing 3 changed files with 11 additions and 31 deletions.
36 changes: 9 additions & 27 deletions chart/questions.yaml
@@ -541,37 +541,19 @@ Set the value to **0** to disable backup restore."
  type: int
  min: 0
  default: 300
- - variable: defaultSettings.guaranteedEngineManagerCPU
- label: Guaranteed Engine Manager CPU
- description: "This integer value indicates how many percentage of the total allocatable CPU on each node will be reserved for each engine manager Pod. For example, 10 means 10% of the total CPU on a node will be allocated to each engine manager pod on this node. This will help maintain engine stability during high node workload.
- In order to prevent unexpected volume engine crash as well as guarantee a relative acceptable IO performance, you can use the following formula to calculate a value for this setting:
- Guaranteed Engine Manager CPU = The estimated max Longhorn volume engine count on a node * 0.1 / The total allocatable CPUs on the node * 100.
+ - variable: defaultSettings.guaranteedInstanceManagerCPU
+ label: Guaranteed Instance Manager CPU
+ description: "This integer value indicates how many percentage of the total allocatable CPU on each node will be reserved for each instance manager Pod. For example, 10 means 10% of the total CPU on a node will be allocated to each instance manager pod on this node. This will help maintain engine and replica stability during high node workload.
+ In order to prevent unexpected volume instance (engine/replica) crash as well as guarantee a relative acceptable IO performance, you can use the following formula to calculate a value for this setting:
+ `Guaranteed Instance Manager CPU = The estimated max Longhorn volume engine and replica count on a node * 0.1 / The total allocatable CPUs on the node * 100`
  The result of above calculation doesn't mean that's the maximum CPU resources the Longhorn workloads require. To fully exploit the Longhorn volume I/O performance, you can allocate/guarantee more CPU resources via this setting.
  If it's hard to estimate the usage now, you can leave it with the default value, which is 12%. Then you can tune it when there is no running workload using Longhorn volumes.
  WARNING:
- - Value 0 means unsetting CPU requests for engine manager pods.
- - Considering the possible new instance manager pods in the further system upgrade, this integer value is range from 0 to 40. And the sum with setting 'Guaranteed Engine Manager CPU' should not be greater than 40.
+ - Value 0 means unsetting CPU requests for instance manager pods.
+ - Considering the possible new instance manager pods in the further system upgrade, this integer value is range from 0 to 40.
  - One more set of instance manager pods may need to be deployed when the Longhorn system is upgraded. If current available CPUs of the nodes are not enough for the new instance manager pods, you need to detach the volumes using the oldest instance manager pods so that Longhorn can clean up the old pods automatically and release the CPU resources. And the new pods with the latest instance manager image will be launched then.
- - This global setting will be ignored for a node if the field \"EngineManagerCPURequest\" on the node is set.
- - After this setting is changed, all engine manager pods using this global setting on all the nodes will be automatically restarted. In other words, DO NOT CHANGE THIS SETTING WITH ATTACHED VOLUMES."
- group: "Longhorn Default Settings"
- type: int
- min: 0
- max: 40
- default: 12
- - variable: defaultSettings.guaranteedReplicaManagerCPU
- label: Guaranteed Replica Manager CPU
- description: "This integer value indicates how many percentage of the total allocatable CPU on each node will be reserved for each replica manager Pod. 10 means 10% of the total CPU on a node will be allocated to each replica manager pod on this node. This will help maintain replica stability during high node workload.
- In order to prevent unexpected volume replica crash as well as guarantee a relative acceptable IO performance, you can use the following formula to calculate a value for this setting:
- Guaranteed Replica Manager CPU = The estimated max Longhorn volume replica count on a node * 0.1 / The total allocatable CPUs on the node * 100.
- The result of above calculation doesn't mean that's the maximum CPU resources the Longhorn workloads require. To fully exploit the Longhorn volume I/O performance, you can allocate/guarantee more CPU resources via this setting.
- If it's hard to estimate the usage now, you can leave it with the default value, which is 12%. Then you can tune it when there is no running workload using Longhorn volumes.
- WARNING:
- - Value 0 means unsetting CPU requests for replica manager pods.
- - Considering the possible new instance manager pods in the further system upgrade, this integer value is range from 0 to 40. And the sum with setting 'Guaranteed Replica Manager CPU' should not be greater than 40.
- - One more set of instance manager pods may need to be deployed when the Longhorn system is upgraded. If current available CPUs of the nodes are not enough for the new instance manager pods, you need to detach the volumes using the oldest instance manager pods so that Longhorn can clean up the old pods automatically and release the CPU resources. And the new pods with the latest instance manager image will be launched then.
- - This global setting will be ignored for a node if the field \"ReplicaManagerCPURequest\" on the node is set.
- - After this setting is changed, all replica manager pods using this global setting on all the nodes will be automatically restarted. In other words, DO NOT CHANGE THIS SETTING WITH ATTACHED VOLUMES."
+ - This global setting will be ignored for a node if the field \"InstanceManagerCPURequest\" on the node is set.
+ - After this setting is changed, all instance manager pods using this global setting on all the nodes will be automatically restarted. In other words, DO NOT CHANGE THIS SETTING WITH ATTACHED VOLUMES."
  group: "Longhorn Default Settings"
  type: int
  min: 0
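As a quick sanity check of the formula quoted above: a hypothetical node with 16 allocatable CPUs that is expected to host at most 20 Longhorn engines and replicas works out to 20 * 0.1 / 16 * 100 = 12.5, i.e. close to the 12% default. Expressed as a chart values override, a minimal sketch (the node size, instance count, and chosen value are illustrative assumptions, not part of this commit):

    # Hypothetical sizing: ~20 engine/replica instances on a 16-CPU node
    # 20 * 0.1 / 16 * 100 = 12.5 -> reserve ~13% of node CPU per instance manager pod
    defaultSettings:
      guaranteedInstanceManagerCPU: 13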
3 changes: 1 addition & 2 deletions chart/templates/default-setting.yaml
@@ -63,8 +63,7 @@ data:
  {{ if not (kindIs "invalid" .Values.defaultSettings.concurrentAutomaticEngineUpgradePerNodeLimit) }}concurrent-automatic-engine-upgrade-per-node-limit: {{ .Values.defaultSettings.concurrentAutomaticEngineUpgradePerNodeLimit }}{{ end }}
  {{ if not (kindIs "invalid" .Values.defaultSettings.backingImageCleanupWaitInterval) }}backing-image-cleanup-wait-interval: {{ .Values.defaultSettings.backingImageCleanupWaitInterval }}{{ end }}
  {{ if not (kindIs "invalid" .Values.defaultSettings.backingImageRecoveryWaitInterval) }}backing-image-recovery-wait-interval: {{ .Values.defaultSettings.backingImageRecoveryWaitInterval }}{{ end }}
- {{ if not (kindIs "invalid" .Values.defaultSettings.guaranteedEngineManagerCPU) }}guaranteed-engine-manager-cpu: {{ .Values.defaultSettings.guaranteedEngineManagerCPU }}{{ end }}
- {{ if not (kindIs "invalid" .Values.defaultSettings.guaranteedReplicaManagerCPU) }}guaranteed-replica-manager-cpu: {{ .Values.defaultSettings.guaranteedReplicaManagerCPU }}{{ end }}
+ {{ if not (kindIs "invalid" .Values.defaultSettings.guaranteedInstanceManagerCPU) }}guaranteed-instance-manager-cpu: {{ .Values.defaultSettings.guaranteedInstanceManagerCPU }}{{ end }}
  {{ if not (kindIs "invalid" .Values.defaultSettings.kubernetesClusterAutoscalerEnabled) }}kubernetes-cluster-autoscaler-enabled: {{ .Values.defaultSettings.kubernetesClusterAutoscalerEnabled }}{{ end }}
  {{ if not (kindIs "invalid" .Values.defaultSettings.orphanAutoDeletion) }}orphan-auto-deletion: {{ .Values.defaultSettings.orphanAutoDeletion }}{{ end }}
  {{ if not (kindIs "invalid" .Values.defaultSettings.storageNetwork) }}storage-network: {{ .Values.defaultSettings.storageNetwork }}{{ end }}
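The kindIs "invalid" guard means a key is only rendered when its value is not left as ~ (nil) in values.yaml. A minimal sketch of the consolidated entry in the ConfigMap data rendered by this template, assuming a user sets guaranteedInstanceManagerCPU to 20 (an illustrative value):

    # Sketch of the rendered default-setting data entry, given
    # defaultSettings.guaranteedInstanceManagerCPU: 20 in values.yaml;
    # with the default ~ the line is omitted and Longhorn keeps its built-in default.
    guaranteed-instance-manager-cpu: 20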
3 changes: 1 addition & 2 deletions chart/values.yaml
@@ -149,8 +149,7 @@ defaultSettings:
  concurrentAutomaticEngineUpgradePerNodeLimit: ~
  backingImageCleanupWaitInterval: ~
  backingImageRecoveryWaitInterval: ~
- guaranteedEngineManagerCPU: ~
- guaranteedReplicaManagerCPU: ~
+ guaranteedInstanceManagerCPU: ~
  kubernetesClusterAutoscalerEnabled: ~
  orphanAutoDeletion: ~
  storageNetwork: ~
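For anyone carrying a custom values file, the corresponding migration is to drop the two removed keys and set only the consolidated one; a minimal sketch (the value 12 is illustrative):

    defaultSettings:
      # guaranteedEngineManagerCPU and guaranteedReplicaManagerCPU are gone; use the single consolidated key
      guaranteedInstanceManagerCPU: 12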
