# Checklist for a new module
Bundle is the Deckhouse delivery edition. Possible values:

- `Default` — includes the recommended set of modules required for proper cluster operation: monitoring, authorization control, networking, and others. The current list can be found here.
- `Minimal` — the minimum viable set of modules (only the `020-deckhouse` module is included).
- `Managed` — a set of modules adapted for managed solutions of cloud providers. A list of supported providers:
  - Google Kubernetes Engine (GKE)

To include your module in a specific bundle by default, add the following line to the appropriate `modules/values-${bundle}.yaml` file: `${module_name}Enabled: true`.

Read more about the algorithm for determining if the module should be enabled.
- `helm upgrade --install` is invoked if the `/modules/<module-name>/Chart.yaml` file is present.
- A separate helm release is created for each module. Tiller, which runs as a separate process in the Deckhouse Pod, is responsible for creating resources in the cluster. This command outputs the list of helm releases:

  ```shell
  kubectl -n d8-system exec deploy/deckhouse -- helm list
  ```
- When rolled out for the first time, the helm release deployment will fail if the resources described in the release already exist in the cluster. In this case, the release will end up in the FAILED state. This error will persist until the duplicate resources are deleted from the cluster.
The release checksum is the checksum of all the helm chart files and values that Deckhouse generates for the release.
Releases in helm do not get updated when the module is restarted if the following conditions are met:
- The status of the previous release is not FAILED (you can check it in the `helm list` output);
- The release checksum is the same;
- The checksum of all manifests in the release after the rendering stays the same.
Thus, restarting modules does not result in the accumulation of unneeded copies of the current helm release.
Values for a specific module are declared in the global key with the module name. Click here to read more about values for modules.
A special helper is implemented in `helm_lib` to facilitate setting the `priorityClassName` parameter. Note that you MUST use it in all controllers without exception.

An example:

```yaml
spec:
  {{- include "helm_lib_priority_class" (tuple . "cluster-critical") | nindent 2 }}
```
The helper gets the global context and the desired priorityClassName value as an input. If the `001-priority-class` module is enabled in Deckhouse, the rendered template will look like this:

```yaml
spec:
  priorityClassName: cluster-critical
```

Otherwise:

```yaml
spec:
```

For more information about what classes Deckhouse uses, see the description of the priority-class module.
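For intuition, such a helper can be sketched as follows. This is an illustrative re-implementation, not the actual `helm_lib` code; the helper name `example_priority_class` and the `enabledModules` check are assumptions:

```yaml
{{- /* Illustrative sketch: the real helm_lib implementation may differ. */ -}}
{{- define "example_priority_class" -}}
  {{- $context := index . 0 -}}
  {{- $priorityClassName := index . 1 -}}
  {{- /* Render the field only when the priority-class module is enabled. */ -}}
  {{- if has "priority-class" $context.Values.global.enabledModules -}}
priorityClassName: {{ $priorityClassName }}
  {{- end -}}
{{- end -}}
```

The point of the conditional is that the rendered manifest stays valid even when the priority-class module is disabled: the field is simply omitted.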
A special helper is also implemented in `helm_lib` to facilitate setting the `nodeSelector` option.

An example:

```yaml
{{- include "helm_lib_node_selector" (tuple . "monitoring") | nindent 6 }}
```

The helper gets the global context and the desired strategy as the input to set the nodeSelector parameter.
There are four strategies in total:

- `frontend`, `system` — these two use the following logic:
  - Use the value of the `nodeSelector` variable if it is present in module values. Otherwise:
  - If nodes with the `node-role.deckhouse.io/{{ .Chart.Name }}=""` label are found in the cluster, then this value is used as the nodeSelector. These nodes are considered dedicated to this chart's components. Otherwise:
  - If nodes with the `node-role.deckhouse.io/{{ strategy_name }}=""` label are found in the cluster, then this value is used as the nodeSelector. These nodes are considered dedicated to all components that use this deployment strategy.
- `monitoring` — uses the same logic as the `system` and `frontend` strategies but includes an extra step after all of the above:
  - If nodes with the `node-role.deckhouse.io/system=""` label are found in the cluster, then this value is used as the nodeSelector. It is assumed that if there are no dedicated monitoring nodes in the cluster, the components of monitoring-related modules run on the system nodes.
- `master` — this strategy uses the following logic:
  - If nodes with the `node-role.kubernetes.io/control-plane=""` label are found in the cluster, then this value is used as the nodeSelector. These nodes are considered dedicated to all components that use this deployment strategy.
  - If nodes with the `node-role.deckhouse.io/system=""` label are found in the cluster, then this value is used as the nodeSelector. It is assumed that if there are no master nodes (and no nodes labeled as masters) in the cluster, the components of such modules run on system nodes.

If none of the above conditions for the strategy is met, the nodeSelector will not be set.
The helper MUST be used for all Deckhouse components (wherever possible) except for DaemonSets that are deployed to all cluster nodes (node-exporter, csi-node, flannel, etc.).
A special helper is also implemented in `helm_lib` to facilitate setting the `tolerations` parameter.

An example:

```yaml
{{- include "helm_lib_tolerations" (tuple . "monitoring") | nindent 2 }}
```

The helper gets the global context and the desired strategy as the input to set the `tolerations` parameter.

- If the module has the `tolerations` variable in values, it will be used to set this parameter.
- The following rules will be added to the manifest:

  ```yaml
  tolerations:
  - key: dedicated.deckhouse.io
    operator: Equal
    value: {{ .Chart.Name }}
  - key: dedicated.deckhouse.io
    operator: Equal
    value: {{ strategy_name }}
  ```
- For the `monitoring` strategy, the rules will look as follows:

  ```yaml
  tolerations:
  - key: dedicated.deckhouse.io
    operator: Equal
    value: {{ .Chart.Name }}
  - key: dedicated.deckhouse.io
    operator: Equal
    value: {{ strategy_name }}
  - key: dedicated.deckhouse.io
    operator: Equal
    value: "system"
  ```

- The `any-node` strategy is used to tolerate any working node. For this strategy, the rules will look as follows:

  ```yaml
  tolerations:
  - key: node-role.kubernetes.io/master
  - key: node-role.kubernetes.io/control-plane
  - key: dedicated.deckhouse.io
    operator: "Exists"
  - key: dedicated
    operator: "Exists"
  - key: DeletionCandidateOfClusterAutoscaler
  - key: ToBeDeletedByClusterAutoscaler
  - key: drbd.linbit.com/lost-quorum
  - key: drbd.linbit.com/force-io-error
  - key: drbd.linbit.com/ignore-fail-over
  ```

- The `wildcard` strategy is used to tolerate everything. For this strategy, the rules will look as follows:

  ```yaml
  tolerations:
  - operator: Exists
  ```
There are also some additional strategies for use in combination with the main strategies:

- `cloud-provider-uninitialized` — adds a toleration for cluster bootstrap time:

  ```yaml
  - key: node.cloudprovider.kubernetes.io/uninitialized
    operator: Exists
  ```

- `no-csi` — tolerates nodes without CSI running:

  ```yaml
  - key: node.deckhouse.io/csi-not-bootstrapped
    operator: "Exists"
    effect: "NoSchedule"
  ```

- `storage-problems` — tolerates nodes with DRBD problems:

  ```yaml
  - key: drbd.linbit.com/lost-quorum
  - key: drbd.linbit.com/force-io-error
  - key: drbd.linbit.com/ignore-fail-over
  ```

  This additional strategy is applied by default with any base strategy except `wildcard`.

- `node-problems` — tolerates nodes with various problems (pressure):

  ```yaml
  - key: node.kubernetes.io/not-ready
  - key: node.kubernetes.io/out-of-disk
  - key: node.kubernetes.io/memory-pressure
  - key: node.kubernetes.io/disk-pressure
  - key: node.kubernetes.io/pid-pressure
  - key: node.kubernetes.io/unreachable
  - key: node.kubernetes.io/network-unavailable
  ```

- `uninitialized` — combines tolerations for nodes with various problems (`uninitialized`, `no-csi`, `node-problems`):

  ```yaml
  - key: node.deckhouse.io/uninitialized
    operator: "Exists"
    effect: "NoSchedule"
  - key: node.deckhouse.io/csi-not-bootstrapped
    operator: "Exists"
    effect: "NoSchedule"
  - key: node.kubernetes.io/not-ready
  - key: node.kubernetes.io/out-of-disk
  - key: node.kubernetes.io/memory-pressure
  - key: node.kubernetes.io/disk-pressure
  - key: node.kubernetes.io/pid-pressure
  - key: node.kubernetes.io/unreachable
  - key: node.kubernetes.io/network-unavailable
  ```
To use an additional strategy, simply add the strategy name with the `with-` prefix to the helm helper. Example:

```yaml
{{- include "helm_lib_tolerations" (tuple . "any-node" "with-uninitialized") | nindent 2 }}
```

To exclude an additional strategy, add the strategy name with the `without-` prefix. Currently, this only makes sense for the `storage-problems` strategy, since it is the only additional strategy applied to base strategies by default (all except `wildcard`). This is useful for StatefulSets that should not be scheduled on nodes with DRBD problems. Example:

```yaml
{{- include "helm_lib_tolerations" (tuple . "any-node" "without-storage-problems") | nindent 2 }}
```
The helper MUST be used for all Deckhouse components (wherever possible).
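Putting the helpers together, a typical controller Pod template combines the nodeSelector, tolerations, and priorityClassName helpers; a minimal sketch (the `system` strategy and `cluster-medium` priority class shown here are illustrative choices, not requirements):

```yaml
spec:
  template:
    spec:
      {{- include "helm_lib_node_selector" (tuple . "system") | nindent 6 }}
      {{- include "helm_lib_tolerations" (tuple . "system") | nindent 6 }}
      {{- include "helm_lib_priority_class" (tuple . "cluster-medium") | nindent 6 }}
      containers:
      - name: controller
        image: example
```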
The high availability (HA) mode protects crucial modules against possible downtime or failure. `helm_lib` provides auxiliary templates to facilitate using the HA mode:

- `helm_lib_ha_enabled` — returns a non-empty string if the HA mode is enabled for the cluster.

  ```yaml
  {{- if (include "helm_lib_ha_enabled" .) }}
  HA enabled in Kubernetes cluster!
  {{- end }}
  ```

- `helm_lib_is_ha_to_value` — is used as an `if else` expression. If the HA mode is enabled in the cluster, this template returns the first argument passed to it; if not, it returns the second one.

  ```yaml
  # There will be two replicas if the HA mode is enabled for the cluster and one if disabled.
  replicas: {{ include "helm_lib_is_ha_to_value" (list . 2 1) }}
  ```
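For intuition, `helm_lib_is_ha_to_value` behaves like a ternary operator over the HA flag. A rough sketch of how such a template could be defined (illustrative only, not the actual `helm_lib` implementation; the name `example_is_ha_to_value` is made up):

```yaml
{{- define "example_is_ha_to_value" -}}
  {{- $context := index . 0 -}}
  {{- /* Return argument 1 when HA is enabled, argument 2 otherwise. */ -}}
  {{- if (include "helm_lib_ha_enabled" $context) -}}
    {{- index . 1 -}}
  {{- else -}}
    {{- index . 2 -}}
  {{- end -}}
{{- end -}}
```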
The rules below ensure the correct operation and update of module components (Deployment or StatefulSet):

- Always set podAntiAffinity for a Deployment and StatefulSet to ensure that the Pods are not run on the same node. Below is an example for prometheus:

  ```yaml
  {{- include "helm_lib_pod_anti_affinity_for_ha" (list . (dict "app" "deployment-label")) | nindent 6 }}
  ```

- Set the correct `replicas` and `strategy` values for a Deployment:
  - If the Deployment is NOT running on master nodes:

    ```yaml
    {{- include "helm_lib_deployment_strategy_and_replicas_for_ha" . | nindent 2 }}
    ```

    This prevents blocking updates when the number of Deployment Pods equals the number of nodes and the nodeSelector and podAntiAffinity parameters are set.

  - If the Deployment is running on master nodes (on each master node!):

    ```yaml
    {{- include "helm_lib_deployment_on_master_strategy_and_replicas_for_ha" . | nindent 2 }}
    ```

    This prevents blocking the Deployment update even if one of the master nodes is unavailable (provided there are three or more master nodes).
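Taken together, an HA-aware Deployment template typically applies both helpers; a minimal sketch (the name and label values are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example
spec:
  {{- include "helm_lib_deployment_strategy_and_replicas_for_ha" . | nindent 2 }}
  template:
    spec:
      {{- include "helm_lib_pod_anti_affinity_for_ha" (list . (dict "app" "example")) | nindent 6 }}
      containers:
      - name: example
        image: example
```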
We recommend creating your own helper (if a similar one has not been implemented already) if there is a need to evaluate some complex condition repeatedly.

- If the result of evaluating the helper is `true`, it must return some non-empty string.
- If the result of evaluating the helper is `false`, it must return an empty string.

Here is an example of a helper implementation:

```yaml
{{- define "helm_lib_module_https_ingress_tls_enabled" -}}
{{- $context := . -}}
{{- $mode := include "helm_lib_module_https_mode" $context -}}
{{- if or (eq "CertManager" $mode) (eq "CustomCertificate" $mode) -}}
not empty string
{{- end -}}
{{- end -}}
```

Usage:

```yaml
{{- if (include "helm_lib_module_https_ingress_tls_enabled" .) }}
- name: ca-certificates
  mountPath: "/usr/local/share/ca-certificates/"
  readOnly: true
{{- end }}
```
In controller templates (DaemonSets, StatefulSets, Deployments), use the helm helper `helm_lib_module_image` for container images. For common images like `kube-rbac-proxy`, use `helm_lib_module_common_image`.

Usage:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: snapshot-controller
  namespace: d8-{{ .Chart.Name }}
spec:
  selector:
    matchLabels:
      app: snapshot-controller
  template:
    metadata:
      labels:
        app: snapshot-controller
    spec:
      containers:
      - name: snapshot-controller
        image: {{ include "helm_lib_module_image" (list . "snapshotController") }}
      - name: kube-rbac-proxy
        image: {{ include "helm_lib_module_common_image" (list . "kubeRbacProxy") }}
```
For more information about hooks, their structure, and binding to events, see the addon-operator documentation.
In Deckhouse, global hooks are stored in the `/global-hooks` directory; module hooks are placed in the module's `/modules/MODULE/hooks` directory.
You can pass information to the hook using environment variables with paths to files in /tmp. The hook's results are also returned via files. Click here to read more about using parameters in hooks.
Validating webhooks are similar to regular hooks in their interfaces and running mechanism. They use the same shell framework. For more information about validating webhooks, see the shell-operator documentation.

In Deckhouse, validating webhooks are located in the module's `/modules/MODULE/webhooks/validation/` directory.
Conversion webhooks are similar to regular hooks in their interfaces and running mechanism. They use the same shell framework. For more information about conversion webhooks, see the shell-operator documentation.
In Deckhouse, conversion webhooks are located in the module's /modules/MODULE/webhooks/conversion/
directory.
We do not recommend using kubectl in hooks. It leads to a loss of idempotency, since the hook then depends on the cluster state in addition to the input parameters (which creates difficulties during debugging and testing). Instead:

- Use the built-in shell-operator functionality (it is fully integrated into Deckhouse) to track objects;
- Use the shell_lib functionality (in particular, the `kubernetes::`-prefixed functions: `kubernetes::create_yaml`, `kubernetes::patch_jq`, `kubernetes::delete_if_exists`, etc.) to create, edit, and delete objects.
The "enabled" webhooks are located in the root directory of the module. You can use them to describe the conditions under which the module must be enabled/disabled.
An example:

```shell
#!/bin/bash

source /deckhouse/shell_lib.sh

function __main__() {
  if values::has global.modules.publicDomainTemplate ; then
    echo "true" > $MODULE_ENABLED_RESULT
  else
    echo "false" > $MODULE_ENABLED_RESULT
  fi
}

enabled::run $@
```
This webhook disables the module in all clusters in which the `global.modules.publicDomainTemplate` option is not set.
Regular checks are implemented in `shell_lib` functions with the `enabled::` prefix. For example, the hook below disables the module in all clusters with a Kubernetes version below 1.21.0:

```shell
function __main__() {
  enabled::disable_module_in_kubernetes_versions_less_than 1.21.0
  echo "true" > $MODULE_ENABLED_RESULT
}
```
See the documentation for more info.
Deckhouse supports validation for values passed via the Deckhouse ConfigMap and for Deckhouse-generated values.

The OpenAPI value validation schemas are needed:
- To make sure that the user has entered valid values into the Deckhouse ConfigMap, and to let the user know if the values entered are invalid.
- To ensure that all the necessary parameters (in the correct format) are passed for rendering the module's helm templates. It ensures the expected behavior within the cluster and that only the planned objects will end up in the cluster.
- To generate the documentation for the module parameters on the site.
The OpenAPI validation schemas are stored in the `$GLOBAL_HOOKS_DIR/openapi` directory for global values, and in the `$MODULES_DIR/<module-name>/openapi` directory for modules.
Refer to the addon-operator documentation for more information about schema validation.
The validation schemas have the OpenAPI Schema Object format. The detailed description of the format is available in the documentation.
Note that `addon-operator` extends the schema format with additional properties. The additional information is available in the documentation.

Caution! If the `additionalProperties` property is not defined, it will be set to `false` at all schema levels.
- The `openapi/config-values.yaml` schema validates values passed by the user via a ConfigMap.

  An example:

  ```yaml
  type: object
  properties:
    podNetworkMode:
      type: string
      enum: ["HostGW", "VXLAN"]
      default: "HostGW"
      description: |
        Work mode.
  ```
- The `openapi/values.yaml` schema validates the combined values, consisting of values from the ConfigMap and values generated by hooks (learn more here).

  Caution! The `openapi/values.yaml` schema validates values generated by hooks. Thus, it will fire up an error when validating the combined values, since it does not have a description of the ConfigMap-derived values. The `x-extend` parameter extends the `openapi/values.yaml` schema with the parameters of the `openapi/config-values.yaml` schema (as in the example below), thus avoiding duplicating them. The `x-extend` parameter must be used in all cases. Learn more here.

  An example:

  ```yaml
  x-extend:
    schema: config-values.yaml
  type: object
  properties:
    internal:
      type: object
      default: {}
      x-required-for-helm:
      - podNetworkMode
      properties:
        podNetworkMode:
          type: string
          enum: ["HostGW", "VXLAN"]
  ```
How to create a validation schema for a module:

- `openapi/config-values.yaml`:
  - The schema is based on the module documentation.
  - Set the default values for the fields. The default values can be specified in:
    - the documentation;
    - the `$MODULES_DIR/<module-name>/values.yaml` file;
    - the hook's code (hardcoded).
  - The `required` property must be set for all required fields.
- `openapi/values.yaml`:
  - The schema is created for the values set by hooks (usually, the `internal` group's variables).
  - Set the `x-extend` property to load the `config-values.yaml` schema.
  - Set the default values for the fields. The default values can be specified in:
    - the `$MODULES_DIR/<module-name>/values.yaml` file;
    - the hook's code (hardcoded).
  - The `x-required-for-helm` property must be set for all mandatory fields.

After creating the schemas for the module, delete the `$MODULES_DIR/<module-name>/values.yaml` file.
If the module uses Persistent Storage, the effective storage class (EFC) should be determined as follows:
- If the EFC is defined in the module's config (values) – use the one explicitly specified in the module.
- If the PV exists – use the storage class of the existing PV.
- Otherwise, use either globally defined or default (determined automatically) EFC.
- If none of the above are suitable, use an emptyDir volume.
Such an approach allows you to avoid the re-provisioning of PVs (and data loss) that the global or default storage class change can trigger. To re-provision a PV, you must explicitly specify a different storage class right in the module configuration.
Note! You cannot mutate the `volumeClaimTemplate`. Thus, you must delete the StatefulSet (e.g., using a hook) when changing the storageClass.
You can find a relevant example in the prometheus and openvpn modules' hooks.
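The selection order above can be sketched as a pure shell function (illustrative only; the real prometheus and openvpn hooks query the cluster and module values, and differ in details):

```shell
#!/bin/bash

# Pick the effective storage class according to the priority order above.
# An empty final result means "fall back to an emptyDir volume".
effective_storage_class() {
  local module_sc="$1"      # storage class from the module's own values
  local existing_pv_sc="$2" # storage class of an already existing PV
  local global_sc="$3"      # globally configured storage class
  local default_sc="$4"     # cluster default storage class

  local candidate
  for candidate in "$module_sc" "$existing_pv_sc" "$global_sc" "$default_sc"; do
    if [ -n "$candidate" ]; then
      echo "$candidate"
      return 0
    fi
  done
  # Nothing suitable: the module should use emptyDir.
  echo ""
}
```

Because the module-level value is checked first, an explicitly configured storage class always wins, and an existing PV's class is preserved over global or default settings, which is what prevents accidental re-provisioning.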
CRDs must be stored in the `crds` directory in the module's root.

The module must contain a dedicated hook called `ensure_crds.go` with the following content:

```go
/*
Copyright 2023 Flant JSC

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package hooks

import (
	"github.com/deckhouse/deckhouse/go_lib/hooks/ensure_crds"
)

var _ = ensure_crds.RegisterEnsureCRDsHook("/deckhouse/modules/MODULE_NAME/crds/*.yaml")
```
If resources described via CRDs are used in other modules, you need to make a separate module for those CRDs. An example: `010-vertical-pod-autoscaler-crd`; most Deckhouse modules use these CRDs.
- The description of the `openAPIV3Schema` resource validation schema should be as detailed as possible; the `description` of objects should be in English.
- Use the `spec.additionalPrinterColumns` property to add descriptions for additional columns. They will be displayed in the `kubectl get` command's output, improving the user experience.
Each module must be properly covered with tests. There are three types of tests:

- Hook tests. These are stored in the `hooks` directory and use the following naming convention: `${hook_name}_test.go`. Hook tests check the result of running hooks.
- Helm tests. These are stored in a separate `template_tests` directory in the module's root. Helm tests check the logic in helm templates.
- Matrix tests. These are described in the `values_matrix_test.yaml` file in the module's root. Matrix tests check the rendering of helm templates and whether the rendered templates match our standards for the large number of values.yaml variants the matrix describes.