Checklist for a new module

Bundle

Bundle is the Deckhouse delivery edition. Possible values:

  • Default — includes the recommended set of modules required for proper cluster operation: monitoring, authorization control, networking, and so on. The current list can be found here.
  • Minimal — the minimum viable set of modules (only the 20-deckhouse module is included).
  • Managed — a set of modules adapted for managed solutions of cloud providers. A list of supported providers:
    • Google Kubernetes Engine (GKE)

To include your module in a specific bundle by default, add the following line to the appropriate modules/values-${bundle}.yaml file: ${module_name}Enabled: true.
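
For illustration, assuming a hypothetical module named example-module, the corresponding entry in modules/values-Default.yaml would look like this (module names are converted to camelCase):

  exampleModuleEnabled: true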

Read more about the algorithm for determining if the module should be enabled.

Helm

  • helm upgrade --install is invoked if the /modules/<module-name>/Chart.yaml file is present.

  • A separate helm release is created for each module. Tiller is responsible for creating resources in the cluster; it runs in the Deckhouse Pod as a separate process. The following command outputs the list of helm releases:

    kubectl -n d8-system exec deploy/deckhouse -- helm list
  • When rolled out for the first time, the helm release deployment will fail if the resources described in the release already exist in the cluster. In this case, the release will have the FAILED state. This error will persist until the duplicate resources are deleted from the cluster.

The release checksum is the checksum of all the helm chart files and values that Deckhouse generates for the release.

Releases in helm do not get updated when the module is restarted if the following conditions are met:

  • The status of the previous release is not FAILED (you can check it in the helm list);
  • The release checksum is the same;
  • The checksum of all manifests in the release after the rendering stays the same.

Thus, restarting modules does not result in the accumulation of unneeded copies of the current helm release.

Module values

Values for a specific module are declared in the global key with the module name. Click here to read more about values for modules.
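
As a rough sketch (the module and option names below are hypothetical), the values available to a module's templates are laid out like this:

  exampleModule:                 # module values, accessible as .Values.exampleModule
    someOption: "value"
    internal:                    # values generated by the module's hooks
      generatedValue: "..."
  global:                        # global values shared by all modules
    modules:
      publicDomainTemplate: "%s.example.com"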

Priority Class

A special helper is implemented in helm_lib to facilitate setting the priorityClassName parameter. Note that you MUST use it in all controllers without exception.

An example:

spec:
  {{- include "helm_lib_priority_class" (tuple . "cluster-critical") | nindent 2 }}

The helper gets the global context and the desired priorityClassName value as an input. If the 001-priority-class module is enabled in Deckhouse, the template will look like this:

spec:
  priorityClassName: cluster-critical

Otherwise:

spec:

For more information about what classes Deckhouse uses, see the description of the priority-class module.

Node Selector

A special helper is also implemented in helm_lib to facilitate setting the nodeSelector option.

An example:

      {{- include "helm_lib_node_selector" (tuple . "monitoring") | nindent 6 }}

The helper gets the global context and the desired strategy as the input to set the nodeSelector parameter.

There are four strategies in total:

  1. frontend, system - these two use the following logic:

    • Use the value of the nodeSelector variable if it is present in module values. Otherwise:
    • If nodes with the node-role.deckhouse.io/{{ .Chart.Name }}="" label are found in the cluster, then this value is used as the nodeSelector. These nodes are considered dedicated for this chart's components. Otherwise:
    • If nodes with the node-role.deckhouse.io/{{ strategy_name }}="" label are found in the cluster, then this value is used as the nodeSelector. These nodes are considered dedicated for all components that use this deployment strategy.
  2. monitoring - uses the same logic as the system and frontend strategies but includes an extra step after all of the above:

    • If nodes with the node-role.deckhouse.io/system="" label are found in the cluster, then this value is used as the nodeSelector. It is assumed that if there are no dedicated monitoring nodes in the cluster, then the components of monitoring-related modules run on the system nodes.
  3. master - this strategy uses the following logic:

    • If nodes with the node-role.kubernetes.io/control-plane="" label are found in the cluster, then this value is used as the nodeSelector. These nodes are considered dedicated for all components that use this deployment strategy.
    • If nodes with the node-role.deckhouse.io/system="" label are found in the cluster, then this value is used as the nodeSelector. It is assumed that if the cluster has no master nodes (and no nodes labeled as masters), then the components of such modules run on the system nodes.

If none of the above conditions for the strategy is met, the nodeSelector will not be set.
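
For illustration, if the cluster has nodes labeled node-role.deckhouse.io/monitoring="" (and the module values contain no nodeSelector override), the monitoring example above renders roughly the following:

  nodeSelector:
    node-role.deckhouse.io/monitoring: ""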

The helper MUST be used for all Deckhouse components (wherever possible) except for DaemonSets that are deployed to all cluster nodes (node-exporter, csi-node, flannel, etc.).

Tolerations

A special helper is also implemented in helm_lib to facilitate setting the tolerations.

An example:

  {{- include "helm_lib_tolerations" (tuple . "monitoring") | nindent 2 }}

The helper gets the global context and the desired strategy as the input to set the tolerations parameter.

  • If the module has the tolerations variable in values, it will be used to set this parameter.
  • The following rules will be added to the manifest:

    tolerations:
    - key: dedicated.deckhouse.io
      operator: Equal
      value: {{ .Chart.Name }}
    - key: dedicated.deckhouse.io
      operator: Equal
      value: {{ strategy_name }}
  • For the monitoring strategy, the rules will look as follows:

    tolerations:
    - key: dedicated.deckhouse.io
      operator: Equal
      value: {{ .Chart.Name }}
    - key: dedicated.deckhouse.io
      operator: Equal
      value: {{ strategy_name }}
    - key: dedicated.deckhouse.io
      operator: Equal
      value: "system"
  • The any-node strategy is used to tolerate any working node.

  • For the any-node strategy, the rules will look as follows:

    tolerations:
    - key: node-role.kubernetes.io/master
    - key: node-role.kubernetes.io/control-plane
    - key: dedicated.deckhouse.io
      operator: "Exists"
    - key: dedicated
      operator: "Exists"
    - key: DeletionCandidateOfClusterAutoscaler
    - key: ToBeDeletedByClusterAutoscaler
    - key: drbd.linbit.com/lost-quorum
    - key: drbd.linbit.com/force-io-error
    - key: drbd.linbit.com/ignore-fail-over
  • The wildcard strategy is used to tolerate everything.

  • For the wildcard strategy, the rules will look as follows:

    tolerations:
    - operator: Exists

There are also several additional strategies that can be used in combination with the main strategies:

  • cloud-provider-uninitialized - adds a toleration used during cluster bootstrap:

    - key: node.cloudprovider.kubernetes.io/uninitialized
      operator: Exists
  • no-csi - tolerates nodes where CSI is not running:

    - key: node.deckhouse.io/csi-not-bootstrapped
      operator: "Exists"
      effect: "NoSchedule"  
  • storage-problems - tolerates nodes with drbd problems:

    - key: drbd.linbit.com/lost-quorum
    - key: drbd.linbit.com/force-io-error
    - key: drbd.linbit.com/ignore-fail-over

This additional strategy is applied by default to every base strategy except wildcard.

  • node-problems - tolerates nodes with various problems (pressure):

    - key: node.kubernetes.io/not-ready
    - key: node.kubernetes.io/out-of-disk
    - key: node.kubernetes.io/memory-pressure
    - key: node.kubernetes.io/disk-pressure
    - key: node.kubernetes.io/pid-pressure
    - key: node.kubernetes.io/unreachable
    - key: node.kubernetes.io/network-unavailable
  • uninitialized - combines the uninitialized, no-csi, and node-problems tolerations:

    - key: node.deckhouse.io/uninitialized
      operator: "Exists"
      effect: "NoSchedule"
    - key: node.deckhouse.io/csi-not-bootstrapped
      operator: "Exists"
      effect: "NoSchedule"
    - key: node.kubernetes.io/not-ready
    - key: node.kubernetes.io/out-of-disk
    - key: node.kubernetes.io/memory-pressure
    - key: node.kubernetes.io/disk-pressure
    - key: node.kubernetes.io/pid-pressure
    - key: node.kubernetes.io/unreachable
    - key: node.kubernetes.io/network-unavailable

To use an additional strategy, add its name with the with- prefix to the helm helper call. Example:

  {{- include "helm_lib_tolerations" (tuple . "any-node" "with-uninitialized") | nindent 2 }}

To disable an additional strategy, add its name with the without- prefix. Currently, this only makes sense for the storage-problems strategy, since it is the one applied by default to all base strategies except wildcard. This is useful for StatefulSets that must not be scheduled on nodes with drbd problems. Example:

  {{- include "helm_lib_tolerations" (tuple . "any-node" "without-node-problems") | nindent 2 }}

The helper MUST be used for all Deckhouse components (wherever possible).

The HA mode for the module

The high availability (HA) mode protects crucial modules against possible downtime or failure.

helm_lib provides auxiliary templates to facilitate using the HA mode.

  • helm_lib_ha_enabled - returns a non-empty string if the HA mode is enabled for the cluster.

    {{- if (include "helm_lib_ha_enabled" .) }}
    HA enabled in Kubernetes cluster!
    {{- end }}
  • helm_lib_is_ha_to_value - is used as an if else expression. If the HA mode is enabled in the cluster, this template returns the first argument passed to it, and if not, it returns the second one.

    # There will be two replicas if the HA mode is enabled for the cluster and one if disabled.
    replicas: {{ include "helm_lib_is_ha_to_value" (list . 2 1) }}

The rules below ensure the correct operation and update of module components (Deployment or StatefulSet):

  • Always set podAntiAffinity for a Deployment and StatefulSet to ensure that the Pods are not run on the same node. Below is an example for prometheus:

          {{- include "helm_lib_pod_anti_affinity_for_ha" (list . (dict "app" "deployment-label")) | nindent 6 }}
  • Set the correct replicas and strategy values for a Deployment:

    • If the Deployment is NOT running on master nodes:

        {{- include "helm_lib_deployment_strategy_and_replicas_for_ha" . | nindent 2 }}

      It prevents blocking updates when the number of Deployment Pods is equal to the number of nodes, and nodeSelector and podAntiAffinity parameters are set.

    • If the Deployment is running on master nodes (on each master node!):

        {{- include "helm_lib_deployment_on_master_strategy_and_replicas_for_ha" . | nindent 2 }}

      It prevents blocking the Deployment update even if one of the master nodes is unavailable (if there are three or more master nodes!).
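
Putting these rules together, a minimal sketch of an HA-ready Deployment might look like the following (the example name, labels, and image key are hypothetical):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: example
  namespace: d8-{{ .Chart.Name }}
spec:
  {{- include "helm_lib_deployment_strategy_and_replicas_for_ha" . | nindent 2 }}
  selector:
    matchLabels:
      app: example
  template:
    metadata:
      labels:
        app: example
    spec:
      {{- include "helm_lib_pod_anti_affinity_for_ha" (list . (dict "app" "example")) | nindent 6 }}
      {{- include "helm_lib_priority_class" (tuple . "cluster-critical") | nindent 6 }}
      containers:
      - name: example
        image: {{ include "helm_lib_module_image" (list . "example") }}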

Evaluating complex conditions

We recommend creating your own helper (if a similar helper has not been implemented already) if there is a need to evaluate some complex condition repeatedly.

  • If the result of evaluating the helper is true, it must return some non-empty string.
  • If the result of evaluating the helper is false, it must return an empty string.

Here is an example of helper implementation:

{{- define "helm_lib_module_https_ingress_tls_enabled" -}}
  {{- $context := . -}}

  {{- $mode := include "helm_lib_module_https_mode" $context -}}

  {{- if or (eq "CertManager" $mode) (eq "CustomCertificate" $mode) -}}
    not empty string
  {{- end -}}
{{- end -}}

Usage:

{{- if (include "helm_lib_module_https_ingress_tls_enabled" .) }}
- name: ca-certificates
  mountPath: "/usr/local/share/ca-certificates/"
  readOnly: true
{{- end }}

Container images

In controller templates (DaemonSets, StatefulSets, Deployments), use the helm_lib_module_image helm helper for container images. For common images like kube-rbac-proxy, use helm_lib_module_common_image.

Usage:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: snapshot-controller
  namespace: d8-{{ .Chart.Name }}
spec:
  selector:
    matchLabels:
      app: snapshot-controller
  template:
    metadata:
      labels:
        app: snapshot-controller
    spec:
      containers:
      - name: snapshot-controller
        image: {{ include "helm_lib_module_image" (list . "snapshotController") }}
      - name: kube-rbac-proxy
        image: {{ include "helm_lib_module_common_image" (list . "kubeRbacProxy") }}

Hooks

For more information about hooks, their structure, and binding to events, see the addon-operator documentation.

In Deckhouse, global hooks are stored in the /global-hooks directory, while module hooks are placed in the module's /modules/MODULE/hooks directory.

You can pass information to the hook using environment variables with paths to files in /tmp. The hook's results are also returned via files. Click here to read more about using parameters in hooks.

Validating admission webhooks

Validating webhooks are similar to regular hooks in their interfaces and running mechanism. They use the same shell framework. For more information about validating webhooks, see the shell-operator documentation.

In Deckhouse, validating hooks are located in the module's /modules/MODULE/webhooks/validation/ directory.

Conversion webhooks

Conversion webhooks are similar to regular hooks in their interfaces and running mechanism. They use the same shell framework. For more information about conversion webhooks, see the shell-operator documentation.

In Deckhouse, conversion webhooks are located in the module's /modules/MODULE/webhooks/conversion/ directory.

kubectl

We do not recommend using kubectl in hooks. It leads to a loss of idempotency since the hook depends on the cluster state in addition to the input parameters (which creates difficulties during debugging/testing).

  • Use the built-in shell-operator functionality (it is fully integrated into Deckhouse) to track objects;
  • Use the shell_lib functionality (the kubernetes::-prefixed functions in particular: kubernetes::create_yaml, kubernetes::patch_jq, kubernetes::delete_if_exists, etc.) to create, edit, and delete objects.

The "enabled" webhooks

The "enabled" webhooks are located in the root directory of the module. You can use them to describe the conditions under which the module must be enabled/disabled.

An example:

#!/bin/bash

source /deckhouse/shell_lib.sh

function __main__() {
  if values::has global.modules.publicDomainTemplate ; then
    echo "true" > $MODULE_ENABLED_RESULT
  else
    echo "false" > $MODULE_ENABLED_RESULT
  fi
}

enabled::run $@

This webhook disables the module in all clusters in which the global.modules.publicDomainTemplate option is not set.

Regular checks are implemented in shell_lib functions with the enabled:: prefix. For example, the hook below disables the module in all clusters with the Kubernetes version < 1.21.0:

function __main__() {
  enabled::disable_module_in_kubernetes_versions_less_than 1.21.0
  echo "true" > $MODULE_ENABLED_RESULT
}

See the documentation for more info.

OpenAPI schemas for validating values

Deckhouse supports validation both for values passed via the Deckhouse ConfigMap and for Deckhouse-generated values.

The OpenAPI value validation scheme is needed:

  • To make sure that the user has entered valid values into the Deckhouse ConfigMap, and to let the user know if the values entered are invalid.
  • To ensure that all the necessary parameters (in the correct format) are passed for rendering the module's helm templates. It ensures the expected behavior within the cluster and that only the planned objects will end up in the cluster.
  • To generate the documentation for the module parameters on the site.

The OpenAPI validating schemes are stored in the $GLOBAL_HOOKS_DIR/openapi directory for global values, and in the $MODULES_DIR/<module-name>/openapi directory for modules.

Refer to the addon-operator documentation for more information about schema validation.

The validation schemas have the OpenAPI Schema Object format. The detailed description of the format is available in the documentation.

Note that addon-operator extends the schema format with additional properties. The additional information is available in the documentation.

Caution! If the additionalProperties property is not defined, it will be set to false at all schema levels.
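
If a field is genuinely meant to accept arbitrary keys, set additionalProperties explicitly, for example (illustrative):

  properties:
    nodeSelector:
      type: object
      additionalProperties:
        type: string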

  • The openapi/config-values.yaml scheme validates values passed by the user via a ConfigMap.

An example:

type: object
properties:
  podNetworkMode:
    type: string
    enum: ["HostGW", "VXLAN"]
    default: "HostGW"
    description: |
      Work mode.
  • The openapi/values.yaml scheme validates combined values consisting of values from ConfigMap and values generated by hooks (learn more here).

    Caution! The openapi/values.yaml scheme only describes the values generated by hooks. Thus, it will produce an error when validating combined values, since it does not describe the ConfigMap-derived values. The x-extend parameter extends the openapi/values.yaml schema with the parameters of the openapi/config-values.yaml schema (as in the example below), thus avoiding duplicating them. The x-extend parameter must be used in all cases. Learn more here.

An example:

x-extend:
  schema: config-values.yaml
type: object
properties:
  internal:
    type: object
    default: {}
    x-required-for-helm:
    - podNetworkMode
    properties:
      podNetworkMode:
        type: string
        enum: ["HostGW", "VXLAN"]

How to create a validation scheme for a module:

  • openapi/config-values.yaml:
    • The scheme is based on the module documentation.
    • Set the default values for the fields. The default values can be specified in the:
      • documentation;
      • $MODULES_DIR/<module-name>/values.yaml file;
      • hardcoded in the hook's code;
    • The required property must be set for all required fields.
  • openapi/values.yaml:
    • The schema is created for the values set by hooks (usually, the internal group's variables).
    • Set the x-extend property to load the config-values.yaml scheme.
    • Set the default values for the fields. The default values can be specified in the:
      • $MODULES_DIR/<module-name>/values.yaml file;
      • hardcoded in the hook's code;
    • The x-required-for-helm property must be set for all mandatory fields.

After creating the schemas for the module, delete the $MODULES_DIR/<module-name>/values.yaml file.

Using the storage class

If the module uses Persistent Storage, the effective storage class (EFC) should be determined as follows:

  1. If the EFC is defined in the module's config (values) – use the one explicitly specified in the module.
  2. If the PV exists – use the storage class of the existing PV.
  3. Otherwise, use either globally defined or default (determined automatically) EFC.
  4. If none of the above are suitable, use an emptyDir volume.

Such an approach allows you to avoid the re-provisioning of PVs (and data loss) that the global or default storage class change can trigger. To re-provision a PV, you must explicitly specify a different storage class right in the module configuration.
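
As an illustration of step 4 above, a pod template might fall back to an emptyDir volume when no effective storage class has been determined (the value and claim names below are hypothetical; the actual selection logic lives in the module's hooks):

  volumes:
  - name: data
  {{- if .Values.exampleModule.internal.effectiveStorageClass }}
    persistentVolumeClaim:
      claimName: example-data
  {{- else }}
    emptyDir: {}
  {{- end }}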

Note! You cannot mutate the volumeClaimTemplate. Thus, you must delete the StatefulSet (e.g., using a hook) when changing the storageClass.

You can find a relevant example in the prometheus and openvpn modules' hooks.

CRDs

CRDs must be stored in the crds directory in the module's root.

The module must contain a dedicated hook called ensure_crds.go with the following content:

/*
Copyright 2023 Flant JSC

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package hooks

import (
 "github.com/deckhouse/deckhouse/go_lib/hooks/ensure_crds"
)

var _ = ensure_crds.RegisterEnsureCRDsHook("/deckhouse/modules/MODULE_NAME/crds/*.yaml")

If resources described via CRDs are used in other modules, you need to make a separate module for those CRDs.

An example: 010-vertical-pod-autoscaler-crd. Most Deckhouse modules use these CRDs.

Creating your own CRDs

  1. The description of the openAPIV3Schema resource validation schema should be as detailed as possible; the description of objects should be in English.
  2. Use the spec.additionalPrinterColumns property to add a description for additional columns. It will be displayed in the kubectl get command's output, thus improving the user experience.
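
For illustration, a minimal CRD covering both points might look like this (the group, kind, and fields are hypothetical; note that in apiextensions.k8s.io/v1 the printer columns are defined per version):

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: examples.deckhouse.io
spec:
  group: deckhouse.io
  scope: Cluster
  names:
    kind: Example
    plural: examples
    singular: example
  versions:
  - name: v1alpha1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        description: Example custom resource (illustrative).
        properties:
          spec:
            type: object
            properties:
              mode:
                type: string
                description: Operating mode of the example resource.
    additionalPrinterColumns:
    - name: Mode
      type: string
      jsonPath: .spec.mode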

Testing

Each module must be properly covered with tests. There are three types of tests:

  • Hook tests. These are stored in the hooks directory and use the following naming convention: ${hook_name}_test.go. Hook tests check the result of running hooks.
  • Helm tests. These are stored in a separate template_tests directory in the module's root. Helm tests check the logic in helm templates.
  • Matrix tests. These are described in the values_matrix_test.yaml file in the module's root. Matrix tests check that the helm templates render correctly and match our standards across the large number of values.yaml combinations the matrix describes.