Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

config merge via imports overwrites the wrong part of the config #5837

Open
boredabdel opened this issue Aug 6, 2021 · 14 comments
Open

config merge via imports overwrites the wrong part of the config #5837

boredabdel opened this issue Aug 6, 2021 · 14 comments
Milestone

Comments

@boredabdel
Copy link

Hi,

I'm trying to use imports to append a custom config to the containerd configuration as i want to avoid editing the config.toml files

config.toml

# Kubernetes doesn't use containerd restart manager.
disabled_plugins = ["restart"]
oom_score = -999
imports = ["/etc/containerd/config.d/*.toml"]

[debug]
  level = "debug"

[grpc]
  gid = 412

[plugins.cri]
  stream_server_address = "127.0.0.1"
  max_container_log_line_size = 262144
[plugins.cri.cni]
  bin_dir = "/home/kubernetes/bin"
  conf_dir = "/etc/cni/net.d"
  conf_template = ""
[plugins.cri.registry.mirrors."docker.io"]
  endpoint = ["https://mirror.gcr.io","https://registry-1.docker.io"]

I have added a custom_config.toml file under /etc/containerd/config.d. It's content is


[plugins.cri.registry.mirrors."myregistrydomain.com"]
  endpoint = ["https://myregistrydomain.com"]
[plugins.cri.registry.configs."myregistrydomain.com".tls]
  ca_file = "/etc/custom-certs/cert.pem"

Before restarting the containerd daemon, a crictl info returns

{
  "status": {
    "conditions": [
      {
        "type": "RuntimeReady",
        "status": true,
        "reason": "",
        "message": ""
      },
      {
        "type": "NetworkReady",
        "status": true,
        "reason": "",
        "message": ""
      }
    ]
  },
  "cniconfig": {
    "PluginDirs": [
      "/home/kubernetes/bin"
    ],
    "PluginConfDir": "/etc/cni/net.d",
    "PluginMaxConfNum": 1,
    "Prefix": "eth",
    "Networks": [
      {
        "Config": {
          "Name": "cni-loopback",
          "CNIVersion": "0.3.1",
          "Plugins": [
            {
              "Network": {
                "type": "loopback",
                "ipam": {},
                "dns": {}
              },
              "Source": "{\"type\":\"loopback\"}"
            }
          ],
          "Source": "{\n\"cniVersion\": \"0.3.1\",\n\"name\": \"cni-loopback\",\n\"plugins\": [{\n  \"type\": \"loopback\"\n}]\n}"
        },
        "IFName": "lo"
      },
      {
        "Config": {
          "Name": "gke-pod-network",
          "CNIVersion": "0.3.1",
          "Plugins": [
            {
              "Network": {
                "type": "gke",
                "ipam": {
                  "type": "host-local"
                },
                "dns": {}
              },
              "Source": "{\"ipam\":{\"ranges\":[[{\"subnet\":\"10.100.5.0/24\"}]],\"routes\":[{\"dst\":\"0.0.0.0/0\"}],\"type\":\"host-local\"},\"mtu\":1460,\"type\":\"gke\"}"
            },
            {
              "Network": {
                "type": "portmap",
                "capabilities": {
                  "portMappings": true
                },
                "ipam": {},
                "dns": {}
              },
              "Source": "{\"capabilities\":{\"portMappings\":true},\"type\":\"portmap\"}"
            },
            {
              "Network": {
                "type": "bandwidth",
                "capabilities": {
                  "bandwidth": true
                },
                "ipam": {},
                "dns": {}
              },
              "Source": "{\"capabilities\":{\"bandwidth\":true},\"type\":\"bandwidth\"}"
            },
            {
              "Network": {
                "type": "cilium-cni",
                "ipam": {},
                "dns": {}
              },
              "Source": "{\"type\":\"cilium-cni\"}"
            }
          ],
          "Source": "{ \"name\": \"gke-pod-network\", \"cniVersion\": \"0.3.1\", \"plugins\": [ { \"type\": \"gke\", \"mtu\": 1460, \"ipam\": { \"type\": \"host-local\", \"ranges\": [ [{\"subnet\": \"10.100.5.0/24\"}] ], \"routes\": [ {\"dst\": \"0.0.0.0/0\"} ] } }, { \"type\": \"portmap\", \"capabilities\": { \"portMappings\": true } },{\"type\": \"bandwidth\",\"capabilities\": {\"bandwidth\": true}},{\"type\": \"cilium-cni\"} ] }\n"
        },
        "IFName": "eth0"
      }
    ]
  },
  "config": {
    "containerd": {
      "snapshotter": "overlayfs",
      "defaultRuntimeName": "runc",
      "defaultRuntime": {
        "runtimeType": "",
        "runtimeEngine": "",
        "PodAnnotations": null,
        "ContainerAnnotations": null,
        "runtimeRoot": "",
        "options": null,
        "privileged_without_host_devices": false,
        "baseRuntimeSpec": ""
      },
      "untrustedWorkloadRuntime": {
        "runtimeType": "",
        "runtimeEngine": "",
        "PodAnnotations": null,
        "ContainerAnnotations": null,
        "runtimeRoot": "",
        "options": null,
        "privileged_without_host_devices": false,
        "baseRuntimeSpec": ""
      },
      "runtimes": {
        "runc": {
          "runtimeType": "io.containerd.runc.v2",
          "runtimeEngine": "",
          "PodAnnotations": null,
          "ContainerAnnotations": null,
          "runtimeRoot": "",
          "options": {},
          "privileged_without_host_devices": false,
          "baseRuntimeSpec": ""
        }
      },
      "noPivot": false,
      "disableSnapshotAnnotations": true,
      "discardUnpackedLayers": false
    },
    "cni": {
      "binDir": "/home/kubernetes/bin",
      "confDir": "/etc/cni/net.d",
      "maxConfNum": 1,
      "confTemplate": ""
    },
    "registry": {
      "mirrors": {
        "docker.io": {
          "endpoint": [
            "https://mirror.gcr.io",
            "https://registry-1.docker.io"
          ]
        }
      },
      "configs": null,
      "auths": null,
      "headers": null
    },
    "imageDecryption": {
      "keyModel": ""
    },
    "disableTCPService": true,
    "streamServerAddress": "127.0.0.1",
    "streamServerPort": "0",
    "streamIdleTimeout": "4h0m0s",
    "enableSelinux": false,
    "selinuxCategoryRange": 1024,
    "sandboxImage": "k8s.gcr.io/pause:3.2",
    "statsCollectPeriod": 10,
    "systemdCgroup": false,
    "enableTLSStreaming": false,
    "x509KeyPairStreaming": {
      "tlsCertFile": "",
      "tlsKeyFile": ""
    },
    "maxContainerLogSize": 262144,
    "disableCgroup": false,
    "disableApparmor": false,
    "restrictOOMScoreAdj": false,
    "maxConcurrentDownloads": 3,
    "disableProcMount": false,
    "unsetSeccompProfile": "",
    "tolerateMissingHugetlbController": true,
    "disableHugetlbController": true,
    "ignoreImageDefinedVolumes": false,
    "containerdRootDir": "/var/lib/containerd",
    "containerdEndpoint": "/run/containerd/containerd.sock",
    "rootDir": "/var/lib/containerd/io.containerd.grpc.v1.cri",
    "stateDir": "/run/containerd/io.containerd.grpc.v1.cri"
  },
  "golang": "go1.13.5",
  "lastCNILoadStatus": "OK"
}

After restarting the containerd daemon, somehow the config merge changes the binDir. below is a truncated output of crictl info to only the part i don't understand why it's changing

    "cni": {
      "binDir": "/opt/cni/bin",
      "confDir": "/etc/cni/net.d",
      "maxConfNum": 1,
      "confTemplate": ""
    },

What i don't understand is why my binDir is changing and where this value(/opt/cni/bin) coming from

Any help is appreciated it, happy to provide dumps


crictl version
Version:  0.1.0
RuntimeName:  containerd
RuntimeVersion:  1.4.4
RuntimeApiVersion:  v1alpha2
@digglife
Copy link

digglife commented Aug 8, 2021

This is an expected behavior. When containerd server merges configuration from files defined in imports, it actually merges the plugin's configuration by section, not by key-value.

// Replace entire sections instead of merging map's values.
for k, v := range from.Plugins {
to.Plugins[k] = v
}

In your case, the configuration for plugins.cri is completely replaced by custom_config.toml, in which only registry mirrors are configured, everything else becomes un-configured, so the default values are used for them eventually.

The solution is moving the settings for cri plugin to one single toml file.

@t-indumathy
Copy link

Is there any way to pass these custom values as KV inputs, while restarting containerd ?

@digglife
Copy link

digglife commented Aug 9, 2021

@t-indumathy I don't think so, unless we change above-mentioned merge strategy for plugins' configuration, which is doable but I guess the current strategy is intentional. We need maintainers' opinions here.

@dkoshkin
Copy link

dkoshkin commented Nov 23, 2021

I stumbled on this after running into a similar situation.
I've modified the config file for nvidia runtime and then added a runtime important for Containerd mirrors.
It would be great to have a different merge strategy for the CRI plugin keys.

@joejulian
Copy link

I don't understand why there would be an import function if you can't merge the imported structures into the configuration. This seems like a logical expectation when importing directories, especially. I would argue that it's not actually working as intended, just that it's working as implemented making this a valid bug that could be fixed.

@dlipovetsky
Copy link

dlipovetsky commented Nov 23, 2021

The merge implementation is from #3574. There is a conversation in that PR about getting the merge right, and it looks like this approach won out:

What about just doing a replace based on the sections? I makes it easier to reason about and the user can just copy paste the section they want to change
-- #3574 (comment)

This PR is original implementation of config imports. Because there was no support for config imports, there was no user feedback to inform how to implement the merge. After 2+ years, we have user feedback, and we can reconsider how to implement the merge.

@joejulian
Copy link

Looks like someone thought that humans would be doing this rather than machines. Thanks for looking at the history, @dlipovetsky . Seems like there's room for improvement then. Whew. 😁 .

@gjkloosterman
Copy link

gjkloosterman commented Dec 17, 2021

I'm running into this issue as well.

I wanted to use a separate file to configure the registry config path (touching plugins."io.containerd.grpc.v1.cri".registry) and another one to configure the NVIDIA container runtime (touching plugins."io.containerd.grpc.v1.cri".containerd).

The documentation at https://github.com/containerd/containerd/blob/main/docs/man/containerd-config.toml.5.md suggests this will work:

Imported files will overwrite simple fields like int or string (if not empty) and will append array and map fields.

However, as explained in the earlier comments, this is not true for plugins. Is this something that can be addressed in the near future?

@mel1nn
Copy link

mel1nn commented Jan 11, 2022

Hi, I have the same problem, the configuration from the imports is not merged with the config.toml file.

smira added a commit to smira/talos that referenced this issue Jan 19, 2022
Containerd doesn't support merging plugin configuration from multiple
sources, and Talos has several pieces which configure CRI plugin:
(see containerd/containerd#5837)

* base config
* registry mirror config
* system extensions
* ...

So we implement our own simple way of merging config parts (by simply
concatenating text files) to build a final `cri.toml`.

At the same time containerd migrated to a new format to specify registry
mirror configuration, while old way (via CRI config) is going to be
removed in 1.7.0. New way also allows to apply most of registry
configuration (except for auth) on the fly.

Also, containerd was updated to 1.6.0-rc.0 and runc to 1.1.0.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
smira added a commit to smira/talos that referenced this issue Jan 19, 2022
Containerd doesn't support merging plugin configuration from multiple
sources, and Talos has several pieces which configure CRI plugin:
(see containerd/containerd#5837)

* base config
* registry mirror config
* system extensions
* ...

So we implement our own simple way of merging config parts (by simply
concatenating text files) to build a final `cri.toml`.

At the same time containerd migrated to a new format to specify registry
mirror configuration, while old way (via CRI config) is going to be
removed in 1.7.0. New way also allows to apply most of registry
configuration (except for auth) on the fly.

Also, containerd was updated to 1.6.0-rc.0 and runc to 1.1.0.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
smira added a commit to smira/talos that referenced this issue Jan 20, 2022
Containerd doesn't support merging plugin configuration from multiple
sources, and Talos has several pieces which configure CRI plugin:
(see containerd/containerd#5837)

* base config
* registry mirror config
* system extensions
* ...

So we implement our own simple way of merging config parts (by simply
concatenating text files) to build a final `cri.toml`.

At the same time containerd migrated to a new format to specify registry
mirror configuration, while old way (via CRI config) is going to be
removed in 1.7.0. New way also allows to apply most of registry
configuration (except for auth) on the fly.

Also, containerd was updated to 1.6.0-rc.0 and runc to 1.1.0.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
smira added a commit to smira/talos that referenced this issue Jan 20, 2022
Containerd doesn't support merging plugin configuration from multiple
sources, and Talos has several pieces which configure CRI plugin:
(see containerd/containerd#5837)

* base config
* registry mirror config
* system extensions
* ...

So we implement our own simple way of merging config parts (by simply
concatenating text files) to build a final `cri.toml`.

At the same time containerd migrated to a new format to specify registry
mirror configuration, while old way (via CRI config) is going to be
removed in 1.7.0. New way also allows to apply most of registry
configuration (except for auth) on the fly.

Also, containerd was updated to 1.6.0-rc.0 and runc to 1.1.0.

Signed-off-by: Andrey Smirnov <andrey.smirnov@talos-systems.com>
DavidFair added a commit to stfc/k8s-image-builder that referenced this issue Aug 16, 2022
Because of containerd/containerd#5837 we're
not able to seperate out the registries directive whilst using the
Nvidia runtime.

To workaround this we will edit the original config.toml so all
definitions are in-line. Even though this will likely result in
conflicts with upstream in the future.
@samuelkarp samuelkarp added this to the 2.0 milestone Feb 8, 2023
@samuelkarp
Copy link
Member

With 2.0 on the way (after we wrap up 1.7), I think this would be something worth looking at. I do think that changing the merge strategy is a breaking change, but I see a couple options for 2.0:

  1. Make this change and document it. Possibly write out a warning the first time containerd 2.0 starts with a config that is merged differently, or vend a tool that can be used to validate upgrade conditions ahead of time.
  2. Move to config version 3 that changes the merge strategy so that this is opt-in.

@cwrau
Copy link

cwrau commented Aug 21, 2023

With 2.0 on the way (after we wrap up 1.7), I think this would be something worth looking at. I do think that changing the merge strategy is a breaking change, but I see a couple options for 2.0:

  1. Make this change and document it. Possibly write out a warning the first time containerd 2.0 starts with a config that is merged differently, or vend a tool that can be used to validate upgrade conditions ahead of time.
  2. Move to config version 3 that changes the merge strategy so that this is opt-in.

Is there a timeline when 2.0 will be released? If it's a long way away option 2 could be already done right now as it wouldn't be breaking

@macwinnie
Copy link

Hi everyone,

I don't know if my observation totally fits in here – but I discovered some weird behaviour with the config import as well =D

Having containerd with version v1.7.4 installed and the actual default config in /etc/containerd/config.toml only adjusted by adding imports, importing two files will override themselves by having top level keys in common.

What I have:

  • my /etc/containerd/config.tomlcontainerd config default > /etc/containerd/config.toml and then edited a single line:
...
imports = ["/etc/containerd/runtime/*.toml"]
...
  • having two files in the /etc/containerd/runtime directory, the latter filename in alphabetical order is applied:

first.toml:

[plugins."io.containerd.grpc.v1.cri"]
disable_apparmor = true

second.toml:

[plugins."io.containerd.grpc.v1.cri".registry]
config_path = ""

[plugins."io.containerd.grpc.v1.cri".registry.configs."docker.io".auth]
username = "***"
password = "$$$"

In that order, only the registry credentials are imported correctly, the disable_apparmor setting is abandoned ... renaming the files the other way round results in the same behaviour but this time the registry credentials are missing and disable_apparmor is respected.

If that's something to tinker around with v2.x.x, I'm glad to await the next version – but if I should open a new bug report for that, please just tell me =)

Thanks and best!
macwinnie

@dkoshkin
Copy link

dkoshkin commented Sep 27, 2023

To work around this we developed a utility to do toml key merging in https://github.com/mesosphere/toml-merge

Example script of merging all files in /etc/containerd/custom-conf.d/*.toml and write it back out to /etc/containerd/config.toml:

        #!/bin/bash
        set -euo pipefail
        IFS=$'\n\t'

        declare -r TOML_MERGE_IMAGE="ghcr.io/mesosphere/toml-merge:v0.2.0"

        if ! ctr --namespace k8s.io images check "name==${TOML_MERGE_IMAGE}" | grep "${TOML_MERGE_IMAGE}" >/dev/null; then
          ctr --namespace k8s.io images pull "${TOML_MERGE_IMAGE}"
        fi

        cleanup() {
          ctr images unmount "${tmp_ctr_mount_dir}" || true
        }

        trap 'cleanup' EXIT

        readonly tmp_ctr_mount_dir="$(mktemp -d)"

        ctr --namespace k8s.io images mount "${TOML_MERGE_IMAGE}" "${tmp_ctr_mount_dir}"
        "${tmp_ctr_mount_dir}/usr/local/bin/toml-merge" -i --patch-file "/etc/containerd/custom-conf.d/*.toml" /etc/containerd/config.toml

@cwrau
Copy link

cwrau commented Nov 16, 2023

Is there a timeline when this will be fixed / v2 / config v3 will be released? Because we keep running into this problem

This time it's because we switched to cgroupv2 and have to set SystemdCgroup = true, but that line of course gets overridden...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests