Make alternative container runtimes easier to use #8700

Closed
brandond opened this issue Oct 19, 2023 · 2 comments · Fixed by #8751

@brandond
Contributor

brandond commented Oct 19, 2023

We should:

  • Pre-install RuntimeClass resources for any runtimes that we support auto-detecting. These are cluster-level resources, and it is harmless for these to be present even if some/all nodes do not have the matching runtime in their containerd config.
  • Add an agent-level option to set the default runtime for the node. Right now, users who want to use the nvidia container runtime instead of runc, without modifying pod specs to set the runtimeClassName, have to provide their own containerd.config.toml. Users frequently make mistakes and open issues when this breaks, as it is not trivial to do correctly.
  • Add support for additional runtimes. Right now we only support nvidia and nvidia-experimental. The checks are trivial (searching for binaries in a couple paths) and should be easy to extend.
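
To illustrate the kind of check described in the last bullet, here is a minimal sketch that searches a few directories for a runtime binary. The function name `find_runtime` and the directory list are assumptions made for illustration; this is not the code or the search paths k3s actually uses. It tests for a regular file rather than an executable one, on the assumption that presence alone is enough to trigger detection.

```shell
#!/bin/sh
# Hypothetical sketch: find a runtime binary by searching a list of directories.
# find_runtime NAME DIR...
# Prints the full path of the first match and returns 0, or returns 1 if none.
find_runtime() {
    name=$1
    shift
    for dir in "$@"; do
        if [ -f "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    return 1
}

# Demo against a throwaway directory tree so nothing on the host is touched.
root=$(mktemp -d)
mkdir -p "$root/usr/bin" "$root/usr/sbin"
touch "$root/usr/sbin/containerd-shim-spin-v1"

for candidate in nvidia-container-runtime containerd-shim-spin-v1; do
    if path=$(find_runtime "$candidate" "$root/usr/bin" "$root/usr/sbin"); then
        echo "found $candidate at $path"
    else
        echo "$candidate not found"
    fi
done
rm -rf "$root"
```

Supporting another runtime in a scheme like this would only mean adding one more binary name to the loop, which is why extending the checks should be cheap.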
@brandond brandond added this to the v1.28.4+k3s1 milestone Oct 19, 2023
@vitorsavian vitorsavian self-assigned this Oct 26, 2023
This was referenced Oct 31, 2023
@brandond brandond reopened this Nov 13, 2023
@brandond
Contributor Author

Reopening as incomplete. Unfortunately the linked PR used the "fixes" keyword which causes auto-closing of issues.

@fmoral2
Contributor

fmoral2 commented Dec 12, 2023

Validated on Version:

-$ k3s version v1.28.4+k3s-71a3c35f (71a3c35f)

Environment Details

Infrastructure
Cloud EC2 instance

Node(s) CPU architecture, OS, and Version:
NAME="Oracle Linux Server"
VERSION="8.9"
ID="ol"
ID_LIKE="fedora"

Cluster Configuration:
1 node server

Steps to validate the fix

  1. Create placeholder files that stand in for additional containerd runtime binaries
  2. Install k3s
  3. Check the generated config.toml file
  4. Validate that each runtime binary has a corresponding section in config.toml
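
Step 4 can be scripted. The sketch below is one possible way to do it, not part of the original validation: the helper name `has_runtime_section` is hypothetical, and it assumes each runtime appears under a table header of the form `[plugins."io.containerd.grpc.v1.cri".containerd.runtimes."<name>"]`. The demo runs against a small sample file rather than a real node.

```shell
#!/bin/sh
# Hypothetical helper: has_runtime_section CONFIG NAME
# Succeeds if CONFIG contains the containerd CRI runtime table for NAME.
# grep -F matches the header as a fixed string, so "nvidia" does not
# also match the "nvidia-experimental" table or any ".options" sub-table.
has_runtime_section() {
    grep -qF "[plugins.\"io.containerd.grpc.v1.cri\".containerd.runtimes.\"$2\"]" "$1"
}

# Demo: a sample config containing only the "lunatic" runtime.
config=$(mktemp)
cat > "$config" <<'EOF'
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes."lunatic"]
  runtime_type = "io.containerd.lunatic.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes."lunatic".options]
  BinaryName = "/usr/bin/containerd-shim-lunatic-v1"
EOF

for name in lunatic spin; do
    if has_runtime_section "$config" "$name"; then
        echo "$name: present"
    else
        echo "$name: missing"
    fi
done
rm -f "$config"
```

On a real node the first argument would be /var/lib/rancher/k3s/agent/etc/containerd/config.toml, which may require root to read.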

Reproduction Issue:

 
 sudo touch /usr/bin/containerd-shim-lunatic-v1
 sudo touch /usr/sbin/containerd-shim-spin-v1
 sudo touch /usr/bin/nvidia-container-runtime

 # k3s -v
k3s version v1.28.2+k3s-aaf84090 (aaf84090)
go version go1.20.8


# ls -a /var/lib/rancher/k3s/agent/containerd/
.  ..  io.containerd.snapshotter.v1.overlayfs


Validation Results:


 
 sudo touch /usr/bin/containerd-shim-lunatic-v1
 sudo touch /usr/sbin/containerd-shim-spin-v1
 sudo touch /usr/bin/nvidia-container-runtime

 
 # k3s -v
k3s version v1.28.4+k3s-71a3c35f (71a3c35f)
go version go1.20.11


 # cat /var/lib/rancher/k3s/agent/etc/containerd/config.toml | grep -E 'lunatic|spin|nvidia'
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes."lunatic"]
  runtime_type = "io.containerd.lunatic.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes."lunatic".options]
  BinaryName = "/usr/bin/containerd-shim-lunatic-v1"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes."nvidia"]
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes."nvidia".options]
  BinaryName = "/usr/bin/nvidia-container-runtime"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes."nvidia-experimental"]
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes."nvidia-experimental".options]
  BinaryName = "/usr/bin/nvidia-container-runtime-experimental"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes."spin"]
  runtime_type = "io.containerd.spin.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes."spin".options]
  BinaryName = "/usr/sbin/containerd-shim-spin-v1"

get node,pods -A
NAME                                               STATUS   ROLES                       AGE     VERSION
node/ .us-east-2.compute.internal   Ready    control-plane,etcd,master   6m22s   v1.28.4+k3s-71a3c35f

NAMESPACE     NAME                                          READY   STATUS      RESTARTS   AGE
kube-system   pod/coredns-6799fbcd5-v76dc                   1/1     Running     0          6m6s
kube-system   pod/helm-install-traefik-2vp2h                0/1     Completed   1          6m7s
kube-system   pod/helm-install-traefik-crd-q7cjs            0/1     Completed   0          6m7s
kube-system   pod/local-path-provisioner-84db5d44d9-b76pk   1/1     Running     0          6m6s
kube-system   pod/metrics-server-67c658944b-fg68p           1/1     Running     0          6m6s
kube-system   pod/svclb-traefik-b511a7b4-vw8np              2/2     Running     0          5m50s

