Container Stuck in ContainerCreating | no runtime for "spin" is configured #209

Closed

megamxl opened this issue Apr 21, 2024 · 6 comments

megamxl commented Apr 21, 2024

This issue is already resolved, but I didn't know where else to share the information. I'm posting it here so that others who get stuck on this can find the fix.

kubectl describe pods

The failure was: Unknown desc = failed to get sandbox runtime: no runtime for "spin" is configured

Events:
  Type     Reason                  Age                  From               Message
  ----     ------                  ----                 ----               -------
  Normal   Scheduled               4m38s                default-scheduler  Successfully assigned default/simple-spinapp-56687588d9-w9h9k to node
  Warning  FailedCreatePodSandBox  3s (x23 over 4m38s)  kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to get sandbox runtime: no runtime for "spin" is configured

I fixed it with a comment from the Fermyon Discord server, which isn't easy for a search engine to crawl.
The fix is from @stevesloka on Discord.

Edit this file on each host: /var/lib/rancher/k3s/agent/etc/containerd/config.toml.tmpl

and add this:

{{ template "base" . }}

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.spin]
runtime_type = "io.containerd.spin.v1"

https://docs.k3s.io/advanced?_highlight=config.toml.tmp#configuring-containerd

I then rebooted all nodes and was finally able to deploy apps.
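
A quick way to verify that the change took effect (a sketch; it assumes k3s re-renders config.toml.tmpl into config.toml on restart, as described in the k3s docs linked above):

    # restart k3s so the template is re-rendered
    $ sudo systemctl restart k3s
    # check the rendered config for the new runtime entry
    $ grep -A1 'runtimes.spin' /var/lib/rancher/k3s/agent/etc/containerd/config.toml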

megamxl commented Apr 21, 2024

Update: it works fine for applications without autoscaling. When I apply the HPA example, I still have the same problem. Does anyone know why this is happening?

@endocrimes (Contributor) commented:

@kate-goldenring had issues scaling on k3s that looked like they were tied to concurrent pull limits in the kubelet when scaling up by more than a couple of replicas at a time.

I haven't ever dug in further to see if they were easily fixable (or specifically related to SpinKube) though.

If you could attach kubelet logs they would be super helpful for seeing if it's the same issue and further triage. Thanks 🎉
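
For collecting those logs on k3s: the kubelet runs embedded in the k3s binary, so its output ends up in the k3s service journal (a sketch, assuming systemd-managed nodes; agent-only nodes use the k3s-agent unit instead):

    # server nodes
    $ sudo journalctl -u k3s --since "1 hour ago"
    # agent nodes
    $ sudo journalctl -u k3s-agent --since "1 hour ago"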


megamxl commented Apr 22, 2024

I have done some further tests on a new cluster and documented exactly how I got normal Spin apps (without scaling) working:

  1. Set up the k3s cluster.
  2. Install all required dependencies, following the "Installing with Helm" section: https://www.spinkube.dev/docs/spin-operator/installation/installing-with-helm/
  3. Check /var/lib/rancher/k3s/agent/etc/containerd/config.toml.tmpl, delete all spin settings, and add only the following to the end of the file:

... standard k3s containerd configuration ...

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.spin]
    runtime_type = "/opt/kwasm/bin/containerd-shim-spin-v2"

  4. Then I restarted the k3s service (see https://docs.k3s.io/upgrades/killall):
    $ sudo systemctl restart k3s

Afterward, I was able to run the example.
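
A quick sanity check at this point (a sketch; the shim path comes from the config above, and the RuntimeClass name wasmtime-spin-v2 is assumed from the SpinKube quickstart - its handler has to match the "spin" runtime name in containerd):

    # the shim binary installed by the kwasm node installer
    $ ls -l /opt/kwasm/bin/containerd-shim-spin-v2
    # the RuntimeClass whose handler must be "spin"
    $ kubectl get runtimeclass wasmtime-spin-v2 -o yaml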

The next comment will describe the scaling part


megamxl commented Apr 22, 2024

When I manually check the pods with kubectl, I get values:

$ kubectl top pod

NAME                              CPU(cores)   MEMORY(bytes)
hpa-spinapp-d75d89476-w2d9m       1m           19Mi
simple-spinapp-56687588d9-r46kd   1m           13Mi

When I do kubectl get hpa, I also get values:

NAME                 REFERENCE                TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
spinapp-autoscaler   Deployment/hpa-spinapp   1%/50%    1         10        1          101s

After trying the ingress I noticed that it was unable to scale, and when I removed and re-added the HPA I got:

kubectl get  hpa
NAME                 REFERENCE                TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
spinapp-autoscaler   Deployment/hpa-spinapp   <unknown>/50%   1         10        1          28s

kubectl describe hpa

Conditions:
  Type           Status  Reason                   Message
  ----           ------  ------                   -------
  AbleToScale    True    SucceededGetScale        the HPA controller was able to get the target's current scale
  ScalingActive  False   FailedGetResourceMetric  the HPA was unable to compute the replica count: failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
Events:
  Type     Reason                        Age   From                       Message
  ----     ------                        ----  ----                       -------
  Warning  FailedGetResourceMetric       2s    horizontal-pod-autoscaler  failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
  Warning  FailedComputeMetricsReplicas  2s    horizontal-pod-autos

I hope you can work with this info

When I tried 5 concurrent invocations, the result was 1% CPU all the time. I also tried to change the interval of the metrics server, but this didn't fix anything either.
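
For reference, a load-generation sketch along the lines of the Kubernetes HPA walkthrough (the hpa-spinapp service name and port 80 are assumptions here; adjust them to however the app is actually exposed):

    $ kubectl run load-generator --rm -i --tty --image=busybox --restart=Never -- \
        /bin/sh -c "while sleep 0.01; do wget -q -O- http://hpa-spinapp:80; done"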

I can confirm that the communication between the metrics server and the Spin apps does not work, because when I put the pods under load nothing happens and the reading stays at 1%.
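
One way to narrow this down (a sketch that queries the resource metrics API served by metrics-server) is to ask the API directly and see whether the Spin app pods report any CPU at all:

    $ kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/default/pods"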


kate-goldenring commented Apr 22, 2024

@megamxl, as @endocrimes mentioned, I had some issues with k3s. One was the one you described: the shim wasn't being used until I restarted the k3s service (which contains containerd). However, a change was added to the node installer to run systemctl restart k3s: https://github.com/spinkube/containerd-shim-spin/pull/43/files#diff-716e6e24b7a494f27721cbbd94d70fba41081dcfefbd8a5ca81eea88a7b3de17R49. These changes should be in the latest node installer version (v0.13.1).

The other issue I was experiencing was pods restarting with ErrImagePull and ImagePullBackOff due to exceeding the max pulls on k3s when using a latest-tagged image. Switching to a versioned tag resolved that issue, since latest changes the pull policy to Always and I was scaling to 50 replicas.
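
If you want to confirm which pull policy your pods ended up with, a small sketch (the deployment name is taken from the HPA example above; the jsonpath is the standard Pod spec field):

    $ kubectl get deployment hpa-spinapp \
        -o jsonpath='{.spec.template.spec.containers[0].imagePullPolicy}'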

For HPA, I believe I also had delays in calculating CPU usage, with it showing unknown for a minute, and I could rarely get it to scale above 2%. I assumed this was because I was not incurring enough load, but it could be an issue with the metrics server that comes with k3s.
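
To rule out the bundled metrics-server itself, a quick sketch (the k8s-app=metrics-server label and the deployment name are assumed from the upstream metrics-server manifests that k3s bundles):

    $ kubectl -n kube-system get pods -l k8s-app=metrics-server
    $ kubectl -n kube-system logs deploy/metrics-server --tail=50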

@endocrimes (Contributor) commented:

Closing this, as it's a cluster configuration issue more than an operator issue that we can act on - but definitely something to keep in mind when using k3s.
