Krustlet-tutorial pod gets stuck in Init:Registered status #624

Open
JesseStutler opened this issue Jun 7, 2021 · 19 comments
Labels
question Further information is requested

Comments

@JesseStutler

Hi, I followed https://github.com/deislabs/krustlet/blob/main/docs/intro/tutorial03.md to create a pod on a Krustlet node. The pod was scheduled to the Krustlet node successfully, but it is stuck in Init:Registered and I don't know why. I suspect it is because Istio is installed in the cluster: the istio-init init container was the first to get stuck in a waiting state. The Istio version is v1.10.0. Is this a Krustlet problem or an Istio problem, and how can I solve it?
One more question, by the way: does Krustlet use wasm-to-oci pull to pull a wasm module from a registry, or does it still use docker pull?

@kflansburg
Collaborator

We do not currently support Pods with both traditional containers and WASM workloads at the same time. I don't think it will be possible to run Krustlet Pods within an Istio mesh for now.

@JesseStutler
Author

JesseStutler commented Jun 7, 2021

@kflansburg Thanks. I moved the pod, which contains only the wasm workload, into a new namespace that does not have the istio-injection=enabled label, but the pod is still stuck in Registered, a status I have never seen before. kubectl describe shows that the pod is in ImagePullBackoff. I then ran docker pull manually on the Krustlet node to pull the wasm module from the registry, but it failed with the error: invalid rootfs in image configuration. Does Krustlet still use docker pull to pull a wasm module, or does it use wasm-to-oci pull? Why does this happen? Is it a problem with wasm-to-oci?

@kflansburg
Collaborator

I believe that Krustlet uses the client in the OCI Registry crate (https://github.com/deislabs/krustlet/tree/main/crates/oci-distribution) to pull wasm images.

What is the Pod manifest that you are trying to run? Does Krustlet emit any logs about why the image pull failed?

@JesseStutler
Author

Here is the Krustlet log:

[2021-06-07T05:54:27Z ERROR kubelet::state::common::image_pull] error sending request for url (https://192.168.0.211:8088/v2/): error trying to connect: error:1408F10B:SSL routines:ssl3_get_record:wrong version number:../ssl/record/ssl3_record.c:332:

    Caused by:
        0: error trying to connect: error:1408F10B:SSL routines:ssl3_get_record:wrong version number:../ssl/record/ssl3_record.c:332:
        1: error:1408F10B:SSL routines:ssl3_get_record:wrong version number:../ssl/record/ssl3_record.c:332:
        2: error:1408F10B:SSL routines:ssl3_get_record:wrong version number:../ssl/record/ssl3_record.c:332:

192.168.0.211:8088 is the address of my Harbor instance. I think the error occurs because I connect to Harbor over HTTP and have not enabled HTTPS. Does the connection to the registry have to use HTTPS, or is there a parameter that makes Krustlet use HTTP instead?

@bacongobbler
Collaborator

bacongobbler commented Jun 7, 2021

A couple of things to unpack here.

> Then I ran the command docker pull manually on the krustlet node to pull the wasm module from the registry, but it reported an error:invalid rootfs in image configuration

A common misconception is that because Krustlet stores WebAssembly modules in OCI registries, you can use docker pull to fetch them. Another is that because modules are stored in OCI, they must be Docker containers. Both are untrue. WebAssembly modules are stored as WebAssembly modules. And while the push/pull mechanism is the same for Docker and Krustlet, docker only understands how to fetch Docker containers, not WebAssembly modules. To fetch modules from an OCI registry, use wasm-to-oci.

> Why is this?

docker can only build, push, pull, unpack, and run Docker containers. It does not understand how to build, push, pull, unpack, or run a WebAssembly module.

> Does it have to use HTTPS to connection to the registry?

Yes. This is enforced by the OCI distribution specification. All connections must be through HTTPS unless that registry is marked as "insecure", or is listening on the local loopback address (127.0.0.1).

@JesseStutler
Author

Thanks for the prompt reply, that was fast! I understand a lot more now 😁. I will close the issue once I have solved it.

@JesseStutler
Author

JesseStutler commented Jun 7, 2021

> All connections must be through HTTPS unless that registry is marked as "insecure"

How do I mark the registry as "insecure" if I have to use HTTP? Setting up an HTTPS registry such as Harbor is a bit troublesome in my situation. If I can't use HTTP, I would have to use a hosted registry such as Azure or Google Container Registry, but I can't use those because I don't have a credit card 😅. GitHub Package Registry would be my last resort.

@bacongobbler
Collaborator

An image reference is broken down to its URL counterparts here. scheme_for determines the URL scheme (HTTP or HTTPS). The definition for that function is here. It will use HTTP if that registry is part of the exception list.

krustlet-wasi provides a feature flag in its configuration to pass insecure registries to the oci-distribution client.
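
As a rough illustration of the rule described above (HTTP only for registries on the exception list, HTTPS for everything else), here is a minimal Python sketch. It is not the oci-distribution Rust code: the function names and the way the exception list is passed in are assumptions made for this example.

```python
from typing import Iterable


def scheme_for(registry: str, insecure_registries: Iterable[str]) -> str:
    """Pick the URL scheme for a registry host (hypothetical helper).

    Registries on the exception list are contacted over plain HTTP;
    every other registry is contacted over HTTPS.
    """
    return "http" if registry in set(insecure_registries) else "https"


def registry_base_url(registry: str, insecure_registries: Iterable[str]) -> str:
    """Build the /v2/ base URL that the image pull would start from."""
    return f"{scheme_for(registry, insecure_registries)}://{registry}/v2/"


# The Harbor address from this thread, with and without the exception.
print(registry_base_url("192.168.0.211:8088", ["192.168.0.211:8088"]))
print(registry_base_url("192.168.0.211:8088", []))
```

With the registry on the list the URL starts with http://; without it the client would try https://, which matches the https://192.168.0.211:8088/v2/ URL in the earlier error log.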

@JesseStutler
Author

Then I added the flag "--insecure-registries" to krustlet-wasi, but the log still reported some errors:

[2021-06-08T02:45:37Z ERROR kubelet::state::common::image_pull] Failed to decode registry token from auth request

    Caused by:
        duplicate field `token` at line 1 column 1158

Why does it report a duplicate field `token`?

JesseStutler reopened this Jun 8, 2021

@JesseStutler
Author

When I use Harbor over HTTPS, I get an "unable to get local issuer certificate" error:

[2021-06-08T05:34:56Z ERROR kubelet::state::common::image_pull] error sending request for url (https://harbor.hwcloud.nuaa/v2/): error trying to connect: error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed:../ssl/statem/statem_clnt.c:1924: (unable to get local issuer certificate)

    Caused by:
        0: error trying to connect: error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed:../ssl/statem/statem_clnt.c:1924: (unable to get local issuer certificate)
        1: error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed:../ssl/statem/statem_clnt.c:1924: (unable to get local issuer certificate)
        2: error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed:../ssl/statem/statem_clnt.c:1924

Where should I put Harbor's crt, key, and ca.crt on the Krustlet node so that Krustlet can verify that the registry is safe, similar to Docker's /etc/docker/certs.d/yourdomain.com/ path?

@bacongobbler
Collaborator

bacongobbler commented Jun 8, 2021

https://github.com/deislabs/krustlet/blob/8f7d24a19c9443a25f84b95653f0eec7f7c7f24e/crates/oci-distribution/src/client.rs#L136

doesn't look like it is hooked up anywhere. Feel like working on a contribution? :)

bacongobbler added the question label Jun 8, 2021

@bacongobbler
Collaborator

Alternatively, you can have a look through https://docs.rs/reqwest/0.11.3/reqwest/#tls. It looks like the system default chain certificate could be used as a workaround.
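
To check whether the registry's CA is actually in the system trust store, the "unable to get local issuer certificate" failure can be reproduced outside Krustlet. Below is a minimal Python sketch using only the standard library. `build_tls_context` and `probe_registry` are hypothetical helper names, and any CA bundle path passed in is a placeholder; this says nothing about how Krustlet or reqwest load certificates.

```python
import socket
import ssl
from typing import Optional


def build_tls_context(ca_bundle: Optional[str] = None) -> ssl.SSLContext:
    """Build a verifying TLS client context.

    With ca_bundle=None the system default trust store is used;
    otherwise only the CAs in the given PEM file are trusted.
    """
    return ssl.create_default_context(cafile=ca_bundle)


def probe_registry(host: str, port: int = 443,
                   ca_bundle: Optional[str] = None) -> Optional[str]:
    """Attempt a TLS handshake and return the negotiated protocol version.

    Raises ssl.SSLCertVerificationError when the server certificate's
    issuer is not trusted by the chosen trust store.
    """
    context = build_tls_context(ca_bundle)
    with socket.create_connection((host, port), timeout=5) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version()
```

Calling `probe_registry` with the registry host first without a bundle, and then with `ca_bundle` pointing at the registry's CA file, shows whether the handshake only succeeds once the CA is supplied explicitly. If so, adding that CA to the node's system trust store is one way to try the workaround suggested above.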

@JesseStutler
Author

I'm a little confused. Do only self-hosted container registries such as Harbor cause this problem? Has anyone else run into the "unable to get local issuer certificate" error before?

@JesseStutler
Author

> Then I added the flag "--insecure-registries" to krustlet-wasi, but the log still reported some errors:
>
> [2021-06-08T02:45:37Z ERROR kubelet::state::common::image_pull] Failed to decode registry token from auth request
>
>     Caused by:
>         duplicate field `token` at line 1 column 1158
>
> Why does it cause duplicate field token?

Anyway, how can I solve this HTTP error? I would like to get a pod created successfully first; I am still stuck in the introduction part 😞

@wanglilin0628

wanglilin0628 commented Jun 28, 2021

I got the same error and am stuck in the introduction too. When I run krustlet-wasi --node-ip 172.17.0.1 --bootstrap-file=~/.krustlet/config/bootstrap.conf, it reports:

[2021-06-28T02:40:42Z ERROR kubelet::state::common::image_pull] error decoding response body: expected value at line 2 column 1
    Caused by:
        expected value at line 2 column 1 

[2021-06-28T02:41:30Z ERROR kubelet::state::common::image_pull] error sending request for url (https://docker.io/v2/kindest/kindnetd/manifests/v20210326-1e038dc5): connection closed before message completed
    Caused by:
        connection closed before message completed

The node status:

NAME                    STATUS   ROLES                  AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE       KERNEL-VERSION           CONTAINER-RUNTIME
kind-control-plane      Ready    control-plane,master   24m   v1.21.1   172.18.0.2    <none>        Ubuntu 21.04   3.10.0-1062.el7.x86_64   containerd://1.5.2
localhost.localdomain   Ready    <none>                 34s   0.7.0     172.17.0.1    <none>        <unknown>      <unknown>                mvp

The pod "kindnet-xxx" stays in Registered. Running "kubectl describe pod xxx -n xxx" shows:

Name:           kindnet-mzxxg
Namespace:      kube-system
Priority:       0
Node:           localhost.localdomain/
Labels:         app=kindnet
                controller-revision-hash=5b547684d9
                k8s-app=kindnet
                pod-template-generation=1
                tier=node
Annotations:    <none>
Status:         Pending
Reason:         ImagePullBackoff
Message:        ImagePullBackoff
IP:             
IPs:            <none>
Controlled By:  DaemonSet/kindnet
Containers:
  kindnet-cni:
    Container ID:   
    Image:          docker.io/kindest/kindnetd:v20210326-1e038dc5
    Image ID:       
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       Registered
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:     100m
      memory:  50Mi
    Requests:
      cpu:     100m
      memory:  50Mi
    Environment:
      HOST_IP:                  (v1:status.hostIP)
      POD_IP:                   (v1:status.podIP)
      POD_SUBNET:              10.244.0.0/16
      CONTROL_PLANE_ENDPOINT:  kind-control-plane:6443
    Mounts:
      /etc/cni/net.d from cni-cfg (rw)
      /lib/modules from lib-modules (ro)
      /run/xtables.lock from xtables-lock (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-fkt9s (ro)
Conditions:
  Type           Status
  PodScheduled   True 
Volumes:
  cni-cfg:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/cni/net.d
    HostPathType:  
  xtables-lock:
    Type:          HostPath (bare host directory volume)
    Path:          /run/xtables.lock
    HostPathType:  FileOrCreate
  lib-modules:
    Type:          HostPath (bare host directory volume)
    Path:          /lib/modules
    HostPathType:  
  kube-api-access-fkt9s:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Guaranteed
Node-Selectors:              <none>
Tolerations:                 op=Exists
                             node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                             node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/network-unavailable:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists
                             node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                             node.kubernetes.io/unreachable:NoExecute op=Exists
                             node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type    Reason     Age    From               Message
  ----    ------     ----   ----               -------
  Normal  Scheduled  3m58s  default-scheduler  Successfully assigned kube-system/kindnet-mzxxg to localhost.localdomain

Running "kubectl logs xxx" shows:

Error from server: Get "https://172.17.0.1:3000/containerLogs/kube-system/kindnet-mzxxg/kindnet-cni?previous=true": dial tcp 172.17.0.1:3000: connect: no route to host

@mayocream

mayocream commented Jul 8, 2021

Same error here, and the pod is stuck in Registered:

 krustlet-wasi[20706]: [2021-07-08T08:31:46Z DEBUG hyper::proto::h1::conn] incoming body is content-length (950 bytes)
 krustlet-wasi[20706]: [2021-07-08T08:31:46Z DEBUG hyper::proto::h1::conn] incoming body completed
 krustlet-wasi[20706]: [2021-07-08T08:31:46Z DEBUG hyper::client::pool] pooling idle connection for ("https", harbor....)
 krustlet-wasi[20706]: [2021-07-08T08:31:46Z DEBUG reqwest::async_impl::client] response '200 OK' for https://harbor..../service/token?scope=repository%3Abifrost%2Fhello-wasm%3Apull&service=harbor-registry
 krustlet-wasi[20706]: [2021-07-08T08:31:46Z DEBUG oci_distribution::client] Received response from auth request: {"token":"...","access_token":"","expires_in":1800,"issued_at":"2021-07-08T08:31:46Z"}
 krustlet-wasi[20706]: [2021-07-08T08:31:46Z ERROR kubelet::state::common::image_pull] Failed to decode registry token from auth request
 krustlet-wasi[20706]:     
 krustlet-wasi[20706]:     Caused by:
 krustlet-wasi[20706]:         duplicate field `token` at line 1 column 1129
 krustlet-wasi[20706]: [2021-07-08T08:31:46Z DEBUG krator::state] State::status
 krustlet-wasi[20706]: [2021-07-08T08:31:46Z DEBUG krator::state] Applying status patch to object. name=wasm-hello patch={"metadata":{"resourceVersion":""},"status":{"phase":"Pending","message":"ImagePullBackoff","reason":"ImagePullBackoff"}}
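
The auth response in the log above contains both a `token` and an `access_token` field. One plausible reading, which is an assumption and not confirmed in this thread, is that the decoder maps both keys onto a single field and rejects the response when both are present. A minimal Python sketch of a decoder that tolerates both keys follows; `extract_bearer_token` is a hypothetical helper and not part of Krustlet or oci-distribution.

```python
import json


def extract_bearer_token(body: str) -> str:
    """Return the bearer token from a registry auth response body.

    Accepts responses that carry `token`, `access_token`, or both,
    and prefers the first non-empty value in that order.
    """
    payload = json.loads(body)
    for key in ("token", "access_token"):
        value = payload.get(key)
        if value:
            return value
    raise ValueError("auth response has neither `token` nor `access_token`")


# Same shape as the logged response, with a placeholder token value.
sample = (
    '{"token":"abc","access_token":"","expires_in":1800,'
    '"issued_at":"2021-07-08T08:31:46Z"}'
)
print(extract_bearer_token(sample))
```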

@VishnuJin
Contributor

I'm also facing the same error; the pod is stuck in 'Registered'.

❯ kubectl describe pod krustlet-tutorial
Name:         krustlet-tutorial
Namespace:    default
Priority:     0
Node:         sakthivishnus-mac.local/
Labels:       <none>
Annotations:  <none>
Status:       Pending
Reason:       ImagePullBackoff
Message:      ImagePullBackoff
IP:
IPs:          <none>
Containers:
  krustlet-tutorial:
    Container ID:
    Image:          localhost:5000/krustlet_demo.wasm:v1
    Image ID:
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       Registered
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-crv5b (ro)
Conditions:
  Type           Status
  PodScheduled   True
Volumes:
  kube-api-access-crv5b:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 kubernetes.io/arch=wasm32-wasi:NoExecute
                             kubernetes.io/arch=wasm32-wasi:NoSchedule
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  31m   default-scheduler  Successfully assigned default/krustlet-tutorial to sakthivishnus-mac.local

Krustlet's node logs

❯ krustlet-wasi --node-ip=192.168.46.249 --bootstrap-file=~/.krustlet/config/bootstrap.conf
Aug 03 10:02:15.368 ERROR kubelet::state::common::image_pull: error=unsupported media type: application/vnd.docker.distribution.manifest.list.v2+json
Aug 03 10:02:27.551 ERROR kubelet::state::common::image_pull: error=unsupported media type: application/vnd.docker.distribution.manifest.list.v2+json
Aug 03 10:02:49.937 ERROR kubelet::state::common::image_pull: error=unsupported media type: application/vnd.docker.distribution.manifest.list.v2+json
Aug 03 10:03:31.961 ERROR kubelet::state::common::image_pull: error=unsupported media type: application/vnd.docker.distribution.manifest.list.v2+json
Aug 03 10:07:00.345 ERROR kubelet::state::common::image_pull: error=unsupported media type: application/vnd.docker.distribution.manifest.list.v2+json
Aug 03 10:08:09.451 ERROR kubelet::state::common::image_pull: error=error sending request for url (https://localhost:5000/v2/): error trying to connect: record overflow
Aug 03 10:08:19.477 ERROR kubelet::state::common::image_pull: error=error sending request for url (https://localhost:5000/v2/): error trying to connect: record overflow
Aug 03 10:08:39.509 ERROR kubelet::state::common::image_pull: error=error sending request for url (https://localhost:5000/v2/): error trying to connect: record overflow
Aug 03 10:09:19.548 ERROR kubelet::state::common::image_pull: error=error sending request for url (https://localhost:5000/v2/): error trying to connect: record overflow
Aug 03 10:09:43.063 ERROR kubelet::state::common::image_pull: error=OCI API error: authentication required on https://registry-1.docker.io/v2/kindest/kindnetd/manifests/v20210326-1e038dc5
Aug 03 10:10:39.584 ERROR kubelet::state::common::image_pull: error=error sending request for url (https://localhost:5000/v2/): error trying to connect: record overflow
Aug 03 10:13:19.636 ERROR kubelet::state::common::image_pull: error=error sending request for url (https://localhost:5000/v2/): error trying to connect: record overflow
Aug 03 10:14:45.710 ERROR kubelet::state::common::image_pull: error=OCI API error: authentication required on https://registry-1.docker.io/v2/kindest/kindnetd/manifests/v20210326-1e038dc5

Running on KinD on an M1 Mac, following Krustlet's how-to.

@VishnuJin
Contributor

I was using KinD's registry to store my wasm module. I switched to the latest tag, built from source, and passed the --insecure-registries localhost:5000 flag to krustlet-wasi. With that, I was able to see it working.
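
The image pull errors reported in this thread fall into a few recognisable groups. The Python sketch below maps substrings of the log messages to a likely cause. The table is a reading of this thread, not a confirmed diagnosis for every setup, and `likely_cause` is a hypothetical helper.

```python
from typing import Optional

# (substring of the log message, likely cause as suggested by this thread)
LIKELY_CAUSES = [
    ("wrong version number", "HTTPS request sent to a plain-HTTP registry"),
    ("record overflow", "HTTPS request sent to a plain-HTTP registry"),
    ("unable to get local issuer certificate",
     "registry CA is not trusted on the node"),
    ("duplicate field `token`", "registry auth response could not be decoded"),
    ("unsupported media type",
     "reference is not a wasm module (for example a Docker manifest list)"),
    ("authentication required", "registry requires credentials"),
]


def likely_cause(log_line: str) -> Optional[str]:
    """Return the first matching cause for a log line, or None if unknown."""
    for needle, cause in LIKELY_CAUSES:
        if needle in log_line:
            return cause
    return None


line = ("error sending request for url (https://localhost:5000/v2/): "
        "error trying to connect: record overflow")
print(likely_cause(line))
```

For the two plain-HTTP cases, the fix that worked in this thread was passing the registry to --insecure-registries.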

cdmurph32 added a commit to adobe-platform/krustlet that referenced this issue Aug 13, 2021
AWS ECR (public.ecr.aws), Red Hat (registry.redhat.io), and likely other
registries do not provide the `Docker-Content-Digest` header. As
discussed in krustlet#624, this
header is not required by the OCI
[Spec](https://github.com/opencontainers/distribution-spec/blob/main/spec.md#pull).

Signed-off-by: Colin Murphy <colmurph@adobe.com>
@matiaslareo

Hi all, is pulling images from ghcr.io supported? I am facing the same "ImagePullBackOff" error.
