upgrade to 1.14 seems to have broken my access to private registries #384

Closed
termie opened this issue Mar 26, 2019 · 22 comments

@termie

termie commented Mar 26, 2019

I'm under some restrictions about uploading the tarball from microk8s inspect, but

user@orion-a:~$ microk8s.inspect 
Inspecting services
  Service snap.microk8s.daemon-containerd is running
  Service snap.microk8s.daemon-apiserver is running
  Service snap.microk8s.daemon-proxy is running
  Service snap.microk8s.daemon-kubelet is running
  Service snap.microk8s.daemon-scheduler is running
  Service snap.microk8s.daemon-controller-manager is running
  Service snap.microk8s.daemon-etcd is running

args.tar.gz

We've been using imagePullSecrets successfully on the previous version (1.13.3), but the automatic upgrade to 1.14 both surprised me and seems to have broken the ability to pull from our private registry. (The registry uses https://github.com/cesanta/docker_auth and registry:2.)

This caused all of our microk8s installs to fail about 17 hours ago when the snap was released.

I checked this issue: containerd/cri#848, but we are using the modern syntax (dockerconfigjson) and hadn't had any issues prior to yesterday.
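The "modern syntax" mentioned here can be sketched as follows; the registry URL (registry.example.com:5000), secret name (regcred), and credentials are hypothetical placeholders, not taken from this setup:

```shell
# Create the dockerconfigjson secret (run against a live cluster):
#   microk8s.kubectl create secret docker-registry regcred \
#     --docker-server=registry.example.com:5000 \
#     --docker-username=myuser --docker-password=mypassword

# Reference the secret from a pod spec via imagePullSecrets:
cat > pod-with-secret.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: private-image-pod
spec:
  containers:
  - name: app
    image: registry.example.com:5000/app:latest
  imagePullSecrets:
  - name: regcred
EOF
```

`kubectl create secret docker-registry` stores the credentials as a secret of type `kubernetes.io/dockerconfigjson`, which is the syntax containerd/cri#848 refers to as modern.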

@carlososiel

carlososiel commented Mar 26, 2019

It's the same for me. I use the microk8s registry without HTTPS, and kubelet tries to pull from that registry using HTTPS. I configured the insecure-registries options for containerd through containerd-template.toml, but it ignores them.

@termie
Author

termie commented Mar 26, 2019

Well, my registry responds on both https and http, so I don't think that's the issue for me.

@carlososiel

You can try adding authorization. See the docs.

@k33g

k33g commented Mar 27, 2019

Before the update, I just needed to add this entry to /var/snap/microk8s/current/args/docker-daemon.json:

 "insecure-registries" : ["localhost:32000", "registry.test:5000"]

to be able to use a remote insecure registry.

Since the update, even with this, MicroK8s now tries to pull images over https instead of http:

Failed to pull image "registry.test:5000/first:latest": rpc error: code = Unknown 
desc = failed to resolve image "registry.test:5000/first:latest": no available registry 
endpoint: failed to do request: Head https://registry.test:5000/v2/first/manifests/latest: 
http: server gave HTTP response to HTTPS client
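For context: the 1.14 snap replaced dockerd with containerd as the container runtime, so docker-daemon.json is no longer consulted. A sketch of the rough containerd equivalent for the registry above, assuming the mirror key must match the registry host exactly as image references use it (an assumption based on later comments in this thread):

```toml
# /var/snap/microk8s/current/args/containerd-template.toml
[plugins.cri.registry.mirrors."registry.test:5000"]
  endpoint = ["http://registry.test:5000"]
```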

@maltebaumann

maltebaumann commented Mar 27, 2019

I'm seeing the same issues within my Polyaxon deployment:

Failed to pull image "127.0.0.1:31813/labs-image-matting_12:0b6fab77fde74129861bb2bbaa1f8e97": rpc error: code = Unknown 
desc = failed to resolve image "127.0.0.1:31813/labs-image-matting_12:0b6fab77fde74129861bb2bbaa1f8e97": no available registry 
endpoint: failed to do request: Head https://127.0.0.1:31813/v2/labs-image-matting_12/manifests/0b6fab77fde74129861bb2bbaa1f8e97: 
http: server gave HTTP response to HTTPS client

I've added 127.0.0.1:31813 to the containerd.toml, but that doesn't change anything. Any help would be greatly appreciated. The registry is a pod running within my Kubernetes cluster.

@carlososiel

@maltebaumann Try adding the configuration to containerd-template.toml and restarting the containerd daemon.

@termie
Author

termie commented Mar 27, 2019

FWIW, to the folks saying "add the configuration to the containerd config": that's not how Kubernetes works; I need to be able to set the values using secrets.

@ktsakalozos
Member

Since integration with registries is producing friction, we could try to have some command/addon that would automate the process of adding/configuring/removing registries (including the management of secrets).

I would also kindly ask that when you report an issue you include instructions on how to reproduce the error you are seeing. Please do not assume I know anything about your setup, e.g. how the registry you have was set up. At this point I can only point you to the following resources:

@maltebaumann

maltebaumann commented Mar 28, 2019

Hey @ktsakalozos, thanks for getting back and the awesome help on Slack. I managed to reproduce the exact behaviour I'm seeing in a multipass VM on my MacBook. I'll describe the steps below:

Test env: macOS 10.14.2 Mojave, MacBook Pro (13-inch, 2018)

Microk8s setup in VM using multipass (4 CPUs is important for Polyaxon installation to succeed):

$ multipass launch --name microk8s-vm --mem 4G --disk 40G --cpus 4 
$ multipass exec microk8s-vm -- sudo snap install microk8s --classic
$ multipass exec microk8s-vm -- sudo iptables -P FORWARD ACCEPT

$ multipass mount ./polyaxon microk8s-vm:/polyaxon # Contains my polyaxon config, attached below

Enter VM using $ multipass shell microk8s-vm, then enable plugins: $ microk8s.enable dashboard dns storage and follow Polyaxon setup guide:

$ curl https://raw.githubusercontent.com/kubernetes/helm/master/scripts/get | bash
$ microk8s.kubectl --namespace kube-system create sa tiller
$ microk8s.kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount=kube-system:tiller
$ helm init --service-account tiller

Add --allow-privileged to kubelet & kube-apiserver by editing /var/snap/microk8s/current/args/kubelet and /var/snap/microk8s/current/args/kube-apiserver. Restart both services afterwards:

sudo systemctl restart snap.microk8s.daemon-apiserver
sudo systemctl restart snap.microk8s.daemon-kubelet

Proceed with Polyaxon installation:

$ helm repo add polyaxon https://charts.polyaxon.com
$ helm repo update
$ microk8s.kubectl create namespace polyaxon
$ helm install polyaxon/polyaxon --name=polyaxon --namespace=polyaxon -f /polyaxon/debug_polyaxon-config.yml --timeout=5000

The running Polyaxon installation now provides an internal registry at 127.0.0.1:31813. To test it, I followed the steps from #382 (comment):

$ apt-get install docker.io
$ sudo docker pull busybox
$ sudo docker tag busybox 127.0.0.1:31813/my-busybox
$ sudo docker push 127.0.0.1:31813/my-busybox
$ cat > bbox.yaml 
apiVersion: v1
kind: Pod
metadata:
  name: busybox
  namespace: default
spec:
  containers:
  - name: busybox
    image: 127.0.0.1:31813/my-busybox
    command:
      - sleep
      - "3600"
    imagePullPolicy: IfNotPresent
  restartPolicy: Always
$ microk8s.kubectl apply -f ./bbox.yaml
$ microk8s.kubectl describe pod busybox

This yields the error message I'm seeing on my production cluster as well:

Events:
  Type     Reason     Age                From                  Message
  ----     ------     ----               ----                  -------
  Normal   Scheduled  43s                default-scheduler     Successfully assigned default/busybox to microk8s-vm
  Normal   BackOff    15s (x2 over 41s)  kubelet, microk8s-vm  Back-off pulling image "127.0.0.1:31813/my-busybox"
  Warning  Failed     15s (x2 over 41s)  kubelet, microk8s-vm  Error: ImagePullBackOff
  Normal   Pulling    1s (x3 over 42s)   kubelet, microk8s-vm  Pulling image "127.0.0.1:31813/my-busybox"
  Warning  Failed     1s (x3 over 42s)   kubelet, microk8s-vm  Failed to pull image "127.0.0.1:31813/my-busybox": rpc error: code = Unknown desc = failed to resolve image "127.0.0.1:31813/my-busybox:latest": no available registry endpoint: failed to do request: Head https://127.0.0.1:31813/v2/my-busybox/manifests/latest: http: server gave HTTP response to HTTPS client
  Warning  Failed     1s (x3 over 42s)   kubelet, microk8s-vm  Error: ErrImagePull

So I edit /var/snap/microk8s/current/args/containerd-template.toml and add the following lines (giving great attention to tabs and spaces 😉):

        [plugins.cri.registry.mirrors."test.insecure-registry.io"]
          endpoint = ["http://127.0.0.1:31813"]

Then I restart the cluster: microk8s.stop followed by microk8s.start and verify that my changes were copied to containerd.toml.

Inspecting the pod again, the issue remains:

  Normal   BackOff         19s                   kubelet, microk8s-vm  Back-off pulling image "127.0.0.1:31813/my-busybox"
  Warning  Failed          19s                   kubelet, microk8s-vm  Error: ImagePullBackOff
  Normal   Pulling         7s (x2 over 27s)      kubelet, microk8s-vm  Pulling image "127.0.0.1:31813/my-busybox"
  Warning  Failed          7s (x2 over 20s)      kubelet, microk8s-vm  Error: ErrImagePull
  Warning  Failed          7s                    kubelet, microk8s-vm  Failed to pull image "127.0.0.1:31813/my-busybox": rpc error: code = Unknown desc = failed to resolve image "127.0.0.1:31813/my-busybox:latest": no available registry endpoint: failed to do request: Head https://127.0.0.1:31813/v2/my-busybox/manifests/latest: http: server gave HTTP response to HTTPS client

To be safe I delete the pod using microk8s.kubectl delete pod busybox and recreate it with microk8s.kubectl apply -f ./bbox.yaml, but the result remains the same:

multipass@microk8s-vm:/polyaxon$ microk8s.kubectl describe pod busybox
Name:               busybox
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               microk8s-vm/192.168.64.2
Start Time:         Thu, 28 Mar 2019 14:03:35 +0100
Labels:             <none>
Annotations:        kubectl.kubernetes.io/last-applied-configuration:
                      {"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"name":"busybox","namespace":"default"},"spec":{"containers":[{"command":["sl...
Status:             Pending
IP:                 10.1.1.42
Containers:
  busybox:
    Container ID:  
    Image:         127.0.0.1:31813/my-busybox
    Image ID:      
    Port:          <none>
    Host Port:     <none>
    Command:
      sleep
      3600
    State:          Waiting
      Reason:       ErrImagePull
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-m8p5z (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  default-token-m8p5z:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-m8p5z
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age   From                  Message
  ----     ------     ----  ----                  -------
  Normal   Scheduled  5s    default-scheduler     Successfully assigned default/busybox to microk8s-vm
  Normal   Pulling    4s    kubelet, microk8s-vm  Pulling image "127.0.0.1:31813/my-busybox"
  Warning  Failed     4s    kubelet, microk8s-vm  Failed to pull image "127.0.0.1:31813/my-busybox": rpc error: code = Unknown desc = failed to resolve image "127.0.0.1:31813/my-busybox:latest": no available registry endpoint: failed to do request: Head https://127.0.0.1:31813/v2/my-busybox/manifests/latest: http: server gave HTTP response to HTTPS client
  Warning  Failed     4s    kubelet, microk8s-vm  Error: ErrImagePull
  Normal   BackOff    4s    kubelet, microk8s-vm  Back-off pulling image "127.0.0.1:31813/my-busybox"
  Warning  Failed     4s    kubelet, microk8s-vm  Error: ImagePullBackOff

Edit: Forgot the files:
bbox.yaml.txt
debug_polyaxon-config.yml.txt

@ktsakalozos
Member

@maltebaumann, thank you for the detailed description.

There are two things that look strange. First, you never tell docker you are talking to an insecure registry, yet you can push images there (https://docs.docker.com/registry/insecure/). Second, I was able to correctly reference the my-busybox image as image: localhost:31813/my-busybox while having

[plugins.cri.registry.mirrors."local.insecure-registry.io"]
  endpoint = ["http://127.0.0.1:31813"]

in the containerd configuration.
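The first point can be sketched as a config fragment: docker itself must list the registry as insecure before docker push over plain HTTP will work. The real file lives at /etc/docker/daemon.json; it is written to ./daemon.json below purely for illustration, with the registry address from the comments above:

```shell
# Illustrative only: the real path is /etc/docker/daemon.json (needs root).
cat > ./daemon.json <<'EOF'
{
  "insecure-registries": ["127.0.0.1:31813"]
}
EOF
# Then restart docker so it picks up the change:
#   sudo systemctl restart docker
```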

@maltebaumann

Thanks to your hint I at least managed to get Polyaxon working again:
After inspecting polyaxon/registry.py to find out where the 127.0.0.1 comes from, I adjusted the corresponding values in the global polyaxon-polyaxon-config ConfigMap:
POLYAXON_REGISTRY_IN_CLUSTER = "false" # To force use of HOST & PORT
POLYAXON_REGISTRY_HOST = "localhost"
POLYAXON_REGISTRY_PORT = "31813"

I'm now able to run experiments like I did before, but don't think this is a 'clean' solution.

@ps-feng

ps-feng commented Apr 7, 2019

I too can't access my private registry and am getting https://192.168.3.25:5000/v2/my-busybox/manifests/latest: http: server gave HTTP response to HTTPS client.

My setup is:

  • Ubuntu server 18.04 (192.168.3.25)
  • docker-ce 18.09.4
  • microk8s 1.14.0

And I've got a private registry at port 5000. Note that this registry is not in Kubernetes.

I've edited /etc/docker/daemon.json to include "insecure-registries" : ["192.168.3.25:5000"], so I'm able to:

docker pull busybox
docker tag busybox 192.168.3.25:5000/my-busybox
docker push 192.168.3.25:5000/my-busybox

My /var/snap/microk8s/current/args/containerd-template.toml also includes

[plugins.cri.registry.mirrors."local.insecure-registry.io"]
  endpoint = ["http://192.168.3.25:5000"]

But then when I try to kubectl apply -f bbox.yaml

apiVersion: v1
kind: Pod
metadata:
  name: busybox
  namespace: default
spec:
  containers:
  - name: busybox
    image: 192.168.3.25:5000/my-busybox
    command:
      - sleep
      - "3600"
    imagePullPolicy: IfNotPresent
  restartPolicy: Always

I get the dreaded
Failed to pull image "192.168.3.25:5000/my-busybox:latest": rpc error: code = Unknown desc = failed to resolve image "192.168.3.25:5000/my-busybox:latest": no available registry endpoint: failed to do request: Head https://192.168.3.25:5000/v2/my-busybox/manifests/latest: http: server gave HTTP response to HTTPS client

After investigating for a while and going through containerd/containerd#2758, I tried doing

$ microk8s.ctr image pull 192.168.3.25:5000/my-busybox:latest
ctr: failed to resolve reference "192.168.3.25:5000/my-busybox:latest": failed to do request: Head https://192.168.3.25:5000/v2/my-busybox/manifests/latest: http: server gave HTTP response to HTTPS client

But then if I try with --plain-http it works!

$ microk8s.ctr image pull --plain-http=true 192.168.3.25:5000/my-busybox:latest
192.168.3.25:5000/my-busybox:latest:                                              resolved       |++++++++++++++++++++++++++++++++++++++|
manifest-sha256:f79f7a10302c402c052973e3fa42be0344ae6453245669783a9e16da3d56d5b4: done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:fc1a6b909f82ce4b72204198d49de3aaf757b3ab2bb823cb6e47c416b97c5985:    done           |++++++++++++++++++++++++++++++++++++++|
config-sha256:af2f74c517aac1d26793a6ed05ff45b299a037e1a9eefeae5eacda133e70a825:   done           |++++++++++++++++++++++++++++++++++++++|
elapsed: 0.3 s                                                                    total:  738.6  (2.4 MiB/s)
unpacking linux/amd64 sha256:f79f7a10302c402c052973e3fa42be0344ae6453245669783a9e16da3d56d5b4...
done

So it seems containerd within microk8s is not checking the insecure repository mirror entry. Is this a bug or is there anything else I could do?

@ps-feng

ps-feng commented Apr 7, 2019

I got this to work!

I checked cri's source code and saw this:

for _, e := range c.config.Registry.Mirrors[refspec.Hostname()].Endpoints

If we trace that further, the hostname comes from the 'pull image request'. I wasn't entirely sure what that is but I figured it could very well be the hostname of the image repository, so then I tried editing /var/snap/microk8s/current/args/containerd-template.toml with:

[plugins.cri.registry.mirrors."192.168.3.25:5000"]
  endpoint = ["http://192.168.3.25:5000"]

Restarted the service, tried deploying bbox.yaml again and it worked!

Bottom line: it seems we can't put just any arbitrary text after plugins.cri.registry.mirrors.; the key has to match the host of the registry.

EDIT: after further investigation, I can confirm that it works as stated above. ParseNormalizedName converts the image name into a fully qualified one that Docker can use unambiguously. When there's no host name, docker.io is prepended to the name, hence the default [plugins.cri.registry.mirrors."docker.io"]. See test. This makes me wonder about https://github.com/containerd/cri/blob/master/docs/registry.md, which seems misleading: there's no description of test.secure-registry.io and test.insecure-registry.io, and it leads one to think that you can put anything there.
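The key-matching rule described above can be sketched as a small shell function. This is a rough approximation of Docker-style name normalization (the part before the first "/" counts as a registry host only if it contains a dot or a colon, or is "localhost"; otherwise docker.io is assumed), not containerd's actual code:

```shell
# Map an image reference to the mirror key containerd would look up
# (rough sketch of the normalization logic, not the real implementation).
mirror_key() {
  ref="$1"
  case "$ref" in
    */*) first="${ref%%/*}" ;;          # take everything before the first /
    *)   echo "docker.io"; return ;;    # bare name: docker.io is assumed
  esac
  case "$first" in
    *.*|*:*|localhost) echo "$first" ;; # looks like a host: use it verbatim
    *) echo "docker.io" ;;              # path component, not a host
  esac
}

mirror_key "192.168.3.25:5000/my-busybox:latest"  # -> 192.168.3.25:5000
mirror_key "busybox"                              # -> docker.io
```

So the entry [plugins.cri.registry.mirrors."192.168.3.25:5000"] matches because its key equals what the image reference resolves to.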

@jacksontj

jacksontj commented Apr 23, 2019

For anyone else hitting this (and waiting on a more long-term fix), @ps-feng's comment fixes it. If you are unfamiliar with what to restart: I edited the template and then ran sudo systemctl restart snap.microk8s.daemon-containerd.service; after that it's able to pull.

As expressed by others, this is really a poor upgrade experience :/ (not to mention still broken without manual changes 😱 )

@wdiestel

Thanks @jacksontj. By adding the --plain-http=true switch I can now pull the image with ctr.

@wittlesouth

@ps-feng's comment seems to work to get the container accessed via http. Unfortunately, I also seem to have a problem with a clean install: any requests to service ports via "localhost:" fail from the local machine (they hang with no response). I'm running microk8s in a VM. I am up and running at the moment by ensuring that the VM's DNS name from my laptop's /etc/hosts file is also assigned to the interface IP (not the loopback address) in the VM's /etc/hosts. Once the hostname is the same both inside and outside the VM, I can replace "localhost" with the host name in @ps-feng's fix.

My /etc/hosts entry in OS X:

192.168.119.3 wsv-dev.wittlesouth.local traefik-ui.wittlesouth.local kube.roadmapsftw.local registry.wittlesouth.local pypi.wittlesouth.local

My /etc/hosts entry in the VM:

192.168.119.3 wsv-dev.wittlesouth.local registry.wittlesouth.local

The edited section of containerd-template.toml

    [plugins.cri.registry]
      [plugins.cri.registry.mirrors]
        [plugins.cri.registry.mirrors."docker.io"]
          endpoint = ["https://registry-1.docker.io"]
        [plugins.cri.registry.mirrors."registry.wittlesouth.local:32000"]
          endpoint = ["http://registry.wittlesouth.local:32000"]

I've mucked with my VM configuration some, so conceivably I caused the problem there. However, I have a clean install on a physical node on my network, and if I enable the registry service and try to access it via 'localhost:32000' there, it fails as well. I can get a response from that port from off the box, but not on the box via localhost. Probably a separate problem, but I'm sharing the above fix here in case it helps someone else.

@mludvig

mludvig commented Jun 15, 2019

I struggled with the same; here are a few more tips that helped me resolve it:

  1. Make sure you enabled inter-pod communication:

    sudo ufw allow in on cbr0
    sudo ufw allow out on cbr0
    sudo ufw default allow routed
    
  2. Find out the ClusterIP / port of the repository:

    ~$ microk8s.kubectl --namespace=container-registry get service/registry 
    NAME       TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
    registry   NodePort   10.152.183.185   <none>        5000:32000/TCP   17h
    

    In my case the registry is on 10.152.183.185:5000.

  3. Add it as insecure registry to /var/snap/microk8s/current/args/containerd-template.toml

    [plugins.cri.registry]
      [plugins.cri.registry.mirrors]
        [...]
        [plugins.cri.registry.mirrors."10.152.183.185:5000"]
          endpoint = ["http://10.152.183.185:5000"]
    

    Then restart microk8s with microk8s.stop && microk8s.start or better yet reboot the whole VM.

  4. Use that ClusterIP:Port in your manifests:

    [...]
      containers:
      - name: nginx
        image: 10.152.183.185:5000/mynginx:koumak
    

This is what worked for me.

@lazzarello
Contributor

lazzarello commented Aug 23, 2019

After a long road on my end, it appears this was all user error. I can pull images when the imagePullSecrets key is set to the correct name. I believe this issue can be closed when @termie gets to it.

I have configured microk8s 1.15.2 to use the docker socket on localhost. I have configured Kubernetes to use the secret named .dockerconfigjson, as recommended by @ktsakalozos.

If I manually log in in an interactive shell and pull images, microk8s can create pods backed by the image. But microk8s cannot pull images using the secret I have provided. This secret could pull images successfully in version 1.13.

@akinsella

akinsella commented Nov 11, 2019

Please note that if you have a username/password defined for your private registry, you will also have to add an auths section to the containerd.toml file. Defining a mirror is not enough. An example:

  [plugins.cri.registry.auths] 
    [plugins.cri.registry.auths."http://registry.my-private-domain-name.com:5000"]
      username = "test"
      password = "test"

Solution comes from this stackoverflow answer: https://stackoverflow.com/a/56788222

If you do not define auth info for the mirror in the auths section, you will continue to receive the HTTPS client error http: server gave HTTP response to HTTPS client, which can be misleading.

@Richard87
Contributor

Hi!

I am also having issues using the insecure registry that microk8s provides...

I have edited /var/snap/microk8s/current/args/containerd-template.toml so it contains this:

    [plugins.cri.registry]
      [plugins.cri.registry.mirrors]
        [plugins.cri.registry.mirrors."docker.io"]
          endpoint = ["https://registry-1.docker.io"]
        [plugins.cri.registry.mirrors."localhost:32000"]
          endpoint = ["http://localhost:32000"]
      [plugins.cri.registry.auths]
        [plugins.cri.registry.auths."localhost:32000"]
          username = "test"
          password = "test"

(I have also added the insecure registry to ~/.docker/config.json and to skaffold.yaml)

I also ran microk8s.inspect and it told me to add insecure registries to /etc/docker/daemon.json:

{
    "insecure-registries" : ["localhost:32000"] 
}

But I got the same error:

rpc error: code = Unknown desc = failed to resolve image "127.0.0.1:32000/eportal@sha256:4d959ba6ccb9aaed3b99042fcf8305311d8e281679e9a75fce0ed61e6c5e4de6": no available registry endpoint: failed to do request: Head https://127.0.0.1:32000/v2/eportal/manifests/sha256:4d959ba6ccb9aaed3b99042fcf8305311d8e281679e9a75fce0ed61e6c5e4de6: http: server gave HTTP response to HTTPS client

Kubernetes version:

Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.0", GitCommit:"70132b0f130acc0bed193d9ba59dd186f0e634cf", GitTreeState:"clean", BuildDate:"2019-12-07T21:12:17Z", GoVersion:"go1.13.4", Compiler:"gc", Platform:"linux/amd64"}
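One thing that stands out here: the error message references 127.0.0.1:32000, while the mirror key configured above is localhost:32000. Following the key-matching behaviour described earlier in this thread, a mirror entry for the exact host appearing in the image reference may be needed (a sketch, not verified against this setup):

```toml
[plugins.cri.registry.mirrors."127.0.0.1:32000"]
  endpoint = ["http://127.0.0.1:32000"]
```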

@hyacin75

I had some painful issues with this and couldn't really find the solution online anywhere, so I had to piece it together. In case anyone comes along with the same issue of being unable to pull images from a self-signed Docker registry: I had to modify the [plugins."io.containerd.grpc.v1.cri".registry] section of /var/snap/microk8s/current/args/containerd-template.toml to skip verification, like so:

  # 'plugins."io.containerd.grpc.v1.cri".registry' contains config related to the registry
  [plugins."io.containerd.grpc.v1.cri".registry]
    [plugins."io.containerd.grpc.v1.cri".registry.configs."my.private.registry:5000".tls]
      insecure_skip_verify = true

@stale

stale bot commented Jul 24, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the inactive label Jul 24, 2021
@stale stale bot closed this as completed Aug 23, 2021