[BACKPORT][v1.6.1][BUG] longhorn manager pod fails to start in container-based K3s #7847

Closed
github-actions bot opened this issue Feb 5, 2024 · 3 comments
Assignees
Labels
area/v1-data-engine (v1 data engine, iSCSI tgt), component/longhorn-manager (Longhorn manager, control plane), kind/backport (backport request), kind/bug
Milestone

Comments


github-actions bot commented Feb 5, 2024

backport #5693

github-actions bot added the area/v1-data-engine, component/longhorn-manager, kind/backport, and kind/bug labels Feb 5, 2024
github-actions bot added this to the v1.6.1 milestone Feb 5, 2024

longhorn-io-github-bot commented Feb 19, 2024

Pre Ready-For-Testing Checklist

  • Where is the reproduce steps/test steps documented?
    The reproduce steps/test steps are at:

PRs:
v1.6.x

Contributor

chriscchien commented Mar 7, 2024

I set up k3s in an Ubuntu container and then installed Longhorn v1.6.x, v1.6.0, and v1.4.0 in that container. With every version, longhorn-manager kept running; I did not observe CrashLoopBackOff on the longhorn-manager pod.

I then tried a container with an Alpine 3.16 base image on an Alpine 3.16 machine (following the steps in the original ticket), but ran into issues while setting up k3s:

  • The command curl -sfL https://get.k3s.io | sh -s - --write-kubeconfig-mode 644 needs systemd, which Alpine does not use, so the installation could not proceed.
  • Next I tried installing k3s from the binary in the Alpine container. Currently, after running k3s server, I eventually get FATA[0000] failed to find memory cgroup (v2). I am still trying to figure out a way to enable cgroup (v2) inside the container.

Some log output from the command k3s server:

INFO[0000] Running cloud-controller-manager --allocate-node-cidrs=true --authentication-kubeconfig=/var/lib/rancher/k3s/server/cred/cloud-controller.kubeconfig --authorization-kubeconfig=/var/lib/rancher/k3s/server/cred/cloud-controller.kubeconfig --bind-address=127.0.0.1 --cloud-config=/var/lib/rancher/k3s/server/etc/cloud-config.yaml --cloud-provider=k3s --cluster-cidr=10.42.0.0/16 --configure-cloud-routes=false --controllers=*,-route --feature-gates=CloudDualStackNodeIPs=true --kubeconfig=/var/lib/rancher/k3s/server/cred/cloud-controller.kubeconfig --leader-elect=false --leader-elect-resource-name=k3s-cloud-controller-manager --node-status-update-frequency=1m0s --profiling=false 
I0307 09:32:16.384540      13 server.go:156] Version: v1.28.7+k3s1
I0307 09:32:16.384576      13 server.go:158] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK=""
INFO[0000] Server node token is available at /var/lib/rancher/k3s/server/token 
INFO[0000] To join server node to cluster: k3s server -s https://172.17.0.2:6443 -t ${SERVER_NODE_TOKEN} 
INFO[0000] Agent node token is available at /var/lib/rancher/k3s/server/agent-token 
INFO[0000] To join agent node to cluster: k3s agent -s https://172.17.0.2:6443 -t ${AGENT_NODE_TOKEN} 
INFO[0000] Wrote kubeconfig /etc/rancher/k3s/k3s.yaml   
INFO[0000] Run: k3s kubectl                             
FATA[0000] failed to find memory cgroup (v2)
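
For anyone hitting the same FATA line, a quick way to see whether the container exposes the cgroup v2 memory controller is to look at the controllers file of the unified hierarchy. The snippet below is only a minimal diagnostic sketch: it assumes cgroup v2 is mounted at /sys/fs/cgroup (the usual location), and the helper name has_cgroup_v2_memory is made up for this example. It is not the check k3s itself runs.

```python
from pathlib import Path


def has_cgroup_v2_memory(controllers_file="/sys/fs/cgroup/cgroup.controllers"):
    """Return True if the cgroup v2 controllers file lists 'memory'.

    Assumption: cgroup v2 is mounted at /sys/fs/cgroup. If the file does
    not exist (cgroup v1 only, or nothing mounted), this returns False.
    """
    path = Path(controllers_file)
    if not path.is_file():
        return False
    # The file holds a space-separated list such as "cpuset cpu io memory pids".
    return "memory" in path.read_text().split()


if __name__ == "__main__":
    print("cgroup v2 memory controller available:", has_cgroup_v2_memory())
```

If this prints False inside the container but True on the host, the memory controller is probably not being delegated to the container, which would be consistent with the error above.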

@chriscchien
Contributor

Verified pass on longhorn v1.6.x (longhorn-engine 479021, longhorn-instance-manager f8a921)

Using a custom k3d image (chanow/k3d:iscsi) that has iscsiadm installed, I deployed longhorn v1.6.1-rc1, and all pods are running correctly.

/ # k get nodes -o wide
NAME                STATUS   ROLES                  AGE   VERSION        INTERNAL-IP   EXTERNAL-IP   OS-IMAGE           KERNEL-VERSION                 CONTAINER-RUNTIME
k3d-prod-server-0   Ready    control-plane,master   10m   v1.29.2+k3s1   172.19.0.2    <none>        K3s v1.29.2+k3s1   5.14.21-150500.55.44-default   containerd://1.7.11-k3s2
/ # 
/ # k get engineimage -A
NAMESPACE         NAME          INCOMPATIBLE   STATE      IMAGE                                   REFCOUNT   BUILDDATE   AGE
longhorn-system   ei-d882ef59   false          deployed   longhornio/longhorn-engine:v1.6.1-rc1   0          3h7m        8m43s
/ # 
/ # k get pods -A
NAMESPACE         NAME                                                READY   STATUS      RESTARTS   AGE
kube-system       local-path-provisioner-6c86858495-726cj             1/1     Running     0          10m
kube-system       coredns-6799fbcd5-vk2v2                             1/1     Running     0          10m
kube-system       helm-install-traefik-crd-ds78m                      0/1     Completed   0          10m
kube-system       svclb-traefik-0e224e20-c6kpt                        2/2     Running     0          10m
kube-system       helm-install-traefik-9gf5x                          0/1     Completed   1          10m
kube-system       traefik-f4564c4f4-9fqlr                             1/1     Running     0          10m
kube-system       metrics-server-67c658944b-krfcm                     1/1     Running     0          10m
longhorn-system   longhorn-ui-7474bf558c-cfgq2                        1/1     Running     0          8m57s
longhorn-system   longhorn-manager-l9hz5                              1/1     Running     0          8m57s
longhorn-system   longhorn-ui-7474bf558c-l9sst                        1/1     Running     0          8m57s
longhorn-system   longhorn-driver-deployer-75d5f5f7bc-clzkm           1/1     Running     0          8m57s
longhorn-system   instance-manager-5f2590304e90cf6584ace321bc41298e   1/1     Running     0          8m47s
longhorn-system   engine-image-ei-d882ef59-fdvj8                      1/1     Running     0          8m47s
longhorn-system   csi-provisioner-6c78dcb664-brnmp                    1/1     Running     0          8m24s
longhorn-system   csi-provisioner-6c78dcb664-m75sw                    1/1     Running     0          8m24s
longhorn-system   csi-provisioner-6c78dcb664-cdt55                    1/1     Running     0          8m24s
longhorn-system   csi-resizer-7466f7b45f-kfv68                        1/1     Running     0          8m24s
longhorn-system   csi-resizer-7466f7b45f-nmhsh                        1/1     Running     0          8m24s
longhorn-system   csi-resizer-7466f7b45f-r2kpf                        1/1     Running     0          8m24s
longhorn-system   csi-snapshotter-58bf69fbd5-bzjln                    1/1     Running     0          8m24s
longhorn-system   csi-attacher-57689cc84b-x5tjr                       1/1     Running     0          8m24s
longhorn-system   csi-attacher-57689cc84b-4977m                       1/1     Running     0          8m24s
longhorn-system   csi-snapshotter-58bf69fbd5-2hcxr                    1/1     Running     0          8m24s
longhorn-system   csi-snapshotter-58bf69fbd5-77w4h                    1/1     Running     0          8m24s
longhorn-system   csi-attacher-57689cc84b-pn2cg                       1/1     Running     0          8m24s
longhorn-system   longhorn-csi-plugin-7brz7                           3/3     Running     0          8m24s
/ # 
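
When repeating this verification, the pod listing can be checked mechanically instead of by eye. The following is a rough sketch, not part of the Longhorn test suite: it assumes the text comes from kubectl get pods -A (so the first column is the namespace), and the function name unhealthy_pods is hypothetical.

```python
OK_STATUSES = {"Running", "Completed"}


def unhealthy_pods(kubectl_output, namespace=None):
    """Return (namespace, name, status) for pods that look unhealthy.

    Assumes the column layout of `kubectl get pods -A`:
    NAMESPACE NAME READY STATUS RESTARTS AGE.
    """
    flagged = []
    for line in kubectl_output.splitlines():
        fields = line.split()
        # Skip the header, blank lines, and shell prompt lines such as "/ #".
        if len(fields) < 4 or fields[0] == "NAMESPACE":
            continue
        ns, name, ready, status = fields[:4]
        if namespace is not None and ns != namespace:
            continue
        ready_now, _, ready_total = ready.partition("/")
        # A Running pod whose containers are not all ready is also flagged.
        not_ready = status == "Running" and ready_now != ready_total
        if status not in OK_STATUSES or not_ready:
            flagged.append((ns, name, status))
    return flagged
```

Calling unhealthy_pods(output, namespace="longhorn-system") on a listing like the one above should return an empty list; a longhorn-manager pod in CrashLoopBackOff, as described in the original ticket, would show up in the result.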


4 participants