Still connecting to unix:///... #139

Closed
eBeyond opened this issue Aug 7, 2020 · 40 comments

@eBeyond

eBeyond commented Aug 7, 2020

The hcloud-csi nodes and the controller are in crash loops after installing hcloud-csi as described.

The hcloud-csi-driver reports:
level=debug ts=2020-08-07T05:42:01.694472148Z msg="getting instance id from metadata service"

All other pods of the hcloud-csi-controller are reporting:
Still connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock
The pods of the hcloud-csi-nodes are reporting (except the hcloud-csi-driver, which reports the same message as above):
Still connecting to unix:///csi/csi.sock

It's a k8s 1.18.6 cluster running in nbg-dc3 on private networks only. None of the nodes have public network interfaces, but there is a firewall/NAT gateway with everything allowed. IPv6 is disabled on all nodes.

@eBeyond
Author

eBeyond commented Aug 7, 2020

It seems that it's not possible to get an instance ID. The driver calls http://169.254.169.254/2009-04-04/meta-data/instance-id (hardcoded in the code), but that address is not reachable from the k8s nodes within the private network; it's only reachable from the nodes which have a public interface.
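
A quick way to verify this (a sketch; the URL is the one hardcoded in the driver, as mentioned above) is to query the metadata service directly from a private-only node and from a node with a public interface:

# should print the numeric server ID of the node it is run on
curl -s http://169.254.169.254/2009-04-04/meta-data/instance-id
# a timeout means the metadata service is unreachable from this node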

@jekakm

jekakm commented Aug 7, 2020

#131
Take a look at this. If requests are NATed, the metadata server can't recognize which node is making the request and can't mount the volume to the appropriate server.
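
A way to check for this (a sketch, not an official procedure): run the same metadata query on two different nodes; if the traffic is NATed through a single gateway, both will likely return the gateway server's ID instead of their own.

# run on two different nodes and compare the output
curl -s http://169.254.169.254/2009-04-04/meta-data/instance-id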

@eBeyond
Author

eBeyond commented Aug 7, 2020

Okay... that seems to be a way, but I'm not happy with it. Opening a door always makes it accessible from both sides. But it's just a small one... :(

@LKaemmerling
Member

@eBeyond try the current master deployment :) There is a small change in there which first uses the node name for getting the data: https://github.com/hetznercloud/csi-driver/blob/master/deploy/kubernetes/hcloud-csi-master.yml

@eBeyond
Author

eBeyond commented Aug 7, 2020

I've tried the hcloud-csi-master, but it didn't succeed. Now I've tried it with the route and it works as expected. Thanks!

@eBeyond eBeyond closed this as completed Aug 7, 2020
@LKaemmerling
Member

LKaemmerling commented Aug 7, 2020

Actually it should work. Do the nodes in k8s have the same name as in the Cloud Console? Did you apply the new deployment?
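
A quick way to compare the two (a sketch; assumes the hcloud CLI is installed and configured for the project):

# node names as seen by Kubernetes
kubectl get nodes -o name
# server names as seen by the Hetzner Cloud API / Cloud Console
hcloud server list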

@JWDobken

JWDobken commented Aug 18, 2020

I have Kubernetes 1.18.0 running on CX21 instances with Ubuntu 18.04. Following the quick guide, I created the secret and deployed the latest hcloud-csi-master.yml. Deployment of the containers does not give any issue, but according to the logs, the pod hcloud-csi-node-h6cjw running on the master node keeps trying to connect to the socket:

│ csi-node-driver-registrar I0818 08:31:03.924615       1 main.go:120] Attempting to open a gRPC connection with: "/csi/csi.sock"                                                │
│ csi-node-driver-registrar I0818 08:31:03.924646       1 connection.go:151] Connecting to unix:///csi/csi.sock                                                                  │
│ csi-node-driver-registrar W0818 08:31:13.925445       1 connection.go:170] Still connecting to unix:///csi/csi.sock                                                            │
│ csi-node-driver-registrar W0818 08:31:23.925405       1 connection.go:170] Still connecting to unix:///csi/csi.sock                                                            │
│ csi-node-driver-registrar W0818 08:31:33.924873       1 connection.go:170] Still connecting to unix:///csi/csi.sock                                                            │
│ csi-node-driver-registrar W0818 08:31:43.924885       1 connection.go:170] Still connecting to unix:///csi/csi.sock                                                            │
│ csi-node-driver-registrar W0818 08:31:53.925084       1 connection.go:170] Still connecting to unix:///csi/csi.sock                                                            │
│ csi-node-driver-registrar W0818 08:32:03.924902       1 connection.go:170] Still connecting to unix:///csi/csi.sock                                                            │
│ csi-node-driver-registrar W0818 08:32:13.925007       1 connection.go:170] Still connecting to unix:///csi/csi.sock                                                            │
│ csi-node-driver-registrar W0818 08:32:23.924883       1 connection.go:170] Still connecting to unix:///csi/csi.sock                                                            │
│ csi-node-driver-registrar W0818 08:32:33.924853       1 connection.go:170] Still connecting to unix:///csi/csi.sock                                                            │
│ liveness-probe I0818 08:31:11.568778       1 connection.go:151] Connecting to unix:///csi/csi.sock                                                                             │
│ liveness-probe W0818 08:31:21.576747       1 connection.go:170] Still connecting to unix:///csi/csi.sock                                                                       │
│ liveness-probe W0818 08:31:31.569400       1 connection.go:170] Still connecting to unix:///csi/csi.sock                                                                       │
│ liveness-probe W0818 08:31:41.569045       1 connection.go:170] Still connecting to unix:///csi/csi.sock                                                                       │
│ liveness-probe W0818 08:31:51.569540       1 connection.go:170] Still connecting to unix:///csi/csi.sock                                                                       │
│ liveness-probe W0818 08:32:01.569005       1 connection.go:170] Still connecting to unix:///csi/csi.sock                                                                       │
│ liveness-probe W0818 08:32:11.569049       1 connection.go:170] Still connecting to unix:///csi/csi.sock                                                                       │
│ liveness-probe W0818 08:32:21.569052       1 connection.go:170] Still connecting to unix:///csi/csi.sock                                                                       │
│ liveness-probe W0818 08:32:31.569049       1 connection.go:170] Still connecting to unix:///csi/csi.sock                                                                       │
│ liveness-probe W0818 08:32:41.569141       1 connection.go:170] Still connecting to unix:///csi/csi.sock                                                                       │
│ csi-node-driver-registrar W0818 08:32:43.924932       1 connection.go:170] Still connecting to unix:///csi/csi.sock                                                            │
│ hcloud-csi-driver stream closed                                                                                                                                                │
│ liveness-probe W0818 08:32:51.569085       1 connection.go:170] Still connecting to unix:///csi/csi.sock                                                                       │
│ csi-node-driver-registrar W0818 08:32:53.924888       1 connection.go:170] Still connecting to unix:///csi/csi.sock                                                            │
│ liveness-probe W0818 08:33:01.569025       1 connection.go:170] Still connecting to unix:///csi/csi.sock                                                                       │
│ csi-node-driver-registrar W0818 08:33:03.924955       1 connection.go:170] Still connecting to unix:///csi/csi.sock

Same for the controller and the other nodes, just as @eBeyond explained.

@LKaemmerling
Member

@JWDobken what does the controller say?

@JWDobken

> @JWDobken what does the controller say?

│ liveness-probe W0818 08:37:51.353118       1 connection.go:170] Still connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock                                           │
│ liveness-probe W0818 08:38:01.353133       1 connection.go:170] Still connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock                                           │
│ csi-attacher W0818 08:38:04.420338       1 connection.go:172] Still connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock                                             │
│ csi-provisioner W0818 08:38:06.416689       1 connection.go:172] Still connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock                                          │
│ csi-resizer W0818 08:38:09.426785       1 connection.go:170] Still connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock

@LKaemmerling
Member

@JWDobken the controller pod should log more than just this (likely right after it started).

@JWDobken

Full output of the controller just after launch:

 csi-attacher I0818 08:53:34.726025       1 main.go:91] Version: v2.2.0-0-g97411fa7                                                                                             │
│ csi-attacher I0818 08:53:34.728121       1 connection.go:153] Connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock                                                   │
│ csi-attacher W0818 08:53:44.728881       1 connection.go:172] Still connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock                                             │
│ csi-attacher W0818 08:53:54.728511       1 connection.go:172] Still connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock                                             │
│ csi-resizer I0818 08:53:39.820133       1 main.go:61] Version : v0.3.0-0-g150071d                                                                                              │
│ csi-resizer I0818 08:53:39.821187       1 connection.go:151] Connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock                                                    │
│ csi-resizer W0818 08:53:49.821836       1 connection.go:170] Still connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock                                              │
│ csi-provisioner W0818 08:53:46.819849       1 deprecatedflags.go:53] Warning: option provisioner="csi.hetzner.cloud" is deprecated and has no effect                           │
│ csi-provisioner I0818 08:53:46.819919       1 feature_gate.go:243] feature gates: &{map[Topology:true]}                                                                        │
│ csi-provisioner I0818 08:53:46.819942       1 csi-provisioner.go:107] Version: v1.6.0-0-g321fa5c1c                                                                             │
│ csi-provisioner I0818 08:53:46.819957       1 csi-provisioner.go:121] Building kube configs for running in cluster...                                                          │
│ csi-provisioner I0818 08:53:46.834836       1 connection.go:153] Connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock                                                │
│ csi-provisioner W0818 08:53:56.835306       1 connection.go:172] Still connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock                                          │
│ hcloud-csi-driver level=debug ts=2020-08-18T08:53:50.457833279Z msg="getting instance id from metadata service"                                                                │
│ hcloud-csi-driver level=debug ts=2020-08-18T08:53:50.459078735Z msg="fetching server"                                                                                          │
│ liveness-probe I0818 08:53:51.919082       1 connection.go:151] Connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock                                                 │
│ csi-resizer W0818 08:53:59.821414       1 connection.go:170] Still connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock                                              │
│ liveness-probe W0818 08:54:01.919454       1 connection.go:170] Still connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock                                           │
│ csi-attacher W0818 08:54:04.728545       1 connection.go:172] Still connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock

@LKaemmerling
Member

It doesn't look like you are using the latest "master", as it tries to use the metadata service (which is not possible for your setup, based on your comments in #131). Please use: https://github.com/hetznercloud/csi-driver/blob/master/deploy/kubernetes/hcloud-csi-master.yml

@JWDobken

Instead of this line from the quickstart:

kubectl apply -f https://raw.githubusercontent.com/hetznercloud/csi-driver/v1.4.0/deploy/kubernetes/hcloud-csi.yml

I ran:

$ kubectl apply -f https://raw.githubusercontent.com/hetznercloud/csi-driver/master/deploy/kubernetes/hcloud-csi-master.yml
csidriver.storage.k8s.io/csi.hetzner.cloud created
storageclass.storage.k8s.io/hcloud-volumes created
serviceaccount/hcloud-csi created
clusterrole.rbac.authorization.k8s.io/hcloud-csi created
clusterrolebinding.rbac.authorization.k8s.io/hcloud-csi created
statefulset.apps/hcloud-csi-controller created
daemonset.apps/hcloud-csi-node created
service/hcloud-csi-controller-metrics created
service/hcloud-csi-node-metrics created

Logs of the controller:

 csi-attacher I0818 09:05:23.707861       1 main.go:91] Version: v2.2.0-0-g97411fa7                                                                                             │
│ csi-attacher I0818 09:05:23.711141       1 connection.go:153] Connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock                                                   │
│ csi-attacher W0818 09:05:33.711523       1 connection.go:172] Still connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock                                             │
│ csi-attacher W0818 09:05:43.711432       1 connection.go:172] Still connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock                                             │
│ csi-resizer I0818 09:05:29.430596       1 main.go:61] Version : v0.3.0-0-g150071d                                                                                              │
│ csi-resizer I0818 09:05:29.431849       1 connection.go:151] Connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock                                                    │
│ csi-resizer W0818 09:05:39.432659       1 connection.go:170] Still connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock                                              │
│ csi-provisioner W0818 09:05:37.054418       1 deprecatedflags.go:53] Warning: option provisioner="csi.hetzner.cloud" is deprecated and has no effect                           │
│ csi-provisioner I0818 09:05:37.054490       1 feature_gate.go:243] feature gates: &{map[Topology:true]}                                                                        │
│ csi-provisioner I0818 09:05:37.054516       1 csi-provisioner.go:107] Version: v1.6.0-0-g321fa5c1c                                                                             │
│ csi-provisioner I0818 09:05:37.054531       1 csi-provisioner.go:121] Building kube configs for running in cluster...                                                          │
│ csi-provisioner I0818 09:05:37.066302       1 connection.go:153] Connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock                                                │
│ hcloud-csi-driver level=debug ts=2020-08-18T09:05:40.627217379Z msg="getting instance id from metadata service"                                                                │
│ hcloud-csi-driver level=debug ts=2020-08-18T09:05:40.629569322Z msg="fetching server"                                                                                          │
│ liveness-probe I0818 09:05:42.089503       1 connection.go:151] Connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock                                                 │
│ csi-provisioner W0818 09:05:47.066593       1 connection.go:172] Still connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock                                          │
│ csi-resizer W0818 09:05:49.432170       1 connection.go:170] Still connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock

@JWDobken

btw, if I change /etc/network/interfaces.d/61-my-private-network.cfg to:

auto ens10
iface ens10 inet dhcp
        post-up ip route add default via 10.0.0.1
        post-up ip route add 169.254.169.254 dev eth0

and restart the networking service:

$ systemctl status networking.service
 networking.service - Raise network interfaces
   Loaded: loaded (/lib/systemd/system/networking.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Tue 2020-08-18 12:56:42 CEST; 6s ago
     Docs: man:interfaces(5)
  Process: 20424 ExecStop=/sbin/ifdown -a --read-environment --exclude=lo (code=exited, status=0/SUCCESS)
  Process: 20571 ExecStart=/sbin/ifup -a --read-environment (code=exited, status=1/FAILURE)
  Process: 20547 ExecStartPre=/bin/sh -c [ "$CONFIGURE_INTERFACES" != "no" ] && [ -n "$(ifquery --read-environment --list --exclude=lo)" ] && udevadm se
 Main PID: 20571 (code=exited, status=1/FAILURE)

Aug 18 12:56:42 node-1 dhclient[20793]: DHCPOFFER of 10.10.1.11 from 10.10.0.1
Aug 18 12:56:42 node-1 dhclient[20793]: DHCPACK of 10.10.1.11 from 10.10.0.1
Aug 18 12:56:42 node-1 ifup[20571]: DHCPACK of 10.10.1.11 from 10.10.0.1
Aug 18 12:56:42 node-1 dhclient[20793]: bound to 10.10.1.11 -- renewal in 34406 seconds.
Aug 18 12:56:42 node-1 ifup[20571]: bound to 10.10.1.11 -- renewal in 34406 seconds.
Aug 18 12:56:42 node-1 ifup[20571]: Error: Nexthop has invalid gateway.
Aug 18 12:56:42 node-1 ifup[20571]: Failed to bring up ens10.
Aug 18 12:56:42 node-1 systemd[1]: networking.service: Main process exited, code=exited, status=1/FAILURE
Aug 18 12:56:42 node-1 systemd[1]: networking.service: Failed with result 'exit-code'.
Aug 18 12:56:42 node-1 systemd[1]: Failed to start Raise network interfaces.
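
For what it's worth, the "Nexthop has invalid gateway" error refers to the default route via 10.0.0.1, which does not match the 10.10.x.x addresses in the DHCP log above; the metadata route itself can still be tested by hand, without a full networking restart (a sketch, run as root on the node):

# add the metadata route from the config above at runtime and test it
ip route add 169.254.169.254 dev eth0
curl -s http://169.254.169.254/2009-04-04/meta-data/instance-id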

@MrSaints

MrSaints commented Aug 26, 2020

I'm seeing a similar issue, but no logs in the hcloud-csi-driver container. The node is able to get the metadata without any issues.

Managed to resolve it by checking:

  1. Is the DNS working? No
  2. Can we reach the DNS server (CoreDNS)? No
  3. Why can't we reach it?

At this point, I compared iptables rules between nodes that were working vs. ones that weren't.

I noticed in the non-working ones:

-P FORWARD DROP

So I ran iptables --policy FORWARD ACCEPT, then requests started going through to CoreDNS, and everything appears to be working now :)


Related kubernetes/kubernetes#40182
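
In case it helps others, a minimal sketch for checking and fixing the policy (the change is not persistent across reboots, and whatever set the policy to DROP, e.g. Docker or a host firewall, should still be identified):

# show the current default policy of the FORWARD chain
iptables -S FORWARD | head -n 1
# set it to ACCEPT, as described above
iptables --policy FORWARD ACCEPT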

@mehdisadeghi

mehdisadeghi commented Sep 2, 2020

I have exactly the same issue with 1.4.0 and kubelet 1.17.9 (from Rancher) as described here. Neither the suggestion above nor the yaml file from master helped. I had the exact same setup before and it worked. The only change I made on this new cluster is using Weave for networking.
Update: it didn't work even without Weave.
Update 2:
I also tried adding the iptables rules suggested here and in the original issue (from 2017!) to each of my nodes (Ubuntu 18.04 LTS), to no avail.

@tumbl3w33d

tumbl3w33d commented Sep 3, 2020

Same for me with the following setup:

  • Debian 10 VMs
  • kubespray-based k8s (2.13.3)
  • k8s 1.17.9
  • calico network (3.15.2)
  • hcloud-cloud-controller-manager active (1.7.0), routing cluster traffic via private network
  • 3 master nodes, 2 worker nodes
  • latest docker (19.03.12)

Since I can recreate my cluster from scratch (including recreating the VMs, using Ansible) within ~40 minutes, I redeployed numerous times and this behavior is reproduced every time. Remarkably, the csi-node-driver-registrar keeps saying

I0903 14:43:43.209303       6 main.go:110] Version: v1.3.0-0-g6e9fff3e
I0903 14:43:43.209394       6 main.go:120] Attempting to open a gRPC connection with: "/csi/csi.sock"
I0903 14:43:43.209412       6 connection.go:151] Connecting to unix:///csi/csi.sock
W0903 14:43:53.209845       6 connection.go:170] Still connecting to unix:///csi/csi.sock
[repeated over and over again]

while hcloud-csi-driver is quiet after

level=debug ts=2020-09-03T14:50:58.686361553Z msg="getting instance id from metadata service"
level=debug ts=2020-09-03T14:50:58.688746261Z msg="fetching server"

Remarkably, of the 5 hcloud-csi-node pods there is always exactly one broken like that. It's always the one on my first worker (which is no different from the second, based on my simple kubespray knowledge); it seems to be the one on the same worker node which also hosts the hcloud-csi-controller, though with no recognizable correlation/pattern. The other ones happily report

I0903 14:43:43.219396       6 main.go:110] Version: v1.3.0-0-g6e9fff3e
I0903 14:43:43.219533       6 main.go:120] Attempting to open a gRPC connection with: "/csi/csi.sock"
I0903 14:43:43.219568       6 connection.go:151] Connecting to unix:///csi/csi.sock
I0903 14:43:48.481906       6 main.go:127] Calling CSI driver to discover driver name
I0903 14:43:48.481986       6 connection.go:180] GRPC call: /csi.v1.Identity/GetPluginInfo
I0903 14:43:48.482002       6 connection.go:181] GRPC request: {}
I0903 14:43:48.486558       6 connection.go:183] GRPC response: {"name":"csi.hetzner.cloud","vendor_version":"1.4.0"}
I0903 14:43:48.487519       6 connection.go:184] GRPC error: <nil>
I0903 14:43:48.487538       6 main.go:137] CSI driver name: "csi.hetzner.cloud"
I0903 14:43:48.487651       6 node_register.go:51] Starting Registration Server at: /registration/csi.hetzner.cloud-reg.sock
I0903 14:43:48.487832       6 node_register.go:60] Registration Server started at: /registration/csi.hetzner.cloud-reg.sock
I0903 14:43:49.046609       6 main.go:77] Received GetInfo call: &InfoRequest{}
I0903 14:43:50.046955       6 main.go:77] Received GetInfo call: &InfoRequest{}
I0903 14:43:50.091637       6 main.go:87] Received NotifyRegistrationStatus call: &RegistrationStatus{PluginRegistered:true,Error:,}

I'm clueless where else to look.

P.S.: I also tried with master but the behavior is exactly the same.

@ArtemLavrentii
Contributor

ArtemLavrentii commented Sep 3, 2020

Just had the same issue. In my case the metadata service wasn't accessible inside the pod, and I also have a NAT-like setup.

While the hcloud-csi-node pods get their instance IDs from the node name, the controller doesn't, as it doesn't have the KUBE_NODE_NAME variable. Modifying the deployment file and adding the new env variable fixed this issue for me.

PR: #144
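
For reference, a minimal sketch of that kind of change (namespace, object kind and container name are assumed to match the upstream manifest of that time; with #144 merged this should no longer be necessary):

kubectl -n kube-system patch statefulset hcloud-csi-controller --type strategic -p '
spec:
  template:
    spec:
      containers:
      - name: hcloud-csi-driver
        env:
        - name: KUBE_NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
'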

@mehdisadeghi

No luck for me. Still getting the same error even with the above PR merged. Logs from 'hcloud-csi-driver':

level=debug ts=2020-09-07T14:40:34.128483056Z msg="server not found by name, fallback to metadata service" err=null
level=debug ts=2020-09-07T14:40:34.128543659Z msg="getting instance id from metadata service"
level=debug ts=2020-09-07T14:40:34.129628629Z msg="fetching server"
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x90e2f2]

goroutine 1 [running]:
main.main()
	/csi/cmd/driver/main.go:112 +0x6a2

@LKaemmerling
Member

@mehdisadeghi it looks like the lookup by name was not possible (the node name in k8s is different from the name in the hcloud Cloud Console) and then the lookup via the metadata service also failed. I guess there is something wrong with your network.

@mehdisadeghi

@LKaemmerling thanks for the tip. The hostnames of both nodes (the control plane and the only worker) are exactly the same as the names in the hcloud console (as I can see in my server list and via the hostname command on the nodes).

The cluster is also a vanilla cluster created using rancher based on Ubuntu 18.04 images with Weave networking.

@LKaemmerling
Member

Could you make sure that you can access https://api.hetzner.cloud (maybe just a curl from within a pod)? It looks like the network doesn't work correctly.

@mehdisadeghi

mehdisadeghi commented Sep 8, 2020

@LKaemmerling in order to take Weave out of the equation, I just upgraded my Rancher image to the latest version (rancher/rancher:latest, released 18 hours ago) and created a new cluster with one control-plane node (etcd, control plane) and one worker node (both CPX21), with K8s v1.18.6-rancher1-2, Canal networking (Rancher's default) and Docker v19.03. Then I applied my secret and the csi-driver yaml declarations from the master branch. Still similar errors, which look like a Go runtime error:

kubectl apply -f https://raw.githubusercontent.com/hetznercloud/csi-driver/master/deploy/kubernetes/hcloud-csi-master.yml
and the rest of the logs as before:

level=debug ts=2020-09-08T15:00:38.103320541Z msg="server not found by name, fallback to metadata service" err=null
level=debug ts=2020-09-08T15:00:38.103451366Z msg="getting instance id from metadata service"
level=debug ts=2020-09-08T15:00:38.104406866Z msg="fetching server"
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x90e2f2]

goroutine 1 [running]:
main.main()
	/csi/cmd/driver/main.go:112 +0x6a2

I also deployed a pod and I can access the API from within the pod (even though I get a TLS error, because the dnsutils pod does not come with the necessary certificates):

~ kubectl apply -f https://k8s.io/examples/admin/dns/dnsutils.yaml
~ kubectl exec -ti pod/dnsutils -- wget https://api.hetzner.cloud
Connecting to api.hetzner.cloud (213.239.246.1:443)
ssl_client: api.hetzner.cloud: TLS connect failed
wget: error getting response: Connection reset by peer
command terminated with exit code 1
~ kubectl exec -ti pod/dnsutils -- wget http://api.hetzner.cloud 
Connecting to api.hetzner.cloud (213.239.246.1:80)
wget: server returned error: HTTP/1.1 308 Permanent Redirect
command terminated with exit code 1

~ kubectl exec -ti pod/dnsutils -- nslookup api.hetzner.cloud  
Server:		10.43.0.10
Address:	10.43.0.10#53

Name:	api.hetzner.cloud
Address: 213.239.246.1

FWIW all the nodes are connected over a private network.
Edit: for creating Hetzner nodes from Rancher I use ui-driver-hetzner which in turn uses docker-machine-driver-hetzner.

@vlasov-y

vlasov-y commented Sep 10, 2020

I think the issue is caused by the Rancher-provided Kubernetes. I have read through hcloud-csi and found out that it mounts /var/lib/kubelet/plugins/, while k3s or Rancher k8s are installed into /var/lib/rancher and the file structure differs.

@mehdisadeghi

@vlasov-y interesting point. I checked the nodes and in fact /var/lib/kubelet/plugins/ exists on each one of them. This is the layout:

root@cp-1:~# ls /var/lib/kubelet/ /var/lib/kubelet/plugins/csi.hetzner.cloud/
/var/lib/kubelet/:
cpu_manager_state  device-plugins  pki  plugins  plugins_registry  pod-resources  pods  volumeplugins

/var/lib/kubelet/plugins/csi.hetzner.cloud/:

There is a csi.hetzner.cloud directory there, but it is empty. The rancher directory (/var/lib/rancher/rke/log/) only contains logs.

@vlasov-y

> @vlasov-y interesting point. I checked the nodes and in fact /var/lib/kubelet/plugins/ exists on each one of them. This is the layout:
>
> root@cp-1:~# ls /var/lib/kubelet/ /var/lib/kubelet/plugins/csi.hetzner.cloud/
> /var/lib/kubelet/:
> cpu_manager_state  device-plugins  pki  plugins  plugins_registry  pod-resources  pods  volumeplugins
>
> /var/lib/kubelet/plugins/csi.hetzner.cloud/:
>
> There is a csi.hetzner.cloud directory there, but it is empty. The rancher directory (/var/lib/rancher/rke/log/) only contains logs.

I am setting up a cluster with plain kubeadm and will try running hcloud-csi. Will report back with the results.

@vlasov-y

The same result... Setting up a cluster using kubeadm and v1.18.8 produces the same result.

@mehdisadeghi

@vlasov-y good to know that! In that case this ticket should be reopened.

@vlasov-y

vlasov-y commented Sep 11, 2020

Hello @mehdisadeghi. I had an error in my secret configuration. I have fixed it and now it works!

I0911 10:42:25.214230       1 main.go:71] Health check succeeded
I0911 10:33:27.491828       1 main.go:110] Version: v1.3.0-0-g6e9fff3e
I0911 10:33:27.491925       1 main.go:120] Attempting to open a gRPC connection with: "/csi/csi.sock"
I0911 10:33:27.491950       1 connection.go:151] Connecting to unix:///csi/csi.sock
I0911 10:33:29.538873       1 main.go:127] Calling CSI driver to discover driver name
I0911 10:33:29.540027       1 connection.go:180] GRPC call: /csi.v1.Identity/GetPluginInfo
I0911 10:33:29.540045       1 connection.go:181] GRPC request: {}
I0911 10:33:29.544250       1 connection.go:183] GRPC response: {"name":"csi.hetzner.cloud","vendor_version":"1.4.0"}
I0911 10:33:29.545565       1 connection.go:184] GRPC error: <nil>
I0911 10:33:29.545588       1 main.go:137] CSI driver name: "csi.hetzner.cloud"
I0911 10:33:29.545700       1 node_register.go:51] Starting Registration Server at: /registration/csi.hetzner.cloud-reg.sock
I0911 10:33:29.545932       1 node_register.go:60] Registration Server started at: /registration/csi.hetzner.cloud-reg.sock
I0911 10:33:30.868693       1 main.go:77] Received GetInfo call: &InfoRequest{}
I0911 10:33:30.896740       1 main.go:87] Received NotifyRegistrationStatus call: &RegistrationStatus{PluginRegistered:true,Error:,}

The solution in my case was migrating from k3s.io to standard kubeadm. The issue should be in:

            - --kubelet-registration-path=/var/lib/kubelet/plugins/csi.hetzner.cloud/csi.sock
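
A quick sanity check for non-standard setups (just a sketch): find the kubelet's root directory on the node and make sure the DaemonSet's hostPath volumes and the --kubelet-registration-path argument point below it.

# prints the kubelet's --root-dir flag if one is set; no output means the default /var/lib/kubelet is used
ps -ef | grep '[k]ubelet' | tr ' ' '\n' | grep -- '--root-dir'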

@mehdisadeghi

mehdisadeghi commented Sep 21, 2020

Thanks for the update @vlasov-y. I'll try to see whether I can fix that; however, in that case it should be noted in the README that the hetznercloud csi-driver does not work with Rancher-based clusters.

@mehdisadeghi

mehdisadeghi commented Sep 21, 2020

I hardcoded my server ID in the csi-driver source code, built and ran the csi-driver image locally, and realized the token I was using belonged to a different Hetzner project and not the one running the Rancher VMs. After using the right token, everything worked as expected.

My carelessness aside, a less cryptic error message would have gone a long way. Thanks.
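
A cheap way to catch this early (a sketch using the public Hetzner Cloud API, with your token in $HCLOUD_TOKEN): list the servers the token can actually see and check that the cluster's nodes are among them.

# should list the servers of the project the token belongs to
curl -s -H "Authorization: Bearer $HCLOUD_TOKEN" https://api.hetzner.cloud/v1/servers | grep '"name"'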

@intdel

intdel commented May 13, 2021

I have the same issue. In my case it is with the new firewall feature. If I do this:
[screenshot of Hetzner Cloud firewall rules]

It does not work and keeps showing the message: Still connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock

If I allow any incoming traffic it works. So what am I missing?

@thbiela

thbiela commented May 23, 2021

@intdel I also face this issue. Would be great to know what incoming ports need to be allowed for the csi-driver to work.

@thbiela

thbiela commented May 27, 2021

@intdel You have to allow incoming UDP Port 8472

@ekeih

ekeih commented May 28, 2021

> @intdel You have to allow incoming UDP Port 8472

This is a rather popular issue, so just in case other people end up here while searching for the error message: please don't allow public access to port 8472; we sorted that out in #204 (comment).

@ludgart

ludgart commented Jan 16, 2022

Hello,

I was struggling with the same problem. In my case the cause was that I had forgotten to install the following packages:

  • ca-certificates
  • curl
  • gnupg
  • lsb-release
  • ntp
  • apparmor
  • apparmor-utils

Install command:
apt-get install ca-certificates curl gnupg lsb-release ntp apparmor apparmor-utils

After that, the connection issue disappeared.

Regards
ludgart

@mehran-prs

In our case, it was because the hcloud-csi-driver Docker image in the DaemonSet was using the latest tag, so a new image version caused this error.

We just changed the image tag to a stable version and it was fixed.
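
A sketch of pinning the tag (the version below is only a placeholder; pick the release you actually want):

kubectl -n kube-system set image daemonset/hcloud-csi-node hcloud-csi-driver=hetznercloud/hcloud-csi-driver:1.6.0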

@sui77
Contributor

sui77 commented Nov 1, 2022

I had this issue with a new node from a different hetzner cloud project. Worked fine after assigning the VM to the right project.

@mehdisadeghi

Two years have gone by and I had the same issue. I came back to this thread, read the supportive community comments and finally my own comments, only to realize that, well... I had made the same mistake of using the wrong hcloud token from the wrong project. I guess I'm not getting any younger.

The moral of the story: it is good to write descriptive comments and contribute, even if it is just reporting errors.

@strowi

strowi commented Feb 6, 2024

Came here again after a while. api.hetzner.cloud was resolving to a wrong IP. The solution was adding the linked ndots setting to the csi-controller deployment.
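
For context, a minimal sketch of such a change (the ndots value of "1" is only an example, and depending on the csi-driver version the controller may be a Deployment or a StatefulSet):

kubectl -n kube-system patch deployment hcloud-csi-controller --type strategic -p '
spec:
  template:
    spec:
      dnsConfig:
        options:
        - name: ndots
          value: "1"
'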
