What you expected to happen?
- Pods placed on different nodes in the cluster should be able to communicate.
- Internet connection is lost intermittently and restarting the nodes solves the issue.
- kubectl exec times out for pods on one particular node.
What happened?
I have a backend API running as a pod on node 1, which calls a database running as a pod on node 2. The API cannot connect to the database. If I place both pods on the same node, it works fine.
Anything else we need to know?
Kubernetes 1.25.4 running on on-prem Ubuntu nodes (1 master, 2 worker nodes).
Versions:
$ weave version
kubectl exec -n kube-system weave-net-zdxxw -c weave -- /home/weave/weave --local status
Version: git-34de0b10a69c (failed to check latest version - see logs; next check at 2022/12/19 11:39:08)
Service: router
Protocol: weave 1..2
Name: XXXXXXXX (masternode)
Encryption: disabled
PeerDiscovery: enabled
Targets: 2
Connections: 2 (2 established)
Peers: 3 (with 6 established connections)
TrustedSubnets: none
Service: ipam
Status: ready
Range: 10.32.0.0/12
DefaultSubnet: 10.32.0.0/12
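A quick sanity check that may help when triaging: confirm that the pod IPs on each node actually fall inside the IPAM range reported above (10.32.0.0/12). The sketch below is only an illustration. The function name `in_weave_range` and the sample addresses are hypothetical and are not taken from this cluster.

```python
import ipaddress

# IPAM range as reported by `weave status` above.
WEAVE_RANGE = ipaddress.ip_network("10.32.0.0/12")


def in_weave_range(ip: str, network=WEAVE_RANGE) -> bool:
    """Return True if the given IP address belongs to the Weave IPAM range."""
    return ipaddress.ip_address(ip) in network


# Hypothetical pod IPs, for illustration only.
for pod_ip in ("10.40.0.3", "10.244.1.5"):
    print(pod_ip, in_weave_range(pod_ip))
```

A pod whose address is outside this range was probably not assigned its IP by Weave, which would point at a CNI configuration problem rather than a routing problem.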
$ docker version
20.10.18
$ uname -a
Linux masternode 5.4.0-131-generic #147-Ubuntu SMP Fri Oct 14 17:07:22 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Linux node1 5.4.0-131-generic #147-Ubuntu SMP Fri Oct 14 17:07:22 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Linux node2 5.4.0-126-generic #142-Ubuntu SMP Fri Aug 26 12:12:57 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
node1 is the one impacted
$ kubectl version
1.25.4
$ kubectl logs -n kube-system weave-net-node1 weave
Error from server: Get "https://10.x.x.x:10250/containerLogs/kube-system/weave-net-tnphg/weave": dial tcp 10.x.x.x:10250: i/o timeout
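The `i/o timeout` above means the API server could not open a TCP connection to the kubelet port (10250) on that node. To tell a silent drop (timeout) apart from a closed port (refused), a small probe like the one below can be run from the master node. This is a minimal sketch: `probe_tcp` is a hypothetical helper, and the host and port to pass in are whatever applies to the affected node.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Try to open a TCP connection and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No reply at all: packets are likely dropped (firewall, routing).
        return "timeout"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "refused"
    except OSError as exc:
        # Anything else, e.g. no route to host or name resolution failure.
        return f"error: {exc}"
```

For example, `probe_tcp("<node1-ip>", 10250)` returning `"timeout"` from the master while the same call succeeds against node2 would suggest a firewall or routing problem specific to node1, rather than a kubelet that is down.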