
Weave net randomly blocking connections #3829

Open · bcollard opened this issue Jul 6, 2020 · 6 comments


bcollard commented Jul 6, 2020

What you expected to happen?

No connections should be randomly blocked, especially when there are no network policies (netpols), or when the netpols target different ports and pods. See logs at the end of this issue.
Same issue as #3761.

What happened?

Looking at the logs of my weave-net pods, I can see multiple connections being blocked by weave-npc. It looks random.

How to reproduce it?

I installed the Weave CNI plugin using this method: https://www.weave.works/docs/net/latest/kubernetes/kube-addon/#-installation

Anything else we need to know?

On-prem DC. K8s installed with kubeadm. Here is my kubeadm ClusterConfiguration: https://gist.github.com/bcollard/b55108e4355b4edab169a025e02723e8
I first installed Weave Net without specifying the pod subnet, so pods were assigned IPs from the default 10.32.0.0/12 range. Then I redeployed the weave DaemonSet with IPALLOC_RANGE set to 10.244.0.0/16, which is the pod subnet configured in kubeadm.
That's why you'll see pods with these two kinds of IPs, as well as a few extra IP routes and rules.
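
For reference, the range can be changed on an existing install roughly like this (a sketch only; exact steps may differ):

$ # set the IPAM range on the weave container of the weave-net DaemonSet
$ kubectl -n kube-system set env daemonset/weave-net -c weave IPALLOC_RANGE=10.244.0.0/16
$ # restart the pods so the new range takes effect
$ kubectl -n kube-system rollout restart daemonset/weave-net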

Versions:

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.3", GitCommit:"06ad960bfd03b39c8310aaf92d1e7c12ce618213", GitTreeState:"clean", BuildDate:"2020-02-11T18:14:22Z", GoVersion:"go1.13.6", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.3", GitCommit:"06ad960bfd03b39c8310aaf92d1e7c12ce618213", GitTreeState:"clean", BuildDate:"2020-02-11T18:07:13Z", GoVersion:"go1.13.6", Compiler:"gc", Platform:"linux/amd64"}

Logs:

$ kubectl logs -n kube-system <weave-net-pod> weave

https://gist.github.com/bcollard/bda85abdcb2ee9c5779dc38512b494eb

$ ks exec -it weave-net-npt9n -c weave -- /home/weave/weave --local status

        Version: 2.6.5 (failed to check latest version - see logs; next check at 2020/07/07 02:06:37)

        Service: router
       Protocol: weave 1..2
           Name: 46:b1:bd:d0:2d:c5(fr0-viaas-271)
     Encryption: disabled
  PeerDiscovery: enabled
        Targets: 5
    Connections: 5 (5 established)
          Peers: 6 (with 30 established connections)
 TrustedSubnets: none

        Service: ipam
         Status: ready
          Range: 10.244.0.0/16
  DefaultSubnet: 10.244.0.0/16

$ journalctl -u docker.service --no-pager
https://gist.github.com/bcollard/0b95c22958cd1b8f71ab264eaf19e8ef

$ journalctl -u kubelet --no-pager
https://gist.github.com/bcollard/000ad7e54a696a7cb3d87c526bcab847

$ kubectl get events -- executed in the 'r3' namespace, where connections between pods are sometimes blocked:
https://gist.github.com/bcollard/bf50d813b7cf355638aa9f94873b5ea7

Network:

$ ip route
default via 10.125.140.1 dev ens160 proto static metric 100
10.32.0.0/12 dev weave proto kernel scope link src 10.32.0.1
10.125.140.0/22 dev ens160 proto kernel scope link src 10.125.140.229 metric 100
10.244.0.0/16 dev weave proto kernel scope link src 10.244.176.0
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
$ ip -4 -o addr
1: lo    inet 127.0.0.1/8 scope host lo\       valid_lft forever preferred_lft forever
2: ens160    inet 10.125.140.229/22 brd 10.125.143.255 scope global noprefixroute ens160\       valid_lft forever preferred_lft forever
3: docker0    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0\       valid_lft forever preferred_lft forever
6: weave    inet 10.32.0.1/12 brd 10.47.255.255 scope global weave\       valid_lft forever preferred_lft forever
6: weave    inet 10.244.176.0/16 brd 10.244.255.255 scope global weave\       valid_lft forever preferred_lft forever
$ sudo iptables-save

https://gist.github.com/bcollard/8ff28dcef4dbe721baa6db5efadd4117

Weave npc logs:

Filtered logs:

$ stern -n kube-system -l "name=weave-net" --since 10m | grep -i "blocked by"

https://gist.github.com/bcollard/4a5da7f16a1903a0a5755fc9ecd47163

Full logs:

https://gist.githubusercontent.com/bcollard/531596b08ba27a5dc2a931caec1c5ede/raw/54851c90f5fc291f1b63758d23542fcba824072c/weave-npc%2520full%2520logs

Bregor added a commit to evilmartians/chef-kubernetes that referenced this issue Jul 22, 2020
Corresponding issues:
- weaveworks/weave#3829
- weaveworks/weave#3761

Signed-off-by: Maxim Filatov <pipopolam@gmail.com>
bboreham (Contributor) commented:

@bcollard thanks for supplying the logs. Note how Weave-NPC works: it sets up a rule to drop everything by default, then adds rules to allow traffic according to the network policies.
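
In iptables terms the shape is roughly this (a simplified sketch with placeholder set names; the real, generated rules appear in the dumps further down in this thread):

# default: new traffic into the weave bridge that no policy rule accepted is dropped
-A FORWARD -o weave -j DROP
# per-policy: weave-npc adds ACCEPT rules keyed on ipsets of the selected source/destination pods
-A WEAVE-NPC-INGRESS -p tcp -m set --match-set <src-pod-set> src -m set --match-set <dst-pod-set> dst --dport <port> -j ACCEPT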

The first set of 'blocked' reports is very simply explained:

WARN: 2020/07/06 15:07:18.046750 TCP connection from 10.42.0.11:47088 to 10.244.0.3:27017 blocked by Weave NPC.
...
WARN: 2020/07/06 15:07:18.047165 TCP connection from 10.42.0.11:33200 to 10.244.96.1:27017 blocked by Weave NPC.

This comes around 100ms before Weave-NPC learns about this pod and is therefore able to open up traffic for it:

DEBU: 2020/07/06 15:07:18.144209 EVENT AddPod {"metadata":{"labels":{"app.kubernetes.io/component":"gateway","app.kubernetes.io/instance":"gravitee-apim","app.kubernetes.io/name":"apim","pod-template-hash":"7b6d69cb98"},"name":"gravitee-apim-gateway-7b6d69cb98-h6dj6","namespace":"apim"},"spec":{...},
"status":{"phase":"Running","podIP":"10.42.0.11"}}

100ms is a plausible delay on a machine with a lot going on.

The same pod gets blocked again later, well after Weave-NPC learned about it, so that must be for a different reason:

WARN: 2020/07/06 15:17:21.005084 TCP connection from 10.42.0.11:54950 to 10.244.0.3:27017 blocked by Weave NPC.

Looking at the destination address, it is for this pod:

DEBU: 2020/07/06 15:07:18.081676 EVENT AddPod {"metadata":{"labels":{"app":"mongodb-replicaset","controller-revision-hash":"mongodb-replicaset-6874549864","release":"mongodb-replicaset","statefulset.kubernetes.io/pod-name":"mongodb-replicaset-2"},"name":"mongodb-replicaset-2","namespace":"r3"},"spec":{...},
"status":{"phase":"Running","podIP":"10.244.0.3"}}

Weave-NPC has at this point received nine NetworkPolicies:

INFO: 2020/07/06 15:07:17.723286 EVENT AddNetworkPolicy {"metadata":{"name":"r3-api-authorization-to-mongo-netpol","namespace":"r3",},"spec":{"ingress":[{"from":[{"namespaceSelector":{"matchLabels":{"role":"r3"}},"podSelector":{"matchLabels":{"app.kubernetes.io/instance":"api-authorization"}}}],"ports":[{"port":27017,"protocol":"TCP"}]}],"podSelector":{"matchLabels":{"app":"mongodb-replicaset","release":"mongodb-replicaset"}},"policyTypes":["Ingress"]}}
INFO: 2020/07/06 15:07:17.758049 EVENT AddNetworkPolicy {"metadata":{"name":"r3-api-calibration-to-mongo-netpol","namespace":"r3},"spec":{"ingress":[{"from":[{"namespaceSelector":{"matchLabels":{"role":"r3"}},"podSelector":{"matchLabels":{"app.kubernetes.io/instance":"api-calibration"}}}],"ports":[{"port":27017,"protocol":"TCP"}]}],"podSelector":{"matchLabels":{"app":"mongodb-replicaset","release":"mongodb-replicaset"}},"policyTypes":["Ingress"]}}
INFO: 2020/07/06 15:07:17.797106 EVENT AddNetworkPolicy {"metadata":{"name":"apim-to-mongo-netpol","namespace":"r3"},"spec":{"ingress":[{"from":[{"namespaceSelector":{"matchLabels":{"role":"apim"}},"podSelector":{"matchLabels":{"app.kubernetes.io/instance":"gravitee-apim","app.kubernetes.io/name":"apim"}}}],"ports":[{"port":27017,"protocol":"TCP"}]}],"podSelector":{"matchLabels":{"app":"mongodb-replicaset","release":"mongodb-replicaset"}},"policyTypes":["Ingress"]}}
INFO: 2020/07/06 15:07:17.826132 EVENT AddNetworkPolicy {"metadata":{"name":"prometheus-to-mongo-netpol","namespace":"r3},"spec":{"ingress":[{"from":[{"namespaceSelector":{"matchLabels":{"role":"monitoring"}},"podSelector":{"matchLabels":{"release":"prometheus-operator"}}}],"ports":[{"port":9216,"protocol":"TCP"}]}],"podSelector":{"matchLabels":{"app":"mongodb-replicaset","release":"mongodb-replicaset"}},"policyTypes":["Ingress"]}}
INFO: 2020/07/06 15:07:17.842977 EVENT AddNetworkPolicy {"metadata":{"name":"r3-api-asset-to-mongo-netpol","namespace":"r3"},"spec":{"ingress":[{"from":[{"namespaceSelector":{"matchLabels":{"role":"r3"}},"podSelector":{"matchLabels":{"app.kubernetes.io/instance":"api-asset"}}}],"ports":[{"port":27017,"protocol":"TCP"}]}],"podSelector":{"matchLabels":{"app":"mongodb-replicaset","release":"mongodb-replicaset"}},"policyTypes":["Ingress"]}}
INFO: 2020/07/06 15:07:17.877301 EVENT AddNetworkPolicy {"metadata":{"name":"r3-api-launcher-to-mongo-netpol","namespace":"r3"},"spec":{"ingress":[{"from":[{"namespaceSelector":{"matchLabels":{"role":"r3"}},"podSelector":{"matchLabels":{"app.kubernetes.io/instance":"api-launcher"}}}],"ports":[{"port":27017,"protocol":"TCP"}]}],"podSelector":{"matchLabels":{"app":"mongodb-replicaset","release":"mongodb-replicaset"}},"policyTypes":["Ingress"]}}
INFO: 2020/07/06 15:07:18.076460 EVENT AddNetworkPolicy {"metadata":{"name":"r3-api-procedure-to-mongo-netpol","namespace":"r3"},"spec":{"ingress":[{"from":[{"namespaceSelector":{"matchLabels":{"role":"r3"}},"podSelector":{"matchLabels":{"app.kubernetes.io/instance":"api-procedure"}}}],"ports":[{"port":27017,"protocol":"TCP"}]}],"podSelector":{"matchLabels":{"app":"mongodb-replicaset","release":"mongodb-replicaset"}},"policyTypes":["Ingress"]}}
INFO: 2020/07/06 15:07:18.093341 EVENT AddNetworkPolicy {"metadata":{"name":"am-openldap-netpol","namespace":"r3"},"spec":{"ingress":[{"from":[{"namespaceSelector":{"matchLabels":{"role":"apim"}},"podSelector":{"matchLabels":{"app.kubernetes.io/instance":"gravitee-am","app.kubernetes.io/name":"am"}}}],"ports":[{"port":27017,"protocol":"TCP"}]}],"podSelector":{"matchLabels":{"app":"mongodb-replicaset","release":"mongodb-replicaset"}},"policyTypes":["Ingress"]}}
INFO: 2020/07/06 15:07:18.105970 EVENT AddNetworkPolicy {"metadata":{"name":"am-to-openldap-netpol","namespace":"r3"},"spec":{"ingress":[{"from":[{"namespaceSelector":{"matchLabels":{"role":"apim"}},"podSelector":{"matchLabels":{"app.kubernetes.io/component":"gateway","app.kubernetes.io/instance":"gravitee-am","app.kubernetes.io/name":"am"}}}],"ports":[{"port":389,"protocol":"TCP"}]}],"podSelector":{"matchLabels":{"app.kubernetes.io/instance":"openldap","app.kubernetes.io/name":"openldap"}},"policyTypes":["Ingress"]}}

The namespace of the source pod looks like this:

INFO: 2020/07/06 15:07:17.881760 EVENT AddNamespace {"metadata":{"labels":{"role":"apim"},"name":"apim"},"spec":{"finalizers":["kubernetes"]},"status":{"phase":"Active"}}

As far as I can see, rule apim-to-mongo-netpol should match.
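
One way to double-check that label match from the cluster side would be something like this (hypothetical commands, using the namespace and pod names from the events above):

$ kubectl get namespace apim --show-labels
$ kubectl -n apim get pod gravitee-apim-gateway-7b6d69cb98-h6dj6 --show-labels
$ kubectl -n r3 get pod mongodb-replicaset-2 --show-labels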

Sorry, I need to stop now; I'll continue the analysis later.

bboreham (Contributor) commented:

OK, let's look again. The block of interest is:

from 10.42.0.11:54950 to 10.244.0.3:27017 blocked by Weave NPC.

Here's the policy that seems to match the labels:

{
  "metadata": { "name": "apim-to-mongo-netpol", "namespace": "r3" },
  "spec": {
    "ingress": [ {
        "from": [ {
            "namespaceSelector": {
              "matchLabels": { "role": "apim" }
            },
            "podSelector": {
              "matchLabels": {
                "app.kubernetes.io/instance": "gravitee-apim",
                "app.kubernetes.io/name": "apim" }
            }
          } ],
        "ports": [  { "port": 27017, "protocol": "TCP" }   ]
      } ],
    "podSelector": {
      "matchLabels": {
        "app": "mongodb-replicaset",
        "release": "mongodb-replicaset"
      }
    },
    "policyTypes": [ "Ingress" ]
  }
}

Weave-NPC creates a rule to match that policy:

INFO: 2020/07/06 15:07:17.798791 adding rule [
-p TCP 
-m set --match-set weave-|LSW_{T;OT2F@p!~fygG.)/*f src 
-m set --match-set weave-tOJqtq[!|VUp_6aFowu*[qv{L dst 
--dport 27017 -m comment --comment namespaces: selector: role=apimapp.kubernetes.io/instance=gravitee-apim,app.kubernetes.io/name=apim -> pods: namespace: r3, selector: app=mongodb-replicaset,release=mongodb-replicaset (ingress) 
-j ACCEPT
] to "WEAVE-NPC-INGRESS" chain

When the destination pod appears, it is added to sets including the dst set above:

INFO: 2020/07/06 15:07:18.081761 added entry 10.244.0.3 to weave-p?tG(={(/5jTZKf@8|Hj:~0b9 of 476ba812-9507-4c5a-9e6b-c732eb313601
INFO: 2020/07/06 15:07:18.083018 added entry 10.244.0.3 to weave-tOJqtq[!|VUp_6aFowu*[qv{L of 476ba812-9507-4c5a-9e6b-c732eb313601
INFO: 2020/07/06 15:07:18.084163 added entry 10.244.0.3 to weave-Q;Nva[*7GXRvP?Ti8k?o(Vdg[ of 476ba812-9507-4c5a-9e6b-c732eb313601

The source pod is added to the src set:

INFO: 2020/07/06 15:07:18.144263 added entry 10.42.0.11 to weave-Uj_._UE5YIW]EykWhLbhdM~p2 of 72cb5746-c362-4210-b789-283766fe900f
INFO: 2020/07/06 15:07:18.145423 added entry 10.42.0.11 to weave-t4SJ7UI13[ANR;kZS_UJuw|9r of 72cb5746-c362-4210-b789-283766fe900f
INFO: 2020/07/06 15:07:18.146618 added entry 10.42.0.11 to weave-|c{!MZMM((|7Q.UfTjJCG#CNK of 72cb5746-c362-4210-b789-283766fe900f
INFO: 2020/07/06 15:07:18.147734 added entry 10.42.0.11 to weave-|LSW_{T;OT2F@p!~fygG.)/*f of 72cb5746-c362-4210-b789-283766fe900f

Everything so far is working as intended.
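
If it recurs, one cross-check would be to confirm the kernel state matches those log lines by querying the ipsets directly on the node (hypothetical commands; set names copied from the log above, and ipset must be available on the host):

$ sudo ipset test 'weave-|LSW_{T;OT2F@p!~fygG.)/*f' 10.42.0.11
$ sudo ipset test 'weave-tOJqtq[!|VUp_6aFowu*[qv{L' 10.244.0.3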

TBC...

bboreham (Contributor) commented:

So I went to look at the iptables rules you supplied, and I notice there are two places where we bump the counter that causes the "blocked" log:

-A FORWARD -o weave -m state --state NEW -j NFLOG --nflog-group 86
-A FORWARD -o weave -j DROP
...
-A WEAVE-NPC-EGRESS -m state --state NEW -m mark ! --mark 0x40000/0x40000 -j NFLOG --nflog-group 86

but only one of them actually drops the traffic.

This changed in #3639: prior to that we dropped in both places, but now we only drop if an egress policy has been added. And you only have ingress policies.

So, @bcollard, is it possible this is a spurious log and the traffic is not actually dropped?
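
One way to tell would be to watch the packet counters while the "blocked" messages appear: if the egress NFLOG rule's counter advances but the FORWARD DROP rule's does not, the log is spurious and the traffic is actually passing (hypothetical commands):

$ sudo iptables -L FORWARD -v -n --line-numbers | grep -E 'NFLOG|DROP'
$ sudo iptables -L WEAVE-NPC-EGRESS -v -n --line-numbers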

bboreham (Contributor) commented:

This traffic still shouldn't be getting logged by the egress rules because we added the source address to this set:

INFO: 2020/07/06 15:07:18.146618 added entry 10.42.0.11 to weave-|c{!MZMM((|7Q.UfTjJCG#CNK of 72cb5746-c362-4210-b789-283766fe900f

and we mark traffic originating in that set:

DEBU: 2020/07/06 15:07:18.070828 adding rule for DefaultAllow in namespace: apim, chain: WEAVE-NPC-EGRESS-DEFAULT, [-m set --match-set weave-|c{!MZMM((|7Q.UfTjJCG#CNK src -j WEAVE-NPC-EGRESS-ACCEPT -m comment --comment DefaultAllow egress isolation for namespace: apim]
DEBU: 2020/07/06 15:07:18.073580 adding rule for DefaultAllow in namespace: apim, chain: WEAVE-NPC-EGRESS-DEFAULT, [-m set --match-set weave-|c{!MZMM((|7Q.UfTjJCG#CNK src -j RETURN -m comment --comment DefaultAllow egress isolation for namespace: apim]

So I'm not sure what is happening there.
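
If anyone wants to dig further, dumping those two chains and checking where the DefaultAllow rules sit relative to the NFLOG rule might narrow it down (hypothetical commands):

$ sudo iptables -S WEAVE-NPC-EGRESS
$ sudo iptables -S WEAVE-NPC-EGRESS-DEFAULT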

bboreham (Contributor) commented:

Most "blocked by" messages around this time are either shortly before a pod's IP address is known, or for the pair 10.42.0.11:xxx to 10.244.0.3:27017 that we discussed above.

A different kind is:

WARN: 2020/07/06 15:20:09.235518 UDP connection from 10.244.32.3:48227 to 10.244.192.1:53 blocked by Weave NPC.

The first time we hear of 10.244.32.3 from the API server is half a second later:

DEBU: 2020/07/06 15:20:09.890170 EVENT UpdatePod ... {"phase":"Succeeded","podIP":"10.244.32.3","qosClass":"BestEffort","startTime":"2020-07-06T15:20:07Z"}}

but because it is in phase "Succeeded", i.e. it has exited, Weave-NPC doesn't bother adding it to the rules.

However, it is still apparently trying to talk one second later:

WARN: 2020/07/06 15:20:10.628294 UDP connection from 10.244.32.3:52281 to 10.244.32.1:53 blocked by Weave NPC.

A similar sequence happens for IP 10.244.32.6 at 15:30:03.
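
To see which workloads those short-lived pods belong to, a field-selector query along these lines might help (hypothetical command):

$ kubectl get pods -A --field-selector=status.phase=Succeeded -o wide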

So that's the first 20 minutes or so in your logs. None of it is "random", but one particular set of reports is a mystery.

@bcollard given the lag in notifications, I recommend you check that Weave Net, and your nodes in general, have enough resources. In the supplied manifest we specify quite a small CPU request, so that the pod can be scheduled on small clusters, but if your machines are bigger you can raise the value.
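
For example, something along these lines would raise the requests (hypothetical values; container names as in the standard weave-net manifest):

$ kubectl -n kube-system set resources daemonset/weave-net -c weave --requests=cpu=300m
$ kubectl -n kube-system set resources daemonset/weave-net -c weave-npc --requests=cpu=100m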

It's also not great to have two different IP ranges in use: only one of them is known to kube-proxy, and while I cannot find a direct connection between that and your troubles, it may be related. I suggest a rolling reboot of all nodes to clear out the interfaces and settings from the older range.

bcollard (Author) commented:

Thanks for your feedback; I'll have a look next week, as I'm taking a few days off.
