NetworkPolicy ipBlock cannot set the real client IP address #1199

Closed
yuchunyun opened this issue Dec 2, 2021 · 15 comments

@yuchunyun

yuchunyun commented Dec 2, 2021

Problem: NetworkPolicy ipBlock does not work with the real client IP address.
I deployed the whoami service and set externalIPs for it.
When I access the externalIPs from outside the k8s cluster, the pod log shows the following:

Hostname: whoami-586fd9cddd-x7wjc
IP: 127.0.0.1
IP: 10.244.6.196
IP: 192.168.120.37
RemoteAddr: 192.168.110.15:48386   #This is my k8s node address
GET / HTTP/1.1
Host: 192.168.120.37   #This is my externalIP
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.55 Safari/537.36 Edg/96.0.1054.34
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Accept-Encoding: gzip, deflate
Accept-Language: zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6
Cache-Control: max-age=0
Connection: keep-alive
Upgrade-Insecure-Requests: 1

After setting kube-router.io/service.dsr: tunnel, the log appears as follows:

Hostname: whoami-586fd9cddd-x7wjc
IP: 127.0.0.1
IP: 10.244.6.196
IP: 192.168.120.37
RemoteAddr: 192.168.83.148:59349   #client real address
GET / HTTP/1.1
Host: 192.168.120.37
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.55 Safari/537.36 Edg/96.0.1054.34
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Accept-Encoding: gzip, deflate
Accept-Language: zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6
Cache-Control: no-cache
Connection: keep-alive
Pragma: no-cache
Upgrade-Insecure-Requests: 1

Here is my NetworkPolicy:

kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: access-whoami
spec:
  podSelector:
    matchLabels:
      run: whoami
  ingress:
  - from:
    - ipBlock:
        cidr: 192.168.83.0/24     # can not work !!!
    - ipBlock:
        cidr: 192.168.110.0/24    #work.

**System Information:**

  • Kube-Router Version (kube-router --version): v1.3.2
  • Kubernetes Version (kubectl version) : v1.22.2
@yuchunyun yuchunyun added the bug label Dec 2, 2021
@aauren
Collaborator

aauren commented Dec 6, 2021

Hmm... I'm not sure what you mean by #work and # can not work !!!, the situation works fine for me when I reproduce this locally. Were you expecting the traffic to be blocked and it wasn't? Were you expecting the traffic to be unblocked and it wasn't?

You probably need to include more information in your bug report about what you were expecting to happen and what actually happened, along with logs and the options you run kube-router with.

All of the fields in our issue template are important, and in order to actually be able to resolve issues we need all of them. However, the information you provided only hits about 20% of them.

Keep in mind that with DSR you almost always want to combine it with kube-router.io/service.local: "true" or internalTrafficPolicy: Local if the remote IP is of interest to you, otherwise your service may be proxied via another node and you'll end up with another node's IP rather than the real IP of the remote node.
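For example, a minimal sketch of what that combination might look like on the whoami service from this report (the annotation names are as given above; treat this as an illustration, not a verified config):

apiVersion: v1
kind: Service
metadata:
  name: whoami
  namespace: default
  annotations:
    kube-router.io/service.dsr: tunnel      # tunnel-mode DSR
    kube-router.io/service.local: "true"    # only serve from nodes with a local endpoint
spec:
  externalIPs:
  - 192.168.120.37
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    run: whoami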

@yuchunyun
Author

yuchunyun commented Dec 7, 2021

Sorry, my English is poor.
The kube-router args look like this:

...   
     args:
        - --run-router=true
        - --run-firewall=true
        - --run-service-proxy=true
        - --bgp-graceful-restart=true
        - --advertise-external-ip=true
        - --service-external-ip-range=192.168.120.0/24
        - --cluster-asn=64611
        - --peer-router-ips=202.173.8.36,202.173.8.37
        - --peer-router-multihop-ttl=5
        - --peer-router-asns=64600,64600
        - --kubeconfig=/var/lib/kube-router/kubeconfig
        - --metrics-path=/metrics
        - --metrics-port=8080
        - --v=5

My test svc is defined as follows:

apiVersion: v1
kind: Service
metadata:
  annotations:
    kube-router.io/service.dsr: tunnel
  name: whoami
  namespace: default
spec:
  externalIPs:
  - 192.168.120.37
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    run: whoami

Then a client accesses the service, and the client IP in the pod log is the real address of my client (it's great).
Then I add a NetworkPolicy ingress rule to allow my client address to access the svc, like this:

kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: access-whoami
spec:
  podSelector:
    matchLabels:
      run: whoami
  ingress:
  - from:
    - ipBlock:
        cidr: 192.168.83.0/24    #this is my client address

but then my client cannot access it at all:

Failed connect to 192.168.120.37:80; Connection timed out

Only when I configure it like this can the client access the service properly:

kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: access-whoami
spec:
  podSelector:
    matchLabels:
      run: whoami
  ingress:
  - from:
    - ipBlock:
        cidr: 192.168.110.0/24  #this is my nodes address

@yuchunyun
Author

yuchunyun commented Dec 8, 2021

I observed the dropped packets by running tcpdump -i nflog:100 -nnnn and found two problems. Please help me check whether this is normal:

14:43:03.000708 IP 192.168.110.15.19005 > 10.244.6.196.80: Flags [S], seq 4158493724, win 29200, options [mss 1460,sackOK,TS val 659845462 ecr 0,nop,wscale 9], length 0
14:43:22.071803 IP 192.168.110.15 > 10.244.6.195: IP 192.168.83.148.47874 > 192.168.120.37.80: Flags [S], seq 2531701233, win 29200, options [mss 1460,sackOK,TS val 659864542 ecr 0,nop,wscale 9], length 0 (ipip-proto-4)

Q1: Whether or not I use DSR (tunnel) mode, the source IP in tcpdump is 192.168.110.15. Is this why NetworkPolicy.spec.ingress.from.ipBlock must be set to the k8s node address rather than the real client address?
Q2: In DSR (tunnel) mode, NetworkPolicy.spec.ingress.ports cannot be set (I have tested this and it is true). As shown in tcpdump, the destination port appears to be different from normal mode?

@aauren
Collaborator

aauren commented Dec 8, 2021

My best guess at this point is that your service isn't declared as a local service. If you aren't using a local service, there is a chance that your request will ingress on one node and be proxied to another node that contains the service pod. When this happens, the L3 header is rewritten and the new source IP will be seen as a Kubernetes node.

Can you try ensuring that your service is a local service via: internalTrafficPolicy: Local and see if you have the same results?

@yuchunyun
Author

Yes, I tried setting internalTrafficPolicy: Local on the svc, but still got the same result.

I see this VIP 192.168.120.37 declared by all my worker nodes to the upstream BGP peer, and there are ipvs entries for this VIP on all my worker nodes. When external requests are balanced to nodes that are not running the pod, the source IP is seen as that node.

I think that when internalTrafficPolicy: Local is set, not all nodes should declare the VIP to their upstream BGP peers; only nodes running the pod should declare the VIP and have IPVS entries, just like MetalLB does.

@aauren
Copy link
Collaborator

aauren commented Dec 9, 2021

Yes, this is the way that kube-router should be functioning. If you have spec.externalTrafficPolicy: Local, then you should only see the nodes with an active service pod advertising the external IP to the upstream BGP peer.

I'm having a hard time understanding what could be going wrong here. Multiple users have made heavy use of this feature over the last 2-3 years of kube-router versions, and there has never been an issue with the BGP announcement functionality that I'm aware of.

Can you show the following?

  • Your full service definition: kubectl get service -n <namespace> <service_name> -o yaml
  • Your pod output selected by the same selector: kubectl get pod -n <namespace> -l <same_selector_as_service> -o wide (from your example above it would be something like kubectl get pod -n default -l run=whoami -o wide)
  • Exec into kube-router on a node (be sure to show the node name) that has your service pod and run: gobgp global rib <external_ip>/32 (see the example exec command after this list)
  • Exec into kube-router on a node (be sure to show the node name) that does NOT have your service pod and run: gobgp global rib <external_ip>/32
  • It may also be helpful to show the RIB on your upstream router as well
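For example, a sketch of the gobgp check referenced above (the pod name placeholder is illustrative; pick the kube-router pod running on the node you're checking, and substitute your external IP):

kubectl -n kube-system exec -it <kube-router-pod-on-node> -- gobgp global rib 192.168.120.37/32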

@yuchunyun
Author

service definition: kubectl get svc whoami -o yaml

apiVersion: v1
kind: Service
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"name":"whoami","namespace":"default"},"spec":{"externalIPs":["192.168.120.37"],"internalTrafficPolicy":"Local","ports":[{"port":80,"protocol":"TCP","targetPort":80}],"selector":{"run":"whoami"}}}
  creationTimestamp: "2021-12-09T02:39:44Z"
  name: whoami
  namespace: default
  resourceVersion: "62814793"
  uid: 198259b7-daaf-4095-906c-ec7985302314
spec:
  clusterIP: 10.96.0.205
  clusterIPs:
  - 10.96.0.205
  externalIPs:
  - 192.168.120.37
  internalTrafficPolicy: Local
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    run: whoami
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}

pod selected : kubectl get pods -l run=whoami -o wide

NAME                      READY   STATUS    RESTARTS   AGE    IP            NODE              NOMINATED NODE   READINESS GATES
whoami-586fd9cddd-8hgqq   1/1     Running   0          110m   10.244.7.39   zdns-yxz-k8s-18   <none>           <none>
whoami-586fd9cddd-dfs2z   1/1     Running   0          110m   10.244.7.40   zdns-yxz-k8s-18   <none>           <none>

k8s node : kubectl -n kube-system get pods -o wide|grep kube-route

kube-router-4dfzn                         1/1     Running   0             40h   192.168.110.15   zdns-yxz-k8s-15   <none>           <none>
kube-router-88pqp                         1/1     Running   0             40h   192.168.110.16   zdns-yxz-k8s-16   <none>           <none>
kube-router-bpbh4                         1/1     Running   0             40h   192.168.110.17   zdns-yxz-k8s-17   <none>           <none>
kube-router-g6qdl                         1/1     Running   0             40h   192.168.110.11   zdns-yxz-k8s-11   <none>           <none>
kube-router-ghpp7                         1/1     Running   0             40h   192.168.110.14   zdns-yxz-k8s-14   <none>           <none>
kube-router-qgg7w                         1/1     Running   0             40h   192.168.110.18   zdns-yxz-k8s-18   <none>           <none>
kube-router-rsvp6                         1/1     Running   0             40h   192.168.110.13   zdns-yxz-k8s-13   <none>           <none>
kube-router-x5bt9                         1/1     Running   0             40h   192.168.110.12   zdns-yxz-k8s-12   <none>           <none>

On node zdns-yxz-k8s-18, gobgp shows:

   Network              Next Hop             AS_PATH              Age        Attrs
*> 192.168.120.37/32    192.168.110.18                            00:02:52   [{Origin: i}]

On nodes zdns-yxz-k8s-15 and zdns-yxz-k8s-17, gobgp shows:

   Network              Next Hop             AS_PATH              Age        Attrs
*> 192.168.120.37/32    192.168.110.15                            00:00:33   [{Origin: i}]
   Network              Next Hop             AS_PATH              Age        Attrs
*> 192.168.120.37/32    192.168.110.17                            00:00:35   [{Origin: i}]

The RIB on the upstream router:

inet.0: 54 destinations, 134 routes (54 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

192.168.120.37/32  *[BGP/170] 01:55:43, localpref 100, from 192.168.110.18
                      AS path: 64611 I, validation-state: unverified
                      to 192.168.110.14 via vlan.6
                    > to 192.168.110.15 via vlan.6
                      to 192.168.110.16 via vlan.6
                      to 192.168.110.17 via vlan.6
                      to 192.168.110.18 via vlan.6
                    [BGP/170] 01:55:43, localpref 100
                      AS path: 64611 I, validation-state: unverified
                    > to 192.168.110.14 via vlan.6
                    [BGP/170] 01:55:43, localpref 100
                      AS path: 64611 I, validation-state: unverified
                    > to 192.168.110.15 via vlan.6
                    [BGP/170] 01:55:43, localpref 100
                      AS path: 64611 I, validation-state: unverified
                    > to 192.168.110.16 via vlan.6
                    [BGP/170] 01:55:43, localpref 100
                      AS path: 64611 I, validation-state: unverified
                    > to 192.168.110.17 via vlan.6

@aauren
Collaborator

aauren commented Dec 9, 2021

Can you change internalTrafficPolicy to externalTrafficPolicy? See https://kubernetes.io/docs/concepts/services-networking/service/#external-traffic-policy

kube-router only pays attention to the external policy when deciding how to advertise BGP VIPs.

@yuchunyun
Author

Thanks, and apologies for my carelessness.
The ideal state is restored after setting type: LoadBalancer and externalTrafficPolicy: Local.
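For the record, a minimal sketch of the resulting working service, reconstructed from the manifest earlier in this thread with the two changed fields (an illustration, not the exact applied manifest):

apiVersion: v1
kind: Service
metadata:
  name: whoami
  namespace: default
spec:
  type: LoadBalancer              # changed from ClusterIP
  externalTrafficPolicy: Local    # only nodes with a local endpoint advertise the VIP
  externalIPs:
  - 192.168.120.37
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    run: whoami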

@yuchunyun
Author

I have one more question.

If externalTrafficPolicy is not set and only kube-router.io/service.dsr: tunnel is set, must NetworkPolicy.spec.ingress.from.ipBlock be set to the k8s node address rather than the real client address?

@aauren
Collaborator

aauren commented Dec 9, 2021

No worries! I sent you down that path a couple of comments back when I accidentally mixed up the internal and external policy. Sorry for my carelessness as well. 😅

So, just double checking, did this resolve the issue you were experiencing with network policy and obtaining the source ip from inside the pod?

@yuchunyun
Author

> No worries! I sent you down that path a couple of comments back when I accidentally mixed up the internal and external policy. Sorry for my carelessness as well. 😅
>
> So, just double checking, did this resolve the issue you were experiencing with network policy and obtaining the source ip from inside the pod?

Setting externalTrafficPolicy: Local resolved the issue.
I just had one more question, as mentioned above.

@murali-reddy
Member

> If externalTrafficPolicy is not set and only kube-router.io/service.dsr: tunnel is set, must NetworkPolicy.spec.ingress.from.ipBlock be set to the k8s node address rather than the real client address?

While this will allow traffic for services marked with kube-router.io/service.dsr: tunnel to be whitelisted, it will not be possible to enforce policies based on the real client IP address; i.e., any client accessing a service marked with kube-router.io/service.dsr: tunnel will be permitted (as we have added an exception to allow traffic from the nodes).

Unfortunately, I cannot see a solution that can be recommended. If you want to enforce network policies based on the real client IP address, your best bet is to use services that are marked externalTrafficPolicy: Local and do not need DSR.

This is an issue with how DSR is implemented in kube-router. Traffic is tunneled into the pod, so we miss the opportunity to perform proper network policy enforcement: when enforcement is done on the node, it is done on the encapsulated packet, which has a different IP address.

I will document this limitation of the current DSR implementation and potentially look for a solution.
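Concretely, with a service marked externalTrafficPolicy: Local (as in the sketch a few comments up) and no DSR annotation, the original policy against the real client CIDR should be enforceable. A sketch reusing the names from this thread:

kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: access-whoami
spec:
  podSelector:
    matchLabels:
      run: whoami
  ingress:
  - from:
    - ipBlock:
        cidr: 192.168.83.0/24   # real client CIDR, matchable because the packet is not encapsulated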

@yuchunyun
Author

@murali-reddy
Thanks for your explanation. I hope kube-router keeps getting better.

@tuananh170489

tuananh170489 commented Jan 24, 2022

Great!
I've been researching DSR as well, and luckily I found this issue.
So, will I be able to use --advertise-external-ip=true without BGP peering?
