
Implement globalnet for headless service without selector #1558

Merged
merged 2 commits into submariner-io:devel on Apr 28, 2022

Conversation

mkimuram
Contributor

@mkimuram mkimuram commented Sep 17, 2021

@submariner-bot
Contributor

🤖 Created branch: z_pr1558/mkimuram/issue/1537
🚀 Full E2E won't run until the "ready-to-test" label is applied. I will add it automatically once the PR has 2 approvals, or you can add it manually.

@mkimuram
Contributor Author

mkimuram commented Sep 17, 2021

@sridhargaddam @vthapar @aswinsuryan

This is still a prototype that only supports a single IP in the endpoints (and many quick & dirty hacks remain). However, it seems to work (see below for details). Let's use this prototype as a starting point for further discussion.

  1. Deploy this branch
make e2e using=external-net,globalnet
  2. Update RBAC to allow handling core/v1 endpoints on both cluster1 and cluster2 (this should be handled properly later; a non-interactive sketch follows this step)

(Just kubectl edit the ClusterRole submariner-globalnet to add endpoints to the resources list of the apiGroups == "" rule)

export KUBECONFIG=output/kubeconfigs/kind-config-cluster1
kubectl get ClusterRole submariner-globalnet -o yaml > /tmp/rbac-before.yaml 
kubectl edit ClusterRole submariner-globalnet
kubectl get ClusterRole submariner-globalnet -o yaml > /tmp/rbac-after.yaml 
diff -urN /tmp/rbac-before.yaml /tmp/rbac-after.yaml 
--- /tmp/rbac-before.yaml       2021-09-16 23:34:30.754644486 +0000
+++ /tmp/rbac-after.yaml        2021-09-16 23:35:13.890396448 +0000
@@ -3,7 +3,7 @@
 metadata:
   creationTimestamp: "2021-09-16T23:31:06Z"
   name: submariner-globalnet
-  resourceVersion: "859"
+  resourceVersion: "1952"
   uid: 14770711-cda4-47a7-85d3-8cb750a394be
 rules:
 - apiGroups:
@@ -13,6 +13,7 @@
   - services
   - namespaces
   - nodes
+  - endpoints
   verbs:
   - get
   - list

(Do the same to cluster2)

export KUBECONFIG=output/kubeconfigs/kind-config-cluster2
kubectl edit ClusterRole submariner-globalnet
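For reference, a non-interactive way to make the same RBAC change (a sketch; rule index 0 is an assumption based on the diff above and may differ in other deployments):

# Append "endpoints" to the resources of rule 0 of the submariner-globalnet ClusterRole in both clusters.
# Rule index 0 is assumed from the diff above; adjust the path if the rule order differs.
for ctx in cluster1 cluster2; do
  export KUBECONFIG=output/kubeconfigs/kind-config-${ctx}
  kubectl patch clusterrole submariner-globalnet --type=json \
    -p='[{"op": "add", "path": "/rules/0/resources/-", "value": "endpoints"}]'
done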
  3. Test a headless service without a selector
  • Create a headless service without a selector and its Endpoints, and export the service
docker inspect "ext-app" -f '{{(index .NetworkSettings.Networks "'pseudo-ext'").IPAddress}}'
ip=$(docker inspect "ext-app" -f '{{(index .NetworkSettings.Networks "'pseudo-ext'").IPAddress}}')
export KUBECONFIG=output/kubeconfigs/kind-config-cluster1 
cat << EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: test-vm
spec:
  clusterIP: None
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
EOF

cat << EOF | kubectl apply -f -
apiVersion: v1
kind: Endpoints
metadata:
  name: test-vm
subsets:
  - addresses:
      - ip: ${ip}
    ports:
      - port: 80
EOF

subctl export service -n default test-vm
  • Check the globalnet log
kubectl logs -n submariner-operator -l app=submariner-globalnet

I0917 16:42:44.860635       1 service_export_controller.go:147] Processing ServiceExport "default/test-vm"
I0917 16:42:44.961770       1 ingress_endpoints_controller.go:93] Created Endpoints controller for (default/test-vm) with selector "metadata.name=test-vm"
I0917 16:42:44.962547       1 ingress_endpoints_controller.go:125] "create" ingress Endpoints default/test-vm for service test-vm
I0917 16:42:44.972396       1 global_ingressip_controller.go:135] Processing created &v1.GlobalIngressIP{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"ep-test-vm", GenerateName:"", Namespace:"default", SelfLink:"", UID:"a163ca0d-125c-4f80-a0ad-c471c0cf6536", ResourceVersion:"2307", Generation:1, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:63767493764, loc:(*time.Location)(0x2026ea0)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string{"submariner.io/serviceRef":"test-vm"}, Annotations:map[string]string{"submariner.io/headless-svc-endpoints-ip":"172.117.0.4"}, OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry{v1.ManagedFieldsEntry{Manager:"submariner-globalnet", Operation:"Update", APIVersion:"submariner.io/v1", Time:(*v1.Time)(0xc0006a0db0), FieldsType:"FieldsV1", FieldsV1:(*v1.FieldsV1)(0xc0006a0dc8)}}}, Spec:v1.GlobalIngressIPSpec{Target:"HeadlessServiceEndpoints", ServiceRef:(*v1.LocalObjectReference)(0xc0005c7070), PodRef:(*v1.LocalObjectReference)(0xc0005c7080)}, Status:v1.GlobalIngressIPStatus{Conditions:[]v1.Condition(nil), AllocatedIP:""}}
I0917 16:42:44.972455       1 global_ingressip_controller.go:158] Allocating global IP for "default/ep-test-vm"
I0917 16:42:44.972476       1 iface.go:145] Installing iptables rule for Headless SVC Pod -d 242.254.1.253 -j DNAT --to 172.117.0.4
I0917 16:42:44.976890       1 iface.go:227] Installing iptable egress rules for HDLS SVC Pod "default/ep-test-vm": -p all -s 172.117.0.4 -m mark --mark 0xC0000/0xC0000 -j SNAT --to 242.254.1.253
I0917 16:42:44.981070       1 base_controllers.go:194] Updated: &v1.GlobalIngressIPStatus{Conditions:[]v1.Condition{v1.Condition{Type:"Allocated", Status:"True", ObservedGeneration:0, LastTransitionTime:v1.Time{Time:time.Time{wall:0xc0494f413a79ca9a, ext:311044842452, loc:(*time.Location)(0x2026ea0)}}, Reason:"Success", Message:"Allocated global IP"}}, AllocatedIP:"242.254.1.253"}
  • Confirm that globalingress IP is assigned to the service
kubectl get globalingressip ep-test-vm -o jsonpath='{.status.allocatedIP}{"\n"}'
242.254.1.253
  • Test access from cluster1 and cluster2, and check the log (a rule-level check is sketched after this step)
export KUBECONFIG=output/kubeconfigs/kind-config-cluster1 
kubectl -n default run tmp-shell --rm -i --tty --image quay.io/submariner/nettest -- bash
curl 242.254.1.253
export KUBECONFIG=output/kubeconfigs/kind-config-cluster2 
kubectl -n default run tmp-shell --rm -i --tty --image quay.io/submariner/nettest -- bash
curl 242.254.1.253
docker logs ext-app
172.117.0.3 - - [17/Sep/2021 16:44:17] "GET / HTTP/1.1" 200 -
242.254.2.6 - - [17/Sep/2021 16:44:37] "GET / HTTP/1.1" 200 -
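As an additional sanity check, the DNAT/SNAT rules reported in the globalnet log above can be inspected directly on the gateway node (a sketch; cluster1-worker is assumed to be the gateway node, matching the chain checks later in this thread):

# Look for the ingress DNAT rule (242.254.1.253 -> 172.117.0.4) and the matching
# egress SNAT rule programmed for the headless-service endpoint.
docker exec cluster1-worker iptables -t nat -S | grep 242.254.1.253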

@tpantelis tpantelis marked this pull request as draft September 17, 2021 17:12
@mkimuram
Contributor Author

mkimuram commented Sep 17, 2021

Also manually confirmed that reverse access works as expected.

  • Deploy StatefulSet in cluster2
export KUBECONFIG=output/kubeconfigs/kind-config-cluster2
cat << EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: nginx-ss
  labels:
    app.kubernetes.io/instance: nginx-ss
    app.kubernetes.io/name: nginx-ss
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None
  selector:
    app.kubernetes.io/instance: nginx-ss
    app.kubernetes.io/name: nginx-ss
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: "nginx-ss"
  replicas: 2
  selector:
    matchLabels:
        app.kubernetes.io/instance: nginx-ss
        app.kubernetes.io/name: nginx-ss
  template:
    metadata:
      labels:
        app.kubernetes.io/instance: nginx-ss
        app.kubernetes.io/name: nginx-ss
    spec:
      containers:
      - name: nginx-ss
        image: k8s.gcr.io/nginx-slim:0.8
        ports:
        - containerPort: 80
          name: web
EOF
subctl export service -n default nginx-ss
  • Check globalingressip for each pod managed by the StatefulSet
kubectl get globalingressip pod-web-0 -o jsonpath='{.status.allocatedIP}{"\n"}'
242.254.2.252
kubectl get globalingressip pod-web-1 -o jsonpath='{.status.allocatedIP}{"\n"}'
242.254.2.251
  • Test access from external app to each pod
docker exec -it ext-app bash
curl 242.254.2.252
curl 242.254.2.251
  • Check the access log for each pod (source IPs match ext-app's globalIngressIP)
kubectl logs web-0
242.254.1.253 - - [17/Sep/2021:19:26:29 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.29.0"
kubectl logs web-1
242.254.1.253 - - [17/Sep/2021:19:26:33 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.29.0"
  • Test access from each pod to external app
kubectl exec -it web-0 -- bash
curl 242.254.1.253
kubectl exec -it web-1 -- bash
curl 242.254.1.253
  • Check the access log from each pod (source IPs match each pod's globalIngressIP)
docker logs ext-app
242.254.2.252 - - [17/Sep/2021 19:29:55] "GET / HTTP/1.1" 200 -
242.254.2.251 - - [17/Sep/2021 19:30:15] "GET / HTTP/1.1" 200 -

Member

@sridhargaddam sridhargaddam left a comment

Thank you for the nice work @mkimuram.

A few things:

  1. Are you planning to push the Unit tests as part of the same PR?
  2. We need changes to the RBAC to support listening for endpoints both in Submariner-operator and submariner-charts. I think you can go ahead and push these PRs first.
  3. Did you test the end-to-end functionality and is it working fine?

pkg/globalnet/controllers/ingress_endpoints_controller.go (outdated, resolved)
pkg/globalnet/controllers/ingress_endpoints_controller.go (outdated, resolved)
klog.Infof("%q ingress Endpoints %s for service %s", op, key, c.svcName)

// TODO: consider handling multiple endpoints IPs
if len(endpoints.Subsets) == 0 || len(endpoints.Subsets[0].Addresses) == 0 {
Member

As per this implementation, if a user manually creates an endpoint with more than one ipAddress, we will only program rules for the first entry, right?

Contributor Author

Yes. The current implementation only uses the first entry.

This commit is meant to be just a quick prototype. I expect that multiple entries can be added; however, I'm not sure how multiple GlobalIngressIPs can be handled using admiral's syncer framework. Can it just return a GlobalIngressIPList? (And would we then need to create a controller for it?)

Contributor Author

Note that this is still TODO.

Member

Yes. The current implementation only uses the first entry.

This commit is meant to be just a quick prototype. I expect that multiple entries can be added; however, I'm not sure how multiple GlobalIngressIPs can be handled using admiral's syncer framework. Can it just return a GlobalIngressIPList? (And would we then need to create a controller for it?)

@tpantelis can you please advise?

Contributor Author

@sridhargaddam @anfredette @nerdalert

As per this implementation, if a user manually creates an endpoint with more than one ipAddress, we will only program rules for the first entry, right?

The last commit adds support for multiple IPs in the Endpoints of a headless service without a selector. PTAL.
(Note that the first 2 commits will be merged through #1582 )

The current E2E tests don't cover this specific case, but I've manually tested it as follows and it worked well:

  1. Deploy the test environment
make deploy using=external-net,globalnet
  2. Create a headless service without a selector with one endpoint, and test connectivity and source IPs
  • In cluster1, create a headless service without a selector with one endpoint
export KUBECONFIG=output/kubeconfigs/kind-config-cluster1
 
cat << EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: test-vm
spec:
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
  clusterIP: None
EOF
 
cat << EOF | kubectl apply -f -
apiVersion: v1
kind: Endpoints
metadata:
  name: test-vm
subsets:
  - addresses:
      - ip: 172.117.0.4
    ports:
      - port: 80
EOF
 
 
kubectl get svc,endpoints,globalingressip
NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
service/kubernetes   ClusterIP   100.0.0.1    <none>        443/TCP   23m
service/test-vm      ClusterIP   None         <none>        80/TCP    27s
 
NAME                   ENDPOINTS         AGE
endpoints/kubernetes   172.18.0.5:6443   23m
endpoints/test-vm      172.117.0.4:80    27s
 
 
subctl export service -n default test-vm
  • globalingressip is created after the service is exported
kubectl get svc,endpoints,globalingressip
NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
service/kubernetes   ClusterIP   100.0.0.1    <none>        443/TCP   24m
service/test-vm      ClusterIP   None         <none>        80/TCP    55s
 
NAME                   ENDPOINTS         AGE
endpoints/kubernetes   172.18.0.5:6443   24m
endpoints/test-vm      172.117.0.4:80    55s
 
NAME                                                   IP
globalingressip.submariner.io/ep-test-vm-172.117.0.4   242.254.1.253
 
kubectl get -o yaml globalingressip.submariner.io/ep-test-vm-172.117.0.4
apiVersion: submariner.io/v1
kind: GlobalIngressIP
metadata:
  annotations:
    submariner.io/headless-svc-endpoints-ip: 172.117.0.4
  creationTimestamp: "2022-03-08T22:48:58Z"
  generation: 2
  labels:
    submariner-io/originatingNamespace: default
    submariner.io/serviceRef: test-vm
  name: ep-test-vm-172.117.0.4
  namespace: default
  resourceVersion: "5297"
  uid: 46ed1f13-e65c-4240-8374-0935313d134b
spec:
  serviceRef:
    name: test-vm
  target: HeadlessServiceEndpoints
status:
  allocatedIP: 242.254.1.253
  conditions:
  - lastTransitionTime: "2022-03-08T22:48:58Z"
    message: Allocated global IP
    reason: Success
    status: "True"
    type: Allocated
  • In cluster2, create a StatefulSet and its headless service, and export the service.
export KUBECONFIG=output/kubeconfigs/kind-config-cluster2
 
cat << EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: nginx-ss
  labels:
    app.kubernetes.io/instance: nginx-ss
    app.kubernetes.io/name: nginx-ss
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None
  selector:
    app.kubernetes.io/instance: nginx-ss
    app.kubernetes.io/name: nginx-ss
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: "nginx-ss"
  replicas: 2
  selector:
    matchLabels:
        app.kubernetes.io/instance: nginx-ss
        app.kubernetes.io/name: nginx-ss
  template:
    metadata:
      labels:
        app.kubernetes.io/instance: nginx-ss
        app.kubernetes.io/name: nginx-ss
    spec:
      containers:
      - name: nginx-ss
        image: k8s.gcr.io/nginx-slim:0.8
        ports:
        - containerPort: 80
          name: web
EOF
 
subctl export service -n default nginx-ss
  • Test access from pod to external
kubectl get globalingressip
NAME        IP
pod-web-0   242.254.2.252
pod-web-1   242.254.2.253
 
kubectl exec -it web-0 -- curl -m 10 242.254.1.253
 
docker logs ext-app 2>&1 | tail -n 1
242.254.2.252 - - [08/Mar/2022 22:56:55] "GET / HTTP/1.1" 200 -
 
kubectl exec -it web-1 -- curl -m 10 242.254.1.253
 
docker logs ext-app 2>&1 | tail -n 1
242.254.2.253 - - [08/Mar/2022 22:57:40] "GET / HTTP/1.1" 200 -
  • Test access from external to pod
docker exec -it ext-app curl -m 10 242.254.2.252
 
kubectl logs web-0
242.254.1.253 - - [08/Mar/2022:22:59:11 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.29.0"
 
 
docker exec -it ext-app curl -m 10 242.254.2.253
kubectl logs web-1
242.254.1.253 - - [08/Mar/2022:23:00:12 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.29.0"
  3. Create another test container, add it to the endpoints, and test connectivity and source IPs
  • Create another test container
docker run --cap-add=NET_ADMIN --name ext-app2 --network pseudo-ext \
-d registry.access.redhat.com/ubi7/ubi python -m SimpleHTTPServer 80
 
docker exec -i ext-app2 yum install -y iproute
docker exec -i ext-app2 ip r add 242.254.1.0/24 via 172.117.0.3
docker exec -i ext-app2 ip r add 242.254.2.0/24 via 172.117.0.3
  • Add it to the endpoints (then a new globalingressip ep-test-vm-172.117.0.5 is created)
docker inspect ext-app2 -f '{{(index .NetworkSettings.Networks "pseudo-ext").IPAddress}}'
172.117.0.5
 
export KUBECONFIG=output/kubeconfigs/kind-config-cluster1
 
cat << EOF | kubectl apply -f -
apiVersion: v1
kind: Endpoints
metadata:
  name: test-vm
subsets:
  - addresses:
      - hostname: ext-app
        ip: 172.117.0.4
      - hostname: ext-app2
        ip: 172.117.0.5
    ports:
      - port: 80
EOF
 
kubectl get svc,endpoints,globalingressip
NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
service/kubernetes   ClusterIP   100.0.0.1    <none>        443/TCP   52m
service/test-vm      ClusterIP   None         <none>        80/TCP    28m
 
NAME                   ENDPOINTS                       AGE
endpoints/kubernetes   172.18.0.5:6443                 52m
endpoints/test-vm      172.117.0.4:80,172.117.0.5:80   28m
 
NAME                                                   IP
globalingressip.submariner.io/ep-test-vm-172.117.0.4   242.254.1.253
globalingressip.submariner.io/ep-test-vm-172.117.0.5   242.254.1.252
  • Test access from pod to external
export KUBECONFIG=output/kubeconfigs/kind-config-cluster2
 
kubectl exec -it web-0 -- curl -m 10 242.254.1.252
 
docker logs ext-app2 2>&1 | tail -n 1
242.254.2.252 - - [08/Mar/2022 23:19:05] "GET / HTTP/1.1" 200 -
 
kubectl exec -it web-1 -- curl -m 10 242.254.1.252
 
docker logs ext-app2 2>&1 | tail -n 1
242.254.2.253 - - [08/Mar/2022 23:19:25] "GET / HTTP/1.1" 200 -
  • Test access from external to pod
docker exec -it ext-app2 curl -m 10 242.254.2.252
 
kubectl logs web-0
242.254.1.253 - - [08/Mar/2022:22:59:11 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.29.0"
242.254.1.252 - - [08/Mar/2022:23:20:31 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.29.0"
 
docker exec -it ext-app2 curl -m 10 242.254.2.253
kubectl logs web-1
242.254.1.253 - - [08/Mar/2022:23:00:12 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.29.0"
242.254.1.252 - - [08/Mar/2022:23:20:42 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.29.0"

Comment on lines 96 to 99
// TODO: fixme (Maybe only rename and change messages?)
err := controller.iptIface.AddIngressRulesForHeadlessSvcPod(reservedIPs[0], target)
if err != nil {
return err
}

Member

This is the same as L88-L94. I agree, we can rename it and match both Targets with submarinerv1.HeadlessServicePod || submarinerv1.HeadlessServiceEndpoints.

pkg/globalnet/controllers/global_ingressip_controller.go (outdated, resolved)
Contributor Author

@mkimuram mkimuram left a comment

@sridhargaddam @tpantelis Thank you for your review. I will reflect the comments and add unit tests.

klog.Infof("%q ingress Endpoints %s for service %s", op, key, c.svcName)

// TODO: consider handling multiple endpoints IPs
if len(endpoints.Subsets) == 0 || len(endpoints.Subsets[0].Addresses) == 0 {
Contributor Author

Yes. The current implementation only uses the first entry.

This commit is meant to be just a quick prototype. I expect that multiple entries can be added; however, I'm not sure how multiple GlobalIngressIPs can be handled using admiral's syncer framework. Can it just return a GlobalIngressIPList? (And would we then need to create a controller for it?)

pkg/globalnet/controllers/ingress_endpoints_controller.go (outdated, resolved)
@mkimuram
Contributor Author

We need changes to the RBAC to support listening for endpoints both in Submariner-operator and submariner-charts. I think you can go ahead and push these PRs first.

Created the issues and PRs, and added descriptions of the dependencies to the first comment of this PR.

@mkimuram
Contributor Author

mkimuram commented Sep 22, 2021

@sridhargaddam @tpantelis

Addressed review comments. PTAL

Are you planning to push the Unit tests as part of the same PR?

Unit tests are added.

Did you test the end-to-end functionality and is it working fine?

End-to-end functionality was tested manually as described here and here. I will add automated E2E tests later on top of #1555.

@mkimuram
Contributor Author

mkimuram commented Oct 1, 2021

Refactored and added E2E tests. The last three commits are for E2E tests, but only the tests in the last commit are for this PR's change. I will move the other E2E commits to another PR to make it clear.

Note that the tests in the last commit fail due to the lack of access rights to endpointslice (and there seems to be a bug in the cleanup logic for headless services without a selector, because unrelated tests start failing after these tests). I will investigate and fix it.

@stale

stale bot commented Oct 15, 2021

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix This will not be worked on label Oct 15, 2021
@sridhargaddam sridhargaddam removed the wontfix This will not be worked on label Oct 16, 2021
@stale

stale bot commented Oct 30, 2021

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix This will not be worked on label Oct 30, 2021
@stale stale bot removed the wontfix This will not be worked on label Nov 4, 2021
@sridhargaddam
Member

When this PR is ready for review, please move it out of draft status.

@mkimuram
Contributor Author

mkimuram commented Apr 1, 2022

Removed the workaround commit and rebased to include the fix for the flaky tests. Now it should be ready to be merged.

@mkimuram
Contributor Author

mkimuram commented Apr 4, 2022

@sridhargaddam @tpantelis Could you approve this PR?

Member

@sridhargaddam sridhargaddam left a comment

Thanks for the PR @mkimuram.

@@ -28,6 +28,7 @@ const (
// The following chains are added as part of GN 2.0 implementation.
SmGlobalnetEgressChainForPods = "SM-GN-EGRESS-PODS"
SmGlobalnetEgressChainForHeadlessSvcPods = "SM-GN-EGRESS-HDLS-PODS"
SmGlobalnetEgressChainForHeadlessSvcEPs = "SM-GN-EGRESS-HDLS-EPS"
Member

Can we re-use the same chain SmGlobalnetEgressChainForHeadlessSvcPods?

Contributor Author

I think we can. However, the name is a bit confusing, and it puts rules for different purposes in one chain. If there is no side effect (like an upgrade issue), it would be cleaner to add a separate chain.

Member

I had a look at the code and thought about it for some time, and I think it should be okay even if we introduce a new chain in between. Can you please verify once in your local setup by running this updated Globalnet image on an existing (v0.12.0) Globalnet setup? If you find that there is no issue, we can go with the new chain.

Contributor Author

@sridhargaddam Confirmed that it works as expected. Please also see the details below:

  • Build the container image for this PR
git checkout issue/1537
make images
docker tag quay.io/submariner/submariner-globalnet:dev quay.io/submariner/submariner-globalnet:issue1537
  • Deploy v0.12.0 submariner
git checkout v0.12.0

make deploy using=external-net,globalnet
  • Confirm the nat chain for v0.12.0 (SM-GN-EGRESS-HDLS-EPS doesn't exist)
docker exec -it cluster1-worker iptables -t nat -S SUBMARINER-GN-EGRESS
# Warning: iptables-legacy tables present, use iptables-legacy to see them
-N SUBMARINER-GN-EGRESS
-A SUBMARINER-GN-EGRESS -j SUBMARINER-GN-MARK
-A SUBMARINER-GN-EGRESS -j SM-GN-EGRESS-PODS
-A SUBMARINER-GN-EGRESS -j SM-GN-EGRESS-HDLS-PODS
-A SUBMARINER-GN-EGRESS -j SM-GN-EGRESS-NS
-A SUBMARINER-GN-EGRESS -j SM-GN-EGRESS-CLUSTER

docker exec -it cluster2-worker iptables -t nat -S SUBMARINER-GN-EGRESS
# Warning: iptables-legacy tables present, use iptables-legacy to see them
-N SUBMARINER-GN-EGRESS
-A SUBMARINER-GN-EGRESS -j SUBMARINER-GN-MARK
-A SUBMARINER-GN-EGRESS -j SM-GN-EGRESS-PODS
-A SUBMARINER-GN-EGRESS -j SM-GN-EGRESS-HDLS-PODS
-A SUBMARINER-GN-EGRESS -j SM-GN-EGRESS-NS
-A SUBMARINER-GN-EGRESS -j SM-GN-EGRESS-CLUSTER
  • Test that the existing tests pass
make e2e using=external-net,globalnet

• [SLOW TEST:46.771 seconds]
[external-dataplane] Connectivity
/go/src/github.com/submariner-io/submariner/test/external/dataplane/connectivity
.go:41
  should be able to connect from an external app to a pod in a cluster
  /go/src/github.com/submariner-io/submariner/test/external/dataplane/connectivi
ty.go:44
------------------------------

JUnit path was configured: /go/src/github.com/submariner-io/submariner/output/e2
e-junit.xml

JUnit report was created: /go/src/github.com/submariner-io/submariner/output/e2e
-junit.xml

Ran 1 of 1 Specs in 46.822 seconds
SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 0 Skipped
--- PASS: TestE2E (46.82s)
PASS
  • Update the globalnet image to the one built from this PR
    (submariner-operator doesn't allow directly updating the globalnet daemonset because it reconciles the image, so it seems that we need to update the Submariner CR; a non-interactive patch sketch follows this step).
kind load docker-image quay.io/submariner/submariner-globalnet:issue1537 --name cluster1
kind load docker-image quay.io/submariner/submariner-globalnet:issue1537 --name cluster2

export KUBECONFIG=output/kubeconfigs/kind-config-cluster1
kubectl get daemonset submariner-globalnet -n submariner-operator -o jsonpath='{range .spec.template.spec.containers[*]}{.name}{" "}{.image}{"\n"}{end}'
submariner-globalnet localhost:5000/submariner-globalnet:local

kubectl get submariner -n submariner-operator -o yaml > /tmp/submariner_cr.txt
kubectl edit submariner -n submariner-operator
kubectl get submariner -n submariner-operator -o yaml > /tmp/submariner_cr_after.txt

diff -urN  /tmp/submariner_cr.txt /tmp/submariner_cr_after.txt | grep -C 2 "imageOverrides"
     debug: false
     globalCIDR: 242.254.1.0/24
+    imageOverrides:
+      submariner-globalnet: quay.io/submariner/submariner-globalnet:issue1537
     namespace: submariner-operator

kubectl get daemonset submariner-globalnet -n submariner-operator -o jsonpath='{range .spec.template.spec.containers[*]}{.name}{" "}{.image}{"\n"}{end}'
submariner-globalnet quay.io/submariner/submariner-globalnet:issue1537
export KUBECONFIG=output/kubeconfigs/kind-config-cluster2
kubectl edit submariner -n submariner-operator

kubectl get daemonset submariner-globalnet -n submariner-operator -o jsonpath='{range .spec.template.spec.containers[*]}{.name}{" "}{.image}{"\n"}{end}'
submariner-globalnet quay.io/submariner/submariner-globalnet:issue1537
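A non-interactive equivalent of the kubectl edit steps above (a sketch; the CR name submariner is an assumption and can be confirmed with kubectl get submariner -n submariner-operator):

# Set the globalnet image override on the Submariner CR in both clusters.
# The CR name "submariner" is an assumption; adjust if your deployment uses a different name.
for ctx in cluster1 cluster2; do
  export KUBECONFIG=output/kubeconfigs/kind-config-${ctx}
  kubectl patch submariner submariner -n submariner-operator --type=merge \
    -p '{"spec":{"imageOverrides":{"submariner-globalnet":"quay.io/submariner/submariner-globalnet:issue1537"}}}'
done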
  • Confirm the nat chain for this PR (SM-GN-EGRESS-HDLS-EPS exists in the right order)
docker exec -it cluster1-worker iptables -t nat -S SUBMARINER-GN-EGRESS
# Warning: iptables-legacy tables present, use iptables-legacy to see them
-N SUBMARINER-GN-EGRESS
-A SUBMARINER-GN-EGRESS -j SUBMARINER-GN-MARK
-A SUBMARINER-GN-EGRESS -j SM-GN-EGRESS-PODS
-A SUBMARINER-GN-EGRESS -j SM-GN-EGRESS-HDLS-PODS
-A SUBMARINER-GN-EGRESS -j SM-GN-EGRESS-HDLS-EPS
-A SUBMARINER-GN-EGRESS -j SM-GN-EGRESS-NS
-A SUBMARINER-GN-EGRESS -j SM-GN-EGRESS-CLUSTER

docker exec -it cluster2-worker iptables -t nat -S SUBMARINER-GN-EGRESS
# Warning: iptables-legacy tables present, use iptables-legacy to see them
-N SUBMARINER-GN-EGRESS
-A SUBMARINER-GN-EGRESS -j SUBMARINER-GN-MARK
-A SUBMARINER-GN-EGRESS -j SM-GN-EGRESS-PODS
-A SUBMARINER-GN-EGRESS -j SM-GN-EGRESS-HDLS-PODS
-A SUBMARINER-GN-EGRESS -j SM-GN-EGRESS-HDLS-EPS
-A SUBMARINER-GN-EGRESS -j SM-GN-EGRESS-NS
-A SUBMARINER-GN-EGRESS -j SM-GN-EGRESS-CLUSTER
  • Test that the e2e tests for this PR succeed
git checkout issue/1537

go mod vendor

make "e2e" using="external-net globalnet"


Ran 15 of 28 Specs in 281.317 seconds
SUCCESS! -- 15 Passed | 0 Failed | 7 Pending | 6 Skipped
--- PASS: TestE2E (281.34s)
PASS

@@ -287,7 +288,12 @@ func (g *gatewayMonitor) startControllers() error {
return errors.Wrap(err, "error creating the IngressPodControllers")
}

c, err = NewServiceExportController(g.syncerConfig, podControllers)
endpointsControllers, err := NewIngressEndpointsControllers(g.syncerConfig)
Member

nit: endpointsController and similarly NewIngressEndpointsController

}

if c.ingressIPMap.Contains(ingressIP.Name) {
// Avoid assigning ingressIPs to endpoints that are not ready with an endpoint IP
Member

I think this debug message is not relevant for endpoints

klog.Infof("%q ingress Endpoints %s for service %s", op, key, c.svcName)

// TODO: consider handling multiple endpoints IPs
if len(endpoints.Subsets) == 0 || len(endpoints.Subsets[0].Addresses) == 0 {
Member

Yes. The current implementation only uses the first entry.

This commit is meant to be just a quick prototype. I expect that multiple entries can be added; however, I'm not sure how multiple GlobalIngressIPs can be handled using admiral's syncer framework. Can it just return a GlobalIngressIPList? (And would we then need to create a controller for it?)

@tpantelis can you please advise?

}

func areEndpointsEqual(obj1, obj2 *unstructured.Unstructured) bool {
subsets1, _, _ := unstructured.NestedSlice(obj1.Object, "subsets")
Member

Here we are checking the entire subsets, whereas when creating the IngressIP we only consider the IP address from subsets[0] (Line 131). Should we check only subsets[0]?

return nil, errors.Wrap(err, "error creating the syncer")
}

if err := controller.Start(); err != nil {
Member

Any specific reason for starting the controller here instead of in gateway_monitor.go Line 316?

Contributor Author

In my understanding, it needs to be started here, because this controller is started via IngressEndpointsControllers.start() on a per-Service basis. It uses a similar approach to the one ingress_pod_controllers.go uses for ingress_pod_controller.go.

annotations := ingressIP.GetAnnotations()
if epIP, ok := annotations[headlessSvcEndpointsIP]; ok {
if _, used := epIPs[epIP]; used {
// ingressIP is still used
Member

If the ingressIP is still used, should we remove it from the epIP list? Otherwise, we seem to create new IngressIPs at Line 152. Am I missing something?

Contributor Author

I thought it was safe to just ensure that the ingressIP resource exists, but I noticed that calling create without checking for the already-exists error doesn't do the intended work. So it would be better to remove it from epIPs as you mentioned. Thank you for pointing it out.

return fmt.Errorf("globalIP %q or podIP %q cannot be empty", globalIP, podIP)
func (i *ipTables) AddIngressRulesForHeadlessSvc(globalIP, ip string, targetType TargetType) error {
if globalIP == "" || ip == "" {
return fmt.Errorf("globalIP %q or %s IP %q cannot be empty", targetType, globalIP, ip)
Member

Suggested change
return fmt.Errorf("globalIP %q or %s IP %q cannot be empty", targetType, globalIP, ip)
return fmt.Errorf("globalIP %q or %s IP %q cannot be empty", globalIP, targetType, ip)

Contributor Author

Fixed as suggested.

return fmt.Errorf("globalIP %q or podIP %q cannot be empty", globalIP, podIP)
func (i *ipTables) RemoveIngressRulesForHeadlessSvc(globalIP, ip string, targetType TargetType) error {
if globalIP == "" || ip == "" {
return fmt.Errorf("globalIP %q or %s IP %q cannot be empty", targetType, globalIP, ip)
Member

Suggested change
return fmt.Errorf("globalIP %q or %s IP %q cannot be empty", targetType, globalIP, ip)
return fmt.Errorf("globalIP %q or %s IP %q cannot be empty", globalIP, targetType, ip)

Contributor Author

Fixed as suggested.

Contributor Author

@mkimuram mkimuram left a comment

@sridhargaddam Thank you for your review and comments. I replied to some of the comments and will update the code later.

@@ -390,6 +396,12 @@ func (g *gatewayMonitor) createGlobalnetChains() error {
return errors.Wrapf(err, "error creating iptables chain %s", constants.SmGlobalnetEgressChainForHeadlessSvcPods)
}

klog.V(log.DEBUG).Infof("Install/ensure %s chain exists", constants.SmGlobalnetEgressChainForHeadlessSvcEPs)

if err := iptables.CreateChainIfNotExists(g.ipt, "nat", constants.SmGlobalnetEgressChainForHeadlessSvcEPs); err != nil {
Contributor Author

I think the jump rule to this chain is added using iptables.InsertUnique with the insert position specified, so where this chain is inserted should be consistent regardless of whether the other chains are already present.

However, sharing the SmGlobalnetEgressChainForHeadlessSvcPods chain may be another way to work around it, as you mentioned.

annotations := ingressIP.GetAnnotations()
if epIP, ok := annotations[headlessSvcEndpointsIP]; ok {
if _, used := epIPs[epIP]; used {
// ingressIP is still used
Contributor Author

I thought it was safe to just ensure that the ingressIP resource exists, but I noticed that calling create without checking for the already-exists error doesn't do the intended work. So it would be better to remove it from epIPs as you mentioned. Thank you for pointing it out.


controller.resourceSyncer, err = syncer.NewResourceSyncer(&syncer.ResourceSyncerConfig{
Name: "Ingress Endpoints syncer",
ResourceType: &corev1.Endpoints{},
Contributor Author

If Endpoints are deprecated, we would need to re-implement this to use EndpointSlices. However, I'm not sure if there is a fixed plan to deprecate them.

For this particular headless-service-without-selector case, I think we should use Endpoints, because "The control plane automatically creates EndpointSlices for any Kubernetes Service that has a selector specified.", as described here. As tested below, EndpointSlices don't seem to be created automatically for manually created Endpoints.

kubectl get endpoints,endpointslices
NAME                   ENDPOINTS         AGE
endpoints/kubernetes   172.18.0.2:6443   2m39s

NAME                                        ADDRESSTYPE   PORTS   ENDPOINTS    AGE
endpointslice.discovery.k8s.io/kubernetes   IPv4          6443    172.18.0.2   2m39s

cat << EOF | kubectl apply -f -
apiVersion: v1
kind: Endpoints
metadata:
  name: test-vm
subsets:
  - addresses:
      - ip: 192.168.4.2
    ports:
      - port: 80
EOF

kubectl get endpoints,endpointslices
NAME                   ENDPOINTS         AGE
endpoints/kubernetes   172.18.0.2:6443   3m48s
endpoints/test-vm      192.168.4.2:80    4s

NAME                                        ADDRESSTYPE   PORTS   ENDPOINTS    AGE
endpointslice.discovery.k8s.io/kubernetes   IPv4          6443    172.18.0.2   3m48s

@@ -28,6 +28,7 @@ const (
// The following chains are added as part of GN 2.0 implementation.
SmGlobalnetEgressChainForPods = "SM-GN-EGRESS-PODS"
SmGlobalnetEgressChainForHeadlessSvcPods = "SM-GN-EGRESS-HDLS-PODS"
SmGlobalnetEgressChainForHeadlessSvcEPs = "SM-GN-EGRESS-HDLS-EPS"
Contributor Author

I think we can. However, the name is a bit confusing, and it puts rules for different purposes in one chain. If there is no side effect (like an upgrade issue), it would be cleaner to add a separate chain.

@mkimuram mkimuram force-pushed the issue/1537 branch 2 times, most recently from 7f43500 to 9e63ceb on April 13, 2022 22:48
@@ -52,6 +52,7 @@ func UninstallDataPath() {
// The chains have to be deleted in a specific order.
constants.SmGlobalnetEgressChainForCluster,
constants.SmGlobalnetEgressChainForHeadlessSvcPods,
constants.SmGlobalnetEgressChainForHeadlessSvcEPs,
Contributor Author

@mkimuram mkimuram Apr 14, 2022

Also, added uninstall code here.

With this change, this version of the subctl command starts trying to delete this chain, while an old globalnet deployment doesn't have this chain. However, uninstallation continues even if FlushIPTableChain fails, so there won't be an issue due to version mismatch.

Actually, ClearChain called from FlushIPTableChain doesn't seem to return an error even when the chain doesn't already exist (it creates a new one and flushes it). So it should be safe to just add the chain to the list to be deleted.
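A rough shell-level sketch of that create-then-flush behavior (an approximation of what ClearChain does, shown on a gateway node; not the actual implementation):

# Emulate ClearChain: create the chain if it doesn't exist yet, then flush it.
# Either way the end state is an existing, empty chain, so no error is surfaced.
iptables -t nat -N SM-GN-EGRESS-HDLS-EPS 2>/dev/null || true
iptables -t nat -F SM-GN-EGRESS-HDLS-EPS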

Member

@sridhargaddam sridhargaddam left a comment

Thanks for addressing the review comments @mkimuram.

@tpantelis @anfredette, can you please review?

@sridhargaddam
Member

@mkimuram The last commit can be squashed with the previous one as there is a linting error.

Signed-off-by: Masaki Kimura <masaki.kimura@hitachivantara.com>
Contributor

@tpantelis tpantelis left a comment

Looks good - just a couple of minor things.

c.resourceSyncer.Reconcile(func() []runtime.Object {
objList, err := client.List(context.TODO(), metav1.ListOptions{LabelSelector: labelSelector})
objList, err := client.List(context.TODO(), listOptions)
Contributor

We can simplify this.

Suggested change
objList, err := client.List(context.TODO(), listOptions)
objList, err := client.List(context.TODO(), metav1.ListOptions{
LabelSelector: labelSelector,
FieldSelector: fieldSelector,
})

Comment on lines 228 to 252
var target string
var tType iptables.TargetType

if ingressIP.Spec.Target == submarinerv1.HeadlessServicePod {
target = ingressIP.GetAnnotations()[headlessSvcPodIP]
if target == "" {
_ = c.pool.Release(ips...)

klog.Warningf("%q annotation is missing on %q", headlessSvcPodIP, key)

return true
}

tType = iptables.PodTarget
} else if ingressIP.Spec.Target == submarinerv1.HeadlessServiceEndpoints {
target = ingressIP.GetAnnotations()[headlessSvcEndpointsIP]
if target == "" {
_ = c.pool.Release(ips...)

klog.Warningf("%q annotation is missing on %q", headlessSvcEndpointsIP, key)

return true
}

tType = iptables.EndpointsTarget
Contributor

We can simplify to:

var annotationKey string
var tType iptables.TargetType

if ingressIP.Spec.Target == submarinerv1.HeadlessServicePod {
    annotationKey = headlessSvcPodIP
    tType = iptables.PodTarget
} else if ingressIP.Spec.Target == submarinerv1.HeadlessServiceEndpoints {
    annotationKey = headlessSvcEndpointsIP
    tType = iptables.EndpointsTarget
}

target = ingressIP.GetAnnotations()[annotationKey]
if target == "" {
    _ = c.pool.Release(ips...)

    klog.Warningf("%q annotation is missing on %q", annotationKey, key)

    return true
}

klog.Infof("%q ingress Endpoints %s for service %s", op, key, c.svcName)

// Get list of current endpoint IPs
epIPs := map[string]bool{}
Contributor

Suggested change
epIPs := map[string]bool{}
usedEpIPs := map[string]bool{}

annotations := ingressIP.GetAnnotations()
if epIP, ok := annotations[headlessSvcEndpointsIP]; ok {
// ingressIP is still used
if _, used := epIPs[epIP]; used {
Contributor

Suggested change
if _, used := epIPs[epIP]; used {
if usedEpIPs[epIP] {

Signed-off-by: Masaki Kimura <masaki.kimura@hitachivantara.com>
@mkimuram
Contributor Author

@tpantelis Thank you for your review. Fixed as suggested.

Note that "Linting / Commit Message(s)" and "Linting / Markdown Links (modified files)" are consistently failing, but they are "not required". Also, "Upgrade / Latest Release to Latest Version" seems to be flaky (but it succeeded this time).

@tpantelis tpantelis merged commit de3223c into submariner-io:devel Apr 28, 2022
@submariner-bot
Contributor

🤖 Closed branches: [z_pr1558/mkimuram/issue/1537]

Labels
confirmed For issues and PRs which we definitely want (disables the stale bot) enhancement New feature or request ready-to-test When a PR is ready for full E2E testing

Successfully merging this pull request may close these issues.

Headless service without selector doesn't work