clustermesh: Add support for service-affinity #19521

Merged
merged 5 commits into cilium:master from tam/clustermesh-service-affinity on Jun 15, 2022

Conversation

sayboras
Member

@sayboras sayboras commented Apr 21, 2022

Description

This PR adds a service-affinity annotation for global services in clustermesh.

Changes

Please refer to the individual commits for more details.

TODO

  • Update the docs later, either after the first round of review or in a separate PR.
  • Confirm if backport to 1.11 is required. Updated: backport is not required.
    • If yes, the backport should be done manually and separately due to conflicts with the quarantine backend service feature.

Testing

Testing was done locally with 2 kind clusters.

  • Service Affinity set as Local
    • Local endpoints are healthy, responses must be from these local endpoints only
    • Local endpoints are not healthy, responses can be from remote clusters
      • scale down local pods to 0
      • mark local endpoint as not Active (e.g. Quarantine)
  • Service Affinity set as Remote
    • Remote endpoints are healthy, responses must be from these remote endpoints only
    • Remote endpoints are not healthy, responses should fall back to local endpoints
      • scale down remote pods to 0
      • mark remote endpoint as not Active (e.g. Quarantine)
  • Toggle service affinity
    • From none to local
    • From none to remote
    • From remote to local
    • From local to remote
    • Remove service affinity
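
The expected behaviour in the matrix above can be summarised as a small selection rule: prefer healthy backends on the side named by the affinity, and fall back to every healthy backend when none are available. The following is only an illustrative model of that rule, not the code added by this PR; the `Backend` type and `select_backends` function are hypothetical names, and the addresses are borrowed from the `cilium service list` output in this description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backend:
    address: str
    local: bool          # True if the backend runs in the cluster doing the lookup
    active: bool = True  # False models a quarantined or otherwise unhealthy backend


def select_backends(backends, affinity="none"):
    """Return the backends eligible for load balancing under the given affinity.

    Hypothetical helper: it models the behaviour described in the test matrix,
    not Cilium's actual implementation.
    """
    healthy = [b for b in backends if b.active]
    if affinity == "local":
        preferred = [b for b in healthy if b.local]
    elif affinity == "remote":
        preferred = [b for b in healthy if not b.local]
    else:
        preferred = []
    # With no affinity, or with no healthy preferred backend, use all healthy ones.
    return preferred or healthy


if __name__ == "__main__":
    backends = [
        Backend("10.244.1.71:80", local=True),
        Backend("10.244.0.123:80", local=True),
        Backend("10.244.1.105:80", local=False),
        Backend("10.244.1.49:80", local=False),
    ]
    for affinity in ("none", "local", "remote"):
        chosen = [b.address for b in select_backends(backends, affinity)]
        print(affinity, chosen)
```

Scaling the local pods to 0 corresponds to passing a list with no local backends, and quarantining corresponds to `active=False`; in both cases the sketch falls back to the other cluster's backends, which is what the tests in this description check for.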
One case with local service affinity
# Starting with no service affinity
$ ./local/clustermesh_check.sh
Checking in cluster 1
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
Checking in cluster 2
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}

# Set service affinity to local for cluster 2 service
$ k annotate service rebel-base io.cilium/service-affinity="local" --context kind-kind-2 --overwrite
service/rebel-base annotated

# Check the preferred endpoints, which should match local endpoints
$ ksysex ds/cilium -c cilium-agent -- cilium service list                                           
ID   Frontend            Service Type   Backend                                     
1    10.96.0.1:443       ClusterIP      1 => 172.18.0.5:6443 (active)               
2    10.96.0.10:53       ClusterIP      1 => 10.244.1.249:53 (active)               
                                        2 => 10.244.1.170:53 (active)               
3    10.96.0.10:9153     ClusterIP      1 => 10.244.1.249:9153 (active)             
                                        2 => 10.244.1.170:9153 (active)             
11   10.96.198.86:2379   ClusterIP      1 => 10.244.1.111:2379 (active)             
12   10.96.178.32:80     ClusterIP      1 => 10.244.1.71:80 (active) (preferred)    
                                        2 => 10.244.0.123:80 (active) (preferred)   
                                        3 => 10.244.1.105:80 (active)               
                                        4 => 10.244.1.49:80 (active)   

$ kg endpoints                                           
NAME         ENDPOINTS                        AGE
kubernetes   172.18.0.5:6443                  3h19m
rebel-base   10.244.0.123:80,10.244.1.71:80   39m

# Run the test again
$ ./local/clustermesh_check.sh
Checking in cluster 1
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
Checking in cluster 2
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
Scale down the local service
$ kubectl scale --replicas=0 deployment/rebel-base --context kind-kind-2
deployment.apps/rebel-base scaled

# Check preferred service list again
$ ksysex ds/cilium -c cilium-agent -- cilium service list               
ID   Frontend            Service Type   Backend                           
1    10.96.0.1:443       ClusterIP      1 => 172.18.0.5:6443 (active)     
2    10.96.0.10:53       ClusterIP      1 => 10.244.1.249:53 (active)     
                                        2 => 10.244.1.170:53 (active)     
3    10.96.0.10:9153     ClusterIP      1 => 10.244.1.249:9153 (active)   
                                        2 => 10.244.1.170:9153 (active)   
11   10.96.198.86:2379   ClusterIP      1 => 10.244.1.111:2379 (active)   
12   10.96.178.32:80     ClusterIP      1 => 10.244.1.105:80 (active)     
                                        2 => 10.244.1.49:80 (active)   
                                        
# Run the check again
$ ./local/clustermesh_check.sh
Checking in cluster 1
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
Checking in cluster 2
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
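
Reading the check output by eye gets tedious across runs, so a small tally can help. The sketch below assumes the output format shown above (a `Checking in cluster N` header followed by one JSON object per response); `tally_by_section` and `only_from` are hypothetical helper names, and the contents of `clustermesh_check.sh` itself are not shown in this PR.

```python
import json
from collections import Counter


def tally_by_section(lines):
    """Count responses per serving cluster, grouped by the 'Checking in ...' header."""
    tallies = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("Checking in "):
            current = line[len("Checking in "):]
            tallies[current] = Counter()
        elif line.startswith("{") and current is not None:
            # Each response line is a JSON object with a "Cluster" field.
            tallies[current][json.loads(line)["Cluster"]] += 1
    return tallies


def only_from(tally, cluster):
    """True if every counted response in this tally came from the given cluster."""
    return set(tally) == {cluster}


if __name__ == "__main__":
    import sys

    for section, tally in tally_by_section(sys.stdin).items():
        print(section, dict(tally))
```

Saved under a name of your choosing, the script can read the check output from standard input. With local affinity set on the cluster 2 service and local endpoints healthy, `only_from(tallies["cluster 2"], "Cluster-2")` is expected to hold; after scaling the local deployment to 0, every response should instead come from `Cluster-1`.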

@maintainer-s-little-helper

Commit 346950bc9d39d7d6edab7823fe8cfa12015d20cb does not contain "Signed-off-by".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@maintainer-s-little-helper maintainer-s-little-helper bot added dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Apr 21, 2022
@sayboras sayboras added dont-merge/preview-only Only for preview or testing, don't merge it. and removed dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Apr 21, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot added dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Apr 21, 2022
@sayboras sayboras force-pushed the tam/clustermesh-service-affinity branch from 945ac54 to 9ddb9e3 Compare April 21, 2022 15:50
@sayboras sayboras added the release-note/minor This PR changes functionality that users may find relevant to operating Cilium. label Apr 21, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Apr 21, 2022
@sayboras sayboras added area/clustermesh Relates to multi-cluster routing functionality in Cilium. dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Apr 21, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Apr 21, 2022
bpf/lib/common.h (review thread: outdated, resolved)
@sayboras sayboras force-pushed the tam/clustermesh-service-affinity branch 2 times, most recently from 9f2cfc2 to d17686b Compare April 25, 2022 11:10
@sayboras
Member Author

Force-pushed to incorporate review comments; the tests mentioned in the PR description were run again.

@sayboras sayboras force-pushed the tam/clustermesh-service-affinity branch from d17686b to 809a187 Compare April 25, 2022 11:25
@sayboras sayboras removed the dont-merge/preview-only Only for preview or testing, don't merge it. label Apr 25, 2022
@sayboras sayboras marked this pull request as ready for review April 25, 2022 11:27
@sayboras sayboras requested a review from a team as a code owner April 25, 2022 11:27
@sayboras sayboras requested review from a team April 25, 2022 11:27
@sayboras sayboras requested a review from a team as a code owner April 25, 2022 11:27
@sayboras sayboras requested a review from a team April 25, 2022 11:27
@sayboras sayboras requested a review from a team as a code owner April 25, 2022 11:27
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jun 15, 2022
@aanm aanm merged commit 69da7f2 into cilium:master Jun 15, 2022
@sayboras sayboras deleted the tam/clustermesh-service-affinity branch June 15, 2022 17:23
sayboras added a commit to sayboras/cilium that referenced this pull request Jun 23, 2022
This commit is to add getting started docs for global service with
affinity feature.

Relates: cilium#19521
Signed-off-by: Tam Mach <tam.mach@cilium.io>
pchaigno pushed a commit that referenced this pull request Jun 28, 2022
This commit is to add getting started docs for global service with
affinity feature.

Relates: #19521
Signed-off-by: Tam Mach <tam.mach@cilium.io>
pchaigno pushed a commit to pchaigno/cilium that referenced this pull request Jun 28, 2022
[ upstream commit 4f6e7e7 ]

This commit is to add getting started docs for global service with
affinity feature.

Relates: cilium#19521
Signed-off-by: Tam Mach <tam.mach@cilium.io>
Signed-off-by: Paul Chaignon <paul@cilium.io>
qmonnet pushed a commit that referenced this pull request Jul 5, 2022
[ upstream commit 4f6e7e7 ]

This commit is to add getting started docs for global service with
affinity feature.

Relates: #19521
Signed-off-by: Tam Mach <tam.mach@cilium.io>
Signed-off-by: Paul Chaignon <paul@cilium.io>
gandro pushed a commit to gandro/cilium that referenced this pull request Aug 4, 2022
[ upstream commit 4f6e7e7 ]

This commit is to add getting started docs for global service with
affinity feature.

Relates: cilium#19521
Signed-off-by: Tam Mach <tam.mach@cilium.io>
Signed-off-by: Paul Chaignon <paul@cilium.io>
Signed-off-by: Michi Mutsuzaki <michi@isovalent.com>
dezmodue pushed a commit to dezmodue/cilium that referenced this pull request Aug 10, 2022
This commit is to add getting started docs for global service with
affinity feature.

Relates: cilium#19521
Signed-off-by: Tam Mach <tam.mach@cilium.io>
Labels
area/clustermesh Relates to multi-cluster routing functionality in Cilium. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-blocker/1.12 This issue will prevent the release of the next version of Cilium. release-note/minor This PR changes functionality that users may find relevant to operating Cilium. requires-doc-change Requires updates to the documentation as part of the development effort.

8 participants