Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework spidercoordinator informer to update pod and service cidr #3249

Merged

Conversation

cyclinder
Copy link
Collaborator

@cyclinder cyclinder commented Feb 26, 2024

The responsibility of the SpiderCoordinator Informer is to monitor changes in the Pod and Service CIDR of the cluster and automatically update them in the Status of SpiderCoordinator. However, due to different sources of Pod and Service CIDR in different cluster environments, a uniform approach cannot be adopted. Additionally, to avoid coroutine leaks and confusion in updating status,

Thanks for contributing!

What type of PR is this?

  • release/bug
  • release/feature

What this PR does / why we need it:

This PR addresses the following four main issues:

  1. Avoiding goroutine leakage and avoiding frequent goroutine creation and closure affecting performance. In the previous version, every time the spidercoordiantor's instance is updated, it will close and restart the goroutine that listens to the pod and the service, which may affect the performance or even leak the goroutine in large-scale case.

  2. Unify the updating of spidercoordinator status in each reconcile event of spidercoordinator, instead of updating it in different places. Avoid confusing or even unsynchronized status updates.

  3. Avoid unnecessary reconcile events, Reduce requests to interact with the APISever.

  4. Add a "Reason" field to SpiderCoordinator's status, which can explain why the Phase is NotReady.

apiVersion: spiderpool.spidernet.io/v2beta1
kind: SpiderCoordinator
metadata:
  creationTimestamp: "2024-02-27T10:18:43Z"
  finalizers:
  - spiderpool.spidernet.io
  generation: 5
  name: default
  resourceVersion: "174300"
  uid: 95891abd-2ab3-4fcf-b5cc-68570a551ee7
spec:
  detectGateway: false
  detectIPConflict: false
  hijackCIDR:
  - 10.244.64.0/18
  - fd00:10:244::/112
  hostRPFilter: 0
  hostRuleTable: 500
  mode: auto
  podCIDRType: calico
  podDefaultRouteNIC: ""
  podMACPrefix: ""
  tunePodRoutes: true
  txQueueLen: 0
status:
  overlayPodCIDR: []
  phase: NotReady
  reason: 'failed to get API group resources: unable to retrieve the complete list
    of server APIs: crd.projectcalico.org/v1: the server could not find the requested
    resource'
  serviceCIDR:
  - 10.233.0.0/18
  - fd00:10:233::/116
{"level":"INFO","ts":"2024-02-28T07:06:31.313Z","logger":"Coordinator-Informer","caller":"coordinatormanager/coordinator_informer.go:312","msg":"PodCIDRtype changed from calico to cilium","CoordinatorName":"default","Operation":"UPDATE"}
{"level":"DEBUG","ts":"2024-02-28T07:06:31.313Z","logger":"Coordinator-Informer","caller":"coordinatormanager/coordinator_informer.go:314","msg":"Enqueue Coordinator","CoordinatorName":"default","Operation":"UPDATE"}
{"level":"INFO","ts":"2024-02-28T07:06:31.313Z","logger":"spiderpool","caller":"record/event.go:364","msg":"Event(v1.ObjectReference{Kind:\"SpiderCoordinator\", Namespace:\"\", Name:\"default\", UID:\"95891abd-2ab3-4fcf-b5cc-68570a551ee7\", APIVersion:\"spiderpool.spidernet.io/v2beta1\", ResourceVersion:\"180876\", FieldPath:\"\"}): type: 'Normal' reason: 'PodCIDRTypeChanged' Pod CIDR type changed from calico to cilium"}
{"level":"INFO","ts":"2024-02-28T07:06:31.316Z","logger":"Coordinator-Informer","caller":"coordinatormanager/coordinator_informer.go:462","msg":"Trying to fetch the ClusterCIDR from kube-system/kubeadm-config","Event Key":"SpiderCoordinator/default","Operation":"PROCESS"}
{"level":"INFO","ts":"2024-02-28T07:06:31.316Z","logger":"Coordinator-Informer","caller":"coordinatormanager/coordinator_informer.go:483","msg":"Detect podCIDRType is: cilium, try to update podCIDR","Event Key":"SpiderCoordinator/default","Operation":"PROCESS"}
{"level":"WARN","ts":"2024-02-28T07:06:31.319Z","logger":"Coordinator-Informer","caller":"coordinatormanager/coordinator_informer.go:507","msg":"ServiceCIDR is unavailable, update service cidr from cluster service cidr","Event Key":"SpiderCoordinator/default","Operation":"PROCESS"}
{"level":"INFO","ts":"2024-02-28T07:06:31.319Z","logger":"Coordinator-Informer","caller":"coordinatormanager/coordinator_informer.go:435","msg":"Patching coordinator's status from {NotReady failed to get API group resources: unable to retrieve the complete list of server APIs: crd.projectcalico.org/v1: the server could not find the requested resource [] [10.233.0.0/18 fd00:10:233::/116]} to {Synced  [10.244.64.0/18 fd00:10:244::/112] [10.233.0.0/18 fd00:10:233::/116]}","Event Key":"SpiderCoordinator/default","Operation":"PROCESS"}
{"level":"INFO","ts":"2024-02-28T07:06:31.319Z","logger":"spiderpool","caller":"record/event.go:364","msg":"Event(v1.ObjectReference{Kind:\"SpiderCoordinator\", Namespace:\"\", Name:\"default\", UID:\"95891abd-2ab3-4fcf-b5cc-68570a551ee7\", APIVersion:\"spiderpool.spidernet.io/v2beta1\", ResourceVersion:\"180876\", FieldPath:\"\"}): type: 'Warning' reason: 'NetworkingV1alpha1NotFound' ServiceCIDR is available in k8s v1.29, Your cluster version maybe less than v1.29"}
{"level":"INFO","ts":"2024-02-28T07:06:31.357Z","logger":"Coordinator-Informer","caller":"coordinatormanager/coordinator_informer.go:325","msg":"status have no changes, ignore add it to workqueue","CoordinatorName":"default","Operation":"UPDATE"}
{"level":"INFO","ts":"2024-02-28T07:06:31.357Z","logger":"Coordinator-Informer","caller":"coordinatormanager/coordinator_informer.go:440","msg":"Success to patch coordinator's status to {Synced  [10.244.64.0/18 fd00:10:244::/112] [10.233.0.0/18 fd00:10:233::/116]}","Event Key":"SpiderCoordinator/default","Operation":"PROCESS"}
{"level":"INFO","ts":"2024-02-28T07:06:31.357Z","logger":"Coordinator-Informer","caller":"coordinatormanager/coordinator_informer.go:419","msg":"Succeed to SYNC","Event Key":"SpiderCoordinator/default","Operation":"PROCESS"}

Which issue(s) this PR fixes:

Fixes #3225
Fixes #3210

Special notes for your reviewer:

Copy link

codecov bot commented Feb 26, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 81.12%. Comparing base (e483ecd) to head (54a92bc).

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##             main    #3249   +/-   ##
=======================================
  Coverage   81.12%   81.12%           
=======================================
  Files          50       50           
  Lines        5373     5373           
=======================================
  Hits         4359     4359           
  Misses        856      856           
  Partials      158      158           
Flag Coverage Δ
unittests 81.12% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

@cyclinder cyclinder force-pushed the rework_spidercoordinator_informer branch 2 times, most recently from 1524d0c to 7573699 Compare February 27, 2024 07:13
@cyclinder cyclinder force-pushed the rework_spidercoordinator_informer branch 5 times, most recently from cd1ce62 to 18b48b6 Compare February 28, 2024 04:47
@cyclinder cyclinder force-pushed the rework_spidercoordinator_informer branch 4 times, most recently from 8a71100 to 7e494e4 Compare February 28, 2024 06:37
@cyclinder cyclinder added the release/feature-new release note for new feature label Feb 28, 2024
@cyclinder cyclinder force-pushed the rework_spidercoordinator_informer branch 5 times, most recently from 99cc9ec to 8238f05 Compare February 28, 2024 09:32
The responsibility of the SpiderCoordinator Informer is to monitor changes in the Pod and Service CIDR of the
cluster and automatically update them in the Status of SpiderCoordinator. However, due to different sources
of Pod and Service CIDR in different cluster environments, a uniform approach cannot be adopted. Additionally,
to avoid coroutine leaks and confusion in updating status.

Signed-off-by: cyclinder <qifeng.guo@daocloud.io>
@cyclinder cyclinder force-pushed the rework_spidercoordinator_informer branch from 8238f05 to 54a92bc Compare February 28, 2024 11:06
@cyclinder cyclinder added cherrypick-release-v0.8 Cherry-pick the PR to branch release-v0.8. cherrypick-release-v0.9 labels Feb 29, 2024
@@ -60,11 +60,14 @@ type CoordinatorStatus struct {
// +kubebuilder:validation:Requred
Phase string `json:"phase"`

// +kubebuilder: validation:Optional
Reason string `json:"reason,omitempty"`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*string

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里我的想法: 一个 omitempty 足够满足需求,这里不需要考虑默认值和赋空值的区别,这里的需求是reason 为空,不显示,不为空就显示,比较简单。如果为指针类型,还需要处理一些 nil 的情况,没有必要

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cyclinder cyclinder merged commit 3556e10 into spidernet-io:main Feb 29, 2024
49 checks passed
cyclinder added a commit to cyclinder/spiderpool that referenced this pull request Mar 1, 2024
…dinator_informer

Rework spidercoordinator informer to update pod and service cidr
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cherrypick-release-v0.8 Cherry-pick the PR to branch release-v0.8. cherrypick-release-v0.9 release/bug release/feature-new release note for new feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

spiderpool-controller goroutines leak Sync to serviceCIDR failed and status was not set to not-ready
4 participants