Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: delete the cluster object when give up the leadership #774

Merged
merged 6 commits into from
Dec 24, 2021

Conversation

tokers
Copy link
Contributor

@tokers tokers commented Nov 26, 2021

Please answer these questions before submitting a pull request


In the previous implementation, the cluster object won't be removed if
the controller gives up the leadership, which causes some error state is
cached even if the next leader takes office, this hurts the running.

In this pull request, the cluster is deleted once the leadership is
given up, and the retry-logic will be aborted (for the cluster added in
last term) if the context is canceld.

Signed-off-by: Chao Zhang tokers@apache.org

In the previous implementation, the cluster object won't be removed if
the controller gives up the leadership, which causes some error state is
cached even if the next leader takes office, this hurts the running.

In this pull request, the cluster is deleted once the leadership is
given up, and the retry-logic will be aborted (for the cluster added in
last term) if the context is canceld.

Signed-off-by: Chao Zhang <tokers@apache.org>
@codecov-commenter
Copy link

codecov-commenter commented Nov 26, 2021

Codecov Report

Merging #774 (cbcda7f) into master (f470867) will increase coverage by 0.27%.
The diff coverage is 23.80%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #774      +/-   ##
==========================================
+ Coverage   32.28%   32.56%   +0.27%     
==========================================
  Files          66       69       +3     
  Lines        6808     7285     +477     
==========================================
+ Hits         2198     2372     +174     
- Misses       4353     4643     +290     
- Partials      257      270      +13     
Impacted Files Coverage Δ
pkg/apisix/apisix.go 61.36% <0.00%> (-6.14%) ⬇️
pkg/ingress/controller.go 0.97% <0.00%> (-0.04%) ⬇️
test/e2e/e2e.go 100.00% <ø> (ø)
pkg/apisix/cluster.go 33.64% <71.42%> (+3.62%) ⬆️
pkg/apisix/nonexistentclient.go 38.81% <0.00%> (-2.98%) ⬇️
pkg/apisix/schema.go 62.26% <0.00%> (-2.45%) ⬇️
pkg/api/router/router.go 72.72% <0.00%> (-2.28%) ⬇️
pkg/ingress/ingress.go 6.89% <0.00%> (-1.22%) ⬇️
pkg/ingress/status.go 0.00% <0.00%> (ø)
... and 19 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f470867...cbcda7f. Read the comment docs.

@tao12345666333 tao12345666333 added bug Something isn't working triage/accepted Indicates an issue or PR is ready to be actively worked on. labels Nov 26, 2021
@tao12345666333
Copy link
Member

CI failed.

@tokers
Copy link
Contributor Author

tokers commented Nov 26, 2021

CI failed.

The go version on my local is 1.17 and the build directive is different.

Signed-off-by: Chao Zhang <tokers@apache.org>
@tokers
Copy link
Contributor Author

tokers commented Nov 26, 2021

CI failed.

The go version on my local is 1.17 and the build directive is different.

I use the go/1.16.10 to re-generate some files.

tools.go Outdated
@@ -1,3 +1,4 @@
//go:build tools
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is new in 1.17, I want to delete it before we fully switch to v1.17

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You haven't pushed the latest code yet?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping @tokers

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tokers any update?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated it but I don't have time to validate the e2e case until this weekend.

@gxthrj
Copy link
Contributor

gxthrj commented Nov 29, 2021

better to add e2e for this feature.

@tokers
Copy link
Contributor Author

tokers commented Nov 30, 2021

better to add e2e for this feature.

First of all, this is not a feature, indeed it's a bugfix, also, it's tough to add cases for it.

@tao12345666333
Copy link
Member

Maybe we can introduce chaos test later.

If you want to add tests here,
We can first set the number of replicas of apisix to 0 to ensure that all ep are cleaned up.

We may also need to rely on #770 to confirm that the current cluster is not available

@gxthrj
Copy link
Contributor

gxthrj commented Nov 30, 2021

better to add e2e for this feature.

First of all, this is not a feature, indeed it's a bugfix, also, it's tough to add cases for it.

I think the bugfix also need e2e test , You can restart the APISIX pod during the test, and then check the behavior of the Ingress controller.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

license-eye has totally checked 473 files.

Valid Invalid Ignored Fixed
249 1 223 0
Click to see the invalid file list
  • test/e2e/chaos/chaos.go

test/e2e/chaos/chaos.go Show resolved Hide resolved
tools.go Outdated Show resolved Hide resolved
Signed-off-by: Jintao Zhang <zhangjintao9020@gmail.com>
@tao12345666333 tao12345666333 added this to In progress in v1.4 Planning via automation Dec 24, 2021
@tao12345666333 tao12345666333 added this to the 1.4.0 milestone Dec 24, 2021
Copy link
Member

@tao12345666333 tao12345666333 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

v1.4 Planning automation moved this from In progress to Reviewer approved Dec 24, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

bug: Unable to reconnect to apisix, when all ep are deleted under svc of apisix
5 participants