fix: ensure the xds grpc server is properly stopped #1860

Merged
merged 4 commits into from
Sep 12, 2023

Contributor

@shawnh2 shawnh2 commented Sep 1, 2023

What type of PR is this?

fix: the two xds grpc servers in envoy-gateway were not being properly stopped.

logs (before fix)
INFO	gateway-api	runner/runner.go:146	shutting down	{"runner": "gateway-api"}
INFO	xds-translator	runner/runner.go:102	subscriber shutting down	{"runner": "xds-translator"}
INFO	xds-server	runner/runner.go:157	subscriber shutting down	{"runner": "xds-server"}
INFO	infrastructure	runner/runner.go:75	infra subscriber shutting down	{"runner": "infrastructure"}
INFO	provider	kubernetes/controller.go:1037	grpcRoute status subscriber shutting down	{"runner": "provider"}
INFO	provider	kubernetes/controller.go:1009	httpRoute status subscriber shutting down	{"runner": "provider"}
INFO	cmd/server.go:220	shutting down
INFO	provider	kubernetes/controller.go:1093	tcpRoute status subscriber shutting down	{"runner": "provider"}
INFO	provider	kubernetes/controller.go:1121	udpRoute status subscriber shutting down	{"runner": "provider"}
INFO	provider	kubernetes/controller.go:1065	tlsRoute status subscriber shutting down	{"runner": "provider"}

# no log line indicates that the grpc server has been stopped.
logs (after fix)
INFO	gateway-api	runner/runner.go:146	shutting down	{"runner": "gateway-api"}
INFO	xds-translator	runner/runner.go:102	subscriber shutting down	{"runner": "xds-translator"}
INFO	xds-server	runner/runner.go:160	subscriber shutting down	{"runner": "xds-server"}
INFO	infrastructure	runner/runner.go:75	infra subscriber shutting down	{"runner": "infrastructure"}
INFO	cmd/server.go:220	shutting down
INFO	provider	kubernetes/controller.go:1093	tcpRoute status subscriber shutting down	{"runner": "provider"}
INFO	provider	kubernetes/controller.go:1121	udpRoute status subscriber shutting down	{"runner": "provider"}
INFO	provider	kubernetes/controller.go:1065	tlsRoute status subscriber shutting down	{"runner": "provider"}

# the grpc server stopped
INFO	xds-server	runner/runner.go:108	grpc server shutting down	{"runner": "xds-server"}  
# the ratelimit grpc server stopped if ratelimit has been enabled
INFO	global-ratelimit	runner/runner.go:103	grpc server shutting down	{"runner": "global-ratelimit"}

What this PR does / why we need it:

grpc.Serve() blocks until the server is stopped, so the <-ctx.Done() that follows it is never reached, even after ctx is canceled. As a result, nothing ever stops the grpc server.

err = r.grpc.Serve(l) // blocks until the grpc server is stopped
if err != nil {
	r.Logger.Error(err, "failed to start grpc based xds server")
}
<-ctx.Done() // not reached while Serve is still blocking

The fix is to keep grpc.Serve() from blocking the <-ctx.Done() wait, so that the grpc server is properly stopped once ctx is canceled.

Which issue(s) this PR fixes:

None

Signed-off-by: sh2 <shawnhxh@outlook.com>
@shawnh2 shawnh2 requested a review from a team as a code owner September 1, 2023 07:54
Signed-off-by: sh2 <shawnhxh@outlook.com>

codecov bot commented Sep 1, 2023

Codecov Report

Merging #1860 (7649614) into main (1fab508) will decrease coverage by 0.06%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##             main    #1860      +/-   ##
==========================================
- Coverage   65.35%   65.29%   -0.06%     
==========================================
  Files          86       86              
  Lines       12526    12527       +1     
==========================================
- Hits         8186     8180       -6     
- Misses       3823     3830       +7     
  Partials      517      517              
Files Changed Coverage Δ
internal/xds/server/runner/runner.go 28.70% <0.00%> (-0.27%) ⬇️

... and 4 files with indirect coverage changes

Contributor

@arkodg arkodg left a comment

LGTM thanks !

@zirain zirain merged commit 98cb3a8 into envoyproxy:main Sep 12, 2023
17 of 18 checks passed
@shawnh2 shawnh2 deleted the fix-grpc-close branch September 12, 2023 02:15