Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add XDS keepalives #1747

Merged
merged 3 commits into from
Aug 9, 2023
Merged

Add XDS keepalives #1747

merged 3 commits into from
Aug 9, 2023

Conversation

jackkleeman
Copy link
Contributor

Fixes #1746

@jackkleeman jackkleeman requested a review from a team as a code owner August 2, 2023 11:27
@@ -125,6 +125,11 @@ static_resources:
http2_protocol_options: {}
name: xds_cluster
type: STRICT_DNS
upstream_connection_options:
tcp_keepalive:
keepalive_probes: 3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where are these number came from?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used the same ones as Contour, but we can use anything really

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's OK now, just want to clearify.

Copy link
Contributor

@arkodg arkodg Aug 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we control the server (xds server https://github.com/envoyproxy/gateway/blob/72bfe9bde95015b6731857f52ea30d27ee85145d/internal/xds/server/runner/runner.go#L94C12-L94C22 & https://pkg.go.dev/net#ListenConfig) and client (this envoy cluster), can we use higher values to reduce cpu consumption ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how does controlling the server help? the goal is for the client detect half closed situation fairly quickly. i think the server will detect this anyway whenever it sends an update. i dont feel strongly about what values we use though, if you have a preference i am all ears!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome thanks for doing this in the client/envoy, this will also need to be done in the peer/server

grpc.KeepaliveParams

https://pkg.go.dev/google.golang.org/grpc/keepalive#ServerParameters

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for server detecting client going away, right? I feel its a separate discussion, I suppose the goal would be saving some resources on server side, but it won't matter for clients, right?

Copy link
Contributor

@arkodg arkodg Aug 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for grpc keep alive the server needs to be set up with similar grpc keep alive settings, else the client might receive GO_AWAY ENHANCE_YOUR_CALM
https://grpc.io/docs/guides/keepalive/#background
I think its fine to set it to 30s and 5s timeout, these can be revisited later

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done :)

@codecov
Copy link

codecov bot commented Aug 2, 2023

Codecov Report

Merging #1747 (f5fd065) into main (8019ae4) will increase coverage by 0.04%.
Report is 1 commits behind head on main.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##             main    #1747      +/-   ##
==========================================
+ Coverage   64.88%   64.92%   +0.04%     
==========================================
  Files          84       84              
  Lines       12192    12195       +3     
==========================================
+ Hits         7911     7918       +7     
+ Misses       3774     3771       -3     
+ Partials      507      506       -1     
Files Changed Coverage Δ
internal/xds/server/runner/runner.go 28.97% <0.00%> (-0.84%) ⬇️

... and 2 files with indirect coverage changes

@arkodg arkodg added this to the 0.6.0-rc1 milestone Aug 2, 2023
@arkodg
Copy link
Contributor

arkodg commented Aug 3, 2023

hey @jackkleeman can you also force push with a signed commit since DCO is failing

@jackkleeman jackkleeman force-pushed the issue-1746 branch 2 times, most recently from 6743c7f to 3ed5553 Compare August 4, 2023 09:13
@jackkleeman
Copy link
Contributor Author

done @arkodg

Fixes envoyproxy#1746

Signed-off-by: Jack Kleeman <jackkleeman@gmail.com>
Signed-off-by: Jack Kleeman <jackkleeman@gmail.com>
zirain
zirain previously approved these changes Aug 9, 2023
Copy link
Contributor

@zirain zirain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks!

Signed-off-by: Jack Kleeman <jackkleeman@gmail.com>
Copy link
Contributor

@arkodg arkodg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

@jackkleeman
Copy link
Contributor Author

I can't merge - please go ahead when happy to

@arkodg arkodg merged commit 6b8268f into envoyproxy:main Aug 9, 2023
17 of 18 checks passed
@jackkleeman jackkleeman deleted the issue-1746 branch August 9, 2023 17:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

XDS connections need TCP keepalives
3 participants