feat: enable tcp keepalive for http server #4019

Merged: 3 commits into GreptimeTeam:main from chore/ent-sync, May 27, 2024

MichaelScofield
Collaborator

I hereby agree to the terms of the GreptimeDB CLA.

Refer to a related PR or issue link (optional)

What's changed and what's your intention?

In our customer's environment, we've observed a lot of "dangling" HTTP connections. They are all in the "established" state, even though the client-side pods were destroyed long ago. These dangling connections are harmful: they somehow consumed a lot of memory.

As for the origin of the dangling connections, my guess is that the client-side pods were not shut down gracefully, or that the network was bad. Both are common situations in cloud environments, so I decided to add the keepalive option to the HTTP server: once a connection has been idle for 1 hour, the server probes it and actively closes it if the peer no longer responds.

I'm a little hesitant to expose the keepalive option in the config file. For HTTP connections, unless they are HTTP/2, rebuilding them is very common. Any decent HTTP connection pool can do that (they have keepalive options of their own, too). So, for the sake of simplicity, the keepalive option is hardcoded and set to one hour, which gives connections plenty of time to show they are alive.

After adding the keepalive option, our customer says the memory issue is gone.
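
For readers unfamiliar with the mechanism, here is a minimal sketch of what enabling TCP keepalive on a socket amounts to at the OS level. It uses Python's standard `socket` module purely for illustration and is not the Rust code in this PR. The helper name `enable_tcp_keepalive` and the `interval_secs` and `probes` defaults are assumptions; only the one-hour idle time comes from the description above.

```python
import socket


def enable_tcp_keepalive(sock, idle_secs=3600, interval_secs=60, probes=5):
    """Enable TCP keepalive on `sock` (hypothetical helper, not from the PR).

    idle_secs mirrors the one-hour idle time in the description;
    interval_secs and probes are illustrative values only.
    """
    # Ask the kernel to probe the peer when the connection goes quiet.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # The tuning knobs are platform-specific (all three exist on Linux),
    # so each one is applied only if this platform exposes it.
    tuning = (
        ("TCP_KEEPIDLE", idle_secs),       # idle time before the first probe
        ("TCP_KEEPINTVL", interval_secs),  # gap between unanswered probes
        ("TCP_KEEPCNT", probes),           # unanswered probes before giving up
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock


if __name__ == "__main__":
    # Usage: apply the helper to a connection accepted by a local listener.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    conn, _ = listener.accept()
    enable_tcp_keepalive(conn)
    print(conn.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    for s in (conn, client, listener):
        s.close()
```

With these options set, a peer that vanished without closing the connection stops answering probes, and the kernel eventually reports the connection as dead so the server can release it.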

This PR also modifies some code so that GreptimeDB can be integrated into other projects.

Checklist

  • I have written the necessary rustdoc comments.
  • I have added the necessary unit tests and integration tests.
  • This PR requires documentation updates.

@MichaelScofield MichaelScofield requested a review from a team as a code owner May 23, 2024 06:41
@github-actions github-actions bot added the docs-not-required This change does not impact docs. label May 23, 2024
Member

@sunng87 sunng87 left a comment

LGTM. I thought this was not required for a modern network stack.

And do we need the same configuration for gRPC, MySQL and Postgres?

tests-integration/src/test_util.rs (review thread: outdated, resolved)
@MichaelScofield
Collaborator Author

> LGTM. I thought this was not required for a modern network stack.
>
> And do we need the same configuration for gRPC, MySQL and Postgres?

I thought about that. We can enable keepalive for the HTTP server like this because the protocol used here (HTTP/1.1) is simple enough. The other three might have their own specific mechanisms for this purpose. For example, gRPC has a "ping"-based keepalive built into its protocol layer (not the TCP layer), and JDBC seems to have its own keepalive parameter as well. So I don't want to simply enable the underlying TCP keepalive for them blindly.
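
To illustrate the distinction drawn above between a protocol-layer "ping" keepalive and TCP keepalive, here is a rough sketch of the bookkeeping an application-level ping scheme needs. It is not gRPC's implementation. The class `PingKeepalive`, its fields, and the action names are all hypothetical, and timestamps are passed in explicitly so the logic stays deterministic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PingKeepalive:
    """Hypothetical application-level keepalive tracker (illustration only)."""

    ping_after: float    # seconds of silence before a ping is sent
    ping_timeout: float  # seconds to wait for any reply after a ping
    last_seen: float = 0.0
    ping_sent_at: Optional[float] = None

    def on_traffic(self, now: float) -> None:
        # Any inbound message, including a ping reply, proves the peer is alive.
        self.last_seen = now
        self.ping_sent_at = None

    def poll(self, now: float) -> str:
        # Returns the action the caller should take at time `now`.
        if self.ping_sent_at is not None:
            if now - self.ping_sent_at >= self.ping_timeout:
                return "close"      # ping went unanswered for too long
            return "wait"           # ping in flight, still within the timeout
        if now - self.last_seen >= self.ping_after:
            self.ping_sent_at = now
            return "send_ping"      # connection has been silent long enough
        return "ok"


if __name__ == "__main__":
    tracker = PingKeepalive(ping_after=10, ping_timeout=5)
    tracker.on_traffic(0)
    for t in (5, 10, 12, 15):
        print(t, tracker.poll(t))
```

Because the ping travels inside the protocol itself, both ends and any protocol-aware proxies in between see it, whereas TCP keepalive probes are handled by the kernel and never reach the application. That is one reason a protocol with its own mechanism may not need the TCP-level option.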

codecov bot commented May 27, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 85.13%. Comparing base (418090b) to head (482287d).
Report is 13 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4019      +/-   ##
==========================================
- Coverage   85.44%   85.13%   -0.31%     
==========================================
  Files         980      983       +3     
  Lines      170112   170512     +400     
==========================================
- Hits       145352   145166     -186     
- Misses      24760    25346     +586     

@MichaelScofield MichaelScofield added this pull request to the merge queue May 27, 2024
Merged via the queue into GreptimeTeam:main with commit 2971052 May 27, 2024
27 of 28 checks passed
@MichaelScofield MichaelScofield deleted the chore/ent-sync branch May 27, 2024 04:23