rpc: increase gRPC server keepalive interval to 2x PingInterval
This patch increases the gRPC server keepalive interval from 1x to 2x
PingInterval. These keepalive pings are used both to keep the connection
alive, and to detect and close failed connections from the server side.
Keepalive pings are only sent and checked if there is no other activity
on the connection.

We set this to 2x PingInterval, since we expect RPC heartbeats to be
sent regularly (obviating the need for the keepalive ping), and there's
no point sending keepalive pings until we've waited long enough for the
RPC heartbeat to show up.

An environment variable COCKROACH_RPC_SERVER_KEEPALIVE_INTERVAL is also
added to tune this.

Epic: none
Release note: None
erikgrinaker committed Aug 30, 2023
1 parent 57b8230 commit 2e6eb2c
Showing 2 changed files with 17 additions and 5 deletions.
8 changes: 4 additions & 4 deletions pkg/base/config.go
@@ -148,10 +148,10 @@ var (
// 2 * NetworkTimeout is sufficient.
DialTimeout = envutil.EnvOrDefaultDuration("COCKROACH_RPC_DIAL_TIMEOUT", 2*NetworkTimeout)

-// PingInterval is the interval between network heartbeat pings. It is used
-// both for RPC heartbeat intervals and gRPC server keepalive pings. It is
-// set to 1 second in order to fail fast, but with large default timeouts
-// to tolerate high-latency multiregion clusters.
+// PingInterval is the interval between RPC heartbeat pings. It is set to 1
+// second in order to fail fast, but with large default timeouts to tolerate
+// high-latency multiregion clusters. The gRPC server keepalive interval is
+// also affected by this.
PingInterval = envutil.EnvOrDefaultDuration("COCKROACH_PING_INTERVAL", time.Second)

// defaultRangeLeaseDuration specifies the default range lease duration.
14 changes: 13 additions & 1 deletion pkg/rpc/keepalive.go
@@ -40,6 +40,18 @@ import (
var serverTimeout = envutil.EnvOrDefaultDuration(
"COCKROACH_RPC_SERVER_TIMEOUT", 2*base.NetworkTimeout)

+// serverKeepaliveInterval is the interval between server keepalive pings.
+// These are used both to keep the connection alive, and to detect and close
+// failed connections from the server-side. Keepalive pings are only sent and
+// checked if there is no other activity on the connection.
+//
+// We set this to 2x PingInterval, since we expect RPC heartbeats to be sent
+// regularly (obviating the need for the keepalive ping), and there's no point
+// sending keepalive pings until we've waited long enough for the RPC heartbeat
+// to show up.
+var serverKeepaliveInterval = envutil.EnvOrDefaultDuration(
+"COCKROACH_RPC_SERVER_KEEPALIVE_INTERVAL", 2*base.PingInterval)
+
// 10 seconds is the minimum keepalive interval permitted by gRPC.
// Setting it to a value lower than this will lead to gRPC adjusting to this
// value and annoyingly logging "Adjusting keepalive ping interval to minimum
@@ -61,7 +73,7 @@ var clientKeepalive = keepalive.ClientParameters{
}
var serverKeepalive = keepalive.ServerParameters{
// Send periodic pings on the connection when there is no other traffic.
-Time: base.PingInterval,
+Time: serverKeepaliveInterval,
// Close the connection if either a keepalive ping doesn't receive a response
// within the timeout, or a TCP send doesn't receive a TCP ack within the
// timeout (enforced by the OS via TCP_USER_TIMEOUT).
