
Hot fix/grpc resourceexhausted #374

Merged

bootjp merged 1 commit into codex/primary-blocking-bzpopmin from hot-fix/grpc-resourceexhausted on Mar 20, 2026

Conversation

@bootjp (Owner) commented Mar 20, 2026

No description provided.

@gemini-code-assist (Contributor)

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses potential gRPC ResourceExhausted errors by centralizing gRPC configuration and standardizing message size limits. It also enhances the Redis proxy's handling of blocking commands by introducing dynamic timeout management and improves the dual-write mechanism's robustness for Redis Lua scripts by implementing a script caching and fallback strategy.

Highlights

  • gRPC Configuration Centralization: Centralized gRPC server and dial options into a new internal/grpc.go file, including MaxRecvMsgSize and MaxSendMsgSize to prevent ResourceExhausted errors. This change removes redundant grpc.WithTransportCredentials(insecure.NewCredentials()) and grpc.NewServer() calls across multiple files.
  • Redis Blocking Command Handling: Introduced logic to dynamically calculate and apply appropriate timeouts for Redis blocking commands (e.g., BLPOP, XREAD) by adding a DoWithTimeout method to the Backend interface and implementing it in RedisBackend.
  • Redis Script Caching for Dual-Write: Implemented a mechanism to cache Redis Lua scripts executed via EVAL commands. This allows the dual-writer to fall back to EVAL with the full script if a NOSCRIPT error is encountered when attempting EVALSHA on the secondary backend, improving robustness for dual-write scenarios.
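The EVALSHA-to-EVAL fallback described in the last highlight can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the `Backend` interface shape, the `fakeSecondary` stand-in, and the error-string check are assumptions, and the real dual-writer would work against its actual Redis client.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"strings"
)

// Backend is a stand-in for the proxy's Redis backend (assumed shape).
type Backend interface {
	Do(args ...string) (string, error)
}

// ScriptCache remembers the body of every script seen via EVAL so that a
// later EVALSHA on the secondary backend can fall back to EVAL on NOSCRIPT.
type ScriptCache struct {
	scripts map[string]string // sha1 (hex) -> script body
}

func NewScriptCache() *ScriptCache {
	return &ScriptCache{scripts: make(map[string]string)}
}

// Remember stores a script under its SHA1 digest and returns the digest,
// mirroring how Redis itself addresses scripts for EVALSHA.
func (c *ScriptCache) Remember(script string) string {
	sum := sha1.Sum([]byte(script))
	sha := hex.EncodeToString(sum[:])
	c.scripts[sha] = script
	return sha
}

// EvalSha tries EVALSHA first; on a NOSCRIPT error it replays the cached
// script body with EVAL, as the PR summary describes for the dual-writer.
func (c *ScriptCache) EvalSha(b Backend, sha string, keysAndArgs ...string) (string, error) {
	res, err := b.Do(append([]string{"EVALSHA", sha}, keysAndArgs...)...)
	if err == nil || !strings.Contains(err.Error(), "NOSCRIPT") {
		return res, err
	}
	script, ok := c.scripts[sha]
	if !ok {
		return "", err // nothing cached; surface the original NOSCRIPT error
	}
	return b.Do(append([]string{"EVAL", script}, keysAndArgs...)...)
}

// fakeSecondary simulates a secondary backend that never loaded the script.
type fakeSecondary struct{}

func (fakeSecondary) Do(args ...string) (string, error) {
	if args[0] == "EVALSHA" {
		return "", fmt.Errorf("NOSCRIPT No matching script")
	}
	return "ran: " + args[1], nil // for EVAL, args[1] is the script body
}

func main() {
	cache := NewScriptCache()
	sha := cache.Remember("return 1")
	out, err := cache.EvalSha(fakeSecondary{}, sha)
	fmt.Println(out, err) // EVALSHA fails with NOSCRIPT, EVAL fallback succeeds
}
```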
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

  • Code Review (/gemini review): Performs a code review for the current pull request in its current state.
  • Pull Request Summary (/gemini summary): Provides a summary of the current pull request in its current state.
  • Comment (@gemini-code-assist): Responds in comments when explicitly tagged, both in pull request comments and review comments.
  • Help (/gemini help): Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder at the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double-check it and use code with caution.

@bootjp bootjp changed the base branch from main to codex/primary-blocking-bzpopmin March 20, 2026 11:49
@bootjp bootjp requested a review from Copilot March 20, 2026 11:49
@gemini-code-assist (bot) left a comment


Code Review

This pull request introduces two main sets of changes. First, it addresses potential grpc.ResourceExhausted errors by centralizing gRPC client and server options into a new internal package and increasing the maximum message size. This is a good practice for maintainability. Second, it enhances the Redis proxy functionality by adding support for custom timeouts on blocking commands, forcing RESP2 protocol for backend connections, and implementing a script cache to handle NOSCRIPT errors for EVALSHA commands by falling back to EVAL. These are valuable improvements for the proxy's robustness.

However, I've found a critical issue in kv/grpc_conn_cache.go where the new gRPC dial options are being incorrectly applied, which would cause the message size limits to be ignored for cached connections. I've left a detailed comment with a suggested fix that also aligns with the practice of centralizing common logic into shared helper functions.

Comment on lines +78 to 81:

conn, err := grpc.NewClient(
	string(addr),
	append(internalutil.GRPCDialOptions(), grpc.WithDefaultCallOptions(grpc.WaitForReady(true)))...,
)

critical

The use of append here will cause the grpc.WithDefaultCallOptions from internalutil.GRPCDialOptions() to be overridden by the new grpc.WithDefaultCallOptions(grpc.WaitForReady(true)). This means the MaxCallRecvMsgSize and MaxCallSendMsgSize settings will be lost for connections created via GRPCConnCache, which undermines the main goal of this pull request for this part of the code.

To fix this, I recommend adding a new helper function in internal/grpc.go that includes the WaitForReady(true) option alongside the message size options. This approach centralizes common complex logic into a shared helper function, improving maintainability and avoiding code duplication, as per best practices.

You could add this to internal/grpc.go:

// GRPCDialOptionsWithWaitForReady returns the common insecure dial options and
// enables waiting for connections to be ready.
func GRPCDialOptionsWithWaitForReady() []grpc.DialOption {
	return []grpc.DialOption{
		grpc.WithTransportCredentials(insecure.NewCredentials()),
		grpc.WithDefaultCallOptions(
			grpc.MaxCallRecvMsgSize(GRPCMaxMessageBytes),
			grpc.MaxCallSendMsgSize(GRPCMaxMessageBytes),
			grpc.WaitForReady(true),
		),
	}
}

Then, you can use this new function here.

Suggested change

-conn, err := grpc.NewClient(
-	string(addr),
-	append(internalutil.GRPCDialOptions(), grpc.WithDefaultCallOptions(grpc.WaitForReady(true)))...,
-)
+conn, err := grpc.NewClient(
+	string(addr),
+	internalutil.GRPCDialOptionsWithWaitForReady()...,
+)
References
  1. Centralize common complex logic, such as retry/backoff mechanisms, into shared helper functions to avoid code duplication and improve maintainability.

Copilot AI left a comment


Pull request overview

This PR addresses gRPC ResourceExhausted failures by standardizing larger gRPC message-size limits across the server and client dial paths used for Raft replication and the KV gRPC APIs.

Changes:

  • Add shared gRPC server/dial option helpers (GRPCServerOptions, GRPCDialOptions) with a 64MiB message budget.
  • Apply the shared options to Raft gRPC servers and the raft-grpc-transport dial configuration.
  • Update the KV gRPC connection cache and demo/test utilities to use the shared dial/server options.
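For context, the 64MiB budget listed above translates into option values like the following. The constant name GRPCMaxMessageBytes comes from the review comment earlier in this thread; the grpc-go option calls are shown only as comments so this sketch stays dependency-free, and the wiring is an assumption based on the summary.

```go
package main

import "fmt"

// GRPCMaxMessageBytes is the shared message budget: 64 MiB, comfortably above
// grpc-go's 4 MiB default receive limit whose overflow surfaces as
// ResourceExhausted.
const GRPCMaxMessageBytes = 64 << 20 // 67108864 bytes

func main() {
	// In internal/grpc.go these would presumably feed the real options, e.g.:
	//   grpc.MaxRecvMsgSize(GRPCMaxMessageBytes)     // server side
	//   grpc.MaxSendMsgSize(GRPCMaxMessageBytes)     // server side
	//   grpc.MaxCallRecvMsgSize(GRPCMaxMessageBytes) // client call option
	//   grpc.MaxCallSendMsgSize(GRPCMaxMessageBytes) // client call option
	fmt.Println(GRPCMaxMessageBytes) // 67108864
}
```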

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.

Show a summary per file

  • multiraft_runtime.go: Uses shared dial options for raft-grpc-transport connections.
  • main.go: Creates gRPC servers with shared max send/recv message limits.
  • kv/grpc_conn_cache.go: Reuses shared dial options (plus existing WaitForReady) for cached client conns.
  • internal/grpc.go: Introduces centralized gRPC max message sizing for both server and client.
  • cmd/server/demo.go: Applies shared options to demo server and leader-dial logic.
  • adapter/test_util.go: Applies shared options to test node gRPC servers and raft transport dialing.

@bootjp bootjp merged commit d12b3b6 into codex/primary-blocking-bzpopmin Mar 20, 2026
12 checks passed
@bootjp bootjp deleted the hot-fix/grpc-resourceexhausted branch March 20, 2026 11:52