fix(redis): attach client error handler so a dropped connection doesn't crash the process #999
Open
JohnMcLear wants to merge 1 commit into
…'t crash the process

node-redis re-emits connection/socket errors (idle drop, server restart, failover, a proxy closing the connection) on the client as an EventEmitter 'error'. With no listener Node treats it as uncaught and terminates the host process, and node-redis also won't begin reconnecting. The redis driver never attached one, so any of those events crashed Etherpad. This is the redis analogue of the postgres bug in ether/etherpad#7878; the official node-redis docs require this listener.

- Attach `client.on('error', …)` before connect() (so connect-time failures are covered too): log the error and let node-redis reconnect.
- Add test/redis/connection-drop.spec.ts: warm the client, kill its connection server-side with CLIENT KILL (what a failover / proxy idle timeout does), and assert the handler caught it and the client recovered. Verified locally that this FAILS without the handler (the client cannot reconnect and the recovery query hangs to a timeout) and PASSES with it.

redis is already in the container test matrix, so this runs in CI under `pnpm run test redis`.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
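The crash mechanism described in the commit message can be reproduced with a plain Node `EventEmitter`, without redis. The following is a minimal sketch, not code from this PR: `errorWasHandled`, `bare`, `guarded`, and the error text are illustrative names. It relies only on the documented Node behaviour that emitting `'error'` with no listener throws the error.

```typescript
import { EventEmitter } from "node:events";

// Emits 'error' on the emitter and reports whether a listener absorbed it.
// With no 'error' listener, emit() throws the error instead of returning.
function errorWasHandled(emitter: EventEmitter, err: Error): boolean {
  try {
    emitter.emit("error", err);
    return true;
  } catch {
    return false;
  }
}

const bare = new EventEmitter(); // no listener, like the driver before the fix
const guarded = new EventEmitter(); // listener attached, like the driver after the fix
const seen: Error[] = [];
guarded.on("error", (err: Error) => {
  seen.push(err); // stand-in for "log the error"
});

const drop = new Error("simulated connection drop");
console.log(errorWasHandled(bare, drop)); // false
console.log(errorWasHandled(guarded, drop)); // true
```

In a real client the emit happens inside a socket callback, so there is no surrounding `try`/`catch` and the throw surfaces as an uncaught exception.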
Review Summary by Qodo (Agentic_describe updated until commit e8d9ce7)
Fix redis client error handling to prevent process crashes
Walkthroughs
Description
• Attach error handler to redis client before connect to prevent process crash
• Log connection errors and allow transparent reconnection on socket failures
• Add integration test verifying recovery from server-side connection kills
Diagram
flowchart LR
A["Redis Connection Error"] -->|"Without Handler"| B["Process Crash"]
A -->|"With Handler"| C["Error Logged"]
C --> D["Client Reconnects"]
D --> E["Service Continues"]
File Changes
1. databases/redis_db.ts
Code Review by Qodo
1. Flaky reconnect wait
Code Review by Qodo
1. Admin client cleanup missing
Problem
The redis analogue of ether/etherpad#7878 (postgres, PR #998). node-redis re-emits connection/socket errors (an idle drop, server restart, failover, or a proxy/LB/firewall closing the connection) on the client as an EventEmitter `'error'`. With no listener, Node treats it as an uncaught exception and terminates the host process, and node-redis additionally won't start reconnecting. The official node-redis error-handling docs require attaching this listener; `redis_db.ts` never did, so any of those events crashed Etherpad.

Fix
Attach `client.on('error', …)` in `init()` before `connect()` (so connect-time failures are covered too): log the error and let node-redis transparently reconnect.

Test (real, not a wiring assertion)
`test/redis/connection-drop.spec.ts` starts a redis container, warms the client, then kills its connection server-side with `CLIENT KILL TYPE normal` (exactly what a failover or a proxy idle-timeout does) and asserts that the handler caught the error and the client recovered.
Verified locally that this fails without the handler (node-redis can't reconnect, so the recovery query hangs to a timeout) and passes with it. redis is already in the container test matrix, so this runs in CI under `pnpm run test redis` (the new spec uses a dynamic mapped port, so it doesn't collide with the existing fixed-6379 redis spec).

Verification
- `pnpm run lint` / `format:check` / `ts-check` / `build`: clean
- `vitest run test/redis/connection-drop.spec.ts`: passes with the fix, fails (timeout) without

Context
Part of a per-driver sweep prompted by #998. Each affected backend gets its own PR (no combined PR). postgres = #998; mysql to follow; mongodb/cassandra/elasticsearch/couch and the local stores are not affected; rethink/surrealdb still under review.
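The ordering in the Fix section (listener first, then `connect()`) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `redis_db.ts`: `ConnectableClient`, `initWithErrorHandler`, and `FakeClient` are hypothetical names, and the fake client stands in for a real one by emitting `'error'` during `connect()`.

```typescript
import { EventEmitter } from "node:events";

// Hypothetical minimal client shape: only the two members this sketch uses.
interface ConnectableClient {
  on(event: "error", listener: (err: Error) => void): unknown;
  connect(): Promise<unknown>;
}

// Attach the 'error' listener before connecting, so an error emitted
// during connect() is logged instead of being thrown by emit().
async function initWithErrorHandler(
  client: ConnectableClient,
  logError: (err: Error) => void,
): Promise<void> {
  client.on("error", (err) => logError(err));
  await client.connect();
}

// Stand-in for a real client (assumption for illustration only):
// it emits 'error' while connecting, as a connect-time failure would.
class FakeClient extends EventEmitter {
  async connect(): Promise<void> {
    this.emit("error", new Error("connect failed (simulated)"));
  }
}

const logged: string[] = [];
const ready = initWithErrorHandler(new FakeClient(), (err) => {
  logged.push(err.message);
});
ready.then(() => console.log(logged));
```

If the two statements in `initWithErrorHandler` were swapped, the fake client's `emit` would throw inside `connect()` and the returned promise would reject, which is the connect-time case the PR description says the ordering is meant to cover.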
🤖 Generated with Claude Code