
Auto reconnect doesn't work under certains circumstances #1007

Closed
cschockaert opened this issue Nov 20, 2019 · 9 comments · Fixed by #1139


cschockaert commented Nov 20, 2019

Hello,

When the ready check is enabled and a MaxRetriesPerRequestError occurs, recoverFromFatalError is called from within the ready check, which in turn calls disconnect(true) and aborts the reconnection process.

This leaves us in a situation where the express session backed by redisStore no longer works.

A quick and dirty fix for us is to disable the ready check or set maxRetriesPerRequest to -1.
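The two workarounds can be sketched as option objects passed to `new Redis(options)`. The option names (`enableReadyCheck`, `maxRetriesPerRequest`) are from the ioredis standalone API; this is only an illustration of the settings described above, not a fix for the underlying bug:

```javascript
// Quick-and-dirty workaround sketch as ioredis option objects.
// Either setting avoids the MaxRetriesPerRequestError path that ends up
// calling recoverFromFatalError and aborting reconnection.

// 1) Skip the ready check entirely, so its failure can never trigger
//    recoverFromFatalError.
const withoutReadyCheck = {
  enableReadyCheck: false,
};

// 2) Never give up retrying a pending command, so the command queue is
//    never flushed with MaxRetriesPerRequestError.
const withUnlimitedRetries = {
  maxRetriesPerRequest: -1,
};

module.exports = { withoutReadyCheck, withUnlimitedRetries };
```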

Our case to reproduce the bug:

  1. An ioredis Express client connected in standalone mode to the redis-ha chart with haproxy enabled and 3 redis/sentinel instances behind it (haproxy routes all requests to the master node).
  2. Kill haproxy or scale it to 0.
  3. ioredis enters the auto-reconnect loop (with retryStrategy).
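The auto-reconnect loop mentioned in the last step is driven by the `retryStrategy` option: ioredis calls it with the number of attempts made so far and waits the returned number of milliseconds before reconnecting (returning a non-number stops retrying). A minimal sketch:

```javascript
// Minimal retryStrategy sketch: linear backoff capped at 2 seconds.
// ioredis calls this with the attempt count and waits the returned
// number of milliseconds before the next reconnection attempt.
function retryStrategy(times) {
  return Math.min(times * 50, 2000);
}

module.exports = retryStrategy;
```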

Adding this stack trace to help:

programname programname 2019-11-20T08:58:29.197Z ioredis:redis status[10.19.246.167:6379 (clientname)]: close -> reconnecting
programname programname 2019-11-20T08:58:29.197Z ioredis:connection reach maxRetriesPerRequest limitation, flushing command queue... v1
programname programname 2019-11-20T08:58:29.197Z ioredis:connection reach maxRetriesPerRequest limitation, flushing command queue... DONE v1
programname programname 2019-11-20T08:58:29.197Z ioredis:redis recoverFromFatalError MaxRetriesPerRequestError: Reached the max retries per request limit (which is 1). Refer to "maxRetriesPerRequest" option for details.
programname programname 2019-11-20T08:58:29.198Z ioredis:redis DISCONNECT CALLED v2 true
programname programname Trace
programname programname     at Redis.disconnect (/snapshot/programname/node_modules/ioredis/built/redis/index.js:323:10)
programname programname     at Redis.recoverFromFatalError (/snapshot/programname/node_modules/ioredis/built/redis/index.js:365:10)
programname programname     at /snapshot/programname/node_modules/ioredis/built/redis/event_handler.js:53:30
programname programname     at /snapshot/programname/node_modules/ioredis/built/redis/index.js:436:20
programname programname     at tryCatcher (/snapshot/programname/node_modules/standard-as-callback/built/utils.js:11:23)
programname programname     at promise.then (/snapshot/programname/node_modules/standard-as-callback/built/index.js:30:51)
programname programname     at process._tickCallback (internal/process/next_tick.js:68:7)

@cschockaert cschockaert changed the title Auto reconnect doesn't work in special case Auto reconnect doesn't work under certains circumstances Nov 20, 2019

stale bot commented Dec 20, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 7 days if no further activity occurs, but feel free to re-open a closed issue if needed.

@stale stale bot added the wontfix label Dec 20, 2019

cschockaert commented Dec 20, 2019

This is still an active bug.

@stale stale bot removed the wontfix label Dec 20, 2019

fbruffaert commented Jan 16, 2020

I have also encountered the same issue. With maxRetriesPerRequest set to 1 and enableOfflineQueue set to false, the auto-reconnect mechanism stops working after entering the recoverFromFatalError method.
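The configuration described above can be sketched as follows (option names are from the ioredis standalone API; this merely restates the reported settings):

```javascript
// Configuration matching the report above: a pending command fails after
// a single reconnect attempt, and commands issued while disconnected are
// rejected immediately instead of being queued.
const options = {
  maxRetriesPerRequest: 1,
  enableOfflineQueue: false,
};

module.exports = options;
```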

Here are some logs:

ioredis:redis status[127.0.0.1:6379]: connecting -> connect +82ms
ioredis:redis write command[127.0.0.1:6379]: 0 -> info([]) +2ms
ioredis:redis status[127.0.0.1:6379]: connect -> ready +5ms
ioredis:redis write command[127.0.0.1:6379]: 0 -> flushdb([]) +1ms

ioredis:redis status[127.0.0.1:6379]: ready -> close +8s
ioredis:connection reconnect in 50ms +0ms
ioredis:redis status[127.0.0.1:6379]: close -> reconnecting +1ms
ioredis:redis status[127.0.0.1:6379]: reconnecting -> connecting +52ms
ioredis:redis status[127.0.0.1:6379]: connecting -> connect +2ms
ioredis:redis write command[127.0.0.1:6379]: 0 -> info([]) +1ms
ioredis:connection error: Error: read ECONNRESET +57ms
[ioredis] Unhandled error event: Error: read ECONNRESET
at TCP.onStreamRead (internal/stream_base_commons.js:111:27)
ioredis:redis status[127.0.0.1:6379]: connect -> close +5ms
ioredis:connection reconnect in 100ms +3ms
ioredis:redis status[127.0.0.1:6379]: close -> reconnecting +0ms
ioredis:connection reach maxRetriesPerRequest limitation, flushing command queue... +0ms

@luin were you able to investigate this issue?

Thanks!


592da commented Feb 25, 2020

This is happening on idle lambdas as well.
Any fix?


shizhx commented Mar 28, 2020

+1
We also have encountered the same issue.


shizhx commented Mar 28, 2020

While debugging, we see:

  • status = 'reconnecting'
  • but the reconnectTimeout timer task is null


shizhx commented Mar 30, 2020

Steps to reproduce:

  1. redis-server is not started
  2. call new Redis() and then any command, e.g. get
  3. ioredis will fail to connect and retry (enable DEBUG and print retryAttempts)
  4. once retryAttempts reaches 20, start netcat (nc -l 6379) as a fake redis-server
  5. wait for a while; once ioredis has connected to netcat and the info command has been written, stop netcat
  6. ioredis will reconnect, but the reconnectTimeout task is always canceled by recoverFromFatalError
  7. ioredis is dead and the get command will never return

luin pushed a commit that referenced this issue May 30, 2020
ioredis-robot pushed a commit that referenced this issue May 30, 2020
## [4.17.3](v4.17.2...v4.17.3) (2020-05-30)

### Bug Fixes

* race conditions in `Redis#disconnect()` can cancel reconnection unexpectedly ([6fad73b](6fad73b)), closes [#1138](#1138) [#1007](#1007)

ioredis-robot commented May 30, 2020

🎉 This issue has been resolved in version 4.17.3 🎉

The release is available on:

Your semantic-release bot 📦🚀


tupizz commented Jul 15, 2020

The error is still here, even with version 4.17.3...
