rdy_timeout handling round 2 #37

mreiferson · 2013-06-24T18:42:54Z

more cleanup re: #36

* fixes edge cases where rdy_timeout was not being cleaned up
* separates the handling of Reader global backoff timers from per-connection RDY delay timers, so that closing a connection that held the backoff timer state is handled correctly
* resumes normal RDY state for all connections after completely exiting backoff state
* handles re-connecting while in a backoff block
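
A minimal sketch of the timer separation described above, not the actual patch. `Timer`, `Conn`, `Reader`, `delay_rdy` and `on_close` are hypothetical names used only for illustration; the real code schedules timeouts on the event loop. The idea is that the backoff timer lives on the reader and the RDY delay timer lives on the connection, so closing a connection only has to clean up its own timer:

```python
class Timer:
    """Hypothetical stand-in for an event-loop timeout handle."""

    def __init__(self):
        self.cancelled = False

    def cancel(self):
        self.cancelled = True


class Conn:
    def __init__(self, name):
        self.name = name
        self.rdy_timeout = None  # per-connection RDY delay timer


class Reader:
    def __init__(self):
        self.conns = {}
        self.backoff_timeout = None  # reader-wide backoff timer

    def add_conn(self, conn):
        self.conns[conn.name] = conn

    def clear_backoff_timeout(self):
        if self.backoff_timeout is not None:
            self.backoff_timeout.cancel()
            self.backoff_timeout = None

    def start_backoff_block(self):
        # The reader-level timer belongs to no connection; replace any pending one.
        self.clear_backoff_timeout()
        self.backoff_timeout = Timer()

    def delay_rdy(self, conn):
        # Cancel an earlier per-connection timer so it cannot leak.
        if conn.rdy_timeout is not None:
            conn.rdy_timeout.cancel()
        conn.rdy_timeout = Timer()

    def on_close(self, conn):
        # Clean up only the closing connection's own timer.
        if conn.rdy_timeout is not None:
            conn.rdy_timeout.cancel()
            conn.rdy_timeout = None
        self.conns.pop(conn.name, None)


reader = Reader()
a, b = Conn("a"), Conn("b")
reader.add_conn(a)
reader.add_conn(b)
reader.start_backoff_block()
reader.delay_rdy(a)
a_timer = a.rdy_timeout
reader.on_close(a)
```

After `on_close(a)` the connection's timer is cancelled, while the reader's `backoff_timeout` is still pending for the remaining connection.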

mreiferson · 2013-06-24T20:25:09Z

pushed a few commits up and updated issue description

jehiah · 2013-06-25T03:40:24Z

I've pushed up a few more fixes here. This code is now stable for me and tested on some production systems. I am still thinking through whether there is anything else I want to tackle, and I have a few ideas on how to test.

We were putting a connection into the connection list too soon (it was possible for it to be sent a RDY count before it was connected). I delayed that until it is actually connected.
The end of message handling calls one of two functions: exit backoff block or start backoff block.
Other points that affect backoff (connection add, close) call restart backoff block as appropriate.
The backoff block functions never take a connection; they pick one at random.

* skip redistributing RDY while blocked by backoff
* stop callbacks on closed connections
* re-schedule backoff on remaining connections
* properly send RDY on connect (after IDENTIFY)
* log message body on exception
* throttle connection attempts
* rename various internal methods for clarity
* separate a conn's rdy_timeout (for disabled handling) from the reader's global backoff_timeout (and clear backoff_timeout)
* force redistribute after a connection closes when in backoff or when it would have toggled out of normal redistribution cases
* test improvements
* don't 'optimize' when RDY is already the value we want (far far easier to reason in tests)
* when completely exiting backoff return to normal operation
* choose a random connection when backoff block expires
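
For readers following along, a rough sketch of the backoff block flow described above. This is an illustration under assumptions, not the code in this branch: `BackoffReader`, its method names, and the even split of `max_in_flight` on exit are all hypothetical. It shows RDY 0 everywhere while blocked, RDY 1 on one randomly chosen connection when the block expires, and normal RDY on every connection after completely exiting backoff:

```python
import random


class BackoffReader:
    """Hypothetical model that tracks only the RDY count per connection."""

    def __init__(self, conn_names, max_in_flight=4, rng=None):
        self.rdy = {name: 0 for name in conn_names}
        self.max_in_flight = max_in_flight
        self.in_backoff_block = False
        self.rng = rng or random.Random()

    def start_backoff_block(self):
        # While blocked, no connection may receive messages.
        self.in_backoff_block = True
        for name in self.rdy:
            self.rdy[name] = 0

    def finish_backoff_block(self):
        # Block expired: probe with RDY 1 on one connection picked at random.
        # No connection is passed in, so a closed one can never be referenced.
        self.in_backoff_block = False
        if not self.rdy:
            return None
        name = self.rng.choice(sorted(self.rdy))
        self.rdy[name] = 1
        return name

    def exit_backoff(self):
        # Completely out of backoff: every connection returns to normal RDY.
        # Splitting max_in_flight evenly is an assumption of this sketch.
        if not self.rdy:
            return
        per_conn = max(1, self.max_in_flight // len(self.rdy))
        for name in self.rdy:
            self.rdy[name] = per_conn


reader = BackoffReader(["a", "b", "c"], max_in_flight=6, rng=random.Random(0))
reader.start_backoff_block()
blocked = dict(reader.rdy)
probe = reader.finish_backoff_block()
probing = dict(reader.rdy)
reader.exit_backoff()
```

Because the functions pick the connection themselves, a connection add or close during backoff only needs to restart the block rather than hand over timer state.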

jehiah · 2013-06-26T17:45:29Z

LGTM. squash/rebase please.

This set of changes fantastically improves lots of strange edge cases. 🚀 💯

rdy_timeout handling round 2

jehiah added a commit that referenced this pull request Jun 26, 2013

Merge pull request #37 from mreiferson/rdy_timeout_again_37

e41e329

rdy_timeout handling round 2

jehiah merged commit e41e329 into nsqio:master Jun 26, 2013

jehiah mentioned this pull request Aug 19, 2013

reader: improve RDY logic nsqio/nsq#253

Closed

5 tasks

mreiferson mentioned this pull request Sep 3, 2013

improve RDY logic nsqio/go-nsq#1

Closed

6 tasks
