bug 1525 - moxi eating 100% cpu due to conn_pause
Compared to memcached, moxi adds a new 'conn_pause' state to the
drive_machine() state machine.  Some of the new membase/vbucket
features exercise this code path far more often, especially when
moxi encounters a pending vbucket (where the downstream server can
block for a long time).  In that case, the upstream connection (to
the application/client) is left in conn_pause state.

In one instance where I caught the 100% CPU condition,
drive_machine() was spin-looping on connections in conn_pause.

One possible cause is that moxi code seems (incorrectly) not to
unregister the libevent registrations for upstream connections
that go into conn_pause state.

As a catch-all fix, any connection that is in conn_pause now has
its libevent registration event_del()'ed.
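
To illustrate the suspected mechanism, here is a minimal sketch that uses plain poll(2) instead of libevent, so it runs without moxi. It assumes the spin comes from level-triggered readiness: a paused connection does not read, so unread client bytes keep the fd readable and every loop iteration wakes up again. The names paused_conn, count_wakeups and demo_pause_spin are hypothetical and do not appear in moxi.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical stand-in for a paused upstream connection: an fd plus a
 * flag saying whether it is still registered for read events. */
typedef struct {
    int fd;
    int registered;
} paused_conn;

/* Count how often an event loop would wake up for this connection over
 * `rounds` iterations. Nothing here reads from the fd, so pending bytes
 * stay in the kernel buffer and poll() keeps reporting POLLIN. */
int count_wakeups(const paused_conn *c, int rounds, int timeout_ms) {
    assert(c != NULL);
    int wakeups = 0;
    for (int i = 0; i < rounds; i++) {
        struct pollfd p;
        p.fd = c->registered ? c->fd : -1;  /* poll() ignores negative fds */
        p.events = POLLIN;
        p.revents = 0;
        if (poll(&p, 1, timeout_ms) > 0 && (p.revents & POLLIN))
            wakeups++;
    }
    return wakeups;
}

/* Returns 1 if deregistering silences a connection that was spinning,
 * 0 if not, and -1 if the pipe could not be set up. */
int demo_pause_spin(void) {
    int fds[2];
    if (pipe(fds) != 0)
        return -1;
    if (write(fds[1], "x", 1) != 1) {   /* client sent data; nobody reads it */
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    paused_conn c = { fds[0], 1 };
    int before = count_wakeups(&c, 5, 10);  /* still registered: wakes every round */
    c.registered = 0;                       /* analogous to event_del() */
    int after = count_wakeups(&c, 5, 10);   /* deregistered: only timeouts */

    close(fds[0]);
    close(fds[1]);
    return before == 5 && after == 0;
}
```

Calling demo_pause_spin() from a small main() should return 1 on a POSIX system: the registered fd wakes the loop on every round, and the deregistered one never does.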

Change-Id: Ia88133a6cb97209fff6bab465a14b934b2dd6274
Reviewed-on: http://review.northscale.com:8080/844
Reviewed-by: Dustin Sallings <dustin@spy.net>
Tested-by: Dustin Sallings <dustin@spy.net>
steveyen authored and dustin committed Jun 23, 2010
1 parent 2910f24 commit bd804ef
Showing 1 changed file with 12 additions and 1 deletion.
13 changes: 12 additions & 1 deletion memcached.c
@@ -3210,9 +3210,10 @@ bool update_event(conn *c, const int new_flags) {
     if (c->ev_flags == new_flags)
         return true;
     if (event_del(&c->event) == -1) return false;
+    c->ev_flags = new_flags;
+    if (new_flags == 0) return true;
     event_set(&c->event, c->sfd, new_flags, event_handler, (void *)c);
     event_base_set(base, &c->event);
-    c->ev_flags = new_flags;
     if (event_add(&c->event, 0) == -1) return false;
     return true;
 }
@@ -3328,6 +3329,11 @@ void drive_machine(conn *c) {
     assert(c != NULL);

     while (!stop) {
+        if (settings.verbose > 2) {
+            fprintf(stderr, "%d: drive_machine %s\n",
+                    c->sfd, state_text(c->state));
+        }
+
         switch(c->state) {
         case conn_listening:
             addrlen = sizeof(addr);
@@ -3597,6 +3603,11 @@ void drive_machine(conn *c) {
             break;

         case conn_pause:
+            // In case whoever put us into conn_pause didn't clear out
+            // libevent registration, do so now.
+            //
+            update_event(c, 0);
+
             if (c->funcs->conn_pause != NULL)
                 c->funcs->conn_pause(c);

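
A note on the update_event() hunk: it adds an early return for new_flags == 0 next to the c->ev_flags assignment, and the order of those two lines appears to matter. The sketch below is a hypothetical mock, not moxi code. It assumes the assignment is meant to run before the early return, and compares that with a variant that records the flags only afterwards. MOCK_EV_READ, mock_conn and both functions are illustrative names, and the `registered` field stands in for libevent's internal state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MOCK_EV_READ 0x02   /* illustrative flag value, not taken from moxi */

/* Hypothetical mock of the state update_event() manages. */
typedef struct {
    int  ev_flags;     /* flags the code believes are registered */
    bool registered;   /* what the (mock) event loop actually holds */
} mock_conn;

/* Records the new flags before the early return. */
bool mock_update_event(mock_conn *c, int new_flags) {
    assert(c != NULL);
    if (c->ev_flags == new_flags) return true;  /* nothing to change */
    c->registered = false;                      /* stands in for event_del() */
    c->ev_flags = new_flags;
    if (new_flags == 0) return true;            /* paused: stay unregistered */
    c->registered = true;                       /* stands in for event_set()/event_add() */
    return true;
}

/* Variant that records the flags only after the early return. */
bool mock_update_event_late(mock_conn *c, int new_flags) {
    assert(c != NULL);
    if (c->ev_flags == new_flags) return true;
    c->registered = false;
    if (new_flags == 0) return true;            /* ev_flags is left stale here */
    c->ev_flags = new_flags;
    c->registered = true;
    return true;
}
```

With a connection that starts out registered for MOCK_EV_READ, calling mock_update_event(c, 0) and then mock_update_event(c, MOCK_EV_READ) leaves it registered again. The late variant leaves ev_flags at MOCK_EV_READ after the pause, so the second call takes the "nothing to change" path and the connection stays unregistered.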