
bug 1525 - moxi eating 100% cpu due to conn_pause

Compared to memcached, moxi adds a new 'conn_pause' state to the
drive_machine() state machine.  Some of the new membase/vbucket features,
especially when encountering a pending vbucket (where the downstream
server can block for a long time), exercise this codepath much more
often.  In that case, the upstream connection (to the
application/client) is left in the conn_pause state.

In one case where I caught the 100% cpu condition, drive_machine()
was spin-looping on conn_pause'ed connections.

One possible cause is that the moxi code seems (incorrectly) not to
unregister the libevent registrations for upstream connections that
go into the conn_pause state.

As a catch-all fix, any connections that are in conn_pause will have
their libevent registrations event_del()'ed.

Change-Id: Ia88133a6cb97209fff6bab465a14b934b2dd6274
Reviewed-on: http://review.northscale.com:8080/844
Reviewed-by: Dustin Sallings <dustin@spy.net>
Tested-by: Dustin Sallings <dustin@spy.net>
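
To make the suspected mechanism described in the message concrete, here is a minimal sketch in C. It is not moxi code: it uses plain poll(2) in place of libevent, and the helpers count_wakeups() and paused_wakeups() are hypothetical names invented for this illustration. It assumes a level-triggered readiness model, in which a descriptor with unread data is reported on every pass. Under that assumption, a paused handler that never reads is woken again and again, while dropping the registration (the analogue of event_del()) stops the wakeups.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper, not part of moxi or libevent.
 * Polls `rounds` times and counts how often `fd` is reported readable
 * while nobody reads from it.  registered == 0 models a connection
 * whose registration has been removed (the analogue of event_del()). */
int count_wakeups(int fd, int registered, int rounds) {
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int wakeups = 0;

    assert(rounds >= 0);
    for (int i = 0; i < rounds; i++) {
        /* With nfds == 0, poll() only waits for the 10 ms timeout. */
        int n = poll(&pfd, registered ? 1 : 0, 10);
        if (n > 0 && (pfd.revents & POLLIN)) {
            /* A paused handler would return here without consuming
             * the data, so the next pass fires again immediately. */
            wakeups++;
        }
    }
    return wakeups;
}

/* Hypothetical driver: a pipe with one unread byte stands in for an
 * upstream connection that is paused but still has input pending.
 * Returns the wakeup count, or -1 if the pipe could not be set up. */
int paused_wakeups(int registered, int rounds) {
    int fds[2];
    int n;

    if (pipe(fds) != 0)
        return -1;
    if (write(fds[1], "x", 1) != 1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    n = count_wakeups(fds[0], registered, rounds);
    close(fds[0]);
    close(fds[1]);
    return n;
}
```

Calling paused_wakeups(1, 5) models the stale registration: every pass reports the descriptor as ready even though the handler does nothing with it. Calling paused_wakeups(0, 5) models the state after the registration has been dropped: each pass simply times out.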
commit bd804efa42b729ea23e7ff1d91ce1dbf1507b202 (1 parent: 2910f24)
steveyen committed with dustin, Jun 23, 2010
Showing with 12 additions and 1 deletion.
  1. +12 −1 memcached.c
@@ -3210,9 +3210,10 @@ bool update_event(conn *c, const int new_flags) {
     if (c->ev_flags == new_flags)
         return true;
     if (event_del(&c->event) == -1) return false;
+    c->ev_flags = new_flags;
+    if (new_flags == 0) return true;
     event_set(&c->event, c->sfd, new_flags, event_handler, (void *)c);
     event_base_set(base, &c->event);
-    c->ev_flags = new_flags;
     if (event_add(&c->event, 0) == -1) return false;
     return true;
 }
@@ -3328,6 +3329,11 @@ void drive_machine(conn *c) {
     assert(c != NULL);
     while (!stop) {
+        if (settings.verbose > 2) {
+            fprintf(stderr, "%d: drive_machine %s\n",
+                    c->sfd, state_text(c->state));
+        }
+
         switch(c->state) {
         case conn_listening:
             addrlen = sizeof(addr);
@@ -3597,6 +3603,11 @@ void drive_machine(conn *c) {
             break;
         case conn_pause:
+            // In case whoever put us into conn_pause didn't clear out
+            // libevent registration, do so now.
+            //
+            update_event(c, 0);
+
             if (c->funcs->conn_pause != NULL)
                 c->funcs->conn_pause(c);
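
The control flow of the patched update_event() can be sketched without libevent as follows. This is an illustrative model only: sketch_conn, sketch_update_event(), and SKETCH_EV_READ are hypothetical names, and the `registered` flag and call counters stand in for libevent's internal state so the effect of each call is observable. The sketch shows two points from the diff: new_flags == 0 now means "delete the registration and do not re-add it", and ev_flags is recorded before the early return so that a repeated update_event(c, 0) is caught by the equality check at the top.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag value for this sketch, not libevent's EV_READ. */
enum { SKETCH_EV_READ = 1 };

/* Hypothetical stand-in for moxi's conn; only the fields the sketch needs. */
typedef struct {
    int  ev_flags;    /* flags we believe are currently registered */
    bool registered;  /* models whether libevent holds a registration */
    int  del_calls;   /* how many times the event_del() analogue ran */
    int  add_calls;   /* how many times the event_add() analogue ran */
} sketch_conn;

/* Mirrors the order of operations in the patched update_event(). */
bool sketch_update_event(sketch_conn *c, int new_flags) {
    assert(c != NULL);

    if (c->ev_flags == new_flags)
        return true;              /* nothing to change */

    c->registered = false;        /* models event_del() */
    c->del_calls++;

    c->ev_flags = new_flags;      /* record before the early return */
    if (new_flags == 0)
        return true;              /* unregister only, do not re-add */

    c->registered = true;         /* models event_set() + event_add() */
    c->add_calls++;
    return true;
}
```

In this model, a connection that starts out registered for SKETCH_EV_READ ends up unregistered after sketch_update_event(&c, 0). A second call with 0 returns immediately without another delete, which is why calling update_event(c, 0) on every pass through conn_pause stays cheap.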
