Commits on Apr 26, 2012
  1. Steve Yen

    MB-4334 - clear downstream timeout before releasing

    steveyen authored
    There's one code path in cproxy_release_downstream() where a
    cproxy_forward() fails, during a retry, possibly while the downstream
    timeout_tv event is still registered.  In that case, a downstream
    could be released that has a non-0 timeout_tv.  It's a small window
    that some users have hit.
    
    Change-Id: I1b87298dba1151c8ece51c0cd78d68ca6fa2bdb0
    Reviewed-on: http://review.couchbase.org/15232
    Reviewed-by: Aliaksey Kandratsenka <alkondratenko@gmail.com>
    Tested-by: Steve Yen <steve.yen@gmail.com>
Commits on Mar 30, 2012
  1. Steve Yen

    CBSE-126 - clear links & timeouts before and after releasing conn

    steveyen authored
    The new zstored / asynchronous connect()'ing feature introduced
    code that didn't clean up correctly in two places.
    
    First, when a downstream was released (cproxy_release_downstream()),
    the downstream should be de-linked before looping through
    downstream_conns and repeatedly invoking
    zstored_release_downstream_conn().  The reason is that
    zstored_release_downstream_conn() calls cproxy_forward(), which might
    recurse.
    
    Secondly, when a downstream conn is zstored_release()'ed, the next
    chosen waiting downstream needs to have its timeout cleared.
    Otherwise, the (incorrectly) still-registered timeout might fire in
    unexpected places, possibly leading to the CBSE-126 situation,
    although that linkage is not proven.
    
    Change-Id: Ib3fb8365a606a9a8260ac59a4e2d1dba00851d24
    Reviewed-on: http://review.couchbase.org/14423
    Reviewed-by: Bin Cui <bin.cui@gmail.com>
    Tested-by: Steve Yen <steve.yen@gmail.com>
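
The ordering in the first fix can be illustrated with a minimal C sketch. Everything here is a hypothetical stand-in (`downstream_t`, `waiting_head`, `release_conn`, `release_downstream`), not moxi's actual structures: releasing a conn is modeled as waking whatever downstream sits at the head of a waiting list, which re-enters the release code. Because the downstream is unlinked first, the re-entrant call can never reach the downstream that is in the middle of being released.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a downstream; not moxi's real struct. */
typedef struct downstream {
    struct downstream *next;   /* link in a list of waiting downstreams */
    int conns[4];              /* pretend conn handles, 0 = released */
    int nconns;
    int released;              /* how many times release ran on this one */
} downstream_t;

static downstream_t *waiting_head = NULL;

/* Unlink d from the waiting list if it is on it. */
static void delink(downstream_t *d) {
    downstream_t **pp = &waiting_head;
    while (*pp != NULL) {
        if (*pp == d) {
            *pp = d->next;
            d->next = NULL;
            return;
        }
        pp = &(*pp)->next;
    }
}

static void release_downstream(downstream_t *d);

/* Toy model of the recursion hazard: releasing a conn hands control to
 * the downstream at the head of the waiting list, which re-enters
 * release_downstream(). */
static void release_conn(downstream_t *d, int i) {
    d->conns[i] = 0;
    if (waiting_head != NULL) {
        release_downstream(waiting_head);
    }
}

static void release_downstream(downstream_t *d) {
    assert(d != NULL);
    delink(d);                 /* de-link BEFORE looping over the conns */
    d->released++;
    for (int i = 0; i < d->nconns; i++) {
        release_conn(d, i);
    }
}
```

In this toy model, moving the `delink(d)` call to after the loop would let `release_conn()` find `d` still at the head of the list and recurse on it without end.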
Commits on Mar 21, 2012
  1. Steve Yen

    CBSE-115 - bug in the very first multiget command

    steveyen authored
    The 1.8.0 feature to enable event-based, asynchronous downstream
    connect()'ing (instead of the old synchronous downstream
    connect()'ing) apparently introduced a bug with multi-get.
    
    When the first request is a multi-get command that actually has
    multiple key parameters, that first request can sometimes fail,
    because an asynchronously connect()'ing downstream conn is left
    registered in libevent.  So, libevent might sometimes inadvertently
    invoke the on_pause() callbacks, which close the downstream conns.
    
    An existing unit test for STATS (which also uses a broadcast codepath
    like multiget) seems to sometimes catch this.
    
      ./t/issue-MB-3076.sh
    
    After this fix, which unregisters the half-connected downstream conn
    from libevent, the test for MB-3076 passes.
    
    Change-Id: Ia197a033fc7fece39055cf4bf3ea23c1e576ebc6
    Reviewed-on: http://review.couchbase.org/14129
    Reviewed-by: Aliaksey Kandratsenka <alkondratenko@gmail.com>
    Tested-by: Steve Yen <steve.yen@gmail.com>
Commits on Mar 16, 2012
  1. Steve Yen Aliaksey Kandratsenka (aka Aliaksei Kandratsenka)

    MB-4896 - Fix memory leak during dynamic reconfiguration.

    steveyen authored alk committed
    The downstream data structs in moxi didn't have their
    behavior data structures cleaned up.
    
    Change-Id: Id96740f47f508b194e45990a1b580cbbfaabbefd
    Reviewed-on: http://review.couchbase.org/13939
    Reviewed-by: Bin Cui <bin.cui@gmail.com>
    Tested-by: Aliaksey Kandratsenka <alkondratenko@gmail.com>
Commits on Oct 29, 2011
  1. Steve Yen

    cproxy_forward assert(d->upstream_conn != NULL) cproxy.c, 1925

    steveyen authored
    Here's the (rather involved) scenario where the assert would be triggered...
    
    * start a rebalance.  The rebalance should be slow, where each vbucket
      migration takes a while.
    
    * a multi-get command (with lots of keys) is processed by moxi, and
      moxi scatters out downstream GET requests to the involved nodes.
    
    * one downstream request will result in a NOT_MY_VBUCKET response from
      a node, which will make moxi register for a retry of the command
      again later (XXX).
    
    * a different downstream request hits a pending vbucket, meaning it's
      going to block moxi on that request, so moxi can't make progress on
      that multi-get command.
    
    * the rebalance is going very slowly, so next...
    
    * moxi's downstream_timeout timer gets fired, so...
    
    * moxi sends an error back up to the upstream client.
    
    * THEN, since a retry was registered at step XXX above due to the
      NOT_MY_VBUCKET, moxi next incorrectly goes through retry codepaths
      and the assert() catches this bad move.
    
    Change-Id: I4aea4e130ad55bbd2400bbbcd0547a86bddbc5da
    Reviewed-on: http://review.couchbase.org/10039
    Reviewed-by: Steve Yen <steve.yen@gmail.com>
    Tested-by: Steve Yen <steve.yen@gmail.com>
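
The scenario above can be reduced to a small C sketch. The commit message does not say how the fix was implemented, so this is only one plausible guard, with hypothetical names throughout (`downstream_t` here is a toy struct, and `on_not_my_vbucket`, `on_downstream_timeout`, `run_pending_retry` are invented for illustration): when the timeout path answers the upstream, it also drops the retry that was registered earlier, so the retry codepath never runs without an upstream conn.

```c
#include <assert.h>
#include <stddef.h>

/* Toy downstream; field names are illustrative, not moxi's. */
typedef struct {
    void *upstream_conn;   /* NULL once the upstream has been answered */
    int   retry_pending;   /* set when a NOT_MY_VBUCKET asks for a retry */
    int   retries_run;
} downstream_t;

/* A NOT_MY_VBUCKET response registers a retry for later. */
static void on_not_my_vbucket(downstream_t *d) {
    d->retry_pending = 1;
}

/* The timeout path sends the error upstream and detaches.  Assumption:
 * it also cancels any registered retry. */
static void on_downstream_timeout(downstream_t *d) {
    d->upstream_conn = NULL;
    d->retry_pending = 0;
}

/* Runs a registered retry; returns 1 if a retry ran, 0 otherwise. */
static int run_pending_retry(downstream_t *d) {
    if (!d->retry_pending) {
        return 0;
    }
    assert(d->upstream_conn != NULL);   /* same invariant the assert checks */
    d->retry_pending = 0;
    d->retries_run++;
    return 1;
}
```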
  2. nimishzynga Steve Yen

    Moxi change for enabling event on downstream connection in pause state

    nimishzynga authored steveyen committed
    When we put the downstream connection in the conn_pause state, we disable
    events on it.  So if the connection gets closed, we only find out after we
    link that connection to an upstream connection, forward the upstream
    command to it, and wait for a read event on it.  Only while waiting for
    that read event do we learn that the connection has closed.
    This fix enables the read event on the downstream connection in the pause
    state, so we will know if the server sends extra data for a command or
    closes the connection.
    
    Change-Id: Iff50b252a04a8036bb901849953eac557ef8cf9d
    Reviewed-on: http://review.couchbase.org/9717
    Tested-by: Steve Yen <steve.yen@gmail.com>
    Reviewed-by: Steve Yen <steve.yen@gmail.com>
Commits on Sep 14, 2011
  1. nimishzynga Steve Yen

    Moxi in compatibility mode crashed after downstream timeout

    nimishzynga authored steveyen committed
    When there is no free downstream connection, the downstream is put into
    the downstream conn queue.  If the downstream conn queue timeout fires, we
    try to remove the downstream from the queue, but it is not removed because
    the host identifier used during insert differs from the one used during
    remove.  If the downstream connection is then closed, we try to send an
    error to all waiting downstreams.  That includes the already released
    downstream, which doesn't have an upstream, and moxi crashed.
    
    Change-Id: I789618a617878e04bd65449cad5dff9f78a256c2
    Reviewed-on: http://review.couchbase.org/9547
    Reviewed-by: Steve Yen <steve.yen@gmail.com>
    Tested-by: Steve Yen <steve.yen@gmail.com>
  2. nimishzynga Steve Yen

    Moxi in mcmux mode gives an error if the default (or behaviour) downstream protocol is ascii

    nimishzynga authored steveyen committed
    When moxi is running in mcmux mode and a client wants to use the binary
    protocol between moxi and membase, it sends B:host:port <command>, so the
    peer_protocol of the upstream connection is binary.  But we still send
    ascii to membase, since the default behaviour is ascii.  When we get the
    response from membase, we see that the upstream connection's peer protocol
    is binary, so we try to parse the ascii response with the binary handler
    and it fails.
    
    Change-Id: I26f097bc7d20a0e11c76560c8be54a9dda6b05b1
    Reviewed-on: http://review.couchbase.org/9554
    Reviewed-by: Steve Yen <steve.yen@gmail.com>
    Tested-by: Steve Yen <steve.yen@gmail.com>
Commits on Jul 7, 2011
  1. Steve Yen

    MB-2897 - use 64-bits for msec_current time

    steveyen authored
    A 32-bit msec counter overflows after 49+ days, and 64 bits should be
    enough for anybody.
    
    Change-Id: I769839ee4cb41f10ce808cf7f669c0cd1beb7245
    Reviewed-on: http://review.couchbase.org/7748
    Reviewed-by: Bin Cui <bin.cui@gmail.com>
    Tested-by: Steve Yen <steve.yen@gmail.com>
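
The 49-day figure follows from the arithmetic: an unsigned 32-bit counter holds at most 4,294,967,295 milliseconds, and a day is 86,400,000 milliseconds. A short C sketch of the wraparound, using hypothetical helper names (`days_until_wrap`, `advance_msec32`, `advance_msec64`) that are not taken from moxi:

```c
#include <stdint.h>

#define MSEC_PER_DAY (24ULL * 60 * 60 * 1000)   /* 86,400,000 */

/* Whole days a millisecond counter can count before exceeding max_value. */
static uint64_t days_until_wrap(uint64_t max_value) {
    return max_value / MSEC_PER_DAY;
}

/* Unsigned 32-bit arithmetic wraps modulo 2^32. */
static uint32_t advance_msec32(uint32_t now, uint32_t delta) {
    return now + delta;
}

/* The same step in 64 bits keeps counting. */
static uint64_t advance_msec64(uint64_t now, uint64_t delta) {
    return now + delta;
}
```

`days_until_wrap(UINT32_MAX)` is 49, and `advance_msec32(UINT32_MAX, 1)` wraps to 0, while `advance_msec64(UINT32_MAX, 1)` yields 4294967296. A 64-bit millisecond counter lasts hundreds of millions of years.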
Commits on May 18, 2011
  1. Steve Yen

    MB-3856 - SERVER_ERROR proxy downstream timeout $HOST

    steveyen authored
    The $HOST is only appended during timeout of single-server commands
    (get, set, delete, etc) which are the single-key commands.  Broadcast
    commands (like flush_all) won't have the $HOST appended during a
    timeout.
    
    Change-Id: I40c307a2ea4974aca6b884054da1f935ea8216a4
    Reviewed-on: http://review.membase.org/6315
    Reviewed-by: Bin Cui <bin.cui@gmail.com>
    Tested-by: Steve Yen <steve.yen@gmail.com>
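
A minimal C sketch of the two message shapes described above. The function name `format_timeout_error` is hypothetical, and the exact form of `$HOST` (shown in the usage note as a host:port string) is an assumption; only the `SERVER_ERROR proxy downstream timeout` text comes from the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Formats the ascii timeout error into buf.  host_ident is the downstream
 * host for single-key commands, or NULL for broadcast commands, which get
 * no host suffix.  Returns what snprintf returns. */
static int format_timeout_error(char *buf, size_t len, const char *host_ident) {
    assert(buf != NULL);
    if (host_ident != NULL) {
        return snprintf(buf, len,
                        "SERVER_ERROR proxy downstream timeout %s\r\n",
                        host_ident);
    }
    return snprintf(buf, len, "SERVER_ERROR proxy downstream timeout\r\n");
}
```

For example, `format_timeout_error(buf, sizeof(buf), "10.0.0.1:11210")` would be the single-key case (get, set, delete), and passing `NULL` would be the flush_all case.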
Commits on May 16, 2011
  1. Steve Yen Aliaksey Kandratsenka (aka Aliaksei Kandratsenka)

    MB-3849 - SERVER_ERROR proxy write to downstream $HOST

    steveyen authored alk committed
    Changed the error message to include the downstream host which moxi
    could not propagate the request to.
    
    Change-Id: Ia3e0bbc7ccf2f2ae203aaff56b60f571a1036b75
    Reviewed-on: http://review.membase.org/6269
    Tested-by: Aliaksey Kandratsenka <alkondratenko@gmail.com>
    Reviewed-by: Aliaksey Kandratsenka <alkondratenko@gmail.com>
  2. Steve Yen Aliaksey Kandratsenka (aka Aliaksei Kandratsenka)

    MB-3849 - SERVER_ERROR proxy downstream closed $HOST_IDENT

    steveyen authored alk committed
    Change-Id: Ibe9fd62bc76fc1f14554977f592ef347f871d734
    Reviewed-on: http://review.membase.org/6268
    Tested-by: Aliaksey Kandratsenka <alkondratenko@gmail.com>
    Reviewed-by: Aliaksey Kandratsenka <alkondratenko@gmail.com>
  3. Steve Yen Aliaksey Kandratsenka (aka Aliaksei Kandratsenka)

    MB-3479 - Use binary protocol AUTH_ERROR on null bucket access

    steveyen authored alk committed
    When the client tries to do operations on the so-called "NULL bucket",
    instead of returning a binary protocol response ENOMEM, respond with
    the PROTOCOL_BINARY_RESPONSE_AUTH_ERROR result code.
    
    Change-Id: I0efed77c4dbc2782fad1d8638a7ec7fe42313e21
    Reviewed-on: http://review.membase.org/6267
    Tested-by: Aliaksey Kandratsenka <alkondratenko@gmail.com>
    Reviewed-by: Aliaksey Kandratsenka <alkondratenko@gmail.com>
Commits on May 14, 2011
  1. Steve Yen

    MB-3479 - Use binary protocol EBUSY & EINTERNAL instead of ENOMEM

    steveyen authored
    Instead of over-using the OOM / ENOMEM binary protocol response
    error...
    
    - Return EBUSY during a timeout.
    - Return EINTERNAL for closed sockets & down servers.
    - Return ENOMEM when memcached returns ENOMEM.
    
    Also, use EINTERNAL rather than ENOMEM as the generic catch-all error
    code, which should reduce confusion ("but, I'm not actually out of
    memory").
    
    Change-Id: I207903f0c4d5b967866c67cb61ac3b43a832d5cd
    Reviewed-on: http://review.membase.org/6177
    Tested-by: Steve Yen <steve.yen@gmail.com>
    Reviewed-by: Steve Yen <steve.yen@gmail.com>
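
The mapping above can be sketched as a single C switch. The failure-cause enum and the function name are hypothetical, invented for illustration. The numeric status values are the ones I recall from memcached's binary protocol header (`protocol_binary.h`) and should be checked against it before being relied on.

```c
#include <stdint.h>

/* Binary protocol status codes (values as recalled from protocol_binary.h). */
enum {
    STATUS_ENOMEM    = 0x0082,
    STATUS_EINTERNAL = 0x0084,
    STATUS_EBUSY     = 0x0085
};

/* Hypothetical classification of what went wrong downstream. */
typedef enum {
    FAIL_TIMEOUT,
    FAIL_SOCKET_CLOSED,
    FAIL_SERVER_DOWN,
    FAIL_DOWNSTREAM_ENOMEM,   /* memcached itself answered ENOMEM */
    FAIL_OTHER
} fail_cause_t;

/* Picks the status code to send to a binary upstream client. */
static uint16_t status_for_failure(fail_cause_t cause) {
    switch (cause) {
    case FAIL_TIMEOUT:
        return STATUS_EBUSY;
    case FAIL_SOCKET_CLOSED:
    case FAIL_SERVER_DOWN:
        return STATUS_EINTERNAL;
    case FAIL_DOWNSTREAM_ENOMEM:
        return STATUS_ENOMEM;
    case FAIL_OTHER:
    default:
        return STATUS_EINTERNAL;   /* generic catch-all, no longer ENOMEM */
    }
}
```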
Commits on May 11, 2011
  1. Steve Yen

    MB-3798 - moxi option for ketama/weighted/modula item distributions

    steveyen authored
    When using libmemcached, start moxi with an extra -Z key=value
    configuration option...
    
      moxi -Z mcs_opts=distribution:ketama
      moxi -Z mcs_opts=distribution:ketama-weighted
      moxi -Z mcs_opts=distribution:modula
    
    In this commit, moxi stays with distribution:ketama as its default and
    a later debate can change that.
    
    Change-Id: I36d3df3a2ba79c9d793a5e1e1a31d0d24ba48450
    Reviewed-on: http://review.membase.org/6138
    Tested-by: Steve Yen <steve.yen@gmail.com>
    Reviewed-by: Matt Ingenthron <matt@northscale.com>
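
For readers unfamiliar with the distribution names, here is a minimal C sketch of the modula idea only: the server index is the key's hash modulo the number of servers. The function names are hypothetical, and FNV-1a is used purely as a stand-in hash; it is not claimed to be the hash libmemcached or moxi uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a, a stand-in hash for illustration only. */
static uint32_t fnv1a(const char *key, size_t len) {
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= (unsigned char)key[i];
        h *= 16777619u;
    }
    return h;
}

/* Modula distribution: hash the key, take it modulo the server count. */
static unsigned modula_server(const char *key, size_t len, unsigned nservers) {
    assert(nservers > 0);
    return fnv1a(key, len) % nservers;
}
```

With modula, changing the server count remaps most keys. Ketama is a consistent-hashing scheme designed so that far fewer keys move when a server is added or removed.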
Commits on Feb 19, 2011
  1. Steve Yen Aliaksey Kandratsenka (aka Aliaksei Kandratsenka)

    MB-3447 - binary protocol err handling

    steveyen authored alk committed
    Need to schedule the upstream conn into libevent so the error response
    actually gets written to socket.
    
    Change-Id: Id058e9dc84beeb2b109b99787b1ee9fc35dbbb38
    Reviewed-on: http://review.membase.org/4664
    Reviewed-by: Aliaksey Kandratsenka <alkondratenko@gmail.com>
    Tested-by: Aliaksey Kandratsenka <alkondratenko@gmail.com>
Commits on Feb 10, 2011
  1. Aliaksey Kandratsenka (aka Aliaksei Kandratsenka) Steve Yen

    added stats for local & total latencies

    alk authored steveyen committed
    NOTE: we assume that the first server in the downstream server list is
    'local'. A corresponding change will be made to ns_server.
    
    Change-Id: I8dd5c61828f6cdcc3c75a0e4f8d2559731701341
    Reviewed-on: http://review.membase.org/4499
    Tested-by: Aliaksey Kandratsenka <alkondratenko@gmail.com>
    Reviewed-by: Steve Yen <steve.yen@gmail.com>
  2. Aliaksey Kandratsenka (aka Aliaksei Kandratsenka) Steve Yen

    fixed gcc warning

    alk authored steveyen committed
    Change-Id: I5d09d6a7474e5e2906a5bf78b0d4c0aa732b1f37
    Reviewed-on: http://review.membase.org/4498
    Reviewed-by: Steve Yen <steve.yen@gmail.com>
    Tested-by: Aliaksey Kandratsenka <alkondratenko@gmail.com>
  3. Aliaksey Kandratsenka (aka Aliaksei Kandratsenka) Steve Yen

    replaced self parameter with local

    alk authored steveyen committed
    Local requests are requests to the first downstream server.
    
    Change-Id: I9e95bffd6adc388f55986b280c2023eda61d66e8
    Reviewed-on: http://review.membase.org/4497
    Tested-by: Aliaksey Kandratsenka <alkondratenko@gmail.com>
    Reviewed-by: Steve Yen <steve.yen@gmail.com>
  4. Aliaksey Kandratsenka (aka Aliaksei Kandratsenka) Steve Yen

    removed self-optimization

    alk authored steveyen committed
    Change-Id: I1eb77daded0a5e4ecaec0a0507ae0d89ce08021c
    Reviewed-on: http://review.membase.org/4495
    Reviewed-by: Steve Yen <steve.yen@gmail.com>
    Tested-by: Aliaksey Kandratsenka <alkondratenko@gmail.com>
Commits on Jan 29, 2011
  1. Steve Yen Aliaksey Kandratsenka (aka Aliaksei Kandratsenka)

    MB-3389 - b2b not-my-vbucket handling

    steveyen authored alk committed
    This bug fix required these changes...
    
    - binary protocol ntohs(status) conversion, so the NOT_MY_VBUCKET
      status code comparison actually works.
    
    - remove an assert() so that b2b request retry codepath works.
    
    - send binary error message to binary upstream clients instead of
      an ASCII protocol error string.
    
    Test hint -- point memcachetest at moxi while adding & removing
    servers in a cluster:

      memcachetest -h HOST:11211 -i 1 -c 1000 -l -L 2
    
    Change-Id: I06aba8cc3390042cc7430a3b4d5c659421f112de
    Reviewed-on: http://review.membase.org/4374
    Tested-by: Aliaksey Kandratsenka <alkondratenko@gmail.com>
    Reviewed-by: Aliaksey Kandratsenka <alkondratenko@gmail.com>
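
The first item, the ntohs(status) conversion, can be shown with a short C sketch. The status field of a binary protocol response header is two bytes at offset 6, sent in network (big-endian) byte order, so on a little-endian host a comparison against the raw field fails. The function name `is_not_my_vbucket` is hypothetical; the value 0x0007 for NOT_MY_VBUCKET is as I recall it from the binary protocol header.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

#define STATUS_NOT_MY_VBUCKET 0x0007

/* header points at the 24-byte binary response header as read from the
 * socket.  Bytes 6..7 hold the status in network byte order. */
static int is_not_my_vbucket(const unsigned char *header) {
    uint16_t wire_status;
    memcpy(&wire_status, header + 6, sizeof(wire_status));
    /* Convert to host order before comparing with a host-order constant. */
    return ntohs(wire_status) == STATUS_NOT_MY_VBUCKET;
}
```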
Commits on Dec 16, 2010
  1. Steve Yen

    MB-3202 - windows async connect() returns different errcode

    steveyen authored
    Enabled async connect() usage on windows, so moxi will have the same
    "pull network plug" behavior as on linux.  However, windows returns
    EWOULDBLOCK instead of EINPROGRESS, and WSAGetLastError() must be used
    instead of errno.
    
    Change-Id: I6f4918387737cbb6ee458825aec96b57877062ae
    Reviewed-on: http://review.membase.org/4103
    Reviewed-by: Bin Cui <bin.cui@gmail.com>
    Tested-by: Steve Yen <steve.yen@gmail.com>
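
A minimal C sketch of the portability difference described above, with hypothetical helper names (`last_socket_error`, `connect_is_pending`). On Windows the Winsock constant is spelled `WSAEWOULDBLOCK` and the error comes from `WSAGetLastError()`; elsewhere a non-blocking connect() in progress reports `EINPROGRESS` through `errno`.

```c
#include <errno.h>
#ifdef _WIN32
#include <winsock2.h>
#endif

/* Fetches the last socket error the platform-specific way. */
static int last_socket_error(void) {
#ifdef _WIN32
    return WSAGetLastError();
#else
    return errno;
#endif
}

/* True if a non-blocking connect() has merely not finished yet. */
static int connect_is_pending(int err) {
#ifdef _WIN32
    return err == WSAEWOULDBLOCK;
#else
    return err == EINPROGRESS;
#endif
}
```

A caller would check `connect_is_pending(last_socket_error())` right after connect() returns an error, and only treat other error values as a real failure.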
Commits on Dec 14, 2010
  1. Steve Yen

    MB-3175 - SERVER_ERROR proxy downstream timeout

    steveyen authored
    Provide a slightly different ascii error response if moxi hits a
    downstream timeout.
    
    Change-Id: Icb4501f1e255072263d96b17ea40a4e006a011de
    Reviewed-on: http://review.membase.org/4073
    Reviewed-by: Dustin Sallings <dustin@spy.net>
    Tested-by: Steve Yen <steve.yen@gmail.com>
Commits on Dec 9, 2010
  1. Steve Yen Sean Richard Lynch

    MB-3113 - close new upstream conns when there are no buckets

    steveyen authored seanlynch committed
    Change-Id: I0dd5ccb3b35f6c3ba4d0b8678aaa9b8810559951
    Reviewed-on: http://review.membase.org/4008
    Tested-by: Sean Lynch <seanl@literati.org>
    Reviewed-by: Sean Lynch <seanl@literati.org>
  2. Steve Yen Sean Richard Lynch

    MB-3113 - allow conn_init() callbacks to return an error

    steveyen authored seanlynch committed
    Change-Id: Ie3a41ea3f3009eb7995200bed7c01a219aa4baba
    Reviewed-on: http://review.membase.org/4007
    Tested-by: Sean Lynch <seanl@literati.org>
    Reviewed-by: Sean Lynch <seanl@literati.org>
Commits on Dec 8, 2010
  1. Steve Yen Chiyoung Seo

    MB-3129 - handle downstream_timeout/conn_queue_timeout cleaner

    steveyen authored chiyoung committed
    When we hit a downstream_timeout or downstream_conn_queue_timeout,
    count those events as separate stats.
    
    Also, explicitly propagate an error to the upstream conn right away,
    so the upstream client is not hung.
    
    Also, use explicit downstream_release, instead of the indirect,
    counter-based approach of cproxy_close_conn().
    
    Change-Id: Ib7babe83812c69224d67815fe732d7e5bf3e7d52
    Reviewed-on: http://review.membase.org/3972
    Reviewed-by: Chiyoung Seo <chiyoung.seo@gmail.com>
    Tested-by: Chiyoung Seo <chiyoung.seo@gmail.com>
  2. Steve Yen Chiyoung Seo

    MB-3129 - avoid queue infinite loops

    steveyen authored chiyoung committed
    Change-Id: Ied94ff62af5794cc8c21467d1533ed78d4951227
    Reviewed-on: http://review.membase.org/3971
    Tested-by: Steve Yen <steve.yen@gmail.com>
    Reviewed-by: Chiyoung Seo <chiyoung.seo@gmail.com>
Commits on Dec 7, 2010
  1. Steve Yen Chiyoung Seo

    stats for downstream_conn_queue add/remove

    steveyen authored chiyoung committed
    Change-Id: I9d409005ce508669426f4be203501a7e16566ae5
    Reviewed-on: http://review.membase.org/3969
    Tested-by: Chiyoung Seo <chiyoung.seo@gmail.com>
    Reviewed-by: Chiyoung Seo <chiyoung.seo@gmail.com>
Commits on Dec 3, 2010
  1. Steve Yen Chiyoung Seo

    MB-3099 - revert code from f763c6a that stops cmd forwarding

    steveyen authored chiyoung committed
    Removed the extra check in cproxy_forward_or_error() that
    caused a regression in moxi behavior when you kill an entire
    cluster.
    
    Change-Id: Ib4befef8aebda79f4207df94b341c917cfa05b9e
    Reviewed-on: http://review.membase.org/3946
    Tested-by: Chiyoung Seo <chiyoung.seo@gmail.com>
    Reviewed-by: Chiyoung Seo <chiyoung.seo@gmail.com>
  2. Steve Yen

    MB-3076 - broadcast commands have correct responses

    steveyen authored
    Due to the async connect() enhancement from zstored, moxi might
    receive a downstream connection error callback ("on close") at odd
    asynchronous times in the future.  So, the old synchronous connect()
    error handling code stopped working right.
    
    In this fix, if that asynchronous downstream conn "on_close" callback
    was due to a connect error, then don't run the usual response-or-error
    gathering codepaths because the downstream conn should have already
    been delinked from the request context.
    
    Change-Id: I689ce3201283ecd4226815ed3a2352eb1fa34b1d
    Reviewed-on: http://review.membase.org/3942
    Reviewed-by: Chiyoung Seo <chiyoung.seo@gmail.com>
    Tested-by: Steve Yen <steve.yen@gmail.com>
Commits on Dec 2, 2010
  1. Steve Yen

    MB-3067 - suffix missing for broadcast commands on connect error

    steveyen authored
    During broadcast commands like "stats", if one of the downstream conns
    had a connect error, moxi wouldn't send back a response suffix (like
    "END\r\n").  This would block clients like memcachetest which use
    stats.  This originated due to the zstored conn pooling enhancement
    where the original response suffix code ended up now being too late in
    the codepath.
    
    Change-Id: Id08b12d2953843a73f3887d048cafdef594ea501
    Reviewed-on: http://review.membase.org/3929
    Reviewed-by: Chiyoung Seo <chiyoung.seo@gmail.com>
    Tested-by: Steve Yen <steve.yen@gmail.com>
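
The required behavior can be sketched in C as follows. This is a toy gather step with hypothetical names (`gather_broadcast`, a `NULL` reply standing for a downstream that had a connect error), not moxi's codepath: whatever happens to individual downstreams, the suffix is always written, so a client waiting for `END` is not left blocked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Concatenates the replies that arrived, then always appends the suffix.
 * replies[i] == NULL models a downstream that failed to connect.
 * Returns the total length written, or -1 if out is too small. */
static int gather_broadcast(char *out, size_t len,
                            const char *const *replies, int n) {
    size_t used = 0;
    int w;

    assert(out != NULL && len > 0);
    out[0] = '\0';
    for (int i = 0; i < n; i++) {
        if (replies[i] == NULL) {
            continue;               /* failed downstream: skip, don't bail */
        }
        w = snprintf(out + used, len - used, "%s", replies[i]);
        if (w < 0 || (size_t)w >= len - used) {
            return -1;
        }
        used += (size_t)w;
    }
    /* The suffix is written even if every downstream failed. */
    w = snprintf(out + used, len - used, "END\r\n");
    if (w < 0 || (size_t)w >= len - used) {
        return -1;
    }
    return (int)(used + (size_t)w);
}
```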
Commits on Nov 30, 2010
  1. Steve Yen

    MB-2980 - front cache helper functions to avoid over-deleting

    steveyen authored
    New helper functions cproxy_front_cache_key() and
    cproxy_front_cache_delete(), originally via code patches
    from Paul Gale.
    
    These will help reduce unnecessary calls to mcache_delete()
    which will help keep better front cache statistics.
    
    Change-Id: If53ff0dbc3953ec74bb5d6bee8e6dfb4872f398e
    Reviewed-on: http://review.membase.org/3906
    Reviewed-by: Chiyoung Seo <chiyoung.seo@gmail.com>
    Tested-by: Steve Yen <steve.yen@gmail.com>
Commits on Nov 24, 2010
  1. Steve Yen

    MB-2972 - return SERVER_ERROR for membase bucket GET's

    steveyen authored Matt Ingenthron committed
    But, still return END's for memcached bucket GET's.
    
    Change-Id: Ie6c658b006eb870609e78b2b4988e12636fe08b2
    Reviewed-on: http://review.membase.org/3839
    Tested-by: Matt Ingenthron <matt@northscale.com>
    Reviewed-by: Matt Ingenthron <matt@northscale.com>
Commits on Nov 19, 2010
  1. Steve Yen Chiyoung Seo

    Only put paused conns back into downstream conn pool

    steveyen authored chiyoung committed
    Otherwise, if moxi releases a downstream conn that's not paused,
    close that conn.
    
    Change-Id: I8801114c174023e2dfbf2f6fa7bfe3eb0ca06b22
    Reviewed-on: http://review.membase.org/3739
    Tested-by: Chiyoung Seo <chiyoung.seo@gmail.com>
    Reviewed-by: Chiyoung Seo <chiyoung.seo@gmail.com>
Commits on Nov 18, 2010
  1. Steve Yen Chiyoung Seo

    inflight downstream conns count assert() fixed

    steveyen authored chiyoung committed
    Change-Id: I6b4a8e91e33bf2fa290731ac6316a32c7b09117e
    Reviewed-on: http://review.membase.org/3719
    Tested-by: Chiyoung Seo <chiyoung.seo@gmail.com>
    Reviewed-by: Chiyoung Seo <chiyoung.seo@gmail.com>