Commits on May 15, 2013
  1. squash me

    Dave Parfitt committed May 15, 2013
  2. wait for tcp_mon in updown test

    Dave Parfitt committed May 15, 2013
Commits on May 10, 2013
  1. kill riak_core_tcp_mon on test shutdown

    Dave Parfitt committed May 10, 2013
  2. fix tcpmon node up/down test

    manually apply changes from riak_repl 915a57db26055fc212aaf0ddbfbb8a1141ab9fda
    Dave Parfitt committed May 10, 2013
Commits on May 8, 2013
  1. Handle node down/node up correctly.

    Dave Parfitt committed May 8, 2013
Commits on May 6, 2013
  1. Merge pull request #312 from basho/pevm-transfers-formatting

    Enhance transfer display + wrapping nodenames.
    evanmcc committed May 6, 2013
Commits on May 3, 2013
  1. convert input atoms to lists

    evanmcc committed May 3, 2013
  2. pretty up transfer boxes a little, make nodenames wrap up to 75 characters.

    evanmcc committed May 3, 2013
Commits on May 2, 2013
  1. Merge pull request #299 from basho/gh274-vnode-noblocking-reply

    Vnode nonblocking reply, First draft (3rd edition), ready for some review
    slfritchie committed May 2, 2013
Commits on Apr 25, 2013
  1. Merge pull request #297 from basho/dip_ssl_cleanup

    don't use hardcoded app names in SSL utils
    Dave Parfitt committed Apr 25, 2013
  2. Merge pull request #290 from basho/jrw-handoff-progress

    Add support for tracking progress of individual handoffs
    jrwest committed Apr 25, 2013
Commits on Apr 24, 2013
  1. Merge pull request #304 from basho/adt-lager-2.0

    Update lager to 2.0.0rc2
    Vagabond committed Apr 24, 2013
Commits on Apr 22, 2013
  1. track handoff progress

    * vnodes can optionally provide the size of their data (in bytes or # of objects)
      when returning from is_empty/1. For vnodes whose size may change during handoff
      (e.g. a non read-isolated iterator in riak_kv like ets) a function may be provided
      instead. The function is called to calculate the size of the vnode each time
      transfer stats are requested from the handoff manager
    * use existing handoff stats to calculate completion percentage
    * move transfers console function from riak_kv_console to riak_core_console. It
      is not specific to kv and other applications should be able to take advantage
      of it
    * transfers prints completion percentage w/ progress bar
    jrwest committed Feb 21, 2013
  2. Change lager dep to 2.0.0rc2

    Vagabond committed Apr 22, 2013
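
The "track handoff progress" commit above says a vnode may report its size either as a fixed number or as a function that is re-evaluated each time transfer stats are requested, and that the console prints a completion percentage with a progress bar. The following is a minimal Python sketch of that idea, not the riak_core implementation: the names `current_size`, `completion`, and `progress_bar`, the bar width, and the treatment of a zero size as "complete" are all assumptions made for illustration.

```python
from typing import Callable, Union

# A size is either a fixed count (bytes or objects) or a zero-argument
# function that recomputes the count on demand (hypothetical representation).
Size = Union[int, Callable[[], int]]


def current_size(size: Size) -> int:
    """Resolve a size that may be a plain number or a function."""
    return size() if callable(size) else size


def completion(transferred: int, size: Size) -> float:
    """Fraction of the handoff completed, clamped to the range 0.0..1.0."""
    total = current_size(size)
    if total <= 0:
        # Assumption: nothing to transfer counts as complete.
        return 1.0
    return min(transferred / total, 1.0)


def progress_bar(fraction: float, width: int = 20) -> str:
    """Render a textual progress bar followed by a whole-number percentage."""
    filled = int(fraction * width)
    return "[" + "=" * filled + " " * (width - filled) + "] " + f"{fraction * 100:.0f}%"
```

Passing a function instead of a number means the percentage reflects the size at the moment stats are requested, which matters when the underlying data can grow or shrink during handoff.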
Commits on Apr 15, 2013
  1. Merge pull request #300 from branch 'bwf-pool-race'

    Bryan Fink committed Apr 15, 2013
  2. fix races between riak_core_vnode_worker_pool and poolboy

    This patch fixes both races demonstrated by the test in the previous
    commit, and described in #298.
    
    The fix for the checkin race is for the worker to only checkin to
    riak_core_vnode_worker_pool, and allow riak_core_vnode_worker_pool to
    decide when to checkin to poolboy. If there is outstanding work in the
    queue, the worker will just be reused immediately instead of being
    checked in.
    
    The fix for the DOWN race is for the worker to send a 'worker_started'
    message to riak_core_vnode_worker_pool, when the worker starts. While
    the worker is sending this message, poolboy is blocking, so
    riak_core_vnode_worker_pool can send poolboy a checkout message
    immediately, and still be guaranteed that it will be received after
    poolboy has put the worker in its pool.
    
    Documentation about how this works, and about how
    riak_core_vnode_worker_pool is acting as a queue wrapper around the
    pool, is also included.
    Bryan Fink committed Apr 15, 2013
  3. demonstrate the race between r_c_vnode_worker_pool and poolboy's fsm

    As described in #298:
    
    When a riak_core_vnode_worker finishes work, it sends checkin messages
    to both poolboy and riak_core_vnode_worker_pool. The latter maintains a
    queue of work to be handled when there's room in the pool. As soon as
    RCVWP gets the checkin message, it asks poolboy if there is a worker
    available (expecting that the worker just checked in will now be
    available).
    
    The problem is that poolboy may receive RCVWP's message before receiving
    the worker's checkin message. If this happens, it will tell RCVWP that
    the pool is full. RCVWP then sticks in the 'queueing' state until it
    receives another checkin message from a worker. Since another checkin
    may never arrive, the pool may become frozen.
    
    Crashing workers create a similar race condition to the double-checkin
    case, because 'DOWN' messages are delivered to both
    riak_core_vnode_worker_pool and poolboy. RCVWP again asks poolboy to
    checkout a worker (effectively immediately), which might happen before
    poolboy receives its 'DOWN' and starts a replacement.
    
    The test defined by worker_pool_pulse.erl demonstrates these races.
    Under PULSE execution, the test will fail with deadlock. If it fails for
    another reason (like timeout) you may have missed one of the
    requirements described below.
    
    In order to run the test, you will need the pulse_otp beams from
    https://github.com/Quviq/pulse_otp on your path. The riak_core and
    poolboy applications, as well as the worker_pool_pulse module, must also
    be compiled with the 'PULSE' macro defined. The newly-added 'pulse' make
    target will do this for you (and also run the test), but you will need
    to start with a clean checkout (no beams built), or recompilation will
    be skipped.
    Bryan Fink committed Apr 11, 2013
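
The checkin fix described in the commits above can be pictured with a small single-threaded Python model. It is only a sketch under stated assumptions: `Pool` stands in for poolboy, `QueueWrapper` stands in for riak_core_vnode_worker_pool, and every class and method name here is hypothetical. It models only the checkin path (not the 'DOWN' or 'worker_started' handling): a finished worker reports to the wrapper alone, and the wrapper either reuses it for queued work or checks it back into the pool.

```python
from collections import deque


class Pool:
    """Stand-in for poolboy: hands out idle workers, or None when all are busy."""

    def __init__(self, workers):
        self.idle = deque(workers)

    def checkout(self):
        return self.idle.popleft() if self.idle else None

    def checkin(self, worker):
        self.idle.append(worker)


class QueueWrapper:
    """Stand-in for riak_core_vnode_worker_pool: a work queue in front of the pool."""

    def __init__(self, pool):
        self.pool = pool
        self.queue = deque()   # work waiting for a free worker
        self.running = {}      # worker -> work item currently assigned

    def submit(self, work):
        worker = self.pool.checkout()
        if worker is None:
            self.queue.append(work)       # pool is full, so queue the work
        else:
            self.running[worker] = work

    def worker_done(self, worker):
        # The worker checks in here only, never directly to the pool, so the
        # pool cannot be asked for a worker before it has seen the checkin.
        del self.running[worker]
        if self.queue:
            # Outstanding work: reuse the worker immediately.
            self.running[worker] = self.queue.popleft()
        else:
            # No outstanding work: the wrapper decides to check in to the pool.
            self.pool.checkin(worker)
```

Because a single process decides when to check in, there is no second checkin message whose arrival order could leave the wrapper stuck in its queueing state.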
Commits on Apr 12, 2013
  1. First draft (3rd edition), ready for some review

    Vnode replies always go via reply(), and reply() always uses unreliable
    messaging.  (As opposed to the usual (and more reliable) send-and-pray
    messaging.)
    
    During handoff, all forwarding requests use unreliable vnode master
    commands to avoid net_kernel blocking interference.
    slfritchie committed Apr 5, 2013
Commits on Apr 8, 2013
  1. don't use a hardcoded app name in SSL utils

    Dave Parfitt committed Apr 5, 2013
  2. Merge pull request #296 from basho/1.3_to_master

    1.3 to master
    engelsanchez committed Apr 8, 2013
Commits on Apr 5, 2013
  1. Merge branch '1.3' into 1.3_to_master

    Conflicts:
    	rebar.config
    	src/riak_core_ring_handler.erl
    	src/riak_core_util.erl
    	src/riak_core_vnode.erl
    	src/riak_core_vnode_manager.erl
    engelsanchez committed Apr 5, 2013
  2. Merge pull request #291 from basho/dip_ssl

    SSL support
    Vagabond committed Apr 5, 2013
  3. Implement SSL support for riak_core_connection and riak_core_service_mgr

    This is a port of the SSL implementation from Riak's MDC implementation.
    The app.config arguments are the same, only now they're under riak_core.
    
    SSL is negotiated right after capabilities are exchanged, so minimal
    information is sent 'in the clear'. If one side requests SSL and the
    other side does not have it enabled, the connection is not allowed.
    Dave Parfitt committed with Vagabond Mar 21, 2013
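
The SSL commit above states one rule explicitly: when one side requests SSL and the other does not have it enabled, the connection is not allowed. The Python sketch below encodes that rule as a decision function. The function name `negotiate`, the boolean inputs, and the outcomes for the two matching cases (both sides on, both sides off) are assumptions for illustration and are not taken from the riak_core source.

```python
def negotiate(local_ssl: bool, remote_ssl: bool) -> str:
    """Decide the transport after capabilities have been exchanged.

    Returns "ssl", "plain", or "refuse". Only the mismatch case is stated
    in the commit message; the other two outcomes are assumed.
    """
    if local_ssl and remote_ssl:
        return "ssl"      # assumed: both sides enabled, upgrade the connection
    if not local_ssl and not remote_ssl:
        return "plain"    # assumed: neither side enabled, stay unencrypted
    return "refuse"       # stated: a mismatch is not allowed to connect
```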
Commits on Apr 3, 2013
  1. Merge pull request #292 from basho/dip_pin_ranch

    use custom Ranch build to support R14B03|4
    Dave Parfitt committed Apr 3, 2013
  2. use custom Ranch build to support R14B03|4

    Dave Parfitt committed Apr 3, 2013
Commits on Apr 1, 2013
  1. Merge pull request #289 from basho/dip_typos

    fixed typos found by @DeadZen
    Dave Parfitt committed Apr 1, 2013
  2. fixed typos found by @DeadZen

    Dave Parfitt committed Apr 1, 2013
Commits on Mar 22, 2013
  1. Merge pull request #288 from basho/kv508-stats-warn

    Failure to calculate a stats value should be temporary so warn only
    russelldb committed Mar 22, 2013
Commits on Mar 21, 2013
  1. Make stats more robust in the face of failure

    Use pid() for timer call so that a crashing stat cache does
    not end up with multiple timers for the same stat mods
    
    In some cases underlying ets tables for stats go away. When this
    happens the affected stats break and stay broken.
    
    When a stat is broken the stat calculation throws an error. For
    the sake of robustness this commit wraps stat calculation in a try
    catch, and returns the atom `unavailable` if a stat cannot be
    calculated. Broken stats are expected to be detected and repaired
    when they are updated.
    
    Rather than calculate stats on demand when stale, backfill the cache
    
    Always serve the stats that are in the cache, no matter how old they are.
    Add a timestamp to the stats so consumers know how stale they are.
    Fill the cache continuously in the background.
    russelldb committed Mar 21, 2013
  2. Merge pull request #281 from basho/eas-parallel-vnode-init-backport

    Porting parallel vnode init fix to 1.3 + revert switch
    engelsanchez committed Mar 21, 2013
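
The stats commit above describes three behaviours: wrap each stat calculation so a failure yields `unavailable`, always serve whatever is in the cache, and attach a timestamp so consumers can judge staleness. A rough Python sketch of those behaviours follows. The class `StatCache`, its methods, and the injectable `clock` are hypothetical names chosen for this example, and the background timer that would call `refresh` periodically is left out.

```python
import time

UNAVAILABLE = "unavailable"


def safe_calc(fun):
    """Run a stat calculation, returning UNAVAILABLE instead of raising."""
    try:
        return fun()
    except Exception:
        return UNAVAILABLE


class StatCache:
    """Serves cached stats with timestamps; refresh() backfills the cache."""

    def __init__(self, stat_funs, clock=time.time):
        self.stat_funs = stat_funs   # name -> zero-argument calculation
        self.clock = clock           # injectable so staleness is testable
        self.cache = {}              # name -> (value, timestamp)

    def refresh(self):
        """Recalculate every stat; meant to be driven by a background timer."""
        now = self.clock()
        for name, fun in self.stat_funs.items():
            self.cache[name] = (safe_calc(fun), now)

    def get(self, name):
        """Always answer from the cache, however old the entry is."""
        return self.cache.get(name, (UNAVAILABLE, None))
```

Readers never trigger a calculation themselves, so a slow or broken stat cannot block or crash a consumer; it only shows up as `unavailable` with the time of the last attempt.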
Commits on Mar 20, 2013
  1. Merge pull request #284 from basho/dip_conn_mgr

    initial add of the Riak Core Connection Manager
    
    We'll be circling back to fix the Ranch incompatibilities with R14B03|4 soon.
    Dave Parfitt committed Mar 20, 2013