branch: anya-upgrade
Commits on Oct 14, 2013
  1. @evanmcc @engelsanchez

    At this point in riak's history disterl is our primary bottleneck.

    evanmcc authored engelsanchez committed
    Something about the way that it is framing messages is causing its
    tcp connection to size its send and recv buffers much smaller than
    is optimal.
    
    This change adds code to automatically set dist port sizes on nodeup,
    as well as code to manually set all disterl connections to specific
    sizes, meant to be used from the console for testing and tuning.
    
    Small refactorings:
    
    * Store the send & recv buffer sizes in #state so that nodeup events
    that arrive after a set_dist_buf_sizes() call will use the specified
    sizes rather than OTP app environment var defaults.
    
    * Create get_riak_env_vars/0 helper function and export it for debugging use.
    After further refactoring it is only used once internally, so, YMMV,
    I'm fine with inlining this function's code and removing the export.
    
    * Add explicit buffer size args to set_port_buffers().
    
    Variable renaming to match inet:setopts() option names
    
    - remove redundant supervisor.
    
    allow feature to be disabled
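
As a rough illustration of the buffer-setting step described in this commit, the sketch below applies explicit sizes through `inet:setopts/2` using the `sndbuf` and `recbuf` option names. The module name, the argument order of `set_port_buffers/3`, and the sizes are assumptions made for illustration; this is not the code from the commit. A local listening socket stands in for a disterl connection so the example runs on a node without distribution.

```erlang
-module(dist_buf_sketch).
-export([set_port_buffers/3, demo/0]).

%% Apply explicit send and receive buffer sizes to a socket or port.
%% Returns ok or {error, Reason}, as inet:setopts/2 does.
set_port_buffers(Port, SndBuf, RecBuf) ->
    inet:setopts(Port, [{sndbuf, SndBuf}, {recbuf, RecBuf}]).

%% Open a throwaway listening socket, set its buffers, and read them back.
demo() ->
    {ok, Sock} = gen_tcp:listen(0, []),
    %% Sizes here are arbitrary placeholders, not tuned values.
    ok = set_port_buffers(Sock, 131072, 262144),
    {ok, Opts} = inet:getopts(Sock, [sndbuf, recbuf]),
    ok = gen_tcp:close(Sock),
    Opts.
```

The operating system may adjust the requested sizes, so the values read back can differ from the ones passed in.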
Commits on Aug 23, 2013
  1. @jaredmorrow
  2. @jaredmorrow
Commits on Aug 20, 2013
  1. @engelsanchez

    Merge pull request #356 from basho/eas-folsom-stat-error-protection

    engelsanchez authored
    Add protection against folsom stat errors
Commits on Aug 19, 2013
  1. @evanmcc

    Merge pull request #359 from basho/pevm-drop-bad-data

    evanmcc authored
    Corruption filtering changes for core.
  2. @engelsanchez

    Add protection against folsom stat errors

    engelsanchez authored
    Folsom may sometimes return an error tuple if something goes wrong (see
    folsom_ets.erl), but our code was only catching exceptions. So the error
    would end up being used as a valid value and crash the riak_kv_stat
    process later. This fixes that problem and gives us better protection
    from folsom funkiness.
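
The kind of protection described above might look like the following sketch, which treats both raised exceptions and returned error tuples as failures. The module name, `safe_stat/2`, and the exact shapes of the error tuples are assumptions for illustration; the stat lookup is passed in as a fun so the example does not depend on folsom.

```erlang
-module(stat_guard_sketch).
-export([safe_stat/2]).

%% Run Fun and return Default if it raises or returns an error tuple,
%% so that an error is never mistaken for a valid stat value.
safe_stat(Fun, Default) when is_function(Fun, 0) ->
    try Fun() of
        {error, _Reason} -> Default;
        {error, _Name, _Reason} -> Default;
        Value -> Value
    catch
        _Class:_Err -> Default
    end.
```

For example, `stat_guard_sketch:safe_stat(fun() -> {error, some_stat, nonexistent} end, unavailable)` returns `unavailable` instead of passing the tuple along.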
Commits on Aug 17, 2013
  1. @evanmcc
Commits on Aug 1, 2013
  1. @rzezeski

    Roll version 1.4.1

    rzezeski authored
Commits on Jul 31, 2013
  1. @russelldb

    Merge pull request #351 from basho/gh350-vnodeq-stats

    russelldb authored
    Fix catch pattern to match all errors
Commits on Jul 30, 2013
  1. @jonmeredith

    Merge pull request #352 from basho/jdm-tcp-mon-add-dist-fix

    jonmeredith authored
    Fix TCP mon to correctly spot nodes coming up.
  2. @jonmeredith

    Fix TCP mon to correctly spot nodes coming up.

    jonmeredith authored
    Corrected add_dist_conn argument order on nodeup event.
Commits on Jul 29, 2013
  1. @russelldb
Commits on Jul 9, 2013
  1. @jtuple
  2. @jtuple

    Fix two major vnode manager bugs

    jtuple authored
    First, fix a bug that enabled a race condition wherein the vnode
    manager could start the same vnode multiple times. This would result
    in both vnode instances trying to acquire the same backend, which
    would fail and force the Riak node to shut down.
    
    The cause of this bug was a change introduced during the large
    ring optimization work for Riak 1.4. In this work, an unbounded
    `ets:match_delete` that resulted in a table scan was changed to
    a straightforward `ets:delete`. Unfortunately, the `ets:delete`
    could delete data associated with a newer instance of a given vnode
    in cases where a monitor for a prior instance fired after the new
    instance was created.  This bug was fixed by switching to a bounded
    `ets:match_delete` that avoids the table scan while also avoiding
    unintended deletes.
    
    Second, fix a bug introduced during the parallel vnode initialization
    work from Riak 1.3.1 that caused the vnode manager to newly monitor a
    given vnode each time get_vnode_pid was called. This bug could result
    in an unbounded number of monitors being created in certain scenarios,
    causing a node to become slower over time until it was restarted.
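
The first fix can be sketched with a toy table. The table layout, the module and function names, and the use of atoms in place of vnode pids are all assumptions for illustration; the real vnode manager records are not shown in this log.

```erlang
-module(vnode_tab_sketch).
-export([demo/0]).

%% Remove the entry for Key only if it still belongs to Owner.
%% Key is fully bound in the pattern, so ETS can look it up directly
%% in a set table instead of scanning the whole table.
forget_instance(Tab, Key, Owner) ->
    ets:match_delete(Tab, {Key, Owner}).

demo() ->
    Tab = ets:new(idxtab_sketch, [set]),
    Key = {0, some_vnode_mod},
    %% A newer instance has already been registered under Key.
    true = ets:insert(Tab, {Key, new_instance}),
    %% A late monitor notification for the prior instance arrives.
    true = forget_instance(Tab, Key, old_instance),
    Remaining = ets:lookup(Tab, Key),
    true = ets:delete(Tab),
    Remaining.
```

Here `demo/0` returns `[{{0,some_vnode_mod},new_instance}]`: the newer entry survives. With a plain `ets:delete(Tab, Key)` in place of `forget_instance/3`, the lookup would return `[]`, which is the unintended delete described above.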
Commits on Jul 1, 2013
  1. @jrwest

    Merge pull request #345 from basho/jrw-incrvsn-resize-replace

    jrwest authored
    Increment Ring Version when Force-Replacing during Resize
Commits on Jun 28, 2013
  1. @jrwest

    Increment Ring Version when Force-Replacing during Resize

    jrwest authored
    Because the claimant runs in a different "mode", the ring version may
    not be incremented otherwise, causing reconciliation during gossip to
    fail. Seen in the wild and recreated periodically during riak_test.
Commits on Jun 26, 2013
  1. @jaredmorrow
Commits on Jun 24, 2013
  1. @jrwest

    Merge pull request #331 from basho/jrw-resize-foh-fix

    jrwest authored
    fix forced_ownership_handoff during resize
Commits on Jun 21, 2013
  1. @russelldb

    Merge pull request #336 from basho/gh335-reshed-stats

    russelldb authored
    Fix crashing stat mod never getting rescheduled
Commits on Jun 19, 2013
  1. @beerriot
  2. @beerriot

    only silently drop DOWN-normal messages in deleted modstate

    beerriot authored
    This is a restriction of the modification made in PR #334.
    
    Dropping all {'DOWN',_,process,_,normal} messages on the floor instead
    of passing them to vnode handle_info functions causes riak_pipe vnodes
    to miss messages that they use to clean up workers for pipes that
    shut down unexpectedly.
    
    This commit restricts the DOWN-normal message dropping to the case
    that the vnode's modstate is {deleted, _}. PR #334 suggests the original
    modification was made only to quiet the log spam generated by the
    following clause, which also only operates in modstate-deleted.
    
    Before this commit, the riak_test pipe_verify_exceptions would fail
    during its verify_middle_fitting_normal test, because workers would be
    left running after the fitting exited 'normal'. After this commit,
    workers are once again terminated correctly, so the test passes again.
Commits on Jun 17, 2013
  1. @engelsanchez

    Merge pull request #339 from basho/eas-fix-partition-repair-not-sent-fun

    engelsanchez authored
    Fix repair handoff crash, missing not sent fun
  2. @jrwest
Commits on Jun 15, 2013
  1. @engelsanchez
Commits on Jun 14, 2013
  1. @russelldb
Commits on Jun 13, 2013
  1. @slfritchie

    Merge pull request #334 from basho/slf-no-log-spam-on-normal-shutdown

    slfritchie authored
    Reporting 'normal' events is spammy, don't do it
Commits on Jun 12, 2013
  1. @russelldb

    Fix crashing stat mod never getting rescheduled

    russelldb authored
    1.3.1 updated the cache to fetch stats in the background rather than
    on demand. That change introduced a bug: if the stat mod crashes while
    producing stats, it is never rescheduled.
    
    Fix by rescheduling when a crash is detected. Exponentially back off
    the schedule after an error so as not to spam the log.
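
A minimal sketch of rescheduling with exponential backoff follows. The module name, function names, doubling factor, and cap are assumptions chosen for illustration and are not taken from the commit.

```erlang
-module(backoff_sketch).
-export([next_delay/2, reschedule/2]).

%% Double the delay after each error, never exceeding MaxMs.
next_delay(DelayMs, MaxMs) ->
    min(DelayMs * 2, MaxMs).

%% Ask the runtime to send Msg to the calling process after DelayMs
%% milliseconds; returns a timer reference.
reschedule(DelayMs, Msg) ->
    erlang:send_after(DelayMs, self(), Msg).
```

A process that detects a crashed stat run could call `reschedule(Delay, Msg)` and keep `next_delay(Delay, Max)` in its state for the next failure, resetting to the base delay after a successful run.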
Commits on Jun 7, 2013
  1. @slfritchie
Commits on Jun 4, 2013
  1. @evanmcc

    Merge pull request #332 from basho/pevm-timeout-guard

    evanmcc authored
    update bad value protection for timer value
  2. @evanmcc

    remove superfluous case

    evanmcc authored
Commits on Jun 3, 2013
  1. @evanmcc
  2. @jrwest

    fix forced_ownership_handoff during resize

    jrwest authored
    All resize operations remain in the ring's list of pending
    changes until all complete. Prior to this change transfers would
    only be triggered for the first forced_ownership_handoff operations.
    Subsequent operations would only be triggered by vnode *inactivity*.
    
    This commit modifies the use of forced_ownership_handoff during resize
    to ensure that only resize operations that are still pending are in
    the throttled transfer list.
Commits on May 30, 2013
  1. @jrwest

    Merge pull request #330 from basho/jrw-infinity-timeout-fix

    jrwest authored
    don't start coverage timeout timer if timeout is infinite
  2. @jrwest
  3. @engelsanchez