Commits on Feb 2, 2012
  1. Fix inline issue with older compilers (gcc 4.2.2)

    swills committed with dormando Feb 2, 2012
    ed note: this needs to be redone in memcached.h as a static inline, or changed
    to a define.
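    A minimal sketch of the two alternatives mentioned in the note (the helper
    name and body below are placeholders, not the actual memcached.h code):

        /* Option 1: static inline -- every translation unit gets its own copy,
         * sidestepping the C99/GNU89 inline-linkage differences that confuse
         * older compilers such as gcc 4.2.2. */
        static inline unsigned int example_mask(unsigned int bits) {
            return (1u << bits) - 1;
        }

        /* Option 2: a define -- no linkage questions at all, at the cost of the
         * usual macro caveats (no type checking, argument re-evaluation). */
        #define EXAMPLE_MASK(bits) ((1u << (bits)) - 1)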
  2. fix glitch with flush_all <future>

    dormando committed Feb 2, 2012
    reported by jhpark. items at the bottom of the LRU would be popped for sets if
    flush_all was set for the "future" but said future hadn't arrived yet.
    item_get handled this correctly so the flush would not happen, but items at
    the bottom of the LRU would be reclaimed early.
    
    Added tests for this as well.
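    A minimal sketch of the guard involved, assuming memcached's usual
    settings.oldest_live / current_time naming (the placement inside the
    LRU-tail reclaim path is paraphrased, not quoted from the diff):

        /* Only treat an item as flushed if the scheduled flush time has
         * actually passed AND the item predates it. */
        if (settings.oldest_live != 0 &&
            settings.oldest_live <= current_time &&  /* the "future" arrived */
            search->time <= settings.oldest_live) {
            /* safe to reclaim this LRU-tail item early */
        }
        /* Dropping the middle clause is the glitch: a flush_all scheduled for
         * the future reclaims tail items for sets before that time arrives. */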
Commits on Jan 28, 2012
  1. Skip SASL tests unless RUN_SASL_TESTS is defined.

    dustin committed Jan 28, 2012
    This fails for various stupid platform-specific things.  The SASL code
    can be working correctly, but not in a way that is completely
    predictable on every platform (for example, we may be missing a
    particular auth mode).
  2. Specify hostname in sasl_server_new.

    dustin committed Jan 28, 2012
    saslpasswd2 does something a little magical when initializing the
    structure that's different from what happens if you just pass NULL.
    
    The magic is too great for the tests as is, so this code does the same
    thing saslpasswd2 does to determine the fqdn.
  3. build fix: Define sasl_callback_ft on older versions of sasl.

    dustin committed Jan 28, 2012
    They just changed this randomly with no way to really detect it.  You
    can read about it here:
    
    http://lists.andrew.cmu.edu/pipermail/cyrus-sasl/2011-September/002340.html
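    The fix amounts to a small compatibility typedef; a sketch, assuming a
    build-system-provided guard macro (the guard name is illustrative, since as
    noted above the header change is not otherwise detectable):

        #ifndef HAVE_SASL_CALLBACK_FT
        /* Older Cyrus SASL releases have no sasl_callback_ft; supply the
         * generic callback pointer type ourselves. */
        typedef int (*sasl_callback_ft)(void);
        #endif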
Commits on Jan 26, 2012
  1. fix segfault when sending a zero byte command

    dormando committed Jan 26, 2012
    echo "" | nc localhost 11211 would segfault the server
    
    simple fix is to add the proper token check to the one place it's missing.
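    A sketch of the kind of guard meant by "the proper token check" (the
    constant and surrounding code are illustrative, not the actual diff):

        /* A zero-byte command yields no usable command token, so bail out
         * before the strcmp() chain dereferences tokens[COMMAND_TOKEN]. */
        if (ntokens < MIN_COMMAND_TOKENS) {  /* illustrative constant */
            out_string(c, "ERROR");
            return;
        }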
  2. fix warning in UDP test

    dormando committed Jan 26, 2012
  3. properly detect GCC atomics

    dormando committed Jan 26, 2012
    I was naive. GCC atomics were added in 4.1.2 and are not easily detectable
    without configure tests; 32bit platforms, centos5, etc. may lack them.
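    A sketch of the sort of probe a configure test has to compile and link,
    since there is no reliable preprocessor check (illustrative, not the actual
    configure fragment):

        /* Links only where the __sync builtins exist (gcc >= 4.1.2 on most
         * targets); otherwise the build falls back to mutex-based refcounting. */
        int main(void) {
            unsigned int x = 0;
            __sync_add_and_fetch(&x, 1);
            __sync_sub_and_fetch(&x, 1);
            return (int)x;
        }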
  4. tests: loop on short binary packet reads

    dustin committed Jan 26, 2012
    Awesome bug goes like this:
    
    let "c1" be the commit of the "good state" and "c2" be the commit
    immediately after (in a bad state).  "t1" is the state of the tree in "c1"
    and "t2" is the state of the tree in "c2"
    
    In their natural states, we have this:
    
    c1 -> t1 -> success
    c2 -> t2 -> fail
    
    However, if you take
    
    c1 -> t1 -> patch to t2 -> success
    c2 -> t2 -> patch to t1 -> fail
    
    So t1 *and* t2 both succeed if the committed tree is c1, but both fail if
    the committed tree is c2.
    
    The difference?  c1 has a tag that points to it so the version number is
    "1.2.10" whereas the version number for the unreleased c2 is
    "1.4.10-1-gee486ab" -- a bit longer, breaks stuff in tests that try to
    print stats.
Commits on Jan 18, 2012
  1. fix slabs_reassign tests on 32bit hosts

    dormando committed Jan 18, 2012
    32bit pointers are smaller... need more items to fill the slabs, sigh.
Commits on Jan 12, 2012
  1. update protocol.txt

    dormando committed Jan 12, 2012
  2. bug237: Don't compute incorrect argc for timedrun

    dustin committed with dormando Jan 12, 2012
    Since spawn_and_wait doesn't use argc anyway, might as well just not
    send a value in.
  3. fix 'age' stat for stats items

    dormando committed Jan 12, 2012
    credit goes to anton.yuzhaninov for the report and patch
  4. binary deletes were not ticking stats counters

    dormando committed Jan 12, 2012
    Thanks to Stephen Yang for the bug report.
  5. test for the error code, not the full message

    dormando committed Jan 12, 2012
    bad practice.
Commits on Jan 10, 2012
  1. more portable refcount atomics

    dormando committed Jan 10, 2012
    Most credit to Dustin and Trond for showing me the way, though I have no way
    of testing this myself.
    
    These should probably just be defines...
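    A sketch of what such wrappers look like, assuming the GCC __sync builtins
    are available (names are close to memcached convention but the mutex
    fallback is omitted):

        static inline unsigned short refcount_incr(unsigned short *refcount) {
            return __sync_add_and_fetch(refcount, 1);
        }

        static inline unsigned short refcount_decr(unsigned short *refcount) {
            return __sync_sub_and_fetch(refcount, 1);
        }

        /* The "should probably just be defines" option:
         * #define refcount_incr(rc) __sync_add_and_fetch((rc), 1) */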
Commits on Jan 9, 2012
  1. Fix a race condition from 1.4.10 on item_remove

    dormando committed Jan 9, 2012
    Updates the slab mover for the new method.
    
    1.4.10 lacks some crucial protection around item freeing and removal,
    resulting in some potential crashes. Moving the cache_lock around item_remove
    caused a 30% performance drop, so it's been reimplemented with GCC atomics.
    
    refcount of 1 now means an item is linked but has no reference, which allows
    us to test an atomic sub and fetch of 0 as a clear indicator of when to free
    an item.
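    A simplified illustration of that rule on the release path (the function
    name is illustrative, not the exact memcached function body):

        void item_release(item *it) {
            /* refcount 1 == linked but unreferenced, so a decrement that
             * reaches 0 means the item is both unlinked and unreferenced:
             * nothing can see it anymore and it is safe to free. */
            if (refcount_decr(&it->refcount) == 0) {
                item_free(it);
            }
        }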
  2. fix braindead linked list fail

    dormando committed Jan 9, 2012
    I re-implemented a linked list for the slab freelist since we don't need to
    manage the tail, check the previous item, and use it as a FIFO. However
    prev/next must be managed so the slab mover is safe.
    
    However I neglected to clear prev on a fetch, so if the slab mover was
    zeroing the head of the freelist it would relink the next item in the freelist
    with one in the main LRU.
    
    Which results in chaos.
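    A sketch of the one-line nature of the mistake: popping an item off the
    freelist head must also clear its old neighbour pointers (names are
    illustrative, not the diff):

        item *it = freelist_head;            /* head of the slab freelist */
        if (it != NULL) {
            freelist_head = it->next;
            if (it->next) it->next->prev = 0;
            it->prev = 0;                    /* the clear that was missing */
            it->next = 0;
        }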
Commits on Jan 6, 2012
  1. close some idiotic race conditions

    dormando committed Jan 6, 2012
    do_item_update could decide to update an item, then wait on the cache_lock,
    but the item could be unlinked in the meantime.
    
    caused this to happen on purpose by flooding with sets, then flushing
    repeatedly. flush has to unlink items until it hits the previous second.
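    A sketch of the shape of the fix, assuming memcached's ITEM_LINKED flag and
    cache_lock (paraphrased from the description above, not quoted from the
    diff):

        if (it->time < current_time - ITEM_UPDATE_INTERVAL) {
            mutex_lock(&cache_lock);
            /* Re-check under the lock: the item may have been unlinked by a
             * flush or delete while we were waiting for cache_lock. */
            if ((it->it_flags & ITEM_LINKED) != 0) {
                /* ... bump it->time and move the item to the head of its LRU,
                 * as before ... */
            }
            mutex_unlock(&cache_lock);
        }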
Commits on Jan 5, 2012
  1. reap items on read for slab mover

    dormando committed Jan 5, 2012
    popular items could stall the slab mover forever, so if a move is in progress,
    check to see if the item we're fetching should be unlinked instead.
Commits on Jan 4, 2012
  1. no same-class reassignment, better errors

    dormando committed Jan 4, 2012
    Add human-parseable strings to the errors for slabs reassign. Also prevent
    reassigning memory to the same source and destination.
  2. initial slab automover

    dormando committed Jan 4, 2012
    Enable at startup with -o slab_reassign,slab_automove
    
    Enable or disable at runtime with "slabs automove 1\r\n"
    
    Has many weaknesses. Only pulls from slabs which have had zero recent
    evictions. Is slow, not tunable, etc. Use the scripts/mc_slab_mover example to
    write your own external automover if this doesn't satisfy.
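    For reference, the two knobs described above in one place (the disable form
    is an assumption; the commit message only shows enabling):

        # at startup
        memcached -o slab_reassign,slab_automove

        # over the text protocol at runtime
        slabs automove 1
        slabs automove 0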
Commits on Dec 20, 2011
  1. slab reassignment

    dormando committed Dec 19, 2011
    Adds a "slabs reassign src dst" manual command, and a thread to safely process
    slab moves in the background.
    
    - slab freelist is now a linked list, reusing the item structure
    - if -o slab_reassign is enabled, an extra background thread is started
    - thread attempts to safely free up items when it's been told to move a page
      from one slab to another.
    
    -o slab_automove is stubbed.
    
    There are some limitations. Most notable is that you cannot repeatedly move
    pages around without first having items use up the memory. Slabs with newly
    assigned memory work off of a pointer, handing out chunks individually. We
    would need to change that to quickly split chunks for all newly assigned pages
    into that slab's freelist.
    
    Further testing is required to ensure such is possible without impacting
    performance.
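    A minimal manual session for the new command (class ids are illustrative;
    the reply strings are omitted since they change in the later "no same-class
    reassignment, better errors" commit above):

        # over the text protocol: move one page from slab class 1 to class 5
        slabs reassign 1 5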
Commits on Dec 16, 2011
  1. clean do_item_get logic a bit. fix race.

    dormando committed Dec 16, 2011
    the race here is absolutely insane:
    - do_item_get and do_item_alloc call at the same time, against different
      items
    - do_item_get wins the cache_lock race, returns the item for internal testing
    - do_item_alloc runs next, pulls item off of tail of a slab class which is the
      same item as do_item_get just got
    - do_item_alloc sees refcount == 0 since do_item_get incrs it at the bottom,
      and starts messing with the item
    - do_item_get runs its tests and maybe even refcount++'s and returns the item
    - evil shit happens.
    
    This race is much more likely to hit during the slab reallocation work, so I'm
    fixing it even though it's almost impossible to cause.
    
    Also cleaned up the logic so it's not testing the item for NULL more than
    once. Far fewer branches now, though I did not examine gcc's output to see if
    it is optimized differently.
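    One way the race closes, consistent with the description above (a
    simplified sketch from memory, not the diff; signature and surrounding
    checks abbreviated):

        item *do_item_get(const char *key, const size_t nkey) {
            item *it = assoc_find(key, nkey);
            if (it != NULL) {
                /* Take the reference immediately, still under cache_lock, so
                 * do_item_alloc can never observe refcount == 0 for an item
                 * another thread is about to hand out. */
                refcount_incr(&it->refcount);
                /* ... flush/expiry tests follow (single NULL test above),
                 * dropping the reference and returning NULL if dead ... */
            }
            return it;
        }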
Commits on Dec 15, 2011
  1. clean up the do_item_alloc logic

    dormando committed Dec 15, 2011
    Fix an unlikely bug where search == NULL and the first alloc fails, which then
    attempts to use search.
    
    Also reorders branches from most likely to least likely, and removes all
    redundant tests that I can see. No longer double checks things like refcount
    or exptime for the eviction case.
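    A sketch of the guard for the unlikely path described above (simplified;
    names follow memcached convention but this is not the diff):

        it = slabs_alloc(ntotal, id);
        if (it == NULL) {
            if (search == NULL) {
                /* LRU tail was empty, so there is nothing to evict or reuse;
                 * previously this path went on to dereference search. */
                return NULL;
            }
            /* ... otherwise evict/reclaim `search` and retry as before ... */
        }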
Commits on Dec 13, 2011
  1. shorten lock for item allocation more

    dormando committed Dec 13, 2011
    after pulling an item off of the LRU, there's no reason to hold the cache lock
    while we initialize a few values and memcpy some junk.
Commits on Nov 10, 2011
  1. disable issue 140's test.

    dormando committed Nov 9, 2011
    the fix for issue 140 only helped in the case of you poking at memcached with
    a handful of items (or this particular test). On real instances you could
    easily exhaust the 50 item search and still come up with a crap item.
    
    It was removed because adding the proper locks back in that place is
    difficult, and it makes "stats items" take longer in a gross lock anyway.
  2. Use a proper hash mask for item lock table

    dormando committed Oct 13, 2011
    Directly use the hash for accessing the table. Performance seems unchanged
    from before but this is more proper. It also scales the hash table a bit as
    worker threads are increased.
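    A sketch of the masking scheme (variable names are illustrative and the
    scaling-by-worker-threads detail is omitted):

        #include <stdint.h>
        #include <pthread.h>

        #define hashmask(n) (((uint32_t)1 << (n)) - 1)

        extern pthread_mutex_t *item_locks;        /* illustrative */
        extern unsigned int item_lock_hashpower;   /* illustrative */

        void item_lock(uint32_t hv) {
            /* Index the lock table directly with the item's hash value. */
            pthread_mutex_lock(&item_locks[hv & hashmask(item_lock_hashpower)]);
        }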
  3. push cache_lock deeper into item_alloc

    dormando committed Oct 3, 2011
    easy win without restructuring item_alloc more: push the lock down after it's
    done fiddling with snprintf.
  4. use item partitioned lock for as much as possible

    dormando committed Oct 3, 2011
    push cache_lock deeper into the abyss
  5. Remove the depth search from item_alloc

    dormando committed Oct 2, 2011
    Code checked 50 items before checking up to 50 more items to expire one, if
    none were expired. Given the shallow depth search (50) by any sizeable cache
    (as low as 1000 items, even), I believe that whole optimization was pointless.
    
    Flattening it to be a single test is shorter code and benches a bit faster as
    it holds the lock for less time.
    
    I may have made a mess of the logic, could be cleaned up a little.
  6. move hash calls outside of cache_lock

    dormando committed Oct 2, 2011
    been hard to measure while using the intel hash (since it's very fast), but
    should help with the software hash.
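    A sketch of the resulting call-site pattern (simplified; hv also feeds the
    partitioned item lock from the commits above):

        uint32_t hv = hash(key, nkey, 0);   /* computed before taking any lock */
        item_lock(hv);
        it = do_item_get(key, nkey);
        item_unlock(hv);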