Commits on Apr 25, 2015
  1. @dormando

    stop clang from whining about asserts

    dormando authored
    we now use up to exactly clsid 255, which is the maximum value of a byte, so
    the assertion can't fail.
  2. @dormando

    relax timing glitch in the lru maintainer test

    dormando authored
    This test requires that the juggler thread runs at all before the stats
    check happens. I've tried running this under an rPi1 and can't reproduce the
    race, but for some reason solaris amd64 does. This is likely due to the usleep
    not working as expected.
    
    Unfortunately I don't have direct access to a solaris host, so this is the
    best I can do for now. The juggler does eventually wake up so I'm unconcerned.
Commits on Apr 24, 2015
  1. @dormando

    fix major off by one issue

    dormando authored
    none of my machines could repro a crash, but it's definitely wrong :/ Very
    sad.
Commits on Apr 20, 2015
  1. @dormando

    don't overwrite stack during slab_automove

    dormando authored
    every time slab_automove would run it would segfault immediately, since the
    call out into items.c would overwrite its stack.
  2. @dormando

    fix off-by-one with slab management

    dormando authored
    data sticking into the highest slab class was unallocated. Thanks to pyry for
    the repro case:
    
    perl -e 'use Cache::Memcached;$memd = new Cache::Memcached {
    servers=>["127.0.0.1:11212"]};for(20..1000){print "$_\n";$memd->set("fo2$_",
    "a"x1024)};'
    (in a loop)
    with:
    ./memcached -v -m 32 -p 11212 -f 1.012
    
    This serves as a note to turn this into a test.
Commits on Feb 13, 2015
  1. @dormando

    Make LRU crawler work from maint thread.

    dormando authored
    Wasn't sending the condition signal after a refactor :(
    
    Also adds some stats to inspect how much work the LRU crawler is doing, and
    removes some printf noise for the LRU maintainer.
Commits on Feb 7, 2015
  1. @dormando

    basic lock around hash_items counter

    dormando authored
    could/should be an atomic. Previously all write mutations were wrapped with
    cache_lock, but that's not the case anymore. Just enforce consistency around
    the hash_items counter, which is used for hash table expansion.
  2. @dormando

    fix crawler/maintainer threads starting with -d

    dormando authored
    the fork is racy and the lru crawler or maintainer threads end up not
    starting with daemonization. So we start them post-fork now.
    
    Thanks pyry for the report!
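
The "basic lock around hash_items counter" commit above amounts to giving the counter its own mutex now that writes are no longer serialized by cache_lock. A minimal sketch of that idea follows. The function names and the 1.5x expansion threshold are assumptions made for illustration, not memcached's actual code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names: a counter guarded by its own small mutex, so
 * callers do not need a global cache lock to keep it consistent. */
static pthread_mutex_t hash_items_lock = PTHREAD_MUTEX_INITIALIZER;
static size_t hash_items = 0;

/* Record one inserted item and return the new count. */
size_t hash_items_incr(void) {
    pthread_mutex_lock(&hash_items_lock);
    size_t now = ++hash_items;
    pthread_mutex_unlock(&hash_items_lock);
    return now;
}

/* Record one removed item and return the new count. */
size_t hash_items_decr(void) {
    pthread_mutex_lock(&hash_items_lock);
    assert(hash_items > 0);
    size_t now = --hash_items;
    pthread_mutex_unlock(&hash_items_lock);
    return now;
}

/* Assumed policy for illustration: grow the table once the item
 * count exceeds 1.5x the number of buckets. */
bool hash_needs_expansion(size_t bucket_count) {
    pthread_mutex_lock(&hash_items_lock);
    bool grow = hash_items > (bucket_count * 3) / 2;
    pthread_mutex_unlock(&hash_items_lock);
    return grow;
}
```

As the commit message notes, an atomic counter could replace the mutex; the lock is simply the smallest change that restores consistency.
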
Commits on Jan 10, 2015
  1. @dormando

    spinlocks never seem to help in benchmarks

    dormando authored
    If a thread is allowed to go to sleep, it can be woken up early as soon as the
    lock is freed. If we spinlock, the scheduler can't help us and threads will
    randomly run out their timeslice until the thread actually holding the lock
    finishes its work.
    
    In my benchmarks killing the spinlock only makes things better.
  2. @dormando

    small crawler refactor

    dormando authored
    Separate the start function from what was string parsing and allow passing in
    the 'remaining' value as an argument.
    
    Also adds a (not yet configurable) setting for how many crawls to run per
    sleep, to raise the default aggressiveness of the crawler.
  3. @dormando

    update some comments

    dormando authored
    started to drift from reality over the patch series.
  4. @dormando
Commits on Jan 9, 2015
  1. @dormando
  2. @dormando

    fix refhang test.

    dormando authored
    The new code is a lot more efficient at unblocking LRU's, as it's able to
    unlink refcounted items. However it's less aggressive in these cases. You'll
    get one OOM per stuck item and then it'll be gone in most cases.
    
    Removed the bottom half of the test since it's too flaky, and the above case
    now looks for both OOM's and STORED's plus relevant counters.
  3. @dormando

    add `-o expirezero_does_not_evict` feature

    dormando authored
    When enabled, items with an expiration time of 0 are placed into a separate
    LRU and are not subject to evictions. This allows a mixed-mode instance where
    you can have a stronger "guarantee" (not a real guarantee) that items aren't
    removed from the cache due to low memory.
    
    This is a dangerous option, as mixing unevictable items has obvious
    repercussions.
  4. @dormando

    make HOT/WARM ratios start-time tunable.

    dormando authored
    runtime tunable is difficult and may require either atomics, or adding an
    extra items.c array. Adjusting the value would roll through and lock each
    LRU before changing the value.
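
The `-o expirezero_does_not_evict` commit above describes routing items with an expiration time of 0 into a separate LRU that eviction never touches. A rough sketch of that routing decision is below. The enum, the function names, and the rule that only COLD is evictable are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical sub-LRU identifiers; the real values may differ. */
enum lru_kind { LRU_HOT, LRU_WARM, LRU_COLD, LRU_NOEXP };

/* Pick the sub-LRU a newly stored item is linked into. With the
 * option enabled, items that never expire go to their own LRU. */
enum lru_kind lru_for_new_item(time_t exptime, bool expirezero_does_not_evict) {
    if (expirezero_does_not_evict && exptime == 0) {
        return LRU_NOEXP;
    }
    return LRU_HOT; /* new items start out in HOT */
}

/* In this sketch evictions are pulled from COLD only, so items
 * parked in NOEXP are never eviction candidates. */
bool lru_is_evictable(enum lru_kind kind) {
    assert(kind >= LRU_HOT && kind <= LRU_NOEXP);
    return kind == LRU_COLD;
}
```

This also shows why the option is dangerous: memory held by NOEXP items can only be freed by an explicit delete or overwrite.
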
Commits on Jan 8, 2015
  1. @dormando

    basic LRU maintainer tests.

    dormando authored
    this did actually discover the bug in the previous commit..
  2. @dormando

    fix bitshifting transposition

    dormando authored
    Fuck me this is embarrassing. I got it right once, then flipped them
    everywhere else. This is why you use defines for everything. :(
  3. @dormando

    cap aggressiveness of LRU maintainer

    dormando authored
    We can revisit, but the number of use cases with typical set loads above 1m
    items/sec is unknown to me.
  4. @dormando

    another lock fix for slab mover

    dormando authored
    wasn't holding LRU locks while unlinking an item. The options were either to
    never hold the slabs lock underneath the LRU locks, which is doable but
    annoying, or to drop the slabs lock for the unlink step. It's not very clear,
    but I think it's safe.
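
The "fix bitshifting transposition" commit above argues for using defines everywhere. The sketch below shows what that could look like, assuming the sub-LRU is packed into the top two bits of a one-byte class id and the slab class into the low six bits. That layout and every name here are illustrative assumptions, not a copy of the memcached source.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: [ 2 bits sub-LRU | 6 bits slab class ] in one byte. */
#define LRU_SHIFT       6
#define SLAB_CLSID_MASK ((1u << LRU_SHIFT) - 1) /* 0x3f */
#define LRU_MASK        (3u << LRU_SHIFT)       /* 0xc0 */

#define HOT_LRU   (0u << LRU_SHIFT)
#define WARM_LRU  (1u << LRU_SHIFT)
#define COLD_LRU  (2u << LRU_SHIFT)
#define NOEXP_LRU (3u << LRU_SHIFT)

/* Combine a slab class and a sub-LRU into one byte-sized id. */
uint8_t pack_clsid(uint8_t slab_clsid, unsigned lru) {
    assert(slab_clsid <= SLAB_CLSID_MASK);
    assert((lru & ~LRU_MASK) == 0);
    return (uint8_t)(slab_clsid | lru);
}

/* Recover the slab class by clearing the sub-LRU bits. */
uint8_t slab_clsid_of(uint8_t packed) {
    return (uint8_t)(packed & SLAB_CLSID_MASK);
}

/* Recover the sub-LRU by clearing the slab class bits. */
unsigned lru_of(uint8_t packed) {
    return packed & LRU_MASK;
}
```

Because the shift and masks are each written exactly once, a flipped operand cannot be repeated at every call site. Under this layout the largest packed id is 255, which still fits in a byte.
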
Commits on Jan 7, 2015
  1. @dormando

    compat mode.

    dormando authored
    Enabling the new LRU routine requires starting with `-o lru_maintainer`.
    
    This makes almost all of the tests pass, except refhang.t. Will need new tests
    for the LRU maintainer.
    
    So far as I can tell it's still handling the refhang scenario, but in a more
    natural way. Instead of flipping the items back to the top of the list, it's
    unlinking them from the hash table and LRU. This completely removes them from
    the problem, but it doesn't retry as many times to get them out of the way.
    
    A system with many stuck items next to each other could do a handful of OOM's
    before clearing the backlog, but it won't keep running into them. The test
    appears flaky even in 1.4.22; running with -vv causes it to fail in a funny
    way.
  2. @dormando

    make slab mover lock safe again.

    dormando authored
    Given mutex_locks act as memory barriers this should work.
    
    This does not yet fix being able to eject hot items from the fetch path.
  3. @dormando

    LRU maintainer thread now fires LRU crawler

    dormando authored
    ... if available. Very simple starter heuristic for how often to run the
    crawler.
    
    At this point, this patch series should have a significant impact on hit
    ratio.
Commits on Jan 5, 2015
  1. @dormando

    simple fix for LRU crawler

    dormando authored
    ends up parallel crawling the three sub-LRU's, but that's fine.
  2. @dormando

    fix a few bugs and add more stats

    dormando authored
    wasn't passing total_chunks into the bg thread anymore, which causes all items
    to flow to cold.
    
    also re-added ability to see hot/warm/cold counts. NOEXP is missing until
    that's implemented.
  3. @dormando

    fix itemstats to be combination of sub LRUs

    dormando authored
    easier to reason about, and more tests pass.
  4. @dormando

    direct reclaim mode for evictions

    dormando authored
    Only way to do eviction case fast enough is to inline it, sadly.
    This finally deletes the old item_alloc code now that I'm not intending on
    reusing it.
    
    Also removes the condition wakeup for the background thread. Instead runs on a
    timer, and meters its aggressiveness by how much shuffling is going on.
    
    Also fixes a segfault in lru_pull_tail(), which was unlinking `it` instead of
    `search`.
  5. @dormando

    reorg juggle routine, replace prints with stats

    dormando authored
    code is clearer, and able to react a bit faster to required evictions.
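
The "fix itemstats to be combination of sub LRUs" commit above reports one set of numbers per slab class by summing over its sub-LRUs. A small sketch of that aggregation follows. The struct fields and names are hypothetical, and only the three sub-LRUs that existed at this point in the series are modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sub-LRUs per slab class in this sketch (NOEXP is not included). */
enum { SUB_HOT, SUB_WARM, SUB_COLD, SUB_LRU_COUNT };

/* Hypothetical per-LRU counters. */
typedef struct {
    uint64_t items;
    uint64_t evicted;
    uint64_t reclaimed;
} lru_stats;

/* Sum the counters of every sub-LRU belonging to one slab class. */
lru_stats itemstats_combine(const lru_stats per_lru[SUB_LRU_COUNT]) {
    assert(per_lru != NULL);
    lru_stats total = {0, 0, 0};
    for (int i = 0; i < SUB_LRU_COUNT; i++) {
        total.items += per_lru[i].items;
        total.evicted += per_lru[i].evicted;
        total.reclaimed += per_lru[i].reclaimed;
    }
    return total;
}
```
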
Commits on Jan 4, 2015
  1. @dormando

    first pass at LRU maintainer thread

    dormando authored
    The basics work, but tests still do not pass.
    
    A background thread wakes up once per second, or when signaled. It is signaled
    if a slab class gets an allocation request and has fewer than N chunks free.
    
    The background thread shuffles LRU's: HOT, WARM, COLD. HOT is where new items
    exist. HOT and WARM flow into COLD. Active items in COLD flow back to WARM.
    Evictions are pulled from COLD.
    
    item_update's no longer do anything (and need to be fixed to tick it->time).
    Items are reshuffled within or around LRU's as they reach the bottom.
    
    Ratios of HOT/WARM memory are hardcoded, as are the low/high watermarks.
    Thread is not fast enough right now, sets cannot block on it.
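
The flow described in the commit above (new items in HOT, HOT and WARM overflowing into COLD, active COLD items returning to WARM, evictions from COLD) can be modeled in a few lines. This is a single-threaded sketch of the shuffling only, with no locks, timers, or signaling. The `active` flag, the count-based limits, and all names are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { HOT, WARM, COLD, LRU_COUNT };

typedef struct item {
    struct item *prev, *next;
    bool active; /* hypothetical flag: set when the item is fetched */
    int lru;     /* which sub-LRU currently holds the item */
} item;

typedef struct { item *head, *tail; size_t count; } lru_list;

static void lru_push_head(lru_list *l, item *it, int which) {
    it->prev = NULL;
    it->next = l->head;
    if (l->head) l->head->prev = it; else l->tail = it;
    l->head = it;
    it->lru = which;
    l->count++;
}

static item *lru_pop_tail(lru_list *l) {
    item *it = l->tail;
    if (it == NULL) return NULL;
    l->tail = it->prev;
    if (l->tail) l->tail->next = NULL; else l->head = NULL;
    it->prev = it->next = NULL;
    l->count--;
    return it;
}

/* One maintainer pass over the three sub-LRUs of a slab class. */
void lru_juggle(lru_list lrus[LRU_COUNT], size_t hot_limit, size_t warm_limit) {
    assert(lrus != NULL);
    item *it;
    /* HOT and WARM overflow into COLD. */
    while (lrus[HOT].count > hot_limit && (it = lru_pop_tail(&lrus[HOT])) != NULL)
        lru_push_head(&lrus[COLD], it, COLD);
    while (lrus[WARM].count > warm_limit && (it = lru_pop_tail(&lrus[WARM])) != NULL)
        lru_push_head(&lrus[COLD], it, COLD);
    /* Active items reaching the bottom of COLD flow back to WARM. */
    while ((it = lrus[COLD].tail) != NULL && it->active) {
        lru_pop_tail(&lrus[COLD]);
        it->active = false;
        lru_push_head(&lrus[WARM], it, WARM);
    }
}

/* Evictions are pulled from the tail of COLD; NULL if COLD is empty. */
item *lru_evict(lru_list lrus[LRU_COUNT]) {
    return lru_pop_tail(&lrus[COLD]);
}
```

Items are only examined when they reach the bottom of a list, which matches the note that item_update no longer reorders anything on fetch.
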
Commits on Jan 3, 2015
  1. @dormando

    Beginning work for LRU rework

    dormando authored
    Primarily splitting cache_lock into a lock-per LRU, and making the
    it->slab_clsid lookup indirect. cache_lock is now more or less gone.
    
    Stats are still wrong. They need to internally summarize over each
    sub-class.
Commits on Jan 2, 2015
  1. @dormando

    small fix for flush_all test

    dormando authored
  2. @dormando

    leave comment about stats cachedump locks

    dormando authored
    It's safe based on a technicality, which may not stay true for long.
    
    Same was true for stats sizes.
  3. @dormando

    flush_all was not thread safe.

    dormando authored
    Unfortunately if you disable CAS, all items set in the same second as a
    flush_all will immediately expire. This is the old (2006ish) behavior.
    
    However, if CAS is enabled (as is the default), it will still be more or less
    exact.
    
    The locking issue is that if the LRU lock is held, you may not be able to
    modify an item if the item lock is also held. This means that some items may
    not be flushed if locking is done correctly.
    
    In the current code, it could lead to corruption as an item could be locked
    and in use while the expunging is happening.
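
One way to get the flush_all behavior described in the commit above without walking and locking every item is to record a cutoff at flush time and test each item lazily when it is next touched. The sketch below illustrates that approach. The structure, the function names, and the exact cutoff rules are assumptions chosen to reproduce the described semantics, not the commit's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flush record. Both fields zero means "no flush yet". */
typedef struct {
    uint32_t oldest_live; /* items last set at or before this second are dead */
    uint64_t oldest_cas;  /* items with a smaller CAS id are dead; 0 = unused */
} flush_state;

/* Record a flush happening at second `now`. With CAS, the time cutoff
 * stops one second early and the CAS id decides within the current
 * second. Without CAS, the whole current second is flushed. */
flush_state flush_all_at(uint32_t now, bool use_cas, uint64_t next_cas) {
    flush_state st;
    assert(now > 0);
    if (use_cas) {
        st.oldest_live = now - 1;
        st.oldest_cas = next_cas;
    } else {
        st.oldest_live = now;
        st.oldest_cas = 0;
    }
    return st;
}

/* Lazy check, made while the caller already holds the item's lock. */
bool item_is_flushed(const flush_state *st, uint32_t item_time, uint64_t item_cas) {
    assert(st != NULL);
    if (st->oldest_live == 0 && st->oldest_cas == 0) return false;
    if (item_time <= st->oldest_live) return true;
    return st->oldest_cas != 0 && item_cas != 0 && item_cas < st->oldest_cas;
}
```

Under these assumed rules, an item set in the same second as the flush but after it survives only when CAS ids are available to order the two events.
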
Commits on Jan 1, 2015
  1. @dormando

    cache_lock refactoring

    dormando authored
    item_lock() now protects accesses to item structures. cache_lock is just for
    LRU and LRU stats. This patch removes cache_lock from a number of places it's
    no longer needed.
    
    Some pre-existing bugs became obvious: flush_all, cachedump, and slab
    reassignment's do_item_get short-circuit all need repairs.
  2. @menghan @dormando