Commits on Sep 28, 2016
  1. Add option for max file size. The current hard-coded value of 2M is i…

    …nefficient in colossus.
    
    -------------
    Created by MOE: https://github.com/google/moe
    MOE_MIGRATED_REVID=134391640
    corrado committed with cmumford Sep 27, 2016
Commits on Aug 11, 2016
  1. Increase leveldb version to 1.19.

    -------------
    Created by MOE: https://github.com/google/moe
    MOE_MIGRATED_REVID=129930720
    cmumford committed with cmumford Aug 11, 2016
Commits on Jul 6, 2016
  1. A zippy change broke test assumptions about the size of compressed ou…

    …tput.
    
    Fix the tests by allowing more slop in zippy's behavior.
    -------------
    Created by MOE: https://github.com/google/moe
    MOE_MIGRATED_REVID=123432472
    ghemawat committed with cmumford May 27, 2016
  2. fix problems in LevelDB's caching code

    Background:
    
    LevelDB uses a cache (util/cache.h, util/cache.cc) of (key,value)
    pairs for two purposes:
    - a cache of (table, file handle) pairs
    - a cache of blocks
    
    The cache places the (key,value) pairs in a reference-counted
    wrapper.  When it returns a value, it returns a reference to this
    wrapper.  When the client has finished using the reference and
    its enclosed (key,value), it calls Release() to decrement the
    reference count.
    
    Each (key,value) pair has an associated resource usage.  The
    cache maintains the sum of the usages of the elements it holds,
    and removes values as needed to keep the sum below a capacity
    threshold.  It maintains an LRU list so that it will remove the
    least-recently used elements first.
    
    The max_open_files option to LevelDB sets the size of the cache
    of (table, file handle) pairs.  The option is not used in any
    other way.
    
    The observed behaviour:
    
    If LevelDB at any time used more file handles concurrently than
    the cache size set via max_open_files, it attempted to reduce the
    number by evicting entries from the table cache.  This could
    happen most easily during compaction, and if max_open_files was
    low.  Because the handles were in use, their reference count did
    not drop to zero, and so the usage sum in the cache was not
    modified by the evictions.  Subsequent Insert() calls returned
    valid handles, but their entries were immediately evicted from
    the cache, which though empty still acted as though full.  As a
    result, there was effectively no caching, and the number of open
    file handles rose steadily until it hit system-imposed limits and
    the process died.
    
    If one set max_open_files lower, the cache was more likely to
    exhibit this behaviour, and cause the process to run out of file
    descriptors.  That is, max_open_files acted in almost exactly the
    opposite manner from what was intended.
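
The observed behaviour can be reproduced with a small model. This is a sketch of the old accounting as described above, not the original cache.cc; the names `OldEntry` and `OldCacheModel` are hypothetical, keys are assumed distinct, and the LRU reordering on Lookup() is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <map>
#include <string>

// Simplified model of the OLD accounting (not the original code): the usage
// sum shrinks only when an entry's reference count reaches zero.
struct OldEntry {
  std::string key;
  std::size_t charge;
  int refs;  // one reference for the cache plus one per client handle
};

class OldCacheModel {
 public:
  explicit OldCacheModel(std::size_t capacity)
      : capacity_(capacity), usage_(0) {}

  // Assumes every handle was released before the cache is destroyed.
  ~OldCacheModel() {
    for (OldEntry* e : lru_) delete e;
  }

  // Inserts a new entry (keys are assumed distinct) and returns a handle
  // that the caller must pass to Release().
  OldEntry* Insert(const std::string& key, std::size_t charge) {
    OldEntry* e = new OldEntry{key, charge, 2};
    lru_.push_back(e);
    table_[key] = e;
    usage_ += charge;
    // Evict oldest-first, even when clients still hold handles.
    while (usage_ > capacity_ && !lru_.empty()) {
      OldEntry* old = lru_.front();
      lru_.pop_front();
      table_.erase(old->key);
      Unref(old);  // usage_ drops only if this was the last reference
    }
    return e;
  }

  // Returns a handle, or nullptr on a miss.
  OldEntry* Lookup(const std::string& key) {
    auto it = table_.find(key);
    if (it == table_.end()) return nullptr;
    it->second->refs++;
    return it->second;
  }

  void Release(OldEntry* e) { Unref(e); }
  std::size_t usage() const { return usage_; }
  std::size_t cached_entries() const { return table_.size(); }

 private:
  void Unref(OldEntry* e) {
    if (--e->refs == 0) {
      usage_ -= e->charge;
      delete e;
    }
  }

  const std::size_t capacity_;
  std::size_t usage_;
  std::list<OldEntry*> lru_;
  std::map<std::string, OldEntry*> table_;
};
```

With a capacity of 2 and three handles held at once, the model ends up holding no entries while still reporting a usage of 3, so any later insert is evicted immediately.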
    
    The problems:
    
    1. The cache kept all elements on its LRU list eligible for capacity
       eviction---even those with outstanding references from clients.  This was
       ineffective in reducing resource consumption because there was an
       outstanding reference, guaranteeing that the items remained.  A secondary
       issue was that there is no guarantee that these in-use items will be the
       last things reached in the LRU chain, which actually recorded
       "least-recently requested" rather than "least-recently used".
    
    2. The sum of usages was decremented not when a (key,value) was evicted from
       the cache, but when its reference count went to zero.  Thus, when things
       were removed from the cache, either by garbage collection or via Erase(),
       the usage sum was not necessarily decreased.  This allowed the cache to act
       as though full when it was in fact not, reducing caching effectiveness, and
       leading to more resources being consumed---the opposite of what the
       evictions were intended to achieve.
    
    3. (minor) The cache's clients insert items into it by first looking up the
       key, and inserting only if no value is found.  Although the cache has an
       internal lock, the clients use no locking to ensure atomicity of the
       Lookup/Insert pair.  (see table/table.cc:  block_cache->Insert() and
       db/table_cache.cc:  cache_->Insert()).  Thus, if two threads Insert() at
       about the same time, they can both Lookup(), find nothing, and both
       Insert().  The second Insert() would evict the first value, leaving each
       thread with a handle on its own version of the data, and with the second
       version in the cache.  It would be better if both threads ended up with a
       handle on the same (key,value) pair, which implies it must be the first item
       inserted.  This suggests that Insert() should not replace an existing value.
    
       This can be made safe with current usage inside LevelDB itself, but it is
       not easy to change, for three reasons.  First, Cache is a public interface,
       so changing the semantics of an existing call might break things.  Second,
       Cache is an abstract virtual class, so adding a new abstract virtual method
       may break other implementations.  Third, the new method "insert without
       replacing" cannot be implemented in terms of the existing methods, so it
       cannot be given a non-abstract default.  Fortunately, the effects of this
       issue are minor, so it is not fixed by this change.
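
The "insert without replacing" semantics from problem 3 might look like the following sketch. It is not part of LevelDB's Cache interface; `FirstWinsCache` and `InsertIfAbsent` are hypothetical names, and `std::shared_ptr` stands in for the cache's reference-counted handles.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical cache in which the first value inserted under a key wins.
template <typename V>
class FirstWinsCache {
 public:
  // Returns the value already stored under `key` if there is one; otherwise
  // stores `candidate` and returns it. The check and the insertion happen
  // under one lock, so racing callers all receive the same value.
  std::shared_ptr<V> InsertIfAbsent(const std::string& key,
                                    std::shared_ptr<V> candidate) {
    std::lock_guard<std::mutex> lock(mu_);
    auto result = map_.emplace(key, std::move(candidate));
    return result.first->second;
  }

 private:
  std::mutex mu_;
  std::unordered_map<std::string, std::shared_ptr<V>> map_;
};
```

Two threads that both miss on a lookup and then call `InsertIfAbsent` would end up sharing the first inserted value, and the loser's copy is dropped when its last reference goes away.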
    
    The changes:
    
    The assumption in the fixes is that it is always better to cache
    entries unless removal from the cache would lead to deallocation.
    
    Cache entries now have an "in_cache" boolean indicating whether
    the cache has a reference on the entry.  The only ways that this can
    become false without the entry being passed to its "deleter" are via
    Erase(), via Insert() when an element with a duplicate key is inserted,
    or on destruction of the cache.
    
    The cache now keeps two linked lists instead of one.  All items
    in the cache are in one list or the other, and never both.  Items
    still referenced by clients but erased from the cache are in
    neither list.  The lists are:
    - in-use:  contains the items currently referenced by clients, in no particular
      order.  (This list is used for invariant checking.  If we removed the check,
      elements that would otherwise be on this list could be left as disconnected
      singleton lists.)
    - LRU:  contains the items not currently referenced by clients, in LRU order
    
    A new internal Ref() method increments the reference count.  If
    incrementing from 1 to 2 for an item in the cache, it is moved
    from the LRU list to the in-use list.
    
    The Unref() call now moves things from the in-use list to the LRU
    list if the reference count falls to 1, and the item is in the
    cache.  It no longer adjusts the usage sum.  The usage sum now
    reflects only what is in the cache, rather than including
    still-referenced items that have been evicted.
    
    The LRU_Append() now takes a "list" parameter so that it can be
    used to append either to the LRU list or the in-use list.
    
    Lookup() is modified to use the new Ref() call, rather than
    adjusting the reference count and LRU chain directly.
    
    Insert() eviction code is also modified to adjust the usage sum and the
    in_cache boolean of the evicted elements.  Some LevelDB tests assume that there
    will be no caching whatsoever if the cache size is set to zero, so this is
    handled as a special case.
    
    A new private method FinishErase() factors out the code common to
    the places where items are removed from the cache.
    
    Erase() is modified to adjust the usage sum and the in_cache
    boolean of the erased elements, and to use FinishErase().
    
    Prune() is modified to use FinishErase() also, and to make use of the fact that
    the lru_ list now contains only items with reference count 1.
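
The changes described above can be sketched in a compact, single-threaded form. This is an illustration under stated assumptions, not the code in util/cache.cc: `Entry` and `TwoListCache` are hypothetical names, a `std::unordered_map` replaces the cache's own hash table, there is no locking or deleter callback, and the zero-capacity special case is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

struct Entry {
  std::string key;
  std::size_t charge = 0;
  int refs = 0;           // cache's reference (if in_cache) plus client handles
  bool in_cache = false;  // true while the cache itself holds a reference
  Entry* prev = this;     // circular doubly linked list links
  Entry* next = this;
};

class TwoListCache {
 public:
  explicit TwoListCache(std::size_t capacity) : capacity_(capacity) {}
  TwoListCache(const TwoListCache&) = delete;
  TwoListCache& operator=(const TwoListCache&) = delete;

  ~TwoListCache() {
    assert(in_use_.next == &in_use_);  // callers must release handles first
    while (lru_.next != &lru_) FinishErase(lru_.next);
  }

  Entry* Insert(const std::string& key, std::size_t charge) {
    Entry* e = new Entry;
    e->key = key;
    e->charge = charge;
    e->refs = 2;  // one for the cache, one for the returned handle
    e->in_cache = true;
    Append(&in_use_, e);
    usage_ += charge;
    Entry*& slot = table_[key];
    if (slot != nullptr) FinishErase(slot);  // duplicate key replaces old entry
    slot = e;
    // Only entries on the LRU list (no client handles) can be evicted.
    while (usage_ > capacity_ && lru_.next != &lru_) {
      Entry* old = lru_.next;
      table_.erase(old->key);
      FinishErase(old);
    }
    return e;
  }

  Entry* Lookup(const std::string& key) {
    auto it = table_.find(key);
    if (it == table_.end()) return nullptr;
    Ref(it->second);
    return it->second;
  }

  void Release(Entry* e) { Unref(e); }

  void Erase(const std::string& key) {
    auto it = table_.find(key);
    if (it == table_.end()) return;
    Entry* e = it->second;
    table_.erase(it);
    FinishErase(e);
  }

  std::size_t usage() const { return usage_; }

 private:
  static void Remove(Entry* e) {
    e->next->prev = e->prev;
    e->prev->next = e->next;
  }
  static void Append(Entry* list, Entry* e) {  // newest goes before the head
    e->next = list;
    e->prev = list->prev;
    e->prev->next = e;
    e->next->prev = e;
  }
  void Ref(Entry* e) {
    if (e->refs == 1 && e->in_cache) {  // LRU list -> in-use list
      Remove(e);
      Append(&in_use_, e);
    }
    e->refs++;
  }
  void Unref(Entry* e) {
    assert(e->refs > 0);
    e->refs--;
    if (e->refs == 0) {
      assert(!e->in_cache);
      delete e;
    } else if (e->in_cache && e->refs == 1) {  // in-use list -> LRU list
      Remove(e);
      Append(&lru_, e);
    }
  }
  // Shared removal path: the usage sum is adjusted here, not in Unref().
  void FinishErase(Entry* e) {
    assert(e->in_cache);
    Remove(e);
    e->in_cache = false;
    usage_ -= e->charge;
    Unref(e);
  }

  const std::size_t capacity_;
  std::size_t usage_ = 0;
  Entry lru_;     // dummy head; lru_.next is the oldest unreferenced entry
  Entry in_use_;  // dummy head; entries with outstanding client handles
  std::unordered_map<std::string, Entry*> table_;
};
```

In this sketch an entry with an outstanding handle is never evicted, so the cache may temporarily exceed its capacity; once handles are released, the next Insert() evicts from the LRU list and the usage sum falls accordingly.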
    
    - EvictionPolicy is modified to test that an entry with an
    outstanding handle is not evicted.  This test fails with the old cache.cc.
    
    - A new test case UseExceedsCacheSize verifies that even when the
    cache is overfull of entries with outstanding handles, none are
    evicted.  This test fails with the old cache.cc, and is the key
    issue that causes file descriptors to run out when the cache
    size is set too small.
    -------------
    Created by MOE: https://github.com/google/moe
    MOE_MIGRATED_REVID=123247237
    m3bm3b committed with cmumford May 25, 2016
Commits on Apr 15, 2016
  1. Fix LevelDB build when asserts are enabled in release builds. (#367)

    * Fix LevelDB build when asserts are enabled in release builds.
    
    BUG=https://bugs.chromium.org/p/chromium/issues/detail?id=603166
    
    * fix
    
    * Add comment
    jabdelmalek committed with cmumford Apr 15, 2016
Commits on Apr 12, 2016
  1. Change std::uint64_t to uint64_t (#354)

    -This fixes compile errors with default setup on RHEL 6 systems.
    nwestlake committed with cmumford Apr 12, 2016
Commits on Mar 31, 2016
  1. This CL fixes a bug encountered when reading records from leveldb fil…

    …es that have been split, as in a [] input task split.
    
    Detailed description:
    
    Suppose an input split is generated between two leveldb record blocks and the preceding block ends with null padding.
    
    A reader that previously read at least 1 record within the first block (before encountering the padding) will, upon trying to read the next record, successfully and correctly read the next logical record from the subsequent block, but will return a last record offset pointing to the padding in the first block.
    
    When this happened in a [], it resulted in duplicate records being handled at what appeared to be different offsets that were separated by only a few bytes.
    
    This behavior is only observed when at least 1 record was read from the first block before encountering the padding. If the initial offset for a reader was within the padding, the correct record offset would be reported, namely the offset within the second block.
    
    The tests failed to catch this scenario/bug, because each read test only read a single record with an initial offset. This CL adds an explicit test case for this scenario, and modifies the test structure to read all remaining records in the test case after an initial offset is specified.  Thus an initial offset that jumps to record #3, with 5 total records in the test file, will result in reading 2 records, and validating the offset of each of them in order to pass successfully.
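
The underlying principle, that a record's reported offset must be taken after any padding has been skipped, can be shown with a small sketch. It assumes the documented LevelDB log layout of 32768-byte blocks with 7-byte record headers and models only the short zero trailer at the end of a block, which may not be the exact padding case this CL addressed; `RecordStartOffset` is a hypothetical helper, not a function in the log reader.

```cpp
#include <cassert>
#include <cstdint>

// Assumed constants from the LevelDB log format description.
constexpr std::uint64_t kBlockSize = 32768;
constexpr std::uint64_t kHeaderSize = 7;

// Given the offset at which a reader is positioned, returns the offset at
// which the next physical record can actually start. If fewer than
// kHeaderSize bytes remain in the current block, those bytes are padding
// and the record starts at the beginning of the next block.
std::uint64_t RecordStartOffset(std::uint64_t offset) {
  const std::uint64_t left_in_block = kBlockSize - (offset % kBlockSize);
  if (left_in_block < kHeaderSize) {
    return offset + left_in_block;  // skip the padding
  }
  return offset;
}
```

Reporting the raw position instead of the value returned here would give an offset inside the padding, which is the kind of mismatch described above.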
    -------------
    Created by MOE: https://github.com/google/moe
    MOE_MIGRATED_REVID=115338487
    mikewiacek committed with cmumford Feb 23, 2016
  2. Deleted redundant null ptr check prior to delete.

    Fixes issue #338.
    -------------
    Created by MOE: https://github.com/google/moe
    MOE_MIGRATED_REVID=113439460
    cmumford committed with cmumford Jan 30, 2016
Commits on Feb 24, 2016
  1. Merge pull request #348 from randomascii/master

    Fix signed/unsigned mismatch on VC++ builds
    cmumford committed Feb 24, 2016
Commits on Feb 19, 2016
Commits on Jan 30, 2016
  1. Putting build artifacts in subdirectory.

    1. Object files, libraries, and compiled executables are put
       into subdirectories.
    2. The shared library is linked from individual object files.
       This provides for greater parallelism on large desktops
       while at the same time making for easier builds on small
       (i.e. embedded) systems. Fixes issue #279.
    3. One program, db_bench, is compiled using the shared library.
    4. The source file for "leveldbutil" was renamed from
       leveldb_main.cc to leveldbutil.cc. This provides for simpler
       makefile rules.
    5. Because all targets placed the library (libleveldb.a) at the top
       level, the last platform built (desktop/device) always overwrote
       any prior artifact.
    -------------
    Created by MOE: https://github.com/google/moe
    MOE_MIGRATED_REVID=113407013
    cmumford committed with cmumford Jan 29, 2016
Commits on Jan 15, 2016
  1. Merge pull request #329 from ralphtheninja/travis-badge

    Add travis build badge to README
    cmumford committed Jan 15, 2016
  2. add travis build badge

    ralphtheninja committed Jan 15, 2016
  3. Merge pull request #328 from cmumford/master

    Added a Travis CI build file.
    cmumford committed Jan 15, 2016
  4. Added a Travis CI build file.

    This allows for continuous integration builds by travis-ci.org.
    More information at https://docs.travis-ci.com/user/languages/cpp
    cmumford committed Jan 15, 2016
Commits on Jan 12, 2016
  1. Merge pull request #284 from ideawu/master

    log compaction output file's level along with number
    cmumford committed Jan 12, 2016
  2. Merge pull request #317 from falvojr/patch-1

    Update README.md
    cmumford committed Jan 12, 2016
  3. Merge pull request #272 from vapier/master

    Fix Android/MIPS build.
    cmumford committed Jan 12, 2016
Commits on Jan 4, 2016
  1. Added a contributors section to README.md

    In preparation for accepting GitHub pull requests this new README
    section outlines the general criteria that the leveldb project owners
    will use when accepting external (and internal) project contributions.
    -------------
    Created by MOE: https://github.com/google/moe
    MOE_MIGRATED_REVID=111349899
    cmumford committed with cmumford Jan 4, 2016
Commits on Dec 9, 2015
  1. Merge pull request #275 from paulirish/patch-1

    readme: improved documentation link
    cmumford committed Dec 9, 2015
  2. Resolve race when getting approximate-memory-usage property

    The write operations in the table happen without holding the mutex
    lock, but concurrent writes are avoided using the "writers_" queue.
    Arena::MemoryUsage could access the blocks while a write happens.
    So the memory usage is cached in an atomic word and can be loaded
    from any thread safely.
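
A minimal sketch of the idea follows. It uses `std::atomic` in place of LevelDB's own atomic word type, and `ArenaSketch` and `AllocateBlock` are hypothetical names; the assumption is a single writer at a time with any number of concurrent readers of the counter.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

class ArenaSketch {
 public:
  // Called only by the single active writer. The block list itself is not
  // thread-safe, so readers never touch it.
  char* AllocateBlock(std::size_t bytes) {
    blocks_.emplace_back(new char[bytes]);
    // Publish the new total; only the writer modifies the counter, so a
    // relaxed load followed by a relaxed store is enough here.
    memory_usage_.store(
        memory_usage_.load(std::memory_order_relaxed) + bytes + sizeof(char*),
        std::memory_order_relaxed);
    return blocks_.back().get();
  }

  // Safe to call from any thread: reads only the cached atomic total.
  std::size_t MemoryUsage() const {
    return memory_usage_.load(std::memory_order_relaxed);
  }

 private:
  std::vector<std::unique_ptr<char[]>> blocks_;
  std::atomic<std::size_t> memory_usage_{0};
};
```

The reader may see a slightly stale total, which is acceptable for a property that is only approximate.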
    -------------
    Created by MOE: https://github.com/google/moe
    MOE_MIGRATED_REVID=107573379
    ssiddhartha committed with cmumford Nov 11, 2015
  3. Only compiling TrimSpace on linux.

    Incorporated change by zmodem at google#310
    to fix issue #310.
    
    This change builds TrimSpace only on Linux to avoid an unused-function
    warning/error.
    -------------
    Created by MOE: https://github.com/google/moe
    MOE_MIGRATED_REVID=105323419
    cmumford committed with cmumford Jul 22, 2015
  4. Including atomic_pointer.h in port_posix

    A recent CL (104348226) created the port_posix library, but omitted: port/atomic_pointer.h.
    
    And when:
    
        [] test third_party/leveldb:all
    
    was run this error was reported:
    
        //third_party/leveldb:port_posix does not depend on a
        module exporting 'third_party/leveldb/port/atomic_pointer.h'
    -------------
    Created by MOE: https://github.com/google/moe
    MOE_MIGRATED_REVID=105243399
    cmumford committed with cmumford Oct 12, 2015
  5. Let LevelDB use xcrun to determine Xcode.app path instead of using a …

    …hardcoded path.
    
    This allows build agents to select from multiple Xcode installations.
    -------------
    Created by MOE: https://github.com/google/moe
    MOE_MIGRATED_REVID=104859097
    maplemuse committed with cmumford Oct 7, 2015
  6. Add "approximate-memory-usage" property to leveldb::DB::GetProperty

    The approximate RAM usage of the database is calculated from the memory
    allocated for write buffers and the block cache. This is to give an
    estimate of memory usage to leveldb clients.
    -------------
    Created by MOE: https://github.com/google/moe
    MOE_MIGRATED_REVID=104222307
    ssiddhartha committed with cmumford Sep 29, 2015
  7. Add leveldb::Cache::Prune

    Prune() drops the in-memory read cache of the database, so that the client can
    relieve its memory shortage.
    -------------
    Created by MOE: https://github.com/google/moe
    MOE_MIGRATED_REVID=101335710
    tzik committed with cmumford Aug 24, 2015
  8. Fix size_t/int comparison/conversion issues in leveldb.

    The create function took |num_keys| as an int, but callers and implementers wanted it to function as a size_t (e.g. passing std::vector::size() in, passing it to vector constructors as a size arg, indexing containers by it, etc.).  This resulted in implicit conversions between the two types as well as warnings (found with Chromium's external copy of these sources, built with MSVC) about signed vs. unsigned comparisons.
    
    The leveldb sources were already widely using size_t elsewhere, e.g. for key and filter lengths, so using size_t here is not inconsistent with the existing code.  However, it does change the public C API.
    -------------
    Created by MOE: https://github.com/google/moe
    MOE_MIGRATED_REVID=101074871
    pkasting committed with cmumford Aug 19, 2015
  9. Added leveldb::Status::IsInvalidArgument() method.

    All other Status::Code enum values have an Is**() method with the one
    exception of InvalidArgument.
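
The pattern of one predicate per code can be sketched as below. `StatusSketch` is a simplified stand-in and not leveldb::Status itself; only a few of the factory functions and predicates are shown.

```cpp
#include <cassert>
#include <string>
#include <utility>

class StatusSketch {
 private:
  enum Code {
    kOk,
    kNotFound,
    kCorruption,
    kNotSupported,
    kInvalidArgument,
    kIOError
  };

 public:
  StatusSketch() : code_(kOk) {}

  static StatusSketch OK() { return StatusSketch(); }
  static StatusSketch NotFound(std::string msg) {
    return StatusSketch(kNotFound, std::move(msg));
  }
  static StatusSketch InvalidArgument(std::string msg) {
    return StatusSketch(kInvalidArgument, std::move(msg));
  }

  bool ok() const { return code_ == kOk; }
  bool IsNotFound() const { return code_ == kNotFound; }
  // The predicate that was previously missing for this code.
  bool IsInvalidArgument() const { return code_ == kInvalidArgument; }

  const std::string& message() const { return message_; }

 private:
  StatusSketch(Code code, std::string msg)
      : code_(code), message_(std::move(msg)) {}

  Code code_;
  std::string message_;
};
```

Callers can then branch on `IsInvalidArgument()` without parsing the message text.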
    -------------
    Created by MOE: https://github.com/google/moe
    MOE_MIGRATED_REVID=97166441
    cmumford committed with cmumford Jun 29, 2015
  10. Suppress error reporting after seeking but before a valid First or Fu…

    …ll record is encountered.
    
    Fix a spelling mistake.
    mikewiacek committed with cmumford Aug 11, 2015
  11. include <assert> -> <cassert>

    Fixes reported public issue #280.
    cmumford committed Aug 11, 2015
Commits on Nov 23, 2015
  1. Update README.md

    falvojr committed Nov 23, 2015
Commits on Aug 11, 2015
  1. Will not reuse manifest if reuse_logs option is false.

    Prior implementation would always try to reuse the manifest, even if reuse_logs
    was false (the default). This was missed because the stock
    Env::NewAppendableFile implementation returns false forcing the creation of a
    new log.
    cmumford committed Jun 17, 2015
  2. LevelDB now attempts to reuse the preceding MANIFEST and log file whe…

    …n re-opened.
    
    (Based on a suggestion by cmumford.)
    
    "open" benchmark on my workstation speeds up significantly since we
    can now avoid three fdatasync calls and a compaction per open:
    
      Before: ~80000 microseconds
      After:    ~130 microseconds
    
    Details:
    
    (1) Added Options::reuse_logs (currently defaults to false) to control
    new behavior.  The intention is to change the default to true after some
    baking.
    
    (2) Added Env::NewAppendableFile() whose default implementation returns
    a not-supported error.
    
    (3) VersionSet::Recovery attempts to reuse the MANIFEST from which
    it is recovering.
    
    (4) DBImpl recovery attempts to reuse the last log file and memtable.
    
    (5) db_test.cc now tests a new configuration that sets reuse_logs to true.
    
    (6) fault_injection_test also tests a reuse_logs==true config.
    
    (7) Added a new recovery_test.
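
The interaction between the reuse_logs option and an Env whose NewAppendableFile() is unsupported can be sketched as follows. All types here are hypothetical simplifications: `EnvSketch`, `FileSketch`, and `OpenLog` are not LevelDB names, and a `bool` return stands in for a status object.

```cpp
#include <cassert>
#include <string>

struct FileSketch {
  std::string name;
  bool appended_to_existing;
};

class EnvSketch {
 public:
  virtual ~EnvSketch() {}

  // Always creates a fresh file in this sketch.
  virtual bool NewWritableFile(const std::string& name, FileSketch* out) {
    *out = FileSketch{name, false};
    return true;
  }

  // Default implementation: appending is not supported.
  virtual bool NewAppendableFile(const std::string& name, FileSketch* out) {
    (void)name;
    (void)out;
    return false;
  }
};

// An environment that opts in to appending.
class AppendingEnvSketch : public EnvSketch {
 public:
  bool NewAppendableFile(const std::string& name, FileSketch* out) override {
    *out = FileSketch{name, true};
    return true;
  }
};

// Returns true if the old log was reused. Reuse is attempted only when the
// option is set, and falls back to a new file when the Env cannot append.
bool OpenLog(EnvSketch* env, bool reuse_logs, const std::string& old_log,
             const std::string& new_log, FileSketch* out) {
  if (reuse_logs && env->NewAppendableFile(old_log, out)) {
    return true;
  }
  env->NewWritableFile(new_log, out);
  return false;
}
```

Checking the option before calling NewAppendableFile() matters: an Env that does support appending would otherwise reuse the file even when reuse is disabled.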
    ghemawat committed with cmumford Dec 11, 2014
Commits on Apr 20, 2015
  1. fix indent

    ideawu committed Apr 20, 2015