
OOM while traversing large database. #1036

Open

winstonewert opened this issue Apr 18, 2020 · 6 comments

@winstonewert
sled 0.31.0
rustc 1.42.0 (b8cedc004 2020-03-09)
Ubuntu 19.10
Code:

fn main() {
    dotenv::dotenv().ok();
    pretty_env_logger::init();

    let db = sled::Config::new()
        .path("/mount/large/server/sled")
        .use_compression(true)
        // cache capacity set very low to try to keep memory usage down
        .cache_capacity(256)
        .open()
        .unwrap();

    let wikidata = db.open_tree("wikidata").unwrap();

    // len() performs a full scan of the tree
    wikidata.len();
    wikidata.len();
}

Expected outcome: Uses a small amount of memory
Actual outcome: Uses several gigabytes of memory and gets killed.

The database is 4.7 GB

Some debug logs: https://pastebin.com/e49teW5m
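
For reference, sled's Tree::len() performs a full scan under the hood, so the repro above amounts to a complete traversal of the tree. A minimal sketch of the equivalent explicit iteration (the count_entries helper is hypothetical, just for illustration):

fn count_entries(tree: &sled::Tree) -> sled::Result<usize> {
    // Each step of iter() yields a Result<(IVec, IVec)>; the whole tree is read.
    let mut count = 0;
    for item in tree.iter() {
        let (_key, _value) = item?;
        count += 1;
    }
    Ok(count)
}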

@tokahuke

This also causes an OOM kill (before the disk is exhausted) if left running long enough.

extern crate bincode;

fn main() -> sled::Result<()> {
    let db = sled::open("db")?;
    let t1 = db.open_tree("t1")?;
    let t2 = db.open_tree("t2")?;

    let mut bincode = bincode::config();
    bincode.big_endian();
    
    // This approximates somewhat my workload...
    let infinite_data = (0u64..).flat_map(|i| (0u64..100).map(move |j| (i, j)));
    
    for (i, j) in infinite_data {
        let key1 = bincode.serialize(&(i, j)).expect("can serialize");
        let key2 = bincode.serialize(&(j, i)).expect("can serialize");
        
        t1.insert(key1, vec![])?;
        t2.insert(key2, vec![])?;
    }

    Ok(())
}

Using sled 0.31 (and also latest master) on Ubuntu 18 and Rust 1.45 (nightly 2020-05-12). C'mon, man! I can try to live with sled using tons of space, but this is a deal breaker in my case (and basically any big-data use case). Hope it's easy to fix, though.

@tokahuke

Yo, found a mitigation which might also help solve the problem... I found this comment here:

cache_capacity is currently a bit messed up as it uses the on-disk size of things instead of the larger in-memory representation. So, 1gb is not actually the default memory usage, it's the amount of disk space that items loaded in memory will take, which will result in a lot more space in memory being used, at least for smaller keys and values. So, play around with setting it to a much smaller value.

#986 (comment)

This led me to fiddle with the cache_capacity knob, setting it to 100_000 instead of the default 1_000_000_000 (a sketch of that configuration follows the list below). What I found (qualitatively):

  • Memory still seems to balloon out of control, but at a much slower rate.
  • Disk write goes down; db gets slower.
  • Still, it seems I was able to write roughly the same amount of data to disk, about 10 GB (OK, I know... it's not that simple), while using less memory overall.
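
A minimal sketch of that mitigation, assuming a workload like the one above; the path and the exact capacity value are illustrative, not from the original comment:

fn main() -> sled::Result<()> {
    // Assumption: cache_capacity is accounted in on-disk bytes, so a much
    // smaller value than the 1_000_000_000 default keeps the in-memory
    // footprint down (at the cost of throughput).
    let db = sled::Config::new()
        .path("db")
        .cache_capacity(100_000)
        .open()?;

    let t1 = db.open_tree("t1")?;
    t1.insert(b"key", vec![0u8])?;

    Ok(())
}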

@winstonewert
Author

As you'll see in my code, I set the cache capacity as low as I could, and it didn't resolve my problem. So I wonder if we are hitting different issues.

@tokahuke

Good question... I was reluctant to open a new issue, though. For context, my issue is loading a big dataset, so it's a write problem, not a read problem. @spacejam has been in contact and told me it seems to be a known issue.

@spacejam
Owner

spacejam commented Jun 4, 2020

I'm currently looking into this approach for handling this issue: #1093

@winstonewert
Author

@spacejam That ticket references many inserts, which wasn't my issue. My issue is simply traversing a large database. They could be related issues for all I know, but I wanted to make sure.
