
Comparing pebble and badger benchmarks #1779

Open
venkatsvpr opened this issue Jun 21, 2022 · 9 comments

venkatsvpr commented Jun 21, 2022

I am interested in selecting a key-value DB and ran this comparison to understand the performance differences.
I didn't expect such a big difference between pebble and badger. Am I missing something?

Link to the repo where I ran the benchmarks: https://github.com/venkatsvpr/pebble

Badger (metrics are not wired up completely)

```
Engine:badger Benchmarkycsb/A/values=1000  160284    16028.1 ops/sec  0 read  0 write  0.00 r-amp  0.00 w-amp
Engine:badger Benchmarkycsb/B/values=1000  707112    70709.1 ops/sec  0 read  0 write  0.00 r-amp  0.00 w-amp
Engine:badger Benchmarkycsb/C/values=1000  3447390  344717.4 ops/sec  0 read  0 write  0.00 r-amp  0.00 w-amp
Engine:badger Benchmarkycsb/D/values=1000  1916511  191648.8 ops/sec  0 read  0 write  0.00 r-amp  0.00 w-amp
```

Pebble

```
Engine:pebble Benchmarkycsb/A/values=1000  6820        681.9 ops/sec  0 read  13929144 write  6.45 r-amp  1.00 w-amp
Engine:pebble Benchmarkycsb/B/values=1000  66593      6658.7 ops/sec  0 read  13849539 write  6.42 r-amp  1.00 w-amp
Engine:pebble Benchmarkycsb/C/values=1000  3820737  382043.9 ops/sec  0 read  10377666 write  6.00 r-amp  1.00 w-amp
Engine:pebble Benchmarkycsb/D/values=1000  66757      6675.0 ops/sec  0 read  13926932 write  6.45 r-amp  1.00 w-amp
```

Thanks!

@nicktrav nicktrav added this to Incoming in Storage via automation Jun 22, 2022
@nicktrav (Contributor)

Hi @venkatsvpr - thanks for the report. The discrepancies are large enough to warrant some investigation on our end.

Are you able to provide specific instructions for how you ran your benchmarks? Type of machine, duration of the runs, type of block device, operating system and configuration, etc. Basically, enough for us to observe the same discrepancy.

@mwang1026 mwang1026 moved this from Incoming to To Do (investigations) in Storage Jun 22, 2022
@venkatsvpr (Author)

Hi @nicktrav,

I am running the benchmarks as follows:

```
./testbench bench ycsb ./dbfiles/ --engine <engine_name> --workload <workload_type>
```

The commands are here

I am running this in an ubuntu:18.04 container on WSL. My laptop has an SSD.

```
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              8
On-line CPU(s) list: 0-7
Thread(s) per core:  2
Core(s) per socket:  4
Socket(s):           1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               140
Model name:          11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz
Stepping:            1
CPU MHz:             2995.212
BogoMIPS:            5990.42
Virtualization:      VT-x
Hypervisor vendor:   Microsoft
Virtualization type: full
L1d cache:           48K
L1i cache:           32K
L2 cache:            1280K
L3 cache:            12288K
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology tsc_reliable nonstop_tsc cpuid pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx512vbmi umip avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid movdiri movdir64b fsrm avx512_vp2intersect flush_l1d arch_capabilities
```

```
root@e15e7b03dcbe:/stuff/GitDev/pebble/cmd/pebble# lsmem
RANGE                                  SIZE  STATE REMOVABLE  BLOCK
0x0000000000000000-0x00000000f7ffffff  3.9G online       yes   0-30
0x0000000100000000-0x00000005ffffffff   20G online       yes 32-191

Memory block size:       128M
Total online memory:    23.9G
Total offline memory:      0B

root@e15e7b03dcbe:/stuff/GitDev/pebble/cmd/pebble# free -m
              total        used        free      shared  buff/cache   available
Mem:          23893        2939       18089         367        2864       20200
Swap:          6144           0        6144
root@e15e7b03dcbe:/stuff/GitDev/pebble/cmd/pebble#
```

Happy to provide more info. Thanks!

@nicktrav (Contributor)

Thanks @venkatsvpr - we'll take a look.

@ivanjaros

following

@jbowens jbowens moved this from To Do (investigations) to Community in Storage Oct 17, 2022
@pkieltyka

Hi all -- I was wondering if there are any updates on this benchmark? thank you :)

@nicktrav (Contributor)

No updates, unfortunately; this got deprioritized.

Are you looking for anything in particular?


sean- commented Nov 12, 2022

The benchmark used here does not perform synchronous writes for badger [1-4], whereas the pebble test respects the CLI flag for disabling the WAL [5-9] (and all WAL writes in pebble are synchronous when a write batch is committed). Running both engines with the same WAL setting would be illustrative.

[1] master...venkatsvpr:pebble:master#diff-33ef32bf6c23acb95f5902d7097b7a1d5128ca061167ec0716715b0b9eeaa5f6R17
[2] https://github.com/outcaste-io/badger/tree/v3.2202.0
[3] https://github.com/outcaste-io/badger/blob/v3.2202.0/options.go#L135-L158
[4] master...venkatsvpr:pebble:master#diff-056c0493a5a390469c794bee4fe9075f5f800246e9855c44990e0aa24faaa68aR26-R30
[5]

pebble/cmd/pebble/db.go

Lines 53 to 59 in 9de3a89

```go
func newPebbleDB(dir string) DB {
	cache := pebble.NewCache(cacheSize)
	defer cache.Unref()
	opts := &pebble.Options{
		Cache:      cache,
		Comparer:   mvccComparer,
		DisableWAL: disableWAL,
```

[6]

pebble/commit.go

Lines 237 to 240 in 4a3adc3

```go
// Commit the specified batch, writing it to the WAL, optionally syncing the
// WAL, and applying the batch to the memtable. Upon successful return the
// batch's mutations will be visible for reading.
func (p *commitPipeline) Commit(b *Batch, syncWAL bool) error {
```

[7]

pebble/commit.go

Lines 347 to 364 in 4a3adc3

```go
func (p *commitPipeline) prepare(b *Batch, syncWAL bool) (*memTable, error) {
	n := uint64(b.Count())
	if n == invalidBatchCount {
		return nil, ErrInvalidBatch
	}
	count := 1
	if syncWAL {
		count++
	}
	// count represents the waiting needed for publish, and optionally the
	// waiting needed for the WAL sync.
	b.commit.Add(count)
	var syncWG *sync.WaitGroup
	var syncErr *error
	if syncWAL {
		syncWG, syncErr = &b.commit, &b.commitErr
	}
```

[8]

pebble/open.go

Lines 121 to 126 in 5ed983e

```go
d.commit = newCommitPipeline(commitEnv{
	logSeqNum:     &d.mu.versions.atomic.logSeqNum,
	visibleSeqNum: &d.mu.versions.atomic.visibleSeqNum,
	apply:         d.commitApply,
	write:         d.commitWrite,
})
```

[9]

pebble/db.go

Lines 819 to 841 in 181258e

```go
func (d *DB) commitWrite(b *Batch, syncWG *sync.WaitGroup, syncErr *error) (*memTable, error) {
	var size int64
	repr := b.Repr()
	if b.flushable != nil {
		// We have a large batch. Such batches are special in that they don't get
		// added to the memtable, and are instead inserted into the queue of
		// memtables. The call to makeRoomForWrite with this batch will force the
		// current memtable to be flushed. We want the large batch to be part of
		// the same log, so we add it to the WAL here, rather than after the call
		// to makeRoomForWrite().
		//
		// Set the sequence number since it was not set to the correct value earlier
		// (see comment in newFlushableBatch()).
		b.flushable.setSeqNum(b.SeqNum())
		if !d.opts.DisableWAL {
			var err error
			size, err = d.mu.log.SyncRecord(repr, syncWG, syncErr)
			if err != nil {
				panic(err)
			}
		}
	}
```

@cscetbon

What is surprising, then, is that the badger [3] and pebble [8] numbers are so close ... @venkatsvpr, can you run your benchmarks again making sure the WAL settings are identical, as advised by Sean?
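
For the rerun, a minimal sketch of what "identical WAL settings" could look like, assuming the badger v3 `WithSyncWrites` option and pebble's `DisableWAL` option plus `pebble.Sync` write option (both present in the versions linked above); this is an options fragment for illustration, not the harness's actual code:

```go
// badger: opt in to synchronous WAL writes (off by default).
bopts := badger.DefaultOptions("./dbfiles/badger").WithSyncWrites(true)
bdb, err := badger.Open(bopts)

// pebble: keep the WAL enabled and commit batches with pebble.Sync,
// which waits for the WAL fsync before returning.
popts := &pebble.Options{DisableWAL: false}
pdb, err := pebble.Open("./dbfiles/pebble", popts)
err = pdb.Apply(batch, pebble.Sync)
```

Alternatively, disabling durability on both sides (badger's default asynchronous writes, pebble with `DisableWAL: true` or `pebble.NoSync`) would also make the comparison apples-to-apples.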

@venkatsvpr (Author)

Thanks @sean-. Sure, let me give it a try and get back.

@nicktrav nicktrav removed their assignment Feb 10, 2023
@nicktrav nicktrav moved this from Community to Backlog in Storage Feb 10, 2023
Projects: Storage (Status: Backlog)
Development: No branches or pull requests
6 participants