
Comparing pebble and badger benchmarks #1779

Open
venkatsvpr opened this issue Jun 21, 2022 · 9 comments

venkatsvpr commented Jun 21, 2022

I am interested in selecting a key-value DB and ran this comparison to understand the performance differences.
I didn't expect such a big difference between pebble and badger. Am I missing something?

Link to the repo where I ran the benchmarks: https://github.com/venkatsvpr/pebble

Badger (metrics are not wired up completely)

```
Engine:badger Benchmarkycsb/A/values=1000  160284    16028.1 ops/sec  0 read  0 write  0.00 r-amp  0.00 w-amp
Engine:badger Benchmarkycsb/B/values=1000  707112    70709.1 ops/sec  0 read  0 write  0.00 r-amp  0.00 w-amp
Engine:badger Benchmarkycsb/C/values=1000  3447390  344717.4 ops/sec  0 read  0 write  0.00 r-amp  0.00 w-amp
Engine:badger Benchmarkycsb/D/values=1000  1916511  191648.8 ops/sec  0 read  0 write  0.00 r-amp  0.00 w-amp
```

Pebble

```
Engine:pebble Benchmarkycsb/A/values=1000  6820        681.9 ops/sec  0 read  13929144 write  6.45 r-amp  1.00 w-amp
Engine:pebble Benchmarkycsb/B/values=1000  66593      6658.7 ops/sec  0 read  13849539 write  6.42 r-amp  1.00 w-amp
Engine:pebble Benchmarkycsb/C/values=1000  3820737  382043.9 ops/sec  0 read  10377666 write  6.00 r-amp  1.00 w-amp
Engine:pebble Benchmarkycsb/D/values=1000  66757      6675.0 ops/sec  0 read  13926932 write  6.45 r-amp  1.00 w-amp
```

Thanks!

@nicktrav nicktrav added this to Incoming in Storage via automation Jun 22, 2022
@nicktrav (Contributor)

Hi @venkatsvpr - thanks for the report. The discrepancies are large enough to warrant some investigation on our end.

Are you able to provide specific instructions for how you ran your benchmarks? Type of machine, duration of the runs, type of block device, operating system and configuration, etc. Basically, enough for us to observe the same discrepancy.

@mwang1026 mwang1026 moved this from Incoming to To Do (investigations) in Storage Jun 22, 2022
@venkatsvpr (Author)

Hi @nicktrav,

I am running the benchmarks as follows:

```
./testbench bench ycsb ./dbfiles/ --engine <engine_name> --workload <workload_type>
```

The commands are here

I am running this in an ubuntu:18.04 container on WSL. My laptop has an SSD.

```
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              8
On-line CPU(s) list: 0-7
Thread(s) per core:  2
Core(s) per socket:  4
Socket(s):           1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               140
Model name:          11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz
Stepping:            1
CPU MHz:             2995.212
BogoMIPS:            5990.42
Virtualization:      VT-x
Hypervisor vendor:   Microsoft
Virtualization type: full
L1d cache:           48K
L1i cache:           32K
L2 cache:            1280K
L3 cache:            12288K
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology tsc_reliable nonstop_tsc cpuid pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx512vbmi umip avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid movdiri movdir64b fsrm avx512_vp2intersect flush_l1d arch_capabilities
```

```
root@e15e7b03dcbe:/stuff/GitDev/pebble/cmd/pebble# lsmem
RANGE                                  SIZE  STATE REMOVABLE  BLOCK
0x0000000000000000-0x00000000f7ffffff  3.9G online       yes   0-30
0x0000000100000000-0x00000005ffffffff   20G online       yes 32-191

Memory block size:       128M
Total online memory:    23.9G
Total offline memory:      0B

root@e15e7b03dcbe:/stuff/GitDev/pebble/cmd/pebble# free -m
              total        used        free      shared  buff/cache   available
Mem:          23893        2939       18089         367        2864       20200
Swap:          6144           0        6144
root@e15e7b03dcbe:/stuff/GitDev/pebble/cmd/pebble#
```

Happy to provide more info. Thanks!

@nicktrav (Contributor)

Thanks @venkatsvpr - we'll take a look.

@ivanjaros

following

@jbowens jbowens moved this from To Do (investigations) to Community in Storage Oct 17, 2022
@pkieltyka

Hi all -- I was wondering if there are any updates on this benchmark? thank you :)

@nicktrav (Contributor)

No updates, unfortunately; this got deprioritized.

Are you looking for anything in particular?


sean- commented Nov 12, 2022

The benchmark used here does not perform synchronous writes for badger [1-4], whereas the pebble test respects the CLI flag for disabling the WAL [5-9] (and all WAL writes in pebble are synchronous when a write batch is committed). Running both engines with the same WAL setting would be illustrative.

[1] master...venkatsvpr:pebble:master#diff-33ef32bf6c23acb95f5902d7097b7a1d5128ca061167ec0716715b0b9eeaa5f6R17
[2] https://github.com/outcaste-io/badger/tree/v3.2202.0
[3] https://github.com/outcaste-io/badger/blob/v3.2202.0/options.go#L135-L158
[4] master...venkatsvpr:pebble:master#diff-056c0493a5a390469c794bee4fe9075f5f800246e9855c44990e0aa24faaa68aR26-R30
[5]

pebble/cmd/pebble/db.go

Lines 53 to 59 in 9de3a89

```go
func newPebbleDB(dir string) DB {
	cache := pebble.NewCache(cacheSize)
	defer cache.Unref()
	opts := &pebble.Options{
		Cache:      cache,
		Comparer:   mvccComparer,
		DisableWAL: disableWAL,
```

[6]

pebble/commit.go

Lines 237 to 240 in 4a3adc3

```go
// Commit the specified batch, writing it to the WAL, optionally syncing the
// WAL, and applying the batch to the memtable. Upon successful return the
// batch's mutations will be visible for reading.
func (p *commitPipeline) Commit(b *Batch, syncWAL bool) error {
```

[7]

pebble/commit.go

Lines 347 to 364 in 4a3adc3

```go
func (p *commitPipeline) prepare(b *Batch, syncWAL bool) (*memTable, error) {
	n := uint64(b.Count())
	if n == invalidBatchCount {
		return nil, ErrInvalidBatch
	}
	count := 1
	if syncWAL {
		count++
	}
	// count represents the waiting needed for publish, and optionally the
	// waiting needed for the WAL sync.
	b.commit.Add(count)
	var syncWG *sync.WaitGroup
	var syncErr *error
	if syncWAL {
		syncWG, syncErr = &b.commit, &b.commitErr
	}
```

[8]

pebble/open.go

Lines 121 to 126 in 5ed983e

```go
d.commit = newCommitPipeline(commitEnv{
	logSeqNum:     &d.mu.versions.atomic.logSeqNum,
	visibleSeqNum: &d.mu.versions.atomic.visibleSeqNum,
	apply:         d.commitApply,
	write:         d.commitWrite,
})
```

[9]

pebble/db.go

Lines 819 to 841 in 181258e

```go
func (d *DB) commitWrite(b *Batch, syncWG *sync.WaitGroup, syncErr *error) (*memTable, error) {
	var size int64
	repr := b.Repr()
	if b.flushable != nil {
		// We have a large batch. Such batches are special in that they don't get
		// added to the memtable, and are instead inserted into the queue of
		// memtables. The call to makeRoomForWrite with this batch will force the
		// current memtable to be flushed. We want the large batch to be part of
		// the same log, so we add it to the WAL here, rather than after the call
		// to makeRoomForWrite().
		//
		// Set the sequence number since it was not set to the correct value earlier
		// (see comment in newFlushableBatch()).
		b.flushable.setSeqNum(b.SeqNum())
		if !d.opts.DisableWAL {
			var err error
			size, err = d.mu.log.SyncRecord(repr, syncWG, syncErr)
			if err != nil {
				panic(err)
			}
		}
	}
```

@cscetbon

What is surprising, then, is that the badger [3] and pebble [8] numbers are so close ... @venkatsvpr, can you run your benchmarks again making sure the WAL settings are identical, as advised by Sean?
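
For the rerun, a minimal sketch of what "identical WAL settings" could look like, assuming the badger v3 `WithSyncWrites` option and pebble's `DisableWAL` option plus `pebble.Sync` write option (both present in the versions linked above); this is an options fragment for illustration, not the harness's actual code:

```go
// badger: opt in to synchronous WAL writes (off by default).
bopts := badger.DefaultOptions("./dbfiles/badger").WithSyncWrites(true)
bdb, err := badger.Open(bopts)

// pebble: keep the WAL enabled and commit batches with pebble.Sync,
// which waits for the WAL fsync before returning.
popts := &pebble.Options{DisableWAL: false}
pdb, err := pebble.Open("./dbfiles/pebble", popts)
err = pdb.Apply(batch, pebble.Sync)
```

Alternatively, disabling durability on both sides (badger's default asynchronous writes, pebble with `DisableWAL: true` or `pebble.NoSync`) would also make the comparison apples-to-apples.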

@venkatsvpr (Author)

Thanks @sean-. Sure, let me give it a try and get back.

@nicktrav nicktrav removed their assignment Feb 10, 2023
@nicktrav nicktrav moved this from Community to Backlog in Storage Feb 10, 2023
Projects: Storage (Status: Backlog)
Development: No branches or pull requests
6 participants