3.10 performance regression? #574
Comments
Your run on 3.10 is insanely slow, which is not expected. If you don't mind, can you send us the database "LOG" files for both runs? They will give us more information. Running with "-statistics" would also help us identify problems. Unrelated, as a recommendation: we run benchmarks using "tools/benchmark.sh", and we usually keep our defaults up to date there. The parameters you used are probably from the web page, which didn't contain an appropriate configuration for max_background_flushes. Setting max_background_flushes to a higher number, like 4 or 8, is the general recommendation for fillrandom or fillseq tests.
I see this:
Maybe we added a bunch of assertions that are slowing down db_bench. This is also something to fix, but can you please run db_bench without assertions? You can get db_bench without assertions by running I'll look into this soon. |
Able to repro, and confirmed that this happens only in a DEBUG build. |
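To illustrate why a DEBUG build can behave so differently from a release build, here is a minimal sketch of the standard C/C++ mechanism: `assert` expressions are compiled out entirely when `NDEBUG` is defined. The names `ExpensiveInvariantHolds`, `HotPathOperation`, and `RunHotPath` are hypothetical and are not taken from RocksDB; this only shows the general effect, not the specific code path behind this issue.

```cpp
#include <cassert>
#include <cstdint>

// Counts how many times the debug-only check actually executed.
uint64_t g_checks_run = 0;

// Hypothetical stand-in for an expensive invariant check.
bool ExpensiveInvariantHolds() {
  ++g_checks_run;
  return true;
}

// Hypothetical hot-path operation. With NDEBUG defined, the assert
// expands to nothing and the check is never evaluated.
void HotPathOperation() {
  assert(ExpensiveInvariantHolds());
}

// Reports whether this translation unit was built with assertions on.
constexpr bool AssertionsEnabled() {
#ifdef NDEBUG
  return false;
#else
  return true;
#endif
}

// Runs the hot path n times and returns how many checks were executed.
uint64_t RunHotPath(uint64_t n) {
  g_checks_run = 0;
  for (uint64_t i = 0; i < n; ++i) {
    HotPathOperation();
  }
  return g_checks_run;
}
```

In a build without `NDEBUG`, `RunHotPath(1000)` executes the check 1000 times; with `-DNDEBUG` it executes it zero times, which is why debug-only work on a hot path can dominate a benchmark.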
thanks for looking into this, guys. glad it's not an issue. I'll just have to figure out what incantation of gcc flags I need to compile with for these wacky Go bindings we're using (I assume these are somewhere in the Makefile, I'll dig around tonight) |
thanks for tips @siying |
@rdallman it is still an issue, it's just not a big issue. I managed to bisect to this diff: https://reviews.facebook.net/D32787. Can you please reopen the issue? |
Should be fixed with https://reviews.facebook.net/D36963 |
@rdallman make sure to always compile with |
ahh, that's the guy I want. thanks @igorcanadi |
Summary: See github issue 574: #574

Basically when we're running in DEBUG mode we're calling `usleep(0)` on every mutex lock. I bisected the issue to https://reviews.facebook.net/D32787. Instead of calling sleep(0), this diff just avoids calling SleepForMicroseconds() when delay is not set.

Test Plan:
bpl=10485760;overlap=10;mcz=2;del=300000000;levels=2;ctrig=10000000; delay=10000000; stop=10000000; wbn=30; mbc=20; mb=1073741824;wbs=268435456; dds=1; sync=0; r=100000; t=1; vs=800; bs=65536; cs=1048576; of=500000; si=1000000;
./db_bench --benchmarks=fillrandom --disable_seek_compaction=1 --mmap_read=0 --statistics=1 --histogram=1 --num=$r --threads=$t --value_size=$vs --block_size=$bs --cache_size=$cs --bloom_bits=10 --cache_numshardbits=4 --open_files=$of --verify_checksum=1 --db=/tmp/rdb10test --sync=$sync --disable_wal=1 --compression_type=snappy --stats_interval=$si --compression_ratio=0.5 --disable_data_sync=$dds --write_buffer_size=$wbs --target_file_size_base=$mb --max_write_buffer_number=$wbn --max_background_compactions=$mbc --level0_file_num_compaction_trigger=$ctrig --level0_slowdown_writes_trigger=$delay --level0_stop_writes_trigger=$stop --num_levels=$levels --delete_obsolete_files_period_micros=$del --min_level_to_compress=$mcz --max_grandparent_overlap_factor=$overlap --stats_per_interval=1 --max_bytes_for_level_base=$bpl --memtablerep=vector --use_existing_db=0 --disable_auto_compactions=1 --source_compaction_factor=10000000 | grep ops

Before: fillrandom : 117.525 micros/op 8508 ops/sec; 6.6 MB/s
After: fillrandom : 1.283 micros/op 779502 ops/sec; 606.6 MB/s

Reviewers: rven, yhchiang, sdong
Reviewed By: sdong
Subscribers: meyering, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D36963
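The fix described in the summary (skip the sleep call when no delay is configured) can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual diff: `SleepForMicros`, `LockHookAlwaysSleep`, `LockHookGuarded`, and `CountSleepCalls` are hypothetical names, and the sleep is done with `std::this_thread::sleep_for` rather than whatever the real code uses.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>

// Counts how many times the sleep routine was entered (illustration only).
uint64_t g_sleep_calls = 0;

// Hypothetical sleep helper; even a zero-length sleep still costs a call
// into the threading/OS layer.
void SleepForMicros(uint64_t micros) {
  ++g_sleep_calls;
  std::this_thread::sleep_for(std::chrono::microseconds(micros));
}

// "Before" shape: sleeps on every lock, even when the delay is 0.
void LockHookAlwaysSleep(uint64_t delay_micros) {
  SleepForMicros(delay_micros);
}

// "After" shape: only sleeps when a delay has actually been set.
void LockHookGuarded(uint64_t delay_micros) {
  if (delay_micros > 0) {
    SleepForMicros(delay_micros);
  }
}

// Invokes the hook `iterations` times and returns the number of sleep calls.
uint64_t CountSleepCalls(void (*hook)(uint64_t), uint64_t delay_micros,
                         int iterations) {
  assert(iterations >= 0);
  g_sleep_calls = 0;
  for (int i = 0; i < iterations; ++i) {
    hook(delay_micros);
  }
  return g_sleep_calls;
}
```

With a delay of 0, the guarded variant never enters the sleep routine, while the unguarded one does so once per lock acquisition. That per-lock overhead is what the Before/After numbers above would be measuring, assuming the sketch matches the shape of the real change.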
This is fixed now on master and 3.10.fb branch. Tnx for reporting @rdallman ! |
Hi, could you tell me what the output means? I want to measure performance, but I don't know which fields to look at. Thanks a lot. |
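As a rough aid for reading result lines such as `fillrandom : 117.525 micros/op 8508 ops/sec; 6.6 MB/s`, the sketch below parses one line and relates the fields. It assumes the line format quoted in this thread and a single-threaded run (`t=1`); the relation ops/sec ≈ 1,000,000 / (micros/op) is inferred from the quoted numbers, not from db_bench documentation. `BenchResult`, `ParseBenchLine`, `OpsPerSecFromLatency`, and `Speedup` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Fields of one result line, as they appear in the output quoted above.
struct BenchResult {
  char name[64];         // benchmark name, e.g. "fillrandom"
  double micros_per_op;  // average latency per operation, in microseconds
  long ops_per_sec;      // throughput in operations per second
  double mb_per_sec;     // data throughput as printed
};

// Parses "<name> : <x> micros/op <n> ops/sec; <y> MB/s".
// Returns true only if all four fields were read.
bool ParseBenchLine(const std::string& line, BenchResult* out) {
  assert(out != nullptr);
  int fields = std::sscanf(line.c_str(),
                           " %63s : %lf micros/op %ld ops/sec; %lf MB/s",
                           out->name, &out->micros_per_op,
                           &out->ops_per_sec, &out->mb_per_sec);
  return fields == 4;
}

// For one thread, latency and throughput are roughly reciprocal.
double OpsPerSecFromLatency(double micros_per_op) {
  return 1e6 / micros_per_op;
}

// Ratio of two latencies: how many times faster the "after" run is.
double Speedup(double before_micros, double after_micros) {
  return before_micros / after_micros;
}
```

For the numbers in this thread, `OpsPerSecFromLatency(117.525)` is about 8508, matching the printed ops/sec, and `Speedup(117.525, 1.283)` is roughly 91.6. Lower micros/op and higher ops/sec and MB/s are better.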
I hope that this is just me doing something wrong during the build process. I've done some sanity checking but can't seem to get things back to normal. Hoping that I can get some help debugging/building/benchmarking to make sure that it's just me who's insane :)

Context: we updated from 3.8 to 3.10 and saw a performance regression where throughput was halved and latency was ~7x worse in our tests. So we started investigating and today tested using the `db_bench` tool against versions 3.8 and 3.10. I'll post the commands I ran below with results, but it's pretty jarring. I've tried building v3.10 with `PORTABLE=1 make all`, and the difference from the normal build (i.e. `-march=native`) was negligible. I built both versions using `make all` against gcc 4.9.2 on Linux.

The goal isn't so much to benchmark the thing under certain settings to get big numbers (you'll notice the tests are short; happy to finagle, I just don't think it should regress with the same settings) -- I just wanted to see that things haven't regressed. Let me know what additional information, if any, you need to help diagnose.

3.8:
TL;DR this takes ~1 second to complete at 245MB/s -- perfectly ok, like I said, didn't care what the original numbers are so much, just want them as a metric for comparison -- and these seem sane.
command:
output:
3.10:
TL;DR takes ~40 seconds at 2 MB/s, a > 100x performance regression.
command (same as above):
results:
Hope we can figure this one out, thanks guys