Much better stats for seeks and prefix filtering #11460

Closed

pdillinger commented May 18, 2023

Summary: We want to know more about opportunities for better range filters, and the effectiveness of our own range filters. Currently the stats are very limited, essentially logging just hits and misses against prefix filters for range scans in BLOOM_FILTER_PREFIX_* without tracking the false positive rate. Perhaps confusingly, when prefix filters are used for point queries, the stats are currently going into the non-PREFIX tickers.

This change does several things:

  • Introduce new stat tickers for seeks and related filtering, *LEVEL_SEEK*
    • Most importantly, this allows us to see opportunities for range filtering. Specifically, we can count how many times a seek in an SST file accesses at least one data block, and how many times at least one value() is then accessed. If a data block was accessed but no value(), we can generally assume that the keys seen were not of interest and so could have been filtered with the right kind of filter, avoiding the data block access.
    • We can get the same level of detail when a filter (for now, prefix Bloom/ribbon) is used, or not. Specifically, we can infer a false positive rate for prefix filters (not available before) from the seek "false positive" rate: when a data block is accessed but no value() is called. (There can be other explanations for a seek false positive, but in typical iterator usage it would indicate a filter false positive.)
    • For efficiency, I wanted to avoid making additional calls to the prefix extractor (or key comparisons, etc.), which would be required if we wanted to more precisely detect filter false positives. I believe that instrumenting value() is the best balance of efficiency vs. accurately measuring what we are often interested in.
    • The stats are divided between last level and non-last levels, to help understand potential tiered storage use cases.
  • The old BLOOM_FILTER_PREFIX_* stats now have a different meaning: they no longer refer to iterators but to point queries using prefix filters. BLOOM_FILTER_PREFIX_TRUE_POSITIVE is added for computing the prefix false positive rate on point queries, which can be due to filter false positives as well as different keys with the same prefix.
  • Similarly, the non-PREFIX BLOOM_FILTER stats are now for whole key filtering only.
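The rate calculations described in the list above can be sketched as follows. This is a minimal illustration, not code from the PR: the structs and field names are hypothetical stand-ins for counter values already read from statistics, and the interpretation of "checked" and "useful" for point queries is an assumption. The seek version also assumes every seek that passes the filter goes on to access a data block.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-level seek counters. Field names are illustrative only
// and are not the ticker names introduced by the PR.
struct SeekCounts {
  uint64_t filtered;            // seeks the filter ruled out
  uint64_t filter_match;        // seeks the filter let through
  uint64_t useful_after_match;  // of those, seeks with at least one value() read
};

// Approximate filter false positive rate for seeks: seeks that passed the
// filter but never read a value(), divided by all seeks that an ideal
// filter would have rejected.
inline double InferredSeekFalsePositiveRate(const SeekCounts& c) {
  assert(c.useful_after_match <= c.filter_match);
  const uint64_t false_pos = c.filter_match - c.useful_after_match;
  const uint64_t negatives = false_pos + c.filtered;
  return negatives == 0 ? 0.0 : static_cast<double>(false_pos) / negatives;
}

// Hypothetical point-query counters for prefix filters. Assumes "checked"
// counts filter queries, "useful" counts queries the filter rejected, and
// "true_positive" counts queries that passed and found the key.
struct PrefixPointCounts {
  uint64_t checked;
  uint64_t useful;
  uint64_t true_positive;
};

// Prefix false positive rate on point queries. The numerator includes both
// filter false positives and different keys sharing the same prefix.
inline double PrefixPointFalsePositiveRate(const PrefixPointCounts& c) {
  assert(c.useful + c.true_positive <= c.checked);
  const uint64_t false_pos = c.checked - c.useful - c.true_positive;
  const uint64_t negatives = c.checked - c.true_positive;
  return negatives == 0 ? 0.0 : static_cast<double>(false_pos) / negatives;
}
```

For example, with 75 filtered seeks, 50 filter matches, and 25 of those matches reading a value(), the sketch reports 25 false positives out of 100 rejectable seeks.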

Test Plan: unit tests updated, including updating many to pop the stat value since last read to improve test readability and maintainability.
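The "pop the stat value since last read" pattern from the test plan can be illustrated with a small self-contained counter. This is only a sketch of the idea: the class and method names are hypothetical and do not reflect the helpers used in the actual unit tests.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical counter illustrating "pop since last read": each Pop()
// returns what accumulated since the previous Pop() and resets to zero,
// so a test can assert on deltas instead of tracking running totals.
class PoppableCounter {
 public:
  void Record(uint64_t n = 1) {
    count_.fetch_add(n, std::memory_order_relaxed);
  }

  // Atomically reads the current count and resets it to zero.
  uint64_t Pop() { return count_.exchange(0, std::memory_order_relaxed); }

 private:
  std::atomic<uint64_t> count_{0};
};
```

With this shape, each test step can assert the exact number of ticks it expects from the operation it just performed, without subtracting an earlier snapshot.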

Performance test shows a consistent small improvement with these changes (about 3.9% more ops/sec in the seekrandom results below), both with clang and with gcc. CPU profile indicates that RecordTick is using less CPU, and this makes sense at least for a high filter miss rate. Before, we were recording two ticks per filter miss in iterators (CHECKED & USEFUL); now we record just one (FILTERED).

Create DB with

TEST_TMPDIR=/dev/shm ./db_bench -benchmarks=fillrandom -num=10000000 -disable_wal=1 -write_buffer_size=30000000 -bloom_bits=8 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=0 -prefix_size=8

Then run the before and after builds simultaneously with

TEST_TMPDIR=/dev/shm ./db_bench -readonly -benchmarks=seekrandom[-X1000] -num=10000000 -bloom_bits=8 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=0 -prefix_size=8 -seek_nexts=1 -duration=20 -seed=43 -threads=8 -cache_size=1000000000 -statistics

Before: seekrandom [AVG 275 runs] : 189680 (± 222) ops/sec; 18.4 (± 0.0) MB/sec
After: seekrandom [AVG 275 runs] : 197110 (± 208) ops/sec; 19.1 (± 0.0) MB/sec

pdillinger changed the title from "[WIP] Much better stats for seeks and prefix filtering" to "Much better stats for seeks and prefix filtering" on May 19, 2023

@facebook-github-bot
@pdillinger has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@ajkr ajkr left a comment

Looks great!

@facebook-github-bot
@pdillinger merged this pull request in 39f5846.
