2.29.0.0-b116

@ttyusupov tagged this 29 Oct 22:27
Summary:
The following metrics were useful when investigating index scan (seek pattern) performance, so we want to have them available for further analysis when needed:
1) rocksdb_bloom_filter_time_nanos - how much time we spend inside `BloomFilterAwareFileFilter::Filter`. This includes the time measured by the next two metrics:
2) rocksdb_get_fixed_size_filter_block_handle_nanos - how much time we spend inside `BlockBasedTable::GetFixedSizeFilterBlockHandle` (getting the filter block handle).
3) rocksdb_get_filter_block_from_cache_nanos - how much time we spend getting the filter block from the block cache (using the handle).

During an index scan, (2) and (3) might account for up to ~50% of (1).

Added the runtime flag rocksdb_collect_bloom_filter_time_metrics (off by default), which enables collection of these metrics. We don't want to collect them all the time because the collection happens in the hot read path.
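
As an illustration of why the flag matters on the hot path, here is a minimal sketch of a scoped timer that reads the clock only when collection is enabled. It is not the code from this change: `collect_time_metrics`, `NanosMetric` and `ScopedNanosTimer` are hypothetical names, and `std::chrono::steady_clock` stands in for the cycle-based `rocksdb::StopWatch`.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical stand-in for the runtime flag (off by default).
std::atomic<bool> collect_time_metrics{false};

// Hypothetical accumulator for a "*_nanos" metric: total time and sample count.
struct NanosMetric {
  std::atomic<uint64_t> sum{0};
  std::atomic<uint64_t> count{0};
};

// Scoped timer: when the flag is off, neither the constructor nor the
// destructor reads the clock, so the disabled cost is one relaxed load.
class ScopedNanosTimer {
 public:
  explicit ScopedNanosTimer(NanosMetric* metric)
      : metric_(metric),
        enabled_(collect_time_metrics.load(std::memory_order_relaxed)) {
    assert(metric_ != nullptr);
    if (enabled_) start_ = std::chrono::steady_clock::now();
  }

  ScopedNanosTimer(const ScopedNanosTimer&) = delete;
  ScopedNanosTimer& operator=(const ScopedNanosTimer&) = delete;

  ~ScopedNanosTimer() {
    if (!enabled_) return;
    const auto elapsed = std::chrono::duration_cast<std::chrono::nanoseconds>(
        std::chrono::steady_clock::now() - start_).count();
    metric_->sum.fetch_add(static_cast<uint64_t>(elapsed),
                           std::memory_order_relaxed);
    metric_->count.fetch_add(1, std::memory_order_relaxed);
  }

 private:
  NanosMetric* metric_;
  bool enabled_;
  std::chrono::steady_clock::time_point start_;
};
```

Placing a `ScopedNanosTimer` at the top of a function records one sample per call while the flag is on and records nothing while it is off.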

Also updated the `rocksdb::StopWatch` class to rely on CPU cycles, which avoids system calls and reduces overhead, and added the ability to measure in nanoseconds. Replaced `rocksdb::StopWatchNano` with the updated `rocksdb::StopWatch` since it has less overhead.

`rocksdb::StopWatchNano` overhead:
```
[ RUN      ] PerfContextTest.StopWatchNanoOverhead
Count: 1000000  Average: 22.7228  StdDev: 57.10
Min: 20.0000  Median: 22.5234  Max: 40868.0000
Percentiles: P50: 22.52 P75: 23.79 P99: 25.00 P99.9: 69.65 P99.99: 391.27
```

Updated `rocksdb::StopWatch` overhead:
```
Count: 1000000  Average: 17.3568  StdDev: 17.85
Min: 6.0000  Median: 17.0132  Max: 10850.0000
Percentiles: P50: 17.01 P75: 17.54 P99: 19.34 P99.9: 20.03 P99.99: 24.80
```
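
For readers who want to reproduce this kind of overhead measurement, a rough sketch follows. It is not the `PerfContextTest.StopWatchNanoOverhead` test: `OverheadStats` and `MeasureTimerOverhead` are hypothetical names, and `std::chrono::steady_clock` is used as a stand-in for the stopwatch under test, so the numbers it produces are not comparable to the ones above.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical summary of timer-overhead samples, in nanoseconds.
struct OverheadStats {
  size_t count;
  double average;
  uint64_t min;
  uint64_t median;
  uint64_t max;
};

// Starts and immediately stops a timer `iterations` times, then summarizes
// the readings. The gap between the two clock reads approximates the cost
// of one timing operation.
OverheadStats MeasureTimerOverhead(size_t iterations) {
  assert(iterations > 0);
  std::vector<uint64_t> samples;
  samples.reserve(iterations);
  for (size_t i = 0; i < iterations; ++i) {
    const auto start = std::chrono::steady_clock::now();
    const auto stop = std::chrono::steady_clock::now();
    samples.push_back(static_cast<uint64_t>(
        std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)
            .count()));
  }
  std::sort(samples.begin(), samples.end());
  double sum = 0.0;
  for (uint64_t s : samples) sum += static_cast<double>(s);
  return {samples.size(), sum / static_cast<double>(samples.size()),
          samples.front(), samples[samples.size() / 2], samples.back()};
}
```

Calling `MeasureTimerOverhead(1000000)` would use the same sample count as the runs above.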
Jira: DB-18312

Test Plan:
- Jenkins
- Manual test:
```
./bin/yb-ctl start --tserver_flags '"rocksdb_collect_bloom_filter_time_metrics=true"'
./bin/ysqlsh -c "explain (analyze, dist, debug) select v3 from t3 where v1 = 123"
...
     Metric rocksdb_bloom_filter_time_nanos: sum: 4155021.000, count: 5005.000
     Metric rocksdb_get_fixed_size_filter_block_handle_nanos: sum: 1189777.000, count: 5005.000
     Metric rocksdb_get_filter_block_from_cache_nanos: sum: 949035.000, count: 5005.000
...
```
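
The sample output can be checked against the "~50%" estimate from the summary with a little arithmetic. The helper below is only an illustrative sketch: `MetricSample`, `AverageNanos` and `ShareOfTotal` are hypothetical names, not part of this change.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical holder for one "Metric ...: sum: S, count: C" line.
struct MetricSample {
  uint64_t sum_nanos;
  uint64_t count;
};

// Average time per call, in nanoseconds.
double AverageNanos(const MetricSample& m) {
  assert(m.count > 0);
  return static_cast<double>(m.sum_nanos) / static_cast<double>(m.count);
}

// Fraction of `total` time that is spent in `a` and `b` combined.
double ShareOfTotal(const MetricSample& a, const MetricSample& b,
                    const MetricSample& total) {
  assert(total.sum_nanos > 0);
  return static_cast<double>(a.sum_nanos + b.sum_nanos) /
         static_cast<double>(total.sum_nanos);
}
```

With the values above, (1189777 + 949035) / 4155021 is about 0.515, so metrics (2) and (3) together make up roughly half of metric (1) in this run, and the bloom filter check averages about 830 ns over the 5005 calls.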

Reviewers: arybochkin

Reviewed By: arybochkin

Subscribers: kannan, rthallam, yql, ybase

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D47624