[Exec](Cache) Support condition cache in Apache Doris #61385
HappenLee wants to merge 6 commits into apache:branch-4.1
(cherry picked from commit 87a5362)
Fix the condition cache being ineffective with range conditions, because the range condition changes the row_bitmap (cherry picked from commit cef5f09)
…pache#58115)

Condition cache malfunctions were caused by flawed cache digest calculation in two core components, leading to inconsistent hash values, incorrect cache hits/misses, or query execution errors:

- In **VBloomPredicate::get_digest()** (file: be/src/vec/exprs/vbloom_predicate.cpp), the digest was computed directly from the bloom filter data without incorporating the child expression's digest, so the same filter data applied to different child expressions could produce the same digest.
- In **VRuntimeFilterWrapper::get_digest()** (file: be/src/vec/exprs/vruntimefilter_wrapper.h), the _null_aware flag (a key attribute controlling filter behavior) was excluded from the digest calculation, resulting in invalid cache keys.

(cherry picked from commit 30f4100)
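
The two fixes described above follow the same pattern: fold the child's digest in first, then add whatever distinguishes this node. The following is a minimal sketch of that pattern, not the Doris implementation. `hash_bytes`, `ChildExpr`, `BloomPredicate`, and `RuntimeFilterWrapper` are hypothetical stand-ins, and FNV-1a is swapped in for whatever hash routine the real code uses.

```cpp
#include <cstdint>
#include <string>

// Hypothetical hash helper (FNV-1a folded into a seed); not the Doris hash routine.
inline uint64_t hash_bytes(const std::string& bytes, uint64_t seed) {
    uint64_t h = seed ^ 14695981039346656037ULL;
    for (unsigned char c : bytes) {
        h ^= c;
        h *= 1099511628211ULL;
    }
    return h;
}

// Hypothetical child expression, identified here only by a column name.
struct ChildExpr {
    std::string column;
    uint64_t get_digest(uint64_t seed) const { return hash_bytes(column, seed); }
};

// Bloom predicate sketch: the child digest is folded in before the filter bytes,
// so the same filter applied to different columns yields different digests.
struct BloomPredicate {
    ChildExpr child;
    std::string filter_bytes;
    uint64_t get_digest(uint64_t seed) const {
        seed = child.get_digest(seed);
        return hash_bytes(filter_bytes, seed);
    }
};

// Runtime-filter wrapper sketch: the null_aware flag becomes part of the digest.
struct RuntimeFilterWrapper {
    BloomPredicate inner;
    bool null_aware;
    uint64_t get_digest(uint64_t seed) const {
        seed = inner.get_digest(seed);
        const char flag = null_aware ? 1 : 0;
        return hash_bytes(std::string(1, flag), seed);
    }
};
```

In this sketch, two wrappers that differ only in `null_aware`, or two bloom predicates that differ only in their child column, always produce different digests for the same seed.
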
Support digest calculation in IN filters to enable the condition cache (cherry picked from commit b0c9e70)
…dd debug log (apache#58857)

### What problem does this PR solve?

## Overview

This pull request (authored by HappenLee) focuses on **performance optimization** (via sorting algorithm replacement) and **observability enhancement** (via logging expansion) for Apache Doris, along with a critical fix to ensure accurate digest calculation in predicate expressions. The changes span core data structure handling, segment iteration, and vectorized expression logic.

## Key Changes

### 1. Performance: Replace `std::sort` with `pdqsort` for Faster Set Sorting

- **File**: `be/src/exprs/hybrid_set.h`
- **Modifications**:
  - Added `#include <pdqsort.h>` to enable the pdqsort algorithm (a fast, adaptive quicksort variant optimized for real-world data).
  - Replaced `std::sort(elems.begin(), elems.end())` with `pdqsort(elems.begin(), elems.end())` in three set classes:
    - `HybridSet`: For generic element type sets.
    - `StringSet`: For string reference (`StringRef`) sets.
    - `StringValueSet`: For string value-based sets.
- **Purpose**: pdqsort outperforms `std::sort` in most practical scenarios (e.g., partially sorted data, duplicate values), reducing the time to sort elements during digest calculation for set-based operations.

### 2. Observability: Add Debug Logs for Condition Cache Operations

- **File**: `be/src/olap/rowset/segment_v2/segment_iterator.cpp`
- **Modifications**:
  - **Cache Hit Logging** (Line 132-138): Added `VLOG_DEBUG` output when a condition cache hit occurs, including:
    - Query ID (from `_opts.runtime_state->query_id()`).
    - Segment ID (`_segment->id()`).
    - Cache digest (`_opts.condition_cache_digest`).
    - Rowset ID (`_opts.rowset_id.to_string()`).
  - **Cache Insert Logging** (Line 2379-2383): Added `VLOG_DEBUG` output when inserting data into the condition cache, including the same fields as the hit log.
- **Purpose**: Improve debuggability for cache-related issues (e.g., false misses, incorrect cache entries) by linking cache events to specific queries and data segments.

### 3. Correctness: Fix Digest Calculation for `VDirectInPredicate`

- **File**: `be/src/vec/exprs/vdirect_in_predicate.h`
- **Modifications**:
  - Updated the `get_digest(uint64_t seed)` method to:
    1. First incorporate the digest of the predicate’s child expression (`_children[0]->get_digest(seed)`).
    2. Only propagate the filter’s digest (`_filter->get_digest(seed)`) if the child digest is non-zero; otherwise, return the original seed.
  - Replaced the previous implementation (which directly returned `_filter->get_digest(seed)` without including the child expression).
- **Purpose**: Ensure the digest uniquely identifies the full predicate logic (including both the child expression and the filter), preventing hash collisions that could lead to incorrect cache lookups or data processing.

## Impact

- **Performance**: Faster sorting for set-based digest calculations may reduce latency in query operations involving `IN` predicates or set comparisons.
- **Debuggability**: Detailed cache logs enable quicker diagnosis of cache performance issues.
- **Correctness**: Fixes a potential source of incorrect digest values, improving the reliability of cache-dependent features (e.g., condition cache, query result caching).

(cherry picked from commit cf940bc)
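
The commit message above says set elements are sorted during digest calculation. A plausible reason, stated here as an assumption, is that a hash set has no stable iteration order, so sorting first makes the digest depend only on the set's contents. The sketch below illustrates that idea with hypothetical names (`mix`, `set_digest`). It uses `std::sort` so that it compiles without third-party headers, where the PR uses `pdqsort`.

```cpp
#include <algorithm>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical mixing step in the style of boost::hash_combine; not the Doris hash.
inline uint64_t mix(uint64_t seed, uint64_t value) {
    return seed ^ (value + 0x9e3779b97f4a7c15ULL + (seed << 6) + (seed >> 2));
}

// Digest of an IN-set: copy the elements out, sort them, then fold them into the seed.
// Sorting makes the result independent of the hash set's iteration order.
inline uint64_t set_digest(const std::unordered_set<int64_t>& set, uint64_t seed) {
    std::vector<int64_t> elems(set.begin(), set.end());
    std::sort(elems.begin(), elems.end());  // the PR replaces std::sort with pdqsort here
    for (int64_t e : elems) {
        seed = mix(seed, static_cast<uint64_t>(e));
    }
    return seed;
}
```

Two sets built from the same values in a different insertion order produce the same digest, which is what lets `IN (1, 2, 3)` and `IN (3, 2, 1)` share a cache entry in this sketch.
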
…che#59545) Type datetime(3) is different from type datetime(8), which may change the condition cache result (cherry picked from commit 73fbbc9)
run buildall
Pull request overview
This PR cherry-picks and wires up “condition cache” support in Doris execution: FE propagates a digest into BE scan options, BE computes per-conjunct digests, caches segment/block filter results, and adds regression coverage and metrics.
Changes:
- Add FE session variable + thrift plumbing to pass a condition-cache digest into BE query options.
- Implement BE condition cache (LRU-backed) and expression/scan digest computation, integrate it into segment iteration and scan pipeline, and expose cache metrics.
- Add regression test suite/output for condition cache behavior (and disable condition cache in an existing index suite to reduce interference).
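
The flow summarized above (an FE-provided digest combined with per-conjunct digests on the BE) can be sketched as follows. This is an illustration under stated assumptions, not the code in `scan_operator.cpp`: `Conjunct` and `scan_digest` are hypothetical, and the sketch assumes that a digest of 0 means "the cache cannot be used", both for a query that did not send a digest and for an expression that does not support digests.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical conjunct. `supports_digest == false` models an expression
// whose get_digest() returns 0 to opt out of the condition cache.
struct Conjunct {
    uint64_t fingerprint;
    bool supports_digest;
    uint64_t get_digest(uint64_t seed) const {
        if (!supports_digest) return 0;
        uint64_t h = seed ^ (fingerprint + 0x9e3779b97f4a7c15ULL + (seed << 6) + (seed >> 2));
        return h == 0 ? 1 : h;  // keep 0 reserved for "unsupported" in this sketch
    }
};

// Returns 0 when the scan should bypass the condition cache (an assumed convention).
inline uint64_t scan_digest(uint64_t query_option_digest,
                            const std::vector<Conjunct>& conjuncts) {
    uint64_t seed = query_option_digest;
    if (seed == 0) return 0;  // assumption: no digest from FE means the feature is off
    for (const Conjunct& c : conjuncts) {
        seed = c.get_digest(seed);  // each conjunct is chained onto the running digest
        if (seed == 0) return 0;    // one unsupported conjunct disables caching
    }
    return seed;
}
```

Chaining the digests through the seed means the final value depends on the query options and on every conjunct, so a change to any of them leads to a different cache key.
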
Reviewed changes
Copilot reviewed 49 out of 49 changed files in this pull request and generated 15 comments.
| File | Description |
|---|---|
| regression-test/suites/query_p0/cache/condition_cache.groovy | New regression suite covering condition cache behavior across predicates/joins/modifications. |
| regression-test/data/query_p0/cache/condition_cache.out | Expected results for the new condition cache regression suite. |
| regression-test/suites/index_p0/test_ngram_bloomfilter_index_change.groovy | Disable condition cache for this suite and minor formatting fix. |
| gensrc/thrift/PaloInternalService.thrift | Add thrift fields for condition cache digest/enable flag. |
| fe/fe-core/src/main/java/org/apache/doris/qe/SessionVariable.java | Add session variable, compute digest from affect-query-result variables, propagate to thrift. |
| be/src/vec/exprs/vexpr.h | Add virtual get_digest() to expression base class. |
| be/src/vec/exprs/vexpr.cpp | Implement default digest combining children + function/opcode identifiers. |
| be/src/vec/exprs/vexpr_context.h | Add VExprContext::get_digest(). |
| be/src/vec/exprs/vexpr_context.cpp | Implement context digest as root expr digest. |
| be/src/vec/exprs/vslot_ref.h | Add digest override support + store column unique id. |
| be/src/vec/exprs/vslot_ref.cpp | Populate column unique id and implement slot-ref digest. |
| be/src/vec/exprs/vliteral.h | Add digest override for literals. |
| be/src/vec/exprs/vliteral.cpp | Implement literal digest via column hashing. |
| be/src/vec/exprs/vruntimefilter_wrapper.h | Include runtime-filter wrapper digest contribution. |
| be/src/vec/exprs/vdirect_in_predicate.h | Digest support for direct IN predicate via underlying set digest. |
| be/src/vec/exprs/vcast_expr.h | Include CAST target type in digest. |
| be/src/vec/exprs/vbloom_predicate.h | Add bloom predicate digest declaration. |
| be/src/vec/exprs/vbloom_predicate.cpp | Implement bloom predicate digest using filter bytes. |
| be/src/vec/exprs/vbitmap_predicate.h | Explicitly disable digest support (returns 0). |
| be/src/vec/exprs/vlambda_function_expr.h | Disable digest support (returns 0). |
| be/src/vec/exprs/vlambda_function_call_expr.h | Disable digest support (returns 0). |
| be/src/vec/exprs/vin_predicate.h | Disable digest support (returns 0). |
| be/src/vec/exec/scan/olap_scanner.cpp | Pass condition-cache digest into tablet reader params and add profile counters. |
| be/src/pipeline/exec/scan_operator.h | Track condition-cache digest in scan local state; rename TopN filter ids member. |
| be/src/pipeline/exec/scan_operator.cpp | Compute per-scan digest from query option + conjunct digests; update TopN filter ids usage. |
| be/src/pipeline/exec/olap_scan_operator.h | Add profile counters for condition cache stats. |
| be/src/pipeline/exec/olap_scan_operator.cpp | Initialize new counters; adjust scan-parallelism setting call. |
| be/src/olap/tablet_reader.h | Add condition-cache digest field to reader params. |
| be/src/olap/tablet_reader.cpp | Propagate digest into rowset reader context. |
| be/src/olap/rowset/rowset_reader_context.h | Add condition-cache digest to rowset reader context. |
| be/src/olap/rowset/beta_rowset_reader.cpp | Disable cache when delete predicates exist; fold key-range/row-range digests into the cache digest. |
| be/src/olap/rowset/segment_v2/row_ranges.h | Add digest methods for row ranges/range lists. |
| be/src/olap/rowset/segment_v2/segment_iterator.h | Add condition cache state and helpers to segment iterator. |
| be/src/olap/rowset/segment_v2/segment_iterator.cpp | Lookup/apply/insert condition cache; update range bitmap intersection; add cache stats accounting. |
| be/src/olap/rowset/segment_v2/condition_cache.h | New LRU-backed condition cache policy + handle types. |
| be/src/olap/rowset/segment_v2/condition_cache.cpp | Implement cache lookup/insert. |
| be/src/runtime/exec_env_init.cpp | Initialize/destroy global condition cache with configurable memory limit. |
| be/src/runtime/exec_env.h | Add global condition cache pointer + accessor. |
| be/src/runtime/memory/cache_policy.h | Add CONDITION_CACHE enum and type string. |
| be/src/util/doris_metrics.h | Add global counters for condition cache lookups/hits. |
| be/src/util/doris_metrics.cpp | Define/register new condition cache metrics. |
| be/src/common/config.h | Add BE config condition_cache_limit. |
| be/src/common/config.cpp | Define default condition_cache_limit. |
| be/src/runtime/memory/cache_policy.h | Add cache type string mapping for condition cache. |
| be/src/olap/olap_common.h | Add condition cache stats to reader statistics struct. |
| be/src/olap/iterators.h | Add digest for key ranges and plumb digest through storage read options. |
| be/src/olap/parallel_scanner_builder.h | Rename/repurpose segment-parallelism flag and related comments. |
| be/src/olap/parallel_scanner_builder.cpp | Update scan-parallelism-by-segment strategy and related splitting logic. |
| be/src/vec/columns/column_const.h | Adjust constant-column hashing implementation. |
| be/src/exprs/hybrid_set.h | Add digest API for IN-filter sets and implement it for set types. |
    _condition_cache = std::make_shared<std::vector<bool>>(
            num_rows / CONDITION_CACHE_OFFSET + 1, false);

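
The allocation above sizes the cache at one bit per `CONDITION_CACHE_OFFSET` rows rather than one bit per row. A minimal sketch of how such a block-granular bitmap could be filled and consulted is shown below. The value 2048 for `CONDITION_CACHE_OFFSET` and the helpers `build_condition_cache` and `can_skip_row` are assumptions made for illustration; neither appears on this page.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Assumed granularity; the real value of CONDITION_CACHE_OFFSET is not shown here.
constexpr std::size_t CONDITION_CACHE_OFFSET = 2048;

// Build the bitmap from the ids of rows that passed the predicates.
// A bit is set when at least one row in its block matched.
// Precondition: every id in matching_rows is less than num_rows.
inline std::shared_ptr<std::vector<bool>> build_condition_cache(
        std::size_t num_rows, const std::vector<std::size_t>& matching_rows) {
    auto cache = std::make_shared<std::vector<bool>>(
            num_rows / CONDITION_CACHE_OFFSET + 1, false);
    for (std::size_t row : matching_rows) {
        (*cache)[row / CONDITION_CACHE_OFFSET] = true;
    }
    return cache;
}

// On a cache hit, a row can be skipped when its whole block had no matches.
inline bool can_skip_row(const std::vector<bool>& cache, std::size_t row) {
    return !cache[row / CONDITION_CACHE_OFFSET];
}
```

With 10000 rows the bitmap holds 5 bits under this assumed offset. A set bit only says that some row in the block matched, so rows in that block still need the predicates evaluated; only blocks with a cleared bit can be skipped outright.
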
    @VariableMgr.VarAttr(name = ENABLE_CONDITION_CACHE)
    public boolean enableConditionCache = true;

    int _query_parallel_instance_num = 0;

    VLOG_DEBUG << "Condition cache insert, query id: "
               << print_id(_opts.runtime_state->query_id())
               << ", rowset id: " << _opts.rowset_id.to_string()
               << ", segment id: " << _segment->id()
               << ", cache digest: " << _opts.condition_cache_digest;

    92: optional i32 condition_cache_digest = 0;

    ConditionCacheHandle(
            this,
            LRUCachePolicy::insert(key.encode(), (void*)cache_value_ptr.release(),
                                   result->capacity(), result->capacity(), CachePriority::NORMAL));

    if (enableConditionCache) {
        tResult.setConditionCacheDigest(getAffectQueryResultVariableHashCode());
    }

    // rebuild table to skip the delete operation
    sql "create table temp like ${tableName}"
    sql "insert into temp select * from ${tableName}"
    sql "drop table ${tableName}"
    sql "alter table temp rename ${tableName}"

    144: optional bool enable_inverted_index_searcher_cache = true;
    145: optional bool enable_inverted_index_query_cache = true;
    146: optional bool fuzzy_disable_runtime_filter_in_be = false; // deprecated
    146: optional bool enable_condition_cache = false; //deprecated

    // Build scanners so that each segment is handled by its own scanner.
    Status _build_scanners_by_segment(std::list<ScannerSPtr>& scanners);

    std::shared_ptr<vectorized::OlapScanner> _build_scanner(
            BaseTabletSPtr tablet, int64_t version, const std::vector<OlapScanRange*>& key_ranges,
            TabletReadSource&& read_source);

    pipeline::OlapScanLocalState* _parent;

    /// Max scanners count limit to build
    size_t _max_scanners_count {16};

    /// Min rows per scanner
    size_t _min_rows_per_scanner {2 * 1024 * 1024};

    size_t _total_rows {};

    size_t _rows_per_scanner {_min_rows_per_scanner};

    std::map<RowsetId, std::vector<size_t>> _all_segments_rows;

    // Force building one scanner per segment when true.
    bool _optimize_index_scan_parallelism {false};
    bool _scan_parallelism_by_segment {false};

What problem does this PR solve?
Cherry-pick the condition cache code.
Release note
None