[Refactor](inverted index) refactor inverted index compound predicates evaluate logic #38908 #41385
…s evaluate logic (apache#38908)

This PR addresses several key issues related to the compound condition support in the inverted index, and optimization for index skipping without returning to the table:

1. **Unified Handling of `expr` and `column predicate`**:
   - Combined the processing of inverted index-related `column predicate` and `expr`.
   - Ensured that compound conditions involving both `column predicate` and `expr` are processed uniformly to reduce complexity and improve robustness.
2. **Optimized the Execution of Compound Conditions**:
   - Removed the logic in `scan_operator` that normalized compound predicates by pushing down logic to `_common_expr_ctxs_push_down` where `expr` contexts are managed.
   - Added `evaluate_inverted_index` support to the `vexpr` and function layers, such as `function comparison` and `function collection_in`.
   - Introduced new data structures in `VExprContext` to store results from `evaluate_inverted_index`, thus facilitating quick lookup and application of these results during execution.
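To make the idea in item 2 concrete, here is a minimal sketch of how index results might be cached per expression and combined for compound predicates. It is not the Doris implementation: `Expr`, `RowBitmap`, `IndexResultContext`, and `make_leaf` are hypothetical stand-ins for `VExpr`, the index bitmap type, and the new storage in `VExprContext`, and NULL semantics are ignored entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical row bitmap for one segment (true = row matches).
using RowBitmap = std::vector<bool>;

enum class Op { LEAF, AND, OR, NOT };

// Hypothetical expression node; not the real VExpr interface.
struct Expr {
    Op op = Op::LEAF;
    std::vector<std::unique_ptr<Expr>> children;
    bool has_index = false;   // can an inverted index answer this leaf?
    RowBitmap index_result;   // bitmap the index lookup would return
};

// Hypothetical stand-in for the result storage added to VExprContext.
struct IndexResultContext {
    std::unordered_map<const Expr*, RowBitmap> results;
    bool has_result(const Expr* e) const { return results.count(e) != 0; }
};

// Builds an index-backed leaf (illustration only).
inline std::unique_ptr<Expr> make_leaf(RowBitmap bits) {
    auto e = std::make_unique<Expr>();
    e->has_index = true;
    e->index_result = std::move(bits);
    return e;
}

// Returns true and records a bitmap in `ctx` when the whole subtree can be
// answered from index results alone; otherwise returns false.
inline bool evaluate_inverted_index(const Expr& expr, uint32_t segment_num_rows,
                                    IndexResultContext& ctx) {
    if (expr.op == Op::LEAF) {
        if (!expr.has_index) return false;
        assert(expr.index_result.size() == segment_num_rows);
        ctx.results[&expr] = expr.index_result;
        return true;
    }
    // Visit every child so that resolvable children are cached even when a
    // sibling cannot be answered from the index.
    bool all_resolved = true;
    for (const auto& child : expr.children) {
        if (!evaluate_inverted_index(*child, segment_num_rows, ctx)) all_resolved = false;
    }
    if (!all_resolved || expr.children.empty()) return false;
    assert(expr.op != Op::NOT || expr.children.size() == 1);

    // AND starts from all-true, OR and NOT from all-false.
    RowBitmap out(segment_num_rows, expr.op == Op::AND);
    for (const auto& child : expr.children) {
        const RowBitmap& in = ctx.results.at(child.get());
        for (std::size_t i = 0; i < out.size(); ++i) {
            if (expr.op == Op::AND) {
                out[i] = out[i] && in[i];
            } else if (expr.op == Op::OR) {
                out[i] = out[i] || in[i];
            } else {
                out[i] = !in[i];  // NOT: single child
            }
        }
    }
    ctx.results[&expr] = std::move(out);
    return true;
}
```

In this sketch, a later execution step could check `ctx.has_result(&expr)` and reuse the cached bitmap instead of re-evaluating the predicate on column data, which is the kind of quick lookup the description mentions.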
run buildall
Thank you for your contribution to Apache Doris. Since 2024-03-18, the Document has been moved to doris-website.
clang-tidy made some suggestions
@@ -53,7 +54,107 @@ class VCompoundPred : public VectorizedFnCall {

const std::string& expr_name() const override { return _expr_name; }

Status evaluate_inverted_index(VExprContext* context, uint32_t segment_num_rows) override {
warning: function 'evaluate_inverted_index' exceeds recommended size/complexity thresholds [readability-function-size]
Status evaluate_inverted_index(VExprContext* context, uint32_t segment_num_rows) override {
^
Additional context
be/src/vec/exprs/vcompound_pred.h:56: 95 lines including whitespace and comments (threshold 80)
Status evaluate_inverted_index(VExprContext* context, uint32_t segment_num_rows) override {
^
}
return Status::OK();
}

Status execute(VExprContext* context, Block* block, int* result_column_id) override {
warning: function 'execute' has cognitive complexity of 112 (threshold 50) [readability-function-cognitive-complexity]
Status execute(VExprContext* context, Block* block, int* result_column_id) override {
^
Additional context
be/src/vec/exprs/vcompound_pred.h:154: +1, including nesting penalty of 0, nesting level increased to 1
if (_can_fast_execute && fast_execute(context, block, result_column_id)) {
^
be/src/vec/exprs/vcompound_pred.h:154: +1
if (_can_fast_execute && fast_execute(context, block, result_column_id)) {
^
be/src/vec/exprs/vcompound_pred.h:157: +1, including nesting penalty of 0, nesting level increased to 1
if (children().size() == 1 || !_all_child_is_compound_and_not_const()) {
^
be/src/vec/exprs/vcompound_pred.h:163: +1, including nesting penalty of 0, nesting level increased to 1
RETURN_IF_ERROR(_children[0]->execute(context, block, &lhs_id));
^
be/src/common/status.h:619: expanded from macro 'RETURN_IF_ERROR'
do { \
^
be/src/vec/exprs/vcompound_pred.h:163: +2, including nesting penalty of 1, nesting level increased to 2
RETURN_IF_ERROR(_children[0]->execute(context, block, &lhs_id));
^
be/src/common/status.h:621: expanded from macro 'RETURN_IF_ERROR'
if (UNLIKELY(!_status_.ok())) { \
^
be/src/vec/exprs/vcompound_pred.h:175: +1, including nesting penalty of 0, nesting level increased to 1
if (lhs_is_nullable) {
^
be/src/vec/exprs/vcompound_pred.h:189: nesting level increased to 1
auto get_rhs_colum = [&]() {
^
be/src/vec/exprs/vcompound_pred.h:190: +2, including nesting penalty of 1, nesting level increased to 2
if (rhs_id == -1) {
^
be/src/vec/exprs/vcompound_pred.h:191: +3, including nesting penalty of 2, nesting level increased to 3
RETURN_IF_ERROR(_children[1]->execute(context, block, &rhs_id));
^
be/src/common/status.h:619: expanded from macro 'RETURN_IF_ERROR'
do { \
^
be/src/vec/exprs/vcompound_pred.h:191: +4, including nesting penalty of 3, nesting level increased to 4
RETURN_IF_ERROR(_children[1]->execute(context, block, &rhs_id));
^
be/src/common/status.h:621: expanded from macro 'RETURN_IF_ERROR'
if (UNLIKELY(!_status_.ok())) { \
^
be/src/vec/exprs/vcompound_pred.h:201: +3, including nesting penalty of 2, nesting level increased to 3
if (rhs_is_nullable) {
^
be/src/vec/exprs/vcompound_pred.h:209: nesting level increased to 1
auto return_result_column_id = [&](ColumnPtr res_column, int res_id) -> int {
^
be/src/vec/exprs/vcompound_pred.h:210: +2, including nesting penalty of 1, nesting level increased to 2
if (result_is_nullable && !res_column->is_nullable()) {
^
be/src/vec/exprs/vcompound_pred.h:210: +1
if (result_is_nullable && !res_column->is_nullable()) {
^
be/src/vec/exprs/vcompound_pred.h:219: nesting level increased to 1
auto create_null_map_column = [&](ColumnPtr& null_map_column,
^
be/src/vec/exprs/vcompound_pred.h:221: +2, including nesting penalty of 1, nesting level increased to 2
if (null_map_data == nullptr) {
^
be/src/vec/exprs/vcompound_pred.h:230: nesting level increased to 1
auto vector_vector_null = [&]<bool is_and_op>() {
^
be/src/vec/exprs/vcompound_pred.h:240: +2, including nesting penalty of 1, nesting level increased to 2
if constexpr (is_and_op) {
^
be/src/vec/exprs/vcompound_pred.h:241: +3, including nesting penalty of 2, nesting level increased to 3
for (size_t i = 0; i < size; ++i) {
^
be/src/vec/exprs/vcompound_pred.h:246: +1, nesting level increased to 2
} else {
^
be/src/vec/exprs/vcompound_pred.h:247: +3, including nesting penalty of 2, nesting level increased to 3
for (size_t i = 0; i < size; ++i) {
^
be/src/vec/exprs/vcompound_pred.h:260: +1, including nesting penalty of 0, nesting level increased to 1
if (_op == TExprOpcode::COMPOUND_AND) {
^
be/src/vec/exprs/vcompound_pred.h:263: +2, including nesting penalty of 1, nesting level increased to 2
if ((lhs_all_false && !lhs_is_nullable) || (lhs_all_false && lhs_all_is_not_null)) {
^
be/src/vec/exprs/vcompound_pred.h:263: +1
if ((lhs_all_false && !lhs_is_nullable) || (lhs_all_false && lhs_all_is_not_null)) {
^
be/src/vec/exprs/vcompound_pred.h:263: +1
if ((lhs_all_false && !lhs_is_nullable) || (lhs_all_false && lhs_all_is_not_null)) {
^
be/src/vec/exprs/vcompound_pred.h:263: +1
if ((lhs_all_false && !lhs_is_nullable) || (lhs_all_false && lhs_all_is_not_null)) {
^
be/src/vec/exprs/vcompound_pred.h:266: +1, nesting level increased to 2
} else {
^
be/src/vec/exprs/vcompound_pred.h:267: +3, including nesting penalty of 2, nesting level increased to 3
RETURN_IF_ERROR(get_rhs_colum());
^
be/src/common/status.h:619: expanded from macro 'RETURN_IF_ERROR'
do { \
^
be/src/vec/exprs/vcompound_pred.h:267: +4, including nesting penalty of 3, nesting level increased to 4
RETURN_IF_ERROR(get_rhs_colum());
^
be/src/common/status.h:621: expanded from macro 'RETURN_IF_ERROR'
if (UNLIKELY(!_status_.ok())) { \
^
be/src/vec/exprs/vcompound_pred.h:269: +3, including nesting penalty of 2, nesting level increased to 3
if ((lhs_all_true && !lhs_is_nullable) || //not null column
^
be/src/vec/exprs/vcompound_pred.h:269: +1
if ((lhs_all_true && !lhs_is_nullable) || //not null column
^
be/src/vec/exprs/vcompound_pred.h:269: +1
if ((lhs_all_true && !lhs_is_nullable) || //not null column
^
be/src/vec/exprs/vcompound_pred.h:270: +1
(lhs_all_true && lhs_all_is_not_null)) { //nullable column
^
be/src/vec/exprs/vcompound_pred.h:273: +1, nesting level increased to 3
} else if ((rhs_all_false && !rhs_is_nullable) ||
^
be/src/vec/exprs/vcompound_pred.h:273: +1
} else if ((rhs_all_false && !rhs_is_nullable) ||
^
be/src/vec/exprs/vcompound_pred.h:273: +1
} else if ((rhs_all_false && !rhs_is_nullable) ||
^
be/src/vec/exprs/vcompound_pred.h:274: +1
(rhs_all_false && rhs_all_is_not_null)) {
^
be/src/vec/exprs/vcompound_pred.h:277: +1, nesting level increased to 3
} else if ((rhs_all_true && !rhs_is_nullable) ||
^
be/src/vec/exprs/vcompound_pred.h:277: +1
} else if ((rhs_all_true && !rhs_is_nullable) ||
^
be/src/vec/exprs/vcompound_pred.h:277: +1
} else if ((rhs_all_true && !rhs_is_nullable) ||
^
be/src/vec/exprs/vcompound_pred.h:278: +1
(rhs_all_true && rhs_all_is_not_null)) {
^
be/src/vec/exprs/vcompound_pred.h:281: +1, nesting level increased to 3
} else {
^
be/src/vec/exprs/vcompound_pred.h:282: +4, including nesting penalty of 3, nesting level increased to 4
if (!result_is_nullable) {
^
be/src/vec/exprs/vcompound_pred.h:284: +5, including nesting penalty of 4, nesting level increased to 5
for (size_t i = 0; i < size; i++) {
^
be/src/vec/exprs/vcompound_pred.h:287: +1, nesting level increased to 4
} else {
^
be/src/vec/exprs/vcompound_pred.h:292: +1, nesting level increased to 1
} else if (_op == TExprOpcode::COMPOUND_OR) {
^
be/src/vec/exprs/vcompound_pred.h:295: +2, including nesting penalty of 1, nesting level increased to 2
if ((lhs_all_true && !lhs_is_nullable) || (lhs_all_true && lhs_all_is_not_null)) {
^
be/src/vec/exprs/vcompound_pred.h:295: +1
if ((lhs_all_true && !lhs_is_nullable) || (lhs_all_true && lhs_all_is_not_null)) {
^
be/src/vec/exprs/vcompound_pred.h:295: +1
if ((lhs_all_true && !lhs_is_nullable) || (lhs_all_true && lhs_all_is_not_null)) {
^
be/src/vec/exprs/vcompound_pred.h:295: +1
if ((lhs_all_true && !lhs_is_nullable) || (lhs_all_true && lhs_all_is_not_null)) {
^
be/src/vec/exprs/vcompound_pred.h:298: +1, nesting level increased to 2
} else {
^
be/src/vec/exprs/vcompound_pred.h:299: +3, including nesting penalty of 2, nesting level increased to 3
RETURN_IF_ERROR(get_rhs_colum());
^
be/src/common/status.h:619: expanded from macro 'RETURN_IF_ERROR'
do { \
^
be/src/vec/exprs/vcompound_pred.h:299: +4, including nesting penalty of 3, nesting level increased to 4
RETURN_IF_ERROR(get_rhs_colum());
^
be/src/common/status.h:621: expanded from macro 'RETURN_IF_ERROR'
if (UNLIKELY(!_status_.ok())) { \
^
be/src/vec/exprs/vcompound_pred.h:300: +3, including nesting penalty of 2, nesting level increased to 3
if ((lhs_all_false && !lhs_is_nullable) || (lhs_all_false && lhs_all_is_not_null)) {
^
be/src/vec/exprs/vcompound_pred.h:300: +1
if ((lhs_all_false && !lhs_is_nullable) || (lhs_all_false && lhs_all_is_not_null)) {
^
be/src/vec/exprs/vcompound_pred.h:300: +1
if ((lhs_all_false && !lhs_is_nullable) || (lhs_all_false && lhs_all_is_not_null)) {
^
be/src/vec/exprs/vcompound_pred.h:300: +1
if ((lhs_all_false && !lhs_is_nullable) || (lhs_all_false && lhs_all_is_not_null)) {
^
be/src/vec/exprs/vcompound_pred.h:303: +1, nesting level increased to 3
} else if ((rhs_all_true && !rhs_is_nullable) ||
^
be/src/vec/exprs/vcompound_pred.h:303: +1
} else if ((rhs_all_true && !rhs_is_nullable) ||
^
be/src/vec/exprs/vcompound_pred.h:303: +1
} else if ((rhs_all_true && !rhs_is_nullable) ||
^
be/src/vec/exprs/vcompound_pred.h:304: +1
(rhs_all_true && rhs_all_is_not_null)) {
^
be/src/vec/exprs/vcompound_pred.h:307: +1, nesting level increased to 3
} else if ((rhs_all_false && !rhs_is_nullable) ||
^
be/src/vec/exprs/vcompound_pred.h:307: +1
} else if ((rhs_all_false && !rhs_is_nullable) ||
^
be/src/vec/exprs/vcompound_pred.h:307: +1
} else if ((rhs_all_false && !rhs_is_nullable) ||
^
be/src/vec/exprs/vcompound_pred.h:308: +1
(rhs_all_false && rhs_all_is_not_null)) {
^
be/src/vec/exprs/vcompound_pred.h:311: +1, nesting level increased to 3
} else {
^
be/src/vec/exprs/vcompound_pred.h:312: +4, including nesting penalty of 3, nesting level increased to 4
if (!result_is_nullable) {
^
be/src/vec/exprs/vcompound_pred.h:314: +5, including nesting penalty of 4, nesting level increased to 5
for (size_t i = 0; i < size; i++) {
^
be/src/vec/exprs/vcompound_pred.h:317: +1, nesting level increased to 4
} else {
^
be/src/vec/exprs/vcompound_pred.h:322: +1, nesting level increased to 1
} else {
^
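Much of the score above comes from the repeated `(x_all_false && !x_is_nullable) || (x_all_false && x_all_is_not_null)` conditions. One possible way to lower it, shown here only as a sketch, is to factor that test into small helpers. `ColumnSummary`, `definitely_all_true`, and `definitely_all_false` are hypothetical names that do not appear in the PR; the helpers rely on the identity `(a && !n) || (a && k) == a && (!n || k)`.

```cpp
// Hypothetical summary of one evaluated child column; the field names mirror
// the local variables visible in the clang-tidy trace (lhs_all_true, ...).
struct ColumnSummary {
    bool all_true;
    bool all_false;
    bool is_nullable;
    bool all_is_not_null;
};

// True when every row is known to be TRUE and no row can be NULL.
// Equivalent to (all_true && !is_nullable) || (all_true && all_is_not_null).
constexpr bool definitely_all_true(const ColumnSummary& c) {
    return c.all_true && (!c.is_nullable || c.all_is_not_null);
}

// True when every row is known to be FALSE and no row can be NULL.
// Equivalent to (all_false && !is_nullable) || (all_false && all_is_not_null).
constexpr bool definitely_all_false(const ColumnSummary& c) {
    return c.all_false && (!c.is_nullable || c.all_is_not_null);
}
```

With helpers like these, each branch in `execute` would read as a single call such as `definitely_all_false(lhs)`, so the boolean operators would be counted once inside the helper rather than at every call site.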
size_t input_rows_count, const std::string& function_name) {
if (!_enable_inverted_index_query) {
return false;
Status VExpr::_evaluate_inverted_index(VExprContext* context, const FunctionBasePtr& function,
warning: function '_evaluate_inverted_index' exceeds recommended size/complexity thresholds [readability-function-size]
Status VExpr::_evaluate_inverted_index(VExprContext* context, const FunctionBasePtr& function,
^
Additional context
be/src/vec/exprs/vexpr.cpp:604: 106 lines including whitespace and comments (threshold 80)
Status VExpr::_evaluate_inverted_index(VExprContext* context, const FunctionBasePtr& function,
^
}
VLOG_DEBUG << "begin to execute match directly, column_name=" << column_name
<< ", match_query_str=" << match_query_str;
InvertedIndexCtx* inverted_index_ctx = reinterpret_cast<InvertedIndexCtx*>(
warning: use auto when initializing with a cast to avoid duplicating the type name [modernize-use-auto]
InvertedIndexCtx* inverted_index_ctx = reinterpret_cast<InvertedIndexCtx*>(
auto* inverted_index_ctx = reinterpret_cast<InvertedIndexCtx*>(
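The `modernize-use-auto` suggestion can be illustrated in isolation. The sketch below uses a hypothetical `InvertedIndexCtx` with a made-up `parser_type` field and a hypothetical `read_parser_type` helper, since the cast's real argument is cut off in the excerpt above.

```cpp
#include <cassert>

// Hypothetical stand-in for the real InvertedIndexCtx; the field is illustrative only.
struct InvertedIndexCtx {
    int parser_type = 0;
};

// Reads a field through an untyped state pointer (hypothetical helper).
inline int read_parser_type(void* state) {
    assert(state != nullptr);
    // Before: InvertedIndexCtx* inverted_index_ctx =
    //             reinterpret_cast<InvertedIndexCtx*>(state);
    // After: the type is spelled once, in the cast, and `auto*` still makes
    // it visible that the variable is a pointer.
    auto* inverted_index_ctx = reinterpret_cast<InvertedIndexCtx*>(state);
    return inverted_index_ctx->parser_type;
}
```

The deduced type is identical in both forms, so the change is purely about readability.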
TeamCity be ut coverage result: |
cherry pick from #38908