Skip to content
This repository has been archived by the owner on Dec 1, 2022. It is now read-only.

expand logic expression for index scan #465

Merged
merged 4 commits into from
Dec 15, 2020

Conversation

bright-starry-sky
Copy link
Contributor

Analyze complex expression and expand the expression for index scan. for example :
(t1.c1 == 1 or t1.c2 == 2) and (t1.c3 == 3 or t1.c4 == 4)
Will be converted to:
(t1.c1==1 AND t1.c3==3) OR (t1.c1==1 AND t1.c4==4) OR (t1.c2==2 AND t1.c3==3) OR (t1.c2==2 AND t1.c4==4)

@bright-starry-sky bright-starry-sky requested review from a team December 9, 2020 13:38
@bright-starry-sky bright-starry-sky added the ready-for-testing PR: ready for the CI test label Dec 10, 2020
@critical27
Copy link
Contributor

critical27 commented Dec 10, 2020

Interesting, this really looks like math.... (A + B)(C + D) = AC + AD + BC + BD...

@Shylock-Hg
Copy link
Contributor

Well done. After expand some expression maybe evaluate multiple times, so how do you treat the expression with side affection (or which expression not pure).

@bright-starry-sky
Copy link
Contributor Author

Interesting, this really looks like math.... (A + B)(C + D) = AC + AD + BC + BD...

yeah, you are right . used for index scan in storage layer. An OR logic corresponds to an IndexQueryContext.

@bright-starry-sky
Copy link
Contributor Author

Well done. After expand some expression maybe evaluate multiple times, so how do you treat the expression with side affection (or which expression not pure).

You mean how do make sure correct of sub expressions after expanded?

Comment on lines +133 to +135
if (ops[0]->kind() == Expression::Kind::kLogicalOr ||
ops[1]->kind() == Expression::Kind::kLogicalOr) {
return expandExpr(target[0].get());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you give an example when do we hit this path? I suppose all expression of op has been expanded in expandImplAnd.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you give an example when do we hit this path? I suppose all expression of op has been expanded in expandImplAnd.

t1.c1 == 1 and t1.c2 == 2 and (t1.c4 == 4 or t1.c5 == 5)

nebula::graph::ExpressionUtils::expandExpr ExpressionUtils.cpp:135
nebula::graph::ExpressionUtils::expandImplAnd ExpressionUtils.cpp:156
nebula::graph::ExpressionUtils::expandExpr ExpressionUtils.cpp:121
nebula::graph::ExpressionUtilsTest_expandExpression_Test::TestBody ExpressionUtilsTest.cpp:507
void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) 0x0000000002f58ce3
void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) 0x0000000002f5340c
testing::Test::Run() 0x0000000002f38e9c
testing::TestInfo::Run() 0x0000000002f39718
testing::TestCase::Run() 0x0000000002f39d69
testing::internal::UnitTestImpl::RunAllTests() 0x0000000002f4089a
bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) 0x0000000002f59dea
bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) 0x0000000002f541f0
testing::UnitTest::Run() 0x0000000002f3f5a0
RUN_ALL_TESTS() 0x0000000002f65fdc
main 0x0000000002f65f6a
__libc_start_main 0x00007ffff7c86f43
_start 0x0000000001e0142e

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Convert AB(C + D) to ABC + ABD)

@Shylock-Hg
Copy link
Contributor

Well done. After expand some expression maybe evaluate multiple times, so how do you treat the expression with side affection (or which expression not pure).

You mean how do make sure correct of sub expressions after expanded?

Yes In your example t1.c1 will expand from one to two (evaluate twice).

@bright-starry-sky
Copy link
Contributor Author

Well done. After expand some expression maybe evaluate multiple times, so how do you treat the expression with side affection (or which expression not pure).

You mean how do make sure correct of sub expressions after expanded?

Yes In your example t1.c1 will expand from one to two (evaluate twice).

Well done. After expand some expression maybe evaluate multiple times, so how do you treat the expression with side affection (or which expression not pure).

You mean how do make sure correct of sub expressions after expanded?

Yes In your example t1.c1 will expand from one to two (evaluate twice).

I understand your question,don't need to evaluate it twice,for example :
(t1.c1 == 1 or t1.c1== 2) and t1.c2 == 3
expanded:
(t1.c1 == 1 and t1. c2 == 3) or (t1.c1 == 2 and t1.c2 == 3)

for index scan, will be executed twice, don't need to expression.eval()
prefix(12) union prefix(13)

@Shylock-Hg
Copy link
Contributor

Well done. After expand some expression maybe evaluate multiple times, so how do you treat the expression with side affection (or which expression not pure).

You mean how do make sure correct of sub expressions after expanded?

Yes In your example t1.c1 will expand from one to two (evaluate twice).

Well done. After expand some expression maybe evaluate multiple times, so how do you treat the expression with side affection (or which expression not pure).

You mean how do make sure correct of sub expressions after expanded?

Yes In your example t1.c1 will expand from one to two (evaluate twice).

I understand your question,don't need to evaluate it twice,for example :
(t1.c1 == 1 or t1.c1== 2) and t1.c2 == 3
expanded:
(t1.c1 == 1 and t1. c2 == 3) or (t1.c1 == 2 and t1.c2 == 3)

for index scan, will be executed twice, don't need to expression.eval()
prefix(12) union prefix(13)

Yes, if you check the t1.c1 is just a attribute expression.

yixinglu
yixinglu previously approved these changes Dec 10, 2020
@yixinglu
Copy link
Contributor

Actually, I am really confused about the logical expression unfolding implementation. I think you can implement the logic recursively like following pseudo codes:

Expression *expandAnd(Expression *root) {
  if (root->kind() != Expression::Kind::kAnd) {
    return root->clone();
  }
  auto operands = root->operands();
  // (A OR B) AND C => (A AND C) OR (B AND C)
  if (operands[0]->kind() == Expression::Kind::kOr) {
    auto orLeftChild = operands[0]->operands[0];
    auto orRightChild = operands[0]->operands[1];
    auto left = new LogicalExpression(Kind::kAnd, orLeftChild, operands[1]);
    auto right = new LogicalExpression(Kind::kAnd, orRightChild, operands[1]);
    return new LogicalExpression(Kind::kOr, expandAnd(left), expandAnd(right));
  }
  // A AND (B OR C) => (A AND B) OR (A AND C)
  if (operands[1]->kind() == Expression::Kind::kOr) {
    auto orLeftChild = operands[1]->operands[0];
    auto orRightChild = operands[1]->operands[1];
    auto left = new LogicalExpression(Kind::kAnd, orLeftChild, operands[0]);
    auto right = new LogicalExpression(Kind::kAnd, orRightChild, operands[0]);
    return new LogicalExpression(Kind::kOr, expandAnd(left), expandAnd(right));
  }

  return root->expr();
}

@yixinglu yixinglu self-requested a review December 10, 2020 06:57
@bright-starry-sky
Copy link
Contributor Author

Actually, I am really confused about the logical expression unfolding implementation. I think you can implement the logic recursively like following pseudo codes:

Expression *expandAnd(Expression *root) {
  if (root->kind() != Expression::Kind::kAnd) {
    return root->clone();
  }
  auto operands = root->operands();
  // (A OR B) AND C => (A AND C) OR (B AND C)
  if (operands[0]->kind() == Expression::Kind::kOr) {
    auto orLeftChild = operands[0]->operands[0];
    auto orRightChild = operands[0]->operands[1];
    auto left = new LogicalExpression(Kind::kAnd, orLeftChild, operands[1]);
    auto right = new LogicalExpression(Kind::kAnd, orRightChild, operands[1]);
    return new LogicalExpression(Kind::kOr, expandAnd(left), expandAnd(right));
  }
  // A AND (B OR C) => (A AND B) OR (A AND C)
  if (operands[1]->kind() == Expression::Kind::kOr) {
    auto orLeftChild = operands[1]->operands[0];
    auto orRightChild = operands[1]->operands[1];
    auto left = new LogicalExpression(Kind::kAnd, orLeftChild, operands[0]);
    auto right = new LogicalExpression(Kind::kAnd, orRightChild, operands[0]);
    return new LogicalExpression(Kind::kOr, expandAnd(left), expandAnd(right));
  }

  return root->expr();
}

Interesting logic, How to deal with the following logic?
(A OR B OR C OR D) AND E

@yixinglu
Copy link
Contributor

Actually, I am really confused about the logical expression unfolding implementation. I think you can implement the logic recursively like following pseudo codes:

Expression *expandAnd(Expression *root) {
  if (root->kind() != Expression::Kind::kAnd) {
    return root->clone();
  }
  auto operands = root->operands();
  // (A OR B) AND C => (A AND C) OR (B AND C)
  if (operands[0]->kind() == Expression::Kind::kOr) {
    auto orLeftChild = operands[0]->operands[0];
    auto orRightChild = operands[0]->operands[1];
    auto left = new LogicalExpression(Kind::kAnd, orLeftChild, operands[1]);
    auto right = new LogicalExpression(Kind::kAnd, orRightChild, operands[1]);
    return new LogicalExpression(Kind::kOr, expandAnd(left), expandAnd(right));
  }
  // A AND (B OR C) => (A AND B) OR (A AND C)
  if (operands[1]->kind() == Expression::Kind::kOr) {
    auto orLeftChild = operands[1]->operands[0];
    auto orRightChild = operands[1]->operands[1];
    auto left = new LogicalExpression(Kind::kAnd, orLeftChild, operands[0]);
    auto right = new LogicalExpression(Kind::kAnd, orRightChild, operands[0]);
    return new LogicalExpression(Kind::kOr, expandAnd(left), expandAnd(right));
  }

  return root->expr();
}

Interesting logic, How to deal with the following logic?
(A OR B OR C OR D) AND E

Similar logic, only change the or Logical expression construction like this:

  // (A OR B OR C OR D) AND E => (A AND E) OR (B AND E) OR (C AND E) OR (D AND E)
  if (operands[0]->kind() == Expression::Kind::kOr) {
    auto leftChildOperands = operands[0]->operands();
    auto newRoot = new LogicalExpression(Kind::kOr);
    for (auto& left : leftChildOperands) {
      auto operand = new LogicalExpression(Kind::kAnd, left, operands[1]);
      newRoot->addOperand(expandAnd(operand));
    }
    return newRoot;
  }

@yixinglu yixinglu dismissed their stale review December 10, 2020 07:31

continue to talk about implementation

@bright-starry-sky
Copy link
Contributor Author

Actually, I am really confused about the logical expression unfolding implementation. I think you can implement the logic recursively like following pseudo codes:

Expression *expandAnd(Expression *root) {
  if (root->kind() != Expression::Kind::kAnd) {
    return root->clone();
  }
  auto operands = root->operands();
  // (A OR B) AND C => (A AND C) OR (B AND C)
  if (operands[0]->kind() == Expression::Kind::kOr) {
    auto orLeftChild = operands[0]->operands[0];
    auto orRightChild = operands[0]->operands[1];
    auto left = new LogicalExpression(Kind::kAnd, orLeftChild, operands[1]);
    auto right = new LogicalExpression(Kind::kAnd, orRightChild, operands[1]);
    return new LogicalExpression(Kind::kOr, expandAnd(left), expandAnd(right));
  }
  // A AND (B OR C) => (A AND B) OR (A AND C)
  if (operands[1]->kind() == Expression::Kind::kOr) {
    auto orLeftChild = operands[1]->operands[0];
    auto orRightChild = operands[1]->operands[1];
    auto left = new LogicalExpression(Kind::kAnd, orLeftChild, operands[0]);
    auto right = new LogicalExpression(Kind::kAnd, orRightChild, operands[0]);
    return new LogicalExpression(Kind::kOr, expandAnd(left), expandAnd(right));
  }

  return root->expr();
}

Interesting logic, How to deal with the following logic?
(A OR B OR C OR D) AND E

Similar logic, only change the or Logical expression construction like this:

  // (A OR B OR C OR D) AND E => (A AND E) OR (B AND E) OR (C AND E) OR (D AND E)
  if (operands[0]->kind() == Expression::Kind::kOr) {
    auto leftChildOperands = operands[0]->operands();
    auto newRoot = new LogicalExpression(Kind::kOr);
    for (auto& left : leftChildOperands) {
      auto operand = new LogicalExpression(Kind::kAnd, left, operands[1]);
      newRoot->addOperand(expandAnd(operand));
    }
    return newRoot;
  }

Actually, I don't quite understand the logic of this pseudo code, According to this pseudo code logic, the output should be:
((A OR B OR C) AND E ) OR (D AND E)
not is : (A AND E) OR (B AND E) OR (C AND E) OR (D AND E)

@critical27
Copy link
Contributor

critical27 commented Dec 10, 2020

I have no doubt for the implementation, LGTM.

I believe the problem comes from the expression may have more than two operands, this would make code not so elegant.
Maybe we can try function style later, for example std::reduce(expr->operands.begin(), expr->operands.end(), Cartesian Product) for and, and std::reduce(expr->operands.begin(), expr->operands.end(), MergeIntoOne) for or.

@yixinglu
Copy link
Contributor

I rethink the expr traversal logic from top to bottom, there's some errors for the case a && (b || c) && d. we should access the expr tree node and find the pattern A && (B || C) or (A || B) && C in the expr tree recursively from bottom to top.

I have no problems about this PR. LGTM.

@yixinglu yixinglu merged commit 11da663 into vesoft-inc:master Dec 15, 2020
@bright-starry-sky bright-starry-sky deleted the index_expression branch December 15, 2020 01:09
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
ready-for-testing PR: ready for the CI test
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants