Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop scan edge opt when build path #4053

Merged
merged 8 commits into from Jun 10, 2022

Conversation

wey-gu
Copy link
Contributor

@wey-gu wey-gu commented Mar 20, 2022

What type of PR is this?

  • bug
  • feature
  • enhancement

What problem(s) does this PR solve?

Issue(s) number: #3996

Description:

How do you solve it?

Stop scanV -> scanE optRule when buildPath

update: after eliminate AV being merged, the PR is changed during rebase:

  • Changed GetEdgeTransformRule:
    • Add a Project to be matched to exclude match p=()-[e]->() return p limit 3
    • Removed AV(AVorLimit --> Limit), as AV in this rule will end up losing limit-count to lead scanEdge(limit:-1) in match ()-[e]->() return e,e limit 3
  • Added GetEdgesTransformAppendVerticesLimitRule: (SV, TR, AV, L, Prj) ->(SE, Prj, AV, L, Prj)
    • doing the same thing as the changed GetEdgeTransformRule for match ()-[e]->() return e limit 3

previous version before eliminate AV being merged

  • delete: av->traverse->scanV
  • add: project-> limit -> av->traverse->sv
    • exclude optRule for buildPath in project.colums()

Special notes for your reviewer, ex. impact of this fix, design document, etc:

(root@nebula) [basketballplayer]> show tag indexes
+------------------+----------+----------+
| Index Name       | By Tag   | Columns  |
+------------------+----------+----------+
| "player_index_0" | "player" | []       |
| "player_index_1" | "player" | ["name"] |
+------------------+----------+----------+
Got 2 rows (time spent 2891/3892 us)

## before this PR, below return path was pushed down, while it cannot be pushed down, we should prevent it.
(root@nebula) [basketballplayer]> match p=(:`team`)-->() return p limit 3
+----------+
| p        |
+----------+
| BAD_TYPE |
| BAD_TYPE |
| BAD_TYPE |
+----------+
Got 3 rows (time spent 6144/26093 us)

## afterwards

# with this PR, this will not be optimized to scanE, which led the wrong pushdown, instead it use scanV(and here for p=(:`player`)-->() return p, it will leverage tag-index)

(root@nebula) [basketballplayer]> match p=(:`team`)-->() return p limit 3
[ERROR (-1005)]: Scan vertices or edges need to specify a limit number, or limit number can not push down.

Sun, 20 Mar 2022 07:24:38 UTC

Checklist:

Tests:

  • Unit test(positive and negative cases)
  • Function test
  • Performance test
  • N/A
  • Manual Test

Affects:

  • Documentation affected (Please add the label if documentation needs to be modified.)
  • Incompatibility (If it breaks the compatibility, please describe it and add the label.)
  • If it's needed to cherry-pick (If cherry-pick to some branches is required, please label the destination version(s).)
  • Performance impacted: Consumes more CPU/Memory

Release notes:

Please confirm whether to be reflected in release notes and how to describe:

ex. Fixed the bug .....

Fixed the bug on the indexless case of match p=(:team)--() return p limit 1 end up with badtype, instead, it should be rejected due to being unable to pushdown.

@wey-gu wey-gu requested a review from Shylock-Hg March 20, 2022 07:31
@wey-gu wey-gu changed the title Stop scan e opt when build path Stop scan edge opt when build path Mar 20, 2022
@wey-gu wey-gu added ready-for-testing PR: ready for the CI test ready for review labels Mar 21, 2022
@Shylock-Hg
Copy link
Contributor

Format the code by make clang-format

@wey-gu
Copy link
Contributor Author

wey-gu commented Mar 21, 2022

Format the code by make clang-format

will do so, thanks!

@wey-gu
Copy link
Contributor Author

wey-gu commented Mar 21, 2022

Now I see a regression case:

E   AssertionError: Fail to exec: 

MATCH ()-[e:is_teacher]->() RETURN type(e) AS Type, e.start_year AS StartYear, e.end_year AS EndYear LIMIT 3

error: Scan vertices or edges need to specify a limit number, or limit number can not push down.

The root cause is when returning more than once of e, even like return e, e, the opt rules were applied differently, where, the limit was reset to -1 for some reasons(still don't) know why and I am continually checking on this now.

@Sophie-Xie Sophie-Xie added this to the v3.1.0 milestone Mar 21, 2022
@wey-gu wey-gu added ready-for-testing PR: ready for the CI test and removed ready-for-testing PR: ready for the CI test labels Mar 21, 2022
@wey-gu wey-gu force-pushed the stop_scanE_opt_when_buildPath branch from cd8a356 to 290eb7e Compare March 27, 2022 08:28
@wey-gu wey-gu removed the ready-for-testing PR: ready for the CI test label Mar 27, 2022
@wey-gu wey-gu force-pushed the stop_scanE_opt_when_buildPath branch from 290eb7e to 71b5ddc Compare March 27, 2022 13:19
@wey-gu
Copy link
Contributor Author

wey-gu commented Mar 27, 2022

I am stuck here as when the return part comes with more than one e with limit, it won't be matched to prj->limit->av->traverse->scanVertices, instead it matched the pattern of prj-->av->traverse->scanVertices, where the limit information cannot be fetched(for all node, it's -1).

The strange thing is the final opt plan with explain MATCH ()-[e:serve]->() RETURN e,properties(e) limit 1 is prj->limit->av->traverse->scanVertices, if I disabled prj-->av->traverse->scanVertices, although it will be never matched to this patter.

I will talk to @Shylock-Hg closely tomorrow.

I may not be catching up with this release cycle but I will try my best to do it.

Sorry.

@wey-gu
Copy link
Contributor Author

wey-gu commented Mar 29, 2022

After discussions with @Shylock-Hg and @yixinglu, something should be wrong somewhere else that had led to the pattern not being mapped, will look into it later.

@Sophie-Xie Sophie-Xie modified the milestones: v3.1.0, v3.2.0 Mar 30, 2022
@Sophie-Xie
Copy link
Contributor

Will you continue to fix it? It needs to be merged into version 3.2. Thanks.

@wey-gu
Copy link
Contributor Author

wey-gu commented May 11, 2022

Will you continue to fix it? It needs to be merged into version 3.2. Thanks.

I will look into it again before the end of July, in case collaboration is needed, will request it soon.

@wey-gu wey-gu force-pushed the stop_scanE_opt_when_buildPath branch 2 times, most recently from b64ffd7 to 04df98a Compare May 27, 2022 11:44
@wey-gu wey-gu added the ready-for-testing PR: ready for the CI test label May 27, 2022
@wey-gu
Copy link
Contributor Author

wey-gu commented May 27, 2022

Update:
Today I spent some time figuring out the issue before(when there were two rules GetEdgesTransformLimitRule and GetEdgesTransformRule):

issue:

  • MATCH ()-[e:serve]->() RETURN e LIMIT 3 works fine, it will be matched to new rule GetEdgesTransformLimitRule
  • MATCH ()-[e:serve]->() RETURN e,e LIMIT 3 will end up entering GetEdgesTransformRule now, due to GetEdgesTransformRule was changed to match with extra project, the traverse->limit() and all other nodes' limit will be -1, thus the scanEdge.limit will be -1, and due to the LIMIT isn't pushdown, the query cannot be made

answer:

  • removed GetEdgesTransformRule

Because the goal of GetEdgesTransformRule was only to address the case with a limit sample when the edge comes without indexes, thus I could remove GetEdgesTransformRule. Actually, the pattern can be matched to GetEdgesTransformLimitRule anyway when there is no GetEdgesTransformRule to short circuit it, and everything works as expected.

@wey-gu
Copy link
Contributor Author

wey-gu commented May 29, 2022

The UT Failure seems irrelevant

=================================== FAILURES ===================================
_______________________ test_lookuptest_yieldclausetest ________________________
[gw4] linux -- Python 3.8.10 /usr/bin/python3
/github/home/.local/lib/python3.8/site-packages/pytest_bdd/scenario.py:177: in scenario_wrapper
    _execute_scenario(feature, scenario, request, encoding)
/github/home/.local/lib/python3.8/site-packages/pytest_bdd/scenario.py:143: in _execute_scenario
    _execute_step_function(request, scenario, step, step_func)
/github/home/.local/lib/python3.8/site-packages/pytest_bdd/scenario.py:113: in _execute_step_function
    return_value = step_func(**kwargs)
tck/conftest.py:736: in execution_should_be_succ
    check_resp(rs, stmt)
common/utils.py:334: in check_resp
    assert resp.is_succeeded(), msg
E   AssertionError: Fail to exec: INSERT VERTEX student(number, age), teacher(number, age)  VALUES "220":(1, 20, 1, 30), "221":(2, 22, 2, 32), error: Storage Error: Tag not found.
___________________ test_indextest_rebuildtagindexstatusinfo ___________________

@wey-gu wey-gu force-pushed the stop_scanE_opt_when_buildPath branch from 04df98a to a268940 Compare May 31, 2022 10:00
@wey-gu wey-gu force-pushed the stop_scanE_opt_when_buildPath branch from 4f8aafe to 674dff1 Compare June 7, 2022 16:34
@wey-gu
Copy link
Contributor Author

wey-gu commented Jun 7, 2022

update: I managed to make it work in my env, just pushed to check UT(don't have time to run UT locally before going to bed), and will continue tomorrow.

@wey-gu wey-gu force-pushed the stop_scanE_opt_when_buildPath branch 2 times, most recently from acba653 to 4c4cb37 Compare June 7, 2022 16:42
@wey-gu
Copy link
Contributor Author

wey-gu commented Jun 7, 2022

Update: my own tck case failed, will(sorry) will check tomorrow morning.

@wey-gu wey-gu force-pushed the stop_scanE_opt_when_buildPath branch from 4c4cb37 to bf5839b Compare June 8, 2022 01:44
@wey-gu wey-gu force-pushed the stop_scanE_opt_when_buildPath branch 4 times, most recently from 4927b79 to 8f50f4e Compare June 9, 2022 10:57
@wey-gu wey-gu requested a review from Shylock-Hg June 9, 2022 11:49
@wey-gu wey-gu force-pushed the stop_scanE_opt_when_buildPath branch 2 times, most recently from e096ca7 to fd599ef Compare June 9, 2022 12:13
delete: av->traverse->scanV
add: project-> limit -> av->traverse->sv
add: project-> av ->traverse->sv

exclude optRule for buildPath in project.colums()

Comments added

lint: clang-format

[WIP] Fixing regression when return e, e.foo.bar

The limit info is lost in this case

Fix after vesoft-inc#4146

OptGroup was refactored in vesoft-inc#4146

Removed GetEdgesTransformRule
As this rule only aplied for
   match ()-[e]->() return e limit 1
when 1. there is no index on e and 2. limit cluase
exists, thus we could assert that AppendVertices
will be removed by EliminateAppendVerticesRule

//  Tranformation:
//  Before:
// +---------+---------+
// |      Project      |
// +---------+---------+
//           |
// +---------+---------+
// |       Limit       |
// +---------+---------+
//           |
// +---------+---------+
// |      Traverse     |
// +---------+---------+
//           |
// +---------+---------+
// |    ScanVertices   |
// +---------+---------+
//
//  After:
// +---------+---------+
// |      Project      |
// +---------+---------+
//           |
// +---------+---------+
// |       Limit       |
// +---------+---------+
//           |
// +---------+---------+
// |      Project      |
// +---------+---------+
//           |
// +---------+---------+
// |      ScanEdges    |
// +---------+---------+
@wey-gu wey-gu force-pushed the stop_scanE_opt_when_buildPath branch from fd599ef to bad018b Compare June 9, 2022 12:17
@Sophie-Xie Sophie-Xie merged commit fcbab77 into vesoft-inc:master Jun 10, 2022
@wey-gu wey-gu deleted the stop_scanE_opt_when_buildPath branch June 10, 2022 04:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready for review ready-for-testing PR: ready for the CI test
Projects
None yet
Development

Successfully merging this pull request may close these issues.

match p=(:foo)-->() return p returns empty instead of rejecting error on not being able to pushdown
4 participants