Support the tck format for the output of EXPLAIN #5274

Closed
xtcyclist opened this issue Jan 18, 2023 · 16 comments

@xtcyclist
Contributor

xtcyclist commented Jan 18, 2023

An idea from @Shylock-Hg

In the nebula-console, when explaining the query, support the specification of the tck format, so that the generated outputs could be used in tck feature files directly.

      EXPLAIN FORMAT = "TCK" MATCH (v1:player)-[:like*2..2]->(v2)-[e3:like]->(v4) where id(v1) == "Tony Parker"
      OPTIONAL MATCH (v3:player)-[:like]->(v1)<-[e5]-(v4)
      with v1, v2, e3, v4, e5, v3 where id(v3) == "Tim Duncan" or id(v3) is NULL
      return *
      | id | name           | dependencies | profiling data | operator info |
      | 22 | Project        | 18           |                |               |
      | 18 | Filter         | 14           |                |               |
      | 14 | HashLeftJoin   | 7,13         |                |               |
      | 7  | project        | 6            |                |               |
      | 6  | AppendVertices | 5            |                |               |
      | 5  | Traverse       | 20           |                |               |
      | 20 | Traverse       | 2            |                |               |
      | 2  | Dedup          | 1            |                |               |
      | 1  | PassThrough    | 3            |                |               |
      | 3  | Start          |              |                |               |
      | 13 | Project        | 21           |                |               |
      | 21 | AppendVertices | 11           |                |               |
      | 11 | Traverse       | 10           |                |               |
      | 10 | AppendVertices | 9            |                |               |
      | 9  | Traverse       | 8            |                |               |
      | 8  | Argument       |              |                |               |
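A minimal sketch of how a plan could be rendered in the pipe-delimited layout shown above. The function name `render_tck_table` and the row tuples are hypothetical and not taken from nebula-console; only the column names and the padding style follow the example.

```python
from typing import List, Sequence

# Column names as they appear in the example table above.
HEADER = ["id", "name", "dependencies", "profiling data", "operator info"]


def render_tck_table(rows: Sequence[Sequence[str]]) -> str:
    """Render plan rows as a pipe-delimited table padded to equal column widths."""
    table: List[List[str]] = [HEADER] + [[str(cell) for cell in row] for row in rows]
    # Each column is as wide as its longest cell, header included.
    widths = [max(len(row[i]) for row in table) for i in range(len(HEADER))]
    lines = []
    for row in table:
        cells = [cell.ljust(widths[i]) for i, cell in enumerate(row)]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)


# Hypothetical input: (id, name, dependencies, profiling data, operator info).
plan = [
    ("2", "Dedup", "1", "", ""),
    ("1", "PassThrough", "3", "", ""),
    ("3", "Start", "", "", ""),
]
print(render_tck_table(plan))
```

Left-padding every cell to the column width keeps the output diff-friendly when pasted into a feature file, at the cost of very wide lines when one cell is long.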
@xtcyclist xtcyclist added type/feature req Type: feature request good first issue Community: perfect as the first pull request labels Jan 18, 2023
@wey-gu
Contributor

wey-gu commented Jan 18, 2023

@AntiTopQuark
Contributor

Hello, has the task been assigned? If not, can I try it?

@Shylock-Hg
Contributor

Hello, has the task been assigned? If not, can I try it?

Welcome!

@wey-gu
Contributor

wey-gu commented Jan 18, 2023

Feel free to take it @AntiTopQuark, don't hesitate to ask for help here :)

@AntiTopQuark
Contributor

Feel free to take it @AntiTopQuark, don't hesitate to ask for help here :)

Thanks! I will try my best to do this well.

@swastik959

swastik959 commented Feb 1, 2023

Hi, I am new to open source and want to contribute to this issue. Could you please explain it and guide me?

@AntiTopQuark
Contributor

Hi, I am new to open source and want to contribute to this issue. Could you please explain it and guide me?

I'm very sorry, but this task has been assigned to me. May I ask whether you have already started?

@swastik959

Hi, I am new to open source and want to contribute to this issue. Could you please explain it and guide me?

I'm very sorry, but this task has been assigned to me. May I ask whether you have already started?

@AntiTopQuark no

@AntiTopQuark
Contributor

I'm very sorry, I've been a bit busy for a while and have only just finished this PR:
#5414
I have a question: when the output is specified in tck format, do we still need to print the execution plan?
The current situation is as follows:
2023-03-19-180945

Thank you very much!

@jievince
Contributor

I'm very sorry, I've been a bit busy for a while and have only just finished this PR: #5414. I have a question: when the output is specified in tck format, do we still need to print the execution plan? The current situation is as follows: 2023-03-19-180945

Thank you very much!

EXPLAIN only shows the execution plan and does not run the query,
while PROFILE runs the query and shows the execution plan.

So EXPLAIN should not actually run anything, and it should not have a result table.
Both of them should print the execution plan.

@AntiTopQuark
Contributor

Thank you very much for your response. I would like to confirm the behavior of this requirement again. @xtcyclist @jievince

  • For the case of explain format = 'tck', I think there are two options:
    • Only print the execution plan (although it contradicts the original requirement description), and print the execution plan in row format.
    • Do not support explain format = 'tck'.
  • For the case of profile format = 'tck', print the execution results in TCK format. Then, when printing the execution plan, should we use the TCK format or the row format?

@jievince
Contributor

Thank you very much for your response. I would like to confirm the behavior of this requirement again. @xtcyclist @jievince

  • For the case of explain format = 'tck', I think there are two options:

    • Only print the execution plan (although it contradicts the original requirement description), and print the execution plan in row format.
    • Do not support explain format = 'tck'.
  • For the case of profile format = 'tck', print the execution results in TCK format. Then, when printing the execution plan, should we use the TCK format or the row format?

I think the tck format applies not only to the execution plan but also to the result table.
So for explain format = 'tck', I think we should print the execution plan in tck format,
and for profile format = 'tck', I think we should print both the result table and the execution plan in tck format.

@AntiTopQuark @xtcyclist @Shylock-Hg , What do you think?

@xtcyclist
Contributor Author

xtcyclist commented Mar 20, 2023

Yes, I agree with @jievince.

The motivation of this issue is to make it easier for developers to add tck test cases like the following. Currently, a developer has to write the contents by hand, following the exact format, so that the test case can be recognized by the tck python script. This is very time-consuming for complex queries: one has to write more than about 10 lines of execution plan, with the names of all the operators and even some execution contexts in a JSON-like format.

  Scenario: Test profiling data format
    When profiling query:
      """
      GO 4 STEPS FROM 'Tim Duncan' OVER like YIELD like._dst AS dst | YIELD count(*)
      """
    Then the result should be, in any order:
      | count(*) |
      | 6        |
    And the execution plan should be:
      | id | name         | dependencies | profiling data                                                                                                                                                  | operator info     |
      | 7  | Aggregate    | 6            | {"version":0, "rows": 1}                                                                                                                                        |                   |
      | 6  | Project      | 5            | {"version":0, "rows": 6}                                                                                                                                        |                   |
      | 5  | GetNeighbors | 4            | {"version":0, "rows": 6, "resp[0]": {"vertices": 3}}                                                                                                            |                   |
      | 4  | Loop         | 0            | [{"version":0, "rows": 1},{"version":1, "rows": 1},{"version":2, "rows": 1},{"version":3, "rows": 1}]                                                           | {"loopBody": "3"} |
      | 3  | Dedup        | 2            | [{"version":0, "rows": 2},{"version":1, "rows": 3},{"version":2, "rows": 3}]                                                                                    |                   |
      | 2  | GetDstBySrc  | 1            | [{"version":0, "rows": 2, "resp[0]": {"vertices": 2}},{"version":1, "rows": 3, "resp[0]":{"vertices": 3}}, {"version":2, "rows": 3, "resp[0]":{"vertices": 3}}] |                   |
      | 1  | Start        |              | [{"version":0, "rows": 0},{"version":1, "rows": 0},{"version":2, "rows": 0}]                                                                                    |                   |
      | 0  | Start        |              | {"version":0, "rows": 0}                                                                                                                                        |                   |
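To illustrate why the exact format matters, here is a minimal sketch of reading such a plan table back into structured rows. It is not the actual tck python script: `parse_plan_table` is a hypothetical helper, and it assumes no cell contains a literal `|` character.

```python
import json
from typing import Dict, List


def parse_plan_table(text: str) -> List[Dict[str, str]]:
    """Parse a pipe-delimited plan table into one dict per operator row."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip blank lines and surrounding step text
        # Drop the outer pipes, then split into trimmed cells.
        rows.append([cell.strip() for cell in line.strip("|").split("|")])
    if not rows:
        return []
    header, body = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in body]


# Two rows copied from the scenario above, with the padding shortened.
table = """
| id | name      | dependencies | profiling data           | operator info |
| 7  | Aggregate | 6            | {"version":0, "rows": 1} |               |
| 0  | Start     |              | {"version":0, "rows": 0} |               |
"""
plan = parse_plan_table(table)
print(plan[0]["name"])
print(json.loads(plan[0]["profiling data"])["rows"])
```

Empty cells come back as empty strings, so a checker built on this could treat an empty `profiling data` or `operator info` cell as "nothing to verify" for that operator.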

@AntiTopQuark
Contributor

@xtcyclist @jievince
Thank you very much. I think the behavior you described is the most reasonable.
I have checked many .feature files, and most of them do not include profiling data. Only a few operators include some operator info, as shown in the figures below.
image
image
By contrast, an explain/profile query statement with format='row' prints the data in full:
image

Next, I would like to look at the relevant source code of tck testing to determine which specific information will be compared when performing the tck test. Then, I can determine what information needs to be printed when format='tck' is specified.

I expect to complete this task by this weekend.

@jievince
Contributor

The operator info written in a xxx.feature file is the part we care about, so that is the only part we want to verify.

@AntiTopQuark
Contributor

AntiTopQuark commented Mar 22, 2023

@jievince @xtcyclist @Shylock-Hg My current results are shown in the following figure. Currently,

  • When executing the command explain format='tck', both Profiling Data and Operator Info are empty.
  • When executing the command profile format='tck', 'execTime' and 'totalTime' will not be printed in Profiling Data because these two fields are related to execution time and machine conditions, and are not suitable for comparison. At the same time, I am not sure which fields should be printed to fill in the Operator Info, so Operator Info is still empty.
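The filtering described in the second bullet could be sketched as follows. This is an illustration only, not the code from the PR: `stable_profiling_data` and `VOLATILE_KEYS` are hypothetical names, and the sample values mirror the Project operator in the format="row" console output in this comment.

```python
import json
from typing import Any, Dict

# Assumption: only these two top-level fields are dropped, as named in the bullet above.
VOLATILE_KEYS = ("execTime", "totalTime")


def stable_profiling_data(stats: Dict[str, Any]) -> str:
    """Drop volatile timing fields and serialize the rest as compact JSON."""
    kept = {key: value for key, value in stats.items() if key not in VOLATILE_KEYS}
    # Sorted keys and compact separators give a deterministic single-line cell.
    return json.dumps(kept, sort_keys=True, separators=(",", ":"))


project_stats = {"execTime": "27(us)", "rows": 3, "totalTime": "30(us)", "version": 0}
print(stable_profiling_data(project_stats))  # {"rows":3,"version":0}
```

Nested timing values such as those under `resp[0]` are left untouched in this sketch, so a comparison against a feature file would still need to ignore or normalize them.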

Thank you in advance for your answer.

  • Is this situation feasible?
  • Do we need to fill in the Operator Info field?
(root@nebula) [my_space_1]> explain format="tck" FETCH PROP ON player "player_1","player_2","player_3" yield properties(vertex).name as name, properties(vertex).age as age;
Execution succeeded (time spent 261µs/613.718µs)

Execution Plan (optimize time 28 us)

                                                          
| id | name        | dependencies | profiling data | operator info |
|  2 | Project     | 1            |                |               |
|  1 | GetVertices | 0            |                |               |
|  0 | Start       |              |                |               |
                                                          

Wed, 22 Mar 2023 23:15:52 CST

(root@nebula) [my_space_1]> explain format="tck" show hosts;
Execution succeeded (time spent 158µs/496.504µs)

Execution Plan (optimize time 23 us)

                                                        
| id | name      | dependencies | profiling data | operator info |
|  1 | ShowHosts | 0            |                |               |
|  0 | Start     |              |                |               |
                                                        

Wed, 22 Mar 2023 23:16:02 CST

(root@nebula) [my_space_1]> profile format="tck" FETCH PROP ON player "player_1","player_2","player_3" yield properties(vertex).name as name, properties(vertex).age as age;
                  
| name         | age |
| "Piter Park" | 24  |
| "aaa"        | 24  |
| "ccc"        | 24  |
                  
Got 3 rows (time spent 1.474ms/2.19677ms)

Execution Plan (optimize time 41 us)

                                                                                                                                                               
| id | name        | dependencies | profiling data                                                                                                      | operator info |
|  2 | Project     | 1            | {"rows":3,"version":0}                                                                                              |               |
|  1 | GetVertices | 0            | {"resp[0]":{"exec":"232(us)","host":"127.0.0.1:9779","total":"758(us)"},"rows":3,"total_rpc":"875(us)","version":0} |               |
|  0 | Start       |              | {"rows":0,"version":0}                                                                                              |               |
                                                                                                                                                               

Wed, 22 Mar 2023 23:16:13 CST

(root@nebula) [my_space_1]> profile format="tck" show hosts;
                                                                                           
| Host        | Port | Status   | Leader count | Leader distribution | Partition distribution | Version |
| "127.0.0.1" | 9779 | "ONLINE" | 100          | "my_space_1:100"    | "my_space_1:100"       | ""      |
                                                                                           
Got 1 rows (time spent 1.156ms/1.627659ms)

Execution Plan (optimize time 55 us)

                                                                
| id | name      | dependencies | profiling data         | operator info |
|  1 | ShowHosts | 0            | {"rows":1,"version":0} |               |
|  0 | Start     |              | {"rows":0,"version":0} |               |
                                                                

Wed, 22 Mar 2023 23:16:28 CST

(root@nebula) [my_space_1]> 
(root@nebula) [my_space_1]> profile format="row" FETCH PROP ON player "player_1","player_2","player_3" yield properties(vertex).name as name, properties(vertex).age as age;
+--------------+-----+
| name         | age |
+--------------+-----+
| "Piter Park" | 24  |
| "aaa"        | 24  |
| "ccc"        | 24  |
+--------------+-----+
Got 3 rows (time spent 1.716ms/2.344165ms)

Execution Plan (optimize time 29 us)


-----+-------------+--------------+-------------------------------+---------------------------------------
| id | name        | dependencies | profiling data                | operator info                        |
-----+-------------+--------------+-------------------------------+---------------------------------------
|  2 | Project     | 1            | {                             | outputVar: {                         |
|    |             |              |   "execTime": "27(us)",       |   "colNames": [                      |
|    |             |              |   "rows": 3,                  |     "name",                          |
|    |             |              |   "totalTime": "30(us)",      |     "age"                            |
|    |             |              |   "version": 0                |   ],                                 |
|    |             |              | }                             |   "type": "DATASET",                 |
|    |             |              |                               |   "name": "__Project_2"              |
|    |             |              |                               | }                                    |
|    |             |              |                               | inputVar: __GetVertices_1            |
|    |             |              |                               | columns: [                           |
|    |             |              |                               |   "properties(VERTEX).name AS name", |
|    |             |              |                               |   "properties(VERTEX).age AS age"    |
|    |             |              |                               | ]                                    |
-----+-------------+--------------+-------------------------------+---------------------------------------
|  1 | GetVertices | 0            | {                             | outputVar: {                         |
|    |             |              |   "execTime": "113(us)",      |   "colNames": [],                    |
|    |             |              |   "resp[0]": {                |   "type": "DATASET",                 |
|    |             |              |     "exec": "317(us)",        |   "name": "__GetVertices_1"          |
|    |             |              |     "host": "127.0.0.1:9779", | }                                    |
|    |             |              |     "total": "1045(us)"       | inputVar: __VAR_0                    |
|    |             |              |   },                          | space: 2                             |
|    |             |              |   "rows": 3,                  | dedup: false                         |
|    |             |              |   "totalTime": "1311(us)",    | limit: 9223372036854775807           |
|    |             |              |   "total_rpc": "1214(us)",    | filter:                              |
|    |             |              |   "version": 0                | orderBy: []                          |
|    |             |              | }                             | src: COLUMN[0]                       |
|    |             |              |                               | props: [                             |
|    |             |              |                               |   {                                  |
|    |             |              |                               |     "props": [                       |
|    |             |              |                               |       "_tag",                        |
|    |             |              |                               |       "age",                         |
|    |             |              |                               |       "name"                         |
|    |             |              |                               |     ],                               |
|    |             |              |                               |     "tagId": 3                       |
|    |             |              |                               |   }                                  |
|    |             |              |                               | ]                                    |
|    |             |              |                               | exprs:                               |
-----+-------------+--------------+-------------------------------+---------------------------------------
|  0 | Start       |              | {                             | outputVar: {                         |
|    |             |              |   "execTime": "0(us)",        |   "colNames": [],                    |
|    |             |              |   "rows": 0,                  |   "type": "DATASET",                 |
|    |             |              |   "totalTime": "29(us)",      |   "name": "__Start_0"                |
|    |             |              |   "version": 0                | }                                    |
|    |             |              | }                             |                                      |
-----+-------------+--------------+-------------------------------+---------------------------------------

Wed, 22 Mar 2023 23:25:14 CST
