fulltext index regression #3238

Closed
wey-gu opened this issue Oct 29, 2021 · 8 comments
Labels: need info (Solution: need more information, e.g. can't reproduce), type/bug (Type: something is unexpected)

Comments

@wey-gu
Contributor

wey-gu commented Oct 29, 2021

It has been reported that IndexFullScan is chosen in 2.5, whereas 2.0 chose IndexScan.

Context https://discuss.nebula-graph.com.cn/t/topic/6149/25?u=wey

Thanks!

@critical27
Contributor

Has this been fixed? I saw a related PR a few days ago. @CPWstatic

Sophie-Xie added the need info label on Dec 15, 2021
Sophie-Xie assigned CPWstatic and unassigned critical27 on Dec 15, 2021
Sophie-Xie added the priority/low-pri label on Dec 15, 2021
Sophie-Xie assigned cangfengzhs and unassigned CPWstatic on Dec 15, 2021
@wey-gu
Contributor Author

wey-gu commented Dec 15, 2021

Another request for input/updates has been sent. If there is no response within a couple more days, we can close this issue.

@cangfengzhs
Contributor

cangfengzhs commented Dec 17, 2021

In the latest code, the fulltext index works normally, but there are some serious problems.

  1. We don't create the index in ES (Full-text index cannot be queried #3217).
  2. If I execute a query, for example lookup on t where prefix(t.name,"abc"), and no string starts with "abc", ES returns an empty result. The execution plan is then not optimized and uses IndexFullScan.
  3. The same problem occurs when I look up with a where clause such as a==1 and a==2, in which two or more subexpressions use the same property name.
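
A toy model of the behavior described in point 2 may help when reasoning about it. This is not NebulaGraph's planner code: build_column_hints and scan_kind are hypothetical names, and the sketch assumes that each string returned by ES becomes one PREFIX column hint and that a plan with no hints is shown as a full scan. The operator names are the ones printed in the PROFILE outputs later in this thread.

```python
from typing import Dict, List


def build_column_hints(column: str, es_matches: List[str]) -> List[Dict[str, str]]:
    """Toy model (assumption): one PREFIX hint per string the text search returned."""
    return [
        {"column": column, "scanType": "PREFIX", "beginValue": value}
        for value in es_matches
    ]


def scan_kind(hints: List[Dict[str, str]]) -> str:
    """With no hints nothing narrows the scan, so the plan is shown as a full scan."""
    return "TagIndexPrefixScan" if hints else "TagIndexFullScan"


# ES found a match: the plan carries a PREFIX hint.
print(scan_kind(build_column_hints("name", ["Tim Duncan"])))
# ES found nothing: the hint list is empty and the plan shows a full scan.
print(scan_kind(build_column_hints("name", [])))
```

Under these assumptions an empty ES result changes the operator name in the plan, which is the symptom reported here; whether it also changes the cost is a separate question that only profiling can answer.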

@yixinglu @CPWstatic

Sophie-Xie assigned CPWstatic and unassigned cangfengzhs on Dec 17, 2021
Sophie-Xie removed the need info and priority/low-pri labels on Dec 17, 2021
Sophie-Xie assigned czpmango and unassigned CPWstatic on Dec 17, 2021
@czpmango
Contributor

Here is my test on v2.5.1; the execution plan and the fulltext performance are as expected:

(czp@nebula) [nba]> profile LOOKUP ON player WHERE PREFIX(player.name, "Tim");

-----+--------------------+--------------+---------------------------------------------------+----------------------------------------
| id | name               | dependencies | profiling data                                    | operator info                         |
-----+--------------------+--------------+---------------------------------------------------+----------------------------------------
|  3 | Project            | 4            | ver: 0, rows: 1, execTime: 36us, totalTime: 38us  | outputVar: [                          |
|    |                    |              |                                                   |   {                                   |
|    |                    |              |                                                   |     "colNames": [                     |
|    |                    |              |                                                   |       "VertexID"                      |
|    |                    |              |                                                   |     ],                                |
|    |                    |              |                                                   |     "type": "DATASET",                |
|    |                    |              |                                                   |     "name": "__Project_3"             |
|    |                    |              |                                                   |   }                                   |
|    |                    |              |                                                   | ]                                     |
|    |                    |              |                                                   | inputVar: __Filter_2                  |
|    |                    |              |                                                   | columns: [                            |
|    |                    |              |                                                   |   "$-.VertexID AS VertexID"           |
|    |                    |              |                                                   | ]                                     |
-----+--------------------+--------------+---------------------------------------------------+----------------------------------------
|  4 | TagIndexPrefixScan | 0            | ver: 0, rows: 1, execTime: 0us, totalTime: 1183us | outputVar: [                          |
|    |                    |              |                                                   |   {                                   |
|    |                    |              |                                                   |     "colNames": [                     |
|    |                    |              |                                                   |       "VertexID",                     |
|    |                    |              |                                                   |       "player.name"                   |
|    |                    |              |                                                   |     ],                                |
|    |                    |              |                                                   |     "type": "DATASET",                |
|    |                    |              |                                                   |     "name": "__Filter_2"              |
|    |                    |              |                                                   |   }                                   |
|    |                    |              |                                                   | ]                                     |
|    |                    |              |                                                   | inputVar:                             |
|    |                    |              |                                                   | space: 1                              |
|    |                    |              |                                                   | dedup: false                          |
|    |                    |              |                                                   | limit: 9223372036854775807            |
|    |                    |              |                                                   | filter:                               |
|    |                    |              |                                                   | orderBy: []                           |
|    |                    |              |                                                   | schemaId: 2                           |
|    |                    |              |                                                   | isEdge: false                         |
|    |                    |              |                                                   | returnCols: [                         |
|    |                    |              |                                                   |   "_vid",                             |
|    |                    |              |                                                   |   "name"                              |
|    |                    |              |                                                   | ]                                     |
|    |                    |              |                                                   | indexCtx: [                           |
|    |                    |              |                                                   |   {                                   |
|    |                    |              |                                                   |     "columnHints": [                  |
|    |                    |              |                                                   |       {                               |
|    |                    |              |                                                   |         "endValue": "__EMPTY__",      |
|    |                    |              |                                                   |         "beginValue": "\"Tim Duncan", |
|    |                    |              |                                                   |         "scanType": "PREFIX",         |
|    |                    |              |                                                   |         "column": "name"              |
|    |                    |              |                                                   |       }                               |
|    |                    |              |                                                   |     ],                                |
|    |                    |              |                                                   |     "filter": "",                     |
|    |                    |              |                                                   |     "index_id": 8                     |
|    |                    |              |                                                   |   }                                   |
|    |                    |              |                                                   | ]                                     |
-----+--------------------+--------------+---------------------------------------------------+----------------------------------------
|  0 | Start              |              | ver: 0, rows: 0, execTime: 0us, totalTime: 16us   | outputVar: [                          |
|    |                    |              |                                                   |   {                                   |
|    |                    |              |                                                   |     "colNames": [],                   |
|    |                    |              |                                                   |     "type": "DATASET",                |
|    |                    |              |                                                   |     "name": "__Start_0"               |
|    |                    |              |                                                   |   }                                   |
|    |                    |              |                                                   | ]                                     |
-----+--------------------+--------------+---------------------------------------------------+----------------------------------------
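
To compare plans between versions without reading the whole table, the operator names can be pulled out of the PROFILE output. The following is a minimal sketch, assuming the console table layout shown above (rows start with "|", and continuation rows leave the id cell empty); plan_operators is a hypothetical helper, not part of any NebulaGraph tool.

```python
from typing import List, Tuple


def plan_operators(profile_text: str) -> List[Tuple[int, str]]:
    """Return (id, operator name) for each operator row of a PROFILE table."""
    operators = []
    for line in profile_text.splitlines():
        if not line.startswith("|"):
            continue  # border lines start with '-'
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        # The header row has "id" and continuation rows have an empty id cell,
        # so only rows whose first cell is numeric describe an operator.
        if len(cells) >= 2 and cells[0].isdigit():
            operators.append((int(cells[0]), cells[1]))
    return operators


# Abbreviated copy of the first three columns of the table above.
sample = """\
-----+--------------------+--------------+
| id | name               | dependencies |
-----+--------------------+--------------+
|  3 | Project            | 4            |
|    |                    |              |
|  4 | TagIndexPrefixScan | 0            |
|  0 | Start              |              |
"""
print(plan_operators(sample))
```

Only the first two cells of each row are used, so JSON in the operator info column does not affect the result.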

@czpmango
Contributor

  1. If I execute a query, for example lookup on t where prefix(t.name,"abc"), and no string starts with "abc", ES returns an empty result. The execution plan is then not optimized and uses IndexFullScan.

You're right, the plan is indeed Start->TagIndexFullScan->Project. But that does not mean the execution plan is unoptimized or that performance is degraded.
Here is the test; note that TagIndexFullScan returns 0 rows with a totalTime of 31us:

(czp@nebula) [nba]> profile LOOKUP ON player WHERE PREFIX(player.name, "non-exists");


-----+------------------+--------------+--------------------------------------------------+-------------------------------------
| id | name             | dependencies | profiling data                                   | operator info                      |
-----+------------------+--------------+--------------------------------------------------+-------------------------------------
|  2 | Project          | 3            | ver: 0, rows: 0, execTime: 32us, totalTime: 33us | outputVar: [                       |
|    |                  |              |                                                  |   {                                |
|    |                  |              |                                                  |     "colNames": [                  |
|    |                  |              |                                                  |       "VertexID"                   |
|    |                  |              |                                                  |     ],                             |
|    |                  |              |                                                  |     "type": "DATASET",             |
|    |                  |              |                                                  |     "name": "__Project_2"          |
|    |                  |              |                                                  |   }                                |
|    |                  |              |                                                  | ]                                  |
|    |                  |              |                                                  | inputVar: __TagIndexFullScan_1     |
|    |                  |              |                                                  | columns: [                         |
|    |                  |              |                                                  |   "$-.VertexID AS VertexID"        |
|    |                  |              |                                                  | ]                                  |
-----+------------------+--------------+--------------------------------------------------+-------------------------------------
|  3 | TagIndexFullScan | 0            | ver: 0, rows: 0, execTime: 0us, totalTime: 31us  | outputVar: [                       |
|    |                  |              |                                                  |   {                                |
|    |                  |              |                                                  |     "colNames": [                  |
|    |                  |              |                                                  |       "VertexID"                   |
|    |                  |              |                                                  |     ],                             |
|    |                  |              |                                                  |     "type": "DATASET",             |
|    |                  |              |                                                  |     "name": "__TagIndexFullScan_1" |
|    |                  |              |                                                  |   }                                |
|    |                  |              |                                                  | ]                                  |
|    |                  |              |                                                  | inputVar:                          |
|    |                  |              |                                                  | space: 1                           |
|    |                  |              |                                                  | dedup: false                       |
|    |                  |              |                                                  | limit: 9223372036854775807         |
|    |                  |              |                                                  | filter:                            |
|    |                  |              |                                                  | orderBy: []                        |
|    |                  |              |                                                  | schemaId: 2                        |
|    |                  |              |                                                  | isEdge: false                      |
|    |                  |              |                                                  | returnCols: [                      |
|    |                  |              |                                                  |   "_vid"                           |
|    |                  |              |                                                  | ]                                  |
|    |                  |              |                                                  | indexCtx: [                        |
|    |                  |              |                                                  |   {                                |
|    |                  |              |                                                  |     "columnHints": [],             |
|    |                  |              |                                                  |     "filter": "",                  |
|    |                  |              |                                                  |     "index_id": 8                  |
|    |                  |              |                                                  |   }                                |
|    |                  |              |                                                  | ]                                  |
-----+------------------+--------------+--------------------------------------------------+-------------------------------------
|  0 | Start            |              | ver: 0, rows: 0, execTime: 0us, totalTime: 17us  | outputVar: [                       |
|    |                  |              |                                                  |   {                                |
|    |                  |              |                                                  |     "colNames": [],                |
|    |                  |              |                                                  |     "type": "DATASET",             |
|    |                  |              |                                                  |     "name": "__Start_0"            |
|    |                  |              |                                                  |   }                                |
|    |                  |              |                                                  | ]                                  |
-----+------------------+--------------+--------------------------------------------------+-------------------------------------
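
The profiling data column of the two outputs above can be compared programmatically. This is a minimal sketch, assuming the "key: value" layout printed by the console, with times in microseconds; parse_profiling is a hypothetical helper. The two input strings are copied from the TagIndexPrefixScan and TagIndexFullScan rows above.

```python
import re
from typing import Dict

# Matches fields such as "rows: 1" or "totalTime: 1183us".
_FIELD = re.compile(r"(\w+):\s*(\d+)(us)?")


def parse_profiling(cell: str) -> Dict[str, int]:
    """Turn a profiling data cell into a dict of integers (times in microseconds)."""
    return {name: int(value) for name, value, _unit in _FIELD.findall(cell)}


prefix_scan = parse_profiling("ver: 0, rows: 1, execTime: 0us, totalTime: 1183us")
full_scan = parse_profiling("ver: 0, rows: 0, execTime: 0us, totalTime: 31us")
print(prefix_scan["totalTime"], full_scan["totalTime"])
```

These are single runs of two different queries, so the numbers are not a benchmark; they only show that the TagIndexFullScan operator in this plan was not slow.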

@cangfengzhs
Contributor

I misunderstood. In that case, though, there is no reason for full-text index performance to be poor.

@czpmango
Contributor

I misunderstood. In that case, though, there is no reason for full-text index performance to be poor.

ACK.

Sophie-Xie added the need info label on Dec 27, 2021
@Sophie-Xie
Contributor

I misunderstood. In that case, though, there is no reason for full-text index performance to be poor.

This problem can’t be reproduced, and the community user’s environment is already working fine.
