Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Graph return duplicate result #4929

Closed
nebula-bots opened this issue Nov 24, 2022 · 7 comments · Fixed by #4943
Closed

Graph return duplicate result #4929

nebula-bots opened this issue Nov 24, 2022 · 7 comments · Fixed by #4943
Assignees
Labels
affects/master PR/issue: this bug affects master version. auto-sync find/automation process/done Process of bug severity/minor Severity of bug type/bug/correctness Query runs normaly but produces wrong results. type/bug Type: something is unexpected
Milestone

Comments

@nebula-bots
Copy link
Contributor

Please check the FAQ documentation before raising an issue

Describe the bug (required)

Query result on nebula:

(root@nebula) [testdata]> MATCH p =  (n0)-[e1]->(n1)-[e2]-(n0) WHERE id(n0) == 7 and id(n1) == 7 return e1, e2
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| e1                                                                                                                                                                                                 | e2                                                                                                                                                                                                 |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| [:EDGE_0 7->7 @0 {edgeProp_0_0: true, edgeProp_0_1: false, edgeProp_0_2: 0.3119730055332184, edgeProp_0_3: "Joel Embiid", edgeProp_0_4: 0.17983399331569672, edgeProp_0_5: 88, edgeProp_0_6: 175}] | [:EDGE_1 7->7 @0 {edgeProp_1_0: false, edgeProp_1_1: "Dwyane Wade", edgeProp_1_2: 0.4549649953842163, edgeProp_1_3: 115, edgeProp_1_4: "Danny Green", edgeProp_1_5: true}]                         |
| [:EDGE_0 7->7 @0 {edgeProp_0_0: true, edgeProp_0_1: false, edgeProp_0_2: 0.3119730055332184, edgeProp_0_3: "Joel Embiid", edgeProp_0_4: 0.17983399331569672, edgeProp_0_5: 88, edgeProp_0_6: 175}] | [:EDGE_1 7->7 @0 {edgeProp_1_0: false, edgeProp_1_1: "Dwyane Wade", edgeProp_1_2: 0.4549649953842163, edgeProp_1_3: 115, edgeProp_1_4: "Danny Green", edgeProp_1_5: true}]                         |
| [:EDGE_1 7->7 @0 {edgeProp_1_0: false, edgeProp_1_1: "Dwyane Wade", edgeProp_1_2: 0.4549649953842163, edgeProp_1_3: 115, edgeProp_1_4: "Danny Green", edgeProp_1_5: true}]                         | [:EDGE_0 7->7 @0 {edgeProp_0_0: true, edgeProp_0_1: false, edgeProp_0_2: 0.3119730055332184, edgeProp_0_3: "Joel Embiid", edgeProp_0_4: 0.17983399331569672, edgeProp_0_5: 88, edgeProp_0_6: 175}] |
| [:EDGE_1 7->7 @0 {edgeProp_1_0: false, edgeProp_1_1: "Dwyane Wade", edgeProp_1_2: 0.4549649953842163, edgeProp_1_3: 115, edgeProp_1_4: "Danny Green", edgeProp_1_5: true}]                         | [:EDGE_0 7->7 @0 {edgeProp_0_0: true, edgeProp_0_1: false, edgeProp_0_2: 0.3119730055332184, edgeProp_0_3: "Joel Embiid", edgeProp_0_4: 0.17983399331569672, edgeProp_0_5: 88, edgeProp_0_6: 175}] |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Got 4 rows (time spent 8.039ms/34.059291ms)

Mon, 31 Oct 2022 21:09:41 CST

note that row 2 is duplicate to row 1, row 4 is duplicate to row 3.

Same query on Neo4j(and same dataset) return exactly two rows as expected:

$ MATCH p =  (n0)-[e1]->(n1)-[e2]-(n0) WHERE n0.id = 7 and n1.id = 7 return e1, e2

╒══════════════════════════════════════════════════════════════════════╤══════════════════════════════════════════════════════════════════════╕
│"e1"                                                                  │"e2"                                                                  │
╞══════════════════════════════════════════════════════════════════════╪══════════════════════════════════════════════════════════════════════╡
│{"edgeProp_0_6":175,"edgeProp_0_2":0.311973,"edgeProp_0_3":"Joel Embii│{"edgeProp_1_5":true,"edgeProp_1_1":"Dwyane Wade","edgeProp_1_2":0.454│
│d","edgeProp_0_4":0.179834,"edgeProp_0_5":88,"edgeProp_0_0":true,"edge│965,"edgeProp_1_3":115,"edgeProp_1_4":"Danny Green","edgeProp_1_0":fal│
│Prop_0_1":false}                                                      │se}                                                                   │
├──────────────────────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────┤
│{"edgeProp_1_5":true,"edgeProp_1_1":"Dwyane Wade","edgeProp_1_2":0.454│{"edgeProp_0_6":175,"edgeProp_0_2":0.311973,"edgeProp_0_3":"Joel Embii│
│965,"edgeProp_1_3":115,"edgeProp_1_4":"Danny Green","edgeProp_1_0":fal│d","edgeProp_0_4":0.179834,"edgeProp_0_5":88,"edgeProp_0_0":true,"edge│
│se}                                                                   │Prop_0_1":false}                                                      │
└──────────────────────────────────────────────────────────────────────┴──────────────────────────────────────────────────────────────────────┘

and these two edges are further verified by the visualization view rendered by Neo4j(vote to Neo4j):

image

Your Environments (required)

  • OS: uname -a
  • Compiler: g++ --version or clang++ --version
  • CPU: lscpu
  • Commit id (e.g. a3ffc7d8) 5dd2054

How To Reproduce(required)

Steps to reproduce the behavior:

  1. Step 1
  2. Step 2
  3. Step 3

Expected behavior

Additional context

@nebula-bots nebula-bots added affects/master PR/issue: this bug affects master version. auto-sync find/automation severity/minor Severity of bug type/bug Type: something is unexpected type/bug/correctness Query runs normaly but produces wrong results. labels Nov 24, 2022
@Sophie-Xie Sophie-Xie added this to the v3.4.0 milestone Nov 24, 2022
@xtcyclist
Copy link
Contributor

xtcyclist commented Nov 24, 2022

Studying this bug.

The MATCH p =() syntax NebulaGraph designed is to query a trail, where all edges are distinct. There can be repeated vertices but no repeated edges. It's different to path in concept, which does not allow duplicates of neither vertices nor edges.

With this bug to be fixed, the documentation needs to be clarified.

Reference: https://en.wikipedia.org/wiki/Path_(graph_theory)

@xtcyclist
Copy link
Contributor

insert edge like (likeness) values "Tim Duncan"->"Tim Duncan":(100)
insert edge teammate (start_year, end_year) values "Tim Duncan"->"Tim Duncan":(2000, 2022)
MATCH p =  (n0)-[e1]->(n1)-[e2]-(n0) WHERE id(n0) == "Tim Duncan" and id(n1) == "Tim Duncan" return e1, e2

Can be reproduced in the nba dataset with the above query.

@xtcyclist
Copy link
Contributor

xtcyclist commented Nov 24, 2022

Bug reason:

A single self-reflective edge is treated as two distinct edges in the traverse operator: one outbound and one inbound, but they are in fact the same edge.

It's different to a cycle between two vertices, in which the two edges are distinct edges, because they have different src and dst vertices.

@xtcyclist
Copy link
Contributor

xtcyclist commented Nov 24, 2022

Fix:

The storage side shall not return two versions of the same self-reflective edge. This could avoids increasing computation overheads needed to deduplicate edges and also reduce the network traffic.

@critical27
Copy link
Contributor

In our syntax, GO is walk, and MATCH is trail. My question is that, both query may use the same storage interface, aka, GetNeighbors. I don't know whether this PR will have side effect on the Go query.

@critical27
Copy link
Contributor

@whitewum, WDYT?

@whitewum
Copy link
Contributor

whitewum commented Dec 1, 2022

This is a P2 bug.

@github-actions github-actions bot added the process/fixed Process of bug label Dec 6, 2022
@nebula-bots nebula-bots added process/done Process of bug severity/minor Severity of bug and removed severity/major Severity of bug process/fixed Process of bug labels Jan 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects/master PR/issue: this bug affects master version. auto-sync find/automation process/done Process of bug severity/minor Severity of bug type/bug/correctness Query runs normaly but produces wrong results. type/bug Type: something is unexpected
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants