cypher query returning UnknownError: node [...] not connected to this relationship[...] #12268

raduvanciu · 2019-08-13T20:30:06Z

I discovered that some Cypher queries using shortestPath fail for specific inputs, while working correctly for others.

Neo4j version: 3.5.4, 3.5.5, 3.5.8
Operating system: Ubuntu 16.04 and Docker. I tried both Community and Enterprise editions.
API: Cypher
Steps to reproduce
Expected behavior
The following queries are expected to complete without errors:

QUERY1
WITH [54494,23710] AS x MATCH (g:Gene), p=shortestPath((g)-[:ACTIVATION*1..2]->(gg)) WHERE g.geneId in x AND NOT gg.geneId in x RETURN p

QUERY2
WITH [54494,23710,513] AS x MATCH (g:Gene), p=shortestPath((g)-[:ACTIVATION*1..2]->(gg)) WHERE g.geneId in x AND NOT gg.geneId in x RETURN p

QUERY3
WITH [54494] AS x MATCH (g:Gene), p=shortestPath((g)-[:ACTIVATION*1..2]->(gg)) WHERE g.geneId in x AND NOT gg.geneId in x RETURN p

QUERY4
WITH [23710] AS x MATCH (g:Gene), p=shortestPath((g)-[:ACTIVATION*1..2]->(gg)) WHERE g.geneId in x AND NOT gg.geneId in x RETURN p

QUERY5
WITH [513] AS x MATCH (g:Gene), p=shortestPath((g)-[:ACTIVATION*1..2]->(gg)) WHERE g.geneId in x AND NOT gg.geneId in x RETURN p

Actual behavior

The following error occurs for QUERY1:
Neo.DatabaseError.General.UnknownError: Node[10573] not connected to this relationship[38722245]

QUERY2 completes successfully and returns paths with a total of 34 nodes and 33 edges.
QUERY3 completes successfully and returns no paths.
QUERY4 completes successfully and returns paths with a total of 34 nodes and 33 edges.
QUERY5 completes successfully and returns no paths.

I tried various combinations and several versions (3.5.4, 3.5.5, 3.5.8), running in Docker or deployed on Ubuntu via the official AWS AMI. I also tried removing the length limit on the shortest path, without much luck. Finally, I tried setting cypher.forbid_exhaustive_shortestpath to true, as described here:

https://neo4j.com/docs/cypher-manual/current/execution-plans/shortestpath-planning/

The following query returns 3 nodes and 1 edge, which indicates that the data is fine: the node with id 10573 is indeed not an endpoint of the relationship with id 38722245.
MATCH (n), p=()-[r]->() where id(n)=10573 and id(r)=38722245 return p, n

I am not sure how to approach this error, but it seems to be related to the optimized version of the shortestPath algorithm. Any help would be much appreciated. In the meantime I will try to provide a minimal database to reproduce the problem. Sharing the whole database dump (~2GB) with you may be an option if needed.

I have not yet been able to reproduce the problem in version 3.4.1.

chrisvest · 2019-08-14T08:40:53Z

@raduvanciu Can you find the complete stack trace for the failed query in the debug.log?

Also, are there any other transactions/queries running on the database at the same time?

raduvanciu · 2019-08-14T13:08:40Z

@chrisvest Thank you for your quick reply. There are no other transactions running at the same time. Here is the stack trace from version 3.5.5 Enterprise edition:

2019-08-14 13:02:41.703+0000 ERROR [o.n.b.v.r.ErrorReporter] Client triggered an unexpected error [Neo.DatabaseError.General.UnknownError]: Node[10573] not connected to this relationship[38722245], reference 3bbcba65-38ca-40e6-aa8c-d106a9f80398.
2019-08-14 13:02:41.703+0000 ERROR [o.n.b.v.r.ErrorReporter] Client triggered an unexpected error [Neo.DatabaseError.General.UnknownError]: Node[10573] not connected to this relationship[38722245], reference 3bbcba65-38ca-40e6-aa8c-d106a9f80398. Node[10573] not connected to this relationship[38722245]
org.neo4j.graphdb.NotFoundException: Node[10573] not connected to this relationship[38722245]
at org.neo4j.kernel.impl.core.RelationshipProxy.getOtherNodeId(RelationshipProxy.java:215)
at org.neo4j.kernel.impl.core.RelationshipProxy.getOtherNode(RelationshipProxy.java:173)
at org.neo4j.graphalgo.impl.path.ShortestPath$DirectionData.fetchNextOrNull(ShortestPath.java:402)
at org.neo4j.graphalgo.impl.path.ShortestPath$DirectionData.fetchNextOrNull(ShortestPath.java:319)
at org.neo4j.helpers.collection.PrefetchingIterator.peek(PrefetchingIterator.java:60)
at org.neo4j.helpers.collection.PrefetchingIterator.hasNext(PrefetchingIterator.java:46)
at org.neo4j.graphalgo.impl.path.ShortestPath.internalPaths(ShortestPath.java:161)
at org.neo4j.graphalgo.impl.path.ShortestPath.findSinglePath(ShortestPath.java:127)
at org.neo4j.cypher.internal.runtime.interpreted.TransactionBoundQueryContext.singleShortestPath(TransactionBoundQueryContext.scala:961)
at org.neo4j.cypher.internal.compatibility.v3_5.ExceptionTranslatingQueryContext$$anonfun$singleShortestPath$1.apply(ExceptionTranslatingQueryContext.scala:275)
at org.neo4j.cypher.internal.compatibility.v3_5.ExceptionTranslatingQueryContext$$anonfun$singleShortestPath$1.apply(ExceptionTranslatingQueryContext.scala:275)
at org.neo4j.cypher.internal.compatibility.v3_5.ExceptionTranslationSupport$class.translateException(ExceptionTranslationSupport.scala:33)
at org.neo4j.cypher.internal.compatibility.v3_5.ExceptionTranslatingQueryContext.translateException(ExceptionTranslatingQueryContext.scala:41)
at org.neo4j.cypher.internal.compatibility.v3_5.ExceptionTranslatingQueryContext.singleShortestPath(ExceptionTranslatingQueryContext.scala:275)
at org.neo4j.cypher.internal.runtime.interpreted.commands.expressions.ShortestPathExpression.getMatches(ShortestPathExpression.scala:68)
at org.neo4j.cypher.internal.runtime.interpreted.commands.expressions.ShortestPathExpression.apply(ShortestPathExpression.scala:54)
at org.neo4j.cypher.internal.runtime.interpreted.pipes.ShortestPathPipe$$anonfun$internalCreateResults$1.apply(ShortestPathPipe.scala:45)
at org.neo4j.cypher.internal.runtime.interpreted.pipes.ShortestPathPipe$$anonfun$internalCreateResults$1.apply(ShortestPathPipe.scala:44)
at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:435)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:441)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at org.neo4j.cypher.internal.runtime.RuntimeJavaValueConverter$feedQueryResultRecordIteratorToVisitable.accept(RuntimeJavaValueConverter.scala:63)
at org.neo4j.cypher.internal.compatibility.v3_5.runtime.PipeExecutionResult.accept(PipeExecutionResult.scala:74)
at org.neo4j.cypher.internal.compatibility.v3_5.runtime.executionplan.StandardInternalExecutionResult.accept(StandardInternalExecutionResult.scala:192)
at org.neo4j.cypher.internal.compatibility.ClosingExecutionResult$$anonfun$accept$2.apply$mcV$sp(ClosingExecutionResult.scala:158)
at org.neo4j.cypher.internal.compatibility.ClosingExecutionResult$$anonfun$accept$2.apply(ClosingExecutionResult.scala:158)
at org.neo4j.cypher.internal.compatibility.ClosingExecutionResult$$anonfun$accept$2.apply(ClosingExecutionResult.scala:158)
at org.neo4j.cypher.internal.compatibility.ClosingExecutionResult$$anonfun$safelyAndClose$1.apply(ClosingExecutionResult.scala:171)
at org.neo4j.cypher.exceptionHandler$runSafely$.apply(exceptionHandler.scala:89)
at org.neo4j.cypher.internal.compatibility.ClosingExecutionResult.safelyAndClose(ClosingExecutionResult.scala:174)
at org.neo4j.cypher.internal.compatibility.ClosingExecutionResult.accept(ClosingExecutionResult.scala:157)
at org.neo4j.bolt.v1.runtime.CypherAdapterStream.accept(CypherAdapterStream.java:73)
at org.neo4j.bolt.v1.messaging.ResultHandler.onRecords(ResultHandler.java:40)
at org.neo4j.bolt.v3.runtime.StreamingState.lambda$processStreamResultMessage$0(StreamingState.java:41)
at org.neo4j.bolt.v1.runtime.TransactionStateMachine$State.consumeResult(TransactionStateMachine.java:484)
at org.neo4j.bolt.v1.runtime.TransactionStateMachine$State$1.streamResult(TransactionStateMachine.java:328)
at org.neo4j.bolt.v1.runtime.TransactionStateMachine.streamResult(TransactionStateMachine.java:128)
at org.neo4j.bolt.v3.runtime.StreamingState.processStreamResultMessage(StreamingState.java:40)
at org.neo4j.bolt.v3.runtime.AbstractStreamingState.processUnsafe(AbstractStreamingState.java:44)
at org.neo4j.bolt.v3.runtime.FailSafeBoltStateMachineState.process(FailSafeBoltStateMachineState.java:48)
at org.neo4j.bolt.v1.runtime.BoltStateMachineV1.nextState(BoltStateMachineV1.java:144)
at org.neo4j.bolt.v1.runtime.BoltStateMachineV1.process(BoltStateMachineV1.java:92)
at org.neo4j.bolt.messaging.BoltRequestMessageReader.lambda$doRead$1(BoltRequestMessageReader.java:89)
at org.neo4j.bolt.runtime.MetricsReportingBoltConnection.lambda$enqueue$0(MetricsReportingBoltConnection.java:68)
at org.neo4j.bolt.runtime.DefaultBoltConnection.processNextBatch(DefaultBoltConnection.java:191)
at org.neo4j.bolt.runtime.MetricsReportingBoltConnection.processNextBatch(MetricsReportingBoltConnection.java:86)
at org.neo4j.bolt.runtime.DefaultBoltConnection.processNextBatch(DefaultBoltConnection.java:139)
at org.neo4j.bolt.runtime.ExecutorBoltScheduler.executeBatch(ExecutorBoltScheduler.java:171)
at org.neo4j.bolt.runtime.ExecutorBoltScheduler.lambda$scheduleBatchOrHandleError$2(ExecutorBoltScheduler.java:154)
at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
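
Reading the trace from the top, the exception is raised in RelationshipProxy.getOtherNodeId, which is called from the shortest-path traversal (ShortestPath$DirectionData.fetchNextOrNull). The sketch below is only a Python model of what that call is assumed to do, inferred from the method name and the error message rather than taken from the Neo4j source. The names other_node_id and NotConnectedError are hypothetical, and so are the endpoint ids 1 and 2. Under that assumption, the error would mean that the traversal asked for the other end of relationship 38722245 while positioned on a node that is not one of its endpoints.

```python
class NotConnectedError(Exception):
    """Stand-in for org.neo4j.graphdb.NotFoundException in this sketch."""


def other_node_id(rel_id, start_id, end_id, node_id):
    """Return the endpoint of the relationship that is not node_id.

    Assumed contract: node_id must be one of the two endpoints,
    otherwise the call fails with the message seen in the log.
    """
    if node_id == start_id:
        return end_id
    if node_id == end_id:
        return start_id
    raise NotConnectedError(
        f"Node[{node_id}] not connected to this relationship[{rel_id}]"
    )


# Hypothetical endpoint ids (1 and 2): the real endpoints of
# relationship 38722245 are not given in this thread.
print(other_node_id(38722245, 1, 2, 1))  # prints 2
try:
    other_node_id(38722245, 1, 2, 10573)
except NotConnectedError as err:
    print(err)  # same wording as the message in the log above
```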

raduvanciu · 2019-08-14T18:10:07Z

@chrisvest Thank you again for looking into this. I have some additional info, which may help.

The bug seems to be related to the fact that one of the source nodes (i.e., {geneId:54494}) has no outgoing edges of type ACTIVATION. It is, however, surprising that the exception is not thrown for QUERY2, which includes the same source node.

I found a workaround by rewriting the query so that it first filters out any source nodes for which no shortest path can exist:

WITH [54494,23710] AS x MATCH (g:Gene)-[:ACTIVATION]->(:Gene), p=shortestPath((g)-[:ACTIVATION*1..2]->(gg:Gene)) WHERE g.geneId in x AND NOT gg.geneId in x RETURN p

This query is much slower on a database with many nodes, but the pre-filtering can be done as a separate query.
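
A minimal sketch of that two-step variant, assuming a hypothetical run_query(query, params) callable that executes a Cypher statement and returns its rows as a list of dicts (for example a thin wrapper around a driver session; no driver is used here). The first statement keeps only the source genes that have at least one outgoing ACTIVATION edge, and the second is QUERY1 with the list literal replaced by parameters. Whether this form also avoids the error was not tested in this thread.

```python
# Step 1: keep only sources with at least one outgoing ACTIVATION edge.
PREFILTER_QUERY = (
    "MATCH (g:Gene)-[:ACTIVATION]->(:Gene) "
    "WHERE g.geneId IN $ids "
    "RETURN DISTINCT g.geneId AS geneId"
)

# Step 2: the original shortestPath query, parameterised.
SHORTEST_PATH_QUERY = (
    "MATCH (g:Gene), p=shortestPath((g)-[:ACTIVATION*1..2]->(gg)) "
    "WHERE g.geneId IN $sources AND NOT gg.geneId IN $excluded "
    "RETURN p"
)


def shortest_paths_two_step(run_query, gene_ids):
    """Pre-filter the source ids, then run shortestPath on the survivors.

    run_query is a hypothetical callable: run_query(query, params) -> list of dicts.
    """
    rows = run_query(PREFILTER_QUERY, {"ids": gene_ids})
    sources = [row["geneId"] for row in rows]
    if not sources:
        return []  # no source can start a path, so skip the second query
    # Targets are still excluded against the full original list, as in QUERY1.
    return run_query(
        SHORTEST_PATH_QUERY, {"sources": sources, "excluded": gene_ids}
    )


def fake_run_query(query, params):
    """Stub standing in for a real database call, for illustration only."""
    if query == PREFILTER_QUERY:
        # Mirrors the observation above: 54494 has no outgoing ACTIVATION edge.
        return [{"geneId": i} for i in params["ids"] if i != 54494]
    return [{"p": f"<paths starting at {s}>"} for s in params["sources"]]


print(shortest_paths_two_step(fake_run_query, [54494, 23710]))
```

Passing the ids as parameters rather than as a literal list keeps the two statements fixed, so only the parameter values change between runs.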

FYI, here are some stats for the database I am working with. I was not able to reproduce the issue on the small demo movie database.

[Image: ID Allocation]

I tried to recreate the whole database from raw csv files, but the issue persists.

Please let me know what you think of this workaround and whether you have additional insights.

sherfert · 2019-10-21T07:30:22Z

@raduvanciu Is it possible for you to attach the raw csv file here so that we can investigate?

raduvanciu · 2019-10-21T14:19:00Z

@sherfert
Thank you for looking into this. Unfortunately, I cannot share all the raw CSV files here, as they are larger than 25MB. Here is a subset that contains all the nodes and one type of edge. Since the problem is non-deterministic, as described above, we will need to find another query that fails after loading the data.

genes_activation_archive.zip

I hope this helps,

Lojjs · 2019-11-05T09:31:48Z

@raduvanciu An additional question: do you have any indexes/constraints when you run the query?
Also, do I understand correctly that you cannot reproduce the error with the smaller dataset that you sent us (the one in the zip file with only one relationship type)?

Best regards Louise, Neo4j Cypher team

raduvanciu · 2019-11-05T19:32:13Z

@Lojjs

There is one unique constraint and index on node Gene(geneId).
You are correct, I cannot reproduce the error by using only the data in the zip file. Unfortunately, I cannot share the whole data publicly. I am happy to provide additional information as needed.

For QUERY1, here is the plan as reported by EXPLAIN. The planner attempts to use only one index.

{
  "statement": {
    "text": "EXPLAIN WITH [54494,23710] AS x MATCH (g:Gene), p=shortestPath((g)-[:ACTIVATION*1..2]->(gg)) WHERE g.geneId in x AND NOT gg.geneId in x RETURN p",
    "parameters": {}
  },
  "statementType": "r",
  "counters": {
    "_stats": {
      "nodesCreated": 0,
      "nodesDeleted": 0,
      "relationshipsCreated": 0,
      "relationshipsDeleted": 0,
      "propertiesSet": 0,
      "labelsAdded": 0,
      "labelsRemoved": 0,
      "indexesAdded": 0,
      "indexesRemoved": 0,
      "constraintsAdded": 0,
      "constraintsRemoved": 0
    }
  },
  "updateStatistics": {
    "_stats": {
      "nodesCreated": 0,
      "nodesDeleted": 0,
      "relationshipsCreated": 0,
      "relationshipsDeleted": 0,
      "propertiesSet": 0,
      "labelsAdded": 0,
      "labelsRemoved": 0,
      "indexesAdded": 0,
      "indexesRemoved": 0,
      "constraintsAdded": 0,
      "constraintsRemoved": 0
    }
  },
  "plan": {
    "operatorType": "ProduceResults",
    "identifiers": [
      "x",
      "  UNNAMED59",
      "g",
      "p",
      "gg"
    ],
    "arguments": {
      "planner-impl": "IDP",
      "planner-version": "3.5",
      "runtime-version": "3.5",
      "runtime": "SLOTTED",
      "runtime-impl": "SLOTTED",
      "version": "CYPHER 3.5",
      "EstimatedRows": 55787407.63322778,
      "planner": "COST"
    },
    "children": [
      {
        "operatorType": "Apply",
        "identifiers": [
          "x",
          "  UNNAMED59",
          "g",
          "p",
          "gg"
        ],
        "arguments": {
          "EstimatedRows": 55787407.63322778
        },
        "children": [
          {
            "operatorType": "Projection",
            "identifiers": [
              "x"
            ],
            "arguments": {
              "EstimatedRows": 1,
              "Expressions": "{x : $`  AUTOLIST0`}"
            },
            "children": []
          },
          {
            "operatorType": "ShortestPath",
            "identifiers": [
              "x",
              "  UNNAMED59",
              "g",
              "p",
              "gg"
            ],
            "arguments": {
              "EstimatedRows": 55787407.63322778,
              "Expressions": "{}"
            },
            "children": [
              {
                "operatorType": "CartesianProduct",
                "identifiers": [
                  "g",
                  "gg",
                  "x"
                ],
                "arguments": {
                  "EstimatedRows": 55787407.63322778
                },
                "children": [
                  {
                    "operatorType": "Filter",
                    "identifiers": [
                      "gg",
                      "x"
                    ],
                    "arguments": {
                      "Expression": "not gg.geneId IN x",
                      "EstimatedRows": 2231620.0666843406
                    },
                    "children": [
                      {
                        "operatorType": "AllNodesScan",
                        "identifiers": [
                          "gg",
                          "x"
                        ],
                        "arguments": {
                          "EstimatedRows": 31085476
                        },
                        "children": []
                      }
                    ]
                  },
                  {
                    "operatorType": "NodeUniqueIndexSeek",
                    "identifiers": [
                      "g",
                      "x"
                    ],
                    "arguments": {
                      "EstimatedRows": 24.99861354810036,
                      "Index": ":Gene(geneId)"
                    },
                    "children": []
                  }
                ]
              }
            ]
          }
        ]
      }
    ]
  },
  "profile": false,
  "notifications": [],
  "server": {
    "address": "0.0.0.0:7687",
    "version": "Neo4j/3.5.5"
  },
  "resultConsumedAfter": {
    "low": 0,
    "high": 0
  },
  "resultAvailableAfter": {
    "low": 0,
    "high": 0
  }
}
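
To make the nested plan easier to read, here is a small sketch that flattens it into an indented operator tree. PLAN is an abbreviated copy of the "plan" object above that keeps only operatorType, EstimatedRows and children; walk is a hypothetical helper name. The estimate for CartesianProduct matches the product of its two children's estimates (Filter times NodeUniqueIndexSeek), which suggests that ShortestPath is planned over every candidate (g, gg) pair.

```python
import math

# Abbreviated copy of the "plan" object from the EXPLAIN output above.
PLAN = {
    "operatorType": "ProduceResults",
    "arguments": {"EstimatedRows": 55787407.63322778},
    "children": [{
        "operatorType": "Apply",
        "arguments": {"EstimatedRows": 55787407.63322778},
        "children": [
            {"operatorType": "Projection",
             "arguments": {"EstimatedRows": 1},
             "children": []},
            {"operatorType": "ShortestPath",
             "arguments": {"EstimatedRows": 55787407.63322778},
             "children": [{
                 "operatorType": "CartesianProduct",
                 "arguments": {"EstimatedRows": 55787407.63322778},
                 "children": [
                     {"operatorType": "Filter",
                      "arguments": {"EstimatedRows": 2231620.0666843406},
                      "children": [{
                          "operatorType": "AllNodesScan",
                          "arguments": {"EstimatedRows": 31085476},
                          "children": []}]},
                     {"operatorType": "NodeUniqueIndexSeek",
                      "arguments": {"EstimatedRows": 24.99861354810036},
                      "children": []},
                 ]}]},
        ]}],
}


def walk(plan, depth=0):
    """Yield (depth, operatorType, EstimatedRows) for each operator, pre-order."""
    yield depth, plan["operatorType"], plan["arguments"]["EstimatedRows"]
    for child in plan["children"]:
        yield from walk(child, depth + 1)


# Print the operator tree, one operator per line, indented by depth.
for depth, op, rows in walk(PLAN):
    print("  " * depth + f"{op} (~{rows:,.0f} rows)")

# Each operator type appears once in this plan, so a dict lookup is safe here.
estimates = {op: rows for _, op, rows in walk(PLAN)}
product = estimates["Filter"] * estimates["NodeUniqueIndexSeek"]
print(math.isclose(product, estimates["CartesianProduct"], rel_tol=1e-6))
```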

raduvanciu · 2019-11-07T16:22:52Z

@Lojjs Thank you for looking into this. I hope to hear back from you soon. If the information above does not help, what would be a good way to share the whole database dump privately? The size of the file is ~2.7GB.

craigtaverner · 2019-11-12T16:51:06Z

@raduvanciu Since you are using the enterprise edition, perhaps you are a paying customer of Neo4j, and therefore you could also register an official support ticket with customer support and they can help provide you with a way to upload a large dataset privately.

raduvanciu · 2019-11-12T18:36:43Z

@craigtaverner Unfortunately, the support does not extend to the startup program, so we are not eligible for customer support.

I can provide, via email, a private url to download the database. At the time of writing the comment, no developer is assigned to this issue.

craigtaverner · 2019-11-12T19:31:02Z

Hi @raduvanciu, perhaps you could message me directly on neo4j-users.slack.com so we can discuss how best to proceed.

LinneaAndersson · 2023-05-30T06:07:12Z

@raduvanciu Is this still a problem for you or can we perhaps close this issue?

LinneaAndersson · 2023-07-10T12:37:27Z

@raduvanciu I will close this issue, please let us know if you have further questions or issues.

Regards,
Linnéa

raduvanciu added the bug label Aug 13, 2019

sherfert added team-cypher team-kernel labels Oct 22, 2019

Hunterness added the 3.5 label Oct 24, 2019

LinneaAndersson self-assigned this May 30, 2023

LinneaAndersson closed this as completed Jul 10, 2023
