Fix unbounded cascade delete in neo4j_delete_universe#48
Conversation
| \nTests not detected in this PR. Please add/confirm coverage where applicable. |
- Replace unbounded variable-length path (u)-[*0..]->(dependent) with explicit relationship types - Add specific relationship traversal: HAS_SOURCE, HAS_AXIOM, HAS_STORY, HAS_SCENE, HAS_THREAD, IN_UNIVERSE - Limit depth to 2 levels (Universe -> Story -> Scene/Thread) - Add node type validation before deletion to prevent accidental data loss - Only delete expected node types: Source, Axiom, Story, Scene, PlotThread, EntityArchetype, EntityInstance Co-authored-by: spuentesp <112034353+spuentesp@users.noreply.github.com>
- Add WHERE x IS NOT NULL filters to all collection comprehensions - Prevents null values from OPTIONAL MATCH from being included in dependents list - Addresses code review feedback Co-authored-by: spuentesp <112034353+spuentesp@users.noreply.github.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| WITH u, sources + axioms + stories + scenes + threads + entities AS dependents | ||
| UNWIND dependents AS dependent |
There was a problem hiding this comment.
Handle force delete when no dependents
In neo4j_delete_universe’s force path, UNWIND dependents AS dependent on the collected lists means the query produces zero rows when the universe has no sources/axioms/stories/scenes/threads/entities, so the subsequent delete and RETURN never execute. A force deletion of an empty universe will therefore return an empty result set and result[0]["deleted_count"] will raise at runtime while the universe remains undeleted. The delete should keep a row (e.g., include u) even when dependents is empty. (packages/data-layer/src/monitor_data/tools/neo4j_tools.py, lines 505-506)
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Pull request overview
This PR addresses feedback on a previous multiverse management PR by replacing a dangerous unbounded variable-length path deletion pattern with explicit relationship types and depth limits. The goal is to make universe cascade deletion safer and more predictable.
Key Changes:
- Replaced
(u)-[*0..]->(dependent)with explicit OPTIONAL MATCH patterns for each relationship type - Added depth limiting to 2 levels (Universe → Story → Scene/Thread)
- Added node type validation before deletion to prevent accidental data loss
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| WHERE dependent:Source OR dependent:Axiom OR dependent:Story OR | ||
| dependent:Scene OR dependent:PlotThread OR | ||
| dependent:EntityArchetype OR dependent:EntityInstance | ||
| WITH collect(DISTINCT dependent) + u AS nodes |
There was a problem hiding this comment.
Potential Cypher syntax error: Line 512 attempts to concatenate a list with a single element using collect(DISTINCT dependent) + u. In Cypher, you cannot directly concatenate a list with a non-list element. This should be collect(DISTINCT dependent) + [u] to add the universe as a single-element list.
However, this line may never be reached due to the UNWIND issue on line 506, so this error might not surface until that bug is fixed.
| WITH collect(DISTINCT dependent) + u AS nodes | |
| WITH collect(DISTINCT dependent) + [u] AS nodes |
| delete_query = """ | ||
| MATCH (u:Universe {id: $id}) | ||
| OPTIONAL MATCH path = (u)-[*0..]->(dependent) | ||
| // Explicitly collect direct dependencies with depth limit | ||
| OPTIONAL MATCH (u)-[:HAS_SOURCE]->(source:Source) | ||
| OPTIONAL MATCH (u)-[:HAS_AXIOM]->(axiom:Axiom) | ||
| OPTIONAL MATCH (u)-[:HAS_STORY]->(story:Story) | ||
| // Collect story dependencies (1 level deep from Story) | ||
| OPTIONAL MATCH (story)-[:HAS_SCENE]->(scene:Scene) | ||
| OPTIONAL MATCH (story)-[:HAS_THREAD]->(thread:PlotThread) | ||
| // Collect entities with IN_UNIVERSE relationship | ||
| OPTIONAL MATCH (u)<-[:IN_UNIVERSE]-(entity) | ||
| WHERE entity:EntityArchetype OR entity:EntityInstance | ||
| // Collect and filter out nulls from OPTIONAL MATCH results | ||
| WITH u, | ||
| [x IN collect(DISTINCT source) WHERE x IS NOT NULL] as sources, | ||
| [x IN collect(DISTINCT axiom) WHERE x IS NOT NULL] as axioms, | ||
| [x IN collect(DISTINCT story) WHERE x IS NOT NULL] as stories, | ||
| [x IN collect(DISTINCT scene) WHERE x IS NOT NULL] as scenes, | ||
| [x IN collect(DISTINCT thread) WHERE x IS NOT NULL] as threads, | ||
| [x IN collect(DISTINCT entity) WHERE x IS NOT NULL] as entities | ||
| // Flatten into single list | ||
| WITH u, sources + axioms + stories + scenes + threads + entities AS dependents | ||
| UNWIND dependents AS dependent | ||
| // Final safety check: only delete expected node types | ||
| WITH u, dependent | ||
| WHERE dependent:Source OR dependent:Axiom OR dependent:Story OR | ||
| dependent:Scene OR dependent:PlotThread OR | ||
| dependent:EntityArchetype OR dependent:EntityInstance | ||
| WITH collect(DISTINCT dependent) + u AS nodes | ||
| UNWIND nodes AS n | ||
| DETACH DELETE n |
There was a problem hiding this comment.
Maintainability concern: The Cypher query is complex and difficult to reason about due to the multiple stages of OPTIONAL MATCH, collection, filtering, unwinding, and re-collecting. This complexity has led to subtle bugs (see other comments).
Consider simplifying the query structure. One approach would be to use CALL subqueries or separate queries to collect each node type independently, then combine them. For example:
- Query 1: Collect all direct dependencies (sources, axioms, stories, entities)
- Query 2: For each story, collect scenes and threads
- Query 3: Delete all collected nodes plus the universe
This would be more maintainable, easier to test, and less prone to cartesian product issues.
| delete_query = """ | ||
| MATCH (u:Universe {id: $id}) | ||
| OPTIONAL MATCH path = (u)-[*0..]->(dependent) | ||
| // Explicitly collect direct dependencies with depth limit | ||
| OPTIONAL MATCH (u)-[:HAS_SOURCE]->(source:Source) | ||
| OPTIONAL MATCH (u)-[:HAS_AXIOM]->(axiom:Axiom) | ||
| OPTIONAL MATCH (u)-[:HAS_STORY]->(story:Story) | ||
| // Collect story dependencies (1 level deep from Story) | ||
| OPTIONAL MATCH (story)-[:HAS_SCENE]->(scene:Scene) | ||
| OPTIONAL MATCH (story)-[:HAS_THREAD]->(thread:PlotThread) | ||
| // Collect entities with IN_UNIVERSE relationship | ||
| OPTIONAL MATCH (u)<-[:IN_UNIVERSE]-(entity) | ||
| WHERE entity:EntityArchetype OR entity:EntityInstance | ||
| // Collect and filter out nulls from OPTIONAL MATCH results | ||
| WITH u, | ||
| [x IN collect(DISTINCT source) WHERE x IS NOT NULL] as sources, | ||
| [x IN collect(DISTINCT axiom) WHERE x IS NOT NULL] as axioms, | ||
| [x IN collect(DISTINCT story) WHERE x IS NOT NULL] as stories, | ||
| [x IN collect(DISTINCT scene) WHERE x IS NOT NULL] as scenes, | ||
| [x IN collect(DISTINCT thread) WHERE x IS NOT NULL] as threads, | ||
| [x IN collect(DISTINCT entity) WHERE x IS NOT NULL] as entities | ||
| // Flatten into single list | ||
| WITH u, sources + axioms + stories + scenes + threads + entities AS dependents | ||
| UNWIND dependents AS dependent | ||
| // Final safety check: only delete expected node types | ||
| WITH u, dependent | ||
| WHERE dependent:Source OR dependent:Axiom OR dependent:Story OR | ||
| dependent:Scene OR dependent:PlotThread OR | ||
| dependent:EntityArchetype OR dependent:EntityInstance | ||
| WITH collect(DISTINCT dependent) + u AS nodes | ||
| UNWIND nodes AS n | ||
| DETACH DELETE n |
There was a problem hiding this comment.
Test coverage gap: There's no test case for force=True when there are no dependent nodes. The existing test at test_delete_universe_with_force mocks a deleted_count of 10, implying dependencies exist. A test case where force=True is used on a universe with no dependencies (expected deleted_count=1) would catch the bug where the UNWIND of an empty list produces zero rows, preventing deletion of the universe itself.
This test case is critical because it represents a valid use case - a user might use force=True as a precaution even when no dependencies exist.
| UNWIND dependents AS dependent | ||
| // Final safety check: only delete expected node types | ||
| WITH u, dependent | ||
| WHERE dependent:Source OR dependent:Axiom OR dependent:Story OR | ||
| dependent:Scene OR dependent:PlotThread OR | ||
| dependent:EntityArchetype OR dependent:EntityInstance | ||
| WITH collect(DISTINCT dependent) + u AS nodes |
There was a problem hiding this comment.
Critical bug: When there are no dependent nodes, the UNWIND of an empty list at line 506 produces zero rows. This means the subsequent WITH clauses and the final collection at line 512 will also have zero rows, preventing the Universe node itself from being deleted.
To fix this, you need to handle the empty dependents case separately. One approach is to check if the dependents list is empty before unwinding, or use a CASE statement. For example, you could restructure to ensure the universe is always included in the nodes list, even when there are no dependents.
| UNWIND dependents AS dependent | |
| // Final safety check: only delete expected node types | |
| WITH u, dependent | |
| WHERE dependent:Source OR dependent:Axiom OR dependent:Story OR | |
| dependent:Scene OR dependent:PlotThread OR | |
| dependent:EntityArchetype OR dependent:EntityInstance | |
| WITH collect(DISTINCT dependent) + u AS nodes | |
| // Final safety check: only keep expected node types | |
| WITH u, [d IN dependents WHERE d:Source OR d:Axiom OR d:Story OR | |
| d:Scene OR d:PlotThread OR | |
| d:EntityArchetype OR d:EntityInstance] AS filtered_dependents | |
| // Always include the Universe node, even if there are no dependents | |
| WITH CASE | |
| WHEN size(filtered_dependents) = 0 | |
| THEN [u] | |
| ELSE filtered_dependents + u | |
| END AS nodes |
| // Explicitly collect direct dependencies with depth limit | ||
| OPTIONAL MATCH (u)-[:HAS_SOURCE]->(source:Source) | ||
| OPTIONAL MATCH (u)-[:HAS_AXIOM]->(axiom:Axiom) | ||
| OPTIONAL MATCH (u)-[:HAS_STORY]->(story:Story) | ||
| // Collect story dependencies (1 level deep from Story) | ||
| OPTIONAL MATCH (story)-[:HAS_SCENE]->(scene:Scene) | ||
| OPTIONAL MATCH (story)-[:HAS_THREAD]->(thread:PlotThread) | ||
| // Collect entities with IN_UNIVERSE relationship | ||
| OPTIONAL MATCH (u)<-[:IN_UNIVERSE]-(entity) | ||
| WHERE entity:EntityArchetype OR entity:EntityInstance | ||
| // Collect and filter out nulls from OPTIONAL MATCH results | ||
| WITH u, | ||
| [x IN collect(DISTINCT source) WHERE x IS NOT NULL] as sources, | ||
| [x IN collect(DISTINCT axiom) WHERE x IS NOT NULL] as axioms, | ||
| [x IN collect(DISTINCT story) WHERE x IS NOT NULL] as stories, | ||
| [x IN collect(DISTINCT scene) WHERE x IS NOT NULL] as scenes, | ||
| [x IN collect(DISTINCT thread) WHERE x IS NOT NULL] as threads, | ||
| [x IN collect(DISTINCT entity) WHERE x IS NOT NULL] as entities |
There was a problem hiding this comment.
Potential performance issue: The multiple OPTIONAL MATCH clauses (lines 487-494) create a cartesian product across all matched nodes before aggregation. For example, if there are 10 sources, 10 axioms, 10 stories (each with 10 scenes), the intermediate result set would have 10 × 10 × 100 = 10,000 rows before the WITH clause aggregates them with collect(DISTINCT ...).
While the DISTINCT in the collect clauses (lines 498-503) ensures correctness, this cartesian product could cause performance issues with large datasets. Consider using separate queries or CALL subqueries to collect each node type independently, then combine the results. This would scale better as O(n+m+...) instead of O(n×m×...).
| // Explicitly collect direct dependencies with depth limit | |
| OPTIONAL MATCH (u)-[:HAS_SOURCE]->(source:Source) | |
| OPTIONAL MATCH (u)-[:HAS_AXIOM]->(axiom:Axiom) | |
| OPTIONAL MATCH (u)-[:HAS_STORY]->(story:Story) | |
| // Collect story dependencies (1 level deep from Story) | |
| OPTIONAL MATCH (story)-[:HAS_SCENE]->(scene:Scene) | |
| OPTIONAL MATCH (story)-[:HAS_THREAD]->(thread:PlotThread) | |
| // Collect entities with IN_UNIVERSE relationship | |
| OPTIONAL MATCH (u)<-[:IN_UNIVERSE]-(entity) | |
| WHERE entity:EntityArchetype OR entity:EntityInstance | |
| // Collect and filter out nulls from OPTIONAL MATCH results | |
| WITH u, | |
| [x IN collect(DISTINCT source) WHERE x IS NOT NULL] as sources, | |
| [x IN collect(DISTINCT axiom) WHERE x IS NOT NULL] as axioms, | |
| [x IN collect(DISTINCT story) WHERE x IS NOT NULL] as stories, | |
| [x IN collect(DISTINCT scene) WHERE x IS NOT NULL] as scenes, | |
| [x IN collect(DISTINCT thread) WHERE x IS NOT NULL] as threads, | |
| [x IN collect(DISTINCT entity) WHERE x IS NOT NULL] as entities | |
| // Explicitly collect direct dependencies with depth limit using subqueries to avoid cartesian products | |
| CALL { | |
| WITH u | |
| OPTIONAL MATCH (u)-[:HAS_SOURCE]->(source:Source) | |
| RETURN [x IN collect(DISTINCT source) WHERE x IS NOT NULL] AS sources | |
| } | |
| CALL { | |
| WITH u | |
| OPTIONAL MATCH (u)-[:HAS_AXIOM]->(axiom:Axiom) | |
| RETURN [x IN collect(DISTINCT axiom) WHERE x IS NOT NULL] AS axioms | |
| } | |
| CALL { | |
| WITH u | |
| OPTIONAL MATCH (u)-[:HAS_STORY]->(story:Story) | |
| RETURN [x IN collect(DISTINCT story) WHERE x IS NOT NULL] AS stories | |
| } | |
| // Collect story dependencies (1 level deep from Story) | |
| CALL { | |
| WITH u | |
| OPTIONAL MATCH (u)-[:HAS_STORY]->(story:Story)-[:HAS_SCENE]->(scene:Scene) | |
| RETURN [x IN collect(DISTINCT scene) WHERE x IS NOT NULL] AS scenes | |
| } | |
| CALL { | |
| WITH u | |
| OPTIONAL MATCH (u)-[:HAS_STORY]->(story:Story)-[:HAS_THREAD]->(thread:PlotThread) | |
| RETURN [x IN collect(DISTINCT thread) WHERE x IS NOT NULL] AS threads | |
| } | |
| // Collect entities with IN_UNIVERSE relationship | |
| CALL { | |
| WITH u | |
| OPTIONAL MATCH (u)<-[:IN_UNIVERSE]-(entity) | |
| WHERE entity:EntityArchetype OR entity:EntityInstance | |
| RETURN [x IN collect(DISTINCT entity) WHERE x IS NOT NULL] AS entities | |
| } | |
| WITH u, sources, axioms, stories, scenes, threads, entities |
The cascade delete pattern
(u)-[*0..]->(dependent)traverses all outgoing paths with no depth limit, risking unintended data deletion.Changes:
Before:
After:
💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.