Skip to content

Fix unbounded cascade delete in neo4j_delete_universe#48

Merged
spuentesp merged 3 commits into
dl-1-cleanfrom
copilot/sub-pr-47
Dec 28, 2025
Merged

Fix unbounded cascade delete in neo4j_delete_universe#48
spuentesp merged 3 commits into
dl-1-cleanfrom
copilot/sub-pr-47

Conversation

Copy link
Copy Markdown

Copilot AI commented Dec 28, 2025

The cascade delete pattern (u)-[*0..]->(dependent) traverses all outgoing paths with no depth limit, risking unintended data deletion.

Changes:

  • Explicit relationship types: Replace variable-length path with specific relationships (HAS_SOURCE, HAS_AXIOM, HAS_STORY, HAS_SCENE, HAS_THREAD, IN_UNIVERSE)
  • Depth limit: Constrain traversal to 2 levels (Universe → Story → Scene/Thread)
  • Type validation: Filter deletions to expected node types only
  • Null handling: Filter null values from OPTIONAL MATCH collections

Before:

OPTIONAL MATCH path = (u)-[*0..]->(dependent)
WITH collect(DISTINCT dependent) + u AS nodes

After:

OPTIONAL MATCH (u)-[:HAS_SOURCE]->(source:Source)
OPTIONAL MATCH (u)-[:HAS_AXIOM]->(axiom:Axiom)
OPTIONAL MATCH (u)-[:HAS_STORY]->(story:Story)
OPTIONAL MATCH (story)-[:HAS_SCENE]->(scene:Scene)
OPTIONAL MATCH (story)-[:HAS_THREAD]->(thread:PlotThread)
OPTIONAL MATCH (u)<-[:IN_UNIVERSE]-(entity)
WHERE entity:EntityArchetype OR entity:EntityInstance
WITH u, 
     [x IN collect(DISTINCT source) WHERE x IS NOT NULL] as sources,
     [x IN collect(DISTINCT axiom) WHERE x IS NOT NULL] as axioms,
     // ... etc
WITH u, sources + axioms + stories + scenes + threads + entities AS dependents
UNWIND dependents AS dependent
WHERE dependent:Source OR dependent:Axiom OR dependent:Story OR 
      dependent:Scene OR dependent:PlotThread OR 
      dependent:EntityArchetype OR dependent:EntityInstance

💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.

@github-actions
Copy link
Copy Markdown

\nTests not detected in this PR. Please add/confirm coverage where applicable.

Copilot AI and others added 2 commits December 28, 2025 03:18
- Replace unbounded variable-length path (u)-[*0..]->(dependent) with explicit relationship types
- Add specific relationship traversal: HAS_SOURCE, HAS_AXIOM, HAS_STORY, HAS_SCENE, HAS_THREAD, IN_UNIVERSE
- Limit depth to 2 levels (Universe -> Story -> Scene/Thread)
- Add node type validation before deletion to prevent accidental data loss
- Only delete expected node types: Source, Axiom, Story, Scene, PlotThread, EntityArchetype, EntityInstance

Co-authored-by: spuentesp <112034353+spuentesp@users.noreply.github.com>
- Add WHERE x IS NOT NULL filters to all collection comprehensions
- Prevents null values from OPTIONAL MATCH from being included in dependents list
- Addresses code review feedback

Co-authored-by: spuentesp <112034353+spuentesp@users.noreply.github.com>
@spuentesp spuentesp marked this pull request as ready for review December 28, 2025 03:20
Copilot AI review requested due to automatic review settings December 28, 2025 03:20
@github-actions github-actions Bot added the area/data-layer Data layer changes label Dec 28, 2025
Copilot AI changed the title [WIP] Address feedback on manage multiverse PR Fix unbounded cascade delete in neo4j_delete_universe Dec 28, 2025
Copilot AI requested a review from spuentesp December 28, 2025 03:21
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +505 to +506
WITH u, sources + axioms + stories + scenes + threads + entities AS dependents
UNWIND dependents AS dependent
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Handle force delete when no dependents

In neo4j_delete_universe’s force path, UNWIND dependents AS dependent on the collected lists means the query produces zero rows when the universe has no sources/axioms/stories/scenes/threads/entities, so the subsequent delete and RETURN never execute. A force deletion of an empty universe will therefore return an empty result set and result[0]["deleted_count"] will raise at runtime while the universe remains undeleted. The delete should keep a row (e.g., include u) even when dependents is empty. (packages/data-layer/src/monitor_data/tools/neo4j_tools.py, lines 505-506)

Useful? React with 👍 / 👎.

@spuentesp spuentesp merged commit 7e08a1e into dl-1-clean Dec 28, 2025
7 checks passed
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses feedback on a previous multiverse management PR by replacing a dangerous unbounded variable-length path deletion pattern with explicit relationship types and depth limits. The goal is to make universe cascade deletion safer and more predictable.

Key Changes:

  • Replaced (u)-[*0..]->(dependent) with explicit OPTIONAL MATCH patterns for each relationship type
  • Added depth limiting to 2 levels (Universe → Story → Scene/Thread)
  • Added node type validation before deletion to prevent accidental data loss

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

WHERE dependent:Source OR dependent:Axiom OR dependent:Story OR
dependent:Scene OR dependent:PlotThread OR
dependent:EntityArchetype OR dependent:EntityInstance
WITH collect(DISTINCT dependent) + u AS nodes
Copy link

Copilot AI Dec 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential Cypher syntax error: Line 512 attempts to concatenate a list with a single element using collect(DISTINCT dependent) + u. In Cypher, you cannot directly concatenate a list with a non-list element. This should be collect(DISTINCT dependent) + [u] to add the universe as a single-element list.

However, this line may never be reached due to the UNWIND issue on line 506, so this error might not surface until that bug is fixed.

Suggested change
WITH collect(DISTINCT dependent) + u AS nodes
WITH collect(DISTINCT dependent) + [u] AS nodes

Copilot uses AI. Check for mistakes.
Comment on lines 484 to 514
delete_query = """
MATCH (u:Universe {id: $id})
OPTIONAL MATCH path = (u)-[*0..]->(dependent)
// Explicitly collect direct dependencies with depth limit
OPTIONAL MATCH (u)-[:HAS_SOURCE]->(source:Source)
OPTIONAL MATCH (u)-[:HAS_AXIOM]->(axiom:Axiom)
OPTIONAL MATCH (u)-[:HAS_STORY]->(story:Story)
// Collect story dependencies (1 level deep from Story)
OPTIONAL MATCH (story)-[:HAS_SCENE]->(scene:Scene)
OPTIONAL MATCH (story)-[:HAS_THREAD]->(thread:PlotThread)
// Collect entities with IN_UNIVERSE relationship
OPTIONAL MATCH (u)<-[:IN_UNIVERSE]-(entity)
WHERE entity:EntityArchetype OR entity:EntityInstance
// Collect and filter out nulls from OPTIONAL MATCH results
WITH u,
[x IN collect(DISTINCT source) WHERE x IS NOT NULL] as sources,
[x IN collect(DISTINCT axiom) WHERE x IS NOT NULL] as axioms,
[x IN collect(DISTINCT story) WHERE x IS NOT NULL] as stories,
[x IN collect(DISTINCT scene) WHERE x IS NOT NULL] as scenes,
[x IN collect(DISTINCT thread) WHERE x IS NOT NULL] as threads,
[x IN collect(DISTINCT entity) WHERE x IS NOT NULL] as entities
// Flatten into single list
WITH u, sources + axioms + stories + scenes + threads + entities AS dependents
UNWIND dependents AS dependent
// Final safety check: only delete expected node types
WITH u, dependent
WHERE dependent:Source OR dependent:Axiom OR dependent:Story OR
dependent:Scene OR dependent:PlotThread OR
dependent:EntityArchetype OR dependent:EntityInstance
WITH collect(DISTINCT dependent) + u AS nodes
UNWIND nodes AS n
DETACH DELETE n
Copy link

Copilot AI Dec 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maintainability concern: The Cypher query is complex and difficult to reason about due to the multiple stages of OPTIONAL MATCH, collection, filtering, unwinding, and re-collecting. This complexity has led to subtle bugs (see other comments).

Consider simplifying the query structure. One approach would be to use CALL subqueries or separate queries to collect each node type independently, then combine them. For example:

  • Query 1: Collect all direct dependencies (sources, axioms, stories, entities)
  • Query 2: For each story, collect scenes and threads
  • Query 3: Delete all collected nodes plus the universe

This would be more maintainable, easier to test, and less prone to cartesian product issues.

Copilot uses AI. Check for mistakes.
Comment on lines 484 to 514
delete_query = """
MATCH (u:Universe {id: $id})
OPTIONAL MATCH path = (u)-[*0..]->(dependent)
// Explicitly collect direct dependencies with depth limit
OPTIONAL MATCH (u)-[:HAS_SOURCE]->(source:Source)
OPTIONAL MATCH (u)-[:HAS_AXIOM]->(axiom:Axiom)
OPTIONAL MATCH (u)-[:HAS_STORY]->(story:Story)
// Collect story dependencies (1 level deep from Story)
OPTIONAL MATCH (story)-[:HAS_SCENE]->(scene:Scene)
OPTIONAL MATCH (story)-[:HAS_THREAD]->(thread:PlotThread)
// Collect entities with IN_UNIVERSE relationship
OPTIONAL MATCH (u)<-[:IN_UNIVERSE]-(entity)
WHERE entity:EntityArchetype OR entity:EntityInstance
// Collect and filter out nulls from OPTIONAL MATCH results
WITH u,
[x IN collect(DISTINCT source) WHERE x IS NOT NULL] as sources,
[x IN collect(DISTINCT axiom) WHERE x IS NOT NULL] as axioms,
[x IN collect(DISTINCT story) WHERE x IS NOT NULL] as stories,
[x IN collect(DISTINCT scene) WHERE x IS NOT NULL] as scenes,
[x IN collect(DISTINCT thread) WHERE x IS NOT NULL] as threads,
[x IN collect(DISTINCT entity) WHERE x IS NOT NULL] as entities
// Flatten into single list
WITH u, sources + axioms + stories + scenes + threads + entities AS dependents
UNWIND dependents AS dependent
// Final safety check: only delete expected node types
WITH u, dependent
WHERE dependent:Source OR dependent:Axiom OR dependent:Story OR
dependent:Scene OR dependent:PlotThread OR
dependent:EntityArchetype OR dependent:EntityInstance
WITH collect(DISTINCT dependent) + u AS nodes
UNWIND nodes AS n
DETACH DELETE n
Copy link

Copilot AI Dec 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test coverage gap: There's no test case for force=True when there are no dependent nodes. The existing test at test_delete_universe_with_force mocks a deleted_count of 10, implying dependencies exist. A test case where force=True is used on a universe with no dependencies (expected deleted_count=1) would catch the bug where the UNWIND of an empty list produces zero rows, preventing deletion of the universe itself.

This test case is critical because it represents a valid use case - a user might use force=True as a precaution even when no dependencies exist.

Copilot uses AI. Check for mistakes.
Comment on lines +506 to 512
UNWIND dependents AS dependent
// Final safety check: only delete expected node types
WITH u, dependent
WHERE dependent:Source OR dependent:Axiom OR dependent:Story OR
dependent:Scene OR dependent:PlotThread OR
dependent:EntityArchetype OR dependent:EntityInstance
WITH collect(DISTINCT dependent) + u AS nodes
Copy link

Copilot AI Dec 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Critical bug: When there are no dependent nodes, the UNWIND of an empty list at line 506 produces zero rows. This means the subsequent WITH clauses and the final collection at line 512 will also have zero rows, preventing the Universe node itself from being deleted.

To fix this, you need to handle the empty dependents case separately. One approach is to check if the dependents list is empty before unwinding, or use a CASE statement. For example, you could restructure to ensure the universe is always included in the nodes list, even when there are no dependents.

Suggested change
UNWIND dependents AS dependent
// Final safety check: only delete expected node types
WITH u, dependent
WHERE dependent:Source OR dependent:Axiom OR dependent:Story OR
dependent:Scene OR dependent:PlotThread OR
dependent:EntityArchetype OR dependent:EntityInstance
WITH collect(DISTINCT dependent) + u AS nodes
// Final safety check: only keep expected node types
WITH u, [d IN dependents WHERE d:Source OR d:Axiom OR d:Story OR
d:Scene OR d:PlotThread OR
d:EntityArchetype OR d:EntityInstance] AS filtered_dependents
// Always include the Universe node, even if there are no dependents
WITH CASE
WHEN size(filtered_dependents) = 0
THEN [u]
ELSE filtered_dependents + u
END AS nodes

Copilot uses AI. Check for mistakes.
Comment on lines +486 to +503
// Explicitly collect direct dependencies with depth limit
OPTIONAL MATCH (u)-[:HAS_SOURCE]->(source:Source)
OPTIONAL MATCH (u)-[:HAS_AXIOM]->(axiom:Axiom)
OPTIONAL MATCH (u)-[:HAS_STORY]->(story:Story)
// Collect story dependencies (1 level deep from Story)
OPTIONAL MATCH (story)-[:HAS_SCENE]->(scene:Scene)
OPTIONAL MATCH (story)-[:HAS_THREAD]->(thread:PlotThread)
// Collect entities with IN_UNIVERSE relationship
OPTIONAL MATCH (u)<-[:IN_UNIVERSE]-(entity)
WHERE entity:EntityArchetype OR entity:EntityInstance
// Collect and filter out nulls from OPTIONAL MATCH results
WITH u,
[x IN collect(DISTINCT source) WHERE x IS NOT NULL] as sources,
[x IN collect(DISTINCT axiom) WHERE x IS NOT NULL] as axioms,
[x IN collect(DISTINCT story) WHERE x IS NOT NULL] as stories,
[x IN collect(DISTINCT scene) WHERE x IS NOT NULL] as scenes,
[x IN collect(DISTINCT thread) WHERE x IS NOT NULL] as threads,
[x IN collect(DISTINCT entity) WHERE x IS NOT NULL] as entities
Copy link

Copilot AI Dec 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential performance issue: The multiple OPTIONAL MATCH clauses (lines 487-494) create a cartesian product across all matched nodes before aggregation. For example, if there are 10 sources, 10 axioms, 10 stories (each with 10 scenes), the intermediate result set would have 10 × 10 × 100 = 10,000 rows before the WITH clause aggregates them with collect(DISTINCT ...).

While the DISTINCT in the collect clauses (lines 498-503) ensures correctness, this cartesian product could cause performance issues with large datasets. Consider using separate queries or CALL subqueries to collect each node type independently, then combine the results. This would scale better as O(n+m+...) instead of O(n×m×...).

Suggested change
// Explicitly collect direct dependencies with depth limit
OPTIONAL MATCH (u)-[:HAS_SOURCE]->(source:Source)
OPTIONAL MATCH (u)-[:HAS_AXIOM]->(axiom:Axiom)
OPTIONAL MATCH (u)-[:HAS_STORY]->(story:Story)
// Collect story dependencies (1 level deep from Story)
OPTIONAL MATCH (story)-[:HAS_SCENE]->(scene:Scene)
OPTIONAL MATCH (story)-[:HAS_THREAD]->(thread:PlotThread)
// Collect entities with IN_UNIVERSE relationship
OPTIONAL MATCH (u)<-[:IN_UNIVERSE]-(entity)
WHERE entity:EntityArchetype OR entity:EntityInstance
// Collect and filter out nulls from OPTIONAL MATCH results
WITH u,
[x IN collect(DISTINCT source) WHERE x IS NOT NULL] as sources,
[x IN collect(DISTINCT axiom) WHERE x IS NOT NULL] as axioms,
[x IN collect(DISTINCT story) WHERE x IS NOT NULL] as stories,
[x IN collect(DISTINCT scene) WHERE x IS NOT NULL] as scenes,
[x IN collect(DISTINCT thread) WHERE x IS NOT NULL] as threads,
[x IN collect(DISTINCT entity) WHERE x IS NOT NULL] as entities
// Explicitly collect direct dependencies with depth limit using subqueries to avoid cartesian products
CALL {
WITH u
OPTIONAL MATCH (u)-[:HAS_SOURCE]->(source:Source)
RETURN [x IN collect(DISTINCT source) WHERE x IS NOT NULL] AS sources
}
CALL {
WITH u
OPTIONAL MATCH (u)-[:HAS_AXIOM]->(axiom:Axiom)
RETURN [x IN collect(DISTINCT axiom) WHERE x IS NOT NULL] AS axioms
}
CALL {
WITH u
OPTIONAL MATCH (u)-[:HAS_STORY]->(story:Story)
RETURN [x IN collect(DISTINCT story) WHERE x IS NOT NULL] AS stories
}
// Collect story dependencies (1 level deep from Story)
CALL {
WITH u
OPTIONAL MATCH (u)-[:HAS_STORY]->(story:Story)-[:HAS_SCENE]->(scene:Scene)
RETURN [x IN collect(DISTINCT scene) WHERE x IS NOT NULL] AS scenes
}
CALL {
WITH u
OPTIONAL MATCH (u)-[:HAS_STORY]->(story:Story)-[:HAS_THREAD]->(thread:PlotThread)
RETURN [x IN collect(DISTINCT thread) WHERE x IS NOT NULL] AS threads
}
// Collect entities with IN_UNIVERSE relationship
CALL {
WITH u
OPTIONAL MATCH (u)<-[:IN_UNIVERSE]-(entity)
WHERE entity:EntityArchetype OR entity:EntityInstance
RETURN [x IN collect(DISTINCT entity) WHERE x IS NOT NULL] AS entities
}
WITH u, sources, axioms, stories, scenes, threads, entities

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/data-layer Data layer changes needs-tests No tests touched

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants