Skip to content

fix: UPDATE EDGE SET @in/@out correctly rewires vertex edge lists#4074

Merged
robfrank merged 4 commits into
mainfrom
fix/update-edge
May 5, 2026
Merged

fix: UPDATE EDGE SET @in/@out correctly rewires vertex edge lists#4074
robfrank merged 4 commits into
mainfrom
fix/update-edge

Conversation

@robfrank
Copy link
Copy Markdown
Collaborator

@robfrank robfrank commented May 4, 2026

What does this PR do?

Implements correct behaviour for UPDATE EDGE SET @in = <rid> and UPDATE EDGE SET @out = <rid>, which allows an existing edge to be re-pointed to a different vertex.

Three bugs were fixed:

1. Wrong SQL syntax in the test - The original test used `#1:2` (backtick-quoted identifier), which the ANTLR grammar resolves as a property-name lookup on the current record (returning null). The correct RID literal syntax is #1:2 without backticks.

2. MutableEdge.set("@in"/"@out", ...) silently discarded the value - The previous set(String, Object) override delegated to MutableDocument.set(), storing the value in the document property map. However BinarySerializer.serializeEdge() reads edge.getIn()/edge.getOut() (the dedicated in/out fields), never the map, so the update was lost on serialisation. setIn()/setOut() also failed to mark the record dirty.

3. Vertex edge-linked-lists not updated - Even with the field corrected, the EdgeLinkedList structures inside the adjacent vertices remained stale (Alice's OUT list still pointed to Bob; Bob's IN list still referenced the edge). This caused CHECK DATABASE to report a corrupted record.

Changes

  • MutableEdge - overrides set(String, Object) to intercept @in/@out and route through GraphEngine.moveEdge(). Fixes setIn()/setOut() to set dirty = true.
  • GraphEngine - new moveEdge(edge, direction, newVertexRID) that atomically:
    1. Removes the edge from the old endpoint vertex's list
    2. Removes the stale entry from the opposite vertex's list (since its stored destination is now wrong)
    3. Updates edge.in or edge.out
    4. Re-adds the entry to the opposite vertex's list with the new destination
    5. Adds the entry to the new endpoint vertex's list
  • UpdateEdgeTest - rewrites the test with correct RID literal SQL, uses captured RID variables instead of hardcoded strings in both the SQL command and assertions, removes debug prints.

Checklist

  • I have run the build using mvn clean package command
  • My unit tests cover both failure and success scenarios

@codacy-production
Copy link
Copy Markdown

codacy-production Bot commented May 4, 2026

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

View in Codacy

🟢 Coverage 84.78% diff coverage · -7.25% coverage variation

Metric Results
Coverage variation -7.25% coverage variation
Diff coverage 84.78% diff coverage

View coverage diff in Codacy

Coverage variation details
Coverable lines Covered lines Coverage
Common ancestor commit (76f212a) 121624 88477 72.75%
Head commit (a1692c4) 153017 (+31393) 100228 (+11751) 65.50% (-7.25%)

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details
Coverable lines Covered lines Diff coverage
Pull request (#4074) 46 39 84.78%

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%

NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces the ability to move edges between vertices by updating the @in and @out properties in MutableEdge. It adds a moveEdge method to GraphEngine to handle the necessary updates to vertex edge lists to maintain graph consistency. Feedback suggests addressing potential graph inconsistencies when setting endpoints to null and optimizing moveEdge by reducing redundant database lookups and adding direction validation.

return this;
if ("@in".equals(name)) {
final RID newIn = toRID(value);
if (rid != null && newIn != null && !newIn.equals(this.in) && database instanceof DatabaseInternal dbInt)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

When newIn (or newOut) is null, the moveEdge logic is skipped, but the field is still updated. This results in the edge being disconnected from its current vertex without updating that vertex's edge list, leading to graph inconsistency (stale pointers). Since ArcadeDB edges require both endpoints, setting @in or @out to null should either be prohibited or handled by properly removing the edge from the old vertex's list.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 6678316. Setting @in or @out to null now throws IllegalArgumentException immediately, before any field mutation. The newIn != null guard was also removed from the main condition since we now reject null upfront.

Comment on lines +397 to +457
public void moveEdge(final MutableEdge edge, final Vertex.DIRECTION direction, final RID newVertexRID) {
if (direction == Vertex.DIRECTION.IN) {
final RID oldIn = edge.getIn();
final RID out = edge.getOut();
// Remove from old in-vertex's IN list
if (oldIn != null) {
try {
final VertexInternal oldVIn = (VertexInternal) database.lookupByRID(oldIn, false);
final EdgeLinkedList inEdges = getEdgeHeadChunk(oldVIn, Vertex.DIRECTION.IN);
if (inEdges != null)
inEdges.removeEdge(edge);
} catch (final RecordNotFoundException ignored) {
}
}
// Remove from out-vertex's OUT list (its stored destination is stale after the move)
if (out != null) {
try {
final VertexInternal vOut = (VertexInternal) database.lookupByRID(out, false);
final EdgeLinkedList outEdges = getEdgeHeadChunk(vOut, Vertex.DIRECTION.OUT);
if (outEdges != null)
outEdges.removeEdge(edge);
} catch (final RecordNotFoundException ignored) {
}
}
edge.setIn(newVertexRID);
// Re-add to out-vertex's OUT list pointing to the new in-vertex
if (out != null)
connectOutgoingEdge((VertexInternal) database.lookupByRID(out, false), database.lookupByRID(newVertexRID, false), edge);
// Add to new in-vertex's IN list
connectIncomingEdge(database.lookupByRID(newVertexRID, false), out, edge.getIdentity());
} else {
final RID oldOut = edge.getOut();
final RID in = edge.getIn();
// Remove from old out-vertex's OUT list
if (oldOut != null) {
try {
final VertexInternal oldVOut = (VertexInternal) database.lookupByRID(oldOut, false);
final EdgeLinkedList outEdges = getEdgeHeadChunk(oldVOut, Vertex.DIRECTION.OUT);
if (outEdges != null)
outEdges.removeEdge(edge);
} catch (final RecordNotFoundException ignored) {
}
}
// Remove from in-vertex's IN list (its stored source is stale after the move)
if (in != null) {
try {
final VertexInternal vIn = (VertexInternal) database.lookupByRID(in, false);
final EdgeLinkedList inEdges = getEdgeHeadChunk(vIn, Vertex.DIRECTION.IN);
if (inEdges != null)
inEdges.removeEdge(edge);
} catch (final RecordNotFoundException ignored) {
}
}
edge.setOut(newVertexRID);
// Add to new out-vertex's OUT list
connectOutgoingEdge((VertexInternal) database.lookupByRID(newVertexRID, false), database.lookupByRID(in, false), edge);
// Re-add to in-vertex's IN list with the new source
if (in != null)
connectIncomingEdge(database.lookupByRID(in, false), newVertexRID, edge.getIdentity());
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The moveEdge method contains redundant database lookups for the same RIDs and duplicated logic between the IN and OUT direction handling. Specifically, newVertexRID and the other end of the edge are looked up multiple times. Reusing the vertex objects from the initial lookups and adding explicit validation for the direction parameter (to avoid issues if BOTH is passed) would improve performance and code quality.

  public void moveEdge(final MutableEdge edge, final Vertex.DIRECTION direction, final RID newVertexRID) {
    final Identifiable vNew = database.lookupByRID(newVertexRID, false);
    if (direction == Vertex.DIRECTION.IN) {
      final RID oldIn = edge.getIn();
      final RID out = edge.getOut();
      if (oldIn != null) {
        try {
          final VertexInternal oldVIn = (VertexInternal) database.lookupByRID(oldIn, false);
          final EdgeLinkedList inEdges = getEdgeHeadChunk(oldVIn, Vertex.DIRECTION.IN);
          if (inEdges != null)
            inEdges.removeEdge(edge);
        } catch (final RecordNotFoundException ignored) {
        }
      }
      VertexInternal vOut = null;
      if (out != null) {
        try {
          vOut = (VertexInternal) database.lookupByRID(out, false);
          final EdgeLinkedList outEdges = getEdgeHeadChunk(vOut, Vertex.DIRECTION.OUT);
          if (outEdges != null)
            outEdges.removeEdge(edge);
        } catch (final RecordNotFoundException ignored) {
        }
      }
      edge.setIn(newVertexRID);
      if (vOut != null)
        connectOutgoingEdge(vOut, vNew, edge);
      connectIncomingEdge(vNew, out, edge.getIdentity());
    } else if (direction == Vertex.DIRECTION.OUT) {
      final RID oldOut = edge.getOut();
      final RID in = edge.getIn();
      if (oldOut != null) {
        try {
          final VertexInternal oldVOut = (VertexInternal) database.lookupByRID(oldOut, false);
          final EdgeLinkedList outEdges = getEdgeHeadChunk(oldVOut, Vertex.DIRECTION.OUT);
          if (outEdges != null)
            outEdges.removeEdge(edge);
        } catch (final RecordNotFoundException ignored) {
        }
      }
      VertexInternal vIn = null;
      if (in != null) {
        try {
          vIn = (VertexInternal) database.lookupByRID(in, false);
          final EdgeLinkedList inEdges = getEdgeHeadChunk(vIn, Vertex.DIRECTION.IN);
          if (inEdges != null)
            inEdges.removeEdge(edge);
        } catch (final RecordNotFoundException ignored) {
        }
      }
      edge.setOut(newVertexRID);
      connectOutgoingEdge((VertexInternal) vNew, vIn != null ? vIn : (in != null ? database.lookupByRID(in, false) : null), edge);
      if (in != null)
        connectIncomingEdge(vIn != null ? vIn : database.lookupByRID(in, false), newVertexRID, edge.getIdentity());
    } else {
      throw new IllegalArgumentException("Invalid direction for moveEdge: " + direction);
    }
  }

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 6678316. vOut and vIn are now declared before their respective try blocks and reused for the connect calls, eliminating the duplicate lookupByRID invocations. The null guards remain on the connect calls rather than adopting the ternary form in the suggestion - connectOutgoingEdge NPEs on a null toVertex, so guarding on the cached variable (which is null if the lookup threw RecordNotFoundException) is cleaner.

@robfrank robfrank changed the title feat: implement edge movement functionality and add unit tests fix: UPDATE EDGE SET @in/@out correctly rewires vertex edge lists May 4, 2026
@claude
Copy link
Copy Markdown

claude Bot commented May 4, 2026

Code Review - Edge Movement Feature

Overview

This PR adds the ability to move an edge's source or target vertex via UPDATE E SET @in = <newRID> (and @out). The implementation hooks into MutableEdge.set() to detect @in/@out property writes and dispatches to a new GraphEngine.moveEdge() method that rewires the vertex edge-chunk linked lists.

The feature idea is sound, but there are a few bugs and test gaps that should be addressed before merging.


Bugs

1. dirty=true in setIn/setOut breaks deserialization (High)

JavaBinarySerializer.deserializeProperties calls setOut() and setIn() to populate a freshly loaded MutableEdge from disk. Before this PR those methods were pure assignments; now they set dirty = true, so every edge loaded from storage is immediately marked dirty. Any code path that calls save() after a lookupByRID will write an unnecessary update, and dirty-flag-based change detection will be wrong.

The simplest fix is to keep the dirty = true only in moveEdge (which already calls setIn/setOut), and not inside the setters themselves. moveEdge can just set the field directly, or you can add package-private internalSetIn/internalSetOut variants that the serializer uses:

// moveEdge already knows it is mutating - mark dirty explicitly there
edge.in = newVertexRID;
edge.dirty = true;

2. Null fromVertexRID passed to connectIncomingEdge (IN branch, line 426)

connectIncomingEdge(database.lookupByRID(newVertexRID, false), out, edge.getIdentity());

out is captured before the null-guard at line 423, so when out == null this calls inLinkedList.add(edgeRID, null), corrupting the new in-vertex's edge segment. The call should be guarded like the symmetric connectOutgoingEdge call above it.

3. Unguarded lookupByRID(in, false) when in == null (OUT branch, line 452)

connectOutgoingEdge(..., database.lookupByRID(in, false), edge);

If in is null this throws NPE. The corresponding connectIncomingEdge call at line 455 is correctly guarded by if (in != null), but the connectOutgoingEdge call is not.

4. Vertex.DIRECTION.BOTH silently falls through to the OUT branch

The else in moveEdge catches any direction that is not IN, including BOTH. Since the method is public, a future caller passing BOTH will get partial, incorrect behaviour. Use an explicit else if (direction == Vertex.DIRECTION.OUT) and throw IllegalArgumentException for BOTH (or simply unsupported directions).


Minor / Design Concerns

5. toRID silently returns null for String values

If a caller writes edge.set("@in", "#1:0") with a raw String, toRID returns null and the else branch sets this.in = null, silently clearing the endpoint. String-to-RID parsing (via new RID(database, string)) or an explicit exception would be safer.

6. ClassCastException risk on non-vertex RIDs

(VertexInternal) database.lookupByRID(oldIn, false) will throw ClassCastException if oldIn happens to point to a non-vertex record (e.g., a document). The existing deleteEdge uses a similar pattern, but since this PR adds a new user-facing write path it is worth adding a type check or a meaningful error message.

7. Redundant vertex lookup (minor performance)

In the IN branch the out-vertex is looked up three times: once to remove the edge (line 414), once in connectOutgoingEdge (line 424), and a third time in connectIncomingEdge's internals. Capturing the result of lookupByRID(out, false) once would reduce this.

8. Comments describe WHAT, not WHY

Per the project's CLAUDE.md convention, comments should only explain non-obvious why, not restate what the code does. Lines like // Remove from old in-vertex's IN list are pure WHAT comments and should be removed.


Test Coverage

The single test covers only SET @in = ... inside a transaction. The following cases should be added:

  • SET @out = ... - the symmetrical path is completely untested.
  • Graph traversal after move - the test checks edge.getIn() but does not verify that SELECT expand(in('E')) FROM V WHERE name='Charlie' returns the edge, nor that Bob's incoming list is now empty.
  • Edge properties preserved - the edge is created with weight = 5 but the test does not assert that weight survives the move.
  • Moving to the same vertex - the condition !newIn.equals(this.in) short-circuits to the else branch, which is correct; a test would confirm there is no double-removal.

Checklist & PR description

The PR description still contains the template placeholder text and the checklist boxes are unchecked. Please fill these in before merging.


Summary

The core idea and code structure are solid. The main blocker is the dirty=true regression in setIn/setOut that affects all edge reads, and the two missing null guards in moveEdge that can cause NPE or corrupt edge segments. The test suite also needs coverage for the @out direction and post-move graph traversal.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 4, 2026

Codecov Report

❌ Patch coverage is 67.39130% with 15 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.41%. Comparing base (76f212a) to head (a1692c4).

Files with missing lines Patch % Lines
.../src/main/java/com/arcadedb/graph/MutableEdge.java 59.25% 6 Missing and 5 partials ⚠️
.../src/main/java/com/arcadedb/graph/GraphEngine.java 78.94% 1 Missing and 3 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4074      +/-   ##
==========================================
+ Coverage   63.80%   64.41%   +0.60%     
==========================================
  Files        1603     1603              
  Lines      121624   121668      +44     
  Branches    25940    25950      +10     
==========================================
+ Hits        77605    78368     +763     
+ Misses      33178    32441     -737     
- Partials    10841    10859      +18     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@claude
Copy link
Copy Markdown

claude Bot commented May 4, 2026

Code Review: fix: UPDATE EDGE SET @in/@out correctly rewires vertex edge lists

This is a well-motivated fix that addresses a real correctness gap. The three root-cause explanations in the PR description are clear and accurate. Below are observations ordered by severity.


Bugs / Correctness

1. toRID() silently returns null for String-valued RIDs

private static RID toRID(final Object value) {
    if (value instanceof RID r) return r;
    if (value instanceof Identifiable i) return i.getIdentity();
    return null;  // String "#5:2" falls through here
}

If the SQL executor resolves a RID literal to a String (e.g. from a parameter binding or a quoted literal), toRID() returns null. The guard newIn != null then fails and the else branch runs: this.in = null. That silently corrupts the edge with no error. At minimum the method should try new RID(database, (String) value) when the object is a String, or throw an IllegalArgumentException so the failure is visible.

2. setIn()/setOut() still do not mark the record dirty

The PR description says "Fixes setIn()/setOut() to set dirty = true", but the diff shows these methods are unchanged - they still contain only this.in = in / this.out = out. It happens to work because moveEdge() is always called from within set() which sets dirty = true after the call, but the discrepancy between description and code is misleading and the methods remain a latent bug for any future caller.

3. No error recovery boundary around the connect calls

In deleteEdge() each vertex lookup is wrapped in its own try/catch with a SchemaException | RecordNotFoundException guard, logging the failure at FINE level. The new moveEdge() only catches RecordNotFoundException during the remove phase but does nothing if a vertex can't be found during the connect phase (lines after edge.setIn()). A ClassCastException (non-vertex RID) or SchemaException would propagate as an uncaught exception and leave the edge list in a partially updated state.


Performance

4. Redundant lookupByRID calls

In both branches the same vertex is fetched from storage twice. For example in the IN branch:

// first lookup - inside the try block (variable goes out of scope)
final VertexInternal vOut = (VertexInternal) database.lookupByRID(out, false);

// second lookup - after the try block
connectOutgoingEdge((VertexInternal) database.lookupByRID(out, false), newInVertex, edge);

And in the OUT branch database.lookupByRID(in, false) is called twice on consecutive lines:

connectOutgoingEdge(newOutVertex, database.lookupByRID(in, false), edge);
connectIncomingEdge(database.lookupByRID(in, false), newVertexRID, edge.getIdentity());

Each lookup can touch disk. Cache the result in a variable.


Test Coverage

5. Assertions use toString() instead of direct RID equality

// current
assertThat(edge.getOut().toString()).isEqualTo(rids[0].toString());

// preferred - RID.equals() is implemented
assertThat(edge.getOut()).isEqualTo(rids[0]);

6. No CHECK DATABASE assertion

The PR description specifically calls out that the old code caused CHECK DATABASE to report corruption. A test that runs database.check(false, false) (or the SQL equivalent) after a move and asserts zero errors would make that guarantee explicit and regression-proof.

7. Missing edge cases in tests

  • Simultaneous @in and @out update in a single UPDATE statement - each call to set() triggers a separate moveEdge(); the second call sees the already-mutated edge from the first. This combination is untested.
  • No-op update: UPDATE E SET @in = <current-in> - the !newIn.equals(this.in) guard short-circuits to the else branch, which is correct, but it is not tested.

8. Duplicate setup code across test methods

All four test methods repeat the same vertex/edge creation boilerplate. Extracting that into a @BeforeEach method (with the RIDs stored as instance fields) would reduce duplication and make the tests easier to read.


Minor

  • The else throw new IllegalArgumentException(...) for BOTH direction is unreachable given the if/else if structure but correct as a defensive guard.
  • newInVertex in the IN branch is typed as Identifiable while the OUT branch types its equivalent as VertexInternal - small inconsistency but harmless since connectOutgoingEdge accepts Identifiable.

@robfrank robfrank added this to the 26.5.1 milestone May 4, 2026
@robfrank robfrank requested a review from lvca May 4, 2026 11:23
@claude
Copy link
Copy Markdown

claude Bot commented May 4, 2026

Code Review - UPDATE EDGE @in/@out Rewiring

Note: a previous automated comment on this PR was posted with some inaccurate analysis; this supersedes it.


Overview

Good fix for a real correctness gap - UPDATE EDGE SET @in/@out previously silently discarded the new value during serialisation, and even if it had worked, vertex edge-chunk lists would have been left stale. The three-part diagnosis in the PR description is accurate. The approach - intercepting @in/@out in MutableEdge.set() and delegating to a new GraphEngine.moveEdge() - is clean.


Bugs

1. Partial mutation if the new-vertex lookup throws (medium severity)

In both branches of moveEdge, the edge field is updated and the old vertex lists are cleared before the new-vertex lookup:

// IN branch
edge.setIn(newVertexRID);                                        // 1. field already changed
final Identifiable newInVertex = database.lookupByRID(newVertexRID, false); // 2. can throw RNFE
if (vOut != null) {
    connectOutgoingEdge(vOut, newInVertex, edge);                // 3. not reached on throw
    connectIncomingEdge(newInVertex, out, edge.getIdentity());
}

If newVertexRID does not exist, the old in-vertex's list and the out-vertex's list have already been cleared, edge.in has been updated, but no new connections are written. The graph is left partially wired.

Fix: do all lookups first, then perform mutations:

// look up everything before touching any structure
final Identifiable newInVertex = database.lookupByRID(newVertexRID, false); // throws early if missing
final VertexInternal oldVIn = ...
final VertexInternal vOut = ...
// now mutate
inEdges.removeEdge(edge);
outEdges.removeEdge(edge);
edge.setIn(newVertexRID);
connectOutgoingEdge(...);
connectIncomingEdge(...);

2. Unchecked ClassCastException in the OUT branch

final VertexInternal newOutVertex = (VertexInternal) database.lookupByRID(newVertexRID, false);

If newVertexRID happens to resolve to a non-vertex record (e.g. an edge RID passed by mistake), this cast throws without any helpful error. The IN branch sensibly uses Identifiable here - the OUT branch should match that pattern and cast only where VertexInternal is actually required by connectOutgoingEdge.

3. Wasted lookup when the opposite endpoint is absent

In both branches the lookupByRID(newVertexRID) call is placed outside the if (vOut != null) / if (vIn != null) guard, so a round-trip to storage is performed even when newInVertex/newOutVertex will never be used. Move the lookup inside the guard.


PR Description vs. Diff Discrepancy

The description says "Fixes setIn()/setOut() to set dirty = true", but the diff does not modify those methods. The dirty-flag is actually set in MutableEdge.set() after moveEdge() returns. The description should be corrected to avoid confusion - the setters intentionally remain pure field assignments (which is correct, since the serializer calls them during deserialization).


Test Gaps

Scenario Covered
@in update - endpoints correct testUpdateEdgeIn
@out update - endpoints correct testUpdateEdgeOut
Properties preserved after move testUpdateEdgePreservesProperties
@in update reflected in graph traversal testUpdateEdgeInReflectedInTraversal
@out update reflected in graph traversal ❌ missing
CHECK DATABASE clean after move ❌ missing (the bug description calls this out explicitly)
Move to non-existent RID ❌ no error-path test
Same-vertex no-op (newIn.equals(this.in)) ❌ untested

The traversal test for @out is the symmetric twin of testUpdateEdgeInReflectedInTraversal and should be trivial to add. The CHECK DATABASE test is the most important one given that stale edge-list corruption was the motivation for the whole change.


Minor Issues

Misleading error message in toRID - if a caller passes a String RID (e.g. edge.set("@in", "#1:0")), toRID() returns null and the thrown message is "Cannot set @in to null: edges require both endpoints". The value was not null; it was the wrong type. Consider:

throw new IllegalArgumentException(
    "Cannot set @in to a non-RID value (" + value.getClass().getSimpleName() + "): use a RID or Identifiable");

Test assertions use toString() comparison - prefer direct RID equality for clearer failure messages:

// instead of:
assertThat(edge.getOut().toString()).isEqualTo(rids[0].toString());
// prefer:
assertThat(edge.getOut()).isEqualTo(rids[0]);

Summary

The core design is sound and the three root causes are correctly identified. The main ask before merging is:

  1. Reorder moveEdge to do all DB lookups before any structural mutations (avoids half-wired state on error).
  2. Fix the potential ClassCastException in the OUT branch.
  3. Add the missing @out traversal test and a CHECK DATABASE regression test.

@claude
Copy link
Copy Markdown

claude Bot commented May 4, 2026

Code Review - UPDATE EDGE @in/@out Rewiring

Overview

This PR fixes a real bug: UPDATE EDGE SET @in/@out was silently discarding endpoint changes because MutableEdge.set() delegated to MutableDocument.set(), storing values in the property map that BinarySerializer.serializeEdge() never reads. The implementation adds GraphEngine.moveEdge() and overrides MutableEdge.set() to intercept @in/@out.

The root-cause analysis is correct and the basic test coverage is solid. A few concerns follow.


Significant: Edge RID changes after move

moveEdge() is implemented as delete + create, so the moved edge gets a new RID. Any code that cached the old RID (another record property, an application variable, a secondary index entry not cleaned up by the delete) will silently reference a deleted record.

A cleaner alternative is an in-place update:

  1. Remove the edge from the old endpoint vertex linked list
  2. Update this.in / this.out on the existing MutableEdge
  3. Call this.save() to re-serialise (BinarySerializer already writes out/in as the first two RIDs in the edge format)
  4. Add the edge to the new endpoint vertex linked list

This preserves the RID and matches the semantics of an UPDATE. If delete+create is intentional, please add a comment explaining why.


Potential double-write after moveEdge()

After moveEdge() is called, newEdge.save() has already persisted the new record and updateIdentity() updates this to point to the new RID. Setting dirty = true unconditionally means the UPDATE executor will call edge.save() a second time on the already-persisted record. Consider returning early from the moveEdge() branch without setting dirty = true.


Missing null check in moveEdge()

database.lookupByRID(newOut, false) returns null if the record does not exist. connectOutgoingEdge() immediately calls fromVertex.modify(), which will NPE. Use lookupByRID(newOut, true) to get a descriptive RecordNotFoundException, or add an explicit null check. Same issue applies to toVertex.


toRID() silently rejects String values

If a caller passes a String like "#1:2", toRID() returns null and the code throws IllegalArgumentException("Cannot set @in to null ..."). The SQL layer likely coerces values before reaching this point, but the error message is confusing. Either support String-to-RID conversion or improve the error message to name the unsupported type.


Test coverage gaps

  1. No traversal test for @out update - testUpdateEdgeInReflectedInTraversal only verifies the @in direction. A parallel test for @out would confirm both directions rewire vertex linked lists correctly.
  2. No test for same-vertex move (no-op) - The guard !newIn.equals(this.in) routes no-op moves to the else branch. A test confirms the edge is not accidentally deleted.
  3. No test for multi-edge scenario - A vertex with two edges of the same type where only one is moved. This is the most likely real-world corruption scenario.

Minor style note

Prefer direct RID comparison in assertions:

assertThat(edge.getOut()).isEqualTo(rids[0]);

instead of comparing .toString() values.


Summary

Concern Severity
Edge RID changes after move - breaks external references Significant
Potential double-write due to dirty = true after moveEdge() Minor/bug
NPE if target vertex RID does not exist Minor/bug
Missing traversal test for @out, no-op, and multi-edge cases Test gap

@robfrank robfrank force-pushed the fix/update-edge branch 2 times, most recently from 439c04b to 3aa211c Compare May 5, 2026 08:49
@claude
Copy link
Copy Markdown

claude Bot commented May 5, 2026

Code Review

Summary: This PR fixes three real bugs with UPDATE EDGE SET @in/@out - wrong SQL syntax in the test, @in/@out values being silently discarded on serialization, and vertex edge-linked-lists not being updated. The fixes are correct in intent, but the implementation has a few concerns worth addressing.


Critical: RID changes after moveEdge

GraphEngine.moveEdge() uses a delete-and-recreate strategy, which allocates a new record and gives the edge a new RID. Any code that stored the old edge RID (application variables, other records, indexes) will hold stale references after this operation. In a graph database, edge identity is fundamental - users reasonably expect UPDATE EDGE to update the record in place, not replace it.

A safer approach would be to update edge.in/edge.out in place (fixing the serialization path directly instead of working around it), and patch the EdgeLinkedList structures without touching the record's identity. The deleteEdge + create pattern is also heavier than necessary.


PR description vs. actual diff: setIn()/setOut() dirty flag

The PR description says "Fixes setIn()/setOut() to set dirty = true" but the diff does not change those methods - they still do not mark the record dirty:

// Still unchanged - no dirty = true
public void setOut(final RID out) { this.out = out; }
public void setIn(final RID in)   { this.in = in; }

The dirty = true is only added inside the new set(String, Object) override. Please either update the description or fix setIn()/setOut() as described.


Failure-atomicity gap in moveEdge

deleteEdge(edge);                      // old edge is gone
...
connectOutgoingEdge(fromVertex, ...);  // if this throws...
connectIncomingEdge(toVertex, ...);    // ...this never runs
edge.updateIdentity(...);              // ...and the in-memory object is stale

If an exception occurs after deleteEdge but before both connect* calls complete, the edge is deleted but the vertex linked lists are only partially updated. This leaves the graph in a corrupted state with no way to recover within the same transaction. Consider restructuring so the record update happens last (after the linked list changes succeed), or documenting that callers must handle this exception by rolling back the transaction.


Missing test: CHECK DATABASE after the move

The PR description explicitly mentions that stale EdgeLinkedList entries caused CHECK DATABASE to report corruption. There is no test that actually invokes the database integrity checker after a move to confirm the fix holds. This would be a strong regression guard.


Missing test: updating both @in and @out in one statement

What happens with UPDATE E SET @in = #x:y, @out = #a:b WHERE ...? The SQL executor will call set("@in", ...) then set("@out", ...) sequentially. After the first call, moveEdge runs and updateIdentity replaces this.rid - but this.out is also changed. The second set("@out", ...) will then call moveEdge again on the already-moved edge. Is that correct? A test for this scenario is needed.


Minor: toRID() does not accept String

private static RID toRID(final Object value) {
    if (value instanceof RID r) return r;
    if (value instanceof Identifiable i) return i.getIdentity();
    return null;  // String "#1:2" falls here -> throws IAE
}

For SQL execution this is fine (the parser produces RID objects for #bucket:position literals). But Java API callers who pass a String get an opaque IllegalArgumentException: Cannot set @in to null rather than a helpful message. Consider either adding String support or improving the error message to say what types are accepted.


Minor: no null-check on lookupByRID result before connectOutgoingEdge

final VertexInternal fromVertex = (VertexInternal) database.lookupByRID(newOut, false);

lookupByRID with false (lazy) can return a proxy. The cast to VertexInternal will succeed at this line but connectOutgoingEdge calls fromVertex.modify(), which will throw a ClassCastException or RecordNotFoundException if newOut is not actually a vertex RID. A clear validation error before the deleteEdge call would be safer (validate both endpoints exist and are vertices before mutating anything).


Test style nit

Per CLAUDE.md conventions, prefer:

assertThat(aliceOut.hasNext()).isTrue();
// over
assertThat(aliceOut.hasNext()).isTrue(); // already fine, but the Iterator usage
// below can use the iterator directly in assertThat with custom assertions

Also Iterator usage in the traversal test could use assertThat(...).toIterable() to avoid manual .hasNext()/.next() calls and get better failure messages.


Overall

The bug diagnosis is accurate and the core approach works for the happy path. The main concerns are: (1) the RID-stability issue from delete-and-recreate semantics, (2) the missing atomicity guarantee, and (3) test gaps for the integrity checker and the dual-update scenario. The test for property preservation and traversal correctness is a good start.

@claude
Copy link
Copy Markdown

claude Bot commented May 5, 2026

Code Review: UPDATE EDGE SET @in/@out rewiring

Good fix for a real bug - the three-bug breakdown in the PR description is clear and accurate. The test coverage is solid for the happy path. Here are my observations:


Critical: Edge RID changes after moveEdge

GraphEngine.moveEdge() implements the move via delete + recreate, which assigns a new RID to the edge. This is a significant semantic issue:

  • Any cursor, client-side cache, or other code that holds the old edge RID now has a dangling reference.
  • Indexes on edge properties will be dropped and rebuilt correctly (because of delete+create), but any external bookmarks break silently.
  • The updateIdentity() call updates the single in-memory MutableEdge object, but other in-flight references are not updated.

A safer approach would be to manipulate the EdgeLinkedList structures directly (remove from old endpoint, add to new endpoint) and then update the edge record in place using setOut/setIn + save(), preserving the RID. The deleteEdge implementation already shows how to remove edge entries from vertex lists.


PR description vs. actual diff: setIn/setOut dirty flag

The PR description says:

setIn()/setOut() also failed to mark the record dirty — Fixes setIn()/setOut() to set dirty = true

But the diff for MutableEdge shows the bodies of setIn and setOut are unchanged — neither adds dirty = true. The new set("@in", ...) override does set dirty = true, but calling setIn() / setOut() directly still doesn't. This should either be fixed or the description corrected.


toRID does not handle String RIDs

private static RID toRID(final Object value) {
    if (value instanceof RID r) return r;
    if (value instanceof Identifiable i) return i.getIdentity();
    return null;
}

If the SQL execution layer passes the RID as a String (e.g., "#1:2" from parameter binding or coercion), toRID returns null, causing an IllegalArgumentException("Cannot set @in to null"). The error message will be confusing because the value was not null - it was just in an unexpected type. Either handle String parsing here (via new RID(database, stringValue)) or improve the error message to indicate the value's actual type.


Missing null/type validation in moveEdge

final VertexInternal fromVertex = (VertexInternal) database.lookupByRID(newOut, false);
final Identifiable toVertex = database.lookupByRID(newIn, false);
  • If newOut or newIn does not exist, lookupByRID returns null and subsequent calls NPE.
  • The cast to VertexInternal will throw ClassCastException if the RID points to a document or edge. Consider adding an instanceof check with a clear error message.

Silent fallback when database is not DatabaseInternal

if (rid != null && !newIn.equals(this.in) && database instanceof DatabaseInternal dbInt)
    dbInt.getGraphEngine().moveEdge(this, Vertex.DIRECTION.IN, newIn);
else
    this.in = newIn;  // vertex edge lists NOT updated

If database is not a DatabaseInternal (unlikely in practice but theoretically possible), the code silently falls back to updating only the in-memory field, leaving vertex edge lists stale. If this branch can genuinely occur, it should throw rather than leave the graph in an inconsistent state. If it can never occur, the instanceof pattern guard should be removed in favour of a direct cast.


Test gaps

  1. No CHECK DATABASE assertion - the PR description explicitly says the original bug caused CHECK DATABASE to report a corrupted record. Adding a database.checkDatabase(...) call at the end of testUpdateEdgeInReflectedInTraversal would serve as a direct regression guard.

  2. All tests create and update within a single transaction - no test verifies the behaviour when the UPDATE runs in a separate transaction from the CREATE (cross-transaction scenario).

  3. No negative-case test - no test for updating @in/@out to a non-existent RID or to a non-vertex RID.

  4. No test for updating both @in and @out in a single UPDATE statement - e.g., UPDATE E SET @in = #x:y, @out = #a:b.


Unrelated changes mixed in

The Bolt sparse-vector index support (show sparse_vector index, LSM_SPARSE_VECTOR mapping) is correct and well-tested, but it is logically unrelated to the edge rewiring fix. Mixing it into this PR makes the git history harder to bisect. Consider splitting it into a separate PR, or at least call it out clearly in the milestone entry.


Minor style notes

  • updateIdentity() is package-private (no access modifier). Given it mutates core identity fields (rid, out, in, buffer), explicit /* package */ comment or a note in the javadoc would help future readers understand the intentional visibility.
  • The import reordering in BoltNetworkExecutor.java and BoltProtocolIT.java is cosmetic noise in the diff.

Summary

Severity Issue
High moveEdge changes edge RID - breaks external references
Medium setIn/setOut dirty flag not actually fixed despite PR description
Medium toRID does not handle String RID input
Medium Missing null + type check before VertexInternal cast in moveEdge
Low Silent fallback when database is not DatabaseInternal
Low No CHECK DATABASE regression test
Low Unrelated Bolt changes mixed into the PR

The core idea is sound and the test structure is good - the main concern is the delete+recreate approach changing the edge RID, which is the riskiest part of the implementation.

robfrank and others added 4 commits May 5, 2026 15:28
- Remove dirty=true from setIn/setOut to avoid marking edges dirty during deserialization (JavaBinarySerializer)
- Move dirty=true unconditionally after moveEdge/direct-assign in MutableEdge.set()
- Guard connectOutgoingEdge/connectIncomingEdge with null checks in both IN and OUT branches
- Cache newInVertex and newOutVertex lookups to avoid redundant database.lookupByRID calls
- Replace else with else-if + throw for unsupported DIRECTION values
- Add tests for @out direction, property preservation, and adjacency list consistency

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…veEdge

- Setting @in or @out to null now throws IllegalArgumentException since edges require both endpoints; previously this silently set a null field leaving old vertex edge lists stale
- Cache vOut/vIn from the removal try blocks in moveEdge to reuse them for the connect calls, removing duplicate lookupByRID invocations

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace manual edge-list manipulation with deleteEdge() + standard edge
creation. deleteEdge() already handles removal from both vertex lists;
newEdge creation handles connecting the new endpoints. This also fixes
a latent bug where connectIncomingEdge was called regardless of whether
the edge type is bidirectional.

Add updateIdentity() on MutableEdge (package-private) so SaveElementStep's
subsequent save() writes to the new record RID rather than the deleted one.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@robfrank robfrank force-pushed the fix/update-edge branch from 3aa211c to a1692c4 Compare May 5, 2026 13:28
@claude
Copy link
Copy Markdown

claude Bot commented May 5, 2026

Code Review

This PR fixes a real and important bug - UPDATE EDGE SET @in/@out was silently discarding the value because the serializer reads edge.getIn()/getOut() directly, not the property map. The test suite is well-structured and the traversal test (testUpdateEdgeInReflectedInTraversal) is particularly thorough. A few concerns worth discussing before merge:


Critical: Edge RID changes after UPDATE EDGE

GraphEngine.moveEdge() uses a delete + create approach, which gives the rewired edge a brand-new RID. UPDATE EDGE is semantically an in-place modification, so callers that cached the original RID (application code, other records, indexes) will silently hold a dangling reference after the update.

A safer implementation would:

  1. Remove the edge entry from the old endpoint's EdgeLinkedList
  2. Add the entry to the new endpoint's EdgeLinkedList
  3. Patch edge.in / edge.out in-place and mark dirty - so the single existing record is rewritten with the updated endpoint fields

This avoids changing the RID and avoids the double-write described below.


Major: Double-save after moveEdge()

MutableEdge.set() calls moveEdge(), which already calls newEdge.save() internally, then updates this.rid via updateIdentity(). But set() also sets dirty = true and returns, so the UPDATE executor will call edge.save() a second time on the same (new) record. At minimum this is an extra write per UPDATE EDGE; at worst it can interact unexpectedly with transaction bookkeeping.


Major: Silent fallback when database is not DatabaseInternal

if (rid != null && !newIn.equals(this.in) && database instanceof DatabaseInternal dbInt)
    dbInt.getGraphEngine().moveEdge(this, Vertex.DIRECTION.IN, newIn);
else
    this.in = newIn;   // <-- silently skips vertex list update

If database happens not to be a DatabaseInternal, the code falls through to else and updates only the field - leaving vertex edge-lists stale (the exact corruption bug this PR is fixing). The else branch is only safe for the rid == null case (edge not yet persisted). The instanceof check should be guarded separately, or the code should throw if the list update cannot be performed for a persisted edge.


Major: setIn() / setOut() still don't set dirty = true

The PR description says "Fixes setIn()/setOut() to set dirty = true", but lines 213-219 of MutableEdge.java are unchanged in this diff. Direct Java API calls to edge.setIn(rid) / edge.setOut(rid) will still silently not mark the edge dirty, and won't update vertex lists either. If the intent is to deprecate these methods in favour of set("@in", ...), that should be noted explicitly (or enforced with a deprecation annotation / delegate).


Minor: toRID() does not handle String

private static RID toRID(final Object value) {
    if (value instanceof RID r) return r;
    if (value instanceof Identifiable i) return i.getIdentity();
    return null;
}

A String like "#1:0" returns null, triggering IllegalArgumentException("Cannot set @in to null...") - a confusing message. Since RID already has a String constructor, adding if (value instanceof String s) return new RID(database, s); would improve robustness for Java API callers.


Minor: Missing test cases

  • CHECK DATABASE integrity - the PR description mentions this was previously failing; a test that calls database.check() after rewiring would guard against regressions.
  • setIn()/setOut() Java API - these methods are not exercised by the new tests, even though they are mentioned in the bug description.
  • No-op update (UPDATE E SET @in = <same-vertex>) - the guard !newIn.equals(this.in) makes this a no-op; worth a test to confirm it is not deleted accidentally.
  • Non-existent target vertex - what happens when newVertexRID does not resolve? lookupByRID with loadContent=false returns a shell; the VertexInternal cast and subsequent connectOutgoingEdge call may throw a confusing exception or silently succeed.

Observation: lookupByRID(..., false) after deleteEdge

deleteEdge(edge);
final VertexInternal fromVertex = (VertexInternal) database.lookupByRID(newOut, false);

loadContent=false (lazy) should be fine here because connectOutgoingEdge will load the edge-chunk on demand from the transaction-local dirty pages. Just worth confirming that the edge-chunk modifications made by deleteEdge are visible through a lazy lookup in the same transaction.


Summary: The direction is correct and the test coverage for the SQL path is good. The main blocker is the RID-change semantic issue (delete+create vs. in-place patch), and the double-save side-effect. The silent fallback and dirty-flag gaps are secondary but should be addressed to avoid future confusion.

@robfrank robfrank merged commit ab6b2ab into main May 5, 2026
27 of 29 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant