Skip to content

Add graph-storage crate#4198

Draft
TrueDoctor wants to merge 5 commits into
masterfrom
upstream-graph-storage
Draft

Add graph-storage crate#4198
TrueDoctor wants to merge 5 commits into
masterfrom
upstream-graph-storage

Conversation

@TrueDoctor
Copy link
Copy Markdown
Member

No description provided.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces the graph-storage crate, which provides a delta-based graph representation for the Graphite file format, including CRDT/delta computation, conversion utilities, and round-trip tests. It also cleans up legacy migrations in graph-craft and updates MemoHash to compare values instead of hashes. The review feedback highlights a potential DoS/OOM vulnerability when resizing the exports vector with unvalidated slot indices, suggests optimizing the O(N^2) pairwise comparison in order_consistent to O(N log N) by sorting, and recommends using into_iter().next() instead of drain(..).next() for idiomatic element extraction.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread document/graph-storage/src/lib.rs Outdated
Comment thread document/graph-storage/src/lib.rs Outdated
Comment thread node-graph/graph-craft/src/document/value.rs Outdated
@TrueDoctor
Copy link
Copy Markdown
Member Author

@cubic-ai-dev

@cubic-dev-ai
Copy link
Copy Markdown
Contributor

cubic-dev-ai Bot commented Jun 7, 2026

@cubic-ai-dev

@TrueDoctor I have started the AI code review. It will take a few minutes to complete.

Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

9 issues found across 26 files

Confidence score: 2/5

  • There are multiple high-confidence, user-impacting regressions (sev 7–8) across hashing, graph diffing, and serialization, so merge risk is currently high rather than routine.
  • Most severe: node-graph/libraries/core-types/src/memo.rs makes MemoHash equality ignore hash while Hash still includes it, which can violate hash collection invariants and cause incorrect map/set behavior.
  • Several storage/runtime paths risk dropped or corrupted data: missed network/export diffs in document/graph-storage/src/delta.rs, omitted export-only resources in document/graph-storage/src/from_runtime.rs, and silent truncation in document/graph-storage/src/to_runtime.rs when inputs lengths differ.
  • Pay close attention to node-graph/libraries/core-types/src/memo.rs, document/graph-storage/src/delta.rs, document/graph-storage/src/from_runtime.rs, document/graph-storage/src/to_runtime.rs, and node-graph/libraries/resources/src/lib.rs - they contain the highest-likelihood correctness and round-trip integrity risks.

Reply with feedback, questions, or to request a fix.

Re-trigger cubic

Comment thread node-graph/libraries/resources/src/lib.rs
Comment thread node-graph/libraries/vector-types/src/gradient.rs Outdated
Comment thread node-graph/libraries/core-types/src/memo.rs
Comment thread document/graph-storage/src/from_runtime.rs
Comment thread document/graph-storage/src/to_runtime.rs
Comment thread document/graph-storage/src/delta.rs
Comment thread document/graph-storage/src/delta.rs Outdated
Comment thread node-graph/graph-craft/src/document/value.rs Outdated
Comment thread document/graph-storage/src/from_runtime.rs Outdated
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

11 issues found across 19 files

Confidence score: 2/5

  • Multiple high-severity, high-confidence data-integrity risks make this PR risky to merge as-is, especially in core graph storage/CRDT paths (document/graph-storage/src/attributes.rs, document/graph-storage/src/to_runtime.rs, document/graph-storage/src/crdt.rs, document/graph-storage/src/ids.rs).
  • The most severe issue is inconsistent Priority equality/ordering/hashing in document/graph-storage/src/attributes.rs, which can break HashMap/BTree* behavior and lead to hard-to-debug corruption-like behavior.
  • There are additional concrete regression risks affecting user-visible state correctness: duplicate node IDs can silently collapse nodes during runtime reconstruction, deletes can be resurrected without tombstones, and non-canonical Rev hashing can make identical deltas produce different IDs.
  • Pay close attention to document/graph-storage/src/attributes.rs, document/graph-storage/src/to_runtime.rs, document/graph-storage/src/crdt.rs, document/graph-storage/src/ids.rs, document/graph-storage/src/session.rs, document/graph-storage/src/delta.rs, document/graph-storage/src/model.rs - core invariants and history/diff behavior can diverge from expected graph state.

Reply with feedback, questions, or to request a fix.

Re-trigger cubic

Comment thread document/graph-storage/src/attributes.rs Outdated
Comment thread document/graph-storage/src/to_runtime.rs Outdated
Comment thread document/graph-storage/src/crdt.rs
Comment thread document/graph-storage/src/ids.rs
Comment thread document/graph-storage/src/session.rs Outdated
Comment thread document/graph-storage/src/model.rs
Comment thread document/graph-storage/src/model.rs
Comment thread document/graph-storage/Cargo.toml Outdated
Comment thread document/graph-storage/Cargo.toml Outdated
Comment thread document/graph-storage/Cargo.toml Outdated
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 issues found across 19 files

Confidence score: 3/5

  • There is meaningful regression risk in document/graph-storage/src/session.rs: the unconditional registry reset can drop visible hot-zone edits and desynchronize working state from hot_log when unretired ops are present, which is user-facing state integrity impact.
  • document/graph-storage/src/session.rs also has a fallible path in commit_ops that currently uses assert! for missing parents, so malformed or unexpected input could panic the app instead of returning an error.
  • document/graph-storage/src/ids.rs deriving Default for LamportClock introduces an ambiguous/invalid construction path without an explicit PeerId; this is moderate API correctness risk rather than an immediate blocker.
  • Pay close attention to document/graph-storage/src/session.rs, document/graph-storage/src/ids.rs - state rebuild correctness, panic-free error handling, and explicit clock initialization need validation before merge.
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="document/graph-storage/src/ids.rs">

<violation number="1" location="document/graph-storage/src/ids.rs:49">
P2: `LamportClock` should not be default-constructible because it needs an explicit `PeerId`. Deriving `Default` makes an invalid/ambiguous state easy to create.</violation>
</file>

<file name="document/graph-storage/src/session.rs">

<violation number="1" location="document/graph-storage/src/session.rs:128">
P1: This unconditional registry reset can lose visible hot-zone edits and desynchronize working state from `hot_log` when called with unretired ops present. Rebuild working registry from snapshot plus hot log instead of replacing it directly.</violation>

<violation number="2" location="document/graph-storage/src/session.rs:194">
P2: `commit_ops` can panic on missing parent via `assert!`, crashing the app instead of returning an error from this fallible path.

(Based on your team's feedback about avoiding panicking invariant handling in application code.) [FEEDBACK_USED].</violation>
</file>

Reply with feedback, questions, or to request a fix.

Re-trigger cubic


let revs = self.commit_ops(ops, false)?;
// No hot ops on this path, so the working registry must mirror the advanced snapshot.
self.document.registry = self.document.retired_snapshot.clone();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: This unconditional registry reset can lose visible hot-zone edits and desynchronize working state from hot_log when called with unretired ops present. Rebuild working registry from snapshot plus hot log instead of replacing it directly.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At document/graph-storage/src/session.rs, line 128:

<comment>This unconditional registry reset can lose visible hot-zone edits and desynchronize working state from `hot_log` when called with unretired ops present. Rebuild working registry from snapshot plus hot log instead of replacing it directly.</comment>

<file context>
@@ -0,0 +1,534 @@
+
+		let revs = self.commit_ops(ops, false)?;
+		// No hot ops on this path, so the working registry must mirror the advanced snapshot.
+		self.document.registry = self.document.retired_snapshot.clone();
+		Ok(revs)
+	}
</file context>

pub const ORIGIN: Self = TimeStamp { counter: 0, peer: PeerId(0) };
}

#[derive(Copy, Clone, Debug, Default, Serialize, Deserialize)]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: LamportClock should not be default-constructible because it needs an explicit PeerId. Deriving Default makes an invalid/ambiguous state easy to create.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At document/graph-storage/src/ids.rs, line 49:

<comment>`LamportClock` should not be default-constructible because it needs an explicit `PeerId`. Deriving `Default` makes an invalid/ambiguous state easy to create.</comment>

<file context>
@@ -0,0 +1,92 @@
+	pub const ORIGIN: Self = TimeStamp { counter: 0, peer: PeerId(0) };
+}
+
+#[derive(Copy, Clone, Debug, Default, Serialize, Deserialize)]
+pub struct LamportClock {
+	pub(crate) counter: u64,
</file context>

let rev = delta.id;

for parent in &delta.parents {
assert!(self.document.history.contains_key(parent), "commit parent must be in history");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: commit_ops can panic on missing parent via assert!, crashing the app instead of returning an error from this fallible path.

(Based on your team's feedback about avoiding panicking invariant handling in application code.) .

View Feedback

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At document/graph-storage/src/session.rs, line 194:

<comment>`commit_ops` can panic on missing parent via `assert!`, crashing the app instead of returning an error from this fallible path.

(Based on your team's feedback about avoiding panicking invariant handling in application code.) .</comment>

<file context>
@@ -0,0 +1,534 @@
+			let rev = delta.id;
+
+			for parent in &delta.parents {
+				assert!(self.document.history.contains_key(parent), "commit parent must be in history");
+			}
+			let mode = if idempotent { ApplyMode::Idempotent } else { ApplyMode::Live };
</file context>

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant