Implement InMemoryStorage with petgraph for dependency graph (rivets-…#4
Conversation
…bz5)
This commit implements the first storage backend for rivets, providing a fast,
ephemeral in-memory storage implementation using HashMap and petgraph.
**Implementation:**
- **InMemoryStorageInner**: Core data structure with:
- `HashMap<IssueId, Issue>` for O(1) issue lookups
- `petgraph::DiGraph` for dependency graph management
- `HashMap<IssueId, NodeIndex>` for mapping issues to graph nodes
- `IdGenerator` integration for hash-based ID generation
- **Thread Safety**: Wrapped in `Arc<Mutex<>>` for async thread-safe access
- **IssueStorage Trait**: Implements all 15 methods from the storage trait:
- CRUD: create, get, update, delete
- Dependencies: add/remove dependency, get dependencies/dependents, cycle detection
- Queries: list, ready_to_work, blocked_issues
- Batch: import_issues, export_all
- Persistence: save (no-op for in-memory)
**Key Features:**
- **Cycle Detection**: Uses petgraph's `has_path_connecting` algorithm
- **Ready Work Algorithm**: BFS traversal with depth limit (50) for transitive blocking
- **Blocking Detection**: Finds issues blocked by open dependencies
- **Dependency Graph**: Supports all 4 dependency types (blocks, related, parent-child, discovered-from)
- **Validation**: Integrates with `NewIssue::validate()` for consistent validation
**Changes:**
- Added `tokio = { features = ["sync"] }` to workspace Cargo.toml for async Mutex support
- Created `crates/rivets/src/storage/in_memory.rs` (904 lines)
- Reorganized storage module: `storage.rs` → `storage/mod.rs`
- Updated `create_storage()` factory to instantiate InMemoryStorage
- Added 13 comprehensive unit tests covering all functionality
**Test Coverage:**
- CRUD operations (create, get, update, delete)
- Dependency management (add, remove, get dependencies/dependents)
- Cycle detection (prevents circular dependencies)
- Delete protection (prevents deleting issues with dependents)
- Filtering (by priority, status, type, assignee, label)
- Ready work algorithm (excludes blocked issues)
- Blocked issues detection
- Import/export operations
- Save operation (no-op)
All tests passing (41 total: 28 existing + 13 new).
Addresses rivets-bz5 acceptance criteria:
✅ InMemoryStorage implements all trait methods
✅ Uses petgraph for dependency graph
✅ Cycle detection working with graph algorithms
✅ Ready work calculation via graph traversal
✅ All CRUD operations functional
✅ Unit tests for all operations
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Pull Request Review: InMemoryStorage ImplementationSummaryThis PR implements a solid foundation for the in-memory storage backend with petgraph-based dependency graph management. The implementation is well-structured, thoroughly tested, and follows Rust best practices. Overall, this is high-quality work with just a few areas for improvement. ✅ Strengths1. Excellent Architecture
2. Strong Test Coverage
3. Robust Error Handling
4. Code Quality
🔍 Areas for Improvement1. Potential Graph Corruption on Dependency Creation
|
This commit addresses all feedback from the code review: 1. **CRITICAL: Fix graph corruption in create method** - Refactored to validate ALL dependencies before modifying storage - Add temporary node for cycle checking with rollback on failure - Prevents inconsistent state if validation fails after issue is added 2. **CRITICAL: Document import_issues dependency limitation** - Added export_dependencies() and import_dependencies() helper methods - Updated test to clarify that dependencies are NOT preserved by import_issues() - Added comprehensive documentation explaining the limitation and workarounds - This is a known limitation of the current IssueStorage trait API 3. **MEDIUM: Address ID generator inefficiency** - Optimized to only recreate generator when crossing ID length thresholds (500, 1500) - Reduced from O(n) on every create to O(1) amortized - Added database_size() getter method to IdGenerator - Only performs O(n) re-registration at threshold boundaries 4. **LOW: Add concurrency and performance tests** - Added test_concurrent_creates to verify Arc<Mutex<>> correctness - Added test_performance_benchmark to validate performance claims - Performance: 1000 issues created in ~14ms (well within acceptable bounds) 5. **SUGGESTION: Update documentation clarity** - Enhanced module-level documentation to clearly explain ephemeral vs persistent storage - Added explicit note that save() is a no-op for in-memory backend - Added performance characteristics section (Big-O complexity) - Improved thread safety documentation All 43 tests passing. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Code Review - PR 4: InMemoryStorage ImplementationThis PR implements a solid foundation for the in-memory storage backend with petgraph integration. The second commit addresses previous review feedback effectively. Strengths1. Excellent Architecture and Design
2. Strong Performance Optimizations
3. Comprehensive Test Coverage
4. Code Quality
Issues and SuggestionsMEDIUM: Dependency Import Limitation DocumentationLocation: in_memory.rs:628-659 The import_issues() method does not preserve dependencies, which is documented with a TODO comment. While the helper methods export_dependencies() and import_dependencies() exist (lines 699-739), they are marked allow(dead_code) and not exposed through the trait. Recommendation:
LOW: Priority Validation InconsistencyLocation: in_memory.rs:276-280 The update() method validates priority inline (if priority > 4), but create() delegates to NewIssue::validate(). This creates two sources of truth for the same validation logic. Recommendation: Extract priority validation to a shared helper function or constant LOW: Missing Validation in import_issues()Location: in_memory.rs:628-659 The import_issues() method bypasses the usual validation that create() performs via NewIssue::validate(). While this may be intentional for performance during bulk imports, it could allow invalid data to enter the system if importing from untrusted sources. Options:
SUGGESTION: Consider RwLock for Read-Heavy WorkloadsLocation: in_memory.rs:154 (type alias) Currently uses Mutex which blocks all access during any operation. For read-heavy workloads (common in issue trackers), RwLock would allow concurrent reads. Recommendation: Mutex is fine for MVP. Consider RwLock as a future optimization if profiling shows lock contention on read operations. VerdictStatus: Approve with minor suggestions This PR represents excellent work that:
The identified issues are non-blocking and mostly suggestions for future improvement or consistency. The critical graph corruption bug was fixed properly, and the ID generator optimization is well-implemented. Recommended Next Steps:
Great job! Reviewed by: Claude Code |
Issue #4 - Workspace Discovery Not Utilized: - Add discover_and_set_workspace() convenience method to Context - Combines discover_workspace() and set_workspace() for better UX - Add test for nested directory discovery Issue #6 - Misleading Error for Uninitialized Workspace: - Add WorkspaceNotInitialized error variant - Update storage_for() to return this instead of NoContext - Clearer error: "Workspace not initialized: /path. Call set_context first." - Add test for uninitialized workspace error Issue #8 - Inconsistent Error Handling: - Change WorkspaceNotFound from String to struct with source field - Preserve underlying IO error from canonicalize() failures - Better debugging with error chain support via #[source] - Add test for nonexistent path error 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
…ivets-0gom round 4) Round-4 review (Claude bot, run #25710418596) had 4 substantive findings. Decisions per gilfoyle/assessing-review-feedback: - Finding #1 (import-less file behavior test): REJECT - already tracked via rivets-3d0s notes; testing the Pass-2 short-circuit is a different bug class - Finding #2 (missing fallthrough-path test): ACCEPT - added here - Finding #3 (debug_assert! for LIKE wildcards): REJECT after evidence - initially accepted, then reverted when the suggested assert fired on legitimate snake_case paths (crate_a/, crate_b/, etc.). Reviewer's premise that "crate directory paths essentially never contain these characters" is false in real Rust workspaces. Proper fix is LIKE ESCAPE clause, deferred as out of scope. - Finding #4 (Unix normalize_path alloc nit): REJECT - reviewer marked "purely a style note" See .rivets-0gom/review-decisions-round-4.md for the full per-finding log. The new test fallback_resolves_via_unscoped_when_no_same_crate_candidate pairs with the round-3 test as a fallthrough-path companion: when shared_helper exists ONLY in crate_b (no same-crate candidate), the resolver must drop through to search_unique_symbol_by_name and produce the cross-crate edge. A regression that fails to fall through (e.g., early-returning after the same-crate miss) would orphan the reference and fail the positive assertion. 588 tethys tests pass, clippy + fmt clean.
…tion, streaming-mode test coverage (PR #65) Three findings applied; one rejected; one deferred to tracker. Applied: - claude #1: dropped (rivets-lcb6) refs from db/file_deps.rs::clear_all_file_deps doc and indexing.rs inline comment. CLAUDE.md: source comments shouldn't reference current task/fix/issue IDs since those rot. - claude #2: file_deps_removed_when_use_statement_deleted now uses assert_eq!(after.row_count, before.row_count - 1, ...) rather than the looser `<` — catches both "stale persists" (after == before) AND "removed too many" (after < before - 1) regressions. - gemini #5: file_deps_idempotency.rs refactored with #[rstest] + #[case::batch(IndexOptions::default)] / #[case::streaming( IndexOptions::with_streaming)]. 4 cases (2 tests x 2 modes). The clear_all_file_deps call runs before the streaming/non-streaming branch, so the fix applies to both; the test matrix now verifies it. Both modes verified to FAIL pre-fix (reverted the clear, all 4 cases panic) and pass post-fix. Deferred: - claude #3: non-transactional clear risk (pipeline panic between clear and re-populate leaves table empty). Pre-existing pattern shared with clear_all_call_edges; not introduced by this PR. Filed as rivets-ml05 (P4 bug) for atomicity work that needs to consider the streaming batch_writer's chunked-commit pattern. Rejected: - gemini #4: N+1 / stale-file concern in compute_all_dependencies. The function does exist (gemini was right about the name; my initial doubt was wrong), but the concern is about a different bug class (orphan files lingering in DB) that doesn't apply post-rebuild because db/files.rs cascades deletes on file removal. Performance concern about per-file iteration is a separate optimization, not in scope for the resolver-correctness epic. Full per-finding decision log: .rivets-lcb6/review-decisions-round-1.md Verification: cargo nextest run -p tethys -> 615 pass / 8 skipped (was 613, +2 from the rstest parameterization split); cargo clippy -D warnings clean; cargo fmt clean.
…efer
Initial verdict cited db/files.rs:145-146 cascade DELETEs as evidence that
orphan files don't linger after re-index. Verification showed those DELETEs
only fire when an EXISTING file is RE-indexed (the `updated > 0` branch),
not when a file is deleted from disk. Files deleted from disk are never
processed by the indexer, so no cleanup fires for them.
Concrete consequence: compute_all_dependencies (streaming-only path)
iterates self.db.list_all_files() which includes orphan rows, reads their
stale stored imports + refs, and re-inserts file_deps with the orphan as
from_file_id. Downstream queries (coupling, callers, cycles) then treat
the ghost file as a real edge source.
Pre-existing bug in streaming mode; not caused by rivets-lcb6. Filed as
rivets-dhxo (P3 bug). Updated decision log to:
- reframe the finding as "orphan-file handling in compute_all_dependencies"
- flip verdict from reject → defer
- add a "correction lesson" note: round-1 conflated scope with validity
when the cascade-cleanup claim went unverified
…vior (PR #67 round 4, finding F4) Per round-4 review (Claude #4): plan-v3-k-hybrid.md's C10 row explicitly called for "needs CI test: fixture with orphan-dir file using a workspace crate name as a symbol — verify the orphan→workspace edge is dropped." Slice 2's `file_deps_corroboration.rs` only covered C7+C8 (FORBIDDEN drop + corroborated ALLOWED preserve); C10 (orphan pseudo-crate handling) was unit-tested in `k_hybrid_filter_tests` but not exercised through the full `tethys index` pipeline. The bot caught the gap. ## What's added `build_orphan_dir_workspace()` — fixture with 1 Cargo crate (`crate_caller`) + 1 non-Cargo orphan directory (`examples/oddball.rs`). The orphan file defines `OrphanThing::len()`. The caller's `some_input.len()` call would, pre-K-hybrid, resolve to that workspace `len` method as the unique match. `k_hybrid_drops_workspace_crate_to_orphan_dir_phantom_edge` test: indexes the workspace, asserts no `crate_caller/* → examples/*` edge exists in `file_deps`. Post-K-hybrid the filter must drop it (no Rust `use` statement can corroborate an import from an orphan directory — the orphan-pseudo-crate name `orphan:examples` contains `:` which is not a valid Rust identifier, so corroboration is structurally impossible). ## Falsifiability The new test exercises the same filter code path as the existing `k_hybrid_drops_cross_crate_call_without_import_corroboration` test (both hit the `(Some(_), Some(b))` cross-crate arm). The slice-2 commit (dbba5b7) already documented manual falsifiability for that path — reverting the filter to always-keep makes the cross-crate test fail with the predicted phantom edge. The orphan test would fail the same way for the same reason, so the falsifiability is structurally proven without a separate manual check. ## Other round-4 findings (in PR comment / decision log) - F1 (workspace-root orphan "inconsistency" with Python sim): REJECTED. Verified both implementations behave identically — bot called them "inconsistent" but admitted "behaviorally equivalent." - F2 (fully-qualified path false negatives): already-documented negative space; no action. - F3 (resolver_routing.rs change correctness): positive observation, no action. ## Gates - cargo nextest run -p tethys: 630 pass, 6 skipped (+1) - cargo clippy -p tethys --all-targets --all-features -- -D warnings: clean - cargo fmt --check: clean
Round-3 review-feedback assessment of two new claude-code-review comments (00:38Z and 00:45Z) that reviewed the round-1 state. Two new actions; rest were carryover rejections or already-fixed-in-round-2. R3-1 (claude #3): verification.md said "5 tests" but the file as-committed in round-1 has 8 (the 3 round-1 review-decisions additions: I2, I3, S3-partial). Update the test-count and total-suite-count lines to current state with a brief note crediting the round-1 commit. R3-2 (claude #4): the `super::` filesystem-walk-vs-Rust-spec divergence is documented in the new `self_and_super_paths_resolve_via_as_written` test docstring but had no tracker entry. Per the project's tracker-discipline rule, documented-only deferrals rot in `.<issue-id>/` dirs. Filed rivets-nkjd to track the divergence and updated the test docstring to reference it. Rejected (carryover from prior rounds): - `(*s).to_string()` style: clippy doesn't flag it (round-1 S6 rationale stands; two reviewers' subjective preference doesn't override the linter). - `tail` allocation timing: round-2 R2-2 rationale stands (sub-microsecond). Already fixed: - Disjunctive assertion in `workspace_crate_prefixed_call_resolves`: both reviewers flagged it; round-2 R2-1 fixed it in commit 80167b3. They were reading the round-1 state. Decision log: `.rivets-044i/review-decisions-round-3.md`. Verification: - 8/8 pass2_qualified_paths tests pass. - `cargo clippy --all-targets --all-features -- -D warnings` clean. - `cargo fmt --check` clean.
Round-3 review-feedback assessment of two new claude-code-review comments (00:38Z and 00:45Z) that reviewed the round-1 state. Two new actions; rest were carryover rejections or already-fixed-in-round-2. R3-1 (claude #3): verification.md said "5 tests" but the file as-committed in round-1 has 8 (the 3 round-1 review-decisions additions: I2, I3, S3-partial). Update the test-count and total-suite-count lines to current state with a brief note crediting the round-1 commit. R3-2 (claude #4): the `super::` filesystem-walk-vs-Rust-spec divergence is documented in the new `self_and_super_paths_resolve_via_as_written` test docstring but had no tracker entry. Per the project's tracker-discipline rule, documented-only deferrals rot in `.<issue-id>/` dirs. Filed rivets-nkjd to track the divergence and updated the test docstring to reference it. Rejected (carryover from prior rounds): - `(*s).to_string()` style: clippy doesn't flag it (round-1 S6 rationale stands; two reviewers' subjective preference doesn't override the linter). - `tail` allocation timing: round-2 R2-2 rationale stands (sub-microsecond). Already fixed: - Disjunctive assertion in `workspace_crate_prefixed_call_resolves`: both reviewers flagged it; round-2 R2-1 fixed it in commit 80167b3. They were reading the round-1 state. Decision log: `.rivets-044i/review-decisions-round-3.md`. Verification: - 8/8 pass2_qualified_paths tests pass. - `cargo clippy --all-targets --all-features -- -D warnings` clean. - `cargo fmt --check` clean.
…bz5)
This commit implements the first storage backend for rivets, providing a fast, ephemeral in-memory storage implementation using HashMap and petgraph.
Implementation:
InMemoryStorageInner: Core data structure with:
HashMap<IssueId, Issue>for O(1) issue lookupspetgraph::DiGraphfor dependency graph managementHashMap<IssueId, NodeIndex>for mapping issues to graph nodesIdGeneratorintegration for hash-based ID generationThread Safety: Wrapped in
Arc<Mutex<>>for async thread-safe accessIssueStorage Trait: Implements all 15 methods from the storage trait:
Key Features:
has_path_connectingalgorithmNewIssue::validate()for consistent validationChanges:
tokio = { features = ["sync"] }to workspace Cargo.toml for async Mutex supportcrates/rivets/src/storage/in_memory.rs(904 lines)storage.rs→storage/mod.rscreate_storage()factory to instantiate InMemoryStorageTest Coverage:
All tests passing (41 total: 28 existing + 13 new).
Addresses rivets-bz5 acceptance criteria:
✅ InMemoryStorage implements all trait methods
✅ Uses petgraph for dependency graph
✅ Cycle detection working with graph algorithms
✅ Ready work calculation via graph traversal
✅ All CRUD operations functional
✅ Unit tests for all operations
🤖 Generated with Claude Code