feat: YAML import pipeline — INTAKE milestone#188
Conversation
Implements INTAKE milestone — schema-validated YAML import with version checking, reference validation (no dangling edges), idempotent merge, atomic writes, node properties, and dry-run mode. Adds js-yaml as direct dependency. 22 new tests (117 total).
📝 WalkthroughWalkthroughAdds a YAML import pipeline and a CLI subcommand Changes
Sequence Diagram(s)sequenceDiagram
participant CLI as CLI (git mind import)
participant Cmd as importCmd
participant Parser as parseImportFile
participant Validator as validateImportData
participant Writer as writeImport
participant Graph as Graph
CLI->>Cmd: importCmd(cwd, filePath, {dryRun, json})
Cmd->>Parser: parseImportFile(filePath)
Parser-->>Cmd: {data, parseError}
alt parseError
Cmd-->>CLI: result(valid:false, errors)
else
Cmd->>Validator: validateImportData(data, graph)
Validator-->>Cmd: {valid, errors, warnings, declaredNodeIds}
alt invalid
Cmd-->>CLI: result(valid:false, errors, warnings)
else valid
alt dryRun
Cmd-->>CLI: result(valid:true, dryRun:true, stats)
else write
Cmd->>Writer: writeImport(graph, data)
Writer->>Graph: apply patch / commit
Graph-->>Writer: commit result
Writer-->>Cmd: {nodesWritten, edgesWritten}
Cmd-->>CLI: result(valid:true, stats)
end
end
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
No actionable comments were generated in the recent review. 🎉 Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
# Conflicts: # CHANGELOG.md
|
@coderabbitai review please |
|
🧠 Learnings used✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 6
🤖 Fix all issues with AI agents
In `@bin/git-mind.js`:
- Around line 126-138: Update the import command usage text to mention the
undocumented alias: when parsing the 'import' case (the args handling that sets
dryRun using args.includes('--dry-run') || args.includes('--validate') and calls
importCmd), add `--validate` to the usage/help output (or explicitly document it
as an alias of `--dry-run`) so the message shown by console.error for the import
subcommand reflects both `--dry-run` and `--validate`.
In `@src/import.js`:
- Around line 186-192: The code currently resets the edge 'createdAt' on every
import by calling patch.setEdgeProperty(edge.source, edge.target, edge.type,
'createdAt', new Date().toISOString()); change this so creation timestamp is
preserved: either rename the field to 'importedAt' or 'updatedAt' if you intend
to refresh it, or, preferably, only set 'createdAt' when it does not already
exist by checking the existing edge properties (e.g., via the
patch/getEdge/getEdgeProperty helper before calling patch.setEdgeProperty);
update the logic around patch.setEdgeProperty(...) for 'createdAt' accordingly
and keep the existing rationale handling.
- Around line 96-148: The edge-validation pushes separate missing-field errors
for source/target/type then redundantly rechecks all three with a combined if
that causes a continue; refactor the required-field logic in the edge loop (for
variables edge in edgeEntries) to first collect missing required fields
(source/target/type), push one error per missing field, and if any are missing
immediately continue so later checks (validateNodeId, validateEdgeType,
validateConfidence, self-edge and reference checks using validateNodeId,
validateEdgeType, validateConfidence, declaredNodeIds, existingNodes) only run
when all three required fields are present; remove the duplicated combined check
to improve readability and ensure consistent control flow.
- Around line 160-202: The writeImport function currently only checks typeof
node.properties === 'object' which allows arrays through and causes
Object.entries to write numeric-indexed properties; update writeImport to ensure
node.properties is a plain object by adding a guard like node.properties !==
null && !Array.isArray(node.properties) before iterating, or alternatively
enforce this validation in validateImportData so writeImport only ever receives
a non-array object; reference the writeImport function and the node.properties
check (and mirror the same Array.isArray guard used/needed in parseImportFile)
when making the change.
- Around line 28-42: The function parseImportFile currently treats any non-null
object as valid, which lets YAML arrays (typeof [] === 'object') slip through
and later trigger misleading "Missing required field" errors; update the guard
in parseImportFile to explicitly reject arrays (e.g., check Array.isArray(data))
so that if data is null, not an object, or is an array you return { data: null,
parseError: 'YAML file is empty or not an object' } — adjust the condition that
inspects data (the variable named data in parseImportFile) and keep the existing
error text and return shape.
In `@test/import.test.js`:
- Around line 322-343: The test should assert that createdAt timestamps are
preserved across idempotent imports: after the first import (using importFile),
fetch the nodes and edges via graph.getNodes()/graph.getEdges() and record their
createdAt values (for the node with id "spec:auth" and/or edge from
"file:auth.js" to "spec:auth"), then perform the second import and re-fetch
nodes/edges and assert the createdAt values are equal (i.e., not overwritten);
update the 're-import produces same graph state' test to capture and compare
createdAt from the first and second imports using the existing importFile,
graph.getNodes, and graph.getEdges symbols.
- Reject YAML arrays in parseImportFile (typeof [] === 'object' pitfall) - Validate node.properties is a plain object, not an array - Rename edge createdAt to importedAt (honest re-import semantics) - Refactor edge required-field checks for cleaner control flow - Document --validate alias in CLI usage text
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@test/import.test.js`:
- Around line 1-9: The imports parseImportFile and validateImportData are not
used directly in the tests; either remove these unused imports from the test
file or add direct unit tests exercising parseImportFile and validateImportData
(in addition to the existing importFile tests) so they are referenced. Locate
the import statement that includes importFile, parseImportFile,
validateImportData and either delete parseImportFile and validateImportData from
that import list, or add new it() blocks that call parseImportFile(...) and
validateImportData(...) with representative inputs and assertions to cover their
behavior.
- Around line 343-364: Update the "re-import produces same graph state" test to
also assert that the importedAt timestamp is refreshed on re-import: after the
first import (using importFile) call graph.getNodes() and capture the importedAt
value for the node(s) (e.g., the "file:auth" node), wait a short interval (so
the timestamp can change), call importFile again, then fetch nodes again via
graph.getNodes() and assert the new importedAt is later/greater than the
previous value; keep the existing assertions on counts (nodes/edges) intact. Use
the existing helpers (writeYaml, importFile, graph.getNodes) and reference the
node id "file:auth" when locating the timestamp to compare.
| describe('idempotency', () => { | ||
| it('re-import produces same graph state', async () => { | ||
| const path = await writeYaml('graph.yaml', ` | ||
| version: 1 | ||
| nodes: | ||
| - id: "spec:auth" | ||
| - id: "file:auth.js" | ||
| edges: | ||
| - source: "file:auth.js" | ||
| target: "spec:auth" | ||
| type: "implements" | ||
| `); | ||
| // Import twice | ||
| await importFile(graph, path); | ||
| await importFile(graph, path); | ||
|
|
||
| const nodes = await graph.getNodes(); | ||
| const edges = await graph.getEdges(); | ||
|
|
||
| expect(nodes.length).toBe(2); | ||
| expect(edges.length).toBe(1); | ||
| }); |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Idempotency test could be strengthened by verifying importedAt is updated on re-import.
The test confirms node/edge counts remain stable, but doesn't verify that importedAt is refreshed (which is the expected behavior for an updatedAt-style timestamp). Adding a brief delay and asserting importedAt changed would confirm the timestamp semantics end-to-end.
Suggested enhancement
// Import twice
await importFile(graph, path);
+ const edges1 = await graph.getEdges();
+ const ts1 = edges1[0].props.importedAt;
+
await importFile(graph, path);
const nodes = await graph.getNodes();
const edges = await graph.getEdges();
expect(nodes.length).toBe(2);
expect(edges.length).toBe(1);
+ // importedAt should be refreshed on re-import
+ expect(edges[0].props.importedAt).toBeDefined();🤖 Prompt for AI Agents
In `@test/import.test.js` around lines 343 - 364, Update the "re-import produces
same graph state" test to also assert that the importedAt timestamp is refreshed
on re-import: after the first import (using importFile) call graph.getNodes()
and capture the importedAt value for the node(s) (e.g., the "file:auth" node),
wait a short interval (so the timestamp can change), call importFile again, then
fetch nodes again via graph.getNodes() and assert the new importedAt is
later/greater than the previous value; keep the existing assertions on counts
(nodes/edges) intact. Use the existing helpers (writeYaml, importFile,
graph.getNodes) and reference the node id "file:auth" when locating the
timestamp to compare.
|
@coderabbitai review please |
|
🧠 Learnings used✅ Actions performedReview triggered.
|
|
@coderabbitai review please |
|
🧠 Learnings used✅ Actions performedReview triggered.
|
|
@coderabbitai review please |
|
🧠 Learnings used✅ Actions performedReview triggered.
|
Summary
git mind import <file>— schema-validated YAML graph ingestionversion: 1required, unknown versions fail hardcreatePatch().commit()--dry-run/--validatevalidates without writing--jsonfor CI-friendly structured outputproperties:support in YAML formatImport format
Files changed
src/import.js— new: core import logic (parse, validate, write)test/import.test.js— new: 22 testspackage.json— addedjs-yamldirect dependencysrc/cli/commands.js— addedimportCmdcommandsrc/cli/format.js— addedformatImportResult()bin/git-mind.js— wired upimportcommand with--dry-run,--validate,--jsonsrc/index.js— exportsimportFile,parseImportFile,validateImportDataCHANGELOG.md— updatedTest plan
Closes #187
Summary by CodeRabbit
New Features
importCLI command to load YAML graph files with validation, warnings, and a summary of planned or applied changes--dry-runto validate without writing and--jsonfor machine-readable outputTests