feat: YAML import pipeline — INTAKE milestone#188

Merged
flyingrobots merged 6 commits into main from roadmap/intake
Feb 11, 2026

Conversation

@flyingrobots
Owner

@flyingrobots flyingrobots commented Feb 11, 2026

Summary

  • Adds git mind import <file> — schema-validated YAML graph ingestion
  • version: 1 required, unknown versions fail hard
  • Reference validation: all edge endpoints must be declared in the file or pre-exist in the graph
  • Atomic writes: all-or-nothing via single createPatch().commit()
  • Idempotent: re-importing the same file is safe (nodes merge, edges update)
  • --dry-run / --validate validates without writing
  • --json for CI-friendly structured output
  • Node properties: supported in the YAML format via an optional properties map on each node
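
The atomic-write and idempotency bullets above can be illustrated with a toy in-memory graph. This is a minimal sketch, not the code in src/import.js: `ToyGraph`, `ToyPatch`, `writeAll`, and their methods are hypothetical stand-ins, since the real graph's `createPatch().commit()` signatures are not shown in this PR.

```javascript
// Hypothetical in-memory stand-ins; the real graph API is not shown in this PR.
class ToyGraph {
  constructor() {
    this.nodes = new Map(); // id -> properties object
    this.edges = new Map(); // "source|type|target" -> properties object
  }
  createPatch() {
    return new ToyPatch(this);
  }
}

class ToyPatch {
  constructor(graph) {
    this.graph = graph;
    this.ops = []; // staged operations; nothing touches the graph until commit()
  }
  addNode(id, props = {}) {
    // Merge into any existing node so a re-import does not duplicate it.
    this.ops.push((g) => g.nodes.set(id, { ...(g.nodes.get(id) ?? {}), ...props }));
    return this;
  }
  addEdge(source, target, type, props = {}) {
    const key = `${source}|${type}|${target}`;
    this.ops.push((g) => g.edges.set(key, { ...(g.edges.get(key) ?? {}), ...props }));
    return this;
  }
  commit() {
    for (const op of this.ops) op(this.graph);
  }
}

// Stage everything first, then commit once: an error thrown before commit()
// leaves the graph untouched.
function writeAll(graph, data) {
  const patch = graph.createPatch();
  for (const node of data.nodes ?? []) patch.addNode(node.id, node.properties);
  for (const edge of data.edges ?? []) {
    if (edge.source === edge.target) throw new Error(`self-edge: ${edge.source}`);
    patch.addEdge(edge.source, edge.target, edge.type, { confidence: edge.confidence });
  }
  patch.commit();
}
```

Calling `writeAll` twice with the same document leaves the node and edge counts unchanged, because nodes and edges are keyed by identity and merged rather than appended. In the PR itself validation runs as a separate step before any write; the throw inside `writeAll` only stands in for that here.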

Import format

version: 1
nodes:
  - id: "spec:auth"
  - id: "file:src/auth.js"
    properties:
      status: active
edges:
  - source: "file:src/auth.js"
    target: "spec:auth"
    type: implements
    confidence: 0.9
    rationale: "Main auth module"
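
To make the schema and reference rules concrete, here is a rough sketch of how an already-parsed document of this shape might be checked. It is not the `validateImportData` from src/import.js: the function name, error strings, return shape, and the assumed `[0, 1]` range for `confidence` are all illustrative guesses, and it expects every node and edge entry to be an object.

```javascript
// Sketch only: error strings, return shape, and the confidence range are assumptions.
function validateImportSketch(data, existingNodeIds = new Set()) {
  const errors = [];
  if (data === null || typeof data !== 'object' || Array.isArray(data)) {
    return { valid: false, errors: ['document is not an object'] };
  }
  if (data.version !== 1) {
    errors.push(`unsupported version: ${String(data.version)}`);
  }
  const nodes = data.nodes ?? [];
  const edges = data.edges ?? [];
  if (!Array.isArray(nodes)) errors.push('nodes must be an array');
  if (!Array.isArray(edges)) errors.push('edges must be an array');
  if (errors.length > 0) return { valid: false, errors };

  // Endpoints may be declared in this file or already present in the graph.
  const declared = new Set(nodes.map((n) => n?.id));
  for (const [i, edge] of edges.entries()) {
    for (const end of ['source', 'target']) {
      const id = edge[end];
      if (!declared.has(id) && !existingNodeIds.has(id)) {
        errors.push(`edges[${i}].${end}: unknown node "${id}"`);
      }
    }
    if (edge.source === edge.target) errors.push(`edges[${i}]: self-edge`);
    const c = edge.confidence;
    if (c !== undefined && !(typeof c === 'number' && c >= 0 && c <= 1)) {
      errors.push(`edges[${i}]: confidence must be a number in [0, 1]`);
    }
  }
  return { valid: errors.length === 0, errors };
}
```

Passing the set of node IDs already in the graph as the second argument is what lets an edge point at a node that this file does not declare.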

Files changed

  • src/import.js — new: core import logic (parse, validate, write)
  • test/import.test.js — new: 22 tests
  • package.json — added js-yaml direct dependency
  • src/cli/commands.js — added importCmd command
  • src/cli/format.js — added formatImportResult()
  • bin/git-mind.js — wired up import command with --dry-run, --validate, --json
  • src/index.js — exports importFile, parseImportFile, validateImportData
  • CHANGELOG.md — updated

Test plan

  • 117 tests passing (22 new import tests)
  • Schema validation: missing version, wrong version, non-object YAML, non-array fields
  • Node validation: invalid IDs, missing ID field, unknown prefix warnings
  • Edge validation: invalid types, invalid confidence, self-edge rejection, missing fields
  • Reference validation: dangling edges rejected, pre-existing graph nodes accepted
  • Successful import with node properties, confidence, rationale
  • Dry run validates without writing
  • Idempotent re-import
  • Atomic failure: nothing written on validation error
  • Empty file (version only) handled gracefully

Closes #187

Summary by CodeRabbit

  • New Features

    • Add import CLI command to load YAML graph files with validation, warnings, and a summary of planned or applied changes
    • Supports --dry-run to validate without writing and --json for machine-readable output
    • Terminal-friendly formatted results showing success/failure, counts, and warnings
  • Tests

    • Added comprehensive tests covering validation, atomicity, dry-run behavior, idempotency, and result structure

Implements INTAKE milestone — schema-validated YAML import with version
checking, reference validation (no dangling edges), idempotent merge,
atomic writes, node properties, and dry-run mode. Adds js-yaml as
direct dependency. 22 new tests (117 total).
@coderabbitai
Contributor

coderabbitai bot commented Feb 11, 2026

📝 Walkthrough

Adds a YAML import pipeline and a CLI subcommand git mind import that parses, validates, optionally dry-runs, and atomically writes nodes and edges into the graph; exposes import helpers via public exports and includes tests and a YAML parser dependency.

Changes

| Cohort / File(s) | Summary |
|---|---|
| CLI Entrypoint (`bin/git-mind.js`) | Registers import <file> usage, validates presence of <file>, parses --dry-run/--validate/--json flags, and dispatches to importCmd. |
| CLI Command Handler (`src/cli/commands.js`) | Adds exported importCmd(cwd, filePath, opts) that loads graph, calls import pipeline, emits JSON or formatted output, and sets exit codes on errors. |
| CLI Formatting (`src/cli/format.js`) | Adds formatImportResult(result) to render validation/errors/warnings and import stats for terminal output. |
| Import Pipeline (`src/import.js`) | New YAML import implementation: parseImportFile, validateImportData, importFile (internal writeImport). Handles version/schema, node/edge validation, reference checks, dry-run, atomic commit, and returns structured results. |
| Public Exports (`src/index.js`) | Re-exports importFile, parseImportFile, and validateImportData from src/import.js. |
| Dependencies (`package.json`) | Adds js-yaml dependency (^4.1.1). |
| Tests (`test/import.test.js`) | Adds comprehensive tests for parse/validation errors, warnings, dry-run behavior, idempotency, atomicity, and successful writes. |
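
The flag handling described for the CLI entrypoint could be sketched as follows. The review comments further down confirm that `--validate` is treated as an alias of `--dry-run`; everything else here (the `parseImportArgs` name, the return shape, and the usage string) is a hypothetical illustration rather than the contents of bin/git-mind.js.

```javascript
// Sketch of flag handling for `git mind import <file>`; the function name,
// return shape, and usage text are hypothetical.
function parseImportArgs(args) {
  const flags = new Set(args.filter((a) => a.startsWith('--')));
  const filePath = args.find((a) => !a.startsWith('--'));
  if (filePath === undefined) {
    return { error: 'Usage: git mind import <file> [--dry-run | --validate] [--json]' };
  }
  return {
    filePath,
    // --validate is an alias: either flag means "validate without writing".
    dryRun: flags.has('--dry-run') || flags.has('--validate'),
    json: flags.has('--json'),
  };
}
```

For instance, `parseImportArgs(['graph.yaml', '--validate', '--json'])` yields `dryRun: true` and `json: true`, while an argument list with no file yields an `error` instead.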

Sequence Diagram(s)

sequenceDiagram
    participant CLI as CLI (git mind import)
    participant Cmd as importCmd
    participant Parser as parseImportFile
    participant Validator as validateImportData
    participant Writer as writeImport
    participant Graph as Graph

    CLI->>Cmd: importCmd(cwd, filePath, {dryRun, json})
    Cmd->>Parser: parseImportFile(filePath)
    Parser-->>Cmd: {data, parseError}
    alt parseError
        Cmd-->>CLI: result(valid:false, errors)
    else
        Cmd->>Validator: validateImportData(data, graph)
        Validator-->>Cmd: {valid, errors, warnings, declaredNodeIds}
        alt invalid
            Cmd-->>CLI: result(valid:false, errors, warnings)
        else valid
            alt dryRun
                Cmd-->>CLI: result(valid:true, dryRun:true, stats)
            else write
                Cmd->>Writer: writeImport(graph, data)
                Writer->>Graph: apply patch / commit
                Graph-->>Writer: commit result
                Writer-->>Cmd: {nodesWritten, edgesWritten}
                Cmd-->>CLI: result(valid:true, stats)
            end
        end
    end
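
The control flow in the diagram above can be sketched as a single function with the three steps injected, so the example runs on its own. This is an approximation under stated assumptions: `runImport`, the `stats` shape, and the synchronous style are illustrative (the tests in this PR `await importFile`, so the real pipeline is asynchronous).

```javascript
// Control flow from the diagram, with parse/validate/write injected as stubs.
// Synchronous for brevity; the real importFile is awaited in the tests.
function runImport({ parse, validate, write }, graph, filePath, { dryRun = false } = {}) {
  const { data, parseError } = parse(filePath);
  if (parseError) {
    return { valid: false, errors: [parseError], warnings: [] };
  }
  const { valid, errors, warnings } = validate(data, graph);
  if (!valid) {
    return { valid: false, errors, warnings };
  }
  // Hypothetical stats shape: counts of what would be (or was) written.
  const stats = {
    nodes: (data.nodes ?? []).length,
    edges: (data.edges ?? []).length,
  };
  if (dryRun) {
    return { valid: true, dryRun: true, errors: [], warnings, stats };
  }
  write(graph, data); // the only step that touches the graph
  return { valid: true, dryRun: false, errors: [], warnings, stats };
}
```

Keeping `write` as the last and only mutating step is what makes both the dry-run branch and the "nothing written on validation error" guarantee fall out of the structure.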

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Poem

🐇
I nibble lines of YAML neat,
Check each node and every link,
Dry-run first, then write complete,
No dangling tails or missing ink,
The graph grows hop by hop — what a treat!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically identifies the main feature: a YAML import pipeline for the INTAKE milestone. |
| Linked Issues check | ✅ Passed | All objectives from #187 are met: YAML parser with version validation, idempotent merge, reference validation, atomic writes, dry-run mode, CLI command, and comprehensive test suite. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to the YAML import feature; no unrelated modifications detected across added files, CLI integration, exports, dependencies, and tests. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 90.00% which is sufficient. The required threshold is 80.00%. |

No actionable comments were generated in the recent review. 🎉



…185, #187)

Adds CLI reference for git mind nodes, status, and import. Adds YAML
import format section, node query and status library examples, and
updated public API export listing.
Reference artifact documenting architecture, data model, runtime
behavior, system invariants, failure modes, and roadmap trajectory.
@flyingrobots
Owner Author

@coderabbitai review please

@coderabbitai
Contributor

coderabbitai bot commented Feb 11, 2026

@flyingrobots: I'll review the changes in this PR right away. This looks like a comprehensive addition of the YAML import pipeline with thorough testing coverage.


🧠 Learnings used
Learnt from: flyingrobots
Repo: neuroglyph/git-mind PR: 169
File: docs/templates/code-review-feedback.md:26-33
Timestamp: 2025-09-16T17:10:27.477Z
Learning: The user (flyingrobots) uses Obsidian to view/edit Markdown files and prefers to keep Obsidian-specific syntax like collapsible callout markers (e.g., `[!NOTE]-` for collapsible callouts) even if they're not GitHub-compatible.
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

🤖 Fix all issues with AI agents
In `@bin/git-mind.js`:
- Around line 126-138: Update the import command usage text to mention the
undocumented alias: when parsing the 'import' case (the args handling that sets
dryRun using args.includes('--dry-run') || args.includes('--validate') and calls
importCmd), add `--validate` to the usage/help output (or explicitly document it
as an alias of `--dry-run`) so the message shown by console.error for the import
subcommand reflects both `--dry-run` and `--validate`.

In `@src/import.js`:
- Around line 186-192: The code currently resets the edge 'createdAt' on every
import by calling patch.setEdgeProperty(edge.source, edge.target, edge.type,
'createdAt', new Date().toISOString()); change this so creation timestamp is
preserved: either rename the field to 'importedAt' or 'updatedAt' if you intend
to refresh it, or, preferably, only set 'createdAt' when it does not already
exist by checking the existing edge properties (e.g., via the
patch/getEdge/getEdgeProperty helper before calling patch.setEdgeProperty);
update the logic around patch.setEdgeProperty(...) for 'createdAt' accordingly
and keep the existing rationale handling.
- Around line 96-148: The edge-validation pushes separate missing-field errors
for source/target/type then redundantly rechecks all three with a combined if
that causes a continue; refactor the required-field logic in the edge loop (for
variables edge in edgeEntries) to first collect missing required fields
(source/target/type), push one error per missing field, and if any are missing
immediately continue so later checks (validateNodeId, validateEdgeType,
validateConfidence, self-edge and reference checks using validateNodeId,
validateEdgeType, validateConfidence, declaredNodeIds, existingNodes) only run
when all three required fields are present; remove the duplicated combined check
to improve readability and ensure consistent control flow.
- Around line 160-202: The writeImport function currently only checks typeof
node.properties === 'object' which allows arrays through and causes
Object.entries to write numeric-indexed properties; update writeImport to ensure
node.properties is a plain object by adding a guard like node.properties !==
null && !Array.isArray(node.properties) before iterating, or alternatively
enforce this validation in validateImportData so writeImport only ever receives
a non-array object; reference the writeImport function and the node.properties
check (and mirror the same Array.isArray guard used/needed in parseImportFile)
when making the change.
- Around line 28-42: The function parseImportFile currently treats any non-null
object as valid, which lets YAML arrays (typeof [] === 'object') slip through
and later trigger misleading "Missing required field" errors; update the guard
in parseImportFile to explicitly reject arrays (e.g., check Array.isArray(data))
so that if data is null, not an object, or is an array you return { data: null,
parseError: 'YAML file is empty or not an object' } — adjust the condition that
inspects data (the variable named data in parseImportFile) and keep the existing
error text and return shape.

In `@test/import.test.js`:
- Around line 322-343: The test should assert that createdAt timestamps are
preserved across idempotent imports: after the first import (using importFile),
fetch the nodes and edges via graph.getNodes()/graph.getEdges() and record their
createdAt values (for the node with id "spec:auth" and/or edge from
"file:auth.js" to "spec:auth"), then perform the second import and re-fetch
nodes/edges and assert the createdAt values are equal (i.e., not overwritten);
update the 're-import produces same graph state' test to capture and compare
createdAt from the first and second imports using the existing importFile,
graph.getNodes, and graph.getEdges symbols.
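
Two of the comments above come down to the same JavaScript pitfall: `typeof` reports `'object'` for arrays (and for `null`). A small guard along these lines would cover both the top-level document check and the `node.properties` check; the helper name `isPlainRecord` is hypothetical and not taken from src/import.js.

```javascript
// typeof reports 'object' for arrays and for null, so both need explicit checks.
function isPlainRecord(value) {
  return value !== null && typeof value === 'object' && !Array.isArray(value);
}

// Why the guard matters: Object.entries on an array yields index keys,
// which would be written as properties named "0", "1", and so on.
const fromArray = Object.entries(['active', 'draft']);
// fromArray is [['0', 'active'], ['1', 'draft']]
```

With the guard in place, a YAML file whose root is a list, or a node whose `properties` is a list, can be rejected with a clear message instead of slipping through as an object.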

- Reject YAML arrays in parseImportFile (typeof [] === 'object' pitfall)
- Validate node.properties is a plain object, not an array
- Rename edge createdAt to importedAt (honest re-import semantics)
- Refactor edge required-field checks for cleaner control flow
- Document --validate alias in CLI usage text
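
The required-field refactor listed in the commit above might look roughly like the following. It is a sketch of the control flow the review asked for (one error per missing field, then skip the remaining checks for that edge), not the actual loop in src/import.js; `checkEdges`, `REQUIRED_EDGE_FIELDS`, and the error strings are made up for illustration.

```javascript
const REQUIRED_EDGE_FIELDS = ['source', 'target', 'type'];

// Sketch: one error per missing field, then skip the later checks for that edge.
function checkEdges(edgeEntries) {
  const errors = [];
  for (const [i, edge] of edgeEntries.entries()) {
    const missing = REQUIRED_EDGE_FIELDS.filter((field) => !edge[field]);
    for (const field of missing) {
      errors.push(`edges[${i}]: missing required field "${field}"`);
    }
    if (missing.length > 0) continue;

    // Later checks can now assume source, target, and type are present.
    if (edge.source === edge.target) {
      errors.push(`edges[${i}]: self-edge on "${edge.source}"`);
    }
  }
  return errors;
}
```

Collecting the missing fields once removes the duplicated combined check and keeps the ID, type, confidence, and reference checks from running on incomplete edges.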
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@test/import.test.js`:
- Around line 1-9: The imports parseImportFile and validateImportData are not
used directly in the tests; either remove these unused imports from the test
file or add direct unit tests exercising parseImportFile and validateImportData
(in addition to the existing importFile tests) so they are referenced. Locate
the import statement that includes importFile, parseImportFile,
validateImportData and either delete parseImportFile and validateImportData from
that import list, or add new it() blocks that call parseImportFile(...) and
validateImportData(...) with representative inputs and assertions to cover their
behavior.
- Around line 343-364: Update the "re-import produces same graph state" test to
also assert that the importedAt timestamp is refreshed on re-import: after the
first import (using importFile) call graph.getNodes() and capture the importedAt
value for the node(s) (e.g., the "file:auth" node), wait a short interval (so
the timestamp can change), call importFile again, then fetch nodes again via
graph.getNodes() and assert the new importedAt is later/greater than the
previous value; keep the existing assertions on counts (nodes/edges) intact. Use
the existing helpers (writeYaml, importFile, graph.getNodes) and reference the
node id "file:auth" when locating the timestamp to compare.

Comment on lines 343 to 364
describe('idempotency', () => {
  it('re-import produces same graph state', async () => {
    const path = await writeYaml('graph.yaml', `
version: 1
nodes:
  - id: "spec:auth"
  - id: "file:auth.js"
edges:
  - source: "file:auth.js"
    target: "spec:auth"
    type: "implements"
`);
    // Import twice
    await importFile(graph, path);
    await importFile(graph, path);

    const nodes = await graph.getNodes();
    const edges = await graph.getEdges();

    expect(nodes.length).toBe(2);
    expect(edges.length).toBe(1);
  });
Contributor

🧹 Nitpick | 🔵 Trivial

Idempotency test could be strengthened by verifying importedAt is updated on re-import.

The test confirms node/edge counts remain stable, but doesn't verify that importedAt is refreshed (which is the expected behavior for an updatedAt-style timestamp). Adding a brief delay and asserting importedAt changed would confirm the timestamp semantics end-to-end.

Suggested enhancement
       // Import twice
       await importFile(graph, path);
+      const edges1 = await graph.getEdges();
+      const ts1 = edges1[0].props.importedAt;
+
       await importFile(graph, path);
 
       const nodes = await graph.getNodes();
       const edges = await graph.getEdges();
 
       expect(nodes.length).toBe(2);
       expect(edges.length).toBe(1);
+      // importedAt should be refreshed on re-import
+      expect(edges[0].props.importedAt).toBeDefined();
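
The suggestion above captures `ts1` but only asserts that `importedAt` is defined. If the goal is to confirm the timestamp actually moves forward, one option is to inject the clock rather than wait. The sketch below uses a hypothetical `stampEdge` helper and fixed dates; it does not reflect how src/import.js obtains the time.

```javascript
// Hypothetical stand-in for an importer that stamps edges; `now` is injectable
// so the check does not depend on wall-clock delays.
function stampEdge(edge, now = () => new Date()) {
  return { ...edge, props: { ...edge.props, importedAt: now().toISOString() } };
}

const first = stampEdge({ props: {} }, () => new Date('2026-02-11T12:00:00.000Z'));
const second = stampEdge(first, () => new Date('2026-02-11T12:00:00.005Z'));

// toISOString() output is fixed-width UTC for years 0000-9999, so plain string
// comparison matches chronological order.
const refreshed = second.props.importedAt > first.props.importedAt;
```

In a test, `refreshed` would be the value to assert on, in place of (or in addition to) the `toBeDefined()` check.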

)

- Remove unused parseImportFile/validateImportData imports from test
- Verify importedAt is present after re-import in idempotency test
@flyingrobots
Owner Author

@coderabbitai review please

@coderabbitai
Contributor

coderabbitai bot commented Feb 11, 2026

@flyingrobots: I'll review the changes in this PR now.


✅ Actions performed

Review triggered.


@flyingrobots
Owner Author

@coderabbitai review please

@coderabbitai
Contributor

coderabbitai bot commented Feb 11, 2026

@flyingrobots: I'll review the changes in this PR right away.


✅ Actions performed

Review triggered.


@flyingrobots
Owner Author

@coderabbitai review please

@coderabbitai
Contributor

coderabbitai bot commented Feb 11, 2026

@flyingrobots: I'll review the changes in this PR right away.


✅ Actions performed

Review triggered.


@flyingrobots flyingrobots merged commit d52bbb7 into main Feb 11, 2026
6 checks passed
@flyingrobots flyingrobots deleted the roadmap/intake branch February 11, 2026 12:34
Development

Successfully merging this pull request may close these issues.

feat: YAML import pipeline (git mind import) — INTAKE milestone
