Context: Follow-up to Discussion #33 on graph primitives. The typed-edges proposal (supersedes / contradicts / implements / unresolved) is data-shape work; this Discussion is about the operator-facing surface of that data — letting the human view the genome as an Obsidian vault and use Obsidian's graph view, search, and Dataview queries for free.
Goal: agree on the markdown layout before we build the exporter, so the schema doesn't churn after operators have habits.
Sibling work: PR for per-stage telemetry + Grafana dashboard is in flight in parallel — that's the "live ops" surface; this Discussion is the "browse the corpus" surface.
## Proposed vault layout
~/.helix/vault/
├── README.md                  # generated: explains the layout, last export ts
├── genes/
│   ├── auth/                  # by domain (or by source_id parent dir)
│   │   ├── middleware.md      # one .md per gene
│   │   └── jwt-verify.md
│   └── ...
├── _sessions/
│   ├── laude.md               # what laude touched, when, with what queries
│   └── raude.md
├── _refresh-targets/
│   ├── 2026-05-06.md          # today's verify-before-acting queue
│   └── 2026-05-05.md
├── _unresolved/               # intent links: a gene we expected but didn't find
│   └── rate-limiting-policy.md
├── _stale/                    # genes with live_truth_score below threshold
│   └── ...
└── _meta/
    ├── chromatin-tier-counts.md
    ├── party-id-attribution.md
    └── 12-signal-stats.md
## Proposed gene file shape
````markdown
---
gene_id: abc123def456
chromatin: euchromatin          # tier name, not number
party_id: swift_wing21
participant_handle: laude
domains: [auth, jwt, security]
source_id: helix_context/auth/middleware.py
source_lines: 42-89
content_type: code
last_seen: 2026-05-06T20:45:00Z
live_truth_score: 0.92
co_activation_partners: 7

# Typed edges (when typed-edges proposal lands; today these are auto from harmonic_links/parent_of)
supersedes: ['[[gene-old456]]']
implements: ['[[spec-auth-flow]]']
documented_by: ['[[docs/auth-readme]]']

# Promoter
promoter_tags: [auth, middleware, jwt]
synonym_hits: [token, login, security]
---

# auth/middleware.py:42-89

```python
def verify_token(token: str, secret: str) -> bool:
    """Verify JWT and return True if valid."""
    ...
```

## Backlinks

(Obsidian populates this automatically from any [[gene-abc123def456]] reference)
````
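To make the file shape concrete, here is a minimal sketch of how an exporter might render one gene to markdown. The names `yaml_value` and `render_gene` are hypothetical and not part of helix today. It assumes the frontmatter arrives as a flat dict of scalars and lists, and it leaves out the Backlinks section since Obsidian fills that in itself.

```python
from typing import Any

FENCE = "`" * 3  # built at runtime so this sketch can itself sit inside a fenced block


def yaml_value(value: Any) -> str:
    """Format a frontmatter value as a YAML flow scalar or flow list."""
    if isinstance(value, list):
        return "[" + ", ".join(yaml_value(item) for item in value) + "]"
    text = str(value)
    # Quote anything YAML would otherwise parse as syntax, e.g. [[wikilinks]].
    if any(ch in text for ch in ",[]{}") or ": " in text or text.startswith("#") or " #" in text:
        return "'" + text.replace("'", "''") + "'"
    return text


def render_gene(frontmatter: dict[str, Any], title: str, body: str, lang: str = "python") -> str:
    """Return the full markdown text for one gene file (hypothetical helper)."""
    lines = ["---"]
    lines += [f"{key}: {yaml_value(value)}" for key, value in frontmatter.items()]
    lines += ["---", "", f"# {title}", "", FENCE + lang, body.rstrip("\n"), FENCE, ""]
    return "\n".join(lines)
```

Quoting the wikilinks matters: an unquoted `[[gene-old456]]` is a nested YAML list, not a string, so the link would not survive a frontmatter round trip.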
## What you get for free
| Obsidian feature | Mapped to |
|---|---|
| Graph view | Whole genome graph by typed edges |
| Local graph (n-hop) | `helix_neighbors`-style exploration without the MCP roundtrip |
| Backlinks panel | "Who references this gene?" without coding it |
| Dataview queries | `LIST FROM #auth WHERE live_truth_score < 0.7` |
| Full-text search | All gene bodies + frontmatter |
| Tag pane | Browse by `domains`, `chromatin`, `party_id` |
| Properties view | Filter/sort by frontmatter columns |
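One caveat on the table above: as far as I know, Obsidian's tag pane and Dataview's `FROM #auth` source read the `tags` property, not arbitrary keys such as `domains`. If that holds, the exporter would need to emit a `tags` list alongside the other fields. A rough sketch, where `obsidian_tags` and the nested `chromatin/...` and `party_id/...` naming are my assumptions rather than anything decided here:

```python
import re


def obsidian_tags(gene: dict) -> list[str]:
    """Derive a `tags` list from domains, chromatin tier, and party_id (hypothetical)."""
    tags = list(gene.get("domains", []))
    for key in ("chromatin", "party_id"):
        if gene.get(key):
            # Nested tags keep the tag pane grouped by field.
            tags.append(f"{key}/{gene[key]}")
    # Replace anything outside word characters, "/" and "-" with a hyphen,
    # since tags cannot contain spaces.
    return [re.sub(r"[^\w/-]+", "-", tag) for tag in tags]
```

With `tags: [auth, jwt, chromatin/euchromatin, ...]` in the frontmatter, the Dataview query in the table should match on `#auth` without any extra plugin configuration.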
## Specific shaping questions
These are the decisions that need to be made before someone builds the exporter, because they constrain the schema and changing them later is annoying:
### 1. Filename strategy: by gene_id, by source_id, or by domain?
- `genes/abc123def456.md` — stable, but unreadable. You can't navigate the vault folder tree.
- `genes/auth/middleware.md` — readable, but collides when two genes share a source path (different line ranges of the same file).
- `genes/auth/middleware-42-89.md` — readable AND unique, but ugly.
- `genes/<domain>/<source_stem>-<short_gene_id>.md` — best of both?
Lean toward the last. Open to pushback.
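For reference, a minimal sketch of the last option. `gene_filename` is a hypothetical helper, and two things are assumptions: the short id is the first 8 characters of `gene_id`, and the caller picks which entry of `domains` becomes the folder.

```python
import re
from pathlib import PurePosixPath


def _slug(text: str, fallback: str) -> str:
    """Keep filename-safe characters only; fall back if nothing is left."""
    return re.sub(r"[^A-Za-z0-9._-]+", "-", text).strip("-") or fallback


def gene_filename(domain: str, source_id: str, gene_id: str, short_len: int = 8) -> PurePosixPath:
    """Build genes/<domain>/<source_stem>-<short_gene_id>.md (hypothetical helper)."""
    stem = PurePosixPath(source_id).stem  # "helix_context/auth/middleware.py" -> "middleware"
    name = f"{_slug(stem, 'gene')}-{gene_id[:short_len]}.md"
    return PurePosixPath("genes") / _slug(domain, "misc") / name
```

Two genes cut from different line ranges of the same file get different names because their ids differ. A short prefix can still collide in a large genome, so a real exporter would probably want to detect that and lengthen the prefix.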
### 2. How aggressive should we be about wikilink coverage?
- **Minimal:** only typed edges become wikilinks (when the typed-edges proposal lands).
- **Medium:** typed edges + co_activation partners + parent/child attribution.
- **Aggressive:** also turn synonym hits into links to "concept" pages.
Aggressive risks creating a hairball graph view. Minimal undersells what helix knows. Medium is probably right.
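The three levels nest, so one way to picture them is as a single threshold. This is only a sketch: `Coverage`, `wikilinks_for`, the `co_activation_ids` key, and the `concepts/` prefix are hypothetical, and it assumes each gene carries lists of link targets rather than the bare partner count shown in the frontmatter example.

```python
from enum import IntEnum


class Coverage(IntEnum):
    MINIMAL = 1
    MEDIUM = 2
    AGGRESSIVE = 3


# Edge types named in the typed-edges proposal.
TYPED_EDGE_KEYS = ("supersedes", "contradicts", "implements", "unresolved")


def wikilinks_for(gene: dict, coverage: Coverage) -> list[str]:
    """Collect wikilink targets for one gene at the given coverage level."""
    targets: list[str] = []
    for key in TYPED_EDGE_KEYS:
        targets += gene.get(key, [])
    if coverage >= Coverage.MEDIUM:
        targets += gene.get("co_activation_ids", [])  # assumed list of partner ids
        targets += gene.get("parent_of", [])          # parent/child attribution
    if coverage >= Coverage.AGGRESSIVE:
        targets += [f"concepts/{hit}" for hit in gene.get("synonym_hits", [])]
    # dict.fromkeys drops duplicates while keeping first-seen order.
    return [f"[[{target}]]" for target in dict.fromkeys(targets)]
```

Making the level a parameter would also let an operator try aggressive on a small vault and see how bad the hairball actually is before we commit to a default.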
### 3. Re-export cadence: snapshot, on-demand, or watched?
- **Snapshot (cron):** simplest; `helix-vault export --every 1h`. Vault is always slightly stale.
- **On-demand (CLI):** `helix-vault export --since 2026-05-06`. Operator runs when they want fresh data.
- **Watched (filesystem):** helix watches genome.db, exports incrementally. Closest to live; most complex.
I'd ship snapshot first, add on-demand second. Watched is overkill for v1.
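For the on-demand mode, the `--since` selection could be as small as the sketch below. It assumes `--since` is an ISO date compared against each gene's `last_seen` in UTC, which is my reading of the example above and not something settled; `genes_since` and `parse_last_seen` are hypothetical names.

```python
from datetime import date, datetime, time, timezone


def parse_last_seen(stamp: str) -> datetime:
    """Parse timestamps like 2026-05-06T20:45:00Z into aware datetimes."""
    # fromisoformat only accepts a trailing "Z" from Python 3.11, so normalise it.
    return datetime.fromisoformat(stamp.replace("Z", "+00:00"))


def genes_since(genes: list[dict], since: date) -> list[dict]:
    """Keep genes whose last_seen falls on or after midnight UTC of `since`."""
    cutoff = datetime.combine(since, time.min, tzinfo=timezone.utc)
    return [gene for gene in genes if parse_last_seen(gene["last_seen"]) >= cutoff]
```

Note that a filtered export only rewrites the selected files; genes deleted from the genome since the last run would need separate handling.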
### 4. Should we support **bidirectional** sync?
If the operator edits a gene in Obsidian (e.g., adds a tag, fixes a typo in the docstring header), should that flow back into helix?
- Pro: lets the operator curate the genome manually
- Con: turns helix into a two-way replication system; consistency is hard
- Compromise: read-only export, but a "feedback note" pattern where operator can drop notes that helix ingests as new genes attributed to the operator's session
I'd ship read-only first. Bidirectional is a separate Discussion.
### 5. How do `_unresolved/` and `_stale/` interact with the agent's `refresh_targets`?
The typed-edges proposal in #33 had `_unresolved/` (agent expected a gene that doesn't exist) and `_stale/` (gene's live_truth_score dropped). Both are visible operator artifacts.
Question: if an operator opens an `_unresolved/rate-limiting-policy.md` note in Obsidian and writes a real gene there, should helix ingest it on the next export run? That would be a really clean UX: the operator sees "agents wanted this knowledge," writes it, and agents find it next time.
This depends on Q4. If we ship read-only first, `_unresolved/` is purely a viewer. If we add bidirectional later, this becomes an interaction loop.
## What I'd actually build for v1
Read-only snapshot exporter:
- `POST /export/obsidian` endpoint that writes the vault to a configurable path
- CLI: `helix-vault export [--path ~/.helix/vault] [--since N]`
- Filename strategy: `genes/<domain>/<source_stem>-<short_gene_id>.md` (Q1 = the last option)
- Wikilink coverage: medium (typed edges + co-activation + attribution) (Q2 = medium)
- Re-export: on-demand CLI only; operators who want the periodic snapshot from Q3 schedule it with their own cron (Q3 = on-demand)
- Read-only: no sync-back yet (Q4 = no, defer)
- `_unresolved/` and `_stale/` are output only (Q5 = follows Q4)
Estimated cost: 1-2 days. New module `helix_context/exporters/obsidian.py`, ~300-500 lines.
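As a starting point for the CLI half, a sketch of the argument parsing with the standard library. `build_parser` is a hypothetical name, and treating `--since` as an ISO date is an assumption carried over from the Q3 example.

```python
import argparse
from datetime import date
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Parser for `helix-vault export [--path ...] [--since ...]` (sketch only)."""
    parser = argparse.ArgumentParser(prog="helix-vault")
    commands = parser.add_subparsers(dest="command", required=True)
    export = commands.add_parser("export", help="write a read-only vault snapshot")
    export.add_argument("--path", type=Path, default=Path("~/.helix/vault"),
                        help="vault directory to write into")
    # date.fromisoformat raises ValueError on bad input, which argparse
    # turns into a normal usage error.
    export.add_argument("--since", type=date.fromisoformat, default=None,
                        help="only export genes seen on or after this date (assumed meaning)")
    return parser
```

The default path is kept unexpanded here; the exporter would call `args.path.expanduser()` right before writing.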
## What I want input on
- Anything in the vault layout that's structurally wrong (folders that should be merged, naming that won't scale)
- Anyone who actively uses Obsidian + a code corpus today: what would you expect to see when you open the vault for the first time?
- Counterproposals for any of Q1-Q5 above
---
Sibling work:
- PR (in flight) — per-stage telemetry histograms + starter Grafana dashboard
- Discussion #33 — typed edges + unresolved/intent links (the schema this exporter would render)
- PR #32 — WAL bloat fix (merged context for why operator visibility matters)
🤖 Generated with [Claude Code](https://claude.com/claude-code)