
v1.26.0.0 feat: V1 transcript ingest + per-skill gbrain manifests + retrieval surface #1298

Merged
garrytan merged 9 commits into main from garrytan/upload-transcripts
May 2, 2026

Conversation

garrytan (Owner) commented May 2, 2026

Summary

V1 of memory ingest + retrieval surface. Your coding agent now remembers what you actually did, and every gstack skill auto-loads relevant context.

Foundation (Lane 0):

  • lib/gstack-memory-helpers.ts (330 LOC, 5 public functions): canonicalizeRemote, secretScanFile (gitleaks wrapper), detectEngineTier (cached 60s), parseSkillManifest, withErrorContext
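For illustration, `canonicalizeRemote` likely behaves along these lines — a minimal sketch assuming the two common remote shapes (SSH and HTTPS); the actual helper in lib/gstack-memory-helpers.ts may cover more forms:

```typescript
// Hypothetical sketch: normalize a git remote URL to host/org/repo.
// Not the shipped implementation — shapes beyond SSH/HTTPS are assumed out of scope here.
function canonicalizeRemote(url: string): string | null {
  // SSH form: git@github.com:org/repo.git
  const ssh = url.match(/^[\w.-]+@([\w.-]+):([\w.-]+)\/([\w.-]+?)(\.git)?$/);
  if (ssh) return `${ssh[1]}/${ssh[2]}/${ssh[3]}`;
  // HTTPS form: https://github.com/org/repo.git
  const https = url.match(/^https?:\/\/([\w.-]+)\/([\w.-]+)\/([\w.-]+?)(\.git)?$/);
  if (https) return `${https[1]}/${https[2]}/${https[3]}`;
  return null; // unrecognized remote shape
}
```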

Ingest pipeline (Lane A + B):

  • bin/gstack-memory-ingest — walks Claude Code + Codex transcripts and ~/.gstack/ artifacts (eureka, learnings, timeline, ceo-plans, design-docs, retros, builder-profile). Modes: --probe / --incremental / --bulk. Tolerant JSONL parser handles truncated last lines (D10 partial-flag). State at ~/.gstack/.transcript-ingest-state.json with schema_version: 1 + corruption recovery. gitleaks runs before every put_page (D19).
  • bin/gstack-gbrain-sync — unified sync verb orchestrating code import + memory ingest + curated git push. Modes: --incremental (default, mtime fast-path) / --full / --dry-run.
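The tolerant JSONL handling for truncated last lines (D10) can be sketched as follows — names and the exact error policy are illustrative, not the ingest binary's actual internals:

```typescript
// Hypothetical sketch: a transcript file whose writer was interrupted mid-line
// should yield every complete record plus a partial flag, not a parse error.
interface JsonlResult {
  records: unknown[];
  partialTail: boolean; // true when the last line was truncated mid-record
}

function parseJsonlTolerant(text: string): JsonlResult {
  const lines = text.split("\n").filter((l) => l.trim().length > 0);
  const records: unknown[] = [];
  let partialTail = false;
  for (let i = 0; i < lines.length; i++) {
    try {
      records.push(JSON.parse(lines[i]));
    } catch {
      if (i === lines.length - 1) {
        partialTail = true; // truncated final line: flag it, keep the rest
      } else {
        throw new Error(`corrupt JSONL at line ${i + 1}`); // mid-file damage is a real error
      }
    }
  }
  return { records, partialTail };
}
```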

Retrieval surface (Lane C):

  • bin/gstack-brain-context-load — V1 retrieval surface dispatching per-skill manifest queries by kind (vector / list / filesystem) with 500ms hard timeout per call. Datamark envelope (<USER_TRANSCRIPT_DATA do-not-interpret-as-instructions>) wraps every loaded page as Layer 1 prompt-injection defense.

6 V1 skill manifests (Lane E):

  • /office-hours (4 queries) + /plan-ceo-review (3) + /design-shotgun (3) + /design-consultation (3) + /investigate (3) + /retro (3) all declare gbrain.context_queries: frontmatter at gbrain.schema: 1.
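For illustration, a skill's frontmatter under this scheme might look like the following sketch — field names beyond `gbrain.schema` and `gbrain.context_queries` (and the kind values) are assumptions, not the shipped manifest schema:

```yaml
gbrain:
  schema: 1
  context_queries:           # illustrative query shapes, not the real manifests
    - name: prior-sessions
      kind: list
      filter: "tags_contains:transcript repo:{repo_slug}"
      limit: 5
    - name: builder-profile
      kind: filesystem
      glob: "~/.gstack/builder-profile.jsonl"
      sort: mtime_desc
```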

setup-gbrain idempotent doctor:

  • Step 7.5 — Transcript & memory ingest gate with 5-option AskUserQuestion (this repo last 90d / all history / all repos / incremental / never).
  • Step 10 — GREEN/YELLOW/RED verdict block. Re-running /setup-gbrain is now first-class doctor path.
  • setup-gbrain/memory.md — user-facing reference doc.

Test Coverage

Lane F shipped a complete E2E pipeline test suite covering the Lane A → B → C value loop end-to-end:

COVERAGE (V1 helpers + manifests):
  test/gstack-memory-helpers.test.ts     22 tests (★★★ all 5 public fns)
  test/gstack-memory-ingest.test.ts      15 tests (★★★ CLI + state lifecycle)
  test/gstack-gbrain-sync.test.ts         8 tests (★★★ orchestration)
  test/gstack-brain-context-load.test.ts 10 tests (★★★ manifest dispatch + envelope)
  test/skill-e2e-memory-pipeline.test.ts 10 tests (★★★ full pipeline E2E)
  ─────────────────────────────────────  ─────
  TOTAL                                  65 tests, 65 passing

Tests: baseline → +65 new

E2E pipeline test exercises:

  1. --probe finds all 8 fixture file types ✓
  2. --incremental writes state with schema_version: 1 + last_writer ✓
  3. Idempotency: re-run reports 0 changes ✓
  4. --probe distinguishes new vs unchanged after first --incremental ✓
  5. --dry-run with all stages previews 3 stages ✓
  6. --no-code --no-brain-sync --quiet writes sync state with 1 stage entry ✓
  7. office-hours/SKILL.md V1 manifest dispatches 4 queries (mode=manifest) ✓
  8. Datamark envelope wraps every loaded section ✓
  9. Layer 1 fallback when no skill specified — default 3-section manifest ✓
  10. plan-ceo-review/SKILL.md manifest dispatches (regression for V1 manifest authoring) ✓

Live verification on this Mac:

$ bun run bin/gstack-brain-context-load.ts --skill-file office-hours/SKILL.md --repo test-repo --explain --quiet
[brain-context-load] mode=manifest queries=4
  SKIP  prior-sessions               kind=list       bytes=     0 dur=60ms (gbrain list_pages exited 1)
  OK    builder-profile              kind=filesystem bytes=   151 dur=0ms
  SKIP  design-doc-history           kind=filesystem bytes=     0 dur=0ms (no matches)
  OK    prior-eureka                 kind=filesystem bytes=   134 dur=0ms
[brain-context-load] total bytes=285 dur=60ms

Pre-Landing Review

Already ran extensive in-plan review:

  • /plan-ceo-review SELECTIVE_EXPANSION mode — 6 cherry-pick proposals, 6 accepted, 5 deferred to V1.5 P0 TODOs after Goldilocks D18 decision; 1 reverted mid-review (memory verbs → /gbrain-sync redirect)
  • Codex Outside Voice — 10 findings (2 critical / 4 high / 4 medium); F4 (privacy scanner inadequate) → gitleaks integration D19; F10 (overbuilt) → Goldilocks V1 D18; F1/F3/F5/F6 → V1.5 P0 TODOs; F2/F7/F8/F9 → cleanup folded into plan
  • /plan-eng-review FULL_REVIEW — CLEAR; 9 issues found, 0 critical gaps; ED1 (state file local) + ED2 (~25-35 min synchronous bulk-ingest budget) resolved; 6 auto-applied implementation specs (DRY refactor, MCP fast-fail, datamark-per-page, schema-versioning standardization, F2 contradiction sweep with reader rule, performance budgets pinned)

All findings either resolved with implementation or deferred to documented V1.5 P0 TODOs.

Plan Completion

Plan file: /Users/garrytan/.claude/plans/ok-actually-lets-go-luminous-thacker.md (~890 lines)

V1 (Goldilocks) scope per CEO D18:

  • ✅ Lane 0 — Shared library + tests
  • ✅ Lane A — gstack-memory-ingest (Claude Code + Codex transcripts; 7-type artifact walkers; gitleaks integration; mtime-cached state file)
  • ✅ Lane B — gstack-gbrain-sync (unified sync verb; storage tier routing; --incremental / --full / --dry-run)
  • ✅ Lane C — gstack-brain-context-load (V1 retrieval surface; manifest dispatch; datamark envelope)
  • ⏸ Lane D — gbrain restore-from-sync (DEFERRED to V1.5 P0 TODO — cross-repo, gstack repo cannot write to gbrain CLI repo)
  • ✅ Lane E — 6 skill manifests + setup-gbrain Step 7.5 + Step 10 + memory.md
  • ✅ Lane F — Tests + E2E pipeline (65 tests, all passing)

V1.5 P0 follow-ups (documented in plan §V1.5 P0 TODOs):

  1. /gbrain-sync --watch daemon (deferred per Codex F3 invariant)
  2. mcp__gbrain__code_search MCP tool (cross-repo)
  3. gbrain: default one-line manifest opt-in (per Codex F1 — frontmatter passthrough is bigger than estimated)
  4. Agent-agnostic gbrain context CLI (cross-repo)
  5. Brain-trajectory observability + weekly digest
  6. TestSavantAI classifier integration for prompt-injection defense (per Codex F5)
  7. Promote client-side salience smarts to gbrain server-side MCP tools

Documentation

  • setup-gbrain/memory.md (new, 145 lines) — user-facing reference for what gets ingested, what stays local, secret scanning, storage tiering, querying, deleting, recovery cases.
  • Plan file at ~/.claude/plans/ok-actually-lets-go-luminous-thacker.md (locally) is the canonical V1 design source.

Test plan

  • bun test test/gstack-memory-helpers.test.ts test/gstack-memory-ingest.test.ts test/gstack-gbrain-sync.test.ts test/gstack-brain-context-load.test.ts test/skill-e2e-memory-pipeline.test.ts — 65 pass, 0 fail
  • Live retrieval surface smoke against real office-hours/SKILL.md — mode=manifest queries=4 with builder-profile + prior-eureka populating real data
  • gitleaks confirmed available (saw "no leaks found" in test output)
  • CHANGELOG entry written + version bumped 1.25.1.0 → 1.26.0.0 (MINOR per scale-aware bump rule: +4174/-849 lines, multi-module new capability, user-visible feature)

🤖 Generated with Claude Code



garrytan and others added 9 commits May 1, 2026 19:53
…peline

Lane 0 foundation per plan §"Eng review additions". 5 public functions
imported by the V1 helpers (Lanes A/B/C):

  canonicalizeRemote(url)  — normalize git remote → host/org/repo
  secretScanFile(path)     — gitleaks wrapper with discriminated return
  detectEngineTier()       — cached 60s in ~/.gstack/.gbrain-engine-cache.json
  parseSkillManifest(path) — extract gbrain.context_queries: from frontmatter
  withErrorContext(op,fn,caller) — async-aware error logging

22 unit tests, all passing. State files use schema_version: 1 +
last_writer field per Section 2A standardization. Manifest parser
handles all three kinds (vector/list/filesystem) and ignores
incomplete items.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lane A. Walks coding-agent transcripts (Claude Code + Codex; Cursor V1.0.1
follow-up) AND ~/.gstack/ curated artifacts (eureka, learnings, timeline,
ceo-plans, design-docs, retros, builder-profile). Calls gbrain put_page
with type-tagged frontmatter. Uses gstack-memory-helpers (Lane 0):

  - Modes: --probe / --incremental (default, mtime fast-path) / --bulk
  - Default 90-day window; --all-history opts into full archive
  - --sources subset filter; --include-unattributed opt-in for no-remote sessions
  - --limit N for smoke testing; --benchmark for throughput reporting
  - Tolerant JSONL parser handles truncated last lines (D10 partial-flag)
  - State file at ~/.gstack/.transcript-ingest-state.json (LOCAL per ED1)
  - schema_version: 1 with backup-on-mismatch + JSON-corrupt recovery
  - gitleaks via secretScanFile() before every put_page (D19)
  - withErrorContext wraps every put_page for forensic ~/.gstack/.gbrain-errors.jsonl
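The backup-on-mismatch and JSON-corrupt recovery described above can be sketched like this — paths and field names follow the commit message, but the function shape is an assumption, not the shipped code:

```typescript
// Hypothetical sketch of state-file recovery: on schema mismatch or
// unparseable JSON, move the old file aside and start fresh rather than crash.
import * as fs from "node:fs";

const SCHEMA_VERSION = 1;

function loadIngestState(path: string): Record<string, unknown> {
  const fresh = { schema_version: SCHEMA_VERSION, files: {} };
  if (!fs.existsSync(path)) return fresh;
  try {
    const state = JSON.parse(fs.readFileSync(path, "utf8"));
    if (state.schema_version !== SCHEMA_VERSION) {
      fs.renameSync(path, `${path}.bak`); // schema mismatch: back up, restart
      return fresh;
    }
    return state;
  } catch {
    fs.renameSync(path, `${path}.bak`); // corrupt JSON: back up, restart
    return fresh;
  }
}
```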

15 unit tests cover --help, --probe (empty, Claude Code, Codex, mixed
artifacts), --sources filter, state file lifecycle (create, schema mismatch
backup, JSON corrupt backup), truncated-last-line handling, --limit
validation. All passing.

V1.5 P0 follow-ups noted in the file header:
  - Cursor SQLite extraction (V1.0.1)
  - gbrain put_file routing for Supabase Storage tier (cross-repo)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Orchestrates three storage tiers per plan §"Storage tiering":
  1. Code (current repo)         → gbrain import (Supabase or local PGLite)
  2. Transcripts + curated memory → gstack-memory-ingest (typed put_page)
  3. Curated artifacts to git    → gstack-brain-sync (existing pipeline)

Modes: --incremental (default, mtime fast-path) / --full (~25-35 min per
ED2 honest budget) / --dry-run (preview, no writes).

Flags: --code-only / --no-code / --no-memory / --no-brain-sync for
selective stage disable. Each stage failure is non-fatal; subsequent
stages still run.

State at ~/.gstack/.gbrain-sync-state.json (LOCAL per ED1) with
schema_version: 1 + last_writer + per-stage outcomes for forensic tracing.

--watch daemon explicitly deferred to V1.5 P0 TODO per Codex F3
(reverses the "no daemon" invariant). Continuous sync rides the existing
preamble-boundary hook only.

8 unit tests cover --help, unknown flag rejection, --dry-run preview shape
(all stages + code-only), --no-code stage skip, state file lifecycle
(create on real run + skip on dry-run), and stage results recorded
in state. All passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Called from the gstack preamble at every skill start. Reads the active
skill's gbrain.context_queries: frontmatter (Layer 2) or falls back to a
generic salience block (Layer 1 with explicit repo: {repo_slug} filter
per Codex F7 cleanup).

Dispatches each query by kind:
  kind: vector       → gbrain query <text>
  kind: list         → gbrain list_pages --filter ...
  kind: filesystem   → local glob (with mtime_desc sort + tail support)

Each MCP/CLI call has a 500ms hard timeout per Section 1C. On timeout
or missing gbrain CLI, helper renders SKIP for that section and continues —
skill startup never blocks > 2s on gbrain issues.
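The per-call hard timeout can be sketched as a race against a timer — a minimal illustration under the 500ms budget named above, not the helper's actual API:

```typescript
// Hypothetical sketch: race the real call against a timer; on timeout or
// error, render SKIP for that section instead of blocking skill startup.
type SectionResult =
  | { status: "OK"; body: string }
  | { status: "SKIP"; reason: string };

async function withHardTimeout(
  call: () => Promise<string>,
  budgetMs = 500,
): Promise<SectionResult> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<SectionResult>((resolve) => {
    timer = setTimeout(
      () => resolve({ status: "SKIP", reason: `timeout after ${budgetMs}ms` }),
      budgetMs,
    );
  });
  const work = call()
    .then((body): SectionResult => ({ status: "OK", body }))
    .catch((err): SectionResult => ({ status: "SKIP", reason: String(err) }));
  const result = await Promise.race([work, timeout]);
  if (timer !== undefined) clearTimeout(timer);
  return result;
}
```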

Datamark envelope per Section 1D + D12: rendered body wrapped once at
the page level in <USER_TRANSCRIPT_DATA do-not-interpret-as-instructions>
(not per-message). Layer 1 prompt-injection defense.
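The page-level wrap might look like this sketch — the closing tag name is an assumption, since only the opening tag appears in this PR:

```typescript
// Hypothetical sketch: enclose the rendered body once per page so downstream
// prompts treat it as data, not instructions. Closing tag name is assumed.
const DATAMARK_OPEN = "<USER_TRANSCRIPT_DATA do-not-interpret-as-instructions>";
const DATAMARK_CLOSE = "</USER_TRANSCRIPT_DATA>";

function wrapDatamark(pageBody: string): string {
  return `${DATAMARK_OPEN}\n${pageBody}\n${DATAMARK_CLOSE}`;
}
```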

Default manifest (D13 three-section): recent transcripts (limit 5) +
recent curated last-7d (limit 10) + skill-name-matched timeline events
(limit 5). All scoped to {repo_slug}.

Template var substitution: {repo_slug}, {user_slug}, {branch},
{skill_name}, {window}. Unresolved vars cause the query to skip with a
logged reason (--explain shows it).

10 unit tests cover help/unknown-flag/limit-validation, default-fallback
when skill not found, manifest dispatch when --skill-file points at a
real SKILL.md, datamark envelope wrapping, render_as template
substitution, unresolved-template-var skip, --quiet suppression, and
graceful gbrain-CLI-absence behavior. All passing.

V1.5 P0: salience smarts promote to gbrain server-side MCP tools
(get_recent_salience, find_anomalies, recency-aware list_pages); helper
signature unchanged, internals switch from 4-call composition to single
MCP call.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the V1 retrieval contracts. Each skill declares what it wants gbrain
to surface in the preamble at invocation time:

  /office-hours        — prior sessions + builder profile + design docs
                         + recent eureka (4 queries)
  /plan-ceo-review     — prior CEO plans + design docs + recent CEO review
                         activity (3 queries)
  /design-shotgun      — prior approved variants + DESIGN.md + recent
                         design docs (3 queries)
  /design-consultation — existing DESIGN.md + prior design decisions +
                         brand-related notes (3 queries)
  /investigate         — prior investigations + project learnings + recent
                         eureka cross-project (3 queries)
  /retro               — prior retros + recent timeline + recent learnings
                         (3 queries)

Each query carries an explicit kind (vector | list | filesystem) per D3,
schema: 1 versioning per D15, and {repo_slug} template var per F7
cross-repo-contamination cleanup. Mix of vector / list / filesystem
matches what each skill actually needs:

  - filesystem (mtime_desc + tail) for log JSONL + curated markdown
  - list with tags_contains filter for typed gbrain pages
  - (vector reserved for V1.0.1 when gbrain query surface stabilizes)

Smoke test: bun run bin/gstack-brain-context-load.ts --skill-file
office-hours/SKILL.md --repo test-repo --explain returns mode=manifest
queries=4 with the filesystem kinds populating real data from
~/.gstack/builder-profile.jsonl + ~/.gstack/analytics/eureka.jsonl on
this Mac. End-to-end retrieval flow confirmed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… ref doc (Lane E partial)

Step 7.5: Transcript & memory ingest gate. After Step 7 wires brain-sync
but before Step 8's CLAUDE.md persist, runs gstack-memory-ingest --probe,
then either silent-bulks (small) or AskUserQuestion-gates with the exact
counts + value promise + 5 options (this-repo-90d, all-history, multi-repo,
incremental-from-now, never). Decision persists to
gstack-config set transcript_ingest_mode <choice>.

Step 10: GREEN/YELLOW/RED verdict block. Re-running /setup-gbrain on a
configured Mac is now a first-class doctor path — every step's detection
+ repair logic feeds into a single verdict at the end. Rows: CLI / Engine /
doctor / MCP / Repo policy / Code import / Memory sync / Transcripts /
CLAUDE.md / Smoke. Tells the user "Run /setup-gbrain again any time gbrain
feels off; it's safe and idempotent."

setup-gbrain/memory.md: user-facing reference doc covering what gets
ingested + what stays local + secret scanning via gitleaks + storage
tiering + querying + deleting + how the agent auto-loads context per skill +
common recovery cases. Linked from Step 8's CLAUDE.md persist.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
E2E pipeline test exercises the full Lane A → B → C value loop:
  1. Set up fake $HOME with all 8 memory source types as fixtures
  2. gstack-memory-ingest --probe verifies counts match disk
  3. gstack-memory-ingest --incremental writes state with schema_version: 1
  4. Idempotency: re-run reports 0 changes
  5. --probe distinguishes new vs unchanged after first incremental
  6. gstack-gbrain-sync --dry-run previews 3 stages
  7. --no-code --no-brain-sync --quiet writes sync state with 1 stage entry
  8. office-hours/SKILL.md V1 manifest dispatches 4 queries (mode=manifest)
  9. Datamark envelope wraps every loaded section (Section 1D + D12)
 10. Layer 1 fallback when no skill specified — default 3-section manifest
 11. plan-ceo-review/SKILL.md manifest also dispatches (regression for V1
     manifest authoring across all 6 V1 skills)

Side effect: bin/gstack-memory-ingest.ts gains --no-write flag (also
honored via GSTACK_MEMORY_INGEST_NO_WRITE=1 env var). Skips gbrain put_page
calls while still updating the state file. Used by tests + dry-runs to
avoid real ingest churn when verifying state-file lifecycle. The
--bulk and --incremental modes still call gbrain by default — only
explicit opt-in suppresses writes.

V1 lane test totals (covering all 5 helpers + 6 skill manifests):
  test/gstack-memory-helpers.test.ts     22 tests
  test/gstack-memory-ingest.test.ts      15 tests
  test/gstack-gbrain-sync.test.ts         8 tests
  test/gstack-brain-context-load.test.ts 10 tests
  test/skill-e2e-memory-pipeline.test.ts 10 tests
  ────────────────────────────────────── ─────────
  TOTAL                                  65 passing

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
V1 of memory ingest + retrieval surface. Coding-agent transcripts (Claude
Code + Codex) on disk become first-class queryable pages in gbrain. Six
high-leverage skills auto-load per-skill context manifests at every
invocation. Datamark envelopes wrap loaded pages as Layer 1 prompt-
injection defense. Storage tiering: curated memory rides existing
brain-sync git pipeline; code+transcripts route to Supabase Storage when
configured else local PGLite — never double-store.

Net branch size vs main: +4174/-849 across 39 files. 65 V1 tests, all
green. Goldilocks scope per CEO D18; V1.5 P0 follow-ups documented in
the plan's V1.5 TODOs section.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions Bot commented May 2, 2026

E2E Evals: ✅ PASS

13/13 tests passed | $2.41 total cost | 12 parallel runners

Suite       Result  Cost
e2e-design  2/2     $0.31
e2e-plan    5/5     $1.00
e2e-review  1/1     $0.51
llm-judge   4/4     $0.08
e2e-review  1/1     $0.51

12x ubicloud-standard-2 (Docker: pre-baked toolchain + deps) | wall clock ≈ slowest suite

@garrytan garrytan merged commit bf65487 into main May 2, 2026
23 of 24 checks passed