feat: self-contained API keys (read from gbrain's own config, not just env) by vinsew · Pull Request #121 · garrytan/gbrain

vinsew · 2026-04-14T18:02:23Z

Summary

Today src/core/embedding.ts and src/core/search/expansion.ts call new OpenAI() / new Anthropic() with no arguments, which makes the SDKs read OPENAI_API_KEY / ANTHROPIC_API_KEY from the process's env. That puts the burden on every caller — shells, cron jobs, agent subprocess tools, daemons, containers — to propagate those env vars correctly.

When a caller's subprocess env doesn't have them (common for launchd-spawned daemons, agent terminal tools with env sanitization, or any subprocess without explicit env forwarding), the caller silently gets empty results from gbrain query / gbrain embed because the SDK falls back to anonymous API calls that fail with 401 (then throw, then get caught by gbrain's retry/fallback logic, then return "No results").

The debugging experience is painful: the user sees "No results" in their agent's output, assumes the brain is empty, doesn't realize the subprocess env is missing keys. I hit this personally running gbrain query from a launchd-managed agent's terminal tool — the .zshrc-sourced keys were in my interactive shell but not in the daemon's env, and there's no surface-level error telling you that.

The existing schema is almost there — just unconnected

GBrainConfig (src/core/config.ts:10-16) already defines openai_api_key and anthropic_api_key fields, and saveConfig() writes them to ~/.gbrain/config.json with 0600 perms. But none of the runtime code actually reads those fields — the config is populated but never consulted. This PR just connects the wiring.

Also fixed: loadConfig() was silently dropping ANTHROPIC_API_KEY from the env merge (only OPENAI_API_KEY was being merged on line 43). Minor but present bug.

Fix

Three small changes:

config.ts: merge ANTHROPIC_API_KEY env var into loaded config alongside OPENAI_API_KEY (was silently dropped).
embedding.ts: read openai_api_key from loadConfig() and pass to new OpenAI({ apiKey }). Falls back to new OpenAI() (SDK env-default) when config has no key — preserves current behavior for users who rely on env vars.
expansion.ts: same pattern for Anthropic.

Precedence preserved from existing code: env var > config file (because loadConfig() merges env over file). Users who want to override per-process still can.

Usage for callers (new capability)

One-time setup:

# Edit gbrain's own config file
$ vim ~/.gbrain/config.json
# or via a helper (could add \`gbrain config set openai_api_key sk-...\` in follow-up PR)

Then from any caller — no env vars needed:

gbrain query \"...\"      # just works
gbrain embed --stale    # just works

Especially valuable for:

Cron jobs under launchd/systemd (don't inherit shell env)
Agent terminal tools with env sanitization (e.g. sandbox policies)
Subprocess calls from Python/Node agents that don't forward env
Docker containers without -e OPENAI_API_KEY passthrough

Impact

4 files changed, +82 / -3 lines
Zero behavior change for users who already have env vars set
New capability: callers without env vars in their subprocess context now work if keys are in ~/.gbrain/config.json
Config file storage is 0600-permissioned (already done by saveConfig)

Test plan

4 new tests in test/config.test.ts:
- OPENAI_API_KEY env merges into config
- ANTHROPIC_API_KEY env merges into config (regression — was missing)
- Both together
- Both absent when neither source provides them
All 12 config tests pass (8 existing + 4 new)
Full bun test suite: no new regressions (the 4 pre-existing PGLiteEngine failures are unrelated and exist on master)

Context

Fourth in a series of PRs from real-user setup on a Chinese-speaking deployment: #114 (chunker CJK), #115 (slugify CJK), #119 (sync CJK via core.quotepath), and now this one (key portability). Each addresses a distinct anti-pattern surfaced by running gbrain outside the "English-speaker, interactive-shell, env-vars-in-.zshrc" default assumption.

This PR is the only one that's a feature, not a bug fix — but it makes gbrain meaningfully more embeddable as the knowledge backbone for cron-driven agents and daemons, which is the direction the SKILLPACK itself advocates.

Today embedding.ts and expansion.ts call `new OpenAI()` / `new Anthropic()` with no arguments, which makes the SDKs read OPENAI_API_KEY / ANTHROPIC_API_KEY from the process's env. That puts the burden on every caller — shells, cron jobs, agent subprocesses, daemons — to propagate those env vars correctly. When a caller's env doesn't have them (e.g. launchd-spawned daemons, agent terminal tools with sanitized env), the caller silently gets empty results from `gbrain query` / `gbrain embed` because the SDK falls back to anonymous API calls that fail. GBrain already has `openai_api_key` and `anthropic_api_key` fields in its GBrainConfig schema (src/core/config.ts) and stores them in ~/.gbrain/config.json, but none of the runtime code actually reads those fields — the config is populated but never consulted. This PR connects that wiring so gbrain becomes self-contained: callers just run `gbrain ...` and gbrain finds its own keys. Changes: - config.ts: merge ANTHROPIC_API_KEY from env into loaded config (was silently dropped — only OPENAI_API_KEY was being merged) - embedding.ts: read openai_api_key from loadConfig() and pass to `new OpenAI({ apiKey })`. Falls back to SDK's env-default behavior when config has no key (preserves current behavior for users who rely on env vars). - expansion.ts: same pattern for Anthropic. Usage for callers: # One-time setup (put keys in gbrain's own config file) $ cat >> ~/.gbrain/config.json.fragment <<EOF {"openai_api_key": "sk-...", "anthropic_api_key": "sk-ant-..."} EOF $ chmod 600 ~/.gbrain/config.json # (or edit config.json directly) # Then from ANY caller, no env vars needed: $ gbrain query "..." # just works $ gbrain embed --stale # just works This is especially valuable for: - Cron jobs run under launchd/systemd (which don't inherit shell env) - Agent terminal tools with env sanitization - Subprocess calls from Python/Node agents without env passthrough - Docker containers without explicit env forwarding Precedence (preserved from existing code): env var > config file So users who want to override per-process still can. Impact: - 4 files changed, +67 / -5 lines - Zero behavior change for users who already have env vars set - Callers without env vars in their subprocess context now work IF the keys are written to ~/.gbrain/config.json Tests: - 4 new tests in test/config.test.ts cover: OPENAI env merge, ANTHROPIC env merge (regression — was missing), both together, and absence-when-neither. - All 12 config tests pass; no pre-existing regressions.

GBrain stores internal cross-page references in slug form (e.g. `[Alice](./alice)`) because the slug is the canonical identifier in the DB. That works inside GBrain's own resolution layer. But when those pages are exported as `.md` files on disk and opened in standard markdown viewers (Obsidian, VS Code preview, GitHub web view, typical mkdocs/jekyll renderers), the viewers look for a literal file at `./alice` — which doesn't exist. The actual file is `./alice.md`. Result: every internal link in an exported brain is silently broken on disk. The user clicks `[小龙]` in `龙虾群.md`, sees a 404 / empty page, and cannot navigate the brain outside of GBrain itself. This defeats half the value of having the brain stored as portable markdown. Fix: Add `normalizeInternalLinks(content)` that runs over each page's serialized markdown right before `writeFileSync` and rewrites slug-form internal links to filename-form by appending `.md`: [Alice](./alice) -> [Alice](./alice.md) [Alice](alice) -> [Alice](alice.md) [Alice](../people/alice) -> [Alice](../people/alice.md) [小龙](../people/小龙) -> [小龙](../people/小龙.md) Conservative: leaves untouched anything that looks external or already extended: - URL schemes (http:, https:, mailto:, ftp:, file:, tel:, ...) — skip - Anchors (#section) — skip - Empty targets — skip - Trailing slash (directory references) — skip - Already has any extension (.md, .png, .pdf, .MD, ...) — skip - Preserves query strings and anchors when appending: [Section](./alice#bio) -> [Section](./alice.md#bio) [Search](./alice?q=t) -> [Search](./alice.md?q=t) The DB content stays slug-form (GBrain's internal convention is unchanged). Only the on-disk export gets the `.md` annotation, so the exported markdown is viewable as-is by any standard renderer. Real-world reproduction this fix addresses: $ gbrain put 龙虾群 < <(echo '[小龙](./小龙)') $ gbrain export --dir /tmp/out $ cat /tmp/out/龙虾群.md # before this PR: contains [小龙](./小龙) — clicking 404s # after this PR: contains [小龙](./小龙.md) — clicking opens the file Impact: - 2 files changed, +149 / -1 lines (1 line of helper invocation + ~40 lines of helper + comment + 26 tests) - Zero behavior change for external URLs, anchors, or already-extended links - DB content unchanged — only the on-disk export representation gains the `.md` annotation - Existing exports remain valid (re-running export on an already-exported brain is idempotent because already-extended links are skipped) Tests: - 26 new tests covering: same-dir slug, parent-dir slug, deep nesting, CJK slugs, multiple links per line, multi-line markdown, all 6 external schemes (http/https/mailto/file/ftp/tel), all 4 extension cases (md/png/pdf/uppercase), anchor preservation, query preservation, empty/trailing-slash/no-link edge cases. - All 26 tests pass. - Full suite: 612 pass / no new regressions (4 pre-existing PGLiteEngine failures are unrelated and exist on master). Fifth in a series of practical PRs from a real Chinese-speaking deploy. Companion to: - garrytan#114 (chunker CJK) - garrytan#115 (slugify CJK) - garrytan#119 (sync git quotepath CJK) - garrytan#121 (self-contained API keys) Same theme: GBrain is meaningfully more useful when the markdown export is a first-class deliverable, not a half-broken side-effect.

vinsew force-pushed the fix/self-contained-api-keys branch from 98cd225 to 83d3851 Compare April 14, 2026 18:06

vinsew mentioned this pull request Apr 14, 2026

fix: gbrain export auto-appends .md to internal slug-form links #123

Open

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: self-contained API keys (read from gbrain's own config, not just env)#121

feat: self-contained API keys (read from gbrain's own config, not just env)#121
vinsew wants to merge 1 commit intogarrytan:masterfrom
vinsew:fix/self-contained-api-keys

vinsew commented Apr 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

vinsew commented Apr 14, 2026

Summary

The existing schema is almost there — just unconnected

Fix

Usage for callers (new capability)

Impact

Test plan

Context

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant