feat(builder): open-source BitBadgesAgent — BYO Anthropic key#172
Adds a programmatic AI builder so anyone can build BitBadges
collections from a prompt without going through the BitBadges
frontend or API. Consumers install @anthropic-ai/sdk as a peer
dep and pass their own key — BitBadges never sees credentials.
Three-tier surface:
- bitbadges/builder/agent → BitBadgesAgent class (stable)
- bitbadges/builder/internals → prompt, loop, validation, adapters
(unstable — for DIY consumers, may break between minors)
- existing bitbadges-builder MCP stdio bin unchanged
BitBadgesAgent features:
- Zero-config: new BitBadgesAgent({ anthropicKey }).build('…')
- Model picker (haiku/sonnet/opus) with per-model cost reporting
- Validation modes: strict (default), lenient, off
- Skills filter, systemPromptAppend, full systemPrompt replace
- tools.add / tools.remove for bounded customization
- Hooks: onTokenUsage, onToolCall, onStatusUpdate, onCompletion
- Pluggable KVStore (MemoryStore + FileStore ship, consumers can
BYO Redis/etc.)
- Typed errors (ValidationFailedError, QuotaExceededError, etc.)
- substituteImages / collectImageReferences helpers
- healthCheck() + validate() QoL methods
- OAuth token support in addition to API key (ANTHROPIC_OAUTH_TOKEN)
- BITBADGES_API_KEY passthrough into every query tool
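The pluggable store in the feature list can be pictured as a small async get/set/delete contract. This is a minimal sketch, assuming a hypothetical `SketchKVStore` interface and a TTL-aware in-memory map; the method names and the TTL parameter are illustrative guesses, not the SDK's actual `KVStore` or `MemoryStore` signatures.

```typescript
// Hypothetical interface: names and parameters are assumptions for illustration.
interface SketchKVStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds?: number): Promise<void>;
  delete(key: string): Promise<void>;
}

class SketchMemoryStore implements SketchKVStore {
  private readonly entries = new Map<string, { value: string; expiresAt: number | null }>();
  private readonly now: () => number;

  // The clock is injectable so expiry can be exercised without waiting.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  async get(key: string): Promise<string | null> {
    const entry = this.entries.get(key);
    if (!entry) return null;
    if (entry.expiresAt !== null && entry.expiresAt <= this.now()) {
      this.entries.delete(key); // lazily evict expired entries on read
      return null;
    }
    return entry.value;
  }

  async set(key: string, value: string, ttlSeconds?: number): Promise<void> {
    const expiresAt = ttlSeconds === undefined ? null : this.now() + ttlSeconds * 1000;
    this.entries.set(key, { value, expiresAt });
  }

  async delete(key: string): Promise<void> {
    this.entries.delete(key);
  }
}
```

A Redis-backed store would presumably implement the same three methods and delegate expiry to the server.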
Ported from the indexer: prompt assembly, agent loop with
retry/compression, validation gate + fix-loop driver, simulation
error patterns. Prompt/system-prompt behavior byte-identical to
indexer today.
Package changes:
- Exports map adds ./builder/agent and ./builder/internals
- @anthropic-ai/sdk as optional peerDependency (>=0.80.0 <1.0.0)
- Three example scripts under examples/builder-agent/
- 17 unit tests covering zero-config, OAuth, env vars, tool
filtering, image substitution, hooks, typed errors
Backlog: #0298.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Greptile Summary
This PR open-sources Key findings:
Confidence Score: 4/5
Safe to merge after the one-line fix in anthropicClient.ts to re-throw PeerDependencyError before the generic catch.
The PR is architecturally sound and all three previously flagged issues are resolved. The P1 bug (version-check error swallowed) is a straightforward one-line fix and only affects the error message quality, not runtime correctness. The P2 items are documentation/UX polish. The agent loop, validation gate, session stores, tool adapter, and test suite are all in good shape.
packages/bitbadgesjs-sdk/src/builder/agent/anthropicClient.ts: the catch block on line 59 needs to re-throw PeerDependencyError before falling through to the generic handler.
| Filename | Overview |
|---|---|
| packages/bitbadgesjs-sdk/src/builder/agent/anthropicClient.ts | Dynamic peer-dep loader with version check; P1 bug — the catch block swallows PeerDependencyError from assertSupportedVersion and replaces it with a misleading "not installed" message. |
| packages/bitbadgesjs-sdk/src/builder/agent/BitBadgesAgent.ts | Main agent class — well-structured with per-build AbortController set, lazy client init, fix-loop, and session persistence; previously flagged issues (double tx call, shared controller) are resolved. |
| packages/bitbadgesjs-sdk/src/builder/agent/loop.ts | Agent conversation loop — retry logic, token accounting, tool-result compression, and fire-and-forget hooks look correct and well-documented. |
| packages/bitbadgesjs-sdk/src/builder/agent/validation.ts | Validation gate runs three checks (review, structural validation, simulation) in a clear sequence; error classification and advisory note handling look correct. |
| packages/bitbadgesjs-sdk/src/builder/agent/images.ts | IMAGE_N substitution recursively walks entire transaction; JSDoc incorrectly states only image-named fields are rewritten when all string values are checked. |
| packages/bitbadgesjs-sdk/src/builder/agent/errors.ts | Typed error hierarchy with correct prototype chain restoration; AnthropicAuthError message hardcodes API key hint even for OAuth failures. |
| packages/bitbadgesjs-sdk/src/builder/agent/sessionStore.ts | MemoryStore (TTL-aware Map) and FileStore (JSON files, TTL) — path sanitization replaces slashes so no traversal risk; clean interface. |
| packages/bitbadgesjs-sdk/src/builder/agent/BitBadgesAgent.spec.ts | 17 unit tests covering zero-config, OAuth, env vars, tool filtering, image substitution, hooks, typed errors, and end-to-end with a mocked Anthropic client; good coverage. |
Sequence Diagram
sequenceDiagram
participant Caller
participant BitBadgesAgent
participant AnthropicClient
participant AgentLoop
participant ToolRegistry
participant ValidationGate
participant SessionStore
Caller->>BitBadgesAgent: build(prompt, options)
BitBadgesAgent->>AnthropicClient: getAnthropicClient() [lazy, once]
BitBadgesAgent->>SessionStore: get(sessionId) [load prior messages]
BitBadgesAgent->>BitBadgesAgent: assemblePromptParts()
loop Agent Loop (maxRounds)
BitBadgesAgent->>AgentLoop: runAgentLoop(params)
AgentLoop->>AnthropicClient: messages.create() [with retry]
AnthropicClient-->>AgentLoop: response
AgentLoop->>AgentLoop: onTokenUsage hook [awaited]
AgentLoop->>ToolRegistry: execute(toolName, args)
ToolRegistry-->>AgentLoop: result string
AgentLoop->>AgentLoop: compressOldToolResults()
end
AgentLoop-->>BitBadgesAgent: loopResult
BitBadgesAgent->>BitBadgesAgent: handleGetTransaction(sessionId)
opt validation !== off
loop Fix Loop (fixLoopMaxRounds)
BitBadgesAgent->>ValidationGate: runValidationGate(transaction)
ValidationGate-->>BitBadgesAgent: gate result
alt gate.valid === false
BitBadgesAgent->>AgentLoop: runAgentLoop(fixPrompt)
AgentLoop-->>BitBadgesAgent: loopResult
end
end
end
BitBadgesAgent->>SessionStore: set(sessionId, messages+tx)
BitBadgesAgent->>BitBadgesAgent: onCompletion hook [fire-and-forget]
BitBadgesAgent-->>Caller: BuildResult
Prompt To Fix All With AI
This is a comment left during a code review.
Path: packages/bitbadgesjs-sdk/src/builder/agent/anthropicClient.ts
Line: 59-64
Comment:
**Version-check error silently replaced by wrong message**
The `catch` block in `loadAnthropicSdk` catches **all** exceptions, including the `PeerDependencyError` thrown by `assertSupportedVersion` when the installed SDK is outside `>=0.80.0 <1.0.0`. The catch re-throws with the "not installed" message, discarding the version-mismatch message that was carefully written.
Concretely: a user with `@anthropic-ai/sdk@0.79.0` will see:
> `@anthropic-ai/sdk is required to use BitBadgesAgent but is not installed`
instead of:
> `@anthropic-ai/sdk version 0.79.0 detected; BitBadgesAgent requires >=0.80.0 <1.0.0. Install a compatible version…`
Fix: re-throw `PeerDependencyError` instances before the generic handler:
```ts
} catch (err) {
  if (err instanceof PeerDependencyError) throw err;
  throw new PeerDependencyError(
    `@anthropic-ai/sdk is required to use BitBadgesAgent but is not installed. ` +
      `Install it with: npm install @anthropic-ai/sdk (supported range: ${SUPPORTED_RANGE})`
  );
}
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: packages/bitbadgesjs-sdk/src/builder/agent/images.ts
Line: 17-23
Comment:
**JSDoc contradicts the implementation**
The JSDoc comment says:
> "Only string fields named `image` are rewritten"
But the `walk` function replaces `IMAGE_N` tokens in **every string value** across the entire transaction — the key name is never checked. The test on line 119 confirms this: it substitutes inside `metadataPlaceholders['ipfs://METADATA_COLLECTION'].image`, but the same logic would also replace a field named `description` or `uri` if it contained `IMAGE_1`.
If the broad-replacement behaviour is intentional (matching the frontend), update the JSDoc to reflect it:
```ts
* Recursively replace every string value matching `IMAGE_N` anywhere in the
* transaction with the corresponding entry from `images`. All string-valued
* fields are checked — not just fields named `image`.
```
If the intent really was to restrict replacement to `image`-named fields, the `walk` helper needs a key-awareness parameter.
How can I resolve this? If you propose a fix, please make it concise.
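To make the broad-replacement behaviour described in this comment concrete, here is a minimal sketch of a key-agnostic recursive walk. It assumes `images` is a plain map keyed by the placeholder token (for example `IMAGE_1`) and that only whole-string matches are replaced; both are assumptions, and `sketchSubstituteImages` is a hypothetical name rather than the SDK's `substituteImages`.

```typescript
// Hypothetical helper: replaces any string value that is exactly an IMAGE_N
// token, regardless of the key it is stored under. Returns a new structure.
function sketchSubstituteImages<T>(tx: T, images: Record<string, string>): T {
  const walk = (node: unknown): unknown => {
    if (typeof node === "string") {
      const isToken = /^IMAGE_\d+$/.test(node);
      const hasImage = Object.prototype.hasOwnProperty.call(images, node);
      return isToken && hasImage ? images[node] : node;
    }
    if (Array.isArray(node)) return node.map(walk);
    if (node !== null && typeof node === "object") {
      const out: Record<string, unknown> = {};
      for (const [key, value] of Object.entries(node as Record<string, unknown>)) {
        out[key] = walk(value); // the key name is never inspected
      }
      return out;
    }
    return node; // numbers, booleans, null, undefined pass through
  };
  return walk(tx) as T;
}
```

Restricting replacement to `image`-named fields would mean passing the current key into `walk` and checking it before substituting.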
---
This is a comment left during a code review.
Path: packages/bitbadgesjs-sdk/src/builder/agent/errors.ts
Line: 63-72
Comment:
**`AnthropicAuthError` message always mentions `ANTHROPIC_API_KEY`, even for OAuth failures**
The hardcoded error message is:
> "Anthropic authentication failed. Check that ANTHROPIC_API_KEY is set and valid."
This is only helpful when the user is using an API key. When they're using `anthropicAuthToken` or the `ANTHROPIC_OAUTH_TOKEN` env var, the message is misleading — they'll look for an API key that isn't relevant to their setup.
Consider a generic message:
```ts
constructor(detail?: string) {
  super(
    `Anthropic authentication failed. Verify that your Anthropic credentials (API key or OAuth token) are valid.${detail ? ` (${detail})` : ''}`,
    'ANTHROPIC_AUTH_ERROR',
    503
  );
}
```
How can I resolve this? If you propose a fix, please make it concise.
Reviews (2): Last reviewed commit: "feat(builder/agent): Anthropic prompt ca..."
Four findings:
- P1: double `handleGetTransaction` call. The `??` fallback invoked the tool twice when `.transaction` was nullish. Collapse to one call and extract the tx from either wrapping shape.
- P1: shared `abortController` field clobbered by concurrent `build()` calls. Move to a per-build controller tracked in a Set; `agent.abort()` now aborts every in-flight build on the instance. Restructured into `build() → runBuild()` so the Set entry is always cleaned up via try/finally.
- P2: `SUPPORTED_RANGE` was only used in error messages, never compared against the installed SDK. Parse `mod.VERSION` and throw a clear PeerDependencyError if outside >=0.80.0 <1.0.0. Silently skip when VERSION is absent/unparseable so future SDK renames don't brick builds.
- P2: document why raw `prompt` is intentionally not run through `containsInjection`: the agent is BYO-key, so the caller controls the key + prompt. Server consumers exposing this to untrusted users apply `containsInjection` at their own trust boundary (indexer already does). Community-skill text from third parties IS still sanitized.
Tests: 17/17 passing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
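The per-build controller pattern from the second P1 can be sketched as follows. This is an illustration only: `SketchBuildTracker` and its `work` callback are hypothetical stand-ins, and the real `build() → runBuild()` split carries far more state.

```typescript
// Hypothetical class: only the Set-of-controllers pattern mirrors the commit text.
class SketchBuildTracker {
  private readonly inFlight = new Set<AbortController>();

  async build(work: (signal: AbortSignal) => Promise<string>): Promise<string> {
    const controller = new AbortController(); // one controller per build call
    this.inFlight.add(controller);
    try {
      return await work(controller.signal);
    } finally {
      this.inFlight.delete(controller); // cleaned up on success, error, or abort
    }
  }

  // Aborts every build currently running on this instance.
  abort(): void {
    this.inFlight.forEach((controller) => controller.abort());
  }

  get activeBuilds(): number {
    return this.inFlight.size;
  }
}
```

Because each call owns its controller, a second concurrent `build()` can no longer overwrite the signal that the first one is listening on.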
…e throws
Greptile P1 on indexer PR #116 surfaced that the indexer's TokenLedger.checkQuota() throws were being silently swallowed when invoked from inside the onTokenUsage hook. Root cause: the SDK was treating every hook as fire-and-forget via `fireHook()`, which wraps the callback in `Promise.resolve().catch(() => {})`.
Distinction now documented + enforced:
- onTokenUsage is LOAD-BEARING. Awaited directly; rejections propagate out of runAgentLoop so consumers can enforce quotas. Matches the legacy indexer agentLoop contract.
- onToolCall / onStatusUpdate / onCompletion stay fire-and-forget: they're observability-only; a misbehaving logger must not hang a build.
Tests: 17/17 pass unchanged (no test depended on the old swallowed-throw behavior).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
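The difference between the two hook classes can be shown in a few lines. This is a sketch under assumptions: `sketchFireAndForget` and `sketchAwaitedHook` are hypothetical helpers, not the SDK's `fireHook()` or its loop internals.

```typescript
type SketchHook<T> = (payload: T) => void | Promise<void>;

// Observability-only: never awaited, and both sync throws and async
// rejections are swallowed so a bad logger cannot break a build.
function sketchFireAndForget<T>(hook: SketchHook<T> | undefined, payload: T): void {
  if (!hook) return;
  Promise.resolve()
    .then(() => hook(payload))
    .catch(() => {});
}

// Load-bearing: awaited, so a throw (for example a quota check) propagates
// to whoever is driving the loop.
async function sketchAwaitedHook<T>(hook: SketchHook<T> | undefined, payload: T): Promise<void> {
  if (hook) await hook(payload);
}
```

With this split, a quota check that throws inside the awaited hook stops the loop, while the same throw inside a fire-and-forget hook is dropped.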
Builds on top of the load-bearing onTokenUsage fix. Rolls caching
into the agent rather than shipping as a standalone PR per user
request ("append to existing PRs").
Changes:
- assemblePromptParts now returns `userContent: Array<{text, cache_control?}>`
in addition to the legacy `userMessage` string. The stable skills
prefix (selectedSkillsSection + promptSkillsSection) is marked
`cache_control: ephemeral`; the per-request tail (context,
metadata, permissions, refinement history, prompt) sits in the
trailing block with no cache mark.
- Skill ordering canonicalized: `[...new Set(selectedSkills)].sort()`
in both the skills section and the request header so different
orderings of the same skill set hit the same cache entry.
- runAgentLoop accepts `userContent`, parses Anthropic's
`cache_creation_input_tokens` / `cache_read_input_tokens`, and
threads them through to hooks and the result.
- Added cacheCreationTokens / cacheReadTokens fields to TokenUsage,
BuildTrace, and AgentLoopResult — consumers can now monitor cache
hit rate without re-parsing provider responses.
- computeCostUsd now takes cache counters and applies Anthropic's
multipliers: cache write = 1.25x input, cache read = 0.10x input.
- Fix-loop rounds intentionally skip `userContent` — the fix prompt
is dynamic error guidance with no cache value.
- result.toString() surfaces cache counts when non-zero.
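A worked sketch of the cost formula and the skill canonicalization listed above. The 1.25x and 0.10x multipliers come from the commit text; the per-million-token prices, the type names, and the assumption that cached tokens are reported separately from plain input tokens are illustrative, and `sketchCostUsd` is not the SDK's `computeCostUsd`.

```typescript
// Hypothetical shapes: real per-model prices are not stated in this PR.
interface SketchPricing {
  inputPerMTok: number;
  outputPerMTok: number;
}

interface SketchUsage {
  inputTokens: number;
  outputTokens: number;
  cacheCreationTokens?: number;
  cacheReadTokens?: number;
}

function sketchCostUsd(pricing: SketchPricing, usage: SketchUsage): number {
  const cacheWrite = usage.cacheCreationTokens ?? 0;
  const cacheRead = usage.cacheReadTokens ?? 0;
  // Cache writes are billed at 1.25x the input rate, cache reads at 0.10x.
  const inputEquivalent = usage.inputTokens + 1.25 * cacheWrite + 0.1 * cacheRead;
  return (inputEquivalent * pricing.inputPerMTok + usage.outputTokens * pricing.outputPerMTok) / 1e6;
}

// De-duplicate and sort so any ordering of the same skill set yields the
// same prompt prefix, and therefore the same cache entry.
function sketchCanonicalSkills(selected: string[]): string[] {
  return Array.from(new Set(selected)).sort();
}
```

When both cache counters are zero or absent, the function reduces to the plain input + output formula, which is why the mocked tests are unaffected.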
Tests: 17/17 still pass (cache counters default to 0 in mocked
responses, cost math degrades to the old input+output formula).
Expected production impact (per backlog #0303): ~60-80% cost
reduction on the repeated stable prefix, faster time-to-first-token.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@greptileai review
…ve version-check errors
Two Greptile P1s on re-review:
1. loadAnthropicSdk swallowed the PeerDependencyError that
assertSupportedVersion throws for out-of-range SDKs. The generic
try/catch wrapped both "module not installed" AND "version
mismatch" into the same "not installed" message, masking the
real issue. Split into two stages: the import is the only thing
in the try/catch; the version check runs outside so its
PeerDependencyError surfaces verbatim.
2. The DRY refactor accidentally deleted BUILDER_SYSTEM_PROMPT_FOR_EXPORT
and collapsed assemblePromptParts's `forExport` option to a no-op.
The export prompt is NOT the same as the hosted-session prompt
— it's for pasting into Claude.ai / ChatGPT where no tools are
available, so it swaps the tool-calling workflow for an explicit
Output Format section describing the `MsgUniversalUpdateCollection`
JSON shape + metadataPlaceholders sidecar layout. Restored:
- BUILDER_SYSTEM_PROMPT_FOR_EXPORT constant (with updated
Output Format section matching current shape)
- `forExport: boolean` option on assemblePromptParts that swaps
the system prompt
- assembleExportPrompt helper for callers that want the
concatenated string (indexer /export-prompt route)
- Both exported from bitbadges/builder/internals
This is load-bearing for the frontend's self-host flow: the "paste
this into Claude.ai" path needs the export prompt; the SDK + MCP
paths use the tool-calling prompt.
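The two-stage split from item 1 can be sketched like this. It is an illustration under assumptions: the importer is injected so the example runs without the peer dependency, the range check is a naive parse of `>=0.80.0 <1.0.0`, and all `sketch`-prefixed names are hypothetical.

```typescript
class SketchPeerDependencyError extends Error {}

// Naive range check for >=0.80.0 <1.0.0. Absent or unparseable versions are
// skipped silently, matching the behaviour described in the earlier commit.
function sketchAssertSupportedVersion(version: unknown): void {
  if (typeof version !== "string") return;
  const match = /^(\d+)\.(\d+)\.(\d+)/.exec(version);
  if (!match) return;
  const major = Number(match[1]);
  const minor = Number(match[2]);
  if (major !== 0 || minor < 80) {
    throw new SketchPeerDependencyError(
      `@anthropic-ai/sdk version ${version} detected; requires >=0.80.0 <1.0.0`
    );
  }
}

async function sketchLoadSdk(
  importer: () => Promise<{ VERSION?: unknown }>
): Promise<{ VERSION?: unknown }> {
  let mod: { VERSION?: unknown };
  try {
    mod = await importer(); // stage 1: only the import is guarded
  } catch {
    throw new SketchPeerDependencyError("@anthropic-ai/sdk is required but is not installed");
  }
  sketchAssertSupportedVersion(mod.VERSION); // stage 2: errors surface verbatim
  return mod;
}
```

Keeping the version check outside the try/catch means there is no catch-all left that could rewrite its message.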
Tests: 17/17 still pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ortPrompt stable API
Backlog #0309 items 1, 2, 3, 7.
- createBitBadgesCommunitySkillsFetcher: prebuilt fetcher that hits the
public /api/v0/builder/community-skills endpoint. Power-user path —
callers bring skill IDs, get the same community skill injection the
hosted flow does. API-key gated; silently returns [] when no key is
configured or the endpoint is unreachable.
- agent.listSkills(): returns SkillInstruction[] (filtered by the
constructor skills whitelist when set). Sync, no network.
- agent.describeSkill(id): lookup one skill by ID. null when unknown
or outside the whitelist.
- Debug-mode warning when selectedSkills contains unknown IDs. Drops
unknown IDs silently (matches legacy behavior) but logs to stderr
when debug: true so callers can catch typos.
- Construction-time warning when skills reference on-chain collections
but no bitbadgesApiKey is configured. query_collection calls would
fail mid-loop; the warning steers users to set the key.
- agent.exportPrompt(prompt, options) promoted from /internals to
stable. Returns { prompt: string; communitySkillsIncluded: string[] }
ready for paste-into-Claude.ai flows. Used by the frontend's "Pure
prompt" path.
- Export getAllSkillInstructions + SkillInstruction from
bitbadges/builder/agent for discovery-UI builders.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds 129 new Jest tests across seven spec files covering the public agent surface introduced on feat/bitbadges-agent. All mocks: no real network, no real Anthropic calls.
New specs:
- models.spec.ts: resolveModel fallbacks + computeCostUsd with the 1.25x cache-write / 0.10x cache-read multipliers verified on worked examples; zero-token inputs don't NaN.
- prompt.spec.ts: buildSystemPrompt(create|update|refine) section composition, BUILDER_SYSTEM_PROMPT_FOR_EXPORT contains Output Format, getSystemPromptHash is deterministic + 12 hex chars, findMatchingErrorPatterns, buildFixPrompt attempt header, assemblePromptParts cache-boundary layout + canonicalized skill ordering, assembleExportPrompt shape.
- sessionStore.spec.ts: parameterized over MemoryStore + FileStore (Date.now mock for memory TTL, mtime-based for file TTL), large-value round-trip, key sanitization, clear helper.
- toolAdapter.spec.ts: zero-config >40 builtins, remove/add/override by name, defaultArgs merged + explicit-args-win, >100KB truncation marker, unknown tool returns serialized error, handler throw is caught.
- images.spec.ts: nested placeholder walk, non-IMAGE_N strings preserved, partial substitution, no-mutation guarantee, lexicographic sort in collectImageReferences.
- communitySkills.spec.ts: empty IDs + no key short-circuit (no fetch), success path, 500 + network + timeout all return [], filters out entries missing name/promptText, honors BITBADGES_API_KEY/URL env.
- errors.spec.ts: instanceof dispatch across every subclass, ValidationFailedError carries errors/tx/advisory, QuotaExceededError carries tokensUsed/tokenCap, AbortedError carries partialTokens.
BitBadgesAgent.spec.ts extended with listSkills / describeSkill whitelist semantics, exportPrompt round-trip, concurrent build isolation, and agent.abort() cancelling every in-flight build.
Final: 8 suites, 146 agent tests, all pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses all critical issues from the parallel code-review + E2E smoke runs on PRs #172 / #116. Tests: 146/146 pass.
**P0 blockers (users could not run a build before this):**
- anthropicClient: `Anthropic.Anthropic ?? Anthropic` resolved to `BaseAnthropic` (an internal parent class with no `.messages` resource), producing "Cannot read properties of undefined (reading 'create')" the first time any agent ran. Verified in Anthropic SDK >=0.82: `mod.default.Anthropic` exists but points at BaseAnthropic. Fixed by using the module export directly (it IS the Anthropic class).
- anthropicClient: OAuth bearer tokens were rejected with a 401 because Anthropic requires the `anthropic-beta: oauth-2025-04-20` header for OAuth creds. Now auto-applied when `authToken` is provided. API-key path unchanged.
- BitBadgesAgent: creatorAddress didn't land on the final tx if the LLM's first tool call was a non-session tool (search_knowledge_base, fetch_docs). The SDK's session was created lazily without the creator, resulting in empty `value.creator`. Now we pre-init the session via `getOrCreateSession(sid, creator)` up front, mirroring the legacy indexer handler's explicit init.
- BitBadgesAgent: sessionId used `Math.random()` for the random suffix, which CodeQL flagged as insecure randomness in a security context. Replaced with `crypto.randomUUID()`.
**P1 correctness:**
- toolAdapter.mergeDefaults: `{ ...defaults, ...incoming }` was a classic footgun: an `incoming` key set to `undefined` would knock out the default. Now strips undefined from incoming before merge.
- BitBadgesAgent: concurrent `build()` calls racing through `this.client ??= await getAnthropicClient()` each fired their own init and the last-winning result silently discarded the others' errors. Shared `clientInitPromise` now deduplicates. Promise cleared on rejection so transient failures don't poison future retries.
- BitBadgesAgent: `systemPromptAppend` was concatenated into the system prompt with zero screening. Hosted/untrusted deployments could inject "ignore previous instructions" via this field. `containsInjection` check now runs at construction and throws a clear error if the append contains obvious injection patterns.
- BitBadgesAgent: `exportPrompt` was skipping the ctor's `systemPromptAppend`, so builds saw the append and exports didn't. Parity restored.
- BitBadgesAgent.runBuild: unguarded `txResponse?.transaction ?? txResponse` could leave `transaction = undefined` if `get_transaction` unexpectedly returned nothing. Falls back to `{ messages: [] }` so downstream validation + sanity checks process a well-formed shell instead of NPE'ing.
**Local-dev ergonomics:**
- createBitBadgesCommunitySkillsFetcher now detects localhost / 127.0.0.1 / *.localhost URLs and skips the "no API key → return []" gate in dev. Mirrors how the indexer itself relaxes auth for local development: third-party devs iterating against a local indexer don't need a BitBadges API key to exercise the community-skills path.
Tests added to cover: OAuth header presence, creator pre-init, sessionId shape, mergeDefaults undefined filter, client init dedup, systemPromptAppend injection rejection, exportPrompt append parity, local-mode fetcher. (Most already in place from the subagent's unit-test pass; tweaked a few to lock in the new behavior.)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
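Two of the P1 fixes above are small enough to sketch directly: stripping `undefined` before merging defaults, and sharing a single init promise across concurrent callers. Both helpers below are hypothetical illustrations of the described behaviour, not the SDK's `mergeDefaults` or `clientInitPromise` code.

```typescript
// Drop undefined values from incoming so they cannot knock out a default;
// an explicit null still overrides.
function sketchMergeDefaults(
  defaults: Record<string, unknown>,
  incoming: Record<string, unknown>
): Record<string, unknown> {
  const defined: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(incoming)) {
    if (value !== undefined) defined[key] = value;
  }
  return { ...defaults, ...defined };
}

// Concurrent callers share one in-flight init; a rejection clears the cached
// promise so a later call can retry instead of inheriting the failure.
class SketchLazyInit<T> {
  private initPromise: Promise<T> | null = null;
  private readonly init: () => Promise<T>;

  constructor(init: () => Promise<T>) {
    this.init = init;
  }

  get(): Promise<T> {
    if (!this.initPromise) {
      this.initPromise = this.init().catch((err) => {
        this.initPromise = null;
        throw err;
      });
    }
    return this.initPromise;
  }
}
```

Every caller that arrives while init is pending receives the same promise, so an init error reaches all of them rather than only the last one.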
… always fires + systemPrompt injection check
Round-two review findings from the parallel subagent pass. All 155
tests pass (+9 new, targeting the fixes).
**P0 (ship blocker for BYO-key flow):**
- Peer-dep resolver failed under bun-link. The Function('s','return
import(s)') trick resolves against the SDK's own file URL. When
the SDK is bun-linked into a consumer, the SDK sits at
`bitbadgesjs/packages/bitbadgesjs-sdk/` which has no
`@anthropic-ai/sdk` — the consumer's does, but the loader never
looks there. Replaced with a three-strategy loader:
1. bare dynamic import (normal installs)
2. createRequire anchored at process.cwd() (bun-link + npm-link,
consumer running from their project root — the common case)
3. createRequire anchored at the SDK's own __filename (hoisted
monorepos)
Verified E2E: Strategy 2 resolves the dep when running from the
indexer directory even though the SDK is symlinked.
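The fallback chain can be sketched as an ordered list of loader strategies tried in turn. The strategies here are injected stubs so the example stays self-contained; `sketchLoadWithStrategies` and the strategy names are hypothetical, and the real loader's error handling may differ.

```typescript
interface SketchStrategy<T> {
  name: string;
  load: () => Promise<T>;
}

// Try each strategy in order; return the first success along with the name
// of the strategy that produced it. Collect failures for the final error.
async function sketchLoadWithStrategies<T>(
  strategies: SketchStrategy<T>[]
): Promise<{ value: T; via: string }> {
  const failures: string[] = [];
  for (const strategy of strategies) {
    try {
      return { value: await strategy.load(), via: strategy.name };
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      failures.push(`${strategy.name}: ${reason}`);
    }
  }
  throw new Error(`All loader strategies failed:\n${failures.join("\n")}`);
}
```

In a Node environment, the second and third strategies would presumably wrap `createRequire` from `node:module`, anchored at a path under `process.cwd()` and at the SDK's own file respectively.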
**P1 (correctness):**
- onCompletion now fires on EVERY exit path, not just success. Prior
spec documented it as "observability-only, fire-and-forget" but
implementation skipped it on thrown errors (ValidationFailedError,
QuotaExceededError, AbortedError). Restructured runBuild with a
try/finally + accumulator; the hook fires once (idempotent) with
whatever state was reached before the throw.
- systemPrompt full-replace field now gets the same containsInjection
check that systemPromptAppend got. Previously only the append was
guarded — a caller passing an untrusted full-replace could bypass
every base-prompt protection.
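The try/finally + accumulator shape described for `onCompletion` might look roughly like the following. `sketchRunBuild`, `SketchBuildState`, and the `work` callback are hypothetical; only the "fire once on every exit path with the state reached so far" behaviour is taken from the commit text.

```typescript
interface SketchBuildState {
  rounds: number;
  error?: unknown;
}

async function sketchRunBuild(
  work: (state: SketchBuildState) => Promise<string>,
  onCompletion?: (state: SketchBuildState) => void | Promise<void>
): Promise<string> {
  const state: SketchBuildState = { rounds: 0 }; // accumulator updated as the build progresses
  let fired = false;
  const fireOnce = (): void => {
    if (fired || !onCompletion) return;
    fired = true;
    // Fire-and-forget: hook errors never affect the build result.
    Promise.resolve()
      .then(() => onCompletion(state))
      .catch(() => {});
  };
  try {
    return await work(state);
  } catch (err) {
    state.error = err; // record what went wrong, then rethrow unchanged
    throw err;
  } finally {
    fireOnce(); // runs on success and on every thrown error
  }
}
```

The `fired` flag keeps the hook idempotent if later refactors add more exit points that call `fireOnce()`.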
**Tests added (9):**
- Injection rejection on systemPromptAppend AND systemPrompt (3 cases).
- exportPrompt picks up the constructor's systemPromptAppend (parity
with build() — regression-guard for the prior gap).
- onCompletion fires once on success and once on ValidationFailedError
(regression-guard for the contract fix).
- Community-skills localhost bypass works + non-localhost still
requires a key.
- toolAdapter mergeDefaults: undefined doesn't knock out a default,
null explicitly overrides it (pins the earlier fix).
**E2E verified (production settings, model=haiku, validation=lenient):**
- anthropic.ok: true (OAuth + beta header path clean)
- creator on final tx: bb1q0qsr... (pre-init propagation confirmed)
- cache read/write ratios healthy (559k/21k tokens on a second build)
- healthCheck / listSkills / describeSkill / exportPrompt all clean
**Known issue (not a regression, deferred):**
The LLM repeatedly omits `collectionPermissions` neutral-array fields,
exhausting the fix loop. This is a pre-existing validator/model-output
mismatch in the SDK tool schema layer — needs its own ticket to either
auto-coerce missing neutral arrays in the validator or strengthen the
system prompt's permissions section.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…-build onLog hook
Two regressions surfaced during code review. Both fixed; all 155
tests pass.
**collectionPermissions regression — "basic stuff failing when it
wasn't before":**
The session template at sessionState.ts:92 started as
`collectionPermissions: {}`. If the LLM skipped calling
`set_permissions` (more common post-caching-refactor for model-
behavior reasons), the validator rejected with 11 missing-field
errors and the fix loop burned 3 rounds trying to recover — ~$0.33
and 2 minutes on a trivial prompt.
Fix: default all 11 permission fields to `[]` (neutral) on the
session template. A build that never touches permissions is now
valid by default, and calling `set_permissions` still overwrites
the whole object identically to before. Matches the old indexer's
implicit autoFixTrivialIssues behavior that was removed earlier on
a "throw at producer, not consumer" rationale — but the real
producer problem here was the template shape, not the tool
handler. Fixing at the template is the cleanest place.
E2E verified: a bare build that doesn't touch permissions now
produces `collectionPermissions: { canDeleteCollection: [], ... }`
and passes validation.
**Dev-console log regression — we lost mid-build `info`/`ai_text`/
`validation` entries:**
Old `runAgentLoop` emitted round-start, AI-text, validation-result,
and error entries via `onLog` that fed `sessionLog → Redis +
fileLog`. My SDK port only kept `onToolCall` — dev-replay JSONL
and the frontend's log-polling route saw tool calls but not the
round boundaries or AI text between them.
Fix: added a generic `onLog` hook to the SDK's AgentHooks
contract. Fire-and-forget (same as onToolCall/onStatusUpdate).
Emitted from:
- loop.ts: round-N start (info + token counts) and AI-text
responses.
- validation.ts: pass/fail with hard-error counts (already
existed as gate-local `onLog`, now forwarded).
Indexer wires it to sessionLog() just like the pre-refactor code.
**Tests:**
- Three existing tests assumed "empty session = invalid" — updated
them to force failure via a `simulate` hook that returns
`valid: false` instead of relying on empty permissions.
- No new test surface needed; onLog is an additive observability
hook mirrored from the audited hook contract.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…gression
Session `ses_1g6lo9elkwp6` surfaced this: 8 rounds of successful
session-tool calls produced an EMPTY final tx (no approvals, no
standards, no tokens, no metadata) and a blank apply on the
frontend.
Root cause: session-mutating tools in the SDK
(handleSetPermissions, handleAddApproval, handleSetStandards, etc.)
internally call `getOrCreateSession(input.sessionId, input.creatorAddress)`
— they read sessionId from the ARGS object, not the ToolExecutionContext.
The pre-refactor indexer's toolRegistry explicitly merged ctx into
args (`{ ...args, sessionId: ctx.sessionId }`) before calling the
handler. My toolAdapter.ts dropped that merge, so tools were
mutating the SDK's default (no-sessionId) session while the agent's
`handleGetTransaction({ sessionId })` read its explicitly-bound
session — which got zero mutations.
Fix: `createAgentToolRegistry` now injects ctx.sessionId + ctx.callerAddress
into every tool call's args before handler execution. Ordering:
{ ...args, sessionId: ctx.sessionId, creatorAddress: ctx.callerAddress }
then mergeDefaults on top. LLM-supplied args can't override the
agent's session binding (they shouldn't — the LLM doesn't know
the correct sessionId).
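The ordering that makes the agent's binding win over LLM-supplied args comes down to spread order. A minimal sketch, assuming a hypothetical `sketchBindContext` wrapper and a simplified handler type; the `mergeDefaults` step mentioned above is left out.

```typescript
interface SketchToolContext {
  sessionId: string;
  callerAddress: string;
}

type SketchHandler = (args: Record<string, unknown>) => unknown;

// Context fields are spread last, so they overwrite anything the LLM put in
// args under the same keys.
function sketchBindContext(handler: SketchHandler, ctx: SketchToolContext): SketchHandler {
  return (args) =>
    handler({
      ...args,
      sessionId: ctx.sessionId,
      creatorAddress: ctx.callerAddress,
    });
}
```

With this wrapper, a session-mutating handler that reads `args.sessionId` always sees the session the agent later reads back with `handleGetTransaction({ sessionId })`.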
Two existing tests updated to assert the new injected-context shape
(the contract is now: ctx values always land on args). 155/155 pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…y + bump to 0.35.0
Round-3 review findings:
- P1: prompt.ts formatContextHelpers() used Object.entries(publicParams) without sorting. Object.entries() follows insertion order for string keys, so objects built with the same keys in a different order iterate differently. Two logically-identical claim configs could produce byte-different prompt prefixes, silently busting Anthropic's prompt-cache on the cache_control ephemeral block. Added .sort() on the key pairs before joining, matching the existing canonicalization on selectedSkills.
- P1: AgentToolRegistry / AgentTool / AnthropicTool types were only re-exported from /internals (the unstable subpath). Third-party devs using `agent.tools` in TypeScript couldn't import the type. Now exported from the public bitbadges/builder/agent entry.
- Version bump 0.34.3 → 0.35.0 for the BitBadgesAgent release. Minor bump reflects the new subpath + peerDep + fetcher + agent class.
155/155 tests still pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
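The key-order problem and its fix can be reproduced in isolation. This sketch assumes a hypothetical `sketchFormatParams` formatter; the real `formatContextHelpers()` output format is not shown in this PR.

```typescript
// Sort entries by key before joining so that insertion order cannot leak
// into the serialized prompt text.
function sketchFormatParams(params: Record<string, unknown>): string {
  return Object.entries(params)
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .map(([key, value]) => `${key}=${JSON.stringify(value)}`)
    .join(", ");
}
```

Two configs with the same keys inserted in different orders now serialize to the same bytes, which is what a prefix-keyed prompt cache needs.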
The class is scoped to collection building, not a generic docs or protocol agent. Renaming makes the surface name match its intent.
- Class: BitBadgesAgent -> BitBadgesBuilderAgent
- Errors: BitBadgesAgentError -> BitBadgesBuilderAgentError
- Options: BitBadgesAgentOptions -> BitBadgesBuilderAgentOptions
- Files: BitBadgesAgent.ts / .spec.ts renamed via git mv
- Log prefix: [bitbadges-agent] -> [bitbadges-builder-agent]
- Export path /builder/agent and examples dir unchanged
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…N truncation, quota tests (#173) Stacked on top of feat/bitbadges-agent. Addresses findings from deep review of #172 that are unambiguous wins (no design debate needed). Blockers - models.ts: opus ID bumped claude-opus-4-6 → claude-opus-4-7 (latest). Updated pinned expectations in models.spec.ts + BitBadgesBuilderAgent.spec.ts so the tests actually enforce the current model. Greptile-flagged polish - images.ts: JSDoc for `substituteImages` previously claimed only fields named `image` were rewritten. Implementation matches every string anywhere in the tx. Doc now describes the real behavior. - errors.ts: `AnthropicAuthError` message hardcoded ANTHROPIC_API_KEY. Rewritten to cover both API-key and OAuth credential paths. Correctness nits - loop.ts COMPRESSIBLE_TOOLS: add simulate_transaction + validate_transaction so the existing summarizeToolResult branches actually fire (dead code before). - loop.ts partial-tokens: guard `err.partialTokens = …` in a try/catch. Some caught errors (frozen, primitive) would turn into a cryptic TypeError instead of propagating the original. - toolAdapter.ts truncation: stop emitting "slice + suffix" which the LLM can't parse. Wrap in {_truncated, originalBytes, preview} — valid JSON, stays well under the 100KB cap. Small API additions - BitBadgesBuilderAgentOptions: new optional `sessionTtlSeconds` (default stays 7200s). Multi-day refinement flows no longer hit a hardcoded TTL. - agent.validate() signature: second arg is now an options bag with `{ creatorAddress?, existingCollectionId?, abortSignal? }`. When `existingCollectionId` is set and an `onChainSnapshotFetcher` is configured, the snapshot is pulled for diff-based review — matches update-mode `build()` behavior. 
Test coverage
- healthCheck() success + failure paths
- validate() with/without snapshot fetcher
- maxTokensPerBuild quota → QuotaExceededError
- sessionTtlSeconds threads through to store
- toolAdapter truncation envelope is valid JSON
All 162 agent tests pass locally (serial run).
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
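Two of the correctness fixes in this commit, the truncation envelope and the guarded `partialTokens` write, can be sketched as follows. This is an illustrative sketch, not the package's code: `capToolResult`, `attachPartialTokens`, and the 2,000-character preview size are hypothetical names and values. Only the envelope keys (`_truncated`, `originalBytes`, `preview`) and the 100KB cap come from the commit message.

```typescript
// Hypothetical sketch; function names and PREVIEW_CHARS are assumptions.
const MAX_TOOL_RESULT_BYTES = 100 * 1024; // the 100KB cap from the commit message
const PREVIEW_CHARS = 2000; // assumed preview size, small enough to stay far below the cap

interface TruncatedEnvelope {
  _truncated: true;
  originalBytes: number;
  preview: string;
}

/** Return the payload unchanged if it fits, otherwise a small, parseable JSON envelope. */
function capToolResult(serialized: string, maxBytes: number = MAX_TOOL_RESULT_BYTES): string {
  const originalBytes = new TextEncoder().encode(serialized).length;
  if (originalBytes <= maxBytes) return serialized;
  const envelope: TruncatedEnvelope = {
    _truncated: true,
    originalBytes,
    preview: serialized.slice(0, PREVIEW_CHARS)
  };
  // JSON.stringify escapes the preview, so the result is always valid JSON,
  // unlike a raw "slice + suffix" of a JSON document.
  return JSON.stringify(envelope);
}

/** Best-effort: record partial token usage without masking the original error. */
function attachPartialTokens(err: unknown, partialTokens: number): unknown {
  try {
    (err as { partialTokens?: number }).partialTokens = partialTokens;
  } catch {
    // Frozen objects, primitives, null, and undefined can reject the write.
    // Swallow that failure so the caller can rethrow the original error.
  }
  return err;
}

// Usage: an oversized tool result becomes an envelope the model can still parse.
const capped = capToolResult('x'.repeat(200000));
console.log(JSON.parse(capped)._truncated);
```

The design point in both cases is the same: a degraded result (a preview, or an error without token counts) is preferable to a secondary failure that hides what actually happened.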
Summary
- Adds a `BitBadgesAgent` class at `bitbadges/builder/agent` so anyone can build BitBadges collections from a natural-language prompt using their own Anthropic API key. BitBadges never sees user keys.
- Three-tier surface: `/builder/agent` (stable class), `/builder/internals` (unstable primitives for DIY loops / fine-tuning), existing MCP stdio bin unchanged.
- `@anthropic-ai/sdk` is an optional peerDependency; MCP/CLI consumers are unaffected.
Key features
- Zero-config: `new BitBadgesAgent({ anthropicKey }).build('create a subscription for $10/mo')`
- Auto-reads `ANTHROPIC_API_KEY` / `ANTHROPIC_OAUTH_TOKEN` / `ANTHROPIC_AUTH_TOKEN` from env.
- `bitbadgesApiKey` option threads through every query tool (or auto-reads `BITBADGES_API_KEY`).
- `validation: 'strict'|'lenient'|'off'`, skills filter, `systemPromptAppend`, `tools.add` / `tools.remove`, hooks (`onTokenUsage`, `onToolCall`, `onStatusUpdate`, `onCompletion`).
- Pluggable KVStore: `MemoryStore` + `FileStore` ship; consumers can BYO (the indexer will add a Redis adapter in the follow-up PR).
- Typed errors (`ValidationFailedError`, `QuotaExceededError`, `AnthropicAuthError`, `AbortedError`, `PeerDependencyError`, `SimulationError`), parsed tool output, `result.costUsd`, `result.toString()`, `agent.abort()`, `agent.healthCheck()`, `agent.validate()`, `substituteImages` helper, debug mode.
Files
- `src/builder/agent/`: 11 new modules.
- `examples/builder-agent/`: zero-config, middle-tier, DIY internals scripts + README.
- `package.json`: exports map additions for `./builder/agent` + `./builder/internals`; peerDep.
- `README.md`: "three ways to build" section replacing the prior MCP-only block.
- `src/builder/agent/BitBadgesAgent.spec.ts`: 17 unit tests (all passing) covering zero-config, OAuth, env vars, tool filtering, image substitution, hooks, typed errors, end-to-end with mocked Anthropic.
Backwards compat
- `bitbadges/builder/*` subpaths (registry, tools, resources, skills, session) unchanged.
- `bitbadges-builder` MCP stdio bin unchanged.
Follow-ups (separate PRs)
- `feat/consume-bitbadges-agent` (on bitbadges-indexer) consumes the agent via `bun link`, adds `RedisSessionStore`, shrinks `aiBuildHandler`.
- `feat/bitbadges-agent-ui` exposes the new customization knobs in `AiGenerateWidget` and embeds a one-liner in the landing-page AI agent card.
- `feat/bitbadges-agent-docs` adds a gitbook page for the programmatic agent.
Backlog ticket: #0298.
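The pluggable session store described above, together with the planned `RedisSessionStore`, implies a small key-value contract with a per-entry TTL. The following is a minimal sketch under stated assumptions: `KVStoreSketch` and `TtlMemoryStore` are hypothetical names, and the package's actual `KVStore` interface may have a different shape. The 7200-second default is the one quoted for `sessionTtlSeconds`.

```typescript
// Hypothetical contract; the real KVStore interface in the package may differ.
interface KVStoreSketch {
  get(key: string): Promise<string | undefined>;
  set(key: string, value: string, ttlSeconds?: number): Promise<void>;
  delete(key: string): Promise<void>;
}

const DEFAULT_SESSION_TTL_SECONDS = 7200; // default quoted for sessionTtlSeconds

interface StoredEntry {
  value: string;
  expiresAt: number; // epoch milliseconds
}

class TtlMemoryStore implements KVStoreSketch {
  private readonly entries = new Map<string, StoredEntry>();
  private readonly now: () => number;

  // The clock is injectable so expiry can be exercised without real waiting.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  async get(key: string): Promise<string | undefined> {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // lazily evict expired sessions on read
      return undefined;
    }
    return entry.value;
  }

  async set(
    key: string,
    value: string,
    ttlSeconds: number = DEFAULT_SESSION_TTL_SECONDS
  ): Promise<void> {
    this.entries.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 });
  }

  async delete(key: string): Promise<void> {
    this.entries.delete(key);
  }
}
```

Keeping the methods asynchronous, even for an in-memory map, means a network-backed store such as Redis could satisfy the same contract without changing callers.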
Test plan
- `npx jest src/builder/agent/BitBadgesAgent.spec.ts`: 17/17 pass
- `npx tsc --build tsconfig.build.json`: clean (CJS)
- `npx tsc --build tsconfig-esm.build.json`: clean (ESM)
- `zero-config.ts` with real `ANTHROPIC_API_KEY` against testnet before merge
- `bitbadges-builder --help`
🤖 Generated with Claude Code