feat(builder): open-source BitBadgesAgent — BYO Anthropic key by trevormil · Pull Request #172 · BitBadges/bitbadgesjs

trevormil · 2026-04-22T13:44:50Z

Summary

Adds BitBadgesAgent class at bitbadges/builder/agent so anyone can build BitBadges collections from a natural-language prompt using their own Anthropic API key. BitBadges never sees user keys.
Three-tier consumption: /builder/agent (stable class), /builder/internals (unstable primitives for DIY loops / fine-tuning), existing MCP stdio bin unchanged.
@anthropic-ai/sdk is an optional peerDependency — MCP/CLI consumers unaffected.
Ports prompt assembly, agent loop, validation gate + fix loop from the indexer into the SDK. Prompt behavior byte-identical.

Key features

Zero-config: new BitBadgesAgent({ anthropicKey }).build('create a subscription for $10/mo')
Auth: API key, OAuth token, or pre-built Anthropic client. Auto-reads ANTHROPIC_API_KEY / ANTHROPIC_OAUTH_TOKEN / ANTHROPIC_AUTH_TOKEN from env.
BitBadges API passthrough: bitbadgesApiKey option threads through every query tool (or auto-reads BITBADGES_API_KEY).
Customization: model picker (haiku/sonnet/opus), validation: 'strict'|'lenient'|'off', skills filter, systemPromptAppend, tools.add/tools.remove, hooks (onTokenUsage, onToolCall, onStatusUpdate, onCompletion).
Pluggable session store: MemoryStore + FileStore ship; consumers can BYO (the indexer will add a Redis adapter in the follow-up PR).
QoL: typed errors (ValidationFailedError, QuotaExceededError, AnthropicAuthError, AbortedError, PeerDependencyError, SimulationError), parsed tool output, result.costUsd, result.toString(), agent.abort(), agent.healthCheck(), agent.validate(), substituteImages helper, debug mode.

Files

src/builder/agent/ — 11 new modules.
examples/builder-agent/ — zero-config, middle-tier, DIY internals scripts + README.
package.json — exports map additions for ./builder/agent + ./builder/internals; peerDep.
README.md — "three ways to build" section replacing the prior MCP-only block.
src/builder/agent/BitBadgesAgent.spec.ts — 17 unit tests (all passing) covering zero-config, OAuth, env vars, tool filtering, image substitution, hooks, typed errors, end-to-end with mocked Anthropic.

Backwards compat

No existing exports or bins are removed.
Existing bitbadges/builder/* subpaths (registry, tools, resources, skills, session) unchanged.
bitbadges-builder MCP stdio bin unchanged.

Follow-ups (separate PRs)

Indexer PR (feat/consume-bitbadges-agent on bitbadges-indexer) consumes the agent via bun link, adds RedisSessionStore, shrinks aiBuildHandler.
Frontend PR (feat/bitbadges-agent-ui) exposes the new customization knobs in AiGenerateWidget and embeds a one-liner in the landing-page AI agent card.
Docs PR (feat/bitbadges-agent-docs) adds a gitbook page for the programmatic agent.

Backlog ticket: #0298.

Test plan

npx jest src/builder/agent/BitBadgesAgent.spec.ts — 17/17 pass
npx tsc --build tsconfig.build.json — clean (CJS)
npx tsc --build tsconfig-esm.build.json — clean (ESM)
Manual: run zero-config.ts with real ANTHROPIC_API_KEY against testnet before merge
Manual: confirm existing MCP bin still boots (bitbadges-builder --help)

🤖 Generated with Claude Code

Adds a programmatic AI builder so anyone can build BitBadges collections from a prompt without going through the BitBadges frontend or API. Consumers install @anthropic-ai/sdk as a peer dep and pass their own key; BitBadges never sees credentials.

Three-tier surface:
- bitbadges/builder/agent → BitBadgesAgent class (stable)
- bitbadges/builder/internals → prompt, loop, validation, adapters (unstable; for DIY consumers, may break between minors)
- existing bitbadges-builder MCP stdio bin unchanged

BitBadgesAgent features:
- Zero-config: new BitBadgesAgent({ anthropicKey }).build('…')
- Model picker (haiku/sonnet/opus) with per-model cost reporting
- Validation modes: strict (default), lenient, off
- Skills filter, systemPromptAppend, full systemPrompt replace
- tools.add / tools.remove for bounded customization
- Hooks: onTokenUsage, onToolCall, onStatusUpdate, onCompletion
- Pluggable KVStore (MemoryStore + FileStore ship, consumers can BYO Redis/etc.)
- Typed errors (ValidationFailedError, QuotaExceededError, etc.)
- substituteImages / collectImageReferences helpers
- healthCheck() + validate() QoL methods
- OAuth token support in addition to API key (ANTHROPIC_OAUTH_TOKEN)
- BITBADGES_API_KEY passthrough into every query tool

Ported from the indexer: prompt assembly, agent loop with retry/compression, validation gate + fix-loop driver, simulation error patterns. Prompt/system-prompt behavior byte-identical to indexer today.

Package changes:
- Exports map adds ./builder/agent and ./builder/internals
- @anthropic-ai/sdk as optional peerDependency (>=0.80.0 <1.0.0)
- Three example scripts under examples/builder-agent/
- 17 unit tests covering zero-config, OAuth, env vars, tool filtering, image substitution, hooks, typed errors

Backlog: #0298.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

greptile-apps · 2026-04-22T13:49:46Z

Greptile Summary

This PR open-sources BitBadgesAgent — a self-contained, BYO-key AI agent that builds BitBadges collections from natural-language prompts. It ports the agent loop, prompt assembly, validation gate, and fix loop from the private indexer into the SDK, adds MemoryStore/FileStore session backends, a typed error hierarchy, image-placeholder helpers, and a full example suite. The agent is correctly gated behind an optional peerDependency on @anthropic-ai/sdk so existing MCP/CLI consumers are unaffected.

Key findings:

P1 — loadAnthropicSdk swallows version errors (anthropicClient.ts:59): The outer catch block catches the PeerDependencyError thrown by assertSupportedVersion and replaces it with a generic "not installed" message. A consumer with an out-of-range SDK version will never see the helpful version-mismatch guidance.
P2 — Misleading JSDoc on substituteImages (images.ts:17): The doc says "Only string fields named image are rewritten" but the implementation replaces IMAGE_N tokens in all string values throughout the transaction object.
P2 — AnthropicAuthError hardcodes "ANTHROPIC_API_KEY" (errors.ts:63): The error message always directs users to check ANTHROPIC_API_KEY, even when the failure came from an OAuth token.
All three previously flagged concerns (double handleGetTransaction call, shared abortController field, unenforced version range) are fully resolved in this revision.
The 17-test suite covers the happy path, mocked end-to-end, typed errors, and environment-variable credential reading. The FileStore path sanitization correctly replaces path-separator characters, so no traversal risk is present.

Confidence Score: 4/5

Safe to merge after the one-line fix in anthropicClient.ts to re-throw PeerDependencyError before the generic catch.

The PR is architecturally sound and all three previously flagged issues are resolved. The P1 bug (version-check error swallowed) is a straightforward one-line fix and only affects the error message quality, not runtime correctness. The P2 items are documentation/UX polish. The agent loop, validation gate, session stores, tool adapter, and test suite are all in good shape.

packages/bitbadgesjs-sdk/src/builder/agent/anthropicClient.ts — the catch block on line 59 needs to re-throw PeerDependencyError before falling through to the generic handler.

Important Files Changed

Filename	Overview
packages/bitbadgesjs-sdk/src/builder/agent/anthropicClient.ts	Dynamic peer-dep loader with version check; P1 bug — the catch block swallows PeerDependencyError from assertSupportedVersion and replaces it with a misleading "not installed" message.
packages/bitbadgesjs-sdk/src/builder/agent/BitBadgesAgent.ts	Main agent class — well-structured with per-build AbortController set, lazy client init, fix-loop, and session persistence; previously flagged issues (double tx call, shared controller) are resolved.
packages/bitbadgesjs-sdk/src/builder/agent/loop.ts	Agent conversation loop — retry logic, token accounting, tool-result compression, and fire-and-forget hooks look correct and well-documented.
packages/bitbadgesjs-sdk/src/builder/agent/validation.ts	Validation gate runs three checks (review, structural validation, simulation) in a clear sequence; error classification and advisory note handling look correct.
packages/bitbadgesjs-sdk/src/builder/agent/images.ts	IMAGE_N substitution recursively walks entire transaction; JSDoc incorrectly states only `image`-named fields are rewritten when all string values are checked.
packages/bitbadgesjs-sdk/src/builder/agent/errors.ts	Typed error hierarchy with correct prototype chain restoration; AnthropicAuthError message hardcodes API key hint even for OAuth failures.
packages/bitbadgesjs-sdk/src/builder/agent/sessionStore.ts	MemoryStore (TTL-aware Map) and FileStore (JSON files, TTL) — path sanitization replaces slashes so no traversal risk; clean interface.
packages/bitbadgesjs-sdk/src/builder/agent/BitBadgesAgent.spec.ts	17 unit tests covering zero-config, OAuth, env vars, tool filtering, image substitution, hooks, typed errors, and end-to-end with a mocked Anthropic client; good coverage.
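The sessionStore.ts row above describes a TTL-aware Map and a key sanitizer that replaces path separators. A minimal sketch of both ideas follows. The names `KVStoreSketch`, `MemoryStoreSketch`, and `sanitizeKey`, the synchronous method signatures, and the injectable clock are all assumptions made for illustration; the SDK's real classes may differ.

```typescript
// Hypothetical sketch; not the SDK's actual MemoryStore/FileStore code.
interface KVStoreSketch<V> {
  get(key: string): V | undefined;
  set(key: string, value: V, ttlMs?: number): void;
}

class MemoryStoreSketch<V> implements KVStoreSketch<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();

  // The clock is injectable so expiry can be exercised without real waiting.
  constructor(private readonly now: () => number = Date.now) {}

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // lazily evict expired entries on read
      return undefined;
    }
    return entry.value;
  }

  // Without a TTL the entry never expires (expiresAt is Infinity).
  set(key: string, value: V, ttlMs: number = Infinity): void {
    this.entries.set(key, { value, expiresAt: this.now() + ttlMs });
  }
}

// Replace path separators so a session ID cannot name a file outside the
// store directory when it is used as a filename.
function sanitizeKey(key: string): string {
  return key.replace(/[\\/]/g, "_");
}
```

With this shape, a file-backed store would call `sanitizeKey(sessionId)` before joining the result onto its base directory, which is one way to get the "no traversal" property the review mentions.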

Sequence Diagram

sequenceDiagram
    participant Caller
    participant BitBadgesAgent
    participant AnthropicClient
    participant AgentLoop
    participant ToolRegistry
    participant ValidationGate
    participant SessionStore

    Caller->>BitBadgesAgent: build(prompt, options)
    BitBadgesAgent->>AnthropicClient: getAnthropicClient() [lazy, once]
    BitBadgesAgent->>SessionStore: get(sessionId) [load prior messages]
    BitBadgesAgent->>BitBadgesAgent: assemblePromptParts()

    loop Agent Loop (maxRounds)
        BitBadgesAgent->>AgentLoop: runAgentLoop(params)
        AgentLoop->>AnthropicClient: messages.create() [with retry]
        AnthropicClient-->>AgentLoop: response
        AgentLoop->>AgentLoop: onTokenUsage hook [awaited]
        AgentLoop->>ToolRegistry: execute(toolName, args)
        ToolRegistry-->>AgentLoop: result string
        AgentLoop->>AgentLoop: compressOldToolResults()
    end

    AgentLoop-->>BitBadgesAgent: loopResult
    BitBadgesAgent->>BitBadgesAgent: handleGetTransaction(sessionId)

    opt validation !== off
        loop Fix Loop (fixLoopMaxRounds)
            BitBadgesAgent->>ValidationGate: runValidationGate(transaction)
            ValidationGate-->>BitBadgesAgent: gate result
            alt gate.valid === false
                BitBadgesAgent->>AgentLoop: runAgentLoop(fixPrompt)
                AgentLoop-->>BitBadgesAgent: loopResult
            end
        end
    end

    BitBadgesAgent->>SessionStore: set(sessionId, messages+tx)
    BitBadgesAgent->>BitBadgesAgent: onCompletion hook [fire-and-forget]
    BitBadgesAgent-->>Caller: BuildResult

Prompt To Fix All With AI

This is a comment left during a code review.
Path: packages/bitbadgesjs-sdk/src/builder/agent/anthropicClient.ts
Line: 59-64

Comment:
**Version-check error silently replaced by wrong message**

The `catch` block in `loadAnthropicSdk` catches **all** exceptions, including the `PeerDependencyError` thrown by `assertSupportedVersion` when the installed SDK is outside `>=0.80.0 <1.0.0`. The catch re-throws with the "not installed" message, discarding the version-mismatch message that was carefully written.

Concretely: a user with `@anthropic-ai/sdk@0.79.0` will see:

> `@anthropic-ai/sdk is required to use BitBadgesAgent but is not installed`

instead of:

> `@anthropic-ai/sdk version 0.79.0 detected; BitBadgesAgent requires >=0.80.0 <1.0.0. Install a compatible version…`

Fix: re-throw `PeerDependencyError` instances before the generic handler:

```ts
  } catch (err) {
    if (err instanceof PeerDependencyError) throw err;
    throw new PeerDependencyError(
      `@anthropic-ai/sdk is required to use BitBadgesAgent but is not installed. ` +
        `Install it with: npm install @anthropic-ai/sdk (supported range: ${SUPPORTED_RANGE})`
    );
  }
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: packages/bitbadgesjs-sdk/src/builder/agent/images.ts
Line: 17-23

Comment:
**JSDoc contradicts the implementation**

The JSDoc comment says:

> "Only string fields named `image` are rewritten"

But the `walk` function replaces `IMAGE_N` tokens in **every string value** across the entire transaction — the key name is never checked. The test on line 119 confirms this: it substitutes inside `metadataPlaceholders['ipfs://METADATA_COLLECTION'].image`, but the same logic would also replace a field named `description` or `uri` if it contained `IMAGE_1`.

If the broad-replacement behaviour is intentional (matching the frontend), update the JSDoc to reflect it:

```ts
 * Recursively replace every string value matching `IMAGE_N` anywhere in the
 * transaction with the corresponding entry from `images`. All string-valued
 * fields are checked — not just fields named `image`.
```

If the intent really was to restrict replacement to `image`-named fields, the `walk` helper needs a key-awareness parameter.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: packages/bitbadgesjs-sdk/src/builder/agent/errors.ts
Line: 63-72

Comment:
**`AnthropicAuthError` message always mentions `ANTHROPIC_API_KEY`, even for OAuth failures**

The hardcoded error message is:

> "Anthropic authentication failed. Check that ANTHROPIC_API_KEY is set and valid."

This is only helpful when the user is using an API key. When they're using `anthropicAuthToken` or the `ANTHROPIC_OAUTH_TOKEN` env var, the message is misleading — they'll look for an API key that isn't relevant to their setup.

Consider a generic message:

```ts
constructor(detail?: string) {
  super(
    `Anthropic authentication failed. Verify that your Anthropic credentials (API key or OAuth token) are valid.${detail ? ` (${detail})` : ''}`,
    'ANTHROPIC_AUTH_ERROR',
    503
  );
}
```

How can I resolve this? If you propose a fix, please make it concise.

Reviews (2): Last reviewed commit: "feat(builder/agent): Anthropic prompt ca..."

Four findings:
- P1: double `handleGetTransaction` call. The `??` fallback invoked the tool twice when `.transaction` was nullish. Collapse to one call and extract the tx from either wrapping shape.
- P1: shared `abortController` field clobbered by concurrent `build()` calls. Move to a per-build controller tracked in a Set; `agent.abort()` now aborts every in-flight build on the instance. Restructured into `build() → runBuild()` so the Set entry is always cleaned up via try/finally.
- P2: `SUPPORTED_RANGE` was only used in error messages, never compared against the installed SDK. Parse `mod.VERSION` and throw a clear PeerDependencyError if outside >=0.80.0 <1.0.0. Silently skip when VERSION is absent/unparseable so future SDK renames don't brick builds.
- P2: document why raw `prompt` is intentionally not run through `containsInjection`: the agent is BYO-key, so the caller controls the key + prompt. Server consumers exposing this to untrusted users apply `containsInjection` at their own trust boundary (indexer already does). Community-skill text from third parties IS still sanitized.

Tests: 17/17 passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
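The per-build controller fix described above can be sketched roughly as follows. `BuildRunnerSketch` and its members are hypothetical names, not the SDK's actual code; the sketch only shows the pattern of one `AbortController` per call, tracked in a Set and released in a `finally` block.

```typescript
// Hypothetical sketch of per-build abort tracking; names are illustrative.
class BuildRunnerSketch {
  // One controller per in-flight build instead of a single shared field.
  private readonly inFlight = new Set<AbortController>();

  async build<T>(work: (signal: AbortSignal) => Promise<T>): Promise<T> {
    const controller = new AbortController();
    this.inFlight.add(controller);
    try {
      return await work(controller.signal);
    } finally {
      // Runs whether the build resolved or threw, so the Set never leaks.
      this.inFlight.delete(controller);
    }
  }

  // Aborts every build currently running on this instance.
  abort(): void {
    this.inFlight.forEach((controller) => controller.abort());
  }

  get activeBuilds(): number {
    return this.inFlight.size;
  }
}
```

Because each call owns its controller, a second concurrent `build()` can no longer overwrite the first call's signal, which is the clobbering the finding describes.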

…e throws

Greptile P1 on indexer PR #116 surfaced that the indexer's TokenLedger.checkQuota() throws were being silently swallowed when invoked from inside the onTokenUsage hook. Root cause: the SDK was treating every hook as fire-and-forget via `fireHook()`, which wraps the callback in `Promise.resolve().catch(() => {})`.

Distinction now documented + enforced:
- onTokenUsage is LOAD-BEARING. Awaited directly; rejections propagate out of runAgentLoop so consumers can enforce quotas. Matches the legacy indexer agentLoop contract.
- onToolCall / onStatusUpdate / onCompletion stay fire-and-forget: they're observability-only, and a misbehaving logger must not hang a build.

Tests: 17/17 pass unchanged (no test depended on the old swallowed-throw behavior).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
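The difference between the two hook contracts can be illustrated with a small sketch. The `fireHook` body here is an assumption modeled on the description above, and `awaitHook` and `makeQuotaHook` are hypothetical helpers standing in for the loop's awaited call and a consumer's quota check.

```typescript
// Hypothetical sketch of the two hook contracts; not the SDK's actual code.
type Hook<T> = (payload: T) => void | Promise<void>;

// Fire-and-forget: run the hook later and swallow any throw or rejection.
function fireHook<T>(hook: Hook<T> | undefined, payload: T): void {
  if (!hook) return;
  Promise.resolve()
    .then(() => hook(payload))
    .catch(() => {});
}

// Load-bearing: await the hook so a throw or rejection reaches the caller.
async function awaitHook<T>(hook: Hook<T> | undefined, payload: T): Promise<void> {
  if (hook) await hook(payload);
}

// A quota check of the kind a consumer might place in onTokenUsage.
function makeQuotaHook(cap: number): Hook<{ totalTokens: number }> {
  return ({ totalTokens }) => {
    if (totalTokens > cap) {
      throw new Error(`quota exceeded: used ${totalTokens}, cap ${cap}`);
    }
  };
}
```

Routing a quota hook through `fireHook` makes the throw disappear, which is the swallowed-quota symptom; routing it through `awaitHook` lets the rejection stop the loop.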

Builds on top of the load-bearing onTokenUsage fix. Rolls caching into the agent rather than shipping as a standalone PR per user request ("append to existing PRs").

Changes:
- assemblePromptParts now returns `userContent: Array<{text, cache_control?}>` in addition to the legacy `userMessage` string. The stable skills prefix (selectedSkillsSection + promptSkillsSection) is marked `cache_control: ephemeral`; the per-request tail (context, metadata, permissions, refinement history, prompt) sits in the trailing block with no cache mark.
- Skill ordering canonicalized: `[...new Set(selectedSkills)].sort()` in both the skills section and the request header so different orderings of the same skill set hit the same cache entry.
- runAgentLoop accepts `userContent`, parses Anthropic's `cache_creation_input_tokens` / `cache_read_input_tokens`, and threads them through to hooks and the result.
- Added cacheCreationTokens / cacheReadTokens fields to TokenUsage, BuildTrace, and AgentLoopResult, so consumers can now monitor cache hit rate without re-parsing provider responses.
- computeCostUsd now takes cache counters and applies Anthropic's multipliers: cache write = 1.25x input, cache read = 0.10x input.
- Fix-loop rounds intentionally skip `userContent`: the fix prompt is dynamic error guidance with no cache value.
- result.toString() surfaces cache counts when non-zero.

Tests: 17/17 still pass (cache counters default to 0 in mocked responses, cost math degrades to the old input+output formula).

Expected production impact (per backlog #0303): ~60-80% cost reduction on the repeated stable prefix, faster time-to-first-token.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
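The cache-aware cost math can be sketched as below. `estimateCostUsd`, the two interfaces, and the per-million-token pricing shape are assumptions for illustration, not the SDK's `computeCostUsd` signature. The sketch also assumes the three input counters are disjoint (uncached input, cache writes, cache reads); only the 1.25x and 0.10x multipliers come from the commit message.

```typescript
// Hypothetical sketch of cache-aware cost math; prices are placeholders.
interface TokenCountsSketch {
  inputTokens: number; // uncached input tokens (assumed disjoint from cache counters)
  outputTokens: number;
  cacheCreationTokens?: number;
  cacheReadTokens?: number;
}

interface PricingSketch {
  inputUsdPerMTok: number; // USD per million input tokens
  outputUsdPerMTok: number; // USD per million output tokens
}

const CACHE_WRITE_MULTIPLIER = 1.25; // cache write billed at 1.25x input
const CACHE_READ_MULTIPLIER = 0.1; // cache read billed at 0.10x input

function estimateCostUsd(usage: TokenCountsSketch, pricing: PricingSketch): number {
  const cacheWrite = usage.cacheCreationTokens ?? 0;
  const cacheRead = usage.cacheReadTokens ?? 0;
  // Express all input-side traffic in "plain input token" equivalents.
  const inputEquivalent =
    usage.inputTokens +
    cacheWrite * CACHE_WRITE_MULTIPLIER +
    cacheRead * CACHE_READ_MULTIPLIER;
  return (
    (inputEquivalent * pricing.inputUsdPerMTok +
      usage.outputTokens * pricing.outputUsdPerMTok) /
    1_000_000
  );
}
```

When both cache counters are absent or zero, the expression reduces to the plain input plus output formula, which matches the note that mocked responses keep the old cost math.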

trevormil · 2026-04-22T14:33:09Z

@greptileai review

greptile-apps · 2026-04-22T14:38:08Z

+  } catch (err) {
+    throw new PeerDependencyError(
+      `@anthropic-ai/sdk is required to use BitBadgesAgent but is not installed. ` +
+        `Install it with: npm install @anthropic-ai/sdk (supported range: ${SUPPORTED_RANGE})`
+    );
+  }


Version-check error silently replaced by wrong message

The catch block in loadAnthropicSdk catches all exceptions, including the PeerDependencyError thrown by assertSupportedVersion when the installed SDK is outside >=0.80.0 <1.0.0. The catch re-throws with the "not installed" message, discarding the version-mismatch message that was carefully written.

Concretely: a user with @anthropic-ai/sdk@0.79.0 will see:

@anthropic-ai/sdk is required to use BitBadgesAgent but is not installed

instead of:

@anthropic-ai/sdk version 0.79.0 detected; BitBadgesAgent requires >=0.80.0 <1.0.0. Install a compatible version…

Fix: re-throw PeerDependencyError instances before the generic handler:

  } catch (err) {
    if (err instanceof PeerDependencyError) throw err;
    throw new PeerDependencyError(
      `@anthropic-ai/sdk is required to use BitBadgesAgent but is not installed. ` +
        `Install it with: npm install @anthropic-ai/sdk (supported range: ${SUPPORTED_RANGE})`
    );
  }


…ve version-check errors

Two Greptile P1s on re-review:

1. loadAnthropicSdk swallowed the PeerDependencyError that assertSupportedVersion throws for out-of-range SDKs. The generic try/catch wrapped both "module not installed" AND "version mismatch" into the same "not installed" message, masking the real issue. Split into two stages: the import is the only thing in the try/catch; the version check runs outside so its PeerDependencyError surfaces verbatim.
2. The DRY refactor accidentally deleted BUILDER_SYSTEM_PROMPT_FOR_EXPORT and collapsed assemblePromptParts's `forExport` option to a no-op. The export prompt is NOT the same as the hosted-session prompt: it's for pasting into Claude.ai / ChatGPT where no tools are available, so it swaps the tool-calling workflow for an explicit Output Format section describing the `MsgUniversalUpdateCollection` JSON shape + metadataPlaceholders sidecar layout.

Restored:
- BUILDER_SYSTEM_PROMPT_FOR_EXPORT constant (with updated Output Format section matching current shape)
- `forExport: boolean` option on assemblePromptParts that swaps the system prompt
- assembleExportPrompt helper for callers that want the concatenated string (indexer /export-prompt route)
- Both exported from bitbadges/builder/internals

This is load-bearing for the frontend's self-host flow: the "paste this into Claude.ai" path needs the export prompt; the SDK + MCP paths use the tool-calling prompt.

Tests: 17/17 still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
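The two-stage split from item 1 might look roughly like this. It is a sketch under assumptions: `loadSdkSketch`, `importOrThrow`, and the injected `importer` are hypothetical, the error class is a local stand-in, and the version parse is simplified to major and minor only. Only the messages and the `>=0.80.0 <1.0.0` range are taken from the thread.

```typescript
// Hypothetical sketch of a two-stage peer-dependency loader.
class PeerDependencyError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "PeerDependencyError";
    Object.setPrototypeOf(this, new.target.prototype); // keep instanceof working
  }
}

const SUPPORTED_RANGE = ">=0.80.0 <1.0.0";

interface SdkModuleSketch {
  VERSION?: string;
}

// Simplified range check: 0.80 and up, below 1.0. Absent or unparseable
// versions are skipped rather than rejected.
function assertSupportedVersion(version: string | undefined): void {
  const match = version ? /^(\d+)\.(\d+)\.(\d+)/.exec(version) : null;
  if (!match) return;
  const major = Number(match[1]);
  const minor = Number(match[2]);
  if (!(major === 0 && minor >= 80)) {
    throw new PeerDependencyError(
      `@anthropic-ai/sdk version ${version} detected; BitBadgesAgent requires ${SUPPORTED_RANGE}.`
    );
  }
}

// Stage 1: only the import sits inside try/catch.
async function importOrThrow(importer: () => Promise<SdkModuleSketch>): Promise<SdkModuleSketch> {
  try {
    return await importer();
  } catch {
    throw new PeerDependencyError(
      `@anthropic-ai/sdk is required to use BitBadgesAgent but is not installed. ` +
        `Install it with: npm install @anthropic-ai/sdk (supported range: ${SUPPORTED_RANGE})`
    );
  }
}

// Stage 2: the version check runs outside the catch, so its error surfaces as-is.
async function loadSdkSketch(importer: () => Promise<SdkModuleSketch>): Promise<SdkModuleSketch> {
  const mod = await importOrThrow(importer);
  assertSupportedVersion(mod.VERSION);
  return mod;
}
```

Passing the importer in as a parameter is purely a convenience for the sketch; it makes both failure modes easy to trigger without touching module resolution.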

…ortPrompt stable API

Backlog #0309 items 1, 2, 3, 7.

- createBitBadgesCommunitySkillsFetcher: prebuilt fetcher that hits the public /api/v0/builder/community-skills endpoint. Power-user path: callers bring skill IDs, get the same community skill injection the hosted flow does. API-key gated; silently returns [] when no key is configured or the endpoint is unreachable.
- agent.listSkills(): returns SkillInstruction[] (filtered by the constructor skills whitelist when set). Sync, no network.
- agent.describeSkill(id): lookup one skill by ID. null when unknown or outside the whitelist.
- Debug-mode warning when selectedSkills contains unknown IDs. Drops unknown IDs silently (matches legacy behavior) but logs to stderr when debug: true so callers can catch typos.
- Construction-time warning when skills reference on-chain collections but no bitbadgesApiKey is configured. query_collection calls would fail mid-loop; the warning steers users to set the key.
- agent.exportPrompt(prompt, options) promoted from /internals to stable. Returns { prompt: string; communitySkillsIncluded: string[] } ready for paste-into-Claude.ai flows. Used by the frontend's "Pure prompt" path.
- Export getAllSkillInstructions + SkillInstruction from bitbadges/builder/agent for discovery-UI builders.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds 129 new Jest tests across seven spec files covering the public agent surface introduced on feat/bitbadges-agent. All mocks: no real network, no real Anthropic calls.

New specs:
- models.spec.ts: resolveModel fallbacks + computeCostUsd with the 1.25x cache-write / 0.10x cache-read multipliers verified on worked examples; zero-token inputs don't NaN.
- prompt.spec.ts: buildSystemPrompt(create|update|refine) section composition, BUILDER_SYSTEM_PROMPT_FOR_EXPORT contains Output Format, getSystemPromptHash is deterministic + 12 hex chars, findMatchingErrorPatterns, buildFixPrompt attempt header, assemblePromptParts cache-boundary layout + canonicalized skill ordering, assembleExportPrompt shape.
- sessionStore.spec.ts: parameterized over MemoryStore + FileStore (Date.now mock for memory TTL, mtime-based for file TTL), large-value round-trip, key sanitization, clear helper.
- toolAdapter.spec.ts: zero-config >40 builtins, remove/add/override by name, defaultArgs merged + explicit-args-win, >100KB truncation marker, unknown tool returns serialized error, handler throw is caught.
- images.spec.ts: nested placeholder walk, non-IMAGE_N strings preserved, partial substitution, no-mutation guarantee, lexicographic sort in collectImageReferences.
- communitySkills.spec.ts: empty IDs + no key short-circuit (no fetch), success path, 500 + network + timeout all return [], filters out entries missing name/promptText, honors BITBADGES_API_KEY/URL env.
- errors.spec.ts: instanceof dispatch across every subclass, ValidationFailedError carries errors/tx/advisory, QuotaExceededError carries tokensUsed/tokenCap, AbortedError carries partialTokens.

BitBadgesAgent.spec.ts extended with listSkills / describeSkill whitelist semantics, exportPrompt round-trip, concurrent build isolation, and agent.abort() cancelling every in-flight build.

Final: 8 suites, 146 agent tests, all pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
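The behaviors listed for images.spec.ts (nested walk, untouched non-IMAGE_N strings, partial substitution, no mutation, lexicographic sort) can be sketched as follows. The function names, the `Record<string, string>` shape of `images`, and the choice to match only whole-string `IMAGE_N` values are assumptions; the SDK's `substituteImages` and `collectImageReferences` may behave differently.

```typescript
// Hypothetical sketch of IMAGE_N placeholder handling; not the SDK's code.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

const IMAGE_TOKEN = /^IMAGE_(\d+)$/; // assumption: the whole string must be the token

// Returns a new structure; the input is never mutated. Tokens with no entry
// in `images` are left in place (partial substitution).
function substituteImagesSketch(value: Json, images: Record<string, string>): Json {
  if (typeof value === "string") {
    const known = Object.prototype.hasOwnProperty.call(images, value);
    return IMAGE_TOKEN.test(value) && known ? images[value] : value;
  }
  if (Array.isArray(value)) {
    return value.map((item) => substituteImagesSketch(item, images));
  }
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const key of Object.keys(value)) {
      // Key names are not inspected: every string value is a candidate.
      out[key] = substituteImagesSketch(value[key], images);
    }
    return out;
  }
  return value;
}

// Collects distinct IMAGE_N tokens and returns them in lexicographic order.
function collectImageReferencesSketch(value: Json): string[] {
  const found = new Set<string>();
  const walk = (node: Json): void => {
    if (typeof node === "string") {
      if (IMAGE_TOKEN.test(node)) found.add(node);
    } else if (Array.isArray(node)) {
      node.forEach(walk);
    } else if (node !== null && typeof node === "object") {
      Object.keys(node).forEach((key) => walk(node[key]));
    }
  };
  walk(value);
  return Array.from(found).sort();
}
```

Note that a lexicographic sort places `IMAGE_10` before `IMAGE_2`, which matters if a caller expects numeric ordering.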

Addresses all critical issues from the parallel code-review + E2E smoke runs on PRs #172 / #116. Tests: 146/146 pass.

**P0 blockers (users could not run a build before this):**
- anthropicClient: `Anthropic.Anthropic ?? Anthropic` resolved to `BaseAnthropic` (an internal parent class with no `.messages` resource), producing "Cannot read properties of undefined (reading 'create')" the first time any agent ran. Verified in Anthropic SDK >=0.82: `mod.default.Anthropic` exists but points at BaseAnthropic. Fixed by using the module export directly (it IS the Anthropic class).
- anthropicClient: OAuth bearer tokens were rejected with 401 because Anthropic requires the `anthropic-beta: oauth-2025-04-20` header for OAuth creds. Now auto-applied when `authToken` is provided. API-key path unchanged.
- BitBadgesAgent: creatorAddress didn't land on the final tx if the LLM's first tool call was a non-session tool (search_knowledge_base, fetch_docs). The SDK's session was created lazily without the creator, resulting in empty `value.creator`. Now we pre-init the session via `getOrCreateSession(sid, creator)` up front, mirroring the legacy indexer handler's explicit init.
- BitBadgesAgent: sessionId used `Math.random()` for the random suffix, which CodeQL flagged as insecure-randomness in a security context. Replaced with `crypto.randomUUID()`.

**P1 correctness:**
- toolAdapter.mergeDefaults: `{ ...defaults, ...incoming }` was a classic footgun: an `incoming` key set to `undefined` would knock out the default. Now strips undefined from incoming before merge.
- BitBadgesAgent: concurrent `build()` calls racing through `this.client ??= await getAnthropicClient()` each fired their own init and the last-winning result silently discarded the others' errors. Shared `clientInitPromise` now deduplicates. Promise cleared on rejection so transient failures don't poison future retries.
- BitBadgesAgent: `systemPromptAppend` was concatenated into the system prompt with zero screening. Hosted/untrusted deployments could inject "ignore previous instructions" via this field. `containsInjection` check now runs at construction and throws a clear error if the append contains obvious injection patterns.
- BitBadgesAgent: `exportPrompt` was skipping the ctor's `systemPromptAppend`: builds saw the append, exports didn't. Parity restored.
- BitBadgesAgent.runBuild: unguarded `txResponse?.transaction ?? txResponse` could leave `transaction = undefined` if `get_transaction` unexpectedly returned nothing. Falls back to `{ messages: [] }` so downstream validation + sanity checks process a well-formed shell instead of NPE'ing.

**Local-dev ergonomics:**
- createBitBadgesCommunitySkillsFetcher now detects localhost / 127.0.0.1 / *.localhost URLs and skips the "no API key → return []" gate in dev. Mirrors how the indexer itself relaxes auth for local development: third-party devs iterating against a local indexer don't need a BitBadges API key to exercise the community-skills path.

Tests added to cover: OAuth header presence, creator pre-init, sessionId shape, mergeDefaults undefined filter, client init dedup, systemPromptAppend injection rejection, exportPrompt append parity, local-mode fetcher. (Most already in place from the subagent's unit-test pass; tweaked a few to lock in the new behavior.)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
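The `mergeDefaults` footgun and its fix can be shown in a few lines. This is a sketch under the assumption that both arguments are plain records; the function name is borrowed from the commit message, but the signature is a guess and not the SDK's actual one.

```typescript
// Hypothetical sketch of a defaults merge that ignores undefined overrides.
type Args = Record<string, unknown>;

function mergeDefaults(defaults: Args, incoming: Args): Args {
  // Copy only keys whose value is not undefined; null is kept on purpose so
  // callers can still explicitly clear a default.
  const defined: Args = {};
  for (const key of Object.keys(incoming)) {
    if (incoming[key] !== undefined) defined[key] = incoming[key];
  }
  return { ...defaults, ...defined };
}

// The naive version, for contrast: an explicit `undefined` wins over the default.
function naiveMerge(defaults: Args, incoming: Args): Args {
  return { ...defaults, ...incoming };
}
```

The distinction between `undefined` (treated as "not provided") and `null` (treated as an explicit override) matches the behavior the later test list pins down.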

… always fires + systemPrompt injection check

Round-two review findings from the parallel subagent pass. All 155 tests pass (+9 new, targeting the fixes).

**P0 (ship blocker for BYO-key flow):**
- Peer-dep resolver failed under bun-link. The Function('s','return import(s)') trick resolves against the SDK's own file URL. When the SDK is bun-linked into a consumer, the SDK sits at `bitbadgesjs/packages/bitbadgesjs-sdk/` which has no `@anthropic-ai/sdk`. The consumer's does, but the loader never looks there. Replaced with a three-strategy loader:
  1. bare dynamic import (normal installs)
  2. createRequire anchored at process.cwd() (bun-link + npm-link, consumer running from their project root, the common case)
  3. createRequire anchored at the SDK's own __filename (hoisted monorepos)

  Verified E2E: Strategy 2 resolves the dep when running from the indexer directory even though the SDK is symlinked.

**P1 (correctness):**
- onCompletion now fires on EVERY exit path, not just success. Prior spec documented it as "observability-only, fire-and-forget" but implementation skipped it on thrown errors (ValidationFailedError, QuotaExceededError, AbortedError). Restructured runBuild with a try/finally + accumulator; the hook fires once (idempotent) with whatever state was reached before the throw.
- systemPrompt full-replace field now gets the same containsInjection check that systemPromptAppend got. Previously only the append was guarded, so a caller passing an untrusted full-replace could bypass every base-prompt protection.

**Tests added (9):**
- Injection rejection on systemPromptAppend AND systemPrompt (3 cases).
- exportPrompt picks up the constructor's systemPromptAppend (parity with build(), regression-guard for the prior gap).
- onCompletion fires once on success and once on ValidationFailedError (regression-guard for the contract fix).
- Community-skills localhost bypass works + non-localhost still requires a key.
- toolAdapter mergeDefaults: undefined doesn't knock out a default, null explicitly overrides it (pins the earlier fix).

**E2E verified (production settings, model=haiku, validation=lenient):**
- anthropic.ok: true (OAuth + beta header path clean)
- creator on final tx: bb1q0qsr... (pre-init propagation confirmed)
- cache read/write ratios healthy (559k/21k tokens on a second build)
- healthCheck / listSkills / describeSkill / exportPrompt all clean

**Known issue (not a regression, deferred):**
The LLM repeatedly omits `collectionPermissions` neutral-array fields, exhausting the fix loop. This is a pre-existing validator/model-output mismatch in the SDK tool schema layer. It needs its own ticket to either auto-coerce missing neutral arrays in the validator or strengthen the system prompt's permissions section.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
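The "fires once on every exit path" contract for onCompletion can be sketched with a try/finally and an accumulator. `runBuildSketch`, `BuildStateSketch`, and the step-list shape are hypothetical; the sketch also calls the hook synchronously inside a try/catch for simplicity, whereas the SDK describes it as fire-and-forget.

```typescript
// Hypothetical sketch of a completion hook that fires exactly once.
interface BuildStateSketch {
  rounds: number;
  error?: unknown;
}

async function runBuildSketch(
  steps: Array<() => Promise<void>>,
  onCompletion: (state: BuildStateSketch) => void
): Promise<BuildStateSketch> {
  const state: BuildStateSketch = { rounds: 0 }; // accumulator updated as we go
  let completed = false;

  const complete = (): void => {
    if (completed) return; // idempotent: a second call is a no-op
    completed = true;
    try {
      onCompletion({ ...state }); // snapshot of whatever was reached
    } catch {
      // observability hooks must never break the build
    }
  };

  try {
    for (const step of steps) {
      await step();
      state.rounds += 1;
    }
    return state;
  } catch (err) {
    state.error = err;
    throw err; // the original error still reaches the caller
  } finally {
    complete(); // runs on success and on every thrown error
  }
}
```

On a failing run the caller still sees the rejection, and the hook receives the partial state (rounds completed so far plus the error).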

…-build onLog hook

Two regressions surfaced during code review. Both fixed; all 155 tests pass.

**collectionPermissions regression ("basic stuff failing when it wasn't before"):**

The session template at sessionState.ts:92 started as `collectionPermissions: {}`. If the LLM skipped calling `set_permissions` (more common post-caching-refactor for model-behavior reasons), the validator rejected with 11 missing-field errors and the fix loop burned 3 rounds trying to recover: ~$0.33 and 2 minutes on a trivial prompt.

Fix: default all 11 permission fields to `[]` (neutral) on the session template. A build that never touches permissions is now valid by default, and calling `set_permissions` still overwrites the whole object identically to before.

This matches the old indexer's implicit autoFixTrivialIssues behavior, which was removed earlier on a "throw at producer, not consumer" rationale. The real producer problem here was the template shape, not the tool handler, so the template is the cleanest place to fix it.

E2E verified: a bare build that doesn't touch permissions now produces `collectionPermissions: { canDeleteCollection: [], ... }` and passes validation.

**Dev-console log regression (we lost mid-build `info`/`ai_text`/`validation` entries):**

Old `runAgentLoop` emitted round-start, AI-text, validation-result, and error entries via `onLog` that fed `sessionLog → Redis + fileLog`. My SDK port only kept `onToolCall`, so dev-replay JSONL and the frontend's log-polling route saw tool calls but not the round boundaries or AI text between them.

Fix: added a generic `onLog` hook to the SDK's AgentHooks contract. Fire-and-forget (same as onToolCall/onStatusUpdate). Emitted from:
- loop.ts: round-N start (info + token counts) and AI-text responses.
- validation.ts: pass/fail with hard-error counts (already existed as gate-local `onLog`, now forwarded).

Indexer wires it to sessionLog() just like the pre-refactor code.

**Tests:**
- Three existing tests assumed "empty session = invalid". Updated them to force failure via a `simulate` hook that returns `valid: false` instead of relying on empty permissions.
- No new test surface needed; onLog is an additive observability hook mirrored from the audited hook contract.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
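The template fix described in this commit can be sketched as follows. Only `canDeleteCollection` comes from the commit message; the other field names, the helper names, and the missing-field check are hypothetical stand-ins, and the real template carries 11 fields rather than the three shown here.

```typescript
// Illustrative only: apart from canDeleteCollection, the field names are
// hypothetical placeholders, not the SDK's real permission fields.
type PermissionEntry = Record<string, unknown>;
type CollectionPermissions = Record<string, PermissionEntry[]>;

const PERMISSION_FIELDS = [
  "canDeleteCollection",
  "canExampleFieldB",
  "canExampleFieldC",
] as const;

// Build a template where every permission field is a neutral empty array.
function neutralPermissions(fields: readonly string[]): CollectionPermissions {
  const template: CollectionPermissions = {};
  for (const field of fields) {
    template[field] = []; // fresh array per field so sessions never share state
  }
  return template;
}

// Stand-in for the validator's missing-field check.
function missingPermissionFields(
  perms: Record<string, unknown>,
  fields: readonly string[]
): string[] {
  return fields.filter((field) => !Array.isArray(perms[field]));
}

// Old template shape: every field is reported missing.
console.log(missingPermissionFields({}, PERMISSION_FIELDS).length); // 3
// New template shape: nothing is missing, so a bare build validates.
console.log(
  missingPermissionFields(neutralPermissions(PERMISSION_FIELDS), PERMISSION_FIELDS).length
); // 0
```

Because a later `set_permissions` call replaces the whole object, starting from neutral arrays changes nothing for builds that do set permissions.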

…gression

Session `ses_1g6lo9elkwp6` surfaced this: 8 rounds of successful session-tool calls produced an EMPTY final tx (no approvals, no standards, no tokens, no metadata) and a blank apply on the frontend.

Root cause: session-mutating tools in the SDK (handleSetPermissions, handleAddApproval, handleSetStandards, etc.) internally call `getOrCreateSession(input.sessionId, input.creatorAddress)`. They read sessionId from the ARGS object, not the ToolExecutionContext. The pre-refactor indexer's toolRegistry explicitly merged ctx into args (`{ ...args, sessionId: ctx.sessionId }`) before calling the handler. My toolAdapter.ts dropped that merge, so tools were mutating the SDK's default (no-sessionId) session while the agent's `handleGetTransaction({ sessionId })` read its explicitly-bound session, which got zero mutations.

Fix: `createAgentToolRegistry` now injects ctx.sessionId + ctx.callerAddress into every tool call's args before handler execution. Ordering: { ...args, sessionId: ctx.sessionId, creatorAddress: ctx.callerAddress } then mergeDefaults on top. LLM-supplied args can't override the agent's session binding (they shouldn't: the LLM doesn't know the correct sessionId).

Two existing tests updated to assert the new injected-context shape (the contract is now: ctx values always land on args). 155/155 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
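The injection ordering described in this commit can be sketched with a toy handler. The `bindContext` wrapper, the in-memory `sessions` map, and the example values are assumptions for illustration; only the spread ordering and the `sessionId`/`callerAddress`/`creatorAddress` names come from the commit message.

```typescript
// Hypothetical sketch of ctx-into-args injection, not the SDK's real registry.
interface ToolExecutionContext {
  sessionId: string;
  callerAddress: string;
}
type ToolArgs = Record<string, unknown>;
type ToolHandler = (args: ToolArgs) => unknown;

// Spread LLM-supplied args first so the ctx values always win.
function bindContext(handler: ToolHandler, ctx: ToolExecutionContext): ToolHandler {
  return (args: ToolArgs) =>
    handler({ ...args, sessionId: ctx.sessionId, creatorAddress: ctx.callerAddress });
}

// Toy session store keyed by sessionId (made up for this example).
const sessions = new Map<string, string[]>();
const addStandard: ToolHandler = (args) => {
  const id = String(args.sessionId ?? "default");
  const standards = sessions.get(id) ?? [];
  standards.push(String(args.standard));
  sessions.set(id, standards);
  return standards;
};

const ctx: ToolExecutionContext = { sessionId: "ses_example", callerAddress: "bb1example" };
const bound = bindContext(addStandard, ctx);

// The model guesses a sessionId; the binding overrides it.
bound({ standard: "example-standard", sessionId: "llm-guess" });
console.log(sessions.get("ses_example")); // [ 'example-standard' ]
console.log(sessions.has("llm-guess")); // false
```

Without the wrapper, the same call would have written to `"llm-guess"` (or to `"default"` when the model omits the id), which mirrors the empty-transaction symptom described above.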

…y + bump to 0.35.0

Round-3 review findings:
- P1: prompt.ts formatContextHelpers() used Object.entries(publicParams) without sorting. Object.entries() follows insertion order for string keys, so objects holding the same keys built in different orders iterate differently. Two logically-identical claim configs could produce byte-different prompt prefixes, silently busting Anthropic's prompt cache on the cache_control ephemeral block. Added .sort() on the key pairs before joining, which matches the existing canonicalization on selectedSkills.
- P1: AgentToolRegistry / AgentTool / AnthropicTool types were only re-exported from /internals (the unstable subpath). Third-party devs using `agent.tools` in TypeScript couldn't import the type. Now exported from the public bitbadges/builder/agent entry.
- Version bump 0.34.3 → 0.35.0 for the BitBadgesAgent release. The minor bump reflects the new subpath + peerDep + fetcher + agent class.

155/155 tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
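The key-ordering issue in the first finding can be reproduced in a few lines. This is a minimal sketch: `formatParams`, the `key=value` line format, and the sample parameter names are assumptions, not the actual `formatContextHelpers()` output.

```typescript
// Hypothetical formatter: illustrates canonical key ordering only.
type Params = Record<string, string | number | boolean>;

function formatParams(params: Params): string {
  return Object.entries(params)
    // Compare keys by code unit so the order is locale-independent.
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .map(([key, value]) => `${key}=${String(value)}`)
    .join("\n");
}

// Same logical config, keys inserted in different orders (made-up names).
const first: Params = { maxUses: 1, listId: "abc" };
const second: Params = { listId: "abc", maxUses: 1 };

// Without sorting, insertion order leaks into the serialized text.
console.log(Object.keys(first).join(",") === Object.keys(second).join(",")); // false
// With sorting, both serialize to the same bytes.
console.log(formatParams(first) === formatParams(second)); // true
```

A cached prompt prefix only matches when the text is identical, so any serialization that feeds the cached block benefits from a canonical ordering like this.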

The class is scoped to collection building, not a generic docs or protocol agent. Renaming makes the surface name match its intent.
- Class: BitBadgesAgent -> BitBadgesBuilderAgent
- Errors: BitBadgesAgentError -> BitBadgesBuilderAgentError
- Options: BitBadgesAgentOptions -> BitBadgesBuilderAgentOptions
- Files: BitBadgesAgent.ts / .spec.ts renamed via git mv
- Log prefix: [bitbadges-agent] -> [bitbadges-builder-agent]
- Export path /builder/agent and examples dir unchanged

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…N truncation, quota tests (#173)

Stacked on top of feat/bitbadges-agent. Addresses findings from deep review of #172 that are unambiguous wins (no design debate needed).

Blockers
- models.ts: opus ID bumped claude-opus-4-6 → claude-opus-4-7 (latest). Updated pinned expectations in models.spec.ts + BitBadgesBuilderAgent.spec.ts so the tests actually enforce the current model.

Greptile-flagged polish
- images.ts: JSDoc for `substituteImages` previously claimed only fields named `image` were rewritten. Implementation matches every string anywhere in the tx. Doc now describes the real behavior.
- errors.ts: `AnthropicAuthError` message hardcoded ANTHROPIC_API_KEY. Rewritten to cover both API-key and OAuth credential paths.

Correctness nits
- loop.ts COMPRESSIBLE_TOOLS: add simulate_transaction + validate_transaction so the existing summarizeToolResult branches actually fire (dead code before).
- loop.ts partial-tokens: guard `err.partialTokens = …` in a try/catch. Assigning onto some caught errors (frozen, primitive) would throw a cryptic TypeError instead of propagating the original.
- toolAdapter.ts truncation: stop emitting "slice + suffix", which the LLM can't parse. Wrap in {_truncated, originalBytes, preview}: valid JSON that stays well under the 100KB cap.

Small API additions
- BitBadgesBuilderAgentOptions: new optional `sessionTtlSeconds` (default stays 7200s). Multi-day refinement flows no longer hit a hardcoded TTL.
- agent.validate() signature: second arg is now an options bag with `{ creatorAddress?, existingCollectionId?, abortSignal? }`. When `existingCollectionId` is set and an `onChainSnapshotFetcher` is configured, the snapshot is pulled for diff-based review, matching update-mode `build()` behavior.

Test coverage
- healthCheck() success + failure paths
- validate() with/without snapshot fetcher
- maxTokensPerBuild quota → QuotaExceededError
- sessionTtlSeconds threads through to store
- toolAdapter truncation envelope is valid JSON

All 162 agent tests pass locally (serial run).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
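The truncation envelope from the toolAdapter.ts item can be sketched as below. The envelope keys and the 100KB cap come from the commit message; the function name `capToolResult`, the boolean type of `_truncated`, and the 2,000-character preview length are assumptions.

```typescript
// Hypothetical sketch of a JSON-safe truncation envelope.
const MAX_RESULT_BYTES = 100 * 1024; // 100KB cap mentioned in the commit
const PREVIEW_CHARS = 2000; // assumed preview length

function capToolResult(
  serialized: string,
  maxBytes: number = MAX_RESULT_BYTES,
  previewChars: number = PREVIEW_CHARS
): string {
  const originalBytes = new TextEncoder().encode(serialized).length;
  if (originalBytes <= maxBytes) return serialized;
  // Slicing raw JSON and appending a suffix yields unparseable text, so the
  // slice is carried as a string field inside a fresh, valid JSON object.
  return JSON.stringify({
    _truncated: true,
    originalBytes,
    preview: serialized.slice(0, previewChars),
  });
}

const big = JSON.stringify({ rows: "x".repeat(200_000) });
const capped = capToolResult(big);
const parsed = JSON.parse(capped) as { _truncated: boolean; originalBytes: number };
console.log(parsed._truncated, parsed.originalBytes > MAX_RESULT_BYTES); // true true
```

The model receives something it can parse and can see from `originalBytes` that data was dropped, rather than a half-closed JSON string.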

greptile-apps Bot reviewed Apr 22, 2026

github-advanced-security AI found potential problems Apr 22, 2026

Comment thread packages/bitbadgesjs-sdk/src/builder/agent/BitBadgesAgent.ts Fixed

trevormil mentioned this pull request Apr 22, 2026

docs(ai-agents): Programmatic Agent page (BitBadgesAgent, BYO Anthropic key) trevormil/bitbadges-docs#46

Merged

3 tasks

trevormil and others added 3 commits April 22, 2026 10:01

greptile-apps Bot reviewed Apr 22, 2026

trevormil and others added 9 commits April 22, 2026 10:47

trevormil mentioned this pull request Apr 22, 2026

fix(builder/agent): review polish — stacked on #172 #173

Merged

3 tasks

trevormil merged commit d96cf64 into main Apr 22, 2026
4 checks passed

trevormil deleted the feat/bitbadges-agent branch April 22, 2026 18:01

trevormil merged 14 commits into main from feat/bitbadges-agent

2 participants

Greptile Summary

Confidence Score: 4/5
