Skip to content

feat: extract @agentspec/codegen — provider-agnostic code generation#17

Merged
iliassjabali merged 17 commits intomainfrom
feat/codegen-migration
Apr 14, 2026
Merged

feat: extract @agentspec/codegen — provider-agnostic code generation#17
iliassjabali merged 17 commits intomainfrom
feat/codegen-migration

Conversation

@iliassjabali
Copy link
Copy Markdown
Collaborator

@iliassjabali iliassjabali commented Mar 26, 2026

Summary

Introduces @agentspec/codegen, a provider-agnostic code generation engine used by agentspec generate and agentspec scan. Ships three interchangeable LLM providers, a resolver with auto-detect and explicit override, and a new agentspec provider-status command for diagnostics. @agentspec/adapter-claude is a thin deprecation shim that re-exports from @agentspec/codegen.

@agentspec/codegen

Providers

Provider Class Requires
Claude subscription ClaudeSubscriptionProvider claude CLI authenticated
OpenAI-compatible OpenAICompatibleProvider AGENTSPEC_LLM_API_KEY + AGENTSPEC_LLM_MODEL (+ optional AGENTSPEC_LLM_BASE_URL)
Anthropic API AnthropicApiProvider ANTHROPIC_API_KEY (+ optional ANTHROPIC_BASE_URL)

The OpenAI-compatible provider works with any endpoint that speaks the OpenAI wire format: OpenRouter, Groq, Together, Ollama (dummy API key), Nvidia NIM, OpenAI.com, or any self-hosted OpenAI-compatible model. AGENTSPEC_LLM_BASE_URL defaults to https://api.openai.com/v1; include the /v1 path for non-OpenAI endpoints. AGENTSPEC_LLM_MODEL is required and fails fast at resolve time if unset. Errors are translated via the openai SDK's structured error classes (AuthenticationError, RateLimitError, NotFoundError, BadRequestError, APIError), each mapping to a specific CodegenErrorCode.

Resolver

resolveProvider() selects the first available provider in priority order:

claude-sub > openai-compatible > anthropic-api

Override via env var (AGENTSPEC_CODEGEN_PROVIDER=claude-sub|anthropic-api|openai-compatible) or CLI flag (--provider <name> on generate and scan). If no provider is available, the resolver throws a CodegenError('provider_unavailable', ...) with a three-option help message.

Architecture

Hexagonal layout inside packages/codegen/src/:

  • provider.ts defines two ports: CodegenProvider (streaming) and ProviderProbe (diagnostic).
  • providers/ holds driven adapters (claude-sub.ts, openai-compatible.ts, anthropic-api.ts). Each module owns its provider class, probe object, and error translator.
  • resolver.ts, provider-probe.ts, and index.ts are the only driving-side files that know the concrete adapter list.
  • Domain modules (context-builder, skill-loader, response-parser, repair, stream-utils) depend only on the ports.

Probe subsystem

  • Each ProviderProbe answers "is this provider ready to use right now?" given NodeJS.ProcessEnv. Probes never throw; every failure lands in a ProviderProbeResult variant: ready | misconfigured | unreachable | not-configured.
  • provider-probe.ts is a thin ~55-line orchestrator that iterates over a PROBES list via Promise.all.
  • The Claude CLI probe uses async execFile, so all three probes genuinely run in parallel. A regression test asserts total probe time stays under 1.8x a single-call delay.
  • The OpenAI-compatible probe does a live GET {baseURL}/models roundtrip with a 6s timeout and a Authorization: Bearer {apiKey} header.
  • The Anthropic API probe does a live GET {baseURL}/v1/models roundtrip with a 6s timeout and the x-api-key + anthropic-version headers.

Streaming and utilities

  • CodegenChunk discriminated union (delta | heartbeat | done) powers streaming generation output.
  • collect(stream) drains a provider stream to a string.
  • repairYaml(provider, badYaml, errors) asks the provider to fix schema validation errors in an agent.yaml.
  • probeProviders() returns { results: ProviderProbeResult[], env: ProviderEnvProbe }.

CLI

  • agentspec generate and agentspec scan use @agentspec/codegen directly. Both accept --provider <name> to override the resolver.
  • agentspec provider-status is a new command that renders one section per provider plus an environment/resolution summary. --json returns { results: ProviderProbeResult[], env: ProviderEnvProbe }. Exit code is 0 if a provider resolves, 1 otherwise.
  • The renderer uses a single renderProbeResult() dispatch keyed on result.provider.

@agentspec/adapter-claude

A thin deprecation shim re-exporting from @agentspec/codegen:

  • generateWithClaude() wraps generateCode().
  • resolveAuth() wraps resolveProvider().
  • Logs a once-per-process deprecation warning on import.
  • Still published to npm.

Docs

  • docs/guides/provider-auth.md: full setup guide for all three providers, with a concrete backends table for the OpenAI-compatible path (OpenRouter, Groq, Together, Ollama, Nvidia NIM, OpenAI.com), Ollama dummy-key guidance, env var reference, and troubleshooting rows.
  • docs/reference/cli.md: env var tables, --provider help text, command examples.
  • packages/codegen/README.md: provider table, auto-detect chain, code samples.
  • docs/adapters/*.md, docs/tutorials/*.md, docs/guides/migrate-*.md, docs/concepts/adapters.md, docs/guides/ci-integration.md: provider-agnostic language. Runtime references to OPENAI_API_KEY (what generated agents themselves need at runtime) remain unchanged.

CI / publish

  • publish.yml: @agentspec/adapter-claude publish step runs after @agentspec/codegen so workspace deps resolve correctly.
  • release.yml: @agentspec/adapter-claude included in the version bump loop.

Test coverage

Package Tests
@agentspec/codegen 172
@agentspec/cli 480
@agentspec/sdk 285
@agentspec/sidecar 245
@agentspec/mcp-server 102
@agentspec/adapter-claude 11
Total 1,295 passing

Codegen tests cover: contract tests for each provider, unit tests for stream/empty-response/error paths, structured error translation for Anthropic and OpenAI SDKs, live-probe tests for all three probes, probe orchestrator tests (including a parallelism assertion), resolver tests for priority order and error messages.

CLI tests include 11 cross-package E2E tests that spawn the real CLI via tsx and validate the resolver, provider-status --json shape, and failure modes.

Test plan

  • pnpm -w test green (1,295 / 1,295)
  • pnpm -w -r typecheck clean
  • pnpm -w -r build clean
  • agentspec provider-status --json renders the unified { results, env } shape
  • Forcing each AGENTSPEC_CODEGEN_PROVIDER mode resolves to the correct provider or raises a targeted error
  • Live probe roundtrip against api.openai.com with a fake key returns unreachable (HTTP 401) and preserves key-preview redaction
  • Optional follow-up: live generation against OpenRouter / Groq / Ollama before releasing @agentspec/codegen

Fixes #33

iliassjabali and others added 5 commits March 22, 2026 00:16
…laude subscription + API key

## What this does

AgentSpec previously required ANTHROPIC_API_KEY for generate and scan.
This change adds full support for Claude Pro/Max subscriptions so users
with a Claude.ai plan can run AgentSpec without any API key.

## New command: agentspec claude-status

Inspect the full Claude auth environment in one shot:

  agentspec claude-status        # table output
  agentspec claude-status --json # machine-readable, exit 1 if not ready

Reports:
- CLI: installed, version, authenticated, account email, plan (Pro/Max/Free)
- API: key set, masked preview, live HTTP probe to /v1/models, base URL
- Env: AGENTSPEC_CLAUDE_AUTH_MODE override, ANTHROPIC_MODEL, resolved mode

Implemented via probeClaudeAuth() in adapter-claude/src/auth.ts which
collects all data without throwing, then renders it in claude-status.ts.

## Auth resolution (CLI first)

resolveAuth() in auth.ts picks the method in this order:
  1. Claude CLI — if installed + authenticated (subscription users)
  2. ANTHROPIC_API_KEY — fallback for CI / API-only setups
  3. Neither — single combined error with setup instructions for both

Override: AGENTSPEC_CLAUDE_AUTH_MODE=cli|api

## CLI stdin fix

runClaudeCli() now pipes the user message via stdin (spawnSync input:)
instead of as a CLI argument, avoiding ARG_MAX limits on large manifests.

## Why not @anthropic-ai/claude-agent-sdk

The agent SDK is designed for persistent multi-turn coding assistants
(session management, resume cursors, tool approval gates). AgentSpec
generate/scan are one-shot calls — the SDK would be ~2500 lines of
adapter code with almost all of it unused. Our spawnSync approach is
the correct scope match: zero extra dependency, auth for free, simple
to test and debug. The only tradeoff is no streaming in CLI mode.

## Files

New:
- packages/adapter-claude/src/auth.ts — resolveAuth, isCliAvailable, probeClaudeAuth
- packages/adapter-claude/src/cli-runner.ts — runClaudeCli via spawnSync stdin
- packages/cli/src/commands/claude-status.ts — new CLI command
- packages/adapter-claude/src/__tests__/auth.test.ts — 16 tests
- packages/adapter-claude/src/__tests__/cli-runner.test.ts — 9 tests
- docs/guides/claude-auth.md — full auth guide incl. claude-status usage
- examples/gymcoach/docker-compose.yml — local Postgres + Redis

Updated:
- adapter-claude/index.ts — routes generate/repair through resolveAuth
- cli/commands/generate.ts + scan.ts — remove hard API key blocks, show auth label
- cli/cli.ts — registers claude-status command
- docs/reference/cli.md — claude-status section, updated generate/scan auth docs
- docs/concepts/adapters.md + quick-start.md — dual-auth examples throughout

Tests: 63 passing in adapter-claude, 1039 passing workspace-wide
…tion or class'

Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
- auth.ts: parse claude auth status JSON before lowercasing so loggedIn:false
  is not silently misread as true (Copilot comment on isClaudeAuthenticated)
- auth.ts: reduce API key preview exposure from 16 chars to first-4…last-2
- auth.ts: remove dead catch branch in isClaudeAuthenticated (both if-branches
  returned false; simplified to unconditional return false)
- cli-runner.ts: remove dead systemPromptPath temp-file write — system prompt
  was written to disk but never used; --system-prompt was passed inline.
  Also fixes cleanupTempFile which called unlinkSync on a directory (would
  always throw and leave temp dirs behind).
- generate.ts / scan.ts: derive authLabel from resolveAuth() instead of
  isCliAvailable() so AGENTSPEC_CLAUDE_AUTH_MODE override is reflected in
  the spinner (Copilot comment on both commands)
- generate.ts / scan.ts: resolve auth once and pass into generateWithClaude
  via new options.auth field to avoid redundant subprocess call (PERF-01)
- generate.ts: fix runDeployTarget helm path to wrap generateWithClaude in
  try/catch with graceful error output (QUAL-03)
- index.ts: wrap repairYaml YAML content in XML tags to prevent prompt
  injection from adversarial agent.yaml files (SEC-02); truncate to 64 KB
- skills/guidelines.md: add security preamble instructing Claude to treat
  context_manifest and context_file XML tags as data only, never instructions
- docs: correct timeout example in error table from 120s to 300s
- tests: add claude-status.test.ts (9 tests) covering JSON output shape and
  exit code 0/1 for all three resolved modes
- tests: add probeClaudeAuth coverage (8 tests) to auth.test.ts
- tests: add repairYaml coverage (4 tests) and XML tag assertions to
  claude-adapter.test.ts; update buildContext tests for new XML format
- tests: remove dead node:fs mock from cli-runner.test.ts
- tests: update scan/generate test mocks from isCliAvailable to resolveAuth
- cli.test.ts: pass AGENTSPEC_CLAUDE_AUTH_MODE=api in generate tests to
  prevent them hitting real Claude CLI on developer machines
…nostic architecture

Replace the monolithic @agentspec/adapter-claude with @agentspec/codegen — a
provider-agnostic code generation package using hexagonal architecture.

- CodegenProvider port with three adapters: Claude subscription, Anthropic API, OpenAI Codex
- Auto-detection via resolveProvider() (CLI → API key → Codex)
- Streaming via AsyncIterable<CodegenChunk> (delta | heartbeat | done)
- @agentspec/adapter-claude retained as deprecated backwards-compat shim
- CLI updated: --provider flag on generate and scan commands
- 78 codegen tests, 363 CLI tests, 1065 total — all passing
- Docs updated: adapters guide, claude-auth, cli reference, codegen README
Comment thread packages/codegen/src/__tests__/domain/resolver.test.ts Fixed
Remove all Claude-specific hardcoding from the generic codegen pipeline.
The CLI, types, and docs now use provider-agnostic language throughout.

Renames:
- claude-status command → provider-status
- probeClaudeAuth() → probeProviders()
- ClaudeProbeReport → ProviderProbeReport
- ClaudeApiProbe → AnthropicApiProbe
- ClaudeEnvProbe → ProviderEnvProbe
- auth-probe.ts → provider-probe.ts
- docs/guides/claude-auth.md → provider-auth.md
- resolvedMode: 'cli'|'api' → resolvedProvider: string|null
- authModeOverride → providerOverride
- AGENTSPEC_CLAUDE_AUTH_MODE → AGENTSPEC_CODEGEN_PROVIDER

Adds E2E cross-functionality tests covering the full
resolver → provider-probe → provider-status pipeline.
Replace `as any` with proper types:
- Contract tests use CodegenChunk instead of unknown/any
- Test spies use ReturnType<typeof vi.spyOn<...>> instead of any
- Context builder test uses AgentSpecManifest instead of any
- Contract makeSuccessStream param widened to unknown (removes as any at call sites)
- Codex test uses type guard narrowing instead of as any
…codegen

# Conflicts:
#	docs/quick-start.md
#	docs/reference/cli.md
#	packages/adapter-claude/src/__tests__/claude-adapter.test.ts
#	packages/codegen/src/context-builder.ts
@iliassjabali iliassjabali marked this pull request as ready for review April 12, 2026 00:50
@skokaina
Copy link
Copy Markdown
Contributor

@iliassjabali this is great we should update the docs/ to reflect that we have a provider agnostic code generator now

- Adapter pages (langgraph, crewai, mastra, autogen): replace hardcoded
  ANTHROPIC_API_KEY with provider auto-detect note + link to provider-auth
- ci-integration.md: add LLM code generation section with provider setup
  and AGENTSPEC_CODEGEN_PROVIDER force-override example
- Migration guides (gymcoach, gpt-researcher, openagi, existing-agent,
  superagent): replace hardcoded export with provider-auth reference
- Tutorials (01, 02, 03): replace hardcoded key with provider-auth link
  in prerequisites and code blocks
- Add choosing-a-provider decision matrix
- Add Method 3 (OpenAI Codex) which was completely missing
- Expand each method with: default model, model override, rate limits,
  cost notes, probing behavior, and provider-specific gotchas
- Add comprehensive env var reference table
- Expand troubleshooting with quota, rate limit, and billing errors
- Add CI examples for both Anthropic API and OpenAI Codex
@iliassjabali iliassjabali self-assigned this Apr 12, 2026
@iliassjabali iliassjabali requested a review from skokaina April 12, 2026 12:36
@iliassjabali
Copy link
Copy Markdown
Collaborator Author

@iliassjabali this is great we should update the docs/ to reflect that we have a provider agnostic code generator now

updated the docs !

- Fix temp directory leak in ClaudeSubscriptionProvider (cleanup in finally)
- Restore adapter-claude backwards compat (onProgress, GenerationProgress, repairYaml 3rd arg)
- Break circular import by extracting collect() to stream-utils.ts
- Deduplicate Claude CLI auth check into shared claude-auth.ts
- Remove dead heartbeat interval in ClaudeSubscriptionProvider
- Fix greedy regex in response-parser (use non-greedy match)
- Fix sanitizeContextContent to also escape </context_manifest>
- Add Codex probe section to provider-status command
- Add 74 new tests: shim compat, generateCode/collect, translateError branches,
  empty-response guards, --provider flag for generate and scan
- Docs: add --provider to generate options, fix --include-api-server reference,
  add autogen to framework tables, fix ANTHROPIC_MODEL defaults
@iliassjabali iliassjabali removed their assignment Apr 12, 2026
tryParseCandidates took indexOf('```json') + lastIndexOf('\n```') as a
single slice, so when an LLM emitted more than one fenced block the
concatenation was invalid JSON and the parser threw "did not return
valid JSON" even though each block parsed fine on its own.

Iterate every ```json fence via a /g regex, JSON.parse each, and merge
across candidates: any block with `files` wins, the first
`installCommands` and `envVars` seen are folded in. The distinction
between parse_failed (no JSON at all) and response_invalid (JSON parsed
but no files field) is preserved.

Observed during `agentspec generate examples/gymcoach/agent.yaml`
against both --framework langgraph (multi-fence batching) and
--framework helm (metadata block separate from files block). Helm
retry after the fix wrote 14 files successfully.

Covered by 4 new cases in response-parser.test.ts.
Adds a Fixed entry for f0eb12f so the parser reliability change is
discoverable without reading commit history.
Replace the legacy OpenAI-SDK provider (hardcoded to api.openai.com and
keyed on OPENAI_API_KEY) with OpenAICompatibleProvider, configured via
three env vars: AGENTSPEC_LLM_API_KEY (required), AGENTSPEC_LLM_MODEL
(required), AGENTSPEC_LLM_BASE_URL (optional, defaults to
api.openai.com/v1). Works with OpenRouter, Groq, Together, Ollama,
Nvidia NIM, OpenAI.com, or any OpenAI wire-format endpoint.

Resolver auto-detect priority: claude-sub > openai-compatible >
anthropic-api. Error translation uses the openai SDK's structured error
classes (AuthenticationError, RateLimitError, NotFoundError,
BadRequestError, APIError) instead of string matching.

Refactor the probe subsystem into a hexagonal layout:
- Add a ProviderProbe port in provider.ts alongside CodegenProvider.
- Colocate each provider's probe logic with its adapter module so
  provider-probe.ts shrinks to a ~55-line orchestrator iterating over a
  PROBES list via Promise.all.
- Convert the Claude CLI probe from execFileSync to async execFile so
  it no longer blocks the event loop; provider-status now genuinely
  runs all three probes in parallel.
- The openai-compatible probe does a live GET /models roundtrip with a
  6s timeout, reporting ready / misconfigured / unreachable.

CLI provider-status collapses three per-provider renderers into one
dispatch keyed on result.provider. The --json shape changes from named
fields (claudeCli / anthropicApi / codex) to a unified
results: ProviderProbeResult[] array; no downstream JSON contract
depended on the old shape.

Spec: docs/superpowers/specs/2026-04-12-openai-compatible-codegen-provider-design.md

Tests: 1,295 passing workspace-wide (172 codegen / 480 cli / rest
unchanged). Typecheck and build clean across all packages.
@skokaina
Copy link
Copy Markdown
Contributor

@iliassjabali this is great we should update the docs/ to reflect that we have a provider agnostic code generator now

updated the docs !

Great, merging this one would require a version bump - will need to setup the npmjs repo first

Update README provider-auth section and CHANGELOG Unreleased notes
to reflect the new AGENTSPEC_LLM_* configuration and auto-detect
priority introduced in c72185c.
@iliassjabali iliassjabali merged commit 2ade20e into main Apr 14, 2026
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support any OpenAI-compatible LLM provider for scan/generate

2 participants