Add summarize delivery: prep --via claude-p + --via api (#329) #357

Merged
anilmurty merged 1 commit into main from feat/summarize-delivery
Jun 29, 2026

Conversation

@auspexlabs

Collaborator

This PR is the write-the-summary half of the feature: tj summarize prep --via claude-p|api runs the rewrite for you in one shot. It is the last PR of the summarize stack.

Summary

  • prep --via claude-p: drives your local claude -p (headless). No API key is needed, and a 300s timeout keeps a stuck call from wedging the CLI.
  • prep --via api: calls Anthropic with your own TJ_ANTHROPIC_API_KEY (any global ANTHROPIC_API_KEY is ignored) and the required [summarize] api_model (no default, because only frontier models preserve structure). Reports a "pays for itself" amortization: actual provider usage priced through the existing cost engine, divided by the estimated per-call saving.
  • Both run wrap → rewrite → check → stage, so structure stays a hard guarantee. The api path accepts only a clean completion (stop_reason == "end_turn"); truncated, refused, or non-summary responses are rejected before the structure gate, which matters for prose-only prompts that have no markers to fail. Nothing is staged on a failure, and a rejected-but-billed call still states that it was billed and gives the user the failure reason.
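
For readers unfamiliar with the timeout pattern behind the claude-p path, here is a minimal sketch. It is not the code in this PR: `run_headless` is a hypothetical helper, and passing the prompt on stdin is an assumption. Only the `claude -p` command and the 300s limit come from the description above.

```python
import subprocess

CLAUDE_TIMEOUT_S = 300  # the limit stated in this PR


def run_headless(prompt, argv=("claude", "-p"), timeout=CLAUDE_TIMEOUT_S):
    """Return (ok, text). Hypothetical helper, not TokenJam's actual function."""
    try:
        proc = subprocess.run(
            list(argv),
            input=prompt,          # assumption: the prompt is piped on stdin
            capture_output=True,
            text=True,
            timeout=timeout,       # subprocess.run kills the child on expiry
        )
    except FileNotFoundError:
        return False, "claude is not installed or not on PATH"
    except subprocess.TimeoutExpired:
        return False, f"claude -p timed out after {timeout}s; nothing staged"
    if proc.returncode != 0:
        return False, proc.stderr.strip() or f"exit code {proc.returncode}"
    return True, proc.stdout
```

Returning a status pair instead of raising lets the caller treat "not installed", "timed out" and "non-zero exit" uniformly as reasons not to stage anything.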

What's NOT in this PR

  • Multi-provider --via api (a provider catalog) + a base_url/gateway override — deferred; v1 is Anthropic-only. This is TJ's first outbound model call (everywhere else it observes), so there's no existing outbound-endpoint home and where that config should live long-term is an open question.
  • Recording summarize's own api spend as a span in tj cost — deferred.

Tests / Verification

  • tests/unit/test_summarize_delivery.py + the prep --via CLI round-trips cover both paths: roundtrip/stage, below-gate-skips-spend, drift-refuse, claude-p timeout + not-installed, api no-key / no-model / unknown-model-labels-estimate / failed-structure-reports-cost / missing-usage-cost-unknown / truncation / non-end_turn refusal / malformed + non-dict JSON.
  • Full CI (unit/synthetic/agents/integration): 1328 passed, 2 skipped. ruff check tokenjam/ + mypy tokenjam/ clean.

Checklist

  • One concern (delivery: claude-p + api)
  • ruff check tokenjam/ + mypy tokenjam/ clean
  • Full suite green standalone (1328 passed / 2 skipped)
  • CLAUDE.md updated for the landed surface

Closes #329.

`tj summarize prep --via claude-p|api` runs the rewrite for you in one
shot (wrap, rewrite, check, stage) instead of the manual copy/paste or
MCP in-session paths. This is the write-the-summary half of the feature
and the last PR of the summarize stack.

claude-p drives the user's local `claude -p` headless: no key, with a
300s timeout so a stuck call can't wedge the CLI. api calls Anthropic
with the user's own TJ_ANTHROPIC_API_KEY (any global ANTHROPIC_API_KEY
is ignored) and the required [summarize] api_model (no default; only
frontier models preserve structure). It reports a "pays for itself"
amortization: actual provider usage priced through the existing cost
engine, divided by the estimated per-call saving.
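
To make the "pays for itself" arithmetic concrete, here is a small sketch of the division described above. The names (`Usage`, `price_usd`, `break_even_calls`), the per-million-token rates, the dollar unit for the per-call saving, and the rounding up are all illustrative assumptions; the real figures come from the existing cost engine.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    input_tokens: int
    output_tokens: int


def price_usd(usage, usd_per_mtok_in, usd_per_mtok_out):
    """Price one call from provider-reported usage (rates are placeholders)."""
    total = (usage.input_tokens * usd_per_mtok_in
             + usage.output_tokens * usd_per_mtok_out)
    return total / 1_000_000


def break_even_calls(cost_usd, saving_per_call_usd):
    """Calls needed before the rewrite has paid for itself, or None if never."""
    if saving_per_call_usd <= 0:
        return None
    return math.ceil(cost_usd / saving_per_call_usd)


# Hypothetical numbers: 4000 input + 1000 output tokens at 3 / 15 USD per Mtok.
cost = price_usd(Usage(4000, 1000), 3.0, 15.0)   # 0.027 USD
calls = break_even_calls(cost, 0.006)            # 0.027 / 0.006 = 4.5, rounded up to 5
```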

Both paths run through the existing check/stage gate, so structure stays
a hard guarantee. The api path accepts only a clean completion
(stop_reason end_turn); truncated, refused, or non-summary responses are
refused before the structure gate, which matters for prose-only prompts
with no markers to fail. Nothing is staged on a failure, and a
rejected-but-billed call still says it was billed.
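
The clean-completion rule can be sketched as a check that runs before any structure validation. This is an illustration, not the PR's code: `accept_completion` is a hypothetical name, the response is assumed to be the parsed JSON body of an Anthropic Messages API reply (`stop_reason`, `content`, `usage` fields), and treating a present `usage` object as "the call was billed" is an assumption.

```python
def accept_completion(resp):
    """Return (text, reason); text is None when the response is refused."""
    if not isinstance(resp, dict):
        return None, "malformed response: not a JSON object"

    # Assumption: a usage object means the provider charged for the call.
    usage = resp.get("usage")
    note = "; the call was billed" if isinstance(usage, dict) else "; cost unknown"

    stop = resp.get("stop_reason")
    if stop != "end_turn":
        # Covers truncation (max_tokens), refusals, and anything else.
        return None, f"stop_reason={stop!r}, expected 'end_turn'" + note

    blocks = resp.get("content")
    if not isinstance(blocks, list):
        return None, "no content blocks" + note

    text = "".join(
        b["text"]
        for b in blocks
        if isinstance(b, dict)
        and b.get("type") == "text"
        and isinstance(b.get("text"), str)
    )
    if not text.strip():
        return None, "empty completion" + note
    return text, "ok"
```

Checking `stop_reason` first means a truncated prose-only summary is rejected even though it contains no markers that a later structure check could catch.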

Closes #329.

auspexlabs requested a review from anilmurty as a code owner June 28, 2026 22:42
auspexlabs requested a review from anilmurty June 28, 2026 22:42

anilmurty left a comment

Contributor

Thanks @auspexlabs — this completes the summarize stack (scan → mechanism → apply → delivery), and it's careful exactly where it needs to be on TokenJam's first outbound call:

  • Reads only TJ_ANTHROPIC_API_KEY and ignores any ambient ANTHROPIC_API_KEY; the no-key path fails before any network call, and the key never lands in a log, span, or error message. Real timeouts on both paths, nothing staged on timeout.
  • The "pays for itself" math is honest — labeled an estimate, real-charge vs default-rates distinguished, and a billed-but-failed call still tells the user it was billed. Below-gate candidates skip the spend entirely.
  • Same check gate as #355/#356, staging-only (never writes the file), and the api path refuses anything but a clean end_turn completion before the structure gate — which correctly handles prose-only prompts that have no markers to fail.
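
As an illustration of the key handling described in the first point above, here is a minimal sketch of reading only the TokenJam-specific variable and failing before any network call. `resolve_api_key` and `MissingKeyError` are hypothetical names, not identifiers from this PR; only the two environment variable names come from the description.

```python
import os

KEY_VAR = "TJ_ANTHROPIC_API_KEY"  # the only variable consulted


class MissingKeyError(RuntimeError):
    """Raised before any network call when no TokenJam key is configured."""


def resolve_api_key(env=None):
    """Return the key from KEY_VAR, ignoring any ambient ANTHROPIC_API_KEY."""
    env = os.environ if env is None else env
    key = env.get(KEY_VAR, "").strip()
    if not key:
        # The message names the variable but never echoes a key value.
        raise MissingKeyError(f"{KEY_VAR} is not set; refusing before any network call")
    return key
```

Taking the environment as a parameter keeps the lookup testable without touching the real process environment.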

CI green, ruff + mypy clean. Ready to merge.

anilmurty merged commit cb42e75 into main Jun 29, 2026
4 checks passed
anilmurty deleted the feat/summarize-delivery branch June 29, 2026 01:17

Development

Successfully merging this pull request may close these issues.

[enhancement] Summary — initial CLI + MCP surface (scan → mechanism → apply → delivery)

3 participants