v0.11.0
What's Changed
π One-Shot Agent Runs with jazz run
Tired of firing up a full interactive session just to ask Jazz a quick question? Say hello to jazz run β a brand-new one-shot command that takes a prompt, runs the agent once, and spits out a clean answer on stdout. It's perfect for scripts, CI/CD pipelines, and webhook handlers. Pipe stdin for untrusted input, add --max-iterations to cap the agent's reasoning loop, or use --json to get a structured envelope with cost and token usage. Think of it as jazz chat's no-nonsense cousin who doesn't stick around for small talk. (#245)
π Web Search Gets a Major Upgrade
Jazz's web search just leveled up in a big way. Linkup joins the search party as a new provider, giving you another powerful way to find what you're looking for on the internet. But the real magic is in the overhaul: all search providers now return actual content, not just URLs and snippets. Exa now delivers full text in results, Brave pulls extra snippets, and Tavily serves richer content. Less "here's a link, go figure it out" β more "here's the answer, you're welcome." (#241, #229, #238)
And if you're on OpenRouter, Jazz now supports native server-side web_search and web_fetch tools β meaning the LLM can search and fetch pages directly through OpenRouter's infrastructure, no round-trip required. (#230)
π§ Smarter, More Resilient Agents
Agents that give up easily are the worst. That's why Jazz now defaults to 10 LLM retries (up from 3) with exponential backoff, a 15-minute total timeout covering all retries, and a 45-second slow-call hint that tells you when the model is taking its sweet time. Streaming failures gracefully fall back to non-streaming mode with their own full retry budget β because one bad connection shouldn't torpedo your entire run. (#231, #237, #235)
But wait, there's more. Say hello to budget pressure and meltdown detection. As an agent approaches its iteration limit, it gets increasingly urgent nudges to wrap things up and produce output instead of spiraling into infinite research. And if the agent starts repeating the same tool calls over and over (we've all been there), meltdown detection kicks in, clears the slate, and tells the agent to try something β anything β different. (#238)
π Per-Agent API Key Overrides
Running multiple agents that need different API keys? Now you can configure per-agent API key overrides that fall back to global config and then environment variables. Agent A uses one OpenAI key, Agent B uses another β everybody's happy, nobody's fighting over credentials. (#242)
π¬ A Chat Experience That Actually Feels Good
The interactive chat got a serious polish pass. You can now type messages while the agent is still thinking β they'll be queued up and sent automatically when the agent finishes, so you never have to stare at a frozen cursor again. Bounded live panels show the last few lines of reasoning and subagent output in real-time, then collapse to a one-line summary when done. Press Shift+Tab to toggle between "safe" mode (all tools need approval) and "yolo" mode (full auto-pilot) β because sometimes you just want to let it rip. (#212, #220, #213, #239)
Under the hood, streaming is buttery smooth with 80ms buffered deltas (~12 fps), coalesced updates, and adaptive buffering that defers rendering when you're inside an unclosed code block or table β no more watching column widths dance around as the output streams in. (#216, #218, #211, #209)
π Session History & Cost Tracking
Every conversation is now logged to disk with timestamps and roles, so you can look back at what happened. And workflow runs now track per-run cost and token usage right in the run history β because knowing how much your agent spent is the first step to not being surprised by the bill. (#221, #226)
π‘οΈ Security Hardening
The shell tool got a serious security audit. Commands are now executed via spawn with an argument array instead of string interpolation, and a defense-in-depth denylist blocks dangerous operations like rm -rf /, privilege escalation, reverse shells, fork bombs, and curl-piped-to-bash shenanigans. A dedicated regression test suite covers 80+ block-and-allow cases. Your agent should be helpful, not a liability. (#227)
π§ Quality-of-Life Improvements
ask_user_questionnow always allows free-form text input, even when suggested responses are provided β because sometimes the best answer isn't in the list.- Failed scheduled workflow runs are now properly recorded and surfaced on startup, so you'll know exactly what went wrong instead of wondering why your cron job silently died.
- Tool aliases let you register alternative names for tools, so the LLM has more ways to find what it needs.
- The auto-update checker now links to the releases page instead of a stale changelog.
- CI agents have been switched to openrouter/owl-alpha for faster, more cost-effective reviews.
Commits
115f9d23chore(ci): switch CI agents to openrouter/owl-alpha by @lvndry82f17396feat(cli): add jazz run one-shot command and --max-iterations flag by @lvndry3edea825fix(user-interaction): always enable free-form ask_user_question input by @lvndry9c3a616cfix(llm): support per-agent API key overrides with fallback by @lvndryc1a8b629feat(web-search): add Linkup as a search provider by @lvndryeaa05af5feat(agent): surface llm retry and slow-call status by @lvndry3fa7c53achore(notifications): replace node-notifier with native commands + feat(shift-tab): add mode toggle shortcut by @lvndry65815f4achore(web-search): [exa] return text content by @lvndryad2bcca5feat: per-agent search provider, budget pressure, meltdown detection by @lvndry5aabc714fix(llm): clean up retry design β scoped fallback, unified schedule, 15 min timeout by @lvndrye28bd614fix(llm): suppress AI SDK system-in-messages warning by @lvndry9f1091fbfeat(tools): add tool alias support + prompt improvements for blocked interpreter flags by @lvndryb4c5a5bedocs: README by @lvndry6631b4e2chore(deps): update all dependencies by @lvndrye7a1565bfix(agent): raise LLM request timeout floor to 10 min by @lvndryc0eacf8bfix(agent): bump DEFAULT_MAX_LLM_RETRIES from 3 to 8 by @lvndry2d4a9094feat(openrouter): native web_search and web_fetch server tool support by @lvndrybb139601feat(web-search): overhaul all providers to return content and follow latest best practices by @lvndry34548bd0fix(security): eliminate shell injection via spawn argument array by @lvndry2ababe71feat(config): make LLM retry count configurable via jazz.config.json by @lvndry0824c3b3ci(release): skip release if no commits since last tag by @lvndry050e88aechore(cost): more precise calculation by @lvndry8a29a3cafeat(workflow): expose per-run cost and token usage in AgentResponse by @lvndry62d00530feat(web-search): auto-detect provider API keys from environment variables by @lvndry01c645d7fix(auto-update): replace changelog with releases link by @lvndryfa3286bachore: improve vscode config by @lvndryd2505409chore: remove TODO.md and README.proposed.md by @lvndryfa069c0ffeat(history): session storage and conversation history by @lvndryad57363ffeat(chat): stack queued messages, one entry per line by @lvndry9ba9840dfeat(cli): bounded live panels for reasoning and subagents by @lvndryb674518ffix(chat): echo "You: " when draining the queued message by @lvndryfb4f804cchore: prune dead files and unused dependencies by @lvndry17fdcec7perf(cli): adaptive buffering for tables and code blocks by @lvndryd1150b41feat(chat): queue messages typed while agent is busy by @lvndry91972f78fix(workflow): record failed scheduled runs and surface them on startup by @lvndry861bcf20perf(cli): buffer streaming deltas to ~80ms cadence by @lvndry859929c5fix(cli): rendering polish β blank line before metrics + reasoning delimiter by @lvndrye480f06bfix(cli): polish input flows β optional agent description + mask API key echo by @lvndryda1f4cc2fix(cli): rebuild streaming pipeline on single-pending-buffer model by @lvndry