v0.12.0
What's Changed
jazz run grows up: live event streaming and per-run reasoning control
The one-shot jazz run command was already the workhorse for scripts and webhooks, but it was a bit of a black box — you'd fire off a prompt and just... wait. No more. The --events flag now lets you subscribe to live NDJSON event streams on stderr while stdout stays pristine with your final answer. Pick exactly what you want to watch: tools, reasoning, text, usage, or just all if you want the full firehose. Each event is a parseable JSON line, capped at 200 characters per string value so a single rogue tool result won't flood your logs. Meanwhile, --reasoning lets you override the agent's reasoning effort on a per-run basis — crank it to high for gnarly problems, drop it to low for quick lookups, or disable it entirely when you just want straight answers without the internal monologue. (#247, #248)
Ollama reasoning: actually disabled when you say disable
If you've been using Ollama with reasoning-capable models, you might have noticed something unsettling: the agent would go silent, returning empty content and no tool calls. The culprit? Ollama's thinking-enabled models default to reasoning ON when no flag is sent, which shoves all output into a separate reasoning field that Jazz wasn't reading. Now, when reasoning is set to disable, Jazz explicitly sends think: false to Ollama, making the model behave like a normal chatty assistant again. Your tools work. Your content appears. Your sanity is preserved. (#248)
Per-agent API key overrides with graceful fallback
You can now configure API keys at the individual agent level — perfect for when one agent needs a different OpenAI org, or you want to route specific agents through different provider accounts. If an agent doesn't have its own key configured, Jazz falls back to the global config seamlessly. No more "but I thought I set that key" moments. (#242)
ask_user_question always accepts free-form input
The ask_user_question tool now always enables free-form text input alongside the selectable options. Previously, users were sometimes trapped in a multiple-choice-only mode with no way to type a custom response. Now you get the best of both worlds: quick-select suggestions when they fit, and the freedom to type whatever you want when they don't. (#239)
Commits
0408b910.12.0 by @github-actions[bot]f9c34e1fix(llm): ollama reasoning 'disable' now sends think:false; add --reasoning flag to run (#248) by @lvndryec651f2feat(cli): add --events NDJSON stream flag to jazz run (#247) by @lvndry115f9d2chore(ci): switch CI agents to openrouter/owl-alpha (#246) by @lvndry82f1739feat(cli): add jazz run one-shot command and --max-iterations flag (#245) by @lvndry3edea82fix(user-interaction): always enable free-form ask_user_question input by @lvndry9c3a616fix(llm): support per-agent API key overrides with fallback (#242) by @lvndryc1a8b62feat(web-search): add Linkup as a search provider (#241) by @lvndryeaa05affeat(agent): surface llm retry and slow-call status (#240) by @lvndry3fa7c53chore(notifications): replace node-notifier with native commands + feat(shift-tab): add mode toggle shortcut (#239) by @lvndry65815f4chore(web-search): [exa] return text content by @lvndryad2bccafeat: per-agent search provider, budget pressure, meltdown detection (#238) by @lvndry5aabc71fix(llm): clean up retry design — scoped fallback, unified schedule, 15 min timeout (#237) by @lvndrye28bd61fix(llm): suppress AI SDK system-in-messages warning (#235) by @lvndry9f1091ffeat(tools): add tool alias support + prompt improvements for blocked interpreter flags (#233) by @lvndryb4c5a5bdocs: README (#234) by @lvndry6631b4echore(deps): update all dependencies (#232) by @lvndrye7a1565fix(agent): raise LLM request timeout floor to 10 min by @lvndryc0eacf8fix(agent): bump DEFAULT_MAX_LLM_RETRIES from 3 to 8 (#231) by @lvndry2d4a909feat(openrouter): native web_search and web_fetch server tool support (#230) by @lvndrybb13960feat(web-search): overhaul all providers to return content and follow latest best practices (#229) by @lvndry34548bdfix(security): eliminate shell injection via spawn argument array (#227) by @lvndry2ababe7feat(config): make LLM retry count configurable via jazz.config.json (#228) by @lvndry0824c3bci(release): skip release if no commits since last tag by @lvndry050e88achore(cost): more precise calculation by @lvndry8a29a3cfeat(workflow): expose per-run cost and token usage in AgentResponse (#226) by @lvndry62d0053feat(web-search): auto-detect provider API keys from environment variables (#225) by @lvndry01c645dfix(auto-update): replace changelog with releases link (#224) by @lvndryfa3286bchore: improve vscode config (#223) by @lvndryd250540chore: remove TODO.md and README.proposed.md (#222) by @lvndryfa069c0feat(history): session storage and conversation history (#221) by @lvndryad57363feat(chat): stack queued messages, one entry per line (#220) by @lvndry9ba9840feat(cli): bounded live panels for reasoning and subagents (#213) by @lvndryb674518fix(chat): echo "You: " when draining the queued message (#217) by @lvndryfb4f804chore: prune dead files and unused dependencies (#219) by @lvndry17fdcecperf(cli): adaptive buffering for tables and code blocks (#218) by @lvndryd1150b4feat(chat): queue messages typed while agent is busy (#212) by @lvndry91972f7fix(workflow): record failed scheduled runs and surface them on startup (#215) by @lvndry861bcf2perf(cli): buffer streaming deltas to ~80ms cadence (#216) by @lvndry859929cfix(cli): rendering polish — blank line before metrics + reasoning delimiter (#211) by @lvndry