Conversation
Collaborator
Thanks for your contribution!
…via agents.yaml The agentic chat pipeline had six hardcoded `max_tokens` values (thinking/observing/responding/answer_now/acting/react_fallback) and a hardcoded `temperature=0.2`, which caused the final response to be truncated mid-sentence on long answers (e.g. ~1800-token cap on `responding`). Lift these to `capabilities.chat.<stage>.max_tokens` / `capabilities.chat.temperature` in `agents.yaml` so users can tune budgets without code changes. New defaults bump responding/answer_now to 8000 tokens to fix truncation out of the box, with safe fallbacks in `_ChatLimits.from_config` if the yaml block is missing.
…space) Output of `pre-commit run --from-ref upstream/dev --to-ref HEAD`:
- ruff (legacy alias): removed the unused import `complete as llm_complete` in agentic_pipeline.py (left over after my refactor; no callsites) and re-sorted imports in services/config/__init__.py.
- ruff-format: re-wrapped a few long lines that crossed the 100-col budget after my changes (stream-message calls, tool-binding tuples, test-fixture dict literals); collapsed a single-element multi-line list back to one line.
- trailing-whitespace: stripped pre-existing trailing spaces in two docstrings of services/setup/init.py that happened to be in the same file as my edits.
No behavioural changes; tests still pass (70 passed in the same suite: chat/, capabilities/test_answer_now.py, services/config/).
Note: the bandit hook fails with "Unknown test found in profile: B104". This is an upstream config/version mismatch in pyproject.toml (`tool.bandit.skips` lists B104, but bandit 1.8.0 no longer recognises that ID). It is independent of this PR.
90a511f to
7a2e4d1
pancacake pushed a commit that referenced this pull request, Apr 20, 2026
Rebased onto dev (post-#348) and resolved two UI conflicts (`web/app/(workspace)/chat/[[...sessionId]]/page.tsx` drops the now-dead `replaySnapshot` helper; `web/components/chat/home/ChatMessages.tsx` switches the last-assistant action button from the old `RotateCcw`/Retry to `RefreshCcw`/Regenerate). Smoke tests green (199 passed) and all required checks pass. Merging: adds a `regenerate` capability wired through CLI (`/regenerate` · `/retry`), WebSocket (`type: "regenerate"`), and Web UI (per-message Regenerate button). Backend rolls back the trailing assistant via `SQLiteSessionStore.delete_message`, reuses `start_turn` through `_persist_user_message=False` / `_regenerate=True`, skips `memory_service.refresh_from_turn` on regeneration, and surfaces non-fatal `regenerate_busy` / `nothing_to_regenerate` errors. Also includes empty-response diagnostics in `_stage_responding`. 18 new tests.
Description
The agentic chat pipeline currently has six hardcoded `max_tokens` values (`thinking=1200`, `observing=1200`, `responding=1800`, `answer_now=1800`, `acting=1500`, `react_fallback=800`) and a hardcoded `temperature=0.2` in `AgenticChatPipeline._completion_kwargs`. These limits cannot be tuned without editing source code, and the 1800-token cap on `responding` truncates long final responses mid-sentence on real-world prompts (reproduced on a Russian, LaTeX-heavy "VC-dimension proof plan" question: output cut around the 1800-token boundary, ~4500–4800 chars).
This PR lifts these knobs into `data/user/settings/agents.yaml` under a new `capabilities.chat` block, mirroring the existing per-capability config pattern (`capabilities.solve`, `capabilities.research`, etc.), but with per-stage sub-sections (`capabilities.chat.<stage>.max_tokens`) alongside a shared `capabilities.chat.temperature` to give granular control.
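For illustration, the nested shape described above might look like the following once loaded, written as the Python mapping a YAML loader would produce. The key paths follow `capabilities.chat.<stage>.max_tokens` and `capabilities.chat.temperature` from this PR; the name `EXAMPLE_AGENTS_CONFIG` is hypothetical, and the temperature and the four stages other than `responding`/`answer_now` are assumed to keep their previous values.

```python
# Hypothetical illustration of the loaded agents.yaml shape; not copied from the PR.
EXAMPLE_AGENTS_CONFIG = {
    "capabilities": {
        "chat": {
            "temperature": 0.2,  # assumed unchanged; one value shared by all stages
            "thinking": {"max_tokens": 1200},        # assumed unchanged
            "observing": {"max_tokens": 1200},       # assumed unchanged
            "responding": {"max_tokens": 8000},      # bumped from 1800 per the PR
            "answer_now": {"max_tokens": 8000},      # bumped from 1800 per the PR
            "acting": {"max_tokens": 1500},          # assumed unchanged
            "react_fallback": {"max_tokens": 800},   # assumed unchanged
        }
    }
}

# Collect the per-stage budgets; scalar keys such as "temperature" are skipped.
chat = EXAMPLE_AGENTS_CONFIG["capabilities"]["chat"]
stage_budgets = {
    name: cfg["max_tokens"] for name, cfg in chat.items() if isinstance(cfg, dict)
}
```

A user who only wants a longer final answer would override just `responding.max_tokens` and leave the other entries alone.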
Defaults are bumped (`responding`/`answer_now`: 1800 → 8000) to fix the truncation bug out of the box. `_ChatLimits.from_config` provides safe fallbacks if the yaml block is missing entirely (legacy installs), and coerces malformed values (strings, non-dict stage entries) back to defaults instead of crashing.
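The fallback behaviour described above could be sketched roughly as follows. This is not the PR's `_ChatLimits`: the class name `ChatLimitsSketch` is hypothetical, the defaults for the four unchanged stages are assumed to equal the old hardcoded values, and treating numeric strings such as `"4000"` as valid integers (while garbage strings fall back) is an assumption about how "string-numbers" are handled.

```python
from dataclasses import dataclass, fields
from typing import Any


@dataclass(frozen=True)
class ChatLimitsSketch:
    """Per-stage max_tokens budgets (hypothetical stand-in for _ChatLimits)."""

    thinking: int = 1200
    observing: int = 1200
    responding: int = 8000
    answer_now: int = 8000
    acting: int = 1500
    react_fallback: int = 800

    @classmethod
    def from_config(cls, chat_cfg: Any) -> "ChatLimitsSketch":
        # A missing or non-dict chat block means "use every default".
        if not isinstance(chat_cfg, dict):
            return cls()
        values = {}
        for field in fields(cls):
            stage = chat_cfg.get(field.name)
            # Non-dict stage entries are ignored, so the default is kept.
            raw = stage.get("max_tokens") if isinstance(stage, dict) else None
            try:
                parsed = int(raw)
            except (TypeError, ValueError, OverflowError):
                continue  # None, garbage strings, inf: keep the default
            if parsed > 0:
                values[field.name] = parsed
        return cls(**values)


limits = ChatLimitsSketch.from_config(
    {
        "responding": {"max_tokens": "4000"},  # numeric string, accepted here
        "thinking": "oops",                    # non-dict stage entry
        "acting": {"max_tokens": "lots"},      # garbage string
    }
)
```

Skipping a bad entry rather than raising keeps a single typo in `agents.yaml` from taking the whole chat pipeline down.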
Implementation notes
- `get_chat_params()` and `DEFAULT_CHAT_PARAMS` in `deeptutor/services/config/loader.py`. Kept separate from `get_agent_params` because the chat capability has a nested per-stage shape, while `get_agent_params` returns flat `{temperature, max_tokens}`.
- `DEFAULT_AGENTS_SETTINGS` in `deeptutor/services/setup/init.py` extended so fresh installs auto-generate the `chat` block in their `agents.yaml`.
- `AgenticChatPipeline.__init__` loads the config once into `self._chat_limits` / `self._chat_temperature`; the six callsites and `_completion_kwargs` read from these instance attrs.
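The "load once, read from instance attributes" pattern in the last note can be sketched as below. The class `ChatPipelineSketch`, its constructor argument, and the `stage` parameter on `_completion_kwargs` are all hypothetical; the real `AgenticChatPipeline` signatures are not shown in this PR description.

```python
from typing import Any, Dict


class ChatPipelineSketch:
    """Hypothetical stand-in for AgenticChatPipeline; signatures are assumed."""

    def __init__(self, chat_params: Dict[str, Any]) -> None:
        # Config is read exactly once; stage calls only touch these attributes.
        self._chat_temperature = float(chat_params.get("temperature", 0.2))
        self._chat_limits = {
            stage: int(cfg["max_tokens"])
            for stage, cfg in chat_params.items()
            if isinstance(cfg, dict) and "max_tokens" in cfg
        }

    def _completion_kwargs(self, stage: str) -> Dict[str, Any]:
        # Every stage shares one temperature but has its own token budget.
        return {
            "temperature": self._chat_temperature,
            "max_tokens": self._chat_limits[stage],
        }


pipeline = ChatPipelineSketch(
    {
        "temperature": 0.2,
        "thinking": {"max_tokens": 1200},
        "responding": {"max_tokens": 8000},
    }
)
responding_kwargs = pipeline._completion_kwargs("responding")
```

Reading the config in `__init__` means a change to `agents.yaml` takes effect for newly constructed pipelines, not for ones already running.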
Related Issues
Module(s) Affected
- `agents` (chat pipeline)
- `api`
- `config` (loader + setup defaults)
- `core`
- `knowledge`
- `logging`
- `services`
- `tools`
- `utils`
- `web` (Frontend)
- `docs` (Documentation)
- `scripts`
- `tests`
- ...
Checklist
- `pre-commit run --all-files` and fixed any issues.
Test plan
- `tests/services/config/test_chat_params_config.py` (10 tests):
  - `get_chat_params()` returns defaults when `agents.yaml` is missing or the `capabilities.chat` block is absent.
  - … defaults.
  - `_ChatLimits.from_config` handles empty dict, `DEFAULT_CHAT_PARAMS`, string-numbers, garbage strings, and non-dict stage values.
- `tests/agents/chat/`, `tests/capabilities/test_answer_now.py`, `tests/services/config/` (70 passed locally; nothing depended on the old hardcoded literals).
Additional Notes
- `agents.yaml` files that don't have the `chat` block keep working through the `DEFAULT_CHAT_PARAMS` deep-merge in `get_chat_params()`. Behaviour changes only because the defaults are larger, which is the intended fix.
- `temperature` is now a single value shared by all chat stages (matches the existing single-knob pattern of other capabilities).
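The deep-merge fallback mentioned in the notes above might work along these lines. This is a minimal sketch, not the PR's loader: `DEFAULT_CHAT_PARAMS_SKETCH`, `deep_merge`, and `get_chat_params_sketch` are hypothetical names, the function takes an already-parsed config mapping rather than reading `agents.yaml` itself, and the default values other than the two 8000-token budgets are assumptions.

```python
import copy
from typing import Any, Dict

# Assumed defaults; only responding/answer_now = 8000 is stated in the PR.
DEFAULT_CHAT_PARAMS_SKETCH: Dict[str, Any] = {
    "temperature": 0.2,
    "thinking": {"max_tokens": 1200},
    "observing": {"max_tokens": 1200},
    "responding": {"max_tokens": 8000},
    "answer_now": {"max_tokens": 8000},
    "acting": {"max_tokens": 1500},
    "react_fallback": {"max_tokens": 800},
}


def deep_merge(defaults: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of defaults with overrides applied recursively."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def get_chat_params_sketch(agents_cfg: Any) -> Dict[str, Any]:
    """Resolve capabilities.chat, tolerating a missing or malformed block."""
    capabilities = agents_cfg.get("capabilities") if isinstance(agents_cfg, dict) else None
    chat = capabilities.get("chat") if isinstance(capabilities, dict) else None
    return deep_merge(DEFAULT_CHAT_PARAMS_SKETCH, chat if isinstance(chat, dict) else {})


# A legacy config without a chat block resolves to the defaults unchanged.
legacy = get_chat_params_sketch({"capabilities": {"solve": {"max_tokens": 4000}}})
# A partial override changes one stage and leaves the rest at their defaults.
partial = get_chat_params_sketch(
    {"capabilities": {"chat": {"responding": {"max_tokens": 12000}}}}
)
```

Because the merge copies the defaults before applying overrides, callers can mutate the returned dict without affecting later lookups.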