
feat: chat model factory defaults + omit strict_mode=False from payload#808

Merged
cosminacho merged 6 commits into main from feat/chat-model-factory-defaults on Apr 24, 2026

Conversation

Contributor

cosminacho commented Apr 24, 2026

Summary

Two related fixes to keep the request payload sent to the LLM gateway predictable and free of accidental defaults.

1. Explicit defaults in get_chat_model

Stop falling through to whatever the vendor library happens to default to (which varies wildly — Gemini ran out of reasoning tokens on a 3-word prompt because `max_tokens` wasn't forwarded). Restore the historical knobs from `UiPathRequestMixin`:

| param | old | new | escape hatch |
| --- | --- | --- | --- |
| `max_tokens` | `_UNSET` (not forwarded) | `1000` (historical default from 634d9fe "feat: add chat models") | pass `None` to forward an explicit unset |
| `temperature` | `_UNSET` (not forwarded) | `0.0` | pass `None` to omit |
| `max_retries` | `_UNSET` (not forwarded) | `3` | |

Also cleaned up a dead-code fallback in the legacy dispatch path that mapped `None`/`_UNSET` back to `0` for `max_tokens` and `0.0` for `temperature`; those remaps silently overrode an explicit "unset" from the caller. Now `None` is forwarded straight through to the legacy leaf clients, which already handle it correctly via `if temperature is not None: ...` guards.

2. Don't send strict_mode=False to the API

Three handlers (OpenAI, Anthropic, Fireworks) used `if strict_mode is not None` to gate the `strict` kwarg. Since `AgentConfig.strict_mode` defaults to `False`, every request was being sent with `strict=False` — leaking an opt-in flag as if it had been set explicitly. Switched to `if strict_mode is True` so only an explicit opt-in reaches the wire. This matches the existing behavior of BedrockConverse and Gemini.

Bookkeeping

  • pyproject.toml: 0.10.4 → 0.10.5
  • uv.lock refreshed

Test plan

  • uv run pytest tests/chat/test_chat_model_factory.py — 25 pass, 4 skipped (vertex/bedrock extras not installed locally)
  • uv run pytest tests/chat/handlers/test_tool_binding_kwargs.py tests/chat/test_bedrock_payload_handler.py — 70 pass
  • just lint — clean
  • CI

Generated with Claude Code

cosminacho and others added 2 commits April 24, 2026 14:28
…del factory

Restores the historical max_tokens=1000 default from UiPathRequestMixin and
adds temperature=0.0 and max_retries=3 defaults so callers no longer get
underlying-library defaults (which can be wildly inconsistent across vendors,
e.g. Gemini running out of reasoning tokens on small prompts).

Also fixes a dead-code fallback in the legacy path that mapped an unset
max_tokens to 0 instead of a usable default.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per review feedback: max_tokens=1000 was too prescriptive — leave it unset
by default so the underlying client picks the right limit per model.
temperature=0.0 and max_retries=3 defaults retained.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cosminacho and others added 3 commits April 24, 2026 14:46
…ough

Default goes back to 1000 (the historical UiPathRequestMixin value); the
type annotation already allows None, so callers can opt out of the limit
explicitly. Also drop the legacy-path None->0 remap so an explicit None
is forwarded to the legacy leaf clients (which accept None).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The temperature default is now 0.0 so _UNSET is unreachable, and the
legacy leaf creators already skip the temperature kwarg when it's None
(see _legacy/chat_model_factory.py:54). Forwarding directly preserves
caller intent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three handlers (OpenAI, Anthropic, Fireworks) used 'is not None' to gate
the strict kwarg, which means strict_mode=False (the AgentConfig default)
was being forwarded as 'strict=False' to the API. Switch to 'is True' so
only an explicit opt-in is sent — matches BedrockConverse and Gemini
which already behave this way.

Updated the two tests that codified the old behavior.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cosminacho changed the title from "feat: set default max_tokens, temperature, max_retries in chat model factory" to "feat: chat model factory defaults + omit strict_mode=False from payload" on Apr 24, 2026
Followup to forwarding None through to the legacy dispatch: update
get_chat_model and the _create_*_llm helpers to accept temperature: float | None
and max_tokens: int | None. The implementations already handled None
correctly via 'if temperature is not None' guards in the leaf creators
and direct passthrough for max_tokens (whose underlying chat models
all accept None).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cosminacho merged commit bbf2e25 into main on Apr 24, 2026
71 of 73 checks passed
cosminacho deleted the feat/chat-model-factory-defaults branch on April 24, 2026 12:12