Conversation

@Qard (Contributor) commented Jan 12, 2026

This makes max_tokens configurable, and makes both it and temperature fall back to the model-provided defaults when not set.

Fixes #149

@Qard requested review from ankrgyl and ibolmo January 12, 2026 22:55
@Qard self-assigned this Jan 12, 2026

github-actions bot commented Jan 12, 2026

Braintrust eval report

Autoevals (parameter-flexibility-1768263664)

| Score | Average | Improvements | Regressions |
| --- | --- | --- | --- |
| NumericDiff | 72.8% (-1pp) | - | 2 🔴 |
| Time_to_first_token | 1.34tok (-0.12tok) | 112 🟢 | 7 🔴 |
| Llm_calls | 1.55 (+0) | - | - |
| Tool_calls | 0 (+0) | - | - |
| Errors | 0 (+0) | - | - |
| Llm_errors | 0 (+0) | - | - |
| Tool_errors | 0 (+0) | - | - |
| Prompt_tokens | 279.25tok (+0tok) | - | - |
| Prompt_cached_tokens | 0tok (+0tok) | - | - |
| Prompt_cache_creation_tokens | 0tok (+0tok) | - | - |
| Completion_tokens | 19.3tok (+0tok) | - | - |
| Completion_reasoning_tokens | 0tok (+0tok) | - | - |
| Total_tokens | 298.54tok (+0tok) | - | - |
| Estimated_cost | 0$ (+0$) | - | - |
| Duration | 2.51s (+1s) | 114 🟢 | 105 🔴 |
| Llm_duration | 2.78s (-0.25s) | 114 🟢 | 5 🔴 |

github-actions bot commented:
Braintrust eval report

Autoevals (parameter-flexibility-1768258553)

| Score | Average | Improvements | Regressions |
| --- | --- | --- | --- |
| NumericDiff | 71.6% (-2pp) | 1 🟢 | 7 🔴 |
| Time_to_first_token | 1.38tok (+0.05tok) | 40 🟢 | 79 🔴 |
| Llm_calls | 1.55 (+0) | - | - |
| Tool_calls | 0 (+0) | - | - |
| Errors | 0 (+0) | - | - |
| Llm_errors | 0 (+0) | - | - |
| Tool_errors | 0 (+0) | - | - |
| Prompt_tokens | 279.25tok (+0tok) | - | - |
| Prompt_cached_tokens | 0tok (+0tok) | - | - |
| Prompt_cache_creation_tokens | 0tok (+0tok) | - | - |
| Completion_tokens | 19.3tok (+0tok) | - | - |
| Completion_reasoning_tokens | 0tok (+0tok) | - | - |
| Total_tokens | 298.54tok (+0tok) | - | - |
| Estimated_cost | 0$ (+0$) | - | - |
| Duration | 2.77s (+1.37s) | 17 🟢 | 201 🔴 |
| Llm_duration | 2.87s (+0.08s) | 28 🟢 | 89 🔴 |

@ibolmo (Collaborator) left a comment:
Quick tests would be good, but looks reasonable.

@Qard force-pushed the parameter-flexibility branch from 89c7134 to 19fceba, January 13, 2026 00:20, with the commit message:

> This makes max_tokens configurable and makes both it and
> temperature fallback to model-provided defaults otherwise.
@Qard requested a review from ibolmo January 13, 2026 00:33
@ibolmo (Collaborator) left a comment:
ty!

@Qard merged commit df6af22 into main Jan 13, 2026 (7 checks passed)
@Qard deleted the parameter-flexibility branch January 13, 2026 17:04

github-actions bot commented Jan 13, 2026

Braintrust eval report

Autoevals (main-1768323886)

| Score | Average | Improvements | Regressions |
| --- | --- | --- | --- |
| NumericDiff | 72.8% (-1pp) | - | 2 🔴 |
| Time_to_first_token | 1.38tok (-0.05tok) | 88 🟢 | 28 🔴 |
| Llm_calls | 1.55 (+0) | - | - |
| Tool_calls | 0 (+0) | - | - |
| Errors | 0 (+0) | - | - |
| Llm_errors | 0 (+0) | - | - |
| Tool_errors | 0 (+0) | - | - |
| Prompt_tokens | 279.25tok (+0tok) | - | - |
| Prompt_cached_tokens | 0tok (+0tok) | - | - |
| Prompt_cache_creation_tokens | 0tok (+0tok) | - | - |
| Completion_tokens | 19.3tok (+0tok) | - | - |
| Completion_reasoning_tokens | 0tok (+0tok) | - | - |
| Total_tokens | 298.54tok (+0tok) | - | - |
| Estimated_cost | 0$ (+0$) | - | - |
| Duration | 3.56s (+2.12s) | 96 🟢 | 123 🔴 |
| Llm_duration | 2.8s (-0.12s) | 95 🟢 | 23 🔴 |


Linked issue (closed by this PR): Factuality hits json parse errors caused by exceeding token limit
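The linked issue's failure mode can be demonstrated in a few lines. This is an assumed mechanism consistent with the issue title, not code from the repository: when a hard `max_tokens` cap truncates a model's JSON output mid-document, the subsequent parse fails.

```python
import json

# A JSON rationale cut off mid-string, as a hard max_tokens cap would
# truncate it (the payload shape is illustrative, not the real output).
truncated = '{"Factuality": {"score": 0.6, "rationale": "The answ'

try:
    json.loads(truncated)
    parsed = True
except json.JSONDecodeError:
    parsed = False
# parsed is False: the truncated output cannot be parsed as JSON
```

Letting the model's default token limit apply, as this PR does, avoids truncating the scorer's JSON output in the first place.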
