fix(experiments): apply generation overrides to subject provider before evaluation by bug-ops · Pull Request #1427 · bug-ops/zeph

bug-ops · 2026-03-09T06:05:09Z

Summary

Fixes #1407 (Experiment engine: variations not applied to subject provider): the experiment engine was evaluating every variation with the same unmodified subject provider, producing delta=0.0 for each one
Adds GenerationOverrides struct to zeph-llm and with_generation_overrides() builder to all provider types and AnyProvider
Engine now creates a cloned+patched provider per variation before calling evaluator.evaluate()
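
To illustrate the fix, here is a minimal sketch of the clone-and-patch pattern. It is not the code from this PR: `MockProvider`, `effective_temperature`, `evaluate`, and `run_variations` are hypothetical stand-ins, and the field types of `GenerationOverrides` are assumed (the PR lists only the field names).

```rust
/// Assumed shape: the field names come from the PR, the types are guesses.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct GenerationOverrides {
    pub temperature: Option<f64>,
    pub top_p: Option<f64>,
    pub top_k: Option<u32>,
    pub frequency_penalty: Option<f64>,
    pub presence_penalty: Option<f64>,
}

/// Hypothetical provider that only tracks its sampling temperature.
#[derive(Debug, Clone)]
pub struct MockProvider {
    default_temperature: f64,
    generation_overrides: Option<GenerationOverrides>,
}

impl MockProvider {
    pub fn new(default_temperature: f64) -> Self {
        Self { default_temperature, generation_overrides: None }
    }

    /// Builder: consumes the provider and returns it with overrides attached.
    pub fn with_generation_overrides(mut self, overrides: GenerationOverrides) -> Self {
        self.generation_overrides = Some(overrides);
        self
    }

    /// Temperature a request would use: the override if set, else the default.
    pub fn effective_temperature(&self) -> f64 {
        self.generation_overrides
            .as_ref()
            .and_then(|o| o.temperature)
            .unwrap_or(self.default_temperature)
    }
}

/// Toy evaluator (hypothetical): the score depends only on the temperature.
pub fn evaluate(provider: &MockProvider) -> f64 {
    1.0 - provider.effective_temperature()
}

/// Scores each variation on a cloned, patched provider and returns the
/// delta against the unmodified subject. The subject itself is not mutated.
pub fn run_variations(subject: &MockProvider, variations: &[GenerationOverrides]) -> Vec<f64> {
    let baseline = evaluate(subject);
    variations
        .iter()
        .map(|v| {
            let patched = subject.clone().with_generation_overrides(v.clone());
            evaluate(&patched) - baseline
        })
        .collect()
}
```

Passing `subject` directly to `evaluate` inside the closure, instead of `patched`, reproduces the reported bug: every delta is 0.0 regardless of the variation.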

Changes

zeph-llm/src/provider.rs: new GenerationOverrides struct (temperature, top_p, top_k, frequency_penalty, presence_penalty)
zeph-llm/src/{ollama,claude,openai,compatible,mock}.rs: generation_overrides field + with_generation_overrides() builder; overrides applied in chat(), chat_stream(), chat_with_tools(), chat_typed() paths
zeph-llm/src/any.rs: with_generation_overrides() dispatches to all variants; Orchestrator/Router warn and return self unchanged
zeph-core/src/experiments/snapshot.rs: removes local GenerationOverrides, re-exports from zeph_llm
zeph-core/src/experiments/engine.rs: line 249 creates patched provider with overrides before each evaluator.evaluate() call; removes stale TODO comment
Claude temperature field corrected from Option<u32> (thousandths) to Option<f64> across all request structs
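
The dispatch behaviour described for zeph-llm/src/any.rs can be sketched roughly as follows. This is an assumption-laden illustration, not the actual implementation: `LeafProvider`, `OrchestratorProvider`, and the `overrides()` accessor are hypothetical, the enum is trimmed to three variants, `GenerationOverrides` is reduced to one field, and `eprintln!` stands in for whatever logging macro the crate uses.

```rust
/// Trimmed to a single field for brevity (type assumed).
#[derive(Debug, Clone, Default, PartialEq)]
pub struct GenerationOverrides {
    pub temperature: Option<f64>,
}

/// Hypothetical single-backend provider that can carry overrides.
#[derive(Debug, Clone, Default)]
pub struct LeafProvider {
    pub generation_overrides: Option<GenerationOverrides>,
}

impl LeafProvider {
    pub fn with_generation_overrides(mut self, overrides: GenerationOverrides) -> Self {
        self.generation_overrides = Some(overrides);
        self
    }
}

/// Hypothetical composite provider with no single set of generation params.
#[derive(Debug, Clone, Default)]
pub struct OrchestratorProvider;

#[derive(Debug, Clone)]
pub enum AnyProvider {
    Ollama(LeafProvider),
    Claude(LeafProvider),
    Orchestrator(OrchestratorProvider),
}

impl AnyProvider {
    /// Forwards the overrides to leaf variants; composite variants emit a
    /// warning and are returned unchanged.
    pub fn with_generation_overrides(self, overrides: GenerationOverrides) -> Self {
        match self {
            Self::Ollama(p) => Self::Ollama(p.with_generation_overrides(overrides)),
            Self::Claude(p) => Self::Claude(p.with_generation_overrides(overrides)),
            Self::Orchestrator(p) => {
                eprintln!("warning: generation overrides ignored for Orchestrator");
                Self::Orchestrator(p)
            }
        }
    }

    /// Accessor used here only to observe the result (hypothetical).
    pub fn overrides(&self) -> Option<&GenerationOverrides> {
        match self {
            Self::Ollama(p) | Self::Claude(p) => p.generation_overrides.as_ref(),
            Self::Orchestrator(_) => None,
        }
    }
}
```

One consequence of this design is that an experiment whose subject is a composite provider would still see identical scores across variations, which is presumably why the PR warns in that case rather than failing silently.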

Test plan

cargo +nightly fmt --check — clean
cargo clippy --workspace --features full -- -D warnings — clean
cargo nextest run --config-file .github/nextest.toml --workspace --features full --lib --bins — 4860 passed (+3 new unit tests for with_generation_overrides state propagation)

…re evaluation (#1407)

The experiment engine was calling evaluator.evaluate(&self.subject) with the same unmodified provider on every iteration, producing delta=0.0 for all variations.

Add GenerationOverrides to zeph-llm and implement with_generation_overrides() on all provider types (Ollama, Claude, OpenAI, Compatible, Mock) and AnyProvider. Apply overrides in chat(), chat_stream(), chat_with_tools(), and chat_typed() paths. Remove local GenerationOverrides from snapshot.rs and re-export from zeph-llm. Engine now creates a cloned+patched provider per variation before evaluation.

- Claude temperature field changed from Option<u32> to Option<f64>
- Ollama: frequency_penalty/presence_penalty silently dropped (unsupported by ModelOptions; documented with comment and debug log)
- Orchestrator/Router variants warn and return self unchanged
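
The Ollama note above can be pictured with the following sketch. It assumes a simplified, hypothetical `ModelOptions` with only three fields and a hypothetical helper `apply_to_ollama`. The real provider logs dropped fields at debug level, while this version returns their names so the behaviour is easy to check.

```rust
/// Assumed shape: the field names come from the PR, the types are guesses.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct GenerationOverrides {
    pub temperature: Option<f64>,
    pub top_p: Option<f64>,
    pub top_k: Option<u32>,
    pub frequency_penalty: Option<f64>,
    pub presence_penalty: Option<f64>,
}

/// Hypothetical stand-in for the Ollama options type. It has no penalty
/// fields, matching the limitation described in the commit message.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct ModelOptions {
    pub temperature: Option<f64>,
    pub top_p: Option<f64>,
    pub top_k: Option<u32>,
}

/// Merges overrides into the base options. Supported fields replace the base
/// value when set; unsupported fields are reported back instead of applied.
pub fn apply_to_ollama(
    base: ModelOptions,
    overrides: &GenerationOverrides,
) -> (ModelOptions, Vec<&'static str>) {
    let mut dropped = Vec::new();
    if overrides.frequency_penalty.is_some() {
        dropped.push("frequency_penalty");
    }
    if overrides.presence_penalty.is_some() {
        dropped.push("presence_penalty");
    }
    let patched = ModelOptions {
        temperature: overrides.temperature.or(base.temperature),
        top_p: overrides.top_p.or(base.top_p),
        top_k: overrides.top_k.or(base.top_k),
    };
    (patched, dropped)
}
```

A variation that changes only frequency_penalty or presence_penalty would therefore leave an Ollama subject's requests unchanged under this scheme.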

bug-ops added 2 commits March 9, 2026 07:04

merge: sync with main

a474526

github-actions bot added the bug, documentation, llm, size/L, rust, and core labels and removed the bug label Mar 9, 2026

bug-ops enabled auto-merge (squash) March 9, 2026 06:05

merge: sync with main

ada37e9

github-actions bot added the bug label Mar 9, 2026

bug-ops merged commit 2b6eb38 into main Mar 9, 2026
25 checks passed

bug-ops deleted the experiment-engine-variation branch March 9, 2026 06:26

bug-ops merged 3 commits into main from experiment-engine-variation
