feat: add temperature parameter to LLMClient interface #382
Merged
Add optional `temperature` field to `LLMClient.prompt()` opts and set `temperature: 0` in all worker call sites (distillation, curation, pattern echo, query expansion, import extraction, compaction) to eliminate eval variance caused by non-deterministic LLM sampling. Thread temperature through both Anthropic and OpenAI request builders in the gateway adapter, through the batch queue params (including the OpenAI batch JSONL body), and through the batch fallback path. Also fix batch fallback to forward maxTokens and workerID from the queued request — previously these were silently dropped on fallback. Closes #381
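As a rough illustration of the plumbing described above, the sketch below shows one way an optional `temperature` could be threaded into Anthropic-style and OpenAI-style request bodies. The names `PromptOpts`, `buildAnthropicBody`, `buildOpenAIBody`, and the default of 1024 tokens are hypothetical and are not taken from this PR. Only the wire field names (`max_tokens`, `temperature`, `messages`) follow the public provider APIs.

```typescript
// Hypothetical option shape for illustration; the real one lives in packages/core/src/types.ts.
interface PromptOpts {
  maxTokens?: number;
  /** Sampling temperature. When omitted, the provider default applies. */
  temperature?: number;
}

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface AnthropicBody {
  model: string;
  max_tokens: number;
  messages: ChatMessage[];
  temperature?: number;
}

interface OpenAIBody {
  model: string;
  messages: ChatMessage[];
  max_tokens?: number;
  temperature?: number;
}

// Anthropic's Messages API requires max_tokens, so a default is assumed here.
function buildAnthropicBody(model: string, user: string, opts: PromptOpts = {}): AnthropicBody {
  const body: AnthropicBody = {
    model,
    max_tokens: opts.maxTokens ?? 1024, // assumed default, not from the PR
    messages: [{ role: "user", content: user }],
  };
  // Compare against undefined: a truthiness check would silently drop temperature 0.
  if (opts.temperature !== undefined) body.temperature = opts.temperature;
  return body;
}

function buildOpenAIBody(model: string, user: string, opts: PromptOpts = {}): OpenAIBody {
  const body: OpenAIBody = {
    model,
    messages: [{ role: "user", content: user }],
  };
  if (opts.maxTokens !== undefined) body.max_tokens = opts.maxTokens;
  if (opts.temperature !== undefined) body.temperature = opts.temperature;
  return body;
}
```

The explicit `!== undefined` check matters because the value this PR sets everywhere is `0`, which is falsy in JavaScript.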
Summary
Add optional `temperature` field to `LLMClient.prompt()` opts and set `temperature: 0` in all worker call sites to eliminate eval variance caused by non-deterministic LLM sampling.

Changes
Interface (`packages/core/src/types.ts`)
- Add `temperature?: number` to the `LLMClient.prompt()` opts, with JSDoc documenting adapter behavior

Gateway plumbing (`packages/gateway/src/`)
- `llm-adapter.ts`: Thread `temperature` through `buildAnthropicWorkerRequest()`, `buildOpenAIWorkerRequest()`, and `createGatewayLLMClient.prompt()`
- `batch-queue.ts`:
  - Add `temperature` to `PendingRequest.params` type
  - Include `temperature` in queued batch request params
  - Include `temperature` in OpenAI batch JSONL body (previously only Anthropic batch forwarded the full params)
  - Forward `temperature`, `maxTokens`, and `workerID` from the queued request on batch fallback (previously silently dropped on fallback)

Worker call sites (all set `temperature: 0`)
- `packages/core/src/distillation.ts`: `distillSegment()`, `metaDistillInner()`
- `packages/core/src/curator.ts`: `runInner()`, `consolidate()`
- `packages/core/src/pattern-echo.ts`: `_detect()`
- `packages/core/src/search.ts`: `expandQuery()` (with comment noting diversity trade-off)
- `packages/core/src/import/extract.ts`: `extractKnowledge()`
- `packages/gateway/src/pipeline.ts`: compaction worker

Impact
Eval PR-2 (implicit preferences) ranged from 3.33 to 4.63 across runs with identical code due to non-deterministic distillation output. Temperature=0 makes scores reproducible and enables meaningful A/B testing of prompt changes.
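For the batch-queue changes listed under Changes, the following is a minimal sketch of the two behaviors involved: writing `temperature` into an OpenAI batch JSONL line, and carrying the queued options over when a batch falls back to a direct call. `QueuedParams`, `toOpenAIBatchLine`, and `buildFallbackOpts` are hypothetical names, not the PR's actual code. The JSONL envelope (`custom_id`, `method`, `url`, `body`) follows the OpenAI Batch API input format.

```typescript
// Hypothetical queued-request shape; the PR's real type is PendingRequest.params in batch-queue.ts.
interface QueuedParams {
  model: string;
  system: string;
  user: string;
  maxTokens?: number;
  temperature?: number;
  workerID?: string;
}

// Builds one line of an OpenAI Batch input file for the chat completions endpoint.
function toOpenAIBatchLine(customId: string, p: QueuedParams): string {
  const body: Record<string, unknown> = {
    model: p.model,
    messages: [
      { role: "system", content: p.system },
      { role: "user", content: p.user },
    ],
  };
  if (p.maxTokens !== undefined) body.max_tokens = p.maxTokens;
  // Explicit undefined check so that temperature 0 reaches the batch body.
  if (p.temperature !== undefined) body.temperature = p.temperature;
  return JSON.stringify({
    custom_id: customId,
    method: "POST",
    url: "/v1/chat/completions",
    body,
  });
}

// Options for the direct (non-batch) call made on fallback. Copying every queued
// field is the point: dropping any of them changes how the fallback call behaves.
function buildFallbackOpts(p: QueuedParams): {
  maxTokens?: number;
  temperature?: number;
  workerID?: string;
} {
  return {
    maxTokens: p.maxTokens,
    temperature: p.temperature,
    workerID: p.workerID,
  };
}
```

With a helper like this, a request that was queued with `temperature: 0` keeps the same sampling settings whether it runs through the batch or through the fallback path.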
Verification
- `bun run typecheck`: passes all 4 packages
- `bun test`: 1608 tests pass, 0 failures

Closes #381