Add customizable prompts and per-scenario LLM config overrides (#52) #53

Merged
richardkiene merged 1 commit into main from feature/customizable-prompts-52
Jan 23, 2026

@richardkiene richardkiene commented Jan 23, 2026

Summary

Add the ability to customize judge and synthetic user behavior through:

  • extra_instructions that augment the default prompts
  • Per-scenario LLM configuration overrides
  • Separate LLM configs for judge and synthetic_user

Configuration Priority (highest to lowest)

  1. CLI arguments
  2. Scenario-level config (from scenario YAML)
  3. Component-specific config (judge:, synthetic_user:)
  4. Shared LLM config (llm:)
  5. Defaults

Note: extra_instructions is the exception to this priority order. Values from all levels are appended together rather than the highest-priority one replacing the rest.
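
To make the two rules concrete, here is a minimal sketch of how such a resolution could work. It is not the project's resolve_llm_config: the LLMLayer class and resolve function are hypothetical names, and the order in which instructions are joined (lowest priority first) is an assumption, since the PR only states that they are appended.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LLMLayer:
    """One configuration level (hypothetical); None means "not set here"."""

    provider: Optional[str] = None
    model: Optional[str] = None
    extra_instructions: Optional[str] = None


def resolve(layers_high_to_low: list[LLMLayer]) -> LLMLayer:
    """Resolve layers ordered from highest to lowest priority."""

    def first_set(field: str) -> Optional[str]:
        # Scalar fields: the highest-priority level that sets a value wins.
        for layer in layers_high_to_low:
            value = getattr(layer, field)
            if value is not None:
                return value
        return None

    # extra_instructions: collected from every level instead of replaced.
    # Assumed order: lowest priority first, so broad guidance precedes
    # scenario-specific guidance.
    parts = [
        layer.extra_instructions.strip()
        for layer in reversed(layers_high_to_low)
        if layer.extra_instructions
    ]
    return LLMLayer(
        provider=first_set("provider"),
        model=first_set("model"),
        extra_instructions="\n".join(parts) or None,
    )


cli = LLMLayer()  # nothing passed on the command line
scenario = LLMLayer(model="gpt-4o", extra_instructions="Require exact figures.")
component = LLMLayer(model="gpt-4o-mini", extra_instructions="Be lenient about formatting.")
shared = LLMLayer(provider="openai", model="gpt-4o")
defaults = LLMLayer()

resolved = resolve([cli, scenario, component, shared, defaults])
print(resolved.provider, resolved.model)
print(resolved.extra_instructions)
```

In this sketch the scenario's model overrides the component's, the provider falls through to the shared level, and both sets of instructions survive.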

Example Usage

Global Config (mcprobe.yaml)

llm:
  provider: openai
  model: gpt-4o

judge:
  model: gpt-4o-mini  # Use cheaper model for judging
  extra_instructions: |
    Be lenient about formatting differences.
    Consider partial credit for directionally correct answers.

synthetic_user:
  model: gpt-4o
  extra_instructions: |
    Push back firmly on vague answers.

Per-Scenario Override

name: Complex Financial Analysis
description: Tests multi-step calculations

config:
  judge:
    model: gpt-4o  # Need smarter model for this one
    extra_instructions: |
      This scenario requires exact numerical precision.
  synthetic_user:
    extra_instructions: |
      You are a CFO who expects exact figures.

synthetic_user:
  persona: A CFO reviewing quarterly reports
  initial_query: What was our Q3 revenue growth?

evaluation:
  correctness_criteria:
    - Provides exact revenue figures

Changes

  • Added extra_instructions field to LLMConfig model
  • Added ScenarioLLMOverride and ScenarioConfig to scenario model
  • Added optional config section to TestScenario
  • Updated resolve_llm_config to handle scenario-level overrides
  • Updated prompt builders to append extra_instructions
  • Updated pytest plugin, MCP server, and CLI to pass scenario configs
  • Added LLMDefaults dataclass to avoid too-many-arguments lint issue
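
As an illustration of the prompt-builder change, the sketch below appends extra instructions to a default prompt without replacing it. The build_prompt function, the "Additional instructions:" heading, and the sample base prompt are all assumptions for illustration; the PR does not show how the actual prompt builders format the appended text.

```python
from typing import Optional


def build_prompt(default_prompt: str, extra_instructions: Optional[str] = None) -> str:
    """Append extra instructions to the default prompt; never replace it."""
    # Missing or whitespace-only instructions leave the default untouched.
    if not extra_instructions or not extra_instructions.strip():
        return default_prompt
    return (
        f"{default_prompt.rstrip()}\n\n"
        "Additional instructions:\n"  # assumed heading, not from the PR
        f"{extra_instructions.strip()}"
    )


base = "You are an impartial judge. Score the assistant's answer."
prompt = build_prompt(base, "Be lenient about formatting differences.\n")
print(prompt)
```

Keeping the default prompt intact means a misconfigured extra_instructions value can add noise but cannot remove the judge's or synthetic user's core behavior.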

Test plan

  • All 255 unit tests pass
  • Ruff linting passes
  • Mypy type checking passes
  • Manual testing with scenario overrides

Closes #52

Add the ability to customize judge and synthetic user behavior through:
- extra_instructions that augment the default prompts
- Per-scenario LLM configuration overrides
- Separate LLM configs for judge and synthetic_user (already existed but
  now properly documented and supported with scenario-level overrides)

Configuration priority (highest to lowest):
1. CLI arguments
2. Scenario-level config (from scenario YAML)
3. Component-specific config (judge:, synthetic_user:)
4. Shared LLM config (llm:)
5. Defaults

Extra instructions are appended from all levels (not replaced).

Example global config (mcprobe.yaml):
  judge:
    model: gpt-4o-mini
    extra_instructions: |
      Be lenient about formatting differences.

Example per-scenario override:
  config:
    judge:
      model: gpt-4o
      extra_instructions: |
        This scenario requires exact precision.

@richardkiene richardkiene merged commit b44a02a into main Jan 23, 2026
3 checks passed
@richardkiene richardkiene deleted the feature/customizable-prompts-52 branch January 23, 2026 21:41
richardkiene added a commit that referenced this pull request Jan 23, 2026
Adds documentation for features from PR #53:

- Add extra_instructions field to LLMConfig table in configuration reference
- Document extra_instructions usage with examples
- Add config section to scenario format schema
- Document ScenarioConfig and ScenarioLLMOverride fields
- Add per-scenario configuration examples
- Update complete example to include config section