Evaluation Config + Bundles #7

@placerda

Objective

Define and implement the Evaluation Bundle abstraction that standardizes how Foundry evaluators are configured and grouped.

Scope

Introduce reusable evaluation bundles that map to common agent patterns such as RAG agents and tool-using agents.

Tasks

  • Define EvaluationBundle model
  • Create bundle registry mechanism
  • Implement two initial bundles:
    • rag_baseline
    • tool_agent_baseline
  • Bundle configuration must support:
    • Evaluator list
    • Threshold definitions
    • Dataset reference
  • Implement bundle resolution logic from YAML config
  • Add bundle validation logic

Acceptance Criteria

  • YAML config can reference a bundle by name
  • Bundle loads correct evaluator configuration
  • Thresholds are applied correctly
  • Bundle system is extensible without modifying core CLI logic
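
One way to exercise these criteria is sketched below, under several assumptions: the YAML shape (an `evaluation.bundle` key) is hypothetical, the config is shown already parsed into a dict (for example by a YAML loader), bundles are plain dicts to keep the example self-contained, and a score is assumed to pass when it is greater than or equal to its threshold. The `resolve_bundle` and `check_thresholds` names are illustrative only.

```python
# Hypothetical YAML, shown below in its parsed form:
#
#   evaluation:
#     bundle: rag_baseline
#
config = {"evaluation": {"bundle": "rag_baseline"}}

# Bundle definitions keyed by name. Evaluator names, thresholds, and
# the dataset path are placeholders.
BUNDLES = {
    "rag_baseline": {
        "evaluators": ["groundedness", "relevance"],
        "thresholds": {"groundedness": 3.0, "relevance": 3.0},
        "dataset": "datasets/rag_baseline.jsonl",
    },
}


def resolve_bundle(config: dict, bundles: dict) -> dict:
    """Return the bundle definition that the config references by name."""
    name = config["evaluation"]["bundle"]
    if name not in bundles:
        raise KeyError(
            f"unknown bundle {name!r}; known bundles: {sorted(bundles)}"
        )
    return {"name": name, **bundles[name]}


def check_thresholds(scores: dict, thresholds: dict) -> dict:
    """Map each thresholded evaluator to True if its score passes.

    Assumes higher is better; a missing score counts as a failure.
    """
    return {
        evaluator: scores.get(evaluator, float("-inf")) >= minimum
        for evaluator, minimum in thresholds.items()
    }


resolved = resolve_bundle(config, BUNDLES)
results = check_thresholds(
    {"groundedness": 3.5, "relevance": 2.0},
    resolved["thresholds"],
)
print(results)
```

Because `resolve_bundle` only looks names up in the mapping it is given, adding a new bundle means adding an entry to that mapping and leaves the resolution code untouched, which is the kind of extensibility the last criterion asks for.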

Out of Scope

  • Observability integration
  • Telemetry export
  • Dashboard generation
