agentv eval run skips valid eval when message content array mixes string and file blocks #1034

@christso

Summary

agentv validate accepts an eval whose message content array contains a mix of:

  • a plain string block
  • a { type: file, value: ... } block

But agentv eval run then skips the same test as "incomplete":

Error: Skipping incomplete test: red-inline-string-block. Missing required fields: id, input, and at least one of criteria/expected_output/assertions
Error: No tests matched the provided filters.

This looks like a parser / validator mismatch.

Version

agentv --version => 4.12.7

Expected behavior

One of these should happen consistently:

  1. agentv eval run should accept the same message content shape that agentv validate accepts, or
  2. agentv validate should reject that shape up front.

Right now the file validates as correct but cannot be executed.

Minimal repro

Green repro

This validates and runs:

description: green repro with structured content blocks

tests:
  - id: green-file-and-text-blocks
    criteria: Returns a short answer
    input:
      - role: user
        content:
          - type: text
            value: Use the local file.
          - type: file
            value: /README.md

Commands:

agentv validate .tmp/agentv-repro/green-file-and-text-blocks.eval.yaml
agentv eval run .tmp/agentv-repro/green-file-and-text-blocks.eval.yaml --test-id green-file-and-text-blocks --dry-run

Observed:

  • validate passes
  • eval run executes the test normally

Red repro

This also validates, but eval run skips it as incomplete:

description: red repro with inline string block in content array

tests:
  - id: red-inline-string-block
    criteria: Returns a short answer
    input:
      - role: user
        content:
          - |-
            Use the local file.
          - type: file
            value: /README.md

Commands:

agentv validate .tmp/agentv-repro/red-inline-string-block.eval.yaml
agentv eval run .tmp/agentv-repro/red-inline-string-block.eval.yaml --test-id red-inline-string-block --dry-run

Observed:

  • validate passes
  • eval run fails with:
Error: Skipping incomplete test: red-inline-string-block. Missing required fields: id, input, and at least one of criteria/expected_output/assertions
Error: No tests matched the provided filters.

Why this looks like a bug

The validator appears to allow string items inside a message content array, but the loader path used by eval run appears to reject them.

Relevant source behavior:

  • validateMessages(...) allows content array items that are either strings or objects.
  • isTestMessage(...) appears to accept content arrays only when every item is an object.
  • expandInputShorthand(...) uses isTestMessage(...) when loading test message arrays.

So the likely mismatch is:

  • validator: accepts content: ["text", {type: file, value: ...}]
  • loader/runtime: rejects the same message as not a valid TestMessage
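The suspected mismatch can be sketched as two checks that disagree on the same input. This is a hypothetical reconstruction, not the actual agentv source: `lenientContentCheck` and `strictContentCheck` are made-up stand-ins for the behavior attributed above to validateMessages(...) and isTestMessage(...), and the real functions may differ in detail.

```typescript
// Hypothetical sketch of the suspected mismatch; not the real agentv code.

function isPlainObject(x: unknown): x is Record<string, unknown> {
  return typeof x === "object" && x !== null && !Array.isArray(x);
}

// Stand-in for the validator path: string items and object items both pass.
function lenientContentCheck(content: unknown): boolean {
  return (
    Array.isArray(content) &&
    content.every((item) => typeof item === "string" || isPlainObject(item))
  );
}

// Stand-in for the loader path: only object items pass.
function strictContentCheck(content: unknown): boolean {
  return Array.isArray(content) && content.every((item) => isPlainObject(item));
}

// Content from the red repro: a plain string followed by a file block.
const redContent: unknown = [
  "Use the local file.",
  { type: "file", value: "/README.md" },
];

// Content from the green repro: every item is an object.
const greenContent: unknown = [
  { type: "text", value: "Use the local file." },
  { type: "file", value: "/README.md" },
];

console.log(lenientContentCheck(redContent), strictContentCheck(redContent)); // true false
console.log(lenientContentCheck(greenContent), strictContentCheck(greenContent)); // true true
```

Under these assumptions the red content passes the lenient check but fails the strict one, which would match a file that validates but is then dropped at load time.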

Suggested fix

Please make validation and runtime consistent.

Most likely options:

  1. Relax the loader / isTestMessage(...) path to accept string items inside content arrays, since the validator already allows them.
  2. Or tighten validation so these mixed content arrays are rejected before runtime.

Given the current validator behavior and the practical usefulness of mixed string + file content arrays, option 1 seems preferable.
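A minimal sketch of what option 1 could look like, assuming the loader can treat a bare string item as equivalent to a text block. Everything here is hypothetical: `isTestMessageRelaxed`, `normalizeContent`, and the `ContentBlock` / `TestMessage` shapes are illustrative names, not agentv's actual types or API. The `{ type: "text", value }` shape is borrowed from the green repro.

```typescript
// Hypothetical sketch of option 1; names and shapes are assumptions, not agentv's API.

type ContentBlock = { type: string; value: string };
type MessageContent = string | Array<string | ContentBlock>;

interface TestMessage {
  role: string;
  content: MessageContent;
}

function isContentBlock(x: unknown): x is ContentBlock {
  if (typeof x !== "object" || x === null || Array.isArray(x)) return false;
  const rec = x as Record<string, unknown>;
  return typeof rec.type === "string" && typeof rec.value === "string";
}

// Relaxed guard: content array items may be strings or blocks.
function isTestMessageRelaxed(x: unknown): x is TestMessage {
  if (typeof x !== "object" || x === null || Array.isArray(x)) return false;
  const rec = x as Record<string, unknown>;
  if (typeof rec.role !== "string") return false;
  const content = rec.content;
  if (typeof content === "string") return true;
  return (
    Array.isArray(content) &&
    content.every((item) => typeof item === "string" || isContentBlock(item))
  );
}

// Normalize so downstream code only ever sees blocks:
// a bare string becomes a text block, as in the green repro.
function normalizeContent(content: MessageContent): ContentBlock[] {
  const items: Array<string | ContentBlock> =
    typeof content === "string" ? [content] : content;
  return items.map(
    (item): ContentBlock =>
      typeof item === "string" ? { type: "text", value: item } : item
  );
}

// The user message from the red repro.
const redMessage: unknown = {
  role: "user",
  content: ["Use the local file.", { type: "file", value: "/README.md" }],
};

if (isTestMessageRelaxed(redMessage)) {
  console.log(JSON.stringify(normalizeContent(redMessage.content)));
}
```

Normalizing at load time would keep the rest of the runtime working against a single content shape, so the red and green repros end up equivalent after loading.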

Real-world impact

This came up in a real eval file that used a user message with:

  • an inline instruction block
  • several attached files

It validates cleanly, but agentv eval run cannot execute it because the test gets dropped as incomplete.

Metadata

  • Assignees: No one assigned
  • Labels: in-progress (Claimed by an agent; do not duplicate work)
  • Type: No type
  • Project status: Done
  • Milestone: No milestone
  • Relationships: None yet
  • Development: No branches or pull requests