Validator rejects valid eval input when role is missing (short-form input) #915

@christso

Context

PR WiseTechGlobal/WTG.AI.Prompts#490 fails eval validation in CI:

✗ evals/arch-prc/functional-evidence-review.eval.yaml
  ✗ [input[0].role] Invalid role 'undefined'. Must be one of: system, user, assistant
  ✗ [input[0].content] Missing or invalid 'content' field (must be a string, array, or object)

Failed run

Root cause (CI version mismatch)

The PR branch is missing agentv from devDependencies in package.json (main has "agentv": "^4.3.4"; the PR branch does not). When CI runs bunx agentv validate and agentv is not installed locally, bunx auto-downloads a potentially different version, which causes inconsistent validation behavior.

Short-term fix: A workflow change has been prepared for WTG.AI.Prompts to support an AGENTV_VERSION repository variable, allowing the version to be pinned via Settings > Variables without a code push.

Underlying issue: validator is stricter than runtime for input arrays

The validator (eval-validator.ts validateMessages()) requires every item in an input array to be a message object with both role and content. When role is missing, it hard-errors:

// eval-validator.ts:489-498
const role = message.role;
const validRoles = ['system', 'user', 'assistant'];
if (!validRoles.includes(role as string)) {
  errors.push({
    severity: 'error',
    ...
    message: `Invalid role '${role}'. Must be one of: ${validRoles.join(', ')}`,
  });
}

But the runtime (shorthand-expansion.ts:34-37) silently drops items that don't match isTestMessage() instead of raising an error:

// shorthand-expansion.ts:34-37
if (Array.isArray(value)) {
  const messages = value.filter((msg): msg is TestMessage => isTestMessage(msg));
  return messages.length > 0 ? messages : undefined;
}

This means the validator rejects input that the runtime would accept.
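
To make the mismatch concrete, here is a minimal self-contained sketch that runs both checks against the same input. The isTestMessage() guard, the TestMessage shape, and the sample file name are assumptions for illustration; the real agentv definitions are not shown in this issue and may differ. expandInput() and roleErrors() are hypothetical names that only mirror the two snippets quoted above.

```typescript
// Hypothetical shapes for illustration only; the real agentv types may differ.
type TestMessage = { role: 'system' | 'user' | 'assistant'; content: unknown };

const VALID_ROLES = ['system', 'user', 'assistant'];

function isObject(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Assumed guard: a message is an object with a valid role and a content field.
function isTestMessage(value: unknown): value is TestMessage {
  if (!isObject(value)) return false;
  const role = value.role;
  return typeof role === 'string' && VALID_ROLES.includes(role) && 'content' in value;
}

// Mirrors the array branch quoted from shorthand-expansion.ts.
function expandInput(value: unknown): TestMessage[] | undefined {
  if (Array.isArray(value)) {
    const messages = value.filter((msg): msg is TestMessage => isTestMessage(msg));
    return messages.length > 0 ? messages : undefined;
  }
  return undefined;
}

// Mirrors the role check quoted from eval-validator.ts.
function roleErrors(input: unknown[]): string[] {
  const errors: string[] = [];
  input.forEach((message, i) => {
    const role = isObject(message) ? message.role : undefined;
    if (!VALID_ROLES.includes(role as string)) {
      errors.push(
        `[input[${i}].role] Invalid role '${role}'. Must be one of: ${VALID_ROLES.join(', ')}`,
      );
    }
  });
  return errors;
}

const input: unknown[] = [
  { type: 'file', value: 'evidence.md' }, // content object without a role (sample value)
  { role: 'user', content: 'Review the evidence.' },
];

const runtimeResult = expandInput(input); // the content object is dropped, nothing is raised
const validatorResult = roleErrors(input); // a hard error is reported for input[0]
```

Under these assumptions the runtime path returns the one well-formed message without complaint, while the validator path reports the same "Invalid role 'undefined'" error seen in the CI output.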

Proposed fix

Make the validator accept content objects (items with a type field like file, text, image) as implicit user messages in input arrays, or at minimum downgrade from error to warning. This would align the validator with the runtime's lenient handling.

Specifically, in validateMessages(), before checking role, check if the item looks like a content object rather than a message:

// If item looks like a content object ({type: "file", value: ...}), 
// treat as valid — runtime wraps these implicitly
if (isObject(message) && 'type' in message && !('role' in message)) {
  // Warn or accept silently
  continue;
}
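
For context, the check could sit in the validation loop roughly as follows. This is a sketch under stated assumptions, not the actual validateMessages() implementation: the ValidationIssue shape, the location parameter, and the warning text are hypothetical, and it takes the "downgrade to warning" option rather than accepting silently.

```typescript
// Hypothetical, simplified shapes; the real validator's types are not shown in this issue.
type Severity = 'error' | 'warning';

interface ValidationIssue {
  severity: Severity;
  location: string;
  message: string;
}

const validRoles = ['system', 'user', 'assistant'];

function isObject(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Sketch of validateMessages() with the proposed content-object check added.
function validateMessages(messages: unknown[], location: string): ValidationIssue[] {
  const issues: ValidationIssue[] = [];
  for (let i = 0; i < messages.length; i++) {
    const message = messages[i];
    const itemLocation = `${location}[${i}]`;

    if (!isObject(message)) {
      issues.push({ severity: 'error', location: itemLocation, message: 'Item must be an object' });
      continue;
    }

    // Proposed: an item with a type but no role is a content object, so warn
    // instead of failing the role check below.
    if ('type' in message && !('role' in message)) {
      issues.push({
        severity: 'warning',
        location: itemLocation,
        message: 'Content object without role; treated as implicit content',
      });
      continue;
    }

    const role = message.role;
    if (!validRoles.includes(role as string)) {
      issues.push({
        severity: 'error',
        location: `${itemLocation}.role`,
        message: `Invalid role '${role}'. Must be one of: ${validRoles.join(', ')}`,
      });
    }
  }
  return issues;
}

const issues = validateMessages(
  [
    { type: 'file', value: 'evidence.md' }, // content object: warning only
    { role: 'user', content: 'Review the evidence.' }, // valid message: no issue
    { content: 'no role and no type' }, // still a hard error
  ],
  'input',
);
const errorCount = issues.filter((issue) => issue.severity === 'error').length;
const warningCount = issues.filter((issue) => issue.severity === 'warning').length;
```

Items that have neither a role nor a type still fail, so the relaxation stays limited to the short-form content objects described above.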

Two-part fix summary

| Repo | Change | Status |
| --- | --- | --- |
| WTG.AI.Prompts | Add AGENTV_VERSION env var to validate.yml workflow so version can be overridden via repo variable | Prepared locally |
| EntityProcess/agentv | Make validateMessages() accept content objects without role in input arrays | TODO |

devbox2-wtg-ai-prompts-allagents
