Validator rejects valid eval input when role is missing (short-form input) #915
Description
Context
PR WiseTechGlobal/WTG.AI.Prompts#490 fails eval validation in CI:
✗ evals/arch-prc/functional-evidence-review.eval.yaml
✗ [input[0].role] Invalid role 'undefined'. Must be one of: system, user, assistant
✗ [input[0].content] Missing or invalid 'content' field (must be a string, array, or object)
Root cause (CI version mismatch)
The PR branch is missing agentv from devDependencies in package.json (main has "agentv": "^4.3.4"; the PR branch doesn't). Because agentv isn't installed locally when CI runs bunx agentv validate, bunx auto-downloads a potentially different version, which causes inconsistent validation behavior.
Short-term fix: A workflow change has been prepared for WTG.AI.Prompts to support an AGENTV_VERSION repository variable, so the version can be pinned via Settings > Variables without a code push.
Underlying issue: validator is stricter than runtime for input arrays
The validator (eval-validator.ts validateMessages()) requires every item in an input array to be a message object with both role and content. When role is missing, it hard-errors:
// eval-validator.ts:489-498
const role = message.role;
const validRoles = ['system', 'user', 'assistant'];
if (!validRoles.includes(role as string)) {
  errors.push({
    severity: 'error',
    ...
    message: `Invalid role '${role}'. Must be one of: ${validRoles.join(', ')}`,
  });
}
But the runtime (shorthand-expansion.ts:34-37) silently filters items that don't match isTestMessage():
// shorthand-expansion.ts:34-37
if (Array.isArray(value)) {
  const messages = value.filter((msg): msg is TestMessage => isTestMessage(msg));
  return messages.length > 0 ? messages : undefined;
}
This means the validator rejects input that the runtime would accept.
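To make the mismatch concrete, here is a minimal sketch that runs both code paths on the same input. The `TestMessage` shape, the body of `isTestMessage()`, the `ValidationIssue` type, and the sample input are all assumptions made for illustration; the real agentv implementations may check more than this.

```typescript
// Hypothetical, simplified stand-ins for the real agentv types and helpers.
type TestMessage = { role: string; content: unknown };

interface ValidationIssue {
  severity: 'error' | 'warning';
  location: string;
  message: string;
}

// Assumption: the real isTestMessage() may be stricter than this.
function isTestMessage(value: unknown): value is TestMessage {
  if (typeof value !== 'object' || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return typeof candidate.role === 'string' && 'content' in candidate;
}

// Mirrors the runtime snippet: non-message items are dropped with no diagnostic.
function expandInput(value: unknown): TestMessage[] | undefined {
  if (Array.isArray(value)) {
    const messages = value.filter((msg): msg is TestMessage => isTestMessage(msg));
    return messages.length > 0 ? messages : undefined;
  }
  return undefined;
}

// Mirrors the validator snippet: a missing or unknown role is a hard error.
function validateRoles(input: unknown[]): ValidationIssue[] {
  const validRoles = ['system', 'user', 'assistant'];
  const errors: ValidationIssue[] = [];
  input.forEach((item, index) => {
    const role = (item as Record<string, unknown> | null)?.role;
    if (!validRoles.includes(role as string)) {
      errors.push({
        severity: 'error',
        location: `input[${index}].role`,
        message: `Invalid role '${String(role)}'. Must be one of: ${validRoles.join(', ')}`,
      });
    }
  });
  return errors;
}

// Illustrative input only: one content object without a role, one full message.
const input: unknown[] = [
  { type: 'file', value: 'docs/example.md' },
  { role: 'user', content: 'Review the attached evidence.' },
];

const validatorIssues = validateRoles(input);
const runtimeMessages = expandInput(input);

console.log(validatorIssues.map((issue) => `[${issue.location}] ${issue.message}`));
console.log(runtimeMessages);
```

Under these assumptions the validator reports an error for `input[0]`, while the runtime path returns only the `user` message and raises nothing. The sketch models only the filter shown in the snippet above, so the role-less item is dropped here rather than wrapped.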
Proposed fix
Make the validator accept content objects (items with a type field such as file, text, or image) as implicit user messages in input arrays, or at minimum downgrade the missing-role error to a warning. This would align the validator with the runtime's lenient handling.
Specifically, in validateMessages(), before checking role, check if the item looks like a content object rather than a message:
// If item looks like a content object ({type: "file", value: ...}),
// treat as valid — runtime wraps these implicitly
if (isObject(message) && 'type' in message && !('role' in message)) {
  // Warn or accept silently
  continue;
}
Two-part fix summary
| Repo | Change | Status |
|---|---|---|
| WTG.AI.Prompts | Add AGENTV_VERSION env var to validate.yml workflow so version can be overridden via repo variable | Prepared locally |
| EntityProcess/agentv | Make validateMessages() accept content objects without role in input arrays | TODO |
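The validator half of the fix could look roughly like the sketch below, which takes the "downgrade to warning" option. It is not the actual agentv code: `validateMessagesSketch`, `ValidationIssue`, the `isObject` body, and the warning text are hypothetical names and wording chosen for illustration.

```typescript
// Hypothetical, simplified types; the real validator's shapes may differ.
interface ValidationIssue {
  severity: 'error' | 'warning';
  location: string;
  message: string;
}

function isObject(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

function validateMessagesSketch(messages: unknown[], location = 'input'): ValidationIssue[] {
  const issues: ValidationIssue[] = [];
  const validRoles = ['system', 'user', 'assistant'];

  messages.forEach((message, index) => {
    const itemLocation = `${location}[${index}]`;

    // Content object (has 'type', lacks 'role'): warn instead of failing.
    if (isObject(message) && 'type' in message && !('role' in message)) {
      issues.push({
        severity: 'warning',
        location: itemLocation,
        message: "Item has 'type' but no 'role'; treating it as a content object",
      });
      return;
    }

    // Everything else still goes through the existing strict role check.
    const role = isObject(message) ? message.role : undefined;
    if (typeof role !== 'string' || !validRoles.includes(role)) {
      issues.push({
        severity: 'error',
        location: `${itemLocation}.role`,
        message: `Invalid role '${String(role)}'. Must be one of: ${validRoles.join(', ')}`,
      });
    }
  });

  return issues;
}

// Illustrative input: a content object, a valid message, and a message missing its role.
const issues = validateMessagesSketch([
  { type: 'file', value: 'docs/example.md' },
  { role: 'user', content: 'Review the attached evidence.' },
  { content: 'no role here' },
]);

console.log(issues.map((issue) => `${issue.severity} [${issue.location}] ${issue.message}`));
```

With this shape, a role-less item that carries a `type` field only produces a warning, while an item with neither `role` nor `type` still fails validation. That keeps genuine authoring mistakes visible.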
devbox2-wtg-ai-prompts-allagents