Open-source AI output validator for LLM applications. Catch hallucinations, strip narration, verify tool calls.
Battle-tested across 90+ production AI agent instances at Dunky AI.
npm install @dunkyai/ai-validator
import { validate } from '@dunkyai/ai-validator'
const result = validate({
response: "I've sent the email to john@example.com",
toolCalls: [
{ name: "gmail_send", input: { to: "john@example.com" }, output: '{"success":true}' }
]
})
console.log(result.pass) // true — action claim matches tool call
console.log(result.cleaned) // "I've sent the email to john@example.com"
console.log(result.issues) // []
The AI says "I've sent the email" but never actually called a send tool:
const result = validate({
response: "I've sent the email to john@example.com",
toolCalls: [] // no tools were called!
})
console.log(result.pass) // false
console.log(result.issues)
// [{ type: "hallucination", message: "AI claims to have performed an action, but none of these tools were called: gmail_send, send_email, sendEmail" }]
The AI called the tool, but it returned an error, and the AI claims it succeeded anyway:
const result = validate({
response: "I've sent the email to john@example.com",
toolCalls: [
{ name: "gmail_send", input: { to: "john@example.com" }, output: '{"error":"Request failed with status 500"}' }
]
})
console.log(result.pass) // false
console.log(result.issues)
// [{ type: "hallucination", message: 'AI claims success but the tool "gmail_send" returned an error: Request failed with status 500' }]
Error detection handles JSON errors ({ "error": "..." }, { "success": false }, status codes >= 400) and plain-text error patterns.
The AI leaks its internal monologue into user-facing responses:
const result = validate({
response: "Let me save this file and extract the text.\n\nHere's the summary of the document..."
})
console.log(result.cleaned) // "Here's the summary of the document..."
console.log(result.issues)
// [{ type: "narration", message: "Narration / thought process stripped", match: "Let me save this file and extract the text." }]
Remove specific phrases you don't want in responses:
const result = validate({
response: "Hey there! The bottom line is we need to move fast.",
options: {
blockedPhrases: ["hey there", "the bottom line"]
}
})
console.log(result.cleaned) // "! is we need to move fast."
Main validation function. Returns a ValidationResult.
Params:
- response (string): The AI's response text
- toolCalls (ToolCall[], optional): Tool calls that were actually executed
- options (ValidationOptions, optional): Configuration
Options:
- checkHallucinations (boolean, default: true): Check for hallucinated action claims
- stripNarration (boolean, default: true): Remove narration lines
- blockedPhrases (string[], default: []): Custom phrases to remove
- narrationPatterns (RegExp[], default: built-in): Custom narration patterns
- actionPatterns ({ pattern, tools }[], default: built-in): Custom hallucination patterns
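As a rough illustration of how these options might fit together, the sketch below builds an object in the shape listed above. The tool names (slack_post, post_message), the regexes, and the blocked phrase are hypothetical, and how custom patterns interact with the built-in ones is not documented here, so treat this as a starting point and check the package's type definitions.

```javascript
// Hypothetical options object; tool names, phrases, and regexes are illustrative only.
const options = {
  checkHallucinations: true,
  stripNarration: true,
  // Phrases to remove from the cleaned response.
  blockedPhrases: ["as an AI language model"],
  // Lines matching one of these would be treated as narration (assumed semantics).
  narrationPatterns: [/^(let me|i'll start by|first, i will)\b.*$/i],
  // A claim matching `pattern` is expected to be backed by a call to one of `tools`.
  actionPatterns: [
    { pattern: /\b(i've|i have) posted\b/i, tools: ["slack_post", "post_message"] },
  ],
};

// Pass it as the `options` field: validate({ response, toolCalls, options })
```

Keeping each action pattern next to the tools that can satisfy it makes it easier to audit which claims are covered when new tools are added.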
Returns:
{
pass: boolean, // true if no hallucinations found
cleaned: string, // response with narration/phrases stripped
issues: Issue[], // all issues found
original: string // original response
}
Check if the AI claims actions not backed by tool calls.
Strip narration lines from text.
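To make the idea concrete, here is a minimal sketch of line-based narration stripping. It is not the package's implementation: the function name stripNarrationSketch and the three patterns are assumptions, and the built-in pattern list is likely longer.

```javascript
// Illustrative patterns only; the package's built-in list is not reproduced here.
const NARRATION_PATTERNS = [
  /^let me\b/i,
  /^i(?:'ll| will) (?:now|start|begin)\b/i,
  /^first,? i(?:'ll| will| need to)\b/i,
];

// Drop every line that matches a narration pattern and report each removal.
function stripNarrationSketch(text, patterns = NARRATION_PATTERNS) {
  const issues = [];
  const kept = [];
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (trimmed && patterns.some((p) => p.test(trimmed))) {
      issues.push({
        type: "narration",
        message: "Narration / thought process stripped",
        match: trimmed,
      });
    } else {
      kept.push(line);
    }
  }
  // Collapse the blank lines left behind, then trim both ends.
  const cleaned = kept.join("\n").replace(/\n{3,}/g, "\n\n").trim();
  return { cleaned, issues };
}

const narrationDemo = stripNarrationSketch(
  "Let me save this file and extract the text.\n\nHere's the summary of the document..."
);
console.log(narrationDemo.cleaned); // "Here's the summary of the document..."
```

Anchoring the patterns to the start of a line keeps phrases such as "let me" from being stripped when they appear mid-sentence in a legitimate answer.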
Check if a tool output looks like an error. Handles JSON ({ "error": "..." }, { "success": false }, status >= 400) and plain-text error patterns.
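The checks described above can be sketched roughly as follows. The function name describeToolError, the status and statusCode field names, and the plain-text patterns are all assumptions made for illustration; the package's actual rules may differ.

```javascript
// Illustrative plain-text patterns; the package's real list may differ.
const ERROR_TEXT_PATTERNS = [/^error\b/i, /\bfailed\b/i, /\bexception\b/i, /\bunauthorized\b/i];

// Return a short error description if the tool output looks like a failure, else null.
function describeToolError(output) {
  let parsed;
  try {
    parsed = JSON.parse(output);
  } catch {
    parsed = undefined; // not JSON, fall through to plain-text matching
  }
  if (parsed !== null && typeof parsed === "object") {
    if (parsed.error) {
      return typeof parsed.error === "string" ? parsed.error : JSON.stringify(parsed.error);
    }
    if (parsed.success === false) return "success: false";
    // Assumed field names for an HTTP-style status code.
    const status = Number(parsed.status ?? parsed.statusCode);
    if (Number.isFinite(status) && status >= 400) return `status ${status}`;
    return null;
  }
  return ERROR_TEXT_PATTERNS.some((p) => p.test(output)) ? output.trim() : null;
}

console.log(describeToolError('{"success":true}')); // null
console.log(describeToolError('{"error":"Request failed with status 500"}')); // "Request failed with status 500"
```

Trying JSON first and falling back to text patterns means a structured success payload is never flagged just because one of its string fields happens to contain a word like "failed".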
Remove specific phrases from text.
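A minimal sketch of phrase removal, assuming case-insensitive literal matching followed by whitespace cleanup (which is consistent with the blockedPhrases example earlier in this README). The helper names escapeRegExp and removePhrasesSketch are hypothetical.

```javascript
// Escape regex metacharacters so a phrase is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Remove every occurrence of each phrase (case-insensitive), then tidy spacing.
function removePhrasesSketch(text, phrases) {
  let result = text;
  for (const phrase of phrases) {
    if (!phrase) continue; // skip empty strings, which would match everywhere
    result = result.replace(new RegExp(escapeRegExp(phrase), "gi"), "");
  }
  return result.replace(/[ \t]{2,}/g, " ").trim();
}

console.log(
  removePhrasesSketch("Hey there! The bottom line is we need to move fast.", [
    "hey there",
    "the bottom line",
  ])
); // "! is we need to move fast."
```

As the output shows, removing a phrase from the middle of a sentence leaves the surrounding punctuation and words behind, so blocked phrases tend to work best for self-contained greetings or filler sentences.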
This validator targets a narrow class of failure: the model claims an action succeeded when the recorded tool calls do not back that claim up. In the simplest case the signal is binary: zero matching tool calls plus a success claim in the output.
It covers two hallucination scenarios:
- No tool called — the model skips tool calls entirely and fabricates a success response.
- Tool called but failed — the model calls the tool, gets an error back, but claims success anyway.
It does not cover:
- The model calls the wrong tool. If the user says "send an email" and the model calls slack_send instead, that's a routing/prompt issue, not a hallucination.
- Subtle factual inaccuracies. If the model summarizes a document and gets a detail wrong, that's a different class of problem requiring retrieval verification.
We tried matching the user's input for action keywords ("send", "create", "search") and flagging when no tool call followed. This was whack-a-mole — users phrase requests in endless ways ("check on", "look into", "gather"). Matching success claims in the output is more robust because hallucinated responses are remarkably formulaic: "I've sent...", "Successfully created...", "Email Sent", etc.
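The output-matching approach described above can be sketched as follows. This is an illustration and not the package's code: the function name checkClaimsSketch and the regex are assumptions, while the three tool names are taken from the example issue message earlier in this README. It covers only the "no tool called" scenario.

```javascript
// Illustrative pattern; each entry pairs a success-claim regex with the tools that could back it.
const ACTION_PATTERNS = [
  {
    pattern: /\b(?:i've|i have) sent\b|\bemail sent\b/i,
    tools: ["gmail_send", "send_email", "sendEmail"],
  },
];

// Flag success claims in the response that no executed tool call can account for.
function checkClaimsSketch(response, toolCalls = [], patterns = ACTION_PATTERNS) {
  const called = new Set(toolCalls.map((c) => c.name));
  const issues = [];
  for (const { pattern, tools } of patterns) {
    if (pattern.test(response) && !tools.some((t) => called.has(t))) {
      issues.push({
        type: "hallucination",
        message:
          "AI claims to have performed an action, but none of these tools were called: " +
          tools.join(", "),
      });
    }
  }
  return issues;
}

console.log(checkClaimsSketch("I've sent the email to john@example.com", []).length); // 1
console.log(
  checkClaimsSketch("I've sent the email to john@example.com", [{ name: "gmail_send" }]).length
); // 0
```

Because the patterns look only at the model's output, the user can phrase the request any way they like; the check fires only when the response itself asserts a completed action.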
- Claude (Anthropic)
- GPT-4 / GPT-4o (OpenAI)
- Llama, Mistral, Gemma (via Groq, Together, etc.)
- Gemini (Google)
- Any model that uses tool/function calling
MIT — use it however you want, commercially or otherwise.
Issues and PRs welcome at github.com/dunkyai/ai-validator.