Sharon77770/validator-first-agent

Jarvis Architecture

🇰🇷 Korean Version: README_KO.md

A validator-first LLM execution architecture for safer task automation systems.


Overview

Jarvis Architecture is an experimental LLM orchestration architecture that separates:

  • Natural language understanding
  • Semantic validation
  • Execution authority

Instead of directly trusting LLM outputs, Jarvis treats the LLM as an intent extraction component while deterministic validators and service layers decide whether execution is allowed.

The core idea is:

Structured output alone is not sufficient for safe execution.


Motivation

Modern LLM agents commonly use the following flow:

User Input
→ LLM
→ Structured Output (JSON / Function Calling)
→ Execute

However, schema-valid outputs are often still:

  • Semantically invalid
  • Contradictory
  • Dangerous
  • Ambiguous

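Examples:

  • "Schedule an event on February 31." (a date that does not exist)
  • "Delete everything but don't delete it." (contradictory)
  • "Remind me later." (ambiguous; no time is given)
  • { "action": "delete_all" } (schema-valid, but dangerous)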

Jarvis Architecture attempts to reduce unsafe execution through deterministic validation and policy enforcement.
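
To make the gap between schema validity and semantic validity concrete, here is a minimal Python sketch. It is not code from this repository: the field names (`action`, `year`, `month`, `day`) and both helper functions are assumptions chosen for illustration. It shows a payload for the "February 31" request passing a shape-only check while failing a calendar check.

```python
import json
from datetime import date

# Hypothetical schema: required keys and their expected Python types.
REQUIRED_FIELDS = {"action": str, "year": int, "month": int, "day": int}

def is_schema_valid(payload: dict) -> bool:
    """Shape-only check: every required key is present with the right type."""
    return all(isinstance(payload.get(key), typ) for key, typ in REQUIRED_FIELDS.items())

def is_real_date(payload: dict) -> bool:
    """Semantic check: the fields must form a date that exists on the calendar."""
    try:
        date(payload["year"], payload["month"], payload["day"])
        return True
    except ValueError:
        return False

# A structured output an LLM might plausibly produce for "February 31".
llm_output = json.loads('{"action": "create_event", "year": 2025, "month": 2, "day": 31}')

schema_ok = is_schema_valid(llm_output)  # the JSON has the right shape
date_ok = is_real_date(llm_output)       # but February 31 does not exist
print(schema_ok, date_ok)
```

A schema parser alone would let this payload through to execution; the calendar check is the kind of deterministic rule a semantic validator adds.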


Architecture

Flow:

User Input
→ Intent Classification
→ Structured Extraction
→ Semantic Validator
→ Policy Gate
→ Service Layer
→ Execution

Validator responsibilities:

  • Datetime validation
  • Temporal consistency checking
  • Contradictory command detection
  • Dangerous action blocking
  • Ambiguous request detection
  • Clarification routing

Possible outcomes:

  • ACCEPT
  • REJECT
  • CLARIFY
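
The validator responsibilities and the three outcomes above could be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the repository's implementation: the `Intent` fields, the `negated` flag (standing in for contradiction detection), and the `DANGEROUS_ACTIONS` set are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional, Tuple

class Outcome(Enum):
    ACCEPT = "ACCEPT"
    REJECT = "REJECT"
    CLARIFY = "CLARIFY"

@dataclass
class Intent:
    # Hypothetical shape of the extracted intent.
    action: str
    start: Optional[str] = None  # ISO 8601 string, e.g. "2025-03-01T09:00"
    end: Optional[str] = None
    negated: bool = False        # extractor also saw "don't do it"

# Assumed blocklist; a real system would load this from policy configuration.
DANGEROUS_ACTIONS = {"delete_all"}

def validate(intent: Intent) -> Tuple[Outcome, str]:
    """Deterministic checks; the first failing rule decides the outcome."""
    if intent.action in DANGEROUS_ACTIONS:
        return Outcome.REJECT, "dangerous action"
    if intent.negated:
        return Outcome.REJECT, "contradictory command"
    if intent.start is None:
        return Outcome.CLARIFY, "missing start time"
    try:
        start = datetime.fromisoformat(intent.start)
        end = datetime.fromisoformat(intent.end) if intent.end else None
    except ValueError:
        return Outcome.REJECT, "invalid datetime"
    if end is not None and end <= start:
        return Outcome.REJECT, "end is not after start"
    return Outcome.ACCEPT, "ok"

# "February 31" is rejected; "Remind me later" (no time) is routed to clarification.
print(validate(Intent("create_event", "2025-02-31T09:00")))
print(validate(Intent("create_reminder")))
```

Each rule is an ordinary function of the extracted intent, so the same input always yields the same outcome regardless of how the LLM phrased its output.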

Compared Architectures

This repository contains three intentionally simplified architectures.

1. Plain LLM Chatbot

Flow:

User Input
→ LLM
→ Execute

Characteristics:

  • Direct trust model
  • No structured output
  • No semantic validation

2. Structured Output Agent

Flow:

User Input
→ LLM
→ Structured JSON
→ Schema Parse
→ Execute

Characteristics:

  • Structured output
  • Basic schema parsing
  • No semantic verification

3. Validator-First Agent (Jarvis)

Flow:

User Input
→ LLM Intent Extraction
→ Semantic Validator
→ Policy Gate
→ Execute

Characteristics:

  • Deterministic validation
  • Execution authority separation
  • Clarification support
  • Dangerous request rejection
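
One way to picture execution authority separation is a pipeline in which the LLM step only returns data and a separate gate decides whether any handler runs. The Python sketch below assumes a stubbed extractor (`fake_llm_extract`), an in-memory calendar, and an allowlist named `ALLOWED_HANDLERS`; none of these names come from the repository.

```python
from typing import Callable, Dict, List

# Hypothetical in-memory calendar standing in for a real service backend.
calendar: List[str] = []

def create_event(title: str) -> str:
    calendar.append(title)
    return f"created {title}"

# Policy gate: only allowlisted actions are mapped to a handler at all.
ALLOWED_HANDLERS: Dict[str, Callable[[str], str]] = {"create_event": create_event}

def fake_llm_extract(user_input: str) -> dict:
    """Stub for the LLM call: returns an intent and never executes anything."""
    if "delete" in user_input.lower():
        return {"action": "delete_all", "title": "everything"}
    return {"action": "create_event", "title": user_input}

def semantic_validator(intent: dict) -> str:
    """Toy semantic rule: an event without a title needs clarification."""
    return "ACCEPT" if intent["title"].strip() else "CLARIFY"

def run(user_input: str) -> str:
    intent = fake_llm_extract(user_input)          # language understanding
    verdict = semantic_validator(intent)           # semantic validation
    if verdict != "ACCEPT":
        return verdict
    handler = ALLOWED_HANDLERS.get(intent["action"])  # policy gate
    if handler is None:
        return "REJECT"
    return handler(intent["title"])                # service layer executes

print(run("Team sync"))
print(run("Delete everything"))
print(calendar)
```

Because `delete_all` has no entry in the allowlist, the extractor can emit it freely without any path to execution.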

Experimental Results

Benchmark:

  • 55 multilingual scheduling requests
  • Korean / English / Chinese
  • Valid / Invalid / Dangerous / Ambiguous requests

Results:

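| Architecture            | Accuracy | False Accepts (count) |
|-------------------------|----------|-----------------------|
| Plain LLM Chatbot       | 25.45%   | 38                    |
| Structured Output Agent | 29.09%   | 37                    |
| Validator-First Agent   | 72.73%   | 3                     |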

Key finding:

Structured formatting alone does not provide semantic safety.
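
As a sanity check on the reported numbers, the accuracy percentages can be converted back into counts of correctly handled requests. The README does not state these counts, so the values computed below are an inference that assumes accuracy was measured over all 55 requests.

```python
TOTAL_REQUESTS = 55

# Accuracy percentages as reported in the results.
reported_accuracy = {
    "Plain LLM Chatbot": 25.45,
    "Structured Output Agent": 29.09,
    "Validator-First Agent": 72.73,
}

# Nearest whole number of correct requests implied by each percentage.
implied_correct = {
    name: round(TOTAL_REQUESTS * pct / 100) for name, pct in reported_accuracy.items()
}

# Recompute the percentage from each count to confirm it round-trips.
recomputed = {
    name: round(100 * count / TOTAL_REQUESTS, 2) for name, count in implied_correct.items()
}

for name, count in implied_correct.items():
    print(f"{name}: {count}/{TOTAL_REQUESTS} = {recomputed[name]}%")
```

Under that assumption the percentages correspond to 14, 16, and 40 correct requests out of 55, and each count reproduces the reported figure to two decimal places.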


Repository Structure

/agent_resp_judge
/baseline_test_runner
/logs
/plain-agent
/structured-agent
/validator_first_agent

Research Goal

This project does not aim to compete with frontier LLMs on raw capability.

The goal is to explore:

  • Safer execution architectures
  • Deterministic semantic validation
  • Separation between language understanding and execution authority
  • Reliability improvements using system architecture rather than larger models

Limitations

Current limitations include:

  • Small benchmark size
  • Rule-based validators
  • Limited task domain
  • Simplified execution environment
  • Incomplete multilingual handling

This repository should be viewed as an experimental prototype and research toy project.


License

MIT License
