
Adaptive Anti-Corruption Layer Pattern

Problem solved: LLM-produced format hallucinations

This repository formalises the Adaptive Anti-Corruption Layer (AACL) design pattern for integrating probabilistic LLM agents with deterministic systems via a self-healing mechanism. The AACL provides a normalisation boundary that converts ambiguous LLM outputs into strictly typed inputs and returns structured correction signals that allow the model to self-correct at runtime. This eliminates silent format corruption and enables reliable agentic behaviour without model retraining.

Design Pattern: Adaptive Anti-Corruption Layer (AACL)

Context: Integrating probabilistic LLM agents with deterministic systems

Problem: LLMs produce chaotic, non-type-safe outputs; direct integration causes silent format corruption

Solution: Two-layer architecture with normalisation boundary that provides structured feedback

Result: Self-correcting system where structured feedback enables runtime error correction

Contents: Pattern Specification (this file), Reference Implementation (Python, in /src), Pattern Validation with adversarial testing (Python, in /pattern_validation)


Architectural Assumptions

The self-healing mechanism requires an agentic loop architecture where the LLM can receive tool execution feedback and retry with corrected inputs. Specifically, the system must support:

  1. Function/tool calling — LLM can invoke tools with parameters
  2. Error propagation — Structured errors returned to LLM context
  3. Iterative retry — LLM can re-plan and retry after failures
  4. State persistence — Conversation state maintained across tool calls

Framework example:

  • LangGraph: Requires a state saver, an agent node (runnable), a conditional edge to a tool node, and an edge back to the agent node for retry
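
A minimal sketch of that wiring in LangGraph (illustrative: llm_with_tools and tools are assumed to be defined elsewhere; the shape of the loop, not the names, is the point):

from langgraph.graph import StateGraph, MessagesState, START
from langgraph.prebuilt import ToolNode, tools_condition
from langgraph.checkpoint.memory import MemorySaver

def agent(state: MessagesState):
    # The LLM plans; it may emit tool calls or a final answer.
    return {"messages": [llm_with_tools.invoke(state["messages"])]}

graph = StateGraph(MessagesState)
graph.add_node("agent", agent)
graph.add_node("tools", ToolNode(tools))  # tool errors return as ToolMessages
graph.add_edge(START, "agent")
graph.add_conditional_edges("agent", tools_condition)  # route to "tools" or end
graph.add_edge("tools", "agent")  # the retry edge: feedback flows back to the agent
app = graph.compile(checkpointer=MemorySaver())  # state persists across tool calls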

If your system lacks an agentic loop (i.e., one-shot tool calls with no retry), the AACL pattern still provides value by preventing silent format corruption, but self-healing requires the retry mechanism.

Premise

LLMs are semantic sequence models. They are not type-safe, not schema-stable, and not reliable data serialisers. Therefore:

LLMs must provide values. Code must provide structure.

Attempting to treat model-generated JSON or arguments as authoritative structure guarantees silent format corruption, brittle parsing, and cascading failure.

The correct architecture is a two-layer boundary separating free-form model output from deterministic business logic.

LLM (semantic planner)
↓
Interface Layer (normalisation + validation + structured errors)
↓
Implementation Layer (strict types, pure logic)

This boundary is where the system becomes self-correcting. The interface boundary is the only location where ambiguity is allowed to exist. Once execution passes into the implementation layer, ambiguity must be ZERO.

Structured output belongs in function results, not token streams. Use function calling to receive structured data from code, not to parse it from LLM-generated text. Need the JSON visible to users? Put the function result in your output queue, the same place the response stream goes.

Failure Modes of Raw LLM Output (Non-Exhaustive)

LLMs freely interchange:

  • "true", "True", "yes", "1", True
  • 5, "05", "five", "5"
  • "null", "none", None, "n/a", ""
  • "a.com b.com", "a.com,b.com", ["a.com", "b.com"]
  • Dates in any human-readable permutation

Passing these directly to an API layer introduces silent format corruption — the worst class of system failure, because it appears to work some of the time and then breaks for no apparent reason.
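
A one-line illustration of why this failure is silent rather than loud (a sketch; bool stands in for any naive truthiness check):

# Naive coercion accepts every variant and silently corrupts one of them:
for v in ("true", "True", "yes", "1", "false"):
    print(v, "->", bool(v))  # every non-empty string is truthy: bool("false") is True

No exception is raised; the wrong value simply flows downstream.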

Architecture

This two-layer boundary is the core of the Adaptive Anti-Corruption Layer (AACL). The LLM operates in a semantic space; the implementation layer operates in a typed deterministic space. The AACL is the boundary that translates between them through normalisation + structured failure signals.

1. Interface Layer (LLM-Facing)

Function: Convert arbitrary inputs into typed inputs.

Requirements:

  • Accept union and ambiguous input types
  • Normalise to canonical representations
  • Validate according to strict schema expectations
  • Return structured error messages when normalisation fails

This layer must be total:
Every input either normalises or fails with an LLM-usable correction signal.
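
A sketch of that contract for a boolean parameter (the function name is illustrative):

def normalise_bool(value) -> bool:
    """Total: every input becomes a canonical bool or raises a correction signal."""
    if isinstance(value, bool):
        return value
    s = str(value).strip().lower()
    if s in ("true", "1", "yes", "on"):
        return True
    if s in ("false", "0", "no", "off"):
        return False
    raise ValueError(
        f"INVALID_BOOLEAN: expected a boolean; received {value!r}. "
        "Retry with true or false."
    )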

2. Implementation Layer (Logic-Facing)

Function: Perform business operations with strict typing.

  • No normalisation
  • No LLM-awareness
  • No ambiguity handling
  • Pure deterministic execution

If incorrect values reach this layer, the architecture is wrong.

Minimal Example (Python)

The pattern uses a two-file structure. See full reference implementation in /src/mcp_tools/tavily_search/.

Interface Layer (interface.py) — LLM-Facing

from fastmcp import FastMCP
from .implementations.tavily_impl import tavily_search_impl

mcp = FastMCP("My MCP Server")

@mcp.tool()
def search_web(
    query,
    search_depth="basic",
    max_results=5,
    include_domains=None,
    time_range=None
) -> dict:
    """
    Search the web using Tavily's search API.
    
    Args:
        query: Search query (required)
        search_depth: "basic" or "advanced" (Optional, defaults to "basic")
        max_results: Number of results, 1-10 (Optional, defaults to 5)
        include_domains: Domain filter (comma/space-separated or list) (Optional)
        time_range: Time filter ("day", "week", "month", "year") (Optional)
    
    Returns:
        Dictionary containing search results
    
    Raises:
        ValueError: With structured correction signals for invalid inputs
    """
    # Normalize ambiguous inputs to canonical forms
    search_depth = _normalize_search_depth(search_depth)
    include_domains = _normalize_domains(include_domains)
    time_range = _normalize_optional_string(time_range)
    
    # Validate with structured error messages
    if time_range and time_range not in ("day", "week", "month", "year"):
        raise ValueError(
            "INVALID_TIME_RANGE: expected one of ['day', 'week', 'month', 'year']; "
            f"received '{time_range}'. Retry with a valid value."
        )
    
    try:
        max_results = int(max_results)
    except (ValueError, TypeError):
        raise ValueError(
            "TYPE_ERROR: field 'max_results' must be an integer between 1 and 10; "
            f"received {max_results!r}. Retry with a valid integer."
        )
    if not 1 <= max_results <= 10:
        raise ValueError(
            "RANGE_ERROR: 'max_results' must be between 1 and 10; "
            f"received {max_results}. Retry with a value in range."
        )
    
    # Pass typed, normalized inputs to implementation
    return tavily_search_impl(
        query=query,
        search_depth=search_depth,
        max_results=max_results,
        include_domains=include_domains,
        time_range=time_range
    )

def _normalize_optional_string(value):
    """Normalize null-like values to None."""
    if value is None:
        return None
    if isinstance(value, str):
        s = value.strip().lower()
        if s in ("", "null", "none", "n/a", "na"):
            return None
    return value

def _normalize_search_depth(depth):
    """Normalize search depth to 'basic' or 'advanced'."""
    if not depth:
        return "basic"
    d = str(depth).strip().lower()
    if d in ("advanced", "deep", "thorough"):
        return "advanced"
    return "basic"

def _normalize_domains(value):
    """Normalize a comma/space-separated string or a list into list[str] | None."""
    value = _normalize_optional_string(value)
    if value is None:
        return None
    if isinstance(value, str):
        return [d for d in value.replace(",", " ").split() if d]
    return [str(d) for d in value]

Implementation Layer (implementations/tavily_impl.py) — Logic-Facing

import os

from tavily import TavilyClient

def tavily_search_impl(
    query: str,
    search_depth: str,
    max_results: int,
    include_domains: list[str] | None,
    time_range: str | None
) -> dict:
    """
    Pure implementation - expects strictly typed, normalized inputs.
    No validation or normalization should happen here.
    """
    client = TavilyClient(api_key=os.getenv("TAVILY_API_KEY"))
    
    params = {
        "query": query,
        "search_depth": search_depth,
        "max_results": max_results
    }
    
    if include_domains:
        params["include_domains"] = include_domains
    if time_range:
        params["time_range"] = time_range
    
    return client.search(**params)

Why This Is Self-Healing

The loop:

  1. LLM emits a parameter in an arbitrary format.
  2. Interface layer attempts normalisation.
  3. If normalisation succeeds → call implementation logic.
  4. If normalisation fails → return a structured correction signal.
  5. LLM re-plans and retries (ReAct pattern, no human involvement).

This produces adaptive convergence: The system self-heals at runtime by guiding the LLM to correct inputs without human supervision.
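
A framework-agnostic sketch of that loop (hypothetical llm and tools interfaces; real frameworks such as LangGraph provide this wiring for you):

def run_agent(llm, tools, messages):
    while True:
        call = llm.plan(messages)  # 1. LLM emits a tool call (or None when done)
        if call is None:
            return messages
        try:
            # 2-3. the interface layer normalises, then the implementation runs
            result = tools[call.name](**call.args)
            messages.append({"role": "tool", "content": result})
        except ValueError as err:
            # 4. the structured correction signal goes back into the LLM's context
            messages.append({"role": "tool", "content": f"ERROR: {err}"})
        # 5. the loop continues: the LLM re-plans with the feedback in view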

Using This Pattern for Structured LLM Outputs

Why You Never Let the LLM Produce JSON

This is not about syntax errors. It is about responsibility boundaries.

JSON is a deterministic serialisation format. LLMs are probabilistic sequence models.

If the LLM is responsible for producing formatted JSON, you guarantee:

  • Silent type drift ("5" instead of 5)
  • Mixed boolean encodings ("true" vs true vs "yes")
  • Key-order instability (breaks hashing, caching, diff-based tests)
  • Schema drift over iterative refinement
  • Random breakage triggered by prompt context state

These are not mistakes. They are a consequence of the statistical nature of token generation.

Once the model is permitted to define structure, the structure becomes non-deterministic.
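
Two strings a model might emit for the "same" object make the problem concrete:

import json

# Both parse cleanly, but the payloads are different objects: "05" vs 5, "true" vs true.
a = json.loads('{"x": 5, "flag": true}')
b = json.loads('{"x": "05", "flag": "true"}')
print(a == b)      # False: type drift, invisible to a syntax check
print(a["x"] + 1)  # 6 — the same expression on b["x"] raises TypeError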

Apply the Same Two-Layer Architecture

LLM → untyped values
Interface Layer → normalisation + schema enforcement
Implementation Layer → constructs JSON deterministically

The model never formats JSON.

Need the structured data visible in the user interface? That's fine: your function returns it, and your application layer displays it. The point is that the LLM doesn't generate the structure; your code does.

Example

See full reference implementation in /src/mcp_tools/structured_output/.

Interface Layer (LLM-Facing)

from fastmcp import FastMCP
from .implementations.json_writer_impl import create_structured_data_impl

mcp = FastMCP("My MCP Server")

@mcp.tool()
def create_json(x, y, flag) -> dict:
    """
    Create and return structured JSON from LLM-provided values.
    
    Args:
        x: Integer value (accepts "5", "05", 5)
        y: String value
        flag: Boolean flag (accepts "true", "yes", "1", True, etc.)
    
    Returns:
        Dictionary with deterministic structure (ready for JSON serialization)
    
    Raises:
        ValueError: With structured correction signals for invalid inputs
    """
    # Normalize integer with structured error
    try:
        x = int(x)
    except (ValueError, TypeError):
        raise ValueError(
            f"TYPE_ERROR: field 'x' must be an integer; "
            f"received {repr(x)} (type: {type(x).__name__}). "
            "Retry with a valid integer value."
        )
    
    # Normalize string
    y = str(y)
    
    # Normalize boolean from various representations
    if isinstance(flag, str):
        flag = flag.strip().lower() in ("true", "1", "yes", "on")
    else:
        flag = bool(flag)
    
    # Pass typed inputs to implementation
    return create_structured_data_impl(x=x, y=y, flag=flag)

Implementation Layer (implementations/json_writer_impl.py) — Logic-Facing (Strict, Deterministic JSON)

def create_structured_data_impl(x: int, y: str, flag: bool) -> dict:
    """
    Construct JSON structure deterministically from typed values.
    
    The LLM never generates JSON - it only provides values.
    Code defines keys, order, and types.
    """
    # Structure is defined by code, not LLM
    return {
        "x": x,
        "y": y,
        "flag": flag
    }
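
Because the structure lives in code, serialisation is reproducible byte-for-byte:

import json

payload = create_structured_data_impl(x=5, y="hello", flag=True)
print(json.dumps(payload, sort_keys=True))  # {"flag": true, "x": 5, "y": "hello"}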

Why This Works

| Responsibility            | LLM | Interface Layer | Implementation Layer |
|---------------------------|-----|-----------------|----------------------|
| Interpret Intent          | Yes | No              | No                   |
| Normalise Values          | No  | Yes             | No                   |
| Enforce Schema            | No  | Yes             | No                   |
| Construct Data Structures | No  | No              | Yes                  |
| Serialise Data            | No  | No              | Yes                  |

Core Principle

LLMs Plan. Code Types. Never let the model define structure. Always enforce structure at the boundary.

Summary

| Layer                | Handles                    | Must Be               | Failure Mode                                            | Output                    |
|----------------------|----------------------------|-----------------------|---------------------------------------------------------|---------------------------|
| LLM                  | Semantics                  | Flexible              | Format hallucination                                    | Unstructured values       |
| Interface Layer      | Normalisation + validation | Total / deterministic | Structured correction (intentional exception raised)    | Typed inputs              |
| Implementation Layer | Business logic             | Pure / strict         | Hard failure (if reached incorrectly)                   | Stable data / JSON / YAML |

The invariant: If the implementation layer sees garbage, the interface layer is incorrect.

This pattern is general and applies to every LLM-tooling integration, including MCP, ReAct, function-calling APIs, and agentic planning systems. This architecture is not a workaround for LLM weaknesses. The Adaptive Anti-Corruption Layer is the correct separation of concerns for any system in which a probabilistic language generator interacts with deterministic software components.

Related Patterns

| Pattern                   | Relationship                                                                  |
|---------------------------|-------------------------------------------------------------------------------|
| DDD Anti-Corruption Layer | Conceptual ancestor, but assumes a deterministic upstream domain              |
| Adapter Pattern           | Handles interface mismatch, but not semantic ambiguity                        |
| Retry with Backoff        | Handles failure, but not interpretation                                       |
| ReAct                     | Handles iterative convergence, but relies on LLM output rather than on deterministic code |

Pattern Formalisation (Appendix)

For pattern catalogue inclusion.

Applicability

Use the AACL pattern when:

  • Integrating LLMs with deterministic APIs, databases, or business logic
  • Building function-calling or tool-use systems
  • Creating MCP servers or LLM integration points
  • Generating structured data (JSON, YAML) from LLM outputs

Do not use when:

  • Input is already strictly typed (traditional API)
  • Format variation is acceptable downstream
  • Only semantic correctness matters (content, not format)

Consequences

Benefits:

  • Eliminates silent format corruption
  • Enables self-healing via structured errors
  • Clear separation of concerns
  • Works with any LLM that supports function calling (no retraining required)
  • Composable with existing frameworks

Trade-offs:

  • Requires two-layer architecture
  • Normalisation adds (minimal) latency
  • Interface must evolve with edge cases
  • Does not solve content hallucinations

Known Uses

  • OpenAI/Anthropic function calling APIs
  • MCP server implementations
  • LangChain custom tools
  • Agentic systems
  • Any LLM-to-database/API boundary

Forces Resolved

The pattern balances:

  • Flexibility vs. Correctness: LLM freedom + type safety
  • Fail-Fast vs. Teach: Structured errors guide correction
  • When to Normalise vs. Validate: Intentional design choice per parameter
  • Boundary Location: Interface handles ambiguity, implementation stays pure

The resolution: Interface layer is total (handles all inputs), implementation is pure (assumes correctness).


About This Pattern

Author: Morgan Lee
Organisation: Synpulse8
First Published: 12 November 2025
