Runtime Harness Adaptation for Deterministic LLM Agents: a DSPy implementation of arXiv 2605.22166.
Life-Harness improves frozen LLM agents without changing model weights or evaluation environments. It adapts the runtime interface between a model and a deterministic environment through four lifecycle layers, evolved from training trajectories.
```
                   ┌─────────────────────────────────────────┐
                   │               LifeHarness               │
                   │       (dspy.Module orchestrator)        │
                   │                                         │
Before interaction │ ① H3 Environment Contract Layer         │
                   │    Enhances tool descriptions with      │
                   │    policy constraints (Δ_C)             │
                   │                                         │
Task conditioning  │ ② H5 Procedural Skill Layer             │
                   │    BM25 retrieval → skill injection     │
                   │    into system prompt                   │
                   │                                         │
Before execution   │ ③ H2 Action Realization Layer           │
                   │    Validates/canonicalizes actions      │
                   │    EXEC(action) or Block(message)       │
                   │                                         │
After execution    │ ④ H4 Trajectory Regulation Layer        │
                   │    Detects repetition, stagnation,      │
                   │    budget exhaustion → recovery         │
                   └─────────────────────────────────────────┘
```
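Read top to bottom, the diagram describes one episode loop. The following is a minimal sketch of that control flow, not the package's code: the four hook functions (`enhance_contract`, `retrieve_skills`, `realize`, `regulate`), the scripted `policy`, and the toy `env_step` are hypothetical stand-ins chosen for illustration, and the keyword match in `retrieve_skills` is a simplification of BM25 retrieval.

```python
from typing import Optional

# Hypothetical stand-ins for the four layers; names are illustrative only.
def enhance_contract(contract: str, hints: dict[str, str]) -> str:
    """H3: append policy hints to the contract before interaction."""
    extra = "\n".join(f"- {tool}: {hint}" for tool, hint in hints.items())
    return f"{contract}\n{extra}" if extra else contract

def retrieve_skills(task: str, tips: dict[str, str]) -> list[str]:
    """H5: select tips whose trigger word occurs in the task (simplified retrieval)."""
    words = set(task.lower().split())
    return [tip for trigger, tip in tips.items() if trigger in words]

def realize(action: str, allowed: set[str]) -> Optional[str]:
    """H2: return None to execute the action, or a message to block it."""
    tool = action.split("(", 1)[0]
    return None if tool in allowed else f"Unknown tool '{tool}'"

def regulate(actions: list[str]) -> Optional[str]:
    """H4: advisory warning when the last two actions are identical."""
    if len(actions) >= 2 and actions[-1] == actions[-2]:
        return "Repeated action; try a different step."
    return None

def run_episode(task, policy, env_step, contract, hints, tips, allowed, max_steps=5):
    prompt = enhance_contract(contract, hints)                 # before interaction
    prompt += "\n" + "\n".join(retrieve_skills(task, tips))    # task conditioning
    trajectory, actions = [prompt, f"Task: {task}"], []
    for _ in range(max_steps):
        action = policy(trajectory)
        actions.append(action)
        blocked = realize(action, allowed)                     # before execution
        if blocked is not None:
            trajectory.append(f"[BLOCKED] {blocked}")
            continue                                           # episode continues
        observation, done = env_step(action)
        trajectory.append(observation)
        warning = regulate(actions)                            # after execution
        if warning is not None:
            trajectory.append(f"[REGULATION - warning] {warning}")
        if done:
            break
    return trajectory

# Scripted policy: one invalid action, then two valid ones.
script = iter(["delete_order(1)", "get_status(1)", "cancel_order(1)"])
trajectory = run_episode(
    task="cancel order 1",
    policy=lambda traj: next(script),
    env_step=lambda a: (f"ok: {a}", a.startswith("cancel_order")),
    contract="Tools: get_status, cancel_order",
    hints={"cancel_order": "Only pending orders can be cancelled."},
    tips={"cancel": "Check status before cancelling."},
    allowed={"get_status", "cancel_order"},
)
```

In this sketch a blocked action costs a step but does not end the episode, which mirrors the block-and-continue behavior the layers are described as having.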
Each layer operates at a different stage of the agent interaction loop:
| Layer | Stage | Mode | What it does |
|---|---|---|---|
| H3 Contract | Before interaction | deterministic / llm | Appends policy hints to tool descriptions |
| H5 Skill | Task conditioning | BM25 retrieval | Injects relevant procedural strategies into the prompt |
| H2 Action | Before execution | rule / llm / hybrid | Validates actions or blocks with feedback |
| H4 Trajectory | After execution | pattern / llm | Detects degenerate patterns and triggers recovery |
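The H5 row relies on BM25 ranking to match a task against skill patterns. As an illustration of the ranking idea only, here is a dependency-free sketch of Okapi BM25 scoring. The function name `bm25_scores`, the whitespace tokenizer, the parameter values `k1=1.5` and `b=0.75`, and the second pattern string are assumptions; the package lists rank-bm25 as a dependency, and its actual retrieval code may differ.

```python
import math
from collections import Counter

def bm25_scores(query: str, documents: list[str],
                k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each document against the query with Okapi BM25 (whitespace tokens)."""
    docs = [doc.lower().split() for doc in documents]
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Smoothed IDF; the "+ 1.0" keeps the value non-negative.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

patterns = [
    "cancel return modify order status",  # example skill pattern
    "search product catalog price",       # hypothetical second skill
]
scores = bm25_scores("Cancel order ORD-001", patterns)
best = max(range(len(patterns)), key=scores.__getitem__)
```

Here `best` indexes the pattern that shares the terms "cancel" and "order" with the task; a pattern with no overlapping terms scores zero.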
```bash
export OPENAI_API_KEY="your-key"
pip install -e .
```

```python
import dspy
from life_harness import LifeHarness, Skill, SkillLibrary

# Configure LM
lm = dspy.LM("openai/gpt-4o-mini")
dspy.configure(lm=lm)

# Define environment (implements the Environment protocol)
env = MyEnvironment()  # user-defined class

# Build skill library (from training trajectories)
skills = SkillLibrary(skills=[
    Skill(
        id="check_status_first",
        title="Check Status Before Action",
        pattern="cancel return modify order status",
        tip="Always verify order status before acting.",
        source="failure",
    ),
])

# Create harness with rule-based layers (paper-faithful defaults)
harness = LifeHarness(
    environment=env,
    skill_library=skills,
    contract_hints={"cancel_order": "Only pending/processing orders can be cancelled."},
    action_rules={"cancel_order": [MyCancelRule()]},  # user-defined rule
    max_steps=10,
)

# Run episode
result = harness(task="Cancel order ORD-001")
print(result.steps, result.done, result.total_reward)
```

In deterministic mode, all four layers use rule-based logic, with no LLM calls in the harness itself. This matches the original paper's approach, where H2/H3/H4 are deterministic interventions.
```python
harness = LifeHarness(
    environment=env,
    contract_mode="deterministic",  # H3: append text hints
    action_mode="rule",             # H2: deterministic rules
    trajectory_mode="pattern",      # H4: pattern detectors
)
```

In LLM mode, the harness layers are DSPy modules, which enables prompt optimization via GEPA/MIPROv2.
```python
harness = LifeHarness(
    environment=env,
    contract_mode="llm",    # H3: ChainOfThought enhances contract
    action_mode="llm",      # H2: LLM validates/canonicalizes actions
    trajectory_mode="llm",  # H4: LLM analyzes trajectory health
)
```

Hybrid mode combines deterministic rules with an LLM fallback for robustness.
```python
harness = LifeHarness(
    environment=env,
    action_mode="hybrid",  # H2: try rules first, fall back to LLM
)
```

The `agent_mode` setting controls whether the agent uses reasoning traces: `"predict"` (default, faster) selects actions directly from the trajectory, while `"cot"` (ChainOfThought) is for cases where reasoning traces are needed.
```python
harness = LifeHarness(
    environment=env,
    agent_mode="predict",  # or "cot" for step-by-step reasoning
)
```

`max_context_chars` sets a character budget for the trajectory sent to the agent. When the budget is exceeded, the trajectory is truncated, preserving the system prompt header and the most recent turns.
```python
harness = LifeHarness(
    environment=env,
    max_context_chars=16000,  # default: 8000
)
```

The default `max_context_chars=8000` works well for GPT-4o-mini and similar models. Increase it for longer episodes or models with larger context windows.
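To make the truncation rule concrete, here is a minimal sketch of one way to keep the header and the most recent turns within a character budget. The function `truncate_trajectory`, the omission marker, and the representation of the trajectory as a header string plus a list of turn strings are assumptions for illustration; the harness may track and cut its trajectory differently.

```python
def truncate_trajectory(header: str, turns: list[str], max_chars: int = 8000) -> str:
    """Keep the header plus as many of the most recent turns as fit in max_chars."""
    marker = "[... earlier turns omitted ...]"  # hypothetical omission marker
    full = "\n".join([header, *turns])
    if len(full) <= max_chars:
        return full  # within budget: nothing to cut
    # Reserve room for the header, one newline, and the marker.
    # If the header alone exceeds the budget, no turns are kept.
    budget = max_chars - len(header) - 1 - len(marker)
    kept: list[str] = []
    for turn in reversed(turns):   # walk from newest to oldest
        cost = len(turn) + 1       # +1 for the newline that precedes the turn
        if cost > budget:
            break
        kept.append(turn)
        budget -= cost
    return "\n".join([header, marker, *reversed(kept)])

header = "SYSTEM"
turns = [f"turn {i}: " + "x" * 20 for i in range(10)]
result = truncate_trajectory(header, turns, max_chars=120)
```

Dropping whole turns from the oldest end, rather than cutting mid-turn, keeps every remaining observation intact for the agent.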
Implement the Environment protocol:
```python
from typing import Any

from life_harness import Environment, StepResult, ToolSpec


class MyEnv(Environment):
    def init(self, task: str) -> tuple[Any, str]:
        state = {"task": task}  # placeholder; build your own initial state here
        return state, "Initial observation"

    def step(self, state: Any, action: str) -> StepResult:
        # Process the action and return the resulting observation and state
        new_state = state  # placeholder; apply the action to get the next state
        return StepResult(observation="...", state=new_state)

    def get_contract(self) -> str:
        return "Available tools: ..."

    def get_tools(self) -> list[ToolSpec]:
        return [ToolSpec(name="search", description="...", parameters={})]

    def is_end(self, state: Any, observation: str) -> bool:
        return observation.strip().startswith("Answer:")
```

Define validation rules that block invalid actions before execution:
```python
from life_harness import ActionRealizationError


class CancelOrderRule:
    tool_name = "cancel_order"

    def check(self, state, **kwargs):
        order = state.get_order(kwargs["order_id"])
        # Compare the status value, the same field used in the message below
        if order and order.status.value not in ("pending", "processing"):
            raise ActionRealizationError(
                f"Cannot cancel: status is '{order.status.value}'"
            )


harness = LifeHarness(
    environment=env,
    action_rules={"cancel_order": [CancelOrderRule()]},
)
```

Feedback loop: when an action rule raises `ActionRealizationError`, H2 returns a `Block(message)` decision. The harness injects `[BLOCKED] {message}` back into the agent's observation stream, so the agent sees the error and can self-correct on the next turn. The episode continues; only the invalid action is prevented.
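The block-and-continue behavior can be sketched without the package. In the example below, `ActionRealizationError`, `Exec`, and `Block` are local stand-ins defined only so the snippet runs on its own, and `PendingOnlyRule`, `realize`, and the dictionary-based state are hypothetical; the sketch shows the decision flow, not the library's internals.

```python
from dataclasses import dataclass

class ActionRealizationError(Exception):
    """Local stand-in for the exception exported by life_harness."""

@dataclass
class Exec:
    tool: str
    kwargs: dict

@dataclass
class Block:
    message: str

class PendingOnlyRule:
    """Hypothetical rule: only pending orders may be cancelled."""
    def check(self, state: dict, **kwargs) -> None:
        status = state["orders"].get(kwargs["order_id"])
        if status != "pending":
            raise ActionRealizationError(f"Cannot cancel: status is '{status}'")

def realize(state, tool, kwargs, rules):
    """Run every rule for the tool; the first failure becomes a Block decision."""
    for rule in rules.get(tool, []):
        try:
            rule.check(state, **kwargs)
        except ActionRealizationError as err:
            return Block(message=str(err))
    return Exec(tool=tool, kwargs=kwargs)

state = {"orders": {"ORD-001": "shipped", "ORD-002": "pending"}}
rules = {"cancel_order": [PendingOnlyRule()]}

observations = []
for order_id in ("ORD-001", "ORD-002"):
    decision = realize(state, "cancel_order", {"order_id": order_id}, rules)
    if isinstance(decision, Block):
        # The episode continues; the agent only sees the feedback message.
        observations.append(f"[BLOCKED] {decision.message}")
    else:
        observations.append(f"executed {decision.tool}({order_id})")
```

Turning the exception into a returned decision keeps rule authors free to raise wherever a check fails, while the loop stays free of try/except handling.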
Built-in detectors for common degenerate patterns:
```python
from life_harness import TrajectoryRegulator, RepetitionDetector, StagnationDetector, BudgetWarningDetector

harness = LifeHarness(
    environment=env,
    trajectory_detectors=[
        RepetitionDetector(window=3, threshold=0.8),
        StagnationDetector(window=5),
        BudgetWarningDetector(warn_threshold=3),
    ],
)
```

Recovery flow: H4 appends `[REGULATION - {severity}] {message}` to the agent's trajectory on the next turn. The agent sees the warning and can adjust its strategy. Regulation is advisory: it never blocks actions or halts the episode.
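As an illustration of how a repetition check with a window and a threshold might work, here is a minimal sketch. It assumes that repetition means the mean pairwise string similarity of the last `window` actions reaches `threshold`; the real `RepetitionDetector` may define similarity differently. The functions `detect_repetition` and `regulate` and the `warning` severity label are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations
from typing import Optional

def detect_repetition(actions: list[str], window: int = 3,
                      threshold: float = 0.8) -> Optional[str]:
    """Return a warning when the last `window` actions are near-duplicates."""
    recent = actions[-window:]
    if window < 2 or len(recent) < window:
        return None  # not enough history to compare
    # Mean pairwise similarity over the recent actions, in [0, 1].
    ratios = [SequenceMatcher(None, a, b).ratio() for a, b in combinations(recent, 2)]
    similarity = sum(ratios) / len(ratios)
    if similarity >= threshold:
        return f"Last {window} actions are {similarity:.0%} similar; change strategy."
    return None

def regulate(trajectory: list[str], actions: list[str]) -> None:
    """Append an advisory note to the trajectory; never block or halt."""
    message = detect_repetition(actions)
    if message is not None:
        trajectory.append(f"[REGULATION - warning] {message}")

trajectory: list[str] = []
regulate(trajectory, ["get_order(id='ORD-001')"] * 3)

varied = ["search(q='a')", "cancel_order(id='1')", "get_user(name='bob')"]
```

Because `regulate` only appends text, the agent remains free to ignore the note, which matches the advisory role described above.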
Evolve harness layers from training trajectories using DSPy optimizers:
```python
from life_harness import HarnessEvolver

evolver = HarnessEvolver(
    harness=harness,
    metric_fn=my_metric,
    optimizer="gepa",
)

# Single optimizer
evolved = evolver.evolve(trainset=train_examples)

# Chain optimizers (GEPA → fine-tune → GEPA)
evolved = evolver.evolve_chain(
    trainset=train_examples,
    valset=val_examples,
    strategy="p -> w -> p",  # p=prompt opt (GEPA), w=weight finetune (BootstrapFinetune)
)

# Ensemble multiple evolved harnesses
ensemble = evolver.evolve_ensemble(programs=[harness1, harness2, harness3])
```

Extract skills automatically from trajectories:
```python
skills = HarnessEvolver.extract_skills_from_trajectories(
    trajectories=episode_results,
    min_occurrences=2,
)
```

Use a `HarnessCallback` to trace layer events and their timings:

```python
from life_harness import HarnessCallback

callback = HarnessCallback()
harness = LifeHarness(environment=env, callback=callback)
result = harness(task="Do something")
for entry in callback.get_trace():
    print(f"{entry['layer']} {entry['event']} ({entry.get('elapsed', 0):.3f}s)")
```

Run the demo from the command line:

```bash
# Deterministic mode (rule-based harness)
OPENAI_API_KEY=sk-... python main.py

# LLM-powered harness
python main.py --mode llm

# Comparison: bare agent vs. harnessed agent
python main.py --mode comparison

# Use a local model
python main.py --lm ollama_chat/llama3
```

Project layout:

```
harness/
├── life_harness/
│   ├── __init__.py          # Public API
│   ├── core.py              # LifeHarness (dspy.Module) + HarnessCallback
│   ├── environment.py       # Environment protocol, ToolSpec, StepResult
│   ├── skills.py            # Skill dataclass + SkillLibrary (BM25)
│   ├── evolution.py         # HarnessEvolver (GEPA, MIPROv2, BetterTogether)
│   └── layers/
│       ├── contract.py      # H3: ContractEnhancer
│       ├── skill.py         # H5: ProceduralSkillLayer
│       ├── action.py        # H2: ActionRealizer
│       └── trajectory.py    # H4: TrajectoryRegulator
├── examples/
│   └── toy_environment.py   # Demo e-commerce environment
├── main.py                  # Demo runner
└── pyproject.toml
```
Tianshi Xu, Huifeng Wen, Meng Li. "Adapting the Interface, Not the Model: Runtime Harness Adaptation for Deterministic LLM Agents." arXiv 2605.22166, May 2026.
- Python >= 3.12
- dspy-ai >= 3.2.1
- rank-bm25 >= 0.2.2
- `pip install dspy-ai[optuna]` for MIPROv2 support
This implementation is provided for research purposes. The Life-Harness paper is licensed under CC BY 4.0.