DHMS Agent Harness v0.3.2 - Reproducibility Package

Overview

DHMS Agent Harness v0.3.2 adds reproducibility packaging for the v0.3.1
mock/local multi-case report. External developers can clone the repository and
reproduce the multi-case report without OpenClaw, DeepSeek, provider API keys,
or real agent execution.

This release is mock/local only. No new real OpenClaw or DeepSeek confirmations
were run for this release.

v0.3.2 builds on:

v0.2.1-agent-harness-evidence-seal - evidence-sealed prototype
v0.3.1-schema-report-polish - schema and report polish

Reproduction Command

Run from the repository root:

python3 cli.py test-agent-suite \
  --suite cases/agent_core \
  --run-all-cases \
  --mock-agent \
  --report \
  --output reports/reproducibility/v0.3.1_mock_all_cases
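
For scripted runs (for example in CI), the command above could be wrapped in a small Python helper. This is only a sketch: the function names `build_command` and `run_reproduction` are hypothetical and not part of the harness, and it substitutes the current interpreter (`sys.executable`) for `python3`. The flags themselves are copied from the command above.

```python
import subprocess
import sys
from pathlib import Path

# Output directory taken from the reproduction command above.
DEFAULT_OUTPUT = "reports/reproducibility/v0.3.1_mock_all_cases"


def build_command(output_dir: str = DEFAULT_OUTPUT) -> list[str]:
    """Return the argv list for the mock/local multi-case run (hypothetical helper)."""
    return [
        sys.executable, "cli.py", "test-agent-suite",
        "--suite", "cases/agent_core",
        "--run-all-cases",
        "--mock-agent",
        "--report",
        "--output", output_dir,
    ]


def run_reproduction(repo_root: str = ".", output_dir: str = DEFAULT_OUTPUT) -> int:
    """Run the command from repo_root and return its exit code (hypothetical helper)."""
    if not (Path(repo_root) / "cli.py").is_file():
        raise FileNotFoundError(
            f"cli.py not found under {repo_root!r}; run from the repository root"
        )
    completed = subprocess.run(build_command(output_dir), cwd=repo_root)
    return completed.returncode
```

Calling `run_reproduction()` from the repository root should behave like the shell command; a non-zero return value means the CLI reported a failure.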

Reference Artifacts

The reproducibility package includes:

docs/reproducibility/v0.3.1-mock-local-multicase.md
docs/reproducibility/artifacts/v0.3.1_mock_all_cases/execution_summary.json
docs/reproducibility/artifacts/v0.3.1_mock_all_cases/suite_agent_report.md

Only lightweight reference artifacts are committed. HTML output, logs, secrets,
and real OpenClaw/DeepSeek outputs are not committed.

Expected Reproduction Summary

The mock/local run should report:

total_cases=6
taxonomy_summary: A=5, B=1, C=0
execution_summary.json exists
suite_agent_report.md exists
no real tool execution
no side effects
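
The expected values above can be checked mechanically after a run. The sketch below makes two assumptions that this note does not confirm: that `execution_summary.json` sits directly in the `--output` directory, and that it exposes a top-level `total_cases` integer and a `taxonomy_summary` object keyed by `A`, `B`, and `C`. Adjust the keys if the actual schema differs. The helper names are hypothetical.

```python
import json
from pathlib import Path

# Expected values from the summary above.
EXPECTED_TOTAL = 6
EXPECTED_TAXONOMY = {"A": 5, "B": 1, "C": 0}


def check_summary(summary: dict) -> list[str]:
    """Return a list of mismatches; an empty list means the summary matches.

    Assumes (unverified) keys: 'total_cases' and 'taxonomy_summary'.
    """
    problems = []
    if summary.get("total_cases") != EXPECTED_TOTAL:
        problems.append(
            f"total_cases: expected {EXPECTED_TOTAL}, got {summary.get('total_cases')!r}"
        )
    taxonomy = summary.get("taxonomy_summary")
    if not isinstance(taxonomy, dict):
        taxonomy = {}
    for label, expected in EXPECTED_TAXONOMY.items():
        if taxonomy.get(label) != expected:
            problems.append(
                f"taxonomy {label}: expected {expected}, got {taxonomy.get(label)!r}"
            )
    return problems


def check_output_dir(output_dir: str) -> list[str]:
    """Check that both report files exist and that the JSON summary matches."""
    root = Path(output_dir)
    problems = []
    for name in ("execution_summary.json", "suite_agent_report.md"):
        if not (root / name).is_file():
            problems.append(f"missing {name}")
    summary_path = root / "execution_summary.json"
    if summary_path.is_file():
        data = json.loads(summary_path.read_text(encoding="utf-8"))
        problems.extend(check_summary(data))
    return problems
```

For example, `check_output_dir("reports/reproducibility/v0.3.1_mock_all_cases")` returns an empty list when the run matches the expected summary. The "no real tool execution" and "no side effects" items are not covered by this check.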

Validation Scope

v0.3.2 validation is mock/local only. It does not require a real model, API key,
OpenClaw, DeepSeek, or a real LLM Judge.

Limitations

This release does not claim:

new real model validation
new real OpenClaw or DeepSeek confirmations
full-suite production validation
production certification
multi-model certification
system-level sandbox proof
real LLM Judge validation
HTTP Adapter availability

No real LLM Judge was used, and the HTTP Adapter is still not implemented.

Release Status

Tag: v0.3.2-reproducibility-package
