DHMS Agent Harness v0.3.1 — Schema & Report Polish

Overview

DHMS Agent Harness v0.3.1 standardizes the multi-case execution summary schema
and improves report readability for local/mock Agent Harness suite runs. This
release builds on the v0.2.1 evidence-sealed prototype and focuses on making
multi-case outputs stable, readable, and externally interpretable.

No new real OpenClaw or DeepSeek confirmations were run for this release.

Focus

v0.3.1 focuses on:

  1. standardized execution_summary.json schema
  2. A/B/C taxonomy wording freeze
  3. readable multi-case Markdown reports
  4. preserved single-case compatibility

Standardized Execution Summary

execution_summary.json now uses stable top-level keys:

  • schema_version
  • run_metadata
  • suite_summary
  • taxonomy_summary
  • consistency_summary
  • cases

Each case entry includes:

  • case_id
  • taxonomy_domain
  • taxonomy_label
  • execution_safety_result
  • semantic_property_result
  • final_status
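
The key lists above can be checked mechanically. The following is a minimal sketch, not the harness's own validator: the key names are taken from these notes, while the function name `find_schema_problems` and the assumption that `cases` is a JSON array of objects are illustrative only.

```python
# Key names come from the release notes above. The helper name and the
# assumption that "cases" is a list of objects are hypothetical.
REQUIRED_TOP_LEVEL_KEYS = (
    "schema_version",
    "run_metadata",
    "suite_summary",
    "taxonomy_summary",
    "consistency_summary",
    "cases",
)
REQUIRED_CASE_KEYS = (
    "case_id",
    "taxonomy_domain",
    "taxonomy_label",
    "execution_safety_result",
    "semantic_property_result",
    "final_status",
)


def find_schema_problems(summary):
    """Return a list of problems; an empty list means every expected key is present."""
    if not isinstance(summary, dict):
        return ["summary is not a JSON object"]
    problems = []
    for key in REQUIRED_TOP_LEVEL_KEYS:
        if key not in summary:
            problems.append(f"missing top-level key: {key}")
    cases = summary.get("cases", [])
    if not isinstance(cases, list):
        problems.append("cases is not a list")
        return problems
    for index, case in enumerate(cases):
        if not isinstance(case, dict):
            problems.append(f"cases[{index}] is not an object")
            continue
        for key in REQUIRED_CASE_KEYS:
            if key not in case:
                problems.append(f"cases[{index}] missing key: {key}")
    return problems
```

A caller could load the file with `json.load` and pass the result to `find_schema_problems`; the sketch checks only key presence, not value types or allowed values, which these notes do not specify.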

A/B/C Taxonomy

The taxonomy wording is frozen as:

  • A = Action Risk Domain
  • B = Memory / Context Risk Domain
  • C = Reserved Context Coordination Domain

C is reserved only: this release neither implements a C-domain case nor
changes the existing A/B/C semantic definitions.
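
For tooling that consumes the taxonomy, the frozen wording could be held in a lookup table. This is a sketch under assumptions: the three labels are quoted from these notes, but the names `TAXONOMY_DOMAINS`, `RESERVED_DOMAINS`, and `has_cases` are hypothetical and not part of the harness.

```python
# Labels are quoted from the release notes; all identifiers are hypothetical.
TAXONOMY_DOMAINS = {
    "A": "Action Risk Domain",
    "B": "Memory / Context Risk Domain",
    "C": "Reserved Context Coordination Domain",
}

# C is reserved only in this release, so no cases are expected under it.
RESERVED_DOMAINS = frozenset({"C"})


def has_cases(domain_code):
    """Return True if the domain may carry cases, False if it is reserved."""
    if domain_code not in TAXONOMY_DOMAINS:
        raise ValueError(f"unknown taxonomy domain: {domain_code!r}")
    return domain_code not in RESERVED_DOMAINS
```

Keeping the reserved set separate from the labels means a later release could activate C without touching the frozen wording.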

Report Readability

The suite Markdown report now starts with a compact DHMS Evaluation Report
header and includes a per-case summary table showing:

  • case id
  • taxonomy domain
  • execution safety result
  • semantic property result
  • final status
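
A per-case table with these five columns could be rendered from the case entries roughly as follows. This is an illustrative sketch rather than the harness's report code: the column titles and key names come from these notes, while `render_case_table` and the pipe-escaping choice are assumptions.

```python
# Column titles and case keys follow the release notes; the function name
# and the escaping strategy are hypothetical.
COLUMNS = (
    ("case id", "case_id"),
    ("taxonomy domain", "taxonomy_domain"),
    ("execution safety result", "execution_safety_result"),
    ("semantic property result", "semantic_property_result"),
    ("final status", "final_status"),
)


def render_case_table(cases):
    """Build a Markdown table with one row per case entry."""
    header = "| " + " | ".join(title for title, _ in COLUMNS) + " |"
    divider = "| " + " | ".join("---" for _ in COLUMNS) + " |"
    rows = []
    for case in cases:
        # Escape pipes so a value cannot break the table layout.
        cells = [str(case.get(key, "")).replace("|", "\\|") for _, key in COLUMNS]
        rows.append("| " + " | ".join(cells) + " |")
    return "\n".join([header, divider, *rows])
```

Missing keys render as empty cells instead of raising, so a partially populated case entry still produces a readable row.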

Single-case mode is unchanged and remains compatible with --case / --case-id.

Validation Scope

v0.3.1 validation was mock/local only. It did not run OpenClaw, DeepSeek, a real
provider API, or a real agent suite.

Limitations

This release does not claim:

  • new real model validation
  • full-suite production validation
  • production certification
  • multi-model certification
  • system-level sandbox proof
  • real LLM Judge validation
  • HTTP Adapter availability

No real LLM Judge was used, and the HTTP Adapter is still not implemented.

Release Status

Tag: v0.3.1-schema-report-polish