Releases · harnessworks/harness-starter-kit

11 Jun 04:26

baskduf

v0.1.11

af55924

v0.1.11 - Benchmark tasks, Rust profile, and localization coverage

Patch release for deterministic benchmark tasks, stack-profile fixture coverage, and README localization drift hardening. This release adds repository-owned benchmark task specs for the external runner while preserving the rule that project-specific oracles live in this repository, not in the runner.

Added

Rust profile guidance, fixture coverage, smoke-test wiring, installer coverage, and README/profile documentation so Rust crate and Cargo workspace targets have a conservative local verification path.
Buildable gobasic package in the go-basic fixture plus a Go toolchain smoke test that runs the installed Go profile check_harness.py (go build, go vet, go test) when go is available and skips otherwise, closing the verification gap from issue #41.
Eight deterministic benchmark task definitions under benchmarks/tasks/, covering small bugfixes, docs-only boundaries, forbidden-file guards, failure memory, decision memory, profile scope, installer safety, and command workflow guidance.
Benchmark documentation and task outcome evidence for runner smoke checks, Codex dry-run oracle fixes, and benchmark ownership boundaries.
Traditional Chinese README localization and README language-switcher drift coverage.
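
The "run when go is available, skip otherwise" behavior described for the Go toolchain smoke test can be sketched with the standard library alone. This is a minimal illustration, not the repository's actual test: the helper tool_available and the class GoToolchainSmokeTest are hypothetical names, and `go version` stands in for the real profile check that runs go build, go vet, and go test.

```python
import shutil
import subprocess
import unittest


def tool_available(name: str) -> bool:
    """Return True when an executable called `name` is found on PATH."""
    return shutil.which(name) is not None


class GoToolchainSmokeTest(unittest.TestCase):
    """Hypothetical smoke test: runs only when a Go toolchain is installed."""

    @unittest.skipUnless(tool_available("go"), "go toolchain not installed")
    def test_go_toolchain_responds(self):
        # `go version` is a stand-in for the real profile check here.
        result = subprocess.run(
            ["go", "version"], capture_output=True, text=True, check=False
        )
        self.assertEqual(result.returncode, 0, result.stderr)
```

Saved in a test module, this would be picked up by python3 -m unittest discover and reported as skipped, rather than failed, on machines without Go.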

Changed

Harden benchmark task tests so each task has a deterministic oracle, narrow expected files, explicit forbidden files, and required expected-file edits where the runner treats expected_files as an allowlist.
Normalize Markdown-oriented benchmark oracles for docs-only and refresh workflow tasks so ordinary line wrapping does not create false negatives.
Refresh README badges, localized README image references, repository-transfer URLs, static-site metadata, profile lists, and validation docs.
Update /harness update and adoption prompt guidance around source tracking and current repository URLs.
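
One way to keep a Markdown oracle from failing on ordinary line wrapping is to collapse whitespace on both sides before comparing. The sketch below assumes a simple phrase-containment oracle; normalize_markdown_text and contains_phrase are hypothetical names, and the repository's own oracles may normalize differently.

```python
import re


def normalize_markdown_text(text: str) -> str:
    """Collapse every whitespace run, including newlines, to one space."""
    return re.sub(r"\s+", " ", text).strip()


def contains_phrase(document: str, phrase: str) -> bool:
    """Check for `phrase` in `document` while ignoring line wrapping."""
    return normalize_markdown_text(phrase) in normalize_markdown_text(document)


wrapped = "Run the refresh\nworkflow before\n  committing."
print(contains_phrase(wrapped, "refresh workflow before committing"))
```

A plain substring test on the wrapped text would miss the phrase because of the embedded newlines, which is the kind of false negative the change above guards against.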

Validation

python3 -m unittest discover -s tests (194 tests, 2 skipped)
python3 -m py_compile scripts/apply_harness.py scripts/check_docs_drift.py scripts/check_structure.py scripts/check_encoding_hygiene.py scripts/check_effectiveness_plan.py scripts/check_failure_memory.py scripts/check_decision_memory.py scripts/harness_doctor.py
python3 scripts/check_docs_drift.py
python3 scripts/check_structure.py
python3 scripts/check_encoding_hygiene.py
python3 scripts/check_effectiveness_plan.py
python3 scripts/check_failure_memory.py
python3 scripts/check_decision_memory.py
python3 scripts/harness_doctor.py --target . (100/100)
git diff --check


08 Jun 07:27

baskduf

v0.1.10

d5cf04f

v0.1.10 - Harness Doctor v2 coupling diagnostics

Patch release for Harness Doctor v2. This release turns Doctor from a flat baseline scan into a six-element repository health and coupling diagnostic while preserving the boundary between harness readiness and agent effectiveness.

Added

Harness Doctor v2 scoring across Instructions, Constraints, Feedback, Memory, Evaluation, and Governance.
First-class coupling findings for orphan constraints, orphan feedback, unoperationalized memory, unevaluated memory, ungoverned change types, and promotion gaps.
Optional Doctor gates for minimum score and critical coupling findings, disabled by default.
Decision memory for the Doctor v2 model and task outcome evidence for the implementation and review loop.
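
The "disabled by default" behavior of the optional gates can be pictured as a function that reports failures only for gates the caller explicitly turns on. This is a sketch under assumptions: evaluate_gates and its parameters min_score and fail_on_critical are hypothetical and do not reflect the real options of scripts/harness_doctor.py.

```python
from typing import List, Optional


def evaluate_gates(
    score: int,
    critical_findings: List[str],
    min_score: Optional[int] = None,   # hypothetical gate, off when None
    fail_on_critical: bool = False,    # hypothetical gate, off by default
) -> List[str]:
    """Return one message per failed gate; an empty list means pass."""
    failures = []
    if min_score is not None and score < min_score:
        failures.append(f"score {score} is below minimum {min_score}")
    if fail_on_critical and critical_findings:
        failures.append(f"{len(critical_findings)} critical coupling finding(s)")
    return failures


# With no gates enabled, a low score and a finding are reported but never fail.
print(evaluate_gates(40, ["orphan constraint"]))
```

Keeping both gates opt-in means an existing repository can adopt the diagnostic without its checks starting to fail on the first run.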

Changed

Update /harness doctor, the scoring rubric, example reports, component map, theory docs, and roadmap guidance to describe the six-element diagnostic.
Keep Proven/effectiveness signals unmeasured in Doctor output unless durable outcome evidence supports a claim.
Tighten feedback-binding heuristics so generic documentation mentions do not count as execution wiring, and unbound check scripts do not inflate Feedback health.
Expand Doctor regression tests for coupling findings, optional gates, non-comparable outcome evidence, illustrative effectiveness reports, and feedback-binding edge cases.


06 Jun 14:43

baskduf

v0.1.9

d3b9732

v0.1.9 - Operational evidence and command-reference validation

Patch release for operational evidence tracking, Go profile coverage, and command-reference validation. This release strengthens the kit's ability to collect trustworthy agent-work evidence without turning the starter kit into a heavier automation framework.

Added

Go profile guidance, fixture coverage, smoke-test wiring, and README/profile documentation so Go targets have a conservative local verification path.
Task outcome evidence decision guidance for substantial harness work, including required evidence fields for included task outcome records.
Dogfood and effectiveness evidence reports for Today Bus, Harness ERP, and small evidence-pass scenarios, plus task outcome examples for harness adoption and maintenance work.
Failure records and decision memory for dogfood evidence consistency and first-pass task outcome evidence gaps.
Google site verification, sitemap, robots, and static-site metadata updates.

Changed

Extend scripts/check_effectiveness_plan.py, plus the generic template copy, to validate task outcome evidence fields and reject contradictory or stale effectiveness-report completion language.
Extend scripts/check_failure_memory.py and scripts/check_effectiveness_plan.py, plus generic template copies, to validate root make targets and root just recipes referenced by failure-memory records, adoption reports, and task outcome verification commands.
Strengthen dogfood evidence validation, effectiveness templates, adoption evidence checklists, roadmap guidance, and evaluation docs around operational evidence loops.
Refresh README, localized README files, contributor visuals, profile lists, validation docs, and lifecycle pilot notes to match the current evidence and profile coverage.
Revert the Crowdin localization sync path while preserving localized README consistency.
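
Validating make targets referenced by evidence records amounts to comparing the targets a Makefile defines with the targets that documents mention. The sketch below is an assumption-laden illustration, not the logic in scripts/check_failure_memory.py: missing_make_targets is a hypothetical name, and only references written as a backtick-quoted `make <target>` command are considered, so that prose such as "make sure" is ignored.

```python
import re
from typing import List

# A target line looks like "name:"; the lookahead skips ":=" assignments.
MAKE_TARGET = re.compile(r"^([A-Za-z0-9_.-]+)\s*:(?!=)", re.MULTILINE)
# Only backtick-quoted commands count as references (assumption).
MAKE_REFERENCE = re.compile(r"`make\s+([A-Za-z0-9_-]+)")


def missing_make_targets(text: str, makefile: str) -> List[str]:
    """Return referenced targets that the Makefile does not define."""
    defined = set(MAKE_TARGET.findall(makefile))
    referenced = MAKE_REFERENCE.findall(text)
    return sorted({name for name in referenced if name not in defined})


makefile = "CC := gcc\n.PHONY: test lint\ntest:\n\tpytest\nlint:\n\truff check .\n"
record = "Make sure to verify with `make test` and `make docs`."
print(missing_make_targets(record, makefile))  # ['docs']
```

A real checker would also need to handle pattern rules, included Makefiles, and multiple targets on one line, which this sketch deliberately leaves out.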

Validation

python3 -m unittest discover -s tests (146 tests, 1 skipped)
python3 -m py_compile scripts/apply_harness.py scripts/check_docs_drift.py scripts/check_structure.py scripts/check_encoding_hygiene.py scripts/check_effectiveness_plan.py scripts/check_failure_memory.py scripts/check_decision_memory.py scripts/harness_doctor.py
python3 scripts/check_docs_drift.py
python3 scripts/check_structure.py
python3 scripts/check_encoding_hygiene.py
python3 scripts/check_effectiveness_plan.py
python3 scripts/check_failure_memory.py
python3 scripts/check_decision_memory.py
python3 scripts/harness_doctor.py --target . (98/100)
git diff --check


03 Jun 06:27

baskduf

v0.1.8

7d6fac2

v0.1.8

Summary

Hardened failure-memory verification so failure records must point to concrete detection or prevention evidence instead of non-committal future checks.
Added scripts/check_failure_memory.py and kept the generic template copy aligned.
Extended adoption and effectiveness report checks to validate failure-memory linkage, concrete path references, and failure-memory fields.
Added root package.json script existence validation for npm, pnpm, yarn, and bun run <script> references.
Documented the Today Bus Next.js dogfood target alongside the Django dogfood target.
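
The package.json script existence check can be illustrated as extracting every run <script> reference and looking it up in the scripts object. This is a minimal sketch rather than the repository's implementation: missing_scripts and the regular expression are hypothetical, and shorthand forms such as npm test or yarn <script> without run are not covered.

```python
import json
import re
from typing import List

# Matches "npm run x", "pnpm run x", "yarn run x", and "bun run x".
RUN_SCRIPT = re.compile(r"\b(?:npm|pnpm|yarn|bun)\s+run\s+([A-Za-z0-9:_-]+)")


def missing_scripts(text: str, package_json: str) -> List[str]:
    """Return referenced script names absent from package.json "scripts"."""
    scripts = json.loads(package_json).get("scripts", {})
    referenced = RUN_SCRIPT.findall(text)
    return sorted({name for name in referenced if name not in scripts})


package_json = '{"scripts": {"test": "vitest", "lint": "eslint ."}}'
report = "Run `npm run test` and `pnpm run typecheck`, then yarn run lint."
print(missing_scripts(report, package_json))  # ['typecheck']
```

Returning a sorted list keeps the output stable between runs, so a stale command reference produces the same message each time.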

Validation

python3 -m unittest discover -s tests -> 115 OK, 1 skipped
python3 -m py_compile scripts/apply_harness.py scripts/check_docs_drift.py scripts/check_structure.py scripts/check_encoding_hygiene.py scripts/check_effectiveness_plan.py scripts/check_failure_memory.py scripts/check_decision_memory.py scripts/harness_doctor.py
python3 scripts/check_docs_drift.py
python3 scripts/check_structure.py
python3 scripts/check_encoding_hygiene.py
python3 scripts/check_effectiveness_plan.py
python3 scripts/check_failure_memory.py
python3 scripts/check_decision_memory.py
python3 scripts/harness_doctor.py --target . -> 100/100
git diff --check
git diff --cached --check


02 Jun 10:47

baskduf

v0.1.7

94e416b

v0.1.7

Summary

Added deterministic behavior check gate-placement guidance across harness review, refresh, adoption, generic AGENTS, and verification checklist workflows.
Added ADR and failure memory for deterministic product-behavior checks that remain focused/manual without gate-placement review.
Added adoption report gate-placement fields and examples for normal, focused, and manual verification paths.
Extended check_effectiveness_plan.py and the generic template copy to validate adoption report gate-placement fields, including exact heading matching and wrapped/nested field values.

Validation

python3 -m unittest discover -s tests -> 77 OK, 1 skipped
python3 -m py_compile scripts/apply_harness.py scripts/check_docs_drift.py scripts/check_structure.py scripts/check_encoding_hygiene.py scripts/check_effectiveness_plan.py scripts/check_decision_memory.py scripts/harness_doctor.py templates/generic/scripts/check_effectiveness_plan.py
python3 scripts/check_docs_drift.py
python3 scripts/check_structure.py
python3 scripts/check_encoding_hygiene.py
python3 scripts/check_effectiveness_plan.py
python3 scripts/check_decision_memory.py
python3 scripts/harness_doctor.py --target . -> 100/100
git diff --check
git diff --cached --check


02 Jun 05:51

baskduf

v0.1.5

55a9849

v0.1.5

Patch release for the decision-memory follow-up to v0.1.4. This release moves the decision-docs gate from review-only guidance into the generic target template that future adoptions copy.

Added

Decision-memory guidance in the generic AGENTS.md template for non-trivial product or workflow structure, integration or mock external-behavior boundaries, major data models, state classifications, or UX principles that become code structure.
A completion criterion requiring agents to report whether decision docs were added, an existing ADR covers the choice, or no decision record was needed.
A Decision-docs gate field in the harness review report template so the specific /harness review diagnostic does not get lost when reviewers use the template.
Regression coverage that keeps the generic completion gate and review report gate wired in.

Validation

python3 -m unittest discover -s tests
python3 -m py_compile scripts/apply_harness.py scripts/check_docs_drift.py scripts/check_structure.py scripts/check_encoding_hygiene.py scripts/check_effectiveness_plan.py scripts/harness_doctor.py
python3 scripts/check_docs_drift.py
python3 scripts/check_structure.py
python3 scripts/check_encoding_hygiene.py
python3 scripts/check_effectiveness_plan.py
python3 scripts/harness_doctor.py --target .


02 Jun 05:29

baskduf

v0.1.4

9bc6649

v0.1.4

Patch release for governance documentation and review diagnostics. This release keeps /harness review diagnostic-only while tightening the durable-memory review path and clarifying command usage.

Added

Failure memory for missed ADR review when structural product decisions are implemented without decision-record consideration.
A /harness review diagnostic warning for product or workflow structure, mock external-behavior boundaries, major data models, state classifications, or product UX principles that become code structure without a docs/decisions/ update or explicit justification.
iOS as a roadmap candidate profile paired with the existing Android profile, with Xcode, simulator, signing, and device checks documented as macOS/manual unless a target repository already has macOS CI.

Changed

Clarify that /harness ... names are prompt conventions by default, not built-in editor commands.
Refine /harness review sub-agent ownership guidance so reviewer mode and fallback reason stay parent/orchestrator-owned.
Update localized README command guidance to match the English prompt convention wording.

Validation

python3 -m unittest discover -s tests
python3 -m py_compile scripts/apply_harness.py scripts/check_docs_drift.py scripts/check_structure.py scripts/check_encoding_hygiene.py scripts/check_effectiveness_plan.py scripts/harness_doctor.py
python3 scripts/check_docs_drift.py
python3 scripts/check_structure.py
python3 scripts/check_encoding_hygiene.py
python3 scripts/check_effectiveness_plan.py
python3 scripts/harness_doctor.py --target .


31 May 07:31

baskduf

v0.1.3

08513ed

v0.1.3 - Harness review sub-agent routing fix

Patch release for /harness review reviewer-mode routing. This keeps the command diagnostic-only while making subagent availability and fallback reporting harder to skip silently.

Added

Explicit /harness review sub-agent invocation mode for read-only reviewer subagent use when the active runtime and tool instructions allow it.
Review report Invocation, Reviewer mode, and Fallback reason fields in the template and example report.
Regression coverage for subagent fallback guidance, prompt drift, localized README wiring, and route precedence.

Changed

Clarified /harness review fallback behavior when a subagent tool is present but not permitted by active tool instructions.
Routed /harness review sub-agent before the generic /harness review command in agent-facing prompts and command routing.

Validation

python3 -m unittest discover -s tests
python3 -m py_compile scripts/apply_harness.py scripts/check_docs_drift.py scripts/check_structure.py scripts/check_encoding_hygiene.py scripts/check_effectiveness_plan.py scripts/harness_doctor.py
python3 scripts/check_docs_drift.py
python3 scripts/check_structure.py
python3 scripts/check_encoding_hygiene.py
python3 scripts/check_effectiveness_plan.py
python3 scripts/harness_doctor.py --target .


31 May 06:28

baskduf

v0.1.2

d076aa9

v0.1.2

Governance release for change-set review. This release adds a diagnostic /harness review workflow so maintainers can challenge current changes before completion without adding runtime hooks, policy enforcement, CI adapters, or more automatic installer behavior.

Added

/harness review command workflow for opposing harness-engineering review of the current change set.
Harness review report template and example report.
Quick Start, full adoption prompt, localized README, static site, and component-map wiring for /harness review.
Regression tests that keep /harness review command routing, localized docs, report-template sections, and prompt drift covered.

Changed

Clarify that the existing harness review checklist is a periodic maintenance checklist, distinct from the /harness review change-set command.
Update the roadmap from adding /harness review to refining it through real target-repository use.


30 May 12:31

baskduf

v0.1.1

d4a5a09

v0.1.1

Stabilization release for the initial harness workflow. This release strengthens the theory, evaluation, failure-memory, and contributor guidance added around the v0.1.0 early release.

Added

Harness engineering theory document that separates repository harness health from observed agent effectiveness.
Task outcome record template for comparable agent-work observations.
Roadmap and expanded contributor guidance for profiles, drift checks, adoption examples, and release validation.
Regression coverage that keeps the static site copy prompt aligned with the README adoption prompt.

Changed

Compact root and generic AGENTS.md guidance while preserving command routing, analysis, validation, and commit rules.
Clarify python3 validation commands for macOS/Linux environments where python is unavailable.
Clarify Harness Doctor score scope and non-scored evaluation/governance signals.
Strengthen adoption and update guidance around failure-memory records for user-visible runtime failures, high-risk bug paths, failed checks, repeated agent mistakes, and cross-environment mismatches.

