0.6.0 — 2026-06-09

@github-actions github-actions released this 09 Jun 13:50

Release Notes

Added

  • Security pipeline (Ring 2): three new skills ported from defending-code
    • threat-model: trust boundary analysis, threat actor enumeration, threat scenario generation → THREAT_MODEL.md
    • vuln-scan: 4-dimension systematic scanner (injection, auth, data exposure, dependencies) → VULN-FINDINGS.json
    • triage: adversarial validation with severity adjustment, chaining analysis, root-cause grouping → TRIAGE.json
    • Pipeline flow: /threat-model → /vuln-scan → /triage
  • Prompt auto-tuning for evolved skills: underperforming skills receive targeted tuning guidance based on A/B score gaps
    • Tuning sections appended after <!-- auto-tuned --> delimiter — original content never modified
    • Auto-rollback after 3 consecutive declining sessions (TUNING_DECLINE_LIMIT)
    • History tracked in SkillMeta.prompt_tuning_history (capped at 10 entries)
    • New functions: auto_tune_skills(), append_tuning_section(), strip_tuning_sections(), build_tuning_section()
    • 10 new unit tests covering serialization, scoring, decline counting
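
The append-only tuning and rollback rules above can be illustrated with a minimal sketch. The function names append_tuning_section() and strip_tuning_sections() and the TUNING_DECLINE_LIMIT constant come from these notes, but the signatures shown here, the should_rollback() helper, and the choice to replace an earlier tuning section rather than stack a new one are assumptions made for illustration, not the shipped implementation.

```rust
/// Delimiter named in the release notes; text after it is tuning output.
const AUTO_TUNED_DELIMITER: &str = "<!-- auto-tuned -->";
/// Rollback threshold named in the release notes.
const TUNING_DECLINE_LIMIT: usize = 3;

/// Return the skill text before the delimiter (assumed signature).
/// Trailing whitespace is trimmed so repeated tuning does not grow the file.
fn strip_tuning_sections(skill: &str) -> &str {
    match skill.find(AUTO_TUNED_DELIMITER) {
        Some(idx) => skill[..idx].trim_end(),
        None => skill.trim_end(),
    }
}

/// Append guidance after the delimiter (assumed signature).
/// Assumption: any earlier tuning section is replaced, not stacked.
fn append_tuning_section(skill: &str, guidance: &str) -> String {
    let original = strip_tuning_sections(skill);
    format!("{}\n\n{}\n{}\n", original, AUTO_TUNED_DELIMITER, guidance)
}

/// Hypothetical helper: true when the most recent sessions show at least
/// TUNING_DECLINE_LIMIT consecutive score drops.
fn should_rollback(session_scores: &[f64]) -> bool {
    let consecutive_declines = session_scores
        .windows(2)
        .rev()
        .take_while(|pair| pair[1] < pair[0])
        .count();
    consecutive_declines >= TUNING_DECLINE_LIMIT
}
```

In this sketch a rollback amounts to writing strip_tuning_sections(skill) back to disk, which is why keeping the original content above the delimiter untouched matters.
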
  • Audit --strict mode: trust boundary isolation for reviewer/auditor independence
    • Artifact-only delivery: audit modes receive only diff + spec, no builder context
    • Cross-check independence: code/security/test modes run blind until synthesis
    • Blind scoring: prevents anchoring bias between modes
    • No self-review: builder session excluded from audit agent selection
    • Activation: --strict flag or mode: strict in .harness/engagement.md
  • Engagement context: optional .harness/engagement.md for security assessment scoping
    • Defines: Authorization, Scope (in/out), Constraints, Environment, Exclusions
    • secure skill checks for engagement context and loads scope if present
    • Reference template: docs/references/engagement.md
  • Tiered verification ladder in /ship: T0 (build) → T1 (tests+lint+fmt) → T2 (AC verification) → T3 (security)
    • T1/T2 auto-retry ≤3 times
    • T3 conditional on engagement.md or security-scope diff
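
One way to picture the ladder is as a loop over tiers that stops at the first failure. This is only a sketch: the Tier enum, run_ladder(), and the check callback are hypothetical names, and it assumes "auto-retry ≤3 times" means one initial attempt plus up to three retries.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Tier {
    T0Build,
    T1Checks,     // tests + lint + fmt
    T2Acceptance, // AC verification
    T3Security,
}

/// Assumed retry budget for T1 and T2.
const MAX_RETRIES: u32 = 3;

/// Run tiers in order; return the first tier that still fails after its
/// retries. `security_in_scope` stands in for "engagement.md exists or the
/// diff touches security scope".
fn run_ladder(
    security_in_scope: bool,
    mut check: impl FnMut(Tier) -> bool,
) -> Result<(), Tier> {
    let tiers = [
        Tier::T0Build,
        Tier::T1Checks,
        Tier::T2Acceptance,
        Tier::T3Security,
    ];
    for tier in tiers {
        if tier == Tier::T3Security && !security_in_scope {
            continue; // T3 is conditional
        }
        let retries = match tier {
            Tier::T1Checks | Tier::T2Acceptance => MAX_RETRIES,
            _ => 0,
        };
        let mut passed = false;
        for _attempt in 0..=retries {
            if check(tier) {
                passed = true;
                break;
            }
        }
        if !passed {
            return Err(tier);
        }
    }
    Ok(())
}
```

Returning the failing tier lets a caller report exactly where the ladder stopped, for example Err(Tier::T2Acceptance) when acceptance criteria never pass.
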
  • Semantic deduplication in /audit: cross-mode finding dedup between parallel checks and synthesis
    • NEW/DUP_BETTER/DUP_SKIP classification
    • Severity reassessment across modes (highest severity wins)
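
The classification and severity rule can be sketched as follows. This is an illustration under loud assumptions: real semantic matching is replaced here by a precomputed string key, "better" is approximated by a longer description, and every type and function name in the block is hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Severity {
    Low,
    Medium,
    High,
    Critical,
}

#[derive(Debug, Clone, PartialEq)]
struct Finding {
    key: String, // stand-in for a semantic identity
    severity: Severity,
    detail: String,
}

#[derive(Debug, PartialEq, Eq)]
enum DedupClass {
    New,
    DupBetter,
    DupSkip,
}

/// Classify a candidate against findings already collected.
fn classify(existing: &[Finding], candidate: &Finding) -> DedupClass {
    match existing.iter().find(|f| f.key == candidate.key) {
        None => DedupClass::New,
        Some(prev) if candidate.detail.len() > prev.detail.len() => DedupClass::DupBetter,
        Some(_) => DedupClass::DupSkip,
    }
}

/// Merge a candidate into the list; duplicates keep the highest severity.
fn merge(findings: &mut Vec<Finding>, candidate: Finding) -> DedupClass {
    let class = classify(findings.as_slice(), &candidate);
    match class {
        DedupClass::New => findings.push(candidate),
        _ => {
            let prev = findings
                .iter_mut()
                .find(|f| f.key == candidate.key)
                .expect("a duplicate implies an existing match");
            prev.severity = prev.severity.max(candidate.severity);
            if class == DedupClass::DupBetter {
                prev.detail = candidate.detail;
            }
        }
    }
    class
}
```

Note that severity is raised even for a DUP_SKIP candidate, so a terse duplicate from one mode can still escalate a finding reported in more detail by another.
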
  • SkillOpt-inspired evolution optimization: three techniques inspired by deep learning, adapted from SkillOpt for natural-language skill evolution
    • Negative Feedback Buffer: persists rejected skill proposals with TTL-based expiry to prevent re-generating known-bad skills
    • Minibatch Reflection: decomposes observations into fixed-size batches for structural pattern extraction, catching micro-patterns hidden by session averages
    • Slow/Meta Update: epoch classification (Improving/Regressing/PersistentFailure/StableSuccess) with slow parameter tracking per evolved skill
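
To see why fixed-size batches can surface patterns that a session average hides, consider this small sketch. The batch size of 8 matches the minibatch_size default in these notes; the function names, the use of plain numeric scores as observations, and the threshold are assumptions for illustration only.

```rust
/// Default batch size taken from the `minibatch_size` option in these notes.
const MINIBATCH_SIZE: usize = 8;

/// Mean score of each consecutive batch of observations.
fn batch_means(scores: &[f64], batch_size: usize) -> Vec<f64> {
    scores
        .chunks(batch_size.max(1)) // guard against a zero batch size
        .map(|batch| batch.iter().sum::<f64>() / batch.len() as f64)
        .collect()
}

/// Indices of batches whose mean falls below `threshold`.
fn weak_batches(scores: &[f64], batch_size: usize, threshold: f64) -> Vec<usize> {
    batch_means(scores, batch_size)
        .iter()
        .enumerate()
        .filter(|(_, mean)| **mean < threshold)
        .map(|(index, _)| index)
        .collect()
}
```

With eight observations scoring 1.0 followed by eight scoring 0.4, the session average is about 0.7 and looks acceptable against a 0.6 threshold, while weak_batches() flags the second batch.
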
  • Ineffective skill auto-eviction: skills that demonstrably lower session scores are automatically removed after 3+ sessions of negative attribution
  • New config options: rejected_buffer_ttl (default: 10) and minibatch_size (default: 8) in [evolution] section
  • New types: RejectedEntry, MinibatchInsight, EpochClass in src/shared/evolution.rs
  • New functions: analyze_minibatches(), classify_epoch(), update_meta_field(), rejected buffer CRUD
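
A rough sketch of how a TTL-based rejected buffer and an epoch classifier might fit together is given here. The names RejectedEntry, EpochClass, and classify_epoch() appear in these notes, but the fields, signatures, the RejectedBuffer wrapper, the TTL unit (sessions), and the classification rules are all assumptions and may differ from src/shared/evolution.rs.

```rust
#[derive(Debug, Clone, PartialEq)]
struct RejectedEntry {
    skill_name: String,
    sessions_left: u32, // assumed TTL unit: sessions
}

/// Hypothetical container for rejected proposals.
#[derive(Debug, Default)]
struct RejectedBuffer {
    entries: Vec<RejectedEntry>,
}

impl RejectedBuffer {
    /// Record a rejection, resetting the TTL if the skill is already listed.
    fn reject(&mut self, skill_name: &str, ttl: u32) {
        self.entries.retain(|e| e.skill_name != skill_name);
        self.entries.push(RejectedEntry {
            skill_name: skill_name.to_string(),
            sessions_left: ttl,
        });
    }

    /// True while a proposal with this name should not be regenerated.
    fn is_rejected(&self, skill_name: &str) -> bool {
        self.entries.iter().any(|e| e.skill_name == skill_name)
    }

    /// Age every entry by one session and drop the expired ones.
    fn end_session(&mut self) {
        for entry in &mut self.entries {
            entry.sessions_left = entry.sessions_left.saturating_sub(1);
        }
        self.entries.retain(|e| e.sessions_left > 0);
    }
}

#[derive(Debug, PartialEq, Eq)]
enum EpochClass {
    Improving,
    Regressing,
    PersistentFailure,
    StableSuccess,
}

/// Assumed rule: uniform pass or fail wins, otherwise compare first and last.
fn classify_epoch(scores: &[f64], pass_threshold: f64) -> Option<EpochClass> {
    let first = *scores.first()?;
    let last = *scores.last()?;
    let class = if scores.iter().all(|s| *s >= pass_threshold) {
        EpochClass::StableSuccess
    } else if scores.iter().all(|s| *s < pass_threshold) {
        EpochClass::PersistentFailure
    } else if last > first {
        EpochClass::Improving
    } else {
        EpochClass::Regressing
    };
    Some(class)
}
```

Under these assumptions, a proposal rejected with the default rejected_buffer_ttl of 10 would be suppressed for ten sessions and then become eligible again.
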
  • Eval skill + CLI: project quality & regression evaluation with 4 dimensions (correctness, performance, quality, regression)
    • epic eval --init scaffolds eval.yaml with auto-detected stack (Rust/Node/Python/Go/Java)
    • epic eval --json outputs structured results for CI pipelines
    • epic eval --baseline-update saves current run as baseline for regression comparison
    • LLM-as-judge integration in SKILL.md for quality dimension (deferred to LLM session)
    • Orbit integration: Step 5.5 Eval phase inserted automatically when eval.yaml exists
    • New modules: src/eval/{mod,config,runner,baseline,report}.rs
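
The regression dimension compares a run against a saved baseline. A minimal sketch of that comparison might look like the following; the map-of-scores representation, the tolerance parameter, and the regressed_dimensions() name are hypothetical and are not taken from src/eval.

```rust
use std::collections::BTreeMap;

/// Names of baseline entries whose current score dropped by more than
/// `tolerance`, or that are missing from the current run (assumed to count
/// as regressions). Higher scores are assumed to be better.
fn regressed_dimensions(
    baseline: &BTreeMap<String, f64>,
    current: &BTreeMap<String, f64>,
    tolerance: f64,
) -> Vec<String> {
    baseline
        .iter()
        .filter_map(|(name, base)| match current.get(name) {
            Some(now) if *now + tolerance >= *base => None,
            _ => Some(name.clone()),
        })
        .collect()
}
```

A CI job consuming epic eval --json output could fail the build when a list like this is non-empty, and epic eval --baseline-update would then be the deliberate step that accepts new scores as the reference.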

Changed

  • Skill descriptions normalized: removed Trigger: prefix from all 14 skill descriptions, applied consistent [What it does]. [When to use] pattern
  • Skill count: 23 → 26 (9 pipeline + 17 quality gates)

Install epic-harness 0.6.0

Install prebuilt binaries via shell script

curl --proto '=https' --tlsv1.2 -LsSf https://github.com/epicsagas/epic-harness/releases/download/v0.6.0/epic-harness-installer.sh | sh

Install prebuilt binaries via powershell script

powershell -ExecutionPolicy Bypass -c "irm https://github.com/epicsagas/epic-harness/releases/download/v0.6.0/epic-harness-installer.ps1 | iex"

Download epic-harness 0.6.0

File                                           Platform             Checksum
epic-harness-aarch64-apple-darwin.tar.xz       Apple Silicon macOS  checksum
epic-harness-x86_64-apple-darwin.tar.xz        Intel macOS          checksum
epic-harness-x86_64-pc-windows-msvc.zip        x64 Windows          checksum
epic-harness-aarch64-unknown-linux-gnu.tar.xz  ARM64 Linux          checksum
epic-harness-x86_64-unknown-linux-gnu.tar.xz   x64 Linux            checksum