[plan] Define shared outcome model and fallback semantics for safe output evaluation #35034

@github-actions

Objective

Implement a normalized outcome model that classifies each result by outcome_status and evidence_strength, and define explicit fallback semantics so that weak-evidence results are never counted as accepted.

Context

This is foundational work for issue #35033. All subsequent evaluator slices depend on this shared model.

Approach

  1. Define an OutcomeStatus type with values: accepted, rejected, pending, ignored, skipped, unknown
  2. Define an EvidenceStrength type with values: strong, medium, weak
  3. Create a shared OutcomeEvaluation struct/type containing:
    • outcome_status
    • evidence_strength
    • signal (e.g., target_exists_only, merged, closed, acted_on)
  4. Implement a target_exists_only fallback evaluator that returns unknown status with weak evidence, never accepted
  5. Update the outcome reporting pipeline (JSONL output and telemetry fields) to emit normalized fields:
    • outcome_status, evidence_strength, signal
  6. Update dashboard field definitions to include the new counts: accepted_strong, accepted_medium, accepted_weak, fallback_exists_only_count, etc.
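
The steps above can be sketched in Go as follows. This is a minimal illustration under stated assumptions, not the repository's actual implementation: the constant names, the `EvaluateTargetExistsOnly` and `ToJSONL` helpers, and the choice of Go over JavaScript are all hypothetical. Only the status values, strength values, and the three field names come from this issue.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// OutcomeStatus is the normalized status of an evaluated safe output.
type OutcomeStatus string

const (
	StatusAccepted OutcomeStatus = "accepted"
	StatusRejected OutcomeStatus = "rejected"
	StatusPending  OutcomeStatus = "pending"
	StatusIgnored  OutcomeStatus = "ignored"
	StatusSkipped  OutcomeStatus = "skipped"
	StatusUnknown  OutcomeStatus = "unknown"
)

// EvidenceStrength describes how much the signal supports the status.
type EvidenceStrength string

const (
	EvidenceStrong EvidenceStrength = "strong"
	EvidenceMedium EvidenceStrength = "medium"
	EvidenceWeak   EvidenceStrength = "weak"
)

// SignalTargetExistsOnly marks the existence-only fallback.
const SignalTargetExistsOnly = "target_exists_only"

// OutcomeEvaluation is the shared model; the JSON tags match the
// normalized field names listed in this issue.
type OutcomeEvaluation struct {
	OutcomeStatus    OutcomeStatus    `json:"outcome_status"`
	EvidenceStrength EvidenceStrength `json:"evidence_strength"`
	Signal           string           `json:"signal"`
}

// EvaluateTargetExistsOnly (hypothetical name) is the fallback evaluator:
// knowing only that the target exists yields unknown/weak, never accepted.
func EvaluateTargetExistsOnly() OutcomeEvaluation {
	return OutcomeEvaluation{
		OutcomeStatus:    StatusUnknown,
		EvidenceStrength: EvidenceWeak,
		Signal:           SignalTargetExistsOnly,
	}
}

// ToJSONL (hypothetical helper) serializes one evaluation as a single
// newline-terminated JSON line.
func ToJSONL(ev OutcomeEvaluation) (string, error) {
	b, err := json.Marshal(ev)
	if err != nil {
		return "", err
	}
	return string(b) + "\n", nil
}

func main() {
	line, err := ToJSONL(EvaluateTargetExistsOnly())
	if err != nil {
		panic(err)
	}
	fmt.Print(line)
}
```

Using string-backed types keeps the JSONL values identical to the names used in this issue, so no separate mapping layer is needed between the model and the reporting pipeline.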

Files to Modify

  • Locate the existing outcome evaluation code (likely in actions/setup/js/ or pkg/) and create or update the shared model there
  • Update JSONL reporting to emit normalized fields
  • Add unit tests for the shared model and fallback evaluator

Acceptance Criteria

  • OutcomeStatus and EvidenceStrength types are defined
  • target_exists_only fallback returns unknown (not accepted)
  • No existence-only fallback is counted as accepted in metrics
  • JSONL output emits outcome_status and evidence_strength fields
  • Tests cover the fallback evaluator and shared model
  • make agent-finish passes
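
The metrics criterion can be illustrated with a small tally sketch in Go. Everything here is an assumption for illustration: the `Evaluation` and `Counts` types, the `Tally` function, and the rule that an existence-only signal is routed to its own counter before the accepted buckets are considered. The pairing of a merged signal with accepted/strong in the sample data is also assumed, since this issue does not define the signal-to-status mapping.

```go
package main

import "fmt"

// Evaluation is a simplified stand-in for the shared outcome model.
type Evaluation struct {
	OutcomeStatus    string
	EvidenceStrength string
	Signal           string
}

// Counts mirrors the dashboard fields named in this issue.
type Counts struct {
	AcceptedStrong     int
	AcceptedMedium     int
	AcceptedWeak       int
	FallbackExistsOnly int
}

// Tally (hypothetical) aggregates evaluations into dashboard counts.
// Existence-only fallbacks are counted separately and skipped before the
// accepted buckets, so they can never inflate accepted metrics.
func Tally(evals []Evaluation) Counts {
	var c Counts
	for _, e := range evals {
		if e.Signal == "target_exists_only" {
			c.FallbackExistsOnly++
			continue
		}
		if e.OutcomeStatus != "accepted" {
			continue
		}
		switch e.EvidenceStrength {
		case "strong":
			c.AcceptedStrong++
		case "medium":
			c.AcceptedMedium++
		case "weak":
			c.AcceptedWeak++
		}
	}
	return c
}

func main() {
	evals := []Evaluation{
		{"accepted", "strong", "merged"},           // assumed mapping
		{"unknown", "weak", "target_exists_only"},  // fallback result
		{"rejected", "strong", "closed"},           // assumed mapping
	}
	fmt.Printf("%+v\n", Tally(evals))
}
```

Checking the signal before the status is a defensive choice: even a mislabelled fallback record would land in fallback_exists_only_count rather than in an accepted bucket.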

Generated by 📋 Plan Command · sonnet46 949.9K ·

  • expires on May 28, 2026, 8:13 PM UTC
