Failure decomposition: categorize rejections to accelerate learning #5

## Problem

When an experiment is REJECTED, the framework logs which gates failed but doesn't diagnose WHY. This makes it hard to learn from failures at scale. The same root cause (e.g., "execution cost kills thin edges") can repeat 10+ times across branches before the orchestrator converges on a fix.

In a real deployment, 62% of experiments were rejected. Many had the same root cause repeated across different branches, but each rejection was treated as an independent failure.

## Proposal

After Step 5 (Collect Results), add automatic failure decomposition for REJECTED experiments.

### Failure categories

```
INSUFFICIENT_DATA     - n_entries < threshold
                        → broaden filter, add data sources, or relax gate

WRONG_PARAMETER_RANGE - metric improves monotonically toward search boundary
                        → extend search space in that direction

WRONG_SIGNAL_TYPE     - metric doesn't respond to any parameter variation
                        → branch hypothesis is wrong, consider exhausting

REGIME_DEPENDENT      - positive in some folds, negative in others
                        → needs regime filter or conditional activation

EXECUTION_KILLED      - positive pre-cost metric, negative post-cost
                        → switch execution mode or find larger edges

CONCENTRATION_RISK    - edge exists but concentrated in few samples/families
                        → needs diversification or larger universe

GATE_BLOCKED          - would have promoted but for one specific gate
                        → flag for gate evolution review (see issue #4)

NOISE                 - metric within 1 sigma of champion, no clear direction
                        → inconclusive, may need more data
```
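
As a rough illustration, the categories above could be assigned by a small rule-based classifier. This is a minimal sketch, not the framework's actual judge: the `RejectedResult` fields, the `min_entries` and `max_share` thresholds, and the order in which rules are checked are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class FailureCategory(str, Enum):
    INSUFFICIENT_DATA = "INSUFFICIENT_DATA"
    WRONG_PARAMETER_RANGE = "WRONG_PARAMETER_RANGE"
    WRONG_SIGNAL_TYPE = "WRONG_SIGNAL_TYPE"
    REGIME_DEPENDENT = "REGIME_DEPENDENT"
    EXECUTION_KILLED = "EXECUTION_KILLED"
    CONCENTRATION_RISK = "CONCENTRATION_RISK"
    GATE_BLOCKED = "GATE_BLOCKED"
    NOISE = "NOISE"


@dataclass
class RejectedResult:
    """Hypothetical summary of one REJECTED experiment (all fields assumed)."""
    n_entries: int
    pre_cost_metric: float
    post_cost_metric: float
    fold_metrics: List[float]
    failed_gates: List[str]
    sigma_from_champion: float     # |metric - champion| in standard deviations
    best_at_search_boundary: bool  # metric kept improving toward the search edge
    responds_to_parameters: bool   # metric moved under some parameter variation
    top_contributor_share: float   # fraction of edge from the largest sample/family


def categorize(r: RejectedResult,
               min_entries: int = 30,      # assumed threshold
               max_share: float = 0.5      # assumed threshold
               ) -> Optional[FailureCategory]:
    """Return the first matching category, or None if no rule applies.

    The rule order is an assumption; a real judge may weigh rules differently.
    """
    if r.n_entries < min_entries:
        return FailureCategory.INSUFFICIENT_DATA
    if r.pre_cost_metric > 0 and r.post_cost_metric < 0:
        return FailureCategory.EXECUTION_KILLED
    if any(m > 0 for m in r.fold_metrics) and any(m < 0 for m in r.fold_metrics):
        return FailureCategory.REGIME_DEPENDENT
    if r.top_contributor_share > max_share:
        return FailureCategory.CONCENTRATION_RISK
    if r.best_at_search_boundary:
        return FailureCategory.WRONG_PARAMETER_RANGE
    if len(r.failed_gates) == 1:
        return FailureCategory.GATE_BLOCKED
    if r.sigma_from_champion <= 1.0:
        return FailureCategory.NOISE
    if not r.responds_to_parameters:
        return FailureCategory.WRONG_SIGNAL_TYPE
    return None
```

Returning `None` when nothing matches keeps unclassified rejections visible instead of forcing them into a category.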

### Implementation

1. The judge (or a post-judge analysis step) assigns a failure category to each REJECT
2. Categories are logged in `experiment_log.jsonl` under `failure_category`
3. Track category distributions per branch in `branch_beliefs.json`
4. When a branch accumulates 3+ failures of the same category, the orchestrator proposes the corresponding fix in the handoff
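
The logging and counting steps above might look roughly like the following sketch. Only the `failure_category` field and the `experiment_log.jsonl` file name come from this proposal; the other record fields (`experiment_id`, `branch`, `verdict`) and all function names are hypothetical.

```python
import json
from collections import Counter, defaultdict
from pathlib import Path
from typing import Dict, List, Tuple


def log_rejection(log_path: Path, experiment_id: str, branch: str,
                  failure_category: str) -> None:
    """Append one REJECTED experiment to the JSONL log (one JSON object per line)."""
    record = {
        "experiment_id": experiment_id,   # hypothetical field
        "branch": branch,                 # hypothetical field
        "verdict": "REJECTED",            # hypothetical field
        "failure_category": failure_category,
    }
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def failure_distributions(log_path: Path) -> Dict[str, Dict[str, int]]:
    """Count failure categories per branch by re-reading the log."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    with log_path.open(encoding="utf-8") as fh:
        for line in fh:
            if not line.strip():
                continue
            record = json.loads(line)
            category = record.get("failure_category")
            if category:
                counts[record["branch"]][category] += 1
    return {branch: dict(c) for branch, c in counts.items()}


def repeated_failures(distributions: Dict[str, Dict[str, int]],
                      threshold: int = 3) -> List[Tuple[str, str, int]]:
    """List (branch, category, count) entries that reached the threshold."""
    return [
        (branch, category, n)
        for branch, dist in distributions.items()
        for category, n in dist.items()
        if n >= threshold
    ]
```

Rebuilding the distribution from the log keeps the log as the single source of truth; an incremental update of `branch_beliefs.json` would work equally well.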

### Orchestrator behavior

In synthesis (Step 5b), after collecting rejections, the orchestrator emits a note such as:
> "Branch {X} has {N} consecutive EXECUTION_KILLED failures. The signal has a positive pre-cost edge but execution costs destroy it. Recommended action: switch to maker mode or increase minimum edge threshold."

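One way to generate such a note is to measure the trailing run of identical categories for a branch and format a message once the run reaches a threshold. The sketch below assumes each branch's failure categories are available as a chronological list of strings; the function names, the threshold default, and the exact message wording are illustrative. The action strings are taken from the category list in this issue.

```python
from typing import List, Optional, Tuple

# Remedies copied from the failure-category list in this proposal.
RECOMMENDED_ACTION = {
    "INSUFFICIENT_DATA": "broaden filter, add data sources, or relax gate",
    "WRONG_PARAMETER_RANGE": "extend search space in that direction",
    "WRONG_SIGNAL_TYPE": "branch hypothesis is wrong, consider exhausting",
    "REGIME_DEPENDENT": "add a regime filter or conditional activation",
    "EXECUTION_KILLED": "switch execution mode or find larger edges",
    "CONCENTRATION_RISK": "diversify or use a larger universe",
    "GATE_BLOCKED": "flag for gate evolution review",
    "NOISE": "inconclusive, may need more data",
}


def trailing_run(categories: List[str]) -> Optional[Tuple[str, int]]:
    """Return (category, length) of the run of identical items at the end."""
    if not categories:
        return None
    last = categories[-1]
    n = 0
    for category in reversed(categories):
        if category != last:
            break
        n += 1
    return last, n


def synthesis_note(branch: str, categories: List[str],
                   threshold: int = 3) -> Optional[str]:
    """Build a handoff note, or None if the trailing run is below threshold."""
    run = trailing_run(categories)
    if run is None or run[1] < threshold:
        return None
    category, n = run
    action = RECOMMENDED_ACTION.get(category, "no predefined action")
    return (f"Branch {branch} has {n} consecutive {category} failures. "
            f"Recommended action: {action}.")
```

Counting only the trailing run matches the "consecutive" wording in the note; counting all failures of a category per branch is the alternative if interleaved failures should also trigger it.
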
### Branch-level tracking

```json
{
  "branch_name": {
    "failure_distribution": {
      "EXECUTION_KILLED": 4,
      "REGIME_DEPENDENT": 2,
      "NOISE": 1
    },
    "dominant_failure": "EXECUTION_KILLED",
    "recommended_action": "switch to maker mode"
  }
}
```
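
The `dominant_failure` and `recommended_action` fields can be derived from `failure_distribution` rather than stored by hand. A minimal sketch, assuming the action lookup is passed in as a plain dictionary (the `summarize_branch` name and the `actions` parameter are hypothetical):

```python
import json
from collections import Counter
from typing import Dict, Optional


def summarize_branch(failure_distribution: Dict[str, int],
                     actions: Dict[str, str]) -> Dict[str, object]:
    """Build one branch entry in the shape shown above."""
    dominant: Optional[str] = None
    if failure_distribution:
        # most_common(1) returns the highest count; ties keep insertion order.
        dominant = Counter(failure_distribution).most_common(1)[0][0]
    return {
        "failure_distribution": dict(failure_distribution),
        "dominant_failure": dominant,
        "recommended_action": actions.get(dominant) if dominant else None,
    }


beliefs = {
    "branch_name": summarize_branch(
        {"EXECUTION_KILLED": 4, "REGIME_DEPENDENT": 2, "NOISE": 1},
        {"EXECUTION_KILLED": "switch to maker mode"},
    )
}
print(json.dumps(beliefs, indent=2))
```

Deriving both fields on write means they cannot drift out of sync with the counts.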

## Why this matters

Failures contain as much information as successes. A lab that treats every REJECT as an opaque "didn't work" is throwing away signal. Categorizing failures turns rejections into directed next steps. This is the difference between random search and adaptive search.

## Relationship to existing features

- Extends the synthesis step (5b) with structured failure analysis
- Feeds into the research scout: "Branch X is stuck with REGIME_DEPENDENT failures → scout for regime detection techniques"
- Feeds into gate evolution (issue #4): "Branch X is stuck with GATE_BLOCKED failures → review the blocking gate"
- Complements diagnostics: persistent failure categories are natural triggers for diagnostic experiments
