Gate evolution: detect and flag overly restrictive scoring gates #4

@poofeth

Description

Problem

Hard gates in the judge are fixed at lab initialization and never questioned by the framework. In practice, a gate can systematically block profitable experiments for many cycles until a human manually intervenes. The framework should detect when a gate is the dominant rejection cause and surface it for review.

In a real deployment, a single gate accounted for >30% of all rejections and blocked a strategy with +13.6% return for 6 cycles. The human eventually removed it, which was the single most impactful scoring change of the entire session.

Proposal

Track gate failure frequency in branch_beliefs.json at the lab level:

{
  "gate_failure_counts": {
    "G1_min_entries": 3,
    "G4_custom_gate": 8,
    "G8_walk_forward": 12
  },
  "gate_binding_rate": {
    "G4_custom_gate": 0.38
  }
}
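The bookkeeping above can be sketched as a small helper that folds each cycle's rejections into the lab-level stats. This is a minimal sketch, not the framework's actual API: the function name `update_gate_stats` and the shape of the `rejections` argument are assumptions; only the `branch_beliefs.json` keys come from the proposal.

```python
import json
from collections import Counter

def update_gate_stats(beliefs_path, rejections):
    """Fold one cycle's rejections into gate_failure_counts and
    gate_binding_rate in branch_beliefs.json.

    `rejections` maps a rejected experiment id to the gate that blocked
    it, e.g. {"exp_17": "G4_custom_gate"} (hypothetical format).
    """
    with open(beliefs_path) as f:
        beliefs = json.load(f)

    # Accumulate new failures on top of the persisted counts.
    counts = Counter(beliefs.get("gate_failure_counts", {}))
    counts.update(rejections.values())
    total = sum(counts.values())

    beliefs["gate_failure_counts"] = dict(counts)
    # Binding rate = fraction of all rejections attributable to each gate.
    beliefs["gate_binding_rate"] = {
        gate: round(n / total, 2) for gate, n in counts.items()
    }

    with open(beliefs_path, "w") as f:
        json.dump(beliefs, f, indent=2)
    return beliefs
```

Storing the rate alongside the raw counts keeps the handoff message cheap to generate: no recomputation over experiment history is needed.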

Automatic detection

After each scoring cycle, update gate failure counts. When a single gate accounts for >40% of rejections across the lab (minimum 10 rejections):

  1. Flag in handoff: "Gate {G} is the binding constraint on {N}/{M} rejections ({pct}%). Experiments blocked by this gate had average metric={X}. Consider whether this gate is appropriate."

  2. Propose diagnostic: "Run the top 3 G-rejected experiments with the gate disabled. If aggregate PnL/metric is positive, the gate is too restrictive."

  3. On human checkpoint cycles: Explicitly surface: "Gate {G} has blocked {N} experiments. Recommendation: {relax/remove/keep with justification}."

Optional auto-relaxation

For labs that opt in:

gate_evolution:
  auto_relax: true
  binding_threshold: 0.40  # fraction of rejections from one gate
  min_rejections: 10
  relaxation_factor: 0.80  # multiply threshold by 0.8
  max_relaxations: 2  # per gate
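Wired together, opt-in relaxation might look like the following. This is a sketch under assumptions: the gate record shape (`threshold` plus a `relaxations` counter) and the function name `maybe_relax_gate` are hypothetical, and the min_rejections guard is assumed to have been applied upstream during detection. Only the config keys mirror the YAML above.

```python
def maybe_relax_gate(gates, gate_name, binding_rate, cfg):
    """Relax one gate's threshold in place, if the lab opted in.

    `gates` maps gate name -> {"threshold": float, "relaxations": int}
    (a hypothetical storage shape); `cfg` mirrors the gate_evolution
    YAML block. Returns True if a relaxation was applied.
    """
    gate = gates[gate_name]
    if not cfg["auto_relax"]:
        return False
    if binding_rate <= cfg["binding_threshold"]:
        return False  # gate is not the binding constraint
    if gate.get("relaxations", 0) >= cfg["max_relaxations"]:
        return False  # cap reached; escalate to the human instead
    gate["threshold"] *= cfg["relaxation_factor"]
    gate["relaxations"] = gate.get("relaxations", 0) + 1
    return True
```

Capping relaxations per gate (max_relaxations: 2) matters: a gate that keeps binding after two relaxations is probably wrong in kind, not in degree, and should go to the human checkpoint rather than be eroded to nothing.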

Why this matters

Gates should protect against bad strategies, not block good ones. A gate designed for one domain or phase of research may become the bottleneck as the lab evolves. The framework should surface this automatically rather than requiring human intuition to diagnose why nothing is getting promoted.

Relationship to existing features

  • Fits naturally into Step 7 (Update State) as an additional check
  • Complements the frame_challenge meta-branch, which asks "are we measuring the right thing?"
  • Gate relaxation proposals are a type of meta experiment (zero budget cost)
