
v4.0.0 — Exponentiated-Gradient Canonical Renovation

@HowardLiYH HowardLiYH released this 04 Jun 20:25
· 15 commits to v4-eg-canonical since this release

Headline

This release replaces the V3 additive + clamp + post-normalization
heuristic in the niche affinity update with the canonical
exponentiated-gradient (Hedge / multiplicative-weights) update, and
re-derives every paper headline number under the new algorithm.

| Metric | v1.0–v3.x (V3 heuristic) | v4.0.0 (canonical EG) |
|---|---|---|
| Mean Specialization Index (6 domains) | $0.747$ | $0.992$ |
| Mean Cohen's $d$ vs. homogeneous | $\approx 23$ | $\approx 73$ |
| Std across seeds (typical) | $0.036$–$0.055$ | $0.014$–$0.026$ |
| Mean SI at $\lambda = 0$ (no niche bonus) | $0.329$ | $0.650$ |
| NichePop vs. MARL SI gap (4-domain head-to-head) | $4.3\times$ | $\geq 100\times$ ($1.000$ vs. $\le 0.02$) |
| Traffic ($R = 6$) SI | $0.573$ (lowest) | $0.995$ (no longer an outlier) |

All qualitative findings from prior versions are preserved; the
quantitative magnitudes are substantially strengthened.

What changed and why

The V3 update was an additive heuristic with three structural defects:

  1. Mass drift before normalization: the pre-normalization sum is
    $1 - \eta \alpha_{r_t}$, not 1, so the post-hoc normalization silently
    alters the step size.
  2. Eventual negativity: once specialization is high enough, the
    subtractive penalty drives small entries below zero, requiring an
    undocumented $\max(0.01, \cdot)$ clamp.
  3. State-dependent effective rate: after normalization, the effective
    rate on the winner is $\eta(1 - \alpha_{r_t} + \alpha_{r_t}^2)$, not
    $\eta$, which breaks the standard Hedge regret bound.
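
The first two defects can be illustrated numerically. These notes do not spell out the V3 rule, so the sketch below assumes one additive form that is consistent with the stated drift: the winner gains $\eta(1 - \alpha_{r_t})$ and every other entry loses $\eta / (R - 1)$. The function name `v3_additive_prenorm` and the example values are hypothetical and are not taken from the codebase.

```python
import numpy as np

def v3_additive_prenorm(alpha, winner, eta):
    """ASSUMED V3-style additive step, before clamping and normalization.

    Hypothetical form (not confirmed by the release notes):
      winner:     alpha_w + eta * (1 - alpha_w)
      non-winner: alpha_k - eta / (R - 1)
    """
    alpha = np.asarray(alpha, dtype=float)
    R = alpha.size
    out = alpha - eta / (R - 1)  # subtractive penalty on every entry...
    out[winner] = alpha[winner] + eta * (1.0 - alpha[winner])  # ...overwritten for the winner
    return out

# A highly specialized affinity vector (illustrative values only).
alpha = np.array([0.90, 0.05, 0.03, 0.02])
eta = 0.1
pre = v3_additive_prenorm(alpha, winner=0, eta=eta)

print("pre-norm sum:", pre.sum())            # compare with 1 - eta * alpha[0]
print("expected    :", 1.0 - eta * alpha[0])
print("min entry   :", pre.min())            # negative, hence the clamp
```

Under this assumed form the pre-normalization mass falls short of 1 by exactly $\eta \alpha_{r_t}$, and the two smallest entries go negative in a single step, which is the situation the clamp was papering over.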

V4 replaces this with the canonical EG update on the simplex:

$$\alpha_r^{(t+1)} = \frac{\alpha_r^{(t)} \exp(\eta \cdot \mathbf{1}[r = r_t])}{\sum_k \alpha_k^{(t)} \exp(\eta \cdot \mathbf{1}[k = r_t])}.$$

This update preserves the simplex by construction, preserves the
interior strictly (no clamp needed), reduces to replicator dynamics
in the small-$\eta$ limit, and inherits the canonical
$O(\sqrt{T \log R})$ Hedge regret bound.
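
The EG update above can be sketched in a few lines of NumPy. This is a minimal illustration of the formula, not the code in src/agents/niche_population.py; the function name `eg_update` and its arguments are assumptions made for the example.

```python
import numpy as np

def eg_update(alpha, winner, eta):
    """One exponentiated-gradient (Hedge) step on the probability simplex.

    alpha  : current affinities, non-negative and summing to 1
    winner : index r_t of the regime that receives the unit gain
    eta    : learning rate
    """
    alpha = np.asarray(alpha, dtype=float)
    gain = np.zeros_like(alpha)
    gain[winner] = 1.0                   # indicator 1[r = r_t]
    unnorm = alpha * np.exp(eta * gain)  # multiplicative reweighting
    return unnorm / unnorm.sum()         # renormalize onto the simplex

# Single step: with eta = ln 2 the winner's weight doubles before normalization.
one_step = eg_update([0.5, 0.5], winner=0, eta=np.log(2.0))
print(one_step)  # approximately [2/3, 1/3]

# Repeated wins for one regime: mass concentrates, yet every entry stays positive.
alpha = np.full(4, 0.25)
for _ in range(50):
    alpha = eg_update(alpha, winner=2, eta=0.5)
print(alpha.sum(), alpha.min() > 0)
```

Because each entry is multiplied by a strictly positive factor and then divided by a positive sum, no entry can reach zero or go negative in exact arithmetic, which is why no clamp is needed. For very large $\eta$ a log-domain implementation would be more robust numerically.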

What's in this release

  • Algorithm: src/agents/niche_population.py now dispatches to EG (default) or V3 (update_rule="v3_additive", legacy).
  • Shared helper: experiments/_affinity_update.py centralizes V3/V4 update logic for all experiment scripts.
  • Experiments: every experiment script has been updated to V4 with rate rescaling (exp_unified_pipeline, exp_lambda_ablation, exp_all_domains, exp_lambda_zero_real, exp_rare_regime_resilience, exp_marl_comparison, exp_marl_standalone, exp_method_specialization). exp_task_performance is flagged as synthetic/illustrative.
  • Tests: tests/test_eg_update.py — 19/19 passing.
  • Paper: paper/main.tex (26 pages) and paper/method_deep_dive.tex (72 pages) recompiled with V4 numbers, full proofs, and Hedge regret-bound derivation. PDFs in this release.
  • Reports: docs/V4_FINAL_REPORT.md, results/unified_pipeline/v4_vs_v3_headline.md.

Reproducing the headline numbers

# Unified pipeline (Table 1)
python experiments/exp_unified_pipeline.py

# Lambda ablation (Table 2 in canonical paper)
python experiments/exp_lambda_ablation.py

# Method specialization (Table 2 in main.tex)
python experiments/exp_method_specialization.py

# MARL head-to-head (Table 3)
python experiments/exp_marl_comparison.py

# V3 vs V4 ablation (sanity check + diagnostics)
python experiments/exp_v4_v3_comparison.py --matched-rate

Naming clarification

Following v3.0.0, "agent" was replaced with "participant" in the
codebase. For the v4.0.0 paper text we use "learner" throughout
for academic clarity.