# GIFT Phase 3: Blind Challenge Validation

**Purpose**: Rigorous validation following AI council recommendations

## Protocol (Pre-registered)

1. **Blind Challenge**: Pre-registered "Primary + 2×Primary" conductors vs controls
2. **Null Grammars**: 100 alternative grammars of same complexity
3. **A Priori Predictions**: Test specific predictions BEFORE seeing data
4. **GPU Acceleration**: CuPy for Monte Carlo and bootstrap

---

## Pre-Registered Predictions (LOCKED BEFORE COMPUTATION)

### Prediction 1: Primary + 2×Primary conductors will outperform controls
- **GIFT conductors**: q = 43, 49, 35, 31, 17, 38, 56
- **Control conductors**: q = 23, 29, 37, 47, 53, 59, 61
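
The "Primary + 2×Primary" pattern can be checked mechanically. The following is a minimal sketch, not part of the pre-registered protocol: the helper name `decompositions` and the decision to search over atoms as well as primaries (while leaving out the Fibonacci atoms) are assumptions made for illustration. It lists every way of writing a conductor q as a + 2·b with a and b drawn from the constant set.

```python
from typing import Dict, List, Tuple

# Values copied from the GIFT constants defined in Part 1 of this notebook.
GIFT_VALUES: Dict[int, str] = {
    2: "p2", 3: "N_gen", 7: "dim_K7", 11: "D_bulk",
    8: "rank_E8", 14: "dim_G2", 21: "b2", 27: "dim_J3O", 77: "b3", 99: "H_star",
}

def decompositions(q: int, values: Dict[int, str] = GIFT_VALUES) -> List[Tuple[str, str]]:
    """Return all (name_a, name_b) with q == a + 2*b (hypothetical helper)."""
    found = []
    for a, name_a in values.items():
        for b, name_b in values.items():
            if a + 2 * b == q:
                found.append((name_a, name_b))
    return found

for q in (43, 49, 35, 56):
    print(q, decompositions(q))
```

The same call can be run on the control list to check whether those conductors admit a decomposition under the chosen constant set.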

### Prediction 2: q = 42 will show special structure
- If 42 is truly universal, L(s, χ₄₂) should show strong Fibonacci constraint

### Prediction 3: GIFT grammar will beat 95%+ of null grammars

---

In [None]:
# ============================================================================
# SETUP: Detect GPU and import accordingly
# ============================================================================

import numpy as np
import json
import time
from typing import Dict, List, Tuple, Optional, Set
from dataclasses import dataclass
from collections import Counter
import random

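# Try GPU (CuPy)
try:
    import cupy as cp
    from cupyx.scipy import sparse as cp_sparse
    # Query the device inside the try block: CuPy can import successfully
    # on a machine with no usable CUDA device, and the queries then raise.
    n_devices = cp.cuda.runtime.getDeviceCount()
    device = cp.cuda.Device()
    props = cp.cuda.runtime.getDeviceProperties(device.id)
    device_name = props['name']
    if isinstance(device_name, bytes):
        device_name = device_name.decode()
    GPU_AVAILABLE = True
    xp = cp  # Use CuPy as array library
    print(f"✓ GPU available: {n_devices} device(s)")
    print(f"  Device: {device_name}")
    print(f"  Memory: {device.mem_info[1] / 1e9:.1f} GB")
except Exception as exc:  # ImportError, or a CUDA runtime error when no device is usable
    GPU_AVAILABLE = False
    xp = np  # Fallback to NumPy
    print(f"⚠ GPU not available ({type(exc).__name__}), using CPU (NumPy)")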

# mpmath for L-function computation (always CPU)
import mpmath as mp
mp.mp.dps = 25
print(f"✓ mpmath precision: {mp.mp.dps} digits")

# scipy for statistics
from scipy import stats

---
## Part 1: GIFT Constants and Grammars

In [None]:
# ============================================================================
# GIFT CONSTANTS (the "atoms" and "primaries")
# ============================================================================

@dataclass
class GIFTConstants:
    """All GIFT topological constants."""
    # Atoms
    p2: int = 2             # Pontryagin class
    N_gen: int = 3          # Generations
    dim_K7: int = 7         # K₇ dimension
    D_bulk: int = 11        # Bulk dimension
    
    # Fibonacci primes (secondary atoms)
    Weyl: int = 5           # F₅
    F7: int = 13            # F₇
    
    # Primaries (products of atoms)
    rank_E8: int = 8        # 2³
    dim_G2: int = 14        # 2 × 7
    b2: int = 21            # 3 × 7
    dim_J3O: int = 27       # 3³
    b3: int = 77            # 7 × 11
    H_star: int = 99        # 3² × 11
    
    # Derived
    chi_K7: int = 42        # 2 × 3 × 7
    dim_E8: int = 248       # E₈ dimension
    h_G2: int = 6           # Coxeter number

GIFT = GIFTConstants()

# Atoms and primaries as sets
ATOMS = {2, 3, 7, 11}
FIBONACCI_ATOMS = {5, 13}
PRIMARIES = {8, 14, 21, 27, 77, 99}
ALL_GIFT = ATOMS | FIBONACCI_ATOMS | PRIMARIES | {42, 248}

print(f"GIFT Atoms: {ATOMS}")
print(f"Fibonacci Atoms: {FIBONACCI_ATOMS}")
print(f"Primaries: {PRIMARIES}")
print(f"All GIFT constants: {sorted(ALL_GIFT)}")

In [None]:
# ============================================================================
# PRE-REGISTERED CONDUCTOR LISTS (LOCKED)
# ============================================================================

# Conductors following "Primary + 2×Primary" pattern
GIFT_CONDUCTORS = {
    # Primary + 2×Primary pattern
    43: 'b₂ + 2×D_bulk = 21 + 22',
    35: 'b₂ + 2×dim(K₇) = 21 + 14',
    49: 'b₂ + 2×dim(G₂) = 21 + 28',
    56: 'dim(G₂) + 2×b₂ = 14 + 42',
    38: '2×(b₂ - 2) = 2×19 or dim(G₂) + 3×rank(E₈)',
    31: 'N_gen + 2×dim(G₂) = 3 + 28',
    
    # Direct GIFT primes
    5: 'Weyl (F₅)',
    7: 'dim(K₇)',
    11: 'D_bulk',
    13: 'F₇',
    
    # Other conductors with GIFT meaning
    17: 'dim(G₂) + N_gen = 14 + 3',
    21: 'b₂ (Second Betti)',
    77: 'b₃ (Third Betti)',
    42: 'χ_K7 = 2×b₂ (THE 42 TEST)',
}

# Control conductors (no obvious GIFT decomposition)
CONTROL_CONDUCTORS = {
    23: 'Prime (no GIFT)',
    29: 'Prime (no GIFT)',
    37: 'Prime (no GIFT)',
    47: 'Prime (no GIFT)',
    53: 'Prime (no GIFT)',
    59: 'Prime (no GIFT)',
    61: 'Prime (no GIFT)',
    67: 'Prime (no GIFT)',
    71: 'Prime (no GIFT)',
    73: 'Prime (no GIFT)',
}

# All conductors to test
ALL_CONDUCTORS = {**GIFT_CONDUCTORS, **CONTROL_CONDUCTORS}

print(f"GIFT conductors ({len(GIFT_CONDUCTORS)}): {sorted(GIFT_CONDUCTORS.keys())}")
print(f"Control conductors ({len(CONTROL_CONDUCTORS)}): {sorted(CONTROL_CONDUCTORS.keys())}")
print(f"\nTotal conductors to test: {len(ALL_CONDUCTORS)}")

---
## Part 2: L-Function Computation (CPU)
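
Before trusting a truncated Dirichlet series on the critical line, it can help to test the same kind of partial sum at a point where a closed form is known. The sketch below is an illustration under stated assumptions: it uses the quadratic character mod 5 and s = 1, where the class number formula gives L(1, χ₅) = 2·ln(φ)/√5 with φ the golden ratio. The helper names `chi_mod5` and `partial_L` are hypothetical and are not used elsewhere in the notebook.

```python
import math

def chi_mod5(n: int) -> int:
    """Quadratic character mod 5 via Euler's criterion (valid because 5 is prime)."""
    if n % 5 == 0:
        return 0
    return 1 if pow(n, 2, 5) == 1 else -1

def partial_L(s: float, terms: int) -> float:
    """Partial sum of the Dirichlet series sum_n chi(n) / n**s."""
    return sum(chi_mod5(n) / n ** s for n in range(1, terms + 1))

phi = (1 + math.sqrt(5)) / 2
exact = 2 * math.log(phi) / math.sqrt(5)   # closed form for L(1, chi_5)
approx = partial_L(1.0, 10_000)
print(f"partial sum = {approx:.6f}, closed form = {exact:.6f}")
```

The series converges only conditionally for 0 < Re(s) ≤ 1, so agreement at s = 1 does not by itself guarantee the same accuracy at s = 1/2 + it.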

In [None]:
# ============================================================================
# L-FUNCTION COMPUTATION (mpmath, CPU only) - OPTIMIZED
# ============================================================================

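def legendre_symbol(a: int, p: int) -> int:
    """Compute Legendre symbol (a/p) for an odd prime p (Euler's criterion)."""
    if a % p == 0:
        return 0
    result = pow(a % p, (p - 1) // 2, p)
    return 1 if result == 1 else -1


def jacobi_symbol(a: int, n: int) -> int:
    """Compute Jacobi symbol (a/n) for odd n > 0 via quadratic reciprocity."""
    a %= n
    result = 1
    while a != 0:
        # Pull out factors of 2 using (2/n) = -1 iff n ≡ 3, 5 (mod 8)
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        # Reciprocity: swapping flips the sign iff both are ≡ 3 (mod 4)
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0  # 0 when gcd(a, n) > 1


def kronecker_symbol(a: int, n: int) -> int:
    """Compute Kronecker symbol (a/n) for general n (sign of n is ignored)."""
    if n == 1:
        return 1
    if n == 0:
        return 1 if abs(a) == 1 else 0
    
    # Factor out powers of 2
    v = 0
    n_copy = abs(n)
    while n_copy % 2 == 0:
        v += 1
        n_copy //= 2
    
    if v > 0:
        if a % 2 == 0:
            k2 = 0
        else:
            r = a % 8
            k2 = 1 if r in [1, 7] else -1
        k2 = k2 ** v
    else:
        k2 = 1
    
    if n_copy == 1:
        return k2
    
    # Odd part via Jacobi symbol. Euler's criterion (legendre_symbol) is only
    # valid for prime moduli, and conductors such as 21, 35, 49, 77 are composite.
    return k2 * jacobi_symbol(a, n_copy)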


def dirichlet_L_function(s, q: int, terms: int = 8000):
    """Compute Dirichlet L-function L(s, χ_q) where χ is quadratic character."""
    result = mp.mpc(0)
    for n in range(1, terms + 1):
        if q == 2:
            chi_n = 1 if n % 2 == 1 else 0
        else:
            chi_n = kronecker_symbol(n, q)
        if chi_n != 0:
            result += mp.mpc(chi_n) / mp.power(n, s)
    return result


def find_zeros_dirichlet(q: int, num_zeros: int = 50, T_max: float = 90, 
                         step: float = 0.25, terms: int = 8000) -> List[float]:
    """
    Locate candidate zeros of L(1/2 + it, χ_q) from sign changes of Re L.

    Scans t upward from 1.0 in increments of `step` and bisects every
    bracket in which the real part changes sign.
    """
    zeros = []
    t = 1.0
    prev_val = dirichlet_L_function(mp.mpc(0.5, t), q, terms)
    prev_sign = 1 if prev_val.real > 0 else -1
    
    while len(zeros) < num_zeros and t < T_max:
        t += step
        curr_val = dirichlet_L_function(mp.mpc(0.5, t), q, terms)
        curr_sign = 1 if curr_val.real > 0 else -1
        
        if curr_sign != prev_sign:
            # Bisection: 15 iterations shrink the bracket to step / 2**15.
            # The sign at t_low always equals prev_sign, so it is not recomputed.
            t_low, t_high = t - step, t
            for _ in range(15):
                t_mid = (t_low + t_high) / 2
                val_mid = dirichlet_L_function(mp.mpc(0.5, t_mid), q, terms)
                sign_mid = 1 if val_mid.real > 0 else -1
                if sign_mid != prev_sign:
                    t_high = t_mid
                else:
                    t_low = t_mid
            zero_t = (t_low + t_high) / 2
            # Discard brackets that land within 0.1 of the previous zero
            if not zeros or abs(zero_t - zeros[-1]) > 0.1:
                zeros.append(zero_t)
        
        prev_val = curr_val
        prev_sign = curr_sign
    
    return zeros

print("✓ L-function computation ready (OPTIMIZED: 8000 terms, 15 bisection iter)")

---
## Part 3: Recurrence Analysis (GPU-accelerated)
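
A quick way to gain confidence in a lagged least-squares fit is to run it on synthetic data whose coefficients are known. The sketch below is illustrative only: `fit_lagged` is a hypothetical stand-in for the fitting routine in this part, and the lags [2, 3] and coefficients 0.5, 0.2, 1.0 are arbitrary choices rather than GIFT values. It simulates a noisy recurrence and checks that the fit recovers the lag coefficients.

```python
import numpy as np

def fit_lagged(seq: np.ndarray, lags: list) -> np.ndarray:
    """Least-squares fit of seq[n] = sum_i a_i * seq[n - lag_i] + c (illustrative helper)."""
    max_lag = max(lags)
    # Column for each lag: seq shifted so that row i holds seq[max_lag + i - lag]
    cols = [seq[max_lag - lag: len(seq) - lag] for lag in lags]
    X = np.column_stack(cols + [np.ones(len(seq) - max_lag)])
    y = seq[max_lag:]
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs

rng = np.random.default_rng(0)
true_a2, true_a3, true_c = 0.5, 0.2, 1.0
x = np.zeros(5000)
for n in range(3, len(x)):
    x[n] = true_a2 * x[n - 2] + true_a3 * x[n - 3] + true_c + rng.normal()

a2, a3, c = fit_lagged(x, [2, 3])
print(f"a2 = {a2:.3f} (true {true_a2}), a3 = {a3:.3f} (true {true_a3})")
```

For smooth, nearly linear sequences the lagged columns can be close to collinear, in which case individual coefficients may be poorly determined even when R² is close to 1.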

In [None]:
# ============================================================================
# RECURRENCE FITTING (GPU/CPU adaptive)
# ============================================================================

# GIFT lags (Fibonacci-based)
GIFT_LAGS = [5, 8, 13, 21]

def fit_recurrence(zeros: np.ndarray, lags: List[int] = GIFT_LAGS, 
                   use_gpu: bool = GPU_AVAILABLE) -> Tuple[Optional[np.ndarray], float, float]:
    """
    Fit linear recurrence γ_n = Σ a_i × γ_{n-lag_i} + c.
    
    Returns: (coefficients, RMSE, R²)
    """
    max_lag = max(lags)
    n_points = len(zeros) - max_lag
    
    if n_points < 10:
        return None, float('inf'), 0.0
    
    # Build design matrix
    X = np.zeros((n_points, len(lags) + 1))
    y = np.zeros(n_points)
    
    for i in range(n_points):
        idx = i + max_lag
        y[i] = zeros[idx]
        for j, lag in enumerate(lags):
            X[i, j] = zeros[idx - lag]
        X[i, -1] = 1  # Constant term
    
    # Solve least squares
    if use_gpu and GPU_AVAILABLE:
        X_gpu = cp.asarray(X)
        y_gpu = cp.asarray(y)
        coeffs_gpu, _, _, _ = cp.linalg.lstsq(X_gpu, y_gpu, rcond=None)
        coeffs = cp.asnumpy(coeffs_gpu)
        y_pred = cp.asnumpy(X_gpu @ coeffs_gpu)
    else:
        coeffs, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
        y_pred = X @ coeffs
    
    # Compute metrics
    rmse = float(np.sqrt(np.mean((y - y_pred)**2)))
    ss_res = np.sum((y - y_pred)**2)
    ss_tot = np.sum((y - np.mean(y))**2)
    r2 = float(1 - ss_res / ss_tot) if ss_tot > 0 else 0.0
    
    return coeffs, rmse, r2


def compute_fibonacci_ratio(coeffs: np.ndarray, lags: List[int] = GIFT_LAGS) -> Optional[float]:
    """
    Compute the Fibonacci constraint ratio R = (8×a₈)/(13×a₁₃).
    Should be ≈ 1 if Fibonacci structure holds.
    """
    if coeffs is None:
        return None
    
    try:
        idx_8 = lags.index(8)
        idx_13 = lags.index(13)
    except ValueError:
        return None
    
    a8 = coeffs[idx_8]
    a13 = coeffs[idx_13]
    
    if abs(a13) < 1e-10:
        return None
    
    return float((8 * a8) / (13 * a13))


print(f"✓ Recurrence fitting ready (GPU: {GPU_AVAILABLE})")
print(f"  GIFT lags: {GIFT_LAGS}")

---
## Part 4: Null Grammar Generation
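
Null grammars are easier to audit when each one is drawn from its own random generator instead of the module-level `random` state. The following sketch is one possible variant, not the generator used in this notebook: `sample_grammar`, `SMALL_PRIMES`, and the parameters `n_atoms`, `max_primaries`, and `cap` are hypothetical names, and the grammars it returns are not guaranteed to match those of `generate_null_grammar` for the same seed.

```python
import random
from itertools import combinations_with_replacement
from typing import Optional, Set, Tuple

SMALL_PRIMES = [2, 3, 5, 7, 11, 13, 17, 19]

def sample_grammar(seed: Optional[int] = None, n_atoms: int = 4,
                   max_primaries: int = 6, cap: int = 150) -> Tuple[Set[int], Set[int]]:
    """Draw atoms and pairwise-product primaries from a private RNG."""
    rng = random.Random(seed)  # private generator, global random state untouched
    atoms = sorted(rng.sample(SMALL_PRIMES, n_atoms))
    # All products a*b (including squares) below the size cap, in sorted order
    products = sorted({a * b for a, b in combinations_with_replacement(atoms, 2)
                       if a * b < cap})
    if len(products) > max_primaries:
        products = rng.sample(products, max_primaries)
    return set(atoms), set(products)

atoms, primaries = sample_grammar(seed=7)
print(f"atoms={sorted(atoms)}, primaries={sorted(primaries)}")
```

Sorting before sampling keeps the result independent of set iteration order, so a given seed always yields the same grammar.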

In [None]:
# ============================================================================
# NULL GRAMMAR TESTING
# ============================================================================

def generate_null_grammar(seed: Optional[int] = None) -> Tuple[Set[int], Set[int]]:
    """
    Generate a random "grammar" of complexity comparable to GIFT.
    - 4 atoms (primes below 20)
    - up to 6 primaries (pairwise products of atoms, each below 150)
    """
    if seed is not None:
        random.seed(seed)
    
    # Random 4 primes from small primes
    small_primes = [2, 3, 5, 7, 11, 13, 17, 19]
    atoms = set(random.sample(small_primes, 4))
    
    # Generate primaries as products
    atom_list = list(atoms)
    primaries = set()
    
    # Products of pairs
    for i in range(len(atom_list)):
        for j in range(i, len(atom_list)):
            p = atom_list[i] * atom_list[j]
            if p < 150:  # Keep reasonable size
                primaries.add(p)
    
    # Limit to 6 primaries like GIFT
    if len(primaries) > 6:
        primaries = set(random.sample(list(primaries), 6))
    
    return atoms, primaries


def grammar_generates_conductor(q: int, atoms: Set[int], primaries: Set[int]) -> bool:
    """
    Check if conductor q can be expressed as:
    - A single atom/primary
    - Sum of two primaries
    - Primary + 2×Primary
    """
    all_constants = atoms | primaries
    
    # Direct match
    if q in all_constants:
        return True
    
    # Sum of two
    for c1 in all_constants:
        for c2 in all_constants:
            if c1 + c2 == q:
                return True
            if c1 + 2*c2 == q:  # Primary + 2×Primary
                return True
    
    return False


def test_grammar_on_results(results: List[Dict], atoms: Set[int], primaries: Set[int]) -> float:
    """
    Test how well a grammar separates "good" from "bad" conductors.
    Returns mean |R-1| for conductors the grammar claims to generate.
    """
    grammar_conductors = []
    
    for r in results:
        q = r['q']
        if grammar_generates_conductor(q, atoms, primaries):
            grammar_conductors.append(r['R_deviation'])
    
    if not grammar_conductors:
        return float('inf')
    
    return float(np.mean(grammar_conductors))


print("✓ Null grammar generation ready")

# Test
test_atoms, test_primaries = generate_null_grammar(seed=42)
print(f"  Example null grammar: atoms={test_atoms}, primaries={test_primaries}")

---
## Part 5: GPU-Accelerated Bootstrap
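
The percentile bootstrap for a difference of means can also be written in vectorized NumPy with a seeded generator, which makes the interval reproducible from run to run. This is a minimal sketch under stated assumptions: `bootstrap_ci` and its `seed` parameter are hypothetical, and the two sample arrays are made-up numbers, not results from this notebook.

```python
import numpy as np

def bootstrap_ci(g1, g2, n_boot: int = 10_000, seed: int = 0):
    """Percentile bootstrap 95% CI for mean(g1) - mean(g2), resampling each group."""
    rng = np.random.default_rng(seed)
    g1 = np.asarray(g1, dtype=float)
    g2 = np.asarray(g2, dtype=float)
    # One row of resampled indices per bootstrap replicate
    idx1 = rng.integers(0, len(g1), size=(n_boot, len(g1)))
    idx2 = rng.integers(0, len(g2), size=(n_boot, len(g2)))
    diffs = g1[idx1].mean(axis=1) - g2[idx2].mean(axis=1)
    lo, hi = np.percentile(diffs, [2.5, 97.5])
    return float(g1.mean() - g2.mean()), float(lo), float(hi)

# Made-up illustrative data: the first group is clearly smaller than the second
obs, lo, hi = bootstrap_ci([0.10, 0.20, 0.15, 0.12, 0.18],
                           [0.90, 1.10, 1.00, 0.95, 1.05])
print(f"diff = {obs:.3f}, 95% CI = [{lo:.3f}, {hi:.3f}]")
```

With group sizes as small as those in this notebook, percentile intervals can be narrower than their nominal coverage suggests, so they are best read alongside the rank-based test.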

In [None]:
# ============================================================================
# BOOTSTRAP CONFIDENCE INTERVALS (GPU-accelerated)
# ============================================================================

def bootstrap_mean_diff(group1: np.ndarray, group2: np.ndarray, 
                        n_bootstrap: int = 10000, 
                        use_gpu: bool = GPU_AVAILABLE) -> Tuple[float, float, float]:
    """
    Compute bootstrap confidence interval for mean difference.
    
    Returns: (observed_diff, ci_lower, ci_upper)
    """
    observed_diff = np.mean(group1) - np.mean(group2)
    n1, n2 = len(group1), len(group2)
    
    if use_gpu and GPU_AVAILABLE:
        # GPU-accelerated bootstrap
        g1_gpu = cp.asarray(group1)
        g2_gpu = cp.asarray(group2)
        
        # Generate all bootstrap indices at once
        idx1 = cp.random.randint(0, n1, size=(n_bootstrap, n1))
        idx2 = cp.random.randint(0, n2, size=(n_bootstrap, n2))
        
        # Compute all bootstrap means
        boot_means1 = cp.mean(g1_gpu[idx1], axis=1)
        boot_means2 = cp.mean(g2_gpu[idx2], axis=1)
        boot_diffs = boot_means1 - boot_means2
        
        # Get percentiles
        ci_lower = float(cp.percentile(boot_diffs, 2.5))
        ci_upper = float(cp.percentile(boot_diffs, 97.5))
        
        # Clean up GPU memory
        cp.get_default_memory_pool().free_all_blocks()
    else:
        # CPU fallback
        boot_diffs = np.zeros(n_bootstrap)
        for i in range(n_bootstrap):
            boot1 = np.random.choice(group1, size=n1, replace=True)
            boot2 = np.random.choice(group2, size=n2, replace=True)
            boot_diffs[i] = np.mean(boot1) - np.mean(boot2)
        
        ci_lower = float(np.percentile(boot_diffs, 2.5))
        ci_upper = float(np.percentile(boot_diffs, 97.5))
    
    return observed_diff, ci_lower, ci_upper


def bootstrap_p_value(group1: np.ndarray, group2: np.ndarray,
                      n_bootstrap: int = 10000,
                      use_gpu: bool = GPU_AVAILABLE) -> float:
    """
    One-sided permutation p-value for H₀: mean(group1) >= mean(group2).

    Returns the fraction of random relabelings whose mean difference is
    at most the observed one (small values favor mean(group1) < mean(group2)).
    """
    observed_diff = float(np.mean(group1) - np.mean(group2))
    combined = np.concatenate([group1, group2])
    n1 = len(group1)
    n_total = len(combined)
    
    if use_gpu and GPU_AVAILABLE:
        combined_gpu = cp.asarray(combined)
        
        # Permutation test, evaluated in batches of independent permutations
        count_more_extreme = 0
        batch_size = min(1000, n_bootstrap)
        done = 0
        
        while done < n_bootstrap:
            b = min(batch_size, n_bootstrap - done)
            # Row-wise argsort of uniform noise yields b random permutations
            indices = cp.argsort(cp.random.random((b, n_total)), axis=1)
            shuffled = combined_gpu[indices]
            
            perm_diffs = cp.mean(shuffled[:, :n1], axis=1) - cp.mean(shuffled[:, n1:], axis=1)
            # Count every permutation individually (one count per row)
            count_more_extreme += int(cp.sum(perm_diffs <= observed_diff))
            done += b
        
        cp.get_default_memory_pool().free_all_blocks()
        p_value = count_more_extreme / n_bootstrap
    else:
        count_more_extreme = 0
        for _ in range(n_bootstrap):
            np.random.shuffle(combined)
            perm_diff = np.mean(combined[:n1]) - np.mean(combined[n1:])
            if perm_diff <= observed_diff:
                count_more_extreme += 1
        p_value = count_more_extreme / n_bootstrap
    
    return p_value


print(f"✓ Bootstrap methods ready (GPU: {GPU_AVAILABLE})")

---
## Part 6: Main Computation

⏱️ **Estimated time**: roughly 50-70 minutes for all conductors (about 2-3 minutes each)
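
The runtime is dominated by evaluations of the truncated L-series, so a rough upper bound on the work per conductor follows from the scan and bisection parameters. The arithmetic below is a back-of-the-envelope sketch: the parameter values are copied from this notebook, while `EVALS_PER_ITER = 2` is an assumption meaning at most two series evaluations per bisection step.

```python
# Values copied from the configuration cell of this notebook.
T_MAX, STEP, TERMS, NUM_ZEROS = 90, 0.25, 8000, 50
T_START = 1.0          # the scan in find_zeros_dirichlet starts at t = 1.0
BISECTION_ITERS = 15
EVALS_PER_ITER = 2     # assumption: at most two L-evaluations per bisection step

scan_evals = int((T_MAX - T_START) / STEP) + 1   # one per step plus the initial point
bisect_evals = NUM_ZEROS * BISECTION_ITERS * EVALS_PER_ITER
total_evals = scan_evals + bisect_evals
total_terms = total_evals * TERMS

print(f"scan evaluations:      {scan_evals}")
print(f"bisection evaluations: {bisect_evals}")
print(f"series terms summed:   {total_terms:,} per conductor (upper bound)")
```

The scan stops early once `NUM_ZEROS` zeros are found, so the actual count is usually lower than this bound.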

In [None]:
# ============================================================================
# CONFIGURATION (optimized - matches proven working parameters)
# ============================================================================

NUM_ZEROS = 50          # Zeros per conductor (sufficient for statistics)
T_MAX = 90              # Max imaginary part to search  
STEP = 0.25             # Initial step size (faster scan)
TERMS = 8000            # Series terms for L-function (sufficient precision)

N_NULL_GRAMMARS = 100   # Number of null grammars to test
N_BOOTSTRAP = 10000     # Bootstrap samples

# Estimated time: ~2-3 min per conductor, ~50-70 min total
print(f"Configuration (OPTIMIZED):")
print(f"  Zeros per conductor: {NUM_ZEROS}")
print(f"  T_max: {T_MAX}")
print(f"  Step size: {STEP}")
print(f"  Series terms: {TERMS}")
print(f"  Null grammars: {N_NULL_GRAMMARS}")
print(f"  Bootstrap samples: {N_BOOTSTRAP}")
print(f"  GPU acceleration: {GPU_AVAILABLE}")
print(f"\n⏱️ Estimated: ~2-3 min/conductor, ~50-70 min total")

In [None]:
# ============================================================================
# COMPUTE ZEROS FOR ALL CONDUCTORS
# ============================================================================

results = []
zeros_cache = {}

sorted_conductors = sorted(ALL_CONDUCTORS.keys())

print(f"\nComputing zeros for {len(sorted_conductors)} conductors...")
print(f"Parameters: {NUM_ZEROS} zeros, T_max={T_MAX}, step={STEP}, terms={TERMS}")
print("=" * 70)

total_start = time.time()

for i, q in enumerate(sorted_conductors):
    desc = ALL_CONDUCTORS[q]
    category = 'gift' if q in GIFT_CONDUCTORS else 'control'
    
    print(f"[{i+1}/{len(sorted_conductors)}] q={q:3d} ({category:7s})...", end=" ", flush=True)
    start = time.time()
    
    # Skip q=2 (trivial character)
    if q == 2:
        print("SKIPPED (trivial)")
        continue
    
    # Compute zeros with OPTIMIZED parameters
    zeros = find_zeros_dirichlet(q, NUM_ZEROS, T_MAX, STEP, TERMS)
    elapsed = time.time() - start
    
    if len(zeros) >= 35:  # Minimum 35 zeros (like original notebook)
        zeros_array = np.array(zeros)
        zeros_cache[q] = zeros_array
        
        # Fit recurrence
        coeffs, rmse, r2 = fit_recurrence(zeros_array, GIFT_LAGS)
        R = compute_fibonacci_ratio(coeffs, GIFT_LAGS)
        
        if R is not None:
            R_dev = abs(R - 1)
            results.append({
                'q': q,
                'category': category,
                'description': desc,
                'num_zeros': len(zeros),
                'coeffs': coeffs.tolist() if coeffs is not None else None,
                'R': float(R),
                'R_deviation': float(R_dev),
                'rmse': float(rmse),
                'r2': float(r2),
            })
            print(f"{len(zeros)} zeros, R={R:.3f}, |R-1|={R_dev:.3f} ({elapsed:.0f}s)")
        else:
            print(f"{len(zeros)} zeros, R=N/A ({elapsed:.0f}s)")
    else:
        print(f"only {len(zeros)} zeros - SKIPPED ({elapsed:.0f}s)")

total_elapsed = time.time() - total_start
print("=" * 70)
print(f"\n✓ Completed in {total_elapsed/60:.1f} minutes")
print(f"✓ {len(results)} conductors with valid results")

---
## Part 7: Blind Challenge Analysis
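
The group comparison in this part relies on a one-sided Mann-Whitney U test of the |R-1| values. To make the statistic concrete, the sketch below computes U by direct pair counting. It is an illustration, not the SciPy implementation: `mann_whitney_u` is a hypothetical helper, and it uses the convention in which U counts pairs where the first sample is larger (ties count one half), so a small U favors the first group having the smaller values.

```python
from typing import Sequence

def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> float:
    """U statistic for sample x: number of pairs with x_i > y_j, ties counted as 0.5."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u

# Made-up deviations: every value in the first group is below the second group
low_group = [0.10, 0.20]
high_group = [0.30, 0.40, 0.50]
print(mann_whitney_u(low_group, high_group))   # no pair has x_i > y_j
print(mann_whitney_u(high_group, low_group))   # every pair has x_i > y_j
```

The two U values always sum to the number of pairs, `len(x) * len(y)`, which is a convenient consistency check.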

In [None]:
# ============================================================================
# BLIND CHALLENGE: GIFT vs CONTROL
# ============================================================================

gift_results = [r for r in results if r['category'] == 'gift']
control_results = [r for r in results if r['category'] == 'control']

print("=" * 70)
print("BLIND CHALLENGE: GIFT vs CONTROL CONDUCTORS")
print("=" * 70)

print(f"\nGIFT conductors ({len(gift_results)}):")
for r in sorted(gift_results, key=lambda x: x['R_deviation']):
    print(f"  q={r['q']:3d}: |R-1|={r['R_deviation']:.4f} ({r['description'][:40]})")

print(f"\nControl conductors ({len(control_results)}):")
for r in sorted(control_results, key=lambda x: x['R_deviation']):
    print(f"  q={r['q']:3d}: |R-1|={r['R_deviation']:.4f}")

In [None]:
# ============================================================================
# STATISTICAL TESTS
# ============================================================================

if gift_results and control_results:
    gift_devs = np.array([r['R_deviation'] for r in gift_results])
    control_devs = np.array([r['R_deviation'] for r in control_results])
    
    gift_mean = np.mean(gift_devs)
    control_mean = np.mean(control_devs)
    
    print("\n" + "=" * 70)
    print("STATISTICAL ANALYSIS")
    print("=" * 70)
    
    print(f"\nGroup Statistics:")
    print(f"  GIFT:    n={len(gift_devs)}, mean={gift_mean:.4f}, std={np.std(gift_devs):.4f}")
    print(f"  Control: n={len(control_devs)}, mean={control_mean:.4f}, std={np.std(control_devs):.4f}")
    
    if gift_mean < control_mean:
        ratio = control_mean / gift_mean
        print(f"\n  → GIFT is {ratio:.2f}× BETTER (lower |R-1|)")
    else:
        ratio = gift_mean / control_mean
        print(f"\n  → Control is {ratio:.2f}× better (unexpected!)")
    
    # t-test
    t_stat, p_ttest = stats.ttest_ind(gift_devs, control_devs)
    print(f"\nt-test:")
    print(f"  t-statistic: {t_stat:.3f}")
    print(f"  p-value (two-tailed): {p_ttest:.4f}")
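    # Halving the two-tailed p-value is only valid when t has the predicted sign
    p_one_tailed = p_ttest / 2 if t_stat < 0 else 1 - p_ttest / 2  # H₁: GIFT mean < control mean
    print(f"  p-value (one-tailed): {p_one_tailed:.4f}")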
    
    # Mann-Whitney U
    u_stat, p_mann = stats.mannwhitneyu(gift_devs, control_devs, alternative='less')
    print(f"\nMann-Whitney U:")
    print(f"  U-statistic: {u_stat:.1f}")
    print(f"  p-value (one-tailed): {p_mann:.4f}")
    
    # Bootstrap CI
    print(f"\nBootstrap ({N_BOOTSTRAP} samples):")
    obs_diff, ci_low, ci_high = bootstrap_mean_diff(gift_devs, control_devs, N_BOOTSTRAP)
    print(f"  Observed diff (GIFT - Control): {obs_diff:.4f}")
    print(f"  95% CI: [{ci_low:.4f}, {ci_high:.4f}]")
    
    if ci_high < 0:
        print(f"  → GIFT significantly better (CI entirely negative) ✓")
    elif ci_low > 0:
        print(f"  → Control significantly better (unexpected)")
    else:
        print(f"  → No significant difference (CI includes 0)")
else:
    print("\n⚠ Not enough data for statistical analysis")

---
## Part 8: The 42 Test (Pre-registered Prediction)

In [None]:
# ============================================================================
# THE 42 TEST
# ============================================================================

print("=" * 70)
print("SPECIAL TEST: q = 42 (χ_K7 = 2×3×7)")
print("=" * 70)

q42_result = next((r for r in results if r['q'] == 42), None)

if q42_result:
    print(f"\nq = 42 Results:")
    print(f"  Zeros found: {q42_result['num_zeros']}")
    print(f"  R = {q42_result['R']:.4f}")
    print(f"  |R-1| = {q42_result['R_deviation']:.4f}")
    print(f"  R² = {q42_result['r2']:.6f}")
    
    # Rank among all conductors
    sorted_by_dev = sorted(results, key=lambda x: x['R_deviation'])
    rank_42 = next(i+1 for i, r in enumerate(sorted_by_dev) if r['q'] == 42)
    percentile = (len(results) - rank_42) / len(results) * 100
    
    print(f"\n  Rank: #{rank_42} out of {len(results)} conductors")
    print(f"  Percentile: {percentile:.1f}%")
    
    # Pre-registered prediction: 42 should be in top 25%
    if rank_42 <= len(results) // 4:
        print(f"\n  ✓ PREDICTION CONFIRMED: q=42 is in top 25%")
        q42_verdict = "PASS"
    elif rank_42 <= len(results) // 2:
        print(f"\n  ~ MARGINAL: q=42 is in top 50% but not top 25%")
        q42_verdict = "MARGINAL"
    else:
        print(f"\n  ✗ PREDICTION FAILED: q=42 is NOT in top 50%")
        q42_verdict = "FAIL"
else:
    print("\n⚠ q=42 not in results (insufficient zeros?)")
    q42_verdict = "NO DATA"

---
## Part 9: Null Grammar Competition
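
The competition reduces to an empirical p-value: the share of null grammars that score at least as well as GIFT. The sketch below shows one common variant that adds 1 to both numerator and denominator so the estimate is never exactly zero. This is an alternative convention offered for comparison, not the formula used in the cell that follows; `empirical_p` is a hypothetical helper and the scores are made-up numbers.

```python
from typing import Iterable

def empirical_p(null_scores: Iterable[float], observed: float) -> float:
    """Empirical p-value with +1 correction; lower scores count as better."""
    valid = [s for s in null_scores if s != float("inf")]   # drop grammars with no matches
    hits = sum(1 for s in valid if s <= observed)
    return (hits + 1) / (len(valid) + 1)

# Made-up scores: one of three valid nulls scores at least as well as the observed 0.3
p = empirical_p([0.5, 0.6, 0.2, float("inf")], observed=0.3)
print(f"empirical p-value = {p:.2f}")
```

With 100 null grammars the smallest attainable value under this convention is 1/101, which also makes the resolution limit of the test explicit.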

In [None]:
# ============================================================================
# NULL GRAMMAR COMPETITION
# ============================================================================

print("=" * 70)
print(f"NULL GRAMMAR COMPETITION ({N_NULL_GRAMMARS} random grammars)")
print("=" * 70)

# GIFT grammar performance
gift_grammar_score = test_grammar_on_results(results, ATOMS | FIBONACCI_ATOMS, PRIMARIES)
print(f"\nGIFT grammar mean |R-1|: {gift_grammar_score:.4f}")

# Test null grammars
print(f"\nTesting {N_NULL_GRAMMARS} null grammars...")

null_scores = []
better_than_gift = 0

for i in range(N_NULL_GRAMMARS):
    atoms, primaries = generate_null_grammar(seed=i)
    score = test_grammar_on_results(results, atoms, primaries)
    null_scores.append(score)
    
    if score <= gift_grammar_score:
        better_than_gift += 1
    
    if (i + 1) % 20 == 0:
        print(f"  {i+1}/{N_NULL_GRAMMARS} tested...")

null_scores = np.array([s for s in null_scores if s != float('inf')])

print(f"\nResults:")
print(f"  Valid null grammars: {len(null_scores)}")
print(f"  Null grammar mean |R-1|: {np.mean(null_scores):.4f} ± {np.std(null_scores):.4f}")
print(f"  GIFT grammar mean |R-1|: {gift_grammar_score:.4f}")
print(f"  Null grammars better than GIFT: {better_than_gift}")

p_value_grammar = better_than_gift / N_NULL_GRAMMARS
percentile_gift = (1 - p_value_grammar) * 100

print(f"\n  GIFT percentile: {percentile_gift:.1f}%")
print(f"  p-value: {p_value_grammar:.4f}")

# Pre-registered prediction: GIFT should beat 95% of null grammars
if percentile_gift >= 95:
    print(f"\n  ✓ PREDICTION CONFIRMED: GIFT beats {percentile_gift:.0f}% of null grammars")
    grammar_verdict = "STRONG PASS"
elif percentile_gift >= 80:
    print(f"\n  ~ MARGINAL: GIFT beats {percentile_gift:.0f}% (target: 95%)")
    grammar_verdict = "MARGINAL"
else:
    print(f"\n  ✗ PREDICTION FAILED: GIFT only beats {percentile_gift:.0f}%")
    grammar_verdict = "FAIL"

---
## Part 10: Complete Ranking

In [None]:
# ============================================================================
# COMPLETE RANKING
# ============================================================================

print("=" * 70)
print("COMPLETE RANKING (lower |R-1| = better Fibonacci constraint)")
print("=" * 70)

sorted_results = sorted(results, key=lambda x: x['R_deviation'])

print(f"\n{'Rank':<5} {'q':<4} {'Cat':<8} {'R':>8} {'|R-1|':>8} {'R²':>10} Description")
print("-" * 80)

for i, r in enumerate(sorted_results, 1):
    marker = "★" if r['category'] == 'gift' else "·"
    special = " ← 42!" if r['q'] == 42 else ""
    print(f"{i:<5} {r['q']:<4} {r['category']:<8} {r['R']:>8.3f} {r['R_deviation']:>8.4f} {r['r2']:>10.6f} {r['description'][:25]}{special} {marker}")

print("\nLegend: ★ = GIFT conductor, · = Control")

---
## Part 11: Final Verdict

In [None]:
# ============================================================================
# FINAL VERDICT
# ============================================================================

print("\n" + "#" * 70)
print("#" + " PHASE 3 VALIDATION: FINAL VERDICT ".center(68) + "#")
print("#" * 70)

print("\n" + "=" * 60)
print("PRE-REGISTERED PREDICTIONS:")
print("=" * 60)

# Prediction 1: GIFT vs Control
if gift_results and control_results:
    if gift_mean < control_mean and p_ttest/2 < 0.05:
        pred1 = "✓ PASS"
        pred1_detail = f"GIFT {control_mean/gift_mean:.1f}× better, p={p_ttest/2:.4f}"
    elif gift_mean < control_mean:
        pred1 = "~ MARGINAL"
        pred1_detail = f"GIFT better but p={p_ttest/2:.4f} > 0.05"
    else:
        pred1 = "✗ FAIL"
        pred1_detail = "Control outperformed GIFT"
else:
    pred1 = "? NO DATA"
    pred1_detail = "Insufficient results"

print(f"\n1. GIFT conductors outperform controls: {pred1}")
print(f"   {pred1_detail}")

# Prediction 2: q=42
print(f"\n2. q=42 shows special structure: {q42_verdict}")
if q42_result:
    print(f"   Rank #{rank_42}/{len(results)}, |R-1|={q42_result['R_deviation']:.4f}")

# Prediction 3: Grammar competition
print(f"\n3. GIFT grammar beats 95% of nulls: {grammar_verdict}")
print(f"   Actually beats {percentile_gift:.1f}%")

# Overall
print("\n" + "=" * 60)

verdicts = [pred1, q42_verdict, grammar_verdict]
pass_count = sum(1 for v in verdicts if 'PASS' in v)
fail_count = sum(1 for v in verdicts if 'FAIL' in v)

if pass_count >= 2 and fail_count == 0:
    overall = "STRONG EVIDENCE"
    overall_detail = "Fractal encoding hypothesis SUPPORTED"
elif pass_count >= 1 and fail_count <= 1:
    overall = "MODERATE EVIDENCE"
    overall_detail = "Partial support, some caveats"
elif fail_count >= 2:
    overall = "WEAK EVIDENCE"
    overall_detail = "Hypothesis NOT well supported"
else:
    overall = "INCONCLUSIVE"
    overall_detail = "Mixed results"

print(f"OVERALL: {overall}")
print(f"         {overall_detail}")
print("=" * 60)

print("\n" + "#" * 70)

In [None]:
# ============================================================================
# SAVE RESULTS
# ============================================================================

output = {
    'test': 'GIFT Phase 3 Blind Challenge',
    'timestamp': time.strftime('%Y-%m-%d %H:%M:%S'),
    'gpu_used': GPU_AVAILABLE,
    'parameters': {
        'num_zeros': NUM_ZEROS,
        'T_max': T_MAX,
        'step': STEP,
        'series_terms': TERMS,
        'n_null_grammars': N_NULL_GRAMMARS,
        'n_bootstrap': N_BOOTSTRAP,
        'gift_lags': GIFT_LAGS,
    },
    'results': results,
    'predictions': {
        'pred1_gift_vs_control': {
            'verdict': pred1,
            'gift_mean': float(gift_mean) if gift_results else None,
            'control_mean': float(control_mean) if control_results else None,
            'p_value': float(p_ttest/2) if gift_results and control_results else None,
        },
        'pred2_q42': {
            'verdict': q42_verdict,
            'rank': rank_42 if q42_result else None,
            'R_deviation': q42_result['R_deviation'] if q42_result else None,
        },
        'pred3_grammar': {
            'verdict': grammar_verdict,
            'percentile': float(percentile_gift),
            'p_value': float(p_value_grammar),
        },
    },
    'overall_verdict': overall,
}

output_path = 'GIFT_Phase3_Blind_Challenge_Results.json'
with open(output_path, 'w') as f:
    json.dump(output, f, indent=2)

print(f"\n✓ Results saved to '{output_path}'")
print(f"\nPlease share this file for verification!")

---

## Summary

This Phase 3 validation implements the AI council recommendations:

1. **Blind Challenge**: Pre-registered GIFT vs Control conductors
2. **The 42 Test**: Special test for the "universal constant"
3. **Null Grammar Competition**: GIFT vs 100 random grammars
4. **Statistical Rigor**: Bootstrap CIs, t-test, Mann-Whitney U; a permutation-test helper (`bootstrap_p_value`) is defined but not called in the analysis cells
5. **GPU Acceleration**: CuPy for heavy statistics (when available)

Because the predictions are pre-registered, the hypothesis is falsifiable: if they fail, the fractal encoding hypothesis is weakened.

---
*GIFT Framework — Phase 3 Validation*
*February 2026*