# Loop 11 Final Strategic Analysis

## Key Question: Should we submit exp_011?

**exp_011 Results:**
- CV: 0.82032 (+/- 0.01408) - BEST CV EVER
- 10-fold StratifiedKFold with regularization
- Regularization IMPROVED CV (contrary to overfitting hypothesis)

**Previous Submissions (gap = LB - CV, in percentage points):**
- exp_000: CV 0.8067 → LB 0.7971 (gap: -0.96 pp)
- exp_003: CV 0.8195 → LB 0.8045 (gap: -1.50 pp)
- exp_004: CV 0.8193 → LB 0.8041 (gap: -1.52 pp)
- exp_006: CV 0.8171 → LB 0.8010 (gap: -1.61 pp)

In [1]:
import numpy as np
import pandas as pd
from scipy import stats

# CV-LB data from submissions
submissions = [
    {'exp': 'exp_000', 'cv': 0.8067, 'lb': 0.7971},
    {'exp': 'exp_003', 'cv': 0.8195, 'lb': 0.8045},
    {'exp': 'exp_004', 'cv': 0.8193, 'lb': 0.8041},
    {'exp': 'exp_006', 'cv': 0.8171, 'lb': 0.8010},
]

df = pd.DataFrame(submissions)
df['gap'] = df['lb'] - df['cv']   # negative gap: LB scored below CV
df['gap_pct'] = df['gap'] * 100   # gap expressed in percentage points
print("Submission History:")
print(df.to_string(index=False))
print(f"\nMean CV-LB gap: {df['gap'].mean():.4f} ({df['gap_pct'].mean():.2f}%)")

Submission History:
    exp     cv     lb     gap  gap_pct
exp_000 0.8067 0.7971 -0.0096    -0.96
exp_003 0.8195 0.8045 -0.0150    -1.50
exp_004 0.8193 0.8041 -0.0152    -1.52
exp_006 0.8171 0.8010 -0.0161    -1.61

Mean CV-LB gap: -0.0140 (-1.40%)


In [2]:
# Linear regression to predict LB from CV (only 4 points, so treat the fit as a rough guide)
from sklearn.linear_model import LinearRegression

X = df['cv'].values.reshape(-1, 1)
y = df['lb'].values

model = LinearRegression()
model.fit(X, y)

print(f"CV-LB Model: LB = {model.coef_[0]:.3f} * CV + {model.intercept_:.3f}")
print(f"R² = {model.score(X, y):.3f}")

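# Predict LB for exp_011 and compare against the best LB observed so far
exp_011_cv = 0.82032
best_lb = df['lb'].max()  # 0.8045 (exp_003), derived from the data instead of hard-coded
predicted_lb = model.predict([[exp_011_cv]])[0]
print(f"\nexp_011 CV: {exp_011_cv:.5f}")
print(f"Predicted LB: {predicted_lb:.5f}")
print(f"Best LB so far: {best_lb:.4f}")
print(f"Difference: {predicted_lb - best_lb:.5f}")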

CV-LB Model: LB = 0.541 * CV + 0.360
R² = 0.916

exp_011 CV: 0.82032
Predicted LB: 0.80420
Best LB so far: 0.8045
Difference: -0.00030
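
Because the fit rests on only four submissions, it may be worth checking how much a single point moves it. The sketch below is not part of the original analysis: it refits the line with each submission left out in turn and re-predicts the LB for exp_011. The helper `fit_line` and the leave-one-out loop are illustrative assumptions, written in plain Python so the block runs without scikit-learn.

```python
# CV/LB pairs copied from the submission history above.
names = ["exp_000", "exp_003", "exp_004", "exp_006"]
cv = [0.8067, 0.8195, 0.8193, 0.8171]
lb = [0.7971, 0.8045, 0.8041, 0.8010]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (hypothetical helper)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


exp_011_cv = 0.82032
full_slope, full_intercept = fit_line(cv, lb)
print(f"all points: slope={full_slope:.3f}, intercept={full_intercept:.3f}")

# Drop one submission at a time and see how far the prediction moves.
loo_predictions = {}
for i, name in enumerate(names):
    xs = cv[:i] + cv[i + 1:]
    ys = lb[:i] + lb[i + 1:]
    slope, intercept = fit_line(xs, ys)
    loo_predictions[name] = slope * exp_011_cv + intercept
    print(f"without {name}: slope={slope:.3f}, "
          f"predicted LB={loo_predictions[name]:.5f}")
```

If the slope changes substantially when one submission is removed, the point estimate for exp_011 should be read as indicative only.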


In [3]:
# What CV is needed to beat 0.8045 LB?
target_lb = 0.8045
required_cv = (target_lb - model.intercept_) / model.coef_[0]
print(f"To beat LB 0.8045, need CV > {required_cv:.5f}")
print(f"exp_011 CV: {exp_011_cv:.5f}")
print(f"Gap: {exp_011_cv - required_cv:.5f}")

# What about to beat 0.8050?
target_lb_2 = 0.8050
required_cv_2 = (target_lb_2 - model.intercept_) / model.coef_[0]
print(f"\nTo beat LB 0.8050, need CV > {required_cv_2:.5f}")

To beat LB 0.8045, need CV > 0.82087
exp_011 CV: 0.82032
Gap: -0.00055

To beat LB 0.8050, need CV > 0.82180


In [4]:
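# Key insight: the CV-LB relationship is not perfect, so inspect the residuals.
# Note: pandas .std() uses ddof=1; with two fitted parameters and n=4,
# this likely understates the true spread of LB around the fitted line.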
df['predicted_lb'] = model.predict(df['cv'].values.reshape(-1, 1))
df['residual'] = df['lb'] - df['predicted_lb']
print("Residual Analysis:")
print(df[['exp', 'cv', 'lb', 'predicted_lb', 'residual']].to_string(index=False))
print(f"\nResidual std: {df['residual'].std():.5f}")
print(f"This means LB predictions have uncertainty of ~{df['residual'].std():.5f}")

Residual Analysis:
    exp     cv     lb  predicted_lb  residual
exp_000 0.8067 0.7971      0.796833  0.000267
exp_003 0.8195 0.8045      0.803758  0.000742
exp_004 0.8193 0.8041      0.803650  0.000450
exp_006 0.8171 0.8010      0.802459 -0.001459

Residual std: 0.00099
This means LB predictions have uncertainty of ~0.00099
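
The residual standard deviation above divides by n - 1, while a line with two fitted parameters leaves only n - 2 degrees of freedom. As a rough cross-check, the sketch below computes a textbook 95% prediction interval for a simple linear regression at the exp_011 CV. It is an assumption-laden illustration, not the original method: it assumes independent, normally distributed residuals, and the critical value `T_CRIT_95_DF2` is the standard table value for Student's t with 2 degrees of freedom, hard-coded to avoid a scipy dependency.

```python
import math

# CV/LB pairs copied from the submission history above.
cv = [0.8067, 0.8195, 0.8193, 0.8171]
lb = [0.7971, 0.8045, 0.8041, 0.8010]

n = len(cv)
x_mean = sum(cv) / n
y_mean = sum(lb) / n
sxx = sum((x - x_mean) ** 2 for x in cv)
sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(cv, lb))
slope = sxy / sxx
intercept = y_mean - slope * x_mean

# Residual standard error with n - 2 degrees of freedom (two fitted parameters).
residuals = [y - (slope * x + intercept) for x, y in zip(cv, lb)]
resid_se = math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))

# Two-sided 95% critical value of Student's t with 2 degrees of freedom.
T_CRIT_95_DF2 = 4.303

x0 = 0.82032  # exp_011 CV
pred = slope * x0 + intercept
# Prediction interval for a single new observation at x0.
half_width = T_CRIT_95_DF2 * resid_se * math.sqrt(
    1 + 1 / n + (x0 - x_mean) ** 2 / sxx
)

print(f"Residual standard error (ddof=2): {resid_se:.5f}")
print(f"Predicted LB: {pred:.5f}")
print(f"95% prediction interval: {pred - half_width:.5f} to {pred + half_width:.5f}")
```

With so few points the t-based interval is much wider than a plus-or-minus two standard deviation band, which is one more reason to treat the predicted LB as a coarse estimate.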


In [5]:
# Strategic Decision Analysis
print("="*60)
print("STRATEGIC DECISION ANALYSIS")
print("="*60)

print("\n1. exp_011 is the BEST CV ever (0.82032)")
print("   - Exceeds exp_003's 0.81951 by +0.00081")
print("   - Regularization IMPROVED CV (not decreased)")

print("\n2. Predicted LB for exp_011: {:.5f}".format(predicted_lb))
print("   - Slightly below best LB (0.8045)")
print("   - BUT: Regularization might help generalization")

print("\n3. Submissions remaining: 6")
print("   - Can afford to test exp_011")
print("   - Need LB feedback to calibrate")

print("\n4. Key uncertainty:")
print("   - CV-LB model has residual std of {:.5f}".format(df['residual'].std()))
print("   - exp_011 could be anywhere from {:.5f} to {:.5f}".format(
    predicted_lb - 2*df['residual'].std(), 
    predicted_lb + 2*df['residual'].std()))

print("\n5. RECOMMENDATION: SUBMIT exp_011")
print("   - Best CV ever")
print("   - Regularization might help generalization")
print("   - Need LB feedback to calibrate")
print("   - 6 submissions remaining")

STRATEGIC DECISION ANALYSIS

1. exp_011 is the BEST CV ever (0.82032)
   - Exceeds exp_003's 0.81951 by +0.00081
   - Regularization IMPROVED CV (not decreased)

2. Predicted LB for exp_011: 0.80420
   - Slightly below best LB (0.8045)
   - BUT: Regularization might help generalization

3. Submissions remaining: 6
   - Can afford to test exp_011
   - Need LB feedback to calibrate

4. Key uncertainty:
   - CV-LB model has residual std of 0.00099
   - exp_011 could be anywhere from 0.80222 to 0.80619

5. RECOMMENDATION: SUBMIT exp_011
   - Best CV ever
   - Regularization might help generalization
   - Need LB feedback to calibrate
   - 6 submissions remaining
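
One way to put a number on the recommendation is to ask how likely exp_011 is to beat the current best LB under the fitted model. The sketch below is a back-of-the-envelope illustration only: it assumes the LB for exp_011 is normally distributed around the predicted value with the residual standard deviation reported above, which is a strong assumption for a four-point fit. The helper `normal_cdf` is hypothetical.

```python
import math

# Values taken from the printed outputs above.
predicted_lb = 0.80420
best_lb = 0.8045
residual_std = 0.00099


def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Probability that a normal draw centred on predicted_lb exceeds best_lb.
z = (best_lb - predicted_lb) / residual_std
p_beat = 1.0 - normal_cdf(z)
print(f"z-score of best LB relative to prediction: {z:.2f}")
print(f"Approximate P(exp_011 LB > {best_lb}): {p_beat:.2f}")
```

Under these assumptions the chance of a new best is somewhat below even odds, which is consistent with spending one of the six remaining submissions mainly for calibration.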


In [6]:
# What's the realistic ceiling?
print("\n" + "="*60)
print("REALISTIC CEILING ANALYSIS")
print("="*60)

print("\nTop LB on leaderboard: ~0.8066")
print("Our best LB: 0.8045")
print("Gap to top: 0.0021 (0.26%)")

print("\nTo reach 0.8066 LB, need CV of: {:.5f}".format(
    (0.8066 - model.intercept_) / model.coef_[0]))

print("\nThis is VERY high - would require significant breakthrough.")
print("More realistic goal: Beat our best LB (0.8045)")

print("\nTarget 0.9642 is IMPOSSIBLE.")
print("We should focus on incremental improvements toward 0.81 LB.")


REALISTIC CEILING ANALYSIS

Top LB on leaderboard: ~0.8066
Our best LB: 0.8045
Gap to top: 0.0021 (0.26%)

To reach 0.8066 LB, need CV of: 0.82475

This is VERY high - would require significant breakthrough.
More realistic goal: Beat our best LB (0.8045)

Target 0.9642 is IMPOSSIBLE.
We should focus on incremental improvements toward 0.81 LB.
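
The ceiling argument can be made concrete by inverting the fitted line for each target. The sketch below refits the line from the four submissions and reports the CV each LB target would require. It assumes the score is bounded above by 1.0 (as accuracy is), which the notebook does not state explicitly, and `required_cv` is a hypothetical helper. Extrapolating a four-point linear fit this far is itself unreliable, so the result is only a sanity check.

```python
# CV/LB pairs copied from the submission history above.
cv = [0.8067, 0.8195, 0.8193, 0.8171]
lb = [0.7971, 0.8045, 0.8041, 0.8010]

n = len(cv)
x_mean = sum(cv) / n
y_mean = sum(lb) / n
sxx = sum((x - x_mean) ** 2 for x in cv)
sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(cv, lb))
slope = sxy / sxx
intercept = y_mean - slope * x_mean


def required_cv(target_lb):
    """CV score the fitted line maps to target_lb (hypothetical helper)."""
    return (target_lb - intercept) / slope


# Assumption: the metric cannot exceed 1.0, so any required CV above that
# is unreachable under this model.
MAX_SCORE = 1.0

for target in (0.8045, 0.8066, 0.9642):
    needed = required_cv(target)
    status = "reachable in principle" if needed <= MAX_SCORE else "exceeds the metric's maximum"
    print(f"LB {target:.4f} -> required CV {needed:.5f} ({status})")
```

Under this model the 0.9642 target maps to a CV above the maximum possible score, which supports the conclusion that it is out of reach with the current approach.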
