# Evolver Loop 4 Analysis: Preparing Next Experiment

This notebook reviews the current state and prepares recommendations for the next experiment.

In [None]:
import pandas as pd
import numpy as np
import json

# Load session state to understand current status
with open('/home/code/session_state.json', 'r') as f:
    session_state = json.load(f)

print("Current Session State:")
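# Report the highest score across all experiments, not just the first one
# (assumes a higher CV score is better, e.g. accuracy)
best_score = max(exp['score'] for exp in session_state['experiments'])
print(f"- Best CV score: {best_score}")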
print(f"- Experiments completed: {len(session_state['experiments'])}")
print(f"- Submissions made: {len(session_state['submissions'])}")
print(f"- Remaining submissions: {session_state['remaining_submissions']}")
print()

# Check what features have been validated
print("Validated Features from Analysis:")
for finding in session_state['data_findings']:
    text = finding['finding']
    if 'validation' in text.lower() or 'improve' in text.lower():
        # Truncate long findings; append an ellipsis only when text was cut
        print(f"- {text[:80]}{'...' if len(text) > 80 else ''}")

print()
# Summary figures below are hard-coded, not recomputed in this notebook
print("Key Validated Features:")
print("- TicketFreq: +24.7pp survival difference")
print("- CabinSide: +15.0pp survival difference")
print("- NameLength: r=0.332 correlation")
print("- FareBin5: Clear survival gradient")
print("- Interaction features (Pclass_Sex, AgeGroup_Sex, FareBin5_Sex): +0.0068 CV improvement")

## Current Status Summary

Based on the analysis notebooks and session state:

1. **Baseline Experiment (exp_000)**: 0.817 CV with basic features
2. **Analysis Validated Features**: Expected to raise CV to ~0.8305 (about +0.0135 over the 0.817 baseline)
3. **No LB submissions yet**: Nothing has been submitted to the leaderboard (LB), so the CV-LB gap is unknown
4. **Evaluator Recommendation**: Hyperparameter tuning + XGBoost

## Next Steps Priority

1. Run experiment with validated features (expected CV: ~0.8305)
2. Submit the result to the leaderboard to measure the CV-LB gap
3. Only then proceed with hyperparameter tuning and XGBoost
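
The validated features named in this notebook are not defined here, so the following is only a minimal sketch of how they might be derived. Every definition is an assumption, not the analysis notebooks' actual code: `TicketFreq` is taken to be the number of passengers sharing a ticket, `NameLength` the character length of `Name`, `FareBin5` a quintile bin of `Fare`, `CabinSide` the parity of the first cabin number, and `Pclass_Sex` a simple string interaction. The helper name `add_validated_features` and the tiny `demo` frame are hypothetical and exist only for illustration.

```python
import pandas as pd


def add_validated_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with assumed versions of the validated features."""
    out = df.copy()

    # TicketFreq (assumed definition): how many rows share the same ticket
    out["TicketFreq"] = out.groupby("Ticket")["Ticket"].transform("count")

    # NameLength (assumed definition): character length of the full name
    out["NameLength"] = out["Name"].str.len()

    # FareBin5 (assumed definition): quintile index of the fare, 0 = cheapest;
    # duplicates="drop" avoids an error when tied fares collapse bin edges
    out["FareBin5"] = pd.qcut(out["Fare"], 5, labels=False, duplicates="drop")

    # CabinSide (assumed definition): parity of the first cabin number;
    # passengers with no cabin number are labelled "unknown"
    cabin_num = pd.to_numeric(
        out["Cabin"].str.extract(r"(\d+)", expand=False), errors="coerce"
    )
    out["CabinSide"] = (cabin_num % 2).map({1: "odd", 0: "even"}).fillna("unknown")

    # Pclass_Sex (assumed definition): string interaction of class and sex
    out["Pclass_Sex"] = out["Pclass"].astype(str) + "_" + out["Sex"]

    return out


# Hypothetical five-row frame using the usual Titanic column names
demo = pd.DataFrame({
    "Name": ["Braund, Mr. Owen", "Cumings, Mrs. John", "Heikkinen, Miss. Laina",
             "Futrelle, Mrs. Jacques", "Allen, Mr. William"],
    "Sex": ["male", "female", "female", "female", "male"],
    "Pclass": [3, 1, 3, 1, 3],
    "Ticket": ["A1", "PC1", "A1", "113803", "373450"],
    "Fare": [7.25, 71.28, 7.92, 53.10, 8.05],
    "Cabin": [None, "C85", None, "C123", "E46"],
})

features = add_validated_features(demo)
print(features[["TicketFreq", "NameLength", "FareBin5", "CabinSide", "Pclass_Sex"]])
```

In a real experiment, the bin edges for `FareBin5` and the ticket counts would probably need to be computed consistently across train and test (for example, on the combined frame); this sketch ignores that detail.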

In [None]:
# Verify data is accessible
import os
train_path = '/home/data/train.csv'
test_path = '/home/data/test.csv'

print("Data files check:")
print(f"- Train exists: {os.path.exists(train_path)}")
print(f"- Test exists: {os.path.exists(test_path)}")

if os.path.exists(train_path):
    train = pd.read_csv(train_path)
    print(f"- Train shape: {train.shape}")
    print(f"- Train columns: {list(train.columns)}")