# Loop 1 Analysis: Target Score Assessment & Strategy Refinement

## Key Finding: Target Score of 0.9642 is Unrealistic

Based on web research, the highest reported leaderboard (LB) scores for Spaceship Titanic are ~0.8066 (80.7% accuracy), achieved by solutions in the top 7%. The target of 0.9642 is far above that, so it is most likely one of the following:
1. An error in the target specification
2. A score measured with a different metric
3. The result of data leakage or an undocumented technique

## Current Status
- Baseline XGBoost CV: 0.80674 (+/- 0.00469)
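- This is already on par with the ~0.8066 LB scores reported for top 7% solutions, though a CV score is not directly comparable to an LB score until we submit.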

## Strategy Implications
Since our CV score already matches the best reported leaderboard scores, we should:
1. Focus on incremental improvements (0.5-1% gains)
2. Implement ensemble of XGBoost + LightGBM + CatBoost
3. Add interaction features
4. Improve imputation with group-based and KNN methods
5. Submit to get LB feedback and calibrate CV-LB gap
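
The ensemble in item 2 could be as simple as averaging each model's predicted probability of `Transported=True` (soft voting). The following is a minimal sketch under that assumption, not a tuned implementation: `soft_vote` is a hypothetical helper, and the three probability arrays are made-up stand-ins for the outputs of fitted XGBoost, LightGBM and CatBoost models.

```python
import numpy as np

def soft_vote(prob_list, weights=None, threshold=0.5):
    """Average per-model P(Transported=True) and threshold the result.

    Hypothetical helper: prob_list holds one 1-D array per model,
    all aligned on the same rows.
    """
    probs = np.vstack(prob_list)                      # shape: (n_models, n_rows)
    avg = np.average(probs, axis=0, weights=weights)  # equal weights if None
    return avg, avg >= threshold

# Made-up probabilities standing in for the three models' predictions
xgb_p = np.array([0.90, 0.40, 0.20])
lgb_p = np.array([0.80, 0.55, 0.10])
cat_p = np.array([0.70, 0.60, 0.30])

avg, labels = soft_vote([xgb_p, lgb_p, cat_p])
print(avg.round(3))
print(labels)
```

Equal weights are a reasonable starting point; weights tuned on out-of-fold predictions are an option once the CV-LB gap is known.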

In [1]:
import pandas as pd
import numpy as np

# Load data to analyze current state
train = pd.read_csv('/home/data/train.csv')
test = pd.read_csv('/home/data/test.csv')

print(f"Train shape: {train.shape}")
print(f"Test shape: {test.shape}")
print(f"\nTarget distribution:")
print(train['Transported'].value_counts(normalize=True))

Train shape: (8693, 14)
Test shape: (4277, 13)

Target distribution:
Transported
True     0.503624
False    0.496376
Name: proportion, dtype: float64


In [2]:
# Analyze the feature importance issue from baseline
# AnySpending had 0.82 importance - let's understand why

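# Calculate spending patterns
spending_cols = ['RoomService', 'FoodCourt', 'ShoppingMall', 'Spa', 'VRDeck']
# sum(axis=1) skips NaN, so a missing amount counts as 0 in TotalSpent
train['TotalSpent'] = train[spending_cols].sum(axis=1)
# 1 if the passenger has any recorded spending, else 0
train['AnySpending'] = (train['TotalSpent'] > 0).astype(int)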

# Transported rate by spending
print("Transported rate by spending status:")
print(train.groupby('AnySpending')['Transported'].mean())
print(f"\nNon-spenders: {(train['AnySpending']==0).sum()} ({(train['AnySpending']==0).mean()*100:.1f}%)")
print(f"Spenders: {(train['AnySpending']==1).sum()} ({(train['AnySpending']==1).mean()*100:.1f}%)")

Transported rate by spending status:
AnySpending
0    0.786477
1    0.298611
Name: Transported, dtype: float64

Non-spenders: 3653 (42.0%)
Spenders: 5040 (58.0%)
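
The binary `AnySpending` flag separates the classes well but discards how much was spent and where. One possible set of finer-grained spending features is sketched below. The feature names (`LogTotalSpent`, `NumAmenitiesUsed`, `NumSpendingMissing`) and the helper `add_spending_features` are illustrative assumptions, and the toy frame is made-up data, not rows from `train.csv`.

```python
import numpy as np
import pandas as pd

spending_cols = ['RoomService', 'FoodCourt', 'ShoppingMall', 'Spa', 'VRDeck']

def add_spending_features(df):
    """Hypothetical helper: add continuous and count-based spending features."""
    out = df.copy()
    total = out[spending_cols].sum(axis=1)           # NaN is skipped, i.e. counted as 0
    out['LogTotalSpent'] = np.log1p(total)           # compresses the long right tail
    out['NumAmenitiesUsed'] = (out[spending_cols] > 0).sum(axis=1)
    out['NumSpendingMissing'] = out[spending_cols].isna().sum(axis=1)
    return out

# Made-up rows: a non-spender, a spender, and a row with missing amounts
toy = pd.DataFrame({
    'RoomService':  [0.0, 100.0, np.nan],
    'FoodCourt':    [0.0, 0.0, 50.0],
    'ShoppingMall': [0.0, 25.0, 0.0],
    'Spa':          [0.0, 0.0, 0.0],
    'VRDeck':       [0.0, 0.0, np.nan],
})
feats = add_spending_features(toy)
print(feats[['LogTotalSpent', 'NumAmenitiesUsed', 'NumSpendingMissing']])
```

Whether these features improve on `AnySpending` would need to be checked with cross-validation.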


In [3]:
# CryoSleep analysis - this is the strongest predictor
print("\nCryoSleep analysis:")
print(train.groupby('CryoSleep')['Transported'].mean())

# CryoSleep and spending relationship
# (TotalSpent counts missing amounts as 0, so this checks recorded spending only)
print("\nCryoSleep passengers with spending > 0:")
cryo_spending = train.loc[train['CryoSleep'] == True, 'TotalSpent']
print(f"  Count with spending > 0: {(cryo_spending > 0).sum()}")
print(f"  This suggests CryoSleep=True should have 0 spending (domain knowledge)")


CryoSleep analysis:
CryoSleep
False    0.328921
True     0.817583
Name: Transported, dtype: float64

CryoSleep passengers with spending > 0:
  Count with spending > 0: 0
  This suggests CryoSleep=True should have 0 spending (domain knowledge)
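
If no CryoSleep passenger has recorded spending, one imputation rule that follows is to fill missing spending amounts with 0 for those passengers only. The sketch below assumes that rule. `fill_cryo_spending` is a hypothetical helper, and the toy frame is made-up data. Rows where `CryoSleep` is itself missing are left untouched.

```python
import numpy as np
import pandas as pd

spending_cols = ['RoomService', 'FoodCourt', 'ShoppingMall', 'Spa', 'VRDeck']

def fill_cryo_spending(df):
    """Hypothetical helper: set missing spending to 0 where CryoSleep is True."""
    out = df.copy()
    in_cryo = out['CryoSleep'].eq(True)   # missing CryoSleep compares as False
    out.loc[in_cryo, spending_cols] = out.loc[in_cryo, spending_cols].fillna(0)
    return out

# Made-up rows: in CryoSleep, awake, and CryoSleep unknown
toy = pd.DataFrame({
    'CryoSleep':    [True, False, None],
    'RoomService':  [np.nan, np.nan, np.nan],
    'FoodCourt':    [0.0, 10.0, np.nan],
    'ShoppingMall': [0.0, 0.0, 0.0],
    'Spa':          [np.nan, 5.0, 0.0],
    'VRDeck':       [0.0, 0.0, 0.0],
})
filled = fill_cryo_spending(toy)
print(filled)
```

The reverse inference (all amounts recorded as 0 implies CryoSleep) is weaker, since awake passengers can also spend nothing, so it is not applied here.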


In [4]:
# Analyze potential for improvement through interaction features
print("\nInteraction analysis: CryoSleep × HomePlanet")
interaction = train.groupby(['CryoSleep', 'HomePlanet'])['Transported'].agg(['mean', 'count'])
print(interaction)

print("\nInteraction analysis: Deck × Side")
# Parse cabin
# Cabin has the form Deck/Num/Side; missing cabins stay NaN
cabin_parts = train['Cabin'].str.split('/')
train['Deck'] = cabin_parts.str[0]
train['Side'] = cabin_parts.str[2]
interaction2 = train.groupby(['Deck', 'Side'])['Transported'].agg(['mean', 'count'])
print(interaction2)


Interaction analysis: CryoSleep × HomePlanet
                          mean  count
CryoSleep HomePlanet                 
False     Earth       0.320992   3106
          Europa      0.400172   1162
          Mars        0.276982   1047
True      Earth       0.656295   1382
          Europa      0.989023    911
          Mars        0.911809    669

Interaction analysis: Deck × Side
               mean  count
Deck Side                 
A    P     0.435897    117
     S     0.546763    139
B    P     0.674221    353
     S     0.784038    426
C    P     0.580645    341
     S     0.763547    406
D    P     0.403226    248
     S     0.465217    230
E    P     0.342657    429
     S     0.371365    447
F    P     0.410987   1438
     S     0.470501   1356
G    P     0.448276   1276
     S     0.583788   1283
T    P     0.250000      4
     S     0.000000      1
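
The two groupings above can be turned into explicit categorical features by concatenating the component values. The sketch below shows one way to do that while keeping the result missing whenever either component is missing. The helper `add_interactions`, the column names `Cryo_Planet` and `Deck_Side`, and the toy rows are assumptions for illustration.

```python
import pandas as pd

def add_interactions(df):
    """Hypothetical helper: add CryoSleep x HomePlanet and Deck x Side features."""
    out = df.copy()

    # Cabin has the form Deck/Num/Side; missing cabins yield missing parts
    parts = out['Cabin'].str.split('/')
    out['Deck'] = parts.str[0]
    out['Side'] = parts.str[2]
    out['Deck_Side'] = out['Deck'] + '_' + out['Side']   # missing propagates

    # Mask rows where either component is missing instead of emitting 'nan_...'
    known = out['CryoSleep'].notna() & out['HomePlanet'].notna()
    combo = out['CryoSleep'].astype(str) + '_' + out['HomePlanet'].astype(str)
    out['Cryo_Planet'] = combo.where(known)
    return out

# Made-up rows, including one with a missing cabin and missing CryoSleep
toy = pd.DataFrame({
    'Cabin':      ['B/0/P', 'F/12/S', None],
    'CryoSleep':  [True, False, None],
    'HomePlanet': ['Europa', 'Earth', 'Mars'],
})
feats = add_interactions(toy)
print(feats[['Deck_Side', 'Cryo_Planet']])
```

Tree-based models can learn such interactions on their own, so the gain from explicit features is uncertain. Sparse cells such as Deck T (5 rows in total) may need to be merged with another level or dropped.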


In [5]:
# Group-based analysis for imputation improvement
# PassengerId has the form gggg_pp; the part before '_' is the travel group
train['Group'] = train['PassengerId'].str.split('_').str[0]
group_sizes = train['Group'].value_counts()

print("\nGroup size distribution:")
print(group_sizes.value_counts().sort_index())

# Check if group members share characteristics
print("\nHomePlanet consistency within groups (for groups with 2+ members):")
multi_member_groups = group_sizes[group_sizes > 1].index
group_consistency = []
for g in list(multi_member_groups)[:100]:  # First 100 groups in value_counts order (largest first), not a random sample
    group_data = train[train['Group'] == g]['HomePlanet'].dropna()
    if len(group_data) > 1:
        group_consistency.append(group_data.nunique() == 1)

print(f"  Groups with consistent HomePlanet: {sum(group_consistency)}/{len(group_consistency)} ({sum(group_consistency)/len(group_consistency)*100:.1f}%)")


Group size distribution:
count
1    4805
2     841
3     340
4     103
5      53
6      29
7      33
8      13
Name: count, dtype: int64

HomePlanet consistency within groups (for groups with 2+ members):
  Groups with consistent HomePlanet: 100/100 (100.0%)
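
If HomePlanet is consistent within a group, a missing value can be filled from the other members of the same group. The sketch below assumes that rule and uses the most frequent known value per group. `impute_homeplanet_by_group` is a hypothetical helper and the toy rows are made-up. The check above covers only 100 groups, so the consistency assumption should be confirmed on all multi-member groups before relying on it.

```python
import pandas as pd

def impute_homeplanet_by_group(df):
    """Hypothetical helper: fill missing HomePlanet from the passenger's group."""
    out = df.copy()
    out['Group'] = out['PassengerId'].str.split('_').str[0]

    # Most frequent known HomePlanet per group; groups with no known
    # value are absent from this Series and therefore map to NaN
    group_mode = (
        out.dropna(subset=['HomePlanet'])
           .groupby('Group')['HomePlanet']
           .agg(lambda s: s.mode().iloc[0])
    )
    out['HomePlanet'] = out['HomePlanet'].fillna(out['Group'].map(group_mode))
    return out

# Made-up rows: group 0002 has one known and one missing value,
# group 0003 is a solo traveller with no known value
toy = pd.DataFrame({
    'PassengerId': ['0001_01', '0002_01', '0002_02', '0003_01'],
    'HomePlanet':  ['Earth', 'Europa', None, None],
})
result = impute_homeplanet_by_group(toy)
print(result)
```

This rule cannot help the 4805 single-member groups, which is where a fallback such as KNN or a global mode would come in.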


In [6]:
# Summary of key findings for strategy
print("="*60)
print("KEY FINDINGS FOR STRATEGY")
print("="*60)
print("\n1. TARGET SCORE ASSESSMENT:")
print("   - Target of 0.9642 is unrealistic (top LB scores are ~0.8066)")
print("   - Our baseline of 0.80674 is already competitive!")
print("\n2. FEATURE IMPORTANCE ISSUE:")
print("   - AnySpending dominates (0.82 importance)")
print("   - Need more nuanced spending features and interactions")
print("\n3. HIGH-LEVERAGE IMPROVEMENTS:")
print("   a) Ensemble: XGBoost + LightGBM + CatBoost")
print("   b) Interaction features: CryoSleep×HomePlanet, Deck×Side")
print("   c) Group-based imputation (groups share characteristics)")
print("   d) Remove/modify AnySpending to let model learn nuances")
print("\n4. NEXT STEPS:")
print("   - Submit baseline to get LB feedback")
print("   - Implement 3-model ensemble")
print("   - Add interaction features")

KEY FINDINGS FOR STRATEGY

1. TARGET SCORE ASSESSMENT:
   - Target of 0.9642 is unrealistic (top LB scores are ~0.8066)
   - Our baseline of 0.80674 is already competitive!

2. FEATURE IMPORTANCE ISSUE:
   - AnySpending dominates (0.82 importance)
   - Need more nuanced spending features and interactions

3. HIGH-LEVERAGE IMPROVEMENTS:
   a) Ensemble: XGBoost + LightGBM + CatBoost
   b) Interaction features: CryoSleep×HomePlanet, Deck×Side
   c) Group-based imputation (groups share characteristics)
   d) Remove/modify AnySpending to let model learn nuances

4. NEXT STEPS:
   - Submit baseline to get LB feedback
   - Implement 3-model ensemble
   - Add interaction features
