# Loss Function Comparison Experiment

## Critical Issues Identified

From debugging ([09_debug_training.ipynb](09_debug_training.ipynb)):

**🔴 Issue #1: Output Scale Wrong**
- Model predictions: **-0.5 to +0.5** (range: 1)
- Actual targets: **0 to 200g** (range: 200)
- **200× scale mismatch!**

**🔴 Issue #2: Cannot Overfit Single Batch**
- After 100 training steps on 1 batch: **R² = -2.74**
- Expected: **R² > 0.9** (model should memorize)
- **Architecture is broken!**
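
Issues #1 and #2 may be related. R² compares the squared error of the predictions with the squared error of always predicting the target mean, so outputs stuck near zero can score well below 0 against gram-scale targets even before anything else goes wrong. The sketch below uses plain Python and made-up target values (not taken from the dataset); `r2` is a hypothetical helper that follows the standard coefficient-of-determination formula.

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical gram-scale targets, for illustration only.
targets = [10.0, 50.0, 120.0, 200.0]

# Predictions confined to roughly -0.5..+0.5, as described in Issue #1.
near_zero_preds = [0.3, -0.2, 0.1, 0.4]

# Always predicting the target mean is the R² = 0 reference point.
mean_preds = [sum(targets) / len(targets)] * len(targets)

print(f"near-zero predictions: R² = {r2(targets, near_zero_preds):+.3f}")
print(f"mean predictions:      R² = {r2(targets, mean_preds):+.3f}")
```

A strongly negative R² is therefore consistent with a pure scale problem and does not by itself show which component is at fault.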

**🔴 Issue #3: Competition Weights in Loss**
- Dry_Total_g: 50% weight + largest values = **dominates training**
- Other targets get ignored
- **Competition weights are for EVALUATION, not TRAINING!**

## Hypothesis

**The problem: Using competition weights during training causes:**
1. Model focuses only on Dry_Total_g (50% weight)
2. Output scale mismatch (predicting ~0 instead of 0-200g)
3. Unequal learning (large targets dominate gradients)
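
To make the gradient argument concrete, here is a minimal sketch in plain Python. It assumes a single sample, predictions near zero, and made-up target values in grams (hypothetical, not dataset statistics). For a loss of the form `mean(w_i * (p_i - t_i)^2)` over the 5 targets, the gradient with respect to output `i` is `2 * w_i * (p_i - t_i) / 5`, so its size grows with both the weight and the raw target magnitude.

```python
target_cols = ['Dry_Green_g', 'Dry_Dead_g', 'Dry_Clover_g', 'GDM_g', 'Dry_Total_g']
weights = [0.1, 0.1, 0.1, 0.2, 0.5]

# Hypothetical targets in grams and near-zero predictions (illustration only).
targets = [30.0, 10.0, 5.0, 35.0, 45.0]
preds = [0.0, 0.0, 0.0, 0.0, 0.0]

def weighted_mse_grads(preds, targets, weights):
    """Gradient of mean(w_i * (p_i - t_i)**2) with respect to each p_i."""
    k = len(targets)
    return [2.0 * w * (p - t) / k for p, t, w in zip(preds, targets, weights)]

grads = weighted_mse_grads(preds, targets, weights)
magnitudes = [abs(g) for g in grads]
total_share = magnitudes[-1] / sum(magnitudes)

for col, m in zip(target_cols, magnitudes):
    print(f"{col:15s}: |grad| = {m:.2f}")
print(f"Share of gradient magnitude from Dry_Total_g: {total_share:.0%}")
```

With these illustrative numbers, Dry_Total_g accounts for well over half of the total gradient magnitude, which is the imbalance the hypothesis describes.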

**The solution: Normalize targets + use plain MSE during training**
- All targets on same scale (mean=0, std=1)
- Equal gradient contribution
- Model outputs reasonable range
- Evaluate with competition weights (where they belong!)
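
The normalize/denormalize round trip can be sketched as follows. This is a minimal plain-Python illustration with made-up values for one target column (hypothetical, not from the dataset); `fit_stats`, `normalize`, and `denormalize` are illustrative helper names. It uses the sample standard deviation, which matches the default of pandas `.std()`.

```python
from statistics import mean, stdev

def fit_stats(values):
    """Mean and sample standard deviation, computed on training data only."""
    return mean(values), stdev(values)

def normalize(values, mu, sigma):
    return [(v - mu) / sigma for v in values]

def denormalize(z_values, mu, sigma):
    return [z * sigma + mu for z in z_values]

# Hypothetical training targets in grams (illustration only).
train_values = [12.0, 48.0, 95.0, 150.0, 200.0]

mu, sigma = fit_stats(train_values)
z = normalize(train_values, mu, sigma)
restored = denormalize(z, mu, sigma)

print(f"mean={mu:.2f}g, std={sigma:.2f}g")
print("normalized:", [round(v, 3) for v in z])
print("restored:  ", [round(v, 3) for v in restored])
```

The same `mu` and `sigma` from the training split should be reused for validation and test data, so that predictions are mapped back to grams consistently before scoring.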

## This Experiment

We'll train **3 models** with different loss functions:

### **Approach A: Normalized + Plain MSE** (RECOMMENDED)
- Normalize targets to mean=0, std=1
- Train with unweighted MSE
- Denormalize predictions for evaluation

### **Approach B: Plain MSE + Output Scaling**
- No target normalization
- Add ReLU to ensure positive outputs
- Train with unweighted MSE

### **Approach C: Competition Weighted** (CURRENT)
- Current approach (for comparison)
- Weighted MSE with [0.1, 0.1, 0.1, 0.2, 0.5]

**Each model:**
1. First: Test if it can overfit a single batch (sanity check)
2. Then: Train for 10 epochs
3. Compare: R² scores, loss curves, predictions

**Expected winner: Approach A**

---
## Setup

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from pathlib import Path
import warnings
warnings.filterwarnings('ignore')

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader
import torchvision.transforms as transforms
import torchvision.models as models
from PIL import Image

from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score, mean_absolute_error
from tqdm.auto import tqdm
import copy

sns.set_style('whitegrid')
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")

np.random.seed(42)
torch.manual_seed(42)
if torch.cuda.is_available():
    torch.cuda.manual_seed(42)

print("✓ Imports complete")

In [None]:
# Load data
train_enriched = pd.read_csv('competition/train_enriched.csv')
train_enriched['Sampling_Date'] = pd.to_datetime(train_enriched['Sampling_Date'])
train_enriched['full_image_path'] = train_enriched['image_path'].apply(lambda x: f'competition/{x}')

target_cols = ['Dry_Green_g', 'Dry_Dead_g', 'Dry_Clover_g', 'GDM_g', 'Dry_Total_g']
competition_weights = torch.tensor([0.1, 0.1, 0.1, 0.2, 0.5])

train_data, val_data = train_test_split(train_enriched, test_size=0.2, random_state=42)

print(f"Data loaded: {len(train_data)} train, {len(val_data)} val")
print(f"Targets: {target_cols}")
print(f"Competition weights: {competition_weights.tolist()}")

In [None]:
# Calculate target statistics for normalization
target_means = torch.tensor([train_data[col].mean() for col in target_cols], dtype=torch.float32)
target_stds = torch.tensor([train_data[col].std() for col in target_cols], dtype=torch.float32)

print("\nTarget statistics:")
for i, col in enumerate(target_cols):
    print(f"  {col:15s}: mean={target_means[i]:.2f}g, std={target_stds[i]:.2f}g")

# Save for later use
torch.save({'means': target_means, 'stds': target_stds}, 'target_normalization.pth')
print("\n✓ Target normalization stats saved")

---
## Define 2 Dataset Classes (Normalized vs Unnormalized)

In [None]:
class NormalizedDataset(Dataset):
    """Dataset with NORMALIZED targets (Approach A)."""
    def __init__(self, dataframe, target_means, target_stds, augment=False):
        self.df = dataframe.reset_index(drop=True)
        self.target_means = target_means
        self.target_stds = target_stds
        
        if augment:
            self.transform = transforms.Compose([
                transforms.Resize((224, 224)),
                transforms.RandomHorizontalFlip(),
                transforms.RandomVerticalFlip(),
                transforms.RandomRotation(10),
                transforms.ToTensor(),
                transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
            ])
        else:
            self.transform = transforms.Compose([
                transforms.Resize((224, 224)),
                transforms.ToTensor(),
                transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
            ])
    
    def __len__(self):
        return len(self.df)
    
    def __getitem__(self, idx):
        row = self.df.iloc[idx]
        img = Image.open(row['full_image_path']).convert('RGB')
        img = self.transform(img)
        
        # Normalize targets
        targets = torch.tensor(row[target_cols].values.astype('float32'), dtype=torch.float32)
        targets_normalized = (targets - self.target_means) / self.target_stds
        
        return {
            'image': img, 
            'targets': targets_normalized,  # Normalized targets
            'targets_original': targets      # Keep original for evaluation
        }

class UnnormalizedDataset(Dataset):
    """Dataset with UNNORMALIZED targets (Approaches B and C)."""
    def __init__(self, dataframe, augment=False):
        self.df = dataframe.reset_index(drop=True)
        
        if augment:
            self.transform = transforms.Compose([
                transforms.Resize((224, 224)),
                transforms.RandomHorizontalFlip(),
                transforms.RandomVerticalFlip(),
                transforms.RandomRotation(10),
                transforms.ToTensor(),
                transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
            ])
        else:
            self.transform = transforms.Compose([
                transforms.Resize((224, 224)),
                transforms.ToTensor(),
                transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
            ])
    
    def __len__(self):
        return len(self.df)
    
    def __getitem__(self, idx):
        row = self.df.iloc[idx]
        img = Image.open(row['full_image_path']).convert('RGB')
        img = self.transform(img)
        targets = torch.tensor(row[target_cols].values.astype('float32'), dtype=torch.float32)
        
        return {'image': img, 'targets': targets}

print("✓ Dataset classes defined")

---
## Define 3 Model Architectures

In [None]:
class ModelA_Normalized(nn.Module):
    """Approach A: For normalized targets (outputs can be negative)."""
    def __init__(self, num_outputs=5):
        super().__init__()
        self.resnet = models.resnet18(pretrained=True)
        num_features = self.resnet.fc.in_features
        
        # Simple FC head - NO activation at end (can output negative)
        self.resnet.fc = nn.Sequential(
            nn.Linear(num_features, 256),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(256, num_outputs)  # No final activation!
        )
    
    def forward(self, x):
        return self.resnet(x)

class ModelB_PlainMSE(nn.Module):
    """Approach B: For unnormalized targets with ReLU (ensure positive)."""
    def __init__(self, num_outputs=5):
        super().__init__()
        self.resnet = models.resnet18(pretrained=True)
        num_features = self.resnet.fc.in_features
        
        # FC head with ReLU at end (ensure positive outputs)
        self.resnet.fc = nn.Sequential(
            nn.Linear(num_features, 256),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(256, num_outputs),
            nn.ReLU()  # Ensure positive (biomass can't be negative!)
        )
    
    def forward(self, x):
        return self.resnet(x)

class ModelC_Weighted(nn.Module):
    """Approach C: Current approach (for comparison)."""
    def __init__(self, num_outputs=5):
        super().__init__()
        self.resnet = models.resnet18(pretrained=True)
        num_features = self.resnet.fc.in_features
        
        # Same as Model B
        self.resnet.fc = nn.Sequential(
            nn.Linear(num_features, 256),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(256, num_outputs),
            nn.ReLU()
        )
    
    def forward(self, x):
        return self.resnet(x)

print("✓ Model architectures defined")
print("\n  Model A: For normalized targets (no final activation)")
print("  Model B: For unnormalized targets (ReLU at end)")
print("  Model C: Same as B (difference is loss function)")

---
## Define 2 Loss Functions (Plain vs Competition-Weighted MSE)

In [None]:
class PlainMSELoss(nn.Module):
    """Approach A & B: Plain unweighted MSE."""
    def __init__(self):
        super().__init__()
    
    def forward(self, pred, target):
        return F.mse_loss(pred, target)

class CompetitionWeightedLoss(nn.Module):
    """Approach C: Competition-weighted MSE (current approach)."""
    def __init__(self):
        super().__init__()
        self.weights = competition_weights.to(device)
    
    def forward(self, pred, target):
        mse = F.mse_loss(pred, target, reduction='none')
        weighted_mse = (mse * self.weights).mean()
        return weighted_mse

print("✓ Loss functions defined")
print("\n  PlainMSELoss: Simple MSE (for A & B)")
print("  CompetitionWeightedLoss: Weighted MSE with [0.1, 0.1, 0.1, 0.2, 0.5] (for C)")

---
## Evaluation Function (Always uses Competition Weights)
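
The competition score used below is a weighted sum of per-target R² values. As a quick worked example of that arithmetic, the sketch here combines hypothetical per-target R² values (invented for illustration, not results from this experiment) with the competition weights.

```python
target_cols = ['Dry_Green_g', 'Dry_Dead_g', 'Dry_Clover_g', 'GDM_g', 'Dry_Total_g']
competition_weights = [0.1, 0.1, 0.1, 0.2, 0.5]

# Hypothetical per-target R² values, for illustration only.
per_target_r2 = [0.5, 0.2, -0.1, 0.6, 0.7]

# Each target contributes weight * R² to the final score.
contributions = [w * r for w, r in zip(competition_weights, per_target_r2)]
competition_r2 = sum(contributions)

for col, c in zip(target_cols, contributions):
    print(f"{col:15s}: contributes {c:+.3f}")
print(f"Competition R²: {competition_r2:+.3f}")
```

Because Dry_Total_g carries half the weight, its R² has the largest influence on the final score, while a poor score on a 0.1-weight target moves the total only slightly.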

In [None]:
def calculate_competition_r2(predictions, targets):
    """
    Calculate competition R² score (weighted).
    
    Args:
        predictions: numpy array (N, 5)
        targets: numpy array (N, 5)
    
    Returns:
        competition_r2: float (weighted R²)
        per_target_r2: list of floats (R² per target)
    """
    per_target_r2 = []
    competition_r2 = 0
    
    weights = competition_weights.numpy()
    
    for i in range(5):
        r2 = r2_score(targets[:, i], predictions[:, i])
        per_target_r2.append(r2)
        competition_r2 += weights[i] * r2
    
    return competition_r2, per_target_r2

print("✓ Evaluation function defined")
print("  Always evaluates with competition weights [0.1, 0.1, 0.1, 0.2, 0.5]")

---
## Test 1: Can Each Model Overfit a Single Batch?

In [None]:
def test_overfitting(model, dataset, loss_fn, model_name, is_normalized=False, num_steps=100):
    """
    Test if model can overfit a single batch.
    
    Args:
        is_normalized: If True, denormalize predictions for R² calculation
    """
    print(f"\n{'='*80}")
    print(f"OVERFITTING TEST: {model_name}")
    print(f"{'='*80}")
    
    # Get one batch
    loader = DataLoader(dataset, batch_size=16, shuffle=False)
    batch = next(iter(loader))
    images = batch['image'].to(device)
    targets = batch['targets'].to(device)
    
    if is_normalized:
        targets_original = batch['targets_original'].to(device)
    else:
        targets_original = targets
    
    # Train for num_steps
    model = model.to(device)
    model.train()
    optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)  # Higher LR for overfitting
    
    losses = []
    r2_scores = []
    
    for step in range(num_steps):
        # Forward
        pred = model(images)
        loss = loss_fn(pred, targets)
        
        # Backward
        optimizer.zero_grad()
        loss.backward()
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=10.0)  # Gradient clipping
        optimizer.step()
        
        losses.append(loss.item())
        
        # Calculate R² every 20 steps
        if step % 20 == 0:
            model.eval()
            with torch.no_grad():
                pred_eval = model(images)
                
                # Denormalize if needed
                if is_normalized:
                    pred_denorm = pred_eval * target_stds.to(device) + target_means.to(device)
                else:
                    pred_denorm = pred_eval
                
                comp_r2, _ = calculate_competition_r2(
                    pred_denorm.cpu().numpy(), 
                    targets_original.cpu().numpy()
                )
                r2_scores.append(comp_r2)
                print(f"Step {step:3d}: Loss = {loss.item():.4f}, R² = {comp_r2:+.4f}")
            model.train()
    
    # Final evaluation
    model.eval()
    with torch.no_grad():
        pred_final = model(images)
        
        if is_normalized:
            pred_final_denorm = pred_final * target_stds.to(device) + target_means.to(device)
        else:
            pred_final_denorm = pred_final
        
        final_r2, per_target_r2 = calculate_competition_r2(
            pred_final_denorm.cpu().numpy(),
            targets_original.cpu().numpy()
        )
    
    print(f"\nFinal Results:")
    print(f"  Competition R²: {final_r2:+.4f}")
    print(f"\n  Per-target R²:")
    for i, col in enumerate(target_cols):
        print(f"    {col:15s}: {per_target_r2[i]:+.4f}")
    
    # Interpretation
    print(f"\n{'='*80}")
    if final_r2 > 0.9:
        print("✅ SUCCESS: Model CAN overfit a single batch!")
        print("   Architecture is working. Ready for full training.")
        success = True
    elif final_r2 > 0.5:
        print("⚠️  PARTIAL: Model learning but slowly.")
        print("   May work with more epochs or tuning.")
        success = True
    elif final_r2 > 0.0:
        print("⚠️  WEAK: Model barely learning.")
        print("   Will likely struggle in full training.")
        success = False
    else:
        print("❌ FAILURE: Model CANNOT learn even a single batch.")
        print("   Architecture or loss function is broken.")
        success = False
    print(f"{'='*80}")
    
    return success, final_r2, losses, r2_scores

print("✓ Overfitting test function defined")

In [None]:
# Create datasets
train_dataset_normalized = NormalizedDataset(train_data, target_means, target_stds, augment=False)
train_dataset_unnormalized = UnnormalizedDataset(train_data, augment=False)

# Test Approach A: Normalized + Plain MSE
model_a = ModelA_Normalized()
loss_a = PlainMSELoss()
success_a, r2_a, losses_a, r2_hist_a = test_overfitting(
    model_a, train_dataset_normalized, loss_a, 
    "Approach A: Normalized + Plain MSE",
    is_normalized=True
)

In [None]:
# Test Approach B: Plain MSE + Output Scaling
model_b = ModelB_PlainMSE()
loss_b = PlainMSELoss()
success_b, r2_b, losses_b, r2_hist_b = test_overfitting(
    model_b, train_dataset_unnormalized, loss_b,
    "Approach B: Plain MSE + Output Scaling (ReLU)",
    is_normalized=False
)

In [None]:
# Test Approach C: Competition Weighted
model_c = ModelC_Weighted()
loss_c = CompetitionWeightedLoss()
success_c, r2_c, losses_c, r2_hist_c = test_overfitting(
    model_c, train_dataset_unnormalized, loss_c,
    "Approach C: Competition Weighted MSE (Current)",
    is_normalized=False
)

In [None]:
# Compare overfitting results
print("\n" + "="*80)
print("OVERFITTING TEST SUMMARY")
print("="*80)

results_df = pd.DataFrame({
    'Approach': [
        'A: Normalized + Plain MSE',
        'B: Plain MSE + ReLU',
        'C: Competition Weighted'
    ],
    'Final R² (100 steps)': [r2_a, r2_b, r2_c],
    'Can Overfit?': [
        '✅ Yes' if success_a else '❌ No',
        '✅ Yes' if success_b else '❌ No',
        '✅ Yes' if success_c else '❌ No'
    ]
})

print("\n" + results_df.to_string(index=False))
print("\n" + "="*80)

# Determine if we should proceed with full training
if not (success_a or success_b or success_c):
    print("\n⚠️  WARNING: None of the approaches can overfit a single batch!")
    print("   This suggests a fundamental problem.")
    print("   Recommend investigating further before full training.")
elif success_a:
    print("\n✅ Approach A can overfit! This is the most promising approach.")
    print("   Proceeding with full training...")
else:
    print("\n✅ At least one approach can overfit.")
    print("   Proceeding with full training for successful approach(es)...")

---
## Full Training: 10 Epochs Each

Only train approaches that passed the overfitting test.

In [None]:
def train_model(model, train_loader, val_loader, loss_fn, num_epochs=10, 
                model_name='Model', is_normalized=False):
    """
    Train model for specified epochs.
    
    Args:
        is_normalized: If True, denormalize predictions for R² evaluation
    """
    print(f"\n{'='*80}")
    print(f"TRAINING: {model_name}")
    print(f"{'='*80}\n")
    
    model = model.to(device)
    optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=1e-4)
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='min', factor=0.5, patience=2)
    
    history = {
        'train_loss': [],
        'val_loss': [],
        'val_r2': [],
        'epoch': []
    }
    
    best_r2 = -float('inf')
    
    for epoch in range(num_epochs):
        # Training
        model.train()
        train_loss = 0
        
        for batch in train_loader:
            images = batch['image'].to(device)
            targets = batch['targets'].to(device)
            
            optimizer.zero_grad()
            outputs = model(images)
            loss = loss_fn(outputs, targets)
            loss.backward()
            torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=10.0)
            optimizer.step()
            
            train_loss += loss.item() * images.size(0)
        
        train_loss /= len(train_loader.dataset)
        
        # Validation
        model.eval()
        val_loss = 0
        all_preds = []
        all_targets = []
        
        with torch.no_grad():
            for batch in val_loader:
                images = batch['image'].to(device)
                targets = batch['targets'].to(device)
                
                if is_normalized:
                    targets_original = batch['targets_original'].to(device)
                else:
                    targets_original = targets
                
                outputs = model(images)
                loss = loss_fn(outputs, targets)
                val_loss += loss.item() * images.size(0)
                
                # Denormalize for R² calculation
                if is_normalized:
                    outputs_denorm = outputs * target_stds.to(device) + target_means.to(device)
                else:
                    outputs_denorm = outputs
                
                all_preds.append(outputs_denorm.cpu().numpy())
                all_targets.append(targets_original.cpu().numpy())
        
        val_loss /= len(val_loader.dataset)
        
        # Calculate R²
        all_preds = np.vstack(all_preds)
        all_targets = np.vstack(all_targets)
        val_r2, _ = calculate_competition_r2(all_preds, all_targets)
        
        # Update scheduler
        scheduler.step(val_loss)
        
        # Store history
        history['train_loss'].append(train_loss)
        history['val_loss'].append(val_loss)
        history['val_r2'].append(val_r2)
        history['epoch'].append(epoch + 1)
        
        # Print progress
        print(f"Epoch {epoch+1:2d}/{num_epochs}: Train Loss={train_loss:.4f}, Val Loss={val_loss:.4f}, Val R²={val_r2:+.4f}")
        
        # Save best
        if val_r2 > best_r2:
            best_r2 = val_r2
            torch.save(model.state_dict(), f'{model_name.replace(" ", "_")}_best.pth')
            print(f"  💾 New best R² = {best_r2:+.4f}")
    
    print(f"\n✓ Training complete! Best R² = {best_r2:+.4f}\n")
    return history, best_r2

print("✓ Training function defined")

In [None]:
# Create dataloaders
batch_size = 16

# For Approach A (normalized)
train_dataset_norm = NormalizedDataset(train_data, target_means, target_stds, augment=True)
val_dataset_norm = NormalizedDataset(val_data, target_means, target_stds, augment=False)
train_loader_norm = DataLoader(train_dataset_norm, batch_size=batch_size, shuffle=True)
val_loader_norm = DataLoader(val_dataset_norm, batch_size=batch_size, shuffle=False)

# For Approaches B & C (unnormalized)
train_dataset_unnorm = UnnormalizedDataset(train_data, augment=True)
val_dataset_unnorm = UnnormalizedDataset(val_data, augment=False)
train_loader_unnorm = DataLoader(train_dataset_unnorm, batch_size=batch_size, shuffle=True)
val_loader_unnorm = DataLoader(val_dataset_unnorm, batch_size=batch_size, shuffle=False)

print("✓ Dataloaders created")

In [None]:
# Train Approach A (if passed overfitting test)
if success_a:
    model_a_full = ModelA_Normalized()
    history_a, best_r2_a = train_model(
        model_a_full, train_loader_norm, val_loader_norm, loss_a,
        num_epochs=10, model_name='Approach_A', is_normalized=True
    )
else:
    print("⏭️  Skipping Approach A (failed overfitting test)")
    history_a, best_r2_a = None, None

In [None]:
# Train Approach B (if passed overfitting test)
if success_b:
    model_b_full = ModelB_PlainMSE()
    history_b, best_r2_b = train_model(
        model_b_full, train_loader_unnorm, val_loader_unnorm, loss_b,
        num_epochs=10, model_name='Approach_B', is_normalized=False
    )
else:
    print("⏭️  Skipping Approach B (failed overfitting test)")
    history_b, best_r2_b = None, None

In [None]:
# Train Approach C (if passed overfitting test)
if success_c:
    model_c_full = ModelC_Weighted()
    history_c, best_r2_c = train_model(
        model_c_full, train_loader_unnorm, val_loader_unnorm, loss_c,
        num_epochs=10, model_name='Approach_C', is_normalized=False
    )
else:
    print("⏭️  Skipping Approach C (failed overfitting test)")
    history_c, best_r2_c = None, None

---
## Comparison & Visualization

In [None]:
# Final comparison
print("\n" + "="*80)
print("FINAL COMPARISON")
print("="*80)

comparison_df = pd.DataFrame({
    'Approach': [
        'A: Normalized + Plain MSE',
        'B: Plain MSE + ReLU',
        'C: Competition Weighted',
        '---',
        'Linear Regression (baseline)',
        'Previous CNN (40 epochs)'
    ],
    'Competition R²': [
        f"{best_r2_a:+.4f}" if best_r2_a is not None else 'N/A',
        f"{best_r2_b:+.4f}" if best_r2_b is not None else 'N/A',
        f"{best_r2_c:+.4f}" if best_r2_c is not None else 'N/A',
        '---',
        '+0.2048',
        '-1.2527'
    ]
})

print("\n" + comparison_df.to_string(index=False))
print("\n" + "="*80)

In [None]:
# Plot training curves
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Loss curves
ax = axes[0]
if history_a:
    ax.plot(history_a['epoch'], history_a['val_loss'], 'o-', label='A: Normalized + Plain MSE', linewidth=2)
if history_b:
    ax.plot(history_b['epoch'], history_b['val_loss'], 's-', label='B: Plain MSE + ReLU', linewidth=2)
if history_c:
    ax.plot(history_c['epoch'], history_c['val_loss'], '^-', label='C: Competition Weighted', linewidth=2)
ax.set_xlabel('Epoch', fontsize=12)
ax.set_ylabel('Validation Loss', fontsize=12)
ax.set_title('Loss Curves', fontsize=14, fontweight='bold')
ax.legend()
ax.grid(alpha=0.3)

# R² curves
ax = axes[1]
if history_a:
    ax.plot(history_a['epoch'], history_a['val_r2'], 'o-', label='A: Normalized + Plain MSE', linewidth=2)
if history_b:
    ax.plot(history_b['epoch'], history_b['val_r2'], 's-', label='B: Plain MSE + ReLU', linewidth=2)
if history_c:
    ax.plot(history_c['epoch'], history_c['val_r2'], '^-', label='C: Competition Weighted', linewidth=2)
ax.axhline(y=0.0, color='gray', linestyle='--', linewidth=2, label='Baseline (mean)')
ax.axhline(y=0.2048, color='orange', linestyle='--', linewidth=2, label='Linear regression')
ax.set_xlabel('Epoch', fontsize=12)
ax.set_ylabel('Competition R²', fontsize=12)
ax.set_title('R² Progress', fontsize=14, fontweight='bold')
ax.legend()
ax.grid(alpha=0.3)

plt.tight_layout()
plt.savefig('loss_function_comparison.png', dpi=150, bbox_inches='tight')
plt.show()

print("✓ Comparison plot saved")

---
## Conclusion & Recommendations

In [None]:
print("\n" + "="*80)
print("CONCLUSION")
print("="*80)

# Determine winner
valid_approaches = []
if best_r2_a is not None:
    valid_approaches.append(('A', best_r2_a))
if best_r2_b is not None:
    valid_approaches.append(('B', best_r2_b))
if best_r2_c is not None:
    valid_approaches.append(('C', best_r2_c))

if valid_approaches:
    winner, winner_score = max(valid_approaches, key=lambda x: x[1])
    
    print(f"\n🏆 WINNER: Approach {winner}")
    print(f"   Competition R² = {winner_score:+.4f}")
    
    if winner_score > 0.2048:  # linear regression baseline (competition R²)
        print("\n✅ SUCCESS! CNN beats linear regression!")
        print("\n   Key insights:")
        if winner == 'A':
            print("   • Target normalization is CRITICAL for this task")
            print("   • Plain MSE works better than weighted MSE for training")
            print("   • Competition weights belong in EVALUATION, not TRAINING")
        elif winner == 'B':
            print("   • Output scaling (ReLU) helps with unnormalized targets")
            print("   • Plain MSE works better than weighted MSE")
        else:
            print("   • Weighted MSE can work (surprising!)")
        
        print("\n   Next steps:")
        print("   1. Scale up to 20-30 epochs with early stopping")
        print("   2. Try slightly larger model (ResNet34)")
        print("   3. Fine-tune learning rate and weight decay")
        print("   4. Generate test predictions and submit!")
        
    elif winner_score > 0.0:
        print("\n⚠️  PARTIAL SUCCESS: Positive R² but below linear regression")
        print("\n   Recommendations:")
        print("   • Train for more epochs (20-30)")
        print("   • Try custom normalization (not ImageNet)")
        print("   • Experiment with learning rate")
        print("   • Consider ensemble with linear model")
        
    else:
        print("\n❌ FAILURE: All approaches still have negative R²")
        print("\n   Further investigation needed:")
        print("   • Check ImageNet normalization (use custom?)")
        print("   • Try training from scratch (no pretrained weights)")
        print("   • Investigate data quality issues")
        print("   • Consider simpler task (predict only Dry_Total_g)")
else:
    print("\n❌ All approaches failed overfitting test")
    print("   Need to fix fundamental architecture issues first")

print("\n" + "="*80)
print("✓ Experiment complete!")
print("="*80)