# 🧠 Data Exploration - Robot Navigation Training Data

**Goals:**
1. Generate training data with goal-aware features (3×3 perception + 2 goal_delta = 11 features)
2. Load and inspect dataset
3. View sample data examples
4. Analyze key properties of the feature matrix
5. Examine data distribution and goal delta patterns

**Note:** This notebook explores the new goal-aware architecture with wall padding and compass-like navigation.


## Configuration Options

**Goal-Aware Navigation System:**

| Mode | Features | Description | Expected Accuracy |
|------|----------|-------------|-------------------|
| **Goal-Aware** | 11 (9 perception + 2 goal_delta) | Compass-like navigation | 80-85% |
| **Basic** | 9 (perception only) | Local obstacle avoidance | 70-75% |

**Key Features:**
- 🧱 **Wall Padding**: 12×12 environments with 1-cell borders
- 🎯 **Goal Delta**: Relative coordinates (dx, dy) at every step
- 🧠 **Simplified Architecture**: 11 features vs previous 37 features
- 🧭 **Compass-like Navigation**: Robot always knows direction to goal
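
The feature layout described above can be sketched as follows. This is a minimal illustration, not the project's actual generator: `build_features` is a hypothetical helper, and it assumes cells are 1 for wall/obstacle and 0 for free, positions are `(row, col)`, and the goal delta is `goal - robot`.

```python
import numpy as np

def build_features(grid, robot, goal):
    """Return an 11-element vector: flattened 3x3 perception + goal delta.

    Hypothetical helper. Assumes 1 = wall/obstacle, 0 = free,
    (row, col) positions, and goal delta = goal - robot.
    """
    r, c = robot
    # 3x3 window centred on the robot; the wall border keeps this slice in bounds
    perception = grid[r - 1:r + 2, c - 1:c + 2].astype(np.float32).ravel()
    goal_delta = np.array([goal[0] - r, goal[1] - c], dtype=np.float32)
    return np.concatenate([perception, goal_delta])

# 12x12 environment: 10x10 free interior surrounded by a 1-cell wall border
grid = np.ones((12, 12), dtype=np.int8)
grid[1:-1, 1:-1] = 0

features = build_features(grid, robot=(1, 1), goal=(10, 10))
print(features.shape)               # (11,)
print(features[:9].reshape(3, 3))   # perception window in the top-left corner
print(features[9:])                 # goal delta
```

Because every interior cell has a full ring of neighbours inside the padded grid, the 3×3 slice never needs a bounds check, which is one practical reason to pad with walls.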

**Usage:**
```bash
# Generate datasets
python scripts/generate_data.py large          # Goal-aware mode (11 features)
python scripts/generate_data.py large --basic  # Basic mode (9 features)

# Train models
python scripts/train_nn.py                     # Goal-aware mode
python scripts/train_nn.py --basic             # Basic mode
```

**Benefits:**
- ✅ Simpler architecture (70% fewer features)
- ✅ Goal awareness like animal navigation
- ✅ No memory dependencies
- ✅ Better generalization
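
The 70% figure follows directly from the feature counts quoted above (37 previously, 11 now):

$$\frac{37 - 11}{37} = \frac{26}{37} \approx 0.703$$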


In [4]:
# 📦 IMPORTS
# ==========
import sys
import subprocess
from pathlib import Path
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Add project to path
project_root = Path().resolve().parent
sys.path.append(str(project_root))

# Import data generation utilities
from core.data_generation import load_training_data, TrainingDataGenerator

# Set visualization style
plt.style.use('seaborn-v0_8')
sns.set_palette("husl")

print("✅ Setup complete!")


✅ Setup complete!


In [5]:
# Load the training dataset
X_train, y_train, metadata = load_training_data("../data/raw/large_training_dataset.npz")

# Detect mode (goal-aware vs basic)
feature_count = X_train.shape[1]
is_goal_aware = feature_count == 11
mode_type = "Goal-Aware 🎯" if is_goal_aware else "Basic"

print("✅ Dataset loaded successfully!")
print(f"   Training examples: {len(X_train)}")
print(f"   Environments: {len(metadata)}")
print(f"   Mode: {mode_type}")
print(f"   Features: {feature_count}")

# Show environment info
if len(metadata) > 0:
    sample_env = metadata[0]
    print(f"   Environment size: 12×12 (10×10 inner + 1-cell walls)")
    print(f"   Wall padding: {sample_env.get('wall_padding', 'N/A')}")


📂 Training data loaded from ../data/raw/large_training_dataset.npz
✅ Dataset loaded successfully!
   Training examples: 8761
   Environments: 1000
   Mode: Goal-Aware 🎯
   Features: 11
   Environment size: 12×12 (10×10 inner + 1-cell walls)
   Wall padding: N/A


In [6]:
# Feature breakdown
feature_count = X_train.shape[1]
print(f"\n🧠 Feature Breakdown:")
if feature_count == 11:
    print(f"   ✅ Goal-Aware Mode: {feature_count} features")
    print(f"      • Perception: 9 features (3×3 grid)")
    print(f"      • Goal Delta: 2 features (dx, dy)")
    print(f"      • Architecture: 11 → 64 → 32 → 4")
elif feature_count == 9:
    print(f"   ⚠️  Basic Mode: {feature_count} features")
    print(f"      • Perception: 9 features (3×3 grid)")
    print(f"      • Architecture: 9 → 64 → 32 → 4")
else:
    print(f"   ❓ Unknown Mode: {feature_count} features")
    print(f"   Expected: 11 (goal-aware) or 9 (basic) features")

# Action labels
action_names = ['UP', 'DOWN', 'LEFT', 'RIGHT']
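# minlength guarantees one count per action even if an action never appears
action_counts = np.bincount(y_train, minlength=len(action_names))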

print(f"\n🎯 Action Distribution:")
for i, action in enumerate(action_names):
    count = action_counts[i]
    percentage = (count / len(y_train)) * 100
    print(f"   {action}: {count:>5} ({percentage:>5.1f}%)")



🧠 Feature Breakdown:
   ✅ Goal-Aware Mode: 11 features
      • Perception: 9 features (3×3 grid)
      • Goal Delta: 2 features (dx, dy)
      • Architecture: 11 → 64 → 32 → 4

🎯 Action Distribution:
   UP:  2221 ( 25.4%)
   DOWN:  2119 ( 24.2%)
   LEFT:  2151 ( 24.6%)
   RIGHT:  2270 ( 25.9%)
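
To put the distribution above in context, the sketch below computes two simple balance indicators from the printed counts: the accuracy of always predicting the most common action, and the ratio between the largest and smallest class. The counts are copied from the output; the variable names are illustrative.

```python
import numpy as np

action_names = ["UP", "DOWN", "LEFT", "RIGHT"]
# Counts copied from the action distribution printed above
action_counts = np.array([2221, 2119, 2151, 2270])

total = action_counts.sum()
# Accuracy of a trivial model that always predicts the most common action
majority_baseline = action_counts.max() / total
# Ratio of the largest class to the smallest class (1.0 = perfectly balanced)
imbalance_ratio = action_counts.max() / action_counts.min()

print(f"Total examples: {total}")
print(f"Majority-class baseline: {majority_baseline:.1%}")
print(f"Max/min class ratio: {imbalance_ratio:.2f}")
```

A baseline near 26% means any trained model should be judged against roughly chance-level performance for four actions, and a ratio close to 1 suggests class weighting is probably unnecessary here.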


## Summary

**Dataset Ready for Training:**
- ✅ Goal-aware features (11: 9 perception + 2 goal_delta)
- ✅ Wall padding (12×12 environments with 1-cell borders)
- ✅ Balanced action distribution (each action accounts for 24.2% to 25.9% of examples)
- ✅ Diverse obstacle patterns
- ✅ Compass-like navigation data

**Key Insights:**
- 🧭 **Goal Delta**: The signs of (dx, dy) give the direction to the goal, and |dx| + |dy| gives the Manhattan distance
- 🧱 **Wall Padding**: Consistent boundary handling across environments
- 🎯 **Simplified Architecture**: 11 features vs previous 37 features (70% reduction)
- 🧠 **Compass-like Navigation**: Robot learns goal-relative spatial awareness

**Next Steps:**
1. Train neural network: `python scripts/train_nn.py`
2. Compare with basic mode: `python scripts/train_nn.py --basic`
3. Analyze goal-aware vs basic performance differences
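
For orientation before training, here is a minimal NumPy sketch of the 11 → 64 → 32 → 4 layer sizes reported in the feature breakdown. It is not the code in `scripts/train_nn.py`: the ReLU hidden activations, the softmax output, and the weight initialisation are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
layer_sizes = [11, 64, 32, 4]  # from the feature breakdown: 11 -> 64 -> 32 -> 4

# Randomly initialised weights and zero biases (illustrative only, untrained)
params = [
    (rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out))
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]

def forward(x, params):
    """Forward pass: ReLU on hidden layers, softmax over the 4 actions (both assumed)."""
    for i, (w, b) in enumerate(params):
        x = x @ w + b
        if i < len(params) - 1:
            x = np.maximum(x, 0.0)  # ReLU on hidden layers only
    x = x - x.max(axis=-1, keepdims=True)  # subtract the row max for a stable softmax
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

n_params = sum(w.size + b.size for w, b in params)
probs = forward(np.zeros((5, 11)), params)
print(n_params)     # 2980 trainable parameters
print(probs.shape)  # (5, 4): one probability per action for each of 5 inputs
```

With fewer than 3,000 parameters and 8,761 training examples, a network of this size is small relative to the dataset, which is consistent with the notebook's emphasis on a simplified architecture.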


In [7]:
# Display sample training examples
print("👁️  SAMPLE TRAINING EXAMPLES")
print("=" * 60)

action_names = ['UP', 'DOWN', 'LEFT', 'RIGHT']
num_samples = 5

for i in range(num_samples):
    sample_idx = np.random.randint(0, len(X_train))
    sample_x = X_train[sample_idx]
    sample_y = y_train[sample_idx]
    
    print(f"\nExample {i+1} (index {sample_idx}):")
    
    # Dynamic feature detection and display
    feature_count = X_train.shape[1]
    
    if feature_count == 11:
        # Goal-aware mode: 9 perception + 2 goal_delta
        perception = sample_x[:9].reshape(3, 3)
        goal_delta = sample_x[9:11]  # Last 2 features (dx, dy)
        
        print(f"  3×3 Perception (Binary):")
        for row in perception:
            print(f"    {' '.join(['X' if x > 0 else '.' for x in row])}")
        
        print(f"  Goal Delta (dx, dy): ({goal_delta[0]:.0f}, {goal_delta[1]:.0f})")
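        # goal_delta[0] is the vertical offset and goal_delta[1] the horizontal one;
        # a zero component means the robot is already aligned on that axis
        vertical = 'UP' if goal_delta[0] < 0 else 'DOWN' if goal_delta[0] > 0 else ''
        horizontal = 'LEFT' if goal_delta[1] < 0 else 'RIGHT' if goal_delta[1] > 0 else ''
        print(f"    → Direction to goal: {' '.join(filter(None, [vertical, horizontal])) or 'AT GOAL'}")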
        print(f"    → Distance to goal: {abs(goal_delta[0]) + abs(goal_delta[1])} steps")
                
    elif feature_count == 9:
        # Basic mode: 9 perception only
        perception = sample_x.reshape(3, 3)
        print(f"  3×3 Perception (Binary):")
        for row in perception:
            print(f"    {' '.join(['X' if x > 0 else '.' for x in row])}")
            
    else:
        print(f"  Unknown feature format: {feature_count} features")
        print(f"  Raw features: {sample_x[:10]}...")  # Show first 10 features
    
    print(f"  → Action: {action_names[sample_y]}")


👁️  SAMPLE TRAINING EXAMPLES
============================================================

Example 1 (index 2740):
  3×3 Perception (Binary):
    . . X
    X . .
    X . .
  Goal Delta (dx, dy): (-2, 1)
    → Direction to goal: UP RIGHT
    → Distance to goal: 3.0 steps
  → Action: UP

Example 2 (index 5374):
  3×3 Perception (Binary):
    . . X
    X . .
    . . .
  Goal Delta (dx, dy): (-3, 3)
    → Direction to goal: UP RIGHT
    → Distance to goal: 6.0 steps
  → Action: UP

Example 3 (index 1778):
  3×3 Perception (Binary):
    . . .
    . . X
    . . X
  Goal Delta (dx, dy): (3, 1)
    → Direction to goal: DOWN RIGHT
    → Distance to goal: 4.0 steps
  → Action: DOWN

Example 4 (index 1598):
  3×3 Perception (Binary):
    . . .
    . . X
    X . .
  Goal Delta (dx, dy): (1, -3)
    → Direction to goal: DOWN LEFT
    → Distance to goal: 4.0 steps
  → Action: LEFT

Example 5 (index 2538):
  3×3 Perception (Binary):
    . . .
    . . .
    . . .
  Goal Delta (dx, dy): (-3, -2)
    → Direction to goal: UP LEFT
    → Distance to goal: 5.0 steps
