# 🚀 Mars Landing Safety Assessment with Machine Learning

## A Comprehensive Step-by-Step Guide

This notebook provides a complete solution for assessing Mars landing site safety using computer vision and machine learning techniques. We'll fix the issues from the original `landassist.py` file and create a robust, step-by-step analysis pipeline.

### 🎯 Key Improvements Made:
- ✅ **Removed Streamlit dependency** - Now works as standalone notebook
- ✅ **Enhanced error handling** - Proper validation and exception handling
- ✅ **Improved feature extraction** - More robust computer vision features
- ✅ **Better labeling strategy** - Manual labeling with proper validation
- ✅ **Comprehensive visualization** - Better plots and analysis
- ✅ **Modular structure** - Step-by-step approach for learning

### 📋 What We'll Cover:
1. **Library Setup** - Import all necessary dependencies
2. **Image Preprocessing** - Noise reduction and normalization
3. **Feature Extraction** - Surface slope, roughness, and texture analysis
4. **Model Pipeline** - Random Forest and Gradient Boosting classifiers with cross-validation
5. **Safety Assessment** - Single-image and batch prediction pipeline
6. **Dataset Preparation** - Load and validate Mars terrain images
7. **Model Training** - Train with proper validation
8. **Performance Evaluation** - Comprehensive metrics and analysis
9. **Testing Pipeline** - Test on sample images
10. **Results Visualization** - Beautiful plots and insights

## 1. 📚 Import Required Libraries

Let's start by importing all the necessary libraries for image processing, machine learning, and visualization.

In [1]:
# Core libraries
import os
import sys
import glob
import warnings
from pathlib import Path
from typing import List, Tuple, Dict, Optional, Union

# Numerical and data processing
import numpy as np
import pandas as pd

# Computer vision and image processing
import cv2
from PIL import Image
from skimage import measure, filters, morphology
from skimage.feature import graycomatrix, graycoprops, local_binary_pattern
from skimage.segmentation import slic
from scipy import ndimage as ndi
from scipy.ndimage import gaussian_filter

# Machine learning
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import (
    train_test_split, cross_val_score, StratifiedKFold, 
    GridSearchCV, validation_curve
)
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, f1_score, 
    confusion_matrix, classification_report, roc_auc_score,
    roc_curve, precision_recall_curve
)
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.pipeline import Pipeline

# Visualization
import matplotlib.pyplot as plt
import matplotlib.patches as patches
from matplotlib.colors import ListedColormap
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots

# Progress tracking
from tqdm.notebook import tqdm

# Configure display settings
plt.style.use('seaborn-v0_8')
sns.set_palette("husl")
warnings.filterwarnings('ignore')
np.random.seed(42)

# Configure matplotlib for better plots
plt.rcParams['figure.figsize'] = (12, 8)
plt.rcParams['font.size'] = 12
plt.rcParams['axes.grid'] = True

print("✅ All libraries imported successfully!")
print(f"📊 OpenCV version: {cv2.__version__}")
print(f"🔢 NumPy version: {np.__version__}")
print(f"🤖 Scikit-learn available")
print(f"📈 Matplotlib available")

✅ All libraries imported successfully!
📊 OpenCV version: 4.10.0
🔢 NumPy version: 1.26.4
🤖 Scikit-learn available
📈 Matplotlib available
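
The import cell fails on the first missing package, which can make a fresh environment tedious to set up. As a minimal sketch, the check below lists every top-level module that cannot be located before anything is imported. The helper `find_missing_modules` and the `REQUIRED_MODULES` list are illustrative names, not part of the original `landassist.py`.

```python
from importlib.util import find_spec

# Import names (not pip package names) used by the import cell above.
REQUIRED_MODULES = ["numpy", "pandas", "cv2", "PIL", "skimage", "scipy",
                    "sklearn", "matplotlib", "seaborn", "plotly", "tqdm"]


def find_missing_modules(module_names):
    """Return the subset of top-level module names that cannot be located."""
    return [name for name in module_names if find_spec(name) is None]


missing = find_missing_modules(REQUIRED_MODULES)
if missing:
    print(f"Missing modules: {', '.join(missing)}")
else:
    print("All required modules are importable.")
```

`find_spec` only locates a module without importing it, so the check is cheap to run at the top of the notebook.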


## 2. 🔧 Define Image Preprocessing Functions

Let's create robust functions for image preprocessing with proper error handling and validation.

In [3]:
def load_image_safely(image_path: str, target_size: Optional[Tuple[int, int]] = None) -> Optional[np.ndarray]:
    """
    Safely load an image with proper error handling.
    
    Args:
        image_path: Path to the image file
        target_size: Optional target size (width, height) for resizing
        
    Returns:
        Loaded image as numpy array or None if failed
    """
    try:
        if not os.path.exists(image_path):
            print(f"❌ Image not found: {image_path}")
            return None
            
        # Load image using OpenCV
        image = cv2.imread(image_path)
        
        if image is None:
            print(f"❌ Failed to load image: {image_path}")
            return None
            
        # Resize if target size specified
        if target_size is not None:
            image = cv2.resize(image, target_size, interpolation=cv2.INTER_AREA)
            
        return image
        
    except Exception as e:
        print(f"❌ Error loading image {image_path}: {str(e)}")
        return None


def preprocess_image(image: np.ndarray, 
                    blur_kernel: Tuple[int, int] = (5, 5),
                    enhance_contrast: bool = True) -> np.ndarray:
    """
    Preprocess image with noise reduction and contrast enhancement.
    
    Args:
        image: Input image as numpy array
        blur_kernel: Gaussian blur kernel size (width, height); both values must be odd
        enhance_contrast: Whether to enhance contrast using CLAHE
        
    Returns:
        Preprocessed image
    """
    try:
        # Convert to grayscale if needed
        if len(image.shape) == 3:
            gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        else:
            gray = image.copy()
            
        # Apply Gaussian blur for noise reduction
        blurred = cv2.GaussianBlur(gray, blur_kernel, 0)
        
        # Enhance contrast using CLAHE (Contrast Limited Adaptive Histogram Equalization)
        if enhance_contrast:
            clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
            enhanced = clahe.apply(blurred)
            return enhanced
            
        return blurred
        
    except Exception as e:
        print(f"❌ Error in preprocessing: {str(e)}")
        return image


def normalize_image(image: np.ndarray) -> np.ndarray:
    """
    Normalize image to 0-255 range.
    
    Args:
        image: Input image
        
    Returns:
        Normalized image
    """
    try:
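        pixels = image.astype(np.float64)
        value_range = pixels.max() - pixels.min()
        if value_range == 0:
            # Constant image: avoid division by zero and return all zeros
            return np.zeros(image.shape, dtype=np.uint8)
        # Normalize to 0-1 range first
        normalized = (pixels - pixels.min()) / value_range
        # Scale to 0-255 range
        return (normalized * 255).astype(np.uint8)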
    except Exception as e:
        print(f"❌ Error in normalization: {str(e)}")
        return image


def validate_image(image: np.ndarray, min_size: Tuple[int, int] = (50, 50)) -> bool:
    """
    Validate if image meets minimum requirements.
    
    Args:
        image: Input image
        min_size: Minimum required size (width, height)
        
    Returns:
        True if image is valid, False otherwise
    """
    if image is None:
        return False
        
    if len(image.shape) < 2:
        return False
        
    h, w = image.shape[:2]
    if h < min_size[1] or w < min_size[0]:
        return False
        
    return True


# Confirm the preprocessing functions are defined
print("✅ Image preprocessing functions defined successfully!")
print("📋 Available functions:")
print("  • load_image_safely() - Safe image loading with error handling")
print("  • preprocess_image() - Noise reduction and contrast enhancement")
print("  • normalize_image() - Image normalization")
print("  • validate_image() - Image validation")

✅ Image preprocessing functions defined successfully!
📋 Available functions:
  • load_image_safely() - Safe image loading with error handling
  • preprocess_image() - Noise reduction and contrast enhancement
  • normalize_image() - Image normalization
  • validate_image() - Image validation
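
Min-max rescaling divides by `max - min`, which is zero for a constant image (for example, a fully saturated frame). The sketch below shows the same rescaling with an explicit guard for that case. `minmax_to_uint8` is a hypothetical stand-alone helper written for illustration, and it assumes only NumPy.

```python
import numpy as np


def minmax_to_uint8(image: np.ndarray) -> np.ndarray:
    """Rescale an array to the 0-255 range, mapping a constant image to zeros."""
    pixels = image.astype(np.float64)
    lo, hi = pixels.min(), pixels.max()
    if hi == lo:
        # No dynamic range: avoid 0/0 and return an all-zero image.
        return np.zeros(image.shape, dtype=np.uint8)
    # Values are truncated (not rounded) when cast to uint8.
    return ((pixels - lo) / (hi - lo) * 255).astype(np.uint8)


gradient = np.array([[10, 20], [30, 50]])
print(minmax_to_uint8(gradient))

flat = np.full((2, 2), 7)
print(minmax_to_uint8(flat))
```

For the `gradient` array the result is `[[0, 63], [127, 255]]`, and the constant `flat` array maps to all zeros instead of producing NaN values.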


## 3. 🔍 Implement Feature Extraction Methods

Now let's create comprehensive feature extraction functions for terrain analysis.

In [4]:
def calculate_surface_slope(image: np.ndarray) -> Dict[str, float]:
    """
    Estimate surface slope from image intensity gradients (Sobel operator).
    
    Args:
        image: Grayscale image
        
    Returns:
        Dictionary with slope metrics
    """
    try:
        # Sobel gradients
        grad_x = cv2.Sobel(image, cv2.CV_64F, 1, 0, ksize=5)
        grad_y = cv2.Sobel(image, cv2.CV_64F, 0, 1, ksize=5)
        
        # Calculate gradient magnitude
        gradient_magnitude = cv2.magnitude(grad_x, grad_y)
        
        # Calculate slope metrics
        slope_metrics = {
            'max_slope': float(np.max(gradient_magnitude)),
            'mean_slope': float(np.mean(gradient_magnitude)),
            'std_slope': float(np.std(gradient_magnitude)),
            'slope_variance': float(np.var(gradient_magnitude)),
            'slope_percentile_95': float(np.percentile(gradient_magnitude, 95))
        }
        
        return slope_metrics
        
    except Exception as e:
        print(f"❌ Error calculating slope: {str(e)}")
        return {'max_slope': 0, 'mean_slope': 0, 'std_slope': 0, 
                'slope_variance': 0, 'slope_percentile_95': 0}


def calculate_surface_roughness(image: np.ndarray) -> Dict[str, float]:
    """
    Calculate surface roughness using GLCM and other texture measures.
    
    Args:
        image: Grayscale image
        
    Returns:
        Dictionary with roughness metrics
    """
    try:
        # Ensure image is uint8
        if image.dtype != np.uint8:
            image = normalize_image(image)
            
        # Calculate GLCM (Gray Level Co-occurrence Matrix)
        distances = [1, 2, 3]
        angles = [0, np.pi/4, np.pi/2, 3*np.pi/4]
        
        glcm = graycomatrix(image, distances, angles, symmetric=True, normed=True)
        
        # Extract GLCM properties
        contrast = graycoprops(glcm, 'contrast').mean()
        dissimilarity = graycoprops(glcm, 'dissimilarity').mean()
        homogeneity = graycoprops(glcm, 'homogeneity').mean()
        energy = graycoprops(glcm, 'energy').mean()
        correlation = graycoprops(glcm, 'correlation').mean()
        
        # Additional roughness measures
        laplacian_var = cv2.Laplacian(image, cv2.CV_64F).var()
        
        roughness_metrics = {
            'glcm_contrast': float(contrast),
            'glcm_dissimilarity': float(dissimilarity),
            'glcm_homogeneity': float(homogeneity),
            'glcm_energy': float(energy),
            'glcm_correlation': float(correlation),
            'laplacian_variance': float(laplacian_var)
        }
        
        return roughness_metrics
        
    except Exception as e:
        print(f"❌ Error calculating roughness: {str(e)}")
        return {'glcm_contrast': 0, 'glcm_dissimilarity': 0, 'glcm_homogeneity': 0,
                'glcm_energy': 0, 'glcm_correlation': 0, 'laplacian_variance': 0}


def calculate_texture_features(image: np.ndarray) -> Dict[str, float]:
    """
    Calculate additional texture features using LBP and statistical measures.
    
    Args:
        image: Grayscale image
        
    Returns:
        Dictionary with texture metrics
    """
    try:
        # Local Binary Pattern
        radius = 3
        n_points = 8 * radius
        lbp = local_binary_pattern(image, n_points, radius, method='uniform')
        
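        # Statistical texture measures
        mean_intensity = float(np.mean(image))
        std_intensity = float(np.std(image))
        if std_intensity > 0:
            centered = image.astype(np.float64) - mean_intensity
            skewness = float((centered ** 3).mean() / (std_intensity ** 3))
            kurtosis = float((centered ** 4).mean() / (std_intensity ** 4))
        else:
            # Constant image: higher moments are undefined, so report 0
            skewness = 0.0
            kurtosis = 0.0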
        
        # Edge density
        edges = cv2.Canny(image, 50, 150)
        edge_density = float(np.sum(edges > 0) / edges.size)
        
        texture_metrics = {
            'lbp_variance': float(np.var(lbp)),
            'mean_intensity': mean_intensity,
            'std_intensity': std_intensity,
            'skewness': skewness,
            'kurtosis': kurtosis,
            'edge_density': edge_density
        }
        
        return texture_metrics
        
    except Exception as e:
        print(f"❌ Error calculating texture: {str(e)}")
        return {'lbp_variance': 0, 'mean_intensity': 0, 'std_intensity': 0,
                'skewness': 0, 'kurtosis': 0, 'edge_density': 0}


def extract_comprehensive_features(image: np.ndarray) -> Dict[str, float]:
    """
    Extract comprehensive features for landing safety assessment.
    
    Args:
        image: Input image (BGR or grayscale)
        
    Returns:
        Dictionary with all extracted features
    """
    try:
        # Convert to grayscale if needed
        if len(image.shape) == 3:
            gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        else:
            gray = image.copy()
            
        # Preprocess image
        processed = preprocess_image(gray)
        
        # Extract different types of features
        slope_features = calculate_surface_slope(processed)
        roughness_features = calculate_surface_roughness(processed)
        texture_features = calculate_texture_features(processed)
        
        # Combine all features
        all_features = {}
        all_features.update(slope_features)
        all_features.update(roughness_features)
        all_features.update(texture_features)
        
        return all_features
        
    except Exception as e:
        print(f"❌ Error extracting features: {str(e)}")
        return {}


def features_to_vector(features_dict: Dict[str, float]) -> np.ndarray:
    """
    Convert feature dictionary to numpy vector for ML models.
    
    Args:
        features_dict: Dictionary of features
        
    Returns:
        Feature vector as numpy array
    """
    return np.array(list(features_dict.values()))


# Confirm the feature extraction functions are defined
print("✅ Feature extraction functions defined successfully!")
print("📋 Available feature extraction functions:")
print("  • calculate_surface_slope() - Multi-metric slope analysis")
print("  • calculate_surface_roughness() - GLCM-based roughness")
print("  • calculate_texture_features() - LBP and statistical features")
print("  • extract_comprehensive_features() - Complete feature extraction")
print("  • features_to_vector() - Convert features to ML-ready format")

✅ Feature extraction functions defined successfully!
📋 Available feature extraction functions:
  • calculate_surface_slope() - Multi-metric slope analysis
  • calculate_surface_roughness() - GLCM-based roughness
  • calculate_texture_features() - LBP and statistical features
  • extract_comprehensive_features() - Complete feature extraction
  • features_to_vector() - Convert features to ML-ready format
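
To make the GLCM contrast feature less of a black box, here is a small NumPy-only sketch that computes it by hand for a single offset (one pixel to the right), using the standard definition: contrast is the sum of P(i, j) * (i - j)^2 over the normalized co-occurrence matrix. The function `glcm_contrast_horizontal` is an illustrative helper, not the scikit-image implementation, and it covers only one distance and angle rather than the averaged set used above.

```python
import numpy as np


def glcm_contrast_horizontal(image: np.ndarray, levels: int) -> float:
    """GLCM contrast for a 1-pixel horizontal offset (symmetric, normalized)."""
    glcm = np.zeros((levels, levels), dtype=np.float64)
    left, right = image[:, :-1].ravel(), image[:, 1:].ravel()
    np.add.at(glcm, (left, right), 1)   # count each (left, right) pixel pair
    glcm = glcm + glcm.T                # symmetric: count both directions
    glcm = glcm / glcm.sum()            # normalize to joint probabilities
    i, j = np.indices(glcm.shape)
    return float(np.sum(glcm * (i - j) ** 2))


smooth = np.array([[0, 0, 1, 1],
                   [0, 0, 1, 1]])
rough = np.array([[0, 3, 0, 3],
                  [3, 0, 3, 0]])

print(glcm_contrast_horizontal(smooth, levels=4))
print(glcm_contrast_horizontal(rough, levels=4))
```

The smooth patch yields a contrast of 1/3, while the alternating patch yields 9, because every neighboring pair differs by 3 gray levels. Since contrast grows with the square of gray-level differences, values on 8-bit images can be far above 1, so any fixed contrast threshold should be calibrated against the actual dataset.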


## 4. 🤖 Create Model Training Pipeline

Let's build a robust machine learning pipeline with proper validation and error handling.

In [5]:
class LandingSafetyModel:
    """
    Comprehensive landing safety assessment model with proper validation.
    """
    
    def __init__(self, random_state: int = 42):
        self.random_state = random_state
        self.model = None
        self.scaler = StandardScaler()
        self.feature_names = None
        self.is_trained = False
        self.training_metrics = {}
        
    def create_model(self, model_type: str = 'random_forest') -> object:
        """
        Create machine learning model based on specified type.
        
        Args:
            model_type: Type of model ('random_forest' or 'gradient_boosting')
            
        Returns:
            Configured model object
        """
        if model_type == 'random_forest':
            return RandomForestClassifier(
                n_estimators=100,
                max_depth=10,
                min_samples_split=5,
                min_samples_leaf=2,
                random_state=self.random_state,
                class_weight='balanced',
                n_jobs=-1
            )
        elif model_type == 'gradient_boosting':
            return GradientBoostingClassifier(
                n_estimators=100,
                learning_rate=0.1,
                max_depth=6,
                random_state=self.random_state
            )
        else:
            raise ValueError(f"Unsupported model type: {model_type}")
    
    def validate_data(self, X: np.ndarray, y: np.ndarray) -> bool:
        """
        Validate training data for consistency and quality.
        
        Args:
            X: Feature matrix
            y: Target labels
            
        Returns:
            True if data is valid, False otherwise
        """
        try:
            # Check basic requirements
            if len(X) != len(y):
                print("❌ Feature matrix and labels have different lengths")
                return False
                
            if len(X) < 10:
                print("❌ Insufficient training data (minimum 10 samples required)")
                return False
                
            # Check for NaN or infinite values
            if np.any(np.isnan(X)) or np.any(np.isinf(X)):
                print("❌ Feature matrix contains NaN or infinite values")
                return False
                
            # Check label distribution
            unique_labels = np.unique(y)
            if len(unique_labels) < 2:
                print("❌ Need at least 2 different classes for training")
                return False
                
            # Check class balance
            # np.unique counts only labels that occur, unlike np.bincount
            _, label_counts = np.unique(y, return_counts=True)
            min_class_size = np.min(label_counts)
            if min_class_size < 3:
                print(f"⚠️  Warning: Minimum class has only {min_class_size} samples")
                
            return True
            
        except Exception as e:
            print(f"❌ Error validating data: {str(e)}")
            return False
    
    def train_model(self, X: np.ndarray, y: np.ndarray, 
                   model_type: str = 'random_forest',
                   cv_folds: int = 5) -> Dict[str, float]:
        """
        Train the landing safety model with cross-validation.
        
        Args:
            X: Feature matrix
            y: Target labels (0=unsafe, 1=safe)
            model_type: Type of model to train
            cv_folds: Number of cross-validation folds
            
        Returns:
            Dictionary with training metrics
        """
        try:
            # Validate input data
            if not self.validate_data(X, y):
                return {}
                
            print(f"🚀 Training {model_type} model with {len(X)} samples...")
            
            # Create and configure model
            self.model = self.create_model(model_type)
            
            # Scale features
            X_scaled = self.scaler.fit_transform(X)
            
            # Perform cross-validation. The scaler is refit inside each fold via a
            # Pipeline so held-out samples do not influence the scaling statistics.
            cv = StratifiedKFold(n_splits=cv_folds, shuffle=True, random_state=self.random_state)
            cv_pipeline = Pipeline([
                ('scaler', StandardScaler()),
                ('classifier', self.create_model(model_type))
            ])
            
            # Calculate cross-validation scores
            cv_scores = cross_val_score(cv_pipeline, X, y, cv=cv, scoring='accuracy')
            cv_precision = cross_val_score(cv_pipeline, X, y, cv=cv, scoring='precision')
            cv_recall = cross_val_score(cv_pipeline, X, y, cv=cv, scoring='recall')
            cv_f1 = cross_val_score(cv_pipeline, X, y, cv=cv, scoring='f1')
            
            # Train final model on all data
            self.model.fit(X_scaled, y)
            
            # Store training metrics
            self.training_metrics = {
                'cv_accuracy_mean': float(np.mean(cv_scores)),
                'cv_accuracy_std': float(np.std(cv_scores)),
                'cv_precision_mean': float(np.mean(cv_precision)),
                'cv_precision_std': float(np.std(cv_precision)),
                'cv_recall_mean': float(np.mean(cv_recall)),
                'cv_recall_std': float(np.std(cv_recall)),
                'cv_f1_mean': float(np.mean(cv_f1)),
                'cv_f1_std': float(np.std(cv_f1)),
                'n_samples': len(X),
                'n_features': X.shape[1],
                'model_type': model_type
            }
            
            self.is_trained = True
            
            print(f"✅ Model trained successfully!")
            print(f"📊 CV Accuracy: {self.training_metrics['cv_accuracy_mean']:.3f} ± {self.training_metrics['cv_accuracy_std']:.3f}")
            print(f"📊 CV Precision: {self.training_metrics['cv_precision_mean']:.3f} ± {self.training_metrics['cv_precision_std']:.3f}")
            print(f"📊 CV Recall: {self.training_metrics['cv_recall_mean']:.3f} ± {self.training_metrics['cv_recall_std']:.3f}")
            print(f"📊 CV F1-Score: {self.training_metrics['cv_f1_mean']:.3f} ± {self.training_metrics['cv_f1_std']:.3f}")
            
            return self.training_metrics
            
        except Exception as e:
            print(f"❌ Error training model: {str(e)}")
            return {}
    
    def predict_safety(self, features: Union[Dict[str, float], np.ndarray]) -> Tuple[int, float]:
        """
        Predict landing safety for given features.
        
        Args:
            features: Feature dictionary or vector
            
        Returns:
            Tuple of (prediction, confidence)
        """
        try:
            if not self.is_trained:
                raise ValueError("Model not trained yet!")
                
            # Convert features to vector if needed
            if isinstance(features, dict):
                feature_vector = features_to_vector(features)
            else:
                feature_vector = features
                
            # Reshape for single prediction
            feature_vector = feature_vector.reshape(1, -1)
            
            # Scale features
            feature_vector_scaled = self.scaler.transform(feature_vector)
            
            # Make prediction
            prediction = self.model.predict(feature_vector_scaled)[0]
            confidence = np.max(self.model.predict_proba(feature_vector_scaled)[0])
            
            return int(prediction), float(confidence)
            
        except Exception as e:
            print(f"❌ Error predicting safety: {str(e)}")
            return 0, 0.0
    
    def get_feature_importance(self) -> Dict[str, float]:
        """
        Get feature importance from trained model.
        
        Returns:
            Dictionary with feature importance scores
        """
        try:
            if not self.is_trained or not hasattr(self.model, 'feature_importances_'):
                return {}
                
            if self.feature_names is None:
                feature_names = [f'feature_{i}' for i in range(len(self.model.feature_importances_))]
            else:
                feature_names = self.feature_names
                
            importance_dict = dict(zip(feature_names, self.model.feature_importances_))
            
            # Sort by importance
            sorted_importance = dict(sorted(importance_dict.items(), 
                                          key=lambda x: x[1], reverse=True))
            
            return sorted_importance
            
        except Exception as e:
            print(f"❌ Error getting feature importance: {str(e)}")
            return {}


# Confirm the model class is defined
print("✅ Landing Safety Model class defined successfully!")
print("📋 Model capabilities:")
print("  • Multiple model types (Random Forest, Gradient Boosting)")
print("  • Comprehensive data validation")
print("  • Cross-validation with multiple metrics")
print("  • Feature scaling and normalization")
print("  • Prediction with confidence scores")
print("  • Feature importance analysis")

✅ Landing Safety Model class defined successfully!
📋 Model capabilities:
  • Multiple model types (Random Forest, Gradient Boosting)
  • Comprehensive data validation
  • Cross-validation with multiple metrics
  • Feature scaling and normalization
  • Prediction with confidence scores
  • Feature importance analysis
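
`predict_safety()` converts a feature dictionary with `features_to_vector()`, which depends on the dictionary's insertion order matching the column order used during training. A more defensive option is to record the column names once and always build vectors in that order. The sketch below assumes a hypothetical helper `features_to_ordered_vector` and an illustrative three-feature subset; neither is part of the class above.

```python
import numpy as np


def features_to_ordered_vector(features: dict, feature_names: list) -> np.ndarray:
    """Build a feature vector in a fixed column order, failing loudly on missing keys."""
    missing = [name for name in feature_names if name not in features]
    if missing:
        raise KeyError(f"Missing features: {missing}")
    return np.array([features[name] for name in feature_names], dtype=np.float64)


# Column order recorded once, for example at training time (illustrative subset).
feature_names = ["max_slope", "glcm_contrast", "edge_density"]

# The same features, inserted in a different order.
sample = {"edge_density": 0.05, "max_slope": 120.0, "glcm_contrast": 0.3}

print(features_to_ordered_vector(sample, feature_names))
```

Storing the same list in `self.feature_names` would also let `get_feature_importance()` report real feature names instead of the generic `feature_0`, `feature_1`, ... fallback.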


## 5. 🛡️ Build Safety Assessment Functions

Now let's create functions to assess landing safety using our trained model.

In [14]:
def assess_landing_safety(image_path: str, model: LandingSafetyModel) -> Dict[str, Union[str, float, Dict]]:
    """
    Comprehensive landing safety assessment for a single image.
    
    Args:
        image_path: Path to the terrain image
        model: Trained landing safety model
        
    Returns:
        Dictionary with safety assessment results
    """
    try:
        # Load and validate image
        image = load_image_safely(image_path)
        if image is None:
            return {
                'status': 'error',
                'message': 'Failed to load image',
                'safety_score': 0.0,
                'confidence': 0.0
            }
            
        if not validate_image(image):
            return {
                'status': 'error',
                'message': 'Image does not meet minimum requirements',
                'safety_score': 0.0,
                'confidence': 0.0
            }
            
        # Extract features
        features = extract_comprehensive_features(image)
        if not features:
            return {
                'status': 'error',
                'message': 'Failed to extract features',
                'safety_score': 0.0,
                'confidence': 0.0
            }
            
        # Make prediction
        prediction, confidence = model.predict_safety(features)
        
        # Determine safety status
        safety_status = 'safe' if prediction == 1 else 'unsafe'
        
        # Calculate risk factors
        risk_factors = analyze_risk_factors(features)
        
        return {
            'status': 'success',
            'image_path': image_path,
            'safety_status': safety_status,
            'safety_score': float(prediction),
            'confidence': confidence,
            'features': features,
            'risk_factors': risk_factors,
            'recommendation': get_landing_recommendation(prediction, confidence, risk_factors)
        }
        
    except Exception as e:
        return {
            'status': 'error',
            'message': f'Assessment failed: {str(e)}',
            'safety_score': 0.0,
            'confidence': 0.0
        }


def analyze_risk_factors(features: Dict[str, float]) -> Dict[str, str]:
    """
    Analyze specific risk factors based on extracted features.
    
    Args:
        features: Dictionary of extracted features
        
    Returns:
        Dictionary with risk factor analysis
    """
    risk_factors = {}
    
    # Slope risk analysis
    max_slope = features.get('max_slope', 0)
    if max_slope > 200:
        risk_factors['slope'] = 'high'
    elif max_slope > 100:
        risk_factors['slope'] = 'medium'
    else:
        risk_factors['slope'] = 'low'
        
    # Roughness risk analysis
    contrast = features.get('glcm_contrast', 0)
    if contrast > 0.5:
        risk_factors['roughness'] = 'high'
    elif contrast > 0.2:
        risk_factors['roughness'] = 'medium'
    else:
        risk_factors['roughness'] = 'low'
        
    # Edge density risk analysis
    edge_density = features.get('edge_density', 0)
    if edge_density > 0.15:
        risk_factors['complexity'] = 'high'
    elif edge_density > 0.08:
        risk_factors['complexity'] = 'medium'
    else:
        risk_factors['complexity'] = 'low'
        
    # Texture uniformity analysis
    std_intensity = features.get('std_intensity', 0)
    if std_intensity > 60:
        risk_factors['uniformity'] = 'poor'
    elif std_intensity > 30:
        risk_factors['uniformity'] = 'moderate'
    else:
        risk_factors['uniformity'] = 'good'
        
    return risk_factors


def get_landing_recommendation(prediction: int, confidence: float, 
                             risk_factors: Dict[str, str]) -> str:
    """
    Generate landing recommendation based on prediction and risk factors.
    
    Args:
        prediction: Model prediction (0=unsafe, 1=safe)
        confidence: Prediction confidence
        risk_factors: Risk factor analysis
        
    Returns:
        Landing recommendation string
    """
    if prediction == 0:  # Unsafe
        if confidence > 0.8:
            return "❌ ABORT LANDING - High confidence unsafe prediction"
        else:
            return "⚠️ CAUTION - Unsafe prediction with lower confidence"
            
    else:  # Safe
        high_risk_count = sum(1 for risk in risk_factors.values() 
                             if risk in ['high', 'poor'])
                             
        if confidence > 0.8 and high_risk_count == 0:
            return "✅ PROCEED WITH LANDING - Optimal conditions detected"
        elif confidence > 0.6 and high_risk_count <= 1:
            return "🟡 PROCEED WITH CAUTION - Generally safe with minor risks"
        else:
            return "⚠️ EVALUATE FURTHER - Safe prediction but with concerns"


def batch_assess_landing_sites(image_directory: str, model: LandingSafetyModel, 
                              max_images: int = 50) -> pd.DataFrame:
    """
    Assess multiple landing sites from a directory of images.
    
    Args:
        image_directory: Directory containing terrain images
        model: Trained landing safety model
        max_images: Maximum number of images to process
        
    Returns:
        DataFrame with assessment results
    """
    try:
        if not os.path.exists(image_directory):
            print(f"❌ Directory not found: {image_directory}")
            return pd.DataFrame()
            
        # Find image files
        image_extensions = ['*.jpg', '*.jpeg', '*.png', '*.bmp', '*.tiff']
        image_files = []
        
        for extension in image_extensions:
            pattern = os.path.join(image_directory, extension)
            image_files.extend(glob.glob(pattern))
            
        if not image_files:
            print(f"❌ No image files found in {image_directory}")
            return pd.DataFrame()
            
        # glob returns files in arbitrary order, so sort before limiting
        image_files = sorted(image_files)[:max_images]
        
        print(f"🔍 Processing {len(image_files)} images...")
        
        results = []
        
        for image_file in tqdm(image_files, desc="Assessing landing sites"):
            assessment = assess_landing_safety(image_file, model)
            
            if assessment['status'] == 'success':
                result_row = {
                    'image_file': os.path.basename(image_file),
                    'image_path': image_file,
                    'safety_status': assessment['safety_status'],
                    'confidence': assessment['confidence'],
                    'recommendation': assessment['recommendation']
                }
                
                # Add feature values
                for feature_name, feature_value in assessment['features'].items():
                    result_row[f'feature_{feature_name}'] = feature_value
                    
                # Add risk factors
                for risk_name, risk_level in assessment['risk_factors'].items():
                    result_row[f'risk_{risk_name}'] = risk_level
                    
                results.append(result_row)
                
        if results:
            df = pd.DataFrame(results)
            print(f"✅ Successfully assessed {len(results)} landing sites")
            return df
        else:
            print("❌ No successful assessments")
            return pd.DataFrame()
            
    except Exception as e:
        print(f"❌ Error in batch assessment: {str(e)}")
        return pd.DataFrame()


# Confirm the safety assessment functions are defined
print("✅ Safety assessment functions defined successfully!")
print("📋 Available assessment functions:")
print("  • assess_landing_safety() - Single image assessment")
print("  • analyze_risk_factors() - Risk factor analysis")
print("  • get_landing_recommendation() - Generate recommendations")
print("  • batch_assess_landing_sites() - Batch processing")

✅ Safety assessment functions defined successfully!
📋 Available assessment functions:
  • assess_landing_safety() - Single image assessment
  • analyze_risk_factors() - Risk factor analysis
  • get_landing_recommendation() - Generate recommendations
  • batch_assess_landing_sites() - Batch processing


## 6. 📊 Load and Prepare Dataset

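Let's load our Mars terrain dataset and prepare it for training. We label each image with a rule-based score computed from its extracted features (a heuristic proxy, not ground truth), balance the two classes, and split the data with stratification.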
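As a rough illustration of the labeling idea, the sketch below scores a feature dictionary with the same thresholds used in the cell that follows (slope, GLCM contrast, edge density, intensity spread, Laplacian variance, homogeneity) and calls a site safe at 8 of 13 points. The helper name `heuristic_safety_score` and the sample feature values are hypothetical, chosen only to show the two extremes.

```python
from typing import Dict, List, Tuple

# (feature name, [(upper bound, points), ...]); lower values are safer.
# Thresholds mirror the notebook's rule-based labeling.
LOWER_IS_SAFER: List[Tuple[str, List[Tuple[float, int]]]] = [
    ("max_slope", [(30, 4), (60, 3), (100, 2), (150, 1)]),
    ("glcm_contrast", [(0.15, 3), (0.25, 2), (0.4, 1)]),
    ("edge_density", [(0.06, 2), (0.10, 1)]),
    ("std_intensity", [(25, 2), (40, 1)]),
    ("laplacian_variance", [(300, 1)]),
]


def heuristic_safety_score(features: Dict[str, float]) -> int:
    """Return a 0-13 point score; missing features earn no points."""
    score = 0
    for name, tiers in LOWER_IS_SAFER:
        value = features.get(name, float("inf"))
        for upper_bound, points in tiers:
            if value < upper_bound:
                score += points
                break  # only the best matching tier counts
    # Homogeneity is the one feature where higher is safer
    if features.get("glcm_homogeneity", 0) > 0.5:
        score += 1
    return score


# Hypothetical feature values for a smooth patch and a rocky patch
smooth = {"max_slope": 25, "glcm_contrast": 0.1, "edge_density": 0.05,
          "std_intensity": 20, "laplacian_variance": 150, "glcm_homogeneity": 0.7}
rocky = {"max_slope": 180, "glcm_contrast": 0.5, "edge_density": 0.2,
         "std_intensity": 60, "laplacian_variance": 900, "glcm_homogeneity": 0.2}

for name, feats in [("smooth", smooth), ("rocky", rocky)]:
    score = heuristic_safety_score(feats)
    print(name, score, "safe" if score >= 8 else "unsafe")
```

Because the labels come from the same features the classifier later sees, high accuracy mostly shows that the model has recovered this rule.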

In [15]:
# Dataset configuration for balanced training
DATASET_CONFIG = {
    'base_path': 'ai4mars-dataset-merged-0.1',
    'image_subdirs': ['msl/images', 'mer/images'],
    'target_size': (256, 256),
    'samples_per_class': 5000,  # 5000 safe + 5000 unsafe = 10000 total
    'test_size': 0.2,
    'validation_size': 0.2,
    'random_seed': 42
}

def create_balanced_synthetic_labels(image_paths: List[str], target_safe: int = 5000, target_unsafe: int = 5000) -> Tuple[List[int], List[str]]:
    """
    Create perfectly balanced labels with exactly target_safe and target_unsafe samples.
    
    Args:
        image_paths: List of image file paths
        target_safe: Target number of safe samples
        target_unsafe: Target number of unsafe samples
        
    Returns:
        Tuple of (balanced_labels, balanced_image_paths)
    """
    print(f"🏷️ Creating balanced dataset: {target_safe} safe + {target_unsafe} unsafe samples...")
    
    # Shuffle images for randomness
    np.random.seed(DATASET_CONFIG['random_seed'])
    shuffled_paths = image_paths.copy()
    np.random.shuffle(shuffled_paths)
    
    safe_samples = []
    unsafe_samples = []
    
    # Process images and separate into safe/unsafe based on terrain analysis
    for image_path in tqdm(shuffled_paths, desc="Analyzing terrain for balanced labeling"):
        # Stop if we have enough samples for both classes
        if len(safe_samples) >= target_safe and len(unsafe_samples) >= target_unsafe:
            break
            
        try:
            # Load image
            image = load_image_safely(image_path, target_size=DATASET_CONFIG['target_size'])
            if image is None:
                continue
                
            # Extract features for labeling
            features = extract_comprehensive_features(image)
            if not features:
                continue
                
            # Enhanced rule-based labeling with stricter criteria for better separation
            safety_score = 0
            
            # Slope criteria (max 4 points) - more weight to slope
            max_slope = features.get('max_slope', float('inf'))
            if max_slope < 30:
                safety_score += 4
            elif max_slope < 60:
                safety_score += 3
            elif max_slope < 100:
                safety_score += 2
            elif max_slope < 150:
                safety_score += 1
                
            # Roughness criteria (max 3 points)
            contrast = features.get('glcm_contrast', float('inf'))
            if contrast < 0.15:
                safety_score += 3
            elif contrast < 0.25:
                safety_score += 2
            elif contrast < 0.4:
                safety_score += 1
                
            # Edge density criteria (max 2 points)
            edge_density = features.get('edge_density', float('inf'))
            if edge_density < 0.06:
                safety_score += 2
            elif edge_density < 0.10:
                safety_score += 1
                
            # Texture uniformity criteria (max 2 points)
            std_intensity = features.get('std_intensity', float('inf'))
            if std_intensity < 25:
                safety_score += 2
            elif std_intensity < 40:
                safety_score += 1
                
            # Surface smoothness criteria (max 1 point)
            laplacian_var = features.get('laplacian_variance', float('inf'))
            if laplacian_var < 300:
                safety_score += 1
            
            # Homogeneity criteria (max 1 point) 
            homogeneity = features.get('glcm_homogeneity', 0)
            if homogeneity > 0.5:
                safety_score += 1
                
            # Total possible score: 13 points
            # Label as safe if score >= 8 (more strict criteria)
            is_safe = safety_score >= 8
            
            # Add to appropriate category if we need more samples
            if is_safe and len(safe_samples) < target_safe:
                safe_samples.append(image_path)
            elif not is_safe and len(unsafe_samples) < target_unsafe:
                unsafe_samples.append(image_path)
                
        except Exception as e:
            print(f"⚠️ Error processing {image_path}: {str(e)}")
            continue
    
    # Balance the dataset by taking equal amounts from both categories
    actual_safe = len(safe_samples)
    actual_unsafe = len(unsafe_samples)
    
    print(f"📊 Initial classification results:")
    print(f"  Safe samples found: {actual_safe}")
    print(f"  Unsafe samples found: {actual_unsafe}")
    
    min_samples = min(actual_safe, actual_unsafe, target_safe, target_unsafe)
    
    if min_samples < min(target_safe, target_unsafe):
        print(f"⚠️ Warning: Could only find {min_samples} samples per class")
        print(f"   Adjusting target to {min_samples} samples per class for perfect balance")
    
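    # Without at least one sample per class, zip(*combined) below would fail
    if min_samples == 0:
        print("❌ At least one class has no samples - cannot build a balanced dataset")
        return [], []
    
    # Take exactly min_samples from each category
    balanced_paths = safe_samples[:min_samples] + unsafe_samples[:min_samples]
    balanced_labels = [1] * min_samples + [0] * min_samples
    
    # Final shuffle to mix safe and unsafe samples
    combined = list(zip(balanced_paths, balanced_labels))
    np.random.shuffle(combined)
    final_paths, final_labels = zip(*combined)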
    
    print(f"✅ Balanced dataset created:")
    print(f"  Final safe samples: {sum(final_labels)}")
    print(f"  Final unsafe samples: {len(final_labels) - sum(final_labels)}")
    print(f"  Total samples: {len(final_labels)}")
    
    return list(final_labels), list(final_paths)


def load_mars_dataset_balanced(base_path: str = None) -> Tuple[np.ndarray, np.ndarray, List[str]]:
    """
    Load Mars terrain dataset with perfectly balanced classes.
    
    Args:
        base_path: Base directory path for the dataset
        
    Returns:
        Tuple of (features, labels, image_paths)
    """
    if base_path is None:
        base_path = DATASET_CONFIG['base_path']
        
    print(f"🔍 Loading balanced Mars dataset from: {base_path}")
    print(f"🎯 Target: {DATASET_CONFIG['samples_per_class']} samples per class")
    
    # Check if dataset exists
    if not os.path.exists(base_path):
        print(f"❌ Dataset directory not found: {base_path}")
        print("💡 Please ensure the ai4mars dataset is available in the project directory")
        return np.array([]), np.array([]), []
        
    # Find all image files
    image_files = []
    for subdir in DATASET_CONFIG['image_subdirs']:
        full_path = os.path.join(base_path, subdir)
        if os.path.exists(full_path):
            for ext in ['*.jpg', '*.jpeg', '*.png']:
                pattern = os.path.join(full_path, '**', ext)
                found_files = glob.glob(pattern, recursive=True)
                image_files.extend(found_files)
                
    if not image_files:
        print("❌ No image files found in dataset directories")
        return np.array([]), np.array([]), []
        
    print(f"📸 Found {len(image_files)} total images")
    
    # Create balanced labels and get corresponding image paths
    labels, valid_image_paths = create_balanced_synthetic_labels(
        image_files, 
        target_safe=DATASET_CONFIG['samples_per_class'],
        target_unsafe=DATASET_CONFIG['samples_per_class']
    )
    
    if not valid_image_paths:
        print("❌ No valid samples created")
        return np.array([]), np.array([]), []
    
    # Extract features from balanced dataset
    print("🔬 Extracting features from balanced dataset...")
    all_features = []
    final_labels = []
    final_paths = []
    
    for image_path, label in tqdm(zip(valid_image_paths, labels), desc="Processing balanced samples", total=len(valid_image_paths)):
        try:
            # Load image
            image = load_image_safely(image_path, target_size=DATASET_CONFIG['target_size'])
            if image is None:
                continue
                
            # Extract features
            features = extract_comprehensive_features(image)
            if not features:
                continue
                
            # Convert to feature vector
            feature_vector = features_to_vector(features)
            
            # Check for invalid features
            if np.any(np.isnan(feature_vector)) or np.any(np.isinf(feature_vector)):
                continue
                
            all_features.append(feature_vector)
            final_labels.append(label)
            final_paths.append(image_path)
            
        except Exception as e:
            print(f"⚠️ Error processing {image_path}: {str(e)}")
            continue
            
    if not all_features:
        print("❌ No features extracted successfully")
        return np.array([]), np.array([]), []
        
    # Convert to numpy arrays
    X = np.array(all_features)
    y = np.array(final_labels)
    
    # Verify balance
    safe_count = np.sum(y)
    unsafe_count = len(y) - safe_count
    
    print(f"✅ Balanced dataset created successfully!")
    print(f"📊 Final dataset statistics:")
    print(f"  Total samples: {len(X)}")
    print(f"  Feature dimensions: {X.shape[1]}")
    print(f"  Safe samples: {safe_count}")
    print(f"  Unsafe samples: {unsafe_count}")
    print(f"  Balance ratio: {safe_count/len(y):.3f} (perfect = 0.500)")
    print(f"  Feature range: [{X.min():.3f}, {X.max():.3f}]")
    
    return X, y, final_paths


def split_dataset_stratified(X: np.ndarray, y: np.ndarray, image_paths: List[str]) -> Dict:
    """
    Split balanced dataset into training, validation, and test sets with stratification.
    
    Args:
        X: Feature matrix
        y: Labels
        image_paths: Image file paths
        
    Returns:
        Dictionary with split datasets
    """
    try:
        if len(X) == 0:
            print("❌ Empty dataset provided")
            return {}
            
        print(f"📊 Splitting balanced dataset with stratification...")
        
        # Set random seed for reproducibility
        np.random.seed(DATASET_CONFIG['random_seed'])
        
        # First split: train+val vs test (stratified)
        X_train_val, X_test, y_train_val, y_test, paths_train_val, paths_test = train_test_split(
            X, y, image_paths, 
            test_size=DATASET_CONFIG['test_size'], 
            random_state=DATASET_CONFIG['random_seed'], 
            stratify=y
        )
        
        # Second split: train vs val (stratified)
        val_size_adjusted = DATASET_CONFIG['validation_size'] / (1 - DATASET_CONFIG['test_size'])
        X_train, X_val, y_train, y_val, paths_train, paths_val = train_test_split(
            X_train_val, y_train_val, paths_train_val,
            test_size=val_size_adjusted,
            random_state=DATASET_CONFIG['random_seed'],
            stratify=y_train_val
        )
        
        dataset_splits = {
            'X_train': X_train, 'y_train': y_train, 'paths_train': paths_train,
            'X_val': X_val, 'y_val': y_val, 'paths_val': paths_val,
            'X_test': X_test, 'y_test': y_test, 'paths_test': paths_test
        }
        
        print(f"✅ Stratified dataset split completed:")
        print(f"  🏋️ Training: {len(X_train)} samples")
        print(f"  🔍 Validation: {len(X_val)} samples")
        print(f"  🧪 Test: {len(X_test)} samples")
        
        # Verify balance in each split
        print(f"\n📈 Class distribution verification:")
        for split_name, split_labels in [('Training', y_train), ('Validation', y_val), ('Test', y_test)]:
            safe_count = np.sum(split_labels)
            unsafe_count = len(split_labels) - safe_count
            balance_ratio = safe_count / len(split_labels) if len(split_labels) > 0 else 0
            print(f"  {split_name}: {safe_count} safe, {unsafe_count} unsafe (ratio: {balance_ratio:.3f})")
        
        return dataset_splits
        
    except Exception as e:
        print(f"❌ Error splitting dataset: {str(e)}")
        return {}


# Load the balanced dataset
print("🚀 Starting balanced dataset preparation...")
print(f"🎯 Target: {DATASET_CONFIG['samples_per_class']} samples per class")
print("🔄 This may take a few minutes to ensure perfect balance...")

X, y, image_paths = load_mars_dataset_balanced()

if len(X) > 0:
    # Split the dataset with stratification
    dataset = split_dataset_stratified(X, y, image_paths)
    if dataset:
        print("✅ Balanced dataset preparation completed successfully!")
        print(f"🎉 Ready for unbiased training with {len(X)} total samples")
    else:
        print("❌ Dataset splitting failed")
        dataset = {}
else:
    dataset = {}
    print("❌ Balanced dataset preparation failed - no data loaded")

🚀 Starting balanced dataset preparation...
🎯 Target: 5000 samples per class
🔄 This may take a few minutes to ensure perfect balance...
🔍 Loading balanced Mars dataset from: ai4mars-dataset-merged-0.1
🎯 Target: 5000 samples per class
📸 Found 36193 total images
🏷️ Creating balanced dataset: 5000 safe + 5000 unsafe samples...


Analyzing terrain for balanced labeling:   0%|          | 0/36193 [00:00<?, ?it/s]

KeyboardInterrupt: 
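
To see why `split_dataset_stratified` rescales the validation fraction, note that the second `train_test_split` call only sees the 80% left after the test split, so a 20% validation share of the whole dataset corresponds to 0.2 / 0.8 = 0.25 of that remainder. The short sketch below works through the arithmetic for the configured target of 10,000 samples. The helper name `split_sizes` is hypothetical, and the counts are an approximation that does not reproduce scikit-learn's exact rounding.

```python
from typing import Tuple


def split_sizes(n_total: int, test_size: float, validation_size: float) -> Tuple[int, int, int]:
    """Return (train, val, test) counts for a two-stage split."""
    n_test = round(n_total * test_size)
    n_train_val = n_total - n_test
    # Fraction of the remaining train+val pool that must go to validation
    val_fraction_of_remainder = validation_size / (1 - test_size)
    n_val = round(n_train_val * val_fraction_of_remainder)
    n_train = n_train_val - n_val
    return n_train, n_val, n_test


print(split_sizes(10_000, test_size=0.2, validation_size=0.2))  # (6000, 2000, 2000)
```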

## 7. 🏋️ Train the Landing Safety Model

Now let's train our machine learning model using the prepared dataset.
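
The training call below asks for 5-fold cross-validation and reports each metric as mean ± standard deviation. As a sketch of what that summary means, the snippet splits sample indices into five folds and aggregates per-fold scores. `k_fold_indices` and the example scores are hypothetical, and using the population standard deviation is an assumption about how `LandingSafetyModel` aggregates folds, which this section does not show.

```python
from statistics import mean, pstdev
from typing import List, Tuple


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Return (train_indices, validation_indices) for k contiguous folds."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, val))
        start += size
    return folds


folds = k_fold_indices(n_samples=12, k=5)
for train, val in folds:
    print(f"train={len(train)} val={len(val)}")

# Hypothetical per-fold accuracies, summarized the way the notebook prints them
fold_accuracy = [0.90, 0.92, 0.88, 0.91, 0.89]
print(f"CV accuracy: {mean(fold_accuracy):.3f} ± {pstdev(fold_accuracy):.3f}")
```

Each sample lands in exactly one validation fold, so every image contributes to one held-out score.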

In [11]:
# Initialize and train the model
if dataset and len(dataset.get('X_train', [])) > 0:
    print("🚀 Initializing Landing Safety Model...")
    
    # Create model instance
    safety_model = LandingSafetyModel(random_state=42)
    
    # Train the model
    print("🏋️ Training model on prepared dataset...")
    training_metrics = safety_model.train_model(
        dataset['X_train'], 
        dataset['y_train'],
        model_type='random_forest',
        cv_folds=5
    )
    
    if training_metrics:
        print("\n📊 Training Results Summary:")
        print(f"Model Type: {training_metrics['model_type']}")
        print(f"Training Samples: {training_metrics['n_samples']}")
        print(f"Feature Dimensions: {training_metrics['n_features']}")
        print(f"Cross-Validation Accuracy: {training_metrics['cv_accuracy_mean']:.3f} ± {training_metrics['cv_accuracy_std']:.3f}")
        print(f"Cross-Validation Precision: {training_metrics['cv_precision_mean']:.3f} ± {training_metrics['cv_precision_std']:.3f}")
        print(f"Cross-Validation Recall: {training_metrics['cv_recall_mean']:.3f} ± {training_metrics['cv_recall_std']:.3f}")
        print(f"Cross-Validation F1-Score: {training_metrics['cv_f1_mean']:.3f} ± {training_metrics['cv_f1_std']:.3f}")
        
        # Get feature importance
        feature_importance = safety_model.get_feature_importance()
        if feature_importance:
            print("\n🔍 Top 5 Most Important Features:")
            for i, (feature, importance) in enumerate(list(feature_importance.items())[:5]):
                print(f"  {i+1}. {feature}: {importance:.3f}")
                
        print("\n✅ Model training completed successfully!")
        model_trained = True
        
else:
    print("❌ Cannot train model - no training data available")
    print("💡 Please ensure the dataset is properly loaded first")
    safety_model = None
    training_metrics = {}
    model_trained = False


## 8. 📈 Evaluate Model Performance

Let's evaluate our trained model on the test set and create comprehensive performance metrics.
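
The notebook treats the false positive rate (unsafe sites marked safe) as a safety-critical number, so it helps to see how it falls out of the confusion matrix. The sketch below derives it and the other headline metrics from the four matrix cells, with safe as the positive class (label 1) as in this notebook. The function name `safety_rates` and the example counts are hypothetical.

```python
from typing import Dict


def safety_rates(tn: int, fp: int, fn: int, tp: int) -> Dict[str, float]:
    """Compute metrics from confusion-matrix cells; empty denominators give 0.0."""
    def ratio(num: float, den: float) -> float:
        return num / den if den > 0 else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "false_positive_rate": ratio(fp, fp + tn),  # unsafe marked safe
        "false_negative_rate": ratio(fn, fn + tp),  # safe marked unsafe
    }


# Hypothetical counts for a 200-image test set
for name, value in safety_rates(tn=90, fp=10, fn=20, tp=80).items():
    print(f"{name}: {value:.3f}")
```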

In [12]:
def evaluate_model_performance(model: LandingSafetyModel, X_test: np.ndarray, y_test: np.ndarray) -> Dict:
    """
    Comprehensive model evaluation on test set.
    
    Args:
        model: Trained model
        X_test: Test features
        y_test: Test labels
        
    Returns:
        Dictionary with evaluation metrics
    """
    try:
        if not model.is_trained:
            print("❌ Model not trained yet")
            return {}
            
        # Scale test features
        X_test_scaled = model.scaler.transform(X_test)
        
        # Make predictions
        y_pred = model.model.predict(X_test_scaled)
        y_pred_proba = model.model.predict_proba(X_test_scaled)[:, 1]
        
        # Calculate metrics
        accuracy = accuracy_score(y_test, y_pred)
        precision = precision_score(y_test, y_pred, zero_division=0)
        recall = recall_score(y_test, y_pred, zero_division=0)
        f1 = f1_score(y_test, y_pred, zero_division=0)
        
        # Confusion matrix (labels fixed so the matrix is always 2x2)
        cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
        
        # Classification report
        class_report = classification_report(y_test, y_pred, output_dict=True)
        
        # ROC AUC if possible (raises ValueError when only one class is present)
        try:
            roc_auc = roc_auc_score(y_test, y_pred_proba)
        except ValueError:
            roc_auc = None
            
        evaluation_results = {
            'accuracy': accuracy,
            'precision': precision,
            'recall': recall,
            'f1_score': f1,
            'confusion_matrix': cm,
            'classification_report': class_report,
            'roc_auc': roc_auc,
            'predictions': y_pred,
            'probabilities': y_pred_proba,
            'true_labels': y_test
        }
        
        return evaluation_results
        
    except Exception as e:
        print(f"❌ Error evaluating model: {str(e)}")
        return {}


# Evaluate the model if training was successful
if model_trained and dataset and len(dataset.get('X_test', [])) > 0:
    print("📊 Evaluating model on test set...")
    
    evaluation_results = evaluate_model_performance(
        safety_model, 
        dataset['X_test'], 
        dataset['y_test']
    )
    
    if evaluation_results:
        print("\n📈 Test Set Performance:")
        print(f"Accuracy: {evaluation_results['accuracy']:.3f}")
        print(f"Precision: {evaluation_results['precision']:.3f}")
        print(f"Recall: {evaluation_results['recall']:.3f}")
        print(f"F1-Score: {evaluation_results['f1_score']:.3f}")
        
        if evaluation_results['roc_auc'] is not None:
            print(f"ROC AUC: {evaluation_results['roc_auc']:.3f}")
            
        print("\n📊 Confusion Matrix:")
        cm = evaluation_results['confusion_matrix']
        print(f"True Negatives (Unsafe correctly predicted): {cm[0,0]}")
        print(f"False Positives (Unsafe predicted as Safe): {cm[0,1]}")
        print(f"False Negatives (Safe predicted as Unsafe): {cm[1,0]}")
        print(f"True Positives (Safe correctly predicted): {cm[1,1]}")
        
        # Calculate safety-specific metrics
        total_predictions = len(evaluation_results['true_labels'])
        safe_sites_identified = np.sum(evaluation_results['predictions'])
        actual_safe_sites = np.sum(evaluation_results['true_labels'])
        
        # Guard against empty classes before dividing
        negatives = cm[0,0] + cm[0,1]
        positives = cm[1,0] + cm[1,1]
        fp_rate = cm[0,1] / negatives if negatives > 0 else 0.0
        fn_rate = cm[1,0] / positives if positives > 0 else 0.0
        
        print(f"\n🛡️ Safety Assessment Summary:")
        print(f"Total test sites: {total_predictions}")
        print(f"Sites predicted as safe: {safe_sites_identified}")
        print(f"Actually safe sites: {actual_safe_sites}")
        print(f"False positive rate (unsafe sites marked safe): {fp_rate:.3f}")
        print(f"False negative rate (safe sites marked unsafe): {fn_rate:.3f}")
        
        print("\n✅ Model evaluation completed!")
        
else:
    evaluation_results = {}
    print("❌ Cannot evaluate model - no test data available or model not trained")

## 9. 🧪 Test Landing Safety Assessment

Now let's test our trained model on sample images and demonstrate the complete assessment pipeline.
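
One detail worth sketching: when an assessment fails, the list of successful results is shorter than the list of test images, so predictions and ground-truth labels should be paired by image index rather than by position in the results list. The example below uses a stand-in `fake_assess` function and made-up paths and labels, all hypothetical, purely to show the bookkeeping.

```python
from typing import Dict, List, Tuple


def fake_assess(path: str) -> Dict[str, str]:
    """Stand-in for an assessment call; fails for paths containing 'bad'."""
    if "bad" in path:
        return {"status": "error", "message": "could not load image"}
    return {"status": "success",
            "safety_status": "safe" if "flat" in path else "unsafe"}


paths = ["flat_01.jpg", "bad_02.jpg", "rocky_03.jpg", "flat_04.jpg"]
true_labels = [1, 0, 0, 0]  # hypothetical ground truth, 1 = safe

# Keep (index, prediction) pairs so each prediction stays tied to its own label
paired: List[Tuple[int, int]] = []
for idx, path in enumerate(paths):
    result = fake_assess(path)
    if result["status"] == "success":
        paired.append((idx, 1 if result["safety_status"] == "safe" else 0))

correct = sum(1 for idx, pred in paired if true_labels[idx] == pred)
accuracy = correct / len(paired) if paired else 0.0
print(f"{correct}/{len(paired)} correct, accuracy {accuracy:.3f}")
```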

In [None]:
# Test the complete assessment pipeline
if model_trained and dataset and len(dataset.get('paths_test', [])) > 0:
    print("🧪 Testing landing safety assessment pipeline...")
    
    # Select a few test images for demonstration
    test_images = dataset['paths_test'][:5]  # Test first 5 images
    
    print(f"\n🔍 Assessing {len(test_images)} test landing sites...")
    
    test_results = []
    
    for i, image_path in enumerate(test_images):
        print(f"\n--- Assessment {i+1}/{len(test_images)} ---")
        print(f"📸 Image: {os.path.basename(image_path)}")
        
        # Perform safety assessment
        assessment = assess_landing_safety(image_path, safety_model)
        
        if assessment['status'] == 'success':
            print(f"🛡️ Safety Status: {assessment['safety_status'].upper()}")
            print(f"📊 Confidence: {assessment['confidence']:.3f}")
            print(f"💭 Recommendation: {assessment['recommendation']}")
            
            # Show key features
            features = assessment['features']
            print(f"\n📋 Key Features:")
            print(f"  • Max Slope: {features.get('max_slope', 0):.1f}")
            print(f"  • Surface Roughness: {features.get('glcm_contrast', 0):.3f}")
            print(f"  • Edge Density: {features.get('edge_density', 0):.3f}")
            print(f"  • Texture Std: {features.get('std_intensity', 0):.1f}")
            
            # Show risk factors
            risk_factors = assessment['risk_factors']
            print(f"\n⚠️ Risk Factors:")
            for risk_type, risk_level in risk_factors.items():
                emoji = "🔴" if risk_level == 'high' else "🟡" if risk_level == 'medium' else "🟢"
                print(f"  {emoji} {risk_type.title()}: {risk_level}")
                
            test_results.append(assessment)
            
        else:
            print(f"❌ Assessment failed: {assessment.get('message', 'Unknown error')}")
    
    # Summary of test results
    if test_results:
        safe_count = sum(1 for r in test_results if r['safety_status'] == 'safe')
        unsafe_count = len(test_results) - safe_count
        avg_confidence = np.mean([r['confidence'] for r in test_results])
        
        print(f"\n📊 Test Assessment Summary:")
        print(f"Total sites assessed: {len(test_results)}")
        print(f"Safe landing sites: {safe_count}")
        print(f"Unsafe landing sites: {unsafe_count}")
        print(f"Average confidence: {avg_confidence:.3f}")
        
        # Compare with actual labels. Positions only line up with y_test
        # when every assessment succeeded (failed ones are skipped above).
        if len(test_results) == len(test_images):
            actual_labels = dataset['y_test'][:len(test_results)]
            predicted_labels = [1 if r['safety_status'] == 'safe' else 0 for r in test_results]
            correct_predictions = sum(1 for actual, pred in zip(actual_labels, predicted_labels) if actual == pred)
            accuracy = correct_predictions / len(actual_labels)
            print(f"Test accuracy on these samples: {accuracy:.3f}")
        else:
            print("⚠️ Skipping accuracy check - some assessments failed, so results no longer align with y_test")
            
        print("\n✅ Testing pipeline completed successfully!")
        
else:
    print("❌ Cannot test pipeline - model not trained or no test data available")
    test_results = []

## 10. 📊 Visualize Results and Analysis

Finally, let's create comprehensive visualizations of our results and model performance.

In [None]:
# Create comprehensive visualizations
if model_trained and evaluation_results:
    print("📊 Creating comprehensive visualizations...")
    
    # Set up the plotting style
    plt.style.use('seaborn-v0_8')
    sns.set_palette("husl")
    
    # Create a figure with multiple subplots
    fig = plt.figure(figsize=(20, 16))
    
    # 1. Confusion Matrix Heatmap
    plt.subplot(2, 3, 1)
    cm = evaluation_results['confusion_matrix']
    sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', 
                xticklabels=['Unsafe', 'Safe'], 
                yticklabels=['Unsafe', 'Safe'])
    plt.title('Confusion Matrix', fontsize=14, fontweight='bold')
    plt.ylabel('True Label')
    plt.xlabel('Predicted Label')
    
    # 2. Feature Importance Bar Plot
    plt.subplot(2, 3, 2)
    feature_importance = safety_model.get_feature_importance()
    if feature_importance:
        features = list(feature_importance.keys())[:10]  # Top 10 features
        importance_values = list(feature_importance.values())[:10]
        
        bars = plt.barh(range(len(features)), importance_values)
        plt.yticks(range(len(features)), features)
        plt.xlabel('Feature Importance')
        plt.title('Top 10 Feature Importance', fontsize=14, fontweight='bold')
        plt.gca().invert_yaxis()
        
        # Add value labels on bars
        for i, bar in enumerate(bars):
            width = bar.get_width()
            plt.text(width, bar.get_y() + bar.get_height()/2, 
                    f'{width:.3f}', ha='left', va='center')
    
    # 3. Model Performance Metrics
    plt.subplot(2, 3, 3)
    metrics = ['Accuracy', 'Precision', 'Recall', 'F1-Score']
    values = [
        evaluation_results['accuracy'],
        evaluation_results['precision'],
        evaluation_results['recall'],
        evaluation_results['f1_score']
    ]
    
    bars = plt.bar(metrics, values, color=['skyblue', 'lightgreen', 'orange', 'pink'])
    plt.ylim(0, 1)
    plt.title('Model Performance Metrics', fontsize=14, fontweight='bold')
    plt.ylabel('Score')
    
    # Add value labels on bars
    for bar, value in zip(bars, values):
        plt.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.01,
                f'{value:.3f}', ha='center', va='bottom')
    
    # 4. ROC Curve (if available)
    if evaluation_results.get('roc_auc') is not None:
        plt.subplot(2, 3, 4)
        try:
            fpr, tpr, _ = roc_curve(evaluation_results['true_labels'], 
                                  evaluation_results['probabilities'])
            plt.plot(fpr, tpr, 'b-', linewidth=2, 
                    label=f'ROC Curve (AUC = {evaluation_results["roc_auc"]:.3f})')
            plt.plot([0, 1], [0, 1], 'r--', linewidth=1, label='Random Classifier')
            plt.xlabel('False Positive Rate')
            plt.ylabel('True Positive Rate')
            plt.title('ROC Curve', fontsize=14, fontweight='bold')
            plt.legend()
            plt.grid(True, alpha=0.3)
        except Exception:
            plt.text(0.5, 0.5, 'ROC Curve\nNot Available', 
                    ha='center', va='center', transform=plt.gca().transAxes)
            plt.title('ROC Curve', fontsize=14, fontweight='bold')
    
    # 5. Predicted Probability Distribution (probability of the safe class)
    plt.subplot(2, 3, 5)
    probabilities = evaluation_results['probabilities']
    plt.hist(probabilities, bins=20, alpha=0.7, color='lightblue', edgecolor='black')
    plt.xlabel('Predicted Probability of Safe Class')
    plt.ylabel('Frequency')
    plt.title('Predicted Probability Distribution', fontsize=14, fontweight='bold')
    plt.grid(True, alpha=0.3)
    
    # 6. Class Distribution
    plt.subplot(2, 3, 6)
    true_labels = evaluation_results['true_labels']
    predicted_labels = evaluation_results['predictions']
    
    x = np.arange(2)
    width = 0.35
    
    true_counts = [np.sum(true_labels == 0), np.sum(true_labels == 1)]
    pred_counts = [np.sum(predicted_labels == 0), np.sum(predicted_labels == 1)]
    
    plt.bar(x - width/2, true_counts, width, label='True Labels', alpha=0.8)
    plt.bar(x + width/2, pred_counts, width, label='Predictions', alpha=0.8)
    
    plt.xlabel('Class')
    plt.ylabel('Count')
    plt.title('Class Distribution Comparison', fontsize=14, fontweight='bold')
    plt.xticks(x, ['Unsafe', 'Safe'])
    plt.legend()
    
    plt.tight_layout()
    plt.show()
    
    print("✅ Model performance visualizations created!")
    
    # Additional Analysis Summary
    print(f"\n📈 Final Model Analysis Summary:")
    print(f"═" * 50)
    print(f"🤖 Model Type: Random Forest Classifier")
    print(f"📊 Training Samples: {training_metrics.get('n_samples', 'N/A')}")
    print(f"📏 Feature Dimensions: {training_metrics.get('n_features', 'N/A')}")
    print(f"\n🎯 Cross-Validation Performance:")
    print(f"   Accuracy: {training_metrics.get('cv_accuracy_mean', 0):.3f} ± {training_metrics.get('cv_accuracy_std', 0):.3f}")
    print(f"   Precision: {training_metrics.get('cv_precision_mean', 0):.3f} ± {training_metrics.get('cv_precision_std', 0):.3f}")
    print(f"   Recall: {training_metrics.get('cv_recall_mean', 0):.3f} ± {training_metrics.get('cv_recall_std', 0):.3f}")
    print(f"   F1-Score: {training_metrics.get('cv_f1_mean', 0):.3f} ± {training_metrics.get('cv_f1_std', 0):.3f}")
    print(f"\n🧪 Test Set Performance:")
    print(f"   Accuracy: {evaluation_results['accuracy']:.3f}")
    print(f"   Precision: {evaluation_results['precision']:.3f}")
    print(f"   Recall: {evaluation_results['recall']:.3f}")
    print(f"   F1-Score: {evaluation_results['f1_score']:.3f}")
    
    if evaluation_results.get('roc_auc') is not None:
        print(f"   ROC AUC: {evaluation_results['roc_auc']:.3f}")
    
    # Safety-critical metrics
    cm = evaluation_results['confusion_matrix']
    false_positive_rate = cm[0,1] / (cm[0,0] + cm[0,1]) if (cm[0,0] + cm[0,1]) > 0 else 0
    false_negative_rate = cm[1,0] / (cm[1,0] + cm[1,1]) if (cm[1,0] + cm[1,1]) > 0 else 0
    
    print(f"\n⚠️ Safety-Critical Metrics:")
    print(f"   False Positive Rate (unsafe → safe): {false_positive_rate:.3f}")
    print(f"   False Negative Rate (safe → unsafe): {false_negative_rate:.3f}")
    
    # Recommendations
    print(f"\n💡 Recommendations:")
    if false_positive_rate > 0.1:
        print(f"   ⚠️ High false positive rate - consider more conservative thresholds")
    if evaluation_results['accuracy'] > 0.8:
        print(f"   ✅ Good overall accuracy - model is performing well")
    if evaluation_results['recall'] < 0.7:
        print(f"   ⚠️ Low recall - model may miss some safe landing sites")
    if evaluation_results['precision'] < 0.7:
        print(f"   ⚠️ Low precision - model may incorrectly mark unsafe sites as safe")
        
else:
    print("❌ Cannot create visualizations - model not trained or evaluation not completed")

print("\n🎉 Mars Landing Safety Assessment Notebook Complete!")
print("🚀 You now have a complete pipeline for assessing Mars landing site safety!")