<!--
Copyright (c) 2025 Milin Patel
Hochschule Kempten - University of Applied Sciences
-->

*Copyright (c) 2025 Milin Patel. All Rights Reserved.*

# Simulation-Based SOTIF Validation

**Module 04: Safety of the Intended Functionality (ISO 21448)**

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/milinpatel07/Autonomous-Driving_AI-Safety-and-Security/blob/main/04_SOTIF/notebooks/04_simulation_sotif_validation.ipynb)

**Author:** Milin Patel  
**Institution:** Hochschule Kempten - University of Applied Sciences

---

## Learning Objectives

By the end of this notebook, you will:
- Understand the role of simulation in SOTIF validation
- Learn about the CARLA simulator for autonomous driving testing
- Create SOTIF-related test scenarios programmatically
- Evaluate perception system performance under various conditions
- Apply metrics for SOTIF acceptance criteria

---

## Background

This notebook is based on the following research:

> Patel, M., Jung, R. (2024). "Simulation-Based Performance Evaluation of 3D Object Detection Methods with Deep Learning for a LiDAR Point Cloud Dataset in a SOTIF-related Use Case." VEHITS 2024.

Simulation enables testing perception systems across thousands of scenarios that would be impractical or dangerous to test in the real world.

In [None]:
# Setup
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import seaborn as sns
from typing import Dict, List, Tuple
import warnings
warnings.filterwarnings('ignore')

np.random.seed(42)
plt.style.use('seaborn-v0_8-whitegrid')
print("Setup complete.")

## 1. Simulation in SOTIF Validation

### Why Simulation?

| Advantage | Description |
|-----------|-------------|
| **Safety** | Test dangerous scenarios without risk |
| **Scale** | Run thousands of scenarios automatically |
| **Control** | Precise control over environmental conditions |
| **Reproducibility** | Exact scenario repetition for debugging |
| **Cost** | Lower cost than physical testing |

### Simulation Platforms

| Platform | Developer | Key Features |
|----------|-----------|---------------|
| **CARLA** | Intel/CVC | Open-source, sensor simulation, Python API |
| **LGSVL** | LG Electronics | Unity-based, ROS integration |
| **AirSim** | Microsoft | Unreal Engine, drone + car simulation |
| **SUMO** | DLR | Traffic flow simulation |
| **PreScan** | Siemens | Commercial, physics-based sensors |

In [None]:
def visualize_simulation_role():
    """Visualize where simulation fits in SOTIF validation."""
    fig, ax = plt.subplots(figsize=(14, 6))
    
    # Validation pyramid
    levels = [
        ('Simulation\n(Millions of scenarios)', 0.9, '#3498db'),
        ('Test Track\n(Thousands)', 0.6, '#2ecc71'),
        ('Public Road\n(Hundreds)', 0.35, '#f39c12'),
        ('Field Data\n(Real incidents)', 0.15, '#e74c3c')
    ]
    
    y_base = 0.1
    for label, width, color in levels:
        rect = plt.Rectangle((0.5-width/2, y_base), width, 0.15,
                            facecolor=color, alpha=0.8, edgecolor='black')
        ax.add_patch(rect)
        ax.text(0.5, y_base+0.075, label, ha='center', va='center',
               fontsize=10, fontweight='bold', color='white')
        y_base += 0.17
    
    # Arrows point in the direction of increase:
    # coverage grows toward the simulation base, fidelity toward field data.
    ax.annotate('', xy=(0.1, 0.15), xytext=(0.1, 0.7),
               arrowprops=dict(arrowstyle='->', lw=2))
    ax.text(0.05, 0.4, 'Coverage', rotation=90, va='center', fontweight='bold')
    
    ax.annotate('', xy=(0.95, 0.7), xytext=(0.95, 0.15),
               arrowprops=dict(arrowstyle='->', lw=2))
    ax.text(0.97, 0.4, 'Fidelity', rotation=90, va='center', fontweight='bold')
    
    ax.set_xlim(0, 1.1)
    ax.set_ylim(0, 0.9)
    ax.axis('off')
    ax.set_title('SOTIF Validation Pyramid - Simulation Role', fontsize=14, fontweight='bold')
    
    plt.tight_layout()
    plt.show()

visualize_simulation_role()

## 2. SOTIF Test Scenario Design

Based on Patel and Jung (2024), a SOTIF-related use case includes:

- **Ego vehicle** equipped with LiDAR sensor
- **Multiple weather conditions** (clear, cloudy, rain)
- **Different times of day** (noon, sunset, night)
- **Target objects** for detection evaluation
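
The weather and time-of-day labels in a scenario have to be translated into simulator settings before a run. The sketch below shows one way to do that for CARLA: it builds keyword arguments for `carla.WeatherParameters` and applies them with `world.set_weather`. The helper names (`build_weather_kwargs`, `apply_weather`) and all numeric values are assumptions made for illustration, not settings taken from the referenced study.

```python
# Hypothetical mapping from scenario labels to keyword arguments for
# carla.WeatherParameters. All numeric values are illustrative assumptions.
WEATHER_PRESETS = {
    'Clear':      {'cloudiness': 10.0, 'precipitation': 0.0, 'fog_density': 0.0},
    'Cloudy':     {'cloudiness': 80.0, 'precipitation': 0.0, 'fog_density': 0.0},
    'Light Rain': {'cloudiness': 80.0, 'precipitation': 30.0, 'fog_density': 5.0},
}
# Assumed sun altitude in degrees for each time-of-day label.
SUN_ALTITUDE_DEG = {'Morning': 20.0, 'Noon': 75.0, 'Sunset': 2.0, 'Night': -60.0}


def build_weather_kwargs(weather, time):
    """Return keyword arguments for carla.WeatherParameters (assumed values)."""
    kwargs = dict(WEATHER_PRESETS[weather])  # copy so the preset is not mutated
    kwargs['sun_altitude_angle'] = SUN_ALTITUDE_DEG[time]
    return kwargs


def apply_weather(world, weather, time):
    """Apply the scenario weather to an already connected carla.World."""
    import carla  # imported lazily so the mapping works without CARLA installed
    world.set_weather(carla.WeatherParameters(**build_weather_kwargs(weather, time)))


print(build_weather_kwargs('Light Rain', 'Night'))
```

Keeping the mapping separate from the simulator call means the scenario matrix can be reviewed and unit-tested without a running CARLA server.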

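### Weather Condition Matrix (21 conditions)

The demonstration cell below builds only a subset of the matrix: the first 3 weather conditions crossed with 4 times of day, giving 12 scenarios rather than 21.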

In [None]:
# Define SOTIF test scenario matrix
weather_conditions = ['Clear', 'Cloudy', 'Light Rain', 'Heavy Rain', 
                      'Light Fog', 'Dense Fog', 'Wet Road']
times_of_day = ['Morning', 'Noon', 'Sunset', 'Night']

# Generate scenario matrix
scenarios = []
scenario_id = 1
for weather in weather_conditions[:3]:  # Subset for demonstration
    for time in times_of_day:
        scenarios.append({
            'ID': f'SC-{scenario_id:03d}',
            'Weather': weather,
            'Time': time,
            'Visibility': np.random.uniform(0.5, 1.0) if weather == 'Clear' else np.random.uniform(0.2, 0.7),
            'Precipitation': 0.0 if 'Rain' not in weather else np.random.uniform(0.3, 0.8)
        })
        scenario_id += 1

scenarios_df = pd.DataFrame(scenarios)
print(f"Generated {len(scenarios_df)} test scenarios")
display(scenarios_df)

## 3. Simulated LiDAR Point Cloud Generation

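In CARLA, LiDAR sensors generate point clouds representing the 3D environment. For SOTIF validation, we analyze how detection performance varies with conditions. The cell below does not connect to CARLA; it draws synthetic points with NumPy and applies weather-dependent noise and dropout as a stand-in for simulator output.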
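
When a real simulator is attached, each LiDAR measurement arrives as a raw byte buffer that must be decoded before analysis. The following sketch assumes the float32 `[x, y, z, intensity]` layout used by CARLA's `sensor.lidar.ray_cast` sensor since version 0.9.10. The function names and the 50 m range limit are hypothetical, and a synthetic buffer stands in for `measurement.raw_data` so the code runs without CARLA.

```python
import numpy as np


def lidar_bytes_to_points(raw_data):
    """Decode a raw LiDAR buffer into an (N, 4) array: x, y, z, intensity.

    Assumes four float32 values per point (CARLA >= 0.9.10 layout).
    """
    points = np.frombuffer(raw_data, dtype=np.float32)
    return points.reshape(-1, 4)


def summarize_points(points, max_range_m=50.0):
    """Count returns and average their intensity within max_range_m."""
    distance = np.linalg.norm(points[:, :3], axis=1)
    in_range = points[distance <= max_range_m]
    mean_intensity = float(in_range[:, 3].mean()) if len(in_range) else 0.0
    return {'n_points': int(len(in_range)), 'mean_intensity': mean_intensity}


# Synthetic stand-in for measurement.raw_data: one near and one far return.
fake_scan = np.array([[10.0, 0.0, 0.5, 0.8],
                      [60.0, 1.0, 0.2, 0.4]], dtype=np.float32).tobytes()
points = lidar_bytes_to_points(fake_scan)
print(summarize_points(points))
```

Point count and mean intensity per frame are simple proxies for the dropout and attenuation effects that the synthetic generator in this section imitates.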

In [None]:
def generate_simulated_point_cloud(n_points=1000, weather='Clear', time='Noon'):
    """Generate a simulated point cloud with weather effects."""
    # Base point cloud: uniform random points in a box ahead of the sensor
    # (a placeholder volume, not a geometric vehicle model)
    x = np.random.uniform(5, 20, n_points)  # Distance
    y = np.random.uniform(-3, 3, n_points)   # Lateral
    z = np.random.uniform(-1, 2, n_points)   # Height
    
    # Add weather-dependent noise
    noise_factor = {
        'Clear': 0.05,
        'Cloudy': 0.08,
        'Light Rain': 0.15,
        'Heavy Rain': 0.3,
        'Light Fog': 0.2,
        'Dense Fog': 0.4
    }.get(weather, 0.1)
    
    # Time-dependent intensity
    intensity_factor = {
        'Morning': 0.9,
        'Noon': 1.0,
        'Sunset': 0.7,
        'Night': 0.5
    }.get(time, 0.8)
    
    # Add noise
    x += np.random.normal(0, noise_factor, n_points)
    y += np.random.normal(0, noise_factor, n_points)
    z += np.random.normal(0, noise_factor * 0.5, n_points)
    
    # Simulate point dropout (more in bad weather)
    dropout_rate = min(0.5, noise_factor * 1.5)
    mask = np.random.random(n_points) > dropout_rate
    
    # Intensity
    intensity = np.random.uniform(0.3, 1.0, n_points) * intensity_factor
    
    return {
        'x': x[mask],
        'y': y[mask],
        'z': z[mask],
        'intensity': intensity[mask],
        'n_points': int(mask.sum()),
        'dropout_rate': dropout_rate
    }

# Generate point clouds for different conditions
fig = plt.figure(figsize=(15, 5))

conditions = [('Clear', 'Noon'), ('Light Rain', 'Noon'), ('Dense Fog', 'Night')]

for i, (weather, time) in enumerate(conditions):
    ax = fig.add_subplot(1, 3, i+1, projection='3d')
    
    pc = generate_simulated_point_cloud(n_points=500, weather=weather, time=time)
    
    scatter = ax.scatter(pc['x'], pc['y'], pc['z'], c=pc['intensity'], 
                        cmap='viridis', s=5, alpha=0.6)
    
    ax.set_xlabel('X (forward)')
    ax.set_ylabel('Y (lateral)')
    ax.set_zlabel('Z (height)')
    ax.set_title(f'{weather}, {time}\n({pc["n_points"]} points)', fontweight='bold')
    ax.view_init(elev=20, azim=45)

plt.suptitle('Simulated LiDAR Point Clouds Under Different Conditions', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()

## 4. Object Detection Performance Evaluation

For SOTIF validation, we evaluate detection performance using:

- **Average Precision (AP)** - Area under precision-recall curve
- **Recall** - Percentage of actual objects detected
- **Detection latency** - Processing time

Based on Patel and Jung (2024), detection methods include:
- PointPillars
- SECOND
- PV-RCNN
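
To make the AP definition above concrete, the sketch below computes the area under the precision-recall curve from a list of scored detections. It uses all-point interpolation and assumes that matching detections to ground truth (for example by an IoU threshold) has already produced a true-positive flag per detection. Benchmarks such as KITTI sample a fixed number of recall positions instead, so this is an illustrative variant rather than the metric used in the referenced study. The function name and inputs are hypothetical.

```python
import numpy as np


def average_precision(scores, is_true_positive, n_ground_truth):
    """All-point interpolated AP from detection scores and TP flags."""
    order = np.argsort(scores)[::-1]              # highest confidence first
    tp = np.asarray(is_true_positive, dtype=float)[order]
    cum_tp = np.cumsum(tp)
    cum_fp = np.cumsum(1.0 - tp)
    recall = cum_tp / n_ground_truth
    precision = cum_tp / (cum_tp + cum_fp)

    # Interpolate: precision at each rank becomes the best precision
    # achieved at that rank or any later (higher-recall) rank.
    precision = np.maximum.accumulate(precision[::-1])[::-1]

    # Sum precision over the recall increments, starting from recall 0.
    recall = np.concatenate(([0.0], recall))
    return float(np.sum(np.diff(recall) * precision))


# Four detections, one false positive, four ground-truth objects (made-up data).
ap = average_precision(scores=[0.9, 0.8, 0.7, 0.6],
                       is_true_positive=[1, 0, 1, 1],
                       n_ground_truth=4)
print(f"AP = {ap:.3f}")
```

Because one ground-truth object is never detected, recall tops out at 0.75 and the AP is bounded accordingly, which illustrates why AP and recall are reported together.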

In [None]:
def simulate_detection_performance(weather, time, detector='PointPillars'):
    """Simulate detection performance based on conditions."""
    # Base performance by detector
    base_ap = {
        'PointPillars': 0.75,
        'SECOND': 0.78,
        'PV-RCNN': 0.82
    }.get(detector, 0.75)
    
    # Weather degradation
    weather_factor = {
        'Clear': 1.0,
        'Cloudy': 0.98,
        'Light Rain': 0.85,
        'Heavy Rain': 0.65,
        'Light Fog': 0.80,
        'Dense Fog': 0.55
    }.get(weather, 0.9)
    
    # Time degradation
    time_factor = {
        'Morning': 0.95,
        'Noon': 1.0,
        'Sunset': 0.90,
        'Night': 0.75
    }.get(time, 0.9)
    
    # Calculate final performance with some randomness
    ap = base_ap * weather_factor * time_factor * np.random.uniform(0.95, 1.05)
    recall = min(0.99, ap * np.random.uniform(1.0, 1.1))
    
    return {
        'AP': min(1.0, ap),
        'Recall': recall,
        'Latency_ms': np.random.uniform(30, 80)
    }

# Evaluate across all scenarios
detectors = ['PointPillars', 'SECOND', 'PV-RCNN']
results = []

for _, scenario in scenarios_df.iterrows():
    for detector in detectors:
        perf = simulate_detection_performance(scenario['Weather'], scenario['Time'], detector)
        results.append({
            'Scenario': scenario['ID'],
            'Weather': scenario['Weather'],
            'Time': scenario['Time'],
            'Detector': detector,
            **perf
        })

results_df = pd.DataFrame(results)

print("Detection Performance Summary")
summary = results_df.groupby(['Weather', 'Detector'])['AP'].mean().unstack()
display(summary.round(3))

In [None]:
def visualize_detection_performance(results_df):
    """Visualize detection performance across conditions."""
    fig, axes = plt.subplots(1, 3, figsize=(15, 5))
    
    # AP by Weather
    sns.boxplot(data=results_df, x='Weather', y='AP', hue='Detector', ax=axes[0])
    axes[0].set_title('Average Precision by Weather', fontweight='bold')
    axes[0].set_ylim(0, 1)
    axes[0].axhline(0.7, color='red', linestyle='--', label='SOTIF Threshold')
    axes[0].tick_params(axis='x', rotation=45)
    
    # AP by Time
    sns.boxplot(data=results_df, x='Time', y='AP', hue='Detector', ax=axes[1])
    axes[1].set_title('Average Precision by Time of Day', fontweight='bold')
    axes[1].set_ylim(0, 1)
    axes[1].axhline(0.7, color='red', linestyle='--')
    axes[1].get_legend().remove()
    
    # Recall vs AP scatter
    for detector in results_df['Detector'].unique():
        subset = results_df[results_df['Detector'] == detector]
        axes[2].scatter(subset['AP'], subset['Recall'], label=detector, alpha=0.6, s=50)
    axes[2].set_xlabel('Average Precision')
    axes[2].set_ylabel('Recall')
    axes[2].set_title('Recall vs AP', fontweight='bold')
    axes[2].axvline(0.7, color='red', linestyle='--')
    axes[2].axhline(0.8, color='red', linestyle='--')
    axes[2].legend()
    
    plt.tight_layout()
    plt.show()

visualize_detection_performance(results_df)

## 5. SOTIF Acceptance Criteria

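Following ISO 21448, which calls for explicit acceptance criteria without prescribing numeric values for perception metrics, we define illustrative thresholds for perception performance: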

| Metric | Threshold | Rationale |
|--------|-----------|----------|
| AP (Clear) | >= 0.80 | Baseline performance |
| AP (Adverse) | >= 0.70 | Degraded but acceptable |
| Recall | >= 0.85 | Safety-critical detection |
| Latency | <= 100 ms | Real-time requirement |
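
The thresholds in the table mix lower bounds (AP, recall) and an upper bound (latency), so each check needs its own comparison direction. A minimal sketch of a table-driven check is shown below. The `(comparison, limit)` layout, the `check_criteria` helper, and the example metric values are assumptions for illustration only.

```python
import operator

# Thresholds copied from the table; the (comparison, limit) layout is a
# hypothetical structure chosen for this sketch.
CRITERIA = {
    'AP_Clear':   (operator.ge, 0.80),
    'AP_Adverse': (operator.ge, 0.70),
    'Recall':     (operator.ge, 0.85),
    'Latency_ms': (operator.le, 100.0),
}


def check_criteria(metrics, criteria=CRITERIA):
    """Return a pass flag per criterion plus an overall 'accepted' verdict."""
    passed = {name: bool(compare(metrics[name], limit))
              for name, (compare, limit) in criteria.items()}
    passed['accepted'] = all(passed.values())
    return passed


# Made-up metric values, for illustration only.
example = {'AP_Clear': 0.82, 'AP_Adverse': 0.66, 'Recall': 0.88, 'Latency_ms': 55.0}
print(check_criteria(example))
```

Storing the comparison next to the limit keeps the direction of each requirement visible in one place and makes it easy to add further criteria.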

In [None]:
def evaluate_sotif_acceptance(results_df):
    """Evaluate SOTIF acceptance criteria."""
    criteria = {
        'AP_Clear': 0.80,
        'AP_Adverse': 0.70,
        'Recall': 0.85,
        'Latency_ms': 100
    }
    
    evaluation = []
    
    for detector in results_df['Detector'].unique():
        subset = results_df[results_df['Detector'] == detector]
        clear = subset[subset['Weather'] == 'Clear']
        adverse = subset[subset['Weather'] != 'Clear']
        
        ap_clear = clear['AP'].mean()
        ap_adverse = adverse['AP'].mean()
        recall = subset['Recall'].mean()
        latency = subset['Latency_ms'].mean()
        
        evaluation.append({
            'Detector': detector,
            'AP (Clear)': ap_clear,
            'AP (Adverse)': ap_adverse,
            'Recall': recall,
            'Latency (ms)': latency,
            'Pass Clear': 'PASS' if ap_clear >= criteria['AP_Clear'] else 'FAIL',
            'Pass Adverse': 'PASS' if ap_adverse >= criteria['AP_Adverse'] else 'FAIL',
            'Pass Recall': 'PASS' if recall >= criteria['Recall'] else 'FAIL',
            'Pass Latency': 'PASS' if latency <= criteria['Latency_ms'] else 'FAIL'
        })
    
    eval_df = pd.DataFrame(evaluation)
    
    # Overall pass/fail
    eval_df['SOTIF Status'] = eval_df.apply(
        lambda r: 'ACCEPTED' if all([
            r['Pass Clear'] == 'PASS',
            r['Pass Adverse'] == 'PASS',
            r['Pass Recall'] == 'PASS',
            r['Pass Latency'] == 'PASS'
        ]) else 'REJECTED',
        axis=1
    )
    
    return eval_df

eval_results = evaluate_sotif_acceptance(results_df)
print("SOTIF Acceptance Evaluation")
print("=" * 60)
display(eval_results[['Detector', 'AP (Clear)', 'AP (Adverse)', 'Recall', 'SOTIF Status']].round(3))

## 6. Identifying Triggering Conditions from Simulation

Simulation helps identify specific conditions where performance drops below acceptable thresholds.
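
Beyond listing the failing rows, it can help to rank conditions by how often they fail. The sketch below computes, for each weather and time pair, the share of results whose AP falls below the threshold. It assumes a table with the same `Weather`, `Time`, and `AP` columns as `results_df`. The helper name and the demo data are hypothetical.

```python
import pandas as pd


def failure_rate_by_condition(results, ap_threshold=0.70):
    """Share of results below the AP threshold per weather/time pair."""
    below = results['AP'] < ap_threshold
    rates = below.groupby([results['Weather'], results['Time']]).mean()
    return rates.sort_values(ascending=False).rename('failure_rate')


# Tiny made-up table with the same column names as results_df.
demo = pd.DataFrame({
    'Weather': ['Clear', 'Clear', 'Light Rain', 'Light Rain'],
    'Time':    ['Noon', 'Noon', 'Night', 'Night'],
    'AP':      [0.80, 0.76, 0.55, 0.72],
})
print(failure_rate_by_condition(demo))
```

A condition with a high failure rate across several detectors is a stronger triggering-condition candidate than one where a single detector dips below the threshold.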

In [None]:
def identify_triggering_conditions(results_df, ap_threshold=0.7):
    """Identify conditions causing performance below threshold."""
    failing = results_df[results_df['AP'] < ap_threshold].copy()
    
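    if len(failing) > 0:
        # Each row is one scenario/detector pair, so a scenario can appear once per detector.
        print(f"Identified {len(failing)} scenario/detector results below AP threshold ({ap_threshold})")
        print("\nTriggering Condition Summary:")
        print(failing.groupby(['Weather', 'Time'])['AP'].agg(['count', 'mean']).round(3))
    else:
        print("No results below threshold")
    # Return the (possibly empty) frame so callers always get the same columns.
    return failing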

triggering = identify_triggering_conditions(results_df, ap_threshold=0.70)

if len(triggering) > 0:
    print("\nRecommended Mitigations:")
    print("- Restrict ODD to exclude identified conditions")
    print("- Improve model robustness through data augmentation")
    print("- Add redundant sensing for adverse conditions")

## 7. Exercise: Design Your SOTIF Test Campaign

**Task:** Design a simulation-based test campaign for a camera-based pedestrian detection system.

Consider:
- What weather conditions to test?
- What lighting conditions?
- What pedestrian variations (clothing, pose, occlusion)?
- What acceptance criteria?

In [None]:
# Exercise: Design your test campaign

# TODO: Define weather conditions
# camera_weather = [...]

# TODO: Define lighting conditions
# camera_lighting = [...]

# TODO: Define pedestrian variations
# pedestrian_vars = [...]

# TODO: Define acceptance criteria
# criteria = {...}

print("Design your camera-based pedestrian detection test campaign.")

## Summary

In this notebook, you learned:

- **Simulation Role**: Critical for SOTIF validation at scale
- **Scenario Design**: Weather, time, and environmental condition matrices
- **Performance Metrics**: AP, recall, and latency for detection evaluation
- **Acceptance Criteria**: Thresholds for SOTIF compliance
- **Triggering Condition Identification**: Finding failure modes through systematic testing

### References

- Patel, M., Jung, R. (2024). "Simulation-Based Performance Evaluation of 3D Object Detection Methods with Deep Learning for a LiDAR Point Cloud Dataset in a SOTIF-related Use Case." VEHITS 2024.
- Patel, M. (2024). "SOTIF PCOD Dataset." IEEE DataPort.
- ISO 21448:2022. Road vehicles - Safety of the intended functionality.
- CARLA Simulator Documentation: https://carla.org

---

*Notebook created by Milin Patel | Hochschule Kempten*  
*Last updated: 2025-01-22*