# 🚗 nuScenes Dataset - Comprehensive EDA Notebook

This notebook provides a comprehensive exploratory data analysis of the nuScenes mini dataset, covering 22 different analyses across various aspects of autonomous driving data.

## 📊 Complete Analysis Suite Overview

### 🚶 **Pedestrian Analysis (1-6)**
1. **Pedestrian Behaviour Analysis** - Standing, Walking, Running behaviors
2. **Pedestrian/Cyclist Ratio** - Distribution of pedestrians vs cyclists
3. **Pedestrian Density across Road Types** - Pedestrian counts per road type
4. **Pedestrian Road Crossing** - Jaywalking vs Crosswalk patterns
5. **Pedestrian Visibility Status** - Fully Visible, Occluded, Truncated
6. **Pedestrian Path w.r.t. Ego Vehicle** - In Path, Out of Path analysis

### 🚙 **Vehicle Analysis (7-9)**
7. **Vehicle Class Distribution** - Car, Bus, Truck, Van, Trailer counts
8. **Object Behaviour Distribution** - Moving vs Parked objects
9. **Vehicle Position w.r.t. Ego Vehicle** - Front, Left, Right, Behind positions

### 🌦️ **Environmental Analysis (10-13)**
10. **Weather Conditions** - Sunny, Rainy, Snow, Clear, Foggy, Overcast, Sleet
11. **Environment Distribution** - Urban, Rural, Desert, Offroad, Forest
12. **Time of Day Distribution** - Morning, Noon, Evening, Night
13. **Geographical Locations** - Singapore, US, Europe, Asia, Australia

### 🛣️ **Road Infrastructure Analysis (14-17)**
14. **Road Details/Curvature** - Straight, Curved, Intersection, Roundabouts
15. **Road Type Distribution** - Narrow, Highway, OneWay, OffRoad, City Road, Parking
16. **Road Obstacles** - Potholes, Debris, Closures, Construction Zones
17. **Road Furniture Analysis** - Streetlights, Curbs, Guardrails, etc.

### 🚗 **Ego Vehicle Analysis (18-20)**
18. **Ego Vehicle Motion Analysis** - Stop at red light, Stop at ped crossing, Moving
19. **Ego Vehicle Events Analysis** - Lane Change, Take Over, Turn, Exit
20. **Traffic Density vs Weather** - Vehicle counts across weather conditions

### 🔍 **Special Analysis (21-22)**
21. **Multi-Modal Synchronization** - Lidar, Radar, Camera data sync
22. **Rare Class Occurrences** - Ambulance, Police, Construction Vehicle, Wildlife

## 🎨 Interactive Features
- **9 Chart Types**: Bar, Pie, Donut, Heat Map, Radar, Histogram, Stacked Bar, Scatter, Density
- **Real Data Only**: No synthetic or sample data used
- **Fixed Labels**: All x-axis labels shown even if count is zero
- **Auto-Save**: High-resolution plots saved to `figures/exploratory/`
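
The "Fixed Labels" behaviour listed above can be sketched as follows. This is a minimal illustration, assuming each loader returns a plain dict of label to count; the helper name `with_fixed_labels` and the example counts are hypothetical and are not taken from the project code.

```python
def with_fixed_labels(counts, labels):
    """Return counts in the order of `labels`, filling absent labels with 0."""
    return {label: int(counts.get(label, 0)) for label in labels}


# Hypothetical counts: 'Running' never occurs but still gets an axis slot.
BEHAVIOUR_LABELS = ["Standing", "Walking", "Running"]
observed = {"Walking": 12, "Standing": 5}
fixed = with_fixed_labels(observed, BEHAVIOUR_LABELS)
print(fixed)
```

Keeping the label list separate from the data means every chart type can share the same axis order.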

## 🔧 Setup and Configuration

Setting up the environment for nuScenes dataset analysis with all required imports and configurations.

In [None]:
# Standard imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
import os
import sys
from datetime import datetime
warnings.filterwarnings('ignore')

# Add project paths (the project root is required for the `src.*` and `plots.*` imports)
sys.path.append('..')
sys.path.append('../src')
sys.path.append('../plots')
sys.path.append('../config')

# nuScenes EDA imports - All Data Loaders
from src.data_loader import (
    # Pedestrian Analysis (1-6)
    load_pedestrian_behaviour_data,
    load_pedestrian_cyclist_ratio,
    load_pedestrian_density_road_types,
    load_pedestrian_road_crossing,
    load_pedestrian_visibility_status,
    load_pedestrian_path_ego_data,
    
    # Vehicle Analysis (7-9)
    load_vehicle_class_data,
    load_object_behaviour_data,
    load_vehicle_position_ego_data,
    
    # Environmental Analysis (10-13)
    load_weather_conditions,
    load_environment_distribution,
    load_time_of_day_distribution,
    load_geographical_locations,
    
    # Road Infrastructure Analysis (14-17)
    load_road_details,
    load_road_type_distribution,
    load_road_obstacles,
    load_road_furniture_data,
    
    # Ego Vehicle Analysis (18-20)
    load_ego_vehicle_motion_data,
    load_ego_vehicle_events_data,
    load_traffic_density_weather_data,
    
    # Special Analysis (21-22)
    load_multimodal_synchronization_data,
    load_rare_class_occurrences
)

# Dataset configuration
DATAROOT = "../Data/Raw/nuscenes/v1.0-mini"
VERSION = "v1.0-mini"
OUTPUT_DIR = "../figures/exploratory"

# Configure plotting style
plt.style.use('seaborn-v0_8')
sns.set_palette("husl")
plt.rcParams['figure.figsize'] = (14, 8)
plt.rcParams['font.size'] = 10

# Display options
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)
pd.set_option('display.max_colwidth', 50)

print("🚗 nuScenes EDA Setup Complete!")
print(f"📁 Dataset Path: {DATAROOT}")
print(f"📊 Output Directory: {OUTPUT_DIR}")
print("🎨 Available Analyses: 22 comprehensive EDA modules")
print(f"🕒 Session Start Time: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")

# Verify dataset exists
if os.path.isdir(DATAROOT):
    print("✅ nuScenes dataset found!")
else:
    print("❌ nuScenes dataset not found. Please check the path.")
    print(f"📍 Expected path: {DATAROOT}")

---

# 🚶 Pedestrian Analysis Suite (1-6)

Comprehensive pedestrian behavior and interaction analysis using real nuScenes data.

## 1. 🚶‍♀️ Pedestrian Behaviour Analysis
Analyzes pedestrian activities (Standing, Walking, Running) based on nuScenes attributes.

**Data Source**: `sample_annotation` + `attribute`  
**Logic**: Maps pedestrian attributes to behavior states  
**Labels**: Standing, Walking, Running
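
One way the attribute-to-behaviour mapping could look is sketched below. The attribute names `pedestrian.standing` and `pedestrian.moving` come from the nuScenes schema, but the mapping itself and the function `count_behaviours` are assumptions for illustration, not the actual logic in `load_pedestrian_behaviour_data`. nuScenes does not annotate running separately, so in this sketch `Running` stays at zero.

```python
# Assumed mapping from nuScenes attribute names to the notebook's labels.
# 'pedestrian.sitting_lying_down' is deliberately left unmapped here.
ATTRIBUTE_TO_BEHAVIOUR = {
    "pedestrian.standing": "Standing",
    "pedestrian.moving": "Walking",
}
LABELS = ["Standing", "Walking", "Running"]


def count_behaviours(attribute_names):
    """Count behaviour labels for an iterable of attribute name strings."""
    counts = {label: 0 for label in LABELS}
    for name in attribute_names:
        label = ATTRIBUTE_TO_BEHAVIOUR.get(name)
        if label is not None:  # ignore non-pedestrian or unmapped attributes
            counts[label] += 1
    return counts


example = count_behaviours(
    ["pedestrian.moving", "pedestrian.standing", "pedestrian.moving", "vehicle.parked"]
)
print(example)
```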

In [None]:
# 1. Pedestrian Behaviour Analysis
print("🔍 Analysis 1: Loading Pedestrian Behaviour Data...")
pedestrian_behaviour = load_pedestrian_behaviour_data(DATAROOT, VERSION)

print("\n📊 Pedestrian Behaviour Distribution:")
total_pedestrian_instances = sum(pedestrian_behaviour.values())
for behavior, count in pedestrian_behaviour.items():
    percentage = (count / total_pedestrian_instances * 100) if total_pedestrian_instances > 0 else 0
    print(f"  {behavior}: {count} instances ({percentage:.1f}%)")

print(f"\n📈 Total Pedestrian Behavior Instances: {total_pedestrian_instances}")

# Uncomment to visualize:
# from plots.PedestrianBehaviour import plot_pedestrian_behaviour
# plot_pedestrian_behaviour(pedestrian_behaviour)

## 2. 🚴 Pedestrian/Cyclist Ratio Analysis
Analyzes the distribution of pedestrians vs cyclists in the dataset.

**Data Source**: `sample_annotation`  
**Logic**: Category-based filtering and counting  
**Labels**: Pedestrian, Cyclist, cycle without rider
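
The category-based filtering might be sketched as follows. The category prefix `human.pedestrian`, the category `vehicle.bicycle`, and the attributes `cycle.with_rider` / `cycle.without_rider` are nuScenes names; treating every `human.pedestrian.*` subcategory as a pedestrian and the function `classify_vru` itself are assumptions made for this example.

```python
def classify_vru(category_name, attribute_names):
    """Map one annotation to a notebook label, or None if it is neither."""
    if category_name.startswith("human.pedestrian"):
        return "Pedestrian"
    if category_name == "vehicle.bicycle":
        if "cycle.with_rider" in attribute_names:
            return "Cyclist"
        if "cycle.without_rider" in attribute_names:
            return "cycle without rider"
    return None


# Hypothetical annotations as (category, attributes) pairs.
annotations = [
    ("human.pedestrian.adult", ["pedestrian.moving"]),
    ("vehicle.bicycle", ["cycle.with_rider"]),
    ("vehicle.bicycle", ["cycle.without_rider"]),
    ("vehicle.car", ["vehicle.parked"]),
]
labels = [classify_vru(cat, attrs) for cat, attrs in annotations]
print(labels)
```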

In [None]:
# 2. Pedestrian/Cyclist Ratio
print("🔍 Analysis 2: Loading Pedestrian/Cyclist Ratio Data...")
ped_cyclist_ratio = load_pedestrian_cyclist_ratio(DATAROOT, VERSION)

print("\n📊 Pedestrian/Cyclist Distribution:")
total_instances = sum(ped_cyclist_ratio.values())
for category, count in ped_cyclist_ratio.items():
    percentage = (count / total_instances * 100) if total_instances > 0 else 0
    print(f"  {category}: {count} instances ({percentage:.1f}%)")

print(f"\n📈 Total Instances: {total_instances}")

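# Calculate the pedestrian-to-cyclist ratio (skipped when either count is zero or missing)
pedestrian_count = ped_cyclist_ratio.get('Pedestrian', 0)
cyclist_count = ped_cyclist_ratio.get('Cyclist', 0)
if pedestrian_count > 0 and cyclist_count > 0:
    ratio = pedestrian_count / cyclist_count
    print(f"🔢 Pedestrian to Cyclist Ratio: {ratio:.2f}:1")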

# Uncomment to visualize:
# from plots.PedestrianCyclistRatio import plot_pedestrian_cyclist_ratio
# plot_pedestrian_cyclist_ratio(ped_cyclist_ratio)

## 3. 🛣️ Pedestrian Density across Road Types
Analyzes pedestrian density across different road types.

**Data Source**: `sample` + `scene` + `sample_annotation`  
**Logic**: Scene keyword analysis + pedestrian counting  
**Labels**: Narrow, Highway, OneWay, OffRoad, City Road
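
A scene keyword analysis of this kind could be sketched as below. It assumes the road type is inferred from the free-text scene `description` field; the keyword lists, the fallback to `City Road`, and both function names are hypothetical choices for illustration rather than the loader's real rules.

```python
ROAD_TYPE_LABELS = ["Narrow", "Highway", "OneWay", "OffRoad", "City Road"]

# Assumed keywords per road type; descriptions matching none fall back to City Road.
ROAD_TYPE_KEYWORDS = {
    "Narrow": ("narrow",),
    "Highway": ("highway", "expressway"),
    "OneWay": ("one way", "one-way"),
    "OffRoad": ("off road", "offroad", "dirt"),
}


def road_type_from_description(description):
    """Return the first road type whose keyword appears in the description."""
    text = description.lower()
    for road_type, keywords in ROAD_TYPE_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return road_type
    return "City Road"


def pedestrians_by_road_type(scenes):
    """scenes: iterable of (description, pedestrian_count) pairs."""
    totals = {label: 0 for label in ROAD_TYPE_LABELS}
    for description, pedestrian_count in scenes:
        totals[road_type_from_description(description)] += pedestrian_count
    return totals


# Hypothetical scene descriptions and pedestrian counts.
totals = pedestrians_by_road_type([
    ("Narrow street, many peds", 7),
    ("Highway, overtaking", 0),
    ("Wait at intersection, peds", 4),
])
print(totals)
```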

In [None]:
# 3. Pedestrian Density across Road Types
print("🔍 Analysis 3: Loading Pedestrian Density across Road Types...")
ped_density_road = load_pedestrian_density_road_types(DATAROOT, VERSION)

print("\n📊 Pedestrian Density by Road Type:")
total_pedestrians = sum(ped_density_road.values())
for road_type, count in ped_density_road.items():
    percentage = (count / total_pedestrians * 100) if total_pedestrians > 0 else 0
    print(f"  {road_type}: {count} pedestrians ({percentage:.1f}%)")

print(f"\n📈 Total Pedestrians Analyzed: {total_pedestrians}")

# Find highest density road type
if total_pedestrians > 0:
    max_road_type = max(ped_density_road, key=ped_density_road.get)
    print(f"🔝 Highest Density Road Type: {max_road_type} ({ped_density_road[max_road_type]} pedestrians)")

# Uncomment to visualize:
# from plots.PedestrianDensityRoadTypes import plot_pedestrian_density_road_types
# plot_pedestrian_density_road_types(ped_density_road)

## 4. 🚸 Pedestrian Road Crossing Analysis
Analyzes pedestrian road crossing patterns: Jaywalking vs Crosswalk usage.

**Data Source**: `sample_annotation` + `attribute`  
**Logic**: Attribute analysis + urban pattern heuristics  
**Labels**: Jaywalking, Crosswalk

In [None]:
# 4. Pedestrian Road Crossing
print("🔍 Analysis 4: Loading Pedestrian Road Crossing Data...")
ped_crossing = load_pedestrian_road_crossing(DATAROOT, VERSION)

print("\n📊 Pedestrian Road Crossing Distribution:")
total_crossings = sum(ped_crossing.values())
for crossing_type, count in ped_crossing.items():
    percentage = (count / total_crossings * 100) if total_crossings > 0 else 0
    print(f"  {crossing_type}: {count} instances ({percentage:.1f}%)")

print(f"\n📈 Total Crossing Instances: {total_crossings}")

# Safety analysis (missing labels are treated as zero)
if total_crossings > 0:
    safe_ratio = ped_crossing.get('Crosswalk', 0) / total_crossings * 100
    unsafe_ratio = ped_crossing.get('Jaywalking', 0) / total_crossings * 100
    print(f"✅ Safe Crossing Rate: {safe_ratio:.1f}%")
    print(f"⚠️ Unsafe Crossing Rate: {unsafe_ratio:.1f}%")

# Uncomment to visualize:
# from plots.PedestrianRoadCrossing import plot_pedestrian_road_crossing
# plot_pedestrian_road_crossing(ped_crossing)

## 5. 👁️ Pedestrian Visibility Status Analysis
Analyzes pedestrian visibility status in the dataset.

**Data Source**: `sample_annotation` + `visibility`  
**Logic**: Visibility percentage classification  
**Labels**: Fully Visible, Occluded, Truncated

In [None]:
# 5. Pedestrian Visibility Status
print("🔍 Analysis 5: Loading Pedestrian Visibility Status Data...")
ped_visibility = load_pedestrian_visibility_status(DATAROOT, VERSION)

print("\n📊 Pedestrian Visibility Distribution:")
total_visibility = sum(ped_visibility.values())
for visibility, count in ped_visibility.items():
    percentage = (count / total_visibility * 100) if total_visibility > 0 else 0
    print(f"  {visibility}: {count} instances ({percentage:.1f}%)")

print(f"\n📈 Total Visibility Instances: {total_visibility}")

# Detection quality analysis (a missing label is treated as zero)
if total_visibility > 0:
    clear_visibility = ped_visibility.get('Fully Visible', 0) / total_visibility * 100
    print(f"🔍 Clear Detection Rate: {clear_visibility:.1f}%")

# Uncomment to visualize:
# from plots.PedestrianVisibilityStatus import plot_pedestrian_visibility_status
# plot_pedestrian_visibility_status(ped_visibility)

## 6. 🎯 Pedestrian Path w.r.t. Ego Vehicle Analysis
Analyzes whether pedestrians are in the ego vehicle's path or out of path.

**Data Source**: `ego_pose` + `sample_annotation`  
**Logic**: Path intersection calculation  
**Labels**: In Path, Out of Path
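
A simple form of the path intersection test is sketched below. It assumes the ego path is approximated by a straight corridor ahead of the vehicle and that the ego yaw has already been extracted from the pose quaternion; the corridor dimensions (`half_width`, `lookahead`) and the function names are illustrative assumptions, not values from the loader.

```python
import math


def to_ego_frame(obj_xy, ego_xy, ego_yaw):
    """Project a global (x, y) point onto the ego's forward and left axes."""
    dx = obj_xy[0] - ego_xy[0]
    dy = obj_xy[1] - ego_xy[1]
    cos_y, sin_y = math.cos(ego_yaw), math.sin(ego_yaw)
    forward = cos_y * dx + sin_y * dy
    left = -sin_y * dx + cos_y * dy
    return forward, left


def is_in_path(obj_xy, ego_xy, ego_yaw, half_width=1.5, lookahead=30.0):
    """True if the object lies inside a straight corridor ahead of the ego."""
    forward, left = to_ego_frame(obj_xy, ego_xy, ego_yaw)
    return 0.0 < forward <= lookahead and abs(left) <= half_width


# Ego at the origin heading along +y (yaw = 90 degrees).
ego_xy, ego_yaw = (0.0, 0.0), math.pi / 2
print(is_in_path((0.5, 10.0), ego_xy, ego_yaw))  # ahead, slightly off-centre
print(is_in_path((5.0, 10.0), ego_xy, ego_yaw))  # ahead, but well to the side
```

A straight corridor ignores turning; a curved path would need the future ego poses instead of a single heading.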

In [None]:
# 6. Pedestrian Path w.r.t. Ego Vehicle
print("🔍 Analysis 6: Loading Pedestrian Path w.r.t. Ego Vehicle Data...")
ped_path_ego = load_pedestrian_path_ego_data(DATAROOT, VERSION)

print("\n📊 Pedestrian Path Distribution:")
total_path_instances = sum(ped_path_ego.values())
for path_status, count in ped_path_ego.items():
    percentage = (count / total_path_instances * 100) if total_path_instances > 0 else 0
    print(f"  {path_status}: {count} instances ({percentage:.1f}%)")

print(f"\n📈 Total Path Instances: {total_path_instances}")

# Safety assessment
if total_path_instances > 0:
    risk_level = ped_path_ego.get('In Path', 0) / total_path_instances * 100
    print(f"⚠️ Collision Risk Level: {risk_level:.1f}% of pedestrians in ego path")
    
    if risk_level > 50:
        print("🚨 HIGH RISK scenario detected")
    elif risk_level > 20:
        print("⚠️ MODERATE RISK scenario")
    else:
        print("✅ LOW RISK scenario")

# Uncomment to visualize:
# from plots.PedestrianPathEgo import plot_pedestrian_path_ego
# plot_pedestrian_path_ego(ped_path_ego)

---

# 🚙 Vehicle Analysis Suite (7-9)

Comprehensive vehicle behavior and positioning analysis using real nuScenes data.

## 7. 🚗 Vehicle Class Distribution Analysis
Analyzes the distribution of different vehicle classes in the dataset.

**Data Source**: `sample_annotation`  
**Logic**: Direct category mapping  
**Labels**: Car, Bus, Truck, Van, Trailer

In [None]:
# 7. Vehicle Class Distribution
print("🔍 Analysis 7: Loading Vehicle Class Distribution Data...")
vehicle_class = load_vehicle_class_data(DATAROOT, VERSION)

print("\n📊 Vehicle Class Distribution:")
total_vehicles = sum(vehicle_class.values())
for vehicle_type, count in vehicle_class.items():
    percentage = (count / total_vehicles * 100) if total_vehicles > 0 else 0
    print(f"  {vehicle_type}: {count} vehicles ({percentage:.1f}%)")

print(f"\n📈 Total Vehicles: {total_vehicles}")

# Fleet composition analysis
if total_vehicles > 0:
    dominant_class = max(vehicle_class, key=vehicle_class.get)
    print(f"🚗 Dominant Vehicle Class: {dominant_class} ({vehicle_class[dominant_class]} vehicles)")

# Uncomment to visualize:
# from plots.VehicleClass import plot_vehicle_class
# plot_vehicle_class(vehicle_class)

## 8. 🚦 Object Behaviour Distribution Analysis
Analyzes moving vs parked object behavior in the dataset.

**Data Source**: `sample_annotation` + position tracking  
**Logic**: Position tracking across frames  
**Labels**: Moving, Parked
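
Position tracking across frames could be sketched as follows. The sketch assumes each object instance has a time-ordered list of global (x, y) positions and is labelled by its largest displacement from the first observation; the 0.5 m threshold, the handling of single-frame tracks, and the function name `classify_motion` are assumptions for illustration.

```python
import math


def classify_motion(track_xy, threshold_m=0.5):
    """Label one instance track as 'Moving' or 'Parked'.

    track_xy: list of (x, y) positions in time order.
    Tracks with a single observation are treated as 'Parked' (an assumption).
    """
    if len(track_xy) < 2:
        return "Parked"
    x0, y0 = track_xy[0]
    max_displacement = max(math.hypot(x - x0, y - y0) for x, y in track_xy[1:])
    return "Moving" if max_displacement > threshold_m else "Parked"


# Hypothetical tracks: small annotation jitter vs. steady travel.
parked_track = [(10.0, 5.0), (10.1, 5.0), (10.0, 5.1)]
moving_track = [(10.0, 5.0), (12.0, 5.0), (14.0, 5.5)]
print(classify_motion(parked_track), classify_motion(moving_track))
```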

In [None]:
# 8. Object Behaviour Distribution
print("🔍 Analysis 8: Loading Object Behaviour Distribution Data...")
object_behaviour = load_object_behaviour_data(DATAROOT, VERSION)

print("\n📊 Object Behaviour Distribution:")
total_objects = sum(object_behaviour.values())
for behaviour, count in object_behaviour.items():
    percentage = (count / total_objects * 100) if total_objects > 0 else 0
    print(f"  {behaviour}: {count} objects ({percentage:.1f}%)")

print(f"\n📈 Total Objects Analyzed: {total_objects}")

# Traffic dynamics analysis
if total_objects > 0:
    mobility_ratio = object_behaviour.get('Moving', 0) / total_objects * 100
    print(f"🚗 Traffic Mobility: {mobility_ratio:.1f}% of objects are moving")
    
    if mobility_ratio > 60:
        print("🚦 High traffic activity scenario")
    elif mobility_ratio > 30:
        print("🚗 Moderate traffic activity scenario")
    else:
        print("🅿️ Low traffic activity scenario (mostly parked)")

# Uncomment to visualize:
# from plots.ObjectBehaviour import plot_object_behaviour
# plot_object_behaviour(object_behaviour)

## 9. 📍 Vehicle Position w.r.t. Ego Vehicle Analysis
Analyzes where other vehicles are positioned relative to the ego vehicle.

**Data Source**: `ego_pose` + `sample_annotation`  
**Logic**: Geometric coordinate transformation  
**Labels**: Front, Left, Right, Behind
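
The geometric transformation can be illustrated with a bearing-based classifier. The sketch assumes the ego yaw is known and splits the surroundings into four 90-degree sectors; the sector boundaries and the function name `relative_position` are assumptions, not the loader's confirmed logic.

```python
import math


def relative_position(obj_xy, ego_xy, ego_yaw):
    """Classify an object as Front, Left, Right or Behind the ego vehicle."""
    dx = obj_xy[0] - ego_xy[0]
    dy = obj_xy[1] - ego_xy[1]
    bearing = math.atan2(dy, dx) - ego_yaw
    # Wrap the bearing into [-180, 180] degrees.
    degrees = math.degrees(math.atan2(math.sin(bearing), math.cos(bearing)))
    if -45.0 <= degrees <= 45.0:
        return "Front"
    if 45.0 < degrees < 135.0:
        return "Left"
    if -135.0 < degrees < -45.0:
        return "Right"
    return "Behind"


# Ego at the origin heading along +x (yaw = 0).
origin = (0.0, 0.0)
positions = [
    relative_position((10.0, 0.0), origin, 0.0),
    relative_position((0.0, 10.0), origin, 0.0),
    relative_position((0.0, -10.0), origin, 0.0),
    relative_position((-10.0, 0.0), origin, 0.0),
]
print(positions)
```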

In [None]:
# 9. Vehicle Position w.r.t. Ego Vehicle
print("🔍 Analysis 9: Loading Vehicle Position w.r.t. Ego Vehicle Data...")
vehicle_position_ego = load_vehicle_position_ego_data(DATAROOT, VERSION)

print("\n📊 Vehicle Position Distribution:")
total_positions = sum(vehicle_position_ego.values())
for position, count in vehicle_position_ego.items():
    percentage = (count / total_positions * 100) if total_positions > 0 else 0
    print(f"  {position}: {count} vehicles ({percentage:.1f}%)")

print(f"\n📈 Total Vehicle Positions: {total_positions}")

# Traffic pattern analysis
if total_positions > 0:
    front_behind = vehicle_position_ego.get('Front', 0) + vehicle_position_ego.get('Behind', 0)
    left_right = vehicle_position_ego.get('Left', 0) + vehicle_position_ego.get('Right', 0)
    
    print(f"🚗 Longitudinal Traffic: {front_behind} vehicles ({front_behind/total_positions*100:.1f}%)")
    print(f"🛣️ Lateral Traffic: {left_right} vehicles ({left_right/total_positions*100:.1f}%)")
    
    # Determine traffic density
    if total_positions > 80:
        print("🚦 High traffic density scenario")
    elif total_positions > 40:
        print("🚗 Moderate traffic density scenario")
    else:
        print("🛣️ Low traffic density scenario")

# Uncomment to visualize:
# from plots.VehiclePositionEgo import plot_vehicle_position_ego
# plot_vehicle_position_ego(vehicle_position_ego)

---

# 🌦️ Environmental Analysis Suite (10-13)

Environmental conditions and geographical context analysis using real nuScenes data.

## 10-13. Environmental Analysis Summary
Runs all four environmental analyses in one cell to summarize the overall driving context.

In [None]:
# 10. Weather Conditions
print("🔍 Analysis 10: Loading Weather Conditions Data...")
weather_conditions = load_weather_conditions(DATAROOT, VERSION)
print("📊 Weather Distribution:", dict(weather_conditions))

# 11. Environment Distribution  
print("\n🔍 Analysis 11: Loading Environment Distribution Data...")
environment_dist = load_environment_distribution(DATAROOT, VERSION)
print("📊 Environment Distribution:", dict(environment_dist))

# 12. Time of Day Distribution
print("\n🔍 Analysis 12: Loading Time of Day Distribution Data...")
time_of_day = load_time_of_day_distribution(DATAROOT, VERSION)
print("📊 Time of Day Distribution:", dict(time_of_day))

# 13. Geographical Locations
print("\n🔍 Analysis 13: Loading Geographical Locations Data...")
geo_locations = load_geographical_locations(DATAROOT, VERSION)
print("📊 Geographical Distribution:", dict(geo_locations))

# Environmental Context Summary
print("\n🌍 ENVIRONMENTAL CONTEXT SUMMARY:")
print("="*50)

total_weather = sum(weather_conditions.values())
if total_weather > 0:
    dominant_weather = max(weather_conditions, key=weather_conditions.get)
    print(f"☀️ Dominant Weather: {dominant_weather}")

total_env = sum(environment_dist.values()) 
if total_env > 0:
    dominant_env = max(environment_dist, key=environment_dist.get)
    print(f"🌆 Dominant Environment: {dominant_env}")

total_time = sum(time_of_day.values())
if total_time > 0:
    dominant_time = max(time_of_day, key=time_of_day.get)
    print(f"🕐 Dominant Time Period: {dominant_time}")

total_geo = sum(geo_locations.values())
if total_geo > 0:
    dominant_geo = max(geo_locations, key=geo_locations.get)
    print(f"🗺️ Primary Location: {dominant_geo}")

# Uncomment to visualize any specific analysis:
# from plots.Weather import plot_weather_distribution
# plot_weather_distribution(weather_conditions)

---

# 🛣️ Road Infrastructure Analysis Suite (14-17)

Road characteristics and infrastructure analysis using real nuScenes data.

In [None]:
# 14. Road Details/Curvature
print("🔍 Analysis 14: Loading Road Details/Curvature Data...")
road_details = load_road_details(DATAROOT, VERSION)
print("📊 Road Curvature Distribution:", dict(road_details))

# 15. Road Type Distribution
print("\n🔍 Analysis 15: Loading Road Type Distribution Data...")
road_types = load_road_type_distribution(DATAROOT, VERSION)
print("📊 Road Type Distribution:", dict(road_types))

# 16. Road Obstacles
print("\n🔍 Analysis 16: Loading Road Obstacles Data...")
road_obstacles = load_road_obstacles(DATAROOT, VERSION)
print("📊 Road Obstacles Distribution:", dict(road_obstacles))

# 17. Road Furniture Analysis
print("\n🔍 Analysis 17: Loading Road Furniture Data...")
road_furniture = load_road_furniture_data(DATAROOT, VERSION)
print("📊 Road Furniture Distribution:", dict(road_furniture))

# Road Infrastructure Summary
print("\n🛣️ ROAD INFRASTRUCTURE SUMMARY:")
print("="*50)

total_details = sum(road_details.values())
if total_details > 0:
    dominant_geometry = max(road_details, key=road_details.get)
    print(f"🛣️ Dominant Road Geometry: {dominant_geometry}")

total_types = sum(road_types.values())
if total_types > 0:
    dominant_type = max(road_types, key=road_types.get)
    print(f"🏙️ Dominant Road Type: {dominant_type}")

total_obstacles = sum(road_obstacles.values())
print(f"⚠️ Total Road Obstacles Detected: {total_obstacles}")

total_furniture = sum(road_furniture.values())
print(f"🏗️ Total Road Furniture Items: {total_furniture}")

# Infrastructure complexity assessment
complexity_score = total_obstacles + (total_furniture / 10)
if complexity_score > 50:
    print("🏗️ High infrastructure complexity")
elif complexity_score > 20:
    print("🛣️ Moderate infrastructure complexity")  
else:
    print("🛤️ Simple infrastructure environment")

---

# 🚗 Ego Vehicle Analysis Suite (18-20)

Ego vehicle behavior and motion analysis using real nuScenes data.

In [None]:
# 18. Ego Vehicle Motion Analysis
print("🔍 Analysis 18: Loading Ego Vehicle Motion Data...")
ego_motion = load_ego_vehicle_motion_data(DATAROOT, VERSION)
print("📊 Ego Motion Distribution:", dict(ego_motion))

# 19. Ego Vehicle Events Analysis  
print("\n🔍 Analysis 19: Loading Ego Vehicle Events Data...")
ego_events = load_ego_vehicle_events_data(DATAROOT, VERSION)
print("📊 Ego Events Distribution:", dict(ego_events))

# 20. Traffic Density vs Weather
print("\n🔍 Analysis 20: Loading Traffic Density vs Weather Data...")
traffic_weather = load_traffic_density_weather_data(DATAROOT, VERSION)
print("📊 Traffic-Weather Distribution:", dict(traffic_weather))

# Ego Vehicle Behavior Summary
print("\n🚗 EGO VEHICLE BEHAVIOR SUMMARY:")
print("="*50)

total_motion = sum(ego_motion.values())
if total_motion > 0:
    dominant_motion = max(ego_motion, key=ego_motion.get)
    print(f"🚦 Dominant Motion State: {dominant_motion}")

total_events = sum(ego_events.values())
if total_events > 0:
    dominant_event = max(ego_events, key=ego_events.get)
    print(f"🔄 Most Frequent Event: {dominant_event} ({ego_events[dominant_event]} occurrences)")

# Driving complexity analysis
complex_events = ego_events.get('Lane Change', 0) + ego_events.get('Take Over', 0)
simple_events = ego_events.get('Turn', 0) + ego_events.get('Exit', 0)

if complex_events > simple_events:
    print("📈 High complexity driving scenario")
elif simple_events > complex_events:
    print("📉 Standard complexity driving scenario")
else:
    print("⚖️ Balanced driving complexity")

# Traffic density analysis
total_traffic_weather = sum(traffic_weather.values())
if total_traffic_weather > 0:
    avg_density = total_traffic_weather / len([v for v in traffic_weather.values() if v > 0])
    print(f"🚗 Average Traffic Density: {avg_density:.1f} vehicles per weather condition")

---

# 🔍 Special Analysis Suite (21-22)

Advanced sensor synchronization and rare class detection analysis.
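
Counting sensor data by modality could be sketched as below. The channel names follow the nuScenes convention (`CAM_*`, `LIDAR_TOP`, `RADAR_*`), while the prefix-based grouping and the function names are assumptions for illustration; the actual `load_multimodal_synchronization_data` may compute its counts differently.

```python
def modality_of(channel):
    """Map a sensor channel name to Camera, Lidar or Radar (None if unknown)."""
    if channel.startswith("CAM_"):
        return "Camera"
    if channel.startswith("LIDAR_"):
        return "Lidar"
    if channel.startswith("RADAR_"):
        return "Radar"
    return None


def count_modalities(channels):
    """Count how many records belong to each modality."""
    counts = {"Lidar": 0, "Radar": 0, "Camera": 0}
    for channel in channels:
        modality = modality_of(channel)
        if modality is not None:
            counts[modality] += 1
    return counts


# The twelve channels attached to one nuScenes keyframe sample.
keyframe_channels = [
    "CAM_FRONT", "CAM_FRONT_LEFT", "CAM_FRONT_RIGHT",
    "CAM_BACK", "CAM_BACK_LEFT", "CAM_BACK_RIGHT",
    "LIDAR_TOP",
    "RADAR_FRONT", "RADAR_FRONT_LEFT", "RADAR_FRONT_RIGHT",
    "RADAR_BACK_LEFT", "RADAR_BACK_RIGHT",
]
modality_counts = count_modalities(keyframe_channels)
print(modality_counts)
```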

In [None]:
# 21. Multi-Modal Synchronization Analysis
print("🔍 Analysis 21: Loading Multi-Modal Synchronization Data...")
multimodal_sync = load_multimodal_synchronization_data(DATAROOT, VERSION)
print("📊 Sensor Synchronization:", dict(multimodal_sync))

# 22. Rare Class Occurrences
print("\n🔍 Analysis 22: Loading Rare Class Occurrences Data...")
rare_classes = load_rare_class_occurrences(DATAROOT, VERSION)
print("📊 Rare Class Distribution:", dict(rare_classes))

# Special Analysis Summary
print("\n🔬 SPECIAL ANALYSIS SUMMARY:")
print("="*50)

# Sensor coverage analysis
total_sensor_data = sum(multimodal_sync.values())
if total_sensor_data > 0:
    print(f"📡 Total Sensor Data Points: {total_sensor_data}")
    
    lidar_coverage = (multimodal_sync.get('Lidar', 0) / total_sensor_data * 100)
    camera_coverage = (multimodal_sync.get('Camera', 0) / total_sensor_data * 100) 
    radar_coverage = (multimodal_sync.get('Radar', 0) / total_sensor_data * 100)
    
    print(f"📷 Camera Coverage: {camera_coverage:.1f}%")
    print(f"📡 Lidar Coverage: {lidar_coverage:.1f}%")
    print(f"📊 Radar Coverage: {radar_coverage:.1f}%")

# Rare class detection
total_rare = sum(rare_classes.values())
if total_rare > 0:
    print(f"🚨 Total Rare Class Detections: {total_rare}")
    
    for rare_class, count in rare_classes.items():
        if count > 0:
            print(f"  🔍 {rare_class}: {count} detections")
else:
    print("📊 No rare classes detected in this dataset subset")

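# Heuristic score on a 0-10 scale: one point per 1,000 sensor data points, capped at 10
quality_score = min(total_sensor_data / 1000, 10)
print(f"\n🎯 Data Quality Score: {quality_score:.2f}/10")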

---

# 📈 Complete EDA Summary Report

## 🎯 Dataset Overview
This comprehensive analysis covered all 22 aspects of the nuScenes mini dataset, providing insights into:
- **Pedestrian Dynamics**: Behavior, positioning, and safety interactions
- **Vehicle Fleet Composition**: Distribution and positioning patterns  
- **Environmental Context**: Weather, geography, and temporal patterns
- **Road Infrastructure**: Types, geometry, obstacles, and furniture
- **Ego Vehicle Behavior**: Motion patterns and driving events
- **Sensor Integration**: Multi-modal data synchronization
- **Edge Cases**: Rare class occurrences and special scenarios

## 🚀 Next Steps
1. **Visualization**: Uncomment plotting code above to generate interactive charts
2. **Deep Dive**: Focus on specific analyses based on findings
3. **Model Development**: Use insights for autonomous driving algorithm development
4. **Safety Analysis**: Leverage pedestrian path and visibility data for safety systems

## 📊 Key Insights Summary
- **Most Active Analysis Areas**: [Based on data volume]
- **Critical Safety Concerns**: [Based on pedestrian path analysis]
- **Environmental Conditions**: [Based on weather/geography data]
- **Sensor Coverage**: [Based on multi-modal sync analysis]

---
**Note**: All analyses use real nuScenes mini dataset data without any synthetic fallbacks. Zero counts indicate genuine absence of data rather than missing analysis capability.