# Traffic Sign Recognition - Advanced Deep Learning Project

## Overview
This notebook implements a traffic sign recognition system using several deep learning approaches:
- **Custom CNN** from scratch
- **Transfer Learning** with MobileNetV2, VGG16, and ResNet50
- **Data Augmentation** for improved generalization
- **Comprehensive Evaluation** with detailed visualizations

## Dataset: GTSRB (German Traffic Sign Recognition Benchmark)
- 43 different traffic sign classes
- High-quality RGB images
- Varying resolutions and lighting conditions

---


## Import Libraries and Setup


In [1]:
# Core libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import cv2
import os
import time
import warnings
warnings.filterwarnings('ignore')

# Deep learning libraries
import tensorflow as tf
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Sklearn for evaluation
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.preprocessing import LabelEncoder

# Custom modules
from data_preprocessing import TrafficSignDataProcessor
from models import TrafficSignModelBuilder
from training_evaluation import TrafficSignTrainer

# Set random seeds for reproducibility
np.random.seed(42)
tf.random.set_seed(42)

# Set plotting style
plt.style.use('seaborn-v0_8')
sns.set_palette("husl")

print("Libraries imported successfully!")
print(f"TensorFlow version: {tf.__version__}")
print(f"GPU available: {tf.config.list_physical_devices('GPU')}")


Libraries imported successfully!
TensorFlow version: 2.20.0
GPU available: []


## Data Loading and Preprocessing


In [2]:
# Initialize data processor
processor = TrafficSignDataProcessor(img_size=(64, 64))

# Configure dataset paths
data_dir = 'data'
train_csv_path = os.path.join(data_dir, 'Train.csv')
train_images_path = os.path.join(data_dir, 'Train')
test_csv_path = os.path.join(data_dir, 'Test.csv')
test_images_path = os.path.join(data_dir, 'Test')

print("Dataset paths configured:")
print(f"Training CSV: {train_csv_path}")
print(f"Training Images: {train_images_path}")
print(f"Test CSV: {test_csv_path}")
print(f"Test Images: {test_images_path}")

# Check that the training and test files all exist
required_paths = [train_csv_path, train_images_path, test_csv_path, test_images_path]
if all(os.path.exists(p) for p in required_paths):
    print("\nDataset found! Ready to load data.")
    dataset_available = True
else:
    print("\nDataset not found!")
    print("To use this notebook with real data:")
    print("1. Download GTSRB dataset from: https://www.kaggle.com/datasets/meowmeowmeowmeowmeow/gtsrb-german-traffic-sign")
    print("2. Extract to 'data/' directory")
    print("3. Run this notebook again")
    print("\nFor now, we'll create sample data for demonstration.")
    dataset_available = False


Dataset paths configured:
Training CSV: data\Train.csv
Training Images: data\Train
Test CSV: data\Test.csv
Test Images: data\Test

Dataset not found!
To use this notebook with real data:
1. Download GTSRB dataset from: https://www.kaggle.com/datasets/meowmeowmeowmeowmeow/gtsrb-german-traffic-sign
2. Extract to 'data/' directory
3. Run this notebook again

For now, we'll create sample data for demonstration.
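
`TrafficSignDataProcessor.load_data_from_csv` lives in a custom module that is not shown here. As a rough sketch of the first step such a loader might perform, the snippet below parses an index CSV into (image path, class id) pairs. It assumes the CSV has `Path` and `ClassId` columns, as in the Kaggle GTSRB release; the helper name `read_index` is hypothetical, and image decoding and resizing are left out.

```python
import csv
import io
import os


def read_index(csv_file, images_root):
    """Return a list of (image_path, class_id) pairs from an index CSV.

    Assumes 'Path' and 'ClassId' columns; this is an illustrative helper,
    not the notebook's actual loader.
    """
    pairs = []
    for row in csv.DictReader(csv_file):
        # Paths in the CSV are assumed to be relative to images_root.
        image_path = os.path.join(images_root, row["Path"])
        pairs.append((image_path, int(row["ClassId"])))
    return pairs


# Tiny in-memory CSV standing in for data/Train.csv.
sample_csv = io.StringIO("ClassId,Path\n20,Train/20/a.png\n0,Train/0/b.png\n")
index = read_index(sample_csv, "data")
labels = [class_id for _, class_id in index]
```

Keeping the index step separate from image decoding makes it cheap to check label counts before any pixels are read.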


In [3]:
# Create sample data if dataset not available
if not dataset_available:
    print("Creating sample data for demonstration...")
    
    # Create sample training data
    X_train_full = np.random.random((1000, 64, 64, 3)).astype(np.float32)
    y_train_full = np.random.randint(0, 43, 1000)
    
    # Create sample test data
    X_test = np.random.random((200, 64, 64, 3)).astype(np.float32)
    y_test = np.random.randint(0, 43, 200)
    
    print(f"Sample training data created: {X_train_full.shape}")
    print(f"Sample test data created: {X_test.shape}")
    print(f"Number of classes: {len(np.unique(y_train_full))}")
else:
    # Load real data
    print("Loading training data...")
    X_train_full, y_train_full = processor.load_data_from_csv(train_csv_path, train_images_path)
    
    print("Loading test data...")
    X_test, y_test = processor.load_data_from_csv(test_csv_path, test_images_path)
    
    print(f"Training data loaded: {X_train_full.shape}")
    print(f"Test data loaded: {X_test.shape}")
    print(f"Number of classes: {len(np.unique(y_train_full))}")


Creating sample data for demonstration...
Sample training data created: (1000, 64, 64, 3)
Sample test data created: (200, 64, 64, 3)
Number of classes: 43
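
Because the sample labels are drawn uniformly at random, they carry no signal: a classifier trained on them should stay near chance accuracy, which for 43 classes is 1/43, about 2.3%. A small sketch of that sanity check follows. It uses a local generator rather than the notebook's global seed, and the variable names are illustrative.

```python
import numpy as np

num_classes = 43
rng = np.random.default_rng(0)  # local generator; seed chosen arbitrarily

# Random labels, mirroring the shape of the sample training labels.
labels = rng.integers(0, num_classes, size=1000)

# Per-class counts; minlength keeps a slot for classes that never occur.
counts = np.bincount(labels, minlength=num_classes)

# Accuracy of always predicting the most frequent class, and pure chance.
majority_baseline = counts.max() / labels.size
chance_accuracy = 1.0 / num_classes

print(f"chance accuracy: {chance_accuracy:.3f}")
print(f"majority-class baseline: {majority_baseline:.3f}")
```

Any model evaluated on the sample data should be compared against these two numbers, not against results reported for the real dataset.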


## 🌐 Test Web Application

The web application is now ready to use! Here's how to test it:

1. **Run the Streamlit app**: `streamlit run streamlit_app.py`
2. **Open browser**: Go to `http://localhost:8501`
3. **Load a model**: Select a model type and click "Load Model"
4. **Upload image**: Upload a traffic sign image for prediction
5. **View results**: See predictions with confidence scores

**Note**: The models created in this notebook are sample models. For real predictions, train with the actual GTSRB dataset.
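
The app itself (`streamlit_app.py`) is not shown in this notebook. As a sketch of how the confidence scores in step 5 could be derived from a model's softmax output, the hypothetical helper `top_k_predictions` below ranks classes by probability. The probabilities are made up for illustration.

```python
import numpy as np


def top_k_predictions(probabilities, k=3):
    """Return the k most likely (class_id, confidence) pairs, best first."""
    probabilities = np.asarray(probabilities, dtype=np.float64)
    # argsort is ascending, so reverse it and keep the first k indices.
    top = np.argsort(probabilities)[::-1][:k]
    return [(int(i), float(probabilities[i])) for i in top]


# Made-up softmax output over 5 classes, purely for illustration.
probs = [0.05, 0.70, 0.10, 0.12, 0.03]
ranked = top_k_predictions(probs, k=3)
```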


In [4]:
# Reload the real dataset only when it is available; without this guard
# the cell raises FileNotFoundError when data/ is missing.
if dataset_available:
    print("🔄 Loading training data...")
    X_train_full, y_train_full = processor.load_data_from_csv(train_csv_path, train_images_path)

    print("🔄 Loading test data...")
    X_test, y_test = processor.load_data_from_csv(test_csv_path, test_images_path)

    print(f"✅ Training data loaded: {X_train_full.shape}")
    print(f"✅ Test data loaded: {X_test.shape}")
    print(f"✅ Number of classes: {len(np.unique(y_train_full))}")
else:
    print("Dataset not available; keeping the sample data created above.")


Dataset not available; keeping the sample data created above.

In [None]:
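# Split the training data into train, validation, and test sets.
# split_data is expected to return six arrays, so all six must be unpacked;
# unpacking into four names raises "too many values to unpack (expected 4)".
# Note: X_test and y_test from the earlier loading step are overwritten here.
X_train, X_val, X_test, y_train, y_val, y_test = processor.split_data(
    X_train_full, y_train_full,
    test_size=0.2, val_size=0.2, random_state=42
)

# Encode labels to categorical (one-hot) format
y_train_encoded, y_val_encoded, y_test_encoded = processor.encode_labels(
    y_train, y_val, y_test
)

print("Data split completed:")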
print(f"  Training: {X_train.shape[0]} samples")
print(f"  Validation: {X_val.shape[0]} samples")
print(f"  Test: {X_test.shape[0]} samples")
print(f"  Image shape: {X_train.shape[1:]}")
print(f"  Number of classes: {y_train_encoded.shape[1]}")


Train set: 600 samples
Validation set: 200 samples
Test set: 200 samples


ValueError: too many values to unpack (expected 4)
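
The 600/200/200 counts printed above are consistent with `test_size` and `val_size` both being fractions of the full dataset. `split_data` and `encode_labels` are defined in a module not shown here, so the following is only a sketch of one way to get the same proportions and a one-hot encoding with plain NumPy. The function names `three_way_split` and `one_hot` are hypothetical, and the real methods may differ (for example by stratifying on the labels).

```python
import numpy as np


def three_way_split(X, y, test_size=0.2, val_size=0.2, seed=42):
    """Shuffle once, then slice into train / validation / test.

    Both sizes are treated as fractions of the full dataset (an assumption).
    """
    n = len(X)
    order = np.random.default_rng(seed).permutation(n)
    n_test = int(round(n * test_size))
    n_val = int(round(n * val_size))
    test_idx = order[:n_test]
    val_idx = order[n_test:n_test + n_val]
    train_idx = order[n_test + n_val:]
    return (X[train_idx], X[val_idx], X[test_idx],
            y[train_idx], y[val_idx], y[test_idx])


def one_hot(labels, num_classes):
    """Turn integer class ids into one-hot rows of length num_classes."""
    return np.eye(num_classes, dtype=np.float32)[labels]


# Small stand-in arrays with the same layout as the sample data.
X = np.zeros((1000, 8, 8, 3), dtype=np.float32)
y = np.arange(1000) % 43

# Six return values, so six names on the left-hand side.
X_tr, X_va, X_te, y_tr, y_va, y_te = three_way_split(X, y)
y_tr_encoded = one_hot(y_tr, num_classes=43)
```

Returning the arrays in the order train, validation, test for features and then labels matches the unpacking used in the cell above, which is what avoids the unpacking error.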