# Practical 4: License Plate Recognition

This notebook implements a prototype for recognizing vehicle license plates in video. The goals are to detect and track people and vehicles, recognize the license plates visible on those vehicles, and export the results as an annotated video and a CSV file.

## Objectives

The practical focuses on building an object detection and recognition system that meets the following requirements:

- Detection and tracking: identify and track the people and vehicles present in the video.
- License plate recognition: detect the license plates on vehicles and read their text with OCR.
- Total count per class: keep a cumulative count of each type of object detected.
- Export of results: generate a video that visualizes the results and a CSV file with the details of every detection.
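
The cumulative count in the third objective only makes sense if each tracked object is counted once, not once per frame. A minimal sketch of that idea, assuming the tracker supplies a stable integer ID per object; `count_unique`, the tuple layout, and the sample detections are hypothetical and exist only for illustration:

```python
from collections import Counter, defaultdict


def count_unique(detections):
    """Count each (class, track ID) pair once across all frames.

    `detections` is an iterable of (frame_number, class_name, track_id)
    tuples; this layout is an assumption made for the example.
    """
    seen_ids = defaultdict(set)  # class name -> track IDs already counted
    totals = Counter()           # class name -> number of unique objects
    for _frame, class_name, track_id in detections:
        if track_id not in seen_ids[class_name]:
            seen_ids[class_name].add(track_id)
            totals[class_name] += 1
    return totals


# Made-up detections: car 7 appears in three frames, person 3 in two.
sample = [
    (1, "car", 7), (1, "person", 3),
    (2, "car", 7), (2, "person", 3),
    (3, "car", 7), (3, "car", 9),
]
totals = count_unique(sample)
print(dict(totals))
```

Keeping one set of IDs per class means an ID that the tracker happens to reuse for a different class is still counted for that class.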

## Environment setup

In [1]:
import cv2
import time
import csv
from collections import defaultdict, Counter
from ultralytics import YOLO
import easyocr

In [7]:
def initialize_model(model_path):
    """Initialize the YOLO model for detection."""
    return YOLO(model_path)

def initialize_reader():
    """Initialize the EasyOCR reader."""
    return easyocr.Reader(['en'])  

def initialize_video_writer(cap, output_video_path):
    """Set up the video writer for the processed video."""
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')  # Codec
    fps = cap.get(cv2.CAP_PROP_FPS)
    frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    return cv2.VideoWriter(output_video_path, fourcc, fps, (frame_width, frame_height))

def write_csv_header(csv_file_path):
    """Prepare CSV file for logging."""
    with open(csv_file_path, mode='w', newline='') as file:
        writer = csv.writer(file)
        writer.writerow(['frame', 'object_type', 'confidence', 'tracking_id', 'x1', 'y1', 'x2', 'y2',
                         'license_plate_confidence', 'mx1', 'my1', 'mx2', 'my2', 'license_plate_text'])

def put_text(frame, text, position, color=(0, 255, 0), font_scale=0.6, thickness=2, bg_color=(0, 0, 0)):
    """Helper function to put text with background on the frame."""
    text_size = cv2.getTextSize(text, cv2.FONT_HERSHEY_SIMPLEX, font_scale, thickness)[0]
    text_x, text_y = position
    box_coords = ((text_x, text_y - text_size[1] - 5), (text_x + text_size[0] + 5, text_y + 5))
    cv2.rectangle(frame, box_coords[0], box_coords[1], bg_color, cv2.FILLED)
    cv2.putText(frame, text, position, cv2.FONT_HERSHEY_SIMPLEX, font_scale, color, thickness)

In [25]:
# Parameters 
video_path = 'C0142.mp4'  # Path to input video
model_path = 'yolo11n.pt'  # Path to YOLO model
license_plate_detector_model_path = 'runs/detect/license_plate_detector/weights/best.pt'  # Path to license plate detector model
output_video_path = 'output_video.mp4'  # Path to save the annotated output video
csv_file_path = 'detection_tracking_log.csv'  # Path to save the CSV log file
show_video = True  # Set to True to display the video while processing
classes_to_detect = [0, 1, 2, 3, 5]  # Class IDs to detect (e.g., [0, 2] for person and car)

model = initialize_model(model_path)
license_plate_detector = YOLO(license_plate_detector_model_path)
reader = initialize_reader()

# Open the video file and set up output for processed video
cap = cv2.VideoCapture(video_path)
if not cap.isOpened():
    raise IOError(f"Could not open video: {video_path}")
# The codec, FPS and frame size are read from `cap` inside initialize_video_writer
out = initialize_video_writer(cap, output_video_path)
write_csv_header(csv_file_path)

# Define class names and colors for display
class_names = {
    0: "person",
    1: "bicycle",
    2: "car",
    3: "motorbike",
    5: "bus"
}
class_colors = {
    0: (255, 0, 0),
    1: (0, 255, 0),
    2: (0, 0, 255),
    3: (255, 255, 0),
    5: (0, 255, 255)
}

# Persistent total count of each class across all frames
total_class_count = Counter()
# Track unique IDs for each class to count only once
seen_ids = defaultdict(set)
frame_number = 0  # Initialize frame counter

# Loop through each frame
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    start_time = time.time()
    frame_number += 1

    # Run YOLO detection and tracking
    results = model.track(frame, persist=True, classes=classes_to_detect)
    current_frame_count = Counter()

    # Process detections
    for result in results:
        boxes = result.boxes

        for box in boxes:
            x1, y1, x2, y2 = map(int, box.xyxy[0])
            cls = int(box.cls[0])
            confidence = round(float(box.conf[0]), 2)

            if box.id is not None:
                track_id = int(box.id[0].tolist())
                if track_id not in seen_ids[cls]:
                    seen_ids[cls].add(track_id)
                    total_class_count[class_names[cls]] += 1

                # Draw bounding box and label
                color = class_colors.get(cls, (0, 255, 0))
                cv2.rectangle(frame, (x1, y1), (x2, y2), color, 3)
                put_text(frame, f"{class_names[cls]} {confidence}", (x1, y1 - 10), color=color)
                put_text(frame, f"ID: {track_id}", (x1, y2 + 20), color=color)

                # License plate recognition for motor vehicles (car, motorbike, bus)
                license_plate_text = ""
                plate_confidence = None
                mx1, my1, mx2, my2 = None, None, None, None

                # Search for a plate only when the vehicle crop is large enough and the
                # vehicle detection is confident enough. Detections that fail these checks
                # skip the plate search but are still logged to the CSV and counted below.
                min_plate_size = 100  # Minimum crop width and height, in pixels
                min_vehicle_confidence = 0.7
                vehicle_img = frame[y1:y2, x1:x2]  # Crop the vehicle area to search for license plate
                if (class_names[cls] in ["car", "motorbike", "bus"]
                        and vehicle_img.shape[0] >= min_plate_size
                        and vehicle_img.shape[1] >= min_plate_size
                        and confidence >= min_vehicle_confidence):
                    
                    # Run license plate detector model on the cropped vehicle image
                    plate_results = license_plate_detector.predict(vehicle_img)

                    # Process license plate detection results
                    if plate_results and len(plate_results[0].boxes) > 0:
                        for plate_box in plate_results[0].boxes:
                            # Get bounding box coordinates for the license plate, adjusted to the frame's coordinates
                            px1, py1, px2, py2 = map(int, plate_box.xyxy[0])
                            px1, py1, px2, py2 = px1 + x1, py1 + y1, px2 + x1, py2 + y1  # Adjust to the car's bounding box position
                                                        
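                            # Crop the plate region before drawing, so the rectangle border
                            # does not end up inside the image passed to OCR
                            license_plate_roi = frame[py1:py2, px1:px2].copy()
                            if license_plate_roi.size == 0:
                                continue  # Degenerate box, nothing to read

                            # Draw bounding box for license plate
                            background_color = (255, 255, 255)  # White background for contrast
                            cv2.rectangle(frame, (px1, py1), (px2, py2), background_color, 2)

                            # Extract the license plate text using OCR
                            plate_ocr_results = reader.readtext(license_plate_roi, allowlist='0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ')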
                            
                            if plate_ocr_results:
                                license_plate_text = plate_ocr_results[0][-2]
                                plate_confidence = round(plate_ocr_results[0][-1], 2)

                                # Draw label for license plate in high-contrast format
                                high_contrast_color = (0, 0, 0)  # Black text
                                put_text(frame, f"Plate: {license_plate_text}", (px1, py2 + 20), color=high_contrast_color, bg_color=background_color)
                                
                                # Save coordinates for CSV logging
                                mx1, my1, mx2, my2 = px1, py1, px2, py2

                # Write to CSV
                with open(csv_file_path, mode='a', newline='') as file:
                    writer = csv.writer(file)
                    writer.writerow([frame_number, class_names[cls], confidence, track_id, x1, y1, x2, y2,
                                     plate_confidence, mx1, my1, mx2, my2, license_plate_text])

                current_frame_count[class_names[cls]] += 1

    # Display counts and FPS
    y_offset = 30
    for cls, count in total_class_count.items():
        put_text(frame, f"Total {cls}: {count}", (10, y_offset))
        y_offset += 20

    for cls, count in current_frame_count.items():
        put_text(frame, f"Frame {cls}: {count}", (10, y_offset), color=(255, 0, 0))
        y_offset += 20

    fps_calc = 1.0 / (time.time() - start_time)
    put_text(frame, f"FPS: {fps_calc:.2f}", (10, y_offset), color=(255, 0, 0))

    # Write frame to output video
    out.write(frame)

    # Optionally display the frame
    if show_video:
        cv2.imshow('Detection and Tracking', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

# Release resources
cap.release()
out.release()
cv2.destroyAllWindows()


0: 384x640 1 car, 1 bus, 37.0ms
Speed: 1.6ms preprocess, 37.0ms inference, 0.6ms postprocess per image at shape (1, 3, 384, 640)

0: 544x640 (no detections), 37.9ms
Speed: 1.7ms preprocess, 37.9ms inference, 0.2ms postprocess per image at shape (1, 3, 544, 640)

0: 384x640 1 car, 1 bus, 28.4ms
Speed: 1.5ms preprocess, 28.4ms inference, 0.7ms postprocess per image at shape (1, 3, 384, 640)

0: 544x640 (no detections), 109.2ms
Speed: 1.6ms preprocess, 109.2ms inference, 0.2ms postprocess per image at shape (1, 3, 544, 640)

0: 384x640 1 car, 1 bus, 33.5ms
Speed: 1.6ms preprocess, 33.5ms inference, 0.6ms postprocess per image at shape (1, 3, 384, 640)

0: 512x640 (no detections), 37.1ms
Speed: 1.4ms preprocess, 37.1ms inference, 0.2ms postprocess per image at shape (1, 3, 512, 640)

0: 384x640 1 car, 1 bus, 28.7ms
Speed: 1.4ms preprocess, 28.7ms inference, 0.5ms postprocess per image at shape (1, 3, 384, 640)

0: 512x640 (no detections), 75.2ms
Speed: 1.4ms preprocess, 75.2ms inference, 

KeyboardInterrupt: 
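
In the main loop above, the plate detector runs on the vehicle crop, so the boxes it returns are relative to that crop and have to be shifted by the vehicle box origin before they can be drawn on the full frame. A minimal sketch of that translation, with clamping to the frame added as an extra safeguard that the notebook does not perform; `crop_to_frame` and `is_valid_roi` are hypothetical helper names and the numbers are made up:

```python
def crop_to_frame(box, crop_origin, frame_size):
    """Translate an (x1, y1, x2, y2) box from crop to frame coordinates.

    `crop_origin` is the (x, y) of the crop's top-left corner in the frame,
    and `frame_size` is (width, height). The result is clamped to the frame.
    """
    x1, y1, x2, y2 = box
    origin_x, origin_y = crop_origin
    width, height = frame_size
    frame_x1 = min(max(x1 + origin_x, 0), width)
    frame_y1 = min(max(y1 + origin_y, 0), height)
    frame_x2 = min(max(x2 + origin_x, 0), width)
    frame_y2 = min(max(y2 + origin_y, 0), height)
    return frame_x1, frame_y1, frame_x2, frame_y2


def is_valid_roi(box):
    """Return True when the box encloses at least one pixel."""
    x1, y1, x2, y2 = box
    return x2 > x1 and y2 > y1


# A plate found at (5, 10)-(50, 30) inside a vehicle crop that starts at (100, 200)
plate_in_frame = crop_to_frame((5, 10, 50, 30), (100, 200), (1920, 1080))

# A box pushed against the bottom-right corner collapses and should be skipped
collapsed = crop_to_frame((5, 10, 50, 30), (1900, 1070), (1920, 1080))
print(plate_in_frame, is_valid_roi(plate_in_frame), is_valid_roi(collapsed))
```

Checking the box with something like `is_valid_roi` before slicing the frame avoids passing an empty image to the OCR reader.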

### Results

This section presents the results. We load the CSV file to review how many detections were logged for each object type (one row per detection per frame) and to inspect the license plate columns.

In [4]:
# Load the results CSV file
import pandas as pd

results_df = pd.read_csv('detection_tracking_log.csv')
print("Detections per class:")
print(results_df['object_type'].value_counts())

print("\nSample license plate detection data:")
display(results_df[results_df['object_type'] == 'car'].head())

Detections per class:
object_type
car          11106
person        2929
bus            223
motorbike       50
bicycle         40
Name: count, dtype: int64

Sample license plate detection data:


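| | frame | object_type | confidence | tracking_id | x1 | y1 | x2 | y2 | license_plate_confidence | mx1 | my1 | mx2 | my2 | license_plate_text |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | car | 0.84 | 2 | 1321 | 312 | 1398 | 371 | | | | | | |
| 3 | 2 | car | 0.81 | 2 | 1321 | 312 | 1397 | 371 | | | | | | |
| 5 | 3 | car | 0.82 | 2 | 1320 | 312 | 1395 | 371 | | | | | | |
| 7 | 4 | car | 0.84 | 2 | 1318 | 312 | 1395 | 371 | | | | | | |
| 9 | 5 | car | 0.82 | 2 | 1317 | 312 | 1394 | 371 | | | | | | |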
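
Because the log holds one row per detection per frame, the same vehicle can accumulate several different OCR readings over its lifetime. One possible post-processing step, which is not part of the notebook's pipeline, is a majority vote per tracking ID. The sketch below assumes the column names from the CSV header; `most_common_plate_per_track` is a hypothetical helper and the rows are synthetic, not taken from the actual log:

```python
import csv
import io
from collections import Counter, defaultdict


def most_common_plate_per_track(rows):
    """Return {tracking_id: most frequent non-empty plate text}."""
    votes = defaultdict(Counter)  # tracking ID -> plate text -> occurrences
    for row in rows:
        text = (row.get("license_plate_text") or "").strip()
        if text:
            votes[row["tracking_id"]][text] += 1
    return {track_id: counter.most_common(1)[0][0]
            for track_id, counter in votes.items()}


# Synthetic log excerpt: track 2 is read three times, once with an OCR error;
# track 5 never yields a plate reading.
synthetic_csv = """frame,object_type,tracking_id,license_plate_text
1,car,2,1234ABC
2,car,2,1234A8C
3,car,2,1234ABC
3,bus,5,
"""
rows = list(csv.DictReader(io.StringIO(synthetic_csv)))
plates = most_common_plate_per_track(rows)
print(plates)
```

Tracks without any reading are simply absent from the result, so the caller can tell "no plate read" apart from an empty string.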


## Conclusion

This practical produced a working prototype that can:

- Detect and track people and vehicles in video.
- Detect and read vehicle license plates using a YOLO model and OCR.
- Export the visual results as a video and the detection data as a CSV file.

The prototype is a useful tool for automated video analysis in monitoring and security applications, and it leaves room for future improvements in the speed and accuracy of license plate OCR.