# OBJECT DETECTION AND VEHICLE COUNTER

**Name :** Bagas Dwi Santosa

**E-mail :** bagasdwisantosa87@gmail.com

**Git Repository :**


**Results & Problems:**
- Successfully reads the input video file (`toll_gate.mp4`)
- Successfully detects cars and buses
- Displays the vehicle count, but the totals are not yet correct because detections are counted per frame, so a vehicle is counted again in every frame in which it appears
- Displays a unique ID, but IDs are assigned per frame, so the same vehicle receives a new ID in each frame
- Successfully visualizes the vehicle count
- Successfully writes the output video

## Install & Import Library

In [63]:
!pip install ultralytics

from IPython import display
display.clear_output()

import ultralytics
ultralytics.checks()

Ultralytics YOLOv8.2.77 🚀 Python-3.10.12 torch-2.3.1+cu121 CUDA:0 (Tesla T4, 15102MiB)
Setup complete ✅ (2 CPUs, 12.7 GB RAM, 33.7/78.2 GB disk)


In [64]:
import os
import matplotlib.pyplot as plt

from ultralytics import YOLO

from IPython.display import display, Image
import cv2
from google.colab.patches import cv2_imshow

## Download Videos

In [65]:
!gdown --fuzzy https://drive.google.com/file/d/1m1bNzP2W50rLUMrevp7wVRsd9ABRwzKn/view?usp=drive_link

Downloading...
From: https://drive.google.com/uc?id=1m1bNzP2W50rLUMrevp7wVRsd9ABRwzKn
To: /content/toll_gate.mp4
100% 1.34M/1.34M [00:00<00:00, 52.1MB/s]


## Perform Detection
The `perform_detection` function is a utility that runs object detection on a single image frame using the given detection model.

In [66]:
def perform_detection(frame, model):
    """Perform object detection on the frame."""
    results = model(frame)
    return results
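
The detector reports every class it finds (persons, traffic lights, and so on) and logs one line per frame. A minimal sketch of a variant that restricts detection to the classes of interest and silences the log is shown below. The name `perform_detection_filtered` is hypothetical, and the sketch assumes the Ultralytics prediction arguments `classes` and `verbose` behave as documented for the installed version.

```python
def perform_detection_filtered(frame, model, class_ids):
    """Run detection, keeping only the given class IDs and silencing per-frame logs."""
    # `classes` restricts predictions to the listed class IDs;
    # `verbose=False` suppresses the per-image log lines.
    # Both are assumed to be accepted by the model call in this Ultralytics version.
    return model(frame, classes=list(class_ids), verbose=False)
```

With `desired_classes = {2: 'Car', 5: 'Bus'}`, calling `perform_detection_filtered(frame, model, desired_classes)` would pass `classes=[2, 5]` to the model.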

## Count Vehicles
The `count_vehicles` function counts the vehicles of the desired classes in a single frame.

In [67]:
def count_vehicles(results, desired_classes):
    """Count the number of vehicles based on the desired classes for a single frame."""
    counts = {class_id: 0 for class_id in desired_classes}

    for result in results:
        boxes = result.boxes
        for box in boxes:
            class_id = int(box.cls[0])
            if class_id in desired_classes:
                counts[class_id] += 1

    return counts
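
To illustrate why summing these per-frame counts overstates the total, the sketch below feeds `count_vehicles` three identical frames that each contain one car and one bus. The `fake_results` helper is hypothetical: it only mimics the `result.boxes` / `box.cls` attributes that the function reads, and it is not part of Ultralytics.

```python
from types import SimpleNamespace

def count_vehicles(results, desired_classes):
    """Count detections of the desired classes in a single frame."""
    counts = {class_id: 0 for class_id in desired_classes}
    for result in results:
        for box in result.boxes:
            class_id = int(box.cls[0])
            if class_id in desired_classes:
                counts[class_id] += 1
    return counts

def fake_results(class_ids):
    """Build a stand-in for one frame's results (hypothetical helper)."""
    boxes = [SimpleNamespace(cls=[cid]) for cid in class_ids]
    return [SimpleNamespace(boxes=boxes)]

desired_classes = {2: 'Car', 5: 'Bus'}

# One car and one bus stay in view for three consecutive frames.
# Class 0 (person) is not in desired_classes and is ignored.
frames = [fake_results([2, 5, 0]) for _ in range(3)]

overall = {class_id: 0 for class_id in desired_classes}
for results in frames:
    for class_id, count in count_vehicles(results, desired_classes).items():
        overall[class_id] += count

print(overall)  # {2: 3, 5: 3}
```

Only one car and one bus are present, yet the running total reaches three of each, because every frame contributes its own count.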

## Bounding Box Vehicle
The `draw_bounding_boxes` function draws bounding boxes and labels on an image frame to mark the detected objects. Each detection takes the next unused ID from `id_tracker`, so IDs are not stable across frames.

In [68]:
def draw_bounding_boxes(frame, results, desired_classes, id_tracker):
    """Draw bounding boxes, labels, and unique IDs on the frame."""
    for result in results:
        boxes = result.boxes
        for box in boxes:
            x1, y1, x2, y2 = box.xyxy[0]
            class_id = int(box.cls[0])
            confidence = box.conf[0]

            if class_id in desired_classes:
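                label = f'{desired_classes[class_id]} ({confidence:.2f})'
                # Take the next unused ID; fall back to '?' once the pool is
                # exhausted so that pop(0) cannot raise IndexError
                id_pool = id_tracker[class_id]
                unique_id = id_pool.pop(0) if id_pool else '?'
                full_label = f'{label} {unique_id}'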

                cv2.rectangle(frame, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 0), 2)
                cv2.putText(frame, full_label, (int(x1), int(y1) - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)
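
For a label to stay attached to the same vehicle, it has to be keyed on something that persists between frames. The sketch below assumes that a tracker supplies a `track_id` for each detection, which the code above does not do. The function `label_for` and the `assigned` dictionary are hypothetical names used only for illustration.

```python
def label_for(track_id, class_id, assigned, prefixes):
    """Return a stable label such as 'C1' for a tracked object."""
    key = (class_id, track_id)
    if key not in assigned:
        # Number labels per class in first-seen order.
        seen_in_class = sum(1 for (cid, _) in assigned if cid == class_id)
        assigned[key] = f'{prefixes[class_id]}{seen_in_class + 1}'
    return assigned[key]

prefixes = {2: 'C', 5: 'B'}  # same prefixes as the car and bus IDs above
assigned = {}

# (class_id, track_id) pairs as a tracker might report them over several frames.
for class_id, track_id in [(2, 11), (5, 12), (2, 11), (5, 12), (2, 15)]:
    print(label_for(track_id, class_id, assigned, prefixes))
# Prints: C1, B1, C1, B1, C2 (one per line)
```

The car with track ID 11 keeps the label `C1` each time it reappears, and a new label is issued only when a new track ID shows up.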

## Detection & Display
The `detect_and_display_video` function detects objects in a video, draws bounding boxes on the detected objects, counts vehicles of the desired classes, and saves the annotated video.

In [69]:
def detect_and_display_video(video_path, output_path):
    """Detect objects in the video and save the annotated video, counting vehicles overall."""
    # Load YOLOv8 model
    model = YOLO("yolov8n.pt")

    # Define desired classes
    desired_classes = {2: 'Car', 5: 'Bus'}

    # Initialize overall counts
    overall_counts = {class_id: 0 for class_id in desired_classes}

    # Initialize unique ID trackers
    id_tracker = {
        2: [f'C{i+1}' for i in range(1000)],  # Car IDs
        5: [f'B{i+1}' for i in range(1000)]   # Bus IDs
    }

    # Open video file
    video_capture = cv2.VideoCapture(video_path)

    # Get video properties
    frame_width = int(video_capture.get(cv2.CAP_PROP_FRAME_WIDTH))
    frame_height = int(video_capture.get(cv2.CAP_PROP_FRAME_HEIGHT))
    fps = video_capture.get(cv2.CAP_PROP_FPS)

    # Define codec and create VideoWriter object
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')  # or use 'XVID' for .avi files
    out = cv2.VideoWriter(output_path, fourcc, fps, (frame_width, frame_height))

    while True:
        # Read frame-by-frame
        ret, frame = video_capture.read()

        if not ret:
            break

        # Perform detection
        results = perform_detection(frame, model)

        # Count vehicles for the current frame
        frame_counts = count_vehicles(results, desired_classes)

        # Update overall counts
        for class_id, count in frame_counts.items():
            overall_counts[class_id] += count

        # Draw bounding boxes, labels, and unique IDs
        draw_bounding_boxes(frame, results, desired_classes, id_tracker)

        # Create summary text
        summary_text = ' '.join([f'{name}: {overall_counts[class_id]}' for class_id, name in desired_classes.items()])

        # Put summary text on the frame
        cv2.putText(frame, summary_text, (10, frame_height - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.75, (255, 0, 0), 2)

        # Write the frame to the output video file
        out.write(frame)

    # Release video capture and writer objects
    video_capture.release()
    out.release()
    cv2.destroyAllWindows()
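
One possible way to turn the per-frame totals into a count of distinct vehicles is to remember which tracked objects have already been seen. The sketch below is not the method used above. It assumes each detection arrives as a `(class_id, track_id)` pair from a tracker, for example Ultralytics' `model.track(frame, persist=True)`, and that a missing ID is reported as `None`. The function name `update_unique_counts` is hypothetical.

```python
def update_unique_counts(seen_ids, detections):
    """Record each (class_id, track_id) once and return per-class totals."""
    for class_id, track_id in detections:
        # Skip classes we do not count and detections without a track ID.
        if class_id in seen_ids and track_id is not None:
            seen_ids[class_id].add(track_id)
    return {class_id: len(ids) for class_id, ids in seen_ids.items()}

seen_ids = {2: set(), 5: set()}  # 2 = car, 5 = bus

# Detections for three frames (made-up values for illustration).
frames = [
    [(2, 11), (5, 12)],
    [(2, 11), (5, 12), (0, 13)],    # class 0 (person) is ignored
    [(2, 15), (5, 12), (2, None)],  # a second car appears; an untracked box is skipped
]

for detections in frames:
    totals = update_unique_counts(seen_ids, detections)

print(totals)  # {2: 2, 5: 1}
```

Because a set stores each track ID once, a vehicle that stays in view for many frames contributes one to the total rather than one per frame.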

## Main Process

In [70]:
video_path = '/content/toll_gate.mp4'
output_path = '/content/output_toll_gate.mp4'
detect_and_display_video(video_path, output_path)


0: 384x640 4 persons, 1 car, 1 bus, 2 traffic lights, 9.2ms
Speed: 2.0ms preprocess, 9.2ms inference, 2.1ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 persons, 1 bus, 2 traffic lights, 6.5ms
Speed: 2.0ms preprocess, 6.5ms inference, 1.1ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 persons, 1 bus, 2 traffic lights, 6.4ms
Speed: 2.0ms preprocess, 6.4ms inference, 1.1ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 persons, 1 car, 1 bus, 2 traffic lights, 6.4ms
Speed: 1.8ms preprocess, 6.4ms inference, 1.1ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 persons, 1 car, 1 bus, 2 traffic lights, 6.4ms
Speed: 2.0ms preprocess, 6.4ms inference, 1.1ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 persons, 1 car, 1 bus, 2 traffic lights, 6.3ms
Speed: 1.7ms preprocess, 6.3ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 persons, 1 car, 1 bus, 2 traffic lights, 6.4ms
Speed: 2.5m