<a href="https://colab.research.google.com/github/Youssef-T-Moustafa/Object-Detection-Model/blob/main/Object_Detection_System.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

-------
# **OBJECT DETECTION PROJECT**
## **Made by:** *Youssef Moustafa*
-------



# Table of Contents

| Number | Content                                               |
|--------|-------------------------------------------------------|
| 1      | [Project Overview](#project-overview)                |
| 2      | [Project Setup](#project-setup)                      |
| 3      | [Import Dependencies](#import-dependencies)          |
| 4      | [Clone Darknet Repo and Install Dependencies](#clone-darknet-repo-and-install-dependencies) |
| 5      | [Enable GPU, OPENCV, and LIBSO in Darknet Makefile](#enable-gpu-opencv-and-libso-in-darknet-makefile) |
| 6      | [Download and Upload Darknet Files](#download-and-upload-darknet-files) |
| 7      | [YOLOv4 Object Detection on Sample Image](#yolov4-object-detection-on-sample-image) |
| 8      | [Webcam Photo Capture with YOLOv4 Object Detection](#webcam-photo-capture-with-yolov4-object-detection) |
| 9      | [Image Upload for YOLOv4 Object Detection](#image-upload-for-yolov4-object-detection) |
| 10     | [Live Webcam Object Detection with YOLOv4](#live-webcam-object-detection-with-yolov4) |
| 11     | [Conclusion](#conclusion)                            |
| 12     | [Future Improvements](#future-improvements)          |


----
**Link to Drive:** https://drive.google.com/drive/folders/13VMil-1IZN_CcH5tpZRp4wkoffPuSTeW?usp=sharing

----

# **Object Detection System**

The Object Detection System is a Google Colab notebook that runs YOLOv4 object detection on a sample image, uploaded images, webcam photos, and a live webcam feed.

# **Project Overview**

The system encompasses the following key components and functionalities:

1. **Darknet Integration:**
   - Utilizes the Darknet framework, a neural network library, and its Python bindings for seamless integration with the YOLOv4 architecture.

2. **Webcam Capture and Image Upload:**
   - Provides flexibility for users to capture live images from their webcam or upload images directly from their local machine for object detection.

3. **Real-time Object Detection:**
   - Implements the YOLOv4 architecture for real-time object detection on both live webcam feeds and uploaded images.

4. **Interactive Visualization:**
   - Dynamically overlays bounding boxes and labels on the video stream, showcasing detected objects in real-time.

5. **User-friendly Experience:**
   - Employs JavaScript and Python interaction to create an intuitive and interactive interface for capturing images, running object detection, and visualizing results.

# **Project Setup**
To ensure the successful execution of the PRISM Object Detection Project, follow these setup steps:

1. **Change Runtime Accelerator to T4 GPU:**
   - Navigate to the Colab notebook menu and select "Runtime."
   - Choose "Change runtime type" and set the hardware accelerator to "GPU" (specifically, T4 GPU).

2. **Download Necessary Files:**
   - Download the following files from the drive link and store them on **your Google Drive**:
     - darknet.py
     - yolov4-csp.weights
   - Copy their paths into the respective cells where they are required.

3. **Mount Google Drive:**
   - Mount your Google Drive to the Colab notebook using the following code snippet:
     ```python
     from google.colab import drive
     drive.mount('/content/drive')
     ```
   - Follow the prompted instructions to authenticate and provide access.

4. **Enable Webcam:**
   - A webcam is required for the webcam photo capture and live object detection sections. Allow the browser to access it when prompted.

Now, your environment is set up, and you're ready to proceed with the PRISM Object Detection Project.
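After changing the runtime type, it can help to confirm that a GPU is actually visible before building Darknet. The following is a minimal sketch, not part of the original notebook: the helper name `gpu_visible` is hypothetical, and it assumes the `nvidia-smi` command-line tool is on the path of GPU runtimes.

```python
import shutil
import subprocess


def gpu_visible() -> bool:
    """Return True if `nvidia-smi -L` runs and lists at least one GPU.

    Hypothetical helper for illustration; it is not defined in the notebook.
    """
    # No nvidia-smi binary usually means a CPU-only runtime.
    if shutil.which("nvidia-smi") is None:
        return False
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    # `nvidia-smi -L` prints one line per device, each starting with "GPU".
    return result.returncode == 0 and "GPU" in result.stdout


print("GPU visible:", gpu_visible())
```

If this prints `False`, the later build with `GPU=1` is likely to fail, so revisiting step 1 first is the safer choice.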

# **Import Dependencies**

In [None]:
# import dependencies
from IPython.display import display, Javascript, Image
from google.colab.output import eval_js
from google.colab.patches import cv2_imshow
from base64 import b64decode, b64encode
import tensorflow as tf
import cv2
import ctypes as ct
import random
import os
import numpy as np
import PIL.Image  # import the submodule explicitly; PIL.Image.fromarray is used later
import io
import html
import time
import shutil
import matplotlib.pyplot as plt
%matplotlib inline

This code block imports essential libraries and modules for the PRISM Object Detection Project:

- **IPython.display:** Functions for displaying outputs, handling JavaScript, and showing images.
- **google.colab.output:** Provides `eval_js` for evaluating JavaScript code from Python.
- **google.colab.patches:** Provides `cv2_imshow` for displaying OpenCV images in Colab.
- **base64:** Functions for encoding and decoding base64 data.
- **tensorflow (as tf):** TensorFlow, a machine learning library for deep learning tasks.
- **cv2:** OpenCV, a computer vision library for image and video processing.
- **ctypes (as ct):** C-compatible data types, useful for low-level operations.
- **random:** Module for generating random numbers, often used in data augmentation.
- **os:** Operating system interaction for file and directory operations.
- **numpy (as np):** NumPy, a numerical computing library for array handling.
- **PIL:** Python Imaging Library, used for image processing tasks.
- **io:** Module for working with streams and file-like objects.
- **html:** Module for HTML-related operations.
- **time:** Module for time-related operations.
- **shutil:** Module for high-level file operations like moving and copying.
- **matplotlib.pyplot (as plt):** Matplotlib, a plotting library for visualizing data.
- **%matplotlib inline:** Jupyter magic command for rendering Matplotlib plots in the notebook.


# **Clone Darknet Repo and Install Dependencies**

In [None]:
# clone darknet repo
!git clone https://github.com/AlexeyAB/darknet

In [None]:
!pip install darknet
!pip install darknet.py
!pip3 install opencv-python
!pip3 install scikit-image
import darknet



This code block performs the following tasks:

1. **Clone Darknet Repository:**
   - Clones the Darknet repository from [https://github.com/AlexeyAB/darknet](https://github.com/AlexeyAB/darknet).

2. **Install Darknet Python Bindings:**
   - Installs the Darknet library using the `darknet` and `darknet.py` Python packages.
   - The `!pip install darknet` and `!pip install darknet.py` commands handle the installation.

3. **Install OpenCV:**
   - Installs the OpenCV library using the `opencv-python` package.

4. **Install scikit-image:**
   - Installs the scikit-image library using the `scikit-image` package.

5. **Import Darknet:**
   - Imports the Darknet library, making its functions and classes available for use in the notebook.

These steps set up the Darknet environment and its Python bindings along with additional dependencies for image processing and computer vision tasks.


# **Enable GPU, OPENCV, and LIBSO in Darknet Makefile**

In [None]:
# change makefile to have GPU, OPENCV and LIBSO enabled
%cd darknet
!sed -i 's/OPENCV=0/OPENCV=1/' Makefile
!sed -i 's/GPU=0/GPU=1/' Makefile
!sed -i 's/CUDNN=0/CUDNN=1/' Makefile
!sed -i 's/CUDNN_HALF=0/CUDNN_HALF=1/' Makefile
!sed -i 's/LIBSO=0/LIBSO=1/' Makefile

In [None]:
#to build darknet (will take a few minutes)
!make



This code block modifies the Darknet Makefile to enable GPU support, OpenCV integration, and build the shared library (`libdarknet.so`).

1. **Change Directory to Darknet:**
   - Switches to the Darknet directory using `%cd darknet`.

2. **Modify Makefile:**
   - Uses `sed` commands to modify the Makefile:
      - Sets `OPENCV=1` to enable OpenCV support.
      - Sets `GPU=1` to enable GPU acceleration.
      - Sets `CUDNN=1` to enable cuDNN library support.
      - Sets `CUDNN_HALF=1` to enable reduced precision (half) computation in cuDNN.
      - Sets `LIBSO=1` to build the shared library (`libdarknet.so`).

3. **Build Darknet:**
   - Executes the `make` command to build Darknet with the specified configurations.

These changes optimize Darknet for GPU usage, enable OpenCV for image processing, and build the shared library for use in Python.
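To make the effect of the `sed` commands concrete, here is a rough Python equivalent applied to a small sample string. This is only a sketch: the function name `enable_flags` and the sample text are assumptions, and unlike the `sed` calls it anchors each match at the start of a line.

```python
import re

# Flags that the notebook switches from 0 to 1 in the Darknet Makefile.
FLAGS = ("OPENCV", "GPU", "CUDNN", "CUDNN_HALF", "LIBSO")


def enable_flags(makefile_text: str, flags=FLAGS) -> str:
    """Rewrite `FLAG=0` to `FLAG=1` for each flag, one line at a time."""
    for flag in flags:
        # Match only at the start of a line so comments and other
        # variables that merely contain the flag name are left untouched.
        pattern = rf"^{re.escape(flag)}=0\b"
        makefile_text = re.sub(
            pattern, f"{flag}=1", makefile_text, flags=re.MULTILINE
        )
    return makefile_text


# Made-up excerpt in the style of the Makefile header.
sample = "GPU=0\nCUDNN=0\nCUDNN_HALF=0\nOPENCV=0\nAVX=0\nLIBSO=0\n"
print(enable_flags(sample))
```

Flags that are not listed, such as `AVX` in the sample, keep their original value.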


# **Download and Upload Darknet Files**
## **(Please Follow the Project Setup!)**


In [None]:
#script_path ='/content/drive/MyDrive/Colab_Notebooks/PRISM_Project/darknet.py'

# PLEASE DOWNLOAD THE "darknet.py" FILE THEN UPLOAD TO DRIVE AND PLACE THE PATH
script_path = 'path/of/darknet.py'

# Use the %run magic command to execute the script
%run $script_path

In [None]:
#drive_weights_path ='/content/drive/MyDrive/Colab_Notebooks/PRISM_Project/yolov4-csp.weights'

#PLEASE DOWNLOAD THE "yolov4-csp.weights" FILE THEN UPLOAD TO DRIVE AND PLACE THE PATH
drive_weights_path = 'path/of/yolov4-csp.weights'

# Destination path in the Colab environment
colab_weights_path = '/content/darknet/yolov4-csp.weights'

# Copy the weights file
shutil.copy(drive_weights_path, colab_weights_path)

This code block facilitates the download and setup of essential Darknet files:

1. **Download "darknet.py" Script:**
   - The script path is specified using `script_path`.
   - It is then executed using the `%run` magic command.

2. **Upload "yolov4-csp.weights" File:**
   - The cell comment asks users to download the "yolov4-csp.weights" file and upload it to their Google Drive.
   - The path of the uploaded file on Google Drive is set as `drive_weights_path`.

3. **Set Destination Paths:**
   - `colab_weights_path` is defined as the destination path for the weights file within the Colab environment.

4. **Copy Weights File:**
   - The `shutil.copy` command is used to copy the weights file from the Google Drive path to the Colab environment.

These steps ensure that the necessary Darknet files are available and properly configured for the project.
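A placeholder path such as `'path/of/yolov4-csp.weights'` makes `shutil.copy` fail with a fairly terse error. One possible way to get a clearer message is to check the source first. The helper below is a sketch under that assumption; the name `copy_weights` is hypothetical and not part of the notebook.

```python
import os
import shutil


def copy_weights(src: str, dst: str) -> str:
    """Copy a weights file and verify that the copy is complete.

    Hypothetical helper; raises FileNotFoundError if `src` does not exist.
    """
    if not os.path.isfile(src):
        raise FileNotFoundError(f"Weights file not found: {src}")
    # Create the destination directory if it is missing.
    os.makedirs(os.path.dirname(dst) or ".", exist_ok=True)
    shutil.copy(src, dst)
    # A size mismatch usually indicates an interrupted copy.
    if os.path.getsize(dst) != os.path.getsize(src):
        raise OSError("Copied file size does not match the source")
    return dst


# Example call in the notebook (paths as defined in the cell above):
# copy_weights(drive_weights_path, colab_weights_path)
```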


# **YOLOv4 Object Detection on Sample Image**

In [None]:
# load in YOLOv4 architecture network
from darknet import *

network, class_names, class_colors = load_network("cfg/yolov4-csp.cfg", "cfg/coco.data", "yolov4-csp.weights")
width = network_width(network)
height = network_height(network)

# darknet helper function to run detection on image
def darknet_helper(img, width, height):
  darknet_image = make_image(width, height, 3)
  img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
  img_resized = cv2.resize(img_rgb, (width, height),
                              interpolation=cv2.INTER_LINEAR)

  # get image ratios to convert bounding boxes to proper size
  img_height, img_width, _ = img.shape
  width_ratio = img_width/width
  height_ratio = img_height/height

  # run model on darknet style image to get detections
  copy_image_from_bytes(darknet_image, img_resized.tobytes())
  detections = detect_image(network, class_names, darknet_image)
  free_image(darknet_image)
  return detections, width_ratio, height_ratio

In [None]:
# run test on person.jpg image that comes with repository
image = cv2.imread("data/person.jpg")
detections, width_ratio, height_ratio = darknet_helper(image, width, height)

for label, confidence, bbox in detections:
  left, top, right, bottom = bbox2points(bbox)
  left, top, right, bottom = int(left * width_ratio), int(top * height_ratio), int(right * width_ratio), int(bottom * height_ratio)
  cv2.rectangle(image, (left, top), (right, bottom), class_colors[label], 2)
  cv2.putText(image, "{} [{:.2f}]".format(label, float(confidence)),
                    (left, top - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.5,
                    class_colors[label], 2)
cv2_imshow(image)



This code block demonstrates the application of YOLOv4 object detection on an example image ("data/person.jpg") using the previously loaded network and helper function:

1. **Load YOLOv4 Architecture Network:**
   - The YOLOv4 architecture network is loaded using the `load_network` function.
   - Network configuration is specified in "cfg/yolov4-csp.cfg," class information in "cfg/coco.data," and weights in "yolov4-csp.weights."

2. **Get Network Width and Height:**
   - `width` and `height` are obtained from the network using the `network_width` and `network_height` functions.

3. **Darknet Helper Function:**
   - The `darknet_helper` function is defined to facilitate object detection on an image.
   - It resizes the image, converts it to the darknet format, runs the YOLOv4 model, and returns detections along with width and height ratios.

4. **Run Test on Sample Image:**
   - The code reads the "data/person.jpg" image and performs object detection using the `darknet_helper` function.
   - Detected objects are annotated on the image with bounding boxes and labels.

5. **Display Result:**
   - The final annotated image is displayed using the `cv2_imshow` function.

This example demonstrates the YOLOv4 object detection capabilities on a sample image, showcasing bounding boxes and labels for detected objects.
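The coordinate handling is the step most easily gotten wrong, so a small worked example may help. The sketch below assumes that each detection box is given as (center x, center y, width, height) in network-input pixels, which is how `bbox2points` in the AlexeyAB `darknet.py` is understood to work. The function names `bbox_to_corners` and `scale_to_image` are hypothetical stand-ins, and the sizes are made-up numbers.

```python
def bbox_to_corners(bbox):
    """Convert (center_x, center_y, width, height) to corner coordinates."""
    x, y, w, h = bbox
    left = int(round(x - w / 2))
    top = int(round(y - h / 2))
    right = int(round(x + w / 2))
    bottom = int(round(y + h / 2))
    return left, top, right, bottom


def scale_to_image(bbox, net_size, img_size):
    """Map a box from network-input pixels to original-image pixels."""
    net_w, net_h = net_size
    img_w, img_h = img_size
    # Same ratios that darknet_helper returns.
    width_ratio = img_w / net_w
    height_ratio = img_h / net_h
    left, top, right, bottom = bbox_to_corners(bbox)
    return (
        int(left * width_ratio),
        int(top * height_ratio),
        int(right * width_ratio),
        int(bottom * height_ratio),
    )


# A 128x64 box centered in a 512x512 network input, drawn on a 640x480 image.
print(scale_to_image((256, 256, 128, 64), (512, 512), (640, 480)))
# (240, 210, 400, 270)
```

Skipping the ratio multiplication would draw the box in network coordinates, which only lines up when the image happens to match the network input size.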


# **Webcam Photo Capture with YOLOv4 Object Detection**

In [None]:
# function to convert the JavaScript object into an OpenCV image
def js_to_image(js_reply):
    """
    Params:
            js_reply: JavaScript object containing image from webcam
    Returns:
            img: OpenCV BGR image
    """
    # decode base64 image
    image_bytes = b64decode(js_reply.split(',')[1])
    # convert bytes to numpy array
    jpg_as_np = np.frombuffer(image_bytes, dtype=np.uint8)
    # decode numpy array into OpenCV BGR image
    img = cv2.imdecode(jpg_as_np, flags=1)

    return img

# function to convert OpenCV Rectangle bounding box image into base64 byte string to be overlayed on video stream
def bbox_to_bytes(bbox_array):
    """
    Params:
            bbox_array: Numpy array (pixels) containing rectangle to overlay on video stream.
    Returns:
          bytes: Base64 image byte string
    """
    # convert array into PIL image
    bbox_PIL = PIL.Image.fromarray(bbox_array, 'RGBA')
    iobuf = io.BytesIO()
    # format bbox into png for return
    bbox_PIL.save(iobuf, format='png')
    # format return string
    bbox_bytes = 'data:image/png;base64,{}'.format((str(b64encode(iobuf.getvalue()), 'utf-8')))

    return bbox_bytes

In [None]:
def take_photo(filename='photo.jpg', quality=0.8):
  js = Javascript('''
    async function takePhoto(quality) {
      const div = document.createElement('div');
      const capture = document.createElement('button');
      capture.textContent = 'Capture';
      div.appendChild(capture);

      const video = document.createElement('video');
      video.style.display = 'block';
      const stream = await navigator.mediaDevices.getUserMedia({video: true});

      document.body.appendChild(div);
      div.appendChild(video);
      video.srcObject = stream;
      await video.play();

      // Resize the output to fit the video element.
      google.colab.output.setIframeHeight(document.documentElement.scrollHeight, true);

      // Wait for Capture to be clicked.
      await new Promise((resolve) => capture.onclick = resolve);

      const canvas = document.createElement('canvas');
      canvas.width = video.videoWidth;
      canvas.height = video.videoHeight;
      canvas.getContext('2d').drawImage(video, 0, 0);
      stream.getVideoTracks()[0].stop();
      div.remove();
      return canvas.toDataURL('image/jpeg', quality);
    }
    ''')
  display(js)

  # get photo data
  data = eval_js('takePhoto({})'.format(quality))
  # get OpenCV format image
  img = js_to_image(data)

  # call darknet helper on webcam image
  detections, width_ratio, height_ratio = darknet_helper(img, width, height)

  # loop through detections and draw them on webcam image
  for label, confidence, bbox in detections:
    left, top, right, bottom = bbox2points(bbox)
    left, top, right, bottom = int(left * width_ratio), int(top * height_ratio), int(right * width_ratio), int(bottom * height_ratio)
    cv2.rectangle(img, (left, top), (right, bottom), class_colors[label], 2)
    cv2.putText(img, "{} [{:.2f}]".format(label, float(confidence)),
                      (left, top - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.5,
                      class_colors[label], 2)
  # save image
  cv2.imwrite(filename, img)

  return filename

In [None]:
try:
  filename = take_photo('photo.jpg')
  print('Saved to {}'.format(filename))

  # Show the image which was just taken.
  display(Image(filename))
except Exception as err:
  # Errors will be thrown if the user does not have a webcam or if they do not
  # grant the page permission to access it.
  print(str(err))



This code block defines functions for capturing photos from a webcam and applying YOLOv4 object detection to the captured images:

1. **`js_to_image(js_reply)` Function:**
   - Converts a JavaScript object containing an image from the webcam into an OpenCV BGR image.
   - Decodes base64 image, converts bytes to a NumPy array, and decodes the array into an OpenCV BGR image.

2. **`bbox_to_bytes(bbox_array)` Function:**
   - Converts an OpenCV rectangle bounding box image into a base64 byte string.
   - The function converts the array into a PIL image, saves it as a PNG, and returns the base64 image byte string.

3. **`take_photo(filename='photo.jpg', quality=0.8)` Function:**
   - Uses JavaScript to capture a photo from the webcam.
   - Calls the `js_to_image` function to convert the JavaScript object into an OpenCV BGR image.
   - Applies YOLOv4 object detection using the `darknet_helper` function on the webcam image.
   - Draws bounding boxes and labels on the image for detected objects.
   - Saves the annotated image to a specified filename.
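   - Returns the filename. The cell that calls `take_photo` then displays the saved image, or prints the error if the webcam is unavailable or access is denied.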

This code allows users to interactively capture photos from their webcam, apply YOLOv4 object detection, and view the results.
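Both `js_to_image` and `bbox_to_bytes` rely on the data URL format, where a header such as `data:image/png;base64` is followed by a comma and the base64 payload. The round trip can be sketched with the standard library alone. The function names below are hypothetical and the byte string is a stand-in for real image data.

```python
from base64 import b64decode, b64encode


def data_url_to_bytes(data_url: str) -> bytes:
    """Extract and decode the base64 payload of a data URL."""
    header, _, payload = data_url.partition(",")
    if not header.startswith("data:") or ";base64" not in header:
        raise ValueError("Expected a base64 data URL")
    return b64decode(payload)


def bytes_to_data_url(raw: bytes, mime: str = "image/png") -> str:
    """Wrap raw bytes in a base64 data URL with the given MIME type."""
    return "data:{};base64,{}".format(mime, b64encode(raw).decode("utf-8"))


raw = b"\x89PNG fake bytes"  # placeholder, not a valid image
url = bytes_to_data_url(raw)
assert data_url_to_bytes(url) == raw
print(url.split(",")[0])  # data:image/png;base64
```

In the notebook the decoded bytes are additionally passed through `cv2.imdecode` to obtain a BGR image, which is omitted here to keep the sketch dependency-free.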


# **Image Upload for YOLOv4 Object Detection**

In [None]:
from google.colab import files

# Function to upload an image from Colab
def upload_image():
    uploaded = files.upload()
    # Return the first uploaded file
    return list(uploaded.keys())[0] if uploaded else None

# Function to run YOLOv4 object detection on the uploaded image
def detect_objects_on_uploaded_image(uploaded_image_path):
    if uploaded_image_path:
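        # Read the uploaded image
        uploaded_image = cv2.imread(uploaded_image_path)
        if uploaded_image is None:
            # cv2.imread returns None when the file is not a readable image
            print('Could not read {} as an image'.format(uploaded_image_path))
            return

        # Call the darknet helper on the uploaded image
        detections, width_ratio, height_ratio = darknet_helper(uploaded_image, width, height)

        # Draw bounding boxes and labels on the image
        for label, confidence, bbox in detections:
            left, top, right, bottom = bbox2points(bbox)
            # scale the box from network input size back to the original image size
            left, top, right, bottom = int(left * width_ratio), int(top * height_ratio), int(right * width_ratio), int(bottom * height_ratio)
            cv2.rectangle(uploaded_image, (left, top), (right, bottom), class_colors[label], 2)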
            cv2.putText(uploaded_image, "{} [{:.2f}]".format(label, float(confidence)),
                        (left, top - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.5,
                        class_colors[label], 2)

        # Display the image with detections
        cv2_imshow(uploaded_image)

# Upload image from Colab
uploaded_image_path = upload_image()

# Run YOLOv4 object detection on the uploaded image
detect_objects_on_uploaded_image(uploaded_image_path)




This code block extends the functionality to allow users to upload an image directly from the Colab notebook and apply YOLOv4 object detection:

1. **`upload_image()` Function:**
   - Utilizes the `files.upload()` method from the Colab `files` module to prompt the user to upload an image.
   - Returns the path of the uploaded image or `None` if no image is uploaded.

2. **`detect_objects_on_uploaded_image(uploaded_image_path)` Function:**
   - Takes the path of the uploaded image as input.
   - Reads the uploaded image using OpenCV.
   - Applies the `darknet_helper` function for YOLOv4 object detection.
   - Draws bounding boxes and labels on the image for detected objects.
   - Displays the image with detection results using `cv2_imshow`.

3. **Upload Image and Run Detection:**
   - Calls `upload_image()` to allow the user to upload an image.
   - Calls `detect_objects_on_uploaded_image()` to run YOLOv4 object detection on the uploaded image.

Together with the webcam capture section, this code lets users run YOLOv4 object detection on either a webcam photo or an image uploaded from their local machine, and view the results.
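`files.upload()` returns a dictionary that maps each uploaded filename to its contents, and `upload_image()` simply takes the first key. If several files are uploaded at once, one possible refinement is to pick the first entry that looks like an image. The sketch below assumes this behavior is wanted; the helper name and the extension list are illustrative choices, not part of the notebook.

```python
import os

# Extensions treated as images in this sketch (an assumption, not exhaustive).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def first_image_name(uploaded):
    """Return the first filename with an image extension, or None."""
    for name in uploaded:
        extension = os.path.splitext(name)[1].lower()
        if extension in IMAGE_EXTENSIONS:
            return name
    return None


# A plain dict stands in for the value returned by files.upload().
print(first_image_name({"notes.txt": b"", "cat.JPG": b"..."}))  # cat.JPG
```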


# **Live Webcam Object Detection with YOLOv4**

In [None]:
# JavaScript to properly create the live video stream using webcam as input
def video_stream():
  js = Javascript('''
    var video;
    var div = null;
    var stream;
    var captureCanvas;
    var imgElement;
    var labelElement;

    var pendingResolve = null;
    var shutdown = false;

    function removeDom() {
       stream.getVideoTracks()[0].stop();
       video.remove();
       div.remove();
       video = null;
       div = null;
       stream = null;
       imgElement = null;
       captureCanvas = null;
       labelElement = null;
    }

    function onAnimationFrame() {
      if (!shutdown) {
        window.requestAnimationFrame(onAnimationFrame);
      }
      if (pendingResolve) {
        var result = "";
        if (!shutdown) {
          captureCanvas.getContext('2d').drawImage(video, 0, 0, 640, 480);
          result = captureCanvas.toDataURL('image/jpeg', 0.8)
        }
        var lp = pendingResolve;
        pendingResolve = null;
        lp(result);
      }
    }

    async function createDom() {
      if (div !== null) {
        return stream;
      }

      div = document.createElement('div');
      div.style.border = '2px solid black';
      div.style.padding = '3px';
      div.style.width = '100%';
      div.style.maxWidth = '600px';
      document.body.appendChild(div);

      const modelOut = document.createElement('div');
      modelOut.innerHTML = "<span>Status:</span>";
      labelElement = document.createElement('span');
      labelElement.innerText = 'No data';
      labelElement.style.fontWeight = 'bold';
      modelOut.appendChild(labelElement);
      div.appendChild(modelOut);

      video = document.createElement('video');
      video.style.display = 'block';
      video.width = div.clientWidth - 6;
      video.setAttribute('playsinline', '');
      video.onclick = () => { shutdown = true; };
      stream = await navigator.mediaDevices.getUserMedia(
          {video: { facingMode: "environment"}});
      div.appendChild(video);

      imgElement = document.createElement('img');
      imgElement.style.position = 'absolute';
      imgElement.style.zIndex = 1;
      imgElement.onclick = () => { shutdown = true; };
      div.appendChild(imgElement);

      const instruction = document.createElement('div');
      instruction.innerHTML =
          '<span style="color: red; font-weight: bold;">' +
          'When finished, click here or on the video to stop this demo</span>';
      div.appendChild(instruction);
      instruction.onclick = () => { shutdown = true; };

      video.srcObject = stream;
      await video.play();

      captureCanvas = document.createElement('canvas');
      captureCanvas.width = 640;
      captureCanvas.height = 480;
      window.requestAnimationFrame(onAnimationFrame);

      return stream;
    }
    async function stream_frame(label, imgData) {
      if (shutdown) {
        removeDom();
        shutdown = false;
        return '';
      }

      var preCreate = Date.now();
      stream = await createDom();

      var preShow = Date.now();
      if (label != "") {
        labelElement.innerHTML = label;
      }

      if (imgData != "") {
        var videoRect = video.getClientRects()[0];
        imgElement.style.top = videoRect.top + "px";
        imgElement.style.left = videoRect.left + "px";
        imgElement.style.width = videoRect.width + "px";
        imgElement.style.height = videoRect.height + "px";
        imgElement.src = imgData;
      }

      var preCapture = Date.now();
      var result = await new Promise(function(resolve, reject) {
        pendingResolve = resolve;
      });
      shutdown = false;

      return {'create': preShow - preCreate,
              'show': preCapture - preShow,
              'capture': Date.now() - preCapture,
              'img': result};
    }
    ''')

  display(js)

def video_frame(label, bbox):
  data = eval_js('stream_frame("{}", "{}")'.format(label, bbox))
  return data

In [None]:
# start streaming video from webcam
video_stream()
# label for video
label_html = 'Capturing...'
# initialize bounding box to empty
bbox = ''
count = 0
while True:
    js_reply = video_frame(label_html, bbox)
    if not js_reply:
        break

    # convert JS response to OpenCV Image
    frame = js_to_image(js_reply["img"])

    # create transparent overlay for bounding box
    bbox_array = np.zeros([480,640,4], dtype=np.uint8)

    # call our darknet helper on video frame
    detections, width_ratio, height_ratio = darknet_helper(frame, width, height)

    # loop through detections and draw them on transparent overlay image
    for label, confidence, bbox in detections:
      left, top, right, bottom = bbox2points(bbox)
      left, top, right, bottom = int(left * width_ratio), int(top * height_ratio), int(right * width_ratio), int(bottom * height_ratio)
      bbox_array = cv2.rectangle(bbox_array, (left, top), (right, bottom), class_colors[label], 2)
      bbox_array = cv2.putText(bbox_array, "{} [{:.2f}]".format(label, float(confidence)),
                        (left, top - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.5,
                        class_colors[label], 2)

    bbox_array[:,:,3] = (bbox_array.max(axis = 2) > 0 ).astype(int) * 255
    # convert overlay of bbox into bytes
    bbox_bytes = bbox_to_bytes(bbox_array)
    # update bbox so next frame gets new overlay
    bbox = bbox_bytes




This code block establishes a live video stream using the webcam as input and performs real-time object detection using YOLOv4. It integrates JavaScript and Python code to create an interactive experience:

1. **JavaScript for Video Streaming:**
   - The `video_stream()` function initializes a live video stream from the webcam using JavaScript.
   - It creates DOM elements for video display and interaction.

2. **JavaScript Functions for Video Frame Processing:**
   - `stream_frame(label, imgData)` handles each exchange with Python.
   - It updates the status label, positions the overlay image on top of the video, and resolves with the next captured frame.
   - If the `shutdown` flag is set when it is called, it removes the DOM elements and returns an empty string, which ends the Python loop.

3. **Main Python Loop for Real-time Detection:**
   - A continuous loop captures and processes video frames.
   - The `video_frame` function sends the label and the base64-encoded overlay image to the JavaScript side.
   - It receives the captured frame as a base64-encoded JPEG, along with timing values.

4. **Overlaying Bounding Boxes on Video Frames:**
   - Bounding boxes from YOLOv4 detections are drawn on a transparent overlay.
   - The overlay is updated for each frame, creating a dynamic visual representation.

This interactive code block enables real-time webcam-based object detection using YOLOv4 and showcases the results with bounding boxes and labels.
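The line that fills the alpha channel is what keeps the overlay transparent everywhere except where boxes and labels were drawn. A small NumPy sketch of the same idea follows. The function name `add_alpha_mask` and the tiny 4x4 array are illustrative only, and unlike the notebook line it inspects just the three color channels.

```python
import numpy as np


def add_alpha_mask(overlay):
    """Make drawn pixels opaque and leave untouched pixels transparent.

    `overlay` is an (H, W, 4) uint8 array; channel 3 is alpha.
    """
    # A pixel counts as "drawn" if any of its color channels is non-zero.
    drawn = overlay[:, :, :3].max(axis=2) > 0
    overlay[:, :, 3] = np.where(drawn, 255, 0).astype(np.uint8)
    return overlay


overlay = np.zeros((4, 4, 4), dtype=np.uint8)
overlay[1, 2, :3] = (0, 200, 0)  # pretend a box edge was drawn here
add_alpha_mask(overlay)
print(overlay[:, :, 3])
```

One consequence of this approach is that anything drawn in pure black stays transparent, because its color channels are all zero.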


# **Conclusion**

The integration of Darknet and YOLOv4, along with interactive JavaScript and Python components, provides a robust platform for users to explore object detection capabilities in an accessible manner. The system's real-time overlay of bounding boxes and labels on video streams enhances user experience and understanding.

# **Future Improvements**

To enhance the model and expand its detection capabilities, consider the following improvements:

1. **Include More Classes in the Weights File:**
   - The provided weights are loaded together with `cfg/coco.data`, so detection is limited to the classes defined there. Supporting additional object classes relevant to specific use cases requires training the model on a dataset that contains those classes.

2. **Custom Model Training:**
   - Train a custom YOLOv4 model on a dataset containing a broader range of object classes. Use transfer learning to fine-tune the pre-trained model on specific classes of interest.

3. **Data Augmentation:**
   - Augment the existing dataset with various transformations (e.g., rotations, flips, and color adjustments) to improve model generalization and robustness.

4. **Optimization for Specific Environments:**
   - Tailor the model to perform well in specific environments by training it on datasets representative of those environments. This can improve object detection accuracy under specific conditions.

5. **Continuous Model Refinement:**
   - Periodically update the model with additional annotated data to adapt to evolving object detection requirements and improve overall accuracy.

By implementing these improvements, the Object Detection System can evolve into a more versatile and accurate solution for detecting a wider range of objects across various applications.
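As an illustration of the data augmentation item, a horizontal flip has to be applied to the labels as well as to the image. The sketch below assumes labels in the usual YOLO text format, `class x_center y_center width height` with coordinates normalized to the range 0 to 1. The function name is hypothetical and the numbers are made up.

```python
def flip_label_horizontally(label):
    """Mirror one normalized YOLO label around the vertical image axis."""
    class_id, x_center, y_center, width, height = label
    # Only the x coordinate of the center changes; the box size and the
    # vertical position are unaffected by a left-right flip.
    return (class_id, 1.0 - x_center, y_center, width, height)


print(flip_label_horizontally((0, 0.25, 0.5, 0.2, 0.4)))
# (0, 0.75, 0.5, 0.2, 0.4)
```

The image itself would be flipped separately, for example with `cv2.flip(image, 1)`.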
