#### Steps

Download the YOLOv3 weights file (yolov3.weights), configuration file (yolov3.cfg), and class names file (coco.names) from the official YOLO website.

#### yolov3.weights:

This file contains the pre-trained weights of the YOLOv3 model.

The weights are the learned parameters of the network: the weights and biases of its neurons, adjusted during training on a large dataset to minimize the difference between the predicted output and the actual output (ground truth) for a given input.

The file is quite large because it stores all of these parameters. Loading it lets the model recognize a wide range of objects using the features it learned during training.

#### yolov3.cfg:

The configuration file specifies the architecture and settings of the YOLOv3 model.

It defines the number of layers, their types, filter sizes, activation functions, and various hyperparameters.

The configuration file is needed to reconstruct the YOLOv3 architecture correctly in your code, so that the weights can be loaded into the matching layers.
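To make the file's structure concrete, here is a minimal sketch of how a Darknet-style .cfg file can be read. It assumes the usual layout of `[section]` headers followed by `key=value` lines. The `SAMPLE_CFG` text is a shortened, made-up excerpt and the `parse_cfg` helper is hypothetical; OpenCV parses the real file internally when you call cv2.dnn.readNet.

```python
# Hypothetical, shortened config text in the Darknet .cfg layout.
SAMPLE_CFG = """
[net]
width=416
height=416
channels=3

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky

# comments start with '#'
[yolo]
classes=80
num=9
"""


def parse_cfg(text):
    """Return a list of (section_name, options_dict) in file order."""
    sections = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            sections.append((line[1:-1], {}))  # start a new layer/section
        else:
            key, value = line.split("=", 1)
            sections[-1][1][key.strip()] = value.strip()
    return sections


layers = parse_cfg(SAMPLE_CFG)
for name, options in layers:
    print(name, options)
```

A list of pairs is used instead of a dictionary because section names such as `[convolutional]` repeat many times in a real configuration file, and their order defines the order of the layers.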

#### coco.names:

The class names file contains a list of object classes that the YOLOv3 model has been trained to detect.

The coco.names file lists the 80 object classes of the COCO (Common Objects in Context) dataset, which is widely used for object detection.

Each line in the file corresponds to a unique object class.

The class names file helps you interpret the output of the model. 

It provides the names of the classes corresponding to the numeric labels predicted by the model. This is crucial for understanding which objects the model has detected in an image.
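As a small illustration of that lookup, the sketch below maps numeric class IDs to names. The three-entry list stands in for the full coco.names contents, and the IDs are made-up model outputs, so treat the values as placeholders.

```python
import io

# Stand-in for the first lines of coco.names (the real file has one name per line).
fake_file = io.StringIO("person\nbicycle\ncar\n")
classes = [line.strip() for line in fake_file]

# Hypothetical class IDs, as the model would predict them.
predicted_ids = [2, 0, 2]
labels = [classes[i] for i in predicted_ids]
print(labels)
```

The line number in the file (counting from zero) is the class ID, which is why the order of names in coco.names must not be changed.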

Together, these files allow you to use a pre-trained YOLOv3 model for object detection tasks without having to train the model from scratch on your specific dataset.

In [1]:
# Import necessary libraries
import cv2
import numpy as np

In [2]:
net = cv2.dnn.readNet("yolov3.weights", "yolov3.cfg")
classes = []

with open("coco.names", "r") as f:
    classes = [line.strip() for line in f]

layer_names = net.getUnconnectedOutLayersNames()

Use cv2.dnn.readNet to load the YOLO model from the weights and configuration files.
Read class names from the coco.names file.

layer_names = net.getUnconnectedOutLayersNames() is used to retrieve the names of the unconnected output layers of the YOLOv3 neural network.

getUnconnectedOutLayersNames(): This method is provided by OpenCV's DNN module and works for any network, not only YOLO. It returns the names of the layers whose outputs are not consumed by any other layer, which are the network's final outputs.

In YOLOv3, the final detection results come from three output layers. Each one provides predictions at a different scale (resolution).

These output layers are called unconnected because no other layer is connected after them: each one ends its own branch of the network and produces predictions for one spatial resolution.

In summary, layer_names = net.getUnconnectedOutLayersNames() is used to obtain the names of the unconnected output layers of the YOLOv3 model, which is important for extracting detection results during inference.
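The sizes of these outputs can be worked out by hand. The sketch below assumes the standard YOLOv3 setup: a 416x416 input, strides of 32, 16, and 8, three anchor boxes per grid cell, and the 80 COCO classes. Other input sizes or custom-trained models will give different numbers.

```python
INPUT_SIZE = 416          # network input width and height (assumed)
STRIDES = (32, 16, 8)     # downsampling factor of each output layer (assumed)
ANCHORS_PER_CELL = 3      # anchor boxes predicted per grid cell (assumed)
NUM_CLASSES = 80          # COCO classes

# Each row: 4 box values + 1 objectness score + one score per class.
values_per_row = 4 + 1 + NUM_CLASSES

shapes = []
for stride in STRIDES:
    grid = INPUT_SIZE // stride
    rows = grid * grid * ANCHORS_PER_CELL
    shapes.append((rows, values_per_row))
    print(f"stride {stride:2d}: {grid}x{grid} grid -> output shape {(rows, values_per_row)}")

total_rows = sum(rows for rows, _ in shapes)
print("total candidate boxes:", total_rows)
```

Under these assumptions, each array returned by the forward pass has one row per candidate box, and most of those rows are later discarded by the confidence threshold.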

In [3]:
# Function to get object predictions
def get_predictions(img):
    blob = cv2.dnn.blobFromImage(img, 0.00392, (416, 416), (0, 0, 0), True, crop=False)
    net.setInput(blob)
    outs = net.forward(layer_names)

    return outs

This code defines a function get_predictions that takes an input image (img), runs it through the YOLOv3 network, and returns the raw predictions.

cv2.dnn.blobFromImage: This function preprocesses the input image to make it suitable for input to the YOLOv3 network. The parameters used are as follows:

img: The input image.

0.00392: Scaling factor for pixel values (approximately 1/255). It normalizes pixel values from the 0-255 range to the 0-1 range.

(416, 416): Size to which the image will be resized before feeding it to the network. YOLOv3 typically uses 416x416 pixels.

(0, 0, 0): Mean values subtracted from each channel. Zeros mean that no mean subtraction is applied.

True: The swapRB flag. OpenCV loads images in BGR order, while the network expects RGB, so the Blue and Red channels are swapped.

crop=False: Indicates whether to crop the image after resizing. With False, the image is resized directly to 416x416 without cropping, so its aspect ratio is not preserved.

The resulting blob is a standardized input that can be fed into the YOLOv3 network.
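To show what this preprocessing does to the data layout, here is a pure-Python sketch of the scaling, channel swap, and reordering steps on a tiny made-up 2x2 image. It leaves out the resize to 416x416 and the mean subtraction (which is zero here), and it is not how OpenCV implements blobFromImage internally. The `to_blob` helper is hypothetical.

```python
def to_blob(image_bgr, scale=1 / 255.0, swap_rb=True):
    """Convert an H x W x 3 nested list (BGR) to a 1 x 3 x H x W nested list."""
    channel_order = (2, 1, 0) if swap_rb else (0, 1, 2)  # BGR -> RGB when swapping
    channels = [
        [[pixel[c] * scale for pixel in row] for row in image_bgr]
        for c in channel_order
    ]
    return [channels]  # leading batch dimension of size 1


# Made-up 2x2 image; each pixel is (blue, green, red) as OpenCV stores it.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (51, 102, 204)],
]

blob = to_blob(image)
red_plane = blob[0][0]   # first channel is red after the swap
blue_plane = blob[0][2]  # last channel is blue after the swap
print("red plane: ", red_plane)
print("blue plane:", blue_plane)
```

The result has the batch, channel, height, width ordering that the network input expects, with one full plane per color channel instead of interleaved pixels.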

net.setInput(blob): This sets the blob as the input to the YOLOv3 network. The neural network is now ready to perform a forward pass with this input.

net.forward(layer_names): This is the forward pass through the YOLOv3 network. It computes the output of the network based on the given input blob. The layer_names are the names of the unconnected output layers, and they were obtained earlier using net.getUnconnectedOutLayersNames().

return outs: The variable outs contains the raw predictions made by the YOLOv3 network, as one array per output layer. Each row of an array describes one candidate detection: the bounding box (center x, center y, width, height, normalized to the image size), an objectness score, and one score per class.

By calling get_predictions on an input image, you obtain the YOLOv3 predictions, which can be further processed to draw bounding boxes, display confidence scores, and interpret the detected objects in the image.
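As a rough sketch of that further processing, the snippet below decodes a single prediction row into a class ID, a confidence, and a pixel-space box. The row values, the three-class layout, and the `decode_row` helper are invented for illustration; real YOLOv3 rows from a COCO-trained model have 80 class scores.

```python
def decode_row(row, img_width, img_height):
    """Split one prediction row into (class_id, confidence, [x, y, w, h])."""
    scores = row[5:]  # per-class scores follow the 4 box values and objectness
    class_id = max(range(len(scores)), key=lambda i: scores[i])
    confidence = scores[class_id]

    # Box values are normalized to 0..1, so scale them to pixels.
    center_x = int(row[0] * img_width)
    center_y = int(row[1] * img_height)
    w = int(row[2] * img_width)
    h = int(row[3] * img_height)

    # Convert from center coordinates to the top-left corner used for drawing.
    x = int(center_x - w / 2)
    y = int(center_y - h / 2)
    return class_id, confidence, [x, y, w, h]


# Hypothetical row: cx, cy, w, h, objectness, then 3 class scores.
row = [0.5, 0.5, 0.25, 0.5, 0.9, 0.05, 0.85, 0.10]
class_id, confidence, box = decode_row(row, img_width=640, img_height=480)
print(class_id, confidence, box)
```

The conversion to a top-left corner matters because cv2.rectangle takes corner points, while YOLO reports the center of each box.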

In [4]:
# Function to draw bounding boxes
def draw_boxes(img, outs):
    height, width, _ = img.shape
    class_ids = []
    confidences = []
    boxes = []
    # This part iterates over the model outputs.
    # The YOLOv3 model divides the input image into a grid
    # and makes predictions for each grid cell.

    for out in outs:
        for detection in out:
            # For each detection, extract the confidence scores for each class.
            # The class_id is the index with the maximum score.
            scores = detection[5:]
            class_id = np.argmax(scores)
            confidence = scores[class_id]
            # Keep the detection only if the confidence for the predicted
            # class is above a threshold (here 0.5); otherwise ignore it.
            if confidence > 0.5:
                # Calculate the bounding box coordinates and dimensions
                # from the normalized YOLO output.
                center_x = int(detection[0] * width)
                center_y = int(detection[1] * height)
                w = int(detection[2] * width)
                h = int(detection[3] * height)

                x = int(center_x - w / 2)
                y = int(center_y - h / 2)
                # The relevant info (class ID, confidence, and bounding box)
                # is appended to lists for further processing.

                class_ids.append(class_id)
                confidences.append(float(confidence))
                boxes.append([x, y, w, h])
    # Non-maximum suppression is applied to eliminate redundant, overlapping boxes.
    # It keeps only the most confident bounding boxes.
    indices = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)

    # Loop through the indices after non-maximum suppression.
    # Flattening handles both return shapes: older OpenCV versions return
    # an N x 1 array, newer ones a flat array of indices.
    for i in np.array(indices).flatten():
        box = boxes[i]
        x, y, w, h = box

        # Get class label, confidence, and color for drawing the bounding box
        label = str(classes[class_ids[i]])
        confidence = confidences[i]
        color = (0, 255, 0)  # Green color

        # Draw bounding box and label on the image
        cv2.rectangle(img, (x, y), (x + w, y + h), color, 2)
        cv2.putText(img, f"{label} {confidence:.2f}", (x, y - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)


This code defines a function draw_boxes that takes an input image (img) and the output of YOLOv3 predictions (outs). The function processes the predictions and draws bounding boxes around detected objects on the image.

Loop through detections: The function iterates through the detections provided by the YOLOv3 output and extracts relevant information such as class IDs, confidences, and bounding box coordinates.

Filter by confidence: Detections with confidence scores below a certain threshold (e.g., 0.5) are ignored.

Calculate bounding box coordinates: The function calculates the bounding box coordinates based on the YOLOv3 output.

Non-maximum suppression: Non-maximum suppression (NMS) is applied to eliminate redundant overlapping boxes, keeping only the most confident ones.

Draw bounding boxes: The function then draws bounding boxes and labels on the input image using OpenCV functions.

Overall, the draw_boxes function facilitates the visualization of YOLOv3 object detection results by drawing bounding boxes around detected objects.
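For intuition about the non-maximum suppression step, here is a simplified pure-Python sketch of greedy NMS based on intersection over union (IoU). It illustrates the general technique and is not OpenCV's implementation of cv2.dnn.NMSBoxes. The `iou` and `nms` helpers and the sample boxes are made up.

```python
def iou(a, b):
    """Intersection over union of two boxes given as [x, y, w, h]."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    inter_w = max(0, min(ax2, bx2) - max(a[0], b[0]))
    inter_h = max(0, min(ay2, by2) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, score_threshold, nms_threshold):
    """Return indices of boxes kept by greedy non-maximum suppression."""
    # Consider only boxes above the score threshold, most confident first.
    order = sorted(
        (i for i, s in enumerate(scores) if s > score_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a kept one.
        if all(iou(boxes[i], boxes[j]) <= nms_threshold for j in keep):
            keep.append(i)
    return keep


# Made-up detections: the first two boxes cover nearly the same object.
boxes = [[100, 100, 50, 50], [105, 105, 50, 50], [300, 300, 40, 40]]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores, score_threshold=0.5, nms_threshold=0.4)
print(kept)
```

The thresholds 0.5 and 0.4 mirror the values passed to cv2.dnn.NMSBoxes in draw_boxes: the first drops low-confidence boxes, and the second sets how much overlap with an already kept box is tolerated.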

In [13]:
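# Read input image (cv2.imread returns None if the file cannot be read)
img = cv2.imread("planet1.jpeg")
assert img is not None, "Could not read planet1.jpeg"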


In [14]:
# Get YOLO predictions
outs = get_predictions(img)


In [15]:
# Draw bounding boxes on the image
draw_boxes(img, outs)

In [16]:
# Display the output image
cv2.imshow("YOLO Object Detection", img)
cv2.waitKey(0)
cv2.destroyAllWindows()