## Counting objects with YOLOv8

In this exercise, we will count a particular object in real time using the [YOLOv8](https://docs.ultralytics.com/models/yolov8/) object detection model. We will see how to effectively monitor not only static objects but also objects as they move within a bounded region of the frame. We will also see how to change the object we want to count to suit different scenarios. Run the next cell to install the required packages.

In [1]:
%pip install -q opencv-python pyyaml ultralytics

Note: you may need to restart the kernel to use updated packages.


### Initializing the model

Our next step is to initialize the model. The key points of this step are:
- The detection model name is declared. In our case it is "yolov8n".
- The detection model path is set (here, the weights file "yolov8n.pt").
- The label map is loaded. The label map is the list of all object classes the model can detect, and therefore the classes we can choose to count. For example, we can count people or apples.

Click on the Play icon to the left of the cell below to initialize the model.

In [26]:
import platform, cv2, time, collections, torch, yaml
from ultralytics import YOLO
from pathlib import Path
import numpy as np
from IPython import display
import matplotlib.pyplot as plt

# Define the model to be used.
det_model = YOLO('yolov8n.pt')  # You can use 'yolov8s.pt', 'yolov8m.pt', etc. for different model sizes
    
# Loading the model names.
label_map = det_model.model.names
reversed_label_map = {v: k for k, v in label_map.items()}

# Make an empty call to initialize the model (with no source, Ultralytics runs on its bundled sample images)
res = det_model()







image 1/2 /home/dakir/IPD2024/.venv/lib/python3.12/site-packages/ultralytics/assets/bus.jpg: 640x480 4 persons, 1 bus, 1 stop sign, 84.3ms
image 2/2 /home/dakir/IPD2024/.venv/lib/python3.12/site-packages/ultralytics/assets/zidane.jpg: 384x640 2 persons, 1 tie, 61.9ms
Speed: 4.3ms preprocess, 73.1ms inference, 2.3ms postprocess per image at shape (1, 3, 384, 640)
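
To make the role of the label map concrete, here is a minimal sketch that does not need the model at all. It uses a small, illustrative subset of a COCO-style label map; the real map comes from `det_model.model.names` and is much longer, and the ids shown here are assumptions included only for illustration.

```python
# Illustrative subset of a label map (class id -> class name).
# Assumption: these ids are examples only; read the real values
# from det_model.model.names in the notebook.
label_map = {0: "person", 8: "boat", 46: "banana", 47: "apple"}

# Invert the map so a class name can be turned back into its numeric id,
# which is what the detector reports for each box.
reversed_label_map = {name: class_id for class_id, name in label_map.items()}

print(reversed_label_map["boat"])   # id to compare against each detection
print(sorted(reversed_label_map))   # names that can be used for counting
```

The inverted map is what lets the rest of the notebook accept a readable name such as "boat" while comparing numeric class ids internally.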


### Define helper functions
These helper functions draw a box around a detected object and plot labeled points on a Matplotlib axes.

Click on the Play icon to the left of the cell below to set up the helper functions.

In [27]:
def show_box(box, ax):
    x0, y0 = box[0], box[1]
    w, h = box[2] - box[0], box[3] - box[1]
    ax.add_patch(plt.Rectangle((x0, y0), w, h, edgecolor="green", facecolor=(0, 0, 0, 0), lw=2))

# Function defined to draw points.
# Points labeled 1 are drawn as green stars and points labeled 0 as red stars.
def show_points(coords, labels, ax, marker_size=375):
    pos_points = coords[labels == 1]
    neg_points = coords[labels == 0]
    ax.scatter(
        pos_points[:, 0],
        pos_points[:, 1],
        color="green",
        marker="*",
        s=marker_size,
        edgecolor="white",
        linewidth=1.25,
    )
    ax.scatter(
        neg_points[:, 0],
        neg_points[:, 1],
        color="red",
        marker="*",
        s=marker_size,
        edgecolor="white",
        linewidth=1.25,
    )
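
The `show_box` helper above converts a box given as two corners into the origin, width and height that `plt.Rectangle` expects. The conversion can be sketched on its own as follows; `xyxy_to_xywh` is a hypothetical name used only for this illustration.

```python
def xyxy_to_xywh(box):
    """Convert (x0, y0, x1, y1) corner coordinates to (x, y, width, height)."""
    x0, y0, x1, y1 = box
    return x0, y0, x1 - x0, y1 - y0


# A box whose corners are (10, 20) and (110, 70) is 100 wide and 50 tall.
print(xyxy_to_xywh((10, 20, 110, 70)))
```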
    

### Inferencing function
The inferencing function is the core of this exercise. This function takes three parameters:
- Source: This parameter tells the inferencing function which video feed to use as the source.
- DeviceType: This parameter names the device type to use for inferencing. In our example, we use "CPU" as the device type. Another example of this parameter is "GPU", which we are not using in this exercise. Note that run_inference, as written, accepts this value but does not forward it to the model, so the model runs on its default device.
- Object to count: This parameter tells the function which object to count.


Click on the Play icon to the left of the cell below to set up the inferencing function.

In [28]:
# Function defined to run inference on a source video for a target object.
def run_inference(source, deviceType, objectToCount):
    objectid = reversed_label_map[objectToCount]
    frame_count = 0
    cap = cv2.VideoCapture(source)
    assert cap.isOpened(), "Error reading video file"
    
    line_points = [(0, 600), (800, 600),(800, 0),(0,0),(0,600)]  # line or region points

    # open the video feed
    while cap.isOpened():
        success, frame = cap.read()
        object_count = 0
        if not success:
            print("Video frame is empty or video processing has been successfully completed.")
            break
        # Count the target objects in the current frame
        results = det_model(frame)
        for result in results:
            for box in result.boxes:
                if box.cls[0] == objectid:
                    object_count += 1
                    x1, y1, x2, y2 = map(int, box.xyxy[0])
                    cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
                    cv2.putText(frame, f'{objectToCount}', (x1, y1-10), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (36,255,12), 2)
         
        # Display the frame with the count
        cv2.putText(frame, f'{label_map[objectid]}: {object_count}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
        cv2.imshow('Frame', frame)
        
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

    cap.release()
    cv2.destroyAllWindows()
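
The per-frame counting inside `run_inference` comes down to comparing each detection's class id with the id of the target object. A minimal sketch of that step, using a plain list of class ids in place of real detections (the list and the `count_objects` name are assumptions made for illustration):

```python
def count_objects(class_ids, target_id):
    """Count how many detections in one frame belong to the target class."""
    return sum(1 for class_id in class_ids if int(class_id) == target_id)


# Hypothetical class ids for the boxes detected in a single frame.
frame_class_ids = [0, 0, 8, 0, 47, 0]

print(count_objects(frame_class_ids, target_id=0))  # boxes with class id 0
print(count_objects(frame_class_ids, target_id=8))  # boxes with class id 8
```

Because the count is reset for every frame, the number shown on screen is the number of matching objects currently visible, not a running total across the video.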

### Configure video and choose what to track
For this step, assign a class name from the label map to the objectToCount variable below. Suggested options are person, apple and banana.

- To accomplish this, enter the value for the "objectToCount" variable in the cell below.

Additionally, provide the URL of a video feed to use as the source video for inference.

- To accomplish this, enter the value for the "sourceVideo" variable in the cell below.
- Once both variables are set, click on the Play icon to the left of the cell below to assign them.


In [31]:
objectToCount = "boat"   # any class name in the label map, e.g. "person", "apple" or "banana"

# Available video 1: https://download.microsoft.com/download/caaf80b6-2394-4fbc-8430-8b41a3206c64/people-are-pushing-carts-along.mp4
# Available video 2: https://download.microsoft.com/download/a0ac5d61-60b6-4037-9555-ba5acefeb0c8/people-near-shop-counter-fruit.mp4
sourceVideo = "https://download.microsoft.com/download/caaf80b6-2394-4fbc-8430-8b41a3206c64/people-are-pushing-carts-along.mp4"
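
Since `run_inference` looks the chosen name up in `reversed_label_map`, a misspelled name fails with a bare `KeyError`. One possible way to check the name first is sketched below. `resolve_class_id` is a hypothetical helper, not part of the notebook, and the small map is an illustrative stand-in for the real one.

```python
import difflib


def resolve_class_id(name, reversed_label_map):
    """Return the class id for a name, or raise a helpful error if unknown."""
    key = name.strip().lower()
    if key in reversed_label_map:
        return reversed_label_map[key]
    # Suggest the closest known class names to help spot typos.
    suggestions = difflib.get_close_matches(key, list(reversed_label_map), n=3)
    hint = f" Did you mean: {', '.join(suggestions)}?" if suggestions else ""
    raise ValueError(f"Unknown object class {name!r}.{hint}")


# Illustrative stand-in for the notebook's reversed_label_map.
example_map = {"person": 0, "boat": 8, "banana": 46, "apple": 47}
print(resolve_class_id(" Boat ", example_map))
```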

### Execution
This is the final step of the exercise. In this step, the inferencing function defined in the previous step is called and the output is shown.
As mentioned previously, the inferencing function receives the video source, the deviceType (in our case "CPU") and the object to count from this execution step.

Click on the Play icon to the left of the cell below to execute the final step.

In [32]:
# Ensure sourceVideo is set; if it is empty, fall back to a default value.
if sourceVideo == "":
    sourceVideo = "https://download.microsoft.com/download/caaf80b6-2394-4fbc-8430-8b41a3206c64/people-are-pushing-carts-along.mp4"

# Running the inferencing.
run_inference(
    source=sourceVideo,
    deviceType="CPU",
    objectToCount=objectToCount
)


0: 384x640 6 persons, 64.1ms
Speed: 4.7ms preprocess, 64.1ms inference, 3.2ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 8 persons, 1 boat, 63.9ms
Speed: 4.8ms preprocess, 63.9ms inference, 2.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 6 persons, 1 boat, 61.0ms
Speed: 4.4ms preprocess, 61.0ms inference, 1.9ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 5 persons, 60.8ms
Speed: 4.4ms preprocess, 60.8ms inference, 2.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 persons, 60.9ms
Speed: 4.2ms preprocess, 60.9ms inference, 2.6ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 6 persons, 1 apple, 6.4ms
Speed: 1.3ms preprocess, 6.4ms inference, 1.8ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 6 persons, 2 apples, 6.0ms
Speed: 2.6ms preprocess, 6.0ms inference, 1.4ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 5 persons, 2 apples, 5.0ms
Speed: 1.6ms preprocess, 5.0ms inference,
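
Each per-frame line in the output above lists the detected classes with their counts. If those counts are needed outside the video window, a line can be parsed as sketched here. This assumes the line format matches the output shown above exactly; `parse_detection_line` is a hypothetical helper, and the class names are kept as printed, including the plural "s".

```python
import re


def parse_detection_line(line):
    """Extract {label: count} from a line like '0: 384x640 6 persons, 1 boat, 63.9ms'."""
    # Capture everything between the image size and the trailing timing.
    match = re.match(r"^\S+: \d+x\d+ (.*), [\d.]+ms$", line.strip())
    if not match:
        return {}
    counts = {}
    for item in match.group(1).split(", "):
        number, _, label = item.partition(" ")
        if number.isdigit():
            counts[label] = int(number)
    return counts


print(parse_detection_line("0: 384x640 8 persons, 1 boat, 63.9ms"))
```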

### Bonus activities

#### 1. Adding/removing track lines for person counting

In this bonus activity, we will change the code so that track lines are visible when we are tracking objects in motion.

- To accomplish this, change the "draw_tracks" parameter of the run_inference function to indicate whether tracks are required. The run_inference function shown above does not define this parameter, so it needs to be added first.
- Rerun the notebook after making the change to test it.
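
A track line is the recent path of one object's center across frames. The bookkeeping can be sketched independently of the detector, as below. This assumes each detection arrives with a stable track id from a tracker; the `TrackHistory` class is hypothetical, and the drawing step (for example with `cv2.polylines` when a `draw_tracks` flag is true) is left out.

```python
from collections import defaultdict, deque


class TrackHistory:
    """Keep the most recent center points for each tracked object."""

    def __init__(self, max_length=30):
        # One bounded queue per track id; old points drop off automatically.
        self._tracks = defaultdict(lambda: deque(maxlen=max_length))

    def update(self, track_id, box_xyxy):
        """Record the center of a box and return the track's current path."""
        x1, y1, x2, y2 = box_xyxy
        center = ((x1 + x2) / 2, (y1 + y2) / 2)
        self._tracks[track_id].append(center)
        return list(self._tracks[track_id])


history = TrackHistory(max_length=2)
history.update(1, (0, 0, 10, 10))
history.update(1, (10, 0, 20, 10))
print(history.update(1, (20, 0, 30, 10)))  # only the two newest centers remain
```

Bounding the history keeps the drawn line short and stops memory from growing over a long video.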

#### 2. Change the dimensions of the bounding box

In this bonus activity, we will change the code to draw the bounding box with different dimensions.

- To accomplish this, change the line_points variable in the run_inference function.
- Example: line_points = [(0, 600), (800, 600), (800, 0), (0, 0), (0, 600)]
- Rerun the notebook after making the change to test it.
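
To see what a given set of line_points covers, it can help to test whether a point, such as the center of a detected box, falls inside the region. The sketch below uses the standard ray-casting test. `point_in_region` is a hypothetical helper and is not something run_inference currently calls.

```python
def point_in_region(point, region):
    """Return True if point lies inside the polygon given by region (ray casting)."""
    x, y = point
    inside = False
    # Walk over each edge of the polygon, wrapping around to the first vertex.
    for (x1, y1), (x2, y2) in zip(region, region[1:] + region[:1]):
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


line_points = [(0, 600), (800, 600), (800, 0), (0, 0), (0, 600)]
print(point_in_region((400, 300), line_points))  # center of the region
print(point_in_region((900, 300), line_points))  # to the right of the region
```

Repeating the first point at the end of line_points, as the example does, is harmless here because the zero-length closing edge is never counted as a crossing.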