In [2]:
!git clone https://github.com/lakhangudigar/Computer-vision.git

Cloning into 'Computer-vision'...
remote: Enumerating objects: 3, done.
remote: Counting objects: 100% (3/3), done.
remote: Total 3 (delta 0), reused 0 (delta 0), pack-reused 0 (from 0)
Receiving objects: 100% (3/3), done.


In [None]:
!pip install cvzone
!pip install ultralytics

Collecting cvzone
  Downloading cvzone-1.6.1.tar.gz (25 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: cvzone
  Building wheel for cvzone (setup.py) ... done
  Created wheel for cvzone: filename=cvzone-1.6.1-py3-none-any.whl size=26298 sha256=37f93267391eb23d400b2919daf9298ff90c16bb419f9c8c71195652b70b39ad
  Stored in directory: /root/.cache/pip/wheels/2c/9f/b3/92e945ac4a71bf727a92463f38155cc5a4fa49c5010b38ec4c
Successfully built cvzone
Installing collected packages: cvzone
Successfully installed cvzone-1.6.1
Collecting ultralytics
  Downloading ultralytics-8.3.27-py3-none-any.whl.metadata (35 kB)
Collecting ultralytics-thop>=2.0.0 (from ultralytics)
  Downloading ultralytics_thop-2.0.10-py3-none-any.whl.metadata (9.4 kB)
Downloading ultralytics-8.3.27-py3-none-any.whl (878 kB)

In [None]:
from google.colab import drive
import cv2
import numpy as np
from ultralytics import YOLO
import cvzone

### Explanation

1. **`google.colab` (specifically `drive`)**:  
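   - **Purpose**: This module lets a Colab notebook mount and access Google Drive. Drive must be mounted (for example with `drive.mount('/content/drive')`) before paths under `/content/drive/MyDrive` can be read, which the model-loading cell relies on.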

2. **`cv2` (OpenCV)**:  
   - **Purpose**: OpenCV (Open Source Computer Vision Library) is a popular library for computer vision tasks.

3. **`numpy` (imported as `np`)**:  
   - **Purpose**: NumPy is a fundamental library for numerical operations and handling large, multi-dimensional arrays in Python.

4. **`ultralytics` (specifically `YOLO`)**:  
   - **Purpose**: The Ultralytics library provides implementations of the YOLO (You Only Look Once) family of fast object detection models.
   - **Key methods**: `model.predict()` detects objects in an image or frame, while `model.track()` also follows them across frames. Both return bounding boxes, class IDs, and confidence scores for each object found; `model.track()` additionally returns a track ID per object.

5. **`cvzone`**:  
   - **Purpose**: CVZone is a Python library built on top of OpenCV that provides convenience utilities for computer vision tasks. This notebook uses its `putTextRect` helper to draw text on a filled background.


In [None]:
with open("/content/sample_data/coco.txt","r") as f:
  class_name = f.read().splitlines()

### Explanation

This code reads class names from a text file called `coco.txt` and stores each line as an element in a list. Here’s a step-by-step explanation of what each part does:

1. **`with open("/content/sample_data/coco.txt","r") as f`:**
   - This opens the file `coco.txt` located in the `/content/sample_data/` directory.
   - The `"r"` mode opens the file in read-only mode.
   - The `with` statement ensures the file is automatically closed after reading, even if an error occurs, making it a safe way to handle files.

2. **`f.read()`**:
   - This reads the entire content of the file into a single string.

3. **`.splitlines()`**:
   - This method splits the string into a list of lines, where each line is an item in the list, without including newline characters (`\n`).

4. **`class_name`**:
   - The list of class names is stored in the variable `class_name`.
   - Each entry in the `class_name` list corresponds to an object class from the COCO dataset, such as "person," "car," etc., which is used to label objects detected by the YOLO model.

In object detection, this list helps identify what each detected class ID represents. For example, if YOLO detects an object with a class ID of `0`, you can refer to `class_name[0]` to get the name of the detected class (e.g., "person").
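The lookup described above can be sketched without the real `coco.txt`. The three class names and the temporary file below are illustrative assumptions, not the actual file contents.

```python
import tempfile
from pathlib import Path

# Hypothetical stand-in for coco.txt: one class name per line.
sample = "person\nbicycle\ncar\n"

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "classes.txt"
    path.write_text(sample)

    # Same pattern as the notebook cell: read everything, split on newlines.
    with open(path, "r") as f:
        names = f.read().splitlines()

# The list index plays the role of the class ID reported by the detector.
detected_class_id = 0
label = names[detected_class_id]
print(label)
```

Because `splitlines()` drops the newline characters, each entry can be used directly as a label.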

In [None]:
model = YOLO("/content/drive/MyDrive/path_to_your_file/yolov8m.pt")


### Explanation
- **`YOLO`**: This is the class from the Ultralytics library that sets up a YOLO model instance.

- **`"/content/drive/MyDrive/path_to_your_file/yolov8m.pt"`**: This specifies the location and filename of the YOLOv8 weights file (in this case, `yolov8m.pt`), which contains the pre-trained parameters required for running YOLOv8. Replace `path_to_your_file` with the folder that holds the weights in your own Drive.

Once initialized, the model instance `model` is ready to detect objects in images or video frames using the YOLOv8 architecture. The "m" in `yolov8m.pt` stands for "medium" and reflects the model's size and speed; other options include `yolov8n.pt` (nano) and `yolov8s.pt` (small), each with different performance characteristics.
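Since the weights path contains a placeholder folder, it can help to check that the file exists before handing it to `YOLO`. The helper below is a minimal sketch; the name `require_weights` is hypothetical and not part of Ultralytics.

```python
from pathlib import Path


def require_weights(path):
    """Return the path as a Path object, or raise if the file is missing."""
    weights = Path(path)
    if not weights.is_file():
        raise FileNotFoundError(
            f"Weights file not found: {weights}. "
            "Check that Google Drive is mounted and the path is correct."
        )
    return weights


# Possible usage with the placeholder path from the cell above:
# model = YOLO(str(require_weights("/content/drive/MyDrive/path_to_your_file/yolov8m.pt")))
```

Failing early with a clear message is usually easier to diagnose than an error raised deeper inside the model loader.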

In [None]:
cap = cv2.VideoCapture('/content/sample_data/m.avi')
count = 0
area = [(222,118),(194,337),(799,300),(728,112)]
people_count = []


### Explanation:

1. **`cap = cv2.VideoCapture('/content/sample_data/m.avi')`**:
   - This line creates a video capture object named `cap` using OpenCV’s `VideoCapture` function, which is used to read a video file.
   - The path `'/content/sample_data/m.avi'` indicates the location of the video file you want to analyze. This file is expected to be in the `sample_data` directory of your current environment (e.g., Google Colab).

2. **`count = 0`**:
   - This initializes a variable named `count` to zero. It serves as a frame counter so that the processing loop can skip every other frame for performance reasons.

3. **`area = [(222, 118), (194, 337), (799, 300), (728, 112)]`**:
   - This creates a list named `area`, which contains tuples representing the coordinates of points that define a polygonal area (a region of interest) in the video frame.
   - These coordinates can be used for object tracking or counting within a specific area, such as counting people who enter or exit that defined area.

4. **`people_count = []`**:
   - This initializes an empty list named `people_count`. It will be used to store unique identifiers (IDs) of people detected in the video.
   - By maintaining this list, the program can count how many unique people have entered the defined area without double-counting anyone who appears in multiple frames.


In [None]:
output_path = '/content/processed_output3.avi'
fourcc = cv2.VideoWriter_fourcc(*'XVID')
out = cv2.VideoWriter(output_path, fourcc, 10.0, (1020,500))

## Explanation:

### What is an .avi file?
An .avi (Audio Video Interleave) file is a multimedia container format used for storing video and audio data.

### What is a codec?
A codec (short for coder-decoder or compressor-decompressor) is a software or hardware tool that encodes and decodes digital data streams, particularly audio and video. Codecs compress files to save space and decompress them for playback or editing.

XVID is an open-source codec that is a popular choice for creating AVI files.

### What is the VideoWriter_fourcc function?
The `cv2.VideoWriter_fourcc` function in OpenCV builds the four-character code that identifies the codec used to encode the video file.

1. **`output_path = '/content/processed_output3.avi'`**:
   - This line defines a variable named `output_path` that stores the file path where the processed video will be saved. In this case, the processed video will be saved as `processed_output3.avi` in the `/content` directory (common in environments like Google Colab).

2. **`fourcc = cv2.VideoWriter_fourcc(*'XVID')`**:
   - This line creates a FourCC code using OpenCV's `VideoWriter_fourcc` function.
   - FourCC (Four Character Code) is a 4-byte code used to specify the video codec that will be used for encoding the output video.
   - Here, `*'XVID'` specifies the XVID codec, which is a popular video codec that provides good compression and quality. The asterisk (*) unpacks the string into individual characters.

3. **`out = cv2.VideoWriter(output_path, fourcc, 10.0, (1020, 500))`**:
   - This line initializes a `VideoWriter` object named `out`. The `VideoWriter` object is used to write frames to a video file.
   - The parameters passed to `VideoWriter` are:
     - `output_path`: The path where the processed video will be saved.
     - `fourcc`: The codec used for encoding the video (in this case, XVID).
     - `10.0`: The frames per second (FPS) for the output video. This means the video will play back at 10 frames per second.
     - `(1020, 500)`: The size of the video frames (width, height) that will be written to the output file. This must match the dimensions of the frames being processed.
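To make the FourCC idea concrete, the sketch below packs four characters into a single integer with the first character in the lowest byte, which is the usual FourCC layout. The helper name `fourcc_code` is hypothetical; in the notebook this job is done by `cv2.VideoWriter_fourcc`.

```python
def fourcc_code(c1, c2, c3, c4):
    """Pack four ASCII characters into one integer, lowest byte first."""
    return (
        (ord(c1) & 255)
        | ((ord(c2) & 255) << 8)
        | ((ord(c3) & 255) << 16)
        | ((ord(c4) & 255) << 24)
    )


# The asterisk unpacks the string into four separate arguments.
code = fourcc_code(*"XVID")

# Unpack the integer again to confirm the round trip.
chars = "".join(chr((code >> (8 * i)) & 255) for i in range(4))
print(code, chars)
```

The round trip shows that the integer is only a compact encoding of the four letters, not a property of the codec itself.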

### Overall Purpose:
This code snippet sets up the necessary components to save the processed video output. It specifies the output file path, selects the video codec, and defines the frame rate and resolution of the output video. This setup allows you to write the frames that are processed in your video analysis loop to a new video file.



Let's go through the main processing loop line by line:

```python
# Initialize counter and ID mapping
count = 0
people_count = []
id_map = {}
custom_id = 1
```
- **count = 0**: This initializes a frame counter `count` to 0, which will be used later to control frame processing frequency.
- **people_count = []**: This list will store unique tracking IDs for detected people to avoid double-counting.
- **id_map = {}**: This dictionary will map tracker IDs (from the tracking model) to custom sequential IDs for each person.
- **custom_id = 1**: This sets the starting value for assigning custom IDs to people detected in the frame.

---

```python
while True:
    # Capture frame-by-frame
    ret, frame = cap.read()
    if not ret:
        break
```
- **while True**: This creates an infinite loop to process each video frame continuously.
- **ret, frame = cap.read()**: Reads a frame from the video stream (`cap`). `ret` is a boolean flag that is `True` if a frame is successfully read; otherwise, it's `False`. `frame` contains the actual frame.
- **if not ret: break**: If no frame is captured (e.g., at the end of the video), the loop breaks and processing stops.

---

```python
    count += 1
    if count % 2 != 0:  # Skip every other frame
        continue
```
- **count += 1**: This increments the frame counter by 1 for every frame read.
- **if count % 2 != 0: continue**: Skips odd-numbered frames so that only every second frame is processed. This halves the number of frames sent to the model and so reduces the processing load.
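The read-and-skip pattern can be tried without a video file. In the sketch below, `FakeCapture` is a hypothetical stand-in that mimics only the `(ret, frame)` return shape of `cap.read()`; it is not an OpenCV class.

```python
class FakeCapture:
    """Hypothetical stand-in for a video capture that yields numbered frames."""

    def __init__(self, total_frames):
        self.total = total_frames
        self.index = 0

    def read(self):
        # Mimic cap.read(): (True, frame) while frames remain, else (False, None).
        if self.index >= self.total:
            return False, None
        self.index += 1
        return True, f"frame-{self.index}"


demo_cap = FakeCapture(total_frames=7)
demo_count = 0
processed = []

while True:
    ret, frame = demo_cap.read()
    if not ret:
        break  # end of the stream
    demo_count += 1
    if demo_count % 2 != 0:  # odd-numbered frames are skipped
        continue
    processed.append(frame)

print(processed)
```

Every frame is still read and counted; only the even-numbered ones reach the processing step.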

---

```python
    # Resize frame and process with the model
    frame = cv2.resize(frame, (1020, 500))
    results = model.track(frame, persist=True)
```
- **frame = cv2.resize(frame, (1020, 500))**: Resizes the current frame to 1020x500 pixels. This matches the frame size given to `cv2.VideoWriter`, so the processed frame can be written to the output file.
- **results = model.track(frame, persist=True)**: The tracking model processes the frame and returns detection results (bounding boxes, class IDs, track IDs, and confidence scores). `persist=True` keeps the tracker's internal state between calls so that the same object keeps the same track ID across frames.

---

```python
    # Check if there are detections
    if results[0].boxes is not None and results[0].boxes.id is not None:
```
- **if results[0].boxes is not None and results[0].boxes.id is not None**: Checks that the frame produced detection results and that the tracker assigned IDs to them. `results[0].boxes.id` is `None` when nothing is being tracked in the frame, so this guard prevents the `.int()` calls that follow from failing.

---

```python
        boxes = results[0].boxes.xyxy.int().cpu().tolist()
        class_ids = results[0].boxes.cls.int().cpu().tolist()
        track_ids = results[0].boxes.id.int().cpu().tolist()
        confidences = results[0].boxes.conf.cpu().tolist()
```
- **boxes = results[0].boxes.xyxy.int().cpu().tolist()**: Extracts the bounding box coordinates (top-left x, y and bottom-right x, y) from the detected objects and converts them to integer format, then moves the data to CPU memory and converts it into a list.
- **class_ids = results[0].boxes.cls.int().cpu().tolist()**: Extracts the numeric class IDs (for example, `0` for "person" in the COCO list), converts them to integers, moves them to CPU, and converts them to a list.
- **track_ids = results[0].boxes.id.int().cpu().tolist()**: Extracts the tracking IDs (unique identifiers for each detected object) and converts them to integers, moves them to CPU, and converts to a list.
- **confidences = results[0].boxes.conf.cpu().tolist()**: Extracts the confidence scores (likelihood of correct detection) and moves them to CPU, then converts to a list.

---

```python
        for box, class_id, track_id, conf in zip(boxes, class_ids, track_ids, confidences):
            c = class_name[class_id]
            x1, y1, x2, y2 = box
            cx = int(x1 + x2) // 2
            cy = int(y1 + y2) // 2
```
- **for box, class_id, track_id, conf in zip(boxes, class_ids, track_ids, confidences)**: Iterates through all detected objects in the current frame, unpacking their bounding box, class ID, tracking ID, and confidence score.
- **c = class_name[class_id]**: Converts the numeric class ID into a readable class name (e.g., 'person', 'car') by indexing into the `class_name` list.
- **x1, y1, x2, y2 = box**: Extracts the bounding box coordinates (top-left and bottom-right corners).
- **cx = int(x1 + x2) // 2**: Computes the x-coordinate of the center of the bounding box.
- **cy = int(y1 + y2) // 2**: Computes the y-coordinate of the center of the bounding box.
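The center calculation can be isolated into a small helper for experimentation. The name `box_center` and the sample boxes are illustrative assumptions.

```python
def box_center(box):
    """Return the integer center (cx, cy) of an (x1, y1, x2, y2) box."""
    x1, y1, x2, y2 = box
    # Floor division keeps the result an integer pixel coordinate.
    return (x1 + x2) // 2, (y1 + y2) // 2


center_even = box_center((100, 50, 200, 250))
center_odd = box_center((10, 20, 15, 25))
print(center_even, center_odd)
```

When the sum of two coordinates is odd, floor division rounds the center down by half a pixel, which is negligible for the area test.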

---

```python
            # Check if the detected object is a person and within the specified area
            if 'person' in c:
                result = cv2.pointPolygonTest(np.array(area, np.int32), (cx, cy), False)
                if result >= 0:
```
- **if 'person' in c**: Checks if the detected object is classified as a person.
- **result = cv2.pointPolygonTest(np.array(area, np.int32), (cx, cy), False)**: Uses OpenCV's `pointPolygonTest` to check whether the box center `(cx, cy)` lies within the polygon `area`. With the last argument set to `False`, the function returns `+1` if the point is inside, `0` if it is on an edge, and `-1` if it is outside, so `result >= 0` accepts points inside or on the boundary.
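To build intuition for what a point-in-polygon test does, here is a pure-Python ray-casting sketch applied to the notebook's `area` coordinates. This is an illustration of the general technique, not OpenCV's implementation, and unlike `pointPolygonTest` it does not treat points exactly on an edge specially. The function name `point_in_polygon` is hypothetical.

```python
AREA = [(222, 118), (194, 337), (799, 300), (728, 112)]


def point_in_polygon(point, polygon):
    """Ray-casting test: count edge crossings of a ray going right from point."""
    px, py = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > py) != (y2 > py):
            # x-coordinate where the edge crosses that horizontal line
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside  # each crossing flips the state
    return inside


print(point_in_polygon((500, 200), AREA))  # a point in the middle of the region
print(point_in_polygon((10, 10), AREA))    # a point near the frame corner
```

An odd number of crossings means the point is inside; an even number means it is outside.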
  
---

```python
                    # Assign a unique ID if not already mapped
                    if track_id not in id_map:
                        id_map[track_id] = custom_id
                        custom_id += 1
```
- **if track_id not in id_map**: Checks if this tracking ID has been seen before. If not, a custom ID is assigned.
- **id_map[track_id] = custom_id**: Maps the current tracking ID to a custom sequential ID.
- **custom_id += 1**: Increments the custom ID for the next new person.
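The mapping step can be simulated with a made-up sequence of tracker IDs. The IDs below are hypothetical; a real tracker may assign different values.

```python
demo_id_map = {}
next_custom_id = 1
labels = []

# Hypothetical tracker IDs as they might appear over several frames.
for track_id in [7, 3, 7, 12, 3]:
    if track_id not in demo_id_map:
        # First time this tracker ID is seen: give it the next sequential ID.
        demo_id_map[track_id] = next_custom_id
        next_custom_id += 1
    labels.append(f"ID-{demo_id_map[track_id]}")

print(labels)
```

Repeated tracker IDs reuse their existing custom ID, so the on-screen labels stay compact (1, 2, 3, ...) even when the tracker's own IDs have gaps.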

---

```python
                    # Draw bounding box and ID
                    cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
                    cvzone.putTextRect(frame, f'ID-{id_map[track_id]}', (x1, y1), 1, 1)
```
- **cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)**: Draws a green rectangle around the detected person using their bounding box coordinates.
- **cvzone.putTextRect(frame, f'ID-{id_map[track_id]}', (x1, y1), 1, 1)**: Draws the custom ID of the detected person near the top-left corner of the bounding box.

---

```python
                    # Update people count if ID is new
                    if track_id not in people_count:
                        people_count.append(track_id)
```
- **if track_id not in people_count**: If this is a new person (i.e., their tracking ID hasn't been counted yet), add their tracking ID to `people_count` to avoid counting them again.
- **people_count.append(track_id)**: Adds the current tracking ID to the list of counted people.
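The same de-duplication can be written with a set, which is one possible alternative rather than what the notebook does. The sketch below runs both approaches on hypothetical track IDs to show that they give the same count.

```python
observed = [4, 9, 4, 4, 15, 9]  # hypothetical track IDs seen inside the area

counted_list = []
counted_set = set()
for track_id in observed:
    # List version, as in the notebook: check membership before appending.
    if track_id not in counted_list:
        counted_list.append(track_id)
    # Set version: adding an element that is already present has no effect.
    counted_set.add(track_id)

print(len(counted_list), len(counted_set))
```

The list keeps IDs in first-seen order, while the set offers faster membership checks for long videos. In this notebook `id_map` and `people_count` are updated under the same condition, so `len(id_map)` would give the same number.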

---

```python
        # Display the people counter
        p_counter = len(people_count)
        cvzone.putTextRect(frame, f'People counter: {p_counter}', (50, 60), 2, 2)
        cv2.polylines(frame, [np.array(area, np.int32)], True, (255, 0, 0), 2)
```
- **p_counter = len(people_count)**: Counts the unique people whose bounding-box center has been inside the area so far.
- **cvzone.putTextRect(frame, f'People counter: {p_counter}', (50, 60), 2, 2)**: Displays the total people count on the frame at position (50, 60).
- **cv2.polylines(frame, [np.array(area, np.int32)], True, (255, 0, 0), 2)**: Draws the specified polygonal area on the frame in blue to visually show the region where people are being counted.

---

```python
        # Write frame to output video
        out.write(frame)
```
- **out.write(frame)**: Writes the current frame (with bounding boxes, IDs, and counter) to the output video file. Because this call sits inside the detection check, frames with no tracked objects are not written.

---

```python
# Release resources
cap.release()
out.release()
print("Video saved as:", output_path)
```
- **cap.release()**: Releases the video capture object (`cap`) to free up system resources.
- **out.release()**: Releases the video writer object (`out`) used to save the output video.
- **print("Video saved as:", output_path)**: Prints the file path of the saved video, indicating that processing has completed.

This entire code is part of an object detection and tracking application, where people are detected and tracked in video frames. The code ensures that each person is uniquely identified and counted only once, and the results are visualized and saved to a new video file.
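If the loop raises an error partway through, the release calls at the end are never reached. One possible pattern, sketched below with hypothetical stand-in objects rather than real OpenCV handles, is to wrap the processing in `try`/`finally` so that both resources are always released.

```python
class FakeResource:
    """Hypothetical stand-in for an object that exposes release()."""

    def __init__(self):
        self.released = False

    def release(self):
        self.released = True


def process(reader, writer, fail=False):
    """Run the (simulated) processing and always release both resources."""
    try:
        if fail:
            raise RuntimeError("simulated failure while processing a frame")
    finally:
        reader.release()
        writer.release()


demo_reader = FakeResource()
demo_writer = FakeResource()
try:
    process(demo_reader, demo_writer, fail=True)
except RuntimeError as exc:
    error_message = str(exc)

print(demo_reader.released, demo_writer.released)
```

The error still propagates to the caller, but the output file is closed properly, which matters because an unreleased writer can leave the video incomplete.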

In [None]:

# Initialize counter and ID mapping
count = 0
people_count = []
id_map = {}
custom_id = 1

while True:
    # Capture frame-by-frame
    ret, frame = cap.read()
    if not ret:
        break

    count += 1
    if count % 2 != 0:  # Skip every other frame
        continue

    # Resize frame and process with the model
    frame = cv2.resize(frame, (1020, 500))
    results = model.track(frame, persist=True)

    # Check if there are detections
    if results[0].boxes is not None and results[0].boxes.id is not None:
        boxes = results[0].boxes.xyxy.int().cpu().tolist()
        class_ids = results[0].boxes.cls.int().cpu().tolist()
        track_ids = results[0].boxes.id.int().cpu().tolist()
        confidences = results[0].boxes.conf.cpu().tolist()

        for box, class_id, track_id, conf in zip(boxes, class_ids, track_ids, confidences):
            c = class_name[class_id]
            x1, y1, x2, y2 = box
            cx = int(x1 + x2) // 2
            cy = int(y1 + y2) // 2

            # Check if the detected object is a person and within the specified area
            if 'person' in c:
                result = cv2.pointPolygonTest(np.array(area, np.int32), (cx, cy), False)
                if result >= 0:
                    # Assign a unique ID if not already mapped
                    if track_id not in id_map:
                        id_map[track_id] = custom_id
                        custom_id += 1

                    # Draw bounding box and ID
                    cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
                    cvzone.putTextRect(frame, f'ID-{id_map[track_id]}', (x1, y1), 1, 1)

                    # Update people count if ID is new
                    if track_id not in people_count:
                        people_count.append(track_id)

        # Display the people counter
        p_counter = len(people_count)
        cvzone.putTextRect(frame, f'People counter: {p_counter}', (50, 60), 2, 2)
        cv2.polylines(frame, [np.array(area, np.int32)], True, (255, 0, 0), 2)

        # Write frame to output video
        out.write(frame)

# Release resources
cap.release()
out.release()
print("Video saved as:", output_path)



0: 320x640 11 persons, 23.8ms
Speed: 2.3ms preprocess, 23.8ms inference, 1.7ms postprocess per image at shape (1, 3, 320, 640)

0: 320x640 10 persons, 1 elephant, 22.8ms
Speed: 2.6ms preprocess, 22.8ms inference, 1.4ms postprocess per image at shape (1, 3, 320, 640)

0: 320x640 11 persons, 22.9ms
Speed: 2.5ms preprocess, 22.9ms inference, 1.5ms postprocess per image at shape (1, 3, 320, 640)

0: 320x640 11 persons, 22.9ms
Speed: 3.7ms preprocess, 22.9ms inference, 1.5ms postprocess per image at shape (1, 3, 320, 640)

0: 320x640 10 persons, 22.9ms
Speed: 3.0ms preprocess, 22.9ms inference, 1.5ms postprocess per image at shape (1, 3, 320, 640)

0: 320x640 9 persons, 22.9ms
Speed: 3.4ms preprocess, 22.9ms inference, 1.6ms postprocess per image at shape (1, 3, 320, 640)

0: 320x640 10 persons, 22.9ms
Speed: 2.3ms preprocess, 22.9ms inference, 1.5ms postprocess per image at shape (1, 3, 320, 640)

0: 320x640 10 persons, 23.0ms
Speed: 2.7ms preprocess, 23.0ms inference, 1.7ms postprocess p

In [None]:
!git clone https://github.com/lakhangudigar/OpenCv.git
