# Introduction and Library Installation

In this section, we'll introduce the Ultralytics and OpenCV libraries and guide you through the installation process using pip.

## Introduction to Ultralytics and OpenCV Libraries

`Ultralytics` is a popular open-source computer vision library that provides implementations of the YOLO family of deep learning models for object detection, instance segmentation, pose estimation, and image classification. It offers easy-to-use APIs and pre-trained models, making it a go-to choice for many developers and researchers in the field of computer vision.

`OpenCV (Open Source Computer Vision Library)` is another widely used library for computer vision tasks. It provides a comprehensive set of functions and algorithms for image processing, video analysis, and machine learning. OpenCV is written in C++ and has bindings for Python, making it accessible and versatile for various applications.

## Installing Libraries using pip

To install Ultralytics and OpenCV libraries, you can use the Python package manager pip. If you haven't installed pip yet, you can do so by following the instructions on the official Python website.

Once pip is installed, you can install Ultralytics and OpenCV by running the following commands in your terminal or command prompt:

`pip install ultralytics`  `# Install the Ultralytics library for YOLO models`

`pip install opencv-python`  `# Install the OpenCV library for computer vision tasks`

Make sure to run these commands in your terminal with appropriate permissions, or within a virtual environment if you're working in a project-specific environment.
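
To confirm that both packages are visible in the active environment, a quick check along these lines can help. It is a minimal sketch that relies only on the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical, and the distribution names assume the `ultralytics` and `opencv-python` packages as published on PyPI.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string of a pip distribution, or None."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Distribution names as published on PyPI (an assumption of this sketch).
for name in ("ultralytics", "opencv-python"):
    found = installed_version(name)
    print(f"{name}: {found if found else 'not installed'}")
```

If either line reports `not installed`, the most likely cause is that pip ran in a different environment from the one executing the notebook.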


With Ultralytics and OpenCV installed, you're ready to dive into object detection and image processing tasks using YOLO models and OpenCV functions.

# Problem Introduction

In this section, we'll delve into the problem space surrounding object detection and image processing using YOLO (You Only Look Once) models. We'll explore the significance of object detection in images and its diverse applications across various domains.

## Explanation about Object Detection and Image Processing using YOLO Models

Object detection is a fundamental task in computer vision that involves identifying and locating objects within images or video frames. The goal is to accurately classify the objects present in an image and determine their precise locations using bounding boxes.

YOLO (You Only Look Once) is a popular deep learning architecture for object detection tasks. Unlike traditional approaches that involve sliding window techniques or region proposal networks, YOLO takes a different approach by framing object detection as a regression problem. It divides the input image into a grid and predicts bounding boxes and class probabilities directly from the entire image in one pass.
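
The grid idea can be illustrated with a small decoding function. This is a simplified sketch in the spirit of the original YOLO formulation, not the exact prediction head used by YOLOv8: it assumes each grid cell predicts a center offset inside the cell (`tx`, `ty`, both in 0..1) and a box size as a fraction of the whole image (`tw`, `th`). The function name and parameter layout are hypothetical.

```python
def decode_cell_prediction(row, col, tx, ty, tw, th, grid_size, img_w, img_h):
    """Convert one grid-cell prediction into pixel (x1, y1, x2, y2) corners."""
    # Box center: the cell index plus the offset inside the cell, scaled to pixels.
    cx = (col + tx) / grid_size * img_w
    cy = (row + ty) / grid_size * img_h
    # Box size: predicted as a fraction of the full image.
    w = tw * img_w
    h = th * img_h
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


# A box centered in the middle cell of a 7x7 grid over a 700x700 image.
print(decode_cell_prediction(3, 3, 0.5, 0.5, 0.25, 0.5, 7, 700, 700))
```

Because every cell produces its predictions in the same forward pass, the whole image is processed once rather than region by region.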

## Importance of Object Detection in Images and its Applications

`Autonomous Vehicles`: Object detection is essential for autonomous vehicles to perceive and understand their surroundings. It enables vehicles to detect and track pedestrians, vehicles, traffic signs, and other objects on the road to ensure safe navigation.


`Surveillance Systems`: In security and surveillance systems, object detection helps in detecting and monitoring suspicious activities, intruders, and objects of interest in video streams or static images.


`Retail Analytics`: Object detection is used in retail environments for inventory management, shelf monitoring, and customer behavior analysis. It enables retailers to track product availability, optimize shelf layouts, and analyze customer interactions with products.


`Medical Imaging`: In medical imaging, object detection aids in identifying anatomical structures, abnormalities, and pathological conditions in medical scans such as X-rays, MRIs, and CT scans. It assists healthcare professionals in diagnosis, treatment planning, and disease monitoring.


By accurately detecting and localizing objects in images, YOLO models facilitate a wide range of applications across industries, contributing to advancements in technology, safety, and efficiency.

# Using YOLO Model for Object Detection in Images

In this section, we'll learn how to utilize the YOLO (You Only Look Once) model for object detection in images. We'll cover the steps involved in loading the YOLO model, detecting objects in an image, and interpreting the detected results to display object categories and model confidence.

In [2]:
# Import the required libraries
from ultralytics import YOLO
from ultralytics.utils.plotting import Annotator
import cv2

In [3]:
# Create a YOLO detection model from the pre-trained weights file yolov8n.pt
model = YOLO('yolov8n.pt')

In [4]:
# Create a second YOLO model for segmentation from yolov8n-seg.pt
model2 = YOLO('yolov8n-seg.pt')

## Detecting Objects in an Image and Displaying Results

Once the YOLO model is loaded, we can use it to detect objects in an image. We'll pass the image to the model and obtain the detection results, which include bounding boxes, object categories, and confidence scores. We'll then visualize the detection results by drawing bounding boxes around detected objects on the image.

In [5]:
# Run detection on the image bus.jpg and keep the results
result = model('bus.jpg')
print(result[0].boxes.cls[0])  # Print the class index of the first detected object


image 1/1 c:\Users\ya30\Desktop\aioleran7\ai_deep_j12\bus.jpg: 320x640 1 bus, 93.1ms
Speed: 2.0ms preprocess, 93.1ms inference, 2.0ms postprocess per image at shape (1, 3, 320, 640)


tensor(5.)


In [6]:
# Look up the class name of the first detected object
# (cast the class index to int before using it as a dictionary key)
class_names = result[0].names
class_names[int(result[0].boxes.cls[0].item())]

'bus'

In [7]:
# Compute and print the model's confidence for the first detection, as a percentage
conf = result[0].boxes[0].conf[0].item() * 100
print(conf, '%')

93.7484860420227 %
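
The cells above inspect only the first detection. A small helper can format every detection at once; the sketch below works on plain Python lists so it runs without a model. The function name is hypothetical and the sample values are made up. With a real result, the inputs could plausibly come from `result[0].boxes.cls.tolist()`, `result[0].boxes.conf.tolist()`, and `result[0].names`.

```python
def describe_detections(class_ids, confidences, names):
    """Return 'label confidence%' strings, highest confidence first."""
    pairs = sorted(zip(confidences, class_ids), reverse=True)
    return [f"{names[int(cls)]} {conf * 100:.1f}%" for conf, cls in pairs]


# Made-up values for illustration only.
names = {0: "person", 5: "bus"}
for line in describe_detections([0.0, 5.0], [0.5, 0.75], names):
    print(line)
```

Casting each class index with `int` mirrors the lookup above, since the model reports class indices as floating-point values.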


In [8]:
# Read the image sakht.jpg with OpenCV
img = cv2.imread("sakht.jpg")

In [9]:
# Run detection on the image sakht.jpg
result = model('sakht.jpg')


image 1/1 c:\Users\ya30\Desktop\aioleran7\ai_deep_j12\sakht.jpg: 448x640 5 persons, 1 car, 1 dog, 114.6ms
Speed: 4.0ms preprocess, 114.6ms inference, 1.0ms postprocess per image at shape (1, 3, 448, 640)


In [10]:
# Create an Annotator that draws boxes and labels onto img
box = Annotator(img)

In [11]:
# Label every detected object whose confidence exceeds 0.2
for det in result[0].boxes:
    confidence = float(det.conf)
    if confidence > 0.2:
        class_name = result[0].names[int(det.cls)]
        # (0, 0, 255) is red in OpenCV's BGR color order
        box.box_label(det.xyxy[0], f"{class_name} {confidence:.2f}", (0, 0, 255))
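
The confidence threshold of 0.2 decides which detections get a label. To see how the choice of threshold changes what is kept, the following sketch counts detections per class above a cutoff. It uses plain lists with made-up values, and the helper name `count_by_class` is hypothetical; the class indices follow the COCO label set that the pre-trained model reports (0 for person, 2 for car, 16 for dog).

```python
from collections import Counter


def count_by_class(class_ids, confidences, names, min_conf=0.2):
    """Count detections per class name, keeping confidences above min_conf."""
    kept = (names[int(c)] for c, p in zip(class_ids, confidences) if p > min_conf)
    return Counter(kept)


# Made-up detections for illustration only.
names = {0: "person", 2: "car", 16: "dog"}
class_ids = [0, 0, 2, 16, 0]
confidences = [0.91, 0.85, 0.60, 0.35, 0.15]

print(count_by_class(class_ids, confidences, names))                # default cutoff 0.2
print(count_by_class(class_ids, confidences, names, min_conf=0.5))  # stricter cutoff
```

A low cutoff keeps more uncertain boxes on the image, while a higher one trades recall for cleaner output.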

In [12]:
# Display the annotated image; press any key to close the window
cv2.imshow("ax", img)
cv2.waitKey(0)
cv2.destroyAllWindows()

In [13]:
# Run segmentation on the image sakht.jpg and display the results
result = model2('sakht.jpg', show=True)


image 1/1 c:\Users\ya30\Desktop\aioleran7\ai_deep_j12\sakht.jpg: 448x640 5 persons, 1 car, 1 dog, 182.3ms
Speed: 1.0ms preprocess, 182.3ms inference, 9.6ms postprocess per image at shape (1, 3, 448, 640)


In [18]:
# Create a third YOLO model for keypoint (pose) estimation from yolov8n-pose.pt
model3 = YOLO('yolov8n-pose.pt')

Downloading https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8n-pose.pt to 'yolov8n-pose.pt'...
100%|██████████| 6.51M/6.51M [00:01<00:00, 4.53MB/s]


In [19]:
# Run keypoint estimation on the image sakht.jpg and display the results
result = model3('sakht.jpg', show=True)


image 1/1 c:\Users\ya30\Desktop\aioleran7\ai_deep_j12\sakht.jpg: 448x640 4 persons, 100.8ms
Speed: 1.0ms preprocess, 100.8ms inference, 1.0ms postprocess per image at shape (1, 3, 448, 640)
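
The pose model predicts a fixed set of keypoints per detected person. Assuming the pre-trained weights follow the 17-point COCO keypoint layout, the sketch below attaches a name to each point. It operates on a plain list of `(x, y)` pairs with made-up values; with a real result, one person's pairs could plausibly be obtained from `result[0].keypoints.xy[0].tolist()`. The helper name `name_keypoints` is hypothetical.

```python
# Keypoint order of the COCO keypoints dataset (17 points per person).
COCO_KEYPOINT_NAMES = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]


def name_keypoints(person_xy):
    """Map one person's 17 (x, y) pairs to a {name: (x, y)} dictionary."""
    if len(person_xy) != len(COCO_KEYPOINT_NAMES):
        raise ValueError(f"expected {len(COCO_KEYPOINT_NAMES)} keypoints")
    return {name: (x, y) for name, (x, y) in zip(COCO_KEYPOINT_NAMES, person_xy)}


# Made-up coordinates for illustration only.
fake_person = [[float(i), float(2 * i)] for i in range(17)]
named = name_keypoints(fake_person)
print(named["nose"], named["left_wrist"])
```

Named keypoints make it easier to compute simple quantities such as the distance between two joints.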

