### YOLO v1 Architecture and Functionality
* The input image is divided into an S × S grid of cells. Each grid cell is responsible for predicting B bounding boxes and their confidence scores, as well as C class probabilities. 

* The bounding box predictions include the coordinates (x, y) of the box center relative to the grid cell, the width (w) and height (h) of the box relative to the entire image, and a confidence score that indicates how certain the model is that the box contains an object.
* Thus for each grid cell, the model predicts a total of B * 5 + C values.
* Confidence score (one per bounding box) = Pr(Object) * IoU(pred, truth)
* Class probabilities (one set per grid cell, shared by its B boxes) = Pr(Class_i | Object)
* Output tensor shape = S × S × (B * 5 + C)
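
To make the per-cell arithmetic concrete, here is a minimal sketch. It assumes the settings reported in the original YOLO paper for PASCAL VOC (S = 7, B = 2, C = 20, 448 × 448 input). The helper names `values_per_cell`, `output_shape`, and `decode_box` are illustrative and do not come from any library.

```python
def values_per_cell(B, C):
    """Each box carries 5 numbers (x, y, w, h, confidence); class scores are shared per cell."""
    return B * 5 + C


def output_shape(S, B, C):
    """Shape of the full prediction tensor for one image."""
    return (S, S, values_per_cell(B, C))


def decode_box(row, col, x, y, w, h, S, img_w, img_h):
    """Convert one cell-relative prediction to a pixel-space center and size.

    Assumes x, y are offsets in [0, 1] inside grid cell (row, col)
    and w, h are fractions of the whole image.
    """
    cx = (col + x) / S * img_w
    cy = (row + y) / S * img_h
    return cx, cy, w * img_w, h * img_h


print(output_shape(7, 2, 20))  # (7, 7, 30)
# A box centered in the middle cell of a 7 x 7 grid on a 448 x 448 image:
print(decode_box(3, 3, 0.5, 0.5, 0.5, 0.25, S=7, img_w=448, img_h=448))
# (224.0, 224.0, 224.0, 112.0)
```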

* If a cell contains a ground-truth object (the object's center falls in that cell), the one of the cell's B box predictors with the highest IoU against that ground-truth box is chosen as responsible for it. The localization loss and the object confidence loss are computed only for that predictor. The cell's other predictors are ignored for that object and contribute only to the no-object confidence term.
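
The selection rule above can be sketched in a few lines. This is an illustration rather than the original training code: it assumes boxes have already been converted to corner format `(x1, y1, x2, y2)`, and the function names `iou` and `responsible_predictor` are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes in (x1, y1, x2, y2) corner format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def responsible_predictor(pred_boxes, gt_box):
    """Index of the predictor (among a cell's B boxes) with the highest IoU to the ground truth."""
    ious = [iou(p, gt_box) for p in pred_boxes]
    return max(range(len(ious)), key=ious.__getitem__)


# Two predictors in one cell (B = 2); the second overlaps the ground truth best.
preds = [(0, 0, 2, 2), (1, 1, 3, 3)]
truth = (1, 1, 3, 3)
print(responsible_predictor(preds, truth))  # 1
```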

* The loss function used in YOLOv1 is a combination of multiple components that measure the accuracy of the model's predictions. These components include:
  - Localization Loss: Measures the error in the predicted bounding box coordinates (x, y, w, h) for the boxes responsible for detecting objects.
  - Confidence Loss: Measures the error in the confidence scores for both the boxes that contain objects and those that do not.
  - Classification Loss: Measures the error in the predicted class probabilities for the grid cells that contain objects.
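
Written out, these components form the sum-of-squared-errors loss given in the original YOLO paper. In the notation below (chosen here for illustration), hats mark the model's predictions, $\mathbb{1}_{ij}^{\text{obj}}$ is 1 when predictor $j$ of cell $i$ is responsible for an object, $\mathbb{1}_{ij}^{\text{noobj}}$ is 1 when it is not, and $\mathbb{1}_{i}^{\text{obj}}$ is 1 when cell $i$ contains an object center. The paper reports $\lambda_{\text{coord}} = 5$ and $\lambda_{\text{noobj}} = 0.5$.

```latex
\begin{aligned}
\mathcal{L} ={}& \lambda_{\text{coord}} \sum_{i=1}^{S^2} \sum_{j=1}^{B} \mathbb{1}_{ij}^{\text{obj}}
      \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right] \\
 &+ \lambda_{\text{coord}} \sum_{i=1}^{S^2} \sum_{j=1}^{B} \mathbb{1}_{ij}^{\text{obj}}
      \left[ \left(\sqrt{w_i} - \sqrt{\hat{w}_i}\right)^2 + \left(\sqrt{h_i} - \sqrt{\hat{h}_i}\right)^2 \right] \\
 &+ \sum_{i=1}^{S^2} \sum_{j=1}^{B} \mathbb{1}_{ij}^{\text{obj}} \left( C_i - \hat{C}_i \right)^2 \\
 &+ \lambda_{\text{noobj}} \sum_{i=1}^{S^2} \sum_{j=1}^{B} \mathbb{1}_{ij}^{\text{noobj}} \left( C_i - \hat{C}_i \right)^2 \\
 &+ \sum_{i=1}^{S^2} \mathbb{1}_{i}^{\text{obj}} \sum_{c \in \text{classes}} \left( p_i(c) - \hat{p}_i(c) \right)^2
\end{aligned}
```

The first two lines are the localization loss, the next two are the confidence loss (with and without an object), and the last line is the classification loss. The square roots on width and height reduce the weight of a given absolute error on large boxes relative to small ones.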

* During inference, the model outputs a tensor of shape S x S x (B * 5 + C) for each input image. This tensor contains the predicted bounding box coordinates, confidence scores, and class probabilities for each grid cell.
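
At test time, the original paper multiplies each box's confidence by the cell's conditional class probabilities to get class-specific scores, Pr(Class_i | Object) * Pr(Object) * IoU = Pr(Class_i) * IoU. The sketch below shows this for a single cell. The memory layout (B groups of `[x, y, w, h, conf]` followed by C class probabilities) is an assumption made for illustration, and `class_scores` is a hypothetical helper.

```python
def class_scores(cell, B, C):
    """Class-specific confidence scores for every box in one grid cell.

    Assumed layout of `cell`: B groups of [x, y, w, h, conf], then C class probabilities.
    Returns a list of B lists, each holding C scores.
    """
    if len(cell) != B * 5 + C:
        raise ValueError("cell vector must have B * 5 + C entries")
    class_probs = cell[B * 5:]
    scores = []
    for b in range(B):
        conf = cell[b * 5 + 4]  # fifth value of each box group is its confidence
        scores.append([conf * p for p in class_probs])
    return scores


# Toy cell with B = 2 boxes and C = 2 classes.
cell = [0.5, 0.5, 0.2, 0.2, 0.8,   # box 0, confidence 0.8
        0.5, 0.5, 0.3, 0.3, 0.1,   # box 1, confidence 0.1
        0.25, 0.75]                # class probabilities for the cell
for box_index, row in enumerate(class_scores(cell, B=2, C=2)):
    print(box_index, row)
```

Boxes whose best class score falls below a chosen threshold are discarded before the remaining boxes are passed on for duplicate removal.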

* Non-Maximum Suppression (NMS) is applied to filter out overlapping bounding boxes and retain only the most confident ones for each detected object.
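
A minimal sketch of greedy NMS follows. It is a plain-Python illustration, not the routine any particular library uses. It assumes boxes in `(x1, y1, x2, y2)` corner format, and the default threshold of 0.45 simply mirrors the `iou=0.45` value used in the inference cells of this notebook. In practice NMS is usually run separately for each class.

```python
def iou(a, b):
    """Intersection over union of two boxes in (x1, y1, x2, y2) corner format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.45):
    """Return indices of the boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)  # most confident remaining box
        keep.append(best)
        # Drop every remaining box that overlaps the kept one too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2]
```

The second box overlaps the first with IoU 81 / 119 ≈ 0.68, so it is suppressed. The third box does not overlap either and is kept.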

## YOLO for Gate and Flare Detection


### Training Code   

In [None]:
from ultralytics import YOLO
import cv2

model = YOLO("yolo11n.pt")  # Ultralytics names its pretrained YOLO11 nano weights "yolo11n.pt" (no "v")
MODEL_PATH = "/Users/mohammadbilal/Documents/Projects/GateDetection/Models/yolo_v11n.pt"        # path to your trained YOLO .pt file
IMAGE_PATH = "GateDetection/assets/gate+flare-2022.jpg"       # path to input image (used by the image inference cell)
SAVE_PATH = "GateDetection/assets/results/output.jpg"      # output image path




In [None]:
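model.train(
    data='Dataset/data.yaml',  # dataset config (image paths and class names)
    save=True,                 # save checkpoints and final weights
    epochs=100,                # number of training epochs
    imgsz=480,                 # training image size (the inference cells use 640)
    batch=8,                   # batch size
    lr0=0.001,                 # initial learning rate
    lrf=0.1,                   # final learning rate as a fraction of lr0
    momentum=0.9,              # optimizer momentum
    weight_decay=0.0005,       # weight decay (L2 regularization)
    augment=True,              # note: Ultralytics documents this as a prediction-time (TTA) setting
    workers=2                  # dataloader worker processes
)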

### Inference Code

In [None]:
# For Image Inference
model = YOLO(MODEL_PATH)

results = model(
    IMAGE_PATH,
    conf=0.25,        
    iou=0.45,         
    imgsz=640,
    device="cpu"
)

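annotated_img = results[0].plot()  # BGR image array with boxes and labels drawn
# cv2.imwrite returns False instead of raising when the write fails
if not cv2.imwrite(SAVE_PATH, annotated_img):
    raise RuntimeError(f"Could not write {SAVE_PATH}; check that the output directory exists")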

cv2.imshow("YOLO Detection", annotated_img)
cv2.waitKey(0)
cv2.destroyAllWindows()

print("Inference completed. Saved at:", SAVE_PATH)

In [None]:
# For Video Inference
import cv2
from ultralytics import YOLO

VIDEO_PATH = "/Users/mohammadbilal/Documents/Projects/GateDetection/assets/test_files/test_video3.mp4"
SAVE_PATH = "/Users/mohammadbilal/Documents/Projects/GateDetection/assets/results/output_yolo8n.mp4"

# Load model
model = YOLO(MODEL_PATH)

# Open video
cap = cv2.VideoCapture(VIDEO_PATH)
if not cap.isOpened():
    raise RuntimeError("Error opening video file")

# Get video properties
fps = cap.get(cv2.CAP_PROP_FPS) or 30.0  # some files report 0 FPS; 30 is an assumed fallback
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

# Video writer
fourcc = cv2.VideoWriter_fourcc(*"mp4v")
out = cv2.VideoWriter(SAVE_PATH, fourcc, fps, (width, height))

while True:
    ret, frame = cap.read()
    if not ret:
        break

    # YOLO inference on frame
    results = model(
        frame,
        conf=0.25,
        iou=0.45,
        imgsz=640,
        device="cpu"
    )

    # Draw detections
    annotated_frame = results[0].plot()

    # Write frame
    out.write(annotated_frame)

    # Optional display
    cv2.imshow("YOLO Detection", annotated_frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
out.release()
cv2.destroyAllWindows()

print("Video inference completed. Saved at:", SAVE_PATH)



0: 384x640 (no detections), 30.5ms
Speed: 1.4ms preprocess, 30.5ms inference, 0.2ms postprocess per image at shape (1, 3, 384, 640)
