### Real-Time Object Detection System

This project uses the COCO dataset to develop a model that detects objects in live video feeds. You’ll begin by preparing a subset of COCO restricted to the classes relevant to your goals, such as person, car, bicycle, dog, cat, and bus, and apply data augmentation for more robust training. With YOLOv5, a popular model for real-time applications, you’ll load pretrained weights and fine-tune them on the filtered dataset, balancing accuracy against inference speed.
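As a rough illustration of the filtering step, the sketch below keeps only the chosen classes from a COCO-style annotation dictionary. It assumes the standard COCO layout (`images`, `annotations`, and `categories` lists). The names `filter_coco_annotations` and `KEEP_CLASSES` are hypothetical, and converting the result to the label format YOLOv5 expects is not shown.

```python
from typing import Any, Dict, FrozenSet

# Classes named in the project description (assumed to match COCO category names).
KEEP_CLASSES: FrozenSet[str] = frozenset(
    {"person", "car", "bicycle", "dog", "cat", "bus"}
)


def filter_coco_annotations(
    coco: Dict[str, Any], keep: FrozenSet[str] = KEEP_CLASSES
) -> Dict[str, Any]:
    """Return a new COCO-style dict restricted to the categories in `keep`."""
    # Keep only the category entries whose name was requested.
    categories = [c for c in coco["categories"] if c["name"] in keep]
    keep_ids = {c["id"] for c in categories}

    # Keep annotations that point at one of those categories.
    annotations = [a for a in coco["annotations"] if a["category_id"] in keep_ids]

    # Keep images that still have at least one annotation.
    image_ids = {a["image_id"] for a in annotations}
    images = [img for img in coco["images"] if img["id"] in image_ids]

    return {"images": images, "annotations": annotations, "categories": categories}
```

In practice `coco` would come from calling `json.load` on a COCO annotation file, and the filtered dictionary would be written back out with `json.dump`.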

For inference, you’ll use OpenCV to read frames from a webcam or video file and run the detector on each frame in real time. Every detection above a confidence threshold is drawn on the frame as a bounding box with its label and score.

Finally, you’ll deploy the solution using Streamlit, where users can upload video files or use their webcams to view real-time detections directly from a web interface. Streamlit’s simple deployment pipeline allows easy sharing of your project for remote access and feedback.
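A minimal sketch of the upload path of such an app is given below, assuming Streamlit and OpenCV are installed. The helper `save_upload_to_temp` and the placeholder `detect_and_draw` are hypothetical names; the placeholder returns the frame unchanged and marks where the detection step would go. Webcam input is not covered.

```python
import tempfile


def save_upload_to_temp(data: bytes, suffix: str = ".mp4") -> str:
    """Write uploaded bytes to a temporary file and return its path.

    cv2.VideoCapture opens a file path, so the upload is saved to disk first.
    """
    with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tmp:
        tmp.write(data)
        return tmp.name


def detect_and_draw(frame):
    """Hypothetical hook for the detector; returns the frame unchanged here."""
    return frame


def main() -> None:
    # Imported here so the helpers above work without Streamlit or OpenCV.
    import cv2
    import streamlit as st

    st.title("Real-Time Object Detection")
    uploaded = st.file_uploader("Upload a video", type=["mp4", "avi", "mov"])
    if uploaded is None:
        return

    cap = cv2.VideoCapture(save_upload_to_temp(uploaded.getvalue()))
    placeholder = st.empty()  # one slot that is overwritten with each new frame
    try:
        while cap.isOpened():
            ok, frame = cap.read()
            if not ok:
                break
            # OpenCV frames are BGR, so tell Streamlit the channel order.
            placeholder.image(detect_and_draw(frame), channels="BGR")
    finally:
        cap.release()


# Streamlit runs the file as a script, so call main() at module level when deploying:
# main()
```

With the final `main()` call uncommented and the file saved as, say, `app.py`, the app would be started with `streamlit run app.py`.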

With your background in data science and machine learning, this project draws on model training, data preprocessing, and deployment, all of which belong in a data science portfolio. It demonstrates a practical, real-time application of machine learning, which supports your goal of standing out to employers.

### Importing Libraries

In [3]:
!pip install torch torchvision opencv-python streamlit pillow numpy

In [None]:
import torch
import torchvision
from torchvision import models
from torchvision import transforms
import cv2
import numpy as np
import streamlit as st
from PIL import Image
import time

In [None]:
import torch
import torchvision
print("Torch and TorchVision are loaded!")


In [None]:
!pip uninstall -y torch torchvision


In [None]:
!pip install torch torchvision


In [None]:
import torchvision
print(torchvision.__version__)


- `fasterrcnn_resnet50_fpn`: loads a pre-trained Faster R-CNN model with a ResNet-50 backbone.
- `model.eval()`: sets the model to evaluation mode.
- `device`: selects the GPU automatically if one is available, which speeds up inference.


### Real-Time Object Detection with OpenCV
We use OpenCV to capture a live video stream from the camera and perform object detection on each frame.

In [5]:
# Imports are repeated so this cell runs on its own; running it before the
# import cell raises "NameError: name 'transforms' is not defined".
import cv2
import torch
from PIL import Image
from torchvision import transforms

# `model` and `device` must already be defined as described above
# (Faster R-CNN in evaluation mode, moved to the selected device).

# Transform to apply on each frame
transform = transforms.Compose([
    transforms.ToTensor()
])

# Start capturing video from the webcam
cap = cv2.VideoCapture(0)

try:
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break

        # Convert the BGR frame from OpenCV to an RGB PIL image
        pil_img = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))

        # Apply transformation and pass to model
        input_tensor = transform(pil_img).unsqueeze(0).to(device)
        with torch.no_grad():
            outputs = model(input_tensor)

        # Draw bounding boxes and labels on the frame
        for box, label, score in zip(outputs[0]['boxes'], outputs[0]['labels'], outputs[0]['scores']):
            if score.item() >= 0.5:  # Set threshold to 0.5 for detection confidence
                # Plain Python ints avoid type errors in OpenCV drawing calls
                x1, y1, x2, y2 = box.int().tolist()
                cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
                cv2.putText(frame, f"{label.item()} {score.item():.2f}", (x1, y1 - 10),
                            cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

        # Display the frame
        cv2.imshow("Real-Time Object Detection", frame)

        # Break loop on 'q' key press
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
finally:
    # Release the camera and close the window even if an error occurs
    cap.release()
    cv2.destroyAllWindows()
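The loop above labels each box with a numeric class ID. One way to show readable names is to look each ID up in a category list, as in the sketch below, which works on plain Python lists. The function `select_detections` is hypothetical; with torchvision 0.13 or later, the category list is assumed to be available as `FasterRCNN_ResNet50_FPN_Weights.DEFAULT.meta["categories"]`.

```python
from typing import List, Sequence, Tuple

# (class name, confidence, (x1, y1, x2, y2))
Detection = Tuple[str, float, Tuple[int, int, int, int]]


def select_detections(
    boxes: Sequence[Sequence[float]],
    labels: Sequence[int],
    scores: Sequence[float],
    categories: Sequence[str],
    threshold: float = 0.5,
) -> List[Detection]:
    """Keep detections scoring at least `threshold` and attach class names."""
    kept: List[Detection] = []
    for box, label, score in zip(boxes, labels, scores):
        if score < threshold:
            continue
        # Fall back to the raw ID when it is outside the category list.
        name = categories[label] if 0 <= label < len(categories) else str(label)
        x1, y1, x2, y2 = (int(round(v)) for v in box)
        kept.append((name, float(score), (x1, y1, x2, y2)))
    return kept
```

With model output `out = outputs[0]`, the inputs would be `out['boxes'].tolist()`, `out['labels'].tolist()`, and `out['scores'].tolist()`, and the returned name can replace `label.item()` in the `cv2.putText` call.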