Object-detection-tts

This project was developed as a mini-project for the Sensors course to assist visually impaired individuals using real-time object detection and voice feedback (Text-to-Speech). The system captures live video from the device's webcam or camera module, detects nearby objects using YOLOv8, and announces their names and positions (left, center, right) through a Text-to-Speech (TTS) engine.

It demonstrates the integration of computer vision, speech synthesis, and real-time sensor-inspired systems, showing how technology can enhance accessibility and independence for the blind.

Additionally, an ultrasonic sensor measures the distance to obstacles, and a vibration motor alerts the user through tactile feedback when objects are nearby.
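
For illustration, here is a minimal sketch of how the left/center/right decision could be made from a detection's bounding box. The thirds-based split and the function name are assumptions for this example, not code taken from main.py:

```python
# Hypothetical sketch: classify a detection as left/center/right by
# comparing the bounding-box centre against thirds of the frame width.
def position_of(x1: int, x2: int, frame_width: int) -> str:
    cx = (x1 + x2) / 2          # horizontal centre of the bounding box
    if cx < frame_width / 3:
        return "left"
    if cx > 2 * frame_width / 3:
        return "right"
    return "center"

# e.g. position_of(50, 150, 640) -> "left"
```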

Features

  • Real-time object detection with YOLOv8
  • Voice output using pyttsx3 (offline)
  • Lightweight custom tracking based on IOU (Intersection over Union); a minimal IOU sketch follows this list
  • Announces object name and direction (e.g., "person at left")
  • Multi-threaded Text-to-Speech system for smooth audio without lag
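
The IOU-based matching mentioned above could look like the sketch below. This is not the actual ObjectTracker code; it is a minimal illustration of the standard IOU computation such a tracker would rely on:

```python
# Hypothetical sketch of the IOU computation an ObjectTracker could use
# to match a new detection against previously seen boxes.
def iou(a, b):
    """a, b are (x1, y1, x2, y2) boxes; returns intersection over union."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# A tracker might announce a box only when iou(new, old) is below some
# threshold (say 0.5) for every previously announced box, i.e. it does
# not overlap anything already seen.
```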

Project Structure

OBJECT-DETECTION/
│
├── main.py             # Main script (camera, detection, and logic)
├── Object_det.py       # ObjectTracker class (IOU-based tracking)
├── tts.py              # Threaded Text-to-Speech assistant
├── requirements.txt    # Dependencies
├── .gitignore
├── README.md
│
└── yolov8m.pt          # YOLOv8 model weights

Requirements

  • Python 3.8+
  • A working webcam or camera module
  • (Optional) CUDA-compatible GPU for faster inference (a quick check follows this list)
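
To check whether PyTorch can see a CUDA GPU (Ultralytics will otherwise fall back to CPU inference):

```python
import torch

# True means YOLOv8 inference can run on the GPU; False means CPU only
print(torch.cuda.is_available())
```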

Installation

1️⃣ Clone the repository

git clone https://github.com/<your-username>/OBJECT-DETECTION.git
cd OBJECT-DETECTION

2️⃣ Create and activate a virtual environment (optional but recommended)

python -m venv .venv
# Windows
.venv\Scripts\activate
# Linux/Mac
source .venv/bin/activate

3️⃣ Install dependencies. The repository's requirements.txt lists:

ultralytics
opencv-python
cvzone
pyttsx3
torch

Then run:

pip install -r requirements.txt

4️⃣ Add the YOLO model weights. Ensure that yolov8m.pt is present in the project root (as shown above). Ultralytics can also download it automatically the first time the model name is loaded:

python -c "from ultralytics import YOLO; YOLO('yolov8m.pt')"

Running the Project

Run the main file:

python main.py

The system will start capturing live video and announcing detected objects. Press q anytime to exit safely.
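
main.py is not reproduced here, but a minimal sketch of such a capture-and-detect loop, with the same q-to-quit behaviour and using the standard Ultralytics and OpenCV APIs, looks like this:

```python
import cv2
from ultralytics import YOLO

model = YOLO("yolov8m.pt")
cap = cv2.VideoCapture(0)                   # default webcam

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = model(frame, verbose=False)   # run detection on the frame
    annotated = results[0].plot()           # draw boxes and labels
    cv2.imshow("Object Detection", annotated)
    if cv2.waitKey(1) & 0xFF == ord("q"):   # press q to exit safely
        break

cap.release()
cv2.destroyAllWindows()
```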

How It Works

| Step | Component | Function |
| --- | --- | --- |
| 1️⃣ | YOLOv8 (object detection) | Detects objects in each video frame |
| 2️⃣ | ObjectTracker (Object_det.py) | Tracks previously seen objects to prevent duplicate announcements |
| 3️⃣ | VoiceAssistant (tts.py) | Announces detected objects asynchronously using text-to-speech |
| 4️⃣ | Position Detection | Determines whether an object is on the left, center, or right of the frame |
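
The asynchronous announcement pattern in step 3 typically pairs a queue with a worker thread so the video loop never blocks on audio. The class and method names below are assumptions for this sketch, not the actual contents of tts.py:

```python
import queue
import threading

import pyttsx3


class VoiceAssistant:
    """Speaks queued phrases on a background thread so detection never blocks."""

    def __init__(self):
        self._queue = queue.Queue()
        threading.Thread(target=self._worker, daemon=True).start()

    def say(self, text: str) -> None:
        self._queue.put(text)        # returns immediately; no audio lag

    def _worker(self) -> None:
        engine = pyttsx3.init()      # offline TTS engine, owned by this thread
        while True:
            engine.say(self._queue.get())
            engine.runAndWait()      # blocks only the worker thread


# assistant = VoiceAssistant()
# assistant.say("person at left")
```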

Future Improvements

🔹 Integrate Google TTS or Coqui TTS for a more natural voice
🔹 Implement object tracking using DeepSORT
🔹 Port to Raspberry Pi for portable embedded use

📜 License

This project is released under the MIT License. Feel free to use, modify, and share for educational purposes.

About

Smart stick for the blind using object detection and Text-to-Speech.
