This project was developed as a mini-project for the Sensors course to assist visually impaired individuals using real-time object detection and voice feedback (Text-to-Speech). The system captures live video from the device's webcam or camera module, detects nearby objects using YOLOv8, and announces their names and positions (left, center, right) through a Text-to-Speech (TTS) engine.
It demonstrates the integration of computer vision, speech synthesis, and real-time sensor-inspired systems, showing how technology can enhance accessibility and independence for the blind.
Additionally, an ultrasonic sensor measures the distance to obstacles, and a vibration motor alerts the user through tactile feedback when objects are nearby.
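The distance-alert path runs alongside the camera pipeline. The repository listing below contains no sensor code, so the following is only a hypothetical Raspberry Pi sketch, assuming an HC-SR04-style ultrasonic sensor and made-up GPIO pin numbers:

```python
import time
import RPi.GPIO as GPIO

TRIG, ECHO, MOTOR = 23, 24, 18  # hypothetical BCM pin assignments

GPIO.setmode(GPIO.BCM)
GPIO.setup(TRIG, GPIO.OUT)
GPIO.setup(ECHO, GPIO.IN)
GPIO.setup(MOTOR, GPIO.OUT)

def distance_cm():
    # A 10-microsecond pulse on TRIG starts one measurement
    GPIO.output(TRIG, True)
    time.sleep(0.00001)
    GPIO.output(TRIG, False)
    # ECHO stays high for the sound pulse's round-trip time
    start = time.time()
    while GPIO.input(ECHO) == 0:
        start = time.time()
    end = start
    while GPIO.input(ECHO) == 1:
        end = time.time()
    # Speed of sound is roughly 34300 cm/s; halve for the round trip
    return (end - start) * 34300 / 2

try:
    while True:
        # Drive the vibration motor when an obstacle is within ~50 cm (assumed threshold)
        GPIO.output(MOTOR, distance_cm() < 50)
        time.sleep(0.1)
finally:
    GPIO.cleanup()
```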
- Real-time object detection with YOLOv8
- Voice output using pyttsx3 (offline)
- Lightweight custom tracking based on IoU (Intersection over Union); see the sketch after this list
- Announces each object's name and direction (e.g., "person at left")
- Multi-threaded Text-to-Speech system for smooth audio without lag
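The custom tracker compares each new detection's bounding box against boxes it has already announced: a high enough overlap means "same object, stay silent." A minimal IoU helper, assuming (x1, y1, x2, y2) pixel boxes; the matching threshold and bookkeeping in Object_det.py are not reproduced here.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    # Corners of the overlapping rectangle (empty if the boxes are disjoint)
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# iou((0, 0, 10, 10), (5, 5, 15, 15)) == 25 / 175 ≈ 0.143
```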
```
OBJECT-DETECTION/
│
├── main.py            # Main script (camera, detection, and logic)
├── Object_det.py      # ObjectTracker class (IOU-based tracking)
├── tts.py             # Threaded Text-to-Speech assistant
├── requirements.txt   # Dependencies
├── .gitignore
├── README.md
│
└── yolov8m.pt         # YOLOv8 model weights
```
- Python 3.8+
- A working webcam or camera module
- (Optional) CUDA-compatible GPU for faster inference
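PyTorch picks up the GPU automatically when CUDA is available; a quick check (standard torch API, nothing project-specific):

```python
import torch

# True means YOLOv8 inference can run on the GPU; False falls back to CPU
print(torch.cuda.is_available())
```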
1️⃣ Clone the repository
```bash
git clone https://github.com/<your-username>/OBJECT-DETECTION.git
cd OBJECT-DETECTION
```
2️⃣ Create and activate a virtual environment (optional but recommended)
```bash
python -m venv .venv
# Windows
.venv\Scripts\activate
# Linux/Mac
source .venv/bin/activate
```
3️⃣ Install dependencies
Create a file named requirements.txt and add:
```
ultralytics
opencv-python
cvzone
pyttsx3
torch
```
Then run:
```bash
pip install -r requirements.txt
```
4️⃣ Add the YOLO model weights
Ensure that yolov8m.pt is present in the project root (as shown in the structure above). If the file is missing, the ultralytics package downloads the weights automatically the first time `YOLO("yolov8m.pt")` is loaded.
Run the main file:
```bash
python main.py
```
The system will start capturing live video and announcing detected objects.
Press `q` at any time to exit safely.
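Under the hood, main.py boils down to a detect-locate-announce loop. The sketch below is an illustration, not the actual script: the thirds-based left/center/right split and the plain print in place of the tracker and voice assistant are assumptions.

```python
import cv2
from ultralytics import YOLO

model = YOLO("yolov8m.pt")
cap = cv2.VideoCapture(0)

def position(cx, width):
    # Split the frame into thirds to label left / center / right
    if cx < width / 3:
        return "left"
    if cx > 2 * width / 3:
        return "right"
    return "center"

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    result = model(frame, verbose=False)[0]
    for box in result.boxes:
        x1, y1, x2, y2 = box.xyxy[0].tolist()
        name = model.names[int(box.cls)]
        # The real script routes this through ObjectTracker and VoiceAssistant
        print(f"{name} at {position((x1 + x2) / 2, frame.shape[1])}")
    cv2.imshow("Object Detection", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```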
| Step | Component | Function |
|---|---|---|
| 1️⃣ | YOLOv8 (Object Detection) | Detects objects in each video frame |
| 2️⃣ | ObjectTracker (Object_det.py) | Tracks previously seen objects to prevent duplicate announcements |
| 3️⃣ | VoiceAssistant (tts.py) | Announces detected objects asynchronously using text-to-speech |
| 4️⃣ | Position Detection | Determines if an object is on the left, center, or right of the frame |
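Step 3 is what keeps the audio lag-free: speech requests go into a queue consumed by a background thread, so pyttsx3's blocking runAndWait() never stalls the detection loop. A minimal sketch of that pattern (the real tts.py may be organized differently):

```python
import queue
import threading

import pyttsx3

class VoiceAssistant:
    """Speaks queued phrases on a background thread so detection never stalls."""

    def __init__(self):
        self._phrases = queue.Queue()
        threading.Thread(target=self._worker, daemon=True).start()

    def say(self, text):
        # Returns immediately; the worker thread does the actual speaking
        self._phrases.put(text)

    def _worker(self):
        engine = pyttsx3.init()  # initialize inside the thread that uses it
        while True:
            engine.say(self._phrases.get())
            engine.runAndWait()

# assistant = VoiceAssistant()
# assistant.say("person at left")
```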
🔹 Integrate Google TTS or Coqui TTS for a more natural voice
🔹 Implement object tracking using DeepSORT
🔹 Port to Raspberry Pi for portable embedded use
This project is released under the MIT License. Feel free to use, modify, and share for educational purposes.