Project-Cortex is a low-cost (<$150), high-impact AI wearable designed to assist visually impaired individuals by providing real-time scene understanding, object detection, and audio navigation. Built for the Young Innovators Awards (YIA) 2026 competition.
We aim to democratize assistive technology by disrupting the $4,000+ premium device market (OrCam, eSight) using commodity hardware and a novel "Hybrid AI" architecture.
- Compute: Raspberry Pi 5 (4GB/8GB RAM)
- Vision: IMX415 8MP Low-Light Camera (MIPI CSI-2)
- Power: 30,000mAh USB-C PD Power Bank
- Cooling: Official RPi 5 Active Cooler
- Audio: USB Audio Interface + Bone Conduction Headphones
- Connectivity: Mobile Hotspot (no dedicated SIM module)
- Purpose: Instant safety-critical object detection
- Model: YOLOv8n / TensorFlow Lite
- Latency: <100ms
- Power: 8-12W during inference
- Location: `src/layer1_reflex/` (see the sketch below)
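A minimal sketch of the Layer 1 loop, assuming the `ultralytics` package; the `SAFETY_CRITICAL` set and the `detect` helper are illustrative stand-ins, not the module's actual API:

```python
# Illustrative Layer 1 (reflex) sketch: local YOLOv8n inference, surfacing
# only safety-critical classes. SAFETY_CRITICAL is an assumed example set.
from ultralytics import YOLO

SAFETY_CRITICAL = {"person", "car", "bicycle", "dog", "traffic light"}

model = YOLO("yolov8n.pt")  # nano variant keeps CPU latency low on the Pi 5

def detect(frame):
    """Return (class_name, confidence, xyxy_bbox) for safety-critical objects."""
    result = model.predict(frame, imgsz=640, conf=0.5, verbose=False)[0]
    hits = []
    for box in result.boxes:
        name = model.names[int(box.cls)]
        if name in SAFETY_CRITICAL:
            hits.append((name, float(box.conf), box.xyxy[0].tolist()))
    return hits
```

On a Pi 5, staying under the 100 ms target generally means the nano model and a small input size; larger variants trade latency for accuracy.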
- Purpose: Complex scene analysis, OCR, natural language descriptions
- Model: Google Gemini 1.5 Flash (via API)
- Fallback: OpenAI GPT-4 Vision
- Latency: ~1-3s (network dependent)
- Location: `src/layer2_thinker/` (see the sketch below)
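A minimal sketch of a Layer 2 request, assuming the `google-generativeai` client and a `GEMINI_API_KEY` environment variable; the prompt and helper name are illustrative:

```python
# Illustrative Layer 2 (thinker) sketch: send a captured frame to
# Gemini 1.5 Flash for a scene description. Not the module's actual code.
import os

import google.generativeai as genai
from PIL import Image

genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")

def describe_scene(image_path: str) -> str:
    """Return a short natural-language description of the scene."""
    prompt = ("Describe this scene for a visually impaired pedestrian: "
              "obstacles, signage text, and a safe walking direction.")
    response = model.generate_content([prompt, Image.open(image_path)])
    return response.text
```

The GPT-4 Vision fallback would sit behind the same interface, so callers never need to know which backend answered.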
- Features: GPS navigation, 3D spatial audio, caregiver dashboard
- Tech Stack: FastAPI (backend), React (dashboard), PyOpenAL (audio)
- Location: `src/layer3_guide/` (see the sketch below)
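A hypothetical sketch of a caregiver-dashboard endpoint on the FastAPI side; the route, payload fields, and values are assumptions for illustration:

```python
# Hypothetical Layer 3 (guide) endpoint: expose device telemetry that the
# React dashboard could poll. Field names and values are placeholders.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="Project-Cortex Guide API")

class DeviceStatus(BaseModel):
    battery_pct: float
    gps: tuple[float, float]  # (latitude, longitude)
    last_detection: str

@app.get("/status", response_model=DeviceStatus)
def status() -> DeviceStatus:
    # A real implementation would read live telemetry from the device.
    return DeviceStatus(
        battery_pct=76.0,
        gps=(1.3521, 103.8198),
        last_detection="person, 2.1 m ahead",
    )
```

Run locally with `uvicorn status_api:app --reload` (module name assumed).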
```
ProjectCortex/
├── Version_1/           # Archived ESP32-CAM implementation
│   ├── Docs/            # v1.0 technical retrospective
│   └── Code/            # v1.0 Python/Arduino code
├── models/              # Shared AI models (YOLO variants)
├── TTS Model/           # Piper TTS model files
├── src/                 # Version 2.0 source code
│   ├── layer1_reflex/   # Local object detection module
│   ├── layer2_thinker/  # Cloud AI integration module
│   ├── layer3_guide/    # Navigation & UI module
│   └── main.py          # Application entry point
├── config/              # Configuration files (.yaml, .json)
├── tests/               # Unit and integration tests
├── docs/                # Technical documentation
├── utils/               # Helper scripts and tools
├── .env.example         # Environment variables template
├── requirements.txt     # Python dependencies
└── README.md            # This file
```
- Raspberry Pi 5 (4GB+ RAM) with Raspberry Pi OS (64-bit)
- IMX415 Camera Module (connected via CSI port)
- Python 3.11+
- Active internet connection (for Layer 2)
- Clone the repository:

  ```bash
  git clone https://github.com/IRSPlays/ProjectCortex.git
  cd ProjectCortex
  ```

- Set up Python environment:

  ```bash
  python3.11 -m venv venv
  source venv/bin/activate
  pip install --upgrade pip
  pip install -r requirements.txt
  ```

- Configure environment variables:

  ```bash
  cp .env.example .env
  nano .env  # Add your API keys (Gemini, Murf AI, etc.)
  ```

- Test camera module:

  ```bash
  libcamera-hello --camera 0  # Should display a camera preview
  ```

- Run the application:

  ```bash
  python src/main.py
  ```
Add to `/boot/firmware/config.txt`:

```ini
usb_max_current_enable=1
dtoverlay=imx415
```

Configure in `config/camera.yaml`:

```yaml
resolution: [1920, 1080]
framerate: 30
format: RGB888
```

Edit `config/models.yaml`:

```yaml
layer1:
  model: "models/yolo11s.pt"
  device: "cpu"  # The Pi 5 has no CUDA GPU; a Coral TPU uses the Edge TPU delegate, not "cuda"
  confidence: 0.5
```

Project-Cortex features a binaural 3D spatial audio system that helps visually impaired users navigate their environment using audio cues. This system converts YOLO object detections into positioned audio sources, creating an "audio map" of the surroundings.
| Feature | Description |
|---|---|
| Audio Beacons | Continuous directional sounds that guide users to targets (e.g., "lead me to the door") |
| Proximity Alerts | Distance-based warnings that intensify as obstacles approach (see the sketch after this table) |
| Object Tracking | Real-time 3D audio sources for each detected object |
| Distance Estimation | Calculate real-world distance using known object sizes |
| Object-Specific Sounds | Distinct audio signatures for different object classes (car vs person vs chair) |
| HRTF Support | Head-Related Transfer Function for realistic binaural audio on headphones |
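As referenced in the table above, a minimal sketch of a proximity-alert policy; the thresholds are made-up examples of the kind of values `config/spatial_audio.yaml` would hold, not the actual `ProximityAlertSystem` logic:

```python
# Illustrative proximity-alert policy: nearer obstacles ping faster and
# louder. Threshold values are assumptions, not the project's defaults.
THRESHOLDS = [         # (max_distance_m, ping_interval_s, volume 0..1)
    (1.0, 0.15, 1.0),  # imminent: rapid and loud
    (2.5, 0.40, 0.7),  # near: moderate
    (5.0, 1.00, 0.4),  # noticeable: slow and quiet
]

def alert_params(distance_m: float):
    """Return (ping_interval_s, volume) for an obstacle, or None if too far."""
    for max_dist, interval, volume in THRESHOLDS:
        if distance_m <= max_dist:
            return interval, volume
    return None  # beyond alert range: stay silent

print(alert_params(0.8))  # -> (0.15, 1.0)
```

The full detection-to-audio pipeline is diagrammed below.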
```
YOLO Detection → Position Calculator → OpenAL 3D Audio → Headphones
      │                  │                    │
      ▼                  ▼                    ▼
   [bbox]    →    [x, y, z coords]   →  [Binaural audio]
```
Position Mapping Algorithm (see the sketch below):
- X-axis (Left/Right): Bbox horizontal center → audio pan
- Y-axis (Up/Down): Bbox vertical center → audio elevation
- Z-axis (Distance): Bbox size → audio volume/distance
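A minimal sketch of that mapping, assuming a 1920×1080 frame and bbox area as a crude depth proxy; the function name and scaling constants are illustrative, not the actual `PositionCalculator`:

```python
# Illustrative bbox -> (x, y, z) mapping for an OpenAL-style listener at
# the origin facing -z. Frame size and scaling constants are assumptions.
FRAME_W, FRAME_H = 1920, 1080

def bbox_to_3d(x1: float, y1: float, x2: float, y2: float,
               max_depth: float = 10.0) -> tuple[float, float, float]:
    cx = (x1 + x2) / 2 / FRAME_W    # 0..1 across the frame
    cy = (y1 + y2) / 2 / FRAME_H    # 0..1 down the frame
    area = (x2 - x1) * (y2 - y1) / (FRAME_W * FRAME_H)

    x = (cx - 0.5) * 2.0            # -1 (hard left) .. +1 (hard right)
    y = (0.5 - cy) * 2.0            # -1 (low) .. +1 (high)
    z = -max_depth * (1.0 - area)   # larger bbox -> closer to the listener
    return (x, y, z)

print(bbox_to_3d(100, 200, 300, 600))  # the chair from the usage example below
```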
Usage example:

```python
from src.layer3_guide.spatial_audio import SpatialAudioManager, Detection

# Initialize spatial audio
audio = SpatialAudioManager()
audio.start()

# Update with YOLO detections
detections = [
    Detection("chair_1", "chair", 0.92, (100, 200, 300, 600)),
    Detection("person_1", "person", 0.87, (1400, 100, 1800, 900)),
]
audio.update_detections(detections)

# Start navigation beacon to guide user
audio.start_beacon("chair")  # "Follow the sound to the chair"

# Stop when done
audio.stop()
```

Edit `config/spatial_audio.yaml` to customize:
- Distance thresholds for proximity alerts
- Object-specific sound mappings
- Ping rates and volumes for beacons
- Known object sizes for distance estimation (see the sketch below)
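For the known-size distance estimation, a minimal pinhole-camera sketch; the focal length and object widths are placeholder values, not the ones shipped in `config/spatial_audio.yaml`:

```python
# Illustrative known-size distance estimate:
#   distance = real_width * focal_length / pixel_width
# Both constants below are assumed placeholders, not calibrated values.
FOCAL_LENGTH_PX = 1400.0  # calibrate for the actual IMX415 setup
KNOWN_WIDTHS_M = {"person": 0.5, "car": 1.8, "chair": 0.45}

def estimate_distance_m(class_name: str, bbox_width_px: float) -> float | None:
    real_width = KNOWN_WIDTHS_M.get(class_name)
    if real_width is None or bbox_width_px <= 0:
        return None  # unknown class or degenerate bbox
    return real_width * FOCAL_LENGTH_PX / bbox_width_px

print(estimate_distance_m("person", 350))  # -> 2.0 (metres)
```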
| Module | File | Purpose |
|---|---|---|
| `SpatialAudioManager` | `manager.py` | Central orchestrator for all spatial audio |
| `PositionCalculator` | `position_calculator.py` | YOLO bbox → 3D coordinates |
| `AudioBeacon` | `audio_beacon.py` | Navigation guidance pings |
| `ProximityAlertSystem` | `proximity_alert.py` | Distance-based warnings |
| `ObjectSoundMapper` | `object_sounds.py` | Object class → sound mapping |
| `ObjectTracker` | `object_tracker.py` | Multi-object audio management |
Install the audio dependencies:

```bash
pip install PyOpenAL numpy PyYAML
```

Linux/Raspberry Pi:

```bash
sudo apt-get install libopenal-dev libopenal1
```

Full documentation: `docs/SPATIAL_AUDIO_IMPLEMENTATION.md`
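To sanity-check the OpenAL stack after installing, a minimal PyOpenAL playback sketch; `ping.wav` is a placeholder mono asset, not a file shipped with the project:

```python
# Minimal PyOpenAL smoke test: play one positioned sound to the listener's
# right, using PyOpenAL's high-level oalOpen helper.
import time

from openal import AL_PLAYING, oalOpen, oalQuit

source = oalOpen("ping.wav")           # mono WAV files position correctly
source.set_position((2.0, 0.0, -1.0))  # +x is to the listener's right
source.play()
while source.get_state() == AL_PLAYING:
    time.sleep(0.05)                   # wait for playback to finish
oalQuit()                              # release the OpenAL device
```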
Run unit tests:

```bash
pytest tests/ -v
```

Run integration tests (requires hardware):

```bash
pytest tests/integration/ --hardware
```

| Metric | Target | Current Status |
|---|---|---|
| Layer 1 Latency | <100ms | TBD |
| Layer 2 Latency | <3s | TBD |
| Power Consumption | <20W avg | TBD |
| Battery Life | 6-8 hours | TBD |
| Object Detection Accuracy | >85% mAP | TBD |
- Repository restructure
- Camera integration with libcamera
- Layer 1 YOLO inference pipeline
- Layer 2 Gemini API integration
- Audio subsystem (TTS + STT)
- GPS navigation module
- 3D spatial audio engine ✅ IMPLEMENTED
- Caregiver web dashboard
- Power optimization
- User testing & feedback
- Documentation for judges
- Prototype enclosure design
- Demonstration video
- Bill of Materials (BOM) - Complete parts list with costs
- Architecture Deep Dive - Technical design decisions
- API Reference - Code documentation
- v1.0 Retrospective - Lessons learned from ESP32-CAM
This is a competition prototype developed by Haziq (@IRSPlays). For questions or collaboration inquiries, please open an issue.
This project is licensed under the MIT License - see the LICENSE file for details.
- YIA 2026 Organizers - For the opportunity to innovate
- Raspberry Pi Foundation - For affordable, powerful compute
- Ultralytics - For accessible YOLO implementations
- Google Gemini Team - For multimodal AI API access
Project Lead: Haziq
GitHub: @IRSPlays
Repository: ProjectCortex
Built with ❤️ for accessibility. Engineered with 🔥 for excellence.