
Sight 👁️

"Sight gives visually impaired users a pair of AI eyes — just say 'Hey Sight' and ask anything about the world in front of you."

Sight is a real-time voice-first accessibility companion for visually impaired users. Built for the Amazon Nova AI Hackathon.

Demo

Say "Hey Sight" to activate, then ask anything:

  • "What do you see?" — describes your surroundings
  • "Read this menu" — reads items, prices, and allergens aloud
  • "Any pasta without meat?" — filters by dietary preference
  • "Find Chinese food near me" — searches nearby restaurants
  • "How do I get to the pharmacy?" — gives directions by voice

Fully hands-free. No screen. No buttons. Just voice.

Tech Stack

| Component | Technology |
|---|---|
| Voice pipeline | Amazon Nova 2 Sonic via AWS Bedrock |
| Vision | Amazon Nova 2 Lite via AWS Bedrock |
| Agent routing | Strands SDK |
| Location search | AWS Location Service |
| Orchestration | Pipecat |
| WebRTC transport | Daily.co |
| Wake word | Porcupine (Picovoice) |
| Backend | Python + FastAPI |
| Frontend | Mobile browser (HTML/JS) |

Architecture

Phone mic → Pipecat → Nova Sonic → handle_query
                                        ↓
                                 Strands Agent (Nova 2 Lite)
                                        ↓
                    ┌───────────────────┼───────────────────┐
               describe_scene      find_places        get_directions
               (Nova 2 Lite)    (AWS Location)      (AWS Location)
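The routing step in the diagram can be sketched as a minimal dispatcher. This is illustrative only: in the real app the Strands agent selects tools model-side via Nova 2 Lite, whereas the keyword matching below is a hypothetical stand-in; the tool names mirror the diagram, and the stubs return labels instead of calling AWS.

```python
# Sketch of handle_query's tool routing. Keyword matching here is a
# hypothetical stand-in for the Strands agent's model-driven selection.

def describe_scene(query: str) -> str:
    return "describe_scene"   # would send the latest camera frame to Nova 2 Lite

def find_places(query: str) -> str:
    return "find_places"      # would call AWS Location Service place search

def get_directions(query: str) -> str:
    return "get_directions"   # would call AWS Location Service routing

def handle_query(query: str) -> str:
    """Route a transcribed voice query to one of the three tools."""
    q = query.lower()
    if any(w in q for w in ("direction", "get to", "route")):
        return get_directions(q)
    if any(w in q for w in ("find", "near me", "nearby")):
        return find_places(q)
    return describe_scene(q)  # default: describe what the camera sees

print(handle_query("How do I get to the pharmacy?"))  # → get_directions
print(handle_query("Find Chinese food near me"))      # → find_places
print(handle_query("What do you see?"))               # → describe_scene
```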

Project Structure

nova1/
├── voice.py          # Pipecat pipeline + Nova Sonic
├── sight.py          # Strands agent + 5 tools
├── location_search.py # AWS Location Service wrapper
├── camera_store.py   # Thread-safe mobile camera frame store
├── wake_word.py      # Porcupine wake word detection
├── static/
│   └── index.html    # Mobile browser client
└── .env              # AWS credentials (not committed)

Setup

Prerequisites

  • Python 3.11+
  • AWS account with Bedrock access (Nova 2 Sonic + Nova 2 Lite)
  • Daily.co API key
  • Picovoice access key

Installation

# Clone the repo
git clone https://github.com/advay3011/sight-app.git
cd sight-app

# Create virtual environment
python -m venv .venv
source .venv/bin/activate

# Install dependencies
pip install "pipecat-ai[aws,daily,silero]" strands-agents pvporcupine pvrecorder opencv-python python-dotenv loguru boto3 fastapi

Environment variables

Create a .env file:

AWS_ACCESS_KEY_ID=your_key
AWS_SECRET_ACCESS_KEY=your_secret
AWS_REGION=us-east-1
DAILY_API_KEY=your_daily_key
PICOVOICE_ACCESS_KEY=your_picovoice_key

Run

# Terminal 1 — start backend
python voice.py

# Terminal 2 — start ngrok tunnel
ngrok http 7860

# Open on phone
# Go to https://your-ngrok-url.ngrok.io/sight
# Say "Hey Sight" to activate

Hackathon Categories

  • Voice AI — Nova 2 Sonic speech-to-speech pipeline
  • Multimodal Understanding — Nova 2 Lite vision analysis

Built for

Amazon Nova AI Hackathon 2026 — amazon-nova.devpost.com
