Smart note-taking app for students with dyslexia. Combines audio transcription, photo capture, and AI summaries.
- Audio transcription with speech-to-text
- Photo capture with AI descriptions
- Smart note grouping (2-minute intervals)
- AI summaries via Claude
- Text-to-speech for accessibility
- Search including photo content (`image: keyword`)
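The 2-minute note grouping listed above could be sketched as follows. This is a minimal illustration, not the project's actual implementation: it assumes fixed 2-minute windows measured from the Unix epoch, and the `TimedItem` type and `groupIntoNotes` function are hypothetical names introduced here.

```typescript
// Hypothetical item shape; the project's real types may differ.
interface TimedItem {
  kind: "transcription" | "photo";
  timestamp: number; // milliseconds since the Unix epoch
  text: string; // transcript text or AI-generated photo description
}

// The 2-minute interval named in the feature list.
const WINDOW_MS = 2 * 60 * 1000;

// Assumption: items are bucketed into fixed 2-minute windows,
// and each non-empty window becomes one note.
function groupIntoNotes(items: TimedItem[]): TimedItem[][] {
  const buckets = new Map<number, TimedItem[]>();
  for (const item of items) {
    const key = Math.floor(item.timestamp / WINDOW_MS);
    const bucket = buckets.get(key) ?? [];
    bucket.push(item);
    buckets.set(key, bucket);
  }
  // Windows in chronological order, items sorted by time within each window.
  return Array.from(buckets.entries())
    .sort(([a], [b]) => a - b)
    .map(([, group]) => group.sort((x, y) => x.timestamp - y.timestamp));
}
```

A gap-based rule (start a new note after 2 minutes of silence) would be an equally plausible reading of "2-minute intervals"; the fixed-window version is shown only because it is the simplest.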
React + TypeScript + Tailwind CSS
- Note interface with search and TTS
- Real-time transcription display
- Photo context integration
Node.js + Express + Claude API
- Photo upload and AI analysis
- Transcription storage
- Summary generation
MentraOS SDK integration
- Smart glasses photo capture
- Real-time transcription
- WebView interface
- Node.js 18+ and Bun runtime
- Anthropic Claude API key
- MentraOS API key (for Glass integration)
- Clone and install dependencies:
cd HackMIT
bun install
cd frontend && bun install
cd ../backend && npm install
cd ../Glass && bun install
- Environment Setup:
Create backend/.env:
ANTHROPIC_API_KEY=your_claude_api_key_here
Create Glass/.env:
PACKAGE_NAME=your_package_name
MENTRAOS_API_KEY=your_mentraos_api_key
PORT=3000
- Start the application:
# Start all services
bun run dev
# Or start individually
bun run dev:frontend # Frontend on http://localhost:5173
bun run dev:backend # Backend on http://localhost:3001
cd Glass && bun run dev # Glass on http://localhost:3000
- Auto-grouping: Transcriptions and photos are automatically grouped into notes based on 2-minute intervals
- AI Summaries: Claude generates concise summaries of transcription content
- Manual Notes: Users can create and edit manual notes alongside generated ones
- Note Management: Edit, delete, and organize both manual and generated notes
- Text Search: Search through note titles, content, and transcriptions
- Image Search: Use `image: keyword` to search through AI-generated photo descriptions
- Real-time Results: Debounced search with loading indicators
- Async Processing: Handles large datasets efficiently
- Text-to-Speech: All content can be read aloud with custom controls
- Visual Indicators: Clear UI feedback for TTS playback status
- Floating Controls: Consistent play/stop buttons positioned for easy access
- Progress Tracking: Visual progress bars for audio playback
- Multiple Sources: Support for manual uploads and Glass device captures
- AI Descriptions: Automatic image analysis and description generation
- Context Linking: Photos are linked to nearby transcriptions temporally
- Visual Context: Photos displayed alongside related transcript content
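One way to link a photo to a nearby transcription by time, as the Context Linking item describes, is a nearest-timestamp lookup. The sketch below is an assumption about how this might work: the `Photo` and `Transcription` shapes, the `nearestTranscription` function, and the 2-minute cutoff are all hypothetical and are not taken from the project's code.

```typescript
// Hypothetical shapes; the project's real types may differ.
interface Transcription {
  id: string;
  timestamp: number; // milliseconds since the Unix epoch
  text: string;
}

interface Photo {
  id: string;
  timestamp: number; // milliseconds since the Unix epoch
  description: string; // AI-generated description
}

// Return the transcription closest in time to the photo, or undefined
// when none lies within maxGapMs (assumed default: 2 minutes).
function nearestTranscription(
  photo: Photo,
  transcriptions: Transcription[],
  maxGapMs: number = 2 * 60 * 1000
): Transcription | undefined {
  let best: Transcription | undefined;
  let bestGap = Infinity;
  for (const t of transcriptions) {
    const gap = Math.abs(t.timestamp - photo.timestamp);
    if (gap <= maxGapMs && gap < bestGap) {
      best = t;
      bestGap = gap;
    }
  }
  return best;
}
```

The cutoff keeps a photo from being attached to a transcript recorded long before or after it; whether the real app applies such a limit is not stated here.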
| Endpoint | Method | Description |
|---|---|---|
| `/upload` | POST | Upload photos with AI description generation |
| `/photos` | GET | Retrieve uploaded photos list |
| `/glass-photos` | GET | Retrieve Glass device photos |
| `/transcription` | POST | Save transcription with AI analysis |
| `/transcriptions/:userId` | GET | Retrieve user transcriptions |
| `/summarize` | POST | Generate transcript summaries |
| `/chat` | POST | AI chat with note context |
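When calling these endpoints from the frontend, a small helper can fill in route parameters such as `:userId`. The following is a sketch only: `buildUrl` is a hypothetical helper, not part of the project, and the base URL is the development backend address from the setup steps. Request and response bodies are not documented here, so none are assumed.

```typescript
// Development backend address from the setup steps.
const BACKEND_URL = "http://localhost:3001";

// Hypothetical helper: replace ":name" path segments with URL-encoded values.
function buildUrl(route: string, params: Record<string, string> = {}): string {
  const path = route.replace(/:([A-Za-z]+)/g, (_match: string, name: string) => {
    if (!(name in params)) {
      throw new Error(`Missing route parameter: ${name}`);
    }
    return encodeURIComponent(params[name]);
  });
  return BACKEND_URL + path;
}
```

For example, `buildUrl("/transcriptions/:userId", { userId })` yields a URL that can be passed to `fetch` or Axios for the GET request in the table.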
| Endpoint | Method | Description |
|---|---|---|
| `/webview` | GET | Photo viewer interface |
| `/api/latest-photo` | GET | Latest photo metadata |
| `/api/photo/:requestId` | GET | Photo data by request ID |
HackMIT/
├── frontend/ # React frontend application
│   ├── src/
│   │   ├── components/ # Reusable UI components
│   │   ├── hooks/ # Custom React hooks
│   │   ├── pages/ # Route components
│   │   ├── utils/ # Utility functions
│   │   └── types/ # TypeScript definitions
│   └── package.json
├── backend/ # Node.js backend server
│   ├── uploads/ # Photo storage directory
│   ├── transcriptions/ # Transcription log files
│   ├── start.js # Main server file
│   ├── llm.js # Claude API integration
│   └── package.json
├── Glass/ # MentraOS integration
│   ├── src/
│   │   └── index.ts # Main Glass application
│   ├── photos/ # Glass photo storage
│   ├── views/ # EJS templates
│   └── package.json
└── package.json # Root package with dev scripts
- React 18: Modern React with hooks and concurrent features
- TypeScript: Type-safe development
- Vite: Fast build tool and dev server
- Tailwind CSS: Utility-first styling
- shadcn/ui: High-quality component library
- React Router: Client-side routing
- Lucide React: Icon library
- Express.js: Web application framework
- Multer: File upload handling
- Anthropic SDK: Claude AI integration
- CORS: Cross-origin resource sharing
- File System: Local storage management
- MentraOS SDK: Smart glasses platform integration
- WebSocket: Real-time communication
- EJS: Template engine for webviews
- Axios: HTTP client for API calls
- Responsive Design: Works on desktop and mobile devices
- Dark/Light Mode: Automatic theme detection
- Smooth Animations: Tailwind CSS animations
- Accessible Controls: ARIA labels and keyboard navigation
- Debounced Search: Efficient search with 300ms delay
- Loading States: Visual feedback during operations
- Error Handling: Graceful error messages and recovery
- Optimistic Updates: Immediate UI feedback
- Audio Capture: Glass device or web interface captures audio
- Transcription: Speech-to-text conversion with timestamp
- Photo Capture: Manual upload or Glass device photo
- AI Processing: Claude analyzes photos and generates descriptions
- Note Generation: System groups related content into smart notes
- Summarization: AI creates concise summaries of transcriptions
- Search Indexing: Content becomes searchable including photo descriptions
- User Interface: All content accessible through web interface with TTS
- Hooks: Custom React hooks for data management (`useNotes`, `useAsyncSummary`)
- Components: Modular UI components with TypeScript props
- Utils: Helper functions for data processing and formatting
- Types: Comprehensive TypeScript definitions
- `frontend/src/hooks/useNotes.ts`: Core note management logic
- `frontend/src/pages/NoteDetail.tsx`: Individual note view with TTS
- `backend/start.js`: Main server with all API endpoints
- `Glass/src/index.ts`: MentraOS integration and photo handling
- Notes are automatically generated from transcriptions every 2 minutes
- Manual notes can be created using the "New Note" button
- Both types support full editing and deletion
- Text search: Type any keyword to search titles and content
- Image search: Use `image: car` to find photos containing cars
- Results update in real time as you type
- Click play buttons next to any content for text-to-speech
- Visual progress indicators show playback status
- All controls are keyboard accessible
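The `image:` search prefix described above could be handled by a small query parser. This is a hedged sketch rather than the app's actual search code: `parseSearchQuery`, `searchPhotos`, and the `SearchablePhoto` type are hypothetical, and case-insensitive substring matching is an assumption.

```typescript
// Hypothetical query representation.
type SearchQuery =
  | { mode: "image"; keyword: string }
  | { mode: "text"; keyword: string };

// Split a raw search string into a mode and a lowercased keyword.
// Assumption: the "image:" prefix is matched case-insensitively.
function parseSearchQuery(raw: string): SearchQuery {
  const trimmed = raw.trim();
  const match = /^image:\s*(.*)$/i.exec(trimmed);
  if (match) {
    return { mode: "image", keyword: match[1].trim().toLowerCase() };
  }
  return { mode: "text", keyword: trimmed.toLowerCase() };
}

// Hypothetical photo shape with an AI-generated description.
interface SearchablePhoto {
  id: string;
  description: string;
}

// Return photos whose description contains the keyword (image queries only).
function searchPhotos(photos: SearchablePhoto[], query: SearchQuery): SearchablePhoto[] {
  if (query.mode !== "image" || query.keyword === "") {
    return [];
  }
  return photos.filter((p) => p.description.toLowerCase().includes(query.keyword));
}
```

With this sketch, `parseSearchQuery("image: car")` produces an image query for `car`, while any input without the prefix falls through to ordinary text search.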
This application combines audio transcription, photo capture, AI summaries, and text-to-speech to make note-taking more accessible for students with dyslexia.