AI-powered AAC (Augmentative and Alternative Communication) device for ALS patients — eye tracking + voice cloning + LLM context awareness to restore communication in their own voice.
- Full-screen tile-based communication board with 3-level category hierarchy (L1 → L2 → L3 → sentences)
- Eye-tracking dwell selection — leverages iOS eye tracking, triggering selection when the patient's gaze dwells on a tile
- AI-generated contextual sentences via Amazon Bedrock (Claude Haiku 4.5), tailored to the patient's intent path and conversation history
- Sentence regeneration with nudge prompts ("More casual", "Shorter", "More urgent", etc.)
- Yes/No and Quick Phrase boards for instant responses
- Keyboard fallback for manual text input (grouped letter tiles)
- Browser speech synthesis fallback when voice model is unavailable
- Offline fallback tile set when Bedrock is unreachable
- Conversation panel with caregiver speech-to-text (Web Speech API), auto-send after silence, and conversation history
- PWA support — installable on iPad/iPhone, full-screen, works offline for tier-1 responses
- Async synthesis via Lambda → XTTS v2 fine-tuned model → S3 presigned URL
- Client polls for audio readiness and plays cloned voice when ready
- Graceful fallback to browser TTS on timeout or model failure
- Amazon Cognito sign-in/sign-up (email + password) via AWS Amplify JS (aws-amplify ^6.17.0)
- Amplify configured with Cognito User Pool in
lib/amplify.ts - Server-side JWT decoding of Cognito sub claim in
lib/auth-server.ts - Cognito sub used as stable user ID for DynamoDB patient profile lookups
- Patient profile loaded on AAC page mount from DynamoDB
- Real
MediaRecorderrecording for voice sample collection - Per-sentence audio playback for review
- Samples uploaded to S3 under
patients/{id}/voice/
- Landing page with animated hero, problem/solution sections, tech architecture diagram, caregiver dashboard mockup
- 5-step onboarding flow with animated transitions
- Voice cloning in production — XTTS v2 runs on a SageMaker notebook (GPU instance) exposed via an ngrok static tunnel. The notebook must be manually started before each demo and the Flask server re-launched. A proper SageMaker inference endpoint or EC2 deployment was not completed in time.
- Onboarding → AAC handoff — the onboarding flow (voice banking, profile setup) does not automatically populate the AAC page with the patient's profile. Patient name and ID are loaded from Cognito + DynamoDB independently on the AAC page.
- Patient profile loading edge cases — if Cognito auth is slow or the DynamoDB read fails, the AAC page degrades silently (no patient name, anonymous patient ID used for API calls).
- ngrok tunnel management — no auto-restart or health check on the Flask/ngrok side; if the SageMaker notebook stops, voice synthesis silently fails until the notebook is restarted manually.
| Service | How It's Used |
|---|---|
| Amazon Bedrock (Claude Haiku 4.5) | Generates contextual AAC sentences based on patient intent path and conversation history |
| AWS Lambda | API backend — handles generate, regenerate, log_selection, append_exchange, synthesize_async, check_audio actions |
| Amazon API Gateway | REST endpoint fronting Lambda (/prod/generate) |
| Amazon DynamoDB | Stores patient profiles (kural-patients) and async audio job status (kural-audio-jobs) |
| Amazon S3 | Stores voice banking audio samples and synthesized audio output (voiceclone-audio-samples) |
| Amazon Cognito | User authentication — sign-up, sign-in, JWT tokens; Cognito sub used as stable patient ID |
| AWS Amplify JS | Client-side Cognito auth wrapper — sign-in/sign-up flows, token management |
| Amazon SageMaker | Hosts the XTTS v2 fine-tuned voice cloning model on a GPU notebook instance |
Built entirely from scratch. No boilerplate, starter templates, or third-party UI component libraries were used.
Open-source dependencies of note:
- Next.js 16 — React framework
- Framer Motion — animations
- XTTS v2 — open-source voice cloning model (Coqui TTS)
- AWS Amplify JS — Cognito auth client
- Lucide React — icons