Workflow-Aware Screen Reader Companion for Predictive UI Intent Analysis
We focus on awareness before action. Instead of just reading labels like "Submit," our tool explains what the action really does, such as subscribing to emails or sending bank info, before the user clicks.
Blindly is a Chrome extension that predicts the consequences of UI actions before users interact with them. It combines local DOM analysis, workflow context detection, and Gemini AI reasoning to provide clear explanations via accessible visual overlays and natural ElevenLabs voice output.
Screen reader users face uncertainty navigating web forms and interactive elements. They often don't know if clicking a button will:
- Charge their credit card
- Delete their account permanently
- Submit official government forms
- Initiate irreversible financial transactions (like Interac e-Transfers)
Blindly analyzes UI elements in real-time using a multi-layered approach:
- Local Signal Detection (Fast, deterministic)
  - DOM patterns and form structure analysis
  - Keyword matching for payments, risks, and workflows
  - Canadian-specific patterns (Interac, government domains)
- Workflow Context Analysis
  - Stepper UI detection (Step X of Y)
  - Progress indicators and breadcrumbs
  - Multi-page flow understanding
- Gemini AI Reasoning (Complex cases)
  - Converts signals into human explanations
  - Structured JSON responses with risk levels
  - Handles ambiguous or novel UI patterns
- Accessible Output
  - High-contrast visual overlays (WCAG AAA)
  - ElevenLabs natural voice synthesis
  - Keyboard navigation and screen reader compatibility
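The "structured JSON responses with risk levels" step can be sketched as a small validator. This is a minimal illustration, not the extension's actual code: the response shape (`risk`, `explanation`) and the function name `parseIntentResponse` are assumptions made for the example.

```javascript
// Hypothetical response shape (an assumption for this sketch):
//   { "risk": "LOW" | "MEDIUM" | "HIGH", "explanation": "..." }
const RISK_LEVELS = ["LOW", "MEDIUM", "HIGH"];

// Parse and validate a raw model reply. Returns a normalized object,
// or null when the reply is unusable so the caller can fall back.
function parseIntentResponse(rawText) {
  let data;
  try {
    data = JSON.parse(rawText);
  } catch (err) {
    return null; // not valid JSON
  }
  if (data === null || typeof data !== "object") return null;

  const risk = String(data.risk || "").toUpperCase();
  if (!RISK_LEVELS.includes(risk)) return null;

  if (typeof data.explanation !== "string" || data.explanation.trim() === "") {
    return null;
  }
  return { risk, explanation: data.explanation.trim() };
}
```

Returning `null` instead of throwing keeps the caller simple: any unusable reply is treated the same way as a failed request.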
- Event Trigger: Focus/hover on interactive element (300ms delay)
- Context Extraction: DOM analysis, form detection, signal gathering
- Local Analysis: Fast deterministic checks for high-confidence cases
- AI Analysis: Gemini API (direct) → Backend (fallback) → Local (final fallback)
- Response Processing: JSON validation, risk level assignment
- Output: Visual overlay + TTS (ElevenLabs → Chrome TTS fallback)
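The layered fallback in this flow can be sketched as an ordered list of analyzers that are tried one after another. The function `analyzeWithFallback` and the `{ name, run }` analyzer shape are hypothetical names chosen for this example; the real extension may structure this differently.

```javascript
// Try each analyzer in order; the first one that returns a truthy
// result wins. Errors and empty results move on to the next layer.
async function analyzeWithFallback(context, analyzers) {
  for (const { name, run } of analyzers) {
    try {
      const result = await run(context);
      if (result) return { source: name, ...result };
    } catch (err) {
      // This layer failed (network error, bad response, ...): try the next.
    }
  }
  // Every layer failed or returned nothing.
  return { source: "none", risk: "UNKNOWN", explanation: "No analysis available." };
}

// Example wiring (all three analyzers are stand-ins):
// analyzeWithFallback(ctx, [
//   { name: "gemini",  run: callGeminiDirect },
//   { name: "backend", run: callBackend },
//   { name: "local",   run: runLocalAnalysis },
// ]);
```

The same pattern applies to voice output, with ElevenLabs as the first entry and Chrome TTS as the second.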
- Load Extension in Chrome
    # Navigate to extension folder
    cd intent-reader/extension
    # Open Chrome and go to chrome://extensions/
    # Enable "Developer mode"
    # Click "Load unpacked" and select the extension folder
- Configure API Key (Required for AI features)
  - Create extension/env.json with your OpenRouter API key:
    { "OPENROUTER_API_KEY": "your_openrouter_api_key_here" }
- Configure Settings
  - Click the Blindly icon in Chrome toolbar
  - Toggle Canada Mode (recommended for Canadian sites)
  - Enable Privacy Mode (redacts sensitive data - default: ON)
  - Enable Auto-Read Intent (automatic voice feedback - default: ON)
- Test with Demo Pages
  - Open demo-pages/checkout.html for payment detection
  - Open demo-pages/delete-account.html for irreversibility warnings
  - Open demo-pages/gov-form.html for Canada Mode (ensure it's enabled)
- Ctrl+Shift+I: Read intent of currently focused element
- Ctrl+Shift+S: Summarize form before submission
- Ctrl+Shift+V: Toggle voice companion (speech-to-text form filling)
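One way to route these shortcuts is a small matcher over keyboard events. This is only a sketch: the action names and the `matchShortcut` helper are hypothetical, and the extension may instead register its shortcuts through the manifest.

```javascript
// Map of final key -> action name (action names are assumptions).
const SHORTCUTS = new Map([
  ["i", "readIntent"],
  ["s", "summarizeForm"],
  ["v", "toggleVoice"],
]);

// Accepts any object with KeyboardEvent-like fields and returns the
// matching action name, or null when the combination is not ours.
function matchShortcut(event) {
  if (!event.ctrlKey || !event.shiftKey || event.altKey || event.metaKey) {
    return null;
  }
  const key = String(event.key || "").toLowerCase();
  return SHORTCUTS.get(key) ?? null;
}

// In a content script this could be wired up roughly as:
// document.addEventListener("keydown", (e) => {
//   const action = matchShortcut(e);
//   if (action) { e.preventDefault(); /* dispatch action */ }
// });
```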
- Keywords: pay, checkout, total, tax, billing, charge, purchase
- Providers: Stripe, PayPal, Interac, Shop Pay, Apple Pay
- Field Patterns: 16-digit card numbers, CVV fields, expiry dates
- Result: HIGH risk + clear warning
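A minimal sketch of the keyword and provider matching for payments follows. The word lists come from the bullets above; the function name `detectPaymentSignals`, the prefix-matching rule, and the "any hit means HIGH" threshold are assumptions for illustration.

```javascript
const PAYMENT_KEYWORDS = ["pay", "checkout", "total", "tax", "billing", "charge", "purchase"];
const PAYMENT_PROVIDERS = ["stripe", "paypal", "interac", "shop pay", "apple pay"];

// Scan visible text (button label, nearby headings, ...) for payment hints.
function detectPaymentSignals(text) {
  const lower = text.toLowerCase();
  // Match keywords at the start of a word so "payment" counts but "repay" does not.
  const keywords = PAYMENT_KEYWORDS.filter((k) => new RegExp(`\\b${k}`).test(lower));
  const providers = PAYMENT_PROVIDERS.filter((p) => lower.includes(p));
  const isPayment = keywords.length > 0 || providers.length > 0;
  return { isPayment, keywords, providers, risk: isPayment ? "HIGH" : null };
}
```

Because this check is plain string matching, it can run on every focus event without a network call.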
- Keywords: delete, permanent, cannot be undone, irreversible
- Context: account settings, danger zones
- Result: HIGH risk + consequence explanation
- Signals: <form> with action, submit buttons
- Analysis: Field count, required fields, consent checkboxes
- Result: MEDIUM risk + field summary
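The form analysis can be illustrated on plain field descriptors rather than live DOM nodes. The descriptor shape (`label`, `type`, `required`), the consent word list, and the name `summarizeForm` are all assumptions for this sketch.

```javascript
// Summarize a form from simple descriptors such as
//   { label: "Email", type: "email", required: true }
function summarizeForm(fields) {
  const required = fields.filter((f) => f.required);
  // Treat checkboxes whose label mentions agreement as consent boxes.
  const consent = fields.filter(
    (f) => f.type === "checkbox" && /agree|consent|terms|subscribe/i.test(f.label || "")
  );
  return {
    fieldCount: fields.length,
    requiredCount: required.length,
    consentLabels: consent.map((f) => f.label),
    risk: "MEDIUM",
    summary:
      `${fields.length} fields, ${required.length} required, ` +
      `${consent.length} consent checkbox(es)`,
  };
}
```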
- Signals: "Step X of Y", progress bars, breadcrumbs
- Prediction: Next step in flow (payment → review → confirmation)
- Result: LOW-MEDIUM risk + next page type
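Detecting a "Step X of Y" indicator can be sketched with a single regular expression. The function name `detectStepper` and the returned fields are hypothetical; progress bars and breadcrumbs would need separate DOM checks not shown here.

```javascript
// Find a "Step X of Y" marker in page text and report the position.
function detectStepper(text) {
  const match = /\bstep\s+(\d+)\s+of\s+(\d+)\b/i.exec(text);
  if (!match) return null;

  const current = Number(match[1]);
  const total = Number(match[2]);
  // Reject impossible values such as "Step 5 of 3".
  if (current < 1 || total < 1 || current > total) return null;

  return { current, total, isLastStep: current === total };
}
```

Knowing whether the user is on the last step helps decide if the next click is a simple "continue" or a final submission.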
- Interac Keywords: e-transfer, autodeposit, Interac Online
- Gov Domains: canada.ca, ontario.ca, cra-arc.gc.ca, serviceontario
- Postal Codes: A1A 1A1 pattern validation
- Result: Enhanced warnings for Canadian services
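The Canada Mode checks can be sketched as two small helpers. Both function names are hypothetical. The postal code check only validates the A1A 1A1 shape, not whether the code exists, and the domain list is taken from the bullets above.

```javascript
// Shape check for Canadian postal codes: letter-digit-letter,
// optional space or hyphen, digit-letter-digit.
function isCanadianPostalCode(value) {
  return /^[A-Z]\d[A-Z][ -]?\d[A-Z]\d$/i.test(value.trim());
}

const GOV_DOMAINS = ["canada.ca", "ontario.ca", "cra-arc.gc.ca"];

// True when the hostname is one of the listed domains or a subdomain
// of one, or mentions "serviceontario".
function isCanadianGovHost(hostname) {
  const host = hostname.toLowerCase();
  return (
    GOV_DOMAINS.some((d) => host === d || host.endsWith("." + d)) ||
    host.includes("serviceontario")
  );
}
```

Matching on `"." + d` rather than a bare suffix avoids treating a lookalike such as `notcanada.ca` as a government site.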
Before sending data to backend, we redact:
- Credit card numbers → [CARD]
- SSN/SIN → [SSN]
- Email addresses → [EMAIL]
- Phone numbers → [PHONE]
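A minimal sketch of this redaction step is shown below. The regular expressions are illustrative assumptions, not the extension's actual patterns, and they will miss some formats. The order matters: card numbers are replaced first so their digits are not mistaken for phone numbers.

```javascript
// Replace sensitive substrings with placeholder tokens.
function redactSensitive(text) {
  return text
    // Email addresses
    .replace(/[\w.+-]+@[\w-]+(?:\.[\w-]+)+/g, "[EMAIL]")
    // 16-digit card numbers, with optional spaces or hyphens
    .replace(/\b(?:\d[ -]?){15}\d\b/g, "[CARD]")
    // Canadian SIN (123-456-789) or US SSN (123-45-6789)
    .replace(/\b\d{3}[- ]\d{3}[- ]\d{3}\b|\b\d{3}-\d{2}-\d{4}\b/g, "[SSN]")
    // North American phone numbers, optional +1 prefix
    .replace(
      /(?<!\d)(?:\+?1[ .-]?)?\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}(?!\d)/g,
      "[PHONE]"
    );
}

console.log(
  redactSensitive("Card 4111 1111 1111 1111, SIN 123-456-789, mail a.b@example.com, call 416-555-0199")
);
// → Card [CARD], SIN [SSN], mail [EMAIL], call [PHONE]
```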
- Users can toggle Privacy Mode in settings
- Clear indication when data is sent to AI service
- Fallback to local analysis if backend unavailable
Only send to Gemini:
- Element tag, type, and redacted text
- Detected signals (boolean flags, not raw data)
- Form field labels (not values)
- Domain and URL (not credentials)
Never sent:
- Form input values
- Passwords
- Credit card details
- Personal information
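The data-minimization rules can be sketched as a payload builder that copies only the allowed fields. The input shapes and the name `buildAiPayload` are hypothetical. Dropping the query string and fragment from the URL is an extra precaution assumed for this example, since those parts often carry tokens.

```javascript
// Build the object sent to the AI service from an element description.
// `element.fields` may contain values, but only labels are copied out.
function buildAiPayload(element, signals, pageUrl) {
  const url = new URL(pageUrl);
  return {
    tag: element.tag,
    type: element.type ?? null,
    text: element.redactedText, // already passed through redaction
    // Coerce every signal to a boolean flag so no raw data leaks through.
    signals: Object.fromEntries(
      Object.entries(signals).map(([name, value]) => [name, Boolean(value)])
    ),
    fieldLabels: (element.fields || []).map((f) => f.label),
    domain: url.hostname,
    // origin excludes any user:password part; query and hash are dropped.
    url: url.origin + url.pathname,
  };
}
```

Building the payload from an explicit allowlist means a new property on the element description is not sent unless someone adds it here on purpose.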
- All features accessible via keyboard (no mouse required)
- Tab navigation through the overlay, plus keyboard shortcuts and Escape to close
- Concise spoken summaries (10-15 words) in a natural-sounding ElevenLabs voice
- Enhanced detection for Canadian services
- Automatic voice feedback on element focus/hover
- Canadian Payment and Interac e-Transfer detection
- Recognition of Canadian government domains, health keywords, and more
- Postal code validation
- Support for Canadian accessibility requirements
- Tested and specialized for Canadian government websites and forms
- Novel approach: predictive intent analysis vs. reactive screen reading
- Combines deterministic heuristics with AI reasoning
- Workflow-aware: understands multi-step processes
- Clean architecture: separation of extension and backend
- Efficient: local analysis for high-confidence cases
- Robust: fallbacks at every layer (ElevenLabs → Chrome TTS)
- Accessible by design (WCAG AAA compliance)
- Non-intrusive overlay
- Clear, actionable information
- Works without keyboard/mouse
- Privacy Mode with local redaction
- User control over all features
- Frontend: JavaScript (ES6+), Chrome Extension API (Manifest V3)
- AI Integration:
- Google Gemini 1.5 Flash (via OpenRouter API - primary)
- Google Gemini API (direct - backend fallback)
- Voice Synthesis: ElevenLabs TTS API + Chrome TTS (fallback)
- Backend: Python 3.11, FastAPI, Uvicorn (optional)
- Cloud: Google Cloud Run, Secret Manager
- Deployment: Docker, gcloud CLI
- Architecture: Multi-layer fallback system for reliability
- OCR for image-based buttons
- Multi-language support (French, Spanish)
- Browser extension for Firefox, Edge
- Mobile app (React Native)
- Improve localAnalysis and GeminiAnalysis for faster responses
- Custom companion training and personalization for user preferences
- Machine learning model for workflow prediction
- Integration with PDF files, Excel, etc.
Blindly - Making the web more predictable and accessible, one interaction at a time.