A Chrome/Firefox browser extension that automatically scrolls web pages at a constant speed and accepts voice commands powered by OpenAI's GPT for intelligent control.
- Auto-Scrolling: Smooth, constant-speed scrolling through web pages
- Voice Control: Natural language voice commands processed by OpenAI
- Adjustable Speed: Control scroll speed from 0.5x to 10x
- Smart Commands: Pause, resume, speed control, and navigation via voice
- Visual Feedback: On-screen status notifications for all actions
Once voice control is enabled, you can say:
- "Pause" or "Stop" - Pause scrolling
- "Resume" or "Continue" - Resume scrolling
- "Go faster" or "Speed up" - Increase scroll speed
- "Go slower" or "Slow down" - Decrease scroll speed
- "Go to [website]" - Navigate to a different website (e.g., "Go to github.com")
- "Scroll to top" - Jump to the top of the page
- "Scroll to bottom" - Jump to the bottom of the page
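The mapping from spoken commands to extension behavior can be pictured as a small state update. The sketch below is illustrative only: the action names (`pause`, `faster`, `navigate`, and so on), the `SPEED_STEP` increment, and the state shape are assumptions, not the actual contents of content.js. Only the 0.5x to 10x speed range comes from the feature list.

```javascript
// Hypothetical action handling; the real content.js may use different names.
const MIN_SPEED = 0.5;  // lower bound from the feature list
const MAX_SPEED = 10;   // upper bound from the feature list
const SPEED_STEP = 0.5; // assumed increment for "faster" / "slower"

// Keep the speed multiplier inside the supported range.
function clampSpeed(speed) {
  return Math.min(MAX_SPEED, Math.max(MIN_SPEED, speed));
}

// Turn a spoken target such as "github.com" into a full URL.
function normalizeUrl(target) {
  const withScheme = /^https?:\/\//i.test(target) ? target : `https://${target}`;
  return new URL(withScheme).href;
}

// Return a new state for the given action without mutating the input.
function applyAction(state, action) {
  switch (action.type) {
    case "pause":
      return { ...state, scrolling: false };
    case "resume":
      return { ...state, scrolling: true };
    case "faster":
      return { ...state, speed: clampSpeed(state.speed + SPEED_STEP) };
    case "slower":
      return { ...state, speed: clampSpeed(state.speed - SPEED_STEP) };
    case "navigate":
      return { ...state, pendingUrl: normalizeUrl(action.target) };
    case "scrollTop":
      return { ...state, jumpTo: "top" };
    case "scrollBottom":
      return { ...state, jumpTo: "bottom" };
    default:
      return state; // unknown commands leave the state unchanged
  }
}
```

Keeping the update pure makes it easy to test; the caller would then act on the returned state, for example by calling `window.scrollTo` or changing `window.location`.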
- Clone or download this repository
- Open Chrome and navigate to chrome://extensions/
- Enable "Developer mode" in the top right
- Click "Load unpacked" and select the AutoReader folder
- The extension icon should appear in your toolbar
- Clone or download this repository
- Open Firefox and navigate to about:debugging#/runtime/this-firefox
- Click "Load Temporary Add-on"
- Navigate to the AutoReader folder and select manifest.json
- Get an OpenAI API Key:
  - Visit OpenAI API Keys
  - Create a new API key
  - Copy the key (it starts with sk-)
- Configure the API Key:
  - Copy config.example.js to config.js
  - Open config.js and replace 'sk-your-api-key-here' with your actual API key
  - Save the file
  - IMPORTANT: Never commit config.js to version control (it's in .gitignore)
- Grant Microphone Permission:
  - When you first enable voice control, your browser will ask for microphone permission
  - Click "Allow" to enable voice commands
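As a rough illustration of the configuration step, a config.js file might look like the sketch below. The `CONFIG` object, the `OPENAI_API_KEY` field name, and the `hasUsableKey` helper are assumptions made for this example; check config.example.js for the names the extension actually expects.

```javascript
// config.js (hypothetical shape; see config.example.js for the real field names)
const CONFIG = {
  OPENAI_API_KEY: "sk-your-api-key-here", // replace with your own key
};

// Assumed helper: reject a missing key or the untouched placeholder.
function hasUsableKey(config) {
  const key = config.OPENAI_API_KEY;
  return (
    typeof key === "string" &&
    key.startsWith("sk-") &&
    key !== "sk-your-api-key-here"
  );
}
```

A check like this lets the extension show a clear "API key not configured" message instead of failing on the first voice command.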
- Navigate to any web page
- Click the AutoReader extension icon
- Click "Start Scrolling" to begin auto-scrolling
- Adjust the speed slider as needed
- Click "Stop Scrolling" to pause
- Click the extension icon
- Click "Enable Voice Control"
- Grant microphone permission if prompted
- Speak your commands naturally
- The extension will show you what it heard and execute the command
The extension uses requestAnimationFrame for smooth, efficient scrolling synchronized with the display's refresh rate. The scroll speed can be adjusted in real time.
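One common way to keep the speed constant regardless of frame rate is to scale each step by the time elapsed since the previous frame. The following is a minimal sketch of that idea, not the extension's actual code: `BASE_PIXELS_PER_SECOND`, `pixelsForFrame`, and `startAutoScroll` are hypothetical names, and the window object is passed in so the loop can be exercised outside a browser.

```javascript
const BASE_PIXELS_PER_SECOND = 60; // assumed scroll speed at 1x

// Distance for one frame, based on elapsed time rather than frame count.
function pixelsForFrame(speedMultiplier, elapsedMs) {
  return (BASE_PIXELS_PER_SECOND * speedMultiplier * elapsedMs) / 1000;
}

// Start scrolling; `win` is normally `window`, `getSpeed` returns the current multiplier.
// Returns a function that stops the loop.
function startAutoScroll(win, getSpeed) {
  let lastTime = null; // timestamp of the previous frame
  let frameId = null;  // id of the pending animation frame
  let carry = 0;       // fractional pixels not yet scrolled

  function onFrame(now) {
    if (lastTime !== null) {
      carry += pixelsForFrame(getSpeed(), now - lastTime);
      const whole = Math.trunc(carry);
      if (whole !== 0) {
        win.scrollBy(0, whole); // scroll only whole pixels
        carry -= whole;         // keep the remainder for later frames
      }
    }
    lastTime = now;
    frameId = win.requestAnimationFrame(onFrame);
  }

  frameId = win.requestAnimationFrame(onFrame);
  return function stop() {
    win.cancelAnimationFrame(frameId);
  };
}
```

In a content script this would be started with something like `const stop = startAutoScroll(window, () => currentSpeed);`, where `currentSpeed` is whatever variable holds the slider value. Reading the speed through a callback means slider changes take effect on the next frame.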
- Uses the browser's built-in Web Speech API for voice recognition
- No audio data is sent to OpenAI - only the transcribed text
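- Recognition may not work offline: some browsers, including Chrome, send audio to a speech service run by the browser vendor to produce the transcript; OpenAI is only involved in command interpretation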
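For readers unfamiliar with the Web Speech API, here is a hedged sketch of how final transcripts can be collected from a `SpeechRecognition` instance. The `createRecognizer` helper and the `en-US` language setting are assumptions for illustration; the constructor is passed in so the same function works with `SpeechRecognition` or the prefixed `webkitSpeechRecognition`.

```javascript
// Hypothetical helper: wire a recognizer so that each final transcript
// is passed to `onTranscript`. Interim (non-final) results are ignored.
function createRecognizer(RecognitionCtor, onTranscript) {
  const recognition = new RecognitionCtor();
  recognition.continuous = true;      // keep listening after each phrase
  recognition.interimResults = false; // only report finished phrases
  recognition.lang = "en-US";         // assumed language

  recognition.onresult = (event) => {
    // Walk only the results that are new since the last event.
    for (let i = event.resultIndex; i < event.results.length; i++) {
      const result = event.results[i];
      if (result.isFinal) {
        onTranscript(result[0].transcript.trim());
      }
    }
  };
  return recognition;
}
```

In the browser this might be used as `createRecognizer(window.SpeechRecognition || window.webkitSpeechRecognition, handleText).start()`, after checking that one of the two constructors exists; `handleText` is a placeholder for whatever forwards the text to command interpretation.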
- Voice transcripts are sent to an OpenAI GPT model
- The AI interprets natural language commands and converts them to structured actions
- Uses the gpt-4o-mini model for fast, cost-effective processing
- Your API key is stored locally in browser storage
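The transcript-to-action step can be sketched as a single call to OpenAI's Chat Completions endpoint. This is an assumption-laden example rather than the code in content.js: the system prompt text, the `{"type": ...}` action format, and the function names are all hypothetical. Only the gpt-4o-mini model name comes from this README.

```javascript
const OPENAI_URL = "https://api.openai.com/v1/chat/completions";

// Hypothetical prompt; the real one lives in content.js.
const SYSTEM_PROMPT =
  "You convert voice commands for an auto-scrolling browser extension into JSON. " +
  'Reply only with a JSON object such as {"type":"pause"} or ' +
  '{"type":"navigate","target":"github.com"}.';

// Build the fetch options for one transcript.
function buildRequest(apiKey, transcript) {
  return {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      model: "gpt-4o-mini",
      response_format: { type: "json_object" }, // ask for JSON-only output
      messages: [
        { role: "system", content: SYSTEM_PROMPT },
        { role: "user", content: transcript },
      ],
    }),
  };
}

// Extract the action from the API response, falling back to "unknown".
function parseAction(responseJson) {
  const content = responseJson?.choices?.[0]?.message?.content;
  try {
    const action = JSON.parse(content);
    return action && typeof action.type === "string" ? action : { type: "unknown" };
  } catch {
    return { type: "unknown" }; // missing or malformed content
  }
}

// Send the transcript and return the parsed action.
async function interpretCommand(apiKey, transcript, fetchImpl = fetch) {
  const response = await fetchImpl(OPENAI_URL, buildRequest(apiKey, transcript));
  if (!response.ok) {
    throw new Error(`OpenAI request failed with status ${response.status}`);
  }
  return parseAction(await response.json());
}
```

Treating unparseable replies as `{ type: "unknown" }` keeps a malformed model response from throwing inside the voice handler; the caller can simply show a "command not understood" notification.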
- Your OpenAI API key is stored locally using Chrome's storage API
- Only voice command transcripts (text) are sent to OpenAI
- No browsing history or page content is transmitted
- Scrolling and command execution happen locally; command interpretation uses the OpenAI API, and speech recognition may rely on the browser vendor's speech service
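To illustrate what local key storage could look like, the sketch below saves and loads a key through an extension storage area. The `openaiApiKey` storage key and both function names are hypothetical. The storage area is passed in as a parameter, on the assumption that it is `chrome.storage.local` (or `browser.storage.local` in Firefox) with promise-returning `get` and `set`.

```javascript
const STORAGE_KEY = "openaiApiKey"; // assumed name of the stored entry

// Persist the key in the given extension storage area.
async function saveApiKey(storageArea, apiKey) {
  await storageArea.set({ [STORAGE_KEY]: apiKey });
}

// Read the key back; resolves to null when nothing has been stored.
async function loadApiKey(storageArea) {
  const stored = await storageArea.get(STORAGE_KEY);
  return stored[STORAGE_KEY] ?? null;
}
```

In an extension page this would be called as `await saveApiKey(chrome.storage.local, key)`, which requires the `storage` permission in manifest.json. Data in the local storage area stays on the device and is not synced to the user's account.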
- Voice recognition is free (uses browser API)
- OpenAI API calls cost approximately $0.0001-0.0002 per command
- For typical usage (10-20 commands per session), costs are minimal
- Consider setting usage limits in your OpenAI account
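Taking the figures above at face value, the per-session range works out to roughly $0.001 to $0.004. The short calculation below just multiplies the quoted bounds; `estimateSessionCost` is a hypothetical helper, and actual costs depend on current OpenAI pricing and prompt length.

```javascript
// Multiply the number of commands by the approximate cost of each one.
function estimateSessionCost(commands, costPerCommand) {
  return commands * costPerCommand;
}

const lowEstimate = estimateSessionCost(10, 0.0001);  // fewest commands, cheapest rate
const highEstimate = estimateSessionCost(20, 0.0002); // most commands, highest rate

console.log(`$${lowEstimate.toFixed(4)} to $${highEstimate.toFixed(4)} per session`);
```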
- Chrome: Full support (v88+)
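- Firefox: Auto-scrolling works with minor UI differences; voice control depends on the Web Speech API's SpeechRecognition interface, which Firefox does not enable by default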
- Edge: Full support (Chromium-based)
- Safari: Not supported (requires manifest v2)
- Ensure microphone permission is granted
- Check that you're using HTTPS (required for microphone access)
- Verify your API key is correct
- Try speaking more clearly or closer to the microphone
- Some websites override scroll behavior
- Try disabling other scrolling extensions
- Refresh the page and try again
- Verify your API key is valid and has credits
- Check your OpenAI account usage limits
- Ensure you have a stable internet connection
AutoReader/
├── manifest.json # Extension manifest
├── content.js # Main content script (scrolling & voice)
├── popup.html # Extension popup UI
├── popup.css # Popup styling
├── popup.js # Popup logic
├── background.js # Background service worker
├── icons/ # Extension icons
└── README.md # Documentation
Edit the system prompt in content.js:87-104 to add new command types.
Modify the model parameter in content.js:109 to use a different OpenAI model (e.g., gpt-4).
Contributions are welcome! Please feel free to submit pull requests or open issues.
MIT License - feel free to use and modify as needed.
- Uses OpenAI's GPT models for natural language understanding
- Built with the Web Speech API
- Inspired by the need for hands-free web browsing
- Custom speed presets
- Bookmark positions in long articles
- Reading progress tracking
- Multiple language support
- Offline command processing
- Custom wake word
- Reading mode with text highlighting