Rick0317/AutoReader

AutoReader - Voice-Controlled Auto-Scrolling Browser Extension

A Chrome/Firefox browser extension that automatically scrolls web pages at a constant speed and accepts voice commands, which are interpreted by an OpenAI GPT model.

Features

  • Auto-Scrolling: Smooth, constant-speed scrolling through web pages
  • Voice Control: Natural language voice commands processed by OpenAI
  • Adjustable Speed: Control scroll speed from 0.5x to 10x
  • Smart Commands: Pause, resume, speed control, and navigation via voice
  • Visual Feedback: On-screen status notifications for all actions

Voice Commands

Once voice control is enabled, you can say:

  • "Pause" or "Stop" - Pause scrolling
  • "Resume" or "Continue" - Resume scrolling
  • "Go faster" or "Speed up" - Increase scroll speed
  • "Go slower" or "Slow down" - Decrease scroll speed
  • "Go to [website]" - Navigate to a different website (e.g., "Go to github.com")
  • "Scroll to top" - Jump to the top of the page
  • "Scroll to bottom" - Jump to the bottom of the page
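
As a rough illustration of how these commands could map onto scroller state, the sketch below applies a structured action to a small state object. The action names (`pause`, `resume`, `faster`, `slower`), the `SPEED_STEP` value, and the `applyAction` helper are hypothetical and not taken from content.js; only the 0.5x to 10x range comes from the feature list.

```javascript
// Hypothetical reducer: the action names and step size are assumptions,
// not the extension's actual implementation.
const MIN_SPEED = 0.5;  // lower bound from the feature list
const MAX_SPEED = 10;   // upper bound from the feature list
const SPEED_STEP = 0.5; // assumed increment per "faster"/"slower" command

function clamp(value, min, max) {
  return Math.min(max, Math.max(min, value));
}

// Returns a new state object; the input state is never mutated.
function applyAction(state, action) {
  switch (action.type) {
    case 'pause':
      return { ...state, scrolling: false };
    case 'resume':
      return { ...state, scrolling: true };
    case 'faster':
      return { ...state, speed: clamp(state.speed + SPEED_STEP, MIN_SPEED, MAX_SPEED) };
    case 'slower':
      return { ...state, speed: clamp(state.speed - SPEED_STEP, MIN_SPEED, MAX_SPEED) };
    default:
      // Unknown actions leave the state untouched.
      return state;
  }
}
```

Keeping this step as a pure function means the speed limits can be checked without a browser or a microphone.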

Installation

Chrome

  1. Clone or download this repository
  2. Open Chrome and navigate to chrome://extensions/
  3. Enable "Developer mode" in the top right
  4. Click "Load unpacked" and select the AutoReader folder
  5. The extension icon should appear in your toolbar

Firefox

  1. Clone or download this repository
  2. Open Firefox and navigate to about:debugging#/runtime/this-firefox
  3. Click "Load Temporary Add-on"
  4. Navigate to the AutoReader folder and select manifest.json

Setup

  1. Get an OpenAI API Key:

    • Visit the OpenAI API Keys page
    • Create a new API key
    • Copy the key (it starts with sk-)
  2. Configure the API Key:

    • Copy config.example.js to config.js
    • Open config.js and replace 'sk-your-api-key-here' with your actual API key
    • Save the file
    • IMPORTANT: Never commit config.js to version control (it's in .gitignore)
  3. Grant Microphone Permission:

    • When you first enable voice control, your browser will ask for microphone permission
    • Click "Allow" to enable voice commands
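
For orientation, config.js might look roughly like the sketch below. The `CONFIG` object, the `OPENAI_API_KEY` field name, and the `hasRealApiKey` helper are assumptions for illustration; check config.example.js for the actual shape. Only the placeholder string and the `sk-` prefix come from the steps above.

```javascript
// Hypothetical shape of config.js; the real file may use different names.
const CONFIG = {
  OPENAI_API_KEY: 'sk-your-api-key-here', // replace with your own key
};

// Returns true only when the placeholder has been replaced by something
// that at least looks like a key (a string starting with "sk-").
function hasRealApiKey(config) {
  const key = config.OPENAI_API_KEY;
  return typeof key === 'string'
    && key.startsWith('sk-')
    && key !== 'sk-your-api-key-here';
}
```

A check like this can catch the most common setup mistake, which is leaving the placeholder in place, before any request is sent.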

Usage

Basic Controls

  1. Navigate to any web page
  2. Click the AutoReader extension icon
  3. Click "Start Scrolling" to begin auto-scrolling
  4. Adjust the speed slider as needed
  5. Click "Stop Scrolling" to pause

Voice Control

  1. Click the extension icon
  2. Click "Enable Voice Control"
  3. Grant microphone permission if prompted
  4. Speak your commands naturally
  5. The extension will show you what it heard and execute the command

How It Works

Auto-Scrolling

The extension uses requestAnimationFrame to scroll smoothly in step with the browser's repaint cycle. The scroll speed can be adjusted in real time while scrolling.
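
A minimal sketch of a time-based scroll loop follows. Because `requestAnimationFrame` fires at the display's refresh rate, the sketch scales each step by the elapsed time so that the speed stays constant in pixels per second. `BASE_PX_PER_SECOND`, `scrollStep`, and `startAutoScroll` are hypothetical names, and the base speed is an assumed value, not one taken from content.js.

```javascript
// Assumed base speed at 1x; the extension's real value may differ.
const BASE_PX_PER_SECOND = 60;

// Pixels to scroll for one frame, given the speed multiplier and the
// time elapsed since the previous frame.
function scrollStep(speedMultiplier, elapsedMs) {
  return BASE_PX_PER_SECOND * speedMultiplier * (elapsedMs / 1000);
}

// Starts a scroll loop on a window-like object and returns a function
// that stops it. getSpeed is read every frame, so speed changes take
// effect immediately.
function startAutoScroll(win, getSpeed) {
  let lastTime = null;
  let carry = 0; // fractional pixels not yet scrolled
  let frameId = null;
  let stopped = false;

  function onFrame(now) {
    if (stopped) return;
    if (lastTime !== null) {
      carry += scrollStep(getSpeed(), now - lastTime);
      const whole = Math.trunc(carry);
      if (whole !== 0) {
        win.scrollBy(0, whole); // scroll whole pixels only
        carry -= whole;         // keep the remainder for later frames
      }
    }
    lastTime = now;
    frameId = win.requestAnimationFrame(onFrame);
  }

  frameId = win.requestAnimationFrame(onFrame);
  return function stop() {
    stopped = true;
    win.cancelAnimationFrame(frameId);
  };
}
```

In a content script this would be called as `const stop = startAutoScroll(window, () => currentSpeed);`, where `currentSpeed` is whatever variable holds the slider value. Carrying the fractional remainder forward keeps slow speeds from stalling when a single frame's step is less than one pixel.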

Voice Recognition

  • Uses the browser's built-in Web Speech API for voice recognition
  • No audio data is sent to OpenAI - only the transcribed text
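  • Speech-to-text is handled by the browser, not by OpenAI; depending on the browser, the Web Speech API may itself rely on an online recognition service (Chrome's does), so voice recognition may not work offline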
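
The sketch below shows one way to wire up the Web Speech API. `SpeechRecognition` (exposed as `webkitSpeechRecognition` in Chromium-based browsers) and its `continuous`, `interimResults`, `lang`, and `onresult` members are part of that API; the `createRecognizer` wrapper and its `onTranscript` callback are hypothetical names used for illustration.

```javascript
// Builds a recognizer on a window-like object and forwards each final
// transcript to onTranscript. Returns null if recognition is unavailable.
function createRecognizer(win, onTranscript) {
  const Recognition = win.SpeechRecognition || win.webkitSpeechRecognition;
  if (!Recognition) {
    return null; // this browser does not expose speech recognition
  }

  const recognizer = new Recognition();
  recognizer.continuous = true;      // keep listening across phrases
  recognizer.interimResults = false; // only report final transcripts
  recognizer.lang = 'en-US';

  recognizer.onresult = (event) => {
    // Walk the results added since the last event and forward final ones.
    for (let i = event.resultIndex; i < event.results.length; i += 1) {
      const result = event.results[i];
      if (result.isFinal) {
        onTranscript(result[0].transcript.trim());
      }
    }
  };
  return recognizer;
}
```

In the page, `createRecognizer(window, handleTranscript)` would return an object whose `start()` method triggers the microphone permission prompt; `handleTranscript` stands in for whatever function sends the text on for interpretation.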

OpenAI Integration

  • Voice transcripts are sent to OpenAI's gpt-4o-mini model, chosen for fast, cost-effective processing
  • The model interprets natural language commands and converts them to structured actions
  • Your API key is stored locally in browser storage
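
As a sketch of the request described above, the helpers below build a Chat Completions call and parse the reply into a structured action. The endpoint and the `model` and `messages` fields follow OpenAI's Chat Completions API; the system prompt wording, the JSON action shape, and the helper names `buildCommandRequest` and `parseAction` are assumptions, not the contents of content.js.

```javascript
const OPENAI_URL = 'https://api.openai.com/v1/chat/completions';

// Assumed prompt; the extension's real system prompt lives in content.js.
const SYSTEM_PROMPT =
  'Convert the voice command into JSON of the form ' +
  '{"action": "pause" | "resume" | "faster" | "slower" | "top" | "bottom" | "navigate", ' +
  '"url": string | null}. Reply with JSON only.';

// Builds the URL and fetch options for one transcript.
function buildCommandRequest(transcript, apiKey, model = 'gpt-4o-mini') {
  return {
    url: OPENAI_URL,
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        model,
        messages: [
          { role: 'system', content: SYSTEM_PROMPT },
          { role: 'user', content: transcript },
        ],
      }),
    },
  };
}

// Extracts the action from a Chat Completions response body.
// Returns null when the reply is missing or is not valid JSON.
function parseAction(responseJson) {
  const content = responseJson?.choices?.[0]?.message?.content;
  if (typeof content !== 'string') return null;
  try {
    return JSON.parse(content);
  } catch {
    return null;
  }
}
```

A caller would pass the result to `fetch(url, options)` and hand the parsed JSON body to `parseAction`. Treating an unparseable reply as `null` lets the extension ignore a bad response instead of acting on it.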

Privacy & Security

  • Your OpenAI API key is stored locally using Chrome's storage API
  • Only voice command transcripts (text) are sent to OpenAI
  • No browsing history or page content is transmitted
  • Scrolling and command execution happen locally; command interpretation goes through the OpenAI API, and speech-to-text is handled by the browser's Web Speech API

Cost Considerations

  • Voice recognition is free (uses browser API)
  • OpenAI API calls cost approximately $0.0001-0.0002 per command
  • For typical usage (10-20 commands per session), costs are minimal
  • Consider setting usage limits in your OpenAI account
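
To make the estimate concrete, the sketch below multiplies the per-command range by the commands-per-session range quoted above. `estimateSessionCost` is a hypothetical helper, and the figures are this README's approximations, not measured prices.

```javascript
// Cost of one session in US dollars.
function estimateSessionCost(commands, costPerCommand) {
  return commands * costPerCommand;
}

const low = estimateSessionCost(10, 0.0001);  // fewest commands, cheapest rate
const high = estimateSessionCost(20, 0.0002); // most commands, priciest rate

console.log(`Estimated cost per session: $${low.toFixed(4)} to $${high.toFixed(4)}`);
```

That works out to roughly a tenth of a cent to four tenths of a cent per session.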

Browser Compatibility

  • Chrome: Full support (v88+)
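  • Firefox: Auto-scrolling supported with minor UI differences; voice control depends on Web Speech API recognition, which Firefox does not enable by default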
  • Edge: Full support (Chromium-based)
  • Safari: Not supported (requires manifest v2)

Troubleshooting

Voice Control Not Working

  1. Ensure microphone permission is granted
  2. Check that the page is served over HTTPS (browsers require a secure context for microphone access)
  3. Verify your API key is correct
  4. Try speaking more clearly or closer to the microphone

Scrolling Issues

  1. Some websites override scroll behavior
  2. Try disabling other scrolling extensions
  3. Refresh the page and try again

API Errors

  1. Verify your API key is valid and has credits
  2. Check your OpenAI account usage limits
  3. Ensure you have a stable internet connection
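
When debugging API errors, the HTTP status code usually narrows things down. The sketch below maps a few statuses to the checks above. OpenAI's API uses 401 for authentication failures and 429 for rate-limit or quota problems; the `describeApiError` helper and its messages are hypothetical.

```javascript
// Maps an HTTP status from the OpenAI API to a troubleshooting hint.
function describeApiError(status) {
  if (status === 401) {
    return 'Invalid or missing API key: check config.js.';
  }
  if (status === 429) {
    return 'Rate limit or quota reached: check usage limits and credits.';
  }
  if (status >= 500) {
    return 'OpenAI server error: retry after a short wait.';
  }
  return `Unexpected response (HTTP ${status}).`;
}
```

Showing a hint like this in the on-screen status notification would let a user tell a bad key apart from an exhausted quota.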

Development

Project Structure

AutoReader/
├── manifest.json   # Extension manifest
├── content.js      # Main content script (scrolling & voice)
├── popup.html      # Extension popup UI
├── popup.css       # Popup styling
├── popup.js        # Popup logic
├── background.js   # Background service worker
├── icons/          # Extension icons
└── README.md       # Documentation

Modifying Voice Commands

Edit the system prompt in content.js:87-104 to add new command types.

Changing the AI Model

Modify the model parameter in content.js:109 to use a different OpenAI model (e.g., gpt-4).

Contributing

Contributions are welcome! Please feel free to submit pull requests or open issues.

License

MIT License - feel free to use and modify as needed.

Acknowledgments

  • Uses OpenAI's GPT models for natural language understanding
  • Built with the Web Speech API
  • Inspired by the need for hands-free web browsing

Future Enhancements

  • Custom speed presets
  • Bookmark positions in long articles
  • Reading progress tracking
  • Multiple language support
  • Offline command processing
  • Custom wake word
  • Reading mode with text highlighting
