A Chrome extension that captures screenshots and extracts code from them using the Google Gemini Vision API.
- Screenshot Capture: Select any area of your screen to capture
- AI Vision Analysis: Uses Google Gemini Vision API to directly analyze images
- Smart Code Detection: Automatically detects and extracts code with auto-completion
- Copy to Clipboard: Easy one-click copy of detected code
- Fast Processing: Single API call per capture, with results typically in 2-5 seconds
- Click the "Capture Screenshot" button
- Select the area containing code on your screen
- The Gemini Vision API analyzes the image and extracts the code
- Detected code is displayed with formatting
- Copy to clipboard with one click
Note: No OCR preprocessing needed - Gemini's vision model directly understands code in images!
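The image-to-code step described above could look roughly like the following. This is a minimal sketch, not the extension's actual GeminiService.js: the helper names `buildGeminiRequest` and `detectCode` and the prompt text are assumptions, and the endpoint assumes the public Gemini REST API with the `gemini-2.0-flash` model.

```javascript
// Hypothetical prompt; the real extension may word this differently.
const DEFAULT_PROMPT =
  "Extract the source code shown in this image. Return only the code.";

// Turns a screenshot data URL into a generateContent request body.
function buildGeminiRequest(dataUrl, prompt = DEFAULT_PROMPT) {
  // Data URLs look like "data:image/png;base64,AAAA..."
  const match = /^data:(image\/[a-z0-9.+-]+);base64,(.+)$/i.exec(dataUrl);
  if (!match) {
    throw new Error("Expected a base64-encoded image data URL");
  }
  const [, mimeType, base64Data] = match;
  return {
    contents: [
      {
        parts: [
          { text: prompt },
          { inline_data: { mime_type: mimeType, data: base64Data } },
        ],
      },
    ],
  };
}

// Sends the image in a single request. `fetchImpl` defaults to the global
// fetch and can be replaced with a stub when testing.
async function detectCode(dataUrl, apiKey, fetchImpl = fetch) {
  const url =
    "https://generativelanguage.googleapis.com/v1beta/models/" +
    "gemini-2.0-flash:generateContent?key=" +
    encodeURIComponent(apiKey);
  const response = await fetchImpl(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildGeminiRequest(dataUrl)),
  });
  if (!response.ok) {
    throw new Error("Gemini request failed with status " + response.status);
  }
  return response.json();
}
```

The screenshot travels as base64 inside the JSON body, so no separate upload or OCR pass is involved.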
- Node.js (v16 or higher)
- npm or yarn
- Google Gemini API key (available free from Google AI Studio)
- Clone the repository
    git clone <repository-url>
    cd code_detector
- Install dependencies
    npm install
- Configure API Key
  - Copy .env.example to .env:
      cp .env.example .env
  - Open .env and add your Gemini API key:
      VITE_GEMINI_API_KEY=your_actual_api_key_here
  - Important: Never commit the .env file to version control!
- Build the extension
    npm run build
- Load in Chrome
  - Open Chrome and go to chrome://extensions/
  - Enable "Developer mode" (toggle in top right)
  - Click "Load unpacked"
  - Select the dist folder from this project
- Start using
  - Click the extension icon to open the popup
  - Capture any code visible on your screen
- Click the extension icon in the Chrome toolbar
- Click "📸 Capture Screenshot"
- Drag to select the area with code
- Wait for AI analysis (2-5 seconds)
- View and copy the detected code
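The drag-to-select step above has to turn two mouse positions into a crop rectangle on the captured image. The sketch below shows one way this could work; `selectionToCropRect` and `cropToDataUrl` are hypothetical names, not taken from ScreenshotCapture.jsx, and it assumes the screenshot is captured at the display's device pixel ratio.

```javascript
// Converts a drag (start and end points in CSS pixels) into a crop rectangle
// in image pixels. The drag may go in any direction, so the top-left corner
// is the minimum of the two points on each axis.
function selectionToCropRect(start, end, devicePixelRatio = 1) {
  const left = Math.min(start.x, end.x);
  const top = Math.min(start.y, end.y);
  const width = Math.abs(end.x - start.x);
  const height = Math.abs(end.y - start.y);
  return {
    x: Math.round(left * devicePixelRatio),
    y: Math.round(top * devicePixelRatio),
    width: Math.round(width * devicePixelRatio),
    height: Math.round(height * devicePixelRatio),
  };
}

// Browser-only: crops a loaded image to the rectangle and returns a PNG data URL.
function cropToDataUrl(image, rect) {
  const canvas = document.createElement("canvas");
  canvas.width = rect.width;
  canvas.height = rect.height;
  canvas
    .getContext("2d")
    .drawImage(image, rect.x, rect.y, rect.width, rect.height, 0, 0, rect.width, rect.height);
  return canvas.toDataURL("image/png");
}
```

In the popup, the ratio would typically come from `window.devicePixelRatio`, for example `selectionToCropRect(dragStart, dragEnd, window.devicePixelRatio)`.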
The extension requires a Gemini API key to function. This is stored in a .env file:
    VITE_GEMINI_API_KEY=your_api_key_here
Getting your API key:
- Visit Google AI Studio
- Sign in with your Google account
- Create a new API key
- Copy the key to your .env file
Security Note:
- The .env file is ignored by git and should never be committed
- Use .env.example as a template for other developers
- Keep your API key private and secure
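Vite exposes variables prefixed with VITE_ on `import.meta.env`, so the app can read the key from there. The helper below is a sketch of how a missing or placeholder key might be caught early; `getGeminiApiKey` is a hypothetical name, and it takes the env object as a parameter so it also runs outside Vite.

```javascript
// Placeholder values copied from the .env examples in this README.
const PLACEHOLDER_KEYS = ["your_api_key_here", "your_actual_api_key_here"];

// Reads the Gemini API key from an env-like object.
// In the Vite app this would be called as getGeminiApiKey(import.meta.env).
function getGeminiApiKey(env) {
  const raw = env && env.VITE_GEMINI_API_KEY;
  const key = typeof raw === "string" ? raw.trim() : "";
  if (key === "" || PLACEHOLDER_KEYS.includes(key)) {
    throw new Error(
      "VITE_GEMINI_API_KEY is missing. Copy .env.example to .env and add your key."
    );
  }
  return key;
}
```

Vite inlines VITE_-prefixed values at build time, so the key is also present in the built dist output; treat a built copy of the extension as carefully as the .env file itself.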
- React 19
- Vite
- Google Gemini 2.0 Flash (Vision API)
- Chrome Extension Manifest V3
# Install dependencies
npm install
# Build for production
npm run build
# Run linter
npm run lint
- App.jsx: Main application component with screenshot capture logic
- ScreenshotCapture.jsx: Handles area selection and image capture
- GeminiService.js: Sends images to Gemini Vision API for code detection
- CodeEditor.jsx: Displays detected code with copy functionality
- background.js: Handles screenshot capture via Chrome API
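As an illustration of how the background script and the popup might cooperate, the sketch below answers a capture request with a PNG data URL from `chrome.tabs.captureVisibleTab`. The message type `CAPTURE_VISIBLE_TAB` and the helper `registerCaptureHandler` are hypothetical; the actual background.js may be organized differently. The `chrome` object is passed in as a parameter so the wiring can be exercised with a stub.

```javascript
// Hypothetical message type; the real extension may use a different name.
const CAPTURE_MESSAGE = "CAPTURE_VISIBLE_TAB";

// Registers a listener that answers capture requests with a PNG data URL.
function registerCaptureHandler(chromeApi) {
  chromeApi.runtime.onMessage.addListener((message, sender, sendResponse) => {
    if (!message || message.type !== CAPTURE_MESSAGE) {
      return false; // not a capture request; leave it to other listeners
    }
    chromeApi.tabs
      .captureVisibleTab(null, { format: "png" }) // null = current window
      .then((dataUrl) => sendResponse({ ok: true, dataUrl }))
      .catch((err) =>
        sendResponse({ ok: false, error: String((err && err.message) || err) })
      );
    return true; // keep the message channel open for the async response
  });
}

// In background.js this would be wired up as: registerCaptureHandler(chrome);
```

The popup side could then request a capture with `chrome.runtime.sendMessage({ type: "CAPTURE_VISIBLE_TAB" })`. Capturing the visible tab requires a matching permission in the manifest, such as activeTab.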
- ✅ Faster: Single API call vs OCR + text analysis
- ✅ More Accurate: Vision model understands code structure and context
- ✅ Auto-completion: Can fix incomplete or partially visible code
- ✅ No OCR Dependency: Direct image-to-code extraction
- ✅ Better Error Handling: Clearer feedback on detection issues
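To make the error-handling point concrete, here is a sketch of how a Gemini response could be turned into either detected code or a readable message. `extractCode` is a hypothetical helper, not the extension's actual code; it assumes the standard generateContent response shape (`candidates[0].content.parts[].text`) and the usual `error.message` field on failures.

```javascript
// Three backticks, built at runtime so this example does not contain a literal fence.
const FENCE = "`".repeat(3);
// Matches a reply that is exactly one fenced block, with an optional language tag.
const FENCED_REPLY = new RegExp(
  "^" + FENCE + "([\\w+#-]*)[ \\t]*\\n([\\s\\S]*?)\\n?" + FENCE + "$"
);

// Pulls the detected code out of a generateContent response and reports
// failures as data instead of throwing.
function extractCode(response) {
  if (response && response.error) {
    return { ok: false, error: response.error.message || "Gemini API error" };
  }
  const candidates = (response && response.candidates) || [];
  const parts =
    (candidates[0] && candidates[0].content && candidates[0].content.parts) || [];
  const text = parts.map((part) => part.text || "").join("").trim();
  if (text === "") {
    return { ok: false, error: "No code was detected in the selected area." };
  }
  // Models often wrap code in a Markdown fence; unwrap it when present.
  const fenced = FENCED_REPLY.exec(text);
  if (fenced) {
    return { ok: true, language: fenced[1] || null, code: fenced[2] };
  }
  return { ok: true, language: null, code: text };
}
```

Returning `{ ok, error }` rather than throwing lets the popup show the message next to the capture button without a try/catch around every call.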