A browser extension that converts screenshots into optimized text descriptions for LLM context. This tool helps save tokens when using premium AI models like Claude and ChatGPT by pre-processing visual information through Gemini.
- Area Selection: Capture specific portions of any webpage with an intuitive selection tool
- Image Processing: Automatically process screenshots through Google's Gemini API
- Customizable Prompts: Choose from different prompt templates to control how images are described
- Copy to Clipboard: Easily copy the generated descriptions to paste into your preferred LLM
- Settings Management: Configure your API key and customize prompt templates
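
To illustrate the area selection feature, here is a minimal sketch of how a drag gesture might be turned into a crop rectangle. It is not taken from the extension's source: the names `Point`, `CropRect`, and `selectionToCropRect` are hypothetical, and the scaling by the device pixel ratio assumes the captured screenshot is measured in device pixels rather than CSS pixels.

```typescript
// Hypothetical helper types; not part of the extension's actual code.
interface Point {
  x: number;
  y: number;
}

interface CropRect {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Convert a drag gesture (start and end points in CSS pixels) into a crop
// rectangle in device pixels. Using min/abs means a drag in any direction
// (for example bottom-right to top-left) still yields a positive-size rectangle.
function selectionToCropRect(
  start: Point,
  end: Point,
  devicePixelRatio: number
): CropRect {
  const left = Math.min(start.x, end.x);
  const top = Math.min(start.y, end.y);
  const width = Math.abs(end.x - start.x);
  const height = Math.abs(end.y - start.y);

  // Assumption: the screenshot bitmap is scaled by devicePixelRatio,
  // so the CSS-pixel selection has to be scaled to match it.
  return {
    left: Math.round(left * devicePixelRatio),
    top: Math.round(top * devicePixelRatio),
    width: Math.round(width * devicePixelRatio),
    height: Math.round(height * devicePixelRatio),
  };
}
```

In a page script, the third argument would typically be `window.devicePixelRatio`, and the resulting rectangle could be passed to a canvas `drawImage` call to crop the full screenshot.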
- Click the extension icon and select an area of the screen to capture
- The extension processes the image through Gemini's vision capabilities
- A detailed text description is generated that you can copy and use as context in other LLMs
- Paste the description into your premium LLM conversation instead of the image to save tokens
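
The processing step above can be sketched as follows. This is an illustrative outline under stated assumptions, not the extension's actual implementation: the function names `buildGeminiRequest` and `extractDescription` are hypothetical, and the model name `gemini-1.5-flash` is an assumption, since this README does not say which Gemini model is used. The request shape follows the public Gemini REST `generateContent` endpoint.

```typescript
// Assumption: the README does not name the model; swap in whichever you use.
const GEMINI_MODEL = "gemini-1.5-flash";

interface GeminiRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Build a generateContent request from a screenshot data URL and a prompt.
// Hypothetical helper; the extension may structure this differently.
function buildGeminiRequest(
  dataUrl: string,
  prompt: string,
  apiKey: string
): GeminiRequest {
  // Split "data:image/png;base64,AAAA..." into MIME type and base64 payload.
  const match = /^data:(image\/[a-z0-9.+-]+);base64,(.+)$/i.exec(dataUrl);
  if (!match) {
    throw new Error("Expected a base64-encoded image data URL");
  }
  const [, mimeType, base64Data] = match;

  return {
    url: `https://generativelanguage.googleapis.com/v1beta/models/${GEMINI_MODEL}:generateContent`,
    headers: {
      "Content-Type": "application/json",
      "x-goog-api-key": apiKey,
    },
    body: JSON.stringify({
      contents: [
        {
          parts: [
            { text: prompt },
            { inline_data: { mime_type: mimeType, data: base64Data } },
          ],
        },
      ],
    }),
  };
}

// Pull the generated text out of a generateContent response.
// Returns an empty string if the response has no text parts.
function extractDescription(response: unknown): string {
  const r = response as {
    candidates?: { content?: { parts?: { text?: string }[] } }[];
  };
  const parts = r.candidates?.[0]?.content?.parts ?? [];
  return parts.map((p) => p.text ?? "").join("");
}
```

A caller would send the request with `fetch(req.url, { method: "POST", headers: req.headers, body: req.body })`, parse the JSON response, and pass it to `extractDescription` before copying the result to the clipboard. Keeping the request builder free of network calls makes it easy to test without an API key.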
- Token Efficiency: Reduce token usage in premium models by pre-processing images
- Better Context: Get detailed descriptions of visual content for your AI conversations
- Seamless Workflow: Quickly capture and process visual information without leaving your browser
- Currently only supports Google Chrome
- Download this repository from https://github.com/szeyu/Screenshot2LLMContext
- Open Chrome and navigate to chrome://extensions/
- Enable "Developer mode" in the top-right corner
- Click "Load unpacked" and select the downloaded repository folder
- Get your Gemini API key from Google AI Studio
- Add your Gemini API key in the extension settings page
- Start capturing screenshots and generating optimized LLM context
Bachelor of Computer Science (Artificial Intelligence)
Universiti Malaya
⭐️ Don't forget to star this repository if you find it helpful! ⭐️