Image Insight is an application that uses Google's Gemini Pro Vision model to generate descriptive text for uploaded images. It provides a simple, interactive interface for exploring what the model can do.
- Upload an Image: Drop an image into the designated area, or click the area to choose a file.
- Type a Prompt: Enter a prompt or description related to the image to guide the generated text.
- Click "Submit": Press the "Submit" button to send the image and prompt to the model and receive descriptive text in return.
- Gradio: Powering the user interface and enabling seamless interactions.
- Google GenerativeAI (Gemini Pro Vision): Driving the image description generation.
- PIL (Python Imaging Library): Handling image processing.
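
As a rough illustration of how these three pieces might fit together, here is a minimal sketch. It is not the project's actual `app.py`: the function names (`build_parts`, `describe_image`, `build_demo`), the `GOOGLE_API_KEY` environment variable, the fallback prompt, and the `gemini-pro-vision` model identifier are all assumptions made for this example.

```python
import os


def build_parts(prompt, image):
    """Combine the prompt and image into the content list passed to the model."""
    if image is None:
        raise ValueError("Please upload an image first.")
    # Assumed fallback so an empty prompt still yields a description.
    text = (prompt or "").strip() or "Describe this image."
    return [text, image]


def describe_image(image, prompt):
    """Send the image (a PIL image) and prompt to Gemini; return the generated text."""
    # Imported lazily so build_parts can be used without the SDK installed.
    import google.generativeai as genai

    # Assumed: the API key is supplied through this environment variable.
    genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
    model = genai.GenerativeModel("gemini-pro-vision")  # assumed model identifier
    response = model.generate_content(build_parts(prompt, image))
    return response.text


def build_demo():
    """Build a Gradio interface with an image input, a prompt box, and a text output."""
    import gradio as gr

    return gr.Interface(
        fn=describe_image,
        # type="pil" makes Gradio hand the upload to describe_image as a PIL image.
        inputs=[gr.Image(type="pil", label="Image"), gr.Textbox(label="Prompt")],
        outputs=gr.Textbox(label="Description"),
        title="Image Insight",
    )
```

Calling `build_demo().launch()` would start the local web interface. Keeping the prompt handling in its own small function means it can be checked without network access or an API key.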
- Install the required packages:
pip install -r requirements.txt
- Run the application:
python app.py
Demo: Image Insight (https://huggingface.co/spaces/Papireddy/geminipro-describe-image)