Experience cutting-edge image captioning with our project powered by the Gemini Pro Vision model. This state-of-the-art solution combines the power of generative AI with the precision of the Gemini Pro Vision model to automatically generate rich and contextually relevant captions for images.

riad5089/Image-Captioning-Web-App-with-Gemini-Pro-Vision

Image-Captioning-Web-App-with-Gemini-Pro-Vision

Introduction

Image captioning has become an essential tool for making content accessible and interactive in digital spaces. With the advent of advanced multimodal models like Google’s Gemini Pro Vision, generated captions have become more accurate and contextually relevant. This project shows how to build a simple web application with Streamlit and Google’s Gemini Pro Vision that generates captions for uploaded images.
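
The upload-and-caption flow described above could look roughly like the sketch below. It is not the repository's actual code. The prompt text, the `GOOGLE_API_KEY` environment variable, the function names `caption_image` and `run_app`, and the file name `app.py` are all assumptions made for illustration. The model name is taken from the project title and may not be available to every account.

```python
import os

# Hypothetical default prompt; the repository's actual prompt is not shown.
DEFAULT_PROMPT = "Write a short, descriptive caption for this image."


def caption_image(model, image, prompt=DEFAULT_PROMPT):
    """Send the prompt and image to the model and return the caption text."""
    response = model.generate_content([prompt, image])
    return response.text.strip()


def run_app():
    """Streamlit page: upload an image, then request a caption."""
    # Third-party imports are local so caption_image can be used on its own.
    import streamlit as st
    import google.generativeai as genai
    from PIL import Image

    # Assumption: the API key is supplied through the GOOGLE_API_KEY variable.
    genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
    # Assumption: model name taken from the project title; availability may vary.
    model = genai.GenerativeModel("gemini-pro-vision")

    st.title("Image Captioning with Gemini Pro Vision")
    uploaded = st.file_uploader("Upload an image", type=["jpg", "jpeg", "png"])
    if uploaded is not None:
        image = Image.open(uploaded)
        st.image(image, caption="Uploaded image")
        if st.button("Generate caption"):
            st.write(caption_image(model, image))


# To launch the page, call run_app() at the bottom of the file and start it with:
#   streamlit run app.py
```

Keeping `caption_image` separate from the page code means it can be exercised with any object that exposes `generate_content`, without a network call or an API key.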

STEPS to run the project:

STEP 01- Clone the repository

Project repo: https://github.com/riad5089/Image-Captioning-Web-App-with-Gemini-Pro-Vision.git

git clone https://github.com/riad5089/Image-Captioning-Web-App-with-Gemini-Pro-Vision.git

STEP 02- Create and activate a virtual environment after opening the repository

python -m venv env
env\Scripts\activate

The activation command above is for Windows. On macOS or Linux, use: source env/bin/activate

STEP 03- Install the requirements

pip install -r requirements.txt

Project Demo

Deployment

I built the web application with the Streamlit framework. It is hosted on share.streamlit; you can check out the app here.
