An intelligent image captioning system that generates descriptive captions and relevant hashtags for uploaded images using state-of-the-art AI models. The system combines computer vision and natural language processing to create Instagram-style captions.
- Image Captioning: Generates multiple descriptive captions for uploaded images
- Hashtag Generation: Automatically creates relevant hashtags from captions
- Cloud Storage: Stores images in AWS S3 bucket
- Database Integration: Maintains a record of all processed images and their captions
- User-friendly Interface: Simple web interface built with Streamlit
- RESTful API: Backend service built with Flask
- Frontend: Streamlit
- Backend: Flask
- AI Models:
- Vision Encoder-Decoder (ViT-GPT2) for image captioning
- Transformers pipeline for hashtag generation
- Database: MySQL
- Cloud Storage: AWS S3
- Additional Libraries:
- NLTK for text processing
- PIL for image handling
- Boto3 for AWS integration
- Python 3.x
- MySQL Server
- AWS Account with S3 bucket
- Required Python packages (see requirements.txt)
- Clone the repository:

  ```bash
  git clone <repository-url>
  cd image-caption
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Set up MySQL database:

  ```sql
  CREATE DATABASE `image-caption`;
  USE `image-caption`;
  CREATE TABLE `image_data` (
    `image_id` varchar(255) NOT NULL,
    `captions` text NOT NULL,
    `hashtags` text NOT NULL,
    PRIMARY KEY (`image_id`)
  );
  ```

- Configure AWS credentials:
- Create an AWS account if you don't have one
- Create an S3 bucket named 'imagecaptionbucket-1'
- Configure AWS credentials in your environment
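One common way to supply those credentials is through environment variables; the key values below are placeholders, and a shared credentials file or IAM role works equally well:

```shell
export AWS_ACCESS_KEY_ID="your-access-key-id"
export AWS_SECRET_ACCESS_KEY="your-secret-access-key"
export AWS_DEFAULT_REGION="us-east-1"   # use your bucket's region
```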
- Start the backend server:

  ```bash
  cd server
  python app.py
  ```

- Start the frontend client:

  ```bash
  cd client
  streamlit run index.py
  ```

- Access the application at http://localhost:8501
- Open the web interface in your browser
- Click "Choose a file" to select an image
- Click "Upload" to process the image
- View the generated captions and hashtags
image-caption/
├── client/
│ └── index.py # Streamlit frontend
├── server/
│ └── app.py # Flask backend
├── requirements.txt # Python dependencies
└── README.md # This file
- Purpose: Upload and process an image
- Input: Image file
- Output: JSON containing captions and hashtags
- Response Format:
  ```json
  {
    "captions": ["caption1", "caption2", ...],
    "hashtags": "#tag1 #tag2 ..."
  }
  ```
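A client can consume a response of that shape like this; the payload values here are illustrative, not real model output:

```python
import json

# Example payload matching the documented response format.
raw = '{"captions": ["a dog running on the beach"], "hashtags": "#dog #running #beach"}'

data = json.loads(raw)
captions = data["captions"]   # list of caption strings
hashtags = data["hashtags"]   # single space-separated hashtag string
print(captions[0], "->", hashtags)
```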
- Image Captioning Model: ViT-GPT2
- Vision Transformer (ViT) for image encoding
- GPT-2 for text generation
- Generates multiple captions per image
- Maximum caption length: 16 tokens
- Beam search with 7 beams
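A minimal sketch of that generation step, assuming the widely used `nlpconnect/vit-gpt2-image-captioning` checkpoint (the repo's exact checkpoint and function names may differ):

```python
from PIL import Image
from transformers import VisionEncoderDecoderModel, ViTImageProcessor, AutoTokenizer

CHECKPOINT = "nlpconnect/vit-gpt2-image-captioning"  # assumed checkpoint

def generate_captions(image_path: str, num_captions: int = 3) -> list[str]:
    """Encode the image with ViT, then beam-search GPT-2 captions."""
    model = VisionEncoderDecoderModel.from_pretrained(CHECKPOINT)
    processor = ViTImageProcessor.from_pretrained(CHECKPOINT)
    tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)

    pixel_values = processor(
        images=Image.open(image_path).convert("RGB"), return_tensors="pt"
    ).pixel_values
    output_ids = model.generate(
        pixel_values,
        max_length=16,           # matches the 16-token cap above
        num_beams=7,             # matches the 7-beam search above
        num_return_sequences=num_captions,
    )
    return [tokenizer.decode(ids, skip_special_tokens=True) for ids in output_ids]
```

Because `num_return_sequences` must not exceed `num_beams`, up to 7 captions can be requested per image with this configuration.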
- Hashtag Generation:
- Uses text summarization pipeline
- Removes stopwords
- Formats output as Instagram-style hashtags
Feel free to submit issues and enhancement requests!