🖼️ AI Image Caption Generator

A full-stack web application that generates captions for images using OpenAI's Vision API. It pairs a React frontend with a Python Flask backend for image analysis and caption generation.

✨ Features

🖼️ Image Processing

Drag & Drop Upload: Easy image upload with preview
Batch Processing: Process multiple images at once
Image Optimization: Automatic image optimization and EXIF data removal
Multiple Formats: Support for PNG, JPG, JPEG, GIF, WebP

🤖 AI-Powered Captions

GPT-4 Vision API: Uses OpenAI's GPT-4 vision model for image understanding
Detailed Analysis: Get comprehensive image analysis
Smart Descriptions: Context-aware caption generation
Customizable Prompts: Tailor captions to your needs

🌙 User Experience

Dark/Light Theme: Toggle between themes
Responsive Design: Works seamlessly on all devices
Real-time Processing: Live caption generation
Modern UI: Built with React, Tailwind CSS, and Framer Motion

📋 History & Management

Caption History: Save and manage your caption history
Export Options: Download captions and images
Favorites: Mark and organize favorite captions
Search & Filter: Easily find previous captions

🌍 Multi-language Support

Multilingual Captions: Generate captions in different languages
Translation API: Translate existing captions
Language Detection: Automatic language detection

🔒 Security & Performance

Rate Limiting: API rate limiting for fair usage
File Size Limits: Uploads are limited to 5 MB
CORS Protection: Secure cross-origin requests
Caching: Redis-based caching for performance
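
The caching feature above can be illustrated with a content-derived cache key. This is a minimal sketch, not the project's actual code: the helper name `caption_cache_key`, the `caption:` key prefix, and the choice of SHA-256 are all assumptions. It shows how identical image bytes requested with the same style could map to the same Redis key, so a repeated upload might reuse a stored caption instead of calling the API again.

```python
import hashlib


def caption_cache_key(image_bytes: bytes, style: str = "descriptive") -> str:
    """Build a deterministic cache key from the image content and caption style.

    Hypothetical helper: the key layout is an assumption for illustration.
    """
    digest = hashlib.sha256(image_bytes).hexdigest()
    return f"caption:{style}:{digest}"


if __name__ == "__main__":
    # The same bytes and style always produce the same key.
    print(caption_cache_key(b"example image bytes", "descriptive"))
```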

🛠️ Tech Stack

Frontend

React 18: Modern UI library
Vite: Lightning-fast build tool
Tailwind CSS: Utility-first CSS framework
Framer Motion: Smooth animations
React Dropzone: File upload handling
React Hot Toast: Toast notifications
Axios: HTTP client

Backend

Python Flask: Web framework
OpenAI Vision API: Image analysis
Pillow: Image processing
Flask-CORS: Cross-origin support
Flask-Limiter: Rate limiting
Redis: Caching (optional)
Docker: Containerization

📋 Requirements

Backend

Python 3.8+
pip package manager
OpenAI API key
Redis (optional, for caching)

Frontend

Node.js 16.x or higher
npm or yarn package manager

🚀 Installation

1. Clone the Repository

git clone git@github.com:itzmeahammed/Image-caption-generator-ai.git
cd Image-caption-generator-ai

2. Set up the Backend

Option A: Direct Installation

cd backend
pip install -r requirements.txt
cp .env.example .env
# Edit .env and add your OpenAI API key
python app.py

Option B: Using Docker

docker-compose up -d

3. Set up the Frontend

cd frontend
npm install
npm run dev

4. Access the Application

Frontend: http://localhost:5173
Backend API: http://localhost:5000

🔧 Configuration

Environment Variables

Create a .env file in the backend directory:

# OpenAI Configuration
OPENAI_API_KEY=your_openai_api_key_here

# Flask Configuration
FLASK_ENV=development
FLASK_DEBUG=True

# Redis Configuration (optional)
REDIS_URL=redis://localhost:6379

# API Configuration
MAX_FILE_SIZE=5242880
ALLOWED_EXTENSIONS=png,jpg,jpeg,gif,webp

# CORS Configuration
CORS_ORIGINS=http://localhost:5173,http://localhost:3000
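
As a rough illustration of how the upload settings above might be consumed, the sketch below reads `MAX_FILE_SIZE` and `ALLOWED_EXTENSIONS` and checks a file against them. The helper names `load_upload_config` and `is_allowed_upload` are hypothetical and the real `app.py` may handle this differently; only the variable names and default values come from the configuration shown above.

```python
import os


def load_upload_config(env=os.environ):
    """Read upload limits from environment variables, falling back to the documented defaults."""
    max_size = int(env.get("MAX_FILE_SIZE", "5242880"))  # 5 * 1024 * 1024 bytes
    extensions = env.get("ALLOWED_EXTENSIONS", "png,jpg,jpeg,gif,webp")
    allowed = {ext.strip().lower() for ext in extensions.split(",") if ext.strip()}
    return max_size, allowed


def is_allowed_upload(filename, size_bytes, max_size, allowed):
    """Return True if the file has an allowed extension and fits within the size limit."""
    if "." not in filename:
        return False
    ext = filename.rsplit(".", 1)[1].lower()
    return ext in allowed and 0 < size_bytes <= max_size


if __name__ == "__main__":
    max_size, allowed = load_upload_config()
    print(is_allowed_upload("photo.jpg", 120_000, max_size, allowed))
```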

📁 Project Structure

image-caption/
├── backend/
│   ├── app.py                 # Flask application
│   ├── requirements.txt       # Python dependencies
│   ├── .env.example          # Environment variables template
│   └── Dockerfile            # Docker configuration
├── frontend/
│   ├── src/
│   │   ├── components/       # React components
│   │   ├── pages/           # Page components
│   │   ├── hooks/           # Custom hooks
│   │   ├── utils/           # Utility functions
│   │   ├── App.jsx          # Main App component
│   │   └── main.jsx         # Entry point
│   ├── public/              # Static assets
│   ├── package.json         # Node dependencies
│   ├── vite.config.js       # Vite configuration
│   ├── tailwind.config.js   # Tailwind CSS config
│   └── Dockerfile.dev       # Development Docker config
├── docker-compose.yml       # Docker Compose configuration
├── Dockerfile              # Production Docker config
├── setup.sh               # Linux/Mac setup script
├── setup.bat              # Windows setup script
├── test_api.py            # API testing script
├── test_backend.py        # Backend testing script
├── test_simple.py         # Simple functionality test script
├── QUICKSTART.md          # Quick start guide
├── DEPLOYMENT.md          # Deployment guide
└── README.md             # This file

🎯 API Endpoints

Caption Generation

POST /api/caption - Generate a basic caption for an image

{
  "image": "base64_encoded_image",
  "style": "descriptive"
}
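
A client could assemble this request body roughly as follows. This is a sketch under assumptions: `build_caption_request` is a hypothetical helper, and it assumes the backend expects plain base64 text (not a data URL) in the `image` field.

```python
import base64
import json


def build_caption_request(image_bytes: bytes, style: str = "descriptive") -> str:
    """Return a JSON body for POST /api/caption in the shape documented above."""
    # Assumption: the "image" field carries raw base64 text without a data-URL prefix.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"image": encoded, "style": style})


if __name__ == "__main__":
    body = build_caption_request(b"\x89PNG example bytes")
    print(body)
```

The resulting string would be sent with a `Content-Type: application/json` header to `http://localhost:5000/api/caption`. The response format is not documented here, so it is not shown.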

Image Analysis

POST /api/analyze - Get detailed image analysis

{
  "image": "base64_encoded_image",
  "detail_level": "high"
}

Translation

POST /api/translate - Translate a caption into a different language

{
  "caption": "English caption",
  "target_language": "es"
}

History

GET /api/history - Get caption history
DELETE /api/history/<id> - Delete history entry

🚀 Usage

Basic Caption Generation

Upload an image using drag & drop or file picker
Wait for AI analysis
View generated caption
Copy, download, or share the caption

Batch Processing

Upload multiple images
Select batch processing mode
Generate captions for all images
Download results as CSV or JSON
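
The CSV and JSON export step could look something like the sketch below. The helper `export_captions` and the `filename` and `caption` field names are assumptions for illustration; the application's real export format may differ.

```python
import csv
import io
import json


def export_captions(results, fmt="json"):
    """Serialize a list of {"filename", "caption"} dicts as JSON or CSV text."""
    if fmt == "json":
        return json.dumps(results, indent=2, ensure_ascii=False)
    if fmt == "csv":
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=["filename", "caption"])
        writer.writeheader()
        writer.writerows(results)
        return buffer.getvalue()
    raise ValueError(f"Unsupported format: {fmt}")


if __name__ == "__main__":
    rows = [{"filename": "beach.jpg", "caption": "A quiet beach at sunset"}]
    print(export_captions(rows, "csv"))
```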

Multilingual Captions

Upload an image
Select target language
Generate caption in chosen language
Translate to other languages as needed

🧪 Testing

Test Backend API

python test_api.py

Test Simple Functionality

python test_simple.py

Test Full Backend

python test_backend.py

📦 Docker Deployment

Build and Run

docker-compose up -d

Stop Services

docker-compose down

View Logs

docker-compose logs -f

🌐 Deployment

Frontend Deployment (Vercel)

cd frontend
npm run build
vercel --prod

Backend Deployment (Render/Railway)

Connect GitHub repository
Set environment variables
Deploy with Docker

See DEPLOYMENT.md for detailed instructions.

🔐 Security Considerations

API Key Protection: Never commit .env files
Rate Limiting: Implemented to prevent abuse
File Validation: Strict file type and size checks
CORS Protection: Configured for allowed origins
Input Sanitization: All inputs are validated

🐛 Troubleshooting

Backend Issues

OpenAI API Error: Verify API key is correct and has sufficient credits
Port Already in Use: Change port in app.py or kill existing process
CORS Error: Check CORS configuration in .env
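
To check whether something is already listening on the backend port before changing it, a small script like the one below may help. `port_in_use` is a hypothetical helper, and port 5000 is the backend port given in the Installation section.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP connection to host:port succeeds, meaning something is listening."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print("Port 5000 in use:", port_in_use(5000))
```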

Frontend Issues

API Connection Failed: Ensure backend is running on correct port
Image Upload Failed: Check file size and format
Theme Not Persisting: Clear browser cache

Docker Issues

Container Won't Start: Check logs with docker-compose logs
Port Conflicts: Change ports in docker-compose.yml
Volume Issues: Ensure Docker has proper permissions

📚 Documentation

QUICKSTART.md - Quick start guide
DEPLOYMENT.md - Deployment instructions
API Documentation - Backend API docs

🤝 Contributing

Contributions are welcome! Please feel free to submit pull requests or open issues for bugs and feature requests.

📝 License

MIT License - see LICENSE file for details

🙏 Acknowledgments

OpenAI for the Vision API
React and Vite communities
Flask and Python communities
All open-source contributors

Happy Captioning! 🎉✨
