itzmeahammed/Image-caption-generator-ai

🖼️ AI Image Caption Generator

A full-stack web application that generates captions for images using OpenAI's Vision API. A React frontend handles image upload and display, and a Python Flask backend handles image analysis and caption generation.

✨ Features

🖼️ Image Processing

  • Drag & Drop Upload: Easy image upload with preview
  • Batch Processing: Process multiple images at once
  • Image Optimization: Automatic image optimization and EXIF data removal
  • Multiple Formats: Support for PNG, JPG, JPEG, GIF, WebP

🤖 AI-Powered Captions

  • GPT-4 Vision API: Uses OpenAI's GPT-4 vision model
  • Detailed Analysis: Get comprehensive image analysis
  • Smart Descriptions: Context-aware caption generation
  • Customizable Prompts: Tailor captions to your needs

🌙 User Experience

  • Dark/Light Theme: Toggle between themes
  • Responsive Design: Works seamlessly on all devices
  • Real-time Processing: Live caption generation
  • Modern UI: Built with React, Tailwind CSS, and Framer Motion

📋 History & Management

  • Caption History: Save and manage your caption history
  • Export Options: Download captions and images
  • Favorites: Mark and organize favorite captions
  • Search & Filter: Easily find previous captions

🌍 Multi-language Support

  • Multilingual Captions: Generate captions in different languages
  • Translation API: Translate existing captions
  • Language Detection: Automatic language detection

🔒 Security & Performance

  • Rate Limiting: API rate limiting for fair usage
  • File Size Limits: Maximum 5MB file upload
  • CORS Protection: Secure cross-origin requests
  • Caching: Redis-based caching for performance
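The caching idea above can be sketched as follows. This is a minimal illustration, not the project's actual implementation: the names `cache_key` and `get_or_create_caption` are hypothetical, and a plain dictionary stands in for Redis so the example runs without a server. It assumes captions are keyed by a hash of the image bytes plus the caption style.

```python
import hashlib

# Hypothetical in-memory stand-in for the Redis cache; a real deployment
# would store these entries in Redis instead of a dict.
_cache = {}


def cache_key(image_bytes: bytes, style: str = "descriptive") -> str:
    """Derive a stable key from the image content and the caption style."""
    digest = hashlib.sha256(image_bytes).hexdigest()
    return f"caption:{style}:{digest}"


def get_or_create_caption(image_bytes: bytes, style: str, generate) -> str:
    """Return a cached caption, calling `generate` only on a cache miss."""
    key = cache_key(image_bytes, style)
    if key not in _cache:
        _cache[key] = generate(image_bytes, style)
    return _cache[key]
```

Keying on the content hash means that re-uploading the same image with the same style would not trigger a second Vision API call, whatever the file is named.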

🛠️ Tech Stack

Frontend

  • React 18: Modern UI library
  • Vite: Lightning-fast build tool
  • Tailwind CSS: Utility-first CSS framework
  • Framer Motion: Smooth animations
  • React Dropzone: File upload handling
  • React Hot Toast: Toast notifications
  • Axios: HTTP client

Backend

  • Python Flask: Web framework
  • OpenAI Vision API: Image analysis
  • Pillow: Image processing
  • Flask-CORS: Cross-origin support
  • Flask-Limiter: Rate limiting
  • Redis: Caching (optional)
  • Docker: Containerization

📋 Requirements

Backend

  • Python 3.8+
  • pip package manager
  • OpenAI API key
  • Redis (optional, for caching)

Frontend

  • Node.js 16.x or higher
  • npm or yarn package manager

🚀 Installation

1. Clone the Repository

git clone git@github.com:itzmeahammed/Image-caption-generator-ai.git
cd Image-caption-generator-ai

2. Set up the Backend

Option A: Direct Installation

cd backend
pip install -r requirements.txt
cp .env.example .env
# Edit .env and add your OpenAI API key
python app.py

Option B: Using Docker

docker-compose up -d

3. Set up the Frontend

cd frontend
npm install
npm run dev

4. Access the Application

  • Frontend: http://localhost:5173
  • Backend API: http://localhost:5000

🔧 Configuration

Environment Variables

Create a .env file in the backend directory:

# OpenAI Configuration
OPENAI_API_KEY=your_openai_api_key_here

# Flask Configuration
FLASK_ENV=development
FLASK_DEBUG=True

# Redis Configuration (optional)
REDIS_URL=redis://localhost:6379

# API Configuration
MAX_FILE_SIZE=5242880
ALLOWED_EXTENSIONS=png,jpg,jpeg,gif,webp

# CORS Configuration
CORS_ORIGINS=http://localhost:5173,http://localhost:3000
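As a rough sketch of how the backend might read these values, the snippet below parses the upload and CORS settings into typed Python objects. The function name `load_settings` and the fallback defaults are assumptions for illustration; the actual app.py may load its configuration differently.

```python
import os


def load_settings(env=None):
    """Parse upload and CORS settings from an environment-style mapping."""
    env = os.environ if env is None else env
    extensions = env.get("ALLOWED_EXTENSIONS", "png,jpg,jpeg,gif,webp")
    origins = env.get("CORS_ORIGINS", "")
    return {
        # 5242880 bytes = 5 * 1024 * 1024, the 5 MB upload limit
        "max_file_size": int(env.get("MAX_FILE_SIZE", "5242880")),
        # Comma-separated list, normalised to lowercase without whitespace
        "allowed_extensions": {
            ext.strip().lower() for ext in extensions.split(",") if ext.strip()
        },
        # Comma-separated list of origins allowed to call the API
        "cors_origins": [o.strip() for o in origins.split(",") if o.strip()],
    }
```

Calling `load_settings()` with no argument reads from the process environment, while passing a dict makes the parsing easy to exercise without touching real environment variables.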

📁 Project Structure

image-caption/
├── backend/
│   ├── app.py                 # Flask application
│   ├── requirements.txt       # Python dependencies
│   ├── .env.example           # Environment variables template
│   └── Dockerfile             # Docker configuration
├── frontend/
│   ├── src/
│   │   ├── components/        # React components
│   │   ├── pages/             # Page components
│   │   ├── hooks/             # Custom hooks
│   │   ├── utils/             # Utility functions
│   │   ├── App.jsx            # Main App component
│   │   └── main.jsx           # Entry point
│   ├── public/                # Static assets
│   ├── package.json           # Node dependencies
│   ├── vite.config.js         # Vite configuration
│   ├── tailwind.config.js     # Tailwind CSS config
│   └── Dockerfile.dev         # Development Docker config
├── docker-compose.yml         # Docker Compose configuration
├── Dockerfile                 # Production Docker config
├── setup.sh                   # Linux/Mac setup script
├── setup.bat                  # Windows setup script
├── test_api.py                # API testing script
├── test_backend.py            # Backend testing script
├── QUICKSTART.md              # Quick start guide
├── DEPLOYMENT.md              # Deployment guide
└── README.md                  # This file

🎯 API Endpoints

Caption Generation

  • POST /api/caption - Generate basic caption for an image
    {
      "image": "base64_encoded_image",
      "style": "descriptive"
    }
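    A caption request with this body could be built from Python roughly as shown below. This is a sketch using only the standard library. The helper name `build_caption_request` is hypothetical, the URL assumes the local backend address from the installation steps, and it assumes the `image` field takes plain base64 without a data-URL prefix. The response format is not documented here, so the example stops at constructing the request.

```python
import base64
import json
import urllib.request

# Assumed local backend address, taken from the installation steps
API_URL = "http://localhost:5000/api/caption"


def build_caption_request(image_bytes: bytes, style: str = "descriptive"):
    """Build (but do not send) a POST request carrying the JSON body."""
    payload = {
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "style": style,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

    With the backend running, the request can be sent with `urllib.request.urlopen(build_caption_request(image_bytes))`.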

Image Analysis

  • POST /api/analyze - Get detailed image analysis
    {
      "image": "base64_encoded_image",
      "detail_level": "high"
    }

Translation

  • POST /api/translate - Translate caption to different language
    {
      "caption": "English caption",
      "target_language": "es"
    }

History

  • GET /api/history - Get caption history
  • DELETE /api/history/<id> - Delete history entry

🚀 Usage

Basic Caption Generation

  1. Upload an image using drag & drop or file picker
  2. Wait for AI analysis
  3. View generated caption
  4. Copy, download, or share the caption

Batch Processing

  1. Upload multiple images
  2. Select batch processing mode
  3. Generate captions for all images
  4. Download results as CSV or JSON
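The export in the last step can be sketched as follows. This is only an illustration: the record fields `filename` and `caption` and the helper names `to_json` and `to_csv` are assumptions, since the exact export schema is not specified here.

```python
import csv
import io
import json


def to_json(results):
    """Serialise a list of caption records as pretty-printed JSON."""
    return json.dumps(results, indent=2, ensure_ascii=False)


def to_csv(results):
    """Serialise the same records as CSV with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["filename", "caption"])
    writer.writeheader()
    writer.writerows(results)
    return buffer.getvalue()
```

The `csv` module quotes captions that contain commas or quotation marks, so the output stays parseable even when captions are full sentences.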

Multilingual Captions

  1. Upload an image
  2. Select target language
  3. Generate caption in chosen language
  4. Translate to other languages as needed

🧪 Testing

Test Backend API

python test_api.py

Test Simple Functionality

python test_simple.py

Test Full Backend

python test_backend.py

📦 Docker Deployment

Build and Run

docker-compose up -d

Stop Services

docker-compose down

View Logs

docker-compose logs -f

🌐 Deployment

Frontend Deployment (Vercel)

cd frontend
npm run build
vercel --prod

Backend Deployment (Render/Railway)

  1. Connect GitHub repository
  2. Set environment variables
  3. Deploy with Docker

See DEPLOYMENT.md for detailed instructions.

🔐 Security Considerations

  • API Key Protection: Never commit .env files
  • Rate Limiting: Implemented to prevent abuse
  • File Validation: Strict file type and size checks
  • CORS Protection: Configured for allowed origins
  • Input Sanitization: All inputs are validated
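The file validation item could look something like the sketch below. The function name `validate_upload` and the error messages are hypothetical, and the limits mirror the defaults from the configuration section rather than the code in app.py.

```python
import os

# Limits taken from the configuration section of this README
MAX_FILE_SIZE = 5 * 1024 * 1024  # 5 MB
ALLOWED_EXTENSIONS = {"png", "jpg", "jpeg", "gif", "webp"}


def validate_upload(filename: str, data: bytes):
    """Return (ok, reason) after checking the extension and the size."""
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    if ext not in ALLOWED_EXTENSIONS:
        return False, f"unsupported file type: {ext or 'none'}"
    if len(data) == 0:
        return False, "empty file"
    if len(data) > MAX_FILE_SIZE:
        return False, "file exceeds 5 MB limit"
    return True, "ok"
```

An extension check alone does not prove that the bytes are a valid image, so a stricter setup would probably also try to open the file with Pillow before sending it to the API.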

🐛 Troubleshooting

Backend Issues

  • OpenAI API Error: Verify API key is correct and has sufficient credits
  • Port Already in Use: Change port in app.py or kill existing process
  • CORS Error: Check CORS configuration in .env

Frontend Issues

  • API Connection Failed: Ensure backend is running on correct port
  • Image Upload Failed: Check file size and format
  • Theme Not Persisting: Clear browser cache

Docker Issues

  • Container Won't Start: Check logs with docker-compose logs
  • Port Conflicts: Change ports in docker-compose.yml
  • Volume Issues: Ensure Docker has proper permissions

📚 Documentation

🤝 Contributing

Contributions are welcome! Please feel free to submit pull requests or open issues for bugs and feature requests.

📝 License

MIT License - see LICENSE file for details

🙏 Acknowledgments

  • OpenAI for the Vision API
  • React and Vite communities
  • Flask and Python communities
  • All open-source contributors

Happy Captioning! 🎉✨
