tmarktg/image-captioning

Image Captioning Meme Generator

Upload an image (or search Unsplash), get a BLIP-generated caption, have Claude rewrite it as a meme, and download the result with text burned into the image.

Demo

Setup

1. Backend

cd backend
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

Copy .env.example to .env and fill in your keys:

cp .env.example .env
# then edit .env

Required key:

Optional:

Start the backend (from inside backend/):

uvicorn main:app --reload

The first run will download the BLIP and DETR models (~2 GB total) and cache them in ~/.cache/huggingface/. The /api/health endpoint returns {"ready": true} once models are loaded.
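Because the first model download can take a while, a script that depends on the backend may want to wait for readiness instead of guessing. The following is a minimal sketch, not part of the repository: it polls /api/health until the response contains "ready": true. The helper names (`fetch_health`, `wait_until_ready`), the timeout values, and the assumption that the endpoint answers with a JSON object before the models finish loading are all illustrative.

```python
import json
import time
import urllib.request
from typing import Callable

HEALTH_URL = "http://localhost:8000/api/health"


def fetch_health(url: str = HEALTH_URL) -> dict:
    """Fetch and decode the JSON body returned by the health endpoint."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def wait_until_ready(
    fetch: Callable[[], dict] = fetch_health,
    timeout_s: float = 600.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the backend reports {"ready": true} or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            if fetch().get("ready") is True:
                return True
        except OSError:
            # Covers refused connections and urllib's URLError while the
            # server is still starting; keep polling.
            pass
        sleep(interval_s)
    return False


if __name__ == "__main__":
    print("Models ready" if wait_until_ready() else "Timed out waiting for backend")
```

The `fetch` and `sleep` parameters are injectable so the loop can be exercised without a running server.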

Backend runs on: http://localhost:8000

2. Frontend

cd frontend
npm install
npm run dev

Frontend runs on: http://localhost:5173

Usage

  1. Start the backend and wait for the "Models ready" confirmation in the terminal (or watch the loading banner in the UI).
  2. Open http://localhost:5173.
  3. Drag & drop an image or click the dropzone to upload.
  4. Optionally use the Unsplash search box to pull a fresh image by keyword.
  5. Wait roughly 10–30 seconds for model inference and the Claude rewrite.
  6. Download the meme.
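The same flow can in principle be driven from a script instead of the UI. The sketch below uses only the Python standard library to send an image to POST /api/caption and save whatever comes back. Several details are assumptions, since this README does not specify them: the multipart field name `file`, the uploaded filename and content type, and the idea that the response body is the finished image. Check `backend/main.py` for the real request and response shapes before relying on it.

```python
import uuid
import urllib.request

CAPTION_URL = "http://localhost:8000/api/caption"


def build_multipart(
    field: str, filename: str, data: bytes, content_type: str
) -> tuple[bytes, str]:
    """Encode one file as multipart/form-data; returns (body, Content-Type header)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def request_meme(image_path: str, out_path: str, field: str = "file") -> None:
    """Upload an image and write the raw response body to out_path."""
    with open(image_path, "rb") as f:
        data = f.read()
    # "file", "upload.jpg" and "image/jpeg" are hypothetical values.
    body, header = build_multipart(field, "upload.jpg", data, "image/jpeg")
    req = urllib.request.Request(
        CAPTION_URL, data=body, headers={"Content-Type": header}, method="POST"
    )
    # Generous timeout: inference plus the Claude rewrite can take a while.
    with urllib.request.urlopen(req, timeout=120) as resp, open(out_path, "wb") as out:
        out.write(resp.read())
```

A call such as `request_meme("cat.jpg", "meme.png")` would then stand in for steps 3 to 6, provided the assumptions above match the backend.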

Architecture

frontend (Vite React :5173)
    POST /api/caption   →  image upload → BLIP → DETR → Claude Haiku → Pillow → PNG
    GET  /api/unsplash  →  Unsplash fetch → same pipeline
    GET  /api/health    →  readiness check

backend/
  main.py              FastAPI app, endpoints, CORS, lifespan model load
  caption.py           BLIP captioning (Salesforce/blip-image-captioning-large)
  detect.py            DETR object detection (facebook/detr-resnet-50), optional
  memeify.py           Claude Haiku rewrite + Pillow text rendering
  unsplash_source.py   Unsplash API fetch
  fonts/Anton-Regular.ttf  bundled Impact-style font
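
To make the data flow in the diagram concrete, here is a hedged sketch of how the four stages could be chained, with DETR treated as optional as noted above. It is not the repository's code: the `MemePipeline` class, its field names, and the choice to pass detected labels into the rewrite step are assumptions made for illustration. Each stage is a plain callable, so the real BLIP, DETR, Claude, and Pillow functions could be slotted in.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class MemePipeline:
    """Chains caption -> (optional) detect -> rewrite -> render."""

    caption: Callable[[bytes], str]                # e.g. BLIP captioning
    rewrite: Callable[[str, Sequence[str]], str]   # e.g. Claude Haiku rewrite
    render: Callable[[bytes, str], bytes]          # e.g. Pillow text overlay
    detect: Optional[Callable[[bytes], Sequence[str]]] = None  # e.g. DETR labels

    def run(self, image: bytes) -> bytes:
        caption = self.caption(image)
        # Detection is optional; fall back to no labels when it is disabled.
        labels = self.detect(image) if self.detect else []
        meme_text = self.rewrite(caption, labels)
        return self.render(image, meme_text)


if __name__ == "__main__":
    # Stub stages stand in for the real models.
    pipeline = MemePipeline(
        caption=lambda img: "a dog on a couch",
        rewrite=lambda cap, labels: cap.upper(),
        render=lambda img, text: img + text.encode(),
    )
    print(pipeline.run(b"<image bytes>"))
```

Keeping the stages as injected callables lets the detection step be switched off without touching the rest of the chain, and makes the flow testable without loading any models.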

About

AI-powered meme generator with image captioning
