TextScan — AI Image to Text Extractor

A free, open-source web app that extracts text from images using Google Gemini Vision AI. Upload up to 20 images, crop to the exact region you need, and get clean extracted text instantly. No sign-up, no login required.

Live demo: ocr.gifted.co.ke GitHub: github.com/mauricegift/text-scan

Features

Upload 1–20 images at once (drag & drop or file picker)
Per-image cropping before extraction
Supports JPG, PNG, BMP, WEBP, GIF, TIFF (up to 20 MB each)
Extracts handwriting, printed text, screenshots, and scanned documents
Per-image copy buttons and a global "Copy All" button
Scan history page — past sessions saved privately in the browser (up to 50)
Light / dark theme with system preference detection
Smooth scroll animations (AOS)
Scraping protection (origin validation + user-agent filtering)
Rate limiting: 60 requests per 15 minutes burst + 20 extractions per IP per day (UTC reset)
Zero authentication — fully public

Tech Stack

Layer	Technology
Runtime	Node.js ≥ 18
Server	Express.js
OCR	Google Gemini 2.5 Flash (via `@google/genai`)
Image processing	Sharp
File uploads	Multer
Security	Helmet, express-rate-limit, custom IP daily limiter
Frontend	Vanilla HTML / CSS / JS, AOS animations, Font Awesome

Project Structure

textscan/
├── config.js              # Central config — loads .env, exports origins/rate limits
├── server.js              # Express entry point
├── render.yaml            # Render.com deployment config
├── .env.example           # Environment variable reference
├── lib/
│   ├── ipLimit.js         # IP-based daily rate limiter (20 req/day, persists to ips.json)
│   ├── ips.json           # Daily usage store — auto-purged each UTC day
│   ├── ocr.js             # Gemini API calls + key rotation
│   ├── routes.js          # API routes + origin guard middleware
│   └── upload.js          # Multer file upload handler
├── public/
│   ├── assets/
│   │   ├── favicon.svg    # App favicon
│   │   └── og.svg         # Social share card (1200×630)
│   ├── history/
│   │   └── index.html     # Scan history page (served at /history)
│   └── index.html         # Main app (single-page)
└── uploads/               # Temp directory — files deleted after OCR

Quick Start

Fork /import/clone this repo.
Open the Secrets panel and add:

Secret Value

API_KEYS Your Gemini API key (or key1,key2 for multiple)
Click Run. The app starts on port 7432 and is accessible via the preview URL.

Self-Hosted / VPS Setup

1. Clone the repository

git clone https://github.com/mauricegift/text-scan.git
cd text-scan

2. Install dependencies

npm install

3. Create a `.env` file

Copy the example and fill in your values:

cp .env.example .env

Edit .env:

# One or more Gemini API keys, comma-separated.
# Keys are tried in order — if one hits its quota, the next is used automatically.
API_KEYS=AIzaSy...key1,AIzaSy...key2

# Additional allowed origins for the /extract endpoint (optional).
# ALLOWED_ORIGINS=https://yourdomain.com

# Port the server listens on.
PORT=7432

4. Run the app

node server.js
# or with PM2 for production:
pm2 start server.js --name textscan

Deploy to Render.com

A render.yaml is included for one-click deploy:

Push the repo to GitHub.
Go to render.com → New Web Service → connect the repo.
Render auto-detects render.yaml and configures everything.
In the Render dashboard, set these environment variables:
- API_KEYS — your Gemini key(s)
- SESSION_SECRET — any random string
Deploy. Health check is at /health.

Note: The free plan uses an ephemeral filesystem, so lib/ips.json (daily usage counts) resets on each redeploy. Upgrade to a paid plan with persistent disk if you need usage to survive restarts.

Getting a Gemini API Key

Go to aistudio.google.com/app/apikey
Click Create API key and copy the key.
Add it to API_KEYS in your .env file.

The free tier is generous — for higher throughput, add multiple keys(use different google accounts):

API_KEYS=key1,key2,key3

Keys are rotated automatically when one hits its quota.

Environment Variables Reference

Variable	Required	Description
`API_KEYS`	Yes	Comma-separated Gemini API keys
`ALLOWED_ORIGINS`	No	Extra origins allowed to call `/extract`
`PORT`	No	Server port (default: `7432`)
`SESSION_SECRET`	No	Used on Render.com deployment

API Endpoints

Method	Path	Description
`GET`	`/`	Main app
`GET`	`/history`	Scan history page
`POST`	`/extract`	OCR endpoint — accepts `multipart/form-data` with `images[]`
`GET`	`/health`	Health check — returns `{ status, keys, uptime }`
`GET`	`/ready`	Readiness probe — returns `{ ready, keys }`

Rate Limiting

Two layers protect the /extract endpoint:

Burst limiter — express-rate-limit: 60 requests per IP per 15-minute window.
Daily limiter — custom ipDailyLimit middleware: 20 extractions per IP per UTC calendar day.
- Counts persist in lib/ips.json so server restarts don't reset usage.
- Automatically resets at UTC midnight — each new calendar day starts fresh.
- Stale entries from previous days are purged on every write.
- On limit hit, returns 429 with countdown: "Resets in 4h 22m."

Scraping Protection

The /extract endpoint is protected by two layers:

User-agent filtering — requests from curl, wget, Python-requests, axios, node-fetch, and other known scripting tools are rejected with 403 Access denied.
Origin validation — if an Origin or Referer header is present, it must match one of:
- Hardcoded: localhost:7432, ocr.giftedtech.co.ke
- Pattern: any *.giftedtech.co.ke subdomain,
- Custom: anything in ALLOWED_ORIGINS env var

Same-origin browser requests (no Origin header) always pass through.

Scan History

The /history page stores past extraction sessions in the browser's localStorage (key: ts-history):

Max 50 sessions retained (oldest auto-removed)
Each session shows: date, relative time, file count, character count
Expandable cards with per-image text and copy buttons
Copy-all, delete individual session, clear-all (with confirmation)
No server-side storage — fully private to the user's browser

Contributing

Pull requests are welcome. For major changes please open an issue first.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

TextScan — AI Image to Text Extractor

Features

Tech Stack

Project Structure

Quick Start

Self-Hosted / VPS Setup

1. Clone the repository

2. Install dependencies

3. Create a `.env` file

4. Run the app

Deploy to Render.com

Getting a Gemini API Key

Environment Variables Reference

API Endpoints

Rate Limiting

Scraping Protection

Scan History

Contributing

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
lib		lib
public		public
uploads		uploads
.env.example		.env.example
LICENSE		LICENSE
README.md		README.md
config.js		config.js
package.json		package.json
render.yaml		render.yaml
server.js		server.js

Folders and files

Latest commit

History

Repository files navigation

TextScan — AI Image to Text Extractor

Features

Tech Stack

Project Structure

Quick Start

Self-Hosted / VPS Setup

1. Clone the repository

2. Install dependencies

3. Create a .env file

4. Run the app

Deploy to Render.com

Getting a Gemini API Key

Environment Variables Reference

API Endpoints

Rate Limiting

Scraping Protection

Scan History

Contributing

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

3. Create a `.env` file

Packages