AI-powered OCR service for converting PDFs and images to Markdown with 99.9% accuracy.
LLMOCR is a comprehensive AI-powered OCR (Optical Character Recognition) service that provides high-accuracy text extraction and document conversion capabilities. Built with Next.js and powered by advanced AI models, it offers multiple OCR features including PDF to Markdown conversion, multilingual text recognition, formula recognition, and key information extraction.
- PDF to Markdown: Convert PDF documents to Markdown format with high accuracy
- Image to Markdown: Extract text from images and convert to Markdown
- Multilingual Text Recognition: Support for multiple languages with advanced recognition
- Text Recognition: General-purpose text extraction from images
- Key Information Extraction: Extract structured data from documents
- Formula Recognition: Recognize mathematical formulas and equations
- Advanced Recognition: Enhanced OCR capabilities for complex documents
- API Key Management: Automatic API key rotation and load balancing
- Subscription System: Flexible credit-based billing with multiple plans
- User Authentication: Secure login with Google, GitHub, and email
- Document History: Track and manage all processed documents
- RESTful API: Developer-friendly API for integration
- Next.js 15.5.4 - React framework with App Router
- React 18.3.1 - UI library
- TypeScript 5.5.3 - Type-safe development
- Tailwind CSS - Utility-first CSS framework
- Radix UI - Accessible component primitives
- Framer Motion - Animation library
- Next.js API Routes - Serverless API endpoints
- NextAuth.js 5.0 - Authentication solution
- Prisma ORM - Type-safe database access
- PostgreSQL - Primary database
- Cloudflare R2 - File storage
- Creem - Payment processing
- Docker - Containerization
- Node.js 20 or higher
- PostgreSQL database
- pnpm package manager
- Clone the repository
git clone https://github.com/Selenium39/llmocr.git
cd llmocr
- Install dependencies
pnpm install
- Set up environment variables
cp .env.example .env
Edit the .env file with your configuration (see Environment Variables).
- Initialize the database
npx prisma migrate deploy
- Run the development server
pnpm dev
Open http://localhost:3000 in your browser.
# Application URL
NEXT_PUBLIC_APP_URL=http://localhost:3000
NEXT_PUBLIC_APP_NAME=LLMOCR
FREE_TRIAL_CREDITS=30
# Database
DATABASE_URL='postgresql://user:password@localhost:5432/llmocr'
# Authentication (generate with: openssl rand -base64 32)
AUTH_SECRET=your_secret_key
AUTH_URL=http://localhost:3000
AUTH_TRUST_HOST=true
# OAuth Providers (optional)
GOOGLE_CLIENT_ID=
GOOGLE_CLIENT_SECRET=
GITHUB_ID=
GITHUB_SECRET=
# Cloudflare R2 Storage
STORAGE_REGION=auto
STORAGE_BUCKET_NAME=your_bucket_name
STORAGE_ACCESS_KEY_ID=your_access_key
STORAGE_SECRET_ACCESS_KEY=your_secret_key
STORAGE_ENDPOINT=https://your_endpoint.r2.cloudflarestorage.com
STORAGE_PUBLIC_URL=https://your_public_url.r2.dev
# Payment Integration (Creem)
CREEM_API_KEY=
CREEM_API_URL=https://api.creem.io
CREEM_WEBHOOK_SECRET=
CREEM_PRODUCT_BASIC=
CREEM_PRODUCT_PRO=
CREEM_PRODUCT_ULTRA=
LLMOCR uses a database-based API key management system. Configure your OCR provider API keys in the admin dashboard:
- Navigate to Admin Panel > API Keys
- Add API keys for your OCR providers (MISTRAL, DASHSCOPE)
- Keys are automatically rotated using a round-robin algorithm
- Failed keys are automatically disabled
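OCR provider keys live in the database, but the remaining settings come from the environment variables listed earlier. As a minimal sketch of a startup check, the snippet below reports which variables are unset. The helper names (missingEnvVars, assertEnv) are hypothetical, and treating these three variables as mandatory is an assumption, not something the project states.

```typescript
// Names taken from the README's environment variable list. Treating these
// three as mandatory is an assumption made for this sketch.
const REQUIRED_ENV_VARS = ["DATABASE_URL", "AUTH_SECRET", "NEXT_PUBLIC_APP_URL"] as const;

type Env = Record<string, string | undefined>;

// Returns the names of required variables that are unset or blank.
function missingEnvVars(env: Env, required: readonly string[] = REQUIRED_ENV_VARS): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Hypothetical helper: fail early with a readable message.
function assertEnv(env: Env): void {
  const missing = missingEnvVars(env);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
}
```

In a Node.js process this could be called as assertEnv(process.env) before the server starts, so a missing value surfaces immediately rather than on the first database or auth call.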
- The project includes a docker-compose.yml file in the root directory. Update the environment variables as needed.
- Start the services:
docker-compose up -d
# Build the image
docker build -t llmocr .
# Run the container
docker run -p 3000:3000 \
-e DATABASE_URL="your_database_url" \
-e AUTH_SECRET="your_secret" \
llmocr
All API endpoints require authentication via API key or session token.
Using API Key:
curl -X POST "https://your-domain.com/api/pdf-to-markdown?key=YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"file_url": "https://example.com/document.pdf"}'
POST /api/pdf-to-markdown
POST /api/image-to-markdown
POST /api/text-recognition
POST /api/multilingual-text-recognition
POST /api/key-information-extraction
POST /api/formula-recognition
POST /api/advanced-recognition
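The endpoints above appear to share one calling convention: a JSON POST with the API key passed in the key query parameter, as in the curl example. The following is a small sketch of helpers for building such requests. Both buildOcrUrl and jsonPost are hypothetical names introduced here, not part of LLMOCR, and the request body shape for each endpoint should be checked against the actual API.

```typescript
// Endpoint paths copied from the list above.
const OCR_ENDPOINTS = [
  "pdf-to-markdown",
  "image-to-markdown",
  "text-recognition",
  "multilingual-text-recognition",
  "key-information-extraction",
  "formula-recognition",
  "advanced-recognition",
] as const;

type OcrEndpoint = (typeof OCR_ENDPOINTS)[number];

// Hypothetical helper: builds the request URL with the API key in the
// `key` query parameter, URL-encoding the key.
function buildOcrUrl(baseUrl: string, endpoint: OcrEndpoint, apiKey: string): string {
  const trimmed = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return `${trimmed}/api/${endpoint}?key=${encodeURIComponent(apiKey)}`;
}

// Hypothetical helper: request options for a JSON POST body.
function jsonPost(body: unknown): { method: "POST"; headers: Record<string, string>; body: string } {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  };
}
```

With these, a request could be issued as fetch(buildOcrUrl("https://your-domain.com", "pdf-to-markdown", key), jsonPost({ file_url: "https://example.com/document.pdf" })). Typing the endpoint as a union means a misspelled path is caught at compile time.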
const response = await fetch('https://your-domain.com/api/image-to-markdown?key=YOUR_API_KEY', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
},
body: JSON.stringify({
images: [
{
type: 'image_url',
image_url: 'data:image/jpeg;base64,...'
}
]
})
});
const result = await response.json();
console.log(result.content);
llmocr/
├── app/ # Next.js app directory
│ ├── api/ # API routes
│ ├── [locale]/ # Internationalized pages
│ └── ...
├── components/ # React components
├── lib/ # Utility functions and services
│ ├── dto/ # Data transfer objects
│ ├── services/ # Business logic
│ └── ...
├── prisma/ # Database schema and migrations
├── public/ # Static assets
├── config/ # Configuration files
└── content/ # Content for static pages
The application uses Prisma ORM with PostgreSQL. Key models include:
- User: User accounts and authentication
- Subscription: User subscription plans and credits
- ApiKey: OCR provider API key management
- Document: Processed documents (PDF, Image, etc.)
- BillingHistory: Transaction records
- RedeemCode: Promotional codes
# Generate migration
npx prisma migrate dev --name migration_name
# Apply migrations
npx prisma migrate deploy
# Open Prisma Studio
npx prisma studio
pnpm build
pnpm start
LLMOCR implements an intelligent API key rotation system:
- Round-Robin Algorithm: Distributes requests evenly across available keys
- Automatic Failover: Disables failed keys automatically
- Usage Tracking: Monitors key usage statistics
- Load Balancing: Prevents rate limiting by rotating keys
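The rotation behavior described above can be illustrated with a short in-memory sketch. This is not LLMOCR's implementation: the ProviderKey shape, the KeyRotator class, and the failure threshold are all assumptions made for illustration, and the real system stores keys in the database.

```typescript
// Hypothetical shape; the real ApiKey model's fields are not shown in this README.
interface ProviderKey {
  id: string;
  enabled: boolean;
  failures: number; // failed requests recorded so far
  uses: number;     // simple usage counter
}

class KeyRotator {
  private cursor = 0;
  private readonly keys: ProviderKey[];
  private readonly maxFailures: number;

  // maxFailures is an assumed threshold, not a documented setting.
  constructor(keys: ProviderKey[], maxFailures = 3) {
    this.keys = keys;
    this.maxFailures = maxFailures;
  }

  // Returns the next enabled key in round-robin order, or undefined if none remain.
  next(): ProviderKey | undefined {
    for (let i = 0; i < this.keys.length; i++) {
      const index = (this.cursor + i) % this.keys.length;
      const key = this.keys[index];
      if (key.enabled) {
        this.cursor = (index + 1) % this.keys.length;
        key.uses += 1;
        return key;
      }
    }
    return undefined;
  }

  // Records a failed request and disables the key once the threshold is reached.
  reportFailure(id: string): void {
    const key = this.keys.find((k) => k.id === id);
    if (!key) return;
    key.failures += 1;
    if (key.failures >= this.maxFailures) key.enabled = false;
  }
}
```

Skipping disabled keys inside next() keeps the rotation even across the keys that remain, which is one simple way to combine round-robin selection with automatic failover.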
Flexible credit-based billing:
- Free Trial: 30 pages for new users
- Basic Plan: 1,000 pages/month
- Pro Plan: 5,000 pages/month
- Ultra Plan: 20,000 pages/month
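To make the page allowances concrete, here is a minimal sketch of credit accounting. It assumes one credit corresponds to one processed page and does not model the monthly reset. The Balance type and the remainingPages and chargePages functions are hypothetical and only illustrate the arithmetic.

```typescript
// Page allowances copied from the plan list above.
const PLAN_PAGES = { free: 30, basic: 1000, pro: 5000, ultra: 20000 } as const;

type Plan = keyof typeof PLAN_PAGES;

// Hypothetical balance record: the plan and pages used in the current period.
interface Balance {
  plan: Plan;
  used: number;
}

// Pages still available under the plan's allowance.
function remainingPages(balance: Balance): number {
  return Math.max(0, PLAN_PAGES[balance.plan] - balance.used);
}

// Returns the updated balance, or null when the allowance is insufficient.
function chargePages(balance: Balance, pages: number): Balance | null {
  if (!Number.isInteger(pages) || pages <= 0) {
    throw new RangeError("pages must be a positive integer");
  }
  if (pages > remainingPages(balance)) return null;
  return { ...balance, used: balance.used + pages };
}
```

For example, a free-trial user who has used 25 pages can process a 5-page PDF but not a 6-page one. Returning a new object rather than mutating the input keeps the check and the deduction easy to wrap in a single database transaction.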
- Secure cloud storage with Cloudflare R2
- Document history and tracking
- Download in multiple formats
- Automatic cleanup of old files
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE.md file for details.
- Email: selenium39@qq.com
- GitHub Issues: Create an issue
- Built with Next.js
- Powered by AI OCR services
- UI components from Radix UI
- Styled with Tailwind CSS