Transform unstructured documents into actionable intelligence with AI-powered analysis, intelligent routing, and workflow automation.
- Universal Document Processing: Upload PDFs, DOCX, or TXT files, or paste text directly
- AI-Powered Analysis: Automatic classification, entity extraction, and summarization
- Smart Routing: Documents are automatically classified and assigned to the right team
- Risk Detection: Identify missing information, compliance issues, and potential risks
- Workflow Automation: Auto-generated checklists, draft emails, and action items
- Full Audit Trail: Complete history of all changes and actions
- Case Management: Track, filter, and manage all document cases
- Frontend: Next.js 14 (App Router), TypeScript, Tailwind CSS, shadcn/ui
- Backend: Next.js API Routes, Prisma ORM
- Database: PostgreSQL (Supabase)
- AI: Groq API (llama-3.3-70b-versatile) with fallback mock mode
- OCR: PyTesseract for PDF text extraction
- Node.js 18+
- npm or yarn
- Python 3.8+
- Tesseract OCR (for PDF text extraction)
- Poppler (for PDF to image conversion)
- Supabase account (free tier works fine)
- Clone the repository:
git clone <repository-url>
cd docops-copilot
- Install Node.js dependencies:
npm install
- Install system dependencies for OCR:
macOS:
brew install tesseract poppler
Ubuntu/Debian:
sudo apt-get update
sudo apt-get install tesseract-ocr poppler-utils
Windows:
- Tesseract: https://github.com/UB-Mannheim/tesseract/wiki
- Poppler: https://github.com/oschwartz10612/poppler-windows/releases
- Install Python dependencies for OCR:
cd python-services
pip3 install -r requirements.txt
cd ..
- Set up environment variables:
cp .env.example .env
Edit .env and add your credentials:
# Groq API Key (get from https://console.groq.com)
GROQ_API_KEY="your_groq_api_key_here"
# Supabase Database URLs (see SUPABASE_SETUP.md for details)
DATABASE_URL="postgresql://postgres:[PASSWORD]@db.[PROJECT-REF].supabase.co:5432/postgres?pgbouncer=true"
DIRECT_URL="postgresql://postgres:[PASSWORD]@db.[PROJECT-REF].supabase.co:5432/postgres"
# Optional Settings
MOCK_AI="false"
NEXT_PUBLIC_BASE_URL="http://localhost:3000"
📚 See the "Supabase Database Setup" section below for detailed instructions.
- Initialize the database:
# Generate Prisma client
npm run db:generate
# Push schema to Supabase
npm run db:push
- (Optional) Seed with sample data:
npm run seed
- Start the development server:
npm run dev

| Variable | Description | Required |
|---|---|---|
| `DATABASE_URL` | Supabase PostgreSQL connection string (pooled) | Yes |
| `DIRECT_URL` | Supabase PostgreSQL direct connection (for migrations) | Yes |
| `GROQ_API_KEY` | Groq API key for AI analysis | No (uses mock mode) |
| `MOCK_AI` | Force mock AI mode | No (default: false) |
| `NEXT_PUBLIC_BASE_URL` | Base URL for OCR API calls | No (default: http://localhost:3000) |
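
As a rough illustration of how the variables in this table could be read at startup, here is a minimal sketch. The `AppConfig` shape and the `loadConfig` helper are hypothetical names for illustration and are not taken from the project source; the defaults and the mock-mode rule follow the table.

```typescript
// Hypothetical config shape; field names are illustrative, not from the project.
interface AppConfig {
  databaseUrl: string;
  directUrl: string;
  groqApiKey: string | null;
  mockAi: boolean;
  baseUrl: string;
}

type Env = Record<string, string | undefined>;

function loadConfig(env: Env): AppConfig {
  // DATABASE_URL and DIRECT_URL are the only required variables.
  const missing = ["DATABASE_URL", "DIRECT_URL"].filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }

  const rawKey = (env.GROQ_API_KEY || "").trim();
  const groqApiKey = rawKey.length > 0 ? rawKey : null;

  return {
    databaseUrl: env.DATABASE_URL as string,
    directUrl: env.DIRECT_URL as string,
    groqApiKey,
    // Mock mode is on when forced, or when no Groq key is available.
    mockAi: env.MOCK_AI === "true" || groqApiKey === null,
    baseUrl: env.NEXT_PUBLIC_BASE_URL || "http://localhost:3000",
  };
}
```

In a Node.js process this would typically be called once as `loadConfig(process.env)`, so that a missing database URL fails fast instead of surfacing later as a connection error.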
- Go to https://supabase.com
- Sign up or log in
- Click "New Project"
- Choose your organization
- Enter project details:
- Name: docops-copilot (or your choice)
- Database Password: Generate a strong password (SAVE THIS!)
- Region: Choose closest to you
- Pricing Plan: Free tier works fine for development
- Click "Create new project"
- Wait 2-3 minutes for the project to be provisioned
- In your Supabase project dashboard, go to Settings (gear icon)
- Click Database in the left sidebar
- Scroll to "Connection string" section
- You'll see two types of connection strings:
- Click the "URI" tab
- Copy the connection string that looks like:
postgresql://postgres.xxxxx:password@aws-0-region.pooler.supabase.com:5432/postgres
- This uses connection pooling (recommended for Next.js/Vercel)
- Click "Direct connection"
- Copy the connection string that looks like:
postgresql://postgres:password@db.xxxxx.supabase.co:5432/postgres
- This is used for migrations
Update your .env file in the project root:
# Groq API Key
GROQ_API_KEY=your_groq_api_key_here
# Supabase Database URLs
# Replace [YOUR-PASSWORD] with your database password
# Replace [YOUR-PROJECT-REF] with your project reference
DATABASE_URL="postgresql://postgres:[YOUR-PASSWORD]@db.[YOUR-PROJECT-REF].supabase.co:5432/postgres?pgbouncer=true"
DIRECT_URL="postgresql://postgres:[YOUR-PASSWORD]@db.[YOUR-PROJECT-REF].supabase.co:5432/postgres"
# Optional Settings
MOCK_AI=false
NEXT_PUBLIC_BASE_URL=http://localhost:3000
Important: Replace the placeholders:
- [YOUR-PASSWORD] = Your database password
- [YOUR-PROJECT-REF] = Your project reference (e.g., abcdefghijklmnop)
# Generate Prisma client for PostgreSQL
npm run db:generate
# Push schema to Supabase
npm run db:push
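# Or create a migration instead (migrations are recommended for production,
# where they are applied with `npx prisma migrate deploy`)
npx prisma migrate dev --name init
# (Optional) Seed with sample data
npm run seed
This will create 10 sample cases for testing.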
# Open Prisma Studio to view your data
npm run db:studio
You should see your Supabase database with the Case, AuditEvent, CaseNote, and Attachment tables.
- Check that your DATABASE_URL is correct
- Verify your database password
- Make sure your Supabase project is running (green indicator in dashboard)
- Use DIRECT_URL for migrations
- Use DATABASE_URL (with pooling) for the application
- Supabase requires SSL by default (already included in the connection strings)
- Supabase free tier has connection limits
- Make sure you're using connection pooling (DATABASE_URL)
- Close unused connections
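
The pooled-versus-direct distinction above can be checked mechanically before starting the app. The following is a minimal sketch, assuming (as in the example `.env` in this README) that the pooled URL carries `?pgbouncer=true` and the direct URL does not; `checkConnectionUrls` is a hypothetical helper, not part of the project.

```typescript
interface UrlCheck {
  ok: boolean;
  problems: string[];
}

// Hypothetical helper: flags the most common mix-ups between the two URLs.
function checkConnectionUrls(databaseUrl: string, directUrl: string): UrlCheck {
  const problems: string[] = [];
  let pooled: URL;
  let direct: URL;
  try {
    pooled = new URL(databaseUrl);
    direct = new URL(directUrl);
  } catch (err) {
    return { ok: false, problems: ["One of the connection strings is not a valid URL"] };
  }

  // The application connection is expected to go through the pooler.
  if (pooled.searchParams.get("pgbouncer") !== "true") {
    problems.push("DATABASE_URL should include ?pgbouncer=true");
  }
  // Migrations are expected to use a direct, non-pooled connection.
  if (direct.searchParams.has("pgbouncer")) {
    problems.push("DIRECT_URL should not set pgbouncer");
  }
  return { ok: problems.length === 0, problems };
}
```

An empty `problems` array only means the URLs look consistent with this README's examples; it does not prove that the credentials or host are correct.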
- Table Editor: View and edit data directly
- SQL Editor: Run custom SQL queries
- Database: Manage tables, triggers, functions
- Storage: File storage (for future file uploads)
- Auth: User authentication (if needed later)
- API: Auto-generated REST and GraphQL APIs
- 500 MB database space
- 2 GB bandwidth per month
- 50,000 monthly active users
- Unlimited API requests
- Automatic backups (7 days)
Perfect for development and small production deployments!
/src
  /app                # Next.js app directory
    /api              # API route handlers
      /analyze        # Document analysis endpoint
      /cases          # Case CRUD endpoints
      /stats          # Dashboard statistics
    /cases            # Cases pages
      /[id]           # Case detail page
    page.tsx          # Home/upload page
    layout.tsx        # Root layout
  /components         # React components
    /ui               # shadcn/ui components
    /document         # Document-related components
    /case             # Case-related components
    /layout           # Layout components
  /lib                # Utilities
    /ai               # AI integration (analyzer, mock)
    /db               # Prisma client
    types.ts          # TypeScript types
    utils.ts          # Helper functions
/prisma
  schema.prisma       # Database schema
  seed.ts             # Seed data script
  /migrations         # Database migrations
- Upload a Document
  - Navigate to the home page
  - Drag and drop a PDF, DOCX, or TXT file, or paste text directly
  - Click "Analyze Document"
- Review Analysis Results
  - See the AI-generated classification, summary, and entities
  - Review risk flags and missing information
  - Check the recommended checklist and draft email
- Create a Case
  - Click "Save as Case" to create a case from the analysis
  - You'll be redirected to the case detail page
- Manage Cases
  - Navigate to the Cases dashboard
  - Filter by team, priority, or status
  - Search for specific cases
- Work on a Case
  - Open a case to see full details
  - Mark checklist items as complete
  - Add notes and track progress
  - View the complete audit trail
The seed data includes 10 diverse sample documents:
- Disputed Invoice - AP team, missing PO number
- NDA Contract - Legal team, missing signatures
- Software Engineer Resume - HR team, urgent
- IT Incident Report - IT team, security critical
- Sales Meeting Notes - Sales team, follow-up actions
- Privacy Policy Update - Legal team, GDPR compliance
- Purchase Order - Procurement, pending approval
- Customer Support Escalation - Support team, SLA breach
- Expense Report - Finance team, missing receipts
- Partnership Proposal - Sales/Legal, requires review
| Script | Description |
|---|---|
| `npm run dev` | Start development server |
| `npm run build` | Build for production |
| `npm run start` | Start production server |
| `npm run seed` | Seed database with sample data |
| `npm run db:migrate` | Run database migrations |
| `npm run db:studio` | Open Prisma Studio (database GUI) |
| `npm run db:generate` | Generate Prisma client |
| `npm run lint` | Run ESLint |
POST /api/analyze
- Content-Type: multipart/form-data (file upload) or application/json (text)
- Returns: AI analysis results
POST /api/ocr
- Content-Type: multipart/form-data (PDF file upload)
- Returns: Extracted text using OCR
GET /api/cases # List cases (with filters)
POST /api/cases # Create case
GET /api/cases/[id] # Get case details
PATCH /api/cases/[id] # Update case
DELETE /api/cases/[id] # Delete case
GET /api/cases/[id]/notes # Get case notes
POST /api/cases/[id]/notes # Add note
GET /api/cases/[id]/audit # Get audit events
POST /api/cases/[id]/audit # Create audit event
GET /api/stats # Get dashboard statistics
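
To show how a client might address these endpoints, here is a small sketch that builds the request pieces without sending them. The query parameter names (`team`, `priority`, `status`, `search`) and the JSON body field `text` are assumptions based on the dashboard filters and the "paste text" option described in this README; the real route handlers may expect different names.

```typescript
// Assumed filter names, mirroring the dashboard filters; not confirmed by the API.
interface CaseFilters {
  team?: string;
  priority?: string;
  status?: string;
  search?: string;
}

// Builds the URL for GET /api/cases, adding only the filters that are set.
function buildCasesUrl(baseUrl: string, filters: CaseFilters = {}): string {
  const url = new URL("/api/cases", baseUrl);
  const keys: (keyof CaseFilters)[] = ["team", "priority", "status", "search"];
  for (const key of keys) {
    const value = filters[key];
    if (value) url.searchParams.set(key, value);
  }
  return url.toString();
}

interface JsonRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds a JSON request for POST /api/analyze; the `text` field is an assumption.
function buildAnalyzeTextRequest(baseUrl: string, text: string): JsonRequest {
  return {
    url: new URL("/api/analyze", baseUrl).toString(),
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text }),
  };
}
```

With the dev server running, the result can be passed to `fetch`, for example `const { url, ...init } = buildAnalyzeTextRequest("http://localhost:3000", "..."); await fetch(url, init);`.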
When running without a Groq API key (or with MOCK_AI=true), the system generates rule-based mock responses:
- Document type detection based on keyword analysis
- Entity extraction using regex patterns
- Team assignment based on document classification
- Priority detection from urgency indicators
- Pre-defined templates for each document type
This allows full demo functionality without API costs.
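
The rules listed above can be sketched roughly as follows. This is an illustration only, not the project's actual mock analyzer: the keyword lists, the regex patterns, the `"Unassigned"` fallback team, and all function names are hypothetical, while the type-to-team mapping follows the sample documents in the seed data.

```typescript
type DocType = "invoice" | "contract" | "resume" | "incident" | "other";

// Hypothetical keyword lists; the real mock may use different terms.
const KEYWORDS: Record<Exclude<DocType, "other">, string[]> = {
  invoice: ["invoice", "amount due", "po number"],
  contract: ["agreement", "nda", "confidential"],
  resume: ["resume", "experience", "education"],
  incident: ["incident", "outage", "breach"],
};

const TEAM_BY_TYPE: Record<DocType, string> = {
  invoice: "AP",
  contract: "Legal",
  resume: "HR",
  incident: "IT",
  other: "Unassigned",
};

// Counts how many keywords appear as whole words or phrases (case-insensitive).
function countHits(text: string, keywords: string[]): number {
  return keywords.filter((kw) => new RegExp(`\\b${kw}\\b`, "i").test(text)).length;
}

// Picks the document type with the most keyword hits; no hits means "other".
function classify(text: string): DocType {
  let best: DocType = "other";
  let bestHits = 0;
  for (const type of Object.keys(KEYWORDS) as Exclude<DocType, "other">[]) {
    const hits = countHits(text, KEYWORDS[type]);
    if (hits > bestHits) {
      best = type;
      bestHits = hits;
    }
  }
  return best;
}

function mockAnalyze(text: string) {
  const docType = classify(text);
  return {
    docType,
    team: TEAM_BY_TYPE[docType],
    // Priority comes from simple urgency indicators.
    priority: /\b(urgent|asap|immediately|critical)\b/i.test(text) ? "high" : "normal",
    entities: {
      emails: text.match(/[\w.+-]+@[\w-]+(?:\.[\w-]+)+/g) ?? [],
      amounts: text.match(/\$\d[\d,]*(?:\.\d{2})?/g) ?? [],
      dates: text.match(/\b\d{4}-\d{2}-\d{2}\b/g) ?? [],
    },
  };
}
```

Matching on word boundaries keeps short keywords such as "nda" from firing inside unrelated words like "agenda", which matters more for rule-based classifiers than for an LLM.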
The system uses PyTesseract for OCR text extraction from PDF files:
- Standard PDF extraction is tried first (fast, works for text-based PDFs)
- OCR extraction is used as fallback for scanned/image-based PDFs
- PDFs are converted to images at 300 DPI for optimal OCR accuracy
- Text is extracted page by page and combined
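
The fallback order described above can be sketched as a small wrapper. This is a sketch under assumptions: the two extractor functions are injected placeholders (the real project calls a Python service for OCR), and the 20-character threshold for deciding that a PDF has no usable text layer is an arbitrary illustrative value.

```typescript
// Placeholder signature for both the standard extractor and the OCR extractor.
type Extractor = (pdf: Uint8Array) => Promise<string>;

// Treats a result as "no text layer" when almost nothing but whitespace came back.
// The default threshold is a hypothetical value, not taken from the project.
function needsOcr(extractedText: string, minChars: number = 20): boolean {
  return extractedText.replace(/\s+/g, "").length < minChars;
}

// Tries the fast text-layer extraction first and falls back to OCR only if needed.
async function extractPdfText(
  pdf: Uint8Array,
  standardExtract: Extractor,
  ocrExtract: Extractor
): Promise<string> {
  const direct = await standardExtract(pdf);
  if (!needsOcr(direct)) {
    return direct;
  }
  return ocrExtract(pdf);
}
```

Keeping the decision in a separate pure function makes the threshold easy to tune and test without touching a real PDF.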
To test OCR directly:
python3 python-services/ocr_service.py path/to/document.pdf
# Reset database (Supabase)
npx prisma migrate reset
# Or just push schema changes
npm run db:push
# Re-seed data
npm run seed
# Regenerate the Prisma client
npm run db:generate
# Reinstall dependencies
rm -rf node_modules
npm install
License: MIT