OCR is a full-stack document processing app that turns uploaded PDFs into structured outputs through a queued OCR pipeline.
App: https://ocr.tuturu.io
- PDF upload and process tracking
- PDF split pipeline
- OCR transcription and post-processing
- Structured output download
- OpenAPI docs exposed by the backend outside production
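
The features above suggest that each uploaded PDF moves through a fixed sequence of stages while its progress is tracked. A minimal sketch of that idea follows. The stage names and the `nextStage` and `progress` helpers are hypothetical and are not taken from the codebase; the real pipeline may model this differently.

```typescript
// Hypothetical stage names mirroring the feature list; not the app's actual schema.
const STAGES = ["uploaded", "split", "transcribed", "post_processed", "ready"] as const;
type Stage = (typeof STAGES)[number];

// Return the stage that follows `current`, or null when the job is finished.
function nextStage(current: Stage): Stage | null {
  const i = STAGES.indexOf(current);
  return STAGES[i + 1] ?? null;
}

// Fraction of the pipeline completed (0 to 1), e.g. for a progress indicator.
function progress(current: Stage): number {
  return STAGES.indexOf(current) / (STAGES.length - 1);
}
```

Keeping the stages in a single ordered tuple means the type, the ordering, and the progress calculation all derive from one definition.
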
- Frontend: React, TypeScript, Vite, TanStack Router
- Backend: Node.js, TypeScript, tRPC
- Database: PostgreSQL with Drizzle ORM
- Queue and cache: RabbitMQ, Redis
- Object storage: MinIO / S3-compatible storage
- Monorepo: pnpm workspaces
Install dependencies:

    pnpm install

Start infrastructure with Docker:

    docker compose up -d

Start the app in dev mode:

    pnpm dev

pnpm dev runs a runtime build first, then starts the frontend, backend, and workers.

Build runtime packages and apps:

    pnpm build

Run lint across workspaces:

    pnpm -r --if-present lint

Production compose file:

    docker compose -f docker-compose.prod.yaml up -d --build

Environment template:
Main exposed ports in production compose:
- Front services: configure separately if you add the frontend service
- Backend: 4010
- Postgres: 5436
- Redis: 6380
- RabbitMQ: 5673
- MinIO API: 9002
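
The port list above can be captured in one place so that local tooling does not hard-code numbers. This is a sketch only: the port values come from the list, while the URL schemes, the `localhost` default, and the `serviceUrl` helper are assumptions for illustration and are not part of the repository.

```typescript
// Host ports as listed for the production compose file.
const PROD_PORTS = {
  backend: 4010,
  postgres: 5436,
  redis: 6380,
  rabbitmq: 5673,
  minio: 9002,
} as const;

type Service = keyof typeof PROD_PORTS;

// Conventional URL schemes for each service (assumed, not read from the project).
const SCHEMES: Record<Service, string> = {
  backend: "http",
  postgres: "postgres",
  redis: "redis",
  rabbitmq: "amqp",
  minio: "http",
};

// Build a base URL for a service; `host` defaults to localhost (assumption).
function serviceUrl(service: Service, host: string = "localhost"): string {
  return `${SCHEMES[service]}://${host}:${PROD_PORTS[service]}`;
}
```

For example, `serviceUrl("redis")` yields `redis://localhost:6380`. Credentials, database names, and virtual hosts are omitted on purpose, since they depend on the environment file.
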
- The production Dockerfiles build workspace artifacts and run compiled files from dist.
- The frontend is currently not included in docker-compose.prod.yaml.