iNeoO/ocr

OCR

OCR is a full-stack document processing app that turns uploaded PDFs into structured outputs through a queued OCR pipeline.

App: https://ocr.tuturu.io

Features

  • PDF upload and process tracking
  • PDF split pipeline
  • OCR transcription and post-processing
  • Structured output download
  • OpenAPI docs exposed by the backend outside production
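To make the process tracking idea concrete, here is a minimal sketch of how the stages named in the feature list could be modeled as an ordered sequence. The stage names, `nextStage`, and `progress` are hypothetical and are not taken from this repository; the real pipeline may track state differently.

```typescript
// Hypothetical stage names inferred from the feature list above;
// the actual pipeline in this repo may use different states.
type Stage = "uploaded" | "split" | "transcribed" | "post_processed" | "ready";

const STAGE_ORDER: readonly Stage[] = [
  "uploaded",
  "split",
  "transcribed",
  "post_processed",
  "ready",
];

// Returns the stage that follows `current`, or null if it is the last one.
function nextStage(current: Stage): Stage | null {
  const i = STAGE_ORDER.indexOf(current);
  return STAGE_ORDER[i + 1] ?? null;
}

// Returns completion as a fraction between 0 and 1, for a progress indicator.
function progress(current: Stage): number {
  return STAGE_ORDER.indexOf(current) / (STAGE_ORDER.length - 1);
}
```

A worker could call `nextStage` after finishing its step and publish the result, while the frontend could render `progress` for each tracked upload.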

Stack

  • Frontend: React, TypeScript, Vite, TanStack Router
  • Backend: Node.js, TypeScript, tRPC
  • Database: PostgreSQL with Drizzle ORM
  • Queue and cache: RabbitMQ, Redis
  • Object storage: MinIO / S3-compatible storage
  • Monorepo: pnpm workspaces

Development

Install dependencies:

pnpm install

Start infrastructure with Docker:

docker compose up -d

Start the app in dev mode:

pnpm dev

pnpm dev runs a runtime build first, then starts the frontend, backend, and workers.

Build

Build runtime packages and apps:

pnpm build

Run lint across workspaces:

pnpm -r --if-present lint

Production Docker

Build and start the production stack from docker-compose.prod.yaml:

docker compose -f docker-compose.prod.yaml up -d --build

Environment template:

Main exposed ports in production compose:

  • Frontend: configure separately if you add the frontend service
  • Backend: 4010
  • Postgres: 5436
  • Redis: 6380
  • RabbitMQ: 5673
  • MinIO API: 9002
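As a quick reference, the ports above can be turned into local connection URLs. This is a sketch only: it assumes the ports are published on `localhost`, that the services use their usual URL schemes, and it leaves out credentials and database names. `PROD_PORTS`, `SCHEMES`, and `serviceUrl` are hypothetical names, not part of this repository.

```typescript
// Hypothetical helper: maps the production compose ports listed above
// to connection URLs. Host and schemes are assumptions; credentials and
// database names are deliberately left out.
type ServiceName = "backend" | "postgres" | "redis" | "rabbitmq" | "minio";

const PROD_PORTS: Record<ServiceName, number> = {
  backend: 4010,
  postgres: 5436,
  redis: 6380,
  rabbitmq: 5673,
  minio: 9002,
};

const SCHEMES: Record<ServiceName, string> = {
  backend: "http",
  postgres: "postgresql",
  redis: "redis",
  rabbitmq: "amqp",
  minio: "http",
};

// Builds a URL such as "redis://localhost:6380" for the given service.
function serviceUrl(name: ServiceName, host: string = "localhost"): string {
  return `${SCHEMES[name]}://${host}:${PROD_PORTS[name]}`;
}
```

For example, `serviceUrl("postgres")` gives a base URL to which a user, password, and database name from your own environment file would still need to be added.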

Notes

  • The production Dockerfiles build workspace artifacts and run compiled files from dist.
  • The frontend is currently not included in docker-compose.prod.yaml.
