sarmakska edited this page Jun 7, 2026 · 5 revisions

receipt-scanner

Working vision OCR starter. Photograph a receipt, get structured JSON, export to your accounting tool.

Built by Sarma Linux. MIT licence. Source at github.com/sarmakska/receipt-scanner.


What this is

You upload one receipt or fifty. Each image is downscaled and re-encoded, sent to a high-resolution vision model with the receipt schema as a structured-output constraint, validated against a Zod contract, and returned to the UI as a table. The model is constrained to emit the schema, so the output arrives as JSON in the expected shape rather than as free-form text I have to repair. The Zod check is there to catch the rare response that still does not fit.

It extracts vendor name and address, transaction date and time, itemised line items with quantity and unit price, subtotal, tax, tip, total, currency, and payment method when visible. From there you can store the originals on Cloudflare R2, export to CSV or OFX for Xero or QuickBooks, or wire the JSON into Supabase, an accounting API, or n8n.

Who this is for

  • Small business teams replacing manual receipt entry.
  • Builders prototyping an AI expense or bookkeeping product.
  • Engineers who want to see a vision-OCR pipeline end to end, with batching and a real accounting-tool export.

Architecture

Single-process Next.js 14 application. No separate worker, queue, or database in the default build. The whole pipeline runs server-side, which keeps the API key off the client and makes the cost surface easy to reason about.

flowchart TD
    A[Browser: app/page.tsx] -->|one file or many| B{route}
    B -->|single| C[app/api/scan]
    B -->|batch| D[app/api/scan/batch]
    C --> E[lib/pipeline: processReceipt]
    D --> F[lib/pipeline: processBatch]
    F --> E
    E --> G[lib/storage: R2 put + SHA-256]
    E --> H[lib/vision: Opus 4.7 structured output]
    H --> I[lib/schema: Zod validate]
    E --> J[lib/persist: save]
    I --> K[StoredReceipt JSON to UI]
    K --> L[app/api/export/ofx: OFX 1.0.2]
    K --> M[app/api/export/csv: CSV summary or items]

The request lifecycle, step by step:

  1. Upload. app/page.tsx posts the files as multipart form data. One file goes to /api/scan, several go to /api/scan/batch. No client-side processing, so the browser never holds an API key.
  2. Store the original. lib/storage.ts writes the image to Cloudflare R2, content-addressed by SHA-256. When R2 is not configured this is a no-op that still computes the hash, so deduplication and audit hashing work either way.
  3. Pre-process. sharp corrects EXIF orientation, downscales the long edge to MAX_IMAGE_PX (default 2576), and re-encodes to JPEG. This is the single biggest cost lever, since a larger image costs more input tokens.
  4. Vision call. lib/vision.ts sends the base64 image to Claude Opus 4.7 with the receipt JSON Schema as a structured-output constraint and a cached system prompt.
  5. Validate. The result is parsed by the Zod schema in lib/schema.ts. Because the model was already constrained, this is a boundary, not a repair step.
  6. Persist and respond. lib/persist.ts (a no-op stub by default) returns a complete StoredReceipt, which the UI renders and can export to CSV or OFX.
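Steps 2 and 3 come down to two small calculations: a content-addressed key and a target size for the long edge. The sketch below is illustrative only and is not taken from lib/storage.ts or lib/pipeline.ts. The helper names contentKey and fitLongEdge, the receipts/ prefix, and the .jpg suffix are assumptions.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: derive a content-addressed object key from the image bytes.
// The "receipts/" prefix and ".jpg" suffix are assumptions, not the project's layout.
export function contentKey(bytes: Uint8Array): string {
  const digest = createHash("sha256").update(bytes).digest("hex");
  return `receipts/${digest}.jpg`;
}

// Hypothetical helper: compute output dimensions so the long edge is at most
// maxPx, preserving aspect ratio and never upscaling a smaller image.
export function fitLongEdge(
  width: number,
  height: number,
  maxPx: number = 2576,
): { width: number; height: number } {
  const longEdge = Math.max(width, height);
  if (longEdge <= maxPx) {
    return { width, height };
  }
  const scale = maxPx / longEdge;
  return {
    width: Math.round(width * scale),
    height: Math.round(height * scale),
  };
}
```

Identical bytes always produce the same key, which is what makes deduplication work even when nothing is uploaded. With sharp, a resize using fit: "inside" and withoutEnlargement: true should produce equivalent dimensions without computing them by hand.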

Why a strict schema boundary

The schema is the contract. Structured outputs constrain the model to emit it, and Zod re-validates the response before anything else in the app sees it. This is why swapping the vision call (from Opus 4.7 to another provider, or to a local model) requires no changes outside lib/vision.ts: the rest of the app only ever sees a validated Receipt.

The JSON Schema handed to the model in lib/vision.ts is written by hand rather than derived from the Zod schema, because the SDK's Zod-to-JSON-Schema helper requires Zod 4 and the project pins Zod 3. Keep the two in sync.
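One cheap way to notice drift between the two schemas is to compare their top-level field names in a test. The sketch below is an assumption about how that could look, not code from the repository. The three fields are a trimmed, hypothetical subset, and schemaDrift is a made-up helper.

```typescript
// Trimmed, hypothetical stand-in for the hand-written schema in lib/vision.ts.
const RECEIPT_JSON_SCHEMA = {
  type: "object",
  additionalProperties: false,
  properties: {
    vendor_name: { type: ["string", "null"] },
    total: { type: ["number", "null"] },
    currency: { type: ["string", "null"] },
  },
  required: ["vendor_name", "total", "currency"],
} as const;

// Stand-in for the Zod side. With Zod 3 these names could come from
// Object.keys(schema.shape) on the object schema in lib/schema.ts.
const ZOD_FIELD_NAMES: readonly string[] = ["vendor_name", "total", "currency"];

// Hypothetical helper: report field names present on one side but not the other.
export function schemaDrift(
  jsonSchemaKeys: readonly string[],
  zodKeys: readonly string[],
): { missingFromZod: string[]; missingFromJsonSchema: string[] } {
  return {
    missingFromZod: jsonSchemaKeys.filter((k) => !zodKeys.includes(k)).sort(),
    missingFromJsonSchema: zodKeys.filter((k) => !jsonSchemaKeys.includes(k)).sort(),
  };
}

const drift = schemaDrift(Object.keys(RECEIPT_JSON_SCHEMA.properties), ZOD_FIELD_NAMES);
console.log(drift);
```

This only compares names at the top level. Types, nullability, and nested line items would still need a manual check or a deeper comparison.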

Component map

| File | Responsibility |
| --- | --- |
| app/page.tsx | Upload UI, parsed tables, CSV and OFX export triggers |
| app/api/scan/route.ts | Single-scan endpoint |
| app/api/scan/batch/route.ts | Batch endpoint, up to 50 files, per-file results |
| app/api/export/csv/route.ts | CSV download, summary or line-item layout |
| app/api/export/ofx/route.ts | OFX 1.0.2 statement download |
| lib/pipeline.ts | The one path: store, scan, persist; batch fan-out |
| lib/vision.ts | The single vision call. Opus 4.7, structured output, caching |
| lib/schema.ts | The Zod contract and the StoredReceipt type |
| lib/storage.ts | Optional Cloudflare R2 original-image storage |
| lib/csv.ts | RFC 4180 CSV generation, summary or line-item layout |
| lib/ofx.ts | OFX 1.0.2 generation |
| lib/persist.ts | save() stub. Replace with a Supabase insert or webhook |
| docs/schema.sql | Postgres / Supabase tables that mirror the contract |
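
For readers unfamiliar with what RFC 4180 asks of a generator like lib/csv.ts, the quoting rules fit in a few lines. This is a minimal sketch of those rules, not the project's implementation. The function names csvField and csvRow are hypothetical, and rendering null as an empty field is an assumption.

```typescript
type CsvValue = string | number | null;

// Quote a field only when it contains a comma, a double quote, or a line break.
// Embedded double quotes are escaped by doubling them, per RFC 4180.
// Assumption: null is written as an empty field.
export function csvField(value: CsvValue): string {
  if (value === null) {
    return "";
  }
  const text = String(value);
  if (/[",\r\n]/.test(text)) {
    return `"${text.replace(/"/g, '""')}"`;
  }
  return text;
}

// Join fields with commas and end the record with CRLF, as RFC 4180 specifies.
export function csvRow(values: CsvValue[]): string {
  return values.map(csvField).join(",") + "\r\n";
}

console.log(csvRow(['Joe\'s "Cafe", Ltd', 12.5, null]));
```

Vendor names are the usual reason quoting matters, since they often contain commas or quotation marks.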

Real-world examples

Expense capture for a small team. Staff snap a photo on their phone, the scan returns structured fields, the original lands in R2 for audit, and you insert into Supabase. Wire lib/persist.ts to a single insert against the tables in docs/schema.sql. See Wire-to-Database.

Month-end accounting import. Drop a folder of receipts into the batch upload, then click Export OFX and import the statement into Xero, QuickBooks, or GnuCash, or click Export CSV for a spreadsheet you can sort and pivot. See Batch-Upload, OFX-Export, and CSV-Export.

Automation fan-out with n8n. Add a webhook target in the pipeline and POST every validated receipt to an n8n workflow. From there you can branch on vendor, route for approval, or push to a spreadsheet without touching this codebase again.
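
A webhook target could look roughly like the sketch below. Everything here is an assumption for illustration: the function names, the trimmed Receipt type, and the idea of passing the fetch implementation in as a parameter (done so the function is easy to test). The URL would come from your own n8n workflow.

```typescript
// Hypothetical stand-in for the validated receipt type.
type Receipt = Record<string, unknown>;

interface WebhookRequest {
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Minimal shape of fetch that this sketch relies on.
type FetchLike = (
  url: string,
  init: WebhookRequest,
) => Promise<{ ok: boolean; status: number }>;

// Build the JSON POST request for one validated receipt.
export function buildWebhookRequest(receipt: Receipt): WebhookRequest {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(receipt),
  };
}

// Send the receipt and fail loudly on a non-2xx response, so a broken
// workflow does not silently drop receipts.
export async function forwardReceipt(
  url: string,
  receipt: Receipt,
  fetchImpl: FetchLike,
): Promise<void> {
  const response = await fetchImpl(url, buildWebhookRequest(receipt));
  if (!response.ok) {
    throw new Error(`Webhook responded with status ${response.status}`);
  }
}
```

In a Node 18+ or Next.js server context, the global fetch can be passed as fetchImpl. Whether a webhook failure should fail the whole scan or only be logged is a design choice this sketch leaves open.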

Provider benchmarking. Want to compare Opus 4.7 against another vision model on your own receipts? Replace the body of lib/vision.ts, keep the same schema, and the UI and validation stay identical. See Vision-Models.
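
The swap works because every provider sits behind the same validation step. The sketch below shows the idea with a hand-rolled check standing in for the Zod parse. The trimmed Receipt type, the VisionProvider signature, and the function names are hypothetical, not the project's actual interfaces.

```typescript
// Hypothetical trimmed receipt; the real contract lives in lib/schema.ts.
interface Receipt {
  vendor_name: string | null;
  total: number | null;
  currency: string | null;
}

// Any provider takes a base64 JPEG and returns unvalidated data.
type VisionProvider = (jpegBase64: string) => Promise<unknown>;

function nullableString(value: unknown, field: string): string | null {
  if (value === null) return null;
  if (typeof value === "string") return value;
  throw new Error(`${field} must be a string or null`);
}

function nullableNumber(value: unknown, field: string): number | null {
  if (value === null) return null;
  if (typeof value === "number" && Number.isFinite(value)) return value;
  throw new Error(`${field} must be a finite number or null`);
}

// Hand-rolled stand-in for the Zod parse: accept the shape or throw.
export function parseReceipt(raw: unknown): Receipt {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("Receipt must be an object");
  }
  const record = raw as Record<string, unknown>;
  return {
    vendor_name: nullableString(record["vendor_name"], "vendor_name"),
    total: nullableNumber(record["total"], "total"),
    currency: nullableString(record["currency"], "currency"),
  };
}

// The rest of the app calls this and never sees provider-specific output.
export async function scanWith(
  provider: VisionProvider,
  jpegBase64: string,
): Promise<Receipt> {
  return parseReceipt(await provider(jpegBase64));
}
```

To benchmark two providers, call scanWith with each one over the same set of images and compare the validated results field by field.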

Troubleshooting

ANTHROPIC_API_KEY is not set or a 401 from the model. Copy .env.example to .env.local and set a key with vision access. The key is read server-side only. Restart the dev server after changing env files.

The build fails with a sharp native binary error. sharp ships platform-specific binaries. If your package manager skipped its build script, run pnpm rebuild sharp. On serverless, confirm the platform provides the native libraries; Vercel does. See Deployment.

A scan throws a validation error. The model returned output that did not satisfy the schema. This is the boundary doing its job. Inspect the raw response; if the failure is systematic, tighten the system prompt in lib/vision.ts or relax the affected field in both lib/schema.ts and the hand-written RECEIPT_JSON_SCHEMA. See Edge-Cases.

Images do not appear in R2. R2 is optional and only active when all four R2_* variables are set. With them unset, scanning still works and image_key is null. See Image-Storage.

A batch reports some files failed. Each file is scanned independently, so one unreadable image is reported as a per-file error while the rest succeed. Check the results array in the response. See Batch-Upload.
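
The per-file behaviour can be pictured as each scan catching its own error, so a failure becomes a result entry rather than a rejected batch. This is a sketch under assumptions, not the code in lib/pipeline.ts. The FileResult shape, scanAll, and failedFiles are hypothetical names, and the real response fields may differ.

```typescript
// Hypothetical per-file result shape.
type FileResult<T> =
  | { filename: string; ok: true; receipt: T }
  | { filename: string; ok: false; error: string };

interface InputFile {
  name: string;
  bytes: Uint8Array;
}

// Scan every file independently. A thrown error is captured per file,
// so one unreadable image never rejects the whole batch.
export async function scanAll<T>(
  files: InputFile[],
  scanOne: (bytes: Uint8Array) => Promise<T>,
): Promise<FileResult<T>[]> {
  return Promise.all(
    files.map(async (file): Promise<FileResult<T>> => {
      try {
        return { filename: file.name, ok: true, receipt: await scanOne(file.bytes) };
      } catch (err) {
        const message = err instanceof Error ? err.message : String(err);
        return { filename: file.name, ok: false, error: message };
      }
    }),
  );
}

// List the filenames that failed, for display or retry.
export function failedFiles<T>(results: FileResult<T>[]): string[] {
  return results.filter((r) => !r.ok).map((r) => r.filename);
}
```

This sketch starts all scans at once. A production batch would likely cap concurrency to stay within provider rate limits.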

OFX import shows nothing or wrong amounts. Receipts with a null total post as zero. Make sure the receipts you export carry a numeric total, or correct them before export. See OFX-Export.
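
One way to catch null totals before they post as zero is to split the receipts ahead of export. The helper below is a hypothetical sketch. The name splitByTotal and the idea of a review pile are assumptions, and only the total field comes from the text above.

```typescript
// Hypothetical helper: separate receipts that carry a numeric total from
// those that need correcting before an OFX export.
export function splitByTotal<T extends { total: number | null }>(
  receipts: T[],
): { exportable: T[]; needsReview: T[] } {
  const exportable: T[] = [];
  const needsReview: T[] = [];
  for (const receipt of receipts) {
    if (typeof receipt.total === "number" && Number.isFinite(receipt.total)) {
      exportable.push(receipt);
    } else {
      needsReview.push(receipt);
    }
  }
  return { exportable, needsReview };
}

const { exportable, needsReview } = splitByTotal([
  { vendor_name: "Cafe", total: 12.5 },
  { vendor_name: "Taxi", total: null },
]);
console.log(exportable.length, needsReview.length);
```

Only the exportable list would then be sent to the OFX export, with the rest shown to the user for correction.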

Blurry or low-light photos return sparse fields. The model returns what it can read. Improve capture conditions, or raise MAX_IMAGE_PX to retain more detail at the cost of more tokens. See Configuration.

Stack

Next.js 14 App Router, TypeScript, Claude Opus 4.7 vision (claude-opus-4-7), structured outputs, sharp, Zod, Cloudflare R2 via @aws-sdk/client-s3, CSV (RFC 4180) and OFX 1.0.2 export, Tailwind CSS.

