JoseRFelix/phase1-document-extractor

Phase 1 Invoice Extractor

This is a Next.js 16 app for a focused v1 invoice extraction workflow:

  • Upload a single PDF, PNG, or JPG invoice
  • For PDFs, render up to the first 5 pages as images
  • Send the document to Gemini through the Vercel AI SDK
  • Validate the result against a strict Zod schema
  • Review the extracted fields, line items, and raw JSON in the UI
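The upload and page-cap rules in the list above can be sketched as two small pure functions. This is a minimal illustration, not the repository's code: the names `classifyUpload`, `pagesToRender`, and `MAX_PDF_PAGES` are hypothetical, and the MIME types are the standard ones for PDF, PNG, and JPG.

```typescript
// Hypothetical helpers; names and structure are illustrative, not taken from the repository.
const MAX_PDF_PAGES = 5; // the cap on rendered PDF pages stated in the workflow list

type UploadKind = "pdf" | "image";

// Map a MIME type to the kind of upload, or null if the type is not accepted.
function classifyUpload(mimeType: string): UploadKind | null {
  switch (mimeType.toLowerCase()) {
    case "application/pdf":
      return "pdf";
    case "image/png":
    case "image/jpeg":
      return "image";
    default:
      return null;
  }
}

// Return the 1-based page numbers to render for a PDF with `pageCount` pages,
// never more than `maxPages`.
function pagesToRender(pageCount: number, maxPages: number = MAX_PDF_PAGES): number[] {
  const pages: number[] = [];
  const last = Math.min(Math.floor(pageCount), maxPages);
  for (let page = 1; page <= last; page++) {
    pages.push(page);
  }
  return pages;
}
```

Under these assumptions, a 12-page PDF would yield pages 1 through 5, and an unsupported type such as `image/gif` would be rejected before any model call is made.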

Environment

Copy the example env file:

cp .env.example .env.local

Then set your Google AI Studio key in .env.local:

GOOGLE_GENERATIVE_AI_API_KEY=your_google_ai_studio_api_key
GEMINI_MODEL=gemini-3-flash-preview

GEMINI_MODEL is optional. The app defaults to gemini-3-flash-preview.
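One way to read these variables with the documented default is sketched below. The function name `resolveConfig` and the choice to throw on a missing key are assumptions made for illustration; the app may handle configuration differently.

```typescript
// Hypothetical config helper; only the variable names and the default model come from the README.
const DEFAULT_GEMINI_MODEL = "gemini-3-flash-preview";

interface ExtractorConfig {
  apiKey: string;
  model: string;
}

// Resolve configuration from an environment-like object.
// Throws if the API key is missing; falls back to the default model if GEMINI_MODEL is unset or blank.
function resolveConfig(env: Record<string, string | undefined>): ExtractorConfig {
  const apiKey = (env["GOOGLE_GENERATIVE_AI_API_KEY"] ?? "").trim();
  if (apiKey === "") {
    throw new Error("GOOGLE_GENERATIVE_AI_API_KEY is not set");
  }
  const model = (env["GEMINI_MODEL"] ?? "").trim() || DEFAULT_GEMINI_MODEL;
  return { apiKey, model };
}
```

In a Node.js context this would typically be called as `resolveConfig(process.env)`. Taking the environment as a parameter keeps the function easy to test.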

Run locally

pnpm install
pnpm dev

Open http://localhost:3000.

What this v1 does

  • Invoice-only extraction
  • PDF/image upload through a Next.js route handler
  • Page-image-heavy PDF processing with a PDF-file fallback
  • Strict JSON output using generateText() plus Output.object()
  • shadcn/ui upload and results interface
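The "page images with a PDF-file fallback" idea can be read in more than one way. The sketch below shows one plausible interpretation: send the rendered page images when rendering succeeded, otherwise attach the original PDF bytes. The part shapes and the name `buildPdfContent` are hypothetical, defined locally for illustration; they are not the Vercel AI SDK's actual types.

```typescript
// Hypothetical message-part shapes, defined locally for this sketch only.
type TextPart = { type: "text"; text: string };
type ImagePart = { type: "image"; image: Uint8Array };
type FilePart = { type: "file"; data: Uint8Array; mediaType: "application/pdf" };
type ContentPart = TextPart | ImagePart | FilePart;

// Build the content for one extraction request.
// Prefers rendered page images; falls back to the raw PDF when no images are available.
function buildPdfContent(
  prompt: string,
  pdfBytes: Uint8Array,
  pageImages: Uint8Array[],
): ContentPart[] {
  const parts: ContentPart[] = [{ type: "text", text: prompt }];
  if (pageImages.length > 0) {
    for (const image of pageImages) {
      parts.push({ type: "image", image });
    }
  } else {
    // Assumed fallback path: page rendering produced nothing, so send the PDF itself.
    parts.push({ type: "file", data: pdfBytes, mediaType: "application/pdf" });
  }
  return parts;
}
```

Keeping this decision in a pure function separates the fallback rule from the model call, so it can be checked without network access.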

What this v1 does not do yet

  • Persistent file storage
  • Background job queues
  • Contract extraction
  • Cross-document search or embeddings
