InvoiceFlow

InvoiceFlow is an MVP for automated invoice ingestion and review. Users upload invoice PDFs, the backend extracts structured fields with deterministic parsing (plus OCR fallback), stores results in PostgreSQL, and exposes a polished server-rendered UI for search, review, and manual correction.

Features

PDF invoice upload from web UI
Deterministic parser pipeline:
- Direct PDF text extraction (pdfplumber, pypdf)
- Text normalization
- Regex/heuristic field extraction
- OCR fallback (pdf2image + pytesseract) when text extraction is weak
Extracts these fields when available:
- vendor_name, invoice_number, invoice_date, due_date
- currency, net_amount, tax_amount, total_amount
- iban, vat_id
Stores original file on disk under structured upload folders (uploads/YYYY/MM/DD/...)
Stores parse metadata:
- parse_status, parse_confidence, parsing_notes, raw_text_excerpt, extracted_data
Invoice list with search/filter/sorting (HTMX-powered)
Invoice detail and edit pages (manual correction workflow)
Duplicate flagging based on invoice_number + vendor_name + total_amount
CSV export endpoint
Dashboard statistics cards
Alembic migrations and seed script
Dockerized setup with PostgreSQL
Pytest coverage for parser behavior, invoice creation flow, and basic routes
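
To illustrate the duplicate flagging listed above, here is a minimal sketch of how a comparison key could be built from invoice_number + vendor_name + total_amount. The function names (duplicate_key, flag_duplicates) and the normalization choices (case folding, whitespace collapsing, rounding to two decimals) are assumptions for illustration, not necessarily what InvoiceFlow does internally.

```python
from decimal import Decimal, InvalidOperation


def duplicate_key(invoice_number, vendor_name, total_amount):
    """Build a comparison key; return None if any part is missing or invalid."""
    if not invoice_number or not vendor_name or total_amount is None:
        return None
    try:
        amount = Decimal(str(total_amount))
        if not amount.is_finite():
            return None
        # Assumption: amounts are compared at two decimal places.
        amount = amount.quantize(Decimal("0.01"))
    except InvalidOperation:
        return None
    # Assumption: ignore case and whitespace differences in the text parts.
    number = "".join(invoice_number.split()).upper()
    vendor = " ".join(vendor_name.split()).casefold()
    return (number, vendor, amount)


def flag_duplicates(invoices):
    """Return indices of invoices whose key already appeared earlier in the list."""
    seen = set()
    flagged = []
    for index, invoice in enumerate(invoices):
        key = duplicate_key(
            invoice.get("invoice_number"),
            invoice.get("vendor_name"),
            invoice.get("total_amount"),
        )
        if key is None:
            continue  # incomplete records are never flagged
        if key in seen:
            flagged.append(index)
        else:
            seen.add(key)
    return flagged
```

Records with a missing field are skipped rather than flagged, so partially parsed invoices do not collide with each other on empty values.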

Tech Stack

Python 3.12
FastAPI
SQLAlchemy 2.x
Alembic
PostgreSQL
Jinja2 + HTMX + Tailwind CDN
pdfplumber + pypdf
OCR fallback: pytesseract + pdf2image
Docker + docker-compose
pytest

Project Structure

app/
  core/              # settings + logging
  db/                # base + session factory
  models/            # SQLAlchemy models
  repositories/      # DB data access
  routes/            # web routes
  schemas/           # Pydantic schemas
  services/          # invoice + parsing service logic
  templates/         # Jinja templates
  static/            # CSS
alembic/             # migration environment + versions
scripts/             # seed script
tests/               # pytest suite

Setup (Local)

Create env file:

cp .env.example .env

Install dependencies:

python3.12 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Conda works as well:

conda create -n invoiceflow python=3.12 -y
conda activate invoiceflow
pip install -r requirements.txt

Start PostgreSQL (Docker):

docker compose up -d db

Run migrations:

alembic upgrade head

Run app:

uvicorn app.main:app --reload

Open http://localhost:8000.

Local development defaults:

DATABASE_URL may point to localhost for non-Docker local setups
UPLOADS_DIR should be uploads

Docker/VPS deployments should use the internal service host db and /app/uploads.
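
As a rough sketch of how these two settings could be consumed, the snippet below reads DATABASE_URL and UPLOADS_DIR from the environment and builds a dated upload path in the uploads/YYYY/MM/DD layout. The helper names (load_settings, upload_path) are hypothetical, and placing the original file name directly under the date folder is an assumption; the project's real settings live in app/core and may differ.

```python
import os
from datetime import date
from pathlib import Path


def load_settings(env=None):
    """Read the two settings discussed above from an environment mapping."""
    env = os.environ if env is None else env
    return {
        # No default is assumed for the database URL; None means "not configured".
        "database_url": env.get("DATABASE_URL"),
        # "uploads" is the local development default named in this README.
        "uploads_dir": Path(env.get("UPLOADS_DIR", "uploads")),
    }


def upload_path(uploads_dir, filename, day):
    """Return <uploads_dir>/YYYY/MM/DD/<file name> for the given date."""
    # Keep only the final path component so a crafted name cannot escape the folder.
    safe_name = Path(filename).name
    return Path(uploads_dir) / f"{day:%Y}" / f"{day:%m}" / f"{day:%d}" / safe_name
```

For example, with UPLOADS_DIR=/app/uploads a file uploaded on 2024-03-05 would land in /app/uploads/2024/03/05/ under this scheme.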

Full Docker Run

docker compose up --build

Database Migration Commands

alembic upgrade head
alembic downgrade -1
alembic revision --autogenerate -m "your message"

Seed Data

If you want sample entries without uploading PDFs:

python scripts/seed_data.py

OCR Dependencies

OCR fallback requires system tools:

tesseract-ocr
poppler-utils (for pdf2image)

These are installed in the Dockerfile. For local non-Docker runs, install them via your OS package manager.
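
For local runs, a quick way to confirm both tools are on PATH is sketched below. It looks for the tesseract binary and for pdftoppm (one of the poppler-utils binaries that pdf2image calls). The function name missing_ocr_tools is hypothetical, and the package names are the Debian/Ubuntu ones listed above; other systems may name them differently.

```python
import shutil

# Binary to look up on PATH -> package that provides it.
REQUIRED_TOOLS = {
    "tesseract": "tesseract-ocr",
    "pdftoppm": "poppler-utils",
}


def missing_ocr_tools(which=shutil.which):
    """Return the packages whose binaries cannot be found on PATH."""
    return [
        package
        for binary, package in REQUIRED_TOOLS.items()
        if which(binary) is None
    ]


if __name__ == "__main__":
    missing = missing_ocr_tools()
    if missing:
        print("Missing OCR dependencies:", ", ".join(missing))
    else:
        print("OCR dependencies found.")
```

The which parameter exists only so the lookup can be swapped out in tests.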

Test

pytest

Example UI Screenshot Placeholders

docs/screenshots/dashboard.png
docs/screenshots/upload.png
docs/screenshots/invoice-list.png
docs/screenshots/invoice-detail.png

Notes

Parsing is deterministic and intentionally transparent for maintainability.
You can extend parser rules in app/services/pdf_parser.py for new invoice formats.
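
To give a feel for what a deterministic rule can look like, here is a minimal sketch of text normalization, regex field extraction, and a "weak text" check that would trigger the OCR fallback. The contents of app/services/pdf_parser.py are not shown in this README, so every name here (normalize_text, RULES, parse_amount, extract_fields, needs_ocr_fallback), the patterns, and the 40-character threshold are assumptions for illustration only.

```python
import re
from decimal import Decimal


def normalize_text(text):
    """Collapse runs of spaces, replace non-breaking spaces, drop empty lines."""
    text = text.replace("\u00a0", " ")
    text = re.sub(r"[ \t]+", " ", text)
    return "\n".join(line.strip() for line in text.splitlines() if line.strip())


# One pattern per field; the first capture group is the extracted value.
RULES = {
    "invoice_number": re.compile(
        r"invoice\s*(?:no\.?|number|#)\s*:?\s*([A-Z0-9][A-Z0-9/-]*)", re.IGNORECASE
    ),
    "total_amount": re.compile(
        r"\btotal(?:\s+amount)?\s*:?\s*(?:EUR|USD|€|\$)?\s*([0-9]+(?:[.,][0-9]+)*)",
        re.IGNORECASE,
    ),
}


def parse_amount(raw):
    """Parse '1,190.00' or '1.190,00'; a last separator followed by 1-2 digits is decimal."""
    cut = max(raw.rfind("."), raw.rfind(","))
    if cut != -1 and len(raw) - cut - 1 in (1, 2):
        whole, frac = raw[:cut], raw[cut + 1:]
    else:
        whole, frac = raw, ""
    digits = re.sub(r"[.,]", "", whole)
    return Decimal(f"{digits}.{frac}" if frac else digits)


def extract_fields(raw_text):
    """Apply every rule to the normalized text and return the fields that matched."""
    text = normalize_text(raw_text)
    fields = {}
    for name, pattern in RULES.items():
        match = pattern.search(text)
        if match:
            fields[name] = match.group(1)
    if "total_amount" in fields:
        fields["total_amount"] = parse_amount(fields["total_amount"])
    return fields


def needs_ocr_fallback(raw_text, min_chars=40):
    """Treat extraction as weak when few alphanumeric characters were recovered."""
    return sum(ch.isalnum() for ch in raw_text) < min_chars
```

Supporting a new invoice format in this style would mean adding or adjusting an entry in RULES, which keeps each extraction decision easy to trace back to a single pattern.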
