AI agent that resolves invoice exceptions for accounts payable teams. Upload an invoice (image or text) → the agent extracts structured data → matches against purchase orders → detects exceptions → verifies the vendor via web search → reasons through company policy → delivers a decision with full audit trail.
Live demo: https://genuine-dream-production-1287.up.railway.app
- Node.js 22+
- An OpenRouter API key (for the LLM — kimi-k2.5)
- A Tavily API key (for real-time vendor web search)
cd "tavily-ibm hack/app"
npm install
# Both keys live on the server — they never reach the browser.
export OPENROUTER_API_KEY=sk-or-v1-...
export TAVILY_API_KEY=tvly-...
npm run dev

Open http://localhost:5173. Pick a sample invoice from the sidebar or upload an image, then click Run agent.
docker build -t ap-exception .
docker run -p 3000:3000 \
-e OPENROUTER_API_KEY=sk-or-v1-... \
-e TAVILY_API_KEY=tvly-... \
ap-exception

Or deploy to Railway with one command (requires railway CLI and a project token):
RAILWAY_TOKEN=... railway up --service <service-name> --detach

Set OPENROUTER_API_KEY and TAVILY_API_KEY as Railway service variables.
| Variable | Required | Description |
|---|---|---|
| OPENROUTER_API_KEY | Yes | OpenRouter API key for LLM calls (kimi-k2.5) |
| TAVILY_API_KEY | Recommended | Tavily API key for real-time vendor web search. Without it, vendor lookups fall back to mock data. |
| PORT | No | Server port (default 3000 in prod, 5173 in dev) |
| OPENROUTER_URL | No | Override the upstream LLM endpoint (for testing with a fake server) |
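
The variables above could be read into a typed config object along these lines. This is a minimal sketch, not the project's actual loader: `ServerConfig`, `loadConfig`, and `DEFAULT_OPENROUTER_URL` are hypothetical names, and the default upstream URL is an assumption (it is OpenRouter's public chat completions endpoint, but the README does not state which default the server uses).

```typescript
// Hypothetical config shape; field names are assumptions, not the project's.
interface ServerConfig {
  openRouterKey: string | null; // null: LLM routes cannot authenticate upstream
  tavilyKey: string | null;     // null: vendor lookups fall back to mock data
  port: number;
  openRouterUrl: string;
}

type Env = Record<string, string | undefined>;

// Assumed default; the project's actual default endpoint is not documented here.
const DEFAULT_OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions";

function loadConfig(env: Env, isProd: boolean): ServerConfig {
  const rawPort = env["PORT"];
  // Documented defaults: 3000 in prod, 5173 in dev.
  const port = rawPort ? Number(rawPort) : isProd ? 3000 : 5173;
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${rawPort}`);
  }
  return {
    openRouterKey: env["OPENROUTER_API_KEY"] || null,
    tavilyKey: env["TAVILY_API_KEY"] || null,
    port,
    openRouterUrl: env["OPENROUTER_URL"] || DEFAULT_OPENROUTER_URL,
  };
}
```

On a Node server this might be called as `loadConfig(process.env, isProd)`, where how `isProd` is determined is left open. Keys are kept nullable rather than throwing at startup, since a missing key is reported per request rather than at boot.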
The default model is moonshotai/kimi-k2.5 via OpenRouter. It handles both
text extraction and multimodal image extraction natively. To override:
// browser console
localStorage.setItem("openrouter_model", "some/other-model");

                                  ┌──────────────┐
    Invoice (image or text) ────▶ │  1. Extract  │  LLM reads the invoice
                                  │ (kimi-k2.5)  │  (multimodal for images)
                                  └──────┬───────┘
                                         ▼
                                  ┌──────────────┐
                                  │  2. Match    │  Deterministic PO match
                                  │   (rules)    │  + exception detection
                                  └──────┬───────┘
                                         ▼
                                  ┌──────────────┐
                           (opt.) │  3. Judge    │  LLM reviews exceptions
                                  │  (custom     │  against custom rules
                                  │   rules)     │  and may dismiss them
                                  └──────┬───────┘
                                         ▼
                         ┌───────────────┬───────────────┐
                         ▼               ▼               │
                  ┌─────────────┐ ┌─────────────┐        │
                  │ 4a. Search  │ │ 4b. Memory  │        │
                  │  (Tavily)   │ │ (similarity │        │
                  │ vendor news,│ │  search)    │        │
                  │ prices,     │ └─────────────┘        │
                  │ legitimacy  │                        │
                  └──────┬──────┘                        │
                         └───────────────┬───────────────┘
                                         ▼
                                  ┌──────────────┐
                                  │  5. Reason   │  LLM streams chain-of-
                                  │  (policy +   │  thought → structured
                                  │   context)   │  decision + citations
                                  └──────┬───────┘
                                         ▼
                                  ┌──────────────┐
                                  │   Decision   │  AUTO_APPROVE,
                                  │   + audit    │  AUTO_RESOLVE, ESCALATE,
                                  │   trail      │  FLAG_FOR_REVIEW, REJECT
                                  └──────────────┘

If there are no exceptions after the match step (or after the judge
dismisses them all), the pipeline short-circuits to AUTO_APPROVE without
calling the reasoning LLM — fast and cheap.
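
The short-circuit described above can be sketched as follows. This is an illustration under stated assumptions, not the code in src/lib/agent.ts: `decide`, `Judge`, `Reasoner`, and `PipelineException` are hypothetical names, and the judge and reason steps are modeled as synchronous callbacks for brevity, whereas the real steps are LLM calls and would be asynchronous.

```typescript
type Resolution =
  | "AUTO_APPROVE"
  | "AUTO_RESOLVE"
  | "ESCALATE"
  | "FLAG_FOR_REVIEW"
  | "REJECT";

// Hypothetical exception shape for illustration only.
interface PipelineException {
  type: string;
  detail: string;
}

interface Decision {
  resolution: Resolution;
  usedReasoningLlm: boolean;
}

// Stand-ins for the optional judge step and the reasoning step.
type Judge = (exceptions: PipelineException[]) => PipelineException[];
type Reasoner = (exceptions: PipelineException[]) => Resolution;

function decide(
  exceptions: PipelineException[],
  judge: Judge | null,
  reason: Reasoner,
): Decision {
  // The judge only runs when it is enabled and there is something to review.
  const remaining =
    judge !== null && exceptions.length > 0 ? judge(exceptions) : exceptions;

  // Nothing left to resolve: approve without calling the reasoning LLM.
  if (remaining.length === 0) {
    return { resolution: "AUTO_APPROVE", usedReasoningLlm: false };
  }
  return { resolution: reason(remaining), usedReasoningLlm: true };
}
```

The `usedReasoningLlm` flag exists only to make the skipped call visible in this sketch; an audit trail entry would serve the same purpose.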
Four pre-loaded sample invoices in the sidebar, each triggering a different path:
| Sample | Resolution | Why |
|---|---|---|
| Acme Industrial — INV-0923 | AUTO_APPROVE | Tax rounding ($0.85), within 2% tolerance |
| Bluepeak Components — INV-4471 | AUTO_RESOLVE | +5% price increase, vendor web search finds published surcharge notice |
| GhostVendor LLC — INV-889 | ESCALATE | Unknown vendor, no PO, shell-company pattern flagged by web search |
| Northstar Logistics — INV-7712 | ESCALATE | Quantity overage: 6 billed vs 4 authorized |
You can also upload a real invoice image (PNG, JPEG, PDF) — kimi-k2.5 reads it natively via multimodal input and extracts the fields.
Open Exception Rules in the sidebar to customize:
- Amount tolerance — percentage variance before flagging (default 2%)
- Quantity tolerance — unit variance before flagging (default 0)
- Approved vendor list — one name per line
- Company policy — free text passed verbatim to the reasoning LLM
- LLM-as-judge — toggle on to let the LLM dismiss triggered exceptions based on custom rules you define in plain English
- Custom rules — natural language, e.g. "Ignore shipping surcharges under $75 from approved vendors."
Rules persist to localStorage. Load defaults and Reset both restore the baseline rules.
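
To make the tolerance settings concrete, here is a minimal sketch of how such checks might be evaluated. The `Rules` fields and function names are assumptions and may not match src/lib/rules.ts or src/lib/matching.ts; the sketch also assumes that quantity variance is measured in both directions and that vendor names are compared case-insensitively.

```typescript
// Hypothetical rules shape; the real type may use different field names.
interface Rules {
  amountTolerancePct: number; // percentage variance allowed before flagging
  quantityTolerance: number;  // unit variance allowed before flagging
  approvedVendors: string[];  // one name per entry
}

// Defaults as described above: 2% amount tolerance, 0 units.
const DEFAULT_RULES: Rules = {
  amountTolerancePct: 2,
  quantityTolerance: 0,
  approvedVendors: [],
};

function amountWithinTolerance(
  invoiced: number,
  expected: number,
  rules: Rules,
): boolean {
  // Avoid dividing by zero: a zero expectation only matches a zero invoice.
  if (expected === 0) return invoiced === 0;
  const variancePct =
    (Math.abs(invoiced - expected) / Math.abs(expected)) * 100;
  return variancePct <= rules.amountTolerancePct;
}

function quantityWithinTolerance(
  billed: number,
  authorized: number,
  rules: Rules,
): boolean {
  // Assumption: under-billing counts as variance too.
  return Math.abs(billed - authorized) <= rules.quantityTolerance;
}

function isApprovedVendor(name: string, rules: Rules): boolean {
  const needle = name.trim().toLowerCase();
  return rules.approvedVendors.some(
    (vendor) => vendor.trim().toLowerCase() === needle,
  );
}
```

With the defaults, a $0.85 difference on a hypothetical $1,000.00 order passes the amount check, a 5% increase does not, and 6 units billed against 4 authorized fails the quantity check.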
npm test

25 tests, three suites, Node-native test runner (no extra dependencies):
| Suite | Tests | What it covers |
|---|---|---|
| src/lib/__tests__/matching.test.ts | 10 | PO matching, every exception type, custom tolerance/vendor rules |
| src/lib/__tests__/agent.integration.test.ts | 8 | Full runAgent() pipeline with mocked fetch: all 4 demo paths, custom rules, LLM-as-judge on/off, multimodal image input |
| server/__tests__/proxy.test.ts | 7 | Proxy: missing key → 500, bearer auth, body forwarding, stream:true injection, error propagation |
All API endpoints are served by the same process (Vite middleware in dev, standalone Node server in prod).
| Endpoint | Method | Description |
|---|---|---|
| /api/health | GET | Returns {ok, tavily, multimodal}, indicating whether keys are configured |
| /api/chat | POST | Non-streaming LLM completion (forwarded to OpenRouter) |
| /api/chat-stream | POST | Streaming LLM completion (SSE piped from OpenRouter) |
| /api/search-vendor | POST | {vendorName} → Tavily web search → VendorContext |

The OpenRouter key and Tavily key never leave the server.
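
As an illustration of calling one of these endpoints from the browser, the sketch below builds the request for /api/search-vendor. Only the path, the POST method, and the `{vendorName}` body come from the table above; `buildSearchVendorRequest`, `ApiRequest`, the JSON content type, and the empty-name check are assumptions.

```typescript
// Hypothetical request description that can be passed to fetch(url, init).
interface ApiRequest {
  url: string;
  init: {
    method: "POST";
    headers: Record<string, string>;
    body: string;
  };
}

function buildSearchVendorRequest(
  vendorName: string,
  baseUrl: string = "",
): ApiRequest {
  const trimmed = vendorName.trim();
  if (trimmed.length === 0) {
    throw new Error("vendorName must not be empty");
  }
  return {
    // Same-origin by default, so no API key is needed in the browser.
    url: `${baseUrl}/api/search-vendor`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" }, // assumed content type
      body: JSON.stringify({ vendorName: trimmed }),
    },
  };
}

// Possible usage in the browser:
//   const { url, init } = buildSearchVendorRequest("Bluepeak Components");
//   const vendorContext = await (await fetch(url, init)).json();
```

Because the request targets the app's own origin, the Tavily key stays on the server and the browser only ever sees the resulting VendorContext.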
| File | Purpose |
|---|---|
| Frontend | |
| src/App.tsx | Main dashboard (upload, extract, reasoning stream, decision, audit, rules editor) |
| src/Deck.tsx | Business case presentation at /deck |
| src/types.ts | Core TypeScript types |
| src/data/purchaseOrders.ts | Mock PO database (6 POs) |
| src/data/sampleInvoices.ts | 4 pre-loaded demo invoices |
| src/lib/rules.ts | Configurable rules type, defaults, localStorage persistence |
| src/lib/openrouter.ts | Browser client → /api/chat, /api/chat-stream |
| src/lib/matching.ts | Deterministic PO match + exception rules (parameterized by Rules) |
| src/lib/abstractions.ts | Vendor search (→ Tavily via backend), similarity search, agent plan |
| src/lib/agent.ts | Agent orchestration (extract → match → judge → gather → reason) |
| Backend | |
| server/proxy.ts | OpenRouter proxy (forwardChat, forwardChatStream, getHealth) |
| server/tavily.ts | Tavily web search integration with fallback |
| server/vitePlugin.ts | Vite middleware mounting all /api/* routes in dev |
| server/standalone.ts | Production Node HTTP server (serves dist + API routes) |
| Infra | |
| Dockerfile | Multi-stage build: npm ci → tsc+vite build → slim runtime |
| railway.toml | Railway deployment config |

- Frontend: React 19, TypeScript, Vite 8
- LLM: kimi-k2.5 via OpenRouter (multimodal — reads invoice images natively)
- Web search: Tavily API (real-time vendor verification, price changes, fraud signals)
- Similarity search: In-memory TF-IDF cosine similarity (demonstrates the pattern Redis vector search would power at scale)
- Backend: Node 22 native HTTP server, zero runtime dependencies beyond React
- Deploy: Docker, Railway
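
The in-memory TF-IDF cosine similarity mentioned in the list above can be sketched as follows. This is a generic illustration of the technique, not the code in src/lib/abstractions.ts: the function names, the tokenizer, and the smoothed IDF formula are all assumptions.

```typescript
type Vector = Map<string, number>;

// Lowercase alphanumeric tokens; a deliberately simple tokenizer.
function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

// Smoothed inverse document frequency: ln((1 + N) / (1 + df)) + 1.
function buildIdf(docs: string[][]): Map<string, number> {
  const df = new Map<string, number>();
  docs.forEach((tokens) => {
    new Set(tokens).forEach((term) => df.set(term, (df.get(term) ?? 0) + 1));
  });
  const idf = new Map<string, number>();
  df.forEach((count, term) => {
    idf.set(term, Math.log((1 + docs.length) / (1 + count)) + 1);
  });
  return idf;
}

// Weight = raw term count * IDF; terms outside the corpus are ignored.
function tfidf(tokens: string[], idf: Map<string, number>): Vector {
  const vec: Vector = new Map();
  tokens.forEach((term) => {
    const weight = idf.get(term);
    if (weight !== undefined) vec.set(term, (vec.get(term) ?? 0) + weight);
  });
  return vec;
}

function cosine(a: Vector, b: Vector): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((w, term) => {
    normA += w * w;
    const other = b.get(term);
    if (other !== undefined) dot += w * other;
  });
  b.forEach((w) => {
    normB += w * w;
  });
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Returns the best-matching document, or null when nothing overlaps.
function mostSimilar(
  query: string,
  corpus: string[],
): { index: number; score: number } | null {
  const docs = corpus.map(tokenize);
  const idf = buildIdf(docs);
  const queryVec = tfidf(tokenize(query), idf);
  let best: { index: number; score: number } | null = null;
  for (let i = 0; i < docs.length; i++) {
    const score = cosine(queryVec, tfidf(docs[i], idf));
    if (score > 0 && (best === null || score > best.score)) {
      best = { index: i, score };
    }
  }
  return best;
}
```

Everything is recomputed per query, which is fine for a handful of past cases held in memory; a vector store would replace `mostSimilar` once the corpus grows.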