A REST API that generates professional PDF business documents (invoices and receipts) from structured JSON data. Built with Express, Handlebars templates, Puppeteer for PDF rendering, and an optional LLM layer for intelligent field mapping.
- Generate invoice and receipt PDFs from JSON data
- Handlebars-based templates — easily extendable
- Optional DRAFT watermark overlay
- LLM-powered field remapping (handles mismatched or informal field names)
- PDF scanning: upload an existing PDF to extract structured field data
- Zod schema validation on all inputs
- Graceful shutdown with Puppeteer browser cleanup
https://docgen-production-503d.up.railway.app
Health check: GET https://docgen-production-503d.up.railway.app/health
- Node.js >= 20
- npm >= 10
- An OpenRouter API key (only required for LLM features)
# Clone the repository
git clone https://github.com/your-username/DocGen.git
cd DocGen
# Install dependencies
npm install
# Configure environment
cp .env.example .env
# Edit .env and set OPENROUTER_API_KEY
# Start development server
npm run dev
The server starts at http://localhost:3000.
| Variable | Required | Default | Description |
|---|---|---|---|
| `PORT` | No | 3000 | HTTP port |
| `OPENROUTER_API_KEY` | For LLM features | (none) | OpenRouter API key |
| `MODEL` | No | stepfun/step-3.5-flash:free | OpenRouter model ID |
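The defaults in the table above can be sketched as a small config loader. This is an illustrative sketch, not the project's actual startup code: `loadConfig`, `DocGenConfig`, and the `llmEnabled` flag are hypothetical names, and the only facts taken from the table are the three variable names and their defaults.

```typescript
// Hypothetical config shape; only PORT, OPENROUTER_API_KEY and MODEL come from the table.
interface DocGenConfig {
  port: number;
  openRouterApiKey?: string;
  model: string;
  llmEnabled: boolean; // assumption: LLM features are usable only when a key is set
}

// Takes the environment as a plain object so it can be tested without touching process.env.
function loadConfig(env: Record<string, string | undefined>): DocGenConfig {
  const port = Number(env.PORT || "3000");
  if (!Number.isInteger(port) || port <= 0 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }
  const openRouterApiKey = env.OPENROUTER_API_KEY || undefined;
  return {
    port,
    openRouterApiKey,
    model: env.MODEL || "stepfun/step-3.5-flash:free",
    llmEnabled: Boolean(openRouterApiKey),
  };
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)` once at startup.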
GET /health
{ "status": "ok", "service": "DocGen Engine", "version": "1.0.0", "timestamp": "..." }
GET /api/v1/templates
{ "success": true, "templates": ["invoice", "receipt"] }
GET /api/v1/fields/:templateId
Returns the field schema for a given template, including required/optional status and types.
POST /api/v1/generate
Content-Type: application/json
Request body:
{
  "templateId": "invoice",
  "data": {
    "invoice_number": "INV-2026-001",
    "issue_date": "2026-03-05",
    "due_date": "2026-04-05",
    "client_name": "PT Contoh Klien",
    "items": [
      { "description": "Jasa Konsultasi", "qty": 10, "unit_price": 500000 }
    ],
    "subtotal": 5000000,
    "tax": 550000,
    "total": 5550000
  },
  "options": {
    "useLLM": false,
    "outputFormat": "pdf",
    "watermark": { "enabled": false }
  }
}
Response: Raw binary PDF (Content-Type: application/pdf).
Note: Totals are not auto-calculated. Supply `subtotal` and `total` yourself.
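Because the API does not calculate totals, a client has to compute them before calling the endpoint. The following is a minimal client-side sketch under stated assumptions: `computeTotals`, `buildGenerateBody`, and `generateInvoicePdf` are hypothetical helper names, tax is assumed to be a simple rate applied to the subtotal and rounded to a whole unit, and the global `fetch` of Node.js 20 is used.

```typescript
interface LineItem {
  description: string;
  qty: number;
  unit_price: number;
}

// Assumption: tax = subtotal * taxRate, rounded to a whole currency unit.
function computeTotals(items: LineItem[], taxRate = 0) {
  const subtotal = items.reduce((sum, item) => sum + item.qty * item.unit_price, 0);
  const tax = Math.round(subtotal * taxRate);
  return { subtotal, tax, total: subtotal + tax };
}

// Builds the request body in the shape shown in the request example above.
function buildGenerateBody(data: Record<string, unknown>) {
  return {
    templateId: "invoice",
    data,
    options: { useLLM: false, outputFormat: "pdf", watermark: { enabled: false } },
  };
}

// Posts the body and returns the raw PDF bytes from the binary response.
async function generateInvoicePdf(
  baseUrl: string,
  data: Record<string, unknown>,
): Promise<Uint8Array> {
  const response = await fetch(new URL("/api/v1/generate", baseUrl), {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildGenerateBody(data)),
  });
  if (!response.ok) {
    throw new Error(`Generate failed with HTTP ${response.status}`);
  }
  return new Uint8Array(await response.arrayBuffer());
}
```

With the line item from the request example and a tax rate of 0.11, `computeTotals` yields the same `subtotal`, `tax`, and `total` values as the example body; the results can be spread into `data` before calling `generateInvoicePdf("http://localhost:3000", data)`.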
POST /api/v1/analyze
Content-Type: application/json
Remaps fields using LLM and validates data without generating a PDF. Useful when field names from the user don't match the template schema.
{ "templateId": "invoice", "data": { "customer": "PT Klien", "amount": 5000000 } }
POST /api/v1/scan
Content-Type: multipart/form-data
Upload an existing PDF to extract structured field data. Query param ?templateId=invoice is optional.
- Field name: `file`
- Max size: 10 MB
- File type: PDF only
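The upload constraints above can be sketched as a small client helper. This is a sketch under assumptions rather than a documented client: `buildScanRequest` and `scanPdf` are hypothetical names, 10 MB is assumed to mean 10 * 1024 * 1024 bytes, the response is assumed to be JSON, and the global `FormData`, `Blob`, and `fetch` of Node.js 20 are used.

```typescript
// Assumption: the server's 10 MB limit is counted in binary megabytes.
const MAX_UPLOAD_BYTES = 10 * 1024 * 1024;

// Builds the URL and multipart form for POST /api/v1/scan.
function buildScanRequest(
  baseUrl: string,
  pdf: Blob,
  templateId?: string,
): { url: string; form: FormData } {
  if (pdf.size > MAX_UPLOAD_BYTES) {
    throw new Error("PDF exceeds the 10 MB upload limit");
  }
  const url = new URL("/api/v1/scan", baseUrl);
  if (templateId) {
    url.searchParams.set("templateId", templateId); // optional query param
  }
  const form = new FormData();
  form.append("file", pdf, "document.pdf"); // the field name must be "file"
  return { url: url.toString(), form };
}

// Sends the upload. fetch sets the multipart Content-Type header (with boundary) itself.
async function scanPdf(baseUrl: string, pdf: Blob, templateId?: string): Promise<unknown> {
  const { url, form } = buildScanRequest(baseUrl, pdf, templateId);
  const response = await fetch(url, { method: "POST", body: form });
  if (!response.ok) {
    throw new Error(`Scan failed with HTTP ${response.status}`);
  }
  return response.json(); // assumption: extracted fields are returned as JSON
}
```

The caller is expected to supply the PDF as a `Blob`, for example one created from a file's bytes with the `application/pdf` type.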
| Field | Type | Required |
|---|---|---|
| `invoice_number` | string | Yes |
| `issue_date` | string (date) | Yes |
| `due_date` | string (date) | Yes |
| `client_name` | string | Yes |
| `items` | LineItem[] | Yes |
| `subtotal` | number | Yes |
| `total` | number | Yes |
| `client_address` | string | No |
| `company_name` | string | No |
| `company_address` | string | No |
| `company_email` | string | No |
| `tax` | number | No |
| `tax_rate` | number | No |
| `discount` | number | No |
| `notes` | string | No |
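The invoice field table above can be mirrored on the client as a type plus a pre-flight check for required fields. This is only an illustrative sketch: the service itself validates with Zod, and `InvoiceData`, `REQUIRED_INVOICE_FIELDS`, and `findMissingInvoiceFields` are hypothetical names introduced here.

```typescript
// Client-side mirror of the invoice field table; dates are plain strings such as "2026-03-05".
interface InvoiceData {
  invoice_number: string;
  issue_date: string;
  due_date: string;
  client_name: string;
  items: Array<{ description: string; qty: number; unit_price: number }>;
  subtotal: number;
  total: number;
  client_address?: string;
  company_name?: string;
  company_address?: string;
  company_email?: string;
  tax?: number;
  tax_rate?: number;
  discount?: number;
  notes?: string;
}

// The fields marked "Yes" in the Required column.
const REQUIRED_INVOICE_FIELDS = [
  "invoice_number",
  "issue_date",
  "due_date",
  "client_name",
  "items",
  "subtotal",
  "total",
] as const;

// Returns the required fields that are absent, null, or empty strings.
function findMissingInvoiceFields(data: Record<string, unknown>): string[] {
  return REQUIRED_INVOICE_FIELDS.filter((field) => {
    const value = data[field];
    return value === undefined || value === null || value === "";
  });
}
```

Running such a check before sending a request lets a client report missing fields without a round trip; the server's own validation remains authoritative.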
| Field | Type | Required |
|---|---|---|
| `receipt_number` | string | Yes |
| `receipt_date` | string (date) | Yes |
| `payer_name` | string | Yes |
| `items` | LineItem[] | Yes |
| `subtotal` | number | Yes |
| `total` | number | Yes |
| `payer_address` | string | No |
| `payment_method` | string | No |
| `company_name` | string | No |
| `notes` | string | No |
{ "description": "string", "qty": 1, "unit_price": 500000 }
| Status | Meaning |
|---|---|
| 400 | Validation error — check details in response body |
| 404 | Template not found |
| 422 | Missing required fields — listed in missingFields |
| 500 | Internal server error |
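A client can turn the status codes above into readable messages. The sketch below assumes the error body is JSON and carries the `details` and `missingFields` properties mentioned in the table; `ApiErrorBody` and `describeApiError` are hypothetical names, and any other body fields are unknown.

```typescript
// Assumed error body shape: only `details` and `missingFields` are named in the table.
interface ApiErrorBody {
  details?: unknown;
  missingFields?: string[];
}

// Maps a documented status code to a human-readable message.
function describeApiError(status: number, body: ApiErrorBody = {}): string {
  switch (status) {
    case 400:
      return `Validation error: ${JSON.stringify(body.details ?? "no details provided")}`;
    case 404:
      return "Template not found";
    case 422:
      return `Missing required fields: ${(body.missingFields ?? []).join(", ") || "(not listed)"}`;
    case 500:
      return "Internal server error";
    default:
      return `Unexpected HTTP status ${status}`;
  }
}
```

A caller would typically parse the response body when `response.ok` is false and pass it in together with `response.status`.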
| Command | Description |
|---|---|
| `npm run dev` | Start dev server with hot reload (ts-node-dev) |
| `npm run build` | Compile TypeScript to dist/ |
| `npm start` | Run compiled output (node dist/app.js) |
src/
├── app.ts # Express entry point
├── controllers/ # Request handlers
├── engine/
│ ├── TemplateLoader.ts # Loads .hbs templates and schema.json
│ ├── FieldScanner.ts # Reads field definitions from schema
│ ├── FieldMapper.ts # Maps incoming data to template fields
│ ├── PDFRenderer.ts # Puppeteer-based HTML-to-PDF rendering
│ ├── Watermarker.ts # pdf-lib watermark overlay
│ ├── PDFScanner.ts # pdf-parse text extraction
│ └── DataExtractor.ts # Regex-based field extraction from PDF text
├── llm/
│ ├── LLMFieldAnalyzer.ts # OpenRouter field remapping
│ └── LLMValidator.ts # Semantic validation
├── middleware/
│ └── upload.ts # Multer memory storage for PDF uploads
├── routes/
│ └── document.routes.ts # Route definitions
├── templates/
│ ├── invoice/ # template.hbs + schema.json
│ └── receipt/ # template.hbs + schema.json
└── utils/
    └── validation.ts # Zod schemas
This project is configured for deployment on Railway. See Dockerfile for container setup.
Puppeteer requires Chromium system dependencies — these are installed in the Dockerfile. The --no-sandbox flag is set in PDFRenderer.ts for container compatibility.
A skill for AI agents is available in the skill/ directory. It enables agents to call this API to generate documents on behalf of users via natural language.
See skill/README.md for installation instructions.
ISC