Deployment
Three viable paths. Pick the one that matches your team's ops appetite.
Vercel is the easiest path. It deploys on every push, and the free tier covers most personal projects.
- Push your fork to GitHub
- Go to vercel.com/new, import the repo
- Vercel detects Next.js, no config needed
- In environment variables, add `OPENAI_API_KEY`
- Click Deploy
Live in 90 seconds. Subsequent pushes auto-deploy.
To use a custom domain, open Vercel project → Settings → Domains. Add yours, point DNS, done.
| Limit | Free | Pro ($20/mo) |
|---|---|---|
| Function invocations | 100k/mo | 1M/mo |
| Function memory | 1024MB | 3008MB |
| Function timeout | 10s default, 60s max | up to 300s |
| Bandwidth | 100GB | 1TB |
The 60-second function timeout is set in `app/api/upload/route.ts` (`export const maxDuration = 60`). This is enough for PDFs up to ~200 pages with `text-embedding-3-small`. Above that, switch to background processing or use a deployment target with a longer timeout.
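A minimal sketch of how an upload handler could decide between inline and background processing. It assumes the ~200-page figure above is used as a hard threshold, which is a rough guideline rather than a measured limit. `MAX_INLINE_PAGES`, `UploadPlan`, and `planUpload` are hypothetical names, not part of the project.

```typescript
// Mirrors `export const maxDuration = 60` in app/api/upload/route.ts.
const MAX_DURATION_SECONDS = 60;

// Hypothetical threshold taken from the ~200-page guidance; tune it for your PDFs.
const MAX_INLINE_PAGES = 200;

interface UploadPlan {
  mode: "inline" | "background";
  timeoutSeconds: number;
}

// Decide whether a PDF is small enough to index inside the request.
function planUpload(pageCount: number): UploadPlan {
  if (!Number.isInteger(pageCount) || pageCount < 0) {
    throw new RangeError(`invalid page count: ${pageCount}`);
  }
  const mode = pageCount <= MAX_INLINE_PAGES ? "inline" : "background";
  return { mode, timeoutSeconds: MAX_DURATION_SECONDS };
}
```

Calling `planUpload(350)` returns a plan with `mode: "background"`, which the route could use to enqueue the work instead of risking a timeout mid-index.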
Self-hosting with Docker gives you more control, predictable cost, and no platform lock-in.
```dockerfile
# Stage 1: install dependencies from the lockfile
FROM node:24-alpine AS deps
WORKDIR /app
COPY package*.json pnpm-lock.yaml* ./
RUN corepack enable && pnpm install --frozen-lockfile

# Stage 2: build the Next.js app
FROM node:24-alpine AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN corepack enable && pnpm build

# Stage 3: runtime image containing only the standalone output
FROM node:24-alpine AS runner
WORKDIR /app
ENV NODE_ENV=production
COPY --from=builder /app/.next/standalone ./
COPY --from=builder /app/.next/static ./.next/static
COPY --from=builder /app/public ./public
EXPOSE 3000
CMD ["node", "server.js"]
```

Add `output: 'standalone'` to `next.config.mjs` to make the standalone build available.
```bash
docker build -t rag-over-pdf .
docker run -d \
  -e OPENAI_API_KEY=sk-... \
  -p 3000:3000 \
  --restart unless-stopped \
  rag-over-pdf
```

Hosting options:

- Hetzner CCX13 (€13/mo, 2 vCPU, 8GB): fits this app comfortably with room for pgvector
- Fly.io (~$5/mo at low scale): global edge, easy deploy
- Render ($7/mo starter): Heroku-feel, simple
Front it with Caddy or Cloudflare for HTTPS.
When you outgrow in-memory storage, you need Postgres + pgvector running too.
```mermaid
graph LR
  CDN[Cloudflare] --> NX[Caddy]
  NX --> APP[rag-over-pdf<br/>Node container]
  APP --> PG[(Postgres 16<br/>pgvector)]
  APP --> OAI[OpenAI API]
  classDef ext fill:#a78bfa,stroke:#a78bfa,color:#fff
  class OAI ext
```
```yaml
services:
  app:
    build: .
    environment:
      OPENAI_API_KEY: ${OPENAI_API_KEY}
      DATABASE_URL: postgresql://rag:${PG_PASS}@db:5432/rag
    depends_on: [db]
    restart: unless-stopped
    ports: ["3000:3000"]
  db:
    image: pgvector/pgvector:pg16
    environment:
      POSTGRES_USER: rag
      POSTGRES_PASSWORD: ${PG_PASS}
      POSTGRES_DB: rag
    volumes: [pgdata:/var/lib/postgresql/data]
    restart: unless-stopped
volumes:
  pgdata:
```

Add the table schema from "Swap to pgvector" once Postgres is up.
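One thing to watch with the `DATABASE_URL` above: `PG_PASS` is interpolated into the URL as-is, so a password containing characters such as `@`, `/`, or `:` produces a malformed connection string. A minimal sketch of building the URL with percent-encoding, in case you assemble it in code instead. `buildDatabaseUrl` is a hypothetical helper, not something the project ships.

```typescript
// Hypothetical helper: assemble a Postgres connection URL with the
// user, password, and database name percent-encoded.
function buildDatabaseUrl(
  user: string,
  password: string,
  host: string,
  database: string,
  port: number = 5432,
): string {
  const credentials = `${encodeURIComponent(user)}:${encodeURIComponent(password)}`;
  return `postgresql://${credentials}@${host}:${port}/${encodeURIComponent(database)}`;
}

// A password with reserved characters stays inside the userinfo part.
const exampleUrl = buildDatabaseUrl("rag", "p@ss/word", "db", "rag");
```

The simpler alternative is to generate a `PG_PASS` made only of letters and digits, in which case the compose file works unchanged.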
Before deploying, verify all of these:
- `OPENAI_API_KEY` set
- `EMBEDDING_MODEL` if overriding default
- `CHAT_MODEL` if overriding default
- `DATABASE_URL` if using pgvector
- Function timeout extended if you'll index large PDFs
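The environment part of this checklist can be automated at startup. A minimal sketch, assuming the four variable names listed above. `checkEnv`, `EnvReport`, and the `usePgvector` flag are hypothetical; the project may decide this differently.

```typescript
type Env = Record<string, string | undefined>;

interface EnvReport {
  missing: string[];   // required variables that are unset or blank
  overrides: string[]; // optional variables that are set
}

// Report which required variables are missing and which defaults are overridden.
function checkEnv(env: Env, usePgvector: boolean): EnvReport {
  const isSet = (name: string): boolean => (env[name] ?? "").trim() !== "";
  const required = ["OPENAI_API_KEY", ...(usePgvector ? ["DATABASE_URL"] : [])];
  const optional = ["EMBEDDING_MODEL", "CHAT_MODEL"];
  return {
    missing: required.filter((name) => !isSet(name)),
    overrides: optional.filter((name) => isSet(name)),
  };
}
```

In a Node process you would pass `process.env` as the first argument and refuse to start when `missing` is non-empty.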
```bash
# Should return 200
curl -I https://your-domain.com

# Upload a PDF (returns a docId, page count, and the document list)
curl -F "file=@test.pdf" https://your-domain.com/api/upload

# List indexed documents
curl https://your-domain.com/api/upload

# Ask a question (response is an NDJSON stream: a citations event, then token
# events, then a done event)
curl -N -X POST https://your-domain.com/api/chat \
  -H "Content-Type: application/json" \
  -d '{"question": "what is this about?"}'
```

Vercel: built-in analytics + function logs.
Self-host: pipe stdout to a log aggregator. Logtail, Axiom, or journalctl if you're old-school.
Track:
- Upload error rate (PDFs that fail to parse)
- Average chunks per upload
- Average time-to-first-token on chat
- OpenAI rate-limit responses
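A minimal sketch of how these four numbers could be derived from structured log records. The record shapes (`UploadRecord`, `ChatRecord`) and their field names are assumptions for illustration; the app's actual log format may differ.

```typescript
// Hypothetical log record shapes.
interface UploadRecord {
  ok: boolean;    // false when the PDF failed to parse
  chunks: number; // chunks indexed (0 on failure)
}

interface ChatRecord {
  requestedAtMs: number;
  firstTokenAtMs: number | null; // null when no token was streamed
  rateLimited: boolean;          // true when OpenAI returned a rate-limit response
}

interface Summary {
  uploadErrorRate: number;
  avgChunksPerUpload: number;
  avgTimeToFirstTokenMs: number;
  rateLimitResponses: number;
}

// Arithmetic mean; returns 0 for an empty list to avoid NaN.
function mean(values: number[]): number {
  if (values.length === 0) return 0;
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

function summarize(uploads: UploadRecord[], chats: ChatRecord[]): Summary {
  const parsed = uploads.filter((u) => u.ok);
  const ttfts: number[] = [];
  for (const c of chats) {
    if (c.firstTokenAtMs !== null) ttfts.push(c.firstTokenAtMs - c.requestedAtMs);
  }
  return {
    uploadErrorRate: uploads.length === 0 ? 0 : (uploads.length - parsed.length) / uploads.length,
    avgChunksPerUpload: mean(parsed.map((u) => u.chunks)), // successful uploads only
    avgTimeToFirstTokenMs: mean(ttfts),
    rateLimitResponses: chats.filter((c) => c.rateLimited).length,
  };
}
```

Averaging chunks over successful uploads only keeps parse failures from dragging the figure toward zero; count them in the error rate instead.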