Quick Start

Five steps. Roughly 90 seconds if you have an OpenAI key handy.

1. Clone

git clone https://github.com/sarmakska/rag-over-pdf.git
cd rag-over-pdf

2. Install

pnpm install

(npm install works too, but pnpm is faster.)

3. Get an OpenAI key

If you don't have an OpenAI API key, create one in your OpenAI account. Free tier credits are usually enough to test this repo.

4. Configure

cp .env.example .env.local

Open .env.local and paste your key:

OPENAI_API_KEY=sk-proj-...

5. Run

pnpm dev

Open http://localhost:3000, upload one or more PDFs, and ask questions.

The page opens with an upload form.
After upload, the document appears in a list with its chunk and page counts. Upload more to build a multi-document corpus.
Tick the documents you want to search, or leave all unticked to search everything.
Ask a question. The answer streams in token by token, with a numbered source list underneath linking each citation to a document and page.
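
The selection rule and the numbered source list described above can be sketched in a few lines. This is only an illustration of the behaviour, not the repo's actual code: the `UploadedDoc` and `Citation` shapes and both function names are assumptions made for the example.

```typescript
// Hypothetical shapes; the real app may store different fields.
interface UploadedDoc {
  id: string;
  name: string;
  chunks: number;
  pages: number;
}

interface Citation {
  docName: string;
  page: number;
}

// An empty selection means "search everything"; otherwise keep only ticked documents.
function documentsToSearch(all: UploadedDoc[], tickedIds: string[]): UploadedDoc[] {
  if (tickedIds.length === 0) {
    return all;
  }
  const ticked = new Set(tickedIds);
  return all.filter((doc) => ticked.has(doc.id));
}

// Number citations from 1 so each entry points at a document and page.
function formatSourceList(citations: Citation[]): string[] {
  return citations.map((c, i) => `[${i + 1}] ${c.docName}, p. ${c.page}`);
}
```

Treating an empty selection as "all documents" keeps the common case (search the whole corpus) to zero clicks.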

| Error | Cause | Fix |
| --- | --- | --- |
| `OPENAI_API_KEY is not set` | Environment file not loaded | Restart `pnpm dev` after editing `.env.local` |
| `PDF has no extractable text` | Scanned PDF | Use a different PDF or add OCR (out of scope) |
| Upload hangs forever | Very large PDF | Try one under 5 MB first |
| 429 from OpenAI | Rate limited | Wait, or upgrade your OpenAI tier |
| Empty answers | Retrieval missed | Rephrase the question, and check that the PDF actually contains the answer |
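
The first error in the table comes down to the key not being visible to the server process. A minimal sketch of a fail-fast check is shown below; `requireOpenAIKey` is a hypothetical helper written for illustration, and the repo's own check may look different.

```typescript
// Hypothetical helper: returns the key, or throws the same message the table lists.
function requireOpenAIKey(env: Record<string, string | undefined>): string {
  // Treat a missing, empty, or whitespace-only value as "not set".
  const key = env.OPENAI_API_KEY?.trim();
  if (!key) {
    throw new Error("OPENAI_API_KEY is not set");
  }
  return key;
}
```

On the server this would be called as `requireOpenAIKey(process.env)`. Because environment files are typically read once at startup, a key added to `.env.local` while the server is running only takes effect after restarting `pnpm dev`.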

PDFs with selectable text work best for testing.

Avoid PDFs that are mostly images, charts, or scanned pages. They have no extractable text.
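
To make "no extractable text" concrete, here is a rough sketch of the kind of check that could sit behind that error. It assumes the per-page text has already been extracted by some PDF parser; `hasExtractableText` and the `minChars` threshold are assumptions for illustration, not the repo's implementation.

```typescript
// Hypothetical check: a PDF counts as text-bearing if its pages together
// contain at least `minChars` non-whitespace characters.
function hasExtractableText(pageTexts: string[], minChars = 20): boolean {
  const total = pageTexts.reduce(
    (count, text) => count + text.replace(/\s+/g, "").length,
    0,
  );
  return total >= minChars;
}
```

A scanned document usually yields empty or whitespace-only strings for every page, so it fails this check; a text-based PDF passes with the first paragraph.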