Quick Start

sarmakska edited this page May 31, 2026 · 2 revisions

Five steps. Roughly 90 seconds if you have an OpenAI key handy.

1. Clone

git clone https://github.com/sarmakska/rag-over-pdf.git
cd rag-over-pdf

2. Install

pnpm install

(npm install works too, but pnpm is faster.)

3. Get an OpenAI key

If you don't have one, create a key on the API keys page of your OpenAI account. Free tier credits are usually enough to test this repo.

4. Configure

cp .env.example .env.local

Open .env.local and paste your key:

OPENAI_API_KEY=sk-proj-...
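Before starting the dev server, it can help to confirm that the key actually landed in the file. The following is a minimal sketch, not part of the repo: `check_openai_key` is a hypothetical helper name, and it only checks that the variable has a non-empty value. It does not check that the key is valid, and it never prints the key itself.

```shell
# Hypothetical helper (not shipped with the repo): reports whether an env
# file defines a non-empty OPENAI_API_KEY. Defaults to .env.local.
check_openai_key() {
  env_file="${1:-.env.local}"
  if [ ! -f "$env_file" ]; then
    echo "missing: $env_file"
    return 1
  fi
  # Match a line that starts with the variable name and has at least one
  # character after the equals sign. The value itself is never echoed.
  if grep -Eq '^OPENAI_API_KEY=.+' "$env_file"; then
    echo "ok: OPENAI_API_KEY is set in $env_file"
  else
    echo "not set: OPENAI_API_KEY in $env_file"
    return 1
  fi
}
```

Paste the function into your shell and run `check_openai_key` from the repo root, or pass another path as the first argument.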

5. Run

pnpm dev

Open http://localhost:3000, upload one or more PDFs, and ask questions.

What you should see

  1. An upload form.
  2. After upload, the document appears in a list with its chunk and page counts. Upload more to build a multi-document corpus.
  3. Tick the documents you want to search, or leave all unticked to search everything.
  4. Ask a question. The answer streams in token by token, with a numbered source list underneath linking each citation to a document and page.

If something breaks

Error | Cause | Fix
--- | --- | ---
OPENAI_API_KEY is not set | env not loaded | Restart pnpm dev after editing .env.local
PDF has no extractable text | scanned PDF | Use a different PDF or add OCR (out of scope)
Upload hangs forever | huge PDF | Try one under 5MB first
429 from OpenAI | rate limited | Wait, or upgrade your OpenAI tier
Empty answers | retrieval missed | Try a different question phrasing, check the PDF really contains the answer
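For the "Upload hangs forever" case, a quick size check before uploading can save a wait. This is a rough sketch under stated assumptions: `pdf_under_limit` is a hypothetical helper name, and the default of 5 MB (taken here as 5 * 1024 * 1024 bytes) mirrors the suggestion in the table rather than any limit enforced by the app.

```shell
# Hypothetical helper (not shipped with the repo): succeeds when a file is
# smaller than a byte limit. The default limit is 5 MB, an assumption based
# on the troubleshooting advice, not a documented app limit.
pdf_under_limit() {
  file="$1"
  limit="${2:-5242880}"   # 5 * 1024 * 1024 bytes
  # wc -c may pad its output with spaces on some systems, so strip them.
  size=$(wc -c < "$file" | tr -d ' ')
  [ "$size" -lt "$limit" ]
}
```

For example, `pdf_under_limit manual.pdf && echo "small enough to try first"`, where `manual.pdf` is a placeholder file name.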

Try it with

A few PDFs that work well for testing:

  • A product manual or spec sheet
  • A research paper (arXiv works great)
  • An employee handbook
  • A privacy policy
  • Last year's annual report

Avoid PDFs that are mostly images, charts, or scanned pages. They have little or no extractable text.
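One way to tell in advance whether a PDF has extractable text is to run it through a text extractor and count what comes out. The sketch below assumes the `pdftotext` command from Poppler is installed separately; it is not part of this repo. `has_extractable_text` is a hypothetical helper name, and the default threshold of 200 non-whitespace characters is an arbitrary assumption chosen for illustration.

```shell
# Hypothetical helper (not shipped with the repo): reads already-extracted
# text on stdin and succeeds when it contains at least a minimum number of
# non-whitespace characters. The default of 200 is an arbitrary threshold.
has_extractable_text() {
  min_chars="${1:-200}"
  # Drop all whitespace, then count the remaining bytes.
  count=$(tr -d '[:space:]' | wc -c | tr -d ' ')
  [ "$count" -ge "$min_chars" ]
}
```

With Poppler installed, `pdftotext paper.pdf - | has_extractable_text && echo "looks searchable"` checks a placeholder file named `paper.pdf`. A scanned PDF will typically fail this check because the extractor finds almost no text.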
