📄 DocPilot AI

Intelligent Document Processing & RAG-Powered Q&A System
Transform PDFs into actionable insights with AI-powered extraction and enterprise automation

Features • Architecture • Tech Stack • Getting Started • API Reference • Deployment

🎯 Overview

DocPilot AI is an end-to-end intelligent document processing platform that combines Retrieval-Augmented Generation (RAG) with structured data extraction. Users upload PDF documents (invoices, bills of quantities (BOQs), contracts), ask natural language questions about them, and extract structured JSON data for downstream automation.

Key Capabilities

📤 PDF Ingestion — Upload and process PDF documents with automatic text extraction
💬 RAG-Powered Q&A — Ask questions about your documents with AI-generated answers and source citations
📊 Structured Extraction — Extract machine-readable JSON (invoice details, BOQ line items) using schema-based parsing
🔄 Enterprise Automation — Trigger n8n workflows to sync extracted data with Airtable, Jira, and Email

✨ Features

1. Document Upload & RAG Ingestion

Parse PDFs page-by-page with text extraction
Semantic chunking (800 tokens with 120-token overlap)
Vector embeddings using BAAI/bge-small-en-v1.5 model via FastEmbed
FAISS index storage for fast similarity search
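
The chunking parameters above (800-token windows with a 120-token overlap) can be sketched as a sliding window. This is a minimal illustration, not the project's actual `rag.py`: `chunk_tokens` is a hypothetical helper, and the whitespace split in the usage note stands in for whatever tokenizer the backend really uses.

```python
from typing import List, Sequence


def chunk_tokens(tokens: Sequence[str], size: int = 800, overlap: int = 120) -> List[List[str]]:
    """Split tokens into windows of `size` that share `overlap` tokens with their neighbour."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and overlap must satisfy 0 <= overlap < size")
    step = size - overlap  # each new window starts this many tokens after the previous one
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + size]))
        if start + size >= len(tokens):
            break  # the last window already reaches the end of the text
    return chunks
```

For example, `chunk_tokens(page_text.split())` on a 2,000-word page yields three chunks, and the last 120 tokens of each chunk reappear at the start of the next, so a sentence that straddles a boundary is still seen whole by at least one chunk.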

2. Intelligent Q&A (RAG Chat)

Natural language queries on uploaded documents
Top-K similarity search for context retrieval
LLM-powered answer generation via Groq API
Source citations for answer verification
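
To make the retrieval step concrete, here is a pure-Python sketch of top-K similarity search. It is a stand-in for the FAISS index, not the project's code: `cosine` and `top_k` are hypothetical names, cosine similarity is an assumed metric, and a real index would avoid this brute-force scan.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: Sequence[float], vectors: Sequence[Sequence[float]], k: int = 3) -> List[Tuple[int, float]]:
    """Return (chunk_index, score) pairs for the k vectors most similar to the query."""
    scored = [(index, cosine(query, vector)) for index, vector in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

The returned indices identify which chunks to pass to the LLM as context and to report back as source citations.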

3. Structured Data Extraction

Schema Type	Extracted Fields
Invoice	Vendor name, Invoice number, Date, Total amount, GSTIN
BOQ	Item count, Line items, Sample entries

4. DocOps Automation via n8n

Webhook-triggered workflows on successful extraction
Airtable record creation for audit trails
Automatic Jira ticket creation
Email notifications to stakeholders

🏗️ Architecture

High-Level System Design

flowchart TB
  %% ================
  %% Swimlanes Style
  %% ================

  subgraph Client["🖥️ Client Layer"]
    U[👤 User]
    FE["DocPilot UI<br/>Next.js (Vercel)"]
    U --> FE
  end

  subgraph API["⚙️ Service Layer"]
    BE["DocPilot Backend<br/>FastAPI (Render)"]
    PDF["PDF Parser<br/>+ Chunker"]
    EMB["Embeddings Engine<br/>FastEmbed (BGE)"]
    VS[("Vector Store<br/>FAISS Index")]
    LLM["LLM Inference<br/>Groq API"]
  end

  subgraph Automation["🔄 Automation Layer"]
    N8N["n8n Orchestrator<br/>DocOps Workflow"]
    AIR[("Airtable<br/>DocOps DB")]
    JIRA["Jira Cloud<br/>Ticketing"]
    MAIL["📧 Email Service<br/>SMTP/Gmail"]
  end

  %% Client -> Backend
  FE -->|"POST /ingest"| BE
  FE -->|"POST /ask"| BE
  FE -->|"POST /extract"| BE

  %% Ingestion
  BE --> PDF --> EMB --> VS
  VS -->|"doc_id"| BE --> FE

  %% RAG Ask
  BE -->|"Search chunks"| VS --> BE
  BE -->|"Prompt + Context"| LLM --> BE --> FE

  %% Extraction
  BE -->|"Schema Prompt"| LLM -->|"Structured JSON"| BE --> FE

  %% Automation Trigger
  BE -->|"Webhook Event"| N8N
  N8N --> AIR
  N8N --> JIRA
  N8N --> MAIL
  N8N -->|"jira_key + status"| BE --> FE

Detailed Data Flow

flowchart LR
  %% =========================
  %% DocPilot AI Architecture
  %% =========================

  %% --- Actors ---
  U[👤 User / Stakeholder]

  %% --- Frontend ---
  FE["DocPilot UI<br/>Next.js (Vercel)"]

  %% --- Backend ---
  BE["DocPilot API<br/>FastAPI (Render)"]
  PDF["📄 PDF Parser + Chunker"]
  EMB["🧠 Embeddings Engine<br/>FastEmbed"]
  VS[("💾 Vector Store<br/>FAISS Index")]
  LLM["🤖 LLM Inference<br/>Groq API"]

  %% --- Automation ---
  N8N["⚡ n8n Orchestrator<br/>DocOps Workflow (Render)"]
  AIR[("📊 Airtable<br/>DocOps DB")]
  JIRA["🎫 Jira Cloud<br/>Create Issue"]
  MAIL["📧 Email Notification<br/>SMTP/Gmail"]

  %% =========================
  %% User -> Frontend
  %% =========================
  U -->|"Upload PDF / Ask Question / Extract"| FE

  %% =========================
  %% Frontend -> Backend
  %% =========================
  FE -->|"POST /ingest<br/>PDF Upload"| BE
  FE -->|"POST /ask<br/>(doc_id, query)"| BE
  FE -->|"POST /extract<br/>(doc_id, schema)"| BE

  %% =========================
  %% Ingestion Pipeline (RAG)
  %% =========================
  BE -->|"Extract Text"| PDF
  PDF -->|"Chunk Text"| EMB
  EMB -->|"Store vectors + chunks"| VS
  VS -->|"Return doc_id"| BE

  %% =========================
  %% RAG Query Pipeline
  %% =========================
  BE -->|"Top-k similarity search"| VS
  VS -->|"Relevant chunks"| BE
  BE -->|"Prompt + Context"| LLM
  LLM -->|"Answer + Sources"| BE
  BE -->|"Response to UI"| FE

  %% =========================
  %% Structured Extraction Pipeline
  %% =========================
  BE -->|"Schema-based extraction<br/>Invoice/BOQ JSON"| LLM
  LLM -->|"Structured JSON output"| BE
  BE -->|"Show extracted fields"| FE

  %% =========================
  %% Automation Trigger (n8n)
  %% =========================
  BE -->|"POST Webhook<br/>doc_id + extracted_json"| N8N

  %% =========================
  %% DocOps Workflow (n8n)
  %% =========================
  N8N -->|"Create Record"| AIR
  N8N -->|"Create Jira Task"| JIRA
  N8N -->|"Update Airtable (jira_key/status)"| AIR
  N8N -->|"Send Summary Email"| MAIL

  %% =========================
  %% Return workflow output back
  %% =========================
  N8N -->|"Workflow status + jira_key"| BE
  BE -->|"UI Notification<br/>Jira created"| FE

🛠️ Tech Stack

Frontend

Technology	Purpose
Next.js 16	React framework with App Router
React 19	UI library
TailwindCSS 4	Utility-first styling
TypeScript	Type safety
Vercel	Deployment platform

Backend

Technology	Purpose
FastAPI	High-performance Python API framework
FAISS	Vector similarity search
FastEmbed	Lightweight embeddings (BGE-small-en-v1.5)
pypdf	PDF text extraction
Groq	LLM inference (fast, affordable)
Render	Deployment platform

Automation

Technology	Purpose
n8n	Workflow orchestration
Airtable	Document records storage
Jira Cloud	Task/ticket automation
Email (SMTP)	Stakeholder notifications

🚀 Getting Started

Prerequisites

Node.js 20.9+ (for frontend; Next.js 16 no longer supports Node.js 18)
Python 3.10+ (for backend)
Groq API Key (get one at console.groq.com)

Backend Setup

# Navigate to backend directory
cd backend

# Create virtual environment
python -m venv .venv

# Activate virtual environment
# Windows:
.venv\Scripts\activate
# macOS/Linux:
source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Create .env file
cp .env.example .env
# Edit .env and add your GROQ_API_KEY

# Run the server
uvicorn app.main:app --reload --port 8000

Frontend Setup

# Navigate to frontend directory
cd frontend

# Install dependencies
npm install

# Create .env.local file
cp .env.example .env.local
# Edit .env.local and set NEXT_PUBLIC_API_URL

# Run the development server
npm run dev

Environment Variables

Backend (`.env`)

GROQ_API_KEY=your_groq_api_key_here

Frontend (`.env.local`)

NEXT_PUBLIC_API_URL=http://localhost:8000
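
A missing or placeholder key is easiest to catch at startup. The following is a minimal sketch, assuming the variables from `.env` are already present in the process environment (for example via a loader such as python-dotenv); how the backend actually loads them is not shown in this README, and `load_groq_api_key` is a hypothetical helper.

```python
import os
from typing import Mapping

PLACEHOLDER = "your_groq_api_key_here"  # the value shipped in the example above


def load_groq_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return GROQ_API_KEY, failing fast if it is unset or still the placeholder."""
    key = env.get("GROQ_API_KEY", "").strip()
    if not key or key == PLACEHOLDER:
        raise RuntimeError("GROQ_API_KEY is not set; edit backend/.env")
    return key
```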

📡 API Reference

Base URL

Local: http://localhost:8000
Production: https://your-backend.onrender.com

Endpoints

`POST /ingest`

Upload and process a PDF document.

Request:

curl -X POST "http://localhost:8000/ingest" \
  -F "file=@invoice.pdf"

Response:

{
  "doc_id": "79bc66af-1234-5678-abcd-ef1234567890"
}

`POST /ask`

Ask a question about an uploaded document.

Request:

{
  "doc_id": "79bc66af-1234-5678-abcd-ef1234567890",
  "query": "What is the total amount?"
}

Response:

{
  "answer": "The total amount on this invoice is ₹59,000.",
  "sources": [
    {
      "chunk_id": 0,
      "text": "Total Amount: ₹59,000..."
    }
  ]
}
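
A client can render this response as an answer followed by its citations. The sketch below relies only on the `answer`, `sources`, `chunk_id`, and `text` keys shown above; `format_answer` and its `preview` parameter are hypothetical.

```python
from typing import Any, Dict


def format_answer(response: Dict[str, Any], preview: int = 80) -> str:
    """Render an /ask response as the answer followed by one line per cited chunk."""
    lines = [response["answer"]]
    for source in response.get("sources", []):
        snippet = source["text"][:preview]  # keep long chunks readable
        lines.append(f"  [chunk {source['chunk_id']}] {snippet}")
    return "\n".join(lines)
```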

`POST /extract`

Extract structured data from a document.

Request:

{
  "doc_id": "79bc66af-1234-5678-abcd-ef1234567890",
  "schema_type": "invoice"
}

Response:

{
  "vendor_name": "XYZ Solutions Pvt. Ltd.",
  "invoice_number": "INV-2024-001",
  "invoice_date": "24 Jan 2024",
  "total_amount": "₹59,000",
  "gstin": "27AABCX1234A1Z5"
}
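
Downstream consumers may want to check that all five fields are present before acting on the result. Here is a minimal sketch using a standard-library dataclass as a stand-in for the Pydantic models in `schemas.py`, whose real definitions are not shown here; `InvoiceFields` and `parse_invoice` are hypothetical names.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class InvoiceFields:
    """Field names mirror the /extract response shown above."""
    vendor_name: str
    invoice_number: str
    invoice_date: str
    total_amount: str
    gstin: str


def parse_invoice(raw: str) -> InvoiceFields:
    """Parse a JSON string and fail loudly if any expected field is absent."""
    data = json.loads(raw)
    names = [f.name for f in fields(InvoiceFields)]
    missing = [name for name in names if name not in data]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    # Extra keys are ignored; values are kept as strings, as in the sample response.
    return InvoiceFields(**{name: str(data[name]) for name in names})
```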

`GET /health`

Health check endpoint.

Response:

{
  "status": "ok"
}

🔄 Automation Flows

Flow 1: Document Upload + RAG Ingestion

User uploads PDF from UI
Frontend calls POST /ingest
Backend extracts text, chunks it, generates embeddings
Stores vectors in FAISS index
Returns doc_id for future queries

Flow 2: RAG-Powered Q&A

User asks a natural language question
Frontend calls POST /ask with doc_id and query
Backend searches FAISS for relevant chunks
Builds prompt with context and calls Groq LLM
Returns answer with source citations
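
The "builds prompt with context" step in Flow 2 might look like the sketch below. The prompt wording and the `build_prompt` helper are assumptions for illustration; the actual template used by the backend is not documented here.

```python
from typing import Sequence


def build_prompt(query: str, chunks: Sequence[str]) -> str:
    """Number each retrieved chunk so the model can cite it, then append the question."""
    context = "\n\n".join(f"[{index}] {text}" for index, text in enumerate(chunks))
    return (
        "Answer the question using only the context below. "
        "Cite chunk numbers in square brackets.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )
```

Numbering the chunks in the prompt gives the model stable labels to cite, which the backend can map back to `chunk_id` values in the response.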

Flow 3: Structured Extraction

User clicks "Extract" and selects schema (Invoice/BOQ)
Frontend calls POST /extract
Backend performs schema-based extraction
Returns structured JSON response

Flow 4: DocOps Automation (n8n)

After successful extraction, backend triggers n8n webhook
n8n creates Airtable record with extracted data
n8n creates Jira ticket for follow-up
n8n sends email notification to stakeholders
n8n returns the workflow status and Jira key to the backend, which notifies the UI
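
The webhook trigger in Flow 4 can be sketched with the standard library. The payload keys follow the `doc_id + extracted_json` label in the architecture diagram, but the exact payload shape, the `build_webhook_request` helper, and the example URL are assumptions.

```python
import json
import urllib.request
from typing import Any, Dict


def build_webhook_request(webhook_url: str, doc_id: str, extracted: Dict[str, Any]) -> urllib.request.Request:
    """Prepare (but do not send) a JSON POST carrying the extraction result to n8n."""
    payload = {"doc_id": doc_id, "extracted_json": extracted}
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Sending is a separate step, e.g.:
#   request = build_webhook_request("https://n8n.example.com/webhook/docops", doc_id, fields)
#   with urllib.request.urlopen(request, timeout=10) as response:
#       status = json.load(response)
```

Keeping request construction separate from sending makes the payload easy to unit test without a running n8n instance.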

🌐 Deployment

Frontend (Vercel)

Push code to GitHub
Connect repository to Vercel
Set Root Directory to frontend
Add environment variable: NEXT_PUBLIC_API_URL
Deploy

Backend (Render)

Push code to GitHub
Create new Web Service on Render
Set Root Directory to backend
Build Command: pip install -r requirements.txt
Start Command: uvicorn app.main:app --host 0.0.0.0 --port $PORT
Add environment variable: GROQ_API_KEY
Deploy

n8n (Render)

Create new Web Service with n8n Docker image
Configure webhook URL
Set up Airtable, Jira, and Email credentials
Import workflow template

📁 Project Structure

docpilot-ai/
├── backend/
│   ├── app/
│   │   ├── __init__.py
│   │   ├── main.py          # FastAPI endpoints
│   │   ├── rag.py           # RAG pipeline (ingest, search)
│   │   ├── extract.py       # Structured extraction logic
│   │   ├── llm.py           # Groq LLM integration
│   │   └── schemas.py       # Pydantic models
│   ├── data/                # Stored PDFs and FAISS indexes
│   ├── Dockerfile
│   ├── requirements.txt
│   └── .env.example
│
├── frontend/
│   ├── src/
│   │   ├── app/             # Next.js App Router pages
│   │   ├── components/      # React components
│   │   └── lib/             # Utility functions
│   ├── public/
│   ├── package.json
│   └── .env.example
│
└── README.md

🎯 Roadmap

PDF upload and text extraction
RAG-powered Q&A with citations
Invoice structured extraction
BOQ structured extraction
LLM-enhanced extraction (beyond regex)
n8n webhook integration

👨‍💻 Author

Built with ❤️ by Kaushal

DocPilot AI — From documents to decisions, powered by intelligence.
