sploitengineer/docpilot-ai

FastAPI Next.js FAISS Groq n8n

📄 DocPilot AI

Intelligent Document Processing & RAG-Powered Q&A System
Transform PDFs into actionable insights with AI-powered extraction and enterprise automation

Features • Architecture • Tech Stack • Getting Started • API Reference • Deployment


🎯 Overview

DocPilot AI is an end-to-end intelligent document processing platform that combines Retrieval-Augmented Generation (RAG) with structured data extraction. It enables users to upload PDF documents (invoices, BOQs, contracts), ask natural language questions, and extract structured JSON data for downstream automation.

Key Capabilities

  • 📤 PDF Ingestion: Upload and process PDF documents with automatic text extraction
  • 💬 RAG-Powered Q&A: Ask questions about your documents with AI-generated answers and source citations
  • 📊 Structured Extraction: Extract machine-readable JSON (invoice details, BOQ line items) using schema-based parsing
  • 🔄 Enterprise Automation: Trigger n8n workflows to sync extracted data with Airtable, Jira, and Email

✨ Features

1. Document Upload & RAG Ingestion

  • Parse PDFs page-by-page with text extraction
  • Semantic chunking (800 tokens with 120-token overlap)
  • Vector embeddings using BAAI/bge-small-en-v1.5 model via FastEmbed
  • FAISS index storage for fast similarity search
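
The chunking step above can be pictured as a sliding window. The following is a minimal sketch, not the project's `rag.py`: the function name `chunk_tokens` is hypothetical, and plain list items stand in for real tokenizer output (an assumption).

```python
def chunk_tokens(tokens: list[str], size: int = 800, overlap: int = 120) -> list[list[str]]:
    """Split a token list into windows of `size` tokens sharing `overlap` tokens."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap  # with the defaults, each window starts 680 tokens later
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # this window already reaches the end of the text
    return chunks


# Numbered placeholders stand in for real tokens (assumption).
tokens = [f"t{i}" for i in range(2000)]
chunks = chunk_tokens(tokens)
print([len(c) for c in chunks])  # [800, 800, 640]
```

The last 120 tokens of each chunk reappear at the start of the next one, so a sentence that straddles a boundary is still seen whole by at least one chunk.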

2. Intelligent Q&A (RAG Chat)

  • Natural language queries on uploaded documents
  • Top-K similarity search for context retrieval
  • LLM-powered answer generation via Groq API
  • Source citations for answer verification
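
To illustrate what top-K retrieval does, here is a small pure-Python sketch that ranks chunk vectors by cosine similarity. It is a stand-in for the FAISS index rather than the FAISS API, and the names `cosine` and `top_k` and the toy 2-D vectors are assumptions for illustration.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vec: list[float], chunk_vecs: list[list[float]], k: int = 3) -> list[tuple[int, float]]:
    """Return (chunk_index, score) pairs for the k most similar chunks."""
    scored = [(cosine(query_vec, vec), idx) for idx, vec in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(idx, score) for score, idx in scored[:k]]


# Toy 2-D embeddings; real BGE-small vectors have many more dimensions.
chunk_vecs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
hits = top_k([1.0, 0.1], chunk_vecs, k=2)
print([idx for idx, _ in hits])  # [0, 2]
```

A brute-force scan like this is linear in the number of chunks; a vector index such as FAISS exists to make the same lookup fast at scale.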

3. Structured Data Extraction

| Schema Type | Extracted Fields |
|-------------|------------------|
| Invoice | Vendor name, Invoice number, Date, Total amount, GSTIN |
| BOQ | Item count, Line items, Sample entries |
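
As a rough picture of the invoice schema, the sketch below models the fields with a standard-library dataclass. The project lists Pydantic models in `schemas.py`; this is only a stand-in, and the class name `InvoiceFields` and the `missing` helper are hypothetical. The field names follow the `/extract` response shown in the API Reference.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class InvoiceFields:
    """Invoice fields; every field is optional because extraction may miss some."""
    vendor_name: Optional[str] = None
    invoice_number: Optional[str] = None
    invoice_date: Optional[str] = None
    total_amount: Optional[str] = None
    gstin: Optional[str] = None

    def missing(self) -> list[str]:
        """Names of the fields that were not extracted, in declaration order."""
        return [name for name, value in asdict(self).items() if not value]


partial = InvoiceFields(vendor_name="XYZ Solutions Pvt. Ltd.", invoice_number="INV-2024-001")
print(partial.missing())  # ['invoice_date', 'total_amount', 'gstin']
```

Reporting the missing fields explicitly makes it easier for a downstream workflow to decide whether a record needs manual review.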

4. DocOps Automation via n8n

  • Webhook-triggered workflows on successful extraction
  • Airtable record creation for audit trails
  • Automatic Jira ticket creation
  • Email notifications to stakeholders

🏗️ Architecture

High-Level System Design

flowchart TB
  %% ================
  %% Swimlanes Style
  %% ================

  subgraph Client["🖥️ Client Layer"]
    U[👤 User]
    FE["DocPilot UI<br/>Next.js (Vercel)"]
    U --> FE
  end

  subgraph API["⚙️ Service Layer"]
    BE["DocPilot Backend<br/>FastAPI (Render)"]
    PDF["PDF Parser<br/>+ Chunker"]
    EMB["Embeddings Engine<br/>FastEmbed (BGE)"]
    VS[("Vector Store<br/>FAISS Index")]
    LLM["LLM Inference<br/>Groq API"]
  end

  subgraph Automation["🔄 Automation Layer"]
    N8N["n8n Orchestrator<br/>DocOps Workflow"]
    AIR[("Airtable<br/>DocOps DB")]
    JIRA["Jira Cloud<br/>Ticketing"]
    MAIL["📧 Email Service<br/>SMTP/Gmail"]
  end

  %% Client -> Backend
  FE -->|"POST /ingest"| BE
  FE -->|"POST /ask"| BE
  FE -->|"POST /extract"| BE

  %% Ingestion
  BE --> PDF --> EMB --> VS
  VS -->|"doc_id"| BE --> FE

  %% RAG Ask
  BE -->|"Search chunks"| VS --> BE
  BE -->|"Prompt + Context"| LLM --> BE --> FE

  %% Extraction
  BE -->|"Schema Prompt"| LLM -->|"Structured JSON"| BE --> FE

  %% Automation Trigger
  BE -->|"Webhook Event"| N8N
  N8N --> AIR
  N8N --> JIRA
  N8N --> MAIL
  N8N -->|"jira_key + status"| BE --> FE

Detailed Data Flow

flowchart LR
  %% =========================
  %% DocPilot AI Architecture
  %% =========================

  %% --- Actors ---
  U[👤 User / Stakeholder]

  %% --- Frontend ---
  FE["DocPilot UI<br/>Next.js (Vercel)"]

  %% --- Backend ---
  BE["DocPilot API<br/>FastAPI (Render)"]
  PDF["📄 PDF Parser + Chunker"]
  EMB["🧠 Embeddings Engine<br/>FastEmbed"]
  VS[("💾 Vector Store<br/>FAISS Index")]
  LLM["🤖 LLM Inference<br/>Groq API"]

  %% --- Automation ---
  N8N["⚡ n8n Orchestrator<br/>DocOps Workflow (Render)"]
  AIR[("📊 Airtable<br/>DocOps DB")]
  JIRA["🎫 Jira Cloud<br/>Create Issue"]
  MAIL["📧 Email Notification<br/>SMTP/Gmail"]

  %% =========================
  %% User -> Frontend
  %% =========================
  U -->|"Upload PDF / Ask Question / Extract"| FE

  %% =========================
  %% Frontend -> Backend
  %% =========================
  FE -->|"POST /ingest<br/>PDF Upload"| BE
  FE -->|"POST /ask<br/>(doc_id, query)"| BE
  FE -->|"POST /extract<br/>(doc_id, schema)"| BE

  %% =========================
  %% Ingestion Pipeline (RAG)
  %% =========================
  BE -->|"Extract Text"| PDF
  PDF -->|"Chunk Text"| EMB
  EMB -->|"Store vectors + chunks"| VS
  VS -->|"Return doc_id"| BE

  %% =========================
  %% RAG Query Pipeline
  %% =========================
  BE -->|"Top-k similarity search"| VS
  VS -->|"Relevant chunks"| BE
  BE -->|"Prompt + Context"| LLM
  LLM -->|"Answer + Sources"| BE
  BE -->|"Response to UI"| FE

  %% =========================
  %% Structured Extraction Pipeline
  %% =========================
  BE -->|"Schema-based extraction<br/>Invoice/BOQ JSON"| LLM
  LLM -->|"Structured JSON output"| BE
  BE -->|"Show extracted fields"| FE

  %% =========================
  %% Automation Trigger (n8n)
  %% =========================
  BE -->|"POST Webhook<br/>doc_id + extracted_json"| N8N

  %% =========================
  %% DocOps Workflow (n8n)
  %% =========================
  N8N -->|"Create Record"| AIR
  N8N -->|"Create Jira Task"| JIRA
  N8N -->|"Update Airtable (jira_key/status)"| AIR
  N8N -->|"Send Summary Email"| MAIL

  %% =========================
  %% Return workflow output back
  %% =========================
  N8N -->|"Workflow status + jira_key"| BE
  BE -->|"UI Notification<br/>Jira created"| FE

🛠️ Tech Stack

Frontend

| Technology | Purpose |
|------------|---------|
| Next.js 16 | React framework with App Router |
| React 19 | UI library |
| TailwindCSS 4 | Utility-first styling |
| TypeScript | Type safety |
| Vercel | Deployment platform |

Backend

| Technology | Purpose |
|------------|---------|
| FastAPI | High-performance Python API framework |
| FAISS | Vector similarity search |
| FastEmbed | Lightweight embeddings (BGE-small-en-v1.5) |
| pypdf | PDF text extraction |
| Groq | LLM inference (fast, affordable) |
| Render | Deployment platform |

Automation

| Technology | Purpose |
|------------|---------|
| n8n | Workflow orchestration |
| Airtable | Document records storage |
| Jira Cloud | Task/ticket automation |
| Email (SMTP) | Stakeholder notifications |

🚀 Getting Started

Prerequisites

  • Node.js 18+ (for frontend)
  • Python 3.10+ (for backend)
  • Groq API Key (get one at console.groq.com)

Backend Setup

# Navigate to backend directory
cd backend

# Create virtual environment
python -m venv .venv

# Activate virtual environment
# Windows:
.venv\Scripts\activate
# macOS/Linux:
source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Create .env file
cp .env.example .env
# Edit .env and add your GROQ_API_KEY

# Run the server
uvicorn app.main:app --reload --port 8000

Frontend Setup

# Navigate to frontend directory
cd frontend

# Install dependencies
npm install

# Create .env.local file
cp .env.example .env.local
# Edit .env.local and set NEXT_PUBLIC_API_URL

# Run the development server
npm run dev

Environment Variables

Backend (.env)

GROQ_API_KEY=your_groq_api_key_here

Frontend (.env.local)

NEXT_PUBLIC_API_URL=http://localhost:8000
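
A backend can fail fast when the key is missing or still set to the placeholder. The sketch below is one way to do that check; the function name `load_groq_key` is hypothetical, and it assumes the variable has already been loaded into the process environment (for example by exporting it or by a `.env` loader).

```python
import os
from typing import Mapping

PLACEHOLDER = "your_groq_api_key_here"  # value shown in the .env example above


def load_groq_key(env: Mapping[str, str] = os.environ) -> str:
    """Return GROQ_API_KEY, or raise if it is unset, empty, or the placeholder."""
    key = env.get("GROQ_API_KEY", "").strip()
    if not key or key == PLACEHOLDER:
        raise RuntimeError("GROQ_API_KEY is not set; add it to backend/.env")
    return key


# Passing a plain dict makes the check easy to try without touching the real environment.
print(load_groq_key({"GROQ_API_KEY": "gsk_example"}))  # gsk_example
```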

📡 API Reference

Base URL

  • Local: http://localhost:8000
  • Production: https://your-backend.onrender.com

Endpoints

POST /ingest

Upload and process a PDF document.

Request:

curl -X POST "http://localhost:8000/ingest" \
  -F "file=@invoice.pdf"

Response:

{
  "doc_id": "79bc66af-1234-5678-abcd-ef1234567890"
}

POST /ask

Ask a question about an uploaded document.

Request:

{
  "doc_id": "79bc66af-1234-5678-abcd-ef1234567890",
  "query": "What is the total amount?"
}

Response:

{
  "answer": "The total amount on this invoice is ₹59,000.",
  "sources": [
    {
      "chunk_id": 0,
      "text": "Total Amount: ₹59,000..."
    }
  ]
}
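
For scripting against the API, a small Python client for `/ask` might look like the sketch below. It uses only the standard library and assumes the endpoint accepts the JSON body shown above; the helper names `build_ask_request` and `ask` are hypothetical.

```python
import json
import urllib.request


def build_ask_request(base_url: str, doc_id: str, query: str) -> urllib.request.Request:
    """Build (but do not send) a POST /ask request with a JSON body."""
    body = json.dumps({"doc_id": doc_id, "query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(base_url: str, doc_id: str, query: str, timeout: float = 30.0) -> dict:
    """Send the question and return the decoded JSON response."""
    request = build_ask_request(base_url, doc_id, query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_ask_request("http://localhost:8000", "79bc66af-1234-5678-abcd-ef1234567890",
                            "What is the total amount?")
print(request.get_method(), request.full_url)  # POST http://localhost:8000/ask
```

With the backend running locally, `ask("http://localhost:8000", doc_id, "What is the total amount?")` would return the decoded JSON response.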

POST /extract

Extract structured data from a document.

Request:

{
  "doc_id": "79bc66af-1234-5678-abcd-ef1234567890",
  "schema_type": "invoice"
}

Response:

{
  "vendor_name": "XYZ Solutions Pvt. Ltd.",
  "invoice_number": "INV-2024-001",
  "invoice_date": "24 Jan 2024",
  "total_amount": "₹59,000",
  "gstin": "27AABCX1234A1Z5"
}

GET /health

Health check endpoint.

Response:

{
  "status": "ok"
}

🔄 Automation Flows

Flow 1: Document Upload + RAG Ingestion

  1. User uploads PDF from UI
  2. Frontend calls POST /ingest
  3. Backend extracts text, chunks it, generates embeddings
  4. Stores vectors in FAISS index
  5. Returns doc_id for future queries

Flow 2: RAG-Powered Q&A

  1. User asks a natural language question
  2. Frontend calls POST /ask with doc_id and query
  3. Backend searches FAISS for relevant chunks
  4. Builds prompt with context and calls Groq LLM
  5. Returns answer with source citations
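
Step 4 of this flow, building a prompt from the retrieved chunks, can be sketched as follows. The prompt wording and the function name `build_prompt` are assumptions, not the project's actual prompt; the `chunk_id` and `text` keys mirror the `sources` entries in the `/ask` response.

```python
def build_prompt(query: str, chunks: list[dict]) -> str:
    """Assemble a grounded prompt: numbered context chunks followed by the question."""
    context = "\n\n".join(f"[{chunk['chunk_id']}] {chunk['text']}" for chunk in chunks)
    return (
        "Answer the question using only the context below. "
        "Cite the chunk ids you relied on in square brackets.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


retrieved = [{"chunk_id": 0, "text": "Total Amount: ₹59,000"}]
prompt = build_prompt("What is the total amount?", retrieved)
print(prompt)
```

Labelling each chunk with its id is what lets the answer carry source citations that map back to specific passages.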

Flow 3: Structured Extraction

  1. User clicks "Extract" and selects schema (Invoice/BOQ)
  2. Frontend calls POST /extract
  3. Backend performs schema-based extraction
  4. Returns structured JSON response
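
The roadmap item "LLM-enhanced extraction (beyond regex)" suggests a pattern-based baseline for step 3. The sketch below shows what such a baseline could look like for two invoice fields; the patterns and the function name `extract_invoice_fields` are assumptions and not the contents of `extract.py`. The GSTIN pattern follows the commonly documented 15-character layout (state code, PAN, entity code, `Z`, check character) and does not verify the checksum.

```python
import re
from typing import Optional

# 2-digit state code, 10-character PAN, entity code, literal Z, check character.
GSTIN_RE = re.compile(r"\b\d{2}[A-Z]{5}\d{4}[A-Z][1-9A-Z]Z[0-9A-Z]\b")
# Label such as "Invoice No:", "Invoice Number:" or "Invoice #:" followed by the id.
INVOICE_NO_RE = re.compile(
    r"Invoice\s*(?:Number|No\.?|#)\s*:\s*([A-Z0-9][A-Z0-9/-]*)", re.IGNORECASE
)


def extract_invoice_fields(text: str) -> dict[str, Optional[str]]:
    """Return invoice_number and gstin if their patterns occur in the text."""
    number = INVOICE_NO_RE.search(text)
    gstin = GSTIN_RE.search(text)
    return {
        "invoice_number": number.group(1) if number else None,
        "gstin": gstin.group(0) if gstin else None,
    }


sample = "Invoice No: INV-2024-001\nGSTIN: 27AABCX1234A1Z5"
print(extract_invoice_fields(sample))
```

Patterns like these are brittle against layout changes, which is the usual motivation for handing the same schema to an LLM instead.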

Flow 4: DocOps Automation (n8n)

  1. After successful extraction, backend triggers n8n webhook
  2. n8n creates Airtable record with extracted data
  3. n8n creates Jira ticket for follow-up
  4. n8n sends email notification to stakeholders
  5. n8n returns the workflow status and jira_key to the backend, which notifies the UI
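
The webhook call in step 1 could be sketched like this. The payload keys `doc_id` and `extracted_json` come from the data-flow diagram; the helper names, the example webhook URL, and the shape of the failure result are assumptions. The call returns a failure marker instead of raising, so an n8n outage would not break extraction itself.

```python
import json
import urllib.request


def build_webhook_request(webhook_url: str, doc_id: str, extracted: dict) -> urllib.request.Request:
    """Build (but do not send) the POST that hands extracted data to n8n."""
    payload = {"doc_id": doc_id, "extracted_json": extracted}
    return urllib.request.Request(
        url=webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def notify_n8n(webhook_url: str, doc_id: str, extracted: dict, timeout: float = 10.0) -> dict:
    """Trigger the workflow; report failure in the result rather than raising."""
    request = build_webhook_request(webhook_url, doc_id, extracted)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8") or "{}")
    except (OSError, ValueError) as exc:  # network errors, timeouts, non-JSON replies
        return {"status": "failed", "error": str(exc)}


# Hypothetical webhook URL, used only to show the request shape.
hook = build_webhook_request("https://n8n.example.com/webhook/docops", "doc-1",
                             {"invoice_number": "INV-2024-001"})
print(json.loads(hook.data))
```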

🌐 Deployment

Frontend (Vercel)

  1. Push code to GitHub
  2. Connect repository to Vercel
  3. Set Root Directory to frontend
  4. Add environment variable: NEXT_PUBLIC_API_URL
  5. Deploy

Backend (Render)

  1. Push code to GitHub
  2. Create new Web Service on Render
  3. Set Root Directory to backend
  4. Build Command: pip install -r requirements.txt
  5. Start Command: uvicorn app.main:app --host 0.0.0.0 --port $PORT
  6. Add environment variable: GROQ_API_KEY
  7. Deploy

n8n (Render)

  1. Create new Web Service with n8n Docker image
  2. Configure webhook URL
  3. Set up Airtable, Jira, and Email credentials
  4. Import workflow template

📁 Project Structure

docpilot-ai/
├── backend/
│   ├── app/
│   │   ├── __init__.py
│   │   ├── main.py          # FastAPI endpoints
│   │   ├── rag.py           # RAG pipeline (ingest, search)
│   │   ├── extract.py       # Structured extraction logic
│   │   ├── llm.py           # Groq LLM integration
│   │   └── schemas.py       # Pydantic models
│   ├── data/                # Stored PDFs and FAISS indexes
│   ├── Dockerfile
│   ├── requirements.txt
│   └── .env.example
│
├── frontend/
│   ├── src/
│   │   ├── app/             # Next.js App Router pages
│   │   ├── components/      # React components
│   │   └── lib/             # Utility functions
│   ├── public/
│   ├── package.json
│   └── .env.example
│
└── README.md

🎯 Roadmap

  • PDF upload and text extraction
  • RAG-powered Q&A with citations
  • Invoice structured extraction
  • BOQ structured extraction
  • LLM-enhanced extraction (beyond regex)
  • n8n webhook integration

👨‍💻 Author

Built with ❤️ by Kaushal


DocPilot AI: From documents to decisions, powered by intelligence.

About

Organizations struggle with manual document processing - extracting data from PDFs (invoices, BOQs, contracts), answering questions about documents, and triggering downstream workflows.
