A comprehensive Retrieval-Augmented Generation (RAG) application for chatting with documents.

z-scd/DocSpark

PDF RAG (Retrieval-Augmented Generation)

This project implements a PDF document processing system that uses vector embeddings for semantic search. It follows a client-server architecture: users upload PDFs through the client and run semantic searches over their content.

Project Structure

pdf-rag/
├── client/          # Next.js frontend
├── server/          # Node.js backend
└── docker-compose.yml  # Docker configuration

Prerequisites

  • Node.js (v18 or higher)
  • Docker and Docker Compose
  • pnpm (Package Manager)

Setup

  1. Clone the repository:
git clone <your-repo-url>
cd pdf-rag
  2. Install dependencies:
# Install server dependencies
cd server
pnpm install

# Install client dependencies
cd ../client
pnpm install
  3. Set up environment variables:

    • Copy .env.example to .env in both client and server directories
    • Fill in the required environment variables
  4. Start the services:

# From the repository root, start the Docker containers (Redis and Qdrant)
docker-compose up -d

# Start the server
cd server
pnpm dev

# Start the client (in a new terminal)
cd client
pnpm dev
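
Because the setup depends on values in .env, a small startup check can catch missing variables early. The following is a minimal sketch, not the project's actual code: the variable names (GOOGLE_API_KEY, QDRANT_URL, REDIS_HOST) and the helper missingEnvVars are hypothetical, so check .env.example for the real names.

```typescript
// Hypothetical variable names for illustration; see .env.example for the real ones.
const REQUIRED_VARS = ["GOOGLE_API_KEY", "QDRANT_URL", "REDIS_HOST"];

type Env = Record<string, string | undefined>;

// Return the names of required variables that are unset or blank.
function missingEnvVars(env: Env, required: string[] = REQUIRED_VARS): string[] {
  return required.filter((key) => {
    const value = env[key];
    return value === undefined || value.trim() === "";
  });
}

// Usage: in Node.js you would pass process.env instead of this literal.
const missing = missingEnvVars({ GOOGLE_API_KEY: "abc", QDRANT_URL: "" });
console.log(missing); // [ 'QDRANT_URL', 'REDIS_HOST' ]
```

Failing fast with a list of missing names tends to be easier to debug than a connection error surfacing later from Redis or Qdrant.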

Features

  • PDF document upload
  • Vector embeddings generation using Google AI
  • Document storage in Qdrant vector database
  • Semantic search capabilities
  • Real-time processing with Redis queue
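
To illustrate what semantic search over embeddings means, here is a minimal in-memory sketch. It is not how the project queries Qdrant: the vectors are hand-written toy values rather than Google AI embeddings, the Chunk shape and the search helper are hypothetical, and cosine similarity is an assumed metric (the README does not state which distance the collection uses).

```typescript
// Hypothetical shape for a stored PDF chunk and its embedding.
interface Chunk {
  id: string;
  text: string;
  vector: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 for zero-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector length mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Rank chunks by similarity to the query vector and keep the top K.
function search(queryVector: number[], chunks: Chunk[], topK = 2): { chunk: Chunk; score: number }[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(queryVector, chunk.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Toy 3-dimensional vectors; real embeddings have hundreds of dimensions.
const chunks: Chunk[] = [
  { id: "c1", text: "Invoices are due within 30 days.", vector: [0.9, 0.1, 0.0] },
  { id: "c2", text: "The warranty covers two years.", vector: [0.1, 0.9, 0.1] },
  { id: "c3", text: "Late payments incur a 2% fee.", vector: [0.8, 0.2, 0.1] },
];

const results = search([1, 0, 0], chunks);
console.log(results.map((r) => r.chunk.id)); // [ 'c1', 'c3' ]
```

A vector database such as Qdrant performs this ranking with an index instead of a linear scan, which is what makes it practical for large document sets.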

Tech Stack

  • Frontend: Next.js
  • Backend: Node.js
  • Vector Database: Qdrant
  • Queue: BullMQ with Redis
  • Embeddings: Google AI
  • Container: Docker
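
The pieces of the stack fit together as a pipeline: an upload enqueues a job, and a worker later picks it up, splits the text, and stores the chunks. The sketch below shows that flow under simplifying assumptions. InMemoryQueue is a stand-in written for this example, not the BullMQ API; splitIntoChunks uses a hypothetical fixed-size strategy; and the embedding and Qdrant steps are replaced by a plain array.

```typescript
// Hypothetical job and storage shapes for illustration.
interface PdfJob {
  fileName: string;
  text: string;
}

interface StoredChunk {
  fileName: string;
  index: number;
  text: string;
}

// In-memory FIFO stand-in for a job queue (the project uses BullMQ with Redis).
class InMemoryQueue<T> {
  private jobs: T[] = [];

  add(job: T): void {
    this.jobs.push(job);
  }

  // Run the handler on each queued job in order; return how many were processed.
  drain(handler: (job: T) => void): number {
    let processed = 0;
    let job = this.jobs.shift();
    while (job !== undefined) {
      handler(job);
      processed += 1;
      job = this.jobs.shift();
    }
    return processed;
  }
}

// Split text into fixed-size pieces; the last piece may be shorter.
function splitIntoChunks(text: string, size: number): string[] {
  if (size <= 0) throw new Error("size must be positive");
  const pieces: string[] = [];
  for (let i = 0; i < text.length; i += size) {
    pieces.push(text.slice(i, i + size));
  }
  return pieces;
}

const store: StoredChunk[] = []; // stands in for the vector database
const queue = new InMemoryQueue<PdfJob>();

// "Upload" side: enqueue the job and return immediately.
queue.add({ fileName: "report.pdf", text: "a".repeat(25) });

// "Worker" side: process queued jobs; a real worker would also embed each chunk.
const processed = queue.drain((job) => {
  splitIntoChunks(job.text, 10).forEach((text, index) => {
    store.push({ fileName: job.fileName, index, text });
  });
});

console.log(processed, store.length); // 1 3
```

Decoupling upload from processing this way keeps the upload request fast even when a PDF takes a while to process.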
