This repository implements a modular Retrieval-Augmented Generation (RAG) system with a complete ingestion workflow, semantic text chunking, embedding generation, vector search, and multi-turn conversational capabilities. The system is designed with extensibility and maintainability in mind, using FastAPI, SQLAlchemy, and OpenAI models.
The system is composed of independent but connected subsystems:
- Ingestion Pipeline — loads documents, extracts metadata, chunks text, and generates embeddings.
- Vector Search Engine — performs similarity search using stored embeddings.
- Chat Pipeline — orchestrates multi-turn conversations and optional retrieval.
- Knowledge Store — relational models for documents, chunks, embeddings, conversations, and messages.
- API Gateway — exposes `/ingest`, `/search`, and `/chat` endpoints.
A high-level architecture diagram (`docs/architecture-diagram.png`) and additional design notes are available in the `docs/` directory.
- Document ingestion with metadata and structured storage
- Semantic chunking optimized for embedding models
- Embedding generation using OpenAI embedding APIs
- Vector similarity search over chunked documents
- Retrieval-augmented multi-turn chat completion
- SQLAlchemy ORM modeling with UUID-based conversation sessions
- Modular services layer for easy extension or substitution
- RESTful API exposure via FastAPI
- Extensible codebase structured for testing and integration
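As an illustration of the semantic chunking feature, a greedy sentence-window chunker might look like the sketch below. The sentence splitter, size limit, and overlap policy here are simplifying assumptions for illustration, not the repository's actual implementation.

```python
import re

def chunk_text(text: str, max_chars: int = 500, overlap: int = 1) -> list[str]:
    """Greedy sentence-window chunker: pack sentences until max_chars,
    then start the next chunk `overlap` sentences back for continuity."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks, i = [], 0
    while i < len(sentences):
        current, j = [], i
        # Pack sentences while the joined chunk stays within max_chars.
        while j < len(sentences) and sum(len(s) + 1 for s in current) + len(sentences[j]) <= max_chars:
            current.append(sentences[j])
            j += 1
        if not current:              # a single sentence longer than max_chars
            current, j = [sentences[i]], i + 1
        chunks.append(" ".join(current))
        i = max(j - overlap, i + 1)  # step back `overlap` sentences, always advance
    return chunks
```

Overlapping chunks help the retriever when an answer straddles a chunk boundary, at the cost of some storage duplication.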
The schema models the core elements of a RAG system.
**Documents** — stores metadata for each ingested source file.
Columns: `id`, `name`, `path`, `created_at`, `document_metadata` (JSON)

**Chunks** — semantic text chunks with associated embeddings.
Columns: `id`, `document_id`, `chunk_index`, `text`, `embedding` (JSON), `created_at`, `chunk_metadata` (JSON)

**Conversations** — represents a conversational session.
Columns: `id` (UUID), `knowledge_base_id`, `created_at`

**Messages** — linked to conversations; stores user and assistant messages.
Columns: `id`, `conversation_id`, `role`, `content`, `created_at`
Indexing is applied based on common retrieval patterns.
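A sketch of how these tables might look as SQLAlchemy declarative models; the class names, column types, and string-encoded UUID are assumptions inferred from the column lists above, not the repository's actual models.

```python
import datetime
import uuid
from sqlalchemy import Column, DateTime, ForeignKey, Integer, JSON, String, Text, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

class Document(Base):
    __tablename__ = "documents"
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    path = Column(String, nullable=False)
    created_at = Column(DateTime, default=datetime.datetime.utcnow)
    document_metadata = Column(JSON)
    chunks = relationship("Chunk", back_populates="document")

class Chunk(Base):
    __tablename__ = "chunks"
    id = Column(Integer, primary_key=True)
    document_id = Column(Integer, ForeignKey("documents.id"), index=True)
    chunk_index = Column(Integer)
    text = Column(Text)
    embedding = Column(JSON)          # embedding vector stored as a JSON array
    created_at = Column(DateTime, default=datetime.datetime.utcnow)
    chunk_metadata = Column(JSON)
    document = relationship("Document", back_populates="chunks")

class Conversation(Base):
    __tablename__ = "conversations"
    id = Column(String, primary_key=True, default=lambda: str(uuid.uuid4()))  # UUID as string
    knowledge_base_id = Column(Integer)
    created_at = Column(DateTime, default=datetime.datetime.utcnow)

class Message(Base):
    __tablename__ = "messages"
    id = Column(Integer, primary_key=True)
    conversation_id = Column(String, ForeignKey("conversations.id"), index=True)
    role = Column(String)             # "user" or "assistant"
    content = Column(Text)
    created_at = Column(DateTime, default=datetime.datetime.utcnow)
```

Storing embeddings as JSON keeps the schema portable across databases; a dedicated vector store (see the roadmap) would replace the `embedding` column with a native vector type.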
`/ingest` — processes documents and populates the knowledge base.
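The ingestion flow (load, chunk, embed, store) can be sketched as below; the fixed-size chunker, the stubbed `embed` function standing in for the OpenAI embeddings call, and the in-memory `store` dict are all illustrative assumptions.

```python
import hashlib

def embed(text: str) -> list[float]:
    """Stand-in for the OpenAI embeddings call: a deterministic toy vector."""
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255 for b in digest[:8]]

def ingest_document(name: str, text: str, store: dict, chunk_size: int = 200) -> int:
    """Chunk the text, embed each chunk, and record rows in `store`."""
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    store[name] = [
        {"chunk_index": i, "text": c, "embedding": embed(c)}
        for i, c in enumerate(chunks)
    ]
    return len(chunks)
```

In the real pipeline the rows would go into the `Chunks` table described above rather than a dict.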
`/search` — performs semantic search over stored document embeddings.
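Under the hood, a search like this typically ranks stored chunk embeddings by cosine similarity to the query embedding. A minimal sketch, where the data layout is a hypothetical list of `(text, embedding)` pairs:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 for zero-norm input."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_emb: list[float], chunks: list[tuple[str, list[float]]], k: int = 5):
    """Rank (text, embedding) pairs by similarity to the query embedding."""
    scored = [(cosine_similarity(query_emb, emb), text) for text, emb in chunks]
    return sorted(scored, reverse=True)[:k]
```

This linear scan is fine for small corpora; the vector-database integration on the roadmap would replace it with an approximate nearest-neighbor index.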
Request

```json
{
  "query": "What does clause 7 describe?"
}
```

Response

```json
{
  "results": [...],
  "total_found": 5
}
```

`/chat` — generates a conversational response, optionally using retrieved context.
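One common way to wire retrieval into a chat turn is to fold the retrieved chunks into a system message ahead of the conversation history; the helper below is a hypothetical sketch (the actual prompt format and model call live in the services layer).

```python
def build_messages(query: str, history: list[dict], context_chunks: list[str]) -> list[dict]:
    """Assemble an OpenAI-style message list: a system prompt carrying the
    retrieved context, then prior turns, then the new user query."""
    context = "\n\n".join(context_chunks)
    system = {"role": "system",
              "content": f"Answer using the context below.\n\nContext:\n{context}"}
    return [system, *history, {"role": "user", "content": query}]
```

The resulting list is what would be passed to the chat-completions call, and the assistant's reply would be appended to the `Messages` table for the next turn.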
Request

```json
{
  "query": "Explain the confidentiality section",
  "conversation_id": "uuid"
}
```

Project structure:

```
app/
  api/                 # FastAPI route handlers
  core/                # Configurations and shared constants
  db/                  # SQLAlchemy models and database session
  ingest.py            # Ingestion workflow entry point
  logging_config.py    # Application-wide logging setup
  main.py              # FastAPI application bootstrap
  services/            # Embedding, chunking, retrieval, generation services
  utils/               # Common utilities
docs/
  architecture-diagram.png
  additional-design-docs.md
sample_data/
tests/
requirements.txt
```
Install dependencies:

```bash
pip install -r requirements.txt
```

Create `app/.env`:

```
OPENAI_API_KEY=your_api_key_here
```

Run the ingestion workflow:

```bash
python3 -m app.ingest
```

Start the API server:

```bash
uvicorn app.main:app --reload
```

Open the API documentation at http://localhost:8000/docs
Run the tests:

```bash
pytest
```

With coverage:

```bash
pytest --cov=app tests/
```

- Integrate a dedicated vector database (FAISS, Qdrant, Weaviate)
- Add hybrid retrieval (dense + sparse)
- Stream responses for chat completions
- Implement ingestion via REST endpoint
- Add web-based admin dashboard
- Enhance conversation summarization
MIT License
Towseef Altaf
Software Engineer – Distributed Systems, Developer Productivity, AI Engineering
