RAGForge is a Kubernetes-native Hybrid Retrieval-Augmented Generation (RAG) system built using official Kubernetes documentation as a real-world technical corpus.
This project is designed with AI Infrastructure principles from day one:
- Container-first architecture
- Kubernetes deployment (Minikube for lab)
- Modular RAG design
- Production-ready mindset
- Observability-ready structure
Goal: Build a fully containerized Hybrid RAG system running on Kubernetes (Minikube) using:
📄 Official Kubernetes Basics Documentation
Source:
https://kubernetes.io/docs/tutorials/kubernetes-basics/_print/
Initial Dataset: Printable version (~24 pages)
Why start with this dataset?
- Controlled size
- Structured technical content
- DevOps-aligned domain
- Ideal for parameter tuning
Flow:
User
↓
Ingress (Minikube)
↓
Service (ClusterIP / LoadBalancer)
↓
RAGForge API Pod (FastAPI)
├── Retriever
├── RAG Logic
└── LLM Client
↓
Vector DB Pod (Chroma / FAISS)
↓
External LLM API
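The request path through the API pod can be sketched as three plain-Python components (all names and the toy corpus below are illustrative; the real service would run this logic behind FastAPI endpoints and call the vector DB and LLM over the network):

```python
class Retriever:
    """Stand-in for the component that queries the vector DB pod."""

    def retrieve(self, query: str, top_k: int = 3) -> list[str]:
        # In the real system this would be a Chroma/FAISS similarity search.
        corpus = {
            "pod": "A Pod is the smallest deployable unit in Kubernetes.",
            "service": "A Service exposes Pods behind a stable endpoint.",
        }
        return [text for key, text in corpus.items() if key in query.lower()][:top_k]


class LLMClient:
    """Stand-in for the external LLM API call."""

    def complete(self, prompt: str) -> str:
        return f"[answer grounded in {prompt.count('CONTEXT')} context block(s)]"


def rag_answer(query: str, retriever: Retriever, llm: LLMClient) -> str:
    """RAG logic: retrieve chunks, inject them as context, call the LLM."""
    chunks = retriever.retrieve(query)
    context = "\n".join(f"CONTEXT: {c}" for c in chunks)
    prompt = f"{context}\nQUESTION: {query}"
    return llm.complete(prompt)
```

The separation mirrors the diagram: swapping Chroma for FAISS, or one LLM provider for another, only touches one component.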
Observability (Optional – Stage 4+):
- Prometheus
- Grafana
- Structured Logging
Why containers and Kubernetes:
- Clean isolation from the local OS
- Portability across environments
- Easy scaling (HPA later)
- Future-ready production design
- Strong alignment with AI Infra / MLOps roles
Minikube provides:
- Local cluster
- Safe experimentation
- No pollution of host system
- Real Kubernetes workflow
```
ragforge/
│
├── README.md
├── ROADMAP.md
├── requirements.txt
├── .env.example
├── .gitignore
│
├── data/
│   ├── raw/
│   │   └── kubernetes_basics.pdf
│   └── processed/
│
├── src/
│   ├── ingestion/
│   ├── embeddings/
│   ├── retriever/
│   ├── generation/
│   ├── evaluation/
│   └── config/
│
├── k8s/
│   ├── namespace.yaml
│   ├── ragforge-deployment.yaml
│   ├── ragforge-service.yaml
│   ├── vector-db-deployment.yaml
│   ├── vector-db-service.yaml
│   └── ingress.yaml
│
└── tests/
```
Ingestion pipeline:
- Extract text from the PDF
- Clean formatting artifacts
- Normalize text
- Remove duplicated headers
- Section-aware chunking
Key Parameters:
- chunk_size
- chunk_overlap
- section tagging
- page metadata
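A minimal sketch of how `chunk_size` and `chunk_overlap` interact (character-based windows for simplicity; the real pipeline would also attach section tags and page metadata extracted during PDF parsing):

```python
def chunk_text(text: str, chunk_size: int = 200, chunk_overlap: int = 50) -> list[dict]:
    """Split text into overlapping windows, keeping each chunk's start offset.

    Illustrative only: a section-aware chunker would split on headings first
    and carry section/page metadata alongside the text.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # stride between window starts
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        if piece.strip():
            chunks.append({"text": piece, "start": start})
        if start + chunk_size >= len(text):
            break  # last window already covers the end of the text
    return chunks
```

Larger `chunk_overlap` repeats more text between neighbouring chunks, which preserves context continuity at the cost of a bigger index.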
Embedding pipeline:
- Generate vector representations
- Batch processing
- Cache embeddings
- Normalize vectors
Key Concepts:
- Cosine similarity
- Embedding dimensionality
- Semantic proximity
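The two core operations above, normalization and cosine similarity, in a few lines of stdlib Python (a sketch of the math, not the production embedding code):

```python
import math

def normalize(vec: list[float]) -> list[float]:
    """L2-normalize a vector so that a plain dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity: 1.0 = same direction, 0.0 = orthogonal (unrelated)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)
```

Normalizing vectors once at indexing time is why many vector DBs can rank by dot product alone.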
Retrieval:
- Top-K similarity search
- Similarity threshold filtering
- Metadata filtering
- Section-aware retrieval
Why: Retrieval precision > model size.
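The three retrieval steps combined in one illustrative function (the `index` shape, field names, and defaults are assumptions for the sketch; the real retriever would delegate the similarity search to Chroma/FAISS):

```python
import math

def top_k_search(query_vec, index, top_k=3, similarity_threshold=0.2, section=None):
    """Rank chunks by cosine similarity, then apply threshold and optional
    metadata (section) filtering.

    `index` is a list of dicts: {"vec": [...], "text": ..., "section": ...}.
    """
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb)

    # Metadata filter first: restricts the candidate set before ranking.
    candidates = index if section is None else [c for c in index if c["section"] == section]
    scored = sorted(((cos(query_vec, c["vec"]), c) for c in candidates),
                    key=lambda pair: pair[0], reverse=True)
    # Threshold filter last: drops low-similarity noise even inside the top-k.
    return [(score, c["text"]) for score, c in scored[:top_k] if score >= similarity_threshold]
```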
Generation:
- Grounded prompt template
- Context injection
- Citation enforcement
- Anti-hallucination rules
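A possible shape for the grounded template (the wording and chunk fields are illustrative; the point is that context injection, citation enforcement, and the anti-hallucination rule all live in the template):

```python
def build_grounded_prompt(question: str, chunks: list[dict]) -> str:
    """Assemble a grounded prompt: numbered context blocks, a citation rule,
    and an instruction to refuse rather than invent an answer."""
    context = "\n".join(f"[{i}] {c['text']} (page {c['page']})"
                        for i, c in enumerate(chunks, start=1))
    return (
        "Answer ONLY from the context below. If the context does not contain "
        "the answer, say so. Cite sources as [n] after each claim.\n\n"
        f"CONTEXT:\n{context}\n\n"
        f"QUESTION: {question}\n"
    )
```

Numbering the context blocks is what makes citation enforcement checkable: the evaluation stage can verify that every `[n]` in the answer maps to a retrieved chunk.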
Security:
- Prompt injection mitigation
- Input validation
- Context isolation
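A minimal sketch of the validation layer (the limit and the patterns are illustrative; pattern screens catch only naive injection phrasing and would be one layer among several):

```python
import re

MAX_QUERY_CHARS = 500  # illustrative limit, tuned per deployment

# Illustrative patterns; a real mitigation layer would be broader.
INJECTION_PATTERNS = [
    r"ignore (all |any )?(previous|prior) instructions",
    r"system prompt",
]

def validate_query(query: str) -> str:
    """Input validation: length control plus a naive prompt-injection screen."""
    query = query.strip()
    if not query:
        raise ValueError("empty query")
    if len(query) > MAX_QUERY_CHARS:
        raise ValueError("query too long")
    lowered = query.lower()
    for pattern in INJECTION_PATTERNS:
        if re.search(pattern, lowered):
            raise ValueError("query rejected by injection filter")
    return query
```

Context isolation is the complementary defense: retrieved chunks are injected as clearly delimited data (as in the grounded template), never as instructions.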
Metrics:
- Context relevance
- Faithfulness
- Latency
- Token usage
An MLOps mindset is applied from the start.
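Latency and token usage are the two metrics that need no judge model, so they can be captured from day one. A small stdlib sketch (field names are illustrative; Stage 4+ would export these to Prometheus instead of holding them in a dataclass):

```python
import time
from dataclasses import dataclass

@dataclass
class QueryMetrics:
    """Per-request metrics the evaluation stage records."""
    latency_s: float = 0.0
    prompt_tokens: int = 0
    completion_tokens: int = 0

def measure(fn, *args, **kwargs):
    """Run a callable and capture wall-clock latency alongside its result."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

Context relevance and faithfulness, by contrast, require comparing answers against retrieved chunks and are scored offline.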
- No secrets in Git
- .env excluded via .gitignore
- Kubernetes Secrets for API keys
- Input length control
- Prompt injection mitigation
- Mandatory citation policy
| Parameter | Impact |
|---|---|
| chunk_size | Retrieval granularity |
| chunk_overlap | Context continuity |
| top_k | Retrieval depth |
| similarity_threshold | Noise filtering |
| temperature | Determinism |
| max_tokens | Cost control |
| embedding_model | Vector quality |
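The parameters above naturally collect into one config object; a sketch with frozen defaults (the values and the model name are illustrative starting points, not recommendations):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RAGConfig:
    """Tuning parameters in one place, so experiments vary one field at a time."""
    chunk_size: int = 500                 # retrieval granularity
    chunk_overlap: int = 100              # context continuity
    top_k: int = 4                        # retrieval depth
    similarity_threshold: float = 0.3     # noise filtering
    temperature: float = 0.0              # determinism
    max_tokens: int = 512                 # cost control
    embedding_model: str = "all-MiniLM-L6-v2"  # vector quality (example model)
```

Freezing the dataclass keeps a given experiment's config immutable, which makes runs reproducible and easy to log.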
Deployment workflow:
- Start Minikube
- Build Docker image
- Apply namespace
- Deploy vector DB
- Deploy RAGForge API
- Expose via Ingress
- Query via browser / curl
This simulates a real production deployment workflow.
Planned enhancements:
- Hybrid Search (BM25 + Vector)
- Re-ranking layer
- HPA autoscaling
- Prometheus metrics export
- Grafana dashboards
- CI/CD with GitHub Actions
- Helm chart
- Multi-environment setup (dev/stage/prod)
After completion, this project demonstrates:
- Kubernetes-native AI system design
- Hybrid RAG architecture
- AI Infrastructure thinking
- Observability awareness
- Production-oriented DevOps skills
Target Roles:
- AI Infrastructure Engineer
- MLOps Engineer
- LLMOps Engineer
- Platform Engineer (AI Focus)