🧠 AI Knowledge Service — RAG & FastAPI

This project is a refactored, production-ready version of the original rizts/rag-ai-concept.
It is more structured and modular, built for real-world deployment, with a clear separation between the backend (FastAPI) and the frontend (Vite + React).

🔗 Live Demo:


🚀 Overview

This project demonstrates a complete AI pipeline with:

  • 🧩 Backend (FastAPI) — API layer, Gemini integration, and LangChain-based chunking.
  • ⚛️ Frontend (Vite + React + TypeScript) — interactive RAG chat UI.
  • 🧠 AI Layer — intelligent chunking, multiple embedding providers (Jina AI as default), semantic retrieval, and Gemini answer generation.
  • 💾 Vector Database (Qdrant Cloud) — document storage and vector similarity search.
  • 🚀 CI/CD — Automated deployment via GitHub Actions to Railway (backend) + Vercel (frontend).

The goal is to demonstrate a production-level Retrieval-Augmented Generation (RAG) system that can:

  1. Process documents and intelligently chunk them using LangChain.
  2. Generate embeddings with any of several providers (Jina AI by default) and store them in the Qdrant vector database.
  3. Expose APIs for semantic search and knowledge retrieval.
  4. Integrate with modern DevOps practices to showcase the full AI orchestration lifecycle.

🔄 Important Change: Multiple Embedding Providers

Recent Update: We've implemented support for multiple embedding providers to give you flexibility in choosing the best service for your needs. The system now supports:

  • Jina AI (default) - Free 8000 requests/day
  • Cohere - 1000 requests/month free
  • Voyage AI - 20M tokens/month free
  • HuggingFace - Legacy option for local models

Benefits:

  • 🔄 Flexibility - Choose the provider that best fits your use case
  • 💰 Cost optimization - Select based on pricing and free tier options
  • 🚀 Performance - Different providers offer different strengths
  • 🛡️ Privacy - Option to use local models with HuggingFace

Configuration:

  • Provide your JINA_API_KEY in environment variables
  • The service uses jina-embeddings-v3 via the Jina API

Note: If you encounter issues with the default model not being available via the feature_extraction API, consider an alternative such as sentence-transformers/all-MiniLM-L6-v2, which is more reliably supported on the Jina Inference API.

This approach is specifically optimized for resource-constrained environments like Railway's free tier.
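
For illustration, here is a minimal sketch of how a provider switch like this can be wired up. This is not the repository's actual embeddings.py; the function and registry names are hypothetical, and only the Jina branch is shown:

import os
import requests

def embed_jina(texts: list[str]) -> list[list[float]]:
    # Jina's OpenAI-compatible embeddings endpoint
    resp = requests.post(
        "https://api.jina.ai/v1/embeddings",
        headers={"Authorization": f"Bearer {os.environ['JINA_API_KEY']}"},
        json={"model": "jina-embeddings-v3", "input": texts},
        timeout=30,
    )
    resp.raise_for_status()
    return [item["embedding"] for item in resp.json()["data"]]

# Hypothetical registry; cohere, voyage, and huggingface clients would register here too
PROVIDERS = {"jina": embed_jina}

def get_embedder():
    return PROVIDERS[os.getenv("EMBEDDING_PROVIDER", "jina")]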


🏗️ Architecture Overview

graph TD
  A[📄 Document Upload] --> B[🔍 Intelligent Chunking LangChain]
  B --> C["🔢 Multiple Embedding Providers (Jina AI Default)"]
  C --> D[(🧠 Qdrant Vector DB)]
  E[💬 Query Request] --> F[Semantic Retrieval + Contextual Search]
  F --> G[🧠 Gemini Generative Response]
  G --> H[💡 Answer Output React UI]

🗂️ Project Structure

rag-stack/
├── .github/
│   └── workflows/
│       └── deploy.yml          # CI/CD configuration
├── backend/
│   ├── app/
│   │   ├── api/
│   │   │   ├── health.py       # Health check endpoint
│   │   │   └── rag.py          # RAG endpoints (index, query)
│   │   ├── core/
│   │   │   ├── config.py       # Centralized configuration
│   │   │   └── logger.py       # Logging setup
│   │   ├── services/
│   │   │   ├── embeddings.py   # Multi-provider embedding clients (Jina default)
│   │   │   ├── semantic.py     # Semantic search logic
│   │   │   └── vectorstore_qdrant.py      # Qdrant integration
│   │   ├── utils/
│   │   │   └── chunking.py     # LangChain text chunking
│   │   └── main.py             # FastAPI application
│   ├── tests/
│   │   └── test_rag_basic.py   # Unit tests
│   ├── Dockerfile
│   ├── requirements.txt
│   └── requirements-dev.txt
├── frontend/
│   ├── src/
│   │   ├── api/
│   │   │   └── rag.ts          # API client
│   │   ├── components/
│   │   │   ├── AnswerCard.tsx  # Answer display
│   │   │   └── QueryForm.tsx   # Query input form
│   │   ├── App.tsx
│   │   └── main.tsx
│   ├── package.json
│   ├── vite.config.ts
│   └── vercel.json             # Vercel configuration
└── README.md

🛠️ Local Development Setup

Prerequisites

  • Python 3.11+
  • Node.js 20+
  • Docker (for local Qdrant)
  • Git

1. Clone Repository

git clone https://github.com/rizts/rag-stack.git
cd rag-stack

2. Backend Setup

cd backend

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Copy environment file
cp .env.example .env

# Edit .env and add your API keys:
# - GEMINI_API_KEY
# - JINA_API_KEY
# - QDRANT_URL (use localhost for local dev)

3. Run Qdrant (Local Development)

cd backend/docker
docker compose up -d

Qdrant will run at http://localhost:6333
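
To confirm the local instance is reachable before starting the backend, a quick sanity check with the qdrant-client Python package (assumed to be installed via requirements.txt):

from qdrant_client import QdrantClient

client = QdrantClient(url="http://localhost:6333")
print(client.get_collections())  # prints an empty collection list on a fresh instance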

4. Run Backend

cd backend
uvicorn app.main:app --reload

Backend runs at http://localhost:8000

Check health: http://localhost:8000/health

5. Frontend Setup

cd frontend

# Install dependencies
npm install

# Copy environment file
cp .env.example .env

# Edit .env:
# VITE_API_BASE=http://localhost:8000

6. Run Frontend

npm run dev

Frontend runs at http://localhost:5173


🌐 Production Deployment

CI/CD Pipeline

This project uses GitHub Actions for automated deployment:

Workflow Overview

Trigger: Push to main or Pull Request
├── 1. Detect Changes (backend/frontend/both)
├── 2. Run Tests
│   ├── Backend: pytest
│   └── Frontend: lint + build
├── 3. Deploy Backend (if backend changed)
│   └── Railway auto-deploys via GitHub integration
└── 4. Deploy Frontend (if frontend changed)
    └── Vercel deployment via CLI

Setup Instructions

1. Railway Setup (Backend)

  • Sign up at railway.app
  • Create new project → Connect GitHub repo
  • Configure:
    • Root Directory: backend
    • Builder: Dockerfile
    • Add environment variables (see below)
  • Generate domain in Settings → Networking

2. Vercel Setup (Frontend)

cd frontend
vercel login
vercel link

Get credentials from .vercel/project.json:

  • orgId → VERCEL_ORG_ID
  • projectId → VERCEL_PROJECT_ID

Generate token: vercel.com/account/tokens

3. GitHub Secrets

Add these secrets in Settings → Secrets and variables → Actions:

VERCEL_TOKEN=xxx
VERCEL_ORG_ID=team_xxx
VERCEL_PROJECT_ID=prj_xxx

4. Environment Variables

Railway (Backend):

GEMINI_API_KEY=your_key

# Embedding Provider Configuration
EMBEDDING_PROVIDER=jina  # Options: jina, cohere, voyage, huggingface
JINA_API_KEY=your_key    # Required when EMBEDDING_PROVIDER=jina
# COHERE_API_KEY=your_key  # Required when EMBEDDING_PROVIDER=cohere
# VOYAGE_API_KEY=your_key  # Required when EMBEDDING_PROVIDER=voyage
# HF_API_KEY=your_key      # Required when EMBEDDING_PROVIDER=huggingface

QDRANT_URL=https://your-cluster.cloud.qdrant.io:6333
QDRANT_API_KEY=your_key
CORS_ORIGINS=https://your-app.vercel.app
ENVIRONMENT=production

Vercel (Frontend):

VITE_API_BASE=https://rag-stack-production.up.railway.app

📝 Environment Variables

Backend (.env)

# FastAPI Configuration
APP_NAME=RAG AI Backend
ENVIRONMENT=development
APP_PORT=8000

# Google Gemini
GEMINI_API_KEY=your_gemini_api_key

# Embedding Provider Configuration (Choose one)
EMBEDDING_PROVIDER=jina  # Options: jina, cohere, voyage, huggingface
JINA_API_KEY=your_jina_api_key    # Required when EMBEDDING_PROVIDER=jina
# COHERE_API_KEY=your_cohere_api_key  # Required when EMBEDDING_PROVIDER=cohere
# VOYAGE_API_KEY=your_voyage_api_key  # Required when EMBEDDING_PROVIDER=voyage
# HF_API_KEY=your_hf_api_key          # Required when EMBEDDING_PROVIDER=huggingface
JINA_MODEL_NAME=jina-embeddings-v3  # Used when EMBEDDING_PROVIDER=jina
# Alternative models if the default is unavailable:
# EMBEDDING_MODEL_NAME=sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
# EMBEDDING_MODEL_NAME=sentence-transformers/all-MiniLM-L6-v2
# EMBEDDING_MODEL_NAME=intfloat/multilingual-e5-small

# Qdrant
QDRANT_URL=http://localhost:6333
QDRANT_API_KEY=

# CORS
CORS_ORIGINS=http://localhost:5173,http://localhost:3000

# Chunking
CHUNK_SIZE=1000
CHUNK_OVERLAP=100
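
CHUNK_SIZE and CHUNK_OVERLAP map straight onto LangChain's recursive splitter. A minimal sketch of the kind of helper app/utils/chunking.py provides; the function name is illustrative, not necessarily the repo's:

from langchain_text_splitters import RecursiveCharacterTextSplitter
# (older LangChain versions: from langchain.text_splitter import RecursiveCharacterTextSplitter)

def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 100) -> list[str]:
    # Splits on paragraphs, then sentences, then characters, keeping
    # chunk_overlap characters of context between adjacent chunks
    splitter = RecursiveCharacterTextSplitter(
        chunk_size=chunk_size,
        chunk_overlap=chunk_overlap,
    )
    return splitter.split_text(text)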

Frontend (.env)

# API Configuration
VITE_API_BASE=http://localhost:8000

# Port
VITE_PORT=5173

🧠 RAG API Endpoints

1️⃣ Health Check

curl http://localhost:8000/health

Response:

{"status": "ok"}

2️⃣ Index Content

curl -X POST "http://localhost:8000/rag/index" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "content=FastAPI is a modern Python web framework for building APIs."

Response:

{
  "status": "indexed",
  "detail": {
    "chunks_indexed": 1
  }
}

3️⃣ Query Knowledge Base

curl -X POST "http://localhost:8000/rag/query" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "query=What is FastAPI?"

Response:

{
  "query": "What is FastAPI?",
  "answer": "FastAPI is a modern Python web framework designed for building APIs quickly and efficiently...",
  "context_used": ["FastAPI is a modern Python web framework..."]
}
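
Because both endpoints accept form-encoded bodies, calling them from Python is straightforward (a usage sketch with the requests library; data= sends application/x-www-form-urlencoded):

import requests

BASE = "http://localhost:8000"

# Index a short document
requests.post(
    f"{BASE}/rag/index",
    data={"content": "FastAPI is a modern Python web framework."},
).raise_for_status()

# Query the knowledge base and print the generated answer
resp = requests.post(f"{BASE}/rag/query", data={"query": "What is FastAPI?"})
print(resp.json()["answer"])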

🧪 Testing

Backend Tests

cd backend
pytest tests/ -v

Frontend Tests

cd frontend
npm run lint
npm run build

🎯 Key Features

Smart Path-Based Deployment

The CI/CD pipeline intelligently detects changes:

  • Backend changes → Tests + Railway auto-deploy
  • Frontend changes → Tests + Vercel deploy
  • Root changes → Run all tests

This saves GitHub Actions minutes and speeds up deployment!

Preview Deployments

Pull requests automatically get preview deployments:

  • Frontend preview on Vercel
  • Comments in PR with preview URL

🧩 Tech Stack

| Layer | Technology |
| --- | --- |
| Backend | FastAPI, Python 3.11 |
| Frontend | React 18, TypeScript, Vite |
| AI/ML | LangChain, Google Gemini |
| Embeddings | Multiple providers (Jina AI, Cohere, Voyage AI, HuggingFace) |
| Vector DB | Qdrant Cloud |
| Deployment | Railway (backend), Vercel (frontend) |
| CI/CD | GitHub Actions |
| Containerization | Docker |

🔄 RAG Pipeline Flow

| Stage | Component | Description |
| --- | --- | --- |
| 1️⃣ Chunking | LangChain | Intelligent text splitting (size and overlap configurable) |
| 2️⃣ Embedding | Multiple providers (Jina AI default) | Generate vector embeddings with the selected provider |
| 3️⃣ Storage | Qdrant Cloud | Store embeddings and metadata |
| 4️⃣ Query | Semantic search | Retrieve contextually similar chunks |
| 5️⃣ Generation | Gemini | Compose human-like, context-aware answers |
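
Stages 4 and 5 together look roughly like the sketch below, using qdrant-client for retrieval and the google-generativeai SDK for generation. The collection name, payload key, and model name are assumptions, not values confirmed by this repository:

import os
import google.generativeai as genai
from qdrant_client import QdrantClient

genai.configure(api_key=os.environ["GEMINI_API_KEY"])
qdrant = QdrantClient(url=os.environ.get("QDRANT_URL", "http://localhost:6333"))

def answer(query: str, query_vector: list[float]) -> str:
    # Stage 4: retrieve the most similar chunks (collection/payload names assumed)
    hits = qdrant.search(collection_name="documents", query_vector=query_vector, limit=5)
    context = "\n".join(hit.payload["text"] for hit in hits)
    # Stage 5: compose a context-grounded answer with Gemini
    model = genai.GenerativeModel("gemini-1.5-flash")
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return model.generate_content(prompt).text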

🚀 Deployment Checklist

  • Set up Railway account & project
  • Configure Railway environment variables
  • Set up Vercel account & link project
  • Add GitHub secrets (Vercel tokens)
  • Update BACKEND_URL in workflow
  • Update VITE_API_BASE in Vercel settings
  • Test local development
  • Push to main → Verify deployments
  • Create PR → Test preview deployment

📊 Monitoring

  • Railway: Built-in logs and metrics dashboard
  • Vercel: Analytics and deployment logs
  • GitHub Actions: Workflow runs and job details

🤝 Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License. See the LICENSE file for details.


👨‍💻 Author

Risdy
Remote Software Engineer (since 2013)
AI & Fullstack Developer — FastAPI | React | LangChain | Gemini | Jina | Qdrant
📍 Based in Indonesia


🙏 Acknowledgments

  • Original concept: rag-ai-concept
  • Built with modern AI/ML frameworks
  • Deployed with cloud-native infrastructure

⭐ If you find this project useful, please give it a star!

❤️ Happy coding!
