This project is a refactored, production-ready version of the original rizts/rag-ai-concept.
It aims to be more structured, modular, and ready for real-world deployment, with a clear separation between Backend (FastAPI) and Frontend (Vite + ReactJS).
🔗 Live Demo:
- Frontend: Deployed on Vercel
- Backend API: Deployed on Railway
This project demonstrates a complete AI pipeline with:
- 🧩 Backend (FastAPI) — API layer, Gemini integration, and LangChain-based chunking.
- ⚛️ Frontend (Vite + React + TypeScript) — interactive RAG chat UI.
- 🧠 AI Layer — intelligent chunking, multiple embedding providers (Jina AI as default), semantic retrieval, and Gemini answer generation.
- 💾 Vector Database (Qdrant Cloud) — document storage and vector similarity search.
- 🚀 CI/CD — Automated deployment via GitHub Actions to Railway (backend) + Vercel (frontend).
The goal is to demonstrate a production-level Retrieval-Augmented Generation (RAG) system that can:
- Process documents and intelligently chunk them using LangChain.
- Generate embeddings via multiple providers (Jina AI by default) and store them in the Qdrant vector database.
- Expose APIs for semantic search and knowledge retrieval.
- Integrate with modern DevOps practices to showcase the full AI orchestration lifecycle.
Recent Update: We've implemented support for multiple embedding providers to give you flexibility in choosing the best service for your needs. The system now supports:
- Jina AI (default) - Free 8000 requests/day
- Cohere - 1000 requests/month free
- Voyage AI - 20M tokens/month free
- HuggingFace - Legacy option for local models
Benefits:
- 🔄 Flexibility - Choose the provider that best fits your use case
- 💰 Cost optimization - Select based on pricing and free tier options
- 🚀 Performance - Different providers offer different strengths
- 🛡️ Privacy - Option to use local models with HuggingFace
Configuration:
- Provide your `JINA_API_KEY` in environment variables.
- The service uses `jina-embeddings-v3` via the Jina API.

Note: If you encounter issues with the default model not being available via the `feature_extraction` API, consider alternative models like `sentence-transformers/all-MiniLM-L6-v2`, which are more reliably supported on the Jina Inference API.
This approach is specifically optimized for resource-constrained environments like Railway's free tier.
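For illustration, here is a minimal sketch of what the Jina provider call might look like, assuming Jina's public embeddings REST endpoint; the repo's actual implementation lives in `backend/app/services/embeddings.py` and may differ:

```python
# Minimal sketch only; assumes Jina's public embeddings endpoint.
# Not the repo's actual embeddings.py implementation.
import os
import requests

def embed_texts(texts: list[str]) -> list[list[float]]:
    resp = requests.post(
        "https://api.jina.ai/v1/embeddings",
        headers={"Authorization": f"Bearer {os.environ['JINA_API_KEY']}"},
        json={
            "model": os.getenv("JINA_MODEL_NAME", "jina-embeddings-v3"),
            "input": texts,
        },
        timeout=30,
    )
    resp.raise_for_status()
    return [item["embedding"] for item in resp.json()["data"]]
```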
```mermaid
graph TD
    A[📄 Document Upload] --> B["🔍 Intelligent Chunking (LangChain)"]
    B --> C["🔢 Multiple Embedding Providers (Jina AI default)"]
    C --> D[(🧠 Qdrant Vector DB)]
    E[💬 Query Request] --> F[Semantic Retrieval + Contextual Search]
    F --> G[🧠 Gemini Generative Response]
    G --> H["💡 Answer Output (React UI)"]
```
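The query path in the diagram maps onto a FastAPI handler shaped roughly like the sketch below. The helper functions are illustrative stubs; the real logic lives in `backend/app/services/` and `backend/app/api/rag.py`.

```python
# Illustrative sketch of the /rag/query flow; stubs stand in for the
# real Qdrant retrieval and Gemini generation services.
from fastapi import FastAPI, Form

app = FastAPI()

def search_similar_chunks(query: str) -> list[str]:
    """Stub: the real service embeds the query and searches Qdrant."""
    return ["FastAPI is a modern Python web framework..."]

def generate_with_gemini(query: str, context: list[str]) -> str:
    """Stub: the real service prompts Gemini with the query + context."""
    return "FastAPI is a modern Python web framework for building APIs."

@app.post("/rag/query")
def rag_query(query: str = Form(...)):
    context = search_similar_chunks(query)
    answer = generate_with_gemini(query, context)
    return {"query": query, "answer": answer, "context_used": context}
```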
```text
rag-stack/
├── .github/
│   └── workflows/
│       └── deploy.yml                 # CI/CD configuration
├── backend/
│   ├── app/
│   │   ├── api/
│   │   │   ├── health.py              # Health check endpoint
│   │   │   └── rag.py                 # RAG endpoints (index, query)
│   │   ├── core/
│   │   │   ├── config.py              # Centralized configuration
│   │   │   └── logger.py              # Logging setup
│   │   ├── services/
│   │   │   ├── embeddings.py          # Embedding providers (Jina, Cohere, Voyage, HF)
│   │   │   ├── semantic.py            # Semantic search logic
│   │   │   └── vectorstore_qdrant.py  # Qdrant integration
│   │   ├── utils/
│   │   │   └── chunking.py            # LangChain text chunking
│   │   └── main.py                    # FastAPI application
│   ├── tests/
│   │   └── test_rag_basic.py          # Unit tests
│   ├── Dockerfile
│   ├── requirements.txt
│   └── requirements-dev.txt
├── frontend/
│   ├── src/
│   │   ├── api/
│   │   │   └── rag.ts                 # API client
│   │   ├── components/
│   │   │   ├── AnswerCard.tsx         # Answer display
│   │   │   └── QueryForm.tsx          # Query input form
│   │   ├── App.tsx
│   │   └── main.tsx
│   ├── package.json
│   ├── vite.config.ts
│   └── vercel.json                    # Vercel configuration
└── README.md
```
- Python 3.11+
- Node.js 20+
- Docker (for local Qdrant)
- Git
Clone the repository:

```bash
git clone https://github.com/rizts/rag-stack.git
cd rag-stack
```

Set up the backend:

```bash
cd backend
# Create virtual environment
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Copy environment file
cp .env.example .env
# Edit .env and add your API keys:
# - GEMINI_API_KEY
# - JINA_API_KEY
# - QDRANT_URL (use localhost for local dev)
```

Start a local Qdrant instance:

```bash
cd backend/docker
docker compose up -d
```

Qdrant will run at http://localhost:6333.
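Optionally, you can verify the container is reachable with a short Python check (requires `qdrant-client`; this snippet is a convenience, not part of the repo):

```python
# Quick connectivity check for the local Qdrant container.
from qdrant_client import QdrantClient

client = QdrantClient(url="http://localhost:6333")
print(client.get_collections())  # empty collection list on a fresh instance
```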
Run the backend:

```bash
cd backend
uvicorn app.main:app --reload
```

Backend runs at http://localhost:8000.
Check health: http://localhost:8000/health
Set up the frontend:

```bash
cd frontend
# Install dependencies
npm install
# Copy environment file
cp .env.example .env
# Edit .env:
# VITE_API_BASE=http://localhost:8000

npm run dev
```

Frontend runs at http://localhost:5173.
This project uses GitHub Actions for automated deployment:
```text
Trigger: Push to main or Pull Request
├── 1. Detect Changes (backend/frontend/both)
├── 2. Run Tests
│   ├── Backend: pytest
│   └── Frontend: lint + build
├── 3. Deploy Backend (if backend changed)
│   └── Railway auto-deploys via GitHub integration
└── 4. Deploy Frontend (if frontend changed)
    └── Vercel deployment via CLI
```

1. Railway Setup (Backend)
- Sign up at railway.app
- Create new project → Connect GitHub repo
- Configure:
  - Root Directory: `backend`
  - Builder: `Dockerfile`
  - Add environment variables (see below)
- Generate a domain in Settings → Networking
2. Vercel Setup (Frontend)
```bash
cd frontend
vercel login
vercel link
```

Get credentials from `.vercel/project.json`:

- `orgId` → `VERCEL_ORG_ID`
- `projectId` → `VERCEL_PROJECT_ID`
Generate token: vercel.com/account/tokens
3. GitHub Secrets
Add these secrets in Settings → Secrets and variables → Actions:
```env
VERCEL_TOKEN=xxx
VERCEL_ORG_ID=team_xxx
VERCEL_PROJECT_ID=prj_xxx
```
4. Environment Variables
Railway (Backend):
```env
GEMINI_API_KEY=your_key
# Embedding Provider Configuration
EMBEDDING_PROVIDER=jina # Options: jina, cohere, voyage, huggingface
JINA_API_KEY=your_key # Required when EMBEDDING_PROVIDER=jina
# COHERE_API_KEY=your_key # Required when EMBEDDING_PROVIDER=cohere
# VOYAGE_API_KEY=your_key # Required when EMBEDDING_PROVIDER=voyage
# HF_API_KEY=your_key # Required when EMBEDDING_PROVIDER=huggingface
QDRANT_URL=https://your-cluster.cloud.qdrant.io:6333
QDRANT_API_KEY=your_key
CORS_ORIGINS=https://your-app.vercel.app
ENVIRONMENT=production
```

Vercel (Frontend):

```env
VITE_API_BASE=https://rag-stack-production.up.railway.app
```

Local backend environment (`backend/.env`):

```env
# FastAPI Configuration
APP_NAME=RAG AI Backend
ENVIRONMENT=development
APP_PORT=8000
# Google Gemini
GEMINI_API_KEY=your_gemini_api_key
# Embedding Provider Configuration (Choose one)
EMBEDDING_PROVIDER=jina # Options: jina, cohere, voyage, huggingface
JINA_API_KEY=your_jina_api_key # Required when EMBEDDING_PROVIDER=jina
# COHERE_API_KEY=your_cohere_api_key # Required when EMBEDDING_PROVIDER=cohere
# VOYAGE_API_KEY=your_voyage_api_key # Required when EMBEDDING_PROVIDER=voyage
# HF_API_KEY=your_hf_api_key # Required when EMBEDDING_PROVIDER=huggingface
JINA_MODEL_NAME=jina-embeddings-v3 # Used when EMBEDDING_PROVIDER=jina
# Alternative models if the default doesn't work:
# EMBEDDING_MODEL_NAME=sentence-transformers/all-MiniLM-L6-v2
# EMBEDDING_MODEL_NAME=sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
# EMBEDDING_MODEL_NAME=intfloat/multilingual-e5-small
# Qdrant
QDRANT_URL=http://localhost:6333
QDRANT_API_KEY=
# CORS
CORS_ORIGINS=http://localhost:5173,http://localhost:3000
# Chunking
CHUNK_SIZE=1000
CHUNK_OVERLAP=100
```

Local frontend environment (`frontend/.env`):

```env
# API Configuration
VITE_API_BASE=http://localhost:8000
# Port
VITE_PORT=5173
```

Health check:

```bash
curl http://localhost:8000/health
```

Response:

```json
{"status": "ok"}
```

Index a document:

```bash
curl -X POST "http://localhost:8000/rag/index" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "content=FastAPI is a modern Python web framework for building APIs."
```

Response:

```json
{
  "status": "indexed",
  "detail": {
    "chunks_indexed": 1
  }
}
```

Query the knowledge base:

```bash
curl -X POST "http://localhost:8000/rag/query" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "query=What is FastAPI?"
```

Response:

```json
{
  "query": "What is FastAPI?",
  "answer": "FastAPI is a modern Python web framework designed for building APIs quickly and efficiently...",
  "context_used": ["FastAPI is a modern Python web framework..."]
}
```
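The same calls can also be scripted; here is a small Python equivalent of the curl examples above, using `requests` with form-encoded bodies as the endpoints expect (a convenience snippet, not part of the repo):

```python
# Python equivalent of the curl examples above (form-encoded bodies).
import requests

BASE = "http://localhost:8000"

r = requests.post(f"{BASE}/rag/index",
                  data={"content": "FastAPI is a modern Python web framework."})
print(r.json())  # {"status": "indexed", "detail": {"chunks_indexed": 1}}

r = requests.post(f"{BASE}/rag/query", data={"query": "What is FastAPI?"})
print(r.json()["answer"])
```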
Run backend tests:

```bash
cd backend
pytest tests/ -v
```

Run frontend lint and build:

```bash
cd frontend
npm run lint
npm run build
```

The CI/CD pipeline intelligently detects changes:
- Backend changes → Tests + Railway auto-deploy
- Frontend changes → Tests + Vercel deploy
- Root changes → Run all tests
This saves GitHub Actions minutes and speeds up deployment!
Pull requests automatically get preview deployments:
- Frontend preview on Vercel
- Comments in PR with preview URL
| Layer | Technology |
|---|---|
| Backend | FastAPI, Python 3.11 |
| Frontend | React 18, TypeScript, Vite |
| AI/ML | LangChain, Google Gemini |
| Embeddings | Multiple Providers (Jina AI, Cohere, Voyage AI, HuggingFace) |
| Vector DB | Qdrant Cloud |
| Deployment | Railway (backend), Vercel |
| CI/CD | GitHub Actions |
| Containerization | Docker |
| Stage | Component | Description |
|---|---|---|
| 1️⃣ Chunking | LangChain | Intelligent text splitting (size + overlap configurable) |
| 2️⃣ Embedding | Multiple Providers (Jina AI Default) | Generate vector embeddings using selected provider |
| 3️⃣ Storage | Qdrant Cloud | Store embeddings and metadata |
| 4️⃣ Query | Semantic Search | Retrieve contextually similar chunks |
| 5️⃣ Generation | Gemini | Compose human-like, context-aware answers |
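Stage 1 is driven by the `CHUNK_SIZE` and `CHUNK_OVERLAP` settings shown in the backend `.env`. Here is a standalone sketch using LangChain's recursive splitter; the exact splitter class used in `utils/chunking.py` is an assumption:

```python
# Standalone chunking sketch; mirrors CHUNK_SIZE / CHUNK_OVERLAP from .env.
from langchain_text_splitters import RecursiveCharacterTextSplitter

text = "FastAPI is a modern Python web framework. " * 100  # sample long text

splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,    # CHUNK_SIZE
    chunk_overlap=100,  # CHUNK_OVERLAP
)
chunks = splitter.split_text(text)
print(f"{len(chunks)} chunks, first chunk: {chunks[0][:60]}...")
```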
- Set up Railway account & project
- Configure Railway environment variables
- Set up Vercel account & link project
- Add GitHub secrets (Vercel tokens)
- Update `BACKEND_URL` in the workflow
- Update `VITE_API_BASE` in Vercel settings
- Test local development
- Push to main → Verify deployments
- Create PR → Test preview deployment
- Railway: Built-in logs and metrics dashboard
- Vercel: Analytics and deployment logs
- GitHub Actions: Workflow runs and job details
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License. See the LICENSE file for details.
Risdy
Remote Software Engineer (since 2013)
AI & Fullstack Developer — FastAPI | React | LangChain | Gemini | Jina | Qdrant
📍 Based in Indonesia
- Original concept: rag-ai-concept
- Built with modern AI/ML frameworks
- Deployed with cloud-native infrastructure
⭐ If you find this project useful, please give it a star!
❤️ Happy coding!