nitinchaube/AdvisorAI

AdvisorAI

Intelligent Academic Advising Chatbot for Stevens Institute of Technology

Multi-Agent RAG Pipeline · Fine-Tuned LLaMA-2-7B · LangGraph Orchestration

Live Demo · Research Paper · Fine-Tuning & Dataset · HuggingFace Dataset · HuggingFace Model


Overview

AdvisorAI is a production-grade academic advising chatbot that answers questions about courses, professors, programs, admissions, and campus life at Stevens Institute of Technology. Answers are grounded in retrieved sources and checked by a self-reflection step, with the goal of eliminating hallucination.

Unlike a simple ChatGPT wrapper, AdvisorAI uses a multi-agent RAG (Retrieval-Augmented Generation) pipeline orchestrated by LangGraph, with entity-aware hybrid retrieval, self-reflection quality gates, and real-time SSE streaming. In parallel, we have fine-tuned LLaMA-2-7B using QLoRA on 87,000+ Q&A pairs for future self-hosted generation.


Key Features

| Feature | Description |
| --- | --- |
| Multi-Agent RAG | LangGraph orchestrator coordinates ChromaDB retrieval, web search, and conversation history agents in parallel |
| Entity-Aware Hybrid Retrieval | Regex-based course code & faculty name detection routes to targeted ChromaDB collections in sub-millisecond time |
| Self-Reflection Quality Gate | LLM critiques its own answers (1-10 score); refines if quality < 7/10 |
| Real-Time Streaming | SSE token-by-token streaming with source citations (Perplexity-style) |
| Fine-Tuned LLaMA-2-7B | QLoRA fine-tuning on 87K domain-specific Q&A pairs for future self-hosted generation |
| Multi-Provider LLM | Gemini 2.0 Flash (primary) + GPT-4o-mini (fallback) with automatic failover |
| Multi-Layer Safety | Regex blocking + LLM classification + identity protection + tech stack shielding |
| Admin Dashboard | Manage courses, faculty, users, web scraper, jobs/internships |
| Resume Processing | AI-powered resume parsing for personalized academic advising |
| Jobs & Internships | Automated scraping and search for career opportunities |
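
The entity-aware routing listed in the table can be pictured with a small sketch. This is an illustration only, not the code in hybrid_retriever.py: the prefix set, the faculty names, the collection names (`courses`, `faculty`, `general`), the metadata keys, and the function `route_query` are all hypothetical.

```python
import re

# Hypothetical subset of course prefixes (the real system is said to know 80+).
COURSE_PREFIXES = {"CS", "EE", "MA", "CPE", "SSW"}
# Hypothetical faculty names, for illustration only.
FACULTY_NAMES = {"Jane Doe", "John Smith"}

# Matches forms such as "CS 546", "cs546", or "CS-546".
COURSE_CODE_RE = re.compile(r"\b([A-Za-z]{2,4})[\s-]?(\d{3})\b")


def route_query(query: str) -> dict:
    """Pick a target collection and metadata filter for a query."""
    for match in COURSE_CODE_RE.finditer(query):
        prefix, number = match.group(1).upper(), match.group(2)
        if prefix in COURSE_PREFIXES:
            return {
                "collection": "courses",
                "filter": {"course_code": f"{prefix} {number}"},
            }
    lowered = query.lower()
    for name in FACULTY_NAMES:
        if name.lower() in lowered:
            return {"collection": "faculty", "filter": {"name": name}}
    # No entity found: fall back to semantic search across collections.
    return {"collection": "general", "filter": None}
```

A query that names no known course or faculty member falls through to the `general` branch, which corresponds to the semantic-search fallback.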

Architecture

┌──────────────────────────────────────────────────────┐
│          Presentation Layer (React 19 + Vite 5)      │
│   ChatInterface · Sessions · Admin Dashboard · Auth  │
└────────────────────────┬─────────────────────────────┘
                         │ HTTPS / SSE Streaming
                         ▼
┌──────────────────────────────────────────────────────┐
│             API Layer (FastAPI + Flask)               │
│   /api/chat/stream (SSE) · /api/chat/query · REST    │
└────────────────────────┬─────────────────────────────┘
                         │
                         ▼
┌──────────────────────────────────────────────────────┐
│    Intelligence Layer (LangGraph Orchestrator)        │
│                                                      │
│    Router → Gather → Evaluate → Generate →           │
│    Reflect → [Refine if <7] → Save                   │
│       │        │                                     │
│    Safety   ChromaAgent · WebAgent · HistoryAgent     │
│                │                                     │
│            HybridRetriever                           │
│            ├─ Entity Detection (sub-ms)              │
│            └─ Semantic Search (BGE embeddings)       │
└────────────────────────┬─────────────────────────────┘
                         │
                         ▼
┌──────────────────────────────────────────────────────┐
│                    Data Layer                         │
│  ChromaDB · MongoDB Atlas · Web Scraper · LLM APIs   │
└──────────────────────────────────────────────────────┘

Tech Stack

| Layer | Technology |
| --- | --- |
| Frontend | React 19, Vite 5, Tailwind CSS, Framer Motion |
| Authentication | Firebase Auth (email/password + verification) |
| Backend (Async) | FastAPI + Uvicorn |
| Backend (REST) | Flask (WSGI Middleware) |
| Database | MongoDB Atlas |
| Vector Database | ChromaDB 0.5.23 |
| Embedding Model | BAAI/bge-small-en-v1.5 |
| LLM (Primary) | Google Gemini 2.0 Flash |
| LLM (Fallback) | OpenAI GPT-4o-mini, Claude 3.5 Haiku |
| Orchestration | LangGraph (StateGraph) |
| Fine-Tuning | QLoRA on LLaMA-2-7B |
| Deployment | Docker, Google Cloud Run, Firebase Hosting |
| Web Search | DuckDuckGo, SerpAPI |

How It Works

When a user sends a message, it flows through a 7-node LangGraph pipeline:

  1. Router — Classifies the query as general, domain, or blocked. Runs regex safety checks (violence, profanity, tech probing) and LLM classification. Generates a chat session name.

  2. Gather — For domain queries, runs three agents in parallel using asyncio.gather():

    • ChromaAgent: Entity-aware hybrid retrieval — detects course codes (80+ Stevens prefixes) and faculty names via regex, routes to targeted ChromaDB collections with metadata filters. Falls back to multi-collection semantic search using BAAI/bge-small-en-v1.5.
    • WebAgent: Searches DuckDuckGo/SerpAPI, scrapes top results, cleans content.
    • HistoryAgent: Retrieves recent conversation context from memory.
  3. Evaluate — ReAct "think" step: assesses whether gathered info is sufficient.

  4. Generate — Synthesizes the answer from all context (history + ChromaDB docs + web results) using Gemini 2.0 Flash, with strict identity rules and source grounding.

  5. Reflect — Self-critique scoring (1-10) on relevance, accuracy, completeness, tone, and tech leakage. If score < 7, triggers refinement.

  6. Refine (conditional) — Improves the answer using reflection feedback.

  7. Save — Persists to in-memory store and MongoDB.
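
The Gather and Reflect/Refine steps above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the LangGraph code in langgraph_graph.py: the three agent functions are stubs, `generate`, `reflect`, and `refine` are hypothetical callables standing in for LLM calls, and the Router, Evaluate, and Save nodes are omitted.

```python
import asyncio

QUALITY_THRESHOLD = 7  # per the pipeline description: refine when score < 7


# Stub agents standing in for ChromaAgent, WebAgent, and HistoryAgent.
async def chroma_agent(query: str) -> list:
    await asyncio.sleep(0)  # placeholder for a vector-store lookup
    return [f"doc about {query}"]


async def web_agent(query: str) -> list:
    await asyncio.sleep(0)  # placeholder for search + scraping
    return [f"web result for {query}"]


async def history_agent(session_id: str) -> list:
    await asyncio.sleep(0)  # placeholder for a memory lookup
    return [f"previous turn in {session_id}"]


async def gather_context(query: str, session_id: str) -> dict:
    """Run the three agents concurrently and collect their results."""
    docs, web, history = await asyncio.gather(
        chroma_agent(query), web_agent(query), history_agent(session_id)
    )
    return {"docs": docs, "web": web, "history": history}


async def answer_query(query, session_id, generate, reflect, refine) -> str:
    """Gather -> Generate -> Reflect -> optional Refine."""
    context = await gather_context(query, session_id)
    answer = generate(query, context)
    score = reflect(answer)  # self-critique score on a 1-10 scale
    if score < QUALITY_THRESHOLD:
        answer = refine(answer, score)
    return answer
```

Passing the LLM-backed steps in as callables keeps the control flow testable without any API keys.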

Responses are streamed token-by-token via SSE with source citations displayed at the end.
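
For readers unfamiliar with SSE, the wire format behind token-by-token streaming looks roughly like the sketch below. The payload keys (`token`, `sources`) and the event names (`sources`, `done`) are assumptions made for illustration; the actual /api/chat/stream endpoint may structure its events differently.

```python
import json
from typing import Iterable, Iterator, Optional


def sse_event(data: dict, event: Optional[str] = None) -> str:
    """Format one Server-Sent Event; a blank line terminates the event."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    lines.append(f"data: {json.dumps(data)}")
    return "\n".join(lines) + "\n\n"


def stream_answer(tokens: Iterable[str], sources: list) -> Iterator[str]:
    """Yield one event per token, then the citations, then a final marker."""
    for token in tokens:
        yield sse_event({"token": token})
    yield sse_event({"sources": sources}, event="sources")
    yield sse_event({}, event="done")
```

In a FastAPI app, a generator like this would typically be wrapped in a `StreamingResponse` with `media_type="text/event-stream"`.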


Fine-Tuning (Parallel Research)

We fine-tuned Meta's LLaMA-2-7B using QLoRA for future self-hosted generation. The complete fine-tuning pipeline, dataset, and model checkpoints are available in a dedicated repository:

github.com/nitinchaube/StevensDomainFineTunedLM

| Parameter | Value |
| --- | --- |
| Dataset | 87,782 Q&A pairs scraped from Stevens website |
| Method | QLoRA (4-bit NF4 quantization) |
| LoRA Rank / Alpha | 16 / 32 |
| Target Modules | All 7 linear layers (q, k, v, o, gate, up, down) |
| Training | 6 epochs, lr=2e-4, cosine scheduler, effective batch size 32 |
| Infrastructure | Google Colab GPU |
| Best Checkpoint | Step 7,500 |
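
To put the best-checkpoint step in context, the figures in the table can be turned into a rough step count. This assumes that all 87,782 pairs were used for training (no held-out split) and that the final partial batch was kept; neither is stated here, so treat the result as an estimate.

```python
import math

dataset_size = 87_782          # Q&A pairs, from the table
effective_batch_size = 32      # from the table
epochs = 6                     # from the table
best_checkpoint_step = 7_500   # from the table

# Assumption: every pair is a training example and the last partial batch is kept.
steps_per_epoch = math.ceil(dataset_size / effective_batch_size)  # 2744
total_steps = steps_per_epoch * epochs                            # 16464
best_epoch = best_checkpoint_step / steps_per_epoch               # about 2.73

print(steps_per_epoch, total_steps, round(best_epoch, 2))
```

Under these assumptions the best checkpoint falls a little before the end of the third epoch, well short of the full 6-epoch run.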

The fine-tuning repository includes:

  • Data collection pipeline — async web crawler for Stevens website (crawler.py)
  • Data cleaning & preprocessing — JSONL cleaning and validation (clean_jsonl.py)
  • Q&A generation — context-to-QA pair generation (DataGeneration/)
  • QLoRA training notebook — complete fine-tuning pipeline (FinetuningProcess/Fine_tuning.ipynb)
  • Model inference — loading and running the fine-tuned model
  • Sample dataset: stevens_qa_finetuning_sample.jsonl, for reference

The fine-tuned model is designed for drop-in replacement via the LLMRouter's provider abstraction in the production system.
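
A provider abstraction of this kind might look like the following sketch. It is not the code in llm_router.py: the `Provider` type, the `AllProvidersFailed` exception, and `generate_with_failover` are hypothetical names, and real providers would wrap the Gemini, OpenAI, or self-hosted LLaMA clients.

```python
from typing import Callable, Sequence, Tuple

# A provider is any callable that maps a prompt to generated text.
Provider = Callable[[str], str]


class AllProvidersFailed(RuntimeError):
    """Raised when every configured provider raised an error."""


def generate_with_failover(
    prompt: str, providers: Sequence[Tuple[str, Provider]]
) -> Tuple[str, str]:
    """Try providers in priority order; return (provider_name, text)."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # any failure moves on to the next provider
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))
```

Because each provider is just a callable, a self-hosted model can be added by appending one more `(name, callable)` pair, without changing the calling code.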


Getting Started

Prerequisites

  • Python 3.12+
  • Node.js 18+
  • MongoDB Atlas account
  • Firebase project
  • API keys: Google Gemini, OpenAI (optional), Anthropic (optional)

Backend Setup

cd AdvisorAI-Web/backend
pip install -r requirements.txt

# Create .env file with required variables:
# MONGO_URI=your_mongodb_uri
# GEMINI_API_KEY=your_gemini_key
# OPENAI_API_KEY=your_openai_key  (optional)
# VECTORDB_DIR=./VectorDB_v2
# EMBEDDING_MODEL=BAAI/bge-small-en-v1.5

uvicorn main:app --host 0.0.0.0 --port 8080 --reload
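
The variables listed above could be validated at startup along these lines. This is a hedged sketch, not the contents of config/settings.py: the `Settings` dataclass and `load_settings` are hypothetical, and treating MONGO_URI and GEMINI_API_KEY as the only required values (with the example values for the other two used as defaults) is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    mongo_uri: str
    gemini_api_key: str
    openai_api_key: Optional[str]
    vectordb_dir: str
    embedding_model: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from the environment, failing fast on missing keys."""
    env = os.environ if env is None else env
    # Assumption: only these two variables are strictly required.
    missing = [key for key in ("MONGO_URI", "GEMINI_API_KEY") if not env.get(key)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return Settings(
        mongo_uri=env["MONGO_URI"],
        gemini_api_key=env["GEMINI_API_KEY"],
        openai_api_key=env.get("OPENAI_API_KEY") or None,
        vectordb_dir=env.get("VECTORDB_DIR", "./VectorDB_v2"),
        embedding_model=env.get("EMBEDDING_MODEL", "BAAI/bge-small-en-v1.5"),
    )
```

Failing at startup with a clear message tends to be easier to debug than a connection error surfacing on the first chat request.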

Frontend Setup

cd AdvisorAI-Web/frontend
npm install
npm run dev

Docker Deployment

cd AdvisorAI-Web/backend
docker build -t advisorai .
docker run -p 8080:8080 --env-file .env advisorai

Project Structure

AdvisorAI/
├── README.md
├── AdvisorAI-Web/
│   ├── frontend/
│   │   ├── src/
│   │   │   ├── components/          # ChatInterface, AdminDashboard, etc.
│   │   │   ├── contexts/            # AuthContext
│   │   │   ├── services/            # API service layer
│   │   │   └── config/              # Firebase config
│   │   └── package.json
│   ├── backend/
│   │   ├── main.py                  # FastAPI entry, SSE chat endpoints
│   │   ├── app.py                   # Flask app, ~50 REST routes
│   │   ├── chatbot_integration.py   # Bridge between API and LangGraph
│   │   ├── chatbot/
│   │   │   ├── core/
│   │   │   │   ├── langgraph_graph.py   # ReAct + Reflection pipeline
│   │   │   │   ├── llm_router.py        # Multi-provider LLM abstraction
│   │   │   │   └── memory_store.py      # Conversation memory
│   │   │   ├── agents/              # Chroma, Web, History, General agents
│   │   │   ├── tools/               # ChromaTool, WebTool, GeneralTool
│   │   │   └── config/settings.py   # Centralized configuration
│   │   ├── newprocessingdata/
│   │   │   ├── hybrid_retriever.py  # Entity-aware hybrid retrieval
│   │   │   └── build_vectordb.py    # ChromaDB indexing pipeline
│   │   ├── Dockerfile
│   │   └── requirements.txt
│   ├── data/
│   │   ├── stevens_qa_finetuning.jsonl  # 87K Q&A dataset
│   │   └── Fine_tuning.ipynb            # QLoRA training notebook
│   ├── AdvisorAI_Research_Paper.tex     # LaTeX research paper
│   └── AdvisorAI_Research_Paper.md      # Markdown research paper

Research Paper

We have written a detailed research paper documenting the architecture, fine-tuning methodology, hybrid retrieval design, and evaluation:

"AdvisorAI: A Retrieval-Augmented Generation System with Fine-Tuned LLaMA for Domain-Specific Academic Advising"

LaTeX Source · Markdown Version


Related Repositories

| Repository | Description |
| --- | --- |
| AdvisorAI (this repo) | Production chatbot: RAG pipeline, multi-agent orchestration, full-stack web app |
| StevensDomainFineTunedLM | Fine-tuning pipeline: web scraping, data generation, QLoRA training on LLaMA-2-7B |

Authors

| Name | Role | GitHub |
| --- | --- | --- |
| Nitin Chaube | Full-Stack Development, LangGraph Pipeline, Fine-Tuning | @nitinchaube |
| Paras Jadhav | Backend Architecture, RAG Pipeline, Deployment | @parasjadhav2610 |
| Keval Sompura | Frontend Development, Admin Dashboard, Data Collection | @keval-som |

Stevens Institute of Technology, Hoboken, NJ


License

This project is licensed under the MIT License — see the LICENSE file for details.


Built at Stevens Institute of Technology
