AdvisorAI

Intelligent Academic Advising Chatbot for Stevens Institute of Technology

Multi-Agent RAG Pipeline · Fine-Tuned LLaMA-2-7B · LangGraph Orchestration

Live Demo · Research Paper · Fine-Tuning & Dataset · HuggingFace Dataset · HuggingFace Model

Overview

AdvisorAI is a production-grade academic advising chatbot that answers questions about courses, professors, programs, admissions, and campus life at Stevens Institute of Technology. Answers are grounded in retrieved sources and cited, with the goal of eliminating hallucinated content.

Unlike a simple ChatGPT wrapper, AdvisorAI uses a multi-agent RAG (Retrieval-Augmented Generation) pipeline orchestrated by LangGraph, with entity-aware hybrid retrieval, self-reflection quality gates, and real-time SSE streaming. In parallel, we have fine-tuned LLaMA-2-7B using QLoRA on 87,000+ Q&A pairs for future self-hosted generation.

Key Features

Feature	Description
Multi-Agent RAG	LangGraph orchestrator coordinates ChromaDB retrieval, web search, and conversation history agents in parallel
Entity-Aware Hybrid Retrieval	Regex-based course code & faculty name detection routes to targeted ChromaDB collections in sub-millisecond time
Self-Reflection Quality Gate	LLM critiques its own answers (1-10 score); refines if quality < 7/10
Real-Time Streaming	SSE token-by-token streaming with source citations (Perplexity-style)
Fine-Tuned LLaMA-2-7B	QLoRA fine-tuning on 87K domain-specific Q&A pairs for future self-hosted generation
Multi-Provider LLM	Gemini 2.0 Flash (primary) with GPT-4o-mini and Claude 3.5 Haiku as fallbacks, with automatic failover
Multi-Layer Safety	Regex blocking + LLM classification + identity protection + tech stack shielding
Admin Dashboard	Manage courses, faculty, users, web scraper, jobs/internships
Resume Processing	AI-powered resume parsing for personalized academic advising
Jobs & Internships	Automated scraping and search for career opportunities
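
To illustrate the entity-aware routing listed in the table, here is a minimal sketch using only the standard library. The prefix subset, the collection names (`courses`, `general`), the `course_code` metadata key, and the `route_query` helper are hypothetical and not taken from the AdvisorAI codebase. Faculty-name detection is omitted for brevity.

```python
import re
from dataclasses import dataclass, field

# Hypothetical subset of course prefixes (the real system is said to cover 80+).
COURSE_PREFIXES = ("CS", "EE", "MA", "SSW", "CPE")

# Matches codes such as "CS 546", "cs546" or "SSW-555".
COURSE_CODE_RE = re.compile(
    r"\b(" + "|".join(COURSE_PREFIXES) + r")[\s-]?(\d{3})\b",
    re.IGNORECASE,
)


@dataclass
class Route:
    collection: str                            # assumed collection name
    where: dict = field(default_factory=dict)  # assumed metadata filter


def route_query(query: str) -> Route:
    """Send queries with a course code to a targeted collection, else fall back."""
    match = COURSE_CODE_RE.search(query)
    if match:
        code = f"{match.group(1).upper()} {match.group(2)}"
        return Route("courses", {"course_code": code})
    # No entity found: the caller would run plain semantic search instead.
    return Route("general")
```

Because the regex is compiled once and runs before any embedding call, a check like this stays cheap compared with a vector search, which is the design point the table makes.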

Architecture

┌──────────────────────────────────────────────────────┐
│          Presentation Layer (React 19 + Vite 5)      │
│   ChatInterface · Sessions · Admin Dashboard · Auth  │
└────────────────────────┬─────────────────────────────┘
                         │ HTTPS / SSE Streaming
                         ▼
┌──────────────────────────────────────────────────────┐
│             API Layer (FastAPI + Flask)               │
│   /api/chat/stream (SSE) · /api/chat/query · REST    │
└────────────────────────┬─────────────────────────────┘
                         │
                         ▼
┌──────────────────────────────────────────────────────┐
│    Intelligence Layer (LangGraph Orchestrator)        │
│                                                      │
│    Router → Gather → Evaluate → Generate →           │
│    Reflect → [Refine if <7] → Save                   │
│       │        │                                     │
│    Safety   ChromaAgent · WebAgent · HistoryAgent     │
│                │                                     │
│            HybridRetriever                           │
│            ├─ Entity Detection (sub-ms)              │
│            └─ Semantic Search (BGE embeddings)       │
└────────────────────────┬─────────────────────────────┘
                         │
                         ▼
┌──────────────────────────────────────────────────────┐
│                    Data Layer                         │
│  ChromaDB · MongoDB Atlas · Web Scraper · LLM APIs   │
└──────────────────────────────────────────────────────┘

Tech Stack

Layer	Technology
Frontend	React 19, Vite 5, Tailwind CSS, Framer Motion
Authentication	Firebase Auth (email/password + verification)
Backend (Async)	FastAPI + Uvicorn
Backend (REST)	Flask (WSGI Middleware)
Database	MongoDB Atlas
Vector Database	ChromaDB 0.5.23
Embedding Model	BAAI/bge-small-en-v1.5
LLM (Primary)	Google Gemini 2.0 Flash
LLM (Fallback)	OpenAI GPT-4o-mini, Claude 3.5 Haiku
Orchestration	LangGraph (StateGraph)
Fine-Tuning	QLoRA on LLaMA-2-7B
Deployment	Docker, Google Cloud Run, Firebase Hosting
Web Search	DuckDuckGo, SerpAPI

How It Works

When a user sends a message, it flows through a 7-node LangGraph pipeline:

1. Router: Classifies the query as general, domain, or blocked. Runs regex safety checks (violence, profanity, tech probing) and LLM classification. Generates a chat session name.
2. Gather: For domain queries, runs three agents in parallel using asyncio.gather():
   - ChromaAgent: Entity-aware hybrid retrieval. Detects course codes (80+ Stevens prefixes) and faculty names via regex, then routes to targeted ChromaDB collections with metadata filters. Falls back to multi-collection semantic search using BAAI/bge-small-en-v1.5.
   - WebAgent: Searches DuckDuckGo/SerpAPI, scrapes top results, cleans content.
   - HistoryAgent: Retrieves recent conversation context from memory.
3. Evaluate: ReAct "think" step that assesses whether the gathered information is sufficient.
4. Generate: Synthesizes the answer from all context (history + ChromaDB docs + web results) using Gemini 2.0 Flash, with strict identity rules and source grounding.
5. Reflect: Self-critique scoring (1-10) on relevance, accuracy, completeness, tone, and tech leakage. If score < 7, triggers refinement.
6. Refine (conditional): Improves the answer using reflection feedback.
7. Save: Persists to in-memory store and MongoDB.
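
The control flow of the Gather and Reflect/Refine steps can be sketched in plain `asyncio`, without LangGraph. This is an illustration under stated assumptions, not the project's implementation: the three agent functions are stubs, and `generate`, `reflect`, and `refine` are hypothetical callables supplied by the caller.

```python
import asyncio

QUALITY_THRESHOLD = 7  # refine when the reflection score is below 7/10


# Stub agents: each stands in for a real retrieval call (hypothetical bodies).
async def chroma_agent(query: str) -> list[str]:
    await asyncio.sleep(0)  # placeholder for vector search I/O
    return [f"chroma doc for: {query}"]


async def web_agent(query: str) -> list[str]:
    await asyncio.sleep(0)  # placeholder for web search I/O
    return [f"web result for: {query}"]


async def history_agent(session_id: str) -> list[str]:
    await asyncio.sleep(0)  # placeholder for a memory lookup
    return [f"history for session: {session_id}"]


async def gather_context(query: str, session_id: str) -> dict[str, list[str]]:
    # Run the three agents concurrently; results come back in call order.
    docs, web, history = await asyncio.gather(
        chroma_agent(query), web_agent(query), history_agent(session_id)
    )
    return {"docs": docs, "web": web, "history": history}


def needs_refinement(score: int) -> bool:
    return score < QUALITY_THRESHOLD


async def answer(query, session_id, generate, reflect, refine) -> str:
    """Gather context, draft an answer, and refine it once if the score is low."""
    context = await gather_context(query, session_id)
    draft = generate(query, context)
    score, feedback = reflect(draft)
    if needs_refinement(score):
        draft = refine(draft, feedback)
    return draft
```

In a LangGraph version, the `if needs_refinement(...)` branch would typically become a conditional edge between the Reflect and Refine nodes.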

Responses are streamed token-by-token via SSE with source citations displayed at the end.
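
For readers unfamiliar with SSE, the wire format of such a stream can be sketched as follows. The payload keys (`token`, `sources`) and the event names (`sources`, `done`) are assumptions made for illustration; the README does not specify the actual schema used by `/api/chat/stream`.

```python
import json
from typing import Iterable, Iterator, Optional


def sse_event(data: dict, event: Optional[str] = None) -> str:
    """Format one Server-Sent Events frame: optional event name, data line, blank line."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    lines.append(f"data: {json.dumps(data)}")
    return "\n".join(lines) + "\n\n"


def stream_answer(tokens: Iterable[str], sources: list[str]) -> Iterator[str]:
    """Yield one frame per token, then the citations, then an end marker."""
    for token in tokens:
        yield sse_event({"token": token})
    yield sse_event({"sources": sources}, event="sources")
    yield sse_event({}, event="done")
```

In a FastAPI route, a generator like this could be returned through `StreamingResponse` with `media_type="text/event-stream"`, and the browser would consume it with `EventSource` or a streaming `fetch`.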

Fine-Tuning (Parallel Research)

We fine-tuned Meta's LLaMA-2-7B using QLoRA for future self-hosted generation. The complete fine-tuning pipeline, dataset, and model checkpoints are available in a dedicated repository:

github.com/nitinchaube/StevensDomainFineTunedLM

Parameter	Value
Dataset	87,782 Q&A pairs generated from content scraped from the Stevens website
Method	QLoRA (4-bit NF4 quantization)
LoRA Rank / Alpha	16 / 32
Target Modules	All 7 linear projections in each transformer block (q, k, v, o, gate, up, down)
Training	6 epochs, lr=2e-4, cosine scheduler, effective batch size 32
Infrastructure	Google Colab GPU
Best Checkpoint	Step 7,500
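
As a rough sanity check on the numbers in the table, the short calculation below estimates how far into training the best checkpoint falls. It assumes that all 87,782 pairs were used for training (no held-out split) and that a partial final batch counts as a step. Neither detail is stated in this README, so treat the result as an estimate.

```python
import math

DATASET_SIZE = 87_782         # Q&A pairs, from the table
EFFECTIVE_BATCH_SIZE = 32     # from the table
EPOCHS = 6                    # from the table
BEST_CHECKPOINT_STEP = 7_500  # from the table

# Assumption: the full dataset is the training set and the last partial batch is a step.
steps_per_epoch = math.ceil(DATASET_SIZE / EFFECTIVE_BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
best_checkpoint_epoch = BEST_CHECKPOINT_STEP / steps_per_epoch

print(steps_per_epoch, total_steps, round(best_checkpoint_epoch, 2))
# prints: 2744 16464 2.73
```

Under these assumptions the best checkpoint lands a little before the end of the third epoch, which would be consistent with later epochs overfitting, although the README does not report validation curves.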

The fine-tuning repository includes:

Data collection pipeline — async web crawler for Stevens website (crawler.py)
Data cleaning & preprocessing — JSONL cleaning and validation (clean_jsonl.py)
Q&A generation — context-to-QA pair generation (DataGeneration/)
QLoRA training notebook — complete fine-tuning pipeline (FinetuningProcess/Fine_tuning.ipynb)
Model inference — loading and running the fine-tuned model
Sample dataset — stevens_qa_finetuning_sample.jsonl for reference

The fine-tuned model is designed for drop-in replacement via the LLMRouter's provider abstraction in the production system.
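
One way such a provider abstraction could look is sketched below. The class names (`LLMProvider`, `FailoverRouter`, `StubProvider`, `ProviderError`) are hypothetical and are not the actual `llm_router.py` API; the point is only that a self-hosted model can be swapped in as long as it exposes the same `generate` method as the hosted providers.

```python
from typing import Protocol


class LLMProvider(Protocol):
    """Minimal interface every provider is assumed to share."""
    name: str

    def generate(self, prompt: str) -> str: ...


class ProviderError(RuntimeError):
    """Raised by a provider that cannot serve the request."""


class FailoverRouter:
    """Try providers in priority order and return the first successful answer."""

    def __init__(self, providers: list[LLMProvider]) -> None:
        if not providers:
            raise ValueError("at least one provider is required")
        self.providers = providers

    def generate(self, prompt: str) -> tuple[str, str]:
        errors = []
        for provider in self.providers:
            try:
                return provider.name, provider.generate(prompt)
            except ProviderError as exc:
                errors.append(f"{provider.name}: {exc}")
        raise ProviderError("all providers failed: " + "; ".join(errors))


class StubProvider:
    """Stand-in for a hosted API client or a self-hosted model (hypothetical)."""

    def __init__(self, name: str, fail: bool = False) -> None:
        self.name = name
        self.fail = fail

    def generate(self, prompt: str) -> str:
        if self.fail:
            raise ProviderError("unavailable")
        return f"[{self.name}] {prompt}"
```

For example, `FailoverRouter([StubProvider("gemini", fail=True), StubProvider("local-llama")]).generate("hello")` skips the failing provider and returns the answer from the second one along with its name.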

Getting Started

Prerequisites

Python 3.12+
Node.js 18+
MongoDB Atlas account
Firebase project
API keys: Google Gemini, OpenAI (optional), Anthropic (optional)

Backend Setup

cd AdvisorAI-Web/backend
pip install -r requirements.txt

# Create .env file with required variables:
# MONGO_URI=your_mongodb_uri
# GEMINI_API_KEY=your_gemini_key
# OPENAI_API_KEY=your_openai_key  (optional)
# VECTORDB_DIR=./VectorDB_v2
# EMBEDDING_MODEL=BAAI/bge-small-en-v1.5

uvicorn main:app --host 0.0.0.0 --port 8080 --reload
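
The variables above could be validated at startup with a small loader like the sketch below. The `Settings` class and `load_settings` helper are hypothetical and not the project's `config/settings.py`. Treating `MONGO_URI` and `GEMINI_API_KEY` as required, and using the README's example values as defaults for the other two, are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    mongo_uri: str
    gemini_api_key: str
    openai_api_key: Optional[str]
    vectordb_dir: str
    embedding_model: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read configuration from the environment and fail early on missing keys."""
    missing = [key for key in ("MONGO_URI", "GEMINI_API_KEY") if not env.get(key)]
    if missing:
        raise RuntimeError("missing required environment variables: " + ", ".join(missing))
    return Settings(
        mongo_uri=env["MONGO_URI"],
        gemini_api_key=env["GEMINI_API_KEY"],
        openai_api_key=env.get("OPENAI_API_KEY") or None,  # optional fallback provider
        vectordb_dir=env.get("VECTORDB_DIR", "./VectorDB_v2"),
        embedding_model=env.get("EMBEDDING_MODEL", "BAAI/bge-small-en-v1.5"),
    )
```

Failing at startup with a list of missing variables tends to be easier to debug than a connection error raised on the first chat request.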

Frontend Setup

cd AdvisorAI-Web/frontend
npm install
npm run dev

Docker Deployment

cd AdvisorAI-Web/backend
docker build -t advisorai .
docker run -p 8080:8080 --env-file .env advisorai

Project Structure

AdvisorAI/
├── README.md
├── AdvisorAI-Web/
│   ├── frontend/
│   │   ├── src/
│   │   │   ├── components/          # ChatInterface, AdminDashboard, etc.
│   │   │   ├── contexts/            # AuthContext
│   │   │   ├── services/            # API service layer
│   │   │   └── config/              # Firebase config
│   │   └── package.json
│   ├── backend/
│   │   ├── main.py                  # FastAPI entry, SSE chat endpoints
│   │   ├── app.py                   # Flask app, ~50 REST routes
│   │   ├── chatbot_integration.py   # Bridge between API and LangGraph
│   │   ├── chatbot/
│   │   │   ├── core/
│   │   │   │   ├── langgraph_graph.py   # ReAct + Reflection pipeline
│   │   │   │   ├── llm_router.py        # Multi-provider LLM abstraction
│   │   │   │   └── memory_store.py      # Conversation memory
│   │   │   ├── agents/              # Chroma, Web, History, General agents
│   │   │   ├── tools/               # ChromaTool, WebTool, GeneralTool
│   │   │   └── config/settings.py   # Centralized configuration
│   │   ├── newprocessingdata/
│   │   │   ├── hybrid_retriever.py  # Entity-aware hybrid retrieval
│   │   │   └── build_vectordb.py    # ChromaDB indexing pipeline
│   │   ├── Dockerfile
│   │   └── requirements.txt
│   ├── data/
│   │   ├── stevens_qa_finetuning.jsonl  # 87K Q&A dataset
│   │   └── Fine_tuning.ipynb            # QLoRA training notebook
│   ├── AdvisorAI_Research_Paper.tex     # LaTeX research paper
│   └── AdvisorAI_Research_Paper.md      # Markdown research paper

Research Paper

We have written a detailed research paper documenting the architecture, fine-tuning methodology, hybrid retrieval design, and evaluation:

"AdvisorAI: A Retrieval-Augmented Generation System with Fine-Tuned LLaMA for Domain-Specific Academic Advising"

LaTeX Source · Markdown Version

Related Repositories

Repository	Description
AdvisorAI (this repo)	Production chatbot — RAG pipeline, multi-agent orchestration, full-stack web app
StevensDomainFineTunedLM	Fine-tuning pipeline — web scraping, data generation, QLoRA training on LLaMA-2-7B

Authors

Name	Role	GitHub
Nitin Chaube	Full-Stack Development, LangGraph Pipeline, Fine-Tuning	@nitinchaube
Paras Jadhav	Backend Architecture, RAG Pipeline, Deployment	@parasjadhav2610
Keval Sompura	Frontend Development, Admin Dashboard, Data Collection	@keval-som

Stevens Institute of Technology, Hoboken, NJ

License

This project is licensed under the MIT License — see the LICENSE file for details.

Built at Stevens Institute of Technology