Course: Foundation of Professional Analytics (MBAN-5510-2)
Program: Master of Business Analytics (MBAN)
University: Saint Mary’s University – Sobey School of Business
Student: Mandipa Raut
Date: February 2026
This project implements and compares two Retrieval-Augmented Generation (RAG) systems:
- Vanilla RAG – a simple, baseline RAG system
- Agentic RAG – an advanced RAG system with agent-based decision-making
Both systems use the Nova Scotia Health – Brain Health website as an external healthcare information source.
The purpose of this project is to evaluate whether an Agentic RAG approach can outperform a Vanilla RAG approach in terms of relevance, answer quality, and token efficiency.
The objectives of this project are to:
- Build two separate RAG systems using Python
- Use a real healthcare website as an external data source
- Apply responsible token usage with OpenAI APIs
- Compare Vanilla RAG and Agentic RAG in a structured manner
- Understand the trade-offs between simplicity and intelligent decision-making
All information is retrieved from the public healthcare website:
Nova Scotia Health – Brain Health
https://www.nshealth.ca/brain-health
This website includes information related to:
- Acquired Brain Injury (ABI)
- Epilepsy
- Concussions
- Brain health services and programs
Project2_RAG/ │ ├── agentic_rag.py # Agentic RAG implementation ├── vanilla_rag.py # Vanilla RAG implementation ├── scrape_brain_health.py # Web scraping helper functions ├── requirements.txt # Required Python libraries ├── README.md # Project documentation ├── Comparison.md # Vanilla RAG vs Agentic RAG comparison └── .env # OpenAI API key (NOT submitted)
The Vanilla RAG system follows a simple, linear pipeline:
- Scrapes a fixed, limited number of pages from the NS Health website
- Uses keyword-based matching to identify relevant content
- Filters retrieved text to control token usage
- Sends selected context to the OpenAI model for answer generation
This system serves as a baseline RAG implementation.
The Agentic RAG system introduces intelligent agents to guide the retrieval process:
- Analyzes the user query to determine clarity
- Discovers available pages from the website
- Uses an LLM agent to select the most relevant pages (2–4 only)
- Scrapes only the selected pages
- Synthesizes information from multiple sources and provides citations
This system demonstrates how agent-based reasoning can improve relevance and efficiency.
| Aspect | Vanilla RAG | Agentic RAG |
|---|---|---|
| Page Selection | Fixed and limited | Agent-driven, selective |
| Retrieval Method | Keyword-based | LLM-assisted reasoning |
| Token Efficiency | Moderate | High |
| Answer Quality | Basic | More comprehensive |
| Source Citations | Limited | Explicit citations |
| Complexity | Low | Higher |
| Response Time | Faster | Slightly slower |
Both systems implement safeguards to prevent excessive token usage:
- Limited number of scraped pages
- Character limits per page (2,500–3,000 characters)
- Maximum context size for LLM input
- Controlled response length using
max_tokens
This ensures efficient and responsible use of the OpenAI API within the project’s budget constraints.
python -m venv venv
source venv/bin/activate # macOS/Linux
2. Install dependencies
pip install -r requirements.txt
3. Configure OpenAI API Key
Create a .env file in the project root:
OPENAI_API_KEY=your_api_key_here
How to Run
Run Vanilla RAG : python vanilla_rag.py
Run Agentic RAG : python agentic_rag.py
Comparison and Analysis
A detailed comparison of the two approaches is provided in:
Comparison.md
This document includes:
- Design differences
- Token usage analysis
- Performance comparison
- Strengths and limitations of each system
Key Takeaways
- Vanilla RAG is simple, fast, and suitable for basic use cases
- Agentic RAG provides better answer quality through intelligent decision-making
- Selective retrieval reduces unnecessary token usage
- Agent-based systems are especially valuable for complex healthcare queries
Conclusion
This project demonstrates that Agentic RAG can outperform Vanilla RAG for complex healthcare information retrieval by improving relevance, answer quality, and token efficiency.
However, Vanilla RAG remains a useful baseline for simpler applications where speed and cost are prioritized.
=======
# Project2_RAG
Vanilla RAG vs Agentic RAG comparison using Nova Scotia Health brain health data
>>>>>>> 40d3a1c28d38e3f729e3e69a09d6dfc269c4dd7f