Student Name: Mickel Georges
Student ID: 23077585
Module: Digital Systems Project (2025/26)
Supervisor: Nathan Duran
This project is a privacy-preserving, retrieval-augmented generation (RAG) chatbot designed for UWE Bristol. Unlike standard static chatbots, this system features an autonomous Change Detection Notification (CDN) pipeline that periodically scrapes UWE webpages, detects content updates (via hash comparison), and automatically updates the local Vector Database.
- Autonomous Updates: Self-healing knowledge base that detects website changes and updates within a specified timeframe.
- Privacy-First (Local RAG): Runs entirely offline using Ollama and local embeddings. Zero data leaves the server.
- Verifiable Citations: Every answer includes a direct link to the source UWE webpage.
- Robust Architecture: Handles connection failures and HTML parsing errors gracefully.
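The hash-based change detection described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names `content_hash` and `detect_changes`, the choice of SHA-256, the whitespace normalisation, and the in-memory dictionary store are all assumptions.

```python
import hashlib


def content_hash(text: str) -> str:
    """Return a SHA-256 hex digest of the page text, ignoring whitespace-only differences."""
    normalised = " ".join(text.split())  # assumption: layout-only changes should not count
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def detect_changes(pages: dict[str, str], stored_hashes: dict[str, str]) -> list[str]:
    """Return the URLs whose hash differs from the stored one, updating the store in place."""
    changed = []
    for url, text in pages.items():
        digest = content_hash(text)
        if stored_hashes.get(url) != digest:
            changed.append(url)          # new or modified page: needs re-embedding
            stored_hashes[url] = digest  # remember the latest version
    return changed
```

Only the URLs returned by `detect_changes` would need to be re-chunked and re-embedded, which keeps each scheduled run cheap when most pages are unchanged.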
- Language: Python 3.10+
- LLM Engine: Ollama (Qwen2.5:3b)
- Vector DB: ChromaDB
- Orchestration: LangChain
- Scraper: BeautifulSoup4 + Requests
- Interface: Chrome Extension
The system is divided into two pipelines:
- Knowledge Pipeline: Scheduled scraper $\to$ Hash Check $\to$ Semantic Chunking $\to$ Vector Store.
- Inference Pipeline: User Query $\to$ Vector Search $\to$ LLM Generation $\to$ Answer.
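As a rough illustration of the chunking step in the Knowledge Pipeline, the sketch below splits text into one chunk per header so that each chunk keeps its heading. It assumes the scraped page has already been converted to text with Markdown-style `#` headers; the function name `chunk_by_headers` and the chunk dictionary layout are hypothetical, and the project's real chunker may work differently.

```python
def chunk_by_headers(text: str) -> list[dict[str, str]]:
    """Split text into chunks, one per header, keeping each header with its body."""
    chunks = []
    header, body = "", []
    for line in text.splitlines():
        if line.startswith("#"):
            # A new header closes the previous chunk, if there was one.
            if header or body:
                chunks.append({"header": header, "content": "\n".join(body).strip()})
            header, body = line.lstrip("#").strip(), []
        else:
            body.append(line)
    # Flush the final chunk.
    if header or body:
        chunks.append({"header": header, "content": "\n".join(body).strip()})
    return chunks
```

Keeping the header alongside the body means a table or short paragraph is stored with the context that explains what it refers to.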
Testing covers several areas and methods:
- A larger language model grades the local LLM's responses to a set of questions. Each response is scored on its faithfulness to the retrieved context and on the relevance of the answer to the user's question.
- To run: `src/tools/judge.py`
- Relying solely on another LLM to judge performance is subjective, so we also want a deterministic metric. We use NER to extract named entities (buildings, etc.) from the AI's answer and check whether those exact entities exist in the retrieved context.
- Thresholds: if 100% of the answer's entities are found in the retrieved context, the answer scores 1.0; if more than 70% are found, it scores 0.5; otherwise it scores 0.0.
- The NER scores are tested against different domains (courses, students, buildings) to see where the bot struggles (for example, it could be strong on IT support but weak on finance).
- It also runs cosine similarity checks between the answer and its retrieved chunks.
- To run: `src/tools/ner_eval.py`
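The entity-overlap thresholds and the cosine similarity check can be sketched as below. This is an illustration under assumptions rather than the contents of `src/tools/ner_eval.py`: the function names are hypothetical, entities are assumed to be already extracted into sets of strings, the 0.5 band is read as "more than 70%", and an answer with no entities is assumed to score 1.0.

```python
import math


def entity_score(answer_entities: set[str], context_entities: set[str]) -> float:
    """Score how many of the answer's entities also appear, exactly, in the context."""
    if not answer_entities:
        return 1.0  # assumption: nothing to verify counts as fully supported
    matched = len(answer_entities & context_entities)
    total = len(answer_entities)
    if matched == total:
        return 1.0
    if matched / total > 0.7:  # assumption: strict "more than 70%"
        return 0.5
    return 0.0


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

Because both functions are deterministic, the same answer and context always produce the same score, which is the property the LLM judge lacks.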
- Python 3.10 or higher (Preferred: 3.12)
- Ollama installed and running.

Install by running each command in order:
- Clone the repository: `git clone https://github.com/Mickelgt/Final-Year-Project.git`
- Enter the project directory: `cd Final-Year-Project`
- Install dependencies: `pip install -r requirements.txt`

You must pull the Qwen model into Ollama before running the app.
- Pull the model: `ollama pull qwen2.5:3b`

- Run API: `python manage.py server`
- Run Scraper: `python manage.py scrape`
- Reset DB: `python manage.py reset`
- Test Chat: `python manage.py chat`
- Check Retrieval: `python manage.py ask "question" --peek`
- View Logs: `python manage.py logs -f`

| Date | Component | Change / Milestone |
|---|---|---|
| 02/01/26 | Architecture | Defined 2-pipeline system (Ingestion vs Inference). |
| 03/01/26 | Scraper | Implemented Robots.txt compliance and Hash-based change detection. |
| 05/01/26 | Vector DB | Integrated ChromaDB with Singleton pattern. |
| 07/01/26 | API | Built FastAPI to support Chrome Extension. |
| 09/01/26 | RAG Engine | Implemented Cross-Encoder Re-Ranking and Intent Routing for safety. |
| 11/01/26 | Data Engineering | Upgraded to Semantic Chunking (Header-Aware) to resolve table content loss. |
| 12/01/26 | Evaluation | Developed "LLM-as-a-Judge" pipeline to automate Faithfulness & Relevance scoring. |
| 13/01/26 | System Final | Final Integration: Scheduler, Chrome Extension UI and Tests. |
| 16/01/26 | System Update | Added error handling and configuration settings. |
| 20/01/26 | System Change towards extraction | Updated methodology towards LLM extraction rather than generation. |
| 21/02/26 | NLP Pipeline | Integrated GLiNER Entity Extraction and guardrails, updated prompt logic, and removed filler words with NLTK. |
