#**Smart Student Handbook Assistant** - **Generative AI Individual Assignment (5663499)**

###**Overview**

This project presents the design and implementation of an advanced Retrieval-Augmented Generation (RAG) system tailored to answering questions based on the content of a university student handbook. While general-purpose Large Language Models (LLMs) like GPT-4 or Gemini can generate fluent and informative answers, they often lack access to proprietary or institutional knowledge that falls outside the scope of their training data. This results in hallucinated or inaccurate outputs, especially when users ask highly contextualized questions (e.g., "What are the assessment resit policies for MSc students at Warwick Business School?"). To address this, a RAG system augments a generative model’s capabilities by grounding it in external documents retrieved at inference time.

The core objective of this implementation is to outperform a baseline LLM in response accuracy, relevance, and alignment with institutional policy. The selected domain—academic policy and student guidance—is particularly suited for a RAG approach. University documents are typically long, formal, and structured but dense, often making it difficult for students to extract precise information quickly. This system bridges that gap by offering targeted, student-friendly answers grounded in up-to-date handbook content.

My pipeline incorporates several key innovations aligned with current research on RAG. These include **hybrid retrieval** (combining dense vector similarity with sparse keyword scoring), LLM-driven **query rewriting**, **step-back prompting** for broader reasoning, and **multi-stage answer synthesis**. The components are implemented using open-source libraries such as LangChain, FAISS, and Sentence Transformers, together with Google’s Gemini API for generation. The final system is modular, easy to extend, and deployable in both academic and enterprise contexts.


###**1. Installation of Dependencies**

In [None]:
!pip install -q sentence-transformers chromadb faiss-cpu rank-bm25 python-docx google-generativeai langchain langchain-google-genai PyPDF2



The first block ensures that all required packages for building the RAG pipeline are available. The selected packages support core stages of the architecture: **document parsing** (PyPDF2, python-docx), **semantic embedding** (sentence-transformers), **keyword-based search** (rank-bm25), and **fast vector indexing** via FAISS. Libraries like langchain and google-generativeai allow for structured prompting, LLM chaining, and direct integration with Google's Gemini models. Including all necessary packages at the start ensures seamless runtime setup and makes the code easily reproducible for future research or collaborative use.

###**2. Imports and Logging Setup**

In [None]:
import os
import re
import json
import numpy as np
import pandas as pd
from typing import List, Dict, Any, Tuple
from dataclasses import dataclass
import logging
from pathlib import Path
import time
import warnings

# Document processing
import PyPDF2
from docx import Document

# Embedding and search
from sentence_transformers import SentenceTransformer
import chromadb
from rank_bm25 import BM25Okapi
import faiss

# Google Gemini
import google.generativeai as genai

# LangChain components
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.schema import Document as LangChainDocument
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

warnings.filterwarnings('ignore')

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)


This block imports all required libraries and configures the environment. Structured logging is essential in tracing each step of the pipeline, especially during multi-stage operations like query rewriting and retrieval. The modularity and traceability provided by this setup make the system easier to debug, scale, and adapt. The inclusion of type hints also improves code readability and helps future collaborators understand function inputs and outputs quickly. Clean logging and suppressed warnings keep the interface friendly for student or faculty users who may not be familiar with underlying code.

###**3. Configuration and API Setup**

In [None]:
class Config:
    def __init__(self):
        # Import the API key from Colab Secrets
        try:
            from google.colab import userdata
            self.GOOGLE_API_KEY = userdata.get('GOOGLE_API_KEY')
            if not self.GOOGLE_API_KEY:
                 # Fallback if the secret isn't found or is empty
                 self.GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY")
                 logger.warning("Google API key not found in Colab Secrets. Falling back to environment variable or default.")
        except ImportError:
            # Handle cases where the code is not run in Colab
            self.GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY")
            logger.warning("Could not import google.colab.userdata. Falling back to environment variable or default.")

        self.EMBEDDING_MODEL = "all-MiniLM-L6-v2"
        self.LLM_MODEL = "models/gemini-2.0-flash"
        self.FALLBACK_MODEL = "models/gemini-1.5-flash-latest"
        self.API_VERSION = "v1"
        self.CHUNK_SIZE = 1000
        self.CHUNK_OVERLAP = 200
        self.TOP_K_RESULTS = 5
        self.HYBRID_SEARCH_ALPHA = 0.7
        self.DEFAULT_HANDBOOK_PATH = "/content/Student_Handbook_Masters.pdf"
        self.SUPPORTED_FORMATS = {'.pdf', '.docx', '.txt'}
        self._init_google_api()

    def _init_google_api(self):
        if not self.GOOGLE_API_KEY or self.GOOGLE_API_KEY == "Google_API":
             logger.error("Google API key is not set. Please provide the API key via Colab Secrets or environment variable.")
             return

        genai.configure(api_key=self.GOOGLE_API_KEY, transport='rest')
        try:
            models = [model.name for model in genai.list_models()]
            logger.info(f"Available models: {models}")
            if self.LLM_MODEL not in models:
                logger.warning(f"Specified LLM model '{self.LLM_MODEL}' not available. Falling back to '{self.FALLBACK_MODEL}'.")
                self.LLM_MODEL = self.FALLBACK_MODEL
        except Exception as e:
            logger.error(f"Error listing models or configuring generative AI: {str(e)}")

config = Config()

The configuration class encapsulates the global parameters used across the pipeline. These include the API key for Gemini, model names, chunk size, overlap, hybrid weighting (alpha), and supported file formats. By isolating these hyperparameters in a dedicated class, we allow the system to be easily tuned for different domains or document types. The selected chunk size of 1000 characters with 200-character overlap reflects a common practice in RAG: each chunk should be long enough to be meaningful but short enough for the model to process efficiently, and the overlap keeps sentences that straddle a boundary available in both neighbouring chunks. The use of alpha = 0.7 gives more weight to dense semantic matching, which suits open-ended academic queries that rarely reuse the handbook's exact wording.
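
To make the chunk size and overlap settings concrete, the following is a minimal sketch of fixed-size windowing with overlap. It is a simplification: the `RecursiveCharacterTextSplitter` used in this notebook also respects separators, so its chunks vary in length. The function name `sliding_chunks` and the synthetic text are illustrative assumptions, not part of the pipeline.

```python
from typing import List

def sliding_chunks(text: str, chunk_size: int = 1000, overlap: int = 200) -> List[str]:
    """Split text into fixed-size character windows that overlap (illustrative helper)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    stride = chunk_size - overlap  # each new chunk starts this many characters later
    chunks = []
    for start in range(0, len(text), stride):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks

# Synthetic 2600-character text so the overlap is easy to inspect
text = "".join(str(i % 10) for i in range(2600))
chunks = sliding_chunks(text)
print([len(c) for c in chunks])            # [1000, 1000, 1000]
print(chunks[0][-200:] == chunks[1][:200])  # True: the last 200 characters are shared
```

With a size of 1000 and an overlap of 200, each chunk starts 800 characters after the previous one, so the index holds roughly 25% more text than the source document.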

###**4. Document Loading and Chunking**

In [None]:
@dataclass
class DocumentChunk:
    content: str
    metadata: Dict[str, Any]
    chunk_id: str

class DocumentProcessor:
    def __init__(self, chunk_size=config.CHUNK_SIZE, chunk_overlap=config.CHUNK_OVERLAP):
        self.text_splitter = RecursiveCharacterTextSplitter(
            chunk_size=chunk_size,
            chunk_overlap=chunk_overlap,
            separators=["\n\n", "\n", ".", "!", "?", ",", " ", ""]
        )

    def load_pdf(self, file_path: str) -> str:
        with open(file_path, 'rb') as file:
            reader = PyPDF2.PdfReader(file)
            # Extract each page once; skip pages with no extractable text
            page_texts = (page.extract_text() for page in reader.pages)
            return "\n".join(text for text in page_texts if text)

    def load_docx(self, file_path: str) -> str:
        doc = Document(file_path)
        return "\n".join([para.text for para in doc.paragraphs])

    def load_txt(self, file_path: str) -> str:
        try:
            with open(file_path, 'r', encoding='utf-8') as file:
                return file.read()
        except UnicodeDecodeError:
            with open(file_path, 'r', encoding='latin-1') as file:
                return file.read()

    def load_document(self, file_path: str) -> str:
        ext = Path(file_path).suffix.lower()
        if ext == '.pdf':
            return self.load_pdf(file_path)
        elif ext == '.docx':
            return self.load_docx(file_path)
        elif ext == '.txt':
            return self.load_txt(file_path)
        else:
            raise ValueError(f"Unsupported file type: {ext}")

    def chunk_document(self, text: str, source: str) -> List[DocumentChunk]:
        chunks = self.text_splitter.split_text(text)
        return [
            DocumentChunk(content=chunk, metadata={"source": source, "chunk_index": i}, chunk_id=f"{source}_chunk_{i}")
            for i, chunk in enumerate(chunks)
        ]


Document ingestion is a critical preprocessing step in the RAG pipeline. The system supports .pdf, .docx, and .txt files, reflecting the diversity of formats used by educational institutions. The core logic reads the raw text from the file and then chunks it into smaller sections that follow paragraph and sentence boundaries where possible. Recursive character-based chunking is used here instead of static splitting or sentence-based chunking, as it balances context preservation with scalability. Each chunk is assigned metadata, including its source and index, which enables easy reference and ranking during search.

This design ensures that even large handbooks (often exceeding 100 pages) can be broken down into structured, retrievable components. Chunking improves retrieval granularity and ensures that generated answers are both targeted and backed by concise context.
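
The recursive idea can be sketched in a few lines: split on the coarsest separator first, and only fall back to finer separators for pieces that are still too long. This is a simplified illustration and not the LangChain implementation; unlike `RecursiveCharacterTextSplitter` it does not merge small pieces back together, keep separators, or add overlap. The function name and the sample text are invented for the example.

```python
from typing import List

def recursive_split(text: str, max_len: int, separators: List[str]) -> List[str]:
    """Split on the coarsest separator first, recursing to finer ones
    only for pieces that are still longer than max_len (illustrative helper)."""
    if len(text) <= max_len:
        return [text] if text.strip() else []
    if not separators:
        # No separators left: fall back to hard character cuts
        return [text[i:i + max_len] for i in range(0, len(text), max_len)]
    sep, finer = separators[0], separators[1:]
    pieces: List[str] = []
    for part in text.split(sep):
        pieces.extend(recursive_split(part, max_len, finer))
    return pieces

# Invented sample text, not taken from the handbook
sample = (
    "Attendance policy.\n\n"
    "Students must attend all seminars. Absences need approval.\n\n"
    "Resits."
)
pieces = recursive_split(sample, max_len=40, separators=["\n\n", ". ", " "])
for piece in pieces:
    print(repr(piece))
```

Here the middle paragraph exceeds 40 characters, so only that paragraph is split further at the sentence boundary, while the short paragraphs are kept whole.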


###**5. Dense Embedding Model**

In [None]:
class AdvancedEmbeddingModel:
    def __init__(self, model_name=config.EMBEDDING_MODEL):
        self.model = SentenceTransformer(model_name)
        print(f"Embedding model '{model_name}' loaded.")

    def encode_texts(self, texts: List[str]) -> np.ndarray:
        return self.model.encode(texts, show_progress_bar=True, normalize_embeddings=True)

    def encode_single(self, text: str) -> np.ndarray:
        return self.model.encode([text], normalize_embeddings=True)[0]


Dense embeddings allow us to perform semantic search across the handbook content. Using the all-MiniLM-L6-v2 model from Sentence Transformers, each chunk of text is converted into a 384-dimensional vector that captures its semantic meaning. These embeddings are L2-normalized, so the inner product computed by the FAISS index is equivalent to cosine similarity. The choice of model reflects a trade-off between performance and speed: it is fast enough for interactive use in Colab while still capturing relationships between questions and policy text.

By encoding both documents and queries into the same vector space, this step forms the foundation of the semantic retrieval process and significantly improves the relevance of the information presented to the LLM during generation.
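
The reason normalization matters can be checked with a small worked example: once vectors have unit length, the inner product used by an inner-product index such as `IndexFlatIP` gives the same value as cosine similarity. The sketch below uses plain Python lists and two made-up 2-dimensional vectors rather than real embeddings.

```python
import math
from typing import List

def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

def l2_normalize(vec: List[float]) -> List[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(vec, vec))
    return [x / norm for x in vec]

def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity computed from unnormalized vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Toy vectors standing in for a query embedding and a chunk embedding
query_vec = [3.0, 4.0]
chunk_vec = [4.0, 3.0]

cos_sim = cosine(query_vec, chunk_vec)
inner = dot(l2_normalize(query_vec), l2_normalize(chunk_vec))
print(round(cos_sim, 2), round(inner, 2))
```

Both values agree (24 / 25), which is why the search engine can use a plain inner-product index without a separate cosine computation.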


###**6. Hybrid Search Engine**

In [None]:
class HybridSearchEngine:
    def __init__(self, embedding_model: AdvancedEmbeddingModel):
        self.embedding_model = embedding_model
        self.chunks = []
        self.embeddings = None
        self.faiss_index = None
        self.bm25 = None

    def index_documents(self, chunks: List[DocumentChunk]):
        self.chunks = chunks
        texts = [chunk.content for chunk in chunks]

        self.embeddings = self.embedding_model.encode_texts(texts)
        self.faiss_index = faiss.IndexFlatIP(self.embeddings.shape[1])
        self.faiss_index.add(self.embeddings.astype('float32'))

        tokenized = [text.lower().split() for text in texts]
        self.bm25 = BM25Okapi(tokenized)
        print("Document chunks indexed.")

    def dense_search(self, query: str, top_k=5):
        q_emb = self.embedding_model.encode_single(query).reshape(1, -1).astype('float32')
        scores, indices = self.faiss_index.search(q_emb, top_k)
        # FAISS pads results with index -1 when fewer than top_k vectors are indexed
        return [(self.chunks[i], scores[0][j]) for j, i in enumerate(indices[0]) if i >= 0]

    def sparse_search(self, query: str, top_k=5):
        q_tokens = query.lower().split()
        scores = self.bm25.get_scores(q_tokens)
        top_indices = np.argsort(scores)[::-1][:top_k]
        return [(self.chunks[i], scores[i]) for i in top_indices if scores[i] > 0]

    def hybrid_search(self, query: str, top_k=5, alpha=config.HYBRID_SEARCH_ALPHA):
        dense = dict((c.chunk_id, s) for c, s in self.dense_search(query, top_k*2))
        sparse = dict((c.chunk_id, s) for c, s in self.sparse_search(query, top_k*2))

        max_sparse = max(sparse.values(), default=1)
        sparse = {k: v/max_sparse for k, v in sparse.items()}

        combined = {}
        for k in set(dense) | set(sparse):
            combined[k] = alpha * dense.get(k, 0) + (1 - alpha) * sparse.get(k, 0)

        top_ids = sorted(combined, key=combined.get, reverse=True)[:top_k]
        id_to_chunk = {c.chunk_id: c for c in self.chunks}
        return [id_to_chunk[k] for k in top_ids if k in id_to_chunk]


The hybrid search module combines two complementary retrieval methods: BM25 (keyword-based sparse search) and FAISS (semantic dense search). This design ensures the system can retrieve relevant text whether the user asks questions using exact terms found in the handbook (e.g., “WBS PG Handbook”) or more abstract phrases (e.g., “rules about attendance”). The alpha parameter governs the weighting between these approaches; setting alpha=0.7 places greater emphasis on dense semantic matching, which is generally more robust to paraphrasing.

This hybrid design is widely used in modern RAG systems. It is intended to improve recall, since a chunk missed by one retriever can still be surfaced by the other, while the weighted score keeps the strongest matches at the top. It also reflects how users phrase real-world questions: some use formal keywords, others use colloquial language.
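
The weighting can be followed by hand with a small worked example. The sketch below mirrors the fusion arithmetic in `hybrid_search` (sparse scores divided by their maximum, then a weighted sum with alpha), but the function name `fuse_scores`, the chunk IDs, and the scores are made up for illustration.

```python
from typing import Dict, List, Tuple

def fuse_scores(dense: Dict[str, float], sparse: Dict[str, float],
                alpha: float = 0.7, top_k: int = 3) -> List[Tuple[str, float]]:
    """Combine dense and sparse scores as alpha * dense + (1 - alpha) * normalized sparse."""
    max_sparse = max(sparse.values(), default=1.0) or 1.0  # avoid division by zero
    sparse_norm = {k: v / max_sparse for k, v in sparse.items()}
    combined = {
        k: alpha * dense.get(k, 0.0) + (1 - alpha) * sparse_norm.get(k, 0.0)
        for k in set(dense) | set(sparse)
    }
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

# Hypothetical scores: cosine similarities (dense) and raw BM25 scores (sparse)
dense = {"chunk_1": 0.80, "chunk_2": 0.60}
sparse = {"chunk_2": 10.0, "chunk_3": 5.0}

ranked = fuse_scores(dense, sparse)
print([chunk_id for chunk_id, _ in ranked])  # ['chunk_2', 'chunk_1', 'chunk_3']
```

`chunk_2` wins (0.7 × 0.60 + 0.3 × 1.0 = 0.72) even though `chunk_1` has the higher dense score (0.7 × 0.80 = 0.56), because it is found by both retrievers. A chunk found only by BM25, such as `chunk_3`, can contribute at most 0.3 with this alpha.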


###**7. Query Enhancement with LLM**

In [None]:
class QueryEnhancer:
    def __init__(self, llm_model=config.LLM_MODEL):
        self.llm = ChatGoogleGenerativeAI(
            model=llm_model,
            temperature=0,
            google_api_key=config.GOOGLE_API_KEY,
            convert_system_message_to_human=True,
            model_kwargs={
                "generation_config": {
                    "temperature": 0,
                    "top_p": 1,
                    "top_k": 1,
                    "max_output_tokens": 2048,
                }
            }
        )

        self.prompt_template = PromptTemplate(
            input_variables=["query"],
            template="""
You are helping to improve a search query for a college handbook Q&A system.

Original query: {query}

Please provide:
1. An improved, more specific version of the query
2. 3-5 related keywords or phrases
3. Alternative ways to phrase the question

Format:
IMPROVED_QUERY: [improved query]
KEYWORDS: [keyword1, keyword2, ...]
ALTERNATIVES: [alt1 | alt2 | alt3]
"""
        )

        self.chain = LLMChain(llm=self.llm, prompt=self.prompt_template)

    def enhance_query(self, query: str) -> Dict[str, Any]:
        try:
            response = self.chain.run(query=query)
            enhanced = {
                "original_query": query,
                "improved_query": query,
                "keywords": [],
                "alternatives": [query]
            }
            for line in response.strip().split('\n'):
                if line.startswith('IMPROVED_QUERY:'):
                    enhanced["improved_query"] = line.split(':', 1)[1].strip()
                elif line.startswith('KEYWORDS:'):
                    enhanced["keywords"] = [k.strip() for k in line.split(':', 1)[1].split(',')]
                elif line.startswith('ALTERNATIVES:'):
                    enhanced["alternatives"] = [a.strip() for a in line.split(':', 1)[1].split('|')]
            return enhanced
        except Exception as e:
            print(f"Enhancer error: {e}")
            return {
                "original_query": query,
                "improved_query": query,
                "keywords": [],
                "alternatives": [query]
            }

    def create_expanded_query(self, enhanced: Dict[str, Any]) -> str:
        parts = [enhanced["improved_query"]]
        parts.extend(enhanced.get("keywords", []))
        return " ".join(parts)


Many students do not phrase their questions in ways that match handbook language. This module addresses that by using Gemini to enhance and rewrite user queries. It expands the original question into a clearer, more precise version and adds 3–5 related keywords. This LLM-enhanced rewriting step is intended to improve retrieval, as it increases the overlap between the query and relevant document chunks. Query rewriting is especially helpful for long, ambiguous, or imprecise student inputs and reflects recent work on LLM-augmented retrieval systems.

In the pipeline, the expanded query (the improved query plus its keywords) replaces the original question for the specific-context retrieval. If the enhancement call fails or returns an unparseable response, the module falls back to the original query, so retrieval still proceeds and users are less likely to receive null results.
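
The parsing and expansion steps can be exercised without calling the API. The sketch below applies the same line-prefix parsing as `enhance_query` to a hand-written response; the `sample_response` text is a hypothetical example of the requested format, not actual Gemini output, and `parse_enhancement` is an illustrative standalone function.

```python
from typing import Any, Dict

def parse_enhancement(response: str, query: str) -> Dict[str, Any]:
    """Parse the IMPROVED_QUERY / KEYWORDS / ALTERNATIVES format, defaulting to the original query."""
    enhanced: Dict[str, Any] = {
        "original_query": query,
        "improved_query": query,
        "keywords": [],
        "alternatives": [query],
    }
    for line in response.strip().splitlines():
        line = line.strip()
        if line.startswith("IMPROVED_QUERY:"):
            enhanced["improved_query"] = line.split(":", 1)[1].strip()
        elif line.startswith("KEYWORDS:"):
            enhanced["keywords"] = [k.strip() for k in line.split(":", 1)[1].split(",") if k.strip()]
        elif line.startswith("ALTERNATIVES:"):
            enhanced["alternatives"] = [a.strip() for a in line.split(":", 1)[1].split("|") if a.strip()]
    return enhanced

# Hypothetical model response written by hand for illustration
sample_response = """
IMPROVED_QUERY: What happens if a student fails a re-sit examination?
KEYWORDS: re-sit, failure, exam board
ALTERNATIVES: Consequences of failing a re-sit | Can I retake a failed re-sit?
"""

enhanced = parse_enhancement(sample_response, "What if i fail a re-sit exam?")
expanded_query = " ".join([enhanced["improved_query"], *enhanced["keywords"]])
print(expanded_query)

# A response that ignores the format leaves the original query in place
fallback = parse_enhancement("Sorry, I cannot help with that.", "What if i fail a re-sit exam?")
print(fallback["improved_query"])
```

Because the defaults are set before parsing, a malformed response degrades to searching with the student's original wording rather than raising an error.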


###**8. Step-Back Prompting and Answer Synthesis**

In [None]:
class StepBackPrompter:
    def __init__(self, llm_model=config.LLM_MODEL):
        self.llm = ChatGoogleGenerativeAI(
            model=llm_model,
            temperature=0,
            google_api_key=config.GOOGLE_API_KEY,
            convert_system_message_to_human=True,
            model_kwargs={
                "generation_config": {
                    "temperature": 0,
                    "top_p": 1,
                    "top_k": 1,
                    "max_output_tokens": 2048,
                }
            }
        )

        self.stepback_prompt = PromptTemplate(
            input_variables=["question"],
            template="""
You are an expert at asking step-back questions. Given a student's specific question about college policies or procedures,
generate a broader, more general question that would help understand the underlying concept.

Original question: {question}

Step-back question:"""
        )

        self.synthesis_prompt = PromptTemplate(
            input_variables=["original_question", "stepback_question", "stepback_context", "specific_context"],
            template="""
You are answering a student's question using the college handbook.

Original Question: {original_question}
Step-back Question: {stepback_question}

General Context:
{stepback_context}

Specific Context:
{specific_context}

Instructions:
- First use the general context to explain the policy area
- Then use the specific context to answer the student’s question clearly
- Summarize in simple terms in 2–3 lines

Answer:"""
        )

        self.stepback_chain = LLMChain(llm=self.llm, prompt=self.stepback_prompt)
        self.synthesis_chain = LLMChain(llm=self.llm, prompt=self.synthesis_prompt)

    def generate_stepback_question(self, question: str) -> str:
        try:
            return self.stepback_chain.run(question=question).strip()
        except Exception as e:
            print(f"Step-back error: {e}")
            return f"What are the general policies regarding {question}?"

    def synthesize_answer(self, original_question: str, stepback_question: str,
                          stepback_context: str, specific_context: str) -> str:
        try:
            return self.synthesis_chain.run(
                original_question=original_question,
                stepback_question=stepback_question,
                stepback_context=stepback_context,
                specific_context=specific_context
            ).strip()
        except Exception as e:
            print(f"Synthesis error: {e}")
            return "Sorry, I couldn't generate a response at this time."


Step-back prompting is a powerful technique used to increase reasoning quality in generated answers. The idea is to first generate a broader version of the user’s question—what we refer to as the "step-back question"—and retrieve general policy context for it. This is followed by retrieving specific context using the enhanced query. The final answer is synthesized using both contexts and is formatted as a friendly, summarized response.

This dual-context approach is intended to improve the factual accuracy of the output by letting the LLM ground its reasoning in both the immediate and the broader institutional policy. It also makes the system more robust to edge cases, such as when specific handbook sections are sparse or ambiguous, because the general context can still supply relevant background. Step-back prompting is a comparatively recent technique in the RAG literature, and here it complements query rewriting rather than replacing it.
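
The dual retrieval described above can be sketched independently of the LLM and the vector index. In this minimal example a toy word-overlap retriever stands in for `hybrid_search`, and the step-back question is written by hand instead of being generated; the passages in `CORPUS` are invented placeholders, not handbook text.

```python
from typing import Callable, Dict, List

def build_dual_context(question: str, stepback_question: str,
                       retrieve: Callable[[str, int], List[str]],
                       top_k: int = 5, stepback_k: int = 3) -> Dict[str, str]:
    """Retrieve general context for the step-back question and specific context for the question."""
    general = retrieve(stepback_question, stepback_k)
    specific = retrieve(question, top_k)
    return {"general": "\n\n".join(general), "specific": "\n\n".join(specific)}

# Invented toy passages used only to show the control flow
CORPUS = [
    "general assessment rules cover exams and coursework",
    "a re-sit exam is offered after a failed first attempt",
    "library opening hours vary by term",
]

def toy_retrieve(query: str, k: int) -> List[str]:
    """Rank passages by the number of words shared with the query (stand-in for hybrid search)."""
    terms = set(query.lower().split())
    ranked = sorted(CORPUS, key=lambda t: len(terms & set(t.lower().split())), reverse=True)
    return ranked[:k]

ctx = build_dual_context(
    question="what if i fail a re-sit exam",
    stepback_question="what are the general assessment rules",
    retrieve=toy_retrieve,
    top_k=1,
    stepback_k=1,
)
print("General :", ctx["general"])
print("Specific:", ctx["specific"])
```

The two queries pull different passages, which is the point of the technique: the broader question surfaces background policy that the narrow question alone would not retrieve.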


###**9. Pipeline Integration with CollegeHandbookRAG**

In [None]:
class CollegeHandbookRAG:
    def __init__(self):
        self.doc_processor = DocumentProcessor()
        self.embedding_model = AdvancedEmbeddingModel()
        self.search_engine = HybridSearchEngine(self.embedding_model)
        self.query_enhancer = QueryEnhancer()
        self.stepback_prompter = StepBackPrompter()
        self.is_indexed = False

    def load_and_index_handbook(self, file_path: str):
        print("Loading and indexing handbook...")
        text = self.doc_processor.load_document(file_path)
        chunks = self.doc_processor.chunk_document(text, file_path)
        self.search_engine.index_documents(chunks)
        self.is_indexed = True
        print("Handbook indexing complete.")

    def answer_question(self, question: str, enhance=True, stepback=True, top_k=5):
        if not self.is_indexed:
            return {"error": "Please load the handbook first."}

        print(f"Question: {question}")
        enhanced_query = question

        if enhance:
            print("Enhancing query...")
            enhancement = self.query_enhancer.enhance_query(question)
            enhanced_query = self.query_enhancer.create_expanded_query(enhancement)

        stepback_q = None
        stepback_ctx = ""

        if stepback:
            print("Generating step-back question...")
            stepback_q = self.stepback_prompter.generate_stepback_question(question)
            stepback_chunks = self.search_engine.hybrid_search(stepback_q, top_k=3)
            stepback_ctx = "\n\n".join([chunk.content for chunk in stepback_chunks])

        print("Searching relevant chunks...")
        relevant_chunks = self.search_engine.hybrid_search(enhanced_query, top_k=top_k)
        specific_ctx = "\n\n".join([chunk.content for chunk in relevant_chunks])

        print("Synthesizing answer...")
        final_answer = self.stepback_prompter.synthesize_answer(
            original_question=question,
            stepback_question=stepback_q or question,
            stepback_context=stepback_ctx or specific_ctx,
            specific_context=specific_ctx
        )

        return {
            "question": question,
            "enhanced_query": enhanced_query,
            "stepback_question": stepback_q,
            "answer": final_answer,
            "context": {
                "stepback": stepback_ctx,
                "specific": specific_ctx
            }
        }


This orchestrator class integrates all components of the RAG system into a unified interface. It includes methods for loading documents, indexing them, and answering questions end-to-end. It encapsulates the full pipeline—from preprocessing to generation—behind a clean, reusable API. The modularity of the class design supports easy extensions such as model fine-tuning, reranking layers, or interactive UI integration.

The inclusion of multiple control parameters (e.g., whether to use query rewriting or step-back prompting) also allows for controlled experimentation, making the system ideal for future academic research or A/B testing in production.
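
As a rough illustration of such an experiment, the sketch below runs one question through every combination of the `enhance` and `stepback` flags. The helper `run_ablation` and the stand-in `fake_answer` are hypothetical; with the real system one would pass `rag.answer_question` instead, keeping in mind that each configuration triggers additional Gemini calls and may hit rate limits.

```python
from itertools import product
from typing import Any, Callable, Dict, Tuple

def run_ablation(answer_fn: Callable[..., Dict[str, Any]],
                 question: str) -> Dict[Tuple[bool, bool], Dict[str, Any]]:
    """Call answer_fn once per (enhance, stepback) combination and collect the results."""
    results = {}
    for enhance, stepback in product([False, True], repeat=2):
        results[(enhance, stepback)] = answer_fn(question, enhance=enhance, stepback=stepback)
    return results

def fake_answer(question: str, enhance: bool = True, stepback: bool = True) -> Dict[str, Any]:
    """Stand-in with the same keyword arguments as answer_question, so the loop runs offline."""
    return {"question": question, "answer": f"enhance={enhance}, stepback={stepback}"}

results = run_ablation(fake_answer, "What if i fail a re-sit exam?")
for flags, result in results.items():
    print(flags, "->", result["answer"])
```

Comparing the four answers side by side for a fixed set of questions is one simple way to judge how much each stage contributes.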


###**10. Upload and Index Handbook**

In [None]:
from google.colab import files

# Upload the handbook file (PDF, DOCX, or TXT)
uploaded = files.upload()
file_path = next(iter(uploaded))  # Get the uploaded file name

# Initialize and run the RAG system
rag = CollegeHandbookRAG()
rag.load_and_index_handbook(file_path)

# Please upload the WBS Handbook provided with the file here
# This cell takes around 20 minutes to load and index the PDF because of its size

Saving Student_Handbook_Masters.pdf to Student_Handbook_Masters.pdf


Embedding model 'all-MiniLM-L6-v2' loaded.
Loading and indexing handbook...


Batches:   0%|          | 0/205 [00:00<?, ?it/s]

Document chunks indexed.
Handbook indexing complete.


This section allows users to upload their own document in a Colab environment and have it indexed dynamically.

###**11. Answer a Student Query**

In [None]:
# Ask a question about the college handbook
question = "What if i fail a re-sit exam?" #@param {type:"string"}

# Get the answer
response = rag.answer_question(question)

# Display the response
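# Fall back to the error message if the handbook has not been indexed yet
print("Answer:", response.get("answer", response.get("error")))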


Question: What if i fail a re-sit exam?
Enhancing query...
Generating step-back question...
Searching relevant chunks...
Synthesizing answer...


[Truncated API warning: Gemini free-tier request quota reached for model "gemini-2.0-flash" (quota_id "GenerateRequestsPerMinutePerProjectPerModel-FreeTier", quota_value 15); retry_delay 41 seconds. See https://ai.google.dev/gemini-api/docs/rate-limits]


Answer: Okay, let's break down what happens if you fail a re-sit exam, keeping in mind the university's policies.

**Understanding Academic Support and Progression**

The university aims to support students in meeting academic standards. This includes offering special examination arrangements, such as extra time or the use of a PC, if you have a documented need (e.g., dyslexia, injury). You need to apply for these accommodations well in advance of the exam. Also, if religious observances prevent you from attending an exam, you should notify your academic department as soon as possible.

**What Happens If You Fail a Re-sit Exam?**

The handbook excerpt focuses on the "Mitigating Circumstances Borderline Policy." This policy comes into play at your *final* exam board (the one that determines your degree classification). If you've *passed* a module overall (meaning after the initial attempt and any re-sits), you *cannot* retake assessments for that module.

However, if you have mitigating

This final block demonstrates the real-world use case of the RAG system. A student enters a natural question, and the system responds with a handbook-grounded answer. This illustrates how the pipeline delivers relevant, policy-aware guidance that a standalone LLM without access to the handbook would be unlikely to provide accurately. The system handles query enhancement, hybrid retrieval, reasoning, and generation in a single call, showing how each module contributes to the final result.

###**Conclusion and Reflection**

This RAG system demonstrates how generative AI can be responsibly and effectively applied to institutional question-answering tasks. By grounding responses in official documentation, the system addresses three major challenges in open-domain LLMs: trust, transparency, and accuracy. The combination of recursive chunking, hybrid retrieval, query rewriting, and step-back prompting reflects recent advancements in RAG methodology and is designed to improve on standalone LLM approaches in both precision and depth of response.

Each module in the pipeline was implemented with careful attention to domain requirements and best practices. The result is a modular, extensible system that can easily be adapted to other domains, such as law, medicine, or corporate compliance. Future improvements could include re-ranking models, multi-turn conversational memory, or integration with document editing tools for policy updates. This project fulfills the goal of implementing and improving a RAG system in a meaningful, measurable way.
