# LTO RAG Table of Contents

1. Import Libraries
2. Connect to Ollama Server
3. Ingestion
4. QA Generation
5. Embedding and Retrieval<br>
 **5.A** Dense via FAISS  
 **5.B** FAISS Retrieval Evaluator  
 **5.C** FAISS Retrieval Evaluation  
 **5.D** Sparse Embedding via BM25  
 **5.E** Hybrid Embedding via Reciprocal Rank Fusion  
 **5.F** Hybrid Retrieval Evaluator  
 **5.G** Hybrid Retrieval Evaluation  
6. Post Retrieval<br>
 **6.A** Summarization  
 **6.B** Evaluation Generation  
7. Querying<br>
 **7.A** Query Transforms  
8. Query Generation
9. TDC Exam Evaluation
10. Similarity Evaluation
11. Relevancy Evaluation


# 1. Import Libraries

In [9]:
import os
import fitz
import re

from ollama import Client
import faiss
import pandas as pd
import numpy as np
import Stemmer
from tqdm import tqdm
import gradio as gr

from llama_index.core import Document
from llama_index.core.node_parser import TokenTextSplitter
from llama_index.core.retrievers import BaseRetriever, QueryFusionRetriever
from llama_index.core.schema import TextNode, NodeWithScore
from llama_index.retrievers.bm25 import BM25Retriever
from llama_index.llms.ollama import Ollama

# 2. Connect to Ollama Server

In [10]:
client = Client(
  host='http://localhost:11434',
)

# 3. Ingestion

In [11]:
# Path to the dataset folder
DATASET_PATH = '/mnt/c/Users/Jeryl Salas/Documents/AI 351/Project/Dataset2'

def extract_text_from_pdfs(folder_path):
    """Walk folder_path and return per-page text plus matching metadata for every PDF found."""
    texts = []
    metadata = []

    for root, _, files in os.walk(folder_path):
        for file in files:
            if file.endswith(".pdf"):
                pdf_path = os.path.join(root, file)
                folder_name = os.path.basename(root)
                print(f"Extracting text from {pdf_path}...")

                # Open with a context manager so the file handle is closed after reading
                with fitz.open(pdf_path) as doc:
                    for page_num, page in enumerate(doc, start=1):
                        text = page.get_text()
                        if text.strip():
                            texts.append(text.strip())
                            metadata.append({
                                "source": pdf_path,
                                "folder": folder_name,
                                "title": file,
                                "page": page_num
                            })
                        else:
                            # Pages with no extractable text are skipped
                            print(f"WARNING: {file} page {page_num} not processed...")
    return texts, metadata

In [12]:
docs, metadatas = extract_text_from_pdfs(DATASET_PATH)

Extracting text from /mnt/c/Users/Jeryl Salas/Documents/AI 351/Project/Dataset2/1-CC2024-INITIAL-REG.pdf...
Extracting text from /mnt/c/Users/Jeryl Salas/Documents/AI 351/Project/Dataset2/1-CC2024-MAIRDOE-NEW.pdf...
Extracting text from /mnt/c/Users/Jeryl Salas/Documents/AI 351/Project/Dataset2/1-CC2024-SETTLEMENT.pdf...
Extracting text from /mnt/c/Users/Jeryl Salas/Documents/AI 351/Project/Dataset2/1-CC2024-SP.pdf...
Extracting text from /mnt/c/Users/Jeryl Salas/Documents/AI 351/Project/Dataset2/10-CC2024-DL-CC-P-NP.pdf...
Extracting text from /mnt/c/Users/Jeryl Salas/Documents/AI 351/Project/Dataset2/10-CC2024-MV-CONDUCTION-STICKER.pdf...
Extracting text from /mnt/c/Users/Jeryl Salas/Documents/AI 351/Project/Dataset2/10-CC2024-RELEASING.pdf...
Extracting text from /mnt/c/Users/Jeryl Salas/Documents/AI 351/Project/Dataset2/11-CC2024-DL-ENHANCE.pdf...
Extracting text from /mnt/c/Users/Jeryl Salas/Documents/AI 351/Project/Dataset2/11-CC2024-MV-CONDUCTION-VERIFICATION.pdf...
Extracting t

In [13]:
documents = [Document(text=text, metadata=meta) for text, meta in zip(docs, metadatas)]
splitter = TokenTextSplitter(
    chunk_size=512,
    chunk_overlap=20,
    separator=" ",
)
nodes = splitter.get_nodes_from_documents(documents)
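
To make the effect of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of sliding-window chunking. It is a simplification under stated assumptions: it splits on whitespace rather than using a real tokenizer, and `chunk_words` is a hypothetical helper written for illustration, not part of LlamaIndex.

```python
def chunk_words(text, chunk_size, chunk_overlap):
    """Split text into windows of at most chunk_size words.

    Consecutive windows share chunk_overlap words. This is a hypothetical
    helper that approximates tokens with whitespace-separated words.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
chunks = chunk_words(sample, chunk_size=4, chunk_overlap=1)
for chunk in chunks:
    print(chunk)
```

Each window repeats the last word of the previous one, so a sentence cut at a chunk boundary still appears with some context in the next chunk.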

In [14]:
print(nodes[0])

Node ID: e1efc25c-5c58-44c8-9703-77e65b605179
Text: WHERE TO SECURE Accredited
Manufacturer/Assembler/Importer/Rebuilder/Other Entities Accredited
insurance companies by the Insurance Commission / GSIS PNP-HPG MV
Clearance Division Owner Land Transportation Management System (LTMS)
1. SALES REPORTING AND INITIAL REGISTRATION OF MOTOR VEHICLES One of
the core mandates of the LTO pursuant to Republ...


# 4. QA Generation
Used for retrieval evaluation:
1. Get all document nodes.
2. Generate one question for each node (using `llama3.1:8b` served by Ollama).
3. Form question-context pairs: (generated question, node text).

In [15]:
from llama_index.core.evaluation import (
    generate_question_context_pairs,
    EmbeddingQAFinetuneDataset,
)

ollama_llm = Ollama(model="llama3.1:8b", request_timeout=300)

qa_dataset = generate_question_context_pairs(
    nodes, llm=ollama_llm, num_questions_per_chunk=1
)

queries = qa_dataset.queries.values()
print(list(queries)[2])


# Save
qa_dataset.save_json("pg_eval_dataset.json")
print("Successfully created QA dataset")

  0%|          | 0/2592 [00:00<?, ?it/s]


ConnectError: [Errno 111] Connection refused

In [None]:
# Load
qa_dataset = EmbeddingQAFinetuneDataset.from_json("pg_eval_dataset.json")

# 5. Embedding and Retrieval

## 5.A. Dense via FAISS

In [None]:
def generate_embeddings(nodes, client, model):
    # Generate embeddings for documents using Ollama
    for doc in tqdm(nodes):
        response = client.embeddings(prompt=doc.text, model=model)
        doc.embedding = response["embedding"]
    return nodes

In [None]:
class FaissIndexer:
    """
    Faiss-based indexer for efficient similarity search using inner-product (cosine) similarity.

    This class handles the creation and management of a FAISS index from node embeddings.
    
    :ivar faiss_index: The FAISS index for storing and querying embeddings.
    :vartype faiss_index: faiss.IndexFlatIP
    :ivar embedding_dim: Dimensionality of the embeddings.
    :vartype embedding_dim: int
    """

    def __init__(self):
        """
        Initialize the FaissIndexer class.

        :ivar faiss_index: The FAISS index, initialized as None.
        :ivar embedding_dim: The dimension of embeddings, initialized as None.
        """
        self.faiss_index = None
        self.embedding_dim = None

    def normalize_embeddings(self, embeddings):
        """
        Normalize embeddings to have unit L2 norm.

        :param embeddings: Array of embeddings to normalize.
        :type embeddings: np.ndarray
        :return: Normalized embeddings.
        :rtype: np.ndarray
        """
        return embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)

    def build_index(self, nodes):
        """
        Build the FAISS index from a list of nodes containing embeddings.

        :param nodes: List of nodes, where each node contains an `embedding` attribute.
        :type nodes: list
        :raises ValueError: If the nodes list is empty or embeddings are inconsistent.
        """
        if not nodes:
            raise ValueError("Nodes list cannot be empty.")
        
        # FAISS works on float32 matrices of shape (n_nodes, dim)
        embeddings = np.array([node.embedding for node in nodes], dtype=np.float32)
        normalized_embeddings = self.normalize_embeddings(embeddings)

        self.embedding_dim = normalized_embeddings[0].shape[0]
        self.faiss_index = faiss.IndexFlatIP(self.embedding_dim)  # Inner-product similarity
        self.faiss_index.add(normalized_embeddings)

    def get_index(self):
        """
        Get the FAISS index instance.

        :return: The FAISS index used for similarity search.
        :rtype: faiss.IndexFlatIP
        :raises ValueError: If the index has not been built.
        """
        if self.faiss_index is None:
            raise ValueError("Index has not been built yet. Call 'build_index' first.")
        return self.faiss_index
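
The indexer relies on the fact that the inner product of two L2-normalized vectors equals their cosine similarity. The sketch below checks that identity on toy vectors using only the standard library; the helper names (`l2_normalize`, `dot`, `cosine`) and the sample vectors are illustrative assumptions and do not come from FAISS.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity computed directly from the definition."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


docs = [[3.0, 4.0], [1.0, 0.0], [-3.0, -4.0]]
query = [6.0, 8.0]

unit_query = l2_normalize(query)
ip_scores = [dot(unit_query, l2_normalize(d)) for d in docs]  # inner product of unit vectors
cos_scores = [cosine(query, d) for d in docs]                 # reference value

# Higher inner product means more similar, so rank in descending order
ranking = sorted(range(len(docs)), key=lambda i: ip_scores[i], reverse=True)
print(ip_scores, ranking)
```

Because the scores are similarities, the best match has the highest value, which matters for any downstream code that sorts by score.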

In [None]:
class FAISSVectorStoreRetriever(BaseRetriever):
    def __init__(self, faiss_index, documents):
        """
        Initialize the FAISS retriever.
        :param faiss_index: The FAISS index containing precomputed embeddings.
        :param documents: List of document chunks, in the same order as the vectors in the index.
        """
        super().__init__()
        self.faiss_index = faiss_index
        self.documents = documents

    def _retrieve(self, query_embedding, top_k=5):
        """
        Retrieve the top-k nearest neighbors using the FAISS index.
        :param query_embedding: The embedding of the query.
        :param top_k: Number of top results to retrieve.
        """

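        # FAISS expects a float32 matrix of shape (n_queries, dim)
        norm_query_embedding = np.array([query_embedding], dtype=np.float32)
        norm_query_embedding /= np.linalg.norm(norm_query_embedding, axis=1, keepdims=True)

        # IndexFlatIP returns inner products, which for unit vectors are cosine
        # similarities (despite the name `distances`). `1 - dist` is therefore a
        # cosine distance: lower scores mean closer matches, and any code that
        # sorts by score in descending order will see this ranking reversed.
        distances, indices = self.faiss_index.search(norm_query_embedding, top_k)
        retrieved_docs = [
            NodeWithScore(node=self.documents[idx], score=1 - dist)
            for idx, dist in zip(indices[0], distances[0])
            if idx != -1
        ]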
        return retrieved_docs

## 5.B. FAISS Retrieval Evaluator

In [None]:
from typing import List, Tuple, Any
from pydantic import Field, ConfigDict
from llama_index.core.evaluation.retrieval.base import (
    BaseRetrievalEvaluator,
    RetrievalEvalMode,
    RetrievalEvalResult
)
from llama_index.core.evaluation.retrieval.metrics import resolve_metrics
from llama_index.core.response.notebook_utils import display_source_node

class FAISSRetrievalEvaluator(BaseRetrievalEvaluator):
    retriever: "FAISSVectorStoreRetriever" = Field(..., description="FAISS Retriever instance")
    Print_Results: bool = Field(default=False, description="Whether to print retrieved results")

    model_config = ConfigDict(extra="forbid")  

    @classmethod
    def from_metric_names(
        cls,
        metric_names: List[str],
        retriever: "FAISSVectorStoreRetriever",
        Print_Results: bool = False,
        **kwargs: Any,
    ) -> "FAISSRetrievalEvaluator":
        metric_types = resolve_metrics(metric_names)
        metrics = [metric() for metric in metric_types]
        return cls(metrics=metrics, retriever=retriever, Print_Results=Print_Results, **kwargs)

    async def _aget_retrieved_ids_and_texts(
        self,
        query: str,
        mode: RetrievalEvalMode = RetrievalEvalMode.TEXT,
    ) -> Tuple[List[str], List[str]]:
        response = client.embeddings(prompt=query, model="mxbai-embed-large")
        query_embedding = response["embedding"]
        retrieved_docs = self.retriever._retrieve(query_embedding, top_k=15)
        
        # Conditionally print results
        if self.Print_Results:
            for doc in retrieved_docs:
                display_source_node(doc, source_length=1000)  # Directly use doc
        
        retrieved_ids = [doc.id_ for doc in retrieved_docs]  # doc, not doc.node
        retrieved_texts = [doc.text for doc in retrieved_docs]
        return retrieved_ids, retrieved_texts

    async def aevaluate(
        self,
        query: str,
        expected_ids: List[str],
        expected_texts: List[str] = [],
        mode: RetrievalEvalMode = RetrievalEvalMode.TEXT,
        **kwargs: Any,
    ) -> RetrievalEvalResult:
        retrieved_ids, retrieved_texts = await self._aget_retrieved_ids_and_texts(query, mode)
        metric_dict = {}
        
        for metric in self.metrics:
            # Call compute instead of evaluate
            result = metric.compute(
                query=query,
                expected_ids=expected_ids,
                retrieved_ids=retrieved_ids,
                expected_texts=expected_texts,
                retrieved_texts=retrieved_texts,
                **kwargs
            )
            metric_dict[metric.metric_name] = result  # Store the whole RetrievalMetricResult object
            print(f"{metric.metric_name}: {result.score}")  # Print each metric result
        
        # Return RetrievalEvalResult with all required fields
        return RetrievalEvalResult(
            query=query,
            retrieved_ids=retrieved_ids,
            retrieved_texts=retrieved_texts,  # Include retrieved texts
            expected_ids=expected_ids,         # Pass expected ids
            expected_texts=expected_texts,     # Include expected texts
            metric_dict=metric_dict            # Pass full RetrievalMetricResult objects
        )


In [None]:
nodes_embed = generate_embeddings(nodes, client, "mxbai-embed-large")

100%|██████████| 2592/2592 [07:30<00:00,  5.76it/s]  


## 5.C. FAISS Retrieval Evaluation

In [None]:
indexer = FaissIndexer()

# Build the index from the nodes that carry embeddings
indexer.build_index(nodes_embed)
faiss_index = indexer.get_index()

retriever = FAISSVectorStoreRetriever(faiss_index=faiss_index, documents=nodes_embed)

metrics = ["hit_rate", "mrr", "precision", "recall", "ap", "ndcg"]
evaluator = FAISSRetrievalEvaluator.from_metric_names(
    metric_names=metrics,
    retriever=retriever,
    Print_Results=True
)


sample_id, sample_query = list(qa_dataset.queries.items())[1]
sample_expected = qa_dataset.relevant_docs[sample_id]

print("Sample Problem:")
print(f"sample id: {sample_id}, text: {sample_query}")
print(f"sample exp: {sample_expected}")

print("\nRetrieval results:")

result = await evaluator.aevaluate(
    query=sample_query,
    expected_ids=sample_expected,
)

print("\n")
print(result)

Sample Problem:
sample id: 2dbe10ff-a3d8-49b1-8d9f-bf6642834340, text: What type of motor vehicles require special registration procedures, and where can they be registered?
sample exp: ['a647ee3e-c2ec-4d75-8e0c-827748e58bcc']

Retrieval results:


**Node ID:** c4c26600-9dbd-457f-b4a2-03ac4b22434d<br>**Similarity:** 0.21263229846954346<br>**Text:** Registration of Motor Vehicle<br>

**Node ID:** 4e6cb4e8-d94a-4500-aa34-79da965f66ed<br>**Similarity:** 0.22287940979003906<br>**Text:** Registration of Motor Vehicle
Special Permit
 Vehicles exceeding the permissible dimension without special permit 
are not allowed to be operated on public highways.
Additional Fees
 Any changes on Certificate of Registration or Driver’s License requires 
additional fees otherwise known as miscellaneous transaction.
Use and Authority of Certificates of Registration
 A photocopy or the original CR are required to be preserved and carried 
in the car by the owner or driver of a motor vehicle<br>

**Node ID:** 31ea2b44-5427-417a-aab2-2e41e9d4d101<br>**Similarity:** 0.26351630687713623<br>**Text:** 1 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
• Requirements for New Registration  
 
• Requirements for Renewal of Registration  
      – for All Classifications 
 
• Motor Vehicle Registration Schedule 
 
• Motor Vehicle User’s Charge 
 
• Charges and Other Fees<br>

**Node ID:** af1f67cd-dbeb-4925-9643-52809434b103<br>**Similarity:** 0.273284912109375<br>**Text:** 20 
 
3. Optional Motor Vehicle Special Plate  
Three (3) alpha characters and two (2) numeric characters 
 
TSD 77 
 
  LAST DIGIT vehicle ending 7 must be 
registered in July 
 
2nd TO THE LAST DIGIT OF THE PLATE 
NUMBER – refers to the week the vehicle 
must be registered. 2nd to the last or middle 
digit of 7 must register between 15th to 21st 
working day or 3rd week of July<br>

**Node ID:** f1cad40a-af4b-4131-a052-252bdf1025bd<br>**Similarity:** 0.2910946011543274<br>**Text:** and technology; 
J. 
Registration - refers to the act of compliance to the documentary requirements, 
standards and procedures of Republic Act No. 11698 and these Rules in order for a 
vintage vehicle to be classified by the Land Transp01tation Office under a 'vintage 
vehicle' subclassification; 
k. Registered vintage vehicle - refers to a vintage vehicle registered with a 'vintage 
vehicle' subclassification that benefits from the exemptions and privileges under 
Republic Act No. I 1698 and these Rules~ -----_ ·' __ ·-··· __ --· . __ 
I l r'. LA lfw f~ t£ ~; '( f~ ~-=-
. t 
. > 
_ • 
· · ~ 
•(,: :• ·,c 
, :• H 
•1 1 :1~
, -..·F 
{ I_ f'\ , • • 
I 
Implementing Rules and Regulations 
~l n . 
. 
. 
1'r Page 3 o" 26 
Republic Act No. 11698 - Vintage Vehicle Regulation Ac~ 
1.1 ( 
JAN 3 0 202j 
I ) '.: 
' l ~ \ IT' ([l ~ •""" Ii 'V ·:~ I i I ) ' 
1:-1~. Lfr,it ~ l {~_,~ ~-- y I 
. -------------- ----··~- -<br>

**Node ID:** 87ff145f-1051-4bee-9474-e959b516cc40<br>**Similarity:** 0.29999953508377075<br>**Text:** .. 1·\ 
I 
cr)tional Motor Vehicle Special Plates (OMVSP) 
· 
Application Form 
. -
Registered Owner Information: 
0 
Name: 
Last Name 
First Name 
Middle Name 
Address: 
No. 
Street 
Province/City 
Zip Code 
Telephone/Mobile: No. : 
Email Address: 
~ 
OMVSP Requirements: 
~ 
1. 
Duly accomplished OMVSP application form by the Owner or its duly Authorized Representative; 
2. 
Certificate of Registration and official Receipt 
3. 
Certificate of No Plate Issued; 
4. 
Identification Card of the Authorized Representative; 
5. 
Authorization Letter form the Owner (if applicable). 
-
Motor Vehicle Information: 
[I] 
MV File No.: 
MVType: 
Chassis No. : 
Make: 
Engine! No. : 
OR No. : 
Year Model: 
CR No. : 
Plate Request 
List plate request in order of preference. First available preference will· be ordered. THE ORDER CANNOT BE CHANGED. 
CANCELLATION OF THIS ORDER WILL NOT ENTITLE YOU TO A REFUND .. · 
• ANY THREE (ll Al.PH°A. COMBINATIUN +TRIPLE (l) NUMERIC EXCEPT 00 
ANY THREE (3) ALPHA...<br>

**Node ID:** 223e9b01-3f15-42cd-8751-078791c50b66<br>**Similarity:** 0.302068829536438<br>**Text:** 2 
 
 
DOCUMENTARY REQUIREMENTS 
 FOR REGISTRATION 
 
I. INITIAL REGISTRATION OF MOTOR VEHICLE 
REQUIREMENTS 
WHERE TO SECURE 
 
General Requirements 
 
• Original Sales invoice 
 
 
 
 
 
 
• Original LTO copy or electronically 
transmitted appropriate insurance 
Certificate of Cover (Third Party Liability) 
 
• Original copy of Philippine National 
Police - Highway Patrol Group (PNP- 
HPG) Motor Vehicle (MV) Clearance 
Certificate and Special Bank Receipt 
(SBR) 
 
• Original Certificate of Stock Reported 
(CSR) 
 
 
 
• Payment Reference Number if payment 
is made through e-PAT 
 
 
- Accredited Manufacturer/ 
Assembler/Importer 
/Rebuilder/ Dealer (MAIRD) 
 
- Accredited insurance 
companies by the Insurance 
Commission 
 
- PNP-HPG MV Clearance 
Division 
 
 
 
 
- Accredited Manufacturer/ 
Assembler/Importer/Rebuild
er/ Dealer (MAIRD) 
 
- LANDBANK Link.BizPortal 
Additional Requirements 
1. Brand New Motorcycle with Sidecar 
(TC) 
 
• Original Affidavit of Attachment for 
sid...<br>

**Node ID:** 67de5b2b-7d5a-4f2e-9ef2-c69361a4ab7f<br>**Similarity:** 0.3026140332221985<br>**Text:** CHECKLIST 
MANDATORY SUPPORTING DOCUMENTS 
FOR REGISTRATION TRANSACTIONS 
PRIVATE 
I. 
New Registration 
1. lmpe>tted Mt>tor Vehicle 
0 Original Invoice 
0 Certification of Payment of 
taxes 
0 CSR 
0 CHPG clearance 
O Insurance Certificate of Cover 
0 Actual inspection of Motor 
Vehicle & duly accomplished 
MVIR (stencils of motor & 
chassis nos. must be done on 
the space provided for) 
0 Early Warning Device (EWD) 
2. Locally Assemble/Rebuilt 
0 CSR 
O Original Sales Invoice/Commercial 
Invoice of motor/chassis 
0 CR & OR of motor/chassis, 
if taken from another motor 
vehicle 
O Certification of Payment of 
Taxes from Boe BOC & BIR, if 
motor/chassis are imported. 
0 CHPG clearance 
0 Insurance Certificate of Cover 
O Affidavit of Rebuilt of Owner/ 
Mechanic 
O Actual Inspection of Motor 
Vehicle & duly accomplished 
MVIR (stencils of motor & 
chassis nos. must be done on 
the space provided for) 
O Early Warning Device (EWD) 
II. 
Renewal of MV Registration 
O Original copy of ...<br>

**Node ID:** 68afede0-1696-4c83-be8b-e34b73d411a9<br>**Similarity:** 0.3033105134963989<br>**Text:** PERMIT & LICENSES
Special-purpose Vehicle - Required Training
School Service<br>

**Node ID:** 4c0c3f9c-4673-428a-9b48-5b917047b19d<br>**Similarity:** 0.3039371967315674<br>**Text:** Registration of Motor Vehicle
NUMBER PLATES
PRIVATE
FOR HIRE
GOVERMENT
DIPLOMATIC<br>

**Node ID:** bc7de25c-d18d-4401-a86d-a27d7181638e<br>**Similarity:** 0.3083765506744385<br>**Text:** b. Modifications to brakes, suspension, axles, and running gear to improve efficienc) or 
safety; 
c. Use of carburetors, fuel injection systems or emission control devices not original to 
the vintage vehicle to improve efficiency, economy or environmental pe1formance: 
d. Lnstallation of a new, modem engine of the same brand or manufacture and of the same 
general specification (e.g. fuel type, piston displacement, number of cylinders, engine 
configuration or layout, etc.) as the engine original to the vintage vehicle or vehicles 
belonging to the same historic model line or automobile brand lineage as the vintage 
vehicle; and 
e. Installation of after-market accessories and equipment, such as radios, air-conditioning, 
and directional lights, to pennit the convenient or safe use of the vehicle, except those 
which are expressly prohibited by law to be used or attached in any motor vehicle such 
as, but not limited to, sirens, bells, horns, whistles, or other similar gadgets tha...<br>

**Node ID:** ce4dcdc6-d8a0-4b91-b0a9-2058858ce44a<br>**Similarity:** 0.30905842781066895<br>**Text:** 2. RENEWAL OF MOTOR VEHICLE (MV) REGISTRATION 
Pursuant to Republic Act No. 4136 and other special laws, one of the core mandates of the LTD is to register roadworthy and emission 
compliant motor vehicles for the current year depending on the plate ending 
A. L TO District Offices (OOs) I Extension Offices (EOs) IE-Patrols 
B. For Tax Exempt (Diplomatic, Exempt Private): Authorized District Offices nearest to the Regional Office 
Office or Division: 
C. For Other Exempt Vehicles {OEVs): l TO DO I EO nearest to the Special Economic Zone 
0 . For ' For-Hire" MVs in NCR: Public Utility Vehicle Registration Center (PUVRC) and Public Utility Vehicle 
Registration Extension Center (PUVREC) 
Classification: 
Simple Transaction 
G28 - Government to Business 
Type of Transaction: 
G2C · Government to Citizen 
G2G - Government to Government 
Who may avail: 
Motor vehicle owners 
CHECKLIST OF REQUIREMENTS 
General Requirements; 
I 
WHERE TO SECURE 
1. One (1) clear photocopy of latest OR/CR (...<br>

**Node ID:** e81bca5b-0e21-4955-af14-e5c406cf42b5<br>**Similarity:** 0.31441283226013184<br>**Text:** VIOLATIONS IN CONN ECTION WITH 
REPUBLIC ACT NO. 4136 AND THE JOINT ADMINISTRATIV2 ORDER NO. 2014-01 
RA.4136 
]AO 2014-01 
Chapter II - REGISTRATION OF MOTOR A. 
DRIVING 
AN 
UNREGISTERED 
MOTOR 
VEHICLESSection 5. All motor vehicles and VEHICLE 
other vehicles must be registered. 
(a) no motor vehicle shall be used on or upon any 
public highway of the Philippines unless the same 
is properly registered for the current year 111 
accordance with the provisions of RA 4136. 
(b) Any registration of motor vehicles not renewed 
on or before the date fixed 
for 
different 
classifications as provided. 
-Operating 
a 
motor 
unregistered/improperly 
with iiwalid registration 
vehicle 
which 
is 
registered/ delinquent or 
This includes driving with an improperly registered motor 
vehicle or a motor vehicle with expired, revoked, suspended or 
invalid 
registration, unregistered 
or fake substirute or 
replacement engine, engine block or chassis. 
-Operating a motor vehicle with unregiste...<br>

**Node ID:** e47edbb5-3590-4de6-854b-bfbe791875ba<br>**Similarity:** 0.31452977657318115<br>**Text:** under a vintage vehicle subclassification as stated in the preceding 
sections of this Rule shall not be mandatory for eligible vintage vehicles. 
Moreover, an owner of a registered vintage vehicle with a regish·ation that has expired 
or is about to expire may opt to revert to a regular annual regish·ation, which shall subject the 
vehicle to all laws and regulations governing the registration and use of motor vehicles in 
general, including all emission, safety, roadworthiness and other standards. 
U.F'. LAW CENTER 
I 
OFFICE ol lht ~AllONAl ,;DMIN l ~TRAH'/£ REGISTER. 
Implementing Rules and Regulations 
Republic Act No. 11698 - Vintage Vehicle Regulation Act 
Adm1n1sua1" ~ Ru t ~• ~nd Re9ul!l!Ons 
I. 
~~e 17 of 26 
I 
JAN 3 0 2023 
l ~ I 
\ 
\
r~-· rf' llZ· , -,~
· 1 'i);'' 
.u. J-f.·'h LC.1 .J. 
l L 
T t. E __ J.1 ' {..'Q 
;· \ 
. 
' 
------ ___ 
;: __ .::..-
. . -·---<br>

**Node ID:** 2ed1cd47-e757-4b80-aa62-57b89252d263<br>**Similarity:** 0.3153288960456848<br>**Text:** 3. Original and photocopy of any valid government issued identification document of 
the registered owner with photo and signature; 
4. Philippine National Police (PNP) - Highway Patrol Group (HPG) Motor Vehicle 
Clearance Certificate; 
5. Proof of electronically transmitted appropriate Third Party Liability (TPL) 
Insurance Certificate of Cover (COC); 
6. Certificate of title issued by the country of origin or commercial invoice of motor 
vehicle; and 
7. Duly accomplished L TO application form for registration as a vintage vehicle under 
Republic Act No. 11698 (attached as "Appendix 1 '). 
b. Motor vehicles that have current or previous records of annual registration with the 
LTO (including those vehicles that were registered but are placed on storage) may be 
specially registered under a vintage vehicle subclassification to avail any exemption or 
privilege under these Rules. This transaction requires the issuance of a new Certific te 
of Registration. 
The following documentary...<br>

hit_rate: 0.0
mrr: 0.0
precision: 0.0
recall: 0.0
ap: 0.0
ndcg: 0.0


Query: What type of motor vehicles require special registration procedures, and where can they be registered?
Metrics: {'hit_rate': 0.0, 'mrr': 0.0, 'precision': 0.0, 'recall': 0.0, 'ap': 0.0, 'ndcg': 0.0}



In [None]:
eval_results = await evaluator.aevaluate_dataset(qa_dataset)

def display_results(name, eval_results):
    """Display results from evaluate."""

    metric_dicts = []
    for eval_result in eval_results:
        metric_dict = eval_result.metric_vals_dict
        metric_dicts.append(metric_dict)

    full_df = pd.DataFrame(metric_dicts)

    columns = {
        "retrievers": [name],
        **{k: [full_df[k].mean()] for k in metrics},
    }

    metric_df = pd.DataFrame(columns)

    return metric_df


display_results("top-2 eval", eval_results)

hit_rate: 0.0
mrr: 0.0
precision: 0.0
recall: 0.0
ap: 0.0
ndcg: 0.0
hit_rate: 0.0
mrr: 0.0
precision: 0.0
recall: 0.0
ap: 0.0
ndcg: 0.0
hit_rate: 0.0
mrr: 0.0
precision: 0.0
recall: 0.0
ap: 0.0
ndcg: 0.0
hit_rate: 1.0
mrr: 0.1
precision: 0.06666666666666667
recall: 1.0
ap: 0.1
ndcg: 0.2890648263178879
hit_rate: 0.0
mrr: 0.0
precision: 0.0
recall: 0.0
ap: 0.0
ndcg: 0.0
hit_rate: 1.0
mrr: 0.2
precision: 0.06666666666666667
recall: 1.0
ap: 0.2
ndcg: 0.38685280723454163
hit_rate: 0.0
mrr: 0.0
precision: 0.0
recall: 0.0
ap: 0.0
ndcg: 0.0
hit_rate: 1.0
mrr: 0.5
precision: 0.06666666666666667
recall: 1.0
ap: 0.5
ndcg: 0.6309297535714575
hit_rate: 0.0
mrr: 0.0
precision: 0.0
recall: 0.0
ap: 0.0
ndcg: 0.0
hit_rate: 1.0
mrr: 0.5
precision: 0.06666666666666667
recall: 1.0
ap: 0.5
ndcg: 0.6309297535714575
hit_rate: 0.0
mrr: 0.0
precision: 0.0
recall: 0.0
ap: 0.0
ndcg: 0.0
hit_rate: 1.0
mrr: 0.16666666666666666
precision: 0.06666666666666667
recall: 1.0
ap: 0.16666666666666666
ndcg: 0.3562071871080

| | retrievers | hit_rate | mrr | precision | recall | ap | ndcg |
|---|---|---|---|---|---|---|---|
| 0 | top-2 eval | 0.420718 | 0.289973 | 0.028048 | 0.420718 | 0.289973 | 0.320798 |
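
The per-query numbers printed above can be reproduced by hand. Each query in this dataset has exactly one relevant node and the evaluator retrieves 15 nodes, so every metric depends only on the rank at which the relevant node appears. The sketch below uses the standard metric definitions specialized to that case; `single_relevant_metrics` is a hypothetical helper for illustration, not LlamaIndex's implementation.

```python
import math


def single_relevant_metrics(rank, top_k):
    """Metrics for a query with one relevant node.

    rank is the 1-based position of the relevant node among the top_k
    retrieved nodes, or None if it was not retrieved.
    """
    names = ["hit_rate", "mrr", "precision", "recall", "ap", "ndcg"]
    if rank is None:
        return {name: 0.0 for name in names}
    return {
        "hit_rate": 1.0,                     # the relevant node was retrieved
        "mrr": 1.0 / rank,                   # reciprocal rank of the first hit
        "precision": 1.0 / top_k,            # one relevant node among top_k results
        "recall": 1.0,                       # the only relevant node was found
        "ap": 1.0 / rank,                    # precision at the single hit position
        "ndcg": 1.0 / math.log2(rank + 1),   # DCG divided by the ideal DCG of 1
    }


for rank in (10, 5, 2, None):
    print(rank, single_relevant_metrics(rank, top_k=15))
```

A hit at rank 10 gives `mrr` 0.1 and `ndcg` of about 0.289, matching one of the printed results. It also explains why precision never exceeds 1/15 ≈ 0.0667 here: with a single relevant node, precision at 15 is capped by construction.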


In [None]:
# Indexing
index = FaissIndexer()
index.build_index(nodes_embed)
faiss_index = index.get_index()

faiss_retriever = FAISSVectorStoreRetriever(faiss_index=faiss_index, documents=nodes_embed)

## 5.D Sparse Embedding via BM25

In [None]:
bm25_retriever = BM25Retriever.from_defaults(
   nodes=nodes,
   similarity_top_k=5,
   stemmer=Stemmer.Stemmer("english"),
   language="english",
)
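
For intuition about what the sparse retriever scores, here is a minimal sketch of the common Okapi BM25 formula. It rests on assumptions: whitespace tokenization without stemming, `k1=1.5` and `b=0.75` as typical defaults, and a toy corpus. `bm25_scores` is a hypothetical helper, and the library behind `BM25Retriever` may use a different variant.

```python
import math
from collections import Counter


def bm25_scores(query, corpus, k1=1.5, b=0.75):
    """Score every document in corpus against query with Okapi BM25."""
    docs = [doc.lower().split() for doc in corpus]
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Number of documents that contain each term
    doc_freq = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # Term frequency saturates with k1 and is normalized by document length with b
            norm = tf[term] + k1 * (1.0 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / norm
        scores.append(score)
    return scores


corpus = [
    "renewal of motor vehicle registration",
    "requirements for a driver license",
    "special plate for motor vehicle",
]
scores = bm25_scores("vehicle registration", corpus)
best = max(range(len(corpus)), key=lambda i: scores[i])
print(scores, best)
```

Unlike the dense retriever, this scoring only rewards exact term overlap, which is why the two approaches tend to complement each other.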

## 5.E Hybrid Retrieval via Reciprocal Rank Fusion

In [None]:
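def hybrid_embedding(results: dict, top_k: int):
    """Fuse per-retriever result lists with reciprocal rank fusion and keep the top_k nodes."""
    # The private method is called unbound, so None stands in for `self`
    ranked_results = QueryFusionRetriever._reciprocal_rerank_fusion(None, results)
    return ranked_results[:top_k]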
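
Reciprocal rank fusion ignores the raw scores of each retriever and combines only the ranks. The sketch below shows the idea on plain ID lists. The constant `k=60` and zero-based ranks are assumptions; they are consistent with fused scores such as 0.01666... (1/60) in the hybrid output of this notebook. `reciprocal_rank_fusion` is a hypothetical helper, not the LlamaIndex method.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several rankings of document IDs.

    rankings maps a retriever name to a list of IDs ordered best first.
    Each appearance at zero-based position `rank` contributes 1 / (rank + k).
    """
    fused = {}
    for ranked_ids in rankings.values():
        for rank, doc_id in enumerate(ranked_ids):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (rank + k)
    # Highest fused score first
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


rankings = {
    "faiss": ["a", "b", "c"],
    "bm25": ["b", "d", "a"],
}
fused = reciprocal_rank_fusion(rankings)
for doc_id, score in fused:
    print(doc_id, round(score, 6))
```

Documents returned by both retrievers accumulate two contributions and rise to the top, which is why each input list must be ordered best first before fusion.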

## 5.F Hybrid Retrieval Evaluator

In [None]:
from typing import List, Tuple, Any, Dict
from pydantic import Field, ConfigDict
from llama_index.core.evaluation.retrieval.base import (
    BaseRetrievalEvaluator,
    RetrievalEvalMode,
    RetrievalEvalResult
)
from llama_index.core.evaluation.retrieval.metrics import resolve_metrics
from llama_index.core.response.notebook_utils import display_source_node

class HybridRetrievalEvaluator(BaseRetrievalEvaluator):
    faiss_retriever: "FAISSVectorStoreRetriever" = Field(..., description="FAISS Retriever instance")
    bm25_retriever: "BM25Retriever" = Field(..., description="BM25 Retriever instance")
    Print_Results: bool = Field(default=False, description="Whether to print retrieved results")

    model_config = ConfigDict(extra="forbid")  

    @classmethod
    def from_metric_names(
        cls,
        metric_names: List[str],
        faiss_retriever: "FAISSVectorStoreRetriever",
        bm25_retriever: "BM25Retriever",
        Print_Results: bool = False,
        **kwargs: Any,
    ) -> "HybridRetrievalEvaluator":
        metric_types = resolve_metrics(metric_names)
        metrics = [metric() for metric in metric_types]
        return cls(metrics=metrics, faiss_retriever=faiss_retriever, bm25_retriever=bm25_retriever, Print_Results=Print_Results, **kwargs)

    async def _aget_retrieved_ids_and_texts(
        self,
        query: str,
        mode: RetrievalEvalMode = RetrievalEvalMode.TEXT,
        top_k: int = 15
    ) -> Tuple[List[str], List[str]]:
        response = client.embeddings(prompt=query, model="mxbai-embed-large")
        query_embedding = response["embedding"]

        # FAISS retrieval
        faiss_docs = self.faiss_retriever._retrieve(query_embedding, top_k=top_k)

        # BM25 retrieval
        bm25_docs = self.bm25_retriever.retrieve(query)

        # Combine results
        results = {'faiss': faiss_docs, 'bm25': bm25_docs}
        ranked_results = QueryFusionRetriever._reciprocal_rerank_fusion(None, results)
        ranked_results = ranked_results[:top_k]

        # Optionally display results
        if self.Print_Results:
            for doc in ranked_results:
                display_source_node(doc, source_length=1000)

        retrieved_ids = [doc.id_ for doc in ranked_results]
        retrieved_texts = [doc.text for doc in ranked_results]
        return retrieved_ids, retrieved_texts

    async def aevaluate(
        self,
        query: str,
        expected_ids: List[str],
        expected_texts: List[str] = [],
        mode: RetrievalEvalMode = RetrievalEvalMode.TEXT,
        **kwargs: Any,
    ) -> RetrievalEvalResult:
        retrieved_ids, retrieved_texts = await self._aget_retrieved_ids_and_texts(query, mode)
        metric_dict = {}
        
        for metric in self.metrics:
            # Call compute instead of evaluate
            result = metric.compute(
                query=query,
                expected_ids=expected_ids,
                retrieved_ids=retrieved_ids,
                expected_texts=expected_texts,
                retrieved_texts=retrieved_texts,
                **kwargs
            )
            metric_dict[metric.metric_name] = result  # Store the whole RetrievalMetricResult object
            print(f"{metric.metric_name}: {result.score}")  # Print each metric result
        
        # Return RetrievalEvalResult with all required fields
        return RetrievalEvalResult(
            query=query,
            retrieved_ids=retrieved_ids,
            retrieved_texts=retrieved_texts,
            expected_ids=expected_ids,
            expected_texts=expected_texts,
            metric_dict=metric_dict
        )


## 5.G Hybrid Retrieval Evaluation

In [None]:
evaluator = HybridRetrievalEvaluator.from_metric_names(
    metric_names=metrics,
    faiss_retriever=faiss_retriever,
    bm25_retriever=bm25_retriever,
    Print_Results=True  # Toggle as needed
)

sample_id, sample_query = list(qa_dataset.queries.items())[1]
sample_expected = qa_dataset.relevant_docs[sample_id]

print("Sample Problem:")
print(f"sample id: {sample_id}, text: {sample_query}")
print(f"sample exp: {sample_expected}")

print("\nRetrieval results:")

result = await evaluator.aevaluate(
    query=sample_query,
    expected_ids=sample_expected
)

print("\n")
print(result)

Sample Problem:
sample id: 2dbe10ff-a3d8-49b1-8d9f-bf6642834340, text: What type of motor vehicles require special registration procedures, and where can they be registered?
sample exp: ['a647ee3e-c2ec-4d75-8e0c-827748e58bcc']

Retrieval results:


**Node ID:** ce4dcdc6-d8a0-4b91-b0a9-2058858ce44a<br>**Similarity:** 0.032266458495966696<br>**Text:** 2. RENEWAL OF MOTOR VEHICLE (MV) REGISTRATION 
Pursuant to Republic Act No. 4136 and other special laws, one of the core mandates of the LTD is to register roadworthy and emission 
compliant motor vehicles for the current year depending on the plate ending 
A. L TO District Offices (OOs) I Extension Offices (EOs) IE-Patrols 
B. For Tax Exempt (Diplomatic, Exempt Private): Authorized District Offices nearest to the Regional Office 
Office or Division: 
C. For Other Exempt Vehicles {OEVs): l TO DO I EO nearest to the Special Economic Zone 
0 . For ' For-Hire" MVs in NCR: Public Utility Vehicle Registration Center (PUVRC) and Public Utility Vehicle 
Registration Extension Center (PUVREC) 
Classification: 
Simple Transaction 
G28 - Government to Business 
Type of Transaction: 
G2C · Government to Citizen 
G2G - Government to Government 
Who may avail: 
Motor vehicle owners 
CHECKLIST OF REQUIREMENTS 
General Requirements; 
I 
WHERE TO SECURE 
1. One (1) clear photocopy of latest OR/CR (...<br>

**Node ID:** 2ed1cd47-e757-4b80-aa62-57b89252d263<br>**Similarity:** 0.016666666666666666<br>**Text:** 3. Original and photocopy of any valid government issued identification document of 
the registered owner with photo and signature; 
4. Philippine National Police (PNP) - Highway Patrol Group (HPG) Motor Vehicle 
Clearance Certificate; 
5. Proof of electronically transmitted appropriate Third Party Liability (TPL) 
Insurance Certificate of Cover (COC); 
6. Certificate of title issued by the country of origin or commercial invoice of motor 
vehicle; and 
7. Duly accomplished L TO application form for registration as a vintage vehicle under 
Republic Act No. 11698 (attached as "Appendix 1 '). 
b. Motor vehicles that have current or previous records of annual registration with the 
LTO (including those vehicles that were registered but are placed on storage) may be 
specially registered under a vintage vehicle subclassification to avail any exemption or 
privilege under these Rules. This transaction requires the issuance of a new Certific te 
of Registration. 
The following documentary...<br>

**Node ID:** 8c86b879-de45-4059-af0a-bebcb81a183b<br>**Similarity:** 0.016666666666666666<br>**Text:** Office or Division:
Classification:
Type of Transaction:
Who may avail:
WHERE TO SECURE
LTO
LTO
New Registration Unit, District Office, Extension Office
Simple
9. REQUEST FOR CONFIRMATION OF MOTOR VEHICLE REGISTRATION
Procedure for miscellaneous transaction where the subject motor vehicle is initially registered at a New Registration Unit (NRU) /District Office (DO)/ Extension Office
(EO) located in a different Region to that of the transacting DO/EO.
2. Other applicable documents depending on the type of  miscellaneous motor vehicle 
transaction
G2G - Government to Government
Records Officer of the transacting NRU/DO/EO
CHECKLIST OF REQUIREMENTS
1. Request for Confirmation Form
REQUEST FOR CONFIRMATION OF MOTOR VEHICLE REGISTRATION
193<br>

**Node ID:** e47edbb5-3590-4de6-854b-bfbe791875ba<br>**Similarity:** 0.01639344262295082<br>**Text:** under a vintage vehicle subclassification as stated in the preceding 
sections of this Rule shall not be mandatory for eligible vintage vehicles. 
Moreover, an owner of a registered vintage vehicle with a regish·ation that has expired 
or is about to expire may opt to revert to a regular annual regish·ation, which shall subject the 
vehicle to all laws and regulations governing the registration and use of motor vehicles in 
general, including all emission, safety, roadworthiness and other standards. 
U.F'. LAW CENTER 
I 
OFFICE ol lht ~AllONAl ,;DMIN l ~TRAH'/£ REGISTER. 
Implementing Rules and Regulations 
Republic Act No. 11698 - Vintage Vehicle Regulation Act 
Adm1n1sua1" ~ Ru t ~• ~nd Re9ul!l!Ons 
I. 
~~e 17 of 26 
I 
JAN 3 0 2023 
l ~ I 
\ 
\
r~-· rf' llZ· , -,~
· 1 'i);'' 
.u. J-f.·'h LC.1 .J. 
l L 
T t. E __ J.1 ' {..'Q 
;· \ 
. 
' 
------ ___ 
;: __ .::..-
. . -·---<br>

**Node ID:** e81bca5b-0e21-4955-af14-e5c406cf42b5<br>**Similarity:** 0.016129032258064516<br>**Text:** VIOLATIONS IN CONN ECTION WITH 
REPUBLIC ACT NO. 4136 AND THE JOINT ADMINISTRATIV2 ORDER NO. 2014-01 
RA.4136 
]AO 2014-01 
Chapter II - REGISTRATION OF MOTOR A. 
DRIVING 
AN 
UNREGISTERED 
MOTOR 
VEHICLESSection 5. All motor vehicles and VEHICLE 
other vehicles must be registered. 
(a) no motor vehicle shall be used on or upon any 
public highway of the Philippines unless the same 
is properly registered for the current year 111 
accordance with the provisions of RA 4136. 
(b) Any registration of motor vehicles not renewed 
on or before the date fixed 
for 
different 
classifications as provided. 
-Operating 
a 
motor 
unregistered/improperly 
with iiwalid registration 
vehicle 
which 
is 
registered/ delinquent or 
This includes driving with an improperly registered motor 
vehicle or a motor vehicle with expired, revoked, suspended or 
invalid 
registration, unregistered 
or fake substirute or 
replacement engine, engine block or chassis. 
-Operating a motor vehicle with unregiste...<br>

**Node ID:** 311f52bd-e806-440f-a52d-baa27246b917<br>**Similarity:** 0.016129032258064516<br>**Text:** problem of depleting alpha-numeric combinations, 
the letter “I”, “O”, and “Q” shall have a different font to distinguish it from 
zero (0). 
 
B. 
Nature and Coverage of Special Plates 
 
The OMVSP is considered as a regular MV plate.  As such it shall be 
issued in pairs and it shall be permanently assigned to a specified motor 
vehicle during its lifetime.  Moreover, it is optional and shall be issued only 
upon direct application by the owner of a brand new private motor vehicle 
which is for initial or first-time registration with the LTO.  The MV’s 
covered herein are as follows: 
 
a. 
Cars 
b. 
Sports Utility Vehicles (SUV) 
c. 
Asian Utility Vehicles (AUV) 
d. 
Sports Pick-up 
e. 
Commuter Vans 
 
The following MV types shall be excluded from the coverage of those 
which can be issued special plates: 
 
a. 
Public Utility Vehicles 
b. 
Cargo Trucks (private/for hire) 
c. 
Service 
Vehicle 
(hotel 
limousines, 
tourist 
vehicles,    rent-a-car, ambulance and funeral 
hearse)...<br>

**Node ID:** 44576740-d648-4d0f-9961-f94187623d92<br>**Similarity:** 0.015873015873015872<br>**Text:** coordinate with the present owners and the PSG, in applicable cases, to validate the truthfulness 
of the information submitted. 
Section 3. Valuation for Taliffs, Import Duties and Other Taxes. 
The Bureau of Customs (BOC), for purposes of valuation for tariffs, import duties and 
other taxes of impo1ted vintage vehicle, shall differentiate concours, RESTOMOD and for 
restoration vintage vehicle, subject to the provisions of Sections 700 to 707 of Republic Act 
No. 10863, otherwise known as the Customs Modernization and Tariff Act. Restoration mode 
and for restoration shall have a lower valuation against concours for the same make and model 
of vintage vehicle. 
RULE VI 
PROCEDURES FOR REGISTRATION 
Section 1. Creation of a Registration Subclassification for Vintage Vehicles; Applicable 
Motor Vehicle User's Charge. 
In consonance with Section 7 of Republic Act No. 4136, as amended by Batas 
Pambansa Blg. 74, a 'Vintage Vehicle' subclassification is hereby created under the Privat...<br>

**Node ID:** bc7de25c-d18d-4401-a86d-a27d7181638e<br>**Similarity:** 0.015625<br>**Text:** b. Modifications to brakes, suspension, axles, and running gear to improve efficienc) or 
safety; 
c. Use of carburetors, fuel injection systems or emission control devices not original to 
the vintage vehicle to improve efficiency, economy or environmental pe1formance: 
d. Lnstallation of a new, modem engine of the same brand or manufacture and of the same 
general specification (e.g. fuel type, piston displacement, number of cylinders, engine 
configuration or layout, etc.) as the engine original to the vintage vehicle or vehicles 
belonging to the same historic model line or automobile brand lineage as the vintage 
vehicle; and 
e. Installation of after-market accessories and equipment, such as radios, air-conditioning, 
and directional lights, to pennit the convenient or safe use of the vehicle, except those 
which are expressly prohibited by law to be used or attached in any motor vehicle such 
as, but not limited to, sirens, bells, horns, whistles, or other similar gadgets tha...<br>

**Node ID:** 42aa82b1-49e2-44d0-b8f9-8822e3740450<br>**Similarity:** 0.015625<br>**Text:** WHERE TO SECURE
Registered Motor Vehicle owner
F. For “For-Hire” MVs in NCR: Public Utility Vehicle Registration Center (PUVRC) and Public Utility Vehicle Registration Extension 
Center (PUVREC)
Classification:
Simple Transaction
Type of Transaction:
G2B - Government to Business 
G2C - Government to Citizen
G2G - Government to Government 
Office or Division:
2. VEHICLE ENCODING/LINKING 
The process wherein the portal account of a client is linked to the record of his/her motor vehicle.
B. LTO District Offices (DOs) / Extension Offices (EOs) / E-Patrols
C. For Tax Exempt (Diplomatic): Authorized District Offices nearest to the Regional Office
E. For Other Exempt Vehicles (OEVs): LTO DO / EO nearest to the Special Economic Zone
A. Registration Section, Central Office (Government, Diplomatic, Vehicles owned by Government Employees/OFW, or Other 
Vehicles as may be deemed necessary)
D. For Tax Exempt (Exempt Private or Government) and for Stolen and Recovered Vehicles: Authorized Distri...<br>

**Node ID:** 4c0c3f9c-4673-428a-9b48-5b917047b19d<br>**Similarity:** 0.015384615384615385<br>**Text:** Registration of Motor Vehicle
NUMBER PLATES
PRIVATE
FOR HIRE
GOVERMENT
DIPLOMATIC<br>

**Node ID:** 68afede0-1696-4c83-be8b-e34b73d411a9<br>**Similarity:** 0.015151515151515152<br>**Text:** PERMIT & LICENSES
Special-purpose Vehicle - Required Training
School Service<br>

**Node ID:** 67de5b2b-7d5a-4f2e-9ef2-c69361a4ab7f<br>**Similarity:** 0.014925373134328358<br>**Text:** CHECKLIST 
MANDATORY SUPPORTING DOCUMENTS 
FOR REGISTRATION TRANSACTIONS 
PRIVATE 
I. 
New Registration 
1. lmpe>tted Mt>tor Vehicle 
0 Original Invoice 
0 Certification of Payment of 
taxes 
0 CSR 
0 CHPG clearance 
O Insurance Certificate of Cover 
0 Actual inspection of Motor 
Vehicle & duly accomplished 
MVIR (stencils of motor & 
chassis nos. must be done on 
the space provided for) 
0 Early Warning Device (EWD) 
2. Locally Assemble/Rebuilt 
0 CSR 
O Original Sales Invoice/Commercial 
Invoice of motor/chassis 
0 CR & OR of motor/chassis, 
if taken from another motor 
vehicle 
O Certification of Payment of 
Taxes from Boe BOC & BIR, if 
motor/chassis are imported. 
0 CHPG clearance 
0 Insurance Certificate of Cover 
O Affidavit of Rebuilt of Owner/ 
Mechanic 
O Actual Inspection of Motor 
Vehicle & duly accomplished 
MVIR (stencils of motor & 
chassis nos. must be done on 
the space provided for) 
O Early Warning Device (EWD) 
II. 
Renewal of MV Registration 
O Original copy of ...<br>

**Node ID:** 223e9b01-3f15-42cd-8751-078791c50b66<br>**Similarity:** 0.014705882352941176<br>**Text:** 2 
 
 
DOCUMENTARY REQUIREMENTS 
 FOR REGISTRATION 
 
I. INITIAL REGISTRATION OF MOTOR VEHICLE 
REQUIREMENTS 
WHERE TO SECURE 
 
General Requirements 
 
• Original Sales invoice 
 
 
 
 
 
 
• Original LTO copy or electronically 
transmitted appropriate insurance 
Certificate of Cover (Third Party Liability) 
 
• Original copy of Philippine National 
Police - Highway Patrol Group (PNP- 
HPG) Motor Vehicle (MV) Clearance 
Certificate and Special Bank Receipt 
(SBR) 
 
• Original Certificate of Stock Reported 
(CSR) 
 
 
 
• Payment Reference Number if payment 
is made through e-PAT 
 
 
- Accredited Manufacturer/ 
Assembler/Importer 
/Rebuilder/ Dealer (MAIRD) 
 
- Accredited insurance 
companies by the Insurance 
Commission 
 
- PNP-HPG MV Clearance 
Division 
 
 
 
 
- Accredited Manufacturer/ 
Assembler/Importer/Rebuild
er/ Dealer (MAIRD) 
 
- LANDBANK Link.BizPortal 
Additional Requirements 
1. Brand New Motorcycle with Sidecar 
(TC) 
 
• Original Affidavit of Attachment for 
sid...<br>

**Node ID:** 87ff145f-1051-4bee-9474-e959b516cc40<br>**Similarity:** 0.014492753623188406<br>**Text:** .. 1·\ 
I 
cr)tional Motor Vehicle Special Plates (OMVSP) 
· 
Application Form 
. -
Registered Owner Information: 
0 
Name: 
Last Name 
First Name 
Middle Name 
Address: 
No. 
Street 
Province/City 
Zip Code 
Telephone/Mobile: No. : 
Email Address: 
~ 
OMVSP Requirements: 
~ 
1. 
Duly accomplished OMVSP application form by the Owner or its duly Authorized Representative; 
2. 
Certificate of Registration and official Receipt 
3. 
Certificate of No Plate Issued; 
4. 
Identification Card of the Authorized Representative; 
5. 
Authorization Letter form the Owner (if applicable). 
-
Motor Vehicle Information: 
[I] 
MV File No.: 
MVType: 
Chassis No. : 
Make: 
Engine! No. : 
OR No. : 
Year Model: 
CR No. : 
Plate Request 
List plate request in order of preference. First available preference will· be ordered. THE ORDER CANNOT BE CHANGED. 
CANCELLATION OF THIS ORDER WILL NOT ENTITLE YOU TO A REFUND .. · 
• ANY THREE (ll Al.PH°A. COMBINATIUN +TRIPLE (l) NUMERIC EXCEPT 00 
ANY THREE (3) ALPHA...<br>

**Node ID:** f1cad40a-af4b-4131-a052-252bdf1025bd<br>**Similarity:** 0.014285714285714285<br>**Text:** and technology; 
J. 
Registration - refers to the act of compliance to the documentary requirements, 
standards and procedures of Republic Act No. 11698 and these Rules in order for a 
vintage vehicle to be classified by the Land Transp01tation Office under a 'vintage 
vehicle' subclassification; 
k. Registered vintage vehicle - refers to a vintage vehicle registered with a 'vintage 
vehicle' subclassification that benefits from the exemptions and privileges under 
Republic Act No. I 1698 and these Rules~ -----_ ·' __ ·-··· __ --· . __ 
I l r'. LA lfw f~ t£ ~; '( f~ ~-=-
. t 
. > 
_ • 
· · ~ 
•(,: :• ·,c 
, :• H 
•1 1 :1~
, -..·F 
{ I_ f'\ , • • 
I 
Implementing Rules and Regulations 
~l n . 
. 
. 
1'r Page 3 o" 26 
Republic Act No. 11698 - Vintage Vehicle Regulation Ac~ 
1.1 ( 
JAN 3 0 202j 
I ) '.: 
' l ~ \ IT' ([l ~ •""" Ii 'V ·:~ I i I ) ' 
1:-1~. Lfr,it ~ l {~_,~ ~-- y I 
. -------------- ----··~- -<br>

hit_rate: 0.0
mrr: 0.0
precision: 0.0
recall: 0.0
ap: 0.0
ndcg: 0.0


Query: What type of motor vehicles require special registration procedures, and where can they be registered?
Metrics: {'hit_rate': 0.0, 'mrr': 0.0, 'precision': 0.0, 'recall': 0.0, 'ap': 0.0, 'ndcg': 0.0}
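
The Similarity values above are consistent with reciprocal rank fusion using a constant of 60 and zero-based ranks: 1/60 ≈ 0.016667 for a node ranked first by a single retriever, and 1/61 + 1/63 ≈ 0.032266 for the top node, which both retrievers returned. The sketch below reproduces that arithmetic. `rrf_scores` and the node labels are hypothetical, and this is not necessarily how `hybrid_embedding` is implemented.

```python
from collections import defaultdict

def rrf_scores(rankings, k=60):
    """Fuse ranked ID lists: each list adds 1 / (k + rank) per ID, with rank starting at 0."""
    fused = defaultdict(float)
    for ranked_ids in rankings.values():
        for rank, node_id in enumerate(ranked_ids):
            fused[node_id] += 1.0 / (k + rank)
    # Highest fused score first
    return dict(sorted(fused.items(), key=lambda item: item[1], reverse=True))

# Hypothetical rankings: "shared" is second in one list and fourth in the other.
rankings = {
    "faiss": ["f0", "shared", "f2", "f3"],
    "bm25": ["b0", "b1", "b2", "shared"],
}
fused = rrf_scores(rankings)
print(fused["shared"])  # 1/61 + 1/63
print(fused["f0"])      # 1/60
```

A node returned by both retrievers can outrank a node that is first in only one list, which is why the fused scores, not the raw FAISS or BM25 scores, determine the final order.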



In [None]:
evaluator = HybridRetrievalEvaluator.from_metric_names(
    metric_names=metrics,
    faiss_retriever=faiss_retriever,
    bm25_retriever=bm25_retriever,
    Print_Results=False  
)

eval_results = await evaluator.aevaluate_dataset(qa_dataset)

def display_results(name, eval_results):
    """Display results from evaluate."""

    metric_dicts = []
    for eval_result in eval_results:
        metric_dict = eval_result.metric_vals_dict
        metric_dicts.append(metric_dict)

    full_df = pd.DataFrame(metric_dicts)

    columns = {
        "retrievers": [name],
        **{k: [full_df[k].mean()] for k in metrics},
    }

    metric_df = pd.DataFrame(columns)

    return metric_df

display_results("top-2 eval", eval_results)


hit_rate: 0.0
mrr: 0.0
precision: 0.0
recall: 0.0
ap: 0.0
ndcg: 0.0
hit_rate: 0.0
mrr: 0.0
precision: 0.0
recall: 0.0
ap: 0.0
ndcg: 0.0
hit_rate: 0.0
mrr: 0.0
precision: 0.0
recall: 0.0
ap: 0.0
ndcg: 0.0
hit_rate: 1.0
mrr: 0.1
precision: 0.06666666666666667
recall: 1.0
ap: 0.1
ndcg: 0.2890648263178879
hit_rate: 0.0
mrr: 0.0
precision: 0.0
recall: 0.0
ap: 0.0
ndcg: 0.0
hit_rate: 1.0
mrr: 0.5
precision: 0.06666666666666667
recall: 1.0
ap: 0.5
ndcg: 0.6309297535714575
hit_rate: 0.0
mrr: 0.0
precision: 0.0
recall: 0.0
ap: 0.0
ndcg: 0.0
hit_rate: 1.0
mrr: 0.25
precision: 0.06666666666666667
recall: 1.0
ap: 0.25
ndcg: 0.43067655807339306
hit_rate: 0.0
mrr: 0.0
precision: 0.0
recall: 0.0
ap: 0.0
ndcg: 0.0
hit_rate: 1.0
mrr: 1.0
precision: 0.06666666666666667
recall: 1.0
ap: 1.0
ndcg: 1.0
hit_rate: 0.0
mrr: 0.0
precision: 0.0
recall: 0.0
ap: 0.0
ndcg: 0.0
hit_rate: 1.0
mrr: 1.0
precision: 0.06666666666666667
recall: 1.0
ap: 1.0
ndcg: 1.0
hit_rate: 0.0
mrr: 0.0
precision: 0.0
recall: 0.0
ap: 0.

|   | retrievers | hit_rate | mrr | precision | recall | ap | ndcg |
|---|---|---|---|---|---|---|---|
| 0 | top-2 eval | 0.409722 | 0.212541 | 0.027315 | 0.409722 | 0.212541 | 0.260592 |
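
The per-query values printed above are consistent with one expected node per query and 15 retrieved results, in which case every metric follows from the rank at which the expected node appears. A minimal sketch, assuming binary relevance and a log2 discount for NDCG; `single_relevant_metrics` is a hypothetical helper, not part of LlamaIndex:

```python
import math

def single_relevant_metrics(rank, retrieved=15):
    """Metrics for one relevant item found at 1-based `rank`, or None if it was missed."""
    if rank is None:
        return dict(hit_rate=0.0, mrr=0.0, precision=0.0, recall=0.0, ap=0.0, ndcg=0.0)
    return dict(
        hit_rate=1.0,
        mrr=1.0 / rank,
        precision=1.0 / retrieved,
        recall=1.0,
        ap=1.0 / rank,                   # with one relevant item, AP equals the reciprocal rank
        ndcg=1.0 / math.log2(rank + 1),  # ideal DCG is 1 when a single item is relevant
    )

for rank in (1, 2, 4, 10, None):
    print(rank, single_relevant_metrics(rank))
```

Under these assumptions hit_rate equals recall and mrr equals ap for every query, which matches the identical averages in the aggregate row.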


# 6. Post Retrieval

## 6.A Summarization

In [None]:
def summarize_each_chunk(nodes, client, query, model="llama3.3", parent=False):
    if parent:
        chunks = [doc.text for doc in nodes]
    else:
        chunks = [doc.node.text for doc in nodes]
    summaries = []
    
    for i, chunk in enumerate(chunks):
        prompt = f"""
        Summarize the following text in one concise paragraph, focusing on key points relevant to the query: "{query}".
        
        - Emphasize information directly related to the query.
        - Exclude unrelated, redundant, or speculative details.
        - Do NOT introduce new information or answer the query itself. 
        
        Text:
        {chunk}
        
        Summary:
        """
        
        response = client.generate(model=model, prompt=prompt)
        summary = response['response'].strip()
        summaries.append(summary)

    return summaries

## 6.B Evaluation Generation

In [None]:
def generate_response_with_notice(summaries, query, client, model="llama3.3"):
    # Combine summaries into context block
    context = "\n".join(summaries)
    
    # Create prompt to answer based on summarized text
    prompt = f"""
    Use the following summarized information to answer the query accurately and concisely. 
    DO NOT USE BACKGROUND KNOWLEDGE OUTSIDE THE CONTEXT PROVIDED.
    If the information is not sufficient to fully address the query, respond ONLY with:
    "The available information is insufficient to provide a complete answer to this query."

    Summarized Context:
    {context}
    
    Query:
    {query}
    
    Response:
    """
    
    # Send the prompt to Ollama
    response = client.generate(
        model=model,
        prompt=prompt
    )
    
    return response['response'].strip()

# 7. Querying

## 7.A Query Transforms

# 8. Query Generation

In [None]:
docstore = {}

# Store documents using full metadata as the key
for doc in documents:
    key = tuple(doc.metadata.items())  # Convert metadata to tuple for hashable key
    docstore[key] = doc

In [None]:
def get_document_by_chunk_metadata(chunk_node):
    # Convert chunk metadata to tuple for matching
    metadata_key = tuple(chunk_node.metadata.items())

    # Retrieve document from docstore
    document = docstore.get(metadata_key)
    return document
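
The lookup above only succeeds when a chunk's metadata has the same keys, values, and key order as its parent document, because `tuple(d.items())` preserves insertion order. A minimal sketch of an order-insensitive key, assuming all metadata keys are strings and all values are hashable; `metadata_key` is a hypothetical helper and the sample values are made up:

```python
def metadata_key(metadata):
    """Build a hashable key that ignores the order in which keys were inserted."""
    return tuple(sorted(metadata.items()))

doc_meta = {"title": "sample.pdf", "page": 3, "folder": "Dataset2"}
chunk_meta = {"folder": "Dataset2", "page": 3, "title": "sample.pdf"}  # same content, different order

print(tuple(doc_meta.items()) == tuple(chunk_meta.items()))  # order-sensitive comparison
print(metadata_key(doc_meta) == metadata_key(chunk_meta))    # order-insensitive comparison
```

If the splitter adds extra keys to chunk metadata, neither key matches the parent; in that case the key would have to be built from a fixed subset of fields instead.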

In [None]:
def remove_duplicate_documents(doc_list):
    seen_ids = set()
    unique_docs = []

    for doc in doc_list:
        if doc.doc_id not in seen_ids:
            seen_ids.add(doc.doc_id)
            unique_docs.append(doc)

    return unique_docs

In [None]:
def gen_query(query, top_k, client, mode='dense', summary=False, model="llama3.3", chunks_only=False):
    # Embed the query and run dense retrieval against the FAISS index
    response = client.embeddings(prompt=query, model="mxbai-embed-large")
    query_embedding = response["embedding"]
    top_k_docs = faiss_retriever._retrieve(query_embedding, top_k=top_k)

    # Sparse retrieval; the retriever is rebuilt on each call because top_k can vary
    bm25_retriever = BM25Retriever.from_defaults(
        nodes=nodes,
        similarity_top_k=top_k,
        stemmer=Stemmer.Stemmer("english"),
        language="english",
    )
    retrieved_nodes = bm25_retriever.retrieve(query)

    # Fuse both rankings with reciprocal rank fusion
    results = {'faiss': top_k_docs, 'bm25': retrieved_nodes}
    ranked_results = hybrid_embedding(results, top_k=top_k)

    if mode == 'dense':
        print('using FAISS')
        ans_nodes = top_k_docs
    elif mode == 'sparse':
        print('using BM25')
        ans_nodes = retrieved_nodes
    else:
        print('using Hybrid')
        ans_nodes = ranked_results

    if chunks_only:
        print('using chunks only')
        context_nodes = ans_nodes
        context = [docs.node.text for docs in ans_nodes]
    else:
        # Expand each chunk to its parent document; drop duplicates but keep rank order
        context_nodes = remove_duplicate_documents([get_document_by_chunk_metadata(docs) for docs in ans_nodes])
        context = list(dict.fromkeys(doc.text for doc in context_nodes))

    if summary:
        print('using summaries')
        # Pass the caller's model through instead of hard-coding it
        context = summarize_each_chunk(context_nodes, client, model=model, query=query, parent=not chunks_only)

    answer = generate_response_with_notice(context, query, client, model=model)

    # Format the references from the nodes that were actually used for the answer
    references = []
    for i, doc in enumerate(ans_nodes[:top_k], start=1):
        metadata = doc.metadata
        source_info = f"Source {i}: {metadata['title']} (Page {metadata['page']}, Folder: {metadata['folder']})"
        references.append(source_info)

    return answer, "\n".join(references), "\n".join(context)

# 9. TDC Exam Evaluation

In [None]:
# Generate prompts dynamically
def generate_prompt(row):
    options = []
    for choice in ['A', 'B', 'C', 'D', 'E']:
        # Check for NaN or blank values
        if pd.notna(row[choice]) and row[choice] != '':
            options.append(f"{choice}. {row[choice]}")
    
    # Construct the prompt: the question, its options, then the answer-format instruction
    prompt = f"\nActual Question: {row['Question']}\n" + "\n".join(options)
    prompt += "\nPlease answer only in letters and put them inside a bracket '[]'. If the question contains the statement 'Check all that apply' then add comma separator if there are multiple answers ONLY IF ALLOWED."
    
    return prompt

In [None]:
# Load the multiple-choice exam questions from the CSV file
file_path = '/mnt/c/Users/Jeryl Salas/Documents/AI 351/Project/LTO_EXAM.csv'
df = pd.read_csv(file_path)
df['Prompt'] = df.apply(generate_prompt, axis=1)
display(df.head())

|   | Question | A | B | C | D | E | Answer | Prompt |
|---|---|---|---|---|---|---|---|---|
| 0 | What should you do in case your vehicle breaks... | Open your trunk and hood | Stand on the expressway and flag down passing ... | Call for help using a mobile phone or an expre... | Park as far to the right as possible | Put your hazard warning light on | A, C, D, E | \nActual Question: What should you do in case ... |
| 1 | What will happen when your front tire blows out? | The back end will sway towards the side of the... | The back end will sway away from the blowout | The front end will pull towards the side of th... | The front end will pull to the opposite side o... |  | C | \nActual Question: What will happen when your ... |
| 2 | What should you do when an ambulance comes up ... | Stop as soon as you can | Maintain your speed, let the ambulance driver ... | Speed up so that you don't hold the ambulance | Pull over to the right and slow down or even s... |  | D | \nActual Question: What should you do when an ... |
| 3 | While driving the hood of your car lifts up bl... | Look through the gap underneath the hood or ou... | Brake suddenly so you don't leave the road | Pull to the side of the road and refasten the ... | Turn your headlights on and look out of the si... |  | A,C | \nActual Question: While driving the hood of y... |
| 4 | In case of an accident, the first duty of the ... | pick-up the injured person and take him to the... | report the accident to the nearest hospital | report the accident to the nearest police station |  |  | A | \nActual Question: In case of an accident, the... |


In [None]:
qr_range = (0,60)
df["AI"] = None  # object dtype, so string answers can be assigned without a dtype warning
ai_answer = []
for i in tqdm(range(*qr_range)):
    ai_answer.append(gen_query(df.loc[i,"Prompt"], top_k=15, client=client, mode='hybrid', model="llama3.3"))

df.loc[qr_range[0]:qr_range[1]-1, "AI"] = [answ[0] for answ in ai_answer]
df.loc[qr_range[0]:qr_range[1]-1, "Context"] = [answ[2] for answ in ai_answer]

  0%|          | 0/60 [00:00<?, ?it/s]

using Hybrid


100%|██████████| 60/60 [37:16<00:00, 37.27s/it]


In [None]:
import re


def process_answers(answers):
    formatted_answers = []
    
    for a in answers:
        # Extract letter groups such as [A, C, D], [A] or [B, D]
        matches = re.findall(r'\[?\s*([A-E](?:\s*,\s*[A-E])*)\s*\]?', str(a))
        letters = []  # separate name so the `answers` argument is not shadowed
        for match in matches:
            letters.extend(re.split(r'\s*,\s*', match))  # Split by comma and remove spaces
        unique_sorted_letters = sorted(set(letters), key='ABCDE'.index)
        if not unique_sorted_letters:
            formatted_answers.append(None)
        else:
            formatted_answers.append(unique_sorted_letters)
    return formatted_answers

df_results = df.loc[qr_range[0]:qr_range[1]-1, ["Question","Answer","AI"]]
df_results['Answer'] = df_results['Answer'].apply(lambda x: x.split(', '))
df_results['AI'] = process_answers(df_results["AI"])
df_results['Answer'] = process_answers(df_results["Answer"])



def calculate_scores(df):
    scores = []
    for index, row in df.iterrows():
        correct_answers = set(row['Answer'] if row['Answer'] is not None else [])
        ai_answers = set(row['AI'] if row['AI'] is not None else [])
        if ai_answers == correct_answers:
            score = 1.0
        else:
            score = 0.0
        scores.append(score)
    
    df['Score'] = scores
    accuracy = scores.count(1.0) / len(scores)
    print(f'Final Score: {scores.count(1.0):.2f}/{len(scores):.2f}')
    print(f'Accuracy: {accuracy:.2%}')  # accuracy is a fraction; .2% renders it as a percentage
    return df

# Apply the scoring function
scored_df = calculate_scores(df_results)

# Display the dataframe to verify the results
display(scored_df[['Question', 'Answer', 'AI', 'Score']])

Final Score: 48.00/60.00
Accuracy: 80.00%


|   | Question | Answer | AI | Score |
|---|---|---|---|---|
| 0 | What should you do in case your vehicle breaks... | [A, C, D, E] | [C, D, E] | 0.0 |
| 1 | What will happen when your front tire blows out? | [C] | [C] | 1.0 |
| 2 | What should you do when an ambulance comes up ... | [D] | [D] | 1.0 |
| 3 | While driving the hood of your car lifts up bl... | [A, C] | [A, C] | 1.0 |
| 4 | In case of an accident, the first duty of the ... | [A] | [C] | 0.0 |
| 5 | When a vehicle starts to skid, what should the... | [B] | [C] | 0.0 |
| 6 | In case of injuries caused by an accident, the... | [A] | [A] | 1.0 |
| 7 | What will happen when your rear tire blows out? | [B] | [A] | 0.0 |
| 8 | When a vehicle is stalled or disabled, the dri... | [C] | [C] | 1.0 |
| 9 | If you are the first to arrive at the scene of... | [B] | [B] | 1.0 |
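
Because the brackets in the pattern used by `process_answers` are optional, any capital letter from A to E in free text is counted, so a reply such as `Answer: [C]` yields both A and C. A stricter variant that only reads letters inside brackets could look like the sketch below. `extract_bracketed` is a hypothetical helper meant for the model replies only; it would not parse the ground-truth column, whose list values are stringified with quotes around each letter.

```python
import re

# Letters A-E, optionally comma-separated, that appear inside square brackets
BRACKETED = re.compile(r'\[\s*([A-E](?:\s*,\s*[A-E])*)\s*\]')

def extract_bracketed(text):
    """Return the sorted unique letters found inside [...] groups, or None if there are none."""
    letters = set()
    for group in BRACKETED.findall(str(text)):
        letters.update(re.split(r'\s*,\s*', group))
    return sorted(letters) or None

print(extract_bracketed("Answer: [C]"))
print(extract_bracketed("[A, C,D]"))
print(extract_bracketed("No bracketed answer here"))
```

Whether the stricter pattern changes the reported score depends on how often the model adds text outside the brackets, which the truncated outputs here do not show.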


In [None]:
# Generate prompts dynamically
def generate_prompt(row):
    
    # Construct the open-ended prompt: the question followed by the answering instruction
    prompt = f"\nActual Question: {row['Question']}\n" 
    prompt += "\nPlease answer the question based on the given context."
    
    return prompt

In [None]:
# Load the open-ended exam questions from the CSV file
file_path = '/mnt/c/Users/Jeryl Salas/Documents/AI 351/Project/LTO_EXAM_QnA.csv'
df = pd.read_csv(file_path, encoding='ISO-8859-1')
df['Prompt'] = df.apply(generate_prompt, axis=1)
display(df.head())

|   | Question | Answer | Prompt |
|---|---|---|---|
| 0 | Traffic Jam can be prevented if you | Keep opposing lanes open | \nActual Question: Traffic Jam can be prevente... |
| 1 | When making a right turn you should | Stay on the outermost lane of the road then si... | \nActual Question: When making a right turn yo... |
| 2 | When you intend to turn right or left, signal ... | 25 meters before you intend to make your turn | \nActual Question: When you intend to turn rig... |
| 3 | At an intersection with a traffic light, make ... | The green light is on and there is a left turn... | \nActual Question: At an intersection with a t... |
| 4 | Graft and corruption in the traffic enforcemen... | Self disciplined by drivers and obeying traffi... | \nActual Question: Graft and corruption in the... |


In [None]:
qr_range = (0,60)
df["AI"] = None  # object dtype, so string answers can be assigned without a dtype warning
ai_answer = []
for i in tqdm(range(*qr_range)):
    ai_answer.append(gen_query(df.loc[i,"Prompt"], top_k=15, client=client, mode='hybrid', model="llama3.3"))

df.loc[qr_range[0]:qr_range[1]-1, "AI"] = [answ[0] for answ in ai_answer]
df.loc[qr_range[0]:qr_range[1]-1, "Context"] = [answ[2] for answ in ai_answer]
df_new = df.loc[qr_range[0]:qr_range[1]-1].copy()
df = df_new.copy()

  0%|          | 0/60 [00:00<?, ?it/s]

using Hybrid




 55%|█████▌    | 33/60 [26:43<22:23, 49.77s/it]

using Hybrid


 57%|█████▋    | 34/60 [27:23<20:10, 46.55s/it]

using Hybrid


 58%|█████▊    | 35/60 [28:10<19:30, 46.82s/it]

using Hybrid


 60%|██████    | 36/60 [29:10<20:20, 50.87s/it]

using Hybrid


 62%|██████▏   | 37/60 [29:54<18:38, 48.65s/it]

using Hybrid


 63%|██████▎   | 38/60 [30:55<19:16, 52.55s/it]

using Hybrid


 65%|██████▌   | 39/60 [31:38<17:24, 49.72s/it]

using Hybrid


 67%|██████▋   | 40/60 [32:16<15:21, 46.09s/it]

using Hybrid


 68%|██████▊   | 41/60 [33:03<14:40, 46.33s/it]

using Hybrid


 70%|███████   | 42/60 [33:49<13:52, 46.28s/it]

using Hybrid


 72%|███████▏  | 43/60 [34:40<13:30, 47.67s/it]

using Hybrid


 73%|███████▎  | 44/60 [35:34<13:12, 49.54s/it]

using Hybrid


 75%|███████▌  | 45/60 [36:21<12:10, 48.72s/it]

using Hybrid


 77%|███████▋  | 46/60 [37:05<11:03, 47.37s/it]

using Hybrid


 78%|███████▊  | 47/60 [38:06<11:09, 51.53s/it]

using Hybrid


 80%|████████  | 48/60 [38:45<09:32, 47.68s/it]

using Hybrid


 82%|████████▏ | 49/60 [39:25<08:20, 45.52s/it]

using Hybrid


 83%|████████▎ | 50/60 [40:15<07:47, 46.74s/it]

using Hybrid


 85%|████████▌ | 51/60 [41:02<07:01, 46.87s/it]

using Hybrid


 87%|████████▋ | 52/60 [41:48<06:13, 46.68s/it]

using Hybrid


 88%|████████▊ | 53/60 [42:50<05:58, 51.22s/it]

using Hybrid


 90%|█████████ | 54/60 [43:35<04:56, 49.35s/it]

using Hybrid


 92%|█████████▏| 55/60 [44:18<03:56, 47.34s/it]

using Hybrid


 93%|█████████▎| 56/60 [44:58<03:00, 45.11s/it]

using Hybrid


 95%|█████████▌| 57/60 [45:39<02:11, 43.79s/it]

using Hybrid


 97%|█████████▋| 58/60 [46:22<01:27, 43.74s/it]

using Hybrid


 98%|█████████▊| 59/60 [47:04<00:43, 43.17s/it]

using Hybrid


100%|██████████| 60/60 [47:44<00:00, 47.73s/it]


# 10. Similarity Evaluation

In [None]:
from llama_index.core.evaluation import SemanticSimilarityEvaluator
from llama_index.core.base.embeddings.base import BaseEmbedding
import asyncio
from pydantic import PrivateAttr

class OllamaEmbeddingModel(BaseEmbedding):
    """Adapter that lets LlamaIndex evaluators embed text through the Ollama client."""
    _client: Client = PrivateAttr()

    def __init__(self, model_name: str = "mxbai-embed-large", timeout: int = 300):
        super().__init__(model_name=model_name)
        # The timeout is forwarded to the HTTP client used by the Ollama Client
        self._client = Client(timeout=timeout)

    async def _aget_query_embedding(self, query: str) -> list[float]:
        return await self._aget_text_embedding(query)

    async def _aget_text_embedding(self, text: str) -> list[float]:
        # The Ollama client call is blocking, so run it in the default thread pool
        loop = asyncio.get_running_loop()
        embedding_response = await loop.run_in_executor(
            None, self._client.embeddings, self.model_name, text
        )
        return embedding_response['embedding']

    def _get_query_embedding(self, query: str) -> list[float]:
        return self._get_text_embedding(query)

    def _get_text_embedding(self, text: str) -> list[float]:
        embedding_response = self._client.embeddings(
            model=self.model_name,
            prompt=text
        )
        return embedding_response['embedding']


embed_model = OllamaEmbeddingModel(model_name="mxbai-embed-large")
evaluator = SemanticSimilarityEvaluator(
    embed_model=embed_model,
    similarity_threshold=0.6
)

results_scores = []
results_passing = []
# Compare each generated answer with its reference answer.
# iterrows() avoids relying on the DataFrame index starting at 0.
for _, row in tqdm(df.iterrows(), total=len(df)):
    result = await evaluator.aevaluate(
        response=row["AI"],
        reference=row["Answer"],
    )
    results_scores.append(result.score)
    results_passing.append(result.passing)

df['Score'] = results_scores
df['Passing'] = results_passing

average_score = df['Score'].mean()
total_items = len(df)
passing_items = df['Passing'].sum()  
print(f"Average Score: {average_score:.4f}")
print(f"Passing: {passing_items}/{total_items}")
display(df[['Question', 'Answer', 'AI', 'Score', 'Passing']])

100%|██████████| 60/60 [00:42<00:00,  1.41it/s]

Average Score: 0.6461
Passing: 44/60





Unnamed: 0,Question,Answer,AI,Score,Passing
0,Traffic Jam can be prevented if you,Keep opposing lanes open,"follow traffic rules and regulations, such as ...",0.696892,True
1,When making a right turn you should,Stay on the outermost lane of the road then si...,"Based on the provided context, the specific st...",0.742046,True
2,"When you intend to turn right or left, signal ...",25 meters before you intend to make your turn,"Based on the provided text, it is not explicit...",0.765621,True
3,"At an intersection with a traffic light, make ...",The green light is on and there is a left turn...,"At an intersection with a traffic light, make ...",0.769149,True
4,Graft and corruption in the traffic enforcemen...,Self disciplined by drivers and obeying traffi...,"Based on the provided context, graft and corru...",0.646904,True
5,"On a four(4) lane road with single white line,...",Overtake by passing over the solid white line,"On a four-lane road with a single white line, ...",0.821234,True
6,A double solid yellow line with broken white l...,Absolutely no overtaking,"No, according to the text, a double solid yell...",0.584989,False
7,"When making a U-Turn, you should",Check for traffic behind you and indicate your...,"Unfortunately, there is no information provide...",0.620563,True
8,Signs that are triangular in shape and with a ...,Caution or warning signs,"Based on the context, the signs that are trian...",0.656201,True
9,"Signs that are round, inverted triangle or oct...",Regulatory signs,Regulatory Signs,1.0,True
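
The `Score` and `Passing` columns above can be read as a similarity value compared against `similarity_threshold=0.6`. The sketch below illustrates that idea with plain cosine similarity on toy vectors. It assumes the evaluator uses cosine similarity and a `>=` comparison, which is our understanding of its default behavior rather than something confirmed in this notebook. The helper names `cosine_similarity` and `score_and_pass` are hypothetical, and the vectors are made up for illustration, not real `mxbai-embed-large` embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def score_and_pass(response_vec, reference_vec, threshold=0.6):
    """Return (score, passing), where passing means score >= threshold (assumed rule)."""
    score = cosine_similarity(response_vec, reference_vec)
    return score, score >= threshold

# Toy vectors: identical directions score about 1, orthogonal directions score 0
identical = score_and_pass([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
orthogonal = score_and_pass([1.0, 0.0], [0.0, 1.0])

# Row 6 in the table above scored 0.584989, just under the 0.6 threshold
row_6_passes = 0.584989 >= 0.6
```

Under this reading, row 6 fails only because its score falls slightly below 0.6, so the pass count is sensitive to the chosen threshold.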


# 11. Relevancy Evaluation

In [None]:
from llama_index.core.evaluation import RelevancyEvaluator

ollama_llm = Ollama(model="llama3.2", request_timeout=300)
evaluator = RelevancyEvaluator(llm=ollama_llm)

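eval_results = []

# Ask the judge LLM whether each answer and its retrieved context are relevant to the question.
# iterrows() avoids relying on the DataFrame index starting at 0.
for _, row in tqdm(df.iterrows(), total=len(df)):
    eval_result = await evaluator.aevaluate(
        query=row["Question"],
        response=row["AI"],
        contexts=[row["Context"]]
    )
    eval_results.append(eval_result.passing)

df['Eval'] = eval_results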

total_items = len(df)
passing_items = df['Eval'].sum()
score = f"Score: {passing_items}/{total_items}"
percentage = passing_items / total_items if total_items > 0 else 0
print(score)
print(f"Percentage: {percentage:.2%}")
display(df[['Question', 'Answer', 'AI', 'Eval']])

100%|██████████| 60/60 [26:32<00:00, 26.55s/it]

Score: 48/60
Percentage: 80.00%





Unnamed: 0,Question,Answer,AI,Eval
0,Traffic Jam can be prevented if you,Keep opposing lanes open,"follow traffic rules and regulations, such as ...",True
1,When making a right turn you should,Stay on the outermost lane of the road then si...,"Based on the provided context, the specific st...",True
2,"When you intend to turn right or left, signal ...",25 meters before you intend to make your turn,"Based on the provided text, it is not explicit...",True
3,"At an intersection with a traffic light, make ...",The green light is on and there is a left turn...,"At an intersection with a traffic light, make ...",True
4,Graft and corruption in the traffic enforcemen...,Self disciplined by drivers and obeying traffi...,"Based on the provided context, graft and corru...",True
5,"On a four(4) lane road with single white line,...",Overtake by passing over the solid white line,"On a four-lane road with a single white line, ...",True
6,A double solid yellow line with broken white l...,Absolutely no overtaking,"No, according to the text, a double solid yell...",True
7,"When making a U-Turn, you should",Check for traffic behind you and indicate your...,"Unfortunately, there is no information provide...",True
8,Signs that are triangular in shape and with a ...,Caution or warning signs,"Based on the context, the signs that are trian...",False
9,"Signs that are round, inverted triangle or oct...",Regulatory signs,Regulatory Signs,False
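
The similarity and relevancy evaluators do not always agree on the same row. As a rough sketch, the snippet below tallies agreement using only the ten rows displayed in the Section 10 table (`Passing`) and the table above (`Eval`). The values are copied by hand from those displayed rows, so it says nothing about the remaining fifty questions. The variable names are hypothetical.

```python
from collections import Counter

# Values copied from the ten displayed rows (index 0 to 9)
similarity_passing = [True, True, True, True, True, True, False, True, True, True]
relevancy_eval = [True, True, True, True, True, True, True, True, False, False]

# Count each (similarity, relevancy) outcome pair
agreement = Counter(zip(similarity_passing, relevancy_eval))

# Rows where the two evaluators disagree
disagreeing_rows = [
    i for i, (sim, rel) in enumerate(zip(similarity_passing, relevancy_eval))
    if sim != rel
]

print(agreement)
print(disagreeing_rows)
```

In the full notebook, the same tally could be computed over all rows from the `Passing` and `Eval` columns of `df`.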
