# **Ragas Evaluation**

### **Setup**

**Loading the arXiv document**

In [1]:
from langchain_community.document_loaders.arxiv import ArxivLoader

# Search arXiv for the paper and keep only the first matching document
loader = ArxivLoader("Vaswani Attention Is All You Need", load_max_docs=1)
docs = loader.load()

In [2]:
docs[0].metadata

{'Published': '2023-08-02',
 'Title': 'Attention Is All You Need',
 'Authors': 'Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin',
 'Summary': 'The dominant sequence transduction models are based on complex recurrent or\nconvolutional neural networks in an encoder-decoder configuration. The best\nperforming models also connect the encoder and decoder through an attention\nmechanism. We propose a new simple network architecture, the Transformer, based\nsolely on attention mechanisms, dispensing with recurrence and convolutions\nentirely. Experiments on two machine translation tasks show these models to be\nsuperior in quality while being more parallelizable and requiring significantly\nless time to train. Our model achieves 28.4 BLEU on the WMT 2014\nEnglish-to-German translation task, improving over the existing best results,\nincluding ensembles by over 2 BLEU. On the WMT 2014 English-to-French\ntranslation task, 

**Loading evaluation dataset**

In [5]:
import pandas as pd
testset = pd.read_csv("data/groundtruth_eval_dataset.csv")

In [6]:
testset.head()

| Unnamed: 0 | question | context | ground_truth |
|---|---|---|---|
| 0 | Under what conditions does Google grant permis... | Provided proper attribution is provided, Googl... | Google grants permission to reproduce the tabl... |
| 1 | What are the advantages of the Transformer net... | mechanism. We propose a new simple network arc... | The advantages of the Transformer network arch... |
| 2 | Who proposed replacing RNNs with self-attentio... | best models from the literature. We show that ... | Jakob |
| 3 | How did Lukasz and Aidan contribute to improvi... | efficient inference and visualizations. Lukasz... | Lukasz and Aidan contributed to improving resu... |
| 4 | How do recurrent models factor computation alo... | architectures [38, 24, 15].\nRecurrent models ... | Recurrent models factor computation along the ... |


### **Creating Vector Database and RAG Chain**

In [7]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = text_splitter.split_documents(docs)
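
To make the `chunk_size` and `chunk_overlap` parameters concrete, here is a minimal sketch of plain sliding-window chunking over characters. It is an illustration only: `sliding_window_chunks` is a hypothetical helper, and `RecursiveCharacterTextSplitter` additionally tries to split on separators such as paragraph and line breaks, so its chunk boundaries will generally differ.

```python
def sliding_window_chunks(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Split text into fixed-size character windows that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks


# Synthetic 2500-character input, used only to show the window arithmetic
sample = "".join(str(i % 10) for i in range(2500))
pieces = sliding_window_chunks(sample)
print([len(piece) for piece in pieces])
```

Each window starts 800 characters after the previous one, so neighbouring chunks share 200 characters. That shared region is what lets a sentence cut at a chunk boundary still appear whole in one of the two chunks.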

In [25]:
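from langchain_openai import OpenAI
from langchain_community.vectorstores.faiss import FAISS
from langchain_community.embeddings import HuggingFaceEmbeddings

# Embedding model with the library's default settings
embedding = HuggingFaceEmbeddings()
# LLM served by LM Studio through an OpenAI-compatible endpoint on localhost
model = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio", temperature=0)

# Index the first 37 chunks in FAISS and expose the index as a retriever
vectorstore = FAISS.from_documents(chunks[:37], embedding)
retriever = vectorstore.as_retriever()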

In [26]:
retriever.invoke('Under what conditions does Google grant permission to reproduce the tables and figures in this paper?')

[Document(page_content='where the query, keys, values, and output are all vectors. The output is computed as a weighted sum\n3\nScaled Dot-Product Attention\nMulti-Head Attention\nFigure 2: (left) Scaled Dot-Product Attention. (right) Multi-Head Attention consists of several\nattention layers running in parallel.\nof the values, where the weight assigned to each value is computed by a compatibility function of the\nquery with the corresponding key.\n3.2.1\nScaled Dot-Product Attention\nWe call our particular attention "Scaled Dot-Product Attention" (Figure 2). The input consists of\nqueries and keys of dimension dk, and values of dimension dv. We compute the dot products of the\nquery with all keys, divide each by √dk, and apply a softmax function to obtain the weights on the\nvalues.\nIn practice, we compute the attention function on a set of queries simultaneously, packed together\ninto a matrix Q. The keys and values are also packed together into matrices K and V . We compute\nthe m
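
The first document shown above is about scaled dot-product attention, not reproduction permissions, so it can be useful to check whether the expected context from the test set appears anywhere in the retrieved chunks. The sketch below is one rough way to do that. `normalize`, `context_hit`, and the `probe_chars` parameter are hypothetical names introduced here, and the sample strings are shortened stand-ins rather than real retriever output.

```python
import re


def normalize(text: str) -> str:
    """Collapse whitespace and lowercase so line breaks do not block a match."""
    return re.sub(r"\s+", " ", text).strip().lower()


def context_hit(expected_context: str, retrieved_texts: list[str], probe_chars: int = 80) -> bool:
    """Return True if the start of the expected context occurs in any retrieved text."""
    probe = normalize(expected_context)[:probe_chars]
    return bool(probe) and any(probe in normalize(text) for text in retrieved_texts)


# Stand-in strings for illustration only
expected = "Provided proper attribution is provided"
miss = ['We call our particular attention "Scaled Dot-Product Attention" (Figure 2).']
hit = miss + ["Provided proper\nattribution is provided"]

print(context_hit(expected, miss))
print(context_hit(expected, hit))
```

In the notebook this could be called with `testset["context"][0]` and the `page_content` of each document returned by the retriever. Because only the first `probe_chars` characters are compared, a context that straddles two chunks may be reported as a miss.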

In [9]:
from langchain_core.prompts import PromptTemplate

template = """Answer the question based only on the following context:
{context}

Question: {question}
"""

prompt = PromptTemplate(
    template=template,
    input_variables=["context", "question"],
)

In [10]:
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | model
    | StrOutputParser()
)
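
In the chain above the retriever's output, a list of `Document` objects, fills the `{context}` slot directly, so the prompt receives the string form of that list, metadata included. A common alternative is to join only the chunk text before it reaches the prompt. The sketch below assumes a hypothetical `format_docs` helper, which is not part of the original notebook, and uses stand-in objects so it runs without LangChain installed.

```python
from types import SimpleNamespace


def format_docs(docs) -> str:
    """Join the text of retrieved documents into a single context string."""
    return "\n\n".join(doc.page_content for doc in docs)


# Stand-in objects exposing page_content, mimicking retrieved Documents
fake_docs = [
    SimpleNamespace(page_content="First chunk."),
    SimpleNamespace(page_content="Second chunk."),
]
context = format_docs(fake_docs)
print(context)
```

To use it in the chain, the first step would become `{"context": retriever | format_docs, "question": RunnablePassthrough()}`. Whether this improves answers from the local model is untested here.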

In [11]:
questions = testset["question"].to_list()
ground_truth = testset["ground_truth"].to_list()

In [21]:
questions[0]

'Under what conditions does Google grant permission to reproduce the tables and figures in this paper?'

In [20]:
from datasets import Dataset

# `questions` and `ground_truth` were built from `testset` in the cell above
data = {"question": [], "answer": [], "contexts": [], "ground_truth": ground_truth}

for query in questions:
    data["question"].append(query)
    # Answer generated by the RAG chain
    data["answer"].append(rag_chain.invoke(query))
    # Retrieved chunks for the same query, stored as a list of strings
    data["contexts"].append([doc.page_content for doc in retriever.invoke(query)])

dataset = Dataset.from_dict(data)

KeyboardInterrupt: 
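
The run above ended with a `KeyboardInterrupt`, which discards the answers gathered up to that point. One possible workaround is to save each row as soon as it is produced, so a rerun skips questions that are already answered. This is a sketch under assumptions: `collect_rows`, the checkpoint file, and the stub functions are hypothetical, and questions are assumed to be unique.

```python
import json
import tempfile
from pathlib import Path


def collect_rows(questions, ground_truth, answer_fn, contexts_fn, checkpoint_path):
    """Build one row per question, saving progress to a JSON file after each row."""
    path = Path(checkpoint_path)
    rows = json.loads(path.read_text()) if path.exists() else []
    done = {row["question"] for row in rows}
    for query, truth in zip(questions, ground_truth):
        if query in done:
            continue  # already answered in an earlier (possibly interrupted) run
        rows.append({
            "question": query,
            "answer": answer_fn(query),
            "contexts": contexts_fn(query),
            "ground_truth": truth,
        })
        done.add(query)
        path.write_text(json.dumps(rows))  # persist progress after every row
    return rows


# Stub functions standing in for the RAG chain and the retriever
calls = []

def fake_answer(query):
    calls.append(query)
    return f"answer to {query}"

def fake_contexts(query):
    return [f"context for {query}"]

with tempfile.TemporaryDirectory() as tmp:
    checkpoint = Path(tmp) / "rows.json"
    collect_rows(["q1"], ["t1"], fake_answer, fake_contexts, checkpoint)  # simulated partial run
    rows = collect_rows(["q1", "q2"], ["t1", "t2"], fake_answer, fake_contexts, checkpoint)

# Convert the rows to the column-oriented dict that Dataset.from_dict expects
columns = ("question", "answer", "contexts", "ground_truth")
data = {key: [row[key] for row in rows] for key in columns}
```

In the notebook, `answer_fn` could be `rag_chain.invoke` and `contexts_fn` could be `lambda q: [doc.page_content for doc in retriever.invoke(q)]`. The resulting `data` dict can then be passed to `Dataset.from_dict` as before.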