## Pluralsight Course: Building and Deploying RAG in Production

#### Demo: Evaluating RAG Functional Metrics

In [None]:
# install the required libraries
!pip install -qqq llama-index llama-index-llms-openai llama-index-vector-stores-chroma

In [2]:
import os 
# set the API key (replace the placeholder with your own key)
os.environ["OPENAI_API_KEY"] = "<YOUR_API_KEY>"
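
Hard-coding the key in a cell makes it easy to leak when the notebook is shared. A minimal sketch of an alternative follows, assuming the key may already be exported in the shell that launched Jupyter. `ensure_api_key` is a hypothetical helper written for this demo, not part of any library.

```python
import os
from getpass import getpass


def ensure_api_key(env_var="OPENAI_API_KEY", prompt_fn=getpass):
    """Return the key from the environment, prompting only when it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        # prompt_fn defaults to getpass, which hides the typed value
        key = prompt_fn(f"Enter {env_var}: ").strip()
        if not key:
            raise ValueError(f"{env_var} was not provided")
        os.environ[env_var] = key
    return key
```

Calling `ensure_api_key()` in place of the assignment above keeps the key out of the saved notebook.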

### Define Embedding and LLM Model

In [2]:
from llama_index.embeddings.openai import OpenAIEmbedding
# define embedding model
embed_model = OpenAIEmbedding()

In [3]:
from llama_index.llms.openai import OpenAI
# define LLM model
llm = OpenAI(model="gpt-4o", temperature=0)

In [4]:
from llama_index.core import Settings

# set the embedding model and the LLM globally
Settings.embed_model = embed_model
Settings.llm = llm

### Ingestion Pipeline

In [5]:
# set chunk size
Settings.chunk_size = 1024

In [6]:
# import libraries 
from llama_index.core import SimpleDirectoryReader
from llama_index.core.node_parser import TokenTextSplitter

# load
documents = SimpleDirectoryReader("data").load_data()


In [7]:
# define chunking strategy (chunk size passed explicitly to match Settings.chunk_size)
text_splitter = TokenTextSplitter(chunk_size=1024)
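
To make chunk size and overlap concrete, here is a simplified sketch of token-based splitting. It treats whitespace-separated words as tokens, which is an assumption of this example; `TokenTextSplitter` counts tokens with a model tokenizer instead. `split_tokens` is a hypothetical helper, not a LlamaIndex function.

```python
def split_tokens(text, chunk_size=8, chunk_overlap=2):
    """Split text into overlapping chunks of whitespace-separated tokens."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    tokens = text.split()
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break  # the last window already reached the end of the text
    return chunks


demo_chunks = split_tokens("a b c d e f g h i j", chunk_size=4, chunk_overlap=1)
```

Each chunk repeats the last token of the previous one, so a sentence cut at a chunk boundary still appears with some of its context in the next chunk.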

In [8]:
# define vector database and store 
import chromadb
from llama_index.vector_stores.chroma import ChromaVectorStore

# create local in-memory client
chroma_client = chromadb.EphemeralClient()
# create a collection
chroma_collection = chroma_client.create_collection("ps-foo-rag", get_or_create=True)
# define the vector store using the collection
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)


In [9]:
# define ingestion pipeline
from llama_index.core.ingestion import IngestionPipeline

pipeline = IngestionPipeline(
    transformations=[
        text_splitter,
        embed_model,
    ],
    vector_store=vector_store,
)


In [10]:
# run the ingestion pipeline
nodes = pipeline.run(documents=documents)
print(f"number of nodes or chunks : {len(nodes)}")

number of nodes or chunks : 35


### RAG Pipeline

In [11]:
from llama_index.core import VectorStoreIndex

vector_index = VectorStoreIndex.from_vector_store(
    vector_store=vector_store,
    embed_model=embed_model,
)

In [12]:
# create semantic query engine 
vector_query_engine = vector_index.as_query_engine()

In [13]:
# query 
response = vector_query_engine.query("who is the CEO of the company ?")
print(response)

The CEO of the company is Bar.
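
Behind the query call, the engine embeds the question, retrieves the stored chunks whose embeddings are most similar to it, and passes them to the LLM as context. The sketch below illustrates only the retrieval step, with toy three-dimensional vectors and cosine similarity. The vectors, chunk names, and helper functions are assumptions made for illustration; `k=2` mirrors the two chunks retrieved later in this notebook.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, chunk_vecs, k=2):
    """Return (chunk_id, score) pairs for the k chunks most similar to the query."""
    scored = [(cid, cosine_similarity(query_vec, vec)) for cid, vec in chunk_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# toy 3-dimensional "embeddings"; real embeddings have many more dimensions
chunks = {
    "founders": [0.9, 0.1, 0.0],
    "products": [0.1, 0.9, 0.1],
    "history": [0.7, 0.2, 0.1],
}
results = top_k([1.0, 0.0, 0.0], chunks, k=2)
```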


***Sample Queries to test***

- When did Bar start his entrepreneurial journey?
- Who are the co-founders of the company?
- What is Bar's AI vision?
- Who is the CEO of company Foo, and what other companies did he start earlier?


### Evaluating RAG Functional Metrics

In [None]:
!pip install -qqq deepeval

In [18]:
# ignore warnings in jupyter notebook
import warnings
warnings.filterwarnings('ignore')

# patch asyncio so nested event loops work inside the Jupyter notebook
import nest_asyncio
nest_asyncio.apply()

In [14]:
# example input for RAG pipeline
user_input = "who is the CEO of company Foo? and what are the other companies he has started earlier."

# LlamaIndex returns a response object that contains
# both the output string and retrieved nodes
response_object = vector_query_engine.query(user_input)


In [15]:
# Process the response object to get the output string
# and retrieved nodes; fail early if the query returned nothing
if response_object is None:
    raise ValueError("query engine returned no response")
actual_output = response_object.response
retrieval_context = [node.get_content() for node in response_object.source_nodes]
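
The same extraction can be repeated for several questions, for example the sample queries listed earlier. A minimal sketch, assuming any object with a `query` method that returns a response exposing `response` and `source_nodes`, as the query engine above does. `collect_eval_inputs` is a hypothetical helper name.

```python
def collect_eval_inputs(query_engine, queries):
    """Run each query and keep the fields the evaluation metrics need."""
    records = []
    for query in queries:
        response = query_engine.query(query)
        records.append(
            {
                "input": query,
                "actual_output": response.response,
                "retrieval_context": [
                    node.get_content() for node in response.source_nodes
                ],
            }
        )
    return records
```

Passing `vector_query_engine` and a list of questions returns one record per question, and each record's keys match the `LLMTestCase` arguments used in the cells that follow.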

In [16]:
print(f"Input : {user_input}")
print(f"Output: {actual_output}")
print(f"Retrieved Chunks Count: {len(retrieval_context)}")
print(f"Retrieved Context: {retrieval_context}")

Input : who is the CEO of company Foo? and what are the other companies he has started earlier.       
Output: The CEO of the company Foo is Bar. Before founding Foo, Bar started two other companies: Cam, a tech start-up aimed at revolutionizing data management for businesses, and Tres, a company specializing in providing cloud-based solutions for small businesses.       
Retrieved Chunks Count: 2        
Retrieved Context: ['**The Founders: Bar and Qux**\n\nTo understand the success of Foo, it’s essential to delve deeper into the backgrounds of its founders, Bar and Qux. Bar, the visionary behind Foo, is not your typical entrepreneur. His journey to becoming the CEO of one of the most respected adventure gear companies in the world is a tale of passion, perseverance, and a deep connection with nature.\n\nBar’s entrepreneurial journey began in the late 1990s, a time when the internet was still in its infancy. His first venture, Cam, was a tech start-up that aimed to revolutionize the w

#### Completeness: Answer Relevancy

In [19]:
from deepeval.test_case import LLMTestCase
from deepeval.metrics import AnswerRelevancyMetric

metric = AnswerRelevancyMetric(
    threshold=0.7,
    model="gpt-4o",
    include_reason=True
)

test_case = LLMTestCase(
    input=user_input,
    actual_output=actual_output
)

print(f"Input : {user_input} \nOutput: {actual_output}")
metric.measure(test_case)
print(metric.score)
print(metric.reason)

Input : who is the CEO of company Foo? and what are the other companies he has started earlier. 
Output: The CEO of the company Foo is Bar. Before founding Foo, Bar started two other companies: Cam, a tech start-up aimed at revolutionizing data management for businesses, and Tres, a company specializing in providing cloud-based solutions for small businesses.


1.0
The score is 1.00 because the answer fully addresses the questions asked about the CEO of company Foo and the other companies he has started earlier, with no irrelevant statements.
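
deepeval describes answer relevancy as the share of statements in the answer that are relevant to the input, with an LLM judge extracting and labeling the statements. The sketch below reproduces only that arithmetic, replacing the judge with hand-written labels. The labels and the function name `answer_relevancy_score` are assumptions made for illustration.

```python
def answer_relevancy_score(verdicts):
    """Fraction of answer statements judged relevant to the input.

    verdicts: list of booleans, one per statement (True = relevant).
    """
    if not verdicts:
        raise ValueError("at least one statement is required")
    return sum(verdicts) / len(verdicts)


# hand-labeled verdicts for three statements (illustrative, not judge output)
score = answer_relevancy_score([True, True, False])
passed = score >= 0.7  # same threshold as the cell above
```

With one irrelevant statement out of three, the score is about 0.67, which falls below the 0.7 threshold.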


#### Faithfulness

In [20]:
from deepeval.metrics import FaithfulnessMetric

metric = FaithfulnessMetric(
    threshold=0.7,
    model="gpt-4o",
    include_reason=True
)

test_case = LLMTestCase(
    input=user_input,
    actual_output=actual_output,
    retrieval_context=retrieval_context
)

print(f"Retrieved Context: {retrieval_context} \nOutput: {actual_output}")

metric.measure(test_case)
print(metric.score)
print(metric.reason)

Retrieved Context: ['**The Founders: Bar and Qux**\n\nTo understand the success of Foo, it’s essential to delve deeper into the backgrounds of its founders, Bar and Qux. Bar, the visionary behind Foo, is not your typical entrepreneur. His journey to becoming the CEO of one of the most respected adventure gear companies in the world is a tale of passion, perseverance, and a deep connection with nature.\n\nBar’s entrepreneurial journey began in the late 1990s, a time when the internet was still in its infancy. His first venture, Cam, was a tech start-up that aimed to revolutionize the way businesses managed their data. While Cam was not a massive success, it was a critical learning experience for Bar. He learned the importance of understanding the market, the value of customer feedback, and the need for constant innovation. These lessons would prove invaluable when he later founded Foo.\n\nIn 2000, Bar founded Tres, a company that specialized in providing cloud-based solutions for small 

1.0
The score is 1.00 because there are no contradictions between the actual output and the retrieval context. Great job maintaining faithfulness!
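
Faithfulness asks a different question: what share of the claims in the answer are consistent with the retrieved context? The sketch below assumes, in line with the reason string above ("no contradictions"), that a claim counts against the score only when it contradicts the context. The three verdict labels and the function name are assumptions of this sketch, and the verdicts are written by hand rather than produced by a judge.

```python
def faithfulness_score(verdicts):
    """Fraction of claims that do not contradict the retrieval context.

    verdicts: one of "yes" (supported), "no" (contradicted) or
    "idk" (context is silent) per claim; only "no" lowers the score here.
    """
    if not verdicts:
        raise ValueError("at least one claim is required")
    allowed = {"yes", "no", "idk"}
    cleaned = [v.strip().lower() for v in verdicts]
    unknown = set(cleaned) - allowed
    if unknown:
        raise ValueError(f"unexpected verdicts: {sorted(unknown)}")
    return sum(v != "no" for v in cleaned) / len(cleaned)


# four hand-labeled claims: two supported, one unknown, one contradicted
score = faithfulness_score(["yes", "yes", "idk", "no"])
```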


#### Toxicity

In [21]:
from deepeval.metrics import ToxicityMetric

metric = ToxicityMetric(
    threshold=0.5,  # maximum passing score: lower is better for this metric
    model="gpt-4o",
    include_reason=True
)

test_case = LLMTestCase(
    input=user_input,
    actual_output=actual_output,
)

print(f"Input : {user_input} \nOutput: {actual_output}")
metric.measure(test_case)
print(metric.score)
print(metric.reason)

Input : who is the CEO of company Foo? and what are the other companies he has started earlier. 
Output: The CEO of the company Foo is Bar. Before founding Foo, Bar started two other companies: Cam, a tech start-up aimed at revolutionizing data management for businesses, and Tres, a company specializing in providing cloud-based solutions for small businesses.


0
The score is 0.00 because the output is completely non-toxic and free of any harmful language.


#### Bias

In [22]:
from deepeval.metrics import BiasMetric

metric = BiasMetric(
    threshold=0.8,  # maximum passing score: lower is better for this metric
    model="gpt-4o",
    include_reason=True
)

test_case = LLMTestCase(
    input=user_input,
    actual_output=actual_output,
    retrieval_context=retrieval_context
)

print(f"Input : {user_input} \nOutput: {actual_output} \n Retrieved Context: {retrieval_context}")

metric.measure(test_case)
print(metric.score)
print(metric.reason)

Input : who is the CEO of company Foo? and what are the other companies he has started earlier. 
Output: The CEO of the company Foo is Bar. Before founding Foo, Bar started two other companies: Cam, a tech start-up aimed at revolutionizing data management for businesses, and Tres, a company specializing in providing cloud-based solutions for small businesses. 
 Retrieved Context: ['**The Founders: Bar and Qux**\n\nTo understand the success of Foo, it’s essential to delve deeper into the backgrounds of its founders, Bar and Qux. Bar, the visionary behind Foo, is not your typical entrepreneur. His journey to becoming the CEO of one of the most respected adventure gear companies in the world is a tale of passion, perseverance, and a deep connection with nature.\n\nBar’s entrepreneurial journey began in the late 1990s, a time when the internet was still in its infancy. His first venture, Cam, was a tech start-up that aimed to revolutionize the way businesses managed their data. While Cam w

0
The score is 0.00 because the actual output shows no signs of bias, demonstrating a well-balanced and neutral perspective.
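
The four metrics do not share a direction. For answer relevancy and faithfulness a higher score is better and the threshold is a minimum, while for toxicity and bias a lower score is better and the threshold is a maximum. A small sketch of that comparison follows, using the scores and thresholds from this section. `metric_passes` is a hypothetical helper, not a deepeval function.

```python
def metric_passes(score, threshold, lower_is_better=False):
    """Compare a score with its threshold in the direction the metric expects."""
    return score <= threshold if lower_is_better else score >= threshold


# (name, score, threshold, lower_is_better), taken from the four cells in this section
results = [
    ("answer relevancy", 1.0, 0.7, False),
    ("faithfulness", 1.0, 0.7, False),
    ("toxicity", 0.0, 0.5, True),
    ("bias", 0.0, 0.8, True),
]
report = {name: metric_passes(s, t, low) for name, s, t, low in results}
for name, ok in report.items():
    print(f"{name}: {'pass' if ok else 'fail'}")
```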
