# OpenAlex RAG Evaluation

Author: Alex Davis

Date: 09/08/2025

The purpose of this notebook is to evaluate the retriever and the RAG pipeline that we built in the 'RAG' notebook.

## Load Packages

In [0]:
%pip install -U deepeval
%pip install -qU langchain-community faiss-cpu
%pip install --upgrade --quiet langchain langchain-huggingface sentence_transformers
%pip install -qU langchain-openai

In [0]:
%restart_python

In [0]:
#import packages for evaluation
from deepeval.test_case import LLMTestCase, LLMTestCaseParams
from deepeval import evaluate
from deepeval.metrics import GEval
from deepeval.metrics import (
    ContextualPrecisionMetric,
    ContextualRecallMetric,
    ContextualRelevancyMetric)

#import packages for RAG pipeline
import os
import faiss
from langchain_community.vectorstores import FAISS
from langchain_huggingface.embeddings import HuggingFaceEmbeddings
from langchain_openai import OpenAI
from langchain.chains import RetrievalQA
from langchain_core.prompts import PromptTemplate

## Load Models and Embeddings

In [0]:
#load embedding model
embeddings = HuggingFaceEmbeddings(model_name="thenlper/gte-small")

#load vector DB
db = FAISS.load_local(
    "Data/faiss_index", embeddings, allow_dangerous_deserialization=True)

#read the API key from the environment (falls back to a placeholder that must be replaced)
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY", "API KEY")

#load llm
llm = OpenAI(openai_api_key=OPENAI_API_KEY)
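Because the key falls back to the placeholder string "API KEY", a missing environment variable only shows up later as an authentication error. The following is a minimal sketch of a fail-fast check. `require_api_key` is a hypothetical helper written for illustration; it is not part of LangChain or deepeval.

```python
import os

PLACEHOLDER = "API KEY"  # same placeholder string used in this notebook


def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the key stored under `name`, or raise if it is missing or a placeholder.

    Hypothetical helper: `env` defaults to os.environ and can be any mapping.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key or key == PLACEHOLDER:
        raise RuntimeError(f"Set the {name} environment variable before running the notebook.")
    return key
```

Calling `require_api_key()` before creating the LLM would stop the notebook at the first cell that needs the key, rather than at the first API request.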

## Create Retriever and RAG Pipeline

In [0]:
#create a retriever that returns the 3 most similar chunks from the vector database
retriever = db.as_retriever(search_kwargs={"k": 3})
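To make the effect of `k` concrete, here is a small sketch of a top-k nearest-neighbour search in plain Python. It assumes the index ranks by Euclidean distance, which is LangChain's default for FAISS stores; the index loaded here may have been built differently. The vectors are toy values, not real embeddings.

```python
import math


def top_k(query_vec, doc_vecs, k=3):
    """Return the indices of the k vectors closest to query_vec (Euclidean distance)."""
    distances = [(math.dist(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)]
    return [idx for _, idx in sorted(distances)[:k]]


# toy 2-D "embeddings"; real gte-small vectors have 384 dimensions
docs = [(0.0, 1.0), (0.9, 0.1), (1.0, 0.0), (0.5, 0.5), (-1.0, 0.0)]
print(top_k((1.0, 0.0), docs, k=3))  # -> [2, 1, 3]
```

A larger `k` gives the LLM more context to draw on but tends to lower contextual relevancy, because weaker matches are also passed to the prompt.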

In [0]:
#create a prompt template
template = """<|user|>
Relevant information:
{context}

Provide a concise answer to the following question using relevant information provided above:
{question}
If the information above does not answer the question, say that you do not know. Keep answers to 3 sentences or shorter.<|end|>
<|assistant|>"""

#define prompt template
prompt = PromptTemplate(
    template=template,
    input_variables=["context", "question"])

#create RAG pipeline
rag = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=retriever, return_source_documents=True,
                                  chain_type_kwargs={"prompt": prompt}, verbose = True)
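The "stuff" chain type places all retrieved chunks into the `{context}` slot of a single prompt. The sketch below imitates that step without LangChain. The `"\n\n"` separator is an assumption meant to mirror LangChain's default, the template is a shortened version of the one above, and `stuff_prompt` is a hypothetical name.

```python
TEMPLATE = """Relevant information:
{context}

Provide a concise answer to the following question using relevant information provided above:
{question}"""


def stuff_prompt(template, chunks, question, separator="\n\n"):
    """Join all retrieved chunks into one context string and fill the template."""
    context = separator.join(chunks)
    return template.format(context=context, question=question)


chunks = ["Chunk about deep learning.", "Chunk about hardware accelerators."]
print(stuff_prompt(TEMPLATE, chunks, "What is new in computer vision?"))
```

Since every chunk ends up in one prompt, the combined length of the `k` retrieved chunks has to fit within the model's context window.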

## Retriever Evaluation

In [0]:
#make the OpenAI API key available to deepeval (reuses the key loaded above)
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY

# Initialize metrics
contextual_precision = ContextualPrecisionMetric()
contextual_recall = ContextualRecallMetric()
contextual_relevancy = ContextualRelevancyMetric()

In [0]:
#define user query
input = 'What are the most recent advancements in computer vision?'

#run the RAG pipeline once and reuse the response
response = rag.invoke(input)

#RAG output
actual_output = response['result']

#contexts used from the retriever
retrieved_contexts = [doc.page_content for doc in response['source_documents']]
  
#expected output (example)
expected_output = 'Recent advancements in computer vision include Vision-Language Models (VLMs) that merge vision and language, Neural Radiance Fields (NeRFs) for 3D scene generation, and powerful Diffusion Models and Generative AI for creating realistic visuals. Other key areas are Edge AI for real-time processing, enhanced 3D vision techniques like NeRFs and Visual SLAM, advanced self-supervised learning methods, deepfake detection systems, and increased focus on Ethical AI and Explainable AI (XAI) to ensure fairness and transparency.'

In [0]:
#create test case
test_case = LLMTestCase(
    input=input,
    actual_output=actual_output,
    retrieval_context=retrieved_contexts,
    expected_output=expected_output)

In [0]:
#compute contextual precision and print results
contextual_precision.measure(test_case)
print("Score: ", contextual_precision.score)
print("Reason: ", contextual_precision.reason)

#compute contextual recall and print results
contextual_recall.measure(test_case)
print("Score: ", contextual_recall.score)
print("Reason: ", contextual_recall.reason)

#compute contextual relevancy and print results
contextual_relevancy.measure(test_case)
print("Score: ", contextual_relevancy.score)
print("Reason: ", contextual_relevancy.reason)

Score:  1.0
Reason:  The score is 1.00 because the relevant nodes are ranked at the top: the first node discusses 'recent progress on computer vision algorithms' and 'prominent achievements,' and the second node covers the 'evolution of computer vision' and foundational advancements. The irrelevant node, which only describes the OpenCV toolkit and lacks discussion of recent advancements, is correctly ranked last. This perfect ordering ensures the highest contextual precision.

Score:  0.0
Reason:  The score is 0.00 because none of the sentences in the expected output can be traced back to any node(s) in the retrieval context; there is no overlap or relevant information present.

Score:  0.5555555555555556
Reason:  The score is 0.56 because, while there are several statements that discuss recent progress and deep learning advancements in computer vision (e.g., 'The prominent achievements in computer vision tasks such as image classification, object detection and image segmentation brought by deep learning techniques are highlighted.'), much of the context is general background or unrelated details (e.g., 'The explanation of the term 'convolutional' as a mathematical operation is not directly relevant to the advancements in computer vision.').
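The precision score of 1.0 depends on the order of the retrieved nodes, not only on how many are relevant. The sketch below assumes the weighted cumulative precision formula described in the DeepEval documentation; the library's actual implementation may differ in detail. `contextual_precision` here is an illustrative function, not the deepeval API.

```python
def contextual_precision(verdicts):
    """Weighted cumulative precision over a ranked list of relevance verdicts.

    verdicts[i] is True when the node at rank i + 1 was judged relevant.
    """
    relevant_total = sum(verdicts)
    if relevant_total == 0:
        return 0.0
    score, relevant_so_far = 0.0, 0
    for rank, is_relevant in enumerate(verdicts, start=1):
        if is_relevant:
            relevant_so_far += 1
            score += relevant_so_far / rank  # precision at this rank, counted only for relevant nodes
    return score / relevant_total


print(contextual_precision([True, True, False]))  # ordering reported above
print(contextual_precision([False, True, True]))  # same nodes, irrelevant one ranked first
```

With the reported ordering (relevant, relevant, irrelevant) the score is 1.0. Moving the irrelevant node to the top would lower it to about 0.58, even though the same three nodes are retrieved.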

In [0]:
#run all metrics with 'evaluate' function
evaluate(test_cases=[test_case],
         metrics=[contextual_precision, contextual_recall, contextual_relevancy])

EvaluationResult(test_results=[TestResult(name='test_case_0', success=False, metrics_data=[MetricData(name='Contextual Precision', threshold=0.5, success=True, score=1.0, reason="The score is 1.00 because the top two nodes in the retrieval contexts are highly relevant, discussing 'recent progress on computer vision algorithms' (node 1) and 'the evolution of computer vision, including the impact of deep learning and convolutional neural networks' (node 2), both directly addressing recent advancements. The only irrelevant node, which 'primarily describes the OpenCV toolkit' and does not mention recent trends, is correctly ranked last (node 3), ensuring all relevant information is prioritized at the top.", strict_mode=False, evaluation_model='gpt-4.1', error=None, evaluation_cost=0.007902, verbose_logs='Verdicts:\n[\n    {\n        "verdict": "yes",\n        "reason": "This context discusses \'recent progress on computer vision algorithms and their corresponding hardware implementations\' and highlights \'prominent achievements in computer vision tasks such as image classification, object detection and image segmentation brought by deep learning techniques.\' It also mentions \'promising directions for future research,\' which is relevant to advancements, even if it does not specify the exact technologies listed in the expected output."\n    },\n    {\n        "verdict": "yes",\n        "reason": "This context provides a detailed overview of the evolution of computer vision, including the impact of deep learning and convolutional neural networks (CNNs), which are foundational to many recent advancements such as Vision-Language Models and Generative AI. 
It also mentions real-time processing and applications in healthcare, which relate to Edge AI and ethical considerations."\n    },\n    {\n        "verdict": "no",\n        "reason": "This context primarily describes the OpenCV toolkit, its programming language support, and its use in face detection and object detection. While it is related to computer vision, it does not discuss recent advancements or trends such as Vision-Language Models, NeRFs, or Diffusion Models."\n    }\n]'), MetricData(name='Contextual Recall', threshold=0.5, success=False, score=0.0, reason='The score is 0.00 because none of the sentences in the expected output can be attributed to any of the nodes in the retrieval context; there is no overlap in content.', strict_mode=False, evaluation_model='gpt-4.1', error=None, evaluation_cost=0.00592, verbose_logs='Verdicts:\n[\n    {\n        "verdict": "no",\n        "reason": "None of the 3 nodes mention Vision-Language Models (VLMs), Neural Radiance Fields (NeRFs), Diffusion Models, or Generative AI. These specific advancements are not discussed in the retrieval context."\n    },\n    {\n        "verdict": "no",\n        "reason": "Edge AI, enhanced 3D vision techniques like NeRFs and Visual SLAM, advanced self-supervised learning, deepfake detection, Ethical AI, and Explainable AI (XAI) are not mentioned in any of the 3 nodes of the retrieval context."\n    }\n]'), MetricData(name='Contextual Relevancy', threshold=0.5, success=True, score=0.5777777777777777, reason="The score is 0.58 because, while there are several relevant statements such as 'The field of computer vision is experiencing a great-leap-forward development today.' 
and 'The prominent achievements in computer vision tasks such as image classification, object detection and image segmentation brought by deep learning techniques are highlighted.', much of the retrieval context is general background or about tools like OpenCV, which, as noted, 'does not discuss advancements in computer vision.' This mix of relevant and irrelevant content justifies a moderate score.", strict_mode=False, evaluation_model='gpt-4.1', error=None, evaluation_cost=0.03207, verbose_logs='Verdicts:\n[\n    {\n        "verdicts": [\n            {\n                "statement": "The field of computer vision is experiencing a great-leap-forward development today.",\n                "verdict": "yes",\n                "reason": null\n            },\n            {\n                "statement": "This paper aims at providing a comprehensive survey of the recent progress on computer vision algorithms and their corresponding hardware implementations.",\n                "verdict": "yes",\n                "reason": null\n            },\n            {\n                "statement": "The prominent achievements in computer vision tasks such as image classification, object detection and image segmentation brought by deep learning techniques are highlighted.",\n                "verdict": "yes",\n                "reason": null\n            },\n            {\n                "statement": "Review of techniques for implementing and optimizing deep-learning-based computer vision algorithms on GPU, FPGA and other new generations of hardware accelerators are presented to facilitate real-time and/or energy-efficient operations.",\n                "verdict": "yes",\n                "reason": null\n            },\n            {\n                "statement": "Several promising directions for future research are presented to motivate further development in the field.",\n                "verdict": "yes",\n                "reason": null\n            }\n        ]\n    },\n    {\n        
"verdicts": [\n            {\n                "statement": "Computer vision is one of the fields of computer science that is one of the most powerful and persuasive types of artificial intelligence.",\n                "verdict": "yes",\n                "reason": null\n            },\n            {\n                "statement": "It is similar to the human vision system, as it enables computers to recognize and process objects in pictures and videos in the same way as humans do.",\n                "verdict": "yes",\n                "reason": null\n            },\n            {\n                "statement": "Computer vision technology has rapidly evolved in many fields and contributed to solving many problems, as computer vision contributed to self-driving cars, and cars were able to understand their surroundings.",\n                "verdict": "yes",\n                "reason": null\n            },\n            {\n                "statement": "The cameras record video from different angles around the car, then a computer vision system gets images from the video, and then processes the images in real-time to find roadside ends, detect other cars, and read traffic lights, pedestrians, and objects.",\n                "verdict": "yes",\n                "reason": null\n            },\n            {\n                "statement": "Computer vision also contributed to facial recognition; this technology enables computers to match images of people\\u2019s faces to their identities. 
which these algorithms detect facial features in images and then compare them with databases.",\n                "verdict": "yes",\n                "reason": null\n            },\n            {\n                "statement": "Computer vision also play important role in Healthcare, in which algorithms can help automate tasks such as detecting Breast cancer, finding symptoms in x-ray, cancerous moles in skin images, and MRI scans.",\n                "verdict": "yes",\n                "reason": null\n            },\n            {\n                "statement": "Computer vision also contributed to many fields such as image classification, object discovery, motion recognition, subject tracking, and medicine.",\n                "verdict": "yes",\n                "reason": null\n            },\n            {\n                "statement": "The rapid development of artificial intelligence is making machine learning more important in his field of research. Use algorithms to find out every bit of data and predict the outcome. This has become an important key to unlocking the door to AI.",\n                "verdict": "no",\n                "reason": "The statement \'Use algorithms to find out every bit of data and predict the outcome. 
This has become an important key to unlocking the door to AI.\' is too general and not specifically about recent advancements in computer vision."\n            },\n            {\n                "statement": "If we had looked to deep learning concept, we find deep learning is a subset of machine learning, algorithms inspired by structure and function of the human brain called artificial neural networks, learn from large amounts of data.",\n                "verdict": "no",\n                "reason": "The statement is a general explanation of deep learning and neural networks, not specifically about recent advancements in computer vision."\n            },\n            {\n                "statement": "Deep learning algorithm perform a task repeatedly, each time tweak it a little to improve the outcome. So, the development of computer vision was due to deep learning.",\n                "verdict": "yes",\n                "reason": null\n            },\n            {\n                "statement": "Now we\'ll take a tour around the convolution neural networks, let us say that convolutional neural networks are one of the most powerful supervised deep learning models (abbreviated as CNN or ConvNet).",\n                "verdict": "yes",\n                "reason": null\n            },\n            {\n                "statement": "This name ;convolutional ; is a token from a mathematical linear operation between matrixes called convolution.",\n                "verdict": "no",\n                "reason": "The statement \'This name ;convolutional ; is a token from a mathematical linear operation between matrixes called convolution.\' is about the etymology of CNNs, not about advancements in computer vision."\n            },\n            {\n                "statement": "CNN structure can be used in a variety of real-world problems including, computer vision, image recognition, natural language processing (NLP), anomaly detection, video analysis, drug discovery, recommender 
systems, health risk assessment, and time-series forecasting.",\n                "verdict": "yes",\n                "reason": null\n            },\n            {\n                "statement": "If we look at convolutional neural networks, we see that CNN are similar to normal neural networks, the only difference between CNN and ANN is that CNNs are used in the field of pattern recognition within images mainly.",\n                "verdict": "yes",\n                "reason": null\n            },\n            {\n                "statement": "This allows us to encode the features of an image into the structure, making the network more suitable for image-focused tasks, with reducing the parameters required to set-up the model.",\n                "verdict": "yes",\n                "reason": null\n            },\n            {\n                "statement": "One of the advantages of CNN that it has an excellent performance in machine learning problems.",\n                "verdict": "yes",\n                "reason": null\n            },\n            {\n                "statement": "So, we will use CNN as a classifier for image classification.",\n                "verdict": "yes",\n                "reason": null\n            },\n            {\n                "statement": "So, the objective of this paper is that we will talk in detail about image classification in the following sections.",\n                "verdict": "no",\n                "reason": "The statement \'So, the objective of this paper is that we will talk in detail about image classification in the following sections.\' is a meta-statement about the structure of the paper, not about advancements in computer vision."\n            }\n        ]\n    },\n    {\n        "verdicts": [\n            {\n                "statement": "Computer Vision is one of the most fascinating and challenging tasks in the field of Artificial Intelligence.",\n                "verdict": "yes",\n                "reason": "This statement 
provides high-level context about computer vision, which is relevant to understanding advancements in the field."\n            },\n            {\n                "statement": "Computer Vision serves as a link between computer software and the visuals we see around us.",\n                "verdict": "yes",\n                "reason": "This statement explains the purpose of computer vision, which is foundational to discussing advancements."\n            },\n            {\n                "statement": "It enables computer software to comprehend and learn about the visuals in its environment.",\n                "verdict": "yes",\n                "reason": "This is relevant as it describes the core capability of computer vision, which is the subject of advancements."\n            },\n            {\n                "statement": "As an example: The fruit is determined by its color, shape, and size.",\n                "verdict": "no",\n                "reason": "The example \'The fruit is determined by its color, shape, and size.\' is a basic illustration and does not discuss recent advancements in computer vision."\n            },\n            {\n                "statement": "In the Computer Vision pipeline, we first collect data, then conduct data processing operations, and then train and educate the model to learn how to differentiate between fruits based on size, shape, and color.",\n                "verdict": "no",\n                "reason": "This statement describes a standard computer vision pipeline for fruit classification, not recent advancements."\n            },\n            {\n                "statement": "The main goal is to identify and comprehend the images and offer new images that are more useful for us in different life fields.",\n                "verdict": "yes",\n                "reason": "This statement addresses the goals of computer vision, which is relevant to understanding the direction of advancements."\n            },\n            {\n              
  "statement": "The term \'OpenCV\' is an abbreviation for \'open source computer vision.\'",\n                "verdict": "no",\n                "reason": "This is a definition of OpenCV and does not discuss advancements in computer vision."\n            },\n            {\n                "statement": "The architecture is made up of software, databases, and plugins that are pre-programmed with support for integrating computer vision applications.",\n                "verdict": "no",\n                "reason": "This describes the architecture of OpenCV, not recent advancements in computer vision."\n            },\n            {\n                "statement": "It is one of the most used toolkits with a large developer group.",\n                "verdict": "no",\n                "reason": "This statement is about the popularity of OpenCV, not about advancements in computer vision."\n            },\n            {\n                "statement": "It is well-known for the size at which it builds realworld usage cases for industrial use.",\n                "verdict": "no",\n                "reason": "This statement discusses OpenCV\'s industrial use, not recent advancements in computer vision."\n            },\n            {\n                "statement": "OpenCV follows C/C++, Python, Java programming languages and can be used to build computer vision software for desktop and smartphone platforms such as Windows, Linux, macOS, Android, and iOS.",\n                "verdict": "no",\n                "reason": "This statement is about the programming languages and platforms supported by OpenCV, not about advancements in computer vision."\n            },\n            {\n                "statement": "The most recent releases are OpenCV-4.5.2 and OpenCV-3.4.14.",\n                "verdict": "no",\n                "reason": "This statement is about OpenCV version releases, not about advancements in computer vision as a field."\n            },\n            {\n                
"statement": "It is free and open-source, as well as simple to use and install.",\n                "verdict": "no",\n                "reason": "This statement is about OpenCV\'s licensing and usability, not about advancements in computer vision."\n            },\n            {\n                "statement": "It is intended for numerical productivity with a heavy emphasis on real-time applications.",\n                "verdict": "no",\n                "reason": "This statement is about OpenCV\'s intended use, not about advancements in computer vision."\n            },\n            {\n                "statement": "The first version was in the C programming language; however, its success increased with the release of Version 2.0, which had a C++ implementation.",\n                "verdict": "no",\n                "reason": "This statement is about the history of OpenCV, not about advancements in computer vision."\n            },\n            {\n                "statement": "C++ is used to create new features.",\n                "verdict": "no",\n                "reason": "This statement is about the programming language used for OpenCV features, not about advancements in computer vision."\n            },\n            {\n                "statement": "OpenCV can be downloaded for free from http://opencv.org.",\n                "verdict": "no",\n                "reason": "This statement is about downloading OpenCV, not about advancements in computer vision."\n            },\n            {\n                "statement": "This platform includes the most recent distribution update (version: 4.5.2) as well as older iterations.",\n                "verdict": "no",\n                "reason": "This statement is about OpenCV distribution updates, not about advancements in computer vision."\n            },\n            {\n                "statement": "Photos must be in BGR or Grayscale format in order to be displayed or saved via OpenCV. 
Otherwise, unfavorable outcomes could occur.",\n                "verdict": "no",\n                "reason": "This statement is about image formats in OpenCV, not about advancements in computer vision."\n            },\n            {\n                "statement": "Face detection is a form of computer vision that aids in detecting and visualizing facial features in captured pictures or real-time videos.",\n                "verdict": "yes",\n                "reason": "This statement describes a key application area in computer vision, which is relevant to advancements in the field."\n            },\n            {\n                "statement": "This type of object detection technique detects instances of semantic artifacts of a given class (such as people, cars, and houses) in digital pictures and videos.",\n                "verdict": "yes",\n                "reason": "This statement describes object detection, a major area of advancement in computer vision."\n            },\n            {\n                "statement": "Face recognition has become increasingly important as technology has advanced, especially in fields such as photography, defense, and marketing.",\n                "verdict": "yes",\n                "reason": "This statement links the advancement of technology with the growing importance of face recognition, which is relevant to the input question."\n            }\n        ]\n    }\n]')], conversational=False, multimodal=False, input='What are the most recent advancements in computer vision?', actual_output='\nThe most recent advancements in computer vision include the use of deep learning techniques for image classification, object detection, and image segmentation. There has also been progress in implementing and optimizing these algorithms on hardware accelerators such as GPUs and FPGAs for real-time and energy-efficient operations. 
However, there is ongoing research in the field, and new advancements are constantly being made.', expected_output='Recent advancements in computer vision include Vision-Language Models (VLMs) that merge vision and language, Neural Radiance Fields (NeRFs) for 3D scene generation, and powerful Diffusion Models and Generative AI for creating realistic visuals. Other key areas are Edge AI for real-time processing, enhanced 3D vision techniques like NeRFs and Visual SLAM, advanced self-supervised learning methods, deepfake detection systems, and increased focus on Ethical AI and Explainable AI (XAI) to ensure fairness and transparency.', context=None, retrieval_context=['The field of computer vision is experiencing a great-leap-forward development today. This paper aims at providing a comprehensive survey of the recent progress on computer vision algorithms and their corresponding hardware implementations. In particular, the prominent achievements in computer vision tasks such as image classification, object detection and image segmentation brought by deep learning techniques are highlighted. On the other hand, review of techniques for implementing and optimizing deep-learning-based computer vision algorithms on GPU, FPGA and other new generations of hardware accelerators are presented to facilitate real-time and/or energy-efficient operations. Finally, several promising directions for future research are presented to motivate further development in the field.', "Computer vision is one of the fields of computer science that is one of the most powerful and persuasive types of artificial intelligence. It is similar to the human vision system, as it enables computers to recognize and process objects in pictures and videos in the same way as humans do. Computer vision technology has rapidly evolved in many fields and contributed to solving many problems, as computer vision contributed to self-driving cars, and cars were able to understand their surroundings. 
The cameras record video from different angles around the car, then a computer vision system gets images from the video, and then processes the images in real-time to find roadside ends, detect other cars, and read traffic lights, pedestrians, and objects. Computer vision also contributed to facial recognition; this technology enables computers to match images of people’s faces to their identities. which these algorithms detect facial features in images and then compare them with databases. Computer vision also play important role in Healthcare, in which algorithms can help automate tasks such as detecting Breast cancer, finding symptoms in x-ray, cancerous moles in skin images, and MRI scans. Computer vision also contributed to many fields such as image classification, object discovery, motion recognition, subject tracking, and medicine. The rapid development of artificial intelligence is making machine learning more important in his field of research. Use algorithms to find out every bit of data and predict the outcome. This has become an important key to unlocking the door to AI. If we had looked to deep learning concept, we find deep learning is a subset of machine learning, algorithms inspired by structure and function of the human brain called artificial neural networks, learn from large amounts of data. Deep learning algorithm perform a task repeatedly, each time tweak it a little to improve the outcome. So, the development of computer vision was due to deep learning. Now we'll take a tour around the convolution neural networks, let us say that convolutional neural networks are one of the most powerful supervised deep learning models (abbreviated as CNN or ConvNet). This name ;convolutional ; is a token from a mathematical linear operation between matrixes called convolution. 
CNN structure can be used in a variety of real-world problems including, computer vision, image recognition, natural language processing (NLP), anomaly detection, video analysis, drug discovery, recommender systems, health risk assessment, and time-series forecasting. If we look at convolutional neural networks, we see that CNN are similar to normal neural networks, the only difference between CNN and ANN is that CNNs are used in the field of pattern recognition within images mainly. This allows us to encode the features of an image into the structure, making the network more suitable for image-focused tasks, with reducing the parameters required to set-up the model. One of the advantages of CNN that it has an excellent performance in machine learning problems. So, we will use CNN as a classifier for image classification. So, the objective of this paper is that we will talk in detail about image classification in the following sections.", 'Computer Vision is one of the most fascinating and challenging tasks in the field of Artificial Intelligence.Computer Vision serves as a link between computer software and the visuals we see around us.It enables computer software to comprehend and learn about the visuals in its environment.As an example: The fruit is determined by its color, shape, and size.This job may seem simple for the human brain, but in the Computer Vision pipeline, we first collect data, then conduct data processing operations, and then train and educate the model to learn how to differentiate between fruits based on size, shape, and color.The main goal is to identify and comprehend the images and offer new images that are more useful for us in different life fieldsThe term "OpenCV" is an abbreviation for "open source computer vision."The architecture is made up of software, databases, and plugins that are pre-programmed with support for integrating computer vision applications [3].It is one of the most used toolkits with a large developer group.It is 
well-known for the size at which it builds realworld usage cases for industrial use.OpenCV follows C/C++, Python, Java programming languages and can be used to build computer vision software for desktop and smartphone platforms such as Windows, Linux, macOS, Android, and iOS.The most recent releases are OpenCV-4.5.2 and OpenCV-3.4.14.It is free and open-source, as well as simple to use and install.It is intended for numerical productivity with a heavy emphasis on real-time applications.The first version was in the C programming language; however, its success increased with the release of Version 2.0, which had a C++ implementation [2].C++ is used to create new features.OpenCV can be downloaded for free from http://opencv.org.This platform includes the most recent distribution update (version: 4.5.2) as well as older iterations.Photos must be in BGR or Grayscale format in order to be displayed or saved via OpenCV.Otherwise, unfavorable outcomes could occur [1].Face detection is a form of computer vision that aids in detecting and visualizing facial features in captured pictures or real-time videos.This type of object detection technique detects instances of semantic artifacts of a given class (such as people, cars, and houses) in digital pictures and videos.Face recognition has become increasingly important as technology has advanced, especially in fields such as photography, defense, and marketing [4], [5].'], additional_metadata=None)], confident_link=None)
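The recall and relevancy scores in the output above can be reproduced from the verdicts in `verbose_logs`. The sketch assumes both metrics are the share of "yes" verdicts, which is consistent with the reported numbers. The verdict counts are transcribed by hand from the logs, and `ratio` is a hypothetical helper.

```python
def ratio(verdicts):
    """Share of 'yes' verdicts; 0.0 for an empty list."""
    return verdicts.count("yes") / len(verdicts) if verdicts else 0.0


# verdict counts transcribed from the verbose logs above
recall_verdicts = ["no", "no"]     # neither expected-output sentence is attributable
relevancy_verdicts = (
    ["yes"] * 5                    # node 1: 5 of 5 statements relevant
    + ["yes"] * 14 + ["no"] * 4    # node 2: 14 of 18
    + ["yes"] * 7 + ["no"] * 15    # node 3: 7 of 22
)

scores = {
    "Contextual Precision": 1.0,   # taken directly from the output above
    "Contextual Recall": ratio(recall_verdicts),
    "Contextual Relevancy": ratio(relevancy_verdicts),
}
THRESHOLD = 0.5  # threshold reported for each metric in the output above
passed = {name: score >= THRESHOLD for name, score in scores.items()}
print(scores)
print(passed, "-> test case passes:", all(passed.values()))
```

Relevancy comes out at 26/45, roughly 0.58, and recall at 0.0. Because recall falls below the 0.5 threshold, the test case is reported with `success=False` even though the other two metrics pass.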

## Generation Evaluation

In [0]:
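#import generation metrics
from deepeval.metrics import AnswerRelevancyMetric, FaithfulnessMetric

#answer relevancy: does the actual output address the input question?
answer_relevancy = AnswerRelevancyMetric()
#faithfulness: is the actual output consistent with the retrieval context?
faithfulness = FaithfulnessMetric()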

In [0]:
#compute answer relevancy and print results
answer_relevancy.measure(test_case)
print("Score: ", answer_relevancy.score)
print("Reason: ", answer_relevancy.reason)

#compute faithfulness and print results
faithfulness.measure(test_case)
print("Score: ", faithfulness.score)
print("Reason: ", faithfulness.reason)

Score:  1.0
Reason:  The score is 1.00 because the answer was fully relevant and addressed the question directly without any irrelevant information. Great job staying focused and informative!

Score:  1.0
Reason:  Great job! There are no contradictions, so the actual output is fully faithful to the retrieval context.

In [0]:
#run all metrics with 'evaluate' function
evaluate(test_cases=[test_case],
         metrics=[answer_relevancy, faithfulness])

EvaluationResult(test_results=[TestResult(name='test_case_0', success=True, metrics_data=[MetricData(name='Answer Relevancy', threshold=0.5, success=True, score=1.0, reason='The score is 1.00 because the answer was fully relevant and addressed the question directly without any irrelevant information. Great job staying focused and informative!', strict_mode=False, evaluation_model='gpt-4.1', error=None, evaluation_cost=0.005181999999999999, verbose_logs='Statements:\n[\n    "Recent advancements in computer vision include the use of deep learning techniques for image classification.",\n    "Deep learning techniques are used for object detection in computer vision.",\n    "Deep learning techniques are used for image segmentation in computer vision.",\n    "There has been progress in implementing and optimizing these algorithms on hardware accelerators such as GPUs and FPGAs.",\n    "These implementations enable real-time and energy-efficient operations.",\n    "Ongoing research continues in the field of computer vision.",\n    "New advancements are constantly being made in computer vision."\n] \n \nVerdicts:\n[\n    {\n        "verdict": "yes",\n        "reason": null\n    },\n    {\n        "verdict": "yes",\n        "reason": null\n    },\n    {\n        "verdict": "yes",\n        "reason": null\n    },\n    {\n        "verdict": "yes",\n        "reason": null\n    },\n    {\n        "verdict": "yes",\n        "reason": null\n    },\n    {\n        "verdict": "yes",\n        "reason": null\n    },\n    {\n        "verdict": "yes",\n        "reason": null\n    }\n]'), MetricData(name='Faithfulness', threshold=0.5, success=True, score=1.0, reason='Great job! 
There are no contradictions, so the actual output is fully faithful to the retrieval context.', strict_mode=False, evaluation_model='gpt-4.1', error=None, evaluation_cost=0.016356, verbose_logs='Truths (limit=None):\n[\n    "The field of computer vision is experiencing rapid development.",\n    "This paper provides a comprehensive survey of recent progress in computer vision algorithms and their hardware implementations.",\n    "Deep learning techniques have brought prominent achievements in computer vision tasks such as image classification, object detection, and image segmentation.",\n    "The paper reviews techniques for implementing and optimizing deep-learning-based computer vision algorithms on GPU, FPGA, and other hardware accelerators.",\n    "Computer vision enables computers to recognize and process objects in pictures and videos in a way similar to humans.",\n    "Computer vision technology has evolved rapidly and contributed to solving problems in various fields, including self-driving cars, facial recognition, and healthcare.",\n    "In self-driving cars, computer vision systems process images from cameras to detect roadside ends, other cars, traffic lights, pedestrians, and objects in real time.",\n    "Facial recognition technology enables computers to match images of people\'s faces to their identities by detecting facial features and comparing them with databases.",\n    "Computer vision algorithms can help automate tasks in healthcare, such as detecting breast cancer, finding symptoms in x-rays, identifying cancerous moles in skin images, and analyzing MRI scans.",\n    "Computer vision contributes to fields such as image classification, object discovery, motion recognition, subject tracking, and medicine.",\n    "The rapid development of artificial intelligence is making machine learning more important in computer vision research.",\n    "Deep learning is a subset of machine learning, inspired by the structure and function of the human brain, 
called artificial neural networks.",\n    "Deep learning algorithms learn from large amounts of data and improve their performance by repeating tasks and tweaking outcomes.",\n    "The development of computer vision has been significantly advanced by deep learning.",\n    "Convolutional neural networks (CNNs or ConvNets) are powerful supervised deep learning models used in computer vision.",\n    "The term \'convolutional\' in CNNs comes from a mathematical linear operation between matrices called convolution.",\n    "CNNs can be used in real-world problems including computer vision, image recognition, natural language processing, anomaly detection, video analysis, drug discovery, recommender systems, health risk assessment, and time-series forecasting.",\n    "CNNs are similar to normal neural networks, but are mainly used for pattern recognition within images.",\n    "CNNs encode image features into their structure, making them suitable for image-focused tasks and reducing the number of parameters required.",\n    "CNNs have excellent performance in machine learning problems, especially as classifiers for image classification.",\n    "Computer vision serves as a link between computer software and the visuals in the environment, enabling software to comprehend and learn about visuals.",\n    "In computer vision, data is collected, processed, and used to train models to differentiate between objects based on features such as size, shape, and color.",\n    "The main goal of computer vision is to identify and comprehend images and provide useful new images for various fields.",\n    "OpenCV stands for \'open source computer vision.\'",\n    "OpenCV is made up of software, databases, and plugins that support integrating computer vision applications.",\n    "OpenCV is widely used and has a large developer community.",\n    "OpenCV supports C/C++, Python, and Java programming languages and can be used on Windows, Linux, macOS, Android, and iOS platforms.",\n    "Recent 
releases of OpenCV include versions 4.5.2 and 3.4.14.",\n    "OpenCV is free, open-source, and simple to use and install.",\n    "OpenCV is intended for numerical productivity with a focus on real-time applications.",\n    "The first version of OpenCV was in C, and its success increased with the release of version 2.0, which had a C++ implementation.",\n    "C++ is used to create new features in OpenCV.",\n    "OpenCV can be downloaded for free from http://opencv.org.",\n    "OpenCV provides the most recent distribution update (version 4.5.2) and older versions.",\n    "Photos must be in BGR or Grayscale format to be displayed or saved via OpenCV; otherwise, unfavorable outcomes could occur.",\n    "Face detection is a form of computer vision that detects and visualizes facial features in images or real-time videos.",\n    "Object detection techniques in computer vision can detect instances of semantic artifacts of a given class (such as people, cars, and houses) in digital images and videos.",\n    "Face recognition has become increasingly important in fields such as photography, defense, and marketing."\n] \n \nClaims:\n[\n    "The most recent advancements in computer vision include the use of deep learning techniques for image classification, object detection, and image segmentation.",\n    "There has been progress in implementing and optimizing these algorithms on hardware accelerators such as GPUs and FPGAs for real-time and energy-efficient operations.",\n    "There is ongoing research in the field of computer vision, and new advancements are constantly being made."\n] \n \nVerdicts:\n[\n    {\n        "verdict": "yes",\n        "reason": "The context states that deep learning techniques have brought prominent achievements in computer vision tasks such as image classification, object detection, and image segmentation."\n    },\n    {\n        "verdict": "yes",\n        "reason": "The context mentions that the paper reviews techniques for implementing and 
optimizing deep-learning-based computer vision algorithms on GPU, FPGA, and other hardware accelerators."\n    },\n    {\n        "verdict": "yes",\n        "reason": "The context states that the field of computer vision is experiencing rapid development, which supports the claim of ongoing research and constant advancements."\n    }\n]')], conversational=False, multimodal=False, input='What are the most recent advancements in computer vision?', actual_output='\nThe most recent advancements in computer vision include the use of deep learning techniques for image classification, object detection, and image segmentation. There has also been progress in implementing and optimizing these algorithms on hardware accelerators such as GPUs and FPGAs for real-time and energy-efficient operations. However, there is ongoing research in the field, and new advancements are constantly being made.', expected_output='Recent advancements in computer vision include Vision-Language Models (VLMs) that merge vision and language, Neural Radiance Fields (NeRFs) for 3D scene generation, and powerful Diffusion Models and Generative AI for creating realistic visuals. Other key areas are Edge AI for real-time processing, enhanced 3D vision techniques like NeRFs and Visual SLAM, advanced self-supervised learning methods, deepfake detection systems, and increased focus on Ethical AI and Explainable AI (XAI) to ensure fairness and transparency.', context=None, retrieval_context=['The field of computer vision is experiencing a great-leap-forward development today. This paper aims at providing a comprehensive survey of the recent progress on computer vision algorithms and their corresponding hardware implementations. In particular, the prominent achievements in computer vision tasks such as image classification, object detection and image segmentation brought by deep learning techniques are highlighted. 
On the other hand, review of techniques for implementing and optimizing deep-learning-based computer vision algorithms on GPU, FPGA and other new generations of hardware accelerators are presented to facilitate real-time and/or energy-efficient operations. Finally, several promising directions for future research are presented to motivate further development in the field.', "Computer vision is one of the fields of computer science that is one of the most powerful and persuasive types of artificial intelligence. It is similar to the human vision system, as it enables computers to recognize and process objects in pictures and videos in the same way as humans do. Computer vision technology has rapidly evolved in many fields and contributed to solving many problems, as computer vision contributed to self-driving cars, and cars were able to understand their surroundings. The cameras record video from different angles around the car, then a computer vision system gets images from the video, and then processes the images in real-time to find roadside ends, detect other cars, and read traffic lights, pedestrians, and objects. Computer vision also contributed to facial recognition; this technology enables computers to match images of people’s faces to their identities. which these algorithms detect facial features in images and then compare them with databases. Computer vision also play important role in Healthcare, in which algorithms can help automate tasks such as detecting Breast cancer, finding symptoms in x-ray, cancerous moles in skin images, and MRI scans. Computer vision also contributed to many fields such as image classification, object discovery, motion recognition, subject tracking, and medicine. The rapid development of artificial intelligence is making machine learning more important in his field of research. Use algorithms to find out every bit of data and predict the outcome. This has become an important key to unlocking the door to AI. 
If we had looked to deep learning concept, we find deep learning is a subset of machine learning, algorithms inspired by structure and function of the human brain called artificial neural networks, learn from large amounts of data. Deep learning algorithm perform a task repeatedly, each time tweak it a little to improve the outcome. So, the development of computer vision was due to deep learning. Now we'll take a tour around the convolution neural networks, let us say that convolutional neural networks are one of the most powerful supervised deep learning models (abbreviated as CNN or ConvNet). This name ;convolutional ; is a token from a mathematical linear operation between matrixes called convolution. CNN structure can be used in a variety of real-world problems including, computer vision, image recognition, natural language processing (NLP), anomaly detection, video analysis, drug discovery, recommender systems, health risk assessment, and time-series forecasting. If we look at convolutional neural networks, we see that CNN are similar to normal neural networks, the only difference between CNN and ANN is that CNNs are used in the field of pattern recognition within images mainly. This allows us to encode the features of an image into the structure, making the network more suitable for image-focused tasks, with reducing the parameters required to set-up the model. One of the advantages of CNN that it has an excellent performance in machine learning problems. So, we will use CNN as a classifier for image classification. 
So, the objective of this paper is that we will talk in detail about image classification in the following sections.", 'Computer Vision is one of the most fascinating and challenging tasks in the field of Artificial Intelligence.Computer Vision serves as a link between computer software and the visuals we see around us.It enables computer software to comprehend and learn about the visuals in its environment.As an example: The fruit is determined by its color, shape, and size.This job may seem simple for the human brain, but in the Computer Vision pipeline, we first collect data, then conduct data processing operations, and then train and educate the model to learn how to differentiate between fruits based on size, shape, and color.The main goal is to identify and comprehend the images and offer new images that are more useful for us in different life fieldsThe term "OpenCV" is an abbreviation for "open source computer vision."The architecture is made up of software, databases, and plugins that are pre-programmed with support for integrating computer vision applications [3].It is one of the most used toolkits with a large developer group.It is well-known for the size at which it builds realworld usage cases for industrial use.OpenCV follows C/C++, Python, Java programming languages and can be used to build computer vision software for desktop and smartphone platforms such as Windows, Linux, macOS, Android, and iOS.The most recent releases are OpenCV-4.5.2 and OpenCV-3.4.14.It is free and open-source, as well as simple to use and install.It is intended for numerical productivity with a heavy emphasis on real-time applications.The first version was in the C programming language; however, its success increased with the release of Version 2.0, which had a C++ implementation [2].C++ is used to create new features.OpenCV can be downloaded for free from http://opencv.org.This platform includes the most recent distribution update (version: 4.5.2) as well as older 
iterations.Photos must be in BGR or Grayscale format in order to be displayed or saved via OpenCV.Otherwise, unfavorable outcomes could occur [1].Face detection is a form of computer vision that aids in detecting and visualizing facial features in captured pictures or real-time videos.This type of object detection technique detects instances of semantic artifacts of a given class (such as people, cars, and houses) in digital pictures and videos.Face recognition has become increasingly important as technology has advanced, especially in fields such as photography, defense, and marketing [4], [5].'], additional_metadata=None)], confident_link=None)
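
The verbose logs above list one verdict per statement (Answer Relevancy, 7 verdicts) or per claim (Faithfulness, 3 verdicts). Below is a minimal sketch of how such verdicts could be reduced to a score and a pass/fail flag. It assumes the score is the share of verdicts that are not "no", and that a metric passes when its score reaches the threshold (0.5 in the output above). The helpers `verdict_ratio` and `passes` are hypothetical names for illustration and are not part of deepeval.

```python
def verdict_ratio(verdicts):
    """Return the share of verdicts that are not an explicit 'no'."""
    if not verdicts:
        #assumption: nothing to judge counts as a perfect score
        return 1.0
    kept = sum(1 for v in verdicts if v.strip().lower() != "no")
    return kept / len(verdicts)


def passes(score, threshold=0.5):
    """Assumed success rule: the score must reach the threshold."""
    return score >= threshold


#verdict counts taken from the verbose logs above
relevancy_score = verdict_ratio(["yes"] * 7)
faithfulness_score = verdict_ratio(["yes"] * 3)
print("Answer Relevancy: ", relevancy_score, passes(relevancy_score))
print("Faithfulness: ", faithfulness_score, passes(faithfulness_score))

#hypothetical case: one of three claims contradicts the context
mixed_score = verdict_ratio(["yes", "no", "yes"])
print("Hypothetical: ", round(mixed_score, 2), passes(mixed_score))
```

Under these assumptions both metrics score 1.0 because every verdict is "yes". Neither metric compares the answer with the expected output, so a perfect score here says nothing about whether the answer covers the advancements listed in `expected_output`.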

## Beyond Generic Evaluation

In [0]:
#create custom G-Eval metric that judges only the technical language of the actual output
tech_eval = GEval(
    name="Technical Language",
    criteria="Determine how technically written the actual output is",
    evaluation_params=[LLMTestCaseParams.ACTUAL_OUTPUT])

#run evaluation
tech_eval.measure(test_case)
print("Score: ", tech_eval.score)
print("Reason: ", tech_eval.reason)

Score:  0.6437823499114202
Reason:  The response uses appropriate technical terminology such as 'deep learning', 'image classification', 'object detection', 'image segmentation', 'GPUs', and 'FPGAs'. The explanations are clear but somewhat general, lacking specific examples or recent breakthroughs. The technical detail is moderate, mentioning both algorithmic and hardware aspects, but does not delve into particular models or methods. The writing is mostly formal and adheres to technical conventions, but the depth and specificity could be improved.
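
The score above is not a round number. One likely explanation, assuming the judge follows the G-Eval approach of weighting each candidate rating by the probability the judge model assigns to it, is that the final score is an expected value rather than a single rating. Below is a minimal sketch of that calculation. The probabilities are made up for illustration and are not taken from this run, and `weighted_geval_score` is a hypothetical helper, not a deepeval function.

```python
def weighted_geval_score(rating_probs, max_rating=10):
    """Probability-weighted rating, rescaled to the 0-1 range.

    rating_probs maps each candidate rating to the probability
    the judge model assigned to it (hypothetical values here).
    """
    total = sum(rating_probs.values())
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    #normalise in case the probabilities do not sum exactly to 1
    expected = sum(r * p for r, p in rating_probs.items()) / total
    return expected / max_rating


#illustrative only: the judge is split evenly between ratings 6 and 7
example_probs = {6: 0.5, 7: 0.5}
example_score = weighted_geval_score(example_probs)
print("Score: ", example_score)
```

With ratings on a 0 to 10 scale, an even split between 6 and 7 gives an expected rating of 6.5, which is 0.65 after rescaling. A value such as 0.64 would be consistent with a judge that mostly hesitated between those two ratings.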