# **Evaluating Natural Language Generation with RAGAS**

### Overview
In this notebook, you will explore RAGAS (by ExplodingGradients), an open source framework for evaluating natural language generation (NLG). RAGAS aims to create an open standard, providing developers with the tools and techniques to leverage continual learning in their RAG applications. Using RAGAS for NLG evaluation, you will be able to evaluate each component of your RAG pipeline in isolation. RAGAS primarily uses four core metrics:
1. Faithfulness: How factually consistent a generated answer is with the retrieved context
2. Answer Relevancy: How relevant an answer is to the question
3. Context Precision: The signal-to-noise ratio of the retrieved context
4. Context Recall: Whether all the relevant information required to answer the question was retrieved (_requires ground truth_)
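
To make the ratio-style metrics above more concrete, here is a minimal sketch of how faithfulness and context recall can be thought of as "supported statements divided by total statements". This is a simplification and not the RAGAS implementation: in RAGAS an LLM extracts the statements and judges each one, whereas here the verdicts are hard-coded, hypothetical values and `supported_fraction` is an illustrative helper name.

```python
def supported_fraction(verdicts):
    """Return the share of statements judged as supported (True)."""
    if not verdicts:
        raise ValueError("at least one verdict is required")
    return sum(verdicts) / len(verdicts)


# Hypothetical verdicts; in RAGAS an LLM would produce these judgments.
# Faithfulness: can each claim in the answer be inferred from the retrieved context?
answer_claim_verdicts = [True, True, False, True]
faithfulness_score = supported_fraction(answer_claim_verdicts)

# Context recall: can each ground-truth statement be attributed to the retrieved context?
ground_truth_verdicts = [True, False]
context_recall_score = supported_fraction(ground_truth_verdicts)

print(f"faithfulness   = {faithfulness_score:.2f}")
print(f"context recall = {context_recall_score:.2f}")
```

Both scores fall between 0 and 1, with higher values being better. Answer relevancy and context precision are computed differently (they involve embeddings and ranking, respectively), so they are not covered by this sketch.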

_Notes_  
- For this notebook, we will use 30 samples from the public [FIQA](https://sites.google.com/view/fiqa/) dataset, as published by ExplodingGradients
  - _Schema_ = `question`, `ground_truths`, `answer`, `contexts`
- For this notebook, we will use the previously established Azure OpenAI connection; however, a regular OpenAI connection can also be used
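
As a rough illustration of the schema noted above, the sketch below shows what a single record might look like and how its shape could be checked before evaluation. The record text is invented for illustration and is not taken from the dataset. The field types (strings for `question` and `answer`, lists of strings for `ground_truths` and `contexts`) are an assumption, and `schema_problems` is a hypothetical helper, not part of RAGAS.

```python
# Hypothetical record following the schema above; the text is invented for illustration.
record = {
    "question": "How do I deposit a cheque issued to a business?",
    "ground_truths": ["Endorse the cheque and deposit it into the business account."],
    "answer": "You can deposit it into the business account after endorsing it.",
    "contexts": ["Cheques made out to a business are deposited into the business account."],
}

# Assumed field types: plain strings for question/answer, lists for the other two.
EXPECTED_TYPES = {
    "question": str,
    "ground_truths": list,
    "answer": str,
    "contexts": list,
}


def schema_problems(rec):
    """Return a list of human-readable schema problems (empty if none were found)."""
    problems = []
    for field, expected in EXPECTED_TYPES.items():
        if field not in rec:
            problems.append(f"missing field: {field}")
        elif not isinstance(rec[field], expected):
            problems.append(f"{field} should be {expected.__name__}")
    return problems


print(schema_problems(record))
```

A check like this can be handy when swapping in your own data, since a missing or mistyped column is easier to diagnose before the evaluation starts than after it fails.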

**_Go Deeper_**  
[RAGAS Documentation](https://docs.ragas.io/en/stable/index.html)  
[RAGAS Project GitHub](https://github.com/explodinggradients/ragas)
  
**_Prerequisites_**  
  
Ensure that your environment is set up by completing the steps outlined in [0_setup.ipynb](./0_setup.ipynb)

In [None]:
# Import Libraries
import os
import pandas as pd
from datasets import load_dataset
from azure.identity import DefaultAzureCredential
from azure.ai.ml import MLClient
from dotenv import load_dotenv, find_dotenv
from langchain_openai.chat_models import AzureChatOpenAI
from langchain_openai.embeddings import AzureOpenAIEmbeddings
from ragas import evaluate
from ragas.metrics import (
    context_precision,
    answer_relevancy,
    faithfulness,
    context_recall,
)

In [None]:
# Setup environment
load_dotenv(find_dotenv(), override=True)
print(os.getenv("WORKSPACE_NAME"))

# Get a handle to the workspace
ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id = os.environ.get('SUBSCRIPTION_ID'),
    resource_group_name = os.environ.get('RESOURCE_GROUP_NAME'),
    workspace_name = os.environ.get('WORKSPACE_NAME'),
)

In [None]:
# Set config variables
metrics = [
    faithfulness,
    answer_relevancy,
    context_recall,
    context_precision,
]

azure_configs = {
    "azure_endpoint": os.environ.get("AZURE_OPENAI_ENDPOINT"),
    "aoai_key": os.environ.get("AZURE_OPENAI_KEY"),
    "model_deployment": "aoai-gpt4",
    "model_name": "gpt-4",
    "embedding_deployment": "aoai-ada",
    "embedding_name": "text-embedding-ada-002"
}

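# Print the config with the API key masked so it is not exposed in the notebook output
print({k: ("***" if k == "aoai_key" else v) for k, v in azure_configs.items()})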

In [None]:
# Load dataset
fiqa = load_dataset("explodinggradients/fiqa", "ragas_eval")
display(fiqa)

In [None]:
# Create model instances to be used for evaluation

azure_model = AzureChatOpenAI(
    openai_api_version="2023-07-01-preview",
    azure_endpoint=azure_configs["azure_endpoint"],
    azure_deployment=azure_configs["model_deployment"],
    model=azure_configs["model_name"],
    openai_api_type="azure",
    openai_api_key=azure_configs["aoai_key"],
    validate_base_url=False,
)

# Initialize the embeddings model (needed by embedding-based metrics such as answer_relevancy)
azure_embeddings = AzureOpenAIEmbeddings(
    openai_api_version="2023-07-01-preview",
    azure_endpoint=azure_configs["azure_endpoint"],
    azure_deployment=azure_configs["embedding_deployment"],
    model=azure_configs["embedding_name"],
    openai_api_type="azure",
    openai_api_key=azure_configs["aoai_key"],
)

In [None]:
# Evaluate - this may take several minutes
result = evaluate(
    fiqa["baseline"],
    metrics=metrics,
    llm=azure_model,
    embeddings=azure_embeddings,
    raise_exceptions=False
)

In [None]:
# View Results
display(result.to_pandas())

Note: the following known issues with the RAGAS framework may surface while running this notebook:
- [#555 Index Errors](https://github.com/explodinggradients/ragas/issues/555)
- [#395 Dictionary Format Outputs](https://github.com/explodinggradients/ragas/issues/395)
- [#536 OpenAI Integration Broken](https://github.com/explodinggradients/ragas/issues/536)
- [#449 Azure Content Filter Triggered](https://github.com/explodinggradients/ragas/issues/449)
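
Because the evaluation call above sets `raise_exceptions=False`, rows affected by issues like these may come back with missing (NaN) scores instead of stopping the run. The sketch below shows one way to summarize per-metric scores while skipping such rows and counting them. The `rows` values are made up for illustration, and `mean_ignoring_nan` is a hypothetical helper written in plain Python so that it runs without the notebook's dependencies.

```python
import math


def mean_ignoring_nan(values):
    """Average the values, skipping NaN entries; return NaN if nothing is left."""
    valid = [v for v in values if not math.isnan(v)]
    return sum(valid) / len(valid) if valid else math.nan


# Made-up scores standing in for rows of the results table.
rows = [
    {"faithfulness": 1.0, "answer_relevancy": 0.5},
    {"faithfulness": math.nan, "answer_relevancy": 0.75},  # a row where one metric failed
    {"faithfulness": 0.5, "answer_relevancy": 1.0},
]

metric_names = ["faithfulness", "answer_relevancy"]
summary = {m: mean_ignoring_nan([r[m] for r in rows]) for m in metric_names}
failed_counts = {m: sum(math.isnan(r[m]) for r in rows) for m in metric_names}

print("mean scores:", summary)
print("failed rows:", failed_counts)
```

With the DataFrame returned by `result.to_pandas()`, calling `.mean()` on the metric columns gives a similar summary, since pandas skips NaN values by default. Reporting the number of failed rows alongside the averages helps avoid reading too much into a score computed from only a few samples.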