<a href="https://colab.research.google.com/github/uptrain-ai/uptrain/blob/main/examples/root_cause_analysis/rag_with_citation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<h1 align="center">
  <a href="https://uptrain.ai">
    <img width="300" src="https://user-images.githubusercontent.com/108270398/214240695-4f958b76-c993-4ddd-8de6-8668f4d0da84.png" alt="uptrain">
  </a>
</h1>

<h1 style="text-align: center;">Root Cause Analysis</h1>

<h2 style="text-align: center;">RAG with Citation</h2>

**What is RAG with citation RCA?**: The RAG with citation template helps you analyse why a RAG pipeline fails. It can highlight failure causes such as:
- Poor Retrieval: The context does not have information relevant to the question asked.
- Hallucinations: The generated response is not factually correct.
- Poor Citation: The cited information is not factually correct.
- Poor Context Utilization: The cited information is not relevant to the question asked.

**Data schema**: The data schema required for this evaluation is as follows:

| Column Name | Description |
| ----------- | ----------- |
| question | The question asked by the user |
| context | Additional information provided that can be used to answer the question |
| cited_context | Information retrieved from the context |
| response | The response given by the model |

If you face any difficulties, need help using UpTrain, or want to brainstorm custom evaluations for your use case, [speak to the maintainers of UpTrain here](https://calendly.com/uptrain-sourabh/30min).

## Step 1: Install UpTrain

In [1]:
# !pip install uptrain

## Step 2: Let's define the dataset to run evaluations on

In [2]:
data = [
    {
        'question': 'Which team won the 2023 ICC Cricket World Cup?',
        'context': 'Argentina won the 2022 FIFA World Cup. The 2022 FIFA World Cup took place in Qatar from 20 November to 18 December 2022. The previous FIFA World Cup was held in Russia.',
        'cited_context': 'The 2022 FIFA World Cup took place in Qatar from 20 November to 18 December 2022',
        'response': 'The 2023 ICC Cricket World Cup was won by Qatar.'        
    },
    {
        'question': 'Where was the 2022 FIFA World Cup held?',
        'context': 'Argentina won the 2022 FIFA World Cup. The 2022 FIFA World Cup took place in Qatar from 20 November to 18 December 2022. The previous FIFA World Cup was held in Russia.',
        'cited_context': 'The previous FIFA World Cup was held in Russia.',
        'response': 'The previous FIFA World Cup was held in Russia.'        
    },
    {
        'question': 'Who won the 2022 FIFA World Cup?',
        'context': 'Aliens won the 2022 FIFA World Cup. The 2022 FIFA World Cup took place in Qatar from 20 November to 18 December 2022. The previous FIFA World Cup was held in Russia.',
        'cited_context': 'Aliens won the FIFA World Cup.',
        'response': 'The 2022 FIFA World Cup was won by Aliens.'        
    }
]
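
Before sending rows for evaluation, it can help to confirm that each one carries the four columns from the data schema. The following is a minimal sketch, not part of UpTrain: the helper `find_schema_problems` and the `sample_rows` list are hypothetical names introduced here, and the check that every value is a non-empty string is an assumption rather than a documented requirement.

```python
# Column names taken from the data schema table in this notebook.
REQUIRED_COLUMNS = ("question", "context", "cited_context", "response")


def find_schema_problems(rows):
    """Return (row_index, problem) tuples; an empty list means no problems were found."""
    problems = []
    for index, row in enumerate(rows):
        for column in REQUIRED_COLUMNS:
            if column not in row:
                problems.append((index, f"missing column '{column}'"))
                continue
            value = row[column]
            # Assumption: every column is expected to hold non-empty text.
            if not isinstance(value, str) or not value.strip():
                problems.append((index, f"column '{column}' is empty or not a string"))
    return problems


# Hypothetical rows for illustration: the second one lacks 'cited_context'.
sample_rows = [
    {
        "question": "Where was the 2022 FIFA World Cup held?",
        "context": "The 2022 FIFA World Cup took place in Qatar.",
        "cited_context": "The 2022 FIFA World Cup took place in Qatar.",
        "response": "Qatar.",
    },
    {
        "question": "Who won the 2022 FIFA World Cup?",
        "context": "Argentina won the 2022 FIFA World Cup.",
        "response": "Argentina.",
    },
]

for index, problem in find_schema_problems(sample_rows):
    print(f"row {index}: {problem}")
```

Running the same helper on the `data` list defined in this step should report nothing, since every row there includes all four columns.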

## Step 3: Running evaluations using UpTrain

In [3]:
from uptrain import APIClient, RcaTemplate
import json

UPTRAIN_API_KEY = "up-********************"  # Insert your UpTrain API key here

uptrain_client = APIClient(uptrain_api_key=UPTRAIN_API_KEY)

res = uptrain_client.perform_root_cause_analysis(
    'Sample-RCA',
    data=data,
    rca_template=RcaTemplate.RAG_WITH_CITATION,
)

print(json.dumps(res, indent=3))

2024-02-19 00:50:02.884 | INFO     | uptrain.framework.remote:perform_root_cause_analysis:505 - Sending root cause analysis request for rows 0 to <50 to the Uptrain server


[
   {
      "question": "Which team won the 2023 ICC Cricket World Cup?",
      "context": "Argentina won the 2022 FIFA World Cup. The 2022 FIFA World Cup took place in Qatar from 20 November to 18 December 2022. The previous FIFA World Cup was held in Russia.",
      "cited_context": "The 2022 FIFA World Cup took place in Qatar from 20 November to 18 December 2022",
      "response": "The 2023 ICC Cricket World Cup was won by Qatar.",
      "error_mode": "Poor Retrieval",
      "error_resolution_suggestion": "Context Retrieval Pipeline needs improvement",
      "score_question_completeness": 1,
      "score_valid_response": 1.0,
      "explanation_valid_response": "Step by step reasoning:\n\n1. The question asks for the team that won the 2023 ICC Cricket World Cup.\n2. The response states \"The 2023 ICC Cricket World Cup was won by Qatar.\"\n\nConclusion:\nThe given response does contain some information.\n\n[Choice]: A",
      "score_context_relevance": 0.0,
      "explanation_conte
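
With more than a handful of rows, a tally of `error_mode` values is often easier to scan than the full JSON dump. The sketch below assumes the result is a list of dictionaries that each carry an `error_mode` key, as in this notebook's output. `sample_results` is a hypothetical stand-in for `res` that keeps only two fields per row, and `count_error_modes` is a helper name introduced here.

```python
from collections import Counter

# Hypothetical stand-in for `res`, using the error modes this notebook reports
# for its three rows and omitting every other field.
sample_results = [
    {"question": "Which team won the 2023 ICC Cricket World Cup?", "error_mode": "Poor Retrieval"},
    {"question": "Where was the 2022 FIFA World Cup held?", "error_mode": "Poor Context Utilization"},
    {"question": "Who won the 2022 FIFA World Cup?", "error_mode": "Hallucinations"},
]


def count_error_modes(results):
    """Count how often each error mode occurs; rows without the key count as 'unknown'."""
    return Counter(row.get("error_mode", "unknown") for row in results)


error_mode_counts = count_error_modes(sample_results)
for mode, count in error_mode_counts.most_common():
    print(f"{mode}: {count}")
```

Passing `res` instead of `sample_results` should give the same kind of summary for a real run, provided the result keeps the shape shown above.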

## Step 4: Let's look at some of the results

### Sample with Poor Retrieval

In [4]:
print(json.dumps(res[0],indent=3))

{
   "question": "Which team won the 2023 ICC Cricket World Cup?",
   "context": "Argentina won the 2022 FIFA World Cup. The 2022 FIFA World Cup took place in Qatar from 20 November to 18 December 2022. The previous FIFA World Cup was held in Russia.",
   "cited_context": "The 2022 FIFA World Cup took place in Qatar from 20 November to 18 December 2022",
   "response": "The 2023 ICC Cricket World Cup was won by Qatar.",
   "error_mode": "Poor Retrieval",
   "error_resolution_suggestion": "Context Retrieval Pipeline needs improvement",
   "score_question_completeness": 1,
   "score_valid_response": 1.0,
   "explanation_valid_response": "Step by step reasoning:\n\n1. The question asks for the team that won the 2023 ICC Cricket World Cup.\n2. The response states \"The 2023 ICC Cricket World Cup was won by Qatar.\"\n\nConclusion:\nThe given response does contain some information.\n\n[Choice]: A",
   "score_context_relevance": 0.0,
   "explanation_context_relevance": " \"The extracted conte

### Sample with Poor Context Utilization

In [5]:
print(json.dumps(res[1],indent=3))

{
   "question": "Where was the 2022 FIFA World Cup held?",
   "context": "Argentina won the 2022 FIFA World Cup. The 2022 FIFA World Cup took place in Qatar from 20 November to 18 December 2022. The previous FIFA World Cup was held in Russia.",
   "cited_context": "The previous FIFA World Cup was held in Russia.",
   "response": "The previous FIFA World Cup was held in Russia.",
   "error_mode": "Poor Context Utilization",
   "error_resolution_suggestion": "Add intermediary steps so as the LLM can better understand context and generate a complete response",
   "score_question_completeness": 1,
   "score_valid_response": 1.0,
   "explanation_valid_response": "The question is \"Where was the 2022 FIFA World Cup held?\" and the response is \"The previous FIFA World Cup was held in Russia.\"\n\nStep by step reasoning:\n1. The response provides information about the location of the previous FIFA World Cup, stating that it was held in Russia.\n2. Although the response does not directly answ

### Sample with Hallucinations

In [6]:
print(json.dumps(res[2],indent=3))

{
   "question": "Who won the 2022 FIFA World Cup?",
   "context": "Aliens won the 2022 FIFA World Cup. The 2022 FIFA World Cup took place in Qatar from 20 November to 18 December 2022. The previous FIFA World Cup was held in Russia.",
   "cited_context": "Aliens won the FIFA World Cup.",
   "response": "The 2022 FIFA World Cup was won by Aliens.",
   "error_mode": "Hallucinations",
   "error_resolution_suggestion": "Add instructions to your LLM to adher to the context provide - Try tipping",
   "score_question_completeness": 1,
   "score_valid_response": 1.0,
   "explanation_valid_response": "The given response to the question \"Who won the 2022 FIFA World Cup?\" is \"The 2022 FIFA World Cup was won by Aliens.\"\n\nStep by step reasoning:\n1. The response provides information about the winner of the 2022 FIFA World Cup, stating that it was won by \"Aliens.\"\n2. Although the information provided is not factually accurate, it does contain a specific response to the question asked.\n\nT
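
The `explanation_*` fields make each printed row long. One possible way to get a shorter view is to drop them and keep the inputs, the `error_mode`, and the `score_*` fields. This is a sketch under the assumption that explanation fields always start with `explanation_`, as they do in the output above. `compact_view` and `sample_row` are hypothetical names, and the explanation text in `sample_row` is a placeholder rather than real output.

```python
import json


def compact_view(row, drop_prefixes=("explanation_",)):
    """Return a copy of the row without keys that start with any of the given prefixes."""
    return {key: value for key, value in row.items() if not key.startswith(drop_prefixes)}


# Hypothetical row shaped like the third result above, with a placeholder explanation.
sample_row = {
    "question": "Who won the 2022 FIFA World Cup?",
    "response": "The 2022 FIFA World Cup was won by Aliens.",
    "error_mode": "Hallucinations",
    "score_valid_response": 1.0,
    "explanation_valid_response": "(long explanation text)",
}

print(json.dumps(compact_view(sample_row), indent=3))
```

Calling `compact_view(res[2])` would apply the same filter to a real result row.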