# Hallucination Rule Demonstration Notebook

This notebook demonstrates what the Shield Hallucination Rule can do by walking through examples of each possible outcome:

- Hallucination detected.
- No hallucination detected.
- Not evaluated for hallucination.

Prerequisites:
- A Shield env and API key.

1. Create a new task for Hallucination evaluation 
2. Arthur Benchmark dataset evaluation 
   1. Run the examples against a pre-configured Shield task from Step 1 
   2. View our results 
3. Evaluation of additional examples using datasets referenced in our documentation: https://shield.docs.arthur.ai/docs/hallucination#benchmarks
   1. Run the examples against a pre-configured Shield task from Step 1
   2. View our results

#### Configure Shield Test Env Details

In [None]:
%pip install datasets
%pip install scikit-learn
from datasets import load_dataset, concatenate_datasets
import pandas as pd
from os.path import abspath, join
import sys
import random

utils_path = abspath(join('..', 'utils'))
if utils_path not in sys.path:
    sys.path.append(utils_path)

from shield_utils import setup_env, set_up_task_and_rule, run_shield_evaluation
from analysis_utils import print_performance_metrics, granular_result_dfs


pd.set_option('display.max_colwidth', None)
pd.set_option('display.max_rows', None)

setup_env(base_url="<URL>", api_key="<API_KEY>")

---
### 1. Setup: Configure a test task and enable the Hallucination Rule

In [None]:
hallucination_rule_config =  {
    "name": "Hallucination Rule",
    "type": "ModelHallucinationRuleV2",
    "apply_to_prompt": False,
    "apply_to_response": True
}

# Create the task, archive every rule except the one passed in, then create that rule
hallucination_rule, hallucination_task = set_up_task_and_rule(hallucination_rule_config, "hallucination-test-task")

print(hallucination_rule)
print(hallucination_task)

---
### 2. Arthur benchmark dataset evaluation

In [32]:
hallucination_benchmark_df_arthur = pd.read_csv("./arthur_benchmark_datasets/hallucination_benchmark_dummy.csv")

hallucination_benchmark_df_arthur['label'] = hallucination_benchmark_df_arthur['binary_label'].map({0: False, 1: True})

#### 2.1 Run the examples against a pre-configured Shield task from Step 1

In [33]:
# Sanity check: the task must have exactly one enabled rule, and it must be the hallucination rule
if len(hallucination_task["rules"]) > 1:
    raise Exception("Cannot have more than one rule enabled for this test.")
elif hallucination_task["rules"][0]["type"] != "ModelHallucinationRuleV2":
    raise Exception("Invalid rule type enabled. Must be ModelHallucinationRuleV2.")
else:
    print(f"Valid task {hallucination_task}")

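from datetime import datetime

# Assumption: these two helpers live in shield_utils next to the functions imported earlier
from shield_utils import task_prompt_validation, task_response_validation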

# Timestamp without spaces or colons so it is safe to use in a file name on any OS
current_datetime = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")

task_id = hallucination_task["id"]

def shield_hallucination_evaluation(row):
    # Send a placeholder prompt first; the response check needs its inference ID
    shield_prompt_inference = task_prompt_validation("dummy", 1, task_id)
    inference_id = shield_prompt_inference["inference_id"]

    # Validate the response text against its context under the same inference
    shield_result = task_response_validation(row.text, row.context, inference_id, task_id)

    # TODO - add the reason or claims breakdown

    for rule_result in shield_result["rule_results"]:
        if rule_result["id"] == hallucination_rule["id"]:
            result = rule_result["result"]
            print(result)
            # Any result other than "Pass" is treated as a hallucination
            return result != "Pass"

    # No result was returned for the hallucination rule
    return None
        

hallucination_benchmark_df_arthur["shield_result"] = hallucination_benchmark_df_arthur.apply(shield_hallucination_evaluation, axis=1)

# Save to CSV to avoid having to run this again to view results
hallucination_benchmark_df_arthur.to_csv(f"./results/hallucination_benchmark_df_arthur_{current_datetime}.csv")

Valid task {'id': '276c85b0-55f1-48a1-bcb2-cb5d6d806df8', 'name': 'hallucination-test-task-2257', 'created_at': 1711732552332, 'updated_at': 1711732552332, 'rules': [{'id': '9f3737b0-395c-4420-81d1-b56851d983d4', 'name': 'Hallucination Rule', 'type': 'ModelHallucinationRuleV2', 'apply_to_prompt': False, 'apply_to_response': True, 'enabled': True, 'scope': 'task', 'created_at': 1711732552481, 'updated_at': 1711732552481, 'config': None}]}
Fail
Pass
Fail
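
The evaluation function above treats every result other than `"Pass"` as a hallucination, which folds the "not evaluated" outcome into the positive class. The following is a minimal sketch of a three-way mapping that keeps that outcome separate. `to_hallucination_flag` is a hypothetical helper, and only `"Pass"` and `"Fail"` are result strings confirmed by the output above; anything else is assumed here to mean "not evaluated".

```python
from typing import Optional


def to_hallucination_flag(result: str) -> Optional[bool]:
    """Map a rule result string to a hallucination flag.

    "Pass" and "Fail" are the values seen in this notebook's output. Any
    other value is treated as "not evaluated" (an assumption, not a
    documented Shield behavior).
    """
    if result == "Pass":
        return False  # no hallucination detected
    if result == "Fail":
        return True   # hallucination detected
    return None       # not evaluated, or an unrecognized result


# "Skipped" is a hypothetical placeholder for a not-evaluated result.
flags = [to_hallucination_flag(r) for r in ["Fail", "Pass", "Fail", "Skipped"]]
print(flags)  # [True, False, True, None]
```

Rows flagged `None` could then be counted separately or dropped before computing metrics, rather than being scored as detections.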


#### 2.2 Analyze Results

In [None]:
print_performance_metrics(hallucination_benchmark_df_arthur)

arthur_fn, arthur_fp, arthur_tp, arthur_tn = granular_result_dfs(hallucination_benchmark_df_arthur)
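
The implementations of `print_performance_metrics` and `granular_result_dfs` are not shown in this notebook. As a rough illustration of the breakdown that the four returned DataFrames suggest (false negatives, false positives, true positives, true negatives), the sketch below counts those cells from boolean labels and predictions and derives precision, recall, and F1. The function names and the toy values are illustrative assumptions, not Shield results or the helpers' actual logic.

```python
def confusion_counts(labels, predictions):
    """Count (tp, fp, tn, fn) for paired boolean labels and predictions."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y and p)          # hallucination, flagged
    fp = sum(1 for y, p in pairs if not y and p)      # clean, flagged
    tn = sum(1 for y, p in pairs if not y and not p)  # clean, not flagged
    fn = sum(1 for y, p in pairs if y and not p)      # hallucination, missed
    return tp, fp, tn, fn


def precision_recall_f1(tp, fp, fn):
    """Derive precision, recall and F1, returning 0.0 where undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy values for illustration only; these are not results from Shield.
labels = [True, False, True, False]
predictions = [True, False, False, True]

tp, fp, tn, fn = confusion_counts(labels, predictions)
print(tp, fp, tn, fn)                   # 1 1 1 1
print(precision_recall_f1(tp, fp, fn))  # (0.5, 0.5, 0.5)
```

With the DataFrame above, the `label` and `shield_result` columns would play the roles of `labels` and `predictions`.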

---
### 3. Load benchmark datasets from https://shield.docs.arthur.ai/docs/hallucination#benchmarks

**DISCLAIMER**: This section is for demonstration and guidance purposes only. Because the sampling techniques used here may not be optimal, the results do not reflect the performance of the model behind the Shield score.

**DISCLAIMER 2**: This dataset contains German text, and Arthur prompt injection is trained only on English. We do our best to filter the German out, but some may slip through.
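
Since the disclaimer above notes that sampling may not be optimal, here is one simple way to draw a reproducible, class-balanced sample before sending examples to Shield. This is a sketch under stated assumptions: `balanced_sample` is a hypothetical helper, the rows are toy dictionaries, and nothing here reflects how the documented benchmarks were actually sampled.

```python
import random


def balanced_sample(rows, label_key, n_per_class, seed=0):
    """Draw up to n_per_class rows for each label value, reproducibly."""
    rng = random.Random(seed)  # local generator, so global state is untouched

    # Group rows by their label value
    by_label = {}
    for row in rows:
        by_label.setdefault(row[label_key], []).append(row)

    # Sample each group without replacement, capped at the group size
    sample = []
    for label in sorted(by_label, key=str):
        group = by_label[label]
        sample.extend(rng.sample(group, min(n_per_class, len(group))))

    rng.shuffle(sample)  # avoid sending all rows of one class first
    return sample


# Toy rows for illustration only: 4 positive and 8 negative examples.
rows = [{"text": f"example {i}", "label": i % 3 == 0} for i in range(12)]
sample = balanced_sample(rows, "label", n_per_class=3, seed=42)
print(sum(r["label"] for r in sample), len(sample))  # 3 6
```

Fixing the seed keeps the evaluated subset stable between runs, which makes results from different runs comparable.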

#### 3.1 Run the examples against a pre-configured Shield task from Step 1

#### 3.2 Analyze Results