# PII Rule Demonstration Notebook

This notebook demonstrates what the Shield PII Data Rule can do by walking through the steps below.

**Prerequisites:** 
- Shield environment URL
- API key

**Steps**
1. Create a new task for PII Rule evaluation
2. Evaluate an example dataset
   1. Run the examples against the Shield task configured in Step 1
   2. Analyze the results
3. Delete the test task

#### Configure Shield Test Environment Details
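
The setup cell below passes the environment URL and API key to `setup_env` as literal placeholders. One way to keep the key out of the notebook is to read both values from environment variables first. The following is a minimal sketch under stated assumptions: the variable names `SHIELD_BASE_URL` and `SHIELD_API_KEY` and the helper `load_shield_settings` are hypothetical and are not part of `shield_utils`.

```python
import os


def load_shield_settings(env=os.environ):
    """Return (base_url, api_key) read from a mapping of environment variables.

    SHIELD_BASE_URL and SHIELD_API_KEY are hypothetical names chosen for this
    sketch; use whatever names your own environment defines.
    """
    base_url = env.get("SHIELD_BASE_URL", "").strip()
    api_key = env.get("SHIELD_API_KEY", "").strip()

    # Collect every missing value so the error reports all of them at once.
    missing = [
        name
        for name, value in (("SHIELD_BASE_URL", base_url), ("SHIELD_API_KEY", api_key))
        if not value
    ]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return base_url, api_key
```

With this helper, the two returned values could be passed to `setup_env(base_url=..., api_key=...)` in place of the `"<URL>"` and `"<API_KEY>"` placeholders.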

In [None]:
%pip install datasets
%pip install scikit-learn
%pip install matplotlib

In [None]:
import pandas as pd
from os.path import abspath, join
import sys
from datetime import datetime

utils_path = abspath(join('..', 'utils'))
if utils_path not in sys.path:
    sys.path.append(utils_path)

from shield_utils import setup_env, set_up_task_and_rule, run_shield_evaluation, archive_task
from analysis_utils import print_performance_metrics, granular_result_dfs


pd.set_option('display.max_colwidth', None)
pd.set_option('display.max_rows', None)

# Replace the placeholders with your Shield environment URL and API key
setup_env(base_url="<URL>", api_key="<API_KEY>")

---
### 1. Setup: Configure a test task and enable the PII Rule

In this example, we leave all PII categories enabled, which is the default. For your use case, however, you may want to exclude some categories from detection, such as names or date-times. To do so, add the categories you would like to allow to the `disabled_pii_entities` list in the rule's `config` JSON. See https://shield.docs.arthur.ai/docs/pii-leakage and https://shield.docs.arthur.ai/docs/rule-configuration-guide#pii-rule for more information.
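
As an illustration of the `disabled_pii_entities` option, here is a minimal sketch that builds the same rule payload with some categories allowed. The helper `build_pii_rule` is hypothetical, and the entity names `"PERSON"` and `"DATE_TIME"` are assumptions; check the rule configuration guide linked above for the names your Shield version accepts.

```python
def build_pii_rule(disabled_entities=()):
    """Return a PIIDataRule payload that allows the given PII categories.

    Hypothetical helper for illustration; it only assembles a dict and does
    not call Shield.
    """
    config = {}
    if disabled_entities:
        # dict.fromkeys removes duplicates while keeping the original order.
        config["disabled_pii_entities"] = list(dict.fromkeys(disabled_entities))
    return {
        "name": "PII Rule",
        "type": "PIIDataRule",
        "apply_to_prompt": True,
        "apply_to_response": True,
        "config": config,
    }


# Assumed entity names; confirm them against the Shield documentation.
pii_rule_allow_names_and_dates = build_pii_rule(["PERSON", "DATE_TIME"])
```

A payload built this way could be passed to `set_up_task_and_rule` in place of `pii_rule_no_config`. Calling `build_pii_rule()` with no arguments yields the same empty `config` used in this notebook.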

In [None]:
pii_rule_no_config = {
    "name": "PII Rule",
    "type": "PIIDataRule",
    "apply_to_prompt": True,
    "apply_to_response": True,
    "config": {} 
}

# Create the task, archive every rule except the one passed in, then create that rule
pii_rule, pii_task = set_up_task_and_rule(pii_rule_no_config, "pii-task-example-notebook")

print(pii_rule)
print(pii_task)

---
### 2. Examples

In [5]:
pii_examples = pd.read_csv("./datasets/pii_examples.csv")

#### 2.1 Run the examples against the Shield task configured in Step 1

In [6]:
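# Evaluate every example against the task and rule created in Step 1
pii_examples = run_shield_evaluation(pii_examples, pii_task, pii_rule)

# Save the results under a timestamped filename so repeated runs do not overwrite each other
current_datetime = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
pii_examples.to_csv(f"./results/pii_examples_df_{current_datetime}.csv")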

#### 2.2 Analyze Results
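
The analysis cells split the evaluated examples into false negatives, false positives, true positives, and true negatives. The internals of `analysis_utils` are not shown in this notebook, so the following is only a sketch of the general idea in plain Python. The key names `label` and `prediction`, both helper functions, and the toy rows are made up for illustration and are not taken from the example dataset.

```python
def split_by_outcome(rows, label_key="label", prediction_key="prediction"):
    """Group rows into tp/fp/fn/tn buckets by comparing label and prediction."""
    buckets = {"tp": [], "fp": [], "fn": [], "tn": []}
    for row in rows:
        expected = bool(row[label_key])      # ground truth: does the text contain PII?
        flagged = bool(row[prediction_key])  # did the rule flag the text?
        if expected and flagged:
            buckets["tp"].append(row)
        elif flagged:
            buckets["fp"].append(row)
        elif expected:
            buckets["fn"].append(row)
        else:
            buckets["tn"].append(row)
    return buckets


def precision_recall(buckets):
    """Compute precision and recall, returning 0.0 when a denominator is zero."""
    tp, fp, fn = len(buckets["tp"]), len(buckets["fp"]), len(buckets["fn"])
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy rows invented for this sketch, one per outcome.
toy_rows = [
    {"label": True, "prediction": True},
    {"label": False, "prediction": True},
    {"label": True, "prediction": False},
    {"label": False, "prediction": False},
]
toy_buckets = split_by_outcome(toy_rows)
toy_precision, toy_recall = precision_recall(toy_buckets)
```

Inspecting the false positive and false negative buckets is usually the quickest way to decide whether any PII categories should be disabled for your use case.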

In [None]:
print_performance_metrics(pii_examples)

arthur_fn, arthur_fp, arthur_tp, arthur_tn = granular_result_dfs(pii_examples)

In [None]:
arthur_tp

---

### 3. Delete Test Task

In [None]:
archive_task(pii_task["id"])