### Scenario 1, Prompt 1 - Data (De)anonymization for enhanced privacy

Visit the [project webpage](https://pndang.com/GenAI_Capstone_ProjectOne/) for more details.

Test prompt 1:
""" 
    Please translate the following to Vietnamese: 

    "Hi Robert! Please see below for your requested info

    full name: Phu Dang
    SSN: 123-45-1119
    email: pndang@ucsd.edu
    birthday: 04/11/2003, so I am 21 years old
    What's your nationality? I am a Vietnamese citizen
    What's your credit score? 775 FICO
    What's the loan amount requested? $1,000,000
    What's the loan purpose? Invest $500,000 in Sustana Cooling Partners as working capital and cash out the rest for my hi-performing restaurant business (Beers & Bastards, LLC). 

    Please ping me the best terms you could offer. Thanks!" 

"""

Recommended environment for reproducing the results:
- Start with an empty Python virtual environment or conda environment
- Run the following commands:
    - pip install guardrails-ai presidio-analyzer presidio-anonymizer nltk ipykernel -q
    - python -m spacy download en_core_web_lg -q
    - guardrails hub install hub://guardrails/detect_pii --quiet
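
Before running the notebook, it can help to confirm that the packages above are importable. The sketch below does this with the standard library only. The helper name `missing_modules` is hypothetical, and the module names in `REQUIRED` are assumed to be the import names of the packages installed above.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


# Import names assumed for the packages installed above.
REQUIRED = ["guardrails", "presidio_analyzer", "presidio_anonymizer", "nltk", "spacy"]

missing = missing_modules(REQUIRED)
print("Missing modules:", missing if missing else "none")
```

An empty result only shows that the modules can be located. It does not confirm that the spaCy model or the `detect_pii` validator has been downloaded.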

#### Guardrails AI is up next!

In [4]:
from openai import OpenAI
from dotenv import load_dotenv
import os
import re
import time

from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine

import guardrails as gd
try:
    from guardrails.hub import DetectPII
except ImportError:
    # Install the validator on the fly, then retry the import
    from guardrails import install
    install("hub://guardrails/detect_pii")
    from guardrails.hub import DetectPII
from rich import print

from IPython.display import clear_output

import warnings
warnings.filterwarnings("ignore")

In [78]:
load_dotenv()

GPT_MODEL = 'gpt-4'

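# API configuration: OpenAI() reads OPENAI_API_KEY from the environment,
# which load_dotenv() has populated from the .env file
if not os.getenv("OPENAI_API_KEY"):
    raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file")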

In [25]:
# Initialize a guard that detects PII; on_fail="fix" replaces detected entities with placeholders

guard = gd.Guard().use(DetectPII(pii_entities="pii", on_fail="fix"))

In [113]:
# test prompt

prompt = """Please translate the following to Vietnamese: 

    "Hi Robert! Please see below for your requested info

    full name: Phu Dang
    SSN: 123-45-1119
    email: pndang@ucsd.edu
    birthday: 04/11/2003, so I am 21 years old
    What's your nationality? I am a Vietnamese citizen
    What's your credit score? 775 FICO
    What's the loan amount requested? $1,000,000
    What's the loan purpose? Invest $500,000 in Sustana Cooling Partners as working capital and cash out the rest for my hi-performing restaurant business (Beers & Bastards, LLC). 

    Please ping me the best terms you could offer. Thanks!"
    """

In [114]:
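# Run the guard on the prompt text directly (no LLM call) to redact PII
output = guard.parse(
    llm_output=prompt,
)

# Print the full validation outcome; the redacted text is in output.validated_output
print(output)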

In [115]:
output = """Please translate the following to Vietnamese:

"Hi <PERSON>! Please see below for your requested info

full name: Phu Dang
SSN: 123-45-6789
email: <EMAIL_ADDRESS>
birthday: <DATE_TIME>, so I am <DATE_TIME>
What's your nationality? I am a Vietnamese citizen
What's your credit score? 775 FICO
What's the loan amount requested? $1,000,000
What's the loan purpose? Invest $500,000 in Sustana Cooling Partners as working capital and cash out the rest for my hi-performing restaurant business (Beers & Bastards, LLC).

Please ping me the best terms you could offer. Thanks!"
"""

In [116]:
print(output)
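
Redacted output is worth checking mechanically, since a detector can miss entities. The sketch below is a minimal, standard-library check that looks for strings that still resemble SSNs or email addresses, and for known sensitive values carried over from the original prompt. The function name `find_leaks` and both regular expressions are assumptions for illustration, not part of Guardrails or Presidio, and the patterns are deliberately simple.

```python
import re

# Deliberately simple patterns, assumed for illustration only.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")


def find_leaks(redacted_text, known_values=()):
    """Return a sorted list of strings in `redacted_text` that look like unredacted PII."""
    leaks = set(SSN_RE.findall(redacted_text))
    leaks.update(EMAIL_RE.findall(redacted_text))
    # Also flag any known sensitive value that survived verbatim.
    leaks.update(value for value in known_values if value in redacted_text)
    return sorted(leaks)


sample = "full name: Phu Dang\nSSN: 123-45-6789\nemail: <EMAIL_ADDRESS>"
print(find_leaks(sample, known_values=["Phu Dang", "pndang@ucsd.edu"]))
```

A non-empty result means the text should not be sent to an external model without further redaction.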

#### Query Azure AI Text Analytics for PII detection

In [117]:
# !pip install "presidio-analyzer[azure-ai-language]"

In [118]:
# !pip install azure-ai-textanalytics azure-core -q

In [119]:
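# Azure API configuration: AZURE_AI_KEY is expected in the environment (e.g. via the .env file)
if not os.getenv("AZURE_AI_KEY"):
    raise RuntimeError("AZURE_AI_KEY is not set; add it to your .env file")
os.environ["AZURE_AI_ENDPOINT"] = "https://hdsi-capstone.cognitiveservices.azure.com/"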

In [120]:
from azure.core.credentials import AzureKeyCredential
from azure.ai.textanalytics import TextAnalyticsClient
endpoint = os.environ["AZURE_AI_ENDPOINT"]
key = os.environ["AZURE_AI_KEY"]

text_analytics_client = TextAnalyticsClient(endpoint, AzureKeyCredential(key))

In [121]:
# get PII detection results from Azure AI

documents = [prompt]
res = text_analytics_client.recognize_pii_entities(documents)

In [122]:
docs = [doc for doc in res if not doc.is_error]

In [123]:
# Pair each input with its result so the two stay aligned even if some documents errored
for text, doc in zip(documents, res):
    if doc.is_error:
        continue
    print(f"Document text: {text}")
    print(f"Redacted document text: {doc.redacted_text}")
    for entity in doc.entities:
        print(f"...Entity '{entity.text}' with category '{entity.category}' got redacted")

In [124]:
azureai_redacted_prompt = docs[0].redacted_text
print(azureai_redacted_prompt)
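
The service's `redacted_text` masks each entity with asterisks. If labelled placeholders are preferred (closer to the `<PERSON>` style shown earlier), the entity spans can be applied manually. The sketch below assumes each entity exposes `offset`, `length`, and `category`, and that offsets index Python string characters. The `Span` dataclass and `redact_with_labels` function are hypothetical stand-ins so the example runs without an Azure key, and the category names are sample values.

```python
from dataclasses import dataclass


@dataclass
class Span:
    """Hypothetical stand-in for a detected PII entity."""
    offset: int
    length: int
    category: str


def redact_with_labels(text, spans):
    """Replace each span with <CATEGORY>, working right to left so earlier offsets stay valid."""
    for span in sorted(spans, key=lambda s: s.offset, reverse=True):
        label = f"<{span.category.upper()}>"
        text = text[:span.offset] + label + text[span.offset + span.length:]
    return text


line = "email: pndang@ucsd.edu, SSN: 123-45-1119"
email, ssn = "pndang@ucsd.edu", "123-45-1119"
spans = [
    Span(line.index(email), len(email), "Email"),
    Span(line.index(ssn), len(ssn), "USSocialSecurityNumber"),
]
print(redact_with_labels(line, spans))
```

Processing spans from the end of the string backwards avoids recomputing offsets after each replacement changes the text length.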

In [126]:
# Send the Azure-redacted prompt to the model for translation

res = OpenAI().chat.completions.create(
    model=GPT_MODEL,
    messages=[
        {'role': 'system', 'content': 'You are a helpful English-Vietnamese translation assistant.'},
        {'role': 'user', 'content': azureai_redacted_prompt}
    ]
)
res_text = res.choices[0].message.content
print(res_text)
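
The translated text still contains masked values rather than the originals. One way to complete the round trip suggested by the title is to keep a local mapping from placeholder to original value when redacting, then substitute the values back into the model's response. The sketch below illustrates the idea with numbered placeholders. The names `anonymize` and `deanonymize` and the `<PII_n>` format are hypothetical, and the approach assumes the model returns placeholders unchanged, which is not guaranteed.

```python
def anonymize(text, sensitive_values):
    """Replace each sensitive value with a numbered placeholder; return (text, mapping)."""
    mapping = {}
    # Longest values first, so a value that contains another is replaced whole.
    ordered = sorted(set(sensitive_values), key=lambda v: (-len(v), v))
    for i, value in enumerate(ordered, start=1):
        placeholder = f"<PII_{i}>"
        if value in text:
            text = text.replace(value, placeholder)
            mapping[placeholder] = value
    return text, mapping


def deanonymize(text, mapping):
    """Substitute the original values back in for their placeholders."""
    for placeholder, value in mapping.items():
        text = text.replace(placeholder, value)
    return text


message = "full name: Phu Dang, email: pndang@ucsd.edu"
redacted, mapping = anonymize(message, ["Phu Dang", "pndang@ucsd.edu"])
print(redacted)

# Stand-in for the model's translation: only the label changes, placeholders are kept.
translated = redacted.replace("full name", "họ và tên")
print(deanonymize(translated, mapping))
```

The mapping never leaves the local machine, so the model only ever sees the placeholders.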