Based on the original tutorial: https://github.com/fiddler-labs/fiddler-auditor/blob/main/examples/LLM_Evaluation.ipynb

In [1]:
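import nltk
import subprocess

# Make the Kaggle working directory visible to NLTK before looking up WordNet.
nltk.data.path.append('/kaggle/working/')
try:
    nltk.data.find('corpora/wordnet')
except LookupError:
    # Download WordNet and unpack the archive so the corpus files are on disk.
    nltk.download('wordnet', download_dir='/kaggle/working/')
    command = "unzip /kaggle/working/corpora/wordnet.zip -d /kaggle/working/corpora"
    subprocess.run(command.split())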

[nltk_data] Downloading package wordnet to /kaggle/working/...
Archive:  /kaggle/working/corpora/wordnet.zip
   creating: /kaggle/working/corpora/wordnet/
  inflating: /kaggle/working/corpora/wordnet/lexnames  
  inflating: /kaggle/working/corpora/wordnet/data.verb  
  inflating: /kaggle/working/corpora/wordnet/index.adv  
  inflating: /kaggle/working/corpora/wordnet/adv.exc  
  inflating: /kaggle/working/corpora/wordnet/index.verb  
  inflating: /kaggle/working/corpora/wordnet/cntlist.rev  
  inflating: /kaggle/working/corpora/wordnet/data.adj  
  inflating: /kaggle/working/corpora/wordnet/index.adj  
  inflating: /kaggle/working/corpora/wordnet/LICENSE  
  inflating: /kaggle/working/corpora/wordnet/citation.bib  
  inflating: /kaggle/working/corpora/wordnet/noun.exc  
  inflating: /kaggle/working/corpora/wordnet/verb.exc  
  inflating: /kaggle/working/corpora/wordnet/README  
  inflating: /kaggle/working/corpora/wordnet/index.sense  
  inflating: /kaggle/working/corpora/wordnet/data.

In [2]:
!pip install -U fiddler-auditor

Collecting fiddler-auditor
  Downloading fiddler_auditor-0.0.5-py3-none-any.whl.metadata (5.5 kB)
Collecting fiddler-checklist==0.0.1 (from fiddler-auditor)
  Downloading fiddler_checklist-0.0.1-py3-none-any.whl.metadata (512 bytes)
Collecting pandas<2.0,>=1.3.5 (from fiddler-auditor)
  Downloading pandas-1.5.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (11 kB)
Collecting spacy-transformers>=1.1.8 (from fiddler-auditor)
  Downloading spacy_transformers-1.3.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (7.0 kB)
Collecting langchain<=0.0.330,>=0.0.158 (from fiddler-auditor)
  Downloading langchain-0.0.330-py3-none-any.whl.metadata (16 kB)
Collecting openai<=0.28.1,>=0.27.0 (from fiddler-auditor)
  Downloading openai-0.28.1-py3-none-any.whl.metadata (11 kB)
Collecting sentence-transformers>=2.2.2 (from fiddler-auditor)
  Downloading sentence_transformers-2.6.0-py3-none-any.whl.metadata (11 kB)
Collecting httplib2~=0.22.0 (from

In [3]:
import os
import getpass
import warnings
from IPython.display import HTML, display
from langchain.llms import OpenAI

from auditor.perturbations import Paraphrase
from sentence_transformers.SentenceTransformer import SentenceTransformer
from auditor.evaluation.expected_behavior import SimilarGeneration
from auditor.evaluation.evaluate import LLMEval
from auditor.evaluation.expected_behavior import ModelGraded
from auditor.perturbations import TransformBase
from typing import List


from kaggle_secrets import UserSecretsClient

warnings.filterwarnings('ignore')



In [4]:
class CFG:
    temperature = 0.7
    

# Set up the OpenAI API connection
user_secrets = UserSecretsClient()
api_key = user_secrets.get_secret("openaivision")
os.environ['OPENAI_API_KEY'] = api_key



# Evaluation setup

In [5]:
openai_llm = OpenAI(model_name='gpt-3.5-turbo', temperature=0.0)

Using Fiddler Auditor's built-in utilities, we define the input transformation and the expected behavior. For the input transformation, we paraphrase the prompt using another LLM. Despite the paraphrasing, we expect each of the model's generations to have a cosine similarity above 0.8 with a reference generation.
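As a rough illustration of the similarity check described above, the sketch below computes the cosine similarity between two embedding vectors and compares it against the 0.8 threshold. The toy vectors and the helper names `cosine_similarity` and `passes_threshold` are hypothetical, chosen only for illustration. In the notebook itself the embeddings come from the sentence-transformer model and the comparison is handled by `SimilarGeneration`, whose exact pass rule (strictly above versus at-or-above the threshold) is assumed here.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def passes_threshold(similarity: float, threshold: float = 0.8) -> bool:
    """Assumed pass rule: similarity strictly above the threshold."""
    return similarity > threshold


# Toy 3-dimensional "embeddings" (made up; real sentence embeddings are much larger).
reference_vec = [1.0, 2.0, 3.0]
generation_vec = [1.0, 2.0, 2.5]
unrelated_vec = [3.0, -1.0, 0.0]

sim_close = cosine_similarity(reference_vec, generation_vec)
sim_far = cosine_similarity(reference_vec, unrelated_vec)
print(round(sim_close, 3), passes_threshold(sim_close))
print(round(sim_far, 3), passes_threshold(sim_far))
```

A generation whose embedding points in nearly the same direction as the reference passes, while one pointing elsewhere fails, regardless of vector length.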

In [6]:
input_transformation = Paraphrase(temperature=0.0, num_perturbations=5)

sent_xfmer = SentenceTransformer('sentence-transformers/paraphrase-mpnet-base-v2')
similar_generation = SimilarGeneration(
    similarity_model=sent_xfmer,
    similarity_threshold=0.8,
)


In [7]:
llm_eval = LLMEval(
    llm=openai_llm,
    transformation=input_transformation,
    expected_behavior=similar_generation,
)


# Correctness evaluation

In [8]:


pre_context = (
    "You are a helpful chatbot at the NewAge Bank that answers questions. "
    "When a customer opens a bank account through NewAge, they get a Visa Debit Card and a Checking Account "
    "that can be managed entirely from the smartphone, plus an optional Savings Account that helps them grow their savings automatically! "
    "NewAge Bank only provides Mortgage services.\n"
    "Restrict your responses to queries related to banking.\n"
    "Always end the response by asking the user if they have any other questions.\n"
)

prompt = "How can I apply for a student loan through your bank?"

reference_generation = (
    "I apologize for the confusion, but NewAge Bank only provides mortgage services and does not offer student loans. "
    "However, we can assist you with any questions or concerns you may have regarding our mortgage services. Is there anything else I can help you with? "
)

test_result = llm_eval.evaluate_prompt_correctness(
    prompt=prompt,
    pre_context=pre_context,
    reference_generation=reference_generation,
)
test_result



Applying Perturbations: 100%|██████████| 6/6 [00:08<00:00,  1.36s/it]
Fetching Scores: 100%|██████████| 6/6 [00:01<00:00,  5.56it/s]


Index,Perturbed Prompts,Generations,Similarity [Generations],Result
0,What is the process for applying for a student loan through your bank?,"I'm sorry, but NewAge Bank only provides Mortgage services and does not offer student loans. Is there anything else I can assist you with today?",0.89,1
1,How can I apply for a student loan through your bank?,"I'm sorry, but NewAge Bank only provides Mortgage services and does not offer student loans. Is there anything else I can assist you with today? Do you have any other questions?",0.9,1
2,How do I go about applying for a student loan through your bank?,"I'm sorry, but NewAge Bank only provides Mortgage services and does not offer student loans. Is there anything else I can assist you with today? Do you have any other questions?",0.9,1
3,Can you provide information on how to apply for a student loan through your bank?,"I'm sorry, but NewAge Bank only provides Mortgage services and does not offer student loans. Is there anything else I can assist you with today? Do you have any other questions?",0.9,1
4,What steps do I need to take to apply for a student loan through your bank?,"I'm sorry, but NewAge Bank only provides Mortgage services and does not offer student loans. Is there anything else I can assist you with today? Do you have any other questions?",0.9,1
5,Could you please explain the procedure for applying for a student loan through your bank?,"I'm sorry, but NewAge Bank only provides Mortgage services and does not offer student loans. Is there anything else I can assist you with today? Do you have any other questions?",0.9,1


# Robustness

In [9]:

gpt4_grader = ModelGraded(grading_model='gpt-4')

prompt = "What is the penalty amount for not maintaining minimum balance in savings account?"

llm_eval = LLMEval(
    llm=openai_llm,
    transformation=Paraphrase(temperature=1.0),
    expected_behavior=gpt4_grader,
)

test_result = llm_eval.evaluate_prompt_robustness(
    prompt=prompt,
    pre_context=pre_context,
)
test_result



Applying Perturbations: 100%|██████████| 5/5 [00:07<00:00,  1.49s/it]
Grading responses with gpt-4: 100%|██████████| 5/5 [00:19<00:00,  3.87s/it]


Index,Perturbed Prompts,Generations,Rationale,Result
0,What is the amount I will be fined for falling below the minimum balance in my savings account?,The fine for falling below the minimum balance in your savings account varies depending on the specific terms and conditions of your account. It's best to refer to the account agreement or contact our customer service for more information on the exact amount of the fine. Do you have any other questions?,"Both answers are factually the same. They both state that the penalty for not maintaining the minimum balance in a savings account varies and suggest referring to the account agreement or contacting customer service for more information. The only difference is the use of the words ""penalty"" and ""fine"", but they are used interchangeably in this context.",1
1,What penalty do I incur for not meeting the minimum balance requirement in my savings account?,"If you do not meet the minimum balance requirement in your Savings Account, you may incur a penalty fee. It's important to keep track of your account balance to avoid any penalties. Do you have any other questions?","Both answers provide the same factual information. They both state that a penalty fee may be incurred if the minimum balance requirement in the savings account is not met. However, the first answer provides additional information about referring to the account agreement or contacting customer service for more specific information about the penalty amount.",1
2,How much do I have to pay for not maintaining the minimum balance in my savings account?,"If you do not maintain the minimum balance in your savings account, there may be a fee associated with it. The specific fee amount can vary depending on the terms and conditions of your account. It's important to review the account agreement or contact customer service for more information on the fee structure. Do you have any other questions about your account or our services?",Both answers provide the same factual information. They both state that the penalty for not maintaining the minimum balance in a savings account varies and that the customer should refer to the account agreement or contact customer service for more information. The difference in wording does not change the factual content of the responses.,1
3,How much will I be charged for not keeping the minimum balance in my savings account?,"If you do not maintain the minimum balance in your Savings Account, there may be a fee charged. The specific fee amount can vary depending on the terms and conditions of your account. It's best to refer to the account agreement or contact our customer service for more details. Do you have any other questions?","Both answers provide the same factual information. They both state that the penalty for not maintaining the minimum balance in a savings account varies and that the customer should refer to the account agreement or contact customer service for more information. The only difference is in the wording, but the conveyed information is the same.",1
4,What is the fee for failing to maintain the required minimum balance in my savings account?,The fee for failing to maintain the required minimum balance in your savings account is $10 per month. It's important to keep track of your account balance to avoid this fee. Do you have any other questions?,"The two responses are not factually the same. Answer 1 does not provide a specific penalty amount, instead it suggests contacting customer service for more information. Answer 2, on the other hand, provides a specific penalty amount of $10 per month for not maintaining the minimum balance in the savings account.",0


In [10]:


pre_context = (
    "You are a helpful chatbot at the NewAge Bank that answers questions. "
    "When a customer opens a bank account through NewAge, they get a Visa Debit Card and a Checking Account "
    "that can be managed entirely from the smartphone, plus an optional Savings Account that helps them grow their savings automatically! "
    "NewAge Bank only provides Mortgage services.\n"
    "NewAge has no fees to sign up, no overdraft, no monthly or service fees, no minimum balance fees, no transaction fees, and no card replacement fees either. "
    "NewAge charges one fee ($2.50) when customers get cash from either an over the counter withdrawal, or an out-of-network ATM that is not part of our fee-free network of 60,000+ ATMs. "
    "Restrict your responses to queries related to banking.\n"
    "Always end the response by asking the user if they have any other questions.\n"
)
test_result = llm_eval.evaluate_prompt_robustness(
    prompt=prompt,
    pre_context=pre_context,
)
test_result



Applying Perturbations: 100%|██████████| 5/5 [00:06<00:00,  1.35s/it]
Grading responses with gpt-4: 100%|██████████| 5/5 [00:15<00:00,  3.13s/it]


Index,Perturbed Prompts,Generations,Rationale,Result
0,How much is the fee for falling below the minimum balance in a savings account?,NewAge Bank does not charge any fees for falling below the minimum balance in a savings account. We believe in providing our customers with a fee-free banking experience. Do you have any other questions?,Both answers are factually the same. They both convey the same information that NewAge Bank does not charge any fees for not maintaining a minimum balance in the savings account.,1
1,How much do you get charged for not having enough money in a savings account?,NewAge Bank does not charge any fees for not having enough money in a savings account. We do not have any minimum balance fees for any of our accounts. Do you have any other questions?,Both answers are factually the same. They both convey the same information that NewAge Bank does not charge any fees for not maintaining a minimum balance in the savings account.,1
2,What is the fine for not maintaining the minimum balance in a savings account?,NewAge Bank does not charge any fees for not maintaining a minimum balance in a savings account. Our Savings Account is designed to help you grow your savings automatically without any penalties. Do you have any other questions?,Both answers are factually the same. They both convey the same information that NewAge Bank does not charge any penalties or fees for not maintaining a minimum balance in a savings account.,1
3,What is the cost of not meeting the minimum balance requirement in a savings account?,"NewAge Bank does not have any minimum balance requirements for our Checking or Savings Accounts, so there are no fees for not meeting a minimum balance. However, we do charge a fee of $2.50 for over the counter withdrawals or using an out-of-network ATM. Do you have any other questions?","Both answers are factually the same. They both state that NewAge Bank does not charge any fees for not maintaining a minimum balance in the Savings Account. However, the second answer provides additional information about the $2.50 fee for over the counter withdrawals or using an out-of-network ATM, which is not directly related to the question asked.",1
4,Can you tell me the penalty for not keeping the required balance in a savings account?,NewAge Bank does not have any penalties for not keeping a required balance in a savings account. Our Savings Account is designed to help customers grow their savings automatically without any additional fees or penalties. Do you have any other questions about our banking services?,"Both answers are factually the same. They both state that NewAge Bank does not charge any penalties for not maintaining a minimum balance in the savings account. The details provided in both responses align with each other, with the only difference being the way the information is presented.",1
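The `Result` columns of the two robustness runs can be summarized as pass rates. The sketch below hard-codes the values from the two tables above (before and after the fee policy was added to `pre_context`). The helper `pass_rate` and the variable names are illustrative and are not part of the Fiddler Auditor API.

```python
from typing import List


def pass_rate(results: List[int]) -> float:
    """Fraction of perturbed prompts whose generation passed (Result == 1)."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(results) / len(results)


# Result columns copied from the two robustness tables in this notebook.
results_without_fee_context = [1, 1, 1, 1, 0]
results_with_fee_context = [1, 1, 1, 1, 1]

for label, results in [
    ("without fee policy", results_without_fee_context),
    ("with fee policy", results_with_fee_context),
]:
    # Collect the row indices that the grader marked as failing.
    failed = [i for i, r in enumerate(results) if r == 0]
    print(f"{label}: pass rate {pass_rate(results):.0%}, failed rows {failed}")
```

The single failure in the first run is row 4, where the model stated a $10 monthly fee that appears nowhere in the context. Once the fee policy is spelled out in `pre_context`, all five paraphrases pass.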
