
<h1><center><font size=10> Generative AI for NLP Program</center></font></h1>
<h1><center> Project </center></h1>

# **GA-NLP Project: Financial Product Complaint Classification and Summarization**

## **Business Context**

### **Description**
*In the modern financial industry, customer complaints play a crucial role in identifying areas where financial institutions can improve their services. Effectively categorizing these complaints into specific product categories, such as credit reports, student loans, or money transfers, is essential for addressing customer concerns promptly by routing the tickets to relevant personnel. Leveraging Generative AI for text classification can help financial institutions better understand customer grievances and respond more efficiently. Apart from this, a summary of the customer complaint helps the support personnel quickly grasp the gist of the grievance*

### **Objective**
*The primary goal of this project is to utilize Generative AI techniques to improve the classification and summarization of customer complaints in the financial sector.
Specifically, the project will focus on:*

1. **Text-to-Label Classification:** *Implementing Zero-shot and Few-shot prompting methods to accurately classify customer complaints into relevant product categories.*
2. **Text-to-Text Summarization:** *Using Zero-shot prompting to generate concise summaries of customer complaints, enabling more personalized and effective responses.*

### **Conclusion**
*Upon completing this project, you will have the capability to develop solutions for LLM-based text classification and summarization. These tools will enable financial institutions to automate the complaint handling process, leading to faster, more accurate responses to customer issues, improved customer satisfaction, and enhanced compliance with industry regulations. This project will also provide you with valuable skills and experience that can be applied to a range of real-world business challenges.*


# **Section 1 : Setting Up for Prompt Engineering with Mistral Model**

### **Install & Importing neccessary libraries**

In [1]:
!apt-get update
!apt-get install -y ninja-build cmake
!pip install ipywidgets --upgrade

Reading package lists... Done
E: Could not open lock file /var/lib/apt/lists/lock - open (13: Permission denied)
E: Unable to lock directory /var/lib/apt/lists/
W: Problem unlinking the file /var/cache/apt/pkgcache.bin - RemoveCaches (13: Permission denied)
W: Problem unlinking the file /var/cache/apt/srcpkgcache.bin - RemoveCaches (13: Permission denied)
E: Could not open lock file /var/lib/dpkg/lock-frontend - open (13: Permission denied)
E: Unable to acquire the dpkg frontend lock (/var/lib/dpkg/lock-frontend), are you root?


Installing collected packages: widgetsnbextension, jupyterlab-widgets, ipywidgets
Successfully installed ipywidgets-8.1.5 jupyterlab-widgets-3.0.13 widgetsnbextension-4.0.13
[0m

In [2]:
import torch

In [3]:
# This part of code will skip all the un-necessary warnings which can occur during the execution of this project.
import warnings
warnings.filterwarnings("ignore", category=Warning)

In [4]:
# Installation for GPU llama-cpp-python==0.2.69
!CMAKE_ARGS="-DLLAMA_CUDA=on" pip install llama-cpp-python==0.2.69
# For downloading the models from HF Hub
!pip install huggingface_hub



In [5]:
!pip install evaluate
!pip install bert-score



Collecting tokenizers<0.22,>=0.21
  Downloading tokenizers-0.21.0-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.0 MB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m3.0/3.0 MB[0m [31m131.7 MB/s[0m eta [36m0:00:00[0m
[?25hCollecting regex!=2019.12.17
  Downloading regex-2024.11.6-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (781 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m781.7/781.7 KB[0m [31m111.4 MB/s[0m eta [36m0:00:00[0m
Collecting safetensors>=0.4.1
  Downloading safetensors-0.5.2-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (461 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m462.0/462.0 KB[0m [31m82.8 MB/s[0m eta [36m0:00:00[0m
Installing collected packages: safetensors, regex, tokenizers, transformers, bert-score
Successfully installed bert-score-0.3.13 regex-2024.11.6 safetensors-0.5.2 tokenizers-0.21.0 transformers-4.48.3
[0m

In [6]:
!pip freeze > requirement.txt

In [7]:
# Basic Imports for Libraries
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

import pandas as pd
import numpy as np
from tqdm import tqdm
import json
import re

import torch
import evaluate

# from google.colab import drive
import locale

### **Question 1: Importing Libaries and Mistral Model (3 Marks)**

- For the Mistral Model name or path and model basename, refer to the **Week 3 Additional Content: Prompt Engineering Fundamentals**
- Code Notebook: Self-Consistency and Tree-of-Thought Prompting with Llama 2 and Mistral.





https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GGUF/blob/main/mistral-7b-instruct-v0.2.Q5_K_M.gguf

In [8]:
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

In [9]:
model_name_or_path = "TheBloke/Mistral-7B-Instruct-v0.2-GGUF" 
model_basename = "mistral-7b-instruct-v0.2.Q5_K_M.gguf"

In [10]:
model_path = hf_hub_download(
    repo_id=model_name_or_path,
    filename=model_basename
    )

In [11]:
lcpp_llm = Llama(
        model_path=model_path,
        n_threads=2,  # CPU cores
        n_batch=512,  # Should be between 1 and n_ctx, consider the amount of VRAM in your GPU.
        n_gpu_layers=43,  # Change this value based on your model and your GPU VRAM pool.
        n_ctx=4096,  # Context window
    )

llama_model_loader: loaded meta data with 24 key-value pairs and 291 tensors from /home/andy/.cache/huggingface/hub/models--TheBloke--Mistral-7B-Instruct-v0.2-GGUF/snapshots/3a6fbf4a41a1d52e415a4958cde6856d34b2db93/mistral-7b-instruct-v0.2.Q5_K_M.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
llama_model_loader: - kv   1:                               general.name str              = mistralai_mistral-7b-instruct-v0.2
llama_model_loader: - kv   2:                       llama.context_length u32              = 32768
llama_model_loader: - kv   3:                     llama.embedding_length u32              = 4096
llama_model_loader: - kv   4:                          llama.block_count u32              = 32
llama_model_loader: - kv   5:                  llama.feed_forward_length u32              = 14336
llama_mode

# **Section 2: Text to Label generation**

### **Question 2: Zero-Shot Prompting for Text Classification (5 Marks)**

##### **Q2.1: Define the Prompt Template, System Message, generate_prompt** **(2 Marks)**

- Define a **system message** as a string and assign it to the variable system_message to generate product class.
- Create a **zero shot prompt template** that incorporates the system message and user input.
- Define **generate_prompt** function that takes both the system_message and user_input as arguments and formats them into a prompt template


Write a Python function called **generate_mistral_response** that takes a single parameter, narrative, which represents the user's complain. Inside the function, you should perform the following tasks:


- **Combine the system_message and narrative to create a prompt string using generate_prompt function.**

*Generate a response from the Mistral model using the lcpp_llm instance with the following parameters:*

- prompt should be the combined prompt string.
- max_tokens should be set to 1200.
- temperature should be set to 0.
- top_p should be set to 0.95.
- repeat_penalty should be set to 1.2.
- top_k should be set to 50.
- stop should be set as a list containing '/s'.
- echo should be set to False.
Extract and return the response text from the generated response.

Don't forget to provide a value for the system_message variable before using it in the function.

In [12]:
system_message = "You are a helpful assistant that classifies user complaints into one of the following product categories: credit_card, retail_banking, credit_reporting, mortgages_and_loans, or debt_collection. Respond only with the product category."


In [13]:
zero_shot_prompt_template = """[System Message]\n{system_message}\n\n[User Complaint]\n{user_input}\n\n[Response]"""


In [14]:
def generate_prompt(system_message,user_input):
    prompt = zero_shot_prompt_template.format(system_message=system_message, user_input=user_input)
    return prompt

In [15]:
def generate_mistral_response(input_text):

    # Combine user_prompt and system_message to create the prompt
    prompt = generate_prompt(system_message,input_text)

    # Define the Llama model along with its parameters for generating a response
    response = lcpp_llm(
        prompt=prompt,
        max_tokens=1200,
        temperature=0,
        top_p=0.95,
        repeat_penalty=1.2,
        top_k=50,
        stop=["/s"],
        echo=False
    )
    # Extract and return the response text
    response_text = response["choices"][0]["text"]
    print(response_text)
    return response_text

In [16]:
# Load a CSV File containing Dataset of 500 products, narrative and summary (summary of narrative)
data = pd.read_csv("Complaints_classification.csv")


In [17]:
# Randomly select 30 rows
new_data = data.sample(n=30, random_state=40)

In [18]:
new_data

Unnamed: 0,product,narrative,summary
167,retail_banking,fraudulent charge totaling made capital one ch...,A fraudulent charge was made on the individual...
169,credit_reporting,block except otherwise provided section consum...,The text outlines various stipulations regardi...
461,credit_card,usaa master plan collect cancellation debt usa...,The input appears to be a complaint about USAA...
253,credit_reporting,block except otherwise provided section consum...,The text pertains to the stipulations and oper...
42,credit_reporting,open account acct opened balance account acct ...,The input is about various accounts being open...
369,credit_reporting,except otherwise provided section consumer rep...,This legal text details rules and regulations ...
26,mortgages_and_loans,trying close loan month provided income paymen...,The individual has been attempting to close a ...
377,credit_reporting,except otherwise provided section consumer rep...,This legislation requires that a consumer repo...
238,credit_reporting,block except otherwise provided section consum...,The section explains the rules and procedures ...
374,credit_reporting,except otherwise provided section consumer rep...,The input text seems to discuss a law provisio...


##### **Q2.2: Create a new column in the DataFrame called 'mistral_response' and populate it with responses generated by applying the 'generate_mistral_response' function to each 'narrative' in the DataFrame and prepare the mistral_response_cleaned column using extract_category function** **(1 Marks)**

In [19]:
# example - new_data['mistral_response'] = new_data['narrative'].apply(lambda x:______ )
#new_data['mistral_response'] = new_data['narrative'].apply(lambda x: "_____")

new_data['mistral_response'] = new_data['narrative'].apply(lambda x: generate_mistral_response(x))



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.30 ms /     6 runs   (    0.22 ms per token,  4629.63 tokens per second)
llama_print_timings: prompt eval time =     408.95 ms /   284 tokens (    1.44 ms per token,   694.45 tokens per second)
llama_print_timings:        eval time =     156.06 ms /     5 runs   (   31.21 ms per token,    32.04 tokens per second)
llama_print_timings:       total time =     576.36 ms /   289 tokens
Llama.generate: prefix-match hit



credit_card



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.37 ms /     7 runs   (    0.20 ms per token,  5124.45 tokens per second)
llama_print_timings: prompt eval time =     431.81 ms /   505 tokens (    0.86 ms per token,  1169.50 tokens per second)
llama_print_timings:        eval time =     187.49 ms /     6 runs   (   31.25 ms per token,    32.00 tokens per second)
llama_print_timings:       total time =     632.48 ms /   511 tokens
Llama.generate: prefix-match hit



credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.22 ms /     6 runs   (    0.20 ms per token,  4909.98 tokens per second)
llama_print_timings: prompt eval time =     221.97 ms /   182 tokens (    1.22 ms per token,   819.95 tokens per second)
llama_print_timings:        eval time =     160.22 ms /     5 runs   (   32.04 ms per token,    31.21 tokens per second)
llama_print_timings:       total time =     392.25 ms /   187 tokens
Llama.generate: prefix-match hit



debt_collection



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.35 ms /     7 runs   (    0.19 ms per token,  5185.19 tokens per second)
llama_print_timings: prompt eval time =     429.15 ms /   505 tokens (    0.85 ms per token,  1176.73 tokens per second)
llama_print_timings:        eval time =     186.24 ms /     6 runs   (   31.04 ms per token,    32.22 tokens per second)
llama_print_timings:       total time =     626.45 ms /   511 tokens
Llama.generate: prefix-match hit



credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.23 ms /     7 runs   (    0.18 ms per token,  5714.29 tokens per second)
llama_print_timings: prompt eval time =     187.41 ms /    74 tokens (    2.53 ms per token,   394.85 tokens per second)
llama_print_timings:        eval time =     188.47 ms /     6 runs   (   31.41 ms per token,    31.84 tokens per second)
llama_print_timings:       total time =     387.44 ms /    80 tokens
Llama.generate: prefix-match hit



retail_banking



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.27 ms /     7 runs   (    0.18 ms per token,  5520.50 tokens per second)
llama_print_timings: prompt eval time =     377.24 ms /   420 tokens (    0.90 ms per token,  1113.35 tokens per second)
llama_print_timings:        eval time =     185.79 ms /     6 runs   (   30.96 ms per token,    32.30 tokens per second)
llama_print_timings:       total time =     575.62 ms /   426 tokens
Llama.generate: prefix-match hit



credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       2.08 ms /    11 runs   (    0.19 ms per token,  5291.01 tokens per second)
llama_print_timings: prompt eval time =     190.05 ms /    87 tokens (    2.18 ms per token,   457.78 tokens per second)
llama_print_timings:        eval time =     309.85 ms /    10 runs   (   30.98 ms per token,    32.27 tokens per second)
llama_print_timings:       total time =     518.14 ms /    97 tokens
Llama.generate: prefix-match hit



mortgages_and_loans



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.34 ms /     7 runs   (    0.19 ms per token,  5227.78 tokens per second)
llama_print_timings: prompt eval time =     378.68 ms /   420 tokens (    0.90 ms per token,  1109.11 tokens per second)
llama_print_timings:        eval time =     186.19 ms /     6 runs   (   31.03 ms per token,    32.23 tokens per second)
llama_print_timings:       total time =     576.48 ms /   426 tokens
Llama.generate: prefix-match hit



credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.27 ms /     7 runs   (    0.18 ms per token,  5524.86 tokens per second)
llama_print_timings: prompt eval time =     383.01 ms /   432 tokens (    0.89 ms per token,  1127.92 tokens per second)
llama_print_timings:        eval time =     196.59 ms /     6 runs   (   32.77 ms per token,    30.52 tokens per second)
llama_print_timings:       total time =     591.10 ms /   438 tokens
Llama.generate: prefix-match hit



credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.34 ms /     7 runs   (    0.19 ms per token,  5223.88 tokens per second)
llama_print_timings: prompt eval time =     376.38 ms /   420 tokens (    0.90 ms per token,  1115.89 tokens per second)
llama_print_timings:        eval time =     184.64 ms /     6 runs   (   30.77 ms per token,    32.50 tokens per second)
llama_print_timings:       total time =     571.72 ms /   426 tokens
Llama.generate: prefix-match hit



credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.40 ms /     7 runs   (    0.20 ms per token,  5007.15 tokens per second)
llama_print_timings: prompt eval time =     210.70 ms /   147 tokens (    1.43 ms per token,   697.66 tokens per second)
llama_print_timings:        eval time =     188.04 ms /     6 runs   (   31.34 ms per token,    31.91 tokens per second)
llama_print_timings:       total time =     411.90 ms /   153 tokens
Llama.generate: prefix-match hit



credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       2.20 ms /     7 runs   (    0.31 ms per token,  3177.49 tokens per second)
llama_print_timings: prompt eval time =     430.03 ms /   511 tokens (    0.84 ms per token,  1188.28 tokens per second)
llama_print_timings:        eval time =     208.12 ms /     6 runs   (   34.69 ms per token,    28.83 tokens per second)
llama_print_timings:       total time =     655.49 ms /   517 tokens
Llama.generate: prefix-match hit



credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.28 ms /     7 runs   (    0.18 ms per token,  5464.48 tokens per second)
llama_print_timings: prompt eval time =     373.78 ms /   420 tokens (    0.89 ms per token,  1123.65 tokens per second)
llama_print_timings:        eval time =     188.54 ms /     6 runs   (   31.42 ms per token,    31.82 tokens per second)
llama_print_timings:       total time =     573.05 ms /   426 tokens
Llama.generate: prefix-match hit



credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       2.23 ms /    11 runs   (    0.20 ms per token,  4928.32 tokens per second)
llama_print_timings: prompt eval time =     250.83 ms /   232 tokens (    1.08 ms per token,   924.95 tokens per second)
llama_print_timings:        eval time =     312.35 ms /    10 runs   (   31.23 ms per token,    32.02 tokens per second)
llama_print_timings:       total time =     581.85 ms /   242 tokens
Llama.generate: prefix-match hit



mortgages_and_loans



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.40 ms /     7 runs   (    0.20 ms per token,  4996.43 tokens per second)
llama_print_timings: prompt eval time =     433.02 ms /   511 tokens (    0.85 ms per token,  1180.08 tokens per second)
llama_print_timings:        eval time =     189.64 ms /     6 runs   (   31.61 ms per token,    31.64 tokens per second)
llama_print_timings:       total time =     634.41 ms /   517 tokens
Llama.generate: prefix-match hit



credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.29 ms /     7 runs   (    0.18 ms per token,  5443.23 tokens per second)
llama_print_timings: prompt eval time =     385.84 ms /   438 tokens (    0.88 ms per token,  1135.19 tokens per second)
llama_print_timings:        eval time =     188.04 ms /     6 runs   (   31.34 ms per token,    31.91 tokens per second)
llama_print_timings:       total time =     586.39 ms /   444 tokens
Llama.generate: prefix-match hit



credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.29 ms /     7 runs   (    0.18 ms per token,  5443.23 tokens per second)
llama_print_timings: prompt eval time =     378.63 ms /   420 tokens (    0.90 ms per token,  1109.25 tokens per second)
llama_print_timings:        eval time =     188.00 ms /     6 runs   (   31.33 ms per token,    31.91 tokens per second)
llama_print_timings:       total time =     578.27 ms /   426 tokens
Llama.generate: prefix-match hit



credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.19 ms /     6 runs   (    0.20 ms per token,  5037.78 tokens per second)
llama_print_timings: prompt eval time =     238.22 ms /   193 tokens (    1.23 ms per token,   810.16 tokens per second)
llama_print_timings:        eval time =     161.98 ms /     5 runs   (   32.40 ms per token,    30.87 tokens per second)
llama_print_timings:       total time =     411.11 ms /   198 tokens
Llama.generate: prefix-match hit



credit_card



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.49 ms /     7 runs   (    0.21 ms per token,  4710.63 tokens per second)
llama_print_timings: prompt eval time =     207.50 ms /   139 tokens (    1.49 ms per token,   669.89 tokens per second)
llama_print_timings:        eval time =     186.33 ms /     6 runs   (   31.05 ms per token,    32.20 tokens per second)
llama_print_timings:       total time =     406.48 ms /   145 tokens
Llama.generate: prefix-match hit



credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       2.37 ms /    11 runs   (    0.22 ms per token,  4645.27 tokens per second)
llama_print_timings: prompt eval time =     310.24 ms /   358 tokens (    0.87 ms per token,  1153.93 tokens per second)
llama_print_timings:        eval time =     313.66 ms /    10 runs   (   31.37 ms per token,    31.88 tokens per second)
llama_print_timings:       total time =     643.35 ms /   368 tokens
Llama.generate: prefix-match hit



mortgages_and_loans



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.25 ms /     7 runs   (    0.18 ms per token,  5604.48 tokens per second)
llama_print_timings: prompt eval time =     185.58 ms /    72 tokens (    2.58 ms per token,   387.98 tokens per second)
llama_print_timings:        eval time =     185.55 ms /     6 runs   (   30.93 ms per token,    32.34 tokens per second)
llama_print_timings:       total time =     382.54 ms /    78 tokens
Llama.generate: prefix-match hit



credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.42 ms /     7 runs   (    0.20 ms per token,  4929.58 tokens per second)
llama_print_timings: prompt eval time =     302.55 ms /   336 tokens (    0.90 ms per token,  1110.57 tokens per second)
llama_print_timings:        eval time =     184.35 ms /     6 runs   (   30.73 ms per token,    32.55 tokens per second)
llama_print_timings:       total time =     499.09 ms /   342 tokens
Llama.generate: prefix-match hit



credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.35 ms /     7 runs   (    0.19 ms per token,  5192.88 tokens per second)
llama_print_timings: prompt eval time =     195.56 ms /   111 tokens (    1.76 ms per token,   567.60 tokens per second)
llama_print_timings:        eval time =     187.75 ms /     6 runs   (   31.29 ms per token,    31.96 tokens per second)
llama_print_timings:       total time =     396.11 ms /   117 tokens
Llama.generate: prefix-match hit



credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.34 ms /     7 runs   (    0.19 ms per token,  5239.52 tokens per second)
llama_print_timings: prompt eval time =     208.22 ms /   133 tokens (    1.57 ms per token,   638.75 tokens per second)
llama_print_timings:        eval time =     183.57 ms /     6 runs   (   30.60 ms per token,    32.68 tokens per second)
llama_print_timings:       total time =     402.53 ms /   139 tokens
Llama.generate: prefix-match hit



credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.32 ms /     7 runs   (    0.19 ms per token,  5291.01 tokens per second)
llama_print_timings: prompt eval time =     290.51 ms /   290 tokens (    1.00 ms per token,   998.24 tokens per second)
llama_print_timings:        eval time =     184.44 ms /     6 runs   (   30.74 ms per token,    32.53 tokens per second)
llama_print_timings:       total time =     487.21 ms /   296 tokens
Llama.generate: prefix-match hit



retail_banking



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.24 ms /     7 runs   (    0.18 ms per token,  5649.72 tokens per second)
llama_print_timings: prompt eval time =     377.04 ms /   420 tokens (    0.90 ms per token,  1113.94 tokens per second)
llama_print_timings:        eval time =     186.55 ms /     6 runs   (   31.09 ms per token,    32.16 tokens per second)
llama_print_timings:       total time =     575.26 ms /   426 tokens
Llama.generate: prefix-match hit



credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.25 ms /     7 runs   (    0.18 ms per token,  5622.49 tokens per second)
llama_print_timings: prompt eval time =     193.51 ms /   103 tokens (    1.88 ms per token,   532.28 tokens per second)
llama_print_timings:        eval time =     187.03 ms /     6 runs   (   31.17 ms per token,    32.08 tokens per second)
llama_print_timings:       total time =     391.71 ms /   109 tokens
Llama.generate: prefix-match hit



credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.44 ms /     7 runs   (    0.21 ms per token,  4861.11 tokens per second)
llama_print_timings: prompt eval time =     213.35 ms /   152 tokens (    1.40 ms per token,   712.46 tokens per second)
llama_print_timings:        eval time =     184.77 ms /     6 runs   (   30.79 ms per token,    32.47 tokens per second)
llama_print_timings:       total time =     410.39 ms /   158 tokens
Llama.generate: prefix-match hit



credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.38 ms /     7 runs   (    0.20 ms per token,  5083.51 tokens per second)
llama_print_timings: prompt eval time =     433.55 ms /   505 tokens (    0.86 ms per token,  1164.81 tokens per second)
llama_print_timings:        eval time =     191.54 ms /     6 runs   (   31.92 ms per token,    31.32 tokens per second)
llama_print_timings:       total time =     636.36 ms /   511 tokens
Llama.generate: prefix-match hit



credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.39 ms /     7 runs   (    0.20 ms per token,  5032.35 tokens per second)
llama_print_timings: prompt eval time =     192.09 ms /    95 tokens (    2.02 ms per token,   494.57 tokens per second)
llama_print_timings:        eval time =     182.99 ms /     6 runs   (   30.50 ms per token,    32.79 tokens per second)
llama_print_timings:       total time =     385.91 ms /   101 tokens



credit_reporting


In [20]:
new_data['mistral_response']

167            \ncredit_card
169       \ncredit_reporting
461        \ndebt_collection
253       \ncredit_reporting
42          \nretail_banking
369       \ncredit_reporting
26     \nmortgages_and_loans
377       \ncredit_reporting
238       \ncredit_reporting
374       \ncredit_reporting
140       \ncredit_reporting
175       \ncredit_reporting
388       \ncredit_reporting
62     \nmortgages_and_loans
256       \ncredit_reporting
332       \ncredit_reporting
386       \ncredit_reporting
56             \ncredit_card
157       \ncredit_reporting
48     \nmortgages_and_loans
163       \ncredit_reporting
9         \ncredit_reporting
441       \ncredit_reporting
102       \ncredit_reporting
0           \nretail_banking
364       \ncredit_reporting
139       \ncredit_reporting
150       \ncredit_reporting
173       \ncredit_reporting
463       \ncredit_reporting
Name: mistral_response, dtype: object

In [21]:
def extract_category(text):
    # Define the regex pattern to match "category:" or "Category:" followed by a word
    pattern = r'category:\s*(\w+)'  # The pattern itself remains the same

    # Use re.search with the re.IGNORECASE flag to make it case-insensitive
    match = re.search(pattern, text, re.IGNORECASE)

    # If a match is found, return the captured group, else return None
    if match:
        return match.group(1)
    else:
        pattern1 = r'(credit_card|retail_banking|credit_reporting|mortgages_and_loans|debt_collection)'
        match = re.search(pattern1, text, re.IGNORECASE)
        if match:
            return match.group()
        else:
            return ''

In [22]:
# example - new_data['mistral_response_cleaned'] = new_data['narrative'].apply(lambda x:______ )
new_data['mistral_response_cleaned'] = new_data['mistral_response'].apply(lambda x:extract_category(x) )

In [23]:
new_data.head()

Unnamed: 0,product,narrative,summary,mistral_response,mistral_response_cleaned
167,retail_banking,fraudulent charge totaling made capital one ch...,A fraudulent charge was made on the individual...,\ncredit_card,credit_card
169,credit_reporting,block except otherwise provided section consum...,The text outlines various stipulations regardi...,\ncredit_reporting,credit_reporting
461,credit_card,usaa master plan collect cancellation debt usa...,The input appears to be a complaint about USAA...,\ndebt_collection,debt_collection
253,credit_reporting,block except otherwise provided section consum...,The text pertains to the stipulations and oper...,\ncredit_reporting,credit_reporting
42,credit_reporting,open account acct opened balance account acct ...,The input is about various accounts being open...,\nretail_banking,retail_banking


##### **Q2.3: Calculate the F1 score** **(1 Marks)**

In [24]:
# Calculate F1 score for 'product' and 'mistral_response'
f1 =  f1_score(new_data['product'], new_data['mistral_response'],average='micro')

print(f'F1 Score: {f1}')

F1 Score: 0.0


In [25]:
# Calculate F1 score for 'product' and 'mistral_response_cleaned'
f2 = f1_score(new_data['product'], new_data['mistral_response_cleaned'], average='micro')
print(f'F1 Score: {f2}')

F1 Score: 0.8333333333333334


##### **Q2.4: Explain the difference in F1 scores between mistral_response and mistral_response_cleaned.** **(1 Marks)**

#### mistral_response_cleaned has a higher F1 score because it extracts only the predicted category, removing extra text from the full mistral_response.


### **Question 3: Few-Shot Prompting for Text Classification (7 Marks)**

##### **Q3.1: Prepare examples for a few-shot prompt, formulate the prompt, and generate the Mistral response. (5 Marks)**

**Generate a set of gold examples by randomly selecting 10 instances of user_input and assistant_output from dataset ensuring a balanced representation with 2 examples from each class.**

In [26]:
import json
review_1 = data[data['product'] == 'credit_card']
review_2 = data[data['product'] == 'retail_banking']
review_3 = data[data['product'] == 'credit_reporting']
review_4 = data[data['product'] == 'mortgages_and_loans']
review_5 = data[data['product'] == 'debt_collection']

# Sample 2 examples for each category
examples_1 = review_1.sample(2, random_state=40)
examples_2 = review_2.sample(2, random_state=40)
examples_3 = review_3.sample(2, random_state=40)
examples_4 = review_4.sample(2, random_state=40)
examples_5 = review_5.sample(2, random_state=40)

# Concatenate examples for few shot prompting
examples_df = pd.concat([examples_1,examples_2,examples_3,examples_4,examples_5 ])

# Create the training set by excluding examples
gold_examples_df = data.drop(index=examples_df.index)

# Convert examples to JSON
columns_to_select = ['narrative', 'product']
examples_json = examples_df[columns_to_select].to_json(orient='records')

# Print the first record from the JSON
print(json.loads(examples_json)[0])

# Print the shapes of the datasets
print("Examples Set Shape:", examples_df.shape)
print("Gold Examples Shape:", gold_examples_df.shape)

{'narrative': 'called request new york state covid relief plan day interest fee waived amex provided relief leading late payment amex refused honor relief day gap insists charging late fee', 'product': 'credit_card'}
Examples Set Shape: (10, 3)
Gold Examples Shape: (490, 3)


- Define your **system_message**.
- Define **first_turn_template**, **example_template** and **prediction template**
- **create few shot prompt** using gold examples and system_message
- Randomly select 30 rows from test_df as test_data
- Create **mistral_response** with **mistral_response_cleaned** columns for this

In [27]:
system_message = "You are a helpful assistant that classifies user complaints into one of the following product categories: credit_card, retail_banking, credit_reporting, mortgages_and_loans, or debt_collection. Respond only with the product category."


In [28]:

first_turn_template = """[System Message]
{system_message}

[User Complaint]
{user_input}

[Assistant Response]
{assistant_output}

"""

# Template for subsequent examples
examples_template = """[User Complaint]
{user_input}

[Assistant Response]
{assistant_output}

"""

prediction_template = """[User Complaint]
{user_input}

[Assistant Response]"""

In [29]:
def create_few_shot_prompt(system_message, examples_df):

    """
    Return a prompt message in the format expected by Mistral 7b.
    10 examples are selected randomly as golden examples to form the
    few-shot prompt.
    We then loop through each example and parse the narrative as the user message
    and the product as the assistant message.

    Args:
        system_message (str): system message with instructions for classification
        examples(DataFrame): A DataFrame with examples (product + narrative + summary)
        to form the few-shot prompt.

    Output:
        few_shot_prompt (str): A prompt string in the Mistral format
    """

    few_shot_prompt = ''

    columns_to_select = ['narrative', 'product']
    examples = (
        examples_df.loc[:, columns_to_select].to_json(orient='records')
    )

    for idx, example in enumerate(json.loads(examples)):
        user_input_example = example['narrative']
        assistant_output_example = example['product']

        if idx == 0:
            few_shot_prompt += first_turn_template.format(
                system_message=system_message,
                user_input=user_input_example,
                assistant_output=assistant_output_example
            )
        else:
            few_shot_prompt += examples_template.format(
                user_input=user_input_example,
                assistant_output=assistant_output_example
            )

    return few_shot_prompt

In [30]:
few_shot_prompt = create_few_shot_prompt(system_message, examples_df)


In [31]:
print(few_shot_prompt)

[System Message]
You are a helpful assistant that classifies user complaints into one of the following product categories: credit_card, retail_banking, credit_reporting, mortgages_and_loans, or debt_collection. Respond only with the product category.

[User Complaint]
called request new york state covid relief plan day interest fee waived amex provided relief leading late payment amex refused honor relief day gap insists charging late fee

[Assistant Response]
credit_card

[User Complaint]
dispute case card ending usaa credit card merchant name transaction date transaction amount disputed amount dispute case card ending usaa credit card merchant name transaction date transaction amount disputed amount dispute making purchase credit card usaa two separate transaction merchant purchased trip cancelled told apply refund apply refund however company give refund back filed dispute credit agency help dispute account follow dispute policy state service provided customer right receive full ref

In [32]:
def generate_prompt(few_shot_prompt,new_review):
    prompt =  few_shot_prompt + prediction_template.format(user_input=new_review)
    return prompt

In [33]:
def generate_mistral_response(support_ticket_text):

    # Combine user_prompt and system_message to create the prompt
    prompt = generate_prompt(system_message,support_ticket_text)

    # Define the Llama model along with its parameters for generating a response
  # Call the Llama model to generate a response
    response = lcpp_llm(
        prompt=prompt,
        max_tokens=1200,
        temperature=0,
        top_p=0.95,
        repeat_penalty=1.2,
        top_k=50,
        stop=["/s"],
        echo=False
    )

    # Extract and return the response text
    response_text = response["choices"][0]["text"]
    print(response_text)
    return response_text

In [34]:
# Randomly select 50 rows from gold_examples
new_data = gold_examples_df.sample(n=50, random_state=40)

In [35]:
# example - new_data['mistral_response_cleaned'] = new_data['narrative'].apply(lambda x:______ )
#new_data['mistral_response'] = new_data['narrative'].apply(lambda x: "_____")
new_data['mistral_response'] = new_data['narrative'].apply(lambda x: generate_mistral_response(x))


Llama.generate: prefix-match hit

llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.42 ms /     7 runs   (    0.20 ms per token,  4936.53 tokens per second)
llama_print_timings: prompt eval time =     689.01 ms /   719 tokens (    0.96 ms per token,  1043.53 tokens per second)
llama_print_timings:        eval time =     193.70 ms /     6 runs   (   32.28 ms per token,    30.98 tokens per second)
llama_print_timings:       total time =     898.20 ms /   725 tokens
Llama.generate: prefix-match hit


 mortgage_and_loans



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       0.89 ms /     5 runs   (    0.18 ms per token,  5636.98 tokens per second)
llama_print_timings: prompt eval time =     386.02 ms /   434 tokens (    0.89 ms per token,  1124.29 tokens per second)
llama_print_timings:        eval time =     125.88 ms /     4 runs   (   31.47 ms per token,    31.78 tokens per second)
llama_print_timings:       total time =     520.08 ms /   438 tokens
Llama.generate: prefix-match hit


 credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.42 ms /     7 runs   (    0.20 ms per token,  4919.18 tokens per second)
llama_print_timings: prompt eval time =     163.30 ms /    19 tokens (    8.59 ms per token,   116.35 tokens per second)
llama_print_timings:        eval time =     183.32 ms /     6 runs   (   30.55 ms per token,    32.73 tokens per second)
llama_print_timings:       total time =     359.85 ms /    25 tokens
Llama.generate: prefix-match hit



credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       0.91 ms /     5 runs   (    0.18 ms per token,  5470.46 tokens per second)
llama_print_timings: prompt eval time =     377.83 ms /   422 tokens (    0.90 ms per token,  1116.89 tokens per second)
llama_print_timings:        eval time =     124.00 ms /     4 runs   (   31.00 ms per token,    32.26 tokens per second)
llama_print_timings:       total time =     509.96 ms /   426 tokens
Llama.generate: prefix-match hit


 credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       0.93 ms /     5 runs   (    0.19 ms per token,  5376.34 tokens per second)
llama_print_timings: prompt eval time =     386.53 ms /   434 tokens (    0.89 ms per token,  1122.81 tokens per second)
llama_print_timings:        eval time =     137.63 ms /     4 runs   (   34.41 ms per token,    29.06 tokens per second)
llama_print_timings:       total time =     532.56 ms /   438 tokens
Llama.generate: prefix-match hit


 credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       0.92 ms /     5 runs   (    0.18 ms per token,  5411.26 tokens per second)
llama_print_timings: prompt eval time =     198.32 ms /    86 tokens (    2.31 ms per token,   433.64 tokens per second)
llama_print_timings:        eval time =     125.41 ms /     4 runs   (   31.35 ms per token,    31.90 tokens per second)
llama_print_timings:       total time =     332.06 ms /    90 tokens
Llama.generate: prefix-match hit


 credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       0.89 ms /     5 runs   (    0.18 ms per token,  5605.38 tokens per second)
llama_print_timings: prompt eval time =     381.39 ms /   422 tokens (    0.90 ms per token,  1106.49 tokens per second)
llama_print_timings:        eval time =     126.52 ms /     4 runs   (   31.63 ms per token,    31.61 tokens per second)
llama_print_timings:       total time =     515.55 ms /   426 tokens
Llama.generate: prefix-match hit


 credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.19 ms /     6 runs   (    0.20 ms per token,  5063.29 tokens per second)
llama_print_timings: prompt eval time =     176.45 ms /    46 tokens (    3.84 ms per token,   260.70 tokens per second)
llama_print_timings:        eval time =     154.86 ms /     5 runs   (   30.97 ms per token,    32.29 tokens per second)
llama_print_timings:       total time =     341.87 ms /    51 tokens
Llama.generate: prefix-match hit



debt_collection



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       0.83 ms /     4 runs   (    0.21 ms per token,  4796.16 tokens per second)
llama_print_timings: prompt eval time =     187.55 ms /    68 tokens (    2.76 ms per token,   362.58 tokens per second)
llama_print_timings:        eval time =      97.57 ms /     3 runs   (   32.52 ms per token,    30.75 tokens per second)
llama_print_timings:       total time =     292.15 ms /    71 tokens
Llama.generate: prefix-match hit


 debt_collection



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       0.95 ms /     5 runs   (    0.19 ms per token,  5279.83 tokens per second)
llama_print_timings: prompt eval time =     390.07 ms /   439 tokens (    0.89 ms per token,  1125.44 tokens per second)
llama_print_timings:        eval time =     125.39 ms /     4 runs   (   31.35 ms per token,    31.90 tokens per second)
llama_print_timings:       total time =     523.67 ms /   443 tokens
Llama.generate: prefix-match hit


 credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       0.97 ms /     5 runs   (    0.19 ms per token,  5159.96 tokens per second)
llama_print_timings: prompt eval time =     380.40 ms /   422 tokens (    0.90 ms per token,  1109.36 tokens per second)
llama_print_timings:        eval time =     123.73 ms /     4 runs   (   30.93 ms per token,    32.33 tokens per second)
llama_print_timings:       total time =     512.81 ms /   426 tokens
Llama.generate: prefix-match hit


 credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       0.94 ms /     5 runs   (    0.19 ms per token,  5330.49 tokens per second)
llama_print_timings: prompt eval time =     178.27 ms /    54 tokens (    3.30 ms per token,   302.91 tokens per second)
llama_print_timings:        eval time =     125.76 ms /     4 runs   (   31.44 ms per token,    31.81 tokens per second)
llama_print_timings:       total time =     311.89 ms /    58 tokens
Llama.generate: prefix-match hit


 credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.49 ms /     7 runs   (    0.21 ms per token,  4694.84 tokens per second)
llama_print_timings: prompt eval time =     178.29 ms /    54 tokens (    3.30 ms per token,   302.87 tokens per second)
llama_print_timings:        eval time =     184.31 ms /     6 runs   (   30.72 ms per token,    32.55 tokens per second)
llama_print_timings:       total time =     373.29 ms /    60 tokens
Llama.generate: prefix-match hit



credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       0.92 ms /     5 runs   (    0.18 ms per token,  5458.52 tokens per second)
llama_print_timings: prompt eval time =     383.78 ms /   422 tokens (    0.91 ms per token,  1099.58 tokens per second)
llama_print_timings:        eval time =     125.54 ms /     4 runs   (   31.39 ms per token,    31.86 tokens per second)
llama_print_timings:       total time =     516.92 ms /   426 tokens
Llama.generate: prefix-match hit


 credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       2.34 ms /    12 runs   (    0.20 ms per token,  5126.01 tokens per second)
llama_print_timings: prompt eval time =     255.79 ms /   245 tokens (    1.04 ms per token,   957.82 tokens per second)
llama_print_timings:        eval time =     340.28 ms /    11 runs   (   30.93 ms per token,    32.33 tokens per second)
llama_print_timings:       total time =     616.43 ms /   256 tokens
Llama.generate: prefix-match hit


 credit_mortgages_and_loans



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.00 ms /     5 runs   (    0.20 ms per token,  5020.08 tokens per second)
llama_print_timings: prompt eval time =     436.64 ms /   512 tokens (    0.85 ms per token,  1172.59 tokens per second)
llama_print_timings:        eval time =     156.72 ms /     5 runs   (   31.34 ms per token,    31.90 tokens per second)
llama_print_timings:       total time =     602.74 ms /   517 tokens
Llama.generate: prefix-match hit


 credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.22 ms /     6 runs   (    0.20 ms per token,  4930.16 tokens per second)
llama_print_timings: prompt eval time =     174.82 ms /    37 tokens (    4.72 ms per token,   211.64 tokens per second)
llama_print_timings:        eval time =     154.99 ms /     5 runs   (   31.00 ms per token,    32.26 tokens per second)
llama_print_timings:       total time =     339.52 ms /    42 tokens
Llama.generate: prefix-match hit



credit_card



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       0.80 ms /     4 runs   (    0.20 ms per token,  5018.82 tokens per second)
llama_print_timings: prompt eval time =     289.87 ms /   292 tokens (    0.99 ms per token,  1007.34 tokens per second)
llama_print_timings:        eval time =      92.67 ms /     3 runs   (   30.89 ms per token,    32.37 tokens per second)
llama_print_timings:       total time =     388.80 ms /   295 tokens
Llama.generate: prefix-match hit


 credit_card



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.78 ms /     9 runs   (    0.20 ms per token,  5064.72 tokens per second)
llama_print_timings: prompt eval time =     198.85 ms /   121 tokens (    1.64 ms per token,   608.50 tokens per second)
llama_print_timings:        eval time =     249.43 ms /     8 runs   (   31.18 ms per token,    32.07 tokens per second)
llama_print_timings:       total time =     463.84 ms /   129 tokens
Llama.generate: prefix-match hit


 mortgages_and_loans



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       0.93 ms /     5 runs   (    0.19 ms per token,  5399.57 tokens per second)
llama_print_timings: prompt eval time =     382.10 ms /   422 tokens (    0.91 ms per token,  1104.41 tokens per second)
llama_print_timings:        eval time =     126.43 ms /     4 runs   (   31.61 ms per token,    31.64 tokens per second)
llama_print_timings:       total time =     516.99 ms /   426 tokens
Llama.generate: prefix-match hit


 credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       0.78 ms /     4 runs   (    0.19 ms per token,  5141.39 tokens per second)
llama_print_timings: prompt eval time =     226.26 ms /   184 tokens (    1.23 ms per token,   813.24 tokens per second)
llama_print_timings:        eval time =      93.74 ms /     3 runs   (   31.25 ms per token,    32.00 tokens per second)
llama_print_timings:       total time =     326.62 ms /   187 tokens
Llama.generate: prefix-match hit


 debt_collection



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.02 ms /     5 runs   (    0.20 ms per token,  4921.26 tokens per second)
llama_print_timings: prompt eval time =     436.92 ms /   512 tokens (    0.85 ms per token,  1171.84 tokens per second)
llama_print_timings:        eval time =     159.94 ms /     5 runs   (   31.99 ms per token,    31.26 tokens per second)
llama_print_timings:       total time =     606.27 ms /   517 tokens
Llama.generate: prefix-match hit


 credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.03 ms /     5 runs   (    0.21 ms per token,  4873.29 tokens per second)
llama_print_timings: prompt eval time =     642.11 ms /   611 tokens (    1.05 ms per token,   951.55 tokens per second)
llama_print_timings:        eval time =     125.66 ms /     4 runs   (   31.42 ms per token,    31.83 tokens per second)
llama_print_timings:       total time =     779.09 ms /   615 tokens
Llama.generate: prefix-match hit


 retail_banking



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       0.74 ms /     4 runs   (    0.19 ms per token,  5369.13 tokens per second)
llama_print_timings: prompt eval time =    1362.75 ms /  1430 tokens (    0.95 ms per token,  1049.35 tokens per second)
llama_print_timings:        eval time =      96.72 ms /     3 runs   (   32.24 ms per token,    31.02 tokens per second)
llama_print_timings:       total time =    1471.95 ms /  1433 tokens
Llama.generate: prefix-match hit


 credit_card



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.61 ms /     7 runs   (    0.23 ms per token,  4358.66 tokens per second)
llama_print_timings: prompt eval time =     215.65 ms /   166 tokens (    1.30 ms per token,   769.78 tokens per second)
llama_print_timings:        eval time =     185.86 ms /     6 runs   (   30.98 ms per token,    32.28 tokens per second)
llama_print_timings:       total time =     413.82 ms /   172 tokens
Llama.generate: prefix-match hit



credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.38 ms /     7 runs   (    0.20 ms per token,  5068.79 tokens per second)
llama_print_timings: prompt eval time =     302.50 ms /   338 tokens (    0.89 ms per token,  1117.34 tokens per second)
llama_print_timings:        eval time =     189.23 ms /     6 runs   (   31.54 ms per token,    31.71 tokens per second)
llama_print_timings:       total time =     503.14 ms /   344 tokens
Llama.generate: prefix-match hit



credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.95 ms /    11 runs   (    0.18 ms per token,  5629.48 tokens per second)
llama_print_timings: prompt eval time =     201.13 ms /   124 tokens (    1.62 ms per token,   616.52 tokens per second)
llama_print_timings:        eval time =     313.00 ms /    10 runs   (   31.30 ms per token,    31.95 tokens per second)
llama_print_timings:       total time =     532.32 ms /   134 tokens
Llama.generate: prefix-match hit



mortgages_and_loans



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.42 ms /     7 runs   (    0.20 ms per token,  4929.58 tokens per second)
llama_print_timings: prompt eval time =     208.49 ms /   134 tokens (    1.56 ms per token,   642.73 tokens per second)
llama_print_timings:        eval time =     188.52 ms /     6 runs   (   31.42 ms per token,    31.83 tokens per second)
llama_print_timings:       total time =     408.56 ms /   140 tokens
Llama.generate: prefix-match hit



credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       0.66 ms /     4 runs   (    0.16 ms per token,  6106.87 tokens per second)
llama_print_timings: prompt eval time =     192.35 ms /    89 tokens (    2.16 ms per token,   462.71 tokens per second)
llama_print_timings:        eval time =      94.20 ms /     3 runs   (   31.40 ms per token,    31.85 tokens per second)
llama_print_timings:       total time =     292.33 ms /    92 tokens
Llama.generate: prefix-match hit


 debt_collection



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       8.82 ms /    43 runs   (    0.21 ms per token,  4876.39 tokens per second)
llama_print_timings: prompt eval time =     199.81 ms /   124 tokens (    1.61 ms per token,   620.58 tokens per second)
llama_print_timings:        eval time =    1299.11 ms /    42 runs   (   30.93 ms per token,    32.33 tokens per second)
llama_print_timings:       total time =    1693.05 ms /   166 tokens
Llama.generate: prefix-match hit


 mortgages_and_loans or debt_collection (The user's complaint involves a dealer and payments, which could potentially relate to either a mortgage/loan situation or a debt collection issue.)



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.02 ms /     5 runs   (    0.20 ms per token,  4906.77 tokens per second)
llama_print_timings: prompt eval time =     435.21 ms /   507 tokens (    0.86 ms per token,  1164.96 tokens per second)
llama_print_timings:        eval time =     128.88 ms /     4 runs   (   32.22 ms per token,    31.04 tokens per second)
llama_print_timings:       total time =     572.22 ms /   511 tokens
Llama.generate: prefix-match hit


 credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.03 ms /     5 runs   (    0.21 ms per token,  4873.29 tokens per second)
llama_print_timings: prompt eval time =     297.74 ms /   315 tokens (    0.95 ms per token,  1057.96 tokens per second)
llama_print_timings:        eval time =     127.03 ms /     4 runs   (   31.76 ms per token,    31.49 tokens per second)
llama_print_timings:       total time =     433.97 ms /   319 tokens
Llama.generate: prefix-match hit


 credit_banking



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       0.95 ms /     5 runs   (    0.19 ms per token,  5279.83 tokens per second)
llama_print_timings: prompt eval time =     430.19 ms /   507 tokens (    0.85 ms per token,  1178.54 tokens per second)
llama_print_timings:        eval time =     123.62 ms /     4 runs   (   30.91 ms per token,    32.36 tokens per second)
llama_print_timings:       total time =     561.68 ms /   511 tokens
Llama.generate: prefix-match hit


 credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.38 ms /     7 runs   (    0.20 ms per token,  5083.51 tokens per second)
llama_print_timings: prompt eval time =     209.15 ms /   147 tokens (    1.42 ms per token,   702.84 tokens per second)
llama_print_timings:        eval time =     183.08 ms /     6 runs   (   30.51 ms per token,    32.77 tokens per second)
llama_print_timings:       total time =     403.24 ms /   153 tokens
Llama.generate: prefix-match hit



credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.15 ms /     6 runs   (    0.19 ms per token,  5221.93 tokens per second)
llama_print_timings: prompt eval time =     174.34 ms /    45 tokens (    3.87 ms per token,   258.12 tokens per second)
llama_print_timings:        eval time =     154.11 ms /     5 runs   (   30.82 ms per token,    32.45 tokens per second)
llama_print_timings:       total time =     337.86 ms /    50 tokens
Llama.generate: prefix-match hit



credit_card



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       0.91 ms /     5 runs   (    0.18 ms per token,  5470.46 tokens per second)
llama_print_timings: prompt eval time =     429.44 ms /   512 tokens (    0.84 ms per token,  1192.26 tokens per second)
llama_print_timings:        eval time =     155.22 ms /     5 runs   (   31.04 ms per token,    32.21 tokens per second)
llama_print_timings:       total time =     593.96 ms /   517 tokens
Llama.generate: prefix-match hit


 credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.13 ms /     5 runs   (    0.23 ms per token,  4416.96 tokens per second)
llama_print_timings: prompt eval time =     380.17 ms /   422 tokens (    0.90 ms per token,  1110.03 tokens per second)
llama_print_timings:        eval time =     124.33 ms /     4 runs   (   31.08 ms per token,    32.17 tokens per second)
llama_print_timings:       total time =     513.95 ms /   426 tokens
Llama.generate: prefix-match hit


 credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       0.96 ms /     5 runs   (    0.19 ms per token,  5230.13 tokens per second)
llama_print_timings: prompt eval time =     434.52 ms /   507 tokens (    0.86 ms per token,  1166.81 tokens per second)
llama_print_timings:        eval time =     126.15 ms /     4 runs   (   31.54 ms per token,    31.71 tokens per second)
llama_print_timings:       total time =     569.19 ms /   511 tokens
Llama.generate: prefix-match hit


 credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.29 ms /     7 runs   (    0.18 ms per token,  5409.58 tokens per second)
llama_print_timings: prompt eval time =     193.31 ms /   105 tokens (    1.84 ms per token,   543.18 tokens per second)
llama_print_timings:        eval time =     184.93 ms /     6 runs   (   30.82 ms per token,    32.44 tokens per second)
llama_print_timings:       total time =     389.39 ms /   111 tokens
Llama.generate: prefix-match hit



credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       0.92 ms /     5 runs   (    0.18 ms per token,  5458.52 tokens per second)
llama_print_timings: prompt eval time =     381.57 ms /   423 tokens (    0.90 ms per token,  1108.58 tokens per second)
llama_print_timings:        eval time =     127.87 ms /     4 runs   (   31.97 ms per token,    31.28 tokens per second)
llama_print_timings:       total time =     518.04 ms /   427 tokens
Llama.generate: prefix-match hit


 credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       0.94 ms /     5 runs   (    0.19 ms per token,  5336.18 tokens per second)
llama_print_timings: prompt eval time =     385.56 ms /   434 tokens (    0.89 ms per token,  1125.63 tokens per second)
llama_print_timings:        eval time =     124.78 ms /     4 runs   (   31.20 ms per token,    32.06 tokens per second)
llama_print_timings:       total time =     518.14 ms /   438 tokens
Llama.generate: prefix-match hit

llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       0.88 ms /     5 runs   (    0.18 ms per token,  5668.93 tokens per second)
llama_print_timings: prompt eval time =       0.00 ms /     1 tokens (    0.00 ms per token,      inf tokens per second)
llama_print_timings:        eval time =     155.88 ms /     5 runs   (   31.18 ms per token,    32.08 tokens per second)
llama_print_timings:       total time =     163.75 ms /     6 

 credit_reporting
 credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       2.11 ms /    11 runs   (    0.19 ms per token,  5208.33 tokens per second)
llama_print_timings: prompt eval time =     192.86 ms /    99 tokens (    1.95 ms per token,   513.33 tokens per second)
llama_print_timings:        eval time =     311.01 ms /    10 runs   (   31.10 ms per token,    32.15 tokens per second)
llama_print_timings:       total time =     521.60 ms /   109 tokens
Llama.generate: prefix-match hit



mortgages_and_loans



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.40 ms /     7 runs   (    0.20 ms per token,  5014.33 tokens per second)
llama_print_timings: prompt eval time =     244.33 ms /   204 tokens (    1.20 ms per token,   834.93 tokens per second)
llama_print_timings:        eval time =     185.77 ms /     6 runs   (   30.96 ms per token,    32.30 tokens per second)
llama_print_timings:       total time =     441.09 ms /   210 tokens
Llama.generate: prefix-match hit



credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       0.86 ms /     5 runs   (    0.17 ms per token,  5813.95 tokens per second)
llama_print_timings: prompt eval time =     381.23 ms /   422 tokens (    0.90 ms per token,  1106.95 tokens per second)
llama_print_timings:        eval time =     124.12 ms /     4 runs   (   31.03 ms per token,    32.23 tokens per second)
llama_print_timings:       total time =     512.95 ms /   426 tokens
Llama.generate: prefix-match hit

llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       0.95 ms /     5 runs   (    0.19 ms per token,  5257.62 tokens per second)
llama_print_timings: prompt eval time =       0.00 ms /     1 tokens (    0.00 ms per token,      inf tokens per second)
llama_print_timings:        eval time =     156.98 ms /     5 runs   (   31.40 ms per token,    31.85 tokens per second)
llama_print_timings:       total time =     165.71 ms /     6 

 credit_reporting
 credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       6.53 ms /    33 runs   (    0.20 ms per token,  5055.92 tokens per second)
llama_print_timings: prompt eval time =     177.21 ms /    55 tokens (    3.22 ms per token,   310.36 tokens per second)
llama_print_timings:        eval time =     979.35 ms /    32 runs   (   30.60 ms per token,    32.67 tokens per second)
llama_print_timings:       total time =    1209.81 ms /    87 tokens
Llama.generate: prefix-match hit


 mortgages_and_loans or retail_banking (Depending on the nature of the lease and the company involved, it could be either.)



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       1.41 ms /     7 runs   (    0.20 ms per token,  4961.02 tokens per second)
llama_print_timings: prompt eval time =     211.66 ms /   151 tokens (    1.40 ms per token,   713.41 tokens per second)
llama_print_timings:        eval time =     183.52 ms /     6 runs   (   30.59 ms per token,    32.69 tokens per second)
llama_print_timings:       total time =     406.54 ms /   157 tokens
Llama.generate: prefix-match hit


 mortgage_and_loans



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       0.89 ms /     5 runs   (    0.18 ms per token,  5643.34 tokens per second)
llama_print_timings: prompt eval time =     382.15 ms /   422 tokens (    0.91 ms per token,  1104.28 tokens per second)
llama_print_timings:        eval time =     123.46 ms /     4 runs   (   30.86 ms per token,    32.40 tokens per second)
llama_print_timings:       total time =     513.68 ms /   426 tokens
Llama.generate: prefix-match hit


 credit_reporting



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =       0.93 ms /     5 runs   (    0.18 ms per token,  5405.41 tokens per second)
llama_print_timings: prompt eval time =     386.11 ms /   434 tokens (    0.89 ms per token,  1124.04 tokens per second)
llama_print_timings:        eval time =     125.08 ms /     4 runs   (   31.27 ms per token,    31.98 tokens per second)
llama_print_timings:       total time =     519.08 ms /   438 tokens


 credit_reporting


In [36]:
# example - new_data['mistral_response_cleaned'] = new_data['narrative'].apply(lambda x:______ )
#new_data['mistral_response_cleaned'] = new_data['mistral_response'].apply(lambda x: "_____")
new_data['mistral_response_cleaned'] = new_data['mistral_response'].apply(lambda x: extract_category(x))


##### **Q3.2: Calculate the F1 score** **(1 Marks)**

In [37]:
# Calculate F1 score for 'product' and 'mistral_response'
f1 = f1_score(new_data['product'], new_data['mistral_response_cleaned'], average='micro')
print(f'F1 Score: {f1}')
print(f'F1 Score: {f1}')

F1 Score: 0.84
F1 Score: 0.84


##### **Q3.3: Share your observations on the few-shot and zero-shot prompt techniques. (1 Marks)**

############ Few-shot prompting improves accuracy by showing examples, while zero-shot relies only on instructions and is simpler but often less precise.

# **Section 3: Text to Text generation**

### **Question 4: Zero-Shot Prompting for Text Summarization (5 Marks)**

##### **Q4.1: Define the Prompt Template, System Message, generate prompt and model response** **(3 Marks)**


- Define a **system message** as a string and assign it to the variable system_message to generate summary of narrative in data.
- Create a **zero shot prompt template** that incorporates the system message and user input.
- Define **generate_prompt** function that takes both the system_message and user_input as arguments and formats them into a prompt template


Write a Python function called **generate_mistral_response** that takes a single parameter, narrative, which represents the user's complain. Inside the function, you should perform the following tasks:


- **Combine the system_message and narrative to create a prompt string using generate_prompt function.**

*Generate a response from the Mistral model using the lcpp_llm instance with the following parameters:*

- prompt should be the combined prompt string.
- max_tokens should be set to 1200.
- temperature should be set to 0.
- top_p should be set to 0.95.
- repeat_penalty should be set to 1.2.
- top_k should be set to 50.
- stop should be set as a list containing '/s'.
- echo should be set to False.
Extract and return the response text from the generated response.

Don't forget to provide a value for the system_message variable before using it in the function.

In [38]:
system_message = "You are a helpful assistant that generates a concise summary of the given complaint narrative."


In [39]:
zero_shot_prompt_template = """[System Message]
{system_message}

[User Complaint]
{user_input}

[Summary]
"""
def generate_prompt(system_message,user_input):
    prompt=zero_shot_prompt_template.format(system_message=system_message,user_input =user_input)
    return prompt

def generate_mistral_response(input_text):

    # Combine user_prompt and system_message to create the prompt
    prompt = generate_prompt(system_message,input_text)

    # Define the Llama model along with its parameters for generating a response
    response = lcpp_llm(
        prompt=prompt,
        max_tokens=1200,
        temperature=0,
        top_p=0.95,
        repeat_penalty=1.2,
        top_k=50,
        stop=["/s"],
        echo=False
    )


    # Extract and return the response text
    response_text = response["choices"][0]["text"]
    print(response_text)
    return response_text

##### **Q4.2: Generate mistral_response column containing LLM generated summaries** **(1 Marks)**

In [40]:
# Randomly select 30 rows
gold_examples = data.sample(30, random_state=40)

In [41]:
# example - new_data['mistral_response_cleaned'] = new_data['narrative'].apply(lambda x:______ )
gold_examples['mistral_response'] = gold_examples['narrative'].apply(lambda x: generate_mistral_response(x))


Llama.generate: prefix-match hit

llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =      33.14 ms /   163 runs   (    0.20 ms per token,  4918.68 tokens per second)
llama_print_timings: prompt eval time =     248.31 ms /   247 tokens (    1.01 ms per token,   994.71 tokens per second)
llama_print_timings:        eval time =    5066.79 ms /   162 runs   (   31.28 ms per token,    31.97 tokens per second)
llama_print_timings:       total time =    5602.62 ms /   409 tokens
Llama.generate: prefix-match hit


The user is reporting a fraudulent charge on their Capital One checking account, which was immediately canceled. They disputed the charge and were informed that provisional credit would be issued pending determination of the claim. However, they received a form letter denying the claim despite never losing possession of their debit card or authorizing someone else to use it. The user believes their card may have been intercepted and fraudulently activated for unauthorized purchases. They have contacted Capital One multiple times to discuss the original claim and make a determination, but feel that malfeasance is occurring as they were sent a replacement debit card without authorization and had credit purchased on their account without consent. The user has also reported this incident to local police department's financial services division regarding fraudulent activity.



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =      34.99 ms /   169 runs   (    0.21 ms per token,  4830.37 tokens per second)
llama_print_timings: prompt eval time =     439.50 ms /   506 tokens (    0.87 ms per token,  1151.32 tokens per second)
llama_print_timings:        eval time =    5295.43 ms /   168 runs   (   31.52 ms per token,    31.73 tokens per second)
llama_print_timings:       total time =    6043.08 ms /   674 tokens
Llama.generate: prefix-match hit


The Consumer Reporting Agency (CRA) is required to block reporting information if a consumer identifies information that resulted from alleged identity theft. The CRA should promptly notify the furnisher of this information and may decline or rescind the block upon reasonable determination, providing evidence whether the consumer was aware of the good service resulting in blocked transactions. Consumers have the right to be notified promptly if their information is reinstated. If a reseller file contains such information, they must provide notice to the consumer. The CRA may apply exceptions for verification companies and check services acting on behalf of consumers. Law enforcement agencies cannot access blocked information without proper authorization or verifiable proof from the original consumer contract. Unverified accounts listed in reports must be removed if the consumer is unable to provide copy verifiable proof.



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =      34.97 ms /   166 runs   (    0.21 ms per token,  4746.65 tokens per second)
llama_print_timings: prompt eval time =     224.81 ms /   183 tokens (    1.23 ms per token,   814.03 tokens per second)
llama_print_timings:        eval time =    5283.10 ms /   165 runs   (   32.02 ms per token,    31.23 tokens per second)
llama_print_timings:       total time =    5814.09 ms /   348 tokens
Llama.generate: prefix-match hit


The user is disputing USAA for collecting debt on two delinquent accounts that were allegedly opened without their consent. They have provided evidence in the form of mastercard and visa statements, credit card application documents, and a fraud claim. The user claims they did not authorize these accounts and are personally liable for the debts. USAA is accused of creating fraudulent delinquent accounts using private ID data. The user also mentions an unauthorized transaction occurring on one of their existing cards. They have received a response from the Consumer Financial Protection Bureau (CFPB) stating that USAA was found to be liable for the unauthorized transaction. The user is requesting that USAA review and settle the claim, as they believe there are cases of identity theft and criminal activity involved.



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =      35.32 ms /   169 runs   (    0.21 ms per token,  4784.55 tokens per second)
llama_print_timings: prompt eval time =     448.81 ms /   506 tokens (    0.89 ms per token,  1127.42 tokens per second)
llama_print_timings:        eval time =    5377.12 ms /   168 runs   (   32.01 ms per token,    31.24 tokens per second)
llama_print_timings:       total time =    6134.79 ms /   674 tokens
Llama.generate: prefix-match hit


The Consumer Reporting Agency (CRA) is required to block reporting information if a consumer identifies information that resulted from alleged identity theft. The CRA should promptly notify the furnisher of this information and may decline or rescind the block upon reasonable determination, providing evidence whether the consumer was aware of the good service resulting in blocked transactions. Consumers have the right to be notified promptly if their information is reinstated. If a reseller file contains such information, they must provide notice to the consumer. The CRA may apply exceptions for verification companies and check services acting on behalf of consumers. Law enforcement agencies cannot access blocked information without proper authorization or verifiable proof from the original consumer contract. Unverified accounts listed in reports must be removed if the consumer is unable to provide copy verifiable proof.



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =      10.63 ms /    58 runs   (    0.18 ms per token,  5454.20 tokens per second)
llama_print_timings: prompt eval time =     180.85 ms /    75 tokens (    2.41 ms per token,   414.71 tokens per second)
llama_print_timings:        eval time =    1716.22 ms /    57 runs   (   30.11 ms per token,    33.21 tokens per second)
llama_print_timings:       total time =    1991.82 ms /   132 tokens
Llama.generate: prefix-match hit


The user has mentioned the opening and closing of multiple balance accounts several times. It appears they are experiencing issues with managing their various accounts, possibly due to confusion or errors in record keeping. They may be requesting assistance in organizing or resolving discrepancies related to these accounts.



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =      33.86 ms /   172 runs   (    0.20 ms per token,  5079.29 tokens per second)
llama_print_timings: prompt eval time =     372.47 ms /   421 tokens (    0.88 ms per token,  1130.30 tokens per second)
llama_print_timings:        eval time =    5306.02 ms /   171 runs   (   31.03 ms per token,    32.23 tokens per second)
llama_print_timings:       total time =    5982.80 ms /   592 tokens
Llama.generate: prefix-match hit


The Consumer Reporting Agency (CRA) is required to block certain reporting information in a consumer's file if the consumer identifies information that resulted from alleged identity theft. The CRA must promptly notify any business furnishing information about the consumer of this block, and the furnisher must decline or rescind the report related to the blocked transaction. If the CRA determines that the block was an error, it shall remove the block upon receiving appropriate proof from the consumer. Additionally, if a reseller files a consumer report concerning identified information, they are required to inform the consumer of their decision and provide them with contact details for the CRA. The check service company must also apply certain provisions when processing negotiable instruments or electronic fund transfers related to blocked transactions. Access to blocked information by law enforcement agencies is restricted under specific circumstances.



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =      25.58 ms /   125 runs   (    0.20 ms per token,  4886.44 tokens per second)
llama_print_timings: prompt eval time =     191.65 ms /    88 tokens (    2.18 ms per token,   459.16 tokens per second)
llama_print_timings:        eval time =    3951.04 ms /   124 runs   (   31.86 ms per token,    31.38 tokens per second)
llama_print_timings:       total time =    4371.03 ms /   212 tokens
Llama.generate: prefix-match hit


The user is having issues with PenFed in regards to closing a loan. They have been asked to provide proof of income, including payment stubs and account statements, as well as tax forms. The user has also formally applied for the loan but PenFed is asking for additional documentation such as deposit statements, checks, and transfer accounts. Additionally, they were requested to provide an invoice related to legal services rendered which they believe violates their privilege. They have provided a copy of the invoice but are concerned that PenFed may be going beyond what's necessary in evaluating their income payment capacity.



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =      34.12 ms /   172 runs   (    0.20 ms per token,  5041.62 tokens per second)
llama_print_timings: prompt eval time =     387.33 ms /   421 tokens (    0.92 ms per token,  1086.92 tokens per second)
llama_print_timings:        eval time =    5431.01 ms /   171 runs   (   31.76 ms per token,    31.49 tokens per second)
llama_print_timings:       total time =    6127.21 ms /   592 tokens
Llama.generate: prefix-match hit


The Consumer Reporting Agency (CRA) is required to block certain reporting information in a consumer's file if the consumer identifies information that resulted from alleged identity theft. The CRA must promptly notify any business furnishing information about the consumer of this block, and the furnisher must decline or rescind the report related to the blocked transaction. If the CRA determines that the block was an error, it shall remove the block upon receiving appropriate proof from the consumer. Additionally, if a reseller files a consumer report concerning identified information, they are required to inform the consumer of their decision and provide them with contact details for the CRA. The check service company must also apply certain provisions when processing negotiable instruments or electronic fund transfers related to blocked transactions. Access to blocked information by law enforcement agencies is restricted under specific circumstances.



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =      35.94 ms /   171 runs   (    0.21 ms per token,  4757.53 tokens per second)
llama_print_timings: prompt eval time =     394.88 ms /   433 tokens (    0.91 ms per token,  1096.53 tokens per second)
llama_print_timings:        eval time =    5492.19 ms /   170 runs   (   32.31 ms per token,    30.95 tokens per second)
llama_print_timings:       total time =    6198.79 ms /   603 tokens
Llama.generate: prefix-match hit


The Consumer Reporting Agency (CRA) is required to block reporting of certain consumer information if the consumer identifies it as being a result of alleged identity theft. The CRA must promptly notify any business that has furnished such information and may decline or rescind the blocking request based on specific reasons. If an error occurs, the consumer can request reinstatement of the blocked information. Consumers have the right to be notified in a timely manner if their information is declined or rescinded. The CRA must also block any subsequent use of this information according to the provision. Resellers are obligated to provide notice to consumers when they obtain and resell consumer reports containing blocked information, and check service companies have exceptions for verifying transactions. Law enforcement agencies cannot access blocked information without a specific requirement in law.



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =      34.87 ms /   172 runs   (    0.20 ms per token,  4933.03 tokens per second)
llama_print_timings: prompt eval time =     400.66 ms /   421 tokens (    0.95 ms per token,  1050.77 tokens per second)
llama_print_timings:        eval time =    5539.68 ms /   171 runs   (   32.40 ms per token,    30.87 tokens per second)
llama_print_timings:       total time =    6254.89 ms /   592 tokens
Llama.generate: prefix-match hit


The Consumer Reporting Agency (CRA) is required to block certain reporting information in a consumer's file if the consumer identifies information that resulted from alleged identity theft. The CRA must promptly notify any business furnishing information about the consumer of this block, and the furnisher must decline or rescind the report related to the blocked transaction. If the CRA determines that the block was an error, it shall remove the block upon receiving appropriate proof from the consumer. Additionally, if a reseller files a consumer report concerning identified information, they are required to inform the consumer of their decision and provide them with contact details for the CRA. The check service company must also apply certain provisions when processing negotiable instruments or electronic fund transfers related to blocked transactions. Access to blocked information by law enforcement agencies is restricted under specific circumstances.



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =      31.92 ms /   155 runs   (    0.21 ms per token,  4855.74 tokens per second)
llama_print_timings: prompt eval time =     212.97 ms /   148 tokens (    1.44 ms per token,   694.93 tokens per second)
llama_print_timings:        eval time =    4870.30 ms /   154 runs   (   31.63 ms per token,    31.62 tokens per second)
llama_print_timings:       total time =    5373.42 ms /   302 tokens
Llama.generate: prefix-match hit


The user is a victim of identity theft and has discovered false, fraudulent information on their credit report related to transactions made without their consent. They believe these incidents occurred as a result of the Equifax data breach. The user has contacted Equifax to lock their account and request that the inaccurate information be removed. However, they have had difficulty communicating with the company and validating their identity. Additionally, they have reported the fraudulent activity to local police and the Federal Trade Commission (FTC). They are also experiencing adverse actions due to the incorrect reporting on their credit report, making it difficult for them to find employment. The user is requesting assistance in effectively communicating with Equifax and creditor agencies to resolve this issue.



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =      41.97 ms /   205 runs   (    0.20 ms per token,  4884.44 tokens per second)
llama_print_timings: prompt eval time =     441.01 ms /   512 tokens (    0.86 ms per token,  1160.96 tokens per second)
llama_print_timings:        eval time =    6401.13 ms /   204 runs   (   31.38 ms per token,    31.87 tokens per second)
llama_print_timings:       total time =    7217.21 ms /   716 tokens
Llama.generate: prefix-match hit


The Consumer Reporting Agency (CRA) is required by law to block reporting information if a consumer identifies information that resulted from alleged identity theft. The CRA should promptly notify the furnisher of this information and may decline or rescind the block upon reasonable determination, providing evidence whether the consumer was aware of the good service resulting in blocked transactions. Consumers have the right to be notified promptly if their information is reinstated. If a reseller file contains such information, they must apply the same rules as the CRA and inform consumers accordingly. The Check Service Company acting on an authorization purpose should report this information to the National Consumer Reporting Agency but cannot access blocked information without proper verification. Law enforcement agencies also have limited access to these records under specific circumstances. Unverified accounts must be removed from consumer reports if the consumer is unable to prov


llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =      33.90 ms /   172 runs   (    0.20 ms per token,  5073.45 tokens per second)
llama_print_timings: prompt eval time =     379.12 ms /   421 tokens (    0.90 ms per token,  1110.45 tokens per second)
llama_print_timings:        eval time =    5409.43 ms /   171 runs   (   31.63 ms per token,    31.61 tokens per second)
llama_print_timings:       total time =    6094.96 ms /   592 tokens
Llama.generate: prefix-match hit


The Consumer Reporting Agency (CRA) is required to block certain reporting information in a consumer's file if the consumer identifies information that resulted from alleged identity theft. The CRA must promptly notify any business furnishing information about the consumer of this block, and the furnisher must decline or rescind the report related to the blocked transaction. If the CRA determines that the block was an error, it shall remove the block upon receiving appropriate proof from the consumer. Additionally, if a reseller files a consumer report concerning identified information, they are required to inform the consumer of their decision and provide them with contact details for the CRA. The check service company must also apply certain provisions when processing negotiable instruments or electronic fund transfers related to blocked transactions. Access to blocked information by law enforcement agencies is restricted under specific circumstances.



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =      34.63 ms /   167 runs   (    0.21 ms per token,  4822.55 tokens per second)
llama_print_timings: prompt eval time =     256.89 ms /   233 tokens (    1.10 ms per token,   907.00 tokens per second)
llama_print_timings:        eval time =    5272.13 ms /   166 runs   (   31.76 ms per token,    31.49 tokens per second)
llama_print_timings:       total time =    5834.54 ms /   399 tokens
Llama.generate: prefix-match hit


The user has been experiencing financial hardships and applied for a mortgage modification. After receiving an initial offer with a discrepancy, the investor agreed to defer principal and extend the loan term to allow for more affordable payments. However, approximately one year later, due to further family losses in income, the user contacted Aurora to request another modification. The process was delayed, and the user expressed concern that the agreement had not been honored as promised. They were told to reapply for the modification, which resulted in a cycle of repeated applications. A housing advocate submitted an escalation request alleging a breach of oral contract and financial loss. Meanwhile, the user received a letter denying their request despite investor agreement. The user fears foreclosure due to their medical condition during the pandemic and urgently needs assistance.



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =      43.94 ms /   205 runs   (    0.21 ms per token,  4665.88 tokens per second)
llama_print_timings: prompt eval time =     459.87 ms /   512 tokens (    0.90 ms per token,  1113.35 tokens per second)
llama_print_timings:        eval time =    6644.92 ms /   204 runs   (   32.57 ms per token,    30.70 tokens per second)
llama_print_timings:       total time =    7492.11 ms /   716 tokens
Llama.generate: prefix-match hit


The Consumer Reporting Agency (CRA) is required by law to block reporting information if a consumer identifies information that resulted from alleged identity theft. The CRA should promptly notify the furnisher of this information and may decline or rescind the block upon reasonable determination, providing evidence whether the consumer was aware of the good service resulting in blocked transactions. Consumers have the right to be notified promptly if their information is reinstated. If a reseller file contains such information, they must apply the same rules as the CRA and inform consumers accordingly. The Check Service Company acting on an authorization purpose should report this information to the National Consumer Reporting Agency but cannot access blocked information without proper verification. Law enforcement agencies also have limited access to these records under specific circumstances. Unverified accounts must be removed from consumer reports if the consumer is unable to prov


llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =      42.85 ms /   215 runs   (    0.20 ms per token,  5017.27 tokens per second)
llama_print_timings: prompt eval time =     394.60 ms /   439 tokens (    0.90 ms per token,  1112.52 tokens per second)
llama_print_timings:        eval time =    6778.61 ms /   214 runs   (   31.68 ms per token,    31.57 tokens per second)
llama_print_timings:       total time =    7568.29 ms /   653 tokens
Llama.generate: prefix-match hit


The Consumer Reporting Agency (CRA) is required to block reporting of certain consumer information if the consumer identifies it as being a result of alleged identity theft. The CRA must promptly block this information from their files on the business day following receipt of appropriate proof from the consumer. However, the CRA may decline or rescind the blocking request based on specific reasons such as material misrepresentation or error. If the block is declined or rescinded, the affected consumer must be notified promptly and in writing. The purpose of this section is to protect consumers from identity theft by preventing the reporting and use of incorrect information. This applies to both the CRA and resellers. Consumers have the right to report identity theft to the bureau and obtain information regarding their case from the reseller. If a check service company issues an authorization for processing a negotiable instrument or electronic fund transfer, they must also report this 


llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =      33.95 ms /   172 runs   (    0.20 ms per token,  5066.27 tokens per second)
llama_print_timings: prompt eval time =     385.51 ms /   421 tokens (    0.92 ms per token,  1092.05 tokens per second)
llama_print_timings:        eval time =    5344.85 ms /   171 runs   (   31.26 ms per token,    31.99 tokens per second)
llama_print_timings:       total time =    6036.88 ms /   592 tokens
Llama.generate: prefix-match hit


The Consumer Reporting Agency (CRA) is required to block certain reporting information in a consumer's file if the consumer identifies information that resulted from alleged identity theft. The CRA must promptly notify any business furnishing information about the consumer of this block, and the furnisher must decline or rescind the report related to the blocked transaction. If the CRA determines that the block was an error, it shall remove the block upon receiving appropriate proof from the consumer. Additionally, if a reseller files a consumer report concerning identified information, they are required to inform the consumer of their decision and provide them with contact details for the CRA. The check service company must also apply certain provisions when processing negotiable instruments or electronic fund transfers related to blocked transactions. Access to blocked information by law enforcement agencies is restricted under specific circumstances.



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =      36.42 ms /   178 runs   (    0.20 ms per token,  4887.42 tokens per second)
llama_print_timings: prompt eval time =     238.51 ms /   194 tokens (    1.23 ms per token,   813.37 tokens per second)
llama_print_timings:        eval time =    5646.20 ms /   177 runs   (   31.90 ms per token,    31.35 tokens per second)
llama_print_timings:       total time =    6210.05 ms /   371 tokens
Llama.generate: prefix-match hit


The user expressed frustration with American Express for reducing their credit limit without prior notice or justification. The representative initially stated the reason was an increase in balance, but later changed their position to say it was due to an increased account balance despite available credit remaining. The user felt taken advantage of as they were paying a significant amount towards their balance and maintaining good standing. They also mentioned that American Express failed to provide a valid reason for the reduction during a time when many people could be experiencing financial hardship, such as illness or workplace closure. The user criticized American Express' business practices and lacked confidence in the representative's handling of the situation. They encouraged American Express to help customers meet their needs during urgent situations and expressed concern that they were creating unnecessary hardships for some members. Additionally, the user felt that the repre


llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =      24.67 ms /   115 runs   (    0.21 ms per token,  4660.97 tokens per second)
llama_print_timings: prompt eval time =     211.15 ms /   140 tokens (    1.51 ms per token,   663.03 tokens per second)
llama_print_timings:        eval time =    3636.70 ms /   114 runs   (   31.90 ms per token,    31.35 tokens per second)
llama_print_timings:       total time =    4058.55 ms /   254 tokens
Llama.generate: prefix-match hit


The user is a victim of identity theft following the Equifax data breach. They have filed a claim and are trying to clear their credit report, but have encountered issues with fraudulent items. The process has been long and complicated, causing financial and emotional stress. Additionally, they fear that negative information on their report may prevent them from getting a job due to adverse actions based on inaccurate reporting. They are considering hiring an attorney specializing in predatory loans and breach victim compensation to help review their credit file and remove the fraudulent items.



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =      47.39 ms /   224 runs   (    0.21 ms per token,  4726.64 tokens per second)
llama_print_timings: prompt eval time =     327.38 ms /   359 tokens (    0.91 ms per token,  1096.58 tokens per second)
llama_print_timings:        eval time =    7339.59 ms /   223 runs   (   32.91 ms per token,    30.38 tokens per second)
llama_print_timings:       total time =    8095.87 ms /   582 tokens
Llama.generate: prefix-match hit


A divorced service member, who is a single mother and fell behind on mortgage payments after receiving a sudden loss of child support, struggled to modify the loan with her servicing company. She was unable to afford essential repairs and considered filing for deed in lieu of foreclosure due to emotional, physical, and financial strain. However, she wanted to avoid this outcome and tried negotiating a short sale with a broker. The process became complicated when another interested party emerged, and the loan servicing company continued to report late payments, damaging her credit score. Despite these challenges, she managed to make a full payment and contacted the VA to clarify the status of the deed transfer. However, the loan servicer still reported late payments, causing further financial hardship. She was unable to find an attorney willing to represent her case due to her damaged credit score. The situation worsened over several years as she continued to contact the loan servicing 


llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =      12.89 ms /    68 runs   (    0.19 ms per token,  5273.36 tokens per second)
llama_print_timings: prompt eval time =     186.67 ms /    73 tokens (    2.56 ms per token,   391.06 tokens per second)
llama_print_timings:        eval time =    2104.25 ms /    67 runs   (   31.41 ms per token,    31.84 tokens per second)
llama_print_timings:       total time =    2406.97 ms /   140 tokens
Llama.generate: prefix-match hit


The user is experiencing identity theft with multiple accounts reported on their TransUnion credit file. They have identified several affected accounts, including those with the account names "u dept ed," "dpt edxxxx," and "original creditor." The user has attempted to contact TransUnion for assistance but has not yet received a response.



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =      26.81 ms /   132 runs   (    0.20 ms per token,  4923.17 tokens per second)
llama_print_timings: prompt eval time =     309.74 ms /   337 tokens (    0.92 ms per token,  1088.00 tokens per second)
llama_print_timings:        eval time =    4127.94 ms /   131 runs   (   31.51 ms per token,    31.73 tokens per second)
llama_print_timings:       total time =    4673.77 ms /   468 tokens
Llama.generate: prefix-match hit


The user is experiencing inconsistencies and inaccuracies in their credit report from Equifax. The reported name, last payment date, account status, and credit limit do not match the user's records or reality. They have disputed these issues multiple times but the account remains unchanged. Additionally, an invalid account number was listed as authorized on another reporting account. Despite providing evidence to support their claims, Equifax has refused to make corrections, which is negatively impacting the user's credit score. The user feels obligated by law to request a proper investigation into these errors but has not received satisfactory results from the company.



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =      23.67 ms /   116 runs   (    0.20 ms per token,  4899.89 tokens per second)
llama_print_timings: prompt eval time =     197.43 ms /   112 tokens (    1.76 ms per token,   567.28 tokens per second)
llama_print_timings:        eval time =    3647.55 ms /   115 runs   (   31.72 ms per token,    31.53 tokens per second)
llama_print_timings:       total time =    4051.49 ms /   227 tokens
Llama.generate: prefix-match hit


The user is experiencing identity theft and has discovered several fraudulent applications and accounts opened in their name with Equifax. They have disputed the information multiple times, but were told they would receive a response via certified mail. The user is now requesting to initiate a formal complaint due to Equifax's failure to conduct an adequate investigation or block usage of the fraudulent account. They believe this may be a violation of both the Fair Credit Reporting Act and Fair Debt Collection Practices Act, and have also contacted their state attorney general for further assistance.



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =      22.24 ms /   108 runs   (    0.21 ms per token,  4855.68 tokens per second)
llama_print_timings: prompt eval time =     210.95 ms /   134 tokens (    1.57 ms per token,   635.21 tokens per second)
llama_print_timings:        eval time =    3419.78 ms /   107 runs   (   31.96 ms per token,    31.29 tokens per second)
llama_print_timings:       total time =    3822.37 ms /   241 tokens
Llama.generate: prefix-match hit


The user has filed a complaint about an investigation process regarding a transaction dispute. They claim that the investigator was unable to confirm card possession and gave vague answers, despite providing supporting documentation. The user also mentions being disrespected when requesting a copy of a document proving a debt owed by the cardholder. Additionally, they state that their account has been reported as delinquent to credit reporting companies without resolution or notification. The user is frustrated and wants the situation resolved promptly and requests to close their account.



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =      28.94 ms /   145 runs   (    0.20 ms per token,  5011.06 tokens per second)
llama_print_timings: prompt eval time =     298.11 ms /   291 tokens (    1.02 ms per token,   976.16 tokens per second)
llama_print_timings:        eval time =    4577.24 ms /   144 runs   (   31.79 ms per token,    31.46 tokens per second)
llama_print_timings:       total time =    5135.50 ms /   435 tokens
Llama.generate: prefix-match hit


The user placed multiple orders for products with guaranteed day shipping but received them several weeks later due to high order volume and demand. They followed up via email, were informed of the delay, and paid for two-day shipping. The company apologized and assured they would ship soon, but only small shipments were sent on a first come, first served basis. The user expressed their preference to cancel orders and receive refunds if possible, as they had received some items already. They provided documentation showing the cancellation of one order and the issuance of a refund. However, there seemed to be an issue with another transaction that was disputed, but no resolution or communication regarding this matter was mentioned in the complaint narrative.



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =      35.13 ms /   172 runs   (    0.20 ms per token,  4895.96 tokens per second)
llama_print_timings: prompt eval time =     393.16 ms /   421 tokens (    0.93 ms per token,  1070.82 tokens per second)
llama_print_timings:        eval time =    5541.40 ms /   171 runs   (   32.41 ms per token,    30.86 tokens per second)
llama_print_timings:       total time =    6250.25 ms /   592 tokens
Llama.generate: prefix-match hit


The Consumer Reporting Agency (CRA) is required to block certain reporting information in a consumer's file if the consumer identifies information that resulted from alleged identity theft. The CRA must promptly notify any business furnishing information about the consumer of this block, and the furnisher must decline or rescind the report related to the blocked transaction. If the CRA determines that the block was an error, it shall remove the block upon receiving appropriate proof from the consumer. Additionally, if a reseller files a consumer report concerning identified information, they are required to inform the consumer of their decision and provide them with contact details for the CRA. The check service company must also apply certain provisions when processing negotiable instruments or electronic fund transfers related to blocked transactions. Access to blocked information by law enforcement agencies is restricted under specific circumstances.



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =      27.44 ms /   133 runs   (    0.21 ms per token,  4847.47 tokens per second)
llama_print_timings: prompt eval time =     202.25 ms /   104 tokens (    1.94 ms per token,   514.22 tokens per second)
llama_print_timings:        eval time =    4313.55 ms /   132 runs   (   32.68 ms per token,    30.60 tokens per second)
llama_print_timings:       total time =    4766.91 ms /   236 tokens
Llama.generate: prefix-match hit


The user is a victim of identity theft and has discovered unauthorized transactions on their credit report. They mention that there have been several accounts opened in their name, some with the following account numbers: hlthfrxxxxxxxxxxx, xxxxxxxx, and xxxxxxxx. The dates these accounts were opened are also provided. The user requests that this information be removed from their TransUnion credit report pursuant to the Fair Credit Reporting Act (FCRA). They ask for the necessary steps to be taken by the furnishers of this information and express gratitude while reminding others to wear masks and stay home during these times.



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =      37.59 ms /   183 runs   (    0.21 ms per token,  4867.67 tokens per second)
llama_print_timings: prompt eval time =     214.62 ms /   153 tokens (    1.40 ms per token,   712.88 tokens per second)
llama_print_timings:        eval time =    5689.58 ms /   182 runs   (   31.26 ms per token,    31.99 tokens per second)
llama_print_timings:       total time =    6231.94 ms /   335 tokens
Llama.generate: prefix-match hit


The user is a victim of identity theft and has discovered unauthorized transactions listed on their Equifax credit report. They have been trying to resolve the issue for an extended period, but it has proven to be complicated and time-consuming. The user fears that they may have been involved in the Equifax data breach and is concerned about potential fraudulent debts hindering their ability to apply for credit or even get a first-time credit card. They are currently reviewing their credit file carefully, considering seeking legal advice from an attorney specializing in identity theft cases, and are eligible for victim compensation. The user has noticed a disputed account on their Equifax report that they have not been able to resolve, causing financial and emotional stress. Additionally, the negative information reported may be hindering their ability to get a job due to adverse actions based on inaccurate credit reporting.



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =      33.90 ms /   169 runs   (    0.20 ms per token,  4985.99 tokens per second)
llama_print_timings: prompt eval time =     438.92 ms /   506 tokens (    0.87 ms per token,  1152.84 tokens per second)
llama_print_timings:        eval time =    5243.94 ms /   168 runs   (   31.21 ms per token,    32.04 tokens per second)
llama_print_timings:       total time =    5982.23 ms /   674 tokens
Llama.generate: prefix-match hit


The Consumer Reporting Agency (CRA) is required to block reporting information if a consumer identifies information that resulted from alleged identity theft. The CRA should promptly notify the furnisher of this information and may decline or rescind the block upon reasonable determination, providing evidence whether the consumer was aware of the good service resulting in blocked transactions. Consumers have the right to be notified promptly if their information is reinstated. If a reseller file contains such information, they must provide notice to the consumer. The CRA may apply exceptions for verification companies and check services acting on behalf of consumers. Law enforcement agencies cannot access blocked information without proper authorization or verifiable proof from the original consumer contract. Unverified accounts listed in reports must be removed if the consumer is unable to provide copy verifiable proof.



llama_print_timings:        load time =     409.10 ms
llama_print_timings:      sample time =      23.75 ms /   121 runs   (    0.20 ms per token,  5094.95 tokens per second)
llama_print_timings: prompt eval time =     189.77 ms /    96 tokens (    1.98 ms per token,   505.89 tokens per second)
llama_print_timings:        eval time =    3660.47 ms /   120 runs   (   30.50 ms per token,    32.78 tokens per second)
llama_print_timings:       total time =    4060.06 ms /   216 tokens


The user's mother called to remove an authorized user from her credit card who had never used the line. However, this action negatively affected the user's credit score. The user was able to contact Experian and dispute the change but is eagerly waiting for the reflection of the correction in their score. They need a good credit score to refinance their home as their husband has lost his job and they are barely making ends meet with a nurse's salary. Despite trying to dispute online, the user was unable to get through to Experian by phone to further discuss the issue.


##### **Q4.3: Evaluate bert score** **(1 Marks)**

In [42]:
def evaluate_score(test_data, scorer, bert_score=False):

    """
    Return the ROUGE score or BERTScore for predictions on gold examples
    For each example we make a prediction using the prompt.
    Gold summaries and the AI generated summaries are aggregated into lists.
    These lists are used by the corresponding scorers to compute metrics.
    Since BERTScore is computed for each candidate-reference pair, we take the
    average F1 score across the gold examples.

    Args:
        prompt (List): list of messages in the Open AI prompt format
        gold_examples (str): JSON string with list of gold examples
        scorer (function): Scorer function used to compute the ROUGE score or the
                           BERTScore
        bert_score (boolean): A flag variable that indicates if BERTScore should
                              be used as the metric.

    Output:
        score (float): BERTScore or ROUGE score computed by comparing model predictions
                       with ground truth
    """

    model_predictions = test_data['mistral_response'].tolist()
    ground_truths = test_data['summary'].tolist()
    if bert_score:
        score = scorer.compute(
            predictions=model_predictions,
            references=ground_truths,
            lang="en",
            rescale_with_baseline=True
        )

        return sum(score['f1'])/len(score['f1'])
    else:
        return scorer.compute(
            predictions=model_predictions,
            references=ground_truths
        )

In [43]:
bert_scorer = evaluate.load("bertscore")

In [44]:
score = evaluate_score(
    gold_examples,   
    bert_scorer,
    bert_score=True
)


print(f'BERTScore: {score}')

Some weights of RobertaModel were not initialized from the model checkpoint at roberta-large and are newly initialized: ['pooler.dense.bias', 'pooler.dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


BERTScore: 0.3519383544723193
