![image](https://raw.githubusercontent.com/IBM/watson-machine-learning-samples/master/cloud/notebooks/headers/watsonx-Prompt_Lab-Notebook.png)
# Use Watsonx to analyze car rental customer satisfaction and offer recommendation.

**Note:** For the watsonx challenge, please run these notebooks in IBM Cloud and not on your laptop or desktop.

This notebook contains the steps and code to demonstrate text sentiment analysis in watsonx. It introduces commands for data retrieval, model testing, and scoring.

Some familiarity with Python is helpful. This notebook uses Python 3.10.

<a id="setup"></a>
## Set up the environment

### Install and import the dependencies

In [173]:
!pip install datasets | tail -n 1
!pip install scikit-learn | tail -n 1
!pip install ibm-watson-machine-learning==1.0.312 | tail -n 1



**Note:** Please restart the notebook kernel to pick up the correct versions of the packages installed above.

In [174]:
import os, getpass
import pandas as pd  # used later as pd.DataFrame
from pandas import read_csv

### Watsonx API connection
This cell defines the credentials required to work with the watsonx API for foundation model inferencing.

**Action:** Provide your IBM Cloud user API key. For instructions on generating an IBM Cloud API key, see the
[documentation](https://cloud.ibm.com/docs/account?topic=account-userapikey&interface=ui).

In [457]:
from ibm_cloud_sdk_core import IAMTokenManager
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator, BearerTokenAuthenticator
import os, getpass

access_token = IAMTokenManager(
    apikey = getpass.getpass("Please enter your api key (hit enter): "),
    url = "https://iam.cloud.ibm.com/identity/token"
).get_token()

Please enter your api key (hit enter): ········


### Defining the project id
The API requires a project id that provides the context for the call. We obtain the id from the project in which this notebook runs. When you run a notebook on IBM Cloud, the project in which it runs is saved in the environment variable `PROJECT_ID`.

**Hint**: You can find the `project_id` as follows. Open the prompt lab in watsonx.ai. At the very top of the UI, there will be `Projects / <project name> /`. Click on the `<project name>` link. Then get the `project_id` from Project's Manage tab (Project -> Manage -> General -> Details).


In [240]:
try:
    project_id = os.environ["PROJECT_ID"]
except KeyError:
    project_id = input("Please enter your project_id (hit enter): ")

<a id="data"></a>
## Train/test data loading

Load the train and test datasets. First, use the training dataset (`train_data`) to work with the models and to prepare and tune the prompt. Then use the test dataset (`test_data`) to calculate the metric score for the selected model, prompts, and parameters.

In [177]:
filename_test = 'https://watsonx-gsi-challenge.s3.jp-tok.cloud-object-storage.appdomain.cloud/track1/test.csv'
filename_train = 'https://watsonx-gsi-challenge.s3.jp-tok.cloud-object-storage.appdomain.cloud/track1/train.csv'

test_data = read_csv(filename_test)
train_data = read_csv(filename_train)

In [178]:
train_data.head()

Unnamed: 0,ID,Gender,Status,Children,Age,Customer_Status,Car_Owner,Customer_Service,Satisfaction,Business_Area,Action
0,2944,Female,M,2,41.92,Active,No,Customer service was friendly and helpful.,1,Service: Knowledge,
1,1119,Female,M,2,33.6,Active,Yes,Customer service was good at MSP airport and t...,1,Service: Knowledge,
2,0,Male,M,0,51.0,Inactive,Yes,I do not understand why I have to pay additio...,0,Product: Pricing and Billing,Premium features
3,1085,Female,S,2,42.0,Inactive,No,Based on the customer service personnel I enco...,0,Service: Attitude,On-demand pickup location
4,0,Female,M,2,44.1,Active,No,Provide more convenient car pickup from the ai...,0,Service: Orders/Contracts,On-demand pickup location


In [179]:
test_data.head()

Unnamed: 0,ID,Gender,Status,Children,Age,Customer_Status,Car_Owner,Customer_Service,Satisfaction,Business_Area,Action
0,2771,Female,M,2,49.99,Inactive,No,"last time I rented a car was at Manchester, NH...",0,Product: Functioning,On-demand pickup location
1,1133,Male,S,1,56.05,Inactive,No,Please lower the prices.,0,Product: Pricing and Billing,Free Upgrade
2,900,Female,M,1,64.64,Active,No,Excellent response dealing with child seat.,1,Service: Accessibility,
3,3795,Male,M,0,46.51,Inactive,No,"all went quite smoothly... it was Enterprise, ...",1,Service: Accessibility,
4,3541,Male,S,1,17.01,Inactive,Yes,"Slow, long lineup",0,Product: Functioning,On-demand pickup location


<a id="models"></a>
## Foundation Models on Watsonx



The code below calls the Watson Machine Learning API to run inference on watsonx.ai LLMs.


In [180]:
import requests

class Prompt:
    def __init__(self, access_token, project_id):
        self.access_token = access_token
        self.project_id = project_id

    def generate(self, prompt_text, model_id, parameters):
        """Send one text generation request and return the generated text."""
        wml_url = "https://us-south.ml.cloud.ibm.com/ml/v1-beta/generation/text?version=2023-05-28"
        headers = {
            "Authorization": "Bearer " + self.access_token,
            "Content-Type": "application/json",
            "Accept": "application/json"
        }
        data = {
            "model_id": model_id,
            "input": prompt_text,
            "parameters": parameters,
            "project_id": self.project_id
        }
        response = requests.post(wml_url, json=data, headers=headers)
        if response.status_code == 200:
            return response.json()["results"][0]["generated_text"]
        else:
            # On failure, the raw error body is returned instead of a prediction.
            return response.text
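
The `generate` method above returns `response.text` when the status code is not 200, so an error message can end up in the list of predictions and be scored as a wrong label. The following is a minimal sketch of a stricter way to read the response. The helper name `parse_generation_response` is hypothetical, and it assumes the response body has the same shape that `generate` reads (`results[0]["generated_text"]`).

```python
import json


def parse_generation_response(status_code, body_text):
    """Return the generated text, or raise if the call did not succeed.

    Hypothetical helper: assumes the body has the same shape that
    Prompt.generate reads, i.e. {"results": [{"generated_text": ...}]}.
    """
    if status_code != 200:
        # Surface the failure instead of returning the error text as a prediction.
        raise RuntimeError(f"Generation request failed ({status_code}): {body_text}")
    payload = json.loads(body_text)
    return payload["results"][0]["generated_text"]


# Simulated successful response body, so the sketch runs without network access.
ok_body = json.dumps({"results": [{"generated_text": "1"}]})
print(parse_generation_response(200, ok_body))
```

Raising on failure makes an expired token or a wrong project id visible immediately, rather than showing up later as an unexpectedly low score.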

<a id="predict"></a>
## Evaluate the model, prompt and parameters

### **1. Customer satisfaction**

Define instructions for the model to recognize if customer was satisfied or unsatisfied.

**Note:** Please **start with the [watsonx.ai Prompt Lab](https://dataplatform.cloud.ibm.com/wx/home?context=wx)** to find prompts that give the best results on a small subset of training records (the `train_data` variable). Do not run inference on all of `train_data`, as it will take a long time to get the results. To get a sample from `train_data`, you can use, for example, `train_data.head(n=10)` to get the first 10 records, or `train_data.sample(n=10)` to get 10 random records. Only once you have identified the best-performing prompt, update this notebook to use that prompt and compute the metrics on the test data.

**Action:** Please edit the below cell and add your own prompt here. In the below prompt, we have the instruction (first sentence) and one example included in the prompt.  If you want to change the prompt or add your own examples or more examples, please change the below prompt accordingly.

#### 1.1. Default values and package declarations

Values that stay the same across the repeated runs below are defined once here.

In [458]:
from sklearn.metrics import f1_score

prompt = Prompt(access_token, project_id)
comments = list(test_data.Customer_Service)
satisfaction = list(test_data.Satisfaction.astype(str))

satisfaction_instruction_head = """

Decide if customer was satisfied or not based on the given feedback by customer. Respond 1 if satisfied and 0 if unsatisfied.

"""

#### Defining the model parameters
We need to provide a set of model parameters that will influence the result. We will use IBM's Granite model.

In [453]:
parameters = {
    "decoding_method": "greedy",
    "max_new_tokens": 1,
    "min_new_tokens": 1,
    "repetition_penalty": 1
}

model_id = "ibm/granite-13b-instruct-v1"
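
With `max_new_tokens` set to 1, the model returns a single token, but that token is not guaranteed to be exactly `0` or `1`: it may carry whitespace or be some other token entirely. The following is a minimal sketch of how the raw output could be mapped onto the expected labels before scoring. The helper `normalize_label` and the `"unknown"` fallback label are assumptions for illustration and are not part of the original notebook.

```python
def normalize_label(raw_output, allowed=("0", "1"), fallback="unknown"):
    """Map a raw model output onto one of the allowed labels.

    Hypothetical helper: anything that is not exactly an allowed label
    after stripping whitespace is replaced by the fallback label.
    """
    text = raw_output.strip()
    return text if text in allowed else fallback


print(normalize_label(" 1\n"))
print(normalize_label("yes"))
```

Keeping unexpected outputs under a single fallback label makes them easy to count after a run.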

Analyze the customer satisfaction for inputs from the test set.

**Note:** Execution of this cell could take several minutes.

#### 1.2. Repeated random sampling of examples

Randomly select 10 training records, append them as examples after the instruction, and score the resulting prompt. This is repeated 10 times. The example set with the following index numbers gave the best result.

[62, 7, 23, 53, 67, 66, 81, 40, 13, 78]
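
The few-shot prompt used in this section has a simple layout: the instruction, then each example comment followed by its label on the next line, then the comment to classify. The following is a minimal sketch of that layout as a standalone function. The name `build_few_shot_prompt` and the short instruction string are assumptions for illustration; the example comments are taken from the training data shown in this notebook.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Concatenate instruction, (comment, label) examples, and the query.

    Hypothetical helper that mirrors the string layout used in this
    notebook: each example is "\\n<comment>\\n<label>\\n".
    """
    parts = [instruction, "\n"]
    for comment, label in examples:
        parts.append("\n" + comment + "\n" + label + "\n")
    # The query comes last; the model is expected to continue with the label.
    parts.append("\n" + query + "\n")
    return "".join(parts)


demo_prompt = build_few_shot_prompt(
    "Respond 1 if satisfied and 0 if unsatisfied.",
    [("We got our car very quickly.", "1"), ("VERY slow service!", "0")],
    "Please lower the prices.",
)
print(demo_prompt)
```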

In [459]:
result_table = pd.DataFrame(columns=['input_line_nums', 'satisfaction_nums', 'f1_scores'])

input_line_nums = []
result_f1_scores = []
count_satisfaction = []

for i in range(10):
    print('Randomly select 10 training data...', end='')
    sample_train = train_data.sample(n=10)
    train_comments = list(sample_train.Customer_Service)
    train_satisfaction = list(sample_train.Satisfaction.astype(str))

    
    input_line_nums.append(list(sample_train.index))
    count_satisfaction.append(train_satisfaction.count('1'))
    print(input_line_nums[i], end='')
    print(', satisfaction count = {}, '.format(count_satisfaction[i]), end='')

    satisfaction_instruction = satisfaction_instruction_head + "\n"
    for input_text_1, input_text_2 in zip(train_comments, train_satisfaction):
        satisfaction_instruction = satisfaction_instruction + "\n" + input_text_1 + "\n" + input_text_2 + "\n" 

    results = []
    for input_text in comments:
        prompt_input = satisfaction_instruction + "\n" + input_text + "\n"
        results.append(prompt.generate(prompt_input, model_id, parameters).replace("\n",""))

    result_f1_scores.append(f1_score(satisfaction, results, average='micro'))
    print('f1_micro_score : ', result_f1_scores[i])
    
    if result_f1_scores[i] == 0.0:
        break
    
result_table['input_line_nums'] = input_line_nums
result_table['satisfaction_nums'] = count_satisfaction
result_table['f1_scores'] = result_f1_scores

result_table.sort_values('f1_scores', ascending=False)

Randomly select 10 training data...[13, 90, 82, 53, 69, 0, 40, 96, 18, 19], satisfaction count = 4, f1_micro_score :  0.8599999999999999
Randomly select 10 training data...[81, 26, 5, 17, 39, 8, 27, 36, 87, 6], satisfaction count = 5, f1_micro_score :  0.8599999999999999
Randomly select 10 training data...[13, 23, 30, 85, 62, 61, 24, 36, 16, 7], satisfaction count = 6, f1_micro_score :  0.8599999999999999
Randomly select 10 training data...[17, 36, 32, 82, 24, 88, 23, 29, 7, 34], satisfaction count = 9, f1_micro_score :  0.8599999999999999
Randomly select 10 training data...[60, 78, 14, 44, 66, 91, 90, 23, 40, 33], satisfaction count = 7, f1_micro_score :  0.8599999999999999
Randomly select 10 training data...[81, 59, 90, 74, 31, 82, 88, 62, 33, 1], satisfaction count = 5, f1_micro_score :  0.88
Randomly select 10 training data...[9, 52, 36, 91, 24, 46, 84, 96, 80, 93], satisfaction count = 6, f1_micro_score :  0.74
Randomly select 10 training data...[80, 11, 40, 49, 32, 26, 42, 59, 31

Unnamed: 0,input_line_nums,satisfaction_nums,f1_scores
8,"[62, 7, 23, 53, 67, 66, 81, 40, 13, 78]",7,0.96
5,"[81, 59, 90, 74, 31, 82, 88, 62, 33, 1]",5,0.88
7,"[80, 11, 40, 49, 32, 26, 42, 59, 31, 51]",5,0.88
0,"[13, 90, 82, 53, 69, 0, 40, 96, 18, 19]",4,0.86
1,"[81, 26, 5, 17, 39, 8, 27, 36, 87, 6]",5,0.86
2,"[13, 23, 30, 85, 62, 61, 24, 36, 16, 7]",6,0.86
3,"[17, 36, 32, 82, 24, 88, 23, 29, 7, 34]",9,0.86
4,"[60, 78, 14, 44, 66, 91, 90, 23, 40, 33]",7,0.86
9,"[27, 21, 61, 73, 42, 10, 23, 63, 48, 90]",5,0.82
6,"[9, 52, 36, 91, 24, 46, 84, 96, 80, 93]",6,0.74


In [463]:
train_data.loc[[62, 7, 23, 53, 67, 66, 81, 40, 13, 78]]

Unnamed: 0,ID,Gender,Status,Children,Age,Customer_Status,Car_Owner,Customer_Service,Satisfaction,Business_Area,Action
62,1672,Male,M,2,52.63,Active,No,they were very nice and willing to help in any...,1,Service: Knowledge,
7,3327,Male,D,0,32.85,Active,No,Its nice when you get in your car and it has a...,1,Product: Functioning,
23,1897,Male,M,1,46.83,Inactive,Yes,long lines waiting for the rental pick.,0,Product: Functioning,On-demand pickup location
53,3716,Male,M,2,61.71,Inactive,Yes,The company was overwhelmed by the number of c...,0,Service: Orders/Contracts,On-demand pickup location
67,45,Male,S,2,53.28,Inactive,No,"It was good, got the car we wanted without muc...",1,Service: Knowledge,
66,0,Male,M,1,35.1,Active,Yes,Guys were extremely nice.,1,Service: Attitude,
81,36,Female,S,1,62.79,Inactive,Yes,They were idiots. The car had problems and th...,0,Product: Functioning,Free Upgrade
40,1757,Male,M,1,47.85,Active,No,they were extremely helpful and did whatever t...,1,Product: Functioning,
13,0,Male,M,1,52.1,Active,Yes,We got our car very quickly.,1,Service: Orders/Contracts,
78,0,Female,S,0,45.0,Active,No,My experience up to date was good. It was in...,1,Product: Pricing and Billing,


### Re-Calculate the F1 micro score

Rerun the evaluation with the best-performing example set to verify the score.

In [465]:
i=0
input_line_nums = []
result_f1_scores = []
count_satisfaction = []

print('Randomly select 10 training data...', end='')
sample_train = train_data.loc[[62, 7, 23, 53, 67, 66, 81, 40, 13, 78]]
train_comments = list(sample_train.Customer_Service)
train_satisfaction = list(sample_train.Satisfaction.astype(str))


input_line_nums.append(list(sample_train.index))
count_satisfaction.append(train_satisfaction.count('1'))
print(input_line_nums[i], end='')
print(', satisfaction count = {}, '.format(count_satisfaction[i]), end='')

satisfaction_instruction = satisfaction_instruction_head + "\n"
for input_text_1, input_text_2 in zip(train_comments, train_satisfaction):
    satisfaction_instruction = satisfaction_instruction + "\n" + input_text_1 + "\n" + input_text_2 + "\n" 

results = []
for input_text in comments:
    prompt_input = satisfaction_instruction + "\n" + input_text + "\n"
    results.append(prompt.generate(prompt_input, model_id, parameters).replace("\n",""))

result_f1_scores.append(f1_score(satisfaction, results, average='micro'))
print('f1_micro_score : ', result_f1_scores[i])

Randomly select 10 training data...[62, 7, 23, 53, 67, 66, 81, 40, 13, 78], satisfaction count = 7, f1_micro_score :  0.96


In [466]:
print('f1_micro_score', f1_score(satisfaction, results, average='micro'))

f1_micro_score 0.96
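
When every record has exactly one true label and one predicted label, micro-averaged F1 pools true positives, false positives, and false negatives across all labels, and the result equals plain accuracy. The following is a minimal pure-Python sketch of that calculation on a small made-up label list; it is an illustration of the metric, not the scikit-learn implementation.

```python
import math


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP, and FN over every label."""
    tp = fp = fn = 0
    for label in set(y_true) | set(y_pred):
        for truth, pred in zip(y_true, y_pred):
            if pred == label and truth == label:
                tp += 1
            elif pred == label:
                fp += 1
            elif truth == label:
                fn += 1
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up labels for illustration only.
demo_true = ["1", "0", "1", "1", "0"]
demo_pred = ["1", "0", "0", "1", "0"]
print(micro_f1(demo_true, demo_pred), accuracy(demo_true, demo_pred))
```

Under this reading, the score of 0.96 reported above corresponds to 96% of the test comments being labeled correctly.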


### **2. Offer Recommendation**

Define instructions for the model to recommend best offer to an unsatisfied customer.

**Note:** Please **start with the [watsonx.ai Prompt Lab](https://dataplatform.cloud.ibm.com/wx/home?context=wx)** to find prompts that give the best results on a small subset of training records (the `train_data` variable). Do not run inference on all of `train_data`, as it will take a long time to get the results. To get a sample from `train_data`, you can use, for example, `train_data.head(n=10)` to get the first 10 records, or `train_data.sample(n=10)` to get 10 random records. Only once you have identified the best-performing prompt, update this notebook to use that prompt and compute the metrics on the test data.

**Action:** Please edit the below cell and add your own prompt here. In the below prompt, we have the instruction (first sentence) and one example included in the prompt.  If you want to change the prompt or add your own examples or more examples, please change the below prompt accordingly.

Filter the training data down to unsatisfied customers so that it can be used to write the instructions.

In [288]:
unsatisfied_train_data = train_data.loc[train_data['Satisfaction'] == 0]
unsatisfied_train_data.head()

Unnamed: 0,ID,Gender,Status,Children,Age,Customer_Status,Car_Owner,Customer_Service,Satisfaction,Business_Area,Action
2,0,Male,M,0,51.0,Inactive,Yes,I do not understand why I have to pay additio...,0,Product: Pricing and Billing,Premium features
3,1085,Female,S,2,42.0,Inactive,No,Based on the customer service personnel I enco...,0,Service: Attitude,On-demand pickup location
4,0,Female,M,2,44.1,Active,No,Provide more convenient car pickup from the ai...,0,Service: Orders/Contracts,On-demand pickup location
6,0,Female,M,2,44.03,Active,No,VERY slow service!,0,Service: Accessibility,Free Upgrade
8,0,Male,S,2,20.4,Inactive,No,They could really try work harder.,0,Service: Attitude,Free Upgrade


#### 2.1. Defining the model parameters
We need to provide a set of model parameters that will influence the result. We will use IBM's Granite model.

In [470]:
parameters = {
    "decoding_method": "greedy",
    "max_new_tokens": 30,
    "min_new_tokens": 1,
    "repetition_penalty": 1
}

model_id = "ibm/granite-13b-instruct-v1"
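
With `max_new_tokens` set to 30, the model may return more than the bare offer name, for example a trailing period or extra words, and the F1 score only counts exact string matches. The following is a minimal sketch of mapping a raw output onto one of the four offer names used in this notebook's recommendation prompt. The helper `match_offer` and the `"unknown"` fallback are assumptions for illustration and are not part of the original notebook.

```python
OFFERS = (
    "On-demand pickup location",
    "Free Upgrade",
    "Voucher",
    "Premium features",
)


def match_offer(raw_output, offers=OFFERS, fallback="unknown"):
    """Return the first known offer name found in the raw model output.

    Hypothetical helper: matching is case-insensitive and by substring,
    so "Free upgrade." still maps to "Free Upgrade".
    """
    text = raw_output.strip().lower()
    for offer in offers:
        if offer.lower() in text:
            return offer
    return fallback


print(match_offer("Free upgrade."))
print(match_offer("no idea"))
```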

#### 2.2. Filter test data for unsatisfied customers

In [471]:
unsatisfied_test_data = test_data.loc[test_data['Satisfaction'] == 0]
unsatisfied_test_data.head()

Unnamed: 0,ID,Gender,Status,Children,Age,Customer_Status,Car_Owner,Customer_Service,Satisfaction,Business_Area,Action
0,2771,Female,M,2,49.99,Inactive,No,"last time I rented a car was at Manchester, NH...",0,Product: Functioning,On-demand pickup location
1,1133,Male,S,1,56.05,Inactive,No,Please lower the prices.,0,Product: Pricing and Billing,Free Upgrade
4,3541,Male,S,1,17.01,Inactive,Yes,"Slow, long lineup",0,Product: Functioning,On-demand pickup location
5,2608,Female,S,0,32.02,Active,No,Customer is important for the enjoyment of the...,0,Product: Functioning,Voucher
7,3382,Male,M,1,52.15,Inactive,No,They should upgrade me every time.,0,Service: Knowledge,Free Upgrade


Analyze the recommended actions for inputs from the test set.

**Note:** Execution of this cell could take several minutes.

#### 2.3. Define the first part of the directive

In [472]:
offer_recommendation_instruction_head = """Generate next best offer to unsatisfied customer. Choose offer recommendation from the following list: 'On-demand pickup location', 'Free Upgrade', 'Voucher', 'Premium features'.
"""

#### 2.4. Find the most effective training data - Step1

At first, we scored prompts built from 8 randomly chosen training records, following the approach used for customer satisfaction, but the results were not good.

So we instead added examples to the instruction one at a time and checked the score after each addition.

In the first pass, index number 97 received the highest score.

In [473]:
for idx, row in unsatisfied_train_data.iterrows():
    
    print("input_text_index: {}, ".format(idx), end='')
    offer_recommendation_instruction = offer_recommendation_instruction_head + "\n" + row['Customer_Service'] + "\n\n" + row['Action'] + "\n\n"

    results = []
    prompt = Prompt(access_token, project_id)
    comments = list(unsatisfied_test_data.Customer_Service)
    offer_recommended = list(unsatisfied_test_data.Action.astype(str))

    for input_text in comments:
        input_text = input_text + '\n'
        res = prompt.generate(" ".join([offer_recommendation_instruction, input_text]), model_id, parameters).replace("\n", "")
        results.append(res)
    
    print('f1_micro_score', f1_score(offer_recommended, results, average='micro'))

input_text_index: 2, f1_micro_score 0.2777777777777778
input_text_index: 3, f1_micro_score 0.2777777777777778
input_text_index: 4, f1_micro_score 0.2222222222222222
input_text_index: 6, f1_micro_score 0.16666666666666666
input_text_index: 8, f1_micro_score 0.16666666666666666
input_text_index: 9, f1_micro_score 0.16666666666666666
input_text_index: 16, f1_micro_score 0.2222222222222222
input_text_index: 18, f1_micro_score 0.2222222222222222
input_text_index: 19, f1_micro_score 0.3333333333333333
input_text_index: 21, f1_micro_score 0.16666666666666666
input_text_index: 23, f1_micro_score 0.2222222222222222
input_text_index: 27, f1_micro_score 0.1111111111111111
input_text_index: 30, f1_micro_score 0.16666666666666666
input_text_index: 31, f1_micro_score 0.2222222222222222
input_text_index: 33, f1_micro_score 0.4444444444444444
input_text_index: 38, f1_micro_score 0.3333333333333333
input_text_index: 42, f1_micro_score 0.2222222222222222
input_text_index: 46, f1_micro_score 0.2777777777

#### 2.5. Find the most effective training data - Step2

The score is measured again as more examples are added to the instruction. The best result was obtained with two examples.
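
Steps 1 and 2 together resemble a greedy forward selection: keep the best example found so far, try each remaining candidate alongside it, and stop once adding another example no longer improves the score. The following is a minimal sketch of that idea under stated assumptions. The function `greedy_select`, the `score_fn` callback, and the toy weights are hypothetical; in the notebook, scoring a candidate set would mean building the prompt and computing the F1 score.

```python
def greedy_select(candidates, score_fn, max_examples):
    """Add one example at a time, keeping the addition that scores best.

    Hypothetical sketch: score_fn takes a list of selected candidates
    and returns a number where higher is better.
    """
    selected = []
    best_score = float("-inf")
    while len(selected) < max_examples:
        round_best = None
        for candidate in candidates:
            if candidate in selected:
                continue
            score = score_fn(selected + [candidate])
            if round_best is None or score > round_best[1]:
                round_best = (candidate, score)
        # Stop when nothing is left or the best addition does not help.
        if round_best is None or round_best[1] <= best_score:
            break
        selected.append(round_best[0])
        best_score = round_best[1]
    return selected, best_score


# Toy scoring function with made-up weights, for illustration only.
toy_weights = {"a": 3, "b": 2, "c": -1}


def toy_score(examples):
    return sum(toy_weights[name] for name in examples)


print(greedy_select(["a", "b", "c"], toy_score, max_examples=3))
```

In the toy run, the third candidate lowers the score, so the selection stops at two examples.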

In [474]:
for idx, row in unsatisfied_train_data.iterrows():
    if idx in [97]:
        continue
        
    offer_recommendation_instruction = offer_recommendation_instruction_head
    
    input_offer = unsatisfied_train_data.loc[[97]]
    
    for iidx, rrow in input_offer.iterrows():
        offer_recommendation_instruction = offer_recommendation_instruction + "\n" + rrow['Customer_Service'] + "\n\n" + rrow['Action'] + "\n\n"
    
    print("input_text_index: {}, ".format(idx), end='')
    offer_recommendation_instruction = offer_recommendation_instruction + "\n" + row['Customer_Service'] + "\n\n" + row['Action'] + "\n\n"

    results = []
    prompt = Prompt(access_token, project_id)
    comments = list(unsatisfied_test_data.Customer_Service)
    offer_recommended = list(unsatisfied_test_data.Action.astype(str))

    for input_text in comments:
        input_text = input_text + '\n'
        res = prompt.generate(" ".join([offer_recommendation_instruction, input_text]), model_id, parameters).replace("\n", "")
        results.append(res)
    
    print('f1_micro_score', f1_score(offer_recommended, results, average='micro'))
    
    

input_text_index: 2, f1_micro_score 0.3333333333333333
input_text_index: 3, f1_micro_score 0.3333333333333333
input_text_index: 4, f1_micro_score 0.3333333333333333
input_text_index: 6, f1_micro_score 0.3333333333333333
input_text_index: 8, f1_micro_score 0.3888888888888889
input_text_index: 9, f1_micro_score 0.3888888888888889
input_text_index: 16, f1_micro_score 0.2777777777777778
input_text_index: 18, f1_micro_score 0.3888888888888889
input_text_index: 19, f1_micro_score 0.3888888888888889
input_text_index: 21, f1_micro_score 0.3888888888888889
input_text_index: 23, f1_micro_score 0.16666666666666666
input_text_index: 27, f1_micro_score 0.3333333333333333
input_text_index: 30, f1_micro_score 0.3333333333333333
input_text_index: 31, f1_micro_score 0.4444444444444444
input_text_index: 33, f1_micro_score 0.2777777777777778
input_text_index: 38, f1_micro_score 0.4444444444444444
input_text_index: 42, f1_micro_score 0.3888888888888889
input_text_index: 46, f1_micro_score 0.27777777777777

### Re-Calculate the F1 micro score

In [475]:
offer_recommendation_instruction = offer_recommendation_instruction_head

input_offer = unsatisfied_train_data.loc[[97, 89]]

for iidx, rrow in input_offer.iterrows():
    offer_recommendation_instruction = offer_recommendation_instruction + "\n" + rrow['Customer_Service'] + "\n\n" + rrow['Action'] + "\n\n"
    
results = []
prompt = Prompt(access_token, project_id)
comments = list(unsatisfied_test_data.Customer_Service)
offer_recommended = list(unsatisfied_test_data.Action.astype(str))

for input_text in comments:
    input_text = input_text + '\n'
    res = prompt.generate(" ".join([offer_recommendation_instruction, input_text]), model_id, parameters).replace("\n", "")
    results.append(res)

In [476]:
from sklearn.metrics import f1_score

print('f1_micro_score', f1_score(offer_recommended, results, average='micro'))

f1_micro_score 0.6111111111111112


---

Copyright © 2023 IBM. This notebook and its source code are released under the terms of the MIT License.