![image](https://raw.githubusercontent.com/IBM/watson-machine-learning-samples/master/cloud/notebooks/headers/watsonx-Prompt_Lab-Notebook.png)
# Use Watsonx to analyze car rental customer satisfaction and offer recommendation.

**Note:** For the watsonx challenge, run these notebooks in IBM Cloud, not on your laptop or desktop.

This notebook contains the steps and code to demonstrate text sentiment analysis in watsonx. It introduces commands for data retrieval, model testing, and scoring.

Some familiarity with Python is helpful. This notebook uses Python 3.10.

<a id="setup"></a>
## Set up the environment

### Install and import the dependencies

**Note:** After running the install cell below, restart the notebook kernel to pick up the proper versions of the installed packages.

In [96]:
!pip install datasets | tail -n 1
!pip install scikit-learn | tail -n 1
!pip install ibm-watson-machine-learning==1.0.312 | tail -n 1



In [61]:
import os, getpass
from pandas import read_csv

### Watsonx API connection
This cell defines the credentials required to work with the watsonx API for foundation model inferencing.

**Action:** Provide the IBM Cloud user API key. For instructions on generating an IBM Cloud API key, see the
[documentation](https://cloud.ibm.com/docs/account?topic=account-userapikey&interface=ui).

In [89]:
from ibm_cloud_sdk_core import IAMTokenManager
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator, BearerTokenAuthenticator
import os, getpass

access_token = IAMTokenManager(
    apikey = getpass.getpass("Please enter your api key (hit enter): "),
    url = "https://iam.cloud.ibm.com/identity/token"
).get_token()

Please enter your api key (hit enter): ········


### Defining the project id
The API requires a project id that provides the context for the call. We will obtain the id from the project in which this notebook runs. When you run the notebook on IBM Cloud, the project in which it runs is saved in the environment variable `PROJECT_ID`.

**Hint**: You can find the `project_id` as follows. Open the prompt lab in watsonx.ai. At the very top of the UI, there will be `Projects / <project name> /`. Click on the `<project name>` link. Then get the `project_id` from Project's Manage tab (Project -> Manage -> General -> Details).


In [90]:
try:
    project_id = os.environ["PROJECT_ID"]
except KeyError:
    project_id = input("Please enter your project_id (hit enter): ")

<a id="data"></a>
## Train/test data loading

Load the train and test datasets. First, use the training dataset (`train_data`) to work with the models and to prepare and tune the prompt. Then use the test dataset (`test_data`) to calculate the metric scores for the selected model, prompts, and parameters.

In [64]:
filename_test = 'https://watsonx-gsi-challenge.s3.jp-tok.cloud-object-storage.appdomain.cloud/track1/test.csv'
filename_train = 'https://watsonx-gsi-challenge.s3.jp-tok.cloud-object-storage.appdomain.cloud/track1/train.csv'

test_data = read_csv(filename_test)
train_data = read_csv(filename_train)

In [65]:
train_data.head()

| Unnamed: 0 | ID | Gender | Status | Children | Age | Customer_Status | Car_Owner | Customer_Service | Satisfaction | Business_Area | Action |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2944 | Female | M | 2 | 41.92 | Active | No | Customer service was friendly and helpful. | 1 | Service: Knowledge | |
| 1 | 1119 | Female | M | 2 | 33.6 | Active | Yes | Customer service was good at MSP airport and t... | 1 | Service: Knowledge | |
| 2 | 0 | Male | M | 0 | 51.0 | Inactive | Yes | I do not understand why I have to pay additio... | 0 | Product: Pricing and Billing | Premium features |
| 3 | 1085 | Female | S | 2 | 42.0 | Inactive | No | Based on the customer service personnel I enco... | 0 | Service: Attitude | On-demand pickup location |
| 4 | 0 | Female | M | 2 | 44.1 | Active | No | Provide more convenient car pickup from the ai... | 0 | Service: Orders/Contracts | On-demand pickup location |


In [66]:
test_data.head()

| Unnamed: 0 | ID | Gender | Status | Children | Age | Customer_Status | Car_Owner | Customer_Service | Satisfaction | Business_Area | Action |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2771 | Female | M | 2 | 49.99 | Inactive | No | last time I rented a car was at Manchester, NH... | 0 | Product: Functioning | On-demand pickup location |
| 1 | 1133 | Male | S | 1 | 56.05 | Inactive | No | Please lower the prices. | 0 | Product: Pricing and Billing | Free Upgrade |
| 2 | 900 | Female | M | 1 | 64.64 | Active | No | Excellent response dealing with child seat. | 1 | Service: Accessibility | |
| 3 | 3795 | Male | M | 0 | 46.51 | Inactive | No | all went quite smoothly... it was Enterprise, ... | 1 | Service: Accessibility | |
| 4 | 3541 | Male | S | 1 | 17.01 | Inactive | Yes | Slow, long lineup | 0 | Product: Functioning | On-demand pickup location |


<a id="models"></a>
## Foundation Models on Watsonx



The code below calls the Watson Machine Learning API to invoke watsonx.ai LLMs.


In [91]:
import requests

class Prompt:
    """Thin wrapper around the watsonx.ai text generation REST endpoint."""

    def __init__(self, access_token, project_id):
        self.access_token = access_token
        self.project_id = project_id

    def generate(self, prompt_text, model_id, parameters):
        """Send one prompt to the model and return the generated text.

        On a non-200 response, the raw response body is returned instead.
        """
        wml_url = "https://us-south.ml.cloud.ibm.com/ml/v1-beta/generation/text?version=2023-05-28"
        headers = {
            "Authorization": "Bearer " + self.access_token,
            "Content-Type": "application/json",
            "Accept": "application/json"
        }
        data = {
            "model_id": model_id,
            "input": prompt_text,
            "parameters": parameters,
            "project_id": self.project_id
        }
        response = requests.post(wml_url, json=data, headers=headers)
        if response.status_code == 200:
            return response.json()["results"][0]["generated_text"]
        else:
            return response.text
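
Because `generate` returns the raw response body on a non-200 status, an error message can end up in the results list and be scored as if it were a prediction. A minimal sketch of one alternative is to separate the two cases and raise on failure. The helper name `extract_generated_text` is hypothetical and not part of any IBM SDK; it only assumes the response shape used above (`results[0]["generated_text"]`).

```python
def extract_generated_text(status_code, payload):
    """Return the generated text, or raise if the call did not succeed.

    `payload` is the decoded JSON body (a dict) of the HTTP response.
    Hypothetical helper, shown for illustration only.
    """
    if status_code != 200:
        raise RuntimeError(
            f"Generation request failed with status {status_code}: {payload}"
        )
    results = payload.get("results") or []
    if not results:
        raise RuntimeError("Response contained no results")
    return results[0]["generated_text"]


# Example with a hand-written payload that mimics a successful response
ok_payload = {"results": [{"generated_text": " 1"}]}
print(repr(extract_generated_text(200, ok_payload)))
```

With this shape, a failed call stops the loop with a visible error instead of silently lowering the score.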

<a id="predict"></a>
## Evaluate the model, prompt and parameters

### **1. Customer satisfaction**

Define instructions for the model to recognize whether a customer was satisfied or unsatisfied.

**Note:** **Start with the [watsonx.ai Prompt Lab](https://dataplatform.cloud.ibm.com/wx/home?context=wx)** to find the prompt that gives the best results on a small subset of the training records (the `train_data` variable). Do not run inference on all of `train_data`, as it takes a long time to get the results. To get a sample from `train_data`, use e.g. `train_data.head(n=10)` for the first 10 records or `train_data.sample(n=10)` for 10 random records. Only once you have identified the best-performing prompt, update this notebook to use it and compute the metrics on the test data.

**Action:** Edit the cell below and add your own prompt. The prompt below consists of an instruction (the first sentence) followed by several examples. To change the instruction or add more examples of your own, edit the prompt accordingly.

In [78]:
satisfaction_instruction = """

Predict the Score based on the User Input. The Score can only be 0 or 1.
 
User Input: I have had a few recent rentals that have taken a very very long time, with no offer of apology.  In the most recent case, the agent subsequently offered me a car type on an upgrade coupon and then told me it was no longer available because it had just be
satisfaction: 0
User Input: I do not understand why I have to pay additional fee if vehicle is returned without a full tank
satisfaction: 0
User Input: happy with the service
satisfaction: 1
User Input: Customer service was good
satisfaction: 1 
User Input: long lines waiting for the rental pick.
satisfaction: 0\n\n
"""

### Defining the model parameters
We need to provide a set of model parameters that will influence the result. We will use IBM's Granite model.

In [83]:
parameters = {
    "decoding_method": "greedy",
    "max_new_tokens": 30,
    "min_new_tokens": 5,
    "repetition_penalty": 1.2
}

model_id = "ibm/granite-13b-instruct-v1"

Analyze the customer satisfaction for inputs from the test set.

**Note:** Execution of this cell could take several minutes.

In [84]:
results = []
prompt = Prompt(access_token, project_id)
comments = list(test_data.Customer_Service)
satisfaction = list(test_data.Satisfaction.astype(str))

for input_text in comments:
    results.append(prompt.generate(" ".join([satisfaction_instruction, input_text]), model_id, parameters))
    
# Convert satisfaction labels to strings
satisfaction_labels = test_data.Satisfaction.astype(str)
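
The scoring step compares strings exactly, so a raw generation such as `" 1\n"` or `"satisfaction: 0"` does not match the label `"1"` or `"0"`. A minimal sketch of a post-processing step, assuming a hypothetical helper `extract_binary_label` and an arbitrary fallback value of `"unknown"` for outputs that contain no label:

```python
import re


def extract_binary_label(generated_text, fallback="unknown"):
    """Return the first standalone 0 or 1 found in the text, as a string."""
    match = re.search(r"\b([01])\b", str(generated_text))
    return match.group(1) if match else fallback


# Hand-written examples of raw model output
raw_outputs = [" 1\n", "satisfaction: 0", "They are very expensive."]
print([extract_binary_label(text) for text in raw_outputs])
```

Applying such a function to every element of `results` before scoring keeps formatting noise from counting as a wrong prediction, while genuinely unusable outputs still count against the score.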
 


### Calculate the F1 macro score

In [85]:
from sklearn.metrics import f1_score

f1_macro = f1_score(satisfaction_labels, results, average='macro')
print('f1_macro_score:', f1_macro)

f1_macro_score: 0.0
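
F1 is computed from exact matches between true and predicted labels. The macro average is the unweighted mean of the per-class F1 scores, while the micro average pools the counts across classes first. The pure-Python sketch below illustrates both on toy data; the helper `f1_micro_macro` is hypothetical and written for illustration, not as a replacement for `sklearn.metrics.f1_score`.

```python
def f1_micro_macro(y_true, y_pred):
    """Return (micro, macro) F1 for single-label string predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp = {label: 0 for label in labels}
    fp = {label: 0 for label in labels}
    fn = {label: 0 for label in labels}
    for true, pred in zip(y_true, y_pred):
        if true == pred:
            tp[true] += 1
        else:
            fp[pred] += 1  # predicted label was wrong
            fn[true] += 1  # true label was missed

    def f1(t, p, n):
        denom = 2 * t + p + n
        return 2 * t / denom if denom else 0.0

    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return micro, macro


# Toy labels: three of four predictions are correct
print(f1_micro_macro(["0", "0", "1", "1"], ["0", "1", "1", "1"]))

# Free-text predictions never equal a label, so both scores are zero
print(f1_micro_macro(["0", "1"], ["They are very expensive.", "Slow"]))
```

The second call shows why unparsed generated text yields a score of 0.0 regardless of how sensible the text is.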


### **2. Offer Recommendation**

Define instructions for the model to recommend the best offer for an unsatisfied customer.

**Note:** **Start with the [watsonx.ai Prompt Lab](https://dataplatform.cloud.ibm.com/wx/home?context=wx)** to find the prompt that gives the best results on a small subset of the training records (the `train_data` variable). Do not run inference on all of `train_data`, as it takes a long time to get the results. To get a sample from `train_data`, use e.g. `train_data.head(n=10)` for the first 10 records or `train_data.sample(n=10)` for 10 random records. Only once you have identified the best-performing prompt, update this notebook to use it and compute the metrics on the test data.

**Action:** Edit the cell below and add your own prompt. The prompt below consists of an instruction (the first sentence) followed by one example. To change the instruction or add more examples of your own, edit the prompt accordingly.

In [73]:
offer_recommendation_instruction = """
   Generate next best offer to unsatisfied customer. Choose offer recommendation from the following list: 'On-demand pickup location', 'Free Upgrade', 'Voucher', 'Premium features'.\n

   comment: The company was overwhelmed by the number of customers verse the number of available agents and they were not articulating their situation to the customers well enough. I think we waited for almost 3 hours just to get a rental car. It was ridiculous.\n
   offer recommended: 'On-demand pickup location'\n\n
"""

### Defining the model parameters
We need to provide a set of model parameters that will influence the result. We will use IBM's Granite model.

In [74]:
parameters = {
    "decoding_method": "greedy",
    "max_new_tokens": 30,
    "min_new_tokens": 1,
    "repetition_penalty": 1
}

model_id = "ibm/granite-13b-instruct-v1"

Filter the test data for unsatisfied customers.

In [75]:
unsatisfied_test_data = test_data.loc[test_data['Satisfaction'] == 0]
unsatisfied_test_data.head()

| Unnamed: 0 | ID | Gender | Status | Children | Age | Customer_Status | Car_Owner | Customer_Service | Satisfaction | Business_Area | Action |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2771 | Female | M | 2 | 49.99 | Inactive | No | last time I rented a car was at Manchester, NH... | 0 | Product: Functioning | On-demand pickup location |
| 1 | 1133 | Male | S | 1 | 56.05 | Inactive | No | Please lower the prices. | 0 | Product: Pricing and Billing | Free Upgrade |
| 4 | 3541 | Male | S | 1 | 17.01 | Inactive | Yes | Slow, long lineup | 0 | Product: Functioning | On-demand pickup location |
| 5 | 2608 | Female | S | 0 | 32.02 | Active | No | Customer is important for the enjoyment of the... | 0 | Product: Functioning | Voucher |
| 7 | 3382 | Male | M | 1 | 52.15 | Inactive | No | They should upgrade me every time. | 0 | Service: Knowledge | Free Upgrade |


Analyze the recommended actions for inputs from the test set.

**Note:** Execution of this cell could take several minutes.

In [76]:
results = []
prompt = Prompt(access_token, project_id)
comments = list(unsatisfied_test_data.Customer_Service)
offer_recommended = list(unsatisfied_test_data.Action.astype(str))

for input_text in comments:
    results.append(prompt.generate(" ".join([offer_recommendation_instruction, input_text]), model_id, parameters))

### Calculate the F1 micro score

In [77]:
from sklearn.metrics import f1_score

print('f1_micro_score', f1_score(offer_recommended, results, average='micro'))

f1_micro_score 0.0
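
The expected labels are exactly the four offer names, so a generation only counts as correct when it equals one of them character for character. A minimal sketch of mapping raw output onto the allowed offers before scoring, assuming a hypothetical helper `extract_offer` and an arbitrary fallback of `"unknown"`:

```python
# The four offers listed in the instruction above
OFFERS = [
    "On-demand pickup location",
    "Free Upgrade",
    "Voucher",
    "Premium features",
]


def extract_offer(generated_text, fallback="unknown"):
    """Return the allowed offer mentioned earliest in the text, ignoring case."""
    text = str(generated_text).lower()
    positions = [(text.find(offer.lower()), offer) for offer in OFFERS]
    found = [(pos, offer) for pos, offer in positions if pos >= 0]
    # min() picks the tuple with the smallest position, i.e. the first mention
    return min(found)[1] if found else fallback


# Hand-written examples of raw model output
print(extract_offer(" 'Free Upgrade'\n\ncomment: the car was late"))
print(extract_offer("They are very expensive."))
```

Outputs that mention no offer stay wrong after this step, which keeps the metric honest about prompts that make the model continue the comment instead of answering.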


---

Copyright © 2023 IBM. This notebook and its source code are released under the terms of the MIT License.

In [103]:
offer_recommendation_instruction = """
Generate next best offer for unsatisfied customers. Choose an offer recommendation from the following options: 'On-demand pickup location', 'Free Upgrade', 'Voucher', 'Premium features'.\n

comment: The company was overwhelmed by the number of customers compared to available agents. We waited for almost 3 hours for a rental car. It was ridiculous.
offer recommended: 'On-demand pickup location'\n\n

comment: The service was extremely slow, and the price was too high for what was offered.
offer recommended: 'Free Upgrade'\n\n

comment: The vehicle provided was not the one I reserved, causing inconvenience during the trip.
offer recommended: 'On-demand pickup location'\n\n

comment: The process of renting was confusing, and I faced additional charges that were unexpected.
offer recommended: 'Voucher'\n\n

comment: I had to wait for a long time for assistance, and the staff seemed untrained.
offer recommended: 'Free Upgrade'\n\n

# Additional examples can be added here...
"""


parameters = {
    "decoding_method": "greedy",
    "max_new_tokens": 50,
    "min_new_tokens": 5,
    "repetition_penalty": 1.2
}

unsatisfied_test_data = test_data.loc[test_data['Satisfaction'] == 0]
unsatisfied_test_data.head()

# Generate recommendations for unsatisfied customers
results = []
prompt = Prompt(access_token, project_id)
comments = list(unsatisfied_test_data.Customer_Service)
offer_recommended = list(unsatisfied_test_data.Action.astype(str))

for idx, input_text in enumerate(comments):
    generated_text = prompt.generate(" ".join([offer_recommendation_instruction, input_text]), model_id, parameters)
    results.append(generated_text)
    # Print a few examples for manual inspection
    if idx < 5:
        print(f"Generated: {generated_text}\nExpected: {offer_recommended[idx]}\n")

# Calculate F1 score
from sklearn.metrics import f1_score

f1_micro = f1_score(offer_recommended, results, average='micro')
print('F1 Micro Score:', f1_micro)


Generated:  so I used Budget in Concord which is 30 min away
Expected: On-demand pickup location

Generated:  They are very expensive.
Expected: Free Upgrade

Generated: , rude staff, etc
Expected: On-demand pickup location

Generated:  mantra when designing their processes.
Expected: Voucher

Generated:  I don't want to have to keep asking for better treatment.
Expected: Free Upgrade

F1 Micro Score: 0.0


In [None]:
# Your previous code for sentiment prediction (satisfaction) with slight modifications
# ...

# New code for offer recommendation with additional examples

# Modified offer recommendation instruction with additional examples
offer_recommendation_instruction = """
Generate next best offer for unsatisfied customers. Choose an offer recommendation from the following options: 'On-demand pickup location', 'Free Upgrade', 'Voucher', 'Premium features'.\n

# Previous examples...

# Additional examples for improvement:
comment: The pickup location was far away, inconveniencing my travel plans.
offer recommended: 'On-demand pickup location'\n\n

comment: The vehicle condition was poor, and it wasn't worth the price paid.
offer recommended: 'Free Upgrade'\n\n

comment: Misleading pricing information led to unexpected charges.
offer recommended: 'Voucher'\n\n

comment: Limited vehicle options were available despite prior reservations.
offer recommended: 'Premium features'\n\n

comment: Lack of staff knowledge resulted in a subpar service experience.
offer recommended: 'Free Upgrade'\n\n
"""

# Load data (replace with your dataset paths)
filename_test = 'https://watsonx-gsi-challenge.s3.jp-tok.cloud-object-storage.appdomain.cloud/track1/test.csv'
filename_train = 'https://watsonx-gsi-challenge.s3.jp-tok.cloud-object-storage.appdomain.cloud/track1/train.csv'


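# Imports needed by this cell
import pandas as pd
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Read train and test data
train_data = pd.read_csv(filename_train)
test_data = pd.read_csv(filename_test)

# Keep only unsatisfied customers; in the samples shown earlier,
# satisfied rows carry no Action label
unsatisfied_train_data = train_data.loc[train_data['Satisfaction'] == 0]

# Split data into features and labels
X = unsatisfied_train_data['Customer_Service']
y = unsatisfied_train_data['Action'].astype(str)

# Split into train and validation sets
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)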

# Model parameters for text generation
parameters = {
    "decoding_method": "greedy",
    "max_new_tokens": 50,
    "min_new_tokens": 5,
    "repetition_penalty": 1.2
}

# Generate recommendations for unsatisfied customers in validation set
results = []
prompt = Prompt(access_token, project_id)

for idx, input_text in enumerate(X_val):
    generated_text = prompt.generate(" ".join([offer_recommendation_instruction, input_text]), model_id, parameters)
    results.append(generated_text)

    # Print a few examples for manual inspection
    if idx < 5:
        print(f"Generated: {generated_text}\nExpected: {y_val.iloc[idx]}\n")

# Calculate F1 score
f1_micro = f1_score(y_val, results, average='micro')
print('F1 Micro Score:', f1_micro)


In [108]:
filename_test = 'https://watsonx-gsi-challenge.s3.jp-tok.cloud-object-storage.appdomain.cloud/track1/test.csv'
filename_train = 'https://watsonx-gsi-challenge.s3.jp-tok.cloud-object-storage.appdomain.cloud/track1/train.csv'

test_data = read_csv(filename_test)
train_data = read_csv(filename_train)

satisfaction_instruction = """
Predict the satisfaction based on the User Input. The Score can only be 0 or 1.
 
User Input: I have had a few recent rentals that have taken a very very long time, with no offer of apology.  In the most recent case, the agent subsequently offered me a car type on an upgrade coupon and then told me it was no longer available because it had just be
satisfaction: 0
User Input: I do not understand why I have to pay additional fee if vehicle is returned without a full tank
satisfaction: 0
User Input: happy with the service
satisfaction: 1
User Input: Customer service was good
satisfaction: 1 
User Input: long lines waiting for the rental pick.
satisfaction: 0\n\n
"""

parameters = {
    "decoding_method": "greedy",
    "max_new_tokens": 10,
    "min_new_tokens": 1,
    "repetition_penalty": 1
}

model_id = "ibm/granite-13b-instruct-v1"

results = []
prompt = Prompt(access_token, project_id)
comments = list(test_data.Customer_Service)
satisfaction = list(test_data.Satisfaction.astype(str))

for input_text in comments:
    results.append(prompt.generate(" ".join([satisfaction_instruction, input_text]), model_id, parameters))

#print(results)

from sklearn.metrics import f1_score
#print('f1_micro_score', f1_score(satisfaction, results, average='micro'))



from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.metrics import f1_score, make_scorer
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler

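from sklearn.datasets import make_classification

# Synthetic classification data (not the car rental dataset), used only to
# illustrate cross-validated F1 scoring with scikit-learn
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)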



# Example preprocessing steps (replace this with your data preprocessing steps)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)


model = RandomForestClassifier(n_estimators=100, random_state=42)


f1_macro_scorer = make_scorer(f1_score, average='macro')
cv_scores = cross_val_score(model, X_train_scaled, y_train, cv=5, scoring=f1_macro_scorer)
mean_cv_score = cv_scores.mean()

# Fit the model on the entire training set
model.fit(X_train_scaled, y_train)

# Predict on the test set
y_pred = model.predict(X_test_scaled)

# Calculate F1 macro score on the test set
f1_macro_test = f1_score(y_test, y_pred, average='macro')

print(f"Mean CV F1 Macro Score: {mean_cv_score}")
print(f"Test F1 Macro Score: {f1_macro_test}")




Mean CV F1 Macro Score: 0.891163256605668
Test F1 Macro Score: 0.8999599839935974
