# Lab 3: Intro to Using IBM GenAI Python Library

Welcome to Lab 3.

In the previous lab, we explored the challenges of prompt engineering: learning how to tweak our wording, choose a different model, and optimize model parameters. Minor changes can significantly improve the results generated by language models.

In this lab, we will apply our new knowledge to a real-world use case as we migrate from prompt engineering within the Watsonx.ai Workbench to coding prompts in Python. Using the [IBM Watson Machine Learning Python library](https://ibm.github.io/watson-machine-learning-sdk/) to interact programmatically with Watsonx.ai, we will use templates to streamline our interaction with the language model and get more out of it.

The concept of Prompt Templates provided by the LangChain Python library allows you to construct templates that can be easily filled with specific information to generate a wide range of outputs. 

## Recreating Prompt Builder Prompts Using GenAI Prompt Patterns

### Scenario: Personalized Recommendation for XYZ Retail Company <a id="step3"></a>

XYZ Retail is a popular online retail store that sells a wide range of products, including electronics, clothing, home goods, and more. They have a large customer base and want to provide a personalized shopping experience to enhance customer satisfaction and boost sales.

To achieve that goal, XYZ wants to leverage generative AI to create fact sheets about each of their customers. These fact sheets will summarize relevant information such as customer demographics (name, age, location), and purchase history. These fact sheets will help XYZ Retail's sales team build stronger customer relationships, increase customer satisfaction and drive repeat purchases.


You start by performing prompt engineering in Prompt Lab, and you might test base model output with an initial prompt like this:

![title](./images/prompt_without_example.png)

The model's recommendation is neither accurate nor useful: the customer, Michael Jones, had bought toys and games, not outdoor activewear. Fortunately, you learned in the Prompt Engineering lab that few-shot learning can help you obtain better results.

What happens when we provide a few examples using Prompt Builder to guide the LLM into generating more meaningful recommendations?

![title](./images/prompt_with_example.png)


Great, the product recommendation for Michael Jones is much better. However, how do you productionize your few-shot prompting to generate recommendations for all of XYZ Retail's customers? Copying and pasting each customer's info into Prompt Builder would take too long.

You'll need a programmatic solution. Maybe you could even generate a large set of examples and then use it for tuning a model in Watsonx.ai. But we're getting ahead of ourselves: you'll learn about building a Prompt Tuning dataset in a later lab.

Here is what you will learn in the following steps:

<p align="center">
  <img src="./images/scenario_flow_chart03.png" width="600"/>
</p>

## 1. Load the required libraries  <a id="step1"></a>

In [42]:
import os
from dotenv import load_dotenv
import pandas as pd

from langchain import PromptTemplate, FewShotPromptTemplate
from ibm_watson_machine_learning.foundation_models import Model
from ibm_watson_machine_learning.metanames import GenTextParamsMetaNames as GenParams



## 2. Create a Factsheet for each customer using Prompt Patterns  <a id="step2"></a>

### **2.1 What are Prompt Patterns?**

The [PromptTemplate class](https://python.langchain.com/docs/modules/model_io/prompts/prompt_templates/) in the [LangChain Python library](https://python.langchain.com/docs/get_started/introduction) provides a flexible approach to creating prompts from structured templates. We will use the PromptTemplate class to simplify the creation of our few-shot prompts for XYZ Retail.

XYZ Retail has provided you with their customers' data in .csv format. To generate prompts for each customer, you will need to transform the prompt that you engineered in Prompt Builder into a more useful programmatic format. Using the PromptTemplate class, you can easily substitute customer data from a file to generate one or multiple prompts.

The PromptTemplate class defines a schema where variables to replace are placed inside curly braces "{}". These curly braces serve as a placeholder for the actual data that will be substituted into the template.
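
As a minimal sketch of the placeholder mechanism, the snippet below uses only Python's standard library rather than LangChain: `string.Formatter` lists the variable names found in a template string, and `str.format` fills them in. It assumes the template uses the single-curly-brace syntax described above; the helper name `placeholder_names` is hypothetical and not part of any library.

```python
from string import Formatter

# Single-curly-brace placeholders, the same style this lab's templates use.
pattern = "input: {name} {family_name} is {age} and lives in {location}. They bought {purchase_history}"


def placeholder_names(template):
    """Return the placeholder names in a str.format-style template, in order of appearance."""
    # Formatter().parse yields (literal_text, field_name, format_spec, conversion) tuples;
    # field_name is None for trailing literal text, so those entries are skipped.
    return [field for _, field, _, _ in Formatter().parse(template) if field]


variables = placeholder_names(pattern)
print(variables)

# Every placeholder must receive a value; a missing one raises KeyError.
filled = pattern.format(
    name="Jane",
    family_name="Doe",
    age=43,
    location="San Francisco, CA",
    purchase_history="groceries, household goods and travel supplies",
)
print(filled)
```

Listing the placeholders first is a cheap way to check that a data source supplies every variable before any prompt is generated.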

Let's see how this works in practice.

### **2.2 Creating a simple prompt from a template**

A prompt template can be created using the PromptTemplate class from a string or file. There are [additional PromptTemplate examples](https://api.python.langchain.com/en/latest/prompts/langchain.prompts.prompt.PromptTemplate.html#langchain.prompts.prompt.PromptTemplate) provided in the LangChain documentation.

#### 2.2.1 Prompt Template From String

In [3]:
pattern = "input: {name} {family_name} is {age} and lives in {location}. They bought {purchase_history}"


prompt = PromptTemplate.from_template(pattern)
prompt = prompt.format(name="Jane", 
                       family_name="Doe",
                       age=43,
                       location="San Francisco, CA",
                       purchase_history = "groceries, household goods and travel supplies")

prompt

'input: Jane Doe is 43 and lives in San Francisco, CA. They bought groceries, household goods and travel supplies'

#### 2.2.2 Prompt Template From File
Prompt templates can also be stored in a text file:

In [17]:
# _path_to_file = "./templates/customer_factsheet.yaml"

# prompt = PromptPattern.from_file(_path_to_file)
# print("TEMPLATE:\n" + str(prompt))

# # This template can now be populated by iterating over an array:
# names= ["Jane", "Siamak", "Luis"]
# family_names = ["Doe", "Baharoo", "Cooli"]
# ages = [43, 57, 21]
# cities = ["San Francisco", "Chicago", "New York City"]
# states= ["CA", "IL", "NY"]
# purchase_histories= ["groceries, household goods and travel supplies", "Books electronics home_goods", "Clothing shoes cosmetics"]
# recommendation_1s= ["Basket of organic fruits", 
#                     "Kindle Paperwhite - This e-reader is perfect for book lovers who want a lightweight and portable device that can hold thousands of books. It has a glare-free display and a long battery life, so you can read for hours on end without having to worry about running out of power.", 
#                     "Aritzia Wilfred Free Sweater - This soft and cozy sweater is perfect for a casual day out. It's available in a variety of colors, so you can find the perfect one to match your style."]
# recommendation_2s= ["Lightweight carry-on suitcase", 
#                     "Google Home Mini - This smart speaker is perfect for controlling your home's smart devices with your voice. You can use it to play music, set alarms, get news, and more. It's also a great way to stay connected with friends and family.", 
#                     "Steve Madden Pointed Toe Pumps - These stylish pumps are perfect for a night out on the town. They're comfortable and versatile, so you can wear them with a variety of outfits."]
# for x in range(1, 4):
#     prompt.sub(f"name_{x}", names[x-1]).sub(f"family_name_{x}", family_names[x-1]).sub(f"age_{x}", str(ages[x-1])).sub(f"city_{x}", cities[x-1]).sub(f"state_{x}", states[x-1])
#     prompt.sub(f"purchase_history_{x}", purchase_histories[x-1])
#     prompt.sub(f"recommendation_1_{x}", recommendation_1s[x-1])
#     prompt.sub(f"recommendation_2_{x}", recommendation_2s[x-1])
# print("\nPOPULATED TEMPLATE:\n" + str(prompt))

TEMPLATE:
input: "{{name_1}} {{family_name_1}} is {{age_1}} years old and lives in {{city_1}}, {{state_1}}. Their purchase history includes {{purchase_history_1}}."
output: "Recommendations:\n Item 1: {{recommendation_1_1}}\nItem 2: {{recommendation_2_1}}"
input: "{{name_2}} {{family_name_2}} is {{age_2}} years old and lives in {{city_2}}, {{state_2}}. Their purchase history includes {{purchase_history_2}}."
output: "Recommendations:\n Item 1: {{recommendation_1_2}}\nItem 2: {{recommendation_2_2}}"
input: "{{name_3}} {{family_name_3}} is {{age_3}} years old and lives in {{city_3}}, {{state_3}}. Their purchase history includes {{purchase_history_3}}."
output: ""


POPULATED TEMPLATE:
input: "Jane Doe is 43 years old and lives in San Francisco, CA. Their purchase history includes groceries, household goods and travel supplies."
output: "Recommendations:\n Item 1: Basket of organic fruits\nItem 2: Lightweight carry-on suitcase"
input: "Siamak Baharoo is 57 years old and lives in Chicago, 

In [28]:
# We create a template from a file:
_path_to_file = "./templates/customer_factsheet_lang.txt"

# Build the variable names in the same order as before: name_1..3, family_name_1..3, ...
fields = ["name", "family_name", "age", "city", "state", "purchase_history"]
input_variables = [f"{field}_{i}" for field in fields for i in range(1, 4)]
# Only the first two customers carry recommendations (the few-shot examples)
input_variables += [f"recommendation_{j}_{i}" for i in range(1, 3) for j in range(1, 3)]

prompt = PromptTemplate.from_file(_path_to_file, input_variables=input_variables)
print(prompt.template)

input: "{name_1} {family_name_1} is {age_1} years old and lives in {city_1}, {state_1}. Their purchase history includes {purchase_history_1}."
output: "Recommendations:\n Item 1: {recommendation_1_1}\nItem 2: {recommendation_2_1}"
input: "{name_2} {family_name_2} is {age_2} years old and lives in {city_2}, {state_2}. Their purchase history includes {purchase_history_2}."
output: "Recommendations:\n Item 1: {recommendation_1_2}\nItem 2: {recommendation_2_2}"
input: "{name_3} {family_name_3} is {age_3} years old and lives in {city_3}, {state_3}. Their purchase history includes {purchase_history_3}."
output: ""



In [40]:
# This template can now be populated by iterating over an array:

val_arrays = {"name" : ["Jane", "Siamak", "Luis"],
              "family_name" : ["Doe", "Baharoo", "Cooli"],
              "age" : [43, 57, 21],
              "city" : ["San Francisco", "Chicago", "New York City"],
              "state" : ["CA", "IL", "NY"],
              "purchase_history" : ["groceries, household goods and travel supplies", "Books electronics home_goods", "Clothing shoes cosmetics"],
              "recommendation_1" : ["Basket of organic fruits", 
                            "Kindle Paperwhite - This e-reader is perfect for book lovers who want a lightweight and portable device that can hold thousands of books. It has a glare-free display and a long battery life, so you can read for hours on end without having to worry about running out of power."],
              "recommendation_2" : ["Lightweight carry-on suitcase", 
                            "Google Home Mini - This smart speaker is perfect for controlling your home's smart devices with your voice. You can use it to play music, set alarms, get news, and more. It's also a great way to stay connected with friends and family."]}

# Map each field name to the numbered variable names used in the template, e.g. name_1, name_2, ...
values_dicts = [{f"{key}_{i+1}": value for i, value in enumerate(values)} for key, values in val_arrays.items()]
# Merge the per-field dictionaries into a single dictionary
values_dict = {k: v for d in values_dicts for k, v in d.items()}
print(values_dict)

prompt = prompt.format(**values_dict)
print("\nPOPULATED TEMPLATE:\n" + str(prompt))

{'name_1': 'Jane', 'name_2': 'Siamak', 'name_3': 'Luis', 'family_name_1': 'Doe', 'family_name_2': 'Baharoo', 'family_name_3': 'Cooli', 'age_1': 43, 'age_2': 57, 'age_3': 21, 'city_1': 'San Francisco', 'city_2': 'Chicago', 'city_3': 'New York City', 'state_1': 'CA', 'state_2': 'IL', 'state_3': 'NY', 'purchase_history_1': 'groceries, household goods and travel supplies', 'purchase_history_2': 'Books electronics home_goods', 'purchase_history_3': 'Clothing shoes cosmetics', 'recommendation_1_1': 'Basket of organic fruits', 'recommendation_1_2': 'Kindle Paperwhite - This e-reader is perfect for book lovers who want a lightweight and portable device that can hold thousands of books. It has a glare-free display and a long battery life, so you can read for hours on end without having to worry about running out of power.', 'recommendation_2_1': 'Lightweight carry-on suitcase', 'recommendation_2_2': "Google Home Mini - This smart speaker is perfect for controlling your home's smart devices with
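
The mapping step above can be hard to read as a nested comprehension. The sketch below spells out the same idea with the standard library only and adds a check for template variables that have no value yet. The helper names `suffix_keys` and `missing_variables`, the shortened template, and the sample data are all hypothetical and chosen for illustration.

```python
from string import Formatter


def suffix_keys(columns):
    """Map {"name": ["Jane", "Siamak"]} to {"name_1": "Jane", "name_2": "Siamak"}."""
    flat = {}
    for key, values in columns.items():
        for position, value in enumerate(values, start=1):
            flat[f"{key}_{position}"] = value
    return flat


def missing_variables(template, values):
    """Return the sorted template placeholders that have no entry in values."""
    needed = {field for _, field, _, _ in Formatter().parse(template) if field}
    return sorted(needed - set(values))


# Shortened, hypothetical template and data in the same shape as the cell above.
template = (
    'input: "{name_1} bought {purchase_history_1}."\n'
    'output: "{recommendation_1_1}"\n'
    'input: "{name_2} bought {purchase_history_2}."\n'
    'output: ""'
)
columns = {
    "name": ["Jane", "Siamak"],
    "purchase_history": ["groceries", "Books electronics home_goods"],
    "recommendation_1": ["Basket of organic fruits"],
}

values = suffix_keys(columns)
# An empty list means every placeholder has a value and format() will not raise KeyError.
print(missing_variables(template, values))
print(template.format(**values))
```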

In [None]:
# We create a template from a file:
_path_to_file = "./templates/customer_factsheet_lang_simple.txt"

## 3. Create Prompt Examples based on Customers Factsheet <a id="step3"></a>
The value of PromptTemplate becomes clear when you generate a large number of prompts, either as examples for bulk evaluation of an engineered prompt or to create a tuning dataset.

### 3.1 Bulk Creation of Prompts
We can now generate few-shot prompts from rows in a CSV file. The IBM Generative AI library's [PromptPattern class](https://ibm.github.io/ibm-generative-ai/rst_source/genai.prompt.prompt_pattern.html) provides "sub_all_from_csv" for this purpose; in this notebook we instead flatten the CSV rows into a dictionary and pass it to a LangChain PromptTemplate. The same could also be done from JSON.



In [74]:
examples = [
    {
        "name":"Jane", 
        "family_name":"Doe", 
        "age":43, 
        "city":"San Francisco", 
        "state":"CA",
        "purchase_history":"groceries, household goods and travel supplies", 
        "recommendation_1":"Basket of organic fruits",
        "recommendation_2":"Lightweight carry-on suitcase"
    },{
        "name":"Siamak", 
        "family_name":"Baharoo", 
        "age":57, 
        "city":"Chicago", 
        "state":"IL",
        "purchase_history":"Books electronics home_goods", 
        "recommendation_1":"Kindle Paperwhite - This e-reader is perfect for book lovers who want a lightweight and portable device that can hold thousands of books. It has a glare-free display and a long battery life, so you can read for hours on end without having to worry about running out of power.",
        "recommendation_2": "Google Home Mini - This smart speaker is perfect for controlling your home's smart devices with your voice. You can use it to play music, set alarms, get news, and more. It's also a great way to stay connected with friends and family."
    },{
        "name":"Luis", 
        "family_name":"Cooli", 
        "age":21, 
        "city":"New York City", 
        "state":"NY",
        "purchase_history":"Clothing shoes cosmetics", 
        "recommendation_1":"Aritzia Wilfred Free Sweater - This soft and cozy sweater is perfect for a casual day out. It's available in a variety of colors, so you can find the perfect one to match your style.",
        "recommendation_2":"Steve Madden Pointed Toe Pumps - These stylish pumps are perfect for a night out on the town. They're comfortable and versatile, so you can wear them with a variety of outfits."
    }
]

In [75]:
example_prompt = PromptTemplate.from_file("./templates/customer_factsheet_lang_simple.txt",
                                           input_variables=["name", "family_name", "age", "city", "state",
                                                             "purchase_history", "recommendation_1","recommendation_2"])


In [76]:
print(example_prompt.format(**examples[0]))

input: "Jane Doe is 43 years old and lives in San Francisco, CA. Their purchase history includes groceries, household goods and travel supplies."
output: "Recommendations:\n Item 1: Basket of organic fruits\nItem 2: Lightweight carry-on suitcase"
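
The single-customer template above can be combined with the `examples` list to assemble a complete few-shot prompt. The sketch below does this with plain `str.format` and `str.join` from the standard library rather than a LangChain class (LangChain's `FewShotPromptTemplate`, imported in step 1, is designed for the same job). The inline `example_template` is an assumption about the contents of `customer_factsheet_lang_simple.txt`, inferred from the printed output; `build_few_shot_prompt` and the shortened records are hypothetical.

```python
# Assumed contents of the single-customer template file, inferred from the printed output.
# "\\n" keeps the backslash-n sequence literal, as it appears in that output.
example_template = (
    'input: "{name} {family_name} is {age} years old and lives in {city}, {state}. '
    'Their purchase history includes {purchase_history}."\n'
    'output: "Recommendations:\\n Item 1: {recommendation_1}\\nItem 2: {recommendation_2}"'
)
# The query block reuses the input line but leaves the output empty for the model to complete.
query_template = example_template.split("\n")[0] + '\noutput: ""'


def build_few_shot_prompt(shots, query):
    """Join fully worked examples with a final input whose output is left blank."""
    blocks = [example_template.format(**shot) for shot in shots]
    blocks.append(query_template.format(**query))
    return "\n".join(blocks)


shots = [
    {"name": "Jane", "family_name": "Doe", "age": 43, "city": "San Francisco", "state": "CA",
     "purchase_history": "groceries, household goods and travel supplies",
     "recommendation_1": "Basket of organic fruits",
     "recommendation_2": "Lightweight carry-on suitcase"},
]
query = {"name": "Luis", "family_name": "Cooli", "age": 21, "city": "New York City", "state": "NY",
         "purchase_history": "Clothing shoes cosmetics"}

few_shot_prompt = build_few_shot_prompt(shots, query)
print(few_shot_prompt)
```

Keeping one small template per example, instead of one large template with numbered variables, means the number of shots can change without editing the template.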



In [31]:
_path_to_template_file = "./templates/customer_factsheet_lang.txt"

# create the prompt template
# Build the variable names in the same order as before: name_1..3, family_name_1..3, ...
fields = ["name", "family_name", "age", "city", "state", "purchase_history"]
input_variables = [f"{field}_{i}" for field in fields for i in range(1, 4)]
# Only the first two customers carry recommendations (the few-shot examples)
input_variables += [f"recommendation_{j}_{i}" for i in range(1, 3) for j in range(1, 3)]

prompt = PromptTemplate.from_file(_path_to_template_file, input_variables=input_variables)
print(prompt.template)


input: "{name_1} {family_name_1} is {age_1} years old and lives in {city_1}, {state_1}. Their purchase history includes {purchase_history_1}."
output: "Recommendations:\n Item 1: {recommendation_1_1}\nItem 2: {recommendation_2_1}"
input: "{name_2} {family_name_2} is {age_2} years old and lives in {city_2}, {state_2}. Their purchase history includes {purchase_history_2}."
output: "Recommendations:\n Item 1: {recommendation_1_2}\nItem 2: {recommendation_2_2}"
input: "{name_3} {family_name_3} is {age_3} years old and lives in {city_3}, {state_3}. Their purchase history includes {purchase_history_3}."
output: ""



In [61]:
import csv

_path_to_csv_file = "./data/customer_factsheet.csv"

# We use a csv reader that flattens the csv file to map the variables to the relevant values.
# The recommendations in rows 1 and 2 are used as few-shot examples, while the 3rd row's
# recommendations are ignored, as these are the recommendations against which we evaluate
# our prompt's performance.
def flatten_csv(file_path):
    output_dict = {}
    with open(file_path, 'r', newline='') as file:
        reader = csv.DictReader(file)
        # Number rows from 1 so the keys match the template variables, e.g. name_1, name_2
        for i, row in enumerate(reader, 1):
            for key in row:
                new_key = f"{key.lower()}_{i}"
                value = row[key]
                # Convert numeric fields such as age to int; keep everything else as a string
                try:
                    value = int(value)
                except ValueError:
                    pass
                output_dict[new_key] = value
    return output_dict

values_dict = flatten_csv(_path_to_csv_file)
print(values_dict)
prompt = prompt.format(**values_dict)

print("\nPOPULATED TEMPLATE:\n" + str(prompt))

{'name_1': 'John', 'family_name_1': 'Smith', 'age_1': 30, 'city_1': 'San Francisco', 'state_1': 'CA', 'purchase_history_1': 'Books electronics home_goods', 'recommendation_1_1': 'Kindle Paperwhite - This e-reader is perfect for book lovers who want a lightweight and portable device that can hold thousands of books. It has a glare-free display and a long battery life, so you can read for hours on end without having to worry about running out of power.', 'recommendation_2_1': "Google Home Mini - This smart speaker is perfect for controlling your home's smart devices with your voice. You can use it to play music, set alarms, get news, and more. It's also a great way to stay connected with friends and family.", 'name_2': 'Jane', 'family_name_2': 'Doe', 'age_2': 25, 'city_2': 'New York', 'state_2': 'NY', 'purchase_history_2': 'Clothing shoes cosmetics', 'recommendation_1_2': "Aritzia Wilfred Free Sweater - This soft and cozy sweater is perfect for a casual day out. It's available in a varie
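
The cell above populates a single three-customer prompt. Section 4.2 iterates over a `list_of_prompts` that is not defined anywhere in this notebook, so the sketch below shows one possible way to build it with the standard library only. It assumes the first two CSV rows serve as the few-shot examples and every later row becomes the query of its own prompt. The in-memory CSV stands in for `./data/customer_factsheet.csv`: its column names are assumed from the keys printed above, and the recommendation values are shortened or invented. `build_prompts` is a hypothetical helper.

```python
import csv
import io

# Hypothetical stand-in for ./data/customer_factsheet.csv (recommendations shortened or invented).
csv_text = """name,family_name,age,city,state,purchase_history,recommendation_1,recommendation_2
John,Smith,30,San Francisco,CA,Books electronics home_goods,Kindle Paperwhite,Google Home Mini
Jane,Doe,25,New York,NY,Clothing shoes cosmetics,Wool sweater,Pointed toe pumps
Michael,Jones,40,Seattle,WA,Toys games sporting_goods,Board game,Soccer ball
Ashley,Brown,20,Los Angeles,CA,Makeup skincare fashion,Lipstick,Moisturizer
"""

INPUT_LINE = ('input: "{name} {family_name} is {age} years old and lives in {city}, {state}. '
              'Their purchase history includes {purchase_history}."')
# "\\n" keeps the backslash-n sequence literal, matching the template output shown earlier.
OUTPUT_LINE = 'output: "Recommendations:\\n Item 1: {recommendation_1}\\nItem 2: {recommendation_2}"'


def build_prompts(rows, n_shots=2):
    """Use the first n_shots rows as worked examples and every later row as a query."""
    shots, queries = rows[:n_shots], rows[n_shots:]
    shot_lines = []
    for row in shots:
        shot_lines.append(INPUT_LINE.format(**row))
        shot_lines.append(OUTPUT_LINE.format(**row))
    # Each prompt ends with one query input and an empty output for the model to fill in.
    return ["\n".join(shot_lines + [INPUT_LINE.format(**row), 'output: ""']) for row in queries]


rows = list(csv.DictReader(io.StringIO(csv_text)))
list_of_prompts = build_prompts(rows)
print(list_of_prompts[0])
```

With this layout, line index 4 of each prompt is the query customer's input line, which is the line that the loop in section 4.2 prints next to the model output. The query rows' own recommendations stay out of the prompt and remain available as references for evaluation.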

### 3.2 Additional Examples
You can explore [additional examples using the PromptTemplate](https://api.python.langchain.com/en/latest/prompts/langchain.prompts.prompt.PromptTemplate.html#langchain.prompts.prompt.PromptTemplate)

## 4. Prompt evaluation and few shot learning from bulk created prompts <a id="step4"></a>
In the prior examples, you created a "2-shot learning" prompt, i.e. there were three inputs but only two complete outputs. By using a larger dataset this way, you can perform bulk testing of your prompt.

For example, two of your data samples serve as few-shot examples, while the known "output" of the third can be compared against the model output to check that your prompt is performing as expected. You can now execute these few-shot prompts to see how well the engineered prompt works across numerous examples.
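
One simple way to compare a model output against the held-out reference recommendation is word overlap. The sketch below computes a Jaccard similarity over lowercase word tokens using only the standard library. It is a rough heuristic chosen for illustration, not a method prescribed by this lab, and the function name `token_overlap` and the sample strings are hypothetical.

```python
import re


def token_overlap(reference, candidate):
    """Jaccard similarity between the sets of lowercase word tokens in two strings."""
    ref_tokens = set(re.findall(r"[a-z0-9]+", reference.lower()))
    cand_tokens = set(re.findall(r"[a-z0-9]+", candidate.lower()))
    if not ref_tokens and not cand_tokens:
        return 1.0  # two empty strings are treated as identical
    return len(ref_tokens & cand_tokens) / len(ref_tokens | cand_tokens)


# Hypothetical reference (from the dataset) and candidate (from the model).
reference = "Item 1: Board game Item 2: Soccer ball"
candidate = "Item 1: Board game Item 2: Video game console"
score = token_overlap(reference, candidate)
print(f"overlap = {score:.2f}")
```

A low score does not necessarily mean a bad recommendation, since two different products can both fit a purchase history, so scores like this are best used to flag outputs for manual review.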

### 4.1 Import Watsonx.ai access credentials and load model
Make sure you have copied the .env file that you created earlier into the same directory as this notebook.

In [86]:
load_dotenv()
api_key = os.getenv("API_KEY", None)
ibm_cloud_url = os.getenv("IBM_CLOUD_URL", None)
project_id = os.getenv("PROJECT_ID", None)
if api_key is None or ibm_cloud_url is None or project_id is None:
    # Stop here: the model below cannot be instantiated without credentials
    raise ValueError("Ensure you copied the .env file that you created earlier into the same directory as this notebook")

creds = {
    "url": ibm_cloud_url,
    "apikey": api_key
}


model_params = {
    GenParams.DECODING_METHOD: "greedy",
    GenParams.MIN_NEW_TOKENS: 50,
    GenParams.MAX_NEW_TOKENS: 100
}

# Instantiate a model proxy object to send your requests
model = Model(
    model_id='google/flan-ul2',
    params=model_params,
    credentials=creds,
    project_id=project_id)

### 4.2 Send prompts to Watsonx.ai

In [87]:
# list_of_prompts must hold the populated few-shot prompt strings, one per customer to evaluate
responses = [model.generate_text(prompt) for prompt in list_of_prompts]
for i, response in enumerate(responses):
    lines = str(list_of_prompts[i]).strip().split("\n")
    user_description = str(lines[4])
    print(f"\n{user_description}")
    print(f"\n MODEL OUTPUT: {response}")


input: "Michael Jones is 40 years old and lives in Seattle, WA. Their purchase history includes Toys games sporting_goods."
output: Recommendations:n Item 1: X-Box One - The X-Box One is the latest gaming console from Microsoft. It has a high definition screen, and it is able to stream games from your Xbox 360 to the console. It also has an integrated Kinect sensor, which allows you to play games by simply moving your body.nItem 2: Xbox 360 - The Xbox 360 is a gaming console from Microsoft. It

input: "Ashley Brown is 20 years old and lives in Los Angeles, CA. Their purchase history includes Makeup skincare fashion."
output: Recommendations:n Item 1: Makeup - MAC Ruby Woo LipsticknItem 2: Skincare - Clinique Dramatically Different MoisturizernItem 3: Fashion - MAC x Patrick Starrr CollectionnItem 4: Fashion - MAC x Gwen Stefani CollectionnItem 5: Fashion - MAC x Gigi Hadid Collection

input: "Emily Johnson is 55 years old and lives in Dallas, TX. Their purchase history includes Furnit

### Few shot prompt analysis
These results are reasonable. An X-Box suits a customer with a history of buying toys and games, and the cosmetics and furniture recommendations for the other two customers likewise reflect their purchase histories.

## 5. Congratulations
Congratulations on completing the lab and exploring the fascinating world of bulk creation of Few Shot Prompts using PromptTemplate! 

Through the practical use case of generating personalized product recommendations, you have seen the effect of tailoring prompts to individual customer profiles. By incorporating customer-specific details and programmatically generating bulk examples, you can steer the model toward your specific use case, resulting in more accurate and tailored outputs.

Continuously iterating and refining your prompts based on these examples will unlock the full potential of language models and enhance their performance across various domains. Keep experimenting and leveraging prompt engineering techniques to optimize your interactions with language models and drive impactful results in your projects.