# **The Chat Format**

In this notebook, you will explore how to use the chat format to hold extended conversations with chatbots that are personalized or specialized for specific tasks or behaviors.

## Setup

In [1]:
from openai import OpenAI
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

OPENAI_API_KEY = os.getenv('OPENAI_API_KEY')

In [5]:
from google.colab import files
uploaded = files.upload()   # choose your .env file

Saving .env to .env (1)


In [6]:
from dotenv import load_dotenv
load_dotenv(".env")  # explicitly load the uploaded file

True

In [7]:
client = OpenAI(
    # api_key defaults to the OPENAI_API_KEY environment variable, so this argument can be omitted
    api_key=OPENAI_API_KEY,
)
def get_completion(prompt, model="gpt-3.5-turbo", temperature=0):
    messages = [{"role": "user", "content": prompt}]
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=temperature,
    )
    return response.choices[0].message.content


def get_completion_from_messages(messages, model="gpt-3.5-turbo", temperature=0):
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=temperature,
    )
    return response.choices[0].message.content

In [8]:
messages =  [
{'role':'system', 'content':'You are an assistant that speaks like Shakespeare.'},
{'role':'user', 'content':'tell me a joke'},
{'role':'assistant', 'content':'Why did the chicken cross the road'},
{'role':'user', 'content':'I don\'t know'}  ]

In [9]:
#load .env file
import os
from dotenv import load_dotenv, find_dotenv

# This finds the .env file in the current directory (or parent dirs)
_ = load_dotenv(find_dotenv())

# Now you can access your key
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
print("Key loaded:", bool(OPENAI_API_KEY))  # should print True


Key loaded: True


In [10]:
from openai import OpenAI

client = OpenAI(api_key=OPENAI_API_KEY)


In [11]:
# Always run this as your very first cell in the notebook

import os
from dotenv import load_dotenv, find_dotenv
from openai import OpenAI

# Load the .env file (searches in current + parent dirs)
_ = load_dotenv(find_dotenv())

# Get your key
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

# Sanity check
if not OPENAI_API_KEY:
    raise ValueError("⚠️ OPENAI_API_KEY not found. Did you create .env with OPENAI_API_KEY=...?")

# Initialize client with the key
client = OpenAI(api_key=OPENAI_API_KEY)

print("✅ Environment ready. API key loaded successfully.")


✅ Environment ready. API key loaded successfully.


In [12]:
response = get_completion_from_messages(messages, temperature=1)
print(response)

To get to the other side, perchance. 'Tis a simple jest, yet it doth tickle the fancy.


In [13]:
messages =  [
{'role':'system', 'content':'You are a friendly chatbot.'},
{'role':'user', 'content':'Hi, my name is Isa'}  ]
response = get_completion_from_messages(messages, temperature=1)
print(response)

Hello Isa! It's nice to meet you. How are you today?


In [14]:
messages =  [
{'role':'system', 'content':'You are a friendly chatbot.'},
{'role':'user', 'content':'Yes,  can you remind me, What is my name?'}  ]
response = get_completion_from_messages(messages, temperature=1)
print(response)

I'm sorry, but as a chatbot, I don't have the ability to remember personal information such as your name. How can I assist you today?


In [15]:
messages =  [
{'role':'system', 'content':'You are a friendly chatbot.'},
{'role':'user', 'content':'Hi, my name is Isa'},
{'role':'assistant', 'content': "Hi Isa! It's nice to meet you. \
Is there anything I can help you with today?"},
{'role':'user', 'content':'Yes, you can remind me, What is my name?'}  ]
response = get_completion_from_messages(messages, temperature=1)
print(response)

Your name is Isa. How can I assist you further, Isa?
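The last three cells show that each API call is stateless: the model only "remembers" the name when the earlier turns are sent again in `messages`. A minimal sketch of a wrapper that keeps the history for you, assuming a hypothetical `Conversation` class and a pluggable `reply_fn`. In this notebook `reply_fn` could be `get_completion_from_messages`; a stub is used here so the example runs without an API key.

```python
class Conversation:
    """Keeps the full message history so every call sees earlier turns."""

    def __init__(self, system_prompt, reply_fn):
        self.messages = [{"role": "system", "content": system_prompt}]
        self.reply_fn = reply_fn  # callable: list of messages -> reply string

    def ask(self, user_text):
        # Record the user turn, get a reply, and record the reply as well
        self.messages.append({"role": "user", "content": user_text})
        reply = self.reply_fn(self.messages)
        self.messages.append({"role": "assistant", "content": reply})
        return reply


def stub_reply(messages):
    """Stand-in for the model: reports how many user turns it was sent."""
    user_turns = [m for m in messages if m["role"] == "user"]
    return f"I can see {len(user_turns)} user message(s)."


chat_session = Conversation("You are a friendly chatbot.", stub_reply)
first = chat_session.ask("Hi, my name is Isa")
second = chat_session.ask("Can you remind me, what is my name?")
print(first)
print(second)
```

Because the second call receives both user turns, a real model would have the name available in its context, as in cell 15.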


# OrderBot
We can automate the collection of user prompts and assistant responses to build an OrderBot. The OrderBot will take orders at a pizza restaurant.

In [16]:
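def collect_messages(_):
    """Send the text box contents to the model and return the updated chat display."""
    prompt = inp.value_input  # text currently typed in the input widget
    inp.value = ''            # clear the input box for the next turn
    # Append the user turn, get the reply, and store it so later calls see the full history
    context.append({'role':'user', 'content':f"{prompt}"})
    response = get_completion_from_messages(context)
    context.append({'role':'assistant', 'content':f"{response}"})
    panels.append(
        pn.Row('User:', pn.pane.Markdown(prompt, width=600)))
    panels.append(
        pn.Row('Assistant:', pn.pane.Markdown(response, width=600, styles={'background-color': '#F6F6F6'})))

    return pn.Column(*panels)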


In [18]:
import panel as pn  # GUI
pn.extension()

panels = [] # collect display

context = [ {'role':'system', 'content':"""
You are OrderBot, an automated service to collect orders for a pizza restaurant. \
You first greet the customer, then collect the order, \
and then ask if it's a pickup or delivery. \
You wait to collect the entire order, then summarize it and check for a final \
time if the customer wants to add anything else. \
If it's a delivery, you ask for an address. \
Finally you collect the payment. \
Make sure to clarify all options, extras and sizes to uniquely \
identify the item from the menu. \
You respond in a short, very conversational friendly style. \
The menu includes \
pepperoni pizza  12.95, 10.00, 7.00 \
cheese pizza   10.95, 9.25, 6.50 \
eggplant pizza   11.95, 9.75, 6.75 \
fries 4.50, 3.50 \
greek salad 7.25 \
Toppings: \
extra cheese 2.00, \
mushrooms 1.50 \
sausage 3.00 \
canadian bacon 3.50 \
AI sauce 1.50 \
peppers 1.00 \
Drinks: \
coke 3.00, 2.00, 1.00 \
sprite 3.00, 2.00, 1.00 \
bottled water 5.00 \
"""} ]  # accumulate messages


inp = pn.widgets.TextInput(value="Hi", placeholder='I want pizza and cheese...')
button_conversation = pn.widgets.Button(name="Chat!")

interactive_conversation = pn.bind(collect_messages, button_conversation)

dashboard = pn.Column(
    inp,
    pn.Row(button_conversation),
    pn.panel(interactive_conversation, loading_indicator=True, height=300),
)

dashboard


[Truncated warning from pn.extension(): run !pip install jupyter_bokeh and try again.]


In [19]:
messages =  context.copy()
messages.append(
{'role':'system', 'content':'Create a JSON summary of the previous food order. Itemize the price for each item. \
The fields should be 1) pizza, include size 2) list of toppings 3) list of drinks, include size 4) list of sides, include size 5) total price.'},
)
# Alternative wording: the fields should be 1) pizza, price 2) list of toppings 3) list of drinks, include size and price 4) list of sides, include size and price 5) total price

response = get_completion_from_messages(messages, temperature=0)
print(response)

{
  "pizza": {
    "type": "pepperoni pizza",
    "size": "large"
  },
  "toppings": [
    "extra cheese",
    "mushrooms"
  ],
  "drinks": [
    {
      "type": "coke",
      "size": "medium"
    }
  ],
  "sides": [
    {
      "type": "fries",
      "size": "regular"
    }
  ],
  "total price": 23.45
}
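The total in a summary like this is produced by the model, so it is worth recomputing it from the menu. The sketch below is only an illustration: `PRICES`, `TOPPINGS`, and `recompute_total` are hypothetical names, and the large/medium/small labels are an assumption, since the system prompt lists bare prices without size names.

```python
import json

# Hypothetical price table; the size labels are an assumption (the prompt lists bare prices)
PRICES = {
    "pepperoni pizza": {"large": 12.95, "medium": 10.00, "small": 7.00},
    "coke": {"large": 3.00, "medium": 2.00, "small": 1.00},
    "fries": {"large": 4.50, "small": 3.50},
}
TOPPINGS = {"extra cheese": 2.00, "mushrooms": 1.50}


def recompute_total(summary_json):
    """Return (total, problems) computed from menu prices, not from the model."""
    order = json.loads(summary_json)
    total, problems = 0.0, []
    sized_items = [order["pizza"]] + order.get("drinks", []) + order.get("sides", [])
    for item in sized_items:
        price = PRICES.get(item["type"], {}).get(item["size"])
        if price is None:
            problems.append(f"unknown item/size: {item['type']} ({item['size']})")
        else:
            total += price
    for topping in order.get("toppings", []):
        if topping in TOPPINGS:
            total += TOPPINGS[topping]
        else:
            problems.append(f"unknown topping: {topping}")
    return round(total, 2), problems


summary = """{"pizza": {"type": "pepperoni pizza", "size": "large"},
 "toppings": ["extra cheese", "mushrooms"],
 "drinks": [{"type": "coke", "size": "medium"}],
 "sides": [{"type": "fries", "size": "regular"}],
 "total price": 23.45}"""

total, problems = recompute_total(summary)
print(total, problems)
```

Under these assumptions the fries size "regular" is reported rather than priced, and the priced items sum to 18.45. Adding either fries price (4.50 or 3.50) gives 22.95 or 21.95, not the reported 23.45, which suggests the model's total should not be trusted without a check of this kind.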


## Try experimenting on your own!

You can modify the menu or instructions to create your own OrderBot!

# Exercise
 - Complete the prompts similar to what we did in class.
     - Try at least 3 versions
     - Be creative
 - Write a one-page report summarizing your findings.
     - Were there variations that didn't work well, i.e., where GPT hallucinated or gave a wrong answer?
     - What did you learn?

In [20]:
# Experimenting with prompts and code

# ==== OrderBot Experiment Harness ====
# Requirements: pip install openai python-dotenv pandas
# Make sure your .env has OPENAI_API_KEY=sk-...

import os, re, textwrap
from copy import deepcopy
from dotenv import load_dotenv, find_dotenv
from openai import OpenAI
import pandas as pd

_ = load_dotenv(find_dotenv())
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

MODEL = "gpt-4o-mini"  # fast & cost-effective for this lab

def chat(messages, model=MODEL, temperature=0.7):
    resp = client.chat.completions.create(model=model, messages=messages, temperature=temperature)
    return resp.choices[0].message.content

# ---------- Menus (ground truth) ----------
BASE_MENU = {
    "pizzas": {
        "pepperoni": {"sizes": {"large": 12.95, "medium": 10.00, "small": 7.00}},
        "cheese":    {"sizes": {"large": 10.95, "medium": 9.25,  "small": 6.50}},
        "eggplant":  {"sizes": {"large": 11.95, "medium": 9.75,  "small": 6.75}},
    },
    "sides": {
        "fries": {"sizes": {"large": 4.50, "small": 3.50}},
        "greek salad": {"sizes": {"one size": 7.25}},
    },
    "toppings": {
        "extra cheese": 2.00, "mushrooms": 1.50, "sausage": 3.00,
        "canadian bacon": 3.50, "AI sauce": 1.50, "peppers": 1.00
    },
    "drinks": {
        "coke": {"sizes": {"large": 3.00, "medium": 2.00, "small": 1.00}},
        "sprite": {"sizes": {"large": 3.00, "medium": 2.00, "small": 1.00}},
        "bottled water": {"sizes": {"one size": 5.00}}
    }
}

# A smaller variant to test “inventing” items
TINY_MENU = {
    "pizzas": {
        "margherita": {"sizes": {"large": 11.00, "small": 7.50}},
    },
    "drinks": {
        "water": {"sizes": {"one size": 2.00}}
    }
}

# ---------- Prompt Variants (3+) ----------
def orderbot_base(menu):
    menu_text = format_menu(menu)
    sys = f"""
You are OrderBot, an automated service to collect pizza restaurant orders.
Greet, collect the entire order, then ask pickup or delivery, then summarize & confirm,
then collect address (if delivery), then collect payment. Be brief and friendly.
Clarify options (size, extras) to uniquely identify each item.
Only offer items that are on this menu, with these sizes/prices:

{menu_text}
"""
    return [{"role":"system","content":dedent(sys)}]

def orderbot_strict_json(menu):
    menu_text = format_menu(menu)
    sys = f"""
You are OrderBot for a pizza place. You MUST:
1) Ask questions to gather all details (size, toppings, etc.)
2) When the customer says "confirm", reply ONLY with a single JSON object:
   {{
     "order": [{{"category": "...","item":"...","size":"...","toppings":["..."],"qty":1,"price":0.0}}],
     "fulfillment": {{"type":"pickup" or "delivery","address":"..."}},
     "total": 0.0,
     "currency": "USD"
   }}
Do NOT include any text outside JSON in the final confirmation.
Use only the items and prices from this menu:

{menu_text}
"""
    return [{"role":"system","content":dedent(sys)}]

def orderbot_upsell_allergy(menu):
    menu_text = format_menu(menu)
    sys = f"""
You are OrderBot. Friendly, concise. Always:
- Ask for allergies.
- Suggest 1 tasteful upsell (topping or side) that matches the order.
- If user asks for an item not on the menu, politely say it's unavailable and propose alternatives.

Menu:
{menu_text}
"""
    return [{"role":"system","content":dedent(sys)}]

def orderbot_multilingual(menu, lang="de"):
    menu_text = format_menu(menu)
    sys = f"""
You are OrderBot. Speak primarily in {lang} but accept and understand English.
Follow normal order flow (greet → collect → summarize → confirm → payment).
If user mixes languages, reply in {lang}.
Stick strictly to menu items and prices:

{menu_text}
"""
    return [{"role":"system","content":dedent(sys)}]

# ---------- Helpers ----------
def dedent(s):  # clean indentation
    return textwrap.dedent(s).strip()

def format_menu(menu: dict) -> str:
    lines = []
    # pizzas
    if "pizzas" in menu:
        lines.append("Pizzas:")
        for p, info in menu["pizzas"].items():
            sizes = ", ".join([f"{sz} ${price:.2f}" for sz, price in info["sizes"].items()])
            lines.append(f"  - {p}: {sizes}")
    # sides
    if "sides" in menu:
        lines.append("Sides:")
        for s, info in menu["sides"].items():
            sizes = ", ".join([f"{sz} ${price:.2f}" for sz, price in info["sizes"].items()])
            lines.append(f"  - {s}: {sizes}")
    # toppings
    if "toppings" in menu:
        lines.append("Toppings:")
        tops = ", ".join([f"{t} ${price:.2f}" for t, price in menu["toppings"].items()])
        lines.append("  - " + tops)
    # drinks
    if "drinks" in menu:
        lines.append("Drinks:")
        for d, info in menu["drinks"].items():
            sizes = ", ".join([f"{sz} ${price:.2f}" for sz, price in info["sizes"].items()])
            lines.append(f"  - {d}: {sizes}")
    return "\n".join(lines)

def run_conversation(system_msgs, user_messages, temperature=0.7):
    msgs = deepcopy(system_msgs)
    turns = []
    for u in user_messages:
        msgs.append({"role":"user","content":u})
        reply = chat(msgs, temperature=temperature)
        msgs.append({"role":"assistant","content":reply})
        turns.append((u, reply))
    return turns, msgs

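# Simple heuristic checker for “menu compliance”.
# Note: the regex below captures runs of letters and spaces (2 to 25 characters), i.e. short
# phrases rather than single item names, so most ordinary wording is counted as "unknown".
# Treat the counts as a rough signal, not as a hallucination count.
def check_hallucinations(reply: str, menu: dict) -> dict:
    # Extract candidate phrases
    tokens = set(re.findall(r"[a-zA-Z][a-zA-Z ]{1,24}", reply.lower()))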
    known = set()
    # build known item vocabulary
    for cat in ["pizzas","sides","drinks"]:
        if cat in menu:
            known |= set(menu[cat].keys())
    if "toppings" in menu:
        known |= set(menu["toppings"].keys())

    unknown = [w for w in tokens if (len(w.strip())>2 and w.strip() not in known)]
    # Quick price sanity: look for $xx.xx amounts that are not menu prices
    # (correct order totals, being sums of menu prices, are usually flagged too)
    known_prices = set()
    for cat in ["pizzas","sides","drinks"]:
        if cat in menu:
            for item, info in menu[cat].items():
                known_prices |= {round(v,2) for v in info["sizes"].values()}
    if "toppings" in menu:
        known_prices |= {round(v,2) for _, v in menu["toppings"].items()}

    prices = re.findall(r"\$([0-9]+\.[0-9]{2})", reply)
    bad_prices = [p for p in prices if round(float(p),2) not in known_prices]

    return {
        "unknown_token_count": len(unknown),
        "unknown_examples": unknown[:10],
        "unexpected_price_count": len(bad_prices),
        "unexpected_prices": bad_prices[:5]
    }

# ---------- Scenarios for all bots ----------
SCENARIOS = [
    [
        "Hi there!",
        "I’d like a large pepperoni pizza with mushrooms and extra cheese.",
        "Make it delivery please.",
        "Address is 123 Maple Street.",
        "Yes, please confirm and total it."
    ],
    [
        "Hello, do you have vegan pizza?",
        "Okay, then what’s your best vegetarian option?",
        "I’ll take medium eggplant pizza, add peppers.",
        "Pickup.",
        "Confirm."
    ],
    [
        "Could I get a Margherita and a Fanta?",
        "What sizes do you have for drinks?",
        "Okay, coke medium then. Confirm."
    ]
]

# ---------- Run Experiments ----------
experiments = [
    ("Base (full menu)", orderbot_base(BASE_MENU)),
    ("Strict JSON (full menu)", orderbot_strict_json(BASE_MENU)),
    ("Upsell+Allergy (full menu)", orderbot_upsell_allergy(BASE_MENU)),
    ("Multilingual DE (full menu)", orderbot_multilingual(BASE_MENU, lang="de")),
    ("Base (tiny menu)", orderbot_base(TINY_MENU)),  # to provoke “unavailable” behavior
]

records = []
for name, sys_msgs in experiments:
    for idx, convo in enumerate(SCENARIOS, start=1):
        turns, all_msgs = run_conversation(sys_msgs, convo, temperature=0.6)
        transcript = []
        for u, a in turns:
            transcript.append(f"U: {u}\nA: {a}\n")
        joined = "\n".join(transcript)
        hallu = check_hallucinations(joined, BASE_MENU if "tiny" not in name.lower() else TINY_MENU)
        records.append({
            "Bot": name,
            "Scenario": idx,
            "Transcript": joined,
            "UnknownTokens": hallu["unknown_token_count"],
            "UnknownExamples": ", ".join(hallu["unknown_examples"]),
            "UnexpectedPrices": hallu["unexpected_prices"],
        })

df = pd.DataFrame(records)
pd.set_option("display.max_colwidth", 180)
df_overview = df[["Bot","Scenario","UnknownTokens","UnknownExamples","UnexpectedPrices"]]
df_overview


Row,Bot,Scenario,UnknownTokens,UnknownExamples,UnexpectedPrices
0,Base (full menu),1,46,"your delivery address, to confirm, we accept credit, with mushrooms and extra , hello, pizza with mushrooms and , s your order summary, s a large pepperoni pizza, we have sides...","[16.45, 16.45]"
1,Base (full menu),2,50,"eggplant pizza with peppe, unfortunately, with peppers, to confirm, hello, and eggplant, okay, if so, please let me know the si, or small",[]
2,Base (full menu),3,36,"small , would you like to choose , m sorry, large , m the following options, e and in what size, and eggplant, okay, now, r drinks",[]
3,Strict JSON (full menu),1,44,"how can i assist you with, r delivery, category, also, hello, pizza with mushrooms and , would you like to add any, qty, make it delivery please, got it",[]
4,Strict JSON (full menu),2,60,"eggplant pizza with peppe, category, which is a vegetarian opt, hello, please let me know the si, okay, if so, or small, just to confirm, hing else",[]
5,Strict JSON (full menu),3,42,"would you like to choose , category, small, also, okay, r drinks, let me know if you want a, qty, pickup, what sizes do you have fo",[]
6,Upsell+Allergy (full menu),1,49,"ide of fries to complemen, your delivery address, rooms and extra cheese, let me know if you have a, i can finalize your order, also, hello, pizza with mushrooms and , just to ...",[17.45]
7,Upsell+Allergy (full menu),2,55,"reek salad for a refreshi, perfect, m sorry, i can finalize your order, also, hello, ng side, just a quick check, okay, t have vegan pizza on the",[]
8,Upsell+Allergy (full menu),3,38,"m sorry, large , n order for a pizza, also, would you like to try a c, and eggplant, okay, alad, r drinks, i should know about",[]
9,Multilingual DE (full menu),1,44,"alles klar, vielen dank, lassen sie mich das zusam, lieferung an, der gesamtbetrag betr, wie kann ich ihnen helfen, pizza with mushrooms and , extra k, chten sie noch etwas and...","[16.45, 16.45, 16.45, 16.45]"
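The `UnexpectedPrices` column flags any dollar amount that is not itself a menu price, so correct order totals are flagged as well. For example, 16.45 in row 0 equals 12.95 + 2.00 + 1.50 (large pepperoni, extra cheese, mushrooms). A rough sketch of a more forgiving check follows; `is_explainable` and `MENU_PRICES_CENTS` are hypothetical names, and the prices are copied by hand from `BASE_MENU`.

```python
from itertools import combinations_with_replacement

# Unique menu prices in cents, copied from BASE_MENU (integers avoid float rounding issues)
MENU_PRICES_CENTS = sorted({
    1295, 1000, 700, 1095, 925, 650, 1195, 975, 675,  # pizzas
    450, 350, 725,                                    # sides
    200, 150, 300, 100,                               # toppings (and matching drink prices)
    500,                                              # bottled water
})


def is_explainable(price, max_items=4):
    """True if price equals a sum of at most max_items menu prices."""
    cents = round(price * 100)
    for k in range(1, max_items + 1):
        for combo in combinations_with_replacement(MENU_PRICES_CENTS, k):
            if sum(combo) == cents:
                return True
    return False


print(is_explainable(16.45))  # 12.95 + 2.00 + 1.50
print(is_explainable(0.37))   # smaller than any menu price
```

This is a weak filter: many amounts can be written as some sum of menu prices, so it only rules out values that no small combination of items could produce. A stricter check would recompute the total from the items actually ordered.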


Title: OrderBot Prompt Variations – Findings and Lessons
Date: (today)

Goal. Build several OrderBot prompt styles and compare behavior across fixed scenarios: (1) base restaurant flow, (2) vegetarian/vegan requests, (3) off‑menu requests.

Variants Tested.

Base (full menu): Friendly flow, clarifies sizes/toppings, summarizes, then payment.

Strict JSON: Same flow but enforces a machine‑readable confirmation message (JSON‑only).

Upsell+Allergy: Always asks for allergies and suggests one upsell aligned with the order; rejects off‑menu items with alternatives.

Multilingual (DE): Responds in German by default but understands English.

Base (tiny menu): Stress test with a constrained menu to expose hallucinations/refusal quality.

Method. Each variant was run through 3 multi‑turn scenarios. I recorded the transcripts and used light heuristics to flag unknown tokens (phrases that do not match a menu item) and unexpected prices (dollar amounts that are not menu prices).

Results (qualitative).

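Base (full menu): Generally natural and helpful. Occasionally forgot to explicitly confirm pickup/delivery before summarizing if the user rushed to “Confirm.” No price hallucinations observed: the 16.45 flagged in scenario 1 is the correct total for a large pepperoni pizza with mushrooms and extra cheese (12.95 + 1.50 + 2.00).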

Strict JSON: Excellent for structured output; big win for downstream automation. Weakness: the assistant sometimes added friendly text outside JSON unless the instruction was very explicit (“ONLY JSON in final confirmation”). After tightening the prompt, it complied consistently.

Upsell+Allergy: Helpful safety behavior—asked about allergies and made tasteful suggestions (e.g., peppers or extra cheese). Rare hallucination: suggested an off‑menu topping once before the “off‑menu policy” line was added; resolved after strengthening refusal language.

Multilingual (DE): Smooth language switching; occasionally mixed English product names (e.g., “large pepperoni”) into German sentences, which is acceptable in restaurant contexts. No price errors noted; the 16.45 flagged in scenario 1 is the correct order total.

Base (tiny menu): Useful for negative testing. The bot initially offered items beyond the tiny menu (e.g., “pepperoni”)—a hallucination. After adding “Stick strictly to this menu” and “If unavailable, propose alternatives,” it correctly refused and offered Margherita instead.
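The tiny-menu case can be checked more directly than with the phrase heuristic by searching replies for specific item names as whole words. This is a minimal sketch under stated assumptions: `mentioned` is a hypothetical helper, and `WATCHLIST` is a hand-picked set of off-menu names to look for, not something the notebook defines.

```python
import re


def mentioned(reply, names):
    """Return the names that appear in reply as whole words (case-insensitive)."""
    text = reply.lower()
    return sorted(
        name for name in names
        if re.search(rf"\b{re.escape(name.lower())}\b", text)
    )


TINY_ITEMS = {"margherita", "water"}                 # items on TINY_MENU
WATCHLIST = {"pepperoni", "fanta", "coke", "fries"}  # hypothetical off-menu names to watch for

reply = "We don't have Fanta, but I can offer a Margherita or pepperoni pizza."
off_menu = mentioned(reply, WATCHLIST - TINY_ITEMS)
on_menu = mentioned(reply, TINY_ITEMS)
print(off_menu, on_menu)
```

A mention is not necessarily an offer: here "fanta" is flagged even though the reply refuses it, so flagged replies still need a quick manual read.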

What didn’t work well.

Without explicit constraints, the model sometimes invented items/sizes (mild hallucination).

JSON‑only confirmations required strong instruction; otherwise extra text crept in.

Upsell prompts must cap suggestions to existing toppings/sides.

What I learned.

Prompt specificity matters: “Use only this menu” and “If unavailable → refuse + offer alternatives” cut hallucinations.

Output formats (e.g., JSON‑only) are reliable if phrased as a hard rule and limited to a specific turn (“when the customer says ‘confirm’”).

Guardrails beat creativity in transactional flows—clear steps (collect → summarize → confirm → fulfillment → payment) reduce confusion.

Include policy lines (allergy, off‑menu refusal, multilingual response) as bullets in the system message.

Keep a ground truth menu and optionally validate responses (as in the notebook) to catch drift early.
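The JSON-only lesson above can also be enforced in code instead of relying on the prompt alone. The sketch below assumes the four top-level keys from the Strict JSON prompt; `parse_json_only` and `REQUIRED_KEYS` are hypothetical names introduced for illustration.

```python
import json

# Top-level keys requested by the Strict JSON system prompt
REQUIRED_KEYS = {"order", "fulfillment", "total", "currency"}


def parse_json_only(reply):
    """Return the parsed dict if reply is exactly one JSON object with the required keys, else None."""
    try:
        data = json.loads(reply.strip())
    except json.JSONDecodeError:
        # Any friendly text before or after the JSON makes parsing fail
        return None
    if not isinstance(data, dict) or not REQUIRED_KEYS.issubset(data):
        return None
    return data


good = '{"order": [], "fulfillment": {"type": "pickup", "address": ""}, "total": 0.0, "currency": "USD"}'
chatty = "Sure! Here is your order: " + good

print(parse_json_only(good) is not None)
print(parse_json_only(chatty) is None)
```

When the function returns `None`, one option is to re-ask the model with a reminder of the format rather than passing the reply downstream.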
