<a href="https://colab.research.google.com/github/AndreiPiterbarg/Understanding_ML_Concept/blob/main/HW2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [9]:
%pip install mistralai --quiet

# Mastering Prompt Engineering
In this notebook, you will practice systematically improving prompts to get better results from a Large Language Model.

The goal of this assignment is to move from a vague, "bad" prompt to a precise, well-structured prompt that elicits a much more accurate and useful response from the model. You will also experiment with techniques that encourage the model to "reason" its way to a better answer.



## Part 1: Setting up a Mistral agent [5 points]
To use the Mistral API, you need an API key.

1. Create or sign in to your account at [console.mistral.ai](https://console.mistral.ai/).
2. Open the **API Keys** tab and click **Create API key**. Give it a memorable name and copy the value.
3. In this Colab notebook, click on the "🔑" (key) icon in the left sidebar.
4. Click "Add new secret". Name the secret `MISTRAL_API_KEY` and paste your key into the "Value" field.
5. Make sure the "Notebook access" toggle is turned on.

By using Colab secrets, you keep your API key secure and avoid pasting it directly into your code. The code cell below will access this secret.
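As a minimal sketch of the same idea outside Colab, the key can be looked up in Colab secrets first and then in an environment variable. The helper name `load_api_key` is hypothetical and not part of the assignment; the fallback to `os.environ` is an assumption about how you might run the notebook locally.

```python
import os


def load_api_key(name: str = "MISTRAL_API_KEY") -> str:
    """Return the key from Colab secrets if available, else from the environment."""
    try:
        # google.colab is only importable inside a Colab runtime.
        from google.colab import userdata
        key = userdata.get(name)
    except Exception:
        # Not in Colab, or the secret is missing or notebook access is off.
        key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"No value found for {name}")
    return key
```

Either way, the key never appears in the notebook source.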

In [10]:
from mistralai import Mistral
from google.colab import userdata
import json
import re

# Configure the API key
try:
    api_key = userdata.get('MISTRAL_API_KEY')
    client = Mistral(api_key=api_key)
    print("API Key configured successfully!")
except Exception as e:
    print(f"Error configuring API key: {e}")
    print("Please make sure you have set the 'MISTRAL_API_KEY' secret in your Colab environment.")

API Key configured successfully!


### 1.1 Agent Setup [5 points]
Create a basic agent that sends a prompt to the Mistral model and returns a cleaned response. You should do the following:
1. Create a generation config to set the temperature.
2. Process the response to remove markdown formatting, that is, any `*` or `#` characters.
3. Return the cleaned response.
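The cleaning step in item 2 can be tried in isolation before wiring it into the agent. This is a small sketch; `strip_markdown_markers` is a hypothetical name used only for illustration.

```python
import re


def strip_markdown_markers(text: str) -> str:
    """Remove every '*' and '#' character, then trim surrounding whitespace."""
    return re.sub(r"[*#]", "", text).strip()


sample = "## Plan\n**Step 1:** buy 3 * 2 skulls"
print(strip_markdown_markers(sample))
```

Note that this blunt approach also removes an asterisk that is part of the content (the `3 * 2` above), which is acceptable for this assignment but worth keeping in mind.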

In [32]:
MODEL_NAME = "mistral-small-latest"

def basic_agent(prompt: str, temperature: float = 0.0) -> str:
    """Send a prompt to Mistral and return a markdown-free string response."""
    generation_config = {"temperature": float(temperature)}

    response = client.chat.complete(
        model=MODEL_NAME,
        messages=[{"role": "user", "content": prompt}],
        **generation_config,
    )

    # Guard against an empty response or empty content before cleaning.
    text = (response.choices[0].message.content or "") if response and response.choices else ""
    # Strip markdown emphasis and heading markers ("*" and "#").
    cleaned = re.sub(r"[*#]", "", text)
    return cleaned.strip()

In [33]:
def parse_json_output(output):
    """Parse JSON from model output, handling markdown wrappers."""
    try:
        return json.loads(output)
    except (json.JSONDecodeError, TypeError):
        # Fall back to the contents of a fenced ```json ... ``` block, if any.
        match = re.search(r'```(?:json)?\s*([\s\S]*?)\s*```', output or "")
        if match:
            try:
                return json.loads(match.group(1))
            except json.JSONDecodeError:
                pass
    return None

## Part 2: Iterating and Improving a "Bad" Prompt [15 points]

An LLM's output is only as good as the prompt it receives. Vague prompts lead to vague or incorrect answers. In this module, you will start with a deliberately "bad" prompt and improve it.

**Task: Extract structured information from a block of text**

Our goal is to process a chaotic meeting transcript from a supervillain's weekly sync. Evil plans have logistics too, and they're often messy. Your task is to extract key details and action items into a nested JSON object.

In [34]:
supervillain_transcript = """
Project Doom-Spire - Weekly Status Update - Monday, February 9, 2025

Attendees: Dr. Dire (Evil Overlord), Brenda (Lair HR & Logistics), Gary (Lead Henchman), Chad (Intern)

Dr. Dire: People! The Doom-Spire is 80% complete, but I stood on the Parapet of Pain this morning and... it felt bland! It needs more... menace! Chad, you're the intern. Your youthful apathy is in tune with modern aesthetics. I want you to add more skulls to the parapet. Many more. Make it happen.

Brenda: Um, Doctor? Speaking of making things happen, Unreliable Aquatics Inc. just called. The laser-equipped sharks for the moat are delayed. Again. They said "next month, maybe." This is blocking the entire moat-filling initiative.

Dr. Dire: Unacceptable! Brenda, find me a new shark supplier, ASAP. I don't care what it takes. Double our budget if you have to. I want sharks with lasers, and I want them yesterday!

Gary: While we're on blockers, the primary laser core is overheating. It keeps melting the containment unit, which is, you know, suboptimal. We need a new cryo-cooler installed or the whole spire might just... pop.

Dr. Dire: Pop? Gary, "pop" is not a word I want associated with my multi-billion dollar evil lair. Fine. Get the cooler. When can it be done?

Gary: My team can have the new unit fully installed by this Friday.

Dr. Dire: Good. See to it that it's done. I want no more talk of 'popping'. Okay team, good sync. Let's get back to menacing the world!
"""

In [35]:
# Version 1: The "Bad" Prompt
bad_prompt = f"""
What happened in this meeting?

Transcript:
{supervillain_transcript}
"""

output = basic_agent(bad_prompt)
print(output)

Here’s a summary of what happened in the meeting:

1. Project Status: The Doom-Spire is 80% complete, but Dr. Dire (the Evil Overlord) feels it lacks menace. He tasks Chad (the intern) with adding more skulls to the Parapet of Pain to enhance its ominous aesthetic.

2. Moat Delays: Brenda (HR & Logistics) reports that the laser-equipped sharks for the moat are delayed indefinitely by the current supplier. Dr. Dire orders her to find a new supplier immediately, even if it means doubling the budget.

3. Laser Core Issue: Gary (Lead Henchman) reveals that the primary laser core is overheating and melting its containment unit, risking a catastrophic "pop." Dr. Dire insists on a new cryo-cooler installation by Friday to prevent this.

4. Next Steps: The team is directed to focus on resolving these issues to ensure the Doom-Spire remains functional and intimidating.

The meeting ends with Dr. Dire emphasizing the need to keep the project on track and maintain its menacing reputation.


### 2.1 Adding Specificity and Structure [5 points]
Version 1 summarizes the meeting but is impossible to parse. Rewrite the prompt in the next cell so the model returns the information in a nested JSON structure. We need the project name, a list of attendees with their titles, and a list of action items, where each action item is its own object.

**Prompting Technique: Specificity and Schema Definition. Clearly define the entire output schema, including nested objects and arrays.**

Your Task: Modify the prompt below to ask for a JSON object with the keys `project_name`, `attendees` (a list of objects with name and title), and `action_items` (a list of objects, where each object has `task_description`, `assigned_to`, and `due_date`).

Hint: For reliable JSON, see Mistral’s Structured Outputs docs ([overview](https://docs.mistral.ai/capabilities/structured_output), [custom schemas](https://docs.mistral.ai/capabilities/structured_output/custom)) and JSON mode ([json mode](https://docs.mistral.ai/capabilities/structured_output/json_mode)).
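A minimal sketch of JSON mode, assuming the `response_format={"type": "json_object"}` option described in the linked docs. The function name `json_mode_agent` is hypothetical, and the client is passed in as an argument so the sketch does not depend on notebook globals. The prompt should still ask for JSON explicitly.

```python
import json


def json_mode_agent(client, prompt: str, model: str = "mistral-small-latest") -> dict:
    """Request a JSON object via JSON mode and return it parsed."""
    response = client.chat.complete(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0.0,
        # JSON mode: ask the API to return a syntactically valid JSON object.
        response_format={"type": "json_object"},
    )
    return json.loads(response.choices[0].message.content)
```

JSON mode constrains the syntax only; the keys and nesting still come from the schema you describe in the prompt.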

In [36]:
# Version 2: Specifying the Nested Schema
schema_prompt = f"""
You are an information extraction assistant and you are to follow the specified schema EXACTLY:

You will return a JSON object with the keys project_name, attendees (a list of objects with name and title), and action_items (a list of objects, where each object has task_description, assigned_to, and due_date).

Rules:
- Only use information present in the transcript.
- Put every action item as its own object.
- If an action item does not have an explicit due date in the transcript, set "due_date" to null.
- Output JSON ONLY.

Transcript:
{supervillain_transcript}
"""

output = basic_agent(schema_prompt)

# Parse the output, accepting either raw JSON or a fenced ```json block.
# (A bare regex search would crash when the model obeys "Output JSON ONLY".)
parsed_json = parse_json_output(output)
if parsed_json is not None:
    print("\n✅ Final JSON Output:\n")
    print(json.dumps(parsed_json, indent=2))
else:
    print(f"\n❌ Output is NOT valid nested JSON. Raw output:\n{output}")


✅ Final JSON Output:

{
  "project_name": "Project Doom-Spire",
  "attendees": [
    {
      "name": "Dr. Dire",
      "title": "Evil Overlord"
    },
    {
      "name": "Brenda",
      "title": "Lair HR & Logistics"
    },
    {
      "name": "Gary",
      "title": "Lead Henchman"
    },
    {
      "name": "Chad",
      "title": "Intern"
    }
  ],
  "action_items": [
    {
      "task_description": "Add more skulls to the parapet",
      "assigned_to": "Chad",
      "due_date": null
    },
    {
      "task_description": "Find a new shark supplier",
      "assigned_to": "Brenda",
      "due_date": null
    },
    {
      "task_description": "Install the new cryo-cooler",
      "assigned_to": "Gary",
      "due_date": "this Friday"
    }
  ]
}
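Valid JSON is not necessarily the requested shape, so a structural check can help. The sketch below tests only for the keys named in the task; `validate_meeting_json` is a hypothetical helper, not part of the assignment.

```python
def validate_meeting_json(data) -> list:
    """Return a list of schema problems; an empty list means the shape matches."""
    if not isinstance(data, dict):
        return ["top level is not an object"]
    problems = []
    if not isinstance(data.get("project_name"), str):
        problems.append("project_name must be a string")
    expected = {
        "attendees": ("name", "title"),
        "action_items": ("task_description", "assigned_to", "due_date"),
    }
    for key, fields in expected.items():
        items = data.get(key)
        if not isinstance(items, list):
            problems.append(f"{key} must be a list")
            continue
        for i, item in enumerate(items):
            # A field counts as present even when its value is null (None).
            missing = [f for f in fields if not isinstance(item, dict) or f not in item]
            if missing:
                problems.append(f"{key}[{i}] is missing {missing}")
    return problems
```

Calling `validate_meeting_json(parsed_json)` after parsing returns an empty list when every required key is present.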


### 2.2 Handling Ambiguity and Inference [5 points]
The schema is correct, but the model still struggles with ambiguity. Update the prompt in the next cell so it states the following rules explicitly:
1. Treat Monday, February 9, 2025 as the reference date for every due date calculation.
2. Map "ASAP" to the next business day.
3. Use `null` when no deadline appears in the transcript.
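To double-check the dates that rules 1 and 2 imply, the arithmetic can be sketched with the standard library. The helper names are hypothetical, and "next business day" is assumed to mean the next Monday to Friday date, ignoring holidays. The helpers use only the calendar date (Python's calendar places 2025-02-09 on a Sunday), so they do not depend on the weekday label in the transcript.

```python
from datetime import date, timedelta

REFERENCE_DATE = date(2025, 2, 9)  # reference date stated in the transcript


def next_business_day(day: date) -> date:
    """Return the first Monday-to-Friday date strictly after `day`."""
    nxt = day + timedelta(days=1)
    while nxt.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        nxt += timedelta(days=1)
    return nxt


def upcoming_friday(day: date) -> date:
    """Return the first Friday on or after `day`."""
    return day + timedelta(days=(4 - day.weekday()) % 7)


print(next_business_day(REFERENCE_DATE).isoformat())  # 2025-02-10
print(upcoming_friday(REFERENCE_DATE).isoformat())    # 2025-02-14
```

Stating the resolved dates in the prompt, rather than asking the model to do date arithmetic, removes one source of error.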


In [37]:
# Version 3: Handling Ambiguity
ambiguity_prompt = f"""

Goal: Extract structured information from the transcript into **valid JSON only**, using EXACTLY this schema:

{{
  "project_name": string,
  "attendees": [
    {{
      "name": string,
      "title": string
    }}
  ],
  "action_items": [
    {{
      "task_description": string,
      "assigned_to": string,
      "due_date": string | null
    }}
  ]
}}

Hard rules (must follow):
1) Treat "Monday, February 9, 2025" as the meeting/reference date for all due-date calculations.
2) Map "ASAP" to the next business day: 2025-02-10.
3) Map "by this Friday" / "by Friday" to 2025-02-14.
4) If no deadline appears for an action item, set "due_date" to null.
5) Dates must be in YYYY-MM-DD format.
6) Only use information present in the transcript (no guessing beyond the rules above).
7) Output JSON ONLY.


Transcript:
{supervillain_transcript}
"""

output = basic_agent(ambiguity_prompt)

# Parse the output. The model sometimes wraps the JSON in a ```json ... ``` block
# and sometimes returns raw JSON; parse_json_output handles both cases.
parsed_json = parse_json_output(output)
if parsed_json is not None:
    print("\n✅ Final JSON Output:\n")
    print(json.dumps(parsed_json, indent=2))
else:
    print(f"\n❌ Output is NOT valid nested JSON. Raw output:\n{output}")


✅ Final JSON Output:

{
  "project_name": "Doom-Spire",
  "attendees": [
    {
      "name": "Dr. Dire",
      "title": "Evil Overlord"
    },
    {
      "name": "Brenda",
      "title": "Lair HR & Logistics"
    },
    {
      "name": "Gary",
      "title": "Lead Henchman"
    },
    {
      "name": "Chad",
      "title": "Intern"
    }
  ],
  "action_items": [
    {
      "task_description": "Add more skulls to the parapet",
      "assigned_to": "Chad",
      "due_date": null
    },
    {
      "task_description": "Find a new shark supplier",
      "assigned_to": "Brenda",
      "due_date": "2025-02-10"
    },
    {
      "task_description": "Install new cryo-cooler",
      "assigned_to": "Gary",
      "due_date": "2025-02-14"
    }
  ]
}


### 2.3 Automatic Prompt Engineering (Meta-Prompting) [5 points]
The final step is to automate the improvements you just made. You will:
- write a meta-prompt that teaches `prompt_agent` how to craft the final schema-aware prompt
- require the meta-prompt to mention the persona, schema, reference date, ambiguity rules, and JSON-only constraint
- append the transcript to the generated prompt before calling `basic_agent`

Effectively, you are building a helper agent. Make the instructions so unambiguous that a second model can follow them without you editing the prompt by hand.

In [38]:
def prompt_agent(meta_prompt: str, transcript: str) -> str:
    """Generate a refined prompt and append the transcript."""
    improved_instructions = basic_agent(meta_prompt)
    return f"{improved_instructions}\n\nTranscript:\n{transcript}"


**Your Task:** Fill in `meta_prompt` so that `prompt_agent(meta_prompt, supervillain_transcript)` produces a final prompt containing all of the following:
1. A persona assignment (e.g., expert project manager bot).
2. A clear restatement of the goal: extract structured JSON.
3. The exact schema: `project_name`, `attendees` with `name` + `title`, and `action_items` with `task_description`, `assigned_to`, `due_date`.
4. The reference date (Monday, February 9, 2025).
5. The ambiguity rules for "ASAP" and "by Friday" and `null` for missing dates.
6. An explicit instruction to output JSON only.
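A quick way to see whether a generated prompt covers the six items above is a keyword checklist. This is only a rough sketch: `missing_prompt_elements` and the keyword lists are illustrative assumptions, and a literal substring match will miss paraphrases such as an ISO-formatted reference date.

```python
REQUIRED_ELEMENTS = {
    "persona": ["project manager"],
    "goal": ["extract"],
    "schema": ["project_name", "attendees", "action_items",
               "task_description", "assigned_to", "due_date"],
    "reference date": ["February 9, 2025"],
    "ambiguity rules": ["ASAP", "Friday", "null"],
    "JSON-only": ["JSON only"],
}


def missing_prompt_elements(prompt: str) -> list:
    """Return the names of required elements whose keywords are not all present."""
    lowered = prompt.lower()
    return [
        name
        for name, needles in REQUIRED_ELEMENTS.items()
        if not all(needle.lower() in lowered for needle in needles)
    ]
```

An empty list suggests the generated prompt is complete; any names returned point to what the meta-prompt should state more forcefully.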

In [39]:
meta_prompt = """Write a prompt for extracting meeting data as JSON. Include:
1. Persona: expert project manager bot
2. Goal: extract structured JSON
3. Schema: project_name (string), attendees (list with name/title), action_items (list with task_description/assigned_to/due_date)
4. Reference date: Monday, February 9, 2025
5. Ambiguity rules: ASAP=2025-02-10, by Friday=2025-02-14, missing=null, format=YYYY-MM-DD
6. Output JSON only (no markdown)

Output ONLY the prompt text. The transcript will be appended separately."""

# --- Let's run the two-step process ---

# Step 1: Use the meta-prompt to have the prompt_agent generate a new prompt for us.
print("--- Generating a new prompt using the prompt_agent... ---")
generated_prompt = prompt_agent(meta_prompt, supervillain_transcript)
print(generated_prompt)
print("----------------------------------------------------")


# Step 2: Use the newly generated prompt to process the transcript with our original agent.
print("\n\n--- Using the generated prompt to extract data... ---")
# prompt_agent has already appended the transcript, so the prompt can be sent as-is.
final_output = basic_agent(generated_prompt)

# Parse the output, accepting either raw JSON or a fenced ```json block.
parsed_json = parse_json_output(final_output)
if parsed_json is not None:
    print("\n✅ Final JSON Output:\n")
    print(json.dumps(parsed_json, indent=2))
else:
    print(f"\n❌ Output is NOT valid nested JSON. Raw output:\n{final_output}")

--- Generating a new prompt using the prompt_agent... ---
```json
{
  "role": "expert project manager bot",
  "goal": "extract structured JSON from meeting transcript",
  "schema": {
    "project_name": "string",
    "attendees": [
      {
        "name": "string",
        "title": "string"
      }
    ],
    "action_items": [
      {
        "task_description": "string",
        "assigned_to": "string",
        "due_date": "YYYY-MM-DD"
      }
    ]
  },
  "reference_date": "2025-02-09",
  "ambiguity_rules": {
    "ASAP": "2025-02-10",
    "by Friday": "2025-02-14",
    "missing": null,
    "format": "YYYY-MM-DD"
  },
  "output_format": "JSON only (no markdown)",
  "transcript": "APPEND TRANSCRIPT HERE"
}
```

Transcript:

Project Doom-Spire - Weekly Status Update - Monday, February 9, 2025

Attendees: Dr. Dire (Evil Overlord), Brenda (Lair HR & Logistics), Gary (Lead Henchman), Chad (Intern)

Dr. Dire: People! The Doom-Spire is 80% complete, but I stood on the Parapet of Pain this mo

# Part 3: Eliciting Reasoning [20 points]
You will practice two reasoning-friendly prompt patterns: Chain-of-Thought for constraint solving and Reflection for self-critique. Each sub-part has a single correct answer, so your prompts must spell out how the model should reason before giving the final response.

## 3.1: Chain-of-Thought (CoT) for Complex Problems [10 points]

For complex problems, like logic puzzles, simply asking for the answer can be unreliable. The Chain-of-Thought (CoT) technique guides the model to "think step-by-step," breaking down the problem and showing its work. Writing out the intermediate steps often makes the model more likely to arrive at the correct answer.

The Task: Solve a moderately difficult logic puzzle with a single correct solution.

*Four wizards are discussing their magical pets. The wizards are Arthur, Beatrice, Cassandra, and Desmond. The pets are an Owl, a Griffin, a Phoenix, and a Dragon. Each wizard owns exactly one pet.*
* *Constraint 1: Beatrice owns the Dragon.*
* *Constraint 2: The owner of the Griffin is not Cassandra or Desmond.*
* *Constraint 3: Arthur does not own the Phoenix.*
* *Constraint 4: Cassandra's pet is not the Owl.*

**Your Task: Write a prompt that uses the Chain-of-Thought technique to solve this puzzle. The prompt should instruct the model to first lay out the facts, then use a process of elimination, and finally state the answer in a JSON format.**
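Because the puzzle is small, the model's final JSON can be checked against a brute-force search over all 24 assignments. This sketch is independent of the LLM and exists only to verify the answer; the names `WIZARDS`, `PETS`, and `satisfies_constraints` are illustrative.

```python
from itertools import permutations

WIZARDS = ("Arthur", "Beatrice", "Cassandra", "Desmond")
PETS = ("Owl", "Griffin", "Phoenix", "Dragon")


def satisfies_constraints(owner: dict) -> bool:
    """Check one wizard-to-pet assignment against the four constraints."""
    return (
        owner["Beatrice"] == "Dragon"                                            # Constraint 1
        and owner["Cassandra"] != "Griffin" and owner["Desmond"] != "Griffin"    # Constraint 2
        and owner["Arthur"] != "Phoenix"                                         # Constraint 3
        and owner["Cassandra"] != "Owl"                                          # Constraint 4
    )


# Try every way of handing the four pets to the four wizards.
candidates = (dict(zip(WIZARDS, pets)) for pets in permutations(PETS))
solutions = [owner for owner in candidates if satisfies_constraints(owner)]
print(solutions)
```

Exactly one assignment survives, so comparing it with the model's JSON tells you whether the chain of thought reached the right conclusion.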


In [40]:
logic_puzzle = """
Four wizards are discussing their magical pets. The wizards are Arthur, Beatrice, Cassandra, and Desmond. The pets are an Owl, a Griffin, a Phoenix, and a Dragon. Each wizard owns exactly one pet.
- Constraint 1: Beatrice owns the Dragon.
- Constraint 2: The owner of the Griffin is not Cassandra or Desmond.
- Constraint 3: Arthur does not own the Phoenix.
- Constraint 4: Cassandra's pet is not the Owl.
"""

cot_puzzle_prompt = f"""
You are a logic-solver.

Solve the puzzle by SHOWING YOUR WORK at each step:
1) List the given facts/constraints as bullet points.
2) Create a small elimination table (who can/can't own each pet).
3) Use process-of-elimination, one inference per step, until you reach the only consistent assignment.
4) Double-check every constraint against your final assignment.

Finally, output ONLY valid JSON in this exact format (no markdown, no extra text):
{{
  "Arthur": "<pet>",
  "Beatrice": "<pet>",
  "Cassandra": "<pet>",
  "Desmond": "<pet>"
}}

Puzzle:
{logic_puzzle}
"""

output = basic_agent(cot_puzzle_prompt)
print(output)

Step 1: List the given facts/constraints
- Constraint 1: Beatrice owns the Dragon.
- Constraint 2: The owner of the Griffin is not Cassandra or Desmond.
- Constraint 3: Arthur does not own the Phoenix.
- Constraint 4: Cassandra's pet is not the Owl.

 Step 2: Create an elimination table
We'll represent the possible pets for each wizard based on the constraints.

| Wizard      | Owl | Griffin | Phoenix | Dragon |
|-------------|-----|---------|---------|--------|
| Arthur      | ?   | ?       | X       | ?      |
| Beatrice    | ?   | ?       | ?       | ✓      |
| Cassandra   | X   | X       | ?       | ?      |
| Desmond     | ?   | X       | ?       | ?      |

 Step 3: Process of elimination
1. Constraint 1: Beatrice owns the Dragon.
   - Beatrice's row: Owl (X), Griffin (X), Phoenix (X), Dragon (✓).
   - Other wizards cannot own the Dragon.

2. Constraint 2: The owner of the Griffin is not Cassandra or Desmond.
   - Cassandra's row: Griffin (X).
   - Desmond's row: Griffin (X).

## 3.2 Reflection and Refinement (Self-Correction) [10 points]
Make the model critique its own startup idea in three clear passes:
1. *Step 1 – Ideation:* act as an optimistic founder, name the app, and describe the core function in one sentence.
2. *Step 2 – Skeptical VC:* switch personas and answer the classic diligence questions (problem urgency, current alternatives, market size, competitors & advantage, monetization).
3. *Step 3 – Improved Concept:* return to the founder persona, fix every weakness the VC raised, and explain why the new version is more defensible.

**Your Task: Create three prompts, one for each step of the process, and call `basic_agent` three times to run them in order.**
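The three calls form a simple pipeline in which each step's output is pasted into the next prompt. A generic sketch of that wiring is shown here; `run_reflection_chain` and the `{idea}` / `{critique}` placeholders are hypothetical, and the agent is passed in so the flow can be tried with any callable.

```python
from typing import Callable


def run_reflection_chain(
    agent: Callable[[str], str],
    idea_prompt: str,
    critique_template: str,
    revise_template: str,
) -> dict:
    """Run ideation, critique, and revision, feeding each output forward."""
    idea = agent(idea_prompt)
    # The critic sees only the idea.
    critique = agent(critique_template.format(idea=idea))
    # The reviser sees both the original idea and the critique.
    revision = agent(revise_template.format(idea=idea, critique=critique))
    return {"idea": idea, "critique": critique, "revision": revision}
```

Passing `basic_agent` as `agent` would reproduce the three-call flow, with the templates holding the persona instructions for each step.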


In [41]:
step1_prompt = """
  You are an optimistic startup founder.

Task:
- Invent ONE original new startup app idea.
- Name the app.
- Describe the core function in EXACTLY one sentence.

Constraints:
- Be specific about the user and the problem.
- Output ONLY plain text (no markdown, no lists).
"""

step1_output = basic_agent(step1_prompt)
print(step1_output)

step2_prompt = f"""
  You are a skeptical VC doing due diligence.

Critique the concept below by answering these questions clearly:
1) Problem urgency: who has this problem and how painful/urgent is it?
2) Current alternatives: what do people use today instead?
3) Market size: is this niche, medium, or huge? Why?
4) Competitors & advantage: who would compete and what is this concept's edge (or lack of one)?
5) Monetization: how would it make money, and what are pricing risks?

Be honest and specific. Output plain text only (no markdown).

Concept to critique:
{step1_output}
"""

step2_output = basic_agent(step2_prompt)
print("\n" + step2_output)

step3_prompt = f"""
You are an optimistic founder, and you must FIX every weakness raised by the VC.

Instructions:
- Start by giving the improved app a name (can be the same or new).
- Provide an improved one-sentence core function.
- Then, in 5 short bullets (one per item), explain how the new version addresses:
  (a) urgency, (b) alternatives, (c) market size, (d) competitors/advantage, (e) monetization.
- You MUST explicitly reference points from the VC critique and show how you resolved them.

Output plain text only (no markdown).

Original concept:
{step1_output}

VC critique:
{step2_output}
"""

step3_output = basic_agent(step3_prompt)
print("\n" + step3_output)


App Name: ThriveTogether

Core Function: ThriveTogether connects remote workers with local co-working buddies in their city, matching them based on work styles and schedules to combat loneliness and boost productivity through in-person meetups.

1) Problem urgency: Remote workers who feel isolated or unproductive due to lack of social interaction might find this useful, but it’s not a life-or-death problem. The urgency depends on the individual: some thrive alone, others crave connection. Pain level is moderate, not urgent.

2) Current alternatives: People already use Slack/Discord communities, local meetup groups, or coworking spaces. Many remote workers also just work from cafes or libraries. The main difference is ThriveTogether’s focus on curated 1:1 or small-group matchmaking.

3) Market size: Medium to niche. Remote work is growing, but the subset of people who want in-person meetups with strangers is smaller. The total addressable market is large (millions of remote workers

# Part 4: Conversational Memory [30 points]
You will build a three-layer travel concierge stack:
1. Buffer the entire conversation so every reply remembers earlier preferences.
2. Convert those turns into a structured memory store of durable trip facts.
3. Feed the memory summaries back into the next response so the concierge stays consistent.

Complete the steps in order; each helper you write is reused in the following section.


## 4.1 Conversation Buffer [10 points]
Implement two helpers that keep the running dialogue intact:
- `format_transcript_for_prompt(messages)` → return a newline-delimited string of labeled turns plus an instruction telling the assistant to continue.
- `run_buffered_conversation(system_context, user_turns)` → step through each user turn, rebuild the prompt with the full transcript, call `basic_agent`, store every reply, and return both the transcript and a dialogue log.

Then, run the provided travel scenario and print both a single-shot response and the buffered dialogue so you can highlight why the buffered agent performs better.
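The difference between the two setups can be seen without calling the model by comparing the prompts themselves. In this sketch, `build_prompt` is a simplified, hypothetical stand-in for the helpers you will write, and the short history is made up for illustration.

```python
def build_prompt(system_context: str, turns: list) -> str:
    """Flatten (role, text) pairs into one labeled prompt string."""
    lines = [f"System: {system_context}"]
    lines += [f"{role}: {text}" for role, text in turns]
    lines.append("Assistant:")
    return "\n".join(lines)


history = [
    ("User", "I'm allergic to shellfish."),
    ("Assistant", "Noted."),
    ("User", "Suggest a dinner spot."),
]

buffered = build_prompt("You are a travel concierge.", history)
single_shot = build_prompt("You are a travel concierge.", history[-1:])

# Only the buffered prompt carries the allergy into the final request.
print("shellfish" in buffered, "shellfish" in single_shot)  # True False
```

The model has no memory between calls, so a constraint reaches it only if the prompt repeats it.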


In [42]:
def format_transcript_for_prompt(messages):
    """Return a speaker-labeled transcript string that becomes the next prompt."""
    lines = []
    for m in messages:
        role = m["role"].capitalize()
        lines.append(f"{role}: {m['content']}")
    lines.append("Assistant: Continue the conversation using the full transcript above. Keep your reply concise.")
    return "\n".join(lines)

def run_buffered_conversation(system_context, user_turns):
    """Return the updated transcript and dialogue history after iterating over every user turn."""
    messages = [{"role": "system", "content": system_context}]
    dialogue_log = []

    for user_msg in user_turns:
        messages.append({"role": "user", "content": user_msg})
        prompt = format_transcript_for_prompt(messages)
        assistant_msg = basic_agent(prompt)
        messages.append({"role": "assistant", "content": assistant_msg})
        dialogue_log.append({"user": user_msg, "assistant": assistant_msg})

    transcript = format_transcript_for_prompt(messages)
    return transcript, dialogue_log


travel_system_prompt = (
  "You are a detail-oriented travel concierge. Track traveler dates, transit rules, dietary needs, and must-see spots so itineraries stay consistent. Keep replies under 120 words."
)
travel_turns = [
  "Hi, we're planning a June 10–16 trip to Italy. We'd like three nights in Rome and three in Florence.",
  "Please avoid redeye flights, schedule trains before 8pm, and remember I'm allergic to shellfish.",
  "We love boutique hotels near museums and want one day trip to Tuscany with a local cooking class. Can you sketch the plan?"
]

buffered_transcript, buffered_dialogue = run_buffered_conversation(
  travel_system_prompt,
  travel_turns
)

single_shot_prompt = (
    f"{travel_system_prompt}\n"
    f"User: {travel_turns[-1]}\n"
    "Assistant:"
)
single_shot_response = basic_agent(single_shot_prompt)

print("Single-shot response (no buffer):\n")
print(single_shot_response)
print("\n---\n")
print("Buffered dialogue:")
for idx, turn in enumerate(buffered_dialogue, start=1):
    print(f"Turn {idx} | User: {turn['user']}")
    print(f"Turn {idx} | Assistant: {turn['assistant']}\n")
print("Final buffered reply clearly references the earlier turns.")

Single-shot response (no buffer):

3-Day Florence Itinerary

Day 1-2: Florence
- Stay at Hotel Brunelleschi (boutique, near Duomo & Uffizi).
- Must-sees: Uffizi Gallery, Accademia (David), Ponte Vecchio.
- Dinner at Trattoria Mario (local, no reservations).

Day 3: Tuscany Day Trip
- Drive to San Gimignano (1.5 hrs). Cooking class at Tuscany Cooking Class (farm-to-table pasta).
- Return to Florence by evening.

Notes:
- Pack light for transit; no checked bags needed.
- Notify hotel of dietary needs (e.g., gluten-free) 48hrs ahead.

---

Buffered dialogue:
Turn 1 | User: Hi, we're planning a June 10–16 trip to Italy. We'd like three nights in Rome and three in Florence.
Turn 1 | Assistant: Got it! For your June 10–16 Italy trip:
- Rome (June 10–13): Must-see spots? Colosseum, Vatican, Trevi Fountain.
- Florence (June 13–16): Duomo, Uffizi Gallery, Ponte Vecchio.
- Any dietary restrictions or transit preferences (e.g., trains vs. flights)? I’ll refine your itinerary!

Turn 2 | 

## 4.2 Structured Memory Store (Fact Capture) [10 points]
Now that buffering works, capture persistent travel facts after every turn.
- `extract_conversation_memories(user_message, assistant_message)` → call the model to emit a JSON list of durable facts.
- `update_memory_store(store, memories)` → merge those facts by topic while deduplicating details.
- `run_memory_capture_session(system_context, user_turns)` → reuse the buffer helpers, call the extractor after each reply, print the evolving store, and return both the transcript and the final `memory_store`.

Aim for a store that records dates, budgets, dietary needs, transit rules, and backup plans mentioned in the scenario.


In [43]:
import json
import re

def extract_conversation_memories(user_message, assistant_message):
    """Use an LLM to extract key facts into a JSON list: [{'topic': ..., 'detail': ...}]"""
    extractor_prompt = f"""Extract travel facts as JSON list: [{{"topic": "...", "detail": "..."}}]

Topics to look for: dates, destinations, budget, dietary, transit_rules, lodging_preferences, activities, backup_plans
Only include concrete facts. Output JSON only (no markdown).

User: {user_message}
Assistant: {assistant_message}"""

    output = basic_agent(extractor_prompt)
    parsed = parse_json_output(output)

    if isinstance(parsed, list):
        return [m for m in parsed if isinstance(m, dict) and "topic" in m and "detail" in m]
    return []


def update_memory_store(store, memories):
    """Merge new memories into the store, avoiding duplicates."""
    if store is None:
        store = {}
    for mem in memories or []:
        topic = mem.get("topic", "").lower().strip()
        detail = mem.get("detail", "").strip()
        if not topic or not detail:
            continue
        if topic not in store:
            store[topic] = []
        if detail.lower() not in [d.lower() for d in store[topic]]:
            store[topic].append(detail)
    return store

# --- Test Logic ---
travel_transcript = []
travel_memory_store = {}

turn_1 = "I'm planning a trip to Italy from June 10-16. Budget is tight."
reply_1 = "Noted. I'll find budget options for your Italy trip in June."

print("Processing Turn 1...")
new_mems = extract_conversation_memories(turn_1, reply_1)
travel_memory_store = update_memory_store(travel_memory_store, new_mems)

print("\nUpdated Store:")
print(json.dumps(travel_memory_store, indent=2))

Processing Turn 1...

Updated Store:
{
  "dates": [
    "June 10-16"
  ],
  "destinations": [
    "Italy"
  ],
  "budget": [
    "tight"
  ]
}


## 4.3 Memory-Augmented Responses (Personalization Loop) [10 points]
Close the loop by reusing the stored facts:
1. `summarize_relevant_memories` → collapse the store into ≤5 bullet points that an LLM can skim quickly.
2. `memory_aware_agent` → prepend those bullets (when available) to the system instructions, remind the model to cite them explicitly, and fall back to the base prompt if the store is empty.
3. After generating the follow-up reply, run a lightweight check (keyword search is fine) to confirm the answer referenced at least one stored fact.

If the response ignores the memory summary, adjust the prompt and rerun.
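Steps 2 and 3 can be sketched independently of the model. The helper names `build_memory_prompt` and `references_memory` are hypothetical, and the four-letter minimum is an arbitrary assumption to skip filler words. The check is meant to run against facts stored before the reply was generated, since facts extracted from the reply itself would always match.

```python
def build_memory_prompt(base_prompt: str, memory_bullets: str) -> str:
    """Prepend remembered facts to the system prompt, or fall back to the base prompt."""
    if not memory_bullets:
        return base_prompt
    return (
        f"{base_prompt}\n\n"
        f"Remembered facts (cite these explicitly when relevant):\n{memory_bullets}"
    )


def references_memory(reply: str, store: dict) -> bool:
    """Return True if the reply contains any substantive word from a stored detail."""
    reply_lower = reply.lower()
    for details in store.values():
        for detail in details:
            words = [w.strip(".,;:!?").lower() for w in detail.split()]
            # Skip very short words ("to", "in") that match almost any reply.
            if any(len(w) >= 4 and w in reply_lower for w in words):
                return True
    return False
```

A substring match like this is deliberately crude; it flags replies worth a second look rather than proving that the memory was used.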


In [44]:
def summarize_memories(store):
    """Convert the memory dictionary into a text summary (max 5 bullets)."""
    if not store:
        return ""
    bullets = [f"- {topic}: {'; '.join(details)}" for topic, details in store.items()]
    return "\n".join(bullets[:5])


def run_memory_aware_conversation(base_system_prompt, user_turns):
    """Run a full conversation where the agent learns and remembers."""
    messages = [{"role": "system", "content": base_system_prompt}]
    memory_store = {}

    for user_msg in user_turns:
        messages.append({"role": "user", "content": user_msg})

        # Inject memory into system prompt if available
        memory_summary = summarize_memories(memory_store)
        if memory_summary:
            system_with_memory = f"{base_system_prompt}\n\nRemembered facts (cite these explicitly when relevant):\n{memory_summary}"
            messages[0] = {"role": "system", "content": system_with_memory}

        prompt = format_transcript_for_prompt(messages)
        assistant_msg = basic_agent(prompt)
        messages.append({"role": "assistant", "content": assistant_msg})

        print(f"\nUser: {user_msg}")
        print(f"Assistant: {assistant_msg}")

        # Lightweight check (keyword search): did the reply reference a fact that
        # was stored BEFORE this turn? Run it before adding new memories so facts
        # extracted from this very reply cannot make the check pass trivially.
        if memory_store:
            keywords = [detail.split()[0].lower() for details in memory_store.values() for detail in details if detail.split()]
            referenced = any(kw in assistant_msg.lower() for kw in keywords)
            if not referenced:
                print("(Note: Response may not have referenced stored memories)")

        # Extract and store new memories
        new_memories = extract_conversation_memories(user_msg, assistant_msg)
        memory_store = update_memory_store(memory_store, new_memories)
        print(f"Memory topics: {list(memory_store.keys())}")

    return memory_store

# --- Run the Scenario ---
travel_turns = [
  "We're going to Italy, June 10–16. Stay in Rome and Florence.",
  "I'm allergic to shellfish, so find safe restaurants.",
  "For the Rome leg, can you suggest a dinner spot?"
]

system_prompt = "You are a helpful travel assistant."
final_store = run_memory_aware_conversation(system_prompt, travel_turns)

print("\nFinal Memory State:")
print(summarize_memories(final_store))


User: We're going to Italy, June 10–16. Stay in Rome and Florence.
Assistant: That sounds like a fantastic trip! Here’s a quick itinerary suggestion:

Rome (June 10–13):
- Day 1: Colosseum, Roman Forum, Trevi Fountain
- Day 2: Vatican City (St. Peter’s Basilica, Sistine Chapel)
- Day 3: Pantheon, Piazza Navona, Trastevere

Florence (June 13–16):
- Day 1: Duomo, Uffizi Gallery, Ponte Vecchio
- Day 2: Accademia (David), Piazzale Michelangelo
- Day 3: Day trip to Tuscany (wine tasting or Chianti region)

Would you like recommendations for restaurants, hotels, or transportation?
Memory topics: ['dates', 'destinations', 'lodging_preferences', 'activities']

User: I'm allergic to shellfish, so find safe restaurants.
Assistant: Here are shellfish-free restaurant recommendations in Rome and Florence:

Rome:
- Roscioli (Pasta, meat dishes)
- Trattoria Da Enzo al 29 (Roman classics, ask for shellfish-free options)
- La Carbonara (Pasta specialties, no seafood)

Florence:
- Trattoria M