<div style="font-size: 13px; line-height: 1.4; margin: 0; padding: 0;">
<h5 style="margin-bottom: 0.2em;">
This notebook documents my theoretical study alongside the lab exercises conducted on <b>July 2, 2025</b>.
</h5>
</div>

### <u style="margin-bottom: 0;">**THEORY STUDY**</u>

##### ***LESSON 3: Agentic AI Development Tools***

<div style="font-size: 14px; line-height: 1.4; margin: 0; padding: 0;">
<b>Lesson Overview: </b>
This lesson introduces the concept of <b>agents</b>, differentiates them from <b>workflows</b>, explores frameworks like <b>LangChain</b>, <b>LlamaIndex</b>, <b>CrewAI</b>, and <b>AutoGen</b>, and discusses coding agents both <b>manually</b> and using <b>framework tools</b>.
</div>

<h5 style="margin-bottom: 0.2em;"><b>Theory Summary</b></h5>

<div style="font-size: 14px; line-height: 1.4; margin: 0; padding: 0;">
<b>What is an Agent?</b><br>
<b>Overview</b>: An agent is an autonomous entity that uses tools, memory, and planning to perform tasks. Unlike static workflows, agents decide their behavior at runtime using LLMs for reasoning and action.<br>
<b>Key Terms</b>:<br>
- <b>Agent</b>: Dynamic, runtime decision-making entity, often called an "actor."<br>
- <b>Autonomous</b>: Capable of independent action based on observations and reasoning.<br>
</div>

<div style="font-size: 14px; line-height: 1.4; margin: 0; padding: 0;">
<b>Agent vs. Workflow</b><br>
<b>Overview</b>: Workflows are predefined sequences of steps (static), while agents are dynamic, with behavior determined at runtime. Workflows organize agents and tools and provide structure, while agents handle decision-making.<br>
<b>Key Terms</b>:<br>
- <b>Workflow</b>: Static sequence, often labeled "orchestrator," defined at design time.<br>
- <b>Agent</b>: Dynamic entity, powered by LLMs, with runtime flexibility but risk of compounding errors.<br>
<b>Significance</b>: Clarifies the complementary roles of workflows (structure) and agents (adaptability) in agentic systems.
</div>
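
To make the contrast concrete, here is a minimal sketch, not taken from the lesson. The tool names and `fake_policy` are hypothetical; in a real agent the policy would be an LLM call. The workflow fixes its step order at design time, while the agent asks the policy which tool to use at runtime.

```python
from typing import Callable, Dict, List

# Two toy "tools" shared by both approaches (hypothetical examples).
def to_upper(text: str) -> str:
    return text.upper()

def reverse(text: str) -> str:
    return text[::-1]

TOOLS: Dict[str, Callable[[str], str]] = {"to_upper": to_upper, "reverse": reverse}

def run_workflow(text: str) -> str:
    """Static workflow: the step order is fixed at design time."""
    for step in ("to_upper", "reverse"):
        text = TOOLS[step](text)
    return text

def run_agent(text: str, choose_tool: Callable[[str, List[str]], str]) -> str:
    """Agent: a policy (normally an LLM call) picks the tool at runtime."""
    tool_name = choose_tool(text, list(TOOLS))
    if tool_name not in TOOLS:  # guard against an invalid choice
        return text
    return TOOLS[tool_name](text)

def fake_policy(text: str, tool_names: List[str]) -> str:
    """Hypothetical stand-in for the LLM's decision."""
    return "reverse" if text.islower() else "to_upper"

print(run_workflow("abc"))            # CBA
print(run_agent("abc", fake_policy))  # cba
print(run_agent("Abc", fake_policy))  # ABC
```

The guard in `run_agent` reflects the risk noted above: a runtime decision can be wrong, so the caller has to handle choices that do not map to a known tool.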

<div style="font-size: 14px; line-height: 1.4; margin: 0; padding: 0;">
<b>Augmented LLMs</b><br>
<b>Overview</b>: LLMs enhanced with external elements like retrieval-augmented generation (RAG), the internet, or memory of past interactions, enabling complex task performance beyond basic text generation.<br>
<b>Key Terms</b>:<br>
- <b>RAG</b>: Retrieval-Augmented Generation, integrating real-time data into LLM responses.<br>
- <b>Tools/Memory</b>: External resources (e.g., APIs, chat history) accessed via connection standards such as the Model Context Protocol.<br>
- <b>Visual Elements</b>: Diagrams likely showing LLM connections to inputs, outputs, info retrieval, tools, and memory.<br>
</div>
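
As a rough illustration of the retrieval step in RAG, the sketch below ranks a few hypothetical documents by word overlap with the query and prepends the best match to the prompt. Real systems typically use embeddings and a vector store; the overlap scoring, `DOCS`, and the function names here are assumptions made for the example.

```python
import re
from typing import List

# Hypothetical mini knowledge base.
DOCS = [
    "LangChain is a framework for building LLM applications.",
    "Ollama runs open models such as Llama locally.",
    "Streaming sends a response in chunks as it is generated.",
]

def tokens(text: str) -> set:
    """Lowercase word set, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(query: str, docs: List[str], k: int = 1) -> List[str]:
    """Toy retriever: rank documents by the number of words shared with the query."""
    query_words = tokens(query)
    ranked = sorted(docs, key=lambda d: len(query_words & tokens(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, docs: List[str]) -> str:
    """Augment the user question with the retrieved context."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}"

prompt = build_prompt("What does Ollama run locally?", DOCS)
print(prompt)
```

The resulting string is what would be sent to the LLM in place of the bare question.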

<div style="font-size: 14px; line-height: 1.4; margin: 0; padding: 0;">
<b>Agentic Frameworks Comparison</b><br><br>
<table style="width:100%; border-collapse: collapse;">
  <tr>
    <th align="left" style="border-bottom: 1px solid #ccc; padding: 6px;">Framework</th>
    <th align="left" style="border-bottom: 1px solid #ccc; padding: 6px;">Overview</th>
    <th align="left" style="border-bottom: 1px solid #ccc; padding: 6px;">Key Techniques / Capabilities</th>
    <th align="left" style="border-bottom: 1px solid #ccc; padding: 6px;">Best Use Cases</th>
    <th align="left" style="border-bottom: 1px solid #ccc; padding: 6px;">Significance</th>
  </tr>
  <tr>
    <td style="padding: 6px;"><b>LangChain</b></td>
    <td style="padding: 6px;">Modular framework for building chains, agents, and tool-augmented LLM apps.</td>
    <td style="padding: 6px;">
      - Chains (sequences)<br>
      - Tools (APIs, Python)<br>
      - Memory (summarization, chat)<br>
      - Agents: reasoning → action → observation<br>
      - Zero-shot, reactive, conversational agents
    </td>
    <td style="padding: 6px;">Tool-augmented chatbots, document Q&A pipelines, web scraping, search</td>
    <td style="padding: 6px;">Flexible, general-purpose LLM + tools + memory integration</td>
  </tr>
  <tr>
    <td style="padding: 6px;"><b>LlamaIndex</b></td>
    <td style="padding: 6px;">Data-aware agents focused on document interaction and retrieval.</td>
    <td style="padding: 6px;">
      - RAG loops<br>
      - Task decomposition<br>
      - Document querying, planning<br>
      - Tool + memory support
    </td>
    <td style="padding: 6px;">Structured/semi-structured document search, memory-aware pipelines</td>
    <td style="padding: 6px;">Optimized for knowledge-heavy and document-centric agent tasks</td>
  </tr>
  <tr>
    <td style="padding: 6px;"><b>CrewAI</b></td>
    <td style="padding: 6px;">Multi-agent collaboration framework (limited detail).</td>
    <td style="padding: 6px;">
      - Agent teamwork<br>
      - Coordination & interaction
    </td>
    <td style="padding: 6px;">Collaborative tasks requiring multiple roles or expertise</td>
    <td style="padding: 6px;">Introduces multi-agent systems for coordinated intelligence</td>
  </tr>
  <tr>
    <td style="padding: 6px;"><b>AutoGen</b></td>
    <td style="padding: 6px;">Advanced framework for multi-agent conversations and reflection.</td>
    <td style="padding: 6px;">
      - Agents as functions (e.g. CodeAgent)<br>
      - GroupChat: turn-based messaging<br>
      - Self-reflection & critique loops<br>
      - Code execution, debugging, feedback
    </td>
    <td style="padding: 6px;">Codegen, simulations, cooperative or adversarial multi-agent workflows</td>
    <td style="padding: 6px;">Enables self-improving and sophisticated multi-agent coordination</td>
  </tr>
</table>
</div>

<div style="font-size: 14px; line-height: 1.4; margin: 0; padding: 0;">
<b>Coding Agents</b><br><br>
<b>Manually in Python</b><br>
<b>Overview</b>: Coding agents by hand is transparent but requires manual context retrieval, lacks abstraction, and is hard to scale.<br>
<b>Example</b>: Simple agent logic (e.g., prompt → LLM → parse output → tool action → final answer).<br>
<b>Significance</b>: Shows the baseline approach, highlighting challenges that frameworks address.<br><br>
<b>Using Frameworks</b><br>
<b>Overview</b>: Frameworks like LangChain, LlamaIndex, and CrewAI reduce boilerplate code and enable scalability.<br>
<b>Key Resources</b>:<br>
- <b>LangChain Hub</b>: Community-driven registry of components (chains, agents, prompts, toolkits).<br>
- <b>LangGraph</b>: Offers agents from simple routers to multi-agent collaboration graphs.<br>
<b>Example</b>: LangGraph’s logic: define states, routing functions, and actions, then loop through them until the desired result is achieved.<br>
<b>Significance</b>: Emphasizes practical benefits of frameworks for efficient, scalable agent development.<br><br>
<b>Practical Example</b><br>
<b>Slide content</b> (Page 30) shows a simple agent calculating the square root of 256:<br>
- <b>Thought</b>: Use a calculator.<br>
- <b>Action</b>: Calculator with input sqrt(256).<br>
- <b>Observation</b>: 16.<br>
- <b>Final Answer</b>: 16.<br>
<b>Significance</b>: Illustrates the agent’s reasoning-action-observation loop in practice.
</div>
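
The state/routing/action idea can be sketched in plain Python without any framework. This is not LangGraph's API; the node names, `route`, and `run` are hypothetical, and the "thought" is hard-coded where a real agent would call an LLM. It replays the square-root example from the slide.

```python
import math
from typing import Any, Callable, Dict

State = Dict[str, Any]

def think(state: State) -> State:
    # A real agent would ask an LLM; here the thought is hard-coded.
    state["thought"] = "Use a calculator."
    return state

def act(state: State) -> State:
    # Tool call: the calculator action from the slide example.
    state["observation"] = math.sqrt(state["input"])
    return state

def finish(state: State) -> State:
    state["final_answer"] = state["observation"]
    return state

NODES: Dict[str, Callable[[State], State]] = {"think": think, "act": act, "finish": finish}

def route(state: State) -> str:
    """Routing function: choose the next node from what the state already holds."""
    if "thought" not in state:
        return "think"
    if "observation" not in state:
        return "act"
    if "final_answer" not in state:
        return "finish"
    return "END"

def run(state: State, max_steps: int = 10) -> State:
    """Loop over nodes until the router signals END (or the step budget runs out)."""
    for _ in range(max_steps):
        node = route(state)
        if node == "END":
            break
        state = NODES[node](state)
    return state

result = run({"input": 256})
print(result["final_answer"])  # 16.0
```

The `max_steps` cap is a simple safeguard against a router that never reaches `END`.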

<h5 style="margin-bottom: 0.2em;"><b>Practical Examples</b></h5>

In [None]:
# Simple autonomous agent example
class WeatherAgent:
    def __init__(self, llm_client):
        self.llm_client = llm_client
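        self.tools = {
            "get_weather": self.get_weather_api,
            "get_forecast": self.get_forecast_api
        }

    def get_weather_api(self, query):
        # Placeholder tool: swap in a call to a real weather API
        return f"[stub] current weather for: {query}"

    def get_forecast_api(self, query):
        # Placeholder tool: swap in a call to a real forecast API
        return f"[stub] forecast for: {query}"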
    
    def observe_and_act(self, user_query):
        # Agent observes the query and decides what tool to use
        decision_prompt = f"""
        User query: {user_query}
        Available tools: {list(self.tools.keys())}
        Which tool should I use? Respond with just the tool name.
        """
        
        tool_choice = self.llm_client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": decision_prompt}]
        ).choices[0].message.content.strip()
        
        # Autonomous action based on reasoning
        if tool_choice in self.tools:
            return self.tools[tool_choice](user_query)
        else:
            return "I'm not sure how to help with that."

In [None]:
# Manual agent implementation (reasoning → action → observation)
class SimpleCalculatorAgent:
    def __init__(self, openai_client):
        self.client = openai_client
        
    def solve_problem(self, problem):
        # Step 1: THOUGHT - Agent reasons about the problem
        thought_prompt = f"""
        Problem: {problem}
        Think: What tool do I need to solve this?
        """
        
        thought = self.client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": thought_prompt}]
        ).choices[0].message.content
        
        print(f"Thought: {thought}")
        
        # Step 2: ACTION - Agent decides on action
        # (hard-coded for the slide example; a fuller agent would parse the
        # tool and its input from the LLM output)
        if "square root" in problem.lower():
            action = "Calculator with input sqrt(256)"
            print(f"Action: {action}")
            
            # Step 3: OBSERVATION - Execute and observe result
            import math
            result = math.sqrt(256)
            observation = f"Result: {result}"
            print(f"Observation: {observation}")
            
            # Step 4: FINAL ANSWER
            final_answer = f"The answer is {result}"
            print(f"Final Answer: {final_answer}")
            
            return final_answer

        # No matching tool for this problem
        return "I don't have a tool for that problem."

# Usage example matching the theory above
from openai import OpenAI
agent = SimpleCalculatorAgent(OpenAI())
agent.solve_problem("What is the square root of 256?")

# Example output (the Thought line is model-generated and will vary):
# Thought: <model reasoning, e.g. "Use a calculator">
# Action: Calculator with input sqrt(256)
# Observation: Result: 16.0
# Final Answer: The answer is 16.0

<br>

##### ***LESSON 4: Agentic AI Development Protocols***

<div style="font-size: 14px; line-height: 1.4; margin: 0; padding: 0;">
<b>Lesson Overview: </b>
This lesson addresses the need for <b>protocols</b> to structure agent interactions, introduces the <b>Message Chain Protocol (MCP)</b> and the <b>Model Context Protocol</b>, and compares MCP with <b>Agent-to-Agent (A2A)</b> approaches.
</div>

<h5 style="margin-bottom: 0.2em;"><b>Theory Summary</b></h5>

<div style="font-size: 14px; line-height: 1.4; margin: 0; padding: 0;">
<b>Why Agentic Protocols are Needed</b><br>
<b>Overview</b>: Agent-to-agent exchanges at scale lead to duplicated efforts, lost state, and debugging challenges without a shared structure.<br>
<b>Key Challenges</b>:<br>
- <b>Error-Prone</b>: Lack of structure increases mistakes.<br>
- <b>Difficult to Scale/Debug</b>: Unstructured interactions complicate management.<br>
- <b>Opaque</b>: Hard to track processes without standardization.<br>
</div>

<!-- <div style="font-size: 14px; line-height: 1.4; margin: 0; padding: 0;">
<b>Agentic Protocols Landscape</b><br>
<b>Overview</b>: Various protocols standardize agent interactions, with <b>MCP</b> and <b>Model Context Protocol</b> as key examples.<br>
<b>Significance</b>: Provides context for the specific protocols discussed, situating them in the broader ecosystem.<br><br>
<b>Message Chain Protocol (MCP)</b><br>
<b>Overview</b>: A protocol structuring each agent action as a message unit with thought, action, observation, and state, chained into a versioned log.<br>
<b>Key Techniques</b>:<br>
- <b>Message Units</b>: Self-contained records of agent actions.<br>
- <b>Chaining</b>: Links messages over time for traceability.<br>
<b>How It Works</b>: Enterprises use MCP with a server hub (e.g., MongoDB, PostgreSQL) to store messages, route requests, and monitor via a Web UI.<br>
<b>Visual Elements</b>: Diagram (Page 8) shows Agent 1 → Agent 2 → Agent 3 message chaining, forming a conversation log.<br>
<b>Advantages</b>: Improves scalability, debuggability, and maintainability.<br>
<b>Significance</b>: Offers a structured approach to agent communication, critical for production systems.<br><br>
<b>MCP in Action</b><br>
<b>Overview</b>: Demonstrates MCP’s practical application in agent interactions.<br>
<b>Example</b>: Likely an MCP exchange (Page 17) showing message formatting and chaining (specific details not fully provided).<br>
<b>Significance</b>: Makes the protocol’s abstract concepts tangible through real-world application.<br><br>
<b>Agent to Agent (A2A)</b><br>
<b>Overview</b>: A simpler, lighter approach to agent interactions, suitable for prototyping but harder to troubleshoot and scale.<br>
<b>Comparison with MCP</b>:<br>
- <b>A2A</b>: Good for simple agents or proofs of concept (PoCs).<br>
- <b>MCP</b>: Better for complex workflows, research, multi-step reasoning, and production agents.<br>
<b>Significance</b>: Highlights trade-offs between simplicity (A2A) and robustness (MCP) in agent communication.
</div> -->

<div style="font-size: 14px; line-height: 1.4; margin-top: 1em;">
<b>Comparison: MCP vs A2A</b><br><br>
<table style="width:100%; border-collapse: collapse;">
  <tr>
    <th align="left" style="border-bottom: 1px solid #ccc; padding: 6px;">Aspect</th>
    <th align="left" style="border-bottom: 1px solid #ccc; padding: 6px;">MCP (Message Chain Protocol)</th>
    <th align="left" style="border-bottom: 1px solid #ccc; padding: 6px;">A2A (Agent-to-Agent)</th>
  </tr>
  <tr>
    <td style="padding: 6px;">Structure</td>
    <td style="padding: 6px;">Formal protocol with message units (thought, action, observation, state)</td>
    <td style="padding: 6px;">Unstructured, ad-hoc communication between agents</td>
  </tr>
  <tr>
    <td style="padding: 6px;">Scalability</td>
    <td style="padding: 6px;">Designed for production-scale, traceable interactions</td>
    <td style="padding: 6px;">Limited scalability; best for simple setups</td>
  </tr>
  <tr>
    <td style="padding: 6px;">Debuggability</td>
    <td style="padding: 6px;">Easy to monitor via versioned logs and dashboards</td>
    <td style="padding: 6px;">Hard to trace interactions and errors</td>
  </tr>
  <tr>
    <td style="padding: 6px;">Use Cases</td>
    <td style="padding: 6px;">Multi-step workflows, research agents, production deployments</td>
    <td style="padding: 6px;">Proof-of-concept demos, small prototypes</td>
  </tr>
  <tr>
    <td style="padding: 6px;">Tooling</td>
    <td style="padding: 6px;">Can integrate with databases (e.g., MongoDB, PostgreSQL), Web UI</td>
    <td style="padding: 6px;">No formal tooling; manually managed state</td>
  </tr>
  <tr>
    <td style="padding: 6px;">Best For</td>
    <td style="padding: 6px;">Reliable agent coordination in production</td>
    <td style="padding: 6px;">Quick tests and experiments</td>
  </tr>
</table>
</div>

<div style="font-size: 14px; line-height: 1.4; margin: 0; padding: 0;">
<b>Model Context Protocol</b><br>
<b>Overview</b>: Introduced by Anthropic in late 2024, an open standard for how LLMs connect to external tools, addressing context loss and inconsistent tool usage.<br>
<b>Key Challenges Addressed</b>:<br>
- Models forgetting context in long conversations.<br>
- Amateurish tool usage via ad hoc prompts.<br>
- Lost context/state in multi-step or multi-agent systems.<br>
<b>Key Techniques</b>:<br>
- <b>Message Types</b>: initialize, resources/list, tools/call, prompts/get, etc.<br>
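- <b>Standardized Formats</b>: JSON-RPC 2.0 messages, with requests carrying <code>method</code> and <code>params</code> and responses carrying <code>result</code> or <code>error</code>.<br>
- <b>Complement to the Message Chain Protocol</b>: Enhances the Message Chain Protocol by structuring tool integration within its messages.<br>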
</div>
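
Model Context Protocol messages are JSON-RPC 2.0 objects, so a request/response pair can be sketched with plain dictionaries. The helper names, the `calculator` tool, and its `expression` argument are hypothetical, and the `result` payload shape is my understanding of the specification rather than something stated in the lesson.

```python
import json
from itertools import count

_ids = count(1)  # simple source of request ids

def make_request(method: str, params: dict) -> dict:
    """Build a JSON-RPC 2.0 request envelope."""
    return {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}

def make_result(request: dict, result: dict) -> dict:
    """Build a success response that echoes the request id."""
    return {"jsonrpc": "2.0", "id": request["id"], "result": result}

def make_error(request: dict, code: int, message: str) -> dict:
    """Build an error response that echoes the request id."""
    return {"jsonrpc": "2.0", "id": request["id"], "error": {"code": code, "message": message}}

# Hypothetical tool call: tool name and arguments are illustrative only.
call = make_request("tools/call", {"name": "calculator", "arguments": {"expression": "sqrt(256)"}})
reply = make_result(call, {"content": [{"type": "text", "text": "16"}]})
failure = make_error(call, -32601, "Method not found")  # standard JSON-RPC error code

print(json.dumps(call, indent=2))
print(json.dumps(reply, indent=2))
```

Matching `id` values are what let a client pair each response with the request that produced it.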

<h5 style="margin-bottom: 0.2em;"><b>Practical Examples</b></h5>

In [None]:
# Simple A2A implementation 
class SimpleA2AAgent:
    def __init__(self, name, system_prompt, client):
        self.name = name
        self.system_prompt = system_prompt
        self.client = client
        self.conversation_history = []
    
    def send_message(self, message, recipient=None):
        """Send message directly to another agent"""
        self.conversation_history.append({"role": "assistant", "content": message})
        return message
    
    def receive_message(self, message, sender=None):
        """Receive and respond to a message"""
        self.conversation_history.append({"role": "user", "content": message})
        
        messages = [{"role": "system", "content": self.system_prompt}] + self.conversation_history
        
        response = self.client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages
        )
        
        reply = response.choices[0].message.content
        self.conversation_history.append({"role": "assistant", "content": reply})
        
        return reply
 
 
# Example usage
from openai import OpenAI
client = OpenAI()

researcher = SimpleA2AAgent(
    "Researcher", 
    "You are a research assistant who gathers facts and data.",
    client
)

analyst = SimpleA2AAgent(
    "Analyst",
    "You are an analyst who interprets data and draws conclusions.",
    client
)

# Simple A2A conversation
question = "What are the benefits of renewable energy?"
research_result = researcher.receive_message(question)
print(f"Researcher: {research_result}")

analysis = analyst.receive_message(f"Based on this research: {research_result}, provide analysis")
print(f"Analyst: {analysis}")

In [None]:
# Simple MCP message structure
import json
import time
from datetime import datetime
from typing import Dict, List, Any

class MCPMessage:
    def __init__(self, agent_id: str, thought: str, action: str, observation: str, state: Dict):
        self.id = f"msg_{int(time.time() * 1000)}"
        self.timestamp = datetime.now().isoformat()
        self.agent_id = agent_id
        self.thought = thought
        self.action = action
        self.observation = observation
        self.state = state
        self.version = "1.0"
    
    def to_dict(self):
        return {
            "id": self.id,
            "timestamp": self.timestamp,
            "agent_id": self.agent_id,
            "thought": self.thought,
            "action": self.action,
            "observation": self.observation,
            "state": self.state,
            "version": self.version
        }
    
    def to_json(self):
        return json.dumps(self.to_dict(), indent=2)

class SimpleMCPLogger:
    def __init__(self):
        self.message_chain = []
    
    def log_message(self, message: MCPMessage):
        self.message_chain.append(message.to_dict())
        print(f"Logged MCP Message {message.id}")
    
    def get_chain_summary(self):
        return {
            "total_messages": len(self.message_chain),
            "agents_involved": list(set(msg["agent_id"] for msg in self.message_chain)),
            "latest_message": self.message_chain[-1] if self.message_chain else None
        }

# Example usage
mcp_logger = SimpleMCPLogger()

# Agent creates MCP message
message = MCPMessage(
    agent_id="calc_agent_001",
    thought="User wants to calculate 25 * 4 + 10",
    action="use_calculator",
    observation="Result: 110",
    state={"last_calculation": "25 * 4 + 10 = 110", "tools_used": ["calculator"]}
)

mcp_logger.log_message(message)
print("Chain Summary:", mcp_logger.get_chain_summary())

<br>

<br>

<br>

### <u style="margin-bottom: 0;">**LAB EXERCISES**</u>

### **WEEK 2**

#### <code>**day1.ipynb**</code>

<div style="font-size: 14px; line-height: 1.4; margin: 0; padding: 0;">
<h5 style="margin-bottom: 0.2em;"><b>Theory Summary</b></h5>
An LLM solution is most suitable for business problems that:<br>
- Involve text-based tasks.<br>
- Have sufficient quality data.<br>
- Can clearly define expected outcomes.<br>
- Are feasible to integrate.<br>
- Align with ethical considerations.<br>
Conducting a thorough assessment of these factors will help determine if an LLM is the right fit for your business challenge.
</div>

<div style="font-size: 14px; line-height: 1.4; margin: 0; padding: 0;">
<h5 style="margin-bottom: 0.2em;"><b>Lab Exercises</b></h5>
<b>1/</b> Implementations of the <i>“Additional exercise to build your experience with the models”</i> section using two approaches, the <b>Ollama library</b> and the <b>OpenAI client</b>, with the <b>llama3.2</b> and <b>gpt-4o-mini</b> models.
</div>

In [None]:
# Imports
import os
import requests
from dotenv import load_dotenv
from bs4 import BeautifulSoup
from IPython.display import Markdown, display, update_display
from openai import OpenAI
import ollama
import time
import pandas as pd
from IPython.display import HTML

In [None]:
# Constants
MODEL_LLAMA = 'llama3.2'
MODEL_GPT = 'gpt-4o-mini'

In [None]:
# Set up the environment
load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')
openai_client = OpenAI()
ollama_client = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')
system_prompt = """You are a helpful and versatile AI assistant. When responding to different types of questions:

For technical/coding questions:
- Break down the code step by step
- Explain what each part does and why
- Provide context about when this would be useful

For creative questions:
- Use vivid, descriptive language
- Draw upon sensory experiences and metaphors
- Be imaginative while staying helpful

For logic puzzles and riddles:
- Think through the problem systematically
- Consider spatial relationships and physical constraints
- Explain your reasoning clearly

For word counting or self-referential questions:
- Be precise and accurate
- Count carefully and double-check your work
- Provide the exact count requested

Always:
- Keep explanations clear and educational
- Respond in markdown format when appropriate
- Be thorough but concise
- Adapt your tone to match the question type"""

In [None]:
# Define the questions

questions = {
    "code_explanation": """Please explain what this code does and why: yield from {book.get("author") for book in books if book.get("author")}""",
    "word_counting": "How many words are there in your answer to this prompt?",
    "creative_description": "In 3 sentences, describe the color Blue to someone who's never been able to see",
    "pushkin_riddle": """A student (thank you Roman) sent me this wonderful riddle, that apparently children can usually answer, but adults struggle with: "On a bookshelf, two volumes of Pushkin stand side by side: the first and the second. The pages of each volume together have a thickness of 2 cm, and each cover is 2 mm thick. A worm gnawed (perpendicular to the pages) from the first page of the first volume to the last page of the second volume. What distance did it gnaw through?" """
}

# Select question
question = questions["pushkin_riddle"]  

question_type = [k for k, v in questions.items() if v == question][0].replace('_', ' ').title()

# Clean up the question for better processing
# question = question.strip()
# question = question.replace("'", '"')  # replace single quotes with double quotes for JSON compatibility
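
For reference when judging the models' answers to the `code_explanation` question, the snippet can be run directly. The sketch below wraps it in a generator function; the function name and the sample `books` list are hypothetical.

```python
def unique_authors(books):
    # The set comprehension collects each truthy "author" value once;
    # `yield from` then yields the set's items one by one, which makes
    # this function a generator.
    yield from {book.get("author") for book in books if book.get("author")}

# Hypothetical sample data
books = [
    {"title": "Eugene Onegin", "author": "Pushkin"},
    {"title": "Ruslan and Ludmila", "author": "Pushkin"},
    {"title": "Anonymous pamphlet"},  # no "author" key, so it is skipped
    {"title": "War and Peace", "author": "Tolstoy"},
]

print(sorted(unique_authors(books)))  # ['Pushkin', 'Tolstoy']
```

Because a set is unordered, the authors come out without duplicates but in no guaranteed order, which is why the example sorts them before printing.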

<div style="font-size: 14px; line-height: 1.4; margin: 0; padding: 0;">
<h5 style="margin-bottom: 0.2em;"><b>Ollama (without streaming)</b></h5>
</div>

In [None]:
# Approach 1: Ollama library 
def ollama_a1(question_type="Pushkin Riddle"):
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question} 
    ]
    
    print(f"Llama 3.2 Response (A1) - {question_type}:")
    
    try:
        start_time = time.time()
        response = ollama.chat(
            model=MODEL_LLAMA,
            messages=messages
        )
        end_time = time.time()
        
        response_content = response['message']['content']
        display(Markdown(response_content))
        
        word_count = len(response_content.split())
        print(f"\nResponse stats: {word_count} words | Time: {end_time - start_time:.2f}s")
        
        return response_content
        
    except Exception as e:
        error_msg = f"**Error with Ollama library:** {str(e)}\n\n"
        error_msg += "**Troubleshooting:**\n"
        error_msg += "1. Make sure Ollama is installed and running\n"
        error_msg += "2. Run `ollama serve` in a terminal\n"
        error_msg += "3. Run `ollama pull llama3.2` to download the model\n"
        error_msg += "4. Check that http://localhost:11434 is accessible"
        
        display(Markdown(error_msg))
        return None

# Pass the label derived from the selected question so the header stays accurate
ollama_direct_response = ollama_a1(question_type)

In [None]:
# Approach 2: OpenAI Client 
def ollama_a2(question_type="Pushkin Riddle"):
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question}  # Uses the global question variable
    ]
    
    print(f"Llama 3.2 Response (A2) - {question_type}:")
    
    try:
        start_time = time.time()
        response = ollama_client.chat.completions.create(
            model=MODEL_LLAMA,
            messages=messages
        )
        end_time = time.time()
        
        response_content = response.choices[0].message.content
        display(Markdown(response_content))
        
        word_count = len(response_content.split())
        print(f"\nResponse stats: {word_count} words | Time: {end_time - start_time:.2f}s")
        
        return response_content
        
    except Exception as e:
        error_msg = f"**Error with OpenAI client:** {str(e)}\n\n"
        error_msg += "**Troubleshooting:**\n"
        error_msg += "1. Make sure Ollama is installed and running\n"
        error_msg += "2. Run `ollama serve` in a terminal\n"
        error_msg += "3. Run `ollama pull llama3.2` to download the model\n"
        error_msg += "4. Check that http://localhost:11434 is accessible"
        
        display(Markdown(error_msg))
        return None

# Pass the label derived from the selected question so the header stays accurate
ollama_response = ollama_a2(question_type)

<div style="font-size: 14px; line-height: 1.4; margin: 0; padding: 0;">
<h5 style="margin-bottom: 0.2em;"><b>gpt-4o-mini (without streaming)</b></h5>
</div>

In [None]:
# Approach 1: GPT-4o-mini via OpenAI client
def gpt_a1(question_type="Pushkin Riddle"):
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question} 
    ]
    
    print(f"GPT-4o-mini Response (A1) - {question_type}:")
    
    try:
        start_time = time.time()
        response = openai_client.chat.completions.create(
            model=MODEL_GPT,
            messages=messages
        )
        end_time = time.time()
        
        response_content = response.choices[0].message.content
        display(Markdown(response_content))
        
        word_count = len(response_content.split())
        print(f"\nResponse stats: {word_count} words | Time: {end_time - start_time:.2f}s")
        
        return response_content
        
    except Exception as e:
        error_msg = f"**Error with OpenAI client:** {str(e)}\n\n"
        error_msg += "**Troubleshooting:**\n"
        error_msg += "1. Check your OpenAI API key is set correctly\n"
        error_msg += "2. Verify your account has sufficient credits\n"
        error_msg += "3. Check your internet connection"
        
        display(Markdown(error_msg))
        return None

# Pass the label derived from the selected question so the header stays accurate
gpt_response_a1 = gpt_a1(question_type)

In [None]:
# Approach 2: GPT-4o-mini via alternative OpenAI client configuration
def gpt_a2(question_type="Pushkin Riddle"):
    alternative_client = OpenAI(
        api_key=api_key,
        timeout=30.0, 
        max_retries=2   
    )
    
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question}  # Uses the global question variable
    ]
    
    print(f"GPT-4o-mini Response (A2) - {question_type}:")
    
    try:
        start_time = time.time()
        response = alternative_client.chat.completions.create(
            model=MODEL_GPT,
            messages=messages,
            temperature=0.7,  # Different parameter
            max_tokens=1000   # Different parameter
        )
        end_time = time.time()
        
        response_content = response.choices[0].message.content
        display(Markdown(response_content))
        
        word_count = len(response_content.split())
        print(f"\nResponse stats: {word_count} words | Time: {end_time - start_time:.2f}s")
        
        return response_content
        
    except Exception as e:
        error_msg = f"**Error with alternative client:** {str(e)}\n\n"
        error_msg += "**Troubleshooting:**\n"
        error_msg += "1. Check your OpenAI API key is set correctly\n"
        error_msg += "2. Verify your account has sufficient credits\n"
        error_msg += "3. Check your internet connection"
        
        display(Markdown(error_msg))
        return None

# Pass the label derived from the selected question so the header stays accurate
gpt_response_a2 = gpt_a2(question_type)

<div style="font-size: 14px; line-height: 1.4; margin: 0; padding: 0;">
<h5 style="margin-bottom: 0.2em; font-size: 0.9em;">
<b>KEY NOTE:</b> <b>STREAMING</b> is a technique for real-time, progressive response delivery: the model sends its response in chunks as it generates them, rather than waiting to send the complete response all at once.
</h5>
</div>
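
The consuming side of streaming can be sketched without any model. Here `fake_stream` is a hypothetical stand-in that yields a reply word by word, and `consume` accumulates the chunks while printing each one as it arrives, which is the same pattern the streaming cells in this notebook follow.

```python
import time
from typing import Iterator

def fake_stream(text: str, delay: float = 0.0) -> Iterator[str]:
    """Hypothetical stand-in for a model: yields the reply one word at a time."""
    for word in text.split(" "):
        time.sleep(delay)  # simulate generation latency
        yield word + " "

def consume(stream: Iterator[str]) -> str:
    """Show each chunk as soon as it arrives and return the full response."""
    response = ""
    for chunk in stream:
        response += chunk
        print(chunk, end="", flush=True)
    print()
    return response.strip()

full = consume(fake_stream("Streaming delivers tokens as they are generated"))
```

Setting `delay` to a small positive value such as `0.1` makes the progressive output visible when run interactively.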

<div style="font-size: 14px; line-height: 1.4; margin: 0; padding: 0;">
<h5 style="margin-bottom: 0.2em;"><b>Ollama (with streaming)</b></h5>
</div>

In [None]:
# Approach 1: Ollama library
def ollama_a1_streaming(question_type="Pushkin Riddle"):
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question}  
    ]
    
    print(f"Llama 3.2 Response (A1 - Streaming) - {question_type}:")
    
    try:
        start_time = time.time()
        stream = ollama.chat(
            model=MODEL_LLAMA,
            messages=messages,
            stream=True
        )
        
        response_content = ""
        display_handle = display(Markdown(""), display_id=True)
        
        for chunk in stream:
            content = chunk.get('message', {}).get('content', '')
            if content:
                response_content += content
                update_display(Markdown(response_content), display_id=display_handle.display_id)
        
        end_time = time.time()
        
        word_count = len(response_content.split())
        print(f"\nResponse stats: {word_count} words | Time: {end_time - start_time:.2f}s")
        
        return response_content
        
    except Exception as e:
        error_msg = f"**Error with Ollama streaming:** {str(e)}\n\n"
        error_msg += "**Troubleshooting:**\n"
        error_msg += "1. Make sure Ollama is installed and running\n"
        error_msg += "2. Run `ollama serve` in a terminal\n"
        error_msg += "3. Run `ollama pull llama3.2` to download the model\n"
        error_msg += "4. Check that http://localhost:11434 is accessible"
        
        display(Markdown(error_msg))
        return None

In [None]:
# Approach 2: OpenAI Client
def ollama_a2_streaming(question_type="Pushkin Riddle"):
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question}  # Uses the global question variable
    ]
    
    print(f"Llama 3.2 Response (A2 - Streaming) - {question_type}:")
    
    try:
        start_time = time.time()
        stream = ollama_client.chat.completions.create(
            model=MODEL_LLAMA,
            messages=messages,
            stream=True
        )
        
        response_content = ""
        display_handle = display(Markdown(""), display_id=True)
        
        for chunk in stream:
            if chunk.choices[0].delta.content:
                response_content += chunk.choices[0].delta.content
                update_display(Markdown(response_content), display_id=display_handle.display_id)
        
        end_time = time.time()
        
        word_count = len(response_content.split())
        print(f"\nResponse stats: {word_count} words | Time: {end_time - start_time:.2f}s")
        
        return response_content
        
    except Exception as e:
        error_msg = f"**Error with OpenAI client streaming:** {str(e)}\n\n"
        error_msg += "**Troubleshooting:**\n"
        error_msg += "1. Make sure Ollama is installed and running\n"
        error_msg += "2. Run `ollama serve` in a terminal\n"
        error_msg += "3. Run `ollama pull llama3.2` to download the model\n"
        error_msg += "4. Check that http://localhost:11434 is accessible"
        
        display(Markdown(error_msg))
        return None

In [None]:
ollama_streaming_a1 = ollama_a1_streaming()
print("\n" + "-"*50 + "\n")
ollama_streaming_a2 = ollama_a2_streaming()

<div style="font-size: 14px; line-height: 1.4; margin: 0; padding: 0;">
<h5 style="margin-bottom: 0.2em;"><b>gpt-4o-mini (with streaming)</b></h5>
</div>

In [None]:
# Approach 1: GPT-4o-mini standard 
def gpt_a1_streaming(question_type="Pushkin Riddle"):
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question} 
    ]
    
    print(f"GPT-4o-mini Response (A1 - Streaming) - {question_type}:")
    
    try:
        start_time = time.time()
        stream = openai_client.chat.completions.create(
            model=MODEL_GPT,
            messages=messages,
            stream=True
        )
        
        response_content = ""
        display_handle = display(Markdown(""), display_id=True)
        
        for chunk in stream:
            if chunk.choices[0].delta.content:
                response_content += chunk.choices[0].delta.content
                update_display(Markdown(response_content), display_id=display_handle.display_id)
        
        end_time = time.time()
        
        word_count = len(response_content.split())
        print(f"\nResponse stats: {word_count} words | Time: {end_time - start_time:.2f}s")
        
        return response_content
        
    except Exception as e:
        error_msg = f"**Error with GPT streaming:** {str(e)}\n\n"
        error_msg += "**Troubleshooting:**\n"
        error_msg += "1. Check your OpenAI API key is set correctly\n"
        error_msg += "2. Verify your account has sufficient credits\n"
        error_msg += "3. Check your internet connection"
        
        display(Markdown(error_msg))
        return None

In [None]:
# Approach 2: GPT-4o-mini client 
def gpt_a2_streaming(question_type="Pushkin Riddle"):
    alternative_client = OpenAI(
        api_key=api_key,
        timeout=30.0,
        max_retries=2
    )
    
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question}  # Uses the global question variable
    ]
    
    print(f"GPT-4o-mini Response (A2 - Streaming) - {question_type}:")
    
    try:
        start_time = time.time()
        stream = alternative_client.chat.completions.create(
            model=MODEL_GPT,
            messages=messages,
            temperature=0.7,  # Different parameter
            max_tokens=1000,  # Different parameter
            stream=True
        )
        
        response_content = ""
        display_handle = display(Markdown(""), display_id=True)
        
        for chunk in stream:
            if chunk.choices[0].delta.content:
                response_content += chunk.choices[0].delta.content
                update_display(Markdown(response_content), display_id=display_handle.display_id)
        
        end_time = time.time()
        
        word_count = len(response_content.split())
        print(f"\nResponse stats: {word_count} words | Time: {end_time - start_time:.2f}s")
        
        return response_content
        
    except Exception as e:
        error_msg = f"**Error with alternative GPT streaming:** {str(e)}\n\n"
        error_msg += "**Troubleshooting:**\n"
        error_msg += "1. Check your OpenAI API key is set correctly\n"
        error_msg += "2. Verify your account has sufficient credits\n"
        error_msg += "3. Check your internet connection"
        
        display(Markdown(error_msg))
        return None

In [None]:
gpt_streaming_a1 = gpt_a1_streaming()
print("\n" + "-"*50 + "\n")
gpt_streaming_a2 = gpt_a2_streaming()
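
<div style="font-size: 14px; line-height: 1.4; margin: 0; padding: 0;">
The four streaming functions above share the same accumulation loop. The sketch below isolates that loop so it can be tried without a model or network. <code>accumulate_stream</code> and <code>fake_chunk</code> are hypothetical names, and the fake chunks only imitate the <code>choices[0].delta.content</code> shape used in the cells above.
</div>

```python
from types import SimpleNamespace
from typing import Iterable, Iterator


def accumulate_stream(stream: Iterable) -> Iterator[str]:
    """Yield the full text received so far after each non-empty chunk."""
    text = ""
    for chunk in stream:
        # Skip chunks that carry no choices at all
        if not chunk.choices:
            continue
        piece = chunk.choices[0].delta.content
        # Skip chunks whose delta has no text (None or empty string)
        if piece:
            text += piece
            yield text


def fake_chunk(content):
    """Hypothetical stand-in that mimics the shape of a streamed chunk."""
    delta = SimpleNamespace(content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo_stream = [fake_chunk("Hel"), fake_chunk(None), fake_chunk("lo"), fake_chunk(" world")]
snapshots = list(accumulate_stream(demo_stream))
print(snapshots)
```

<div style="font-size: 14px; line-height: 1.4; margin: 0; padding: 0;">
Each yielded snapshot is what a notebook would pass to <code>update_display</code>, or what a Gradio generator would yield.
</div>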

<div style="font-size: 15px; line-height: 1.4; margin: 0; padding: 0;">
<h5 style="font-size: 1em; margin-bottom: 0.2em;"><b>Extra</b>: Execution summary for both streaming and non-streaming approaches.</h5>
</div>

In [None]:
# This code cell summarizes the execution of all approaches and compares their performance metrics.
# For the llama3.2 model, A1 represents the first approach using the Ollama library directly, while A2 uses the OpenAI client for Ollama.
# For the gpt-4o-mini model, A1 represents the standard OpenAI client approach, while A2 uses an alternative client configuration.
print("EXECUTION SUMMARY - PERFORMANCE METRICS COMPARISON")
print("=" * 80)

responses = {
    "Ollama A1 (Library)": ollama_direct_response if 'ollama_direct_response' in globals() else None,
    "Ollama A2 (OpenAI Client)": ollama_response if 'ollama_response' in globals() else None,
    "GPT A1 (Standard)": gpt_response_a1 if 'gpt_response_a1' in globals() else None,
    "GPT A2 (Alternative Client)": gpt_response_a2 if 'gpt_response_a2' in globals() else None,
    "Ollama A1 Streaming": ollama_streaming_a1 if 'ollama_streaming_a1' in globals() else None,
    "Ollama A2 Streaming": ollama_streaming_a2 if 'ollama_streaming_a2' in globals() else None,
    "GPT A1 Streaming": gpt_streaming_a1 if 'gpt_streaming_a1' in globals() else None,
    "GPT A2 Streaming": gpt_streaming_a2 if 'gpt_streaming_a2' in globals() else None,
}

metrics_data = []
for approach_name, response in responses.items():
    model_type = "Llama 3.2" if "Ollama" in approach_name else "GPT-4o-mini"
    is_streaming = "Yes" if "Streaming" in approach_name else "No"
    
    if response:
        metrics_data.append({
            "Approach": approach_name,
            "Model": model_type,
            "Streaming": is_streaming,
            "Word Count": len(response.split()),
            "Character Count": len(response),
            "Status": "Success"
        })
    else:
        metrics_data.append({
            "Approach": approach_name,
            "Model": model_type,
            "Streaming": is_streaming,
            "Word Count": "N/A",
            "Character Count": "N/A",
            "Status": "Failed"
        })

df = pd.DataFrame(metrics_data)

print("\nSUMMARY STATISTICS:")
print("-" * 40)

successful_responses = [r for r in responses.values() if r is not None]
if successful_responses:
    total_words = sum(len(r.split()) for r in successful_responses)
    avg_words = total_words / len(successful_responses)
    max_words = max(len(r.split()) for r in successful_responses)
    min_words = min(len(r.split()) for r in successful_responses)
    
    print(f"Total Successful Executions: {len(successful_responses)}/{len(responses)}")
    print(f"Average Word Count: {avg_words:.1f} words")
    print(f"Max Word Count: {max_words} words")
    print(f"Min Word Count: {min_words} words")
    print(f"Total Words Generated: {total_words:,} words")
else:
    print("No successful executions found.")

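# Fall back to "Unknown" instead of raising IndexError when the question is not in the dict
current_question_type = next((k for k, v in questions.items() if v == question), "Unknown")
print(f"\nQuestion Type: {current_question_type}")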

print("\nDETAILED METRICS TABLE:")
print("-" * 40)
display(HTML(df.to_html(index=False, escape=False, table_id="metrics-table")))

print("\nMODEL COMPARISON:")
print("-" * 40)
llama_responses = [r for name, r in responses.items() if "Ollama" in name and r is not None]
gpt_responses = [r for name, r in responses.items() if "GPT" in name and r is not None]

if llama_responses and gpt_responses:
    llama_avg = sum(len(r.split()) for r in llama_responses) / len(llama_responses)
    gpt_avg = sum(len(r.split()) for r in gpt_responses) / len(gpt_responses)
    
    print(f"Llama 3.2 Average Response Length: {llama_avg:.1f} words")
    print(f"GPT-4o-mini Average Response Length: {gpt_avg:.1f} words")
    
    if llama_avg > gpt_avg:
        print("Llama 3.2 generated longer responses on average")
    elif gpt_avg > llama_avg:
        print("GPT-4o-mini generated longer responses on average")
    else:
        print("Both models generated similar length responses")

streaming_responses = [r for name, r in responses.items() if "Streaming" in name and r is not None]
non_streaming_responses = [r for name, r in responses.items() if "Streaming" not in name and r is not None]

if streaming_responses and non_streaming_responses:
    streaming_avg = sum(len(r.split()) for r in streaming_responses) / len(streaming_responses)
    non_streaming_avg = sum(len(r.split()) for r in non_streaming_responses) / len(non_streaming_responses)
    
    print(f"\nStreaming Average Response Length: {streaming_avg:.1f} words")
    print(f"Non-streaming Average Response Length: {non_streaming_avg:.1f} words")

print("\n" + "=" * 80)
print("NOTE: Time metrics were displayed during individual executions above.")
print("For detailed timing analysis, check the output of each approach execution.")
print("=" * 80)
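
<div style="font-size: 14px; line-height: 1.4; margin: 0; padding: 0;">
The summary above reports only word and character counts, because each function prints its elapsed time instead of returning it. A minimal sketch of one way to capture timings so they could join the metrics table is shown below. The helper <code>timed</code> and the stand-in <code>fake_model_call</code> are hypothetical and are not part of the notebook.
</div>

```python
import time
from typing import Any, Callable, Tuple


def timed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def fake_model_call(prompt: str) -> str:
    """Hypothetical stand-in for a model call, so the sketch runs offline."""
    time.sleep(0.01)
    return f"echo: {prompt}"


response, seconds = timed(fake_model_call, "hello")
row = {
    "Approach": "Demo",
    "Word Count": len(response.split()),
    "Time (s)": round(seconds, 2),
}
print(row)
```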

<br>

<div style="font-size: 15px; line-height: 1.4; margin: 0; padding: 0;">
<b>2/</b> An extended version of the conversation example from cell 33 of <code>day1.ipynb</code>, in which <b>GPT</b> and <b>Llama</b> talk to each other:
</div>

<div style="font-size: 13px; line-height: 1.4; margin: 0; padding: 0;">
<h5 style="margin-bottom: 0.2em;"><b><i>Argumentative</i></b> GPT vs. <b><i>Polite</i></b> Llama</h5>
</div>

In [None]:
from openai import OpenAI
import requests

openai = OpenAI()
llama_client = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')

gpt_model = "gpt-4o-mini"
llama_model = "llama3.2"

# Personality system prompts
# Argumentative 
gpt_system = "You are a chatbot who is very argumentative; \
you disagree with anything in the conversation and you challenge everything, in a snarky way."

# Polite
llama_system = "You are a very polite, courteous chatbot. You try to agree with \
everything the other person says, or find common ground. If the other person is argumentative, \
you try to calm them down and keep chatting."

# Initialize conversation
gpt_messages = ["Hi there"]
llama_messages = ["Hi"]

In [None]:
def call_gpt():
    messages = [{"role": "system", "content": gpt_system}]
    for gpt, llama in zip(gpt_messages, llama_messages):
        messages.append({"role": "assistant", "content": gpt})
        messages.append({"role": "user", "content": llama})
    
    completion = openai.chat.completions.create(
        model=gpt_model,
        messages=messages
    )
    return completion.choices[0].message.content

In [None]:
def call_llama():
    messages = [{"role": "system", "content": llama_system}]
    for gpt, llama_message in zip(gpt_messages, llama_messages):
        messages.append({"role": "user", "content": gpt})
        messages.append({"role": "assistant", "content": llama_message})
    
    # Add the latest GPT message for Llama to respond to
    messages.append({"role": "user", "content": gpt_messages[-1]})
    
    completion = llama_client.chat.completions.create(
        model=llama_model,
        messages=messages
    )
    return completion.choices[0].message.content

In [None]:
# Run the conversation
print(f"GPT:\n{gpt_messages[0]}\n")
print(f"Llama:\n{llama_messages[0]}\n")

for i in range(5):
    gpt_next = call_gpt()
    print(f"GPT:\n{gpt_next}\n")
    gpt_messages.append(gpt_next)
    
    llama_next = call_llama()
    print(f"Llama:\n{llama_next}\n")
    llama_messages.append(llama_next)

<div style="font-size: 13px; line-height: 1.4; margin: 0; padding: 0;">
<h5 style="margin-bottom: 0.2em;"><b><i>Optimistic</i></b> GPT vs. <b><i>Pessimistic</i></b> Llama</h5>
</div>

In [None]:
from openai import OpenAI

openai = OpenAI()
llama_client = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')

gpt_model = "gpt-4o-mini"
llama_model = "llama3.2"

# Optimistic
gpt_system = "You are a very optimistic, upbeat chatbot. You see the bright side of everything, \
always look for silver linings, and try to spread positivity. You believe everything will work out \
for the best and you love to encourage others with cheerful observations and hopeful perspectives."

# Pessimistic
llama_system = "You are a very pessimistic, gloomy chatbot. You see the worst in everything, \
always focus on what could go wrong, and tend to be cynical about situations. You believe \
things usually don't work out and you often point out potential problems and negative outcomes."

# Initialize conversation
gpt_messages = ["What a beautiful day!"]
llama_messages = ["I suppose"]

In [None]:
def call_gpt_optimistic():
    messages = [{"role": "system", "content": gpt_system}]
    for gpt, llama in zip(gpt_messages, llama_messages):
        messages.append({"role": "assistant", "content": gpt})
        messages.append({"role": "user", "content": llama})
    
    completion = openai.chat.completions.create(
        model=gpt_model,
        messages=messages
    )
    return completion.choices[0].message.content

In [None]:
def call_llama_pessimistic():
    messages = [{"role": "system", "content": llama_system}]
    for gpt, llama_message in zip(gpt_messages, llama_messages):
        messages.append({"role": "user", "content": gpt})
        messages.append({"role": "assistant", "content": llama_message})
    
    # Add the latest GPT message for Llama to respond to
    messages.append({"role": "user", "content": gpt_messages[-1]})
    
    completion = llama_client.chat.completions.create(
        model=llama_model,
        messages=messages
    )
    return completion.choices[0].message.content

In [None]:
print(f"GPT:\n{gpt_messages[0]}\n")
print(f"Llama:\n{llama_messages[0]}\n")

for i in range(6):
    gpt_next = call_gpt_optimistic()
    print(f"GPT (Optimistic):\n{gpt_next}\n")
    gpt_messages.append(gpt_next)
    
    llama_next = call_llama_pessimistic()
    print(f"Llama (Pessimistic):\n{llama_next}\n")
    llama_messages.append(llama_next)
    
    print("-" * 50)

In [None]:
# This cell defines more examples of personality system prompts for different chatbot personalities.
# Technical Expert vs. Beginner
expert_system = "You are a highly technical expert who uses complex jargon, \
assumes deep knowledge, and explains things in very technical detail."

beginner_system = "You are a complete beginner who asks basic questions, \
gets confused by technical terms, and needs everything explained simply."

# Formal vs. Casual
formal_system = "You are a very formal, professional chatbot who uses proper grammar, \
sophisticated vocabulary, and maintains a business-like tone at all times."

casual_system = "You are a super casual, laid-back chatbot who uses slang, \
informal language, and talks like you're chatting with your best friend."

# Creative vs. Logical
creative_system = "You are a highly creative, artistic chatbot who thinks in metaphors, \
loves storytelling, and approaches everything with imagination and whimsy."

logical_system = "You are a purely logical, analytical chatbot who focuses on facts, \
data, reasoning, and systematic problem-solving approaches to everything."
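
<div style="font-size: 14px; line-height: 1.4; margin: 0; padding: 0;">
Any of these prompt pairs could be used with one shared history builder instead of a separate pair of functions per personality. The sketch below assumes a hypothetical helper <code>build_messages</code>. It follows the same role convention as the <code>call_gpt</code> and <code>call_llama</code> cells above: a speaker's own turns become <code>assistant</code> messages and the other speaker's turns become <code>user</code> messages.
</div>

```python
from typing import Dict, List


def build_messages(
    system_prompt: str,
    own: List[str],
    other: List[str],
    speaks_first: bool,
) -> List[Dict[str, str]]:
    """Build the chat history from one speaker's point of view."""
    messages = [{"role": "system", "content": system_prompt}]
    for mine, theirs in zip(own, other):
        own_turn = {"role": "assistant", "content": mine}
        other_turn = {"role": "user", "content": theirs}
        # The speaker who opened the conversation sees its own turn first
        messages.extend([own_turn, other_turn] if speaks_first else [other_turn, own_turn])
    # If the other speaker is one turn ahead, that latest turn still needs a reply
    if len(other) > len(own):
        messages.append({"role": "user", "content": other[-1]})
    return messages


# Demo prompts are shortened placeholders, not the prompts defined above
first_view = build_messages("You are formal.", ["Good day."], ["Hey!"], speaks_first=True)
second_view = build_messages(
    "You are casual.", ["Hey!"], ["Good day.", "How do you do?"], speaks_first=False
)
print([m["role"] for m in first_view])
print([m["role"] for m in second_view])
```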

<br>

<!-- <h4 style="margin-bottom: 0em;"><code>day2.ipynb</code></h4> -->
#### <code>**day2.ipynb**</code>

<div style="font-size: 14px; line-height: 1.4; margin: 0; padding: 0;">
<h5 style="margin-bottom: 0.2em;"><b>Lab Exercises</b></h5>
<b>1/</b> An implementation of the <i>“Company Brochure”</i> section from <b>week1</b>, <code>day5.ipynb</code>, with a <b>Gradio UI</b> added at the end.
</div>

In [None]:
import os
import json
from typing import List

import gradio as gr
import requests
from bs4 import BeautifulSoup
from dotenv import load_dotenv
from openai import OpenAI

# Initialize OpenAI client
load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')
openai_client = OpenAI()

MODEL = 'gpt-4o-mini'

# Web scraping headers to avoid being blocked
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
}

class Website:
    """
    A utility class to represent a Website that we have scraped, now with links
    """
    def __init__(self, url):
        self.url = url
        # A timeout keeps the UI from hanging forever on an unresponsive site
        response = requests.get(url, headers=headers, timeout=10)
        self.body = response.content
        soup = BeautifulSoup(self.body, 'html.parser')
        self.title = soup.title.string if soup.title else "No title found"
        if soup.body:
            # Remove irrelevant HTML elements
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            self.text = ""
        links = [link.get('href') for link in soup.find_all('a')]
        self.links = [link for link in links if link]

    def get_contents(self):
        return f"Webpage Title:\n{self.title}\nWebpage Contents:\n{self.text}\n\n"

# System prompt for link analysis
link_system_prompt = "You are provided with a list of links found on a webpage. \
You are able to decide which of the links would be most relevant to include in a brochure about the company, \
such as links to an About page, or a Company page, or Careers/Jobs pages.\n"
link_system_prompt += "You should respond in JSON as in this example:"
link_system_prompt += """
{
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"}
    ]
}
"""

def get_links_user_prompt(website):
    user_prompt = f"Here is the list of links on the website of {website.url} - "
    user_prompt += "please decide which of these are relevant web links for a brochure about the company, respond with the full https URL in JSON format. \
Do not include Terms of Service, Privacy, email links.\n"
    user_prompt += "Links (some might be relative links):\n"
    user_prompt += "\n".join(website.links)
    return user_prompt

def get_links(url):
    website = Website(url)
    response = openai_client.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": link_system_prompt},
            {"role": "user", "content": get_links_user_prompt(website)}
        ],
        response_format={"type": "json_object"}
    )
    result = response.choices[0].message.content
    return json.loads(result)

def get_all_details(url):
    result = "Landing page:\n"
    result += Website(url).get_contents()
    try:
        links = get_links(url)
        print("Found links:", links)
        for link in links["links"]:
            result += f"\n\n{link['type']}\n"
            result += Website(link["url"]).get_contents()
    except Exception as e:
        print(f"Error getting links: {e}")
        # Continue with just the landing page if link analysis fails
    return result

# System prompt for brochure generation
brochure_system_prompt = "You are an assistant that analyzes the contents of several relevant pages from a company website \
and creates a short brochure about the company for prospective customers, investors and recruits. Respond in markdown. \
Include details of company culture, customers and careers/jobs if you have the information."

def get_brochure_user_prompt(company_name, url):
    user_prompt = f"You are looking at a company called: {company_name}\n"
    user_prompt += f"Here are the contents of its landing page and other relevant pages; use this information to build a short brochure of the company in markdown.\n"
    user_prompt += get_all_details(url)
    user_prompt = user_prompt[:5_000]  # Truncate if more than 5,000 characters
    return user_prompt

def stream_brochure_for_gradio(company_name, url):
    """
    Generate a company brochure with streaming output for Gradio
    """
    try:
        print(f"Generating brochure for {company_name} from {url}")
        
        stream = openai_client.chat.completions.create(
            model=MODEL,
            messages=[
                {"role": "system", "content": brochure_system_prompt},
                {"role": "user", "content": get_brochure_user_prompt(company_name, url)}
            ],
            stream=True
        )
        
        result = ""
        for chunk in stream:
            if chunk.choices[0].delta.content:
                result += chunk.choices[0].delta.content
                # Clean up markdown artifacts that might appear
                clean_result = result.replace("```markdown", "").replace("```", "")
                yield clean_result
                
    except Exception as e:
        error_message = f"**Error generating brochure:** {str(e)}\n\n"
        error_message += "**Please check:**\n"
        error_message += "- The URL is valid and accessible\n"
        error_message += "- Your OpenAI API key is working\n"
        error_message += "- You have sufficient API credits"
        yield error_message

# Create and launch Gradio interface
print("Creating Gradio interface for Company Brochure Generator...")

brochure_interface = gr.Interface(
    fn=stream_brochure_for_gradio,
    inputs=[
        gr.Textbox(
            label="Company name:",
            placeholder="Enter the company name (e.g., HuggingFace, OpenAI)",
            lines=1
        ),
        gr.Textbox(
            label="Landing page URL:",
            placeholder="https://example.com",
            lines=1
        )
    ],
    outputs=[
        gr.Markdown(
            label="Generated Brochure:",
            show_copy_button=True
        )
    ],
    title="Company Brochure Generator",
    description="Enter a company name and their website URL to generate a professional brochure using AI. The tool will analyze the website content and create a comprehensive brochure for prospective customers, investors, and recruits.",
    examples=[
        ["HuggingFace", "https://huggingface.co"],
        ["OpenAI", "https://openai.com"],
        ["Anthropic", "https://anthropic.com"]
    ],
    flagging_mode="never",
    theme=gr.themes.Soft() # type: ignore
)

print("Launching Company Brochure Generator...")
brochure_interface.launch(
    share=False,  # Set to True for a public link
    inbrowser=True,  # Start in the browser automatically, set to False to open manually
    show_error=True
)
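
<div style="font-size: 14px; line-height: 1.4; margin: 0; padding: 0;">
The link prompt above warns that some links may be relative, and it relies on the model to return full URLs. A minimal sketch of resolving and filtering links locally before they are scraped is shown below. The helper name <code>resolve_links</code> and the demo URLs are assumptions for illustration.
</div>

```python
from typing import List
from urllib.parse import urljoin, urlparse


def resolve_links(base_url: str, links: List[str]) -> List[str]:
    """Turn relative links into absolute http(s) URLs, dropping duplicates."""
    resolved: List[str] = []
    for link in links:
        absolute = urljoin(base_url, link)
        # Drop mailto: and other non-web schemes
        if urlparse(absolute).scheme not in ("http", "https"):
            continue
        # Keep the first occurrence only, preserving order
        if absolute not in resolved:
            resolved.append(absolute)
    return resolved


demo_links = ["/about", "careers", "mailto:hello@example.com", "https://example.com/about"]
clean_links = resolve_links("https://example.com/", demo_links)
print(clean_links)
```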

<br>

<!-- <h4 style="margin-bottom: 0em;"><code>day3.ipynb</code></h4> -->
#### <code>**day3.ipynb**</code>

<div style="font-size: 14px; line-height: 1.4; margin: 0; padding: 0;">
<h5 style="margin-bottom: 0.2em;"><b>Theory Summary</b></h5>
<b>1. Multi-Model Integration</b><br>
<i>Concept:</i> Integration of multiple LLM providers (OpenAI, Google, Ollama) within a single application<br>
<i>Significance:</i> Demonstrates flexibility and redundancy in LLM deployment strategies<br>
<i>Implementation:</i> Using consistent API patterns across different providers<br><br>
<b>2. Gradio Interface Development</b><br>
<i>Concept:</i> Creating user-friendly web interfaces for LLM applications<br>
<i>Key Features:</i><br>
&nbsp;&nbsp;- Real-time streaming responses<br>
&nbsp;&nbsp;- Chat interface patterns<br>
&nbsp;&nbsp;- Error handling and user feedback<br>
<i>Significance:</i> Bridges the gap between backend LLM functionality and end-user interaction<br><br>
<b>3. Streaming Response Architecture</b><br>
<i>Concept:</i> Real-time progressive response delivery where models send responses in chunks<br>
<i>Benefits:</i><br>
&nbsp;&nbsp;- Improved user experience with immediate feedback<br>
&nbsp;&nbsp;- Reduced perceived latency<br>
&nbsp;&nbsp;- Better handling of long responses<br>
<i>Implementation:</i> Using <code>stream=True</code> parameter in API calls<br><br>
<b>4. Context-Aware System Prompting</b><br>
<i>Concept:</i> Dynamic system message modification based on user input<br>
<i>Example:</i> Adding specific instructions when users mention items not in inventory (like belts)<br>
<i>Significance:</i> Demonstrates adaptive prompt engineering for business-specific scenarios<br><br>
<b>5. Error Handling and Graceful Degradation</b><br>
<i>Concept:</i> Robust error management across different API providers<br>
<i>Implementation:</i> <code>try</code>/<code>except</code> blocks with user-friendly error messages<br>
<i>Significance:</i> Ensures application reliability when dealing with external services<br><br>
<b>6. API Abstraction Patterns</b><br>
<i>Concept:</i> Creating consistent interfaces across different LLM providers<br>
<i>Benefits:</i><br>
&nbsp;&nbsp;- Code reusability<br>
&nbsp;&nbsp;- Easy provider switching<br>
&nbsp;&nbsp;- Simplified maintenance<br>
<i>Example:</i> Using OpenAI-compatible endpoints for non-OpenAI models<br><br>
<b>7. Business Logic Integration</b><br>
<i>Concept:</i> Embedding domain-specific knowledge (clothes store scenario) into LLM behavior<br>
<i>Implementation:</i> System prompts that encourage specific sales behaviors<br>
<i>Significance:</i> Shows how to adapt general-purpose LLMs for specific business use cases<br><br>
<b>8. Multi-Provider Strategy</b><br>
<i>Concept:</i> Leveraging multiple LLM providers for different advantages:<br>
&nbsp;&nbsp;- OpenAI: Reliable, high-quality responses<br>
&nbsp;&nbsp;- Google Gemini: Cost-effective alternative<br>
&nbsp;&nbsp;- Ollama: Local deployment, privacy, no API costs<br>
<i>Significance:</i> Demonstrates strategic thinking about LLM deployment in production environments<br>
</div>
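
<div style="font-size: 14px; line-height: 1.4; margin: 0; padding: 0;">
Items 6 and 8 above can be illustrated with a small provider table. The sketch below only builds the keyword arguments that would be passed to an OpenAI-compatible client, so it runs without any API key. The names <code>Provider</code>, <code>PROVIDERS</code>, and <code>client_kwargs</code> are hypothetical. The endpoints and model names are the ones used in this notebook's lab code.
</div>

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Provider:
    model: str
    base_url: Optional[str] = None     # None means the client's default endpoint
    api_key_env: Optional[str] = None  # environment variable holding the key, if any


PROVIDERS: Dict[str, Provider] = {
    "openai": Provider(model="gpt-4o-mini", api_key_env="OPENAI_API_KEY"),
    "google": Provider(
        model="gemini-1.5-flash",
        base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
        api_key_env="GOOGLE_API_KEY",
    ),
    "ollama": Provider(model="llama3.2", base_url="http://localhost:11434/v1"),
}


def client_kwargs(name: str, env: Dict[str, str]) -> Dict[str, str]:
    """Return the keyword arguments for an OpenAI-compatible client."""
    provider = PROVIDERS[name]
    kwargs: Dict[str, str] = {}
    if provider.base_url:
        kwargs["base_url"] = provider.base_url
    if provider.api_key_env:
        kwargs["api_key"] = env.get(provider.api_key_env, "")
    else:
        # Placeholder key for the local server, as in the lab code
        kwargs["api_key"] = "ollama"
    return kwargs


print(client_kwargs("ollama", {}))
print(client_kwargs("openai", {"OPENAI_API_KEY": "sk-demo"}))
```

<div style="font-size: 14px; line-height: 1.4; margin: 0; padding: 0;">
With the real library installed, the result could be passed as <code>OpenAI(**client_kwargs(name, os.environ))</code>, which would make switching providers a one-word change.
</div>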

<div style="font-size: 14px; line-height: 1.4; margin: 0; padding: 0;">
<h5 style="margin-bottom: 0.2em;"><b>Lab Exercises</b></h5>
<b>1/</b>
A reworked version of the code in <code>day3.ipynb</code>, using the <b>OpenAI API</b>, the <b>Google API</b>, and <b>Ollama</b>.
</div>

In [None]:
import os
from dotenv import load_dotenv
from openai import OpenAI
import gradio as gr
import ollama
import google.generativeai

In [None]:
# Load environment variables in a file called .env
# Print the key prefixes to help with any debugging

load_dotenv(override=True)
openai_api_key = os.getenv('OPENAI_API_KEY')
google_api_key = os.getenv('GOOGLE_API_KEY')

if openai_api_key:
    print(f"OpenAI API Key exists and begins {openai_api_key[:8]}")
else:
    print("OpenAI API Key not set")

if google_api_key:
    print(f"Google API Key exists and begins {google_api_key[:8]}")
else:
    print("Google API Key not set")

# Initialize clients
openai_client = OpenAI()
ollama_client = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')

# Configure the Google API explicitly with the key loaded above
google.generativeai.configure(api_key=google_api_key)

# Alternative Google client using OpenAI-compatible endpoint
google_via_openai_client = OpenAI(
    api_key=google_api_key, 
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

In [None]:
OPENAI_MODEL = 'gpt-4o-mini'
OLLAMA_MODEL = 'llama3.2'
GOOGLE_MODEL = 'gemini-1.5-flash'

In [None]:
# system_message = "You are a helpful assistant"
# system_message += "\nIf the customer asks for shoes, you should respond that shoes are not on sale today, \
# but remind the customer to look at hats!"
system_message = "You are a helpful assistant in a clothes store. You should try to gently encourage \
the customer to try items that are on sale. Hats are 60% off, and most other items are 50% off. \
For example, if the customer says 'I'm looking to buy a hat', \
you could reply something like, 'Wonderful - we have lots of hats - including several that are part of our sales event.' \
Encourage the customer to buy hats if they are unsure what to get."


In [None]:
# OpenAI 
def chat_openai(message, history):
    relevant_system_message = system_message
    if 'belt' in message.lower():
        relevant_system_message += " The store does not sell belts; if you are asked for belts, be sure to point out other items on sale."
    
    messages = [{"role": "system", "content": relevant_system_message}] + history + [{"role": "user", "content": message}]

    stream = openai_client.chat.completions.create(
        model=OPENAI_MODEL, 
        messages=messages, 
        stream=True
    )

    response = ""
    for chunk in stream:
        response += chunk.choices[0].delta.content or ''
        yield response

In [None]:
# Gemini (Native API)
def chat_google_native(message, history):
    relevant_system_message = system_message
    if 'belt' in message.lower():
        relevant_system_message += " The store does not sell belts; if you are asked for belts, be sure to point out other items on sale."
    
    try:
        # Convert history to Gemini format
        gemini_history = []
        for msg in history:
            if msg["role"] == "user":
                gemini_history.append({"role": "user", "parts": [msg["content"]]})
            elif msg["role"] == "assistant":
                gemini_history.append({"role": "model", "parts": [msg["content"]]})

        gemini = google.generativeai.GenerativeModel(
            model_name=GOOGLE_MODEL,
            system_instruction=relevant_system_message
        )
        
        # Start chat with history
        chat_session = gemini.start_chat(history=gemini_history)
        
        response = chat_session.send_message(message, stream=True)
        
        result = ""
        for chunk in response:
            result += chunk.text
            yield result
            
    except Exception as e:
        error_message = f"Error with Google Gemini: {str(e)}\n\nMake sure your Google API key is set correctly."
        yield error_message

In [None]:
# Gemini (OpenAI-compatible) 
def chat_google_openai(message, history):
    relevant_system_message = system_message
    if 'belt' in message.lower():
        relevant_system_message += " The store does not sell belts; if you are asked for belts, be sure to point out other items on sale."
    
    messages = [{"role": "system", "content": relevant_system_message}] + history + [{"role": "user", "content": message}]

    try:
        stream = google_via_openai_client.chat.completions.create(
            model=GOOGLE_MODEL, 
            messages=messages, 
            stream=True
        )

        response = ""
        for chunk in stream:
            if chunk.choices[0].delta.content:
                response += chunk.choices[0].delta.content
                yield response
    except Exception as e:
        error_message = f"Error with Google via OpenAI client: {str(e)}\n\nMake sure your Google API key is set correctly."
        yield error_message

In [None]:
# Ollama
def chat_ollama(message, history):
    relevant_system_message = system_message
    if 'belt' in message.lower():
        relevant_system_message += " The store does not sell belts; if you are asked for belts, be sure to point out other items on sale."
    
    messages = [{"role": "system", "content": relevant_system_message}] + history + [{"role": "user", "content": message}]

    try:
        stream = ollama_client.chat.completions.create(
            model=OLLAMA_MODEL, 
            messages=messages, 
            stream=True
        )

        response = ""
        for chunk in stream:
            if chunk.choices[0].delta.content:
                response += chunk.choices[0].delta.content
                yield response
    except Exception as e:
        error_message = f"Error with Ollama: {str(e)}\n\nMake sure Ollama is running and llama3.2 is installed."
        yield error_message
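
<div style="font-size: 14px; line-height: 1.4; margin: 0; padding: 0;">
The four chat functions above repeat the same belt check. A minimal sketch of factoring that check into one helper is shown below. <code>build_system_message</code> and <code>EXTRA_RULES</code> are hypothetical names, the base message is a shortened stand-in, and only the belt rule comes from the notebook.
</div>

```python
from typing import Dict

# Shortened stand-in for the full system_message defined earlier
BASE_SYSTEM_MESSAGE = "You are a helpful assistant in a clothes store."

# Keyword -> extra instruction appended when the keyword appears in the user message
EXTRA_RULES: Dict[str, str] = {
    "belt": (
        " The store does not sell belts; if you are asked for belts, "
        "be sure to point out other items on sale."
    ),
}


def build_system_message(message: str, base: str = BASE_SYSTEM_MESSAGE) -> str:
    """Append every rule whose keyword occurs in the message (case-insensitive)."""
    lowered = message.lower()
    extras = [rule for keyword, rule in EXTRA_RULES.items() if keyword in lowered]
    return base + "".join(extras)


print(build_system_message("Do you sell BELTS?"))
print(build_system_message("I want a hat"))
```

<div style="font-size: 14px; line-height: 1.4; margin: 0; padding: 0;">
Each chat function could then start with <code>relevant_system_message = build_system_message(message, system_message)</code>, and new keyword rules would only need one more dictionary entry.
</div>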

In [None]:
# Launch Gradio interfaces for each model
print("Initializing Gradio interfaces...")
# OpenAI 
print("Creating OpenAI Chat Interface...")
openai_interface = gr.ChatInterface(
    fn=chat_openai, 
    type="messages",
    title="Clothes Store Assistant (OpenAI GPT-4o-mini)",
    description="Chat with our AI assistant powered by OpenAI to get help with our store's clothing and sales!"
)

# Gemini (Native API)
print("Creating Google Gemini Chat Interface (Native)...")
google_native_interface = gr.ChatInterface(
    fn=chat_google_native, 
    type="messages",
    title="Clothes Store Assistant (Google Gemini Native API)",
    description="Chat with our AI assistant powered by Google Gemini to get help with our store's clothing and sales!"
)

# Gemini (OpenAI-compatible)
print("Creating Google Gemini Chat Interface (OpenAI-compatible)...")
google_openai_interface = gr.ChatInterface(
    fn=chat_google_openai, 
    type="messages",
    title="Clothes Store Assistant (Google Gemini via OpenAI)",
    description="Chat with our AI assistant powered by Google Gemini (OpenAI-compatible endpoint) to get help with our store's clothing and sales!"
)

# Ollama  
print("Creating Ollama Chat Interface...")
ollama_interface = gr.ChatInterface(
    fn=chat_ollama, 
    type="messages", 
    title="Clothes Store Assistant (Ollama Llama 3.2)",
    description="Chat with our AI assistant powered by Ollama to get help with our store's clothing and sales!"
)

In [None]:
# Close previous interfaces if they exist
gr.close_all()

In [None]:
# Launch
# Uncomment one interface, or several to run them side by side
# With manually assigned ports (one port per interface):
# openai_interface.launch(share=False, server_port=7860)
# google_native_interface.launch(share=False, server_port=7861) 
# google_openai_interface.launch(share=False, server_port=7862)  
# ollama_interface.launch(share=False, server_port=7863) 

# Alternatively, omit server_port and let Gradio pick an available port
# openai_interface.launch(share=False) 
# google_native_interface.launch(share=False) 
# google_openai_interface.launch(share=False)  
ollama_interface.launch(share=False) 

<div style="font-size: 14px; line-height: 1.5; margin: 0; padding: 0;">
<h5 style="margin-bottom: 0.2em;"><b>Interface Testing Prompts</b></h5>
<i>Sample prompts for testing each interface:</i><br>
<h5 style="margin-bottom: 0.2em;"><b>Sales-Related Questions (Test Core Functionality)</b></h5>
<b>Basic Sales Inquiries:</b><br>
1. "What's on sale today?"<br>
2. "I'm looking for something with a good discount"<br>
3. "What are your best deals right now?"<br>
4. "I want to buy a hat - what do you have?"<br>
5. "Show me items that are 50% off"<br><br>
<b>Hat-Specific (60% off):</b><br>
6. "Do you have any winter hats?"<br>
7. "I need a baseball cap for my son"<br>
8. "What hat styles are popular right now?"<br>
9. "Are there any hats in the sale section?"<br>
10. "I'm unsure what to buy - can you help?"<br><br>
<h5 style="margin-bottom: 0.2em;"><b>Edge Cases & System Message Testing</b></h5>
<b>Belt Inquiries (Should trigger special handling):</b><br>
11. "Do you sell leather belts?"<br>
12. "I need a belt to match my shoes"<br>
13. "Where can I find belts in your store?"<br>
14. "What's the price range for belts?"<br><br>
<b>Non-Sale Items:</b><br>
15. "I'm looking for shoes"<br>
16. "Do you have any jackets?"<br>
17. "What about pants and jeans?"<br>
18. "I need a dress for a wedding"<br><br>
<h5 style="margin-bottom: 0.2em;"><b>Conversation Flow Testing</b></h5>
<b>Multi-Turn Conversations:</b><br>
19. Start with: "Hi, I'm just browsing"<br>
&nbsp;&nbsp;&nbsp;&nbsp;Follow up: "Actually, I do need something warm for winter"<br>
20. "I have a $50 budget - what can you recommend?"<br>
21. "I don't like hats, what else is on sale?"<br>
22. "Can you help me pick between a scarf and gloves?"<br><br>
<h5 style="margin-bottom: 0.2em;"><b>Model Behavior Comparison</b></h5>
<b>Complex Scenarios:</b><br>
23. "I'm shopping for my whole family - we need winter accessories"<br>
24. "I'm not sure what I want, but I love a good deal"<br>
25. "I hate shopping but need something quickly"<br>
26. "What would you personally recommend for someone my age?"<br><br>
<h5 style="margin-bottom: 0.2em;"><b>Out-of-Scope and Policy Questions</b></h5>
<b>Unusual Requests:</b><br>
27. "Do you price match other stores?"<br>
28. "Can I return items if they don't fit?"<br>
29. "What's your store's return policy?"<br>
30. "Are you hiring? I need a job"<br><br>
<h5 style="margin-bottom: 0.2em;"><b>Quick Test Sequence</b></h5>
<b>Essential Test Set:</b><br>
1. "What's on sale?"<br>
2. "I need a hat"<br>
3. "Do you sell belts?"<br>
4. "I'm not sure what to buy"<br>
5. "I don't like hats - what else?"<br>
</div>
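
<div style="font-size: 14px; line-height: 1.5; margin: 0; padding: 0;">
<b>Scripted run of the Quick Test Sequence:</b> The prompts above can also be sent programmatically instead of being typed into the UI. The following is a minimal sketch, not part of the original lab: <code>stub_chat</code> and <code>run_prompts</code> are hypothetical helpers, and <code>stub_chat</code> only imitates the streaming behavior of the notebook's chat functions (a generator that yields the cumulative reply) so the example runs without any API key or Ollama server.
</div>

```python
from typing import Callable, Dict, Iterator, List

Message = Dict[str, str]
ChatFn = Callable[[str, List[Message]], Iterator[str]]


def stub_chat(message: str, history: List[Message]) -> Iterator[str]:
    """Hypothetical stand-in for chat_ollama: streams a canned reply word by word."""
    if "belt" in message.lower():
        reply = "Sorry, we do not sell belts, but hats are on sale."
    else:
        reply = "Happy to help with our current sale items."
    partial = ""
    for word in reply.split():
        partial = f"{partial} {word}".strip()
        yield partial  # cumulative text, like the notebook's chat functions


def run_prompts(chat_fn: ChatFn, prompts: List[str]) -> List[Message]:
    """Send the prompts as one multi-turn conversation and return the full history."""
    history: List[Message] = []
    for prompt in prompts:
        final = ""
        for partial in chat_fn(prompt, history):
            final = partial  # the last yielded value is the complete reply
        history.append({"role": "user", "content": prompt})
        history.append({"role": "assistant", "content": final})
    return history


QUICK_TEST = [
    "What's on sale?",
    "I need a hat",
    "Do you sell belts?",
    "I'm not sure what to buy",
    "I don't like hats - what else?",
]

transcript = run_prompts(stub_chat, QUICK_TEST)
for turn in transcript:
    print(f"{turn['role']:>9}: {turn['content']}")
```

<div style="font-size: 14px; line-height: 1.5; margin: 0; padding: 0;">
To exercise a real model, pass one of the notebook's chat functions (for example <code>chat_ollama</code>) in place of <code>stub_chat</code>, assuming the corresponding client is configured and reachable.
</div>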