# Week 1: LLM Fundamentals & Basic Chatbots

## 📚 Session Overview

**Duration:** 2 hours  
**Week:** 1  
**Instructor-Led Session**

## 🎯 Learning Objectives

By the end of this session, you will be able to:
1. Understand what LLMs are and how they work
2. Make API calls to OpenAI's GPT models
3. Implement conversation history and memory
4. Create streaming responses
5. Apply basic prompt engineering techniques

## 📋 Prerequisites

- ✅ Python 3.10+
- ✅ OpenAI API key

## ⏱️ Estimated Time

- Setup & Introduction: 10 minutes
- Section 1 (LLM Basics): 30 minutes
- Section 2 (First Chatbot): 30 minutes
- Section 3 (Memory & History): 25 minutes
- Sections 4 & 5 (Streaming & Prompt Engineering): 20 minutes
- Wrap-up & Q&A: 5 minutes

## 🔧 Setup

In [None]:
# Import required libraries
import os
from dotenv import load_dotenv
from openai import OpenAI
import json
from datetime import datetime
from typing import List, Dict

# Load environment variables
load_dotenv()

# Initialize OpenAI client
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

print("✅ Setup complete!")
# Never print the API key itself; only confirm that it was loaded
print("API key loaded:", bool(os.getenv("OPENAI_API_KEY")))

---

# Section 1: Introduction to LLMs

## What are Large Language Models?

Large Language Models (LLMs) are AI models trained on vast amounts of text data to:
- **Understand** natural language
- **Generate** human-like text
- **Complete** tasks based on instructions

### Popular LLMs:
- **GPT-4** / GPT-3.5 (OpenAI)
- **Claude** (Anthropic)
- **Llama** (Meta)
- **Gemini** (Google)

---

## Key Concepts

### 1. **Tokens**
- LLMs process text as tokens (for English text, roughly 4 characters per token on average)
- "Hello World" ≈ 2 tokens
- Token limits vary by model (e.g., the original GPT-3.5 Turbo: 4K; GPT-4 variants: 8K-128K)
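
The 4-characters-per-token rule of thumb can be turned into a quick estimator. This is only a rough sketch: `estimate_tokens` is a hypothetical helper, not part of the OpenAI SDK, and real counts depend on the model's tokenizer (a tokenizer library such as `tiktoken` gives exact counts).

```python
def estimate_tokens(text: str, chars_per_token: int = 4) -> int:
    """Rough token estimate based on the ~4 characters per token heuristic."""
    if not text:
        return 0
    # Integer division, but never report fewer than 1 token for non-empty text
    return max(1, len(text) // chars_per_token)


print(estimate_tokens("Hello World"))  # 11 characters -> 2
print(estimate_tokens("What is an AI agent?"))  # 20 characters -> 5
```

Treat the result as a ballpark figure for budgeting, not as the number the API will bill.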

### 2. **Temperature** (0.0 - 2.0)
- Controls randomness/creativity
- **Low (0.0-0.3)**: Near-deterministic, consistent
- **Medium (0.5-0.7)**: Balanced
- **High (0.8-2.0)**: Creative, varied
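
One way to build intuition for temperature: conceptually, the model's raw scores (logits) for each candidate token are divided by the temperature before being turned into probabilities. The sketch below uses made-up logits for three hypothetical tokens. It illustrates the idea only and is not how you call the API; `softmax_with_temperature` is a name chosen for this example.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Convert logits to probabilities after scaling by temperature."""
    if temperature <= 0:
        # Treat temperature 0 as greedy decoding: all mass on the top score
        top = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == top else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits for three candidate tokens (illustrative values only)
logits = [2.0, 1.0, 0.5]

for temp in [0.0, 0.5, 1.0, 1.5]:
    probs = softmax_with_temperature(logits, temp)
    print(temp, [round(p, 2) for p in probs])
```

As the temperature rises, the probabilities move closer together, so less likely tokens get sampled more often.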

### 3. **Context Window**
- The maximum number of tokens the model can process in one request (how much it can "remember" at once)
- Includes both the input (prompt and history) and the generated output
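
Because input and output share the same window, a request only fits if the prompt tokens plus the tokens reserved for the reply stay within the limit. A minimal sketch of that check, with a hypothetical helper name and illustrative numbers:

```python
def fits_in_context(prompt_tokens: int, max_tokens: int, context_window: int) -> bool:
    """Return True if the prompt plus the reserved output fits in the window."""
    return prompt_tokens + max_tokens <= context_window


# Illustrative numbers: a 4,096-token window with 150 tokens reserved for output
print(fits_in_context(prompt_tokens=3000, max_tokens=150, context_window=4096))  # True
print(fits_in_context(prompt_tokens=4000, max_tokens=150, context_window=4096))  # False
```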

---

## 1.1: Your First API Call

In [None]:
# Simple completion
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "What is an AI agent?"}
    ],
    temperature=0.7,
    max_tokens=150
)

print("🤖 Response:")
print(response.choices[0].message.content)
print(f"\n📊 Tokens used: {response.usage.total_tokens}")

### 🔍 Understanding the Response Object

In [None]:
# Let's examine the response structure
print("Response object structure:")
print(f"ID: {response.id}")
print(f"Model: {response.model}")
print(f"Created: {datetime.fromtimestamp(response.created)}")
print(f"\nUsage:")
print(f"  Prompt tokens: {response.usage.prompt_tokens}")
print(f"  Completion tokens: {response.usage.completion_tokens}")
print(f"  Total tokens: {response.usage.total_tokens}")
print(f"\nMessage role: {response.choices[0].message.role}")
print(f"Finish reason: {response.choices[0].finish_reason}")
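
The `finish_reason` field is worth checking in practice: `"stop"` means the model finished naturally, while `"length"` means the reply was cut off by `max_tokens` or the context limit. A minimal sketch of a helper that turns the value into a readable note (`describe_finish_reason` is a hypothetical name, and the explanations are our own wording):

```python
def describe_finish_reason(finish_reason: str) -> str:
    """Map a finish_reason value to a short human-readable explanation."""
    explanations = {
        "stop": "The model finished its answer naturally.",
        "length": "The answer was cut off (max_tokens or context limit reached).",
        "content_filter": "Content was omitted by a content filter.",
    }
    return explanations.get(finish_reason, f"Other finish reason: {finish_reason}")


print(describe_finish_reason("length"))
```

With a real response you could call `describe_finish_reason(response.choices[0].finish_reason)` and raise `max_tokens` if answers are regularly cut off.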

## 1.2: Experimenting with Temperature

Let's see how temperature affects responses:

In [None]:
prompt = "Write a creative tagline for an AI chatbot."

temperatures = [0.0, 0.5, 1.0, 1.5]

for temp in temperatures:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=temp,
        max_tokens=50
    )
    
    print(f"🌡️ Temperature {temp}:")
    print(f"   {response.choices[0].message.content}")
    print()

### ✏️ Try It Yourself!

**Exercise:** Ask the LLM to explain a technical concept at different temperatures.

Observe how:
- Low temperature = consistent, factual
- High temperature = creative, varied

In [None]:
# YOUR CODE HERE
# Try different temperatures and compare outputs

your_prompt = "Explain what a neural network is in simple terms."

# TODO: Test with temperature 0.0 and 1.5


---

# Section 2: Building Your First Chatbot

Now let's build a simple chatbot with a conversation loop!

## 2.1: Simple Single-Turn Chatbot

In [None]:
def simple_chatbot(user_message: str) -> str:
    """
    A simple chatbot that responds to a single message.
    No memory - each call is independent.
    """
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "user", "content": user_message}
        ],
        temperature=0.7,
        max_tokens=200
    )
    
    return response.choices[0].message.content

# Test it
print("🤖:", simple_chatbot("Hello! What can you help me with?"))
print()
print("🤖:", simple_chatbot("What did I just ask you?"))  # It won't remember!

### ⚠️ Problem: No Memory!

The chatbot doesn't remember previous messages because each API call is independent. We'll fix that in Section 3. First, let's give the chatbot a personality.

## 2.2: Adding System Prompts

System prompts define the chatbot's personality and behavior.

In [None]:
def chatbot_with_personality(user_message: str, personality: str = "helpful") -> str:
    """
    Chatbot with different personalities based on system prompt.
    """
    
    # Different system prompts for different personalities
    system_prompts = {
        "helpful": "You are a helpful and friendly AI assistant.",
        "professional": "You are a professional business consultant. Be formal and concise.",
        "casual": "You are a casual, fun friend. Use emojis and keep it light!",
        "teacher": "You are a patient teacher. Explain concepts clearly with examples."
    }
    
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": system_prompts.get(personality, system_prompts["helpful"])},
            {"role": "user", "content": user_message}
        ],
        temperature=0.7,
        max_tokens=200
    )
    
    return response.choices[0].message.content

# Test different personalities
question = "How do I learn Python?"

print("👔 Professional:")
print(chatbot_with_personality(question, "professional"))
print()

print("😄 Casual:")
print(chatbot_with_personality(question, "casual"))
print()

print("👨‍🏫 Teacher:")
print(chatbot_with_personality(question, "teacher"))

### ✏️ Try It Yourself!

**Exercise:** Create your own custom personality!

In [None]:
# YOUR CODE HERE
# Create a custom system prompt and test it

custom_system_prompt = """You are a pirate captain. 
Speak like a pirate and give advice about sailing the seven seas!"""

# TODO: Use this system prompt to respond to a question


---

# Section 3: Conversation History & Memory

Let's build a chatbot that remembers the conversation!

## 3.1: Chatbot with Full Conversation History

In [None]:
class Chatbot:
    """
    A chatbot that maintains conversation history.
    """
    
    def __init__(self, system_prompt: str = "You are a helpful AI assistant."):
        self.messages = [
            {"role": "system", "content": system_prompt}
        ]
    
    def chat(self, user_message: str) -> str:
        """
        Send a message and get a response, maintaining history.
        """
        # Add user message to history
        self.messages.append({
            "role": "user",
            "content": user_message
        })
        
        # Get response from API
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=self.messages,
            temperature=0.7,
            max_tokens=200
        )
        
        # Extract assistant's response
        assistant_message = response.choices[0].message.content
        
        # Add assistant's response to history
        self.messages.append({
            "role": "assistant",
            "content": assistant_message
        })
        
        return assistant_message
    
    def get_history(self) -> List[Dict]:
        """Get the conversation history."""
        return self.messages
    
    def clear_history(self):
        """Clear conversation history (keep system prompt)."""
        self.messages = [self.messages[0]]  # Keep only system prompt


# Test the chatbot
bot = Chatbot(system_prompt="You are a friendly AI assistant who loves to help!")

print("👤: Hi, my name is Alex.")
print(f"🤖: {bot.chat('Hi, my name is Alex.')}")
print()

print("👤: What is my name?")
print(f"🤖: {bot.chat('What is my name?')}")
print()

print("👤: I love programming in Python.")
print(f"🤖: {bot.chat('I love programming in Python.')}")
print()

print("👤: What programming language did I mention?")
print(f"🤖: {bot.chat('What programming language did I mention?')}")

## 3.2: Viewing Conversation History

In [None]:
# Display the conversation history
print("📜 Conversation History:")
print("=" * 50)

for i, message in enumerate(bot.get_history(), 1):
    role = message["role"].upper()
    content = message["content"]
    print(f"\n{i}. [{role}]")
    print(f"   {content}")
    print("-" * 50)

## 3.3: Managing Memory with Window Size

Keeping all history can exceed token limits. Let's implement a sliding window:

In [None]:
class ChatbotWithMemoryWindow:
    """
    Chatbot that keeps only the last N messages to manage token limits.
    """
    
    def __init__(self, system_prompt: str = "You are a helpful AI assistant.", 
                 max_history: int = 10):
        """
        Args:
            system_prompt: The system message
            max_history: Maximum number of messages to keep (not counting system)
        """
        self.system_prompt = {"role": "system", "content": system_prompt}
        self.messages = []
        self.max_history = max_history
    
    def chat(self, user_message: str) -> str:
        """Send a message and get a response."""
        # Add user message
        self.messages.append({"role": "user", "content": user_message})
        
        # Trim history if needed (keep last max_history messages)
        if len(self.messages) > self.max_history:
            self.messages = self.messages[-self.max_history:]
        
        # Build messages for API (system + history)
        api_messages = [self.system_prompt] + self.messages
        
        # Get response
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=api_messages,
            temperature=0.7,
            max_tokens=200
        )
        
        assistant_message = response.choices[0].message.content
        
        # Add assistant response
        self.messages.append({"role": "assistant", "content": assistant_message})
        
        # Trim again if needed
        if len(self.messages) > self.max_history:
            self.messages = self.messages[-self.max_history:]
        
        return assistant_message


# Test with limited history
limited_bot = ChatbotWithMemoryWindow(max_history=6)  # Keep only the last 6 messages (about 3 exchanges)

print(f"🤖: {limited_bot.chat('Hi, I am learning about AI.')}")
print(f"🤖: {limited_bot.chat('My favorite color is blue.')}")
print(f"🤖: {limited_bot.chat('I have a dog named Max.')}")
print(f"🤖: {limited_bot.chat('I work as a software engineer.')}")
print()
print("Now testing memory...")
# The earliest messages have slid out of the window, so the bot should no longer know these
print(f"🤖: {limited_bot.chat('What am I learning about?')}")
print(f"🤖: {limited_bot.chat('What is my favorite color?')}")
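
A message-count window ignores how long each message is. One alternative is to trim by an estimated token budget instead. The sketch below is one possible approach, not part of the class above: `rough_token_count` and `trim_to_token_budget` are hypothetical helpers, and the count uses the rough 4-characters-per-token heuristic rather than a real tokenizer.

```python
from typing import Dict, List


def rough_token_count(message: Dict[str, str]) -> int:
    """Estimate tokens with the ~4 characters per token heuristic."""
    return max(1, len(message["content"]) // 4)


def trim_to_token_budget(messages: List[Dict[str, str]], budget: int) -> List[Dict[str, str]]:
    """Keep the most recent messages whose estimated tokens fit the budget."""
    kept: List[Dict[str, str]] = []
    used = 0
    # Walk backwards so the newest messages are kept first
    for message in reversed(messages):
        cost = rough_token_count(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


history = [
    {"role": "user", "content": "Hi, I am learning about AI."},
    {"role": "assistant", "content": "Great! What would you like to know?"},
    {"role": "user", "content": "My favorite color is blue."},
]

# With a budget of 15 estimated tokens, only the two newest messages fit
print(trim_to_token_budget(history, budget=15))
```

The trade-off is the same as with the message window: a smaller budget costs less per request but forgets earlier facts sooner.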

### ✏️ Try It Yourself!

**Exercise:** Test the memory window by having a long conversation and seeing what gets forgotten.

In [None]:
# YOUR CODE HERE
# Create a chatbot with max_history=4 and test memory limits


---

# Section 4: Streaming Responses

Streaming makes chatbots feel more responsive!

## 4.1: Basic Streaming

In [None]:
def stream_response(user_message: str):
    """
    Stream the response chunk by chunk as it is generated.
    """
    print("🤖: ", end="", flush=True)
    
    stream = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "user", "content": user_message}
        ],
        temperature=0.7,
        max_tokens=200,
        stream=True  # Enable streaming!
    )
    
    full_response = ""
    
    for chunk in stream:
        # Some chunks carry no choices or no text, so guard before reading
        if chunk.choices and chunk.choices[0].delta.content is not None:
            content = chunk.choices[0].delta.content
            print(content, end="", flush=True)
            full_response += content
    
    print()  # New line after streaming
    return full_response

# Test streaming
response = stream_response("Tell me a short story about a robot learning to code.")
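
To see the chunk-handling logic without calling the API, the sketch below feeds hand-built fake chunks through the same kind of accumulation loop. The `fake_chunk` and `collect_stream` helpers are hypothetical; the fake objects only mimic the `choices[0].delta.content` shape used above, and the guard for an empty `choices` list is a defensive assumption.

```python
from types import SimpleNamespace
from typing import Iterable, Optional


def fake_chunk(content: Optional[str]) -> SimpleNamespace:
    """Build an object shaped like a streamed chunk (for offline testing only)."""
    delta = SimpleNamespace(content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


def collect_stream(chunks: Iterable[SimpleNamespace]) -> str:
    """Concatenate the text pieces from a stream of chunks."""
    parts = []
    for chunk in chunks:
        # Skip chunks with no choices or no text
        if not chunk.choices or chunk.choices[0].delta.content is None:
            continue
        parts.append(chunk.choices[0].delta.content)
    return "".join(parts)


fake_stream = [
    fake_chunk("Hello"),
    fake_chunk(", "),
    fake_chunk("world!"),
    fake_chunk(None),             # e.g. a final chunk without text
    SimpleNamespace(choices=[]),  # a chunk with no choices
]

print(collect_stream(fake_stream))  # Hello, world!
```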

## 4.2: Chatbot with Streaming Support

In [None]:
class StreamingChatbot:
    """
    Chatbot with streaming support and conversation history.
    """
    
    def __init__(self, system_prompt: str = "You are a helpful AI assistant."):
        self.messages = [{"role": "system", "content": system_prompt}]
    
    def chat(self, user_message: str, stream: bool = False) -> str:
        """
        Chat with optional streaming.
        
        Args:
            user_message: The user's message
            stream: Whether to stream the response
        """
        # Add user message
        self.messages.append({"role": "user", "content": user_message})
        
        if stream:
            # Streaming response
            print("🤖: ", end="", flush=True)
            
            response_stream = client.chat.completions.create(
                model="gpt-3.5-turbo",
                messages=self.messages,
                temperature=0.7,
                max_tokens=200,
                stream=True
            )
            
            full_response = ""
            for chunk in response_stream:
                # Some chunks carry no choices or no text, so guard before reading
                if chunk.choices and chunk.choices[0].delta.content is not None:
                    content = chunk.choices[0].delta.content
                    print(content, end="", flush=True)
                    full_response += content
            
            print()  # New line
            
        else:
            # Non-streaming response
            response = client.chat.completions.create(
                model="gpt-3.5-turbo",
                messages=self.messages,
                temperature=0.7,
                max_tokens=200
            )
            full_response = response.choices[0].message.content
        
        # Add assistant response to history
        self.messages.append({"role": "assistant", "content": full_response})
        
        return full_response


# Test streaming chatbot
streaming_bot = StreamingChatbot()

print("Testing with streaming:")
streaming_bot.chat("Explain what machine learning is.", stream=True)
print()

print("\nTesting without streaming:")
response = streaming_bot.chat("What are the main types of machine learning?", stream=False)
print(f"🤖: {response}")

---

# Section 5: Prompt Engineering Basics

Learn how to write better prompts for better responses!

## 5.1: Be Specific and Clear

In [None]:
# ❌ Vague prompt
vague_prompt = "Tell me about Python."

# ✅ Specific prompt
specific_prompt = """Explain Python programming language to a beginner who has never coded before. 
Include: what it's used for, why it's popular, and one simple example.
Keep the explanation under 100 words."""

print("Vague Prompt Response:")
print(simple_chatbot(vague_prompt))
print("\n" + "="*50 + "\n")

print("Specific Prompt Response:")
print(simple_chatbot(specific_prompt))

## 5.2: Few-Shot Learning (Provide Examples)

In [None]:
# Using examples to teach the model a pattern
few_shot_messages = [
    {"role": "system", "content": "You convert sentences into emoji summaries."},
    {"role": "user", "content": "I love coding in Python."},
    {"role": "assistant", "content": "❤️💻🐍"},
    {"role": "user", "content": "I went to the beach and saw dolphins."},
    {"role": "assistant", "content": "🏖️🐬"},
    {"role": "user", "content": "I ate pizza and watched a movie."}
]

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=few_shot_messages,
    temperature=0.3,
    max_tokens=50
)

print("🤖:", response.choices[0].message.content)
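
Writing the alternating user/assistant examples by hand gets tedious as the number of examples grows. A small helper can build the message list from example pairs. This is a sketch with a hypothetical function name (`build_few_shot_messages`) and a made-up sentiment task, so adapt the examples to your own pattern.

```python
from typing import Dict, List, Tuple


def build_few_shot_messages(
    system_prompt: str,
    examples: List[Tuple[str, str]],
    new_input: str,
) -> List[Dict[str, str]]:
    """Build a chat message list from (input, output) example pairs."""
    messages = [{"role": "system", "content": system_prompt}]
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    # The final user message is the one the model should answer
    messages.append({"role": "user", "content": new_input})
    return messages


messages = build_few_shot_messages(
    "You classify the sentiment of a sentence as positive or negative.",
    [("I love this course!", "positive"), ("This bug is so frustrating.", "negative")],
    "The workshop was fantastic.",
)
print(len(messages))  # 6
```

The resulting list can be passed as the `messages` argument of `client.chat.completions.create`, just like `few_shot_messages` above.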

## 5.3: Prompt Templates

Reusable templates for common tasks:

In [None]:
def create_prompt_from_template(template: str, **kwargs) -> str:
    """
    Create a prompt from a template with variables.
    """
    return template.format(**kwargs)

# Define templates
SUMMARIZE_TEMPLATE = """
Summarize the following text in {num_sentences} sentences:

{text}

Summary:
"""

TRANSLATE_TEMPLATE = """
Translate the following text from {source_lang} to {target_lang}:

{text}

Translation:
"""

# Use the template
text_to_summarize = """
Artificial Intelligence (AI) is intelligence demonstrated by machines, 
in contrast to the natural intelligence displayed by humans and animals. 
Leading AI textbooks define the field as the study of "intelligent agents": 
any device that perceives its environment and takes actions that maximize 
its chance of successfully achieving its goals.
"""

prompt = create_prompt_from_template(
    SUMMARIZE_TEMPLATE,
    num_sentences=2,
    text=text_to_summarize
)

print("Generated Prompt:")
print(prompt)
print("\n" + "="*50 + "\n")

response = simple_chatbot(prompt)
print("🤖 Response:")
print(response)
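
A plain `template.format(**kwargs)` raises a bare `KeyError` when a variable is missing. One option is to inspect the template first and report every missing variable at once. The sketch below uses the standard library's `string.Formatter`; the helper names `template_fields` and `fill_template` are hypothetical.

```python
from string import Formatter
from typing import Set


def template_fields(template: str) -> Set[str]:
    """Return the placeholder names used in a str.format-style template."""
    return {field for _, field, _, _ in Formatter().parse(template) if field}


def fill_template(template: str, **kwargs) -> str:
    """Fill a template, raising a clear error if any variable is missing."""
    missing = template_fields(template) - set(kwargs)
    if missing:
        raise ValueError(f"Missing template variables: {sorted(missing)}")
    return template.format(**kwargs)


template = "Translate the following text from {source_lang} to {target_lang}:\n\n{text}"
print(sorted(template_fields(template)))  # ['source_lang', 'target_lang', 'text']
```

Calling `fill_template(template, text="Hello")` would raise a `ValueError` naming `source_lang` and `target_lang`.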

## 5.4: Chain of Thought Prompting

In [None]:
# Without chain of thought
simple_math = "What is 15% of 240?"

# With chain of thought
cot_math = """
What is 15% of 240?

Let's solve this step by step:
1. First, understand what we need to find
2. Convert the percentage to decimal
3. Multiply
4. Give the final answer
"""

print("Simple prompt:")
print(simple_chatbot(simple_math))
print("\n" + "="*50 + "\n")

print("Chain of thought prompt:")
print(simple_chatbot(cot_math))
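
When comparing the two prompts, it helps to know the correct answer in advance. The steps in the chain of thought prompt work out as follows (`percent_of` is a hypothetical helper for checking the model's answer yourself):

```python
def percent_of(percent: float, value: float) -> float:
    """Compute `percent`% of `value`."""
    return percent * value / 100


# 15% of 240: 15% as a decimal is 0.15, and 0.15 x 240 = 36
print(percent_of(15, 240))  # 36.0
```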

### ✏️ Try It Yourself!

**Exercise:** Create a prompt template for code explanation.

In [None]:
# YOUR CODE HERE
# Create a template that explains code in simple terms

CODE_EXPLAIN_TEMPLATE = """
# TODO: Create your template here
# Variables to include: {code}, {language}, {skill_level}
"""

# Test it with a code snippet


---

# 🎯 Summary & Key Takeaways

## What We Learned:

### 1. **LLM Fundamentals**
- Understanding tokens, temperature, and context windows
- Making API calls to OpenAI
- Examining response objects

### 2. **Building Chatbots**
- Single-turn vs. multi-turn conversations
- System prompts for personality
- Conversation history management

### 3. **Memory Management**
- Maintaining conversation context
- Sliding window approach for token limits
- Trade-offs between memory and performance

### 4. **Streaming Responses**
- Creating responsive, real-time chatbots
- Handling streamed chunks

### 5. **Prompt Engineering**
- Writing clear, specific prompts
- Few-shot learning with examples
- Prompt templates for reusability
- Chain of thought reasoning

---

## 📝 Next Steps:

### Exercises for This Week:

**Exercise 1 (Due Monday):** `02_exercise_personal_assistant.ipynb`
- Build a personal assistant chatbot
- Implement memory and personalities
- Add prompt templates

**Exercise 2 (Due Friday):** `03_exercise_domain_chatbot.ipynb`
- Create a domain-specific chatbot
- Implement streaming
- Add input validation

---

## 🤔 Reflection Questions:

1. When would you use high vs. low temperature?
2. Why is conversation history important?
3. What are the trade-offs of keeping all conversation history?
4. How does prompt engineering improve responses?

---

## 📚 Additional Resources:

- [OpenAI API Documentation](https://platform.openai.com/docs)
- [Prompt Engineering Guide](https://www.promptingguide.ai/)
- [Best Practices for Prompt Engineering](https://help.openai.com/en/articles/6654000-best-practices-for-prompt-engineering-with-openai-api)

---

## ❓ Questions?

**Office Hours:** Monday & Friday check-ins  
**Next Session:** Week 2 - LangChain Core Concepts

---

**Happy Coding! 🚀**