# Lesson 6 – Lists, Tuples & Dictionaries for GenAI

In this lesson, you'll learn three of Python's core built-in data structures (lists, tuples, and dictionaries) and how they're used in AI and machine learning applications.

---

## Learning Objectives
- Understand **lists**, **tuples**, and **dictionaries**
- Learn when to use each data structure
- Apply these concepts to GenAI scenarios
- Practice with real-world AI examples

---

## 1. Lists - Dynamic Collections

**Lists** are ordered, mutable collections that can store multiple items of any type.

### Basic List Operations

In [None]:
# Creating lists
ai_models = ["GPT-4", "Claude", "Gemini", "LLaMA"]
model_scores = [95, 92, 88, 85]
mixed_data = ["GPT-4", 95, True, 3.14]

print("AI Models:", ai_models)
print("Scores:", model_scores)
print("Mixed data:", mixed_data)

In [None]:
# Accessing list elements (indexing)
print("First model:", ai_models[0])
print("Last model:", ai_models[-1])
print("Best score:", max(model_scores))

In [None]:
# Modifying lists
ai_models.append("Mistral")  # Add to end
ai_models.insert(1, "Bard")  # Insert at position
ai_models.remove("Bard")     # Remove by value

print("Updated models:", ai_models)
print("Number of models:", len(ai_models))
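Beyond single-index access, lists also support slicing and membership tests. The following is a minimal sketch that reuses the sample model names from this section; the variable names `top_two`, `without_first`, and `has_gemini` are illustrative only.

```python
# Sample data mirroring the lesson's list of model names
ai_models = ["GPT-4", "Claude", "Gemini", "LLaMA"]

# Slicing returns a new list; the original list is left unchanged
top_two = ai_models[:2]        # first two items
without_first = ai_models[1:]  # everything except the first item

# Membership test with the `in` operator
has_gemini = "Gemini" in ai_models

print("Top two:", top_two)
print("Without first:", without_first)
print("Contains Gemini:", has_gemini)
```

Because a slice is a copy, you can trim or reorder the result without affecting `ai_models`.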

### GenAI Example: Managing Conversation History

In [None]:
# Real-world GenAI scenario: Managing conversation history
conversation_history = []

# Adding messages to conversation one by one
conversation_history.append({"role": "user", "content": "What is machine learning?"})
conversation_history.append({"role": "assistant", "content": "Machine learning is a subset of AI that enables computers to learn from data."})
conversation_history.append({"role": "user", "content": "Explain neural networks"})
conversation_history.append({"role": "assistant", "content": "Neural networks are computing systems inspired by biological neural networks."})

print(f"Conversation has {len(conversation_history)} messages")
print("Last message:", conversation_history[-1])
print("First message:", conversation_history[0])
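Conversation histories tend to grow, and many applications keep only the most recent messages before sending them to a model. The sketch below shows one possible way to do that with list slicing. The helper `trim_history` and its `max_messages` parameter are hypothetical names introduced for illustration, not part of any library.

```python
# Hypothetical helper: keep only the most recent messages
def trim_history(history, max_messages):
    """Return a new list with at most `max_messages` of the latest entries."""
    if max_messages <= 0:
        # history[-0:] would return the whole list, so handle this case explicitly
        return []
    return history[-max_messages:]

conversation_history = [
    {"role": "user", "content": "What is machine learning?"},
    {"role": "assistant", "content": "Machine learning is a subset of AI."},
    {"role": "user", "content": "Explain neural networks"},
    {"role": "assistant", "content": "Neural networks are inspired by the brain."},
]

recent = trim_history(conversation_history, 2)
print(f"Kept {len(recent)} of {len(conversation_history)} messages")
print("Oldest kept message:", recent[0])
```

The slice creates a new list, so the full `conversation_history` remains available if you need it later.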

### Basic List Processing

In [None]:
# Processing AI prompts step by step
raw_prompt1 = "  EXPLAIN AI  "
raw_prompt2 = "  what is ML?  "
raw_prompt3 = "  DEFINE NLP  "

# Clean prompts one by one
clean_prompt1 = raw_prompt1.strip().lower()
clean_prompt2 = raw_prompt2.strip().lower()
clean_prompt3 = raw_prompt3.strip().lower()

print("Original:", raw_prompt1)
print("Cleaned:", clean_prompt1)
print("Length:", len(clean_prompt1))

# Create a list of cleaned prompts
clean_prompts = [clean_prompt1, clean_prompt2, clean_prompt3]
print("All cleaned prompts:", clean_prompts)
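Cleaning each prompt in its own variable works for three prompts but does not scale. A list comprehension applies the same `strip().lower()` step to every item in one pass. This is a short sketch using the same sample prompts; `raw_prompts` is a name assumed here for illustration.

```python
# The same three raw prompts, stored in a single list
raw_prompts = ["  EXPLAIN AI  ", "  what is ML?  ", "  DEFINE NLP  "]

# Apply strip() and lower() to every prompt in one expression
clean_prompts = [prompt.strip().lower() for prompt in raw_prompts]

print("All cleaned prompts:", clean_prompts)
```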

---

## 2. Tuples - Immutable Collections

**Tuples** are ordered, immutable collections: once created, their items cannot be added, removed, or reassigned. They suit data that shouldn't change.

### Basic Tuple Operations

In [None]:
# Creating tuples
# (name, company, year, parameters); the parameter count is an unofficial estimate
model_info = ("GPT-4", "OpenAI", "2023", 1.76e12)
coordinates = (10.5, 20.3)  # x, y coordinates
rgb_color = (255, 128, 0)   # Red, Green, Blue values

print("Model info:", model_info)
print("Type:", type(model_info))

In [None]:
# Accessing tuple elements
model_name = model_info[0]
company = model_info[1]
param_count = model_info[3]

print(f"{model_name} by {company} has {param_count:.1e} parameters")

In [None]:
# Tuple unpacking (very useful!)
name, company, year, parameters = model_info
print(f"Unpacked: {name} was released by {company} in {year}")

# Multiple assignment using tuples
x, y = coordinates
print(f"Position: ({x}, {y})")
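To see what "immutable" means in practice, the sketch below attempts to reassign a tuple item and catches the resulting `TypeError`. It then shows one way to get a modified version: build a new tuple. The replacement name `"GPT-4 (updated)"` is a placeholder value chosen for this example.

```python
model_info = ("GPT-4", "OpenAI", "2023")

try:
    model_info[0] = "GPT-4 (updated)"  # Tuples do not support item assignment
except TypeError as error:
    print("Cannot modify a tuple:", error)

# To "change" a tuple, build a new one from the parts you want to keep
updated_info = ("GPT-4 (updated)",) + model_info[1:]

print("Original:", model_info)
print("New tuple:", updated_info)
```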

### GenAI Example: Model Configuration and Results

In [None]:
# AI model configurations (immutable settings)
gpt35_config = ("gpt-3.5-turbo", 0.7, 150, 0.01)  # (model, temperature, max_tokens, cost_per_1k)
gpt4_config = ("gpt-4", 0.5, 200, 0.03)
claude_config = ("claude-3", 0.8, 180, 0.02)

# Accessing tuple elements
model_name, temperature, max_tokens, cost = gpt4_config
print(f"Model: {model_name}")
print(f"Temperature: {temperature}")
print(f"Max tokens: {max_tokens}")
print(f"Cost per 1k tokens: ${cost}")

# Create a list of all configs
all_configs = [gpt35_config, gpt4_config, claude_config]
print(f"\nTotal models available: {len(all_configs)}")
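Plain tuples rely on you remembering which position holds which setting. One option, sketched here, is `typing.NamedTuple` from the standard library, which keeps tuple immutability but adds field names. The class name `ModelConfig` and its field names are assumptions made for this example.

```python
from typing import NamedTuple

# Hypothetical config type: same four fields as the tuples above, but named
class ModelConfig(NamedTuple):
    model: str
    temperature: float
    max_tokens: int
    cost_per_1k: float

gpt4_config = ModelConfig("gpt-4", 0.5, 200, 0.03)

# Fields can be read by name instead of by position
print(f"Model: {gpt4_config.model}")
print(f"Max tokens: {gpt4_config.max_tokens}")

# It is still a tuple, so indexing and unpacking keep working
model_name, temperature, max_tokens, cost = gpt4_config
print(f"Unpacked temperature: {temperature}")
```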

In [None]:
# Function returning multiple values (using tuples)
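def analyze_text(text):
    words = text.split()
    word_count = len(words)
    char_count = len(text)  # Includes spaces and punctuation
    # Average over the words themselves so spaces don't inflate the result
    avg_word_length = sum(len(word) for word in words) / word_count if word_count > 0 else 0
    return word_count, char_count, avg_word_length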

# Example: Analyzing AI-generated text
ai_response = "Artificial intelligence is transforming how we interact with technology."
words, chars, avg_len = analyze_text(ai_response)

print(f"Text analysis: {words} words, {chars} characters, avg length: {avg_len:.1f}")

---

## 3. Dictionaries - Key-Value Mappings

**Dictionaries** store data as key-value pairs, where each unique key maps to one value. They are essential for structured data in AI applications.

### Basic Dictionary Operations

In [None]:
# Creating dictionaries
ai_model = {
    "name": "GPT-4",
    "company": "OpenAI",
    "type": "Language Model",
    "parameters": 1760000000000,  # Unofficial estimate, used here as sample data
    "multimodal": True
}

print("AI Model:", ai_model)
print("Model name:", ai_model["name"])
print("Is multimodal:", ai_model["multimodal"])

In [None]:
# Adding and modifying dictionary data
ai_model["release_year"] = 2023
ai_model["parameters"] = "1.76 trillion"  # Overwrites the int with a readable string (the value's type changes)

print("Updated model:")
print(f"  name: {ai_model['name']}")
print(f"  company: {ai_model['company']}")
print(f"  type: {ai_model['type']}")
print(f"  parameters: {ai_model['parameters']}")
print(f"  multimodal: {ai_model['multimodal']}")
print(f"  release_year: {ai_model['release_year']}")

In [None]:
# Safe access with .get() method
cost = ai_model.get("cost_per_token", "Not specified")
print(f"Cost: {cost}")

# Check if key exists
if "multimodal" in ai_model:
    print(f"This model supports multimodal: {ai_model['multimodal']}")
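Printing each key by hand becomes tedious as a dictionary grows. The sketch below loops over every key-value pair with `.items()`; the `lines` list is only there to collect the formatted output for this example.

```python
ai_model = {"name": "GPT-4", "company": "OpenAI", "multimodal": True}

# .items() yields (key, value) pairs in insertion order
lines = []
for key, value in ai_model.items():
    lines.append(f"{key}: {value}")

for line in lines:
    print(" ", line)

# .keys() and .values() give the two halves separately
print("Keys:", list(ai_model.keys()))
print("Values:", list(ai_model.values()))
```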

### GenAI Example: Managing API Responses

In [None]:
# Simulating API response from an AI service
api_response = {
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "model": "gpt-4",
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": "Machine learning is a powerful subset of artificial intelligence."
            },
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 12,
        "completion_tokens": 15,
        "total_tokens": 27
    }
}

# Extracting information from nested dictionaries
response_text = api_response["choices"][0]["message"]["content"]
total_tokens = api_response["usage"]["total_tokens"]
model_used = api_response["model"]

print(f"AI Response: {response_text}")
print(f"Tokens used: {total_tokens} (Model: {model_used})")
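Chained square-bracket access raises `KeyError` or `IndexError` if any level of the response is missing. The following sketch shows one defensive alternative built on `.get()`. The helper name `extract_text` is hypothetical, and the response shape is the simulated one from this lesson rather than a guaranteed API contract.

```python
# Hypothetical helper: return the first choice's text, or None if it is missing
def extract_text(response):
    choices = response.get("choices") or []
    if not choices:
        return None
    message = choices[0].get("message", {})
    return message.get("content")

complete_response = {
    "choices": [
        {"message": {"role": "assistant", "content": "Hello!"}, "finish_reason": "stop"}
    ]
}
empty_response = {"choices": []}

print("Complete:", extract_text(complete_response))
print("Empty:", extract_text(empty_response))
```

Returning `None` lets the caller decide how to handle a missing answer instead of crashing partway through the lookup.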

### GenAI Example: Prompt Templates and Configuration

In [None]:
# AI prompt templates
prompt_templates = {
    "explain": "Explain {topic} in simple terms for a beginner.",
    "code": "Write Python code to {task}. Include comments.",
    "debug": "Help me debug this Python code: {code}"
}

# Using templates
topic = "neural networks"
explain_template = prompt_templates["explain"]
explain_prompt = explain_template.format(topic=topic)
print(f"Template: {explain_template}")
print(f"Generated prompt: {explain_prompt}")

# AI model settings
ai_settings = {
    "temperature": 0.7,
    "max_tokens": 150,
    "top_p": 0.9,
    "frequency_penalty": 0.1
}

print("\nAI Configuration:")
print(f"  temperature: {ai_settings['temperature']}")
print(f"  max_tokens: {ai_settings['max_tokens']}")
print(f"  top_p: {ai_settings['top_p']}")
print(f"  frequency_penalty: {ai_settings['frequency_penalty']}")
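A common pattern with settings dictionaries is to keep a set of defaults and override only a few values per request. The sketch below merges two dictionaries with `**` unpacking; `default_settings`, `overrides`, and `request_settings` are illustrative names.

```python
default_settings = {"temperature": 0.7, "max_tokens": 150, "top_p": 0.9}
overrides = {"temperature": 0.2, "max_tokens": 500}

# Later entries win, so values in `overrides` replace the defaults
request_settings = {**default_settings, **overrides}

print("Defaults:", default_settings)
print("Request settings:", request_settings)
```

The merge builds a new dictionary, so `default_settings` stays unchanged and can be reused for the next request.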

---

## 4. Combining Data Structures

Real GenAI applications often combine lists, tuples, and dictionaries.

In [None]:
# AI model information using combined data structures
gpt4_model = {
    "name": "GPT-4",
    "company": "OpenAI",
    "capabilities": ["text", "image", "code"],
    "pricing": (0.03, 0.06),  # (input_cost, output_cost) per 1k tokens
    "max_context": 8192
}

claude_model = {
    "name": "Claude-3",
    "company": "Anthropic",
    "capabilities": ["text", "image", "analysis"],
    "pricing": (0.015, 0.075),
    "max_context": 200000
}

# Accessing nested data
print(f"Model: {gpt4_model['name']}")
print(f"Company: {gpt4_model['company']}")
print(f"First capability: {gpt4_model['capabilities'][0]}")
print(f"Second capability: {gpt4_model['capabilities'][1]}")
print(f"Total capabilities: {len(gpt4_model['capabilities'])}")

# Accessing tuple within dictionary
input_cost, output_cost = gpt4_model["pricing"]
print(f"Input cost: ${input_cost}/1k tokens")
print(f"Output cost: ${output_cost}/1k tokens")

# Create a list of models
ai_models_db = [gpt4_model, claude_model]
print(f"\nTotal models in database: {len(ai_models_db)}")

In [None]:
# Checking model capabilities
print("Checking which models can process images:")

# Check GPT-4
if "image" in gpt4_model["capabilities"]:
    print(f"✓ {gpt4_model['name']} can process images")
else:
    print(f"✗ {gpt4_model['name']} cannot process images")

# Check Claude
if "image" in claude_model["capabilities"]:
    print(f"✓ {claude_model['name']} can process images")
else:
    print(f"✗ {claude_model['name']} cannot process images")

# Compare costs
gpt4_input_cost = gpt4_model["pricing"][0]
claude_input_cost = claude_model["pricing"][0]

print("\nCost comparison:")
print(f"GPT-4 input cost: ${gpt4_input_cost}/1k tokens")
print(f"Claude input cost: ${claude_input_cost}/1k tokens")

if gpt4_input_cost < claude_input_cost:
    print("GPT-4 is cheaper for input")
else:
    print("Claude is cheaper for input")
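Checking each model with its own `if` block works for two models, but a list of dictionaries can be processed in a loop instead. This sketch filters models by capability and finds the lowest input cost with `min()`. The data repeats the sample values from this section, and names such as `image_models` and `cheapest` are illustrative.

```python
ai_models_db = [
    {"name": "GPT-4", "capabilities": ["text", "image", "code"], "pricing": (0.03, 0.06)},
    {"name": "Claude-3", "capabilities": ["text", "image", "analysis"], "pricing": (0.015, 0.075)},
]

# Collect the names of all models that list a given capability
image_models = [m["name"] for m in ai_models_db if "image" in m["capabilities"]]
code_models = [m["name"] for m in ai_models_db if "code" in m["capabilities"]]

# pricing is (input_cost, output_cost), so index 0 is the input cost
cheapest = min(ai_models_db, key=lambda m: m["pricing"][0])

print("Image-capable models:", image_models)
print("Code-capable models:", code_models)
print("Cheapest input cost:", cheapest["name"])
```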

---

## 5. Practice Exercises

### Exercise 1: Conversation Manager

In [None]:
# TODO: Create a conversation manager
# 1. Create an empty list called 'conversation'
# 2. Add at least 3 message dictionaries with 'role' and 'content' keys
# 3. Print the total number of messages
# 4. Print the first and last messages

# Your code here:
conversation = []

# Example: Add your messages like this:
# conversation.append({"role": "user", "content": "Hello AI!"})
# conversation.append({"role": "assistant", "content": "Hello! How can I help?"})

# Print results
# print(f"Total messages: {len(conversation)}")
# print(f"First message: {conversation[0]}")
# print(f"Last message: {conversation[-1]}")

### Exercise 2: Model Performance Tracker

In [None]:
# TODO: Create a model performance tracker
# 1. Create a dictionary with model names as keys
# 2. Each value should be a tuple of (accuracy, speed, cost)
# 3. Compare models by accessing their data
# 4. Find which model has the best accuracy

# Example data to use:
# "GPT-4": (95, 8, 0.03)
# "Claude-3": (93, 12, 0.015)
# "Gemini": (88, 15, 0.001)

# Your code here:
model_performance = {
    # Add your data like: "GPT-4": (95, 8, 0.03)
}

# Example of how to access data:
# gpt4_data = model_performance["GPT-4"]
# accuracy, speed, cost = gpt4_data
# print(f"GPT-4 accuracy: {accuracy}%")

### Exercise 3: Prompt Library

In [None]:
# TODO: Build a prompt library system
# 1. Create a list of prompt categories: ["coding", "writing", "analysis"]
# 2. Create a dictionary where each category maps to a list of prompts
# 3. Add at least 2 prompts per category
# 4. Access prompts using indexing

# Your code here:
categories = ["coding", "writing", "analysis"]
prompt_library = {
    # Add your prompts like:
    # "coding": ["Write a Python function", "Debug this code"],
    # "writing": ["Write an essay about", "Create a summary of"],
    # "analysis": ["Analyze this data", "Compare these options"]
}

# Example of accessing prompts:
# coding_prompts = prompt_library["coding"]
# first_coding_prompt = coding_prompts[0]
# print(f"First coding prompt: {first_coding_prompt}")

---

## 6. Key Takeaways

### When to Use Each Data Structure:

| Data Structure | Use When | GenAI Examples |
|----------------|----------|----------------|
| **List** | Ordered, changeable data | Conversation history, token sequences, model outputs |
| **Tuple** | Ordered, unchangeable data | Model configurations, coordinates, multiple return values |
| **Dictionary** | Key-value relationships | API responses, model metadata, user profiles |

### Common Patterns in GenAI:
- **Lists**: Store sequences of messages, tokens, or results
- **Tuples**: Package related data (model settings, coordinates)
- **Dictionaries**: Structure complex data (API responses, configurations)

### Best Practices:
1. Use descriptive variable names
2. Choose the right data structure for your use case
3. Use `.get()` for safe dictionary access
4. Combine structures when needed for complex data
5. Consider immutability when data shouldn't change

---

## Next Steps
- Practice with real AI API responses
- Learn about data processing with pandas
- Explore JSON handling for API integration
- Study prompt engineering techniques