<a href="https://colab.research.google.com/github/jeremiahoclark/python-coding-patterns/blob/main/02_pythonic_idioms.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Pythonic Idioms and Language-Specific Patterns

This notebook covers Python-specific patterns that leverage the language's unique features for clean, idiomatic code. These patterns are not from the Gang of Four but are equally important for writing Pythonic code.

## Overview

Python has a rich set of idioms and established "Pythonic" ways to solve problems. This section covers:
- **EAFP vs LBYL**: Error handling philosophy
- **Context Managers**: Resource management with `with` statements
- **Comprehensions**: Concise collection creation
- **Duck Typing**: Polymorphism by protocol
- **Iteration Protocol**: Custom iterables and iterators

Each pattern includes practical examples with real-world data scenarios.

In [None]:
# Required imports for all patterns
import os
import time
import json
import contextlib
from contextlib import contextmanager
from typing import Protocol, Iterator, Iterable, Any, Dict, List, Optional
from io import StringIO
import tempfile
import random
from datetime import datetime, timedelta
import threading

## 2.1 EAFP vs LBYL (Coding Style Idiom)

**EAFP**: "Easier to Ask Forgiveness than Permission" - Try the operation and handle exceptions if it fails.

**LBYL**: "Look Before You Leap" - Check conditions before performing an action.

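**Python Philosophy**: EAFP is generally preferred in Python. It keeps the common path uncluttered and avoids check-then-act race conditions, where the state changes between the check and the operation (for example, a file deleted right after an existence check).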
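
To make the race-condition point concrete, here is a minimal sketch. The helper names `read_text_lbyl` and `read_text_eafp` and the file names are hypothetical and used only for illustration. The LBYL version checks and then acts, so the file can disappear between the two steps; the EAFP version makes a single attempt and handles the failure.

```python
import os
import tempfile


def read_text_lbyl(path, default=""):
    """LBYL: check, then act. The file may vanish between the two steps."""
    if os.path.exists(path):
        # Another process could delete the file right here,
        # in which case open() would still raise FileNotFoundError.
        with open(path, encoding="utf-8") as f:
            return f.read()
    return default


def read_text_eafp(path, default=""):
    """EAFP: attempt the read once and handle the failure."""
    try:
        with open(path, encoding="utf-8") as f:
            return f.read()
    except FileNotFoundError:
        return default


# Usage with a throwaway directory (hypothetical file names)
with tempfile.TemporaryDirectory() as tmp_dir:
    existing = os.path.join(tmp_dir, "settings.txt")
    with open(existing, "w", encoding="utf-8") as f:
        f.write("debug=true")

    missing = os.path.join(tmp_dir, "missing.txt")
    print(read_text_eafp(existing))        # debug=true
    print(read_text_eafp(missing, "n/a"))  # n/a
```

Both helpers return the same values in a single-threaded run; the difference only shows up when something else modifies the file system between the check and the open.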

**Real-world Example**: Processing configuration data where some keys might be missing.

In [None]:
# Sample configuration data with missing keys
user_configs = [
    {"user_id": 1, "name": "Alice", "email": "alice@example.com", "age": 30, "premium": True},
    {"user_id": 2, "name": "Bob", "email": "bob@example.com", "premium": False},  # Missing age
    {"user_id": 3, "name": "Charlie", "age": 25},  # Missing email and premium
    {"user_id": 4, "name": "Diana", "email": "diana@example.com", "age": 28, "premium": True}
]

def process_user_config_lbyl(config: Dict[str, Any]) -> Dict[str, Any]:
    """LBYL approach: Check before accessing."""
    result = {"user_id": config["user_id"], "name": config["name"]}

    # Check each field before accessing
    if "email" in config:
        result["email"] = config["email"]
        result["has_email"] = True
    else:
        result["email"] = "not_provided@unknown.com"
        result["has_email"] = False

    if "age" in config:
        result["age"] = config["age"]
        result["age_group"] = "adult" if config["age"] >= 18 else "minor"
    else:
        result["age"] = 0
        result["age_group"] = "unknown"

    if "premium" in config:
        result["premium"] = config["premium"]
        result["account_type"] = "premium" if config["premium"] else "basic"
    else:
        result["premium"] = False
        result["account_type"] = "basic"

    return result

def process_user_config_eafp(config: Dict[str, Any]) -> Dict[str, Any]:
    """EAFP approach: Try and handle exceptions."""
    result = {"user_id": config["user_id"], "name": config["name"]}

    # Try to access each field directly
    try:
        result["email"] = config["email"]
        result["has_email"] = True
    except KeyError:
        result["email"] = "not_provided@unknown.com"
        result["has_email"] = False

    try:
        age = config["age"]
        result["age"] = age
        result["age_group"] = "adult" if age >= 18 else "minor"
    except KeyError:
        result["age"] = 0
        result["age_group"] = "unknown"

    try:
        premium = config["premium"]
        result["premium"] = premium
        result["account_type"] = "premium" if premium else "basic"
    except KeyError:
        result["premium"] = False
        result["account_type"] = "basic"

    return result

def process_user_config_pythonic(config: Dict[str, Any]) -> Dict[str, Any]:
    """Pythonic approach using dict.get() method."""
    result = {"user_id": config["user_id"], "name": config["name"]}

    # Use dict.get() with defaults - neither pure LBYL nor EAFP
    email = config.get("email", "not_provided@unknown.com")
    age = config.get("age", 0)
    premium = config.get("premium", False)

    result.update({
        "email": email,
        "has_email": "@" in email and email != "not_provided@unknown.com",
        "age": age,
        "age_group": "adult" if age >= 18 else "minor" if age > 0 else "unknown",
        "premium": premium,
        "account_type": "premium" if premium else "basic"
    })

    return result

In [None]:
# Demo: EAFP vs LBYL
print("=== EAFP vs LBYL Demo ===")
print("Processing user configurations...\n")

for i, config in enumerate(user_configs, 1):
    print(f"User {i}: {config}")

    # Process with all three approaches
    lbyl_result = process_user_config_lbyl(config)
    eafp_result = process_user_config_eafp(config)
    pythonic_result = process_user_config_pythonic(config)

    # Verify all approaches produce the same result
    assert lbyl_result == eafp_result == pythonic_result, "Results should be identical"

    print(f"  Processed: {pythonic_result}")
    print()

# Performance comparison
print("=== Performance Comparison ===")

# Create test data with varying missing fields
test_configs = []
for i in range(1000):
    config = {"user_id": i, "name": f"User{i}"}
    if i % 2 == 0:  # 50% have email
        config["email"] = f"user{i}@example.com"
    if i % 3 == 0:  # 33% have age
        config["age"] = random.randint(18, 65)
    if i % 4 == 0:  # 25% have premium
        config["premium"] = random.choice([True, False])
    test_configs.append(config)

# Time each approach
approaches = [
    ("LBYL", process_user_config_lbyl),
    ("EAFP", process_user_config_eafp),
    ("Pythonic", process_user_config_pythonic)
]

for name, func in approaches:
    # perf_counter() is a monotonic, high-resolution clock intended for timing
    start_time = time.perf_counter()
    results = [func(config) for config in test_configs]
    elapsed = time.perf_counter() - start_time

    print(f"{name} approach: {elapsed:.4f} seconds for {len(test_configs)} configs")

print("\nNote: timings vary between runs and machines; dict.get() is usually the most concise of the three.")

## 2.2 Context Manager Pattern (`with` statement)

**Problem**: Ensure resources are properly cleaned up (files closed, locks released, connections closed) even if exceptions occur.

**Solution**: Use context managers with the `with` statement to guarantee cleanup.
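
As a minimal sketch of that guarantee, the hypothetical `Recorder` class below (not part of the examples that follow) only records the order in which things happen. It illustrates that `__exit__` runs even when the body of the `with` block raises.

```python
class Recorder:
    """Tiny context manager that records when setup and cleanup run."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs whether or not the body raised.
        self.events.append("exit")
        return False  # do not suppress the exception


recorder = Recorder()
try:
    with recorder:
        recorder.events.append("body")
        raise RuntimeError("boom")
except RuntimeError:
    recorder.events.append("handled")

print(recorder.events)  # ['enter', 'body', 'exit', 'handled']
```

Note that `"exit"` appears before `"handled"`: cleanup happens as the exception leaves the `with` block, before any outer handler sees it.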

**Real-world Example**: Managing database connections, temporary files, and performance timing.

In [None]:
# Custom context manager class for database simulation
class DatabaseConnection:
    """Simulates a database connection with proper resource management."""

    def __init__(self, db_name: str, host: str = "localhost"):
        self.db_name = db_name
        self.host = host
        self.connection = None
        self.transaction_active = False
        self.queries_executed = 0

    def __enter__(self):
        print(f"🔌 Connecting to database '{self.db_name}' at {self.host}")
        # Simulate connection setup
        time.sleep(0.1)  # Simulate connection time
        self.connection = f"conn_to_{self.db_name}"
        print("✅ Connected successfully")
        return self

    def __exit__(self, exc_type, exc_value, exc_traceback):
        if exc_type is not None:
            print(f"❌ Exception occurred: {exc_type.__name__}: {exc_value}")
            if self.transaction_active:
                print("🔄 Rolling back transaction")
                self.transaction_active = False
        else:
            if self.transaction_active:
                print("✅ Committing transaction")
                self.transaction_active = False

        print(f"🔌 Closing database connection (executed {self.queries_executed} queries)")
        self.connection = None
        return False  # Don't suppress exceptions

    def begin_transaction(self):
        print("🚀 Beginning transaction")
        self.transaction_active = True

    def execute_query(self, query: str) -> List[Dict[str, Any]]:
        if not self.connection:
            raise RuntimeError("Not connected to database")

        self.queries_executed += 1
        print(f"📊 Executing query: {query}")

        # Simulate query results
        if "users" in query.lower():
            return [
                {"id": 1, "name": "Alice", "role": "admin"},
                {"id": 2, "name": "Bob", "role": "user"}
            ]
        elif "products" in query.lower():
            return [
                {"id": 1, "name": "Laptop", "price": 999.99},
                {"id": 2, "name": "Mouse", "price": 29.99}
            ]
        else:
            return [{"result": "success", "rows_affected": 1}]

# Function-based context manager using @contextmanager decorator
@contextmanager
def timing_context(operation_name: str):
    """Context manager for timing operations."""
    print(f"⏱️  Starting {operation_name}")
    start_time = time.time()

    try:
        yield start_time
    finally:
        end_time = time.time()
        duration = end_time - start_time
        print(f"⏱️  {operation_name} completed in {duration:.4f} seconds")

@contextmanager
def temporary_config_file(config_data: Dict[str, Any]):
    """Context manager for temporary configuration files."""
    # Create temporary file
    temp_file = tempfile.NamedTemporaryFile(mode='w', suffix='.json', delete=False)
    temp_path = temp_file.name

    try:
        # Write config data
        json.dump(config_data, temp_file, indent=2)
        temp_file.flush()
        temp_file.close()

        print(f"📁 Created temporary config file: {temp_path}")
        yield temp_path

    finally:
        # Cleanup
        if os.path.exists(temp_path):
            os.unlink(temp_path)
            print(f"🗑️  Removed temporary config file: {temp_path}")

@contextmanager
def thread_safe_counter():
    """Context manager for thread-safe operations."""
    lock = threading.Lock()
    counter = {"value": 0}

    def increment():
        with lock:
            counter["value"] += 1
            return counter["value"]

    def get_value():
        with lock:
            return counter["value"]

    print("🔒 Thread-safe counter initialized")
    try:
        yield {"increment": increment, "get_value": get_value}
    finally:
        print(f"🔒 Thread-safe counter final value: {counter['value']}")

In [None]:
# Demo: Context Manager Pattern
print("=== Context Manager Pattern Demo ===")

# 1. Database connection with transaction
print("\n--- Database Connection Context Manager ---")
try:
    with DatabaseConnection("sales_db") as db:
        db.begin_transaction()
        users = db.execute_query("SELECT * FROM users")
        products = db.execute_query("SELECT * FROM products")

        print(f"Found {len(users)} users and {len(products)} products")

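        # Simulate some business logic
        # (string-built SQL is acceptable only in this simulation; real code should use parameterized queries)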
        for user in users:
            db.execute_query(f"UPDATE user_stats SET last_login = NOW() WHERE id = {user['id']}")

        # Transaction will be committed automatically
except Exception as e:
    print(f"Database operation failed: {e}")

# 2. Error handling with rollback
print("\n--- Database Error Handling ---")
try:
    with DatabaseConnection("inventory_db") as db:
        db.begin_transaction()
        db.execute_query("UPDATE inventory SET quantity = quantity - 1 WHERE product_id = 123")

        # Simulate an error
        raise ValueError("Insufficient inventory!")

        db.execute_query("INSERT INTO orders (product_id, quantity) VALUES (123, 1)")

except ValueError as e:
    print(f"Business logic error handled: {e}")

# 3. Timing context manager
print("\n--- Timing Context Manager ---")
with timing_context("Data Processing"):
    # Simulate some work
    data = [i**2 for i in range(10000)]
    result = sum(data)
    print(f"Processed {len(data)} items, sum = {result}")

# 4. Temporary file context manager
print("\n--- Temporary File Context Manager ---")
config = {
    "database": {"host": "localhost", "port": 5432, "name": "myapp"},
    "api": {"timeout": 30, "retries": 3},
    "features": {"cache_enabled": True, "debug_mode": False}
}

with temporary_config_file(config) as config_path:
    # Read and use the temporary config
    with open(config_path, 'r') as f:
        loaded_config = json.load(f)

    print(f"Loaded config from {config_path}:")
    print(f"  Database: {loaded_config['database']['name']}")
    print(f"  API timeout: {loaded_config['api']['timeout']}s")
    print(f"  Cache enabled: {loaded_config['features']['cache_enabled']}")

# 5. Thread-safe counter
print("\n--- Thread-Safe Counter Context Manager ---")
with thread_safe_counter() as counter:
    # Simulate multiple operations
    for i in range(5):
        new_value = counter["increment"]()
        print(f"  Operation {i+1}: Counter = {new_value}")

    final_value = counter["get_value"]()
    print(f"  Final counter value: {final_value}")

# 6. Multiple context managers
print("\n--- Multiple Context Managers ---")
sample_data = {"results": [random.randint(1, 100) for _ in range(1000)]}

with timing_context("Multi-context operation"), \
     temporary_config_file(sample_data) as temp_file, \
     DatabaseConnection("analytics_db") as db:

    # Load data from temp file
    with open(temp_file, 'r') as f:
        data = json.load(f)

    # Process data
    results = data["results"]
    stats = {
        "count": len(results),
        "mean": sum(results) / len(results),
        "min": min(results),
        "max": max(results)
    }

    # Store in database
    db.execute_query(f"INSERT INTO analytics (count, mean, min, max) VALUES ({stats['count']}, {stats['mean']:.2f}, {stats['min']}, {stats['max']})")

    print(f"Processed {stats['count']} values:")
    print(f"  Mean: {stats['mean']:.2f}")
    print(f"  Range: {stats['min']} - {stats['max']}")

print("\n✅ All context managers handled resources properly!")

## 2.3 Comprehensions and Generator Expressions

**Problem**: Need concise, readable ways to create collections and process data.

**Solution**: Use list, dict, set comprehensions and generator expressions instead of verbose loops.
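
As a small sketch of the equivalence (the variable names here are illustrative only), the loop and the list comprehension below build the same list, and the generator expression produces the same values one at a time without building a list.

```python
numbers = range(10)

# Explicit loop: filter even numbers and square them
squares_loop = []
for n in numbers:
    if n % 2 == 0:
        squares_loop.append(n * n)

# Equivalent list comprehension
squares_comp = [n * n for n in numbers if n % 2 == 0]

# Dict and set comprehensions use the same syntax with braces
square_by_number = {n: n * n for n in numbers if n % 2 == 0}
last_digits = {(n * n) % 10 for n in numbers}

# Generator expression: values are produced lazily, one at a time
squares_gen = (n * n for n in numbers if n % 2 == 0)
total = sum(squares_gen)

print(squares_loop == squares_comp)  # True
print(square_by_number[4])           # 16
print(sorted(last_digits))           # [0, 1, 4, 5, 6, 9]
print(total)                         # 120
```

A generator expression can be consumed only once; after `sum()` has run, `squares_gen` is exhausted.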

**Real-world Example**: Processing log data, transforming datasets, and analyzing user behavior.

In [None]:
# Sample data: simulated server logs
log_entries = [
    {"timestamp": "2024-01-15 10:30:15", "level": "INFO", "message": "User login successful", "user_id": 123, "ip": "192.168.1.100", "response_time": 45},
    {"timestamp": "2024-01-15 10:31:22", "level": "ERROR", "message": "Database connection failed", "user_id": None, "ip": "192.168.1.101", "response_time": 0},
    {"timestamp": "2024-01-15 10:32:18", "level": "INFO", "message": "API request processed", "user_id": 456, "ip": "192.168.1.102", "response_time": 120},
    {"timestamp": "2024-01-15 10:33:05", "level": "WARN", "message": "Slow query detected", "user_id": 789, "ip": "192.168.1.103", "response_time": 2500},
    {"timestamp": "2024-01-15 10:34:12", "level": "INFO", "message": "User logout", "user_id": 123, "ip": "192.168.1.100", "response_time": 25},
    {"timestamp": "2024-01-15 10:35:30", "level": "ERROR", "message": "Authentication failed", "user_id": None, "ip": "192.168.1.104", "response_time": 15},
    {"timestamp": "2024-01-15 10:36:45", "level": "INFO", "message": "File upload completed", "user_id": 456, "ip": "192.168.1.102", "response_time": 850},
    {"timestamp": "2024-01-15 10:37:20", "level": "DEBUG", "message": "Cache miss", "user_id": 789, "ip": "192.168.1.103", "response_time": 5}
]

# Sample sales data
sales_data = [
    {"product": "Laptop", "category": "Electronics", "price": 999.99, "quantity": 2, "discount": 0.1},
    {"product": "Mouse", "category": "Electronics", "price": 29.99, "quantity": 5, "discount": 0.0},
    {"product": "Keyboard", "category": "Electronics", "price": 79.99, "quantity": 3, "discount": 0.05},
    {"product": "Book", "category": "Education", "price": 19.99, "quantity": 10, "discount": 0.15},
    {"product": "Notebook", "category": "Office", "price": 4.99, "quantity": 20, "discount": 0.0},
    {"product": "Monitor", "category": "Electronics", "price": 299.99, "quantity": 1, "discount": 0.2}
]

def demonstrate_comprehensions():
    """Demonstrate various comprehension patterns."""
    print("=== Comprehensions Demo ===")

    # 1. List Comprehension: Extract error messages
    print("\n--- List Comprehensions ---")

    # Simple filtering and transformation
    error_messages = [entry["message"] for entry in log_entries if entry["level"] == "ERROR"]
    print(f"Error messages: {error_messages}")

    # Complex transformation with multiple conditions
    slow_requests = [
        f"User {entry['user_id']} from {entry['ip']}: {entry['response_time']}ms"
        for entry in log_entries
        if entry["response_time"] > 100 and entry["user_id"] is not None
    ]
    print(f"Slow requests: {slow_requests}")

    # Nested comprehension: Extract unique IP addresses per log level
    # (the inner set comprehension removes duplicate IPs)
    ip_by_level = [
        (level, sorted({entry["ip"] for entry in log_entries if entry["level"] == level}))
        for level in sorted({entry["level"] for entry in log_entries})
    ]
    print(f"IPs by level: {dict(ip_by_level)}")

    # 2. Dictionary Comprehensions
    print("\n--- Dictionary Comprehensions ---")

    # Map each user to a response time
    # (duplicate keys overwrite, so the last log entry per user wins)
    user_response_times = {
        entry["user_id"]: entry["response_time"]
        for entry in log_entries
        if entry["user_id"] is not None
    }
    print(f"User response times: {user_response_times}")

    # Product revenue calculations
    product_revenues = {
        item["product"]: item["price"] * item["quantity"] * (1 - item["discount"])
        for item in sales_data
    }
    print(f"Product revenues: {product_revenues}")

    # Category summaries
    category_stats = {
        category: {
            "items": len([item for item in sales_data if item["category"] == category]),
            "total_revenue": sum(
                item["price"] * item["quantity"] * (1 - item["discount"])
                for item in sales_data if item["category"] == category
            )
        }
        for category in set(item["category"] for item in sales_data)
    }
    print(f"Category stats: {category_stats}")

    # 3. Set Comprehensions
    print("\n--- Set Comprehensions ---")

    # Unique IP addresses that had errors
    error_ips = {entry["ip"] for entry in log_entries if entry["level"] == "ERROR"}
    print(f"IPs with errors: {error_ips}")

    # Price buckets (rounded down to a multiple of 10)
    price_ranges = {int(item["price"] // 10) * 10 for item in sales_data}
    print(f"Price ranges: {sorted(price_ranges)}")

    # Unique first letters of product names
    product_initials = {item["product"][0].upper() for item in sales_data}
    print(f"Product initials: {sorted(product_initials)}")

def demonstrate_generator_expressions():
    """Demonstrate generator expressions for memory efficiency."""
    print("\n=== Generator Expressions Demo ===")

    # Memory-efficient processing of large datasets
    print("\n--- Memory-Efficient Processing ---")

    # Generator for processing response times
    response_time_gen = (
        entry["response_time"]
        for entry in log_entries
        if entry["response_time"] > 0
    )

    # Materialize the generator once: a generator can be consumed only a single
    # time, and the statistics below need three passes (sum, max, min)
    valid_times = list(response_time_gen)
    avg_response_time = sum(valid_times) / len(valid_times)
    max_response_time = max(valid_times)
    min_response_time = min(valid_times)

    print("Response time stats:")
    print(f"  Average: {avg_response_time:.2f}ms")
    print(f"  Range: {min_response_time}ms - {max_response_time}ms")

    # Generator for revenue calculations
    revenue_gen = (
        item["price"] * item["quantity"] * (1 - item["discount"])
        for item in sales_data
    )

    total_revenue = sum(revenue_gen)
    print(f"\nTotal revenue: ${total_revenue:.2f}")

    # Large dataset simulation
    print("\n--- Large Dataset Simulation ---")

    # Generate large amount of data lazily
    large_dataset_gen = (
        {"id": i, "value": i**2, "category": "A" if i % 2 == 0 else "B"}
        for i in range(100000)
    )

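    # Filter lazily: items are created and discarded one at a time, so memory
    # use stays flat. The generator still walks all 100,000 ids; the filter
    # only decides which values are summed.
    category_a_sum = sum(
        item["value"]
        for item in large_dataset_gen
        if item["category"] == "A" and item["id"] < 1000
    )

    print(f"Sum of category A values (ids below 1000): {category_a_sum}")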

    # Demonstrate lazy evaluation
    print("\n--- Lazy Evaluation Demo ---")

    def expensive_operation(x):
        """Simulate an expensive operation."""
        time.sleep(0.001)  # Simulate work
        return x * x

    # Create generator (no work done yet)
    start_time = time.perf_counter()
    expensive_gen = (expensive_operation(i) for i in range(1000))
    creation_time = time.perf_counter() - start_time
    print(f"Generator creation time: {creation_time:.4f}s")

    # Use only first 10 values (only 10 operations performed)
    start_time = time.perf_counter()
    first_ten = [next(expensive_gen) for _ in range(10)]
    consumption_time = time.perf_counter() - start_time
    print(f"Processing first 10 values: {consumption_time:.4f}s")
    print(f"First 10 squares: {first_ten}")

def demonstrate_advanced_patterns():
    """Demonstrate advanced comprehension patterns."""
    print("\n=== Advanced Comprehension Patterns ===")

    # 1. Conditional expressions in comprehensions
    print("\n--- Conditional Expressions ---")

    log_summaries = [
        f"{entry['level']}: {entry['message'][:30]}{'...' if len(entry['message']) > 30 else ''}"
        for entry in log_entries
    ]
    print("Log summaries:")
    for summary in log_summaries:
        print(f"  {summary}")

    # 2. Multiple conditions and transformations
    print("\n--- Complex Transformations ---")

    enhanced_sales = [
        {
            **item,
            "final_price": item["price"] * (1 - item["discount"]),
            "total_value": item["price"] * item["quantity"] * (1 - item["discount"]),
            "price_tier": "premium" if item["price"] > 100 else "standard" if item["price"] > 50 else "budget",
            "high_volume": item["quantity"] > 5
        }
        for item in sales_data
    ]

    print("Enhanced sales data:")
    for item in enhanced_sales:
        print(f"  {item['product']}: ${item['final_price']:.2f} ({item['price_tier']}) - Total: ${item['total_value']:.2f}")

    # 3. Flattening nested structures
    print("\n--- Flattening Nested Data ---")

    nested_data = [
        {"department": "Sales", "employees": ["Alice", "Bob", "Charlie"]},
        {"department": "Engineering", "employees": ["David", "Eve", "Frank", "Grace"]},
        {"department": "Marketing", "employees": ["Henry", "Ivy"]}
    ]

    # Flatten to list of (employee, department) tuples
    employee_dept_pairs = [
        (employee, dept["department"])
        for dept in nested_data
        for employee in dept["employees"]
    ]

    print(f"Employee-Department pairs: {employee_dept_pairs}")

    # Create department size mapping
    dept_sizes = {
        dept["department"]: len(dept["employees"])
        for dept in nested_data
    }

    print(f"Department sizes: {dept_sizes}")

    # 4. Using walrus operator (:=) in comprehensions (Python 3.8+)
    print("\n--- Walrus Operator in Comprehensions ---")

    # Calculate and filter in one pass
    high_value_items = [
        f"{item['product']}: ${total_value:.2f}"
        for item in sales_data
        if (total_value := item["price"] * item["quantity"] * (1 - item["discount"])) > 500
    ]

    print(f"High-value items (>$500): {high_value_items}")

In [None]:
# Demo: Comprehensions and Generator Expressions
demonstrate_comprehensions()
demonstrate_generator_expressions()
demonstrate_advanced_patterns()

## 2.4 Duck Typing and Protocols

**Philosophy**: "If it looks like a duck and quacks like a duck, it's a duck."

**Problem**: Need flexible code that works with different types as long as they support required operations.

**Solution**: Write code that depends on behavior (methods/attributes) rather than specific types.
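
A minimal sketch of the idea, using hypothetical `Quacker`, `Duck`, `Robot`, and `Rock` names that do not appear elsewhere in this notebook: `make_noise` accepts any object with a `quack()` method, and a `@runtime_checkable` protocol allows an optional `isinstance` check that looks only at whether the method is present.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Quacker(Protocol):
    """Anything with a quack() method counts as a Quacker."""
    def quack(self) -> str: ...


class Duck:
    def quack(self) -> str:
        return "Quack!"


class Robot:
    # Unrelated to Duck, but it has the same method
    def quack(self) -> str:
        return "Beep-quack"


class Rock:
    pass


def make_noise(thing: Quacker) -> str:
    # No check against a concrete class: any object with quack() works.
    return thing.quack()


print(make_noise(Duck()))            # Quack!
print(make_noise(Robot()))           # Beep-quack
print(isinstance(Robot(), Quacker))  # True
print(isinstance(Rock(), Quacker))   # False
```

The runtime `isinstance` check verifies only that the method exists, not its signature or return type; static type checkers perform the fuller structural check.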

**Real-world Example**: File processing system that works with different data sources.

In [None]:
# Duck typing examples with data sources
from typing import Protocol, runtime_checkable

# Define protocols for static typing (optional but helpful)
@runtime_checkable
class Readable(Protocol):
    """Protocol for objects that can be read."""
    def read(self) -> str: ...
    def close(self) -> None: ...

@runtime_checkable
class DataSource(Protocol):
    """Protocol for data sources."""
    def get_data(self) -> List[Dict[str, Any]]: ...
    def get_metadata(self) -> Dict[str, Any]: ...

class FileDataSource:
    """Data source that reads from files."""

    def __init__(self, filename: str):
        self.filename = filename
        self._data = None

    def get_data(self) -> List[Dict[str, Any]]:
        if self._data is None:
            # Simulate reading from file
            self._data = [
                {"id": 1, "name": "Alice", "score": 95, "source": "file"},
                {"id": 2, "name": "Bob", "score": 87, "source": "file"},
                {"id": 3, "name": "Charlie", "score": 92, "source": "file"}
            ]
        return self._data

    def get_metadata(self) -> Dict[str, Any]:
        return {
            "source_type": "file",
            "filename": self.filename,
            "last_modified": "2024-01-15 10:30:00",
            "size_bytes": 256
        }

class DatabaseDataSource:
    """Data source that reads from database."""

    def __init__(self, connection_string: str):
        self.connection_string = connection_string
        self._cached_data = None

    def get_data(self) -> List[Dict[str, Any]]:
        if self._cached_data is None:
            # Simulate database query
            self._cached_data = [
                {"id": 4, "name": "Diana", "score": 98, "source": "database"},
                {"id": 5, "name": "Eve", "score": 89, "source": "database"},
                {"id": 6, "name": "Frank", "score": 94, "source": "database"}
            ]
        return self._cached_data

    def get_metadata(self) -> Dict[str, Any]:
        return {
            "source_type": "database",
            "connection": self.connection_string,
            "last_query": "2024-01-15 10:35:00",
            "rows_returned": len(self.get_data())
        }

class APIDataSource:
    """Data source that reads from API."""

    def __init__(self, api_url: str, api_key: str):
        self.api_url = api_url
        self.api_key = api_key

    def get_data(self) -> List[Dict[str, Any]]:
        # Simulate API call
        return [
            {"id": 7, "name": "Grace", "score": 91, "source": "api"},
            {"id": 8, "name": "Henry", "score": 86, "source": "api"}
        ]

    def get_metadata(self) -> Dict[str, Any]:
        return {
            "source_type": "api",
            "endpoint": self.api_url,
            "api_version": "v2.1",
            "rate_limit": "1000/hour"
        }

class MockDataSource:
    """Mock data source for testing."""

    def __init__(self, mock_data: List[Dict[str, Any]]):
        self.mock_data = mock_data

    def get_data(self) -> List[Dict[str, Any]]:
        return self.mock_data

    def get_metadata(self) -> Dict[str, Any]:
        return {
            "source_type": "mock",
            "records_count": len(self.mock_data),
            "created_at": datetime.now().isoformat()
        }

# Duck typing in action - functions that work with any "data source"
def analyze_data(data_source) -> Dict[str, Any]:
    """Analyze data from any source that implements the DataSource protocol."""
    # Duck typing: we don't check the type, just use the methods
    try:
        data = data_source.get_data()
        metadata = data_source.get_metadata()

        # Perform analysis
        scores = [item["score"] for item in data if "score" in item]

        analysis = {
            "source_info": metadata,
            "record_count": len(data),
            "has_scores": len(scores) > 0
        }

        if scores:
            analysis.update({
                "average_score": sum(scores) / len(scores),
                "max_score": max(scores),
                "min_score": min(scores),
                "score_distribution": {
                    "excellent": len([s for s in scores if s >= 95]),
                    "good": len([s for s in scores if 85 <= s < 95]),
                    "fair": len([s for s in scores if s < 85])
                }
            })

        return analysis

    except AttributeError as e:
        return {"error": f"Object doesn't implement required methods: {e}"}

def merge_data_sources(*sources) -> List[Dict[str, Any]]:
    """Merge data from multiple sources."""
    merged_data = []
    source_stats = []

    for i, source in enumerate(sources):
        try:
            data = source.get_data()
            metadata = source.get_metadata()

            # Tag copies of the records so the source's own (possibly cached)
            # data is not mutated as a side effect
            tagged = [
                {**record, "source_index": i, "source_type": metadata.get("source_type", "unknown")}
                for record in data
            ]

            merged_data.extend(tagged)
            source_stats.append({
                "index": i,
                "type": metadata.get("source_type", "unknown"),
                "records": len(data)
            })

        except AttributeError as e:
            print(f"Warning: Source {i} doesn't implement required methods: {e}")
            continue

    print(f"Merged data from {len(source_stats)} sources: {source_stats}")
    return merged_data

# File-like objects with duck typing
class StringDataFile:
    """String-based file-like object."""

    def __init__(self, content: str):
        self.content = content
        self.position = 0
        self.closed = False

    def read(self, size: int = -1) -> str:
        if self.closed:
            raise ValueError("I/O operation on closed file")

        if size == -1:
            result = self.content[self.position:]
            self.position = len(self.content)
        else:
            result = self.content[self.position:self.position + size]
            self.position += len(result)

        return result

    def readline(self) -> str:
        if self.closed:
            raise ValueError("I/O operation on closed file")

        newline_pos = self.content.find('\n', self.position)
        if newline_pos == -1:
            result = self.content[self.position:]
            self.position = len(self.content)
        else:
            result = self.content[self.position:newline_pos + 1]
            self.position = newline_pos + 1

        return result

    def close(self) -> None:
        self.closed = True

class LogFile:
    """Log file that adds timestamps."""

    def __init__(self):
        self.lines = []
        self.position = 0
        self.closed = False

    def write(self, text: str) -> None:
        timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
        self.lines.append(f"[{timestamp}] {text}")

    def read(self, size: int = -1) -> str:
        content = "\n".join(self.lines)
        if size == -1:
            return content
        else:
            return content[:size]

    def readline(self) -> str:
        if self.position < len(self.lines):
            line = self.lines[self.position] + "\n"
            self.position += 1
            return line
        return ""

    def close(self) -> None:
        self.closed = True

def process_file_like_object(file_obj) -> Dict[str, Any]:
    """Process any file-like object using duck typing."""
    try:
        # Try to read the entire content
        content = file_obj.read()

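        # Analyze the content. splitlines() returns [] for empty content and
        # does not count a phantom empty line after a trailing newline.
        lines = content.splitlines()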
        words = content.split()

        analysis = {
            "total_characters": len(content),
            "total_lines": len(lines),
            "total_words": len(words),
            "average_line_length": sum(len(line) for line in lines) / len(lines) if lines else 0,
            "has_timestamps": any("[" in line and "]" in line for line in lines)
        }

        return analysis

    except AttributeError as e:
        return {"error": f"Object doesn't behave like a file: {e}"}
    finally:
        # Try to close if the object supports it
        if hasattr(file_obj, 'close'):
            file_obj.close()

In [None]:
# Demo: Duck Typing and Protocols
print("=== Duck Typing and Protocols Demo ===")

# 1. Data source duck typing
print("\n--- Data Source Duck Typing ---")

# Create different data sources
file_source = FileDataSource("students.json")
db_source = DatabaseDataSource("postgresql://localhost/school")
api_source = APIDataSource("https://api.school.com/students", "key123")
mock_source = MockDataSource([
    {"id": 9, "name": "Ivy", "score": 93, "source": "mock"},
    {"id": 10, "name": "Jack", "score": 88, "source": "mock"}
])

# Analyze each source using duck typing
sources = [
    ("File Source", file_source),
    ("Database Source", db_source),
    ("API Source", api_source),
    ("Mock Source", mock_source)
]

for name, source in sources:
    print(f"\n{name}:")
    analysis = analyze_data(source)

    if "error" in analysis:
        print(f"  Error: {analysis['error']}")
    else:
        print(f"  Type: {analysis['source_info']['source_type']}")
        print(f"  Records: {analysis['record_count']}")
        if analysis['has_scores']:
            print(f"  Average Score: {analysis['average_score']:.1f}")
            print(f"  Score Range: {analysis['min_score']} - {analysis['max_score']}")
            dist = analysis['score_distribution']
            print(f"  Distribution: {dist['excellent']} excellent, {dist['good']} good, {dist['fair']} fair")

# 2. Merge multiple data sources
print("\n--- Merging Multiple Data Sources ---")
all_data = merge_data_sources(file_source, db_source, api_source, mock_source)

print(f"\nMerged data ({len(all_data)} total records):")
for record in all_data:
    print(f"  {record['name']} (score: {record['score']}, from: {record['source_type']})")

# 3. Protocol checking (runtime)
print("\n--- Protocol Checking ---")
for name, source in sources:
    is_data_source = isinstance(source, DataSource)
    print(f"{name} implements DataSource protocol: {is_data_source}")

# 4. File-like object duck typing
print("\n--- File-like Object Duck Typing ---")

# Create different file-like objects
string_file = StringDataFile("Hello World\nThis is line 2\nAnd line 3")
log_file = LogFile()
log_file.write("System started")
log_file.write("User logged in")
log_file.write("Processing request")

# Use StringIO for comparison
string_io = StringIO("StringIO content\nLine 2 from StringIO\nEnd of StringIO")

file_objects = [
    ("String File", string_file),
    ("Log File", log_file),
    ("StringIO", string_io)
]

for name, file_obj in file_objects:
    print(f"\n{name}:")
    analysis = process_file_like_object(file_obj)

    if "error" in analysis:
        print(f"  Error: {analysis['error']}")
    else:
        print(f"  Characters: {analysis['total_characters']}")
        print(f"  Lines: {analysis['total_lines']}")
        print(f"  Words: {analysis['total_words']}")
        print(f"  Avg line length: {analysis['average_line_length']:.1f}")
        print(f"  Has timestamps: {analysis['has_timestamps']}")

# 5. Duck typing failure example
print("\n--- Duck Typing Failure Example ---")

class NotADataSource:
    """Class that doesn't implement the DataSource protocol."""
    def __init__(self):
        self.value = 42

    def get_value(self):  # Wrong method name
        return self.value

fake_source = NotADataSource()
analysis = analyze_data(fake_source)
print(f"Analysis of non-conforming object: {analysis}")

print("\n✅ Duck typing allows flexible, reusable code that works with any object implementing the expected interface!")

## 2.5 Iteration Protocol and Iterable Pattern

**Problem**: Need to create custom objects that can be used in `for` loops, comprehensions, and other iteration contexts.

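**Solution**: Implement the iteration protocol. An *iterable* defines `__iter__`, which returns an iterator. An *iterator* defines `__next__`, which raises `StopIteration` when exhausted, plus an `__iter__` that returns itself.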

**Real-world Example**: Custom data structures, pagination, and lazy data generation.
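
As a rough sketch of what the protocol means in practice, a `for` loop calls `iter()` once and then calls `next()` repeatedly until `StopIteration` is raised. The `Countdown` class and `manual_for` helper below are hypothetical names used only for illustration.

```python
class Countdown:
    """Hypothetical iterable: counts down from `start` to 1."""

    def __init__(self, start: int):
        self.start = start

    def __iter__(self):
        # A generator-based __iter__ returns a fresh iterator on every call,
        # so the same Countdown object can be looped over more than once.
        current = self.start
        while current > 0:
            yield current
            current -= 1


def manual_for(iterable):
    """Roughly what `for item in iterable` does behind the scenes."""
    collected = []
    iterator = iter(iterable)      # calls iterable.__iter__()
    while True:
        try:
            item = next(iterator)  # calls iterator.__next__()
        except StopIteration:      # the iterator is exhausted
            break
        collected.append(item)
    return collected


countdown = Countdown(3)
print(manual_for(countdown))  # [3, 2, 1]
print(list(countdown))        # [3, 2, 1] again, from a fresh iterator
```

Because `__iter__` here is a generator function, each loop gets its own independent iterator. An object whose `__iter__` returns `self` can only be consumed once unless it is reset.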

In [None]:
# Iteration protocol examples
import itertools  # needed by chunked() below
from typing import Iterator, Iterable, Generator

class NumberSequence:
    """Simple iterable that generates a sequence of numbers."""

    def __init__(self, start: int, end: int, step: int = 1):
        self.start = start
        self.end = end
        self.step = step

    def __iter__(self) -> Iterator[int]:
        """Return an iterator (using generator for simplicity)."""
        current = self.start
        while current < self.end:
            yield current
            current += self.step

    def __len__(self) -> int:
        """Calculate length of the sequence."""
        return max(0, (self.end - self.start + self.step - 1) // self.step)

class DataBatch:
    """Single-pass iterator that yields data in batches; call reset() to iterate again."""

    def __init__(self, data: List[Any], batch_size: int):
        self.data = data
        self.batch_size = batch_size
        self.current_index = 0

    def __iter__(self) -> Iterator[List[Any]]:
        return self

    def __next__(self) -> List[Any]:
        if self.current_index >= len(self.data):
            raise StopIteration

        batch = self.data[self.current_index:self.current_index + self.batch_size]
        self.current_index += self.batch_size
        return batch

    def reset(self) -> None:
        """Reset iterator to beginning."""
        self.current_index = 0

class PaginatedAPI:
    """Simulates paginated API responses with automatic pagination."""

    def __init__(self, total_items: int, page_size: int = 10):
        self.total_items = total_items
        self.page_size = page_size
        self.total_pages = (total_items + page_size - 1) // page_size

    def __iter__(self) -> Iterator[Dict[str, Any]]:
        """Iterate through all pages automatically."""
        for page_num in range(1, self.total_pages + 1):
            page_data = self._fetch_page(page_num)
            for item in page_data["items"]:
                yield item

    def _fetch_page(self, page_num: int) -> Dict[str, Any]:
        """Simulate API call to fetch a page."""
        start_idx = (page_num - 1) * self.page_size
        end_idx = min(start_idx + self.page_size, self.total_items)

        items = [
            {
                "id": i,
                "name": f"Item {i}",
                "value": i * 10,
                "page": page_num
            }
            for i in range(start_idx, end_idx)
        ]

        # Simulate API response structure
        return {
            "page": page_num,
            "page_size": self.page_size,
            "total_pages": self.total_pages,
            "total_items": self.total_items,
            "items": items
        }

    def get_pages(self) -> Iterator[Dict[str, Any]]:
        """Iterator that yields complete pages instead of individual items."""
        for page_num in range(1, self.total_pages + 1):
            yield self._fetch_page(page_num)

class FibonacciSequence:
    """Fibonacci sequence generator with different iteration modes."""

    def __init__(self, max_count: Optional[int] = None, max_value: Optional[int] = None):
        self.max_count = max_count
        self.max_value = max_value

    def __iter__(self) -> Iterator[int]:
        """Generate Fibonacci numbers up to specified limit."""
        a, b = 0, 1
        count = 0

        while True:
            # Check count limit
            if self.max_count is not None and count >= self.max_count:
                break

            # Check value limit
            if self.max_value is not None and a > self.max_value:
                break

            yield a
            a, b = b, a + b
            count += 1

    def pairs(self) -> Iterator[tuple[int, int]]:
        """Generate consecutive Fibonacci pairs."""
        fib_iter = iter(self)
        try:
            prev = next(fib_iter)
            for current in fib_iter:
                yield (prev, current)
                prev = current
        except StopIteration:
            pass

    def ratios(self) -> Iterator[float]:
        """Generate ratios between consecutive Fibonacci numbers (approaches golden ratio)."""
        for prev, current in self.pairs():
            if prev != 0:
                yield current / prev

class DataProcessor:
    """Processes data with lazy evaluation and caching."""

    def __init__(self, data: List[Dict[str, Any]]):
        self.data = data
        self._cache = {}

    def filter_by_category(self, category: str) -> Iterator[Dict[str, Any]]:
        """Yield items in a category; the filtered list is built on first iteration and then cached."""
        cache_key = f"category_{category}"

        if cache_key not in self._cache:
            print(f"🔍 Filtering by category: {category}")
            self._cache[cache_key] = [
                item for item in self.data
                if item.get("category") == category
            ]

        for item in self._cache[cache_key]:
            yield item

    def transform_values(self, transform_func) -> Iterator[Dict[str, Any]]:
        """Apply transformation to each item."""
        for item in self.data:
            yield transform_func(item)

    def sliding_window(self, window_size: int) -> Iterator[List[Dict[str, Any]]]:
        """Generate sliding windows of data."""
        if window_size <= 0:
            return

        for i in range(len(self.data) - window_size + 1):
            yield self.data[i:i + window_size]

# Generator functions for common patterns
def infinite_counter(start: int = 0, step: int = 1) -> Generator[int, None, None]:
    """Infinite counter generator."""
    current = start
    while True:
        yield current
        current += step

def chunked(iterable: Iterable[Any], chunk_size: int) -> Generator[List[Any], None, None]:
    """Split iterable into chunks of specified size."""
    iterator = iter(iterable)
    while True:
        chunk = list(itertools.islice(iterator, chunk_size))
        if not chunk:
            break
        yield chunk

def enumerate_with_step(iterable: Iterable[Any], start: int = 0, step: int = 1) -> Generator[tuple[int, Any], None, None]:
    """Enhanced enumerate with custom step."""
    index = start
    for item in iterable:
        yield (index, item)
        index += step

In [None]:
# Demo: Iteration Protocol and Iterable Pattern
print("=== Iteration Protocol and Iterable Pattern Demo ===")

# 1. Custom number sequence
print("\n--- Custom Number Sequence ---")
seq = NumberSequence(2, 20, 3)
print(f"Sequence from 2 to 20 with step 3: {list(seq)}")
print(f"Length: {len(seq)}")

# Use in comprehension
squares = [x**2 for x in NumberSequence(1, 6)]
print(f"Squares: {squares}")

# 2. Data batching
print("\n--- Data Batching ---")
sample_data = list(range(25))  # 0 to 24
batch_processor = DataBatch(sample_data, batch_size=7)

print("Processing data in batches of 7:")
for i, batch in enumerate(batch_processor, 1):
    print(f"  Batch {i}: {batch}")

# Reset and process again
print("\nReset and process first 2 batches:")
batch_processor.reset()
for i, batch in enumerate(batch_processor, 1):
    print(f"  Batch {i}: {batch}")
    if i >= 2:
        break

# 3. Paginated API simulation
print("\n--- Paginated API ---")
api = PaginatedAPI(total_items=35, page_size=8)

# Iterate through all items automatically
print("First 10 items from paginated API:")
for i, item in enumerate(api, 1):
    print(f"  {item['name']} (ID: {item['id']}, Value: {item['value']}, Page: {item['page']})")
    if i >= 10:
        break

# Iterate through pages
print("\nPage-by-page iteration:")
for page in api.get_pages():
    print(f"  Page {page['page']}: {len(page['items'])} items (IDs: {[item['id'] for item in page['items']]})")

# 4. Fibonacci sequence with different modes
print("\n--- Fibonacci Sequence ---")

# First 10 Fibonacci numbers
fib_count = FibonacciSequence(max_count=10)
print(f"First 10 Fibonacci numbers: {list(fib_count)}")

# Fibonacci numbers up to 100
fib_value = FibonacciSequence(max_value=100)
print(f"Fibonacci numbers ≤ 100: {list(fib_value)}")

# Fibonacci pairs
fib_pairs = FibonacciSequence(max_count=8)
print(f"Fibonacci pairs: {list(fib_pairs.pairs())}")

# Fibonacci ratios (approaching golden ratio)
fib_ratios = FibonacciSequence(max_count=10)
ratios = list(fib_ratios.ratios())
print(f"Fibonacci ratios: {[f'{ratio:.6f}' for ratio in ratios]}")
print(f"Last ratio (≈ golden ratio): {ratios[-1]:.10f}")

# 5. Data processing with lazy evaluation
print("\n--- Data Processing with Lazy Evaluation ---")

# Sample dataset
dataset = [
    {"id": 1, "name": "Product A", "category": "Electronics", "price": 99.99, "rating": 4.5},
    {"id": 2, "name": "Product B", "category": "Books", "price": 19.99, "rating": 4.2},
    {"id": 3, "name": "Product C", "category": "Electronics", "price": 199.99, "rating": 4.8},
    {"id": 4, "name": "Product D", "category": "Clothing", "price": 49.99, "rating": 3.9},
    {"id": 5, "name": "Product E", "category": "Electronics", "price": 299.99, "rating": 4.7},
    {"id": 6, "name": "Product F", "category": "Books", "price": 24.99, "rating": 4.1}
]

processor = DataProcessor(dataset)

# Filter by category (cached)
print("Electronics products:")
electronics = list(processor.filter_by_category("Electronics"))
for product in electronics:
    print(f"  {product['name']}: ${product['price']} (Rating: {product['rating']})")

# Second call uses cache
print("\nSecond call to Electronics filter (uses cache):")
electronics_cached = list(processor.filter_by_category("Electronics"))
print(f"Found {len(electronics_cached)} electronics products")

# Transform values
def add_price_tier(item):
    item = item.copy()
    price = item['price']
    if price < 50:
        item['price_tier'] = 'Budget'
    elif price < 150:
        item['price_tier'] = 'Mid-range'
    else:
        item['price_tier'] = 'Premium'
    return item

print("\nProducts with price tiers:")
enhanced_products = list(processor.transform_values(add_price_tier))
for product in enhanced_products:
    print(f"  {product['name']}: {product['price_tier']} (${product['price']})")

# Sliding window
print("\nSliding window of size 3:")
for i, window in enumerate(processor.sliding_window(3), 1):
    window_names = [item['name'] for item in window]
    print(f"  Window {i}: {window_names}")

# 6. Generator functions
print("\n--- Generator Functions ---")

# Infinite counter (take first 5)
counter = infinite_counter(10, 3)
first_five = [next(counter) for _ in range(5)]
print(f"First 5 from infinite counter (start=10, step=3): {first_five}")

# Enhanced enumerate
items = ['apple', 'banana', 'cherry', 'date']
enumerated = list(enumerate_with_step(items, start=100, step=5))
print(f"Enhanced enumerate: {enumerated}")

print("\n✅ Custom iterables integrate seamlessly with Python's iteration ecosystem!")

## Summary

This notebook demonstrated five essential Pythonic idioms and language-specific patterns:

### 1. **EAFP vs LBYL**
- **Use Case**: Error handling and conditional logic
- **Key Learning**: Python favors "try and handle exceptions" over "check first"
- **Demo**: Configuration processing with missing fields
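
As a compact recap, the two styles can be sketched side by side. The helper names and the default of `0` are assumptions made for illustration, not part of the demo above.

```python
def get_age_lbyl(config: dict) -> int:
    """LBYL: check that the key exists before reading it."""
    if "age" in config:
        return config["age"]
    return 0  # assumed default, for illustration only


def get_age_eafp(config: dict) -> int:
    """EAFP: attempt the read and handle the failure."""
    try:
        return config["age"]
    except KeyError:
        return 0  # same assumed default


complete = {"name": "Alice", "age": 30}
partial = {"name": "Bob"}  # "age" is missing

print(get_age_lbyl(complete), get_age_lbyl(partial))  # 30 0
print(get_age_eafp(complete), get_age_eafp(partial))  # 30 0
```

Both functions return the same results. The EAFP version performs a single lookup on the common path and keeps the failure handling next to the operation that can fail.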

### 2. **Context Managers**
- **Use Case**: Resource management and guaranteed cleanup
- **Key Learning**: `with` statements ensure proper resource handling
- **Demo**: Database connections, temporary files, and timing operations
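
A minimal sketch of the cleanup guarantee, using `contextlib.contextmanager`. The `tracked_resource` name and the `events` list are hypothetical and exist only to make the order of operations visible.

```python
from contextlib import contextmanager

events = []  # records the order in which things happen


@contextmanager
def tracked_resource(name: str):
    """Hypothetical resource that logs acquisition and release."""
    events.append(f"acquire {name}")
    try:
        yield name
    finally:
        # Runs whether the with-block exits normally or raises.
        events.append(f"release {name}")


try:
    with tracked_resource("db") as resource:
        events.append(f"use {resource}")
        raise RuntimeError("simulated failure")
except RuntimeError:
    events.append("error handled")

print(events)
# ['acquire db', 'use db', 'release db', 'error handled']
```

The release step is recorded before the exception reaches the outer `except`, which is the behavior the `with` statement is meant to guarantee.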

### 3. **Comprehensions and Generator Expressions**
- **Use Case**: Concise collection creation and data transformation
- **Key Learning**: More readable and often faster than explicit loops
- **Demo**: Log analysis, sales data processing, and memory-efficient computations
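
The memory difference between a list comprehension and a generator expression can be sketched as follows. The size comparison relies on `sys.getsizeof`, which reports only the container's own footprint, and the exact numbers are specific to the interpreter (CPython is assumed here).

```python
import sys

numbers = range(100_000)

# List comprehension: builds every result up front.
squares_list = [n * n for n in numbers]

# Generator expression: produces results one at a time, on demand.
squares_gen = (n * n for n in numbers)

# The generator object stays small regardless of how many items it will yield.
print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))  # True on CPython

total_from_gen = sum(squares_gen)    # consumes the generator
total_from_list = sum(squares_list)  # the list can be reused afterwards
print(total_from_gen == total_from_list)  # True
```

A generator expression can be consumed only once, so prefer a list when the results are needed more than once.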

### 4. **Duck Typing and Protocols**
- **Use Case**: Flexible, polymorphic code based on behavior
- **Key Learning**: "If it quacks like a duck, it's a duck"
- **Demo**: Data sources, file-like objects, and protocol-based design
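
A short sketch of how duck typing and `typing.Protocol` fit together. All class and function names here are hypothetical. Note that `isinstance` checks against a protocol only work when the protocol is decorated with `@runtime_checkable`, and they verify only that the methods exist, not their signatures.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Quacker(Protocol):
    """Hypothetical protocol: anything with a quack() method."""

    def quack(self) -> str: ...


class Duck:
    def quack(self) -> str:
        return "Quack!"


class Robot:
    # No inheritance from Quacker is needed; matching the method is enough.
    def quack(self) -> str:
        return "Beep-quack"


class Rock:
    pass  # has no quack() method


def make_noise(thing) -> str:
    """Duck typing in EAFP style: call the method and handle its absence."""
    try:
        return thing.quack()
    except AttributeError:
        return "(silence)"


for obj in (Duck(), Robot(), Rock()):
    print(type(obj).__name__, isinstance(obj, Quacker), make_noise(obj))
# Duck True Quack!
# Robot True Beep-quack
# Rock False (silence)
```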

### 5. **Iteration Protocol**
- **Use Case**: Custom iterables and lazy data generation
- **Key Learning**: Implement `__iter__` on iterables (and `__next__` on iterators) for seamless integration with `for` loops, comprehensions, and built-ins
- **Demo**: Number sequences, pagination, Fibonacci generation, and data processing

### Key Takeaways
- **Pythonic Philosophy**: Embrace Python's idioms for cleaner, more maintainable code
- **Real-world Application**: These patterns solve common programming challenges elegantly
- **Performance Benefits**: Comprehensions are often faster than equivalent explicit loops, and generators keep memory use low by producing items on demand
- **Integration**: Pythonic code works seamlessly with Python's built-in functions and libraries
- **Readability**: Following Python idioms makes code more readable to other Python developers