# Chapter 9: Error Handling and Debugging

Writing code that works under ideal conditions is straightforward; writing code that fails gracefully under unexpected conditions distinguishes professionals from amateurs. Error handling is not merely about preventing crashes—it is about maintaining system integrity, providing actionable feedback to users, and preserving diagnostic information for developers.

This chapter explores Python's exception mechanism, the art of designing meaningful error hierarchies, systematic debugging techniques, and the logging infrastructure that replaces `print()` statements in production environments. We will emphasize defensive programming patterns that align with modern industry standards, including exception groups, exception notes, and structured logging.

## 9.1 Exceptions: The try-except-else-finally Mechanism

Python uses **exceptions** to handle errors and exceptional conditions. When an error occurs, Python raises an exception. If unhandled, the program terminates with a traceback. The `try-except` construct allows you to intercept exceptions and respond appropriately.

### The Basic Structure

The complete error handling block consists of four clauses:

```python
from typing import Optional, TextIO

def process_file(filename: str) -> Optional[str]:
    """
    Demonstrates the full exception handling syntax.
    
    Returns file content or None if file cannot be processed.
    """
    file_handle: Optional[TextIO] = None
    
    try:
        # Code that might raise an exception
        file_handle = open(filename, 'r', encoding='utf-8')
        content: str = file_handle.read()
        # No return here: returning from try would skip the else clause
    
    except FileNotFoundError as e:
        # Handle specific exception type
        print(f"Error: The file '{filename}' was not found.")
        print(f"Details: {e}")
        return None
    
    except PermissionError:
        # Handle different specific exception
        print(f"Error: Permission denied when accessing '{filename}'.")
        return None
    
    except Exception as e:
        # Catch-all (use sparingly and usually for logging before re-raising)
        print(f"Unexpected error: {e}")
        raise  # Re-raise the same exception
    
    else:
        # Executes ONLY if try block succeeded (no exception raised)
        print("File read successfully.")
        return content
    
    finally:
        # Always executes, regardless of success or failure
        # Used for cleanup (closing files, releasing locks, etc.)
        if file_handle:
            file_handle.close()
            print("File handle closed.")

# Usage
result: Optional[str] = process_file("data.txt")
```

**Clause Execution Rules:**
*   **`try`**: Mandatory. Contains the protected code.
*   **`except`**: Optional only when a `finally` clause is present; a `try` needs at least one of the two. Catches specific exceptions, and a single `try` can have several `except` blocks for different error types.
*   **`else`**: Optional. Executes only if the `try` block completes without raising an exception. Useful for code that should run only on success, keeping the `try` block focused on the operation that might fail.
*   **`finally`**: Optional. Always executes before control leaves the `try` statement, even when a `return`, `break`, `continue`, or uncaught exception is in flight. Essential for resource cleanup.
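
The ordering described by these rules can be observed directly. The following is a minimal sketch (the function names `run` and `attempt` are illustrative, not part of any library) that records which clauses execute on success and on failure:

```python
def run(fail: bool) -> list[str]:
    """Record which clauses execute, in order."""
    events: list[str] = []

    def attempt() -> str:
        try:
            events.append("try")
            if fail:
                raise ValueError("boom")
        except ValueError:
            events.append("except")
            return "handled"
        else:
            # Reached only when the try block raised nothing
            events.append("else")
            return "ok"
        finally:
            # Runs even though a return statement is already pending
            events.append("finally")

    events.append(f"returned {attempt()}")
    return events


print(run(fail=False))  # ['try', 'else', 'finally', 'returned ok']
print(run(fail=True))   # ['try', 'except', 'finally', 'returned handled']
```

Note that `finally` runs after the `return` expression has been evaluated but before the value reaches the caller.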

### Exception Hierarchy and Specificity

Python's built-in exceptions form a hierarchy. `Exception` is the base class for most user-facing exceptions, but you should catch the **most specific exception possible**.

**The Exception Hierarchy (Simplified):**
```
BaseException
 ├── SystemExit          (raised by sys.exit())
 ├── KeyboardInterrupt   (Ctrl+C)
 ├── GeneratorExit
 └── Exception           (base for user-defined exceptions)
      ├── ArithmeticError
      │    └── ZeroDivisionError
      ├── LookupError
      │    ├── IndexError
      │    └── KeyError
      ├── TypeError
      ├── ValueError
      └── OSError
           ├── FileNotFoundError
           ├── PermissionError
           └── IsADirectoryError
```

**Industry Standard:** Catch specific exceptions and never use a bare `except:`, which also catches `SystemExit` and `KeyboardInterrupt` and so prevents clean shutdowns.

```python
# BAD PRACTICE - Never do this
try:
    risky_operation()
except:  # Catches KeyboardInterrupt, SystemExit, everything!
    pass  # Silently swallows all errors

# GOOD PRACTICE - Be specific
try:
    risky_operation()
except (ValueError, TypeError) as e:  # Catch only expected errors
    handle_error(e)
```
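
Because `except` clauses match by class and are tried in order, the hierarchy determines which handler runs. The sketch below uses a hypothetical `classify` helper to show this; the final `except BaseException` clause is there for demonstration only and is not a pattern to copy into application code:

```python
def classify(exc: BaseException) -> str:
    """Raise the given exception and report which handler caught it."""
    try:
        raise exc
    except FileNotFoundError:
        return "missing file"
    except OSError:
        # Catches every other OSError subclass, e.g. PermissionError
        return "other OS error"
    except Exception:
        return "other error"
    except BaseException:
        # Demonstration only: SystemExit, KeyboardInterrupt, GeneratorExit
        return "not an ordinary error"


print(classify(FileNotFoundError()))   # missing file
print(classify(PermissionError()))     # other OS error
print(classify(KeyError("k")))         # other error
print(classify(KeyboardInterrupt()))   # not an ordinary error

# The method resolution order spells out the ancestry
print([cls.__name__ for cls in FileNotFoundError.__mro__])
```

Ordering matters: placing `except OSError` before `except FileNotFoundError` would make the more specific handler unreachable.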

### The Exception Object

When you use `except ExceptionType as e`, the variable `e` holds the exception instance containing valuable diagnostic information:

```python
def parse_port(config: dict) -> int:
    """Parse port number from configuration with detailed error handling."""
    try:
        port: int = int(config["port"])
        if not 1 <= port <= 65535:
            raise ValueError(f"Port {port} out of valid range")
        return port
    except KeyError as e:
        # e.args contains the arguments passed to the exception
        missing_key: str = e.args[0]
        raise ValueError(f"Missing required configuration key: {missing_key}") from e
    except ValueError as e:
        # str(e) gives the human-readable message
        print(f"Invalid port configuration: {e}")
        raise
```

**Exception Attributes:**
*   `args`: Tuple of arguments passed to the constructor
*   `__traceback__`: The traceback object associated with the exception
*   `__cause__`: The original exception if using `raise ... from` (exception chaining)
*   `__context__`: The exception being handled when this one was raised (implicit chaining)
*   `__notes__` (Python 3.11+): List of additional notes added via `add_note()`
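
As a small illustration of these attributes, the sketch below catches a freshly raised exception and reads them back. The helper names `fail` and `describe_failure` are made up for this example:

```python
def fail() -> None:
    raise ValueError("bad value", 42)


def describe_failure() -> dict:
    """Catch the exception and collect its diagnostic attributes."""
    try:
        fail()
    except ValueError as e:
        return {
            "type": type(e).__name__,
            "args": e.args,                      # constructor arguments
            "text": str(e),                      # str() of the args tuple here
            "has_traceback": e.__traceback__ is not None,
            "cause": e.__cause__,                # None: no 'raise ... from'
            "context": e.__context__,            # None: nothing was being handled
        }
    return {}


info = describe_failure()
print(info["args"])  # ('bad value', 42)
print(info["text"])  # ('bad value', 42)
```

With a single constructor argument, `str(e)` returns that argument as text; with several, it returns the text of the whole tuple, as shown here.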

### Exception Chaining and Context

When you catch an exception and raise a different one, preserve the original context using the `from` keyword. This maintains the full error history.

```python
import json

class ConfigurationError(Exception):
    """Raised when configuration cannot be loaded or parsed."""

def load_configuration(path: str) -> dict:
    """Load and parse configuration file."""
    try:
        with open(path, 'r', encoding='utf-8') as f:
            raw_data: str = f.read()
    except FileNotFoundError as e:
        # Explicit chaining: the missing file is the direct cause
        raise ConfigurationError(f"Config file not found: {path}") from e
    
    try:
        config: dict = json.loads(raw_data)
    except json.JSONDecodeError as e:
        # Explicit chaining again: the parse failure is the direct cause
        raise ConfigurationError(f"Invalid JSON in {path}") from e
    
    return config
```

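**Output with chaining (abridged):**
```
Traceback (most recent call last):
  ...
json.decoder.JSONDecodeError: Expecting ',' delimiter: line 3 column 1 (char 45)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  ...
ConfigurationError: Invalid JSON in config.json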
```

**Implicit vs. Explicit Chaining:**
*   If you raise a new exception while handling another, Python automatically sets `__context__`
*   Use `raise NewException() from original` to set `__cause__` explicitly when the new exception is a direct consequence of the original
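
The difference between the two forms is visible on the exception object itself. The following sketch uses a hypothetical `ConfigError` class and three small functions to compare implicit chaining, explicit chaining, and `raise ... from None`, which hides the context from the printed traceback:

```python
class ConfigError(Exception):
    """Hypothetical wrapper exception used only for this illustration."""


def implicit() -> None:
    try:
        int("not a number")
    except ValueError:
        raise ConfigError("implicit")  # no 'from': only __context__ is set


def explicit() -> None:
    try:
        int("not a number")
    except ValueError as e:
        raise ConfigError("explicit") from e  # sets __cause__ as well


def suppressed() -> None:
    try:
        int("not a number")
    except ValueError:
        raise ConfigError("suppressed") from None  # hide the original


def capture(func) -> ConfigError:
    """Call func and return the ConfigError it raises."""
    try:
        func()
    except ConfigError as e:
        return e
    raise AssertionError("expected ConfigError")


for func in (implicit, explicit, suppressed):
    err = capture(func)
    print(func.__name__, type(err.__cause__).__name__,
          type(err.__context__).__name__, err.__suppress_context__)
```

In all three cases `__context__` still holds the `ValueError`; `from None` only sets `__suppress_context__` so the traceback printer omits it.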

### Exception Groups (Python 3.11+)

Modern Python supports **Exception Groups**, allowing multiple unrelated exceptions to be raised and caught together—particularly useful in concurrent programming (asyncio) or when validating multiple independent fields.

```python
# Raising multiple exceptions simultaneously
def validate_user_data(data: dict) -> None:
    """Validate user data, collecting all errors."""
    errors: list[Exception] = []
    
    if "email" not in data:
        errors.append(ValueError("Email is required"))
    if "age" in data and data["age"] < 0:
        errors.append(ValueError("Age cannot be negative"))
    if "password" in data and len(data["password"]) < 8:
        errors.append(ValueError("Password too short"))
    
    if errors:
        # Raise all validation errors together
        raise ExceptionGroup("Validation failed", errors)

# Handling exception groups, option 1: catch the whole group
try:
    validate_user_data({"age": -5, "password": "123"})
except ExceptionGroup as eg:
    # Iterate through individual exceptions
    for error in eg.exceptions:
        print(f"Validation error: {error}")

# Option 2: use except* to select members by type
# (except and except* cannot be mixed in one try statement)
try:
    validate_user_data({"age": -5})
except* ValueError as eg:
    print(f"Caught {len(eg.exceptions)} ValueErrors")
```

### Adding Notes to Exceptions (Python 3.11+)

Enhance exceptions with additional context using `add_note()`—invaluable for adding runtime context that wasn't available where the exception was originally raised.

```python
from datetime import datetime

# fetch_user, User, and UserNotFoundError are assumed to be defined elsewhere
# in the application.

def process_user_request(user_id: int, request_data: dict) -> None:
    """Process request with enriched error information."""
    try:
        user: User = fetch_user(user_id)
    except UserNotFoundError as e:
        # Add contextual information to help debugging
        e.add_note(f"Requested while processing action: {request_data.get('action')}")
        e.add_note(f"Timestamp: {datetime.now().isoformat()}")
        e.add_note(f"Request ID: {request_data.get('request_id')}")
        raise
```

## 9.2 Raising Exceptions: Custom Exception Classes and Hierarchies

Built-in exceptions cover generic cases, but professional applications require domain-specific exceptions that communicate precise failure modes. Designing a proper exception hierarchy is a critical API design skill.

### Creating Custom Exceptions

Custom exceptions should inherit from `Exception` (or a more specific built-in). Many need nothing beyond a docstring; add an `__init__` only when the exception should carry structured data, as `PaymentError` does here:

```python
class PaymentError(Exception):
    """
    Base exception for payment processing failures.
    
    Attributes:
        transaction_id: Unique identifier for the failed transaction
        amount: The monetary amount involved
        currency: The currency code (ISO 4217)
    """
    
    def __init__(
        self, 
        message: str, 
        transaction_id: str, 
        amount: float, 
        currency: str = "USD"
    ) -> None:
        super().__init__(message)
        self.transaction_id = transaction_id
        self.amount = amount
        self.currency = currency
    
    def __str__(self) -> str:
        base_msg: str = super().__str__()
        return f"{base_msg} [Transaction: {self.transaction_id}, Amount: {self.amount} {self.currency}]"

# Usage
raise PaymentError(
    "Insufficient funds", 
    transaction_id="txn_12345", 
    amount=150.00
)
```

### Exception Hierarchy Design

Design hierarchies to allow callers to catch errors at appropriate granularity:

```python
# Base exception for the entire application/library
class ApplicationError(Exception):
    """Base for all application-specific errors."""
    pass

# Domain-specific bases
class DatabaseError(ApplicationError):
    """Base for all database-related errors."""
    pass

class ValidationError(ApplicationError):
    """Base for all input validation errors."""
    pass

# Specific implementations
class ConnectionTimeoutError(DatabaseError):
    """Could not establish database connection within timeout period."""
    def __init__(self, host: str, port: int, timeout: int) -> None:
        self.host = host
        self.port = port
        self.timeout = timeout
        super().__init__(f"Connection to {host}:{port} timed out after {timeout}s")

class SchemaError(DatabaseError):
    """Database schema mismatch or migration issue."""
    pass

class FieldValidationError(ValidationError):
    """Specific field failed validation."""
    def __init__(self, field_name: str, value: object, constraint: str) -> None:
        self.field_name = field_name
        self.value = value
        self.constraint = constraint
        super().__init__(f"Field '{field_name}' with value '{value}' violates constraint: {constraint}")

# Usage allows granular catching
# (HTTPResponse, logger, and the three helper functions are assumed to exist elsewhere)
def save_user(user_data: dict) -> "HTTPResponse":
    try:
        validate_data(user_data)  # Might raise FieldValidationError
        connect_to_db()           # Might raise ConnectionTimeoutError
        insert_record(user_data)  # Might raise SchemaError
    except ValidationError as e:
        # Catches FieldValidationError and other validation issues
        return HTTPResponse(400, str(e))
    except DatabaseError as e:
        # Catches any database issue without catching unrelated ApplicationErrors
        logger.error("Database failure: %s", e)
        return HTTPResponse(503, "Service temporarily unavailable")
    return HTTPResponse(201, "User created")
```

**Best Practices for Exception Design:**
1.  **Inherit from appropriate built-ins**: If your error is a type error, inherit from `TypeError`. If it's value-related, inherit from `ValueError`.
2.  **Provide context**: Include relevant IDs, field names, or constraints in the exception.
3.  **Avoid flow control**: Reserve exceptions for exceptional circumstances rather than routine control flow (although Python itself uses `StopIteration` to end iteration).
4.  **Document exceptions**: Use docstrings to specify which exceptions your functions raise.
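
The first, second, and fourth practices can be combined in a few lines. The sketch below defines a hypothetical `InvalidPortError` that inherits from `ValueError`, carries the offending value, and is documented in the raising function's docstring:

```python
class InvalidPortError(ValueError):
    """Hypothetical domain error; still a ValueError for existing callers."""

    def __init__(self, port: int) -> None:
        self.port = port  # context for handlers and logs
        super().__init__(f"Port {port} is outside 1-65535")


def check_port(port: int) -> int:
    """Return the port unchanged if it is valid.

    Raises:
        InvalidPortError: if the port is outside 1-65535.
    """
    if not 1 <= port <= 65535:
        raise InvalidPortError(port)
    return port


try:
    check_port(70000)
except ValueError as e:  # a generic ValueError handler still catches it
    caught = e

print(type(caught).__name__, caught.port)  # InvalidPortError 70000
```

Code written before the custom class existed keeps working, while newer callers can catch `InvalidPortError` specifically.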

### Context Managers for Exception Handling

Combine exceptions with context managers (`with` statement) for robust resource management:

```python
from contextlib import contextmanager
from typing import Generator

class TransactionRollbackError(Exception):
    """Raised when transaction must be rolled back."""
    pass

@contextmanager
def database_transaction(conn) -> Generator:
    """
    Context manager that automatically rolls back on exceptions.
    
    Usage:
        with database_transaction(conn) as cursor:
            cursor.execute("INSERT ...")
            # If exception occurs here, automatic rollback
    """
    cursor = conn.cursor()
    try:
        yield cursor
        conn.commit()
        print("Transaction committed successfully")
    except Exception as e:
        conn.rollback()
        print(f"Transaction rolled back due to: {e}")
        # Re-raise or wrap depending on requirements
        raise TransactionRollbackError("Operation failed, changes reverted") from e
    finally:
        cursor.close()

# Usage
try:
    with database_transaction(db_connection) as cursor:
        cursor.execute("UPDATE accounts SET balance = balance - 100 WHERE id = 1")
        cursor.execute("UPDATE accounts SET balance = balance + 100 WHERE id = 2")
        # If second update fails, first is automatically rolled back
except TransactionRollbackError as e:
    notify_admin(e)
```
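
To see the rollback happen, the same pattern can be run against an in-memory SQLite database. This is a sketch under a few assumptions: it uses the standard-library `sqlite3` module with its default transaction handling, a made-up `accounts` table with a `CHECK` constraint, and a simplified `transaction` helper that re-raises the original exception instead of wrapping it:

```python
import sqlite3
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def transaction(conn: sqlite3.Connection) -> Iterator[sqlite3.Cursor]:
    """Commit on success, roll back and re-raise on any exception."""
    cursor = conn.cursor()
    try:
        yield cursor
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        cursor.close()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES (1, 50), (2, 0)")
conn.commit()

try:
    with transaction(conn) as cur:
        cur.execute("UPDATE accounts SET balance = balance + 100 WHERE id = 2")
        # Account 1 holds only 50, so this violates the CHECK constraint
        cur.execute("UPDATE accounts SET balance = balance - 100 WHERE id = 1")
except sqlite3.IntegrityError as e:
    print(f"Transfer failed: {e}")

balances = conn.execute("SELECT balance FROM accounts ORDER BY id").fetchall()
print(balances)  # [(50,), (0,)]
```

The credit to account 2 succeeded on its own, yet it is undone because the debit failed inside the same transaction.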

## 9.3 Debugging: Tools and Techniques

When exceptions occur or logic fails, systematic debugging is essential. Python offers powerful debugging capabilities from simple print statements to sophisticated interactive debuggers.

### Understanding Tracebacks

A traceback is Python's stack trace: the history of function calls that led to the error. Reading it correctly is fundamental:

```
Traceback (most recent call last):
  File "/app/main.py", line 45, in <module>
    process_payment(order_id, amount)
  File "/app/payments.py", line 12, in process_payment
    validate_funds(account, amount)
  File "/app/validation.py", line 28, in validate_funds
    current_balance = get_balance(account.id)
  File "/app/db.py", line 55, in get_balance
    return int(row[0])
ValueError: invalid literal for int() with base 10: 'null'
```

**Reading the Traceback:**
1.  **Bottom**: The actual exception and message (`ValueError`)
2.  **Upwards**: The chain of calls, starting from the most recent
3.  **Line numbers**: Exact locations in files
4.  **Top**: The entry point (`<module>` means the script itself)

**The error occurred in `db.py` line 55**, but the logical error might be that the database returned the string "null" instead of a number, indicating a schema or data quality issue.
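
The same information is available programmatically through the standard `traceback` module, which is useful when errors must be summarized in logs or reports. A minimal sketch, with function names borrowed loosely from the traceback above and otherwise hypothetical:

```python
import traceback


def get_balance(raw: str) -> int:
    return int(raw)


def validate_funds(raw: str) -> int:
    return get_balance(raw)


def trace_failure() -> tuple[list[str], str]:
    """Return the call chain (outermost first) and the final error line."""
    try:
        validate_funds("null")
    except ValueError as e:
        frames = traceback.extract_tb(e.__traceback__)
        call_chain = [frame.name for frame in frames]
        summary = traceback.format_exception_only(type(e), e)[-1].strip()
        return call_chain, summary
    raise AssertionError("expected a ValueError")


chain, summary = trace_failure()
print(chain)    # ['trace_failure', 'validate_funds', 'get_balance']
print(summary)  # ValueError: invalid literal for int() with base 10: 'null'
```

The last entry in the chain is where the exception was raised, matching the bottom frame of a printed traceback.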

### Interactive Debugging with pdb

The Python Debugger (`pdb`) allows you to pause execution, inspect variables, and step through code.

**Basic Usage:**

```python
import pdb

def complex_calculation(x: int, y: int) -> int:
    result: int = x * 2
    pdb.set_trace()  # Execution pauses here
    # You now have an interactive prompt: (Pdb)
    result += y
    return result

# Python 3.7+ alternative: breakpoint()
def modern_debug(x: int, y: int) -> int:
    result: int = x * 2
    breakpoint()  # Respects PYTHONBREAKPOINT environment variable
    result += y
    return result
```

**Essential pdb Commands:**
*   `n` (next): Execute current line and move to next (don't step into functions)
*   `s` (step): Step into function calls
*   `c` (continue): Continue execution until next breakpoint or end
*   `l` (list): Show current line and surrounding context
*   `p <variable>`: Print variable value
*   `pp <variable>`: Pretty print (better formatting for dicts/lists)
*   `b <line>`: Set breakpoint at line number
*   `q` (quit): Abort execution
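
The built-in `breakpoint()` does not start `pdb` directly; it calls `sys.breakpointhook()`, whose default implementation consults `PYTHONBREAKPOINT` (setting it to `0` disables all breakpoints). The sketch below replaces the hook with a hypothetical recorder to show the indirection without opening an interactive prompt:

```python
import sys

calls: list[str] = []


def recording_hook(*args, **kwargs) -> None:
    """Stand-in for a debugger: just note that the breakpoint was reached."""
    calls.append("breakpoint hit")


original_hook = sys.breakpointhook
sys.breakpointhook = recording_hook
try:
    def compute(x: int) -> int:
        result = x * 2
        breakpoint()  # dispatches to sys.breakpointhook, here the recorder
        return result + 1

    value = compute(5)
finally:
    # Always restore the hook so later breakpoints behave normally
    sys.breakpointhook = original_hook

print(value, calls)  # 11 ['breakpoint hit']
```

This indirection is what lets IDEs and alternative debuggers plug in without any change to the code that calls `breakpoint()`.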

**Post-Mortem Debugging:**
Debug after an exception has occurred:

```python
import pdb
import sys
import traceback

def failing_function() -> int:
    x: int = 1
    y: str = "2"
    return x + y  # TypeError: int + str is unsupported

try:
    failing_function()
except Exception:
    # Print the traceback, then open the debugger in the frame that raised
    traceback.print_exc()
    pdb.post_mortem(sys.exc_info()[2])
```

### IDE and Advanced Debugging

Modern IDEs (VS Code, PyCharm) provide graphical debuggers with:
*   **Breakpoints**: Click to pause execution at specific lines
*   **Conditional breakpoints**: Pause only when conditions are met (e.g., `user_id is None`)
*   **Watch expressions**: Monitor specific variables as you step
*   **Evaluate expressions**: Execute arbitrary code in the current scope
*   **Remote debugging**: Attach to running processes in Docker or production

**Debugging Strategies:**
1.  **Binary Search**: If you have 1000 lines and a bug, check line 500. If state is bad there, bug is earlier; if good, bug is later. Repeat.
2.  **Rubber Ducking**: Explain the code line-by-line to an inanimate object (or colleague). Often you spot the error while explaining.
3.  **Reproduce First**: Never debug without a reliable way to reproduce the error.
4.  **Isolate Variables**: Comment out half the code. If error persists, it's in the remaining half.
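
The binary-search strategy can also be automated when a computation is a sequence of steps and there is a check that tells good state from bad. The following is a sketch under stated assumptions: steps are pure functions, the state is valid before any step runs and invalid after all of them, and once it becomes invalid it stays invalid. The name `first_bad_step` is made up for this example:

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def first_bad_step(
    steps: Sequence[Callable[[T], T]],
    initial: T,
    is_valid: Callable[[T], bool],
) -> int:
    """Return the index of the first step that leaves the state invalid."""

    def valid_after(count: int) -> bool:
        state = initial
        for step in steps[:count]:
            state = step(state)
        return is_valid(state)

    # Invariant: state is valid after `lo` steps and invalid after `hi` steps
    lo, hi = 0, len(steps)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if valid_after(mid):
            lo = mid   # bug is later
        else:
            hi = mid   # bug is at or before mid
    return hi - 1


pipeline = [
    lambda x: x + 1,
    lambda x: x * 2,
    lambda x: x - 10,  # drives the value negative
    lambda x: x + 3,
]
culprit = first_bad_step(pipeline, 1, lambda x: x >= 0)
print(culprit)  # 2
```

The number of checks grows with the logarithm of the number of steps, which is why halving beats reading code top to bottom.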

### Logging for Debugging

While `print()` statements are tempting, they are easily forgotten and pollute production output. Use the `logging` module instead (see 9.4), even for temporary debug messages:

```python
import logging
logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger(__name__)

def debug_function(data: dict) -> None:
    # %-style arguments are formatted only if the message is emitted
    logger.debug("Function called with data: %s", data)
    result: list = expensive_operation(data)
    logger.debug("Intermediate result: %s", result)
    # ...
```

## 9.4 Logging: The Professional Alternative to print()

Production applications cannot rely on `print()` for diagnostics. The `logging` module provides a flexible, configurable framework for recording events with severity levels, timestamps, and context.

### The Logging Hierarchy

Python's logging system is built from four cooperating components (loggers additionally form a hierarchy through their dotted names):
1.  **Loggers**: Entry points—create one per module (`logging.getLogger(__name__)`)
2.  **Handlers**: Send logs to destinations (console, file, network, email)
3.  **Formatters**: Specify output layout
4.  **Filters**: Fine-grained control over which logs pass through

```python
import logging
import sys
from logging.handlers import RotatingFileHandler
from typing import override  # Python 3.12+; used by the JSONFormatter example later in this section

# Create logger
logger: logging.Logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)  # Threshold for this logger

# Console handler (INFO and above)
console_handler: logging.StreamHandler = logging.StreamHandler(sys.stdout)
console_handler.setLevel(logging.INFO)
console_format: logging.Formatter = logging.Formatter(
    '%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    datefmt='%Y-%m-%d %H:%M:%S'
)
console_handler.setFormatter(console_format)

# File handler (DEBUG and above, with rotation)
file_handler: RotatingFileHandler = RotatingFileHandler(
    'app.log', 
    maxBytes=10 * 1024 * 1024,  # 10 MiB
    backupCount=5       # Keep 5 backup files
)
file_handler.setLevel(logging.DEBUG)
file_format: logging.Formatter = logging.Formatter(
    '%(asctime)s - %(name)s - %(levelname)s - %(filename)s:%(lineno)d - %(message)s'
)
file_handler.setFormatter(file_format)

# Add handlers to logger
logger.addHandler(console_handler)
logger.addHandler(file_handler)

# Usage
logger.debug("Detailed diagnostic information")
logger.info("General confirmation that things are working")
logger.warning("Indication of potential future problem")
logger.error("Unable to perform specific operation")
logger.critical("Program may be unable to continue")
```
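
Loggers are also related to each other by name: a logger called `a.b` is a child of `a`, inherits its effective level when its own level is unset, and passes records up to its parent's handlers. A minimal sketch, using hypothetical logger names and a small list-backed handler written for this example:

```python
import logging


class ListHandler(logging.Handler):
    """Collect formatted messages in memory instead of printing them."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(f"{record.name}:{record.getMessage()}")


parent = logging.getLogger("shop")
child = logging.getLogger("shop.payments")

handler = ListHandler()
parent.addHandler(handler)
parent.setLevel(logging.INFO)
parent.propagate = False  # keep this demo's records away from the root logger

child.info("charged card")       # passes: child inherits INFO from parent
child.debug("low-level detail")  # dropped: below the effective level

print(child.parent is parent)    # True
print(handler.messages)          # ['shop.payments:charged card']
```

This is why `logging.getLogger(__name__)` works well: module names already mirror the package hierarchy, so one handler on a package-level logger serves every module beneath it.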

### Log Levels (Severity Scale)

| Level | Numeric Value | When to Use |
|-------|--------------|-------------|
| `DEBUG` | 10 | Detailed diagnostic information during development |
| `INFO` | 20 | Confirmation that things are working as expected |
| `WARNING` | 30 | Indication that something unexpected happened, but software is still working |
| `ERROR` | 40 | Due to a more serious problem, the software could not perform some function |
| `CRITICAL` | 50 | A serious error, indicating that the program itself may be unable to continue running |
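
The numeric values are what the threshold comparison uses: a logger emits a record only if the record's level is at least the logger's effective level. A short sketch (the logger name `levels_demo` is arbitrary):

```python
import logging

level_names = ["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"]
values = {name: getattr(logging, name) for name in level_names}
print(values)

logger = logging.getLogger("levels_demo")
logger.setLevel(logging.WARNING)

# Only levels at or above the threshold (30) are enabled
enabled = [name for name in level_names if logger.isEnabledFor(values[name])]
print(enabled)  # ['WARNING', 'ERROR', 'CRITICAL']
```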

### Advanced Logging Patterns

**Lazy Evaluation with Logging:**
Logging functions accept format strings and arguments to avoid string formatting overhead when the message isn't emitted (e.g., debug logs in production):

```python
# BAD - Always constructs the string, even if DEBUG is disabled
logger.debug(f"Processing {len(huge_list)} items: {huge_list}")

# GOOD - Only formats if DEBUG is enabled
logger.debug("Processing %d items: %s", len(huge_list), huge_list)

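# CAVEAT - Arguments are still evaluated eagerly; guard expensive computations
# (expensive_summary is a hypothetical helper)
if logger.isEnabledFor(logging.DEBUG):
    logger.debug("Summary: %s", expensive_summary(huge_list))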
```
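
The deferred formatting can be confirmed with an object that counts how often it is converted to text. In this sketch, `Expensive` and the logger name `lazy_demo` are made up for the demonstration:

```python
import logging


class Expensive:
    """Counts how many times it is rendered as a string."""

    def __init__(self) -> None:
        self.renders = 0

    def __str__(self) -> str:
        self.renders += 1
        return "<expensive repr>"


logger = logging.getLogger("lazy_demo")
logger.setLevel(logging.INFO)       # DEBUG messages are disabled
logger.addHandler(logging.NullHandler())
logger.propagate = False

eager = Expensive()
logger.debug(f"value={eager}")      # f-string renders before debug() is called

lazy = Expensive()
logger.debug("value=%s", lazy)      # never rendered: the record is discarded

print(eager.renders, lazy.renders)  # 1 0
```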

**Contextual Information with extra:**

```python
# Add custom fields to log records
logger.info(
    "User authentication successful",
    extra={
        'user_id': user.id,
        'ip_address': request.remote_addr,
        'user_agent': request.user_agent
    }
)

# The formatter must reference these fields to display them; with this format,
# any record logged without them fails to format
formatter = logging.Formatter(
    '%(asctime)s - %(user_id)s - %(ip_address)s - %(message)s'
)
```
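
One common way to keep such a formatter safe is a filter that fills in defaults for records logged without `extra`. The sketch below is one possible approach, not the only one; `DefaultFields` and the logger name `extra_demo` are hypothetical:

```python
import io
import logging


class DefaultFields(logging.Filter):
    """Ensure every record has a user_id so the formatter never fails."""

    def filter(self, record: logging.LogRecord) -> bool:
        if not hasattr(record, "user_id"):
            record.user_id = "-"
        return True  # never drop the record


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(
    logging.Formatter("%(levelname)s user=%(user_id)s %(message)s")
)
handler.addFilter(DefaultFields())

logger = logging.getLogger("extra_demo")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

logger.info("login ok", extra={"user_id": 42})
logger.info("health check")  # no extra: the filter supplies the default

output = stream.getvalue().splitlines()
print(output)  # ['INFO user=42 login ok', 'INFO user=- health check']
```

Attaching the filter to the handler rather than the logger means it also covers records that arrive from child loggers through propagation.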

**Structured Logging (JSON):**
For log aggregation systems (ELK, Splunk, Datadog), output JSON instead of text:

```python
import json
import logging
from typing import override  # Python 3.12+

class JSONFormatter(logging.Formatter):
    @override
    def format(self, record: logging.LogRecord) -> str:
        log_obj: dict = {
            'timestamp': self.formatTime(record),
            'level': record.levelname,
            'logger': record.name,
            'message': record.getMessage(),
            'filename': record.filename,
            'line': record.lineno,
        }
        if hasattr(record, 'user_id'):
            log_obj['user_id'] = record.user_id
        return json.dumps(log_obj)

# Usage
json_handler: logging.StreamHandler = logging.StreamHandler()
json_handler.setFormatter(JSONFormatter())
logger.addHandler(json_handler)
```
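
Structured logs are most valuable for errors, so a JSON formatter usually needs to carry the traceback as well. The following variant is a sketch of one way to do that using `Formatter.formatException`; the field names and the logger name `json_demo` are assumptions, not a fixed schema:

```python
import io
import json
import logging


class JSONFormatter(logging.Formatter):
    """Minimal JSON formatter that includes exception details when present."""

    def format(self, record: logging.LogRecord) -> str:
        payload: dict = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # formatException renders the traceback as a single string
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JSONFormatter())

logger = logging.getLogger("json_demo")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

try:
    1 / 0
except ZeroDivisionError:
    # logger.exception logs at ERROR level and attaches the active exception
    logger.exception("division failed")

entry = json.loads(stream.getvalue())
print(entry["level"], entry["message"])  # ERROR division failed
```

Because the traceback travels inside one JSON value, multi-line stack traces no longer break line-oriented log parsers.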

**Configuration via File or Dict:**
For complex applications, configure logging via dictionary (often loaded from YAML/JSON):

```python
import logging.config

LOGGING_CONFIG: dict = {
    'version': 1,
    'disable_existing_loggers': False,
    'formatters': {
        'standard': {
            'format': '%(asctime)s [%(levelname)s] %(name)s: %(message)s'
        },
    },
    'handlers': {
        'default': {
            'level': 'INFO',
            'formatter': 'standard',
            'class': 'logging.StreamHandler',
            'stream': 'ext://sys.stdout',
        },
        'file': {
            'level': 'DEBUG',
            'formatter': 'standard',
            'class': 'logging.handlers.RotatingFileHandler',
            'filename': 'app.log',
            'maxBytes': 10485760,
            'backupCount': 3,
        },
    },
    'loggers': {
        '': {  # Root logger
            'handlers': ['default', 'file'],
            'level': 'DEBUG',
            'propagate': False
        },
        'urllib3': {  # Reduce noise from libraries
            'handlers': ['default'],
            'level': 'WARNING',
            'propagate': False
        }
    }
}

logging.config.dictConfig(LOGGING_CONFIG)
```

### Logging vs. Print: Decision Matrix

| Scenario | Use Print? | Use Logging? | Reasoning |
|----------|-----------|--------------|-----------|
| Quick script, one-time run | ✅ Yes | ❌ No | Simplicity |
| Production application | ❌ No | ✅ Yes | Configurability, levels, persistence |
| Library code | ❌ Never | ✅ Yes | Let application control output |
| Error tracking | ❌ No | ✅ Yes | Stack traces, context, severity |
| User-facing output | ✅ Yes | ❌ No | Logging is for developers/operators |

## Summary

Robust error handling transforms fragile scripts into resilient applications. You have learned to use `try-except-else-finally` blocks with precise exception specificity, avoiding bare `except:` clauses that mask critical errors. You can design custom exception hierarchies that communicate domain-specific failures while preserving context through exception chaining (`raise ... from`). Modern Python features like Exception Groups and exception notes allow you to handle complex concurrent failures and enrich diagnostic information.

You now possess systematic debugging techniques, from reading tracebacks effectively to using `pdb` for interactive inspection and leveraging IDE graphical debuggers. Finally, you understand that `logging` is not merely a better `print()`—it is a configurable, hierarchical infrastructure that separates diagnostic concerns from business logic, supports structured output for modern observability platforms, and allows runtime configuration of verbosity without code changes.

However, error handling and debugging are reactive measures. The next chapter shifts focus to proactive quality assurance—ensuring that errors are caught before they reach production through systematic testing strategies.

**Next Chapter**: Chapter 10: Testing and Quality Assurance (Unit Testing, Pytest, TDD, and Coverage).