# Chapter 21: Advanced Logging

Building on logging fundamentals, this notebook covers the advanced features of Python's
logging module: logger hierarchies, multiple handlers, custom filters, rotating file
handlers, structured logging, configuration via dictionaries, and best practices for
logging in libraries vs applications.

## Topics Covered
- **Logger hierarchy**: Parent-child relationships and propagation
- **Multiple handlers**: Console + file on a single logger
- **Filters**: Filtering log records by custom criteria
- **Rotating handlers**: `RotatingFileHandler` and `TimedRotatingFileHandler`
- **Structured logging**: JSON-formatted log output
- **dictConfig and fileConfig**: Declarative configuration
- **Libraries vs applications**: Different logging strategies
- **LoggerAdapter**: Adding context with extra fields

## Logger Hierarchy: Parent-Child Relationships

Logger names form a hierarchy using dot-separated names, just like Python packages.
The logger `myapp.db.connection` is a child of `myapp.db`, which is a child of `myapp`,
which is a child of the **root** logger.

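When a logger accepts a record (the record passes the logger's effective level and
filters), it hands the record to its own handlers, then (if `propagate=True`, which is
the default) to the handlers of its parent, and so on up to the root. Ancestor loggers'
own levels and filters are not consulted on the way up; only each handler's level and
filters apply.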
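
A minimal sketch of how levels interact with propagation, assuming the hypothetical logger names `demo_shop` and `demo_shop.cart`. The level set on `demo_shop` gates records logged on `demo_shop` itself, while a record accepted by the child goes straight to the parent's handler:

```python
import io
import logging

# Hypothetical logger names, used only for this illustration.
parent = logging.getLogger("demo_shop")
child = logging.getLogger("demo_shop.cart")

buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))

parent.handlers.clear()
parent.addHandler(handler)
parent.propagate = False         # keep the demo away from the root logger
parent.setLevel(logging.ERROR)   # applies to records logged on parent itself
child.setLevel(logging.DEBUG)    # explicit level on the child

child.info("item added")         # accepted by child, handled by parent's handler
parent.info("catalog loaded")    # dropped: below parent's own level

captured = buffer.getvalue()
print(captured, end="")

# Clean up
parent.removeHandler(handler)
parent.setLevel(logging.NOTSET)
child.setLevel(logging.NOTSET)
parent.propagate = True
```

Only the child's record is captured. To silence a noisy child from the parent's side, set a level on the handler (or on the child) rather than relying on the parent logger's level.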

In [None]:
import logging

# Clean slate
for name in ["webapp", "webapp.auth", "webapp.auth.oauth"]:
    lg = logging.getLogger(name)
    lg.handlers.clear()
    lg.setLevel(logging.NOTSET)
    lg.propagate = True

# Build a three-level hierarchy
parent = logging.getLogger("webapp")
child = logging.getLogger("webapp.auth")
grandchild = logging.getLogger("webapp.auth.oauth")

# Verify the hierarchy
print(f"child.parent: {child.parent}")
print(f"grandchild.parent: {grandchild.parent}")
print(f"parent.parent: {parent.parent}")

# Set level only on the parent -- children inherit it
parent.setLevel(logging.WARNING)

print(f"\nparent effective level:     {logging.getLevelName(parent.getEffectiveLevel())}")
print(f"child effective level:      {logging.getLevelName(child.getEffectiveLevel())}")
print(f"grandchild effective level: {logging.getLevelName(grandchild.getEffectiveLevel())}")

In [None]:
import logging
import sys

# Demonstrate propagation: records bubble up to parent handlers
parent = logging.getLogger("webapp")
parent.setLevel(logging.DEBUG)
parent.handlers.clear()

child = logging.getLogger("webapp.auth")
child.handlers.clear()

# Add a handler ONLY to the parent
parent_handler = logging.StreamHandler(sys.stdout)
parent_handler.setFormatter(logging.Formatter("[%(name)s] %(levelname)s: %(message)s"))
parent.addHandler(parent_handler)
parent.propagate = False  # Stop at parent, don't go to root

# Child has no handler, but propagation sends records to parent's handler
print("--- Child logs propagate to parent handler ---")
child.warning("Login attempt failed for user 'admin'")
child.info("Session created for user 'alice'")

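# Disable propagation on child. With no handler anywhere in the chain, logging
# falls back to logging.lastResort, which writes WARNING+ records to stderr.
print("\n--- After child.propagate = False ---")
child.propagate = False
child.warning("Bypasses parent's handler (lastResort prints this to stderr)")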

# Give child its own handler
child_handler = logging.StreamHandler(sys.stdout)
child_handler.setFormatter(logging.Formatter("  AUTH >> %(levelname)s: %(message)s"))
child.addHandler(child_handler)

print("\n--- Child with its own handler, propagation off ---")
child.warning("Appears only through child's handler")

# Re-enable propagation: message appears in BOTH handlers
child.propagate = True
print("\n--- Child with own handler + propagation on ---")
child.error("Appears in BOTH child and parent handlers")

# Clean up
parent.removeHandler(parent_handler)
child.removeHandler(child_handler)
child.propagate = True
parent.propagate = True

## Multiple Handlers Per Logger

A common pattern is to attach multiple handlers to a single logger, each with a different
level and destination. For example: verbose `DEBUG` output to a file, and only `WARNING`+
to the console.

In [None]:
import logging
import sys
import io

logger = logging.getLogger("myapp.multi")
logger.setLevel(logging.DEBUG)
logger.handlers.clear()
logger.propagate = False

# Handler 1: Console -- only WARNING and above
console_handler = logging.StreamHandler(sys.stdout)
console_handler.setLevel(logging.WARNING)
console_handler.setFormatter(logging.Formatter(
    "CONSOLE | %(levelname)-8s | %(message)s"
))

# Handler 2: Simulated file (StringIO) -- all levels from DEBUG
file_buffer = io.StringIO()
file_handler = logging.StreamHandler(file_buffer)
file_handler.setLevel(logging.DEBUG)
file_handler.setFormatter(logging.Formatter(
    "%(asctime)s | %(levelname)-8s | %(name)s | %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S"
))

logger.addHandler(console_handler)
logger.addHandler(file_handler)

# Emit messages at various levels
print("=== Console output (WARNING+) ===")
logger.debug("Connecting to database")
logger.info("Query executed in 45ms")
logger.warning("Slow query detected: 2300ms")
logger.error("Connection pool exhausted")

# Show what the "file" captured (all levels)
print("\n=== File output (DEBUG+) ===")
print(file_buffer.getvalue())

# Clean up
logger.removeHandler(console_handler)
logger.removeHandler(file_handler)
logger.propagate = True

## Filters: Custom Log Record Filtering

Filters provide fine-grained control over which records a logger or handler processes.
A filter can be:
- A `logging.Filter` instance (filters by logger name prefix)
- Any object with a `filter(record)` method that returns `True`/`False`
- A callable that takes a `LogRecord` and returns `True`/`False`
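
As a small illustration of the last form, a plain function can serve as a filter without subclassing `logging.Filter`. This is a sketch; the logger name `demo.callable_filter` and the `/health` endpoint are hypothetical:

```python
import io
import logging


def drop_health_checks(record: logging.LogRecord) -> bool:
    """Return False (drop the record) when it mentions the /health endpoint."""
    return "/health" not in record.getMessage()


logger = logging.getLogger("demo.callable_filter")  # hypothetical name
logger.setLevel(logging.INFO)
logger.handlers.clear()
logger.propagate = False

buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
handler.addFilter(drop_health_checks)  # a bare function, no Filter subclass
logger.addHandler(handler)

logger.info("GET /health 200")      # dropped by the filter
logger.info("GET /api/users 200")   # passes

captured = buffer.getvalue()
print(captured, end="")

# Clean up
logger.removeHandler(handler)
logger.propagate = True
```

A function is enough when the filter needs no configuration; a class is the better fit once it carries state such as a level range.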

In [None]:
import logging
import sys


class SensitiveDataFilter(logging.Filter):
    """Filter that redacts sensitive fields from log messages."""

    SENSITIVE_KEYWORDS: list[str] = ["password", "token", "secret", "api_key"]

    def filter(self, record: logging.LogRecord) -> bool:
        """Redact sensitive data but allow the record through."""
        message: str = record.getMessage()
        for keyword in self.SENSITIVE_KEYWORDS:
            if keyword in message.lower():
                record.msg = f"[REDACTED - contains '{keyword}']"
                record.args = None
                break
        return True  # Always allow through (but redacted)


class LevelRangeFilter(logging.Filter):
    """Only allow records within a specific level range."""

    def __init__(self, low: int = logging.DEBUG, high: int = logging.CRITICAL) -> None:
        super().__init__()
        self.low = low
        self.high = high

    def filter(self, record: logging.LogRecord) -> bool:
        return self.low <= record.levelno <= self.high


# Setup logger
logger = logging.getLogger("myapp.filters")
logger.setLevel(logging.DEBUG)
logger.handlers.clear()
logger.propagate = False

handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(logging.Formatter("%(levelname)-8s: %(message)s"))
logger.addHandler(handler)

# Add the sensitive data filter
sensitive_filter = SensitiveDataFilter()
handler.addFilter(sensitive_filter)

print("--- With SensitiveDataFilter ---")
logger.info("User alice logged in successfully")
logger.info("Authenticating with password=s3cret123")
logger.info("Setting API_KEY=abc-def-ghi")
logger.warning("Token expired for user bob")

# Add a level range filter: only DEBUG and INFO
handler.removeFilter(sensitive_filter)
range_filter = LevelRangeFilter(low=logging.DEBUG, high=logging.INFO)
handler.addFilter(range_filter)

print("\n--- With LevelRangeFilter (DEBUG-INFO only) ---")
logger.debug("This DEBUG message passes")
logger.info("This INFO message passes")
logger.warning("This WARNING will be filtered out")
logger.error("This ERROR will be filtered out")

# Clean up
handler.removeFilter(range_filter)
logger.removeHandler(handler)
logger.propagate = True

## RotatingFileHandler and TimedRotatingFileHandler

For long-running applications, log files can grow very large. The `logging.handlers`
module provides rotating handlers that automatically manage file sizes:

- **`RotatingFileHandler`**: Rotates when the file reaches a size limit
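  - `maxBytes`: Approximate maximum size per file; rollover happens when the next record would exceed it
  - `backupCount`: Number of rotated backup files to keep (if either value is 0, the file never rotates)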
  
- **`TimedRotatingFileHandler`**: Rotates based on time intervals
  - `when`: `'S'` (seconds), `'M'` (minutes), `'H'` (hours), `'D'` (days), `'W0'`-`'W6'` (weekday, 0 = Monday), `'midnight'`
  - `interval`: Number of time units between rotations
  - `backupCount`: Number of backup files to keep
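
The time-based options can be sketched as follows. The logger name, file name, and the choice of seven daily backups are assumptions for illustration; no rotation actually happens during the run, since that would require waiting for midnight:

```python
import logging
import os
import tempfile
from logging.handlers import TimedRotatingFileHandler

logger = logging.getLogger("demo.timed")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.handlers.clear()
logger.propagate = False

with tempfile.TemporaryDirectory(prefix="timed_demo_") as tmp_dir:
    log_path = os.path.join(tmp_dir, "service.log")

    timed_handler = TimedRotatingFileHandler(
        log_path,
        when="midnight",   # roll over at midnight (local time unless utc=True)
        interval=1,        # every 1 unit of 'when'
        backupCount=7,     # keep the seven most recent rotated files
        encoding="utf-8",
    )
    timed_handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(timed_handler)

    logger.info("service started")

    # Close the handler before reading and before the directory is removed.
    logger.removeHandler(timed_handler)
    timed_handler.close()

    with open(log_path, encoding="utf-8") as f:
        contents = f.read()

print(contents, end="")
logger.propagate = True
```

Rotated files get a date-based suffix appended to the base file name, and files beyond `backupCount` are deleted at rollover.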

In [None]:
from logging.handlers import RotatingFileHandler, TimedRotatingFileHandler
import logging
import tempfile
import os

# Create a temporary directory for demo log files
log_dir: str = tempfile.mkdtemp(prefix="logging_demo_")
log_file: str = os.path.join(log_dir, "app.log")

# RotatingFileHandler: rotate after 1KB, keep 3 backups
rotating_handler = RotatingFileHandler(
    filename=log_file,
    maxBytes=1024,       # Rotate after 1KB
    backupCount=3,       # Keep app.log.1, app.log.2, app.log.3
)
rotating_handler.setFormatter(logging.Formatter(
    "%(asctime)s [%(levelname)s] %(message)s"
))

logger = logging.getLogger("myapp.rotating")
logger.setLevel(logging.DEBUG)
logger.handlers.clear()
logger.propagate = False
logger.addHandler(rotating_handler)

# Write enough messages to trigger rotation
for i in range(50):
    logger.info("Log entry number %d: some application event data here", i)

# Close the handler so files are flushed
rotating_handler.close()
logger.removeHandler(rotating_handler)

# Show the rotated files
print(f"Log directory: {log_dir}")
print(f"\nFiles created:")
for filename in sorted(os.listdir(log_dir)):
    filepath = os.path.join(log_dir, filename)
    size = os.path.getsize(filepath)
    print(f"  {filename}: {size} bytes")

# Show content of the most recent log file
print(f"\nLast 3 lines of app.log:")
with open(log_file) as f:
    lines = f.readlines()
    for line in lines[-3:]:
        print(f"  {line.rstrip()}")

# TimedRotatingFileHandler (conceptual -- just show configuration)
print("\nTimedRotatingFileHandler configuration example:")
print("  when='midnight' -- rotate at midnight each day")
print("  interval=1      -- every 1 unit of 'when'")
print("  backupCount=7   -- keep one week of logs")

# Clean up temp directory
import shutil
shutil.rmtree(log_dir)
logger.propagate = True

## Structured Logging: JSON-Formatted Output

Plain-text logs are human-readable but hard to parse programmatically. **Structured logging**
outputs records in a machine-readable format like JSON, making it easy to index, search,
and analyze logs with tools like Elasticsearch, Splunk, or CloudWatch.
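
To make the "easy to search" claim concrete, here is a minimal sketch that writes JSON lines and then filters them with `json.loads`. `MinimalJSONFormatter` and the logger name are hypothetical and deliberately smaller than a production formatter:

```python
import io
import json
import logging


class MinimalJSONFormatter(logging.Formatter):
    """Hypothetical formatter: one JSON object per line with three fields."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(MinimalJSONFormatter())

logger = logging.getLogger("demo.jsonlines")  # hypothetical name
logger.setLevel(logging.DEBUG)
logger.handlers.clear()
logger.propagate = False
logger.addHandler(handler)

logger.info("cache warmed")
logger.error("payment declined")
logger.warning("retrying request")

# Each line parses back into a dict, so filtering is a plain comprehension.
records = [json.loads(line) for line in buffer.getvalue().splitlines()]
problems = [r["message"] for r in records if r["level"] in {"WARNING", "ERROR"}]
print(problems)

# Clean up
logger.removeHandler(handler)
logger.propagate = True
```

The same query against plain-text logs would need a regular expression tied to one specific format string.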

In [None]:
import json
import logging
import sys
from datetime import datetime, timezone


class JSONFormatter(logging.Formatter):
    """Formats log records as JSON lines."""

    def format(self, record: logging.LogRecord) -> str:
        """Convert a LogRecord to a JSON string."""
        log_data: dict[str, object] = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "module": record.module,
            "function": record.funcName,
            "line": record.lineno,
        }

        # Include exception info if present
        if record.exc_info and record.exc_info[0] is not None:
            log_data["exception"] = self.formatException(record.exc_info)

        # Include any extra fields
        if hasattr(record, "request_id"):
            log_data["request_id"] = record.request_id  # type: ignore[attr-defined]
        if hasattr(record, "user_id"):
            log_data["user_id"] = record.user_id  # type: ignore[attr-defined]

        return json.dumps(log_data, default=str)


# Set up logger with JSON formatter
logger = logging.getLogger("myapp.json")
logger.setLevel(logging.DEBUG)
logger.handlers.clear()
logger.propagate = False

handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(JSONFormatter())
logger.addHandler(handler)

# Regular log messages become JSON
logger.info("Application started")
logger.warning("Cache miss ratio above threshold: 45%")  # no args, so no %-escaping needed

# Log with extra fields
logger.info(
    "User authenticated",
    extra={"request_id": "req-abc-123", "user_id": 42}
)

# Log with exception
try:
    result: float = 1 / 0
except ZeroDivisionError:
    logger.error("Calculation failed", exc_info=True)

# Clean up
logger.removeHandler(handler)
logger.propagate = True

## dictConfig: Declarative Logging Configuration

`logging.config.dictConfig()` lets you configure the entire logging system from a
dictionary. This is the **recommended** approach for complex setups because:

- Configuration can be stored in YAML/JSON/TOML files
- Easy to change without modifying code
- All loggers, handlers, formatters, and filters defined in one place

`logging.config.fileConfig()` is the older approach: it reads an INI-style (`configparser`)
file and cannot express everything `dictConfig()` can, such as filters.
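
As a sketch of the "stored in a file" point, the configuration below is parsed from JSON text before being passed to `dictConfig()`. In a real project the text would typically come from a file; the inline string, the logger name `demo_cfg`, and the handler and formatter names are assumptions for illustration:

```python
import json
import logging
import logging.config

# Stand-in for the contents of a JSON config file (hypothetical names throughout).
CONFIG_JSON = """
{
  "version": 1,
  "disable_existing_loggers": false,
  "formatters": {
    "brief": {"format": "%(levelname)s: %(message)s"}
  },
  "handlers": {
    "console": {
      "class": "logging.StreamHandler",
      "formatter": "brief",
      "stream": "ext://sys.stdout"
    }
  },
  "loggers": {
    "demo_cfg": {"level": "INFO", "handlers": ["console"], "propagate": false}
  }
}
"""

config = json.loads(CONFIG_JSON)
logging.config.dictConfig(config)

cfg_logger = logging.getLogger("demo_cfg")
cfg_logger.debug("hidden: below INFO")
cfg_logger.info("visible: configured from JSON")

# Record what the configuration produced, then clean up.
configured_level = logging.getLevelName(cfg_logger.level)
handler_types = [type(h).__name__ for h in cfg_logger.handlers]

cfg_logger.handlers.clear()
cfg_logger.setLevel(logging.NOTSET)
cfg_logger.propagate = True
```

Because the dictionary is plain data, changing a level or swapping a handler means editing the file, not the code.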

In [None]:
import logging
import logging.config
import sys

# Define the entire logging configuration as a dictionary
LOGGING_CONFIG: dict = {
    "version": 1,                     # Required, always 1
    "disable_existing_loggers": False, # Don't disable loggers created before config

    "formatters": {
        "standard": {
            "format": "%(asctime)s [%(levelname)-8s] %(name)s: %(message)s",
            "datefmt": "%H:%M:%S",
        },
        "brief": {
            "format": "%(levelname)s: %(message)s",
        },
    },

    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "level": "INFO",
            "formatter": "brief",
            "stream": "ext://sys.stdout",
        },
        "detailed_console": {
            "class": "logging.StreamHandler",
            "level": "DEBUG",
            "formatter": "standard",
            "stream": "ext://sys.stdout",
        },
    },

    "loggers": {
        "myapp": {
            "level": "DEBUG",
            "handlers": ["detailed_console"],
            "propagate": False,
        },
        "myapp.db": {
            "level": "WARNING",       # Only warnings+ from the db module
            "handlers": ["console"],
            "propagate": False,
        },
    },

    "root": {
        "level": "WARNING",
        "handlers": ["console"],
    },
}

# Apply the configuration
logging.config.dictConfig(LOGGING_CONFIG)

# Test the configured loggers
app_logger = logging.getLogger("myapp")
db_logger = logging.getLogger("myapp.db")

print("--- myapp logger (detailed format, DEBUG+) ---")
app_logger.debug("Initializing application")
app_logger.info("Configuration loaded")

print("\n--- myapp.db logger (brief format, WARNING+) ---")
db_logger.info("Executing SELECT query")       # Filtered out
db_logger.warning("Query took 5.2 seconds")
db_logger.error("Connection lost to primary DB")

# Clean up
for name in ["myapp", "myapp.db"]:
    lg = logging.getLogger(name)
    lg.handlers.clear()
    lg.setLevel(logging.NOTSET)
    lg.propagate = True
root = logging.getLogger()
root.handlers.clear()

## Logging in Libraries vs Applications

Libraries and applications have different logging responsibilities:

**Libraries** should:
- Create loggers using `getLogger(__name__)`
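- Add a `NullHandler` to the library's top-level logger so that, if the application configures nothing, `WARNING`+ records are not printed to stderr by the last-resort handler (Python 2 printed "No handlers could be found" instead)
- **Never** attach other handlers, formatters, or levels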
- Let the application decide how to handle log output

**Applications** should:
- Configure the root logger or specific loggers with handlers and formatters
- Use `dictConfig()` or `basicConfig()` at the entry point
- Set appropriate levels for different components
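
The division of labour can be sketched with `basicConfig()` on the application side. The package name `demo_lib`, the `fetch` function, and the URL are hypothetical; `force=True` (Python 3.8+) is used only so the sketch behaves the same when the root logger already has handlers, and the output goes to a `StringIO` so it can be inspected:

```python
import io
import logging

# --- Library side (hypothetical package "demo_lib") ---
lib_log = logging.getLogger("demo_lib")
lib_log.addHandler(logging.NullHandler())


def fetch(url: str) -> None:
    """Pretend to fetch a URL; the library only emits records."""
    lib_log.info("fetching %s", url)


# --- Application side: configure the root logger once, at the entry point ---
app_output = io.StringIO()
logging.basicConfig(
    level=logging.INFO,
    format="%(name)s | %(levelname)s | %(message)s",
    stream=app_output,
    force=True,  # replace any handlers already attached to the root logger
)

fetch("https://example.com/data")

captured = app_output.getvalue()
print(captured, end="")

# Clean up: restore the root logger's defaults.
root = logging.getLogger()
for existing in root.handlers[:]:
    root.removeHandler(existing)
root.setLevel(logging.WARNING)
```

The library never touches the root logger; its records reach the application's handler through propagation and inherit the level the application chose.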

In [None]:
import logging
import sys


# === Library code (what a library author writes) ===
# File: mylib/__init__.py
lib_logger = logging.getLogger("mylib")
lib_logger.addHandler(logging.NullHandler())  # Keep unconfigured output off stderr


def library_function(data: str) -> str:
    """A function in a third-party library."""
    lib_logger.debug("Processing data: %s", data[:50])
    if not data:
        lib_logger.warning("Empty data received")
        return ""
    lib_logger.info("Data processed successfully, length=%d", len(data))
    return data.upper()


# === Without application configuration, library logs are silently discarded ===
print("--- Without app configuration (NullHandler absorbs logs) ---")
result = library_function("hello world")
print(f"Result: {result}")
print("(No log output -- NullHandler absorbed everything)")

# === Application code: configure logging to see library output ===
print("\n--- With app configuration ---")
app_handler = logging.StreamHandler(sys.stdout)
app_handler.setFormatter(logging.Formatter("[%(name)s] %(levelname)s: %(message)s"))

# Configure the library's logger from the application side
lib_logger.addHandler(app_handler)
lib_logger.setLevel(logging.DEBUG)

result = library_function("hello world")
print(f"Result: {result}")

# Clean up
lib_logger.removeHandler(app_handler)
lib_logger.setLevel(logging.NOTSET)

## LoggerAdapter: Adding Context with Extra Fields

`LoggerAdapter` wraps a logger and injects extra context into every log record.
This is useful for adding request IDs, user IDs, or other contextual information
without passing them to every log call.
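
One behaviour worth knowing: by default, `LoggerAdapter.process()` replaces any `extra` passed on an individual call with the adapter's own dictionary (Python 3.13 adds a `merge_extra` argument to change this). The sketch below shows one way to merge the two on older versions; `MergingAdapter`, the logger name, and the field names are hypothetical:

```python
import io
import logging
from typing import Any, MutableMapping


class MergingAdapter(logging.LoggerAdapter):
    """Hypothetical adapter that merges per-call ``extra`` into its own context."""

    def process(
        self, msg: Any, kwargs: MutableMapping[str, Any]
    ) -> tuple[Any, MutableMapping[str, Any]]:
        call_extra = kwargs.get("extra") or {}
        # Per-call values win over the adapter's defaults on key collisions.
        kwargs["extra"] = {**(self.extra or {}), **call_extra}
        return msg, kwargs


logger = logging.getLogger("demo.merging_adapter")  # hypothetical name
logger.setLevel(logging.INFO)
logger.handlers.clear()
logger.propagate = False

buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(request_id)s %(step)s %(message)s"))
logger.addHandler(handler)

adapter = MergingAdapter(logger, {"request_id": "req-42"})
adapter.info("charging card", extra={"step": "payment"})

captured = buffer.getvalue()
print(captured, end="")

# Clean up
logger.removeHandler(handler)
logger.propagate = True
```

Both the adapter's `request_id` and the per-call `step` end up on the record, so the format string can reference either.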

In [None]:
import logging
import sys


# Set up a logger with a format that includes 'extra' fields
logger = logging.getLogger("myapp.adapter")
logger.setLevel(logging.DEBUG)
logger.handlers.clear()
logger.propagate = False

handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(logging.Formatter(
    "%(asctime)s [%(levelname)s] request_id=%(request_id)s user=%(user)s: %(message)s",
    datefmt="%H:%M:%S"
))
logger.addHandler(handler)

# Create a LoggerAdapter that injects request context
request_context: dict[str, str] = {
    "request_id": "req-7f3a-b2c1",
    "user": "alice",
}
adapter = logging.LoggerAdapter(logger, extra=request_context)

# Every log call through the adapter automatically includes the context
adapter.info("Handling GET /api/users")
adapter.debug("Querying database for user list")
adapter.info("Returning 25 users")

# A different request gets a different adapter
print()
another_context: dict[str, str] = {
    "request_id": "req-9d4e-f8a2",
    "user": "bob",
}
another_adapter = logging.LoggerAdapter(logger, extra=another_context)
another_adapter.warning("Rate limit approaching for user")
another_adapter.error("Permission denied for /admin endpoint")

# Clean up
logger.removeHandler(handler)
logger.propagate = True

In [None]:
import logging
import sys


class RequestAdapter(logging.LoggerAdapter):
    """Custom LoggerAdapter that prefixes messages with request context."""

    def process(
        self, msg: str, kwargs: dict
    ) -> tuple[str, dict]:
        """Prepend request context to every log message."""
        request_id: str = self.extra.get("request_id", "unknown")  # type: ignore[union-attr]
        client_ip: str = self.extra.get("client_ip", "unknown")  # type: ignore[union-attr]
        return f"[{request_id}] [{client_ip}] {msg}", kwargs


# Set up logger with a simple format (context is in the message itself)
logger = logging.getLogger("myapp.custom_adapter")
logger.setLevel(logging.DEBUG)
logger.handlers.clear()
logger.propagate = False

handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(logging.Formatter("%(levelname)-8s %(message)s"))
logger.addHandler(handler)

# Simulate handling two concurrent requests
req1 = RequestAdapter(logger, {"request_id": "a1b2c3", "client_ip": "192.168.1.10"})
req2 = RequestAdapter(logger, {"request_id": "d4e5f6", "client_ip": "10.0.0.5"})

req1.info("Received GET /api/products")
req2.info("Received POST /api/orders")
req1.debug("Fetching products from cache")
req2.warning("Order validation failed: missing field 'quantity'")
req1.info("Returning 150 products")
req2.error("Order creation failed")

# Clean up
logger.removeHandler(handler)
logger.propagate = True

## Summary

### Key Takeaways

| Concept | Tool | Purpose |
|---------|------|--------|
| **Logger hierarchy** | Dot-separated names | Parent-child log propagation |
| **Multiple handlers** | `addHandler()` | Send logs to console + file simultaneously |
| **Filters** | `logging.Filter`, callables | Fine-grained record filtering and modification |
| **Rotating handlers** | `RotatingFileHandler`, `TimedRotatingFileHandler` | Automatic log file rotation by size or time |
| **Structured logging** | Custom `Formatter` | JSON output for machine parsing |
| **dictConfig** | `logging.config.dictConfig()` | Declarative logging configuration |
| **NullHandler** | `logging.NullHandler()` | Silent default handler for libraries |
| **LoggerAdapter** | `logging.LoggerAdapter` | Inject context into every log message |

### Best Practices
- Set `propagate=False` on a logger that has its own handlers when an ancestor also has handlers; otherwise each record is emitted twice
- Libraries should only add `NullHandler`; applications configure everything else
- Use `dictConfig()` for complex setups -- store config in external files
- Use `RotatingFileHandler` in production to prevent disk exhaustion
- Use structured (JSON) logging when logs will be parsed by aggregation tools
- Use `LoggerAdapter` or the `extra` parameter to add request/session context
- Set `disable_existing_loggers: False` in `dictConfig` to avoid silencing third-party loggers