# Example 2: Extractors - Automatic Context Extraction

## What You'll Learn

This notebook demonstrates how extractors automatically pull context from function arguments:

- **Extractor Interfaces** - Abstract base classes for extensibility
- **Default Extractors** - Built-in extractors for common patterns
- **Custom Extractors** - Creating domain-specific extractors
- **ResultProcessor** - Processing function results
- **Real-World Examples** - Practical extractor implementations


## Setup - Imports


In [6]:
from dc_logger.client.Log import LogEntity, HTTPDetails, MultiTenant
from dc_logger.client.extractors import (
    EntityExtractor,
    HTTPDetailsExtractor,
    MultiTenantExtractor,
    ResultProcessor,
    KwargsEntityExtractor,
    KwargsHTTPDetailsExtractor,
    KwargsMultiTenantExtractor,
    DefaultResultProcessor
)

print("All imports successful!")


All imports successful!


---
## Part 1: Understanding Extractor Interfaces

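Each extractor interface is small and has a single, focused responsibility, so code that only needs entity data depends only on `EntityExtractor`. This is the **Interface Segregation Principle** applied to context extraction.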


In [7]:
print("=" * 70)
print("Extractor Interfaces")
print("=" * 70)

print("\n1. EntityExtractor")
print("   Purpose: Extract entity information (datasets, users, resources)")
print("   Method: extract(func, args, kwargs) -> Optional[LogEntity]")

print("\n2. HTTPDetailsExtractor")
print("   Purpose: Extract HTTP request/response details")
print("   Method: extract(func, args, kwargs) -> Optional[HTTPDetails]")

print("\n3. MultiTenantExtractor")
print("   Purpose: Extract multi-tenant and session information")
print("   Method: extract(func, args, kwargs) -> Optional[MultiTenant]")

print("\n4. ResultProcessor")
print("   Purpose: Process function results and update context")
print("   Method: process(result, http_details) -> (dict, HTTPDetails)")


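======================================================================
Extractor Interfaces
======================================================================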

1. EntityExtractor
   Purpose: Extract entity information (datasets, users, resources)
   Method: extract(func, args, kwargs) -> Optional[LogEntity]

2. HTTPDetailsExtractor
   Purpose: Extract HTTP request/response details
   Method: extract(func, args, kwargs) -> Optional[HTTPDetails]

3. MultiTenantExtractor
   Purpose: Extract multi-tenant and session information
   Method: extract(func, args, kwargs) -> Optional[MultiTenant]

4. ResultProcessor
   Purpose: Process function results and update context
   Method: process(result, http_details) -> (dict, HTTPDetails)
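
The signatures printed above suggest one-method abstract base classes. The following is a minimal sketch of that shape, not the `dc_logger` source: `SketchEntity`, `SketchEntityExtractor`, and `UserIdExtractor` are hypothetical stand-ins so the example runs without the library installed.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional, Tuple


@dataclass
class SketchEntity:
    """Hypothetical stand-in for LogEntity (not the dc_logger class)."""
    type: str
    id: Optional[str] = None
    name: Optional[str] = None


class SketchEntityExtractor(ABC):
    """Hypothetical interface mirroring extract(func, args, kwargs)."""

    @abstractmethod
    def extract(
        self, func: Callable, args: Tuple, kwargs: Dict[str, Any]
    ) -> Optional[SketchEntity]:
        """Return an entity, or None when nothing can be extracted."""


class UserIdExtractor(SketchEntityExtractor):
    """Example implementation: build a user entity from a `user_id` kwarg."""

    def extract(self, func, args, kwargs):
        user_id = kwargs.get("user_id")
        if user_id is None:
            return None
        return SketchEntity(type="user", id=str(user_id))


def noop():
    pass


found = UserIdExtractor().extract(noop, (), {"user_id": 42})
missing = UserIdExtractor().extract(noop, (), {})
```

Because `extract` is abstract, instantiating `SketchEntityExtractor` directly raises `TypeError`; only subclasses that implement the method can be created.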


---
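## Part 2: KwargsEntityExtractor - Default Entity Extraction

`KwargsEntityExtractor` looks up the keyword argument named by `kwarg_name` and converts it to a `LogEntity`. It returns `None` when that argument is absent.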


In [8]:
print("=" * 70)
print("KwargsEntityExtractor - Extracting from kwargs")
print("=" * 70)

# Create the extractor
entity_extractor = KwargsEntityExtractor(kwarg_name="entity")

# Simulate function call
def dummy_function():
    pass

# Example 1: Entity as dict
kwargs1 = {
    "entity": {"type": "dataset", "id": "ds_123", "name": "Sales Data"}
}
entity1 = entity_extractor.extract(dummy_function, (), kwargs1)

print("\nExample 1: Entity from dict")
print(f"  Type: {entity1.type}")
print(f"  ID: {entity1.id}")
print(f"  Name: {entity1.name}")

# Example 2: No entity in kwargs
kwargs2 = {"other_param": "value"}
entity2 = entity_extractor.extract(dummy_function, (), kwargs2)

print("\nExample 2: No entity found")
print(f"  Result: {entity2}")

# Example 3: Custom kwarg name
custom_extractor = KwargsEntityExtractor(kwarg_name="resource")
kwargs3 = {"resource": {"type": "file", "id": "file_456"}}
entity3 = custom_extractor.extract(dummy_function, (), kwargs3)

print("\nExample 3: Custom kwarg name ('resource')")
print(f"  Type: {entity3.type}")
print(f"  ID: {entity3.id}")


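======================================================================
KwargsEntityExtractor - Extracting from kwargs
======================================================================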

Example 1: Entity from dict
  Type: dataset
  ID: ds_123
  Name: Sales Data

Example 2: No entity found
  Result: None

Example 3: Custom kwarg name ('resource')
  Type: file
  ID: file_456
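
The behavior shown above (dict in, entity out, `None` when the key is missing) can be sketched in a few lines. This is an illustration under assumptions, not the library's implementation: `SketchEntity` and `SketchKwargsEntityExtractor` are hypothetical names, and the handling of extra dict keys is a guess.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class SketchEntity:
    """Hypothetical stand-in for LogEntity."""
    type: str
    id: Optional[str] = None
    name: Optional[str] = None
    additional_info: Dict[str, Any] = field(default_factory=dict)


class SketchKwargsEntityExtractor:
    """Hypothetical re-creation of the lookup demonstrated above."""

    def __init__(self, kwarg_name: str = "entity"):
        self.kwarg_name = kwarg_name

    def extract(self, func, args, kwargs) -> Optional[SketchEntity]:
        value = kwargs.get(self.kwarg_name)
        if isinstance(value, SketchEntity):
            return value  # already an entity, pass it through
        if isinstance(value, dict) and "type" in value:
            known = {k: value[k] for k in ("type", "id", "name") if k in value}
            # Assumption: unrecognized keys are kept as additional_info.
            extra = {k: v for k, v in value.items() if k not in known}
            return SketchEntity(additional_info=extra, **known)
        return None  # key missing or value in an unsupported shape


extractor = SketchKwargsEntityExtractor(kwarg_name="resource")
entity = extractor.extract(None, (), {"resource": {"type": "file", "id": "file_456"}})
nothing = extractor.extract(None, (), {"other_param": "value"})
```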


---
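## Part 3: KwargsHTTPDetailsExtractor - HTTP Information

`KwargsHTTPDetailsExtractor` accepts HTTP information in three shapes: individual keyword arguments (`method`, `url`, `headers`, `params`), an `HTTPDetails` object passed as `http_details`, or a plain dict passed as `http_details`.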


In [9]:
print("=" * 70)
print("KwargsHTTPDetailsExtractor - Extracting HTTP Details")
print("=" * 70)

http_extractor = KwargsHTTPDetailsExtractor()

# Example 1: From individual kwargs
kwargs1 = {
    "method": "POST",
    "url": "https://api.example.com/orders",
    "headers": {"Content-Type": "application/json"},
    "params": {"expand": "items"}
}
http1 = http_extractor.extract(dummy_function, (), kwargs1)

print("\nExample 1: From individual kwargs")
print(f"  Method: {http1.method}")
print(f"  URL: {http1.url}")
print(f"  Headers: {http1.headers}")
print(f"  Params: {http1.params}")

# Example 2: From HTTPDetails object
kwargs2 = {
    "http_details": HTTPDetails(
        method="GET",
        url="https://api.example.com/users/123",
        status_code=200
    )
}
http2 = http_extractor.extract(dummy_function, (), kwargs2)

print("\nExample 2: From HTTPDetails object")
print(f"  Method: {http2.method}")
print(f"  URL: {http2.url}")
print(f"  Status: {http2.status_code}")

# Example 3: From dict
kwargs3 = {
    "http_details": {
        "method": "DELETE",
        "url": "/api/items/789"
    }
}
http3 = http_extractor.extract(dummy_function, (), kwargs3)

print("\nExample 3: From dict")
print(f"  Method: {http3.method}")
print(f"  URL: {http3.url}")


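======================================================================
KwargsHTTPDetailsExtractor - Extracting HTTP Details
======================================================================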

Example 1: From individual kwargs
  Method: POST
  URL: https://api.example.com/orders
  Headers: {'Content-Type': 'application/json'}
  Params: {'expand': 'items'}

Example 2: From HTTPDetails object
  Method: GET
  URL: https://api.example.com/users/123
  Status: 200

Example 3: From dict
  Method: DELETE
  URL: /api/items/789
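
The three input shapes above can be handled by one lookup function. The sketch below is an assumption about how such a lookup could be ordered (a packed `http_details` value first, loose kwargs second); the notebook does not show which source wins when both are present. `SketchHTTP` and `extract_http` are hypothetical names.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, Optional


@dataclass
class SketchHTTP:
    """Hypothetical stand-in for HTTPDetails with the fields used above."""
    method: Optional[str] = None
    url: Optional[str] = None
    status_code: Optional[int] = None
    headers: Optional[Dict[str, Any]] = None
    params: Optional[Dict[str, Any]] = None


FIELD_NAMES = tuple(f.name for f in fields(SketchHTTP))


def extract_http(kwargs: Dict[str, Any]) -> Optional[SketchHTTP]:
    packed = kwargs.get("http_details")
    if isinstance(packed, SketchHTTP):
        return packed  # shape 2: ready-made object
    if isinstance(packed, dict):
        # shape 3: dict, keeping only recognized field names
        return SketchHTTP(**{k: v for k, v in packed.items() if k in FIELD_NAMES})
    # shape 1: individual kwargs such as method= and url=
    loose = {k: kwargs[k] for k in FIELD_NAMES if k in kwargs}
    return SketchHTTP(**loose) if loose else None


from_loose = extract_http({"method": "POST", "url": "https://api.example.com/orders"})
from_dict = extract_http({"http_details": {"method": "DELETE", "url": "/api/items/789"}})
from_nothing = extract_http({"other_param": "value"})
```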


---
## Part 4: Custom Entity Extractor for E-Commerce

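To build an extractor tailored to your domain, subclass `EntityExtractor` and implement `extract(func, args, kwargs)`. Return a `LogEntity` when the arguments contain something worth logging, and `None` otherwise.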


In [10]:
print("=" * 70)
print("Custom EntityExtractor - E-Commerce Example")
print("=" * 70)

# Custom extractor for e-commerce orders
class OrderEntityExtractor(EntityExtractor):
    """Extract order entities from function arguments."""
    
    def extract(self, func, args, kwargs):
        # Check for order_id
        if "order_id" in kwargs:
            return LogEntity(
                type="order",
                id=kwargs["order_id"],
                name=f"Order {kwargs['order_id']}",
                additional_info={
                    "total": kwargs.get("total"),
                    "items_count": kwargs.get("items_count")
                }
            )
        # Check for product_id
        elif "product_id" in kwargs:
            return LogEntity(
                type="product",
                id=kwargs["product_id"],
                name=kwargs.get("product_name", "Unknown Product")
            )
        
        return None

# Use the custom extractor
order_extractor = OrderEntityExtractor()

# Example 1: Extract order
order_kwargs = {
    "order_id": "ORD_2024_123",
    "total": 299.99,
    "items_count": 3
}
order_entity = order_extractor.extract(dummy_function, (), order_kwargs)

print("\nExample 1: Order Entity")
print(f"  Type: {order_entity.type}")
print(f"  ID: {order_entity.id}")
print(f"  Name: {order_entity.name}")
print(f"  Total: ${order_entity.additional_info['total']}")
print(f"  Items: {order_entity.additional_info['items_count']}")

# Example 2: Extract product
product_kwargs = {
    "product_id": "PROD_WIDGET_001",
    "product_name": "Super Widget Pro"
}
product_entity = order_extractor.extract(dummy_function, (), product_kwargs)

print("\nExample 2: Product Entity")
print(f"  Type: {product_entity.type}")
print(f"  ID: {product_entity.id}")
print(f"  Name: {product_entity.name}")


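======================================================================
Custom EntityExtractor - E-Commerce Example
======================================================================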

Example 1: Order Entity
  Type: order
  ID: ORD_2024_123
  Name: Order ORD_2024_123
  Total: $299.99
  Items: 3

Example 2: Product Entity
  Type: product
  ID: PROD_WIDGET_001
  Name: Super Widget Pro
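
`OrderEntityExtractor` inspects only `kwargs`, so a call that passes `order_id` positionally would yield no entity. One possible way to cover both cases is to map the call onto parameter names with the standard library's `inspect.signature`. The helper `bound_arguments` and the function `process_order` below are hypothetical and are not part of `dc_logger`.

```python
import inspect
from typing import Any, Callable, Dict, Tuple


def bound_arguments(func: Callable, args: Tuple, kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Map positional and keyword arguments to parameter names."""
    try:
        bound = inspect.signature(func).bind_partial(*args, **kwargs)
    except (TypeError, ValueError):
        # The call does not match the signature, or the signature is
        # unavailable; fall back to the keyword arguments alone.
        return dict(kwargs)
    return dict(bound.arguments)


def process_order(order_id, total=0.0, items_count=0):
    return order_id


# order_id and total are positional here, items_count is a keyword.
named = bound_arguments(process_order, ("ORD_2024_123", 299.99), {"items_count": 3})
```

Inside a custom `extract`, calling `bound_arguments(func, args, kwargs)` first and then looking up `order_id` in the returned dict would make the extractor independent of how the caller passed the argument.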


---
## Summary


1. **Extractor Interfaces** - Abstract base classes for extensibility
2. **KwargsEntityExtractor** - Extract entities from kwargs
3. **KwargsHTTPDetailsExtractor** - Extract HTTP details from multiple sources
4. **KwargsMultiTenantExtractor** - Extract multi-tenant context
5. **DefaultResultProcessor** - Process function results
6. **Custom Extractors** - Domain-specific implementations
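
The result-processing step is described in Part 1 only by its signature, `process(result, http_details) -> (dict, HTTPDetails)`. As a rough sketch of what a processor with that signature might do, the example below copies a `status_code` from the result into the HTTP details. All class names and the copied field are assumptions; the behavior of `DefaultResultProcessor` is not shown in this notebook.

```python
from dataclasses import dataclass, replace
from typing import Any, Dict, Optional, Tuple


@dataclass
class SketchHTTP:
    """Hypothetical stand-in for HTTPDetails."""
    method: Optional[str] = None
    url: Optional[str] = None
    status_code: Optional[int] = None


class SketchResultProcessor:
    """Hypothetical processor matching process(result, http_details)."""

    def process(
        self, result: Any, http_details: Optional[SketchHTTP]
    ) -> Tuple[Dict[str, Any], Optional[SketchHTTP]]:
        context = {"result_type": type(result).__name__}
        status = getattr(result, "status_code", None)
        if status is not None and http_details is not None:
            # Return an updated copy instead of mutating the input.
            http_details = replace(http_details, status_code=status)
        return context, http_details


@dataclass
class FakeResponse:
    """Hypothetical response object carrying only a status code."""
    status_code: int


context, updated = SketchResultProcessor().process(
    FakeResponse(201), SketchHTTP(method="POST", url="/orders")
)
```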

### Key Takeaways

- Extractors follow the **Single Responsibility Principle**: each one extracts exactly one kind of context
- **Custom extractors** for your domain only need to implement `extract()`
- Extractors are **composable** - mix default and custom
- **Dependency Injection** - pass extractors to decorators/loggers
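
Composition and dependency injection, as named in the takeaways above, can be sketched without the library. The decorator `log_with`, the `FirstMatchExtractor` wrapper, and `KeyExtractor` are all hypothetical; the notebook does not show how `dc_logger` decorators accept extractors, so treat this as an illustration of the pattern only.

```python
import functools
from typing import Any, Dict, List, Optional


class KeyExtractor:
    """Hypothetical extractor: describes one kwarg as a small dict."""

    def __init__(self, key: str, entity_type: str):
        self.key = key
        self.entity_type = entity_type

    def extract(self, func, args, kwargs) -> Optional[Dict[str, Any]]:
        if self.key in kwargs:
            return {"type": self.entity_type, "id": kwargs[self.key]}
        return None


class FirstMatchExtractor:
    """Compose extractors: return the first non-None result."""

    def __init__(self, extractors):
        self.extractors = list(extractors)

    def extract(self, func, args, kwargs):
        for extractor in self.extractors:
            found = extractor.extract(func, args, kwargs)
            if found is not None:
                return found
        return None


def log_with(extractor, sink: List[Dict[str, Any]]):
    """Hypothetical decorator that receives its extractor as a dependency."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            entity = extractor.extract(func, args, kwargs)
            sink.append({"function": func.__name__, "entity": entity})
            return func(*args, **kwargs)

        return wrapper

    return decorator


records: List[Dict[str, Any]] = []
combined = FirstMatchExtractor(
    [KeyExtractor("order_id", "order"), KeyExtractor("product_id", "product")]
)


@log_with(combined, records)
def ship(**kwargs):
    return "shipped"


outcome = ship(product_id="PROD_WIDGET_001")
```

Swapping `combined` for a different extractor changes what gets logged without touching `ship` or `log_with`, which is the point of injecting the dependency.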

