# Practical Patterns for AWS with Boto3

## Learning Objectives

By the end of this notebook, you will be able to:

1. Handle AWS errors with botocore exceptions
2. Implement pagination for large result sets
3. Use async operations for better performance
4. Apply best practices for production AWS code
5. Implement common design patterns

---

## 1. Error Handling with Botocore Exceptions

### Exception Hierarchy

```
BaseException
  +-- Exception
        +-- BotoCoreError (base for all botocore errors)
        |     +-- DataNotFoundError
        |     +-- UnknownServiceError
        |     +-- ApiVersionNotFoundError
        |     +-- ...
        +-- ClientError (AWS service errors)
              +-- Contains Error Code and Message
```

### Common Error Codes by Service

In [None]:
# Common AWS error codes reference
AWS_ERROR_CODES = {
    "S3": {
        "NoSuchBucket": "The specified bucket does not exist",
        "NoSuchKey": "The specified key does not exist",
        "BucketAlreadyExists": "Bucket name already taken globally",
        "BucketNotEmpty": "Cannot delete non-empty bucket",
        "AccessDenied": "Access denied to this resource",
    },
    "DynamoDB": {
        "ResourceNotFoundException": "Table or index doesn't exist",
        "ConditionalCheckFailedException": "Condition expression failed",
        "ProvisionedThroughputExceededException": "Request rate too high",
        "ValidationException": "Invalid parameters",
        "ResourceInUseException": "Table is being created/deleted",
    },
    "Lambda": {
        "ResourceNotFoundException": "Function not found",
        "InvalidParameterValueException": "Invalid parameter",
        "TooManyRequestsException": "Rate limit exceeded",
        "ServiceException": "Lambda service error",
    },
    "SQS": {
        "QueueDoesNotExist": "Queue not found",
        "QueueDeletedRecently": "Queue was recently deleted",
        "MessageNotInflight": "Message not currently in flight",
    },
    "General": {
        "ExpiredTokenException": "Security token has expired",
        "UnauthorizedAccess": "Credentials are invalid",
        "ThrottlingException": "Request rate exceeded",
        "RequestLimitExceeded": "API request limit reached",
    }
}

print("Common AWS Error Codes:")
print("=" * 60)
for service, errors in AWS_ERROR_CODES.items():
    print(f"\n{service}:")
    for code, description in errors.items():
        print(f"  {code:40} - {description}")

In [None]:
import boto3
from botocore.exceptions import (
    ClientError,
    BotoCoreError,
    NoCredentialsError,
    PartialCredentialsError,
    ParamValidationError,
    EndpointConnectionError,
    ReadTimeoutError,
    ConnectTimeoutError
)
from typing import Optional, Dict, Any

class AWSErrorHandler:
    """Centralized AWS error handling."""
    
    # Errors that should be retried
    RETRYABLE_ERRORS = {
        'ProvisionedThroughputExceededException',
        'ThrottlingException',
        'TooManyRequestsException',
        'RequestLimitExceeded',
        'ServiceException',
        'ServiceUnavailable',
        'InternalServerError',
        'EC2ThrottledException',
    }
    
    @staticmethod
    def is_retryable(error: ClientError) -> bool:
        """Check if an error should be retried."""
        error_code = error.response['Error']['Code']
        return error_code in AWSErrorHandler.RETRYABLE_ERRORS
    
    @staticmethod
    def get_error_info(error: ClientError) -> Dict[str, Any]:
        """Extract error information from ClientError."""
        return {
            'code': error.response['Error']['Code'],
            'message': error.response['Error']['Message'],
            'request_id': error.response.get('ResponseMetadata', {}).get('RequestId'),
            'http_status': error.response.get('ResponseMetadata', {}).get('HTTPStatusCode'),
            'retryable': AWSErrorHandler.is_retryable(error)
        }
    
    @staticmethod
    def handle_s3_error(error: ClientError, bucket: Optional[str] = None, key: Optional[str] = None) -> Dict[str, Any]:
        """Handle S3-specific errors with context."""
        error_code = error.response['Error']['Code']
        
        if error_code == 'NoSuchBucket':
            return {
                'error': 'Bucket not found',
                'bucket': bucket,
                'suggestion': 'Check bucket name and region'
            }
        elif error_code == 'NoSuchKey':
            return {
                'error': 'Object not found',
                'bucket': bucket,
                'key': key,
                'suggestion': 'Verify the object key is correct'
            }
        elif error_code == 'AccessDenied':
            return {
                'error': 'Access denied',
                'suggestion': 'Check IAM permissions and bucket policy'
            }
        else:
            return AWSErrorHandler.get_error_info(error)

print("AWSErrorHandler class defined")
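
Both helpers read from `error.response`, which is a plain dictionary. The sketch below walks the same lookups on a hand-built dictionary of that shape, so it runs without AWS credentials. The request ID and message are made-up sample values, and `summarize_error_response` is a hypothetical helper, not part of botocore.

```python
from typing import Any, Dict, Set

# Subset of the retryable codes used by AWSErrorHandler above
RETRYABLE_CODES: Set[str] = {"ThrottlingException", "ServiceUnavailable"}

def summarize_error_response(response: Dict[str, Any]) -> Dict[str, Any]:
    """Pull the commonly used fields out of a ClientError-style response dict."""
    error = response.get("Error", {})
    metadata = response.get("ResponseMetadata", {})
    code = error.get("Code", "Unknown")
    return {
        "code": code,
        "message": error.get("Message", ""),
        "request_id": metadata.get("RequestId"),
        "http_status": metadata.get("HTTPStatusCode"),
        "retryable": code in RETRYABLE_CODES,
    }

# Hand-built sample in the shape of ClientError.response (values are illustrative)
sample_response = {
    "Error": {"Code": "NoSuchKey", "Message": "The specified key does not exist."},
    "ResponseMetadata": {"RequestId": "EXAMPLE123", "HTTPStatusCode": 404},
}

summary = summarize_error_response(sample_response)
print(summary["code"], summary["http_status"], summary["retryable"])
```

Using `.get()` with defaults, as here, avoids a `KeyError` if a response arrives without an `Error` block.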

In [None]:
# Comprehensive error handling example
def safe_s3_get(
    bucket: str,
    key: str,
    s3_client=None
) -> Dict[str, Any]:
    """
    Safely get an object from S3 with comprehensive error handling.
    
    Args:
        bucket: S3 bucket name
        key: Object key
        s3_client: Optional S3 client (creates one if not provided)
        
    Returns:
        Dictionary with content or error information
    """
    if s3_client is None:
        try:
            s3_client = boto3.client('s3')
        except NoCredentialsError:
            return {'success': False, 'error': 'No AWS credentials found'}
        except PartialCredentialsError:
            return {'success': False, 'error': 'Incomplete AWS credentials'}
    
    try:
        response = s3_client.get_object(Bucket=bucket, Key=key)
        content = response['Body'].read()
        
        return {
            'success': True,
            'content': content,
            'content_type': response.get('ContentType'),
            'content_length': response.get('ContentLength'),
            'last_modified': response.get('LastModified'),
            'etag': response.get('ETag')
        }
        
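    except ClientError as e:
        error_info = AWSErrorHandler.handle_s3_error(e, bucket, key)
        return {'success': False, **error_info}
    
    except NoCredentialsError:
        # Missing credentials surface when the request is signed,
        # not when the client is created
        return {'success': False, 'error': 'No AWS credentials found'}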
    
    except EndpointConnectionError:
        return {
            'success': False,
            'error': 'Cannot connect to AWS endpoint',
            'suggestion': 'Check internet connection and region'
        }
    
    except (ReadTimeoutError, ConnectTimeoutError) as e:
        return {
            'success': False,
            'error': 'Request timed out',
            'retryable': True
        }
    
    except ParamValidationError as e:
        return {
            'success': False,
            'error': f'Invalid parameters: {e}'
        }
    
    except BotoCoreError as e:
        return {
            'success': False,
            'error': f'AWS SDK error: {e}'
        }

print("Example: Safe S3 operations")
print("-" * 40)
print("""
result = safe_s3_get('my-bucket', 'data/config.json')

if result['success']:
    config = json.loads(result['content'])
    print(f"Loaded config: {config}")
else:
    print(f"Error: {result['error']}")
    if result.get('suggestion'):
        print(f"Suggestion: {result['suggestion']}")
    if result.get('retryable'):
        print("This error can be retried")
""")

In [None]:
# Retry decorator with exponential backoff
import random
import time
import functools
from typing import Callable, Type, Tuple

def retry_with_backoff(
    max_retries: int = 3,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    exponential_base: float = 2.0,
    retryable_exceptions: Tuple[Type[Exception], ...] = (ClientError,)
):
    """
    Decorator for retrying functions with exponential backoff.
    
    Args:
        max_retries: Maximum number of retry attempts
        base_delay: Initial delay between retries (seconds)
        max_delay: Maximum delay between retries
        exponential_base: Base for exponential calculation
        retryable_exceptions: Exceptions to retry on
    """
    def decorator(func: Callable):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            last_exception = None
            
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                    
                except retryable_exceptions as e:
                    last_exception = e
                    
                    # Check if error is retryable
                    if isinstance(e, ClientError):
                        if not AWSErrorHandler.is_retryable(e):
                            raise
                    
                    if attempt < max_retries:
                        # Calculate delay with jitter
                        delay = min(
                            base_delay * (exponential_base ** attempt),
                            max_delay
                        )
                        # Add random jitter (0-25% of delay) so that
                        # concurrent callers do not retry in lockstep
                        jitter = delay * 0.25 * random.random()
                        delay += jitter
                        
                        print(f"Attempt {attempt + 1} failed, "
                              f"retrying in {delay:.2f}s...")
                        time.sleep(delay)
                    else:
                        print(f"Max retries ({max_retries}) exceeded")
                        raise
            
            raise last_exception
        
        return wrapper
    return decorator

# Example usage
@retry_with_backoff(max_retries=3, base_delay=1.0)
def get_dynamodb_item(table_name: str, key: dict):
    """Get item from DynamoDB with automatic retries."""
    dynamodb = boto3.resource('dynamodb')
    table = dynamodb.Table(table_name)
    response = table.get_item(Key=key)
    return response.get('Item')

print("Retry decorator defined")
print("""
Usage:
@retry_with_backoff(max_retries=3)
def my_aws_function():
    # AWS operations here
    pass
""")

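To see which delays the decorator's defaults produce, the backoff formula can be evaluated on its own. This is a minimal sketch that leaves jitter out so the numbers are deterministic; `backoff_delays` is a hypothetical helper name.

```python
from typing import List

def backoff_delays(
    max_retries: int = 3,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    exponential_base: float = 2.0,
) -> List[float]:
    """Return the pre-jitter delay before each retry: min(base * exp**attempt, max)."""
    return [
        min(base_delay * (exponential_base ** attempt), max_delay)
        for attempt in range(max_retries)
    ]

print(backoff_delays())               # [1.0, 2.0, 4.0]
print(backoff_delays(max_retries=8))  # later delays are capped at max_delay
```

With eight retries the last two delays hit the 60-second cap, which is why `max_delay` matters once `max_retries` grows.
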
## 2. Pagination

### Why Pagination?

AWS APIs return limited results per request:
- S3 `list_objects_v2`: Max 1000 objects
- DynamoDB `scan`/`query`: Max 1MB of data
- Lambda `list_functions`: Max 50 functions
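
Paginated APIs share one contract: each response carries a batch of results plus a token for requesting the next batch, and the token is absent on the last page. The sketch below imitates that contract with an in-memory list. `fetch_page` is a hypothetical stand-in for a real API call, and the page size of 3 is chosen only to keep the example small.

```python
from typing import Any, Dict, List, Optional

DATA = [f"item-{i}" for i in range(8)]  # pretend server-side data

def fetch_page(token: Optional[int] = None, page_size: int = 3) -> Dict[str, Any]:
    """Stand-in for a paginated API call: returns one page and a next token."""
    start = token or 0
    end = start + page_size
    page: Dict[str, Any] = {"Items": DATA[start:end]}
    if end < len(DATA):
        page["NextToken"] = end  # more results remain
    return page

def fetch_all() -> List[str]:
    """Follow NextToken until the response no longer includes one."""
    items: List[str] = []
    token: Optional[int] = None
    while True:
        page = fetch_page(token)
        items.extend(page["Items"])
        token = page.get("NextToken")
        if token is None:
            break
    return items

print(len(fetch_all()))  # 8
```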

### Pagination Methods

In [None]:
from typing import Generator, List, Dict, Any

# Method 1: Manual pagination with continuation tokens
def list_s3_objects_manual(bucket: str, prefix: str = '') -> List[Dict]:
    """
    List S3 objects using manual pagination.
    
    Shows the low-level approach - good for understanding,
    but use paginators in production.
    """
    s3 = boto3.client('s3')
    all_objects = []
    continuation_token = None
    
    while True:
        # Build request parameters
        params = {
            'Bucket': bucket,
            'Prefix': prefix,
            'MaxKeys': 1000  # Maximum allowed
        }
        
        if continuation_token:
            params['ContinuationToken'] = continuation_token
        
        response = s3.list_objects_v2(**params)
        
        # Collect objects
        for obj in response.get('Contents', []):
            all_objects.append({
                'key': obj['Key'],
                'size': obj['Size'],
                'last_modified': obj['LastModified']
            })
        
        # Check for more pages
        if response.get('IsTruncated'):
            continuation_token = response['NextContinuationToken']
        else:
            break
    
    return all_objects

print("Manual pagination example defined")

In [None]:
# Method 2: Using paginators (recommended)
def list_s3_objects_paginator(bucket: str, prefix: str = '') -> Generator:
    """
    List S3 objects using boto3 paginators.
    
    This is the recommended approach - cleaner and handles
    all pagination logic automatically.
    """
    s3 = boto3.client('s3')
    paginator = s3.get_paginator('list_objects_v2')
    
    # Create page iterator
    page_iterator = paginator.paginate(
        Bucket=bucket,
        Prefix=prefix,
        PaginationConfig={
            'MaxItems': 10000,  # Total max items
            'PageSize': 1000    # Items per page
        }
    )
    
    # Yield objects one at a time
    for page in page_iterator:
        for obj in page.get('Contents', []):
            yield {
                'key': obj['Key'],
                'size': obj['Size'],
                'last_modified': obj['LastModified']
            }

print("Paginator example defined")
print("""
Usage:
for obj in list_s3_objects_paginator('my-bucket', 'data/'):
    print(f"Processing: {obj['key']}")
""")

In [None]:
# Method 3: Using JMESPath for client-side filtering
def list_large_objects(
    bucket: str,
    prefix: str = '',
    min_size_mb: int = 100
) -> Generator:
    """
    List objects larger than specified size using JMESPath filtering.
    
    The expression is evaluated client-side on each page after it is
    received, so it simplifies the code but does not reduce data transfer.
    """
    s3 = boto3.client('s3')
    paginator = s3.get_paginator('list_objects_v2')
    
    min_size_bytes = min_size_mb * 1024 * 1024
    
    # Use JMESPath to filter results
    page_iterator = paginator.paginate(
        Bucket=bucket,
        Prefix=prefix
    ).search(f"Contents[?Size > `{min_size_bytes}`]")
    
    for obj in page_iterator:
        if obj:  # JMESPath can return None
            yield {
                'key': obj['Key'],
                'size_mb': obj['Size'] / (1024 * 1024)
            }

print("JMESPath filtering example defined")
print("""
Usage:
for obj in list_large_objects('my-bucket', min_size_mb=100):
    print(f"{obj['key']}: {obj['size_mb']:.1f} MB")
""")

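The paginator's `search` method evaluates the JMESPath expression on each page after the page has been received, so its effect matches a plain Python filter applied page by page. The sketch below shows that equivalent filter on hand-built pages. The page contents are made-up sample data and `filter_large_objects` is a hypothetical helper.

```python
from typing import Any, Dict, Iterable, Iterator

def filter_large_objects(
    pages: Iterable[Dict[str, Any]], min_size_bytes: int
) -> Iterator[Dict[str, Any]]:
    """Yield objects whose Size exceeds the threshold, one page at a time."""
    for page in pages:
        # A page with no matching keys has no "Contents" entry at all
        for obj in page.get("Contents", []):
            if obj["Size"] > min_size_bytes:
                yield obj

# Sample pages shaped like list_objects_v2 responses (sizes in bytes)
sample_pages = [
    {"Contents": [{"Key": "a.bin", "Size": 50}, {"Key": "b.bin", "Size": 500}]},
    {},  # empty page: no "Contents" key
    {"Contents": [{"Key": "c.bin", "Size": 900}]},
]

large = [obj["Key"] for obj in filter_large_objects(sample_pages, min_size_bytes=100)]
print(large)  # ['b.bin', 'c.bin']
```
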
In [None]:
# DynamoDB pagination example
def scan_dynamodb_table(table_name: str) -> Generator:
    """
    Scan entire DynamoDB table using pagination.
    
    Warning: Full table scans are expensive - use queries when possible!
    """
    dynamodb = boto3.client('dynamodb')
    paginator = dynamodb.get_paginator('scan')
    
    for page in paginator.paginate(TableName=table_name):
        for item in page.get('Items', []):
            # Convert DynamoDB format to Python dict
            yield deserialize_dynamodb_item(item)

def deserialize_dynamodb_item(item: dict) -> dict:
    """Convert DynamoDB item format to regular Python dict."""
    from boto3.dynamodb.types import TypeDeserializer
    deserializer = TypeDeserializer()
    return {k: deserializer.deserialize(v) for k, v in item.items()}

print("DynamoDB pagination example defined")

In [None]:
# Generic paginator wrapper
class AWSPaginator:
    """Generic wrapper for AWS pagination with progress tracking."""
    
    def __init__(self, client, operation: str):
        """
        Initialize paginator.
        
        Args:
            client: Boto3 client
            operation: API operation name
        """
        self.paginator = client.get_paginator(operation)
        self.total_items = 0
        self.total_pages = 0
    
    def paginate(
        self,
        result_key: str,
        max_items: int = None,
        show_progress: bool = False,
        **kwargs
    ) -> Generator:
        """
        Paginate through results.
        
        Args:
            result_key: Key containing results in response
            max_items: Maximum total items to return
            show_progress: Print progress updates
            **kwargs: Arguments passed to the paginate call
        """
        config = {}
        if max_items:
            config['MaxItems'] = max_items
        
        if config:
            kwargs['PaginationConfig'] = config
        
        for page in self.paginator.paginate(**kwargs):
            self.total_pages += 1
            items = page.get(result_key, [])
            
            for item in items:
                self.total_items += 1
                yield item
                
                if show_progress and self.total_items % 1000 == 0:
                    print(f"Processed {self.total_items} items...")
        
        if show_progress:
            print(f"Complete: {self.total_items} items in {self.total_pages} pages")

print("AWSPaginator class defined")
print("""
Usage:
s3 = boto3.client('s3')
paginator = AWSPaginator(s3, 'list_objects_v2')

for obj in paginator.paginate(
    result_key='Contents',
    Bucket='my-bucket',
    show_progress=True
):
    process(obj)

print(f"Total: {paginator.total_items}")
""")

## 3. Async Operations

### Why Async?

- **Concurrent requests**: Process multiple items in parallel
- **Better throughput**: Don't wait for each request to complete
- **Resource efficiency**: Handle I/O-bound operations efficiently

In [None]:
import asyncio
from concurrent.futures import ThreadPoolExecutor
from typing import List, Callable, Any

class AsyncAWSClient:
    """
    Async wrapper for boto3 operations using ThreadPoolExecutor.
    
    Note: boto3 is not truly async, but we can use threads
    to achieve concurrency for I/O-bound operations.
    """
    
    def __init__(self, max_workers: int = 10):
        """
        Initialize async client.
        
        Args:
            max_workers: Maximum concurrent operations
        """
        self.executor = ThreadPoolExecutor(max_workers=max_workers)
    
    async def run_async(self, func: Callable, *args, **kwargs) -> Any:
        """
        Run a synchronous function asynchronously.
        
        Args:
            func: Function to run
            *args, **kwargs: Arguments for the function
            
        Returns:
            Function result
        """
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(
            self.executor,
            lambda: func(*args, **kwargs)
        )
    
    async def batch_operation(
        self,
        func: Callable,
        items: List[Any],
        batch_size: int = 10
    ) -> List[Any]:
        """
        Execute operation on multiple items concurrently.
        
        Args:
            func: Function to apply to each item
            items: List of items to process
            batch_size: Number of concurrent operations
            
        Returns:
            List of results
        """
        results = []
        
        for i in range(0, len(items), batch_size):
            batch = items[i:i + batch_size]
            tasks = [self.run_async(func, item) for item in batch]
            batch_results = await asyncio.gather(*tasks, return_exceptions=True)
            results.extend(batch_results)
        
        return results
    
    def close(self):
        """Shutdown the executor."""
        self.executor.shutdown(wait=True)

print("AsyncAWSClient defined")
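
The same thread-pool technique can be tried without AWS access by substituting a blocking function for the boto3 call. In this sketch `slow_fetch` is a hypothetical stand-in that sleeps briefly to imitate network latency. The coroutine is defined but not run here, because a notebook already has a running event loop.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor
from functools import partial
from typing import List

def slow_fetch(key: str, delay: float = 0.05) -> str:
    """Blocking stand-in for an I/O-bound call such as s3.get_object."""
    time.sleep(delay)
    return f"fetched:{key}"

async def fetch_many(keys: List[str], max_workers: int = 4) -> List[str]:
    """Run the blocking calls in worker threads and await them together."""
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [
            loop.run_in_executor(executor, partial(slow_fetch, key))
            for key in keys
        ]
        # gather preserves input order regardless of completion order
        return await asyncio.gather(*futures)
```

In a notebook cell, call it with `results = await fetch_many(['a', 'b', 'c'])`; in a script, use `asyncio.run(fetch_many(['a', 'b', 'c']))`.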

In [None]:
# Example: Async S3 downloads
async def download_files_async(bucket: str, keys: List[str]) -> List[Dict]:
    """
    Download multiple S3 files concurrently.
    
    Args:
        bucket: S3 bucket name
        keys: List of object keys to download
        
    Returns:
        List of download results
    """
    s3 = boto3.client('s3')
    client = AsyncAWSClient(max_workers=10)
    
    def download_one(key: str) -> Dict:
        """Download a single file."""
        try:
            response = s3.get_object(Bucket=bucket, Key=key)
            content = response['Body'].read()
            return {
                'key': key,
                'success': True,
                'size': len(content),
                'content': content
            }
        except ClientError as e:
            return {
                'key': key,
                'success': False,
                'error': str(e)
            }
    
    results = await client.batch_operation(download_one, keys, batch_size=10)
    client.close()
    
    return results

print("Async S3 download example defined")
print("""
Usage:
keys = ['file1.json', 'file2.json', 'file3.json']
results = await download_files_async('my-bucket', keys)

for result in results:
    if result['success']:
        print(f"Downloaded {result['key']}: {result['size']} bytes")
    else:
        print(f"Failed {result['key']}: {result['error']}")
""")

In [None]:
# Example: Async DynamoDB batch writes
async def batch_write_dynamodb_async(
    table_name: str,
    items: List[Dict],
    batch_size: int = 25
) -> Dict[str, int]:
    """
    Write items to DynamoDB in parallel batches.
    
    Args:
        table_name: DynamoDB table name
        items: Items to write
        batch_size: Items per batch (max 25 for DynamoDB)
        
    Returns:
        Statistics dictionary
    """
    dynamodb = boto3.resource('dynamodb')
    table = dynamodb.Table(table_name)
    
    stats = {'written': 0, 'failed': 0}
    client = AsyncAWSClient(max_workers=5)
    
    def write_batch(batch_items: List[Dict]) -> Dict:
        """Write a batch of items."""
        try:
            with table.batch_writer() as batch:
                for item in batch_items:
                    batch.put_item(Item=item)
            return {'success': True, 'count': len(batch_items)}
        except Exception as e:
            return {'success': False, 'error': str(e)}
    
    # Split into batches
    batches = [
        items[i:i + batch_size]
        for i in range(0, len(items), batch_size)
    ]
    
    results = await client.batch_operation(write_batch, batches, batch_size=5)
    client.close()
    
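    # Pair each result with its batch so a short final batch is counted correctly
    for batch_items, result in zip(batches, results):
        if isinstance(result, dict) and result.get('success'):
            stats['written'] += result['count']
        else:
            stats['failed'] += len(batch_items)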
    
    return stats

print("Async DynamoDB batch write example defined")

## 4. Best Practices

### Resource Management

In [None]:
# Best Practice 1: Connection pooling and client reuse
from botocore.config import Config

class AWSClientFactory:
    """
    Factory for creating and reusing AWS clients.
    
    Creating clients is expensive - reuse them!
    """
    
    _clients = {}
    _config = Config(
        max_pool_connections=50,
        retries={
            'max_attempts': 3,
            'mode': 'adaptive'
        }
    )
    
    @classmethod
    def get_client(cls, service: str, region: str = 'us-east-1'):
        """
        Get or create a client for a service.
        
        Args:
            service: AWS service name
            region: AWS region
            
        Returns:
            Boto3 client
        """
        key = f"{service}:{region}"
        
        if key not in cls._clients:
            cls._clients[key] = boto3.client(
                service,
                region_name=region,
                config=cls._config
            )
        
        return cls._clients[key]
    
    @classmethod
    def get_resource(cls, service: str, region: str = 'us-east-1'):
        """Get or create a resource for a service."""
        key = f"{service}:resource:{region}"
        
        if key not in cls._clients:
            cls._clients[key] = boto3.resource(
                service,
                region_name=region,
                config=cls._config
            )
        
        return cls._clients[key]

print("AWSClientFactory defined")
print("""
# Good: Reuse clients
s3 = AWSClientFactory.get_client('s3')
dynamodb = AWSClientFactory.get_resource('dynamodb')

# Bad: Creating new clients repeatedly
for item in items:
    s3 = boto3.client('s3')  # Don't do this!
    s3.put_object(...)
""")

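The caching logic itself is independent of boto3. The sketch below shows the same keyed-cache idea with a counter in place of `boto3.client`, which makes the reuse visible. `FakeClient` and `get_cached_client` are hypothetical names used only for illustration.

```python
from typing import Dict, Tuple

class FakeClient:
    """Stand-in for an expensive-to-create client; counts constructions."""
    created = 0

    def __init__(self, service: str, region: str):
        FakeClient.created += 1
        self.service = service
        self.region = region

_cache: Dict[Tuple[str, str], FakeClient] = {}

def get_cached_client(service: str, region: str = "us-east-1") -> FakeClient:
    """Return the cached client for (service, region), creating it once."""
    key = (service, region)
    if key not in _cache:
        _cache[key] = FakeClient(service, region)
    return _cache[key]

first = get_cached_client("s3")
second = get_cached_client("s3")
other = get_cached_client("s3", region="eu-west-1")
print(first is second, FakeClient.created)  # True 2
```

Two constructions occur for three calls: the second `s3` request reuses the first object, while the different region gets its own entry.
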
In [None]:
# Best Practice 2: Context managers for resources
from contextlib import contextmanager

@contextmanager
def s3_streaming_upload(bucket: str, key: str):
    """
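    Context manager that buffers data in memory and uploads it to S3 on exit.
    
    The payload is held in a BytesIO buffer until the block exits, so it must
    fit in memory. upload_fileobj switches to multipart upload automatically
    for large payloads. If the block raises, nothing is uploaded.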
    """
    import io
    s3 = boto3.client('s3')
    buffer = io.BytesIO()
    
    try:
        yield buffer
        buffer.seek(0)
        s3.upload_fileobj(buffer, bucket, key)
    finally:
        buffer.close()

print("S3 streaming upload context manager defined")
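print("""
Usage:
with s3_streaming_upload('my-bucket', 'output.csv') as f:
    # f is a binary buffer, so wrap it for csv, which writes text
    text = io.TextIOWrapper(f, encoding='utf-8', newline='')
    writer = csv.writer(text)
    for row in data:
        writer.writerow(row)
    text.flush()
    text.detach()  # release f so the wrapper does not close it
# Automatically uploads when context exits
""")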

In [None]:
# Best Practice 3: Structured logging
import logging
import json
from datetime import datetime, timezone

class AWSOperationLogger:
    """Structured logging for AWS operations."""
    
    def __init__(self, name: str):
        self.logger = logging.getLogger(name)
        self.logger.setLevel(logging.INFO)
        
        # Add handler if not exists
        if not self.logger.handlers:
            handler = logging.StreamHandler()
            handler.setFormatter(logging.Formatter(
                '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
            ))
            self.logger.addHandler(handler)
    
    def log_operation(
        self,
        operation: str,
        service: str,
        success: bool,
        duration_ms: float = None,
        **extra
    ):
        """Log an AWS operation with structured data."""
        log_data = {
            # datetime.utcnow() is deprecated; use a timezone-aware UTC timestamp
            'timestamp': datetime.now(timezone.utc).isoformat(),
            'operation': operation,
            'service': service,
            'success': success,
            **extra
        }
        
        if duration_ms is not None:
            log_data['duration_ms'] = round(duration_ms, 2)
        
        level = logging.INFO if success else logging.ERROR
        self.logger.log(level, json.dumps(log_data))
    
    def log_error(self, operation: str, service: str, error: Exception, **extra):
        """Log an error."""
        self.log_operation(
            operation=operation,
            service=service,
            success=False,
            error_type=type(error).__name__,
            error_message=str(error),
            **extra
        )

print("AWSOperationLogger defined")
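
One way to obtain `duration_ms` is to time the call with a context manager. The sketch below is self-contained and uses only the standard library. `timed_operation` is a hypothetical helper, and the record it emits mirrors a subset of the fields written by `log_operation` above.

```python
import json
import logging
import time
from contextlib import contextmanager
from typing import Iterator

@contextmanager
def timed_operation(logger: logging.Logger, operation: str, service: str) -> Iterator[None]:
    """Time the wrapped block and log one JSON record when it finishes."""
    start = time.perf_counter()
    success = True
    try:
        yield
    except Exception:
        success = False
        raise  # re-raise so callers still see the failure
    finally:
        record = {
            "operation": operation,
            "service": service,
            "success": success,
            "duration_ms": round((time.perf_counter() - start) * 1000, 2),
        }
        logger.log(logging.INFO if success else logging.ERROR, json.dumps(record))

logging.basicConfig(level=logging.INFO)
demo_logger = logging.getLogger("aws.demo")
demo_logger.setLevel(logging.INFO)

with timed_operation(demo_logger, "get_object", "s3"):
    time.sleep(0.01)  # stand-in for the real AWS call
```

The `finally` clause ensures a record is written for failed calls too, at `ERROR` level.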

In [None]:
# Best Practice 4: Configuration management
from dataclasses import dataclass, field
from typing import Optional
import os

@dataclass
class AWSConfig:
    """Configuration for AWS operations."""
    
    region: str = field(default_factory=lambda: os.environ.get('AWS_REGION', 'us-east-1'))
    profile: Optional[str] = field(default_factory=lambda: os.environ.get('AWS_PROFILE'))
    max_retries: int = 3
    connect_timeout: int = 5
    read_timeout: int = 30
    max_pool_connections: int = 25
    
    # Service-specific settings
    s3_bucket: Optional[str] = field(default_factory=lambda: os.environ.get('S3_BUCKET'))
    dynamodb_table: Optional[str] = field(default_factory=lambda: os.environ.get('DYNAMODB_TABLE'))
    sqs_queue_url: Optional[str] = field(default_factory=lambda: os.environ.get('SQS_QUEUE_URL'))
    
    def get_boto_config(self) -> Config:
        """Get botocore Config object."""
        return Config(
            region_name=self.region,
            retries={'max_attempts': self.max_retries, 'mode': 'adaptive'},
            connect_timeout=self.connect_timeout,
            read_timeout=self.read_timeout,
            max_pool_connections=self.max_pool_connections
        )
    
    def create_session(self) -> boto3.Session:
        """Create a boto3 session with this config."""
        kwargs = {'region_name': self.region}
        if self.profile:
            kwargs['profile_name'] = self.profile
        return boto3.Session(**kwargs)
    
    @classmethod
    def from_env(cls) -> 'AWSConfig':
        """Create config from environment variables."""
        return cls(
            region=os.environ.get('AWS_REGION', 'us-east-1'),
            profile=os.environ.get('AWS_PROFILE'),
            max_retries=int(os.environ.get('AWS_MAX_RETRIES', '3')),
            s3_bucket=os.environ.get('S3_BUCKET'),
            dynamodb_table=os.environ.get('DYNAMODB_TABLE'),
            sqs_queue_url=os.environ.get('SQS_QUEUE_URL')
        )

print("AWSConfig class defined")
print("""
Usage:
# From environment
config = AWSConfig.from_env()

# Create session with config
session = config.create_session()
s3 = session.client('s3', config=config.get_boto_config())

# Use configured values
s3.list_objects_v2(Bucket=config.s3_bucket)
""")

In [None]:
# Best Practice 5: Health checks and monitoring
from dataclasses import dataclass
from typing import List, Dict

@dataclass
class HealthCheckResult:
    service: str
    healthy: bool
    latency_ms: float
    message: str = ""

class AWSHealthChecker:
    """Check health of AWS services."""
    
    def __init__(self, config: AWSConfig = None):
        self.config = config or AWSConfig()
        self.session = self.config.create_session()
    
    def check_s3(self, bucket: str = None) -> HealthCheckResult:
        """Check S3 connectivity."""
        bucket = bucket or self.config.s3_bucket
        s3 = self.session.client('s3')
        
        start = time.time()
        try:
            s3.head_bucket(Bucket=bucket)
            latency = (time.time() - start) * 1000
            return HealthCheckResult('s3', True, latency, f"Bucket {bucket} accessible")
        except (ClientError, BotoCoreError) as e:
            latency = (time.time() - start) * 1000
            return HealthCheckResult('s3', False, latency, str(e))
    
    def check_dynamodb(self, table: str = None) -> HealthCheckResult:
        """Check DynamoDB connectivity."""
        table = table or self.config.dynamodb_table
        dynamodb = self.session.client('dynamodb')
        
        start = time.time()
        try:
            dynamodb.describe_table(TableName=table)
            latency = (time.time() - start) * 1000
            return HealthCheckResult('dynamodb', True, latency, f"Table {table} accessible")
        except (ClientError, BotoCoreError) as e:
            latency = (time.time() - start) * 1000
            return HealthCheckResult('dynamodb', False, latency, str(e))
    
    def check_sqs(self, queue_url: str = None) -> HealthCheckResult:
        """Check SQS connectivity."""
        queue_url = queue_url or self.config.sqs_queue_url
        sqs = self.session.client('sqs')
        
        start = time.time()
        try:
            sqs.get_queue_attributes(
                QueueUrl=queue_url,
                AttributeNames=['ApproximateNumberOfMessages']
            )
            latency = (time.time() - start) * 1000
            return HealthCheckResult('sqs', True, latency, "Queue accessible")
        except (ClientError, BotoCoreError) as e:
            latency = (time.time() - start) * 1000
            return HealthCheckResult('sqs', False, latency, str(e))
    
    def check_all(self) -> Dict[str, HealthCheckResult]:
        """Check all configured services."""
        results = {}
        
        if self.config.s3_bucket:
            results['s3'] = self.check_s3()
        
        if self.config.dynamodb_table:
            results['dynamodb'] = self.check_dynamodb()
        
        if self.config.sqs_queue_url:
            results['sqs'] = self.check_sqs()
        
        return results

print("AWSHealthChecker class defined")
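
The dictionary returned by `check_all` is convenient to collapse into a single status, for example for a readiness endpoint. This is a minimal sketch, assuming results shaped like `HealthCheckResult`. The dataclass is redeclared with the same fields only so the block runs on its own, and `summarize_health` and the sample latencies are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict

@dataclass
class HealthCheckResult:
    service: str
    healthy: bool
    latency_ms: float
    message: str = ""

def summarize_health(results: Dict[str, HealthCheckResult]) -> Dict[str, Any]:
    """Collapse per-service results into one overall status."""
    unhealthy = sorted(name for name, r in results.items() if not r.healthy)
    return {
        "healthy": not unhealthy,
        "unhealthy_services": unhealthy,
        "max_latency_ms": max((r.latency_ms for r in results.values()), default=0.0),
    }

# Sample results with made-up latencies and messages
sample = {
    "s3": HealthCheckResult("s3", True, 42.0, "Bucket accessible"),
    "sqs": HealthCheckResult("sqs", False, 310.5, "AccessDenied"),
}
print(summarize_health(sample))
```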

---

## Exercises

### Exercise 1: Implement a Retry Handler

Create a comprehensive retry handler that handles different types of AWS errors.

In [None]:
# Exercise 1: Your code here
from typing import Callable, Any, Optional

class SmartRetryHandler:
    """Intelligent retry handler for AWS operations."""
    
    def __init__(self, max_retries: int = 3, base_delay: float = 1.0):
        pass
    
    def execute(self, func: Callable, *args, **kwargs) -> Any:
        """
        Execute function with smart retry logic.
        
        Should handle:
        - Throttling errors (with longer backoff)
        - Transient errors (with standard backoff)
        - Non-retryable errors (fail immediately)
        """
        pass

<details>
<summary>Click to see solution</summary>

```python
import time
import random
from typing import Callable, Any, Set, Dict
from botocore.exceptions import ClientError
import logging

logger = logging.getLogger(__name__)

class SmartRetryHandler:
    """Intelligent retry handler for AWS operations."""
    
    # Errors that indicate throttling - use longer backoff
    THROTTLING_ERRORS: Set[str] = {
        'Throttling',
        'ThrottlingException',
        'ThrottledException',
        'ProvisionedThroughputExceededException',
        'RequestLimitExceeded',
        'TooManyRequestsException',
        'SlowDown',
        'EC2ThrottledException',
    }
    
    # Transient errors that can be retried
    TRANSIENT_ERRORS: Set[str] = {
        'ServiceException',
        'ServiceUnavailable',
        'InternalError',
        'InternalServerError',
        'RequestTimeout',
    }
    
    def __init__(
        self,
        max_retries: int = 3,
        base_delay: float = 1.0,
        max_delay: float = 60.0,
        throttle_multiplier: float = 2.0
    ):
        """
        Initialize retry handler.
        
        Args:
            max_retries: Maximum retry attempts
            base_delay: Base delay in seconds
            max_delay: Maximum delay cap
            throttle_multiplier: Extra multiplier for throttling errors
        """
        self.max_retries = max_retries
        self.base_delay = base_delay
        self.max_delay = max_delay
        self.throttle_multiplier = throttle_multiplier
        self._stats = {
            'attempts': 0,
            'retries': 0,
            'throttles': 0,
            'failures': 0
        }
    
    def _get_error_code(self, error: ClientError) -> str:
        """Extract error code from ClientError."""
        return error.response.get('Error', {}).get('Code', 'Unknown')
    
    def _is_retryable(self, error: ClientError) -> bool:
        """Check if error should be retried."""
        code = self._get_error_code(error)
        return code in self.THROTTLING_ERRORS or code in self.TRANSIENT_ERRORS
    
    def _is_throttling(self, error: ClientError) -> bool:
        """Check if error is a throttling error."""
        return self._get_error_code(error) in self.THROTTLING_ERRORS
    
    def _calculate_delay(self, attempt: int, is_throttle: bool) -> float:
        """
        Calculate delay with exponential backoff and jitter.
        
        Args:
            attempt: Current attempt number (0-indexed)
            is_throttle: Whether this is a throttling error
        """
        # Exponential backoff
        delay = self.base_delay * (2 ** attempt)
        
        # Extra delay for throttling
        if is_throttle:
            delay *= self.throttle_multiplier
        
        # Add jitter (0-25%)
        jitter = delay * random.uniform(0, 0.25)
        delay += jitter
        
        # Cap at max delay
        return min(delay, self.max_delay)
    
    def execute(self, func: Callable, *args, **kwargs) -> Any:
        """
        Execute function with smart retry logic.
        
        Args:
            func: Function to execute
            *args, **kwargs: Arguments to pass to function
            
        Returns:
            Function result
            
        Raises:
            Last exception if all retries fail
        """
        last_error = None
        
        for attempt in range(self.max_retries + 1):
            self._stats['attempts'] += 1
            
            try:
                return func(*args, **kwargs)
                
            except ClientError as e:
                last_error = e
                error_code = self._get_error_code(e)
                
                # Check if retryable
                if not self._is_retryable(e):
                    logger.error(f"Non-retryable error: {error_code}")
                    self._stats['failures'] += 1
                    raise
                
                # Check if we have retries left
                if attempt >= self.max_retries:
                    logger.error(f"Max retries exceeded for {error_code}")
                    self._stats['failures'] += 1
                    raise
                
                # Track throttling
                is_throttle = self._is_throttling(e)
                if is_throttle:
                    self._stats['throttles'] += 1
                
                # Calculate and apply delay
                delay = self._calculate_delay(attempt, is_throttle)
                
                logger.warning(
                    f"Attempt {attempt + 1} failed with {error_code}, "
                    f"retrying in {delay:.2f}s "
                    f"({'throttled' if is_throttle else 'transient'})"
                )
                
                self._stats['retries'] += 1
                time.sleep(delay)
        
        raise last_error
    
    def get_stats(self) -> Dict[str, int]:
        """Get retry statistics."""
        return self._stats.copy()
    
    def reset_stats(self):
        """Reset statistics."""
        self._stats = {
            'attempts': 0,
            'retries': 0,
            'throttles': 0,
            'failures': 0
        }

# Usage example
retry_handler = SmartRetryHandler(max_retries=5)

def get_item(table_name, key):
    dynamodb = boto3.client('dynamodb')
    return dynamodb.get_item(TableName=table_name, Key=key)

# result = retry_handler.execute(get_item, 'my-table', {'id': {'S': '123'}})
print(f"Stats: {retry_handler.get_stats()}")
```
</details>
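
To see how the delay grows, the backoff arithmetic from `_calculate_delay` can be isolated into a standalone function. The sketch below uses a hypothetical helper, `backoff_delay`, which is not part of boto3 or of the solution class. It takes the jitter fraction as a parameter so that the schedule can be printed deterministically.

```python
import random


def backoff_delay(
    attempt: int,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    throttle_multiplier: float = 2.0,
    is_throttle: bool = False,
    jitter_fraction: float = 0.0,
) -> float:
    """Return the delay in seconds before retry number `attempt` (0-indexed).

    Hypothetical helper mirroring the solution's arithmetic: exponential
    growth, optional throttle multiplier, proportional jitter, then a cap.
    """
    delay = base_delay * (2 ** attempt)
    if is_throttle:
        delay *= throttle_multiplier
    delay += delay * jitter_fraction
    return min(delay, max_delay)


# Deterministic schedule (no jitter) for the first five retries
transient = [backoff_delay(a) for a in range(5)]
throttled = [backoff_delay(a, is_throttle=True) for a in range(5)]
print("transient:", transient)  # [1.0, 2.0, 4.0, 8.0, 16.0]
print("throttled:", throttled)  # [2.0, 4.0, 8.0, 16.0, 32.0]

# With jitter, each delay lands between 100% and 125% of the base schedule
jittered = backoff_delay(2, jitter_fraction=random.uniform(0, 0.25))
print(f"attempt 2 with jitter: {jittered:.2f}s")
```

With these defaults, five retries of a transient error wait at least 31 seconds in total before giving up, which matters when the caller has its own timeout.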

### Exercise 2: Implement a Paginated Iterator

Create a generic iterator that handles pagination for any AWS service.

In [None]:
# Exercise 2: Your code here
from typing import Iterator, Any, Callable

class GenericAWSIterator:
    """Generic paginated iterator for AWS services."""
    
    def __init__(
        self,
        client,
        operation: str,
        result_key: str,
        **kwargs
    ):
        pass
    
    def __iter__(self) -> Iterator[Any]:
        pass
    
    def filter(self, predicate: Callable[[Any], bool]) -> 'GenericAWSIterator':
        """Add client-side filter."""
        pass
    
    def limit(self, n: int) -> 'GenericAWSIterator':
        """Limit results."""
        pass

<details>
<summary>Click to see solution</summary>

```python
from typing import Iterator, Any, Callable, Optional, List
import boto3

class GenericAWSIterator:
    """
    Generic paginated iterator for AWS services.
    
    Supports filtering, limiting, and chaining operations.
    """
    
    def __init__(
        self,
        client,
        operation: str,
        result_key: str,
        **kwargs
    ):
        """
        Initialize iterator.
        
        Args:
            client: Boto3 client
            operation: Paginator operation name
            result_key: Key containing results in response
            **kwargs: Arguments to pass to paginate()
        """
        self.client = client
        self.operation = operation
        self.result_key = result_key
        self.kwargs = kwargs
        
        self._filters: List[Callable[[Any], bool]] = []
        self._limit: Optional[int] = None
        self._transform: Optional[Callable[[Any], Any]] = None
        self._count = 0
    
    def __iter__(self) -> Iterator[Any]:
        """Iterate through paginated results."""
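        self._count = 0
        
        # A non-positive limit yields nothing, so skip the API call entirely
        if self._limit is not None and self._limit <= 0:
            return
        
        paginator = self.client.get_paginator(self.operation)
        
        for page in paginator.paginate(**self.kwargs):
            items = page.get(self.result_key, [])
            
            for item in items:
                # Skip items rejected by any filter
                if not all(f(item) for f in self._filters):
                    continue
                
                self._count += 1
                
                # Apply transform if set
                if self._transform:
                    yield self._transform(item)
                else:
                    yield item
                
                # Stop as soon as the limit is reached so that no
                # further pages are requested
                if self._limit is not None and self._count >= self._limit:
                    return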
    
    def filter(self, predicate: Callable[[Any], bool]) -> 'GenericAWSIterator':
        """
        Add a filter predicate.
        
        Args:
            predicate: Function that returns True to keep item
            
        Returns:
            Self for chaining
        """
        self._filters.append(predicate)
        return self
    
    def limit(self, n: int) -> 'GenericAWSIterator':
        """
        Limit number of results.
        
        Args:
            n: Maximum number of results
            
        Returns:
            Self for chaining
        """
        self._limit = n
        return self
    
    def transform(self, func: Callable[[Any], Any]) -> 'GenericAWSIterator':
        """
        Transform each result.
        
        Args:
            func: Transform function
            
        Returns:
            Self for chaining
        """
        self._transform = func
        return self
    
    def to_list(self) -> List[Any]:
        """Collect all results into a list."""
        return list(self)
    
    def first(self) -> Optional[Any]:
        """Get first result or None."""
        for item in self:
            return item
        return None
    
    def count(self) -> int:
        """Count all matching results."""
        return sum(1 for _ in self)
    
    @property
    def items_yielded(self) -> int:
        """Number of items yielded so far."""
        return self._count

# Example usage
s3 = boto3.client('s3')

# List large files in a bucket
large_files = (
    GenericAWSIterator(s3, 'list_objects_v2', 'Contents', Bucket='my-bucket')
    .filter(lambda obj: obj['Size'] > 1024 * 1024)  # > 1MB
    .transform(lambda obj: {'key': obj['Key'], 'size_mb': obj['Size'] / (1024*1024)})
    .limit(100)
)

# for file in large_files:
#     print(f"{file['key']}: {file['size_mb']:.1f} MB")
```
</details>
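
The pagination pattern can be exercised without AWS credentials by substituting a stub for the client. The sketch below assumes only that the client exposes `get_paginator(name)` returning an object with `paginate(**kwargs)`, which is the interface the solution relies on. `FakePaginator`, `FakeClient`, and `iter_items` are hypothetical names used for illustration.

```python
from typing import Any, Dict, Iterator, List


class FakePaginator:
    """Hypothetical stand-in for a boto3 paginator: yields canned pages."""

    def __init__(self, pages: List[Dict[str, Any]]):
        self.pages = pages
        self.pages_served = 0

    def paginate(self, **kwargs) -> Iterator[Dict[str, Any]]:
        for page in self.pages:
            self.pages_served += 1
            yield page


class FakeClient:
    """Hypothetical stand-in for a boto3 client."""

    def __init__(self, pages: List[Dict[str, Any]]):
        self.paginator = FakePaginator(pages)

    def get_paginator(self, operation: str) -> FakePaginator:
        return self.paginator


def iter_items(client, operation: str, result_key: str, **kwargs) -> Iterator[Any]:
    """Flatten every page of a paginated operation into one stream of items."""
    paginator = client.get_paginator(operation)
    for page in paginator.paginate(**kwargs):
        # A page may omit the key entirely (e.g. an empty S3 listing)
        yield from page.get(result_key, [])


pages = [
    {"Contents": [{"Key": "a.bin", "Size": 5_000_000}, {"Key": "b.txt", "Size": 10}]},
    {},  # empty page: no 'Contents' key
    {"Contents": [{"Key": "c.bin", "Size": 2_000_000}]},
]
client = FakeClient(pages)

large_keys = [
    obj["Key"]
    for obj in iter_items(client, "list_objects_v2", "Contents", Bucket="my-bucket")
    if obj["Size"] > 1024 * 1024
]
print(large_keys)  # ['a.bin', 'c.bin']
```

Because the stub records `pages_served`, the same approach can check how many pages an iterator requests before it stops.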

### Exercise 3: AWS Service Health Dashboard

Create a comprehensive health dashboard for multiple AWS services.

In [None]:
# Exercise 3: Your code here
from dataclasses import dataclass
from typing import Any, Dict, List

@dataclass
class ServiceHealth:
    name: str
    status: str  # 'healthy', 'degraded', 'unhealthy'
    latency_ms: float
    details: Dict[str, Any]

class AWSDashboard:
    """AWS service health dashboard."""
    
    def __init__(self, config: dict):
        pass
    
    def check_all_services(self) -> List[ServiceHealth]:
        """Check all configured services."""
        pass
    
    def get_summary(self) -> Dict[str, Any]:
        """Get overall health summary."""
        pass

<details>
<summary>Click to see solution</summary>

```python
import boto3
import time
from dataclasses import dataclass, field
from typing import List, Dict, Any, Optional
from datetime import datetime
from botocore.exceptions import ClientError
from concurrent.futures import ThreadPoolExecutor, as_completed

@dataclass
class ServiceHealth:
    name: str
    status: str  # 'healthy', 'degraded', 'unhealthy'
    latency_ms: float
    details: Dict[str, Any] = field(default_factory=dict)
    checked_at: datetime = field(default_factory=datetime.utcnow)
    error: Optional[str] = None

class AWSDashboard:
    """Comprehensive AWS service health dashboard."""
    
    # Latency thresholds (ms)
    LATENCY_HEALTHY = 200
    LATENCY_DEGRADED = 1000
    
    def __init__(self, config: dict, region: str = 'us-east-1'):
        """
        Initialize dashboard.
        
        Args:
            config: Service configuration dict
                {
                    's3': {'bucket': 'my-bucket'},
                    'dynamodb': {'table': 'my-table'},
                    'sqs': {'queue_url': '...'},
                    'lambda': {'function': 'my-function'}
                }
            region: AWS region
        """
        self.config = config
        self.region = region
        self.session = boto3.Session(region_name=region)
        self._last_check: Optional[datetime] = None
        self._last_results: List[ServiceHealth] = []
    
    def _check_s3(self, bucket: str) -> ServiceHealth:
        """Check S3 bucket health."""
        s3 = self.session.client('s3')
        start = time.time()
        
        try:
            # Check bucket exists
            s3.head_bucket(Bucket=bucket)
            
            # Probe for at least one object (a cheap existence check, not full stats)
            objects = s3.list_objects_v2(Bucket=bucket, MaxKeys=1)
            
            latency = (time.time() - start) * 1000
            status = self._latency_to_status(latency)
            
            return ServiceHealth(
                name='s3',
                status=status,
                latency_ms=latency,
                details={
                    'bucket': bucket,
                    'has_objects': objects.get('KeyCount', 0) > 0
                }
            )
        except ClientError as e:
            latency = (time.time() - start) * 1000
            return ServiceHealth(
                name='s3',
                status='unhealthy',
                latency_ms=latency,
                error=str(e),
                details={'bucket': bucket}
            )
    
    def _check_dynamodb(self, table: str) -> ServiceHealth:
        """Check DynamoDB table health."""
        dynamodb = self.session.client('dynamodb')
        start = time.time()
        
        try:
            response = dynamodb.describe_table(TableName=table)
            table_info = response['Table']
            
            latency = (time.time() - start) * 1000
            
            # Check table status
            table_status = table_info['TableStatus']
            if table_status != 'ACTIVE':
                status = 'degraded'
            else:
                status = self._latency_to_status(latency)
            
            return ServiceHealth(
                name='dynamodb',
                status=status,
                latency_ms=latency,
                details={
                    'table': table,
                    'table_status': table_status,
                    'item_count': table_info.get('ItemCount', 0),
                    'size_bytes': table_info.get('TableSizeBytes', 0)
                }
            )
        except ClientError as e:
            latency = (time.time() - start) * 1000
            return ServiceHealth(
                name='dynamodb',
                status='unhealthy',
                latency_ms=latency,
                error=str(e),
                details={'table': table}
            )
    
    def _check_sqs(self, queue_url: str) -> ServiceHealth:
        """Check SQS queue health."""
        sqs = self.session.client('sqs')
        start = time.time()
        
        try:
            response = sqs.get_queue_attributes(
                QueueUrl=queue_url,
                AttributeNames=['All']
            )
            attrs = response['Attributes']
            
            latency = (time.time() - start) * 1000
            status = self._latency_to_status(latency)
            
            return ServiceHealth(
                name='sqs',
                status=status,
                latency_ms=latency,
                details={
                    'queue_url': queue_url,
                    'messages_available': int(attrs.get('ApproximateNumberOfMessages', 0)),
                    'messages_in_flight': int(attrs.get('ApproximateNumberOfMessagesNotVisible', 0)),
                    'messages_delayed': int(attrs.get('ApproximateNumberOfMessagesDelayed', 0))
                }
            )
        except ClientError as e:
            latency = (time.time() - start) * 1000
            return ServiceHealth(
                name='sqs',
                status='unhealthy',
                latency_ms=latency,
                error=str(e),
                details={'queue_url': queue_url}
            )
    
    def _check_lambda(self, function_name: str) -> ServiceHealth:
        """Check Lambda function health."""
        lambda_client = self.session.client('lambda')
        start = time.time()
        
        try:
            response = lambda_client.get_function(FunctionName=function_name)
            config = response['Configuration']
            
            latency = (time.time() - start) * 1000
            
            func_state = config.get('State', 'Unknown')
            if func_state != 'Active':
                status = 'degraded'
            else:
                status = self._latency_to_status(latency)
            
            return ServiceHealth(
                name='lambda',
                status=status,
                latency_ms=latency,
                details={
                    'function': function_name,
                    'state': func_state,
                    'runtime': config.get('Runtime'),
                    'memory_mb': config.get('MemorySize'),
                    'timeout': config.get('Timeout'),
                    'last_modified': config.get('LastModified')
                }
            )
        except ClientError as e:
            latency = (time.time() - start) * 1000
            return ServiceHealth(
                name='lambda',
                status='unhealthy',
                latency_ms=latency,
                error=str(e),
                details={'function': function_name}
            )
    
    def _latency_to_status(self, latency_ms: float) -> str:
        """Convert latency to status."""
        if latency_ms < self.LATENCY_HEALTHY:
            return 'healthy'
        elif latency_ms < self.LATENCY_DEGRADED:
            return 'degraded'
        return 'unhealthy'
    
    def check_all_services(self, parallel: bool = True) -> List[ServiceHealth]:
        """
        Check all configured services.
        
        Args:
            parallel: Run checks in parallel
            
        Returns:
            List of health check results
        """
        checks = []
        
        if 's3' in self.config:
            checks.append(('s3', self._check_s3, self.config['s3']['bucket']))
        
        if 'dynamodb' in self.config:
            checks.append(('dynamodb', self._check_dynamodb, self.config['dynamodb']['table']))
        
        if 'sqs' in self.config:
            checks.append(('sqs', self._check_sqs, self.config['sqs']['queue_url']))
        
        if 'lambda' in self.config:
            checks.append(('lambda', self._check_lambda, self.config['lambda']['function']))
        
        results = []
        
        if parallel and checks:  # ThreadPoolExecutor rejects max_workers=0
            with ThreadPoolExecutor(max_workers=len(checks)) as executor:
                futures = {
                    executor.submit(func, arg): name
                    for name, func, arg in checks
                }
                for future in as_completed(futures):
                    results.append(future.result())
        else:
            for name, func, arg in checks:
                results.append(func(arg))
        
        self._last_check = datetime.utcnow()
        self._last_results = results
        
        return results
    
    def get_summary(self) -> Dict[str, Any]:
        """Get overall health summary."""
        if not self._last_results:
            self.check_all_services()
        
        healthy = sum(1 for r in self._last_results if r.status == 'healthy')
        degraded = sum(1 for r in self._last_results if r.status == 'degraded')
        unhealthy = sum(1 for r in self._last_results if r.status == 'unhealthy')
        
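        # Guard against an empty config, which would otherwise divide by zero
        total = len(self._last_results)
        avg_latency = sum(r.latency_ms for r in self._last_results) / total if total else 0.0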
        
        # Overall status
        if unhealthy > 0:
            overall = 'unhealthy'
        elif degraded > 0:
            overall = 'degraded'
        else:
            overall = 'healthy'
        
        return {
            'overall_status': overall,
            'checked_at': self._last_check.isoformat() if self._last_check else None,
            'services': {
                'total': len(self._last_results),
                'healthy': healthy,
                'degraded': degraded,
                'unhealthy': unhealthy
            },
            'average_latency_ms': round(avg_latency, 2),
            'details': [
                {
                    'service': r.name,
                    'status': r.status,
                    'latency_ms': round(r.latency_ms, 2),
                    'error': r.error
                }
                for r in self._last_results
            ]
        }

# Usage
config = {
    's3': {'bucket': 'my-bucket'},
    'dynamodb': {'table': 'my-table'},
    'sqs': {'queue_url': 'https://sqs.us-east-1.amazonaws.com/123456789012/my-queue'},
    'lambda': {'function': 'my-function'}
}

dashboard = AWSDashboard(config)
# import json
# summary = dashboard.get_summary()
# print(json.dumps(summary, indent=2))
```
</details>
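
The effect of `parallel=True` is easiest to see with stubbed checks. The sketch below replaces the AWS calls with hypothetical functions (`slow_check`, `failing_check`) and runs them through the same `ThreadPoolExecutor` and `as_completed` pattern. It also maps each future back to its service name so that a check which raises is recorded instead of aborting the run. That error handling is an addition for illustration and is not present in the solution.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict


def slow_check() -> str:
    """Hypothetical check that takes 0.2 s and succeeds."""
    time.sleep(0.2)
    return "healthy"


def failing_check() -> str:
    """Hypothetical check that raises an unexpected error."""
    raise RuntimeError("connection refused")


def run_checks(checks: Dict[str, Callable[[], str]]) -> Dict[str, str]:
    """Run all checks concurrently and return {service name: status}."""
    if not checks:
        return {}  # ThreadPoolExecutor rejects max_workers=0

    results: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=len(checks)) as executor:
        futures = {executor.submit(fn): name for name, fn in checks.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception as exc:  # keep one bad check from hiding the rest
                results[name] = f"unhealthy: {exc}"
    return results


start = time.perf_counter()
statuses = run_checks({"s3": slow_check, "dynamodb": slow_check, "sqs": failing_check})
elapsed = time.perf_counter() - start

print(statuses)
# The two sleeps overlap, so the total is typically near 0.2 s, not 0.4 s
print(f"elapsed: {elapsed:.2f}s")
```

Results arrive in completion order, so keying them by service name keeps the output stable from run to run.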

---

## Summary

In this notebook, we covered:

1. **Error Handling**
   - Botocore exception hierarchy
   - Retryable vs non-retryable errors
   - Exponential backoff with jitter

2. **Pagination**
   - Manual token-based pagination
   - Using boto3 paginators
   - JMESPath filtering

3. **Async Operations**
   - ThreadPoolExecutor for concurrency
   - Batch processing patterns
   - Parallel downloads/uploads

4. **Best Practices**
   - Client reuse and pooling
   - Configuration management
   - Structured logging
   - Health monitoring

### Key Takeaways

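- **Always handle errors**: Any AWS call can fail; catch `ClientError` and branch on the error code
- **Use paginators**: A single request returns at most one page of results
- **Reuse clients**: Each new client sets up its own configuration and connection pool
- **Implement retries**: Throttling and transient errors are routine; back off with jitter
- **Monitor health**: Track latency and availability per service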

### AWS Cost Warning

> **Production Considerations**:
> - Retries increase API call count (and cost)
> - Pagination may require many requests for large datasets
> - Use appropriate timeouts to avoid hung operations
> - Consider reserved capacity for predictable workloads
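
A rough estimate of request volume makes the first two points concrete. The sketch below is back-of-the-envelope arithmetic using a hypothetical helper, `estimate_requests`. The 1,000-key page size is the S3 `list_objects_v2` maximum; the object count and retry setting are made-up inputs.

```python
import math


def estimate_requests(total_items: int, page_size: int, max_retries: int = 0) -> dict:
    """Estimate the API calls needed to page through `total_items`.

    Hypothetical helper: `worst_case` assumes every page needs all of its
    retries, which is an upper bound rather than an expected value.
    """
    pages = math.ceil(total_items / page_size)
    return {
        "pages": pages,
        "worst_case": pages * (max_retries + 1),
    }


# Listing 2.5 million S3 objects at 1,000 keys per page, with up to 3 retries each
estimate = estimate_requests(2_500_000, page_size=1000, max_retries=3)
print(estimate)  # {'pages': 2500, 'worst_case': 10000}
```

Multiplying either figure by the per-request price for the service gives a first-order cost estimate.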

---

## Conclusion

This module has covered the essential patterns for working with AWS using boto3:

1. **01_aws_fundamentals_setup**: Account basics, IAM, credentials
2. **02_s3_operations**: Storage operations
3. **03_other_services**: DynamoDB, Lambda, SQS, SNS
4. **04_practical_patterns**: Error handling, pagination, best practices

You now have the knowledge to build robust, production-ready AWS applications with Python!