# Notebook 3: Practical Applications & Best Practices

In the previous notebooks, we learned the fundamentals and advanced techniques of dictionary serialization. This final notebook focuses on real-world applications, crucial security considerations, and best practices for writing robust and maintainable serialization code.

**Learning Objectives:**
*   See practical examples: configuration management, API data exchange, inventory systems.
*   Understand the security risks associated with deserialization (especially Pickle).
*   Learn the importance of input validation.
*   Summarize best practices: choosing formats, versioning, error handling, testing.

## Part 3: Practical Applications and Best Practices

### Real-World Applications of Dictionary Serialization

**1. Configuration Management**

Dictionaries are ideal for structuring application settings. Serializing them to JSON or YAML makes configuration easy to read, write, and manage.

*Format Choice:* YAML is often preferred for its readability and support for comments, while JSON is simpler and universally supported.

In [None]:
pip install pyyaml

In [None]:
# Note: Requires PyYAML: pip install pyyaml
try:
    import yaml
    import os
    from pprint import pprint

    config_file = 'app_config.yaml'

    default_config = {
        'database': {
            'type': 'postgresql',
            'host': 'localhost',
            'port': 5432,
            'user': 'dev_user',
            'password': 'change_me'
        },
        'logging': {
            'level': 'INFO', # Options: DEBUG, INFO, WARNING, ERROR
            'file': '/var/log/app.log'
        },
        'features': {
            'enable_beta': False,
            'max_users': 1000
        },
        'api_keys': {
            'service_a': None # Expect key to be provided externally
        }
    }

    # --- Save Default Configuration ---
    print(f"Saving default config to {config_file}...")
    try:
        with open(config_file, 'w') as f:
            # default_flow_style=False gives block-style output; sort_keys=False keeps insertion order.
            # Note: yaml.dump cannot emit comments; they have to be added to the file by hand.
            yaml.dump(default_config, f, default_flow_style=False, sort_keys=False)
        print("Default config saved.")
    except Exception as e:
        print(f"Error saving config: {e}")
        
    # --- Load Configuration --- 
    print(f"\nLoading config from {config_file}...")
    loaded_config = {}
    try:
        with open(config_file, 'r') as f:
            # Use safe_load to avoid potential security issues with arbitrary code execution.
            # safe_load returns None for an empty file, so fall back to an empty dict.
            loaded_config = yaml.safe_load(f) or {}
        print("Config loaded successfully:")
        pprint(loaded_config)
    except FileNotFoundError:
        print(f"Error: Config file {config_file} not found. Using empty config.")
    except yaml.YAMLError as e:
        print(f"Error parsing YAML config file {config_file}: {e}")
    except Exception as e:
        print(f"An unexpected error occurred loading config: {e}")

    # --- Simulate Overriding with Environment Variables (Common Practice) ---
    print("\nChecking for environment variable overrides...")
    # Example: export DB_HOST=prod.db.server.com
    # Example: export LOG_LEVEL=DEBUG
    db_host_override = os.environ.get('DB_HOST')
    log_level_override = os.environ.get('LOG_LEVEL')

    if db_host_override and 'database' in loaded_config:
        print(f"Overriding DB host with: {db_host_override}")
        loaded_config['database']['host'] = db_host_override

    if log_level_override and 'logging' in loaded_config:
        print(f"Overriding log level with: {log_level_override}")
        loaded_config['logging']['level'] = log_level_override
        
    print("\nFinal Config (potentially overridden):")
    pprint(loaded_config)

    # Clean up config file
    if os.path.exists(config_file):
        os.remove(config_file)
        print(f"\nCleaned up {config_file}")

except ImportError:
    print("PyYAML not installed. Run 'pip install pyyaml' to run this cell.")
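
In practice a user's config file often specifies only a few nested keys, and a shallow `dict.update` over the defaults would replace an entire nested section such as `database`. The following is a minimal sketch of a recursive merge, assuming a hypothetical helper named `deep_merge` (it is not part of PyYAML or the standard library):

```python
import copy

def deep_merge(defaults, overrides):
    """Return a new dict with `overrides` layered recursively over `defaults`."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Otherwise the override simply replaces the default.
            merged[key] = copy.deepcopy(value)
    return merged

defaults = {'database': {'host': 'localhost', 'port': 5432}, 'logging': {'level': 'INFO'}}
user_config = {'database': {'host': 'prod.db.server.com'}}  # hypothetical partial config

merged = deep_merge(defaults, user_config)
print(merged)
```

The port and logging level from the defaults survive, and `defaults` itself is left unmodified because the helper works on deep copies.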

**2. Data Exchange in APIs**

JSON is the de facto standard for sending data to and receiving data from web APIs (REST, GraphQL, etc.). Python dictionaries are easily converted to/from JSON for these interactions.

*Requires Installation:* `pip install requests`

In [None]:
pip install requests

In [None]:
# Note: Requires the 'requests' library: pip install requests
try:
    import requests
    import json
    from pprint import pprint

    # Example: Creating a new post via a hypothetical API endpoint
    api_endpoint = 'https://jsonplaceholder.typicode.com/posts' # Using a public test API

    # Prepare data as a Python dictionary
    new_post_data = {
        'title': 'My New Blog Post',
        'body': 'This is the content of the post using serialization!',
        'userId': 101 # Example user ID
    }
    
    print("Dictionary to send:")
    pprint(new_post_data)

    # --- Send POST request with JSON payload ---
    try:
        # The 'json' parameter in requests automatically serializes the dict to JSON 
        # and sets the 'Content-Type: application/json' header.
        # A timeout keeps the call from hanging indefinitely if the server never responds.
        response = requests.post(api_endpoint, json=new_post_data, timeout=10)
        
        # Check if the request was successful (e.g., status code 201 Created)
        response.raise_for_status() # Raises an exception for bad status codes (4xx or 5xx)
        
        print(f"\nAPI Request Successful! Status Code: {response.status_code}")

        # --- Deserialize the JSON response --- 
        # The response.json() method automatically deserializes the JSON response body
        response_data = response.json() 
        
        print("\nDeserialized API Response (Dictionary):")
        pprint(response_data)
        
        # Verify we got an ID back (specific to this test API)
        assert 'id' in response_data
        print(f"\nSuccessfully created resource with ID: {response_data.get('id')}")

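    except json.JSONDecodeError as e:
        # Listed first: in recent versions of requests, the error raised by response.json()
        # is also a RequestException, so the broader handler would otherwise catch it.
        print(f"\nFailed to decode API response: {e}")
        print(f"Raw response text: {response.text[:200]}...") # Show part of the raw response
    except requests.exceptions.RequestException as e:
        print(f"\nAPI request failed: {e}")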
    except Exception as e:
        print(f"\nAn unexpected error occurred: {e}")

except ImportError:
    print("requests library not installed. Run 'pip install requests' to run this cell.")
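
To make the serialization step explicit, here is a rough sketch of what the `json=` shortcut does, written with only the standard library. It builds a request object without sending it, so no network access is needed; the reduced payload is an illustrative assumption, not the cell's full data.

```python
import json
import urllib.request

new_post_data = {'title': 'My New Blog Post', 'userId': 101}

# Serialize the dictionary to a UTF-8 encoded JSON byte string.
body = json.dumps(new_post_data).encode('utf-8')

# Build (but do not send) a POST request carrying that payload.
request = urllib.request.Request(
    'https://jsonplaceholder.typicode.com/posts',
    data=body,
    headers={'Content-Type': 'application/json'},
    method='POST',
)

print(request.get_method())
print(request.data)

# The receiving side reverses the process: bytes -> str -> dict.
decoded = json.loads(request.data.decode('utf-8'))
print(decoded)
```

The dictionary only ever crosses the wire as bytes; both ends agree on JSON as the encoding, which is why the `Content-Type` header matters.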

**3. Simple Inventory Management System**

This example shows using JSON serialization to persist the state of a simple inventory stored in a dictionary.

In [None]:
import json
import os
from pprint import pprint

class InventorySystem:
    def __init__(self, storage_file='inventory_data.json'):
        self.storage_file = storage_file
        self.inventory = self._load_inventory()
        print(f"Inventory system initialized. Loaded {len(self.inventory)} items from {self.storage_file}")

    def _load_inventory(self):
        """Loads inventory from the JSON file."""
        if os.path.exists(self.storage_file):
            try:
                with open(self.storage_file, 'r') as f:
                    data = json.load(f)
                    # Basic validation: check if it's a dictionary
                    if isinstance(data, dict):
                        return data
                    else:
                        print(f"Warning: Data in {self.storage_file} is not a dictionary. Starting empty.")
                        return {}
            except json.JSONDecodeError:
                print(f"Warning: Could not decode JSON from {self.storage_file}. Starting empty.")
                return {}
            except Exception as e:
                print(f"Warning: Error loading inventory from {self.storage_file}: {e}. Starting empty.")
                return {}
        return {} # Return empty dict if file doesn't exist

    def save_inventory(self):
        """Saves the current inventory to the JSON file."""
        try:
            with open(self.storage_file, 'w') as f:
                # Use indent for readability
                json.dump(self.inventory, f, indent=2)
            # print(f"Inventory saved to {self.storage_file}") # Optional: uncomment for verbose logging
        except Exception as e:
            print(f"Error saving inventory to {self.storage_file}: {e}")

    def add_product(self, product_id, name, quantity, price):
        if product_id in self.inventory:
            print(f"Warning: Product ID {product_id} already exists. Use update methods.")
            return False
        if not isinstance(quantity, int) or quantity < 0:
            print(f"Error: Quantity ({quantity}) must be a non-negative integer.")
            return False
        if not isinstance(price, (int, float)) or price < 0:
            print(f"Error: Price ({price}) must be a non-negative number.")
            return False
             
        self.inventory[product_id] = {
            'name': str(name),
            'quantity': quantity,
            'price': float(price)
        }
        print(f"Added product: {product_id} - {name}")
        self.save_inventory()
        return True

    def update_quantity(self, product_id, quantity_change):
        if product_id in self.inventory:
            new_quantity = self.inventory[product_id]['quantity'] + quantity_change
            if new_quantity < 0:
                print(f"Error: Quantity cannot drop below zero for {product_id}.")
                return False
            self.inventory[product_id]['quantity'] = new_quantity
            change_str = f"Increased by {quantity_change}" if quantity_change > 0 else f"Decreased by {-quantity_change}"
            print(f"Updated quantity for {product_id}: {change_str}. New quantity: {new_quantity}")
            self.save_inventory()
            return True
        else:
            print(f"Error: Product ID {product_id} not found.")
            return False

    def get_product(self, product_id):
        return self.inventory.get(product_id) # Returns None if not found
        
    def display_inventory(self):
        print("\nCurrent Inventory:")
        if not self.inventory:
            print("  (Empty)")
            return
        pprint(self.inventory)

# --- Usage Example ---
inventory_file = 'inventory_data.json'
# Ensure clean start for demo
if os.path.exists(inventory_file):
    os.remove(inventory_file)

inventory_system = InventorySystem(storage_file=inventory_file)
inventory_system.display_inventory()

print("\n--- Performing operations ---")
inventory_system.add_product('LAP001', 'Laptop Pro 15"', 25, 1299.99)
inventory_system.add_product('MOU007', 'Wireless Mouse', 150, 19.50)
inventory_system.add_product('KEY003', 'Mechanical Keyboard', 75, 89.00)

inventory_system.update_quantity('LAP001', -5) # Sold 5 laptops
inventory_system.update_quantity('MOU007', 50) # Received 50 mice
inventory_system.update_quantity('XYZ999', 10) # Try updating non-existent product
inventory_system.add_product('MON001', '27" 4K Monitor', -5, 399.00) # Try adding with invalid quantity

inventory_system.display_inventory()

print("\n--- Retrieving a product ---")
product = inventory_system.get_product('KEY003')
if product:
    print("Details for KEY003:")
    pprint(product)

# --- Simulate script ending and restarting ---
print("\n--- Simulating restart: Loading inventory again ---")
inventory_system_restarted = InventorySystem(storage_file=inventory_file)
inventory_system_restarted.display_inventory()

# Final check: verify data persisted
assert inventory_system_restarted.get_product('LAP001')['quantity'] == 20
assert inventory_system_restarted.get_product('MOU007')['quantity'] == 200

# Clean up inventory file
if os.path.exists(inventory_file):
    os.remove(inventory_file)
    print(f"\nCleaned up {inventory_file}")
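
One weakness of writing directly to the storage file is that a crash in the middle of `json.dump` can leave a truncated, unreadable file. A common mitigation is to write to a temporary file and then rename it over the target. The sketch below assumes a hypothetical helper named `atomic_json_dump`; it is not part of the `InventorySystem` class above.

```python
import json
import os
import tempfile

def atomic_json_dump(data, path):
    """Write `data` as JSON to `path` via a temporary file and a rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    try:
        with os.fdopen(fd, 'w') as f:
            json.dump(data, f, indent=2)
            f.flush()
            os.fsync(f.fileno())  # Push the bytes to disk before renaming.
        os.replace(tmp_path, path)  # Replaces the old file in a single step.
    except Exception:
        # Do not leave a stray temp file behind on failure.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise

# Usage inside a throwaway directory
with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, 'inventory_data.json')
    atomic_json_dump({'LAP001': {'quantity': 20}}, target)
    with open(target) as f:
        reloaded = json.load(f)
    leftovers = [name for name in os.listdir(workdir) if name.endswith('.tmp')]

print(reloaded)
```

Readers of the file see either the old contents or the new contents, never a half-written mix.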

### Security Considerations

**Deserialization can be dangerous**, especially when processing data from untrusted sources (e.g., user uploads, external APIs).

**1. Pickle Security Risks**

`pickle.loads()` can be tricked into executing arbitrary code embedded within the pickled data. This is a major vulnerability.

**Rule: NEVER unpickle data from untrusted sources.**

Use safer formats like JSON for data exchange with external systems or users.

In [None]:
import pickle
import os
import json
from pprint import pprint

# --- Example of potentially malicious pickle data --- 
# This data, when unpickled, runs the 'echo' command through os.system.
# The command's output goes to the process's standard output; in Jupyter that is
# usually the terminal running the notebook server rather than the cell output.
# 'echo' itself is harmless, but os.system could run *any* command.

# Constructing the malicious payload (for demonstration ONLY - DO NOT RUN blindly)
# This creates pickle data that calls os.system('echo Malicious payload executed!')
malicious_command = "echo Malicious payload executed!"
class PickleRCE:
    def __reduce__(self):
        return (os.system, (malicious_command,))

malicious_pickle_data = pickle.dumps(PickleRCE())

print("--- WARNING: Demonstrating Unpickling Risk ---")
print("The following line attempts to unpickle data that executes a command.")
print("It's designed to be relatively harmless ('echo'), but shows the potential danger.\n")

try:
    # *** THIS IS THE DANGEROUS OPERATION ***
    # Nothing is suppressed here: pickle.loads calls os.system while loading,
    # and 'result' receives the command's exit status.
    print("Attempting pickle.loads(malicious_pickle_data)...")
    result = pickle.loads(malicious_pickle_data)
    print("\nUnpickling finished. Check console output if 'echo' command ran.")
    # In a real attack, the command could be 'rm -rf /' or download malware.
    
except Exception as e:
    print(f"\nUnpickling failed (this might happen depending on environment): {e}")

# --- Safer Alternative: JSON ---
print("\n--- Using JSON (Safe Alternative) ---")
# JSON contains only data, not executable code.
safe_data_string = '{"command": "echo Malicious payload executed!", "is_safe": true}'

try:
    loaded_safe_data = json.loads(safe_data_string)
    print("JSON data loaded safely:")
    pprint(loaded_safe_data)
    # You would then explicitly decide how to handle 'loaded_safe_data['command']',
    # rather than it being executed automatically.
except json.JSONDecodeError as e:
    print(f"Error decoding JSON: {e}")

print("\nConclusion: Avoid pickle.loads() with data from outside your direct control.")
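
When pickle cannot be avoided, the standard library documentation describes restricting which globals an unpickler may resolve by overriding `Unpickler.find_class`. The sketch below takes the strictest form and forbids every global, so only plain containers and primitives load. The names `RestrictedUnpickler`, `restricted_loads`, and `Payload` are illustrative assumptions, and this is a mitigation rather than a guarantee: the rule above still applies to untrusted data.

```python
import io
import os
import pickle

class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any global (class or function)."""

    def find_class(self, module, name):
        # Every reference to a class or callable passes through find_class,
        # so raising here blocks payloads that point at os.system and friends.
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")

def restricted_loads(data):
    """Like pickle.loads, but using the restricted unpickler."""
    return RestrictedUnpickler(io.BytesIO(data)).load()

class Payload:
    def __reduce__(self):
        return (os.system, ('echo this should never run',))

# Plain data made of dicts, lists, numbers, and None loads normally.
plain = restricted_loads(pickle.dumps({'a': [1, 2, 3], 'b': None}))
print(plain)

# The malicious payload is rejected before the command can be called.
try:
    restricted_loads(pickle.dumps(Payload()))
    blocked = False
except pickle.UnpicklingError as e:
    blocked = True
    print(f"Rejected: {e}")
```

A real application would typically allow a short, explicit list of safe classes instead of forbidding everything.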

**2. Input Validation**

Even with safe formats like JSON, always validate the *structure* and *content* of deserialized data before using it. Assume external data might be malformed, missing required fields, or contain invalid values.

Libraries like `jsonschema` can help validate data against a predefined structure.

*Requires Installation:* `pip install jsonschema`

In [None]:
pip install jsonschema

In [None]:
# Note: Requires jsonschema: pip install jsonschema
try:
    import json
    from jsonschema import validate
    from jsonschema.exceptions import ValidationError
    from pprint import pprint

    # --- Define a Schema for Expected User Data --- 
    # This schema describes the expected structure and types
    user_schema = {
        "type": "object",
        "properties": {
            "userId": {"type": "string", "pattern": "^[a-zA-Z0-9_-]{3,16}$"}, # Alphanumeric, underscore, hyphen, 3-16 chars
            "email": {"type": "string", "format": "email"}, # Note: 'format' is only enforced if a format_checker is passed to validate()
            "displayName": {"type": "string", "minLength": 1, "maxLength": 50},
            "roles": {
                "type": "array",
                "items": {"type": "string", "enum": ["user", "editor", "admin"]} # Must be one of these roles
            },
            "preferences": {
                "type": "object",
                "properties": {
                    "theme": {"type": "string", "enum": ["light", "dark"]},
                    "notifications": {"type": "boolean"}
                },
                "required": ["theme"]
            }
        },
        "required": ["userId", "email", "roles"] # These fields must be present
    }

    # --- Example Incoming Data Payloads --- 
    valid_payload = '''
    {
        "userId": "alice_k",
        "email": "alice.k@example.com",
        "displayName": "Alice K.",
        "roles": ["user", "editor"],
        "preferences": {
            "theme": "dark",
            "notifications": true
        }
    }
    '''

    # Missing 'email', which is required.
    # (JSON has no comment syntax, so these notes live in Python rather than in the payloads.)
    invalid_payload_missing_field = '''
    {
        "userId": "bob_m",
        "displayName": "Bob M.",
        "roles": ["user"]
    }
    '''

    # 'roles' should be an array, not a string.
    invalid_payload_bad_type = '''
    {
        "userId": "charlie_d",
        "email": "charlie.d@example.com",
        "roles": "admin"
    }
    '''

    # Invalid character in userId, malformed email, and 'guest' is not in the allowed enum.
    invalid_payload_bad_value = '''
    {
        "userId": "dave**",
        "email": "dave@",
        "roles": ["user", "guest"]
    }
    '''
    
    payloads_to_test = {
        "Valid Payload": valid_payload,
        "Missing Required Field ('email')": invalid_payload_missing_field,
        "Incorrect Type ('roles')": invalid_payload_bad_type,
        "Invalid Values ('userId', 'email', 'roles')": invalid_payload_bad_value
    }

    # --- Process and Validate Each Payload --- 
    for name, payload_str in payloads_to_test.items():
        print(f"\n--- Testing: {name} ---")
        try:
            # 1. Deserialize the JSON string
            data = json.loads(payload_str)
            print("Successfully deserialized JSON.")
            
            # 2. Validate against the schema
            validate(instance=data, schema=user_schema)
            print("Schema validation PASSED.")
            # Proceed with using the validated 'data' dictionary
            # print("Validated Data:")
            # pprint(data)

        except json.JSONDecodeError as e:
            # Handle cases where the input isn't even valid JSON
            print(f"Validation FAILED: Invalid JSON format - {e}")
        except ValidationError as e:
            # Handle schema validation errors
            print(f"Validation FAILED: Schema validation error - {e.message}")
            # You might want to log e.path, e.validator, e.schema_path for more details
        except Exception as e:
            # Handle other unexpected errors
            print(f"Validation FAILED: An unexpected error occurred - {e}")

except ImportError:
    print("jsonschema not installed. Run 'pip install jsonschema' to run this cell.")
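
For small payloads, or where adding a dependency is not an option, the same idea can be approximated by hand. The following is a minimal sketch using only the standard library; `validate_user`, `REQUIRED_FIELDS`, and `ALLOWED_ROLES` are hypothetical names, and the checks cover only a subset of what the schema above expresses.

```python
import json

REQUIRED_FIELDS = {'userId': str, 'email': str, 'roles': list}
ALLOWED_ROLES = ('user', 'editor', 'admin')

def validate_user(data):
    """Return a list of problems; an empty list means the data passed."""
    if not isinstance(data, dict):
        return ['top-level value must be an object']
    problems = []
    # Check presence and type of each required field.
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in data:
            problems.append(f"missing required field '{field}'")
        elif not isinstance(data[field], expected_type):
            problems.append(f"'{field}' must be of type {expected_type.__name__}")
    # Check the content of 'roles' only if it has the right shape.
    roles = data.get('roles')
    if isinstance(roles, list):
        for role in roles:
            if role not in ALLOWED_ROLES:
                problems.append(f"unknown role {role!r}")
    return problems

good = validate_user(json.loads(
    '{"userId": "alice_k", "email": "alice.k@example.com", "roles": ["user"]}'))
bad = validate_user(json.loads('{"userId": "bob_m", "roles": ["guest"]}'))

print(good)
print(bad)
```

Collecting all problems instead of raising on the first one tends to produce more useful error responses for API clients.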

### Best Practices for Dictionary Serialization

**1. Choose the Right Format for Your Needs**

*   **JSON:** Interoperability, web APIs, human-readable simple data.
*   **Pickle:** Python-only, complex objects, caching, performance (but beware security).
*   **MessagePack/Protocol Buffers:** Performance-critical, binary, compact size.
*   **YAML:** Configuration files, human readability, comments.
*   **HDF5:** Very large numerical data (often within dict values), partial I/O, scientific computing.
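
One concrete difference behind these trade-offs is type fidelity. The short sketch below round-trips the same dictionary through JSON and through pickle; the pickle data is produced in the same process, so it is trusted.

```python
import json
import pickle

original = {'point': (1, 2), 1: 'integer key'}

via_json = json.loads(json.dumps(original))
via_pickle = pickle.loads(pickle.dumps(original))

# JSON has no tuple type and only string keys, so both are converted.
print(via_json)
# Pickle preserves the Python types exactly.
print(via_pickle)
```

After the JSON round trip the tuple has become a list and the integer key has become the string `'1'`, which is worth remembering when choosing JSON for data that is later compared against the original.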

**2. Handle Versioning**

If the structure of your serialized dictionaries might change over time, include a version marker in the data. When deserializing, check the version to apply appropriate logic or migration steps.

In [None]:
import json

def serialize_data_v1(data):
    # Original V1 format
    payload = {
        '_version': '1.0',
        'user_id': data['id'],
        'user_name': data['name']
    }
    return json.dumps(payload)

def serialize_data_v2(data):
    # New V2 format - fields renamed, added timestamp
    payload = {
        '_version': '2.0',
        'userId': data['id'],
        'displayName': data['name'],
        'timestamp': data['time']
    }
    return json.dumps(payload)

def process_data(serialized_string):
    try:
        loaded_data = json.loads(serialized_string)
        version = loaded_data.get('_version')

        if version == '1.0':
            print("Processing V1.0 data:")
            # Adapt V1 data to internal representation
            internal_repr = {
                'id': loaded_data.get('user_id'),
                'name': loaded_data.get('user_name'),
                'time': None # V1 didn't have timestamp
            }
            return internal_repr
            
        elif version == '2.0':
            print("Processing V2.0 data:")
            # Adapt V2 data to internal representation
            internal_repr = {
                'id': loaded_data.get('userId'),
                'name': loaded_data.get('displayName'),
                'time': loaded_data.get('timestamp') # Handle potential missing key
            }
            return internal_repr
            
        elif version is None:
             print("Error: Data is missing '_version' field.")
             # Handle legacy data or raise error
             return None
        else:
            print(f"Error: Unsupported data version: {version}")
            # Raise error or attempt fallback
            return None
            
    except json.JSONDecodeError:
        print("Error: Invalid JSON input string.")
        return None
    except Exception as e:
        print(f"Error processing data: {e}")
        return None

# --- Example Usage ---
user_data_v1 = {'id': 'usr123', 'name': 'Old Format'}
user_data_v2 = {'id': 'usr456', 'name': 'New Format', 'time': '2024-01-01T10:00:00Z'}

serialized_v1 = serialize_data_v1(user_data_v1)
serialized_v2 = serialize_data_v2(user_data_v2)
serialized_unknown = '{"product": "widget"}' # Missing version

print("--- Processing serialized strings ---")
processed1 = process_data(serialized_v1)
print("Result V1:", processed1)

processed2 = process_data(serialized_v2)
print("Result V2:", processed2)

processed_unknown = process_data(serialized_unknown)
print("Result Unknown:", processed_unknown)
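
As the number of versions grows, branching on every version inside one function gets unwieldy. An alternative is to upgrade old payloads one step at a time until they reach the current format. This is a sketch under assumptions: `migrate_v1_to_v2`, `MIGRATIONS`, `CURRENT_VERSION`, and `upgrade` are hypothetical names, and the field mapping mirrors the V1 and V2 formats defined above.

```python
def migrate_v1_to_v2(payload):
    """Rename V1 fields to their V2 names; V1 carried no timestamp."""
    return {
        '_version': '2.0',
        'userId': payload['user_id'],
        'displayName': payload['user_name'],
        'timestamp': None,
    }

# Maps a version to the function that upgrades it by one step.
MIGRATIONS = {'1.0': migrate_v1_to_v2}
CURRENT_VERSION = '2.0'

def upgrade(payload):
    """Apply migrations step by step until the payload is current."""
    while payload.get('_version') != CURRENT_VERSION:
        step = MIGRATIONS.get(payload.get('_version'))
        if step is None:
            raise ValueError(f"No migration path from version {payload.get('_version')!r}")
        payload = step(payload)
    return payload

upgraded = upgrade({'_version': '1.0', 'user_id': 'usr123', 'user_name': 'Old Format'})
print(upgraded)
```

Adding a future V3 then means writing one new function and one new dictionary entry, while the rest of the application only ever deals with the current format.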

**3. Error Handling and Fallbacks**

Serialization and deserialization can fail (file not found, invalid format, network issues). Wrap these operations in `try...except` blocks and handle potential errors gracefully (e.g., log warnings, use default values, back up corrupted files).

*No Installation Needed:* `logging` and `shutil` are part of the Python standard library.

In [None]:
import json
import logging
import os
import shutil

logging.basicConfig(level=logging.INFO, format='%(levelname)s: %(message)s')

def default_app_settings():
    """Returns the default settings dictionary."""
    return {'theme': 'light', 'language': 'en', 'timeout': 30}

def load_settings(filename='app_settings.json'):
    """Loads settings from JSON file with robust error handling."""
    try:
        logging.info(f"Attempting to load settings from {filename}...")
        with open(filename, 'r') as f:
            settings = json.load(f)
            # Basic validation: Is it a dictionary?
            if not isinstance(settings, dict):
                 logging.error(f"Invalid settings format in {filename}: Expected a dictionary.")
                 raise ValueError("Settings format is not a dictionary")
            logging.info(f"Settings loaded successfully from {filename}.")
            # Merge with defaults to ensure all keys are present
            full_settings = default_app_settings()
            full_settings.update(settings) # Loaded settings override defaults
            return full_settings
            
    except FileNotFoundError:
        logging.warning(f"Settings file '{filename}' not found. Using default settings.")
        return default_app_settings()
    except json.JSONDecodeError as e:
        logging.error(f"Invalid JSON syntax in '{filename}': {e}. Using default settings.")
        # Attempt to backup the corrupted file
        backup_filename = f"{filename}.corrupted_bak"
        try:
            shutil.copy(filename, backup_filename)
            logging.info(f"Backed up corrupted file to {backup_filename}")
        except Exception as backup_err:
            logging.error(f"Could not backup corrupted file: {backup_err}")
        return default_app_settings()
    except ValueError as e:
        # Catch our custom validation error from above
        logging.error(f"Settings validation failed: {e}. Using default settings.")
        return default_app_settings()
    except Exception as e:
        logging.exception(f"An unexpected error occurred while loading settings from {filename}: {e}. Using default settings.")
        return default_app_settings()

# --- Test Cases ---
settings_file = 'app_settings.json'

# Test 1: File does not exist
print("\n--- Test 1: File Not Found ---")
if os.path.exists(settings_file): os.remove(settings_file)
settings1 = load_settings(settings_file)
print("Loaded settings:", settings1)
assert settings1 == default_app_settings()

# Test 2: Valid JSON file
print("\n--- Test 2: Valid JSON ---")
valid_settings_data = {'theme': 'dark', 'timeout': 60}
with open(settings_file, 'w') as f: json.dump(valid_settings_data, f)
settings2 = load_settings(settings_file)
expected_settings = default_app_settings()
expected_settings.update(valid_settings_data) # Should merge
print("Loaded settings:", settings2)
assert settings2 == expected_settings

# Test 3: Invalid JSON file (syntax error)
print("\n--- Test 3: Invalid JSON Syntax ---")
with open(settings_file, 'w') as f: f.write('{"theme": "dark", invalid json')
settings3 = load_settings(settings_file)
print("Loaded settings:", settings3)
assert settings3 == default_app_settings()
assert os.path.exists(settings_file + ".corrupted_bak") # Check if backup was created
if os.path.exists(settings_file + ".corrupted_bak"): os.remove(settings_file + ".corrupted_bak")

# Test 4: Valid JSON, but wrong root type (list instead of dict)
print("\n--- Test 4: Valid JSON, Wrong Type ---")
with open(settings_file, 'w') as f: json.dump([1, 2, 3], f)
settings4 = load_settings(settings_file)
print("Loaded settings:", settings4)
assert settings4 == default_app_settings()

# Clean up test file
if os.path.exists(settings_file): os.remove(settings_file)

**Understanding the Log Output of the Previous Cell**

**The `WARNING` and `ERROR` messages in the output of the previous cell are *expected* because:**

1.  **Test 1:** The file doesn't exist, so `load_settings` correctly logs a `WARNING` and returns the default settings.
2.  **Test 3:** The file contains invalid JSON, so `load_settings` correctly logs an `ERROR`, attempts a backup (logging `INFO`), and returns the default settings.
3.  **Test 4:** The file contains valid JSON but the wrong type (list instead of dict), so `load_settings` correctly logs an `ERROR` (due to the `ValueError` raised and caught) and returns the default settings.

These messages do not indicate a crash or a logic bug. They are the logging output the function is designed to produce when the tests deliberately trigger its error paths.

**Error Handling Demo**

To run these tests *without the expected WARNING/ERROR messages appearing in the output*, you can temporarily raise the logging level around the specific calls that are *designed* to fail.

**Reasoning:**

1.  **Identify Expected Logs:** Tests 1, 3, and 4 are specifically designed to trigger the error handling paths in `load_settings`, which correctly produce `WARNING` and `ERROR` level logs.
2.  **Control Logging Level:** The most direct way to hide these expected logs during the test run, without changing the function's own logging behavior, is to temporarily change the level of the root logger.
3.  **Implementation:**
    *   Get the root logger: `logger = logging.getLogger()`.
    *   Store its current level: `original_level = logger.level`.
    *   Before calling `load_settings` in tests where failure logs are expected (Tests 1, 3, 4), set the level higher than `ERROR` (e.g., `logging.CRITICAL`). The logger then ignores anything less severe than `CRITICAL`, including `INFO` messages.
    *   Immediately after the call, restore the logger's level to `original_level` so that subsequent logs (or logs in other parts of a larger application) behave as originally configured.

The next cell runs the same tests and verifies the fallback behavior with the `assert` statements, but does not print the `WARNING` or `ERROR` messages generated by `load_settings` during the deliberately failing test cases.

In [None]:
import json
import logging
import os
import shutil

# Configure logging - let's keep the basic config
logging.basicConfig(level=logging.INFO, format='%(levelname)s: %(message)s')
# Get the root logger to control its level during tests
logger = logging.getLogger()

def default_app_settings():
    """Returns the default settings dictionary."""
    return {'theme': 'light', 'language': 'en', 'timeout': 30}

def load_settings(filename='app_settings.json'):
    """Loads settings from JSON file with robust error handling."""
    try:
        # This info log will still appear if the level is INFO
        logging.info(f"Attempting to load settings from {filename}...")
        with open(filename, 'r') as f:
            settings = json.load(f)
            # Basic validation: Is it a dictionary?
            if not isinstance(settings, dict):
                 # This error log will be suppressed if level is CRITICAL
                 logging.error(f"Invalid settings format in {filename}: Expected a dictionary.")
                 raise ValueError("Settings format is not a dictionary")
            # This info log will still appear if the level is INFO
            logging.info(f"Settings loaded successfully from {filename}.")
            # Merge with defaults to ensure all keys are present
            full_settings = default_app_settings()
            full_settings.update(settings) # Loaded settings override defaults
            return full_settings

    except FileNotFoundError:
        # This warning log will be suppressed if level is CRITICAL
        logging.warning(f"Settings file '{filename}' not found. Using default settings.")
        return default_app_settings()
    except json.JSONDecodeError as e:
        # This error log will be suppressed if level is CRITICAL
        logging.error(f"Invalid JSON syntax in '{filename}': {e}. Using default settings.")
        # Attempt to backup the corrupted file
        backup_filename = f"{filename}.corrupted_bak"
        try:
            shutil.copy(filename, backup_filename)
            # This info log will still appear if the level is INFO
            logging.info(f"Backed up corrupted file to {backup_filename}")
        except Exception as backup_err:
             # This error log will be suppressed if level is CRITICAL
            logging.error(f"Could not backup corrupted file: {backup_err}")
        return default_app_settings()
    except ValueError as e:
        # Catch our custom validation error from above
        # This error log will be suppressed if level is CRITICAL
        logging.error(f"Settings validation failed: {e}. Using default settings.")
        return default_app_settings()
    except Exception as e:
        # logging.exception logs at ERROR level, so it is also suppressed if level is CRITICAL
        logging.exception(f"An unexpected error occurred while loading settings from {filename}: {e}. Using default settings.")
        return default_app_settings()

# --- Test Cases ---
settings_file = 'app_settings.json'

# Store original logging level
original_level = logger.level

# Test 1: File does not exist
print("\n--- Test 1: File Not Found ---")
if os.path.exists(settings_file): os.remove(settings_file)
# Suppress expected WARNING for this specific call
logger.setLevel(logging.CRITICAL)
settings1 = load_settings(settings_file)
# Restore original level
logger.setLevel(original_level)
print("Loaded settings:", settings1)
assert settings1 == default_app_settings()

# Test 2: Valid JSON file
print("\n--- Test 2: Valid JSON ---")
valid_settings_data = {'theme': 'dark', 'timeout': 60}
with open(settings_file, 'w') as f: json.dump(valid_settings_data, f)
# No need to suppress logs here, expected to succeed cleanly
settings2 = load_settings(settings_file)
print("Loaded settings:", settings2)
expected_settings = default_app_settings()
expected_settings.update(valid_settings_data) # Should merge
assert settings2 == expected_settings

# Test 3: Invalid JSON file (syntax error)
print("\n--- Test 3: Invalid JSON Syntax ---")
with open(settings_file, 'w') as f: f.write('{"theme": "dark", invalid json')
# Suppress expected ERROR/INFO logs for this specific call
logger.setLevel(logging.CRITICAL)
settings3 = load_settings(settings_file)
# Restore original level
logger.setLevel(original_level)
print("Loaded settings:", settings3)
assert settings3 == default_app_settings()
# We can still check if the backup file was created *silently*
assert os.path.exists(settings_file + ".corrupted_bak") # Check if backup was created
if os.path.exists(settings_file + ".corrupted_bak"): os.remove(settings_file + ".corrupted_bak")

# Test 4: Valid JSON, but wrong root type (list instead of dict)
print("\n--- Test 4: Valid JSON, Wrong Type ---")
with open(settings_file, 'w') as f: json.dump([1, 2, 3], f)
# Suppress expected ERROR for this specific call
logger.setLevel(logging.CRITICAL)
settings4 = load_settings(settings_file)
# Restore original level
logger.setLevel(original_level)
print("Loaded settings:", settings4)
assert settings4 == default_app_settings()

# Clean up test file
if os.path.exists(settings_file):
    os.remove(settings_file)
    print(f"\nCleaned up {settings_file}")
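
The test cases above repeat the same "set level, call, restore level" steps. If `load_settings` ever raised instead of returning, the logger would stay at `CRITICAL`. A small context manager with `try`/`finally` avoids that. This is a minimal sketch, not part of the original notebook: `log_level` and the `settings_demo` logger name are hypothetical and chosen only for illustration.

```python
import logging
from contextlib import contextmanager

@contextmanager
def log_level(logger, level):
    """Temporarily set `logger` to `level`; restore the old level on exit."""
    previous = logger.level
    logger.setLevel(level)
    try:
        yield logger
    finally:
        # Runs even if the body raises, so the level is always restored
        logger.setLevel(previous)

# Hypothetical logger used only for this demonstration
demo_logger = logging.getLogger("settings_demo")
demo_logger.setLevel(logging.INFO)

with log_level(demo_logger, logging.CRITICAL):
    level_inside = demo_logger.level
    demo_logger.warning("Suppressed while the level is CRITICAL")

level_after = demo_logger.level
```

With a helper like this, each test could wrap its call as `with log_level(logger, logging.CRITICAL): settings1 = load_settings(settings_file)`.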

**4. Testing Serialization Code**

Thoroughly test your serialization and deserialization logic:
*   **Round-trip tests:** Serialize an object, deserialize it, and assert the result is equal to the original.
*   **Edge cases:** Test with empty dictionaries, nested structures, special values (None, NaN, infinity if applicable), and different data types. Note that NaN never compares equal to itself, so it needs an explicit `math.isnan` check rather than `==`.
*   **Custom types:** If using custom encoders/decoders, ensure they handle all expected types correctly.
*   **Error handling:** Test how your code behaves with corrupted or invalid input data.
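
A few of the edge cases listed above behave in ways that surprise people with the standard `json` module. The sketch below shows them as plain assertions. The helper name `round_trip` is an assumption made for this example; everything else is standard library.

```python
import json
import math

def round_trip(data):
    """Serialize to a JSON string and parse it back (illustrative helper)."""
    return json.loads(json.dumps(data))

# Empty and nested structures survive unchanged
assert round_trip({}) == {}
assert round_trip({'a': {'b': []}}) == {'a': {'b': []}}

# JSON object keys are always strings, so integer keys come back as str
int_keys = round_trip({1: 'one'})
assert int_keys == {'1': 'one'}

# Tuples are written as JSON arrays and come back as lists
tuple_back = round_trip({'point': (1, 2)})
assert tuple_back == {'point': [1, 2]}

# NaN is never equal to itself, so check it with math.isnan instead of ==
nan_back = round_trip({'x': float('nan')})
assert math.isnan(nan_back['x'])

# NaN and Infinity are not valid strict JSON; allow_nan=False rejects them
try:
    json.dumps({'x': float('inf')}, allow_nan=False)
    strict_rejected = False
except ValueError:
    strict_rejected = True
assert strict_rejected

# Corrupted input raises json.JSONDecodeError (a subclass of ValueError)
try:
    json.loads('{"theme": "dark", invalid json')
    corrupt_rejected = False
except json.JSONDecodeError:
    corrupt_rejected = True
assert corrupt_rejected
```

The key and tuple cases are lossy round trips rather than errors, so a plain equality assertion against the original would fail there. Tests for such data should compare against the expected post-round-trip form.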

In [None]:
import unittest
import json
from datetime import datetime

# DateTimeEncoder and datetime_decoder are redefined here (as in Notebook 2)
# so that this cell runs on its own
class DateTimeEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, datetime):
            return {'__datetime__': obj.isoformat()} 
        return super().default(obj)

def datetime_decoder(dct):
    if '__datetime__' in dct:
        return datetime.fromisoformat(dct['__datetime__'])
    return dct

# --- Unit Test Class --- 
class TestDictionarySerialization(unittest.TestCase):

    def test_json_round_trip_simple(self):
        """Test basic JSON serialize/deserialize round trip."""
        original = {'name': 'Test', 'value': 123, 'active': True, 'items': [1, None, 'abc']}
        serialized = json.dumps(original)
        deserialized = json.loads(serialized)
        self.assertEqual(original, deserialized, "Simple JSON round trip failed")

    def test_json_round_trip_nested(self):
        """Test nested JSON serialize/deserialize round trip."""
        original = {'a': 1, 'b': {'c': [2, 3], 'd': 'nested'}, 'e': []}
        serialized = json.dumps(original)
        deserialized = json.loads(serialized)
        self.assertEqual(original, deserialized, "Nested JSON round trip failed")

    def test_json_custom_datetime_round_trip(self):
        """Test JSON round trip with custom datetime encoder/decoder."""
        now = datetime.now()
        # isoformat()/fromisoformat() preserve microseconds, so truncating is optional;
        # it only keeps the serialized timestamp shorter and easier to read
        now = now.replace(microsecond=0)
        original = {'event': 'Meeting', 'time': now, 'participants': ['A', 'B']}
        
        serialized = json.dumps(original, cls=DateTimeEncoder)
        deserialized = json.loads(serialized, object_hook=datetime_decoder)
        
        self.assertEqual(original, deserialized, "Custom datetime JSON round trip failed")
        self.assertIsInstance(deserialized['time'], datetime, "Datetime type not restored")

    def test_pickle_round_trip_complex(self):
        """Test Pickle round trip with complex types (set, datetime)."""
        import pickle
        now = datetime.now()
        original = {'id': 10, 'tags': {'urgent', 'dev'}, 'timestamp': now}
        
        serialized = pickle.dumps(original)
        deserialized = pickle.loads(serialized)
        
        self.assertEqual(original, deserialized, "Complex Pickle round trip failed")
        self.assertIsInstance(deserialized['tags'], set, "Set type not restored")
        self.assertIsInstance(deserialized['timestamp'], datetime, "Datetime type not restored")

# --- Run the tests --- 
# In a real scenario, you'd run this from the command line using 'python -m unittest your_test_module.py'
# Here, we run it directly within the notebook environment.
print("Running serialization unit tests...")
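# unittest.makeSuite is deprecated since Python 3.11 and removed in 3.13;
# the default test loader builds the same suite
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestDictionarySerialization)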
runner = unittest.TextTestRunner()
runner.run(suite)

### Conclusion

Mastering dictionary serialization is crucial for building robust Python applications. We've covered:

*   **Why and What:** The need for serialization and basic concepts.
*   **Formats:** JSON, Pickle, YAML, MessagePack, HDF5 – their strengths and weaknesses.
*   **Advanced Techniques:** Handling custom types, optimizing performance, managing large data.
*   **Applications:** Configuration, APIs, data persistence (like caching or inventory).
*   **Security & Best Practices:** The dangers of calling `pickle.loads` on untrusted data, the necessity of input validation, versioning, error handling, and testing.

By choosing the right serialization format for your task and following best practices, especially regarding security and error handling, you can effectively persist, exchange, and manage dictionary-based data in your Python projects.