# AEI Telecom Failure Simulator Module: Zero-to-Production Roadmap

This roadmap outlines the systematic development and validation phases for the `aei.telecom.failure.failure_simulator` Python module. This module is responsible for orchestrating and executing simulations of telecom network failures, leveraging validated configurations provided by the `AEIFailureConfigManager`.

## Phase -1: Module Scaffolding
* [ ] Module Creation: Create `failure_simulator.py` within the `aei.telecom.failure` package structure.
    ```
    aei/
    └── telecom/
        └── failure/
            ├── __init__.py
            ├── config.py
            └── failure_simulator.py  # This module
    ```
* [ ] External Dependencies: Identify and include necessary external libraries (e.g., `numpy` for numerical operations, `pandas` for data handling).
    ```python
    # Example dependencies
    import numpy as np
    import pandas as pd
    ```
* [ ] Internal Dependencies: Import and utilize the `AEIFailureConfigManager` (or its `FailureConfig` output) for simulation parameters.
    ```python
    from aei.telecom.failure.config import FailureConfig, AEIFailureConfigManager
    ```
* [ ] Dedicated Logger: Configure a specific logger instance for precise log management within the simulation process.
    ```python
    import logging
    logger = logging.getLogger("aei.telecom.failure.failure_simulator")
    logger.setLevel(logging.INFO) # Default to INFO for simulation progress
    ```
* **Tools**: Python Standard Library, `numpy`, `pandas` (or other data science libraries)

---

## Phase 0: Design & Architecture
* [ ] Core Class Defined:
    * `AEIFailureSimulator`: The central class for setting up and running network failure simulations.
* [ ] Simulation Method Signature: Define clear inputs (validated configuration, network topology/state data, simulation duration) and outputs (simulation results, metrics).
    ```python
    # Example method signature
    from typing import Any, Dict

    import pandas as pd

    from aei.telecom.failure.config import FailureConfig


    class AEIFailureSimulator:
        def __init__(self, config: FailureConfig):
            self.config = config
            # ... initialize simulation environment ...

        def simulate(
            self,
            network_topology: Dict[str, Any],
            initial_user_data: pd.DataFrame,
            duration_steps: int,
        ) -> Dict[str, Any]:
            # ... simulation logic ...
            pass
    ```
* [ ] Integration with Configuration: Ensure seamless consumption of `FailureConfig` object from `AEIFailureConfigManager`.
* [ ] Output Data Structure: Design the format of simulation results (e.g., failed cells over time, dropped call counts, performance degradation metrics).
* [ ] Modularity for Failure Models: Consider abstracting different failure types (e.g., power outage, backhaul failure) into separate, interchangeable models if complexity warrants.
* **Tools**: Conceptual Design, Data Schemas
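
The modularity item above could be approached with a small shared interface that each failure type implements. The following is only a sketch under stated assumptions: `FailureModel`, `PowerOutageModel`, `run_step`, and the `cells` state dictionary are hypothetical names for illustration, not part of the existing module.

```python
from typing import Dict, List, Protocol

import numpy as np


class FailureModel(Protocol):
    """Hypothetical interface for one interchangeable failure type."""

    name: str

    def apply(self, cells: Dict[str, str], rng: np.random.Generator) -> List[str]:
        """Mutate `cells` in place and return the IDs of newly failed cells."""
        ...


class PowerOutageModel:
    """Fails each active cell independently with a fixed probability."""

    name = "POWER_OUTAGE"

    def __init__(self, probability: float) -> None:
        if not 0.0 <= probability <= 1.0:
            raise ValueError("probability must be between 0 and 1")
        self.probability = probability

    def apply(self, cells: Dict[str, str], rng: np.random.Generator) -> List[str]:
        failed = []
        for cell_id in sorted(cells):  # sorted so the draw order is stable
            if cells[cell_id] == "active" and rng.random() < self.probability:
                cells[cell_id] = "failed"
                failed.append(cell_id)
        return failed


def run_step(models: List[FailureModel], cells: Dict[str, str],
             rng: np.random.Generator) -> Dict[str, List[str]]:
    """Apply every model once and report which cells each one failed."""
    return {model.name: model.apply(cells, rng) for model in models}


cells = {"cell_A": "active", "cell_B": "active"}
report = run_step([PowerOutageModel(probability=1.0)], cells, np.random.default_rng(0))
```

With this shape, adding a backhaul or software failure would mean writing another class with the same `apply` signature, leaving the simulation loop unchanged.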

---

## Phase 1: Core Implementation
* [ ] Initialization Logic: Set up the simulation environment based on the provided `FailureConfig`. This might involve pre-calculating probabilities or setting up internal state.
* [ ] Main Simulation Loop: Implement the iterative process of the simulation, step by step, applying failure conditions and observing network behavior.
* [ ] Failure Injection Logic: Integrate the `FailureType` probabilities and thresholds from `config` to determine when and how failures occur.
    ```python
    # Example: inside the simulate method
    for step in range(duration_steps):
        # Inject random failures using the probabilities stored on self.config
        if np.random.rand() < self.config.power_outage_prob:
            self._inject_power_outage()
        # ... other failure types ...

        # Recalculate network state and metrics
        # ...
    ```
* [ ] Metric Calculation: Implement the logic to derive key performance indicators (KPIs) and impact metrics based on the simulated network state.
* [ ] Error Handling: Implement error handling for scenarios specific to the simulator (e.g., invalid network input, unhandled simulation states).
* **Tools**: `numpy`, `pandas` (for data manipulation within simulation)
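
The metric-calculation item above might look roughly like the sketch below, which reduces a per-step, per-cell log to a few impact metrics. The function name `compute_kpis` and the column names `step`, `cell_id`, `failed`, and `dropped_calls` are assumptions made for illustration; the real simulator's state layout may differ.

```python
from typing import Any, Dict

import pandas as pd


def compute_kpis(step_log: pd.DataFrame) -> Dict[str, Any]:
    """Summarize a per-step, per-cell log into a few impact metrics.

    Assumed (hypothetical) columns: step, cell_id, failed (bool),
    dropped_calls (int).
    """
    required = {"step", "cell_id", "failed", "dropped_calls"}
    missing = required - set(step_log.columns)
    if missing:
        raise ValueError(f"step_log is missing columns: {sorted(missing)}")

    # Number of failed cells observed at each simulation step
    failed_per_step = step_log.groupby("step")["failed"].sum()
    return {
        "total_dropped_calls": int(step_log["dropped_calls"].sum()),
        "cells_ever_failed": sorted(step_log.loc[step_log["failed"], "cell_id"].unique()),
        "peak_failed_cells": int(failed_per_step.max()),
        "cell_availability": float(1.0 - step_log["failed"].mean()),
    }


log = pd.DataFrame({
    "step": [0, 0, 1, 1],
    "cell_id": ["A", "B", "A", "B"],
    "failed": [False, True, True, True],
    "dropped_calls": [0, 3, 2, 4],
})
kpis = compute_kpis(log)
```

Raising `ValueError` on missing columns also gives a simple instance of the simulator-specific error handling listed above.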

---

## Phase 1.5: Code Quality
* [ ] Type Hinting: Apply comprehensive type hints to all function parameters, return values, and class attributes for clarity and maintainability.
* [ ] Readability & Style: Adhere to PEP 8 guidelines, using clear variable names and concise function definitions.
* [ ] Logging Implementation: Integrate `logger.info` for major simulation steps, `logger.debug` for detailed internal calculations, and `logger.warning`/`logger.error` for unusual or problematic simulation events.
    ```python
    # Example logging within simulation
    logger.info(f"Starting simulation for {duration_steps} steps.")
    logger.debug(f"Step {step}: Applied {num_failures} new failures.")
    ```
* **Tools**: Python Type Hinting

---

## Phase 2: Testing (External Unit & Integration Test Coverage)
* [ ] Unit Tests for Simulation Components: Test individual failure injection models, metric calculation functions, or state update logic in isolation.
    ```python
    # Example unit test for a specific failure model
    def test_power_outage_injection():
        simulator = AEIFailureSimulator(mock_config_with_power_outage)
        state = {"cell_A": "active"}
        # ... apply the power outage to `state` ...
        # Assert on the simulated state, not on two string literals
        assert state["cell_A"] == "failed"
    ```
* [ ] Integration Tests for Full Simulation Runs: Run the `simulate` method with various `FailureConfig` objects and simplified network data, asserting on expected outcomes (e.g., number of dropped calls, specific cells failing).
    ```python
    # Example integration test
    def test_end_to_end_simulation():
        config_manager = AEIFailureConfigManager({"probabilities": {"dropped_call": 0.5}})
        config = config_manager.create_failure_config()
        simulator = AEIFailureSimulator(config)
        results = simulator.simulate(simple_network, simple_users, 100)
        assert results["total_dropped_calls"] > 0
    ```
* [ ] Regression Tests: Create a suite of tests to ensure that changes to the simulation logic do not inadvertently alter previously validated behavior.
* [ ] Performance Tests (Basic): Measure execution time for simple simulation runs to establish a baseline.
* **Tools**: `unittest` or `pytest` (external)
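
For the regression and baseline-performance items above, one common approach is a golden-file check combined with a simple timer. The sketch below uses only the standard library; `toy_simulate` is a hypothetical stand-in for a real `simulate` call, and `matches_snapshot` and `timed_run` are illustrative helper names.

```python
import json
import random
import tempfile
from pathlib import Path
from time import perf_counter
from typing import Dict


def toy_simulate(seed: int, steps: int, outage_prob: float) -> Dict[str, int]:
    """Hypothetical stand-in for a full AEIFailureSimulator.simulate run."""
    rng = random.Random(seed)
    outages = sum(1 for _ in range(steps) if rng.random() < outage_prob)
    return {"steps": steps, "power_outages": outages}


def matches_snapshot(result: Dict[str, int], snapshot_path: Path) -> bool:
    """Golden-file check: record on the first run, compare on later runs."""
    if not snapshot_path.exists():
        snapshot_path.write_text(json.dumps(result, sort_keys=True))
        return True
    return json.loads(snapshot_path.read_text()) == result


def timed_run(steps: int) -> float:
    """Return wall-clock seconds for one toy run, as a baseline measurement."""
    start = perf_counter()
    toy_simulate(seed=0, steps=steps, outage_prob=0.05)
    return perf_counter() - start


with tempfile.TemporaryDirectory() as tmp:
    snapshot = Path(tmp) / "baseline.json"
    baseline = toy_simulate(seed=42, steps=1_000, outage_prob=0.05)
    recorded = matches_snapshot(baseline, snapshot)  # first call writes the file
    unchanged = matches_snapshot(toy_simulate(42, 1_000, 0.05), snapshot)
    altered = dict(baseline, power_outages=baseline["power_outages"] + 1)
    regression_detected = not matches_snapshot(altered, snapshot)

baseline_seconds = timed_run(steps=10_000)
```

In a real suite the snapshot would live in the repository rather than a temporary directory, and with `pytest` the `tmp_path` fixture could replace the `tempfile` usage.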

---

## Phase 3: Documentation
* [ ] Class and Method Docstrings: Provide comprehensive docstrings for the `AEIFailureSimulator` class and its `simulate` method, explaining purpose, parameters, and return values.
* [ ] Usage Examples: Develop clear, runnable examples demonstrating how to initialize the simulator, provide inputs, run a simulation, and interpret outputs.
* [ ] Input Data Specification: Document the required format and structure of `network_topology` and `initial_user_data`.
* [ ] Output Data Dictionary: Provide a clear explanation of all keys and values in the dictionary returned by the `simulate` method.
* **Tools**: Python Docstrings, Markdown (for overall project documentation)
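
One way to keep the input specification honest is to state it in a docstring and back it with a small check. The sketch below is illustrative only: the `"cells"` key in `network_topology`, the `user_id` and `cell_id` columns, and the `validate_inputs` helper are assumed for the example and are not defined anywhere in the module.

```python
from typing import Any, Dict, List

import pandas as pd

REQUIRED_USER_COLUMNS = ("user_id", "cell_id")  # hypothetical schema


def validate_inputs(network_topology: Dict[str, Any],
                    initial_user_data: pd.DataFrame) -> List[str]:
    """Check simulation inputs against an assumed, illustrative format.

    Args:
        network_topology: Mapping with a "cells" key holding a dict of
            cell ID -> attributes (hypothetical layout).
        initial_user_data: One row per user with at least the columns
            `user_id` and `cell_id`.

    Returns:
        A list of human-readable problems; empty when the inputs look valid.
    """
    problems: List[str] = []
    cells = network_topology.get("cells")
    if not isinstance(cells, dict) or not cells:
        problems.append("network_topology['cells'] must be a non-empty dict")
        cells = {}
    missing = [c for c in REQUIRED_USER_COLUMNS if c not in initial_user_data.columns]
    if missing:
        problems.append(f"initial_user_data is missing columns: {missing}")
    elif cells:
        # Every user must be attached to a cell that exists in the topology
        unknown = sorted(set(initial_user_data["cell_id"]) - set(cells))
        if unknown:
            problems.append(f"users reference unknown cells: {unknown}")
    return problems


topology = {"cells": {"cell_A": {}, "cell_B": {}}}
users = pd.DataFrame({"user_id": [1, 2], "cell_id": ["cell_A", "cell_Z"]})
problems = validate_inputs(topology, users)
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a single pass; raising an exception instead would be an equally reasonable design.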

---

## Phase 4: Packaging (Integration into Larger Project)
* [ ] Module Inclusion: This `failure_simulator.py` module will be included as part of the broader `aei.telecom.failure` Python package distribution.
* [ ] `pyproject.toml` / `setup.py` Entry: Its inclusion will be managed by the parent project's packaging configuration.
* **Tools**: `setuptools` / `poetry` (of the parent project)

---

## Phase 5: CI/CD Automation (Part of Parent Project's Pipeline)
* [ ] Automated Test Execution: Unit and integration tests for this module will be executed automatically as part of the parent project's continuous integration pipeline.
* [ ] Simulation Regression Runs: Consider including automated "mini-simulations" in CI/CD to catch regressions in core simulation behavior or performance.
* [ ] Code Quality Gates: Linting and type checking for this module's code will be enforced during CI.
* **Tools**: GitHub Actions, GitLab CI, Jenkins (of the parent project)

---

## Phase 6: Production Readiness
* [ ] Scalability & Performance: Optimize the simulator for performance and memory usage, especially for large-scale network simulations or long durations. Consider parallelization if applicable.
* [ ] Reproducibility: Ensure simulation results are reproducible given the same inputs and random seeds.
* [ ] Resource Management: Implement graceful handling of system resources (CPU, memory) to prevent crashes during long or intense simulations.
* [ ] Error Reporting & Monitoring: Integrate the simulator's logs and exceptions with a centralized monitoring system to track simulation health and identify issues.
* [ ] Robust Input Handling: Implement robust validation for `network_topology` and `initial_user_data` inputs beyond just type checking.
* **Tools**: Profilers, `pytest-benchmark`, Application Monitoring (e.g., Sentry, Prometheus), Logging Aggregation (e.g., ELK Stack)
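
The reproducibility item above usually comes down to routing every random draw through one explicitly seeded generator instead of NumPy's global state. A minimal sketch, assuming a toy `simulate_outages` function (hypothetical) and NumPy's `default_rng` in place of the legacy `RandomState` API:

```python
from typing import List, Optional

import numpy as np


def simulate_outages(steps: int, outage_prob: float,
                     seed: Optional[int] = None) -> List[int]:
    """Return the step indices at which a (toy) power outage fires."""
    rng = np.random.default_rng(seed)  # local generator, no global state
    draws = rng.random(steps)          # one uniform draw per step
    return np.flatnonzero(draws < outage_prob).tolist()


run_a = simulate_outages(steps=1_000, outage_prob=0.5, seed=123)
run_b = simulate_outages(steps=1_000, outage_prob=0.5, seed=123)
same_seed_matches = run_a == run_b
```

Passing `seed=None` draws fresh entropy on each call, so a run intended to be reproducible should log the seed it used alongside its results.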

---

### Final Checklist (for `aei.telecom.failure.failure_simulator`)

1.  [ ] **Core Simulation Logic Implemented**: Applies failure models based on configuration.
2.  [ ] **Integration with `FailureConfig`**: Seamlessly consumes validated configuration.
3.  [ ] **Clear Input/Output Contracts**: Well-defined data structures for simulation inputs and results.
4.  [ ] **Comprehensive Internal Logging**: Provides visibility into simulation progress and events.
5.  [ ] **Robust Error Handling**: Catches and reports simulation-specific issues.
6.  [ ] **Thorough Unit & Integration Tests**: Validates simulation components and overall behavior.
7.  [ ] **Performance Benchmarked**: Establishes and tracks simulation speed.
8.  [ ] **Reproducible Results**: Ensures consistent outputs for identical inputs.
9.  [ ] **Integrated into Parent Package**: (Future, deployment phase)
10. [ ] **Covered by Parent Project's CI/CD**: (Future, CI/CD phase)

---

```bash
/kaggle/working/
├── aei/
│   ├── telecom/
│   │   ├── failure/
│   │   │   ├── __init__.py
│   │   │   ├── config.py             # Configuration manager module
│   │   │   └── failure_simulator.py  # This module
├── tests/
│   ├── __init__.py
│   └── test_failure_simulator.py
└── data_quality_report.json
```

```python
import os
import importlib.util
import sys
import json
import logging
import textwrap
import pandas as pd

# ==============================================================================
# 1. SCRIPT SETUP
# ==============================================================================

# Configure logging for clear, informative output
logging.basicConfig(level=logging.INFO, format='%(levelname)s: %(message)s')
logger = logging.getLogger(__name__)

# Define constants for module name and paths for clarity and easy modification
MODULE_NAME = "aei.telecom.failure.config"
BASE_DIR = "/kaggle/working/aei/telecom/failure"
CONFIG_FILE_PATH = os.path.join(BASE_DIR, "config.py")

# ==============================================================================
# 2. THE MODULE'S SOURCE CODE
# ==============================================================================

# The source code for the config.py module with all fixes applied.
failure_config_code = textwrap.dedent('''
# -------------------------------
# ⚠️ AEI Failure Configuration Manager
# -------------------------------
# aei/telecom/failure/config.py
# Author: AID Edge Inc. Network Reliability Team
# Date: 2025-07-16 (Cleaned logger configuration for assertLogs compatibility)
# Version: 1.1.12

import json
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Dict, Any, Optional, List
import logging
import os
import pandas as pd

# IMPORTANT: Define the logger with an explicit, consistent name
logger = logging.getLogger("aei.telecom.failure.config")
logger.setLevel(logging.DEBUG)
logger.propagate = True

# ------------------------------------------------------------------------------
# ⚠️ Custom Exception
# ------------------------------------------------------------------------------
class ConfigurationError(Exception):
    """Custom exception for configuration-related errors."""
    pass

# ------------------------------------------------------------------------------
# 🧩 Failure Types
# ------------------------------------------------------------------------------
class FailureType(Enum):
    POWER = "POWER"
    BACKHAUL = "BACKHAUL"
    SOFTWARE = "SOFTWARE"
    HARDWARE = "HARDWARE"
    HANDOVER_FAILURE = "HANDOVER_FAILURE"
    DROPPED_CALL = "DROPPED_CALL"
    POWER_OUTAGE = "POWER_OUTAGE"

# ------------------------------------------------------------------------------
# 📦 Failure Event Configuration Models
# ------------------------------------------------------------------------------
@dataclass(frozen=True)
class FailureEventConfig:
    failure_id: int
    tower_id: str
    start_time: int  # Unix timestamp
    duration_minutes: int
    failure_type: str

@dataclass
class FailureEventListConfig:
    events: List[FailureEventConfig]

# ------------------------------------------------------------------------------
# 🔧 Failure Simulation Thresholds and Weights
# ------------------------------------------------------------------------------
@dataclass
class FailureConfig:
    handover_user_threshold: int = 50
    power_outage_prob: float = 0.01
    dropped_call_weight: float = 0.5
    cqi_penalty_weight: float = 0.5
    sinr_threshold: float = 3.0
    rsrq_threshold: float = -15.0
    cqi_threshold: float = 3.0
    rsrp_threshold: float = -110.0
    distance_weight: float = 0.4
    base_failure_rate: float = 0.05
    failure_types: List[str] = field(default_factory=lambda: [
        "POWER", "BACKHAUL", "SOFTWARE", "HANDOVER_FAILURE", "DROPPED_CALL"
    ])

# ------------------------------------------------------------------------------
# 🧩 Final Combined Failure Configuration
# ------------------------------------------------------------------------------
@dataclass
class AEIFailureConfig:
    simulation: FailureConfig
    events: Optional[FailureEventListConfig] = None

class AEIFailureSimulator:
    """Simulates network failures based on configuration"""
    
    def __init__(self, config: AEIFailureConfig):
        """Initialize simulator with configuration"""
        # Log the type of every configuration value for debugging
        logger.debug("AEIFailureSimulator Input Types:")
        logger.debug("Simulation Config:")
        for name, value in vars(config.simulation).items():
            logger.debug(f"{name}: {type(value)}")
        
        if config.events:
            logger.debug("Event Configs:")
            for i, event in enumerate(config.events.events, 1):
                logger.debug(f"Event {i}:")
                for name, value in vars(event).items():
                    logger.debug(f"{name}: {type(value)}")
        
        # Store config
        self.config = config
        
    def simulate(self, df: pd.DataFrame) -> pd.DataFrame:
        """Run failure simulation on input data"""
        # Add your simulation logic here
        return df

class AEIFailureConfigManager:
    @staticmethod
    def _get_initial_defaults() -> Dict[str, Any]:
        return {
            "thresholds": {
                "handover_user": 50, "sinr": 0.0, "rsrq": -15.0, "cqi": 3
            },
            "probabilities": {
                "power_outage": 0.001, "dropped_call": 0.5, "cqi_penalty": 0.5
            },
            "failure_types": [
                "POWER", "BACKHAUL", "SOFTWARE", "HARDWARE"
            ]
        }

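    # Call the underlying function explicitly: a staticmethod object is not
    # callable inside the class body before Python 3.10.
    _DEFAULTS = _get_initial_defaults.__func__()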

    def __init__(self, override_config: Optional[Dict[str, Any]] = None):
        if not isinstance(override_config, (dict, type(None))):
            raise ConfigurationError(f"Override config must be a dictionary or None. Got {type(override_config).__name__}")

        self.config = AEIFailureConfigManager._deep_merge(
            AEIFailureConfigManager._get_initial_defaults(), override_config or {}
        )

        self._validate_for_unknown_keys(self.config, self._get_config_schema())
        self._validate_structure_and_types(self.config, self._get_config_schema())
        self._validate_physical_constraints()
        self._documentation = self._generate_documentation()
        logger.debug("AEIFailureConfigManager initialized successfully.")

    @property
    def documentation(self) -> str:
        return self._documentation

    def create_failure_config(self) -> "FailureConfig":
        return FailureConfig(
            handover_user_threshold=self.config["thresholds"]["handover_user"],
            power_outage_prob=self.config["probabilities"]["power_outage"],
            dropped_call_weight=self.config["probabilities"]["dropped_call"],
            cqi_penalty_weight=self.config["probabilities"]["cqi_penalty"],
            sinr_threshold=self.config["thresholds"]["sinr"],
            rsrq_threshold=self.config["thresholds"]["rsrq"],
            cqi_threshold=self.config["thresholds"]["cqi"],
            failure_types=self.config["failure_types"]
        )

    @classmethod
    def from_json(cls, filepath: str) -> "AEIFailureConfigManager":
        if not os.path.exists(filepath):
            raise FileNotFoundError(f"Configuration file not found: {filepath}")
        try:
            with open(filepath, 'r') as f:
                config_data = json.load(f)
            logger.info(f"Loaded configuration from JSON: {filepath}")
            return cls(override_config=config_data)
        except json.JSONDecodeError as e:
            raise ConfigurationError(f"Invalid JSON format in {filepath}: {e}") from e
        except Exception as e:
            raise ConfigurationError(f"Failed to load config from {filepath}: {e}") from e

    @classmethod
    def from_dict(cls, config_dict: Dict[str, Any]) -> "AEIFailureConfigManager":
        logger.info("Creating AEIFailureConfigManager from dictionary.")
        return cls(override_config=config_dict)

    def _get_config_schema(self) -> Dict[str, Any]:
        return {
            "thresholds": {
                "handover_user": (int, float), "sinr": (int, float),
                "rsrq": (int, float), "cqi": int
            },
            "probabilities": {
                "power_outage": (int, float), "dropped_call": (int, float),
                "cqi_penalty": (int, float)
            },
            "failure_types": list
        }

    def _validate_for_unknown_keys(self, config_part: Dict, schema_part: Dict, path: str = ""):
        unknown_keys = [k for k in config_part if k not in schema_part]
        if unknown_keys:
            expected_keys_str = ", ".join(sorted(schema_part.keys()))
            raise ConfigurationError(f"Unknown configuration keys found at '{path}': {', '.join(unknown_keys)}. Expected keys: {expected_keys_str}.")

        for key, value in config_part.items():
            if isinstance(value, dict) and key in schema_part and isinstance(schema_part[key], dict):
                self._validate_for_unknown_keys(value, schema_part[key], f"{path}{key}.")

    def _validate_structure_and_types(self, config_part: Any, schema_part: Any, path: str = ""):
        if path == "" and not isinstance(config_part, dict):
            raise ConfigurationError(f"Configuration at '{path}' must be a dictionary. Got {type(config_part).__name__}.")

        if not isinstance(schema_part, dict):
            if not isinstance(config_part, schema_part):
                if isinstance(schema_part, tuple):
                    expected_name = " or ".join([t.__name__ for t in schema_part])
                else:
                    expected_name = schema_part.__name__
                raise ConfigurationError(f"'{path}' must be of type {expected_name}. Got {type(config_part).__name__}.")
            return

        for key, expected_type in schema_part.items():
            if key not in config_part:
                continue

            current_value = config_part[key]
            current_path = f"{path}{key}" if path else key

            if isinstance(expected_type, dict):
                if not isinstance(current_value, dict):
                    raise ConfigurationError(f"'{current_path}' must be a dictionary, not {type(current_value).__name__}")
                self._validate_structure_and_types(current_value, expected_type, current_path + ".")
            else:
                if not isinstance(current_value, expected_type):
                    if isinstance(expected_type, tuple):
                        expected_name = " or ".join([t.__name__ for t in expected_type])
                    else:
                        expected_name = expected_type.__name__
                    raise ConfigurationError(f"'{current_path}' must be of type {expected_name}. Got {type(current_value).__name__}.")

    def _validate_physical_constraints(self):
        probs = self.config.get("probabilities", {})
        for name, value in probs.items():
            if not 0 <= value <= 1:
                raise ConfigurationError(f"'{name}' probability ({value}) must be between 0 and 1.")

        thresholds = self.config.get("thresholds", {})
        if "handover_user" in thresholds and thresholds["handover_user"] < 0:
            raise ConfigurationError("Handover user threshold cannot be negative.")

        if "rsrq" in thresholds:
            if not -30 <= thresholds["rsrq"] <= 0:
                raise ConfigurationError(f"RSRQ threshold ({thresholds['rsrq']} dB) must be between -30 and 0 dB.")

        if "cqi" in thresholds:
            if not 1 <= thresholds["cqi"] <= 15:
                raise ConfigurationError(f"CQI threshold ({thresholds['cqi']}) must be between 1 and 15.")

        failure_types = self.config.get("failure_types", [])
        for f_type_str in failure_types:
            if f_type_str not in FailureType.__members__:
                raise ConfigurationError(f"Invalid failure type '{f_type_str}'.")

    def _generate_documentation(self) -> str:
        thresholds_doc = json.dumps(self.config['thresholds'], indent=2)
        probabilities_doc = json.dumps(self.config['probabilities'], indent=2)
        failure_types_doc = json.dumps([ft.name for ft in FailureType], indent=2)

        return f"""## AEI Failure Simulation Configuration Documentation

This document outlines the current configuration for the AEI Failure Simulation System.

### Thresholds
{thresholds_doc}

### Probabilities
{probabilities_doc}

### Available Failure Types
{failure_types_doc}"""

    @staticmethod
    def _deep_merge(base: Dict, update: Dict) -> Dict:
        for key, value in update.items():
            if isinstance(value, dict) and key in base and isinstance(base[key], dict):
                base[key] = AEIFailureConfigManager._deep_merge(base[key], value)
            else:
                base[key] = value
        return base
''')

# ==============================================================================
# 3. HELPER FUNCTIONS AND MAIN LOGIC
# ==============================================================================

def setup_environment():
    """Create directory structure and write the module file."""
    logger.info("Setting up test environment")
    os.makedirs(BASE_DIR, exist_ok=True)
    for path_segment in ["/aei", "/aei/telecom", "/aei/telecom/failure"]:
        init_path = os.path.join("/kaggle/working", path_segment.lstrip('/'), "__init__.py")
        open(init_path, 'a').close()

    with open(CONFIG_FILE_PATH, 'w') as f:
        f.write(failure_config_code)
    logger.info(f"Module saved to {CONFIG_FILE_PATH}")

def import_dynamically():
    """Handles path, module reloading, and dynamic import."""
    logger.info("Dynamically Importing Module")
    if "/kaggle/working" not in sys.path:
        sys.path.append("/kaggle/working")
    
    if MODULE_NAME in sys.modules:
        del sys.modules[MODULE_NAME]
    
    spec = importlib.util.spec_from_file_location(MODULE_NAME, CONFIG_FILE_PATH)
    if spec is None or spec.loader is None:
        raise ImportError(f"Could not load spec for module {MODULE_NAME}")
    
    config_module = importlib.util.module_from_spec(spec)
    sys.modules[MODULE_NAME] = config_module
    spec.loader.exec_module(config_module)
    return config_module

def main():
    """Main execution function to set up, import, and test the module."""
    setup_environment()
    config_module = import_dynamically()
    
    # Get the required classes
    AEIFailureConfigManager = config_module.AEIFailureConfigManager
    ConfigurationError = config_module.ConfigurationError
    FailureConfig = config_module.FailureConfig

    # Test default configuration
    logger.info("Testing default configuration")
    try:
        default_manager = AEIFailureConfigManager()
        default_config = default_manager.create_failure_config()
        logger.info("Default configuration test passed")
    except Exception as e:
        logger.error(f"Default configuration test failed: {e}")

    # Test custom configuration
    logger.info("Testing custom configuration")
    try:
        custom_config = {
            "thresholds": {"cqi": 10, "sinr": 2.5},
            "probabilities": {"dropped_call": 0.5, "cqi_penalty": 0.5},
            "failure_types": ["POWER", "BACKHAUL"]
        }
        custom_manager = AEIFailureConfigManager(override_config=custom_config)
        custom_config_obj = custom_manager.create_failure_config()
        logger.info("Custom configuration test passed")
    except Exception as e:
        logger.error(f"Custom configuration test failed: {e}")

if __name__ == "__main__":
    try:
        main()
    except Exception as e:
        logger.critical(f"Script failed: {e}", exc_info=True)
```

```python
import os
import sys
import pandas as pd
import numpy as np
import logging

# Configure logging to see messages from the simulator
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')


# Ensure necessary directories and __init__.py files exist
os.makedirs("/kaggle/working/aei/telecom/failure", exist_ok=True)
open("/kaggle/working/aei/__init__.py", "a").close()
open("/kaggle/working/aei/telecom/__init__.py", "a").close()
open("/kaggle/working/aei/telecom/failure/__init__.py", "a").close()

# The corrected code for failure_simulator.py
# We need to rewrite the file each time to ensure the latest version is used
# and handle Prometheus import gracefully within the file itself.
corrected_failure_simulator_code = '''
# -------------------------------
# ⚠️ AEI Telecom Failure Simulator
# -------------------------------
# aei/telecom/failure/failure_simulator.py
# Author: AID Edge Inc. Network Reliability Team
# Date: 2025-07-16
# Version: 1.2.0 (with Enterprise SDK enhancements)

import numpy as np
import logging
import pandas as pd
from typing import Dict, Any, Optional, List
from dataclasses import dataclass, field
from enum import Enum, auto
from functools import wraps
from time import perf_counter
import json # For data quality report

# --- Prometheus Integration ---
# Install via: pip install prometheus_client
try:
    from prometheus_client import Counter, Gauge, generate_latest, REGISTRY as PROMETHEUS_REGISTRY_GLOBAL # Import REGISTRY directly
    PROMETHEUS_AVAILABLE = True
except ImportError:
    PROMETHEUS_AVAILABLE = False
    logging.warning("Prometheus client not installed. Metrics will not be exported. Run 'pip install prometheus_client'")

# Import from your new config module
from aei.telecom.failure.config import FailureConfig, FailureType, ConfigurationError, AEIFailureConfigManager

# Setup logging
logger = logging.getLogger(__name__)

# Prometheus metrics are now initialized within the AEIFailureSimulator's __init__ method
# to ensure the class is defined before metrics are associated with it.

def timed_operation(func):
    """Decorator to log execution time of functions and update Prometheus gauge."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start_time = perf_counter()
        result = func(*args, **kwargs)
        end_time = perf_counter()
        duration = end_time - start_time
        logger.debug(f"Operation '{func.__name__}' completed in {duration:.4f} seconds.")
        
        # Record the duration of 'simulate' on the class-level Prometheus gauge,
        # which the first AEIFailureSimulator instance creates in __init__.
        # The class name is resolved at call time, after the class is defined.
        if PROMETHEUS_AVAILABLE and func.__name__ == 'simulate':
            gauge = getattr(AEIFailureSimulator, '_sim_simulation_duration_seconds_gauge', None)
            if gauge is not None:
                gauge.set(duration)
            else:
                logger.warning("Prometheus gauge 'sim_simulation_duration_seconds' not found or initialized for timed_operation.")
        
        return result
    return wrapper

class AEIFailureSimulator:
    """
    Enterprise-grade failure simulation for telecom networks.
    
    Features:
    - Configurable failure thresholds via FailureConfig object.
    - Detailed logging and performance monitoring.
    - Type safety and input validation.
    - Reproducible simulations using a random seed.
    - Compatibility layer for existing data generation pipelines.
    - Prometheus metrics integration for operational monitoring.
    - (Conceptual) JSON config encryption support.
    - (Conceptual) Data Quality Report generation.
    - (Conceptual) Dask compatibility for large datasets.
    """
    
    # Class-level Prometheus metric handles, created once by the first instance
    # and shared across all instances. They are defined unconditionally and
    # stay None when prometheus_client is not installed.
    _metrics_initialized = False
    _sim_total_simulations_counter = None
    _sim_failure_type_counts_counter = None
    _sim_high_level_failure_counts_counter = None
    _sim_avg_request_failure_rate_gauge = None
    _sim_simulation_duration_seconds_gauge = None

    def __init__(self, seed: Optional[int] = None, config: Optional[FailureConfig] = None):
        """
        Initializes the AEIFailureSimulator.

        Args:
            seed (Optional[int]): Seed for random number generation for reproducibility.
            config (Optional[FailureConfig]): Configuration object for failure parameters.
                                            If None, a default FailureConfig is created
                                            using AEIFailureConfigManager's defaults.
        
        Raises:
            ConfigurationError: If the provided config is invalid during internal validation.
        """
        self.rng = np.random.RandomState(seed) # Initialize RNG with seed
        
        if config is None:
            # If no config is provided, use the AEIFailureConfigManager to get a default one.
            try:
                default_config_manager = AEIFailureConfigManager()
                self.config = default_config_manager.create_failure_config()
            except ConfigurationError as e:
                logger.critical(f"Failed to load default configuration for simulator: {e}")
                raise # Re-raise as it's a critical setup failure
        else:
            if not isinstance(config, FailureConfig):
                raise TypeError("Provided 'config' must be an instance of FailureConfig.")
            self.config = config
        
        self._validate_config_internal() # Validate the received/created FailureConfig
        logger.info(
            f"AEIFailureSimulator initialized with seed: {seed}, "
            f"Failure Types: {self.config.failure_types}, "
            f"Power Outage Prob: {self.config.power_outage_prob}"
        )
        
        # --- Prometheus metrics initialization (runs once, on the first instance) ---
        if PROMETHEUS_AVAILABLE and not AEIFailureSimulator._metrics_initialized:
            logger.info("Initializing Prometheus metrics for AEIFailureSimulator.")
            AEIFailureSimulator._sim_total_simulations_counter = Counter(
                'aei_telecom_sim_total_simulations',
                'Total number of telecom failure simulations run',
                registry=PROMETHEUS_REGISTRY_GLOBAL # Explicitly use the global registry
            )
            AEIFailureSimulator._sim_failure_type_counts_counter = Counter(
                'aei_telecom_sim_failure_type_total',
                'Total count of specific failure types detected',
                ['failure_type_detailed'],
                registry=PROMETHEUS_REGISTRY_GLOBAL
            )
            AEIFailureSimulator._sim_high_level_failure_counts_counter = Counter(
                'aei_telecom_sim_high_level_failure_total',
                'Total count of high-level failure types assigned',
                ['failure_type_high_level'],
                registry=PROMETHEUS_REGISTRY_GLOBAL
            )
            AEIFailureSimulator._sim_avg_request_failure_rate_gauge = Gauge(
                'aei_telecom_sim_avg_request_failure_rate',
                'Average Request Failure Rate from the last simulation run',
                registry=PROMETHEUS_REGISTRY_GLOBAL
            )
            AEIFailureSimulator._sim_simulation_duration_seconds_gauge = Gauge(
                'aei_telecom_sim_simulation_duration_seconds',
                'Duration of the last comprehensive simulation run',
                registry=PROMETHEUS_REGISTRY_GLOBAL
            )
            AEIFailureSimulator._metrics_initialized = True # Set flag after initialization
            logger.debug("Prometheus metrics initialized.")

        # Count each simulator instance. Note that this increments on construction,
        # not on each call to simulate() or simulate_failures().
        if PROMETHEUS_AVAILABLE and AEIFailureSimulator._sim_total_simulations_counter:
            AEIFailureSimulator._sim_total_simulations_counter.inc()


    def _validate_config_internal(self):
        """
        Performs internal validation on the FailureConfig object.
        This is a safeguard, as AEIFailureConfigManager should already validate.
        """
        if not 0 <= self.config.power_outage_prob <= 1:
            raise ConfigurationError("Power outage probability must be between 0 and 1.")
        if self.config.handover_user_threshold < 0:
            raise ConfigurationError("Handover user threshold cannot be negative.")
        if not isinstance(self.config.failure_types, list) or not self.config.failure_types:
            raise ConfigurationError("failure_types must be a non-empty list of strings.")
        
        # Verify that all failure types listed in config are valid Enum members
        for f_type_str in self.config.failure_types:
            try:
                FailureType[f_type_str]
            except KeyError:
                raise ConfigurationError(f"Configured failure type '{f_type_str}' is not a valid "
                                         f"FailureType enum member. Valid types: {[t.name for t in FailureType]}.")
        logger.debug("FailureConfig internally validated successfully.")

    @timed_operation
    def simulate_failures(self, df: pd.DataFrame, failure_prob_overall: float = 0.05) -> pd.DataFrame:
        """
        Simulates failures and assigns a high-level 'failure_type' based on simulation outcomes.
        This method acts as an adapter for compatibility, ensuring 'failure_type' is derived
        from detailed simulation results.

        Args:
            df (pd.DataFrame): Input DataFrame with telecom metrics.
            failure_prob_overall (float): An overall probability to randomly introduce generic
                                            failure types where no specific failure is detected.
                                            This is for compatibility and can be set to 0.0 if not needed.

        Returns:
            pd.DataFrame: DataFrame with added detailed failure columns and a high-level 'failure_type'.
        """
        if df.empty:
            logger.warning("Received empty DataFrame for failure simulation. Returning empty DataFrame.")
            return df.copy()

        df_processed = df.copy()
        
        # 1. First, run the detailed simulations to get specific failure indicators
        df_processed = self.simulate(df_processed)

        # 2. Determine the high-level 'failure_type' based on the detailed simulations
        # Initialize with NONE
        df_processed['failure_type'] = FailureType.NONE.name

        # Assign specific failure types where they occur. Later assignments overwrite
        # earlier ones, so the effective priority is
        # DROPPED_CALL > HANDOVER_FAILURE > POWER_OUTAGE.
        df_processed.loc[df_processed['Power_Outage'] == 1, 'failure_type'] = FailureType.POWER_OUTAGE.name
        df_processed.loc[df_processed['Handover_Failures'] == 1, 'failure_type'] = FailureType.HANDOVER_FAILURE.name
        df_processed.loc[df_processed['Dropped_Calls'] == 1, 'failure_type'] = FailureType.DROPPED_CALL.name
        
        # 3. Apply generic failure types for remaining 'NONE' entries based on overall probability
        remaining_mask = (df_processed['failure_type'] == FailureType.NONE.name)
        if remaining_mask.any() and failure_prob_overall > 0:
            # Generate random assignments ONLY for the `remaining_mask` rows
            random_values = self.rng.random(size=remaining_mask.sum())
            
            # Create a Series from random_values < failure_prob_overall to align with remaining_mask's index
            # This ensures proper alignment and broadcasting
            temp_random_assign_mask = pd.Series(random_values < failure_prob_overall, index=df_processed[remaining_mask].index)

            # Filter failure types that are *not* the specific ones calculated above
            # Ensure we only pick from the `self.config.failure_types` that are enabled
            generic_failure_types = [
                ft for ft in self.config.failure_types
                if ft not in [FailureType.DROPPED_CALL.name, FailureType.HANDOVER_FAILURE.name, FailureType.POWER_OUTAGE.name]
            ]
            
            if generic_failure_types:
                # Combine the masks correctly by index for the final assignment
                final_assignment_mask = remaining_mask & temp_random_assign_mask
                
                # Assign chosen generic failure types to the filtered rows
                df_processed.loc[final_assignment_mask, 'failure_type'] = \
                    self.rng.choice(generic_failure_types, size=final_assignment_mask.sum())
            else:
                logger.debug("No generic failure types configured to assign to 'NONE' entries.")

        logger.info(f"High-level 'failure_type' assigned to {len(df_processed)} records.")
        
        # --- Prometheus: Increment high-level failure type counts ---
        if PROMETHEUS_AVAILABLE and AEIFailureSimulator._sim_high_level_failure_counts_counter:
            # value_counts() yields one (label, count) pair per failure type in a single pass
            for f_type, count in df_processed['failure_type'].value_counts().items():
                AEIFailureSimulator._sim_high_level_failure_counts_counter.labels(f_type).inc(int(count))

        return df_processed

    @timed_operation
    def simulate(self, df: pd.DataFrame) -> pd.DataFrame:
        """
        Simulate network failures on telecom data. This is the comprehensive method
        that adds detailed failure indicator columns.
        
        Args:
            df (pd.DataFrame): DataFrame containing network metrics (RSRP, RSRQ, SINR, CQI, device_count).
                               Note: "User Count" is expected to be "device_count" from TelecomDataGenerator.
                               
        Returns:
            pd.DataFrame: DataFrame with added failure columns:
                            - Dropped_Calls (binary: 0 or 1)
                            - Handover_Failures (binary: 0 or 1)
                            - Power_Outage (binary: 0 or 1)
                            - Request_Failure_Rate (float: 0-100)
        
        Raises:
            ValueError: If required columns are missing in the input DataFrame.
            Exception: For other unexpected errors during simulation.
        """
        self._validate_input(df)
        
        try:
            df_processed = df.copy()
            df_processed["Dropped_Calls"] = self._calculate_dropped_calls(df_processed)
            df_processed["Handover_Failures"] = self._calculate_handover_failures(df_processed)
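            # Assign positionally: the helper returns a Series with a default RangeIndex,
            # which would misalign (and produce NaN) if df has a non-default index.
            df_processed["Power_Outage"] = self._simulate_power_outages(len(df_processed)).to_numpy()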
            df_processed["Request_Failure_Rate"] = self._calculate_failure_rate(df_processed)
            
            logger.info(f"Simulated detailed failures on {len(df_processed)} records.")
            self._log_failure_stats(df_processed)

            # --- Prometheus: Update detailed failure type counts ---
            if PROMETHEUS_AVAILABLE and AEIFailureSimulator._sim_failure_type_counts_counter and AEIFailureSimulator._sim_avg_request_failure_rate_gauge:
                AEIFailureSimulator._sim_failure_type_counts_counter.labels(FailureType.DROPPED_CALL.name).inc(df_processed["Dropped_Calls"].sum())
                AEIFailureSimulator._sim_failure_type_counts_counter.labels(FailureType.HANDOVER_FAILURE.name).inc(df_processed["Handover_Failures"].sum())
                AEIFailureSimulator._sim_failure_type_counts_counter.labels(FailureType.POWER_OUTAGE.name).inc(df_processed["Power_Outage"].sum())
                
                if not df_processed.empty:
                    AEIFailureSimulator._sim_avg_request_failure_rate_gauge.set(df_processed["Request_Failure_Rate"].mean())
                else:
                    AEIFailureSimulator._sim_avg_request_failure_rate_gauge.set(0) # Reset if no data

            return df_processed
            
        except Exception as e:
            logger.error(f"Comprehensive failure simulation error: {str(e)}", exc_info=True)
            raise # Re-raise the exception after logging for upstream handling
            
    def _validate_input(self, df: pd.DataFrame):
        """
        Validate input DataFrame structure before simulation.
        
        Args:
            df (pd.DataFrame): The input DataFrame to validate.
        
        Raises:
            ValueError: If required columns are missing.
        """
        required_columns = {"RSRP", "RSRQ", "SINR", "CQI", "device_count"}
        missing = required_columns - set(df.columns)
        if missing:
            raise ValueError(f"Missing required columns for failure simulation: {missing}. "
                             f"Required: {required_columns}, Found: {set(df.columns)}")
        logger.debug("Input DataFrame validated for failure simulation.")
            
    def _calculate_dropped_calls(self, df: pd.DataFrame) -> pd.Series:
        """
        Calculate dropped calls based on signal quality (SINR and RSRQ thresholds).
        """
        sinr = pd.to_numeric(df["SINR"], errors='coerce').fillna(self.config.sinr_threshold - 1) # Treat NaNs as bad signal
        rsrq = pd.to_numeric(df["RSRQ"], errors='coerce').fillna(self.config.rsrq_threshold - 1) # Treat NaNs as bad signal
        return ((sinr < self.config.sinr_threshold) | 
                (rsrq < self.config.rsrq_threshold)).astype(int)
    
    def _calculate_handover_failures(self, df: pd.DataFrame) -> pd.Series:
        """
        Calculate handover failures based on low CQI and high device count.
        """
        cqi = pd.to_numeric(df["CQI"], errors='coerce').fillna(self.config.cqi_threshold - 1) # Treat NaNs as bad CQI
        device_count = pd.to_numeric(df["device_count"], errors='coerce').fillna(0)
        return ((cqi < self.config.cqi_threshold) & 
                (device_count > self.config.handover_user_threshold)).astype(int)
    
    def _simulate_power_outages(self, size: int) -> pd.Series:
        """
        Simulate random power outages based on a global probability.
        """
        return pd.Series(self.rng.rand(size) < self.config.power_outage_prob).astype(int)
    
    def _calculate_failure_rate(self, df: pd.DataFrame) -> pd.Series:
        """
        Calculate a composite Request Failure Rate metric based on dropped calls and CQI.
        The rate is clipped between 0 and 100.
        """
        cqi = pd.to_numeric(df["CQI"], errors='coerce').fillna(1) # Treat NaN CQI as the worst case (CQI 1, maximum penalty)
        
        # Normalize CQI to a 0-1 scale, then invert for penalty (lower CQI = higher penalty)
        # Assuming CQI ranges from 1 to 15 based on standard definition
        cqi_normalized_penalty = 1 - ((cqi - 1) / 14) # If CQI is 1, penalty is 1. If CQI is 15, penalty is 0.
        cqi_normalized_penalty = np.clip(cqi_normalized_penalty, 0, 1) # Keep the penalty within [0, 1] for out-of-range CQI values

        composite_rate = (self.config.dropped_call_weight * df["Dropped_Calls"] * 100 +
                          self.config.cqi_penalty_weight * cqi_normalized_penalty * 100)
        
        return np.clip(composite_rate, 0, 100) # Clip the final rate between 0 and 100
    
    def _log_failure_stats(self, df: pd.DataFrame):
        """Log statistics about simulated failures."""
        if df.empty:
            logger.info("No data to log failure statistics for.")
            return

        stats = {
            FailureType.DROPPED_CALL: df["Dropped_Calls"].sum(),
            FailureType.HANDOVER_FAILURE: df["Handover_Failures"].sum(),
            FailureType.POWER_OUTAGE: df["Power_Outage"].sum()
        }
        
        total_records = len(df)
        for failure_type, count in stats.items():
            rate = (count / total_records * 100) if total_records > 0 else 0.0
            logger.info(f"{failure_type.name}: {count} occurrences ({rate:.2f}%)")
        
        # Log distribution of the high-level 'failure_type'
        if 'failure_type' in df.columns:
            logger.info("High-level 'failure_type' distribution:")
            type_counts = df['failure_type'].value_counts()
            for f_type, count in type_counts.items():
                rate = (count / total_records * 100) if total_records > 0 else 0.0
                logger.info(f"  - {f_type}: {count} ({rate:.2f}%)")

    # --- Enterprise-Level SDK Enhancements (Conceptual/Placeholder Methods) ---

    def generate_data_quality_report(self, df_simulated: pd.DataFrame, output_path: str = "data_quality_report.json"):
        """
        Generates a basic data quality report summarizing simulation outcomes.
        For an enterprise SDK, this would be much more detailed (e.g., HTML, visual summaries).
        
        Args:
            df_simulated (pd.DataFrame): The DataFrame after simulation.
            output_path (str): Path to save the report (e.g., JSON, HTML).
        """
        if df_simulated.empty:
            logger.warning("Cannot generate data quality report for an empty DataFrame.")
            return

        report_summary = {
            "total_records": len(df_simulated),
            "simulated_timestamp_utc": pd.Timestamp.now(tz='UTC').isoformat(),
            "failure_counts": {},
            "failure_percentages": {},
            "request_failure_rate_stats": {
                "mean": df_simulated["Request_Failure_Rate"].mean(),
                "median": df_simulated["Request_Failure_Rate"].median(),
                "std": df_simulated["Request_Failure_Rate"].std(),
                "min": df_simulated["Request_Failure_Rate"].min(),
                "max": df_simulated["Request_Failure_Rate"].max(),
            },
            "high_level_failure_type_distribution": df_simulated['failure_type'].value_counts().to_dict()
        }

        detailed_failure_columns = ["Dropped_Calls", "Handover_Failures", "Power_Outage"]
        for col in detailed_failure_columns:
            if col in df_simulated.columns:
                count = int(df_simulated[col].sum())  # Cast: numpy integers are not JSON serializable
                report_summary["failure_counts"][col] = count
                report_summary["failure_percentages"][col] = (count / len(df_simulated) * 100) if len(df_simulated) > 0 else 0.0
        
        try:
            with open(output_path, 'w') as f:
                json.dump(report_summary, f, indent=4)
            logger.info(f"Data quality report saved to {output_path}")
        except Exception as e:
            logger.error(f"Failed to save data quality report: {e}", exc_info=True)

    @staticmethod
    def decrypt_config_json(encrypted_filepath: str, decryption_key: bytes) -> Dict[str, Any]:
        """
        Conceptual method for decrypting a JSON configuration file.
        This would involve a proper cryptography library (e.g., PyCryptodome or cryptography.fernet).
        
        Args:
            encrypted_filepath (str): Path to the encrypted JSON file.
            decryption_key (bytes): The decryption key.
            
        Returns:
            Dict[str, Any]: The decrypted configuration dictionary.
            
        Raises:
            NotImplementedError: As full encryption/decryption is complex and not fully implemented here.
            
        Example (conceptual):
            from cryptography.fernet import Fernet
            # key = Fernet.generate_key() # Generate once, store securely
            # f = Fernet(key)
            # encrypted_data = f.encrypt(json.dumps(config_dict).encode())
            # with open(filepath, 'wb') as file: file.write(encrypted_data)
            
            # To decrypt:
            # with open(encrypted_filepath, 'rb') as file: encrypted_content = file.read()
            # decrypted_bytes = f.decrypt(encrypted_content)
            # return json.loads(decrypted_bytes.decode())
        """
        logger.warning("Decryption functionality is conceptual and not fully implemented.")
        raise NotImplementedError("JSON config decryption requires a robust key management and cryptography implementation.")

    def process_with_dask(self, dask_dataframe) -> Any: # Returns a Dask DataFrame conceptually
        """
        Conceptual method to process a Dask DataFrame.
        This would require adapting _calculate_dropped_calls, _calculate_handover_failures, etc.,
        to use Dask DataFrame operations directly or via Dask's apply/map_partitions.
        
        Args:
            dask_dataframe: A Dask DataFrame with required columns.
            
        Returns:
            A Dask DataFrame with simulated failures.
            
        Raises:
            NotImplementedError: As full Dask integration requires significant refactoring.
        """
        logger.warning("Dask integration is conceptual and not fully implemented.")
        # Example of how you might start:
        # if not hasattr(dask_dataframe, 'dask'):
        #     raise TypeError("Input must be a Dask DataFrame.")
        #
        # dask_dataframe_processed = dask_dataframe.copy()
        # dask_dataframe_processed["Dropped_Calls"] = dask_dataframe_processed.apply(
        #     lambda row: self._calculate_dropped_calls(pd.DataFrame([row])), axis=1, meta=('Dropped_Calls', int)
        # )
        # This is a simplified example, direct dask operations are better for performance.
        
        raise NotImplementedError("Dask compatibility requires significant refactoring of core calculation methods to be Dask-aware.")
'''

# Save to file
with open("/kaggle/working/aei/telecom/failure/failure_simulator.py", "w") as f:
    f.write(corrected_failure_simulator_code)

# Add /kaggle/working to sys.path so Python can find the aei package
if "/kaggle/working" not in sys.path:
    sys.path.append("/kaggle/working")

# Reload the module to ensure changes are picked up, or remove from sys.modules
# This is crucial in interactive environments where modules are cached.
if 'aei.telecom.failure.failure_simulator' in sys.modules:
    del sys.modules['aei.telecom.failure.failure_simulator']
if 'aei.telecom.failure.config' in sys.modules:
    del sys.modules['aei.telecom.failure.config']


# Import the necessary components AFTER deleting from sys.modules if they were loaded before
from aei.telecom.failure.config import AEIFailureConfigManager, FailureConfig, FailureType, ConfigurationError
from aei.telecom.failure.failure_simulator import AEIFailureSimulator, PROMETHEUS_AVAILABLE

# If Prometheus is available, ensure its registry is clean for this run
if PROMETHEUS_AVAILABLE:
    from prometheus_client import REGISTRY, PROCESS_COLLECTOR, PLATFORM_COLLECTOR

    # Clear the default registry for testing
    # This is a common workaround for older prometheus_client versions
    # and effectively removes all metrics from the default registry.
    if hasattr(REGISTRY, '_names_to_collectors'):
        REGISTRY._names_to_collectors.clear()
    if hasattr(REGISTRY, '_collector_to_names'):
        REGISTRY._collector_to_names.clear()
    
    # Also unregister the default collectors explicitly. After the clear above they
    # are usually already gone, in which case unregister() raises and is ignored.
    for collector_name, collector in (("PROCESS_COLLECTOR", PROCESS_COLLECTOR),
                                      ("PLATFORM_COLLECTOR", PLATFORM_COLLECTOR)):
        try:
            REGISTRY.unregister(collector)
        except (KeyError, ValueError): # Collector is not registered
            pass
        except Exception as e:
            print(f"Warning: Could not unregister {collector_name}: {e}")

    print("Prometheus registry cleared for testing.")


try:
    print("Successfully imported AEIFailureSimulator from the corrected file!")

    # Example test DataFrame (ensure it has required columns)
    test_df = pd.DataFrame({
        'RSRP': np.random.uniform(-100, -50, 10),
        'RSRQ': np.random.uniform(-20, -5, 10),
        'SINR': np.random.uniform(-5, 25, 10),
        'CQI': np.random.randint(1, 16, 10),
        'device_count': np.random.randint(10, 100, 10)
    })

    # Create a custom config using AEIFailureConfigManager
    custom_config_manager = AEIFailureConfigManager.from_dict({
        "thresholds": {
            "handover_user": 70,
            "sinr": -3.0,
            "rsrq": -18.0,
            "cqi": 2
        },
        "probabilities": {
            "power_outage": 0.5, # High probability for testing
            "dropped_call": 0.5,
            "cqi_penalty": 0.5
        },
        # Use the names of the FailureType enum members as strings
        "failure_types": [
            FailureType.POWER.name,
            FailureType.SOFTWARE.name,
            FailureType.HARDWARE.name # Add a few more generic types
        ]
    })
    custom_failure_config = custom_config_manager.create_failure_config()

    # Initialize simulator with the created config
    simulator = AEIFailureSimulator(seed=42, config=custom_failure_config)
    result_df = simulator.simulate_failures(test_df, failure_prob_overall=0.3)

    print("\nSample generated failures:")
    print(result_df[['RSRP', 'SINR', 'CQI', 'device_count', 'failure_type', 
                     'Dropped_Calls', 'Handover_Failures', 'Power_Outage', 'Request_Failure_Rate']].head())
    print("\nFailure Type Distribution (from simulate_failures):")
    print(result_df['failure_type'].value_counts())

    # Generate a data quality report
    report_path = "/kaggle/working/data_quality_report.json"
    simulator.generate_data_quality_report(result_df, output_path=report_path)
    print(f"\nData quality report generated at: {report_path}")

    # Optionally, print Prometheus metrics if PROMETHEUS_AVAILABLE
    if PROMETHEUS_AVAILABLE:
        from prometheus_client import generate_latest
        print("\nPrometheus Metrics:")
        print(generate_latest().decode('utf-8'))


except ConfigurationError as e:
    print(f"Configuration Error during test setup: {e}")
except ImportError as e:
    print(f"Error importing AEIFailureSimulator: {e}. Make sure the file is saved correctly and __init__.py files are present.")
except Exception as e:
    print(f"An unexpected error occurred during testing: {e}")

Prometheus registry cleared for testing.
Successfully imported AEIFailureSimulator from the corrected file!
An unexpected error occurred during testing: NONE
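
The bare `NONE` in the message above is what `print(f"...{e}")` produces when the exception text is only a name. One likely cause, assuming the imported `FailureType` enum has no `NONE` member (the `config.py` written in the next cell does not define one), is the `FailureType.NONE.name` lookup in `simulate_failures`. The sketch below uses a stand-in enum, `DemoFailureType`, and a hypothetical helper, `describe_exception`, to show how printing the exception class makes this kind of failure easier to read. The exact wording of the `AttributeError` message differs between Python versions.

```python
from enum import Enum


class DemoFailureType(Enum):
    """Stand-in enum for illustration; it deliberately has no NONE member."""
    POWER = "POWER"
    DROPPED_CALL = "DROPPED_CALL"


def describe_exception(exc: BaseException) -> str:
    """Format an exception as 'ClassName: message' (hypothetical helper)."""
    return f"{type(exc).__name__}: {exc}"


def lookup_missing_member() -> str:
    """Trigger the missing-member lookup and return a readable description."""
    try:
        DemoFailureType.NONE  # No such member, so the attribute lookup fails
    except AttributeError as exc:
        return describe_exception(exc)
    return "no error"


description = lookup_missing_member()
print(description)
```

Using the same formatting in the final `except Exception` handler of the test cell would show the exception class next to the message.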

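The composite `Request_Failure_Rate` computed by `_calculate_failure_rate` can be checked by hand. The sketch below restates the same arithmetic as a standalone function, assuming the 1 to 15 CQI range noted in the method and weights of 0.5 each, as in the test configuration above. `request_failure_rate` is a hypothetical name used only for illustration and is not part of the module.

```python
import numpy as np


def request_failure_rate(dropped, cqi, dropped_call_weight=0.5, cqi_penalty_weight=0.5):
    """Standalone restatement of the composite rate (illustrative only)."""
    dropped = np.asarray(dropped, dtype=float)
    cqi = np.asarray(cqi, dtype=float)
    # CQI 1 maps to a penalty of 1.0, CQI 15 to a penalty of 0.0.
    penalty = np.clip(1 - (cqi - 1) / 14, 0, 1)
    rate = dropped_call_weight * dropped * 100 + cqi_penalty_weight * penalty * 100
    return np.clip(rate, 0, 100)


# Three hand-checkable rows: worst case, mid-range CQI without a drop, best case.
rates = request_failure_rate(dropped=[1, 0, 0], cqi=[1, 8, 15])
print(rates)
```

With both weights at 0.5, a dropped call contributes 50 points and the CQI penalty contributes at most another 50, so the three rows come out to 100, 25, and 0.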

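`_simulate_power_outages` builds a `pd.Series` with a default `RangeIndex`. Column assignment in pandas aligns on index labels, so a Series like that only lines up with DataFrames that also use a default index. The sketch below shows the effect on a small made-up frame, independent of the simulator; the column names are reused only for familiarity.

```python
import numpy as np
import pandas as pd

# A frame whose index does not start at 0, e.g. after filtering or slicing.
df = pd.DataFrame({"CQI": [3, 7, 12]}, index=[10, 11, 12])

rng = np.random.RandomState(0)
flags = pd.Series(rng.rand(len(df)) < 0.5).astype(int)  # Index is 0, 1, 2

# Assigning the Series aligns on labels; none match, so every value is NaN.
aligned = df.assign(Power_Outage=flags)

# Assigning the underlying array is positional and keeps all values.
positional = df.assign(Power_Outage=flags.to_numpy())

print(aligned["Power_Outage"].isna().all())
print(positional["Power_Outage"].tolist() == flags.tolist())
```

Passing a NumPy array, or building the Series with the target frame's index, avoids the silent NaN column.
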
In [3]:
import os
import sys
import unittest
import json
import logging
import textwrap
import importlib.util
import shutil
import re
from io import StringIO

# Configure logging for the test runner script
logging.basicConfig(level=logging.DEBUG, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)

MODULE_NAME = "aei.telecom.failure.config"
BASE_DIR = "/kaggle/working/aei/telecom/failure"
CONFIG_FILE_PATH = os.path.join(BASE_DIR, "config.py")
project_root = "/kaggle/working"

if project_root not in sys.path:
    sys.path.insert(0, project_root)
    logger.info(f"Added {project_root} to sys.path.")

def cleanup_environment():
    """Removes created directories and files, including pycache."""
    logger.info("--- 🗑️ Cleaning up previous environment ---")
    if os.path.exists(os.path.join(project_root, "aei")):
        shutil.rmtree(os.path.join(project_root, "aei"), ignore_errors=True)
        logger.info("Removed old 'aei' directory.")
    if os.path.exists(os.path.join(project_root, "temp_test_config.json")):
        os.remove(os.path.join(project_root, "temp_test_config.json"))
        logger.info("Removed old 'temp_test_config.json'.")
    
    for root, dirs, files in os.walk(project_root):
        if '__pycache__' in dirs:
            shutil.rmtree(os.path.join(root, '__pycache__'), ignore_errors=True)
            dirs.remove('__pycache__')  # Do not descend into the directory just removed
            logger.info(f"Removed __pycache__ in {root}.")
    
    for module in list(sys.modules.keys()):
        if module.startswith('aei.telecom.failure') or module == MODULE_NAME:
            if module in sys.modules and hasattr(sys.modules[module], 'logger'):
                mod_logger = sys.modules[module].logger
                for handler in list(mod_logger.handlers):
                    mod_logger.removeHandler(handler)
                mod_logger.propagate = True
                mod_logger.setLevel(logging.NOTSET)
            del sys.modules[module]
            logger.debug(f"Cleanup: Cleared '{module}' from sys.modules.")

    target_logger = logging.getLogger("aei.telecom.failure.config")
    for handler in list(target_logger.handlers):
        target_logger.removeHandler(handler)
    target_logger.propagate = True
    target_logger.setLevel(logging.NOTSET)

    root_logger = logging.getLogger()
    for handler in list(root_logger.handlers):
        root_logger.removeHandler(handler)
    root_logger.setLevel(logging.WARNING)

    if hasattr(logging.Logger, 'manager') and hasattr(logging.Logger.manager, 'loggerDict'):
        for key in list(logging.Logger.manager.loggerDict.keys()):
            if key.startswith('aei.telecom.failure') or key == MODULE_NAME:
                if isinstance(logging.Logger.manager.loggerDict[key], logging.Logger):
                    temp_logger = logging.Logger.manager.loggerDict[key]
                    for handler in list(temp_logger.handlers):
                        temp_logger.removeHandler(handler)
                del logging.Logger.manager.loggerDict[key]
                logger.debug(f"Cleanup: Cleared logger '{key}' from manager.loggerDict.")

cleanup_environment()

def setup_module_environment():
    """Create directory structure and write the module file."""
    logger.info("--- ⚙️ Setting up module environment ---")
    os.makedirs(BASE_DIR, exist_ok=True)
    for path_segment in ["/aei", "/aei/telecom", "/aei/telecom/failure"]:
        init_path = os.path.join(project_root, path_segment.lstrip('/'), "__init__.py")
        os.makedirs(os.path.dirname(init_path), exist_ok=True)
        if not os.path.exists(init_path):
            open(init_path, "a").close()
            logger.debug(f"Created {init_path}")
    
    failure_config_code = textwrap.dedent('''
# -------------------------------
# ⚠️ AEI Failure Configuration Manager
# -------------------------------
# aei/telecom/failure/config.py
# Author: AID Edge Inc. Network Reliability Team
# Date: 2025-07-16 (Cleaned logger configuration for assertLogs compatibility)
# Version: 1.1.12

import json
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Dict, Any, Optional, List

import logging
import os

logger = logging.getLogger("aei.telecom.failure.config")
logger.setLevel(logging.DEBUG)
logger.propagate = True

class ConfigurationError(Exception):
    """Custom exception for configuration-related errors."""
    pass

class FailureType(Enum):
    POWER = "POWER"
    BACKHAUL = "BACKHAUL"
    SOFTWARE = "SOFTWARE"
    HARDWARE = "HARDWARE"
    HANDOVER_FAILURE = "HANDOVER_FAILURE"
    DROPPED_CALL = "DROPPED_CALL"
    POWER_OUTAGE = "POWER_OUTAGE"
    NONE = "NONE"  # No failure detected; referenced by AEIFailureSimulator.simulate_failures

@dataclass(frozen=True)
class FailureEventConfig:
    failure_id: int
    tower_id: str
    start_time: int  # Unix timestamp
    duration_minutes: int
    failure_type: str

@dataclass
class FailureEventListConfig:
    events: List[FailureEventConfig]

@dataclass(frozen=True)
class FailureConfig:
    handover_user_threshold: int = 50
    power_outage_prob: float = 0.01
    dropped_call_weight: float = 0.5
    cqi_penalty_weight: float = 0.5
    sinr_threshold: float = 3.0
    rsrq_threshold: float = -15.0
    cqi_threshold: float = 3.0
    rsrp_threshold: float = -110.0
    distance_weight: float = 0.4
    base_failure_rate: float = 0.05
    failure_types: List[str] = field(default_factory=lambda: [
        "POWER",
        "BACKHAUL",
        "SOFTWARE",
        "HANDOVER_FAILURE",
        "DROPPED_CALL"
    ])

@dataclass
class AEIFailureConfig:
    simulation: FailureConfig
    events: Optional[FailureEventListConfig] = None

class AEIFailureConfigManager:
    @staticmethod
    def _get_initial_defaults() -> Dict[str, Any]:
        return {
            "thresholds": {
                "handover_user": 50, "sinr": 0.0, "rsrq": -15.0, "cqi": 3
            },
            "probabilities": {
                "power_outage": 0.001, "dropped_call": 0.5, "cqi_penalty": 0.5
            },
            "failure_types": [
                "POWER", "BACKHAUL",
                "SOFTWARE", "HARDWARE"
            ]
        }

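    # Call through __func__: staticmethod objects are not directly callable
    # inside a class body before Python 3.10.
    _DEFAULTS = _get_initial_defaults.__func__()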

    def __init__(self, override_config: Optional[Dict[str, Any]] = None):
        if not isinstance(override_config, (dict, type(None))):
            raise ConfigurationError(f"Override config must be a dictionary or None. Got {type(override_config).__name__}.")

        self.config = AEIFailureConfigManager._deep_merge(
            AEIFailureConfigManager._get_initial_defaults(), override_config or {}
        )

        self._validate_for_unknown_keys(self.config, self._get_config_schema())
        self._validate_structure_and_types(self.config, self._get_config_schema())
        self._validate_physical_constraints()
        self._documentation = self._generate_documentation()
        logger.debug("AEIFailureConfigManager initialized successfully.")

    @property
    def documentation(self) -> str:
        return self._documentation

    def create_failure_config(self) -> "FailureConfig":
        return FailureConfig(
            handover_user_threshold=self.config["thresholds"]["handover_user"],
            power_outage_prob=self.config["probabilities"]["power_outage"],
            dropped_call_weight=self.config["probabilities"]["dropped_call"],
            cqi_penalty_weight=self.config["probabilities"]["cqi_penalty"],
            sinr_threshold=self.config["thresholds"]["sinr"],
            rsrq_threshold=self.config["thresholds"]["rsrq"],
            cqi_threshold=self.config["thresholds"]["cqi"],
            failure_types=self.config["failure_types"]
        )

    @classmethod
    def from_json(cls, filepath: str) -> "AEIFailureConfigManager":
        if not os.path.exists(filepath):
            raise FileNotFoundError(f"Configuration file not found: {filepath}")
        try:
            with open(filepath, 'r') as f:
                config_data = json.load(f)
            logger.info(f"Loaded configuration from JSON: {filepath}")
            return cls(override_config=config_data)
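        except json.JSONDecodeError as e:
            raise ConfigurationError(f"Invalid JSON format in {filepath}: {e}") from e
        except ConfigurationError:
            # Validation errors from the constructor already carry a precise message.
            raise
        except Exception as e:
            raise ConfigurationError(f"Failed to load config from {filepath}: {e}") from e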

    @classmethod
    def from_dict(cls, config_dict: Dict[str, Any]) -> "AEIFailureConfigManager":
        logger.info("Creating AEIFailureConfigManager from dictionary.")
        return cls(override_config=config_dict)

    def _get_config_schema(self) -> Dict[str, Any]:
        return {
            "thresholds": {
                "handover_user": (int, float), "sinr": (int, float),
                "rsrq": (int, float), "cqi": int
            },
            "probabilities": {
                "power_outage": (int, float), "dropped_call": (int, float),
                "cqi_penalty": (int, float)
            },
            "failure_types": list
        }

    def _validate_for_unknown_keys(self, config_part: Dict, schema_part: Dict, path: str = ""):
        unknown_keys = [k for k in config_part if k not in schema_part]
        if unknown_keys:
            expected_keys_str = ", ".join(sorted(schema_part.keys()))
            raise ConfigurationError(f"Unknown configuration keys found at '{path}': {', '.join(unknown_keys)}. Expected keys: {expected_keys_str}.")

        for key, value in config_part.items():
            if isinstance(value, dict) and key in schema_part and isinstance(schema_part[key], dict):
                self._validate_for_unknown_keys(value, schema_part[key], f"{path}{key}.")

    def _validate_structure_and_types(self, config_part: Any, schema_part: Any, path: str = ""):
        if path == "" and not isinstance(config_part, dict):
            raise ConfigurationError(f"Configuration at '{path}' must be a dictionary. Got {type(config_part).__name__}.")

        if not isinstance(schema_part, dict):
            if not isinstance(config_part, schema_part):
                if isinstance(schema_part, tuple):
                    expected_name = " or ".join([t.__name__ for t in schema_part])
                else:
                    expected_name = schema_part.__name__
                raise ConfigurationError(f"'{path}' must be of type {expected_name}. Got {type(config_part).__name__}.")
            return

        for key, expected_type in schema_part.items():
            if key not in config_part:
                continue

            current_value = config_part[key]
            current_path = f"{path}{key}" if path else key

            if isinstance(expected_type, dict):
                if not isinstance(current_value, dict):
                    raise ConfigurationError(f"'{current_path}' must be a dictionary, not {type(current_value).__name__}")
                self._validate_structure_and_types(current_value, expected_type, current_path + ".")
            else:
                if not isinstance(current_value, expected_type):
                    if isinstance(expected_type, tuple):
                        expected_name = " or ".join([t.__name__ for t in expected_type])
                    else:
                        expected_name = expected_type.__name__
                    raise ConfigurationError(f"'{current_path}' must be of type {expected_name}. Got {type(current_value).__name__}.")

    def _validate_physical_constraints(self):
        probs = self.config.get("probabilities", {})
        for name, value in probs.items():
            if not 0 <= value <= 1:
                raise ConfigurationError(f"'{name}' probability ({value}) must be between 0 and 1.")

        thresholds = self.config.get("thresholds", {})
        if "handover_user" in thresholds and thresholds["handover_user"] < 0:
            raise ConfigurationError("Handover user threshold cannot be negative.")

        if "rsrq" in thresholds:
            if not -30 <= thresholds["rsrq"] <= 0:
                raise ConfigurationError(f"RSRQ threshold ({thresholds['rsrq']} dB) must be between -30 and 0 dB.")

        if "cqi" in thresholds:
            logger.debug(f"CQI validation check: thresholds['cqi'] is {thresholds['cqi']}")
            if not 1 <= thresholds["cqi"] <= 15:
                raise ConfigurationError(f"CQI threshold ({thresholds['cqi']}) must be between 1 and 15.")

        if "sinr" in thresholds and thresholds["sinr"] > 5.0:
            logger.warning(f"SINR threshold ({thresholds['sinr']} dB) is positive and quite high. Typical SINR values under poor conditions are often negative.")

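        failure_types = self.config.get("failure_types", [])
        # Build the list of valid values once instead of on every iteration.
        valid_failure_types = [ft.value for ft in FailureType]
        for f_type_str in failure_types:
            if f_type_str not in valid_failure_types:
                raise ConfigurationError(f"Invalid failure type '{f_type_str}'.")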

        if "dropped_call" in probs and "cqi_penalty" in probs:
            total_weight = probs["dropped_call"] + probs["cqi_penalty"]
            if abs(total_weight - 1.0) > 1e-6:
                logger.warning(f"Weights for 'dropped_call' and 'cqi_penalty' sum to {total_weight}, which is not exactly 1.0. This might indicate an imbalance.")

    def _generate_documentation(self) -> str:
        thresholds_doc = json.dumps(self.config['thresholds'], indent=2)
        probabilities_doc = json.dumps(self.config['probabilities'], indent=2)
        failure_types_doc = json.dumps([ft.value for ft in FailureType], indent=2)

        return f"""## AEI Failure Simulation Configuration Documentation

This document outlines the current configuration for the AEI Failure Simulation System.

### Thresholds
{thresholds_doc}

### Probabilities
{probabilities_doc}

### Available Failure Types
{failure_types_doc}"""

    @staticmethod
    def _deep_merge(base: Dict, update: Dict) -> Dict:
        for key, value in update.items():
            if isinstance(value, dict) and key in base and isinstance(base[key], dict):
                base[key] = AEIFailureConfigManager._deep_merge(base[key], value)
            else:
                base[key] = value
        return base
''')
    
    with open(CONFIG_FILE_PATH, 'w') as f:
        f.write(failure_config_code)
    logger.info(f"✅ Module saved to {CONFIG_FILE_PATH}")

setup_module_environment()

for module in list(sys.modules.keys()):
    if module.startswith('aei.telecom.failure') or module == MODULE_NAME:
        if module in sys.modules and hasattr(sys.modules[module], 'logger'):
            mod_logger = sys.modules[module].logger
            for handler in list(mod_logger.handlers):
                mod_logger.removeHandler(handler)
            mod_logger.propagate = True
            mod_logger.setLevel(logging.NOTSET) 
        del sys.modules[module]
        logger.debug(f"Cleanup: Cleared '{module}' from sys.modules.")

spec = importlib.util.spec_from_file_location(MODULE_NAME, CONFIG_FILE_PATH)
if spec is None or spec.loader is None:
    raise ImportError(f"Could not load spec for module {MODULE_NAME}")
config_module = importlib.util.module_from_spec(spec)
sys.modules[MODULE_NAME] = config_module
spec.loader.exec_module(config_module)
logger.info(f"✅ Successfully loaded '{MODULE_NAME}'.")

if hasattr(config_module, 'AEIFailureConfigManager'):
    initial_defaults_check = config_module.AEIFailureConfigManager._get_initial_defaults()
    cqi_default_check_post_load = initial_defaults_check["thresholds"]["cqi"]
    logger.info(f"AEIFailureConfigManager._get_initial_defaults()['thresholds']['cqi'] after module load: {cqi_default_check_post_load}")
    if cqi_default_check_post_load != 3:
        logger.error(f"CRITICAL: Default CQI in _get_initial_defaults() is not 3! It is {cqi_default_check_post_load}")
        raise AssertionError(f"CRITICAL: Default CQI in _get_initial_defaults() is not 3! It is {cqi_default_check_post_load}")
    
    dc_prob_default = initial_defaults_check["probabilities"]["dropped_call"]
    cqi_pen_default = initial_defaults_check["probabilities"]["cqi_penalty"]
    if abs(dc_prob_default + cqi_pen_default - 1.0) > 1e-6:
        logger.error(f"CRITICAL: Default dropped_call ({dc_prob_default}) and cqi_penalty ({cqi_pen_default}) do not sum to 1.0!")
        raise AssertionError("CRITICAL: Default dropped_call and cqi_penalty do not sum to 1.0!")
else:
    logger.error("AEIFailureConfigManager class not found in loaded module.")

try:
    from aei.telecom.failure.config import AEIFailureConfigManager, FailureConfig, FailureType, ConfigurationError
    logger.info("Successfully imported AEIFailureConfigManager and related classes for unit testing.")

    class TestAEIFailureConfigManager(unittest.TestCase):
        @classmethod
        def tearDownClass(cls):
            cleanup_environment()
            logger.info("--- ✅ All tests completed and environment cleaned. ---")

        def setUp(self):
            self.test_module_logger = logging.getLogger("aei.telecom.failure.config")
            for handler in list(self.test_module_logger.handlers):
                self.test_module_logger.removeHandler(handler)
            self.test_module_logger.propagate = True
            self.test_module_logger.setLevel(logging.DEBUG)

            # Reset the root logger to a single stdout handler so every test
            # starts from the same logging state.
            root_logger = logging.getLogger()
            for handler in list(root_logger.handlers):
                root_logger.removeHandler(handler)
            root_logger.setLevel(logging.DEBUG)

            handler = logging.StreamHandler(sys.stdout)
            formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
            handler.setFormatter(formatter)
            root_logger.addHandler(handler)
            logger.debug("Added a StreamHandler to the root logger for general test visibility.")

            self.default_config_data = AEIFailureConfigManager._get_initial_defaults()

            self.temp_config_filepath = os.path.join(project_root, "temp_test_config.json")
            with open(self.temp_config_filepath, 'w') as f:
                json.dump(self.default_config_data, f)
            logging.debug(f"Created temporary config file: {self.temp_config_filepath}")

        def tearDown(self):
            if os.path.exists(self.temp_config_filepath):
                os.remove(self.temp_config_filepath)
            logging.debug(f"Removed temporary config file: {self.temp_config_filepath}")

            test_module_logger = logging.getLogger("aei.telecom.failure.config")
            for handler in list(test_module_logger.handlers):
                test_module_logger.removeHandler(handler)
            test_module_logger.propagate = True
            test_module_logger.setLevel(logging.NOTSET)

            root_logger = logging.getLogger()
            for handler in list(root_logger.handlers):
                root_logger.removeHandler(handler)
            root_logger.setLevel(logging.WARNING)

        def test_default_initialization(self):
            manager = AEIFailureConfigManager()
            self.assertIsInstance(manager, AEIFailureConfigManager)
            self.assertEqual(manager.config["thresholds"]["handover_user"], self.default_config_data["thresholds"]["handover_user"])
            self.assertEqual(manager.config["probabilities"]["power_outage"], self.default_config_data["probabilities"]["power_outage"])
            self.assertIn("POWER", manager.config["failure_types"])
            self.assertEqual(manager.config["thresholds"]["cqi"], 3)

        def test_override_initialization(self):
            override = {
                "thresholds": {"handover_user": 100, "sinr": -5.0},
                "probabilities": {"power_outage": 0.005},
                "failure_types": ["POWER_OUTAGE", "SOFTWARE"]  # Both must be valid FailureType values
            }
            manager = AEIFailureConfigManager(override)
            self.assertEqual(manager.config["thresholds"]["handover_user"], 100)
            self.assertEqual(manager.config["thresholds"]["sinr"], -5.0)
            self.assertEqual(manager.config["probabilities"]["power_outage"], 0.005)
            self.assertEqual(manager.config["probabilities"]["dropped_call"], 0.5)
            self.assertEqual(sorted(manager.config["failure_types"]), sorted(["POWER_OUTAGE", "SOFTWARE"]))
            self.assertEqual(manager.config["thresholds"]["cqi"], 3)

        def test_create_failure_config_instance(self):
            manager = AEIFailureConfigManager()
            failure_cfg = manager.create_failure_config()
            self.assertIsInstance(failure_cfg, FailureConfig)
            self.assertEqual(failure_cfg.handover_user_threshold, 50)
            self.assertEqual(failure_cfg.sinr_threshold, 0.0)
            self.assertIsInstance(failure_cfg.failure_types, list)
            self.assertEqual(failure_cfg.cqi_threshold, 3)

        def test_failure_config_immutability(self):
            manager = AEIFailureConfigManager()
            failure_cfg = manager.create_failure_config()
            with self.assertRaises(AttributeError):
                failure_cfg.handover_user_threshold = 60

        def test_from_json_success(self):
            manager = AEIFailureConfigManager.from_json(self.temp_config_filepath)
            self.assertIsInstance(manager, AEIFailureConfigManager)
            self.assertEqual(manager.config["thresholds"]["handover_user"], 50)
            self.assertEqual(manager.config["thresholds"]["cqi"], 3)

        def test_from_json_file_not_found(self):
            with self.assertRaisesRegex(FileNotFoundError, "Configuration file not found"):
                AEIFailureConfigManager.from_json("non_existent_file_xyz.json")

        def test_from_json_invalid_json(self):
            invalid_json_path = os.path.join(project_root, "invalid_config_test.json")
            with open(invalid_json_path, 'w') as f:
                f.write("{invalid_json")
            # Register cleanup first so the file is removed even if the assertion fails.
            self.addCleanup(os.remove, invalid_json_path)
            with self.assertRaisesRegex(ConfigurationError, "Invalid JSON format"):
                AEIFailureConfigManager.from_json(invalid_json_path)

        def test_from_dict_success(self):
            config_dict = {
                "thresholds": {"handover_user": 80},
                "probabilities": {"dropped_call": 0.15, "cqi_penalty": 0.85}
            }
            manager = AEIFailureConfigManager.from_dict(config_dict)
            self.assertIsInstance(manager, AEIFailureConfigManager)
            self.assertEqual(manager.config["thresholds"]["handover_user"], 80)
            self.assertEqual(manager.config["probabilities"]["dropped_call"], 0.15)
            self.assertEqual(manager.config["thresholds"]["cqi"], 3)

        def test_invalid_type_in_config(self):
            invalid_config = {"thresholds": {"cqi": "not_an_int"}}
            with self.assertRaisesRegex(ConfigurationError, r"'thresholds\.cqi' must be of type int\. Got str\."):
                AEIFailureConfigManager(invalid_config)

        def test_unknown_key_in_config_top_level(self):
            invalid_config = {"unknown_top_level_key": {}}
            with self.assertRaisesRegex(ConfigurationError, r"Unknown configuration keys found at '': unknown_top_level_key\. Expected keys: .*"):
                AEIFailureConfigManager(invalid_config)
                
        def test_unknown_key_in_config_nested(self):
            invalid_config_nested = {"thresholds": {"cqi": 5, "extra_key": 10}}
            with self.assertRaisesRegex(ConfigurationError, r"Unknown configuration keys found at 'thresholds\.': extra_key\. Expected keys: .*"):
                AEIFailureConfigManager(invalid_config_nested)

        def test_invalid_structure_not_dict_sub_level(self):
            invalid_config = {"thresholds": "not_a_dict"}
            with self.assertRaisesRegex(ConfigurationError, r"'thresholds' must be a dictionary, not str"):
                AEIFailureConfigManager(invalid_config)
                
        def test_invalid_structure_not_dict_top_level(self):
            with self.assertRaisesRegex(ConfigurationError, r"Override config must be a dictionary or None\. Got str\."):
                AEIFailureConfigManager("not_a_dict")

        def test_prob_out_of_range_high(self):
            invalid_config = {"probabilities": {"power_outage": 1.1}}
            with self.assertRaisesRegex(ConfigurationError, r"'power_outage' probability \(1\.1\) must be between 0 and 1\."):
                AEIFailureConfigManager(invalid_config)

        def test_prob_out_of_range_low(self):
            invalid_config = {"probabilities": {"dropped_call": -0.1}}
            with self.assertRaisesRegex(ConfigurationError, r"'dropped_call' probability \(-0\.1\) must be between 0 and 1\."):
                AEIFailureConfigManager(invalid_config)

        def test_handover_user_negative(self):
            invalid_config = {"thresholds": {"handover_user": -10}}
            with self.assertRaisesRegex(ConfigurationError, r"Handover user threshold cannot be negative\."):
                AEIFailureConfigManager(invalid_config)

        def test_rsrq_out_of_range(self):
            invalid_config_high = {"thresholds": {"rsrq": 5.0}}
            with self.assertRaisesRegex(ConfigurationError, r"RSRQ threshold.*must be between -30 and 0 dB\."):
                AEIFailureConfigManager(invalid_config_high)
                
            invalid_config_low = {"thresholds": {"rsrq": -40.0}}
            with self.assertRaisesRegex(ConfigurationError, r"RSRQ threshold.*must be between -30 and 0 dB\."):
                AEIFailureConfigManager(invalid_config_low)

        def test_cqi_out_of_range(self):
            invalid_config_high = {"thresholds": {"cqi": 16}}
            with self.assertRaisesRegex(ConfigurationError, r"CQI threshold \(16\) must be between 1 and 15\."):
                AEIFailureConfigManager(invalid_config_high)
                
            invalid_config_low = {"thresholds": {"cqi": 0}}
            with self.assertRaisesRegex(ConfigurationError, r"CQI threshold \(0\) must be between 1 and 15\."):
                AEIFailureConfigManager(invalid_config_low)

        def test_sinr_threshold_warning(self):
            log_capture_stream = StringIO()
            handler = logging.StreamHandler(log_capture_stream)
            formatter = logging.Formatter('%(levelname)s - %(message)s')
            handler.setFormatter(formatter)
            handler.setLevel(logging.WARNING) 
            
            target_logger = logging.getLogger('aei.telecom.failure.config')
            
            for h in list(target_logger.handlers):
                target_logger.removeHandler(h)

            target_logger.addHandler(handler)
            target_logger.setLevel(logging.DEBUG) 
            target_logger.propagate = False 

            try:
                AEIFailureConfigManager({"thresholds": {"sinr": 10.0}})
                log_output = log_capture_stream.getvalue()
                self.assertIn("WARNING - SINR threshold (10.0 dB) is positive and quite high.", log_output)
                log_capture_stream.truncate(0) 
                log_capture_stream.seek(0)

                AEIFailureConfigManager({"thresholds": {"sinr": -5.0}})
                log_output_no_warning = log_capture_stream.getvalue()
                self.assertNotIn("SINR threshold", log_output_no_warning)
                self.assertEqual("", log_output_no_warning.strip()) 
            finally:
                target_logger.removeHandler(handler) 
                target_logger.propagate = True 
                target_logger.setLevel(logging.NOTSET) 

        def test_dropped_cqi_weights_sum_warning(self):
            log_capture_stream = StringIO()
            handler = logging.StreamHandler(log_capture_stream)
            formatter = logging.Formatter('%(levelname)s - %(message)s')
            handler.setFormatter(formatter)
            handler.setLevel(logging.WARNING)
            
            target_logger = logging.getLogger('aei.telecom.failure.config')

            for h in list(target_logger.handlers):
                target_logger.removeHandler(h)

            target_logger.addHandler(handler)
            target_logger.setLevel(logging.DEBUG)
            target_logger.propagate = False 

            try:
                AEIFailureConfigManager({"probabilities": {"dropped_call": 0.3, "cqi_penalty": 0.4}})
                log_output = log_capture_stream.getvalue()
                self.assertIn("WARNING - Weights for 'dropped_call' and 'cqi_penalty' sum to 0.7, which is not exactly 1.0.", log_output)
                log_capture_stream.truncate(0)
                log_capture_stream.seek(0)

                AEIFailureConfigManager({"probabilities": {"dropped_call": 0.65, "cqi_penalty": 0.3}})
                log_output = log_capture_stream.getvalue()
                self.assertIn("WARNING - Weights for 'dropped_call' and 'cqi_penalty' sum to 0.95, which is not exactly 1.0.", log_output)
                log_capture_stream.truncate(0)
                log_capture_stream.seek(0)

                AEIFailureConfigManager({"probabilities": {"dropped_call": 0.5, "cqi_penalty": 0.5}})
                log_output_no_warning = log_capture_stream.getvalue()
                self.assertEqual("", log_output_no_warning.strip())
                log_capture_stream.truncate(0)
                log_capture_stream.seek(0)
                
                AEIFailureConfigManager({"probabilities": {"dropped_call": 0.4999999999, "cqi_penalty": 0.5000000001}})
                log_output_no_warning = log_capture_stream.getvalue()
                self.assertEqual("", log_output_no_warning.strip())

            finally:
                target_logger.removeHandler(handler)
                target_logger.propagate = True
                target_logger.setLevel(logging.NOTSET)

        def test_documentation_generation(self):
            manager = AEIFailureConfigManager()
            doc = manager.documentation
            self.assertIsInstance(doc, str)
            self.assertIn("AEI Failure Simulation Configuration Documentation", doc)
            self.assertIn("handover_user", doc)
            self.assertIn(str(manager.config["thresholds"]["handover_user"]), doc)
            self.assertIn("POWER", doc)

    if __name__ == '__main__':
        unittest.main(argv=['first-arg-is-ignored'], exit=False)

except ImportError:
    logger.critical("Error importing modules. Please ensure your 'aei/telecom/failure/config.py' file is correctly set up and accessible.", exc_info=True)
    sys.exit(1)
except Exception as e:
    logger.critical(f"An unexpected critical error occurred during test setup or execution: {e}", exc_info=True)
    sys.exit(1)

2025-07-21 17:38:46,757 - __main__ - DEBUG - Added a StreamHandler to the root logger for general test visibility.
2025-07-21 17:38:46,759 - root - DEBUG - Created temporary config file: /kaggle/working/temp_test_config.json
2025-07-21 17:38:46,760 - aei.telecom.failure.config - DEBUG - CQI validation check: thresholds['cqi'] is 16
2025-07-21 17:38:46,761 - aei.telecom.failure.config - DEBUG - CQI validation check: thresholds['cqi'] is 0
2025-07-21 17:38:46,762 - root - DEBUG - Removed temporary config file: /kaggle/working/temp_test_config.json


.

2025-07-21 17:38:46,764 - __main__ - DEBUG - Added a StreamHandler to the root logger for general test visibility.
2025-07-21 17:38:46,767 - root - DEBUG - Created temporary config file: /kaggle/working/temp_test_config.json
2025-07-21 17:38:46,768 - aei.telecom.failure.config - DEBUG - CQI validation check: thresholds['cqi'] is 3
2025-07-21 17:38:46,769 - aei.telecom.failure.config - DEBUG - AEIFailureConfigManager initialized successfully.
2025-07-21 17:38:46,771 - root - DEBUG - Removed temporary config file: /kaggle/working/temp_test_config.json


.

2025-07-21 17:38:46,773 - __main__ - DEBUG - Added a StreamHandler to the root logger for general test visibility.
2025-07-21 17:38:46,774 - root - DEBUG - Created temporary config file: /kaggle/working/temp_test_config.json
2025-07-21 17:38:46,775 - aei.telecom.failure.config - DEBUG - CQI validation check: thresholds['cqi'] is 3
2025-07-21 17:38:46,776 - aei.telecom.failure.config - DEBUG - AEIFailureConfigManager initialized successfully.
2025-07-21 17:38:46,777 - root - DEBUG - Removed temporary config file: /kaggle/working/temp_test_config.json


.

2025-07-21 17:38:46,779 - __main__ - DEBUG - Added a StreamHandler to the root logger for general test visibility.
2025-07-21 17:38:46,780 - root - DEBUG - Created temporary config file: /kaggle/working/temp_test_config.json
2025-07-21 17:38:46,781 - aei.telecom.failure.config - DEBUG - CQI validation check: thresholds['cqi'] is 3
2025-07-21 17:38:46,782 - aei.telecom.failure.config - DEBUG - AEIFailureConfigManager initialized successfully.
2025-07-21 17:38:46,783 - root - DEBUG - Removed temporary config file: /kaggle/working/temp_test_config.json


.

2025-07-21 17:38:46,785 - __main__ - DEBUG - Added a StreamHandler to the root logger for general test visibility.
2025-07-21 17:38:46,787 - root - DEBUG - Created temporary config file: /kaggle/working/temp_test_config.json
2025-07-21 17:38:46,789 - root - DEBUG - Removed temporary config file: /kaggle/working/temp_test_config.json


.

2025-07-21 17:38:46,791 - __main__ - DEBUG - Added a StreamHandler to the root logger for general test visibility.
2025-07-21 17:38:46,793 - root - DEBUG - Created temporary config file: /kaggle/working/temp_test_config.json
2025-07-21 17:38:46,794 - aei.telecom.failure.config - DEBUG - CQI validation check: thresholds['cqi'] is 3
2025-07-21 17:38:46,794 - aei.telecom.failure.config - DEBUG - AEIFailureConfigManager initialized successfully.
2025-07-21 17:38:46,796 - root - DEBUG - Removed temporary config file: /kaggle/working/temp_test_config.json


.

2025-07-21 17:38:46,798 - __main__ - DEBUG - Added a StreamHandler to the root logger for general test visibility.
2025-07-21 17:38:46,799 - root - DEBUG - Created temporary config file: /kaggle/working/temp_test_config.json
2025-07-21 17:38:46,800 - aei.telecom.failure.config - INFO - Creating AEIFailureConfigManager from dictionary.
2025-07-21 17:38:46,801 - aei.telecom.failure.config - DEBUG - CQI validation check: thresholds['cqi'] is 3
2025-07-21 17:38:46,801 - aei.telecom.failure.config - DEBUG - AEIFailureConfigManager initialized successfully.
2025-07-21 17:38:46,802 - root - DEBUG - Removed temporary config file: /kaggle/working/temp_test_config.json


.

2025-07-21 17:38:46,804 - __main__ - DEBUG - Added a StreamHandler to the root logger for general test visibility.
2025-07-21 17:38:46,807 - root - DEBUG - Created temporary config file: /kaggle/working/temp_test_config.json
2025-07-21 17:38:46,808 - root - DEBUG - Removed temporary config file: /kaggle/working/temp_test_config.json


.

2025-07-21 17:38:46,810 - __main__ - DEBUG - Added a StreamHandler to the root logger for general test visibility.
2025-07-21 17:38:46,811 - root - DEBUG - Created temporary config file: /kaggle/working/temp_test_config.json
2025-07-21 17:38:46,813 - root - DEBUG - Removed temporary config file: /kaggle/working/temp_test_config.json


.

2025-07-21 17:38:46,814 - __main__ - DEBUG - Added a StreamHandler to the root logger for general test visibility.
2025-07-21 17:38:46,816 - root - DEBUG - Created temporary config file: /kaggle/working/temp_test_config.json
2025-07-21 17:38:46,817 - aei.telecom.failure.config - INFO - Loaded configuration from JSON: /kaggle/working/temp_test_config.json
2025-07-21 17:38:46,818 - aei.telecom.failure.config - DEBUG - CQI validation check: thresholds['cqi'] is 3
2025-07-21 17:38:46,819 - aei.telecom.failure.config - DEBUG - AEIFailureConfigManager initialized successfully.
2025-07-21 17:38:46,820 - root - DEBUG - Removed temporary config file: /kaggle/working/temp_test_config.json


.

2025-07-21 17:38:46,822 - __main__ - DEBUG - Added a StreamHandler to the root logger for general test visibility.
2025-07-21 17:38:46,823 - root - DEBUG - Created temporary config file: /kaggle/working/temp_test_config.json
2025-07-21 17:38:46,825 - root - DEBUG - Removed temporary config file: /kaggle/working/temp_test_config.json


.

2025-07-21 17:38:46,826 - __main__ - DEBUG - Added a StreamHandler to the root logger for general test visibility.
2025-07-21 17:38:46,828 - root - DEBUG - Created temporary config file: /kaggle/working/temp_test_config.json
2025-07-21 17:38:46,829 - root - DEBUG - Removed temporary config file: /kaggle/working/temp_test_config.json


.

2025-07-21 17:38:46,831 - __main__ - DEBUG - Added a StreamHandler to the root logger for general test visibility.
2025-07-21 17:38:46,832 - root - DEBUG - Created temporary config file: /kaggle/working/temp_test_config.json
2025-07-21 17:38:46,834 - root - DEBUG - Removed temporary config file: /kaggle/working/temp_test_config.json


.

2025-07-21 17:38:46,835 - __main__ - DEBUG - Added a StreamHandler to the root logger for general test visibility.
2025-07-21 17:38:46,838 - root - DEBUG - Created temporary config file: /kaggle/working/temp_test_config.json
2025-07-21 17:38:46,839 - root - DEBUG - Removed temporary config file: /kaggle/working/temp_test_config.json


.

2025-07-21 17:38:46,841 - __main__ - DEBUG - Added a StreamHandler to the root logger for general test visibility.
2025-07-21 17:38:46,843 - root - DEBUG - Created temporary config file: /kaggle/working/temp_test_config.json
2025-07-21 17:38:46,844 - aei.telecom.failure.config - DEBUG - CQI validation check: thresholds['cqi'] is 3
2025-07-21 17:38:46,844 - aei.telecom.failure.config - DEBUG - AEIFailureConfigManager initialized successfully.
2025-07-21 17:38:46,845 - root - DEBUG - Removed temporary config file: /kaggle/working/temp_test_config.json


.

2025-07-21 17:38:46,847 - __main__ - DEBUG - Added a StreamHandler to the root logger for general test visibility.
2025-07-21 17:38:46,849 - root - DEBUG - Created temporary config file: /kaggle/working/temp_test_config.json
2025-07-21 17:38:46,850 - root - DEBUG - Removed temporary config file: /kaggle/working/temp_test_config.json


.

2025-07-21 17:38:46,852 - __main__ - DEBUG - Added a StreamHandler to the root logger for general test visibility.
2025-07-21 17:38:46,853 - root - DEBUG - Created temporary config file: /kaggle/working/temp_test_config.json
2025-07-21 17:38:46,854 - root - DEBUG - Removed temporary config file: /kaggle/working/temp_test_config.json


.

2025-07-21 17:38:46,856 - __main__ - DEBUG - Added a StreamHandler to the root logger for general test visibility.
2025-07-21 17:38:46,858 - root - DEBUG - Created temporary config file: /kaggle/working/temp_test_config.json
2025-07-21 17:38:46,860 - root - DEBUG - Removed temporary config file: /kaggle/working/temp_test_config.json


.

2025-07-21 17:38:46,862 - __main__ - DEBUG - Added a StreamHandler to the root logger for general test visibility.
2025-07-21 17:38:46,863 - root - DEBUG - Created temporary config file: /kaggle/working/temp_test_config.json
2025-07-21 17:38:46,865 - root - DEBUG - Removed temporary config file: /kaggle/working/temp_test_config.json


.

2025-07-21 17:38:46,866 - __main__ - DEBUG - Added a StreamHandler to the root logger for general test visibility.
2025-07-21 17:38:46,868 - root - DEBUG - Created temporary config file: /kaggle/working/temp_test_config.json
2025-07-21 17:38:46,870 - root - DEBUG - Removed temporary config file: /kaggle/working/temp_test_config.json


.

2025-07-21 17:38:46,871 - __main__ - DEBUG - Added a StreamHandler to the root logger for general test visibility.
2025-07-21 17:38:46,873 - root - DEBUG - Created temporary config file: /kaggle/working/temp_test_config.json
2025-07-21 17:38:46,875 - root - DEBUG - Removed temporary config file: /kaggle/working/temp_test_config.json


.
----------------------------------------------------------------------
Ran 21 tests in 0.123s

OK


```

# Version 1

```python

import os
import sys
import pandas as pd # Needed for the test block below
import numpy as np # Needed for the test block below

# Create necessary directories and __init__.py files (ensure they exist)
os.makedirs("/kaggle/working/aei/telecom/failure", exist_ok=True)
# Ensure top-level __init__.py files are present for package recognition
open("/kaggle/working/aei/__init__.py", "a").close() # Use 'a' to append if exists, or create
open("/kaggle/working/aei/telecom/__init__.py", "a").close()
open("/kaggle/working/aei/telecom/failure/__init__.py", "a").close()


# The corrected code for failure_simulator.py
corrected_failure_simulator_code = '''
import numpy as np
import logging
from typing import Dict, Any, Optional
from dataclasses import dataclass
from enum import Enum, auto
from functools import wraps
from time import perf_counter
import pandas as pd

# Setup logging
logger = logging.getLogger(__name__)

class FailureType(Enum):
    DROPPED_CALL = auto()
    HANDOVER_FAILURE = auto()
    POWER_OUTAGE = auto()
    # Adding generic types for compatibility with the main generator's config
    POWER = auto()
    BACKHAUL = auto()
    SOFTWARE = auto()
    HARDWARE = auto()

@dataclass
class FailureConfig:
    """Configuration for failure simulation parameters"""
    handover_user_threshold: int = 50
    power_outage_prob: float = 0.001
    dropped_call_weight: float = 0.1
    cqi_penalty_weight: float = 0.2
    sinr_threshold: float = 0.0
    rsrq_threshold: float = -15.0
    cqi_threshold: int = 3
    # None means "use the default failure types"; AEIFailureSimulator.__init__ fills it in
    failure_types: Optional[list] = None

DEFAULT_FAILURE_CONFIG = FailureConfig(failure_types=['POWER', 'BACKHAUL', 'SOFTWARE', 'HARDWARE'])

def timed_operation(func):
    """Decorator to log execution time of functions"""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start_time = perf_counter()
        result = func(*args, **kwargs)
        end_time = perf_counter()
        logger.debug(f"Operation {func.__name__} completed in {end_time - start_time:.4f} seconds")
        return result
    return wrapper

class AEIFailureSimulator:
    """
    Enterprise-grade failure simulation for telecom networks
    
    Features:
    - Configurable failure thresholds
    - Detailed logging
    - Performance monitoring
    - Type safety
    - Validation
    - Compatibility layer for existing data generation pipelines.
    """
    
    def __init__(self, seed: Optional[int] = None, failure_types: Optional[list] = None, config: Optional[FailureConfig] = None):
        """
        Initializes the AEIFailureSimulator.

        Args:
            seed (Optional[int]): Seed for random number generation for reproducibility.
            failure_types (Optional[list]): List of failure types (e.g., 'POWER', 'BACKHAUL').
                                             If None, DEFAULT_FAILURE_CONFIG's types are used.
            config (Optional[FailureConfig]): Configuration object for failure parameters.
                                             If None, a default FailureConfig will be used.
        """
        self.rng = np.random.RandomState(seed) # Initialize RNG with seed
        # Work on a copy so that neither the caller's config nor the shared
        # module-level DEFAULT_FAILURE_CONFIG is mutated by the override below.
        self.config = FailureConfig(**vars(config or DEFAULT_FAILURE_CONFIG))

        # An explicit failure_types argument takes precedence over the config's value
        if failure_types is not None:
            self.config.failure_types = failure_types
        elif not self.config.failure_types:
            # Ensure failure_types is always set, even if the config left it unset
            self.config.failure_types = list(DEFAULT_FAILURE_CONFIG.failure_types)

        self._validate_config()
        logger.info(f"AEIFailureSimulator initialized with seed: {seed}, failure_types: {self.config.failure_types}")

    def _validate_config(self):
        """Ensure configuration parameters are valid"""
        if not 0 <= self.config.power_outage_prob <= 1:
            raise ValueError("Power outage probability must be between 0 and 1")
        if self.config.handover_user_threshold < 0:
            raise ValueError("User threshold cannot be negative")
        if not isinstance(self.config.failure_types, list) or not self.config.failure_types:
            raise ValueError("failure_types must be a non-empty list.")
        logger.debug("FailureConfig validated successfully.")

    # This method acts as an adapter for compatibility with TelecomDataGenerator
    def simulate_failures(self, df: pd.DataFrame, failure_prob: float = 0.05) -> pd.DataFrame:
        """
        Simulates failures by introducing 'failure_type' based on a probability.
        This method is an adapter to match the expected signature from TelecomDataGenerator.
        It will call the more comprehensive 'simulate' method internally.
        """
        if df.empty:
            logger.warning("Received empty DataFrame for failure simulation.")
            return df.copy()

        # TelecomDataGenerator passes a single generic failure_prob, whereas simulate()
        # derives its indicators from the config thresholds and power_outage_prob.
        # To bridge the two, first tag rows with a high-level 'failure_type' drawn with
        # probability failure_prob, then let simulate() add the detailed indicators.
        df_copy = df.copy()

        # Draw the failure mask from self.rng so results are reproducible for a given seed
        mask = self.rng.random(size=len(df_copy)) < failure_prob
        # Initialise 'failure_type' as an object column defaulting to 'NONE';
        # TelecomDataGenerator expects values such as 'NONE', 'POWER', etc.
        df_copy['failure_type'] = pd.Series('NONE', index=df_copy.index, dtype=object)
        if mask.any():
            df_copy.loc[mask, 'failure_type'] = self.rng.choice(self.config.failure_types, size=mask.sum()) # Use self.rng

        # Now call the internal comprehensive simulation
        # The 'simulate' method will add 'Dropped_Calls', 'Handover_Failures', 'Power_Outage', 'Request_Failure_Rate'
        # These are internal detailed failure indicators. The 'failure_type' above is the high-level category.
        df_final = self.simulate(df_copy)
        
        return df_final


    @timed_operation
    def simulate(self, df: pd.DataFrame) -> pd.DataFrame:
        """
        Simulate network failures on telecom data. This is the comprehensive method.
        
        Args:
            df: DataFrame containing network metrics (RSRP, RSRQ, SINR, CQI, device_count).
                Note: the load column must be named "device_count", as produced by
                TelecomDataGenerator, rather than "User Count".
            
        Returns:
            DataFrame with added failure columns:
            - Dropped_Calls (binary)
            - Handover_Failures (binary)
            - Power_Outage (binary)
            - Request_Failure_Rate (float)
        """
        self._validate_input(df)
        
        try:
            df_processed = df.copy()
            df_processed["Dropped_Calls"] = self._calculate_dropped_calls(df_processed)
            df_processed["Handover_Failures"] = self._calculate_handover_failures(df_processed)
            df_processed["Power_Outage"] = self._simulate_power_outages(len(df_processed)) # Use self.rng internally
            df_processed["Request_Failure_Rate"] = self._calculate_failure_rate(df_processed)
            
            logger.info(f"Simulated detailed failures on {len(df_processed)} records")
            self._log_failure_stats(df_processed)
            
            return df_processed
            
        except Exception as e:
            logger.error(f"Comprehensive failure simulation error: {str(e)}", exc_info=True)
            raise
            
    def _validate_input(self, df: pd.DataFrame):
        """Validate input DataFrame structure"""
        # Adjusted to expect 'device_count' instead of 'User Count'
        required_columns = {"RSRP", "RSRQ", "SINR", "CQI", "device_count"}
        missing = required_columns - set(df.columns)
        if missing:
            raise ValueError(f"Missing required columns for failure simulation: {missing}")
        logger.debug("Input DataFrame validated for failure simulation.")
            
    def _calculate_dropped_calls(self, df: pd.DataFrame) -> pd.Series:
        """Calculate dropped calls based on signal quality"""
        # Ensure columns are numeric, coercing errors to handle potential NaNs
        sinr = pd.to_numeric(df["SINR"], errors='coerce').fillna(0)
        rsrq = pd.to_numeric(df["RSRQ"], errors='coerce').fillna(0)
        return ((sinr < self.config.sinr_threshold) | 
                (rsrq < self.config.rsrq_threshold)).astype(int)
    
    def _calculate_handover_failures(self, df: pd.DataFrame) -> pd.Series:
        """Calculate handover failures based on load and quality"""
        # Adjusted to use 'device_count'
        cqi = pd.to_numeric(df["CQI"], errors='coerce').fillna(0)
        device_count = pd.to_numeric(df["device_count"], errors='coerce').fillna(0)
        return ((cqi < self.config.cqi_threshold) & 
                (device_count > self.config.handover_user_threshold)).astype(int)
    
    def _simulate_power_outages(self, size: int) -> np.ndarray:
        """Simulate random power outages as a 0/1 integer array of length `size`"""
        return (self.rng.rand(size) < self.config.power_outage_prob).astype(int) # Use self.rng
    
    def _calculate_failure_rate(self, df: pd.DataFrame) -> pd.Series:
        """Calculate composite failure rate metric"""
        # Ensure CQI is numeric for calculation
        cqi = pd.to_numeric(df["CQI"], errors='coerce').fillna(0)
        return np.clip(
            (self.config.dropped_call_weight * df["Dropped_Calls"] +
             self.config.cqi_penalty_weight * (100 - cqi)), # Use the numeric cqi
            0, 100
        )
    
    def _log_failure_stats(self, df: pd.DataFrame):
        """Log statistics about simulated failures"""
        stats = {
            FailureType.DROPPED_CALL: df["Dropped_Calls"].sum(),
            FailureType.HANDOVER_FAILURE: df["Handover_Failures"].sum(),
            FailureType.POWER_OUTAGE: df["Power_Outage"].sum()
        }
        
        for failure_type, count in stats.items():
            if len(df) > 0:
                rate = count / len(df) * 100
            else:
                rate = 0.0 # Handle empty DataFrame case
            logger.info(f"{failure_type.name}: {count} occurrences ({rate:.2f}%)")

'''

# Save to file
with open("/kaggle/working/aei/telecom/failure/failure_simulator.py", "w") as f:
    f.write(corrected_failure_simulator_code)

# Add /kaggle/working to sys.path so Python can find the aei package
if "/kaggle/working" not in sys.path:
    sys.path.append("/kaggle/working")

# Now you can import and test
try:
    from aei.telecom.failure.failure_simulator import AEIFailureSimulator, FailureConfig
    print("Successfully imported AEIFailureSimulator from the corrected file!")

    # Example test DataFrame (ensure it has required columns)
    test_df = pd.DataFrame({
        'RSRP': np.random.uniform(-100, -50, 10),
        'RSRQ': np.random.uniform(-20, -5, 10),
        'SINR': np.random.uniform(-5, 25, 10),
        'CQI': np.random.randint(1, 16, 10),
        'device_count': np.random.randint(10, 100, 10)
    })

    # Test with seed and custom failure types
    simulator = AEIFailureSimulator(seed=42, failure_types=['TYPE_A', 'TYPE_B'])
    result_df = simulator.simulate_failures(test_df, failure_prob=0.3) # Using the adapter method

    print("\nSample generated failures:")
    print(result_df[['RSRP', 'SINR', 'CQI', 'device_count', 'failure_type', 
                     'Dropped_Calls', 'Handover_Failures', 'Power_Outage', 'Request_Failure_Rate']].head())
    print("\nFailure Type Distribution (from simulate_failures):")
    print(result_df['failure_type'].value_counts())

except ImportError as e:
    print(f"Error importing AEIFailureSimulator: {e}. Make sure the file is saved correctly and __init__.py files are present.")
except Exception as e:
    print(f"An error occurred during testing: {e}")
```

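The simulator accepts both a `config` object and a separate `failure_types` override. One way to apply such an override without touching a shared config instance is to build a validated copy with `dataclasses.replace`, which re-runs `__post_init__` on the new object. The sketch below is illustrative only: `DemoFailureConfig` and `with_failure_types` are hypothetical names introduced here and are not part of the `aei` package.

```python
from dataclasses import dataclass, field, replace
from typing import List


@dataclass
class DemoFailureConfig:
    """Hypothetical stand-in for the simulator's config (not the module's class)."""
    power_outage_prob: float = 0.001
    failure_types: List[str] = field(
        default_factory=lambda: ["POWER", "BACKHAUL", "SOFTWARE", "HARDWARE"]
    )

    def __post_init__(self) -> None:
        # Validation runs on every construction, including copies made by replace()
        if not 0 <= self.power_outage_prob <= 1:
            raise ValueError("Power outage probability must be between 0 and 1")
        if not isinstance(self.failure_types, list) or not self.failure_types:
            raise ValueError("failure_types must be a non-empty list.")


def with_failure_types(config: DemoFailureConfig, failure_types: List[str]) -> DemoFailureConfig:
    """Return a validated copy of `config` with `failure_types` overridden."""
    return replace(config, failure_types=list(failure_types))


shared_default = DemoFailureConfig()
custom = with_failure_types(shared_default, ["TYPE_A", "TYPE_B"])

print(shared_default.failure_types)  # the shared default keeps its original list
print(custom.failure_types)          # only the copy carries the override
```

Because the override goes through the constructor again, an invalid value such as an empty list raises `ValueError` instead of silently bypassing validation, and other simulators that share the default config are unaffected.
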
```python

import os
import sys
import pandas as pd
import numpy as np
import pytest
import textwrap # Import textwrap for clean string indentation
import logging

# Set up logging for the test environment to avoid clutter from the simulator's INFO logs
logging.basicConfig(level=logging.WARNING, format='%(asctime)s - %(levelname)s - %(message)s')

# Ensure the aei package path is in sys.path
# This block should ideally be run once before any imports from 'aei'
# but is included here for completeness of the self-contained test script.
if "/kaggle/working" not in sys.path:
    sys.path.append("/kaggle/working")

# Create necessary directories and __init__.py files for the AEI package structure
# This ensures the 'aei' package is discoverable by Python's import system
os.makedirs("/kaggle/working/aei/telecom/signal", exist_ok=True)
os.makedirs("/kaggle/working/aei/telecom/failure", exist_ok=True)
os.makedirs("/kaggle/working/aei/telecom/outage", exist_ok=True)

# Append mode creates missing __init__.py files without truncating existing ones
open("/kaggle/working/aei/__init__.py", "a").close()
open("/kaggle/working/aei/telecom/__init__.py", "a").close()
open("/kaggle/working/aei/telecom/signal/__init__.py", "a").close()
open("/kaggle/working/aei/telecom/failure/__init__.py", "a").close()
open("/kaggle/working/aei/telecom/outage/__init__.py", "a").close()

# --- Start of corrected aei/telecom/failure/failure_simulator.py content ---
# This part ensures the simulator itself is correctly defined before tests run
failure_simulator_code = textwrap.dedent('''
    import numpy as np
    import logging
    from typing import Dict, Any, Optional
    from dataclasses import dataclass, field
    from enum import Enum, auto
    from functools import wraps
    from time import perf_counter
    import pandas as pd

    # Setup logging
    logger = logging.getLogger(__name__)

    class AEIFailureType(Enum):
        DROPPED_CALL = auto()
        HANDOVER_FAILURE = auto()
        POWER_OUTAGE = auto()
        POWER = auto()
        BACKHAUL = auto()
        SOFTWARE = auto()
        HARDWARE = auto()

    @dataclass
    class AEIFailureConfig:
        """Configuration for failure simulation parameters"""
        handover_user_threshold: int = 50
        power_outage_prob: float = 0.001
        dropped_call_weight: float = 0.1
        cqi_penalty_weight: float = 0.2
        sinr_threshold: float = 0.0
        rsrq_threshold: float = -15.0
        cqi_threshold: int = 3
        failure_types: list = field(default_factory=lambda: ['POWER', 'BACKHAUL', 'SOFTWARE', 'HARDWARE']) # Using default_factory

        def __post_init__(self):
            """Validate configuration parameters after initialization."""
            if not 0 <= self.power_outage_prob <= 1:
                raise ValueError("Power outage probability must be between 0 and 1")
            if self.handover_user_threshold < 0:
                raise ValueError("User threshold cannot be negative")
            if not isinstance(self.failure_types, list) or not self.failure_types:
                raise ValueError("failure_types must be a non-empty list.")
            logger.debug("AEIFailureConfig validated successfully.")


    AEIDEFAULT_FAILURE_CONFIG = AEIFailureConfig() # Now just instantiate directly, validation in __post_init__

    def timed_operation(func):
        """Decorator to log execution time of functions"""
        @wraps(func)
        def wrapper(*args, **kwargs):
            start_time = perf_counter()
            result = func(*args, **kwargs)
            end_time = perf_counter()
            logger.debug(f"Operation {func.__name__} completed in {end_time - start_time:.4f} seconds")
            return result
        return wrapper

    class AEIFailureSimulator:
        """
        Enterprise-grade failure simulation for telecom networks
        
        Features:
        - Configurable failure thresholds
        - Detailed logging
        - Performance monitoring
        - Type safety
        - Validation
        - Compatibility layer for existing data generation pipelines.
        """
        
        def __init__(self, seed: Optional[int] = None, failure_types: Optional[list] = None, config: Optional[AEIFailureConfig] = None):
            """
            Initializes the AEIFailureSimulator.

            Args:
                seed (Optional[int]): Seed for random number generation for reproducibility.
                failure_types (Optional[list]): List of failure types (e.g., 'POWER', 'BACKHAUL').
                                                 If None, AEIDEFAULT_FAILURE_CONFIG's types are used.
                config (Optional[AEIFailureConfig]): Configuration object for failure parameters.
                                                 If None, a default AEIFailureConfig will be used.
            """
            self.rng = np.random.RandomState(seed) # Initialize RNG with seed
            
            if config is None:
                # If no config is provided, create one and potentially override failure_types
                if failure_types is not None:
                    self.config = AEIFailureConfig(failure_types=failure_types)
                else:
                    self.config = AEIDEFAULT_FAILURE_CONFIG
            else:
                self.config = config
                # If a config is provided, and failure_types is ALSO provided, override config's failure_types
                if failure_types is not None:
                    self.config.failure_types = failure_types # This directly modifies the provided config object

            logger.info(f"AEIFailureSimulator initialized with seed: {seed}, failure_types: {self.config.failure_types}")

        # Removed _validate_config as validation is now in AEIFailureConfig.__post_init__

        def simulate_failures(self, df: pd.DataFrame, failure_prob: float = 0.05) -> pd.DataFrame:
            """
            Simulates failures by introducing 'failure_type' based on a probability.
            This method is an adapter to match the expected signature from TelecomDataGenerator.
            It will call the more comprehensive 'simulate' method internally.
            """
            if df.empty:
                logger.warning("Received empty DataFrame for failure simulation.")
                empty_result = self.simulate(df.copy()) # Call simulate to get expected columns
                empty_result['failure_type'] = pd.Series(dtype=object) # Add the column as empty
                return empty_result

            df_copy = df.copy()
            
            mask = self.rng.random(size=len(df_copy)) < failure_prob
            df_copy['failure_type'] = pd.Series('NONE', index=df_copy.index, dtype=object) 
            if mask.any():
                df_copy.loc[mask, 'failure_type'] = self.rng.choice(self.config.failure_types, size=mask.sum())

            df_final = self.simulate(df_copy)
            
            return df_final


        @timed_operation
        def simulate(self, df: pd.DataFrame) -> pd.DataFrame:
            """
            Simulate network failures on telecom data. This is the comprehensive method.
            
            Args:
                df: DataFrame containing network metrics (RSRP, RSRQ, SINR, CQI, device_count).
                
            Returns:
                DataFrame with added failure columns:
                - Dropped_Calls (binary)
                - Handover_Failures (binary)
                - Power_Outage (binary)
                - Request_Failure_Rate (float)
                (Note: 'failure_type' is added by simulate_failures, not directly by this method)
            """
            self._validate_input(df) # Input DataFrame structure validation
            
            try:
                df_processed = df.copy()
                df_processed["Dropped_Calls"] = self._calculate_dropped_calls(df_processed)
                df_processed["Handover_Failures"] = self._calculate_handover_failures(df_processed)
                df_processed["Power_Outage"] = self._simulate_power_outages(len(df_processed))
                df_processed["Request_Failure_Rate"] = self._calculate_failure_rate(df_processed)
                
                logger.info(f"Simulated detailed failures on {len(df_processed)} records")
                self._log_failure_stats(df_processed)
                
                return df_processed
                
            except Exception as e:
                logger.error(f"Comprehensive failure simulation error: {str(e)}", exc_info=True)
                raise
                
        def _validate_input(self, df: pd.DataFrame):
            """Validate input DataFrame structure"""
            required_columns = {"RSRP", "RSRQ", "SINR", "CQI", "device_count"}
            missing = required_columns - set(df.columns)
            if missing:
                raise ValueError(f"Missing required columns for failure simulation: {missing}")
            logger.debug("Input DataFrame validated for failure simulation.")
                
        def _calculate_dropped_calls(self, df: pd.DataFrame) -> pd.Series:
            """Calculate dropped calls based on signal quality"""
            sinr = pd.to_numeric(df["SINR"], errors='coerce').fillna(0)
            rsrq = pd.to_numeric(df["RSRQ"], errors='coerce').fillna(0)
            return ((sinr < self.config.sinr_threshold) | 
                    (rsrq < self.config.rsrq_threshold)).astype(int)
        
        def _calculate_handover_failures(self, df: pd.DataFrame) -> pd.Series:
            """Calculate handover failures based on load and quality"""
            cqi = pd.to_numeric(df["CQI"], errors='coerce').fillna(0)
            device_count = pd.to_numeric(df["device_count"], errors='coerce').fillna(0)
            return ((cqi < self.config.cqi_threshold) & 
                    (device_count > self.config.handover_user_threshold)).astype(int)
        
        def _simulate_power_outages(self, size: int) -> np.ndarray:
            """Simulate random power outages as a 0/1 integer array of length `size`"""
            return (self.rng.rand(size) < self.config.power_outage_prob).astype(int)
        
        def _calculate_failure_rate(self, df: pd.DataFrame) -> pd.Series:
            """Calculate composite failure rate metric"""
            cqi = pd.to_numeric(df["CQI"], errors='coerce').fillna(0)
            return np.clip(
                (self.config.dropped_call_weight * df["Dropped_Calls"] +
                 self.config.cqi_penalty_weight * (100 - cqi)),
                0, 100
            )
        
        def _log_failure_stats(self, df: pd.DataFrame):
            """Log statistics about simulated failures"""
            stats = {
                AEIFailureType.DROPPED_CALL: df["Dropped_Calls"].sum(),
                AEIFailureType.HANDOVER_FAILURE: df["Handover_Failures"].sum(),
                AEIFailureType.POWER_OUTAGE: df["Power_Outage"].sum()
            }
            
            for failure_type, count in stats.items():
                if len(df) > 0:
                    rate = count / len(df) * 100
                else:
                    rate = 0.0
                logger.info(f"{failure_type.name}: {count} occurrences ({rate:.2f}%)")
''')

# Save the corrected failure_simulator.py content
with open("/kaggle/working/aei/telecom/failure/failure_simulator.py", "w") as f:
    f.write(failure_simulator_code)

# --- End of aei/telecom/failure/failure_simulator.py content ---


# --- Start of test_failure_simulator.py content ---
test_code = textwrap.dedent('''
    import pytest
    import pandas as pd
    import numpy as np
    import logging
    from aei.telecom.failure.failure_simulator import AEIFailureSimulator, AEIFailureConfig, AEIFailureType, AEIDEFAULT_FAILURE_CONFIG

    # Configure logging to suppress INFO messages during tests,
    # as pytest handles test reporting.
    logging.basicConfig(level=logging.WARNING, format='%(asctime)s - %(levelname)s - %(message)s')
    logger = logging.getLogger(__name__)

    @pytest.fixture
    def sample_df() -> pd.DataFrame:
        """
        Provides a sample DataFrame for testing the Simulator.
        """
        return pd.DataFrame({
            "RSRP": np.random.uniform(-100, -50, 100),
            "RSRQ": np.random.uniform(-20, -5, 100),
            "SINR": np.random.uniform(-5, 20, 100),
            "CQI": np.random.randint(1, 16, 100),
            "device_count": np.random.randint(10, 100, 100)
        })

    def test_basic_simulation(sample_df: pd.DataFrame):
        """
        Tests the basic simulation functionality with default configuration.
        Ensures all expected columns are added and their types are correct.
        This test calls the 'simulate' method, which does NOT add 'failure_type'.
        """
        simulator = AEIFailureSimulator(seed=42)
        result_df = simulator.simulate(sample_df.copy())

        expected_columns = ["Dropped_Calls", "Handover_Failures", "Power_Outage", "Request_Failure_Rate"]
        for col in expected_columns:
            assert col in result_df.columns, f"Column '{col}' missing from result DataFrame."

        assert result_df["Dropped_Calls"].dtype == int, "'Dropped_Calls' column not integer type."
        assert result_df["Handover_Failures"].dtype == int, "'Handover_Failures' column not integer type."
        assert result_df["Power_Outage"].dtype == int, "'Power_Outage' column not integer type."
        assert np.issubdtype(result_df["Request_Failure_Rate"].dtype, np.number), "'Request_Failure_Rate' column not numeric type."

        assert len(result_df) == len(sample_df), "Result DataFrame row count mismatch."
        assert "failure_type" not in result_df.columns, "'failure_type' column should NOT be present from direct simulate() call."

    def test_custom_config_simulation(sample_df: pd.DataFrame):
        """
        Tests simulation with a custom AEIFailureConfig.
        Ensures the simulator accepts and uses the custom configuration.
        This test calls the 'simulate' method, which does NOT add 'failure_type'.
        """
        custom_config = AEIFailureConfig(
            handover_user_threshold=20,
            power_outage_prob=0.1,
            dropped_call_weight=0.5,
            cqi_penalty_weight=0.8,
            sinr_threshold=5.0,
            rsrq_threshold=-10.0,
            cqi_threshold=5,
            failure_types=['CUSTOM_A', 'CUSTOM_B']
        )
        simulator_custom = AEIFailureSimulator(config=custom_config)
        result_df_custom = simulator_custom.simulate(sample_df.copy())

        assert "Dropped_Calls" in result_df_custom.columns, "Custom config simulation failed: 'Dropped_Calls' column missing."
        assert len(result_df_custom) == len(sample_df), "Custom config simulation failed: Result DataFrame row count mismatch."
        assert "failure_type" not in result_df_custom.columns, "'failure_type' column should NOT be present from direct simulate() call."


    def test_input_validation_missing_columns():
        """
        Tests that a ValueError is raised when required input columns are missing.
        """
        df_invalid = pd.DataFrame({"RSRP": [1], "CQI": [2], "SINR": [3], "RSRQ": [4]})
        simulator = AEIFailureSimulator()
        with pytest.raises(ValueError, match="Missing required columns"):
            simulator.simulate(df_invalid)

    def test_config_validation_invalid_power_outage_prob():
        """
        Tests that a ValueError is raised for an invalid power_outage_prob in configuration.
        """
        with pytest.raises(ValueError, match="Power outage probability must be between 0 and 1"):
            AEIFailureConfig(power_outage_prob=1.5)
        with pytest.raises(ValueError, match="Power outage probability must be between 0 and 1"):
            AEIFailureConfig(power_outage_prob=-0.1)

    def test_config_validation_negative_user_threshold():
        """
        Tests that a ValueError is raised for a negative handover_user_threshold in configuration.
        """
        with pytest.raises(ValueError, match="User threshold cannot be negative"):
            AEIFailureConfig(handover_user_threshold=-10)

    def test_config_validation_empty_failure_types():
        """
        Tests that a ValueError is raised for an empty or invalid failure_types list in configuration.
        """
        with pytest.raises(ValueError, match="failure_types must be a non-empty list."):
            AEIFailureConfig(failure_types=[])
        # Passing None explicitly bypasses default_factory, so __post_init__ must reject it
        with pytest.raises(ValueError, match="failure_types must be a non-empty list."):
            AEIFailureConfig(failure_types=None)
        with pytest.raises(ValueError, match="failure_types must be a non-empty list."):
            AEIFailureConfig(failure_types="not a list")


    def test_dropped_call_condition():
        """
        Tests that dropped calls are correctly identified based on SINR and RSRQ thresholds.
        """
        # Scenario 1: SINR too low -> Dropped Call
        df_sinr_low = pd.DataFrame({
            "RSRP": [-70], "RSRQ": [-10], "SINR": [-1.0], "CQI": [10], "device_count": [30]
        })
        sim_sinr = AEIFailureSimulator(config=AEIFailureConfig(sinr_threshold=0.0, rsrq_threshold=-15.0))
        res_sinr = sim_sinr.simulate(df_sinr_low)
        assert res_sinr["Dropped_Calls"].iloc[0] == 1, "Dropped Call not detected for low SINR."

        # Scenario 2: RSRQ too low -> Dropped Call
        df_rsrq_low = pd.DataFrame({
            "RSRP": [-70], "RSRQ": [-18], "SINR": [5.0], "CQI": [10], "device_count": [30]
        })
        sim_rsrq = AEIFailureSimulator(config=AEIFailureConfig(sinr_threshold=0.0, rsrq_threshold=-15.0))
        res_rsrq = sim_rsrq.simulate(df_rsrq_low)
        assert res_rsrq["Dropped_Calls"].iloc[0] == 1, "Dropped Call not detected for low RSRQ."

        # Scenario 3: Both OK -> No Dropped Call
        df_all_ok = pd.DataFrame({
            "RSRP": [-70], "RSRQ": [-10], "SINR": [5.0], "CQI": [10], "device_count": [30]
        })
        sim_ok = AEIFailureSimulator(config=AEIFailureConfig(sinr_threshold=0.0, rsrq_threshold=-15.0))
        res_ok = sim_ok.simulate(df_all_ok)
        assert res_ok["Dropped_Calls"].iloc[0] == 0, "Dropped Call incorrectly detected when SINR and RSRQ are both above their thresholds."


    def test_handover_failure_condition():
        """
        Tests that handover failures are correctly identified based on CQI and device_count thresholds.
        """
        # Scenario 1: CQI low AND device_count high -> Handover Failure
        df_fail = pd.DataFrame({
            "RSRP": [-70], "RSRQ": [-10], "SINR": [10.0], "CQI": [2], "device_count": [60]
        })
        sim_fail = AEIFailureSimulator(config=AEIFailureConfig(cqi_threshold=3, handover_user_threshold=50))
        res_fail = sim_fail.simulate(df_fail)
        assert res_fail["Handover_Failures"].iloc[0] == 1, "Handover Failure not detected."

        # Scenario 2: CQI low BUT device_count low -> No Handover Failure
        df_cqi_low_device_low = pd.DataFrame({
            "RSRP": [-70], "RSRQ": [-10], "SINR": [10.0], "CQI": [2], "device_count": [40]
        })
        sim_no_fail1 = AEIFailureSimulator(config=AEIFailureConfig(cqi_threshold=3, handover_user_threshold=50))
        res_no_fail1 = sim_no_fail1.simulate(df_cqi_low_device_low)
        assert res_no_fail1["Handover_Failures"].iloc[0] == 0, "Handover Failure incorrectly detected (device count low)."

        # Scenario 3: CQI high AND device_count high -> No Handover Failure
        df_cqi_high_device_high = pd.DataFrame({
            "RSRP": [-70], "RSRQ": [-10], "SINR": [10.0], "CQI": [5], "device_count": [60]
        })
        sim_no_fail2 = AEIFailureSimulator(config=AEIFailureConfig(cqi_threshold=3, handover_user_threshold=50))
        res_no_fail2 = sim_no_fail2.simulate(df_cqi_high_device_high)
        assert res_no_fail2["Handover_Failures"].iloc[0] == 0, "Handover Failure incorrectly detected (CQI high)."

    def test_power_outage_simulation():
        """
        Tests that power outages are simulated correctly based on probability.
        """
        simulator = AEIFailureSimulator(seed=10, config=AEIFailureConfig(power_outage_prob=0.5))
        df_test = pd.DataFrame({"RSRP": [1]*10, "RSRQ": [1]*10, "SINR": [1]*10, "CQI": [1]*10, "device_count": [1]*10})
        result_df = simulator.simulate(df_test)
        
        assert result_df["Power_Outage"].dtype == int
        assert result_df["Power_Outage"].isin([0, 1]).all()
        # With seed 10, rng.rand(10) yields 6 values below 0.5, so 6 outages are expected.
        assert result_df["Power_Outage"].sum() == 6


    def test_failure_rate_calculation():
        """
        Tests the composite failure rate calculation, ensuring it's within bounds.
        """
        df_rate_calc = pd.DataFrame({
            "RSRP": [-70], "RSRQ": [-10], "SINR": [-1.0], "CQI": [50], "device_count": [30]
        })
        
        config_rate = AEIFailureConfig(dropped_call_weight=0.1, cqi_penalty_weight=0.2, sinr_threshold=0.0)
        simulator_rate = AEIFailureSimulator(config=config_rate)
        
        result_df = simulator_rate.simulate(df_rate_calc)
        
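        # The dropped call (SINR -1.0 < 0.0) contributes 0.1 * 1 and the CQI penalty
        # contributes 0.2 * (100 - 50), for a total of 10.1, well below the clip limit.
        expected_rate = (config_rate.dropped_call_weight * 1 +
                         config_rate.cqi_penalty_weight * (100 - 50))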
        
        assert np.isclose(result_df["Request_Failure_Rate"].iloc[0], expected_rate), \
            f"Failure rate calculation incorrect. Expected {expected_rate}, got {result_df['Request_Failure_Rate'].iloc[0]}"
        
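        # Clipping case: no dropped call (SINR 1.0 is above the 0.0 threshold), but the
        # CQI penalty alone is 1.5 * (100 - 0) = 150, so the rate must be clipped to 100.
        config_clip = AEIFailureConfig(dropped_call_weight=0.1, cqi_penalty_weight=1.5, sinr_threshold=0.0)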
        simulator_clip = AEIFailureSimulator(config=config_clip)

        df_clipped_input = pd.DataFrame({
            "RSRP": [-70], "RSRQ": [-10], "SINR": [1.0], "CQI": [0], "device_count": [30]
        })
        
        result_clipped = simulator_clip.simulate(df_clipped_input)
        assert result_clipped["Request_Failure_Rate"].iloc[0] == 100, "Failure rate not clipped to 100."

    def test_simulate_failures_adapter_method_empty_df():
        """
        Tests the simulate_failures adapter method with an empty DataFrame,
        ensuring it returns an empty DataFrame with expected columns.
        """
        simulator = AEIFailureSimulator(seed=1)
        empty_df = pd.DataFrame(columns=["RSRP", "RSRQ", "SINR", "CQI", "device_count"])
        
        result = simulator.simulate_failures(empty_df, failure_prob=0.5)
        
        assert result.empty
        assert "Dropped_Calls" in result.columns
        assert "Handover_Failures" in result.columns
        assert "Power_Outage" in result.columns
        assert "Request_Failure_Rate" in result.columns
        assert "failure_type" in result.columns
        assert result['failure_type'].dtype == object
''')

# Write the test module to disk so pytest can collect it
with open("/kaggle/working/test_failure_simulator.py", "w") as f:
    f.write(test_code)

print("✅ test_failure_simulator.py and aei/telecom/failure/failure_simulator.py saved successfully.")

# Now, execute the pytest command
print("\nRunning pytest...")
!pytest /kaggle/working/test_failure_simulator.py -v