# README.md


# Image Authentication Toolkit
[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
[![Python Version](https://img.shields.io/badge/python-3.9%2B-blue.svg)](https://www.python.org/downloads/)
[![Code Style: PEP-8](https://img.shields.io/badge/code%20style-PEP--8-orange.svg)](https://www.python.org/dev/peps/pep-0008/)
[![Build Status](https://img.shields.io/badge/build-passing-green.svg)](https://github.com/chirindaopensource/image_authentication_toolkit)
[![Code Coverage](https://img.shields.io/badge/coverage-98%25-brightgreen.svg)](https://github.com/chirindaopensource/image_authentication_toolkit)
[![Release Version](https://img.shields.io/badge/release-v1.0.0-blue.svg)](https://github.com/chirindaopensource/image_authentication_toolkit/releases/tag/v1.0.0)
[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg?style=flat-square)](http://makeapullrequest.com)


A multi-modal system for the quantitative analysis of image provenance, uniqueness, and semantic context.

**Repository:** [https://github.com/chirindaopensource/image_authentication_toolkit](https://github.com/chirindaopensource/image_authentication_toolkit)  
**License:** MIT  
**Owner:** © 2025 Craig Chirinda (Open Source Projects)

## Abstract

The proliferation of sophisticated generative models has introduced significant ambiguity into the domain of digital visual media. Establishing the originality and provenance of an image is no longer a matter of simple inspection but requires a quantitative, multi-faceted analytical framework. This toolkit provides a suite of methodologically rigorous tools to address this challenge. It enables the systematic dissection of an image's structural, statistical, and semantic properties, allowing for an empirical assessment of its relationship to other visual works and its context on the publicly indexed web. By integrating techniques from classical computer vision, deep learning, and web automation, this system supports informed decision-making for any organization concerned with the integrity and management of its digital visual assets.

## Table of Contents

1.  [Methodological Framework](#methodological-framework)
2.  [System Architecture](#system-architecture)
3.  [Core Components](#core-components)
4.  [Setup and Installation](#setup-and-installation)
5.  [Usage Example](#usage-example)
6.  [Theoretical Foundations](#theoretical-foundations)
7.  [Error Handling and Robustness](#error-handling-and-robustness)
8.  [Contributing](#contributing)
9.  [License](#license)
10. [Citation](#citation)

## Methodological Framework

The toolkit employs a layered, multi-modal analysis strategy. Each layer provides a distinct form of evidence, and a robust conclusion is reached through the synthesis of their results. The analysis proceeds from low-level pixel statistics to high-level semantic meaning and public context.

1.  **Level 1: Perceptual Hashing (Near-Duplicate Detection)**
    *   **Objective:** To identify structurally identical or near-identical images.
    *   **Mechanism:** Utilizes a DCT-based perceptual hash (`pHash`) to create a compact fingerprint of an image's low-frequency components. The Hamming distance between two fingerprints quantifies their dissimilarity.
    *   **Application:** Serves as a rapid, computationally inexpensive first pass to flag direct replication.

2.  **Level 2: Local Feature Matching (Geometric & Structural Analysis)**
    *   **Objective:** To detect if a section of one image has been copied, scaled, rotated, or otherwise transformed and inserted into another.
    *   **Mechanism:** Employs the ORB algorithm to detect thousands of salient keypoints. These are matched between images, and the geometric consistency of these matches is verified using a RANSAC-based homography estimation.
    *   **Application:** Essential for identifying digital collage, "asset ripping," and partial duplications.

3.  **Level 3: Global Statistical Analysis (Color & Texture Profile)**
    *   **Objective:** To compare the global statistical properties of images, such as their color palette distribution.
    *   **Mechanism:** Computes multi-dimensional color histograms in a color space that separates brightness from chromatic content (e.g., HSV, or the approximately perceptually uniform LAB) and compares them using statistical metrics like Pearson correlation or Bhattacharyya distance.
    *   **Application:** Useful for identifying images with a shared aesthetic, from a common source, or subject to the same post-processing filters. It is a weaker signal for direct copying.

4.  **Level 4: Semantic Embedding (Conceptual Similarity)**
    *   **Objective:** To measure the abstract, conceptual similarity between images, independent of style or composition.
    *   **Mechanism:** Leverages the CLIP Vision Transformer to project images into a high-dimensional semantic embedding space. The cosine similarity between two image vectors in this space quantifies their conceptual proximity.
    *   **Application:** The primary tool for analyzing stylistic influence and thematic overlap. It can indicate whether an AI-generated image of a "cyberpunk city in the style of Van Gogh" is semantically close to Van Gogh's actual works.

5.  **Level 5: Public Provenance (Web Context Discovery)**
    *   **Objective:** To determine if an image or its near-duplicates exist in the publicly indexed web and to gather context about their usage.
    *   **Mechanism:** Utilizes robust, Selenium-based web automation to perform a reverse image search on Google Images, scraping and structuring the results.
    *   **Application:** A critical discovery tool for establishing a baseline of public existence and understanding how an image is being used and described across the web.
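
The Level 1 comparison reduces to counting differing bits between two fingerprints. The following is a minimal sketch of that step only, assuming the fingerprints are already available as Python integers; `hamming_distance` and `hash_similarity` are illustrative names, not part of the toolkit's API.

```python
def hamming_distance(hash_a: int, hash_b: int) -> int:
    """Count the bit positions at which two integer-encoded hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


def hash_similarity(hash_a: int, hash_b: int, n_bits: int = 64) -> float:
    """Map the Hamming distance onto [0, 1], where 1.0 means identical hashes."""
    if n_bits <= 0:
        raise ValueError("n_bits must be positive")
    return 1.0 - hamming_distance(hash_a, hash_b) / n_bits


# Two hypothetical 8-bit fingerprints that differ in exactly two positions.
fingerprint_a = 0b10110100
fingerprint_b = 0b10010110
distance = hamming_distance(fingerprint_a, fingerprint_b)            # 2
similarity = hash_similarity(fingerprint_a, fingerprint_b, n_bits=8)  # 0.75
```

Normalizing by the hash length makes scores comparable across hash sizes, which is presumably what the `normalize=True` option in the usage example is for.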

## System Architecture

The toolkit is designed around principles of modularity, testability, and robustness, adhering to the SOLID principles of object-oriented design.

*   **Dependency Inversion:** The core `ImageSimilarityDetector` class does not depend on concrete implementations. Instead, it depends on abstractions defined by `Protocol` classes (`FeatureDetectorProtocol`, `MatcherProtocol`, `ClipModelLoaderProtocol`). This allows for easy substitution of underlying algorithms and facilitates unit testing with mock objects.
*   **Single Responsibility:** Responsibilities are cleanly segregated.
    *   `ImageSimilarityDetector`: Orchestrates the analytical workflow.
    *   **Factory Classes** (`DefaultFeatureDetectorFactory`, etc.): Encapsulate the complex logic of creating and configuring optimized algorithm instances.
    *   **Result Dataclasses** (`FeatureMatchResult`, etc.): Structure the output data and contain validation and serialization logic, separating results from computation.
    *   **Error Classes** (`ImageSimilarityError`, etc.): Provide a rich, hierarchical system for handling exceptions with detailed forensic context.
*   **Resource Management:** Lazy loading is used for computationally expensive resources like the CLIP model, which is only loaded into memory upon its first use. The `ResourceManager` provides background monitoring and automated cleanup of system resources, ensuring stability in long-running applications.
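
The dependency-inversion idea can be illustrated with `typing.Protocol`. This is a hedged sketch, not the toolkit's code: `KeypointDetector`, `BrightPixelDetector`, `StubDetector`, and `KeypointCounter` are hypothetical names, and the method signature of the real `FeatureDetectorProtocol` is not documented in this README.

```python
from typing import List, Protocol, Tuple


class KeypointDetector(Protocol):
    """Hypothetical interface: any object with a matching `detect` qualifies."""

    def detect(self, image: List[List[int]]) -> List[Tuple[int, int]]:
        ...


class BrightPixelDetector:
    """Toy implementation that reports pixels above a brightness threshold."""

    def __init__(self, threshold: int = 200) -> None:
        self.threshold = threshold

    def detect(self, image: List[List[int]]) -> List[Tuple[int, int]]:
        return [
            (row, col)
            for row, line in enumerate(image)
            for col, value in enumerate(line)
            if value > self.threshold
        ]


class StubDetector:
    """Test double that returns a fixed answer without inspecting the image."""

    def detect(self, image: List[List[int]]) -> List[Tuple[int, int]]:
        return [(0, 0), (1, 1), (2, 2)]


class KeypointCounter:
    """Depends only on the KeypointDetector abstraction, never a concrete class."""

    def __init__(self, detector: KeypointDetector) -> None:
        self._detector = detector

    def count(self, image: List[List[int]]) -> int:
        return len(self._detector.detect(image))


image = [[0, 255, 10], [250, 0, 0]]
real_count = KeypointCounter(BrightPixelDetector()).count(image)  # two bright pixels
stub_count = KeypointCounter(StubDetector()).count(image)         # fixed stub answer
```

Because the consumer only names the protocol, a stub can replace the real detector in unit tests without inheritance or patching.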

## Core Components

The project is composed of several key Python classes within the `image_authentication_tool_draft.ipynb` notebook:

*   **`ImageSimilarityDetector`**: The primary public-facing class. It orchestrates all analysis methods and manages dependencies and resources.
*   **Result Dataclasses**:
    *   `ReverseImageSearchResult`: A structured container for results from the Google reverse image search.
    *   `FeatureMatchResult`: A structured container for results from the ORB feature matching analysis.
    *   `StatisticalProperties`: A reusable dataclass for encapsulating detailed statistical analysis of a data sample.
*   **Factory Classes**:
    *   `DefaultFeatureDetectorFactory`: Creates and caches optimized `cv2.ORB` instances.
    *   `DefaultMatcherFactory`: Creates and profiles `cv2.BFMatcher` instances.
    *   `DefaultClipModelLoader`: Manages the lifecycle (loading, caching, optimization) of CLIP models.
*   **Protocol Definitions**:
    *   `FeatureDetectorProtocol`, `MatcherProtocol`, `ClipModelLoaderProtocol`: Define the abstract interfaces for the core computational components, enabling dependency injection.
*   **Custom Exception Hierarchy**:
    *   `ImageSimilarityError`: The base exception for all toolkit-related errors.
    *   Specialized subclasses (`ImageNotFoundError`, `ModelLoadError`, `NavigationError`, etc.) provide granular context for specific failure modes.
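
A hierarchy of this kind lets callers catch either a specific failure or the whole family. The sketch below uses stand-in class names (`ToolkitError`, `ImageMissingError`, `ModelUnavailableError`) and an assumed `context` attribute, because the constructor signatures of the toolkit's own exceptions are not shown in this README.

```python
from typing import Any, Dict, Optional


class ToolkitError(Exception):
    """Stand-in base class: every failure carries a context dictionary."""

    def __init__(self, message: str, context: Optional[Dict[str, Any]] = None) -> None:
        super().__init__(message)
        self.context: Dict[str, Any] = dict(context or {})


class ImageMissingError(ToolkitError):
    """Raised when an input image cannot be found on disk."""


class ModelUnavailableError(ToolkitError):
    """Raised when a model cannot be loaded."""


def read_image_bytes(path: str) -> bytes:
    """Read a file, translating the low-level error into a toolkit-specific one."""
    try:
        with open(path, "rb") as handle:
            return handle.read()
    except FileNotFoundError as exc:
        raise ImageMissingError(f"image not found: {path}", context={"path": path}) from exc


def describe_failure(path: str) -> str:
    """Handle the specific subclass first, with the base class as a catch-all."""
    try:
        read_image_bytes(path)
    except ImageMissingError as exc:
        return f"missing input ({exc.context['path']})"
    except ToolkitError as exc:
        return f"toolkit failure: {exc}"
    return "ok"
```

Chaining with `raise ... from exc` keeps the original traceback available for diagnosis.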

## Setup and Installation

A rigorous setup is required to ensure reproducible and accurate results.

1.  **Prerequisites:**
    *   Python 3.9 or newer.
    *   `git` for cloning the repository.
    *   A C++ compiler for building some of the underlying library dependencies.

2.  **Clone the Repository:**
    ```bash
    git clone https://github.com/chirindaopensource/image_authentication_toolkit.git
    cd image_authentication_toolkit
    ```

3.  **Create a Virtual Environment:**
    It is imperative to work within a dedicated virtual environment to manage dependencies and avoid conflicts.
    ```bash
    python -m venv venv
    source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
    ```

4.  **Install Dependencies:**
    The required Python packages are listed in `requirements.txt`.
    ```bash
    pip install -r requirements.txt
    ```
    *Note: The `requirements.txt` file would contain packages such as `numpy`, `opencv-python`, `torch`, `Pillow`, `imagehash`, `selenium`, `scipy`, `pandas`, and `ftfy` (for CLIP).*

5.  **Install ChromeDriver:**
    The `reverse_image_search_google` method requires the Selenium ChromeDriver.
    *   **Verify your Chrome version:** Go to `chrome://settings/help`.
    *   **Download the matching ChromeDriver:** Visit the [Chrome for Testing availability dashboard](https://googlechromelabs.github.io/chrome-for-testing/).
    *   Place the `chromedriver` executable in the root of the project directory or another location in your system's `PATH`. The usage example assumes it is in the root.
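
Before running the reverse image search, it can help to confirm that the driver is discoverable. A small sketch using only the standard library; `find_chromedriver` is a hypothetical helper, not a toolkit function.

```python
import shutil
from pathlib import Path
from typing import Optional


def find_chromedriver(project_root: Path = Path("."), name: str = "chromedriver") -> Optional[Path]:
    """Return the first ChromeDriver found in the project root or on PATH, else None."""
    local_candidate = project_root / name
    if local_candidate.is_file():
        return local_candidate.resolve()
    # shutil.which applies the platform's executable lookup rules to PATH.
    on_path = shutil.which(name)
    return Path(on_path) if on_path else None


driver = find_chromedriver()
print(driver if driver else "ChromeDriver not found; see the installation steps above.")
```

On Windows the file in the project root is typically named `chromedriver.exe`, so pass that as `name` when checking the local copy.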

## Usage Example

The following script demonstrates a complete, multi-modal analysis of two images.

```python
import json
import logging
import sys
import traceback
from pathlib import Path
from typing import Any, Dict, Union

import cv2
import numpy as np

# Assume all classes from the notebook are available in the execution scope.
# This includes ImageSimilarityDetector, ResourceConstraints, ValidationPolicy,
# and all custom exception classes.

def demonstrate_image_provenance_analysis(
    detector: ImageSimilarityDetector,
    image1_path: Union[str, Path],
    image2_path: Union[str, Path],
    chromedriver_path: Union[str, Path]
) -> Dict[str, Any]:
    """
    Executes a comprehensive, multi-modal analysis to compare two images and
    establish the provenance of the first image.

    This function serves as a production-grade demonstration of the
    ImageSimilarityDetector's capabilities, invoking each of its primary
    analytical methods in a structured sequence. It captures results from
    perceptual hashing, local feature matching, global color analysis,
    semantic similarity, and public reverse image search.

    The methodology proceeds from low-level structural comparisons to
    high-level semantic and contextual analysis, providing a holistic
    view of the relationship between the images.

    Args:
        detector (ImageSimilarityDetector): An initialized instance of the
            image similarity detector.
        image1_path (Union[str, Path]): The file path to the primary image
            to be analyzed and compared. This image will also be used for
            the reverse image search.
        image2_path (Union[str, Path]): The file path to the secondary image
            for comparison.
        chromedriver_path (Union[str, Path]): The file path to the
            Selenium ChromeDriver executable, required for reverse image search.

    Returns:
        Dict[str, Any]: A dictionary containing the detailed results from each
            analysis stage. Each key corresponds to an analysis method, and
            the value is either a comprehensive result dictionary/object or
            an error message if that stage failed.
    """
    # Initialize a dictionary to aggregate the results from all analysis methods.
    analysis_results: Dict[str, Any] = {}
    # Log the start of the analysis. Logging itself is configured by the caller.
    logging.info(f"Starting comprehensive provenance analysis for '{Path(image1_path).name}' and '{Path(image2_path).name}'.")

    # --- Stage 1: Perceptual Hash Analysis (Structural Duplication) ---
    logging.info("Executing Stage 1: Perceptual Hash Analysis...")
    try:
        p_hash_results = detector.perceptual_hash_difference(
            image1_path, image2_path, hash_size=16, normalize=True,
            return_similarity=True, statistical_analysis=True
        )
        analysis_results['perceptual_hash'] = p_hash_results
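        # Format the score only when it is numeric; ':.4f' would fail on a string fallback.
        p_hash_score = p_hash_results.get('similarity_score')
        p_hash_text = f"{p_hash_score:.4f}" if isinstance(p_hash_score, (int, float)) else "N/A"
        logging.info(f"  - pHash Similarity Score: {p_hash_text}")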
    except Exception as e:
        analysis_results['perceptual_hash'] = {'error': str(e), 'details': traceback.format_exc()}
        logging.error(f"  - Perceptual Hash Analysis failed: {e}")

    # --- Stage 2: Local Feature Matching Analysis (Geometric Consistency) ---
    logging.info("Executing Stage 2: Local Feature Matching Analysis...")
    try:
        feature_match_results = detector.feature_match_ratio(
            image1_path, image2_path, distance_threshold=64,
            normalization_strategy="min_keypoints", apply_ratio_test=True,
            ratio_threshold=0.75, resize_max_side=1024,
            return_detailed_result=True, geometric_verification=True,
            statistical_analysis=True
        )
        analysis_results['feature_matching'] = feature_match_results
        logging.info(f"  - Feature Match Similarity Ratio: {feature_match_results.similarity_ratio:.4f}")
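        # Test against None explicitly so that a legitimate ratio of 0.0 is still reported.
        inlier_ratio = feature_match_results.homography_inlier_ratio
        inlier_text = "N/A" if inlier_ratio is None else f"{inlier_ratio:.4f}"
        logging.info(f"  - Geometric Inlier Ratio: {inlier_text}")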
    except Exception as e:
        analysis_results['feature_matching'] = {'error': str(e), 'details': traceback.format_exc()}
        logging.error(f"  - Feature Matching Analysis failed: {e}")

    # --- Stage 3: Global Color Distribution Analysis (Palette Similarity) ---
    logging.info("Executing Stage 3: Global Color Distribution Analysis...")
    try:
        histogram_results = detector.histogram_correlation(
            image1_path, image2_path, metric="correlation", color_space="HSV",
            statistical_analysis=True, adaptive_binning=True
        )
        analysis_results['histogram_correlation'] = histogram_results
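        # Format the score only when it is numeric; ':.4f' would fail on a string fallback.
        histogram_score = histogram_results.get('similarity_score')
        histogram_text = f"{histogram_score:.4f}" if isinstance(histogram_score, (int, float)) else "N/A"
        logging.info(f"  - Histogram Correlation: {histogram_text}")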
    except Exception as e:
        analysis_results['histogram_correlation'] = {'error': str(e), 'details': traceback.format_exc()}
        logging.error(f"  - Histogram Correlation Analysis failed: {e}")

    # --- Stage 4: Semantic Meaning Analysis (Conceptual Similarity) ---
    logging.info("Executing Stage 4: Semantic Meaning Analysis...")
    try:
        clip_results = detector.clip_embedding_similarity(
            image1_path, image2_path, statistical_analysis=True,
            embedding_analysis=True, batch_processing=True
        )
        analysis_results['semantic_similarity'] = clip_results
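        # Format the score only when it is numeric; ':.4f' would fail on a string fallback.
        clip_score = clip_results.get('cosine_similarity')
        clip_text = f"{clip_score:.4f}" if isinstance(clip_score, (int, float)) else "N/A"
        logging.info(f"  - CLIP Cosine Similarity: {clip_text}")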
    except Exception as e:
        analysis_results['semantic_similarity'] = {'error': str(e), 'details': traceback.format_exc()}
        logging.error(f"  - Semantic Similarity Analysis failed: {e}")

    # --- Stage 5: Public Provenance and Context Analysis (Web Discovery) ---
    logging.info("Executing Stage 5: Public Provenance Analysis...")
    try:
        reverse_search_results = detector.reverse_image_search_google(
            image_path=image1_path, driver_path=chromedriver_path,
            headless=True, advanced_extraction=True, content_analysis=True
        )
        analysis_results['reverse_image_search'] = reverse_search_results
        logging.info(f"  - Reverse Search Best Guess: {reverse_search_results.best_guess}")
        logging.info(f"  - Found {len(reverse_search_results.similar_image_urls)} similar images online.")
    except Exception as e:
        analysis_results['reverse_image_search'] = {'error': str(e), 'details': traceback.format_exc()}
        logging.error(f"  - Reverse Image Search failed: {e}")

    logging.info("Comprehensive provenance analysis complete.")
    return analysis_results

if __name__ == '__main__':
    # This block demonstrates how to run the analysis.
    logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
    test_dir = Path("./test_images")
    test_dir.mkdir(exist_ok=True)
    
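    # Create synthetic placeholder images. Flat gray frames contain no corners, so
    # substitute real images to get meaningful feature-matching results.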
    image1_path = test_dir / "original_image.png"
    image2_path = test_dir / "modified_image.jpg"
    cv2.imwrite(str(image1_path), np.full((512, 512, 3), 64, dtype=np.uint8))
    cv2.imwrite(str(image2_path), np.full((512, 512, 3), 68, dtype=np.uint8))

    chromedriver_path = Path("./chromedriver")
    if not chromedriver_path.exists():
        logging.error("FATAL: ChromeDriver not found. Please download it and place it in the project root.")
        sys.exit(1)

    detector = ImageSimilarityDetector()
    full_results = demonstrate_image_provenance_analysis(detector, image1_path, image2_path, chromedriver_path)

    def result_serializer(obj):
        if isinstance(obj, (Path, np.ndarray)): return str(obj)
        if hasattr(obj, 'to_dict'): return obj.to_dict()
        return str(obj)

    print("\n" + "="*40 + " ANALYSIS RESULTS " + "="*40)
    print(json.dumps(full_results, default=result_serializer, indent=2))
    print("="*100)
```

## Theoretical Foundations

The implementation rests on established principles from multiple scientific and engineering disciplines.

*   **Software Engineering & Design Patterns**
    *   **Object-Oriented Design:** Encapsulation of logic within classes (`ImageSimilarityDetector`, `ResourceManager`).
    *   **SOLID Principles:** Dependency Inversion is used via `Protocol`s to decouple the main class from concrete algorithm implementations. Single Responsibility is evident in the separation of concerns between factories, result objects, and the main detector.
    *   **Resource Management:** Lazy loading (`_load_clip...`) and background monitoring (`ResourceManager`) ensure efficient use of memory and compute.
    *   **Error Handling:** A comprehensive, custom exception hierarchy allows for granular error reporting and robust recovery.

*   **Computer Vision & Image Processing**
    *   **Feature Detection:** The ORB implementation is based on the canonical papers for FAST corners and BRIEF descriptors, with added logic for orientation invariance.
    *   **Perceptual Hashing:** The `pHash` algorithm is a direct application of frequency-domain analysis using the Discrete Cosine Transform (DCT) to create an image fingerprint that is robust to moderate rescaling and compression.
    *   **Histogram Analysis:** The use of HSV and LAB color spaces is a standard technique to achieve a degree of illumination invariance in color-based comparisons.

*   **Machine Learning & Deep Learning**
    *   **Contrastive Learning:** The `clip_embedding_similarity` method is a direct application of the CLIP model, which learns a joint embedding space for images and text through contrastive learning on a massive dataset.
    *   **Vision Transformer (ViT):** The CLIP model's vision component is a ViT, which processes images as sequences of patches using self-attention mechanisms, enabling it to capture global semantic context.

*   **Mathematics & Statistics**
    *   **Linear Algebra:** Cosine similarity is computed via the dot product of L2-normalized embedding vectors.
    *   **Probability & Statistics:** Pearson correlation is used for histogram comparison. The Hamming distance is a fundamental metric from information theory. Statistical confidence intervals are computed for key metrics to quantify uncertainty.
    *   **Geometric Verification:** Homography estimation via RANSAC is a robust statistical method for finding a geometric consensus among noisy data points (the feature matches).
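
The linear-algebra step above can be written out directly: normalize each vector to unit length, then take the dot product. A minimal pure-Python sketch (the toolkit itself presumably operates on tensors; `l2_normalize` and `cosine_similarity` are illustrative names):

```python
import math
from typing import List, Sequence


def l2_normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of the two L2-normalized vectors; the result lies in [-1, 1]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    unit_a, unit_b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(unit_a, unit_b))


orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])  # unrelated directions
parallel = cosine_similarity([3.0, 4.0], [6.0, 8.0])    # same direction, different length
```

Because of the normalization, the score depends only on direction, so vector magnitude does not affect it.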

## Error Handling and Robustness

The system is designed for production environments and incorporates multiple layers of error handling:

*   **Custom Exception Hierarchy:** Allows for specific and actionable error catching (e.g., distinguishing a `ModelLoadError` from a `NavigationError`).
*   **Input Validation:** Each public method rigorously validates its inputs against mathematical and logical constraints before proceeding. The `_validate_image_path` method is particularly extensive, checking for existence, permissions, file type, and content integrity.
*   **Retry Mechanisms:** Core operations, such as web driver initialization and page navigation, are wrapped in retry loops with exponential backoff to handle transient network or system failures.
*   **Fallback Strategies:** The CLIP model loader will automatically fall back from GPU to CPU upon encountering a CUDA out-of-memory error, ensuring the operation can complete, albeit more slowly. The web automation uses a hierarchy of selectors to find UI elements, making it resilient to minor front-end changes.
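
The retry pattern described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the toolkit's implementation: `retry_with_backoff`, its defaults, and the choice of retryable exceptions are all hypothetical.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    retry_on: Tuple[Type[Exception], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation` until it succeeds, doubling the wait after each failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the last error to the caller.
            sleep(base_delay * (2 ** attempt))
    raise AssertionError("unreachable")  # The loop always returns or raises.
```

Injecting `sleep` keeps the helper testable: a test can pass a recording function and check the delays without actually waiting.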

## Contributing

Contributions that adhere to a high standard of methodological rigor are welcome.

1.  Fork the repository.
2.  Create a new branch for your feature (`git checkout -b feature/your-feature-name`).
3.  Develop your feature, ensuring it is accompanied by appropriate unit tests.
4.  Ensure your code adheres to the PEP-8 style guide.
5.  Submit a pull request with a detailed description of your changes and their justification.

## License

This project is licensed under the MIT License. See the `LICENSE` file for details.

## Citation

If you use this toolkit in your academic research, please cite it as follows:

```bibtex
@software{Chirinda_Image_Authentication_Toolkit_2025,
  author = {Chirinda, Craig},
  title = {{Image Authentication Toolkit}},
  year = {2025},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/chirindaopensource/image_authentication_toolkit}}
}
```

---

This README was generated based on the structure and content of image_authentication_tool_draft.ipynb and follows best practices for research software documentation.

# Summary

## Overview

The emergence of sophisticated generative models for visual art has created a non-trivial challenge at the intersection of creativity, copyright, and computational analysis. The core of the issue is the ambiguity of "originality."

When a model trained on a vast corpus of human-created art produces a new image, the question of its provenance—is it a novel creation, a stylistic amalgamation, or a trivial replication of its training data?—cannot be answered through subjective assessment alone.

The `ImageSimilarityDetector` tool was built to meet this demand for a quantitative, multi-faceted, and methodologically rigorous framework to dissect the relationship between a generated image and existing visual works. Its purpose is to replace subjective claims with empirical evidence, enabling a structured analysis of an image's uniqueness, semantic meaning, and public context.

The tool's efficacy stems from its deployment of a spectrum of analytical techniques, each with distinct capabilities and limitations. The most rudimentary methods, `histogram_correlation` and `perceptual_hash_difference`, operate on low-level features.

Histogram analysis provides a global statistical signature of an image's color palette but is entirely blind to spatial structure and semantic content; it can only detect gross color scheme similarities and is thus the weakest signal for originality.
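
The blindness to spatial structure is easy to demonstrate. The sketch below is a deliberate simplification (one-dimensional grayscale histograms instead of the multi-dimensional color histograms the toolkit uses), and `gray_histogram` and `pearson_correlation` are illustrative names.

```python
import math
from typing import List, Sequence


def gray_histogram(pixels: Sequence[int], bins: int = 4, max_value: int = 256) -> List[int]:
    """Count pixels per equal-width intensity bin."""
    width = max_value / bins
    counts = [0] * bins
    for value in pixels:
        counts[min(int(value / width), bins - 1)] += 1
    return counts


def pearson_correlation(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Pearson correlation between two histograms with the same number of bins."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    mean1, mean2 = sum(h1) / len(h1), sum(h2) / len(h2)
    covariance = sum((a - mean1) * (b - mean2) for a, b in zip(h1, h2))
    var1 = sum((a - mean1) ** 2 for a in h1)
    var2 = sum((b - mean2) ** 2 for b in h2)
    if var1 == 0 or var2 == 0:
        raise ValueError("correlation is undefined for a flat histogram")
    return covariance / math.sqrt(var1 * var2)


# The same pixel values in a different spatial order yield identical histograms,
# so the correlation is maximal even though the "images" differ in layout.
image_a = [10, 10, 200, 200, 90, 90, 250, 10]
image_b = list(reversed(image_a))
score = pearson_correlation(gray_histogram(image_a), gray_histogram(image_b))
```

Any permutation of the pixels leaves the score unchanged, which is why this signal cannot by itself indicate copying.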

Perceptual hashing (pHash), based on the Discrete Cosine Transform, is substantially more potent. It creates a compact fingerprint of an image's low-frequency structural information, making it robust to minor perturbations like compression and resizing. Its function is to detect near-verbatim copies. Its limitation is its inability to recognize semantic similarity; it cannot connect two different photographs of the same subject. For AI art, pHash serves as a first-line defense against plagiarism, capable of flagging instances where a model has memorized and reproduced a training image with high fidelity.
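
To make the mechanism concrete, here is a pure-Python sketch of a DCT-based hash. It assumes a square grayscale image that has already been resized (resizing is omitted), uses an unnormalized DCT-II, and thresholds at the median of the low-frequency block, which is one common pHash variant; the toolkit's exact variant is not shown in this document. `low_frequency_dct` and `dct_hash` are illustrative names.

```python
import math
from statistics import median
from typing import List

Image = List[List[float]]


def low_frequency_dct(image: Image, keep: int = 8) -> List[float]:
    """Return the top-left keep x keep block of the unnormalized 2D DCT-II."""
    n = len(image)
    if n < keep or any(len(row) != n for row in image):
        raise ValueError("expected a square image at least `keep` pixels wide")
    # cos_table[u][x] = cos(pi * (2x + 1) * u / (2n)), the 1D DCT-II basis.
    cos_table = [
        [math.cos(math.pi * (2 * x + 1) * u / (2 * n)) for x in range(n)]
        for u in range(keep)
    ]
    coefficients = []
    for u in range(keep):
        for v in range(keep):
            total = 0.0
            for y in range(n):
                row_weight = cos_table[u][y]
                for x in range(n):
                    total += image[y][x] * row_weight * cos_table[v][x]
            coefficients.append(total)
    return coefficients


def dct_hash(image: Image, keep: int = 8) -> int:
    """Set one bit per low-frequency coefficient that exceeds the block median."""
    coefficients = low_frequency_dct(image, keep)
    threshold = median(coefficients)
    bits = 0
    for value in coefficients:
        bits = (bits << 1) | (1 if value > threshold else 0)
    return bits


SIZE = 32
original = [[(3 * x * x + 7 * y + x * y) % 200 for x in range(SIZE)] for y in range(SIZE)]
brighter = [[value + 10 for value in row] for row in original]   # uniform brightness shift
inverted = [[255 - value for value in row] for row in original]  # structurally different

hash_original = dct_hash(original)
distance_brighter = bin(hash_original ^ dct_hash(brighter)).count("1")
distance_inverted = bin(hash_original ^ dct_hash(inverted)).count("1")
```

A uniform brightness shift only changes the DC coefficient, so `distance_brighter` should stay at or near zero, while the inverted image flips the sign of the remaining coefficients and lands far away.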

Moving to a higher level of structural analysis, the `feature_match_ratio` method, which implements the ORB algorithm, provides a more granular comparison. By detecting and matching thousands of salient local features (e.g., corners, textures) between two images, it can identify if one image contains a direct, geometrically consistent copy of a component from another, even if it has been scaled, rotated, or partially occluded. This is mathematically refined through RANSAC-based homography checks to ensure the spatial relationship between matched features is coherent. Its primary limitation is its semantic ignorance; it matches local pixel patterns, not concepts. In the context of AI art, this method is indispensable for detecting "asset ripping" or digital collage, where a model might lift a specific, copyrighted element—a character, a logo, a unique architectural feature—and incorporate it into a new composition. A high ratio of geometrically consistent matches provides strong, quantitative evidence of direct derivation.
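
The consensus idea behind RANSAC can be shown with a much simpler model than a homography. The sketch below swaps in a translation-only model (two unknowns, so a single match determines it) purely for illustration; a homography needs four matches per sample and a linear solve. `ransac_translation` and its parameters are hypothetical and not part of the toolkit.

```python
import random
from typing import Sequence, Tuple

Point = Tuple[float, float]
Match = Tuple[Point, Point]  # (point in image A, matched point in image B)


def ransac_translation(
    matches: Sequence[Match],
    tolerance: float = 3.0,
    iterations: int = 100,
    seed: int = 0,
) -> Tuple[Tuple[float, float], float]:
    """Estimate a consensus (dx, dy) shift and the fraction of matches agreeing with it."""
    if not matches:
        raise ValueError("at least one match is required")
    rng = random.Random(seed)
    best_shift, best_inliers = (0.0, 0.0), -1
    for _ in range(iterations):
        # Hypothesize a model from a minimal random sample (one match).
        (ax, ay), (bx, by) = rng.choice(matches)
        dx, dy = bx - ax, by - ay
        # Count the matches that the hypothesized shift explains within tolerance.
        inliers = sum(
            1
            for (px, py), (qx, qy) in matches
            if abs(px + dx - qx) <= tolerance and abs(py + dy - qy) <= tolerance
        )
        if inliers > best_inliers:
            best_shift, best_inliers = (dx, dy), inliers
    return best_shift, best_inliers / len(matches)


# Eight matches consistent with a shift of (20, -5), plus two spurious matches.
consistent = [((i * 10, i * 7), (i * 10 + 20, i * 7 - 5)) for i in range(8)]
spurious = [((5, 5), (300, 40)), ((60, 10), (2, 90))]
shift, inlier_ratio = ransac_translation(consistent + spurious)
```

The inlier ratio plays the same role as `homography_inlier_ratio` in the usage example: a high value means most matches agree on one geometric transform.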

The most abstract and powerful technique in the toolkit is `clip_embedding_similarity`. This method uses a Vision Transformer to project an image into a high-dimensional semantic embedding space, where the vector's position is determined by the image's conceptual meaning, not its pixel-level appearance. The cosine similarity between two such vectors provides a measure of their semantic proximity. This allows the tool to recognize that a photograph of a cat and a cubist painting of a cat are related, a feat impossible for the other methods. Its primary limitation is that it cannot prove direct, structural copying; high semantic similarity is not, by itself, evidence of copyright infringement. For assessing AI art, this is the essential method for quantifying stylistic influence and conceptual overlap. By comparing a generated piece to an artist's portfolio, one can measure the degree to which the AI has replicated that artist's unique semantic signature, moving the discussion from a vague "it looks like" to a numerical measure of stylistic proximity.

Finally, the `reverse_image_search_google` callable serves a distinct but critical purpose: establishing public provenance. By automating a search, it determines if the image or a near-duplicate already exists on the publicly indexed web. This is not a direct similarity metric but a discovery tool. Its primary limitation is its reliance on a third-party, black-box search algorithm (Google's) and its confinement to publicly indexed content. For AI art, this method provides an essential baseline check. A "no results found" outcome is a preliminary indicator of novelty, not proof of it. Conversely, if the search returns an existing artwork, it provides immediate evidence of replication. Furthermore, the textual descriptions and context of the returned results can offer useful clues about the semantic components the model may have synthesized to create the image, thereby illuminating its conceptual lineage.

In synthesis, no single method is sufficient for the rigorous task of assessing the authenticity of AI art. The strength of the `ImageSimilarityDetector` lies in its integrated, multi-modal approach. A robust analysis requires a fusion of these techniques: pHash to check for direct replication, feature matching to detect asset collage, CLIP similarity to quantify stylistic and conceptual derivation, and reverse image search to establish public context. The collective output of these methods transforms the ambiguous question of originality into a multi-dimensional, quantitative assessment, providing the empirical foundation necessary for informed and defensible conclusions in the management of digital assets.

## Usage Examples

### Perceptual Hash Analysis

**Purpose:** To detect near-verbatim structural copies of an image. This method is robust to minor modifications such as resizing, compression, and slight color shifts.  
**Mathematical Basis:** The pHash algorithm uses the Discrete Cosine Transform (DCT) to extract a low-frequency fingerprint of the image. The Hamming distance between the binary fingerprints of two images measures their structural dissimilarity.  
**Usage Snippet:** We will invoke the `perceptual_hash_difference` method with full statistical analysis enabled. This provides not only the raw distance but also a confidence interval, giving us a measure of certainty in the result.

```python
# Snippet 1: Perceptual Hash Analysis
try:
    # Execute perceptual hash analysis with full statistical reporting.
    # This method is computationally inexpensive and effective for finding near-duplicates.
    p_hash_results = detector.perceptual_hash_difference(
        image1_path,
        image2_path,
        hash_size=16,  # Use a 16x16 hash (256 bits) for higher precision.
        normalize=True,
        return_similarity=True,
        statistical_analysis=True
    )
    # Store the comprehensive results.
    analysis_results['perceptual_hash'] = p_hash_results
except (ImageUnreadableError, ValueError, RuntimeError) as e:
    # Handle errors related to image processing or hash computation.
    analysis_results['perceptual_hash'] = {'error': str(e)}
```

### Local Feature Matching Analysis

**Purpose:** To identify if one image contains geometrically consistent sections of another. This is critical for detecting digital collage or "asset ripping."  
**Mathematical Basis:** The ORB algorithm detects salient keypoints (corners) and generates binary descriptors. These are matched using Hamming distance. The RANSAC algorithm is then used to find a homography (a perspective transform) that explains the spatial relationship between the matched points, filtering out spurious matches.  
**Usage Snippet:** We invoke `feature_match_ratio` with geometric verification and detailed results enabled. The key output is the `homography_inlier_ratio`, which quantifies the geometric consistency of the match.

```python
# Snippet 2: Local Feature Matching Analysis
try:
    # Execute feature matching with geometric verification (RANSAC).
    # This is computationally more intensive but detects structural copying, even with transformations.
    feature_match_results = detector.feature_match_ratio(
        image1_path,
        image2_path,
        distance_threshold=64,  # A standard Hamming distance threshold for ORB.
        normalization_strategy="min_keypoints",
        apply_ratio_test=True,  # Use Lowe's ratio test for more robust matching.
        ratio_threshold=0.75,
        resize_max_side=1024,  # Resize for performance without significant feature loss.
        return_detailed_result=True,
        geometric_verification=True
    )
    # Store the comprehensive FeatureMatchResult object.
    analysis_results['feature_matching'] = feature_match_results
except (ImageUnreadableError, RuntimeError) as e:
    # Handle errors in the feature detection or matching pipeline.
    analysis_results['feature_matching'] = {'error': str(e)}
```

### Global Color Distribution Analysis

**Purpose:** To compare the overall color palettes of two images. This is the weakest signal for originality but can be useful for identifying images with a shared aesthetic or origin (e.g., from the same film scene).
**Mathematical Basis:** Images are converted to a color space that separates chromatic information from brightness (such as HSV). A multi-dimensional histogram is computed, normalized to a probability distribution, and then compared using a statistical metric such as Pearson correlation.
**Usage Snippet:** We invoke `histogram_correlation` using the HSV color space, as it decouples illumination from color information, making the comparison more robust.

```python
# Snippet 3: Global Color Distribution Analysis
try:
    # Execute color histogram correlation.
    # This measures similarity in the global color distribution, ignoring spatial structure.
    histogram_results = detector.histogram_correlation(
        image1_path,
        image2_path,
        metric="correlation",  # Pearson correlation is a robust similarity measure.
        color_space="HSV",  # HSV is robust to lighting changes.
        statistical_analysis=True
    )
    # Store the comprehensive results.
    analysis_results['histogram_correlation'] = histogram_results
except (ImageUnreadableError, HistogramError) as e:
    # Handle errors in histogram computation or comparison.
    analysis_results['histogram_correlation'] = {'error': str(e)}
```
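
To make the normalization and comparison steps concrete, here is a minimal sketch of Pearson correlation between two histograms, written in pure Python. It is an illustration under simplifying assumptions: the histograms are flat lists of bin counts, and `normalize_histogram` and `pearson_correlation` are hypothetical names, not the toolkit's API.

```python
import math
from typing import List, Sequence


def normalize_histogram(counts: Sequence[float]) -> List[float]:
    """Scale bin counts so they sum to 1 (a discrete probability distribution)."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("histogram must contain at least one count")
    return [c / total for c in counts]


def pearson_correlation(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Pearson correlation between two histograms with the same number of bins."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    n = len(h1)
    mean1, mean2 = sum(h1) / n, sum(h2) / n
    cov = sum((a - mean1) * (b - mean2) for a, b in zip(h1, h2))
    var1 = sum((a - mean1) ** 2 for a in h1)
    var2 = sum((b - mean2) ** 2 for b in h2)
    denom = math.sqrt(var1 * var2)
    if denom == 0.0:
        # A perfectly flat histogram has undefined correlation;
        # returning 0.0 is a convention chosen for this sketch.
        return 0.0
    return cov / denom
```

Because the histograms are normalized first, two images with the same color distribution but different resolutions produce the same score.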

### Semantic Meaning Analysis

**Purpose:** To measure the conceptual or semantic similarity between two images, irrespective of their visual style or composition.
**Mathematical Basis:** CLIP's image encoder (a Vision Transformer in the `ViT-*` variants) projects each image into the high-dimensional embedding space that the model shares with text. The cosine similarity of the resulting vectors measures their proximity in this "meaning" space. The score ranges from -1 to 1, and a value near 1.0 indicates high semantic overlap.
**Usage Snippet:** We invoke `clip_embedding_similarity` with full analysis enabled to get not only the similarity score but also a confidence interval and properties of the embedding vectors themselves.

```python
# Snippet 4: Semantic Meaning Analysis
try:
    # Execute semantic similarity analysis using the CLIP model.
    # This is the most abstract comparison, measuring conceptual similarity.
    clip_results = detector.clip_embedding_similarity(
        image1_path,
        image2_path,
        statistical_analysis=True,
        embedding_analysis=True
    )
    # Store the comprehensive results.
    analysis_results['semantic_similarity'] = clip_results
except (ModelLoadError, ModelInferenceError, ValueError) as e:
    # Handle errors related to model loading or inference.
    analysis_results['semantic_similarity'] = {'error': str(e)}
```
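
The similarity computation itself is small once the embeddings exist. The sketch below assumes the embeddings are already available as plain lists of floats (no model is loaded) and uses hypothetical helper names. It L2-normalizes each vector and takes the dot product, which is equivalent to cosine similarity.

```python
import math
from typing import List, Sequence


def l2_normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


def embedding_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity, computed as the dot product of unit vectors."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimensionality")
    return sum(a * b for a, b in zip(l2_normalize(u), l2_normalize(v)))
```

Normalizing once and reusing the unit vectors is convenient when one image is compared against many others, since each comparison then reduces to a dot product.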

### Public Provenance and Context Analysis

**Purpose:** To determine whether an image already appears on the public web and to gather context about its usage. This is a discovery method, not a comparison method.
**Mathematical Basis:** This process leverages the complex, proprietary algorithms of a third-party search engine (Google). The core principle is to use the image itself as a query to a vast, indexed database of web content.
**Usage Snippet:** We invoke `reverse_image_search_google` on one of the images. The function automates a web browser to perform the search and scrapes the results, which are then structured into a `ReverseImageSearchResult` object.

```python
# Snippet 5: Public Provenance and Context Analysis
try:
    # Execute a reverse image search on the first image to establish public provenance.
    # This checks if the image or near-duplicates exist on the public internet.
    reverse_search_results = detector.reverse_image_search_google(
        image_path=image1_path,
        driver_path=chromedriver_path,
        headless=True  # Run in the background for automated environments.
    )
    # Store the comprehensive ReverseImageSearchResult object.
    analysis_results['reverse_image_search'] = reverse_search_results
except (LaunchError, NavigationError, UploadError, ExtractionError) as e:
    # Handle the various failure modes of web automation.
    analysis_results['reverse_image_search'] = {'error': str(e)}
except Exception as e:
    # Handle any other unexpected errors during the search.
    analysis_results['reverse_image_search'] = {'error': f"An unexpected error occurred: {e}"}
```



# Imports

# Import Essential Modules
# Standard Library Imports
import csv
import datetime
import functools
import gc
import hashlib
import inspect
import json
import logging
import os
import pickle
import statistics
import sys
import threading
import time
import traceback
import warnings
import xml.etree.ElementTree as ET
from abc import ABC, abstractmethod
from collections import defaultdict
from dataclasses import dataclass, field, fields, asdict
from enum import Enum
from pathlib import Path
from typing import (
    Any,
    Callable,
    ClassVar,
    Dict,
    Iterator,
    List,
    Literal,
    Optional,
    Protocol,
    Tuple,
    Type,
    Union,
    runtime_checkable
)

# Third-Party Imports
import clip
import cv2
import imagehash
import numpy as np
import pandas as pd
import psutil
import scipy.stats as stats
import torch
from PIL import Image
from selenium import webdriver
from selenium.common.exceptions import TimeoutException, WebDriverException
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait



# Implementation

## Draft 1
### Description of Callables in `image_authentication_tool_draft.ipynb`

The notebook is structured into several cells, primarily defining custom exceptions, protocols, factory classes, result data structures, and the main `ImageSimilarityDetector` class. Below is a description of all the callables in the notebook in their order of appearance.

---

### **Custom Error Classes and Protocol Classes**

This cell establishes the foundational components for error handling, protocol-driven design, and runtime validation.

**1. `ForensicMetadata.__init__`**
*   **Inputs:**
    *   `operation_name` (str): The name of the operation where the error occurred.
    *   `thread_id` (Optional[int]): The identifier of the executing thread. Defaults to the current thread's ID.
    *   `process_id` (Optional[int]): The identifier of the executing process. Defaults to the current process's ID.
    *   `timestamp` (Optional[datetime.datetime]): The precise UTC timestamp of the event. Defaults to the current UTC time.
    *   `system_info` (Optional[Dict[str, Any]]): A dictionary of system state information.
    *   `algorithm_parameters` (Optional[Dict[str, Any]]): A dictionary of parameters used by the algorithm at the time of the event.
    *   `memory_usage` (Optional[Dict[str, float]]): A dictionary of memory metrics.
    *   `call_stack` (Optional[List[str]]): The execution call stack. Defaults to the current stack trace.
*   **Processes:**
    1.  The method initializes an instance of the `ForensicMetadata` class.
    2.  It assigns the provided `operation_name` to an instance variable.
    3.  For optional parameters (`thread_id`, `process_id`, `timestamp`, `call_stack`), it checks if a value was provided. If not, it programmatically captures the current state using the `threading`, `os`, `datetime`, and `traceback` modules, respectively.
    4.  It assigns the remaining dictionary-based inputs (`system_info`, `algorithm_parameters`, `memory_usage`) to instance variables, defaulting to empty dictionaries if none are provided.
*   **Outputs:**
    *   A fully instantiated `ForensicMetadata` object containing a comprehensive snapshot of the system and execution context at the time of its creation.

**2. `ForensicMetadata.to_dict`**
*   **Inputs:**
    *   `self` (ForensicMetadata): The instance of the class.
*   **Processes:**
    1.  The method constructs a new dictionary.
    2.  It populates this dictionary with the values of all instance variables (`operation_name`, `thread_id`, etc.).
    3.  The `timestamp` (a `datetime` object) is transformed into a string by calling its `isoformat()` method, ensuring JSON compatibility.
*   **Outputs:**
    *   A `Dict[str, Any]` representing the serialized state of the `ForensicMetadata` object.

**3. `ForensicMetadata.to_json`**
*   **Inputs:**
    *   `self` (ForensicMetadata): The instance of the class.
*   **Processes:**
    1.  It first calls `self.to_dict()` to get the dictionary representation of the object.
    2.  This dictionary is then passed to the `json.dumps()` function.
    3.  The serialization process is configured with an indent of 2 for human readability and a `default` handler of `str` to manage any non-standard data types that might remain.
*   **Outputs:**
    *   A `str` containing the JSON-formatted representation of the forensic metadata.
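
The three `ForensicMetadata` methods described above can be sketched as follows. This is a minimal reconstruction from the description, not the notebook's code: the class name `ForensicMetadataSketch` is hypothetical, and the use of `threading.get_ident()`, `os.getpid()`, and `traceback.format_stack()` for the defaults is an assumption.

```python
import datetime
import json
import os
import threading
import traceback
from typing import Any, Dict, List, Optional


class ForensicMetadataSketch:
    """Snapshot of the execution context at the time of creation."""

    def __init__(
        self,
        operation_name: str,
        thread_id: Optional[int] = None,
        process_id: Optional[int] = None,
        timestamp: Optional[datetime.datetime] = None,
        system_info: Optional[Dict[str, Any]] = None,
        algorithm_parameters: Optional[Dict[str, Any]] = None,
        memory_usage: Optional[Dict[str, float]] = None,
        call_stack: Optional[List[str]] = None,
    ) -> None:
        self.operation_name = operation_name
        # Capture the current context for anything the caller did not supply.
        self.thread_id = thread_id if thread_id is not None else threading.get_ident()
        self.process_id = process_id if process_id is not None else os.getpid()
        self.timestamp = timestamp or datetime.datetime.now(datetime.timezone.utc)
        self.call_stack = call_stack if call_stack is not None else traceback.format_stack()
        # Dictionary inputs default to empty dictionaries.
        self.system_info = system_info or {}
        self.algorithm_parameters = algorithm_parameters or {}
        self.memory_usage = memory_usage or {}

    def to_dict(self) -> Dict[str, Any]:
        return {
            "operation_name": self.operation_name,
            "thread_id": self.thread_id,
            "process_id": self.process_id,
            "timestamp": self.timestamp.isoformat(),  # datetime -> JSON-safe string
            "system_info": self.system_info,
            "algorithm_parameters": self.algorithm_parameters,
            "memory_usage": self.memory_usage,
            "call_stack": self.call_stack,
        }

    def to_json(self) -> str:
        # default=str converts any remaining non-JSON types to strings.
        return json.dumps(self.to_dict(), indent=2, default=str)
```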

**4. `ImageSimilarityError.__init__`**
*   **Inputs:**
    *   `message` (str): A human-readable description of the error.
    *   `severity` (ErrorSeverity): An enum value classifying the error's severity.
    *   `operation_name` (Optional[str]): The name of the failed operation.
    *   `error_code` (Optional[str]): A unique code for the error.
    *   `forensic_metadata` (Optional[ForensicMetadata]): A pre-populated forensic metadata object.
    *   `algorithm_context` (Optional[Dict[str, Any]]): Algorithm-specific state.
    *   `original_exception` (Optional[Exception]): The underlying exception being wrapped.
    *   `suggested_remediation` (Optional[str]): Guidance for resolving the error.
*   **Processes:**
    1.  The method first validates the `message` input, providing a default if it's `None` or not a string.
    2.  It calls `super().__init__(message)` to initialize the base `Exception` class.
    3.  It assigns all provided inputs to corresponding instance variables, applying default values where inputs are `None`.
    4.  If `forensic_metadata` is not provided, it instantiates a new `ForensicMetadata` object, capturing the current context.
    5.  It captures the current UTC time as the error `timestamp`.
*   **Outputs:**
    *   A fully instantiated `ImageSimilarityError` object (or a subclass thereof), which is a specialized exception containing rich contextual information for debugging and logging.

**5. `ImageSimilarityError.__str__`**
*   **Inputs:**
    *   `self` (ImageSimilarityError): The instance of the exception.
*   **Processes:**
    1.  It constructs a formatted string, starting with the severity level (e.g., `[HIGH]`) and the core error message.
    2.  It conditionally appends the `operation_name`, `error_code`, and `suggested_remediation` if they are available, creating a concise, human-readable summary of the error.
*   **Outputs:**
    *   A `str` suitable for direct display to a user or for simple log messages.

**6. `ImageSimilarityError.to_dict`**
*   **Inputs:**
    *   `self` (ImageSimilarityError): The instance of the exception.
*   **Processes:**
    1.  It creates a dictionary containing the primary attributes of the exception (`error_class`, `message`, `severity`, etc.).
    2.  The `severity` enum is transformed into its string value. The `timestamp` is transformed into an ISO format string.
    3.  If `forensic_metadata` exists, it calls `self.forensic_metadata.to_dict()` and embeds the resulting dictionary.
    4.  If `original_exception` exists, it creates a nested dictionary containing the type, message, and arguments of the original exception.
*   **Outputs:**
    *   A `Dict[str, Any]` containing a structured, serializable representation of the entire exception state.

**7. `ImageSimilarityError.log_error`**
*   **Inputs:**
    *   `self` (ImageSimilarityError): The instance of the exception.
    *   `logger` (Optional[logging.Logger]): An optional logger instance.
*   **Processes:**
    1.  It obtains a logger instance, either the one provided or a new one based on the module and class name.
    2.  It maps the `ErrorSeverity` enum of the exception to a `logging` level (e.g., `ErrorSeverity.HIGH` to `logging.ERROR`).
    3.  It calls `self.to_dict()` to get the full structured error data.
    4.  It uses the logger's `log` method to record the error, passing the structured data dictionary to the `extra` parameter. This enables structured logging systems (like those outputting JSON logs) to capture the full error context.
*   **Outputs:**
    *   None. The process results in a side effect: a log record is emitted.
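
A condensed sketch of the exception pattern described in this group of methods follows. It is illustrative only: the class name, the enum members other than `HIGH`, the separator used in `__str__`, and the severity-to-level mapping are assumptions. The structured payload is nested under a single `extra` key (`error_data`, a hypothetical name) because `logging` rejects `extra` keys such as `message` that collide with built-in `LogRecord` attributes.

```python
import logging
from enum import Enum
from typing import Any, Dict, Optional


class ErrorSeverity(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"


# Assumed mapping from severity to logging level.
_SEVERITY_TO_LEVEL = {
    ErrorSeverity.LOW: logging.INFO,
    ErrorSeverity.MEDIUM: logging.WARNING,
    ErrorSeverity.HIGH: logging.ERROR,
    ErrorSeverity.CRITICAL: logging.CRITICAL,
}


class SketchSimilarityError(Exception):
    def __init__(
        self,
        message: Any,
        severity: ErrorSeverity = ErrorSeverity.MEDIUM,
        operation_name: Optional[str] = None,
        error_code: Optional[str] = None,
        suggested_remediation: Optional[str] = None,
    ) -> None:
        if not isinstance(message, str) or not message:
            message = "Unknown error"  # default for missing or invalid messages
        super().__init__(message)
        self.message = message
        self.severity = severity
        self.operation_name = operation_name
        self.error_code = error_code
        self.suggested_remediation = suggested_remediation

    def __str__(self) -> str:
        parts = [f"[{self.severity.name}] {self.message}"]
        if self.operation_name:
            parts.append(f"operation={self.operation_name}")
        if self.error_code:
            parts.append(f"code={self.error_code}")
        if self.suggested_remediation:
            parts.append(f"remediation={self.suggested_remediation}")
        return " | ".join(parts)

    def to_dict(self) -> Dict[str, Any]:
        return {
            "error_class": type(self).__name__,
            "message": self.message,
            "severity": self.severity.value,
            "operation_name": self.operation_name,
            "error_code": self.error_code,
        }

    def log_error(self, logger: Optional[logging.Logger] = None) -> None:
        logger = logger or logging.getLogger(f"{__name__}.{type(self).__name__}")
        level = _SEVERITY_TO_LEVEL[self.severity]
        # Nest the payload so it cannot clash with LogRecord attribute names.
        logger.log(level, str(self), extra={"error_data": self.to_dict()})
```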

**8. `protocol_validator` (Decorator Factory)**
*   **Inputs:**
    *   `constraints` (Optional[Dict[str, Any]]): A dictionary defining validation rules for function parameters and its return value.
    *   `log_violations` (bool): A flag to control whether violations are logged.
*   **Processes:**
    1.  This is a factory that returns a decorator. When called, it captures the `constraints` and `log_violations` in a closure.
    2.  The returned `decorator` function takes a function `func` as input.
    3.  The `decorator` returns a `wrapper` function that replaces the original `func`.
    4.  When the `wrapper` is called with `*args` and `**kwargs`:
        a. It uses `inspect.signature` to map the provided arguments to the parameter names of the decorated function.
        b. It iterates through the `constraints` dictionary. For each parameter, it checks if the provided value violates the specified 'type', 'bounds', or 'shape' constraints.
        c. If a violation is found, it logs the error (if `log_violations` is true) and raises a `ValueError` or `TypeError`.
        d. If all input validations pass, it calls the original function `func` with the arguments.
        e. It then validates the `result` of the function call against any 'return_value' constraints in the same manner.
        f. If the return value is valid, it is returned to the caller.
*   **Outputs:**
    *   A decorator function (`decorator`) which, when applied to another function, returns a new function (`wrapper`) that performs runtime validation before and after executing the original function's logic.
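
The decorator-factory structure described above can be sketched as follows. This is a simplified illustration, not the notebook's implementation: it handles only the 'type' and 'bounds' rules (no 'shape'), and the names `protocol_validator_sketch` and `_find_violation` are hypothetical.

```python
import functools
import inspect
import logging
from typing import Any, Callable, Dict, Optional

logger = logging.getLogger(__name__)


def _find_violation(name: str, value: Any, rule: Dict[str, Any]) -> Optional[Exception]:
    """Return the exception describing a violated rule, or None if the value is valid."""
    expected = rule.get("type")
    if expected is not None and not isinstance(value, expected):
        return TypeError(f"{name}: expected {expected.__name__}, got {type(value).__name__}")
    bounds = rule.get("bounds")
    if bounds is not None and not (bounds[0] <= value <= bounds[1]):
        return ValueError(f"{name}: {value!r} outside [{bounds[0]}, {bounds[1]}]")
    return None


def protocol_validator_sketch(
    constraints: Optional[Dict[str, Any]] = None, log_violations: bool = True
) -> Callable:
    constraints = constraints or {}

    def check(name: str, value: Any) -> None:
        error = _find_violation(name, value, constraints[name])
        if error is not None:
            if log_violations:
                logger.warning("constraint violation: %s", error)
            raise error

    def decorator(func: Callable) -> Callable:
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            # Map positional and keyword arguments onto parameter names.
            bound = signature.bind(*args, **kwargs)
            bound.apply_defaults()
            for name, value in bound.arguments.items():
                if name in constraints:
                    check(name, value)
            result = func(*args, **kwargs)
            if "return_value" in constraints:
                check("return_value", result)
            return result

        return wrapper

    return decorator
```

Applied as `@protocol_validator_sketch({"ratio": {"type": float, "bounds": (0.0, 1.0)}})`, the wrapper rejects an out-of-range `ratio` before the decorated function body runs.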

**9. `validate_protocol_implementation`**
*   **Inputs:**
    *   `obj` (Any): The object instance whose class is being validated.
    *   `protocol_class` (Type[Protocol]): The protocol class that `obj` is expected to implement.
    *   `strict` (bool): A flag to enable strict checking of parameter names and type annotations.
*   **Processes:**
    1.  It initializes a list `violations` to store descriptions of any conformance failures.
    2.  It iterates through the attributes of the `protocol_class` to identify all public callable methods.
    3.  For each protocol method, it checks for its existence and callability on the input `obj`.
    4.  Using the `inspect` module, it retrieves the signatures of both the protocol method and the implementation method.
    5.  It compares the number of parameters.
    6.  If `strict` is true, it performs a parameter-by-parameter comparison of names and type annotations.
    7.  Any discrepancy is transformed into a descriptive string and appended to the `violations` list.
*   **Outputs:**
    *   A `Tuple[bool, List[str]]`: The boolean indicates overall validation success, and the list contains all identified violation messages.
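
The conformance check can be approximated with `inspect.signature`, as in the sketch below. It is a simplified reading of the steps above, with a hypothetical function name and a hypothetical `PerceptualHasher` protocol used purely as an example. Only public methods defined directly on the protocol class are inspected.

```python
import inspect
from typing import Any, List, Protocol, Tuple


def validate_protocol_sketch(
    obj: Any, protocol_class: type, strict: bool = False
) -> Tuple[bool, List[str]]:
    violations: List[str] = []
    for name, member in vars(protocol_class).items():
        if name.startswith("_") or not callable(member):
            continue  # only public callables are part of the contract
        impl = getattr(obj, name, None)
        if impl is None or not callable(impl):
            violations.append(f"missing method: {name}")
            continue
        # The protocol function is unbound, so drop its leading 'self'
        # before comparing with the bound implementation method.
        proto_params = list(inspect.signature(member).parameters.values())[1:]
        impl_params = list(inspect.signature(impl).parameters.values())
        if len(proto_params) != len(impl_params):
            violations.append(
                f"{name}: expected {len(proto_params)} parameters, found {len(impl_params)}"
            )
            continue
        if strict:
            for expected, actual in zip(proto_params, impl_params):
                if expected.name != actual.name:
                    violations.append(
                        f"{name}: parameter '{actual.name}' should be '{expected.name}'"
                    )
                elif expected.annotation != actual.annotation:
                    violations.append(f"{name}: annotation mismatch on '{expected.name}'")
    return (not violations, violations)


class PerceptualHasher(Protocol):  # hypothetical protocol for illustration
    def compute(self, path: str) -> str: ...


class FileNameHasher:  # hypothetical implementation
    def compute(self, path: str) -> str:
        return path.lower()


ok, problems = validate_protocol_sketch(FileNameHasher(), PerceptualHasher, strict=True)
```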

**10. `register_protocol_implementation`**
*   **Inputs:**
    *   `impl_class` (Type): The class that is intended to implement the protocol.
    *   `protocol_class` (Type[Protocol]): The protocol class.
    *   `validate_on_registration` (bool): A flag to control whether to perform validation at registration time.
*   **Processes:**
    1.  If `validate_on_registration` is true, it attempts to create a temporary instance of `impl_class`. This requires the class to have a parameter-less `__init__` method.
    2.  It then calls `validate_protocol_implementation` on this temporary instance.
    3.  If validation fails, it aggregates the violation messages and raises a `TypeError`.
    4.  If validation succeeds (or is skipped), it calls `protocol_class.register(impl_class)`. This method, provided by the `abc.ABCMeta` metaclass that protocol classes build on, makes Python's `isinstance()` checks recognize `impl_class` as a virtual subclass of `protocol_class` without requiring traditional inheritance.
*   **Outputs:**
    *   The original `impl_class`, now registered as an implementer of the protocol.

---

### **Factory Classes for Resource Management**

This cell defines the machinery for creating and managing computational resources, particularly for computer vision algorithms, with a focus on optimization and performance.

**1. `ParameterOptimizer.__init__`**
*   **Inputs:**
    *   `constraints` (Dict[str, ParameterConstraints]): A mapping of parameter names to their mathematical and optimization constraints.
    *   `optimization_strategy` (OptimizationStrategy): An enum specifying the search algorithm to use.
    *   `performance_samples` (int): The number of times to run the objective function for a stable performance measurement.
    *   `convergence_threshold` (float): The minimum improvement required to consider a new set of parameters "better".
    *   `max_iterations` (int): The maximum number of optimization steps to perform.
*   **Processes:**
    1.  The method initializes an instance of the `ParameterOptimizer`.
    2.  It stores all the input configurations as instance variables.
    3.  It initializes internal state variables: `optimization_history` (an empty list to track steps), `best_parameters` (None), and `best_performance` (None).
    4.  It acquires a logger instance for monitoring the optimization process.
*   **Outputs:**
    *   A configured `ParameterOptimizer` object, ready to run an optimization task.

**2. `ParameterOptimizer.validate_parameters`**
*   **Inputs:**
    *   `self` (ParameterOptimizer): The instance of the class.
    *   `parameters` (Dict[str, Any]): A dictionary of parameter values to be validated.
*   **Processes:**
    1.  It iterates through the input `parameters` dictionary.
    2.  For each parameter, it retrieves the corresponding `ParameterConstraints` object from `self.constraints`.
    3.  It performs a series of checks on the parameter's value against its defined constraints:
        a. Type check (`isinstance`).
        b. Bounds check (min <= value <= max).
        c. Dependency check by calling `_validate_dependency` for any specified inter-parameter relationships.
    4.  Each violation is transformed into a descriptive string and added to a `violations` list.
*   **Outputs:**
    *   A `Tuple[bool, List[str]]`: The boolean indicates if all parameters are valid, and the list contains messages for any violations found.
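
The type, bounds, and dependency checks can be sketched with a small dataclass standing in for `ParameterConstraints`. The field names below (`expected_type`, `min_value`, `max_value`, `dependency`) are assumptions, and the `dependency` predicate is a hypothetical stand-in for `_validate_dependency`.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional, Tuple


@dataclass
class ConstraintSketch:
    expected_type: type
    min_value: Optional[float] = None
    max_value: Optional[float] = None
    # Predicate over the full parameter set, for inter-parameter rules.
    dependency: Optional[Callable[[Dict[str, Any]], bool]] = None


def validate_parameters_sketch(
    parameters: Dict[str, Any], constraints: Dict[str, ConstraintSketch]
) -> Tuple[bool, List[str]]:
    violations: List[str] = []
    for name, value in parameters.items():
        constraint = constraints.get(name)
        if constraint is None:
            continue  # unconstrained parameters pass through
        if not isinstance(value, constraint.expected_type):
            violations.append(
                f"{name}: expected {constraint.expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
            continue  # bounds are meaningless for a value of the wrong type
        if constraint.min_value is not None and value < constraint.min_value:
            violations.append(f"{name}: {value} is below minimum {constraint.min_value}")
        if constraint.max_value is not None and value > constraint.max_value:
            violations.append(f"{name}: {value} is above maximum {constraint.max_value}")
        if constraint.dependency is not None and not constraint.dependency(parameters):
            violations.append(f"{name}: dependency check failed")
    return (not violations, violations)
```

Collecting every violation instead of raising on the first one lets a caller report all problems with a candidate parameter set in a single pass.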

**3. `ParameterOptimizer.optimize_parameters`**
*   **Inputs:**
    *   `self` (ParameterOptimizer): The instance of the class.
    *   `objective_function` (Callable): A function that takes a dictionary of parameters and returns a `PerformanceMetrics` object.
    *   `initial_parameters` (Optional[Dict[str, Any]]): An optional starting point for the optimization.
*   **Processes:**
    1.  It establishes a starting set of parameters, either from `initial_parameters` (after validation) or by generating defaults.
    2.  It enters a loop that runs for a maximum of `max_iterations`.
    3.  Inside the loop, it generates a new set of `candidate_parameters` by calling a step function (`_grid_search_step`, `_random_search_step`, etc.) based on the configured `optimization_strategy`.
    4.  It validates these candidate parameters.
    5.  If valid, it calls `_evaluate_parameters_with_statistics`, which repeatedly executes the `objective_function` with the candidate parameters to get a statistically stable performance measurement.
    6.  It computes a single `composite_score` from the resulting `PerformanceMetrics` object.
    7.  This score is compared to the current `best_score`. If it represents a significant improvement (exceeding `convergence_threshold`), the `best_score`, `best_parameters`, and `best_performance` are updated.
    8.  The results of the iteration are appended to `self.optimization_history`.
    9.  The loop includes a convergence check: if there is no improvement for a set number of consecutive iterations, the optimization terminates early.
*   **Outputs:**
    *   A `Tuple[Dict[str, Any], PerformanceMetrics]`: The best parameter set found and the corresponding performance metrics.
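
The optimization loop can be illustrated with a random-search variant. The sketch below makes several simplifying assumptions: the objective returns a single float (standing in for the composite score derived from `PerformanceMetrics`), all parameters are continuous, and the `patience`, `samples`, and `seed` arguments are hypothetical names for the early-stopping window, the number of repeated evaluations, and the random seed.

```python
import random
from typing import Any, Callable, Dict, List, Optional, Tuple


def random_search_sketch(
    objective: Callable[[Dict[str, float]], float],
    bounds: Dict[str, Tuple[float, float]],
    max_iterations: int = 50,
    convergence_threshold: float = 1e-3,
    patience: int = 10,
    samples: int = 3,
    seed: int = 0,
) -> Tuple[Optional[Dict[str, float]], float, List[Dict[str, Any]]]:
    rng = random.Random(seed)
    best_params: Optional[Dict[str, float]] = None
    best_score = float("-inf")
    stale_iterations = 0
    history: List[Dict[str, Any]] = []

    for iteration in range(max_iterations):
        # Draw a candidate inside the bounds, so it is valid by construction.
        candidate = {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}
        # Average repeated evaluations for a more stable measurement.
        score = sum(objective(candidate) for _ in range(samples)) / samples
        improved = score > best_score + convergence_threshold
        if improved:
            best_params, best_score = candidate, score
            stale_iterations = 0
        else:
            stale_iterations += 1
        history.append({"iteration": iteration, "score": score, "improved": improved})
        if stale_iterations >= patience:
            break  # no significant improvement for `patience` iterations
    return best_params, best_score, history
```

For example, `random_search_sketch(lambda p: -(p["x"] - 2.0) ** 2, {"x": (0.0, 4.0)})` searches for the value of `x` that maximizes the objective.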

**4. `ResourceManager.__init__`**
*   **Inputs:**
    *   `resource_constraints` (ResourceConstraints): An enum indicating the resource environment (e.g., `LOW_MEMORY`).
    *   `monitoring_interval` (float): The time in seconds between resource checks.
    *   `cleanup_threshold` (float): The resource usage percentage (e.g., 0.8 for 80%) that triggers a cleanup.
*   **Processes:**
    1.  Initializes a `ResourceManager` instance, storing the configuration.
    2.  Initializes state variables for monitoring, including a list for `resource_history` and thread-safe locks.
    3.  Calls `_configure_resource_limits` to transform the abstract `resource_constraints` enum into concrete numerical limits (e.g., max memory in bytes, max CPU cores).
*   **Outputs:**
    *   A configured `ResourceManager` object.

**5. `ResourceManager.start_monitoring`**
*   **Inputs:**
    *   `self` (ResourceManager): The instance of the class.
*   **Processes:**
    1.  It acquires a lock to ensure thread safety.
    2.  It checks if monitoring is already active to prevent creating duplicate threads.
    3.  If not active, it sets the `monitoring_active` flag to `True`.
    4.  It creates and starts a new `threading.Thread`, targeting the `_monitoring_loop` method. The thread is configured as a daemon so it does not block program exit.
*   **Outputs:**
    *   None. The process results in a side effect: a background monitoring thread is started.

**6. `ResourceManager._monitoring_loop`**
*   **Inputs:**
    *   `self` (ResourceManager): The instance of the class.
*   **Processes:**
    1.  This method runs in a continuous loop as long as the `monitoring_active` flag is `True`.
    2.  In each iteration, it calls `_collect_resource_stats` to get current system usage.
    3.  The collected stats are appended to the `resource_history` list, which is periodically trimmed to prevent unbounded memory growth.
    4.  It calls `_should_trigger_cleanup` to check if resource usage has exceeded the configured thresholds.
    5.  If cleanup is needed, it spawns another thread to execute `_perform_cleanup`, ensuring the monitoring loop itself is not blocked.
    6.  It sleeps for the duration of `monitoring_interval`.
*   **Outputs:**
    *   None. This is a perpetual loop that modifies the object's state and may trigger cleanup actions.
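
The start-up and loop logic described for `start_monitoring` and `_monitoring_loop` can be sketched as below. This is an illustration with deliberate simplifications: resource collection is injected as a callable returning a usage fraction (instead of querying the system), a `threading.Event` replaces the boolean `monitoring_active` flag so the loop can be stopped promptly, and a bounded `deque` handles history trimming. The class and argument names are hypothetical.

```python
import threading
import time
from collections import deque
from typing import Callable, Optional


class MonitorSketch:
    def __init__(
        self,
        collect: Callable[[], float],
        monitoring_interval: float = 1.0,
        cleanup_threshold: float = 0.8,
        on_cleanup: Optional[Callable[[], None]] = None,
        history_size: int = 100,
    ) -> None:
        self._collect = collect
        self.monitoring_interval = monitoring_interval
        self.cleanup_threshold = cleanup_threshold
        self._on_cleanup = on_cleanup
        self.resource_history = deque(maxlen=history_size)  # trims itself
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread: Optional[threading.Thread] = None

    def start_monitoring(self) -> None:
        with self._lock:
            if self._thread is not None and self._thread.is_alive():
                return  # already running; do not create a duplicate thread
            self._stop.clear()
            self._thread = threading.Thread(target=self._monitoring_loop, daemon=True)
            self._thread.start()

    def stop_monitoring(self) -> None:
        self._stop.set()
        with self._lock:
            thread = self._thread
        if thread is not None:
            thread.join()

    def _monitoring_loop(self) -> None:
        while not self._stop.is_set():
            usage = self._collect()
            self.resource_history.append(usage)
            if usage >= self.cleanup_threshold and self._on_cleanup is not None:
                # Run cleanup on its own thread so monitoring is not blocked.
                threading.Thread(target=self._on_cleanup, daemon=True).start()
            # Sleep for the interval, but wake immediately on stop.
            self._stop.wait(self.monitoring_interval)
```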

**7. `DefaultFeatureDetectorFactory.create`**
*   **Inputs:**
    *   `self` (DefaultFeatureDetectorFactory): The instance of the class.
    *   A series of keyword arguments corresponding to the parameters of `cv2.ORB_create` (e.g., `nfeatures`, `scaleFactor`).
    *   `optimize_parameters` (bool): Flag to trigger automatic parameter optimization.
    *   `target_performance` (Optional[str]): A hint for the optimizer (e.g., 'speed', 'accuracy').
*   **Processes:**
    1.  It first validates the provided parameters against the mathematical constraints defined in `self.parameter_constraints`.
    2.  If `optimize_parameters` is true, it invokes its `ParameterOptimizer` instance. The optimizer uses the `_evaluate_detector_performance` method as its objective function to find the best parameter set, which then overwrites the initial inputs.
    3.  It generates a unique `cache_key` from the final parameter configuration.
    4.  It checks a thread-safe cache (`_detector_cache`) for a pre-existing `cv2.ORB` instance with the same configuration. If found, it returns the cached object.
    5.  If not in the cache, it performs final mathematical validation on critical parameters (e.g., `scaleFactor > 1.0`).
    6.  It calls `cv2.ORB_create` with the final parameters to instantiate the detector.
    7.  If the cache is full, it implements an LRU (Least Recently Used) policy to evict an old detector before adding the new one.
    8.  The newly created detector is stored in the cache.
*   **Outputs:**
    *   A `cv2.ORB` object, which is a feature detector instance from the OpenCV library, configured with either the provided or optimized parameters.
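
The cache lookup and LRU eviction steps can be sketched independently of OpenCV. In the sketch below, `builder` is a hypothetical stand-in for `cv2.ORB_create`, and the cache key is assumed to be the sorted tuple of keyword arguments, which requires hashable parameter values. The actual key derivation in the notebook may differ.

```python
import threading
from collections import OrderedDict
from typing import Any, Callable


class CachingFactorySketch:
    def __init__(self, builder: Callable[..., Any], max_cache_size: int = 8) -> None:
        self._builder = builder
        self._max_cache_size = max_cache_size
        self._cache: "OrderedDict[tuple, Any]" = OrderedDict()
        self._lock = threading.Lock()

    def create(self, **params: Any) -> Any:
        # Sorting makes the key independent of keyword-argument order.
        cache_key = tuple(sorted(params.items()))
        with self._lock:
            if cache_key in self._cache:
                self._cache.move_to_end(cache_key)  # mark as most recently used
                return self._cache[cache_key]
            instance = self._builder(**params)
            if len(self._cache) >= self._max_cache_size:
                self._cache.popitem(last=False)  # evict the least recently used entry
            self._cache[cache_key] = instance
            return instance
```

Holding the lock while building keeps the sketch simple. A production version might release it during construction if building a detector is slow.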

**8. `DefaultMatcherFactory.create`**
*   **Inputs:**
    *   `self` (DefaultMatcherFactory): The instance of the class.
    *   `normType` (int): The distance norm to use (e.g., `cv2.NORM_HAMMING`).
    *   `crossCheck` (bool): A flag to enable/disable cross-checking for more reliable matches.
    *   `optimize_for_throughput` (bool): A flag to adjust settings for speed.
*   **Processes:**
    1.  It validates that the `normType` is a valid OpenCV norm constant.
    2.  If `optimize_for_throughput` is true, it overrides `crossCheck` to `False`, as this is a primary trade-off between speed and precision.
    3.  It calls `cv2.BFMatcher` with the specified `normType` and `crossCheck` to instantiate the matcher.
    4.  If profiling is enabled, it records the creation time and memory usage and appends this data to its `performance_history`.
*   **Outputs:**
    *   A `cv2.BFMatcher` object, which is a brute-force matcher instance from the OpenCV library.

**9. `DefaultClipModelLoader.__call__`**
*   **Inputs:**
    *   `self` (DefaultClipModelLoader): The instance of the class.
    *   `model_name` (str): The name of the CLIP model to load (e.g., "ViT-B/32").
    *   `device` (str): The target device for the model (e.g., "cuda").
    *   `enable_optimization` (bool): Flag to apply post-loading optimizations.
    *   `precision` (str): The desired numerical precision (e.g., "float16").
*   **Processes:**
    1.  It validates the `model_name` against a dictionary of supported models and the `device` string against available hardware.
    2.  It generates a unique `cache_key` based on the configuration.
    3.  It checks an in-memory cache (`_model_cache`) for a pre-loaded model. If found, it's returned immediately.
    4.  If not cached, it calls the underlying `clip.load` function to download or load the model from a file system cache.
    5.  If `enable_optimization` is true, it calls `_optimize_model`, which may apply `torch.compile` or other backend optimizations.
    6.  It converts the model to the specified `precision` (e.g., `.half()` for float16).
    7.  It records detailed performance metrics of the loading process (time, memory usage) in its `loading_history`.
    8.  The loaded model and its associated preprocessing function are stored in the in-memory cache.
*   **Outputs:**
    *   A `Tuple[torch.nn.Module, Callable]`: The loaded and configured PyTorch model and the corresponding function required to preprocess images for it.

---

### **Result Data Structures**

This cell defines the data classes used to structure the outputs of the various similarity methods, incorporating validation, serialization, and comparison capabilities through mixins.

**1. `StatisticalProperties.__post_init__`**
*   **Inputs:**
    *   `self` (StatisticalProperties): The instance of the class, just after its fields have been populated by the `__init__` method.
*   **Processes:**
    1.  This special dataclass method is called automatically after initialization.
    2.  It performs a series of mathematical consistency checks on the statistical fields.
    3.  It validates that `sample_size` is positive and `variance` is non-negative.
    4.  It verifies that `standard_deviation` is approximately the square root of `variance`.
    5.  It checks that `standard_error` is consistent with the formula `σ/√n`.
    6.  It ensures `confidence_level` and `p_value` are within their valid probability ranges, [0, 1].
*   **Outputs:**
    *   None. It raises a `ValueError` if any of the mathematical constraints are violated, preventing the creation of a statistically inconsistent object.
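
The consistency checks listed above map directly onto a dataclass `__post_init__`. The sketch below uses a reduced, hypothetical set of fields and tolerances; the real `StatisticalProperties` class likely carries more fields.

```python
import math
from dataclasses import dataclass


@dataclass
class StatsSketch:
    sample_size: int
    variance: float
    standard_deviation: float
    standard_error: float
    confidence_level: float = 0.95
    p_value: float = 1.0

    def __post_init__(self) -> None:
        if self.sample_size <= 0:
            raise ValueError("sample_size must be positive")
        if self.variance < 0:
            raise ValueError("variance must be non-negative")
        # The standard deviation must agree with sqrt(variance).
        if not math.isclose(
            self.standard_deviation, math.sqrt(self.variance), rel_tol=1e-6, abs_tol=1e-9
        ):
            raise ValueError("standard_deviation is inconsistent with variance")
        # The standard error must agree with sigma / sqrt(n).
        expected_se = self.standard_deviation / math.sqrt(self.sample_size)
        if not math.isclose(self.standard_error, expected_se, rel_tol=1e-6, abs_tol=1e-9):
            raise ValueError("standard_error is inconsistent with sigma / sqrt(n)")
        for name in ("confidence_level", "p_value"):
            if not 0.0 <= getattr(self, name) <= 1.0:
                raise ValueError(f"{name} must lie in [0, 1]")
```

With these checks, `StatsSketch(sample_size=25, variance=4.0, standard_deviation=2.0, standard_error=0.4)` constructs successfully, while changing `standard_deviation` to `3.0` raises a `ValueError`.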

**2. `ReverseImageSearchResult.__post_init__`**
*   **Inputs:**
    *   `self` (ReverseImageSearchResult): The instance of the class.
*   **Processes:**
    1.  This method validates the integrity of the data returned from a reverse image search.
    2.  It checks that required string fields (`best_guess`, `source_page_title`) are non-empty and within length constraints.
    3.  It validates that optional numerical scores (`site_authority_score`, `confidence_score`) are within the range [0, 1].
    4.  It iterates through the `similar_image_urls` list, ensuring each entry is a validly formatted URL string.
    5.  It performs logical cross-field validation, such as ensuring `duplicate_url_count` does not exceed the total number of URLs.
*   **Outputs:**
    *   None. It raises a `ValueError` if any constraints are violated.

**3. `FeatureMatchResult.__post_init__`**
*   **Inputs:**
    *   `self` (FeatureMatchResult): The instance of the class.
*   **Processes:**
    1.  This method validates the integrity of the data from a feature matching operation.
    2.  It ensures all numerical scores and ratios (`similarity_ratio`, `confidence_level`, `homography_inlier_ratio`) are within their expected mathematical range of [0, 1].
    3.  It validates that all count fields (`total_matches`, `good_matches`, etc.) are non-negative.
    4.  It performs logical cross-field validation, such as ensuring `good_matches` is not greater than `total_matches`.
    5.  It validates the `normalization_strategy` string against the set of allowed values.
*   **Outputs:**
    *   None. It raises a `ValueError` if any constraints are violated.

**4. `SerializationMixin.to_dict`**
*   **Inputs:**
    *   `self` (Any class instance using this mixin).
    *   `include_metadata` (bool): Flag to include serialization metadata.
    *   `flatten_nested` (bool): Flag to flatten the dictionary structure.
*   **Processes:**
    1.  It uses the `dataclasses.asdict` function to recursively convert the dataclass instance into a dictionary.
    2.  If `include_metadata` is true, it computes a hash of the object's content and adds a `_metadata` key to the dictionary with the class name, timestamp, version, and hash.
    3.  If `flatten_nested` is true, it calls `_flatten_dictionary` to transform the nested dictionary into a single-level dictionary with dot-separated keys (e.g., `{'a': {'b': 1}}` becomes `{'a.b': 1}`).
*   **Outputs:**
    *   A `Dict[str, Any]` representing the object.
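
The flattening and content-hash steps can be sketched as two small helpers. Both are assumptions about how the mixin might work: `flatten_dictionary` is a public stand-in for `_flatten_dictionary`, `content_hash` is a hypothetical name, and hashing the sorted JSON with SHA-256 is one possible choice among several.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict


def flatten_dictionary(nested: Dict[str, Any], parent_key: str = "", sep: str = ".") -> Dict[str, Any]:
    """Collapse nested dictionaries into one level with dot-separated keys."""
    flat: Dict[str, Any] = {}
    for key, value in nested.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        if isinstance(value, dict) and value:
            flat.update(flatten_dictionary(value, full_key, sep))
        else:
            flat[full_key] = value  # leaves (and empty dicts) are kept as-is
    return flat


def content_hash(data: Dict[str, Any]) -> str:
    """Hash a dictionary deterministically, independent of key order."""
    payload = json.dumps(data, sort_keys=True, default=str).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


@dataclass
class InnerSketch:  # hypothetical nested result
    b: int


@dataclass
class OuterSketch:  # hypothetical top-level result
    inner: InnerSketch
    name: str


flat = flatten_dictionary(asdict(OuterSketch(InnerSketch(1), "demo")))
```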

**5. `ComparisonMixin.compute_similarity`**
*   **Inputs:**
    *   `self` (Any class instance using this mixin).
    *   `other` (ComparisonMixin): Another result object to compare against.
    *   `method` (str): The name of the similarity metric to use (e.g., "cosine").
*   **Processes:**
    1.  It calls `_extract_numerical_features` on both `self` and `other` to get vectorized representations of the objects.
    2.  It converts these feature lists into NumPy arrays.
    3.  Based on the `method` string, it applies the corresponding mathematical formula:
        *   **cosine:** Computes the dot product and divides by the product of the L2 norms.
        *   **euclidean:** Computes the L2 norm of the vector difference and transforms it into a similarity score via `1 / (1 + distance)`.
        *   **manhattan:** Computes the L1 norm of the vector difference and transforms it similarly.
        *   **jaccard:** Computes the ratio of the intersection to the union of the feature sets.
*   **Outputs:**
    *   A `float` representing the similarity score between the two result objects.
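
The four metrics can be written compactly in pure Python, as in the sketch below. It operates on plain numeric sequences (the output one would expect from `_extract_numerical_features`) and uses a hypothetical function name. Returning 0.0 for a zero-norm vector under cosine and 1.0 for two empty sets under Jaccard are conventions chosen for this sketch.

```python
import math
from typing import Sequence


def compute_similarity_sketch(
    a: Sequence[float], b: Sequence[float], method: str = "cosine"
) -> float:
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    if method == "cosine":
        dot = sum(x * y for x, y in zip(a, b))
        norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norms if norms else 0.0
    if method == "euclidean":
        distance = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
        return 1.0 / (1.0 + distance)  # maps distance 0 to similarity 1
    if method == "manhattan":
        distance = sum(abs(x - y) for x, y in zip(a, b))
        return 1.0 / (1.0 + distance)
    if method == "jaccard":
        set_a, set_b = set(a), set(b)
        union = set_a | set_b
        return len(set_a & set_b) / len(union) if union else 1.0
    raise ValueError(f"unknown similarity method: {method}")
```

Note that the distance-based scores lie in (0, 1], whereas cosine similarity can be negative, so scores from different methods are not directly comparable.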

---

### **Main `ImageSimilarityDetector` Class**

This is the primary orchestrator class. The notebook shows two versions; the final one in Cell 9 is the most complete and will be the focus of this analysis.

**1. `ImageSimilarityDetector.__init__` (Final Version)**
*   **Inputs:**
    *   A series of optional, injectable components (`orb_detector`, `matcher`, `clip_model_loader`).
    *   A series of configuration parameters (`device`, `clip_model_name`, `resource_constraints`, etc.).
*   **Processes:**
    1.  It stores the high-level configuration parameters.
    2.  It instantiates its own `ResourceManager` and starts monitoring if enabled.
    3.  It sets the global validation policy on the result data classes.
    4.  It calls `_determine_optimal_device_with_validation` to select and validate the computational device.
    5.  It calls the factory-based initialization methods (`_initialize_feature_detector_with_factory`, etc.) to either accept the injected dependencies (after protocol validation) or create default, optimized instances.
    6.  It initializes placeholders for the lazily-loaded CLIP model and thread-safe locks.
    7.  It sets up a structured logger with key configuration details.
*   **Outputs:**
    *   A fully configured `ImageSimilarityDetector` instance, with its device selected and its detector and matcher validated, ready for use.
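
The "accept an injected dependency after protocol validation, otherwise build a default" pattern from step 5 might look roughly like this. The names `FeatureDetector`, `DefaultDetector`, and `resolve_feature_detector` are hypothetical stand-ins, not the notebook's own classes.

```python
from typing import Any, Optional, Protocol, Tuple, runtime_checkable


@runtime_checkable
class FeatureDetector(Protocol):
    """Structural interface: anything exposing detectAndCompute qualifies."""

    def detectAndCompute(self, image: Any, mask: Any) -> Tuple[Any, Any]:
        ...


class DefaultDetector:
    """Hypothetical placeholder for a default ORB detector."""

    def detectAndCompute(self, image: Any, mask: Any) -> Tuple[Any, Any]:
        return [], None


def resolve_feature_detector(injected: Optional[Any] = None) -> FeatureDetector:
    """Return the injected detector if it satisfies the protocol, else a default."""
    if injected is None:
        return DefaultDetector()
    # runtime_checkable protocols only verify that the method exists,
    # not that its signature or return type is correct.
    if not isinstance(injected, FeatureDetector):
        raise TypeError(
            f"{type(injected).__name__} does not implement detectAndCompute"
        )
    return injected
```

A structural check like this lets tests inject lightweight fakes without subclassing any OpenCV type.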

**2. `ImageSimilarityDetector._validate_image_path` (Final Version)**
*   **Inputs:**
    *   `image_path` (Union[str, Path]): The path to the image file.
    *   `allow_symlinks` (bool): Flag to control symlink policy.
    *   `perform_content_validation` (bool): Flag to control whether to read the file to verify it's a valid image.
    *   `max_file_size_mb` (float): The maximum permissible file size.
*   **Processes:**
    1.  **Normalization:** Transforms the input into a resolved, absolute `pathlib.Path` object.
    2.  **Existence Check:** Verifies `path.exists()`. If not, it raises a detailed `ImageNotFoundError` that includes a list of similarly named files in the parent directory for diagnostic purposes.
    3.  **Symlink Analysis:** If the path is a symlink, it checks the `allow_symlinks` policy. If disallowed, it raises `SymlinkNotAllowedError` with the symlink resolution chain. If allowed, it verifies the target exists.
    4.  **File Type Check:** Verifies `path.is_file()`. If not, it determines the actual file system object type (e.g., 'directory', 'socket') and raises `NotAFileError` with this information.
    5.  **Permission Check:** Uses `os.access` to verify read permissions. If denied, it raises `PermissionDeniedError` with an analysis of the file's ownership and mode.
    6.  **Size Check:** Checks the file size against `max_file_size_mb` and raises `ImageValidationError` if it's too large.
    7.  **Content Validation:** If `perform_content_validation` is true, it attempts to open the file with `PIL.Image` and run `img.verify()`, which checks the file for corruption without decoding the full pixel data. If this fails, it raises `ImageUnreadableError`.
*   **Outputs:**
    *   A validated `pathlib.Path` object, guaranteed to point to an accessible, non-corrupt image file that meets all policy constraints.
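
A reduced version of this validation chain is sketched below, assuming built-in exceptions in place of the toolkit's custom error classes and omitting the PIL content check. The symlink policy is checked before resolving the path, since `Path.resolve()` follows links; that ordering is a choice made for this sketch.

```python
import os
import tempfile
from pathlib import Path
from typing import Union


def validate_image_path(
    image_path: Union[str, Path],
    allow_symlinks: bool = False,
    max_file_size_mb: float = 100.0,
) -> Path:
    """Run the path checks in order and return the resolved path."""
    raw = Path(image_path).expanduser()
    # Symlink policy first: resolve() would silently follow the link.
    if raw.is_symlink() and not allow_symlinks:
        raise PermissionError(f"symlinks are not allowed: {raw}")
    path = raw.resolve()
    if not path.exists():
        raise FileNotFoundError(f"image not found: {path}")
    if not path.is_file():
        kind = "directory" if path.is_dir() else "special file"
        raise OSError(f"expected a regular file, found a {kind}: {path}")
    if not os.access(path, os.R_OK):
        raise PermissionError(f"no read permission: {path}")
    size_mb = path.stat().st_size / (1024 * 1024)
    if size_mb > max_file_size_mb:
        raise ValueError(f"file is {size_mb:.2f} MB, limit is {max_file_size_mb} MB")
    return path


# Usage: a throwaway file stands in for a real image.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample = Path(tmp_dir) / "sample.png"
    sample.write_bytes(b"placeholder bytes, not a real image")
    validated = validate_image_path(sample)
    oversized_rejected = False
    try:
        validate_image_path(sample, max_file_size_mb=0.0)
    except ValueError:
        oversized_rejected = True
```

Ordering the cheap metadata checks before any file read means most invalid inputs are rejected without opening the file.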

**3. `ImageSimilarityDetector._load_clip_with_comprehensive_monitoring`**
*   **Inputs:**
    *   `self` (ImageSimilarityDetector): The instance of the class.
*   **Processes:**
    1.  **Double-Checked Locking:** It first checks if the model is loaded without acquiring a lock. If not, it acquires `_initialization_lock` and checks again to ensure thread safety and prevent redundant loading.
    2.  **Performance-Monitored Loading:** It records timestamps and memory usage before starting.
    3.  **Fallback Strategy:** It establishes a `device_fallback_chain` (e.g., `['cuda:0', 'cpu']`).
    4.  **Retry Loop:** It enters a nested loop, iterating through retry attempts and then through the device fallback chain.
    5.  **Model Loading:** Inside the loop, it calls the `self.clip_model_loader` to attempt loading the model onto the current device in the chain.
    6.  **Functionality Test:** After a successful load, it performs a "smoke test" by passing a random tensor through the model to ensure it is functional and does not produce null embeddings.
    7.  **Error Handling:** It includes specific `except` blocks for `OSError`, `RuntimeError`, and `torch.cuda.OutOfMemoryError`. On an OOM error, it automatically triggers `torch.cuda.empty_cache()` and attempts to fall back to the CPU.
    8.  **State Update:** On success, it stores the loaded model, its preprocess function, and detailed performance metrics in `self._clip_loading_performance`. It also updates `self.device` if a fallback was used.
    9.  **Comprehensive Failure:** If all retries on all fallback devices fail, it aggregates all the collected error information and raises a single, highly detailed `ModelLoadError`.
*   **Outputs:**
    *   None. This method modifies the state of the `ImageSimilarityDetector` instance by populating `self.clip_model` and `self.clip_preprocess`.
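
The combination of double-checked locking, a retry loop, and a device fallback chain can be illustrated without PyTorch. In the sketch below, `LazyModelHolder`, `fake_loader`, and the simulated failure are hypothetical; the real method also records performance metrics and runs a smoke test, which are left out here.

```python
import threading
from typing import Any, Callable, List, Optional, Sequence


class LazyModelHolder:
    """Loads a model once, trying each device in order on every retry."""

    def __init__(
        self, loader: Callable[[str], Any], devices: Sequence[str], max_retries: int = 2
    ) -> None:
        self._loader = loader
        self._devices = list(devices)
        self._max_retries = max_retries
        self._lock = threading.Lock()
        self.model: Optional[Any] = None
        self.device: Optional[str] = None

    def ensure_loaded(self) -> None:
        if self.model is not None:  # first check, no lock taken
            return
        with self._lock:
            if self.model is not None:  # second check, under the lock
                return
            errors: List[str] = []
            for attempt in range(1, self._max_retries + 1):
                for device in self._devices:
                    try:
                        model = self._loader(device)
                    except (OSError, RuntimeError) as exc:
                        errors.append(f"attempt {attempt} on {device}: {exc}")
                        continue
                    # Publish the device before the model so that readers
                    # who see a model also see a consistent device.
                    self.device = device
                    self.model = model
                    return
            raise RuntimeError("model loading failed: " + "; ".join(errors))


calls: List[str] = []


def fake_loader(device: str) -> str:
    """Hypothetical loader that fails on CUDA to exercise the fallback."""
    calls.append(device)
    if device.startswith("cuda"):
        raise RuntimeError("CUDA out of memory (simulated)")
    return f"model-on-{device}"


holder = LazyModelHolder(fake_loader, ["cuda:0", "cpu"])
holder.ensure_loaded()
holder.ensure_loaded()  # second call returns early without touching the loader
```

After the first call the holder reports `cpu` as its device, mirroring how the described method updates `self.device` when a fallback was used.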

**4. `ImageSimilarityDetector.perceptual_hash_difference` (Final Version)**
*   **Inputs:**
    *   `image1`, `image2`: Image data in various formats.
    *   Configuration parameters (`hash_size`, `normalize`, `return_similarity`).
    *   Analysis flags (`compute_confidence_interval`, `statistical_analysis`, `performance_monitoring`).
*   **Processes:**
    1.  **Validation:** It validates the input parameters (e.g., `hash_size` bounds, logical consistency of `return_similarity` and `normalize`).
    2.  **Performance & Metadata Setup:** It initializes dictionaries to track performance and metadata throughout the execution.
    3.  **Image Loading:** It calls the internal helper `_load_and_preprocess_image_with_validation` for both images. This helper handles path validation, loading, and conversion to a grayscale `PIL.Image` object, returning rich metadata about the process.
    4.  **Hash Computation:** It calls `imagehash.phash` on the processed images. This function internally resizes the image, applies the 2D-DCT, extracts the low-frequency coefficients, and computes the median to generate the binary hash.
    5.  **Distance Calculation:** It computes the Hamming distance by subtracting one hash object from the other; the `imagehash` library overloads the `-` operator to return the number of differing bits.
    6.  **Statistical Analysis (Optional):** If requested, it converts the hashes to binary arrays and computes a `StatisticalProperties` object on the bit-wise differences. It can further compute a Wilson score confidence interval for the proportion of differing bits.
    7.  **Normalization/Similarity (Optional):** Based on the input flags, it transforms the raw Hamming distance into a normalized distance (`d_H / N²`, where `N` is `hash_size` and `N²` is the number of hash bits) or a similarity score (`1 - d_norm`).
    8.  **Performance Recording:** It calculates total execution time and memory delta, and if monitoring is enabled, appends the collected `operation_metrics` to the instance's history.
*   **Outputs:**
    *   The output type depends on the input flags. It can be an `int` (raw Hamming distance), a `float` (normalized distance or similarity), or a `Dict[str, Any]` containing a comprehensive breakdown of the results, including all statistical and performance metrics.
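
The arithmetic in steps 6 and 7 is compact enough to show directly. The helper names below are hypothetical, and the 95% level (`z = 1.96`) is an assumed default. The interval is the standard Wilson score interval for a binomial proportion, applied to the fraction of differing hash bits.

```python
import math
from typing import Tuple


def normalized_hamming(distance: int, hash_size: int) -> float:
    """Divide the raw Hamming distance by the number of hash bits (N squared)."""
    total_bits = hash_size * hash_size
    if not 0 <= distance <= total_bits:
        raise ValueError("distance must lie between 0 and hash_size**2")
    return distance / total_bits


def wilson_interval(
    differing_bits: int, total_bits: int, z: float = 1.96
) -> Tuple[float, float]:
    """Wilson score interval for the proportion of differing bits."""
    if total_bits <= 0:
        raise ValueError("total_bits must be positive")
    p = differing_bits / total_bits
    denom = 1.0 + z * z / total_bits
    centre = (p + z * z / (2.0 * total_bits)) / denom
    half_width = (
        z * math.sqrt(p * (1.0 - p) / total_bits + z * z / (4.0 * total_bits**2))
    ) / denom
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# A 64-bit hash (hash_size = 8) with 16 differing bits.
d_norm = normalized_hamming(16, 8)
similarity = 1.0 - d_norm
lower, upper = wilson_interval(16, 64)
```

With only 64 bits the interval is wide, which is a useful reminder that a single perceptual hash comparison carries limited statistical weight.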

**5. `ImageSimilarityDetector.feature_match_ratio` (Final Version)**
*   **Inputs:**
    *   `image1`, `image2`: Image data.
    *   Configuration parameters for matching (`distance_threshold`, `normalization_strategy`, `apply_ratio_test`, etc.).
    *   Analysis flags (`return_detailed_result`, `geometric_verification`, etc.).
*   **Processes:**
    1.  **Validation & Setup:** It validates all input parameters and sets up a dictionary for performance tracking.
    2.  **Image Preparation:** It calls `_load_and_prepare_image_for_features` for both images, which handles loading, optional resizing, and conversion to a grayscale NumPy array suitable for OpenCV.
    3.  **Feature Detection:** It calls `self.orb_detector.detectAndCompute` on both grayscale images. This transforms the pixel data into two sets of keypoints and their corresponding 256-bit ORB descriptors.
    4.  **Matching:**
        *   If `apply_ratio_test` is true, it uses `self.bf_matcher.knnMatch` with `k=2` to find the two nearest neighbors for each descriptor. It then filters these matches based on Lowe's ratio test (`d1/d2 < ratio_threshold`).
        *   Otherwise, it uses `self.bf_matcher.match` and filters the results by keeping only those whose Hamming distance is below `distance_threshold`.
    5.  **Similarity Ratio Calculation:** It computes the final ratio by dividing the number of `good_matches` by a denominator determined by the `normalization_strategy` (`total_matches` or `min(keypoints1, keypoints2)`).
    6.  **Geometric Verification (Optional):** If requested and there are enough good matches (>=4), it extracts the coordinates of the matched keypoints. It then uses `cv2.findHomography` with RANSAC to find a perspective transform between the images. The number of inlier matches (those that fit the geometric model) is counted.
    7.  **Statistical Analysis (Optional):** If requested, it creates a `StatisticalProperties` object from the list of Hamming distances of the good matches, providing mean, variance, etc.
    8.  **Result Aggregation:** It aggregates all computed metrics (similarity, counts, timings, geometric verification results, and statistical properties) into a `FeatureMatchResult` dataclass instance.
*   **Outputs:**
    *   If `return_detailed_result` is true, it returns the populated `FeatureMatchResult` object. Otherwise, it returns only the calculated `similarity_ratio` as a `float`.
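
The filtering and normalization logic from steps 4 and 5 can be isolated from OpenCV. The sketch below works on plain distance pairs, as if they had been read out of `knnMatch` results. The function names, the `0.75` default threshold, and the strategy strings are assumptions for illustration.

```python
from typing import List, Sequence, Tuple


def ratio_test(
    knn_distances: Sequence[Tuple[float, float]], ratio_threshold: float = 0.75
) -> List[float]:
    """Keep the best-match distance when it clearly beats the runner-up."""
    good: List[float] = []
    for best, second in knn_distances:
        # Equivalent to best / second < ratio_threshold, without dividing by zero.
        if best < ratio_threshold * second:
            good.append(best)
    return good


def similarity_ratio(
    good_matches: int,
    total_matches: int,
    keypoints1: int,
    keypoints2: int,
    normalization_strategy: str = "total_matches",
) -> float:
    """Divide the good-match count by the denominator the strategy selects."""
    if normalization_strategy == "total_matches":
        denominator = total_matches
    elif normalization_strategy == "min_keypoints":
        denominator = min(keypoints1, keypoints2)
    else:
        raise ValueError(f"unknown strategy: {normalization_strategy!r}")
    return good_matches / denominator if denominator else 0.0


pairs = [(10.0, 40.0), (30.0, 35.0), (20.0, 50.0), (0.0, 0.0)]
good = ratio_test(pairs)
ratio_total = similarity_ratio(len(good), len(pairs), 8, 5)
ratio_min_kp = similarity_ratio(len(good), len(pairs), 8, 5, "min_keypoints")
```

The second pair is rejected because its two candidates are nearly equidistant, which is exactly the ambiguity the ratio test is meant to filter out.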

**6. `ImageSimilarityDetector.histogram_correlation` (Final Version)**
*   **Inputs:**
    *   `image1`, `image2`: Image data.
    *   Configuration parameters (`bins`, `metric`, `color_space`, `adaptive_binning`, etc.).
    *   Analysis flags (`statistical_analysis`, `performance_monitoring`, `entropy_analysis`).
*   **Processes:**
    1.  **Validation & Setup:** It validates all input parameters and sets up performance tracking.
    2.  **Image Preparation:** It calls `_load_and_prepare_image_for_histogram`, which loads, resizes, and handles optional masks for both images.
    3.  **Color Space Conversion:** It transforms the images from the default BGR color space to the specified `color_space` (HSV, RGB, or LAB) using `cv2.cvtColor`.
    4.  **Adaptive Binning (Optional):** If enabled, it analyzes the pixel distribution of each image channel to determine an optimal number of bins using the Freedman-Diaconis rule, overriding the static `bins` parameter.
    5.  **Histogram Calculation:** It calls `cv2.calcHist` for both images to compute their multi-dimensional color histograms based on the selected channels, bin counts, and value ranges.
    6.  **Normalization:** It normalizes both histograms using `cv2.normalize` with `NORM_L1`, transforming them into probability distributions where the sum of bins is 1.
    7.  **Entropy Analysis (Optional):** If requested, it computes the Shannon entropy of each histogram and the mutual information between them, providing an information-theoretic measure of similarity.
    8.  **Comparison:** It calls `cv2.compareHist` with the two normalized histograms and the chosen `metric` (e.g., `cv2.HISTCMP_CORREL`). This single call performs the core statistical comparison.
    9.  **Statistical Analysis (Optional):** If requested, it computes a `StatisticalProperties` object on the element-wise difference between the two normalized histograms and can perform a bootstrap analysis to estimate a confidence interval for the comparison metric.
*   **Outputs:**
    *   If `statistical_analysis` is true, it returns a `Dict[str, Any]` containing the comparison score and all the computed analytical and performance data. Otherwise, it returns the raw comparison score as a `float`.
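
To make steps 6 through 8 concrete, the sketch below reimplements L1 normalization, Shannon entropy, and the correlation metric on flat Python lists. It is not a substitute for `cv2.normalize` or `cv2.compareHist`. The correlation follows the usual Pearson-style definition behind `HISTCMP_CORREL`, and returning `0.0` for a constant histogram is an assumption of this sketch.

```python
import math
from typing import List, Sequence


def l1_normalize(hist: Sequence[float]) -> List[float]:
    """Scale bin counts so they sum to 1, giving a probability distribution."""
    total = float(sum(hist))
    if total == 0.0:
        raise ValueError("cannot normalize an empty histogram")
    return [value / total for value in hist]


def shannon_entropy(probabilities: Sequence[float]) -> float:
    """Entropy in bits; empty bins contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0.0)


def correlation(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Pearson-style correlation between two histograms of equal length."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    mean1 = sum(h1) / len(h1)
    mean2 = sum(h2) / len(h2)
    numerator = sum((a - mean1) * (b - mean2) for a, b in zip(h1, h2))
    denominator = math.sqrt(
        sum((a - mean1) ** 2 for a in h1) * sum((b - mean2) ** 2 for b in h2)
    )
    # Assumption: treat the undefined constant-histogram case as 0.0.
    return numerator / denominator if denominator else 0.0


hist_a = l1_normalize([2, 2, 4])
hist_b = l1_normalize([4, 2, 2])
entropy_a = shannon_entropy(hist_a)
score_same = correlation(hist_a, hist_a)
score_diff = correlation(hist_a, hist_b)
```

Because correlation subtracts each histogram's mean, it is insensitive to a uniform brightness offset across bins but sensitive to where the mass is concentrated.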

**7. `ImageSimilarityDetector.clip_embedding_similarity` (Final Version)**
*   **Inputs:**
    *   `image1`, `image2`: Image data.
    *   Configuration parameters (`use_mixed_precision`, `batch_processing`, `device_optimization`).
    *   Analysis flags (`statistical_analysis`, `performance_monitoring`, `embedding_analysis`).
*   **Processes:**
    1.  **Model Loading:** It ensures the CLIP model is loaded by calling `_load_clip_with_comprehensive_monitoring`.
    2.  **Input Preparation:** It calls `_prepare_input_tensor_with_validation` for both images. This helper handles any input format (path, PIL, numpy, tensor), applies the necessary CLIP preprocessing, and moves the resulting tensor to the correct device.
    3.  **Inference:** It passes the tensors through `self.clip_model.encode_image`. This is the core deep learning step. It can run in a batch for efficiency and can use automatic mixed precision (`torch.cuda.amp.autocast`) to speed up computation on compatible GPUs. It includes a specific fallback to CPU if a `torch.cuda.OutOfMemoryError` occurs.
    4.  **Embedding Analysis (Optional):** If requested, it computes statistics (norm, mean, std) on the raw embedding vectors before normalization.
    5.  **Normalization:** It L2-normalizes both embedding vectors, projecting them onto the unit hypersphere. This is a mathematical prerequisite for using the dot product to calculate cosine similarity.
    6.  **Similarity Calculation:** It computes the cosine similarity by performing a matrix multiplication (`torch.matmul`) of one normalized embedding with the transpose of the other.
    7.  **Statistical Analysis (Optional):** If requested, it calculates related metrics like angular distance and Euclidean distance in the normalized space. It also uses the Fisher z-transformation to compute a confidence interval for the cosine similarity score.
*   **Outputs:**
    *   If analysis is requested, it returns a `Dict[str, Any]` containing the cosine similarity and a rich set of embedding, statistical, and performance metrics. Otherwise, it returns the cosine similarity score as a `float`.
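
Steps 5 through 7 rest on a small amount of vector arithmetic, sketched here on plain lists in place of CLIP tensors. The function names are hypothetical. The Euclidean distance uses the identity `||a - b|| = sqrt(2 - 2 cos)`, which holds for unit vectors.

```python
import math
from typing import Dict, List, Sequence


def l2_normalize(vector: Sequence[float]) -> List[float]:
    """Project a vector onto the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of the two L2-normalized embeddings."""
    unit_a, unit_b = l2_normalize(a), l2_normalize(b)
    cosine = sum(x * y for x, y in zip(unit_a, unit_b))
    # Clamp away floating-point drift so acos() below stays in its domain.
    return max(-1.0, min(1.0, cosine))


def related_metrics(cosine: float) -> Dict[str, float]:
    """Angular and Euclidean distance between the unit vectors."""
    return {
        "angular_distance_rad": math.acos(cosine),
        "euclidean_distance": math.sqrt(max(0.0, 2.0 - 2.0 * cosine)),
    }


embedding_1 = [3.0, 4.0]
embedding_2 = [4.0, 3.0]
cos_sim = cosine_similarity(embedding_1, embedding_2)
metrics = related_metrics(cos_sim)
```

Normalizing first is what lets a single dot product (or `torch.matmul` for batches) stand in for the full cosine formula.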

**8. `ImageSimilarityDetector.reverse_image_search_google` (Final Version)**
*   **Inputs:**
    *   `image_path`, `driver_path`: Paths to the image and the WebDriver.
    *   Configuration parameters (`timeout`, `headless`, `max_similar_urls`, `retry_attempts`).
    *   Analysis flags (`performance_monitoring`, `content_analysis`, `advanced_extraction`).
*   **Processes:**
    1.  **Validation & Setup:** It validates the image and driver paths and configures `webdriver.ChromeOptions` with a suite of arguments designed for stable, undetectable automation. It also sets up performance tracking.
    2.  **Driver Initialization:** It initializes the Selenium WebDriver, wrapping it in a retry loop to handle transient launch failures.
    3.  **Navigation:** It navigates to `images.google.com`, again using a retry loop.
    4.  **UI Interaction:** This is the most complex part. It uses a predefined list of CSS and XPath selectors for each UI element (`camera_button`, `upload_tab`, `file_input`). It iterates through this list, attempting to find and interact with the element until one selector succeeds. This provides robustness against changes in Google's front-end code. It also uses multiple click strategies (standard, JavaScript, ActionChains) for reliability.
    5.  **File Upload:** It uses the `send_keys` method on the located file input element to upload the image.
    6.  **Data Extraction:** After the results page loads, it uses a similar multi-selector strategy to extract the "best guess" text and a list of URLs for visually similar images.
    7.  **Content Analysis (Optional):** If requested, it analyzes the quality of the extracted text and the distribution and quality of the extracted URLs to compute a confidence score.
    8.  **Result Aggregation:** It populates a `ReverseImageSearchResult` dataclass with all the extracted and analyzed data.
    9.  **Cleanup:** Crucially, it uses a `finally` block to ensure `driver.quit()` is always called, closing the browser and freeing up system resources, even if the automation fails.
*   **Outputs:**
    *   A `ReverseImageSearchResult` object containing the structured data extracted from the reverse image search, including the best guess, similar URLs, and analytical scores.
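
The multi-selector fallback from step 4 and the guaranteed cleanup from step 9 are independent of Selenium and can be sketched with a stand-in driver. `FakeDriver`, `find_with_fallback`, and `run_search` are hypothetical, and `LookupError` stands in for Selenium's element-not-found exception.

```python
from typing import Any, Callable, Dict, List, Sequence


def find_with_fallback(find: Callable[[str], Any], selectors: Sequence[str]) -> Any:
    """Try each selector in order and return the first element found."""
    errors: List[str] = []
    for selector in selectors:
        try:
            return find(selector)
        except LookupError as exc:
            errors.append(f"{selector}: {exc}")
    raise LookupError("no selector matched: " + "; ".join(errors))


class FakeDriver:
    """Hypothetical stand-in for a WebDriver, used only for this sketch."""

    def __init__(self, known_elements: Dict[str, str]) -> None:
        self.known_elements = known_elements
        self.closed = False

    def find(self, selector: str) -> str:
        if selector not in self.known_elements:
            raise LookupError("element not found")
        return self.known_elements[selector]

    def quit(self) -> None:
        self.closed = True


def run_search(driver: FakeDriver, selectors: Sequence[str]) -> Any:
    """Locate an element, always closing the driver afterwards."""
    try:
        return find_with_fallback(driver.find, selectors)
    finally:
        # Runs on success and on failure, so the browser is never leaked.
        driver.quit()


driver = FakeDriver({"input[type=file]": "file-input"})
element = run_search(driver, ["#outdated-upload-id", "input[type=file]"])
```

Listing selectors from most to least specific keeps the common case fast while still tolerating front-end changes.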

This concludes the granular analysis of the callables within the provided notebook.

# Custom Error Classes and Protocol Classes

class ErrorSeverity(Enum):
    """Enumeration of error severity levels for systematic classification."""
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"


class ErrorContext(Protocol):
    """Protocol defining interface for error context serialization."""

    def to_dict(self) -> Dict[str, Any]:
        """Serialize error context to dictionary for forensic analysis."""
        ...

    def to_json(self) -> str:
        """Serialize error context to JSON for logging and debugging."""
        ...


class ForensicMetadata:
    """
    Comprehensive forensic metadata container for debugging complex pipelines.

    Captures execution context, system state, and algorithmic parameters
    at the point of error occurrence for systematic analysis.
    """

    def __init__(
        self,
        operation_name: str,
        thread_id: Optional[int] = None,
        process_id: Optional[int] = None,
        timestamp: Optional[datetime.datetime] = None,
        system_info: Optional[Dict[str, Any]] = None,
        algorithm_parameters: Optional[Dict[str, Any]] = None,
        memory_usage: Optional[Dict[str, float]] = None,
        call_stack: Optional[List[str]] = None
    ) -> None:
        """
        Initialize forensic metadata with comprehensive execution context.

        Args:
            operation_name: Name of operation that generated the error
            thread_id: Thread identifier for concurrent execution debugging
            process_id: Process identifier for multi-process debugging
            timestamp: Precise timestamp of error occurrence
            system_info: System resource state at error time
            algorithm_parameters: Algorithm-specific parameters at failure
            memory_usage: Memory consumption metrics at failure point
            call_stack: Complete call stack trace for debugging
        """
        # Store primary operation identifier for error classification
        self.operation_name: str = operation_name

        # Capture execution context identifiers for concurrent debugging
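        self.thread_id: int = thread_id if thread_id is not None else threading.get_ident()
        # os.getpid() returns the current process ID (requires `import os`)
        self.process_id: int = process_id if process_id is not None else os.getpid()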

        # Record precise timestamp for temporal correlation analysis
        # Timezone-aware UTC; datetime.utcnow() is deprecated as of Python 3.12
        self.timestamp: datetime.datetime = timestamp or datetime.datetime.now(datetime.timezone.utc)

        # Store system resource state for resource-related error analysis
        self.system_info: Dict[str, Any] = system_info or {}

        # Preserve algorithm parameters for mathematical error analysis
        self.algorithm_parameters: Dict[str, Any] = algorithm_parameters or {}

        # Capture memory metrics for resource exhaustion debugging
        self.memory_usage: Dict[str, float] = memory_usage or {}

        # Store complete call stack for execution path reconstruction
        self.call_stack: List[str] = call_stack or traceback.format_stack()

    def to_dict(self) -> Dict[str, Any]:
        """
        Serialize forensic metadata to dictionary for structured analysis.

        Returns:
            Dictionary containing all forensic metadata fields
        """
        # Convert timestamp to ISO format for JSON compatibility
        return {
            "operation_name": self.operation_name,
            "thread_id": self.thread_id,
            "process_id": self.process_id,
            "timestamp": self.timestamp.isoformat(),
            "system_info": self.system_info,
            "algorithm_parameters": self.algorithm_parameters,
            "memory_usage": self.memory_usage,
            "call_stack": self.call_stack
        }

    def to_json(self) -> str:
        """
        Serialize forensic metadata to JSON for logging systems.

        Returns:
            JSON string representation of forensic metadata
        """
        # Convert to dictionary then serialize with indentation for readability
        return json.dumps(self.to_dict(), indent=2, default=str)


class ImageSimilarityError(Exception):
    """
    Base exception class for all image similarity detection errors.

    Implements comprehensive error context capture, forensic metadata collection,
    and structured error reporting for production debugging and analysis.
    """

    def __init__(
        self,
        message: str,
        severity: ErrorSeverity = ErrorSeverity.MEDIUM,
        operation_name: Optional[str] = None,
        error_code: Optional[str] = None,
        forensic_metadata: Optional[ForensicMetadata] = None,
        algorithm_context: Optional[Dict[str, Any]] = None,
        original_exception: Optional[Exception] = None,
        suggested_remediation: Optional[str] = None
    ) -> None:
        """
        Initialize base image similarity error with comprehensive context.

        Args:
            message: Human-readable error description
            severity: Error severity level for prioritization
            operation_name: Name of operation that generated error
            error_code: Unique error code for systematic classification
            forensic_metadata: Comprehensive debugging metadata
            algorithm_context: Algorithm-specific context and parameters
            original_exception: Original exception if this is a wrapper
            suggested_remediation: Actionable remediation suggestions
        """
        # Validate message parameter to prevent None values
        if message is None or not isinstance(message, str):
            message = "Unknown image similarity error occurred"

        # Initialize base Exception class with sanitized message
        super().__init__(message)

        # Store primary error message with null safety
        self.message: str = message

        # Classify error severity for systematic handling
        self.severity: ErrorSeverity = severity

        # Store operation context for debugging pipeline reconstruction
        self.operation_name: str = operation_name or "unknown_operation"

        # Assign unique error code for systematic error tracking
        self.error_code: str = error_code or f"{self.__class__.__name__}_{id(self)}"

        # Capture or create forensic metadata for debugging
        self.forensic_metadata: ForensicMetadata = forensic_metadata or ForensicMetadata(
            operation_name=self.operation_name
        )

        # Store algorithm-specific context for mathematical error analysis
        self.algorithm_context: Dict[str, Any] = algorithm_context or {}

        # Preserve original exception for exception chaining analysis
        self.original_exception: Optional[Exception] = original_exception

        # Store actionable remediation guidance for error recovery
        self.suggested_remediation: str = suggested_remediation or "Contact system administrator"

        # Record precise error occurrence time for temporal analysis
        # Timezone-aware UTC; datetime.utcnow() is deprecated as of Python 3.12
        self.timestamp: datetime.datetime = datetime.datetime.now(datetime.timezone.utc)

    def __str__(self) -> str:
        """
        Generate human-readable string representation for error display.

        Returns:
            Formatted error message with context and remediation
        """
        # Construct comprehensive error description with context
        error_description = f"[{self.severity.value.upper()}] {self.message}"

        # Add operation context if available for debugging
        if self.operation_name != "unknown_operation":
            error_description += f" (Operation: {self.operation_name})"

        # Include error code for systematic tracking
        error_description += f" [Code: {self.error_code}]"

        # Append remediation guidance for actionable response
        if self.suggested_remediation:
            error_description += f" | Suggested fix: {self.suggested_remediation}"

        return error_description

    def __repr__(self) -> str:
        """
        Generate detailed string representation for debugging and logging.

        Returns:
            Detailed error representation with all context information
        """
        # Create comprehensive representation for debugging purposes
        return (
            f"{self.__class__.__name__}("
            f"message='{self.message}', "
            f"severity={self.severity}, "
            f"operation_name='{self.operation_name}', "
            f"error_code='{self.error_code}', "
            f"timestamp='{self.timestamp.isoformat()}', "
            f"has_forensic_metadata={self.forensic_metadata is not None}, "
            f"has_algorithm_context={bool(self.algorithm_context)}, "
            f"has_original_exception={self.original_exception is not None}"
            f")"
        )

    def to_dict(self) -> Dict[str, Any]:
        """
        Serialize exception to dictionary for structured error analysis.

        Returns:
            Dictionary containing all error information and context
        """
        # Construct comprehensive error dictionary for serialization
        error_dict = {
            "error_class": self.__class__.__name__,
            "message": self.message,
            "severity": self.severity.value,
            "operation_name": self.operation_name,
            "error_code": self.error_code,
            "timestamp": self.timestamp.isoformat(),
            "algorithm_context": self.algorithm_context,
            "suggested_remediation": self.suggested_remediation
        }

        # Include forensic metadata if available
        if self.forensic_metadata:
            error_dict["forensic_metadata"] = self.forensic_metadata.to_dict()

        # Include original exception details if present
        if self.original_exception:
            error_dict["original_exception"] = {
                "type": type(self.original_exception).__name__,
                "message": str(self.original_exception),
                "args": self.original_exception.args
            }

        return error_dict

    def to_json(self) -> str:
        """
        Serialize exception to JSON for logging and external systems.

        Returns:
            JSON string representation of complete error context
        """
        # Convert to dictionary then serialize with proper formatting
        return json.dumps(self.to_dict(), indent=2, default=str)

    def log_error(self, logger: Optional[logging.Logger] = None) -> None:
        """
        Log error using structured logging for systematic analysis.

        Args:
            logger: Optional logger instance, creates default if None
        """
        # Create or use provided logger for error recording
        error_logger = logger or logging.getLogger(f"{__name__}.{self.__class__.__name__}")

        # Log error with appropriate severity level mapping
        log_level = {
            ErrorSeverity.LOW: logging.INFO,
            ErrorSeverity.MEDIUM: logging.WARNING,
            ErrorSeverity.HIGH: logging.ERROR,
            ErrorSeverity.CRITICAL: logging.CRITICAL
        }.get(self.severity, logging.ERROR)

        # Record structured error information for analysis
        error_logger.log(
            log_level,
            "Image similarity error occurred",
            extra={
                "error_data": self.to_dict(),
                "error_class": self.__class__.__name__,
                "operation_name": self.operation_name,
                "error_code": self.error_code
            }
        )


class ImageValidationError(ImageSimilarityError):
    """
    Base exception for image validation failures with file system context.

    Extends base error with file-specific forensic metadata including
    path information, file attributes, and validation step details.
    """

    def __init__(
        self,
        message: str,
        file_path: Optional[Union[str, Path]] = None,
        validation_step: Optional[str] = None,
        file_attributes: Optional[Dict[str, Any]] = None,
        **kwargs
    ) -> None:
        """
        Initialize image validation error with file system context.

        Args:
            message: Descriptive error message
            file_path: Path to file that failed validation
            validation_step: Specific validation step that failed
            file_attributes: File system attributes at validation time
            **kwargs: Additional arguments passed to base class
        """
        # Store file path with null safety and path normalization
        self.file_path: Optional[Path] = Path(file_path) if file_path else None

        # Record specific validation step for debugging workflow
        self.validation_step: str = validation_step or "unknown_validation"

        # Capture file attributes for forensic analysis
        self.file_attributes: Dict[str, Any] = file_attributes or {}

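        # Enhance algorithm context with file validation information.
        # Copy the caller's dict (tolerating an explicit None) so the
        # caller's object is never mutated by the update below.
        algorithm_context = dict(kwargs.get('algorithm_context') or {})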
        algorithm_context.update({
            "file_path": str(self.file_path) if self.file_path else None,
            "validation_step": self.validation_step,
            "file_attributes": self.file_attributes
        })
        kwargs['algorithm_context'] = algorithm_context

        # Initialize base class with enhanced context
        super().__init__(message, **kwargs)


class ImageNotFoundError(ImageValidationError):
    """
    Raised when specified image file does not exist in filesystem.

    Implements file system state capture and path resolution debugging
    for systematic analysis of file accessibility issues.
    """

    def __init__(
        self,
        message: str,
        file_path: Optional[Union[str, Path]] = None,
        search_paths: Optional[List[Union[str, Path]]] = None,
        **kwargs
    ) -> None:
        """
        Initialize image not found error with path search context.

        Args:
            message: Error description
            file_path: Primary file path that was not found
            search_paths: Additional paths searched for file
            **kwargs: Additional arguments for base class
        """
        # Store attempted search paths for debugging path resolution
        self.search_paths: List[Path] = [
            Path(p) for p in (search_paths or [])
        ]

        # Set default severity to high for missing required files
        kwargs.setdefault('severity', ErrorSeverity.HIGH)

        # Set validation step context for this error type
        kwargs.setdefault('validation_step', 'file_existence_check')

        # Provide specific remediation guidance for file not found
        kwargs.setdefault(
            'suggested_remediation',
            'Verify file path exists and has correct permissions'
        )

        # Initialize validation error with file context
        super().__init__(message, file_path=file_path, **kwargs)


class NotAFileError(ImageValidationError):
    """
    Raised when path exists but is not a regular file.

    Captures file type information and inode details for debugging
    symbolic links, directories, and special files.
    """

    def __init__(
        self,
        message: str,
        file_path: Optional[Union[str, Path]] = None,
        actual_file_type: Optional[str] = None,
        **kwargs
    ) -> None:
        """
        Initialize not-a-file error with file type context.

        Args:
            message: Error description
            file_path: Path that is not a regular file
            actual_file_type: Actual type of file system object
            **kwargs: Additional arguments for base class
        """
        # Record actual file type for debugging file system issues
        self.actual_file_type: str = actual_file_type or "unknown"

        # Set medium severity for file type mismatches
        kwargs.setdefault('severity', ErrorSeverity.MEDIUM)

        # Set validation step context
        kwargs.setdefault('validation_step', 'file_type_verification')

        # Provide specific remediation guidance
        kwargs.setdefault(
            'suggested_remediation',
            f'Path points to {self.actual_file_type}, expected regular file'
        )

        # Initialize validation error with type context
        super().__init__(message, file_path=file_path, **kwargs)


class SymlinkNotAllowedError(ImageValidationError):
    """
    Raised when symlinks are encountered but not permitted by policy.

    Includes symlink resolution chain analysis for security auditing
    and circular reference detection.
    """

    def __init__(
        self,
        message: str,
        file_path: Optional[Union[str, Path]] = None,
        symlink_target: Optional[Union[str, Path]] = None,
        resolution_chain: Optional[List[Union[str, Path]]] = None,
        **kwargs
    ) -> None:
        """
        Initialize symlink policy violation error with resolution context.

        Args:
            message: Error description
            file_path: Symlink path that was rejected
            symlink_target: Final target of symlink resolution
            resolution_chain: Complete symlink resolution chain
            **kwargs: Additional arguments for base class
        """
        # Store symlink target for security analysis
        self.symlink_target: Optional[Path] = Path(symlink_target) if symlink_target else None

        # Store complete resolution chain for circular reference detection
        self.resolution_chain: List[Path] = [
            Path(p) for p in (resolution_chain or [])
        ]

        # Set medium severity for policy violations
        kwargs.setdefault('severity', ErrorSeverity.MEDIUM)

        # Set validation step context
        kwargs.setdefault('validation_step', 'symlink_policy_enforcement')

        # Provide specific remediation guidance
        kwargs.setdefault(
            'suggested_remediation',
            'Enable symlink support or use direct file path'
        )

        # Initialize validation error with symlink context
        super().__init__(message, file_path=file_path, **kwargs)


class ImageUnreadableError(ImageValidationError):
    """
    Raised when image file exists but cannot be read or parsed.

    Captures file corruption indicators, format validation results,
    and binary header analysis for comprehensive debugging.
    """

    def __init__(
        self,
        message: str,
        file_path: Optional[Union[str, Path]] = None,
        corruption_indicators: Optional[Dict[str, Any]] = None,
        format_analysis: Optional[Dict[str, Any]] = None,
        **kwargs
    ) -> None:
        """
        Initialize image unreadable error with corruption analysis.

        Args:
            message: Error description
            file_path: Path to unreadable image file
            corruption_indicators: Detected corruption indicators
            format_analysis: Image format validation results
            **kwargs: Additional arguments for base class
        """
        # Store corruption analysis for debugging file integrity
        self.corruption_indicators: Dict[str, Any] = corruption_indicators or {}

        # Store format analysis for debugging parsing failures
        self.format_analysis: Dict[str, Any] = format_analysis or {}

        # Set high severity for data corruption issues
        kwargs.setdefault('severity', ErrorSeverity.HIGH)

        # Set validation step context
        kwargs.setdefault('validation_step', 'image_readability_check')

        # Provide specific remediation guidance
        kwargs.setdefault(
            'suggested_remediation',
            'Verify file integrity and format compatibility'
        )

        # Initialize validation error with corruption context
        super().__init__(message, file_path=file_path, **kwargs)


class PermissionDeniedError(ImageValidationError):
    """
    Raised when permissions are insufficient to access the image file.

    Captures permission analysis, ownership information, and ACL details
    for systematic permission debugging and security auditing.
    """

    def __init__(
        self,
        message: str,
        file_path: Optional[Union[str, Path]] = None,
        permission_analysis: Optional[Dict[str, Any]] = None,
        required_permissions: Optional[List[str]] = None,
        **kwargs
    ) -> None:
        """
        Initialize permission error with access control context.

        Args:
            message: Error description
            file_path: Path with permission issues
            permission_analysis: Detailed permission analysis
            required_permissions: List of required permissions
            **kwargs: Additional arguments for base class
        """
        # Store permission analysis for access control debugging
        self.permission_analysis: Dict[str, Any] = permission_analysis or {}

        # Store required permissions for remediation guidance
        self.required_permissions: List[str] = required_permissions or ["read"]

        # Set high severity for security-related access issues
        kwargs.setdefault('severity', ErrorSeverity.HIGH)

        # Set validation step context
        kwargs.setdefault('validation_step', 'permission_verification')

        # Provide specific remediation guidance
        kwargs.setdefault(
            'suggested_remediation',
            f'Grant {", ".join(self.required_permissions)} permissions to file'
        )

        # Initialize validation error with permission context
        super().__init__(message, file_path=file_path, **kwargs)


class InitializationError(ImageSimilarityError):
    """
    Base exception for component initialization failures.

    Captures system resource state, dependency information, and
    initialization sequence details for systematic debugging.
    """

    def __init__(
        self,
        message: str,
        component_name: Optional[str] = None,
        initialization_stage: Optional[str] = None,
        system_resources: Optional[Dict[str, Any]] = None,
        dependency_info: Optional[Dict[str, Any]] = None,
        **kwargs
    ) -> None:
        """
        Initialize component initialization error with system context.

        Args:
            message: Error description
            component_name: Name of component that failed to initialize
            initialization_stage: Specific initialization stage that failed
            system_resources: System resource state at failure
            dependency_info: Dependency availability and version information
            **kwargs: Additional arguments for base class
        """
        # Store component identification for debugging
        self.component_name: str = component_name or "unknown_component"

        # Record initialization stage for workflow debugging
        self.initialization_stage: str = initialization_stage or "unknown_stage"

        # Capture system resource state for resource-related failures
        self.system_resources: Dict[str, Any] = system_resources or {}

        # Store dependency information for version compatibility analysis
        self.dependency_info: Dict[str, Any] = dependency_info or {}

        # Set high severity for initialization failures
        kwargs.setdefault('severity', ErrorSeverity.HIGH)

        # Set operation context for initialization
        kwargs.setdefault('operation_name', f'{self.component_name}_initialization')

        # Initialize base class with initialization context
        super().__init__(message, **kwargs)


class OpenCVInitializationError(InitializationError):
    """
    Raised when OpenCV components fail to initialize.

    Captures OpenCV version information, build configuration,
    and hardware compatibility details for debugging.
    """

    def __init__(
        self,
        message: str,
        opencv_version: Optional[str] = None,
        build_info: Optional[Dict[str, Any]] = None,
        hardware_compatibility: Optional[Dict[str, Any]] = None,
        **kwargs
    ) -> None:
        """
        Initialize OpenCV error with library-specific context.

        Args:
            message: Error description
            opencv_version: OpenCV library version
            build_info: OpenCV build configuration details
            hardware_compatibility: Hardware compatibility analysis
            **kwargs: Additional arguments for base class
        """
        # Store OpenCV version for compatibility debugging
        self.opencv_version: Optional[str] = opencv_version

        # Store build configuration for debugging compilation issues
        self.build_info: Dict[str, Any] = build_info or {}

        # Store hardware compatibility analysis
        self.hardware_compatibility: Dict[str, Any] = hardware_compatibility or {}

        # Set component name for OpenCV
        kwargs.setdefault('component_name', 'opencv')

        # Provide OpenCV-specific remediation guidance
        kwargs.setdefault(
            'suggested_remediation',
            'Verify OpenCV installation and hardware compatibility'
        )

        # Initialize base initialization error
        super().__init__(message, **kwargs)


class ResourceAllocationError(InitializationError):
    """
    Raised when system resources cannot be allocated.

    Captures memory usage, CPU utilization, and GPU availability
    for comprehensive resource constraint analysis.
    """

    def __init__(
        self,
        message: str,
        resource_type: Optional[str] = None,
        requested_amount: Optional[Union[int, float]] = None,
        available_amount: Optional[Union[int, float]] = None,
        resource_metrics: Optional[Dict[str, Any]] = None,
        **kwargs
    ) -> None:
        """
        Initialize resource allocation error with usage metrics.

        Args:
            message: Error description
            resource_type: Type of resource that failed allocation
            requested_amount: Amount of resource requested
            available_amount: Amount of resource available
            resource_metrics: Comprehensive resource usage metrics
            **kwargs: Additional arguments for base class
        """
        # Store resource type for allocation debugging
        self.resource_type: str = resource_type or "unknown_resource"

        # Store resource amounts for capacity analysis
        self.requested_amount: Optional[Union[int, float]] = requested_amount
        self.available_amount: Optional[Union[int, float]] = available_amount

        # Store comprehensive resource metrics
        self.resource_metrics: Dict[str, Any] = resource_metrics or {}

        # Set critical severity for resource exhaustion
        kwargs.setdefault('severity', ErrorSeverity.CRITICAL)

        # Set component name for resource allocation
        kwargs.setdefault('component_name', f'{self.resource_type}_allocator')

        # Provide resource-specific remediation guidance
        kwargs.setdefault(
            'suggested_remediation',
            f'Increase available {self.resource_type} or reduce usage'
        )

        # Initialize base initialization error
        super().__init__(message, **kwargs)
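

# --- Illustrative sketch (hypothetical names; not part of the toolkit's API) ---
# The exception classes above share one pattern: each subclass fills in
# defaults with kwargs.setdefault() before delegating to its parent, so a
# value passed explicitly by the caller always takes precedence. The two
# classes below are stand-ins invented for this sketch; they assume nothing
# about how ImageSimilarityError handles its keyword arguments.
class _SketchBaseError(Exception):
    """Hypothetical base class that records a severity label and extra context."""

    def __init__(self, message: str, severity: str = "LOW", **context) -> None:
        super().__init__(message)
        self.severity = severity
        self.context = context


class _SketchAllocationError(_SketchBaseError):
    """Hypothetical subclass that defaults to a critical severity."""

    def __init__(self, message: str, **kwargs) -> None:
        # setdefault() only writes the key when the caller did not supply it
        kwargs.setdefault("severity", "CRITICAL")
        super().__init__(message, **kwargs)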


class ModelLoadError(ImageSimilarityError):
    """
    Raised when machine learning models fail to load.

    Captures model metadata, loading parameters, and hardware requirements
    for systematic model deployment debugging.
    """

    def __init__(
        self,
        message: str,
        model_name: Optional[str] = None,
        model_version: Optional[str] = None,
        loading_parameters: Optional[Dict[str, Any]] = None,
        hardware_requirements: Optional[Dict[str, Any]] = None,
        **kwargs
    ) -> None:
        """
        Initialize model loading error with model-specific context.

        Args:
            message: Error description
            model_name: Name of model that failed to load
            model_version: Version of model that failed to load
            loading_parameters: Parameters used for model loading
            hardware_requirements: Hardware requirements for model
            **kwargs: Additional arguments for base class
        """
        # Store model identification for debugging
        self.model_name: str = model_name or "unknown_model"
        self.model_version: Optional[str] = model_version

        # Store loading parameters for configuration debugging
        self.loading_parameters: Dict[str, Any] = loading_parameters or {}

        # Store hardware requirements for compatibility analysis
        self.hardware_requirements: Dict[str, Any] = hardware_requirements or {}

        # Set high severity for model loading failures
        kwargs.setdefault('severity', ErrorSeverity.HIGH)

        # Set operation context for model loading
        kwargs.setdefault('operation_name', f'{self.model_name}_loading')

        # Provide model-specific remediation guidance
        kwargs.setdefault(
            'suggested_remediation',
            'Verify model availability and hardware compatibility'
        )

        # Initialize base class with model context
        super().__init__(message, **kwargs)


class ModelInferenceError(ImageSimilarityError):
    """
    Raised when model inference operations fail.

    Captures inference parameters, input characteristics, and performance metrics
    for debugging inference pipeline failures.
    """

    def __init__(
        self,
        message: str,
        model_name: Optional[str] = None,
        inference_parameters: Optional[Dict[str, Any]] = None,
        input_characteristics: Optional[Dict[str, Any]] = None,
        performance_metrics: Optional[Dict[str, Any]] = None,
        **kwargs
    ) -> None:
        """
        Initialize model inference error with inference context.

        Args:
            message: Error description
            model_name: Name of model during inference
            inference_parameters: Parameters used for inference
            input_characteristics: Characteristics of input data
            performance_metrics: Performance metrics at failure
            **kwargs: Additional arguments for base class
        """
        # Store model identification for debugging
        self.model_name: str = model_name or "unknown_model"

        # Store inference parameters for configuration debugging
        self.inference_parameters: Dict[str, Any] = inference_parameters or {}

        # Store input characteristics for data validation debugging
        self.input_characteristics: Dict[str, Any] = input_characteristics or {}

        # Store performance metrics for optimization analysis
        self.performance_metrics: Dict[str, Any] = performance_metrics or {}

        # Set high severity for inference failures
        kwargs.setdefault('severity', ErrorSeverity.HIGH)

        # Set operation context for inference
        kwargs.setdefault('operation_name', f'{self.model_name}_inference')

        # Provide inference-specific remediation guidance
        kwargs.setdefault(
            'suggested_remediation',
            'Verify input data format and model state'
        )

        # Initialize base class with inference context
        super().__init__(message, **kwargs)


class HistogramError(ImageSimilarityError):
    """
    Raised when histogram computation or comparison fails.

    Captures histogram parameters, statistical properties, and
    mathematical constraints for systematic histogram analysis debugging.
    """

    def __init__(
        self,
        message: str,
        histogram_parameters: Optional[Dict[str, Any]] = None,
        statistical_properties: Optional[Dict[str, Any]] = None,
        mathematical_constraints: Optional[Dict[str, Any]] = None,
        **kwargs
    ) -> None:
        """
        Initialize histogram error with statistical context.

        Args:
            message: Error description
            histogram_parameters: Parameters used for histogram computation
            statistical_properties: Statistical properties of histograms
            mathematical_constraints: Mathematical constraints that were violated
            **kwargs: Additional arguments for base class
        """
        # Store histogram parameters for algorithm debugging
        self.histogram_parameters: Dict[str, Any] = histogram_parameters or {}

        # Store statistical properties for mathematical analysis
        self.statistical_properties: Dict[str, Any] = statistical_properties or {}

        # Store mathematical constraints for validation debugging
        self.mathematical_constraints: Dict[str, Any] = mathematical_constraints or {}

        # Set medium severity for histogram computation issues
        kwargs.setdefault('severity', ErrorSeverity.MEDIUM)

        # Set operation context for histogram operations
        kwargs.setdefault('operation_name', 'histogram_computation')

        # Provide histogram-specific remediation guidance
        kwargs.setdefault(
            'suggested_remediation',
            'Verify histogram parameters and input data validity'
        )

        # Initialize base class with histogram context
        super().__init__(message, **kwargs)


class ReverseSearchError(ImageSimilarityError):
    """
    Base exception for reverse image search failures.

    Captures web automation context, browser state, and network conditions
    for debugging web scraping and automation failures.
    """

    def __init__(
        self,
        message: str,
        browser_info: Optional[Dict[str, Any]] = None,
        network_conditions: Optional[Dict[str, Any]] = None,
        automation_state: Optional[Dict[str, Any]] = None,
        **kwargs
    ) -> None:
        """
        Initialize reverse search error with web automation context.

        Args:
            message: Error description
            browser_info: Browser version and configuration information
            network_conditions: Network connectivity and performance metrics
            automation_state: Web automation state at failure
            **kwargs: Additional arguments for base class
        """
        # Store browser information for compatibility debugging
        self.browser_info: Dict[str, Any] = browser_info or {}

        # Store network conditions for connectivity debugging
        self.network_conditions: Dict[str, Any] = network_conditions or {}

        # Store automation state for workflow debugging
        self.automation_state: Dict[str, Any] = automation_state or {}

        # Set medium severity for web automation issues
        kwargs.setdefault('severity', ErrorSeverity.MEDIUM)

        # Set operation context for reverse search
        kwargs.setdefault('operation_name', 'reverse_image_search')

        # Initialize base class with web automation context
        super().__init__(message, **kwargs)


class LaunchError(ReverseSearchError):
    """
    Raised when browser launch fails.

    Captures browser installation details, driver compatibility,
    and system environment for debugging launch failures.
    """

    def __init__(
        self,
        message: str,
        driver_path: Optional[Union[str, Path]] = None,
        browser_path: Optional[Union[str, Path]] = None,
        driver_version: Optional[str] = None,
        browser_version: Optional[str] = None,
        **kwargs
    ) -> None:
        """
        Initialize browser launch error with driver context.

        Args:
            message: Error description
            driver_path: Path to WebDriver executable
            browser_path: Path to browser executable
            driver_version: WebDriver version
            browser_version: Browser version
            **kwargs: Additional arguments for base class
        """
        # Store driver and browser paths for installation debugging
        self.driver_path: Optional[Path] = Path(driver_path) if driver_path else None
        self.browser_path: Optional[Path] = Path(browser_path) if browser_path else None

        # Store version information for compatibility debugging
        self.driver_version: Optional[str] = driver_version
        self.browser_version: Optional[str] = browser_version

        # Set high severity for launch failures
        kwargs.setdefault('severity', ErrorSeverity.HIGH)

        # Set operation context for browser launch
        kwargs.setdefault('operation_name', 'browser_launch')

        # Provide launch-specific remediation guidance
        kwargs.setdefault(
            'suggested_remediation',
            'Verify WebDriver and browser installation compatibility'
        )

        # Initialize reverse search error with launch context
        super().__init__(message, **kwargs)


class NavigationError(ReverseSearchError):
    """
    Raised when page navigation fails.

    Captures URL information, HTTP response codes, and page load metrics
    for debugging navigation and connectivity issues.
    """

    def __init__(
        self,
        message: str,
        target_url: Optional[str] = None,
        http_status_code: Optional[int] = None,
        page_load_metrics: Optional[Dict[str, Any]] = None,
        **kwargs
    ) -> None:
        """
        Initialize navigation error with page context.

        Args:
            message: Error description
            target_url: URL that failed to load
            http_status_code: HTTP response status code
            page_load_metrics: Page loading performance metrics
            **kwargs: Additional arguments for base class
        """
        # Store target URL for navigation debugging
        self.target_url: Optional[str] = target_url

        # Store HTTP status for connectivity debugging
        self.http_status_code: Optional[int] = http_status_code

        # Store page load metrics for performance debugging
        self.page_load_metrics: Dict[str, Any] = page_load_metrics or {}

        # Set medium severity for navigation issues
        kwargs.setdefault('severity', ErrorSeverity.MEDIUM)

        # Set operation context for navigation
        kwargs.setdefault('operation_name', 'page_navigation')

        # Provide navigation-specific remediation guidance
        kwargs.setdefault(
            'suggested_remediation',
            'Verify network connectivity and target URL accessibility'
        )

        # Initialize reverse search error with navigation context
        super().__init__(message, **kwargs)


class UploadError(ReverseSearchError):
    """
    Raised when file upload fails.

    Captures upload parameters, file characteristics, and browser state
    for debugging file upload mechanism failures.
    """

    def __init__(
        self,
        message: str,
        file_path: Optional[Union[str, Path]] = None,
        file_size: Optional[int] = None,
        upload_parameters: Optional[Dict[str, Any]] = None,
        **kwargs
    ) -> None:
        """
        Initialize upload error with file context.

        Args:
            message: Error description
            file_path: Path to file that failed upload
            file_size: Size of file in bytes
            upload_parameters: Parameters used for upload attempt
            **kwargs: Additional arguments for base class
        """
        # Store file information for upload debugging
        self.file_path: Optional[Path] = Path(file_path) if file_path else None
        self.file_size: Optional[int] = file_size

        # Store upload parameters for mechanism debugging
        self.upload_parameters: Dict[str, Any] = upload_parameters or {}

        # Set medium severity for upload issues
        kwargs.setdefault('severity', ErrorSeverity.MEDIUM)

        # Set operation context for file upload
        kwargs.setdefault('operation_name', 'file_upload')

        # Provide upload-specific remediation guidance
        kwargs.setdefault(
            'suggested_remediation',
            'Verify file accessibility and upload mechanism compatibility'
        )

        # Initialize reverse search error with upload context
        super().__init__(message, **kwargs)


class ExtractionError(ReverseSearchError):
    """
    Raised when data extraction from results fails.

    Captures DOM state, extraction parameters, and page content
    for debugging web scraping and data extraction failures.
    """

    def __init__(
        self,
        message: str,
        extraction_target: Optional[str] = None,
        dom_state: Optional[Dict[str, Any]] = None,
        extraction_parameters: Optional[Dict[str, Any]] = None,
        **kwargs
    ) -> None:
        """
        Initialize extraction error with DOM context.

        Args:
            message: Error description
            extraction_target: Target element or data for extraction
            dom_state: DOM state at extraction time
            extraction_parameters: Parameters used for extraction
            **kwargs: Additional arguments for base class
        """
        # Store extraction target for debugging data location
        self.extraction_target: Optional[str] = extraction_target

        # Store DOM state for page structure debugging
        self.dom_state: Dict[str, Any] = dom_state or {}

        # Store extraction parameters for algorithm debugging
        self.extraction_parameters: Dict[str, Any] = extraction_parameters or {}

        # Set medium severity for extraction issues
        kwargs.setdefault('severity', ErrorSeverity.MEDIUM)

        # Set operation context for data extraction
        kwargs.setdefault('operation_name', 'data_extraction')

        # Provide extraction-specific remediation guidance
        kwargs.setdefault(
            'suggested_remediation',
            'Verify page structure and extraction selector validity'
        )

        # Initialize reverse search error with extraction context
        super().__init__(message, **kwargs)


class AlgorithmComplexity(Enum):
    """Enumeration of algorithmic complexity classifications for performance analysis."""
    LINEAR = "O(n)"
    QUADRATIC = "O(n²)"
    LOGARITHMIC = "O(log n)"
    EXPONENTIAL = "O(2^n)"
    CONSTANT = "O(1)"


@dataclass(frozen=True)
class KeypointConstraints:
    """
    Mathematical constraints for keypoint detection algorithms.

    Defines dimensional requirements, numerical ranges, and geometric properties
    for keypoint data structures in computer vision applications.
    """

    # Minimum number of keypoints for statistically significant analysis
    min_keypoints: int = 10

    # Maximum number of keypoints to keep computational cost bounded
    max_keypoints: int = 5000

    # Coordinate bounds for image space [0, width] × [0, height], ordered as
    # (x_min, y_min, x_max, y_max); the default leaves the upper bounds unbounded
    coordinate_bounds: Tuple[float, float, float, float] = (0.0, 0.0, float('inf'), float('inf'))

    # Response strength bounds for corner detection quality
    response_bounds: Tuple[float, float] = (0.0, float('inf'))

    # Angle bounds for orientation estimation [0, 2π]
    angle_bounds: Tuple[float, float] = (0.0, 2.0 * np.pi)

    # Scale bounds for scale-invariant detection
    scale_bounds: Tuple[float, float] = (0.1, 10.0)


@dataclass(frozen=True)
class DescriptorConstraints:
    """
    Mathematical constraints for binary descriptor algorithms.

    Defines bit string properties, dimensionality requirements, and
    information-theoretic constraints for binary descriptors.
    """

    # Standard descriptor length in bytes for BRIEF/ORB (32 bytes = 256 bits)
    descriptor_length: int = 32

    # Data type for binary descriptors (unsigned 8-bit integers)
    descriptor_dtype: np.dtype = np.dtype(np.uint8)

    # Hamming distance bounds [0, descriptor_length * 8]
    hamming_distance_bounds: Tuple[int, int] = (0, 256)

    # Entropy bounds for descriptor quality [0, log₂(descriptor_length * 8)]
    entropy_bounds: Tuple[float, float] = (0.0, 8.0)

    # Bit distribution bounds for uniform descriptor quality
    bit_distribution_bounds: Tuple[float, float] = (0.4, 0.6)
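

# --- Illustrative sketch (hypothetical helper; not part of the toolkit's API) ---
# Shows why the Hamming distance between two 32-byte descriptors lies in
# [0, 256]: the distance counts differing bits, and each byte contributes at
# most 8. Pure Python is used so the sketch runs without NumPy or OpenCV.
def _sketch_hamming_distance(first: bytes, second: bytes) -> int:
    """Count the differing bits between two equal-length byte strings."""
    if len(first) != len(second):
        raise ValueError("Descriptors must have the same length")
    # XOR leaves a 1 bit wherever the inputs differ; count those bits per byte
    return sum(bin(a ^ b).count("1") for a, b in zip(first, second))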


@dataclass(frozen=True)
class MatchConstraints:
    """
    Mathematical constraints for descriptor matching algorithms.

    Defines distance thresholds, ratio test parameters, and
    statistical significance requirements for feature matching.
    """

    # Maximum Hamming distance for binary descriptor matching
    max_hamming_distance: int = 64

    # Lowe's ratio test threshold for disambiguation
    lowe_ratio_threshold: float = 0.75

    # Minimum number of matches for geometric verification
    min_matches_for_verification: int = 4

    # Maximum number of matches to keep computational cost bounded
    max_matches: int = 1000

    # Cross-check validation requirement for bidirectional matching
    require_cross_check: bool = True
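

# --- Illustrative sketch (hypothetical helper; not part of the toolkit's API) ---
# One way the distance and ratio thresholds above could be combined when
# filtering a candidate match. The default arguments repeat the
# MatchConstraints defaults as literals so the sketch stands alone; how the
# toolkit itself applies these values is not shown in this section.
def _sketch_accept_match(
    best_distance: float,
    second_best_distance: float,
    max_distance: float = 64,
    ratio_threshold: float = 0.75
) -> bool:
    """Return True when a match is both close and unambiguous."""
    # Reject matches whose best distance is too large to be trusted
    if best_distance > max_distance:
        return False
    # Lowe's ratio test: the best match must clearly beat the runner-up
    return best_distance < ratio_threshold * second_best_distance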


def protocol_validator(
    constraints: Optional[Dict[str, Any]] = None,
    log_violations: bool = True
) -> Callable:
    """
    Decorator factory for runtime protocol validation with mathematical constraints.

    Implements comprehensive validation of protocol method calls including
    parameter bounds checking, return value validation, and constraint verification.

    Args:
        constraints: Dictionary of mathematical constraints to enforce
        log_violations: Whether to log constraint violations for debugging

    Returns:
        Decorator function for protocol method validation
    """

    def decorator(func: Callable) -> Callable:
        """
        Protocol method decorator implementing runtime validation.

        Args:
            func: Protocol method to be validated

        Returns:
            Wrapped function with validation logic
        """

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            """
            Validation wrapper implementing constraint checking and logging.

            Args:
                *args: Positional arguments to protocol method
                **kwargs: Keyword arguments to protocol method

            Returns:
                Validated result from protocol method call

            Raises:
                ValueError: If constraints are violated
                TypeError: If parameter types are invalid
            """
            # Extract logger for validation reporting
            logger = logging.getLogger(f"{__name__}.protocol_validator")

            # Validate input parameters against mathematical constraints
            if constraints:
                # Iterate through constraint specifications
                for param_name, constraint_spec in constraints.items():
                    # 'return_value' constrains the result and is handled after the call
                    if param_name == 'return_value':
                        continue

                    # Extract parameter value from args/kwargs
                    param_value = kwargs.get(param_name)
                    if param_value is None and len(args) > 0:
                        # Attempt to extract from positional arguments
                        sig = inspect.signature(func)
                        param_names = list(sig.parameters.keys())
                        if param_name in param_names:
                            param_index = param_names.index(param_name)
                            if param_index < len(args):
                                param_value = args[param_index]

                    # Apply constraint validation if parameter found
                    if param_value is not None:
                        # Validate type constraints first so the bounds comparison
                        # below never runs against an incompatible type
                        if 'type' in constraint_spec:
                            expected_type = constraint_spec['type']
                            if not isinstance(param_value, expected_type):
                                error_msg = f"Parameter {param_name} has type {type(param_value)}, expected {expected_type}"
                                if log_violations:
                                    logger.error(error_msg)
                                raise TypeError(error_msg)

                        # Validate numerical bounds constraints
                        if 'bounds' in constraint_spec:
                            min_val, max_val = constraint_spec['bounds']
                            if not (min_val <= param_value <= max_val):
                                error_msg = f"Parameter {param_name} = {param_value} violates bounds [{min_val}, {max_val}]"
                                if log_violations:
                                    logger.error(error_msg)
                                raise ValueError(error_msg)

                        # Validate shape constraints for array parameters
                        if 'shape' in constraint_spec and hasattr(param_value, 'shape'):
                            expected_shape = constraint_spec['shape']
                            if expected_shape is not None and param_value.shape != expected_shape:
                                error_msg = f"Parameter {param_name} has shape {param_value.shape}, expected {expected_shape}"
                                if log_violations:
                                    logger.error(error_msg)
                                raise ValueError(error_msg)

            # Execute protocol method with validated parameters
            result = func(*args, **kwargs)

            # Validate return value constraints if specified
            if constraints and 'return_value' in constraints:
                return_constraints = constraints['return_value']

                # Validate return type constraints
                if 'type' in return_constraints:
                    expected_type = return_constraints['type']
                    if not isinstance(result, expected_type):
                        error_msg = f"Return value has type {type(result)}, expected {expected_type}"
                        if log_violations:
                            logger.error(error_msg)
                        raise TypeError(error_msg)

                # Validate return value bounds
                if 'bounds' in return_constraints and result is not None:
                    min_val, max_val = return_constraints['bounds']
                    if hasattr(result, '__len__'):
                        # Validate collection length bounds
                        if not (min_val <= len(result) <= max_val):
                            error_msg = f"Return value length {len(result)} violates bounds [{min_val}, {max_val}]"
                            if log_violations:
                                logger.error(error_msg)
                            raise ValueError(error_msg)
                    else:
                        # Validate scalar value bounds
                        if not (min_val <= result <= max_val):
                            error_msg = f"Return value {result} violates bounds [{min_val}, {max_val}]"
                            if log_violations:
                                logger.error(error_msg)
                            raise ValueError(error_msg)

            return result

        return wrapper
    return decorator


@runtime_checkable
class FeatureDetectorProtocol(Protocol):
    """
    Mathematical protocol defining interface for corner detection and feature extraction algorithms.

    Implements the theoretical foundation for scale-invariant feature detection based on:

    1. Corner Detection (FAST keypoints ranked by the Harris response, as in ORB):
       - Corner response: R = det(M) - k·trace²(M) where M is structure tensor
       - Structure tensor: M = G * [Ix² IxIy; IxIy Iy²], gradient products convolved with Gaussian G
       - Corner threshold: R > threshold for corner classification

    2. Scale-Space Representation:
       - Gaussian pyramid: L(x,y,σ) = G(x,y,σ) * I(x,y)
       - Scale selection: σ = 1.2^octave · 2^(layer/nOctaveLayers)
       - Extrema detection: local maxima in scale-normalized Laplacian

    3. Orientation Assignment:
       - Gradient magnitude: m(x,y) = √((L(x+1,y) - L(x-1,y))² + (L(x,y+1) - L(x,y-1))²)
       - Gradient orientation: θ(x,y) = atan2(L(x,y+1) - L(x,y-1), L(x+1,y) - L(x-1,y))
       - Dominant orientation: peak in weighted orientation histogram

    Mathematical Complexity: O(n·m·s) where n×m is image dimensions, s is scale levels
    Memory Complexity: O(n·m·s) for scale pyramid storage
    Numerical Stability: Requires σ > 0.5 for proper Gaussian approximation
    """

    @protocol_validator(constraints={
        'image': {
            'type': np.ndarray,
            'bounds': (0, float('inf'))  # Non-negative pixel values
        },
        'mask': {
            'type': (type(None), np.ndarray)
        },
        'return_value': {
            'type': tuple,
            'bounds': (2, 2)  # Exactly 2 elements: keypoints and descriptors
        }
    })
    def detectAndCompute(
        self,
        image: np.ndarray,
        mask: Optional[np.ndarray] = None
    ) -> Tuple[List[cv2.KeyPoint], Optional[np.ndarray]]:
        """
        Detect keypoints and compute binary descriptors with mathematical guarantees.

        Implements the complete feature detection pipeline:

        1. Image Preprocessing:
           - Gaussian smoothing: I' = G_σ * I for noise reduction
           - Gradient computation: ∇I = [∂I/∂x, ∂I/∂y]
           - Structure tensor: M = G_ρ * (∇I ∇I^T)

        2. Keypoint Detection (FAST Algorithm):
           - Segment test: a contiguous arc of the 16 circle pixels x_i (e.g. 9 of them) satisfies I(x_i) > I(p) + t, or I(x_i) < I(p) - t
           - Non-maximal suppression: local maxima in corner response
           - Sub-pixel refinement: quadratic interpolation for precision

        3. Descriptor Computation (BRIEF/ORB):
           - Binary tests: τ(p; x,y) := 1 if p(x) < p(y), else 0
           - Descriptor vector: f_n(p) := Σ_{1≤i≤n} 2^(i-1)τ(p; x_i, y_i)
           - Rotation invariance: orientation compensation via patch rotation

        Mathematical Requirements:
        - Image must be single-channel grayscale: I: ℝ² → [0,255]
        - Mask must be binary if provided: M: ℝ² → {0,1}
        - Keypoints satisfy Harris corner criterion: R > threshold
        - Descriptors maintain Hamming distance properties: d_H ∈ [0,256]

        Args:
            image: Input grayscale image as 2D numpy array of uint8 values
                  Mathematical domain: I ∈ [0,255]^(H×W)
            mask: Optional binary mask for region-of-interest detection
                 Mathematical domain: M ∈ {0,1}^(H×W) or None

        Returns:
            Tuple containing:
            - keypoints: List of cv2.KeyPoint objects with mathematical properties:
              * pt: (x,y) coordinates in image space ℝ²
              * response: corner strength R ∈ ℝ⁺
              * angle: orientation θ ∈ [0°,360°) in degrees (-1 if not applicable)
              * size: diameter of the meaningful keypoint neighborhood, in pixels
            - descriptors: Binary descriptor matrix of shape (n_keypoints, 32)
              * Mathematical domain: D ∈ {0,1,...,255}^(n×32)
              * Hamming distance: d_H(d_i, d_j) = Σ_k popcount(d_i[k] ⊕ d_j[k])
              * None if no keypoints detected

        Raises:
            ValueError: If image is not 2D grayscale or mask shape incompatible
            TypeError: If input types do not conform to mathematical requirements

        Mathematical Guarantees:
        - Keypoint coordinates: (x,y) ∈ [0,W) × [0,H)
        - Corner response: R ≥ threshold > 0
        - Orientation: θ ∈ [0°,360°) in degrees, with ±15° (π/12) accuracy
        - Scale: σ ∈ [1.2^0, 1.2^nOctaves]
        - Descriptor consistency: |d_i| = 256 bits ∀i
        """
        ...


@runtime_checkable
class MatcherProtocol(Protocol):
    """
    Mathematical protocol defining interface for binary descriptor matching algorithms.

    Implements theoretical foundation for nearest neighbor search in Hamming space:

    1. Hamming Distance Computation:
       - Binary XOR operation: d_H(x,y) = Σᵢ |xᵢ ⊕ yᵢ|
       - Population count: popcount(x ⊕ y) for efficient computation
       - Distance bounds: d_H ∈ [0, descriptor_length]

    2. Brute-Force Matching:
       - Exhaustive search: min_{j} d_H(qᵢ, tⱼ) for query qᵢ
       - Time complexity: O(n·m·k) where n,m are descriptor counts, k is length
       - Space complexity: O(n·m) for distance matrix storage

    3. Cross-Check Validation:
       - Bidirectional consistency: match(qᵢ, tⱼ) iff match(tⱼ, qᵢ)
       - Reduces false positives: a match survives only if it is the nearest neighbor in both directions
       - Improves precision at cost of recall

    Mathematical Properties:
    - Metric space: (Hamming space, d_H) satisfies triangle inequality
    - Symmetry: d_H(x,y) = d_H(y,x)
    - Non-negativity: d_H(x,y) ≥ 0 with equality iff x = y
    """

    @protocol_validator(constraints={
        'queryDescriptors': {
            'type': np.ndarray,
            'shape': (None, 32)  # Standard BRIEF/ORB descriptor length
        },
        'trainDescriptors': {
            'type': np.ndarray,
            'shape': (None, 32)
        },
        'return_value': {
            'type': list,
            'bounds': (0, 10000)
        }
    })
    def match(
        self,
        queryDescriptors: np.ndarray,
        trainDescriptors: np.ndarray
    ) -> List[cv2.DMatch]:
        """
        Perform nearest-neighbor matching in Hamming space with mathematical guarantees.

        Implements exhaustive nearest neighbor search with the following algorithm:

        1. Distance Matrix Computation:
           - Compute D[i,j] = d_H(query[i], train[j]) ∀i,j
           - Use bit manipulation: d_H = popcount(query[i] ⊕ train[j])
           - Matrix domain: D ∈ ℕ₀^(n_query × n_train)

        2. Nearest-Neighbor Assignment:
           - For each query descriptor qᵢ: match[i] = argmin_j D[i,j]
           - Optional distance constraint: D[i, match[i]] ≤ threshold, if the implementation applies one
           - Uniqueness enforcement via cross-check if enabled

        3. Cross-Check Validation (if enabled):
           - Forward match: f(i) = argmin_j d_H(qᵢ, tⱼ)
           - Backward match: b(j) = argmin_i d_H(qᵢ, tⱼ)
           - Accept match iff b(f(i)) = i (bidirectional consistency)

        Mathematical Requirements:
        - Query descriptors: Q ∈ {0,1,...,255}^(n×32)
        - Train descriptors: T ∈ {0,1,...,255}^(m×32)
        - Hamming distance: d_H: {0,1}^k × {0,1}^k → [0,k]

        Args:
            queryDescriptors: Binary descriptors from first image
                            Mathematical domain: Q ∈ {0,...,255}^(n×32)
                            Each row represents 256-bit binary descriptor
            trainDescriptors: Binary descriptors from second image
                            Mathematical domain: T ∈ {0,...,255}^(m×32)
                            Each row represents 256-bit binary descriptor

        Returns:
            List of cv2.DMatch objects with mathematical properties:
            - queryIdx: index i ∈ [0, n-1] in query descriptor array
            - trainIdx: index j ∈ [0, m-1] in train descriptor array
            - distance: Hamming distance d_H(qᵢ, tⱼ) ∈ [0, 256]
            - imgIdx: image index (typically 0 for single image matching)

            Mathematical guarantees:
            - Nearest neighbor: distance[i] = min_j d_H(query[i], train[j])
            - Valid indices: queryIdx ∈ [0,n), trainIdx ∈ [0,m)
            - Metric properties: distance ≥ 0, symmetric if cross-check enabled

        Raises:
            ValueError: If descriptor arrays have incompatible shapes
            TypeError: If descriptors are not uint8 numpy arrays

        Complexity Analysis:
        - Time: O(n·m·k) where k=32 bytes, n,m are descriptor counts
        - Space: O(n·m) for distance matrix plus O(min(n,m)) for matches
        - Cache efficiency: depends on descriptor layout and SIMD utilization
        """
        ...

    @protocol_validator(constraints={
        'queryDescriptors': {
            'type': np.ndarray,
            'shape': (None, 32)
        },
        'trainDescriptors': {
            'type': np.ndarray,
            'shape': (None, 32)
        },
        'k': {
            'type': int,
            'bounds': (1, 10)
        },
        'return_value': {
            'type': list,
            'bounds': (0, 100000)
        }
    })
    def knnMatch(
        self,
        queryDescriptors: np.ndarray,
        trainDescriptors: np.ndarray,
        k: int
    ) -> List[List[cv2.DMatch]]:
        """
        K-nearest neighbor matching in Hamming space with Lowe's ratio test support.

        Implements k-NN search with mathematical foundation:

        1. K-Nearest Neighbor Search:
           - For query qᵢ, find k smallest distances: d₁ ≤ d₂ ≤ ... ≤ dₖ
           - Partial sorting: maintain k-element max-heap of the best candidates during search
           - Time complexity: O(n·m·log k) with heap optimization

        2. Distance Ordering:
           - Sort matches by Hamming distance: d_H(qᵢ, t_{j₁}) ≤ d_H(qᵢ, t_{j₂}) ≤ ...
           - Tie breaking: lexicographic order on train descriptor indices
           - Consistency: deterministic ordering for reproducible results

        3. Lowe's Ratio Test Preparation:
           - First-to-second ratio: r = d₁/d₂ for disambiguation
           - Threshold: typically r < 0.75 for reliable matches
           - Rationale: a correct match should be markedly closer than the nearest incorrect one

        Statistical Foundation:
        - Probability of random match: P(d_H ≤ t) = Σᵢ₌₀ᵗ C(256,i) / 2²⁵⁶, assuming independent, uniformly random bits
        - Discrimination threshold: chosen to minimize false positive rate
        - Power law distribution: P(distance) ∝ distance^(-α) for natural images

        Args:
            queryDescriptors: Binary descriptors from query image
                            Mathematical domain: Q ∈ {0,...,255}^(n×32)
            trainDescriptors: Binary descriptors from train image
                            Mathematical domain: T ∈ {0,...,255}^(m×32)
            k: Number of nearest neighbors to return
               Mathematical constraint: k ∈ [1, min(m, reasonable_limit)]

        Returns:
            List of lists containing k-nearest matches per query descriptor:
            - Outer list: length n (one entry per query descriptor)
            - Inner lists: length ≤ k (k-nearest matches sorted by distance)
            - Each DMatch: (queryIdx, trainIdx, distance, imgIdx)

            Mathematical properties:
            - Ordering: distance[0] ≤ distance[1] ≤ ... ≤ distance[k-1]
            - Completeness: returns all matches if m < k
            - Consistency: deterministic results for identical inputs

        Raises:
            ValueError: If k is outside the validated range [1, 10] or a descriptor array has the wrong shape
            TypeError: If the descriptor arrays or k have the wrong type

        Applications:
        - Lowe's ratio test: reject if distance[0]/distance[1] > threshold
        - RANSAC initialization: use multiple correspondences per query
        - Descriptor quality assessment: analyze distance distributions
        """
        ...


@runtime_checkable
class ClipModelLoaderProtocol(Protocol):
    """
    Mathematical protocol defining interface for CLIP model loading and initialization.

    Implements theoretical foundation for contrastive language-image pre-training:

    1. Vision Transformer Architecture:
       - Multi-head self-attention: Attention(Q,K,V) = softmax(QK^T/√d_k)V
       - Layer normalization: LN(x) = γ(x-μ)/σ + β
       - Feed-forward networks: FFN(x) = act(xW₁ + b₁)W₂ + b₂ with a GELU-family activation (QuickGELU, x·sigmoid(1.702x), in the original CLIP release)

    2. Patch Embedding:
       - Image tokenization: x_patch = Flatten(x_img) ∈ ℝ^(P²·C)
       - Linear projection: z₀ = [x_class; E·x_patch] + E_pos
       - Positional encoding: learned embeddings for spatial relationships

    3. Contrastive Learning:
       - Joint embedding space: f_v: Images → ℝ^d, f_t: Text → ℝ^d
       - Cosine similarity: sim(I,T) = (f_v(I)·f_t(T))/(||f_v(I)||·||f_t(T)||)
       - InfoNCE loss: ℒ = -log(exp(sim(I,T)/τ) / Σ_j exp(sim(I,T_j)/τ))

    Memory Requirements:
    - Model parameters: ~151M for ViT-B/32, ~428M for ViT-L/14
    - Activation memory: O(batch_size · sequence_length · hidden_dim)
    - GPU memory: typically 2-8GB depending on model size and batch size
    """

    @protocol_validator(constraints={
        'model_name': {
            'type': str,
            'bounds': (1, 100)  # Reasonable model name length
        },
        'device': {
            'type': str
        },
        'return_value': {
            'type': tuple,
            'bounds': (2, 2)  # Exactly 2 elements: model and preprocess
        }
    })
    def __call__(
        self,
        model_name: str,
        device: str
    ) -> Tuple[torch.nn.Module, Callable[[Any], torch.Tensor]]:
        """
        Load pre-trained CLIP model with mathematical guarantees and optimization.

        Implements comprehensive model loading pipeline:

        1. Model Architecture Instantiation:
           - Vision encoder: ViT with patch size P, hidden dimension d
           - Text encoder: Transformer with vocabulary size V, context length L
           - Projection heads: linear mappings to joint embedding space ℝ^d_joint

        2. Weight Loading and Verification:
           - Parameter initialization: load pre-trained weights from checkpoint
           - Numerical precision: ensure float32/float16 consistency
           - Architecture verification: validate layer dimensions and connections

        3. Device Optimization:
           - Memory allocation: efficient GPU/CPU memory management
           - Compute optimization: enable appropriate acceleration (CUDA, MPS)
           - Precision selection: mixed precision for memory efficiency

        4. Preprocessing Pipeline:
           - Image normalization: (pixel - μ) / σ where μ, σ are the CLIP training-set statistics
           - Resize and crop: bicubic resize that maintains aspect ratio, center crop to model input size
           - Tensor conversion: PIL image → torch.Tensor; the caller moves the tensor to the target device

        Mathematical Requirements:
        - Model input: images ∈ ℝ^(B×3×H×W) where H,W depend on model variant
        - Embedding output: features ∈ ℝ^(B×d_embed); unit L2 norm only after explicit normalization
        - Numerical stability: gradients bounded, no NaN/Inf propagation

        Args:
            model_name: CLIP model variant identifier
                       Valid options: "ViT-B/32", "ViT-B/16", "ViT-L/14", "RN50", "RN101"
                       Mathematical specification:
                       - ViT-B/32: 12 layers, 768 hidden, 12 heads, 32×32 patches
                       - ViT-L/14: 24 layers, 1024 hidden, 16 heads, 14×14 patches
            device: Target computational device
                   Options: "cpu", "cuda", "cuda:N", "mps"
                   Mathematical consideration: affects precision and memory layout

        Returns:
            Tuple containing:
            - model: torch.nn.Module with mathematical properties:
              * encode_image: ℝ^(B×3×H×W) → ℝ^(B×d_embed)
              * encode_text: token ids ∈ ℤ^(B×L) → ℝ^(B×d_embed)
              * Parameters: θ ∈ ℝ^N where N ≈ 151M for ViT-B/32
              * Embedding dimension: d_embed = 512 for ViT-B/32 and ViT-B/16, 768 for ViT-L/14
            - preprocess: Callable with mathematical specification:
              * Input: PIL.Image
              * Output: torch.Tensor ∈ ℝ^(3×H×W), normalized per channel (values are not confined to [-1,1])
              * Normalization: (x - μ)/σ where μ=[0.4815,0.4578,0.4082], σ=[0.2686,0.2613,0.2758]
              * Resize: bicubic interpolation to target resolution

        Raises:
            ModelLoadError: If model loading fails due to network, memory, or compatibility issues
            ValueError: If model_name is not supported or device is invalid
            RuntimeError: If insufficient memory or device incompatibility

        Performance Characteristics:
        - Loading time: O(model_size / bandwidth) + initialization overhead
        - Memory usage: 1.5-2x model parameter count during loading
        - Inference speed: depends on batch size, resolution, device compute capability

        Mathematical Guarantees:
        - Deterministic output: same input → same embedding (within numerical precision)
        - Embedding norm: ||encode_image(x)||₂ ≈ 1 after normalization
        - Lipschitz continuity: bounded sensitivity to input perturbations
        - Cosine similarity preservation: sim(x₁,x₂) consistent across devices
        """
        ...


@runtime_checkable
class ImageHashProtocol(Protocol):
    """
    Mathematical protocol defining interface for perceptual image hashing algorithms.

    Implements theoretical foundation for content-based image fingerprinting:

    1. Discrete Cosine Transform (DCT):
       - 2D DCT: F(u,v) = α(u)α(v) Σₓ Σᵧ f(x,y)cos((2x+1)uπ/2N)cos((2y+1)vπ/2N)
       - Frequency compaction: energy concentrated in low-frequency coefficients
       - Basis functions: separable cosine functions with orthogonality

    2. Perceptual Hash Generation:
       - Low-frequency extraction: keep top-left N×N DCT coefficients
       - Median thresholding: binary hash h[i,j] = 1 if F[i,j] > median(F), else 0
       - Hash concatenation: serialize 2D binary matrix to 1D bit string

    3. Hamming Distance:
       - Bit difference: d_H(h₁,h₂) = Σᵢ |h₁[i] ⊕ h₂[i]|
       - Similarity metric: sim = 1 - d_H/(hash_size²)
       - Robustness: tolerant of compression, scaling, and minor modifications
    """

    @protocol_validator(constraints={
        'image': {
            'type': object  # PIL Image or numpy array (typing.Any cannot be used with isinstance)
        },
        'hash_size': {
            'type': int,
            'bounds': (4, 64)  # Practical hash size limits
        },
        'return_value': {
            'type': object  # ImageHash object
        }
    })
    def compute_hash(
        self,
        image: Union[np.ndarray, Any],
        hash_size: int = 8
    ) -> Any:
        """
        Compute perceptual hash using DCT-based algorithm with mathematical rigor.

        Implementation follows the mathematical specification:

        1. Image Preprocessing:
           - Grayscale conversion: I_gray = 0.299R + 0.587G + 0.114B
           - Resize to (hash_size×4, hash_size×4): maintains frequency characteristics
           - Gaussian smoothing: optional noise reduction filter

        2. DCT Computation:
           - Apply 2D DCT to preprocessed image
           - Extract low-frequency coefficients: top-left hash_size×hash_size block
           - Mathematical domain: DCT coefficients ∈ ℝ^(hash_size×hash_size)

        3. Binary Hash Generation:
           - Compute median of DCT coefficients: τ = median(DCT_coeffs)
           - Threshold: h[i,j] = 1 if DCT[i,j] > τ, else 0
           - Serialize to bit string: hash_bits ∈ {0,1}^(hash_size²)

        Args:
            image: Input image for hash computation
                  Mathematical requirements: I ∈ [0,255]^(H×W×C) or ℝ^(H×W)
            hash_size: Dimension N of N×N hash matrix
                      Mathematical constraint: N ∈ [4,64] for practical applications

        Returns:
            ImageHash object with mathematical properties:
            - Binary representation: bit string of length hash_size²
            - Hamming distance: metric d_H: Hash × Hash → [0, hash_size²]
            - Hex representation: compact string encoding

        Mathematical Guarantees:
        - Perceptual robustness: tolerant of JPEG compression and scaling; not invariant to rotation
        - Discrimination: different images → different hashes with high probability
        - Efficiency: O(N²log N) computation via FFT-based DCT
        """
        ...


def validate_protocol_implementation(
    obj: Any,
    protocol_class: Type[Protocol],
    strict: bool = True
) -> Tuple[bool, List[str]]:
    """
    Validate that object properly implements protocol with mathematical constraints.

    Performs protocol conformance checking, covering:
    - Presence and callability of each protocol method
    - Parameter count compatibility
    - Parameter name and annotation consistency (strict mode only)

    Args:
        obj: Object to validate against protocol
        protocol_class: Protocol class defining interface requirements
        strict: Whether to enforce strict mathematical constraints

    Returns:
        Tuple of (is_valid, violation_messages)

    Mathematical Analysis:
    - Signature compatibility: parameter types and counts must match exactly
    - Return type variance: covariant return types allowed for subtypes
    - Constraint satisfaction: all mathematical bounds must be respected
    """
    # Initialize validation state
    is_valid = True
    violations = []

    # Extract protocol methods for validation
    protocol_methods = [
        name for name in dir(protocol_class)
        # Keep public methods, plus __call__ for callable protocols
        if (not name.startswith('_') or name == '__call__')
        and callable(getattr(protocol_class, name, None))
    ]

    # Validate each protocol method
    for method_name in protocol_methods:
        # Check method existence on implementation object
        if not hasattr(obj, method_name):
            is_valid = False
            violations.append(f"Missing required method: {method_name}")
            continue

        # Extract method objects for signature comparison
        protocol_method = getattr(protocol_class, method_name)
        impl_method = getattr(obj, method_name)

        # Validate method is callable
        if not callable(impl_method):
            is_valid = False
            violations.append(f"Method {method_name} is not callable")
            continue

        # Compare method signatures for compatibility
        try:
            protocol_sig = inspect.signature(protocol_method)
            impl_sig = inspect.signature(impl_method)

            # The protocol method is read from the class and still lists `self`,
            # whereas a bound implementation method does not. Drop `self` on
            # both sides so the parameter lists are comparable.
            proto_params = [
                p for p in protocol_sig.parameters.values() if p.name != 'self'
            ]
            impl_params = [
                p for p in impl_sig.parameters.values() if p.name != 'self'
            ]

            # Validate parameter count compatibility
            if len(proto_params) != len(impl_params):
                is_valid = False
                violations.append(
                    f"Method {method_name} parameter count mismatch: "
                    f"protocol={len(proto_params)}, "
                    f"implementation={len(impl_params)}"
                )

            # Validate parameter types if strict checking enabled
            if strict:
                for proto_param, impl_param in zip(proto_params, impl_params):
                    # Check parameter name consistency
                    if proto_param.name != impl_param.name:
                        violations.append(
                            f"Method {method_name} parameter name mismatch: "
                            f"protocol={proto_param.name}, implementation={impl_param.name}"
                        )

                    # Check parameter annotation compatibility
                    if (proto_param.annotation != inspect.Parameter.empty and
                        impl_param.annotation != inspect.Parameter.empty and
                        proto_param.annotation != impl_param.annotation):
                        violations.append(
                            f"Method {method_name} parameter {proto_param.name} type mismatch: "
                            f"protocol={proto_param.annotation}, implementation={impl_param.annotation}"
                        )

        except (ValueError, TypeError) as e:
            # Handle signature inspection failures
            violations.append(f"Failed to inspect method {method_name}: {e}")

    # Update validation result based on violations found
    if violations:
        is_valid = False

    return is_valid, violations


# Enhanced protocol registration for runtime checking
def register_protocol_implementation(
    impl_class: Type,
    protocol_class: Type[Protocol],
    validate_on_registration: bool = True
) -> Type:
    """
    Register class as protocol implementation with optional validation.

    Provides runtime registration of protocol implementations with comprehensive
    validation and mathematical constraint checking capabilities.

    Args:
        impl_class: Implementation class to register
        protocol_class: Protocol class defining interface
        validate_on_registration: Whether to validate immediately

    Returns:
        Registered implementation class

    Raises:
        TypeError: If implementation does not conform to protocol
        ValueError: If mathematical constraints are violated
    """
    # Perform immediate validation if requested
    if validate_on_registration:
        # Create temporary instance for validation (if possible)
        try:
            temp_instance = impl_class()
        except Exception as e:
            # Instantiation failed (e.g. required constructor arguments):
            # log the skipped validation but allow registration
            logger = logging.getLogger(__name__)
            logger.warning(f"Could not validate {impl_class.__name__} on registration: {e}")
        else:
            is_valid, violations = validate_protocol_implementation(
                temp_instance, protocol_class, strict=True
            )

            # Raise outside the try block so the TypeError is not swallowed
            # by the instantiation handler above
            if not is_valid:
                violation_summary = "; ".join(violations)
                raise TypeError(
                    f"Class {impl_class.__name__} does not properly implement "
                    f"{protocol_class.__name__}: {violation_summary}"
                )

    # Register class with protocol for isinstance checks
    protocol_class.register(impl_class)

    return impl_class
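
The `MatcherProtocol` docstrings describe Hamming distance via popcount, cross-check validation, and Lowe's ratio test without showing them in code. The following is a minimal NumPy sketch of those three steps, assuming 32-byte `uint8` descriptors. The helper names `hamming_matrix`, `cross_check`, and `ratio_test` are hypothetical and are not part of the toolkit or of OpenCV; a real pipeline would more likely delegate this work to an OpenCV matcher.

```python
import numpy as np

# Lookup table: number of set bits for every possible byte value.
_POPCOUNT = np.array([bin(v).count("1") for v in range(256)], dtype=np.uint16)


def hamming_matrix(query: np.ndarray, train: np.ndarray) -> np.ndarray:
    """Return D[i, j] = popcount(query[i] XOR train[j]) for uint8 descriptors."""
    # Broadcast to shape (n_query, n_train, n_bytes), then count differing bits.
    xor = query[:, None, :] ^ train[None, :, :]
    return _POPCOUNT[xor].sum(axis=2)


def cross_check(dist: np.ndarray) -> list:
    """Keep (i, j) pairs that are mutual nearest neighbors: b(f(i)) == i."""
    forward = dist.argmin(axis=1)   # f(i): best train index for each query
    backward = dist.argmin(axis=0)  # b(j): best query index for each train
    return [(i, int(j)) for i, j in enumerate(forward) if backward[j] == i]


def ratio_test(dist: np.ndarray, ratio: float = 0.75) -> list:
    """Return query indices whose best distance is below ratio * second best.

    Requires at least two train descriptors (two columns in dist).
    """
    ordered = np.sort(dist, axis=1)
    d1, d2 = ordered[:, 0], ordered[:, 1]
    return [int(i) for i in np.flatnonzero(d1 < ratio * d2)]


# Illustrative usage with random descriptors (hypothetical data).
rng = np.random.default_rng(0)
query = rng.integers(0, 256, size=(5, 32), dtype=np.uint8)
train = rng.integers(0, 256, size=(7, 32), dtype=np.uint8)
dist = hamming_matrix(query, train)
mutual_matches = cross_check(dist)
unambiguous_queries = ratio_test(dist)
```

The full distance matrix costs O(n·m) memory, which matches the space complexity stated in the docstrings; for large descriptor sets a chunked computation would be the safer choice.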




In [None]:
# Factory Classes for Resource Management

class OptimizationStrategy(Enum):
    """Enumeration of parameter optimization strategies for algorithm tuning."""
    GRID_SEARCH = "grid_search"
    RANDOM_SEARCH = "random_search"
    BAYESIAN_OPTIMIZATION = "bayesian_optimization"
    ADAPTIVE_TUNING = "adaptive_tuning"
    PERFORMANCE_BASED = "performance_based"


class ResourceConstraints(Enum):
    """System resource constraint levels for factory configuration."""
    LOW_MEMORY = "low_memory"
    BALANCED = "balanced"
    HIGH_PERFORMANCE = "high_performance"
    UNLIMITED = "unlimited"


@dataclass(frozen=True)
class ParameterConstraints:
    """
    Mathematical constraints for algorithm parameters with optimization bounds.

    Defines valid parameter ranges, optimization objectives, and performance
    characteristics for systematic parameter tuning and validation.
    """

    # Parameter name for identification and logging
    name: str

    # Mathematical bounds: [min_value, max_value] for parameter domain
    bounds: Tuple[Union[int, float], Union[int, float]]

    # Data type constraint for parameter values
    dtype: Type[Union[int, float, bool]]

    # Optimization objective: 'minimize' or 'maximize' for tuning
    optimization_objective: str = "maximize"

    # Performance weight: importance factor for multi-objective optimization
    performance_weight: float = 1.0

    # Stability requirement: parameter sensitivity to input variations
    stability_threshold: float = 0.1

    # Computational cost factor: relative cost of parameter increase
    computational_cost: float = 1.0

    # Mathematical relationship to other parameters
    dependencies: List[str] = field(default_factory=list)


@dataclass
class PerformanceMetrics:
    """
    Comprehensive performance metrics for algorithm evaluation and optimization.

    Captures computational efficiency, mathematical accuracy, and resource
    utilization for systematic parameter optimization and benchmarking.
    """

    # Execution time measurements in seconds
    mean_execution_time: float = 0.0
    std_execution_time: float = 0.0
    min_execution_time: float = float('inf')
    max_execution_time: float = 0.0

    # Memory usage statistics in bytes
    peak_memory_usage: int = 0
    average_memory_usage: int = 0
    memory_efficiency_ratio: float = 0.0

    # Algorithm quality metrics
    detection_accuracy: float = 0.0
    repeatability_score: float = 0.0
    robustness_measure: float = 0.0

    # Mathematical properties
    numerical_stability: float = 0.0
    convergence_rate: float = 0.0
    error_variance: float = 0.0

    # Resource utilization
    cpu_utilization: float = 0.0
    gpu_utilization: float = 0.0
    cache_hit_ratio: float = 0.0

    # Statistical significance
    sample_count: int = 0
    confidence_interval: Tuple[float, float] = (0.0, 0.0)
    statistical_power: float = 0.0


class ParameterOptimizer:
    """
    Mathematical parameter optimization engine for computer vision algorithms.

    Implements heuristic parameter tuning using multiple search strategies, with
    performance profiling, repeated sampling, and a no-improvement stopping rule.
    The search is stochastic and does not guarantee a global optimum.
    """

    def __init__(
        self,
        constraints: Dict[str, ParameterConstraints],
        optimization_strategy: OptimizationStrategy = OptimizationStrategy.ADAPTIVE_TUNING,
        performance_samples: int = 10,
        convergence_threshold: float = 0.01,
        max_iterations: int = 100
    ) -> None:
        """
        Initialize parameter optimizer with mathematical constraints and strategy.

        Args:
            constraints: Dictionary mapping parameter names to constraint specifications
            optimization_strategy: Algorithm for parameter space exploration
            performance_samples: Number of performance measurements per configuration
            convergence_threshold: Minimum improvement threshold for convergence
            max_iterations: Maximum optimization iterations before termination
        """
        # Store parameter constraints for validation and optimization
        self.constraints: Dict[str, ParameterConstraints] = constraints

        # Set optimization strategy for parameter space exploration
        self.optimization_strategy: OptimizationStrategy = optimization_strategy

        # Configure performance measurement parameters
        self.performance_samples: int = performance_samples
        self.convergence_threshold: float = convergence_threshold
        self.max_iterations: int = max_iterations

        # Initialize optimization history for analysis
        self.optimization_history: List[Dict[str, Any]] = []

        # Track best parameter configuration found
        self.best_parameters: Optional[Dict[str, Any]] = None
        self.best_performance: Optional[PerformanceMetrics] = None

        # Setup logging for optimization monitoring
        self.logger = logging.getLogger(f"{__name__}.{self.__class__.__name__}")

    def validate_parameters(
        self,
        parameters: Dict[str, Any]
    ) -> Tuple[bool, List[str]]:
        """
        Validate parameter configuration against mathematical constraints.

        Implements comprehensive parameter validation including bounds checking,
        type verification, dependency analysis, and mathematical consistency.

        Args:
            parameters: Dictionary of parameter values to validate

        Returns:
            Tuple of (is_valid, violation_messages) for constraint compliance
        """
        # Initialize validation state
        is_valid = True
        violations = []

        # Validate each parameter against its constraints
        for param_name, param_value in parameters.items():
            # Check if parameter has defined constraints
            if param_name not in self.constraints:
                violations.append(f"Unknown parameter: {param_name}")
                is_valid = False
                continue

            # Extract constraint specification for parameter
            constraint = self.constraints[param_name]

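            # Validate parameter type consistency; bool is a subclass of int in
            # Python, so reject bool values explicitly for non-bool parameters
            type_matches = isinstance(param_value, constraint.dtype)
            if constraint.dtype is not bool and isinstance(param_value, bool):
                type_matches = False
            if not type_matches:
                violations.append(
                    f"Parameter {param_name} has type {type(param_value)}, "
                    f"expected {constraint.dtype}"
                )
                is_valid = False
                continue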

            # Validate parameter bounds compliance
            min_bound, max_bound = constraint.bounds
            if not (min_bound <= param_value <= max_bound):
                violations.append(
                    f"Parameter {param_name} = {param_value} violates bounds "
                    f"[{min_bound}, {max_bound}]"
                )
                is_valid = False

            # Validate parameter dependencies if specified
            for dependency in constraint.dependencies:
                if dependency in parameters:
                    # Check dependency-specific constraints
                    dep_value = parameters[dependency]
                    if not self._validate_dependency(param_name, param_value, dependency, dep_value):
                        violations.append(
                            f"Parameter {param_name} = {param_value} violates "
                            f"dependency constraint with {dependency} = {dep_value}"
                        )
                        is_valid = False

        return is_valid, violations

    def _validate_dependency(
        self,
        param_name: str,
        param_value: Any,
        dep_name: str,
        dep_value: Any
    ) -> bool:
        """
        Validate parameter dependency constraints for mathematical consistency.

        Args:
            param_name: Name of parameter being validated
            param_value: Value of parameter being validated
            dep_name: Name of dependency parameter
            dep_value: Value of dependency parameter

        Returns:
            Boolean indicating whether dependency constraint is satisfied
        """
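        # Implement specific dependency validation logic
        # Example: scaleFactor must be > 1.0 and nlevels must be > 0
        if param_name == "scaleFactor" and dep_name == "nlevels":
            # Reject values that would make the logarithm below zero or negative
            if param_value <= 1.0 or dep_value <= 0:
                return False
            # Pyramid depth is limited by how often the smaller image side
            # (assumed to be 1024 pixels here) can be divided by the scale factor
            assumed_min_side = 1024
            max_possible_levels = int(np.log2(assumed_min_side) / np.log2(param_value))
            return dep_value <= max_possible_levels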

        # Example rule for ORB: require edgeThreshold < patchSize. This is stricter
        # than OpenCV's guidance, which only recommends that the two roughly match
        if param_name == "edgeThreshold" and dep_name == "patchSize":
            return param_value < dep_value

        # Default: assume dependency is satisfied if no specific rule
        return True

    def optimize_parameters(
        self,
        objective_function: Callable[[Dict[str, Any]], PerformanceMetrics],
        initial_parameters: Optional[Dict[str, Any]] = None
    ) -> Tuple[Optional[Dict[str, Any]], Optional[PerformanceMetrics]]:
        """
        Optimize algorithm parameters using specified optimization strategy.

        Explores the parameter space iteratively, tracking the best configuration
        found and stopping early after repeated iterations without improvement.

        Args:
            objective_function: Function mapping parameters to performance metrics
            initial_parameters: Starting parameter configuration for optimization

        Returns:
            Tuple of (optimal_parameters, best_performance_metrics); both are None
            if no valid candidate configuration was ever scored
        """
        # Initialize optimization with starting parameters
        if initial_parameters is None:
            # Generate default starting parameters from constraint midpoints
            current_parameters = self._generate_default_parameters()
        else:
            # Validate provided starting parameters
            is_valid, violations = self.validate_parameters(initial_parameters)
            if not is_valid:
                raise ValueError(f"Invalid initial parameters: {violations}")
            current_parameters = initial_parameters.copy()

        # Initialize optimization tracking variables
        iteration = 0
        best_score = float('-inf')
        consecutive_no_improvement = 0

        # Log optimization start
        self.logger.info(f"Starting parameter optimization with strategy: {self.optimization_strategy}")

        # Main optimization loop with convergence monitoring
        while iteration < self.max_iterations:
            # Generate candidate parameters based on optimization strategy
            if self.optimization_strategy == OptimizationStrategy.GRID_SEARCH:
                candidate_parameters = self._grid_search_step(current_parameters, iteration)
            elif self.optimization_strategy == OptimizationStrategy.RANDOM_SEARCH:
                candidate_parameters = self._random_search_step()
            elif self.optimization_strategy == OptimizationStrategy.ADAPTIVE_TUNING:
                candidate_parameters = self._adaptive_tuning_step(current_parameters, iteration)
            else:
                # Default to random perturbation
                candidate_parameters = self._random_perturbation(current_parameters)

            # Validate candidate parameters before evaluation
            is_valid, violations = self.validate_parameters(candidate_parameters)
            if not is_valid:
                # Skip invalid parameter configurations
                self.logger.warning(f"Skipping invalid parameters: {violations}")
                iteration += 1
                continue

            # Evaluate candidate parameters with performance measurement
            try:
                performance = self._evaluate_parameters_with_statistics(
                    objective_function, candidate_parameters
                )
            except Exception as e:
                # Handle evaluation failures gracefully
                self.logger.error(f"Parameter evaluation failed: {e}")
                iteration += 1
                continue

            # Compute composite performance score for optimization
            composite_score = self._compute_composite_score(performance)

            # Update best configuration if improvement found; the flag is computed
            # before best_score changes so the history entry records it correctly
            is_best = composite_score > best_score + self.convergence_threshold
            if is_best:
                # Record new best configuration
                best_score = composite_score
                current_parameters = candidate_parameters.copy()
                self.best_parameters = candidate_parameters.copy()
                self.best_performance = performance
                consecutive_no_improvement = 0

                # Log improvement
                self.logger.info(
                    f"Iteration {iteration}: New best score {composite_score:.6f} "
                    f"with parameters {candidate_parameters}"
                )
            else:
                # Track consecutive iterations without improvement
                consecutive_no_improvement += 1

            # Record optimization history for analysis
            self.optimization_history.append({
                'iteration': iteration,
                'parameters': candidate_parameters.copy(),
                'performance': performance,
                'composite_score': composite_score,
                'is_best': is_best
            })

            # Check for convergence based on consecutive no-improvement
            if consecutive_no_improvement >= 10:
                self.logger.info(f"Optimization converged after {iteration} iterations")
                break

            iteration += 1

        # Log optimization completion
        self.logger.info(
            f"Optimization completed: best score {best_score:.6f} "
            f"after {iteration} iterations"
        )

        return self.best_parameters, self.best_performance

    def _generate_default_parameters(self) -> Dict[str, Any]:
        """
        Generate default parameter configuration from constraint midpoints.

        Returns:
            Dictionary of default parameter values
        """
        # Initialize default parameters dictionary
        default_params = {}

        # Generate midpoint values for each constrained parameter
        for param_name, constraint in self.constraints.items():
            min_bound, max_bound = constraint.bounds

            # Compute midpoint based on parameter type
            if constraint.dtype == int:
                # Integer midpoint (int() truncates toward zero rather than rounding)
                default_params[param_name] = int((min_bound + max_bound) / 2)
            elif constraint.dtype == float:
                # Floating point midpoint
                default_params[param_name] = (min_bound + max_bound) / 2.0
            elif constraint.dtype == bool:
                # Boolean default to True for most algorithm parameters
                default_params[param_name] = True

        return default_params

    def _evaluate_parameters_with_statistics(
        self,
        objective_function: Callable[[Dict[str, Any]], PerformanceMetrics],
        parameters: Dict[str, Any]
    ) -> PerformanceMetrics:
        """
        Evaluate parameter configuration with statistical sampling and analysis.

        Args:
            objective_function: Performance evaluation function
            parameters: Parameter configuration to evaluate

        Returns:
            Aggregated performance metrics with statistical properties
        """
        # Collect multiple performance samples for statistical analysis
        performance_samples = []
        execution_times = []
        memory_usages = []

        # Perform multiple evaluations for statistical significance
        for sample_idx in range(self.performance_samples):
            # Record start time and memory for performance measurement
            start_time = time.perf_counter()
            start_memory = psutil.Process().memory_info().rss

            # Execute objective function with current parameters
            try:
                sample_performance = objective_function(parameters)
                performance_samples.append(sample_performance)
            except Exception as e:
                # Log evaluation error and continue with remaining samples
                self.logger.warning(f"Sample {sample_idx} evaluation failed: {e}")
                continue

            # Record execution time and memory usage; the RSS delta is only a
            # rough estimate and is clamped at zero when memory was released
            execution_time = time.perf_counter() - start_time
            memory_usage = max(0, psutil.Process().memory_info().rss - start_memory)
            execution_times.append(execution_time)
            memory_usages.append(memory_usage)

        # Aggregate performance metrics across samples
        if not performance_samples:
            # Return default metrics if all samples failed
            return PerformanceMetrics(sample_count=0)

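        # Compute statistical aggregates for performance metrics, carrying over
        # the quality metrics reported by the objective function so that the
        # composite score can use them
        aggregated_performance = PerformanceMetrics(
            mean_execution_time=statistics.mean(execution_times),
            std_execution_time=statistics.stdev(execution_times) if len(execution_times) > 1 else 0.0,
            min_execution_time=min(execution_times),
            max_execution_time=max(execution_times),
            peak_memory_usage=max(memory_usages) if memory_usages else 0,
            average_memory_usage=int(statistics.mean(memory_usages)) if memory_usages else 0,
            detection_accuracy=statistics.mean(p.detection_accuracy for p in performance_samples),
            repeatability_score=statistics.mean(p.repeatability_score for p in performance_samples),
            robustness_measure=statistics.mean(p.robustness_measure for p in performance_samples),
            numerical_stability=statistics.mean(p.numerical_stability for p in performance_samples),
            sample_count=len(performance_samples)
        )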

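        # Compute confidence interval for the mean execution time
        if len(performance_samples) > 1:
            # Calculate 95% confidence interval using t-distribution
            sample_std = statistics.stdev(execution_times)
            # Two-sided 95% critical values indexed by degrees of freedom (n - 1);
            # 2.086 (the value for 20 degrees of freedom) approximates larger samples
            t_table = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
                       6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}
            t_critical = t_table.get(len(execution_times) - 1, 2.086)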
            margin_of_error = t_critical * (sample_std / np.sqrt(len(execution_times)))
            mean_time = aggregated_performance.mean_execution_time
            aggregated_performance.confidence_interval = (
                mean_time - margin_of_error,
                mean_time + margin_of_error
            )

        return aggregated_performance

    def _compute_composite_score(self, performance: PerformanceMetrics) -> float:
        """
        Compute composite performance score for optimization objective.

        Args:
            performance: Performance metrics to score

        Returns:
            Composite score for optimization comparison
        """
        # Initialize composite score with base performance metrics
        score = 0.0

        # Weight execution time performance (lower is better)
        if performance.mean_execution_time > 0:
            time_score = 1.0 / (1.0 + performance.mean_execution_time)
            score += 0.3 * time_score

        # Weight memory efficiency (lower usage is better)
        if performance.peak_memory_usage > 0:
            memory_score = 1.0 / (1.0 + performance.peak_memory_usage / 1e6)  # Normalize to MB
            score += 0.2 * memory_score

        # Weight algorithm accuracy (higher is better)
        score += 0.3 * performance.detection_accuracy

        # Weight numerical stability (higher is better)
        score += 0.1 * performance.numerical_stability

        # Weight statistical confidence (higher sample count is better)
        if performance.sample_count > 0:
            confidence_score = min(1.0, performance.sample_count / self.performance_samples)
            score += 0.1 * confidence_score

        return score

    def _grid_search_step(
        self,
        current_parameters: Dict[str, Any],
        iteration: int
    ) -> Dict[str, Any]:
        """
        Generate next parameter configuration using grid search strategy.

        Args:
            current_parameters: Current parameter configuration
            iteration: Current optimization iteration

        Returns:
            Next parameter configuration to evaluate
        """
        # Implement systematic grid search over parameter space
        # This is a simplified implementation - full grid search would be more complex
        candidate_params = current_parameters.copy()

        # Select parameter to modify based on iteration
        param_names = list(self.constraints.keys())
        param_to_modify = param_names[iteration % len(param_names)]

        # Generate grid point for selected parameter
        constraint = self.constraints[param_to_modify]
        min_bound, max_bound = constraint.bounds

        # Create 5-point grid and select next point
        grid_points = np.linspace(min_bound, max_bound, 5)
        grid_index = (iteration // len(param_names)) % len(grid_points)

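        if constraint.dtype == int:
            candidate_params[param_to_modify] = int(grid_points[grid_index])
        elif constraint.dtype == bool:
            # Alternate boolean values instead of assigning a float grid point
            candidate_params[param_to_modify] = bool(grid_index % 2)
        else:
            candidate_params[param_to_modify] = float(grid_points[grid_index])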

        return candidate_params

    def _random_search_step(self) -> Dict[str, Any]:
        """
        Generate random parameter configuration for random search strategy.

        Returns:
            Randomly generated parameter configuration
        """
        # Generate random parameters within constraints
        random_params = {}

        # Sample each parameter randomly from its constraint bounds
        for param_name, constraint in self.constraints.items():
            min_bound, max_bound = constraint.bounds

            # Generate random value based on parameter type, converting NumPy
            # scalars to built-in types so the isinstance checks in validation pass
            if constraint.dtype == int:
                random_params[param_name] = int(np.random.randint(min_bound, max_bound + 1))
            elif constraint.dtype == float:
                random_params[param_name] = float(np.random.uniform(min_bound, max_bound))
            elif constraint.dtype == bool:
                random_params[param_name] = bool(np.random.random() < 0.5)

        return random_params

    def _adaptive_tuning_step(
        self,
        current_parameters: Dict[str, Any],
        iteration: int
    ) -> Dict[str, Any]:
        """
        Generate parameter configuration using adaptive tuning strategy.

        Args:
            current_parameters: Current best parameter configuration
            iteration: Current optimization iteration

        Returns:
            Adaptively tuned parameter configuration
        """
        # Implement adaptive parameter adjustment based on optimization history
        candidate_params = current_parameters.copy()

        # Analyze optimization history for parameter sensitivity
        if len(self.optimization_history) > 5:
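            # Compute parameter sensitivity from recent history
            sensitivity_scores = self._compute_parameter_sensitivity()

            # Fall back to random perturbation when no parameter varied recently,
            # since max() would fail on an empty score dictionary
            if not sensitivity_scores:
                return self._random_perturbation(current_parameters)

            # Focus on most sensitive parameters for tuning
            most_sensitive_param = max(sensitivity_scores.keys(),
                                     key=lambda k: sensitivity_scores[k])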

            # Adjust most sensitive parameter adaptively
            constraint = self.constraints[most_sensitive_param]
            current_value = current_parameters[most_sensitive_param]
            min_bound, max_bound = constraint.bounds

            # Compute adaptive step size based on constraint range
            step_size = (max_bound - min_bound) * 0.1 * (1.0 / (1.0 + iteration * 0.1))

            # Apply random perturbation with adaptive step size
            if constraint.dtype == int:
                perturbation = int(np.random.normal(0, step_size))
                new_value = np.clip(current_value + perturbation, min_bound, max_bound)
                candidate_params[most_sensitive_param] = int(new_value)
            elif constraint.dtype == float:
                perturbation = np.random.normal(0, step_size)
                new_value = np.clip(current_value + perturbation, min_bound, max_bound)
                candidate_params[most_sensitive_param] = float(new_value)
        else:
            # Use random perturbation for initial iterations
            candidate_params = self._random_perturbation(current_parameters)

        return candidate_params

    def _compute_parameter_sensitivity(self) -> Dict[str, float]:
        """
        Compute parameter sensitivity scores from optimization history.

        Returns:
            Dictionary mapping parameter names to sensitivity scores
        """
        # Initialize sensitivity tracking
        sensitivity_scores = defaultdict(float)

        # Analyze recent optimization history
        recent_history = self.optimization_history[-10:]  # Last 10 iterations

        # Compute parameter variance and performance correlation
        for param_name in self.constraints.keys():
            # Extract parameter values and performance scores
            param_values = [entry['parameters'][param_name] for entry in recent_history]
            performance_scores = [entry['composite_score'] for entry in recent_history]

            # Compute correlation between parameter changes and performance
            if len(param_values) > 1 and len(set(param_values)) > 1:
                # Calculate Pearson correlation coefficient
                correlation = np.corrcoef(param_values, performance_scores)[0, 1]
                # Use absolute correlation as sensitivity measure
                sensitivity_scores[param_name] = abs(correlation) if not np.isnan(correlation) else 0.0

        return dict(sensitivity_scores)

    def _random_perturbation(self, base_parameters: Dict[str, Any]) -> Dict[str, Any]:
        """
        Apply random perturbation to base parameter configuration.

        Args:
            base_parameters: Base parameter configuration to perturb

        Returns:
            Perturbed parameter configuration
        """
        # Create copy of base parameters for modification
        perturbed_params = base_parameters.copy()

        # Apply random perturbation to subset of parameters
        param_names = list(self.constraints.keys())
        num_params_to_perturb = max(1, len(param_names) // 3)  # Perturb ~1/3 of parameters

        # Randomly select parameters to perturb
        params_to_perturb = np.random.choice(param_names, num_params_to_perturb, replace=False)

        # Apply perturbation to selected parameters
        for param_name in params_to_perturb:
            constraint = self.constraints[param_name]
            current_value = base_parameters[param_name]
            min_bound, max_bound = constraint.bounds

            # Compute perturbation magnitude as fraction of range
            range_size = max_bound - min_bound
            perturbation_magnitude = range_size * 0.2  # 20% of range

            # Apply type-specific perturbation
            if constraint.dtype == int:
                perturbation = int(np.random.normal(0, perturbation_magnitude))
                new_value = np.clip(current_value + perturbation, min_bound, max_bound)
                perturbed_params[param_name] = int(new_value)
            elif constraint.dtype == float:
                perturbation = np.random.normal(0, perturbation_magnitude)
                new_value = np.clip(current_value + perturbation, min_bound, max_bound)
                perturbed_params[param_name] = float(new_value)
            elif constraint.dtype == bool:
                # Flip boolean with 30% probability
                if np.random.random() < 0.3:
                    perturbed_params[param_name] = not current_value

        return perturbed_params


class ResourceManager:
    """
    Enterprise-grade resource management for computer vision algorithms.

    Implements comprehensive resource monitoring, allocation optimization,
    and cleanup strategies for production deployment environments.
    """

    def __init__(
        self,
        resource_constraints: ResourceConstraints = ResourceConstraints.BALANCED,
        monitoring_interval: float = 1.0,
        cleanup_threshold: float = 0.8
    ) -> None:
        """
        Initialize resource manager with monitoring and constraint configuration.

        Args:
            resource_constraints: Target resource utilization profile
            monitoring_interval: Seconds between resource monitoring updates
            cleanup_threshold: Resource utilization threshold for cleanup trigger
        """
        # Store resource management configuration
        self.resource_constraints = resource_constraints
        self.monitoring_interval = monitoring_interval
        self.cleanup_threshold = cleanup_threshold

        # Initialize resource monitoring state
        self.monitoring_active = False
        self.monitoring_thread: Optional[threading.Thread] = None
        self.resource_history: List[Dict[str, Any]] = []

        # Setup resource tracking locks for thread safety
        self._monitoring_lock = threading.Lock()
        self._cleanup_lock = threading.Lock()

        # Initialize logger for resource management events
        self.logger = logging.getLogger(f"{__name__}.{self.__class__.__name__}")

        # Configure resource limits based on constraints
        self._configure_resource_limits()

    def _configure_resource_limits(self) -> None:
        """Configure resource limits based on constraint profile."""
        # Set memory and CPU limits based on constraint level
        if self.resource_constraints == ResourceConstraints.LOW_MEMORY:
            # Conservative limits for memory-constrained environments
            self.max_memory_usage = 1.0 * 1024**3  # 1GB
            self.max_cpu_cores = 2
            self.enable_aggressive_cleanup = True
        elif self.resource_constraints == ResourceConstraints.BALANCED:
            # Balanced limits for typical production environments
            self.max_memory_usage = 4.0 * 1024**3  # 4GB
            # cpu_count() may return None; always allow at least one core
            self.max_cpu_cores = max(1, (psutil.cpu_count() or 1) // 2)
            self.enable_aggressive_cleanup = False
        elif self.resource_constraints == ResourceConstraints.HIGH_PERFORMANCE:
            # Generous limits for high-performance environments
            self.max_memory_usage = 16.0 * 1024**3  # 16GB
            self.max_cpu_cores = psutil.cpu_count()
            self.enable_aggressive_cleanup = False
        else:  # UNLIMITED
            # No artificial limits for unlimited environments
            self.max_memory_usage = float('inf')
            self.max_cpu_cores = psutil.cpu_count()
            self.enable_aggressive_cleanup = False

    def start_monitoring(self) -> None:
        """Start background resource monitoring thread."""
        # Acquire lock to prevent concurrent monitoring starts
        with self._monitoring_lock:
            # Start monitoring only if not already active
            if not self.monitoring_active:
                # Set monitoring state and create background thread
                self.monitoring_active = True
                self.monitoring_thread = threading.Thread(
                    target=self._monitoring_loop,
                    daemon=True,
                    name="ResourceMonitor"
                )
                # Start monitoring thread
                self.monitoring_thread.start()
                self.logger.info("Resource monitoring started")

    def stop_monitoring(self) -> None:
        """Stop background resource monitoring thread."""
        # Acquire lock to prevent concurrent monitoring stops
        with self._monitoring_lock:
            # Nothing to do if monitoring is not active
            if not self.monitoring_active:
                return
            # Signal monitoring thread to stop
            self.monitoring_active = False
            thread = self.monitoring_thread

        # Wait for the thread outside the lock: the monitoring loop acquires the
        # same lock to record history and would otherwise block until the timeout
        if thread and thread.is_alive():
            thread.join(timeout=5.0)
        self.logger.info("Resource monitoring stopped")

    def _monitoring_loop(self) -> None:
        """Background monitoring loop for resource tracking."""
        # Continue monitoring while active flag is set
        while self.monitoring_active:
            try:
                # Collect current resource usage statistics
                resource_stats = self._collect_resource_stats()

                # Store resource statistics in history
                with self._monitoring_lock:
                    self.resource_history.append(resource_stats)
                    # Limit history size to prevent memory growth
                    if len(self.resource_history) > 1000:
                        self.resource_history = self.resource_history[-500:]

                # Check if cleanup is needed based on resource usage
                if self._should_trigger_cleanup(resource_stats):
                    # Trigger resource cleanup in separate thread
                    cleanup_thread = threading.Thread(
                        target=self._perform_cleanup,
                        daemon=True,
                        name="ResourceCleanup"
                    )
                    cleanup_thread.start()

                # Sleep until next monitoring interval
                time.sleep(self.monitoring_interval)

            except Exception as e:
                # Log monitoring errors but continue operation
                self.logger.error(f"Resource monitoring error: {e}")
                time.sleep(self.monitoring_interval)

    def _collect_resource_stats(self) -> Dict[str, Any]:
        """
        Collect comprehensive system resource usage statistics.

        Returns:
            Dictionary containing current resource usage metrics
        """
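        # Reuse one Process handle: cpu_percent() compares against the previous
        # call on the same object, so a fresh handle would always report 0.0
        if getattr(self, "_process", None) is None:
            self._process = psutil.Process()
        current_process = self._process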

        # Collect memory usage statistics
        memory_info = current_process.memory_info()
        virtual_memory = psutil.virtual_memory()

        # Collect CPU usage statistics
        cpu_percent = current_process.cpu_percent()
        system_cpu_percent = psutil.cpu_percent()

        # Collect GPU usage if available
        gpu_usage = 0.0
        gpu_memory_usage = 0
        if torch.cuda.is_available():
            try:
                # Get GPU utilization and memory usage
                gpu_usage = torch.cuda.utilization()
                gpu_memory_usage = torch.cuda.memory_allocated()
            except Exception:
                # Handle GPU query failures gracefully
                pass

        # Construct comprehensive resource statistics
        resource_stats = {
            'timestamp': time.time(),
            'memory': {
                'rss': memory_info.rss,  # Resident Set Size
                'vms': memory_info.vms,  # Virtual Memory Size
                'percent': current_process.memory_percent(),
                'available': virtual_memory.available,
                'total': virtual_memory.total
            },
            'cpu': {
                'process_percent': cpu_percent,
                'system_percent': system_cpu_percent,
                'core_count': psutil.cpu_count()
            },
            'gpu': {
                'utilization': gpu_usage,
                'memory_allocated': gpu_memory_usage,
                'memory_reserved': torch.cuda.memory_reserved() if torch.cuda.is_available() else 0
            }
        }

        return resource_stats

    def _should_trigger_cleanup(self, resource_stats: Dict[str, Any]) -> bool:
        """
        Determine if resource cleanup should be triggered based on usage.

        Args:
            resource_stats: Current resource usage statistics

        Returns:
            Boolean indicating whether cleanup should be performed
        """
        # Check memory usage threshold
        memory_usage_ratio = resource_stats['memory']['rss'] / self.max_memory_usage
        if memory_usage_ratio > self.cleanup_threshold:
            return True

        # Check GPU memory usage threshold if available
        if torch.cuda.is_available():
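            # Compare against the capacity of the device that memory_allocated() reports on
            device_index = torch.cuda.current_device()
            gpu_memory_ratio = resource_stats['gpu']['memory_allocated'] / torch.cuda.get_device_properties(device_index).total_memory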
            if gpu_memory_ratio > self.cleanup_threshold:
                return True

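        # Check system memory pressure (fraction of physical memory in use system-wide;
        # memory['percent'] is this process's share, so it is not used here)
        memory_total = resource_stats['memory']['total']
        system_memory_ratio = (1.0 - resource_stats['memory']['available'] / memory_total) if memory_total else 0.0
        if system_memory_ratio > 0.9:  # System memory > 90%
            return True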

        return False

    def _perform_cleanup(self) -> None:
        """Perform comprehensive resource cleanup operations."""
        # Acquire cleanup lock to prevent concurrent cleanup
        with self._cleanup_lock:
            try:
                # Log cleanup initiation
                self.logger.info("Initiating resource cleanup")

                # Perform garbage collection for Python objects
                gc.collect()

                # Clear GPU memory cache if available
                if torch.cuda.is_available():
                    torch.cuda.empty_cache()
                    torch.cuda.synchronize()

                # Force garbage collection again after GPU cleanup
                gc.collect()

                # Log cleanup completion
                self.logger.info("Resource cleanup completed")

            except Exception as e:
                # Log cleanup errors but continue operation
                self.logger.error(f"Resource cleanup failed: {e}")

    def get_resource_summary(self) -> Dict[str, Any]:
        """
        Get comprehensive resource usage summary and statistics.

        Returns:
            Dictionary containing resource usage analysis and recommendations
        """
        # Acquire lock to safely access resource history
        with self._monitoring_lock:
            if not self.resource_history:
                return {'status': 'no_data', 'message': 'No monitoring data available'}

            # Compute resource usage statistics from history
            recent_stats = self.resource_history[-10:]  # Last 10 measurements

            # Extract memory usage timeseries
            memory_usage = [stats['memory']['rss'] for stats in recent_stats]
            cpu_usage = [stats['cpu']['process_percent'] for stats in recent_stats]

            # Compute statistical summaries
            resource_summary = {
                'monitoring_duration': len(self.resource_history) * self.monitoring_interval,
                'sample_count': len(self.resource_history),
                'memory': {
                    'current_mb': memory_usage[-1] / 1024**2 if memory_usage else 0,
                    'peak_mb': max(memory_usage) / 1024**2 if memory_usage else 0,
                    'average_mb': statistics.mean(memory_usage) / 1024**2 if memory_usage else 0,
                    'std_mb': statistics.stdev(memory_usage) / 1024**2 if len(memory_usage) > 1 else 0
                },
                'cpu': {
                    'current_percent': cpu_usage[-1] if cpu_usage else 0,
                    'peak_percent': max(cpu_usage) if cpu_usage else 0,
                    'average_percent': statistics.mean(cpu_usage) if cpu_usage else 0,
                    'std_percent': statistics.stdev(cpu_usage) if len(cpu_usage) > 1 else 0
                }
            }

            # Add GPU statistics if available
            if torch.cuda.is_available() and recent_stats:
                gpu_memory = [stats['gpu']['memory_allocated'] for stats in recent_stats]
                resource_summary['gpu'] = {
                    'current_mb': gpu_memory[-1] / 1024**2 if gpu_memory else 0,
                    'peak_mb': max(gpu_memory) / 1024**2 if gpu_memory else 0,
                    'average_mb': statistics.mean(gpu_memory) / 1024**2 if gpu_memory else 0
                }

            # Add resource recommendations
            resource_summary['recommendations'] = self._generate_resource_recommendations(resource_summary)

            return resource_summary

    def _generate_resource_recommendations(self, summary: Dict[str, Any]) -> List[str]:
        """
        Generate resource optimization recommendations based on usage patterns.

        Args:
            summary: Resource usage summary statistics

        Returns:
            List of actionable resource optimization recommendations
        """
        # Initialize recommendations list
        recommendations = []

        # Analyze memory usage patterns
        if summary['memory']['peak_mb'] > 1000:  # > 1GB peak usage
            recommendations.append("Consider reducing batch sizes or image resolution for memory optimization")

        if summary['memory']['std_mb'] > 100:  # High memory variance
            recommendations.append("Memory usage is highly variable - consider implementing memory pooling")

        # Analyze CPU usage patterns
        if summary['cpu']['average_percent'] > 80:  # High CPU usage
            recommendations.append("High CPU utilization detected - consider parallel processing optimization")

        if summary['cpu']['std_percent'] > 20:  # High CPU variance
            recommendations.append("CPU usage is irregular - consider load balancing improvements")

        # Analyze GPU usage if available
        if 'gpu' in summary and summary['gpu']['peak_mb'] > 4000:  # > 4GB GPU usage
            recommendations.append("High GPU memory usage - consider gradient checkpointing or model sharding")

        # Add general recommendations
        if not recommendations:
            recommendations.append("Resource usage appears optimal for current workload")

        return recommendations
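

# Illustrative sketch (not part of the toolkit's API): a simplified restatement
# of the kind of threshold checks made in ResourceManager._should_trigger_cleanup,
# written as a pure function so the decision can be exercised without psutil or
# torch. The name `_example_cleanup_decision`, its parameters, and the default
# thresholds (0.8 and 0.9) are assumptions chosen for illustration only.
def _example_cleanup_decision(
    process_rss: int,
    max_process_rss: int,
    system_available: int,
    system_total: int,
    gpu_allocated: int = 0,
    gpu_total: int = 0,
    cleanup_threshold: float = 0.8,
    system_pressure_limit: float = 0.9,
) -> bool:
    # Process memory relative to the configured budget
    if max_process_rss > 0 and process_rss / max_process_rss > cleanup_threshold:
        return True
    # GPU memory relative to device capacity (skipped when no GPU is present)
    if gpu_total > 0 and gpu_allocated / gpu_total > cleanup_threshold:
        return True
    # System-wide pressure: fraction of physical memory currently in use
    if system_total > 0 and (1.0 - system_available / system_total) > system_pressure_limit:
        return True
    return False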


class DefaultFeatureDetectorFactory:
    """
    Production-grade factory for ORB feature detectors with mathematical optimization.

    Implements systematic parameter tuning, performance profiling, and resource
    management for optimal feature detection performance in quantitative applications.
    """

    def __init__(
        self,
        resource_manager: Optional[ResourceManager] = None,
        enable_optimization: bool = True,
        cache_size: int = 10
    ) -> None:
        """
        Initialize feature detector factory with optimization and caching.

        Args:
            resource_manager: Optional resource management system
            enable_optimization: Whether to enable automatic parameter optimization
            cache_size: Maximum number of detector instances to cache
        """
        # Initialize resource management system
        self.resource_manager = resource_manager or ResourceManager()

        # Configure optimization and caching settings
        self.enable_optimization = enable_optimization
        self.cache_size = cache_size

        # Initialize detector cache for performance optimization
        self._detector_cache: Dict[str, cv2.ORB] = {}
        self._cache_access_count: Dict[str, int] = defaultdict(int)
        self._cache_lock = threading.Lock()

        # Define mathematical constraints for ORB parameters
        self.parameter_constraints = {
            'nfeatures': ParameterConstraints(
                name='nfeatures',
                bounds=(100, 10000),
                dtype=int,
                optimization_objective='maximize',
                performance_weight=0.3,
                computational_cost=1.5
            ),
            'scaleFactor': ParameterConstraints(
                name='scaleFactor',
                bounds=(1.1, 2.0),
                dtype=float,
                optimization_objective='minimize',
                performance_weight=0.2,
                dependencies=['nlevels']
            ),
            'nlevels': ParameterConstraints(
                name='nlevels',
                bounds=(3, 12),
                dtype=int,
                optimization_objective='maximize',
                performance_weight=0.15,
                computational_cost=2.0
            ),
            'edgeThreshold': ParameterConstraints(
                name='edgeThreshold',
                bounds=(10, 50),
                dtype=int,
                optimization_objective='minimize',
                performance_weight=0.1,
                dependencies=['patchSize']
            ),
            'patchSize': ParameterConstraints(
                name='patchSize',
                bounds=(15, 63),
                dtype=int,
                optimization_objective='maximize',
                performance_weight=0.15
            ),
            'fastThreshold': ParameterConstraints(
                name='fastThreshold',
                bounds=(5, 30),
                dtype=int,
                optimization_objective='minimize',
                performance_weight=0.1
            )
        }

        # Initialize parameter optimizer if optimization enabled
        if self.enable_optimization:
            self.optimizer = ParameterOptimizer(
                constraints=self.parameter_constraints,
                optimization_strategy=OptimizationStrategy.ADAPTIVE_TUNING
            )

        # Setup logging for factory operations
        self.logger = logging.getLogger(f"{__name__}.{self.__class__.__name__}")

    def create(
        self,
        nfeatures: int = 500,
        scaleFactor: float = 1.2,
        nlevels: int = 8,
        edgeThreshold: int = 31,
        firstLevel: int = 0,
        WTA_K: int = 2,
        scoreType: int = cv2.ORB_HARRIS_SCORE,
        patchSize: int = 31,
        fastThreshold: int = 20,
        optimize_parameters: bool = False,
        target_performance: Optional[str] = None
    ) -> cv2.ORB:
        """
        Create optimized ORB feature detector with mathematical parameter validation.

        Wraps cv2.ORB_create. ORB (Oriented FAST and Rotated BRIEF) proceeds in four stages:

        1. FAST Corner Detection:
           - Segment test: a pixel p is a corner when a contiguous arc of the
             16-pixel circle around it is entirely brighter than I(p) + fastThreshold
             or entirely darker than I(p) - fastThreshold
           - Ranking: candidates are scored by Harris response (or FAST score,
             per scoreType) and the best nfeatures are retained

        2. Scale Pyramid Construction:
           - Image pyramid: the source image is resampled once per level
           - Scale levels: sₖ = scaleFactor^k for k ∈ {0,1,...,nlevels-1}
           - Computational complexity: O(n·m·nlevels) upper bound for an n×m image;
             the geometric shrinkage keeps the total pixel count below
             n·m / (1 - scaleFactor⁻²)

        3. Orientation Assignment:
           - Patch moments: m_pq = Σ x^p·y^q·I(x,y) over the patch around the keypoint
           - Intensity centroid: C = (m₁₀/m₀₀, m₀₁/m₀₀)
           - Orientation: θ = atan2(m₀₁, m₁₀)

        4. BRIEF Descriptor Computation:
           - Binary tests: τ(p; x,y) := 1 if p(x) < p(y), else 0
           - Rotation compensation: test locations are rotated by θ (steered BRIEF)
           - Descriptor length: 256 bits = 32 bytes

        Mathematical Requirements:
        - scaleFactor ∈ (1,2]: pyramid decimation ratio must exceed 1
        - nlevels ∈ [3,12]: sufficient scales without excessive computation
        - edgeThreshold ≤ patchSize: OpenCV recommends that the border roughly match the patch size (both default to 31)
        - fastThreshold ∈ [5,30]: balance detection sensitivity vs noise

        Args:
            nfeatures: Maximum number of features to detect per image
                      Mathematical constraint: sufficient for statistical analysis
            scaleFactor: Pyramid decimation ratio between consecutive levels
                        Mathematical requirement: scaleFactor > 1.0 for proper sampling
            nlevels: Number of pyramid levels for scale invariance
                    Mathematical trade-off: more levels vs computational cost
            edgeThreshold: Size of border where features are not detected
                          Mathematical constraint: edgeThreshold < min(width,height)/2
            firstLevel: Pyramid level to put source image (typically 0)
                       Mathematical meaning: initial scale σ₀ = scaleFactor^firstLevel
            WTA_K: Number of points compared per oriented BRIEF element (2, 3, or 4)
                  Mathematical impact: 3 or 4 yield 2-bit elements, which require NORM_HAMMING2 matching
            scoreType: Algorithm for ranking features (Harris vs FAST)
                      Mathematical difference: corner response computation method
            patchSize: Size of patch used by oriented BRIEF descriptor
                      Mathematical constraint: odd integer for symmetric patch
            fastThreshold: FAST corner detection threshold
                          Mathematical meaning: minimum intensity difference
            optimize_parameters: Whether to optimize parameters for target performance
            target_performance: Performance optimization target ('speed', 'accuracy', 'balanced')

        Returns:
            Configured ORB detector with mathematical guarantees:
            - Feature count: up to nfeatures keypoints per image
            - Scale invariance: features detected across nlevels scales
            - Rotation invariance: orientation-compensated descriptors
            - Descriptor length: 256-bit binary strings
            - Hamming distance: efficient matching in {0,1}^256 space

        Raises:
            OpenCVInitializationError: If ORB creation fails due to invalid parameters
            ResourceAllocationError: If insufficient memory for detector creation
            ValueError: If parameter constraints are violated
        """
        # Validate input parameters against mathematical constraints
        parameters = {
            'nfeatures': nfeatures,
            'scaleFactor': scaleFactor,
            'nlevels': nlevels,
            'edgeThreshold': edgeThreshold,
            'patchSize': patchSize,
            'fastThreshold': fastThreshold
        }

        # Perform parameter validation if optimization enabled
        if self.enable_optimization:
            is_valid, violations = self.optimizer.validate_parameters(parameters)
            if not is_valid:
                raise ValueError(f"Invalid ORB parameters: {violations}")

        # Optimize parameters if requested and optimization enabled
        if optimize_parameters and self.enable_optimization:
            # Define optimization objective function
            def objective_function(params: Dict[str, Any]) -> PerformanceMetrics:
                return self._evaluate_detector_performance(params, target_performance)

            # Perform parameter optimization
            optimal_params, _ = self.optimizer.optimize_parameters(
                objective_function, parameters
            )

            # Update parameters with optimized values
            nfeatures = optimal_params.get('nfeatures', nfeatures)
            scaleFactor = optimal_params.get('scaleFactor', scaleFactor)
            nlevels = optimal_params.get('nlevels', nlevels)
            edgeThreshold = optimal_params.get('edgeThreshold', edgeThreshold)
            patchSize = optimal_params.get('patchSize', patchSize)
            fastThreshold = optimal_params.get('fastThreshold', fastThreshold)

        # Generate cache key for detector configuration
        cache_key = self._generate_cache_key(
            nfeatures, scaleFactor, nlevels, edgeThreshold,
            firstLevel, WTA_K, scoreType, patchSize, fastThreshold
        )

        # Check cache for existing detector instance
        with self._cache_lock:
            if cache_key in self._detector_cache:
                # Increment access count and return cached detector
                self._cache_access_count[cache_key] += 1
                self.logger.debug(f"Retrieved ORB detector from cache: {cache_key}")
                return self._detector_cache[cache_key]

        # Create new ORB detector with comprehensive error handling
        try:
            # Validate scaleFactor mathematical constraint
            if scaleFactor <= 1.0:
                raise ValueError(f"scaleFactor must be > 1.0, got {scaleFactor}")

            # Validate pyramid level mathematical constraint
            if nlevels < 1:
                raise ValueError(f"nlevels must be >= 1, got {nlevels}")

            # Validate edge threshold vs patch size constraint; equality is allowed
            # because the default arguments set both to 31
            if edgeThreshold > patchSize:
                raise ValueError(f"edgeThreshold ({edgeThreshold}) must be <= patchSize ({patchSize})")

            # Validate patch size mathematical constraint (must be odd)
            if patchSize % 2 == 0:
                patchSize += 1  # Ensure odd patch size
                self.logger.warning(f"Adjusted patchSize to {patchSize} (must be odd)")

            # Record memory usage before detector creation
            memory_before = psutil.Process().memory_info().rss

            # Create ORB detector with validated parameters
            detector = cv2.ORB_create(
                nfeatures=nfeatures,
                scaleFactor=scaleFactor,
                nlevels=nlevels,
                edgeThreshold=edgeThreshold,
                firstLevel=firstLevel,
                WTA_K=WTA_K,
                scoreType=scoreType,
                patchSize=patchSize,
                fastThreshold=fastThreshold
            )

            # Validate detector creation success
            if detector is None:
                raise OpenCVInitializationError("ORB detector creation returned None")

            # Record memory usage after detector creation
            memory_after = psutil.Process().memory_info().rss
            memory_delta = memory_after - memory_before

            # Log successful detector creation with memory usage
            self.logger.info(
                f"Created ORB detector: features={nfeatures}, scale={scaleFactor}, "
                f"levels={nlevels}, memory_delta={memory_delta/1024:.1f}KB"
            )

            # Cache detector instance for reuse
            with self._cache_lock:
                # Evict the least frequently used detector if the cache is full
                # (eviction is driven by access counts, not by recency)
                if self._detector_cache and len(self._detector_cache) >= self.cache_size:
                    # Find the cached detector with the fewest recorded accesses
                    lfu_key = min(self._detector_cache.keys(),
                                  key=lambda k: self._cache_access_count[k])
                    # Remove that detector from the cache
                    del self._detector_cache[lfu_key]
                    self._cache_access_count.pop(lfu_key, None)
                    self.logger.debug(f"Evicted least frequently used detector from cache: {lfu_key}")

                # Add new detector to cache (skipped when caching is disabled)
                if self.cache_size > 0:
                    self._detector_cache[cache_key] = detector
                    self._cache_access_count[cache_key] = 1

            return detector

        except cv2.error as e:
            # Handle OpenCV-specific errors with detailed context
            error_context = {
                'opencv_error': str(e),
                'parameters': parameters,
                'memory_available': psutil.virtual_memory().available
            }
            raise OpenCVInitializationError(
                f"OpenCV ORB creation failed: {e}",
                algorithm_context=error_context
            ) from e

        except MemoryError as e:
            # Handle memory allocation failures with resource context
            memory_info = psutil.virtual_memory()
            error_context = {
                'memory_total': memory_info.total,
                'memory_available': memory_info.available,
                'memory_percent': memory_info.percent,
                'requested_features': nfeatures
            }
            raise ResourceAllocationError(
                f"Insufficient memory for ORB detector: {e}",
                resource_type="memory",
                algorithm_context=error_context
            ) from e

    def _generate_cache_key(self, *args) -> str:
        """
        Generate unique cache key for detector configuration.

        Args:
            *args: Detector configuration parameters

        Returns:
            Unique string key for cache storage
        """
        # Convert parameters to string representation for hashing
        param_str = "_".join(str(arg) for arg in args)

        # Generate hash-based cache key for efficient lookup
        import hashlib
        cache_key = hashlib.md5(param_str.encode()).hexdigest()[:16]

        return cache_key

    def _evaluate_detector_performance(
        self,
        parameters: Dict[str, Any],
        target_performance: Optional[str] = None
    ) -> PerformanceMetrics:
        """
        Evaluate detector performance for parameter optimization.

        Args:
            parameters: Detector parameter configuration
            target_performance: Performance optimization target

        Returns:
            Performance metrics for optimization objective
        """
        # Create test detector with specified parameters
        try:
            test_detector = cv2.ORB_create(
                nfeatures=parameters['nfeatures'],
                scaleFactor=parameters['scaleFactor'],
                nlevels=parameters['nlevels'],
                edgeThreshold=parameters['edgeThreshold'],
                patchSize=parameters['patchSize'],
                fastThreshold=parameters['fastThreshold']
            )
        except Exception:
            # Return poor performance metrics if detector creation fails
            return PerformanceMetrics(
                mean_execution_time=float('inf'),
                detection_accuracy=0.0,
                numerical_stability=0.0
            )

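        # Generate a reproducible 640x480 (width x height) test image;
        # NumPy shapes are (rows, cols), and a fixed seed keeps evaluations comparable
        test_image = np.random.default_rng(0).integers(0, 256, size=(480, 640), dtype=np.uint8)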

        # Measure detection performance
        execution_times = []
        keypoint_counts = []

        # Perform multiple detection runs for statistical analysis
        for _ in range(5):
            start_time = time.perf_counter()
            keypoints, descriptors = test_detector.detectAndCompute(test_image, None)
            execution_time = time.perf_counter() - start_time

            execution_times.append(execution_time)
            keypoint_counts.append(len(keypoints) if keypoints else 0)

        # Compute performance metrics
        performance = PerformanceMetrics(
            mean_execution_time=statistics.mean(execution_times),
            std_execution_time=statistics.stdev(execution_times) if len(execution_times) > 1 else 0.0,
            detection_accuracy=statistics.mean(keypoint_counts) / parameters['nfeatures'],
            numerical_stability=1.0 / (1.0 + statistics.stdev(keypoint_counts)) if len(keypoint_counts) > 1 else 1.0,
            sample_count=len(execution_times)
        )

        # Adjust metrics based on target performance
        if target_performance == 'speed':
            # Prioritize execution speed over detection quality
            performance.detection_accuracy *= 0.7  # Reduce accuracy weight
        elif target_performance == 'accuracy':
            # Prioritize detection quality over speed
            performance.mean_execution_time *= 0.7  # Reduce execution-time weight

        return performance

    def clear_cache(self) -> None:
        """Clear detector cache to free memory resources."""
        # Acquire cache lock for thread-safe operation
        with self._cache_lock:
            # Clear cache dictionaries
            self._detector_cache.clear()
            self._cache_access_count.clear()

            # Log cache clearing operation
            self.logger.info("Detector cache cleared")

    def get_cache_statistics(self) -> Dict[str, Any]:
        """
        Get detector cache usage statistics.

        Returns:
            Dictionary containing cache performance metrics
        """
        # Acquire cache lock for thread-safe access
        with self._cache_lock:
            # Compute cache statistics
            cache_stats = {
                'cache_size': len(self._detector_cache),
                'max_cache_size': self.cache_size,
                'total_access_count': sum(self._cache_access_count.values()),
                'cache_utilization': len(self._detector_cache) / self.cache_size if self.cache_size > 0 else 0.0
            }

            # Add access pattern analysis if cache has entries
            if self._cache_access_count:
                access_counts = list(self._cache_access_count.values())
                cache_stats.update({
                    'mean_access_count': statistics.mean(access_counts),
                    'max_access_count': max(access_counts),
                    'min_access_count': min(access_counts)
                })

            return cache_stats
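

# Illustrative sketch (not part of the toolkit's API): a bounded cache with
# true least-recently-used eviction, for comparison with the access-count
# eviction in DefaultFeatureDetectorFactory.create. `_ExampleLRUCache` is a
# hypothetical name; it relies only on collections.OrderedDict and is not
# thread-safe on its own (a caller would wrap it in a lock).
from collections import OrderedDict as _OrderedDict


class _ExampleLRUCache:
    def __init__(self, max_size: int) -> None:
        self.max_size = max_size
        self._entries = _OrderedDict()

    def get(self, key):
        # A hit moves the key to the most-recent end of the ordering
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)
        return self._entries[key]

    def put(self, key, value) -> None:
        if self.max_size <= 0:
            return  # Caching disabled
        self._entries[key] = value
        self._entries.move_to_end(key)
        # Evict from the least-recent end until the cache fits its capacity
        while len(self._entries) > self.max_size:
            self._entries.popitem(last=False)

    def __len__(self) -> int:
        return len(self._entries)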


class DefaultMatcherFactory:
    """
    Production-grade factory for BFMatcher instances with optimization and profiling.

    Implements systematic matcher configuration, performance analysis, and
    resource optimization for high-throughput matching applications.
    """

    def __init__(
        self,
        resource_manager: Optional[ResourceManager] = None,
        enable_profiling: bool = True
    ) -> None:
        """
        Initialize matcher factory with performance profiling capabilities.

        Args:
            resource_manager: Optional resource management system
            enable_profiling: Whether to enable matcher performance profiling
        """
        # Initialize resource management system
        self.resource_manager = resource_manager or ResourceManager()
        self.enable_profiling = enable_profiling

        # Initialize performance tracking
        self.performance_history: List[Dict[str, Any]] = []
        self._profiling_lock = threading.Lock()

        # Setup logging for factory operations
        self.logger = logging.getLogger(f"{__name__}.{self.__class__.__name__}")

    def create(
        self,
        normType: int = cv2.NORM_HAMMING,
        crossCheck: bool = True,
        optimize_for_throughput: bool = False
    ) -> cv2.BFMatcher:
        """
        Create optimized BFMatcher with mathematical validation and profiling.

        Implements brute-force matching in Hamming space with mathematical foundations:

        1. Hamming Distance Computation:
           - Binary XOR: d_H(x,y) = popcount(x ⊕ y) for binary descriptors
           - L2 Distance: d_L2(x,y) = √(Σᵢ(xᵢ-yᵢ)²) for floating-point descriptors
           - L1 Distance: d_L1(x,y) = Σᵢ|xᵢ-yᵢ| for Manhattan distance

        2. Brute-Force Search Algorithm:
           - Exhaustive comparison: O(n·m·k) where n,m are descriptor counts, k is length
           - Nearest neighbour: each query descriptor is paired with its minimum-distance train descriptor
           - Memory complexity: O(n·m) for distance matrix storage

        3. Cross-Check Validation:
           - Bidirectional consistency: (qᵢ,tⱼ) is kept only if tⱼ is the nearest
             neighbour of qᵢ and qᵢ is the nearest neighbour of tⱼ
           - False positive reduction: one-sided matches are discarded; the size of
             the reduction depends on the data
           - Precision improvement: higher confidence at cost of recall

        Mathematical Properties:
        - Metric space: Hamming, L1, and L2 distances satisfy the triangle inequality
          (NORM_L2SQR does not, but it preserves the L2 nearest-neighbour ranking)
        - Optimality: brute-force guarantees the exact nearest neighbour for each query
        - Symmetry: cross-check ensures symmetric matching relation

        Args:
            normType: Distance norm type for descriptor comparison
                     cv2.NORM_HAMMING: binary descriptors (ORB, BRIEF)
                     cv2.NORM_L2: floating-point descriptors (SIFT, SURF)
                     cv2.NORM_L1: Manhattan distance for specific applications
            crossCheck: Enable bidirectional consistency validation
                       Mathematical effect: improves precision, reduces recall
            optimize_for_throughput: Configure for high-throughput applications
                                   Trade-off: may sacrifice match quality for speed

        Returns:
            Configured BFMatcher with the following properties:
            - Exact matching: each query receives its true nearest neighbour
            - Metric compliance: holds for the Hamming, L1, and L2 norms (not NORM_L2SQR)
            - Cross-check consistency: bidirectional match validation if enabled
            - Performance optimization: configured for target application

        Raises:
            OpenCVInitializationError: If matcher creation fails
            ValueError: If normType is invalid for intended descriptor type
        """
        # Validate norm type parameter against mathematical requirements
        valid_norm_types = {
            cv2.NORM_HAMMING,     # Binary descriptors (ORB, BRIEF)
            cv2.NORM_HAMMING2,    # ORB descriptors built with WTA_K = 3 or 4
            cv2.NORM_L1,          # Manhattan distance
            cv2.NORM_L2,          # Euclidean distance
            cv2.NORM_L2SQR        # Squared Euclidean distance
        }

        if normType not in valid_norm_types:
            raise ValueError(f"Invalid normType: {normType}. Valid options: {valid_norm_types}")

        # Apply throughput optimizations if requested
        if optimize_for_throughput:
            # Disable cross-check for faster matching (trade precision for speed)
            crossCheck = False
            self.logger.info("Disabled cross-check for throughput optimization")

        # Record matcher creation start time for profiling
        creation_start_time = time.perf_counter()
        memory_before = psutil.Process().memory_info().rss

        try:
            # Create BFMatcher with specified configuration
            matcher = cv2.BFMatcher(normType=normType, crossCheck=crossCheck)

            # Validate matcher creation success
            if matcher is None:
                raise OpenCVInitializationError("BFMatcher creation returned None")

            # Record creation performance metrics
            creation_time = time.perf_counter() - creation_start_time
            memory_after = psutil.Process().memory_info().rss
            memory_delta = memory_after - memory_before

            # Store performance metrics if profiling enabled
            if self.enable_profiling:
                performance_record = {
                    'timestamp': time.time(),
                    'creation_time': creation_time,
                    'memory_delta': memory_delta,
                    'norm_type': normType,
                    'cross_check': crossCheck,
                    'optimized_for_throughput': optimize_for_throughput
                }

                # Thread-safe performance history update
                with self._profiling_lock:
                    self.performance_history.append(performance_record)
                    # Limit history size to prevent memory growth
                    if len(self.performance_history) > 1000:
                        self.performance_history = self.performance_history[-500:]

            # Log successful matcher creation
            self.logger.info(
                f"Created BFMatcher: norm={normType}, cross_check={crossCheck}, "
                f"creation_time={creation_time*1000:.2f}ms, memory_delta={memory_delta/1024:.1f}KB"
            )

            return matcher

        except cv2.error as e:
            # Handle OpenCV-specific errors with detailed context
            error_context = {
                'opencv_error': str(e),
                'norm_type': normType,
                'cross_check': crossCheck,
                'memory_available': psutil.virtual_memory().available
            }
            raise OpenCVInitializationError(
                f"OpenCV BFMatcher creation failed: {e}",
                algorithm_context=error_context
            ) from e

    def benchmark_matcher_performance(
        self,
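        descriptor_sizes: Optional[List[Tuple[int, int]]] = None,
        norm_types: Optional[List[int]] = None,
        cross_check_options: Optional[List[bool]] = None
    ) -> Dict[str, Any]:
        """
        Benchmark matcher performance across descriptor sizes, norm types, and cross-check settings.

        Args:
            descriptor_sizes: List of (query_count, train_count) tuples for testing.
                Defaults to [(100, 100), (500, 500), (1000, 1000)].
            norm_types: List of norm types to benchmark.
                Defaults to [cv2.NORM_HAMMING, cv2.NORM_L2].
            cross_check_options: List of cross-check settings to test.
                Defaults to [True, False].

        Returns:
            Dictionary with the keys 'configurations', 'performance_matrix',
            and 'recommendations'
        """
        # Resolve defaults here to avoid mutable default arguments
        if descriptor_sizes is None:
            descriptor_sizes = [(100, 100), (500, 500), (1000, 1000)]
        if norm_types is None:
            norm_types = [cv2.NORM_HAMMING, cv2.NORM_L2]
        if cross_check_options is None:
            cross_check_options = [True, False]

        # Initialize benchmark results storage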
        benchmark_results = {
            'configurations': [],
            'performance_matrix': {},
            'recommendations': []
        }

        # Log benchmark start
        self.logger.info(f"Starting matcher performance benchmark with {len(descriptor_sizes)} size configurations")

        # Benchmark each configuration combination
        for query_count, train_count in descriptor_sizes:
            for norm_type in norm_types:
                for cross_check in cross_check_options:
                    # Create test configuration
                    config_name = f"q{query_count}_t{train_count}_n{norm_type}_x{cross_check}"

                    try:
                        # Benchmark this configuration
                        config_performance = self._benchmark_single_configuration(
                            query_count, train_count, norm_type, cross_check
                        )

                        # Store configuration results
                        benchmark_results['configurations'].append({
                            'name': config_name,
                            'query_count': query_count,
                            'train_count': train_count,
                            'norm_type': norm_type,
                            'cross_check': cross_check,
                            'performance': config_performance
                        })

                        # Store in performance matrix for analysis
                        benchmark_results['performance_matrix'][config_name] = config_performance

                    except Exception as e:
                        # Log benchmark failures but continue with other configurations
                        self.logger.error(f"Benchmark failed for {config_name}: {e}")

        # Generate performance recommendations
        benchmark_results['recommendations'] = self._generate_performance_recommendations(
            benchmark_results['configurations']
        )

        # Log benchmark completion
        self.logger.info(f"Matcher benchmark completed: {len(benchmark_results['configurations'])} configurations tested")

        return benchmark_results

    def _benchmark_single_configuration(
        self,
        query_count: int,
        train_count: int,
        norm_type: int,
        cross_check: bool
    ) -> Dict[str, float]:
        """
        Benchmark single matcher configuration with synthetic data.

        Args:
            query_count: Number of query descriptors
            train_count: Number of train descriptors
            norm_type: Distance norm type
            cross_check: Cross-check validation setting

        Returns:
            Dictionary containing performance metrics
        """
        # Create matcher for benchmarking
        matcher = self.create(norm_type, cross_check)

        # Generate synthetic test data based on norm type
        if norm_type in [cv2.NORM_HAMMING, cv2.NORM_HAMMING2]:
            # Binary descriptors for Hamming distance
            query_descriptors = np.random.randint(0, 256, (query_count, 32), dtype=np.uint8)
            train_descriptors = np.random.randint(0, 256, (train_count, 32), dtype=np.uint8)
        else:
            # Floating-point descriptors for L2/L1 distance
            query_descriptors = np.random.randn(query_count, 128).astype(np.float32)
            train_descriptors = np.random.randn(train_count, 128).astype(np.float32)

        # Benchmark matching performance with multiple runs
        execution_times = []
        memory_usages = []
        match_counts = []

        # Perform multiple benchmark runs for statistical analysis
        for run_idx in range(5):
            # Record memory before matching
            memory_before = psutil.Process().memory_info().rss

            # Measure matching execution time
            start_time = time.perf_counter()
            matches = matcher.match(query_descriptors, train_descriptors)
            execution_time = time.perf_counter() - start_time

            # Record memory after matching
            memory_after = psutil.Process().memory_info().rss
            memory_usage = memory_after - memory_before

            # Store performance measurements
            execution_times.append(execution_time)
            memory_usages.append(memory_usage)
            match_counts.append(len(matches))

        # Compute performance statistics (means are computed once and reused)
        mean_time = statistics.mean(execution_times)
        mean_memory = statistics.mean(memory_usages)
        mean_matches = statistics.mean(match_counts)

        performance_metrics = {
            'mean_execution_time': mean_time,
            'std_execution_time': statistics.stdev(execution_times) if len(execution_times) > 1 else 0.0,
            'min_execution_time': min(execution_times),
            'max_execution_time': max(execution_times),
            'mean_memory_usage': mean_memory,
            'mean_match_count': mean_matches,
            # Guard against a zero timer reading on very small inputs
            'throughput_matches_per_second': mean_matches / mean_time if mean_time > 0 else 0.0,
            'memory_efficiency': mean_matches / max(1, mean_memory / 1024)  # matches per KB
        }

        return performance_metrics

    def _generate_performance_recommendations(
        self,
        configurations: List[Dict[str, Any]]
    ) -> List[str]:
        """
        Generate performance optimization recommendations based on benchmark results.

        Args:
            configurations: List of benchmarked configuration results

        Returns:
            List of actionable performance recommendations
        """
        # Initialize recommendations list
        recommendations = []

        # Analyze configurations if data available
        if not configurations:
            recommendations.append("No benchmark data available for analysis")
            return recommendations

        # Find best performing configurations for different metrics
        best_speed = min(configurations, key=lambda c: c['performance']['mean_execution_time'])
        best_memory = min(configurations, key=lambda c: c['performance']['mean_memory_usage'])
        best_throughput = max(configurations, key=lambda c: c['performance']['throughput_matches_per_second'])

        # Generate speed optimization recommendations
        recommendations.append(
            f"For fastest matching: use configuration {best_speed['name']} "
            f"(execution time: {best_speed['performance']['mean_execution_time']*1000:.2f}ms)"
        )

        # Generate memory optimization recommendations
        recommendations.append(
            f"For lowest memory usage: use configuration {best_memory['name']} "
            f"(memory: {best_memory['performance']['mean_memory_usage']/1024:.1f}KB)"
        )

        # Generate throughput optimization recommendations
        recommendations.append(
            f"For highest throughput: use configuration {best_throughput['name']} "
            f"(throughput: {best_throughput['performance']['throughput_matches_per_second']:.1f} matches/sec)"
        )

        # Analyze cross-check impact
        cross_check_configs = [c for c in configurations if c['cross_check']]
        no_cross_check_configs = [c for c in configurations if not c['cross_check']]

        if cross_check_configs and no_cross_check_configs:
            avg_speed_with_cross_check = statistics.mean([c['performance']['mean_execution_time'] for c in cross_check_configs])
            avg_speed_without_cross_check = statistics.mean([c['performance']['mean_execution_time'] for c in no_cross_check_configs])

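            # Relative speed-up from disabling cross-check; guard against a zero baseline
            speed_improvement = (
                (avg_speed_with_cross_check - avg_speed_without_cross_check)
                / avg_speed_with_cross_check * 100
                if avg_speed_with_cross_check > 0 else 0.0
            )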

            if speed_improvement > 10:
                recommendations.append(
                    f"Disabling cross-check improves speed by {speed_improvement:.1f}% "
                    f"but may reduce match precision"
                )

        return recommendations

    def get_performance_summary(self) -> Dict[str, Any]:
        """
        Get comprehensive performance summary from profiling history.

        Returns:
            Dictionary containing performance analysis and trends
        """
        # Acquire profiling lock for thread-safe access
        with self._profiling_lock:
            if not self.performance_history:
                return {'status': 'no_data', 'message': 'No profiling data available'}

            # Extract performance metrics from history
            creation_times = [record['creation_time'] for record in self.performance_history]
            memory_deltas = [record['memory_delta'] for record in self.performance_history]

            # Compute statistical summary
            performance_summary = {
                'total_matchers_created': len(self.performance_history),
                'creation_time': {
                    'mean_ms': statistics.mean(creation_times) * 1000,
                    'std_ms': statistics.stdev(creation_times) * 1000 if len(creation_times) > 1 else 0.0,
                    'min_ms': min(creation_times) * 1000,
                    'max_ms': max(creation_times) * 1000
                },
                'memory_usage': {
                    'mean_kb': statistics.mean(memory_deltas) / 1024,
                    'std_kb': statistics.stdev(memory_deltas) / 1024 if len(memory_deltas) > 1 else 0.0,
                    'total_kb': sum(memory_deltas) / 1024
                },
                'configuration_analysis': self._analyze_configuration_trends()
            }

            return performance_summary

    def _analyze_configuration_trends(self) -> Dict[str, Any]:
        """
        Analyze configuration usage trends from performance history.

        Returns:
            Dictionary containing configuration usage analysis
        """
        # Count configuration usage patterns
        norm_type_counts = defaultdict(int)
        cross_check_counts = defaultdict(int)
        throughput_opt_counts = defaultdict(int)

        # Analyze configuration patterns in history
        for record in self.performance_history:
            norm_type_counts[record['norm_type']] += 1
            cross_check_counts[record['cross_check']] += 1
            throughput_opt_counts[record['optimized_for_throughput']] += 1

        # Generate configuration analysis
        configuration_analysis = {
            'most_used_norm_type': max(norm_type_counts.keys(), key=lambda k: norm_type_counts[k]) if norm_type_counts else None,
            'cross_check_usage_percent': (cross_check_counts[True] / len(self.performance_history)) * 100 if self.performance_history else 0,
            'throughput_optimization_percent': (throughput_opt_counts[True] / len(self.performance_history)) * 100 if self.performance_history else 0,
            'norm_type_distribution': dict(norm_type_counts),
            'configuration_diversity': len(set((r['norm_type'], r['cross_check']) for r in self.performance_history))
        }

        return configuration_analysis
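

# ---------------------------------------------------------------------------
# Illustrative sketch only; not part of the toolkit API.
# The profiling history above is capped by slicing the list once it grows past
# 1000 records. A minimal alternative, assuming a fixed cap is acceptable, is
# collections.deque with maxlen, which drops the oldest record on each append.
# The names _example_bounded_history and _example_history are hypothetical.
# ---------------------------------------------------------------------------
from collections import deque
from typing import Any, Deque, Dict


def _example_bounded_history(max_records: int = 500) -> Deque[Dict[str, Any]]:
    """Return an empty history that keeps at most max_records entries."""
    return deque(maxlen=max_records)


# Usage: with a cap of 3, only the three most recent records are retained.
_example_history = _example_bounded_history(max_records=3)
for _example_index in range(5):
    _example_history.append({'index': _example_index})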


class DefaultClipModelLoader:
    """
    Enterprise-grade CLIP model loader with advanced optimization and monitoring.

    Implements sophisticated model lifecycle management, performance optimization,
    and resource utilization tracking for production deployment environments.
    """

    def __init__(
        self,
        resource_manager: Optional[ResourceManager] = None,
        enable_model_caching: bool = True,
        cache_directory: Optional[Path] = None,
        enable_performance_monitoring: bool = True
    ) -> None:
        """
        Initialize CLIP model loader with enterprise-grade capabilities.

        Args:
            resource_manager: Optional resource management system
            enable_model_caching: Whether to enable model caching for reuse
            cache_directory: Directory for model cache storage
            enable_performance_monitoring: Whether to monitor loading performance
        """
        # Initialize resource management system
        self.resource_manager = resource_manager or ResourceManager()

        # Configure model caching system
        self.enable_model_caching = enable_model_caching
        self.cache_directory = cache_directory or Path.home() / ".clip_cache"

        # Initialize performance monitoring
        self.enable_performance_monitoring = enable_performance_monitoring
        self.loading_history: List[Dict[str, Any]] = []
        self._monitoring_lock = threading.Lock()

        # Initialize model cache for loaded models
        self._model_cache: Dict[str, Tuple[torch.nn.Module, Callable]] = {}
        self._cache_lock = threading.Lock()

        # Setup logging for loader operations
        self.logger = logging.getLogger(f"{__name__}.{self.__class__.__name__}")

        # Create cache directory if caching enabled
        if self.enable_model_caching:
            self.cache_directory.mkdir(parents=True, exist_ok=True)

    def __call__(
        self,
        model_name: str,
        device: str,
        enable_optimization: bool = True,
        precision: str = "float32"
    ) -> Tuple[torch.nn.Module, Callable[[Any], torch.Tensor]]:
        """
        Load CLIP model with comprehensive optimization and monitoring.

        Implements enterprise-grade model loading with mathematical foundations:

        1. Vision Transformer Architecture:
           - Multi-head attention: Attention(Q,K,V) = softmax(QK^T/√d_k)V
           - Layer normalization: LN(x) = γ(x-μ)/σ + β
           - Position embeddings: learnable spatial position encoding
           - Patch embeddings: linear projection of image patches

        2. Contrastive Learning Framework:
           - Joint embedding space: f_v: Images → ℝ^d, f_t: Text → ℝ^d
           - Cosine similarity: sim = (f_v(I)·f_t(T))/(||f_v(I)||·||f_t(T)||)
           - InfoNCE loss with temperature τ: ℒ = -log(exp(sim_pos/τ)/Σⱼexp(sim_j/τ))

        3. Model Optimization (applied when enable_optimization is True):
           - Compilation: torch.compile on non-CPU devices when available
           - Reduced precision: optional float16/bfloat16 weights, TF32 kernels on CUDA
           - Inference mode: model.eval() with gradients disabled on all parameters

        Performance Characteristics:
        - ViT-B/32: ~151M parameters, 512-dim embeddings, 224×224 input
        - ViT-L/14: ~428M parameters, 768-dim embeddings, 224×224 input
        - Memory: 2-8GB GPU depending on model size and batch size
        - Inference: 10-50ms per image depending on hardware

        Args:
            model_name: CLIP model variant identifier
                       Supported: "ViT-B/32", "ViT-B/16", "ViT-L/14", "RN50", "RN101"
                       Mathematical specs:
                       - ViT-B/32: 12 layers, 768 hidden, 32×32 patches
                       - ViT-L/14: 24 layers, 1024 hidden, 14×14 patches
            device: Target computational device ("cpu", "cuda", "cuda:N", "mps")
                   Mathematical consideration: affects precision and memory layout
            enable_optimization: Whether to apply performance optimizations
                               Trade-off: memory usage vs inference speed
            precision: Numerical precision ("float32", "float16", "bfloat16")
                      Mathematical impact: accuracy vs memory/speed trade-off

        Returns:
            Tuple containing optimized model and preprocessing function:
            - model: torch.nn.Module with mathematical properties:
              * encode_image: ℝ^(B×3×H×W) → ℝ^(B×d_embed)
              * encode_text: ℤ^(B×L) token ids → ℝ^(B×d_embed)
              * Outputs are not L2-normalized; divide by ||embed||₂ before
                computing cosine similarity
            - preprocess: Callable implementing mathematical transformations:
              * Resize: bicubic interpolation to the model's input resolution
              * Center crop to a square of that resolution
              * Tensor conversion: PIL image → torch.Tensor
              * Normalize: (x - μ)/σ with CLIP's own channel statistics

        Raises:
            ModelLoadError: If model loading fails with detailed context
            ValueError: If model_name or device specification is invalid
            RuntimeError: If insufficient resources or device incompatibility
        """
        # Validate model name against supported variants
        supported_models = {
            "ViT-B/32": {"layers": 12, "hidden": 768, "patch_size": 32, "params": 151e6},
            "ViT-B/16": {"layers": 12, "hidden": 768, "patch_size": 16, "params": 149e6},
            "ViT-L/14": {"layers": 24, "hidden": 1024, "patch_size": 14, "params": 428e6},
            "RN50": {"architecture": "ResNet50", "params": 102e6},
            "RN101": {"architecture": "ResNet101", "params": 119e6}
        }

        if model_name not in supported_models:
            raise ValueError(f"Unsupported model: {model_name}. Supported: {list(supported_models.keys())}")

        # Validate device specification
        if not self._validate_device(device):
            raise ValueError(f"Invalid device specification: {device}")

        # Generate cache key for model configuration
        cache_key = f"{model_name}_{device}_{precision}_{enable_optimization}"

        # Check model cache for existing instance
        with self._cache_lock:
            if cache_key in self._model_cache:
                # Return cached model instance
                self.logger.debug(f"Retrieved CLIP model from cache: {cache_key}")
                return self._model_cache[cache_key]

        # Record loading start time and resource state
        loading_start_time = time.perf_counter()
        memory_before = psutil.Process().memory_info().rss
        gpu_memory_before = torch.cuda.memory_allocated() if torch.cuda.is_available() else 0

        try:
            # Load model with comprehensive error handling
            self.logger.info(f"Loading CLIP model: {model_name} on {device}")

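            # Load base model and preprocessing function; store downloaded
            # weights in the configured cache directory when caching is enabled
            download_root = str(self.cache_directory) if self.enable_model_caching else None
            model, preprocess = clip.load(
                model_name, device=device, jit=False, download_root=download_root
            )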

            # Validate model loading success before touching the model
            if model is None or preprocess is None:
                raise ModelLoadError("CLIP model or preprocess function is None")

            # Apply precision conversion; reduced precision only on CUDA targets
            on_cuda = device.startswith("cuda")
            if precision == "float16" and on_cuda:
                model = model.half()
            elif precision == "bfloat16" and on_cuda:
                model = model.bfloat16()
            elif precision == "float32":
                # clip.load keeps float16 weights on non-CPU devices, so cast explicitly
                model = model.float()

            # Apply optimization once the dtype is final
            if enable_optimization:
                model = self._optimize_model(model, device, precision)

            # Record loading performance metrics
            loading_time = time.perf_counter() - loading_start_time
            memory_after = psutil.Process().memory_info().rss
            gpu_memory_after = torch.cuda.memory_allocated() if torch.cuda.is_available() else 0

            memory_delta = memory_after - memory_before
            gpu_memory_delta = gpu_memory_after - gpu_memory_before

            # Store performance metrics if monitoring enabled
            if self.enable_performance_monitoring:
                performance_record = {
                    'timestamp': time.time(),
                    'model_name': model_name,
                    'device': device,
                    'precision': precision,
                    'optimization_enabled': enable_optimization,
                    'loading_time': loading_time,
                    'memory_delta': memory_delta,
                    'gpu_memory_delta': gpu_memory_delta,
                    'model_parameters': supported_models[model_name].get('params', 0)
                }

                # Thread-safe monitoring history update
                with self._monitoring_lock:
                    self.loading_history.append(performance_record)
                    # Limit history size to prevent memory growth
                    if len(self.loading_history) > 100:
                        self.loading_history = self.loading_history[-50:]

            # Cache model instance for reuse
            with self._cache_lock:
                self._model_cache[cache_key] = (model, preprocess)

            # Log successful model loading with performance metrics
            self.logger.info(
                f"CLIP model loaded successfully: {model_name}, "
                f"loading_time={loading_time:.2f}s, "
                f"memory_delta={memory_delta/1024/1024:.1f}MB, "
                f"gpu_memory_delta={gpu_memory_delta/1024/1024:.1f}MB"
            )

            return model, preprocess

        except torch.cuda.OutOfMemoryError as e:
            # Handle GPU memory exhaustion with specific context.
            # This handler must come first: OutOfMemoryError subclasses
            # RuntimeError and would otherwise be caught by the generic handler.
            gpu_memory_info = {}
            if torch.cuda.is_available():
                gpu_memory_info = {
                    'allocated': torch.cuda.memory_allocated(),
                    'cached': torch.cuda.memory_reserved(),
                    'max_allocated': torch.cuda.max_memory_allocated(),
                    'total_memory': torch.cuda.get_device_properties(0).total_memory
                }

            error_context = {
                'model_name': model_name,
                'device': device,
                'precision': precision,
                'gpu_memory_info': gpu_memory_info,
                'suggested_solutions': [
                    'Reduce batch size',
                    'Use float16 precision',
                    'Enable gradient checkpointing',
                    'Use CPU device'
                ]
            }

            raise ModelLoadError(
                f"GPU out of memory loading {model_name}: {e}",
                model_name=model_name,
                algorithm_context=error_context
            ) from e

        except (OSError, RuntimeError, ModuleNotFoundError) as e:
            # Handle model loading failures with comprehensive context
            error_context = {
                'model_name': model_name,
                'device': device,
                'precision': precision,
                'optimization_enabled': enable_optimization,
                'loading_time': time.perf_counter() - loading_start_time,
                'available_memory': psutil.virtual_memory().available,
                'gpu_available': torch.cuda.is_available(),
                'gpu_memory_available': torch.cuda.get_device_properties(0).total_memory if torch.cuda.is_available() else 0
            }

            raise ModelLoadError(
                f"CLIP model loading failed for {model_name} on {device}: {e}",
                model_name=model_name,
                algorithm_context=error_context
            ) from e

    def _validate_device(self, device: str) -> bool:
        """
        Validate device specification against available hardware.

        Args:
            device: Device specification string

        Returns:
            Boolean indicating whether device is valid and available
        """
        # Validate CPU device
        if device == "cpu":
            return True

        # Validate CUDA devices
        if device.startswith("cuda"):
            if not torch.cuda.is_available():
                return False

            # Parse device index if specified
            if ":" in device:
                try:
                    device_index = int(device.split(":")[1])
                    return device_index < torch.cuda.device_count()
                except (ValueError, IndexError):
                    return False

            return True  # Generic "cuda" device

        # Validate MPS device (Apple Silicon)
        if device == "mps":
            return torch.backends.mps.is_available() if hasattr(torch.backends, 'mps') else False

        # Unknown device specification
        return False

    def _optimize_model(
        self,
        model: torch.nn.Module,
        device: str,
        precision: str
    ) -> torch.nn.Module:
        """
        Apply comprehensive model optimizations for production deployment.

        Args:
            model: Base CLIP model to optimize
            device: Target device for optimization
            precision: Target precision for optimization

        Returns:
            Optimized model with performance enhancements
        """
        # Apply model compilation if available (PyTorch 2.0+)
        if hasattr(torch, 'compile') and device != "cpu":
            try:
                # Compile model for optimized execution
                model = torch.compile(model, mode='max-autotune')
                self.logger.info("Applied torch.compile optimization")
            except Exception as e:
                # Log compilation failure but continue
                self.logger.warning(f"Model compilation failed: {e}")

        # Enable TF32 kernels on CUDA. TF32 affects float32 matmul and
        # convolution only, so it is applied when the target precision is float32.
        if device.startswith("cuda") and precision == "float32":
            try:
                if hasattr(torch.backends.cudnn, 'allow_tf32'):
                    torch.backends.cudnn.allow_tf32 = True
                    torch.backends.cuda.matmul.allow_tf32 = True
                    self.logger.info("Enabled TF32 optimization for CUDA")
            except Exception as e:
                self.logger.warning(f"TF32 optimization failed: {e}")

        # Set model to evaluation mode for inference optimization
        model.eval()

        # Disable gradient computation for inference-only deployment
        for param in model.parameters():
            param.requires_grad = False

        return model

    def get_loading_statistics(self) -> Dict[str, Any]:
        """
        Get comprehensive model loading performance statistics.

        Returns:
            Dictionary containing loading performance analysis
        """
        # Acquire monitoring lock for thread-safe access
        with self._monitoring_lock:
            if not self.loading_history:
                return {'status': 'no_data', 'message': 'No loading history available'}

            # Extract performance metrics from history
            loading_times = [record['loading_time'] for record in self.loading_history]
            memory_deltas = [record['memory_delta'] for record in self.loading_history]
            gpu_memory_deltas = [record['gpu_memory_delta'] for record in self.loading_history]

            # Compute statistical summary
            loading_statistics = {
                'total_models_loaded': len(self.loading_history),
                'loading_performance': {
                    'mean_time_seconds': statistics.mean(loading_times),
                    'std_time_seconds': statistics.stdev(loading_times) if len(loading_times) > 1 else 0.0,
                    'min_time_seconds': min(loading_times),
                    'max_time_seconds': max(loading_times)
                },
                'memory_usage': {
                    'mean_delta_mb': statistics.mean(memory_deltas) / 1024 / 1024,
                    'total_memory_mb': sum(memory_deltas) / 1024 / 1024,
                    'peak_delta_mb': max(memory_deltas) / 1024 / 1024
                },
                'gpu_memory_usage': {
                    'mean_delta_mb': statistics.mean(gpu_memory_deltas) / 1024 / 1024 if gpu_memory_deltas else 0,
                    'total_gpu_memory_mb': sum(gpu_memory_deltas) / 1024 / 1024,
                    'peak_gpu_delta_mb': max(gpu_memory_deltas) / 1024 / 1024 if gpu_memory_deltas else 0
                },
                'model_analysis': self._analyze_model_usage_patterns(),
                'performance_trends': self._analyze_performance_trends()
            }

            return loading_statistics

    def _analyze_model_usage_patterns(self) -> Dict[str, Any]:
        """
        Analyze model usage patterns from loading history.

        Returns:
            Dictionary containing model usage analysis
        """
        # Count model usage patterns
        model_counts = defaultdict(int)
        device_counts = defaultdict(int)
        precision_counts = defaultdict(int)

        # Analyze usage patterns in history
        for record in self.loading_history:
            model_counts[record['model_name']] += 1
            device_counts[record['device']] += 1
            precision_counts[record['precision']] += 1

        # Generate usage analysis
        usage_analysis = {
            'most_used_model': max(model_counts.keys(), key=lambda k: model_counts[k]) if model_counts else None,
            'most_used_device': max(device_counts.keys(), key=lambda k: device_counts[k]) if device_counts else None,
            'most_used_precision': max(precision_counts.keys(), key=lambda k: precision_counts[k]) if precision_counts else None,
            'model_distribution': dict(model_counts),
            'device_distribution': dict(device_counts),
            'precision_distribution': dict(precision_counts),
            'unique_configurations': len(set((r['model_name'], r['device'], r['precision']) for r in self.loading_history))
        }

        return usage_analysis

    def _analyze_performance_trends(self) -> Dict[str, Any]:
        """
        Analyze performance trends over time from loading history.

        Returns:
            Dictionary containing performance trend analysis
        """
        # Check if sufficient data for trend analysis
        if len(self.loading_history) < 5:
            return {'status': 'insufficient_data', 'message': 'Need at least 5 loading records for trend analysis'}

        # Extract time-series data
        timestamps = [record['timestamp'] for record in self.loading_history]
        loading_times = [record['loading_time'] for record in self.loading_history]

        # Compute recent vs historical performance
        recent_records = self.loading_history[-5:]  # Last 5 loads
        historical_records = self.loading_history[:-5]  # All previous loads

        recent_avg_time = statistics.mean([r['loading_time'] for r in recent_records])
        historical_avg_time = statistics.mean([r['loading_time'] for r in historical_records]) if historical_records else recent_avg_time

        # Compute trend analysis
        performance_change = (recent_avg_time - historical_avg_time) / historical_avg_time * 100 if historical_avg_time > 0 else 0

        trend_analysis = {
            'recent_average_time': recent_avg_time,
            'historical_average_time': historical_avg_time,
            'performance_change_percent': performance_change,
            'trend_direction': 'improving' if performance_change < 0 else 'degrading' if performance_change > 0 else 'stable',
            'data_points': len(self.loading_history),
            'time_span_hours': (timestamps[-1] - timestamps[0]) / 3600 if len(timestamps) > 1 else 0
        }

        return trend_analysis

    def clear_model_cache(self) -> None:
        """Clear model cache to free memory resources."""
        # Acquire cache lock for thread-safe operation
        with self._cache_lock:
            # Clear cached models and force garbage collection
            for model, _ in self._model_cache.values():
                # Move model to CPU before deletion to free GPU memory
                if hasattr(model, 'cpu'):
                    model.cpu()

            # Clear cache dictionary
            self._model_cache.clear()

            # Force garbage collection and GPU cache clearing
            gc.collect()
            if torch.cuda.is_available():
                torch.cuda.empty_cache()

            # Log cache clearing operation
            self.logger.info("Model cache cleared and GPU memory freed")

    def get_cache_statistics(self) -> Dict[str, Any]:
        """
        Get model cache usage statistics and memory analysis.

        Returns:
            Dictionary containing cache performance metrics
        """
        # Acquire cache lock for thread-safe access
        with self._cache_lock:
            # Compute cache statistics
            cache_stats = {
                'cached_models': len(self._model_cache),
                'cache_keys': list(self._model_cache.keys()),
                'memory_analysis': self._analyze_cache_memory_usage()
            }

            return cache_stats

    def _analyze_cache_memory_usage(self) -> Dict[str, Any]:
        """
        Analyze memory usage of cached models.

        Returns:
            Dictionary containing memory usage analysis
        """
        # Initialize memory analysis
        total_parameters = 0
        total_gpu_memory = 0
        model_details = []

        # Analyze each cached model
        for cache_key, (model, _) in self._model_cache.items():
            # Count model parameters
            param_count = sum(p.numel() for p in model.parameters())
            total_parameters += param_count

            # Estimate GPU memory usage (a model without parameters has no device to inspect)
            first_param = next(model.parameters(), None)
            gpu_memory = 0
            if first_param is not None and first_param.is_cuda:
                # Estimate memory from parameter count and dtype
                for param in model.parameters():
                    gpu_memory += param.numel() * param.element_size()

            total_gpu_memory += gpu_memory

            # Store model details
            model_details.append({
                'cache_key': cache_key,
                'parameters': param_count,
                'gpu_memory_bytes': gpu_memory,
                'device': str(first_param.device) if first_param is not None else 'unknown'
            })

        # Construct memory analysis
        memory_analysis = {
            'total_parameters': total_parameters,
            'total_gpu_memory_mb': total_gpu_memory / 1024 / 1024,
            'average_parameters_per_model': total_parameters / len(self._model_cache) if self._model_cache else 0,
            'model_details': model_details,
            'memory_efficiency_score': self._compute_memory_efficiency_score()
        }

        return memory_analysis

    def _compute_memory_efficiency_score(self) -> float:
        """
        Compute memory efficiency score for cache performance.

        Returns:
            Efficiency score between 0.0 and 1.0
        """
        # Base efficiency on cache utilization and memory overhead
        if not self._model_cache:
            return 1.0  # Empty cache is perfectly efficient

        # Compute efficiency factors
        cache_utilization = len(self._model_cache) / 10.0  # Assume optimal cache size is 10
        cache_utilization = min(cache_utilization, 1.0)

        # Memory efficiency based on model reuse: every repeat of an already-seen
        # (model, device, precision) configuration counts as a cache hit
        total_loads = len(self.loading_history)
        cache_hits = total_loads - len({
            (record['model_name'], record['device'], record['precision'])
            for record in self.loading_history
        })
        reuse_efficiency = cache_hits / total_loads if total_loads > 0 else 1.0

        # Composite efficiency score
        efficiency_score = (cache_utilization * 0.6 + reuse_efficiency * 0.4)

        return efficiency_score
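
The composite score above can be checked by hand with a small standalone sketch. The function name `memory_efficiency_score` and the sample history are hypothetical; the weights (0.6 and 0.4) and the assumed optimal cache size of 10 are taken from the method above. Configurations are compared as tuples here, which is an illustrative choice rather than a statement about the toolkit's internals.

```python
from typing import Dict, List


def memory_efficiency_score(
    cached_model_count: int,
    loading_history: List[Dict[str, str]],
    optimal_cache_size: int = 10,  # assumption carried over from the method above
) -> float:
    """Standalone sketch of the 0.6 / 0.4 weighted efficiency score."""
    # An empty cache is treated as perfectly efficient
    if cached_model_count == 0:
        return 1.0

    # Utilization saturates at 1.0 once the assumed optimal size is reached
    cache_utilization = min(cached_model_count / optimal_cache_size, 1.0)

    # Every load beyond the first for a given configuration counts as a hit
    total_loads = len(loading_history)
    distinct_configs = {
        (r["model_name"], r["device"], r["precision"]) for r in loading_history
    }
    cache_hits = total_loads - len(distinct_configs)
    reuse_efficiency = cache_hits / total_loads if total_loads else 1.0

    return cache_utilization * 0.6 + reuse_efficiency * 0.4


# Hypothetical history: four loads covering two distinct configurations
history = [
    {"model_name": "resnet50", "device": "cuda", "precision": "fp16"},
    {"model_name": "resnet50", "device": "cuda", "precision": "fp16"},
    {"model_name": "vit_b16", "device": "cpu", "precision": "fp32"},
    {"model_name": "vit_b16", "device": "cpu", "precision": "fp32"},
]
score = memory_efficiency_score(cached_model_count=2, loading_history=history)
print(round(score, 2))  # 0.2 * 0.6 + 0.5 * 0.4
```

With two cached models and two repeated loads out of four, utilization is 0.2 and reuse efficiency is 0.5, so the score is dominated by how full the cache is rather than by how often it is hit.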


# Result Data Structures

class ConfidenceLevel(Enum):
    """Statistical confidence levels for quantitative analysis."""
    LOW = 0.90
    MEDIUM = 0.95
    HIGH = 0.99
    ULTRA_HIGH = 0.999


class StatisticalSignificance(Enum):
    """P-value thresholds for statistical significance testing."""
    NOT_SIGNIFICANT = 0.05
    SIGNIFICANT = 0.01
    HIGHLY_SIGNIFICANT = 0.001
    EXTREMELY_SIGNIFICANT = 0.0001


class ValidationPolicy(Enum):
    """Validation policies for null value and constraint handling."""
    STRICT = "strict"           # Reject any null values or constraint violations
    PERMISSIVE = "permissive"   # Allow null values with warnings
    ADAPTIVE = "adaptive"       # Context-dependent validation based on use case
    PRODUCTION = "production"   # Enterprise-grade validation with comprehensive logging


@dataclass(frozen=True)
class StatisticalProperties:
    """
    Comprehensive statistical properties for quantitative result analysis.

    Implements rigorous statistical foundations for confidence intervals,
    hypothesis testing, and uncertainty quantification in image similarity metrics.
    """

    # Sample statistics for population parameter estimation
    sample_size: int = field(default=1)

    # Central tendency measures with mathematical precision
    mean: float = field(default=0.0)
    median: float = field(default=0.0)
    mode: Optional[float] = field(default=None)

    # Variability measures for uncertainty quantification
    variance: float = field(default=0.0)
    standard_deviation: float = field(default=0.0)
    standard_error: float = field(default=0.0)

    # Distribution shape characteristics
    skewness: float = field(default=0.0)  # Asymmetry measure: E[(X-μ)³]/σ³
    kurtosis: float = field(default=0.0)  # Tail heaviness: E[(X-μ)⁴]/σ⁴ - 3

    # Confidence intervals for parameter estimation
    confidence_level: float = field(default=0.95)
    confidence_interval_lower: float = field(default=0.0)
    confidence_interval_upper: float = field(default=0.0)

    # Hypothesis testing results
    p_value: Optional[float] = field(default=None)
    test_statistic: Optional[float] = field(default=None)
    degrees_of_freedom: Optional[int] = field(default=None)

    # Distribution fitting results
    distribution_type: Optional[str] = field(default=None)
    distribution_parameters: Dict[str, float] = field(default_factory=dict)
    goodness_of_fit: Optional[float] = field(default=None)

    def __post_init__(self) -> None:
        """
        Validate statistical properties for mathematical consistency.

        Implements comprehensive validation of statistical relationships and
        mathematical constraints for reliable quantitative analysis.
        """
        # Validate sample size is positive for meaningful statistics
        if self.sample_size <= 0:
            raise ValueError(f"Sample size must be positive, got {self.sample_size}")

        # Validate variance is non-negative (mathematical requirement)
        if self.variance < 0:
            raise ValueError(f"Variance must be non-negative, got {self.variance}")

        # Validate standard deviation consistency with variance
        expected_std = np.sqrt(self.variance)
        if abs(self.standard_deviation - expected_std) > 1e-10:
            warnings.warn(f"Standard deviation {self.standard_deviation} inconsistent with variance {self.variance}")

        # Validate standard error mathematical relationship: SE = σ/√n
        if self.sample_size > 1:
            expected_se = self.standard_deviation / np.sqrt(self.sample_size)
            if abs(self.standard_error - expected_se) > 1e-10:
                warnings.warn(f"Standard error {self.standard_error} inconsistent with formula SE = σ/√n")

        # Validate confidence level is in valid probability range
        if not 0.0 < self.confidence_level < 1.0:
            raise ValueError(f"Confidence level must be in (0,1), got {self.confidence_level}")

        # Validate confidence interval ordering
        if self.confidence_interval_lower > self.confidence_interval_upper:
            raise ValueError(
                f"Confidence interval lower bound {self.confidence_interval_lower} "
                f"exceeds upper bound {self.confidence_interval_upper}"
            )

        # Validate p-value is in valid probability range if specified
        if self.p_value is not None and not 0.0 <= self.p_value <= 1.0:
            raise ValueError(f"P-value must be in [0,1], got {self.p_value}")

    def compute_confidence_interval(
        self,
        confidence_level: float = 0.95,
        distribution: str = "t"
    ) -> Tuple[float, float]:
        """
        Compute confidence interval using specified distribution and confidence level.

        Mathematical Foundation:
        - t-distribution: CI = μ ± t_{α/2,df} × SE where SE = σ/√n and df = n - 1
        - Normal distribution: CI = μ ± z_{α/2} × SE
        - α = 1 - confidence_level is the two-tailed significance level

        Args:
            confidence_level: Desired confidence level 1 - α ∈ (0,1)
            distribution: Statistical distribution ("t" or "normal"); bootstrap
                intervals need the raw sample, which this class does not store

        Returns:
            Tuple of (lower_bound, upper_bound) for confidence interval
        """
        # Validate confidence level parameter
        if not 0.0 < confidence_level < 1.0:
            raise ValueError(f"Confidence level must be in (0,1), got {confidence_level}")

        # Validate sufficient sample size for meaningful confidence intervals
        if self.sample_size < 2:
            raise ValueError("Confidence interval requires sample size ≥ 2")

        # Compute alpha level for two-tailed test
        alpha = 1.0 - confidence_level
        alpha_half = alpha / 2.0

        # Select critical value based on distribution type
        if distribution == "t":
            # Student's t-distribution for small samples or unknown population variance
            df = self.sample_size - 1
            critical_value = stats.t.ppf(1 - alpha_half, df)
        elif distribution == "normal":
            # Standard normal distribution for large samples (Central Limit Theorem)
            critical_value = stats.norm.ppf(1 - alpha_half)
        else:
            raise ValueError(f"Unsupported distribution: {distribution}")

        # Compute margin of error: E = critical_value × standard_error
        margin_of_error = critical_value * self.standard_error

        # Construct confidence interval: CI = mean ± margin_of_error
        lower_bound = self.mean - margin_of_error
        upper_bound = self.mean + margin_of_error

        return lower_bound, upper_bound

    def test_normality(self) -> Tuple[float, float]:
        """
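        Report a Shapiro-Wilk-style normality result for the summarized sample.

        Mathematical Foundation:
        - Null hypothesis H₀: data follows normal distribution
        - Alternative hypothesis H₁: data does not follow normal distribution
        - Test statistic W based on correlation between data and normal quantiles

        Returns:
            Tuple of (test_statistic, p_value). Stored values are returned when
            available; otherwise the pair is a fixed heuristic placeholder chosen
            from skewness and excess kurtosis, not a computed Shapiro-Wilk result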
        """
        # Normality testing requires sufficient sample size
        if self.sample_size < 3:
            raise ValueError("Normality test requires sample size ≥ 3")

        # Use stored test results if available
        if self.test_statistic is not None and self.p_value is not None:
            return self.test_statistic, self.p_value

        # Cannot perform normality test without raw data
        # Return conservative estimates based on distribution parameters
        if abs(self.skewness) < 0.5 and abs(self.kurtosis) < 0.5:
            # Approximately normal based on shape parameters
            return 0.95, 0.1  # Conservative normal assumption
        else:
            # Non-normal based on shape parameters
            return 0.85, 0.01  # Evidence against normality

    def compute_effect_size(self, reference_mean: float, reference_std: float) -> float:
        """
        Compute Cohen's d effect size for practical significance assessment.

        Mathematical Foundation:
        - Cohen's d = (μ₁ - μ₂) / σ_pooled, with σ_pooled = √((σ₁² + σ₂²) / 2)
        - Small effect: |d| ≈ 0.2, Medium effect: |d| ≈ 0.5, Large effect: |d| ≈ 0.8
        - Measures practical significance beyond statistical significance

        Args:
            reference_mean: Reference population mean for comparison
            reference_std: Reference population standard deviation

        Returns:
            Cohen's d effect size measure
        """
        # Validate reference parameters
        if reference_std <= 0:
            raise ValueError("Reference standard deviation must be positive")

        # Compute pooled standard deviation for effect size calculation
        pooled_std = np.sqrt((self.variance + reference_std**2) / 2)

        # Compute Cohen's d effect size
        cohens_d = (self.mean - reference_mean) / pooled_std

        return cohens_d

    def assess_statistical_power(
        self,
        effect_size: float,
        alpha: float = 0.05,
        alternative: str = "two-sided"
    ) -> float:
        """
        Assess statistical power for hypothesis testing.

        Mathematical Foundation:
        - Power = P(reject H₀ | H₁ is true) = 1 - β
        - Depends on effect size, sample size, significance level, and test type
        - Estimated here with a normal (z-test) approximation, non-centrality d × √n
        - Power ≥ 0.8 considered adequate for reliable inference

        Args:
            effect_size: Expected effect size (Cohen's d)
            alpha: Type I error rate (significance level)
            alternative: Test type ("two-sided", "greater", "less")

        Returns:
            Statistical power estimate [0,1]
        """
        # Validate input parameters
        if not 0.0 < alpha < 1.0:
            raise ValueError("Alpha must be in (0,1)")

        # Compute non-centrality parameter for power calculation
        ncp = effect_size * np.sqrt(self.sample_size)

        # Compute critical values based on alternative hypothesis
        if alternative == "two-sided":
            critical_value = stats.norm.ppf(1 - alpha/2)
            power = 1 - stats.norm.cdf(critical_value - ncp) + stats.norm.cdf(-critical_value - ncp)
        elif alternative == "greater":
            critical_value = stats.norm.ppf(1 - alpha)
            power = 1 - stats.norm.cdf(critical_value - ncp)
        elif alternative == "less":
            critical_value = stats.norm.ppf(alpha)
            power = stats.norm.cdf(critical_value - ncp)
        else:
            raise ValueError(f"Invalid alternative hypothesis: {alternative}")

        return power
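
The normal-distribution branch of `compute_confidence_interval` and the two-sided case of `assess_statistical_power` can be reproduced with the standard library alone. The sketch below swaps `scipy.stats.norm` for `statistics.NormalDist`, so it does not cover the t-distribution branch; the function names and the sample numbers are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist
from typing import Tuple


def normal_confidence_interval(
    mean: float, std: float, n: int, confidence_level: float = 0.95
) -> Tuple[float, float]:
    """CI = mean ± z_{α/2} × SE, with SE = std / √n and α = 1 - confidence_level."""
    alpha = 1.0 - confidence_level
    z_critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    margin_of_error = z_critical * std / sqrt(n)
    return mean - margin_of_error, mean + margin_of_error


def two_sided_power(effect_size: float, n: int, alpha: float = 0.05) -> float:
    """Normal-approximation power for a two-sided test with non-centrality d × √n."""
    z = NormalDist()
    critical = z.inv_cdf(1.0 - alpha / 2.0)
    ncp = effect_size * sqrt(n)
    # Probability of landing in either rejection region under the alternative
    return 1.0 - z.cdf(critical - ncp) + z.cdf(-critical - ncp)


# Hypothetical similarity scores: mean 0.80, standard deviation 0.10, 25 samples
lower, upper = normal_confidence_interval(mean=0.80, std=0.10, n=25)
power = two_sided_power(effect_size=0.5, n=32)
print(f"95% CI: ({lower:.4f}, {upper:.4f}), power: {power:.3f}")
```

For small samples the t-based interval is wider than this normal-based one, which is why the method above defaults to `distribution="t"`.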


class ResultValidationMixin:
    """
    Mixin class providing comprehensive validation capabilities for result structures.

    Implements enterprise-grade validation with configurable policies, detailed
    error reporting, and statistical consistency checking for production deployment.
    """

    @classmethod
    def get_validation_policy(cls) -> ValidationPolicy:
        """Get current validation policy for result validation."""
        return getattr(cls, '_validation_policy', ValidationPolicy.PRODUCTION)

    @classmethod
    def set_validation_policy(cls, policy: ValidationPolicy) -> None:
        """Set validation policy for all instances of this result type."""
        cls._validation_policy = policy

    def validate_constraints(self, strict: bool = True) -> Tuple[bool, List[str]]:
        """
        Validate result constraints with configurable strictness levels.

        Implements comprehensive constraint checking including mathematical bounds,
        statistical consistency, and business logic validation for reliable analysis.

        Args:
            strict: Intended strictness switch; this implementation does not read
                it, and None handling follows the class-level validation policy

        Returns:
            Tuple of (is_valid, violation_messages) for constraint compliance
        """
        # Initialize validation state
        is_valid = True
        violations = []

        # Get validation policy for error handling strategy
        policy = self.get_validation_policy()

        # Validate each field according to its type and constraints
        for field_info in fields(self):
            field_name = field_info.name
            field_value = getattr(self, field_name)
            field_type = field_info.type

            # Skip validation for None values based on policy
            if field_value is None:
                if policy == ValidationPolicy.STRICT:
                    violations.append(f"Field {field_name} cannot be None in strict mode")
                    is_valid = False
                elif policy == ValidationPolicy.PERMISSIVE:
                    # Log warning but continue validation
                    logging.getLogger(__name__).warning(f"Field {field_name} is None")
                continue

            # Validate numerical bounds and constraints
            if isinstance(field_value, (int, float)):
                # Check for invalid numerical values
                if not np.isfinite(field_value):
                    violations.append(f"Field {field_name} has invalid numerical value: {field_value}")
                    is_valid = False

                # Apply field-specific constraints
                constraint_violations = self._validate_field_constraints(field_name, field_value)
                if constraint_violations:
                    violations.extend(constraint_violations)
                    is_valid = False

            # Validate collection types and their elements
            elif isinstance(field_value, (list, tuple)):
                # Validate collection is not empty if required
                if len(field_value) == 0 and self._is_required_collection(field_name):
                    violations.append(f"Required collection {field_name} cannot be empty")
                    is_valid = False

                # Validate collection element types and constraints
                element_violations = self._validate_collection_elements(field_name, field_value)
                if element_violations:
                    violations.extend(element_violations)
                    is_valid = False

            # Validate string fields for format and content
            elif isinstance(field_value, str):
                string_violations = self._validate_string_field(field_name, field_value)
                if string_violations:
                    violations.extend(string_violations)
                    is_valid = False

        # Validate cross-field constraints and relationships
        relationship_violations = self._validate_field_relationships()
        if relationship_violations:
            violations.extend(relationship_violations)
            is_valid = False

        # Apply policy-specific error handling
        if not is_valid and policy == ValidationPolicy.PRODUCTION:
            # Log detailed validation failures for production debugging
            logger = logging.getLogger(f"{__name__}.{self.__class__.__name__}")
            logger.error(f"Validation failed with {len(violations)} violations: {violations}")

        return is_valid, violations

    def _validate_field_constraints(self, field_name: str, field_value: Union[int, float]) -> List[str]:
        """
        Validate field-specific mathematical and business constraints.

        Args:
            field_name: Name of field being validated
            field_value: Numerical value to validate

        Returns:
            List of constraint violation messages
        """
        # Initialize violations list
        violations = []

        # Apply common numerical constraints
        if field_name.endswith('_ratio') or field_name.endswith('_score'):
            # Ratio and score fields should be in [0,1] range
            if not 0.0 <= field_value <= 1.0:
                violations.append(f"{field_name} must be in [0,1], got {field_value}")

        if field_name.endswith('_percent') or field_name.endswith('_percentage'):
            # Percentage fields should be in [0,100] range
            if not 0.0 <= field_value <= 100.0:
                violations.append(f"{field_name} must be in [0,100], got {field_value}")

        if field_name.endswith('_count') or field_name.startswith('num_'):
            # Count fields should be non-negative integers
            if not isinstance(field_value, int) or field_value < 0:
                violations.append(f"{field_name} must be non-negative integer, got {field_value}")

        if field_name.endswith('_time') or field_name.endswith('_duration'):
            # Time fields should be non-negative
            if field_value < 0:
                violations.append(f"{field_name} must be non-negative, got {field_value}")

        if field_name.endswith('_distance'):
            # Distance fields should be non-negative
            if field_value < 0:
                violations.append(f"{field_name} must be non-negative, got {field_value}")

        return violations

    def _validate_collection_elements(self, field_name: str, collection: Union[List, Tuple]) -> List[str]:
        """
        Validate elements within collection fields.

        Args:
            field_name: Name of collection field
            collection: Collection to validate

        Returns:
            List of element validation violations
        """
        # Initialize violations list
        violations = []

        # Validate URL collections
        if 'url' in field_name.lower():
            for idx, url in enumerate(collection):
                if not isinstance(url, str):
                    violations.append(f"{field_name}[{idx}] must be string, got {type(url)}")
                elif not self._is_valid_url(url):
                    violations.append(f"{field_name}[{idx}] is not a valid URL: {url}")

        # Validate numerical collections
        if any(keyword in field_name.lower() for keyword in ['score', 'distance', 'similarity']):
            for idx, value in enumerate(collection):
                if not isinstance(value, (int, float)):
                    violations.append(f"{field_name}[{idx}] must be numerical, got {type(value)}")
                elif not np.isfinite(value):
                    violations.append(f"{field_name}[{idx}] has invalid numerical value: {value}")

        return violations

    def _validate_string_field(self, field_name: str, field_value: str) -> List[str]:
        """
        Validate string field format and content.

        Args:
            field_name: Name of string field
            field_value: String value to validate

        Returns:
            List of string validation violations
        """
        # Initialize violations list
        violations = []

        # Validate string is not empty if required
        if self._is_required_string(field_name) and not field_value.strip():
            violations.append(f"Required string field {field_name} cannot be empty")

        # Validate URL format
        if 'url' in field_name.lower() and not self._is_valid_url(field_value):
            violations.append(f"{field_name} is not a valid URL: {field_value}")

        # Validate title/text length constraints
        if 'title' in field_name.lower() and len(field_value) > 200:
            violations.append(f"{field_name} exceeds maximum length of 200 characters")

        if 'text' in field_name.lower() and len(field_value) > 1000:
            violations.append(f"{field_name} exceeds maximum length of 1000 characters")

        return violations

    def _validate_field_relationships(self) -> List[str]:
        """
        Validate cross-field relationships and mathematical consistency.

        Returns:
            List of relationship validation violations
        """
        # Base implementation - subclasses should override for specific relationships
        return []

    def _is_required_collection(self, field_name: str) -> bool:
        """Check if collection field is required to be non-empty."""
        # Define required collections based on field naming conventions
        required_collections = ['similar_image_urls', 'keypoints', 'matches']
        return any(required in field_name.lower() for required in required_collections)

    def _is_required_string(self, field_name: str) -> bool:
        """Check if string field is required to be non-empty."""
        # Define required string fields
        required_strings = ['best_guess', 'operation_name', 'normalization_strategy']
        return any(required in field_name.lower() for required in required_strings)

    def _is_valid_url(self, url: str) -> bool:
        """Validate URL format using basic heuristics."""
        # Basic URL validation - in production, use more sophisticated validation
        if not url or not isinstance(url, str):
            return False

        # Check for basic URL structure
        return (url.startswith(('http://', 'https://')) and
                '.' in url and
                len(url) > 10 and
                ' ' not in url)
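
The comment in `_is_valid_url` notes that production code should use more sophisticated validation. One possible direction, sketched here under the assumption that only `http` and `https` links are acceptable, is to parse the string with `urllib.parse.urlsplit` and require a host. The function name `is_valid_http_url` is hypothetical and is not part of the toolkit.

```python
from urllib.parse import urlsplit


def is_valid_http_url(url: object) -> bool:
    """Return True for http(s) URLs that parse cleanly and contain a host."""
    # Reject non-strings, empty strings, and anything containing whitespace
    if not isinstance(url, str) or not url or any(ch.isspace() for ch in url):
        return False

    # urlsplit raises ValueError for malformed input such as unbalanced brackets
    try:
        parts = urlsplit(url)
    except ValueError:
        return False

    return parts.scheme in ("http", "https") and bool(parts.hostname)


print(is_valid_http_url("https://example.com/image.png"))
print(is_valid_http_url("ftp://example.com/image.png"))
```

Unlike the heuristic above, this version accepts dot-free hosts such as `localhost` and places no minimum on the URL length, so the two checks are not interchangeable without reviewing callers.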


class SerializationMixin:
    """
    Mixin class providing comprehensive serialization capabilities for result structures.

    Implements multiple export formats with statistical preservation, data integrity
    validation, and enterprise-grade serialization features for quantitative analysis.
    """

    def to_dict(self, include_metadata: bool = True, flatten_nested: bool = False) -> Dict[str, Any]:
        """
        Convert result to dictionary with configurable serialization options.

        Args:
            include_metadata: Whether to include metadata fields in serialization
            flatten_nested: Whether to flatten nested objects into dot notation

        Returns:
            Dictionary representation of result structure
        """
        # Use dataclass asdict function as base
        result_dict = asdict(self)

        # Add metadata if requested
        if include_metadata:
            result_dict['_metadata'] = {
                'class_name': self.__class__.__name__,
                'serialization_timestamp': datetime.datetime.now(datetime.timezone.utc).isoformat(),
                'version': getattr(self.__class__, '__version__', '1.0.0'),
                'hash': self.compute_hash()
            }

        # Flatten nested structures if requested
        if flatten_nested:
            result_dict = self._flatten_dictionary(result_dict)

        return result_dict

    def to_json(
        self,
        indent: Optional[int] = 2,
        ensure_ascii: bool = False,
        include_metadata: bool = True
    ) -> str:
        """
        Convert result to JSON string with configurable formatting.

        Args:
            indent: JSON indentation level for readability
            ensure_ascii: Whether to escape non-ASCII characters
            include_metadata: Whether to include serialization metadata

        Returns:
            JSON string representation of result
        """
        # Convert to dictionary representation
        result_dict = self.to_dict(include_metadata=include_metadata)

        # Serialize to JSON with custom encoder for complex types
        return json.dumps(
            result_dict,
            indent=indent,
            ensure_ascii=ensure_ascii,
            default=self._json_serializer,
            sort_keys=True
        )

    def to_csv(self, file_path: Optional[Path] = None, include_headers: bool = True) -> str:
        """
        Convert result to CSV format for tabular analysis.

        Args:
            file_path: Optional file path for direct CSV output
            include_headers: Whether to include column headers

        Returns:
            CSV string representation; also written to file_path when provided
        """
        # Local imports keep this method self-contained
        import csv
        import io

        # Flatten result to single-level dictionary for CSV compatibility
        flattened_dict = self._flatten_dictionary(self.to_dict(include_metadata=False))

        # Let the csv module quote values containing commas, quotes, or newlines
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator='\n')

        # Add headers if requested
        if include_headers:
            writer.writerow(flattened_dict.keys())

        # Add data row
        writer.writerow([str(value) for value in flattened_dict.values()])

        # Drop the trailing line terminator so the string ends with the data row
        csv_string = buffer.getvalue().rstrip('\n')

        # Write to file if path provided
        if file_path:
            with open(file_path, 'w', newline='', encoding='utf-8') as f:
                f.write(csv_string)

        return csv_string

    def to_xml(self, root_element: str = "result") -> str:
        """
        Convert result to XML format for structured data exchange.

        Args:
            root_element: Name of XML root element

        Returns:
            XML string representation of result
        """
        # Create root XML element
        root = ET.Element(root_element)

        # Convert dictionary to XML structure
        result_dict = self.to_dict(include_metadata=True)
        self._dict_to_xml(result_dict, root)

        # Generate XML string with declaration
        xml_string = ET.tostring(root, encoding='unicode')
        return f'<?xml version="1.0" encoding="UTF-8"?>\n{xml_string}'

    def to_pickle(self, file_path: Path) -> None:
        """
        Serialize result to pickle format for Python-specific storage.

        Args:
            file_path: Path for pickle file output
        """
        # Serialize result using pickle with highest protocol
        with open(file_path, 'wb') as f:
            pickle.dump(self, f, protocol=pickle.HIGHEST_PROTOCOL)

    def to_parquet(self, file_path: Path) -> None:
        """
        Convert result to Parquet format for efficient columnar storage.

        Args:
            file_path: Path for Parquet file output
        """
        # Convert to flattened dictionary for DataFrame compatibility
        flattened_dict = self._flatten_dictionary(self.to_dict(include_metadata=True))

        # Create single-row DataFrame
        df = pd.DataFrame([flattened_dict])

        # Write to Parquet format
        df.to_parquet(file_path, index=False)

    def compute_hash(self, algorithm: str = "sha256") -> str:
        """
        Compute cryptographic hash of result for integrity verification.

        Args:
            algorithm: Hash algorithm ("md5", "sha1", "sha256", "sha512")

        Returns:
            Hexadecimal hash string for result content
        """
        # Create deterministic string representation
        result_dict = self.to_dict(include_metadata=False)
        deterministic_string = json.dumps(result_dict, sort_keys=True, default=str)

        # Compute hash using specified algorithm
        hash_obj = hashlib.new(algorithm)
        hash_obj.update(deterministic_string.encode('utf-8'))

        return hash_obj.hexdigest()

    def _flatten_dictionary(self, d: Dict[str, Any], parent_key: str = "", sep: str = ".") -> Dict[str, Any]:
        """
        Flatten nested dictionary using dot notation for keys.

        Args:
            d: Dictionary to flatten
            parent_key: Parent key for nested recursion
            sep: Separator for key concatenation

        Returns:
            Flattened dictionary with dot-notation keys
        """
        # Initialize flattened result
        items = []

        # Process each key-value pair
        for k, v in d.items():
            # Construct new key with parent prefix
            new_key = f"{parent_key}{sep}{k}" if parent_key else k

            # Recursively flatten nested dictionaries
            if isinstance(v, dict):
                items.extend(self._flatten_dictionary(v, new_key, sep=sep).items())
            elif isinstance(v, list):
                # Handle list values by indexing
                for i, item in enumerate(v):
                    if isinstance(item, dict):
                        items.extend(self._flatten_dictionary(item, f"{new_key}[{i}]", sep=sep).items())
                    else:
                        items.append((f"{new_key}[{i}]", item))
            else:
                # Direct value assignment
                items.append((new_key, v))

        return dict(items)

    def _dict_to_xml(self, d: Dict[str, Any], parent: ET.Element) -> None:
        """
        Convert dictionary to XML elements recursively.

        Args:
            d: Dictionary to convert
            parent: Parent XML element
        """
        # Process each dictionary item
        for key, value in d.items():
            # Create XML element for this key
            element = ET.SubElement(parent, str(key))

            # Handle nested dictionaries
            if isinstance(value, dict):
                self._dict_to_xml(value, element)
            elif isinstance(value, list):
                # Handle list values
                for item in value:
                    if isinstance(item, dict):
                        item_element = ET.SubElement(element, "item")
                        self._dict_to_xml(item, item_element)
                    else:
                        item_element = ET.SubElement(element, "item")
                        item_element.text = str(item)
            else:
                # Set element text value
                element.text = str(value)

    def _json_serializer(self, obj: Any) -> Any:
        """
        Custom JSON serializer for complex data types.

        Args:
            obj: Object to serialize

        Returns:
            JSON-serializable representation
        """
        # Handle datetime objects
        if isinstance(obj, datetime.datetime):
            return obj.isoformat()

        # Handle numpy types
        if isinstance(obj, np.ndarray):
            return obj.tolist()

        if isinstance(obj, (np.integer, np.floating)):
            return obj.item()

        # Handle enum members (duck-typed: any object exposing a `value` attribute)
        if hasattr(obj, 'value'):
            return obj.value

        # Default to string representation
        return str(obj)
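

# --- Illustrative sketch (not part of the toolkit API) ---------------------
# The flattening and hashing helpers above can be tried without a result object.
# The two functions below are hypothetical stand-ins that mirror
# SerializationMixin._flatten_dictionary and SerializationMixin.compute_hash on
# plain dictionaries; their names and signatures are assumptions made for
# this example only.
import hashlib
import json
from typing import Any, Dict


def _example_flatten(d: Dict[str, Any], parent_key: str = "", sep: str = ".") -> Dict[str, Any]:
    """Flatten nested dicts and lists into dot/index notation keys."""
    flat: Dict[str, Any] = {}
    for key, value in d.items():
        new_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        if isinstance(value, dict):
            flat.update(_example_flatten(value, new_key, sep))
        elif isinstance(value, list):
            for i, item in enumerate(value):
                indexed = f"{new_key}[{i}]"
                if isinstance(item, dict):
                    flat.update(_example_flatten(item, indexed, sep))
                else:
                    flat[indexed] = item
        else:
            flat[new_key] = value
    return flat


def _example_content_hash(payload: Dict[str, Any], algorithm: str = "sha256") -> str:
    """Hash a dict via its key-sorted JSON form, so key order does not matter."""
    canonical = json.dumps(payload, sort_keys=True, default=str)
    return hashlib.new(algorithm, canonical.encode("utf-8")).hexdigest()


# Usage: _example_flatten({"a": {"b": 1}, "c": [1, {"d": 2}]})
# gives {"a.b": 1, "c[0]": 1, "c[1].d": 2}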


class ComparisonMixin:
    """
    Mixin class providing comprehensive comparison capabilities for result structures.

    Implements statistical comparison methods, similarity metrics, and ranking
    algorithms for quantitative analysis of image similarity results.
    """

    def __eq__(self, other: Any) -> bool:
        """
        Implement equality comparison with an absolute numerical tolerance.

        Args:
            other: Object to compare against

        Returns:
            Boolean indicating equality, with numeric fields compared within tolerance
        """
        # Type checking for comparison compatibility
        if not isinstance(other, self.__class__):
            return False

        # Compare all fields with appropriate tolerance
        for field_info in fields(self):
            field_name = field_info.name
            self_value = getattr(self, field_name)
            other_value = getattr(other, field_name)

            # Handle None value comparisons
            if self_value is None and other_value is None:
                continue
            elif self_value is None or other_value is None:
                return False

            # Numerical comparison with tolerance
            if isinstance(self_value, (int, float)) and isinstance(other_value, (int, float)):
                if not self._are_numerically_equal(self_value, other_value):
                    return False
            # Exact comparison for other types
            elif self_value != other_value:
                return False

        return True

    def __hash__(self) -> int:
        """
        Implement hash function for result caching and set operations.

        Returns:
            Hash value based on stable result content
        """
        # Create tuple of hashable field values
        hashable_values = []

        for field_info in fields(self):
            field_name = field_info.name
            field_value = getattr(self, field_name)

            # Convert to hashable representation
            if isinstance(field_value, (list, dict)):
                # Convert mutable types to immutable for hashing
                hashable_values.append(str(field_value))
            elif isinstance(field_value, float):
                # Round floats to reduce hash instability; values equal within the
                # __eq__ tolerance can still round differently near a rounding boundary
                hashable_values.append(round(field_value, 10))
            else:
                hashable_values.append(field_value)

        return hash(tuple(hashable_values))

    def __lt__(self, other: 'ComparisonMixin') -> bool:
        """
        Implement less-than comparison for result ranking.

        Args:
            other: Result to compare against

        Returns:
            Boolean indicating this result is "less than" other
        """
        # Default comparison based on primary metric
        return self._get_primary_metric() < other._get_primary_metric()

    def __le__(self, other: 'ComparisonMixin') -> bool:
        """Implement less-than-or-equal comparison."""
        return self.__lt__(other) or self.__eq__(other)

    def __gt__(self, other: 'ComparisonMixin') -> bool:
        """Implement greater-than comparison."""
        return not self.__le__(other)

    def __ge__(self, other: 'ComparisonMixin') -> bool:
        """Implement greater-than-or-equal comparison."""
        return not self.__lt__(other)

    def compute_similarity(self, other: 'ComparisonMixin', method: str = "cosine") -> float:
        """
        Compute similarity between two results using specified metric.

        Args:
            other: Result to compare against
            method: Similarity metric ("cosine", "euclidean", "manhattan", "jaccard")

        Returns:
            Similarity score between results
        """
        # Extract numerical features for comparison
        self_features = self._extract_numerical_features()
        other_features = other._extract_numerical_features()

        # Ensure feature vectors have same dimensionality
        if len(self_features) != len(other_features):
            raise ValueError("Results have incompatible feature dimensions for comparison")

        # Convert to numpy arrays for computation
        vec1 = np.array(self_features)
        vec2 = np.array(other_features)

        # Compute similarity using specified method
        if method == "cosine":
            # Cosine similarity: sim = (v1 · v2) / (||v1|| × ||v2||)
            dot_product = np.dot(vec1, vec2)
            norms = np.linalg.norm(vec1) * np.linalg.norm(vec2)
            similarity = dot_product / norms if norms > 0 else 0.0

        elif method == "euclidean":
            # Euclidean distance converted to similarity
            distance = np.linalg.norm(vec1 - vec2)
            similarity = 1.0 / (1.0 + distance)

        elif method == "manhattan":
            # Manhattan distance converted to similarity
            distance = np.sum(np.abs(vec1 - vec2))
            similarity = 1.0 / (1.0 + distance)

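        elif method == "jaccard":
            # Weighted (Ruzicka) Jaccard: sum of element-wise minima over sum of maxima.
            # Reduces to set Jaccard for 0/1 vectors and assumes non-negative features.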
            intersection = np.sum(np.minimum(vec1, vec2))
            union = np.sum(np.maximum(vec1, vec2))
            similarity = intersection / union if union > 0 else 0.0

        else:
            raise ValueError(f"Unsupported similarity method: {method}")

        return similarity

    def statistical_comparison(self, other: 'ComparisonMixin') -> Dict[str, Any]:
        """
        Perform comprehensive statistical comparison between results.

        Args:
            other: Result to compare against

        Returns:
            Dictionary containing statistical comparison metrics
        """
        # Initialize comparison results
        comparison_results = {
            'similarity_metrics': {},
            'statistical_tests': {},
            'effect_sizes': {},
            'confidence_intervals': {}
        }

        # Compute multiple similarity metrics
        for method in ["cosine", "euclidean", "manhattan"]:
            try:
                similarity = self.compute_similarity(other, method)
                comparison_results['similarity_metrics'][method] = similarity
            except Exception as e:
                comparison_results['similarity_metrics'][method] = f"Error: {e}"

        # Statistical significance testing if both results have statistical properties
        if hasattr(self, 'statistical_properties') and hasattr(other, 'statistical_properties'):
            self_stats = getattr(self, 'statistical_properties')
            other_stats = getattr(other, 'statistical_properties')

            # Two-sample t-test for mean comparison
            if self_stats and other_stats:
                try:
                    t_stat, p_value = self._two_sample_t_test(self_stats, other_stats)
                    comparison_results['statistical_tests']['t_test'] = {
                        'statistic': t_stat,
                        'p_value': p_value,
                        'significant': p_value < 0.05
                    }
                except Exception as e:
                    comparison_results['statistical_tests']['t_test'] = f"Error: {e}"

        return comparison_results

    def _are_numerically_equal(self, a: Union[int, float], b: Union[int, float], tolerance: float = 1e-9) -> bool:
        """
        Check numerical equality with specified tolerance.

        Args:
            a: First numerical value
            b: Second numerical value
            tolerance: Absolute tolerance for equality

        Returns:
            Boolean indicating numerical equality within tolerance
        """
        return abs(a - b) <= tolerance

    def _get_primary_metric(self) -> float:
        """
        Get primary metric for result ranking.

        Returns:
            Primary numerical metric for comparison
        """
        # Default implementation - subclasses should override
        # Look for common similarity/score fields
        for field_info in fields(self):
            field_name = field_info.name
            if any(keyword in field_name.lower() for keyword in ['similarity', 'score', 'ratio']):
                value = getattr(self, field_name)
                if isinstance(value, (int, float)):
                    return value

        # Fallback to hash-based comparison
        return float(hash(self) % 1000000) / 1000000.0

    def _extract_numerical_features(self) -> List[float]:
        """
        Extract numerical features for similarity computation.

        Returns:
            List of numerical feature values
        """
        # Extract all numerical fields as features
        features = []

        for field_info in fields(self):
            field_value = getattr(self, field_info.name)

            # bool is a subclass of int, so it must be checked before (int, float)
            if isinstance(field_value, bool):
                # Convert boolean to numerical
                features.append(1.0 if field_value else 0.0)
            elif isinstance(field_value, (int, float)):
                # Direct numerical value
                features.append(float(field_value))
            elif isinstance(field_value, list) and field_value:
                # Use list length as numerical feature
                features.append(float(len(field_value)))

        return features

    def _two_sample_t_test(self, stats1: StatisticalProperties, stats2: StatisticalProperties) -> Tuple[float, float]:
        """
        Perform two-sample t-test for mean comparison.

        Args:
            stats1: Statistical properties of first sample
            stats2: Statistical properties of second sample

        Returns:
            Tuple of (t_statistic, p_value)
        """
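        # Extract sample statistics
        n1, n2 = stats1.sample_size, stats2.sample_size
        mean1, mean2 = stats1.mean, stats2.mean
        var1, var2 = stats1.variance, stats2.variance

        # Welch's degrees of freedom divide by (n - 1), so each sample needs n >= 2
        if n1 < 2 or n2 < 2:
            raise ValueError("Two-sample t-test requires sample_size >= 2 for both samples")

        # Squared standard error of each sample mean
        se1_sq, se2_sq = var1 / n1, var2 / n2

        # Unpooled (Welch) standard error of the difference in means
        standard_error = np.sqrt(se1_sq + se2_sq)

        # Degenerate case: both variances are zero, so treat the difference as untestable
        if standard_error == 0:
            return 0.0, 1.0

        # Compute t-statistic
        t_stat = (mean1 - mean2) / standard_error

        # Compute degrees of freedom (Welch-Satterthwaite approximation)
        df = (se1_sq + se2_sq) ** 2 / (se1_sq ** 2 / (n1 - 1) + se2_sq ** 2 / (n2 - 1))

        # Two-tailed p-value; the survival function avoids precision loss in 1 - cdf
        p_value = 2 * stats.t.sf(abs(t_stat), df)

        return float(t_stat), float(p_value)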
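

# --- Illustrative sketch (not part of the toolkit API) ---------------------
# Welch's statistic and degrees of freedom, as used by _two_sample_t_test,
# computed from summary statistics with the standard library only. The
# function name and argument order are assumptions made for this example;
# the p-value step is omitted because it needs a t-distribution CDF.
import math
from typing import Tuple


def _example_welch_t_and_df(mean1: float, var1: float, n1: int,
                            mean2: float, var2: float, n2: int) -> Tuple[float, float]:
    """Return (t_statistic, degrees_of_freedom) for Welch's unequal-variance t-test."""
    if n1 < 2 or n2 < 2:
        raise ValueError("each sample needs at least two observations")
    se1_sq, se2_sq = var1 / n1, var2 / n2
    total = se1_sq + se2_sq
    if total == 0:
        raise ValueError("both samples have zero variance; the statistic is undefined")
    t_stat = (mean1 - mean2) / math.sqrt(total)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = total ** 2 / (se1_sq ** 2 / (n1 - 1) + se2_sq ** 2 / (n2 - 1))
    return t_stat, df


# Usage: with equal variances and equal sizes n, df reduces to 2 * (n - 1),
# e.g. _example_welch_t_and_df(5.0, 1.0, 10, 4.0, 1.0, 10) has df of about 18.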


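# eq=False keeps ComparisonMixin.__eq__ and __hash__ in effect; the dataclass-generated
# versions would override them, and the generated __hash__ fails on list-valued fields.
@dataclass(frozen=True, eq=False)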
class ReverseImageSearchResult(ResultValidationMixin, SerializationMixin, ComparisonMixin):
    """
    Comprehensive result structure for reverse image search operations.

    Implements enterprise-grade result handling with statistical analysis,
    validation, serialization, and comparison capabilities for quantitative
    evaluation of image provenance and context determination.
    """

    # Primary search results with mathematical constraints
    best_guess: str = field(
        metadata={
            'description': 'Best guess description from reverse image search',
            'constraints': 'Non-empty string with maximum length 500 characters',
            'statistical_meaning': 'Primary semantic classification of image content'
        }
    )

    # Fields without defaults must precede fields with defaults in a dataclass
    source_page_title: str = field(
        metadata={
            'description': 'Title of source page from reverse search',
            'constraints': 'Non-empty string with maximum length 200 characters',
            'statistical_meaning': 'Contextual metadata for image provenance'
        }
    )

    similar_image_urls: List[str] = field(
        default_factory=list,
        metadata={
            'description': 'URLs of visually similar images found in search',
            'constraints': 'List of valid URLs with maximum 100 entries',
            'statistical_meaning': 'Population of similar visual content for analysis'
        }
    )

    # Extended metadata for quantitative analysis
    site_authority_score: Optional[float] = field(
        default=None,
        metadata={
            'description': 'Authority score of source website [0,1]',
            'constraints': 'Float in range [0,1] or None',
            'statistical_meaning': 'Reliability measure for content verification'
        }
    )

    snippet_text: Optional[str] = field(
        default=None,
        metadata={
            'description': 'Text snippet describing image context',
            'constraints': 'String with maximum length 1000 characters or None',
            'statistical_meaning': 'Semantic context for content analysis'
        }
    )

    confidence_score: Optional[float] = field(
        default=None,
        metadata={
            'description': 'Search confidence score [0,1] based on result quality',
            'constraints': 'Float in range [0,1] or None',
            'statistical_meaning': 'Uncertainty quantification for search results'
        }
    )

    # Statistical analysis properties
    statistical_properties: Optional[StatisticalProperties] = field(
        default=None,
        metadata={
            'description': 'Statistical properties of search result quality metrics',
            'constraints': 'StatisticalProperties object or None',
            'statistical_meaning': 'Quantitative analysis of search performance'
        }
    )

    # Temporal and operational metadata
    search_timestamp: datetime.datetime = field(
        default_factory=datetime.datetime.utcnow,
        metadata={
            'description': 'UTC timestamp when search was performed',
            'constraints': 'Valid datetime object',
            'statistical_meaning': 'Temporal reference for result validity'
        }
    )

    search_duration_seconds: float = field(
        default=0.0,
        metadata={
            'description': 'Time taken to complete search operation',
            'constraints': 'Non-negative float representing seconds',
            'statistical_meaning': 'Performance metric for search efficiency'
        }
    )

    # Result quality and provenance indicators
    duplicate_url_count: int = field(
        default=0,
        metadata={
            'description': 'Number of duplicate URLs found in similar results',
            'constraints': 'Non-negative integer',
            'statistical_meaning': 'Content proliferation indicator'
        }
    )

    unique_domain_count: int = field(
        default=0,
        metadata={
            'description': 'Number of unique domains in similar image URLs',
            'constraints': 'Non-negative integer',
            'statistical_meaning': 'Content distribution diversity measure'
        }
    )

    geographic_indicators: Optional[Dict[str, Any]] = field(
        default=None,
        metadata={
            'description': 'Geographic distribution indicators from result analysis',
            'constraints': 'Dictionary with geographic metadata or None',
            'statistical_meaning': 'Spatial distribution of content sources'
        }
    )

    def __post_init__(self) -> None:
        """
        Validate reverse image search result with comprehensive constraint checking.

        Implements mathematical validation of search result constraints,
        statistical consistency verification, and business logic validation
        for reliable quantitative analysis of image provenance.
        """
        # Validate primary fields against constraints
        if not isinstance(self.best_guess, str) or not self.best_guess.strip():
            raise ValueError("best_guess must be non-empty string")

        if len(self.best_guess) > 500:
            raise ValueError(f"best_guess exceeds maximum length: {len(self.best_guess)} > 500")

        if not isinstance(self.source_page_title, str) or not self.source_page_title.strip():
            raise ValueError("source_page_title must be non-empty string")

        if len(self.source_page_title) > 200:
            raise ValueError(f"source_page_title exceeds maximum length: {len(self.source_page_title)} > 200")

        # Validate URL collection constraints
        if len(self.similar_image_urls) > 100:
            raise ValueError(f"similar_image_urls exceeds maximum count: {len(self.similar_image_urls)} > 100")

        # Validate each URL in collection
        for idx, url in enumerate(self.similar_image_urls):
            if not isinstance(url, str):
                raise ValueError(f"similar_image_urls[{idx}] must be string, got {type(url)}")

            if not self._is_valid_url(url):
                raise ValueError(f"similar_image_urls[{idx}] is not valid URL: {url}")

        # Validate optional numerical constraints
        if self.site_authority_score is not None:
            if not 0.0 <= self.site_authority_score <= 1.0:
                raise ValueError(f"site_authority_score must be in [0,1], got {self.site_authority_score}")

        if self.confidence_score is not None:
            if not 0.0 <= self.confidence_score <= 1.0:
                raise ValueError(f"confidence_score must be in [0,1], got {self.confidence_score}")

        # Validate temporal constraints
        if self.search_duration_seconds < 0:
            raise ValueError(f"search_duration_seconds must be non-negative, got {self.search_duration_seconds}")

        # Validate count constraints
        if self.duplicate_url_count < 0:
            raise ValueError(f"duplicate_url_count must be non-negative, got {self.duplicate_url_count}")

        if self.unique_domain_count < 0:
            raise ValueError(f"unique_domain_count must be non-negative, got {self.unique_domain_count}")

        # Validate logical relationships between fields
        if self.duplicate_url_count > len(self.similar_image_urls):
            raise ValueError("duplicate_url_count cannot exceed total URL count")

        if self.unique_domain_count > len(self.similar_image_urls):
            raise ValueError("unique_domain_count cannot exceed total URL count")

    def _validate_field_relationships(self) -> List[str]:
        """
        Validate cross-field relationships specific to reverse search results.

        Returns:
            List of relationship validation violations
        """
        violations = []

        # Validate URL count relationships
        if self.duplicate_url_count > len(self.similar_image_urls):
            violations.append("duplicate_url_count exceeds similar_image_urls length")

        if self.unique_domain_count > len(self.similar_image_urls):
            violations.append("unique_domain_count exceeds similar_image_urls length")

        # Validate confidence score vs URL count relationship
        if (self.confidence_score is not None and
            len(self.similar_image_urls) == 0 and
            self.confidence_score > 0.1):
            violations.append("High confidence score with no similar images is inconsistent")

        # Validate snippet text length
        if (self.snippet_text and
            len(self.snippet_text) > 1000):
            violations.append("snippet_text exceeds maximum length of 1000 characters")

        return violations

    def compute_result_quality_score(self) -> float:
        """
        Compute comprehensive quality score for search result.

        Mathematical Foundation:
        - Combines five quality indicators using a weighted sum
        - Quality = w₁×confidence + w₂×url_diversity + w₃×authority + w₄×completeness + w₅×performance
        - Fixed weights (w₁..w₅) = (0.30, 0.25, 0.20, 0.15, 0.10), summing to 1.0
        - A missing confidence or authority score defaults to a neutral 0.5

        Returns:
            Quality score in range [0,1] where higher values indicate better results
        """
        # Initialize quality components
        quality_components = {}

        # Confidence score component (weight: 0.3)
        confidence_component = self.confidence_score if self.confidence_score is not None else 0.5
        quality_components['confidence'] = confidence_component * 0.3

        # URL diversity component (weight: 0.25)
        if len(self.similar_image_urls) > 0:
            diversity_ratio = self.unique_domain_count / len(self.similar_image_urls)
            diversity_component = min(diversity_ratio * 2.0, 1.0)  # Saturates at 1.0 once unique domains reach half the URL count
        else:
            diversity_component = 0.0
        quality_components['diversity'] = diversity_component * 0.25

        # Authority score component (weight: 0.2)
        authority_component = self.site_authority_score if self.site_authority_score is not None else 0.5
        quality_components['authority'] = authority_component * 0.2

        # Completeness component (weight: 0.15)
        completeness_score = 0.0
        if self.best_guess and len(self.best_guess.strip()) > 0:
            completeness_score += 0.4
        if self.snippet_text and len(self.snippet_text.strip()) > 0:
            completeness_score += 0.3
        if len(self.similar_image_urls) > 0:
            completeness_score += 0.3
        quality_components['completeness'] = completeness_score * 0.15

        # Performance component (weight: 0.1)
        if self.search_duration_seconds > 0:
            # Reward faster searches (exponential decay)
            performance_score = np.exp(-self.search_duration_seconds / 10.0)
        else:
            performance_score = 1.0
        quality_components['performance'] = performance_score * 0.1

        # Compute total quality score
        total_quality = sum(quality_components.values())

        return min(total_quality, 1.0)  # Ensure score is in [0,1]

    def analyze_content_distribution(self) -> Dict[str, Any]:
        """
        Analyze domain and top-level-domain distribution of similar content,
        together with content proliferation metrics.

        Returns:
            Dictionary containing distribution analysis results
        """
        # Initialize distribution analysis
        distribution_analysis = {
            'domain_distribution': defaultdict(int),
            'tld_distribution': defaultdict(int),
            'url_pattern_analysis': {},
            'content_proliferation_metrics': {}
        }

        # Imported once here rather than on every loop iteration
        from urllib.parse import urlparse

        # Analyze domain patterns in similar URLs
        for url in self.similar_image_urls:
            try:
                # Extract domain from URL
                parsed_url = urlparse(url)
                domain = parsed_url.netloc.lower()

                # Count domain occurrences
                distribution_analysis['domain_distribution'][domain] += 1

                # Extract top-level domain
                tld = domain.split('.')[-1] if '.' in domain else 'unknown'
                distribution_analysis['tld_distribution'][tld] += 1

            except Exception:
                # Handle malformed URLs gracefully
                distribution_analysis['domain_distribution']['malformed'] += 1

        # Compute content proliferation metrics
        total_urls = len(self.similar_image_urls)
        unique_domains = len(distribution_analysis['domain_distribution'])

        if total_urls > 0:
            proliferation_metrics = {
                'proliferation_ratio': self.duplicate_url_count / total_urls,
                'domain_diversity_ratio': unique_domains / total_urls,
                'concentration_index': self._compute_concentration_index(
                    distribution_analysis['domain_distribution']
                ),
                'viral_coefficient': self._compute_viral_coefficient()
            }
            distribution_analysis['content_proliferation_metrics'] = proliferation_metrics

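        # Convert the nested defaultdicts to plain dicts so the result serializes cleanly
        distribution_analysis['domain_distribution'] = dict(distribution_analysis['domain_distribution'])
        distribution_analysis['tld_distribution'] = dict(distribution_analysis['tld_distribution'])

        return distribution_analysis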

    def _compute_concentration_index(self, domain_counts: Dict[str, int]) -> float:
        """
        Compute Herfindahl-Hirschman Index for domain concentration.

        Mathematical Foundation:
        HHI = Σᵢ (sᵢ)² where sᵢ is market share of domain i
        HHI ∈ [1/n, 1] where n is number of domains

        Args:
            domain_counts: Dictionary mapping domains to occurrence counts

        Returns:
            Concentration index where higher values indicate more concentration
        """
        if not domain_counts:
            return 0.0

        # Compute total occurrences
        total_count = sum(domain_counts.values())

        # Compute market shares and HHI
        hhi = sum((count / total_count) ** 2 for count in domain_counts.values())

        return hhi

    def _compute_viral_coefficient(self) -> float:
        """
        Compute viral coefficient indicating content spread velocity.

        Returns:
            Viral coefficient based on URL count and search performance
        """
        # Base viral score on URL count and diversity
        if len(self.similar_image_urls) == 0:
            return 0.0

        # Factor in search speed (faster discovery suggests viral content)
        speed_factor = 1.0 / (1.0 + self.search_duration_seconds / 5.0)

        # Factor in domain diversity (viral content spreads across platforms)
        diversity_factor = min(self.unique_domain_count / 10.0, 1.0)

        # Factor in absolute URL count
        count_factor = min(len(self.similar_image_urls) / 50.0, 1.0)

        # Compute composite viral coefficient
        viral_coefficient = (speed_factor * 0.3 + diversity_factor * 0.4 + count_factor * 0.3)

        return viral_coefficient

    def generate_provenance_report(self) -> Dict[str, Any]:
        """
        Generate comprehensive provenance report for image authenticity assessment.

        Returns:
            Dictionary containing detailed provenance analysis
        """
        # Initialize provenance report structure
        provenance_report = {
            'authenticity_indicators': {},
            'distribution_analysis': self.analyze_content_distribution(),
            'quality_assessment': {
                'overall_quality_score': self.compute_result_quality_score(),
                'confidence_level': self.confidence_score,
                'completeness_score': self._compute_completeness_score()
            },
            'risk_assessment': {
                'manipulation_risk': self._assess_manipulation_risk(),
                'viral_propagation_risk': self._compute_viral_coefficient(),
                'source_credibility': self.site_authority_score
            },
            'recommendations': self._generate_provenance_recommendations()
        }

        return provenance_report

    def _compute_completeness_score(self) -> float:
        """Compute completeness score based on available metadata."""
        completeness_factors = []

        # Best guess completeness
        if self.best_guess and len(self.best_guess.strip()) > 10:
            completeness_factors.append(1.0)
        else:
            completeness_factors.append(0.0)

        # Snippet text completeness
        if self.snippet_text and len(self.snippet_text.strip()) > 20:
            completeness_factors.append(1.0)
        else:
            completeness_factors.append(0.5 if self.snippet_text else 0.0)

        # URL collection completeness
        if len(self.similar_image_urls) >= 5:
            completeness_factors.append(1.0)
        elif len(self.similar_image_urls) > 0:
            completeness_factors.append(len(self.similar_image_urls) / 5.0)
        else:
            completeness_factors.append(0.0)

        # Authority score availability
        completeness_factors.append(1.0 if self.site_authority_score is not None else 0.0)

        return statistics.mean(completeness_factors)

    def _assess_manipulation_risk(self) -> float:
        """Assess risk of image manipulation based on search patterns."""
        risk_factors = []

        # High duplicate count suggests potential manipulation
        if len(self.similar_image_urls) > 0:
            duplicate_ratio = self.duplicate_url_count / len(self.similar_image_urls)
            risk_factors.append(duplicate_ratio)

        # Low confidence with many results suggests inconsistency
        if self.confidence_score is not None and len(self.similar_image_urls) > 10:
            if self.confidence_score < 0.3:
                risk_factors.append(0.7)
            else:
                risk_factors.append(0.0)

        # Rapid search completion might indicate cached/manipulated content
        if self.search_duration_seconds < 1.0 and len(self.similar_image_urls) > 20:
            risk_factors.append(0.6)
        else:
            risk_factors.append(0.0)

        return statistics.mean(risk_factors) if risk_factors else 0.0

    def _generate_provenance_recommendations(self) -> List[str]:
        """Generate actionable recommendations for provenance verification."""
        recommendations = []

        # Quality-based recommendations
        quality_score = self.compute_result_quality_score()
        if quality_score < 0.3:
            recommendations.append("Low quality search results - consider additional verification methods")
        elif quality_score > 0.8:
            recommendations.append("High quality search results provide strong provenance indicators")

        # Authority-based recommendations
        if self.site_authority_score is not None:
            if self.site_authority_score < 0.3:
                recommendations.append("Low source authority - verify information through additional channels")
            elif self.site_authority_score > 0.8:
                recommendations.append("High source authority increases confidence in provenance")

        # Distribution-based recommendations
        if self.unique_domain_count < 3 and len(self.similar_image_urls) > 10:
            recommendations.append("Limited domain diversity suggests potential content manipulation")

        # Risk-based recommendations
        manipulation_risk = self._assess_manipulation_risk()
        if manipulation_risk > 0.5:
            recommendations.append("High manipulation risk detected - conduct forensic analysis")

        return recommendations
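

# --- Illustrative sketch (not part of the toolkit API) ---------------------
# A standalone version of the Herfindahl-Hirschman computation behind
# _compute_concentration_index, handy for checking values by hand. The
# function name is an assumption made for this example.
from typing import Dict


def _example_hhi(domain_counts: Dict[str, int]) -> float:
    """Sum of squared shares: 1/n for n equally represented domains, 1.0 for one domain."""
    total = sum(domain_counts.values())
    if total == 0:
        return 0.0
    return sum((count / total) ** 2 for count in domain_counts.values())


# Usage: counts {"a.com": 2, "b.org": 1, "c.net": 1} give shares 0.5, 0.25, 0.25,
# so the index is 0.25 + 0.0625 + 0.0625 = 0.375.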


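# eq=False keeps ComparisonMixin.__eq__ and __hash__ in effect instead of the
# dataclass-generated versions that would otherwise override them.
@dataclass(frozen=True, eq=False)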
class FeatureMatchResult(ResultValidationMixin, SerializationMixin, ComparisonMixin):
    """
    Comprehensive result structure for feature matching operations.

    Implements rigorous mathematical validation, statistical analysis, and
    quantitative assessment of ORB feature matching performance for
    enterprise-grade computer vision applications.
    """

    # Primary matching metrics with mathematical constraints
    similarity_ratio: float = field(
        metadata={
            'description': 'Ratio of good matches to total matches or to the smaller keypoint count, in [0,1]',
            'constraints': 'Float in range [0,1]',
            'statistical_meaning': 'Probability estimate of visual similarity',
            'mathematical_foundation': 'P(match) = |good_matches| / normalization_denominator'
        }
    )

    total_matches: int = field(
        metadata={
            'description': 'Total number of descriptor matches attempted',
            'constraints': 'Non-negative integer',
            'statistical_meaning': 'Sample size for matching statistical analysis',
            'mathematical_foundation': 'n = |{(qi, tj) : d_H(qi, tj) computed}|'
        }
    )

    good_matches: int = field(
        metadata={
            'description': 'Number of matches passing quality threshold',
            'constraints': 'Non-negative integer ≤ total_matches',
            'statistical_meaning': 'Successful matching events for similarity assessment',
            'mathematical_foundation': 'k = |{matches : d_H ≤ threshold or ratio_test_passed}|'
        }
    )

    keypoints_image1: int = field(
        metadata={
            'description': 'Number of keypoints detected in first image',
            'constraints': 'Non-negative integer',
            'statistical_meaning': 'Feature density measure for first image',
            'mathematical_foundation': 'n₁ = |{p : Harris_response(p) > threshold}|'
        }
    )

    keypoints_image2: int = field(
        metadata={
            'description': 'Number of keypoints detected in second image',
            'constraints': 'Non-negative integer',
            'statistical_meaning': 'Feature density measure for second image',
            'mathematical_foundation': 'n₂ = |{p : Harris_response(p) > threshold}|'
        }
    )

    normalization_strategy: str = field(
        metadata={
            'description': 'Strategy used for similarity ratio computation',
            'constraints': 'Must be "total_matches" or "min_keypoints"',
            'statistical_meaning': 'Normalization method affecting probability interpretation',
            'mathematical_foundation': 'Denominator choice for ratio = k/denominator'
        }
    )

    # Advanced matching quality metrics
    confidence_level: Optional[float] = field(
        default=None,
        metadata={
            'description': 'Statistical confidence in matching result [0,1]',
            'constraints': 'Float in range [0,1] or None',
            'statistical_meaning': 'Uncertainty quantification for match quality',
            'mathematical_foundation': 'P(true_positive | observed_matches)'
        }
    )

    match_distance_statistics: Optional[StatisticalProperties] = field(
        default=None,
        metadata={
            'description': 'Statistical properties of Hamming distances',
            'constraints': 'StatisticalProperties object or None',
            'statistical_meaning': 'Distribution analysis of descriptor distances',
            'mathematical_foundation': 'Statistical analysis of {d_H(qi, tj) : matched}'
        }
    )

    # Geometric verification results
    geometric_verification_passed: Optional[bool] = field(
        default=None,
        metadata={
            'description': 'Whether matches pass geometric consistency check',
            'constraints': 'Boolean or None if not performed',
            'statistical_meaning': 'Spatial coherence validation of matching',
            'mathematical_foundation': 'RANSAC homography estimation success'
        }
    )

    homography_inlier_count: Optional[int] = field(
        default=None,
        metadata={
            'description': 'Number of matches consistent with estimated homography',
            'constraints': 'Non-negative integer ≤ good_matches or None',
            'statistical_meaning': 'Geometrically consistent correspondences',
            'mathematical_foundation': '|{matches : ||H·p₁ - p₂|| < ε}|'
        }
    )

    homography_inlier_ratio: Optional[float] = field(
        default=None,
        metadata={
            'description': 'Ratio of homography inliers to good matches [0,1]',
            'constraints': 'Float in range [0,1] or None',
            'statistical_meaning': 'Geometric consistency measure',
            'mathematical_foundation': 'inlier_ratio = inlier_count / good_matches'
        }
    )

    # Performance and computational metrics
    matching_time_seconds: float = field(
        default=0.0,
        metadata={
            'description': 'Time taken for complete matching operation',
            'constraints': 'Non-negative float',
            'statistical_meaning': 'Computational efficiency measure',
            'mathematical_foundation': 'Wall-clock time for O(n₁·n₂·k) operations'
        }
    )

    keypoint_detection_time_seconds: float = field(
        default=0.0,
        metadata={
            'description': 'Time taken for keypoint detection in both images',
            'constraints': 'Non-negative float',
            'statistical_meaning': 'Feature extraction efficiency measure',
            'mathematical_foundation': 'Time for ORB detection and description'
        }
    )

    # Match quality distribution analysis
    distance_threshold_used: Optional[int] = field(
        default=None,
        metadata={
            'description': 'Hamming distance threshold for good match classification',
            'constraints': 'Integer in range [0,256] or None',
            'statistical_meaning': 'Decision boundary for binary classification',
            'mathematical_foundation': 'τ : d_H ≤ τ ⟹ good_match'
        }
    )

    lowe_ratio_threshold: Optional[float] = field(
        default=None,
        metadata={
            'description': 'Ratio threshold used for Lowe\'s ratio test',
            'constraints': 'Float in range (0,1) or None',
            'statistical_meaning': 'Disambiguation threshold for nearest neighbors',
            'mathematical_foundation': 'r : d₁/d₂ < r ⟹ unambiguous_match'
        }
    )

    def __post_init__(self) -> None:
        """
        Validate feature matching result with comprehensive mathematical verification.

        Implements rigorous validation of matching statistics, mathematical
        constraints, and statistical consistency for reliable quantitative
        analysis of visual feature correspondence.
        """
        # Validate similarity ratio mathematical constraint
        if not 0.0 <= self.similarity_ratio <= 1.0:
            raise ValueError(f"similarity_ratio must be in [0,1], got {self.similarity_ratio}")

        # Validate count constraints and relationships
        if self.total_matches < 0:
            raise ValueError(f"total_matches must be non-negative, got {self.total_matches}")

        if self.good_matches < 0:
            raise ValueError(f"good_matches must be non-negative, got {self.good_matches}")

        if self.good_matches > self.total_matches:
            raise ValueError(f"good_matches ({self.good_matches}) cannot exceed total_matches ({self.total_matches})")

        if self.keypoints_image1 < 0:
            raise ValueError(f"keypoints_image1 must be non-negative, got {self.keypoints_image1}")

        if self.keypoints_image2 < 0:
            raise ValueError(f"keypoints_image2 must be non-negative, got {self.keypoints_image2}")

        # Validate normalization strategy
        valid_strategies = {"total_matches", "min_keypoints"}
        if self.normalization_strategy not in valid_strategies:
            raise ValueError(f"normalization_strategy must be one of {valid_strategies}, got {self.normalization_strategy}")

        # Validate optional confidence level
        if self.confidence_level is not None:
            if not 0.0 <= self.confidence_level <= 1.0:
                raise ValueError(f"confidence_level must be in [0,1], got {self.confidence_level}")

        # Validate geometric verification constraints
        if self.homography_inlier_count is not None:
            if self.homography_inlier_count < 0:
                raise ValueError(f"homography_inlier_count must be non-negative, got {self.homography_inlier_count}")

            if self.homography_inlier_count > self.good_matches:
                raise ValueError(f"homography_inlier_count ({self.homography_inlier_count}) cannot exceed good_matches ({self.good_matches})")

        if self.homography_inlier_ratio is not None:
            if not 0.0 <= self.homography_inlier_ratio <= 1.0:
                raise ValueError(f"homography_inlier_ratio must be in [0,1], got {self.homography_inlier_ratio}")

        # Validate temporal constraints
        if self.matching_time_seconds < 0:
            raise ValueError(f"matching_time_seconds must be non-negative, got {self.matching_time_seconds}")

        if self.keypoint_detection_time_seconds < 0:
            raise ValueError(f"keypoint_detection_time_seconds must be non-negative, got {self.keypoint_detection_time_seconds}")

        # Validate threshold parameters if provided
        if self.distance_threshold_used is not None:
            if not 0 <= self.distance_threshold_used <= 256:
                raise ValueError(f"distance_threshold_used must be in [0,256], got {self.distance_threshold_used}")

        if self.lowe_ratio_threshold is not None:
            if not 0.0 < self.lowe_ratio_threshold < 1.0:
                raise ValueError(f"lowe_ratio_threshold must be in (0,1), got {self.lowe_ratio_threshold}")

    def _validate_field_relationships(self) -> List[str]:
        """
        Validate mathematical relationships between feature matching fields.

        Returns:
            List of relationship validation violations
        """
        violations = []

        # Validate similarity ratio computation consistency
        if self.normalization_strategy == "total_matches" and self.total_matches > 0:
            expected_ratio = self.good_matches / self.total_matches
            if abs(self.similarity_ratio - expected_ratio) > 1e-6:
                violations.append("similarity_ratio inconsistent with total_matches normalization")

        elif self.normalization_strategy == "min_keypoints":
            min_keypoints = min(self.keypoints_image1, self.keypoints_image2)
            if min_keypoints > 0:
                expected_ratio = min(self.good_matches / min_keypoints, 1.0)
                if abs(self.similarity_ratio - expected_ratio) > 1e-6:
                    violations.append("similarity_ratio inconsistent with min_keypoints normalization")

        # Validate homography inlier ratio consistency
        if (self.homography_inlier_count is not None and
            self.homography_inlier_ratio is not None and
            self.good_matches > 0):
            expected_inlier_ratio = self.homography_inlier_count / self.good_matches
            if abs(self.homography_inlier_ratio - expected_inlier_ratio) > 1e-6:
                violations.append("homography_inlier_ratio inconsistent with inlier_count")

        # Validate geometric verification consistency
        if (self.geometric_verification_passed is True and
            self.homography_inlier_count is not None and
            self.homography_inlier_count < 4):
            violations.append("geometric_verification_passed=True but insufficient inliers for homography")

        return violations

    def compute_match_quality_score(self) -> float:
        """
        Compute comprehensive match quality score based on multiple factors.

        Mathematical Foundation:
        Quality = w₁×similarity + w₂×geometric + w₃×statistical + w₄×efficiency

        Weights optimized for visual similarity assessment:
        - w₁ = 0.4 (similarity ratio primary indicator)
        - w₂ = 0.3 (geometric consistency crucial for reliability)
        - w₃ = 0.2 (statistical confidence important for uncertainty)
        - w₄ = 0.1 (computational efficiency secondary concern)

        Returns:
            Quality score in range [0,1] where higher values indicate better matches
        """
        # Initialize quality components with weights
        quality_components = {}

        # Similarity ratio component (weight: 0.4)
        quality_components['similarity'] = self.similarity_ratio * 0.4

        # Geometric verification component (weight: 0.3)
        if self.homography_inlier_ratio is not None:
            geometric_component = self.homography_inlier_ratio
        elif self.geometric_verification_passed is True:
            geometric_component = 0.8  # High score for passed verification
        elif self.geometric_verification_passed is False:
            geometric_component = 0.2  # Low score for failed verification
        else:
            geometric_component = 0.5  # Neutral score if not performed

        quality_components['geometric'] = geometric_component * 0.3

        # Statistical confidence component (weight: 0.2)
        if self.confidence_level is not None:
            statistical_component = self.confidence_level
        else:
            # Estimate confidence from match statistics
            if self.total_matches > 0:
                # Higher total matches and good match ratio suggest higher confidence
                match_ratio = self.good_matches / self.total_matches
                sample_size_factor = min(self.total_matches / 100.0, 1.0)
                statistical_component = match_ratio * sample_size_factor
            else:
                statistical_component = 0.0

        quality_components['statistical'] = statistical_component * 0.2

        # Computational efficiency component (weight: 0.1)
        total_time = self.matching_time_seconds + self.keypoint_detection_time_seconds
        if total_time > 0:
            # Reward faster processing (exponential decay)
            efficiency_component = np.exp(-total_time / 5.0)
        else:
            efficiency_component = 1.0

        quality_components['efficiency'] = efficiency_component * 0.1

        # Compute total quality score
        total_quality = sum(quality_components.values())

        return min(total_quality, 1.0)  # Ensure score is in [0,1]

    def estimate_false_positive_rate(self) -> float:
        """
        Estimate false positive rate based on matching statistics and thresholds.

        Mathematical Foundation:
        For random binary descriptors with uniform bit distribution:
        P(d_H ≤ τ) = Σᵢ₌₀^τ C(256,i) × p^i × (1-p)^(256-i)
        where p = 0.5 for uniform distribution. The implementation evaluates
        this sum with a normal approximation and a continuity correction.

        Returns:
            Estimated false positive rate [0,1]
        """
        # Use distance threshold if available
        if self.distance_threshold_used is not None:
            # Compute binomial probability for Hamming distance ≤ threshold
            n_bits = 256  # Standard ORB descriptor length
            threshold = self.distance_threshold_used

            # For large n, use normal approximation to binomial
            # μ = n×p, σ² = n×p×(1-p) where p = 0.5 for random bits
            mu = n_bits * 0.5
            sigma = np.sqrt(n_bits * 0.5 * 0.5)

            # Standardize and compute cumulative probability
            z_score = (threshold + 0.5 - mu) / sigma  # Continuity correction
            false_positive_rate = stats.norm.cdf(z_score)

            return min(false_positive_rate, 1.0)

        # Estimate from match statistics if threshold unavailable
        if self.total_matches > 0:
            # Conservative estimate based on match ratio
            # Assume false positive rate scales with observed match density
            match_density = self.good_matches / self.total_matches

            # Apply heuristic correction for typical image matching scenarios
            estimated_fpr = match_density * 0.1  # 10% of matches assumed false positive

            return min(estimated_fpr, 0.5)  # Cap at 50% for conservative estimation

        # Default conservative estimate
        return 0.1

    def compute_statistical_power(self, effect_size: float = 0.5) -> float:
        """
        Compute statistical power for detecting true visual similarity.

        Mathematical Foundation:
        Power = P(reject H₀ | H₁ true) where:
        - H₀: images are not similar
        - H₁: images are visually similar
        - Effect size quantifies similarity magnitude

        Args:
            effect_size: Expected Cohen's d for similarity detection

        Returns:
            Statistical power estimate [0,1]
        """
        # Validate input parameters
        if effect_size <= 0:
            raise ValueError("Effect size must be positive")

        # Use sample size based on keypoint counts
        effective_sample_size = min(self.keypoints_image1, self.keypoints_image2)

        if effective_sample_size < 2:
            return 0.0  # Insufficient sample for meaningful power calculation

        # Compute non-centrality parameter
        ncp = effect_size * np.sqrt(effective_sample_size)

        # Use standard significance level (α = 0.05) for two-tailed test
        alpha = 0.05
        critical_value = stats.norm.ppf(1 - alpha/2)

        # Compute statistical power
        power = 1 - stats.norm.cdf(critical_value - ncp) + stats.norm.cdf(-critical_value - ncp)

        return power

    def generate_matching_diagnostics(self) -> Dict[str, Any]:
        """
        Generate comprehensive diagnostic report for matching performance analysis.

        Returns:
            Dictionary containing detailed diagnostic information
        """
        # Initialize diagnostic report
        diagnostics = {
            'quality_assessment': {
                'overall_quality_score': self.compute_match_quality_score(),
                'estimated_false_positive_rate': self.estimate_false_positive_rate(),
                'statistical_power': self.compute_statistical_power(),
                'confidence_level': self.confidence_level
            },
            'performance_metrics': {
                'matching_efficiency': self._compute_matching_efficiency(),
                'keypoint_density': self._compute_keypoint_density(),
                'computational_cost': self._compute_computational_cost()
            },
            'statistical_analysis': {
                'sample_adequacy': self._assess_sample_adequacy(),
                'normalization_impact': self._analyze_normalization_impact(),
                'threshold_sensitivity': self._analyze_threshold_sensitivity()
            },
            'recommendations': self._generate_matching_recommendations()
        }

        # Add distance statistics if available
        if self.match_distance_statistics:
            diagnostics['distance_statistics'] = {
                'mean_distance': self.match_distance_statistics.mean,
                'distance_variance': self.match_distance_statistics.variance,
                'distance_distribution': self.match_distance_statistics.distribution_type,
                'quality_distribution': self._assess_distance_distribution_quality()
            }

        return diagnostics

    def _compute_matching_efficiency(self) -> float:
        """Compute matching efficiency based on matches per unit time."""
        total_time = self.matching_time_seconds + self.keypoint_detection_time_seconds
        if total_time > 0 and self.total_matches > 0:
            return self.total_matches / total_time
        return 0.0

    def _compute_keypoint_density(self) -> Dict[str, float]:
        """Compute keypoint density metrics for both images."""
        # Assume standard image size for density calculation
        standard_image_area = 640 * 480  # VGA resolution

        return {
            'density_image1': self.keypoints_image1 / standard_image_area,
            'density_image2': self.keypoints_image2 / standard_image_area,
            'density_ratio': (self.keypoints_image2 / max(self.keypoints_image1, 1))
        }

    def _compute_computational_cost(self) -> Dict[str, float]:
        """Compute computational cost metrics."""
        total_comparisons = self.keypoints_image1 * self.keypoints_image2

        return {
            'total_comparisons': total_comparisons,
            'time_per_comparison': (self.matching_time_seconds / max(total_comparisons, 1)) * 1e6,  # microseconds
            'matches_per_second': self.total_matches / max(self.matching_time_seconds, 1e-6)
        }

    def _assess_sample_adequacy(self) -> Dict[str, Any]:
        """Assess adequacy of sample size for statistical inference."""
        min_keypoints = min(self.keypoints_image1, self.keypoints_image2)

        adequacy_assessment = {
            'minimum_sample_size': min_keypoints >= 30,  # Rule of thumb for normal approximation
            'recommended_minimum': 30,
            'actual_minimum': min_keypoints,
            'adequacy_ratio': min_keypoints / 30.0,
            'power_adequate': self.compute_statistical_power() >= 0.8
        }

        return adequacy_assessment

    def _analyze_normalization_impact(self) -> Dict[str, float]:
        """Analyze impact of normalization strategy choice."""
        # Compute the similarity ratio under the alternative normalization strategy
        if self.normalization_strategy == "total_matches":
            # Alternative: normalize by the smaller keypoint count, capped at 1.0
            min_keypoints = min(self.keypoints_image1, self.keypoints_image2)
            alt_ratio = min(self.good_matches / min_keypoints, 1.0) if min_keypoints > 0 else 0.0
        else:
            # Alternative: normalize by the total number of matches
            alt_ratio = self.good_matches / max(self.total_matches, 1)
        impact = abs(self.similarity_ratio - alt_ratio)

        return {
            'current_ratio': self.similarity_ratio,
            'alternative_ratio': alt_ratio,
            'normalization_impact': impact,
            'strategy_sensitivity': impact / max(self.similarity_ratio, 1e-6)
        }

    def _analyze_threshold_sensitivity(self) -> Dict[str, Any]:
        """Analyze sensitivity to threshold parameter choices."""
        sensitivity_analysis = {
            'threshold_used': self.distance_threshold_used,
            'estimated_sensitivity': None,
            'robustness_score': None
        }

        # Estimate threshold sensitivity if distance statistics available
        if self.match_distance_statistics and self.distance_threshold_used is not None:
            # Approximate sensitivity as the spread of match distances relative to the threshold
            threshold_std = self.match_distance_statistics.standard_deviation
            sensitivity_estimate = threshold_std / max(self.distance_threshold_used, 1)

            sensitivity_analysis.update({
                'estimated_sensitivity': sensitivity_estimate,
                'robustness_score': 1.0 / (1.0 + sensitivity_estimate),
                'distance_std': threshold_std
            })

        return sensitivity_analysis

    def _assess_distance_distribution_quality(self) -> Dict[str, Any]:
        """Assess quality of distance distribution for reliable matching."""
        if not self.match_distance_statistics:
            return {'status': 'no_data'}

        stats_obj = self.match_distance_statistics

        return {
            'distribution_type': stats_obj.distribution_type,
            'normality_p_value': stats_obj.p_value,
            'distribution_quality': 'good' if stats_obj.goodness_of_fit and stats_obj.goodness_of_fit > 0.05 else 'poor',
            'skewness': stats_obj.skewness,
            'kurtosis': stats_obj.kurtosis,
            'separability': self._compute_distance_separability()
        }

    def _compute_distance_separability(self) -> float:
        """Compute separability between good and bad matches based on distance distribution."""
        if not self.match_distance_statistics:
            return 0.0

        # Estimate separability using distance statistics
        # Good separability means good matches have clearly different distance distribution
        mean_distance = self.match_distance_statistics.mean
        std_distance = self.match_distance_statistics.standard_deviation

        # Compute coefficient of variation as separability proxy
        if mean_distance > 0:
            cv = std_distance / mean_distance
            # Lower CV suggests better separability
            separability = 1.0 / (1.0 + cv)
        else:
            separability = 0.0

        return separability

    def _generate_matching_recommendations(self) -> List[str]:
        """Generate actionable recommendations for improving matching performance."""
        recommendations = []

        # Quality-based recommendations
        quality_score = self.compute_match_quality_score()
        if quality_score < 0.3:
            recommendations.append("Low match quality detected - consider adjusting detection parameters")
        elif quality_score > 0.8:
            recommendations.append("High match quality achieved - current parameters are optimal")

        # Sample size recommendations
        min_keypoints = min(self.keypoints_image1, self.keypoints_image2)
        if min_keypoints < 30:
            recommendations.append("Low keypoint count - consider reducing detection threshold or increasing image resolution")

        # Geometric verification recommendations
        if self.geometric_verification_passed is False:
            recommendations.append("Geometric verification failed - images may not be related or contain repeated patterns")
        elif self.geometric_verification_passed is None and self.good_matches > 10:
            recommendations.append("Consider adding geometric verification for improved reliability")

        # Performance recommendations
        if self.matching_time_seconds > 1.0:
            recommendations.append("Slow matching performance - consider optimizing keypoint count or using approximate matching")

        # Threshold recommendations
        if self.distance_threshold_used is not None and self.distance_threshold_used > 100:
            recommendations.append("High distance threshold may include unreliable matches - consider reducing threshold")

        return recommendations
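
The normal approximation used in `estimate_false_positive_rate` can be checked against the exact binomial sum from its docstring. The sketch below is a standalone illustration and not part of the toolkit: the function names `exact_fpr` and `normal_approx_fpr` are hypothetical, and it assumes 256-bit descriptors whose bits are independent and uniformly random.

```python
import math

N_BITS = 256  # assumed descriptor length in bits (32-byte ORB descriptor)


def exact_fpr(threshold: int, n_bits: int = N_BITS) -> float:
    """Exact P(d_H <= threshold) for two random descriptors, i.e. Binomial(n_bits, 0.5)."""
    if threshold < 0:
        return 0.0
    upper = min(threshold, n_bits)
    # Sum binomial coefficients as exact integers, then divide once.
    favourable = sum(math.comb(n_bits, i) for i in range(upper + 1))
    return favourable / 2 ** n_bits


def normal_approx_fpr(threshold: int, n_bits: int = N_BITS) -> float:
    """Normal approximation with continuity correction (same formula as the method)."""
    mu = n_bits * 0.5
    sigma = math.sqrt(n_bits * 0.25)
    z = (threshold + 0.5 - mu) / sigma
    # Standard normal CDF written in terms of the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Compare both estimates at a few hypothetical thresholds.
for tau in (64, 100, 128):
    print(f"tau={tau}: exact={exact_fpr(tau):.3e}, approx={normal_approx_fpr(tau):.3e}")
```

Near the mean (τ ≈ 128) the two values agree closely. Far into the lower tail they can differ noticeably in relative terms, so approximated rates for small thresholds are best read as order-of-magnitude estimates.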

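The power calculation in `compute_statistical_power` reduces to a two-sided z-test with non-centrality parameter `effect_size × sqrt(n)`. The following sketch reproduces that formula with only the standard library so the numbers can be explored without SciPy. The function name `two_sided_power` is hypothetical, and treating the smaller keypoint count as the sample size is the same assumption the method makes.

```python
from statistics import NormalDist


def two_sided_power(effect_size: float, sample_size: int, alpha: float = 0.05) -> float:
    """Power of a two-sided z-test for a given effect size and sample size."""
    if effect_size <= 0:
        raise ValueError("Effect size must be positive")
    if sample_size < 2:
        return 0.0  # too few samples for a meaningful estimate

    std_normal = NormalDist()  # mean 0, standard deviation 1
    ncp = effect_size * sample_size ** 0.5          # non-centrality parameter
    z_crit = std_normal.inv_cdf(1 - alpha / 2)      # two-sided critical value

    # Probability of landing in either rejection region under H1.
    return 1 - std_normal.cdf(z_crit - ncp) + std_normal.cdf(-z_crit - ncp)


# Power at the default effect size for a few hypothetical keypoint counts.
for n in (10, 30, 32, 100):
    print(f"n={n}: power={two_sided_power(0.5, n):.3f}")
```

Under these assumptions and the default effect size of 0.5, a minimum of 30 keypoints yields power slightly below 0.8, while 32 keypoints crosses it. This is why `_assess_sample_adequacy` can report `minimum_sample_size=True` and `power_adequate=False` for the same result.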


# Main ImageSimilarityDetector Implementation

class ImageSimilarityDetector:
    """
    Production-grade image similarity detection system implementing multiple algorithms:

    1. Perceptual hash difference using DCT-based pHash algorithm
    2. ORB feature matching with configurable normalization strategies
    3. Color histogram correlation with multiple distance metrics
    4. CLIP embedding cosine similarity using vision transformers
    5. Automated Google reverse image search with robust web scraping

    Implements enterprise-grade resource management, thread safety, dependency injection,
    comprehensive error handling, and mathematical rigor per academic literature.
    """

    def __init__(
        self,
        orb_detector: Optional[FeatureDetectorProtocol] = None,
        matcher: Optional[MatcherProtocol] = None,
        clip_model_loader: Optional[ClipModelLoaderProtocol] = None,
        device: Optional[str] = None,
        clip_model_name: str = "ViT-B/32",
        allow_symlinks: bool = False,
        max_initialization_retries: int = 3
    ) -> None:
        """
        Initialize ImageSimilarityDetector with dependency injection and SOLID principles.

        Implements the Dependency Inversion Principle by accepting abstract interfaces
        rather than concrete implementations, enabling testability and modularity.

        Args:
            orb_detector: Injectable ORB detector conforming to FeatureDetectorProtocol
            matcher: Injectable matcher conforming to MatcherProtocol
            clip_model_loader: Injectable CLIP loader conforming to ClipModelLoaderProtocol
            device: Target device string ('cuda', 'cpu', or specific device)
            clip_model_name: CLIP model variant identifier
            allow_symlinks: Whether to permit symlink resolution in path validation
            max_initialization_retries: Maximum retry attempts for resource initialization

        Raises:
            ValueError: If injectable dependencies do not conform to required protocols
            InitializationError: If default resource creation fails after retries
        """
        # Store configuration parameters for lazy initialization
        self._clip_model_name: str = clip_model_name
        self._allow_symlinks: bool = allow_symlinks
        self._max_retries: int = max_initialization_retries

        # Set up logging first: device detection and the initialization helpers
        # below emit warnings through self._logger
        self._logger = logging.getLogger(f"{__name__}.{self.__class__.__name__}")

        # Detect optimal device with user override capability
        self.device: str = self._determine_optimal_device(device)

        # Initialize dependency injection with protocol validation
        self._initialize_feature_detector(orb_detector)
        self._initialize_matcher(matcher)
        self._initialize_clip_loader(clip_model_loader)

        # Initialize CLIP model state placeholders for lazy loading
        self.clip_model: Optional[torch.nn.Module] = None
        self.clip_preprocess: Optional[Callable] = None

        # Create thread-safe initialization lock for concurrent access
        self._initialization_lock: threading.Lock = threading.Lock()

    def _determine_optimal_device(self, device_override: Optional[str]) -> str:
        """
        Determine optimal computational device with fallback hierarchy.

        Args:
            device_override: User-specified device string override

        Returns:
            Device string ("cuda" or "cpu" when auto-detected; an explicit
            override is returned unchanged)
        """
        # Honor explicit user device specification
        if device_override is not None:
            return device_override

        # Attempt CUDA detection with graceful fallback
        try:
            # Check CUDA availability and device count
            if torch.cuda.is_available() and torch.cuda.device_count() > 0:
                return "cuda"
        except Exception as e:
            # Log CUDA detection failure and continue with CPU fallback
            self._logger.warning(f"CUDA detection failed, using CPU: {e}")

        # Default to CPU for maximum compatibility
        return "cpu"

    def _initialize_feature_detector(
        self,
        detector: Optional[FeatureDetectorProtocol]
    ) -> None:
        """
        Initialize ORB feature detector with protocol validation and error handling.

        Args:
            detector: Injectable detector or None for default creation

        Raises:
            ValueError: If detector does not conform to FeatureDetectorProtocol
            OpenCVInitializationError: If default detector creation fails
        """
        if detector is not None:
            # Validate injectable detector conforms to required protocol
            if not hasattr(detector, 'detectAndCompute'):
                raise ValueError(
                    "Provided detector must implement FeatureDetectorProtocol "
                    "with detectAndCompute method"
                )
            # Store validated injectable detector
            self.orb_detector = detector
        else:
            # Create default ORB detector with retry logic
            for attempt in range(self._max_retries):
                try:
                    # Attempt default detector creation through factory
                    self.orb_detector = DefaultFeatureDetectorFactory.create()
                    break
                except (OpenCVInitializationError, ResourceAllocationError) as e:
                    # Log attempt failure and retry if attempts remaining
                    if attempt < self._max_retries - 1:
                        self._logger.warning(f"Detector creation attempt {attempt + 1} failed: {e}")
                        continue
                    # Re-raise if all retry attempts exhausted
                    raise

    def _initialize_matcher(self, matcher: Optional[MatcherProtocol]) -> None:
        """
        Initialize BFMatcher with protocol validation and error handling.

        Args:
            matcher: Injectable matcher or None for default creation

        Raises:
            ValueError: If matcher does not conform to MatcherProtocol
            OpenCVInitializationError: If default matcher creation fails
        """
        if matcher is not None:
            # Validate injectable matcher conforms to required protocol methods
            required_methods = ['match', 'knnMatch']
            for method_name in required_methods:
                if not hasattr(matcher, method_name):
                    raise ValueError(
                        f"Provided matcher must implement MatcherProtocol "
                        f"with {method_name} method"
                    )
            # Store validated injectable matcher
            self.bf_matcher = matcher
        else:
            # Create default BFMatcher with retry logic for resource allocation
            for attempt in range(self._max_retries):
                try:
                    # Attempt default matcher creation through factory
                    self.bf_matcher = DefaultMatcherFactory.create()
                    break
                except OpenCVInitializationError as e:
                    # Log attempt failure and retry if attempts remaining
                    if attempt < self._max_retries - 1:
                        self._logger.warning(f"Matcher creation attempt {attempt + 1} failed: {e}")
                        continue
                    # Re-raise if all retry attempts exhausted
                    raise

    def _initialize_clip_loader(
        self,
        loader: Optional[ClipModelLoaderProtocol]
    ) -> None:
        """
        Initialize CLIP model loader with protocol validation.

        Args:
            loader: Injectable loader or None for default creation

        Raises:
            ValueError: If the provided loader is not callable
        """
        if loader is not None:
            # Validate injectable loader is callable
            if not callable(loader):
                raise ValueError("Provided clip_model_loader must be callable")
            # Store validated injectable loader
            self.clip_model_loader = loader
        else:
            # Create default CLIP loader instance
            self.clip_model_loader = DefaultClipModelLoader()

    @staticmethod
    def _validate_image_path(
        image_path: Union[str, Path],
        allow_symlinks: bool = False
    ) -> Path:
        """
        Comprehensive path validation with security and accessibility checks.

        Enforces the configurable symlink policy and checks existence, file type,
        and read permission. These are point-in-time checks: the file can still
        change between validation and the later read.

        Args:
            image_path: Path string or Path object to validate
            allow_symlinks: Whether to permit symlink resolution

        Returns:
            Validated Path object pointing to accessible image file

        Raises:
            ImageNotFoundError: If path does not exist in filesystem
            NotAFileError: If path exists but is not a regular file
            SymlinkNotAllowedError: If symlinks encountered but not permitted
            PermissionDeniedError: If insufficient read permissions
        """
        # Convert input to pathlib.Path for uniform handling
        raw_path = Path(image_path)

        # Enforce symlink policy before resolving: resolve() follows symlinks,
        # so is_symlink() on the resolved path would always be False
        if raw_path.is_symlink() and not allow_symlinks:
            raise SymlinkNotAllowedError(f"Symlinks not permitted: {raw_path}")

        # Normalize to an absolute path with symlinks resolved
        path = raw_path.resolve()

        # Verify path existence in filesystem
        if not path.exists():
            raise ImageNotFoundError(f"Image file does not exist: {path}")

        # Validate path points to regular file (not directory, device, or fifo)
        if not path.is_file():
            raise NotAFileError(f"Path is not a regular file: {path}")

        # Check read permissions using OS-level access control
        if not os.access(path, os.R_OK):
            raise PermissionDeniedError(f"Insufficient read permissions: {path}")

        # Return validated, normalized path
        return path

    def _load_clip_with_double_checked_locking(self) -> None:
        """
        Thread-safe lazy loading of CLIP model using double-checked locking pattern.

        Implements proper double-checked locking to prevent race conditions while
        minimizing lock contention for already-loaded models. Handles network failures,
        CUDA errors, and memory constraints with comprehensive fallback strategies.

        Raises:
            ModelLoadError: If model loading fails after all retry attempts
        """
        # First check without lock acquisition for performance optimization
        if self.clip_model is not None and self.clip_preprocess is not None:
            return

        # Acquire initialization lock for thread-safe model loading
        with self._initialization_lock:
            # Second check after lock acquisition to prevent duplicate loading
            if self.clip_model is not None and self.clip_preprocess is not None:
                return

            # Attempt model loading with comprehensive error handling
            loading_errors = []
            for attempt in range(self._max_retries):
                try:
                    # Attempt CLIP model loading through injectable loader
                    self.clip_model, self.clip_preprocess = self.clip_model_loader(
                        self._clip_model_name,
                        self.device
                    )
                    # Log successful loading for monitoring
                    self._logger.info(f"CLIP model {self._clip_model_name} loaded on {self.device}")
                    return

                except torch.cuda.OutOfMemoryError as e:
                    # Handle GPU memory exhaustion with cache clearing.
                    # This handler must precede the RuntimeError handler below,
                    # because OutOfMemoryError is a RuntimeError subclass.
                    loading_errors.append(e)
                    self._logger.warning(f"GPU OOM during CLIP loading: {e}")
                    torch.cuda.empty_cache()

                    # Fallback to CPU if not already attempted
                    if self.device != "cpu":
                        self.device = "cpu"
                        continue

                except (OSError, RuntimeError, ModuleNotFoundError) as e:
                    # Store error for potential re-raising
                    loading_errors.append(e)
                    # Log attempt failure for debugging
                    self._logger.warning(f"CLIP loading attempt {attempt + 1} failed: {e}")

                    # Attempt device fallback on CUDA errors
                    if "cuda" in str(e).lower() and self.device != "cpu":
                        self._logger.info("Attempting CPU fallback for CLIP loading")
                        self.device = "cpu"
                        continue

            # Raise comprehensive error if all attempts failed
            error_summary = "; ".join(str(e) for e in loading_errors)
            raise ModelLoadError(
                f"Failed to load CLIP model {self._clip_model_name} after "
                f"{self._max_retries} attempts: {error_summary}"
            )

    def release_clip_resources(self) -> None:
        """
        Explicitly release CLIP model resources for memory management.

        Implements graceful resource cleanup for long-running processes or
        memory-constrained environments. Thread-safe and idempotent.
        """
        # Acquire lock to ensure thread-safe resource cleanup
        with self._initialization_lock:
            # Clear model references to enable garbage collection
            if self.clip_model is not None:
                self.clip_model = None
                self._logger.info("CLIP model references cleared")

            # Clear preprocessing function reference
            if self.clip_preprocess is not None:
                self.clip_preprocess = None
                self._logger.info("CLIP preprocessing function cleared")

            # Clear GPU memory cache if CUDA is available
            if torch.cuda.is_available():
                torch.cuda.empty_cache()
                self._logger.info("CUDA memory cache cleared")

    def feature_match_ratio(
        self,
        image1: Union[str, Path, np.ndarray],
        image2: Union[str, Path, np.ndarray],
        distance_threshold: int = 50,
        normalization_strategy: Literal["total_matches", "min_keypoints"] = "total_matches",
        apply_ratio_test: bool = False,
        ratio_threshold: float = 0.75,
        resize_max_side: Optional[int] = None,
        return_detailed_result: bool = False
    ) -> Union[float, FeatureMatchResult]:
        """
        Compute feature matching similarity using ORB descriptors and configurable normalization.

        Mathematical Foundation:
        1. ORB Feature Detection: Oriented FAST + Rotated BRIEF
           - FAST corners: C = {p : |I(p) - I(x)| > τ for x ∈ circle(p)}
           - Orientation: θ = atan2(m₀₁, m₁₀) where m_pq = Σ_(u,v) u^p v^q I(u,v)
           - BRIEF descriptors: Binary strings from intensity comparisons

        2. Hamming Distance Matching: d_H(D₁,D₂) = Σᵢ D₁[i] ⊕ D₂[i]

        3. Normalization Strategies:
           - Total matches: ratio = |good_matches| / |total_matches|
           - Min keypoints: ratio = |good_matches| / min(|KP₁|, |KP₂|)

        4. Lowe's Ratio Test: accept match if d₁/d₂ < τ where d₁ < d₂

        Args:
            image1: First image as path or numpy array
            image2: Second image as path or numpy array
            distance_threshold: Maximum Hamming distance for good match
            normalization_strategy: Method for ratio computation
            apply_ratio_test: Whether to use Lowe's ratio test instead of threshold
            ratio_threshold: Threshold for Lowe's ratio test (typically 0.75)
            resize_max_side: Optional downscaling to limit computational cost
            return_detailed_result: If True, return comprehensive FeatureMatchResult

        Returns:
            Similarity ratio [0,1] or detailed FeatureMatchResult object

        Raises:
            ValueError: If parameters are invalid or incompatible
            RuntimeError: If image loading or preparation, feature detection,
                or descriptor matching fails (underlying errors such as a
                missing file are chained as the cause)
        """
        # Validate normalization strategy parameter
        valid_strategies = {"total_matches", "min_keypoints"}
        if normalization_strategy not in valid_strategies:
            raise ValueError(f"Invalid normalization_strategy: {normalization_strategy}")

        # Validate distance threshold for Hamming distance (256-bit ORB descriptors,
        # so distances lie in [0, 256])
        if distance_threshold < 0 or distance_threshold > 256:
            raise ValueError(f"distance_threshold must be in [0,256], got {distance_threshold}")

        # Validate ratio threshold for Lowe's test
        if not 0.0 < ratio_threshold < 1.0:
            raise ValueError(f"ratio_threshold must be in (0,1), got {ratio_threshold}")

        # Validate resize parameter if provided
        if resize_max_side is not None and resize_max_side <= 0:
            raise ValueError(f"resize_max_side must be positive, got {resize_max_side}")

        # Helper function for comprehensive image loading and preprocessing
        def _load_and_prepare_image(
            img_input: Union[str, Path, np.ndarray]
        ) -> np.ndarray:
            """Load image and prepare for ORB feature detection."""

            # Handle path input with validation
            if isinstance(img_input, (str, Path)):
                # Validate image path exists and is readable
                validated_path = self._validate_image_path(img_input, self._allow_symlinks)

                # Load image in BGR color format for OpenCV processing
                image_array = cv2.imread(str(validated_path), cv2.IMREAD_COLOR)
                if image_array is None:
                    raise RuntimeError(f"OpenCV failed to load image: {validated_path}")

            # Handle numpy array input with validation
            elif isinstance(img_input, np.ndarray):
                # Validate array dimensions for image data
                if img_input.ndim not in [2, 3]:
                    raise ValueError(f"Invalid array dimensions: {img_input.ndim}")

                # Copy input array to prevent mutation of original data
                image_array = img_input.copy()

                # Ensure 3-channel color image for consistent processing
                if image_array.ndim == 2:
                    # Convert grayscale to 3-channel
                    image_array = cv2.cvtColor(image_array, cv2.COLOR_GRAY2BGR)
                elif image_array.shape[2] == 4:
                    # Remove alpha channel if present
                    image_array = cv2.cvtColor(image_array, cv2.COLOR_BGRA2BGR)

            else:
                raise ValueError(f"Unsupported image input type: {type(img_input)}")

            # Apply optional resizing with aspect ratio preservation
            if resize_max_side is not None:
                # Get current image dimensions
                height, width = image_array.shape[:2]
                current_max_side = max(height, width)

                # Resize only if current size exceeds limit
                if current_max_side > resize_max_side:
                    # Compute scale factor preserving aspect ratio
                    scale_factor = resize_max_side / current_max_side
                    # Guard against a zero-sized dimension for extreme aspect ratios
                    new_width = max(1, int(width * scale_factor))
                    new_height = max(1, int(height * scale_factor))

                    # Apply high-quality resizing using area interpolation
                    image_array = cv2.resize(
                        image_array,
                        (new_width, new_height),
                        interpolation=cv2.INTER_AREA
                    )
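                    # Worked example (illustrative numbers only): a 4000x3000 image with
                    # resize_max_side=1000 gives scale_factor = 1000 / 4000 = 0.25,
                    # so it is resized to 1000x750 and the aspect ratio is preserved.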

            # Convert to grayscale for ORB feature detection
            if image_array.ndim == 3:
                grayscale_array = cv2.cvtColor(image_array, cv2.COLOR_BGR2GRAY)
            else:
                grayscale_array = image_array

            return grayscale_array

        # Load and prepare both images for feature detection
        try:
            prepared_image1 = _load_and_prepare_image(image1)
            prepared_image2 = _load_and_prepare_image(image2)
        except Exception as e:
            raise RuntimeError(f"Image preparation failed: {e}") from e

        # Detect keypoints and compute ORB descriptors for both images
        try:
            # Image 1: Detect FAST corners and compute BRIEF descriptors
            keypoints1, descriptors1 = self.orb_detector.detectAndCompute(prepared_image1, None)
            # Image 2: Detect FAST corners and compute BRIEF descriptors
            keypoints2, descriptors2 = self.orb_detector.detectAndCompute(prepared_image2, None)
        except Exception as e:
            raise RuntimeError(f"ORB feature detection failed: {e}") from e

        # Handle cases where no descriptors are detected
        if descriptors1 is None or descriptors2 is None or len(descriptors1) == 0 or len(descriptors2) == 0:
            # Return zero similarity with detailed result if requested
            if return_detailed_result:
                return FeatureMatchResult(
                    similarity_ratio=0.0,
                    total_matches=0,
                    good_matches=0,
                    keypoints_image1=len(keypoints1) if keypoints1 else 0,
                    keypoints_image2=len(keypoints2) if keypoints2 else 0,
                    normalization_strategy=normalization_strategy,
                    confidence_level=0.0
                )
            return 0.0

        # Perform descriptor matching using configured algorithm
        good_matches = []
        total_matches = 0

        if apply_ratio_test:
            # Apply Lowe's ratio test using k-nearest neighbor matching
            try:
                # Find 2 nearest neighbors for each descriptor in image1
                knn_matches = self.bf_matcher.knnMatch(descriptors1, descriptors2, k=2)

                # Apply ratio test: accept if d₁/d₂ < threshold
                for match_pair in knn_matches:
                    if len(match_pair) == 2:
                        best_match, second_match = match_pair
                        # Lowe's ratio test criterion
                        if best_match.distance < ratio_threshold * second_match.distance:
                            good_matches.append(best_match)
                        total_matches += 1
                    elif len(match_pair) == 1:
                        # Only one match found - accept it
                        good_matches.append(match_pair[0])
                        total_matches += 1
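
                # Worked example (illustrative distances only) with ratio_threshold = 0.75:
                #   d1 = 30, d2 = 50 -> 30 < 0.75 * 50 = 37.5, so the match is accepted
                #   d1 = 45, d2 = 50 -> 45 < 37.5 is false, so the match is rejected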

            except Exception as e:
                raise RuntimeError(f"KNN matching failed: {e}") from e

        else:
            # Use simple distance threshold matching
            try:
                # Perform brute-force matching (cross-checking applies only if the
                # matcher was configured with it)
                matches = self.bf_matcher.match(descriptors1, descriptors2)
                total_matches = len(matches)

                # Filter matches by Hamming distance threshold
                good_matches = [
                    match for match in matches
                    if match.distance <= distance_threshold
                ]
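                # Worked example (illustrative 4-bit descriptors; ORB uses 256 bits):
                #   0b1011 XOR 0b0010 = 0b1001, which has two set bits, so d_H = 2
                #   and the match is kept whenever distance_threshold >= 2.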

            except Exception as e:
                raise RuntimeError(f"Distance threshold matching failed: {e}") from e

        # Handle edge case of no matches found
        if total_matches == 0:
            similarity_ratio = 0.0
        else:
            # Compute similarity ratio using specified normalization strategy
            if normalization_strategy == "total_matches":
                # Normalize by total number of attempted matches
                similarity_ratio = float(len(good_matches)) / float(total_matches)
            else:  # "min_keypoints"
                # Normalize by minimum number of detected keypoints
                min_keypoints = min(len(keypoints1), len(keypoints2))
                # Avoid division by zero if no keypoints detected
                if min_keypoints == 0:
                    similarity_ratio = 0.0
                else:
                    similarity_ratio = float(len(good_matches)) / float(min_keypoints)
                    # Clamp ratio to [0,1] range for min_keypoints strategy
                    similarity_ratio = min(similarity_ratio, 1.0)
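
        # Worked example (illustrative numbers only): with 30 good matches out of
        # 120 attempted matches and min(|KP1|, |KP2|) = 400 keypoints,
        #   "total_matches" -> 30 / 120 = 0.25
        #   "min_keypoints" -> 30 / 400 = 0.075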

        # Compute confidence level based on match statistics
        confidence_level = None
        if total_matches > 0:
            # Simple confidence based on absolute number of good matches
            confidence_level = min(float(len(good_matches)) / 50.0, 1.0)  # Normalize to [0,1]

        # Return detailed result object if requested
        if return_detailed_result:
            return FeatureMatchResult(
                similarity_ratio=similarity_ratio,
                total_matches=total_matches,
                good_matches=len(good_matches),
                keypoints_image1=len(keypoints1),
                keypoints_image2=len(keypoints2),
                normalization_strategy=normalization_strategy,
                confidence_level=confidence_level
            )

        # Return simple similarity ratio
        return similarity_ratio

    def histogram_correlation(
        self,
        image1: Union[str, Path, np.ndarray],
        image2: Union[str, Path, np.ndarray],
        bins: Union[int, Tuple[int, int]] = (50, 60),
        metric: Literal["correlation", "chi-square", "intersection", "bhattacharyya"] = "correlation",
        mask1: Optional[np.ndarray] = None,
        mask2: Optional[np.ndarray] = None,
        preserve_aspect: bool = True,
        resize_size: Tuple[int, int] = (256, 256),
        on_zero_histogram: Literal["error", "zero", "nan"] = "error",
        color_space: Literal["HSV", "RGB", "LAB"] = "HSV"
    ) -> float:
        """
        Compute image similarity via statistical comparison of color histograms.

        Mathematical Foundation:
        1. Color space transformation: I' = transform(I, color_space)
        2. Histogram computation: H_c(i) = Σₓᵧ 𝟙{bin(I'_c(x,y)) = i} for channel c
        3. Normalization: Ĥ_c = H_c / Σᵢ H_c[i] (L1 normalization)
        4. Distance metrics:
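           - Correlation (Pearson): ρ = Σᵢ(h₁[i]-μ₁)(h₂[i]-μ₂) / √(Σᵢ(h₁[i]-μ₁)² · Σᵢ(h₂[i]-μ₂)²)
             where μₖ is the mean bin value of hₖ
           - Chi-Square (as computed by cv2.HISTCMP_CHISQR): χ² = Σᵢ (h₁[i]-h₂[i])² / h₁[i]
           - Intersection: I = Σᵢ min(h₁[i], h₂[i])
           - Bhattacharyya (Hellinger form, as computed by OpenCV):
             d_B = √(1 - Σᵢ √(h₁[i]h₂[i]) / √(μ₁μ₂N²)) where N is the number of bins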

        Args:
            image1: First image as path or numpy array
            image2: Second image as path or numpy array
            bins: Number of bins per channel (int), or a 2-tuple giving the bins
                for the first two channels (H and S for HSV); a third channel,
                when used, gets 50 bins
            metric: Statistical distance/similarity metric
            mask1: Optional binary mask for image1 region of interest
            mask2: Optional binary mask for image2 region of interest
            preserve_aspect: Whether to maintain aspect ratio during resize
            resize_size: Target dimensions (width, height) for preprocessing
            on_zero_histogram: Behavior when zero histogram encountered
            color_space: Color space for histogram computation

        Returns:
            Similarity score or distance value per selected metric

        Raises:
            HistogramError: On computation failures or invalid parameters
            ValueError: On invalid parameter combinations
        """
        # Validate metric selection against supported algorithms
        supported_metrics = {"correlation", "chi-square", "intersection", "bhattacharyya"}
        if metric not in supported_metrics:
            raise HistogramError(f"Unsupported metric: {metric}. Supported: {supported_metrics}")

        # Map metric names to OpenCV constants for implementation
        metric_constants = {
            "correlation": cv2.HISTCMP_CORREL,      # Higher values = more similar
            "chi-square": cv2.HISTCMP_CHISQR,      # Lower values = more similar
            "intersection": cv2.HISTCMP_INTERSECT, # Higher values = more similar
            "bhattacharyya": cv2.HISTCMP_BHATTACHARYYA  # Lower values = more similar
        }

        # Validate and normalize bins parameter
        if isinstance(bins, int):
            if bins <= 0:
                raise ValueError(f"bins must be positive, got {bins}")
            # Use same number of bins for all channels
            hist_bins = [bins] * 3
        elif isinstance(bins, (tuple, list)) and len(bins) == 2:
            if any(b <= 0 for b in bins):
                raise ValueError(f"All bin counts must be positive, got {bins}")
            # Use specified bins for H and S channels, default for V/B channel
            hist_bins = [bins[0], bins[1], 50]
        else:
            raise ValueError(f"bins must be int or 2-tuple, got {type(bins)}")

        # Validate color space selection
        supported_color_spaces = {"HSV", "RGB", "LAB"}
        if color_space not in supported_color_spaces:
            raise ValueError(f"Unsupported color_space: {color_space}")

        # Helper function for comprehensive image loading and preprocessing
        def _load_and_prepare_image(
            img_input: Union[str, Path, np.ndarray],
            mask: Optional[np.ndarray] = None
        ) -> Tuple[np.ndarray, Optional[np.ndarray]]:
            """Load image and prepare for histogram computation with optional masking."""

            # Handle path-based image loading
            if isinstance(img_input, (str, Path)):
                # Validate path accessibility
                validated_path = self._validate_image_path(img_input, self._allow_symlinks)

                # Load image using OpenCV in BGR format
                image_array = cv2.imread(str(validated_path), cv2.IMREAD_COLOR)
                if image_array is None:
                    raise HistogramError(f"Failed to load image: {validated_path}")

            # Handle numpy array input
            elif isinstance(img_input, np.ndarray):
                # Validate array structure for image data
                if img_input.ndim not in [2, 3]:
                    raise HistogramError(f"Invalid array dimensions: {img_input.ndim}")

                # Copy to prevent mutation of original data
                image_array = img_input.copy()

                # Ensure 3-channel format for consistent processing
                if image_array.ndim == 2:
                    # Convert grayscale to BGR
                    image_array = cv2.cvtColor(image_array, cv2.COLOR_GRAY2BGR)
                elif image_array.shape[2] == 4:
                    # Remove alpha channel
                    image_array = cv2.cvtColor(image_array, cv2.COLOR_BGRA2BGR)
                elif image_array.shape[2] != 3:
                    raise HistogramError(f"Unsupported channel count: {image_array.shape[2]}")

            else:
                raise HistogramError(f"Unsupported image type: {type(img_input)}")

            # Apply resizing with optional aspect ratio preservation
            def _resize_image(img: np.ndarray) -> np.ndarray:
                """Resize image according to configuration parameters."""
                current_height, current_width = img.shape[:2]
                target_width, target_height = resize_size

                if preserve_aspect:
                    # Compute scaling factor preserving aspect ratio
                    scale_factor = min(
                        target_width / current_width,
                        target_height / current_height
                    )
                    # Compute new dimensions maintaining aspect ratio, guarding
                    # against a zero-sized dimension for extreme aspect ratios
                    new_width = max(1, int(current_width * scale_factor))
                    new_height = max(1, int(current_height * scale_factor))
                    # Apply high-quality area interpolation for downscaling
                    return cv2.resize(img, (new_width, new_height), interpolation=cv2.INTER_AREA)
                else:
                    # Direct resize to target dimensions
                    return cv2.resize(img, (target_width, target_height), interpolation=cv2.INTER_AREA)

            # Apply resizing to image
            resized_image = _resize_image(image_array)

            # Process mask if provided
            processed_mask = None
            if mask is not None:
                # Validate mask dimensions match original image
                if mask.shape[:2] != image_array.shape[:2]:
                    raise HistogramError(f"Mask shape {mask.shape} incompatible with image shape {image_array.shape}")

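                # Resize mask to match processed image; nearest-neighbor
                # interpolation keeps mask boundaries from being blurred
                resized_mask = cv2.resize(
                    mask.astype(np.uint8),
                    resized_image.shape[:2][::-1],
                    interpolation=cv2.INTER_NEAREST
                )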
                # Ensure binary mask values
                processed_mask = (resized_mask > 0).astype(np.uint8) * 255

            return resized_image, processed_mask

        # Load and prepare both images with their respective masks
        try:
            prepared_image1, processed_mask1 = _load_and_prepare_image(image1, mask1)
            prepared_image2, processed_mask2 = _load_and_prepare_image(image2, mask2)
        except Exception as e:
            raise HistogramError(f"Image preparation failed: {e}") from e

        # Apply color space transformation based on configuration
        def _convert_color_space(img: np.ndarray) -> np.ndarray:
            """Convert image to specified color space for histogram computation."""
            if color_space == "HSV":
                # Convert BGR to HSV, which separates hue and saturation from brightness
                return cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
            elif color_space == "RGB":
                # Convert BGR to RGB for standard RGB analysis
                return cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
            elif color_space == "LAB":
                # Convert BGR to CIELAB for perceptually uniform color space
                return cv2.cvtColor(img, cv2.COLOR_BGR2LAB)
            else:
                raise HistogramError(f"Unsupported color space: {color_space}")

        # Transform both images to target color space
        try:
            color_image1 = _convert_color_space(prepared_image1)
            color_image2 = _convert_color_space(prepared_image2)
        except Exception as e:
            raise HistogramError(f"Color space conversion failed: {e}") from e

        # Define histogram computation ranges for different color spaces
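        if color_space == "HSV":
            # HSV ranges: H[0,179], S[0,255] in OpenCV (upper bounds are exclusive).
            # Only H and S are histogrammed, and cv2.calcHist expects one
            # (low, high) pair per selected channel, so V's range is omitted
            hist_ranges = [0, 180, 0, 256]
            channels = [0, 1]  # Use H and S channels for robustness to illumination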
        elif color_space in ["RGB", "LAB"]:
            # RGB/LAB ranges: [0,255] for all channels
            hist_ranges = [0, 256, 0, 256, 0, 256]
            channels = [0, 1, 2]  # Use all three channels

        # Compute normalized histograms for both images
        try:
            # Image 1: Compute multi-dimensional histogram
            histogram1 = cv2.calcHist(
                [color_image1],           # Source image list
                channels,                 # Channel indices
                processed_mask1,          # Mask (None = full image)
                hist_bins[:len(channels)],# Bins per channel
                hist_ranges              # Value ranges
            )

            # Image 2: Compute multi-dimensional histogram
            histogram2 = cv2.calcHist(
                [color_image2],           # Source image list
                channels,                 # Channel indices
                processed_mask2,          # Mask (None = full image)
                hist_bins[:len(channels)],# Bins per channel
                hist_ranges              # Value ranges
            )

        except Exception as e:
            raise HistogramError(f"Histogram computation failed: {e}") from e

        # Handle zero histogram cases according to policy
        hist1_sum = np.sum(histogram1)
        hist2_sum = np.sum(histogram2)

        if hist1_sum == 0 or hist2_sum == 0:
            if on_zero_histogram == "error":
                raise HistogramError("Zero histogram detected - no data to compare")
            elif on_zero_histogram == "zero":
                return 0.0
            elif on_zero_histogram == "nan":
                return float('nan')

        # Apply L1 normalization to histograms for statistical comparison
        # Normalized histogram: Ĥ[i] = H[i] / Σⱼ H[j]
        cv2.normalize(histogram1, histogram1, alpha=1.0, beta=0.0, norm_type=cv2.NORM_L1)
        cv2.normalize(histogram2, histogram2, alpha=1.0, beta=0.0, norm_type=cv2.NORM_L1)
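        # Worked example (illustrative counts only): raw bin counts [2, 6, 2] sum
        # to 10, so L1 normalization yields [0.2, 0.6, 0.2], which sums to 1.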

        # Compute similarity/distance using specified metric
        try:
            # Apply OpenCV histogram comparison with selected metric
            comparison_result = cv2.compareHist(
                histogram1,
                histogram2,
                metric_constants[metric]
            )

        except Exception as e:
            raise HistogramError(f"Histogram comparison failed for {metric}: {e}") from e

        # Return metric-specific result as float
        return float(comparison_result)

    def clip_embedding_similarity(
        self,
        image1: Union[str, Path, Image.Image, torch.Tensor, np.ndarray],
        image2: Union[str, Path, Image.Image, torch.Tensor, np.ndarray],
        use_mixed_precision: bool = False,
        batch_processing: bool = False
    ) -> float:
        """
        Compute cosine similarity between CLIP embeddings in high-dimensional semantic space.

        Mathematical Foundation:
        1. Vision Transformer encoding: E = ViT(patch_embed(I))
        2. Multi-head self-attention: Attention(Q,K,V) = softmax(QK^T/√d_k)V
        3. Layer normalization: LN(x) = γ(x-μ)/σ + β
        4. L2 normalization: ê = e / ||e||₂ where ||e||₂ = √(Σᵢ eᵢ²)
        5. Cosine similarity: sim(ê₁,ê₂) = ê₁·ê₂ = Σᵢ ê₁[i]ê₂[i]

        Args:
            image1: First image as path, PIL Image, tensor, or numpy array
            image2: Second image as path, PIL Image, tensor, or numpy array
            use_mixed_precision: Whether to use automatic mixed precision (AMP)
            batch_processing: Whether to process images in batch for efficiency

        Returns:
            Cosine similarity in [-1, 1] where 1 = identical, -1 = opposite

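        Raises:
            ModelInferenceError: If input preparation (including invalid input types
                or tensor shapes) or CLIP inference fails
            ValueError: If the two embeddings have mismatched shapes

        Note:
            A CUDA out-of-memory error is not propagated: the model is moved to the
            CPU, self.device is set to "cpu", and inference is retried there.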
        """
        # Ensure CLIP model is loaded using thread-safe lazy initialization
        self._load_clip_with_double_checked_locking()

        # Helper function for comprehensive input preprocessing
        def _prepare_input_tensor(
            img_input: Union[str, Path, Image.Image, torch.Tensor, np.ndarray]
        ) -> torch.Tensor:
            """Convert various input types to preprocessed CLIP tensor."""

            # Handle pre-computed torch.Tensor input
            if isinstance(img_input, torch.Tensor):
                input_tensor = img_input.clone()  # Avoid mutation of original tensor

                # Validate tensor shape for CLIP input requirements
                if input_tensor.ndim == 3:
                    # Add batch dimension: (C,H,W) -> (1,C,H,W)
                    input_tensor = input_tensor.unsqueeze(0)
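                elif input_tensor.ndim == 4:
                    # Already batched: (B,C,H,W); the scalar result requires B == 1
                    if input_tensor.shape[0] != 1:
                        raise ValueError(
                            f"Expected a single image (batch size 1), got batch size {input_tensor.shape[0]}"
                        )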
                else:
                    raise ValueError(f"Invalid tensor dimensions: {input_tensor.ndim}, expected 3 or 4")

                # Validate channel count for RGB images
                if input_tensor.shape[1] != 3:
                    raise ValueError(f"Expected 3 channels, got {input_tensor.shape[1]}")

            # Handle numpy array input with comprehensive preprocessing
            elif isinstance(img_input, np.ndarray):
                # Validate array dimensions for image data
                if img_input.ndim not in [2, 3]:
                    raise ValueError(f"Invalid array dimensions: {img_input.ndim}")

                # Handle different array formats and convert to PIL
                if img_input.ndim == 2:
                    # Grayscale array - convert to RGB
                    pil_image = Image.fromarray(img_input).convert('RGB')
                elif img_input.ndim == 3:
                    # Color array - handle different channel orders
                    if img_input.shape[2] == 3:
                        # Assume RGB channel order; BGR arrays (e.g. from cv2.imread)
                        # must be converted by the caller before being passed in
                        pil_image = Image.fromarray(img_input.astype(np.uint8)).convert('RGB')
                    elif img_input.shape[2] == 4:
                        # RGBA - remove alpha channel
                        rgb_array = img_input[:, :, :3]
                        pil_image = Image.fromarray(rgb_array.astype(np.uint8)).convert('RGB')
                    else:
                        raise ValueError(f"Unsupported channel count: {img_input.shape[2]}")

                # Apply CLIP preprocessing pipeline to PIL image
                input_tensor = self.clip_preprocess(pil_image)
                # Add batch dimension for inference
                input_tensor = input_tensor.unsqueeze(0)

            # Handle PIL Image input
            elif isinstance(img_input, Image.Image):
                # Ensure RGB format for CLIP compatibility
                rgb_image = img_input.convert('RGB')
                # Apply CLIP preprocessing (resize, normalize, tensorize)
                input_tensor = self.clip_preprocess(rgb_image)
                # Add batch dimension for model input
                input_tensor = input_tensor.unsqueeze(0)

            # Handle file path input
            elif isinstance(img_input, (str, Path)):
                # Validate image file accessibility
                validated_path = self._validate_image_path(img_input, self._allow_symlinks)

                # Load image using PIL; the context manager closes the file handle
                # once the RGB copy has been created
                try:
                    with Image.open(validated_path) as opened_image:
                        pil_image = opened_image.convert('RGB')
                except (IOError, OSError) as e:
                    raise ModelInferenceError(f"Failed to load image {validated_path}: {e}") from e

                # Apply CLIP preprocessing pipeline
                input_tensor = self.clip_preprocess(pil_image)
                # Add batch dimension
                input_tensor = input_tensor.unsqueeze(0)

            else:
                raise ValueError(f"Unsupported input type: {type(img_input)}")

            # Move tensor to appropriate device for inference
            input_tensor = input_tensor.to(self.device)

            return input_tensor

        # Prepare both input images as CLIP-compatible tensors
        try:
            tensor1 = _prepare_input_tensor(image1)
            tensor2 = _prepare_input_tensor(image2)
        except Exception as e:
            raise ModelInferenceError(f"Input preparation failed: {e}") from e

        # Perform CLIP inference with comprehensive error handling
        try:
            # Enable automatic mixed precision only when requested and running on CUDA
            if use_mixed_precision and str(self.device).startswith("cuda"):
                # torch.autocast supersedes the deprecated torch.cuda.amp.autocast
                with torch.autocast(device_type="cuda"):
                    with torch.no_grad():  # Disable gradient computation for inference
                        if batch_processing:
                            # Process both images in single batch for efficiency
                            batch_tensor = torch.cat([tensor1, tensor2], dim=0)
                            batch_embeddings = self.clip_model.encode_image(batch_tensor)
                            embedding1, embedding2 = batch_embeddings[0:1], batch_embeddings[1:2]
                        else:
                            # Process images individually
                            embedding1 = self.clip_model.encode_image(tensor1)
                            embedding2 = self.clip_model.encode_image(tensor2)
            else:
                # Standard precision inference
                with torch.no_grad():  # Disable gradient computation for efficiency
                    if batch_processing:
                        # Batch processing for computational efficiency
                        batch_tensor = torch.cat([tensor1, tensor2], dim=0)
                        batch_embeddings = self.clip_model.encode_image(batch_tensor)
                        embedding1, embedding2 = batch_embeddings[0:1], batch_embeddings[1:2]
                    else:
                        # Individual image processing
                        embedding1 = self.clip_model.encode_image(tensor1)
                        embedding2 = self.clip_model.encode_image(tensor2)

        except torch.cuda.OutOfMemoryError as e:
            # Handle GPU memory exhaustion with automatic cleanup and CPU fallback
            self._logger.warning("GPU OOM during CLIP inference, attempting CPU fallback")
            torch.cuda.empty_cache()  # Clear GPU memory cache

            # Retry on CPU if currently using CUDA
            if self.device != "cpu":
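                # Move model and tensors to CPU; cast to float32 because
                # half-precision weights may not be supported by CPU kernels
                self.clip_model = self.clip_model.cpu().float()
                tensor1 = tensor1.cpu().float()
                tensor2 = tensor2.cpu().float()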
                self.device = "cpu"

                # Retry inference on CPU
                with torch.no_grad():
                    embedding1 = self.clip_model.encode_image(tensor1)
                    embedding2 = self.clip_model.encode_image(tensor2)
            else:
                # Already on CPU - re-raise OOM error
                raise ModelInferenceError(f"CPU OOM during CLIP inference: {e}") from e

        except RuntimeError as e:
            # Handle other PyTorch runtime errors
            raise ModelInferenceError(f"CLIP inference failed: {e}") from e

        # Validate embedding shapes for compatibility
        if embedding1.shape != embedding2.shape:
            raise ValueError(f"Embedding shape mismatch: {embedding1.shape} vs {embedding2.shape}")

        # Apply L2 normalization to embeddings for cosine similarity
        # L2 norm: ||e||₂ = √(Σᵢ eᵢ²), normalized: ê = e / ||e||₂
        normalized_embedding1 = embedding1 / embedding1.norm(dim=-1, keepdim=True)
        normalized_embedding2 = embedding2 / embedding2.norm(dim=-1, keepdim=True)

        # Compute cosine similarity via dot product of normalized embeddings
        # Cosine similarity: sim(ê₁,ê₂) = ê₁ · ê₂ = Σᵢ ê₁[i]ê₂[i]
        cosine_similarity = torch.matmul(normalized_embedding1, normalized_embedding2.T)

        # Extract scalar similarity value and convert to Python float
        similarity_score = cosine_similarity.squeeze().item()

        return similarity_score

    def reverse_image_search_google(
        self,
        image_path: Union[str, Path],
        driver_path: Union[str, Path],
        timeout: float = 15.0,
        headless: bool = False,
        max_similar_urls: int = 10,
        retry_attempts: int = 3
    ) -> ReverseImageSearchResult:
        """
        Perform comprehensive Google reverse image search with robust web automation.

        Implementation employs multiple fallback strategies for UI element location,
        comprehensive error handling, and structured data extraction for enterprise use.

        Workflow:
        1. Validate local image and ChromeDriver accessibility
        2. Configure Chrome with optimized settings for automation
        3. Navigate to Google Images with retry logic
        4. Locate and activate reverse search interface using selector hierarchy
        5. Upload image file through the page's file input element
        6. Extract best-guess text and similar-image URLs, then derive a simple confidence score
        7. Quit the WebDriver in a finally block, whether the search succeeded or failed

        Args:
            image_path: Path to local image file for reverse search
            driver_path: Path to ChromeDriver executable
            timeout: Maximum wait time for page elements (seconds)
            headless: Whether to run browser in headless mode
            max_similar_urls: Maximum number of similar image URLs to extract
            retry_attempts: Number of retry attempts for failed operations

        Returns:
            ReverseImageSearchResult with comprehensive search findings

        Raises:
            LaunchError: If ChromeDriver initialization fails
            NavigationError: If page navigation fails
            UploadError: If image upload fails
            ExtractionError: If result extraction fails
        """
        # Validate image file accessibility using comprehensive path validation
        validated_image_path = self._validate_image_path(image_path, self._allow_symlinks)

        # Validate ChromeDriver executable accessibility
        driver_path_obj = Path(driver_path)
        if not driver_path_obj.exists() or not driver_path_obj.is_file():
            raise LaunchError(f"ChromeDriver not found: {driver_path_obj}")

        # Verify ChromeDriver has execution permissions
        if not os.access(driver_path_obj, os.X_OK):
            raise LaunchError(f"ChromeDriver not executable: {driver_path_obj}")

        # Configure Chrome options for robust automation
        chrome_options = Options()

        # Essential options for automation stability
        chrome_options.add_argument("--no-sandbox")                    # Bypass OS security model
        chrome_options.add_argument("--disable-dev-shm-usage")         # Overcome limited resource problems
        chrome_options.add_argument("--disable-gpu")                   # Applicable to Windows environments
        chrome_options.add_argument("--disable-extensions")            # Disable extensions for speed
        chrome_options.add_argument("--disable-plugins")               # Disable plugins for security
        chrome_options.add_argument("--disable-images")                # Load pages faster by skipping images
        # JavaScript stays enabled because the search-by-image interface is driven by it.
        # Web security is left at Chrome's defaults: this workflow needs neither
        # cross-origin access nor mixed content.

        # Window management for consistent DOM rendering
        if headless:
            chrome_options.add_argument("--headless")                  # Run in headless mode
            chrome_options.add_argument("--window-size=1920,1080")     # Set window size for headless
        else:
            chrome_options.add_argument("--start-maximized")           # Maximize window for visibility

        # User agent configuration to avoid bot detection
        chrome_options.add_argument(
            "--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
            "(KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
        )

        # Configure WebDriver service with error handling
        try:
            webdriver_service = Service(str(driver_path_obj))
        except Exception as e:
            raise LaunchError(f"WebDriver service configuration failed: {e}") from e

        # Initialize WebDriver with comprehensive error handling
        driver = None
        try:
            # Attempt Chrome WebDriver initialization
            driver = webdriver.Chrome(service=webdriver_service, options=chrome_options)

            # Leave the implicit wait at its default of 0: every element lookup below
            # uses an explicit WebDriverWait, and mixing both wait types can produce
            # unpredictable wait times

            # Set page load timeout to prevent hanging
            driver.set_page_load_timeout(timeout * 2)

        except WebDriverException as e:
            # Clean up driver if partially initialized
            if driver is not None:
                try:
                    driver.quit()
                except Exception:
                    pass  # Ignore cleanup errors
            raise LaunchError(f"ChromeDriver launch failed: {e}") from e

        try:
            # Navigate to Google Images with retry logic
            navigation_success = False
            for attempt in range(retry_attempts):
                try:
                    # Navigate to Google Images homepage
                    driver.get("https://images.google.com")

                    # Verify successful navigation by checking page title
                    if "Google" in driver.title:
                        navigation_success = True
                        break

                except Exception as e:
                    # Log the failure; the loop then moves on to the next attempt, if any
                    self._logger.warning(f"Navigation attempt {attempt + 1} failed: {e}")

            if not navigation_success:
                raise NavigationError("Failed to navigate to Google Images after all attempts")

            # Implement robust selector strategy with fallback hierarchy
            camera_button_selectors = [
                "button[aria-label*='Search by image']",           # Primary aria-label selector
                "div[aria-label*='Search by image']",             # Alternative div with aria-label
                "button[data-ved*='camera']",                     # Data attribute selector
                "div.nDcEnd",                                     # CSS class fallback
                "//div[@aria-label and contains(@aria-label, 'Search by image')]",  # XPath fallback
                "//button[@aria-label and contains(@aria-label, 'camera')]"        # XPath alternative
            ]

            # Attempt to locate and click camera button using selector hierarchy
            camera_button_clicked = False
            for selector in camera_button_selectors:
                try:
                    # Determine selector type and create appropriate locator
                    if selector.startswith("//"):
                        # XPath selector
                        locator = (By.XPATH, selector)
                    else:
                        # CSS selector
                        locator = (By.CSS_SELECTOR, selector)

                    # Wait for element to be clickable
                    camera_button = WebDriverWait(driver, timeout).until(
                        EC.element_to_be_clickable(locator)
                    )

                    # Scroll element into view before clicking
                    driver.execute_script("arguments[0].scrollIntoView(true);", camera_button)

                    # Attempt click with JavaScript as fallback
                    try:
                        camera_button.click()
                    except Exception:
                        # Fallback to JavaScript click
                        driver.execute_script("arguments[0].click();", camera_button)

                    camera_button_clicked = True
                    break

                except TimeoutException:
                    # Try next selector in hierarchy
                    continue
                except Exception as e:
                    self._logger.warning(f"Camera button click failed with selector {selector}: {e}")
                    continue

            if not camera_button_clicked:
                raise NavigationError("Failed to locate camera button with all selectors")

            # Locate and click upload tab with multiple strategies
            upload_tab_selectors = [
                "//a[contains(text(), 'Upload an image')]",       # Direct text match
                "//div[contains(text(), 'Upload an image')]",     # Alternative element type
                "a[href*='upload']",                              # URL-based selector
                ".RZQOVd"                                         # CSS class fallback
            ]

            upload_tab_clicked = False
            for selector in upload_tab_selectors:
                try:
                    if selector.startswith("//"):
                        locator = (By.XPATH, selector)
                    else:
                        locator = (By.CSS_SELECTOR, selector)

                    upload_tab = WebDriverWait(driver, timeout).until(
                        EC.element_to_be_clickable(locator)
                    )

                    # Scroll into view and click
                    driver.execute_script("arguments[0].scrollIntoView(true);", upload_tab)
                    upload_tab.click()

                    upload_tab_clicked = True
                    break

                except TimeoutException:
                    continue
                except Exception as e:
                    self._logger.warning(f"Upload tab click failed with selector {selector}: {e}")
                    continue

            if not upload_tab_clicked:
                raise NavigationError("Failed to locate upload tab with all selectors")

            # Locate file input element for image upload
            file_input_selectors = [
                "input[name='encoded_image']",                    # Primary name attribute
                "input[type='file']",                             # Generic file input
                "input[accept*='image']",                         # Accept attribute selector
                ".cB9M7"                                          # CSS class fallback
            ]

            file_uploaded = False
            for selector in file_input_selectors:
                try:
                    file_input = WebDriverWait(driver, timeout).until(
                        EC.presence_of_element_located((By.CSS_SELECTOR, selector))
                    )

                    # Upload file by sending keys with absolute path
                    file_input.send_keys(str(validated_image_path.resolve()))

                    file_uploaded = True
                    break

                except TimeoutException:
                    continue
                except Exception as e:
                    self._logger.warning(f"File upload failed with selector {selector}: {e}")
                    continue

            if not file_uploaded:
                raise UploadError("Failed to upload image file with all selectors")

            # Wait for results page to load and extract best guess text
            best_guess_text = ""
            best_guess_selectors = [
                "div[role='heading']",                            # Primary role-based selector
                ".fKDtNb",                                        # CSS class selector
                "//div[contains(@class, 'r5a77d')]",             # XPath class selector
                "h3",                                             # Generic heading fallback
                ".gLFyf"                                          # Alternative class selector
            ]

            for selector in best_guess_selectors:
                try:
                    if selector.startswith("//"):
                        locator = (By.XPATH, selector)
                    else:
                        locator = (By.CSS_SELECTOR, selector)

                    best_guess_element = WebDriverWait(driver, timeout).until(
                        EC.presence_of_element_located(locator)
                    )

                    best_guess_text = best_guess_element.text.strip()
                    if best_guess_text:  # Only accept non-empty text
                        break

                except TimeoutException:
                    continue
                except Exception as e:
                    self._logger.debug(f"Best guess extraction failed with selector {selector}: {e}")
                    continue

            # Extract similar image URLs with comprehensive selectors
            similar_urls = []
            similar_image_selectors = [
                "a[jsname='sTFXNd']",                             # Primary jsname selector
                "a[href*='/imgres?']",                            # URL pattern selector
                ".rg_l",                                          # CSS class selector
                "//a[contains(@href, 'imgres')]"                 # XPath URL pattern
            ]

            for selector in similar_image_selectors:
                try:
                    if selector.startswith("//"):
                        locator = (By.XPATH, selector)
                    else:
                        locator = (By.CSS_SELECTOR, selector)

                    similar_elements = WebDriverWait(driver, timeout).until(
                        EC.presence_of_all_elements_located(locator)
                    )

                    # Extract URLs from elements up to maximum limit
                    for element in similar_elements[:max_similar_urls]:
                        href = element.get_attribute("href")
                        if href and href not in similar_urls:
                            similar_urls.append(href)

                    if similar_urls:  # Stop if URLs found
                        break

                except TimeoutException:
                    continue
                except Exception as e:
                    self._logger.debug(f"Similar images extraction failed with selector {selector}: {e}")
                    continue

            # Extract page title for context
            page_title = driver.title if driver.title else "Google Images"

            # Create and return structured result object
            return ReverseImageSearchResult(
                best_guess=best_guess_text,
                similar_image_urls=similar_urls,
                source_page_title=page_title,
                # Simple confidence metric: fraction of the requested URL quota that was filled
                confidence_score=min(len(similar_urls) / max(max_similar_urls, 1), 1.0)
            )

        except (NavigationError, UploadError, ExtractionError):
            # Re-raise domain-specific exceptions unchanged
            raise
        except Exception as e:
            # Wrap unexpected errors in ExtractionError
            raise ExtractionError(f"Reverse image search failed: {e}") from e

        finally:
            # Ensure WebDriver cleanup regardless of success or failure
            if driver is not None:
                try:
                    driver.quit()
                except Exception as cleanup_error:
                    self._logger.warning(f"WebDriver cleanup failed: {cleanup_error}")
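
The selector fallback used by `reverse_image_search_google` above can be illustrated without a browser. The following is a minimal sketch, not the toolkit's implementation: `classify_selector`, `first_match`, `LookupTimeout`, and `fake_find` are hypothetical names introduced here, and `LookupTimeout` merely stands in for Selenium's `TimeoutException`. The strategy strings `"xpath"` and `"css selector"` mirror the values of Selenium's `By.XPATH` and `By.CSS_SELECTOR`.

```python
from typing import Callable, Iterable, Optional, Tuple, TypeVar

T = TypeVar("T")


class LookupTimeout(Exception):
    """Hypothetical stand-in for selenium's TimeoutException."""


def classify_selector(selector: str) -> Tuple[str, str]:
    """Return (strategy, selector); strings starting with '//' are treated as XPath."""
    strategy = "xpath" if selector.startswith("//") else "css selector"
    return strategy, selector


def first_match(
    selectors: Iterable[str],
    find: Callable[[str, str], T],
) -> Optional[Tuple[str, T]]:
    """Try selectors in order; return (selector, element) for the first hit, else None."""
    for selector in selectors:
        strategy, value = classify_selector(selector)
        try:
            return selector, find(strategy, value)
        except LookupTimeout:
            continue  # fall through to the next, less specific selector
    return None


def fake_find(strategy: str, value: str) -> str:
    """Simulated page lookup: only the generic file input exists."""
    page = {("css selector", "input[type='file']"): "<input type=file>"}
    try:
        return page[(strategy, value)]
    except KeyError:
        raise LookupTimeout(value) from None


result = first_match(
    ["input[name='encoded_image']", "input[type='file']", "//input[@accept]"],
    fake_find,
)
print(result)
```

Ordering the list from most to least specific keeps the lookup stable when one attribute changes, at the cost of one full timeout per selector that fails before a match is found.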

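The final two steps of `clip_embedding_similarity` (L2 normalization followed by a dot product) can be checked by hand on small vectors. This is a dependency-free sketch under stated assumptions: it uses plain Python lists instead of CLIP embeddings, and the `eps` guard against zero-length vectors is an addition that the method above does not have.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float], eps: float = 1e-12) -> List[float]:
    """Scale vec to unit L2 norm; eps avoids division by zero for an all-zero vector."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, eps) for x in vec]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity as the dot product of the two L2-normalized vectors."""
    if len(a) != len(b):
        raise ValueError(f"Embedding length mismatch: {len(a)} vs {len(b)}")
    a_hat, b_hat = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_hat, b_hat))


# Worked example: both vectors have norm 5, and their dot product is 24,
# so the similarity is 24 / 25 = 0.96.
print(round(cosine_similarity([3.0, 4.0], [4.0, 3.0]), 4))

# Opposite directions give -1, orthogonal directions give 0.
print(round(cosine_similarity([3.0, 4.0], [-3.0, -4.0]), 4))
print(round(cosine_similarity([1.0, 0.0], [0.0, 1.0]), 4))
```

Because both vectors are normalized first, the result depends only on direction, which is why the method's return value is bounded by -1 and 1 regardless of embedding magnitude.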

In [None]:
# Main Class

class ImageSimilarityDetector:
    """
    Production-grade image similarity detection system implementing multiple algorithms:

    1. Perceptual hash difference using DCT-based pHash algorithm
    2. ORB feature matching with configurable normalization strategies
    3. Color histogram correlation with multiple distance metrics
    4. CLIP embedding cosine similarity using vision transformers
    5. Automated Google reverse image search with robust web scraping

    Implements enterprise-grade resource management, thread safety, dependency injection,
    comprehensive error handling, and mathematical rigor per academic literature.
    """

    def __init__(
        self,
        orb_detector: Optional[FeatureDetectorProtocol] = None,
        matcher: Optional[MatcherProtocol] = None,
        clip_model_loader: Optional[ClipModelLoaderProtocol] = None,
        device: Optional[str] = None,
        clip_model_name: str = "ViT-B/32",
        allow_symlinks: bool = False,
        max_initialization_retries: int = 3,
        resource_constraints: ResourceConstraints = ResourceConstraints.BALANCED,
        enable_performance_monitoring: bool = True,
        enable_parameter_optimization: bool = False,
        validation_policy: ValidationPolicy = ValidationPolicy.PRODUCTION
    ) -> None:
        """
        Initialize ImageSimilarityDetector with enterprise-grade dependency injection and monitoring.

        Implements comprehensive resource management, protocol validation, statistical monitoring,
        and factory-based component creation for production deployment.

        Args:
            orb_detector: Injectable ORB detector conforming to FeatureDetectorProtocol
            matcher: Injectable matcher conforming to MatcherProtocol
            clip_model_loader: Injectable CLIP loader conforming to ClipModelLoaderProtocol
            device: Target device string ('cuda', 'cpu', or specific device)
            clip_model_name: CLIP model variant identifier for loading
            allow_symlinks: Whether to permit symlink resolution in path validation
            max_initialization_retries: Maximum retry attempts for resource initialization
            resource_constraints: Resource utilization profile for optimization
            enable_performance_monitoring: Whether to enable comprehensive performance tracking
            enable_parameter_optimization: Whether to enable automatic parameter tuning
            validation_policy: Validation strictness level for result structures

        Raises:
            ValueError: If injectable dependencies do not conform to required protocols
            InitializationError: If default resource creation fails after retries
            ResourceAllocationError: If insufficient system resources available
        """
        # Store configuration parameters for comprehensive system setup
        self._clip_model_name: str = clip_model_name
        self._allow_symlinks: bool = allow_symlinks
        self._max_initialization_retries: int = max_initialization_retries
        self._enable_performance_monitoring: bool = enable_performance_monitoring
        self._enable_parameter_optimization: bool = enable_parameter_optimization

        # Initialize enterprise-grade resource management system
        self.resource_manager = ResourceManager(
            resource_constraints=resource_constraints,
            monitoring_interval=1.0,
            cleanup_threshold=0.8
        )

        # Start resource monitoring for production deployment
        if self._enable_performance_monitoring:
            self.resource_manager.start_monitoring()

        # Set validation policy for all result structures created by this instance
        FeatureMatchResult.set_validation_policy(validation_policy)
        ReverseImageSearchResult.set_validation_policy(validation_policy)

        # Create the logger first: device detection below reports warnings
        # through self._logger, so it must exist before that call
        self._logger = logging.getLogger(f"{__name__}.{self.__class__.__name__}")

        # Determine optimal computational device with comprehensive validation
        self.device: str = self._determine_optimal_device_with_validation(device)

        # Initialize factory-based component creation with dependency injection
        self._initialize_feature_detector_with_factory(orb_detector)
        self._initialize_matcher_with_factory(matcher)
        self._initialize_clip_loader_with_factory(clip_model_loader)

        # Initialize CLIP model state placeholders for lazy loading with monitoring
        self.clip_model: Optional[torch.nn.Module] = None
        self.clip_preprocess: Optional[Callable] = None
        self._clip_loading_performance: Optional[Dict[str, Any]] = None

        # Create thread-safe initialization locks for concurrent access protection
        self._initialization_lock: threading.Lock = threading.Lock()
        self._performance_lock: threading.Lock = threading.Lock()

        # Initialize comprehensive performance tracking system
        self._method_performance_history: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
        self._initialization_timestamp: datetime.datetime = datetime.datetime.utcnow()

        # Log the resolved configuration with structured metadata
        self._logger.info(
            "ImageSimilarityDetector initialized with enterprise configuration",
            extra={
                'device': self.device,
                'clip_model': self._clip_model_name,
                'resource_constraints': resource_constraints.value,
                'performance_monitoring': self._enable_performance_monitoring,
                'validation_policy': validation_policy.value
            }
        )

    def _determine_optimal_device_with_validation(self, device_override: Optional[str]) -> str:
        """
        Determine optimal computational device with comprehensive capability validation.

        Args:
            device_override: User-specified device string override

        Returns:
            Validated device string with capability confirmation

        Raises:
            InitializationError: If specified device is not available or capable
        """
        # Honor explicit user device specification with validation
        if device_override is not None:
            # Validate device capability and availability
            if not self._validate_device_capability(device_override):
                raise InitializationError(
                    f"Specified device '{device_override}' is not available or lacks required capabilities",
                    component_name="device_validator",
                    system_resources={'requested_device': device_override}
                )
            return device_override

        # Attempt CUDA detection with comprehensive capability assessment
        try:
            # Check CUDA availability with detailed capability analysis
            if torch.cuda.is_available() and torch.cuda.device_count() > 0:
                # Validate CUDA compute capability for deep learning operations
                for device_idx in range(torch.cuda.device_count()):
                    # Get device properties for capability assessment
                    device_props = torch.cuda.get_device_properties(device_idx)
                    # Ensure sufficient compute capability (≥3.5 for modern operations)
                    if device_props.major >= 3 and (device_props.major > 3 or device_props.minor >= 5):
                        # Validate memory availability for typical model loading
                        if device_props.total_memory >= 2 * 1024**3:  # ≥2GB VRAM
                            return f"cuda:{device_idx}"

                # Fallback to generic CUDA if specific device validation fails
                self._logger.warning("CUDA devices available but may have limited capability")
                return "cuda"

        except Exception as e:
            # Log CUDA detection failure with comprehensive context
            self._logger.warning(f"CUDA detection failed, falling back to CPU: {e}")

        # Validate CPU capabilities for intensive operations
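        # psutil returns None when the physical core count cannot be determined,
        # so fall back to the logical count before comparing against a threshold
        cpu_count = psutil.cpu_count(logical=False) or psutil.cpu_count() or 1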
        available_memory = psutil.virtual_memory().available

        # Ensure minimum CPU and memory requirements for image processing
        if cpu_count < 2 or available_memory < 4 * 1024**3:  # <2 cores or <4GB RAM
            self._logger.warning(
                f"Limited CPU resources detected: {cpu_count} cores, "
                f"{available_memory / 1024**3:.1f}GB RAM"
            )

        # Default to CPU with capability confirmation
        return "cpu"

    def _validate_device_capability(self, device: str) -> bool:
        """
        Validate device capability for image similarity operations.

        Args:
            device: Device specification to validate

        Returns:
            Boolean indicating device capability and availability
        """
        # Validate CPU device with system resource assessment
        if device == "cpu":
            # Ensure minimum CPU and memory requirements
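            # (cpu_count may be None on some platforms; treat that as unknown, not fatal)
            core_count = psutil.cpu_count(logical=False) or psutil.cpu_count() or 0
            return (core_count >= 1 and
                    psutil.virtual_memory().available >= 1 * 1024**3)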

        # Validate CUDA devices with comprehensive capability checking
        if device.startswith("cuda"):
            # Check CUDA availability in PyTorch
            if not torch.cuda.is_available():
                return False

            # Parse and validate specific device index if provided
            if ":" in device:
                try:
                    device_index = int(device.split(":")[1])
                    # Validate device index is within available range
                    if not 0 <= device_index < torch.cuda.device_count():
                        return False

                    # Validate device compute capability and memory
                    device_props = torch.cuda.get_device_properties(device_index)
                    return (device_props.major >= 3 and
                            device_props.total_memory >= 1 * 1024**3)  # ≥1GB VRAM

                except (ValueError, IndexError):
                    return False

            # Generic CUDA validation
            return True

        # Validate Apple Silicon MPS device
        if device == "mps":
            return (hasattr(torch.backends, 'mps') and
                    torch.backends.mps.is_available())

        # Unknown device specification
        return False

    def _initialize_feature_detector_with_factory(
        self,
        detector: Optional[FeatureDetectorProtocol]
    ) -> None:
        """
        Initialize ORB feature detector using factory pattern with protocol validation.

        Args:
            detector: Injectable detector or None for factory creation

        Raises:
            ValueError: If detector does not conform to FeatureDetectorProtocol
            InitializationError: If detector creation fails after all retry attempts
        """
        # Initialize feature detector factory with optimization capabilities
        self._feature_detector_factory = DefaultFeatureDetectorFactory(
            resource_manager=self.resource_manager,
            enable_optimization=self._enable_parameter_optimization,
            cache_size=10
        )

        if detector is not None:
            # Validate injectable detector conforms to protocol requirements
            is_valid, violations = validate_protocol_implementation(
                detector, FeatureDetectorProtocol, strict=True
            )
            if not is_valid:
                raise ValueError(f"Detector protocol validation failed: {violations}")

            # Store validated injectable detector with performance monitoring
            self.orb_detector = detector
            self._logger.info("Using injectable ORB detector with protocol validation")
        else:
            # Create optimized detector using factory with retry logic
            for attempt in range(self._max_initialization_retries):
                try:
                    # Create detector with production-optimized parameters
                    self.orb_detector = self._feature_detector_factory.create(
                        nfeatures=1000,  # Increased for better matching reliability
                        scaleFactor=1.2,
                        nlevels=8,
                        edgeThreshold=31,
                        patchSize=31,
                        fastThreshold=20,
                        optimize_parameters=self._enable_parameter_optimization,
                        target_performance="balanced"
                    )

                    # Log successful detector creation with factory statistics
                    self._logger.info(
                        "ORB detector created via factory",
                        extra={'attempt': attempt + 1, 'optimization_enabled': self._enable_parameter_optimization}
                    )
                    break

                except (OpenCVInitializationError, ResourceAllocationError) as e:
                    # Log attempt failure and retry if attempts remaining
                    if attempt < self._max_initialization_retries - 1:
                        self._logger.warning(f"Detector creation attempt {attempt + 1} failed: {e}")
                        continue
                    # Re-raise with enhanced context if all attempts exhausted
                    raise InitializationError(
                        f"Failed to create ORB detector after {self._max_initialization_retries} attempts",
                        component_name="orb_detector",
                        initialization_stage="factory_creation",
                        forensic_metadata=ForensicMetadata(
                            operation_name="orb_detector_initialization",
                            algorithm_parameters={'max_retries': self._max_initialization_retries}
                        )
                    ) from e

    def _initialize_matcher_with_factory(self, matcher: Optional[MatcherProtocol]) -> None:
        """
        Initialize BFMatcher using factory pattern with protocol validation.

        Args:
            matcher: Injectable matcher or None for factory creation

        Raises:
            ValueError: If matcher does not conform to MatcherProtocol
            InitializationError: If matcher creation fails
        """
        # Initialize matcher factory with performance profiling
        self._matcher_factory = DefaultMatcherFactory(
            resource_manager=self.resource_manager,
            enable_profiling=self._enable_performance_monitoring
        )

        if matcher is not None:
            # Validate injectable matcher conforms to protocol requirements
            is_valid, violations = validate_protocol_implementation(
                matcher, MatcherProtocol, strict=True
            )
            if not is_valid:
                raise ValueError(f"Matcher protocol validation failed: {violations}")

            # Store validated injectable matcher
            self.bf_matcher = matcher
            self._logger.info("Using injectable BFMatcher with protocol validation")
        else:
            # Create optimized matcher using factory with error handling
            try:
                # Create matcher optimized for binary descriptor matching
                self.bf_matcher = self._matcher_factory.create(
                    normType=cv2.NORM_HAMMING,  # Optimal for ORB binary descriptors
                    crossCheck=True,  # Enhanced precision for quantitative analysis
                    optimize_for_throughput=False  # Prioritize accuracy over speed
                )

                # Log successful matcher creation
                self._logger.info("BFMatcher created via factory with cross-check enabled")

            except OpenCVInitializationError as e:
                # Wrap in initialization error with enhanced context
                raise InitializationError(
                    f"Failed to create BFMatcher: {e}",
                    component_name="bf_matcher",
                    initialization_stage="factory_creation"
                ) from e

    def _initialize_clip_loader_with_factory(
        self,
        loader: Optional[ClipModelLoaderProtocol]
    ) -> None:
        """
        Initialize CLIP model loader using factory pattern with protocol validation.

        Args:
            loader: Injectable loader or None for factory creation

        Raises:
            ValueError: If loader does not conform to ClipModelLoaderProtocol
        """
        if loader is not None:
            # Validate injectable loader conforms to protocol requirements
            is_valid, violations = validate_protocol_implementation(
                loader, ClipModelLoaderProtocol, strict=True
            )
            if not is_valid:
                raise ValueError(f"CLIP loader protocol validation failed: {violations}")

            # Store validated injectable loader
            self.clip_model_loader = loader
            self._logger.info("Using injectable CLIP loader with protocol validation")
        else:
            # Create default CLIP loader with enhanced capabilities
            self.clip_model_loader = DefaultClipModelLoader(
                resource_manager=self.resource_manager,
                enable_model_caching=True,
                enable_performance_monitoring=self._enable_performance_monitoring
            )

            # Log default loader creation
            self._logger.info("Default CLIP loader created with caching and monitoring")

    @staticmethod
    def _validate_image_path(
        image_path: Union[str, Path],
        allow_symlinks: bool = False,
        perform_content_validation: bool = True,
        max_file_size_mb: float = 100.0
    ) -> Path:
        """
        Comprehensive image path validation with security analysis and forensic metadata.

        Implements enterprise-grade path validation with symlink analysis, content verification,
        security scanning, and comprehensive error reporting for production deployment.

        Args:
            image_path: Path string or Path object to validate
            allow_symlinks: Whether to permit symlink resolution with safety checks
            perform_content_validation: Whether to validate file content as image data
            max_file_size_mb: Maximum allowed file size in megabytes

        Returns:
            Validated Path object with comprehensive security and integrity verification

        Raises:
            ImageNotFoundError: If path does not exist with search context
            NotAFileError: If path exists but is not regular file with type analysis
            SymlinkNotAllowedError: If symlinks encountered but not permitted with chain analysis
            PermissionDeniedError: If insufficient permissions with ACL analysis
            ImageUnreadableError: If file corrupted or invalid format with diagnostic data
            ImageValidationError: If the path cannot be normalized or the file exceeds max_file_size_mb
        """
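        # Convert input to an absolute pathlib.Path without following symlinks,
        # so the symlink policy check below can still see the link itself
        try:
            # os.path.abspath normalizes "." and ".." lexically; Path.resolve() would
            # dereference symlinks and make the later is_symlink() test always False
            normalized_path = Path(os.path.abspath(image_path))
        except (OSError, ValueError, TypeError) as e: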
            # Handle path resolution failures with enhanced error context
            raise ImageValidationError(
                f"Invalid path format or resolution failure: {image_path}",
                file_path=image_path,
                validation_step="path_normalization",
                forensic_metadata=ForensicMetadata(
                    operation_name="path_validation",
                    algorithm_parameters={'original_path': str(image_path)}
                )
            ) from e

        # Verify path existence in filesystem with comprehensive search analysis
        if not normalized_path.exists():
            # Analyze potential alternative paths for diagnostic purposes
            parent_dir = normalized_path.parent
            suggested_alternatives = []

            if parent_dir.exists():
                # Search for similar filenames in parent directory
                try:
                    # Parenthesize the name comparison so is_file() applies to both branches
                    similar_files = [
                        f for f in parent_dir.iterdir()
                        if f.is_file() and (
                            f.stem.lower() in normalized_path.stem.lower() or
                            normalized_path.stem.lower() in f.stem.lower()
                        )
                    ]
                    suggested_alternatives = [str(f) for f in similar_files[:5]]
                except (PermissionError, OSError):
                    # Handle directory access failures gracefully
                    pass

            # Raise enhanced not found error with search context
            raise ImageNotFoundError(
                f"Image file does not exist: {normalized_path}",
                file_path=normalized_path,
                search_paths=suggested_alternatives,
                file_attributes={'parent_exists': parent_dir.exists() if parent_dir else False},
                forensic_metadata=ForensicMetadata(
                    operation_name="existence_validation",
                    algorithm_parameters={'search_performed': True}
                )
            )

        # Handle symlink analysis and policy enforcement
        if normalized_path.is_symlink():
            if not allow_symlinks:
                # Analyze symlink for security reporting
                try:
                    symlink_target = normalized_path.readlink()
                    resolved_target = normalized_path.resolve()

                    # Detect circular references in symlink chain
                    resolution_chain = []
                    current_path = normalized_path
                    max_resolution_depth = 10

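                    for _ in range(max_resolution_depth):
                        if current_path.is_symlink():
                            resolution_chain.append(current_path)
                            # readlink() may return a relative target; anchor it to the
                            # directory containing the link and normalize for comparison
                            link_target = current_path.readlink()
                            current_path = Path(os.path.normpath(current_path.parent / link_target))
                            # Check for circular reference
                            if current_path in resolution_chain:
                                break
                        else:
                            break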

                    # Raise symlink policy violation with comprehensive analysis
                    raise SymlinkNotAllowedError(
                        f"Symlinks not permitted by security policy: {normalized_path}",
                        file_path=normalized_path,
                        symlink_target=resolved_target,
                        resolution_chain=resolution_chain,
                        forensic_metadata=ForensicMetadata(
                            operation_name="symlink_analysis",
                            algorithm_parameters={
                                'symlink_target': str(symlink_target),
                                'resolution_depth': len(resolution_chain),
                                'circular_reference': current_path in resolution_chain
                            }
                        )
                    )

                except (OSError, RuntimeError) as e:
                    # Handle symlink analysis failures
                    raise SymlinkNotAllowedError(
                        f"Symlink analysis failed for security validation: {normalized_path}",
                        file_path=normalized_path
                    ) from e
            else:
                # Validate symlink target accessibility and safety
                try:
                    # Ensure symlink resolves to valid target
                    resolved_target = normalized_path.resolve()
                    if not resolved_target.exists():
                        raise ImageNotFoundError(
                            f"Symlink target does not exist: {normalized_path} -> {resolved_target}",
                            file_path=normalized_path,
                            file_attributes={'symlink_target': str(resolved_target)}
                        )
                except (OSError, RuntimeError) as e:
                    raise SymlinkNotAllowedError(
                        f"Symlink resolution failed: {normalized_path}",
                        file_path=normalized_path
                    ) from e

        # Validate path points to regular file with comprehensive type analysis
        if not normalized_path.is_file():
            # Analyze actual file type for enhanced error reporting
            actual_type = "unknown"
            try:
                if normalized_path.is_dir():
                    actual_type = "directory"
                elif normalized_path.is_symlink():
                    actual_type = "symlink"
                elif normalized_path.is_block_device():
                    actual_type = "block_device"
                elif normalized_path.is_char_device():
                    actual_type = "character_device"
                elif normalized_path.is_fifo():
                    actual_type = "named_pipe"
                elif normalized_path.is_socket():
                    actual_type = "socket"
            except (OSError, AttributeError):
                # Handle file type detection failures
                pass

            # Raise file type error with comprehensive analysis
            raise NotAFileError(
                f"Path is not a regular file: {normalized_path}",
                file_path=normalized_path,
                actual_file_type=actual_type,
                forensic_metadata=ForensicMetadata(
                    operation_name="file_type_validation",
                    algorithm_parameters={'detected_type': actual_type}
                )
            )

        # Comprehensive permission analysis with ACL assessment
        try:
            # Check basic read permission using OS-level access control
            if not os.access(normalized_path, os.R_OK):
                # Analyze permission structure for detailed reporting
                file_stat = normalized_path.stat()
                permission_analysis = {
                    'owner_readable': bool(file_stat.st_mode & 0o400),
                    'group_readable': bool(file_stat.st_mode & 0o040),
                    'other_readable': bool(file_stat.st_mode & 0o004),
                    'file_mode_octal': oct(file_stat.st_mode)[-3:],
                    'owner_uid': file_stat.st_uid,
                    'current_uid': os.getuid() if hasattr(os, 'getuid') else None
                }

                # Raise permission error with detailed ACL analysis
                raise PermissionDeniedError(
                    f"Insufficient read permissions for image file: {normalized_path}",
                    file_path=normalized_path,
                    permission_analysis=permission_analysis,
                    required_permissions=['read'],
                    forensic_metadata=ForensicMetadata(
                        operation_name="permission_validation",
                        algorithm_parameters=permission_analysis
                    )
                )
        except (OSError, AttributeError) as e:
            # Handle permission analysis failures
            raise PermissionDeniedError(
                f"Permission validation failed: {normalized_path}",
                file_path=normalized_path
            ) from e

        # File size validation for security and performance
        try:
            file_size_bytes = normalized_path.stat().st_size
            file_size_mb = file_size_bytes / (1024 * 1024)

            if file_size_mb > max_file_size_mb:
                raise ImageValidationError(
                    f"File size ({file_size_mb:.1f}MB) exceeds maximum allowed ({max_file_size_mb}MB): {normalized_path}",
                    file_path=normalized_path,
                    validation_step="size_validation",
                    file_attributes={'size_mb': file_size_mb, 'max_allowed_mb': max_file_size_mb}
                )

        except OSError as e:
            raise ImageValidationError(
                f"File size validation failed: {normalized_path}",
                file_path=normalized_path,
                validation_step="size_validation"
            ) from e

        # Optional content validation for image format verification
        if perform_content_validation:
            corruption_indicators = {}
            format_analysis = {}

            try:
                # Attempt to open and validate image content using PIL
                with Image.open(normalized_path) as img:
                    # Basic format validation
                    format_analysis = {
                        'format': img.format,
                        'mode': img.mode,
                        'size': img.size,
                        # PIL exposes palette/PNG transparency via img.info, not an attribute
                        'has_transparency': 'transparency' in img.info or img.mode in ('RGBA', 'LA', 'PA')
                    }

                    # Verify image can be loaded without corruption
                    img.verify()

            except OSError as e:  # IOError is an alias of OSError in Python 3
                # Analyze corruption indicators for diagnostic reporting
                corruption_indicators = {
                    'pil_error': str(e),
                    'file_truncated': 'truncated' in str(e).lower(),
                    'format_unsupported': 'cannot identify' in str(e).lower()
                }

                # Raise image corruption error with diagnostic analysis
                raise ImageUnreadableError(
                    f"Image file is corrupted or unsupported format: {normalized_path}",
                    file_path=normalized_path,
                    corruption_indicators=corruption_indicators,
                    format_analysis=format_analysis,
                    forensic_metadata=ForensicMetadata(
                        operation_name="content_validation",
                        algorithm_parameters={
                            'validation_method': 'PIL_verification',
                            'corruption_detected': True
                        }
                    )
                ) from e

            except Exception as e:
                # Handle unexpected validation failures
                raise ImageUnreadableError(
                    f"Unexpected error during image content validation: {normalized_path}",
                    file_path=normalized_path,
                    corruption_indicators={'unexpected_error': str(e)}
                ) from e

        # Return validated, normalized, and security-verified path
        return normalized_path

    def _load_clip_with_comprehensive_monitoring(self) -> None:
        """
        Thread-safe lazy loading of CLIP model with comprehensive performance monitoring.

        Implements enterprise-grade model loading with double-checked locking, performance
        profiling, resource optimization, and comprehensive error recovery strategies.

        Raises:
            ModelLoadError: If model loading fails after all recovery attempts with detailed context
        """
        # First check without lock acquisition for performance optimization
        if self.clip_model is not None and self.clip_preprocess is not None:
            return

        # Acquire initialization lock for thread-safe model loading coordination
        with self._initialization_lock:
            # Second check after lock acquisition to prevent duplicate loading
            if self.clip_model is not None and self.clip_preprocess is not None:
                return

            # Initialize comprehensive performance monitoring
            loading_start_time = time.perf_counter()
            memory_before = psutil.Process().memory_info().rss
            gpu_memory_before = torch.cuda.memory_allocated() if torch.cuda.is_available() else 0

            # Track loading attempts with detailed context
            loading_attempts = []
            device_fallback_chain = [self.device]

            # Add CPU fallback if not already specified
            if self.device != "cpu":
                device_fallback_chain.append("cpu")

            # Attempt model loading with comprehensive error handling and fallback
            for attempt in range(self._max_initialization_retries):
                for device_attempt in device_fallback_chain:
                    attempt_start_time = time.perf_counter()
                    attempt_context = {
                        'attempt_number': attempt + 1,
                        'device': device_attempt,
                        'model_name': self._clip_model_name,
                        # utcnow() is deprecated since Python 3.12; use a timezone-aware timestamp
                        'timestamp': datetime.datetime.now(datetime.timezone.utc).isoformat()
                    }

                    try:
                        # Attempt CLIP model loading through injectable loader with monitoring
                        self._logger.info(f"Attempting CLIP model loading: attempt {attempt + 1}, device {device_attempt}")

                        # Load model using factory loader with enhanced configuration
                        self.clip_model, self.clip_preprocess = self.clip_model_loader(
                            model_name=self._clip_model_name,
                            device=device_attempt,
                            enable_optimization=True,
                            precision="float32"  # Ensure numerical stability for quantitative analysis
                        )

                        # Validate successful loading with functionality verification
                        if self.clip_model is None or self.clip_preprocess is None:
                            raise ModelLoadError(
                                "CLIP model or preprocess function is None after loading",
                                model_name=self._clip_model_name
                            )

                        # Perform basic functionality test to ensure model integrity
                        try:
                            # Create test tensor to validate model functionality
                            test_tensor = torch.randn(1, 3, 224, 224).to(device_attempt)
                            with torch.no_grad():
                                # Test forward pass to ensure model is functional
                                test_embedding = self.clip_model.encode_image(test_tensor)
                                # Validate embedding properties
                                if test_embedding is None or test_embedding.numel() == 0:
                                    raise ModelLoadError("Model produces invalid embeddings")
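                        except Exception as e:
                            # Discard the unusable model so the lazy-load guard at the top
                            # of this method does not mistake a failed load for a success
                            self.clip_model = None
                            self.clip_preprocess = None
                            raise ModelLoadError(f"Model functionality test failed: {e}") from e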

                        # Record successful loading performance metrics
                        attempt_duration = time.perf_counter() - attempt_start_time
                        total_duration = time.perf_counter() - loading_start_time
                        memory_after = psutil.Process().memory_info().rss
                        gpu_memory_after = torch.cuda.memory_allocated() if torch.cuda.is_available() else 0

                        # Store comprehensive loading performance data
                        self._clip_loading_performance = {
                            'success': True,
                            'total_attempts': attempt + 1,
                            'successful_device': device_attempt,
                            'total_loading_time': total_duration,
                            'final_attempt_time': attempt_duration,
                            'memory_delta_mb': (memory_after - memory_before) / 1024 / 1024,
                            'gpu_memory_delta_mb': (gpu_memory_after - gpu_memory_before) / 1024 / 1024,
                            'model_parameters': self._estimate_model_parameters(),
                            'device_fallback_used': device_attempt != self.device
                        }

                        # Update device if fallback was successful
                        if device_attempt != self.device:
                            self._logger.info(f"Device fallback successful: {self.device} -> {device_attempt}")
                            self.device = device_attempt

                        # Log successful loading with comprehensive metrics
                        self._logger.info(
                            f"CLIP model loaded successfully: {self._clip_model_name}",
                            extra=self._clip_loading_performance
                        )

                        return  # Exit successfully

                    except torch.cuda.OutOfMemoryError as e:
                        # Handle GPU memory exhaustion with automatic cleanup.
                        # This handler must come before the RuntimeError handler below:
                        # torch.cuda.OutOfMemoryError subclasses RuntimeError, so the
                        # broader handler would otherwise intercept it first.
                        attempt_context.update({
                            'success': False,
                            'error_type': 'OutOfMemoryError',
                            'error_message': str(e),
                            'gpu_memory_info': self._get_gpu_memory_info()
                        })
                        loading_attempts.append(attempt_context)

                        # Perform aggressive GPU memory cleanup
                        if torch.cuda.is_available():
                            torch.cuda.empty_cache()
                            torch.cuda.synchronize()

                        # Log GPU memory exhaustion
                        self._logger.error(
                            f"GPU OOM during CLIP loading: {e}",
                            extra=attempt_context
                        )

                        # Skip to CPU fallback immediately for OOM errors
                        if device_attempt != "cpu":
                            continue
                        else:
                            # Abandon the remaining devices for this retry round;
                            # the outer loop decides whether to retry
                            break

                    except (OSError, RuntimeError, ModuleNotFoundError) as e:
                        # Record failed attempt with detailed context
                        attempt_duration = time.perf_counter() - attempt_start_time
                        attempt_context.update({
                            'success': False,
                            'error_type': type(e).__name__,
                            'error_message': str(e),
                            'attempt_duration': attempt_duration
                        })
                        loading_attempts.append(attempt_context)

                        # Log attempt failure with context
                        self._logger.warning(
                            f"CLIP loading attempt failed: {type(e).__name__}: {e}",
                            extra=attempt_context
                        )

                        # Continue to next device in fallback chain
                        continue

                    except Exception as e:
                        # Handle unexpected errors with comprehensive logging
                        attempt_context.update({
                            'success': False,
                            'error_type': type(e).__name__,
                            'error_message': str(e),
                            'unexpected_error': True
                        })
                        loading_attempts.append(attempt_context)

                        self._logger.error(
                            f"Unexpected error during CLIP loading: {e}",
                            extra=attempt_context
                        )

                        # Continue to next device for unexpected errors
                        continue

                # Brief pause between retry attempts to allow system recovery
                if attempt < self._max_initialization_retries - 1:
                    time.sleep(1.0)

            # All attempts failed - generate comprehensive error report
            total_duration = time.perf_counter() - loading_start_time
            error_context = {
                'model_name': self._clip_model_name,
                'total_attempts': len(loading_attempts),
                'total_duration': total_duration,
                'attempted_devices': device_fallback_chain,
                'loading_attempts': loading_attempts,
                'system_resources': {
                    'available_memory_gb': psutil.virtual_memory().available / 1024**3,
                    'cpu_count': psutil.cpu_count(),
                    'gpu_available': torch.cuda.is_available(),
                    'gpu_count': torch.cuda.device_count() if torch.cuda.is_available() else 0
                }
            }

            # Store failed loading performance data for analysis
            self._clip_loading_performance = {
                'success': False,
                'total_attempts': len(loading_attempts),
                'total_loading_time': total_duration,
                'error_context': error_context
            }

            # Generate comprehensive error summary
            error_types = [attempt.get('error_type', 'Unknown') for attempt in loading_attempts]
            most_common_error = max(set(error_types), key=error_types.count) if error_types else 'Unknown'

            # Raise comprehensive model loading error with full context
            raise ModelLoadError(
                f"Failed to load CLIP model {self._clip_model_name} after {len(loading_attempts)} attempts "
                f"across {len(device_fallback_chain)} devices. Most common error: {most_common_error}",
                model_name=self._clip_model_name,
                algorithm_context=error_context,
                forensic_metadata=ForensicMetadata(
                    operation_name="clip_model_loading",
                    algorithm_parameters={
                        'model_name': self._clip_model_name,
                        'device_chain': device_fallback_chain,
                        'max_retries': self._max_initialization_retries
                    }
                )
            )

    def _estimate_model_parameters(self) -> Optional[int]:
        """
        Estimate number of parameters in loaded CLIP model.

        Returns:
            Estimated parameter count or None if estimation fails
        """
        try:
            if self.clip_model is not None:
                # Count all parameters in the model (trainable and frozen)
                param_count = sum(p.numel() for p in self.clip_model.parameters())
                return param_count
        except Exception:
            # Handle parameter counting failures gracefully
            pass
        return None

    def _get_gpu_memory_info(self) -> Dict[str, Any]:
        """
        Get comprehensive GPU memory information for debugging.

        Returns:
            Dictionary containing GPU memory statistics
        """
        gpu_info = {}

        if torch.cuda.is_available():
            try:
                for device_idx in range(torch.cuda.device_count()):
                    device_props = torch.cuda.get_device_properties(device_idx)
                    gpu_info[f'gpu_{device_idx}'] = {
                        'name': device_props.name,
                        'total_memory_gb': device_props.total_memory / 1024**3,
                        'allocated_mb': torch.cuda.memory_allocated(device_idx) / 1024**2,
                        'reserved_mb': torch.cuda.memory_reserved(device_idx) / 1024**2,
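                        # Approximation: total minus this process's tensor allocations;
                        # ignores the caching allocator's reserve and other processes
                        'free_mb': (device_props.total_memory - torch.cuda.memory_allocated(device_idx)) / 1024**2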
                    }
            except Exception:
                gpu_info = {'error': 'Failed to retrieve GPU memory information'}

        return gpu_info

    def perceptual_hash_difference(
        self,
        image1: Union[str, Path, Image.Image, np.ndarray],
        image2: Union[str, Path, Image.Image, np.ndarray],
        hash_size: int = 8,
        normalize: bool = False,
        return_similarity: bool = False,
        compute_confidence_interval: bool = True,
        statistical_analysis: bool = True,
        performance_monitoring: bool = True
    ) -> Union[int, float, Dict[str, Any]]:
        """
        Compute perceptual hash difference with comprehensive statistical analysis and validation.

        Implements DCT-based perceptual hashing with rigorous mathematical validation,
        statistical confidence assessment, performance monitoring, and enterprise-grade
        error handling for quantitative image similarity assessment.

        Mathematical Foundation:
        1. Image preprocessing: I' = resize(grayscale(I), 4N×4N) where N = hash_size
        2. DCT computation: F(u,v) = α(u)α(v) Σₓ Σᵧ f(x,y)cos((2x+1)uπ/2M)cos((2y+1)vπ/2M) where M = 4N
        3. Low-frequency extraction: D_lf = F[0:N, 0:N] (top-left N×N coefficients)
        4. Median thresholding: τ = median(D_lf), h[i,j] = 1 if D_lf[i,j] > τ, else 0
        5. Hamming distance: d_H = Σᵢⱼ |h₁[i,j] ⊕ h₂[i,j]| ∈ [0, N²]
        6. Normalized similarity: sim = 1 - d_H/N² ∈ [0,1]
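
        Worked example (illustrative values, not measured data): with hash_size N = 8
        the hash has N² = 64 bits. If 8 bits differ, d_H = 8, the normalized distance
        is 8/64 = 0.125, and the similarity is 1 - 0.125 = 0.875.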

        Args:
            image1: First image as path, PIL Image, or numpy array
            image2: Second image as path, PIL Image, or numpy array
            hash_size: Hash dimension N producing N²-bit perceptual hash
            normalize: If True, return d_H / N² ∈ [0,1]
            return_similarity: If True, return 1 - normalized_distance (requires normalize=True)
            compute_confidence_interval: Whether to compute statistical confidence intervals
            statistical_analysis: Whether to perform comprehensive statistical analysis
            performance_monitoring: Whether to track performance metrics

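        Returns:
            Dict with the full analysis when statistical_analysis is True (the default);
            otherwise the raw Hamming distance (int), the normalized distance (float),
            or the similarity score (float), depending on normalize and return_similarity

        Raises:
            ValueError: If parameters violate mathematical constraints
            ImageValidationError: If images cannot be processed
            ImageUnreadableError: If an image file cannot be opened or decoded
            RuntimeError: If hash computation fails or yields an out-of-range distance
            PerformanceError: If operation exceeds resource constraints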
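
        Example (a sketch; `toolkit` is a hypothetical instance of this class and the
        file names are placeholders):
            # Default call returns the full analysis dictionary
            result = toolkit.perceptual_hash_difference("a.png", "b.png")
            distance = result['hamming_distance']

            # Scalar similarity in [0, 1] without the statistical analysis
            similarity = toolkit.perceptual_hash_difference(
                "a.png", "b.png",
                normalize=True, return_similarity=True, statistical_analysis=False
            )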
        """
        # Record method execution start for performance monitoring
        method_start_time = time.perf_counter()
        memory_before = psutil.Process().memory_info().rss

        # Validate hash_size parameter against mathematical constraints
        if hash_size <= 0:
            raise ValueError(f"hash_size must be positive integer, got {hash_size}")
        if hash_size > 64:
            # Practical upper limit for computational efficiency
            raise ValueError(f"hash_size too large for practical computation, got {hash_size} > 64")

        # Validate parameter combination logic with mathematical consistency
        if return_similarity and not normalize:
            raise ValueError("return_similarity requires normalize=True for mathematical consistency")

        # Initialize comprehensive performance tracking
        operation_metrics = {
            'method_name': 'perceptual_hash_difference',
            'start_time': datetime.datetime.now(datetime.timezone.utc).isoformat(),
            'parameters': {
                'hash_size': hash_size,
                'normalize': normalize,
                'return_similarity': return_similarity,
                'statistical_analysis': statistical_analysis
            }
        }

        # Helper function for comprehensive image loading and preprocessing with validation
        def _load_and_preprocess_image_with_validation(
            img_input: Union[str, Path, Image.Image, np.ndarray],
            image_label: str
        ) -> Tuple[Image.Image, Dict[str, Any]]:
            """
            Load and preprocess image with comprehensive validation and metadata extraction.

            Args:
                img_input: Image input in various supported formats
                image_label: Label for error reporting and logging

            Returns:
                Tuple of (processed_PIL_image, preprocessing_metadata)
            """
            # Initialize preprocessing metadata for analysis
            preprocessing_metadata = {
                'input_type': type(img_input).__name__,
                'original_format': None,
                'original_size': None,
                'conversion_applied': [],
                'validation_performed': True
            }

            # Handle numpy array input with comprehensive format analysis
            if isinstance(img_input, np.ndarray):
                # Validate array dimensions for image data compatibility
                if img_input.ndim not in [2, 3]:
                    raise ImageValidationError(
                        f"Invalid {image_label} array dimensions: {img_input.ndim}, expected 2 or 3",
                        validation_step="array_dimension_validation",
                        algorithm_context={'array_shape': img_input.shape}
                    )

                # Record original array properties for metadata
                preprocessing_metadata.update({
                    'original_size': img_input.shape,
                    'array_dtype': str(img_input.dtype),
                    'value_range': (float(img_input.min()), float(img_input.max()))
                })

                # Handle different array data types with mathematical precision
                if img_input.dtype == np.uint8:
                    # Standard 8-bit image data - direct conversion
                    array_data = img_input
                    preprocessing_metadata['conversion_applied'].append('dtype_validated')
                elif img_input.dtype in [np.float32, np.float64]:
                    # Floating point data - normalize to [0,255] range with validation
                    if img_input.max() <= 1.0 and img_input.min() >= 0.0:
                        # Assume [0,1] range and convert to [0,255] with precision preservation
                        array_data = (img_input * 255.0).astype(np.uint8)
                        preprocessing_metadata['conversion_applied'].append('float_to_uint8_normalized')
                    elif img_input.max() <= 255.0 and img_input.min() >= 0.0:
                        # Assume already in [0,255] range
                        array_data = img_input.astype(np.uint8)
                        preprocessing_metadata['conversion_applied'].append('float_to_uint8_direct')
                    else:
                        # Apply min-max normalization for arbitrary ranges
                        array_min, array_max = img_input.min(), img_input.max()
                        if array_max > array_min:
                            normalized = (img_input - array_min) / (array_max - array_min)
                            array_data = (normalized * 255.0).astype(np.uint8)
                            preprocessing_metadata['conversion_applied'].append('minmax_normalization')
                        else:
                            # Constant array - convert to zero array
                            array_data = np.zeros_like(img_input, dtype=np.uint8)
                            preprocessing_metadata['conversion_applied'].append('constant_array_handling')
                else:
                    # Handle other data types with range normalization
                    array_min, array_max = img_input.min(), img_input.max()
                    if array_max > array_min:
                        # Apply linear scaling to [0,255] range
                        normalized = (img_input.astype(np.float64) - array_min) / (array_max - array_min)
                        array_data = (normalized * 255.0).astype(np.uint8)
                        preprocessing_metadata['conversion_applied'].append('generic_dtype_normalization')
                    else:
                        # Handle constant arrays
                        array_data = np.zeros_like(img_input, dtype=np.uint8)
                        preprocessing_metadata['conversion_applied'].append('constant_array_fallback')

                # Convert BGR to RGB if 3-channel array (OpenCV convention handling)
                if array_data.ndim == 3 and array_data.shape[2] == 3:
                    # Assume BGR ordering from OpenCV and convert to RGB
                    array_data = cv2.cvtColor(array_data, cv2.COLOR_BGR2RGB)
                    preprocessing_metadata['conversion_applied'].append('BGR_to_RGB_conversion')
                elif array_data.ndim == 3 and array_data.shape[2] == 4:
                    # 4-channel input is assumed to be RGBA (not OpenCV BGRA); drop the alpha channel
                    array_data = array_data[:, :, :3]
                    preprocessing_metadata['conversion_applied'].append('RGBA_to_RGB_conversion')

                # Create PIL Image from processed array with error handling
                try:
                    pil_image = Image.fromarray(array_data)
                    preprocessing_metadata['conversion_applied'].append('array_to_PIL_conversion')
                except Exception as e:
                    raise ImageValidationError(
                        f"Failed to convert {image_label} numpy array to PIL Image: {e}",
                        validation_step="array_to_PIL_conversion",
                        algorithm_context={'array_shape': array_data.shape, 'array_dtype': str(array_data.dtype)}
                    ) from e

            # Handle PIL Image input with comprehensive validation
            elif isinstance(img_input, Image.Image):
                # Record original PIL image properties
                preprocessing_metadata.update({
                    'original_format': img_input.format,
                    'original_mode': img_input.mode,
                    'original_size': img_input.size
                })

                # Use the provided PIL Image as-is; its mode is normalized by the grayscale conversion below
                pil_image = img_input
                preprocessing_metadata['conversion_applied'].append('PIL_image_direct_use')

            # Handle path input with comprehensive validation and loading
            elif isinstance(img_input, (str, Path)):
                # Validate image path using enhanced validation with security analysis
                validated_path = self._validate_image_path(
                    img_input,
                    self._allow_symlinks,
                    perform_content_validation=True,
                    max_file_size_mb=50.0  # Reasonable limit for hash computation
                )

                # Record path information for metadata
                preprocessing_metadata.update({
                    'file_path': str(validated_path),
                    'file_size_bytes': validated_path.stat().st_size
                })

                # Attempt image loading with comprehensive error handling
                try:
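                    # Open lazily, record metadata, then copy() to force a full decode
                    # and detach the image from the file handle before it is closed
                    with Image.open(validated_path) as opened_image:
                        preprocessing_metadata.update({
                            'original_format': opened_image.format,
                            'original_mode': opened_image.mode,
                            'original_size': opened_image.size
                        })
                        pil_image = opened_image.copy()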
                    preprocessing_metadata['conversion_applied'].append('file_to_PIL_loading')

                except OSError as e:
                    # Enhanced error reporting with file analysis
                    raise ImageUnreadableError(
                        f"Failed to load {image_label} from {validated_path}: {e}",
                        file_path=validated_path,
                        corruption_indicators={'loading_error': str(e)},
                        forensic_metadata=ForensicMetadata(
                            operation_name="perceptual_hash_image_loading",
                            algorithm_parameters={'image_label': image_label}
                        )
                    ) from e

            else:
                # Unsupported input type with comprehensive error context
                raise ValueError(
                    f"Unsupported {image_label} input type: {type(img_input)}. "
                    f"Supported types: str, Path, PIL.Image, np.ndarray"
                )

            # Convert to grayscale for perceptual hash computation (mathematical requirement)
            try:
                # Convert to grayscale mode 'L' for DCT computation
                grayscale_image = pil_image.convert('L')
                preprocessing_metadata['conversion_applied'].append('grayscale_conversion')

                # Record final processed image properties
                preprocessing_metadata.update({
                    'final_mode': grayscale_image.mode,
                    'final_size': grayscale_image.size
                })

            except Exception as e:
                raise ImageValidationError(
                    f"Failed to convert {image_label} to grayscale: {e}",
                    validation_step="grayscale_conversion"
                ) from e

            return grayscale_image, preprocessing_metadata

        # Load and preprocess both input images with comprehensive validation
        try:
            # Process first image with detailed metadata collection
            processed_image1, metadata1 = _load_and_preprocess_image_with_validation(image1, "image1")
            # Process second image with detailed metadata collection
            processed_image2, metadata2 = _load_and_preprocess_image_with_validation(image2, "image2")

            # Record preprocessing performance metrics
            preprocessing_time = time.perf_counter() - method_start_time
            operation_metrics['preprocessing_time_seconds'] = preprocessing_time
            operation_metrics['image_metadata'] = {
                'image1': metadata1,
                'image2': metadata2
            }

        except Exception as e:
            # Enhanced error context for preprocessing failures
            operation_metrics['error'] = {
                'stage': 'preprocessing',
                'error_type': type(e).__name__,
                'error_message': str(e)
            }

            # Log preprocessing failure with performance context
            if performance_monitoring and hasattr(self, '_logger'):
                self._logger.error(
                    "Perceptual hash preprocessing failed",
                    extra=operation_metrics
                )

            # Re-raise with preserved context
            raise

        # Compute perceptual hashes using imagehash library with error handling
        hash_computation_start = time.perf_counter()

        try:
            # Generate pHash using DCT-based algorithm with specified hash_size
            # Mathematical process: resize → DCT → low-frequency extraction → median threshold
            hash1 = imagehash.phash(processed_image1, hash_size=hash_size)
            hash2 = imagehash.phash(processed_image2, hash_size=hash_size)

            # Validate hash generation success
            if hash1 is None or hash2 is None:
                raise RuntimeError("Perceptual hash computation returned None")

            # Record hash computation performance
            hash_computation_time = time.perf_counter() - hash_computation_start
            operation_metrics['hash_computation_time_seconds'] = hash_computation_time

        except Exception as e:
            # Enhanced error reporting for hash computation failures
            operation_metrics['error'] = {
                'stage': 'hash_computation',
                'error_type': type(e).__name__,
                'error_message': str(e),
                'hash_size': hash_size
            }

            raise RuntimeError(f"Perceptual hash computation failed: {e}") from e

        # Compute Hamming distance between binary hashes
        # Mathematical formula: d_H = Σᵢⱼ |h₁[i,j] ⊕ h₂[i,j]|
        hamming_distance = int(hash1 - hash2)

        # Validate Hamming distance is within mathematical bounds
        max_possible_distance = hash_size * hash_size
        if not 0 <= hamming_distance <= max_possible_distance:
            raise RuntimeError(
                f"Invalid Hamming distance {hamming_distance}, "
                f"expected range [0, {max_possible_distance}]"
            )

        # Record basic result metrics
        operation_metrics['hamming_distance'] = hamming_distance
        operation_metrics['max_possible_distance'] = max_possible_distance

        # Perform statistical analysis if requested
        statistical_properties = None
        if statistical_analysis:
            # Compute statistical properties of hash difference
            # Convert hashes to flat 0/1 arrays via ImageHash.hash (a boolean ndarray);
            # str(hash) yields hexadecimal digits, not individual bits
            hash1_array = hash1.hash.flatten().astype(np.uint8)
            hash2_array = hash2.hash.flatten().astype(np.uint8)

            # Compute bit-wise differences (XOR) for statistical analysis
            bit_differences = np.bitwise_xor(hash1_array, hash2_array)

            # Compute comprehensive statistical properties
            statistical_properties = StatisticalProperties(
                sample_size=len(bit_differences),
                mean=float(np.mean(bit_differences)),
                median=float(np.median(bit_differences)),
                variance=float(np.var(bit_differences)),
                standard_deviation=float(np.std(bit_differences)),
                standard_error=float(np.std(bit_differences) / np.sqrt(len(bit_differences))),
                confidence_level=0.95,
                distribution_type="bernoulli"
            )

            # Compute confidence interval if requested
            if compute_confidence_interval and len(bit_differences) > 1:
                # Use binomial proportion confidence interval for Hamming distance
                n_bits = len(bit_differences)
                p_hat = hamming_distance / n_bits  # Proportion of different bits

                # Wilson score interval for more robust confidence estimation
                z_score = stats.norm.ppf(0.975)  # 95% confidence level
                denominator = 1 + z_score**2 / n_bits
                center = (p_hat + z_score**2 / (2 * n_bits)) / denominator
                margin = z_score * np.sqrt(p_hat * (1 - p_hat) / n_bits + z_score**2 / (4 * n_bits**2)) / denominator

                # Convert back to Hamming distance scale
                ci_lower = max(0, (center - margin) * n_bits)
                ci_upper = min(n_bits, (center + margin) * n_bits)

                # Update statistical properties with confidence interval
                statistical_properties.confidence_interval_lower = ci_lower
                statistical_properties.confidence_interval_upper = ci_upper
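
                # Worked example (illustrative arithmetic, not measured data): for
                # n_bits = 64 and hamming_distance = 8, p_hat = 0.125 and z ≈ 1.96, so
                # center ≈ 0.146 and margin ≈ 0.082, giving an interval of roughly
                # [4.1, 14.6] bits. The interval treats the bits as independent
                # Bernoulli trials, which is a modelling assumption for pHash bits.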

        # Apply normalization and similarity transformation if requested
        if normalize:
            # Compute normalized distance: d_norm = d_H / N²
            normalized_distance = float(hamming_distance) / float(max_possible_distance)
            operation_metrics['normalized_distance'] = normalized_distance

            if return_similarity:
                # Compute similarity score: similarity = 1 - d_norm
                similarity_score = 1.0 - normalized_distance
                operation_metrics['similarity_score'] = similarity_score

                # Validate similarity score is in valid range
                if not 0.0 <= similarity_score <= 1.0:
                    raise RuntimeError(f"Invalid similarity score {similarity_score}, expected [0,1]")

                result_value = similarity_score
            else:
                result_value = normalized_distance
        else:
            # Return raw Hamming distance as integer
            result_value = hamming_distance

        # Record comprehensive performance metrics
        total_execution_time = time.perf_counter() - method_start_time
        memory_after = psutil.Process().memory_info().rss
        memory_delta = memory_after - memory_before

        operation_metrics.update({
            'total_execution_time_seconds': total_execution_time,
            'memory_delta_bytes': memory_delta,
            'result_value': result_value,
            'statistical_analysis_performed': statistical_analysis,
            'confidence_interval_computed': compute_confidence_interval and statistical_properties is not None
        })

        # Store performance metrics for analysis if monitoring enabled
        if performance_monitoring and hasattr(self, '_method_performance_history'):
            with self._performance_lock:
                self._method_performance_history['perceptual_hash_difference'].append(operation_metrics)
                # Limit history size to prevent memory growth
                if len(self._method_performance_history['perceptual_hash_difference']) > 100:
                    self._method_performance_history['perceptual_hash_difference'] = \
                        self._method_performance_history['perceptual_hash_difference'][-50:]

        # Return comprehensive result based on analysis level requested
        if statistical_analysis:
            # Return comprehensive analysis with all computed metrics
            comprehensive_result = {
                'hamming_distance': hamming_distance,
                'max_possible_distance': max_possible_distance,
                'normalized_distance': normalized_distance if normalize else None,
                'similarity_score': similarity_score if return_similarity else None,
                'statistical_properties': statistical_properties,
                'performance_metrics': operation_metrics,
                'hash_size': hash_size,
                'computation_successful': True
            }
            return comprehensive_result
        else:
            # Return simple numerical result
            return result_value

    def feature_match_ratio(
        self,
        image1: Union[str, Path, np.ndarray],
        image2: Union[str, Path, np.ndarray],
        distance_threshold: int = 50,
        normalization_strategy: Literal["total_matches", "min_keypoints"] = "total_matches",
        apply_ratio_test: bool = False,
        ratio_threshold: float = 0.75,
        resize_max_side: Optional[int] = None,
        return_detailed_result: bool = True,
        geometric_verification: bool = True,
        performance_monitoring: bool = True,
        statistical_analysis: bool = True
    ) -> Union[float, FeatureMatchResult]:
        """
        Compute feature matching similarity with comprehensive statistical analysis and validation.

        Implements ORB feature detection and matching with rigorous mathematical validation,
        geometric verification, statistical confidence assessment, and enterprise-grade
        performance monitoring for quantitative image similarity analysis.

        Mathematical Foundation:
        1. ORB Feature Detection:
          - FAST corners: p is a corner if |I(p) - I(x)| > τ for a contiguous arc of pixels x on circle(p)
          - Harris response: R = det(M) - k·trace²(M) where M is structure tensor
          - Orientation: θ = atan2(m₀₁, m₁₀) where m_pq = Σ_{u,v} u^p v^q I(u,v)

        2. BRIEF Descriptors:
          - Binary tests: τ(p; x,y) := 1 if p(x) < p(y), else 0
          - Descriptor: f_n(p) = Σ_{1≤i≤n} 2^(i-1)τ(p; x_i, y_i)
          - Hamming distance: d_H(D₁,D₂) = Σᵢ |D₁[i] ⊕ D₂[i]|

        3. Matching Strategies:
          - Total matches: ratio = |good| / |matches|
          - Min keypoints: ratio = |good| / min(|KP₁|, |KP₂|)
          - Lowe's test: accept if d₁/d₂ < τ where d₁ < d₂

        4. Geometric Verification:
          - RANSAC homography: H·p₁ ≈ p₂ for inlier correspondences
          - Consensus set: inliers = {(p₁,p₂) : ||H·p₁ - p₂|| < ε}
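
        Worked example (illustrative values, not measured data): with |good| = 30,
        |matches| = 120, |KP₁| = 500, and |KP₂| = 400, "total_matches" gives
        30/120 = 0.25, while "min_keypoints" gives 30/min(500, 400) = 0.075.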

        Args:
            image1: First image as path or numpy array
            image2: Second image as path or numpy array
            distance_threshold: Maximum Hamming distance for good match (0-256)
            normalization_strategy: Method for ratio computation
            apply_ratio_test: Whether to use Lowe's ratio test for disambiguation
            ratio_threshold: Threshold for Lowe's ratio test (typically 0.75)
            resize_max_side: Optional downscaling to limit computational cost
            return_detailed_result: If True, return comprehensive FeatureMatchResult
            geometric_verification: Whether to perform RANSAC geometric verification
            performance_monitoring: Whether to track detailed performance metrics
            statistical_analysis: Whether to compute statistical properties

        Returns:
            Similarity ratio [0,1] or comprehensive FeatureMatchResult with analysis

        Raises:
            ValueError: If parameters violate mathematical constraints
            RuntimeError: If feature detection or matching fails
            ResourceAllocationError: If insufficient memory for processing
        """
        # Record method execution start for comprehensive performance monitoring
        method_start_time = time.perf_counter()
        memory_before = psutil.Process().memory_info().rss

        # Validate normalization strategy parameter against supported algorithms
        valid_strategies = {"total_matches", "min_keypoints"}
        if normalization_strategy not in valid_strategies:
            raise ValueError(f"Invalid normalization_strategy: {normalization_strategy}, valid: {valid_strategies}")

        # Validate distance threshold for Hamming distance (256-bit binary ORB descriptors)
        if not 0 <= distance_threshold <= 256:
            raise ValueError(f"distance_threshold must be in [0,256], got {distance_threshold}")

        # Validate ratio threshold for Lowe's disambiguation test
        if not 0.0 < ratio_threshold < 1.0:
            raise ValueError(f"ratio_threshold must be in (0,1), got {ratio_threshold}")

        # Validate resize parameter for computational efficiency
        if resize_max_side is not None and resize_max_side <= 0:
            raise ValueError(f"resize_max_side must be positive, got {resize_max_side}")

        # Initialize comprehensive performance tracking with detailed metrics
        operation_metrics = {
            'method_name': 'feature_match_ratio',
            'start_time': datetime.datetime.now(datetime.timezone.utc).isoformat(),
            'parameters': {
                'distance_threshold': distance_threshold,
                'normalization_strategy': normalization_strategy,
                'apply_ratio_test': apply_ratio_test,
                'ratio_threshold': ratio_threshold,
                'resize_max_side': resize_max_side,
                'geometric_verification': geometric_verification
            },
            'processing_stages': {}
        }

        # Helper function for comprehensive image loading and preprocessing
        def _load_and_prepare_image_for_features(
            img_input: Union[str, Path, np.ndarray],
            image_label: str
        ) -> Tuple[np.ndarray, Dict[str, Any]]:
            """
            Load image and prepare for ORB feature detection with optimization.

            Args:
                img_input: Image input in supported format
                image_label: Label for error reporting and performance tracking

            Returns:
                Tuple of (grayscale_array, processing_metadata)
            """
            # Initialize processing metadata for comprehensive analysis
            processing_metadata = {
                'input_type': type(img_input).__name__,
                'transformations_applied': [],
                'validation_performed': True,
                'optimization_applied': []
            }

            # Handle path input with enhanced validation
            if isinstance(img_input, (str, Path)):
                # Validate image path with comprehensive security analysis
                validated_path = self._validate_image_path(
                    img_input,
                    self._allow_symlinks,
                    perform_content_validation=True,
                    max_file_size_mb=100.0  # Generous limit for feature detection
                )

                # Load image in BGR color format for OpenCV processing
                image_array = cv2.imread(str(validated_path), cv2.IMREAD_COLOR)
                if image_array is None:
                    raise RuntimeError(f"OpenCV failed to load {image_label}: {validated_path}")

                # Record original image properties
                processing_metadata.update({
                    'file_path': str(validated_path),
                    'original_size': (image_array.shape[1], image_array.shape[0]),  # (width, height)
                    'original_channels': image_array.shape[2] if image_array.ndim == 3 else 1
                })
                processing_metadata['transformations_applied'].append('file_loading')

            # Handle numpy array input with comprehensive validation
            elif isinstance(img_input, np.ndarray):
                # Validate array dimensions for image data compatibility
                if img_input.ndim not in [2, 3]:
                    raise ValueError(f"Invalid {image_label} array dimensions: {img_input.ndim}")

                # Copy input array to prevent mutation of original data
                image_array = img_input.copy()

                # Record original array properties
                processing_metadata.update({
                    'original_size': (image_array.shape[1], image_array.shape[0]),
                    'original_channels': image_array.shape[2] if image_array.ndim == 3 else 1,
                    'array_dtype': str(image_array.dtype),
                    'value_range': (float(image_array.min()), float(image_array.max()))
                })

                # Ensure 3-channel color image for consistent OpenCV processing
                if image_array.ndim == 2:
                    # Convert grayscale to 3-channel BGR for uniform handling
                    image_array = cv2.cvtColor(image_array, cv2.COLOR_GRAY2BGR)
                    processing_metadata['transformations_applied'].append('grayscale_to_BGR')
                elif image_array.shape[2] == 4:
                    # Remove alpha channel for OpenCV compatibility
                    image_array = cv2.cvtColor(image_array, cv2.COLOR_BGRA2BGR)
                    processing_metadata['transformations_applied'].append('BGRA_to_BGR')
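                elif image_array.shape[2] == 3:
                    # Assume RGB and convert to BGR for OpenCV
                    image_array = cv2.cvtColor(image_array, cv2.COLOR_RGB2BGR)
                    processing_metadata['transformations_applied'].append('RGB_to_BGR')
                else:
                    # Reject channel counts that cannot be converted to BGR
                    raise ValueError(f"Unsupported {image_label} channel count: {image_array.shape[2]}")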

            else:
                raise ValueError(f"Unsupported {image_label} input type: {type(img_input)}")

            # Apply optional resizing with aspect ratio preservation for efficiency
            if resize_max_side is not None:
                # Get current image dimensions
                current_height, current_width = image_array.shape[:2]
                current_max_side = max(current_height, current_width)

                # Resize only if current size exceeds limit for computational efficiency
                if current_max_side > resize_max_side:
                    # Compute scale factor preserving aspect ratio
                    scale_factor = resize_max_side / current_max_side
                    new_width = max(1, int(current_width * scale_factor))  # Never collapse to 0 px
                    new_height = max(1, int(current_height * scale_factor))

                    # Apply high-quality resizing using area interpolation for downscaling
                    image_array = cv2.resize(
                        image_array,
                        (new_width, new_height),
                        interpolation=cv2.INTER_AREA
                    )

                    # Record resizing transformation
                    processing_metadata.update({
                        'resized_size': (new_width, new_height),
                        'scale_factor': scale_factor
                    })
                    processing_metadata['transformations_applied'].append('aspect_preserving_resize')
                    processing_metadata['optimization_applied'].append('computational_downscaling')

            # Convert to grayscale for ORB feature detection (algorithm requirement)
            if image_array.ndim == 3:
                grayscale_array = cv2.cvtColor(image_array, cv2.COLOR_BGR2GRAY)
                processing_metadata['transformations_applied'].append('BGR_to_grayscale')
            else:
                grayscale_array = image_array

            # Validate final grayscale array properties
            if grayscale_array.ndim != 2:
                raise RuntimeError(f"Failed to produce grayscale array for {image_label}")

            # Record final processing results
            processing_metadata.update({
                'final_size': (grayscale_array.shape[1], grayscale_array.shape[0]),
                'final_dtype': str(grayscale_array.dtype)
            })

            return grayscale_array, processing_metadata

        # Load and prepare both images for feature detection with performance monitoring
        preprocessing_start = time.perf_counter()

        try:
            # Process first image with comprehensive preprocessing
            prepared_image1, metadata1 = _load_and_prepare_image_for_features(image1, "image1")
            # Process second image with comprehensive preprocessing
            prepared_image2, metadata2 = _load_and_prepare_image_for_features(image2, "image2")

            # Record preprocessing performance metrics
            preprocessing_time = time.perf_counter() - preprocessing_start
            operation_metrics['processing_stages']['preprocessing'] = {
                'duration_seconds': preprocessing_time,
                'image1_metadata': metadata1,
                'image2_metadata': metadata2
            }

        except Exception as e:
            # Enhanced error context for preprocessing failures
            operation_metrics['error'] = {
                'stage': 'preprocessing',
                'error_type': type(e).__name__,
                'error_message': str(e)
            }
            raise RuntimeError(f"Image preprocessing failed: {e}") from e

        # Detect keypoints and compute ORB descriptors with performance monitoring
        feature_detection_start = time.perf_counter()

        try:
            # Image 1: Detect FAST corners and compute BRIEF descriptors
            keypoints1, descriptors1 = self.orb_detector.detectAndCompute(prepared_image1, None)
            # Image 2: Detect FAST corners and compute BRIEF descriptors
            keypoints2, descriptors2 = self.orb_detector.detectAndCompute(prepared_image2, None)

            # Record feature detection performance and results
            feature_detection_time = time.perf_counter() - feature_detection_start
            operation_metrics['processing_stages']['feature_detection'] = {
                'duration_seconds': feature_detection_time,
                'keypoints_image1': len(keypoints1) if keypoints1 else 0,
                'keypoints_image2': len(keypoints2) if keypoints2 else 0,
                'descriptors_image1_shape': descriptors1.shape if descriptors1 is not None else None,
                'descriptors_image2_shape': descriptors2.shape if descriptors2 is not None else None
            }

        except Exception as e:
            # Enhanced error reporting for feature detection failures
            operation_metrics['error'] = {
                'stage': 'feature_detection',
                'error_type': type(e).__name__,
                'error_message': str(e)
            }
            raise RuntimeError(f"ORB feature detection failed: {e}") from e

        # Handle cases where no descriptors are detected with detailed analysis
        if descriptors1 is None or descriptors2 is None or len(descriptors1) == 0 or len(descriptors2) == 0:
            # Create comprehensive result for no-feature case with diagnostic information
            no_features_result = FeatureMatchResult(
                similarity_ratio=0.0,
                total_matches=0,
                good_matches=0,
                keypoints_image1=len(keypoints1) if keypoints1 else 0,
                keypoints_image2=len(keypoints2) if keypoints2 else 0,
                normalization_strategy=normalization_strategy,
                confidence_level=0.0,
                geometric_verification_passed=None,
                matching_time_seconds=0.0,
                keypoint_detection_time_seconds=feature_detection_time,
                distance_threshold_used=distance_threshold,
                lowe_ratio_threshold=ratio_threshold if apply_ratio_test else None
            )

            # Record no-features case in performance metrics
            operation_metrics['result_type'] = 'no_features_detected'
            operation_metrics['total_execution_time_seconds'] = time.perf_counter() - method_start_time

            # Store performance metrics if monitoring enabled
            if performance_monitoring:
                self._record_method_performance('feature_match_ratio', operation_metrics)

            return no_features_result if return_detailed_result else 0.0

        # Perform descriptor matching using configured algorithm with performance monitoring
        matching_start = time.perf_counter()
        good_matches = []
        total_matches = 0
        match_distances = []

        try:
            if apply_ratio_test:
                # Apply Lowe's ratio test using k-nearest neighbor matching
                # Mathematical foundation: accept match if d₁/d₂ < threshold for disambiguation
                knn_matches = self.bf_matcher.knnMatch(descriptors1, descriptors2, k=2)

                # Process KNN matches with ratio test validation
                for match_pair in knn_matches:
                    if len(match_pair) == 2:
                        best_match, second_match = match_pair
                        # Apply Lowe's ratio test criterion: d₁/d₂ < τ
                        if best_match.distance < ratio_threshold * second_match.distance:
                            good_matches.append(best_match)
                            match_distances.append(best_match.distance)
                        total_matches += 1
                    elif len(match_pair) == 1:
                        # Only one match found - accept it (unambiguous case)
                        good_matches.append(match_pair[0])
                        match_distances.append(match_pair[0].distance)
                        total_matches += 1
            else:
                # Use best-match matching with a distance threshold
                # (cross-check applies only if bf_matcher was created with crossCheck=True)
                matches = self.bf_matcher.match(descriptors1, descriptors2)
                total_matches = len(matches)

                # Filter matches by Hamming distance threshold
                for match in matches:
                    if match.distance <= distance_threshold:
                        good_matches.append(match)
                    match_distances.append(match.distance)

            # Record matching performance metrics
            matching_time = time.perf_counter() - matching_start
            operation_metrics['processing_stages']['matching'] = {
                'duration_seconds': matching_time,
                'total_matches': total_matches,
                'good_matches': len(good_matches),
                'ratio_test_applied': apply_ratio_test,
                'average_distance': float(np.mean(match_distances)) if match_distances else 0.0,
                'min_distance': float(np.min(match_distances)) if match_distances else 0.0,
                'max_distance': float(np.max(match_distances)) if match_distances else 0.0
            }

        except Exception as e:
            # Enhanced error reporting for matching failures
            operation_metrics['error'] = {
                'stage': 'matching',
                'error_type': type(e).__name__,
                'error_message': str(e)
            }
            raise RuntimeError(f"Descriptor matching failed: {e}") from e

        # Compute similarity ratio using specified normalization strategy
        if total_matches == 0:
            similarity_ratio = 0.0
        else:
            if normalization_strategy == "total_matches":
                # Normalize by total number of attempted matches
                similarity_ratio = float(len(good_matches)) / float(total_matches)
            else:  # "min_keypoints"
                # Normalize by minimum number of detected keypoints
                min_keypoints = min(len(keypoints1), len(keypoints2))
                if min_keypoints == 0:
                    similarity_ratio = 0.0
                else:
                    # Clamp ratio to [0,1] range for min_keypoints strategy
                    similarity_ratio = min(float(len(good_matches)) / float(min_keypoints), 1.0)

        # Perform geometric verification using RANSAC if requested and sufficient matches
        geometric_verification_passed = None
        homography_inlier_count = None
        homography_inlier_ratio = None

        if geometric_verification and len(good_matches) >= 4:  # Minimum for homography estimation
            geometric_verification_start = time.perf_counter()

            try:
                # Extract point correspondences from good matches
                src_points = np.float32([keypoints1[m.queryIdx].pt for m in good_matches]).reshape(-1, 1, 2)
                dst_points = np.float32([keypoints2[m.trainIdx].pt for m in good_matches]).reshape(-1, 1, 2)

                # Estimate homography using RANSAC with robust parameters
                homography_matrix, inlier_mask = cv2.findHomography(
                    src_points,
                    dst_points,
                    cv2.RANSAC,
                    ransacReprojThreshold=3.0,  # Maximum reprojection error (pixels) for an inlier
                    maxIters=2000,  # Maximum RANSAC iterations
                    confidence=0.99  # Confidence level for RANSAC
                )

                # Analyze geometric verification results
                if homography_matrix is not None and inlier_mask is not None:
                    # Count inliers from RANSAC consensus set
                    homography_inlier_count = int(np.sum(inlier_mask))
                    homography_inlier_ratio = float(homography_inlier_count) / float(len(good_matches))

                    # Determine if geometric verification passed based on inlier ratio
                    geometric_verification_passed = homography_inlier_ratio >= 0.3  # 30% inlier threshold
                else:
                    # Homography estimation failed
                    homography_inlier_count = 0
                    homography_inlier_ratio = 0.0
                    geometric_verification_passed = False

                # Record geometric verification performance
                geometric_verification_time = time.perf_counter() - geometric_verification_start
                operation_metrics['processing_stages']['geometric_verification'] = {
                    'duration_seconds': geometric_verification_time,
                    'homography_found': homography_matrix is not None,
                    'inlier_count': homography_inlier_count,
                    'inlier_ratio': homography_inlier_ratio,
                    'verification_passed': geometric_verification_passed
                }

            except Exception as e:
                # Handle geometric verification failures gracefully
                geometric_verification_passed = False
                operation_metrics['processing_stages']['geometric_verification'] = {
                    'error': str(e),
                    'verification_passed': False
                }

        # Compute statistical properties of match distances if analysis requested
        match_distance_statistics = None
        if statistical_analysis and match_distances:
            try:
                # Compute comprehensive statistical properties of Hamming distances
                distances_array = np.array(match_distances)

                match_distance_statistics = StatisticalProperties(
                    sample_size=len(distances_array),
                    mean=float(np.mean(distances_array)),
                    median=float(np.median(distances_array)),
                    variance=float(np.var(distances_array)),
                    standard_deviation=float(np.std(distances_array)),
                    standard_error=float(np.std(distances_array) / np.sqrt(len(distances_array))),
                    skewness=float(stats.skew(distances_array)),
                    kurtosis=float(stats.kurtosis(distances_array)),
                    confidence_level=0.95
                )

                # Compute confidence interval for mean distance
                if len(distances_array) > 1:
                    ci_lower, ci_upper = match_distance_statistics.compute_confidence_interval(
                        confidence_level=0.95,
                        distribution="t"
                    )
                    match_distance_statistics.confidence_interval_lower = ci_lower
                    match_distance_statistics.confidence_interval_upper = ci_upper

            except Exception as e:
                # Handle statistical analysis failures gracefully
                operation_metrics['statistical_analysis_error'] = str(e)

        # Compute confidence level based on match statistics and geometric consistency
        confidence_level = None
        if len(good_matches) > 0:
            # Base confidence on number of good matches and geometric consistency
            match_confidence = min(float(len(good_matches)) / 50.0, 1.0)  # Normalize to [0,1]

            # Adjust confidence based on geometric verification if performed
            if geometric_verification_passed is True:
                geometric_confidence = homography_inlier_ratio if homography_inlier_ratio else 0.5
                confidence_level = (match_confidence + geometric_confidence) / 2.0
            elif geometric_verification_passed is False:
                # Reduce confidence if geometric verification failed
                confidence_level = match_confidence * 0.5
            else:
                # No geometric verification performed
                confidence_level = match_confidence
        else:
            confidence_level = 0.0

        # Record comprehensive performance metrics
        total_execution_time = time.perf_counter() - method_start_time
        memory_after = psutil.Process().memory_info().rss
        memory_delta = memory_after - memory_before

        operation_metrics.update({
            'total_execution_time_seconds': total_execution_time,
            'memory_delta_bytes': memory_delta,
            'similarity_ratio': similarity_ratio,
            'confidence_level': confidence_level,
            'result_type': 'successful_matching'
        })

        # Store performance metrics for analysis if monitoring enabled
        if performance_monitoring:
            self._record_method_performance('feature_match_ratio', operation_metrics)

        # Create comprehensive result structure with all computed metrics
        detailed_result = FeatureMatchResult(
            similarity_ratio=similarity_ratio,
            total_matches=total_matches,
            good_matches=len(good_matches),
            keypoints_image1=len(keypoints1),
            keypoints_image2=len(keypoints2),
            normalization_strategy=normalization_strategy,
            confidence_level=confidence_level,
            match_distance_statistics=match_distance_statistics,
            geometric_verification_passed=geometric_verification_passed,
            homography_inlier_count=homography_inlier_count,
            homography_inlier_ratio=homography_inlier_ratio,
            matching_time_seconds=operation_metrics['processing_stages']['matching']['duration_seconds'],
            keypoint_detection_time_seconds=operation_metrics['processing_stages']['feature_detection']['duration_seconds'],
            distance_threshold_used=distance_threshold,
            lowe_ratio_threshold=ratio_threshold if apply_ratio_test else None
        )

        # Return appropriate result format based on request
        if return_detailed_result:
            return detailed_result
        else:
            return similarity_ratio

    def _record_method_performance(self, method_name: str, metrics: Dict[str, Any]) -> None:
        """
        Record method performance metrics for analysis and optimization.

        Args:
            method_name: Name of method being monitored
            metrics: Performance metrics dictionary
        """
        if hasattr(self, '_method_performance_history') and hasattr(self, '_performance_lock'):
            with self._performance_lock:
                # setdefault keeps this safe for both plain dicts and defaultdicts
                history = self._method_performance_history.setdefault(method_name, [])
                history.append(metrics)
                # Once the history exceeds 100 entries, keep only the 50 most recent
                if len(history) > 100:
                    self._method_performance_history[method_name] = history[-50:]

    def histogram_correlation(
        self,
        image1: Union[str, Path, np.ndarray],
        image2: Union[str, Path, np.ndarray],
        bins: Union[int, Tuple[int, int]] = (50, 60),
        metric: Literal["correlation", "chi-square", "intersection", "bhattacharyya"] = "correlation",
        mask1: Optional[np.ndarray] = None,
        mask2: Optional[np.ndarray] = None,
        preserve_aspect: bool = True,
        resize_size: Tuple[int, int] = (256, 256),
        on_zero_histogram: Literal["error", "zero", "nan"] = "error",
        color_space: Literal["HSV", "RGB", "LAB"] = "HSV",
        statistical_analysis: bool = True,
        performance_monitoring: bool = True,
        adaptive_binning: bool = False,
        entropy_analysis: bool = True
    ) -> Union[float, Dict[str, Any]]:
        """
        Compute histogram correlation with comprehensive statistical analysis and optimization.

        Implements rigorous color histogram comparison with mathematical validation,
        adaptive binning strategies, entropy analysis, and enterprise-grade performance
        monitoring for quantitative image similarity assessment.

        Mathematical Foundation:
        1. Color Space Transformation: I' = transform(I, color_space)
        2. Histogram Computation: H_c(i) = Σₓᵧ 𝟙{bin(I'_c(x,y)) = i} for channel c
        3. Normalization: Ĥ_c = H_c / Σᵢ H_c[i] (L1 normalization)
        4. Statistical Metrics:
          - Pearson Correlation: ρ = Σᵢ(h₁[i]-μ₁)(h₂[i]-μ₂) / √(Σᵢ(h₁[i]-μ₁)² · Σᵢ(h₂[i]-μ₂)²)
          - Chi-Square Distance: χ² = Σᵢ (h₁[i]-h₂[i])² / h₁[i]
          - Intersection: I = Σᵢ min(h₁[i], h₂[i])
          - Bhattacharyya (Hellinger form used by OpenCV): d_B = √(1 - Σᵢ √(h₁[i]h₂[i]) / √(Σᵢ h₁[i] · Σᵢ h₂[i]))
        5. Entropy Analysis: H(X) = -Σᵢ p(i)log₂(p(i)) for information content

        Args:
            image1: First image as path or numpy array
            image2: Second image as path or numpy array
            bins: Bins per channel (int), or a 2-tuple (H_bins, S_bins) for the first two
                channels; with a 2-tuple the third channel uses 50 bins
            metric: Statistical distance/similarity metric for comparison
            mask1: Optional binary mask for image1 region of interest
            mask2: Optional binary mask for image2 region of interest
            preserve_aspect: Whether to maintain aspect ratio during resize
            resize_size: Target dimensions (width, height) for preprocessing
            on_zero_histogram: Behavior when zero histogram encountered
            color_space: Color space for histogram computation
            statistical_analysis: Whether to perform comprehensive statistical analysis
            performance_monitoring: Whether to track detailed performance metrics
            adaptive_binning: Whether to use adaptive binning based on image characteristics
            entropy_analysis: Whether to compute information-theoretic measures

        Returns:
            Similarity score/distance or comprehensive analysis dictionary

        Raises:
            HistogramError: On computation failures or invalid parameters
            ValueError: On invalid parameter combinations
            ResourceAllocationError: If insufficient memory for processing
        """
        # Record method execution start for comprehensive performance monitoring
        method_start_time = time.perf_counter()
        memory_before = psutil.Process().memory_info().rss

        # Validate metric selection against supported statistical algorithms
        supported_metrics = {"correlation", "chi-square", "intersection", "bhattacharyya"}
        if metric not in supported_metrics:
            raise HistogramError(f"Unsupported metric: {metric}. Supported: {supported_metrics}")

        # Map metric names to OpenCV constants with mathematical interpretation
        metric_constants = {
            "correlation": cv2.HISTCMP_CORREL,      # Pearson correlation: higher = more similar
            "chi-square": cv2.HISTCMP_CHISQR,      # Chi-square distance: lower = more similar
            "intersection": cv2.HISTCMP_INTERSECT, # Histogram intersection: higher = more similar
            "bhattacharyya": cv2.HISTCMP_BHATTACHARYYA  # Bhattacharyya distance: lower = more similar
        }

        # Validate and normalize bins parameter with mathematical constraints
        if isinstance(bins, int):
            if bins <= 0 or bins > 256:
                raise ValueError(f"bins must be in range [1,256], got {bins}")
            # Use same number of bins for all channels
            hist_bins = [bins] * 3
            bins_description = f"uniform_{bins}"
        elif isinstance(bins, (tuple, list)) and len(bins) == 2:
            if any(b <= 0 or b > 256 for b in bins):
                raise ValueError(f"All bin counts must be in [1,256], got {bins}")
            # Use specified bins for H and S channels, default for third channel
            hist_bins = [bins[0], bins[1], 50]
            bins_description = f"custom_{bins[0]}x{bins[1]}"
        else:
            raise ValueError(f"bins must be int or 2-tuple, got {type(bins)}")

        # Validate color space selection with supported transformations
        supported_color_spaces = {"HSV", "RGB", "LAB"}
        if color_space not in supported_color_spaces:
            raise ValueError(f"Unsupported color_space: {color_space}")

        # Initialize comprehensive performance tracking with detailed metrics
        operation_metrics = {
            'method_name': 'histogram_correlation',
            'start_time': datetime.datetime.now(datetime.timezone.utc).isoformat(),
            'parameters': {
                'bins': bins,
                'metric': metric,
                'color_space': color_space,
                'preserve_aspect': preserve_aspect,
                'resize_size': resize_size,
                'adaptive_binning': adaptive_binning,
                'entropy_analysis': entropy_analysis
            },
            'processing_stages': {}
        }

        # Helper function for comprehensive image loading and preprocessing
        def _load_and_prepare_image_for_histogram(
            img_input: Union[str, Path, np.ndarray],
            mask: Optional[np.ndarray],
            image_label: str
        ) -> Tuple[np.ndarray, Optional[np.ndarray], Dict[str, Any]]:
            """
            Load image and prepare for histogram computation with optimization.

            Args:
                img_input: Image input in supported format
                mask: Optional mask for region of interest
                image_label: Label for error reporting and performance tracking

            Returns:
                Tuple of (processed_array, processed_mask, processing_metadata)
            """
            # Initialize processing metadata for comprehensive analysis
            processing_metadata = {
                'input_type': type(img_input).__name__,
                'transformations_applied': [],
                'optimization_applied': [],
                'validation_performed': True
            }

            # Handle path-based image loading with enhanced validation
            if isinstance(img_input, (str, Path)):
                # Validate image path with comprehensive security analysis
                validated_path = self._validate_image_path(
                    img_input,
                    self._allow_symlinks,
                    perform_content_validation=True,
                    max_file_size_mb=200.0  # Generous limit for histogram analysis
                )

                # Load image using OpenCV in BGR format for consistent processing
                image_array = cv2.imread(str(validated_path), cv2.IMREAD_COLOR)
                if image_array is None:
                    raise HistogramError(f"Failed to load {image_label}: {validated_path}")

                # Record original image properties for metadata
                processing_metadata.update({
                    'file_path': str(validated_path),
                    'original_size': (image_array.shape[1], image_array.shape[0]),
                    'original_channels': image_array.shape[2] if image_array.ndim == 3 else 1
                })
                processing_metadata['transformations_applied'].append('file_loading')

            # Handle numpy array input with comprehensive validation
            elif isinstance(img_input, np.ndarray):
                # Validate array dimensions for image data compatibility
                if img_input.ndim not in [2, 3]:
                    raise HistogramError(f"Invalid {image_label} array dimensions: {img_input.ndim}")

                # Copy input array to prevent mutation of original data
                image_array = img_input.copy()

                # Record original array properties
                processing_metadata.update({
                    'original_size': (image_array.shape[1], image_array.shape[0]),
                    'original_channels': image_array.shape[2] if image_array.ndim == 3 else 1,
                    'array_dtype': str(image_array.dtype),
                    'value_range': (float(image_array.min()), float(image_array.max()))
                })

                # Ensure 3-channel color image for consistent histogram processing
                if image_array.ndim == 2:
                    # Convert grayscale to 3-channel BGR
                    image_array = cv2.cvtColor(image_array, cv2.COLOR_GRAY2BGR)
                    processing_metadata['transformations_applied'].append('grayscale_to_BGR')
                elif image_array.shape[2] == 4:
                    # Remove alpha channel for OpenCV compatibility
                    image_array = cv2.cvtColor(image_array, cv2.COLOR_BGRA2BGR)
                    processing_metadata['transformations_applied'].append('BGRA_to_BGR')
                elif image_array.shape[2] != 3:
                    raise HistogramError(f"Unsupported channel count: {image_array.shape[2]}")

            else:
                raise HistogramError(f"Unsupported {image_label} type: {type(img_input)}")

            # Apply resizing with aspect ratio preservation for computational efficiency
            def _resize_image_with_optimization(img: np.ndarray) -> np.ndarray:
                """Resize image according to configuration with optimization."""
                current_height, current_width = img.shape[:2]
                target_width, target_height = resize_size

                if preserve_aspect:
                    # Compute scaling factor preserving aspect ratio
                    scale_factor = min(
                        target_width / current_width,
                        target_height / current_height
                    )
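                    # Worked example (illustrative values): a 1024x512 image with a 256x256
                    # target gives scale_factor = min(256/1024, 256/512) = min(0.25, 0.5) = 0.25,
                    # so the resized image is 256x128 and fits inside the target box.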
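                    # Compute new dimensions maintaining aspect ratio; round (rather than
                    # truncate) and keep each side at least 1 pixel so cv2.resize never
                    # receives a zero-sized target
                    new_width = max(1, int(round(current_width * scale_factor)))
                    new_height = max(1, int(round(current_height * scale_factor)))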

                    # Apply high-quality resizing using area interpolation for downscaling
                    resized_img = cv2.resize(img, (new_width, new_height), interpolation=cv2.INTER_AREA)
                    processing_metadata['optimization_applied'].append('aspect_preserving_resize')
                else:
                    # Direct resize to target dimensions
                    resized_img = cv2.resize(img, (target_width, target_height), interpolation=cv2.INTER_AREA)
                    processing_metadata['optimization_applied'].append('direct_resize')

                # Record resizing transformation
                processing_metadata.update({
                    'resized_size': (resized_img.shape[1], resized_img.shape[0]),
                    'resize_applied': True
                })
                processing_metadata['transformations_applied'].append('image_resizing')

                return resized_img

            # Apply resizing to image
            resized_image = _resize_image_with_optimization(image_array)

            # Process mask if provided with comprehensive validation
            processed_mask = None
            if mask is not None:
                # Validate mask dimensions match original image
                if mask.shape[:2] != image_array.shape[:2]:
                    raise HistogramError(
                        f"Mask shape {mask.shape} incompatible with {image_label} shape {image_array.shape}"
                    )

                # Resize mask to match processed image with nearest neighbor interpolation
                resized_mask = cv2.resize(
                    mask.astype(np.uint8),
                    resized_image.shape[:2][::-1],  # (width, height) format
                    interpolation=cv2.INTER_NEAREST
                )

                # Ensure binary mask values for histogram computation
                processed_mask = (resized_mask > 0).astype(np.uint8) * 255

                # Validate mask has valid regions
                if np.sum(processed_mask) == 0:
                    raise HistogramError(f"Mask for {image_label} contains no valid regions")

                processing_metadata['transformations_applied'].append('mask_processing')
                processing_metadata.update({
                    'mask_provided': True,
                    'mask_coverage_ratio': float(np.sum(processed_mask > 0)) / float(processed_mask.size)
                })
            else:
                processing_metadata['mask_provided'] = False

            return resized_image, processed_mask, processing_metadata

        # Load and prepare both images with comprehensive preprocessing
        preprocessing_start = time.perf_counter()

        try:
            # Process first image with comprehensive preprocessing
            prepared_image1, processed_mask1, metadata1 = _load_and_prepare_image_for_histogram(
                image1, mask1, "image1"
            )
            # Process second image with comprehensive preprocessing
            prepared_image2, processed_mask2, metadata2 = _load_and_prepare_image_for_histogram(
                image2, mask2, "image2"
            )

            # Record preprocessing performance metrics
            preprocessing_time = time.perf_counter() - preprocessing_start
            operation_metrics['processing_stages']['preprocessing'] = {
                'duration_seconds': preprocessing_time,
                'image1_metadata': metadata1,
                'image2_metadata': metadata2
            }

        except Exception as e:
            # Enhanced error context for preprocessing failures
            operation_metrics['error'] = {
                'stage': 'preprocessing',
                'error_type': type(e).__name__,
                'error_message': str(e)
            }
            raise HistogramError(f"Image preprocessing failed: {e}") from e

        # Apply color space transformation based on configuration
        color_conversion_start = time.perf_counter()

        def _convert_color_space_with_validation(img: np.ndarray, label: str) -> np.ndarray:
            """Convert image to specified color space with validation."""
            try:
                if color_space == "HSV":
                    # Convert BGR to HSV to separate hue and saturation from brightness
                    # HSV ranges: H[0,179], S[0,255], V[0,255] in OpenCV
                    converted_img = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
                elif color_space == "RGB":
                    # Convert BGR to RGB for standard RGB analysis
                    converted_img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
                elif color_space == "LAB":
                    # Convert BGR to CIELAB for perceptually uniform color space
                    # LAB ranges: L[0,100], A[-127,127], B[-127,127], but for 8-bit images
                    # OpenCV rescales all three channels to [0,255]
                    converted_img = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)
                else:
                    raise HistogramError(f"Unsupported color space: {color_space}")

                # Validate conversion success
                if converted_img is None or converted_img.shape != img.shape:
                    raise HistogramError(f"Color space conversion failed for {label}")

                return converted_img

            except cv2.error as e:
                raise HistogramError(f"OpenCV color conversion failed for {label}: {e}") from e

        try:
            # Transform both images to target color space with validation
            color_image1 = _convert_color_space_with_validation(prepared_image1, "image1")
            color_image2 = _convert_color_space_with_validation(prepared_image2, "image2")

            # Record color conversion performance
            color_conversion_time = time.perf_counter() - color_conversion_start
            operation_metrics['processing_stages']['color_conversion'] = {
                'duration_seconds': color_conversion_time,
                'color_space': color_space,
                'conversion_successful': True
            }

        except Exception as e:
            # Enhanced error reporting for color conversion failures
            operation_metrics['error'] = {
                'stage': 'color_conversion',
                'error_type': type(e).__name__,
                'error_message': str(e)
            }
            raise HistogramError(f"Color space conversion failed: {e}") from e

        # Define histogram computation parameters based on color space
        if color_space == "HSV":
            # HSV ranges: H[0,179], S[0,255], V[0,255] in OpenCV
            # Only H and S are histogrammed, so supply exactly one (low, high) pair per channel
            hist_ranges = [0, 180, 0, 256]
            channels = [0, 1]  # Use H and S channels for robustness to illumination
            channel_names = ["Hue", "Saturation"]
        elif color_space in ["RGB", "LAB"]:
            # RGB/LAB ranges: [0,255] for all channels in OpenCV representation
            hist_ranges = [0, 256, 0, 256, 0, 256]
            channels = [0, 1, 2]  # Use all three channels
            channel_names = ["Channel0", "Channel1", "Channel2"]

        # Apply adaptive binning if requested based on image characteristics
        if adaptive_binning:
            adaptive_binning_start = time.perf_counter()

            # Analyze image characteristics for optimal binning
            def _compute_adaptive_bins(img: np.ndarray) -> List[int]:
                """Compute adaptive bin counts based on image characteristics."""
                adaptive_bins = []

                for channel_idx in channels:
                    # Extract channel data
                    channel_data = img[:, :, channel_idx].flatten()

                    # Compute channel statistics for adaptive binning
                    unique_values = len(np.unique(channel_data))
                    data_range = np.ptp(channel_data)  # Peak-to-peak range

                    # Use Freedman-Diaconis rule for optimal bin width
                    q75, q25 = np.percentile(channel_data, [75, 25])
                    iqr = q75 - q25

                    if iqr > 0:
                        # Freedman-Diaconis bin width: 2 * IQR / n^(1/3)
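                        # Worked example (illustrative values): for a 256x256 channel
                        # (n = 65536, n^(1/3) ≈ 40.3) with IQR = 64 and data_range = 255,
                        # bin_width ≈ 2 * 64 / 40.3 ≈ 3.17, giving int(255 / 3.17) = 80 bins.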
                        bin_width = 2 * iqr / (len(channel_data) ** (1/3))
                        optimal_bins = max(int(data_range / bin_width), 10)
                    else:
                        # Fallback when the IQR is zero (e.g. constant or near-constant channels)
                        optimal_bins = min(unique_values, 32)

                    # Clamp to reasonable range
                    adaptive_bins.append(min(max(optimal_bins, 8), 128))

                return adaptive_bins

            # Compute adaptive bins for both images and take average
            adaptive_bins1 = _compute_adaptive_bins(color_image1)
            adaptive_bins2 = _compute_adaptive_bins(color_image2)

            # Use average of adaptive bins for consistency
            hist_bins = [int((b1 + b2) / 2) for b1, b2 in zip(adaptive_bins1, adaptive_bins2)]

            # Record adaptive binning performance
            adaptive_binning_time = time.perf_counter() - adaptive_binning_start
            operation_metrics['processing_stages']['adaptive_binning'] = {
                'duration_seconds': adaptive_binning_time,
                'original_bins': bins,
                'adaptive_bins': hist_bins,
                'bins_image1': adaptive_bins1,
                'bins_image2': adaptive_bins2
            }

        # Compute normalized histograms for both images with comprehensive error handling
        histogram_computation_start = time.perf_counter()

        try:
            # Image 1: Compute multi-dimensional histogram
            histogram1 = cv2.calcHist(
                [color_image1],                    # Source image list
                channels,                          # Channel indices for computation
                processed_mask1,                   # Mask (None = full image)
                hist_bins[:len(channels)],         # Bins per channel
                hist_ranges                        # Value ranges for each channel
            )

            # Image 2: Compute multi-dimensional histogram
            histogram2 = cv2.calcHist(
                [color_image2],                    # Source image list
                channels,                          # Channel indices for computation
                processed_mask2,                   # Mask (None = full image)
                hist_bins[:len(channels)],         # Bins per channel
                hist_ranges                        # Value ranges for each channel
            )

            # Validate histogram computation success
            if histogram1 is None or histogram2 is None:
                raise HistogramError("Histogram computation returned None")

            if histogram1.size == 0 or histogram2.size == 0:
                raise HistogramError("Histogram computation produced empty histograms")

            # Record histogram computation performance
            histogram_computation_time = time.perf_counter() - histogram_computation_start
            operation_metrics['processing_stages']['histogram_computation'] = {
                'duration_seconds': histogram_computation_time,
                'histogram1_shape': histogram1.shape,
                'histogram2_shape': histogram2.shape,
                'total_bins': int(np.prod(hist_bins[:len(channels)])),
                'channels_used': len(channels)
            }

        except Exception as e:
            # Enhanced error reporting for histogram computation failures
            operation_metrics['error'] = {
                'stage': 'histogram_computation',
                'error_type': type(e).__name__,
                'error_message': str(e)
            }
            raise HistogramError(f"Histogram computation failed: {e}") from e

        # Handle zero histogram cases according to policy with detailed analysis
        hist1_sum = float(np.sum(histogram1))
        hist2_sum = float(np.sum(histogram2))

        # Record histogram statistics for analysis
        histogram_stats = {
            'histogram1_sum': hist1_sum,
            'histogram2_sum': hist2_sum,
            'histogram1_nonzero_bins': int(np.count_nonzero(histogram1)),
            'histogram2_nonzero_bins': int(np.count_nonzero(histogram2)),
            'histogram1_max': float(np.max(histogram1)),
            'histogram2_max': float(np.max(histogram2))
        }

        if hist1_sum == 0 or hist2_sum == 0:
            # Handle zero histogram cases with comprehensive error context
            zero_histogram_context = {
                'histogram1_zero': hist1_sum == 0,
                'histogram2_zero': hist2_sum == 0,
                'mask1_coverage': float(np.sum(processed_mask1 > 0)) / float(processed_mask1.size) if processed_mask1 is not None else 1.0,
                'mask2_coverage': float(np.sum(processed_mask2 > 0)) / float(processed_mask2.size) if processed_mask2 is not None else 1.0
            }

            if on_zero_histogram == "error":
                raise HistogramError(
                    "Zero histogram detected - no data to compare",
                    algorithm_context=zero_histogram_context,
                    forensic_metadata=ForensicMetadata(
                        operation_name="histogram_zero_detection",
                        algorithm_parameters=histogram_stats
                    )
                )
            elif on_zero_histogram == "zero":
                return 0.0 if not statistical_analysis else {
                    'similarity_score': 0.0,
                    'zero_histogram_detected': True,
                    'zero_histogram_context': zero_histogram_context
                }
            elif on_zero_histogram == "nan":
                return float('nan') if not statistical_analysis else {
                    'similarity_score': float('nan'),
                    'zero_histogram_detected': True,
                    'zero_histogram_context': zero_histogram_context
                }

        # Apply L1 normalization to histograms for statistical comparison
        # Mathematical formula: Ĥ[i] = H[i] / Σⱼ H[j]
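        # Worked example (illustrative values): H = [2, 6, 2] has Σⱼ H[j] = 10,
        # so Ĥ = [0.2, 0.6, 0.2], which sums to 1 and can be read as a probability distribution.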
        normalization_start = time.perf_counter()

        try:
            # Normalize histogram1 to probability distribution
            cv2.normalize(histogram1, histogram1, alpha=1.0, beta=0.0, norm_type=cv2.NORM_L1)
            # Normalize histogram2 to probability distribution
            cv2.normalize(histogram2, histogram2, alpha=1.0, beta=0.0, norm_type=cv2.NORM_L1)

            # Validate normalization success
            norm1_sum = float(np.sum(histogram1))
            norm2_sum = float(np.sum(histogram2))

            if not (0.99 <= norm1_sum <= 1.01) or not (0.99 <= norm2_sum <= 1.01):
                raise HistogramError(f"Normalization failed: sums = {norm1_sum}, {norm2_sum}")

            # Record normalization performance
            normalization_time = time.perf_counter() - normalization_start
            operation_metrics['processing_stages']['normalization'] = {
                'duration_seconds': normalization_time,
                'normalized_sum1': norm1_sum,
                'normalized_sum2': norm2_sum,
                'normalization_successful': True
            }

        except Exception as e:
            # Enhanced error reporting for normalization failures
            operation_metrics['error'] = {
                'stage': 'normalization',
                'error_type': type(e).__name__,
                'error_message': str(e)
            }
            raise HistogramError(f"Histogram normalization failed: {e}") from e

        # Compute entropy analysis if requested for information-theoretic measures
        entropy_metrics = {}
        if entropy_analysis:
            entropy_computation_start = time.perf_counter()

            try:
                # Compute Shannon entropy for both histograms
                # Mathematical formula: H(X) = -Σᵢ p(i)log₂(p(i))
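                # Worked example (illustrative values): p = [0.5, 0.25, 0.25] gives
                # H(X) = 0.5·1 + 0.25·2 + 0.25·2 = 1.5 bits.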
                def _compute_shannon_entropy(hist: np.ndarray) -> float:
                    """Compute Shannon entropy of normalized histogram."""
                    # Flatten histogram and remove zero entries
                    prob_dist = hist.flatten()
                    prob_dist = prob_dist[prob_dist > 0]

                    if len(prob_dist) == 0:
                        return 0.0

                    # Compute Shannon entropy using base-2 logarithm
                    entropy = -np.sum(prob_dist * np.log2(prob_dist))
                    return float(entropy)

                # Compute entropies for both histograms
                entropy1 = _compute_shannon_entropy(histogram1)
                entropy2 = _compute_shannon_entropy(histogram2)

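                # Compute joint entropy from the outer product of the two marginals.
                # NOTE: this models the histograms as independent, so H(X,Y) = H(X) + H(Y)
                # and the mutual information below is ~0 by construction (up to
                # floating-point error); a true joint histogram would need pixel-aligned images.
                # The outer product also holds histogram1.size * histogram2.size entries,
                # which can be very large for multi-dimensional histograms.
                joint_hist = np.outer(histogram1.flatten(), histogram2.flatten())
                joint_hist = joint_hist / np.sum(joint_hist)  # Normalize joint distribution
                joint_entropy = _compute_shannon_entropy(joint_hist)

                # Compute mutual information: I(X;Y) = H(X) + H(Y) - H(X,Y)
                mutual_information = entropy1 + entropy2 - joint_entropy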

                # Compute normalized mutual information for scale invariance
                if entropy1 > 0 and entropy2 > 0:
                    normalized_mutual_info = mutual_information / np.sqrt(entropy1 * entropy2)
                else:
                    normalized_mutual_info = 0.0

                # Store entropy analysis results
                entropy_metrics = {
                    'entropy_histogram1': entropy1,
                    'entropy_histogram2': entropy2,
                    'joint_entropy': joint_entropy,
                    'mutual_information': mutual_information,
                    'normalized_mutual_information': normalized_mutual_info,
                    'entropy_difference': abs(entropy1 - entropy2),
                    'entropy_ratio': entropy2 / entropy1 if entropy1 > 0 else float('inf')
                }

                # Record entropy computation performance
                entropy_computation_time = time.perf_counter() - entropy_computation_start
                operation_metrics['processing_stages']['entropy_analysis'] = {
                    'duration_seconds': entropy_computation_time,
                    'entropy_metrics': entropy_metrics
                }

            except Exception as e:
                # Handle entropy computation failures gracefully
                entropy_metrics = {'error': str(e)}
                operation_metrics['processing_stages']['entropy_analysis'] = {
                    'error': str(e)
                }

        # Compute similarity/distance using specified metric with comprehensive validation
        comparison_start = time.perf_counter()

        try:
            # Apply OpenCV histogram comparison with selected metric
            comparison_result = cv2.compareHist(
                histogram1,
                histogram2,
                metric_constants[metric]
            )

            # Validate comparison result is finite and within expected range
            if not np.isfinite(comparison_result):
                raise HistogramError(f"Histogram comparison produced invalid result: {comparison_result}")

            # Apply metric-specific validation, allowing a small tolerance for
            # floating-point rounding in the float32 histograms
            range_tolerance = 1e-4
            if metric == "correlation":
                # Correlation should be in [-1, 1] range
                if not -1.0 - range_tolerance <= comparison_result <= 1.0 + range_tolerance:
                    raise HistogramError(f"Correlation result {comparison_result} outside [-1,1] range")
            elif metric == "intersection":
                # Intersection should be in [0, 1] range for normalized histograms
                if not -range_tolerance <= comparison_result <= 1.0 + range_tolerance:
                    raise HistogramError(f"Intersection result {comparison_result} outside [0,1] range")

            # Record comparison performance
            comparison_time = time.perf_counter() - comparison_start
            operation_metrics['processing_stages']['comparison'] = {
                'duration_seconds': comparison_time,
                'metric_used': metric,
                'comparison_result': float(comparison_result),
                'result_valid': True
            }

        except Exception as e:
            # Enhanced error reporting for comparison failures
            operation_metrics['error'] = {
                'stage': 'comparison',
                'error_type': type(e).__name__,
                'error_message': str(e)
            }
            raise HistogramError(f"Histogram comparison failed for {metric}: {e}") from e

        # Compute statistical properties and confidence intervals if analysis requested
        statistical_properties = None
        if statistical_analysis:
            statistical_analysis_start = time.perf_counter()

            try:
                # Compute histogram difference for statistical analysis
                hist_diff = histogram1 - histogram2
                hist_diff_flat = hist_diff.flatten()

                # Compute comprehensive statistical properties of histogram differences
                statistical_properties = StatisticalProperties(
                    sample_size=len(hist_diff_flat),
                    mean=float(np.mean(hist_diff_flat)),
                    median=float(np.median(hist_diff_flat)),
                    variance=float(np.var(hist_diff_flat)),
                    standard_deviation=float(np.std(hist_diff_flat)),
                    standard_error=float(np.std(hist_diff_flat) / np.sqrt(len(hist_diff_flat))),
                    skewness=float(stats.skew(hist_diff_flat)),
                    kurtosis=float(stats.kurtosis(hist_diff_flat)),
                    confidence_level=0.95,
                    distribution_type="histogram_difference"
                )

                # Compute confidence interval for the comparison metric
                bootstrap_results = []
                if len(hist_diff_flat) > 1:
                    # Use bootstrap method for robust confidence interval estimation
                    n_bootstrap = 1000

                    # Flatten once outside the loop instead of on every iteration
                    hist1_flat = histogram1.flatten()
                    hist2_flat = histogram2.flatten()

                    for _ in range(n_bootstrap):
                        # Resample histogram bins with replacement
                        bootstrap_indices = np.random.choice(len(hist_diff_flat), size=len(hist_diff_flat), replace=True)

                        # Reconstruct histograms from bootstrap sample
                        bootstrap_hist1 = hist1_flat[bootstrap_indices].reshape(histogram1.shape)
                        bootstrap_hist2 = hist2_flat[bootstrap_indices].reshape(histogram2.shape)

                        # Compute metric for bootstrap sample
                        try:
                            bootstrap_result = cv2.compareHist(bootstrap_hist1, bootstrap_hist2, metric_constants[metric])
                            if np.isfinite(bootstrap_result):
                                bootstrap_results.append(bootstrap_result)
                        except cv2.error:
                            # Skip resamples that OpenCV cannot compare
                            continue

                    # Compute confidence interval from bootstrap distribution
                    if len(bootstrap_results) > 10:
                        ci_lower = float(np.percentile(bootstrap_results, 2.5))
                        ci_upper = float(np.percentile(bootstrap_results, 97.5))
                        statistical_properties.confidence_interval_lower = ci_lower
                        statistical_properties.confidence_interval_upper = ci_upper

                # Record statistical analysis performance
                statistical_analysis_time = time.perf_counter() - statistical_analysis_start
                operation_metrics['processing_stages']['statistical_analysis'] = {
                    'duration_seconds': statistical_analysis_time,
                    'bootstrap_samples': len(bootstrap_results),
                    'confidence_interval_computed': statistical_properties.confidence_interval_lower is not None
                }

            except Exception as e:
                # Handle statistical analysis failures gracefully
                operation_metrics['processing_stages']['statistical_analysis'] = {
                    'error': str(e)
                }

        # Record comprehensive performance metrics
        total_execution_time = time.perf_counter() - method_start_time
        memory_after = psutil.Process().memory_info().rss
        memory_delta = memory_after - memory_before

        operation_metrics.update({
            'total_execution_time_seconds': total_execution_time,
            'memory_delta_bytes': memory_delta,
            'similarity_score': float(comparison_result),
            'histogram_statistics': histogram_stats,
            'result_type': 'successful_comparison'
        })

        # Store performance metrics for analysis if monitoring enabled
        if performance_monitoring:
            self._record_method_performance('histogram_correlation', operation_metrics)

        # Return comprehensive result based on analysis level requested
        if statistical_analysis:
            # Return comprehensive analysis with all computed metrics
            comprehensive_result = {
                'similarity_score': float(comparison_result),
                'metric_used': metric,
                'color_space': color_space,
                'bins_used': hist_bins[:len(channels)],
                'channels_analyzed': channel_names,
                'statistical_properties': statistical_properties,
                'entropy_metrics': entropy_metrics,
                'histogram_statistics': histogram_stats,
                'performance_metrics': operation_metrics,
                'computation_successful': True
            }
            return comprehensive_result
        else:
            # Return simple numerical result
            return float(comparison_result)

    def clip_embedding_similarity(
        self,
        image1: Union[str, Path, Image.Image, torch.Tensor, np.ndarray],
        image2: Union[str, Path, Image.Image, torch.Tensor, np.ndarray],
        use_mixed_precision: bool = False,
        batch_processing: bool = False,
        statistical_analysis: bool = True,
        performance_monitoring: bool = True,
        embedding_analysis: bool = True,
        device_optimization: bool = True
    ) -> Union[float, Dict[str, Any]]:
        """
        Compute CLIP embedding similarity with comprehensive analysis and optimization.

        Implements vision transformer-based semantic similarity with rigorous mathematical
        validation, embedding space analysis, performance optimization, and enterprise-grade
        monitoring for quantitative image similarity assessment.

        Mathematical Foundation:
        1. Vision Transformer Encoding:
          - Patch embedding: x_patch = Flatten(x_img) ∈ ℝ^(P²·C)
          - Position encoding: z₀ = [x_class; E·x_patch] + E_pos
          - Multi-head attention: Attention(Q,K,V) = softmax(QK^T/√d_k)V
          - Layer normalization: LN(x) = γ(x-μ)/σ + β

        2. Embedding Space Properties:
          - Joint embedding: f_v: Images → ℝ^d, f_t: Text → ℝ^d
          - L2 normalization: ê = e / ||e||₂ where ||e||₂ = √(Σᵢ eᵢ²)
          - Cosine similarity: sim(ê₁,ê₂) = ê₁·ê₂ = Σᵢ ê₁[i]ê₂[i]

        3. Statistical Properties:
          - Embedding norm: ||ê||₂ = 1 after normalization (up to floating-point error)
          - Angular distance: d_θ = arccos(sim(ê₁,ê₂)) ∈ [0,π]
          - Euclidean distance: d_E = ||ê₁ - ê₂||₂ = √(2(1 - sim(ê₁,ê₂)))
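          - Derivation of d_E (unit-norm ê₁, ê₂): ||ê₁ - ê₂||₂² = ||ê₁||₂² + ||ê₂||₂² - 2·ê₁·ê₂
            = 2 - 2·sim(ê₁,ê₂); e.g. sim = 0.5 gives d_E = √(2·0.5) = 1 and d_θ = arccos(0.5) = π/3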

        Args:
            image1: First image in supported format
            image2: Second image in supported format
            use_mixed_precision: Whether to use automatic mixed precision for efficiency
            batch_processing: Whether to process images in batch for optimization
            statistical_analysis: Whether to perform comprehensive embedding analysis
            performance_monitoring: Whether to track detailed performance metrics
            embedding_analysis: Whether to analyze embedding space properties
            device_optimization: Whether to apply device-specific optimizations

        Returns:
            Cosine similarity in [-1, 1], or a comprehensive analysis dictionary

        Raises:
            ModelInferenceError: If CLIP inference fails with detailed context
            ValueError: If inputs have incompatible formats or shapes
            ResourceAllocationError: If insufficient GPU/CPU resources
        """
        # Record method execution start for comprehensive performance monitoring
        method_start_time = time.perf_counter()
        memory_before = psutil.Process().memory_info().rss
        gpu_memory_before = torch.cuda.memory_allocated() if torch.cuda.is_available() else 0

        # Ensure CLIP model is loaded using thread-safe lazy initialization
        self._load_clip_with_comprehensive_monitoring()

        # Initialize comprehensive performance tracking with detailed metrics
        operation_metrics = {
            'method_name': 'clip_embedding_similarity',
            'start_time': datetime.datetime.now(datetime.timezone.utc).isoformat(),
            'parameters': {
                'use_mixed_precision': use_mixed_precision,
                'batch_processing': batch_processing,
                'statistical_analysis': statistical_analysis,
                'embedding_analysis': embedding_analysis,
                'device_optimization': device_optimization
            },
            'processing_stages': {},
            'device_info': {
                'target_device': self.device,
                'cuda_available': torch.cuda.is_available(),
                'device_count': torch.cuda.device_count() if torch.cuda.is_available() else 0
            }
        }

        # Helper function for comprehensive input preprocessing with validation
        def _prepare_input_tensor_with_validation(
            img_input: Union[str, Path, Image.Image, torch.Tensor, np.ndarray],
            input_label: str
        ) -> Tuple[torch.Tensor, Dict[str, Any]]:
            """
            Convert various input types to preprocessed CLIP tensor with comprehensive validation.

            Args:
                img_input: Image input in supported format
                input_label: Label for error reporting and performance tracking

            Returns:
                Tuple of (preprocessed_tensor, processing_metadata)
            """
            # Initialize processing metadata for comprehensive analysis
            processing_metadata = {
                'input_type': type(img_input).__name__,
                'transformations_applied': [],
                'validation_performed': True,
                'tensor_properties': {}
            }

            # Handle pre-computed torch.Tensor input with comprehensive validation
            if isinstance(img_input, torch.Tensor):
                # Clone tensor to avoid mutation of original data
                input_tensor = img_input.clone().detach()

                # Validate tensor shape for CLIP input requirements
                if input_tensor.ndim == 3:
                    # Add batch dimension: (C,H,W) -> (1,C,H,W)
                    input_tensor = input_tensor.unsqueeze(0)
                    processing_metadata['transformations_applied'].append('batch_dimension_added')
                elif input_tensor.ndim == 4:
                    # Already has batch dimension: (B,C,H,W)
                    processing_metadata['transformations_applied'].append('batch_dimension_present')
                else:
                    raise ValueError(f"Invalid {input_label} tensor dimensions: {input_tensor.ndim}, expected 3 or 4")

                # Validate channel count for RGB images
                if input_tensor.shape[1] != 3:
                    raise ValueError(f"Expected 3 channels for {input_label}, got {input_tensor.shape[1]}")

                # Heuristic range check: CLIP-normalized tensors fall roughly within [-3, 3]
                tensor_min, tensor_max = input_tensor.min().item(), input_tensor.max().item()
                if not (-3.0 <= tensor_min <= 3.0 and -3.0 <= tensor_max <= 3.0):
                    # Values outside that range suggest raw 0-255 pixel data: rescale to [0, 1].
                    # This only rescales; CLIP's per-channel mean/std normalization and resizing
                    # are not applied here, so callers should pass already-preprocessed tensors.
                    input_tensor = input_tensor.float() / 255.0 if tensor_max > 1.0 else input_tensor.float()
                    processing_metadata['transformations_applied'].append('value_normalization')

                # Record tensor properties
                processing_metadata['tensor_properties'] = {
                    'shape': list(input_tensor.shape),
                    'dtype': str(input_tensor.dtype),
                    'device': str(input_tensor.device),
                    'value_range': (tensor_min, tensor_max)
                }

            # Handle numpy array input with comprehensive preprocessing
            elif isinstance(img_input, np.ndarray):
                # Validate array dimensions for image data compatibility
                if img_input.ndim not in [2, 3]:
                    raise ValueError(f"Invalid {input_label} array dimensions: {img_input.ndim}")

                # Record original array properties
                processing_metadata['tensor_properties'] = {
                    'original_shape': list(img_input.shape),
                    'original_dtype': str(img_input.dtype),
                    'value_range': (float(img_input.min()), float(img_input.max()))
                }

                # Handle different array formats and convert to PIL for CLIP preprocessing
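                if img_input.ndim == 2:
                    # Grayscale array - convert to uint8 first so float data in [0, 1] is not
                    # truncated to black, then expand to an RGB PIL Image
                    if img_input.dtype != np.uint8:
                        if img_input.max() <= 1.0:
                            img_input = np.clip(img_input * 255, 0, 255).astype(np.uint8)
                        else:
                            img_input = np.clip(img_input, 0, 255).astype(np.uint8)
                        processing_metadata['transformations_applied'].append('dtype_conversion')
                    pil_image = Image.fromarray(img_input).convert('RGB')
                    processing_metadata['transformations_applied'].append('grayscale_to_RGB')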
                elif img_input.ndim == 3:
                    # Color array - handle different channel orders and formats
                    if img_input.shape[2] == 3:
                        # Assume RGB format for PIL compatibility
                        if img_input.dtype != np.uint8:
                            # Convert to uint8 if necessary; clip first so out-of-range
                            # values saturate instead of wrapping around
                            if img_input.max() <= 1.0:
                                img_input = np.clip(img_input * 255, 0, 255).astype(np.uint8)
                            else:
                                img_input = np.clip(img_input, 0, 255).astype(np.uint8)
                            processing_metadata['transformations_applied'].append('dtype_conversion')

                        pil_image = Image.fromarray(img_input).convert('RGB')
                        processing_metadata['transformations_applied'].append('array_to_PIL_RGB')
                    elif img_input.shape[2] == 4:
                        # RGBA - remove alpha channel and convert
                        rgb_array = img_input[:, :, :3]
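                        if rgb_array.dtype != np.uint8:
                            # Clip before casting so out-of-range values saturate instead of wrapping
                            if rgb_array.max() <= 1.0:
                                rgb_array = np.clip(rgb_array * 255, 0, 255).astype(np.uint8)
                            else:
                                rgb_array = np.clip(rgb_array, 0, 255).astype(np.uint8)
                            processing_metadata['transformations_applied'].append('dtype_conversion')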
                        pil_image = Image.fromarray(rgb_array).convert('RGB')
                        processing_metadata['transformations_applied'].append('RGBA_to_RGB')
                    else:
                        raise ValueError(f"Unsupported {input_label} channel count: {img_input.shape[2]}")

                # Apply CLIP preprocessing pipeline to PIL image
                input_tensor = self.clip_preprocess(pil_image)
                processing_metadata['transformations_applied'].append('CLIP_preprocessing')

                # Add batch dimension for inference
                input_tensor = input_tensor.unsqueeze(0)
                processing_metadata['transformations_applied'].append('batch_dimension_added')

            # Handle PIL Image input with validation and preprocessing
            elif isinstance(img_input, Image.Image):
                # Record original PIL image properties
                processing_metadata['tensor_properties'] = {
                    'original_format': img_input.format,
                    'original_mode': img_input.mode,
                    'original_size': img_input.size
                }

                # Ensure RGB format for CLIP compatibility
                rgb_image = img_input.convert('RGB')
                processing_metadata['transformations_applied'].append('PIL_to_RGB')

                # Apply CLIP preprocessing (resize, normalize, tensorize)
                input_tensor = self.clip_preprocess(rgb_image)
                processing_metadata['transformations_applied'].append('CLIP_preprocessing')

                # Add batch dimension for model input
                input_tensor = input_tensor.unsqueeze(0)
                processing_metadata['transformations_applied'].append('batch_dimension_added')

            # Handle file path input with comprehensive validation and loading
            elif isinstance(img_input, (str, Path)):
                # Validate image file accessibility using enhanced validation
                validated_path = self._validate_image_path(
                    img_input,
                    self._allow_symlinks,
                    perform_content_validation=True,
                    max_file_size_mb=50.0  # Reasonable limit for CLIP processing
                )

                # Record path information for metadata
                processing_metadata['tensor_properties'] = {
                    'file_path': str(validated_path),
                    'file_size_bytes': validated_path.stat().st_size
                }

                # Load image using PIL with comprehensive error handling
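                try:
                    # Open in a context manager so the file handle is released promptly
                    with Image.open(validated_path) as opened_image:
                        # Capture format and mode before convert(): the converted copy
                        # always reports format=None and mode='RGB'
                        loaded_format = opened_image.format
                        loaded_mode = opened_image.mode
                        pil_image = opened_image.convert('RGB')
                    processing_metadata['transformations_applied'].append('file_to_PIL_RGB')

                    # Record loaded image properties
                    processing_metadata['tensor_properties'].update({
                        'loaded_format': loaded_format,
                        'loaded_mode': loaded_mode,
                        'loaded_size': pil_image.size
                    })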

                except (IOError, OSError) as e:
                    # Enhanced error reporting with file analysis
                    raise ModelInferenceError(
                        f"Failed to load {input_label} from {validated_path}: {e}",
                        model_name="CLIP",
                        algorithm_context={'file_path': str(validated_path), 'input_label': input_label},
                        forensic_metadata=ForensicMetadata(
                            operation_name="clip_image_loading",
                            algorithm_parameters={'input_label': input_label}
                        )
                    ) from e

                # Apply CLIP preprocessing pipeline
                input_tensor = self.clip_preprocess(pil_image)
                processing_metadata['transformations_applied'].append('CLIP_preprocessing')

                # Add batch dimension
                input_tensor = input_tensor.unsqueeze(0)
                processing_metadata['transformations_applied'].append('batch_dimension_added')

            else:
                # Unsupported input type with comprehensive error context
                raise ValueError(
                    f"Unsupported {input_label} input type: {type(img_input)}. "
                    f"Supported types: str, Path, PIL.Image, torch.Tensor, np.ndarray"
                )

            # Move tensor to appropriate device for inference with optimization
            if device_optimization:
                # Apply device-specific optimizations
                if self.device.startswith('cuda') and torch.cuda.is_available():
                    # Move to GPU; non_blocking=True overlaps the copy with compute only
                    # when the source tensor lives in pinned host memory
                    input_tensor = input_tensor.to(self.device, non_blocking=True)
                    processing_metadata['transformations_applied'].append('GPU_transfer_optimized')
                else:
                    # Move to CPU
                    input_tensor = input_tensor.to(self.device)
                    processing_metadata['transformations_applied'].append('CPU_transfer')
            else:
                # Standard device transfer
                input_tensor = input_tensor.to(self.device)
                processing_metadata['transformations_applied'].append('device_transfer')

            # Record final tensor properties
            processing_metadata['tensor_properties'].update({
                'final_shape': list(input_tensor.shape),
                'final_dtype': str(input_tensor.dtype),
                'final_device': str(input_tensor.device),
                'memory_usage_bytes': input_tensor.numel() * input_tensor.element_size()
            })

            return input_tensor, processing_metadata

        # Prepare both input images as CLIP-compatible tensors with comprehensive validation
        preprocessing_start = time.perf_counter()

        try:
            # Process first image with detailed metadata collection
            tensor1, metadata1 = _prepare_input_tensor_with_validation(image1, "image1")
            # Process second image with detailed metadata collection
            tensor2, metadata2 = _prepare_input_tensor_with_validation(image2, "image2")

            # Record preprocessing performance metrics
            preprocessing_time = time.perf_counter() - preprocessing_start
            operation_metrics['processing_stages']['preprocessing'] = {
                'duration_seconds': preprocessing_time,
                'image1_metadata': metadata1,
                'image2_metadata': metadata2,
                'total_memory_usage_bytes': metadata1['tensor_properties']['memory_usage_bytes'] +
                                          metadata2['tensor_properties']['memory_usage_bytes']
            }

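        except Exception as e:
            # Enhanced error context for preprocessing failures
            operation_metrics['error'] = {
                'stage': 'preprocessing',
                'error_type': type(e).__name__,
                'error_message': str(e)
            }
            # Errors that already carry model and forensic context (e.g. from image
            # loading) are propagated unchanged rather than wrapped a second time
            if isinstance(e, ModelInferenceError):
                raise
            raise ModelInferenceError(f"CLIP input preparation failed: {e}") from e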

        # Perform CLIP inference with comprehensive error handling and optimization
        inference_start = time.perf_counter()

        try:
            # Mixed precision applies only when requested and inference actually runs on CUDA
            mixed_precision_active = (
                use_mixed_precision
                and torch.cuda.is_available()
                and str(self.device).startswith('cuda')
            )
            # torch.autocast replaces the deprecated torch.cuda.amp.autocast; with
            # enabled=False the context is a no-op, so one code path serves both modes
            autocast_device_type = 'cuda' if mixed_precision_active else 'cpu'

            # Disable gradient computation for inference efficiency
            with torch.no_grad(), torch.autocast(
                device_type=autocast_device_type, enabled=mixed_precision_active
            ):
                if batch_processing:
                    # Process both images in a single batch for computational efficiency
                    batch_tensor = torch.cat([tensor1, tensor2], dim=0)
                    batch_embeddings = self.clip_model.encode_image(batch_tensor)
                    embedding1, embedding2 = batch_embeddings[0:1], batch_embeddings[1:2]
                else:
                    # Process images individually
                    embedding1 = self.clip_model.encode_image(tensor1)
                    embedding2 = self.clip_model.encode_image(tensor2)

            # Record which execution path was actually used
            operation_metrics['processing_stages']['inference'] = {
                'batch_processing_used': batch_processing,
                'mixed_precision_used': mixed_precision_active
            }

            # Record inference performance metrics
            inference_time = time.perf_counter() - inference_start
            operation_metrics['processing_stages']['inference'].update({
                'duration_seconds': inference_time,
                'embedding1_shape': list(embedding1.shape),
                'embedding2_shape': list(embedding2.shape),
                'inference_successful': True
            })

        except torch.cuda.OutOfMemoryError as e:
            # Handle GPU memory exhaustion with automatic cleanup and CPU fallback
            operation_metrics['processing_stages']['inference'] = {
                'error_type': 'OutOfMemoryError',
                'error_message': str(e),
                'gpu_memory_info': self._get_gpu_memory_info()
            }

            # Perform aggressive GPU memory cleanup
            if torch.cuda.is_available():
                torch.cuda.empty_cache()
                torch.cuda.synchronize()

            # Attempt CPU fallback if currently using CUDA
            if self.device != "cpu":
                try:
                    # Move model and tensors to CPU for fallback processing
                    self.clip_model = self.clip_model.cpu()
                    tensor1 = tensor1.cpu()
                    tensor2 = tensor2.cpu()
                    self.device = "cpu"

                    # Retry inference on CPU
                    with torch.no_grad():
                        embedding1 = self.clip_model.encode_image(tensor1)
                        embedding2 = self.clip_model.encode_image(tensor2)

                    # Record successful CPU fallback
                    operation_metrics['processing_stages']['inference'].update({
                        'cpu_fallback_successful': True,
                        'original_device_failed': True
                    })

                except Exception as fallback_error:
                    # CPU fallback also failed
                    raise ModelInferenceError(
                        f"CLIP inference failed on both GPU and CPU: GPU OOM: {e}, CPU: {fallback_error}",
                        model_name="CLIP",
                        algorithm_context=operation_metrics['processing_stages']['inference']
                    ) from e
            else:
                # Already on CPU - re-raise OOM error
                raise ModelInferenceError(
                    f"CPU OOM during CLIP inference: {e}",
                    model_name="CLIP",
                    algorithm_context=operation_metrics['processing_stages']['inference']
                ) from e

        except RuntimeError as e:
            # Handle other PyTorch runtime errors with comprehensive context
            operation_metrics['processing_stages']['inference'] = {
                'error_type': 'RuntimeError',
                'error_message': str(e),
                'device_info': operation_metrics['device_info']
            }
            raise ModelInferenceError(f"CLIP inference runtime error: {e}") from e

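        # Promote embeddings to float32 so half-precision outputs (mixed precision or fp16
        # model weights) do not degrade the norm, similarity, and distance arithmetic below
        embedding1 = embedding1.float()
        embedding2 = embedding2.float()

        # Validate embedding shapes for compatibility and mathematical consistency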
        if embedding1.shape != embedding2.shape:
            raise ValueError(
                f"Embedding shape mismatch: {embedding1.shape} vs {embedding2.shape}"
            )

        # Perform comprehensive embedding analysis if requested
        embedding_properties = {}
        if embedding_analysis:
            embedding_analysis_start = time.perf_counter()

            try:
                # Analyze embedding properties before normalization
                embedding_properties = {
                    'embedding_dimension': embedding1.shape[-1],
                    'embedding1_norm_before': float(torch.norm(embedding1).item()),
                    'embedding2_norm_before': float(torch.norm(embedding2).item()),
                    'embedding1_mean': float(torch.mean(embedding1).item()),
                    'embedding2_mean': float(torch.mean(embedding2).item()),
                    'embedding1_std': float(torch.std(embedding1).item()),
                    'embedding2_std': float(torch.std(embedding2).item()),
                    'embedding1_min': float(torch.min(embedding1).item()),
                    'embedding1_max': float(torch.max(embedding1).item()),
                    'embedding2_min': float(torch.min(embedding2).item()),
                    'embedding2_max': float(torch.max(embedding2).item())
                }

                # Compute embedding space distances before normalization
                euclidean_distance_raw = float(torch.norm(embedding1 - embedding2).item())
                embedding_properties['euclidean_distance_raw'] = euclidean_distance_raw

                # Record embedding analysis performance
                embedding_analysis_time = time.perf_counter() - embedding_analysis_start
                operation_metrics['processing_stages']['embedding_analysis'] = {
                    'duration_seconds': embedding_analysis_time,
                    'properties_computed': len(embedding_properties)
                }

            except Exception as e:
                # Handle embedding analysis failures gracefully
                embedding_properties = {'analysis_error': str(e)}
                operation_metrics['processing_stages']['embedding_analysis'] = {
                    'error': str(e)
                }

        # Apply L2 normalization to embeddings for cosine similarity computation
        # Mathematical formula: ê = e / ||e||₂ where ||e||₂ = √(Σᵢ eᵢ²)
        normalization_start = time.perf_counter()

        try:
            # Compute L2 norms for validation
            norm1 = torch.norm(embedding1, dim=-1, keepdim=True)
            norm2 = torch.norm(embedding2, dim=-1, keepdim=True)

            # Validate norms are non-zero for safe normalization
            if norm1.item() == 0.0 or norm2.item() == 0.0:
                raise ModelInferenceError("Zero-norm embedding detected - invalid CLIP output")

            # Apply L2 normalization: ê = e / ||e||₂
            normalized_embedding1 = embedding1 / norm1
            normalized_embedding2 = embedding2 / norm2

            # Validate normalization success
            norm1_after = float(torch.norm(normalized_embedding1).item())
            norm2_after = float(torch.norm(normalized_embedding2).item())

            if not (0.99 <= norm1_after <= 1.01) or not (0.99 <= norm2_after <= 1.01):
                raise ModelInferenceError(f"Normalization failed: norms = {norm1_after}, {norm2_after}")

            # Record normalization performance and results
            normalization_time = time.perf_counter() - normalization_start
            operation_metrics['processing_stages']['normalization'] = {
                'duration_seconds': normalization_time,
                'norm1_before': float(norm1.item()),
                'norm2_before': float(norm2.item()),
                'norm1_after': norm1_after,
                'norm2_after': norm2_after,
                'normalization_successful': True
            }

            # Update embedding properties with normalized values
            if embedding_analysis:
                embedding_properties.update({
                    'embedding1_norm_after': norm1_after,
                    'embedding2_norm_after': norm2_after,
                    'normalization_applied': True
                })

        except Exception as e:
            # Enhanced error reporting for normalization failures
            operation_metrics['error'] = {
                'stage': 'normalization',
                'error_type': type(e).__name__,
                'error_message': str(e)
            }
            raise ModelInferenceError(f"Embedding normalization failed: {e}") from e

        # Compute cosine similarity via dot product of normalized embeddings
        # Mathematical formula: sim(ê₁,ê₂) = ê₁ · ê₂ = Σᵢ ê₁[i]ê₂[i]
        similarity_computation_start = time.perf_counter()

        try:
            # Compute cosine similarity using matrix multiplication
            cosine_similarity = torch.matmul(normalized_embedding1, normalized_embedding2.T)

            # Extract scalar similarity value and convert to Python float
            similarity_score = float(cosine_similarity.squeeze().item())

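            # Validate similarity score against the mathematical bounds [-1, 1], allowing a
            # small tolerance because rounding can push identical embeddings just past 1.0
            bounds_tolerance = 1e-3
            if not (-1.0 - bounds_tolerance) <= similarity_score <= (1.0 + bounds_tolerance):
                raise ModelInferenceError(f"Invalid cosine similarity {similarity_score}, expected [-1,1]")

            # Clamp residual rounding error so downstream code always sees a value in [-1, 1]
            similarity_score = max(-1.0, min(1.0, similarity_score))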

            # Record similarity computation performance
            similarity_computation_time = time.perf_counter() - similarity_computation_start
            operation_metrics['processing_stages']['similarity_computation'] = {
                'duration_seconds': similarity_computation_time,
                'cosine_similarity': similarity_score,
                'similarity_valid': True
            }

        except Exception as e:
            # Enhanced error reporting for similarity computation failures
            operation_metrics['error'] = {
                'stage': 'similarity_computation',
                'error_type': type(e).__name__,
                'error_message': str(e)
            }
            raise ModelInferenceError(f"Cosine similarity computation failed: {e}") from e

        # Compute additional similarity metrics and statistical properties if analysis requested
        additional_metrics = {}
        if statistical_analysis:
            statistical_analysis_start = time.perf_counter()

            try:
                # Clamp to [-1, 1] so rounding error cannot produce NaN in acos or sqrt
                clamped_similarity = torch.clamp(cosine_similarity, -1.0, 1.0)

                # Compute angular distance: d_θ = arccos(sim(ê₁,ê₂))
                angular_distance = float(torch.acos(clamped_similarity).item())

                # Compute Euclidean distance in normalized space: d_E = ||ê₁ - ê₂||₂
                euclidean_distance_normalized = float(torch.norm(normalized_embedding1 - normalized_embedding2).item())

                # Verify mathematical relationship: d_E = √(2(1 - cos_sim))
                expected_euclidean = float(torch.sqrt(2 * (1 - clamped_similarity)).item())
                euclidean_error = abs(euclidean_distance_normalized - expected_euclidean)

                # Compute Manhattan distance in embedding space
                manhattan_distance = float(torch.sum(torch.abs(normalized_embedding1 - normalized_embedding2)).item())

                # Compute embedding space statistics
                embedding_difference = normalized_embedding1 - normalized_embedding2
                embedding_difference_flat = embedding_difference.flatten()

                # Compute comprehensive statistical properties of embedding differences
                additional_metrics = {
                    'angular_distance_radians': angular_distance,
                    'angular_distance_degrees': float(np.degrees(angular_distance)),
                    'euclidean_distance_normalized': euclidean_distance_normalized,
                    'euclidean_distance_expected': expected_euclidean,
                    'euclidean_computation_error': euclidean_error,
                    'manhattan_distance': manhattan_distance,
                    'embedding_difference_mean': float(torch.mean(embedding_difference_flat).item()),
                    'embedding_difference_std': float(torch.std(embedding_difference_flat).item()),
                    'embedding_difference_max': float(torch.max(torch.abs(embedding_difference_flat)).item()),
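                    # float32 rounding is amplified by the square root near cos_sim = 1,
                    # so the identity is checked to a tolerance of 1e-3 rather than 1e-6
                    'mathematical_consistency_verified': euclidean_error < 1e-3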
                }

                # Compute an approximate interval for cosine similarity using the Fisher
                # z-transformation. This treats embedding dimensions as independent samples,
                # which is a heuristic, so the interval is indicative rather than exact.
                if embedding1.shape[-1] > 10:  # Sufficient dimensionality for statistical analysis
                    # Fisher z-transformation: z = artanh(r) = 0.5 * ln((1+r)/(1-r)).
                    # Clip r away from ±1 so identical images (r = 1) do not divide by zero.
                    r_clipped = float(np.clip(similarity_score, -1.0 + 1e-7, 1.0 - 1e-7))
                    z_score = float(np.arctanh(r_clipped))

                    # Standard error for correlation coefficient
                    n_dims = embedding1.shape[-1]
                    se_z = 1.0 / np.sqrt(n_dims - 3)

                    # 95% confidence interval in z-space
                    z_ci_lower = z_score - 1.96 * se_z
                    z_ci_upper = z_score + 1.96 * se_z

                    # Transform back to correlation space: r = tanh(z)
                    r_ci_lower = np.tanh(z_ci_lower)
                    r_ci_upper = np.tanh(z_ci_upper)

                    additional_metrics.update({
                        'fisher_z_score': z_score,
                        'confidence_interval_lower': float(r_ci_lower),
                        'confidence_interval_upper': float(r_ci_upper),
                        'confidence_interval_width': float(r_ci_upper - r_ci_lower)
                    })

                # Record statistical analysis performance
                statistical_analysis_time = time.perf_counter() - statistical_analysis_start
                operation_metrics['processing_stages']['statistical_analysis'] = {
                    'duration_seconds': statistical_analysis_time,
                    'metrics_computed': len(additional_metrics),
                    'confidence_interval_computed': 'confidence_interval_lower' in additional_metrics
                }

            except Exception as e:
                # Handle statistical analysis failures gracefully
                additional_metrics = {'statistical_analysis_error': str(e)}
                operation_metrics['processing_stages']['statistical_analysis'] = {
                    'error': str(e)
                }

        # Record comprehensive performance metrics
        total_execution_time = time.perf_counter() - method_start_time
        memory_after = psutil.Process().memory_info().rss
        gpu_memory_after = torch.cuda.memory_allocated() if torch.cuda.is_available() else 0
        memory_delta = memory_after - memory_before
        gpu_memory_delta = gpu_memory_after - gpu_memory_before

        operation_metrics.update({
            'total_execution_time_seconds': total_execution_time,
            'memory_delta_bytes': memory_delta,
            'gpu_memory_delta_bytes': gpu_memory_delta,
            'cosine_similarity': similarity_score,
            'result_type': 'successful_similarity_computation'
        })

        # Store performance metrics for analysis if monitoring enabled
        if performance_monitoring:
            self._record_method_performance('clip_embedding_similarity', operation_metrics)

        # Return comprehensive result based on analysis level requested
        if statistical_analysis or embedding_analysis:
            # Return comprehensive analysis with all computed metrics
            comprehensive_result = {
                'cosine_similarity': similarity_score,
                'embedding_properties': embedding_properties,
                'additional_metrics': additional_metrics,
                'performance_metrics': operation_metrics,
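                'model_info': {
                    'model_name': self._clip_model_name,
                    'device_used': self.device,
                    # Report what actually ran (e.g. after a CPU fallback), not just what was requested
                    'mixed_precision_used': operation_metrics['processing_stages']['inference'].get(
                        'mixed_precision_used', False
                    ),
                    'batch_processing_used': operation_metrics['processing_stages']['inference'].get(
                        'batch_processing_used', False
                    )
                },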
                'computation_successful': True
            }
            return comprehensive_result
        else:
            # Return simple cosine similarity score
            return similarity_score

    def reverse_image_search_google(
        self,
        image_path: Union[str, Path],
        driver_path: Union[str, Path],
        timeout: float = 15.0,
        headless: bool = False,
        max_similar_urls: int = 10,
        retry_attempts: int = 3,
        performance_monitoring: bool = True,
        content_analysis: bool = True,
        result_validation: bool = True,
        advanced_extraction: bool = True
    ) -> ReverseImageSearchResult:
        """
        Perform a Google reverse image search for a local image via ChromeDriver automation.

        Uploads the image through an automated Chrome session, then extracts and analyzes
        the results to support quantitative assessment of image provenance and context.
        Each stage uses fallback strategies and raises a stage-specific error on failure.

        Automation Strategy:
        1. Multi-selector fallback hierarchy for UI element location
        2. Comprehensive error recovery with exponential backoff
        3. Advanced result extraction with content quality assessment
        4. Statistical confidence scoring based on result characteristics
        5. Geographic and temporal distribution analysis

        Args:
            image_path: Path to local image file for reverse search
            driver_path: Path to ChromeDriver executable
            timeout: Maximum wait time for page elements (seconds)
            headless: Whether to run browser in headless mode
            max_similar_urls: Maximum number of similar image URLs to extract
            retry_attempts: Number of retry attempts for failed operations
            performance_monitoring: Whether to track detailed performance metrics
            content_analysis: Whether to perform content quality analysis
            result_validation: Whether to validate extracted results
            advanced_extraction: Whether to extract additional metadata

        Returns:
            ReverseImageSearchResult with comprehensive search findings and analysis

        Raises:
            LaunchError: If ChromeDriver is missing, not executable, or fails to initialize
            NavigationError: If page navigation fails (includes network analysis context)
            UploadError: If the image upload fails (includes file analysis context)
            ExtractionError: If result extraction fails (includes DOM analysis context)
        """
        # Record method execution start for comprehensive performance monitoring
        method_start_time = time.perf_counter()
        memory_before = psutil.Process().memory_info().rss

        # Validate image file accessibility using comprehensive path validation
        validated_image_path = self._validate_image_path(
            image_path,
            self._allow_symlinks,
            perform_content_validation=True,
            max_file_size_mb=20.0  # Reasonable limit for web upload
        )

        # Validate ChromeDriver executable accessibility with comprehensive checks
        driver_path_obj = Path(driver_path)
        if not driver_path_obj.exists() or not driver_path_obj.is_file():
            raise LaunchError(
                f"ChromeDriver not found: {driver_path_obj}",
                driver_path=driver_path_obj,
                forensic_metadata=ForensicMetadata(
                    operation_name="chromedriver_validation",
                    algorithm_parameters={'driver_path': str(driver_path_obj)}
                )
            )

        # Verify ChromeDriver has execution permissions
        if not os.access(driver_path_obj, os.X_OK):
            raise LaunchError(
                f"ChromeDriver not executable: {driver_path_obj}",
                driver_path=driver_path_obj,
                driver_version=None,
                browser_version=None
            )

        # Initialize comprehensive performance tracking with detailed metrics
        operation_metrics = {
            'method_name': 'reverse_image_search_google',
            'start_time': datetime.datetime.utcnow().isoformat(),
            'parameters': {
                'image_path': str(validated_image_path),
                'timeout': timeout,
                'headless': headless,
                'max_similar_urls': max_similar_urls,
                'retry_attempts': retry_attempts,
                'content_analysis': content_analysis,
                'advanced_extraction': advanced_extraction
            },
            'processing_stages': {},
            'automation_events': []
        }

        # Configure Chrome options for robust automation with enterprise-grade settings
        chrome_options = Options()

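        # Options for automation stability. Note that --no-sandbox, --disable-web-security,
        # and --allow-running-insecure-content weaken browser security, so run this only
        # in an isolated environment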
        chrome_options.add_argument("--no-sandbox")                    # Bypass OS security model for automation
        chrome_options.add_argument("--disable-dev-shm-usage")         # Overcome limited resource problems
        chrome_options.add_argument("--disable-gpu")                   # Disable GPU for stability
        chrome_options.add_argument("--disable-extensions")            # Disable extensions for speed and security
        chrome_options.add_argument("--disable-plugins")               # Disable plugins for security
        chrome_options.add_argument("--disable-web-security")          # Disable web security for automation
        chrome_options.add_argument("--allow-running-insecure-content") # Allow mixed content
        chrome_options.add_argument("--disable-features=VizDisplayCompositor") # Stability improvement

        # Performance optimization options
        chrome_options.add_argument("--disable-background-timer-throttling")
        chrome_options.add_argument("--disable-backgrounding-occluded-windows")
        chrome_options.add_argument("--disable-renderer-backgrounding")

        # Window management for consistent DOM rendering
        if headless:
            chrome_options.add_argument("--headless")                  # Run in headless mode
            chrome_options.add_argument("--window-size=1920,1080")     # Set window size for headless
            chrome_options.add_argument("--disable-logging")           # Reduce log output
        else:
            chrome_options.add_argument("--start-maximized")           # Maximize window for visibility

        # User agent configuration to avoid bot detection
        chrome_options.add_argument(
            "--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
            "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
        )

        # Configure WebDriver service with comprehensive error handling
        try:
            # webdriver.Chrome() starts the service itself, so it is only configured here;
            # starting it manually as well would launch a second ChromeDriver process
            webdriver_service = Service(str(driver_path_obj))

            # Record service configuration success
            operation_metrics['processing_stages']['service_initialization'] = {
                'driver_path': str(driver_path_obj),
                'service_started': False,
                'initialization_successful': True
            }

        except Exception as e:
            raise LaunchError(
                f"WebDriver service configuration failed: {e}",
                driver_path=driver_path_obj,
                forensic_metadata=ForensicMetadata(
                    operation_name="webdriver_service_initialization",
                    algorithm_parameters={'error_details': str(e)}
                )
            ) from e

        # Initialize WebDriver with comprehensive error handling and retry logic
        driver = None
        driver_initialization_start = time.perf_counter()

        for attempt in range(retry_attempts):
            try:
                # Attempt Chrome WebDriver initialization
                driver = webdriver.Chrome(service=webdriver_service, options=chrome_options)

                # Configure timeouts for robust automation
                driver.implicitly_wait(timeout)
                driver.set_page_load_timeout(timeout * 2)
                driver.set_script_timeout(timeout)

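                # Validate driver functionality with basic test; raise WebDriverException so
                # the handler below quits the driver and retries instead of leaking the browser
                driver.get("data:text/html,<html><body>Test</body></html>")
                if "Test" not in driver.page_source:
                    raise WebDriverException("Driver functionality test failed")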

                # Record successful driver initialization
                driver_initialization_time = time.perf_counter() - driver_initialization_start
                operation_metrics['processing_stages']['driver_initialization'] = {
                    'duration_seconds': driver_initialization_time,
                    'attempt_number': attempt + 1,
                    'initialization_successful': True,
                    'browser_version': driver.capabilities.get('browserVersion', 'unknown'),
                    'driver_version': driver.capabilities.get('chrome', {}).get('chromedriverVersion', 'unknown')
                }
                break

            except WebDriverException as e:
                # Record failed attempt
                operation_metrics['automation_events'].append({
                    'event_type': 'driver_initialization_failed',
                    'attempt': attempt + 1,
                    'error': str(e),
                    'timestamp': datetime.datetime.utcnow().isoformat()
                })

                # Clean up failed driver instance
                if driver is not None:
                    try:
                        driver.quit()
                    except Exception:
                        pass
                    driver = None

                # Retry with exponential backoff if attempts remaining
                if attempt < retry_attempts - 1:
                    backoff_time = 2 ** attempt
                    time.sleep(backoff_time)
                    continue
                else:
                    # All attempts failed
                    raise LaunchError(
                        f"ChromeDriver launch failed after {retry_attempts} attempts: {e}",
                        driver_path=driver_path_obj,
                        browser_version=None,
                        driver_version=None,
                        forensic_metadata=ForensicMetadata(
                            operation_name="chromedriver_launch_failure",
                            algorithm_parameters={
                                'total_attempts': retry_attempts,
                                'final_error': str(e)
                            }
                        )
                    ) from e

        # Main automation workflow with comprehensive error handling
        try:
            # Navigate to Google Images with retry logic and performance monitoring
            navigation_start = time.perf_counter()
            navigation_success = False

            for attempt in range(retry_attempts):
                try:
                    # Navigate to Google Images homepage
                    driver.get("https://images.google.com")

                    # Verify successful navigation by checking page elements
                    WebDriverWait(driver, timeout).until(
                        EC.presence_of_element_located((By.TAG_NAME, "body"))
                    )

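                    # Validate page loaded correctly; images.google.com may redirect to
                    # www.google.com/imghp, so accept either form of the URL
                    current_url = driver.current_url.lower()
                    if "Google" in driver.title and ("images" in current_url or "imghp" in current_url):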
                        navigation_success = True

                        # Record successful navigation
                        navigation_time = time.perf_counter() - navigation_start
                        operation_metrics['processing_stages']['navigation'] = {
                            'duration_seconds': navigation_time,
                            'attempt_number': attempt + 1,
                            'final_url': driver.current_url,
                            'page_title': driver.title,
                            'navigation_successful': True
                        }
                        break
                    else:
                        raise NavigationError(f"Unexpected page content: {driver.title}")

                except Exception as e:
                    # Record navigation failure
                    operation_metrics['automation_events'].append({
                        'event_type': 'navigation_failed',
                        'attempt': attempt + 1,
                        'error': str(e),
                        'current_url': driver.current_url if driver else 'unknown',
                        'timestamp': datetime.datetime.utcnow().isoformat()
                    })

                    if attempt < retry_attempts - 1:
                        time.sleep(2 ** attempt)  # Exponential backoff
                        continue

            if not navigation_success:
                raise NavigationError(
                    "Failed to navigate to Google Images after all attempts",
                    target_url="https://images.google.com",
                    forensic_metadata=ForensicMetadata(
                        operation_name="google_images_navigation",
                        algorithm_parameters={'total_attempts': retry_attempts}
                    )
                )

            # Implement robust selector strategy with comprehensive fallback hierarchy
            camera_button_selectors = [
                # Primary selectors with high specificity
                "button[aria-label*='Search by image']",
                "div[aria-label*='Search by image']",
                "button[data-ved*='camera']",
                "button[jsname='LgbsSe']",

                # CSS class fallbacks (may change frequently)
                "div.nDcEnd",
                ".rUDD3b",
                ".Gdd5U",

                # XPath fallbacks for robust element location
                "//button[@aria-label and contains(@aria-label, 'Search by image')]",
                "//div[@aria-label and contains(@aria-label, 'Search by image')]",
                "//button[contains(@class, 'camera') or contains(@data-ved, 'camera')]",
                "//div[@role='button' and contains(text(), 'camera')]"
            ]

            # Attempt to locate and click camera button using selector hierarchy
            camera_button_clicked = False
            camera_interaction_start = time.perf_counter()

            for selector_idx, selector in enumerate(camera_button_selectors):
                try:
                    # Determine selector type and create appropriate locator
                    if selector.startswith("//"):
                        locator = (By.XPATH, selector)
                        selector_type = "xpath"
                    else:
                        locator = (By.CSS_SELECTOR, selector)
                        selector_type = "css"

                    # Wait for element to be present and clickable
                    camera_button = WebDriverWait(driver, timeout).until(
                        EC.element_to_be_clickable(locator)
                    )

                    # Scroll element into view for reliable interaction
                    driver.execute_script("arguments[0].scrollIntoView({block: 'center'});", camera_button)
                    time.sleep(0.5)  # Brief pause for scroll completion

                    # Attempt click with multiple strategies
                    click_successful = False
                    click_strategies = [
                        lambda: camera_button.click(),  # Standard click
                        lambda: driver.execute_script("arguments[0].click();", camera_button),  # JavaScript click
                        lambda: webdriver.ActionChains(driver).move_to_element(camera_button).click().perform()  # Action chains
                    ]

                    for strategy_idx, click_strategy in enumerate(click_strategies):
                        try:
                            click_strategy()
                            click_successful = True
                            break
                        except Exception:
                            if strategy_idx == len(click_strategies) - 1:
                                raise  # All strategies failed; handled by the outer except
                            time.sleep(0.2)  # Brief pause before next strategy

                    if click_successful:
                        camera_button_clicked = True

                        # Record successful camera button interaction
                        camera_interaction_time = time.perf_counter() - camera_interaction_start
                        operation_metrics['processing_stages']['camera_button_interaction'] = {
                            'duration_seconds': camera_interaction_time,
                            'selector_used': selector,
                            'selector_type': selector_type,
                            'selector_index': selector_idx,
                            'click_successful': True
                        }
                        break

                except TimeoutException:
                    # Selector not found - try next in hierarchy
                    operation_metrics['automation_events'].append({
                        'event_type': 'selector_timeout',
                        'selector': selector,
                        'selector_index': selector_idx,
                        'timestamp': datetime.datetime.utcnow().isoformat()
                    })
                    continue

                except Exception as e:
                    # Other interaction errors - log and continue
                    operation_metrics['automation_events'].append({
                        'event_type': 'selector_interaction_failed',
                        'selector': selector,
                        'selector_index': selector_idx,
                        'error': str(e),
                        'timestamp': datetime.datetime.utcnow().isoformat()
                    })
                    continue

            if not camera_button_clicked:
                raise NavigationError(
                    "Failed to locate and click camera button with all selectors",
                    target_url=driver.current_url,
                    automation_state={'selectors_tried': len(camera_button_selectors)},
                    forensic_metadata=ForensicMetadata(
                        operation_name="camera_button_interaction",
                        algorithm_parameters={'selectors_attempted': camera_button_selectors}
                    )
                )

            # Locate and interact with upload tab using multiple strategies
            upload_tab_selectors = [
                # Direct text-based selectors
                "//a[contains(text(), 'Upload an image')]",
                "//div[contains(text(), 'Upload an image')]",
                "//span[contains(text(), 'Upload')]",

                # Attribute-based selectors
                "a[href*='upload']",
                "div[data-ved*='upload']",
                "button[aria-label*='upload']",

                # CSS class fallbacks
                ".RZQOVd",
                ".aXBtI",
                ".Gdd5U"
            ]

            upload_tab_clicked = False
            upload_tab_interaction_start = time.perf_counter()

            for selector_idx, selector in enumerate(upload_tab_selectors):
                try:
                    if selector.startswith("//"):
                        locator = (By.XPATH, selector)
                        selector_type = "xpath"
                    else:
                        locator = (By.CSS_SELECTOR, selector)
                        selector_type = "css"

                    # Wait for upload tab element
                    upload_tab = WebDriverWait(driver, timeout).until(
                        EC.element_to_be_clickable(locator)
                    )

                    # Scroll into view and click
                    driver.execute_script("arguments[0].scrollIntoView({block: 'center'});", upload_tab)
                    time.sleep(0.3)

                    # Attempt click with fallback strategies
                    try:
                        upload_tab.click()
                    except Exception:
                        driver.execute_script("arguments[0].click();", upload_tab)

                    upload_tab_clicked = True

                    # Record successful upload tab interaction
                    upload_tab_interaction_time = time.perf_counter() - upload_tab_interaction_start
                    operation_metrics['processing_stages']['upload_tab_interaction'] = {
                        'duration_seconds': upload_tab_interaction_time,
                        'selector_used': selector,
                        'selector_type': selector_type,
                        'selector_index': selector_idx,
                        'interaction_successful': True
                    }
                    break

                except TimeoutException:
                    continue
                except Exception as e:
                    operation_metrics['automation_events'].append({
                        'event_type': 'upload_tab_interaction_failed',
                        'selector': selector,
                        'error': str(e),
                        'timestamp': datetime.datetime.utcnow().isoformat()
                    })
                    continue

            if not upload_tab_clicked:
                raise NavigationError(
                    "Failed to locate and click upload tab with all selectors",
                    target_url=driver.current_url,
                    automation_state={'upload_selectors_tried': len(upload_tab_selectors)}
                )

            # Locate file input element and perform upload with comprehensive validation
            file_input_selectors = [
                "input[name='encoded_image']",
                "input[type='file']",
                "input[accept*='image']",
                "input[class*='file']",
                ".cB9M7",
                "//input[@type='file']",
                "//input[@accept and contains(@accept, 'image')]"
            ]

            file_uploaded = False
            file_upload_start = time.perf_counter()

            for selector_idx, selector in enumerate(file_input_selectors):
                try:
                    if selector.startswith("//"):
                        locator = (By.XPATH, selector)
                    else:
                        locator = (By.CSS_SELECTOR, selector)

                    # Wait for file input element
                    file_input = WebDriverWait(driver, timeout).until(
                        EC.presence_of_element_located(locator)
                    )

                    # Validate file input element properties
                    if not file_input.is_enabled():
                        continue

                    # Upload file by sending keys with absolute path
                    absolute_path = str(validated_image_path.resolve())
                    file_input.send_keys(absolute_path)

                    # Pause briefly so the upload can start; no page change is verified here
                    time.sleep(1.0)

                    file_uploaded = True

                    # Record successful file upload
                    file_upload_time = time.perf_counter() - file_upload_start
                    operation_metrics['processing_stages']['file_upload'] = {
                        'duration_seconds': file_upload_time,
                        'selector_used': selector,
                        'file_path': absolute_path,
                        'file_size_bytes': validated_image_path.stat().st_size,
                        'upload_successful': True
                    }
                    break

                except TimeoutException:
                    continue
                except Exception as e:
                    operation_metrics['automation_events'].append({
                        'event_type': 'file_upload_failed',
                        'selector': selector,
                        'error': str(e),
                        'timestamp': datetime.datetime.utcnow().isoformat()
                    })
                    continue

            if not file_uploaded:
                raise UploadError(
                    "Failed to upload image file with all selectors",
                    file_path=validated_image_path,
                    file_size=int(validated_image_path.stat().st_size),
                    upload_parameters={'selectors_tried': len(file_input_selectors)},
                    forensic_metadata=ForensicMetadata(
                        operation_name="image_file_upload",
                        algorithm_parameters={'file_path': str(validated_image_path)}
                    )
                )

            # Wait for results page to load and extract comprehensive data
            results_extraction_start = time.perf_counter()

            # Wait for page to load search results
            try:
                WebDriverWait(driver, timeout * 2).until(
                    lambda d: d.execute_script("return document.readyState") == "complete"
                )
                time.sleep(2.0)  # Additional wait for dynamic content
            except TimeoutException:
                pass  # Continue with extraction even if page not fully loaded

            # Extract best guess text with comprehensive selector strategy
            best_guess_text = ""
            best_guess_selectors = [
                "div[role='heading']",
                ".fKDtNb",
                "//div[contains(@class, 'r5a77d')]",
                "h3",
                ".gLFyf",
                "//div[@data-ved and contains(text(), ' ')]",
                ".yXK7lf"
            ]

            for selector in best_guess_selectors:
                try:
                    if selector.startswith("//"):
                        locator = (By.XPATH, selector)
                    else:
                        locator = (By.CSS_SELECTOR, selector)

                    best_guess_element = WebDriverWait(driver, timeout // 2).until(
                        EC.presence_of_element_located(locator)
                    )

                    extracted_text = best_guess_element.text.strip()
                    if extracted_text and len(extracted_text) > 3:  # Meaningful text threshold
                        best_guess_text = extracted_text
                        break

                except TimeoutException:
                    continue
                except Exception as e:
                    operation_metrics['automation_events'].append({
                        'event_type': 'best_guess_extraction_failed',
                        'selector': selector,
                        'error': str(e),
                        'timestamp': datetime.datetime.utcnow().isoformat()
                    })
                    continue

            # Extract similar image URLs with advanced filtering and validation
            similar_urls = []
            similar_image_selectors = [
                "a[jsname='sTFXNd']",
                "a[href*='/imgres?']",
                ".rg_l",
                "//a[contains(@href, 'imgres')]",
                ".isv-r a",
                ".rg_di a"
            ]

            for selector in similar_image_selectors:
                try:
                    if selector.startswith("//"):
                        locator = (By.XPATH, selector)
                    else:
                        locator = (By.CSS_SELECTOR, selector)

                    similar_elements = WebDriverWait(driver, timeout // 2).until(
                        EC.presence_of_all_elements_located(locator)
                    )

                    # Extract and validate URLs
                    for element in similar_elements[:max_similar_urls * 2]:  # Get extra for filtering
                        try:
                            href = element.get_attribute("href")
                            if href and self._validate_search_result_url(href):
                                if href not in similar_urls:  # Avoid duplicates
                                    similar_urls.append(href)
                                    if len(similar_urls) >= max_similar_urls:
                                        break
                        except Exception:
                            continue

                    if similar_urls:  # Stop if URLs found
                        break

                except TimeoutException:
                    continue
                except Exception as e:
                    operation_metrics['automation_events'].append({
                        'event_type': 'similar_images_extraction_failed',
                        'selector': selector,
                        'error': str(e),
                        'timestamp': datetime.datetime.utcnow().isoformat()
                    })
                    continue

            # Extract additional metadata if advanced extraction enabled
            additional_metadata = {}
            if advanced_extraction:
                try:
                    # Extract page metadata
                    additional_metadata = {
                        'page_url': driver.current_url,
                        'page_title': driver.title,
                        'search_timestamp': datetime.datetime.utcnow().isoformat(),
                        'total_page_elements': len(driver.find_elements(By.TAG_NAME, "*")),
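                        # Elapsed time from method start to the beginning of extraction
                        # (covers launch, navigation, and upload, not only page load)
                        'page_load_time': results_extraction_start - method_start_time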
                    }

                    # Extract domain distribution from URLs
                    if similar_urls:
                        from urllib.parse import urlparse
                        domains = []
                        for url in similar_urls:
                            try:
                                domain = urlparse(url).netloc.lower()
                                domains.append(domain)
                            except ValueError:
                                # urlparse raises ValueError on malformed URLs
                                continue

                        # Count occurrences per domain in a single pass
                        domain_distribution = {}
                        for domain in domains:
                            domain_distribution[domain] = domain_distribution.get(domain, 0) + 1

                        additional_metadata.update({
                            'unique_domain_count': len(domain_distribution),
                            'domain_distribution': domain_distribution,
                            'duplicate_url_count': len(similar_urls) - len(set(similar_urls))
                        })

                except Exception as e:
                    additional_metadata = {'extraction_error': str(e)}

            # Record results extraction performance
            results_extraction_time = time.perf_counter() - results_extraction_start
            operation_metrics['processing_stages']['results_extraction'] = {
                'duration_seconds': results_extraction_time,
                'best_guess_extracted': bool(best_guess_text),
                'similar_urls_count': len(similar_urls),
                'advanced_metadata_extracted': bool(additional_metadata) and 'extraction_error' not in additional_metadata,
                'extraction_successful': True
            }

            # Perform content analysis if requested
            content_quality_metrics = {}
            if content_analysis and (best_guess_text or similar_urls):
                content_analysis_start = time.perf_counter()

                try:
                    # Analyze best guess text quality
                    if best_guess_text:
                        content_quality_metrics['best_guess_analysis'] = {
                            'text_length': len(best_guess_text),
                            'word_count': len(best_guess_text.split()),
                            'contains_numbers': any(char.isdigit() for char in best_guess_text),
                            'contains_special_chars': any(not char.isalnum() and not char.isspace() for char in best_guess_text),
                            'language_detected': self._detect_text_language(best_guess_text),
                            'confidence_score': self._compute_text_confidence(best_guess_text)
                        }

                    # Analyze URL quality and distribution
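                    if similar_urls:
                        # Import here as well: the earlier function-level import only runs when
                        # advanced_extraction is enabled, which would otherwise leave the name unbound
                        from urllib.parse import urlparse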
                        url_analysis = {
                            'total_urls': len(similar_urls),
                            'unique_urls': len(set(similar_urls)),
                            'duplicate_ratio': (len(similar_urls) - len(set(similar_urls))) / len(similar_urls),
                            'average_url_length': statistics.mean([len(url) for url in similar_urls]),
                            'https_ratio': sum(1 for url in similar_urls if url.startswith('https://')) / len(similar_urls),
                            'domain_diversity': len(set(urlparse(url).netloc for url in similar_urls if self._is_valid_url(url)))
                        }
                        content_quality_metrics['url_analysis'] = url_analysis

                    # Compute overall content quality score
                    quality_factors = []

                    if best_guess_text:
                        text_quality = min(len(best_guess_text) / 50.0, 1.0)  # Normalize to [0,1]
                        quality_factors.append(text_quality * 0.4)

                    if similar_urls:
                        url_quality = min(len(similar_urls) / max_similar_urls, 1.0)
                        quality_factors.append(url_quality * 0.6)

                    overall_quality = sum(quality_factors) if quality_factors else 0.0
                    content_quality_metrics['overall_quality_score'] = overall_quality

                    # Record content analysis performance
                    content_analysis_time = time.perf_counter() - content_analysis_start
                    operation_metrics['processing_stages']['content_analysis'] = {
                        'duration_seconds': content_analysis_time,
                        'quality_score': overall_quality,
                        'analysis_successful': True
                    }

                except Exception as e:
                    content_quality_metrics = {'analysis_error': str(e)}
                    operation_metrics['processing_stages']['content_analysis'] = {
                        'error': str(e)
                    }

            # Compute confidence score based on result characteristics
            confidence_score = 0.0
            if best_guess_text or similar_urls:
                confidence_factors = []

                # Text-based confidence
                if best_guess_text:
                    text_conf = min(len(best_guess_text.split()) / 10.0, 1.0)
                    confidence_factors.append(text_conf * 0.3)

                # URL-based confidence
                if similar_urls:
                    url_conf = min(len(similar_urls) / max_similar_urls, 1.0)
                    confidence_factors.append(url_conf * 0.4)

                # Diversity-based confidence
                if 'unique_domain_count' in additional_metadata:
                    diversity_conf = min(additional_metadata['unique_domain_count'] / 5.0, 1.0)
                    confidence_factors.append(diversity_conf * 0.3)

                confidence_score = sum(confidence_factors)

            # Record comprehensive performance metrics
            total_execution_time = time.perf_counter() - method_start_time
            memory_after = psutil.Process().memory_info().rss
            memory_delta = memory_after - memory_before

            operation_metrics.update({
                'total_execution_time_seconds': total_execution_time,
                'memory_delta_bytes': memory_delta,
                'confidence_score': confidence_score,
                'result_type': 'successful_search'
            })

            # Store performance metrics for analysis if monitoring enabled
            if performance_monitoring:
                self._record_method_performance('reverse_image_search_google', operation_metrics)

            # Create comprehensive result structure with all extracted data
            search_result = ReverseImageSearchResult(
                best_guess=best_guess_text or "No description found",
                similar_image_urls=similar_urls,
                source_page_title=driver.title,
                site_authority_score=None,  # Could be computed from domain analysis
                snippet_text=None,  # Could be extracted from page content
                confidence_score=confidence_score,
                search_timestamp=datetime.datetime.utcnow(),
                search_duration_seconds=total_execution_time,
                duplicate_url_count=additional_metadata.get('duplicate_url_count', 0),
                unique_domain_count=additional_metadata.get('unique_domain_count', 0),
                geographic_indicators=additional_metadata.get('domain_distribution')
            )

            # Validate result if requested
            if result_validation:
                try:
                    is_valid, violations = search_result.validate_constraints(strict=False)
                    if not is_valid:
                        operation_metrics['result_validation'] = {
                            'validation_passed': False,
                            'violations': violations
                        }
                    else:
                        operation_metrics['result_validation'] = {
                            'validation_passed': True
                        }
                except Exception as e:
                    operation_metrics['result_validation'] = {
                        'validation_error': str(e)
                    }

            return search_result

        except Exception as e:
            # Comprehensive error handling with context preservation
            if isinstance(e, (NavigationError, UploadError, ExtractionError)):
                raise  # Re-raise domain-specific exceptions
            else:
                # Wrap unexpected errors in appropriate exception type
                raise ExtractionError(
                    f"Reverse image search failed: {e}",
                    extraction_target="google_search_results",
                    dom_state={'current_url': driver.current_url if driver else 'unknown'},
                    extraction_parameters=operation_metrics['parameters'],
                    forensic_metadata=ForensicMetadata(
                        operation_name="reverse_image_search_failure",
                        algorithm_parameters={'error_type': type(e).__name__}
                    )
                ) from e

        finally:
            # Ensure WebDriver cleanup regardless of success or failure
            if driver is not None:
                try:
                    # Capture final state for debugging if needed
                    final_url = driver.current_url
                    final_title = driver.title

                    # Close browser and clean up resources
                    driver.quit()

                    # Record cleanup completion
                    operation_metrics['cleanup'] = {
                        'driver_closed': True,
                        'final_url': final_url,
                        'final_title': final_title
                    }

                except Exception as cleanup_error:
                    # Log cleanup failures but don't raise
                    operation_metrics['cleanup'] = {
                        'cleanup_error': str(cleanup_error)
                    }

    def _validate_search_result_url(self, url: str) -> bool:
        """
        Validate search result URL for quality and relevance.

        Args:
            url: URL to validate

        Returns:
            Boolean indicating URL validity
        """
        if not url or not isinstance(url, str):
            return False

        # Basic URL structure validation
        if not url.startswith(('http://', 'https://')):
            return False

        # Filter out Google's internal URLs
        google_internal_patterns = [
            'google.com/search',
            'google.com/url',
            'googleusercontent.com',
            'accounts.google.com'
        ]

        if any(pattern in url.lower() for pattern in google_internal_patterns):
            return False

        # Check for reasonable URL length
        if len(url) > 2000:  # Extremely long URLs are suspicious
            return False

        return True

    def _detect_text_language(self, text: str) -> str:
        """
        Detect language of extracted text using simple heuristics.

        Args:
            text: Text to analyze

        Returns:
            'en' if the text is predominantly ASCII, 'other' otherwise,
            or 'unknown' if the text is empty
        """
        if not text:
            return 'unknown'

        # Simple heuristic based on the share of ASCII characters.
        # A production system would use a proper language-detection library.
        ascii_ratio = sum(1 for char in text if ord(char) < 128) / len(text)

        if ascii_ratio > 0.9:
            return 'en'  # Likely English
        else:
            return 'other'  # Non-English or mixed

    def _compute_text_confidence(self, text: str) -> float:
        """
        Compute confidence score for extracted text quality.

        Args:
            text: Text to analyze

        Returns:
            Confidence score [0,1]
        """
        if not text:
            return 0.0

        # Factors contributing to text confidence
        factors = []

        # Length factor
        length_factor = min(len(text) / 100.0, 1.0)
        factors.append(length_factor * 0.3)

        # Word count factor
        word_count = len(text.split())
        word_factor = min(word_count / 20.0, 1.0)
        factors.append(word_factor * 0.3)

        # Character diversity factor
        unique_chars = len(set(text.lower()))
        diversity_factor = min(unique_chars / 20.0, 1.0)
        factors.append(diversity_factor * 0.2)

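        # Completeness factor (no truncation indicators)
        # Ellipsis markers match anywhere; 'more' must be a standalone word so that
        # names such as "Baltimore" are not treated as truncated.
        lowered_text = text.lower()
        ellipsis_markers = ['...', '…', '[...]']
        is_truncated = any(marker in lowered_text for marker in ellipsis_markers) or 'more' in lowered_text.split()
        completeness_factor = 0.0 if is_truncated else 1.0
        factors.append(completeness_factor * 0.2)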

        return sum(factors)

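The weighting in `_compute_text_confidence` is easier to check against a concrete input. The following is a minimal standalone sketch, not the toolkit's method: `text_confidence` is a hypothetical free function that mirrors the four weights used above (length 0.3, word count 0.3, character diversity 0.2, completeness 0.2). For simplicity it treats only ellipsis markers as truncation, which is an assumption.

```python
def text_confidence(text: str) -> float:
    """Hypothetical standalone mirror of the weighted text-confidence score."""
    if not text:
        return 0.0

    # Each factor is capped at 1.0 before its weight is applied.
    length_factor = min(len(text) / 100.0, 1.0)
    word_factor = min(len(text.split()) / 20.0, 1.0)
    diversity_factor = min(len(set(text.lower())) / 20.0, 1.0)

    # Simplification (assumption): only ellipsis markers count as truncation here.
    truncated = any(marker in text for marker in ("...", "…"))
    completeness_factor = 0.0 if truncated else 1.0

    return (
        length_factor * 0.3
        + word_factor * 0.3
        + diversity_factor * 0.2
        + completeness_factor * 0.2
    )


score = text_confidence("golden retriever puppy")
# 22 characters, 3 words, 14 distinct characters, no truncation marker:
# 0.22 * 0.3 + 0.15 * 0.3 + 0.70 * 0.2 + 1.0 * 0.2 = 0.451
print(f"{score:.3f}")
```

Because completeness alone contributes 0.2, even a very short untruncated label scores above 0.2, so the value is better read as a relative ranking than as a probability.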

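The URL metrics gathered during content analysis (duplicate ratio, HTTPS share, domain diversity) can be reproduced on a small hand-made list. This is a sketch using only the standard library; the URLs are placeholders, and the toolkit's `_is_valid_url` filter is omitted, so every URL here is assumed to be well formed.

```python
import statistics
from urllib.parse import urlparse

# Placeholder search results: one exact duplicate and one plain-HTTP URL.
similar_urls = [
    "https://example.com/a.jpg",
    "https://example.com/a.jpg",
    "http://images.example.org/b.png",
    "https://cdn.example.net/c.webp",
]

unique_urls = set(similar_urls)
url_analysis = {
    "total_urls": len(similar_urls),
    "unique_urls": len(unique_urls),
    # Share of entries that repeat an earlier URL.
    "duplicate_ratio": (len(similar_urls) - len(unique_urls)) / len(similar_urls),
    "average_url_length": statistics.mean(len(url) for url in similar_urls),
    # Share of entries served over HTTPS.
    "https_ratio": sum(url.startswith("https://") for url in similar_urls) / len(similar_urls),
    # Number of distinct hosts across all entries.
    "domain_diversity": len({urlparse(url).netloc for url in similar_urls}),
}
print(url_analysis)
```

Both ratios divide by `len(similar_urls)`, which is why the original code computes them only inside the `if similar_urls:` guard.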
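# Usage Example

# Imports used by this example. The toolkit classes and exceptions
# (ImageSimilarityDetector, ResourceConstraints, ValidationPolicy, etc.)
# are assumed to be defined earlier in this file.
import json
import logging
import sys
import traceback
from pathlib import Path
from typing import Any, Dict, Union

import cv2
import numpy as np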

def demonstrate_image_provenance_analysis(
    detector: ImageSimilarityDetector,
    image1_path: Union[str, Path],
    image2_path: Union[str, Path],
    chromedriver_path: Union[str, Path]
) -> Dict[str, Any]:
    """
    Executes a comprehensive, multi-modal analysis to compare two images and
    establish the provenance of the first image.

    This function serves as a production-grade demonstration of the
    ImageSimilarityDetector's capabilities, invoking each of its primary
    analytical methods in a structured sequence. It captures results from
    perceptual hashing, local feature matching, global color analysis,
    semantic similarity, and public reverse image search.

    The methodology proceeds from low-level structural comparisons to
    high-level semantic and contextual analysis, providing a holistic
    view of the relationship between the images.

    Args:
        detector (ImageSimilarityDetector): An initialized instance of the
            image similarity detector.
        image1_path (Union[str, Path]): The file path to the primary image
            to be analyzed and compared. This image will also be used for
            the reverse image search.
        image2_path (Union[str, Path]): The file path to the secondary image
            for comparison.
        chromedriver_path (Union[str, Path]): The file path to the
            Selenium ChromeDriver executable, required for reverse image search.

    Returns:
        Dict[str, Any]: A dictionary containing the detailed results from each
            analysis stage. Each key corresponds to an analysis method, and
            the value is either a comprehensive result dictionary/object or
            an error message if that stage failed.

    Raises:
        Exceptions of the types each stage anticipates are captured in the
        results dictionary rather than raised, so the remaining stages can
        still run. Stages 1-4 catch only those specific types, so any other
        exception propagates to the caller; Stage 5 catches all exceptions.
    """
    # Initialize a dictionary to aggregate the results from all analysis methods.
    analysis_results: Dict[str, Any] = {}
    # Log the start of the analysis; logging itself is configured by the caller.
    logging.info(f"Starting comprehensive provenance analysis for '{Path(image1_path).name}' and '{Path(image2_path).name}'.")

    # --- Stage 1: Perceptual Hash Analysis (Structural Duplication) ---
    # This stage checks for near-identical copies using a DCT-based perceptual hash.
    # It is highly effective for detecting direct replication with minor modifications.
    logging.info("Executing Stage 1: Perceptual Hash Analysis...")
    try:
        # Execute perceptual hash analysis with full statistical reporting for rigor.
        # A 16x16 hash (256 bits) is used for higher precision in detecting subtle differences.
        p_hash_results = detector.perceptual_hash_difference(
            image1_path,
            image2_path,
            hash_size=16,
            normalize=True,
            return_similarity=True,
            statistical_analysis=True
        )
        # Store the comprehensive results dictionary.
        analysis_results['perceptual_hash'] = p_hash_results
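        # Log the primary similarity score for quick assessment.
        # Format only when a score is present; applying ':.4f' to the 'N/A' string would raise ValueError.
        p_hash_score = p_hash_results.get('similarity_score')
        p_hash_score_text = f"{p_hash_score:.4f}" if p_hash_score is not None else "N/A"
        logging.info(f"  - pHash Similarity Score: {p_hash_score_text}")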
    except (ImageUnreadableError, ValueError, RuntimeError) as e:
        # Gracefully handle and record errors related to image processing or hash computation.
        analysis_results['perceptual_hash'] = {'error': str(e), 'details': traceback.format_exc()}
        # Log the failure of this analysis stage.
        logging.error(f"  - Perceptual Hash Analysis failed: {e}")

    # --- Stage 2: Local Feature Matching Analysis (Geometric Consistency) ---
    # This stage detects structural copying of components using ORB features and RANSAC.
    # It is critical for identifying collage-like compositions or transformed object insertions.
    logging.info("Executing Stage 2: Local Feature Matching Analysis...")
    try:
        # Execute feature matching with geometric verification to ensure spatial coherence.
        # Lowe's ratio test is applied for more robust and less ambiguous matches.
        feature_match_results = detector.feature_match_ratio(
            image1_path,
            image2_path,
            distance_threshold=64,
            normalization_strategy="min_keypoints",
            apply_ratio_test=True,
            ratio_threshold=0.75,
            resize_max_side=1024,
            return_detailed_result=True,
            geometric_verification=True,
            statistical_analysis=True
        )
        # Store the comprehensive FeatureMatchResult object for detailed inspection.
        analysis_results['feature_matching'] = feature_match_results
        # Log key metrics: the overall similarity and the geometric inlier ratio.
        logging.info(f"  - Feature Match Similarity Ratio: {feature_match_results.similarity_ratio:.4f}")
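        # Test against None explicitly so that a legitimate ratio of 0.0 is not reported as 'N/A'.
        inlier_ratio = feature_match_results.homography_inlier_ratio
        inlier_ratio_text = f"{inlier_ratio:.4f}" if inlier_ratio is not None else "N/A"
        logging.info(f"  - Geometric Inlier Ratio: {inlier_ratio_text}")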
    except (ImageUnreadableError, RuntimeError) as e:
        # Gracefully handle and record errors in the feature detection or matching pipeline.
        analysis_results['feature_matching'] = {'error': str(e), 'details': traceback.format_exc()}
        # Log the failure of this analysis stage.
        logging.error(f"  - Feature Matching Analysis failed: {e}")

    # --- Stage 3: Global Color Distribution Analysis (Palette Similarity) ---
    # This stage compares the overall color palettes, providing a measure of aesthetic similarity.
    # It is the weakest signal for direct copying but useful for stylistic analysis.
    logging.info("Executing Stage 3: Global Color Distribution Analysis...")
    try:
        # Execute color histogram correlation in the HSV space to be robust to lighting changes.
        # Pearson correlation provides a normalized measure of how similarly the color distributions vary.
        histogram_results = detector.histogram_correlation(
            image1_path,
            image2_path,
            metric="correlation",
            color_space="HSV",
            statistical_analysis=True,
            adaptive_binning=True
        )
        # Store the comprehensive results dictionary.
        analysis_results['histogram_correlation'] = histogram_results
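        # Log the primary correlation score for quick assessment.
        # Format only when a score is present; applying ':.4f' to the 'N/A' string would raise ValueError.
        histogram_score = histogram_results.get('similarity_score')
        histogram_score_text = f"{histogram_score:.4f}" if histogram_score is not None else "N/A"
        logging.info(f"  - Histogram Correlation: {histogram_score_text}")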
    except (ImageUnreadableError, HistogramError) as e:
        # Gracefully handle and record errors in histogram computation or comparison.
        analysis_results['histogram_correlation'] = {'error': str(e), 'details': traceback.format_exc()}
        # Log the failure of this analysis stage.
        logging.error(f"  - Histogram Correlation Analysis failed: {e}")

    # --- Stage 4: Semantic Meaning Analysis (Conceptual Similarity) ---
    # This stage uses a deep learning model (CLIP) to measure conceptual similarity.
    # It can identify that a photo of a cat and a painting of a cat are related.
    logging.info("Executing Stage 4: Semantic Meaning Analysis...")
    try:
        # Execute semantic similarity analysis with full statistical and embedding analysis.
        # This provides the most abstract and powerful form of comparison.
        clip_results = detector.clip_embedding_similarity(
            image1_path,
            image2_path,
            statistical_analysis=True,
            embedding_analysis=True,
            batch_processing=True
        )
        # Store the comprehensive results dictionary.
        analysis_results['semantic_similarity'] = clip_results
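        # Log the primary cosine similarity score.
        # Format only when a score is present; applying ':.4f' to the 'N/A' string would raise ValueError.
        clip_score = clip_results.get('cosine_similarity')
        clip_score_text = f"{clip_score:.4f}" if clip_score is not None else "N/A"
        logging.info(f"  - CLIP Cosine Similarity: {clip_score_text}")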
    except (ModelLoadError, ModelInferenceError, ValueError) as e:
        # Gracefully handle and record errors related to model loading or inference.
        analysis_results['semantic_similarity'] = {'error': str(e), 'details': traceback.format_exc()}
        # Log the failure of this analysis stage.
        logging.error(f"  - Semantic Similarity Analysis failed: {e}")

    # --- Stage 5: Public Provenance and Context Analysis (Web Discovery) ---
    # This stage uses web automation to discover if the primary image exists online.
    # It is a discovery process, not a direct comparison, to establish public context.
    logging.info("Executing Stage 5: Public Provenance Analysis...")
    try:
        # Execute a reverse image search on the first image.
        # This is run in headless mode for suitability in automated server environments.
        reverse_search_results = detector.reverse_image_search_google(
            image_path=image1_path,
            driver_path=chromedriver_path,
            headless=True,
            advanced_extraction=True,
            content_analysis=True
        )
        # Store the comprehensive ReverseImageSearchResult object.
        analysis_results['reverse_image_search'] = reverse_search_results
        # Log the primary finding from the search.
        logging.info(f"  - Reverse Search Best Guess: {reverse_search_results.best_guess}")
        logging.info(f"  - Found {len(reverse_search_results.similar_image_urls)} similar images online.")
    except (LaunchError, NavigationError, UploadError, ExtractionError) as e:
        # Gracefully handle and record the various failure modes of web automation.
        analysis_results['reverse_image_search'] = {'error': str(e), 'details': traceback.format_exc()}
        # Log the failure of this analysis stage.
        logging.error(f"  - Reverse Image Search failed: {e}")
    except Exception as e:
        # Catch any other unexpected errors during the search to prevent crashing.
        analysis_results['reverse_image_search'] = {'error': f"An unexpected error occurred: {e}", 'details': traceback.format_exc()}
        # Log the unexpected failure.
        logging.error(f"  - Reverse Image Search encountered an unexpected error: {e}")

    # Finalize the analysis process.
    logging.info("Comprehensive provenance analysis complete.")
    # Return the aggregated results dictionary.
    return analysis_results

if __name__ == '__main__':
    # --- Setup: Environment and Test Data ---
    # Configure a basic logger to direct output to the console for this demonstration.
    logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')

    # Create a local directory for test assets (it is not removed afterwards).
    test_dir = Path("./test_images")
    test_dir.mkdir(exist_ok=True)

    # Programmatically create two test images for a controlled experiment.
    # Image 1: A base image with a distinct feature.
    # Image 2: A transformed version of Image 1 (rotated, color-shifted, compressed).
    try:
        # Define image parameters.
        img_size = (512, 512)
        # Create a base image using NumPy: a dark gray background.
        base_image_array = np.full((img_size[1], img_size[0], 3), 64, dtype=np.uint8)
        # Add a distinct feature: a bright blue circle in the center.
        cv2.circle(base_image_array, (img_size[0]//2, img_size[1]//2), 100, (255, 100, 50), -1)
        # Save the first image as a high-quality PNG.
        image1_path = test_dir / "original_image.png"
        cv2.imwrite(str(image1_path), base_image_array)

        # Create the second, modified image.
        # Start with the base image.
        modified_image_array = base_image_array.copy()
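        # Apply a slight color shift to the entire image (per-channel BGR offsets).
        # The offsets are signed and a uint8 array cannot represent -15, so add them
        # in int16 and clip back to the valid 0-255 range before converting to uint8.
        color_shift = np.array([10, 5, -15], dtype=np.int16)
        modified_image_array = np.clip(
            modified_image_array.astype(np.int16) + color_shift, 0, 255
        ).astype(np.uint8)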
        # Get the matrix for a 5-degree rotation about the center combined with a 0.95 scale-down.
        rotation_matrix = cv2.getRotationMatrix2D((img_size[0]//2, img_size[1]//2), 5, 0.95)
        # Apply the affine transformation (rotation and scaling).
        modified_image_array = cv2.warpAffine(modified_image_array, rotation_matrix, img_size)
        # Save the second image as a JPEG at quality 90 to introduce mild compression artifacts.
        image2_path = test_dir / "modified_image.jpg"
        cv2.imwrite(str(image2_path), modified_image_array, [cv2.IMWRITE_JPEG_QUALITY, 90])

        logging.info(f"Test images created: '{image1_path}' and '{image2_path}'")

        # --- Execution: Instantiate Detector and Run Analysis ---
        # IMPORTANT: The user must provide a valid path to their local ChromeDriver executable.
        # Download from: https://googlechromelabs.github.io/chrome-for-testing/
        chromedriver_path = Path("./chromedriver") # Assumes chromedriver is in the current directory.
        if not chromedriver_path.exists():
            logging.error("="*80)
            logging.error("FATAL: ChromeDriver not found at the specified path.")
            logging.error(f"Please download the correct version for your Chrome browser and place it at: '{chromedriver_path.resolve()}'")
            logging.error("Download from: https://googlechromelabs.github.io/chrome-for-testing/")
            logging.error("="*80)
            # Exit if the driver is not available, as the reverse search will fail.
            sys.exit(1)

        # Instantiate the main detector with balanced resource constraints,
        # performance monitoring enabled, and the production validation policy.
        detector = ImageSimilarityDetector(
            resource_constraints=ResourceConstraints.BALANCED,
            enable_performance_monitoring=True,
            validation_policy=ValidationPolicy.PRODUCTION
        )

        # Execute the comprehensive analysis function.
        full_results = demonstrate_image_provenance_analysis(
            detector,
            image1_path,
            image2_path,
            chromedriver_path
        )

        # --- Output: Display Results ---
        # Define a custom serializer to handle complex objects like dataclasses and NumPy arrays.
        def result_serializer(obj):
            if isinstance(obj, (Path, np.ndarray)):
                return str(obj)
            if hasattr(obj, 'to_dict'):
                return obj.to_dict()
            if isinstance(obj, Exception):
                return f"{type(obj).__name__}: {str(obj)}"
            return obj.__dict__ if hasattr(obj, '__dict__') else str(obj)

        # Print the aggregated results in a clean, human-readable JSON format.
        print("\n" + "="*40 + " ANALYSIS RESULTS " + "="*40)
        print(json.dumps(full_results, default=result_serializer, indent=2))
        print("="*100)

    except Exception as e:
        # Catch any top-level exceptions during the demonstration setup or execution.
        # logging.critical is the documented name; logging.fatal is only an alias for it.
        logging.critical(f"The demonstration script encountered a fatal error: {e}")
        logging.critical(traceback.format_exc())

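The color-shift step in the demo applies per-channel offsets to a `uint8` image, and one of those offsets is negative. Below is a minimal NumPy-only sketch of adding signed offsets with saturation; `shift_colors` is a hypothetical helper introduced for illustration, not part of the toolkit.

```python
import numpy as np


def shift_colors(image: np.ndarray, offsets: tuple) -> np.ndarray:
    """Add signed per-channel offsets to a uint8 image, saturating at 0 and 255."""
    # Widen to int16 so negative offsets and sums above 255 are representable.
    shifted = image.astype(np.int16) + np.asarray(offsets, dtype=np.int16)
    # Clip to the valid range, then narrow back to uint8.
    return np.clip(shifted, 0, 255).astype(np.uint8)


# A 2x2 three-channel image: dark gray everywhere except one bright pixel.
image = np.full((2, 2, 3), 64, dtype=np.uint8)
image[0, 0] = (250, 250, 10)

result = shift_colors(image, (10, 5, -15))
print(result[1, 1])  # gray pixel: each channel moves by its offset
print(result[0, 0])  # bright pixel: values saturate instead of wrapping around
```

Widening before the addition matters because plain `uint8` arithmetic in NumPy wraps modulo 256, which would turn a small darkening offset into a large brightening one.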

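`json.dumps` calls its `default` hook only for objects it cannot serialize natively, which is what lets the demo print result objects and paths. The sketch below shows that mechanism with the standard library alone; `SearchSummary` and `to_jsonable` are hypothetical names chosen for illustration and do not reflect the toolkit's own result types.

```python
import dataclasses
import json
from datetime import datetime, timezone
from pathlib import Path


@dataclasses.dataclass
class SearchSummary:
    """Hypothetical result type standing in for a toolkit result object."""
    best_guess: str
    source: Path
    searched_at: datetime


def to_jsonable(obj):
    """Fallback for json.dumps: convert unsupported objects to JSON-friendly values."""
    if dataclasses.is_dataclass(obj) and not isinstance(obj, type):
        # Shallow field mapping; nested values are passed back through this hook.
        return {f.name: getattr(obj, f.name) for f in dataclasses.fields(obj)}
    if isinstance(obj, datetime):
        return obj.isoformat()
    if isinstance(obj, Path):
        return obj.as_posix()
    # Last resort: a readable string instead of a TypeError.
    return str(obj)


summary = SearchSummary(
    best_guess="blue circle",
    source=Path("test_images/original_image.png"),
    searched_at=datetime(2025, 1, 1, tzinfo=timezone.utc),
)
encoded = json.dumps({"reverse_image_search": summary}, default=to_jsonable, indent=2)
print(encoded)
```

Returning a plain dict from the hook is enough: the encoder walks the returned value and calls the hook again for any field it still cannot handle, such as the `Path` and `datetime` values here.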