# Custom Comparator Demo: Creating and Using Your Own Comparators

This notebook demonstrates how to create custom comparators in the stickler library and shows how different comparators produce different evaluation results.

## Overview

Comparators are a core component of the stickler library. They determine how individual field values are compared and scored. The library comes with several built-in comparators (`ExactComparator`, `LevenshteinComparator`, etc.), but you can also create custom ones for specialized comparison logic.

**What we'll cover:**
1. Create a custom `NumericProximityComparator`
2. Test it against built-in comparators
3. Use it in real `StructuredModel` comparisons
4. See how different comparators affect evaluation results
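
Conceptually, a comparator maps two values to a similarity score between 0.0 and 1.0 and carries a threshold for turning that score into a match/no-match decision. The sketch below illustrates only that idea. `SketchComparator` and its `is_match` helper are hypothetical stand-ins written for this example; they are not stickler's `BaseComparator` and make no claim about its API.

```python
from typing import Any, Callable


class SketchComparator:
    """Stand-in for the comparator idea; NOT stickler's BaseComparator."""

    def __init__(self, score_fn: Callable[[Any, Any], float], threshold: float = 0.7):
        self.score_fn = score_fn
        self.threshold = threshold

    def compare(self, value1: Any, value2: Any) -> float:
        # Clamp so callers can rely on a score in [0.0, 1.0].
        return min(1.0, max(0.0, float(self.score_fn(value1, value2))))

    def is_match(self, value1: Any, value2: Any) -> bool:
        # Hypothetical helper: binary decision derived from the score.
        return self.compare(value1, value2) >= self.threshold


# An exact-match scorer: 1.0 when the values are equal, otherwise 0.0.
exact = SketchComparator(lambda a, b: 1.0 if a == b else 0.0, threshold=1.0)
print(exact.compare("apple", "apple"), exact.is_match("apple", "apples"))
```

Swapping in a different `score_fn` changes how lenient the comparison is without touching the thresholding logic, which is the same separation the rest of this notebook relies on.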

## 1. Setup and Imports

In [1]:
import sys
import os

# Add the src directory to path so we can import stickler
sys.path.insert(0, os.path.join(os.path.dirname(os.getcwd()), "src"))

from typing import Any
import pandas as pd

# Import stickler library components
from stickler.comparators.base import BaseComparator
from stickler.comparators.exact import ExactComparator
from stickler.comparators.levenshtein import LevenshteinComparator
from stickler.structured_object_evaluator.models.structured_model import StructuredModel
from stickler.structured_object_evaluator.models.comparable_field import ComparableField

print("✓ Successfully imported stickler library components")

✓ Successfully imported stickler library components


## 2. Create Custom NumericProximityComparator

Let's create a custom comparator that scores numeric values based on how close they are to each other. This is useful when you want to consider numbers "similar" if they're within a certain range, rather than requiring exact matches.

In [2]:
class NumericProximityComparator(BaseComparator):
    """A custom comparator that scores numbers based on proximity.

    This comparator returns higher similarity scores for numbers that are
    closer in value. It's useful for comparing prices, measurements, or
    other numeric data where exact matches aren't required.
    """

    def __init__(self, tolerance_percent: float = 0.1, threshold: float = 0.7):
        """Initialize the comparator.

        Args:
            tolerance_percent: Relative difference (as a fraction) up to which
                              the score stays between 1.0 and 0.8
                              (e.g., 0.1 = differences up to 10% score >= 0.8)
            threshold: Similarity threshold for binary classification
        """
        super().__init__(threshold=threshold)
        self.tolerance_percent = tolerance_percent

    def compare(self, value1: Any, value2: Any) -> float:
        """Compare two values based on numeric proximity.

        Args:
            value1: First value (will be converted to float)
            value2: Second value (will be converted to float)

        Returns:
            Similarity score between 0.0 and 1.0
        """
        # Handle None values
        if value1 is None and value2 is None:
            return 1.0
        if value1 is None or value2 is None:
            return 0.0

        try:
            # Convert to float
            num1 = float(value1)
            num2 = float(value2)

            # Handle exact match
            if num1 == num2:
                return 1.0

            # Handle a single zero value. (Both-zero is already covered by the
            # equality check above.) The symmetric relative difference is always
            # 2.0 when one value is zero, so use an absolute tolerance of
            # tolerance_percent * 100 instead (10.0 for the default 0.1) and
            # scale linearly from 1.0 down to 0.0.
            if num1 == 0.0 or num2 == 0.0:
                abs_diff = max(abs(num1), abs(num2))
                abs_tolerance = self.tolerance_percent * 100
                if abs_diff <= abs_tolerance:
                    return 1.0 - (abs_diff / abs_tolerance)
                return 0.0

            # Symmetric relative difference: the gap divided by the mean magnitude
            avg_value = (abs(num1) + abs(num2)) / 2
            percent_diff = abs(num1 - num2) / avg_value

            # Convert to similarity score
            if percent_diff <= self.tolerance_percent:
                # Linear scale from 1.0 to 0.8 within tolerance
                return 1.0 - (percent_diff / self.tolerance_percent) * 0.2
            else:
                # Exponential decay beyond tolerance
                excess = percent_diff - self.tolerance_percent
                return max(0.0, 0.8 * (0.5 ** (excess * 10)))

        except (ValueError, TypeError):
            # Fallback to string comparison if conversion fails
            str1 = str(value1)
            str2 = str(value2)
            return 1.0 if str1 == str2 else 0.0

    def __repr__(self) -> str:
        return f"NumericProximityComparator(tolerance_percent={self.tolerance_percent}, threshold={self.threshold})"


print("✓ Created NumericProximityComparator class")

✓ Created NumericProximityComparator class
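
Written out, the score that `compare` computes for two distinct, non-zero numbers can be summarized as below. The symbols $a$, $b$, $d$, $s$, and $\tau$ (for `tolerance_percent`) are notation introduced here for illustration; the constants come straight from the code above.

$$
d = \frac{|a - b|}{\tfrac{1}{2}\,\bigl(|a| + |b|\bigr)}, \qquad
s(d) =
\begin{cases}
1 - 0.2\,\dfrac{d}{\tau} & \text{if } d \le \tau, \\[6pt]
0.8 \cdot 0.5^{\,10\,(d - \tau)} & \text{if } d > \tau.
\end{cases}
$$

Both branches give 0.8 at $d = \tau$, so the curve is continuous, and every additional 0.1 of relative difference beyond the tolerance halves the score.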


## 3. Basic Comparator Testing

Let's test our custom comparator with some simple examples and compare it to built-in comparators.

In [3]:
# Create instances of different comparators
numeric_comp = NumericProximityComparator(
    tolerance_percent=0.1, threshold=0.7
)  # 10% tolerance
exact_comp = ExactComparator(threshold=1.0)
levenshtein_comp = LevenshteinComparator(threshold=0.7)

# Test data: numeric values with different levels of similarity
test_pairs = [
    (100.0, 100.0),  # Identical
    (100.0, 105.0),  # ~4.9% symmetric difference (within tolerance)
    (100.0, 110.0),  # ~9.5% symmetric difference (just inside the 10% tolerance)
    (100.0, 120.0),  # ~18.2% symmetric difference (beyond tolerance)
    (100.0, 150.0),  # 40% symmetric difference (large difference)
    ("100.0", "105.0"),  # String versions
    ("apple", "apple"),  # Non-numeric identical
    ("apple", "apples"),  # Non-numeric similar
    (None, None),  # Both None
    (100.0, None),  # One None
]

print("Comparator Testing Results:")
print("=" * 80)
print(
    f"{'Value 1':<12} {'Value 2':<12} {'Numeric':<12} {'Exact':<12} {'Levenshtein':<12}"
)
print("-" * 80)

for val1, val2 in test_pairs:
    numeric_score = numeric_comp.compare(val1, val2)
    exact_score = exact_comp.compare(val1, val2)
    levenshtein_score = levenshtein_comp.compare(val1, val2)

    print(
        f"{str(val1):<12} {str(val2):<12} {numeric_score:<12.3f} {exact_score:<12.3f} {levenshtein_score:<12.3f}"
    )

print("\n💡 Key Observations:")
print("• NumericProximityComparator gives partial credit for close numbers")
print("• ExactComparator only gives 1.0 for perfect matches")
print("• LevenshteinComparator treats numbers as strings (character-level similarity)")

Comparator Testing Results:
Value 1      Value 2      Numeric      Exact        Levenshtein 
--------------------------------------------------------------------------------
100.0        100.0        1.000        1.000        1.000       
100.0        105.0        0.902        0.000        0.800       
100.0        110.0        0.810        0.000        0.800       
100.0        120.0        0.454        0.000        0.800       
100.0        150.0        0.100        0.000        0.800       
100.0        105.0        0.902        0.000        0.800       
apple        apple        1.000        1.000        1.000       
apple        apples       0.000        0.000        0.833       
None         None         1.000        1.000        1.000       
100.0        None         0.000        0.000        0.000       

💡 Key Observations:
• NumericProximityComparator gives partial credit for close numbers
• ExactComparator only gives 1.0 for perfect matches
• LevenshteinComparator treats numbers as strings (character-level similarity)
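
The Numeric column can be checked by hand. The sketch below is a standalone copy of the non-zero scoring branch, so it runs without stickler installed; `proximity_score` is a hypothetical helper name used only for this check, and it omits the `None`, zero, and string-fallback handling of the full class.

```python
def proximity_score(a: float, b: float, tolerance: float = 0.1) -> float:
    """Standalone copy of the non-zero branch of the scoring logic."""
    if a == b:
        return 1.0
    # Symmetric relative difference: gap divided by the mean magnitude.
    rel_diff = abs(a - b) / ((abs(a) + abs(b)) / 2)
    if rel_diff <= tolerance:
        # Linear scale from 1.0 down to 0.8 at the tolerance boundary.
        return 1.0 - (rel_diff / tolerance) * 0.2
    # Beyond the tolerance, halve the score for every extra 0.1 of difference.
    return 0.8 * 0.5 ** ((rel_diff - tolerance) * 10)


for other in (105.0, 110.0, 120.0, 150.0):
    rel = abs(100.0 - other) / ((100.0 + other) / 2)
    print(f"100.0 vs {other}: rel_diff={rel:.4f}  score={proximity_score(100.0, other):.3f}")
```

Because the difference is measured against the mean of the two values, 100.0 vs. 110.0 counts as roughly 9.5% rather than 10%, which is why that pair scores slightly above 0.8.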

## 4. Real-World Example: Product Comparison

Now let's use our custom comparator in a practical scenario. We'll create a Product model and compare products with similar names but different prices.

In [4]:
# First, let's create three different Product models using different price comparators


class ProductWithExactPrice(StructuredModel):
    """Product model using ExactComparator for price matching."""

    name: str = ComparableField(
        comparator=LevenshteinComparator(threshold=0.8), weight=2.0
    )
    price: float = ComparableField(
        comparator=ExactComparator(threshold=1.0), weight=3.0
    )


class ProductWithFuzzyPrice(StructuredModel):
    """Product model using NumericProximityComparator for price matching."""

    name: str = ComparableField(
        comparator=LevenshteinComparator(threshold=0.8), weight=2.0
    )
    price: float = ComparableField(
        comparator=NumericProximityComparator(
            tolerance_percent=0.15, threshold=0.7
        ),  # 15% tolerance
        weight=3.0,
    )


class ProductWithStringPrice(StructuredModel):
    """Product model using LevenshteinComparator for price matching."""

    name: str = ComparableField(
        comparator=LevenshteinComparator(threshold=0.8), weight=2.0
    )
    price: float = ComparableField(
        comparator=LevenshteinComparator(threshold=0.7), weight=3.0
    )


print("✓ Created three Product model variants with different price comparators")

✓ Created three Product model variants with different price comparators


In [5]:
# Create test products
print("Test Products:")
print("=" * 50)

# Ground truth product
gt_name = "Apple MacBook Pro 16-inch"
gt_price = 2499.99
print(f"Ground Truth: {gt_name} - ${gt_price}")

# Various predictions with different price differences
predictions = [
    ("Apple MacBook Pro 16-inch", 2499.99),  # Perfect match
    ("Apple MacBook Pro 16-inch", 2399.99),  # $100 less (4% diff)
    ("Apple MacBook Pro 16-inch", 2599.99),  # $100 more (4% diff)
    ("Apple MacBook Pro 16-inch", 2699.99),  # $200 more (8% diff)
    ("Apple MacBook Pro 16inch", 2499.99),  # Slightly different name, same price
    ('MacBook Pro 16"', 2199.99),  # Different name, $300 less (12% diff)
]

for i, (name, price) in enumerate(predictions, 1):
    price_diff = abs(price - gt_price)
    price_diff_pct = (price_diff / gt_price) * 100
    print(
        f"Prediction {i}: {name} - ${price} (${price_diff:.0f} diff, {price_diff_pct:.1f}%)"
    )

Test Products:
Ground Truth: Apple MacBook Pro 16-inch - $2499.99
Prediction 1: Apple MacBook Pro 16-inch - $2499.99 ($0 diff, 0.0%)
Prediction 2: Apple MacBook Pro 16-inch - $2399.99 ($100 diff, 4.0%)
Prediction 3: Apple MacBook Pro 16-inch - $2599.99 ($100 diff, 4.0%)
Prediction 4: Apple MacBook Pro 16-inch - $2699.99 ($200 diff, 8.0%)
Prediction 5: Apple MacBook Pro 16inch - $2499.99 ($0 diff, 0.0%)
Prediction 6: MacBook Pro 16" - $2199.99 ($300 diff, 12.0%)


## 5. Side-by-Side Comparison Results

Now let's compare all predictions using each of our three Product models and see how the different price comparators affect the overall similarity scores.

In [6]:
# Create ground truth products for each model type
gt_exact = ProductWithExactPrice(name=gt_name, price=gt_price)
gt_fuzzy = ProductWithFuzzyPrice(name=gt_name, price=gt_price)
gt_string = ProductWithStringPrice(name=gt_name, price=gt_price)

# Store results for comparison
results = []

print("Detailed Comparison Results:")
print("=" * 100)

for i, (pred_name, pred_price) in enumerate(predictions, 1):
    # Create prediction products for each model type
    pred_exact = ProductWithExactPrice(name=pred_name, price=pred_price)
    pred_fuzzy = ProductWithFuzzyPrice(name=pred_name, price=pred_price)
    pred_string = ProductWithStringPrice(name=pred_name, price=pred_price)

    # Compare using each model
    result_exact = gt_exact.compare_with(pred_exact)
    result_fuzzy = gt_fuzzy.compare_with(pred_fuzzy)
    result_string = gt_string.compare_with(pred_string)

    # Store results
    results.append(
        {
            "Prediction": f"Pred {i}",
            "Name": pred_name[:25] + "..." if len(pred_name) > 25 else pred_name,
            "Price": f"${pred_price}",
            "Price_Diff_Pct": f"{abs(pred_price - gt_price) / gt_price * 100:.1f}%",
            "Exact_Price_Score": f"{result_exact['field_scores']['price']:.3f}",
            "Fuzzy_Price_Score": f"{result_fuzzy['field_scores']['price']:.3f}",
            "String_Price_Score": f"{result_string['field_scores']['price']:.3f}",
            "Exact_Overall": f"{result_exact['overall_score']:.3f}",
            "Fuzzy_Overall": f"{result_fuzzy['overall_score']:.3f}",
            "String_Overall": f"{result_string['overall_score']:.3f}",
        }
    )

    print(f"\nPrediction {i}: {pred_name} - ${pred_price}")
    print(
        f"  Price Scores:  Exact={result_exact['field_scores']['price']:.3f}  |  "
        f"Fuzzy={result_fuzzy['field_scores']['price']:.3f}  |  "
        f"String={result_string['field_scores']['price']:.3f}"
    )
    print(
        f"  Overall Scores: Exact={result_exact['overall_score']:.3f}  |  "
        f"Fuzzy={result_fuzzy['overall_score']:.3f}  |  "
        f"String={result_string['overall_score']:.3f}"
    )

    # Show threshold-based classification
    exact_match = "✓" if result_exact["overall_score"] >= 0.8 else "✗"
    fuzzy_match = "✓" if result_fuzzy["overall_score"] >= 0.8 else "✗"
    string_match = "✓" if result_string["overall_score"] >= 0.8 else "✗"
    print(
        f"  Match @0.8:     Exact={exact_match}      |  Fuzzy={fuzzy_match}      |  String={string_match}"
    )

Detailed Comparison Results:

Prediction 1: Apple MacBook Pro 16-inch - $2499.99
  Price Scores:  Exact=1.000  |  Fuzzy=1.000  |  String=1.000
  Overall Scores: Exact=1.000  |  Fuzzy=1.000  |  String=1.000
  Match @0.8:     Exact=✓      |  Fuzzy=✓      |  String=✓

Prediction 2: Apple MacBook Pro 16-inch - $2399.99
  Price Scores:  Exact=0.000  |  Fuzzy=0.857  |  String=0.857
  Overall Scores: Exact=0.400  |  Fuzzy=0.914  |  String=0.914
  Match @0.8:     Exact=✗      |  Fuzzy=✓      |  String=✓

Prediction 3: Apple MacBook Pro 16-inch - $2599.99
  Price Scores:  Exact=0.000  |  Fuzzy=0.857  |  String=0.857
  Overall Scores: Exact=0.400  |  Fuzzy=0.914  |  String=0.914
  Match @0.8:     Exact=✗      |  Fuzzy=✓      |  String=✓

Prediction 4: Apple MacBook Pro 16-inch - $2699.99
  Price Scores:  Exact=0.000  |  Fuzzy=0.857  |  String=0.857
  Overall Scores: Exact=0.400  |  Fuzzy=0.914  |  String=0.914
  Match @0.8:     Exact=✗      |  Fuzzy=✓      |  String=✓

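Prediction 5: Apple MacBook Pro 16inch - $2499.99
  Price Scores:  Exact=1.000  |  Fuzzy=1.000  |  String=1.000
  Overall Scores: Exact=0.984  |  Fuzzy=0.984  |  String=0.984
  Match @0.8:     Exact=✓      |  Fuzzy=✓      |  String=✓

Prediction 6: MacBook Pro 16" - $2199.99
  Price Scores:  Exact=0.000  |  Fuzzy=0.857  |  String=0.857
  Overall Scores: Exact=0.224  |  Fuzzy=0.738  |  String=0.738
  Match @0.8:     Exact=✗      |  Fuzzy=✗      |  String=✗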

## 6. Summary Table

Let's create a clean summary table to see the differences at a glance.

In [7]:
# Create a pandas DataFrame for better visualization
df = pd.DataFrame(results)

print("\n📊 SUMMARY: How Different Price Comparators Affect Product Similarity")
print("=" * 85)
print(
    df[
        [
            "Prediction",
            "Price",
            "Price_Diff_Pct",
            "Exact_Overall",
            "Fuzzy_Overall",
            "String_Overall",
        ]
    ].to_string(index=False)
)

print("\n📋 Price Comparator Details:")
print("• Exact:  ExactComparator - Only perfect matches get 1.0")
print("• Fuzzy:  NumericProximityComparator - Gradual scoring based on % difference")
print(
    "• String: LevenshteinComparator - Treats prices as strings (character similarity)"
)


📊 SUMMARY: How Different Price Comparators Affect Product Similarity
Prediction    Price Price_Diff_Pct Exact_Overall Fuzzy_Overall String_Overall
    Pred 1 $2499.99           0.0%         1.000         1.000          1.000
    Pred 2 $2399.99           4.0%         0.400         0.914          0.914
    Pred 3 $2599.99           4.0%         0.400         0.914          0.914
    Pred 4 $2699.99           8.0%         0.400         0.914          0.914
    Pred 5 $2499.99           0.0%         0.984         0.984          0.984
    Pred 6 $2199.99          12.0%         0.224         0.738          0.738

📋 Price Comparator Details:
• Exact:  ExactComparator - Only perfect matches get 1.0
• Fuzzy:  NumericProximityComparator - Gradual scoring based on % difference
• String: LevenshteinComparator - Treats prices as strings (character similarity)
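
The overall scores in the table are consistent with a weight-normalized average of the per-field scores, using the weights declared on the models (2.0 for `name`, 3.0 for `price`). That reading is inferred from the recorded output, not from stickler's source, and `weighted_overall` below is a hypothetical helper written only to reproduce the arithmetic.

```python
from typing import Mapping


def weighted_overall(field_scores: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weight-normalized average of per-field scores (an inferred formula)."""
    total_weight = sum(weights.values())
    return sum(field_scores[name] * weights[name] for name in weights) / total_weight


weights = {"name": 2.0, "price": 3.0}  # weights from the Product models above

# Pred 2 with the exact price comparator: name matches, price scores 0.0.
print(f"{weighted_overall({'name': 1.0, 'price': 0.0}, weights):.3f}")    # 0.400
# Pred 2 with a price score of 0.857, as recorded above.
print(f"{weighted_overall({'name': 1.0, 'price': 0.857}, weights):.3f}")  # 0.914
```

Since `price` carries 60% of the total weight here, a price score of 0.0 caps the overall score at 0.400 no matter how well the name matches.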


## 7. Binary Classification Analysis

Let's see how the different comparators perform for binary classification (match vs. no-match) at different thresholds.

In [8]:
# Analyze binary classification at different thresholds
thresholds = [0.7, 0.8, 0.9]

print("🎯 Binary Classification Results (Match vs No-Match)")
print("=" * 70)

for threshold in thresholds:
    print(f"\nThreshold: {threshold}")
    print(f"{'Prediction':<12} {'Exact':<8} {'Fuzzy':<8} {'String':<8}")
    print("-" * 40)

    for i, result in enumerate(results, 1):
        exact_match = "Match" if float(result["Exact_Overall"]) >= threshold else "No"
        fuzzy_match = "Match" if float(result["Fuzzy_Overall"]) >= threshold else "No"
        string_match = "Match" if float(result["String_Overall"]) >= threshold else "No"

        print(f"Pred {i:<8} {exact_match:<8} {fuzzy_match:<8} {string_match:<8}")

print("\n💡 Key Insights:")
print("• Custom comparators can be more lenient or strict than built-in ones")
print("• NumericProximityComparator gives partial credit for close values")
print("• Choice of comparator significantly impacts match/no-match decisions")
print("• Consider your use case: strict matching vs. tolerant matching")

🎯 Binary Classification Results (Match vs No-Match)

Threshold: 0.7
Prediction   Exact    Fuzzy    String  
----------------------------------------
Pred 1        Match    Match    Match   
Pred 2        No       Match    Match   
Pred 3        No       Match    Match   
Pred 4        No       Match    Match   
Pred 5        Match    Match    Match   
Pred 6        No       Match    Match   

Threshold: 0.8
Prediction   Exact    Fuzzy    String  
----------------------------------------
Pred 1        Match    Match    Match   
Pred 2        No       Match    Match   
Pred 3        No       Match    Match   
Pred 4        No       Match    Match   
Pred 5        Match    Match    Match   
Pred 6        No       No       No      

Threshold: 0.9
Prediction   Exact    Fuzzy    String  
----------------------------------------
Pred 1        Match    Match    Match   
Pred 2        No       Match    Match   
Pred 3        No       Match    Match   
Pred 4        No       Match    Match   
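Pred 5        Match    Match    Match   
Pred 6        No       No       No      

💡 Key Insights:
• Custom comparators can be more lenient or strict than built-in ones
• NumericProximityComparator gives partial credit for close values
• Choice of comparator significantly impacts match/no-match decisions
• Consider your use case: strict matching vs. tolerant matching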

## 8. Key Takeaways

### When to Create Custom Comparators

1. **Domain-Specific Logic**: When you need specialized comparison logic (like our numeric proximity example)
2. **Better Tolerance**: When built-in comparators are too strict or too lenient
3. **Performance**: When you need optimized comparison for specific data types
4. **Complex Rules**: When simple string/numeric comparison isn't enough

### Best Practices

1. **Inherit from `BaseComparator`**: Always extend the base class
2. **Handle Edge Cases**: None values, type conversion failures
3. **Return 0.0-1.0**: Ensure similarity scores are in the correct range
4. **Test Thoroughly**: Test with various inputs and edge cases
5. **Document Well**: Clear docstrings and parameter explanations
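
The range and edge-case practices above can be turned into a small reusable check. The sketch below is one possible approach, not part of stickler: `check_comparator_contract`, `ratio_score`, and `exact_score` are hypothetical names, and treating symmetry as a requirement is an assumption that suits comparators like the one in this notebook but may not suit every design.

```python
from typing import Any, Callable, List, Sequence, Tuple


def check_comparator_contract(
    compare: Callable[[Any, Any], float],
    pairs: Sequence[Tuple[Any, Any]],
) -> List[str]:
    """Return human-readable violations (an empty list means all checks passed)."""
    problems = []
    for a, b in pairs:
        score = compare(a, b)
        if not 0.0 <= score <= 1.0:
            problems.append(f"score {score!r} out of range for ({a!r}, {b!r})")
        if score != compare(b, a):
            problems.append(f"asymmetric result for ({a!r}, {b!r})")
        if compare(a, a) != 1.0:
            problems.append(f"{a!r} is not a perfect match with itself")
    return problems


def ratio_score(a: Any, b: Any) -> float:
    """Deliberately flawed example: asymmetric and not bounded by 1.0."""
    return float(b) / float(a)


def exact_score(a: Any, b: Any) -> float:
    return 1.0 if a == b else 0.0


pairs = [(100.0, 105.0), (2.0, 8.0)]
print(check_comparator_contract(exact_score, pairs))  # no violations
for problem in check_comparator_contract(ratio_score, pairs):
    print(problem)
```

To exercise a stickler comparator with the same helper, its bound `compare` method could be passed in place of `exact_score`.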

### Impact on Evaluation

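As we saw in this demo:
- **ExactComparator**: Very strict. Any price difference scores 0.0 on that field, which dropped the overall score to 0.400 even when the name matched exactly.
- **NumericProximityComparator**: More nuanced. Called directly in section 3, it scored 100.0 vs. 105.0 at 0.902 and 100.0 vs. 150.0 at 0.100.
- **LevenshteinComparator**: String-based, so it can give unexpected results for numbers. It scored 100.0 vs. 105.0 and 100.0 vs. 150.0 identically (0.800) because each pair differs by a single character.

One caveat about the recorded model-level results in sections 5 to 7: the Fuzzy and String columns are identical for every prediction (0.857 for each non-matching price), whereas calling `NumericProximityComparator(tolerance_percent=0.15)` directly on 2499.99 and 2399.99 works out to about 0.946. If you see the same pattern in your own run, check that the custom comparator is actually being applied to the `price` field.

Choose your comparators based on your specific evaluation needs!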
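
To see why a string-based comparator can score numerically distant values the same as close ones, the sketch below computes an edit-distance similarity from scratch. Normalizing by the longer string's length is an assumption that happens to reproduce the recorded scores from section 3; stickler's `LevenshteinComparator` may be implemented differently, and both function names are hypothetical.

```python
def levenshtein_distance(s: str, t: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        current = [i]
        for j, ct in enumerate(t, start=1):
            current.append(
                min(
                    previous[j] + 1,               # deletion
                    current[j - 1] + 1,            # insertion
                    previous[j - 1] + (cs != ct),  # substitution (free if equal)
                )
            )
        previous = current
    return previous[-1]


def normalized_similarity(a, b) -> float:
    """1 - distance / longer length; an assumed normalization."""
    s, t = str(a), str(b)
    longest = max(len(s), len(t))
    return 1.0 if longest == 0 else 1.0 - levenshtein_distance(s, t) / longest


for a, b in [(100.0, 105.0), (100.0, 150.0), ("apple", "apples")]:
    print(f"{a} vs {b}: {normalized_similarity(a, b):.3f}")
```

Both numeric pairs differ by one character out of five, so they land on the same score even though one gap is ten times larger than the other.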

## 9. Next Steps

Try experimenting with:

1. **Different Tolerance Values**: Modify `tolerance_percent` in `NumericProximityComparator`
2. **Other Custom Comparators**: Create date comparators, percentage comparators, etc.
3. **Mixed Comparators**: Use different comparators for different fields in the same model
4. **Threshold Tuning**: Find optimal thresholds for your specific use case

The comparator system is highly flexible - you can implement any comparison logic you need!
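
As a starting point for the date comparator idea, here is a minimal sketch of the scoring logic on its own. Everything in it is hypothetical: the `date_proximity_score` name, the `max_days` parameter, and the linear fade are illustrative choices, and it accepts only `datetime.date` objects or ISO-format strings (`YYYY-MM-DD`).

```python
from datetime import date


def date_proximity_score(value1, value2, max_days: int = 30) -> float:
    """Hypothetical scorer: 1.0 for the same day, fading linearly to 0.0 at max_days apart."""
    if value1 is None and value2 is None:
        return 1.0
    if value1 is None or value2 is None:
        return 0.0
    try:
        d1 = value1 if isinstance(value1, date) else date.fromisoformat(str(value1))
        d2 = value2 if isinstance(value2, date) else date.fromisoformat(str(value2))
    except ValueError:
        # Unparseable input: fall back to exact string equality.
        return 1.0 if str(value1) == str(value2) else 0.0
    gap_days = abs((d1 - d2).days)
    return max(0.0, 1.0 - gap_days / max_days)


print(date_proximity_score("2024-03-01", "2024-03-01"))  # same day
print(date_proximity_score("2024-03-01", "2024-03-16"))  # 15 days apart
print(date_proximity_score("2024-03-01", "2024-06-01"))  # well beyond max_days
```

The same body could be placed inside the `compare` method of a `BaseComparator` subclass, following the pattern from section 2.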