# Threshold Tuning Guide

This notebook explains how to tune the verification threshold for your use case.

## Table of Contents
1. Understanding Thresholds
2. Impact of Threshold on Results
3. Finding Optimal Threshold
4. Use Case Recommendations
5. Custom Threshold Configuration

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from faceverify import FaceVerifier
from faceverify.config import VerifierConfig

## 1. Understanding Thresholds

The threshold determines when two faces are considered a match:

- Similarity >= Threshold --> VERIFIED (same person)
- Similarity < Threshold --> NOT VERIFIED (different people)

Trade-offs:
- **Higher threshold**: Fewer false positives, more false negatives (stricter)
- **Lower threshold**: Fewer false negatives, more false positives (looser)
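
The trade-off becomes concrete when you count errors at two thresholds. The sketch below is illustrative only: the scores are made up rather than produced by FaceVerify, and `error_rates` is a hypothetical helper, not part of the library. It applies the same `>=` rule described above.

```python
import numpy as np

def error_rates(same_scores, diff_scores, threshold):
    """Return (false_negative_rate, false_positive_rate) at a threshold."""
    same_scores = np.asarray(same_scores, dtype=float)
    diff_scores = np.asarray(diff_scores, dtype=float)
    # False negative: a same-person pair scored below the threshold
    fnr = float(np.mean(same_scores < threshold))
    # False positive: a different-person pair scored at or above the threshold
    fpr = float(np.mean(diff_scores >= threshold))
    return fnr, fpr

# Illustrative scores only, not measured with any model
same_scores = [0.82, 0.74, 0.69, 0.58]
diff_scores = [0.61, 0.47, 0.35, 0.22]

for threshold in (0.55, 0.70):
    fnr, fpr = error_rates(same_scores, diff_scores, threshold)
    print(f"threshold={threshold:.2f}  FNR={fnr:.2f}  FPR={fpr:.2f}")
```

With these sample scores, the looser threshold (0.55) accepts one different-person pair and rejects no same-person pairs, while the stricter threshold (0.70) does the opposite: no false positives, but half of the same-person pairs are rejected.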

In [None]:
# Default thresholds for different models
DEFAULT_THRESHOLDS = {
    "facenet": 0.40,
    "facenet512": 0.65,
    "arcface": 0.68,
    "vggface": 0.40
}

print("Default Thresholds by Model")
print("=" * 35)
for model, thresh in DEFAULT_THRESHOLDS.items():
    print(f"{model:<15} {thresh}")

## 2. Impact of Threshold on Results

Visualize how different thresholds affect verification:

In [None]:
# Test images
IMAGE_SAME_1 = "../test_images/person1_a.jpg"
IMAGE_SAME_2 = "../test_images/person1_b.jpg"
IMAGE_DIFF = "../test_images/person2.jpg"

# Initialize verifier
verifier = FaceVerifier()

In [None]:
# Get similarity scores
result_same = verifier.verify(IMAGE_SAME_1, IMAGE_SAME_2)
result_diff = verifier.verify(IMAGE_SAME_1, IMAGE_DIFF)

sim_same = result_same.similarity
sim_diff = result_diff.similarity

print(f"Same person similarity:      {sim_same:.4f}")
print(f"Different person similarity: {sim_diff:.4f}")

In [None]:
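# Test thresholds from 0.30 to 0.85 in steps of 0.05
# (linspace fixes the count; a float step in np.arange can add or drop the endpoint)
thresholds = np.linspace(0.30, 0.85, 12)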

print("\nThreshold Analysis")
print("=" * 65)
print(f"{'Threshold':<12} {'Same Person':<15} {'Diff Person':<15} {'Status'}")
print("-" * 65)

for thresh in thresholds:
    same_result = "VERIFIED" if sim_same >= thresh else "REJECTED"
    diff_result = "VERIFIED" if sim_diff >= thresh else "REJECTED"
    
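    # Determine if this threshold works correctly
    if same_result == "VERIFIED" and diff_result == "REJECTED":
        status = "CORRECT"
    elif same_result == "REJECTED" and diff_result == "VERIFIED":
        # Both decisions are wrong (only possible when sim_diff > sim_same)
        status = "Both errors"
    elif same_result == "REJECTED":
        status = "False Negative"
    else:
        status = "False Positive"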
    
    print(f"{thresh:<12.2f} {same_result:<15} {diff_result:<15} {status}")

In [None]:
# Visualize threshold impact
fig, ax = plt.subplots(figsize=(10, 6))

# Plot similarity scores as horizontal lines
ax.axhline(y=sim_same, color='green', linestyle='-', linewidth=2, label=f'Same person ({sim_same:.3f})')
ax.axhline(y=sim_diff, color='red', linestyle='-', linewidth=2, label=f'Different person ({sim_diff:.3f})')

# Shade regions
ax.fill_between([0, 1], sim_diff, sim_same, alpha=0.3, color='yellow', label='Optimal threshold range')

# Mark default threshold
default_thresh = 0.65
ax.axhline(y=default_thresh, color='blue', linestyle='--', linewidth=2, label=f'Default threshold ({default_thresh})')

ax.set_xlim(0, 1)
ax.set_ylim(0, 1)
ax.set_xlabel('(Visual representation)')
ax.set_ylabel('Similarity Score')
ax.set_title('Threshold Selection Visualization')
ax.legend(loc='lower right')
ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.savefig("threshold_visualization.png", dpi=150)
plt.show()
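
The shaded band in the plot generalizes to many pairs: a threshold classifies every pair correctly only if it sits above the highest different-person score and at or below the lowest same-person score. The following is a minimal sketch of that check, using made-up scores and a hypothetical helper name `separating_range`.

```python
import numpy as np

def separating_range(same_scores, diff_scores):
    """Return (low, high) such that any threshold t with low < t <= high
    classifies every pair correctly, or None if the score sets overlap."""
    low = float(np.max(diff_scores))   # highest different-person score
    high = float(np.min(same_scores))  # lowest same-person score
    if low < high:
        return low, high
    return None

# Illustrative scores only, not measured with any model
print(separating_range([0.82, 0.74, 0.69], [0.61, 0.47, 0.35]))
print(separating_range([0.82, 0.58], [0.61, 0.47]))
```

When the function returns `None`, the two score distributions overlap and no single threshold avoids both kinds of error, so the choice becomes a trade-off rather than a search for a perfect value.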

## 3. Finding Optimal Threshold

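The optimal threshold depends on your specific requirements. The function below picks the threshold that maximizes accuracy on labeled same-person and different-person pairs, which treats both error types as equally costly: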

In [None]:
def find_optimal_threshold(same_scores, diff_scores):
    """
    Find the threshold that maximizes accuracy on labeled pairs.
    
    Args:
        same_scores: List of similarity scores for same-person pairs
        diff_scores: List of similarity scores for different-person pairs
    
    Returns:
        Tuple of (best_threshold, best_accuracy). If several thresholds
        tie, the lowest one is returned.
    """
    same_scores = np.array(same_scores)
    diff_scores = np.array(diff_scores)
    
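    # Test thresholds from 0.30 to 0.89 in steps of 0.01
    # (linspace fixes the count; a float step in np.arange can add or drop the endpoint)
    thresholds = np.linspace(0.30, 0.89, 60)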
    best_threshold = 0.5
    best_accuracy = 0
    
    for thresh in thresholds:
        # True positives: same person correctly verified
        tp = np.sum(same_scores >= thresh)
        # True negatives: different person correctly rejected
        tn = np.sum(diff_scores < thresh)
        # Total accuracy
        accuracy = (tp + tn) / (len(same_scores) + len(diff_scores))
        
        if accuracy > best_accuracy:
            best_accuracy = accuracy
            best_threshold = thresh
    
    return best_threshold, best_accuracy

In [None]:
# Example with our test data
same_scores = [sim_same]  # In practice, you'd have many samples
diff_scores = [sim_diff]

optimal_thresh, accuracy = find_optimal_threshold(same_scores, diff_scores)

print(f"Optimal threshold: {optimal_thresh:.2f}")
print(f"Accuracy: {accuracy:.2%}")
print()
print("Note: For reliable threshold tuning, you need a larger dataset")
print("with many same-person and different-person pairs.")
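
Maximizing accuracy is only one criterion. When false positives are the main concern, a common alternative is to fix a tolerated false positive rate and derive the threshold from the different-person scores alone. The sketch below assumes the same `>=` decision rule; the helper name `threshold_for_target_fpr` and the scores are hypothetical.

```python
import numpy as np

def threshold_for_target_fpr(diff_scores, target_fpr):
    """Lowest threshold whose false positive rate on diff_scores
    does not exceed target_fpr (decision rule: score >= threshold)."""
    scores = np.sort(np.asarray(diff_scores, dtype=float))
    n = scores.size
    if n == 0:
        raise ValueError("diff_scores must not be empty")
    # Number of different-person pairs we may wrongly accept
    allowed = int(np.floor(target_fpr * n))
    if allowed >= n:
        return float(scores[0])  # every pair may be accepted
    # Place the threshold just above the highest score that must be rejected
    return float(np.nextafter(scores[n - allowed - 1], np.inf))

# Illustrative scores only, not measured with any model
diff_scores = [0.61, 0.47, 0.35, 0.22]

for target in (0.25, 0.0):
    t = threshold_for_target_fpr(diff_scores, target)
    fpr = np.mean(np.asarray(diff_scores) >= t)
    print(f"target FPR={target:.2f}  threshold={t:.4f}  observed FPR={fpr:.2f}")
```

The estimate is only as fine-grained as the data allows: with four different-person pairs, the observed rate can only move in steps of 0.25, which is another reason to tune on a large set of pairs.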

## 4. Use Case Recommendations

Different applications require different threshold settings:

In [None]:
USE_CASE_THRESHOLDS = {
    "High Security (banking, access control)": {
        "threshold": 0.75,
        "priority": "Minimize false positives",
        "description": "Very strict - only high-confidence matches allowed"
    },
    "Standard Verification (ID check)": {
        "threshold": 0.65,
        "priority": "Balance security and usability",
        "description": "Default setting - good for most applications"
    },
    "Photo Organization (albums)": {
        "threshold": 0.55,
        "priority": "Minimize false negatives",
        "description": "Looser matching - catches more true positives"
    },
    "Surveillance (person re-identification)": {
        "threshold": 0.50,
        "priority": "High recall",
        "description": "Very loose - for initial candidate filtering"
    }
}

print("Threshold Recommendations by Use Case")
print("=" * 70)

for use_case, info in USE_CASE_THRESHOLDS.items():
    print(f"\n{use_case}")
    print(f"  Threshold:   {info['threshold']}")
    print(f"  Priority:    {info['priority']}")
    print(f"  Description: {info['description']}")
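
One way to turn a use-case priority into a number is to assign a cost to each error type and choose the candidate threshold with the lowest total cost on labeled pairs. The sketch below is a hedged illustration: the scores and cost weights are invented, `cost_weighted_threshold` is a hypothetical helper, and the candidates are simply the four thresholds from the table above.

```python
import numpy as np

def cost_weighted_threshold(same_scores, diff_scores, fp_cost, fn_cost, candidates):
    """Return (threshold, cost) for the candidate with the lowest total error cost."""
    same_scores = np.asarray(same_scores, dtype=float)
    diff_scores = np.asarray(diff_scores, dtype=float)
    best_threshold, best_cost = None, float("inf")
    for t in candidates:
        false_pos = np.sum(diff_scores >= t)  # different people accepted
        false_neg = np.sum(same_scores < t)   # same person rejected
        cost = fp_cost * false_pos + fn_cost * false_neg
        if cost < best_cost:  # strict comparison keeps the first candidate on ties
            best_threshold, best_cost = float(t), float(cost)
    return best_threshold, best_cost

# Illustrative, overlapping scores; not measured with any model
same_scores = [0.82, 0.74, 0.69, 0.58]
diff_scores = [0.71, 0.47, 0.35, 0.22]
candidates = [0.50, 0.55, 0.65, 0.75]

# Security-focused: a false positive costs ten times a false negative
print(cost_weighted_threshold(same_scores, diff_scores, 10, 1, candidates))
# Recall-focused: a false negative costs ten times a false positive
print(cost_weighted_threshold(same_scores, diff_scores, 1, 10, candidates))
```

On this sample data the security-focused weighting selects 0.75 and the recall-focused weighting selects 0.50, mirroring the direction of the recommendations in the table.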

## 5. Custom Threshold Configuration

How to use custom thresholds in FaceVerify:

In [None]:
# Method 1: Set threshold in config
config = VerifierConfig(threshold=0.70)
strict_verifier = FaceVerifier(config=config)

result = strict_verifier.verify(IMAGE_SAME_1, IMAGE_SAME_2)
print("With threshold 0.70:")
print(f"  Verified: {result.verified}")
print(f"  Similarity: {result.similarity:.4f}")

In [None]:
# Method 2: Pass threshold to verify method
verifier = FaceVerifier()

# Test with different thresholds on the same comparison
for thresh in [0.50, 0.60, 0.70, 0.80]:
    result = verifier.verify(IMAGE_SAME_1, IMAGE_SAME_2, threshold=thresh)
    status = "VERIFIED" if result.verified else "REJECTED"
    print(f"Threshold {thresh}: {status} (similarity: {result.similarity:.4f})")

In [None]:
# Method 3: Post-process results with custom threshold
def verify_with_custom_threshold(verifier, img1, img2, custom_threshold):
    """
    Verify faces and apply custom threshold to the result.
    """
    result = verifier.verify(img1, img2)
    
    # Override the verified decision
    custom_verified = result.similarity >= custom_threshold
    
    return {
        "verified": custom_verified,
        "similarity": result.similarity,
        "threshold_used": custom_threshold,
        "original_threshold": result.threshold
    }

# Example usage
custom_result = verify_with_custom_threshold(verifier, IMAGE_SAME_1, IMAGE_SAME_2, 0.72)
print("Custom threshold result:")
for key, value in custom_result.items():
    print(f"  {key}: {value}")

## Summary

Key points about threshold tuning:

1. **Default threshold** (0.65 in this notebook) is a reasonable starting point, but defaults differ by model (0.40 to 0.68 in Section 1)
2. **Higher threshold** = stricter, fewer false positives, more false negatives
3. **Lower threshold** = looser, fewer false negatives, more false positives
4. **Optimal threshold** depends on your specific requirements
5. **Test thoroughly** with representative data before deploying

Quick Reference:
- High Security: 0.70-0.80
- Standard: 0.60-0.70
- Loose Matching: 0.50-0.60