# Sanctions Screening Evaluation

- **Purpose:** Evaluate sanctions screening accuracy and validate it against precision/recall targets
- **Author:** Devbrew LLC  
- **Last Updated:** November 17, 2025  
- **Status:** In progress  
- **License:** Apache 2.0

## Overview

This notebook implements the evaluation protocol for the sanctions screening module. It measures matching accuracy against a labeled test set and checks whether the system meets its production accuracy targets.

**Evaluation Metrics:**
- Precision@1: Percentage of queries whose top-ranked candidate is the correct match (target: ≥95%)
- Recall@top3: Percentage of queries whose ground-truth match appears among the top 3 candidates (target: ≥98%)
- False Positive Rate: Percentage of true non-matches incorrectly flagged as matches
- Decision Accuracy: Percentage of queries whose predicted decision category matches the expected one

Together, these metrics check that the screening system identifies sanctioned entities correctly while keeping false positives low, which is the bar for production readiness.
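
To make the metric definitions above concrete, the following is a minimal sketch of how they could be computed from labeled results. The `EvalRecord` structure, its field names, and the `"no_match"` decision label are assumptions made for illustration; they are not taken from the screening module itself.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class EvalRecord:
    """One labeled query (hypothetical structure, for illustration only)."""
    expected_id: Optional[str]   # ground-truth entity ID, or None for a true non-match
    ranked_ids: List[str]        # candidate IDs returned by screening, best first
    predicted_decision: str      # decision category produced by the system
    expected_decision: str       # decision category from the labels


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN rather than raising when a subset of the test set is empty.
    return numerator / denominator if denominator else float("nan")


def compute_metrics(records: List[EvalRecord]) -> Dict[str, float]:
    positives = [r for r in records if r.expected_id is not None]
    negatives = [r for r in records if r.expected_id is None]

    # Precision@1: the top candidate is the ground-truth match.
    p1_hits = sum(1 for r in positives if r.ranked_ids[:1] == [r.expected_id])
    # Recall@top3: the ground-truth match appears among the first three candidates.
    r3_hits = sum(1 for r in positives if r.expected_id in r.ranked_ids[:3])
    # False positives: true non-matches given any decision other than "no_match"
    # (the label name is an assumption).
    false_pos = sum(1 for r in negatives if r.predicted_decision != "no_match")
    # Decision accuracy: predicted category equals the expected category.
    correct = sum(1 for r in records if r.predicted_decision == r.expected_decision)

    return {
        "precision_at_1": _ratio(p1_hits, len(positives)),
        "recall_at_top3": _ratio(r3_hits, len(positives)),
        "false_positive_rate": _ratio(false_pos, len(negatives)),
        "decision_accuracy": _ratio(correct, len(records)),
    }


# Tiny hand-made example: two true matches and two true non-matches.
records = [
    EvalRecord("e1", ["e1", "e2"], "match", "match"),
    EvalRecord("e2", ["e9", "e8", "e2"], "review", "match"),
    EvalRecord(None, ["e5"], "review", "no_match"),
    EvalRecord(None, [], "no_match", "no_match"),
]
metrics = compute_metrics(records)
# Targets from the metric list: Precision@1 >= 95%, Recall@top3 >= 98%.
meets_targets = metrics["precision_at_1"] >= 0.95 and metrics["recall_at_top3"] >= 0.98
print(metrics, meets_targets)
```

In this sketch, Precision@1 and Recall@top3 are computed only over queries that have a ground-truth match, and the false positive rate only over true non-matches. Whether the notebook uses the same denominators is not stated in this section.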

## Setup: Artifacts and Functions

The evaluation loads artifacts generated by the implementation pipeline:

- **Sanctions Index**: Canonicalized names and metadata (`sanctions_index.parquet`)
- **Blocking Indices**: Inverted indices for candidate retrieval (`blocking_indices.json`)
- **Metadata**: Version tracking and dataset statistics
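
As a rough sketch, loading the two named artifacts might look like the following. The directory argument and the function names are hypothetical, and the metadata artifact is left out because its file name is not given here. `pd.read_parquet` also needs a Parquet engine such as `pyarrow` to be installed.

```python
import json
from pathlib import Path
from typing import Any, Dict, List, Tuple

# File names come from the artifact list above; their location is an assumption.
REQUIRED_ARTIFACTS = ("sanctions_index.parquet", "blocking_indices.json")


def missing_artifacts(artifact_dir: Path) -> List[str]:
    """Return the required artifact file names that are absent from artifact_dir."""
    return [name for name in REQUIRED_ARTIFACTS if not (artifact_dir / name).is_file()]


def load_artifacts(artifact_dir: Path) -> Tuple[Any, Dict[str, Any]]:
    """Load the sanctions index and blocking indices, failing early if either is missing."""
    missing = missing_artifacts(artifact_dir)
    if missing:
        raise FileNotFoundError(f"Missing artifacts in {artifact_dir}: {missing}")

    # Imported lazily so the existence check above works without pandas installed.
    import pandas as pd

    sanctions_index = pd.read_parquet(artifact_dir / "sanctions_index.parquet")
    with open(artifact_dir / "blocking_indices.json", encoding="utf-8") as fh:
        blocking_indices = json.load(fh)
    return sanctions_index, blocking_indices
```

Checking for the files before loading gives a single clear error when the implementation pipeline has not been run yet, rather than a failure partway through the evaluation.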

Helper functions for text normalization, tokenization, and screening are loaded as well, so the evaluation can run on its own without re-executing the full implementation pipeline.
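
For orientation, name normalization and tokenization of this kind are often written along the following lines. This is an illustrative sketch using only the standard library, not the notebook's actual helpers; the function names and the specific normalization steps are assumptions.

```python
import re
import unicodedata
from typing import List


def normalize_text(text: str) -> str:
    """Fold a name to a plain comparison form (assumed steps, for illustration)."""
    # Decompose characters so accents become separate combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks, e.g. "é" becomes "e".
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Case-insensitive comparison form.
    lowered = stripped.casefold()
    # Replace punctuation with spaces, then collapse runs of whitespace.
    cleaned = re.sub(r"[^\w\s]", " ", lowered)
    return re.sub(r"\s+", " ", cleaned).strip()


def tokenize(text: str) -> List[str]:
    """Split a normalized name into whitespace-separated tokens."""
    return normalize_text(text).split()


print(tokenize("  Müller, José-María "))
```

Applying the same normalization to both the query and the indexed names is what makes lookups in the blocking indices and fuzzy comparisons consistent.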

### Environment Configuration

This cell imports the required libraries, sets pandas display options, and fixes the random seed (`RANDOM_STATE = 42`) for both Python's `random` module and NumPy, so that any randomized step in the evaluation gives the same results across runs.

In [3]:
import warnings
from pathlib import Path
import json
import unicodedata
import re
from typing import Dict, Any, Optional, List, Tuple
import time
import random
from functools import lru_cache
from collections import OrderedDict

import pandas as pd
import numpy as np

import rapidfuzz as rf
from rapidfuzz import fuzz, process

# Configuration
# Note: this silences all warnings, including deprecation warnings from the libraries above
warnings.filterwarnings("ignore")
pd.set_option("display.max_columns", 100)
pd.set_option("display.max_rows", 100)
pd.set_option("display.float_format", '{:.2f}'.format)

# Reproducibility
RANDOM_STATE = 42
random.seed(RANDOM_STATE)
np.random.seed(RANDOM_STATE)

print("Environment configured successfully")
print(f" pandas: {pd.__version__}")
print(f" numpy: {np.__version__}")
print(f" rapidfuzz: {rf.__version__}")

Environment configured successfully
 pandas: 2.3.3
 numpy: 2.3.3
 rapidfuzz: 3.14.1
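
Seeding the global generators, as the cell above does, makes a run repeatable as long as the random calls happen in the same order. Where a single step needs to be repeatable on its own, a local generator is one option. The sketch below assumes a hypothetical `sample_queries` helper that is not part of the notebook.

```python
import random
from typing import List, Sequence


def sample_queries(queries: Sequence[str], k: int, seed: int = 42) -> List[str]:
    """Draw k queries without replacement using a generator local to this call."""
    # A dedicated Random instance is unaffected by other code that uses the global RNG.
    rng = random.Random(seed)
    return rng.sample(list(queries), k)


queries = [f"query_{i}" for i in range(20)]
first = sample_queries(queries, k=5)
second = sample_queries(queries, k=5)
print(first == second)
```

Because the generator is created from the seed on every call, the same inputs always give the same sample, regardless of what ran earlier in the session.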
