# FairProp Inspector - Interactive Tutorial

Welcome to the FairProp Inspector tutorial! This notebook will guide you through:

1. **Installation & Setup**
2. **Basic Usage**
3. **Batch Processing**
4. **Performance Analysis**
5. **Advanced: Custom Training**

---

## 1. Installation & Setup

First, let's install FairProp Inspector:

In [None]:
# Install from source
!pip install -e ..

# Or install from GitHub
# !pip install git+https://github.com/ZheWang-stack/FairProp-Inspector.git

In [None]:
# Import required libraries
import sys
import os

# Add parent directory to path
sys.path.insert(0, os.path.abspath('..'))

from src.inference.predict import predict
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Set style
sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (12, 6)

print("✓ Setup complete!")

## 2. Basic Usage

Let's start with a simple example:

In [None]:
# Example 1: Clear violation
text1 = "No kids under 12 allowed"
label1, confidence1 = predict(text1, "../artifacts/model")

print(f"Text: '{text1}'")
print(f"Label: {label1}")
print(f"Confidence: {confidence1:.1%}")
print()

# Example 2: Compliant text
text2 = "Great school district nearby"
label2, confidence2 = predict(text2, "../artifacts/model")

print(f"Text: '{text2}'")
print(f"Label: {label2}")
print(f"Confidence: {confidence2:.1%}")

### Interactive Testing

Try your own examples:

In [None]:
# Try your own text here!
your_text = "Perfect for young professionals"  # Change this

label, confidence = predict(your_text, "../artifacts/model")

print(f"Text: '{your_text}'")
print(f"Label: {label}")
print(f"Confidence: {confidence:.1%}")

if label == "NON_COMPLIANT":
    print("\n⚠️ WARNING: This text may violate Fair Housing Act guidelines!")
else:
    print("\n✓ This text appears compliant.")
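
The check above treats every prediction the same way, whatever its confidence. As a hedged sketch, the helper below sends low-confidence predictions to manual review instead. The `triage` function and the 0.80 threshold are illustrative assumptions and not part of FairProp Inspector; the labels are the ones used in this notebook, and the confidence is assumed to be a fraction between 0 and 1.

```python
def triage(label, confidence, threshold=0.80):
    """Map a (label, confidence) pair to a suggested action.

    `threshold` is a hypothetical cut-off; tune it on your own data.
    """
    if confidence < threshold:
        return "REVIEW"   # model is unsure either way: ask a human
    if label == "NON_COMPLIANT":
        return "FLAG"     # confident violation
    return "PASS"         # confident compliant prediction


# Hard-coded example values stand in for real predict() output
examples = [("NON_COMPLIANT", 0.97), ("COMPLIANT", 0.62), ("COMPLIANT", 0.91)]
for label, confidence in examples:
    print(f"{label:<14} {confidence:.0%} -> {triage(label, confidence)}")
```

In the notebook, you could call `triage(label, confidence)` right after `predict(...)` and only print the warning for `"FLAG"`.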

## 3. Batch Processing

Process multiple property descriptions at once:

In [None]:
# Sample property descriptions
properties = [
    "Beautiful 3BR home in quiet neighborhood",
    "No kids under 12 allowed",
    "Perfect for young professionals",
    "Great school district nearby",
    "Christian community preferred",
    "Wheelchair accessible entrance",
    "No section 8",
    "Walking distance to shops",
    "Ideal for active adults",
    "Family-friendly neighborhood"
]

# Process all
results = []
for text in properties:
    label, confidence = predict(text, "../artifacts/model")
    results.append({
        'text': text,
        'label': label,
        'confidence': confidence
    })

# Create DataFrame
df = pd.DataFrame(results)
df
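
Listing feeds often repeat the same phrase many times. A minimal sketch of a batch helper that calls the model only once per distinct text is shown below. `classify_batch` and `fake_predict` are hypothetical names introduced for illustration; the stand-in predictor is a toy rule, not the real model.

```python
def classify_batch(texts, predict_fn):
    """Classify each distinct text once and return one record per input, in order.

    `predict_fn` is any callable that returns (label, confidence) for a string.
    """
    cache = {}
    records = []
    for text in texts:
        key = text.strip()
        if key not in cache:          # only call the model for unseen text
            cache[key] = predict_fn(key)
        label, confidence = cache[key]
        records.append({"text": text, "label": label, "confidence": confidence})
    return records


# Stand-in predictor for demonstration only; it is NOT the real model.
calls = []

def fake_predict(text):
    calls.append(text)
    return ("NON_COMPLIANT", 0.9) if "no " in text.lower() else ("COMPLIANT", 0.8)


sample_texts = ["No section 8", "Walking distance to shops", "No section 8"]
records = classify_batch(sample_texts, fake_predict)
print(len(records), "records from", len(calls), "model calls")
```

With the real model, this would look like `classify_batch(properties, lambda t: predict(t, "../artifacts/model"))`, and the returned list can be passed straight to `pd.DataFrame`.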

### Analyze Results

In [None]:
# Summary statistics
print("Summary:")
print(f"Total: {len(df)}")
print(f"Violations: {(df['label'] == 'NON_COMPLIANT').sum()}")
print(f"Compliant: {(df['label'] == 'COMPLIANT').sum()}")
print(f"Violation Rate: {(df['label'] == 'NON_COMPLIANT').mean():.1%}")
print(f"Average Confidence: {df['confidence'].mean():.1%}")
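
A single average confidence can hide differences between the two classes. Here is a small sketch of a per-label breakdown in plain Python. `summarize_by_label` is a hypothetical helper, and the sample records are made up for illustration; they only mirror the shape of the `results` list built earlier.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_label(records):
    """Return {label: {"count", "mean_confidence", "min_confidence"}}."""
    by_label = defaultdict(list)
    for record in records:
        by_label[record["label"]].append(record["confidence"])
    return {
        label: {
            "count": len(scores),
            "mean_confidence": mean(scores),
            "min_confidence": min(scores),
        }
        for label, scores in by_label.items()
    }


# Made-up records with the same keys as the `results` list
sample = [
    {"text": "a", "label": "COMPLIANT", "confidence": 0.9},
    {"text": "b", "label": "COMPLIANT", "confidence": 0.7},
    {"text": "c", "label": "NON_COMPLIANT", "confidence": 0.95},
]
for label, stats in summarize_by_label(sample).items():
    print(label, stats)
```

If you prefer to stay in pandas, `df.groupby('label')['confidence'].agg(['count', 'mean', 'min'])` should produce an equivalent table.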

In [None]:
# Visualize results
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

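# Plot 1: Label distribution
# Key colors by label: value_counts() sorts by frequency, so positional colors could be swapped
label_colors = {'COMPLIANT': '#00ff41', 'NON_COMPLIANT': '#ff4444'}
counts = df['label'].value_counts()
counts.plot(kind='bar', ax=axes[0], rot=0,
            color=[label_colors.get(name, 'gray') for name in counts.index])
axes[0].set_title('Label Distribution', fontsize=14, fontweight='bold')
axes[0].set_xlabel('Label')
axes[0].set_ylabel('Count')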

# Plot 2: Confidence distribution
df.boxplot(column='confidence', by='label', ax=axes[1])
axes[1].set_title('Confidence by Label', fontsize=14, fontweight='bold')
axes[1].set_xlabel('Label')
axes[1].set_ylabel('Confidence')
plt.suptitle('')  # Remove default title

plt.tight_layout()
plt.show()

## 4. Performance Analysis

Let's measure inference latency:

In [None]:
import time

# Measure latency
latencies = []
test_text = "No kids under 12 allowed"

# Warmup
for _ in range(5):
    predict(test_text, "../artifacts/model")

# Actual measurement (perf_counter is monotonic and has higher resolution than time.time)
for _ in range(100):
    start = time.perf_counter()
    predict(test_text, "../artifacts/model")
    latency = (time.perf_counter() - start) * 1000  # ms
    latencies.append(latency)

# Statistics
import numpy as np
print(f"Mean latency: {np.mean(latencies):.2f}ms")
print(f"Median latency: {np.median(latencies):.2f}ms")
print(f"P95 latency: {np.percentile(latencies, 95):.2f}ms")
print(f"P99 latency: {np.percentile(latencies, 99):.2f}ms")
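
The warmup-then-measure pattern above can be wrapped in a reusable function. The sketch below uses only the standard library; `measure_latency` is a hypothetical helper, and the workload it times here is a cheap stand-in rather than the model. `statistics.quantiles` with `method="inclusive"` interpolates linearly between samples, which should agree with NumPy's default percentile method.

```python
import statistics
import time


def measure_latency(fn, warmup=5, runs=100):
    """Time `fn()` and return latency statistics in milliseconds."""
    for _ in range(warmup):           # warm caches before timing
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000)
    # 99 cut points: index 94 is the 95th percentile, index 98 the 99th
    cuts = statistics.quantiles(samples, n=100, method="inclusive")
    return {
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
        "p95": cuts[94],
        "p99": cuts[98],
    }


# Cheap stand-in workload; replace the lambda with a call to the model
stats = measure_latency(lambda: sum(range(1000)), runs=50)
print({name: round(value, 4) for name, value in stats.items()})
```

For the model itself, the call would be `measure_latency(lambda: predict(test_text, "../artifacts/model"))`.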

In [None]:
# Visualize latency distribution
plt.figure(figsize=(12, 5))

plt.subplot(1, 2, 1)
plt.hist(latencies, bins=30, color='#00ff41', alpha=0.7, edgecolor='black')
plt.axvline(np.mean(latencies), color='red', linestyle='--', label=f'Mean: {np.mean(latencies):.1f}ms')
plt.axvline(np.percentile(latencies, 95), color='orange', linestyle='--', label=f'P95: {np.percentile(latencies, 95):.1f}ms')
plt.xlabel('Latency (ms)')
plt.ylabel('Frequency')
plt.title('Latency Distribution', fontweight='bold')
plt.legend()

plt.subplot(1, 2, 2)
plt.boxplot(latencies)  # vertical is the default orientation
plt.ylabel('Latency (ms)')
plt.title('Latency Box Plot', fontweight='bold')

plt.tight_layout()
plt.show()

## 5. Advanced: Custom Training

For training a custom model, see the [Training Guide](../docs/training_guide.md).

Quick example:

In [None]:
# Example training data format
training_data = [
    {"text": "No kids under 12 allowed", "label": "NON_COMPLIANT"},
    {"text": "Great school district nearby", "label": "COMPLIANT"},
    # ... more examples
]

print("Training data format:")
print(training_data[0])
print("\nTo train:")
print("python src/trainer/train.py --data your_data.json --epochs 3")
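
Before training, it can help to validate the examples and write them to the JSON file passed to `--data`. The sketch below assumes the file is a JSON list of objects in the format shown above; the exact schema expected by `train.py` is an assumption here, so check the Training Guide. `validate_examples` and `ALLOWED_LABELS` are hypothetical names.

```python
import json
import os
import tempfile

ALLOWED_LABELS = {"COMPLIANT", "NON_COMPLIANT"}  # labels used in this notebook


def validate_examples(examples):
    """Return a list of problems found; an empty list means the data looks OK."""
    problems = []
    for i, example in enumerate(examples):
        text = example.get("text")
        if not isinstance(text, str) or not text.strip():
            problems.append(f"item {i}: 'text' must be a non-empty string")
        if example.get("label") not in ALLOWED_LABELS:
            problems.append(f"item {i}: unknown label {example.get('label')!r}")
    return problems


training_data = [
    {"text": "No kids under 12 allowed", "label": "NON_COMPLIANT"},
    {"text": "Great school district nearby", "label": "COMPLIANT"},
]
problems = validate_examples(training_data)
print("Problems:", problems)

# Write to a temporary location; use your own path in practice
path = os.path.join(tempfile.mkdtemp(), "your_data.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(training_data, f, indent=2, ensure_ascii=False)
print("Wrote", len(training_data), "examples to", path)
```

Catching an empty text or a misspelled label at this stage is cheaper than discovering it partway through a training run.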

## 🎓 Next Steps

1. **Explore Examples**: Check out the `examples/` directory for more code samples
2. **Read Training Guide**: Learn how to train custom models in `docs/training_guide.md`
3. **Run Benchmarks**: Test performance with `benchmarks/accuracy_comparison.py`
4. **Contribute**: See `CONTRIBUTING.md` for contribution guidelines

---

## 📚 Resources

- [GitHub Repository](https://github.com/ZheWang-stack/FairProp-Inspector)
- [Documentation](https://github.com/ZheWang-stack/FairProp-Inspector/blob/main/README.md)
- [Fair Housing Act Guidelines](https://www.hud.gov/program_offices/fair_housing_equal_opp)

---

**Questions?** Open an issue on GitHub or email: fairprop-inspector@proton.me