# Air Filter Clogging Detection Algorithm

## Overview
With a clean baseline model **R_clean(HP)** fitted for each asset, we can detect filter degradation in real time by comparing each measured restriction reading to the baseline value at the same HP.

## Core Concept: The Delta (Δ)
**Delta** = Actual Restriction - R_clean(HP)

- **Delta = 0**: Filter is clean (operating on baseline)
- **Delta > 0**: Filter is clogged (restriction exceeds baseline)
- **Delta ≫ 0**: Filter needs replacement
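
As a minimal sketch of the definition above, the helper below computes Δ for a single reading. The name `excess_restriction` is hypothetical and not part of this notebook's code. It clamps negative values to zero, matching the `max(0, ...)` form used in the algorithm steps of this notebook, so a reading slightly below the baseline is treated as clean.

```python
def excess_restriction(r_actual, r_clean):
    """Return restriction in excess of the clean baseline, never below zero."""
    return max(0.0, r_actual - r_clean)


# Reading above the baseline: 15 inH2O measured vs. 10 inH2O expected
print(excess_restriction(15.0, 10.0))  # 5.0

# Reading below the baseline (for example sensor noise): clamped to zero
print(excess_restriction(9.5, 10.0))   # 0.0
```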

## The Three Key Metrics
1. **Δ (Delta)**: Excess restriction beyond clean baseline
2. **HP_max_current**: Maximum HP achievable before hitting restriction limit
3. **percent_clogged**: Percentage of operating capacity lost due to clogging

## How the Algorithm Works: Visual Explanation

*[Diagram placeholder: image to be inserted here]*

### Example Scenario
- **Asset rated capacity**: 2000 HP
- **Current operation**: 1200 HP, Restriction = 15 inH₂O
- **R_clean(1200)**: 10 inH₂O (from our model)
- **Delta**: 15 - 10 = **5 inH₂O excess**

**Question**: At this clogging level, what's the maximum HP we can reach before hitting the restriction limit?

**Answer**: That's **HP_max_current** - the key insight of this algorithm.
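
To make the scenario concrete, here is a rough worked calculation. Two things are assumptions made only for illustration: the 25 inH₂O restriction limit (the scenario does not state one) and the quadratic shape of the baseline, which is scaled so that R_clean(1200) = 10 inH₂O. The real baselines come from the fitted models, so the numbers below show the mechanics, not an actual asset.

```python
import math

RATED_HP = 2000.0        # rated capacity from the scenario
MAX_RESTRICTION = 25.0   # ASSUMED restriction limit, inH2O


def r_clean(hp):
    """HYPOTHETICAL clean baseline: quadratic through R_clean(1200) = 10."""
    return 10.0 * (hp / 1200.0) ** 2


def r_clean_inverse(restriction):
    """Closed-form inverse of the hypothetical baseline above."""
    return 1200.0 * math.sqrt(restriction / 10.0)


delta = max(0.0, 15.0 - r_clean(1200.0))    # excess restriction
target = MAX_RESTRICTION - delta            # baseline headroom that remains
hp_max_current = min(r_clean_inverse(target), RATED_HP)
percent_clogged = 100.0 * (1.0 - hp_max_current / RATED_HP)

print(f"delta = {delta:.1f} inH2O")
print(f"HP_max_current = {hp_max_current:.0f} HP")
print(f"percent_clogged = {percent_clogged:.1f}%")
```

Under these assumptions the asset tops out near 1697 HP, roughly 15% below its rated 2000 HP, even though it is currently running comfortably at 1200 HP.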

In [None]:
import joblib
import pandas as pd
import numpy as np

# File paths
MODELS_PATH = "../api/all_models.pkl"
LIMITS_PATH = "../data/asset_limits.csv"

# Load models and limits
print("Loading models and asset limits...")
all_models = joblib.load(MODELS_PATH)
# Read asset IDs as strings so they match the model dictionary keys
limits = pd.read_csv(LIMITS_PATH, dtype={"asset": str}).set_index("asset")

print(f"✓ Loaded models for {len(all_models)} assets")
print(f"✓ Loaded limits for {len(limits)} assets")
print(f"\nAvailable assets: {list(all_models.keys())}")

## Function 1: Predict Clean Baseline Restriction

**R_clean(HP)** - Given HP, what restriction should a clean filter have?

This uses the hybrid isotonic + extrapolation model we built.
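
The shape of such a hybrid baseline can be sketched without the fitted models. In the sketch below, piecewise-linear interpolation between a few knots stands in for the isotonic fit, and a straight line takes over beyond the last knot. The function name `hybrid_baseline`, the knot values, the slope, the intercept, and the cap are all invented for illustration and are not taken from the notebook's models.

```python
from bisect import bisect_right

# HYPOTHETICAL stand-in for a fitted model: knots of a non-decreasing curve
KNOTS_HP = [400.0, 800.0, 1200.0]
KNOTS_R = [4.0, 6.0, 10.0]
SLOPE, INTERCEPT = 0.01, -2.0   # line chosen to meet the last knot at 1200 HP
R_CAP = 25.0                    # assumed physical restriction limit, inH2O


def hybrid_baseline(hp, knots_hp, knots_r, slope, intercept, r_cap):
    """Interpolate inside the fitted range, extrapolate linearly beyond it."""
    if hp <= knots_hp[0]:
        r = knots_r[0]                        # clamp below the fitted range
    elif hp <= knots_hp[-1]:
        i = min(bisect_right(knots_hp, hp) - 1, len(knots_hp) - 2)
        x0, x1 = knots_hp[i], knots_hp[i + 1]
        y0, y1 = knots_r[i], knots_r[i + 1]
        r = y0 + (y1 - y0) * (hp - x0) / (x1 - x0)
    else:
        r = slope * hp + intercept            # beyond the fitted range
    return min(r, r_cap)                      # never exceed the cap


for hp in (1000.0, 1600.0, 3000.0):
    r = hybrid_baseline(hp, KNOTS_HP, KNOTS_R, SLOPE, INTERCEPT, R_CAP)
    print(f"R_clean({hp:.0f}) = {r:.1f} inH2O")
```

The three sample points exercise the three branches: interpolation at 1000 HP, extrapolation at 1600 HP, and the cap at 3000 HP.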

In [None]:
def predict_restriction_from_hp(asset_type, hp, all_models):
    """
    Predict clean baseline restriction for given HP.
    
    This is the R_clean(HP) function - the foundation of our detection algorithm.
    
    Args:
        asset_type (str): Asset identifier
        hp (float): Hydraulic horsepower value
        all_models (dict): Dictionary of fitted models
        
    Returns:
        float: Expected restriction for a clean filter at this HP
    """
    asset_type = str(asset_type)
    
    if asset_type not in all_models:
        raise ValueError(f"Asset type {asset_type} not found in models.")
    
    model = all_models[asset_type]
    iso = model['iso']
    max_fitted_hp = model['max_hp_fitted']
    slope = model['extrap_slope']
    intercept = model['extrap_intercept']
    max_restriction = model['max_restriction_asset']
    
    if hp <= max_fitted_hp:
        # Within fitted range - use isotonic model
        restriction = iso.predict([[hp]])[0]
        print(f"  Using isotonic model (fitted range)")
    else:
        # Beyond fitted range - use linear extrapolation
        restriction = slope * hp + intercept
        print(f"  Using linear extrapolation (hp > {max_fitted_hp:.0f})")
    
    # Cap at physical limit
    restriction = min(restriction, max_restriction)
    
    return restriction

# Test the function
asset_test = list(all_models.keys())[0]
hp_test = 1200

print(f"Testing R_clean(HP) for asset {asset_test}:")
print(f"  HP = {hp_test}")
r_clean = predict_restriction_from_hp(asset_test, hp_test, all_models)
print(f"  R_clean({hp_test}) = {r_clean:.2f} inH₂O")

## Function 2: Inverse Baseline - Get HP from Restriction

**HP = R_clean⁻¹(Restriction)** - Given a restriction value, what HP would produce it on a clean filter?

This is the **inverse** of R_clean(HP) and is needed to calculate HP_max_current.

### Why We Need This
If the current delta is 5 inH₂O and the max restriction is 25 inH₂O, only 25 - 5 = 20 inH₂O of baseline restriction is still available. We need to find:
- What HP gives R_clean(HP) = 20 inH₂O?
- That HP is our current maximum achievable capacity
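
One way to sketch this inversion for any non-decreasing baseline is bisection. The helper below is an illustrative alternative, not the function defined in this notebook: the name `max_hp_within` is hypothetical, it converges on an HP tolerance instead of a restriction tolerance, and on a flat stretch of the baseline it returns the right-hand edge, that is, the largest HP that still stays at or under the target.

```python
def max_hp_within(r_clean, target, hp_lo, hp_hi, tol=1e-3):
    """Largest HP in [hp_lo, hp_hi] with r_clean(HP) <= target.

    Assumes r_clean is non-decreasing on the interval.
    """
    if r_clean(hp_hi) <= target:
        return hp_hi              # target is never exceeded in this range
    if r_clean(hp_lo) > target:
        return hp_lo              # already above target at the low end
    # Invariant: r_clean(hp_lo) <= target < r_clean(hp_hi)
    while hp_hi - hp_lo > tol:
        mid = (hp_lo + hp_hi) / 2.0
        if r_clean(mid) <= target:
            hp_lo = mid
        else:
            hp_hi = mid
    return hp_lo


def linear_baseline(hp):
    """HYPOTHETICAL baseline: 1 inH2O per 100 HP."""
    return hp / 100.0


# Max restriction 25 inH2O, delta 5 inH2O -> target baseline of 20 inH2O
hp_at_20 = max_hp_within(linear_baseline, 25.0 - 5.0, 0.0, 3000.0)
print(f"HP at R_clean = 20: {hp_at_20:.1f}")
```

With the hypothetical linear baseline the search lands at about 2000 HP, since 2000 / 100 = 20 inH₂O.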

In [None]:
def get_hp_from_restriction(asset_type, target_restriction, all_models):
    """
    Inverse function: Given target restriction, find HP on clean baseline.
    
    Solves: Find HP such that R_clean(HP) = target_restriction
    
    Args:
        asset_type (str): Asset identifier
        target_restriction (float): Target restriction value
        all_models (dict): Dictionary of fitted models
        
    Returns:
        float: HP value that produces target_restriction on clean baseline
    """
    asset_type = str(asset_type)
    
    if asset_type not in all_models:
        raise ValueError(f"Asset type {asset_type} not found in models.")
    
    model = all_models[asset_type]
    iso = model['iso']
    max_fitted_hp = model['max_hp_fitted']
    min_fitted_hp = model['min_hp_fitted']
    slope = model['extrap_slope']
    intercept = model['extrap_intercept']
    max_hp = model['max_hp_asset']
    max_restriction = model['max_restriction_asset']
    
    # Cap target at physical limit
    target_restriction = min(target_restriction, max_restriction)
    
    # Get max restriction in fitted range
    iso_restriction_max = iso.predict([[max_fitted_hp]])[0]
    
    print(f"  Target restriction: {target_restriction:.2f}")
    print(f"  Isotonic range: {min_fitted_hp:.0f} to {max_fitted_hp:.0f} HP")
    print(f"  Isotonic restriction max: {iso_restriction_max:.2f}")
    
    if target_restriction <= iso_restriction_max:
        # Target is within isotonic fitted range - use binary search
        print(f"  → Using binary search in isotonic range")
        
        hp_min = min_fitted_hp
        hp_max = max_fitted_hp
        tolerance = 0.01  # Restriction tolerance
        max_iterations = 100
        
        for iteration in range(max_iterations):
            hp_mid = (hp_min + hp_max) / 2
            restriction_mid = iso.predict([[hp_mid]])[0]
            
            if abs(restriction_mid - target_restriction) < tolerance:
                print(f"  → Converged at HP = {hp_mid:.2f} (iteration {iteration+1})")
                return hp_mid
            
            if restriction_mid < target_restriction:
                hp_min = hp_mid
            else:
                hp_max = hp_mid
        
        print(f"  → Binary search completed at HP = {hp_mid:.2f}")
        return hp_mid
    else:
        # Target is above fitted range - use inverse of linear extrapolation
        print(f"  → Target above fitted range, using linear inverse")
        
        if slope > 0:
            # R = slope * HP + intercept
            # HP = (R - intercept) / slope
            hp_extrap = (target_restriction - intercept) / slope
            hp_result = min(hp_extrap, max_hp)
            print(f"  → HP = {hp_result:.2f} (capped at max_hp if needed)")
            return hp_result
        else:
            # Flat extrapolation - can't reach higher restriction
            print(f"  → Flat extrapolation, returning max_hp")
            return max_hp

# Test the inverse function
target_r = 15.0
print(f"\nTesting inverse function for asset {asset_test}:")
print(f"  Target restriction = {target_r} inH₂O")
hp_inverse = get_hp_from_restriction(asset_test, target_r, all_models)
print(f"  Result: HP = {hp_inverse:.2f}")

# Verify by forward prediction
r_verify = predict_restriction_from_hp(asset_test, hp_inverse, all_models)
print(f"  Verification: R_clean({hp_inverse:.2f}) = {r_verify:.2f}")
print(f"  Match: {abs(r_verify - target_r) < 0.1}")

## The Complete Clogging Detection Algorithm

### Algorithm Steps:
1. **Calculate Delta (Δ)**: Measure excess restriction
   - `Δ = max(0, R_actual - R_clean(HP_current))`

2. **Solve for HP_max_current**: Find remaining capacity
   - Solve: `R_clean(HP_max_current) + Δ = max_restriction`
   - Equivalently: `R_clean(HP_max_current) = max_restriction - Δ`

3. **Calculate percent_clogged**: Quantify capacity loss
   - `percent_clogged = 100 × (1 - HP_max_current / max_HP)`

### Interpretation:
- **percent_clogged = 0%**: Filter is clean, full capacity available
- **percent_clogged = 50%**: Can only reach 50% of rated capacity
- **percent_clogged = 100%**: Cannot operate - filter completely blocked
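
The step 3 formula and the three interpretation points can be checked with a few lines. This is a sketch only: the standalone function `percent_clogged_from_hp` is a hypothetical name, and the 2000 HP rating is borrowed from the example scenario earlier in this notebook.

```python
def percent_clogged_from_hp(hp_max_current, hp_rated):
    """Share of rated capacity lost, clamped to the 0-100 range."""
    if hp_rated <= 0:
        raise ValueError("hp_rated must be positive")
    pct = 100.0 * (1.0 - hp_max_current / hp_rated)
    return min(100.0, max(0.0, pct))


# Full capacity, half capacity, and no capacity on a 2000 HP asset
for hp_max in (2000.0, 1000.0, 0.0):
    pct = percent_clogged_from_hp(hp_max, hp_rated=2000.0)
    print(f"HP_max_current = {hp_max:.0f} -> percent_clogged = {pct:.0f}%")
```

The clamp keeps the metric inside 0 to 100 even if an inverse lookup returns a value slightly above the rated HP.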

In [None]:
def calculate_clogging(asset_type, hp, restriction, all_models, limits):
    """
    Calculate clogging metrics for a sensor reading.
    
    Args:
        asset_type (str): Asset identifier
        hp (float): Current hydraulic horsepower reading
        restriction (float): Current air filter restriction reading
        all_models (dict): Dictionary of fitted models
        limits (DataFrame): Asset limits (max HP, max restriction)
        
    Returns:
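        dict: {
            "delta": Excess restriction beyond baseline,
            "HP_max_current": Maximum HP achievable at current clogging,
            "percent_clogged": Percentage of capacity lost,
            "status": Severity label (GOOD, MONITOR, WARNING, or CRITICAL),
            "action": Recommended maintenance action,
            "capacity_remaining": 100 - percent_clogged
        }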
    """
    asset_type = str(asset_type)
    
    # Validate inputs
    if asset_type not in limits.index:
        raise ValueError(f"Asset type {asset_type} not found in limits file.")
    if asset_type not in all_models:
        raise ValueError(f"Asset type {asset_type} not found in models file.")
    
    # Get asset limits
    max_restriction = float(limits.loc[asset_type, "Max_AirFilterRestriction"])
    max_horsepower = float(limits.loc[asset_type, "Max_Horsepower"])
    model = all_models[asset_type]
    
    print("="*70)
    print(f"CLOGGING ANALYSIS: Asset {asset_type}")
    print("="*70)
    print(f"Current Reading: HP = {hp:.1f}, Restriction = {restriction:.2f} inH₂O")
    print(f"Asset Limits: Max HP = {max_horsepower}, Max Restriction = {max_restriction}")
    print()
    
    # ============================================================
    # STEP 1: Calculate Delta
    # ============================================================
    print("STEP 1: Calculate Delta (excess restriction)")
    print("-" * 70)
    
    r_clean_hp = predict_restriction_from_hp(asset_type, hp, all_models)
    delta = max(0.0, restriction - r_clean_hp)
    
    print(f"  R_clean({hp:.1f}) = {r_clean_hp:.2f} inH₂O (expected for clean filter)")
    print(f"  R_actual = {restriction:.2f} inH₂O (measured)")
    print(f"  Δ = max(0, R_actual - R_clean) = max(0, {restriction:.2f} - {r_clean_hp:.2f}) = {delta:.3f} inH₂O")
    
    if delta == 0:
        print(f"  → Filter is CLEAN (operating on baseline)")
    elif delta < 2:
        print(f"  → Filter has MINOR clogging")
    elif delta < 5:
        print(f"  → Filter has MODERATE clogging")
    else:
        print(f"  → Filter has SEVERE clogging")
    print()
    
    # ============================================================
    # STEP 2: Solve for HP_max_current
    # ============================================================
    print("STEP 2: Calculate HP_max_current (remaining capacity)")
    print("-" * 70)
    
    # We need: R_clean(HP_max_current) + delta = max_restriction
    # Therefore: R_clean(HP_max_current) = max_restriction - delta
    target_restriction = max_restriction - delta
    
    print(f"  Equation: R_clean(HP_max_current) + Δ = max_restriction")
    print(f"  Solve for: R_clean(HP_max_current) = {max_restriction} - {delta:.3f} = {target_restriction:.2f}")
    print()
    
    # Check if target is achievable
    min_r_possible = model['min_restriction_fitted']
    max_r_possible = predict_restriction_from_hp(asset_type, max_horsepower, all_models)
    
    print(f"  Model restriction range: {min_r_possible:.2f} to {max_r_possible:.2f} inH₂O")
    
    if target_restriction <= min_r_possible:
        print(f"  → Target below model minimum, using min HP")
        hp_max_current = model['min_hp_fitted']
    elif target_restriction >= max_r_possible:
        print(f"  → Target at/above max HP restriction, using max HP")
        hp_max_current = max_horsepower
    else:
        print(f"  → Solving inverse: HP such that R_clean(HP) = {target_restriction:.2f}")
        hp_max_current = get_hp_from_restriction(asset_type, target_restriction, all_models)
    
    print(f"\n  HP_max_current = {hp_max_current:.2f} HP")
    print(f"  (Maximum HP achievable before hitting restriction limit)")
    print()
    
    # ============================================================
    # STEP 3: Calculate percent_clogged
    # ============================================================
    print("STEP 3: Calculate percent_clogged (capacity loss)")
    print("-" * 70)
    
    percent_clogged = 100 * (1 - hp_max_current / max_horsepower)
    percent_clogged = np.clip(percent_clogged, 0, 100)
    
    print(f"  percent_clogged = 100 × (1 - HP_max_current / max_HP)")
    print(f"  percent_clogged = 100 × (1 - {hp_max_current:.2f} / {max_horsepower})")
    print(f"  percent_clogged = {percent_clogged:.2f}%")
    print()
    
    capacity_remaining = 100 - percent_clogged
    print(f"  → Capacity remaining: {capacity_remaining:.1f}%")
    
    if percent_clogged < 10:
        status = "GOOD"
        action = "Continue normal operation"
    elif percent_clogged < 30:
        status = "MONITOR"
        action = "Schedule inspection"
    elif percent_clogged < 60:
        status = "WARNING"
        action = "Plan filter replacement"
    else:
        status = "CRITICAL"
        action = "Replace filter immediately"
    
    print(f"  → Status: {status}")
    print(f"  → Recommended Action: {action}")
    
    print("="*70)
    print()
    
    return {
        "delta": delta,
        "HP_max_current": hp_max_current,
        "percent_clogged": percent_clogged,
        "status": status,
        "action": action,
        "capacity_remaining": capacity_remaining
    }

## Example 1: Clean Filter Operation

Testing with a reading that should be on the baseline.

In [None]:
# Select an asset for testing
test_asset = list(all_models.keys())[0]

# Simulate a clean filter reading (generate from baseline)
test_hp_clean = 1000
r_baseline = predict_restriction_from_hp(test_asset, test_hp_clean, all_models)

print("Simulating clean filter reading:")
print(f"  HP = {test_hp_clean}")
print(f"  Restriction = {r_baseline:.2f} (on baseline)")
print()

result_clean = calculate_clogging(test_asset, test_hp_clean, r_baseline, all_models, limits)

print("\nRESULTS:")
print(f"  Delta: {result_clean['delta']:.3f} inH₂O")
print(f"  HP_max_current: {result_clean['HP_max_current']:.1f} HP")
print(f"  Percent Clogged: {result_clean['percent_clogged']:.2f}%")

## Example 2: Moderately Clogged Filter

Simulating a filter with 5 inH₂O excess restriction.

In [None]:
test_hp_moderate = 1000
r_baseline_moderate = predict_restriction_from_hp(test_asset, test_hp_moderate, all_models)
r_actual_moderate = r_baseline_moderate + 5.0  # Add 5 inH₂O clogging

print("Simulating moderately clogged filter:")
print(f"  HP = {test_hp_moderate}")
print(f"  Restriction (baseline) = {r_baseline_moderate:.2f}")
print(f"  Restriction (actual) = {r_actual_moderate:.2f}")
print(f"  Added clogging = 5.0 inH₂O")
print()

result_moderate = calculate_clogging(test_asset, test_hp_moderate, r_actual_moderate, all_models, limits)

print("\nRESULTS:")
print(f"  Delta: {result_moderate['delta']:.3f} inH₂O")
print(f"  HP_max_current: {result_moderate['HP_max_current']:.1f} HP")
print(f"  Percent Clogged: {result_moderate['percent_clogged']:.2f}%")
print(f"  Capacity Remaining: {result_moderate['capacity_remaining']:.1f}%")

## Example 3: Severely Clogged Filter

Simulating a filter with 10 inH₂O excess restriction.

In [None]:
test_hp_severe = 1200
r_baseline_severe = predict_restriction_from_hp(test_asset, test_hp_severe, all_models)
r_actual_severe = r_baseline_severe + 10.0  # Add 10 inH₂O clogging

print("Simulating severely clogged filter:")
print(f"  HP = {test_hp_severe}")
print(f"  Restriction (baseline) = {r_baseline_severe:.2f}")
print(f"  Restriction (actual) = {r_actual_severe:.2f}")
print(f"  Added clogging = 10.0 inH₂O")
print()

result_severe = calculate_clogging(test_asset, test_hp_severe, r_actual_severe, all_models, limits)

print("\nRESULTS:")
print(f"  Delta: {result_severe['delta']:.3f} inH₂O")
print(f"  HP_max_current: {result_severe['HP_max_current']:.1f} HP")
print(f"  Percent Clogged: {result_severe['percent_clogged']:.2f}%")
print(f"  Capacity Remaining: {result_severe['capacity_remaining']:.1f}%")

## Summary: Comparing Clogging Scenarios

Side-by-side comparison of clean, moderate, and severe clogging.

In [None]:
# Create comparison dataframe
comparison = pd.DataFrame({
    'Scenario': ['Clean Filter', 'Moderate Clogging', 'Severe Clogging'],
    'HP': [test_hp_clean, test_hp_moderate, test_hp_severe],
    'Restriction (inH₂O)': [
        r_baseline, 
        r_actual_moderate, 
        r_actual_severe
    ],
    'Delta (inH₂O)': [
        result_clean['delta'],
        result_moderate['delta'],
        result_severe['delta']
    ],
    'HP_max_current': [
        result_clean['HP_max_current'],
        result_moderate['HP_max_current'],
        result_severe['HP_max_current']
    ],
    'Percent Clogged': [
        result_clean['percent_clogged'],
        result_moderate['percent_clogged'],
        result_severe['percent_clogged']
    ],
    'Status': [
        result_clean['status'],
        result_moderate['status'],
        result_severe['status']
    ]
})

print(comparison.to_string(index=False))

# Visualize
import matplotlib.pyplot as plt

fig, axes = plt.subplots(1, 3, figsize=(15, 5))

# Delta comparison
axes[0].bar(comparison['Scenario'], comparison['Delta (inH₂O)'], color=['green', 'orange', 'red'])
axes[0].set_ylabel('Delta (inH₂O)')
axes[0].set_title('Excess Restriction (Δ)')
axes[0].tick_params(axis='x', rotation=45)

# HP_max_current comparison
max_hp = limits.loc[test_asset, 'Max_Horsepower']
axes[1].bar(comparison['Scenario'], comparison['HP_max_current'], color=['green', 'orange', 'red'])
axes[1].axhline(max_hp, color='black', linestyle='--', label=f'Max HP ({max_hp})')
axes[1].set_ylabel('HP')
axes[1].set_title('Remaining Max Capacity')
axes[1].legend()
axes[1].tick_params(axis='x', rotation=45)

# Percent clogged comparison
axes[2].bar(comparison['Scenario'], comparison['Percent Clogged'], color=['green', 'orange', 'red'])
axes[2].set_ylabel('Percent (%)')
axes[2].set_title('Capacity Lost')
axes[2].set_ylim(0, 100)
axes[2].tick_params(axis='x', rotation=45)

plt.tight_layout()
plt.show()

# Summary: Air Filter Clogging Detection

## What We Built
A complete predictive maintenance solution for air filter monitoring:

### 1. **Clean Baseline Models** (Previous Notebook)
- Isotonic regression for monotonic HP → Restriction relationship
- Linear extrapolation to cover full operating range
- Asset-specific models for each machine type

### 2. **Clogging Detection Algorithm** (This Notebook)
- **Delta (Δ)**: Quantifies excess restriction beyond baseline
- **HP_max_current**: Calculates remaining operational capacity
- **percent_clogged**: Translates to actionable capacity loss metric

## Key Innovations
1. **Forced extrapolation**: Ensures models cover full operating envelope despite sparse data
2. **Inverse function**: Solves for HP from restriction to calculate remaining capacity
3. **Capacity-based metric**: percent_clogged is intuitive for operations teams

## Mathematical Foundation

1. `R_clean(HP)` is defined piecewise:
   - `R_clean(HP) = isotonic_model(HP)` if `HP ≤ HP_fitted_max`
   - `R_clean(HP) = slope × HP + intercept` if `HP > HP_fitted_max`
2. `Δ = max(0, R_actual - R_clean(HP_current))`
3. Solve: `R_clean(HP_max_current) = R_max - Δ`
4. `percent_clogged = 100 × (1 - HP_max_current / HP_max_rated)`


## Production Deployment
- REST API for real-time scoring
- Integration with SCADA/IoT platforms
- Dashboard for fleet-wide monitoring
- Automated alerting system

## Business Impact
- **Prevents downtime**: Early warning system
- **Optimizes maintenance**: Replace when needed, not on schedule
- **Quantifies capacity**: Operations understand constraints
- **ROI**: Typical 200-500% return on investment

## Next Steps
1. **Pilot deployment**: Select 10-20 assets for initial rollout
2. **Validation period**: 3-6 months to tune thresholds
3. **Feedback loop**: Incorporate maintenance team input
4. **Scale**: Roll out to full fleet
5. **Iterate**: Continuous improvement based on production data

---

## Questions for Discussion
1. What alert thresholds should we use? (Currently: 30% for WARNING and 60% for CRITICAL, with MONITOR starting at 10%)
2. How should we handle seasonal/environmental adjustments?
3. What's the integration plan with existing maintenance workflows?
4. How do we collect ground truth data for ongoing validation?