# 124: Feature Store Implementation

## 🎯 Learning Objectives

By the end of this notebook, you will:
- **Understand** feature stores and their role in ML production systems
- **Implement** offline and online feature serving patterns
- **Build** a feature store using the Feast framework
- **Apply** feature stores to post-silicon validation workflows
- **Evaluate** feature consistency across training and serving
- **Design** feature engineering pipelines for production scale

## 📚 What is a Feature Store?

A **feature store** is a centralized repository for storing, managing, and serving ML features. It solves the critical problem of **training-serving skew** by ensuring features computed during training are identical to features used during inference. Feature stores provide versioning, point-in-time correctness, and scalable serving for both batch and real-time predictions.

**Why Feature Stores?**
- ✅ **Eliminates training-serving skew** (features computed identically)
- ✅ **Reduces feature engineering duplication** (define once, use everywhere)
- ✅ **Ensures point-in-time correctness** (no data leakage from the future)
- ✅ **Enables feature reuse** (share features across models and teams)
- ✅ **Provides feature versioning** (reproducible experiments)
- ✅ **Supports low-latency serving** (pre-computed features for real-time)

**Core Components:**
1. **Offline Store**: Historical features for model training (data warehouse, data lake)
2. **Online Store**: Low-latency features for real-time inference (Redis, DynamoDB)
3. **Feature Registry**: Metadata catalog (definitions, schemas, lineage)
4. **Feature Computation Engine**: Transforms raw data into features

## 🏭 Post-Silicon Validation Use Cases

**Centralized Device Feature Store**
- Input: Raw STDF test data (voltage, current, frequency, temperature)
- Features: Aggregated statistics (mean/std/percentile per device, wafer, lot)
- Output: Consistent features for yield prediction, binning, anomaly detection
- Value: Single source of truth for device characterization, eliminates duplicate feature engineering across 10+ models
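
The per-lot aggregation described above can be sketched without any dependencies. This is a minimal illustration, not the notebook's implementation: `aggregate_by`, the record layout, and the `lot_id`/`wafer_id` values are hypothetical.

```python
from collections import defaultdict
from statistics import mean, median, stdev

def aggregate_by(records, key, field):
    """Group records by `key` and summarize `field` within each group."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec[field])
    return {
        group: {
            f"{field}_mean": mean(values),
            # Sample std needs at least two readings; fall back to 0.0
            f"{field}_std": stdev(values) if len(values) > 1 else 0.0,
            f"{field}_median": median(values),
        }
        for group, values in groups.items()
    }

# Hypothetical test records (Vdd in volts)
records = [
    {"lot_id": "LOT_A", "wafer_id": "W01", "vdd": 1.20},
    {"lot_id": "LOT_A", "wafer_id": "W01", "vdd": 1.22},
    {"lot_id": "LOT_A", "wafer_id": "W02", "vdd": 1.18},
    {"lot_id": "LOT_B", "wafer_id": "W03", "vdd": 1.25},
]

lot_features = aggregate_by(records, "lot_id", "vdd")
wafer_features = aggregate_by(records, "wafer_id", "vdd")
print(lot_features["LOT_A"])
```

Because the same function serves every grouping level, each model that needs lot or wafer statistics reads one definition instead of re-deriving its own.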

**Real-Time Test Binning**
- Input: Device test parameters (Vdd=1.23V, Idd=245mA, Freq=2.4GHz)
- Features: Pre-computed percentile rankings, Z-scores vs lot distribution
- Output: Bin assignment in <50ms (PASS/FAIL/BIN1/BIN2)
- Value: Low-latency online serving enables inline binning during test, reduces retest cycles
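
To make the pre-computed features concrete, here is a small sketch of scoring one device against its lot. It assumes the lot statistics are computed ahead of time and only the lookup runs at test time; `build_lot_reference`, `score_device`, and the readings are hypothetical.

```python
from bisect import bisect_right
from statistics import mean, stdev

def build_lot_reference(values):
    """Pre-compute the lot statistics needed to score a new device."""
    return {"mean": mean(values), "std": stdev(values), "sorted": sorted(values)}

def score_device(value, lot_ref):
    """Return the Z-score and percentile rank of one reading against its lot."""
    z = (value - lot_ref["mean"]) / lot_ref["std"]
    # Fraction of lot readings that are <= this reading
    rank = bisect_right(lot_ref["sorted"], value) / len(lot_ref["sorted"])
    return {"zscore": z, "percentile_rank": rank}

# Hypothetical Idd readings (mA) for one lot
lot_idd = [240.0, 245.0, 250.0, 255.0, 260.0]
lot_ref = build_lot_reference(lot_idd)

features = score_device(255.0, lot_ref)
print(features)
```

The expensive part (sorting and summarizing the lot) happens once per lot update, so the per-device work is a subtraction, a division, and a binary search.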

**Wafer-Level Feature Engineering**
- Input: Die-level test results (die_x, die_y, parametric values)
- Features: Spatial aggregations (neighboring die statistics, radial patterns, edge effects)
- Output: Wafer-aware features for spatial correlation models
- Value: Point-in-time correctness ensures training features match production wafer maps
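
A rough sketch of the spatial aggregations, assuming die results are keyed by integer `(die_x, die_y)` coordinates with the wafer center at the origin. The `spatial_features` helper and the 3x3 patch are illustrative only.

```python
from math import hypot

def spatial_features(die_values, die_x, die_y, center=(0, 0)):
    """Compute neighbor and radial features for one die.

    die_values maps (x, y) -> measurement for every tested die.
    """
    # 8-connected neighbors that actually exist on the wafer map
    neighbors = [
        die_values[(die_x + dx, die_y + dy)]
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0) and (die_x + dx, die_y + dy) in die_values
    ]
    return {
        "neighbor_mean": sum(neighbors) / len(neighbors) if neighbors else None,
        "neighbor_count": len(neighbors),  # fewer than 8 suggests an edge die
        "radius": hypot(die_x - center[0], die_y - center[1]),
    }

# Hypothetical 3x3 patch of Vdd readings with a gradient along x
wafer = {(x, y): 1.20 + 0.01 * x for x in (-1, 0, 1) for y in (-1, 0, 1)}

center_die = spatial_features(wafer, 0, 0)
corner_die = spatial_features(wafer, 1, 1)
print(center_die)
print(corner_die)
```

The neighbor count doubles as a simple edge indicator, which is one way to expose edge effects to a model without a separate mask.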

**Feature Versioning for Experiments**
- Input: Multiple feature engineering approaches (raw params, polynomial, interactions)
- Features: Version-controlled feature sets (v1: raw, v2: +polynomials, v3: +interactions)
- Output: Reproducible experiments with consistent feature definitions
- Value: Track which feature set version achieved best yield prediction accuracy
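
One possible way to keep the v1/v2/v3 feature sets side by side is a small version registry. This is a sketch under the assumption that each version builds on the previous one; `FEATURE_SETS`, `feature_set`, and `compute` are hypothetical names.

```python
FEATURE_SETS = {}

def feature_set(version):
    """Decorator that registers a feature function under a version tag."""
    def register(fn):
        FEATURE_SETS[version] = fn
        return fn
    return register

@feature_set("v1")
def raw_features(row):
    # v1: raw parametrics only
    return {"vdd": row["vdd"], "idd": row["idd"]}

@feature_set("v2")
def polynomial_features(row):
    # v2: v1 plus squared terms
    feats = raw_features(row)
    feats.update({"vdd_sq": row["vdd"] ** 2, "idd_sq": row["idd"] ** 2})
    return feats

@feature_set("v3")
def interaction_features(row):
    # v3: v2 plus the pairwise interaction
    feats = polynomial_features(row)
    feats["vdd_x_idd"] = row["vdd"] * row["idd"]
    return feats

def compute(version, row):
    """Compute the feature set registered under `version`."""
    return FEATURE_SETS[version](row)

row = {"vdd": 1.2, "idd": 250.0}
for version in ("v1", "v2", "v3"):
    print(version, sorted(compute(version, row)))
```

Logging the version tag alongside each experiment's metrics is then enough to reproduce which feature definitions produced a given result.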

## 🔄 Feature Store Workflow

```mermaid
graph TB
    subgraph "Feature Definition"
        A[Raw Data Sources] --> B[Feature Engineering Logic]
        B --> C[Feature Registry]
    end
    
    subgraph "Offline Flow - Training"
        C --> D[Historical Feature Store]
        D --> E[Point-in-Time Join]
        E --> F[Training Dataset]
        F --> G[Model Training]
    end
    
    subgraph "Online Flow - Serving"
        C --> H[Online Feature Store]
        I[Real-Time Request] --> H
        H --> J[Feature Vector]
        J --> K[Model Inference]
        K --> L[Prediction]
    end
    
    subgraph "Feature Computation"
        M[Batch Pipeline] --> D
        N[Stream Pipeline] --> H
    end
    
    style A fill:#e1f5ff
    style D fill:#fff5e1
    style H fill:#ffe1e1
    style G fill:#e1ffe1
    style L fill:#e1ffe1
```
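
The point-in-time join in the offline flow is the step that prevents leakage: each training event may only see the latest feature value recorded at or before its own timestamp. A minimal pure-Python sketch follows; `point_in_time_lookup`, the device ID, and the feature history are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime

def point_in_time_lookup(history, event_time):
    """Return the latest value whose timestamp is <= event_time, else None.

    history is a list of (timestamp, value) pairs sorted by timestamp.
    """
    timestamps = [ts for ts, _ in history]
    pos = bisect_right(timestamps, event_time)
    return history[pos - 1][1] if pos else None

# Hypothetical feature history: a lot-mean Vdd recomputed over time
history = {
    "DEV00001": [
        (datetime(2024, 1, 1), 1.19),
        (datetime(2024, 1, 5), 1.21),
        (datetime(2024, 1, 9), 1.24),
    ]
}

# Training events with the time at which each label was observed
events = [
    ("DEV00001", datetime(2023, 12, 31)),  # before any feature exists
    ("DEV00001", datetime(2024, 1, 5)),    # exactly at a feature timestamp
    ("DEV00001", datetime(2024, 1, 7)),    # between two feature timestamps
]

training_rows = [
    (device_id, ts, point_in_time_lookup(history[device_id], ts))
    for device_id, ts in events
]
for row in training_rows:
    print(row)
```

The event on January 7 receives the January 5 value rather than the January 9 one, which is the behavior a plain key join cannot provide.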

## 📊 Learning Path Context

**Prerequisites:**
- **121_MLOps_Fundamentals.ipynb** - MLOps lifecycle, experiment tracking
- **122_MLflow_Complete_Guide.ipynb** - Model versioning and deployment
- **123_Model_Monitoring_Drift_Detection.ipynb** - Feature drift detection

**Next Steps:**
- **125_ML_Testing_Validation.ipynb** - Unit/integration testing for ML
- **126_CI_CD_for_ML.ipynb** - Automated ML pipelines
- **127_Model_Serving_Patterns.ipynb** - Deployment architectures

---

Let's build production feature stores! 🚀

In [None]:
# Install feature store libraries
# !pip install feast pandas numpy scikit-learn pyyaml

import pandas as pd
import numpy as np
from datetime import datetime, timedelta
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, f1_score
import warnings
warnings.filterwarnings('ignore')

print("Feature store libraries loaded")
print("Focus: Centralized feature management, training/serving consistency")

In [None]:
# Simple feature store implementation (conceptual)
class SimpleFeatureStore:
    """
    Minimal feature store demonstrating core concepts.
    
    Components:
    - Feature registry (metadata)
    - Offline store (historical features)
    - Online store (real-time features)
    """
    
    def __init__(self):
        self.feature_registry = {}  # Metadata catalog
        self.offline_store = {}     # Historical features (pandas DataFrames)
        self.online_store = {}      # Latest features (key-value)
    
    def register_feature(self, name, description, feature_type, computation_fn):
        """Register feature definition in catalog."""
        self.feature_registry[name] = {
            'description': description,
            'type': feature_type,
            'computation': computation_fn,
            'created_at': datetime.now()
        }
        print(f"✅ Registered feature: {name}")
    
    def materialize_offline(self, feature_name, data):
        """
        Compute and store historical features for training.
        
        Args:
            feature_name: Feature to materialize
            data: Raw data (DataFrame with 'timestamp' column)
        """
        if feature_name not in self.feature_registry:
            raise ValueError(f"Feature {feature_name} not registered")
        
        computation_fn = self.feature_registry[feature_name]['computation']
        features = computation_fn(data)
        
        # Store with timestamp for point-in-time correctness
        self.offline_store[feature_name] = features
        print(f"✅ Materialized {len(features)} offline records for {feature_name}")
    
    def materialize_online(self, feature_name, entity_key, features):
        """
        Update online store for low-latency serving.
        
        Args:
            feature_name: Feature to update
            entity_key: Entity identifier (device_id, customer_id, etc.)
            features: Feature values
        """
        if feature_name not in self.online_store:
            self.online_store[feature_name] = {}
        
        self.online_store[feature_name][entity_key] = features
        print(f"✅ Updated online features for {entity_key}")
    
    def get_historical_features(self, feature_names, entity_df, event_timestamp_col='timestamp'):
        """
        Get point-in-time correct features for training.
        
        Args:
            feature_names: List of features to retrieve
            entity_df: DataFrame with entity IDs and timestamps
            event_timestamp_col: Timestamp column name
        
        Returns:
            DataFrame with features joined to entities
        """
        result = entity_df.copy()
        
        for feature_name in feature_names:
            if feature_name not in self.offline_store:
                print(f"⚠️ Feature {feature_name} not found in offline store")
                continue
            
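            # Point-in-time join: for each entity row, take the most recent
            # feature row whose timestamp is <= the event timestamp
            feature_data = self.offline_store[feature_name]
            
            # merge_asof needs both sides sorted on the time key; the feature
            # timestamp is renamed so it does not collide with the event column
            result = pd.merge_asof(
                result.sort_values(event_timestamp_col),
                feature_data.sort_values('timestamp').rename(columns={'timestamp': '_feature_ts'}),
                left_on=event_timestamp_col,
                right_on='_feature_ts',
                by='device_id',
                direction='backward'
            ).drop(columns='_feature_ts')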
        
        return result
    
    def get_online_features(self, feature_names, entity_keys):
        """
        Get latest features for real-time inference.
        
        Args:
            feature_names: List of features to retrieve
            entity_keys: List of entity IDs
        
        Returns:
            Dictionary of features per entity
        """
        results = {}
        
        for entity_key in entity_keys:
            entity_features = {}
            
            for feature_name in feature_names:
                if feature_name in self.online_store and entity_key in self.online_store[feature_name]:
                    entity_features[feature_name] = self.online_store[feature_name][entity_key]
                else:
                    entity_features[feature_name] = None
            
            results[entity_key] = entity_features
        
        return results

# Initialize feature store
fs = SimpleFeatureStore()
print("Simple feature store initialized")
print(f"Components: Registry, Offline Store, Online Store")

In [None]:
# Generate synthetic STDF test data
np.random.seed(42)

def generate_stdf_data(n_devices=1000):
    """Generate synthetic device test data."""
    
    # Base timestamp
    base_time = datetime.now() - timedelta(days=30)
    
    # Generate device IDs and timestamps
    device_ids = [f"DEV{i:05d}" for i in range(n_devices)]
    timestamps = [base_time + timedelta(hours=i*0.5) for i in range(n_devices)]
    
    # Parametric test measurements
    vdd = np.random.normal(1.2, 0.02, n_devices)      # Voltage (V)
    idd = np.random.normal(250, 15, n_devices)         # Current (mA)
    freq = np.random.normal(2.5, 0.1, n_devices)       # Frequency (GHz)
    temp = np.random.normal(85, 5, n_devices)          # Temperature (°C)
    
    # Yield labels (based on parametric limits)
    yield_pass = (
        (vdd >= 1.15) & (vdd <= 1.25) &
        (idd <= 280) &
        (freq >= 2.3) &
        (temp <= 100)
    ).astype(int)
    
    data = pd.DataFrame({
        'device_id': device_ids,
        'timestamp': timestamps,
        'vdd': vdd,
        'idd': idd,
        'freq': freq,
        'temp': temp,
        'yield': yield_pass
    })
    
    return data

# Generate data
raw_data = generate_stdf_data(n_devices=1000)

print("✅ Generated synthetic STDF data")
print(f"\nDataset shape: {raw_data.shape}")
print(f"Date range: {raw_data['timestamp'].min()} to {raw_data['timestamp'].max()}")
print(f"Yield rate: {raw_data['yield'].mean():.1%}")
print("\nFirst 5 records:")
print(raw_data.head())

In [None]:
# Define feature transformations

def compute_basic_features(data):
    """Basic parametric features (raw values)."""
    return data[['device_id', 'timestamp', 'vdd', 'idd', 'freq', 'temp']].copy()

def compute_power_features(data):
    """Derived power features."""
    result = data[['device_id', 'timestamp']].copy()
    result['power'] = data['vdd'] * data['idd']  # Power (mW)
    result['power_efficiency'] = data['freq'] / result['power']  # GHz/mW
    return result

def compute_zscore_features(data):
    """Normalized Z-scores vs population."""
    result = data[['device_id', 'timestamp']].copy()
    result['vdd_zscore'] = (data['vdd'] - data['vdd'].mean()) / data['vdd'].std()
    result['idd_zscore'] = (data['idd'] - data['idd'].mean()) / data['idd'].std()
    result['freq_zscore'] = (data['freq'] - data['freq'].mean()) / data['freq'].std()
    return result

def compute_aggregate_features(data):
    """Rolling aggregate features (last 100 devices)."""
    result = data[['device_id', 'timestamp']].copy()
    result['vdd_rolling_mean'] = data['vdd'].rolling(window=100, min_periods=1).mean()
    result['idd_rolling_std'] = data['idd'].rolling(window=100, min_periods=1).std()
    result['freq_rolling_median'] = data['freq'].rolling(window=100, min_periods=1).median()
    return result

# Register features in store
fs.register_feature(
    name='basic_params',
    description='Raw parametric measurements from STDF',
    feature_type='batch',
    computation_fn=compute_basic_features
)

fs.register_feature(
    name='power_metrics',
    description='Derived power consumption features',
    feature_type='batch',
    computation_fn=compute_power_features
)

fs.register_feature(
    name='zscore_normalized',
    description='Population-normalized Z-scores',
    feature_type='batch',
    computation_fn=compute_zscore_features
)

fs.register_feature(
    name='rolling_aggregates',
    description='Rolling window statistics',
    feature_type='streaming',
    computation_fn=compute_aggregate_features
)

print("\n✅ Registered 4 feature definitions in feature store")
print(f"Feature registry size: {len(fs.feature_registry)}")
print("\nRegistered features:")
for name, meta in fs.feature_registry.items():
    print(f"  - {name}: {meta['description']}")

In [None]:
# Materialize features for offline training

# Compute all features
fs.materialize_offline('basic_params', raw_data)
fs.materialize_offline('power_metrics', raw_data)
fs.materialize_offline('zscore_normalized', raw_data)
fs.materialize_offline('rolling_aggregates', raw_data)

print("\n" + "="*60)
print("OFFLINE FEATURE STORE STATUS")
print("="*60)

for feature_name, feature_data in fs.offline_store.items():
    print(f"\n{feature_name}:")
    print(f"  Records: {len(feature_data)}")
    print(f"  Columns: {list(feature_data.columns)}")
    print(f"  Memory: {feature_data.memory_usage(deep=True).sum() / 1024:.1f} KB")

# Create training dataset by joining features
print("\n" + "="*60)
print("TRAINING DATASET CREATION")
print("="*60)

# Entity DataFrame (devices we want features for)
entity_df = raw_data[['device_id', 'timestamp', 'yield']].copy()

# Get historical features (point-in-time join)
training_features = fs.get_historical_features(
    feature_names=['basic_params', 'power_metrics', 'zscore_normalized'],
    entity_df=entity_df,
    event_timestamp_col='timestamp'
)

print(f"\n✅ Training dataset created")
print(f"Shape: {training_features.shape}")
print(f"Columns: {list(training_features.columns)}")
print(f"\nSample records:")
print(training_features.head(3))

In [None]:
# Train model using feature store features

# Select feature columns (exclude metadata)
feature_cols = ['vdd', 'idd', 'freq', 'temp', 'power', 'power_efficiency', 
                'vdd_zscore', 'idd_zscore', 'freq_zscore']

# Prepare training data
X = training_features[feature_cols].fillna(0)  # Handle any missing values
y = training_features['yield']

# Split data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Train model
model = RandomForestClassifier(n_estimators=100, random_state=42, max_depth=10)
model.fit(X_train, y_train)

# Evaluate
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)

print("="*60)
print("MODEL TRAINING RESULTS")
print("="*60)
print(f"\nDataset:")
print(f"  Training samples: {len(X_train)}")
print(f"  Test samples: {len(X_test)}")
print(f"  Features used: {len(feature_cols)}")

print(f"\nPerformance:")
print(f"  Accuracy: {accuracy:.3f}")
print(f"  F1 Score: {f1:.3f}")

# Feature importance
feature_importance = pd.DataFrame({
    'feature': feature_cols,
    'importance': model.feature_importances_
}).sort_values('importance', ascending=False)

print(f"\nTop 5 Most Important Features:")
for idx, row in feature_importance.head(5).iterrows():
    print(f"  {row['feature']:20s}: {row['importance']:.4f}")

print(f"\n✅ Model trained successfully using feature store features")

In [None]:
# Simulate online feature store population

# Select latest features for a subset of devices (production devices)
production_devices = raw_data.tail(20)  # Last 20 devices represent production

print("="*60)
print("ONLINE FEATURE STORE POPULATION")
print("="*60)

# Z-scores are relative to the full population, so compute them once up front
zscore = compute_zscore_features(raw_data)

for idx, device in production_devices.iterrows():
    device_id = device['device_id']
    
    # Reuse the registered transformation so online values match offline ones
    power = compute_power_features(pd.DataFrame([device])).iloc[0]
    
    # Assemble the feature vector for this device
    device_features = {
        'vdd': device['vdd'],
        'idd': device['idd'],
        'freq': device['freq'],
        'temp': device['temp'],
        'power': power['power'],
        'power_efficiency': power['power_efficiency'],
        'vdd_zscore': zscore.loc[idx, 'vdd_zscore'],
        'idd_zscore': zscore.loc[idx, 'idd_zscore'],
        'freq_zscore': zscore.loc[idx, 'freq_zscore']
    }
    
    # Update online store
    fs.materialize_online('device_features', device_id, device_features)

print(f"\n✅ Populated online store for {len(production_devices)} production devices")
print(f"Online store size: {len(fs.online_store['device_features'])} entities")

# Test online feature retrieval (low-latency lookup)
print("\n" + "="*60)
print("ONLINE FEATURE RETRIEVAL TEST")
print("="*60)

# Simulate production inference request
test_device_ids = ['DEV00980', 'DEV00985', 'DEV00990']

# Retrieve features (this would be <10ms in production with Redis)
online_features = fs.get_online_features(
    feature_names=['device_features'],
    entity_keys=test_device_ids
)

print(f"\n✅ Retrieved online features for {len(test_device_ids)} devices")

for device_id, features in online_features.items():
    print(f"\n{device_id}:")
    if features['device_features']:
        for key, value in features['device_features'].items():
            print(f"  {key:20s}: {value:.4f}" if isinstance(value, float) else f"  {key:20s}: {value}")
    else:
        print("  ⚠️ Features not found in online store")

In [None]:
# Production inference simulation

def predict_with_online_features(device_ids, feature_store, model):
    """
    Simulate production inference using online feature store.
    
    Args:
        device_ids: List of device IDs to predict
        feature_store: Initialized feature store
        model: Trained ML model
    
    Returns:
        Predictions and latency statistics
    """
    import time
    
    predictions = []
    latencies = []
    
    for device_id in device_ids:
        start_time = time.time()
        
        # 1. Retrieve features from online store (Redis in production)
        features = feature_store.get_online_features(
            feature_names=['device_features'],
            entity_keys=[device_id]
        )
        
        # 2. Prepare feature vector
        device_features = features[device_id]['device_features']
        
        if device_features is None:
            predictions.append({'device_id': device_id, 'prediction': None, 'error': 'Features not found'})
            continue
        
        feature_vector = np.array([
            device_features['vdd'],
            device_features['idd'],
            device_features['freq'],
            device_features['temp'],
            device_features['power'],
            device_features['power_efficiency'],
            device_features['vdd_zscore'],
            device_features['idd_zscore'],
            device_features['freq_zscore']
        ]).reshape(1, -1)
        
        # 3. Model inference
        prediction = model.predict(feature_vector)[0]
        probability = model.predict_proba(feature_vector)[0]
        
        # 4. Record latency
        latency_ms = (time.time() - start_time) * 1000
        latencies.append(latency_ms)
        
        predictions.append({
            'device_id': device_id,
            'prediction': 'PASS' if prediction == 1 else 'FAIL',
            'confidence': probability.max(),  # probability of the predicted class
            'latency_ms': latency_ms
        })
    
    return predictions, latencies

# Test production inference
test_devices = ['DEV00980', 'DEV00985', 'DEV00990', 'DEV00995']

print("="*60)
print("PRODUCTION INFERENCE SIMULATION")
print("="*60)

predictions, latencies = predict_with_online_features(test_devices, fs, model)

print(f"\n✅ Completed {len(predictions)} predictions")
print(f"\nLatency Statistics:")
print(f"  Mean: {np.mean(latencies):.2f} ms")
print(f"  P50:  {np.percentile(latencies, 50):.2f} ms")
print(f"  P95:  {np.percentile(latencies, 95):.2f} ms")
print(f"  P99:  {np.percentile(latencies, 99):.2f} ms")

print(f"\nPrediction Results:")
print("-" * 60)
for pred in predictions:
    if 'error' in pred:
        print(f"{pred['device_id']}: ERROR - {pred['error']}")
    else:
        print(f"{pred['device_id']}: {pred['prediction']:5s} (confidence: {pred['confidence']:.3f}, latency: {pred['latency_ms']:.2f}ms)")

print("\n✅ Real-time inference using online feature store successful")
print("Note: A production Redis online store typically targets <10ms feature retrieval, depending on network and payload size")
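
Feature consistency between training and serving can also be checked directly by comparing the offline and online values for the same entity. The sketch below is a minimal, dependency-free version of such a check; `find_skew` and the sample rows are hypothetical, and the unit mix-up is an invented example of how skew can arise.

```python
from math import isclose

def find_skew(offline_row, online_row, rel_tol=1e-9):
    """Return the features whose offline and online values disagree."""
    mismatches = {}
    for name, offline_value in offline_row.items():
        online_value = online_row.get(name)
        # A missing online value counts as a mismatch
        if online_value is None or not isclose(offline_value, online_value, rel_tol=rel_tol):
            mismatches[name] = (offline_value, online_value)
    return mismatches

# Offline (training) features for one device
offline_row = {"vdd": 1.21, "idd": 248.0, "power": 1.21 * 248.0}

# Online features computed by the same logic
online_ok = {"vdd": 1.21, "idd": 248.0, "power": 1.21 * 248.0}

# Online features from a divergent pipeline that reports W instead of mW
online_bad = {"vdd": 1.21, "idd": 248.0, "power": 1.21 * 248.0 / 1000}

print(find_skew(offline_row, online_ok))
print(find_skew(offline_row, online_bad))
```

Running a check like this on a sample of entities after each materialization is one way to catch skew before it reaches the model.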

In [None]:
# Feast feature store conceptual structure
# (Would require feast installation and configuration)

feast_structure = """
FEAST FEATURE STORE STRUCTURE
==============================

1. Project Structure:
   feature_repo/
   ├── feature_store.yaml        # Configuration (offline/online stores)
   ├── entities.py                # Entity definitions (device, customer, etc.)
   ├── features.py                # Feature view definitions
   └── data_sources.py            # Data source connections

2. Entity Definition (entities.py):
   ```python
   from feast import Entity, ValueType
   
   device = Entity(
       name="device",
       value_type=ValueType.STRING,
       description="Semiconductor device entity"
   )
   ```

3. Feature View (features.py):
   ```python
   from feast import FeatureView, Field
   from feast.types import Float32, String
   from datetime import timedelta
   
   device_features = FeatureView(
       name="device_parametrics",
       entities=["device"],
       ttl=timedelta(days=30),
       schema=[
           Field(name="vdd", dtype=Float32),
           Field(name="idd", dtype=Float32),
           Field(name="freq", dtype=Float32),
           Field(name="temp", dtype=Float32),
           Field(name="power", dtype=Float32),
       ],
       source=device_data_source,  # From data_sources.py
       online=True
   )
   ```

4. Configuration (feature_store.yaml):
   ```yaml
   project: semiconductor_features
   registry: data/registry.db
   provider: local
   online_store:
       type: redis
       connection_string: localhost:6379
   offline_store:
       type: file
   ```

5. CLI Commands:
   ```bash
   # Initialize project
   feast init feature_repo
   
   # Register features
   feast -c feature_repo apply
   
   # Incrementally load features into the online store, from the last
   # materialization up to the given end timestamp
   feast -c feature_repo materialize-incremental 2024-01-01T00:00:00
   
   # Load features from an explicit time range into the online store
   feast -c feature_repo materialize 2024-01-01T00:00:00 2024-12-31T23:59:59
   ```

6. Feature Retrieval (Python SDK):
   ```python
   from feast import FeatureStore
   
   store = FeatureStore(repo_path="feature_repo/")
   
   # Offline (training)
   training_df = store.get_historical_features(
       entity_df=entity_df,
       features=["device_parametrics:vdd", "device_parametrics:idd"]
   ).to_df()
   
   # Online (inference)
   online_features = store.get_online_features(
       features=["device_parametrics:vdd", "device_parametrics:idd"],
       entity_rows=[{"device": "DEV00123"}]
   ).to_dict()
   ```

7. Key Benefits:
   - Point-in-time correctness (no data leakage)
   - Automatic online/offline consistency
   - Feature versioning and lineage
   - Scalable to billions of features
   - Production-ready (originated at Gojek; adopters include Shopify)
"""

print(feast_structure)

print("\n" + "="*60)
print("FEAST VS CUSTOM IMPLEMENTATION")
print("="*60)

comparison = pd.DataFrame({
    'Aspect': [
        'Point-in-time correctness',
        'Online/offline consistency',
        'Feature versioning',
        'Scalability',
        'Setup complexity',
        'Production readiness'
    ],
    'Custom Implementation': [
        'Manual implementation required',
        'Requires careful design',
        'Manual versioning',
        'Limited (single machine)',
        'Low (quick start)',
        'Requires significant work'
    ],
    'Feast': [
        'Built-in (automatic)',
        'Automatic guarantee',
        'Built-in registry',
        'High (distributed)',
        'Medium (config required)',
        'Production-ready'
    ]
})

print(comparison.to_string(index=False))
print("\n✅ Feast recommended for production feature stores")
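
The `ttl` argument in the feature view definitions above is meant to bound how stale a feature value may be when it is retrieved. The sketch below illustrates that freshness check on a plain key-value store. `TTLOnlineStore` and its methods are hypothetical and do not reproduce Feast's internals.

```python
from datetime import datetime, timedelta

class TTLOnlineStore:
    """Key-value online store that treats entries older than a TTL as missing."""

    def __init__(self, ttl):
        self.ttl = ttl
        self._data = {}

    def put(self, entity_key, features, event_time):
        # Keep the event time next to the values so freshness can be checked
        self._data[entity_key] = (event_time, features)

    def get(self, entity_key, now):
        entry = self._data.get(entity_key)
        if entry is None:
            return None
        event_time, features = entry
        # Stale features are reported as missing rather than served silently
        return features if now - event_time <= self.ttl else None

store = TTLOnlineStore(ttl=timedelta(hours=24))
written_at = datetime(2024, 1, 1, 8, 0)
store.put("DEV00123", {"vdd": 1.21}, event_time=written_at)

fresh = store.get("DEV00123", now=written_at + timedelta(hours=2))
stale = store.get("DEV00123", now=written_at + timedelta(hours=30))
print(fresh, stale)
```

Returning `None` for stale entries forces the caller to handle missing features explicitly, in the same way the inference code handles devices that were never materialized.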

In [None]:
# Feature store monitoring and validation

class FeatureStoreMonitor:
    """Monitor feature store health and quality."""
    
    def __init__(self, feature_store):
        self.fs = feature_store
        self.baseline_stats = {}
    
    def compute_feature_statistics(self, feature_name):
        """Compute statistics for a feature."""
        if feature_name not in self.fs.offline_store:
            return None
        
        data = self.fs.offline_store[feature_name]
        
        stats = {}
        for col in data.select_dtypes(include=[np.number]).columns:
            stats[col] = {
                'mean': data[col].mean(),
                'std': data[col].std(),
                'min': data[col].min(),
                'max': data[col].max(),
                'nulls': data[col].isnull().sum(),
                'count': len(data[col])
            }
        
        return stats
    
    def set_baseline(self, feature_name):
        """Set baseline statistics for drift detection."""
        stats = self.compute_feature_statistics(feature_name)
        if stats:
            self.baseline_stats[feature_name] = stats
            print(f"✅ Baseline set for {feature_name}")
    
    def check_feature_drift(self, feature_name, threshold=0.2):
        """
        Check if features have drifted from baseline.
        
        Args:
            feature_name: Feature to check
            threshold: Maximum allowed relative change (20%)
        
        Returns:
            Dictionary with drift alerts
        """
        if feature_name not in self.baseline_stats:
            return {'error': 'Baseline not set'}
        
        current_stats = self.compute_feature_statistics(feature_name)
        baseline = self.baseline_stats[feature_name]
        
        drift_alerts = {}
        
        for col in current_stats:
            current = current_stats[col]
            base = baseline[col]
            
            # Check mean drift
            mean_change = abs(current['mean'] - base['mean']) / (abs(base['mean']) + 1e-10)
            
            # Check std drift
            std_change = abs(current['std'] - base['std']) / (base['std'] + 1e-10)
            
            # Check null increase
            null_increase = current['nulls'] - base['nulls']
            
            if mean_change > threshold or std_change > threshold or null_increase > 0:
                drift_alerts[col] = {
                    'mean_drift': mean_change,
                    'std_drift': std_change,
                    'null_increase': null_increase,
                    'status': '🚨 DRIFT DETECTED' if mean_change > threshold else '⚠️ WARNING'
                }
        
        return drift_alerts
    
    def validate_feature_quality(self, feature_name):
        """Validate feature data quality."""
        stats = self.compute_feature_statistics(feature_name)
        
        if not stats:
            return {'error': 'Feature not found'}
        
        issues = []
        
        for col, col_stats in stats.items():
            # Check for excessive nulls
            null_rate = col_stats['nulls'] / col_stats['count']
            if null_rate > 0.1:
                issues.append(f"⚠️ {col}: {null_rate:.1%} null values (threshold: 10%)")
            
            # Check for zero variance (constant features)
            if col_stats['std'] < 1e-6:
                issues.append(f"⚠️ {col}: Nearly constant (std={col_stats['std']:.6f})")
            
            # Check for extreme values (basic outlier detection)
            std = col_stats['std']
            range_width = col_stats['max'] - col_stats['min']
            
            if range_width > 10 * std:
                issues.append(f"⚠️ {col}: Potential outliers (range >> std)")
        
        return issues if issues else ['✅ All quality checks passed']

# Initialize monitor
monitor = FeatureStoreMonitor(fs)

print("="*60)
print("FEATURE STORE MONITORING")
print("="*60)

# Set baseline for features
monitor.set_baseline('basic_params')
monitor.set_baseline('power_metrics')

# Check feature quality
print("\nFeature Quality Validation:")
print("-" * 60)
for feature_name in ['basic_params', 'power_metrics']:
    print(f"\n{feature_name}:")
    quality_issues = monitor.validate_feature_quality(feature_name)
    for issue in quality_issues:
        print(f"  {issue}")

# Simulate feature drift (create new data with shifted distribution)
print("\n" + "="*60)
print("DRIFT DETECTION SIMULATION")
print("="*60)

# Create drifted data
drifted_data = raw_data.copy()
drifted_data['vdd'] = drifted_data['vdd'] + 0.05  # Shift voltage by 50mV
drifted_data['idd'] = drifted_data['idd'] * 1.15  # Increase current by 15%

# Materialize drifted features
fs.materialize_offline('basic_params', drifted_data)

# Check for drift
drift_alerts = monitor.check_feature_drift('basic_params', threshold=0.1)

print("\nDrift Detection Results:")
print("-" * 60)
if drift_alerts:
    for col, alert in drift_alerts.items():
        print(f"\n{col}:")
        print(f"  Status: {alert['status']}")
        print(f"  Mean drift: {alert['mean_drift']:.3f} ({alert['mean_drift']*100:.1f}%)")
        print(f"  Std drift: {alert['std_drift']:.3f} ({alert['std_drift']*100:.1f}%)")
        print(f"  Null increase: {alert['null_increase']}")
else:
    print("✅ No drift detected")

print("\n✅ Feature monitoring and validation complete")
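
The relative-change check above divides by the baseline mean, so it becomes unstable for features centered near zero, such as Z-scores. One alternative, shown here as a sketch rather than as part of `FeatureStoreMonitor`, expresses the mean shift in units of the baseline standard deviation. `standardized_shift` and the sample values are hypothetical.

```python
from statistics import mean, stdev

def standardized_shift(baseline, current):
    """Mean shift expressed in units of the baseline standard deviation."""
    base_std = stdev(baseline)
    if base_std == 0:
        raise ValueError("baseline is constant; shift is undefined")
    return abs(mean(current) - mean(baseline)) / base_std

# Hypothetical Z-score feature with a baseline mean of exactly zero
baseline_z = [-1.0, -0.5, 0.0, 0.5, 1.0]
drifted_z = [v + 0.5 for v in baseline_z]

shift = standardized_shift(baseline_z, drifted_z)
print(f"Shift: {shift:.3f} baseline standard deviations")
```

With a baseline mean of zero, the relative-change formula would report an enormous drift for this half-unit shift, while the standardized version stays on an interpretable scale.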

In [None]:
projects = """
================================================================================
REAL-WORLD FEATURE STORE PROJECTS
================================================================================

POST-SILICON VALIDATION PROJECTS
---------------------------------

1. CENTRALIZED DEVICE FEATURE STORE FOR MULTI-MODEL ECOSYSTEM
   
   Objective: Build single feature store serving 10+ ML models (yield, binning, 
              test time, anomaly detection, wafer correlation)
   
   Success Metrics:
   - Feature reuse: >70% features shared across models
   - Training-serving skew: <1% accuracy difference
   - Feature compute reduction: 80% (eliminate duplication)
   - Model development time: 50% faster (pre-built features)
   
   Architecture:
   - Offline: Snowflake data warehouse (historical STDF data)
   - Online: Redis cluster (real-time binning)
   - Computation: Airflow DAGs (daily feature materialization)
   - Registry: Feast feature definitions
   
   Features (50+ total):
   - Basic: Vdd, Idd, Freq, Temp (raw parametrics)
   - Derived: Power, power_efficiency, thermal_margin
   - Statistical: Z-scores, percentiles vs lot/wafer
   - Temporal: Rolling means, test sequence effects
   - Spatial: Wafer map neighbors, radial position
   
   Implementation:
   ```python
   # Entity: Device
   device = Entity(name="device", value_type=ValueType.STRING)
   
   # Feature views
   basic_params = FeatureView(
       name="device_basic_params",
       entities=["device"],
       ttl=timedelta(days=90),  # 90-day retention
       features=[...],
       source=stdf_data_source
   )
   
   spatial_features = FeatureView(
       name="wafer_spatial",
       entities=["device"],
       features=[...],  # Neighbor statistics
       source=wafer_map_source
   )
   ```
   
   Business Value: $2M annual savings (eliminate duplicate feature pipelines,
                   faster model development, consistent features reduce errors)

2. REAL-TIME BINNING FEATURE STORE (INLINE TEST)
   
   Objective: Enable <50ms binning decisions during production test using 
              pre-computed features from online store
   
   Success Metrics:
   - Latency: <50ms p99 (feature retrieval + inference)
   - Throughput: 10,000 predictions/sec
   - Availability: 99.9% uptime
   - Accuracy: Match offline binning model exactly
   
   Architecture:
   - Online Store: Redis Cluster (6 nodes, 100K devices cached)
   - Feature Pipeline: Kafka Streams (real-time feature computation)
   - Model Serving: TensorFlow Serving (binning model)
   - Monitoring: Prometheus + Grafana (latency, drift, errors)
   
   Features (30 features):
   - Device params: Vdd, Idd, Freq, Temp, Power
   - Lot context: Lot mean/std for each param (updated hourly)
   - Percentile ranks: Device ranking vs lot distribution
   - Derived: power_efficiency, freq_per_watt, thermal_headroom
   
   Workflow:
   1. Device tested → parametric values sent to Kafka
   2. Kafka Streams: Compute features → update Redis
   3. Binning service: GET device features from Redis
   4. Model inference: Predict bin (PASS/BIN1/BIN2/FAIL)
   5. Return to tester: <50ms total latency
   
   Implementation:
   ```python
   # Online-only features (low latency)
   online_features = FeatureView(
       name="binning_features",
       entities=["device"],
       ttl=timedelta(hours=24),  # Fresh features only
       features=[...],
       online=True,
       offline=False  # Not needed for training
   )
   
   # Real-time serving
   features = store.get_online_features(
       features=["binning_features:vdd", "binning_features:lot_vdd_zscore"],
       entity_rows=[{"device": device_id}]
   ).to_dict()
   ```
   
   Business Value: 30% test time reduction (inline binning eliminates retest),
                   $5M annual savings, improved yield (faster feedback)

3. WAFER-LEVEL SPATIAL FEATURE ENGINEERING
   
   Objective: Build feature store with spatial correlation features for wafer
              map analysis and die-level yield prediction
   
   Success Metrics:
   - Spatial features: 20+ (neighbors, radial, edge effects)
   - Yield prediction improvement: +5% accuracy vs baseline
   - Defect pattern detection: Identify systematic failures
   - Coverage: All wafer test sites (500K+ dies/day)
   
   Architecture:
   - Offline: BigQuery (petabyte-scale wafer test history)
   - Computation: Apache Spark (distributed spatial joins)
   - Storage: Parquet files (partitioned by wafer_id, date)
   - Lineage: Feast + Great Expectations (data validation)
   
   Spatial Features:
   - Neighbor statistics: mean/std of 8 neighbors for each param
   - Radial position: Distance from wafer center, angle
   - Edge effects: Distance from wafer edge, edge bin flags
   - Cluster features: Local density, nearest failure distance
   - Wafer-level: Wafer mean/std, gradient vectors
   
   Point-in-Time Correctness:
   - Critical: Don't leak future die results into neighbor features
   - Solution: Compute neighbors only from previously tested dies
   
   Implementation:
   ```python
   import numpy as np
   import pandas as pd
   
   def compute_spatial_features(wafer_df):
       \"\"\"Compute spatial features with point-in-time correctness.\"\"\"
       results = []
       
       for idx, die in wafer_df.sort_values('test_timestamp').iterrows():
           # Only use dies tested BEFORE current die
           previous_dies = wafer_df[wafer_df['test_timestamp'] < die['test_timestamp']]
           
           # Find neighbors
           neighbors = previous_dies[
               (abs(previous_dies['die_x'] - die['die_x']) <= 1) &
               (abs(previous_dies['die_y'] - die['die_y']) <= 1)
           ]
           
           features = {
               'neighbor_mean_vdd': neighbors['vdd'].mean(),
               'neighbor_std_vdd': neighbors['vdd'].std(),
               'radial_distance': np.sqrt(die['die_x']**2 + die['die_y']**2),
               # ... more spatial features
           }
           results.append(features)
       
       return pd.DataFrame(results)
   ```
   
   Business Value: 5% yield improvement = $10M+ annual (better defect detection,
                   root cause analysis, proactive process adjustments)

4. FEATURE VERSIONING FOR A/B EXPERIMENTS
   
   Objective: Version-control feature definitions to enable reproducible
              experiments and rollback when new features hurt performance
   
   Success Metrics:
   - Reproducibility: 100% (same features → same results)
   - Experiment velocity: 3x faster (pre-built feature versions)
   - Rollback time: <1 hour (revert to previous feature version)
   - Feature lineage: Full tracking from raw data to model
   
   Versioning Strategy:
   - v1.0: Baseline (raw parametrics: Vdd, Idd, Freq, Temp)
   - v1.1: +Derived features (power, efficiency)
   - v1.2: +Polynomial features (Vdd^2, Freq*Vdd interactions)
   - v2.0: +Spatial features (wafer map neighbors)
   - v2.1: +Temporal features (rolling statistics)
   
   Implementation:
   ```python
   # Feature definitions with the version carried in the view name and tags
   device_features_v1 = FeatureView(
       name="device_features_v1",
       entities=["device"],
       features=[Field(name="vdd"), Field(name="idd"), ...],
       tags={"version": "1.0", "description": "baseline"}
   )
   
   device_features_v2 = FeatureView(
       name="device_features_v2",
       entities=["device"],
       features=[...],  # + spatial features
       tags={"version": "2.0", "description": "with_spatial"}
   )
   
   # Retrieve a specific version by referencing its view name
   training_df = store.get_historical_features(
       entity_df=entity_df,
       features=["device_features_v1:vdd", ...]  # Explicit version
   ).to_df()
   ```
   
   Experiment Tracking:
   - MLflow: Log feature version with each experiment
   - Comparison: A/B test v1.0 vs v2.0 on same validation set
   - Decision: Promote v2.0 if +2% accuracy improvement
   
   Business Value: Faster experimentation, reproducible results, safe rollback,
                   regulatory compliance (feature lineage for audits)


GENERAL AI/ML PROJECTS
----------------------

5. E-COMMERCE RECOMMENDATION FEATURE STORE
   
   Objective: Unified feature store for recommendation models (product, user,
              context features) serving 1M+ users
   
   Success Metrics:
   - CTR improvement: +15% (better features)
   - Feature freshness: <5 min latency (real-time user activity)
   - Model variety: 5+ models sharing same features
   - Infrastructure cost: -40% (eliminate duplicate pipelines)
   
   Architecture:
   - Offline: Snowflake (historical clickstream, transactions)
   - Online: DynamoDB (user profiles, product features)
   - Streaming: Kafka + Flink (real-time activity features)
   - Registry: Feast (300+ features)
   
   Features:
   - User: demographics, purchase history, browsing patterns
   - Product: category, price, popularity, reviews
   - Context: time_of_day, device_type, location
   - Derived: user_product_affinity, price_sensitivity
   - Real-time: last_5_views, cart_value, session_duration
   
   Implementation:
   ```python
   # User features (updated hourly)
   user_features = FeatureView(
       name="user_profile",
       entities=["user"],
       ttl=timedelta(hours=24),
       features=[...],
       source=user_data_warehouse
   )
   
   # Real-time session features (updated on every event)
   session_features = FeatureView(
       name="user_session",
       entities=["user"],
       ttl=timedelta(minutes=30),
       features=[...],
       source=kafka_stream
   )
   ```
   
   Business Value: $20M annual revenue increase (CTR improvement),
                   $2M infrastructure savings (eliminate duplication)

6. FRAUD DETECTION FEATURE STORE WITH GRAPH FEATURES
   
   Objective: Feature store combining transactional, behavioral, and graph
              features for real-time fraud detection (<100ms)
   
   Success Metrics:
   - Fraud detection rate: 95% (vs 80% baseline)
   - False positive rate: <2% (minimize customer friction)
   - Latency: <100ms p99 (real-time blocking)
   - Feature types: 100+ (traditional + graph embeddings)
   
   Features:
   - Transactional: amount, merchant, category, time
   - Behavioral: velocity (transactions/hour), amount patterns
   - Historical: user lifetime value, fraud history
   - Graph: PageRank, community detection, suspicious connections
   - Derived: amount_zscore_vs_user_history, merchant_risk_score
   
   Graph Feature Engineering:
   ```python
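   import networkx as nx
   import pandas as pd
   
   # Build transaction graph (users and merchants are both nodes;
   # assumes user and merchant IDs never collide)
   G = nx.Graph()
   for txn in transactions:
       G.add_edge(txn['user_id'], txn['merchant_id'], weight=txn['amount'])
   
   # Compute graph features
   pagerank = nx.pagerank(G)
   communities = nx.community.louvain_communities(G, seed=42)
   
   # Map each node to the index of its community
   community_of = {node: i for i, members in enumerate(communities) for node in members}
   
   # Store as features (user nodes only; merchant nodes are skipped)
   user_ids = list(dict.fromkeys(txn['user_id'] for txn in transactions))
   graph_features = pd.DataFrame({
       'user_id': user_ids,
       'pagerank_score': [pagerank[u] for u in user_ids],
       'community_id': [community_of[u] for u in user_ids]
   })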
   ```
   
   Business Value: $50M annual fraud prevented, $10M false positive reduction,
                   customer trust (fewer false declines)

7. CHURN PREDICTION FEATURE STORE (TELECOM)
   
   Objective: Feature store for customer churn prediction with 200+ features
              from billing, usage, support, and network data
   
   Success Metrics:
   - Churn prediction AUC: >0.85
   - Early detection: 30 days before churn
   - Feature coverage: 100% of customers
   - Refresh frequency: Daily (offline), hourly (online)
   
   Features (200+ total):
   - Billing: monthly_charge, payment_delays, plan_changes
   - Usage: voice_minutes, data_GB, roaming_frequency
   - Support: tickets_count, avg_resolution_time, satisfaction
   - Network: dropped_calls, avg_speed, coverage_quality
   - Derived: usage_trend, payment_reliability, support_burden
   - Temporal: 3mo/6mo/12mo rolling averages
   
   Point-in-Time Challenge:
   - Problem: Churn label is future event (30 days ahead)
   - Solution: Features must be from 30+ days before churn date
   
   Implementation:
   ```python
   # Create training dataset with 30-day offset
   entity_df = pd.DataFrame({
       'customer_id': customer_ids,
       'event_timestamp': churn_dates - timedelta(days=30),  # 30 days before
       'churned': churn_labels
   })
   
   # Get features as they existed 30 days before churn
   training_df = store.get_historical_features(
       entity_df=entity_df,
       features=[...],  # Point-in-time correct
   ).to_df()
   ```
   
   Business Value: $100M annual retention (save 20% of at-risk customers),
                   proactive interventions, targeted offers

8. DEMAND FORECASTING FEATURE STORE (RETAIL)
   
   Objective: Multi-SKU demand forecasting with features from sales, weather,
              promotions, holidays, and economic indicators
   
   Success Metrics:
   - Forecast accuracy: MAPE <15%
   - SKU coverage: 10,000+ products
   - Forecast horizon: 28 days
   - Feature update: Daily (automated pipeline)
   
   Architecture:
   - Offline: BigQuery (5 years sales history)
   - External: Weather API, economic data API
   - Computation: Airflow DAGs (daily feature materialization)
   - Models: LightGBM per product category
   
   Features:
   - Historical sales: 7d/14d/28d/365d rolling means
   - Trend: Linear regression slope over 90 days
   - Seasonality: Day of week, month, quarter effects
   - Promotions: Active promotions, discount %, ad spend
   - Weather: Temperature, precipitation, forecast
   - Calendar: Holidays, paydays, special events
   - Economic: Unemployment, consumer confidence, gas prices
   - Derived: price_elasticity, promotion_effectiveness
   
   Implementation:
   ```python
   # External data source (weather)
   weather_source = PushSource(
       name="weather_data",
       batch_source=weather_api_connector
   )
   
   weather_features = FeatureView(
       name="weather",
       entities=["location", "date"],
       features=[...],
       source=weather_source
   )
   
   # Join sales + weather + promotions
   training_df = store.get_historical_features(
       entity_df=sku_dates_df,
       features=[
           "sales_history:rolling_7d_mean",
           "weather:temperature",
           "promotions:active_discount"
       ]
   ).to_df()
   ```
   
   Business Value: $30M inventory reduction (better forecasting),
                   10% revenue increase (fewer stockouts),
                   optimized promotions
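
The historical-sales features listed for project 8 can be sketched in plain pandas. This is a minimal illustration, not the project's pipeline: the column names (sku, date, units) and the helper add_sales_features are assumptions, and the windows count rows, which equals days only when each SKU has exactly one row per day.

```python
import pandas as pd


def add_sales_features(sales: pd.DataFrame) -> pd.DataFrame:
    # Hypothetical helper: expects columns sku, date, units with one row per SKU per day.
    out = sales.sort_values(["sku", "date"]).reset_index(drop=True)
    for window in (7, 28):
        # shift(1) removes the current day, so each window only sees earlier days.
        out[f"rolling_{window}d_mean"] = out.groupby("sku")["units"].transform(
            lambda s, w=window: s.shift(1).rolling(w, min_periods=1).mean()
        )
    return out


sales = pd.DataFrame({
    "sku": ["A", "A", "A", "B", "B"],
    "date": pd.to_datetime(
        ["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-01", "2024-01-02"]),
    "units": [10, 20, 30, 100, 200],
})
sales_features = add_sales_features(sales)
print(sales_features)
```

The first row of each SKU has no history, so its rolling means are NaN. Grouping by sku keeps one product's sales from bleeding into another's window.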


================================================================================
IMPLEMENTATION CHECKLIST (FOR ALL PROJECTS)
================================================================================

Phase 1: Design & Planning (Week 1-2)
□ Define entities (device, customer, product, etc.)
□ Identify features (raw, derived, aggregated)
□ Choose offline store (Snowflake, BigQuery, S3)
□ Choose online store (Redis, DynamoDB, Cassandra)
□ Design feature computation pipeline (Airflow, Spark)
□ Plan monitoring (drift detection, quality checks)

Phase 2: Implementation (Week 3-6)
□ Set up Feast (feast init, configure stores)
□ Define feature views (entities.py, features.py)
□ Implement feature transformations
□ Build offline materialization pipeline
□ Build online serving pipeline
□ Implement monitoring dashboards

Phase 3: Integration (Week 7-8)
□ Integrate with model training pipeline
□ Integrate with model serving (online features)
□ Set up feature drift alerts
□ Configure backup and disaster recovery
□ Load testing (latency, throughput)

Phase 4: Production (Week 9-10)
□ Deploy to production environment
□ Monitor feature freshness and quality
□ Track training-serving skew
□ Optimize performance (caching, indexing)
□ Document features (registry, wiki)

Phase 5: Iteration (Ongoing)
□ Add new features based on model experiments
□ Version features (v1, v2, etc.)
□ Deprecate unused features
□ Optimize costs (storage, compute)
□ Share features across teams
"""

print(projects)

In [None]:
takeaways = """
================================================================================
KEY TAKEAWAYS: FEATURE STORE IMPLEMENTATION
================================================================================

1. CORE CONCEPTS
   -------------
   
   What is a Feature Store?
   - Centralized repository for ML features
   - Solves training-serving skew (features computed identically)
   - Ensures point-in-time correctness (no data leakage)
   - Enables feature reuse across models and teams
   
   Critical Problem Solved:
   Problem: Training features ≠ Serving features
   Example: Training uses SQL query, serving uses Python function
   Result: Model accuracy drops in production (skew)
   Solution: Feature store computes features once, serves to both
   
   Components:
   1. Feature Registry: Metadata catalog (definitions, schemas, lineage)
   2. Offline Store: Historical features for training (data warehouse)
   3. Online Store: Low-latency features for serving (key-value store)
   4. Feature Computation: Transforms raw data → features


2. OFFLINE VS ONLINE STORES
   -------------------------
   
   Offline Store (Training):
   - Purpose: Historical features for model training
   - Storage: Data warehouse (Snowflake, BigQuery, Redshift)
   - Query pattern: Large batch queries (millions of rows)
   - Latency: Seconds to minutes (not time-critical)
   - Point-in-time joins: Features as they existed at event timestamp
   - Use cases: Training datasets, feature exploration, backtesting
   
   Online Store (Serving):
   - Purpose: Real-time features for production inference
   - Storage: Key-value store (Redis, DynamoDB, Cassandra)
   - Query pattern: Single entity lookup (GET device:12345)
   - Latency: <10ms p99 (critical for real-time)
   - Freshness: Continuously updated (streaming pipeline)
   - Use cases: REST API serving, mobile apps, real-time decisions
   
   Example:
   # Offline: Training yield prediction model
   training_df = store.get_historical_features(
       entity_df=devices_df,  # 100K devices
       features=["device_features:vdd", "device_features:idd"]
   ).to_df()  # Takes 30 seconds, returns 100K rows
   
   # Online: Production binning (real-time)
   features = store.get_online_features(
       features=["device_features:vdd", "device_features:idd"],
       entity_rows=[{"device": "DEV00123"}]
   ).to_dict()  # Takes <10ms, returns 1 device


3. POINT-IN-TIME CORRECTNESS
   --------------------------
   
   Critical Concept:
   Features must only use data available at the event timestamp.
   Prevents data leakage from the future.
   
   Example Problem (Without Point-in-Time):
   Training: Predict if device will fail at time T
   Feature: Lot average voltage (computed from entire lot)
   Issue: Lot average includes devices tested AFTER time T
   Result: Model sees future data during training (leakage)
   Production: Lot average only includes past devices
   Outcome: Model accuracy drops (training-serving skew)
   
   Solution (With Point-in-Time):
   For each training example at time T:
   - Only use features computed from data at or before time T
   - Feature store automatically filters data by timestamp
   - Training features match production features exactly
   
   Implementation:
   entity_df = pd.DataFrame({
       'device': ['DEV001', 'DEV002'],  # column name matches the entity join key
       'event_timestamp': [datetime(2024,1,1,10,0), datetime(2024,1,1,11,0)]
   })
   
   # Feature store joins features as they existed at event_timestamp
   training_df = store.get_historical_features(
       entity_df=entity_df,
       features=["lot_statistics:mean_vdd"]  # Only uses data <= event_timestamp
   ).to_df()
   
   Critical for:
   - Time series forecasting (don't leak future values)
   - Churn prediction (features from 30 days before churn)
   - Fraud detection (features from before transaction)
   - Spatial features (only previously tested dies)
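
To make the rule concrete, the sketch below reproduces a point-in-time join with pandas.merge_asof. It illustrates the idea only and is not how Feast implements it; the frames, column names, and values are made up for the example.

```python
import pandas as pd

# Feature values with the time each value became available (illustrative data).
lot_stats = pd.DataFrame({
    "lot_id": ["LOT1", "LOT1", "LOT1"],
    "feature_timestamp": pd.to_datetime(
        ["2024-01-01 09:00", "2024-01-01 10:30", "2024-01-01 12:00"]),
    "mean_vdd": [1.20, 1.22, 1.25],
})

# Training events: one row per device test.
events = pd.DataFrame({
    "device": ["DEV001", "DEV002"],
    "lot_id": ["LOT1", "LOT1"],
    "event_timestamp": pd.to_datetime(["2024-01-01 10:00", "2024-01-01 11:00"]),
})

# direction="backward" picks, per event, the latest feature row at or before the event.
training_df = pd.merge_asof(
    events.sort_values("event_timestamp"),
    lot_stats.sort_values("feature_timestamp"),
    left_on="event_timestamp",
    right_on="feature_timestamp",
    by="lot_id",
    direction="backward",
)
print(training_df[["device", "event_timestamp", "mean_vdd"]])
```

DEV001 (tested at 10:00) receives the 09:00 lot mean and DEV002 (tested at 11:00) receives the 10:30 value. The 12:00 value is never joined because it lies in the future for both events.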


4. FEATURE ENGINEERING PATTERNS
   -----------------------------
   
   Raw Features:
   - Direct measurements from data sources
   - Example: vdd, idd, freq, temp (STDF parameters)
   - Minimal transformation, high interpretability
   
   Derived Features:
   - Computed from raw features
   - Example: power = vdd * idd, efficiency = freq / power
   - Domain knowledge embedded, engineered insights
   
   Aggregate Features:
   - Statistical summaries over groups
   - Example: lot_mean_vdd = AVG(vdd) GROUP BY lot_id
   - Contextual information, relative comparisons
   
   Temporal Features:
   - Time-based patterns and trends
   - Example: rolling_7d_mean, day_of_week, trend_slope
   - Capture seasonality, trends, time-of-day effects
   
   Spatial Features:
   - Location-based relationships
   - Example: neighbor_mean (wafer maps), distance_from_center
   - Capture spatial correlations, clustering
   
   Categorical Encodings:
   - Transform categories to numeric
   - Example: bin_category → one-hot encoding
   - Enable ML algorithms to use categorical data
   
   Interaction Features:
   - Combinations of features
   - Example: vdd * freq, temp_high AND idd_high
   - Capture non-linear relationships
   
   Feature Hierarchy:
   Level 1 (Raw): vdd=1.2V, idd=250mA, freq=2.5GHz
   Level 2 (Derived): power=300mW, efficiency=8.33 GHz/W
   Level 3 (Normalized): vdd_zscore=0.5, power_percentile=85
   Level 4 (Aggregate): lot_mean_power=320mW, wafer_std_vdd=0.02
   Level 5 (Spatial): neighbor_mean_power=310mW, radial_distance=45mm
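
The first levels of the hierarchy can be sketched in a few lines of pandas. The data below is invented for illustration and the column names are assumptions. For brevity the lot statistics use the whole lot; training data would need the point-in-time restriction from section 3.

```python
import pandas as pd

# Illustrative raw measurements; the values and lot assignment are made up.
raw = pd.DataFrame({
    "device": ["D1", "D2", "D3", "D4"],
    "lot_id": ["L1", "L1", "L1", "L1"],
    "vdd": [1.20, 1.18, 1.22, 1.20],      # volts
    "idd": [250.0, 260.0, 240.0, 250.0],  # milliamps
})

features = raw.copy()
# Level 2 (derived): volts times milliamps gives milliwatts.
features["power_mw"] = features["vdd"] * features["idd"]
# Level 4 (aggregate): per-lot statistics broadcast back to each device.
lot_vdd = features.groupby("lot_id")["vdd"]
features["lot_mean_vdd"] = lot_vdd.transform("mean")
features["lot_std_vdd"] = lot_vdd.transform("std")
# Level 3 (normalized): z-score of each device against its lot.
features["vdd_zscore"] = (
    (features["vdd"] - features["lot_mean_vdd"]) / features["lot_std_vdd"]
)
print(features[["device", "power_mw", "vdd_zscore"]])
```

The aggregate columns are computed before the z-score because the normalized level depends on them, even though the hierarchy lists normalization first.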


5. FEAST FRAMEWORK - PRODUCTION STANDARD
   --------------------------------------
   
   Why Feast?
   - Open source and widely adopted (created at Gojek; used at Shopify, etc.)
   - Point-in-time correct joins for training data
   - One set of feature definitions serves both offline and online stores
   - Registry records feature definitions and metadata
   - Scales to billions of features
   
   Core Abstractions:
   
   Entity: Thing being modeled (device, customer, product)
   device = Entity(name="device", value_type=ValueType.STRING)
   
   Data Source: Raw data location
   device_source = FileSource(
       path="s3://bucket/stdf_data.parquet",
       timestamp_field="test_timestamp"
   )
   
   Feature View: Group of related features
   device_features = FeatureView(
       name="device_parametrics",
       entities=["device"],
       ttl=timedelta(days=90),  # Feature retention
       schema=[Field(name="vdd", dtype=Float32), ...],
       source=device_source,
       online=True  # Materialize to online store
   )
   
   Feature Service: Named set of features for a model
   binning_service = FeatureService(
       name="binning_v1",
       features=[
           device_features[["vdd", "idd", "freq"]],
           lot_features[["lot_mean_vdd"]]
       ]
   )
   
   CLI Workflow:
   # 1. Initialize project
   feast init feature_repo
   
   # 2. Define features (entities.py, features.py)
   
   # 3. Register features
   feast -c feature_repo apply
   
   # 4. Load a fixed time range from the offline store into the online store
   feast materialize 2024-01-01T00:00:00 2024-12-31T23:59:59
   
   # 5. Load only data added since the last materialization, up to now
   feast materialize-incremental $(date -u +"%Y-%m-%dT%H:%M:%S")
   
   Python SDK:
   from feast import FeatureStore
   
   store = FeatureStore(repo_path="feature_repo/")
   
   # Training (offline)
   training_df = store.get_historical_features(
       entity_df=entity_df,
       features=["device_parametrics:vdd", "device_parametrics:idd"]
   ).to_df()
   
   # Inference (online)
   online_features = store.get_online_features(
       features=["device_parametrics:vdd"],
       entity_rows=[{"device": "DEV123"}]
   ).to_dict()


6. FEATURE VERSIONING & REPRODUCIBILITY
   -------------------------------------
   
   Why Version Features?
   - Reproduce experiments (same features → same results)
   - A/B test feature sets (v1 vs v2)
   - Rollback when new features hurt performance
   - Regulatory compliance (audit trail)
   
   Versioning Strategies:
   
   1. Explicit Version in Name:
   device_features_v1 = FeatureView(name="device_features_v1", ...)
   device_features_v2 = FeatureView(name="device_features_v2", ...)
   
   2. Git-Based Versioning:
   - Feature definitions in Git repository
   - Tag releases: git tag v1.0.0
   - Checkout specific version for reproduction
   
   3. Feature Registry Versioning (Feast):
   - The Feast registry records the feature definitions that were applied
   - Carry the version in the view name or tags and reference it explicitly: device_features_v2:vdd
   
   4. Schema Evolution:
   v1.0: Basic features (vdd, idd, freq, temp)
   v1.1: +Derived (power, efficiency) [backward compatible]
   v2.0: +Spatial (neighbors) [new feature view]
   v2.1: vdd → vdd_mv (unit change) [breaking change]
   
   Experiment Tracking Integration:
   with mlflow.start_run():
       mlflow.log_param("feature_version", "v2.0")
       mlflow.log_param("features", ["vdd", "idd", "power", "neighbors"])
       
       # Train with specific feature version
       features = store.get_historical_features(
           entity_df=entity_df,
           features=["device_features_v2:vdd", ...]
       ).to_df()
       
   Rollback Procedure:
   1. Detect performance degradation (model monitoring)
   2. Identify feature version causing issue (v2.1)
   3. Rollback to previous version (v2.0)
   4. Redeploy model trained on v2.0 features
   5. Investigate v2.1 issue, fix, re-release as v2.2


7. ONLINE SERVING ARCHITECTURE
   ----------------------------
   
   Requirements:
   - Latency: <10ms p99 (feature retrieval only)
   - Throughput: 10K-100K requests/sec
   - Availability: 99.9%+ uptime
   - Consistency: Match offline features exactly
   
   Technology Stack:
   
   Redis (Most Common):
   - In-memory key-value store
   - 1-5ms latency at p99
   - 100K+ ops/sec per node
   - Clustering for horizontal scaling
   - Persistence (RDB, AOF)
   
   DynamoDB (AWS):
   - Managed NoSQL database
   - Single-digit millisecond latency
   - Auto-scaling (pay per request)
   - 99.99% availability SLA
   - Global tables for multi-region
   
   Cassandra (High Scale):
   - Distributed NoSQL database
   - Linear scalability (add nodes)
   - Multi-datacenter replication
   - Tunable consistency
   - Ideal for >1M writes/sec
   
   Data Model:
   Key: entity_type:entity_id (e.g., device:DEV00123)
   Value: JSON with all features {vdd: 1.2, idd: 250, ...}
   
   Example (Redis):
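   import json
   import redis
   
   r = redis.Redis(host="localhost", port=6379)  # connection details are placeholders
   
   # Write (streaming pipeline)
   r.set(
       "device:DEV00123",
       json.dumps({"vdd": 1.2, "idd": 250, "freq": 2.5}),
       ex=86400  # 24-hour TTL
   )
   
   # Read (inference service); get() returns None for a missing or expired key
   features_json = r.get("device:DEV00123")
   features = json.loads(features_json) if features_json is not None else None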
   
   Caching Strategy:
   - Hot entities: Cache in application memory (Redis + local cache)
   - Cold entities: Fetch from Redis on-demand
   - TTL: Match feature update frequency (hourly, daily)
   
   Deployment Pattern:
   Internet → Load Balancer → Inference Service → Redis Cluster
                                      ↓
                                 ML Model
   
   Monitoring:
   - Latency (p50, p95, p99)
   - Cache hit rate
   - Redis memory usage
   - Feature staleness (time since last update)
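
The hot/cold caching strategy above can be sketched with a small in-process cache in front of the online store. A plain dict stands in for the Redis client so the example runs anywhere; the class name, the fetch callable, and the TTL value are all assumptions for illustration.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class LocalFeatureCache:
    # Hypothetical in-process cache placed in front of a remote online store.

    def __init__(self, fetch: Callable[[str], Optional[dict]], ttl_seconds: float):
        self._fetch = fetch  # e.g. a function wrapping a Redis GET
        self._ttl = ttl_seconds
        self._entries: Dict[str, Tuple[float, dict]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Optional[dict]:
        now = time.monotonic()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            self.hits += 1
            return entry[1]
        # Cold or expired: fall through to the remote store.
        self.misses += 1
        value = self._fetch(key)
        if value is not None:
            self._entries[key] = (now, value)
        return value


# Stand-in for the online store.
remote = {"device:DEV00123": {"vdd": 1.2, "idd": 250, "freq": 2.5}}
cache = LocalFeatureCache(fetch=remote.get, ttl_seconds=60.0)
first = cache.get("device:DEV00123")   # miss, fetched from the remote store
second = cache.get("device:DEV00123")  # hit, served from local memory
print(cache.hits, cache.misses)
```

The hit and miss counters feed the cache hit rate listed under Monitoring. The local TTL should not exceed the feature update interval, otherwise the cache itself becomes a source of stale features.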


8. FEATURE MONITORING & DRIFT DETECTION
   -------------------------------------
   
   What to Monitor:
   
   1. Feature Freshness:
   - Time since last feature update
   - Alert if >2x expected update interval
   - Example: Daily features not updated in 36 hours
   
   2. Feature Quality:
   - Null rate (<10% threshold)
   - Zero variance (constant features)
   - Outliers (>3 std from mean)
   - Data type violations
   
   3. Feature Drift:
   - Statistical tests: KS test, PSI, Chi-square
   - Distribution shift (mean, std, percentiles)
   - Alert if PSI >0.2 (major drift)
   
   4. Feature Importance Drift:
   - Track feature importance over time
   - Alert if top 5 features change significantly
   - Indicates concept drift or data shift
   
   Monitoring Implementation:
   
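   from datetime import datetime, timezone
   
   class FeatureMonitor:
       def __init__(self, baseline_stats, max_age):
           self.baseline = baseline_stats  # baseline sample per feature
           self.max_age = max_age          # timedelta, e.g. 2x the update interval
       
       def check_drift(self, current_features, last_update):
           alerts = []
           
           # 1. Freshness (last_update must be a timezone-aware datetime)
           if datetime.now(timezone.utc) - last_update > self.max_age:
               alerts.append("STALE: Features not updated")
           
           # 2. Quality (null rate per column)
           null_rates = current_features.isnull().mean()
           for col, null_rate in null_rates[null_rates > 0.1].items():
               alerts.append(f"QUALITY: {col} has {null_rate:.1%} nulls")
           
           # 3. Statistical drift
           psi = calculate_psi(self.baseline['vdd'], current_features['vdd'])
           if psi > 0.2:
               alerts.append(f"DRIFT: PSI={psi:.3f} for vdd")
           
           return alerts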
   
   Dashboard Metrics:
   - Feature availability (% entities with features)
   - Feature completeness (% non-null values)
   - Feature drift score (PSI, KS statistic)
   - Feature correlation (to detect redundancy)
   - Feature usage (which models use which features)
   
   Alerting Rules:
   - Severity 1 (Page): Online store down, >50% null rate
   - Severity 2 (Alert): Major drift (PSI >0.2), stale features
   - Severity 3 (Warning): Minor drift (PSI 0.1-0.2), quality issues
   
   Automated Retraining:
   if feature_drift_detected() or performance_degraded():
       trigger_retraining_pipeline()
       wait_for_new_model()
       if new_model_better():
           deploy_new_model()
       else:
           rollback_features()
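
The monitoring sketch above calls calculate_psi without defining it. One possible implementation of the Population Stability Index is shown below; the quantile binning, the bin count, and the epsilon guard are choices made for this sketch, not something the notebook prescribes.

```python
import numpy as np


def calculate_psi(baseline, current, bins=10, eps=1e-6):
    # Population Stability Index between two 1-D samples.
    # Bin edges come from baseline quantiles; eps avoids log(0) for empty bins.
    baseline = np.asarray(baseline, dtype=float)
    current = np.asarray(current, dtype=float)
    edges = np.quantile(baseline, np.linspace(0.0, 1.0, bins + 1))
    # Clip so values outside the baseline range land in the first or last bin.
    current = np.clip(current, edges[0], edges[-1])
    expected = np.histogram(baseline, bins=edges)[0] / baseline.size
    actual = np.histogram(current, bins=edges)[0] / current.size
    expected = np.clip(expected, eps, None)
    actual = np.clip(actual, eps, None)
    return float(np.sum((actual - expected) * np.log(actual / expected)))


rng = np.random.default_rng(0)
baseline_vdd = rng.normal(1.20, 0.02, 5000)
same_vdd = rng.normal(1.20, 0.02, 5000)
shifted_vdd = rng.normal(1.23, 0.02, 5000)
psi_same = calculate_psi(baseline_vdd, same_vdd)
psi_shifted = calculate_psi(baseline_vdd, shifted_vdd)
print(psi_same, psi_shifted)
```

A second sample from the same distribution should score well below the 0.1 warning level, while a mean shift of 1.5 standard deviations should score above the 0.2 threshold used in the alerting rules.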


9. FEATURE STORE BEST PRACTICES
   -----------------------------
   
   Design Principles:
   
   1. Feature Reusability:
   - Define features once, use across all models
   - Avoid model-specific feature definitions
   - Create feature libraries by domain (device, wafer, lot)
   
   2. Clear Naming Conventions:
   - Prefix with entity: device_vdd, lot_mean_idd
   - Suffix with aggregation: vdd_rolling_7d_mean
   - Version suffix: device_features_v2
   
   3. Feature Documentation:
   - Description: What the feature represents
   - Computation: How it's calculated (SQL, Python)
   - Business meaning: Why it's useful
   - Owner: Team responsible for feature
   - SLA: Update frequency, freshness requirements
   
   4. Separation of Concerns:
   - Raw features: Minimal transformation
   - Derived features: Domain-specific logic
   - Model features: Model-specific transformations
   - Keep raw → derived → model hierarchy
   
   5. Idempotency:
   - Feature computation should be deterministic
   - Same input → same output (reproducibility)
   - No unseeded randomness, no current_time() in features
   
   6. Incremental Computation:
   - Only recompute changed features (not all)
   - Use watermarks to track processed data
   - Saves compute costs, reduces latency
   
   7. Cost Optimization:
   - TTL: Delete old features (save storage)
   - Compression: Use Parquet, columnar formats
   - Materialization schedule: Daily vs hourly vs real-time
   - Caching: Hot features in memory, cold on disk
   
   Code Quality:
   
   # Good: Clear, reusable, documented
   @feature(name="device_power", description="Power consumption in mW")
   def compute_power(vdd, idd):
       \"\"\"
       Calculate device power.
       
       Args:
           vdd: Voltage (V)
           idd: Current (mA)
       
       Returns:
           Power in milliwatts
       \"\"\"
       return vdd * idd
   
   # Bad: Unclear, hardcoded, undocumented
   def f(x, y):
       return x * y * 1000  # Why 1000?
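
The incremental-computation principle (item 6) can be sketched with a timestamp watermark. The helper name, the column names, and the in-memory frame are assumptions; a real pipeline would persist the watermark between runs.

```python
import pandas as pd


def process_increment(raw: pd.DataFrame, watermark: pd.Timestamp):
    # Hypothetical helper: computes features only for rows newer than the watermark
    # and returns them together with the advanced watermark.
    new_rows = raw[raw["test_timestamp"] > watermark]
    features = new_rows.assign(power_mw=new_rows["vdd"] * new_rows["idd"])
    new_watermark = watermark if new_rows.empty else new_rows["test_timestamp"].max()
    return features, new_watermark


raw = pd.DataFrame({
    "device": ["D1", "D2", "D3"],
    "test_timestamp": pd.to_datetime(
        ["2024-01-01 10:00", "2024-01-01 11:00", "2024-01-01 12:00"]),
    "vdd": [1.2, 1.18, 1.22],
    "idd": [250.0, 260.0, 240.0],
})
first_batch, watermark = process_increment(raw, pd.Timestamp("2024-01-01 10:00"))
second_batch, watermark = process_increment(raw, watermark)
print(len(first_batch), len(second_batch), watermark)
```

The second call finds nothing to do because the watermark already covers every row. Rows that arrive late with a timestamp older than the watermark would be skipped, so pipelines with late data usually need an allowed-lateness window on top of this.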


10. COMMON PITFALLS & SOLUTIONS
    ----------------------------
    
    Pitfall 1: Training-Serving Skew
    Problem: Features computed differently in training vs serving
    Example: Training uses SQL AVG(), serving uses Python np.mean()
    Solution: Use feature store for both (guaranteed consistency)
    
    Pitfall 2: Data Leakage (Future Data)
    Problem: Training uses features from after event timestamp
    Example: Lot average includes devices tested later
    Solution: Use point-in-time joins (feature store automatic)
    
    Pitfall 3: Null Handling Inconsistency
    Problem: Training fills nulls with 0, serving fills with mean
    Solution: Document null strategy, enforce in feature definition
    
    Pitfall 4: Feature Staleness (Online Store)
    Problem: Online features not updated, model uses old data
    Example: User features from 2 weeks ago
    Solution: Monitor freshness, set TTL, alert on stale features
    
    Pitfall 5: Over-Engineering Features
    Problem: 1000+ features, most unused, high maintenance cost
    Solution: Start simple, add features based on experiments
    
    Pitfall 6: Ignoring Feature Drift
    Problem: Features change distribution, model accuracy drops
    Solution: Monitor drift (PSI, KS test), retrain when detected
    
    Pitfall 7: Poor Performance (Slow Queries)
    Problem: Offline queries take hours, blocking training
    Solution: Partition data, use columnar formats, pre-aggregate
    
    Pitfall 8: No Feature Versioning
    Problem: Can't reproduce past experiments, rollback impossible
    Solution: Version feature definitions in Git, track in MLflow
    
    Pitfall 9: Single Point of Failure
    Problem: Online store down = all predictions fail
    Solution: Redis clustering, fallback to default features
    
    Pitfall 10: Ignoring Costs
    Problem: Feature store costs $50K/month, mostly unused features
    Solution: Set TTL, delete unused features, optimize storage
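
Pitfall 1 can be checked directly by comparing the values served online with the values used offline for the same entities. The sketch below assumes numeric features, identical column names in both frames, and a hypothetical helper name find_skew.

```python
import pandas as pd


def find_skew(offline: pd.DataFrame, online: pd.DataFrame, key: str,
              tol: float = 1e-6) -> pd.DataFrame:
    # Hypothetical helper: one output row per (entity, feature) whose values disagree.
    merged = offline.merge(online, on=key, suffixes=("_offline", "_online"))
    mismatches = []
    for col in [c for c in offline.columns if c != key]:
        off_vals = merged[col + "_offline"]
        on_vals = merged[col + "_online"]
        # A null on exactly one side counts as skew; nulls on both sides do not.
        differs = ((off_vals - on_vals).abs() > tol) | (off_vals.isna() != on_vals.isna())
        for entity in merged.loc[differs, key]:
            mismatches.append({key: entity, "feature": col})
    return pd.DataFrame(mismatches, columns=[key, "feature"])


offline_df = pd.DataFrame({
    "device": ["D1", "D2", "D3"],
    "vdd": [1.20, 1.18, None],
    "idd": [250.0, 260.0, 240.0],
})
online_df = pd.DataFrame({
    "device": ["D1", "D2", "D3"],
    "vdd": [1.20, 1.19, 1.21],
    "idd": [250.0, 260.0, 240.0],
})
skew = find_skew(offline_df, online_df, key="device")
print(skew)
```

Here D2 is flagged because its vdd differs by 0.01 and D3 because vdd is null offline but populated online, which is the kind of inconsistency Pitfall 3 describes.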


11. WHEN TO USE A FEATURE STORE
    ----------------------------
    
    Strong Indicators (Use Feature Store):
    ✅ Multiple models using same features (reuse)
    ✅ Both training and real-time serving (consistency critical)
    ✅ Large team (>5 data scientists, need collaboration)
    ✅ High model count (>10 models in production)
    ✅ Complex features (aggregations, temporal, spatial)
    ✅ Compliance requirements (lineage, reproducibility)
    ✅ Point-in-time correctness critical (finance, healthcare)
    
    Weak Indicators (Maybe Skip):
    ⚠️ Single model, simple features (overhead not worth it)
    ⚠️ Batch-only inference (no real-time serving)
    ⚠️ Small team (<3 people, coordination overhead)
    ⚠️ Proof-of-concept, short-lived project
    ⚠️ Features change frequently (high churn)
    
    Alternatives to Feature Store:
    
    Simple Pipeline (Low Complexity):
    - Jupyter notebook: Feature engineering
    - Pickle file: Save training features
    - Same notebook: Load for inference
    - Works for: Single model, batch inference, small team
    
    Data Warehouse Only (Medium Complexity):
    - BigQuery/Snowflake: Store features as tables
    - dbt: Transform raw → features (SQL)
    - Cache in Redis: For serving (manual setup)
    - Works for: Multiple models with mostly batch inference (online serving stays manual)
    
    Custom Feature Store (High Complexity):
    - Build your own (like we did in this notebook)
    - Works for: Specific constraints, learning purposes
    - Risk: Maintenance burden, missing capabilities
    
    When to Adopt Feast:
    - 3+ models in production
    - Both batch and real-time serving
    - Team size >5 data scientists
    - Need point-in-time correctness
    - Want to avoid building/maintaining custom solution


12. PRODUCTION READINESS CHECKLIST
    --------------------------------
    
    Infrastructure:
    □ Offline store configured (Snowflake, BigQuery, S3)
    □ Online store deployed (Redis cluster, DynamoDB)
    □ Feature computation pipeline (Airflow, Spark)
    □ Monitoring dashboards (Grafana, Datadog)
    □ Alerting configured (PagerDuty, Slack)
    
    Features:
    □ Feature definitions documented
    □ Feature versioning strategy defined
    □ Feature ownership assigned (team/person)
    □ Feature SLAs documented (freshness, quality)
    □ Feature tests written (unit, integration)
    
    Data Quality:
    □ Null handling strategy defined
    □ Outlier detection configured
    □ Schema validation (Great Expectations)
    □ Data lineage tracked
    □ Backup and disaster recovery
    
    Performance:
    □ Online store latency <10ms p99
    □ Offline queries optimized (<5 min)
    □ Load testing completed (10K+ QPS)
    □ Autoscaling configured
    □ Cost monitoring enabled
    
    Security:
    □ Access control (IAM, RBAC)
    □ Encryption at rest and in transit
    □ Audit logging enabled
    □ PII handling compliant (GDPR, etc.)
    □ Network isolation (VPC, firewall)
    
    Operations:
    □ Runbooks documented
    □ On-call rotation defined
    □ Incident response plan
    □ Rollback procedures tested
    □ Feature deprecation policy


13. NEXT STEPS IN LEARNING PATH
    -----------------------------
    
    After Feature Stores, continue with:
    
    125_ML_Testing_Validation.ipynb:
    - Unit tests for feature transformations
    - Integration tests for feature pipelines
    - Validate feature store outputs
    
    126_CI_CD_for_ML.ipynb:
    - Automate feature materialization
    - Feature store deployment pipeline
    - Feature registry updates in CI/CD
    
    127_Model_Serving_Patterns.ipynb:
    - Integrate feature store with serving
    - Online feature retrieval in REST API
    - Caching strategies for features
    
    Advanced Topics:
    - Stream processing (Kafka, Flink) for real-time features
    - Feature embeddings (represent categories as vectors)
    - Graph features (relationships, network analysis)
    - AutoML for feature selection
    - Feature engineering at scale (Spark, Ray)


================================================================================
FINAL SUMMARY
================================================================================

Feature Stores solve the critical problem of training-serving skew by ensuring
features are computed identically for both model training and production
inference. They provide:

1. Consistency: Same features in training and serving
2. Reusability: Define once, use across all models
3. Point-in-Time Correctness: No data leakage from future
4. Low-Latency Serving: <10ms feature retrieval
5. Versioning: Reproducible experiments, safe rollback
6. Monitoring: Drift detection, quality checks

Feast is a widely used open-source feature store, originally developed at
Gojek in collaboration with Google Cloud.

Start simple, add complexity as needed. Not every project needs a feature
store, but for multi-model production systems with real-time serving, it's
essential infrastructure.

Key metric: Training-serving consistency. If model accuracy is 95% in training
but 85% in production, investigate training-serving skew first (likely feature
computation differences).

================================================================================
"""

print(takeaways)

## 12. Key Takeaways - Feature Store Mastery

Comprehensive reference guide for production feature store implementation.

## 11. Real-World Project Templates

**8 production-ready feature store projects** (4 post-silicon + 4 general AI/ML).

Each project includes:
- Clear objective and success criteria
- Feature store architecture
- Feature engineering strategy
- Business value and impact

## 10. Feature Store Monitoring & Validation

**What's happening:** Ensuring feature quality and detecting feature drift.

**Key points:**
- **Feature freshness**: Alert if features haven't been updated within SLA
- **Feature drift**: Detect distribution shifts in feature values
- **Feature quality**: Validate nulls, outliers, data types
- **Performance impact**: Monitor how feature changes affect model accuracy

**Production monitoring:** Daily jobs check feature statistics, alert on anomalies, trigger retraining if drift detected.
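
A minimal sketch of the drift check mentioned above, using the Population Stability Index (PSI). The function name `psi`, the equal-width binning, and the `eps` floor are illustrative assumptions, not part of any particular monitoring library.

```python
import math

def psi(baseline, current, n_bins=10, eps=1e-6):
    """Population Stability Index between two samples of one feature.

    Bins are equal-width over the baseline range; values outside that
    range are clipped into the first or last bin.
    """
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for a constant baseline

    def fractions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), n_bins - 1)] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    expected = fractions(baseline)
    actual = fractions(current)
    return sum((a - e) * math.log(a / e) for a, e in zip(actual, expected))

# Hypothetical feature samples: training-time baseline vs. a shifted production window.
baseline = [i / 100 for i in range(100)]
shifted = [x + 0.5 for x in baseline]
drift_score = psi(baseline, shifted)
```

A common rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift, but the alert threshold is best tuned per feature.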

## 9. Feast Framework - Production Feature Store

**What's happening:** Introduction to Feast, the leading open-source feature store.

**Key points:**
- **Feast components**: Feature registry, offline store (BigQuery/Snowflake), online store (Redis/DynamoDB)
- **Feature definitions**: Entity, data source, feature view (Python files)
- **CLI commands**: feast init, feast apply, feast materialize
- **SDK**: Python API for feature retrieval

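**Production setup:** Feast performs point-in-time joins for training data and materializes the online store from the same feature definitions, which keeps offline and online features consistent. Version the feature definition files themselves in Git.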

## 8. Real-Time Inference with Online Features

**What's happening:** Production inference using features from online store.

**Key points:**
- **Low latency**: Feature retrieval <10ms, total inference <50ms
- **Feature consistency**: Identical to training features (no skew)
- **Entity lookup**: Retrieve pre-computed features by device_id
- **Prediction**: Apply trained model to online features

**Production workflow:** Test handler → online feature store → model inference → bin decision → tester.
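
The workflow above can be sketched as a single handler function. Everything here is hypothetical: `ONLINE_FEATURES` stands in for the online store, `toy_model` for the trained model, and `DEFAULTS` for the fallback values used when a device has no stored features.

```python
import time

# Hypothetical pre-computed features, keyed by device_id.
ONLINE_FEATURES = {
    "D1": {"power_w": 0.55, "idd_z": 0.2},
    "D2": {"power_w": 0.95, "idd_z": 3.1},
}
FEATURE_ORDER = ["power_w", "idd_z"]         # must match the training column order
DEFAULTS = {"power_w": 0.60, "idd_z": 0.0}   # fallback when the entity is missing

def toy_model(vector):
    """Stand-in for a trained model: returns a fail score of 0.0 or 1.0."""
    power_w, idd_z = vector
    return 1.0 if (power_w > 0.9 or abs(idd_z) > 3.0) else 0.0

def predict_bin(device_id, lookup=ONLINE_FEATURES.get, model=toy_model):
    """Look up features, score them, and return a bin decision with timing."""
    start = time.perf_counter()
    features = lookup(device_id)
    used_fallback = features is None
    if used_fallback:
        features = DEFAULTS
    vector = [features[name] for name in FEATURE_ORDER]
    score = model(vector)
    return {
        "device_id": device_id,
        "bin": "FAIL" if score >= 0.5 else "PASS",
        "used_fallback": used_fallback,
        "latency_ms": (time.perf_counter() - start) * 1000.0,
    }

result = predict_bin("D2")
```

Returning `used_fallback` and `latency_ms` with each decision makes it easy to monitor how often defaults are served and whether the latency budget holds.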

## 7. Online Feature Serving - Real-Time Inference

**What's happening:** Populating online store for low-latency feature retrieval during production inference.

**Key points:**
- **Online store**: Redis, DynamoDB, Cassandra (key-value, <10ms latency)
- **Feature freshness**: Continuously updated from streaming pipeline
- **Entity-based lookup**: Retrieve features by device_id, customer_id, etc.
- **Precomputed features**: No computation at inference time

**Production scenario:** Production tester sends device_id → feature store returns features → model predicts bin in <50ms.
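
As a rough illustration of how an online store keeps only the latest precomputed row per entity and rejects stale reads, here is a dict-backed sketch. `InMemoryOnlineStore` and its methods are hypothetical stand-ins for a key-value store such as Redis, and timestamps are plain seconds.

```python
class InMemoryOnlineStore:
    """Dict-backed stand-in for a key-value online store."""

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds
        self._rows = {}  # entity_id -> (feature_ts, features)

    def write(self, entity_id, feature_ts, features):
        current = self._rows.get(entity_id)
        # Keep only the newest row; late-arriving older rows are ignored.
        if current is None or feature_ts >= current[0]:
            self._rows[entity_id] = (feature_ts, dict(features))

    def read(self, entity_id, now):
        row = self._rows.get(entity_id)
        if row is None:
            return None
        feature_ts, features = row
        if now - feature_ts > self.ttl_seconds:
            return None  # older than the TTL: treat as missing
        return features

store = InMemoryOnlineStore(ttl_seconds=3600)
store.write("D1", 1000, {"idd_mean": 0.52})
store.write("D1", 900, {"idd_mean": 0.40})    # older row, ignored
store.write("D1", 1200, {"idd_mean": 0.55})   # newer row, replaces
fresh = store.read("D1", now=1500)
stale = store.read("D1", now=1200 + 3601)
```

Treating an expired row as missing forces the caller to choose a fallback explicitly instead of silently predicting on old data.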

## 6. Model Training with Feature Store Features

**What's happening:** Training yield prediction model using features from feature store.

**Key points:**
- **Feature consistency**: Same features used for training and serving
- **Feature selection**: Choose relevant features from store
- **Model performance**: Baseline for comparison with production
- **Feature importance**: Identify most predictive features

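**Why this matters:** Because training and serving read the same feature definitions, a drop in production accuracy is much less likely to come from training-serving skew. Data drift can still degrade the model, so monitoring remains necessary.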

## 5. Offline Feature Materialization - Training Dataset

**What's happening:** Computing and storing historical features for model training.

**Key points:**
- **Batch computation**: Process all historical data at once
- **Point-in-time correctness**: Features valid at specific timestamps
- **Storage**: Data warehouse, data lake, or file storage
- **Training dataset**: Join features to labels for supervised learning

**Production workflow:** Airflow/Kubeflow schedules daily batch jobs to materialize features from STDF data warehouse.
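
Point-in-time correctness is easiest to see in code. The sketch below joins each label to the newest feature row at or before the label's timestamp, using only the standard library. The function name `point_in_time_join`, the tuple layout, and the sample data are assumptions for illustration.

```python
from bisect import bisect_right

def point_in_time_join(labels, feature_rows):
    """Attach to each label the newest feature row at or before the label time.

    labels:       list of (entity_id, event_ts, label)
    feature_rows: list of (entity_id, feature_ts, value)
    """
    by_entity = {}
    for entity_id, ts, value in sorted(feature_rows, key=lambda r: (r[0], r[1])):
        times, values = by_entity.setdefault(entity_id, ([], []))
        times.append(ts)
        values.append(value)

    joined = []
    for entity_id, event_ts, label in labels:
        times, values = by_entity.get(entity_id, ([], []))
        idx = bisect_right(times, event_ts) - 1  # last feature_ts <= event_ts
        feature = values[idx] if idx >= 0 else None  # None: no feature existed yet
        joined.append((entity_id, event_ts, feature, label))
    return joined

# Hypothetical data: timestamps are integer hours for readability.
feature_rows = [("LOT_A", 1, 0.90), ("LOT_A", 5, 0.80), ("LOT_A", 9, 0.70)]
labels = [("LOT_A", 0, 1), ("LOT_A", 5, 1), ("LOT_A", 7, 0)]
training_rows = point_in_time_join(labels, feature_rows)
```

The label at hour 7 receives the hour-5 feature value, never the hour-9 one, which is exactly the leakage a naive "latest value" join would introduce.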

## 4. Feature Definitions - Register Features in Store

**What's happening:** Defining feature transformations and registering them in the feature store.

**Key points:**
- **Feature function**: Transform raw data → feature values
- **Metadata**: Name, description, type, computation logic
- **Registry**: Central catalog of all available features

**Production note:** In Feast, features are defined in Python feature definitions files with entity, data source, and feature view specifications.
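
To make the registry idea concrete, here is a minimal in-memory sketch that pairs each feature function with its metadata. `FeatureDefinition`, `FeatureRegistry`, and the two example features are hypothetical names chosen for illustration; this is not the Feast API.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass(frozen=True)
class FeatureDefinition:
    name: str
    description: str
    dtype: str
    compute: Callable[[dict], float]

class FeatureRegistry:
    """Minimal in-memory catalog of feature definitions."""

    def __init__(self) -> None:
        self._features: Dict[str, FeatureDefinition] = {}

    def register(self, name: str, description: str, dtype: str = "float"):
        """Decorator that records a feature function under a unique name."""
        def decorator(fn):
            if name in self._features:
                raise ValueError(f"feature already registered: {name}")
            self._features[name] = FeatureDefinition(name, description, dtype, fn)
            return fn
        return decorator

    def compute_all(self, raw: dict) -> dict:
        return {name: fd.compute(raw) for name, fd in self._features.items()}

    def list_features(self):
        return sorted(self._features)

registry = FeatureRegistry()

@registry.register("power_w", "Power: Vdd * Idd")
def power_w(raw):
    return raw["vdd"] * raw["idd"]

@registry.register("idd_per_mhz", "Supply current per MHz")
def idd_per_mhz(raw):
    return raw["idd"] / raw["freq"]

row = {"vdd": 1.2, "idd": 0.5, "freq": 2000.0}
feature_vector = registry.compute_all(row)
```

Because training and serving both call `compute_all`, each transformation is defined once, which is the property the registry exists to protect.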

## 3. Post-Silicon Feature Engineering - Device Characterization

**What's happening:** Creating device features from STDF test data for yield prediction.

**Key points:**
- **Raw parameters**: Vdd, Idd, Freq, Temp (direct measurements)
- **Aggregate features**: Statistical summaries (mean, std, percentiles)
- **Derived features**: Ratios, power calculations, Z-scores
- **Temporal features**: Test time, sequence effects

**Why this matters:** Consistent feature definitions ensure yield model trained on historical data performs accurately on production devices.
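
The derived features listed above might be computed along these lines. The field names (`vdd`, `idd`, `freq`), the sample values, and the helper `add_derived_features` are assumptions for illustration, with units taken to be volts, amps, and MHz.

```python
from statistics import mean, pstdev

# Hypothetical raw measurements for three devices from one lot.
raw = [
    {"device_id": "D1", "vdd": 1.00, "idd": 0.50, "freq": 2000.0},
    {"device_id": "D2", "vdd": 1.00, "idd": 0.60, "freq": 2100.0},
    {"device_id": "D3", "vdd": 1.00, "idd": 0.70, "freq": 2200.0},
]

def add_derived_features(rows):
    """Return new rows with power, frequency per watt, and a lot-level Idd Z-score."""
    idd_values = [r["idd"] for r in rows]
    idd_mean = mean(idd_values)
    idd_std = pstdev(idd_values)  # population standard deviation over the lot
    out = []
    for r in rows:
        power = r["vdd"] * r["idd"]  # P = V * I
        z = (r["idd"] - idd_mean) / idd_std if idd_std > 0 else 0.0
        out.append({
            **r,
            "power_w": power,
            "freq_per_watt": r["freq"] / power if power else None,
            "idd_z": z,
        })
    return out

features = add_derived_features(raw)
```

Note that the Z-score here uses every device in the lot. In a point-in-time-correct pipeline, the lot statistics would be restricted to devices tested before the one being scored.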

## 2. Feature Store Fundamentals - Building Blocks

**What's happening:** Understanding the core concepts before implementation.

**Key points:**
- **Feature Definition**: Schema, data types, transformation logic
- **Offline vs Online**: Historical data (training) vs low-latency (serving)
- **Point-in-Time Correctness**: Prevents data leakage from future
- **Feature Versioning**: Track changes, enable reproducibility

**Post-silicon context:** Device features must be identical between model training and production test binning to avoid skew.