# 01 - Introduction: ML System Components (Beyond the Model)

---

## What the Chapter Says

A production ML system is **far more than just the model**. The chapter explicitly lists these components:

- **Data collection** + verification + feature extraction
- **Evaluation pipeline**
- **Monitoring**
- **Serving infrastructure**
- **Process management tools**
- **Resource management**
- **Analysis tools**
- **Configuration**

The model ("ML Code") is often a tiny fraction of the overall system.

---

## Meta Interview Signal

| Level | Expectations |
|-------|-------------|
| **E5** | Can articulate the full system beyond the model. Understands why each component exists. Can draw the component diagram and explain data flow. |
| **E6** | Discusses **scale implications** for each component (e.g., monitoring at 1B users). Identifies **failure modes** (stale features, evaluation drift). Proposes **iteration velocity** improvements (faster retraining, automated config). |

---

## Diagram: Production ML System Components

This diagram follows the chapter's component list and adds a feature store, which the simulation below also uses:

In [None]:
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches

fig, ax = plt.subplots(figsize=(14, 10))
ax.set_xlim(0, 14)
ax.set_ylim(0, 10)
ax.axis('off')
ax.set_title('Production ML System Components (Chapter Diagram)', fontsize=14, fontweight='bold')

# Component boxes - matching the chapter's component list
components = [
    # (x, y, width, height, label, color)
    (1, 7.5, 2.5, 1.2, 'Data\nCollection', '#E3F2FD'),
    (4, 7.5, 2.5, 1.2, 'Data\nVerification', '#E3F2FD'),
    (7, 7.5, 2.5, 1.2, 'Feature\nExtraction', '#E3F2FD'),
    (10.5, 7.5, 2.5, 1.2, 'Feature\nStore', '#E3F2FD'),
    
    (5.5, 5, 3, 1.5, 'ML CODE', '#FFCDD2'),  # The model: one box out of twelve
    
    (1, 2.5, 2.5, 1.2, 'Serving\nInfrastructure', '#C8E6C9'),
    (4, 2.5, 2.5, 1.2, 'Monitoring', '#C8E6C9'),
    (7, 2.5, 2.5, 1.2, 'Evaluation\nPipeline', '#C8E6C9'),
    (10.5, 2.5, 2.5, 1.2, 'Analysis\nTools', '#C8E6C9'),
    
    (1, 0.5, 2.5, 1.2, 'Configuration', '#FFF9C4'),
    (4, 0.5, 2.5, 1.2, 'Process\nManagement', '#FFF9C4'),
    (7, 0.5, 2.5, 1.2, 'Resource\nManagement', '#FFF9C4'),
]

for (x, y, w, h, label, color) in components:
    rect = mpatches.FancyBboxPatch((x, y), w, h, boxstyle='round,pad=0.05',
                                    facecolor=color, edgecolor='black', linewidth=1.5)
    ax.add_patch(rect)
    ax.text(x + w/2, y + h/2, label, ha='center', va='center', fontsize=9, fontweight='bold')

# Arrows showing data flow
arrow_style = dict(arrowstyle='->', color='gray', lw=1.5)
ax.annotate('', xy=(4, 8.1), xytext=(3.5, 8.1), arrowprops=arrow_style)
ax.annotate('', xy=(7, 8.1), xytext=(6.5, 8.1), arrowprops=arrow_style)
ax.annotate('', xy=(10.5, 8.1), xytext=(9.5, 8.1), arrowprops=arrow_style)
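# x=7.75 lies inside Feature Extraction, ML Code, and Evaluation, so the arrows join box edges rather than corners
ax.annotate('', xy=(7.75, 6.5), xytext=(7.75, 7.5), arrowprops=arrow_style)  # Features to ML Code
ax.annotate('', xy=(7.75, 3.7), xytext=(7.75, 5), arrowprops=arrow_style)    # ML Code to Evaluation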

# Legend
ax.text(0.5, 9.5, 'Blue = Data Pipeline | Red = ML Code (tiny!) | Green = Ops | Yellow = Management',
        fontsize=10, style='italic')

plt.tight_layout()
plt.show()

### ASCII Version (for quick whiteboard recall)

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                        PRODUCTION ML SYSTEM                                  │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   ┌──────────────┐   ┌──────────────┐   ┌──────────────┐   ┌─────────────┐  │
│   │    Data      │──▶│    Data      │──▶│   Feature    │──▶│   Feature   │  │
│   │  Collection  │   │ Verification │   │  Extraction  │   │    Store    │  │
│   └──────────────┘   └──────────────┘   └──────────────┘   └─────────────┘  │
│                                               │                              │
│                                               ▼                              │
│                                    ┌─────────────────┐                       │
│                                    │    ML CODE      │  ◀── Small fraction!  │
│                                    └─────────────────┘                       │
│                                               │                              │
│                                               ▼                              │
│   ┌──────────────┐   ┌──────────────┐   ┌──────────────┐   ┌─────────────┐  │
│   │   Serving    │◀──│  Monitoring  │◀──│  Evaluation  │──▶│  Analysis   │  │
│   │Infrastructure│   │              │   │   Pipeline   │   │    Tools    │  │
│   └──────────────┘   └──────────────┘   └──────────────┘   └─────────────┘  │
│                                                                              │
│   ┌──────────────┐   ┌──────────────┐   ┌──────────────┐                     │
│   │Configuration │   │   Process    │   │   Resource   │                     │
│   │              │   │  Management  │   │  Management  │                     │
│   └──────────────┘   └──────────────┘   └──────────────┘                     │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘
```

---

## Hands-On: Simulating Component Interactions

Let's build a minimal simulation of the component pipeline for a **Feed Ranking** system that predicts whether a user will like a post.

In [None]:
import numpy as np
import pandas as pd
from datetime import datetime, timedelta
import warnings
warnings.filterwarnings('ignore')

np.random.seed(42)

# ============================================================
# COMPONENT 1: Data Collection
# ============================================================
print("=" * 60)
print("COMPONENT: Data Collection")
print("=" * 60)

n_interactions = 10000

raw_data = pd.DataFrame({
    'user_id': np.random.randint(1, 1001, n_interactions),
    'post_id': np.random.randint(1, 5001, n_interactions),
    'action': np.random.choice(['like', 'dislike', 'view', 'share', None], 
                               n_interactions, p=[0.15, 0.05, 0.6, 0.05, 0.15]),
    'timestamp': [datetime.now() - timedelta(hours=np.random.randint(0, 720)) 
                  for _ in range(n_interactions)],
    'device': np.random.choice(['iOS', 'Android', 'Web', None], n_interactions, p=[0.4, 0.35, 0.2, 0.05])
})

print(f"Collected {len(raw_data)} raw interactions")
print(f"Sample:\n{raw_data.head()}")
print(f"\nMissing values:\n{raw_data.isnull().sum()}")

In [None]:
# ============================================================
# COMPONENT 2: Data Verification
# ============================================================
print("\n" + "=" * 60)
print("COMPONENT: Data Verification")
print("=" * 60)

def verify_data(df):
    """Check data quality: missing values and duplicate rows."""
    issues = []
    
    # Check for nulls
    null_counts = df.isnull().sum()
    for col, count in null_counts.items():
        if count > 0:
            issues.append(f"Column '{col}' has {count} null values ({count/len(df)*100:.1f}%)")
    
    # Check for duplicates
    dup_count = df.duplicated().sum()
    if dup_count > 0:
        issues.append(f"Found {dup_count} duplicate rows")
    
    return issues

verification_issues = verify_data(raw_data)
print("Verification Report:")
for issue in verification_issues:
    print(f"  [!] {issue}")

# Clean data
clean_data = raw_data.dropna(subset=['action']).copy()
clean_data['device'] = clean_data['device'].fillna('Unknown')
print(f"\nAfter cleaning: {len(clean_data)} rows (dropped {len(raw_data) - len(clean_data)})")
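The checks above cover nulls and duplicates. A verification step often also rejects values outside an expected domain. The following is a minimal sketch in plain Python, so it runs without the notebook's state. `ALLOWED_ACTIONS`, `ALLOWED_DEVICES`, and `validate_record` are hypothetical names, and the allowed sets simply mirror the values simulated and cleaned above.

```python
from datetime import datetime, timedelta

# Hypothetical allow-lists, mirroring the values simulated and cleaned above.
ALLOWED_ACTIONS = {"like", "dislike", "view", "share"}
ALLOWED_DEVICES = {"iOS", "Android", "Web", "Unknown"}


def validate_record(record, now, max_age=timedelta(days=30)):
    """Return a list of problems found in one cleaned interaction record."""
    problems = []
    if record.get("action") not in ALLOWED_ACTIONS:
        problems.append(f"unexpected action: {record.get('action')!r}")
    if record.get("device") not in ALLOWED_DEVICES:
        problems.append(f"unexpected device: {record.get('device')!r}")
    ts = record.get("timestamp")
    if ts is None:
        problems.append("missing timestamp")
    elif ts > now:
        problems.append("timestamp is in the future")
    elif now - ts > max_age:
        problems.append("timestamp is older than the lookback window")
    return problems


now = datetime(2024, 1, 31, 12, 0)
good = {"action": "like", "device": "iOS", "timestamp": now - timedelta(hours=3)}
bad = {"action": "poke", "device": None, "timestamp": now + timedelta(hours=1)}

print(validate_record(good, now))
print(validate_record(bad, now))
```

Returning a list of problems, rather than raising on the first one, lets the verification report show everything wrong with a record at once.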

In [None]:
# ============================================================
# COMPONENT 3: Feature Extraction
# ============================================================
print("\n" + "=" * 60)
print("COMPONENT: Feature Extraction")
print("=" * 60)

# User-level features
user_features = clean_data.groupby('user_id').agg({
    'post_id': 'count',
    'action': lambda x: (x == 'like').sum() / len(x),  # like rate
}).rename(columns={'post_id': 'total_actions', 'action': 'like_rate'})

# Post-level features  
post_features = clean_data.groupby('post_id').agg({
    'user_id': 'nunique',
    'action': lambda x: (x == 'like').sum(),
}).rename(columns={'user_id': 'unique_viewers', 'action': 'total_likes'})

print(f"Extracted features for {len(user_features)} users and {len(post_features)} posts")
print(f"\nUser features sample:\n{user_features.head()}")
print(f"\nPost features sample:\n{post_features.head()}")
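One caveat about the aggregates above: they are computed over the same interactions that later supply the training labels, so `like_rate` and `total_likes` partly encode the target. A common remedy is to compute features only from events before a cutoff and take labels from events after it. The following is a minimal stdlib sketch with made-up events; `like_rate_before` and the cutoff date are illustrative and not part of the chapter.

```python
from collections import defaultdict
from datetime import datetime

# Made-up (user_id, action, timestamp) events for illustration.
events = [
    (1, "like", datetime(2024, 1, 1)),
    (1, "view", datetime(2024, 1, 2)),
    (1, "like", datetime(2024, 1, 10)),  # after the cutoff: used as a label only
    (2, "view", datetime(2024, 1, 3)),
    (2, "like", datetime(2024, 1, 11)),  # after the cutoff: used as a label only
]
cutoff = datetime(2024, 1, 5)


def like_rate_before(events, cutoff):
    """Per-user like rate using only events strictly before `cutoff`."""
    likes = defaultdict(int)
    totals = defaultdict(int)
    for user_id, action, ts in events:
        if ts < cutoff:
            totals[user_id] += 1
            likes[user_id] += int(action == "like")
    return {user_id: likes[user_id] / totals[user_id] for user_id in totals}


features = like_rate_before(events, cutoff)

# Training rows come only from events at or after the cutoff:
# (user_id, like_rate feature, label).
training_rows = [
    (user_id, features.get(user_id, 0.0), int(action == "like"))
    for user_id, action, ts in events
    if ts >= cutoff
]
print(features)
print(training_rows)
```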

In [None]:
# ============================================================
# COMPONENT 4: Feature Store (simulated)
# ============================================================
print("\n" + "=" * 60)
print("COMPONENT: Feature Store")
print("=" * 60)

class SimpleFeatureStore:
    """Simulates a feature store with versioning"""
    def __init__(self):
        self.features = {}
        self.metadata = {}
    
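    def register(self, name, df, description):
        # Re-registering an existing name bumps its version number
        version = self.metadata.get(name, {}).get('version', 0) + 1
        self.features[name] = df
        self.metadata[name] = {
            'description': description,
            'shape': df.shape,
            'created_at': datetime.now(),
            'version': version
        }
        print(f"Registered: {name} v{version} - {df.shape}")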
    
    def get(self, name):
        return self.features.get(name)
    
    def list_features(self):
        for name, meta in self.metadata.items():
            print(f"  {name}: {meta['description']} | shape={meta['shape']}")

feature_store = SimpleFeatureStore()
feature_store.register('user_engagement', user_features, 'User-level engagement metrics')
feature_store.register('post_popularity', post_features, 'Post-level popularity metrics')

print("\nFeature Store Contents:")
feature_store.list_features()

In [None]:
# ============================================================
# COMPONENT 5: ML Code (the small part!)
# ============================================================
print("\n" + "=" * 60)
print("COMPONENT: ML Code (The Small Part!)")
print("=" * 60)

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Create training data: predict like vs not-like
training_data = clean_data[clean_data['action'].isin(['like', 'dislike', 'view'])].copy()
training_data['label'] = (training_data['action'] == 'like').astype(int)

# Merge features
training_data = training_data.merge(user_features, on='user_id', how='left')
training_data = training_data.merge(post_features, on='post_id', how='left')
training_data = training_data.fillna(0)

X = training_data[['total_actions', 'like_rate', 'unique_viewers', 'total_likes']]
y = training_data['label']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = LogisticRegression()
model.fit(X_train, y_train)

print(f"Model trained on {len(X_train)} samples")
print(f"Test accuracy: {model.score(X_test, y_test):.3f}")

In [None]:
# ============================================================
# COMPONENT 6: Evaluation Pipeline
# ============================================================
print("\n" + "=" * 60)
print("COMPONENT: Evaluation Pipeline")
print("=" * 60)

from sklearn.metrics import precision_score, recall_score, f1_score, roc_auc_score

y_pred = model.predict(X_test)
y_proba = model.predict_proba(X_test)[:, 1]

metrics = {
    'precision': precision_score(y_test, y_pred),
    'recall': recall_score(y_test, y_pred),
    'f1': f1_score(y_test, y_pred),
    'roc_auc': roc_auc_score(y_test, y_proba)
}

print("Offline Evaluation Metrics:")
for metric, value in metrics.items():
    print(f"  {metric}: {value:.4f}")

In [None]:
# ============================================================
# COMPONENT 7: Monitoring
# ============================================================
print("\n" + "=" * 60)
print("COMPONENT: Monitoring")
print("=" * 60)

class ModelMonitor:
    """Simulates production monitoring"""
    def __init__(self, baseline_metrics):
        self.baseline = baseline_metrics
        self.alerts = []
    
    def check_metrics(self, current_metrics, threshold=0.1):
        """Alert if metrics drift beyond threshold"""
        for metric, current in current_metrics.items():
            baseline = self.baseline.get(metric, current)
            drift = abs(current - baseline) / baseline if baseline > 0 else 0
            status = "OK" if drift < threshold else "ALERT"
            print(f"  {metric}: baseline={baseline:.3f}, current={current:.3f}, drift={drift:.1%} [{status}]")
            if drift >= threshold:
                self.alerts.append(f"{metric} drifted by {drift:.1%}")

monitor = ModelMonitor(metrics)

# Simulate production metrics: each baseline value scaled by a random factor in [0.85, 1.05]
simulated_current = {k: v * np.random.uniform(0.85, 1.05) for k, v in metrics.items()}
print("Monitoring Check:")
monitor.check_metrics(simulated_current)
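The drift check above is a relative change, |current - baseline| / baseline. A worked example with made-up numbers (not results from this notebook) shows where the default 10% threshold falls; `relative_drift` is a hypothetical helper that repeats the same formula.

```python
def relative_drift(baseline, current):
    """Relative change against the baseline; 0 when the baseline is not positive."""
    return abs(current - baseline) / baseline if baseline > 0 else 0.0


threshold = 0.1
examples = [
    ("precision", 0.80, 0.76),  # 0.04 / 0.80, a 5% relative change
    ("recall", 0.50, 0.40),     # 0.10 / 0.50, a 20% relative change
]
for name, baseline, current in examples:
    drift = relative_drift(baseline, current)
    status = "OK" if drift < threshold else "ALERT"
    print(f"{name}: drift={drift:.1%} [{status}]")
```

Because the formula takes an absolute value, an improvement of the same size would trigger the same alert.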

In [None]:
# ============================================================
# COMPONENT 8: Serving Infrastructure (simulated)
# ============================================================
print("\n" + "=" * 60)
print("COMPONENT: Serving Infrastructure")
print("=" * 60)

import time

class ModelServer:
    """Simulates serving infrastructure"""
    def __init__(self, model, feature_store):
        self.model = model
        self.feature_store = feature_store
        self.request_count = 0
        self.latencies = []
    
    def predict(self, user_id, post_id):
        start = time.perf_counter()  # monotonic clock, suited to measuring intervals

        # Fetch features (simulated lookup)
        user_feats = self.feature_store.get('user_engagement')
        post_feats = self.feature_store.get('post_popularity')

        # Get features or defaults
        u = user_feats.loc[user_id] if user_id in user_feats.index else pd.Series({'total_actions': 0, 'like_rate': 0.5})
        p = post_feats.loc[post_id] if post_id in post_feats.index else pd.Series({'unique_viewers': 0, 'total_likes': 0})

        # Reuse the training column names so scikit-learn sees the same feature layout
        features = pd.DataFrame(
            [[u['total_actions'], u['like_rate'], p['unique_viewers'], p['total_likes']]],
            columns=['total_actions', 'like_rate', 'unique_viewers', 'total_likes'])
        prob = self.model.predict_proba(features)[0][1]

        latency_ms = (time.perf_counter() - start) * 1000
        self.latencies.append(latency_ms)
        self.request_count += 1

        return {'probability': prob, 'latency_ms': latency_ms}

server = ModelServer(model, feature_store)

# Simulate requests
print("Simulating 5 prediction requests:")
for i in range(5):
    result = server.predict(user_id=np.random.randint(1, 100), post_id=np.random.randint(1, 500))
    print(f"  Request {i+1}: prob={result['probability']:.3f}, latency={result['latency_ms']:.2f}ms")

print(f"\nTotal requests served: {server.request_count}")
print(f"Average latency: {np.mean(server.latencies):.2f}ms")
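The server records every latency, but an average hides tail behaviour, and the configuration in the next cell names `latency_p99` as a monitored metric. The following is a minimal sketch of a nearest-rank percentile over a list of latencies. `percentile` is a hypothetical helper and the sample values are made up.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of the data at or below it."""
    if not values:
        raise ValueError("percentile of an empty list is undefined")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Made-up latencies in milliseconds: mostly fast, with two slow outliers.
latencies_ms = [2.0] * 97 + [3.0, 120.0, 150.0]

print(f"mean: {sum(latencies_ms) / len(latencies_ms):.2f} ms")
print(f"p50:  {percentile(latencies_ms, 50):.2f} ms")
print(f"p99:  {percentile(latencies_ms, 99):.2f} ms")
```

With this sample the mean stays in single digits while the p99 lands on one of the outliers, which is why latency budgets such as `max_latency_ms` are usually checked against a high percentile rather than the average.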

In [None]:
# ============================================================
# COMPONENT 9: Configuration & Resource Management (overview)
# ============================================================
print("\n" + "=" * 60)
print("COMPONENT: Configuration & Resource Management")
print("=" * 60)

system_config = {
    'model': {
        'version': 'v1.2.3',
        'type': 'LogisticRegression',
        'features': ['total_actions', 'like_rate', 'unique_viewers', 'total_likes']
    },
    'serving': {
        'max_latency_ms': 50,
        'replicas': 10,
        'batch_size': 1
    },
    'training': {
        'schedule': 'daily',
        'data_lookback_days': 30
    },
    'monitoring': {
        'alert_threshold': 0.1,
        'metrics': ['precision', 'recall', 'latency_p99']
    }
}

print("System Configuration:")
import json
print(json.dumps(system_config, indent=2))
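A configuration like the one above is only useful if bad values are caught before deployment. The following is a minimal validation sketch; `validate_config` and its specific rules are assumptions for illustration, not something the chapter prescribes.

```python
def validate_config(config):
    """Return a list of human-readable problems; an empty list means the config passed."""
    errors = []
    serving = config.get("serving", {})
    monitoring = config.get("monitoring", {})

    if serving.get("max_latency_ms", 0) <= 0:
        errors.append("serving.max_latency_ms must be positive")
    if serving.get("replicas", 0) < 1:
        errors.append("serving.replicas must be at least 1")
    threshold = monitoring.get("alert_threshold")
    if threshold is None or not 0 < threshold < 1:
        errors.append("monitoring.alert_threshold must be between 0 and 1")
    if not config.get("model", {}).get("features"):
        errors.append("model.features must list at least one feature")
    return errors


good = {
    "model": {"features": ["total_actions", "like_rate"]},
    "serving": {"max_latency_ms": 50, "replicas": 10},
    "monitoring": {"alert_threshold": 0.1},
}
bad = {
    "model": {"features": []},
    "serving": {"max_latency_ms": 50, "replicas": 0},
    "monitoring": {"alert_threshold": 1.5},
}
print(validate_config(good))
print(validate_config(bad))
```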

---

## The Framework Steps (Chapter Order)

The chapter defines exactly this order:

1. **Clarifying requirements**
2. **Framing the problem as an ML task**
3. **Data preparation**
4. **Model development**
5. **Evaluation**
6. **Deployment and serving**
7. **Monitoring and infrastructure**

Each subsequent notebook will cover one or more of these steps in depth.

In [None]:
# Visual representation of the framework
fig, ax = plt.subplots(figsize=(12, 4))
ax.axis('off')

steps = [
    '1. Clarify\nRequirements',
    '2. Frame as\nML Task',
    '3. Data\nPreparation',
    '4. Model\nDevelopment',
    '5. Evaluation',
    '6. Deployment\n& Serving',
    '7. Monitoring\n& Infra'
]

colors = ['#BBDEFB', '#C8E6C9', '#FFF9C4', '#FFCCBC', '#E1BEE7', '#B2DFDB', '#F8BBD9']

for i, (step, color) in enumerate(zip(steps, colors)):
    x = i * 1.6 + 0.5
    rect = mpatches.FancyBboxPatch((x, 1), 1.4, 2, boxstyle='round,pad=0.05',
                                    facecolor=color, edgecolor='black', linewidth=2)
    ax.add_patch(rect)
    ax.text(x + 0.7, 2, step, ha='center', va='center', fontsize=9, fontweight='bold')
    
    if i < len(steps) - 1:
        ax.annotate('', xy=(x + 1.6, 2), xytext=(x + 1.4, 2),  # span the full gap to the next box
                   arrowprops=dict(arrowstyle='->', color='black', lw=2))

ax.set_xlim(0, 12)
ax.set_ylim(0, 4)
ax.set_title('ML System Design Framework (Chapter Order)', fontsize=12, fontweight='bold')
plt.tight_layout()
plt.show()

---

## Tradeoffs (Chapter-Aligned)

| Component | Tradeoff | E5 Understanding | E6 Addition |
|-----------|----------|------------------|-------------|
| Data Collection | More data vs. cost/latency | Know that more data usually improves models | Consider data freshness SLAs, ingestion at petabyte scale |
| Feature Store | Freshness vs. compute | Batch features are cheaper | Real-time features for time-sensitive predictions |
| Serving | Latency vs. accuracy | Simpler models = faster | Model compression, caching strategies at scale |
| Monitoring | Coverage vs. alert fatigue | Track key metrics | Define actionable alerts, avoid noise |
| Configuration | Flexibility vs. complexity | Version models and configs | Gradual rollouts, feature flags for safety |
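
The "gradual rollouts" entry in the Configuration row can be made concrete with deterministic bucketing: hash each user into one of 100 buckets and serve the new model only to buckets below the rollout percentage. This is a sketch of one common approach, not something the chapter specifies; `in_rollout` and the salt string are hypothetical.

```python
import hashlib


def in_rollout(user_id, percent, salt="model-v2"):
    """Deterministically place a user in the first `percent` of 100 buckets."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < percent


users = range(1, 10001)
at_5 = {u for u in users if in_rollout(u, 5)}
at_20 = {u for u in users if in_rollout(u, 20)}

print(f"5% rollout:  {len(at_5)} of 10000 users")
print(f"20% rollout: {len(at_20)} of 10000 users")
print(f"5% group stays enrolled at 20%: {at_5 <= at_20}")
```

Hashing instead of random sampling keeps each user's assignment stable across requests, and raising the percentage only ever adds users to the rollout group.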

---

## Meta Interview Signal (Detailed)

### E5 Answer Expectations

- Can draw the full system diagram from memory
- Explains why each component exists (not just what it does)
- Understands the data flow from collection → serving
- Articulates that "ML Code" is small compared to the infrastructure

### E6 Additions

- **Scale**: "At Meta scale with billions of users, the feature store alone handles X QPS..."
- **Failure modes**: "If the feature pipeline is delayed, we fall back to cached features from the last successful run"
- **Iteration velocity**: "We can ship a new model version within hours by decoupling training from serving"
- **Feedback loops**: "User engagement signals flow back into training data, creating a flywheel"
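
The failure-mode answer above (falling back to cached features when the pipeline is delayed) can be sketched as a small selection function. All names here (`FeatureSnapshot`, `select_features`, the 24-hour cache limit) are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FeatureSnapshot:
    values: dict
    computed_at: datetime


def select_features(latest_run, cache, now, max_cache_age=timedelta(hours=24)):
    """Use the latest pipeline output if it arrived; otherwise fall back to the cache.

    Returns (snapshot, status), where status is "fresh", "fallback", or
    "fallback_stale", so the caller can log or alert on degraded serving.
    """
    if latest_run is not None:
        return latest_run, "fresh"
    age = now - cache.computed_at
    status = "fallback" if age <= max_cache_age else "fallback_stale"
    return cache, status


now = datetime(2024, 1, 31, 12, 0)
latest = FeatureSnapshot({"like_rate": 0.25}, now - timedelta(hours=1))
recent_cache = FeatureSnapshot({"like_rate": 0.22}, now - timedelta(hours=5))
old_cache = FeatureSnapshot({"like_rate": 0.18}, now - timedelta(hours=30))

scenarios = [
    ("pipeline on time", latest, recent_cache),
    ("pipeline delayed", None, recent_cache),
    ("pipeline delayed, old cache", None, old_cache),
]
for label, run, cache in scenarios:
    snapshot, status = select_features(run, cache, now)
    print(f"{label}: like_rate={snapshot.values['like_rate']} [{status}]")
```

Returning a status alongside the snapshot ties the fallback to monitoring: a sustained run of `fallback_stale` results is itself a signal worth alerting on.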

---

## Interview Drills

### Drill 1: Whiteboard the System
Draw the production ML system diagram from memory. Include every component in the chapter's list, plus the ML code itself.

### Drill 2: Component Failure Scenarios
For each component, describe:
- What happens if it fails?
- What's the fallback?
- How do you detect the failure?

### Drill 3: Scale Discussion
Pick the Feed Ranking use case. Walk through each component and explain how it changes at:
- 1M users
- 100M users  
- 1B users
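
A back-of-envelope load estimate helps anchor this drill. The numbers below are illustrative assumptions, not figures from the chapter: every user is active daily, each makes 20 feed requests per day, each request ranks 500 candidate posts, and peak traffic is 3x the average.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Assumed workload parameters (hypothetical; adjust to the interview prompt).
REQUESTS_PER_USER_PER_DAY = 20
CANDIDATES_PER_REQUEST = 500
PEAK_TO_AVERAGE = 3


def estimate_load(daily_users):
    """Return (average request QPS, peak request QPS, peak predictions per second)."""
    avg_qps = daily_users * REQUESTS_PER_USER_PER_DAY / SECONDS_PER_DAY
    peak_qps = avg_qps * PEAK_TO_AVERAGE
    peak_predictions = peak_qps * CANDIDATES_PER_REQUEST
    return avg_qps, peak_qps, peak_predictions


for users in (1_000_000, 100_000_000, 1_000_000_000):
    avg_qps, peak_qps, peak_preds = estimate_load(users)
    print(f"{users:>13,} users: avg {avg_qps:>9,.0f} QPS, "
          f"peak {peak_qps:>9,.0f} QPS, {peak_preds:>13,.0f} predictions/s at peak")
```

The predictions-per-second figure is usually the one that drives serving design, since each feed request fans out into hundreds of model calls.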

### Drill 4: The "ML Code is Small" Argument
Your interviewer challenges: "If ML code is so small, why do we hire ML engineers?"
Prepare a 2-minute response.

### Drill 5: Framework Steps
Recite the 7 framework steps in order. For each, give a one-sentence summary of its purpose.