# **Chapter 78: IoT and Sensor Analytics**

## **Learning Objectives**

By the end of this chapter, you will be able to:

- Understand the unique characteristics of IoT sensor data: high velocity, volume, variety, and veracity.
- Design data ingestion pipelines for streaming sensor data using message brokers (MQTT, Kafka).
- Engineer features from raw sensor readings, including time‑domain, frequency‑domain, and statistical features.
- Implement real‑time anomaly detection and predictive maintenance models.
- Deploy models at the edge for low‑latency inference.
- Integrate IoT analytics with the monitoring and alerting system developed in Chapter 73.
- Evaluate models using domain‑specific metrics such as remaining useful life (RUL) accuracy.

---

## **78.1 Introduction to IoT and Sensor Analytics**

The Internet of Things (IoT) generates vast amounts of time‑series data from sensors embedded in machines, environments, and wearables. Predicting equipment failures, detecting anomalies, and optimising operations based on this data can deliver immense value.

Sensor data differs from the time series we have seen so far (financial, retail, weather) in several ways:

- **High frequency**: Sensors may sample at rates from 1 Hz to many kHz.
- **Multivariate**: Many correlated channels (e.g., vibration, temperature, pressure).
- **Noisy**: Electrical interference, calibration drift.
- **Missing data**: Communication dropouts, sensor failures.
- **Spatial distribution**: Sensors are often deployed across a physical asset or environment.

A common application is **predictive maintenance**: using sensor data to predict when a machine will fail, so that maintenance can be scheduled just in time. This avoids unplanned downtime and reduces costs.

In this chapter, we will build an IoT analytics system for a simulated industrial machine (e.g., a pump or a motor) equipped with multiple sensors. We will generate synthetic data that mimics normal operation and developing faults. Then we will:

1. Ingest streaming data.
2. Engineer features.
3. Train a model to predict remaining useful life (RUL) or detect anomalies.
4. Deploy the model for real‑time inference.
5. Trigger alerts when anomalies are detected (integrating with Chapter 73).

---

## **78.2 Sensor Data Characteristics and Challenges**

Before building the pipeline, let's understand the data characteristics.

### **78.2.1 Velocity**
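Sensor data often arrives at high rates. A single machine might produce thousands of readings per second. Aggregating over time windows (e.g., 1‑second averages) is a common way to reduce the load while preserving slow trends such as temperature drift. Plain averaging removes high‑frequency content, so for vibration it is usual to compute windowed statistics such as RMS or spectral features on the raw signal before downsampling.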
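
As a rough illustration of windowed aggregation, the sketch below downsamples a synthetic 100 Hz signal to 1‑second windows with pandas. The signal, the column name `vibration`, and the rates are illustrative assumptions, not part of the chapter's dataset.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
fs = 100                     # assumed raw sampling rate in Hz
n = 10 * fs                  # 10 seconds of data
index = pd.date_range("2024-01-01", periods=n, freq=pd.Timedelta(milliseconds=10))
signal = np.sin(2 * np.pi * 5 * np.arange(n) / fs) + 0.1 * rng.standard_normal(n)
raw = pd.Series(signal, index=index, name="vibration")

# Aggregate to 1-second windows. The mean tracks slow drift, while the RMS
# keeps the oscillation energy that averaging cancels out.
one_second = pd.Timedelta(seconds=1)
agg = pd.DataFrame({
    "mean": raw.resample(one_second).mean(),
    "rms": np.sqrt((raw ** 2).resample(one_second).mean()),
})
print(agg.head())
```

For an oscillating signal like this one, the per‑window mean stays near zero while the RMS reflects the vibration amplitude, which is why both are often kept.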

### **78.2.2 Volume**
With many sensors and high frequencies, data volumes can be enormous. Efficient storage (e.g., time‑series databases like InfluxDB, or compressed columnar formats) and stream processing are essential.

### **78.2.3 Variety**
Different sensors measure different physical quantities: vibration (accelerometers), temperature (thermocouples), pressure, current, etc. They may have different units, ranges, and sampling rates.

### **78.2.4 Veracity**
Sensor data is noisy. Electrical noise, quantization error, and occasional dropouts must be handled. Outliers may indicate faults or just spurious readings.
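
One possible way to handle spikes and short dropouts is sketched below: a rolling‑median (Hampel‑style) test masks spurious readings, and interpolation fills only a limited number of consecutive missing values. The function name `clean_signal` and its default parameters are hypothetical choices for illustration and would need tuning per sensor.

```python
import numpy as np
import pandas as pd

def clean_signal(series, window=21, n_sigmas=5.0, max_gap=3):
    """Mask spikes with a rolling-median test, then fill short gaps."""
    median = series.rolling(window, center=True, min_periods=1).median()
    abs_dev = (series - median).abs()
    # 1.4826 * MAD approximates the standard deviation for Gaussian noise
    mad = abs_dev.rolling(window, center=True, min_periods=1).median()
    is_spike = abs_dev > n_sigmas * 1.4826 * mad
    cleaned = series.mask(is_spike)
    # Fill at most `max_gap` consecutive missing values; the remainder of a
    # longer dropout stays NaN so it can be handled explicitly downstream
    cleaned = cleaned.interpolate(limit=max_gap, limit_area="inside")
    return cleaned, is_spike

rng = np.random.default_rng(0)
temperature = pd.Series(25 + 0.5 * rng.standard_normal(200))
temperature.iloc[50] = 80.0          # spurious spike
temperature.iloc[100:102] = np.nan   # short dropout (2 samples)
temperature.iloc[150:160] = np.nan   # long dropout (10 samples)

cleaned, is_spike = clean_signal(temperature)
print(cleaned.iloc[[50, 100, 155]])
```

Whether a flagged outlier is noise or an early fault symptom is a domain decision; keeping the `is_spike` mask alongside the cleaned signal preserves that information.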

### **78.2.5 Temporal Dependencies**
Faults develop over time. Features must capture trends, cycles, and changes in the signal characteristics (e.g., increasing vibration amplitude).

---

## **78.3 Generating Synthetic Sensor Data**

To illustrate the concepts, we'll generate synthetic data for a rotating machine (e.g., a pump). We'll simulate three sensors:

- **Vibration** (accelerometer): amplitude and frequency content change as bearings wear.
- **Temperature**: rises slowly as friction increases.
- **Current**: motor current draw increases with load or friction.

We'll generate data for a machine that runs continuously and eventually fails after a certain amount of time. The data will include:

- A baseline healthy period.
- A degradation period where a fault develops.
- A failure point.

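We'll use a simple model: fault progress ramps linearly from 0 to 1 over the machine's life, remaining useful life (RUL) decreases linearly, and the sensor signals evolve accordingly. The generator has no explicit flat healthy phase; the early samples, where fault progress is close to zero, play that role.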

```python
import numpy as np
import pandas as pd
from datetime import datetime, timedelta
import matplotlib.pyplot as plt

def generate_machine_data(machine_id=1, duration_hours=1000, sampling_rate_hz=1, seed=42):
    """
    Generate synthetic sensor data for a machine.
    
    Parameters
    ----------
    machine_id : int
        Identifier for the machine.
    duration_hours : float
        Total run time until failure (hours).
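    sampling_rate_hz : int
        Unused in this simplified demo, which always generates one sample
        per minute; kept so the signature matches a real sampling setup.
    seed : int
        Seed for NumPy's global random state, for reproducible noise.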
    
    Returns
    -------
    df : pd.DataFrame
        Columns: timestamp, machine_id, vibration, temperature, current, fault_progress, rul
    """
    np.random.seed(seed)
    
    # Total number of samples (using minutes for simplicity)
    total_minutes = int(duration_hours * 60)
    timestamps = [datetime.now() + timedelta(minutes=i) for i in range(total_minutes)]
    
    # Fault progress: from 0 (healthy) to 1 (failure)
    fault_progress = np.linspace(0, 1, total_minutes)
    
    # Remaining useful life (minutes)
    rul = np.linspace(total_minutes, 0, total_minutes)
    
    # Sensor signals
    vibration = np.zeros(total_minutes)
    temperature = np.zeros(total_minutes)
    current = np.zeros(total_minutes)
    
    for i, progress in enumerate(fault_progress):
        # Vibration: amplitude increases, frequency content shifts (simplified)
        # Add sinusoidal component with increasing amplitude and frequency modulation
        t = i / 60.0  # time in hours
        # Base vibration (healthy)
        base_vib = 0.1 * np.sin(2 * np.pi * 10 * t) + 0.05 * np.random.randn()
        # Fault contribution: amplitude grows quadratically with fault progress
        fault_vib = 0.5 * progress**2 * np.sin(2 * np.pi * (10 + 20*progress) * t)
        vibration[i] = base_vib + fault_vib + 0.02 * np.random.randn()
        
        # Temperature: slowly rises with fault
        temperature[i] = 25 + 10 * progress + 2 * np.random.randn()
        
        # Current: increases with load/friction
        current[i] = 10 + 5 * progress + 1 * np.random.randn()
    
    df = pd.DataFrame({
        'timestamp': timestamps,
        'machine_id': machine_id,
        'vibration': vibration,
        'temperature': temperature,
        'current': current,
        'fault_progress': fault_progress,
        'rul': rul
    })
    
    return df

# Generate data for one machine
df_machine = generate_machine_data(machine_id=1, duration_hours=500)  # one sample per minute
print(df_machine.head())

# Plot the signals
fig, axes = plt.subplots(3, 1, figsize=(12, 8), sharex=True)
axes[0].plot(df_machine['timestamp'], df_machine['vibration'])
axes[0].set_ylabel('Vibration')
axes[1].plot(df_machine['timestamp'], df_machine['temperature'])
axes[1].set_ylabel('Temperature')
axes[2].plot(df_machine['timestamp'], df_machine['current'])
axes[2].set_ylabel('Current')
axes[2].set_xlabel('Time')
plt.show()
```

**Explanation:**

- We simulate a machine running for `duration_hours`. The fault progresses linearly from 0 (healthy) to 1 (failure).
- Vibration is modelled as a sinusoidal signal whose amplitude and frequency increase with fault progress, plus noise.
- Temperature and current also increase with fault progress.
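- The target variable for prediction could be `rul` (remaining useful life) or a binary fault indicator derived from the data (e.g., `fault_progress` above a chosen threshold); the generator does not create such a column itself.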

In reality, you would collect such data from actual machines, possibly with run‑to‑failure experiments or historical maintenance logs.
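
If an explicit healthy baseline followed by a degradation phase is wanted, the linear ramp can be swapped for a piecewise profile. The sketch below is one way to do that; the function name and the `onset_fraction` parameter are hypothetical and not part of the generator above.

```python
import numpy as np

def piecewise_fault_progress(n_samples, onset_fraction=0.6):
    """Fault progress that stays at 0 until onset, then ramps linearly to 1."""
    position = np.linspace(0.0, 1.0, n_samples)   # normalised life, 0..1
    ramp = (position - onset_fraction) / (1.0 - onset_fraction)
    return np.clip(ramp, 0.0, 1.0)

progress = piecewise_fault_progress(1000, onset_fraction=0.6)
print(progress[[0, 500, 800, 999]])
```

The returned array could be used in place of the `fault_progress` ramp, so that the first 60% of the run carries only noise around the healthy baseline.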

---

## **78.4 Data Ingestion for Streaming Sensor Data**

In an IoT system, data arrives continuously. We need a scalable ingestion layer. Common choices:

- **MQTT**: Lightweight pub‑sub protocol, ideal for sensors.
- **Kafka**: Distributed streaming platform, handles high throughput.
- **Amazon Kinesis / Google Pub/Sub**: Cloud‑managed alternatives.

For our example, we'll simulate a Kafka producer that sends sensor readings in real time. We'll then consume them with a stream processing engine (e.g., Apache Flink, Spark Structured Streaming) or simply with a Python script using the `kafka-python` library.

We'll create a producer that reads from our generated DataFrame and publishes messages to a Kafka topic.

```python
# Simulated Kafka producer (requires kafka-python)
# pip install kafka-python

from kafka import KafkaProducer
import json
import time

def kafka_producer_simulation(df, topic='sensor-data', bootstrap_servers='localhost:9092'):
    """
    Simulate streaming data by publishing each row to Kafka with a delay.
    """
    producer = KafkaProducer(
        bootstrap_servers=bootstrap_servers,
        value_serializer=lambda v: json.dumps(v).encode('utf-8')
    )
    
    for _, row in df.iterrows():
        message = {
            'timestamp': row['timestamp'].isoformat(),
            'machine_id': int(row['machine_id']),
            'vibration': float(row['vibration']),
            'temperature': float(row['temperature']),
            'current': float(row['current'])
        }
        producer.send(topic, value=message)
        print(f"Sent: {message}")
        time.sleep(0.01)  # simulate real-time (10 ms between readings = 100 Hz)
    
    producer.flush()
    producer.close()

# In practice, you would run this in a separate process or container.
# For demonstration, we'll skip actual Kafka and process directly from the DataFrame.
```

**Explanation:**

- The producer serializes each row as JSON and sends it to a Kafka topic.
- The delay controls the simulated data rate; in a real system, sensors would push data at their natural frequency.
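
On the consuming side, messages from real devices can be truncated or incomplete, so it is usually worth validating them before they reach feature extraction. The following is a minimal sketch that needs no broker; `parse_sensor_message` and `REQUIRED_FIELDS` are hypothetical names, and the field list simply mirrors the producer's message format above.

```python
import json
import math

REQUIRED_FIELDS = ("timestamp", "machine_id", "vibration", "temperature", "current")

def parse_sensor_message(raw: bytes):
    """Decode a JSON sensor message; return a dict, or None if it is malformed."""
    try:
        msg = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return None
    if not isinstance(msg, dict) or any(f not in msg for f in REQUIRED_FIELDS):
        return None
    try:
        readings = {k: float(msg[k]) for k in ("vibration", "temperature", "current")}
        machine_id = int(msg["machine_id"])
    except (TypeError, ValueError):
        return None
    # json.loads accepts NaN and Infinity, so reject non-finite readings explicitly
    if not all(math.isfinite(v) for v in readings.values()):
        return None
    return {"timestamp": msg["timestamp"], "machine_id": machine_id, **readings}

good = json.dumps({"timestamp": "2024-01-01T00:00:00", "machine_id": 1,
                   "vibration": 0.12, "temperature": 25.3, "current": 10.1}).encode("utf-8")
print(parse_sensor_message(good))
print(parse_sensor_message(good[:20]))   # truncated payload
```

Returning `None` keeps the consumer loop simple; a production system would more likely count and log rejected messages.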

---

## **78.5 Feature Engineering for Sensor Data**

Raw sensor readings are rarely used directly. We need to extract features that capture the machine's health state. Common feature families:

- **Time‑domain statistical features**: mean, variance, skewness, kurtosis, peak‑to‑peak, RMS over windows.
- **Frequency‑domain features**: FFT magnitudes at key frequencies, spectral power in bands.
- **Time‑frequency features**: wavelet coefficients, spectrograms.
- **Domain‑specific features**: e.g., for vibration, bearing fault frequencies.
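
As an example of a domain‑specific feature, the characteristic defect frequencies of a rolling‑element bearing follow from its geometry and the shaft speed. The sketch below uses the standard kinematic formulas for a bearing with a stationary outer race and rotating inner race; the function name and the bearing dimensions are illustrative assumptions.

```python
import math

def bearing_fault_frequencies(shaft_hz, n_balls, ball_diameter, pitch_diameter,
                              contact_angle_deg=0.0):
    """Characteristic defect frequencies (Hz), assuming a fixed outer race."""
    ratio = (ball_diameter / pitch_diameter) * math.cos(math.radians(contact_angle_deg))
    return {
        "bpfo": 0.5 * n_balls * shaft_hz * (1 - ratio),   # outer-race defect
        "bpfi": 0.5 * n_balls * shaft_hz * (1 + ratio),   # inner-race defect
        "ftf": 0.5 * shaft_hz * (1 - ratio),              # cage (fundamental train)
        "bsf": 0.5 * (pitch_diameter / ball_diameter) * shaft_hz * (1 - ratio ** 2),
    }

# Illustrative geometry: 9 balls, 7.94 mm ball diameter, 39.04 mm pitch diameter
freqs = bearing_fault_frequencies(shaft_hz=30.0, n_balls=9,
                                  ball_diameter=7.94, pitch_diameter=39.04)
for name, value in freqs.items():
    print(f"{name}: {value:.1f} Hz")
```

Spectral energy in narrow bands around these frequencies (and their harmonics) can then be added as features, provided the sampling rate is high enough to resolve them.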

We'll implement a streaming feature extractor that computes rolling window statistics and spectral features.

```python
import numpy as np
from scipy.fft import fft, fftfreq
from scipy.stats import skew, kurtosis

class StreamingFeatureExtractor:
    """
    Extracts features from streaming sensor data using sliding windows.
    """
    
    def __init__(self, window_size=60, step_size=1, sampling_rate=1.0):
        """
        window_size: number of samples in each window.
        step_size: number of samples to slide each step.
        sampling_rate: Hz, for frequency calculations.
        """
        self.window_size = window_size
        self.step_size = step_size
        self.sampling_rate = sampling_rate
        self.buffer = {}  # machine_id -> list of samples, created on first use
        self.feature_names = []
    
    def _time_features(self, data):
        """Compute time-domain statistical features."""
        # Windows arrive as Python lists; convert so that data**2 works
        data = np.asarray(data, dtype=float)
        features = {
            'mean': np.mean(data),
            'std': np.std(data),
            'min': np.min(data),
            'max': np.max(data),
            'range': np.max(data) - np.min(data),
            'rms': np.sqrt(np.mean(data**2)),
            'skew': skew(data) if len(data)>2 else 0,
            'kurtosis': kurtosis(data) if len(data)>2 else 0
        }
        return features
    
    def _freq_features(self, data):
        """Compute frequency-domain features from the one-sided FFT."""
        data = np.asarray(data, dtype=float)
        n = len(data)
        if n < 4:
            return {}
        fft_vals = fft(data)
        fft_abs = np.abs(fft_vals[:n//2])
        freqs = fftfreq(n, d=1/self.sampling_rate)[:n//2]
        
        # Dominant frequency (skip the DC bin at index 0)
        dominant_freq = freqs[np.argmax(fft_abs[1:]) + 1]
        # Spectral centroid (magnitude-weighted mean frequency)
        total = np.sum(fft_abs)
        centroid = np.sum(freqs * fft_abs) / total if total > 0 else 0
        # Power in four equal-width bands between 0 and the Nyquist frequency.
        # Fixed edges in Hz (e.g., 0-10, 10-20) would be empty above
        # sampling_rate / 2, so the edges are defined relative to Nyquist.
        edges = np.linspace(0, self.sampling_rate / 2, 5)
        band_powers = {}
        for b, (low, high) in enumerate(zip(edges[:-1], edges[1:])):
            mask = (freqs >= low) & (freqs < high)
            # Power is the sum of squared magnitudes in the band
            band_powers[f'power_band_{b}'] = np.sum(fft_abs[mask]**2)
        
        return {
            'dominant_freq': dominant_freq,
            'spectral_centroid': centroid,
            **band_powers
        }
    
    def process_sample(self, machine_id, timestamp, vibration, temperature, current):
        """
        Add a new sample to the buffer. If a window is completed, compute and return features.
        """
        if machine_id not in self.buffer:
            self.buffer[machine_id] = []
        
        self.buffer[machine_id].append({
            'timestamp': timestamp,
            'vibration': vibration,
            'temperature': temperature,
            'current': current
        })
        
        # If we have enough samples, process the oldest window
        if len(self.buffer[machine_id]) >= self.window_size:
            # Take the first window_size samples
            window = self.buffer[machine_id][:self.window_size]
            # Remove those samples from buffer (sliding window)
            self.buffer[machine_id] = self.buffer[machine_id][self.step_size:]
            
            # Extract features
            vib_data = [w['vibration'] for w in window]
            temp_data = [w['temperature'] for w in window]
            curr_data = [w['current'] for w in window]
            
            features = {
                'timestamp': window[-1]['timestamp'],  # timestamp of last sample in window
                'machine_id': machine_id
            }
            
            for sensor, data in [('vib', vib_data), ('temp', temp_data), ('curr', curr_data)]:
                time_feat = self._time_features(data)
                for k, v in time_feat.items():
                    features[f'{sensor}_{k}'] = v
                # Frequency features only for vibration (can also do for others if relevant)
                if sensor == 'vib':
                    freq_feat = self._freq_features(data)
                    for k, v in freq_feat.items():
                        features[f'{sensor}_{k}'] = v
            
            return features
        else:
            return None
```

**Explanation:**

- The `StreamingFeatureExtractor` maintains a buffer per machine. When enough samples are accumulated, it computes a feature vector for the oldest window and then slides.
- Time‑domain features include basic statistics and higher‑order moments.
- Frequency‑domain features use FFT to extract dominant frequency, spectral centroid, and band powers.
- The window size and step size control the trade‑off between latency and feature richness. For predictive maintenance, a window of a few minutes to hours is common.
- This extractor can be used in a streaming pipeline: each new sample is passed to `process_sample`, and if a feature vector is returned, it is sent to the model for inference.
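
To make the frequency‑domain features concrete, the sketch below applies the same ideas to a signal whose content is known in advance: a strong 5 Hz tone plus a weaker 20 Hz tone. The sampling rate and band edges are illustrative assumptions, and `rfft` is used here as a convenient one‑sided variant of the FFT.

```python
import numpy as np
from scipy.fft import rfft, rfftfreq

fs = 100.0                        # assumed sampling rate in Hz
t = np.arange(200) / fs           # 2 seconds of data, 0.5 Hz frequency resolution
signal = 1.0 * np.sin(2 * np.pi * 5 * t) + 0.3 * np.sin(2 * np.pi * 20 * t)

spectrum = np.abs(rfft(signal))
freqs = rfftfreq(len(signal), d=1 / fs)

dominant = freqs[np.argmax(spectrum[1:]) + 1]            # skip the DC bin
centroid = np.sum(freqs * spectrum) / np.sum(spectrum)   # magnitude-weighted mean
low_band_power = np.sum(spectrum[(freqs >= 0) & (freqs < 10)] ** 2)
high_band_power = np.sum(spectrum[(freqs >= 10) & (freqs < 50)] ** 2)

print(f"dominant: {dominant:.1f} Hz, centroid: {centroid:.2f} Hz")
print(f"band power 0-10 Hz: {low_band_power:.0f}, 10-50 Hz: {high_band_power:.0f}")
```

The dominant frequency picks out the stronger tone, while the centroid and the band powers shift as energy moves toward the higher tone, which is the kind of change a developing fault would be expected to produce.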

---

## **78.6 Modeling Approaches for Predictive Maintenance**

We can formulate predictive maintenance as:

- **Regression**: predict remaining useful life (RUL) – a continuous value.
- **Classification**: predict whether a fault will occur within a certain time window (e.g., next 24 hours).
- **Anomaly detection**: flag unusual behaviour that may indicate a developing fault.

We'll demonstrate both regression (RUL prediction) and anomaly detection.

### **78.6.1 Regression for Remaining Useful Life**

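Using the engineered features, we can train a model to predict RUL. Because each run is a single trajectory in which RUL falls steadily with time, a random split would place near‑identical neighbouring windows in both the training and test sets. We must therefore avoid data leakage and use only information that would be available at the time of prediction.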

We'll use a random forest regressor.

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error

# Assume we have a DataFrame `features_df` with columns: timestamp, machine_id, features..., rul
# We need to split by time to avoid leakage
def prepare_rul_data(features_df, test_size=0.2):
    features_df = features_df.sort_values('timestamp')
    split_idx = int(len(features_df) * (1 - test_size))
    train = features_df.iloc[:split_idx]
    test = features_df.iloc[split_idx:]
    
    # Exclude identifiers and anything derived from the target
    # (fault_progress is a deterministic function of rul in our simulation)
    non_features = ['timestamp', 'machine_id', 'rul', 'fault_progress']
    feature_cols = [c for c in features_df.columns if c not in non_features]
    X_train = train[feature_cols]
    y_train = train['rul']
    X_test = test[feature_cols]
    y_test = test['rul']
    
    return X_train, y_train, X_test, y_test, feature_cols

# Example usage (assuming we already have features_df)
# X_train, y_train, X_test, y_test, feat_cols = prepare_rul_data(features_df)
# model = RandomForestRegressor(n_estimators=100, random_state=42)
# model.fit(X_train, y_train)
# y_pred = model.predict(X_test)
# mae = mean_absolute_error(y_test, y_pred)
# print(f"RUL MAE: {mae:.2f} minutes")
```

**Explanation:**

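- We split the data chronologically to avoid using future information. With a single run‑to‑failure trajectory this leaves the lowest RUL values only in the test set, and tree‑based models cannot predict below the RUL range they were trained on; with a fleet, holding out entire machines is usually a more informative evaluation.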
- The target `rul` is the remaining life in minutes at the time of the feature window.
- The model can then be used in real time: for each new feature window, predict RUL.
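
When data from several machines is available, an alternative to a purely chronological split is to hold out whole machines. The sketch below shows this with scikit‑learn's `GroupShuffleSplit` on a toy feature table; the single feature `vib_rms` and the six simulated machines are assumptions made only to keep the example self‑contained.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import GroupShuffleSplit

# Toy feature table: 6 machines, one feature that grows as RUL shrinks
rng = np.random.default_rng(0)
frames = []
for machine_id in range(1, 7):
    rul = np.arange(200, 0, -1)
    frames.append(pd.DataFrame({
        "machine_id": machine_id,
        "vib_rms": 1.0 - rul / 200 + 0.05 * rng.standard_normal(200),
        "rul": rul,
    }))
toy_features = pd.concat(frames, ignore_index=True)

# Hold out two whole machines, so train and test both span the full RUL range
splitter = GroupShuffleSplit(n_splits=1, test_size=2, random_state=0)
train_idx, test_idx = next(splitter.split(toy_features, groups=toy_features["machine_id"]))
train, test = toy_features.iloc[train_idx], toy_features.iloc[test_idx]

model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(train[["vib_rms"]], train["rul"])
mae = mean_absolute_error(test["rul"], model.predict(test[["vib_rms"]]))
print(f"Held-out machines: {sorted(test['machine_id'].unique())}, MAE: {mae:.1f}")
```

Because no window from a held‑out machine is seen during training, the resulting error is closer to what could be expected on a pump the model has never observed.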

### **78.6.2 Anomaly Detection**

Alternatively, we can detect when the machine starts to deviate from normal behaviour. This can be done with:

- **Statistical methods**: control charts, moving thresholds.
- **Machine learning**: one‑class SVM, isolation forest, autoencoders.
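
The statistical option can be as simple as a control chart. The sketch below sets 3‑sigma limits from a period assumed to be healthy and raises an alarm when a rolling mean leaves them; the synthetic temperature series, the 200‑sample baseline, and the window length are illustrative assumptions.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
# 300 healthy samples followed by 100 samples with a slow upward drift
temperature = pd.Series(np.concatenate([
    25 + rng.standard_normal(300),
    25 + np.linspace(0, 8, 100) + rng.standard_normal(100),
]))

baseline = temperature.iloc[:200]            # assumed known-healthy period
window = 10
smoothed = temperature.rolling(window).mean()

# Control limits for the mean of `window` samples: mu +/- 3 * sigma / sqrt(window)
mu, sigma = baseline.mean(), baseline.std()
upper = mu + 3 * sigma / np.sqrt(window)
lower = mu - 3 * sigma / np.sqrt(window)
alarms = (smoothed > upper) | (smoothed < lower)

print(f"limits: [{lower:.2f}, {upper:.2f}], alarms raised: {int(alarms.sum())}")
```

Smoothing narrows the limits and makes slow drifts visible sooner, at the cost of a delay of up to one window before a sudden change is flagged.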

We'll implement a simple autoencoder for anomaly detection. An autoencoder is trained on normal (healthy) data only; when reconstruction error exceeds a threshold, an anomaly is flagged.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

def build_autoencoder(input_dim, encoding_dim=16):
    model = keras.Sequential([
        keras.Input(shape=(input_dim,)),
        layers.Dense(64, activation='relu'),
        layers.Dense(encoding_dim, activation='relu'),
        layers.Dense(64, activation='relu'),
        layers.Dense(input_dim, activation='linear')
    ])
    model.compile(optimizer='adam', loss='mse')
    return model

# Train on healthy data only. Features should be standardised first
# (e.g., with a scaler fitted on the healthy data), since they have very different ranges.
# healthy_data = features_df[features_df['rul'] > some_threshold]  # e.g., RUL > 100
# X_healthy = healthy_data[feat_cols].to_numpy()
# autoencoder = build_autoencoder(len(feat_cols))
# autoencoder.fit(X_healthy, X_healthy, epochs=50, batch_size=32, validation_split=0.1, verbose=0)

# Determine the threshold from errors on healthy data (e.g., 95th percentile)
# train_errors = np.mean(np.square(X_healthy - autoencoder.predict(X_healthy)), axis=1)
# threshold = np.percentile(train_errors, 95)

# For each new sample, compute the reconstruction error and compare with the threshold
# reconstructions = autoencoder.predict(X_test)
# mse = np.mean(np.square(X_test - reconstructions), axis=1)
# is_anomaly = mse > threshold

**Explanation:**

- Autoencoders learn to compress and reconstruct normal patterns.
- When a fault develops, the reconstruction error increases because the model hasn't seen such patterns.
- The threshold can be set based on the distribution of errors on healthy validation data.
- Anomaly detection is useful when you don't have labelled failure data (common in real life).
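
The thresholding step can be tried out without TensorFlow. In the sketch below, PCA stands in for the autoencoder as a simple linear reconstruction model (a deliberate substitution, not the chapter's network), and the synthetic healthy and faulty data are assumptions chosen so that the fault breaks the correlations seen in healthy operation.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
mixing = rng.standard_normal((2, 8))   # 8 features driven by 2 latent factors

def sample_healthy(n):
    latent = rng.standard_normal((n, 2))
    return latent @ mixing + 0.1 * rng.standard_normal((n, 8))

X_train, X_val = sample_healthy(500), sample_healthy(200)
# Faulty data: offsets on four features that break the healthy correlation pattern
X_fault = sample_healthy(200) + np.array([3.0, -3.0, 3.0, -3.0, 0.0, 0.0, 0.0, 0.0])

scaler = StandardScaler().fit(X_train)
pca = PCA(n_components=2).fit(scaler.transform(X_train))

def reconstruction_error(X):
    Z = scaler.transform(X)
    Z_hat = pca.inverse_transform(pca.transform(Z))
    return np.mean((Z - Z_hat) ** 2, axis=1)

# Threshold: 95th percentile of errors on healthy validation data
threshold = np.percentile(reconstruction_error(X_val), 95)
fault_rate = np.mean(reconstruction_error(X_fault) > threshold)
print(f"threshold: {threshold:.4f}, faulty samples flagged: {fault_rate:.0%}")
```

By construction about 5% of healthy validation samples exceed the threshold, so the percentile directly sets the expected false‑alarm rate; the same recipe applies unchanged to the autoencoder's reconstruction errors.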

---

## **78.7 Real‑Time Processing with Stream Processing Frameworks**

For production IoT systems, you need a stream processing engine that can handle high throughput, stateful operations (like our sliding windows), and integration with machine learning models. Popular choices:

- **Apache Flink**: Provides exactly‑once state consistency and event‑time processing; its machine learning library, Flink ML, is maintained as a separate project.
- **Apache Spark Structured Streaming**: Micro‑batch processing by default, integrates with MLlib.
- **Kafka Streams**: Lightweight library that runs inside your application.

We'll illustrate a conceptual pipeline using Kafka and a Python consumer that uses our `StreamingFeatureExtractor` and then calls the model.

```python
# Conceptual streaming consumer (using kafka-python)
from kafka import KafkaConsumer
import json
import joblib
import numpy as np

# Load pre‑trained model, the feature column order used in training, and the extractor
model = joblib.load("rul_model.pkl")
feature_cols = joblib.load("feature_cols.pkl")
feature_extractor = StreamingFeatureExtractor(window_size=60, step_size=1, sampling_rate=1.0)

consumer = KafkaConsumer(
    'sensor-data',
    bootstrap_servers='localhost:9092',
    value_deserializer=lambda m: json.loads(m.decode('utf-8'))
)

for message in consumer:
    data = message.value
    machine_id = data['machine_id']
    timestamp = data['timestamp']
    vib = data['vibration']
    temp = data['temperature']
    curr = data['current']
    
    # Extract features (may return None if window not complete)
    features = feature_extractor.process_sample(machine_id, timestamp, vib, temp, curr)
    
    if features is not None:
        # Prepare feature vector for model
        # Assume feature extractor returns a dict with all features
        X = np.array([[features[f] for f in feature_cols]])  # feature_cols must be defined
        rul_pred = model.predict(X)[0]
        print(f"Machine {machine_id} at {timestamp}: predicted RUL = {rul_pred:.1f} minutes")
        
        # Optionally trigger alert if RUL < threshold
        if rul_pred < 60:  # less than 1 hour
            # Send alert to alert manager (Chapter 73)
            pass
```

**Explanation:**

- The consumer listens to the sensor topic, processes each sample through the feature extractor, and when a window is complete, it runs inference.
- The predicted RUL can be used to trigger maintenance alerts.
- This architecture scales by partitioning the Kafka topic by `machine_id` and running multiple consumer instances.
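
Partitioning by `machine_id` works because a stable hash of the key always maps a machine to the same partition, so one consumer sees all of that machine's samples in order and can keep its window buffer locally. The sketch below illustrates the idea with a CRC32 hash; this is an illustrative stand‑in, since Kafka's own default partitioner uses a different hash function, and `partition_for` is a hypothetical helper.

```python
import zlib

def partition_for(machine_id: int, num_partitions: int) -> int:
    """Map a machine to a partition with a stable hash (illustrative only)."""
    key = str(machine_id).encode("utf-8")
    return zlib.crc32(key) % num_partitions

assignments = {m: partition_for(m, num_partitions=4) for m in range(1, 11)}
print(assignments)
```

With `kafka-python`, the same effect is obtained by passing a key when publishing, for example `producer.send(topic, key=str(machine_id).encode('utf-8'), value=message)`, and letting the client choose the partition.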

---

## **78.8 Edge Deployment**

Many IoT applications require low‑latency inference at the edge (on the device itself) to avoid sending all data to the cloud. Edge deployment involves:

- **Model compression**: quantization, pruning, knowledge distillation.
- **Lightweight runtimes**: TensorFlow Lite, ONNX Runtime, or specialised hardware (e.g., ARM CMSIS‑NN).
- **Local storage and buffering**: in case of network outages.

We'll demonstrate how to convert a trained Keras model (such as the autoencoder from Section 78.6.2) to TensorFlow Lite and run it on a simulated edge device.

```python
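# Convert Keras model to TFLite
def convert_to_tflite(model, representative_dataset=None, use_float16=False):
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    # With no further settings, DEFAULT applies dynamic-range weight quantization
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    if use_float16:
        # Store weights as float16 (no calibration data needed)
        converter.target_spec.supported_types = [tf.float16]
    elif representative_dataset is not None:
        # Calibration samples let the converter quantize activations
        # to integers as well as weights
        converter.representative_dataset = representative_dataset
    tflite_model = converter.convert()
    return tflite_model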

# Save the model
# with open('model.tflite', 'wb') as f:
#     f.write(tflite_model)

# Inference on edge (using Python, but could be C++ on device)
import numpy as np
import tensorflow as tf

# Load TFLite model
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

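# Prepare input data: a placeholder feature vector with the model's input width
# (in practice, use a real feature vector scaled the same way as in training)
feature_vector = np.zeros(input_details[0]['shape'][1], dtype=np.float32)
input_data = np.array([feature_vector], dtype=np.float32)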
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()
output_data = interpreter.get_tensor(output_details[0]['index'])
print("Prediction:", output_data)
```

**Explanation:**

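- The Keras model (here, the autoencoder) is converted to TensorFlow Lite, which is optimized for mobile and edge devices. The scikit‑learn random forest from Section 78.6.1 cannot go through this converter and would need a different export path.
- Quantization reduces model size and typically speeds up inference. The accuracy loss is often small, but it should be measured on held‑out data rather than assumed.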
- On the edge, you would embed the TFLite runtime in your device firmware or application.
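
To see where the size reduction and the accuracy loss come from, the sketch below applies affine int8 quantization to a random weight array with NumPy. It uses the mapping real = scale × (q − zero_point), the same form used in TensorFlow Lite's quantization specification, but it is a simplified illustration and not the converter's actual procedure; the function names are hypothetical.

```python
import numpy as np

def quantize_int8(x):
    """Affine (asymmetric) quantization of a float array to int8."""
    x_min = min(float(x.min()), 0.0)   # the representable range must contain 0
    x_max = max(float(x.max()), 0.0)
    scale = (x_max - x_min) / 255.0 or 1.0   # avoid a zero scale for all-zero input
    zero_point = int(round(-128 - x_min / scale))
    q = np.clip(np.round(x / scale) + zero_point, -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Map int8 codes back to approximate float values."""
    return scale * (q.astype(np.float32) - zero_point)

rng = np.random.default_rng(0)
weights = rng.standard_normal(1000).astype(np.float32)

q, scale, zero_point = quantize_int8(weights)
max_error = np.max(np.abs(weights - dequantize(q, scale, zero_point)))

print(f"size: {weights.nbytes} -> {q.nbytes} bytes")
print(f"scale: {scale:.5f}, max abs error: {max_error:.5f}")
```

Each weight shrinks from four bytes to one, and the rounding error per weight is bounded by roughly half a quantization step, which is why the impact on predictions is often small but not guaranteed to be negligible.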

---

## **78.9 Integrating with Monitoring and Alerting (Chapter 73)**

Our IoT analytics system can trigger alerts when anomalies are detected or when RUL falls below a threshold. We can reuse the `AlertManager` from Chapter 73.

```python
from alerting import AlertManager, AlertRule, SlackChannel

# Initialize alert manager
alert_manager = AlertManager()
alert_manager.register_channel('slack', SlackChannel(webhook_url='https://hooks.slack.com/...'))

# Define an alert rule for low RUL
def low_rul_condition(row):
    return row.get('rul_predicted', 1000) < 60  # less than 60 minutes

low_rul_rule = AlertRule(
    name="Low RUL Warning",
    condition=low_rul_condition,
    severity="P1",
    channels=["slack"],
    cooldown_minutes=30,
    description="Machine predicted to fail within 1 hour."
)
alert_manager.add_rule(low_rul_rule)

# In the streaming consumer, after each prediction:
# if features is not None:
#     row = pd.Series({**features, 'rul_predicted': rul_pred})
#     alert_manager.process_row(row)
```

**Explanation:**

- The alert manager checks each prediction against rules and sends notifications.
- Cooldown prevents flooding.
- Integration with Slack or email ensures that maintenance teams are notified immediately.
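
The cooldown behaviour can be illustrated independently of the Chapter 73 code. The class below is a hypothetical stand‑alone sketch, not the `AlertManager` implementation: it remembers when each rule last fired for each machine and suppresses repeats inside the cooldown period.

```python
from datetime import datetime, timedelta

class CooldownGate:
    """Allow at most one notification per (rule, machine) within a cooldown period."""

    def __init__(self, cooldown_minutes=30):
        self.cooldown = timedelta(minutes=cooldown_minutes)
        self.last_sent = {}   # (rule_name, machine_id) -> time of last notification

    def should_send(self, rule_name, machine_id, now):
        key = (rule_name, machine_id)
        last = self.last_sent.get(key)
        if last is not None and now - last < self.cooldown:
            return False      # still cooling down: suppress the repeat
        self.last_sent[key] = now
        return True

gate = CooldownGate(cooldown_minutes=30)
t0 = datetime(2024, 1, 1, 12, 0)
print(gate.should_send("Low RUL Warning", 1, t0))                           # first alert
print(gate.should_send("Low RUL Warning", 1, t0 + timedelta(minutes=10)))   # within cooldown
print(gate.should_send("Low RUL Warning", 2, t0 + timedelta(minutes=10)))   # different machine
```

Keying the cooldown on the machine as well as the rule matters for a fleet: one pump's repeated warnings should not silence the first warning from another pump.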

---

## **78.10 Case Study: Predictive Maintenance for a Fleet of Pumps**

Let's combine all components into a complete case study.

**Scenario**: A water treatment plant has 10 pumps. Each pump is equipped with vibration, temperature, and current sensors. The scenario assumes 1 Hz sampling; our simulator emits one reading per minute to keep the data small. We want to predict RUL and detect anomalies to schedule maintenance.

**Steps**:

1. **Data Generation**: Simulate 10 machines with different degradation rates.
2. **Offline Training**: Use historical run‑to‑failure data to train a RUL regression model.
3. **Streaming Pipeline**: Deploy Kafka, a consumer per machine (or partition), feature extraction, model inference.
4. **Alerting**: Integrate with Slack to notify when RUL < 24 hours.
5. **Dashboard**: Visualise current RUL and anomaly scores (using Grafana or a custom dashboard).

We'll simulate the offline training with data from multiple machines.

```python
# Generate data for multiple machines
def generate_fleet_data(num_machines=10, duration_hours=1000):
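    dfs = []
    rng = np.random.default_rng(0)
    for mid in range(1, num_machines+1):
        # Vary the lifespan per machine and give each machine its own seed;
        # with the default seed, every machine would share identical noise
        hours = duration_hours * rng.uniform(0.8, 1.2)
        df = generate_machine_data(machine_id=mid, duration_hours=hours, seed=mid)
        dfs.append(df)
    return pd.concat(dfs, ignore_index=True)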

fleet_df = generate_fleet_data(num_machines=10, duration_hours=800)
# Save to parquet for later use
fleet_df.to_parquet("fleet_data.parquet")
```

**Explanation:**

- Each machine has its own lifespan, but the degradation pattern is similar.
- We can train a global model across all machines, which may generalise better.

Now we run the streaming pipeline (simulated here with a loop through the data).

```python
import joblib
import numpy as np

# Load pre‑trained model and feature extractor
# (Assume we have already trained and saved them)
model = joblib.load("rul_model.pkl")
feature_cols = joblib.load("feature_cols.pkl")
extractor = StreamingFeatureExtractor(window_size=60, step_size=1, sampling_rate=1.0)

# Simulate streaming by iterating through the fleet data in time order
fleet_df = fleet_df.sort_values('timestamp')
for _, row in fleet_df.iterrows():
    features = extractor.process_sample(
        row['machine_id'],
        row['timestamp'],
        row['vibration'],
        row['temperature'],
        row['current']
    )
    if features:
        # Prepare feature vector
        X = np.array([[features[col] for col in feature_cols]])
        rul_pred = model.predict(X)[0]
        print(f"Machine {row['machine_id']} at {row['timestamp']}: RUL={rul_pred:.1f} min")
        # Alert when predicted RUL drops below 24 hours (RUL is in minutes)
        if rul_pred < 24 * 60:
            # Send alert (using alert manager)
            pass
```

**Explanation:**

- This simulates real‑time processing. In production, the loop would be replaced by a Kafka consumer.
- The feature extractor maintains per‑machine state internally.

---

## **78.11 Lessons Learned from IoT Analytics**

1. **Data quality is paramount**: Sensor drift, missing data, and outliers must be handled early.
2. **Feature engineering is domain‑specific**: Understanding the physics of the machine helps design meaningful features (e.g., bearing fault frequencies).
3. **Stream processing requires careful state management**: Our sliding window buffer is stateful; in distributed systems, use state stores (e.g., Flink's keyed state).
4. **Model degradation over time**: Machines change, and models may need periodic retraining. Monitor prediction errors and trigger retraining when drift is detected.
5. **Edge vs. cloud trade‑offs**: Edge reduces latency and bandwidth but limits model complexity. Choose based on requirements.
6. **Alerting must be actionable**: Too many false alarms lead to alert fatigue; tune thresholds carefully.

---

## **78.12 Future Improvements**

- **Incorporate more sensors**: e.g., acoustic emissions, oil debris monitoring.
- **Use deep learning for end‑to‑end feature learning**: Convolutional or recurrent networks on raw time series.
- **Multi‑machine correlation**: Learn from the fleet to improve individual predictions.
- **Remaining useful life with uncertainty**: Provide prediction intervals using quantile regression or Bayesian methods.
- **Automated retraining pipeline**: When new failure data arrives, retrain and deploy models automatically.
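
As a starting point for the uncertainty item above, prediction intervals can be obtained by fitting one model per quantile. The sketch below does this with scikit‑learn's gradient boosting and its quantile loss; the single `health` indicator and the noise level are assumptions made to keep the example small.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
# Toy data: one health indicator, RUL (minutes) with noise the interval should cover
health = rng.uniform(0, 1, 1000)
rul = 500 * (1 - health) + 30 * rng.standard_normal(1000)
X = health.reshape(-1, 1)

# One model per quantile: the 10th and 90th percentiles bound an 80% interval
models = {
    q: GradientBoostingRegressor(loss="quantile", alpha=q, n_estimators=100,
                                 random_state=0).fit(X, rul)
    for q in (0.1, 0.9)
}
lower, upper = models[0.1].predict(X), models[0.9].predict(X)

# Coverage measured on the training data is optimistic; use held-out data in practice
coverage = np.mean((rul >= lower) & (rul <= upper))
print(f"empirical coverage of the nominal 80% interval: {coverage:.0%}")
```

For maintenance planning, the lower bound is often the more useful number, since scheduling against a pessimistic RUL estimate reduces the risk of an unplanned failure.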

---

## **Chapter Summary**

In this chapter, we built a complete IoT analytics system for predictive maintenance. We generated synthetic sensor data, designed a streaming feature extractor, trained models for RUL regression and anomaly detection, and deployed them in a real‑time pipeline. We integrated alerting from Chapter 73 and discussed edge deployment. The system demonstrates how time‑series prediction extends to high‑velocity sensor data, with unique challenges in stateful stream processing and real‑time inference.

This chapter concludes our series of domain‑specific adaptations. The next chapter, **Energy Demand Forecasting**, will apply similar principles to the energy sector.

---

**End of Chapter 78**

