# End-to-End IIoT Predictive Maintenance for Oil Pumps on AWS

This case study demonstrates a complete workflow for predictive maintenance of oil pumps using AWS IoT and machine learning services. We'll collect sensor data, detect anomalies, and predict potential equipment failures.

## Prerequisites

- An AWS account with appropriate permissions
- AWS CLI configured with your credentials
- Python 3.7 or later
- Required Python packages: boto3, pandas, numpy, matplotlib, scikit-learn

Install required packages:

In [None]:
%%bash
pip install boto3 pandas numpy matplotlib scikit-learn

## Step 1: IoT Device Simulation and Data Ingestion

First, we'll simulate IoT devices sending sensor data from oil pumps. In a real-world scenario, you would replace this with actual IoT devices.

In [None]:
import boto3
import json
import time
import random
from datetime import datetime

# Create an IoT client
iot = boto3.client('iot-data')

# Define the IoT topic
topic = 'oil_pump/sensors'

def generate_sensor_data(pump_id):
    return {
        'pump_id': pump_id,
        'timestamp': datetime.now().isoformat(),
        'temperature': random.uniform(50, 100),
        'pressure': random.uniform(100, 500),
        'flow_rate': random.uniform(200, 1000),
        'vibration': random.uniform(0.1, 5.0)
    }

# Simulate data from 5 pumps for 1 minute
for _ in range(60):
    for pump_id in range(1, 6):
        sensor_data = generate_sensor_data(pump_id)
        
        # Publish to IoT Core
        iot.publish(
            topic=topic,
            qos=1,
            payload=json.dumps(sensor_data)
        )
    
    time.sleep(1)  # Wait for 1 second

print("Data simulation completed.")

# In a real-world scenario, replace this simulation with actual IoT device code

## Step 2: Data Processing with AWS IoT Rules and AWS Lambda

We'll use AWS IoT Rules to route the incoming data to a Lambda function for processing and storage.

First, create an AWS IoT Rule:

1. Go to the AWS IoT Core console
2. Navigate to "Message routing" > "Rules" (shown as "Act" > "Rules" in older versions of the console) and click "Create rule"
3. Set the rule name (e.g., "oil_pump_data_processing")
4. Set the rule query statement:
   ```sql
   SELECT * FROM 'oil_pump/sensors'
   ```
5. Add an action to invoke a Lambda function (we'll create this function next)
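
The console steps above can also be scripted. The following is a minimal sketch built around boto3's `create_topic_rule` call. The helper names `build_rule_payload` and `create_rule` are illustrative and not part of the original walkthrough, and the client is passed in so the payload can be inspected without AWS access.

```python
def build_rule_payload(lambda_arn, topic="oil_pump/sensors"):
    """Build the topicRulePayload for a rule that forwards every message
    published on `topic` to the given Lambda function."""
    return {
        "sql": f"SELECT * FROM '{topic}'",
        "description": "Route oil pump sensor data to Lambda",
        "actions": [{"lambda": {"functionArn": lambda_arn}}],
        "ruleDisabled": False,
    }


def create_rule(iot_client, rule_name, lambda_arn):
    """Create the rule using a boto3 'iot' client (not the 'iot-data' client)."""
    iot_client.create_topic_rule(
        ruleName=rule_name,
        topicRulePayload=build_rule_payload(lambda_arn),
    )
```

A call would look like `create_rule(boto3.client('iot'), 'oil_pump_data_processing', lambda_arn)`, where `lambda_arn` is the ARN of your own function. When a rule is created through the API rather than the console, the Lambda function also needs a resource policy that lets `iot.amazonaws.com` invoke it, which can be granted with the Lambda `add_permission` API.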

Now, let's create the Lambda function for data processing:

In [None]:
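import json
import boto3
import os
from datetime import datetime
from decimal import Decimal

# Initialize the DynamoDB table handle once per execution environment
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table(os.environ['DYNAMODB_TABLE'])

def lambda_handler(event, context):
    # The IoT rule delivers the message as an already-parsed dict.
    # Round-trip it through JSON so floats become Decimal: the boto3
    # DynamoDB resource rejects Python floats with a TypeError.
    iot_data = json.loads(json.dumps(event), parse_float=Decimal)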
    
    # Add a timestamp for when the data was processed
    iot_data['processed_timestamp'] = datetime.now().isoformat()
    
    # Store the data in DynamoDB
    table.put_item(Item=iot_data)
    
    return {
        'statusCode': 200,
        'body': json.dumps('Data processed and stored successfully!')
    }

Deploy this Lambda function with the `DYNAMODB_TABLE` environment variable set to the table name, and give its execution role permission to write to DynamoDB (`dynamodb:PutItem`). Also, create a DynamoDB table named "oil_pump_sensor_data" with "pump_id" (Number) as the partition key and "timestamp" (String) as the sort key.
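
The table can be created from code as well. This is a minimal sketch, assuming `pump_id` is stored as a Number (the simulator emits integers) and `timestamp` as an ISO 8601 string. On-demand billing is an assumption made here to avoid choosing capacity up front, and the helper names are illustrative.

```python
def build_table_definition(table_name="oil_pump_sensor_data"):
    """Keyword arguments for DynamoDB create_table:
    pump_id (Number) partition key, timestamp (String) sort key."""
    return {
        "TableName": table_name,
        "KeySchema": [
            {"AttributeName": "pump_id", "KeyType": "HASH"},
            {"AttributeName": "timestamp", "KeyType": "RANGE"},
        ],
        "AttributeDefinitions": [
            {"AttributeName": "pump_id", "AttributeType": "N"},
            {"AttributeName": "timestamp", "AttributeType": "S"},
        ],
        # Assumption: on-demand billing instead of provisioned capacity
        "BillingMode": "PAY_PER_REQUEST",
    }


def create_table(dynamodb_client, table_name="oil_pump_sensor_data"):
    """Create the table with a boto3 'dynamodb' client and wait until it exists."""
    dynamodb_client.create_table(**build_table_definition(table_name))
    dynamodb_client.get_waiter("table_exists").wait(TableName=table_name)
```

A call would look like `create_table(boto3.client('dynamodb'))`. Only the key attributes are declared; the sensor readings need no schema because DynamoDB stores non-key attributes as they arrive.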

## Step 3: Data Analysis and Anomaly Detection

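We'll train an anomaly detection model on the collected sensor data with scikit-learn. The code below can run in any Python environment with access to the table, for example an Amazon SageMaker notebook.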

First, let's retrieve the data from DynamoDB and prepare it for modeling:

In [None]:
import boto3
import pandas as pd
from datetime import datetime, timedelta
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('oil_pump_sensor_data')

# Retrieve the last 24 hours of data for pump 1.
# Note: a single query call returns at most 1 MB of items.
since = (datetime.now() - timedelta(days=1)).isoformat()
response = table.query(
    KeyConditionExpression=Key('pump_id').eq(1) & Key('timestamp').gte(since)
)

# Convert to DataFrame
df = pd.DataFrame(response['Items'])
df['timestamp'] = pd.to_datetime(df['timestamp'])
df = df.set_index('timestamp')

# Select numerical columns for anomaly detection.
# DynamoDB returns numbers as Decimal, so cast them to float first.
numerical_columns = ['temperature', 'pressure', 'flow_rate', 'vibration']
X = df[numerical_columns].astype(float)

# Normalize the data
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

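# Train an Isolation Forest model for anomaly detection.
# contamination=0.1 assumes roughly 10% of the training readings are
# anomalous; tune this value for your own data.
from sklearn.ensemble import IsolationForest
model = IsolationForest(contamination=0.1, random_state=42)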
model.fit(X_scaled)

# Predict anomalies (-1 = anomaly, 1 = normal) and store the labels
# alongside the original records
anomalies = model.predict(X_scaled)
df['anomaly'] = anomalies

# Save the model and scaler
import joblib
joblib.dump(model, 'isolation_forest_model.joblib')
joblib.dump(scaler, 'scaler.joblib')

# Upload model and scaler to S3
s3 = boto3.client('s3')
bucket_name = 'your-bucket-name'  # Replace with your S3 bucket name
s3.upload_file('isolation_forest_model.joblib', bucket_name, 'models/isolation_forest_model.joblib')
s3.upload_file('scaler.joblib', bucket_name, 'models/scaler.joblib')

print("Model and scaler uploaded to S3.")
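
A single DynamoDB `query` call returns at most 1 MB of items, so a day of one-second readings may arrive in several pages. The sketch below shows one way to follow `LastEvaluatedKey` until the result set is complete and to convert the `Decimal` values that boto3 returns. The helper names `query_all` and `decimals_to_float` are illustrative.

```python
from decimal import Decimal


def query_all(table, **query_kwargs):
    """Collect every item from table.query(), following LastEvaluatedKey."""
    items = []
    while True:
        response = table.query(**query_kwargs)
        items.extend(response.get("Items", []))
        last_key = response.get("LastEvaluatedKey")
        if not last_key:
            return items
        # Resume where the previous page stopped
        query_kwargs["ExclusiveStartKey"] = last_key


def decimals_to_float(item):
    """Return a copy of the item with every Decimal value converted to float
    (this includes pump_id, which DynamoDB also returns as Decimal)."""
    return {k: float(v) if isinstance(v, Decimal) else v for k, v in item.items()}
```

With these helpers, the DataFrame could be built as `pd.DataFrame([decimals_to_float(i) for i in query_all(table, KeyConditionExpression=...)])`, using the same key condition as in the cell above.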

## Step 4: Real-time Anomaly Detection with AWS Lambda

Now, let's create another Lambda function to perform real-time anomaly detection on incoming sensor data:

In [None]:
import json
import boto3
import joblib
import os
import numpy as np

s3 = boto3.client('s3')
bucket_name = os.environ['MODEL_BUCKET']

# Download the model and scaler from S3
s3.download_file(bucket_name, 'models/isolation_forest_model.joblib', '/tmp/model.joblib')
s3.download_file(bucket_name, 'models/scaler.joblib', '/tmp/scaler.joblib')

# Load the model and scaler
model = joblib.load('/tmp/model.joblib')
scaler = joblib.load('/tmp/scaler.joblib')

def lambda_handler(event, context):
    # The IoT rule passes the message to Lambda as an already-parsed dict
    iot_data = dict(event)
    
    # Extract the relevant features
    features = np.array([[
        iot_data['temperature'],
        iot_data['pressure'],
        iot_data['flow_rate'],
        iot_data['vibration']
    ]])
    
    # Normalize the features
    features_scaled = scaler.transform(features)
    
    # Predict anomaly
    anomaly = model.predict(features_scaled)[0]
    
    # Add anomaly prediction to the data
    iot_data['anomaly'] = int(anomaly)
    
    # If it's an anomaly, send an alert
    if anomaly == -1:
        send_alert(iot_data)
    
    return {
        'statusCode': 200,
        'body': json.dumps(iot_data)
    }

def send_alert(data):
    # Implement your alerting mechanism here (e.g., SNS, email, etc.)
    print(f"ALERT: Anomaly detected for pump {data['pump_id']} at {data['timestamp']}")

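Deploy this Lambda function with the `MODEL_BUCKET` environment variable set and permission to read the model files from S3. scikit-learn, joblib, and numpy are not included in the default Lambda Python runtime, so package them with the function (for example, as a layer or a container image). Then update the IoT Rule to trigger this function. If it replaces the previous function, readings are no longer written to DynamoDB; add it as a second rule action to keep both.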
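
The handler above reads the four sensor fields directly from the event, so a malformed message would surface as a `KeyError` or as a failure inside the scaler. A minimal sketch of validating the payload first is shown below; `extract_features` and `FEATURE_NAMES` are hypothetical names, and the field order is assumed to match the order used for training.

```python
import math

# Same order as the columns used to fit the scaler and the model
FEATURE_NAMES = ("temperature", "pressure", "flow_rate", "vibration")


def extract_features(iot_data):
    """Return the four sensor readings as floats, in training order,
    or raise ValueError if the payload is unusable."""
    missing = [name for name in FEATURE_NAMES if name not in iot_data]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    values = []
    for name in FEATURE_NAMES:
        try:
            value = float(iot_data[name])
        except (TypeError, ValueError):
            raise ValueError(f"{name} is not numeric: {iot_data[name]!r}") from None
        if not math.isfinite(value):
            raise ValueError(f"{name} is not finite: {value}")
        values.append(value)
    return values
```

Inside the handler, `np.array([extract_features(iot_data)])` could then replace the hand-built feature array, with the `ValueError` caught and logged so that one bad message does not hide the cause of the failure.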

## Step 5: Visualization with Amazon QuickSight

To visualize the data and anomalies in QuickSight:

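1. Export the sensor data and anomaly labels to S3 in Parquet format (the table below reads from S3, not directly from DynamoDB), then create an Amazon Athena table over that location: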

In [None]:
CREATE EXTERNAL TABLE oil_pump_sensor_data (
    pump_id int,
    `timestamp` string,
    temperature double,
    pressure double,
    flow_rate double,
    vibration double,
    anomaly int
)
STORED AS PARQUET
LOCATION 's3://your-bucket/oil_pump_data/';

2. Set up a QuickSight dataset using this Athena table.

3. Create a QuickSight dashboard with the following visualizations:
   - Line chart of sensor readings over time for each pump
   - Scatter plot of temperature vs. pressure, colored by anomaly status
   - Bar chart of anomaly count by pump
   - KPI indicators for current readings of each sensor
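
The Athena table in step 1 reads Parquet files from S3, while the pipeline so far stores readings in DynamoDB, so some export step is needed in between. One possible sketch follows. It assumes each record already carries an `anomaly` label and that pandas with a Parquet engine such as pyarrow is installed; `to_athena_row` and `write_parquet` are illustrative names.

```python
from decimal import Decimal

# Column name -> Python type, mirroring the Athena table definition
ATHENA_SCHEMA = {
    "pump_id": int,
    "timestamp": str,
    "temperature": float,
    "pressure": float,
    "flow_rate": float,
    "vibration": float,
    "anomaly": int,
}


def to_athena_row(item):
    """Coerce one record (for example a DynamoDB item holding Decimal values)
    to the column types declared in the Athena table; extra keys are dropped."""
    return {col: cast(item[col]) for col, cast in ATHENA_SCHEMA.items()}


def write_parquet(items, path):
    """Write the records to a local Parquet file.
    pandas is imported lazily so the conversion helper has no dependencies."""
    import pandas as pd
    pd.DataFrame([to_athena_row(i) for i in items]).to_parquet(path, index=False)
```

The resulting file can be uploaded with `s3.upload_file(...)` to the prefix named in the table's `LOCATION` clause.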

## Step 6: Setting up Alerts with Amazon SNS

Create an SNS topic for alerts:

In [None]:
import boto3

sns = boto3.client('sns')

# Create SNS topic
response = sns.create_topic(Name='oil_pump_alerts')
topic_arn = response['TopicArn']

# Subscribe to the topic (replace with your email). The recipient must
# confirm the subscription from the confirmation email before alerts arrive.
sns.subscribe(
    TopicArn=topic_arn,
    Protocol='email',
    Endpoint='your-email@example.com'
)

print(f"SNS topic created: {topic_arn}")

Update the `send_alert` function in the anomaly detection Lambda:

In [None]:
import boto3

sns = boto3.client('sns')
topic_arn = 'your-sns-topic-arn'  # Replace with your SNS topic ARN

def send_alert(data):
    message = f"ALERT: Anomaly detected for pump {data['pump_id']} at {data['timestamp']}\n"
    message += f"Sensor readings: Temperature: {data['temperature']}, Pressure: {data['pressure']}, "
    message += f"Flow Rate: {data['flow_rate']}, Vibration: {data['vibration']}"
    
    sns.publish(
        TopicArn=topic_arn,
        Message=message,
        Subject='Oil Pump Anomaly Detected'
    )
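
Because the model in Step 3 was trained with `contamination=0.1`, a noticeable share of readings may be flagged, and one email per flagged reading can quickly become noisy. A minimal sketch of a per-pump cooldown is shown below. `AlertThrottle` is a hypothetical helper, the 300-second default is an arbitrary assumption, and its in-memory state only persists while a Lambda execution environment stays warm, so a durable store would be needed for strict guarantees.

```python
import time


class AlertThrottle:
    """Allow at most one alert per pump within `cooldown_seconds`."""

    def __init__(self, cooldown_seconds=300, clock=time.monotonic):
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock
        self._last_sent = {}  # pump_id -> clock reading at the last alert

    def should_send(self, pump_id):
        """Return True and record the time if the pump is outside its cooldown."""
        now = self._clock()
        last = self._last_sent.get(pump_id)
        if last is not None and now - last < self.cooldown_seconds:
            return False
        self._last_sent[pump_id] = now
        return True
```

With a module-level `throttle = AlertThrottle()`, the handler could call `send_alert(iot_data)` only when `throttle.should_send(iot_data['pump_id'])` returns `True`.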

## Conclusion

This end-to-end example demonstrates how to:

1. Simulate IoT device data for oil pumps
2. Ingest and process data using AWS IoT Core and Lambda
3. Store data in DynamoDB
4. Train an anomaly detection model using scikit-learn
5. Perform real-time anomaly detection on incoming data
6. Visualize data and anomalies using QuickSight
7. Set up alerts for detected anomalies using SNS

Key points to remember:

- Replace the simulated data with real IoT device inputs in a production environment
- Implement proper error handling and logging for all Lambda functions
- Regularly retrain the anomaly detection model as more data becomes available
- Set up appropriate IAM roles and permissions for all services
- Consider using AWS IoT Greengrass for edge computing capabilities
- Consider a model that captures temporal patterns (e.g., LSTM autoencoders) if accuracy is insufficient, since the Isolation Forest used here scores each reading independently

This IIoT solution provides a foundation for predictive maintenance in the oil and refinery industry. By detecting anomalies early, companies can prevent equipment failures, reduce downtime, and optimize maintenance schedules.