# Predictive Maintenance with Machine Learning (Manufacturing Tutorial)


---

## 1. Problem Statement
In manufacturing, **unexpected machine failure** leads to:
- Production downtime
- Scrap and rework
- High maintenance cost

**Predictive Maintenance** aims to:
> Predict machine failure *before* it happens using sensor data.

### Objective
Build a **machine learning model** to predict whether a machine is **likely to fail** based on sensor readings.


---

## 2. Import Required Libraries


In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix


---

## 3. Generate Synthetic Manufacturing Sensor Data
We simulate four sensor readings per machine, each drawn independently from a normal distribution:
- Temperature (°C)
- Vibration (mm/s)
- Pressure (bar)
- Runtime since last maintenance (hours)


In [None]:
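# Fix the seed so the simulated data is reproducible
np.random.seed(42)

n_samples = 1000

# Each reading is drawn as normal(mean, standard deviation)
data = pd.DataFrame({
    "temperature": np.random.normal(70, 8, n_samples),       # °C
    "vibration": np.random.normal(4, 1.2, n_samples),        # mm/s
    "pressure": np.random.normal(5, 0.8, n_samples),         # bar
    "runtime_hours": np.random.normal(500, 120, n_samples)   # hours since last maintenance
})

# Failure logic (hidden rule): a machine fails if ANY threshold is exceeded.
# Pressure is not part of the rule, so it carries no signal about failure.
data["machine_failure"] = (
    (data["temperature"] > 85) |
    (data["vibration"] > 6) |
    (data["runtime_hours"] > 700)
).astype(int)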
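
Because the four readings are drawn independently, the share of machines that the rule labels as failures can be estimated from normal tail probabilities. The sketch below does this with the standard library only; the helper `normal_tail` is a name introduced here for illustration and is not part of the tutorial's code.

```python
import math

def normal_tail(threshold, mean, std):
    """P(X > threshold) for X ~ Normal(mean, std)."""
    z = (threshold - mean) / std
    return 0.5 * math.erfc(z / math.sqrt(2))

# Thresholds and distribution parameters copied from the cell above
p_temp = normal_tail(85, 70, 8)
p_vib = normal_tail(6, 4, 1.2)
p_run = normal_tail(700, 500, 120)

# The rule is an OR of independent events, so combine via the complement
p_failure = 1 - (1 - p_temp) * (1 - p_vib) * (1 - p_run)
print(f"Expected failure rate: {p_failure:.1%}")
```

The result is roughly 12%, so the two classes are imbalanced. The empirical rate in the simulated sample can be compared against it with `data["machine_failure"].mean()`.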


---

## 4. Initial Data Inspection


In [None]:
data.head()

In [None]:
data.info()

In [None]:
data.describe()

---

## 5. Exploratory Data Analysis (EDA)

### Failure Distribution


In [None]:
sns.countplot(x="machine_failure", data=data)
plt.title("Machine Failure Distribution")
plt.show()


### Sensor Behavior vs Failure

In [None]:
sns.boxplot(x="machine_failure", y="temperature", data=data)
plt.show()


In [None]:
sns.boxplot(x="machine_failure", y="vibration", data=data)
plt.show()


---

## 6. Feature Selection


In [None]:
# Keep all four sensor columns as features; the failure flag is the target
X = data.drop("machine_failure", axis=1)
y = data["machine_failure"]


---

## 7. Train–Test Split


In [None]:
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)


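### Why Split?
- Training data → used to fit the model
- Test data → held out to estimate how the model performs on readings it has never seen

Passing `stratify=y` keeps the failure rate roughly the same in both sets, which matters because failures are the minority class.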
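
The split above passes `stratify=y`, and its effect is easy to confirm by comparing failure rates in the two parts. The sketch below uses stand-in labels with a 12% failure rate, which is an assumption for illustration rather than the tutorial's actual data; the `_demo` names are likewise hypothetical.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in labels with a 12% failure rate (illustrative, not the tutorial's data)
y_demo = np.array([1] * 120 + [0] * 880)
X_demo = np.arange(len(y_demo)).reshape(-1, 1)  # placeholder feature column

_, _, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=42, stratify=y_demo
)

# With stratification, both rates should sit very close to the overall 12%
print(f"train failure rate: {y_tr.mean():.3f}")
print(f"test failure rate:  {y_te.mean():.3f}")
```

In the notebook itself, the same check is `y_train.mean()` against `y_test.mean()`.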


---

## 8. Feature Scaling
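Sensor readings have different units and scales (runtime is in the hundreds of hours, vibration in single digits). Standardizing each feature to zero mean and unit variance keeps the regularized logistic regression from treating large-valued features differently just because of their units. The scaler is fitted on the training set only and then applied to the test set, so no information from the test data leaks into training.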


In [None]:
scaler = StandardScaler()

X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
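
One way to make the "fit on training data only" rule hard to break is to bundle the scaler and the model into a scikit-learn pipeline, so the scaler is refitted inside each cross-validation fold. The following is a minimal sketch under assumptions: it regenerates stand-in data with the same distributions and rule as Section 3 (using its own random generator), and it scores by recall on the failure class as one reasonable choice, not the only one.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in data built like Section 3, with a separate generator so the
# notebook's global random state is left untouched
rng = np.random.default_rng(0)
n = 1000
X_demo = pd.DataFrame({
    "temperature": rng.normal(70, 8, n),
    "vibration": rng.normal(4, 1.2, n),
    "pressure": rng.normal(5, 0.8, n),
    "runtime_hours": rng.normal(500, 120, n),
})
y_demo = (
    (X_demo["temperature"] > 85)
    | (X_demo["vibration"] > 6)
    | (X_demo["runtime_hours"] > 700)
).astype(int)

# The pipeline refits the scaler in every fold, on that fold's training part only
pipe = make_pipeline(StandardScaler(), LogisticRegression())
recall_scores = cross_val_score(pipe, X_demo, y_demo, cv=5, scoring="recall")
print(recall_scores.round(2))
```

The manual `fit_transform` / `transform` pair in the cell above does the same job for a single split; the pipeline mainly helps once cross-validation or hyperparameter search enters the picture.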


---

## 9. Build Predictive Model (Logistic Regression)


In [None]:
# Default settings: L2-regularized logistic regression with C=1.0
model = LogisticRegression()
model.fit(X_train_scaled, y_train)
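
Since the features are standardized, the fitted coefficients are directly comparable: each one is the change in the log-odds of failure per one standard deviation of that sensor. The sketch below shows how they might be inspected. It refits a model on stand-in data generated like Section 3, so the variable names (`X_demo`, `clf`, `coefs`) are illustrative and the exact values will differ from the notebook's `model`.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Stand-in data with the same distributions and failure rule as Section 3
rng = np.random.default_rng(0)
n = 1000
X_demo = pd.DataFrame({
    "temperature": rng.normal(70, 8, n),
    "vibration": rng.normal(4, 1.2, n),
    "pressure": rng.normal(5, 0.8, n),
    "runtime_hours": rng.normal(500, 120, n),
})
y_demo = (
    (X_demo["temperature"] > 85)
    | (X_demo["vibration"] > 6)
    | (X_demo["runtime_hours"] > 700)
).astype(int)

# Standardize, fit, and label each coefficient with its feature name
X_scaled = StandardScaler().fit_transform(X_demo)
clf = LogisticRegression().fit(X_scaled, y_demo)
coefs = pd.Series(clf.coef_[0], index=X_demo.columns).sort_values(ascending=False)
print(coefs)
```

One would expect positive coefficients for temperature, vibration, and runtime, and a coefficient near zero for pressure, because pressure plays no part in the failure rule. In the notebook, the equivalent call is `pd.Series(model.coef_[0], index=X.columns)`.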


---

## 10. Model Evaluation


In [None]:
y_pred = model.predict(X_test_scaled)


In [None]:
print(classification_report(y_test, y_pred))


In [None]:
sns.heatmap(confusion_matrix(y_test, y_pred), annot=True, fmt="d")
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.show()
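
The four cells of the confusion matrix map directly onto the metrics in the classification report. As a small worked example on made-up labels (not the tutorial's predictions), the counts can be unpacked and turned into recall and precision for the failure class:

```python
from sklearn.metrics import confusion_matrix

# Made-up labels for illustration: 1 = failure, 0 = healthy
y_true_demo = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred_demo = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0]

# For binary labels, ravel() yields the cells in the order tn, fp, fn, tp
tn, fp, fn, tp = confusion_matrix(y_true_demo, y_pred_demo).ravel()

recall = tp / (tp + fn)     # share of real failures that were caught
precision = tp / (tp + fp)  # share of alarms that were real failures
print(f"tn={tn} fp={fp} fn={fn} tp={tp}")
print(f"recall={recall:.2f} precision={precision:.2f}")
```

Here `fn` counts missed failures and `fp` counts false alarms. Applying the same unpacking to `confusion_matrix(y_test, y_pred)` gives the counts behind the heatmap above.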


---

## 11. Business Interpretation

| Scenario | Impact |
|--------|--------|
| Correct failure prediction | Planned maintenance |
| Missed failure | Unexpected breakdown |
| False alarm | Minor inspection cost |

> In manufacturing, **false alarms are usually cheaper than missed failures**, so recall on the failure class often matters more than overall accuracy.
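
If the two kinds of error have different costs, the default 0.5 probability cutoff is not necessarily the best one. The sketch below picks the cutoff that minimizes total cost on a toy example. The cost figures (50 for a missed failure, 1 for a false alarm), the toy data, and the helper names `total_cost` and `best_threshold` are all assumptions made for illustration; real costs would have to come from the plant.

```python
import numpy as np

def total_cost(y_true, proba, threshold, cost_missed=50.0, cost_false_alarm=1.0):
    """Total cost of acting on predictions at a given probability threshold."""
    y_true = np.asarray(y_true)
    flagged = np.asarray(proba) >= threshold
    missed = np.sum((y_true == 1) & ~flagged)        # failures that were not flagged
    false_alarms = np.sum((y_true == 0) & flagged)   # healthy machines that were flagged
    return cost_missed * missed + cost_false_alarm * false_alarms

def best_threshold(y_true, proba, thresholds=None, **costs):
    """Return the candidate threshold with the lowest total cost (first one on ties)."""
    if thresholds is None:
        thresholds = np.linspace(0.05, 0.95, 19)
    costs_per_t = [total_cost(y_true, proba, t, **costs) for t in thresholds]
    return float(thresholds[int(np.argmin(costs_per_t))])

# Toy labels and predicted failure probabilities (hypothetical values)
y_demo = np.array([0, 0, 0, 0, 1, 1])
p_demo = np.array([0.12, 0.22, 0.42, 0.62, 0.37, 0.81])

print("cost at 0.50:", total_cost(y_demo, p_demo, 0.50))
print("chosen threshold:", best_threshold(y_demo, p_demo))
```

With the notebook's model, the probabilities would come from `model.predict_proba(X_test_scaled)[:, 1]`. To keep the test set untouched for the final evaluation, the threshold is better chosen on a separate validation split.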
