This project builds and serves a machine learning API that predicts the probability of machinery failure from operating sensor measurements.
The workflow has two connected parts:
- `asset_failure.ipynb` trains, evaluates, and saves a machine failure prediction model.
- `app.py` loads the saved `.pkl` model and exposes it through a FastAPI application.
The final goal is to make the trained model easy to use from another application, dashboard, script, or automated maintenance workflow.
| File | Purpose |
|---|---|
| `ai4i2020.csv` | Source dataset used to train the model. |
| `asset_failure.ipynb` | Jupyter notebook that prepares the data, trains the model, evaluates performance, and saves the model pipeline. |
| `xgb_smote_pipeline.pkl` | Saved machine learning pipeline used by the API for predictions. |
| `app.py` | FastAPI application that serves the trained model. |
| `testapi.py` | Small Python script that sends a sample prediction request to the API. |
| `requirements.txt` | Python dependencies needed to run the notebook and API. |
The API predicts whether a machine is likely to fail based on five operating conditions:
- Air temperature in Kelvin
- Process temperature in Kelvin
- Rotational speed in RPM
- Torque in Newton-meters
- Tool wear in minutes
For each request, the API returns:
- `prediction`: `1` for predicted machine failure, `0` for no predicted machine failure
- `failure_probability`: probability that the machine will fail
- `prediction_label`: readable label for the prediction
Create and activate a virtual environment, then install the dependencies:

```powershell
python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt
```

Start the FastAPI application with Uvicorn:

```powershell
uvicorn app:app --reload
```

The API will run locally at:
http://127.0.0.1:8000
FastAPI also provides interactive API documentation at:
http://127.0.0.1:8000/docs
`GET /health`

Example response:

```json
{
  "status": "ok"
}
```

`POST /predict`

Example request body:

```json
{
  "Air temperature [K]": 300,
  "Process temperature [K]": 310,
  "Rotational speed [rpm]": 1500,
  "Torque [Nm]": 40,
  "Tool wear [min]": 10
}
```

Example response:

```json
{
  "prediction": 0,
  "failure_probability": 0.0234,
  "prediction_label": "No Machine Failure"
}
```

The exact probability may change if the model is retrained.

After starting the API, run:

```powershell
python testapi.py
```

The script sends a sample machine operating profile to http://127.0.0.1:8000/predict and prints the API response.
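The request the script sends can be sketched with only the standard library; the actual `testapi.py` may use a third-party HTTP client such as `requests` instead, and the field names below simply match the example request body.

```python
# Minimal sketch of a client like testapi.py, using only the standard library.
import json
import urllib.request

API_URL = "http://127.0.0.1:8000/predict"

# Sample operating profile matching the API's expected field aliases.
sample = {
    "Air temperature [K]": 300,
    "Process temperature [K]": 310,
    "Rotational speed [rpm]": 1500,
    "Torque [Nm]": 40,
    "Tool wear [min]": 10,
}

def send_prediction_request(url: str, payload: dict) -> dict:
    """POST the payload as JSON and return the decoded API response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))

if __name__ == "__main__":
    print(send_prediction_request(API_URL, sample))
```

The API must already be running locally for the request to succeed.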
The model is built in asset_failure.ipynb and saved as xgb_smote_pipeline.pkl.
The notebook imports tools for:
- Loading and manipulating data with `pandas`
- Splitting data into training and test sets with `train_test_split`
- Building a preprocessing and modeling pipeline
- Handling missing values with `SimpleImputer`
- Addressing class imbalance with `SMOTE`
- Training an `XGBClassifier`
- Evaluating the model with accuracy, classification report, ROC curve, and AUC
- Saving the final pipeline with `joblib`
This keeps the full model-building process reproducible from data loading through model export.
```python
df = pd.read_csv('./ai4i2020.csv')
```

The notebook loads `ai4i2020.csv`, which contains machine operating measurements and failure labels.

```python
X = df.drop(columns=['UDI', 'Product ID', 'Machine failure', 'TWF', 'HDF', 'PWF', 'OSF', 'RNF'])
y = df['Machine failure']
```

The target variable is `Machine failure`.
The notebook removes:
- `UDI`: an identifier column that does not describe machine behavior
- `Product ID`: a product identifier that is not a direct operating measurement
- `Machine failure`: the target column, which must not be included as an input feature
- `TWF`, `HDF`, `PWF`, `OSF`, `RNF`: specific failure-mode flags
Removing the specific failure-mode flags is important because the API is designed to predict overall machine failure from operating conditions, not from columns that already describe failure events.
The final model uses these numeric input features:
```python
numeric_features = [
    'Air temperature [K]',
    'Process temperature [K]',
    'Rotational speed [rpm]',
    'Torque [Nm]',
    'Tool wear [min]'
]
```

These are the same fields required by the API.
```python
preprocessor = ColumnTransformer(
    transformers=[
        ('num', SimpleImputer(strategy='median'), numeric_features)
    ]
)
```

The preprocessing step fills missing numeric values using the median.
Median imputation is useful because it is less sensitive to extreme values than mean imputation. This is a practical choice for machinery data, where unusual readings can occur and should not overly influence how missing values are filled.
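The difference is easy to see with a toy set of temperature-like readings (the values below are illustrative, not taken from the dataset):

```python
from statistics import mean, median

# One extreme sensor reading skews the mean but barely moves the median.
readings = [298.0, 299.5, 300.2, 301.1, 5000.0]  # illustrative values

print(mean(readings))    # pulled far upward by the outlier
print(median(readings))  # stays near the typical operating range
```

Filling missing values with the mean here would insert a wildly unrealistic temperature; the median stays representative.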
ColumnTransformer is used so preprocessing is tied directly to the expected feature columns. This makes the training workflow and API prediction workflow consistent.
```python
SMOTE(sampling_strategy=0.2, random_state=42, k_neighbors=5)
```

Machine failure events are usually much less common than normal operation. If the model trained directly on the raw class distribution, it could achieve high accuracy by mostly predicting "no failure" while missing many actual failures.
SMOTE creates synthetic examples of the minority class in the training data. This helps the model learn the failure pattern more effectively.
The notebook uses:
- `sampling_strategy=0.2`: increases the number of failure examples without forcing a fully balanced dataset
- `random_state=42`: makes the training process repeatable
- `k_neighbors=5`: controls how many nearby minority-class examples are used when creating each synthetic sample
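The effect of `sampling_strategy=0.2` can be shown with hypothetical class counts (the real counts depend on the dataset and the train/test split): SMOTE resamples the minority class until it reaches 20% of the majority class size.

```python
def smote_target_minority(n_majority: int, sampling_strategy: float) -> int:
    """Minority count SMOTE aims for: sampling_strategy * majority count."""
    return int(n_majority * sampling_strategy)

# Hypothetical training split: 7,700 normal rows and 270 failure rows.
n_majority, n_minority = 7700, 270
target = smote_target_minority(n_majority, 0.2)
synthetic = target - n_minority

print(target)     # minority count after resampling: 1540
print(synthetic)  # synthetic failure rows SMOTE generates: 1270
```

This keeps most of the data real while giving the model substantially more failure examples to learn from.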
```python
XGBClassifier(
    n_estimators=300,
    learning_rate=0.05,
    max_depth=4,
    subsample=0.8,
    colsample_bytree=0.8,
    random_state=42,
    eval_metric='logloss'
)
```

XGBoost is used because it performs well on structured tabular data and can capture nonlinear relationships between operating conditions and failure risk.
The chosen settings aim to balance predictive performance and generalization:
- `n_estimators=300`: builds enough trees to learn useful patterns
- `learning_rate=0.05`: uses smaller learning steps to reduce overfitting risk
- `max_depth=4`: limits tree complexity
- `subsample=0.8`: trains each tree on a random sample of rows
- `colsample_bytree=0.8`: trains each tree on a random sample of features
- `random_state=42`: makes results reproducible
- `eval_metric='logloss'`: uses a probability-aware classification metric during training
```python
xgb_smote_pipeline = Pipeline(steps=[
    ('preprocessor', preprocessor),
    ('smote', SMOTE(sampling_strategy=0.2, random_state=42, k_neighbors=5)),
    ('model', XGBClassifier(...))
])
```

The notebook saves the complete workflow as one pipeline, not just the XGBoost model.
This matters because the API needs to apply the same preprocessing steps used during training. Saving the full pipeline ensures that incoming API data is imputed and transformed correctly before prediction.
```python
X_train, X_test, y_train, y_test = train_test_split(
    X,
    y,
    test_size=0.2,
    random_state=42,
    stratify=y
)
```

The dataset is split into:
- 80% training data
- 20% test data
`stratify=y` preserves the same failure/non-failure ratio in both sets. This is especially important for imbalanced datasets because the test set needs to represent the rare failure class fairly.
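What stratification guarantees can be sketched without scikit-learn: split each class separately, then verify that both sets keep the same failure ratio. The labels below are synthetic, chosen only to mirror an imbalanced dataset.

```python
def stratified_split(labels, test_frac=0.2):
    """Split indices per class so train and test keep the class ratio."""
    train, test = [], []
    for cls in set(labels):
        idx = [i for i, y in enumerate(labels) if y == cls]
        n_test = int(len(idx) * test_frac)
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return train, test

# Synthetic imbalanced labels: 90 non-failures, 10 failures.
labels = [0] * 90 + [1] * 10
train, test = stratified_split(labels)

train_ratio = sum(labels[i] for i in train) / len(train)
test_ratio = sum(labels[i] for i in test) / len(test)
print(train_ratio, test_ratio)  # both 0.1
```

A plain random split on data this imbalanced could easily leave the test set with almost no failure cases, making evaluation of the failure class meaningless.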
```python
xgb_smote_pipeline.fit(X_train, y_train)
y_pred = xgb_smote_pipeline.predict(X_test)
```

The notebook trains the pipeline on the training data and evaluates it on the held-out test data.
Reported test results from the notebook:
```
Accuracy: 0.978

Class 0 - No Machine Failure:
  precision: 0.99
  recall: 0.98
  f1-score: 0.99

Class 1 - Machine Failure:
  precision: 0.64
  recall: 0.81
  f1-score: 0.71

ROC AUC Score: 0.9727
```
The recall of 0.81 for the failure class means the model detected many of the actual failures in the test set. For a failure prediction use case, recall is important because missing a real failure can be more costly than flagging a machine for inspection.
The ROC AUC score of about 0.973 shows that the model separates failure and non-failure cases well across different probability thresholds.
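ROC AUC has a concrete interpretation: the probability that a randomly chosen failure case gets a higher predicted failure probability than a randomly chosen non-failure case. A minimal sketch of that pairwise computation, using made-up scores rather than the notebook's outputs:

```python
def roc_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly;
    ties count as half. O(n^2), which is fine for a small illustration."""
    pos = [s for s, y in zip(y_score, y_true) if y == 1]
    neg = [s for s, y in zip(y_score, y_true) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))

# Made-up failure probabilities for 4 non-failures and 2 failures.
y_true = [0, 0, 0, 0, 1, 1]
y_score = [0.02, 0.10, 0.35, 0.60, 0.55, 0.90]
print(roc_auc(y_true, y_score))  # 0.875
```

The notebook itself computes this with scikit-learn's ROC utilities; the sketch just shows what the number measures.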
The notebook prints feature importance from the trained XGBoost model:
| Feature | Importance |
|---|---|
| Torque [Nm] | 0.287694 |
| Rotational speed [rpm] | 0.286066 |
| Tool wear [min] | 0.210199 |
| Air temperature [K] | 0.129629 |
| Process temperature [K] | 0.086412 |
This helps explain which machine measurements had the most influence on the model. In this run, torque, rotational speed, and tool wear were the strongest predictors.
```python
joblib.dump(xgb_smote_pipeline, './xgb_smote_pipeline.pkl')
```

The final trained pipeline is saved as `xgb_smote_pipeline.pkl`.
This .pkl file contains:
- The median imputation preprocessing step
- The trained XGBoost model
- The feature structure expected at prediction time
The API loads this file with:
```python
model = joblib.load("xgb_smote_pipeline.pkl")
```

When `/predict` receives a request, the API converts the request body into a one-row pandas DataFrame, passes it into the saved pipeline, and returns the predicted class plus the failure probability.
app.py defines a FastAPI application and a Pydantic input schema.
The input schema expects the same feature names used during model training:
```python
class AssetInput(BaseModel):
    air_temperature_k: float = Field(..., alias="Air temperature [K]")
    process_temperature_k: float = Field(..., alias="Process temperature [K]")
    rotational_speed_rpm: float = Field(..., alias="Rotational speed [rpm]")
    torque_nm: float = Field(..., alias="Torque [Nm]")
    tool_wear_min: float = Field(..., alias="Tool wear [min]")
```

Inside the `/predict` endpoint, the API:

- Receives JSON input from the user.
- Converts the input into a DataFrame with the exact column names used during training.
- Calls `model.predict(df)` to get the predicted class.
- Calls `model.predict_proba(df)` to get the probability of machine failure.
- Returns the prediction, probability, and readable prediction label.
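The last step above amounts to plain response assembly; `make_response` is a hypothetical helper for illustration, and `app.py` may inline this logic instead.

```python
def make_response(prediction: int, failure_probability: float) -> dict:
    """Build the JSON-style response body returned by /predict.
    Hypothetical helper; the real app may structure this differently."""
    label = "Machine Failure" if prediction == 1 else "No Machine Failure"
    return {
        "prediction": prediction,
        "failure_probability": round(failure_probability, 4),
        "prediction_label": label,
    }

print(make_response(0, 0.02341))
# {'prediction': 0, 'failure_probability': 0.0234, 'prediction_label': 'No Machine Failure'}
```

FastAPI serializes the returned dict to JSON automatically, which is why the example responses earlier have exactly these three fields.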
```powershell
curl.exe -X POST "http://127.0.0.1:8000/predict" `
  -H "Content-Type: application/json" `
  -d "{\"Air temperature [K]\":300,\"Process temperature [K]\":310,\"Rotational speed [rpm]\":1500,\"Torque [Nm]\":40,\"Tool wear [min]\":10}"
```

- The `.pkl` file must stay in the same folder as `app.py` unless the load path is changed.
- Input JSON field names must match the expected API aliases shown in the example request.
- The model predicts probability, not certainty. A high probability means higher predicted risk based on the training data.
- Retraining the notebook can produce a new `.pkl` file and slightly different prediction probabilities.
- The current notebook is focused on model development and export. For production use, consider adding model versioning, logging, monitoring, authentication, and threshold tuning based on business risk.