# 🧠 MLP Classifier for Anomaly Detection

This notebook implements a simple Multilayer Perceptron (MLP) neural network for binary classification of anomalies in Kubernetes resource usage data.

- Dataset: `k8_synthetic_dataset.csv`
- Evaluation metric: **Macro F1-Score**
- Author: Ammar Yousuf Abrahani
- Date: June 2025
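
The macro F1-score named above averages the per-class F1 values with equal weight, so a rare class counts as much as a common one. The sketch below is a minimal pure-Python illustration on hypothetical toy labels (`y_true_toy` and `y_pred_all_normal` are made up for this example and are not taken from the dataset). It shows how a model that never predicts the anomaly class can still reach high accuracy while its macro F1 stays low.

```python
def f1_from_counts(tp, fp, fn):
    """F1 for one class from true positives, false positives, false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred, classes=(0, 1)):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        scores.append(f1_from_counts(tp, fp, fn))
    return sum(scores) / len(scores)


# Hypothetical toy labels: 8 normal (0) and 2 anomalies (1).
y_true_toy = [0] * 8 + [1] * 2
# A degenerate classifier that predicts "normal" for every sample.
y_pred_all_normal = [0] * 10

accuracy = sum(t == p for t, p in zip(y_true_toy, y_pred_all_normal)) / len(y_true_toy)
score = macro_f1(y_true_toy, y_pred_all_normal)
print(f"accuracy={accuracy:.4f}, macro F1={score:.4f}")
```

Here the normal class scores 16/18 and the anomaly class scores 0, so the macro average is 8/18 even though accuracy is 0.8.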

In [1]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, f1_score
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam

In [3]:
# Load Dataset
data = pd.read_csv('../data/raw/k8_synthetic_dataset.csv')
data.head()

Unnamed: 0,cpu_usage,memory_usage,network_io,disk_io,label
0,54.967142,41.71005,337.849431,107.373466,0.0
1,48.617357,44.39819,253.891734,92.133224,0.0
2,56.476885,57.472936,343.480296,100.574896,0.0
3,65.230299,56.103703,367.781893,125.569037,0.0
4,47.658466,49.790984,320.671745,103.821981,0.0


In [None]:
# Features and Labels
X = data[['cpu_usage', 'memory_usage', 'network_io', 'disk_io']].values
y = data['label'].values.astype(int)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
print(f"Train samples: {len(X_train)}, Test samples: {len(X_test)}")

Train samples: 210, Test samples: 90
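
The split above is purely random, so with a rare anomaly class the train and test sets may end up with different anomaly rates. One common option, shown here only as a sketch, is to pass `stratify=` to `train_test_split` so each split keeps the overall class ratio. The arrays `X_demo` and `y_demo` and the 10% anomaly rate are hypothetical stand-ins, not the notebook's data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Hypothetical stand-in data: 300 samples, 4 features, 30 anomalies (10%).
X_demo = rng.normal(size=(300, 4))
y_demo = np.zeros(300, dtype=int)
y_demo[:30] = 1

# stratify=y_demo keeps the class proportions the same in both splits.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=42, stratify=y_demo
)

print(f"Train anomalies: {y_tr.sum()} of {len(y_tr)}")
print(f"Test anomalies:  {y_te.sum()} of {len(y_te)}")
```

In the notebook itself this would amount to adding `stratify=y` to the existing call, which changes which rows land in each split and therefore the reported metrics.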


In [None]:
# Define MLP Model
model = Sequential([
    Dense(32, activation='relu', input_shape=(X_train.shape[1],)),
    Dense(16, activation='relu'),
    Dense(1, activation='sigmoid')
])
model.compile(optimizer=Adam(0.001), loss='binary_crossentropy', metrics=['accuracy'])



In [None]:
# Train MLP Model
model.fit(X_train, y_train, epochs=10, batch_size=16, validation_split=0.2)

Epoch 1/10
11/11 ━━━━━━━━━━━━━━━━━━━━ 2s 30ms/step - accuracy: 0.2668 - loss: 14.8391 - val_accuracy: 0.9762 - val_loss: 0.9314
Epoch 2/10
11/11 ━━━━━━━━━━━━━━━━━━━━ 0s 10ms/step - accuracy: 0.9468 - loss: 2.6003 - val_accuracy: 0.9762 - val_loss: 1.3033
Epoch 3/10
11/11 ━━━━━━━━━━━━━━━━━━━━ 0s 10ms/step - accuracy: 0.9691 - loss: 1.8514 - val_accuracy: 0.9762 - val_loss: 1.0216
Epoch 4/10
11/11 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - accuracy: 0.9763 - loss: 0.9513 - val_accuracy: 0.9286 - val_loss: 0.6118
Epoch 5/10
11/11 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - accuracy: 0.8891 - loss: 0.9836 - val_accuracy: 0.9048 - val_loss: 0.4352
Epoch 6/10
11/11 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - accuracy: 0.9202 - loss: 0.6836 - val_accuracy: 0.9286 - val_loss: 0.3600
Epoch 7/10
11/11 ━━━

<keras.src.callbacks.history.History at 0x257ac97f7d0>
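
The first-epoch loss of 14.84 in the log above may partly reflect that the features are fed in on their raw scales (for example, `network_io` values in the hundreds next to `cpu_usage` values around 50). A common remedy, sketched here under that assumption, is to standardize the features with statistics fitted on the training split only. The arrays `X_train_demo` and `X_test_demo` are hypothetical stand-ins with roughly similar magnitudes, not the notebook's data.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical stand-ins for X_train / X_test with four features on different scales.
loc = [50, 50, 300, 100]
scale = [10, 10, 50, 15]
X_train_demo = rng.normal(loc=loc, scale=scale, size=(210, 4))
X_test_demo = rng.normal(loc=loc, scale=scale, size=(90, 4))

# Fit on the training split only, then reuse the same statistics for the test split
# so that no information from the test set leaks into preprocessing.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train_demo)
X_test_scaled = scaler.transform(X_test_demo)

print("Train mean per feature:", X_train_scaled.mean(axis=0).round(3))
print("Train std per feature: ", X_train_scaled.std(axis=0).round(3))
```

The scaled arrays would then replace `X_train` and `X_test` in the `model.fit` and `model.predict` calls.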

In [None]:
# Evaluate on Test Set
# Threshold the sigmoid output at 0.5 and flatten (n, 1) -> (n,) for scikit-learn
y_pred = (model.predict(X_test) > 0.5).astype(int).ravel()
report = classification_report(y_test, y_pred, target_names=['Normal', 'Anomaly'], digits=4)
f1_macro = f1_score(y_test, y_pred, average='macro')
print("📊 MLP Classifier Performance:\n")
print(report)
print(f"🔍 Macro Average F1-Score: {f1_macro:.4f}")

3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 20ms/step
📊 MLP Classifier Performance:

              precision    recall  f1-score   support

      Normal     0.9213    0.9880    0.9535        83
     Anomaly     0.0000    0.0000    0.0000         7

    accuracy                         0.9111        90
   macro avg     0.4607    0.4940    0.4767        90
weighted avg     0.8497    0.9111    0.8793        90

🔍 Macro Average F1-Score: 0.4767
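
The report above shows that none of the 7 test anomalies were detected, which is why the macro F1 is roughly half of the Normal-class F1. One possible mitigation, offered as a sketch rather than a tested fix for this notebook, is to weight the classes inversely to their frequency during training. The helper `balanced_class_weights` and the label counts in `y_train_demo` are hypothetical and chosen only for illustration.

```python
import numpy as np


def balanced_class_weights(y):
    """Weight each class by n_samples / (n_classes * class_count)."""
    classes, counts = np.unique(y, return_counts=True)
    weights = len(y) / (len(classes) * counts)
    return {int(c): float(w) for c, w in zip(classes, weights)}


# Hypothetical training labels: 190 normal (0) and 20 anomalies (1).
y_train_demo = np.array([0] * 190 + [1] * 20)

class_weight = balanced_class_weights(y_train_demo)
print(class_weight)
```

The resulting dictionary could be passed to Keras through the `class_weight` argument of `model.fit`, so that errors on anomalies contribute more to the loss. Whether this improves anomaly recall on this dataset would need to be checked by rerunning the evaluation.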
