# **Evaluation and Validation of Balanced Dataset**

In [4]:
import pandas as pd
import xgboost as xgb
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

## **Step 1: Data Preparation**
- Load the **merged balanced dataset** and **testing set**.
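- Split the **balanced dataset** into **training (X_train, y_train)** and **validation (X_val, y_val)** sets using a stratified 70/30 split.
- Scale the **training, validation, and testing data** with a scaler fitted on the **training set only**.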

In [5]:
# Load merged balanced dataset
df_balanced = pd.read_csv('balanced_attack_data.csv')
df_test = pd.read_csv('UNSW_NB15_testing-set.csv')

# Drop unnecessary columns in df_test to match df_balanced
df_test = df_test.drop(columns=["id", "proto", "service", "state", "label"], errors="ignore")

# Separate features and labels
X = df_balanced.drop(columns=["attack_cat"])  # Features
y = df_balanced["attack_cat"]  # Labels

# Split into train (70%) and validation (30%) sets
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=42, stratify=y)

# Separate test-set features and labels
X_test = df_test.drop(columns=["attack_cat"])
y_test = df_test["attack_cat"]

# Scale numerical features
scaler = MinMaxScaler()
X_train = scaler.fit_transform(X_train)
X_val = scaler.transform(X_val)
X_test = scaler.transform(X_test)

print("Data preparation complete!")
print(f"Training set: {X_train.shape}\nValidation set: {X_val.shape}\nTesting set: {X_test.shape}")

Data preparation complete!
Training set: (252000, 39)
Validation set: (108000, 39)
Testing set: (82332, 39)
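
Before relying on the scaled test set, it can help to confirm that the testing frame has the same feature columns, in the same order, as the balanced data, and that it contains no categories the models were never trained on. The sketch below is one way to do that check. The helper name `check_compatibility` and the toy frames are assumptions made for illustration only (including the `Normal` label in the toy test frame); they are not part of the notebook's pipeline.

```python
import pandas as pd


def check_compatibility(train_df, test_df, label_col="attack_cat"):
    """Compare feature columns and label values of two frames (hypothetical helper)."""
    train_features = [c for c in train_df.columns if c != label_col]
    test_features = [c for c in test_df.columns if c != label_col]
    return {
        # Features the model was trained on but the test frame lacks
        "missing_in_test": [c for c in train_features if c not in test_features],
        # Features present only in the test frame
        "extra_in_test": [c for c in test_features if c not in train_features],
        # True only if both frames list the same features in the same order
        "same_order": train_features == test_features,
        # Label values that appear only in the test frame
        "unseen_labels": sorted(set(test_df[label_col]) - set(train_df[label_col])),
    }


# Toy frames for illustration only; the values are made up.
toy_train = pd.DataFrame(
    {"dur": [0.1, 0.2], "sbytes": [10, 20], "attack_cat": ["DoS", "Worms"]}
)
toy_test = pd.DataFrame(
    {
        "sbytes": [5, 6, 7],
        "dur": [0.3, 0.4, 0.5],
        "rate": [1, 2, 3],
        "attack_cat": ["DoS", "Normal", "Worms"],
    }
)

compat = check_compatibility(toy_train, toy_test)
print(compat)
```

In the notebook this would be called as `check_compatibility(df_balanced, df_test)` right after the `drop` call, and any non-empty list in the result would need to be resolved before scaling.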


## **Step 2: Train Machine Learning Models**
- Choose models for evaluation:
  - **XGBoost**
  - **Random Forest** (Baseline)
  - **Neural Networks (MLPClassifier)**
- Train each model using the **balanced dataset**.

In [6]:
# Encode labels
le = LabelEncoder()
y_train_enc = le.fit_transform(y_train)
y_val_enc = le.transform(y_val)

# Train the XGBoost model
xgb_model = xgb.XGBClassifier()
xgb_model.fit(X_train, y_train_enc)

# Predict on validation set
y_val_pred = xgb_model.predict(X_val)

# Evaluate the XGBoost model
accuracy = accuracy_score(y_val_enc, y_val_pred)
print(f'XGBoost Validation Accuracy: {accuracy:.2f}')
print('\nXGBoost Classification Report:\n', classification_report(y_val_enc, y_val_pred, target_names=le.classes_))
print('\nXGBoost Confusion Matrix:\n', confusion_matrix(y_val_enc, y_val_pred))

XGBoost Validation Accuracy: 0.38

XGBoost Classification Report:
                 precision    recall  f1-score   support

      Analysis       0.17      0.19      0.18     12000
      Backdoor       0.16      0.18      0.17     12000
           DoS       0.25      0.03      0.06     12000
      Exploits       0.62      0.78      0.69     12000
       Fuzzers       0.88      0.40      0.55     12000
       Generic       1.00      0.98      0.99     12000
Reconnaissance       0.61      0.21      0.31     12000
     Shellcode       0.17      0.24      0.20     12000
         Worms       0.16      0.35      0.22     12000

      accuracy                           0.38    108000
     macro avg       0.45      0.38      0.38    108000
  weighted avg       0.45      0.38      0.38    108000


XGBoost Confusion Matrix:
 [[ 2264  2084   180   452    70     0   246  2656  4048]
 [ 2074  2172   184   448    57     1   252  2741  4071]
 [ 1520  1515   410  3310    93     4   211  1969  2968]
 [ 

In [7]:
# Train the Random Forest model
rf_model = RandomForestClassifier(n_estimators=100, random_state=42)
rf_model.fit(X_train, y_train_enc)

# Predict on validation set
y_val_pred_rf = rf_model.predict(X_val)

# Evaluate the Random Forest model
rf_accuracy = accuracy_score(y_val_enc, y_val_pred_rf)
print(f'Random Forest Validation Accuracy: {rf_accuracy:.2f}')
print('\nRandom Forest Classification Report:\n', classification_report(y_val_enc, y_val_pred_rf, target_names=le.classes_))
print('\nRandom Forest Confusion Matrix:\n', confusion_matrix(y_val_enc, y_val_pred_rf))

Random Forest Validation Accuracy: 0.37

Random Forest Classification Report:
                 precision    recall  f1-score   support

      Analysis       0.16      0.19      0.18     12000
      Backdoor       0.17      0.19      0.18     12000
           DoS       0.18      0.13      0.15     12000
      Exploits       0.63      0.72      0.67     12000
       Fuzzers       0.61      0.43      0.51     12000
       Generic       1.00      0.98      0.99     12000
Reconnaissance       0.33      0.26      0.29     12000
     Shellcode       0.17      0.20      0.18     12000
         Worms       0.16      0.20      0.18     12000

      accuracy                           0.37    108000
     macro avg       0.38      0.37      0.37    108000
  weighted avg       0.38      0.37      0.37    108000


Random Forest Confusion Matrix:
 [[ 2312  2220  1065   382   544     0  1130  2118  2229]
 [ 2274  2285  1069   360   521     1  1094  2149  2247]
 [ 1626  1589  1537  2805   422     5   78

In [None]:
# Train the MLPClassifier model
mlp_model = MLPClassifier(hidden_layer_sizes=(100,), max_iter=500, random_state=42)
mlp_model.fit(X_train, y_train_enc)

# Predict on validation set
y_val_pred_mlp = mlp_model.predict(X_val)

# Evaluate the MLPClassifier model
mlp_accuracy = accuracy_score(y_val_enc, y_val_pred_mlp)
print(f'MLPClassifier Validation Accuracy: {mlp_accuracy:.2f}')
print('\nMLPClassifier Classification Report:\n', classification_report(y_val_enc, y_val_pred_mlp, target_names=le.classes_))
print('\nMLPClassifier Confusion Matrix:\n', confusion_matrix(y_val_enc, y_val_pred_mlp))

MLPClassifier Validation Accuracy: 0.34

MLPClassifier Classification Report:
                 precision    recall  f1-score   support

      Analysis       0.79      0.01      0.01     12000
      Backdoor       0.00      0.00      0.00     12000
           DoS       0.00      0.00      0.00     12000
      Exploits       0.51      0.74      0.60     12000
       Fuzzers       0.82      0.25      0.38     12000
       Generic       0.97      0.96      0.96     12000
Reconnaissance       0.38      0.09      0.15     12000
     Shellcode       0.15      0.09      0.11     12000
         Worms       0.17      0.90      0.28     12000

      accuracy                           0.34    108000
     macro avg       0.42      0.34      0.28    108000
  weighted avg       0.42      0.34      0.28    108000


MLPClassifier Confusion Matrix:
 [[   64     1     0   464    45    10    38  1080 10298]
 [    0     0     0   438     6    14    71  1086 10385]
 [    3     0     0  3102    52    79   43

  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))


## **Step 3: Evaluate Model Performance**
- Use the **testing set** to evaluate trained models.
- Compute performance metrics:
  - **Accuracy**
  - **Precision, Recall, F1-score**
  - **Confusion Matrix**
  - **AUC-ROC Curve**
- Compare results to assess improvement.
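
The metrics listed above could be computed on the testing set along the following lines. This is a minimal sketch, not the notebook's actual evaluation: `evaluate_on_test` is a hypothetical helper, the demo data is synthetic, and the `Normal` rows in the demo are only there to show how labels unseen by the encoder might be filtered out. It also assumes the model exposes `predict_proba` and that every training class appears at least once among the retained test rows, since `roc_auc_score` can otherwise fail.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score, roc_auc_score
from sklearn.preprocessing import LabelEncoder


def evaluate_on_test(model, X_test, y_test, label_encoder):
    """Score a fitted classifier on held-out data (hypothetical helper)."""
    y_test = np.asarray(y_test)
    # Keep only rows whose label the encoder saw during training
    known = np.isin(y_test, label_encoder.classes_)
    X_known = np.asarray(X_test)[known]
    y_true = label_encoder.transform(y_test[known])

    y_pred = model.predict(X_known)
    y_proba = model.predict_proba(X_known)
    labels = np.arange(len(label_encoder.classes_))

    return {
        "n_dropped": int((~known).sum()),
        "accuracy": accuracy_score(y_true, y_pred),
        "macro_f1": f1_score(y_true, y_pred, labels=labels, average="macro", zero_division=0),
        "confusion_matrix": confusion_matrix(y_true, y_pred, labels=labels),
        # One-vs-rest AUC averaged over classes
        "roc_auc_ovr": roc_auc_score(
            y_true, y_proba, multi_class="ovr", average="macro", labels=labels
        ),
    }


# Synthetic three-class demo; class names are borrowed from the reports above.
X_demo, y_int = make_classification(
    n_samples=300, n_features=8, n_informative=5, n_classes=3, random_state=0
)
y_demo = np.array(["DoS", "Exploits", "Generic"])[y_int]

le_demo = LabelEncoder().fit(y_demo[:200])
clf_demo = RandomForestClassifier(n_estimators=20, random_state=0)
clf_demo.fit(X_demo[:200], le_demo.transform(y_demo[:200]))

y_holdout = y_demo[200:].copy()
y_holdout[:5] = "Normal"  # illustrative label the encoder has never seen

test_report = evaluate_on_test(clf_demo, X_demo[200:], y_holdout, le_demo)
print({k: v for k, v in test_report.items() if k != "confusion_matrix"})
```

With the notebook's objects, the equivalent call would be `evaluate_on_test(xgb_model, X_test, y_test, le)`, repeated for `rf_model` and `mlp_model` so the three models are scored on identical rows.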

## **Step 4: Model Validation**
- Perform **Cross-validation** on the training set.
- Conduct **Hyperparameter tuning** (GridSearchCV, RandomizedSearchCV).
- Check for **overfitting/underfitting**.
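
A possible shape for this validation step is sketched below. It assumes a Random Forest wrapped in a `Pipeline` with `MinMaxScaler`, so that the scaler is refitted inside each fold instead of seeing the held-out fold in advance. The data is synthetic, and the parameter ranges, fold count, and `f1_macro` scoring are illustrative choices, not values taken from the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, StratifiedKFold, cross_validate
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the training data
X_cv, y_cv = make_classification(
    n_samples=400, n_features=10, n_informative=6, n_classes=3, random_state=42
)

pipeline = Pipeline(
    [
        ("scale", MinMaxScaler()),
        ("clf", RandomForestClassifier(n_estimators=25, random_state=42)),
    ]
)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

# Cross-validation: a large train/validation gap hints at overfitting,
# while low scores on both sides hint at underfitting.
cv_scores = cross_validate(
    pipeline, X_cv, y_cv, cv=cv, scoring="f1_macro", return_train_score=True
)
train_mean = cv_scores["train_score"].mean()
valid_mean = cv_scores["test_score"].mean()
print(f"train macro F1: {train_mean:.2f}, validation macro F1: {valid_mean:.2f}")

# Randomized hyperparameter search over an illustrative grid
param_distributions = {
    "clf__n_estimators": [25, 50],
    "clf__max_depth": [None, 5, 10],
    "clf__min_samples_leaf": [1, 5],
}
search = RandomizedSearchCV(
    pipeline,
    param_distributions,
    n_iter=4,
    scoring="f1_macro",
    cv=cv,
    random_state=42,
)
search.fit(X_cv, y_cv)
print("best params:", search.best_params_)
print(f"best cross-validated macro F1: {search.best_score_:.2f}")
```

Applied to the notebook, the unscaled training features and encoded labels would take the place of `X_cv` and `y_cv`. `GridSearchCV` accepts the same pipeline and a `param_grid` in place of `param_distributions` when an exhaustive search is affordable.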

## **Step 5: Conclusion**
- Summarize model performances.
- Identify the best-performing model for **network intrusion detection**.
- Discuss whether **balancing improved model performance**.
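
One way to start the summary is to tabulate the figures already printed in the classification reports above. The sketch below uses the validation accuracy and macro-average F1 from those reports. These are validation results, not test-set results, and ranking by macro F1 is an assumption made here for illustration. Whether balancing helped cannot be read from this table alone, because no baseline trained on the unbalanced data appears in this section.

```python
import pandas as pd

# Validation metrics copied from the classification reports in Step 2
validation_summary = pd.DataFrame(
    {
        "accuracy": [0.38, 0.37, 0.34],
        "macro_f1": [0.38, 0.37, 0.28],
    },
    index=["XGBoost", "Random Forest", "MLPClassifier"],
).sort_values("macro_f1", ascending=False)

print(validation_summary)

# Top row after sorting by macro F1
best_model = validation_summary.index[0]
print("Highest validation macro F1:", best_model)
```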