# EEG Eye State Classification with XGBoost Classifier

### Objectives:

This notebook performs the following tasks:

1. **Data Loading and Preprocessing**:
   - Loads the EEG Eye State dataset from the UCI Machine Learning Repository.
   - Displays the dimensions of the features set and the target variable.
   - Previews the first few rows of the dataset.
   - Analyzes the unique classes and class distribution of the target variable.
   - Splits the dataset into training, validation, and test sets.
   

2. **Hyperparameter Tuning for XGBoost**:
   - Defines a range of values for the hyperparameters **`max_depth`**, **`learning_rate`**, **`n_estimators`**, **`subsample`**, and **`colsample_bytree`**.
   - Trains and evaluates multiple XGBoost models for each combination of hyperparameters, recording **accuracy** and **F1 scores** for both **training** and **validation** sets.
   - Outputs the evaluation metrics so the models' performance can be compared during tuning.

3. **Selecting the Optimal XGBoost Model**:
   - Selects the best-performing models based on the hyperparameter tuning results and makes predictions on the **test** set.
   - Calculates and prints the **accuracy** and **F1 scores** on the **test** set.

4. **Conclusion**:
   - Finalizes the optimal model that performs well on the **test** set.
   - Completes the project of building an **XGBoost Classifier model** for the **EEG Eye State dataset**.

### This project uses data from [UCI Machine Learning Repository](https://archive.ics.uci.edu/dataset/264/eeg+eye+state). This dataset is licensed under a [Creative Commons Attribution 4.0 International (CC BY 4.0) license](https://creativecommons.org/licenses/by/4.0/legalcode). The dataset was converted into numpy arrays for building a machine learning model.

In [2]:
import numpy as np
from ucimlrepo import fetch_ucirepo

# fetch dataset
eeg_eye_state = fetch_ucirepo(id=264)

# data (as pandas dataframes)
X = eeg_eye_state.data.features
y = eeg_eye_state.data.targets

# Print the dimensions of the features set (X) and the target variable (y)
print(X.shape, y.shape)

(14980, 14) (14980, 1)


In [3]:
# Display the first few rows of the features set (X)
X.head()

| | AF3 | F7 | F3 | FC5 | T7 | P7 | O1 | O2 | P8 | T8 | FC6 | F4 | F8 | AF4 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 4329.23 | 4009.23 | 4289.23 | 4148.21 | 4350.26 | 4586.15 | 4096.92 | 4641.03 | 4222.05 | 4238.46 | 4211.28 | 4280.51 | 4635.9 | 4393.85 |
| 1 | 4324.62 | 4004.62 | 4293.85 | 4148.72 | 4342.05 | 4586.67 | 4097.44 | 4638.97 | 4210.77 | 4226.67 | 4207.69 | 4279.49 | 4632.82 | 4384.1 |
| 2 | 4327.69 | 4006.67 | 4295.38 | 4156.41 | 4336.92 | 4583.59 | 4096.92 | 4630.26 | 4207.69 | 4222.05 | 4206.67 | 4282.05 | 4628.72 | 4389.23 |
| 3 | 4328.72 | 4011.79 | 4296.41 | 4155.9 | 4343.59 | 4582.56 | 4097.44 | 4630.77 | 4217.44 | 4235.38 | 4210.77 | 4287.69 | 4632.31 | 4396.41 |
| 4 | 4326.15 | 4011.79 | 4292.31 | 4151.28 | 4347.69 | 4586.67 | 4095.9 | 4627.69 | 4210.77 | 4244.1 | 4212.82 | 4288.21 | 4632.82 | 4398.46 |


In [4]:
# Display the first few entries of the target variable (y)
y.head()

| | eyeDetection |
|---|---|
| 0 | 0 |
| 1 | 0 |
| 2 | 0 |
| 3 | 0 |
| 4 | 0 |


In [5]:
# Get the unique values in the target variable (y) along with their counts to understand class distribution
np.unique(y, return_counts=True)

(array([0, 1]), array([8257, 6723]))
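
The counts above can be turned into class proportions to judge how imbalanced the target is. The snippet below is a minimal sketch that hard-codes the counts from the output above; the variable names are illustrative and not part of the original notebook.

```python
# Class counts copied from the np.unique output above
class_counts = {0: 8257, 1: 6723}

total = sum(class_counts.values())  # 14980 samples in total

# Fraction of samples that belongs to each class
class_fractions = {label: count / total for label, count in class_counts.items()}

for label, fraction in class_fractions.items():
    print(f"Class {label}: {class_counts[label]} samples ({fraction:.1%})")
```

The classes split roughly 55% to 45%, so the target is only mildly imbalanced. Stratified splitting keeps this ratio approximately the same in each subset.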

In [11]:
from sklearn.model_selection import train_test_split

# Convert X and y to numpy arrays for further processing
X = X.values
y = y.values.flatten()

# First split: Train + Validation and Test sets (70% for training/validation, 30% for test)
X_train_val, X_test, y_train_val, y_test = train_test_split(X, y, test_size=0.3, random_state=42, stratify=y)

# Second split: Training and Validation sets (70% of train_val data for training, 30% for validation)
X_train, X_val, y_train, y_val = train_test_split(X_train_val, y_train_val, test_size=0.3, random_state=42, stratify=y_train_val)

# Output the shapes of the data splits for verification
print("Training set shape:", X_train.shape, y_train.shape)
print("Validation set shape:", X_val.shape, y_val.shape)
print("Testing set shape:", X_test.shape, y_test.shape)

Training set shape: (7340, 14) (7340,)
Validation set shape: (3146, 14) (3146,)
Testing set shape: (4494, 14) (4494,)
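
Because the second split is applied only to the 70% left over after the first, the effective proportions of the full dataset are not 70/30. The sketch below works them out from the set sizes printed above; the names are illustrative.

```python
# Set sizes copied from the printed shapes above
split_sizes = {"train": 7340, "validation": 3146, "test": 4494}

total = sum(split_sizes.values())  # should equal the 14980 rows in the dataset

# Share of the full dataset that ends up in each split
split_fractions = {name: size / total for name, size in split_sizes.items()}

for name, fraction in split_fractions.items():
    print(f"{name}: {split_sizes[name]} rows ({fraction:.1%} of the data)")
```

This works out to roughly 49% training, 21% validation, and 30% test.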


In [26]:
import xgboost as xgb
from sklearn.metrics import accuracy_score, f1_score

# Hyperparameters to test
param_grid = [
    {"max_depth": None, "learning_rate": 0.1, "n_estimators": 100, "verbosity": 1, "random_state": 42,
     "subsample": 0.8, "colsample_bytree": 0.8},
    {"max_depth": 4, "learning_rate": 0.5, "n_estimators": 70, "verbosity": 1, "random_state": 42,
     "subsample": 0.7, "colsample_bytree": 0.9},
    {"max_depth": 7, "learning_rate": 0.05, "n_estimators": 150, "verbosity": 1, "random_state": 42,
     "subsample": 0.9, "colsample_bytree": 0.7},
    {"max_depth": 3, "learning_rate": 0.01, "n_estimators": 300, "verbosity": 1, "random_state": 42,
     "subsample": 0.6, "colsample_bytree": 0.85},
    {"max_depth": 5, "learning_rate": 0.2, "n_estimators": 200, "verbosity": 1, "random_state": 42,
     "subsample": 0.7, "colsample_bytree": 0.8},
    {"max_depth": 6, "learning_rate": 0.15, "n_estimators": 250, "verbosity": 1, "random_state": 42,
     "subsample": 0.75, "colsample_bytree": 0.75},
    {"max_depth": 2, "learning_rate": 0.3, "n_estimators": 120, "verbosity": 1, "random_state": 42,
     "subsample": 0.8, "colsample_bytree": 0.8}
]

# Dictionary to store trained models
models = {}

# Loop over hyperparameter sets and train/evaluate models
for i, params in enumerate(param_grid):
    # Train the XGBoost model
    model = xgb.XGBClassifier(**params)
    model.fit(X_train, y_train)

    # Store the model in a dictionary
    models[f"xgboost_model_{i+1}"] = model
    print(f"Model {i+1} trained and stored in dictionary\n")

    # Evaluate on training data
    y_train_pred = model.predict(X_train)
    train_accuracy = accuracy_score(y_train, y_train_pred)
    train_f1 = f1_score(y_train, y_train_pred)

    # Evaluate on validation data
    y_val_pred = model.predict(X_val)
    val_accuracy = accuracy_score(y_val, y_val_pred)
    val_f1 = f1_score(y_val, y_val_pred)

    # Print the evaluation metrics
    print(f"Model {i+1} | Training Accuracy: {train_accuracy:.4f} | Training F1-score: {train_f1:.4f}")
    print(f"Model {i+1} | Validation Accuracy: {val_accuracy:.4f} | Validation F1-score: {val_f1:.4f}\n")

Model 1 trained and stored in dictionary

Model 1 | Training Accuracy: 0.9568 | Training F1-score: 0.9510
Model 1 | Validation Accuracy: 0.8957 | Validation F1-score: 0.8817

Model 2 trained and stored in dictionary

Model 2 | Training Accuracy: 0.9574 | Training F1-score: 0.9521
Model 2 | Validation Accuracy: 0.8919 | Validation F1-score: 0.8794

Model 3 trained and stored in dictionary

Model 3 | Training Accuracy: 0.9655 | Training F1-score: 0.9610
Model 3 | Validation Accuracy: 0.8986 | Validation F1-score: 0.8848

Model 4 trained and stored in dictionary

Model 4 | Training Accuracy: 0.7779 | Training F1-score: 0.7266
Model 4 | Validation Accuracy: 0.7641 | Validation F1-score: 0.7126

Model 5 trained and stored in dictionary

Model 5 | Training Accuracy: 0.9969 | Training F1-score: 0.9965
Model 5 | Validation Accuracy: 0.9199 | Validation F1-score: 0.9106

Model 6 trained and stored in dictionary

Model 6 | Training Accuracy: 1.0000 | Training F1-score: 1.0000
Model 6 | Validatio

Based on the evaluation results, model selection aims to balance performance on the **training** and **validation** sets while avoiding overfitting. Models **5** and **6** show near-perfect **training** **accuracy** and **F1-scores**, but their **validation** scores fall well short of their training scores, which suggests overfitting. Model **4**, on the other hand, scores much lower on both the **training** and **validation** sets (**validation accuracy: 0.7641**), which suggests underfitting: it fails to capture enough of the data's complexity. Models **1**, **2**, **3**, and **7** demonstrate more balanced performance, with reasonable gaps between **training** and **validation** results. Among these, **Model 3** emerges as a strong candidate, with a high **validation accuracy** (**0.8986**) and **F1-score** (**0.8848**), while maintaining solid **training** performance (**Accuracy: 0.9655**, **F1-score: 0.9610**). It achieves a good trade-off between capturing complexity and generalizing to unseen data. Next, **Model 3** and the other balanced candidates (Models **1**, **2**, and **7**) are evaluated on the **test** data to finalize the results.
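
One way to make the comparison above concrete is to compute the gap between training and validation accuracy for each model. The sketch below hard-codes the accuracies that are fully visible in the output above (Models 1 to 5); Models 6 and 7 are left out because their validation figures are not shown here. The dictionary and variable names are illustrative and not part of the original notebook.

```python
# (training accuracy, validation accuracy) copied from the printed output above
reported_accuracy = {
    1: (0.9568, 0.8957),
    2: (0.9574, 0.8919),
    3: (0.9655, 0.8986),
    4: (0.7779, 0.7641),
    5: (0.9969, 0.9199),
}

# Gap between training and validation accuracy; a larger gap hints at overfitting
accuracy_gap = {
    model_id: round(train_acc - val_acc, 4)
    for model_id, (train_acc, val_acc) in reported_accuracy.items()
}

for model_id, (train_acc, val_acc) in reported_accuracy.items():
    print(
        f"Model {model_id} | train {train_acc:.4f} | val {val_acc:.4f} "
        f"| gap {accuracy_gap[model_id]:.4f}"
    )
```

The gap should be read alongside the absolute validation score rather than on its own: a small gap combined with low accuracy, as for Model 4, points to underfitting rather than good generalization.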

In [35]:
# List of chosen models to evaluate on the test data
chosen_models = ["xgboost_model_1", "xgboost_model_2", "xgboost_model_3", "xgboost_model_7"]

# Loop through each chosen model to make predictions on test set and evaluate performance
for model_name in chosen_models:
    # Predict the labels for the test dataset using the current model
    y_test_pred = models[model_name].predict(X_test)

    # Calculate the accuracy of the model on the test dataset
    test_accuracy = accuracy_score(y_test, y_test_pred)

    # Calculate the F1-score of the model on the test dataset
    test_f1 = f1_score(y_test, y_test_pred)

    # Extract the model number from the key (also works for multi-digit numbers, unlike model_name[-1])
    model_number = model_name.rsplit("_", 1)[-1]

    # Print the test accuracy and F1-score for the current model
    print(f"Model {model_number} | Test Accuracy: {test_accuracy:.4f} | Test F1-score: {test_f1:.4f}\n")

Model 1 | Test Accuracy: 0.9001 | Test F1-score: 0.8862

Model 2 | Test Accuracy: 0.8947 | Test F1-score: 0.8814

Model 3 | Test Accuracy: 0.9039 | Test F1-score: 0.8906

Model 7 | Test Accuracy: 0.8084 | Test F1-score: 0.7784



The chosen models were then evaluated on the test dataset. As expected, **Model 3** was the top performer, achieving a **Test Accuracy** of **0.9039** and a **Test F1-score** of **0.8906**, which indicates good generalization. **Model 1** followed closely with a **Test Accuracy** of **0.9001** and a **Test F1-score** of **0.8862**, and **Model 2** reached a **Test Accuracy** of **0.8947** and a **Test F1-score** of **0.8814**. **Model 7** trailed the other three by a wide margin, with a **Test Accuracy** of **0.8084** and a **Test F1-score** of **0.7784**. **Model 3** is therefore selected as the final model: it has the highest accuracy and F1-score of the four candidates on the test set, in line with its validation results.
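
For reference, both metrics reported throughout can be computed from confusion-matrix counts: accuracy is the share of correct predictions, and the F1-score is the harmonic mean of precision and recall for the positive class (label 1, which is also what `f1_score` uses by default for binary targets). The sketch below uses small made-up labels purely for illustration; they are not taken from the EEG data, and the helper name `accuracy_and_f1` is hypothetical.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Compute accuracy and binary F1-score from two non-empty, equal-length label sequences."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # F1 = 2*TP / (2*TP + FP + FN); set to 0.0 when the denominator is zero
    denominator = 2 * tp + fp + fn
    f1 = 2 * tp / denominator if denominator else 0.0
    return accuracy, f1


# Toy labels, made up for illustration only
y_true_toy = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred_toy = [0, 1, 1, 1, 0, 0, 1, 0]

toy_accuracy, toy_f1 = accuracy_and_f1(y_true_toy, y_pred_toy)
print(f"Accuracy: {toy_accuracy:.4f} | F1-score: {toy_f1:.4f}")
```

Unlike accuracy, the F1-score ignores true negatives, which is why the two numbers can diverge for the same model.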

This concludes the process of building the XGBoost Classifier model.