## 1. Imports & Data/Model Load
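Load the saved model and the test split used during training. Regenerating the split here only yields the same samples if the seed and splitting procedure match the training run exactly, so for reproducible reporting persist `test_X.npy` and `test_y.npy` from the training notebook and reload them here.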
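
A minimal sketch of the persistence step, meant to run at the end of the training notebook. The helper name `save_test_split` and the `out_dir` parameter are illustrative assumptions, not part of the original training code; only the file names `test_X.npy` and `test_y.npy` come from this notebook.

```python
import os
import tempfile

import numpy as np


def save_test_split(test_X, test_y, out_dir="."):
    """Write the held-out split to disk so evaluation reuses the exact same samples.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    os.makedirs(out_dir, exist_ok=True)
    np.save(os.path.join(out_dir, "test_X.npy"), test_X)
    np.save(os.path.join(out_dir, "test_y.npy"), test_y)


# Demo on small synthetic arrays (not real EEG data), written to a temp directory.
rng = np.random.default_rng(0)
demo_X = rng.normal(size=(4, 64, 32, 1)).astype(np.float32)  # (samples, freq, time, 1)
demo_y = np.array([0, 1, 2, 1])                              # integer class ids
demo_dir = tempfile.mkdtemp()
save_test_split(demo_X, demo_y, demo_dir)

# Round trip: reload and confirm nothing changed.
reloaded_X = np.load(os.path.join(demo_dir, "test_X.npy"))
reloaded_y = np.load(os.path.join(demo_dir, "test_y.npy"))
print(reloaded_X.shape, reloaded_y.shape)
```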

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, accuracy_score, f1_score, classification_report
import tensorflow as tf

In [None]:
# Load arrays (ensure these were saved in model_training.ipynb)
test_X = np.load('test_X.npy')  # shape: (samples, freq, time, 1)
test_y = np.load('test_y.npy')  # expected: integer class ids, shape (samples,), as the argmax comparison below requires
model = tf.keras.models.load_model('eeg_personid_model.h5')
print('Loaded:', test_X.shape, test_y.shape)

## 2. Predictions & Metrics

In [None]:
pred_proba = model.predict(test_X)    # per-class probabilities, shape: (samples, n_classes)
pred = np.argmax(pred_proba, axis=1)  # most probable class id for each sample
acc = accuracy_score(test_y, pred)
f1 = f1_score(test_y, pred, average='weighted')
print('Test Accuracy:', acc)
print('Weighted F1:', f1)
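
To put the accuracy above in context, it can help to compute two trivial baselines: uniform random guessing and always predicting the most frequent class. The sketch below is one way to do that; the function name `chance_and_majority_baselines` and its `n_classes` parameter are hypothetical, and the demo labels are synthetic.

```python
import numpy as np


def chance_and_majority_baselines(y_true, n_classes=None):
    """Return (uniform-chance accuracy, majority-class accuracy) for a label vector.

    Hypothetical helper. If n_classes is None, the number of distinct labels
    present in y_true is used, which undercounts if some classes are absent.
    """
    y_true = np.asarray(y_true)
    labels, counts = np.unique(y_true, return_counts=True)
    if n_classes is None:
        n_classes = len(labels)
    chance = 1.0 / n_classes
    majority = counts.max() / counts.sum()
    return chance, majority


# Synthetic example: three classes, class 0 appears in half of the samples.
demo_y = np.array([0, 0, 0, 1, 1, 2])
chance, majority = chance_and_majority_baselines(demo_y)
print(f"chance={chance:.3f}, majority={majority:.3f}")
```

In this notebook the call would presumably be `chance_and_majority_baselines(test_y, n_classes=109)`. With an imbalanced test set the majority-class figure is the stricter baseline to beat.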

## 3. Confusion Matrix (Top 10 Classes)

In [None]:
cm = confusion_matrix(test_y, pred)  # full matrix over all classes
unique, counts = np.unique(test_y, return_counts=True)
topk_idx = np.argsort(counts)[-10:]  # positions of the 10 most frequent classes in the test set

In [None]:
topk_classes = unique[topk_idx]
mask = np.isin(test_y, topk_classes)
cm_small = confusion_matrix(test_y[mask], pred[mask], labels=topk_classes)
plt.figure(figsize=(8,6))
plt.imshow(cm_small, interpolation='nearest', cmap='Blues')
plt.title('Confusion Matrix (Top 10 Classes)')
plt.colorbar()
plt.xlabel('Predicted')
plt.ylabel('True')
# Tick labels are shifted by 1, which assumes 0-based class ids and subjects numbered from 1
plt.xticks(range(len(topk_classes)), topk_classes+1, rotation=90)
plt.yticks(range(len(topk_classes)), topk_classes+1)
plt.show()
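
Raw counts can be hard to compare when classes have different numbers of test windows. One option, sketched here under the assumption that per-class recall is the quantity of interest, is to normalize each row so it sums to 1. The helper name `row_normalize` is hypothetical and the demo matrix is synthetic.

```python
import numpy as np


def row_normalize(cm):
    """Divide each row of a confusion matrix by its sum (true-class support).

    Rows with no samples are left as all zeros instead of producing NaN.
    """
    cm = np.asarray(cm, dtype=float)
    row_sums = cm.sum(axis=1, keepdims=True)
    return np.divide(cm, row_sums, out=np.zeros_like(cm), where=row_sums > 0)


# Synthetic 3x3 example; the last class has no test samples.
demo_cm = np.array([[8, 2, 0],
                    [1, 3, 0],
                    [0, 0, 0]])
demo_norm = row_normalize(demo_cm)
print(demo_norm)
```

Passing `row_normalize(cm_small)` to `plt.imshow` in place of `cm_small` would show the diagonal as per-class recall.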

## 4. Classification Report (Top 10 Classes)

In [None]:
print(classification_report(test_y[mask], pred[mask], labels=topk_classes, zero_division=0))

## 5. Discussion
Replace placeholder text with your analysis.

- **Dataset**: EEG Motor Movement/Imagery (109 subjects).
- **Input Representation**: 2s windows, mel-spectrogram (n_mels=64), normalized per window.
- **Model**: Two Conv2D blocks -> GRU(128) -> GRU(64) -> Dense softmax.
- **Metrics**: Report accuracy & weighted F1 above. Compare with baselines (e.g., random guess ~1/109 ≈ 0.009).
- **Confusion Matrix**: Concentrate on top classes to visualize separability; consider class imbalance.
- **Error Analysis**: Identify subjects that are frequently confused with one another; this may indicate similar signal patterns or insufficient window diversity.
- **Improvements**: Multi-channel inputs, CSP spatial filtering, longer windows, augmentation (noise, slight time-warp), hyperparameter tuning, transformer encoders.
- **Limitations**: Averaging the signal across channels discards spatial information, which may cap identification performance.
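
As a starting point for the error analysis mentioned above, the largest off-diagonal entries of the confusion matrix point to the subject pairs that are confused most often. The sketch below is one possible approach; `top_confused_pairs` is a hypothetical helper and the demo matrix is synthetic, not a result from this model.

```python
import numpy as np


def top_confused_pairs(cm, labels, k=5):
    """Return up to k (true_label, predicted_label, count) tuples with the
    largest off-diagonal counts, in descending order of count."""
    cm = np.asarray(cm).copy()
    np.fill_diagonal(cm, 0)                          # ignore correct predictions
    flat_order = np.argsort(cm, axis=None)[::-1][:k] # largest counts first
    rows, cols = np.unravel_index(flat_order, cm.shape)
    return [
        (int(labels[r]), int(labels[c]), int(cm[r, c]))
        for r, c in zip(rows, cols)
        if cm[r, c] > 0                              # drop pairs that never occur
    ]


# Synthetic 3-class example (rows = true class, columns = predicted class).
demo_cm = np.array([[5, 2, 0],
                    [1, 6, 0],
                    [0, 3, 4]])
demo_labels = np.array([0, 1, 2])
pairs = top_confused_pairs(demo_cm, demo_labels, k=2)
print(pairs)
```

In this notebook the inputs would presumably be the full matrix `cm` and the sorted class ids from `np.unique(test_y)`, since `confusion_matrix` orders rows and columns by sorted label when `labels` is not given.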