# 📌 Overview

This notebook implements the methods described in the paper **"P300-based Character Recognition using CNN and SVM"** by **Sourav Kundu** and **Samit Ari**.

---

# 📌 Dataset

You can download the dataset from the following link:  
[BCI Competition III - Dataset II](https://www.bbci.de/competition/download/competition_iii/albany/BCI_Comp_III_Wads_2004.zip)

---

## 📊 Data Collection

- **Subjects**: Two participants, five sessions each.
- **Task**: Focus attention on one of the 36 characters displayed in a 6×6 matrix on the screen.

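### Matrix Display:
- **Blank Matrix**: Before each character epoch, the matrix is shown for 2.5 seconds with no row or column intensified.
- **Intensifications**: Rows and columns are intensified in random order, each for 100 ms, followed by a 75 ms blank period.
- **Repetition**: Each set of 12 intensifications (6 rows and 6 columns) is repeated 15 times per character, totaling 180 intensifications per character epoch.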
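
The counts and durations in the list above can be checked with a few lines of arithmetic. This is a minimal sketch: the constant names are made up for illustration, the 6 rows and 6 columns are inferred from the 36-character matrix, and the total assumes that one epoch consists only of the initial blank period plus the intensifications.

```python
# Illustrative constants taken from the description above (names are hypothetical).
N_ROWS, N_COLS = 6, 6          # assumed 6x6 layout for the 36 characters
REPETITIONS = 15               # repetitions per character
FLASH_MS, BLANK_MS = 100, 75   # intensification and following blank period
INITIAL_BLANK_MS = 2500        # blank matrix before the intensifications

stimuli_per_repetition = N_ROWS + N_COLS                    # 12 rows/columns
intensifications = stimuli_per_repetition * REPETITIONS     # 180 per character
stimulation_ms = intensifications * (FLASH_MS + BLANK_MS)   # 31500 ms
epoch_ms = INITIAL_BLANK_MS + stimulation_ms                # 34000 ms

print(f"{intensifications} intensifications, "
      f"{stimulation_ms / 1000} s of stimulation, "
      f"{epoch_ms / 1000} s including the initial blank")
```

Integer milliseconds are used throughout so the results are exact rather than subject to floating-point rounding.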

### Data Details:
- **Bandpass Filtered**: 0.1–60 Hz.
- **Digitized at**: **240 Hz** (sampled 240 times per second).
- **Recorded with**: **64 EEG channels**.
- **Data Files**: Four MATLAB `*.mat` files: one training (85 characters) and one test (100 characters) for each of the two subjects (A and B).
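
Because the signal is digitized at 240 Hz, the stimulus timings translate directly into sample counts. The sketch below shows the conversion; `ms_to_samples` is a hypothetical helper introduced here for illustration, not part of the dataset or of any library.

```python
SAMPLING_RATE_HZ = 240  # from the data details above


def ms_to_samples(duration_ms, fs=SAMPLING_RATE_HZ):
    """Convert a duration in milliseconds to a whole number of samples."""
    return round(duration_ms * fs / 1000)


flash_samples = ms_to_samples(100)      # one intensification -> 24 samples
blank_samples = ms_to_samples(75)       # blank after a flash  -> 18 samples
onset_to_onset = ms_to_samples(175)     # flash + blank        -> 42 samples

print(flash_samples, blank_samples, onset_to_onset)
```

So consecutive intensifications start 42 samples apart, which is useful to keep in mind when cutting post-stimulus windows that overlap.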

---

## 📌 Contents of the .mat File

For each sample in the Signal matrix, associated events are coded using the following variables:

- **Flashing**:  
  - `1` when row/column was intensified, `0` otherwise.
  
- **StimulusCode**:  
  - `0` when no row/column is being intensified (i.e., matrix is blank).  
  - `1…6` for intensified columns (1 being the left-most column).  
  - `7…12` for intensified rows (7 being the upper-most row).  

- **StimulusType**:  
  - `0` when no row/column is being intensified or the intensified row/column does not contain the desired character.  
  - `1` when the intensified row/column contains the desired character.  
  - This variable allows easy access to labels in the training sets, separating responses that contained the desired character from those that did not. It can also be determined using the **StimulusCode** along with the **TargetChar** that the user focused on.

- **TargetChar**:  
  - The correct character label for each character epoch in the training data.
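
The relationship between **StimulusCode**, **TargetChar** and **StimulusType** described above can be sketched as follows. The character layout in `MATRIX` is an assumption based on the competition's dataset description and is not defined anywhere in this notebook; `target_codes` and `stimulus_type` are hypothetical helper names.

```python
# Assumed 6x6 speller layout, listed row by row (top row first).
MATRIX = [
    "ABCDEF",
    "GHIJKL",
    "MNOPQR",
    "STUVWX",
    "YZ1234",
    "56789_",
]


def target_codes(target_char):
    """Return (column_code, row_code) of a character: columns 1..6, rows 7..12."""
    if len(target_char) != 1:
        raise ValueError("target_char must be a single character")
    for row_index, row in enumerate(MATRIX):
        col_index = row.find(target_char)
        if col_index != -1:
            return col_index + 1, row_index + 7
    raise ValueError(f"{target_char!r} is not in the speller matrix")


def stimulus_type(stimulus_code, target_char):
    """Return 1 if the intensified row/column contains target_char, else 0."""
    if stimulus_code == 0:  # blank matrix, nothing intensified
        return 0
    return int(stimulus_code in target_codes(target_char))


print(target_codes("P"))      # column 4, row 9 under the assumed layout
print(stimulus_type(4, "P"))  # column 4 contains 'P'
print(stimulus_type(5, "P"))  # column 5 does not
```

Under this layout, exactly two of the twelve stimulus codes are targets for any character: one column and one row.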


In [1]:
from glob import glob
from pathlib import Path
import os
import mne
import scipy.io  # import the submodule explicitly so scipy.io.loadmat is available
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [2]:
DATA_DIR = Path('')
# glob returns files in arbitrary order; sort so the indices used below are deterministic.
DATA_PATHS = sorted(glob(str(DATA_DIR / 'BCI_Comp_III_Wads_2004' / '*.mat')))
ELOC_PATH = DATA_DIR / 'BCI_Comp_III_Wads_2004' / 'eloc64.txt'

In [3]:
Subject_A_Test = scipy.io.loadmat(DATA_PATHS[0])
Subject_A_Train = scipy.io.loadmat(DATA_PATHS[1])
Subject_B_Test = scipy.io.loadmat(DATA_PATHS[2])
Subject_B_Train = scipy.io.loadmat(DATA_PATHS[3])

In [4]:
print(Subject_A_Train.keys())

dict_keys(['__header__', '__version__', '__globals__', 'Signal', 'TargetChar', 'Flashing', 'StimulusCode', 'StimulusType'])
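
Since `Flashing` stays at `1` for every sample of an intensification, the start of each flash is a 0-to-1 transition. The sketch below finds those transitions on a small synthetic array; it assumes a 1-D array for a single character epoch, and `flash_onsets` is a hypothetical helper rather than something provided by the dataset or by any library.

```python
import numpy as np


def flash_onsets(flashing):
    """Return the sample indices where `flashing` switches from 0 to 1."""
    # Cast to a signed integer type so the difference cannot wrap around
    # if the array was loaded as an unsigned type.
    flashing = np.asarray(flashing).astype(int)
    # Prepend a 0 so a flash that starts at sample 0 is also detected.
    rising = np.diff(flashing, prepend=0) == 1
    return np.flatnonzero(rising)


# Synthetic example: two 3-sample flashes separated by blank samples.
toy_flashing = np.array([0, 1, 1, 1, 0, 0, 1, 1, 1, 0])
onsets = flash_onsets(toy_flashing)
print(onsets)  # [1 6]
```

The same indices can then be used to read `StimulusCode` and `StimulusType` at each onset.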


In [5]:
print('Header: ', Subject_A_Train['__header__'])
print('\nVersion: ', Subject_A_Train['__version__'])

Header:  b'MATLAB 5.0 MAT-file, Platform: PCWIN, Created on: Mon Nov 29 08:14:54 2004'

Version:  1.0


In [6]:
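# Read the channel labels from the electrode-location file. The label is the
# last tab-separated field; anything after the first '.' is stripped.
with open(ELOC_PATH, 'r') as f:
    channels = [line.split('\t')[-1].split('.')[0] for line in f.read().split('\n')]

# Keep the 64 channel labels and drop any trailing empty entry.
channels = channels[:64]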