# Tutorial for loading leaderboard competition data

The competition data comes in three parts. Please follow our data downloading guide [package](https://github.com/sylvchev/beetl-competition) to download them. The data are described at https://beetl.ai/data.

**Important** For Task 1, you are ONLY allowed to use data in 'SleepSource' and 'LeaderboardSleep'; we regard using data from outside these folders as cheating. For Task 2, you are ONLY allowed to use Cho 2017, BNCI 2014-001, PhysionetMI (see the motor imagery task start kits for how to download them with the MOABB API) and the data provided in 'leaderboardMI'; we regard using data from outside these data sets and folders as cheating. We will test run the code from top ranking teams in the final stage of the competition. Please fix your random seeds to make sure the experiments are reproducible.
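
As a minimal sketch of what fixing the random seeds could look like, the snippet below seeds Python's and NumPy's global generators. The helper name `fix_seeds` and the seed value are illustrative assumptions, not part of the competition code; if you train with a deep learning framework, its own generator needs seeding as well.

```python
import random

import numpy as np


def fix_seeds(seed: int = 0) -> None:
    """Seed Python's and NumPy's global random number generators."""
    random.seed(seed)
    np.random.seed(seed)
    # Deep learning frameworks keep their own generators; if you use one,
    # seed it here too (for example torch.manual_seed(seed) in PyTorch).


# Re-seeding with the same value reproduces the same random draws.
fix_seeds(42)
first = np.random.rand(3)
fix_seeds(42)
second = np.random.rand(3)
print(np.array_equal(first, second))  # True
```

Calling such a helper once at the top of a training script is usually enough for CPU-only code; some GPU operations can remain non-deterministic even with fixed seeds.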

## 1. SleepSource

- This folder contains the source sleep EEG data from the 25-59 age group, with training trials and labels.

## 2. LeaderboardSleep

- sleep_target: This folder contains 5 example subjects from the target group (age 60-80, 10 sessions), with labels.

- testing: This folder contains the leaderboard subjects 6-17 without labels; participants need to provide predictions for them.

Data information for the sleep data sets above:

sampling rate 100 Hz, 30 s windows, 2 channels (Fpz-Cz, Pz-Oz)
highpass: 0.5 Hz, lowpass: 100.0 Hz

labels:
- 'Sleep stage W': 0
- 'Sleep stage 1': 1
- 'Sleep stage 2': 2
- 'Sleep stage 3': 3
- 'Sleep stage 4': 4
- 'Sleep stage R': 5
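
The label mapping and window parameters above can be written down in code, for example to turn predicted class ids back into stage names. This is only a sketch: the names `SLEEP_LABELS`, `ID_TO_STAGE` and `stage_names` are illustrative assumptions and do not come from the competition package.

```python
# Label ids as listed above.
SLEEP_LABELS = {
    'Sleep stage W': 0,
    'Sleep stage 1': 1,
    'Sleep stage 2': 2,
    'Sleep stage 3': 3,
    'Sleep stage 4': 4,
    'Sleep stage R': 5,
}
# Inverse mapping, from class id to stage name.
ID_TO_STAGE = {label_id: name for name, label_id in SLEEP_LABELS.items()}

SFREQ_HZ = 100   # sampling rate
WINDOW_S = 30    # window length in seconds
N_SAMPLES = SFREQ_HZ * WINDOW_S  # samples per window and channel


def stage_names(label_ids):
    """Translate a sequence of class ids into stage names."""
    return [ID_TO_STAGE[int(label_id)] for label_id in label_ids]


print(N_SAMPLES)               # 3000
print(stage_names([0, 2, 5]))  # ['Sleep stage W', 'Sleep stage 2', 'Sleep stage R']
```

The 3000 samples per window match the last dimension of the sleep arrays, which have shape (trials, 2, 3000).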


## 3. leaderboardMI

There are five subjects for leaderboard testing: S1 and S2 are from data set A; S3, S4 and S5 are from data set B. We will release more data set details after the competition. Each subject folder has two subfolders, training and testing. Training folders contain 100 trials (S1, S2) or 120 trials (S3-S5) with labels, serving as target domain samples for that subject. Testing folders contain the trials that you should predict.

Data set A:

sampling rate 500 Hz, 4 s windows, 63 channels

Motor imagery labels:

- 'Lefthand': 0
- 'Righthand': 1
- 'Feet': 2
- 'Rest': 3

ch_names = ['Fp1', 'Fz', 'F3', 'F7', 'FT9', 'FC5', 'FC1', 'C3', 'T7',
            'TP9', 'CP5', 'CP1', 'Pz', 'P3', 'P7', 'O1', 'Oz',
            'O2', 'P4', 'P8', 'TP10', 'CP6', 'CP2', 'C4', 'T8',
            'FT10', 'FC6', 'FC2', 'F4', 'F8', 'Fp2', 'AF7', 'AF3',
            'AFz', 'F1', 'F5', 'FT7', 'FC3', 'FCz', 'C1', 'C5',
            'TP7', 'CP3', 'P1', 'P5', 'PO7', 'PO3', 'POz', 'PO4',
            'PO8', 'P6', 'P2', 'CPz', 'CP4', 'TP8', 'C6', 'C2',
            'FC4', 'FT8', 'F6', 'F2', 'AF4', 'AF8']
           
highpass: 1 Hz, lowpass: 100.0 Hz, notch filter: 50 Hz

Data set B: 

sampling rate 200 Hz, 4 s windows, 32 channels

Motor imagery labels:

- 'Lefthand': 0
- 'Righthand': 1
- 'Feet': 2
- 'Rest': 3

ch_names = ['Fp1', 'Fp2', 'F3',
            'Fz', 'F4', 'FC5', 'FC1', 'FC2', 'FC6', 'C5', 'C3',
            'C1', 'Cz', 'C2', 'C4', 'C6', 'CP5', 'CP3', 'CP1',
            'CPz', 'CP2', 'CP4', 'CP6', 'P7', 'P5', 'P3', 'P1', 'Pz',
            'P2', 'P4', 'P6', 'P8']
           
highpass: 1 Hz, lowpass: 100.0 Hz
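
Because the two motor imagery data sets differ in sampling rate and channel count, it can help to check loaded arrays against the figures above before training. The following is a sketch under that assumption; `MISpec`, `MI_SPECS`, `SUBJECT_TO_DATASET` and `check_trials` are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class MISpec:
    sfreq_hz: int    # sampling rate
    window_s: int    # trial length in seconds
    n_channels: int  # number of EEG channels

    @property
    def trial_shape(self):
        """Expected (channels, samples) shape of a single trial."""
        return (self.n_channels, self.sfreq_hz * self.window_s)


# Values taken from the data set descriptions above.
MI_SPECS = {'A': MISpec(500, 4, 63), 'B': MISpec(200, 4, 32)}
SUBJECT_TO_DATASET = {'S1': 'A', 'S2': 'A', 'S3': 'B', 'S4': 'B', 'S5': 'B'}
MI_LABELS = {'Lefthand': 0, 'Righthand': 1, 'Feet': 2, 'Rest': 3}


def check_trials(X, subject):
    """Raise ValueError if X is not (trials, channels, samples) for this subject."""
    spec = MI_SPECS[SUBJECT_TO_DATASET[subject]]
    if X.ndim != 3 or X.shape[1:] != spec.trial_shape:
        raise ValueError(
            f"{subject}: expected (n_trials, {spec.trial_shape[0]}, "
            f"{spec.trial_shape[1]}), got {X.shape}"
        )
    return X


# Synthetic stand-in for real trials, only to show the call.
check_trials(np.zeros((2, 32, 800), dtype=np.float32), 'S3')
print(MI_SPECS['A'].trial_shape)  # (63, 2000)
print(MI_SPECS['B'].trial_shape)  # (32, 800)
```

A model trained on data set A cannot be applied directly to data set B without resampling and channel selection, which is why the subject-to-data-set mapping is kept explicit here.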



In [3]:
import numpy as np
import pickle

## Loading Data in 'SleepSource' and 'LeaderboardSleep'
After you download the competition data, you can load it in the following way. The label files can be opened in the same manner.

In [4]:
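# Folder with the unlabelled leaderboard sleep data; adjust to your download location
savebase = 'D:\\leaderboardData\\LeaderboardSleep\\testing\\'
X_parts = []
for i_test in range(6, 18):  # leaderboard subjects 6-17
    print('subject ', i_test)
    # The files have a .npy extension but are read with pickle here
    with open(savebase + "leaderboard_s" + str(i_test) + "r1X.npy", 'rb') as fp:
        X0 = pickle.load(fp)
    print(X0.shape)  # (trials, channels, samples)
    X_parts.append(X0)
# Stack all subjects along the trial axis
X_test = np.concatenate(X_parts)
print('overall test size')
print(X_test.shape)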

subject  6
(1227, 2, 3000)
subject  7
(1216, 2, 3000)
subject  8
(1079, 2, 3000)
subject  9
(1076, 2, 3000)
subject  10
(1099, 2, 3000)
subject  11
(1086, 2, 3000)
subject  12
(696, 2, 3000)
subject  13
(1086, 2, 3000)
subject  14
(981, 2, 3000)
subject  15
(1295, 2, 3000)
subject  16
(1046, 2, 3000)
subject  17
(1207, 2, 3000)
overall test size
(13094, 2, 3000)
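
The loading pattern in this section can be wrapped in a small reusable helper. The sketch below assumes, as in the cell above, that each file holds a pickled NumPy array; `load_pickled` and `load_and_stack` are hypothetical helper names, and the demo writes synthetic stand-in files to a temporary directory rather than touching the competition data.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np


def load_pickled(path):
    """Read one pickled object (here: a NumPy array) from disk."""
    with open(path, 'rb') as fp:
        return pickle.load(fp)


def load_and_stack(paths):
    """Load several arrays and concatenate them along the trial axis."""
    return np.concatenate([load_pickled(p) for p in paths], axis=0)


# Demo with synthetic files that mimic the (trials, channels, samples) layout.
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for i, n_trials in enumerate([3, 5]):
        path = Path(tmp) / f"demo_{i}.npy"
        with open(path, 'wb') as fp:
            pickle.dump(np.zeros((n_trials, 2, 3000), dtype=np.float32), fp)
        paths.append(path)
    X_demo = load_and_stack(paths)

print(X_demo.shape)  # (8, 2, 3000)
```

Collecting the arrays in a list and concatenating once avoids copying the growing array on every iteration. Only unpickle files that come from a source you trust, since `pickle.load` can execute arbitrary code.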


## Loading Data in 'leaderboardMI' for Data set A (S1,S2)

In [5]:
pilotname = 'S1'  # 'S1' or 'S2'
savebase0 = 'D:\\leaderboardData\\leaderboardMI\\'
savebase = savebase0 + pilotname + '\\' + 'testing\\'

for i in range(5, 15):  # files race6 ... race15
    with open(savebase + "race" + str(i + 1) + "_padsData.npy", 'rb') as fp:
        Xt = pickle.load(fp)
    if i == 5:
        X0 = Xt
    else:
        X0 = np.concatenate((X0, Xt))
    print('run', i + 1)
    print(X0.shape)  # cumulative shape after adding this run

print('overall test size')
print(X0.shape)


run 6
(20, 63, 2000)
run 7
(40, 63, 2000)
run 8
(60, 63, 2000)
run 9
(80, 63, 2000)
run 10
(100, 63, 2000)
run 11
(120, 63, 2000)
run 12
(140, 63, 2000)
run 13
(160, 63, 2000)
run 14
(180, 63, 2000)
run 15
(200, 63, 2000)
overall test size
(200, 63, 2000)


## Loading Data in 'leaderboardMI' for Data set B (S3-S5)

In [6]:
pilotname = '3'  # '3', '4' or '5'
savebase0 = 'D:\\leaderboardData\\leaderboardMI\\'
savebase = savebase0 + 'S' + pilotname + '\\testing\\'

# Each data set B subject has a single testing file
with open(savebase + "testing_s" + pilotname + "X.npy", 'rb') as fp:
    X_test = pickle.load(fp)

print('overall test size')
print(X_test.shape)


overall test size
(200, 32, 800)
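
The cells above use hard-coded Windows paths. As a sketch of a more portable alternative, the file locations can be built with `pathlib`; the folder and file name patterns are taken from the cells above, while the function names and the `root` argument are illustrative assumptions.

```python
from pathlib import Path


def sleep_test_file(root, subject):
    """Testing file of one leaderboard sleep subject (6-17)."""
    return (Path(root) / 'LeaderboardSleep' / 'testing'
            / f'leaderboard_s{subject}r1X.npy')


def mi_a_test_files(root, pilot):
    """Testing files (runs 6-15) of a data set A subject, e.g. 'S1'."""
    folder = Path(root) / 'leaderboardMI' / pilot / 'testing'
    return [folder / f'race{run}_padsData.npy' for run in range(6, 16)]


def mi_b_test_file(root, pilot):
    """Testing file of a data set B subject, e.g. 'S3'."""
    number = pilot.lstrip('S')
    return (Path(root) / 'leaderboardMI' / pilot / 'testing'
            / f'testing_s{number}X.npy')


root = 'leaderboardData'  # adjust to your download location
print(sleep_test_file(root, 6).name)      # leaderboard_s6r1X.npy
print(len(mi_a_test_files(root, 'S1')))   # 10
print(mi_b_test_file(root, 'S3').name)    # testing_s3X.npy
```

The returned `Path` objects can be passed straight to `open(path, 'rb')` before calling `pickle.load`.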
