## This notebook explains the different recordings, the data format and how to read the files

<span style="color:blue"> Before doing this I recommend initialising a git repository and setting up a conda environment with the required python packages  </span> 

In [5]:
## import the necessary python packages 
import os 
import pandas as pd 
import numpy as np 

In [6]:
#select the file path - this changes the working directory to the folder with all the files you want to analyse
os.chdir('/home/melissa/PREPROCESSING/GRIN2B/GRIN2B_numpy/')

#### For each animal (e.g. animal_id = 130) there is one recording file (e.g. 130_GRIN2B.npy) and two brainstate files.

* Each recording file holds the raw data in npy format (a NumPy array). Each brainstate file is a pandas table saved in pickle format (similar to an Excel spreadsheet)
* The brainstate files have a brainstate number (0, 1, 2) which corresponds to a sleep stage, where:
 * 0 == WAKE
 * 1 == Non-REM
 * 2 == REM
* The brainstate files also have start times and end times, which go up in 5 second increments and split the recording file into 5 second epochs of each brainstate
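As a minimal sketch of the coding scheme above, the cell below maps the brainstate numbers to readable labels and counts the epochs of each stage. It runs on a small made-up table, not on the real files; the names `BRAINSTATE_LABELS` and `toy_brainstates` are assumptions used for illustration only.

```python
import pandas as pd

# Hypothetical lookup table for the brainstate numbers described above
BRAINSTATE_LABELS = {0: "WAKE", 1: "Non-REM", 2: "REM"}

# Made-up brainstate table with the same column names as the real files
toy_brainstates = pd.DataFrame({
    "brainstate": [0, 0, 1, 1, 2],
    "start_epoch": [0, 5, 10, 15, 20],
    "end_epoch": [5, 10, 15, 20, 25],
})

# Add a human-readable label for each epoch
toy_brainstates["label"] = toy_brainstates["brainstate"].map(BRAINSTATE_LABELS)

# Length of every epoch in seconds (should always be 5)
epoch_lengths = toy_brainstates["end_epoch"] - toy_brainstates["start_epoch"]

# Number of epochs scored as each brainstate
epoch_counts = toy_brainstates["label"].value_counts().to_dict()
print(epoch_counts)
```

Checking that every epoch is 5 seconds long is a cheap way to catch a brainstate file that was exported incorrectly.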


In [8]:
#load all the necessary files for one day of one animal 
animal_130 = np.load('130_GRIN2B.npy')
br_1_130 = pd.read_pickle('130_BL1.pkl')

In [9]:
#view raw recording
animal_130

array([[   0,  -13, 1341, ..., 2780, 2846, 2978],
       [   0,  -13, 1332, ..., 2657, 2689, 2706],
       [   0,  -13, 1355, ..., 2752, 2848, 2965],
       ...,
       [   0,  -13, 1324, ..., 2603, 2703, 2829],
       [   0,  -13, 1285, ..., 2644, 2674, 2697],
       [   0,  -13, 1323, ..., 2730, 2826, 2958]], dtype=int16)

In [10]:
#brainstate 
br_1_130

Unnamed: 0,brainstate,start_epoch,end_epoch
0,0,0,5
1,0,5,10
2,0,10,15
3,0,15,20
4,0,20,25
...,...,...,...
17275,0,86375,86380
17276,0,86380,86385
17277,0,86385,86390
17278,0,86390,86395


Each animal has its own recording 'start' and 'end' values, which mark the part of the recording that the automatic sleep scorer scored. These values are sample indices into the recording, not seconds.

To look up the values for each animal, go to: https://github.com/melissafasol/GRIN2B/blob/main/scripts/GRIN2B_constants.py

In [11]:
start = 18088897
end = 39723456

In [16]:
#save a new object which only contains the recording within this time frame
sliced_recording = animal_130[:, start:end]
sliced_recording

array([[2763, 2790, 2837, ..., 2719, 2715, 2731],
       [2713, 2705, 2699, ..., 2697, 2704, 2695],
       [2736, 2769, 2797, ..., 2741, 2758, 2773],
       ...,
       [2685, 2711, 2727, ..., 2685, 2705, 2711],
       [2727, 2720, 2715, ..., 2714, 2711, 2701],
       [2739, 2729, 2727, ..., 2713, 2713, 2709]], dtype=int16)

In the array above, the colon before the comma selects all rows (channels), and the slice after the comma selects the columns (samples) between the start and end values. We have selected the data between the start and end times for all channels, so there should be 16 rows, one for each channel.
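To check this kind of slicing without loading the full recording, here is a small sketch on a made-up array. The names `toy_recording`, `toy_start` and `toy_end` are assumptions for illustration; the real recording has the same layout (channels as rows, samples as columns) but is far longer.

```python
import numpy as np

# Made-up recording: 16 channels (rows) x 1000 samples (columns)
toy_recording = np.arange(16 * 1000, dtype=np.int16).reshape(16, 1000)

# Hypothetical start and end sample indices
toy_start, toy_end = 200, 700

# ':' keeps every channel, 'toy_start:toy_end' keeps only the samples in that range
toy_slice = toy_recording[:, toy_start:toy_end]

# The number of rows is unchanged, the number of columns is end - start
print(toy_slice.shape)  # (16, 500)
```

Printing `.shape` after every slice is a quick way to confirm that you still have 16 channels and the number of samples you expect.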

## Looking at particular sleep stages

If we want to look at the time values for a particular sleep stage, first select the sleep stage you want to look at in the brainstate file. Let's use REM (2) as an example.

In [20]:
#this uses the pandas loc method to select the rows where the brainstate column equals 2 (REM)
rem = br_1_130.loc[br_1_130['brainstate'] == 2]
rem

Unnamed: 0,brainstate,start_epoch,end_epoch
963,2,4815,4820
964,2,4820,4825
965,2,4825,4830
966,2,4830,4835
967,2,4835,4840
...,...,...,...
16132,2,80660,80665
16347,2,81735,81740
16348,2,81740,81745
16349,2,81745,81750


If we want to have a look at these time values for REM within the recording we need to select the corresponding data indices


<span style="color:blue">*It is important here to take the sampling rate into consideration. The taini tec recording devices record 250.4 samples every second. Therefore, if we want to skip to 4815 seconds after the start time, we need to multiply this value by 250.4 and convert the result to a whole number, because array indices must be integers.* </span> 
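The conversion from seconds to sample indices can be wrapped in a small helper so the sampling rate is written down in one place only. This is a sketch, not part of the original scripts: `SAMPLING_RATE` and `seconds_to_samples` are hypothetical names, and the helper truncates with `int()` in the same way as the cell that follows.

```python
SAMPLING_RATE = 250.4  # samples per second, as stated above


def seconds_to_samples(seconds, sampling_rate=SAMPLING_RATE):
    """Convert a time in seconds to a sample index, truncating towards zero."""
    return int(seconds * sampling_rate)


# First REM epoch from the table above: 4815 s to 4820 s
rem_start_sample = seconds_to_samples(4815)
rem_end_sample = seconds_to_samples(4820)

# Number of samples in one 5 second epoch
samples_per_epoch = rem_end_sample - rem_start_sample
print(rem_start_sample, rem_end_sample, samples_per_epoch)
```

At this sampling rate each 5 second epoch works out to 1252 samples.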

In [24]:
rem_time_start = int(4815*250.4)
rem_time_end = int(4820*250.4)
sliced_recording[:, rem_time_start:rem_time_end]

array([[2806, 2799, 2790, ..., 2828, 2829, 2821],
       [2720, 2722, 2724, ..., 2710, 2712, 2709],
       [2801, 2801, 2793, ..., 2771, 2777, 2779],
       ...,
       [2737, 2740, 2738, ..., 2712, 2720, 2723],
       [2696, 2700, 2701, ..., 2680, 2683, 2682],
       [2766, 2757, 2750, ..., 2767, 2777, 2772]], dtype=int16)

I hope this makes sense. From here, I would look at taking the times from the `start_epoch` and `end_epoch` columns, so you do not have to hard code the times you want to look at every time. I would recommend learning <span style="color:red">for loops</span> and <span style="color:red">list comprehensions</span> for this.
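One possible way to do this with a list comprehension is sketched below. It uses made-up stand-ins for the recording and the brainstate table so it runs on its own; `epochs_for_state`, `toy_recording` and `toy_brainstates` are hypothetical names, and the column names are the ones shown in the brainstate file above.

```python
import numpy as np
import pandas as pd

SAMPLING_RATE = 250.4  # samples per second, from the text above

# Made-up stand-ins for the real data: 16 channels, 30 seconds of samples
n_samples = int(30 * SAMPLING_RATE)
toy_recording = np.zeros((16, n_samples), dtype=np.int16)
toy_brainstates = pd.DataFrame({
    "brainstate": [0, 2, 2, 1, 2, 0],
    "start_epoch": [0, 5, 10, 15, 20, 25],
    "end_epoch": [5, 10, 15, 20, 25, 30],
})


def epochs_for_state(recording, brainstates, state, sampling_rate=SAMPLING_RATE):
    """Return a list of (channels x samples) arrays, one per epoch of `state`."""
    # Keep only the rows scored as the requested brainstate
    selected = brainstates.loc[brainstates["brainstate"] == state]
    # Convert each epoch's start and end time (seconds) to sample indices and slice
    return [
        recording[:, int(start * sampling_rate):int(end * sampling_rate)]
        for start, end in zip(selected["start_epoch"], selected["end_epoch"])
    ]


rem_epochs = epochs_for_state(toy_recording, toy_brainstates, state=2)
print(len(rem_epochs), rem_epochs[0].shape)
```

The same result can be built with an ordinary for loop that appends each slice to a list; the list comprehension is just a shorter way of writing it. With the real data you would pass `sliced_recording` and `br_1_130` instead of the toy objects.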

Also try plotting this raw data using mne:

In [26]:
import mne
import matplotlib.pyplot


ch_names = ['S1Tr_RIGHT', 'EMG_RIGHT', 'M2_FrA_RIGHT','M2_ant_RIGHT','M1_ant_RIGHT', 'V2ML_RIGHT',
            'V1M_RIGHT', 'S1HL_S1FL_RIGHT', 'V1M_LEFT', 'V2ML_LEFT', 'S1HL_S1FL_LEFT',
            'M1_ant_LEFT','M2_ant_LEFT','M2_FrA_LEFT', 'EMG_LEFT', 'S1Tr_LEFT']

ch_types = ['eeg', 'emg', 'eeg', 'eeg', 'eeg', 'eeg',
           'eeg', 'eeg', 'eeg', 'eeg', 'eeg',
           'eeg', 'eeg', 'eeg', 'emg', 'eeg']


raw_info = mne.create_info(ch_names, sfreq = 250.4, ch_types=ch_types)

os.chdir('/home/melissa/PREPROCESSING/GRIN2B/GRIN2B_numpy/')
test_file = np.load('130_GRIN2B.npy')

raw = mne.io.RawArray(test_file, raw_info)
raw.plot(scalings = 'auto')

Creating RawArray with float64 data, n_channels=16, n_times=69377198
    Range : 0 ... 69377197 =      0.000 ... 277065.483 secs
Ready.
