### Accessing CML data

The CML's database of intracranial and scalp EEG is indexed in pandas dataframes. Each row of a dataframe records all the pertinent data about one experimental session. The databases are organized around experimental protocols. For example, the **'r1'** database contains all DARPA RAM data, while the **'pyfr'** database includes all free-recall intracranial data collected in the years prior to RAM.

Let's load the RAM database to get a better sense of these formats. We're going to use **CMLReaders**, which is a custom library with helper functions to load data for any experiments run by the CML. If you don't already have CMLReaders installed, please follow the instructions here: https://github.com/pennmem/cmlreaders

In [1]:
#First, our import statements. The CMLReader class is your gateway to all experimental data, including electrodes and EEG. The get_data_index function specifically loads experimental databases. 
from cmlreaders import CMLReader, get_data_index

#The "r1" database corresponds to all of the RAM subjects. Use "pyfr" for the pre-RAM iEEG data.
df = get_data_index("r1")

In [2]:
#This dataframe contains all the information about every experimental session collected in the RAM project
df[:10]

Unnamed: 0,Recognition,all_events,contacts,experiment,import_type,localization,math_events,montage,original_experiment,original_session,pairs,ps4_events,session,subject,subject_alias,system_version,task_events
0,,protocols/r1/subjects/FBG490/experiments/EFRCo...,protocols/r1/subjects/FBG490/localizations/0/m...,EFRCourierOpenLoop,build,0,,0,,,protocols/r1/subjects/FBG490/localizations/0/m...,,0,FBG490,FBG490,4.0,protocols/r1/subjects/FBG490/experiments/EFRCo...
1,,protocols/r1/subjects/FBG490/experiments/EFRCo...,protocols/r1/subjects/FBG490/localizations/0/m...,EFRCourierOpenLoop,build,0,,0,,,protocols/r1/subjects/FBG490/localizations/0/m...,,1,FBG490,FBG490,4.0,protocols/r1/subjects/FBG490/experiments/EFRCo...
2,,protocols/r1/subjects/FBG490/experiments/EFRCo...,protocols/r1/subjects/FBG490/localizations/0/m...,EFRCourierOpenLoop,build,0,,0,,,protocols/r1/subjects/FBG490/localizations/0/m...,,2,FBG490,FBG490,4.0,protocols/r1/subjects/FBG490/experiments/EFRCo...
3,,protocols/r1/subjects/FBG490/experiments/EFRCo...,protocols/r1/subjects/FBG490/localizations/0/m...,EFRCourierReadOnly,build,0,,0,,,protocols/r1/subjects/FBG490/localizations/0/m...,,0,FBG490,FBG490,4.0,protocols/r1/subjects/FBG490/experiments/EFRCo...
4,,protocols/r1/subjects/FBG491/experiments/EFRCo...,protocols/r1/subjects/FBG491/localizations/0/m...,EFRCourierOpenLoop,build,0,,0,,,protocols/r1/subjects/FBG491/localizations/0/m...,,1,FBG491,FBG491,4.0,protocols/r1/subjects/FBG491/experiments/EFRCo...
5,,protocols/r1/subjects/FBG491/experiments/EFRCo...,protocols/r1/subjects/FBG491/localizations/0/m...,EFRCourierOpenLoop,build,0,,0,,,protocols/r1/subjects/FBG491/localizations/0/m...,,2,FBG491,FBG491,4.0,protocols/r1/subjects/FBG491/experiments/EFRCo...
6,,protocols/r1/subjects/R1001P/experiments/FR1/s...,protocols/r1/subjects/R1001P/localizations/0/m...,FR1,build,0,protocols/r1/subjects/R1001P/experiments/FR1/s...,0,,0.0,protocols/r1/subjects/R1001P/localizations/0/m...,,0,R1001P,R1001P,,protocols/r1/subjects/R1001P/experiments/FR1/s...
7,,protocols/r1/subjects/R1001P/experiments/FR1/s...,protocols/r1/subjects/R1001P/localizations/0/m...,FR1,build,0,protocols/r1/subjects/R1001P/experiments/FR1/s...,0,,1.0,protocols/r1/subjects/R1001P/localizations/0/m...,,1,R1001P,R1001P,,protocols/r1/subjects/R1001P/experiments/FR1/s...
8,,protocols/r1/subjects/R1001P/experiments/FR2/s...,protocols/r1/subjects/R1001P/localizations/0/m...,FR2,build,0,protocols/r1/subjects/R1001P/experiments/FR2/s...,0,,0.0,protocols/r1/subjects/R1001P/localizations/0/m...,,0,R1001P,R1001P,,protocols/r1/subjects/R1001P/experiments/FR2/s...
9,,protocols/r1/subjects/R1001P/experiments/FR2/s...,protocols/r1/subjects/R1001P/localizations/0/m...,FR2,build,0,protocols/r1/subjects/R1001P/experiments/FR2/s...,0,,1.0,protocols/r1/subjects/R1001P/localizations/0/m...,,1,R1001P,R1001P,,protocols/r1/subjects/R1001P/experiments/FR2/s...
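Because the index is an ordinary pandas dataframe, the usual filtering and grouping tools apply to it. The sketch below is a minimal illustration that runs without access to the CML data: `index_df` is a small stand-in built from a few of the rows shown above (only the `subject`, `experiment`, `session`, `montage`, and `localization` columns), not the real index. With the real data, the same expressions should work on the `df` returned by `get_data_index("r1")`.

```python
import pandas as pd

# Stand-in for the real index: a few rows copied from the output above,
# restricted to the columns needed to select sessions.
index_df = pd.DataFrame({
    "subject": ["FBG490", "FBG490", "FBG490", "FBG490",
                "R1001P", "R1001P", "R1001P", "R1001P"],
    "experiment": ["EFRCourierOpenLoop", "EFRCourierOpenLoop",
                   "EFRCourierOpenLoop", "EFRCourierReadOnly",
                   "FR1", "FR1", "FR2", "FR2"],
    "session": [0, 1, 2, 0, 0, 1, 0, 1],
    "montage": [0] * 8,
    "localization": [0] * 8,
})

# Which experiments appear in the index?
experiments = sorted(index_df["experiment"].unique())

# Which subjects have at least one FR1 session?
fr1_subjects = sorted(
    index_df.loc[index_df["experiment"] == "FR1", "subject"].unique()
)

# How many sessions were recorded for each (subject, experiment) pair?
session_counts = index_df.groupby(["subject", "experiment"]).size()

print(experiments)
print(fr1_subjects)
print(session_counts)
```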


### Load data from an example subject
Here, let's go through an example of loading experimental events and EEG from one subject.

In [3]:
#First, our import statements
from cmlreaders import CMLReader, get_data_index

#The "r1" database corresponds to all of the RAM subjects
df = get_data_index("r1")

#Specify which subject and experiment we want
sub = 'R1001P'
exp = 'FR1'

#Find out the sessions, localization, and montage for this subject
sub_df = df[(df['subject']==sub) & (df['experiment']==exp)]      #rows for this subject and experiment
sessions = list(sub_df['session'])
mont = int(sub_df.iloc[0]['montage'])      #note that *usually* mont and loc will be 0.
loc = int(sub_df.iloc[0]['localization'])

In [4]:
print(f'{sub} sessions: {sessions}')
print(f'{sub} montage: {mont}')
print(f'{sub} localization: {loc}')

R1001P sessions: [0, 1]
R1001P montage: 0
R1001P localization: 0


<i>Usually, montage and localization are both zero, meaning a subject had only one surgery and the subset of recorded electrodes did not change. But not always!</i>

<b>Montage:</b> Refers to the set of a subject's electrodes that were recorded in a given experimental session.

<b>Localization:</b> A subject gets a new localization if electrodes were reimplanted in another surgery. The electrodes may therefore be in different places altogether.
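Since montage and localization can differ between sessions, it can be safer to read them per session rather than only from the first row. The following is a minimal sketch, not part of CMLReaders: `session_params` is a hypothetical helper, `index_df` is a stand-in for the real index, and `SUBJ_A` is an invented subject whose second session uses a different montage and localization.

```python
import pandas as pd

# Stand-in index. R1001P matches the output above; SUBJ_A is a made-up
# subject used only to illustrate a montage/localization change.
index_df = pd.DataFrame({
    "subject": ["R1001P", "R1001P", "SUBJ_A", "SUBJ_A"],
    "experiment": ["FR1", "FR1", "FR1", "FR1"],
    "session": [0, 1, 0, 1],
    "montage": [0, 0, 0, 1],
    "localization": [0, 0, 0, 1],
})


def session_params(index_df, subject, experiment):
    """Return one (session, montage, localization) tuple per matching row."""
    rows = index_df[(index_df["subject"] == subject)
                    & (index_df["experiment"] == experiment)]
    return [(int(r.session), int(r.montage), int(r.localization))
            for r in rows.itertuples(index=False)]


# Subjects whose montage is not the same in every session
n_montages = index_df.groupby("subject")["montage"].nunique()
multi_montage = list(n_montages[n_montages > 1].index)

print(session_params(index_df, "R1001P", "FR1"))
print(session_params(index_df, "SUBJ_A", "FR1"))
print(multi_montage)
```

Each tuple holds the session, montage, and localization values for one row of the index, so each session can be loaded with its own montage and localization.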

This subject completed two sessions of FR1 and had only one montage/localization. Let's load data from the first session. First, we'll need to create an instance of the 'CMLReader' class, which is the class for accessing any CML data. Think of it as a "finder" for any kind of experimental data. At a minimum, you'll need to give it a subject and experiment for it to find anything.

Once you have constructed a reader, data loading behaves the same as with the CMLLoad reader used elsewhere in this workshop. To see which arguments each loader accepts, use the '?' operator on loader functions (such as reader.load_eeg?).

In [5]:
#For first session...
reader = CMLReader(sub, exp, sessions[0], montage=mont, localization=loc)   #reader for loading CML data

#What kind of data can the reader get for us?
reader.reader_names.keys()

dict_keys(['voxel_coordinates', 'jacksheet', 'classifier_excluded_leads', 'good_leads', 'leads', 'area', 'mni_coordinates', 'electrode_coordinates', 'prior_stim_results', 'target_selection_table', 'experiment_log', 'session_json', 'all_events', 'events', 'math_events', 'ps4_events', 'task_events', 'used_classifier', 'baseline_classifier', 'sources', 'eeg', 'pairs', 'contacts', 'matlab_contacts', 'matlab_pairs', 'localization', 'electrode_categories', 'classifier_summary', 'session_summary', 'math_summary'])