# Open MRI Datasets

----

For this workshop and the fMRI and DWI workshops that follow, we will be using a subset of a publicly available dataset, ds000030, from [openneuro.org](https://openneuro.org/datasets/ds000030). This dataset, like all others hosted on OpenNeuro, is structured according to BIDS.

## OpenNeuro
- client-side BIDS validation
- resumable uploads
- running BIDS apps

## Downloading Data

### Datalad

`Datalad` installs the dataset, which means that we get the "small" data (i.e. the text files) and the download instructions for the larger files. Once the dataset is installed, we can navigate it as if it were a file system and plan our analysis.

We'll switch to the terminal for this part.

Navigate to the folder where you'd like to download the dataset.

In [1]:
!cd ../../data && datalad install https://github.com/OpenNeuroDatasets/ds000030.git


In [None]:
!ls ../../data/ds000030

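Getting and dropping data: `datalad get` downloads the actual file content for a path, and `datalad drop` frees that local content again while keeping the file listing in place.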

In [None]:
!datalad get ../../data/ds000030/sub-10788  
!datalad drop ../../data/ds000030/sub-10788

Removing data: `datalad remove` uninstalls the dataset from your machine.

In [None]:
!datalad remove ../../data/ds000030

### Amazon Web Services (AWS)

In [None]:
!aws s3 ls --no-sign-request \
  s3://openneuro/ds000030/ds000030_R1.0.5/uncompressed/

In [None]:
!aws s3 sync --no-sign-request \
  s3://openneuro/ds000030/ds000030_R1.0.5/uncompressed \
  ../../data/ds000030 \
  --include '*' \
  --exclude 'derivatives/*' \
  --exclude 'phenotype/*' \
  --exclude 'sub-*'
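
The order of the filters in the `aws s3 sync` call above matters: the AWS CLI starts with every file included and lets filters that appear later override earlier ones. The sketch below is a simplified model of that rule built on Python's `fnmatch`, not the AWS CLI's actual implementation, and `is_synced` is a hypothetical helper written only for illustration.

```python
from fnmatch import fnmatch


def is_synced(key, filters):
    """Return True if `key` would be transferred under the given filters.

    Simplified model: everything is included by default, and the last
    filter whose pattern matches the key decides the outcome.
    """
    included = True
    for action, pattern in filters:
        if fnmatch(key, pattern):
            included = action == "include"
    return included


# Same filters, in the same order, as the sync command above
filters = [
    ("include", "*"),
    ("exclude", "derivatives/*"),
    ("exclude", "phenotype/*"),
    ("exclude", "sub-*"),
]

for key in [
    "participants.tsv",
    "phenotype/adhd.tsv",
    "sub-10159/func/sub-10159_task-rest_bold.nii.gz",
]:
    print(key, is_synced(key, filters))
```

Under this model only the top-level files survive, which is why the subject folder is fetched with a separate `sync` call afterwards.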

In [None]:
!aws s3 sync --no-sign-request \
  s3://openneuro/ds000030/ds000030_R1.0.5/uncompressed/sub-10159 \
  ../../data/ds000030/sub-10159

## Querying a BIDS Dataset

[pybids](https://bids-standard.github.io/pybids/) is a Python API for querying, summarizing and manipulating the BIDS folder structure.

In [3]:
from bids.layout import BIDSLayout

In [5]:
layout = BIDSLayout("../../data/ds000030")

Indexing a dataset can take a really long time, especially if you have many subjects, modalities, scan types, etc. `pybids` has an option to save the indexed results to a SQLite database. This database can then be re-used the next time you want to query the same dataset.

In [27]:
layout.save("../../data/ds000030/.db")

In [28]:
layout = BIDSLayout("../../data/ds000030", database_path="../../data/ds000030/.db")

The pybids layout object lets you query your BIDS dataset according to a number of parameters by using a `get_*()` method.  
We can get a list of the subjects we've downloaded from the dataset.

In [6]:
layout.get_subjects()

['10159',
 '10171',
 '10189',
 '10193',
 '10206',
 '10217',
 '10225',
 '10227',
 '10228',
 '10235',
 '10249',
 '10269',
 '10271',
 '10273',
 '10274',
 '10280',
 '10290',
 '10292',
 '10299',
 '10304',
 '10316',
 '10321',
 '10325',
 '10329',
 '10339',
 '10340',
 '10345',
 '10347',
 '10356',
 '10361',
 '10365',
 '10376',
 '10377',
 '10388',
 '10428',
 '10429',
 '10438',
 '10440',
 '10448',
 '10455',
 '10460',
 '10471',
 '10478',
 '10487',
 '10492',
 '10501',
 '10506',
 '10517',
 '10523',
 '10524',
 '10525',
 '10527',
 '10530',
 '10557',
 '10565',
 '10570',
 '10575',
 '10624',
 '10629',
 '10631',
 '10638',
 '10668',
 '10672',
 '10674',
 '10678',
 '10680',
 '10686',
 '10692',
 '10696',
 '10697',
 '10704',
 '10707',
 '10708',
 '10719',
 '10724',
 '10746',
 '10762',
 '10779',
 '10785',
 '10788',
 '10844',
 '10855',
 '10871',
 '10877',
 '10882',
 '10891',
 '10893',
 '10912',
 '10934',
 '10940',
 '10948',
 '10949',
 '10958',
 '10963',
 '10968',
 '10971',
 '10975',
 '10977',
 '10987',
 '10998',


To get a list of all of the files, just use `get()`. 

In [7]:
layout.get()

[<BIDSFile filename='/Users/michael/projects/teaching/carpentries/SDC-BIDS-IntroMRI/data/ds000030/CHANGES'>,
 <BIDSJSONFile filename='/Users/michael/projects/teaching/carpentries/SDC-BIDS-IntroMRI/data/ds000030/dataset_description.json'>,
 <BIDSDataFile filename='/Users/michael/projects/teaching/carpentries/SDC-BIDS-IntroMRI/data/ds000030/participants.tsv'>,
 <BIDSJSONFile filename='/Users/michael/projects/teaching/carpentries/SDC-BIDS-IntroMRI/data/ds000030/phenotype/acds_adult.json'>,
 <BIDSDataFile filename='/Users/michael/projects/teaching/carpentries/SDC-BIDS-IntroMRI/data/ds000030/phenotype/acds_adult.tsv'>,
 <BIDSJSONFile filename='/Users/michael/projects/teaching/carpentries/SDC-BIDS-IntroMRI/data/ds000030/phenotype/adhd.json'>,
 <BIDSDataFile filename='/Users/michael/projects/teaching/carpentries/SDC-BIDS-IntroMRI/data/ds000030/phenotype/adhd.tsv'>,
 <BIDSJSONFile filename='/Users/michael/projects/teaching/carpentries/SDC-BIDS-IntroMRI/data/ds000030/phenotype/admin.json'>,
 <B

There are many arguments we can use to filter down this list. Any BIDS-defined keyword can be passed on as a constraint. In `pybids`, these keywords are known as **entities**. For a complete list of possibilities:

In [8]:
layout.entities

{'subject': <bids.layout.models.Entity at 0x11f2fa850>,
 'session': <bids.layout.models.Entity at 0x11b04c490>,
 'task': <bids.layout.models.Entity at 0x11f2fa990>,
 'acquisition': <bids.layout.models.Entity at 0x11b04c150>,
 'ceagent': <bids.layout.models.Entity at 0x11b04cb10>,
 'reconstruction': <bids.layout.models.Entity at 0x11b04cb50>,
 'direction': <bids.layout.models.Entity at 0x11b04ce10>,
 'run': <bids.layout.models.Entity at 0x11b04cbd0>,
 'proc': <bids.layout.models.Entity at 0x11b04c790>,
 'modality': <bids.layout.models.Entity at 0x11b04c910>,
 'echo': <bids.layout.models.Entity at 0x11b048390>,
 'recording': <bids.layout.models.Entity at 0x11b048ed0>,
 'space': <bids.layout.models.Entity at 0x11b048410>,
 'suffix': <bids.layout.models.Entity at 0x11f134450>,
 'scans': <bids.layout.models.Entity at 0x11b0482d0>,
 'fmap': <bids.layout.models.Entity at 0x11b0483d0>,
 'datatype': <bids.layout.models.Entity at 0x11f1345d0>,
 'extension': <bids.layout.models.Entity at 0x11f134
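
Most of these entities correspond to the `key-value` pairs embedded in BIDS file names, plus the trailing suffix and extension. The sketch below pulls them out of a file name with plain string operations. It is a simplified illustration, not how `pybids` indexes files, and `parse_bids_name` is a hypothetical helper. Note that it returns the short keys used in file names (`sub`), whereas `pybids` exposes the long names (`subject`).

```python
def parse_bids_name(filename):
    """Split a BIDS file name into its key-value pairs, suffix and extension."""
    # Everything after the first dot is treated as the extension (e.g. nii.gz)
    stem, _, extension = filename.partition(".")
    # The last underscore-separated chunk is the suffix; the rest are key-value pairs
    *pairs, suffix = stem.split("_")
    entities = dict(pair.split("-", 1) for pair in pairs)
    entities["suffix"] = suffix
    entities["extension"] = "." + extension
    return entities


print(parse_bids_name("sub-10159_task-rest_bold.nii.gz"))
```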

For example, if we only want the file paths of all of our resting-state fMRI scans:

In [9]:
layout.get(datatype="func", suffix="bold", task="rest", extension=[".nii.gz"], return_type="file")

['/Users/michael/projects/teaching/carpentries/SDC-BIDS-IntroMRI/data/ds000030/sub-10159/func/sub-10159_task-rest_bold.nii.gz',
 '/Users/michael/projects/teaching/carpentries/SDC-BIDS-IntroMRI/data/ds000030/sub-10171/func/sub-10171_task-rest_bold.nii.gz',
 '/Users/michael/projects/teaching/carpentries/SDC-BIDS-IntroMRI/data/ds000030/sub-10189/func/sub-10189_task-rest_bold.nii.gz',
 '/Users/michael/projects/teaching/carpentries/SDC-BIDS-IntroMRI/data/ds000030/sub-10206/func/sub-10206_task-rest_bold.nii.gz',
 '/Users/michael/projects/teaching/carpentries/SDC-BIDS-IntroMRI/data/ds000030/sub-10217/func/sub-10217_task-rest_bold.nii.gz',
 '/Users/michael/projects/teaching/carpentries/SDC-BIDS-IntroMRI/data/ds000030/sub-10225/func/sub-10225_task-rest_bold.nii.gz',
 '/Users/michael/projects/teaching/carpentries/SDC-BIDS-IntroMRI/data/ds000030/sub-10227/func/sub-10227_task-rest_bold.nii.gz',
 '/Users/michael/projects/teaching/carpentries/SDC-BIDS-IntroMRI/data/ds000030/sub-10228/func/sub-10228_

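**EXERCISE**: Retrieve the file paths of any scan where the subject is '10292' or '50081' and the `RepetitionTime` is 2 seconds. Entity filters accept a list of values as well as a single value; the cell below runs the same query for the single subject '10159'.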

In [10]:
layout.get(subject="10159", RepetitionTime=2, return_type="file")

['/Users/michael/projects/teaching/carpentries/SDC-BIDS-IntroMRI/data/ds000030/sub-10159/func/sub-10159_task-bart_bold.nii.gz',
 '/Users/michael/projects/teaching/carpentries/SDC-BIDS-IntroMRI/data/ds000030/sub-10159/func/sub-10159_task-rest_bold.nii.gz',
 '/Users/michael/projects/teaching/carpentries/SDC-BIDS-IntroMRI/data/ds000030/sub-10159/func/sub-10159_task-scap_bold.nii.gz',
 '/Users/michael/projects/teaching/carpentries/SDC-BIDS-IntroMRI/data/ds000030/sub-10159/func/sub-10159_task-stopsignal_bold.nii.gz',
 '/Users/michael/projects/teaching/carpentries/SDC-BIDS-IntroMRI/data/ds000030/sub-10159/func/sub-10159_task-taskswitch_bold.nii.gz']

Let's save the first file from our list of file paths to a variable and pull the metadata from its associated JSON file using the `get_metadata()` method.

In [11]:
fmri_file = layout.get(subject="10159", RepetitionTime=2, return_type="file")[0]
layout.get_metadata(fmri_file)

{'AccelNumReferenceLines': 24,
 'AccelerationFactorPE': 2,
 'AcquisitionMatrix': '64/0/0/64',
 'CogAtlasID': 'trm_4d559bcd67c18',
 'CogPOID': '',
 'DeviceSerialNumber': '35343',
 'EPIFactor': 128,
 'EchoTime': 0.03,
 'EchoTrainLength': 1,
 'EffectiveEchoSpacing': 0.000395,
 'FlipAngle': 90,
 'ImageType': 'ORIGINAL/PRIMARY/M/ND/MOSAIC',
 'ImagingFrequency': 123249925,
 'InPlanePhaseEncodingDirection': 'COL',
 'Instructions': 'This task is the one where you score points by inflating balloons. You push the first button to inflate the balloon, and the second button to stop inflating and move on to the next one. The more you inflate the balloon the more points you’ll get, but if you inflate it too much the balloon will pop and you won’t get any points. There are two different colors of balloons, green and white. Green balloons give points, but white balloons don’t, so when you see a white balloon you can just inflate it until it goes away to move on to the next one. You only get a limited n

We can even collect the metadata for all of our fMRI scans into a list and convert it into a dataframe, with one row per scan and one column per metadata field.

In [12]:
import pandas as pd

metadata_list = []
all_fmri_files = layout.get(datatype="func", suffix="bold", return_type="file", extension=[".nii.gz"])
for fmri_file in all_fmri_files:
    fmri_metadata = layout.get_metadata(fmri_file)
    metadata_list.append(fmri_metadata)
df = pd.DataFrame.from_records(metadata_list)
df

|  | AccelNumReferenceLines | AccelerationFactorPE | AcquisitionMatrix | CogAtlasID | CogPOID | DeviceSerialNumber | EPIFactor | EchoTime | EchoTrainLength | EffectiveEchoSpacing | ... | SequenceVariant | SliceTiming | SoftwareVersions | TaskDescription | TaskFullName | TaskName | TaskParameters | TotalScanTimeSec | TransmitCoilName | VariableFlipAngleFlag |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 24.0 | 2.0 | 64/0/0/64 | trm_4d559bcd67c18 |  | 35343 | 128.0 | 0.03 | 1 | 0.000395 | ... | SK | [1.0025, 0, 1.0625, 0.06, 1.1225, 0.1175, 1.18... | syngo MR B15 | In the BART (Lejuez et al., 2002), participant... | Balloon Analog Risk Task (BART) | bart | {'ISI': 3, 'ITI': 2, 'mean_iti': 4, 'min_iti':... | 542.0 | Body | N |
| 1 | 24.0 | 2.0 | 64/0/0/64 | trm_4c8a834779883 | COGPO_00086 | 35343 | 128.0 | 0.03 | 1 | 0.000395 | ... | SK | [1.005, 0, 1.0625, 0.06, 1.1225, 0.12, 1.1825,... | syngo MR B15 | In the Resting scan, participants were asked t... | Resting State | rest |  | 312.0 | Body | N |
| 2 | 24.0 | 2.0 | 64/0/0/64 | trm_4f2453b806fe1 |  | 35343 | 128.0 | 0.03 | 1 | 0.000395 | ... | SK | [1.005, 0, 1.0625, 0.06, 1.1225, 0.1175, 1.18,... | syngo MR B15 | SCAP is a working memory task that tests the m... | Spatial Working Memory Capacity Tasks (SCAP) | scap | {'trigger_time': None} | 590.0 | Body | N |
| 3 | 24.0 | 2.0 | 64/0/0/64 | tsk_4a57abb949e1a |  | 35343 | 128.0 | 0.03 | 1 | 0.000395 | ... | SK | [1.0025, 0, 1.0625, 0.0575, 1.12, 0.1175, 1.18... | syngo MR B15 | The Stop-Signal Task measures response inhibit... | Stop-Signal Task | stopsignal | {'Settings': {'BSI': 1, 'ISI': 1.5, 'Ladder1 s... | 376.0 | Body | N |
| 4 | 24.0 | 2.0 | 64/0/0/64 | tsk_4a57abb949e8a | COGPO_00107 | 35343 | 128.0 | 0.03 | 1 | 0.000395 | ... | SK | [1.0025, 0, 1.0625, 0.0575, 1.12, 0.1175, 1.18... | syngo MR B15 | In the Task-Switching (TS) task, participants ... | Task Switching | taskswitch | {'button_set': 2, 'left_color': 'red', 'right_... | 424.0 | Body | N |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1999 | 24.0 | 2.0 | 64/0/0/64 | trm_4c8991e6e8597 | COGPO_00078 | 35426 | 128.0 | 0.03 | 1 | 0.000395 | ... | SK | [1.0025, 0, 1.0625, 0.0575, 1.1225, 0.1175, 1.... | syngo MR B17 | This task was run as a part of the Consortium ... | Paired Associates Memory Task - Retrieval | pamret | {'trigger_time': 8.2909} | 544.0 | Body | N |
| 2000 | 24.0 | 2.0 | 64/0/0/64 | trm_4c8a834779883 | COGPO_00086 | 35426 | 128.0 | 0.03 | 1 | 0.000395 | ... | SK | [1.005, 0, 1.0625, 0.06, 1.1225, 0.12, 1.1825,... | syngo MR B17 | In the Resting scan, participants were asked t... | Resting State | rest |  | 312.0 | Body | N |
| 2001 | 24.0 | 2.0 | 64/0/0/64 | trm_4f2453b806fe1 |  | 35426 | 128.0 | 0.03 | 1 | 0.000395 | ... | SK | [1.0025, 0, 1.0625, 0.06, 1.1225, 0.1175, 1.18... | syngo MR B17 | SCAP is a working memory task that tests the m... | Spatial Working Memory Capacity Tasks (SCAP) | scap | {'trigger_time': 8.2668} | 590.0 | Body | N |
| 2002 | 24.0 | 2.0 | 64/0/0/64 | tsk_4a57abb949e1a |  | 35426 | 128.0 | 0.03 | 1 | 0.000395 | ... | SK | [1.005, 0, 1.0625, 0.06, 1.1225, 0.1175, 1.182... | syngo MR B17 | The Stop-Signal Task measures response inhibit... | Stop-Signal Task | stopsignal | {'Settings': {'BSI': 1, 'ISI': 1.5, 'Ladder1 s... | 376.0 | Body | N |
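
With this many columns, a useful first question is which acquisition parameters actually differ between scans. The sketch below shows one possible way to answer it on a toy version of the table, using four fields and values copied from the first three rows above. It assumes scalar columns only; list-valued fields such as `SliceTiming` are not hashable and would need to be dropped or converted to tuples first.

```python
import pandas as pd

# Toy records: a few fields from the first three scans shown above
records = [
    {"TaskName": "bart", "EchoTime": 0.03, "TotalScanTimeSec": 542.0, "SoftwareVersions": "syngo MR B15"},
    {"TaskName": "rest", "EchoTime": 0.03, "TotalScanTimeSec": 312.0, "SoftwareVersions": "syngo MR B15"},
    {"TaskName": "scap", "EchoTime": 0.03, "TotalScanTimeSec": 590.0, "SoftwareVersions": "syngo MR B15"},
]
toy_df = pd.DataFrame.from_records(records)

# Keep only the columns that take more than one distinct value
varying = [col for col in toy_df.columns if toy_df[col].nunique() > 1]
print(varying)
```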


## Exploring Data

Below is a tree diagram showing the folder structure of a single MR session within ds000030. This was obtained by using the bash command `tree`.  
`!tree data/ds000030`

```
ds000030
├── CHANGES
├── dataset_description.json
├── derivatives
│   └── fmriprep
├── participants.tsv
├── README
├── sub-50083
│   ├── anat
│   │   ├── sub-50083_T1w.json
│   │   └── sub-50083_T1w.nii.gz
│   └── func
│       ├── sub-50083_task-rest_bold.json
│       └── sub-50083_task-rest_bold.nii.gz
└── task-rest_bold.json
```
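
Note the `task-rest_bold.json` file at the top level of the tree. Under the BIDS inheritance principle, metadata in a sidecar higher up the hierarchy applies to all matching files below it, and a more specific sidecar overrides it where both define the same field. The sketch below illustrates the merge with plain dictionaries. The field values are made up for illustration and are not copied from the dataset, and `merge_sidecars` is a hypothetical helper, not part of `pybids`.

```python
def merge_sidecars(*levels):
    """Merge metadata dicts ordered from least to most specific."""
    merged = {}
    for level in levels:
        # Later (more specific) levels overwrite earlier ones on conflict
        merged.update(level)
    return merged


# Illustrative values only
dataset_level = {"TaskName": "rest", "RepetitionTime": 2}    # e.g. task-rest_bold.json
subject_level = {"EchoTime": 0.03, "RepetitionTime": 2.5}    # e.g. sub-50083_task-rest_bold.json

print(merge_sidecars(dataset_level, subject_level))
```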

The `participants.tsv` file describes demographic information for each participant in the study (e.g. age, handedness, sex). Let's take a look at the `participants.tsv` file to see what's been included in this dataset.

In order to load the data into Python, we'll need to import the `pandas` package. The `pandas` **dataframe** is Python's equivalent to an Excel spreadsheet.

In [13]:
import pandas as pd

We'll use the `read_csv()` function. It requires us to specify the name of the file we want to import and the separator that is used to distinguish each column in our file (`\t` since we're working with a `.tsv` file).

In [14]:
participant_metadata = pd.read_csv('../../data/ds000030/participants.tsv', sep='\t')
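
To see what `read_csv()` does with a tab-separated file without touching the real data, the sketch below parses a two-row table held in memory. The rows are a trimmed-down imitation of `participants.tsv`; the missing value is written as `n/a`, the marker BIDS prescribes for tabular files, which `pandas` converts to `NaN` by default.

```python
import io

import pandas as pd

# A small in-memory stand-in for participants.tsv
tsv_text = (
    "participant_id\tdiagnosis\tage\tbht\n"
    "sub-10159\tCONTROL\t30\tn/a\n"
    "sub-10171\tCONTROL\t24\t1.0\n"
)

toy_participants = pd.read_csv(io.StringIO(tsv_text), sep="\t")
print(toy_participants)
print(toy_participants.dtypes)
```

Because one `bht` value is missing, the whole column is read as floating point, which is why the scan inventory columns show `1.0` rather than `1`.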

In order to get a glimpse of our data, we'll use the `head()` function. By default, `head` prints the first 5 rows of our dataframe.

In [15]:
participant_metadata.head()

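|  | participant_id | diagnosis | age | gender | bart | bht | dwi | pamenc | pamret | rest | scap | stopsignal | T1w | taskswitch | ScannerSerialNumber | ghost_NoGhost |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | sub-10159 | CONTROL | 30 | F | 1.0 |  | 1.0 |  |  | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 35343.0 | No_ghost |
| 1 | sub-10171 | CONTROL | 24 | M | 1.0 | 1.0 | 1.0 |  |  | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 35343.0 | No_ghost |
| 2 | sub-10189 | CONTROL | 49 | M | 1.0 |  | 1.0 |  |  | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 35343.0 | No_ghost |
| 3 | sub-10193 | CONTROL | 40 | M | 1.0 |  | 1.0 |  |  |  |  |  | 1.0 |  | 35343.0 | No_ghost |
| 4 | sub-10206 | CONTROL | 21 | M | 1.0 |  | 1.0 |  |  | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 35343.0 | No_ghost |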


We can view any number of rows by specifying `n=?` as an argument within `head()`.  
If we want to select particular rows within the dataframe, we can use the `loc[]` indexer and identify the rows we want based on their index label (the numbers in the left-most column).

In [16]:
participant_metadata.loc[[6, 10, 12]]

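|  | participant_id | diagnosis | age | gender | bart | bht | dwi | pamenc | pamret | rest | scap | stopsignal | T1w | taskswitch | ScannerSerialNumber | ghost_NoGhost |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 6 | sub-10225 | CONTROL | 35 | M | 1.0 | 1.0 | 1.0 |  |  | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 35343.0 | No_ghost |
| 10 | sub-10249 | CONTROL | 28 | M | 1.0 | 1.0 | 1.0 |  |  | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 35343.0 | No_ghost |
| 12 | sub-10271 | CONTROL | 41 | F | 1.0 | 1.0 | 1.0 |  |  | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 35343.0 | No_ghost |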


**EXERCISE**: Select the first 5 rows of the dataframe using `loc[]`.

In [17]:
participant_metadata.loc[:4]

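|  | participant_id | diagnosis | age | gender | bart | bht | dwi | pamenc | pamret | rest | scap | stopsignal | T1w | taskswitch | ScannerSerialNumber | ghost_NoGhost |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | sub-10159 | CONTROL | 30 | F | 1.0 |  | 1.0 |  |  | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 35343.0 | No_ghost |
| 1 | sub-10171 | CONTROL | 24 | M | 1.0 | 1.0 | 1.0 |  |  | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 35343.0 | No_ghost |
| 2 | sub-10189 | CONTROL | 49 | M | 1.0 |  | 1.0 |  |  | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 35343.0 | No_ghost |
| 3 | sub-10193 | CONTROL | 40 | M | 1.0 |  | 1.0 |  |  |  |  |  | 1.0 |  | 35343.0 | No_ghost |
| 4 | sub-10206 | CONTROL | 21 | M | 1.0 |  | 1.0 |  |  | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 35343.0 | No_ghost |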
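
The slice `loc[:4]` returns five rows because `loc` slices by label and includes both endpoints, unlike ordinary Python slices and the position-based `iloc`. The sketch below contrasts the two on a toy dataframe that reuses a few ages from the table; it also shows that the two diverge further once rows have been filtered out and the labels are no longer consecutive.

```python
import pandas as pd

toy = pd.DataFrame({"age": [30, 24, 49, 40, 21, 33]})

print(len(toy.loc[:4]))   # label-based: labels 0 through 4, endpoint included
print(len(toy.iloc[:4]))  # position-based: positions 0 through 3

# After filtering, the remaining labels are 0, 2, 3 and 5
older = toy[toy["age"] > 25]
print(older.loc[:4].index.tolist())   # labels up to and including 4
print(older.iloc[:4].index.tolist())  # first four rows, whatever their labels
```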


**EXERCISE:** How many participants do we have in total?

In [18]:
participant_metadata.shape

(272, 16)

There are 2 different methods of selecting columns in a dataframe:  
*  `participant_metadata['<column_name>']` (this is similar to selecting a key in a Python dictionary)  
*  `participant_metadata.<column_name>`  
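
The two forms return the same column in most cases, but attribute access breaks down when a column name collides with an existing dataframe attribute or is not a valid Python identifier. The sketch below demonstrates this on a toy dataframe; the `size` column is a made-up name chosen to trigger the collision and does not exist in `participants.tsv`.

```python
import pandas as pd

toy_cols = pd.DataFrame({
    "participant_id": ["sub-10159", "sub-10171"],
    "age": [30, 24],
    "size": [1, 2],  # hypothetical column that clashes with DataFrame.size
})

# Both forms give the same Series for an ordinary column name
print(toy_cols["age"].equals(toy_cols.age))

# Bracket access returns the column; attribute access returns the
# built-in DataFrame.size property (total number of cells) instead
print(toy_cols["size"].tolist())
print(toy_cols.size)
```

Bracket notation is the safer default, particularly in scripts that have to handle arbitrary column names.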

Another way to see how many participants are in the study is to select the `participant_id` column and use the `count()` method, which returns the number of non-missing values in that column.

In [19]:
participant_metadata['participant_id'].count()

272
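
Because `count()` skips missing values, it matches the number of rows only for columns without gaps, as is the case for `participant_id`. The sketch below uses three participants from the table above, one of whom has no `rest` scan recorded, to show the difference.

```python
import pandas as pd

toy_counts = pd.DataFrame({
    "participant_id": ["sub-10159", "sub-10193", "sub-10206"],
    "rest": [1.0, None, 1.0],  # sub-10193 has no rest scan listed
})

print(len(toy_counts))                        # number of rows
print(toy_counts["participant_id"].count())   # non-missing IDs
print(toy_counts["rest"].count())             # non-missing rest entries
```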

**EXERCISE:** Which diagnosis groups are part of the study?  
*Hint: use the* `unique()` *function.*

In [20]:
participant_metadata['diagnosis'].unique()

array(['CONTROL', 'SCHZ', 'BIPOLAR', 'ADHD'], dtype=object)

If we want to count the number of participants in each diagnosis group, we can use the `value_counts()` function.

In [21]:
participant_metadata['diagnosis'].value_counts()

CONTROL    130
SCHZ        50
BIPOLAR     49
ADHD        43
Name: diagnosis, dtype: int64

**EXERCISE:** How many males and females are in the study? How many are in each diagnosis group?

In [22]:
participant_metadata['gender'].value_counts()

M    155
F    117
Name: gender, dtype: int64

In [23]:
participant_metadata.groupby(['diagnosis', 'gender']).size()

diagnosis  gender
ADHD       F         22
           M         21
BIPOLAR    F         21
           M         28
CONTROL    F         62
           M         68
SCHZ       F         12
           M         38
dtype: int64
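
The grouped counts come back as a Series with a two-level index. For a side-by-side comparison, one option is to pivot the inner level into columns with `unstack()`, or to build the same table directly with `pd.crosstab()`. The sketch below uses a made-up five-row dataframe rather than the real participant table.

```python
import pandas as pd

# Made-up rows for illustration
toy_groups = pd.DataFrame({
    "diagnosis": ["CONTROL", "CONTROL", "SCHZ", "SCHZ", "SCHZ"],
    "gender": ["F", "M", "M", "M", "M"],
})

counts = toy_groups.groupby(["diagnosis", "gender"]).size()

# Move the gender level into columns; absent combinations become 0
table = counts.unstack(fill_value=0)
print(table)

# crosstab produces the same contingency table in a single call
print(pd.crosstab(toy_groups["diagnosis"], toy_groups["gender"]))
```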

When looking at the participant dataframe, we noticed that there is a column called `ghost_NoGhost`. We should look at the README file that comes with the dataset to find out more about this.

In [24]:
!cat ../../data/ds000030/README

## UCLA Consortium for Neuropsychiatric Phenomics LA5c Study

Preprocessed data described in


Gorgolewski KJ, Durnez J and Poldrack RA. Preprocessed Consortium for Neuropsychiatric Phenomics dataset. F1000Research 2017, 6:1262
https://doi.org/10.12688/f1000research.11964.2


are available at https://legacy.openfmri.org/dataset/ds000030/ and via Amazon Web Services S3 protocol at: s3://openneuro/ds000030/ds000030_R1.0.5/uncompressed/derivatives/

## Subjects / Participants
The participants.tsv file contains subject IDs with demographic informations as well as an inventory of the scans that are included for each subject.

## Dataset Derivatives (/derivatives)
The /derivaties folder contains summary information that reflects the data and its contents:

1. Final_Scan_Count.pdf - Plot showing the over all scan inclusion, for quick reference.
2. parameter_plots/ - Folder contains many scan parameters plotted over time. Plot symbols are color coded by imaging site. Intended

For this tutorial, we're just going to work with participants that are either CONTROL or SCHZ (`diagnosis`) and have both a T1w (`T1w == 1`) and rest (`rest == 1`) scan.

**EXERCISE:** Filter `participant_metadata` so that only participants meeting the above conditions remain.

In [25]:
participant_metadata = participant_metadata[(participant_metadata.diagnosis.isin(['CONTROL', 'SCHZ'])) & 
                                            (participant_metadata.T1w == 1) & 
                                            (participant_metadata.rest == 1)]
participant_metadata

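|  | participant_id | diagnosis | age | gender | bart | bht | dwi | pamenc | pamret | rest | scap | stopsignal | T1w | taskswitch | ScannerSerialNumber | ghost_NoGhost |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | sub-10159 | CONTROL | 30 | F | 1.0 |  | 1.0 |  |  | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 35343.0 | No_ghost |
| 1 | sub-10171 | CONTROL | 24 | M | 1.0 | 1.0 | 1.0 |  |  | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 35343.0 | No_ghost |
| 2 | sub-10189 | CONTROL | 49 | M | 1.0 |  | 1.0 |  |  | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 35343.0 | No_ghost |
| 4 | sub-10206 | CONTROL | 21 | M | 1.0 |  | 1.0 |  |  | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 35343.0 | No_ghost |
| 5 | sub-10217 | CONTROL | 33 | F | 1.0 |  | 1.0 |  |  | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 35343.0 | No_ghost |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 175 | sub-50077 | SCHZ | 29 | M | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 35426.0 | No_ghost |
| 176 | sub-50080 | SCHZ | 29 | M | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 35426.0 | No_ghost |
| 177 | sub-50081 | SCHZ | 32 | M | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 35426.0 | No_ghost |
| 178 | sub-50083 | SCHZ | 40 | M | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 35426.0 | No_ghost |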
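
Each condition in the filter is a boolean Series, and `&` combines them row by row. The sketch below repeats the same logic on a four-row toy dataframe with one named mask per condition. The first three IDs appear in the tables above; `sub-99999` and its `BIPOLAR` label are invented for illustration. A missing value never equals `1`, so a participant without a `rest` entry drops out without any special handling.

```python
import pandas as pd

toy_filter = pd.DataFrame({
    "participant_id": ["sub-10159", "sub-10193", "sub-50083", "sub-99999"],  # last ID is made up
    "diagnosis": ["CONTROL", "CONTROL", "SCHZ", "BIPOLAR"],
    "T1w": [1.0, 1.0, 1.0, 1.0],
    "rest": [1.0, float("nan"), 1.0, 1.0],
})

# One named mask per condition keeps the filter readable
in_groups = toy_filter["diagnosis"].isin(["CONTROL", "SCHZ"])
has_t1w = toy_filter["T1w"] == 1
has_rest = toy_filter["rest"] == 1   # NaN == 1 evaluates to False

subset = toy_filter[in_groups & has_t1w & has_rest]
print(subset)
```

The index labels of the surviving rows are preserved, which is why the filtered participant table above skips label 3.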
