# ToDo:

 - notebook that introduces how to get input data from a BIDS dataset

## Data input for BIDS datasets
`DataGrabber` and `SelectFiles` are great if you are dealing with generic datasets with arbitrary organization. However, if you have decided to use the Brain Imaging Data Structure (BIDS) to organize your data (or got your hands on a BIDS dataset), you can take advantage of the formal structure BIDS imposes. In this short tutorial you will learn how to do this.
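
To get a feel for what that formal structure looks like, note that a BIDS file name is a chain of `key-value` pairs followed by a suffix and an extension. The sketch below splits one such name using only the standard library. The helper `parse_bids_name` is hypothetical and deliberately naive (it assumes a well-formed name); it is only meant to illustrate the naming scheme, not to replace a proper BIDS parser.

```python
def parse_bids_name(filename):
    """Split a BIDS-style file name into (entities, suffix, extension).

    Hypothetical helper for illustration only; assumes a well-formed name.
    """
    # Everything after the first dot is the extension, e.g. "nii.gz".
    stem, _, extension = filename.partition(".")
    # The last "_"-separated chunk is the suffix; the rest are key-value pairs.
    *pairs, suffix = stem.split("_")
    entities = dict(pair.split("-", 1) for pair in pairs)
    return entities, suffix, extension


entities, suffix, extension = parse_bids_name(
    "sub-01_ses-retest_task-fingerfootlips_bold.nii.gz")
print(entities)           # {'sub': '01', 'ses': 'retest', 'task': 'fingerfootlips'}
print(suffix, extension)  # bold nii.gz
```

Because every file carries its subject, session and task in its name, a tool can answer questions such as "all BOLD files for subject 01" without any dataset-specific templates.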

## `pybids` - a Python API for working with BIDS datasets
`pybids` is a lightweight Python API for querying a BIDS folder structure for specific files and metadata. You can install it from PyPI:
```
pip install pybids
```
Note that it should already be installed in the tutorial Docker image.

## The `layout` object and simple queries
To begin working with `pybids` we need to initialize a layout object. We will need it for all of our queries.

In [1]:
from bids.grabbids import BIDSLayout
layout = BIDSLayout("/data/ds000114/")

In [2]:
!tree -I derivatives /data/ds000114/

/data/ds000114/
├── CHANGES
├── dataset_description.json
├── dwi.bval -> .git/annex/objects/JX/4K/MD5E-s335--5bd6fa32ccd0c79e79f9ac63a2c09c1a.bval/MD5E-s335--5bd6fa32ccd0c79e79f9ac63a2c09c1a.bval
├── dwi.bvec -> .git/annex/objects/Pg/wk/MD5E-s1248--0641c68ff6ee6164928c984541653430.bvec/MD5E-s1248--0641c68ff6ee6164928c984541653430.bvec
├── sub-01
│   ├── ses-retest
│   │   ├── anat
│   │   │   └── sub-01_ses-retest_T1w.nii.gz -> ../../../.git/annex/objects/xm/25/MD5E-s8503839--3b3b49b2396b59ddd5a73b7f596f9e46.nii.gz/MD5E-s8503839--3b3b49b2396b59ddd5a73b7f596f9e46.nii.gz
│   │   ├── dwi
│   │   │   └── sub-01_ses-retest_dwi.nii.gz -> ../../../.git/annex/objects/0K/16/MD5E-s99899518--5ebac8e9e23180638dd68dde10b818be.nii.gz/MD5E-s99899518--5ebac8e9e23180638dd68dde10b818be.nii.gz
│   │   └── func
│   │       ├── sub-01_ses-retest_task-covertverbgeneration_bold.nii.gz -> ../../../.git/annex/objects/3q/Qf/MD5E-s22317848--b30f5b2f7a6039a3e384bcb40bec7e55.nii.gz/MD5E-s22317848--b30f

Let's find out which subject labels are in this dataset.

In [3]:
layout.get_subjects()

['01', '02', '03', '04', '05', '06', '07', '08', '09', '10']

What modalities are included in this dataset?

In [4]:
layout.get_modalities()

['anat', 'dwi', 'func']

What different data types are included in this dataset?

In [5]:
layout.get_types()

['', 'bold', 'description', 'dwi', 'events', 'T1w']

In [6]:
layout.get_types(modality='func')

['bold', 'events']

What are the different tasks included in this dataset?

In [7]:
layout.get_tasks()

['covertverbgeneration',
 'fingerfootlips',
 'linebisection',
 'overtverbgeneration',
 'overtwordrepetition']

We can also ask for all of the data for a particular subject.

In [8]:
layout.get(subject='01')

[File(filename='/data/ds000114/sub-01/ses-retest/anat/sub-01_ses-retest_T1w.nii.gz', subject='01', session='retest', type='T1w', modality='anat'),
 File(filename='/data/ds000114/sub-01/ses-retest/dwi/sub-01_ses-retest_dwi.nii.gz', subject='01', session='retest', type='dwi', modality='dwi'),
 File(filename='/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-covertverbgeneration_bold.nii.gz', subject='01', session='retest', type='bold', task='covertverbgeneration', modality='func'),
 File(filename='/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-fingerfootlips_bold.nii.gz', subject='01', session='retest', type='bold', task='fingerfootlips', modality='func'),
 File(filename='/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-linebisection_bold.nii.gz', subject='01', session='retest', type='bold', task='linebisection', modality='func'),
 File(filename='/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-linebisection_events.tsv', subject='01', sess

We can also ask for a specific subset of the data. Note that we use the `extensions` filter to get just the imaging data (BIDS allows both `.nii` and `.nii.gz`, so we need to include both).

In [9]:
layout.get(subject='01', type='bold', extensions=['nii', 'nii.gz'])

[File(filename='/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-covertverbgeneration_bold.nii.gz', subject='01', session='retest', type='bold', task='covertverbgeneration', modality='func'),
 File(filename='/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-fingerfootlips_bold.nii.gz', subject='01', session='retest', type='bold', task='fingerfootlips', modality='func'),
 File(filename='/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-linebisection_bold.nii.gz', subject='01', session='retest', type='bold', task='linebisection', modality='func'),
 File(filename='/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-overtverbgeneration_bold.nii.gz', subject='01', session='retest', type='bold', task='overtverbgeneration', modality='func'),
 File(filename='/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-overtwordrepetition_bold.nii.gz', subject='01', session='retest', type='bold', task='overtwordrepetition', modality='func'),
 File(file

You probably noticed that this method returns not just file paths, but objects that also carry the relevant query fields. We can easily extract just the file paths.

In [10]:
[f.filename for f in layout.get(subject='01', type='T1w', extensions=['nii', 'nii.gz'])]

['/data/ds000114/sub-01/ses-retest/anat/sub-01_ses-retest_T1w.nii.gz',
 '/data/ds000114/sub-01/ses-test/anat/sub-01_ses-test_T1w.nii.gz']

### Exercise 1:
List all of the BOLD files for subject 03, but only for the `linebisection` task.

## Including `pybids` in your `nipype` workflow
This is great, but what we really want is to use these queries in our `nipype` workflows. How do we do this? We can create our own custom `BIDSDataGrabber` using a `Function` interface. First we need a plain Python function that, for a given subject label and dataset location, returns a list of BOLD files (adding T1w files is left for Exercise 2).

In [11]:
def get_niftis(subject_id, data_dir):
    # Remember that all the necessary imports need to be INSIDE the function for the Function Interface to work!
    from bids.grabbids import BIDSLayout
    
    layout = BIDSLayout(data_dir)
    
    bolds = [f.filename for f in layout.get(subject=subject_id, type='bold', extensions=['nii', 'nii.gz'])]
    
    return bolds

In [12]:
get_niftis('01', '/data/ds000114/')

['/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-covertverbgeneration_bold.nii.gz',
 '/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-fingerfootlips_bold.nii.gz',
 '/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-linebisection_bold.nii.gz',
 '/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-overtverbgeneration_bold.nii.gz',
 '/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-overtwordrepetition_bold.nii.gz',
 '/data/ds000114/sub-01/ses-test/func/sub-01_ses-test_task-covertverbgeneration_bold.nii.gz',
 '/data/ds000114/sub-01/ses-test/func/sub-01_ses-test_task-fingerfootlips_bold.nii.gz',
 '/data/ds000114/sub-01/ses-test/func/sub-01_ses-test_task-linebisection_bold.nii.gz',
 '/data/ds000114/sub-01/ses-test/func/sub-01_ses-test_task-overtverbgeneration_bold.nii.gz',
 '/data/ds000114/sub-01/ses-test/func/sub-01_ses-test_task-overtwordrepetition_bold.nii.gz']

OK, we have our function. Now we need to wrap it inside a `Node` object.

In [13]:
from nipype.pipeline import Node, MapNode, Workflow
from nipype.interfaces.utility import IdentityInterface, Function

In [14]:
BIDSDataGrabber = Node(Function(function=get_niftis,
                                input_names=["subject_id", "data_dir"],
                                output_names=["bolds", "T1ws"]),
                       name="BIDSDataGrabber")
BIDSDataGrabber.inputs.data_dir = "/data/ds000114/"

In [15]:
BIDSDataGrabber.inputs.subject_id='01'
res = BIDSDataGrabber.run()
res.outputs

180124-10:03:13,759 workflow INFO:
	 Executing node BIDSDataGrabber in dir: /tmp/tmpy9_56d99/BIDSDataGrabber
180124-10:03:13,763 workflow INFO:
	 Running node "BIDSDataGrabber" ("nipype.interfaces.utility.wrappers.Function").



T1ws = /data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-fingerfootlips_bold.nii.gz
bolds = /data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-covertverbgeneration_bold.nii.gz

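The node runs, but note that `T1ws` holds a BOLD file: `get_niftis` returns a single list of BOLD paths, so each of the two declared outputs receives just one element of that list (Exercise 2 below asks you to fix this). Let's put the node in a workflow anyway. We are not going to analyze any data, but for demonstration purposes we will add a node that pretends to analyze its input.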
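
Why does a BOLD file end up under `T1ws` in the output of cell 15? One way to picture it: when a wrapped function returns one list but the node declares several output names, the elements of that list are handed out one per name. The plain-Python sketch below models this. `assign_outputs` is a hypothetical stand-in written for this tutorial, not nipype's actual code, and the real implementation may differ in detail.

```python
def assign_outputs(output_names, result):
    """Hypothetical, simplified model of mapping a return value to named outputs."""
    if len(output_names) == 1:
        # A single declared output receives the whole return value.
        return {output_names[0]: result}
    # Several declared outputs each receive one element, by position.
    return {name: result[idx] for idx, name in enumerate(output_names)}


# First three BOLD paths returned by get_niftis('01', '/data/ds000114/') above.
bold_files = [
    "/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-covertverbgeneration_bold.nii.gz",
    "/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-fingerfootlips_bold.nii.gz",
    "/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-linebisection_bold.nii.gz",
]

two = assign_outputs(["bolds", "T1ws"], bold_files)
one = assign_outputs(["bolds"], bold_files)
print(two["bolds"])       # the covertverbgeneration file only
print(two["T1ws"])        # the fingerfootlips file, which is not a T1w image
print(len(one["bolds"]))  # 3
```

The practical takeaway is that the number of declared output names should match what the function returns: either one name for one list, or a tuple with one item per name.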

In [16]:
def printMe(paths):
    print("\n\nanalyzing " + str(paths) + "\n\n")
    
analyzeBOLD = Node(Function(function=printMe, input_names=["paths"],
                            output_names=[]), name="analyzeBOLD")

In [17]:
wf = Workflow(name="bids_demo")
wf.connect(BIDSDataGrabber, "bolds", analyzeBOLD, "paths")
wf.run()

180124-10:03:18,375 workflow INFO:
	 Workflow bids_demo settings: ['check', 'execution', 'logging']
180124-10:03:18,380 workflow INFO:
	 Running serially.
180124-10:03:18,381 workflow INFO:
	 Executing node bids_demo.BIDSDataGrabber in dir: /tmp/tmpn2mdqhcf/bids_demo/BIDSDataGrabber
180124-10:03:18,385 workflow INFO:
	 Running node "BIDSDataGrabber" ("nipype.interfaces.utility.wrappers.Function").
180124-10:03:18,446 workflow INFO:
	 Executing node bids_demo.analyzeBOLD in dir: /tmp/tmpf5xn7mnx/bids_demo/analyzeBOLD
180124-10:03:18,450 workflow INFO:
	 Running node "analyzeBOLD" ("nipype.interfaces.utility.wrappers.Function").


analyzing /data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-covertverbgeneration_bold.nii.gz




<networkx.classes.digraph.DiGraph at 0x7f33c81f6c88>

### Exercise 2:
Modify the `BIDSDataGrabber` and the workflow to include T1ws.

## Iterating over subject labels
In the previous example we demonstrated how to use `pybids` to "analyze" one subject. How can we scale this to all subjects? Easy: by using `iterables`.

In [18]:
BIDSDataGrabber.iterables = ('subject_id', layout.get_subjects())
wf.run()

180124-10:03:24,702 workflow INFO:
	 Workflow bids_demo settings: ['check', 'execution', 'logging']
180124-10:03:24,733 workflow INFO:
	 Running serially.
180124-10:03:24,735 workflow INFO:
	 Executing node bids_demo.BIDSDataGrabber in dir: /tmp/tmp3v5f71z8/bids_demo/_subject_id_10/BIDSDataGrabber
180124-10:03:24,739 workflow INFO:
	 Running node "BIDSDataGrabber" ("nipype.interfaces.utility.wrappers.Function").
180124-10:03:24,792 workflow INFO:
	 Executing node bids_demo.analyzeBOLD in dir: /tmp/tmpzcmn5ypg/bids_demo/_subject_id_10/analyzeBOLD
180124-10:03:24,795 workflow INFO:
	 Running node "analyzeBOLD" ("nipype.interfaces.utility.wrappers.Function").


analyzing /data/ds000114/sub-10/ses-retest/func/sub-10_ses-retest_task-covertverbgeneration_bold.nii.gz


180124-10:03:24,800 workflow INFO:
	 Executing node bids_demo.BIDSDataGrabber in dir: /tmp/tmp2_z54y96/bids_demo/_subject_id_09/BIDSDataGrabber
180124-10:03:24,804 workflow INFO:
	 Running node "BIDSDataGrabber" ("nipype.interf

<networkx.classes.digraph.DiGraph at 0x7f33cbf66a58>
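
Conceptually, setting `iterables` on a node makes the workflow run that node, and everything downstream of it, once per value. The `_subject_id_10` and `_subject_id_09` working directories in the log above reflect this. The loop below is a rough plain-Python picture of the idea. All names in it (`grab`, `analyze`, `run_with_iterables`) and the paths it builds are made-up stand-ins; nipype itself expands the workflow graph rather than looping, and may run the branches in any order.

```python
def grab(subject_id):
    """Stand-in for BIDSDataGrabber: returns a made-up path for one subject."""
    return "sub-{0}/func/sub-{0}_task-demo_bold.nii.gz".format(subject_id)


def analyze(path):
    """Stand-in for analyzeBOLD: pretends to analyze its input."""
    return "analyzing " + path


def run_with_iterables(values):
    """Run the grab -> analyze chain once for every value."""
    return {value: analyze(grab(value)) for value in values}


results = run_with_iterables(["01", "02", "03"])
for subject, message in results.items():
    print(subject, "->", message)
```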

## Accessing additional metadata
Querying for files is nice, but sometimes you want to access more metadata, for example `RepetitionTime`. `pybids` can help with that as well.

In [19]:
layout.get_metadata('/data/ds000114/sub-01/ses-test/func/sub-01_ses-test_task-fingerfootlips_bold.nii.gz')

{'EchoTime': 0.05,
 'FlipAngle': 90,
 'RepetitionTime': 2.5,
 'SliceTiming': [0.0,
  1.2499999999999998,
  0.08333333333333333,
  1.333333333333333,
  0.16666666666666666,
  1.4166666666666663,
  0.25,
  1.4999999999999996,
  0.3333333333333333,
  1.5833333333333328,
  0.41666666666666663,
  1.666666666666666,
  0.5,
  1.7499999999999993,
  0.5833333333333333,
  1.8333333333333326,
  0.6666666666666666,
  1.9166666666666659,
  0.75,
  1.9999999999999991,
  0.8333333333333333,
  2.083333333333332,
  0.9166666666666666,
  2.1666666666666656,
  1.0,
  2.249999999999999,
  1.0833333333333333,
  2.333333333333332,
  1.1666666666666665,
  2.416666666666665],
 'TaskName': 'finger_foot_lips'}
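
As a small example of what this metadata is good for, the sketch below recovers the slice acquisition order from the `SliceTiming` values printed above. It assumes the BIDS convention that `SliceTiming` lists, for each slice, its acquisition onset in seconds relative to the start of the volume. The dictionary is copied by hand from the output above rather than read from the dataset.

```python
# Values copied from the get_metadata() output above.
metadata = {
    "RepetitionTime": 2.5,
    "SliceTiming": [
        0.0, 1.2499999999999998, 0.08333333333333333, 1.333333333333333,
        0.16666666666666666, 1.4166666666666663, 0.25, 1.4999999999999996,
        0.3333333333333333, 1.5833333333333328, 0.41666666666666663,
        1.666666666666666, 0.5, 1.7499999999999993, 0.5833333333333333,
        1.8333333333333326, 0.6666666666666666, 1.9166666666666659, 0.75,
        1.9999999999999991, 0.8333333333333333, 2.083333333333332,
        0.9166666666666666, 2.1666666666666656, 1.0, 2.249999999999999,
        1.0833333333333333, 2.333333333333332, 1.1666666666666665,
        2.416666666666665,
    ],
}

slice_timing = metadata["SliceTiming"]
n_slices = len(slice_timing)

# Sort slice indices by their onset time to get the acquisition order.
acquisition_order = sorted(range(n_slices), key=slice_timing.__getitem__)

print(n_slices)                  # 30
print(acquisition_order[:5])     # [0, 2, 4, 6, 8]
print(acquisition_order[15:18])  # [1, 3, 5]
```

The even-indexed slices come first and the odd-indexed ones second, which is the pattern of an interleaved acquisition.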

Can we incorporate this into our pipeline? Yes we can!

In [20]:
def printMetadata(path, data_dir):
    from bids.grabbids import BIDSLayout
    layout = BIDSLayout(data_dir)
    print("\n\nanalyzing " + path + "\nTR: "+ str(layout.get_metadata(path)["RepetitionTime"]) + "\n\n")
    
analyzeBOLD2 = MapNode(Function(function=printMetadata, input_names=["path", "data_dir"],
                             output_names=[]), name="analyzeBOLD2", iterfield="path")
analyzeBOLD2.inputs.data_dir = "/data/ds000114/"

In [21]:
wf = Workflow(name="bids_demo")
wf.connect(BIDSDataGrabber, "bolds", analyzeBOLD2, "path")
wf.run()

180124-10:03:58,805 workflow INFO:
	 Workflow bids_demo settings: ['check', 'execution', 'logging']
180124-10:03:58,836 workflow INFO:
	 Running serially.
180124-10:03:58,837 workflow INFO:
	 Executing node bids_demo.BIDSDataGrabber in dir: /tmp/tmpjhy8zvcs/bids_demo/_subject_id_10/BIDSDataGrabber
180124-10:03:58,841 workflow INFO:
	 Running node "BIDSDataGrabber" ("nipype.interfaces.utility.wrappers.Function").
180124-10:03:58,897 workflow INFO:
	 Executing node bids_demo.analyzeBOLD2 in dir: /tmp/tmpkjpw2ou4/bids_demo/_subject_id_10/analyzeBOLD2
180124-10:03:58,901 workflow INFO:
	 Executing node _analyzeBOLD20 in dir: /tmp/tmpkjpw2ou4/bids_demo/_subject_id_10/analyzeBOLD2/mapflow/_analyzeBOLD20
180124-10:03:58,904 workflow INFO:
	 Running node "_analyzeBOLD20" ("nipype.interfaces.utility.wrappers.Function").


analyzing /data/ds000114/sub-10/ses-retest/func/sub-10_ses-retest_task-covertverbgeneration_bold.nii.gz
TR: 2.5


180124-10:03:58,953 workflow INFO:
	 Executing node bids_demo

<networkx.classes.digraph.DiGraph at 0x7f33c3f9e898>
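
The `MapNode` used in cell 20 differs from a plain `Node` in that it runs its interface once per element of the input named in `iterfield`, while the remaining inputs (here `data_dir`) stay fixed. The sketch below models that in plain Python. `map_over` and `describe` are hypothetical helpers and the file names are placeholders. Treating a single value as a one-element list is an assumption, chosen because the log above shows exactly one sub-node, `_analyzeBOLD20`.

```python
def map_over(func, iterfield, values, **fixed_inputs):
    """Hypothetical model of a MapNode: call func once per element of values."""
    if not isinstance(values, (list, tuple)):
        values = [values]  # assumption: a single value acts as a one-element list
    return [func(**{iterfield: value}, **fixed_inputs) for value in values]


def describe(path, data_dir):
    """Stand-in for printMetadata that returns a string instead of printing."""
    return "{} (dataset: {})".format(path, data_dir)


paths = ["sub-01_task-a_bold.nii.gz", "sub-01_task-b_bold.nii.gz"]
mapped = map_over(describe, "path", paths, data_dir="/data/ds000114/")
for line in mapped:
    print(line)
```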

### Exercise 3:
Modify the `printMetadata` function to also print `EchoTime`.