# &nbsp;&nbsp;&nbsp; Automatic Localization and Parcellation of Auditory Cortex Areas (ALPACA) 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<img src="img/ALPACA_logo.png" alt="alpaca logo" width="370" height="250" border="10">
## &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; - data organization & prerequisites 

### This notebook will provide you with information regarding the data organization and prerequisites necessary to run the ALPACA toolbox. However, as the tools presented here are highly recommended in state-of-the-art neuroinformatics and aren't specific to the ALPACA toolbox, they're also useful for any other neuroimaging-related application intended to be robust, reproducible and openly shared.  

### More precisely, it'll contain the following sections: </br>
</br>
##### - sidequest: [docker](https://www.docker.com)

### - data organization using the [Brain Imaging Data Structure (BIDS)](http://bids.neuroimaging.io) </br>

### - data quality control & assurance using [MRIQC](https://mriqc.readthedocs.io/en/latest/) </br>

### - structural processing using [mindboggle](http://www.mindboggle.info)

Before we actually start, let me pump the brakes right here and open a sidequest that is not only important for this tutorial notebook and the toolbox, but will also change your (neuroscience-programming-related) life completely. </br> 
Picture the following scenario (which we've all been in at least a couple of times): you want to use some function, code, scripts, etc. that someone put out there or a colleague shared with you. Usually, and depending on the specific case, that comes with certain requirements you have to meet, or more precisely certain dependencies that the respective function, code, scripts, etc. build upon and need to work properly. Even worse: sometimes you even need a specific OS. Depending on you, your set of skills and your setup, this results (quite often) in anger, wasted time and drinks. Well, no more! How, you ask? By using a pretty amazing thingy called docker:  

##### sidequest: [docker](https://www.docker.com)

<img src="img/logoDocker.png" alt="Drawing" style="width: 400px;"/>


[Docker](https://www.docker.com/) is an open-source project that automates the deployment of applications inside software containers. Those containers wrap up a piece of software in a complete filesystem that contains everything it needs to run: code, system tools, software libraries, such as [Python](https://www.python.org), neuroimaging related software such as [FSL](https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FSL), [AFNI](https://afni.nimh.nih.gov), [SPM](http://www.fil.ion.ucl.ac.uk/spm/), [FreeSurfer](https://surfer.nmr.mgh.harvard.edu), [ANTs](http://stnava.github.io/ANTs/) and pretty much any other open source software or tools. This guarantees that it will always run the same, regardless of the environment it is running in. As this notebook will just briefly talk about docker and its application, make sure you go through this very nice and comprehensive [introduction to docker](http://nipy.org/workshops/2017-03-boston/lectures/lesson-container/#1).

Wondering why we're actually talking about this (I mean besides the already mentioned advantages)? Well, as ALPACA is committed to and interested in open and reproducible (neuro-)science, the toolbox itself will be in a docker container one day and everything we focus on here will highly depend on / make use of docker. Additionally, and I can't stress this enough, it's incredibly useful. Now, let's check some basic docker commands:

##### Install docker
Before you can do anything, you first need to install [docker](https://www.docker.com/) on your system. </br>
The installation process differs per system. Luckily, the docker homepage has nice instructions for...

- [Ubuntu](https://docs.docker.com/engine/installation/linux/docker-ce/ubuntu/) or [Debian](https://docs.docker.com/engine/installation/linux/docker-ce/debian/)
- [Windows 7/8/10](https://docs.docker.com/toolbox/toolbox_install_windows/) or [Windows 10 Pro](https://docs.docker.com/docker-for-windows/install/)
- [OS X (from El Capitan 10.11 on)](https://docs.docker.com/docker-for-mac/install/) or [OS X (before El Capitan 10.11)](https://docs.docker.com/toolbox/toolbox_install_mac/)

Once Docker is installed, open up a terminal and test it works properly with the command:

In [None]:
%%bash 
docker run hello-world

**Note**: Linux users might need to use `sudo` to run `docker` commands or follow [post-installation steps](https://docs.docker.com/engine/installation/linux/linux-postinstall/).
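
If you prefer to check from within Python whether docker is reachable at all, a minimal sketch could look like the one below. The function name `docker_available` is made up for this notebook, and the check only asks the docker client for its version, so it says nothing about whether the docker daemon is running or whether you need `sudo`.

```python
import shutil
import subprocess


def docker_available():
    """Return True if a `docker` executable is on PATH and answers `docker --version`."""
    # shutil.which returns None when no executable with that name is found
    if shutil.which("docker") is None:
        return False
    try:
        result = subprocess.run(
            ["docker", "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


print("docker found" if docker_available() else "docker not found, please install it first")
```

The `hello-world` example remains the real test, as it also exercises pulling and running an image.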

##### Pulling and checking available docker images
You can download various docker images from [docker hub](https://hub.docker.com), which always works like this: </br> 

`docker pull docker_id/docker_image:version` , e.g. `docker pull peerherholz/ALPACA:latest`

**Note**: you don't have to include `:version` if you're not looking for a specific version. When not including it, `:latest` is set by default. 
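
To make the `docker_id/docker_image:version` pattern and the `:latest` default a bit more tangible, here is a small sketch that splits such a reference into name and version. The helper `split_image_reference` is hypothetical (it is not part of docker) and deliberately simplified: it ignores digests (`@sha256:...`).

```python
def split_image_reference(reference):
    """Split 'docker_id/docker_image:version' into (name, tag); tag defaults to 'latest'."""
    # Only look for the tag after the last "/", so that a registry port
    # such as "localhost:5000/image" is not mistaken for a version.
    head, sep, last = reference.rpartition("/")
    image, colon, tag = last.partition(":")
    name = head + sep + image
    return name, (tag if colon and tag else "latest")


for ref in ["nipy/heudiconv", "bids/validator:latest", "hello-world"]:
    print(ref, "->", split_image_reference(ref))
```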

Once it's done or whenever you want, you can check available images on your system:

In [None]:
%%bash
docker images

##### How to run docker images
After installing docker on your system, making sure that the hello-world example ran and pulling the docker image(s) you would like to use, you are good to go and can actually run a docker image. The exact implementation is a bit different for Windows users, but the general commands look similar. The standard version goes something like this:

`docker run -it --rm -p 8888:8888 docker_id/docker_image command`

However, if you want to use stuff (notebooks, code, scripts, etc.) that is not included in the respective docker image but should run within its environment, or if you want to access, process or save local or remote data, you can also mount your directories, e.g.:

`docker run -it --rm -v /path/to/resources/:/home/resources -v /path/to/data/:/home/data -v /path/to/output/:/home/output -p 8888:8888 docker_id/docker_image command`


But what do those flags mean?

- the `-it` flag tells docker that it should open an interactive container instance
- the `--rm` flag tells docker that the container should automatically be removed once it exits
- the `-p` flag publishes a port of the container on your machine, in the form `host_port:container_port` (here: 8888 on both sides)
- the `-v` flag tells docker which folders should be mounted to make them accessible inside the container. Here: `/path/to/resources/` is your local directory where you stored notebooks, functions, scripts, `/path/to/data/` is a directory where you stored your data, and `/path/to/output/` can be an empty directory that will be used for output. The second part of the `-v` flag (here: `/home/resources`, `/home/data` or `/home/output`) specifies under which path the mounted folders can be found inside the container. Important: To use the resource, data, output or any other folder, you first need to create them on your local system!
- `docker_id/docker_image` tells docker which image you want to run
- `command` tells docker, that you want to directly run a certain command within the container
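
If you end up typing these long commands a lot, you could also assemble them programmatically. The following is just a sketch: `build_docker_run` is a made-up helper, and it only covers the flags explained above (`-it`, `--rm`, `-v`, `-p`).

```python
import shlex


def build_docker_run(image, command=None, mounts=None, ports=None):
    """Assemble the argument list for an interactive, self-removing `docker run` call."""
    args = ["docker", "run", "-it", "--rm"]
    # one -v flag per mounted folder: local_dir:container_dir
    for local_dir, container_dir in (mounts or {}).items():
        args += ["-v", "{}:{}".format(local_dir, container_dir)]
    # one -p flag per published port: host_port:container_port
    for host_port, container_port in (ports or {}).items():
        args += ["-p", "{}:{}".format(host_port, container_port)]
    args.append(image)
    if command:
        args += shlex.split(command)
    return args


cmd = build_docker_run(
    "docker_id/docker_image",
    command="bash",
    mounts={"/path/to/data/": "/home/data", "/path/to/output/": "/home/output"},
    ports={8888: 8888},
)
print(" ".join(shlex.quote(a) for a in cmd))
```

The resulting list can be handed to `subprocess.run(cmd)` from a regular terminal session; inside a notebook the interactive `-it` flag is usually not what you want.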

##### Docker tips and tricks
###### Access docker container with bash
You can access a docker container directly with bash or ipython by adding it to the end of your command, i.e.:

`docker run -it --rm -v /path/to/resources/:/home/resources -v /path/to/data/:/home/data -v /path/to/output/:/home/output -p 8888:8888 docker_id/docker_image bash`

`docker run -it --rm -v /path/to/resources/:/home/resources -v /path/to/data/:/home/data -v /path/to/output/:/home/output -p 8888:8888 docker_id/docker_image ipython`


This also works with other software commands, such as bet etc.

###### Stop Docker Container
To stop a running docker container, either close the terminal in which docker is running or select that terminal and press the `Ctrl-C` shortcut multiple times.

###### List all installed docker images
To see a list of all installed docker images use:



In [None]:
%%bash
docker images

###### Delete a specific docker image
To delete a specific docker image, first use the `docker images` command to list all installed images and then use the `IMAGE ID` and the `rmi` instruction to delete the image:

In [None]:
%%bash
# replace IMAGE_ID with the value from the "IMAGE ID" column of `docker images`
docker rmi -f IMAGE_ID

###### Export and import a docker image
If you don't want to depend on an internet connection, you can also export an already downloaded docker image and then later on import it on another PC. To do so, use the following two commands:

- export docker image `docker_id/docker_image`

`docker save -o docker_image.tar docker_id/docker_image`

- import docker image on another PC

`docker load --input docker_image.tar`

You might run into permission issues if you ran your docker command with `sudo`: other users then don't have access rights to `docker_image.tar`. To avoid this, just change the rights of `docker_image.tar` with the command:

`sudo chmod 777 docker_image.tar`

For more information check the [already mentioned introduction](http://nipy.org/workshops/2017-03-boston/lectures/lesson-container/#1) or Michael Notter's [introduction to docker notebook](https://miykael.github.io/nipype_tutorial/notebooks/introduction_docker.html). 

### After our short sidequest into the world of docker, let's start with something which is as basic as it is important: the structure of your data.

## data organization using [BIDS](http://bids.neuroimaging.io) 

The ALPACA toolbox assumes, or let's say works best when, your data is structured according to the Brain Imaging Data Structure (BIDS). BIDS is a simple and intuitive way to organize and describe your neuroimaging and behavioral data. Neuroimaging experiments result in complicated data that can be arranged in many different ways. So far there has been no consensus on how to organize and share data obtained in neuroimaging experiments. BIDS tackles this problem by suggesting a new standard for the arrangement of neuroimaging datasets.

The idea of BIDS is that the file and folder names follow a strict set of rules (graphic taken from [here](https://www.nature.com/articles/sdata201644/figures/1)):

<img src="img/bids.png" alt="Drawing" style="width: 800px;"/>



BIDS basically describes how you should organize and structure your data, which not only helps you, but also others when you share your data (which it also makes easier). This also allows hassle-free application of other workflows and pipelines which can work with BIDS datasets, increases reproducibility and simplifies collaboration. Once ALPACA has grown up and become super fluffy, it is intended to also run as a [BIDS app](http://bids-apps.neuroimaging.io), meaning that you'll be able to run the whole toolbox (or just the parts you want) within one line of code and without any software-installation-related stress. Pretty cool, eh? To do so, BIDS and the prerequisites mentioned here are necessary...just to convince you even more to start using BIDS.
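
To get a feeling for the naming rules, here is a small sketch that takes a BIDS-style file name apart into its `key-value` pairs, its suffix and its extension. `parse_bids_filename` is a made-up helper for illustration only; it does not check whether a name is actually valid BIDS.

```python
import os


def parse_bids_filename(path):
    """Split a BIDS-style file name into its key-value entities, suffix and extension."""
    filename = os.path.basename(path)
    # everything after the first dot is treated as extension (e.g. ".nii.gz")
    stem, dot, extension = filename.partition(".")
    parts = stem.split("_")
    # "sub-01" -> {"sub": "01"}, "task-rest" -> {"task": "rest"}
    entities = dict(part.split("-", 1) for part in parts if "-" in part)
    # the last underscore-separated part without a "-" is the suffix, e.g. "T1w"
    suffix = parts[-1] if "-" not in parts[-1] else None
    return entities, suffix, dot + extension


print(parse_bids_filename("sub-01/anat/sub-01_T1w.nii.gz"))
print(parse_bids_filename("sub-01/func/sub-01_task-rest_bold.nii.gz"))
```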

##### How to convert datasets into BIDS?

At this point you might ask yourself: nice stuff, but how do I get my dataset into BIDS? Well, this depends on how your data is currently organized and which file format it is in. If you have already converted your data from [dicom](https://en.wikipedia.org/wiki/DICOM) to [nifti](https://nifti.nimh.nih.gov/), the easiest way would be to write a small bash/python/matlab/etc. script that reorganizes your files into BIDS. However, if you still have your files in dicom, it's recommended to convert them again using dedicated tools, like those listed [here](https://neurostars.org/t/convert-data-to-bids-format/720), as they allow you to extract more metadata than is usually present in your niftis and also already organize your dataset into BIDS. As there are quite a few out there, I would recommend checking their respective github sites, playing around a bit and deciding on one that works well for you. </br>
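
For the "small script" route, a sketch in python could look like this. It assumes a purely hypothetical starting point, namely a flat folder with files called `<label>_T1.nii.gz`, so you'd have to adapt the pattern to your own naming. It only copies the structural images; a complete BIDS dataset needs more than that (e.g. a `dataset_description.json`).

```python
import os
import shutil
import tempfile


def copy_t1w_into_bids(source_dir, bids_dir):
    """Copy '<label>_T1.nii.gz' files (hypothetical naming) into a BIDS anat layout."""
    ending = "_T1.nii.gz"
    copied = []
    for filename in sorted(os.listdir(source_dir)):
        if not filename.endswith(ending):
            continue
        # BIDS labels may only contain letters and digits
        label = "".join(ch for ch in filename[:-len(ending)] if ch.isalnum())
        anat_dir = os.path.join(bids_dir, "sub-" + label, "anat")
        os.makedirs(anat_dir, exist_ok=True)
        target = os.path.join(anat_dir, "sub-{}_T1w.nii.gz".format(label))
        shutil.copy2(os.path.join(source_dir, filename), target)
        copied.append(target)
    return copied


# tiny demonstration on an empty placeholder file in a temporary folder
with tempfile.TemporaryDirectory() as tmp:
    source, bids = os.path.join(tmp, "raw"), os.path.join(tmp, "bids")
    os.makedirs(source)
    open(os.path.join(source, "s01_T1.nii.gz"), "w").close()
    demo = [os.path.relpath(p, bids) for p in copy_t1w_into_bids(source, bids)]
print(demo)
```

Files are copied rather than moved, so the original data stays untouched until you've verified the result.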

As an example, for me this is [heudiconv](https://github.com/nipy/heudiconv), which can also be run via [docker](https://hub.docker.com/r/nipy/heudiconv/). </br>
To familiarize yourself with heudiconv, check the additional information and tutorials on its [github page](https://github.com/nipy/heudiconv). If you want to use heudiconv via docker: </br>

1. get it by running &nbsp;&nbsp; `docker pull nipy/heudiconv` &nbsp;&nbsp; in your terminal 
2. check this [introduction](http://nipy.org/workshops/2017-03-boston/lectures/bids-heudiconv/#1) 

##### Almost there

Got your dataset in BIDS? Coolio! Only two more things you should do before you start working on it!
I guess you know the saying "sharing is caring", eh? As mentioned before, in the context of BIDS / neuroimaging this means sharing your data publicly and freely with others, so that the whole field and world can benefit from it. That's one part of "caring", with the other being "caring" about privacy protection, or more precisely, protecting the privacy of the individuals that have been scanned. Besides being self-evident, [_de-identification_](https://openfmri.org/de-identification/) is required by all major neuroimaging data sharing initiatives and platforms. There are different tools which can be used for that. [openfmri.org](https://openfmri.org), for example, recommends either [mri_deface](https://surfer.nmr.mgh.harvard.edu/fswiki/AutomatedDefacingTools) or [pydeface](https://github.com/poldracklab/pydeface). Both can be applied using your local environment or, of course, within a docker image. Just check both at their websites and play around with the examples. As you can see in the following example from [openfmri.org](https://openfmri.org/de-identification/), they tend to work differently:</br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;original image&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;pydeface&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;mri_deface
<img src="img/anatomical_deface.jpg" alt="Drawing" style="width: 800px;"/>

After trying both, you should decide on one which you'll use throughout your dataset. To paraphrase a famous quote from a once famous (and revived in 2017 through an app) series: I choose you, [pydeface](https://github.com/poldracklab/pydeface)! Hence, I'll briefly show you how to use [pydeface](https://github.com/poldracklab/pydeface) (outside a docker image): 
- get it via `git`

In [None]:
%%bash
git clone https://github.com/poldracklab/pydeface.git
cd pydeface
python setup.py install

Make sure you have all dependencies needed by pydeface: [FSL](https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FSL), [nibabel](http://nipy.org/nibabel/) and [nipype](http://nipype.readthedocs.io/en/latest/) (if you don't want to deal with that, you can, obviously, also use pydeface inside a docker image).  

- apply it to the structural nifti images in your BIDS dataset, one image at a time, setting the path and name of the respective output file to prevent the "_defaced" suffix pydeface would otherwise add by default

In [None]:
%%bash
for t1w in /path/to/BIDS/dataset/sub-*/anat/sub-*_T1w.nii.gz; do pydeface "$t1w" "$t1w"; done
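
If you'd rather drive this from python, the sketch below collects all T1w images of a BIDS dataset and prepares one pydeface call per image. `pydeface_commands` is a made-up helper, and the `pydeface <infile> <outfile>` call form is simply taken over from the cell above; depending on your pydeface version the command line interface may differ, so check `pydeface --help` first.

```python
import glob
import os


def pydeface_commands(bids_dir):
    """Return one `pydeface <infile> <outfile>` argument list per T1w image."""
    pattern = os.path.join(bids_dir, "sub-*", "anat", "sub-*_T1w.nii.gz")
    # input and output path are identical, as in the cell above
    return [["pydeface", t1w, t1w] for t1w in sorted(glob.glob(pattern))]


for cmd in pydeface_commands("/path/to/BIDS/dataset"):
    print(" ".join(cmd))
```

Once the printed commands look right, each list can be executed with `subprocess.run(cmd, check=True)`.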

De-identified all participants in your BIDS dataset? Awesome! At this point, i.e. before you start working with and/or sharing your dataset, you might want to make sure that everything is correct, nothing's missing and you're good to go. But don't worry, you don't have to go through your whole dataset checking every folder and file. As we're in the realm of robust, reproducible and automated processing there's, of course, a tool for that. It's called [bids-validator](https://github.com/INCF/bids-validator) and can be applied in multiple ways:

- web browser version: </br>

  - in Google Chrome (currently the only supported browser) go to the [bids-validator website](http://incf.github.io/bids-validator/)
  - select the folder containing your BIDS dataset
  - check the output and, if errors/problems appear (e.g. missing files), resolve them and run it again </br>
  
  
- docker version: </br>

  - get the docker image 
  - run it on your BIDS dataset, by providing the respective path 
  - check the output and, if errors/problems appear (e.g. missing files), resolve them and run it again 

In [None]:
%%bash
docker pull bids/validator

In [None]:
%%bash
docker run -ti --rm -v /path/to/BIDS/dataset:/data:ro bids/validator /data
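
As a quick sanity check before (and by no means instead of) running the validator, you could also list subjects that lack a structural image, since that's what we'll need later on. This is only a sketch: `subjects_without_t1w` is a made-up helper and it assumes a dataset without session folders (i.e. `sub-*/anat/`, not `sub-*/ses-*/anat/`).

```python
import glob
import os


def subjects_without_t1w(bids_dir):
    """List subject folders that contain no anat/sub-*_T1w.nii(.gz) file."""
    missing = []
    for subject_dir in sorted(glob.glob(os.path.join(bids_dir, "sub-*"))):
        if not os.path.isdir(subject_dir):
            continue
        # matches both compressed (.nii.gz) and uncompressed (.nii) images
        t1w = glob.glob(os.path.join(subject_dir, "anat", "sub-*_T1w.nii*"))
        if not t1w:
            missing.append(os.path.basename(subject_dir))
    return missing


print(subjects_without_t1w("/path/to/BIDS/dataset"))
```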

##### How to make use of it

Besides the already mentioned advantages, there are a lot of tools that are intended to ease working with BIDS datasets. A very good example is [pybids](https://github.com/INCF/pybids), which is incredibly useful for any kind of interaction with [BIDS](http://bids.neuroimaging.io) datasets, e.g. within a nice & reproducible [nipype](https://github.com/nipy/nipype) [workflow](http://nbviewer.jupyter.org/github/nipy/workshops/blob/master/170327-nipype/notebooks/basic-bids/basic_data_input_bids.ipynb). Make sure to also have a look at [bidsutils](https://github.com/INCF/bidsutils). Furthermore, you should also check the already mentioned [BIDS apps](http://bids-apps.neuroimaging.io). These are "portable neuroimaging pipelines that understand BIDS datasets". More precisely, [BIDS apps](http://bids-apps.neuroimaging.io/about/) are neuroimaging pipelines / workflows for a [huge variety of analyses](http://bids-apps.neuroimaging.io/apps/) packed in a docker image that will work / run out of the box given a [BIDS dataset](https://www.nature.com/articles/sdata201644) as input. It won't get any more comfortable (okay, maybe with [openneuro.org](https://openneuro.org)).

That being said, let's actually work with BIDS and BIDS apps to get some data ready for ALPACA. 

## data quality control & assurance using [MRIQC](https://mriqc.readthedocs.io/en/latest/)

Assuming your data is in BIDS format, de-identified and the BIDS validator had nothing to complain about, it's time to let the games begin. Among the initial steps of analyzing neuroimaging data (or basically any other data as well), some sort of data quality control should be done. This is important, as checking for [artifacts](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4340093/), [inhomogeneities](http://www.mr-tip.com/serv1.php?type=db1&dbs=Inhomogeneity) or any other kind of data corruption early on keeps these problems from affecting your analyses and results. An example is depicted below (modified from Esteban O, Birman D, Schaer M, Koyejo OO, Poldrack RA, Gorgolewski KJ (2017) MRIQC: Advancing the automatic prediction of image quality in MRI from unseen sites. PLoS ONE 12(9): e0184661. https://doi.org/10.1371/journal.pone.0184661):


<img src="img/artifact_example_MRIQC.png" alt="Drawing" style="width: 600px;"/>



As visual inspections can become quite time consuming and demanding in the age of big (neuroimaging) data and [inter-rater reliability](https://en.wikipedia.org/wiki/Inter-rater_reliability) shows quite some variability, automated quality assessments come in handy. Over the years, several toolboxes were developed which compute a broad range of image quality metrics. Among those, [QAP](http://preprocessed-connectomes-project.org/quality-assessment-protocol/) and [MRIQC](http://mriqc.readthedocs.io/en/latest/) are highly recommendable. As the latter builds upon the former and is already provided as a docker image, this notebook will focus on [MRIQC](http://mriqc.readthedocs.io/en/latest/). 

##### What exactly is [MRIQC](http://mriqc.readthedocs.io/en/latest/)?

The MRI Quality Control Tool (MRIQC) is a tool for automated quality assessment, which it performs by extracting image quality metrics (IQMs). The [introduction page of MRIQC](http://mriqc.readthedocs.io/en/latest/about.html) provides a comprehensive overview of its functions and properties:

MRIQC is an open-source project, developed under the following software engineering principles:

1. Modularity and integrability: MRIQC implements a [nipype](http://nipype.readthedocs.io/en/latest/) [workflow](http://miykael.github.io/nipype-beginner-s-guide/firstSteps.html#important-building-blocks) to integrate modular sub-workflows that rely upon third party software toolboxes such as [FSL](https://fsl.fmrib.ox.ac.uk/fsl/fslwiki), [ANTs](http://stnava.github.io/ANTs/) and [AFNI](https://afni.nimh.nih.gov).
2. Minimal preprocessing: the MRIQC workflows should be as minimal as possible to estimate the IQMs on the original data or their minimally processed derivatives.
3. Interoperability and standards: MRIQC follows the [brain imaging data structure (BIDS)](http://bids.neuroimaging.io), and it adopts the [BIDS-App](http://bids-apps.neuroimaging.io) standard.
4. Reliability and robustness: the software undergoes frequent vetting sprints by testing its robustness against data variability (acquisition parameters, physiological differences, etc.) using images from [OpenfMRI](https://openfmri.org). Its reliability is permanently checked and maintained with [CircleCI](https://circleci.com/gh/poldracklab/mriqc).

MRIQC is part of the MRI image analysis and reproducibility platform offered by the [CRN](http://reproducibility.stanford.edu). This pipeline derives from, and is heavily influenced by, the [PCP Quality Assessment Protocol](http://preprocessed-connectomes-project.org/quality-assessment-protocol/).

##### How to get MRIQC

At this point you can probably guess the next few lines...yup, there are different ways of getting / running MRIQC: via [pip](https://github.com/poldracklab/mriqc), [docker](https://hub.docker.com/r/bids/mriqc/) or [openneuro.org](https://openneuro.org). 