# Initial preparations

<mark> Both the installation and data download only need to be performed once! </mark>

We first need to install the necessary Python packages and download the data used in this tutorial. Before doing so, create a Python environment dedicated to processing neuroimaging data. [Anaconda](https://www.anaconda.com/products/distribution) is an easy tool for creating and managing Python environments. Install Anaconda (if you don't already have it) and create a new environment with the latest version of Python. Then, with this environment activated (or selected in VS Code), install the necessary Python packages.
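
Before installing anything, it can help to confirm that the notebook or terminal is really using the new environment. The snippet below is a minimal sketch using only the standard library; the helper name `describe_environment` is made up for this illustration. The printed `prefix` should point inside the environment you just created.

```python
import sys

def describe_environment():
    """Return the interpreter path, environment prefix, and Python version."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
    }

# Print each value so it can be compared against the expected environment.
for key, value in describe_environment().items():
    print(f"{key}: {value}")
```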

## Python packages
All the packages we need are listed in the requirements.txt file in this repository. We can install them with a call to pip. Note: in a Jupyter notebook, the `%` prefix runs pip as a magic command inside the notebook's own environment; the same command, without the `%`, can also be run in a terminal.

In [None]:
%pip install -r requirements.txt
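
Once the install finishes, a quick check can confirm that key packages are visible to the active environment. This is a minimal sketch: the helper `installed_versions` is hypothetical, and the two package names are examples taken from the imports used later in this tutorial, not the full contents of requirements.txt.

```python
from importlib import metadata

def installed_versions(package_names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in package_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

# Example names only; edit this list to match requirements.txt.
packages_to_check = ["datalad", "pandas"]
for name, version in installed_versions(packages_to_check).items():
    print(f"{name}: {version if version else 'NOT INSTALLED'}")
```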

## Download the tutorial data

We are using data from the Pinel Localizer task, a 5-minute functional localizer for a few basic cognitive processes (visual perception, finger tapping, language, math). The [original paper](https://bmcneurosci.biomedcentral.com/articles/10.1186/1471-2202-8-91) and [OSF website](https://osf.io/vhtf6/files/) include all the details. The full dataset includes 94 subjects and is large (42 GB), so we will only grab a subset of the data (3 GB).

### Option 1: Direct download
Option 2 below walks through downloading the data with DataLad. That process is somewhat complicated and takes a long time. It does demonstrate how to use these important open-science tools, so it is worth the effort if that matters to you. If you just want the data as quickly as possible, download it (bundled in a zip file) here:

[Direct download of tutorial data](https://www.dropbox.com/s/flfsvgzq3zs6va6/localizer.zip?dl=0)

Note: This is *not* the official source of this dataset; it is provided only as a convenience for this specific set of tutorial scripts.
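
If you take this route, the zip file still needs to be unpacked somewhere you can find later. The sketch below uses the standard-library `zipfile` module. Both paths are placeholders, the helper `extract_archive` is made up for this example, and the folder layout inside the archive is not assumed.

```python
import zipfile
from pathlib import Path

def extract_archive(zip_path, target_dir):
    """Extract zip_path into target_dir and return the top-level names it contained."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
        top_level = sorted({name.split("/")[0] for name in archive.namelist()})
    return top_level

# Hypothetical locations; change both to match your computer.
zip_path = Path.home() / "Downloads" / "localizer.zip"
data_dir = Path.home() / "data"

if zip_path.exists():
    print(extract_archive(zip_path, data_dir))
else:
    print(f"No archive found at {zip_path}")
```

The returned list shows which top-level folders were created, which is the path you will want to note for the later tutorial scripts.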

### Option 2: Download through official DataLad instance
The data is available through a DataLad instance, which thankfully has a Python API. DataLad is included in the requirements.txt file above, so it was installed with the rest of the Python packages. For DataLad to work properly, however, we also need to install git-annex. See [https://git-annex.branchable.com/install/](https://git-annex.branchable.com/install/) for installation instructions (on macOS, use Homebrew: `brew install git-annex`). **Install git-annex before proceeding!**
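
A quick way to confirm that git-annex is installed and visible to Python is to look it up on the `PATH`. This is a minimal sketch with the standard-library `shutil.which`; the wrapper name `find_executable` is made up for this example.

```python
import shutil

def find_executable(name):
    """Return the full path of an executable on PATH, or None if it is not found."""
    return shutil.which(name)

# DataLad relies on both git and git-annex being available.
for tool in ("git", "git-annex"):
    location = find_executable(tool)
    print(f"{tool}: {location if location else 'not found on PATH'}")
```

If `git-annex` is reported as not found, install it (or restart the notebook so it picks up the updated `PATH`) before continuing.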

Once git-annex is installed, we can download the data. NOTE: In the code below, make sure to update `localizer_path` to point to a directory on your computer where the data should be downloaded.

In [None]:
import os
import glob
import datalad.api as dl
import pandas as pd
import warnings
warnings.filterwarnings('ignore') 

# update this path to a local directory on your computer!
localizer_path = '/Users/michael/Dropbox/work/data/dartbrains/data/localizer'

# clone the datalad repository and create a local dataset instance (this will take several minutes!)
dl.clone(source='https://gin.g-node.org/ljchang/Localizer',path=localizer_path)
ds = dl.Dataset(localizer_path)

Cloning the dataset to your local computer only provides links to the file structure; we still need to download the data itself. The `get` calls below will take some time to complete (30-50 minutes!), so start them and go grab a coffee.

In [None]:
# download the experiment metadata
result = ds.get(glob.glob(os.path.join(localizer_path,'*.json')))
result = ds.get(glob.glob(os.path.join(localizer_path,'*.tsv')))
result = ds.get(glob.glob(os.path.join(localizer_path, 'phenotype')))
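# download the first 5 subjects' fmriprep'd data
# (assumption: each subject matches twice, a folder plus an HTML report, so 10 entries cover 5 subjects)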
file_list = glob.glob(os.path.join(localizer_path,'*','fmriprep','sub*'))
file_list.sort()
for f in file_list[:10]:
    result = ds.get(f)
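
Because a fresh clone holds only links to the annexed files, one way to see what has actually been downloaded is to look for links that do not yet resolve to content. The sketch below assumes git-annex's default symlink-based layout (on Windows or on adjusted branches, placeholder files are used instead, and this check will not detect them). The helper `split_by_content` and the placeholder path are made up for this example.

```python
import os

def split_by_content(root):
    """Walk root and separate files with content from dangling symlinks."""
    present, missing = [], []
    for folder, dirnames, filenames in os.walk(root):
        # Skip git's internal bookkeeping directory.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(folder, name)
            # A symlink whose target does not exist has no downloaded content yet.
            if os.path.islink(path) and not os.path.exists(path):
                missing.append(path)
            else:
                present.append(path)
    return present, missing

# Placeholder path; point this at the directory you cloned into.
dataset_dir = "/path/to/localizer"
if os.path.isdir(dataset_dir):
    present, missing = split_by_content(dataset_dir)
    print(f"{len(present)} files downloaded, {len(missing)} still links only")
```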

Make sure to note where you saved the tutorial data; you'll need this path for the next scripts in the tutorial.