# Applying Pre-Implemented Analyses

An *Analysis* class can be applied to a dataset flexibly: you can parameterise how and where the data is stored, which derivatives are required, and the computing environment in which the derivatives are generated.

## Inspecting the Analysis Class

We will start by importing a predefined Analysis class from `example.analysis`, which performs the same analysis as the workflow in the [Workflows Notebook](basic_workflow.ipynb). Using the `static_menu` class method, we print the "menu": the list of inputs that objects of this class can receive, the derivatives they can derive, and the parameters they accept.

In [1]:
from example.analysis import BasicBrainAnalysis
print(BasicBrainAnalysis.static_menu())


example.analysis.BasicBrainAnalysis Menu 
-----------------------------------------

Inputs:
    magnitude : nifti_gz
        A magnitude image (e.g. T1w, T2w, etc..)

Outputs:
    brain : nifti_gz
        Skull-stripped magnitude image
    smooth : nifti_gz
        Smoothed magnitude image
    smooth_masked : nifti_gz
        Smoothed and masked magnitude image

Parameters:
    smoothing_fwhm : float
        The full-width-half-maxium radius of the smoothing kernel


To see the "full" menu, which also lists intermediate derivatives, pass the `full` flag:

In [2]:
print(BasicBrainAnalysis.static_menu(full=True))


example.analysis.BasicBrainAnalysis Menu 
-----------------------------------------

Inputs:
    magnitude : nifti_gz
        A magnitude image (e.g. T1w, T2w, etc..)

Intermediate:
    brain_mask : nifti_gz
        Brain mask used for skull-stripping

Outputs:
    brain : nifti_gz
        Skull-stripped magnitude image
    smooth : nifti_gz
        Smoothed magnitude image
    smooth_masked : nifti_gz
        Smoothed and masked magnitude image

Parameters:
    smoothing_fwhm : float
        The full-width-half-maxium radius of the smoothing kernel


## Defining the Dataset to Analyse

Arcana implicitly handles many of the menial tasks involved in data input/output, such as file format conversions and inserting/retrieving data from a repository service (e.g. XNAT). To specify where your data is stored, you need to create a Dataset object.

### Datasets in Directories on Local System

The simplest form of dataset object is just a directory on (or mounted on) your local file system. The structure of this directory depends on its "depth", i.e. whether it has multiple subjects and visits in it or not.

#### Depth: 0

A depth of 0 is typically only used for prototyping, but it lets you define a dataset for a single subject by storing all the data within a single directory.

In [3]:
%%bash
# Create a dataset for a single session in a flat directory. We will copy data from the BIDS formatted ds000114
SAMPLE_DSET=output/sample-datasets/depth0
mkdir -p $SAMPLE_DSET
find data/ds000114/sub-01/ses-test -name '*.nii.gz' | xargs -I% cp -f % $SAMPLE_DSET/
tree $SAMPLE_DSET

output/sample-datasets/depth0
├── sub-01_ses-test_T1w.nii.gz
├── sub-01_ses-test_T1w_bet.nii.gz
├── sub-01_ses-test_dwi.nii.gz
├── sub-01_ses-test_task-covertverbgeneration_bold.nii.gz
├── sub-01_ses-test_task-fingerfootlips_bold.nii.gz
├── sub-01_ses-test_task-linebisection_bold.nii.gz
├── sub-01_ses-test_task-overtverbgeneration_bold.nii.gz
└── sub-01_ses-test_task-overtwordrepetition_bold.nii.gz

0 directories, 8 files
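The same flattening step can be sketched in plain Python for environments without `find` and `xargs`. This is a minimal illustration and not part of Arcana: `copy_flat` is a hypothetical helper, and the demo runs on a throwaway temporary directory with made-up file names rather than on ds000114.

```python
import shutil
import tempfile
from pathlib import Path


def copy_flat(src_dir, dst_dir, pattern='*.nii.gz'):
    """Copy every file matching `pattern` under `src_dir` into `dst_dir`.

    The directory structure is discarded, so files with the same name
    in different sub-directories would overwrite each other.
    """
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(Path(src_dir).rglob(pattern)):
        shutil.copy(path, dst / path.name)
        copied.append(path.name)
    return copied


# Demo on a temporary tree with made-up file names
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / 'src'
    (src / 'anat').mkdir(parents=True)
    (src / 'dwi').mkdir()
    (src / 'anat' / 'a_T1w.nii.gz').touch()
    (src / 'anat' / 'notes.txt').touch()
    (src / 'dwi' / 'b_dwi.nii.gz').touch()
    copied = copy_flat(src, Path(tmp) / 'depth0')
    flat_listing = sorted(p.name for p in (Path(tmp) / 'depth0').iterdir())

print(copied)
```

Only the files matching the pattern end up in the flat directory; the text file is skipped.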


In [4]:
from arcana import Dataset
dset0 = Dataset('output/sample-datasets/depth0')
print(dset0)

Dataset(name='/Users/tclose/Documents/Workshops/2019-11-15-N.A.B.-workshop/nipype_arcana_workshop/notebooks/output/sample-datasets/depth0', depth=0, repository=LocalFileSystemRepo())


Notice that the `depth` of this dataset is `0`, which means that it contains no sub-directories for separate subjects or visits. However, all datasets in Arcana have an implicit depth of 2 (although future versions may relax this restriction), so the single "session" (a single visit of a subject) is assigned the default subject and visit IDs of 'SUBJECT' and 'VISIT', respectively.

In [5]:
print('subjects:', list(dset0.subject_ids))
print('visits:', list(dset0.visit_ids))

subjects: ['SUBJECT']
visits: ['VISIT']


#### Depth: 1

For a multi-subject dataset, we can add a sub-directory for each subject:

In [6]:
%%bash
# Create a dataset for multiple subjects in separate sub-directories by copying data from the BIDS formatted ds000114
SAMPLE_DSET=output/sample-datasets/depth1
mkdir -p $SAMPLE_DSET/sub1 $SAMPLE_DSET/sub2 $SAMPLE_DSET/sub3
find data/ds000114/sub-01/ses-test -name '*.nii.gz' | xargs -I% cp -f % $SAMPLE_DSET/sub1
find data/ds000114/sub-02/ses-test -name '*.nii.gz' | xargs -I% cp -f % $SAMPLE_DSET/sub2
find data/ds000114/sub-03/ses-test -name '*.nii.gz' | xargs -I% cp -f % $SAMPLE_DSET/sub3
tree $SAMPLE_DSET

output/sample-datasets/depth1
├── sub1
│   ├── sub-01_ses-test_T1w.nii.gz
│   ├── sub-01_ses-test_T1w_bet.nii.gz
│   ├── sub-01_ses-test_dwi.nii.gz
│   ├── sub-01_ses-test_task-covertverbgeneration_bold.nii.gz
│   ├── sub-01_ses-test_task-fingerfootlips_bold.nii.gz
│   ├── sub-01_ses-test_task-linebisection_bold.nii.gz
│   ├── sub-01_ses-test_task-overtverbgeneration_bold.nii.gz
│   └── sub-01_ses-test_task-overtwordrepetition_bold.nii.gz
├── sub2
│   ├── sub-02_ses-test_T1w.nii.gz
│   ├── sub-02_ses-test_dwi.nii.gz
│   ├── sub-02_ses-test_task-covertverbgeneration_bold.nii.gz
│   ├── sub-02_ses-test_task-fingerfootlips_bold.nii.gz
│   ├── sub-02_ses-test_task-linebisection_bold.nii.gz
│   ├── sub-02_ses-test_task-overtverbgeneration_bold.nii.gz
│   └── sub-02_ses-test_task-overtwordrepetition_bold.nii.gz
└── sub3
    ├── sub-03_ses-test_T1w.nii.gz
    ├── sub-03_ses-test_dwi.nii.gz
    ├── sub-03_ses-test_task-covertverbgeneration_bold.nii.gz
    ├── sub-03_ses-test_task-fingerfootlip

In [7]:
dset1 = Dataset('output/sample-datasets/depth1', depth=1)
print(dset1)
print('subjects:', list(dset1.subject_ids))
print('visits:', list(dset1.visit_ids))

Dataset(name='/Users/tclose/Documents/Workshops/2019-11-15-N.A.B.-workshop/nipype_arcana_workshop/notebooks/output/sample-datasets/depth1', depth=1, repository=LocalFileSystemRepo())
subjects: ['sub1', 'sub2', 'sub3']
visits: ['VISIT']


**Note** that we need to explicitly provide the depth of `1`; otherwise Arcana will interpret 'sub1', 'sub2' and 'sub3' as filesets.
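To make the role of `depth` concrete, here is a minimal standard-library sketch of how subject and visit IDs could be read off a directory tree. It is not Arcana's implementation: `infer_ids` and `top_level_entries` are hypothetical helpers, and the only behaviour taken from this notebook is the default 'SUBJECT' and 'VISIT' IDs and the sorted ID lists shown in the outputs.

```python
import tempfile
from pathlib import Path

DEFAULT_SUBJECT, DEFAULT_VISIT = 'SUBJECT', 'VISIT'


def infer_ids(root, depth=0):
    """Return (subject_ids, visit_ids) implied by a directory tree.

    depth 0: no ID directories, so the default IDs are used.
    depth 1: one level of directories, interpreted as subjects.
    depth 2: two levels, interpreted as subjects and then visits.
    """
    root = Path(root)
    if depth == 0:
        return [DEFAULT_SUBJECT], [DEFAULT_VISIT]
    subject_dirs = sorted(p for p in root.iterdir() if p.is_dir())
    subjects = [p.name for p in subject_dirs]
    if depth == 1:
        return subjects, [DEFAULT_VISIT]
    visits = sorted({v.name for s in subject_dirs
                     for v in s.iterdir() if v.is_dir()})
    return subjects, visits


def top_level_entries(root):
    """Names that a depth-0 reading sees directly under the root."""
    return sorted(p.name for p in Path(root).iterdir())


# Build a small tree shaped like the depth-2 sample dataset (no files)
with tempfile.TemporaryDirectory() as tmp:
    for sub in ('sub1', 'sub2', 'sub3'):
        for visit in ('test', 'retest'):
            (Path(tmp) / sub / visit).mkdir(parents=True)
    depth0_ids = infer_ids(tmp, depth=0)
    depth1_ids = infer_ids(tmp, depth=1)
    depth2_ids = infer_ids(tmp, depth=2)
    depth0_entries = top_level_entries(tmp)

print(depth0_ids)
print(depth1_ids)
print(depth2_ids)
print(depth0_entries)
```

Read at depth 0, the subject directories are just entries of the single default session, which is why the depth has to be stated explicitly.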

#### Depth: 2

For a dataset with multiple visits per subject, we use a depth of `2`:

In [8]:
%%bash
# Create a dataset for multiple subjects and visits in nested sub-directories by copying data from the BIDS formatted ds000114
SAMPLE_DSET=output/sample-datasets/depth2
mkdir -p $SAMPLE_DSET/sub1/test $SAMPLE_DSET/sub1/retest $SAMPLE_DSET/sub2/test \
         $SAMPLE_DSET/sub2/retest $SAMPLE_DSET/sub3/test $SAMPLE_DSET/sub3/retest
find data/ds000114/sub-01/ses-test -name '*.nii.gz' | xargs -I% cp -f % $SAMPLE_DSET/sub1/test
find data/ds000114/sub-02/ses-test -name '*.nii.gz' | xargs -I% cp -f % $SAMPLE_DSET/sub2/test
find data/ds000114/sub-03/ses-test -name '*.nii.gz' | xargs -I% cp -f % $SAMPLE_DSET/sub3/test
find data/ds000114/sub-01/ses-retest -name '*.nii.gz' | xargs -I% cp -f % $SAMPLE_DSET/sub1/retest
find data/ds000114/sub-02/ses-retest -name '*.nii.gz' | xargs -I% cp -f % $SAMPLE_DSET/sub2/retest
find data/ds000114/sub-03/ses-retest -name '*.nii.gz' | xargs -I% cp -f % $SAMPLE_DSET/sub3/retest
tree $SAMPLE_DSET

output/sample-datasets/depth2
├── sub1
│   ├── retest
│   │   ├── sub-01_ses-retest_T1w.nii.gz
│   │   ├── sub-01_ses-retest_dwi.nii.gz
│   │   ├── sub-01_ses-retest_task-covertverbgeneration_bold.nii.gz
│   │   ├── sub-01_ses-retest_task-fingerfootlips_bold.nii.gz
│   │   ├── sub-01_ses-retest_task-linebisection_bold.nii.gz
│   │   ├── sub-01_ses-retest_task-overtverbgeneration_bold.nii.gz
│   │   └── sub-01_ses-retest_task-overtwordrepetition_bold.nii.gz
│   └── test
│       ├── sub-01_ses-test_T1w.nii.gz
│       ├── sub-01_ses-test_T1w_bet.nii.gz
│       ├── sub-01_ses-test_dwi.nii.gz
│       ├── sub-01_ses-test_task-covertverbgeneration_bold.nii.gz
│       ├── sub-01_ses-test_task-fingerfootlips_bold.nii.gz
│       ├── sub-01_ses-test_task-linebisection_bold.nii.gz
│       ├── sub-01_ses-test_task-overtverbgeneration_bold.nii.gz
│       └── sub-01_ses-test_task-overtwordrepetition_bold.nii.gz
├── sub2
│   ├── retest
│   │   ├── sub-02_ses-retest_T1w.nii.gz
│   │   ├── sub-02_ses-re

In [9]:
dset2 = Dataset('output/sample-datasets/depth2', depth=2)
print(dset2)
print('subjects:', list(dset2.subject_ids))
print('visits:', list(dset2.visit_ids))

Dataset(name='/Users/tclose/Documents/Workshops/2019-11-15-N.A.B.-workshop/nipype_arcana_workshop/notebooks/output/sample-datasets/depth2', depth=2, repository=LocalFileSystemRepo())
subjects: ['sub1', 'sub2', 'sub3']
visits: ['retest', 'test']


However, suppose the `retest` session of subject `sub3` was corrupted. We could exclude it from the analysis by filtering the IDs, dropping either `sub3` or the `retest` visit from the dataset:

In [10]:
dset2_filter_subs = Dataset('output/sample-datasets/depth2', depth=2, subject_ids=['sub1', 'sub2'])
print(dset2_filter_subs)
print('subjects:', list(dset2_filter_subs.subject_ids))
print('visits:', list(dset2_filter_subs.visit_ids))

Dataset(name='/Users/tclose/Documents/Workshops/2019-11-15-N.A.B.-workshop/nipype_arcana_workshop/notebooks/output/sample-datasets/depth2', depth=2, repository=LocalFileSystemRepo())
subjects: ['sub1', 'sub2']
visits: ['retest', 'test']


To filter the visits used in the analysis:

In [11]:
dset2_filter_vis = Dataset('output/sample-datasets/depth2', depth=2, visit_ids=['test'])
print(dset2_filter_vis)
print('subjects:', list(dset2_filter_vis.subject_ids))
print('visits:', list(dset2_filter_vis.visit_ids))

Dataset(name='/Users/tclose/Documents/Workshops/2019-11-15-N.A.B.-workshop/nipype_arcana_workshop/notebooks/output/sample-datasets/depth2', depth=2, repository=LocalFileSystemRepo())
subjects: ['sub1', 'sub2', 'sub3']
visits: ['test']


Or to filter both:

In [12]:
dset2_filter_both = Dataset('output/sample-datasets/depth2', depth=2, subject_ids=['sub1', 'sub2'], visit_ids=['test'])
print(dset2_filter_both)
print('subjects:', list(dset2_filter_both.subject_ids))
print('visits:', list(dset2_filter_both.visit_ids))

Dataset(name='/Users/tclose/Documents/Workshops/2019-11-15-N.A.B.-workshop/nipype_arcana_workshop/notebooks/output/sample-datasets/depth2', depth=2, repository=LocalFileSystemRepo())
subjects: ['sub1', 'sub2']
visits: ['test']
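Filtering by ID is coarser than excluding one session. The sketch below counts the sessions that remain under each filter, assuming the filtered dataset consists of every combination of the remaining subject and visit IDs. That assumption is ours and is not confirmed by the outputs above, and `sessions` is a hypothetical helper.

```python
from itertools import product

subjects = ['sub1', 'sub2', 'sub3']
visits = ['retest', 'test']
corrupted = ('sub3', 'retest')


def sessions(subject_ids, visit_ids):
    """All (subject, visit) pairs implied by two ID lists."""
    return list(product(subject_ids, visit_ids))


all_sessions = sessions(subjects, visits)
drop_subject = sessions(['sub1', 'sub2'], visits)   # as in dset2_filter_subs
drop_visit = sessions(subjects, ['test'])           # as in dset2_filter_vis

print(len(all_sessions), len(drop_subject), len(drop_visit))
print(corrupted in drop_subject, corrupted in drop_visit)
```

Under that assumption, dropping the subject also discards the intact `test` session of `sub3`, while dropping the visit discards the `retest` sessions of all three subjects.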


### Datasets on XNAT

In addition to data stored on your local file system, Arcana transparently handles all the interactions with datasets stored in an XNAT repository.

To test this, we will use a public project set up on Monash's public XNAT instance:

In [16]:
import os.path as op
from arcana import XnatRepo
xnat_repo = XnatRepo(server='https://xnat.monash.edu', cache_dir=op.expanduser('~/xnat-cache'))
print(xnat_repo)
xnat_dataset = xnat_repo.dataset('MISC0002')  # This is the ID of the project on MXNAT
print(xnat_dataset)
print('subjects:', list(xnat_dataset.subject_ids))
print('visits:', list(xnat_dataset.visit_ids))

XnatRepo(server=http://xnat.monash.edu, cache_dir=/Users/tclose/xnat-cache)
Dataset(name='MISC0002', depth=2, repository=XnatRepo(server=http://xnat.monash.edu, cache_dir=/Users/tclose/xnat-cache))




subjects: ['sub01', 'sub02', 'sub03']
visits: ['retest', 'test']


**Note:** If you have a look at the 'MISC0002' project on https://xnat.monash.edu.au, you will notice that subjects and sessions are labelled according to the conventions used at MBI, i.e. PROJECTID_SUBJECTID and PROJECTID_SUBJECTID_VISITID for subject and session IDs, respectively. This is a current limitation of Arcana, although it should be relaxed in the next month or so.
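The labelling convention can be illustrated with a short sketch that splits a session label back into its subject and visit IDs. `split_session_label` is a hypothetical helper, not an Arcana function, and the example label is simply the convention applied to the IDs printed above rather than a label read from the server.

```python
def split_session_label(label, project_id):
    """Split 'PROJECTID_SUBJECTID_VISITID' into (subject_id, visit_id).

    The split is made at the first underscore after the project prefix,
    so subject IDs that themselves contain underscores are not supported.
    """
    prefix = project_id + '_'
    if not label.startswith(prefix):
        raise ValueError(
            f"'{label}' does not start with project ID '{project_id}'")
    subject_id, sep, visit_id = label[len(prefix):].partition('_')
    if not (sep and subject_id and visit_id):
        raise ValueError(f"'{label}' has no subject/visit part")
    return subject_id, visit_id


ids = split_session_label('MISC0002_sub01_retest', 'MISC0002')
print(ids)
```

The reliance on a fixed separator is what makes the convention restrictive: a label that does not follow it cannot be mapped to a subject and visit.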

### Notes on Other Repository/Dataset Types

At this stage, XNAT is the only data repository platform supported by Arcana. However, care has been taken to modularise the code as much as possible, so it should be fairly straightforward to implement support for other platforms (e.g. Loris, DaRIS, MyTaRDIS) as long as they have a REST API (or equivalent) that enables you to list, get and put data. See the base repository class `arcana.repository.base.Repository` for details on the six abstract methods that need to be overridden.
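The "list, get and put" requirement can be pictured with a toy interface. This sketch is deliberately not Arcana's `Repository` class: the three method names are hypothetical, they do not correspond to the six abstract methods mentioned above, and the backend is an in-memory dictionary rather than a REST service.

```python
from abc import ABC, abstractmethod


class MinimalRepo(ABC):
    """Hypothetical three-method interface; NOT Arcana's base class."""

    @abstractmethod
    def list_names(self, subject_id, visit_id):
        """Return the names of the items stored for one session."""

    @abstractmethod
    def get(self, subject_id, visit_id, name):
        """Return the stored data for one named item."""

    @abstractmethod
    def put(self, subject_id, visit_id, name, data):
        """Store data under a name within one session."""


class InMemoryRepo(MinimalRepo):
    """Toy backend that keeps everything in a dictionary."""

    def __init__(self):
        self._store = {}

    def list_names(self, subject_id, visit_id):
        return sorted(self._store.get((subject_id, visit_id), {}))

    def get(self, subject_id, visit_id, name):
        return self._store[(subject_id, visit_id)][name]

    def put(self, subject_id, visit_id, name, data):
        self._store.setdefault((subject_id, visit_id), {})[name] = data


repo = InMemoryRepo()
repo.put('sub1', 'test', 'magnitude', b'raw image bytes')
repo.put('sub1', 'test', 'brain', b'derived image bytes')
print(repo.list_names('sub1', 'test'))
```

A real backend would replace the dictionary operations with calls to the platform's API while keeping the same three entry points.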

There used to be a DaRIS module in early versions of Arcana, which could be resurrected without too much effort if you have a DaRIS instance to test against.

`Banana` also adds support for [BIDS](https://bids.neuroimaging.io) via the `BidsDataset` class. `BidsDataset` objects are able to parse the BIDS-specific tree structure and insert derivatives in the `derivatives` directory.

## Configuring the Software Environment

In [None]:
##