# ISLES 2022 Example
This notebook shows how to generate predictions for submission to the ISLES 2022 challenge. We'll cover downloading the data, loading it for training, and writing predictions in the format expected for submission.

## Data Download
There are two tasks in ISLES 2022: multi- and single-channel segmentation. The single-channel task (Task 2) uses the ATLAS 2.0 dataset.

### Task 2: ATLAS 2.0

We can use the `atlas` module provided with this notebook to download and reformat the data.

In [None]:
import atlas
atlas.data_fetch()

The data will take a few minutes to download. The resulting file is an encrypted archive; you will first need to decrypt it. You can do so by following the [instructions on the ATLAS 2.0 download page](http://fcon_1000.projects.nitrc.org/indi/retro/atlas_download.html). The following code will prompt you for a password and then decrypt the archive.

In [None]:
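import getpass, subprocess
# Decrypt the data; prompt user for password.
# check=True raises CalledProcessError if a command exits with an error,
# so we don't try to extract an archive that failed to decrypt.
subprocess.run(['openssl', 'aes-256-cbc', '-md', 'sha256',
                '-d', '-a', '-in',
                'ATLAS_R2.0_encrypted.tar.gz', '-out', 'ATLAS_R2.0.tar.gz',
                '-pass', f'pass:{getpass.getpass("Enter password: ")}'],
               check=True)

# Extract the decrypted archive
subprocess.run(['tar', '-xzf', 'ATLAS_R2.0.tar.gz'], check=True)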

We should now have a directory called `ATLAS_2` in the current working directory:

In [None]:
import os
os.listdir('./')

The data distributed by INDI is not compatible with PyBIDS, but the `atlas` module can convert it:

In [None]:
import atlas
atlas.bidsify_indi_atlas('ATLAS_2/', 'data/')

The data is now split into two directories: `data/train` and `data/test`. The `train` directory contains images with labels for training your model. The `test` directory contains the images that your model will need to segment. The archive files you downloaded can now be safely deleted.
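
As a quick sanity check on the converted data, the sketch below counts subject directories in each split. It assumes the layout `<split>/derivatives/ATLAS/sub-*`, which matches the example file path printed later in this notebook for the test split; the helper name `count_subjects` is hypothetical and not part of the `atlas` module.

```python
from pathlib import Path


def count_subjects(split_dir, derivative_name="ATLAS"):
    """Count `sub-*` directories under <split_dir>/derivatives/<derivative_name>.

    Hypothetical helper; the directory layout is an assumption.
    """
    derivative_dir = Path(split_dir) / "derivatives" / derivative_name
    if not derivative_dir.is_dir():
        return 0  # Split not found (e.g. conversion has not been run yet)
    return sum(1 for p in derivative_dir.glob("sub-*") if p.is_dir())


for split in ("data/train", "data/test"):
    print(f"{split}: {count_subjects(split)} subject directories")
```

A count of zero for either split suggests that the conversion step did not complete or that the layout differs from the one assumed here.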

## Data Loading

To train your model, you'll need to load data samples that are matched with their targets. We provide a Python package for doing just that: [BIDSIO](https://github.com/npnl/bidsio). The following code will walk you through loading matched data. We recommend reading through the BIDSIO GitHub page for up-to-date explanations of the different fields.

In [None]:
import bidsio
bids_loader = bidsio.BIDSLoader(data_entities=[{'subject': '',
                                               'session': '',
                                               'suffix': 'T1w',
                                               'space': 'MNI152NLin2009aSym'}],
                                target_entities=[{'suffix': 'mask',
                                                'label': 'L',
                                                'desc': 'T1lesion'}],
                                data_derivatives_names=['ATLAS'],
                                target_derivatives_names=['ATLAS'],
                                batch_size=2,
                                root_dir='data/train/')

We'll examine a few properties of the loader. First, let's verify that we have the correct number of subjects:

In [None]:
tmp = bids_loader.load_sample(0)
print(f'There are {len(bids_loader)} subjects in our dataset.')
print(f'Every sample loads {len(tmp)} images.')
print(f'Images have the dimensions: {bids_loader.data_shape}')
print(f'Every batch will load {bids_loader.batch_size} samples.')

Our loader can also provide a generator to allow us to iterate through the dataset. The generator is accessed via the `load_batches` method:

In [None]:
for data, target in bids_loader.load_batches():
    print(f'Our data has the shape {data.shape}')
    print(f'Our target has the shape {target.shape}')
    # Cast to library and transfer to desired device
    # Train model
    break

Note the dimensions of our data; they have been reshaped to be consistent with libraries such as PyTorch:  
(Sample in batch, channel, X, Y, Z)  
You can cast the arrays to the package of your choice.
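
To make the layout concrete, here is a minimal sketch using a small synthetic array in place of a real batch. The shapes are made up for illustration (the ATLAS volumes are much larger), and the cast to 32-bit floats is an assumption about what your library of choice expects.

```python
import numpy as np

# Synthetic stand-in for one batch; real arrays come from load_batches().
batch_size, channels, nx, ny, nz = 2, 1, 4, 5, 6
rng = np.random.default_rng(0)
data = rng.random((batch_size, channels, nx, ny, nz))

# Many deep-learning libraries work with 32-bit floats by default.
data32 = np.ascontiguousarray(data, dtype=np.float32)

# Select a single 3D volume: sample 0, channel 0.
volume = data32[0, 0, ...]

print(f'Batch shape: {data32.shape}, dtype: {data32.dtype}')
print(f'Single volume shape: {volume.shape}')
```

With PyTorch, for example, `torch.from_numpy(data32)` would then give a tensor that shares memory with the array.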

## Predictions
Once your model is trained, you'll want to make predictions on the test data and upload them for evaluation. We expect the data to be formatted as a BIDS dataset. In this section, we'll show you how to easily format your predictions without having to go through the BIDS standard.  
First, we'll load the test data:

In [1]:
import bidsio
bids_loader = bidsio.BIDSLoader(data_entities=[{'subject': '',
                                               'session': '',
                                               'suffix': 'T1w',
                                               'space': 'MNI152NLin2009aSym'}],
                                target_entities=[],
                                data_derivatives_names=['ATLAS'],
                                batch_size=4,
                                root_dir='data/test/')

In [2]:
for dat, image_list in bids_loader.load_batch_for_prediction():
    print(f'Data shape: {dat.shape}')
    print(f'Example BIDS file: {image_list[0]}')
    break

Data shape: (4, 1, 197, 233, 189)
Example BIDS file: (<BIDSImageFile filename='/home/lex/NPNL/projects/isles_tutorial/ISLES_tutorial/data/test/derivatives/ATLAS/sub-r005s016/ses-1/anat/sub-r005s016_ses-1_space-MNI152NLin2009aSym_T1w.nii.gz'>,)


You'll notice that we use a different generator for loading the test data. In addition to the data, this generator yields the BIDS image files from which the data were loaded. We'll create a new BIDS directory using this information.  
First, we'll need to create a model:

In [3]:
# Create a placeholder model.
import numpy as np
class some_model():
    def __init__(self):
        '''
        Simple model to serve as an example.
        '''

    def predict(self, data: np.ndarray) -> np.ndarray:
        '''
        Returns 1 for voxels whose values are greater than the mean of `data`.
        The mean is taken over the whole input array, i.e. over all samples in the batch.
        Parameters
        ----------
        data : np.ndarray
            Data for which to make a prediction of the labels.
        Returns
        -------
        np.ndarray
            Model prediction for the input data.
        '''
        data_mean = np.mean(data)
        return np.array(data > data_mean, dtype=np.float32)
your_model = some_model()
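
Note that `np.mean(data)` in the model above is computed over the whole batch, so every sample is compared against the same threshold. If a per-image threshold is wanted instead, one possible variant is sketched below; the function name `predict_per_sample` is hypothetical and the tiny arrays are synthetic.

```python
import numpy as np


def predict_per_sample(data: np.ndarray) -> np.ndarray:
    """Threshold each sample against its own mean (illustrative variant)."""
    # Average over every axis except the batch axis; keepdims=True lets the
    # means broadcast against the (sample, channel, X, Y, Z) array.
    reduce_axes = tuple(range(1, data.ndim))
    sample_means = data.mean(axis=reduce_axes, keepdims=True)
    return (data > sample_means).astype(np.float32)


# Two tiny synthetic "images" with very different intensity ranges.
batch = np.zeros((2, 1, 2, 2, 1))
batch[0, 0] = np.array([[0, 1], [2, 3]]).reshape(2, 2, 1)
batch[1, 0] = np.array([[100, 101], [102, 103]]).reshape(2, 2, 1)

pred = predict_per_sample(batch)
print(pred[0].sum(), pred[1].sum())  # Voxels above the mean in each sample
```

With a single batch-wide mean, the first image here would be entirely below the threshold and the second entirely above it; the per-sample version marks the brighter half of each image.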

The `your_model` object will be used as a stand-in for a fully-trained model.  
As before, we'll use the `load_batch_for_prediction` method to obtain our data. We can write out our predictions as we generate them using the `write_image_like` method:

In [4]:
help(bids_loader.write_image_like)

Help on function write_image_like in module bidsio.bidsloader:

write_image_like(data_to_write: <built-in function array>, image_to_imitate: bids.layout.models.BIDSImageFile, new_bids_root: str, new_entities: dict = None)
    Writes an image to a different BIDS directory using the path pattern of an existing image. Optionally
    inserts new entities and replaces existing values.
    Parameters
    ----------
    data_to_write : np.array
        Image data to save.
    image_to_imitate : BIDSImageFile
        Image with BIDS entities to imitate
    new_bids_root : str
        BIDS root to save image in.
    new_entities : dict
        Optional. Entity-value pairs to overwrite
    
    Returns
    -------
    None



In [21]:
example_output_dir = 'prediction_bids/'  # Directory where to write out predictions
for dat, image_list in bids_loader.load_batch_for_prediction():
    prediction = your_model.predict(dat)  # Make a prediction
    # Reduce to set of 3D images
    for i in range(prediction.shape[0]):  # Iterate through each sample in the batch
        pred_out = prediction[i,0,...]
        image_ref = image_list[i][0]
        print(f"Writing image for subject {image_ref.entities['subject']}")
        
        bids_loader.write_image_like(data_to_write=pred_out,
                                     image_to_imitate=image_ref,
                                     new_bids_root=example_output_dir,
                                     new_entities={'label': 'L',
                                                   'suffix': 'mask'})
    break

Writing image for subject r005s016
Writing image for subject r005s025
Writing image for subject r005s030
Writing image for subject r005s034


One file is written for each subject in the batch. Let's verify that the files were created:

In [22]:
import os
for p, _, fnames in os.walk(example_output_dir):  # Walk through dir structure
    for f in fnames:
        print(os.path.join(p, f))  # Print full path of files that are found

prediction_bids/sub-r005s016/ses-1/anat/sub-r005s016_ses-1_space-MNI152NLin2009aSym_label-L_mask.nii.gz
prediction_bids/sub-r005s034/ses-1/anat/sub-r005s034_ses-1_space-MNI152NLin2009aSym_label-L_mask.nii.gz
prediction_bids/sub-r005s025/ses-1/anat/sub-r005s025_ses-1_space-MNI152NLin2009aSym_label-L_mask.nii.gz
prediction_bids/sub-r005s030/ses-1/anat/sub-r005s030_ses-1_space-MNI152NLin2009aSym_label-L_mask.nii.gz
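
The file names listed above follow the BIDS pattern of `key-value` pairs joined by underscores, ending with a suffix and an extension. As a rough illustration of that pattern, the sketch below splits a file name into its parts with plain string operations. The helper `parse_bids_filename` is hypothetical and far less thorough than PyBIDS; note that it returns the short keys as written in the file name (`sub`, `ses`) rather than the long entity names PyBIDS uses (`subject`, `session`).

```python
import os


def parse_bids_filename(path):
    """Split a BIDS-style file name into (entities, suffix, extension).

    Illustrative only: assumes every part except the last is `key-value`.
    """
    name = os.path.basename(path)
    stem, _, extension = name.partition('.')  # Split at the first dot
    parts = stem.split('_')
    entities = dict(part.split('-', 1) for part in parts[:-1])
    suffix = parts[-1]
    return entities, suffix, '.' + extension


entities, suffix, extension = parse_bids_filename(
    'prediction_bids/sub-r005s016/ses-1/anat/'
    'sub-r005s016_ses-1_space-MNI152NLin2009aSym_label-L_mask.nii.gz')
print(entities)
print(suffix, extension)
```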


You should see one image for each sample in a batch, with `label-L` and `mask` inserted into the filename. BIDS requires one more file, `dataset_description.json`, which we can create with `write_dataset_description`:

In [23]:
help(bidsio.BIDSLoader.write_dataset_description)

Help on function write_dataset_description in module bidsio.bidsloader:

write_dataset_description(bids_root: str, dataset_name: str, author_names: list = None, derivative_name: str = None, derivative_version: str = '1.0')
    Writes the dataset_description.json file to the BIDS root.
    Parameters
    ----------
    bids_root : str
        Path to the BIDS data root directory.
    dataset_name : str
        Name to enter for the various "Name" fields in `dataset_description.json`
    author_names : list
        Optional. List of authors.
    derivative_name : str
        Optional. If not None, write to the `derivatives/derivative_name/` directory instead of the root directory.
    derivative_version : str
        Optional. Version of the pipeline used to generate.
    Returns
    -------
    None



In [32]:
bidsio.BIDSLoader.write_dataset_description(bids_root=example_output_dir,
                                            dataset_name='atlas2_prediction',
                                            author_names=['Hutton, A.'])

We can then take a look at the JSON file:

In [33]:
import json, os
# Use a context manager so the file is closed even if loading fails
with open(os.path.join(example_output_dir, 'dataset_description.json')) as f:
    dataset_description = json.load(f)
print(dataset_description)

{'Name': 'atlas2_prediction', 'BIDSVersion': '1.6.0', 'Authors': ['Hutton, A.'], 'PipelineDescription': {'Name': 'atlas2_prediction'}, 'GeneratedBy': [{'Name': 'atlas2_prediction', 'Version': '1.0'}]}
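
The BIDS specification requires the `Name` and `BIDSVersion` fields in `dataset_description.json`. A minimal sketch of a check for them follows; the helper `missing_description_fields` is hypothetical, and the example runs on a temporary directory so that it does not depend on the files written above.

```python
import json
import os
import tempfile


def missing_description_fields(bids_root, required=('Name', 'BIDSVersion')):
    """Return the required keys absent from <bids_root>/dataset_description.json."""
    path = os.path.join(bids_root, 'dataset_description.json')
    with open(path) as f:
        description = json.load(f)
    return [key for key in required if key not in description]


# Demonstrate on a throwaway directory so the sketch runs on its own.
with tempfile.TemporaryDirectory() as tmp_root:
    with open(os.path.join(tmp_root, 'dataset_description.json'), 'w') as f:
        json.dump({'Name': 'atlas2_prediction'}, f)  # 'BIDSVersion' omitted on purpose
    missing = missing_description_fields(tmp_root)
    print(missing)
```

Calling `missing_description_fields(example_output_dir)` on the directory written in this notebook should return an empty list, since the description printed above contains both fields.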


Our predictions are now a BIDS-compatible dataset and can be submitted to the Grand Challenge (GC) website. As a final check, we can load the directory with PyBIDS:

In [38]:
import bids
prediction_bids = bids.BIDSLayout(root=example_output_dir, derivatives=example_output_dir)
print(prediction_bids.derivatives['atlas2_prediction'])

BIDS Layout: ...ISLES_tutorial/prediction_bids | Subjects: 4 | Sessions: 4 | Runs: 0
