# Data harmonisation in imaging - the problem
## Author: David M Cash
## Principal Research Fellow, Dementia Research Centre, UCL Queen Square Institute of Neurology
This notebook was made as part of the **Health and Bioscience IDEAS** training programme, funded by the [UKRI Innovation Scholars](https://www.ukri.org/opportunity/innovation-scholars-data-science-training-in-health-bioscience/) Data Science Training in Health and Bioscience. Please visit [our website](https://healthbioscienceideas.github.io/) for more information.

This notebook gives a brief example of how differences can arise when data in a study are acquired on multiple scanners. For more background, please see the [accompanying documentation](https://healthbioscienceideas.github.io/demon-imaging-harmonisation/).

To illustrate this problem, let's take a scenario where we expect the least amount of change. We will look at two scans from a cognitively normal individual, one acquired on Scanner A and the other on Scanner B. Both scanners have a magnetic field strength of 3 Tesla; however, they are two different models. We will register the two scans together and determine what differences, if any, are present between them.

## Setup
To start with, we will import a few basic packages and set up the notebook to have interactive content (the matplotlib widget *magic* command in Jupyter).

In [1]:
import os
import shutil
import matplotlib
%matplotlib widget

## Getting the data
For this demonstration, I used [Datalad](https://www.datalad.org/) to set up the data repository. Datalad is built upon git and git-annex, and it provides version control for data management, data sharing and reproducible science. For this lesson, it offers us a convenient way of grabbing the data that we need. These commands import datalad so that we can use it to clone the repository into our virtual machine.

In [2]:
import nest_asyncio
nest_asyncio.apply()

In [3]:
import datalad.api as dl

Next we "clone" the repository to a local destination. No big files are downloaded yet, just the metadata, so that this computer knows what data exist.

In [4]:
dl_source='https://github.com/HealthBioscienceIDEAS/demon-imaging-data.git'
sample=dl.clone(dl_source,path='/tmp/sample',description='Cloned sample dataset for import')
sample.update(merge=True)
sample.siblings(action='enable',name='demons-data')

[INFO] Fetching updates for Dataset(/tmp/sample) 


.: demons-data(?) [git]


[{'action': 'enable-sibling',
  'path': '/tmp/sample',
  'type': 'sibling',
  'name': 'demons-data',
  'refds': '/tmp/sample',
  'status': 'ok'}]

Now we get the first and second images from the same individual using the datalad get command. Large files are not downloaded automatically when cloning a repository with datalad. They are only downloaded as and when they are needed, saving valuable space, especially when working with large repositories. The initial download will take a few seconds, so please be patient. While the cell is running, there will be an * between the square brackets; when it completes, the * is replaced by a number and a short output appears below the cell.

In [5]:
bl_img=sample.get('./baseline_t1.nii.gz')
print(bl_img[0]['path'])

/tmp/sample/baseline_t1.nii.gz


In [6]:
fu_img=sample.get('./followup_t1.nii.gz')
print(fu_img[0]['path'])

/tmp/sample/followup_t1.nii.gz


## Intrasubject registration
The Python package [nipype](https://nipype.readthedocs.io/en/latest/index.html) provides an effective way to run image processing workflows. It wraps individual commands as "building blocks", allowing you to piece together parts of various software packages (FSL, SPM, FreeSurfer, AFNI, MRtrix, etc.) into complete pipelines. Here we are just going to use it to perform a simple registration between our baseline and followup images.

In [7]:
from nipype.interfaces import niftyreg


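This sets up the node that will run a linear registration using [NiftyReg](https://github.com/KCL-BMEIS/niftyreg), a lightweight, easy-to-use registration package. Note that `reg_aladin` estimates a rigid followed by an affine transformation by default, and no rigid-only option is set here. We set up the inputs and outputs, and the node generates the command line, which you can see below.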

In [8]:
out_path=os.path.join(os.getcwd(),'results')
os.makedirs(out_path,exist_ok=True)
followup_aff=os.path.join(out_path,'followup_to_baseline_aff.txt')
followup_res=os.path.join(out_path,'followup_t1_resampled.nii.gz')
node = niftyreg.RegAladin(verbosity_off_flag=True)
node.inputs.ref_file = bl_img[0]['path']
node.inputs.flo_file = fu_img[0]['path']
node.inputs.aff_file = followup_aff
node.inputs.res_file = followup_res
node.cmdline

'reg_aladin -aff /Users/davecash/src/demon-imaging-harmonisation/results/followup_to_baseline_aff.txt -flo /tmp/sample/followup_t1.nii.gz -omp 1 -ref /tmp/sample/baseline_t1.nii.gz -res /Users/davecash/src/demon-imaging-harmonisation/results/followup_t1_resampled.nii.gz -voff'

If we are happy with the command setup, then we run the code. This might take a little while, and no output will appear below until the registration has completed.

In [14]:
node.run()

210913-13:29:26,170 nipype.interface INFO:


<nipype.interfaces.base.support.InterfaceResult at 0x7fe63502c4d0>

In [15]:
baseline_nii=os.path.join(out_path,'baseline_t1.nii.gz')
shutil.copyfile(bl_img[0]['path'],baseline_nii)

'/Users/davecash/src/demon-imaging-harmonisation/results/baseline_t1.nii.gz'

It has now generated the affine transformation between the two images and a resampled followup image. You can now see these and the baseline image in the results directory of the notebook (it should be on the left-hand side).
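The affine file can also be inspected numerically. The sketch below assumes the file is a plain-text 4x4 matrix (four rows of four whitespace-separated numbers), which is how NiftyReg normally writes it. The helper names `read_affine` and `linear_determinant` and the demo matrix are hypothetical, made up for illustration. The determinant of the upper-left 3x3 block gives the volume scaling implied by the transformation; which image is scaled relative to which depends on NiftyReg's mapping convention, so check that before interpreting the sign of any change.

```python
import os
import tempfile


def read_affine(path):
    """Read a 4x4 affine stored as four rows of whitespace-separated numbers."""
    with open(path) as handle:
        rows = [[float(v) for v in line.split()] for line in handle if line.strip()]
    if len(rows) != 4 or any(len(row) != 4 for row in rows):
        raise ValueError("expected a 4x4 matrix in " + path)
    return rows


def linear_determinant(affine):
    """Determinant of the upper-left 3x3 block (the implied volume scaling)."""
    (a, b, c), (d, e, f), (g, h, i) = (row[:3] for row in affine[:3])
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


# Hypothetical matrix, for illustration only: a 1% stretch along each axis
# plus a small shift. It is NOT the result of the registration above.
demo_path = os.path.join(tempfile.mkdtemp(), "demo_aff.txt")
with open(demo_path, "w") as handle:
    handle.write("1.01 0 0 2.5\n0 1.01 0 -1.0\n0 0 1.01 0.3\n0 0 0 1\n")

demo_affine = read_affine(demo_path)
demo_translation = [row[3] for row in demo_affine[:3]]  # last column, rows 1-3
demo_scaling = linear_determinant(demo_affine)
print("translation:", demo_translation)
print("volume scaling factor: %.4f" % demo_scaling)
```

In the notebook, calling `read_affine(followup_aff)` would load the matrix produced by the registration. A scaling factor that differs noticeably from 1 in a healthy individual scanned twice would point to a scanner-related geometric difference rather than a biological one.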

## Checking the registration

After the registration, corresponding anatomy should be aligned in the baseline image and the registered followup image. The best way to check the registration is to download the images from the results directory and open them in your favourite image viewer, so that you can look at both images with a linked cursor, or toggle back and forth between the images to see the differences. You can also look at them in this notebook below using a lightweight 3D slice viewer called [nanslice](https://github.com/spinicist/nanslice).

In [16]:
import nanslice.jupyter as ns

Let's first look at the baseline image. Scroll around using the sliders for the slice location. You can also change the brightness and contrast by modifying the clim argument in the line below.   

In [17]:
base=ns.Layer(baseline_nii,clim=(30,350),label="Baseline")
ns.three_plane(base,cbar=True,interactive=True)

Canvas(toolbar=Toolbar(toolitems=[('Home', 'Reset original view', 'home', 'home'), ('Back', 'Back to previous …

VBox(children=(HBox(children=(FloatSlider(value=0.000164031982421875, description='X:', max=113.85000610351562…

Now for the repeat scan. The clim arguments as originally set should produce similar greyscales for both images. Make sure the slice sliders are at the same location in both viewers.  
**What differences do you notice?**

In [18]:
repeat=ns.Layer(followup_res,clim=(20,220),label="Followup")
ns.three_plane(repeat,cbar=True,interactive=True)

Canvas(toolbar=Toolbar(toolitems=[('Home', 'Reset original view', 'home', 'home'), ('Back', 'Back to previous …

VBox(children=(HBox(children=(FloatSlider(value=0.000164031982421875, description='X:', max=113.85000610351562…
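The two viewers above use different clim ranges to look alike, which is itself a sign that the scanners scale intensities differently. As a rough sketch, a display window can be derived from percentiles of the voxel intensities instead of being picked by hand. The function names and the intensity lists below are made up for illustration; in the notebook, the values would come from the image data (for example a flattened voxel array loaded with nibabel, assuming it is installed).

```python
def percentile(sorted_values, q):
    """Linearly interpolated percentile (0-100) of an already sorted list."""
    position = (len(sorted_values) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction


def display_window(intensities, low=2, high=98):
    """Return (lower, upper) display limits from two intensity percentiles."""
    ordered = sorted(intensities)
    return percentile(ordered, low), percentile(ordered, high)


# Made-up intensities standing in for two scans of the same anatomy,
# where the second scanner reports values at 60% of the first.
scan_a = [float(v) for v in range(0, 401)]
scan_b = [0.6 * v for v in scan_a]

window_a = display_window(scan_a)
window_b = display_window(scan_b)
print("window A:", window_a)
print("window B:", window_b)
```

Passing each window as the clim argument of the corresponding layer would give the two images a comparable appearance, and the ratio between the windows gives a first impression of how far apart the intensity scales are.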

In some cases, changing scanners can also result in subtle geometric distortions between scans, which can produce differences in volume that are neither physiological nor pathological in nature. The animated GIF below, which toggles between the baseline and followup scans, shows not just the changes in intensity and contrast between the two scans, but changes in shape as well. These three effects are some of the main reasons why data harmonisation between sites is needed.
![DistortionExample](registration.gif "registration")
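The intensity difference between two aligned scans can be summarised with a straight-line fit of followup intensity against baseline intensity at corresponding voxels. This is only a minimal sketch for describing the mismatch, not a harmonisation method, and it assumes the relationship is roughly linear. The function name `fit_line` and the voxel values are invented for illustration.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented intensities at five corresponding voxels; the followup values
# are constructed as 0.6 * baseline + 10 so the fit has a known answer.
baseline_voxels = [100.0, 150.0, 200.0, 250.0, 300.0]
followup_voxels = [0.6 * v + 10.0 for v in baseline_voxels]

slope, intercept = fit_line(baseline_voxels, followup_voxels)
print("slope: %.3f, intercept: %.3f" % (slope, intercept))
```

A slope away from 1 reflects a difference in intensity scaling, and a non-zero intercept reflects an offset. With real data the points scatter around the line, and different tissues may follow different lines, which is how a change in contrast shows up.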