# Tutorial Day 1: Introduction to AI in medical imaging

In this interactive session you will learn more about XNAT, how XNAT stores data, and how you can access and download imaging data from an XNAT server. During this session we will mostly work with the central XNAT server (https://xnat.bmia.nl), but we also provide references for more advanced use cases (e.g. uploading data to a project you have access to).

XNAT offers a REST API that allows its users to interact with an XNAT installation using scripts. Typically, you interact with a REST API by sending HTTP requests: `GET` for getting data from the server, `POST` for sending data to the server (including, for example, your login information), etc. However, this way of interacting is not convenient for research use cases. Because of that, we have built a Python wrapper for the XNAT REST API: xnatpy. If you would like to know more about how the library works, see the xnatpy [official documentation](https://xnat.readthedocs.io/en/latest/index.html).
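
To make the raw REST interaction concrete, here is a minimal sketch that only builds a `GET` request without sending it. The `/data/projects` path and the `format=json` query are assumptions based on the commonly documented XNAT REST layout, and `build_project_list_request` is a hypothetical helper; check your server's API documentation before relying on either.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_project_list_request(server_url):
    """Build (but do not send) a GET request for the project list."""
    # Assumed endpoint: /data/projects lists the projects visible to the caller
    query = urlencode({"format": "json"})
    url = f"{server_url.rstrip('/')}/data/projects?{query}"
    return Request(url, method="GET", headers={"Accept": "application/json"})


request = build_project_list_request("https://xnat.bmia.nl")
print(request.get_method(), request.full_url)
```

Sending the request would additionally require `urllib.request.urlopen`, manual JSON parsing, and authentication handling, which is the kind of bookkeeping xnatpy takes off your hands.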

Here is how it can help you in your research work:

In [None]:
# Uncomment if using Google Colab
# !pip install xnat
# import os
# os.makedirs("data", exist_ok=True)  # does not fail if the folder already exists

In [None]:
# Necessary imports

import xnat
import os
import pydicom
import zipfile
import matplotlib.pyplot as plt
from skimage import filters

## Connecting to the XNAT server
First, we connect to the server by using the `connect` function from the `xnat` package:

In [None]:
session = xnat.connect('https://xnat.bmia.nl')

In general, the `connect` function can take quite a few parameters (for example, your username), but since we are connecting to the public XNAT instance as a guest, we don't need to provide anything except the server location. To terminate the session, call `session.disconnect()` after you have finished your work.
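
If you prefer not to call `disconnect()` by hand, an xnatpy session can also be used as a context manager. The sketch below wraps that pattern in a hypothetical helper, `list_project_ids` (not part of xnatpy); the `user` and `password` arguments are only needed for non-guest access.

```python
def list_project_ids(server_url, user=None, password=None):
    """Return the IDs of all projects visible on the server (hypothetical helper)."""
    import xnat  # imported here so that defining the helper needs no connection

    # The session is closed automatically when the with-block exits,
    # even if an exception is raised inside it.
    with xnat.connect(server_url, user=user, password=password) as session:
        return list(session.projects.keys())


# Example call (requires network access):
# list_project_ids("https://xnat.bmia.nl")
```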

## Exploring the XNAT server

As we mentioned in our supplementary material, XNAT has a hierarchical data model:
- Projects
- Subjects
- Sessions (a specific visit to a scanner)
- Scans (a specific type of scan during that visit)
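
To make the nesting concrete, here is a toy stand-in built from plain dictionaries. All labels are made up for illustration and do not exist on the server; real xnatpy objects carry far more information than this.

```python
# Toy hierarchy: project -> subject -> session -> scan types (all labels hypothetical)
toy_project = {
    "subject_01": {
        "session_A": ["T1", "FLAIR"],
        "session_B": ["T1"],
    },
    "subject_02": {
        "session_A": ["T1"],
    },
}


def count_scans(project):
    """Count scans by walking every subject and every session."""
    return sum(
        len(scans)
        for sessions in project.values()
        for scans in sessions.values()
    )


print(count_scans(toy_project))  # 4
```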

For each of the levels of the hierarchy, a corresponding xnatpy class exists. Let's start by looking at all the projects from the server that we have connected to:

In [None]:
session.projects

As you can see, we receive quite a long list of projects. `xnatpy` uses a specific class, `XNATListing`, to represent collections of objects in XNAT. As a user, you can treat it like a dictionary. For example, to access the **sandbox** project, simply use its name as a key:

In [None]:
sandbox = session.projects['sandbox']
sandbox

If you would like to open the selected project (or any other XNAT object) in the browser, or need a URL for your script, you can use the `external_uri` method:

In [None]:
sandbox.external_uri()

It is very common to run the same processing step for every subject in an XNAT project (e.g., calculating brain volume). Since `subjects` is also an `XNATListing`, you can loop over it in the following way:

In [None]:
for subj in sandbox.subjects.values():
    print(subj.label)

Additionally, each subject can have custom variables (not defined by XNAT) assigned to it. For every type of object that supports them, they can be accessed via the `fields` attribute:

In [None]:
subject = sandbox.subjects["ANONYMIZ"]
subject.fields

## Exercise 1
Using what you have learned above, write Python code that prints the labels and ages of all subjects older than 85 from the **WORC** project (it can take some time during the first run):

In [None]:
# Solution:

## Downloading data from XNAT
Often you will want to process data locally. XNAT (and `xnatpy`) lets you download the data associated with an XNAT object. Assume that we have selected a subject and would like to process one of its scans. First, let's list all the experiments of the selected subject:

In [None]:
subject.experiments

Now, let's choose one of the experiments (we suggest using the `"ANONYMIZ"` experiment) and list all the scans belonging to it:

In [None]:
mri_session = subject.experiments["ANONYMIZ"]
mri_session.scans

Now, assuming that we are particularly interested in T1 data from this experiment, we can download it in the following way:

In [None]:
mri_session.scans['T1'].download(os.path.join("data", "T1.zip"))
with zipfile.ZipFile(os.path.join("data", "T1.zip"), "r") as zip_ref:
    zip_ref.extractall("data")

The cell above also extracts the downloaded archive, so let's explore and visualize its contents. When downloading files, `xnatpy` preserves the hierarchical nature of the XNAT data model, so paths to the actual imaging data can be quite long. We load the contents of a DICOM file using the `pydicom.dcmread` function:

In [None]:
dataset = pydicom.dcmread(os.path.join("data", "ANONYMIZ/scans/6-T1/resources/DICOM/files/1.3.6.1.4.1.40744.99.141253643552231291697372180164147575979-6-43-5opby1.dcm"))
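
Hard-coding such a long path is brittle. One alternative, sketched below, is to search the extracted folder for `.dcm` files. `find_dicom_files` is a hypothetical helper, and the demo runs on a temporary directory with empty placeholder files instead of real scans.

```python
import tempfile
from pathlib import Path


def find_dicom_files(root):
    """Return all .dcm files below root, sorted by path (hypothetical helper)."""
    return sorted(Path(root).rglob("*.dcm"))


# Demo on a throwaway directory that mimics the nested layout
with tempfile.TemporaryDirectory() as tmp:
    files_dir = Path(tmp) / "scans" / "6-T1" / "resources" / "DICOM" / "files"
    files_dir.mkdir(parents=True)
    for name in ("slice_2.dcm", "slice_1.dcm", "notes.txt"):
        (files_dir / name).touch()
    found_names = [path.name for path in find_dicom_files(tmp)]

print(found_names)  # ['slice_1.dcm', 'slice_2.dcm']
```

In the notebook, the first element of `find_dicom_files("data")` could then be passed to `pydicom.dcmread`.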

DICOM files consist of a header and image data bundled together. Information in the DICOM file header is stored as a collection of standardized tags. We can list all the information stored in the DICOM dataset by printing it:

In [None]:
dataset

Since DICOM files contain quite a lot of information, they usually go through de-identification: removing or replacing personal health information such as the patient name. Most of the time you don't need every piece of information in the scan, but printing a short summary is often useful:

In [None]:
print(f"Patient ID.......: {dataset.PatientID}")
print(f"Study description: {dataset.StudyDescription}")
print(f"Modality.........: {dataset.Modality}")
print(f"Study date.......: {dataset.StudyDate}")
print(f"Image size.......: {dataset.Rows} x {dataset.Columns}")
print(f"Pixel spacing....: {dataset.PixelSpacing}")
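
Header values often need a little post-processing before they are useful. DICOM stores dates as `YYYYMMDD` strings, and `PixelSpacing` holds the row and column spacing in millimetres. The sketch below uses made-up values in place of a real dataset; `describe_slice` is a hypothetical helper.

```python
from datetime import datetime


def describe_slice(study_date, rows, columns, pixel_spacing):
    """Convert raw header values into a date and a physical size in mm."""
    date = datetime.strptime(study_date, "%Y%m%d").date()
    row_spacing, col_spacing = (float(value) for value in pixel_spacing)
    # Physical extent = number of pixels times the spacing between pixel centres
    height_mm = rows * row_spacing
    width_mm = columns * col_spacing
    return date, height_mm, width_mm


# Made-up example values, not taken from the tutorial data
date, height_mm, width_mm = describe_slice("20200131", 256, 256, ["0.5", "0.5"])
print(date.isoformat(), height_mm, width_mm)  # 2020-01-31 128.0 128.0
```

With a loaded dataset, the call would presumably look like `describe_slice(dataset.StudyDate, dataset.Rows, dataset.Columns, dataset.PixelSpacing)`; note that de-identified files may have an empty or shifted study date.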

Now, let's visualize the DICOM slice that we have loaded:

In [None]:
plt.imshow(dataset.pixel_array, cmap=plt.cm.gray);

## Exercise 2

Download the FLAIR scan from this subject, plot any slice present in the dataset, and print out its summary. Try to see what differs between the two scans (Hint: look at the [description](https://en.wikipedia.org/wiki/Fluid-attenuated_inversion_recovery) of the FLAIR technique).

In [None]:
# Solution

## Applying image preprocessing

Now we are going to apply a Gaussian filter to the selected image slice. It is typically used in image processing pipelines to reduce noise by blurring away fine detail, so that structures at coarser scales stand out. It is also commonly applied before edge detection, since edge detectors are sensitive to noise in an image. For more information about the filter, have a look at its Wikipedia [page](https://en.wikipedia.org/wiki/Gaussian_blur). We will use the `filters.gaussian` implementation of this algorithm from the scikit-image library.
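
To see why smoothing suppresses noise, here is a one-dimensional sketch in plain Python. It illustrates the idea only and is not how `filters.gaussian` is implemented; truncating the kernel at three standard deviations and repeating the border value at the edges are assumptions chosen for brevity.

```python
import math


def gaussian_kernel(sigma):
    """Return a normalized 1D Gaussian kernel truncated at 3 * sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(x * x) / (2 * sigma * sigma)) for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, sigma):
    """Convolve a list of numbers with the kernel, repeating border values."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    last = len(signal) - 1
    smoothed = []
    for i in range(len(signal)):
        acc = 0.0
        for offset, weight in enumerate(kernel, start=-radius):
            j = min(max(i + offset, 0), last)  # clamp the index at the borders
            acc += weight * signal[j]
        smoothed.append(acc)
    return smoothed


# A flat signal with one noisy spike: smoothing spreads and lowers the spike
noisy = [0.0] * 10 + [1.0] + [0.0] * 10
result = smooth(noisy, sigma=2.0)
print(max(noisy), round(max(result), 3))
```

The spike's height drops sharply while its total "mass" is preserved, which is the same trade-off you see in two dimensions: less noise, but also softer detail.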

In [None]:
# Reload the original image from the file
dataset = pydicom.dcmread(os.path.join("data", "ANONYMIZ/scans/6-T1/resources/DICOM/files/1.3.6.1.4.1.40744.99.141253643552231291697372180164147575979-6-43-5opby1.dcm"))
original_image = dataset.pixel_array
# Plot the slice
plt.imshow(original_image, cmap=plt.cm.gray);

Now, let's apply an edge detection algorithm to the image. We will use an edge detection algorithm based on the [Sobel filter](https://en.wikipedia.org/wiki/Sobel_operator).
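
For intuition about what the Sobel filter computes, the sketch below applies the two standard 3x3 Sobel kernels to a single pixel neighbourhood in plain Python. `filters.sobel` may scale its output differently, so treat the numbers as relative rather than as what scikit-image returns.

```python
import math

# Standard Sobel kernels for horizontal (x) and vertical (y) intensity changes
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(patch):
    """Gradient magnitude at the centre of a 3x3 patch (list of 3 rows)."""
    gx = sum(SOBEL_X[r][c] * patch[r][c] for r in range(3) for c in range(3))
    gy = sum(SOBEL_Y[r][c] * patch[r][c] for r in range(3) for c in range(3))
    return math.hypot(gx, gy)


flat_patch = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]      # uniform intensity
edge_patch = [[0, 0, 10], [0, 0, 10], [0, 0, 10]]   # vertical edge on the right

print(sobel_magnitude(flat_patch))  # 0.0
print(sobel_magnitude(edge_patch))  # 40.0
```

Because the response depends on differences between neighbouring pixels, isolated noisy pixels also produce strong responses, which is why smoothing first changes the result so much.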

In [None]:
edges_original = filters.sobel(original_image)
plt.imshow(edges_original, cmap=plt.cm.gray);

As you can see, the algorithm detects quite a lot of edges throughout the whole image. Now we will apply Gaussian smoothing first and see how it affects the edge detection algorithm output.

In [None]:
filtered_image = filters.gaussian(original_image, sigma=3.0)
plt.imshow(filtered_image, cmap=plt.cm.gray);

In [None]:
edges_filtered = filters.sobel(filtered_image)
plt.imshow(edges_filtered, cmap=plt.cm.gray);

In [None]:
f, axes = plt.subplots(1,2, figsize=(15, 15))
axes[0].imshow(edges_original, cmap=plt.cm.gray)
axes[0].title.set_text("Edges from the original image")
axes[1].imshow(edges_filtered, cmap=plt.cm.gray)
axes[1].title.set_text("Edges after filtering the original image")

As you can see, after Gaussian smoothing the edge detection algorithm focuses on higher-level features of the image. **Exercise 3**: Try changing the value of the `sigma` parameter to see how it affects the outcome of the edge detection algorithm.

In [None]:
# Close the session
session.disconnect()

## (Optional) Importing data into XNAT

It is also possible to add data to the server using the XNAT REST API. `xnatpy` wraps it in the `import_` method. You can use it in the following way:

In [None]:
# session.services.import_('/path/to/archive.zip', project='project_name', subject='subject_name')

Uploading directly to the archive is often undesirable, as you might want to inspect the data before finalizing archival. XNAT has a dedicated intermediate storage area, the prearchive, which gives users a chance to review incoming data. You can upload data to the prearchive by specifying it as the destination:

In [None]:
# session.services.import_('/path/to/archive.zip', project='project_name', subject='subject_name', destination='/prearchive')

**NB**: Run these cells only after connecting to your local XNAT installation, and change the values to ones that make sense for it (correct paths, project and subject names, etc.).
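
The `import_` call expects an archive on disk. A minimal sketch of building one with the standard library follows. `zip_folder` is a hypothetical helper, the demo uses empty placeholder files in a temporary directory, and whether XNAT accepts a given folder layout depends on the server's import handler.

```python
import tempfile
import zipfile
from pathlib import Path


def zip_folder(folder, archive_path):
    """Write every file below folder into a zip archive, keeping relative paths."""
    folder = Path(folder)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(folder.rglob("*")):
            if path.is_file():
                archive.write(path, arcname=path.relative_to(folder).as_posix())
    return archive_path


# Demo with placeholder files; real use would point at a folder of DICOM files
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "session"
    (source / "scan_1").mkdir(parents=True)
    (source / "scan_1" / "slice_1.dcm").touch()
    (source / "scan_1" / "slice_2.dcm").touch()
    archive_file = zip_folder(source, Path(tmp) / "session.zip")
    with zipfile.ZipFile(archive_file) as archive:
        archived_names = sorted(archive.namelist())

print(archived_names)  # ['scan_1/slice_1.dcm', 'scan_1/slice_2.dcm']
```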