# Data visualization


This notebook visualizes the data collected in the AI4AR study.

Our data consists of multiparametric MRI (mpMRI) images and annotations prepared by three radiologists, as well as clinical information and biopsy results with International Society of Urological Pathology (ISUP) Gleason grading for each lesion. The radiologists annotated multiple imaging features for each lesion, following the assessment algorithm from the PI-RADS standard.

To understand the data better and to identify potential trends or patterns, we will create several kinds of visualizations:
- Descriptive statistics: We will use histograms, box plots, and summary statistics to get a sense of the distribution and range of values for each of the variables in the dataset.
- Correlation plots: We will use scatter plots and heatmaps to investigate the relationships between different variables, such as the imaging features and the Gleason grade.
- Data distributions: We will use density plots and violin plots to visualize the distribution of values for each of the variables.
- Data comparisons: We will use bar plots, line plots, and box plots to compare the values of different variables across different groups or categories.

In selecting appropriate data visualization methods, we will consider the type of data (categorical or continuous), the number of variables being plotted, and the intended audience.
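
As a rough illustration of how the data type drives the choice of plot, the sketch below draws a bar plot for a categorical variable and a box plot for a continuous one. The values are random placeholders, not AI4AR data, and the names `isup_grade` and `lesion_volume` are hypothetical stand-ins for columns of the real dataset.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic placeholder data (NOT AI4AR values), used only to show the plot types.
rng = np.random.default_rng(seed=0)
n_lesions = 200
isup_grade = rng.integers(1, 6, size=n_lesions)      # categorical: grades 1..5
lesion_volume = rng.lognormal(mean=0.0, sigma=0.5, size=n_lesions)  # continuous, hypothetical feature

grades = np.arange(1, 6)
# Number of lesions per grade (index 0 is unused, so it is dropped)
counts = np.bincount(isup_grade, minlength=6)[1:]
# Continuous values split by the categorical grouping variable
volume_by_grade = [lesion_volume[isup_grade == g] for g in grades]

fig, (ax_bar, ax_box) = plt.subplots(1, 2, figsize=(10, 4))

# Categorical variable: a bar plot of counts per category
ax_bar.bar(grades, counts)
ax_bar.set_xlabel('ISUP grade (synthetic)')
ax_bar.set_ylabel('Number of lesions')

# Continuous variable across groups: one box per category
ax_box.boxplot(volume_by_grade, positions=grades)
ax_box.set_xlabel('ISUP grade (synthetic)')
ax_box.set_ylabel('Lesion volume (synthetic, arbitrary units)')

fig.tight_layout()
```

In a notebook the figure is rendered at the end of the cell; in a script, a call to `plt.show()` would be needed.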

Let's begin by importing the necessary libraries and loading the data.

In [3]:
#!pip install --upgrade -e git+https://github.com/piotrsobecki/ai4ar-helper.git#egg=ai4ar

In [4]:
# Set up the notebook
%load_ext autoreload
%autoreload 2

# Add src to path
import sys 
import os 

if os.path.basename(os.getcwd()) != 'ai4ar-radiomics':
    os.chdir('..')

if 'src' not in sys.path:
    sys.path.append('src')


from config import config # For reading the config files

cfg = config(
    ('json', 'config/config.json', True),
    ('json', 'config/config-ext.json', True),
    ignore_missing_paths = True
)

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


# AI4AR Helper package

In [5]:
import ai4ar

### Visualize data for a single case

Initializing a `Dataset` does not load any images.

In [6]:
dataset = ai4ar.Dataset(cfg['data_dir'])

TypeError: 'function' object is not subscriptable

In [None]:
dataset.case_ids

In [None]:
case = dataset['001']
case.images_keys() # All images for the case

# Visualize a single image

Images are cached after the first access.

In [None]:
import matplotlib.pyplot as plt
import numpy as np

# T2W image
t2w_img = case.image('data/t2w')
# Lesion annotations combined into a single mask
combined_t2w_mask_img = case.image(
    'lesion_labels/lesion1/t2w',
    combine=True,
    cache=False,
    combine_pp=ai4ar.required_agreement(1),
)

t2w = t2w_img.arr()
combined_t2w_mask = combined_t2w_mask_img.arr()

# Select the slice with the largest mask area
# (named slice_idx to avoid shadowing the built-in `slice`)
slice_idx = ai4ar.select_slice(combined_t2w_mask)

# Left: the original image. Right: the same image with the combined mask on top.
fig, ax = plt.subplots(1, 2, figsize=(10, 5))
ax[0].imshow(t2w[slice_idx], cmap='gray')
ax[1].imshow(t2w[slice_idx], cmap='gray')

# Mask out the zero-valued (background) pixels so that only the lesion is drawn,
# semi-transparently, and the rest of the grayscale image is left untouched
overlay = np.ma.masked_equal(combined_t2w_mask[slice_idx], 0)
ax[1].imshow(overlay, cmap='jet', alpha=0.5, interpolation='none')

# Hide the axis ticks and labels
ax[0].axis('off')
ax[1].axis('off')

fig.tight_layout()
plt.show()
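
To make the slice selection and the overlay easier to follow, here is a minimal NumPy sketch of the two ideas. It is not the `ai4ar.select_slice` implementation: the helper names `select_largest_mask_slice` and `hide_background` are hypothetical, and the sketch assumes the slice index is the first array axis, as the indexing `t2w[...]` above suggests.

```python
import numpy as np

def select_largest_mask_slice(mask):
    """Return the index along axis 0 of the slice with the most nonzero voxels."""
    mask = np.asarray(mask)
    # Count nonzero voxels in every slice by reducing over all axes except the first
    per_slice = np.count_nonzero(mask, axis=tuple(range(1, mask.ndim)))
    return int(np.argmax(per_slice))

def hide_background(mask_slice):
    """Mask out zero-valued pixels so that an overlay draws only the lesion."""
    return np.ma.masked_equal(mask_slice, 0)

# Toy volume: 4 slices of 5x5 pixels, with the largest lesion area in slice 2
toy_mask = np.zeros((4, 5, 5), dtype=np.uint8)
toy_mask[1, 2, 2] = 1
toy_mask[2, 1:4, 1:4] = 1

best = select_largest_mask_slice(toy_mask)
overlay = hide_background(toy_mask[best])
print(best, overlay.count())  # slice index and number of visible (unmasked) pixels
```

Passing a masked array such as `overlay` to `imshow` leaves the masked pixels undrawn, so the grayscale image underneath stays visible everywhere outside the lesion.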

# Combine lesion masks

Sum all annotations for lesion1 on the T2W image. The other modalities work the same way.
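
Conceptually, combining annotations can be pictured as a per-voxel vote count followed by an agreement threshold. The sketch below is an assumption about what `combine=True` and `ai4ar.required_agreement(n)` do, not the package's actual code; `combine_masks` and `min_agreement` are hypothetical names.

```python
import numpy as np

def combine_masks(masks, min_agreement=1):
    """Sum binary masks voxel-wise and keep voxels marked by enough annotators.

    Returns (votes, agreed): the per-voxel annotator count and a boolean mask
    of voxels marked by at least `min_agreement` annotators.
    """
    # Binarize each annotation, then stack along a new leading "annotator" axis
    stacked = np.stack([np.asarray(m) > 0 for m in masks])
    votes = stacked.sum(axis=0)
    return votes, votes >= min_agreement

# Toy annotations from three annotators on a 2x3 image
r1 = np.array([[1, 1, 0], [0, 0, 0]])
r2 = np.array([[1, 0, 0], [0, 1, 0]])
r3 = np.array([[1, 1, 0], [0, 0, 0]])

votes, any_marked = combine_masks([r1, r2, r3], min_agreement=1)
_, majority = combine_masks([r1, r2, r3], min_agreement=2)
_, unanimous = combine_masks([r1, r2, r3], min_agreement=3)
print(votes)
```

Under this reading, an agreement of 1 keeps every voxel that any annotator marked (the union), while a threshold equal to the number of annotators keeps only their intersection.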

In [None]:
import matplotlib.pyplot as plt

combined_t2w_mask = case.image('lesion_labels/lesion1/t2w', combine=True)
combined_t2w_mask = combined_t2w_mask.arr()
# Visualize the combined mask (select_slice is a helper that selects the slice with the largest mask area)
plt.imshow(combined_t2w_mask[ai4ar.select_slice(combined_t2w_mask)], cmap='gray')


The combined mask is now cached.

In [None]:
case.visualize('combined/')

# Visualize all images

In [None]:
case.visualize('anatomical_labels/')

In [None]:
case.visualize('data/')

In [None]:
case.visualize('lesion_labels/')