# Show Patient Trajectories

The purpose of this notebook is to show the path that the patient follows in the MRI machine. It is a step towards understanding whether or not I can synchronize images from different series.

## Objectives

**Description** | **Status** | **Remarks**
----------------------------------|--------|--------------------------------------
Display patient's trajectory|OK|All 4 series are shown; some series overlap.
Establish whether it will be possible to join different series by establishing a mapping between images|WIP|The plots suggest that this may be partly possible in cases where two series overlap.
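
One possible way to approach the mapping objective is to pair slices by their position along the stacking direction. The sketch below is only an illustration under stated assumptions: the helper `match_slices` and its `tolerance` parameter are hypothetical, both series are assumed to lie in the same plane and share a frame of reference, and the positions are synthetic rather than read from DICOM files. For real data the normal would typically be the cross product of the row and column direction cosines in `ImageOrientationPatient`.

```python
import numpy as np

def match_slices(positions_a, positions_b, normal, tolerance=1.0):
    """Pair each slice of series A with the nearest slice of series B.

    positions_a, positions_b: sequences of (x, y, z) image positions.
    normal: direction along which the slices are stacked.
    Returns (index_a, index_b) pairs whose separation along the normal
    is at most tolerance (hypothetical parameter, same units as positions).
    """
    normal = np.asarray(normal, dtype=float)
    normal = normal / np.linalg.norm(normal)
    # Project every position onto the normal to get a scalar depth
    depth_a = np.asarray(positions_a, dtype=float).reshape(-1, 3) @ normal
    depth_b = np.asarray(positions_b, dtype=float).reshape(-1, 3) @ normal
    if depth_b.size == 0:
        return []
    pairs = []
    for i, depth in enumerate(depth_a):
        j = int(np.argmin(np.abs(depth_b - depth)))
        if abs(depth_b[j] - depth) <= tolerance:
            pairs.append((i, j))
    return pairs

# Synthetic positions: two axial series offset by 0.5 along Z
series_a = [[0.0, 0.0, z] for z in (0.0, 5.0, 10.0)]
series_b = [[0.0, 0.0, z] for z in (0.5, 5.5, 10.5, 40.0)]
print(match_slices(series_a, series_b, normal=[0, 0, 1]))
```

Slices of series B with no counterpart within the tolerance (the one at Z=40 here) are simply left unmatched, which fits the observation that series only partly overlap.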



In [None]:
from matplotlib.pyplot import axes, figure, savefig, title
from pydicom           import dcmread
from os                import sep, walk
from os.path           import join, normpath
from random            import sample

## Hyperparameters

In [None]:
class Hyperparameter:
    N = 12              # Number of patients to be shown     

## Build list of datasets to process

### Data model

The training and test datasets each consist of a collection of _studies_, one per patient. Each study is identified by a _label_ consisting of 5 digits, and the objective is to make a prediction for each label (or study)--[David Roberts](https://www.kaggle.com/c/rsna-miccai-brain-tumor-radiogenomic-classification/discussion/252972#1387906).
Each study contains 4 series--[Reuben Schmidt](https://www.kaggle.com/c/rsna-miccai-brain-tumor-radiogenomic-classification/discussion/252972#1388006):

- Fluid Attenuated Inversion Recovery (FLAIR)
- T1-weighted pre-contrast (T1w)
- T1-weighted post-contrast (T1wCE)
- T2-weighted (T2w)

>in T2 images water is bright, and in T1 images fat is bright. In FLAIR, cerebrospinal fluid is dark but the rest of the image looks like T2
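
The data model above can be illustrated with a small path parser. This is a minimal sketch rather than the notebook's own `parse_path`: the helper name `describe_path` and the example path are hypothetical, and it assumes files are laid out as `<split>/<study>/<series>/<file>`.

```python
from pathlib import PurePosixPath

SERIES_NAMES = ('FLAIR', 'T1w', 'T1wCE', 'T2w')

def describe_path(path):
    """Return (split, study, series) for one file path.

    Any component that cannot be identified is returned as None.
    """
    split = study = series = None
    for part in PurePosixPath(path).parts:
        if part in ('train', 'test'):
            split = part
        elif part in SERIES_NAMES:
            series = part
        elif part.isnumeric() and study is None:
            study = part              # first all-digit folder is the study
    return split, study, series

# Hypothetical example path, for illustration only
example = 'input/train/00000/FLAIR/Image-13.dcm'
print(describe_path(example))
```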


In [None]:
# Study
#
# This class represents the data from one MRI Study

class Study:
    FLAIR = 0
    T1w   = 1
    T1wCE = 2
    T2w   = 3
    
    series_id = { 'FLAIR' : FLAIR,
                  'T1w'   : T1w,
                  'T1wCE' : T1wCE,
                  'T2w'   : T2w}
           
    def __init__(self,image_id):
        self.image_id = image_id
        self.series   = [[], [], [], []]
     
    # add
    #
    # Add one file to specified series
    
    def add(self,series_id,path):
        self.series[series_id].append(path)
    
    # compress
    #
    # Remove files that consist of empty images from all series
    #
    # See https://www.kaggle.com/c/rsna-miccai-brain-tumor-radiogenomic-classification/discussion/252968
    
    def compress(self):
        def non_empty(path):
            return dcmread(path).pixel_array.sum()>0
        
        for i in range(len(self.series)):
            self.series[i] = [path for path in self.series[i] if non_empty(path)]
        
        
    def length(self):
        return [len(series) for series in self.series]

        
# parse_path
#
# Determine whether a file is test or training, and extract its study ID and series

def parse_path(path):
    train     = 0
    test      = 0
    image_id  = None
    series_id = None
    for folder in normpath(path).split(sep):
        if folder in Study.series_id:
            series_id = Study.series_id[folder]
        train += folder=='train'
        test  += folder=='test'
        if folder.isnumeric() and image_id is None:
            image_id = folder
    return path,train>0,test>0,image_id,series_id

# get_seq
#
# Extract sequence from file name, used to sort images

def get_seq(filename):
    base  = filename.split('.') # e.g. Image-13.dcm -> ['Image-13', 'dcm']
    parts = base[0].split('-')  # split into first part + sequence number
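    # Return a tuple so that numeric and non-numeric names never get compared
    # as int vs str (which raises TypeError); non-numeric names sort last
    try:
        return (0, int(parts[-1]), filename)
    except ValueError:
        return (1, 0, filename)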
    
training_agenda = {}   # Data that needs to be processed for training
testing_agenda  = {}   # Data that needs to be processed for testing

for dirname, _, filenames in walk('/kaggle/input'):
    for filename in sorted(filenames,key=get_seq):
        full_path = join(dirname, filename)
        path,train,test,image_id,series_id = parse_path(full_path)
        if image_id is None or series_id is None:
            continue   # not a file inside a study/series folder
        if train:
            if image_id not in training_agenda:
                training_agenda[image_id]= Study(image_id)
            training_agenda[image_id].add(series_id,path)
        if test:
            if image_id not in testing_agenda:
                testing_agenda[image_id]= Study(image_id)
            testing_agenda[image_id].add(series_id,path)
  
    

## Plot stuff

1. Verify that I can read and plot an image
2. Establish criteria for identifying that an image is blank
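
A blank-image criterion could look something like the sketch below. It is an assumption-laden illustration, not the notebook's final rule: the helper `is_blank` and its `min_fraction` parameter are hypothetical, and the arrays are synthetic stand-ins for `pixel_array`. With `min_fraction=0.0` it agrees with a "sum > 0" test for non-negative pixel data.

```python
import numpy as np

def is_blank(pixels, min_fraction=0.0):
    """True when the fraction of non-zero pixels does not exceed min_fraction.

    min_fraction is a hypothetical tuning knob; 0.0 means "every pixel is zero".
    """
    pixels = np.asarray(pixels)
    if pixels.size == 0:
        return True
    return bool(np.count_nonzero(pixels) / pixels.size <= min_fraction)

# Synthetic images standing in for DICOM pixel data
empty = np.zeros((4, 4), dtype=np.uint16)
speck = empty.copy()
speck[0, 0] = 1                     # a single non-zero pixel out of 16

print(is_blank(empty))                    # all zeros
print(is_blank(speck))                    # one non-zero pixel
print(is_blank(speck, min_fraction=0.1))  # tolerate up to 10% non-zero pixels
```

A non-zero threshold may be worth considering if nearly empty slices should be treated the same as empty ones.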

In [None]:
# get_image_plane
#
# Convert the Image Orientation Patient tag cosine values into a text string of the plane.
# This represents the plane the image is 'closest to' .. it does not explain any obliqueness
# snarfed from https://www.kaggle.com/davidbroberts/determining-mr-image-planes
def get_image_plane(loc):

    row_x = round(loc[0])
    row_y = round(loc[1])
    row_z = round(loc[2])
    col_x = round(loc[3])
    col_y = round(loc[4])
    col_z = round(loc[5])

    if row_x == 1 and row_y == 0 and col_x == 0 and col_y == 0:
        return "Coronal"

    if row_x == 0 and row_y == 1 and col_x == 0 and col_y == 0:
        return "Sagittal"

    if row_x == 1 and row_y == 0 and col_x == 0 and col_y == 1:
        return "Axial"

    return "Unknown"

# plot_orbit
#
# Plot the position of every image in a study as a 3D scatter,
# one scatter per series; non-empty images get a larger marker

def plot_orbit(study):
    fig       = figure(figsize=(20,20))
    ax        = axes(projection='3d')
    patient   = study.image_id    # fallback title if no image is read

    for series in study.series:
        if len(series)==0:        # nothing to plot, and no header for the label
            continue
        xs = []
        ys = []
        zs = []
        s  = []
        for file_name in series:
            dcim = dcmread(file_name)
            xs.append(dcim.ImagePositionPatient[0] )
            ys.append(dcim.ImagePositionPatient[1] )
            zs.append(dcim.ImagePositionPatient[2] )
            s.append(10 if dcim.pixel_array.sum()> 0 else 1)
        patient = dcim.PatientID
        # Label uses the header of the last image read in this series
        ax.scatter(xs,ys,zs,
                   label = f'{dcim.SeriesDescription}: {dcim.PatientPosition} {get_image_plane(dcim.ImageOrientationPatient)}',
                   s     = s)
    ax.set_xlabel('X')
    ax.set_ylabel('Y')
    ax.set_zlabel('Z')

    title(patient)
    ax.legend()
    savefig(f'{patient}')

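# sample() needs a sequence (dict keys are rejected by recent Python versions),
# and cannot draw more studies than exist
for study_id in sample(list(training_agenda),min(Hyperparameter.N,len(training_agenda))):
    plot_orbit(training_agenda[study_id])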

