# Intensity variance issues across datasets: an exploration

Intensity may vary across datasets, and it may vary differently in different tissues. In the simplest case of intensity variance across datasets, some are simply lighter than others. What we could expect is more complicated: some MRI machines may match each other in intensity on some materials, e.g. air, but not on others, e.g. certain tissues.
We can automatically set the "air" around a brain MRI to zero; however, the question of matching intensities in the tissues remains.
This notebook represents initial approaches to the problem. An augmented pair of datasets will be created, which do not match in intensity distribution, and then remapped.
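
As one possible starting point for the remapping step, the sketch below matches the intensity distribution of one array to another by quantile mapping (classic histogram matching). It is an illustration on synthetic data, not the method this notebook settles on: the function name `match_intensity_quantiles` and the synthetic `reference` and `darker` volumes are assumptions made for the example.

```python
import numpy as np


def match_intensity_quantiles(source, reference):
    """Remap `source` so its intensity distribution follows `reference`.

    Hypothetical helper for illustration: each source value is sent to the
    reference value that sits at the same cumulative quantile.
    """
    src_flat = source.ravel()
    src_values, src_inverse, src_counts = np.unique(
        src_flat, return_inverse=True, return_counts=True
    )
    ref_values, ref_counts = np.unique(reference.ravel(), return_counts=True)

    # Empirical cumulative distribution functions of both images
    src_cdf = np.cumsum(src_counts) / src_flat.size
    ref_cdf = np.cumsum(ref_counts) / reference.size

    # For every unique source value, look up the reference value at the same quantile
    matched_values = np.interp(src_cdf, ref_cdf, ref_values)
    return matched_values[src_inverse].reshape(source.shape)


# Synthetic stand-ins for two datasets (assumption: not real MRI data)
rng = np.random.default_rng(0)
reference = rng.normal(loc=100.0, scale=20.0, size=(8, 8, 8))
darker = reference * 0.5 + 10.0  # same "anatomy", different intensity scale

remapped = match_intensity_quantiles(darker, reference)
print(darker.mean(), remapped.mean(), reference.mean())
```

Because `darker` is a monotonic transform of `reference`, the remapped volume recovers the reference intensities here; real datasets from different scanners will not match this cleanly.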

### Imports
The data will be processed using the libraries and modules below:

In [None]:
import os       # using operating system dependent functionality (folders)
import glob
import pandas as pd # data analysis and manipulation
import numpy as np    # numerical computing (manipulating and performing operations on arrays of data)
import copy     # Can Copy and Deepcopy files so original file is untouched.
from ipywidgets import IntSlider, Output
import ipywidgets as widgets
from IPython.display import display
import matplotlib.pyplot as plt
import SimpleITK as sitk
import skimage.exposure  # intensity rescaling (rescale_intensity)
import hashlib
import sys
sys.path.insert(0, '../') # path to functions
from cvasl import file_handler as fh
from cvasl import mold
from cvasl import carve
from cvasl.file_handler import Config

### Load image files
Use the config pathways for the different datasets, then view one image as an example.

In [None]:
config = Config()
root_mri_directory = config.get_directory('root_mri_directory')

In [None]:
mri_pattern = os.path.join(root_mri_directory, '**/*.gz')
gz_files = glob.glob(mri_pattern, recursive=True)

In [None]:
gz_files

In [None]:
# an example path to an MRI brain .nii.gz image:
t1_fn = gz_files[0]
# read the .nii image containing the volume with SimpleITK:
sitk_t1 = sitk.ReadImage(t1_fn)
# and access the numpy array:
t1 = sitk.GetArrayFromImage(sitk_t1)
# now display it
mold.SliceViewer(t1)

### Create augmented datasets
Here we will copy our base dataset to create two separate datasets, which we will then change in terms of intensity values.

In [None]:
# make two identical but independent array sets
arrays_dataset_1 = []
arrays_dataset_2 = []
names = []
together = []
together_2 = []
for file in gz_files:
    read_file = sitk.ReadImage(file)
    arrayed_file = sitk.GetArrayFromImage(read_file)
    # copy the array so changes to dataset 2 cannot leak into dataset 1
    arrayed_copy = arrayed_file.copy()
    arrays_dataset_1.append(arrayed_file)
    arrays_dataset_2.append(arrayed_copy)
    names.append(file)
    together.append((file, arrayed_file))
    together_2.append((file, arrayed_copy))

In [None]:
#together[0]

In [None]:
#together[0][1].min(),together[0][1].max(), int(abs(together[0][1].min()- together[0][1].max()))

In [None]:
# and how many pixels (voxels) in total?
together[0][1].size

In [None]:
# show the intensity histogram of the first image in the dataset
plt.hist(together[0][1].ravel(), bins=425, range=(-175, 252))
plt.title(together[0][0])
plt.show()


OK, but let's see what scale these were all on before we go further.

In [None]:
for image in arrays_dataset_1:
    print(image.min(), image.max(), image.shape[0]*image.shape[1]*image.shape[2])

So our pixel values are stored as floating-point numbers ranging from -177 to over 4000, and some images are very large. This richness of information is something we probably want to keep.
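
To make the point about richness concrete, the sketch below compares how many distinct intensity values survive when a floating-point volume is squeezed into 8-bit integers. The synthetic `volume` and the helper `to_uint8` are assumptions for illustration; the value range simply mirrors the one reported above.

```python
import numpy as np

# Synthetic float volume spanning roughly the range seen above (not real MRI data)
rng = np.random.default_rng(0)
volume = rng.uniform(-177.0, 4000.0, size=(16, 16, 16)).astype(np.float32)


def to_uint8(image):
    """Hypothetical helper: linearly rescale to 0..255 and round to uint8.

    Assumes the image is not constant, otherwise the division below fails.
    """
    lo, hi = float(image.min()), float(image.max())
    scaled = (image - lo) / (hi - lo) * 255.0
    return np.round(scaled).astype(np.uint8)


compressed = to_uint8(volume)

n_before = np.unique(volume).size      # distinct float intensities
n_after = np.unique(compressed).size   # at most 256 after quantization
print(n_before, n_after)
```

Any remapping that passes through an 8-bit representation caps the data at 256 levels, which is one reason to keep the arrays in floating point while matching intensities.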

In [None]:
# plot the intensity histogram of every image in the dataset
for name, image in together:
    plt.hist(image.ravel(),bins=100,range=[image.min(),image.max()])
    plt.title(name)
    plt.show()


### Creating an artificially darker dataset, or 'tanning' the second dataset (`together_2`), if you will

In [None]:
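# rescale_intensity returns a new array, so store the result back into together_2
rescaled = []
for name, image in together_2:
    image = skimage.exposure.rescale_intensity(image, out_range=(0, 256))
    rescaled.append((name, image))
    plt.hist(image.ravel(), bins=100, range=[image.min(), image.max()])
    plt.title(name)
    plt.show()
together_2 = rescaled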


In [None]:
mold.SliceViewer(together[3][1])

In [None]:
mold.SliceViewer(together_2[3][1])

We have a problem: the slice viewer is rescaling the values for us. This needs to be fixed before continuing the notebook.
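
One possible way around per-image rescaling is to fix a single display window shared by both datasets, so that a darker image actually looks darker. The sketch below works on synthetic arrays; the helpers `shared_display_range` and `apply_window` are hypothetical, and whether `mold.SliceViewer` can accept such a window is an open assumption, not something confirmed here.

```python
import numpy as np


def shared_display_range(*images):
    """Hypothetical helper: one (vmin, vmax) window covering all given images."""
    vmin = min(float(img.min()) for img in images)
    vmax = max(float(img.max()) for img in images)
    return vmin, vmax


def apply_window(image, vmin, vmax):
    """Map intensities to 0..1 using a fixed window instead of the image's own range."""
    clipped = np.clip(image, vmin, vmax)
    return (clipped - vmin) / (vmax - vmin)


# Synthetic stand-ins for a bright and a darkened volume (not real MRI data)
rng = np.random.default_rng(0)
bright = rng.uniform(0.0, 4000.0, size=(4, 32, 32))
dark = bright * 0.25

vmin, vmax = shared_display_range(bright, dark)
bright_view = apply_window(bright, vmin, vmax)
dark_view = apply_window(dark, vmin, vmax)

print(bright_view.max(), dark_view.max())
```

With matplotlib, the same effect can be had by passing the shared limits directly, for example `plt.imshow(dark[0], cmap='gray', vmin=vmin, vmax=vmax)`, which stops each image from being normalized to its own minimum and maximum.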