In [None]:
# hide
import sys
sys.path.append("..")

# Fast radiomics
Implementing a radiomics workflow in fastai.

## Conventional Radiomics Workflow
Radiomics is a method in which quantitative features are extracted from medical images. These features have the potential to allow conclusions about disease characteristics that are not appreciated by a human observer.
A typical radiomics workflow usually consists of several steps.

### 1. Image segmentation
Especially in radiomics analysis for cancer, the area of interest is manually segmented before the analysis. An example is given below, where a T2w MRI sequence of the prostate has been segmented into its two different zones.
![pixelwise prostate segmentation](../img/prostate_segmentation.PNG)

### 2. Feature extraction
In a second step, quantitative imaging features are extracted for each labeled region and usually stored in a table.
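
As a rough illustration of what "extracting features for each labeled region" means, the sketch below computes a few first-order statistics with NumPy on a toy volume. It is a simplified stand-in and not the PyRadiomics implementation; the function name `first_order_features`, the bin count and the toy labels are assumptions made for this example.

```python
import numpy as np

def first_order_features(image, mask, label=1, n_bins=16):
    """Simplified first-order features for all voxels where mask == label."""
    voxels = image[mask == label].astype(float)
    if voxels.size == 0:
        raise ValueError(f"label {label} not found in mask")
    # histogram-based entropy over a fixed number of bins
    counts, _ = np.histogram(voxels, bins=n_bins)
    probs = counts[counts > 0] / voxels.size
    return {
        "mean": float(voxels.mean()),
        "variance": float(voxels.var()),
        "minimum": float(voxels.min()),
        "maximum": float(voxels.max()),
        "entropy": float(-(probs * np.log2(probs)).sum()),
    }

# toy 3D volume with two labeled regions (hypothetical "zones")
rng = np.random.default_rng(0)
image = rng.normal(loc=100.0, scale=10.0, size=(4, 8, 8))
mask = np.zeros(image.shape, dtype=int)
mask[:, :4, :] = 1
mask[:, 4:, :] = 2

# one row of features per labeled region
rows = {label: first_order_features(image, mask, label) for label in (1, 2)}
```

Each entry of `rows` corresponds to one row of the feature table, keyed by the region label.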

### 3. Feature cleaning
Features are analyzed, and highly correlated features are usually excluded to reduce the dimensionality of the data and the redundancy of the features.
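
One common way to remove highly correlated features is a pairwise correlation filter. The following is a minimal sketch with pandas, assuming a threshold of 0.9 and made-up feature names; the helper `drop_correlated` is hypothetical and only one of several possible cleaning strategies.

```python
import numpy as np
import pandas as pd

def drop_correlated(features, threshold=0.9):
    """Return the column names kept after removing one of each highly correlated pair."""
    corr = features.corr().abs()
    # keep only the upper triangle so that every pair is inspected once
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return [col for col in features.columns if col not in to_drop]

rng = np.random.default_rng(0)
a = rng.normal(size=50)
table = pd.DataFrame({
    "feat_a": a,
    "feat_a_scaled": 2.0 * a + 1.0,  # perfectly correlated with feat_a
    "feat_b": rng.normal(size=50),   # independent of feat_a
})

kept = drop_correlated(table)
```

Here `feat_a_scaled` is dropped because it carries the same information as `feat_a`. Of each correlated pair, the column that appears later in the table is the one removed.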

### 4. Classification
Using the extracted features, several machine learning models can be trained.

## Challenges in Radiomics
Radiomics models are difficult to train and often generalize poorly to new data. The first challenge is accurate feature cleaning. Selecting the final features on the whole dataset, including the test dataset, likely leads to overfitting of the models. Instead, features should be selected on the training dataset alone, not on the validation or the test dataset. The second challenge is the lack of augmentation strategies. Depending on the device used to acquire the medical image, imaging parameters (radiation dose, field strength, distance to device), reconstruction parameters and vendor-specific post-processing algorithms may differ and influence the pixel values. With the conventional radiomics approach, this cannot easily be corrected for.
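
To make the first point concrete, the sketch below splits a toy feature table before any selection takes place. It then derives both the selection rule and the normalization statistics from the training rows only. The selection rule (dropping near-constant columns), the 80/20 split and all feature names are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
features = pd.DataFrame(rng.normal(size=(100, 5)),
                        columns=[f"feat_{i}" for i in range(5)])
features["feat_const"] = 1.0  # uninformative, constant feature

# split first, before looking at any feature statistics
idx = rng.permutation(len(features))
train, test = features.iloc[idx[:80]], features.iloc[idx[80:]]

# selection rule fitted on the training rows only
selected = [c for c in train.columns if train[c].std() > 1e-8]

# normalization statistics also come from the training rows only
mean, std = train[selected].mean(), train[selected].std()
train_x = (train[selected] - mean) / std
test_x = (test[selected] - mean) / std  # reuses the train statistics
```

The test rows never influence which columns are kept or how they are scaled, which is the property the paragraph above asks for.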

## Proposed solution
Implementing feature extraction as a fastai callback could help solve the above-mentioned issues: before each training epoch, images can be normalized and augmented, and even feature selection can be repeated multiple times, leading to an overall more robust approach. However, this approach takes a lot of time, as feature extraction has to be repeated for each epoch. This leads to epoch times of 5 to 20 minutes with a pure training time of only a few seconds.
Another possibility would be to repeatedly extract all features at the beginning with different augmentations and then swap the different tables during training. 
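
The table-swapping idea could look roughly like the sketch below, which uses only the standard library. `extract_table` is a hypothetical placeholder for the expensive augmentation plus extraction step, and `FeatureTableCycler` is an assumed helper, not part of fastai or faimed3d.

```python
import random

class FeatureTableCycler:
    """Hands out one precomputed feature table per epoch, in rotation."""

    def __init__(self, tables):
        if not tables:
            raise ValueError("at least one feature table is required")
        self._tables = list(tables)

    def table_for_epoch(self, epoch):
        # rotate through the precomputed tables
        return self._tables[epoch % len(self._tables)]

def extract_table(seed, n_cases=4):
    """Hypothetical stand-in for augmentation followed by feature extraction."""
    rng = random.Random(seed)
    return [{"case": i, "feat": rng.gauss(0, 1)} for i in range(n_cases)]

# the costly extraction happens once, up front, with different augmentations
tables = [extract_table(seed) for seed in range(3)]
cycler = FeatureTableCycler(tables)

# during training, each epoch only looks up an already extracted table
used = [cycler.table_for_epoch(epoch) is tables[epoch % 3] for epoch in range(6)]
```

In a fastai setting, the lookup would presumably happen at the start of each epoch, but that wiring is left out here.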

In [None]:
from faimed3d.basics import *
from faimed3d.data import *
from faimed3d.augment import *

from fastai.basics import *
from fastai.vision.augment import *

from radiomics import featureextractor
import SimpleITK as sitk

import logging
# set level for all classes
logger = logging.getLogger("radiomics")
logger.setLevel(logging.ERROR)

In [None]:
torch.cuda.set_device(1)

In [None]:
prostate = pd.read_csv('../../dl-prostate-mapping/data/prostata-train.csv', sep = ',')

In [None]:
def restore_metadata(im, path):
    "restores metadata from original image"
    meta = TensorDicom3D.create(path).metadata
    im.metadata = meta
    return im

In [None]:
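def extract_features(sitk_im, mask_path):
    "extracts radiomics features and returns them as a one-row DataFrame"
    feature_extractor = featureextractor.RadiomicsFeatureExtractor()
    result = feature_extractor.execute(sitk_im, mask_path)

    # drop the diagnostics entries and keep only the numeric feature values
    features = {k: float(v) for k, v in result.items()
                if not k.startswith('diagnostics')}

    # wrap the dict in a list so pandas builds exactly one row; a plain dict of
    # scalars raises "If using all scalar values, you must pass an index"
    return pd.DataFrame([features])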

In [None]:
pipe = Pipeline([TensorDicom3D.create, 
                 RandomContrast3D(p=0.2),  
                 RandomNoise3D(p=0.2), 
                 RandomBrightness3D(p=0.2), 
                 RandomWarp3D(p=1., max_magnitude  = 0.2), 
                 torch.squeeze], 
                 split_idx = 0)

In [None]:
im = pipe(prostate.t2_dcm_path[0])
im = restore_metadata(im, prostate.t2_mask_base[0])
extract_features(im.as_sitk(), prostate.t2_mask_base[0])

ValueError: If using all scalar values, you must pass an index
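
This `ValueError` is what pandas raises when a `DataFrame` is built from a dict whose values are all scalars and no index is given. The minimal sketch below reproduces the message with made-up feature values (the numbers are not real extraction results) and shows one possible workaround, wrapping the dict in a list so that it becomes a single row.

```python
import pandas as pd

# made-up feature values, standing in for an extraction result
features = {"firstorder_Mean": 101.3, "firstorder_Energy": 5.2e6}

try:
    pd.DataFrame.from_dict(features)  # dict of scalars, no index given
    raised = False
except ValueError:
    raised = True

# a list with one dict is interpreted as one row
one_row = pd.DataFrame([features])
```

Passing `index=[0]` to the `pd.DataFrame` constructor would be an alternative with the same one-row result.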