# 2.1 Feature Extraction

**The first step is to import the image or the batch of images that we want to process. For the purpose of this project our masks (ROI) are already segmented.**

On average, PyRadiomics extracts ≈1500 features per image: the 16 shape descriptors plus features extracted from the original and derived images (LoG with 5 sigma levels, 1 level of wavelet decomposition yielding 8 derived images, and images derived using Square, Square Root, Logarithm and Exponential filters).
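
To make the count of derived images concrete, the following minimal sketch enumerates the image types listed above. The sigma values and the label strings are illustrative assumptions, not the settings used in this project; only the counts (5 LoG images, 8 wavelet sub-bands, 4 simple filters) come from the text.

```python
from itertools import product

# Assumption: five placeholder sigma values; the text only says "5 sigma levels".
log_sigmas = [1.0, 2.0, 3.0, 4.0, 5.0]
simple_filters = ["Square", "SquareRoot", "Logarithm", "Exponential"]

# One wavelet level applies a low-pass (L) or high-pass (H) filter along each of
# the three image axes, which gives 2**3 = 8 sub-bands.
wavelet_bands = ["".join(combo) for combo in product("LH", repeat=3)]

# Illustrative labels only; they are not guaranteed to match PyRadiomics' naming.
derived_images = (
    ["LoG sigma=%s" % sigma for sigma in log_sigmas]
    + ["Wavelet %s" % band for band in wavelet_bands]
    + simple_filters
)
all_images = ["Original"] + derived_images

print(wavelet_bands)
print("Images that features are extracted from:", len(all_images))
```

Under these assumptions the features are computed on 18 images (the original plus 17 derived ones), which is why the total number of features grows so quickly.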

A detailed description of the feature classes and individual features is provided in the [Radiomic Features](https://pyradiomics.readthedocs.io/en/latest/features.html#radiomics-features-label) section of the documentation.

This example shows how to use the radiomics package and the feature extractor.
The feature extractor handles preprocessing and then calls the required feature classes to calculate the features.
It is also possible to directly instantiate the feature classes. However, this is not recommended for use outside debugging or development.

## 2.1.1 Single image

### Setting up data

First, import some built-in Python modules needed to get our testing data.
Second, import the toolbox. Only the `featureextractor` module is needed; it handles the interaction with the other parts of the toolbox.

Here we use `SimpleITK` (referenced as `sitk`, see http://www.simpleitk.org/ for details) to load a brain image and the corresponding segmentation as a label map.

In [23]:
%matplotlib inline
import matplotlib.pyplot as plt

# from __future__ import print_function
import six
import os  # needed to navigate the file system to get the input data
import SimpleITK as sitk
import logging
from radiomics import getTestCase
import radiomics
from radiomics import featureextractor, getFeatureClasses

Set up logging to a log file

In [5]:
# Get the PyRadiomics logger (default log-level = INFO)
logger = radiomics.logger
logger.setLevel(logging.DEBUG)  # set level to DEBUG to include debug log messages in log file

# Write out all log entries to a file
handler = logging.FileHandler(filename='testLog.txt', mode='w')
formatter = logging.Formatter('%(levelname)s:%(name)s: %(message)s')
handler.setFormatter(formatter)
logger.addHandler(handler)
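
The same pattern can be extended with a second handler, for example to keep debug messages in the file while showing only warnings on the console. The sketch below uses only the standard library and assumes that the PyRadiomics logger is the standard logger named `radiomics`; the temporary log location and the two messages are made up for illustration.

```python
import logging
import os
import tempfile

# Assumption: radiomics.logger is the standard-library logger named "radiomics".
logger = logging.getLogger("radiomics")
logger.setLevel(logging.DEBUG)

# File handler: receives every message, formatted as in the cell above.
log_path = os.path.join(tempfile.mkdtemp(), "testLog.txt")
file_handler = logging.FileHandler(filename=log_path, mode="w")
file_handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s: %(message)s"))
logger.addHandler(file_handler)

# Console handler: only WARNING and above reach the notebook output.
console_handler = logging.StreamHandler()
console_handler.setLevel(logging.WARNING)
logger.addHandler(console_handler)

logger.debug("written to the file only")
logger.warning("written to the file and the console")

# Read the file back to check what was recorded.
file_handler.flush()
with open(log_path) as fh:
    log_lines = fh.read().splitlines()
print(log_lines)
```

The logger level acts as a first filter and each handler level as a second one, so the file can stay verbose without cluttering the notebook.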

In [29]:
# data_path points to the directory that holds the test data. The relative path below assumes this Notebook is run
# from its default location in \radiomics\thesis-paper\notebooks
data_path = os.path.join("..", "data", "pyradiomics_data")
print("Data Path: ", data_path)
imagepath_1, labelpath_1 = getTestCase('brain1', data_path)

if imagepath_1 is None or labelpath_1 is None:  # Something went wrong, in this case PyRadiomics will also log an error
    raise Exception('Error getting testcase!')  # Raise exception to prevent cells below from running in case of "run all"
    
image_1 = sitk.ReadImage(imagepath_1)
label_1 = sitk.ReadImage(labelpath_1)

Data Path:  ../data/pyradiomics_data


URLError: <urlopen error [Errno 110] Connection timed out>

### Show the image

Using `matplotlib.pyplot` (referenced as `plt`), display the image in grayscale and the label in color.

In [None]:
plt.figure(figsize=(20,20))
# First image
plt.subplot(2,2,1)
plt.imshow(sitk.GetArrayFromImage(image_1)[12,:,:], cmap="gray")
plt.title("Brain #1")
plt.subplot(2,2,2)
plt.imshow(sitk.GetArrayFromImage(label_1)[12,:,:])        
plt.title("Segmentation #1")

plt.show()

### Instantiating the extractor

Now that we have our input, we need to define the parameters and instantiate the extractor.
For this there are five possibilities:

**1. Use defaults, don't define custom settings**

```
extractor = featureextractor.RadiomicsFeatureExtractor()
```

**2. Define parameters in a dictionary**

```
# First define the settings
settings = {}
settings['binWidth'] = 20
settings['sigma'] = [1, 2, 3]

# Instantiate the extractor
extractor = featureextractor.RadiomicsFeatureExtractor(**settings)
```

**3. Define parameters in the constructor**

```
extractor = featureextractor.RadiomicsFeatureExtractor(binWidth=20, sigma=[1, 2, 3])
```

**4. Control filters and features after initialisation**

```
extractor.enableImageTypeByName('LoG')

# Disable all feature classes except firstorder
extractor.disableAllFeatures()
extractor.enableFeatureClassByName('firstorder')

# Specify some additional features in the GLCM feature class
extractor.enableFeaturesByName(glcm=['Autocorrelation', 'Homogeneity1', 'SumSquares'])
```

**5. Use a parameter file**

```
paramsPath = os.path.join('..', 'examples', 'exampleSettings', 'Params.yaml')
extractor = featureextractor.RadiomicsFeatureExtractor(paramsPath)
```
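
A parameter file groups the customisation into top-level sections for image types, feature classes and settings. As a rough sketch, the same structure can be written as a Python dictionary. The values below are hypothetical and simply reuse the examples from options 2 and 4; passing a dictionary as the first constructor argument is an assumption about the installed PyRadiomics version, so the import is guarded.

```python
# Hypothetical customisation, mirroring the sections of a parameter file.
params = {
    "imageType": {
        "Original": {},                     # no extra settings for the original image
        "LoG": {"sigma": [1.0, 2.0, 3.0]},  # one LoG image per sigma value
    },
    "featureClass": {
        "firstorder": [],                   # empty list: enable every feature in the class
        "glcm": ["Autocorrelation", "SumSquares"],
    },
    "setting": {"binWidth": 20},
}

try:
    from radiomics import featureextractor
except ImportError:  # PyRadiomics not installed; the dictionary is still valid Python
    featureextractor = None

if featureextractor is not None:
    # Assumption: the constructor accepts a dict in place of a parameter file path.
    extractor = featureextractor.RadiomicsFeatureExtractor(params)
    print(extractor.enabledImagetypes)
```

Keeping the customisation in one structure (file or dictionary) makes it easier to rerun an extraction with exactly the same configuration.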

In [28]:
# Instantiate the extractor
extractor = featureextractor.RadiomicsFeatureExtractor()

# Enable all features
extractor.enableAllFeatures()

# Alternative; only enable 'Mean' and 'Skewness' features in firstorder
# extractor.enableFeaturesByName(firstorder=['Mean', 'Skewness'])

print('Extraction parameters:')
for setting in extractor.settings.keys():
    print('\t' + setting)
    
print('\nEnabled filters:')
for image_type in extractor.enabledImagetypes.keys():
    print('\t' + image_type)
    
print('\nEnabled features:')
for feature in extractor.enabledFeatures.keys():
    print('\t' + feature)

Extraction parameters:
	minimumROIDimensions
	minimumROISize
	normalize
	normalizeScale
	removeOutliers
	resampledPixelSpacing
	interpolator
	preCrop
	padDistance
	distances
	force2D
	force2Ddimension
	resegmentRange
	label
	additionalInfo

Enabled filters:
	Original

Enabled features:
	firstorder
	glcm
	gldm
	glrlm
	glszm
	ngtdm
	shape
	shape2D


### Getting the docstrings of the active features


In [24]:
featureClasses = getFeatureClasses()
print('Active features:')
for cls, features in six.iteritems(extractor.enabledFeatures):
    if len(features) == 0:
        features = [f for f, deprecated in six.iteritems(featureClasses[cls].getFeatureNames()) if not deprecated]
    for f in features:
        print(f)
        print(getattr(featureClasses[cls], 'get%sFeatureValue' % f).__doc__)

Active features:
10Percentile

    **5. 10th percentile**

    The 10\ :sup:`th` percentile of :math:`\textbf{X}`
    
90Percentile

    **6. 90th percentile**

    The 90\ :sup:`th` percentile of :math:`\textbf{X}`
    
Energy

    **1. Energy**

    .. math::
      \textit{energy} = \displaystyle\sum^{N_p}_{i=1}{(\textbf{X}(i) + c)^2}

    Here, :math:`c` is optional value, defined by ``voxelArrayShift``, which shifts the intensities to prevent negative
    values in :math:`\textbf{X}`. This ensures that voxels with the lowest gray values contribute the least to Energy,
    instead of voxels with gray level intensity closest to 0.

    Energy is a measure of the magnitude of voxel values in an image. A larger values implies a greater sum of the
    squares of these values.

    .. note::
      This feature is volume-confounded, a larger value of :math:`c` increases the effect of volume-confounding.
    
Entropy

    **3. Entropy**

    .. math::
      \textit{entropy} = -\displaystyl
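
The lookup in the cell above relies on a naming convention: the method that computes feature `X` is called `getXFeatureValue`, so `getattr` can find it from the feature name alone. The toy class below is a hypothetical stand-in, not part of PyRadiomics, and only illustrates that mechanism.

```python
class ToyFirstOrder:
    """Hypothetical stand-in for a feature class; not part of PyRadiomics."""

    def __init__(self, values):
        self.values = list(values)

    def getMeanFeatureValue(self):
        """Mean of the values."""
        return sum(self.values) / len(self.values)

    def getRangeFeatureValue(self):
        """Difference between the maximum and minimum value."""
        return max(self.values) - min(self.values)


def feature_docstrings(cls, feature_names):
    """Map each feature name to the docstring of its get<Name>FeatureValue method."""
    return {
        name: getattr(cls, "get%sFeatureValue" % name).__doc__
        for name in feature_names
    }


docs = feature_docstrings(ToyFirstOrder, ["Mean", "Range"])
toy = ToyFirstOrder([1, 2, 6])
for name, doc in docs.items():
    # The same name-based lookup also works on an instance to compute the value.
    value = getattr(toy, "get%sFeatureValue" % name)()
    print(name, "->", doc, "| value:", value)
```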

### Extract features
Now that we have our extractor set up with the correct parameters, we can start extracting features:

In [None]:
result = extractor.execute(imagepath_1, labelpath_1)

In [None]:
print('Result type:', type(result))  # result is returned as a Python ordered dictionary
print('')
print('Calculated features')
for key, value in six.iteritems(result):
    print('\t', key, ':', value)
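
The returned dictionary usually mixes feature values with diagnostic entries about the extraction itself. The sketch below separates the two by key prefix. It assumes that diagnostic keys start with `diagnostics_` and that feature keys follow an `<imageType>_<featureClass>_<featureName>` pattern; the example entries are made-up placeholders, not real extraction output.

```python
from collections import OrderedDict


def split_result(result):
    """Separate diagnostic entries from feature values by key prefix."""
    diagnostics = OrderedDict()
    features = OrderedDict()
    for key, value in result.items():
        # Assumption: diagnostic entries are prefixed with "diagnostics_".
        target = diagnostics if key.startswith("diagnostics_") else features
        target[key] = value
    return diagnostics, features


# Placeholder entries with made-up values, for illustration only.
example_result = OrderedDict([
    ("diagnostics_Versions_PyRadiomics", "x.y.z"),
    ("original_firstorder_Mean", 1.0),
    ("original_glcm_Autocorrelation", 2.0),
])

diagnostics, features = split_result(example_result)
print("Diagnostics:", list(diagnostics))
print("Features:", list(features))
```

Applied to the `result` above, `split_result(result)` would give a feature-only dictionary that is easier to turn into one row of a feature table.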