Ultraleap Image Texture Prediction Model © Ultraleap Limited 2020

Licensed under the Ultraleap closed source licence agreement; you may not use this file except in compliance with the License.

A copy of this License is included with this download as a separate document. 

Alternatively, you may obtain a copy of the license from: https://www.ultraleap.com/closed-source-licence/

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

# The TexNet architecture and experiment code

This is the core code used for results presented in the paper, "Incorporating the Perception of Visual Roughness into the Design of Mid-Air Haptic Textures", which can be found at: https://dl.acm.org/doi/10.1145/3385955.3407927

To run this model, you need the Python libraries listed below. The code was tested with the versions pinned in each command. If any of these libraries are missing, install them by running the commands below in your [Jupyter notebook](https://jupyter.org/install.html):

In [None]:
import sys

In [None]:
!{sys.executable} -m pip install "tensorflow-gpu==2.1.0"
!{sys.executable} -m pip install "keras==2.3.1"
!{sys.executable} -m pip install "pandas==0.25.1"
!{sys.executable} -m pip install "scikit-learn==0.21.3"
!{sys.executable} -m pip install "scikit-image==0.15.0"
!{sys.executable} -m pip install "opencv-python==4.2.0"

## Implementation of the TexNet model
First of all let's import the necessary Python libraries.

In [None]:
import glob, os, itertools
import numpy as np
import pandas as pd
import sklearn.metrics as skm
import matplotlib.pyplot as plt
from keras.layers import Input, Dense, Conv2D, Flatten, MaxPooling2D, concatenate
from keras.models import Model
from keras.regularizers import l2
from keras.optimizers import Nadam, sgd
from skimage import transform
from sklearn.model_selection import train_test_split
from cv2 import imread
from skimage.feature.texture import greycomatrix

## Importing the data set
We now load the prepared .csv file that contains the list of images from the [Penn Haptic Texture ToolKit (HaTT)](https://repository.upenn.edu/meam_papers/299/). *This image database must be downloaded separately from the link provided and referenced at the location where you save it.* To produce the correctly formatted .csv file, we provide an additional Jupyter notebook.

**Please run [DataProcessing.ipynb](../notebooks/DataProcessing.ipynb) first if you have not yet done so before continuing.**

Running the above notebook first will output a file with the file name:

*date_time_**texturedim**_**outlierstate**.csv*

In [None]:
# Uncomment this line and set it to the file output by DataProcessing.ipynb, stored in the 'input_data/' folder.
#file = *DataProcessing.ipynb output .csv file*

# Check whether a data frame or a file path was provided.
if isinstance(file, pd.DataFrame):
    data_frame = file
elif not os.path.exists(file):  # Only test the path when 'file' is not already a data frame.
    print("The file %s does not exist!" % file)
else:
    data_frame = pd.read_csv(file, error_bad_lines=False)  # Create data frame from file.

## Model Parameters
Now that we have initialised our data set and imported it as a Pandas data frame, we need to set some global parameters for our training.

* **IMAGE_DIR** - the path to your downloaded HaTT image data set.
* **IMAGE_SIZE** - Image size in pixels used for computation, from 32 to 1024. Output images are square.
* **TEST_TEXTURE_LIST** - If a specific list of test image textures is required then these can be input here. The output file from *'DataProcessing.ipynb'* must be queried to find image names.
* **PREDICTOR_VARIABLE** - Texture dimension to predict. Select from 'roughness', 'bumpiness', 'hardness', 'stickiness', 'warmness', prefixed with the statistic to use: 'median_{dimension}' or 'mean_{dimension}', e.g. 'median_roughness'.
* **EPOCHS** - How many epochs you require the model to train over. Default is 150.
* **BATCH_SIZE** - Set custom batch_size. Default is 1.

Using the above parameters we will be able to compile and train a model that matches those produced in our paper. You could also train the model to learn and predict different texture dimensions, such as bumpiness, or hardness.

In [None]:
IMAGE_DIR = "input_data/penn_images_hatt/*.bmp"
IMAGE_SIZE = 256
TEST_TEXTURE_LIST = ['denim_square', 'cork_square', 'plastic_mesh_2_square', 'brick_2_square', 'bubble_envelope_square', 'silk_1_square', 'paper_plate_2_square', 'metal_mesh_square', 'glitter_paper_square']
PREDICTOR_VARIABLE = "median_roughness"

EPOCHS = 150
BATCH_SIZE = 1

## Generating our different feature sets
This model utilises a number of different data inputs, from individual images, to computed Grey-Level Co-Occurrence Matrices (GLCMs), as well as Haralick features computed from each GLCM. The following functions produce various different sets of these data, ready to be utilised during training.
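To make the idea of a GLCM concrete, here is a minimal NumPy sketch that tallies co-occurrences for a distance of 1 at an angle of 0 (each pixel paired with its right-hand neighbour). The names `glcm_horizontal` and `toy_image` are illustrative only and are not part of this notebook, which relies on scikit-image for the real computation.

```python
import numpy as np


def glcm_horizontal(image, levels):
    """Count how often grey level i has grey level j one pixel to its right."""
    glcm = np.zeros((levels, levels), dtype=np.uint32)
    left = image[:, :-1].ravel().astype(np.intp)   # reference pixels
    right = image[:, 1:].ravel().astype(np.intp)   # right-hand neighbours
    np.add.at(glcm, (left, right), 1)              # tally every (i, j) pair
    return glcm


# A 4x4 toy image with 4 grey levels (hypothetical data).
toy_image = np.array([[0, 0, 1, 1],
                      [0, 0, 1, 1],
                      [0, 2, 2, 2],
                      [2, 2, 3, 3]], dtype=np.uint8)

toy_glcm = glcm_horizontal(toy_image, levels=4)

# Dividing by the total number of pairs gives co-occurrence probabilities.
normalised = toy_glcm / toy_glcm.sum()
print(toy_glcm)
```

Each row of the toy image contributes three horizontal pairs, so the counts sum to 12. Larger distances or other angles only change which neighbour is paired with each pixel.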

In [None]:
class ImageFeatures(object):
    """Feature computation class to generate optimal GLCMs and equivalent Haralick features.
    """

    def __init__(self, **args):
        """Initialise the ImageFeatures class and pass distance, angle values for computation of matrices.

        Parameters
        ----------
        distance: array_like, optional
            Integer type list of pixel pair distance offsets.
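        angle: array_like, optional
            List of angles, e.g. the equivalents of [0, 45, 90, 135] degrees. Values are passed unchanged to
            scikit-image's greycomatrix, which interprets angles in radians.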
        """
        self.distance = args.get('distance')
        self.angle = args.get('angle')

    def convert_image(self, image, image_size=256, greyscale=True, resize=True):
        """Rescale images from initial input size to reduced size. Image should be square. Will be resized as square if
        optional argument 'image_size' is given.

        Parameters
        ----------
        image: path/to/file
            Image file path. .jpg, .png, or .bmp formatted.
        image_size: int, optional
            Size that image should be resized to. Default = 256.
        greyscale: bool, optional
            Sets whether image should be converted to greyscale or colour. Default = True
        resize: bool, optional
            If True, then image will be resized to the 'image_size' (w x h). Default = True.

        Returns
        -------
        converted_image: array_like, uint8
            Returns an image as a 2D array of grey level values (0 - 255) as type uint8. If greyscale is set to False,
            then returned image will be 3x2D arrays corresponding to the colour channels (OpenCV reads them in BGR
            order).
        """
        if greyscale:
            converted_image = imread(image, 0)
        else:
            converted_image = imread(image, 1)

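        if not resize:
            # imread already returns uint8 values (0 - 255), so no rescaling is needed.
            return converted_image
        if image_size is None:
            raise ValueError("Image size not given. Please include a reshaped image size.")

        # transform.resize returns floats in the range 0 - 1, so rescale to 0 - 255 before casting.
        converted_image = transform.resize(converted_image, output_shape=(image_size, image_size))
        return (converted_image * 255).astype('uint8')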

    def create_matrix(self, image, distance=None, angle=None, symmetric=False, normalise=False):
        """Produces a grey level co-occurrence matrix based on input image, distance by which to compare pixel
        grey-level co-occurrences, and corresponding angle.

        Parameters
        ----------
        image: array_like
            Having converted images into uint (preferably 256x256), pass as input to create matrix.
        distance: array_like of int, optional
            Distance values across which co-occurring grey-levels should be tallied. Default = [1]
        angle: array_like, optional
            Angle values along which the matrix should be scanned. Passed unchanged to scikit-image, which
            interprets angles in radians. Default = [0]
        symmetric: bool, optional
            Determines whether output matrix is symmetric. Handled by sk-image. Default = False.
        normalise: bool, optional
            If true, matrix values are the probabilities of co-occurring pixel grey-level values, where sum = 1.
            Handled by sk-image. Default = False.

        Returns
        -------
        matrix: 4D ndarray
            Output is a grey-level matrix, matrix[levels, levels, distance, angle] is returned, where levels is the
            number of grey levels (256 for uint8 images). Matrix identifies where a given grey level value (0 - 255)
            occurs in comparison to an equivalent value for a given distance and angle. Output is float64 if
            'normalise' = True. Otherwise, uint32. Handled by sk-image to reduce processing time.
        """
        if distance is None:
            distance = np.array([1])
        if angle is None:
            angle = np.array([0])
        return greycomatrix(image, distance, angle, symmetric=symmetric, normed=normalise)

    def create_haralick(self, matrix):
        """Function to compute 9 Haralick features used as factors to determine the level of underlying texture
        dimension contained within an image whose matrix has been calculated. We compute specific features related
        to different independent statistical properties that can be obtained from GLCMs.

        Matrices must be normalised in order to correctly calculate each Haralick feature.

        Contrast Group: This group identifies pixel co-occurrences in relation to their distance from the GLCM diagonal.
            contrast: float64, [0 - 10e6]
                'sum of squares variance'. Weights increase quadratically as values move away from diagonal. Therefore,
                an image that has a high contrast value signifies co-occurring pixel values occur far from the diagonal.
            homogeneity: float64, [0 - 1]
                Weights decrease with the squared distance from the diagonal, meaning a matrix's homogeneity value is
                higher when its contrast is very low (close to diagonal). Therefore, images with very little variance
                will produce a homogeneity value that approaches 1.

        Orderliness Group: This group explains how 'regular' the pixel value differences are within a matrix.
            asm: float64, [0 - 1]
                'Angular Second Moment', if a matrix contains large numbers for only a few pixel co-occurrences, then
                asm will be high, indicating that the underlying texture is some repeated pattern (orderly). Conversely,
                if asm is low, then the underlying texture will be very randomised in changes in grey level.
            energy: float64, [0 - 1]
                This is the square root of asm.

        Descriptives Group: This group calculates descriptive statistics such as mean, stdev on the matrix entries, not
                            the image values themselves.
            mean: float64
                matrix mean demonstrates the frequency of occurrence of one pixel value (j) being found across a
                distance, and angle input value, at its (i) neighbour. For symmetric matrices, calculating mean
                for i will be identical to j mean.
            stdev: float64
                square root of the variance in terms of the dispersion of values around the calculated mean for
                pixel co-occurrences.
            corr: float64
                Correlation that identifies the linear dependency of grey levels on those of neighbouring pixels.
                0 indicates uncorrelated and 1 perfectly correlated grey levels. This measure is independent of all
                other Haralick features. High values denote high predictability of pixel relationships.
            cls_shade: float64
                Cluster Shade measures the skewness and uniformity in the computed matrix. Higher values suggest
                more asymmetry around the mean.
            cls_prom: float64
                Cluster Prominence measures asymmetry in the matrix. Similar characteristics to Cluster Shade
                (cls_shade).

        Refer to:
            R. M. Haralick, K. Shanmugam and I. Dinstein, "Textural Features for Image Classification,"
            in IEEE Transactions on Systems, Man, and Cybernetics, vol. SMC-3, no. 6, pp. 610-621, Nov. 1973,
            doi: 10.1109/TSMC.1973.4309314.

        Parameters
        ----------
        matrix: ndarray - 2D
            An input matrix array containing co-occurring grey-level values for an image. Must first be normalised.

        Returns
        -------
        features: ndarray
            Outputs a list of 9 Haralick features calculated from the input matrix.
        """
        # Initialise a mesh grid
        level = matrix.shape[0]
        I, J = np.ogrid[0:level, 0:level]
        ij = I * J

        # Compute Homogeneity weights and apply to matrix for homogeneity calculation.
        homo_weight = 1. / (1. + (I - J) ** 2)
        homo = np.apply_over_axes(np.sum, (matrix * homo_weight), axes=(0, 1))[0, 0]

        # Compute Contrast weights. Apply to matrix.
        con_weight = (I - J) ** 2
        contrast = np.apply_over_axes(np.sum, (matrix * con_weight), axes=(0, 1))[0, 0]

        # Compute ASM (angular second moment). Take square root for energy calculation.
        asm = np.apply_over_axes(np.sum, (matrix ** 2), axes=(0, 1))[0, 0]
        energy = np.sqrt(asm)

        k = np.arange(len(matrix))
        tk = np.arange(2 * (len(matrix)))
        p = matrix / matrix.sum()
        pravel = p.ravel()
        px = p.sum(0)
        py = p.sum(1)

        ux = np.dot(px, k)
        uy = np.dot(py, k)
        vx = np.dot(px, k ** 2) - ux ** 2
        vy = np.dot(py, k ** 2) - uy ** 2
        sx = np.sqrt(vx)
        sy = np.sqrt(vy)

        if sx == 0.0 or sy == 0.0:
            corr = 1.0
        else:
            # Compute Correlation value across matrix
            corr = (1. / sx / sy) * (np.dot(ij.ravel(), pravel) - ux * uy)

        px_plus_y = np.zeros(2 * len(matrix), np.double)
        px_minus_y = np.zeros(len(matrix), np.double)

        idx1 = np.arange(0, level * 2)
        idx2 = np.arange(0, level)

        tmp1 = np.array([np.array(I + J).reshape(-1, 1), np.array(matrix).reshape(-1, 1)])
        tmp2 = np.array([np.abs(I - J).reshape(-1, 1), np.array(matrix).reshape(-1, 1)])

        for i in idx1:
            px_plus_y[i] = tmp1[1][tmp1[0] == i].sum()

        for i in idx2:
            px_minus_y[i] = tmp2[1][tmp2[0] == i].sum()

        # Compute mean and standard deviation of matrix.
        mean = np.dot(tk, px_plus_y)
        stdev = np.sqrt(np.dot(tk ** 2, px_plus_y) - mean ** 2)

        # Calculate Cluster Shade and Prominence weights, then obtain values by applying across matrix.
        shade_weight = np.power((I + J - ux - uy), 3)
        prom_weight = np.power((I + J - ux - uy), 4)

        cls_shade = np.apply_over_axes(np.sum, (matrix * shade_weight), axes=(0, 1))[0, 0]
        cls_prom = np.apply_over_axes(np.sum, (matrix * prom_weight), axes=(0, 1))[0, 0]

        return np.array([homo, asm, contrast, energy, corr, mean, stdev, cls_shade, cls_prom])

    def compute_chi_sum(self, matrix):
        """This function computes a Chi-Square value for an input matrix. This value can be used to ascertain which
        particular distance value and angle used to calculate the matrix has enabled the underlying structure to be
        captured. High Chi-Square values will indicate that the input matrix has more successfully captured the texture
        structure within an image.

        Parameters
        ----------
        matrix: ndarray - 2D
            An input matrix array containing co-occurring grey-level values for an image. Should be non-normalised.

        Returns
        -------
        chi_sum: float64
            Provides a float64 value of the computed Chi-Square value that indicates the goodness of fit of the matrix
            computed over a specific distance and angle, in relation to the underlying texture structure in the
            corresponding image.

        Refer to:
            Zucker, Steven W., and Demetri Terzopoulos. "Finding structure in co-occurrence matrices
            for texture analysis." Computer graphics and image processing 12, no. 3 (1980): 286-308.
        """

        # Calculate matrix row and column totals.
        row_matrix = np.repeat(matrix.sum(axis=0), matrix.shape[0], axis=0).reshape(matrix.shape[0], matrix.shape[1])
        col_matrix = np.repeat(matrix.sum(axis=1), matrix.shape[0], axis=0).reshape(matrix.shape[0], matrix.shape[1]).T

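        # Calculate multiplied row and column totals then divide by the sum of the non-normalised matrix.
        # This gives the expected value for each cell in the matrix.
        matrix_rc_totals = (row_matrix * col_matrix) / matrix.sum()

        # Obtain the squared deviation of each cell from its expected value.
        expected_values = (matrix - matrix_rc_totals) ** 2

        # Return the sum of all values where rows and columns sums are non zero. Passing 'out' ensures that cells
        # skipped by 'where' contribute zero rather than uninitialised memory.
        chi_cells = np.divide(expected_values, matrix_rc_totals, out=np.zeros_like(matrix_rc_totals),
                              where=(matrix_rc_totals != 0))
        return np.sum(chi_cells)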
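As a small worked example of the Haralick definitions above, the following sketch evaluates five of the features on a hand-made 2x2 normalised matrix. The matrix `p` is hypothetical data chosen so the results are easy to verify by hand; the formulas follow those used in `create_haralick`.

```python
import numpy as np

# Hypothetical normalised GLCM with two grey levels; entries sum to 1.
p = np.array([[0.4, 0.1],
              [0.1, 0.4]])

I, J = np.ogrid[0:2, 0:2]

# Contrast weights grow with the squared distance from the diagonal.
contrast = np.sum(p * (I - J) ** 2)

# Homogeneity weights shrink with the squared distance from the diagonal.
homogeneity = np.sum(p / (1.0 + (I - J) ** 2))

# ASM is the sum of squared probabilities; energy is its square root.
asm = np.sum(p ** 2)
energy = np.sqrt(asm)

# Correlation from the marginal means and standard deviations.
k = np.arange(2)
px, py = p.sum(axis=0), p.sum(axis=1)
ux, uy = np.dot(px, k), np.dot(py, k)
sx = np.sqrt(np.dot(px, k ** 2) - ux ** 2)
sy = np.sqrt(np.dot(py, k ** 2) - uy ** 2)
corr = (np.sum(p * I * J) - ux * uy) / (sx * sy)

print(contrast, homogeneity, asm, energy, corr)
```

Most of the probability mass sits on the diagonal here, so contrast is low (0.2) and homogeneity is high (0.9), which is the inverse relationship described in the docstring.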

## Computing features and generating corresponding lists of variables.
We have generated individual functions that can produce our GLCMs and Haralick features. We can now pass in our image data set and create the corresponding features as we like.

In [None]:
class FeaturesLists:
    """This class is essentially a composite class of some functions in the ImageFeatures class. The functions in
    this class enable numerous images to be processed simultaneously, and subsequently, to generate optimal matrices for
    each image, along with additional Haralick features, and tuples of the optimal distance and angle values.
    """

    def __init__(self, image_dir, **args):
        """Initialise the class and provide a directory of images that should be processed.

        Parameters
        ----------
        image_dir: str, path
            A path that contains the associated images that should be processed.
        image_size: int, optional
            A specified image resize value.
        """
        self.image_dir = image_dir
        self.image_size = args.get('image_size')
        self.image_features = ImageFeatures()

    def create_image_list(self):
        """Function to read each image from the input directory passed to the class initialisation function.

        Returns
        -------
        image_list: ndarray
            A list of processed images in uint8 format. Calls convert_image in ImageFeatures.
        """
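        image_list = []
        for image in glob.glob(self.image_dir):
            if self.image_size is not None:
                # Use the resize value passed at initialisation instead of the convert_image default (256).
                image_list.append(self.image_features.convert_image(image, image_size=self.image_size))
            else:
                image_list.append(self.image_features.convert_image(image))
        return image_list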

    def create_matrix_list(self, image_list, distances, angles):
        """This function will process the image list output from 'create_image_list' and subsequently calculate optimal
        matrices for each image, based on the list of distances and angles provided.

        Parameters
        ----------
        image_list: uint8, array_like
            A list of processed images in uint8 format.
        distances: int, array_like
            A list of integer value distances that matrices should be calculated for.
        angles: int, array_like
            A list of integer value angles in degrees, that each matrix should be computed across.

        Returns
        -------
        matrix_list: float64, 2D array
            A list of optimal matrix arrays that have captured the underlying texture in each image
            based on Chi-Square calculations.
        inputs_list: tuple of ints, array_like
            A list of tuples that contain the distance and angle combination that produced the highest Chi-Square value,
            highlighting the underlying structure in a texture image.
        haralick_list: 1d numpy array
            A list of corresponding Haralick features computed from an optimal matrix. The order of features is:
                Homogeneity
                Angular Second Moment (ASM)
                Contrast
                Energy
                Correlation
                Mean
                Standard Deviation
                Cluster Shade
                Cluster Prominence
        """
        matrix_list = []
        inputs_list = []
        haralick_list = []
        chisum = np.zeros(shape=(len(distances), len(angles)))

        # Compute all possible permutations of distances and angles to calculate matrices.
        perms = list(itertools.product(np.arange(0, len(distances)), np.arange(0, len(angles))))

        for image in image_list:
            non_norm_mat = self.image_features.create_matrix(image, distances, angles)
            for perm in perms:
                # Calculate each ChiSquare value for matrix with distance/angle combinations.
                chisum[perm[0], perm[1]] = self.image_features.compute_chi_sum(non_norm_mat[:, :, perm[0], perm[1]])

            # Store the best distance/angle combination once per image, after all combinations are scored.
            max_chi = np.unravel_index(np.argmax(chisum, axis=None), chisum.shape)
            inputs_list.append((distances[max_chi[0]], angles[max_chi[1]]))

            # Calculate normalised matrix with optimal distance/angle values.
            optimal_matrix = self.image_features.create_matrix(image, distance=[distances[max_chi[0]]],
                                                               angle=[angles[max_chi[1]]], symmetric=False,
                                                               normalise=True)
            matrix_list.append(optimal_matrix)

            # Calculate haralick features from optimal matrix.
            haralick = self.image_features.create_haralick(optimal_matrix[:, :, 0, 0])
            haralick_list.append(haralick)

        # return each of the lists of optimal matrices, distance/angle combos, and haralick feature values.
        return matrix_list, inputs_list, haralick_list
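The selection step in `create_matrix_list` can be illustrated in isolation. The sketch below scores a small stack of hypothetical count matrices with a Chi-Square statistic and picks the distance/angle pair with the highest score. It uses the standard contingency-table expected value (row total times column total over the grand total); `chi_square`, `structured` and `unstructured` are illustrative names and data, not part of the notebook.

```python
import numpy as np


def chi_square(counts):
    """Chi-Square statistic of a non-normalised co-occurrence matrix."""
    counts = np.asarray(counts, dtype=float)
    row_totals = counts.sum(axis=1, keepdims=True)
    col_totals = counts.sum(axis=0, keepdims=True)
    expected = row_totals * col_totals / counts.sum()
    squared_dev = (counts - expected) ** 2
    # 'out' makes cells with a zero expected value contribute exactly zero.
    cells = np.divide(squared_dev, expected, out=np.zeros_like(expected), where=expected != 0)
    return cells.sum()


structured = np.array([[10, 0], [0, 10]])    # strong diagonal structure
unstructured = np.array([[5, 5], [5, 5]])    # no structure at all

distances = [1, 2]
angles = [0.0, np.pi / 2]                    # radians, as scikit-image expects

# Stack shaped like the greycomatrix output: (levels, levels, distances, angles).
glcms = np.empty((2, 2, len(distances), len(angles)))
glcms[...] = unstructured[:, :, None, None]
glcms[:, :, 1, 0] = structured               # only distance 2, angle 0 shows structure

scores = np.array([[chi_square(glcms[:, :, d, a]) for a in range(len(angles))]
                   for d in range(len(distances))])
best_d, best_a = np.unravel_index(np.argmax(scores), scores.shape)
best_offset = (distances[best_d], angles[best_a])
print(scores, best_offset)
```

The unstructured matrices score 0 because every cell equals its expected value, while the diagonal matrix scores 20, so the offset (2, 0.0) is selected.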

## Splitting our data into training, validation, and test sets.
In addition to generating this new data, we must also separate our data set into train, test and validation sets. The following functions prepare our data as necessary.

In [None]:
class PrepareData(object):
    """This object class can be instantiated in order to conduct various splitting and rescaling requirements before our
    model is to be trained.
    """

    def __init__(self, test_texture_list=None, train_size=0.9, shuffle=True):
        """Initialise the class for the purpose of preparing our data.

        Parameters
        ----------
        test_texture_list: array_like(str), optional
            If the user would like to split the data set to exclude a specific set of images as a test set for model
            prediction, a list of corresponding string values can be provided. This must be the specific name for each
            texture. The initial data frame created from the DataProcessing.ipynb should be queried for these names.
        train_size: float, optional
            If a specific training data size is to be used then a value by which the input data should be split can be
            set.
        shuffle: bool, optional
            Sets whether data should be shuffled during the splitting into train, test and validation sets.
        """
        self.test_list = test_texture_list
        self.train_size = train_size
        self.shuffle = shuffle

    def append_feature_data(self, **args):
        """This function will concatenate the initial data frame from DataProcessing.ipynb with any of the additional
        feature data created in the previous steps, such as image arrays, matrices, haralick features.

        Parameters
        ----------
        data_frame: Pandas Dataframe, optional
            This should be the data frame obtained from running DataProcessing.ipynb.
        image_list: array_like, optional
            A list of corresponding images that should be processed by the model.
        matrix_list: array_like, optional
            This list should contain each matrix that has been computed for any image.
        haralick_list: array_like, optional
            A list of float64 values that contains the computed Haralick features for any of the images in the data set.

        Returns
        -------
        appended_df: Pandas Dataframe
            A concatenated data frame that contains the appended feature lists passed as input.
        """
        data_frame = args.get('data_frame')
        image_list = args.get('image_list')
        matrix_list = args.get('matrix_list')
        haralick_list = args.get('haralick_list')

        features_df = pd.DataFrame()  # Holds the additional feature columns.
        if image_list is not None:
            features_df['image_list'] = [image / 255 for image in image_list]
        if matrix_list is not None:
            features_df['matrix_list'] = matrix_list
        if haralick_list is not None:
            # Create Pandas DataFrame with columns specific for each Haralick feature.
            features_df['har_homo'] = [item[0] for item in haralick_list]
            features_df['har_contrast'] = [item[2] for item in haralick_list]
            features_df['har_energy'] = [item[3] for item in haralick_list]
            features_df['har_corr'] = [item[4] for item in haralick_list]
            features_df['har_mean'] = [item[5] for item in haralick_list]
            features_df['har_stdev'] = [item[6] for item in haralick_list]
            features_df['har_cls_shade'] = [item[7] for item in haralick_list]
            features_df['har_cls_prom'] = [item[8] for item in haralick_list]

        appended_df = pd.concat([data_frame, features_df], axis=1)
        appended_df.set_index('tex_name', drop=True, inplace=True)
        return appended_df

    def scale_data(self, data, rescale_min, rescale_max):
        """Helper function that will rescale any input data to a specified range.

        Parameters
        ----------
        data: array_like, Pandas Series
            A column from a Pandas DataFrame whose values should be rescaled.
        rescale_min, rescale_max: int
            Integer values by which the range of values in the 'data' input should be rescaled to.

        Returns
        -------
        rescaled_data: array_like, Pandas Series
            The rescaled column of data ranging from some given min - max value range.
        """
        col_max = data.max()
        col_min = data.min()
        return data.apply(lambda x: ((x - col_min) / (col_max - col_min)) * (rescale_max - rescale_min) + rescale_min)

    def scale_df(self, df, rescale_min, rescale_max):
        """Helper function to rescale all columns in a Pandas dataframe if this is required.

        Parameters
        ----------
        df: array_like, Pandas DataFrame
            A complete Pandas DataFrame whose values should be rescaled.
        rescale_min, rescale_max: int
            Integer values by which the range of values in the 'df' input should be rescaled to.

        Returns
        -------
        new_df: Pandas DataFrame
            A new data frame containing rescaled column data ranging from some given min - max value range.
        """
        new_df = pd.DataFrame(df, copy=True)
        for i in range(len(new_df.columns)):
            if new_df.iloc[:, i].dtypes == 'float64':
                new_df.iloc[:, i] = new_df.iloc[:, i].apply(
                    lambda x: ((x - new_df.iloc[:, i].min()) / (new_df.iloc[:, i].max() - new_df.iloc[:, i].min())) *
                              (rescale_max - rescale_min) + rescale_min)
        return new_df

    def reshape_array(self, data, shape):
        """Helper function that will reshape a given set of input data to some alternative shape.

        Parameters
        ----------
        data: array_like
            Some list or Pandas Series of data that should be reshaped.
        shape: tuple(int)
            A particular shape by which the input data should be converted into.

        Returns
        -------
        reshaped_data: array_like
            The converted data based on the shape value given.
        """
        data = np.array(data.tolist())
        return np.reshape(data, newshape=shape)

    def split_for_training(self, data):
        """This function will split data input into separate train, validation and test sets.

        Parameters
        ----------
        data: array_like
            Pandas DataFrame of input data.

        Returns
        -------
        train, test, val: split Pandas data frames.
        """
        test, val = None, None

        if self.test_list is not None:
            train = data.drop(index=self.test_list)
            test = data.drop(train.index)
            train, val = train_test_split(train, test_size=len(self.test_list), shuffle=self.shuffle)
        else:
            train, test = train_test_split(data, test_size=1 - self.train_size, shuffle=self.shuffle)
            train, val = train_test_split(train, test_size=len(test), shuffle=self.shuffle)

        return train, test, val
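The held-out branch of `split_for_training` can be sketched without the full data frame. This illustration assumes a hypothetical list of texture names and uses a seeded NumPy permutation as a stand-in for scikit-learn's `train_test_split`; it only shows how the three sets relate to each other, not the notebook's exact shuffling.

```python
import numpy as np

# Hypothetical texture names; the real names come from the DataProcessing.ipynb output.
texture_names = np.array(["brick", "cork", "denim", "felt", "glass", "silk", "tile", "wood"])
held_out = ["cork", "silk"]

# The test set is exactly the held-out textures.
is_test = np.isin(texture_names, held_out)
test_names = texture_names[is_test]
remaining = texture_names[~is_test]

# Shuffle what is left, then carve off a validation set the same size as the test set.
rng = np.random.default_rng(seed=0)
shuffled = rng.permutation(remaining)
val_names = shuffled[:len(held_out)]
train_names = shuffled[len(held_out):]

print(len(train_names), len(val_names), len(test_names))
```

Matching the validation size to the test size mirrors the `test_size=len(self.test_list)` argument used above, so both evaluation sets are equally large.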

## Preparing the model for training
We now need to create and prepare the model so that it can accept the different input features we may wish to use for training. This model is broken down into distinct architectures depending on the input data. For image and matrix training a CNN is used (TexNetConv2D), whereas an MLP (TexNetMLP) is used for Haralick feature data.
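As a rough sanity check on the convolutional branch, the layer sizes stated in the `texnet_conv2d` docstring can be traced by hand. The sketch below assumes a single-channel 256x256 input (one greyscale image or one GLCM), which is an assumption about the input shape rather than something fixed by the code; `conv_params` is an illustrative helper.

```python
def conv_params(kernel, in_channels, filters):
    """Weights plus biases of a square Conv2D layer."""
    return kernel * kernel * in_channels * filters + filters


size, channels = 256, 1               # assumed input: 256 x 256 x 1
filters = 16
layers = [(7, 4), (3, 2), (3, 2)]     # (conv kernel size, max-pool size) per stage

total_params = 0
for kernel, pool in layers:
    total_params += conv_params(kernel, channels, filters)
    channels = filters                # each conv outputs 16 feature maps
    size //= pool                     # 'same' padding keeps the size; pooling divides it

flattened = size * size * channels    # length of the Flatten output
dense_params = flattened * 16 + 16    # final Dense layer with 16 units
total_params += dense_params

print(size, flattened, total_params)
```

Under this assumption the spatial size shrinks from 256 to 64, 32 and finally 16, so the Dense layer receives 4096 values and holds most of the branch's parameters.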

In [None]:
class TexNetModels(object):
    """The TexNet model class can be instantiated in order to initialise the different networks (CNN & MLP) for training
    on our different data types. In addition we can run the training step from this object too.
    """

    def __init__(self, **kwargs):
        """Initialise the TexNet model.

        Parameters
        ----------
        kwargs: dictionary of params.
            If the user wishes to prepare and train a model with the various different input features then this
            dict object can be used to state which parameters they would like to use.
        """
        self.train_i = kwargs.get('image_training')
        self.train_m = kwargs.get('matrix_training')
        self.train_h = kwargs.get('haralick_training')

        self.train_i_data = kwargs.get('train_image_data')
        self.train_m_data = kwargs.get('train_matrix_data')
        self.train_h_data = kwargs.get('train_haralick_data')

        self.test_i_data = kwargs.get('test_image_data')
        self.test_m_data = kwargs.get('test_matrix_data')
        self.test_h_data = kwargs.get('test_haralick_data')

        self.val_i_data = kwargs.get('val_image_data')
        self.val_m_data = kwargs.get('val_matrix_data')
        self.val_h_data = kwargs.get('val_haralick_data')

        self.train_t = kwargs.get('train_target')
        self.test_t = kwargs.get('test_target')
        self.val_t = kwargs.get('val_target')

    def texnet_conv2d(self, input_data):
        """ Architecture of the CNN model that can be trained on 2D input data (images and matrices).
        Architecture is 3 Keras Conv2D layers with window size of 7x7 and 3x3, 16 filters, 'relu' activations,
        and He normal kernel initialisation. Max pooling layers between each Conv2D layer, window size 4x4 and 2x2.
        Finally, a Dense layer with L2 kernel regularisation set at 0.005.

        Parameters
        ----------
        input_data: array_like
            Expects some 2D matrix input of shape 256x256.

        Returns
        -------
        texnet_conv2D: Keras Convolutional model.
        """
        _input = Input(shape=input_data[0].shape)  # Set input shape from the shape of the first sample in the input.

        # Initialise layers
        texnet_conv2D = Conv2D(16, (7, 7), activation='relu', padding='same', kernel_initializer='he_normal')(_input)
        texnet_conv2D = MaxPooling2D(pool_size=(4, 4))(texnet_conv2D)
        texnet_conv2D = Conv2D(16, (3, 3), activation='relu', padding='same', kernel_initializer='he_normal')(
            texnet_conv2D)
        texnet_conv2D = MaxPooling2D(pool_size=(2, 2))(texnet_conv2D)
        texnet_conv2D = Conv2D(16, (3, 3), activation='relu', padding='same', kernel_initializer='he_normal')(
            texnet_conv2D)
        texnet_conv2D = MaxPooling2D(pool_size=(2, 2))(texnet_conv2D)

        # Flatten and initialise Dense layer with 16 units and L2 kernel regularisation.
        texnet_conv2D = Flatten()(texnet_conv2D)
        texnet_conv2D = Dense(16, activation='relu', kernel_regularizer=l2(0.005))(texnet_conv2D)
        texnet_conv2D = Model(inputs=_input, outputs=texnet_conv2D)
        return texnet_conv2D

    def texnet_mlp(self, input_data):
        """ Architecture of the Multi-Layer Perceptron that can be trained on Haralick feature data.
        Architecture is 2 Keras Dense layers with 16 units each and 'relu' activations.
        The second Dense layer features L2 kernel regularisation set at 0.01.

        Parameters
        ----------
        input_data: array_like
            Expects a 2D array of shape (samples, features); the input size is taken from the feature dimension.

        Returns
        -------
        texnet_mlp: Keras Multi-Layer Perceptron model.
        """
        _input = Input(shape=(input_data.shape[1],))
        texnet_mlp = Dense(16, activation='relu')(_input)
        texnet_mlp = Dense(16, activation='relu', kernel_regularizer=l2(0.01))(texnet_mlp)
        texnet_mlp = Model(inputs=_input, outputs=texnet_mlp)
        return texnet_mlp

    def prepare_model(self):
        """ This function initialises the model architecture based on the selected input features.
        If only Haralick data is used then only the TexNetMLP model will be initialised. If more than one input feature
        is used then the model will be ensembled with a Keras concatenate layer before predictions are done via a final
        Dense layer.

        Returns
        -------
        model: Keras model (not yet compiled), either a single network (CNN or MLP) or an ensembled model
        comprised of both model architectures. Compilation happens in the 'train_model' function.
        """
        model_inputs = []
        model_outputs = []

        # Initialise TexNet Conv2D model if images are given as input.
        if self.train_i:
            image_model = self.texnet_conv2d(self.train_i_data)
            model_inputs.append(image_model.input)
            model_outputs.append(image_model.output)

        # Initialise TexNet Conv2D model if matrices are given as input.
        if self.train_m:
            matrix_model = self.texnet_conv2d(self.train_m_data)
            model_inputs.append(matrix_model.input)
            model_outputs.append(matrix_model.output)

        # Initialise TexNet MLP if Haralick features are given as input.
        if self.train_h:
            haralick_model = self.texnet_mlp(self.train_h_data)
            model_inputs.append(haralick_model.input)
            model_outputs.append(haralick_model.output)

        # Without any selected input feature there is nothing to build, so stop before creating layers.
        if len(model_outputs) == 0:
            raise ValueError(
                "No input features were selected. "
                "Please set at least one training flag to 'True' so a model can be correctly built.")

        # If only a single input feature is used then ensembling is not conducted.
        if len(model_outputs) == 1:
            final_layer = Dense(1, activation='sigmoid')(model_outputs[0])
        else:
            concatenate_model = concatenate(model_outputs)
            final_layer = Dense(len(model_outputs), activation='relu')(concatenate_model)
            final_layer = Dense(1, activation='sigmoid')(final_layer)

        return Model(inputs=model_inputs, outputs=final_layer)

    def train_model(self, tensor, epochs, batch_size=None, loss_function='mae', optimizer='adam'):
        """ This method compiles and trains the model obtained from the 'prepare_model' function.

        Parameters
        ----------
        tensor: Keras model.
            Input should be the prepared model obtained from the 'prepare_model' function.
        epochs: int
            Number of epochs the model should be trained for. Required; there is no default value.
        batch_size: int, optional
            Number of samples passed through the model per gradient update.
            Default = None, in which case the whole training set is used as a single batch.
        loss_function: string
            Which loss function the model should minimise. Default is Mean Absolute Error ('mae').
        optimizer: string
            Which optimiser the model should use during training.
            'adam' (default) selects Keras Nadam (Adam with Nesterov momentum) with lr = 0.0005.
            'sgd' selects Keras SGD with lr = 0.1, decay = 1e-6, and momentum = 0.6.

        Returns
        -------
        train_data: list of array_like
            The training inputs, one entry per selected input feature.
        test_data: list of array_like
            The test inputs, one entry per selected input feature.
        val_data: list of array_like
            The validation inputs, one entry per selected input feature.
        history: Keras History object
            The record returned by 'tensor.fit', holding the training and validation loss for each epoch.
        """
        # Stop early if no input feature was switched on; a flag left unset (None) or set to False counts as off.
        if not (self.train_i or self.train_m or self.train_h):
            raise ValueError("No data as input! Select a training data set.")

        # Initialise optimizer
        if optimizer == 'adam':
            optimizer = Nadam(lr=0.0005)
        if optimizer == 'sgd':
            optimizer = SGD(lr=0.1, decay=1e-6, momentum=0.6)

        # Compile the model with the selected optimiser and loss function.
        tensor.compile(optimizer=optimizer, loss=loss_function)

        train_data = []
        test_data = []
        val_data = []

        if self.train_i:
            train_data.append(self.train_i_data)
            test_data.append(self.test_i_data)
            val_data.append(self.val_i_data)

        if self.train_m:
            train_data.append(self.train_m_data)
            test_data.append(self.test_m_data)
            val_data.append(self.val_m_data)

        if self.train_h:
            train_data.append(self.train_h_data)
            test_data.append(self.test_h_data)
            val_data.append(self.val_h_data)

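        # Default to one batch holding the whole training set. The target length is used because
        # the image data is None whenever images are not selected as an input feature.
        if batch_size is None:
            batch_size = len(self.train_t)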

        # Return output trained model.
        return train_data, test_data, val_data, tensor.fit(train_data, self.train_t, batch_size=batch_size,
                                                           epochs=epochs, verbose=1,
                                                           validation_data=(val_data, self.val_t))
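
The layer sizes in `texnet_conv2d` can be checked by hand. The sketch below is plain Python, not part of the original notebook, and assumes a 256x256 single-channel input as stated in the docstring. It relies on two Keras defaults: `padding='same'` convolutions with stride 1 keep the spatial size, and `MaxPooling2D` uses strides equal to its pool size. The helper names `pooled_size`, `conv_params`, and `dense_params` are hypothetical.

```python
def pooled_size(size, pool):
    """Spatial size after max pooling with stride equal to the pool size and no padding."""
    return size // pool


def conv_params(kernel, in_channels, filters):
    """Trainable parameters of a 2D convolution: kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters


def dense_params(inputs, units):
    """Trainable parameters of a fully connected layer: weights plus one bias per unit."""
    return inputs * units + units


side = 256      # assumed input height and width
channels = 1    # assumed single-channel (greyscale) input
total = 0

# (kernel size, pool size) for the three Conv2D + MaxPooling2D stages of texnet_conv2d.
for kernel, pool in [(7, 4), (3, 2), (3, 2)]:
    total += conv_params(kernel, channels, 16)
    channels = 16                      # every Conv2D layer has 16 filters
    side = pooled_size(side, pool)     # 'same' padding: only pooling shrinks the image

flattened = side * side * channels     # length of the vector produced by Flatten()
total += dense_params(flattened, 16)   # final Dense(16) layer of the branch

print("Spatial size after pooling:", side)
print("Flattened feature length:", flattened)
print("Trainable parameters in the branch:", total)
```

Under these assumptions most of the parameters sit in the Dense layer that follows `Flatten()`, which is also the layer that carries the L2 regularisation.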

## Visualising the Performance of our Model
Now that the model can be prepared we define a set of useful functions for visualising how well our model performs.

In [None]:
class VisualisePerformance(object):
    """ This class offers a few visualisation functions that display the model's performance.
    """

    def __init__(self, **kwargs):
        """ Initialise the object and provide the input data for display.

        kwargs: dict
            Provide a dictionary of the inputs selected for visualisation.
        """
        self.train_d = kwargs.get('train_data')
        self.test_d = kwargs.get('test_data')
        self.val_d = kwargs.get('val_data')
        self.train_t = kwargs.get('train_target')
        self.test_t = kwargs.get('test_target')
        self.val_t = kwargs.get('val_target')

    def predictions(self, compiled_model, batch_size=None):
        """ This function will compute the predicted values based on the set of
        test data passed to the class during initialisation.

        Parameters
        ----------
        compiled_model: Keras model.
            This should be the model that was trained by the 'train_model' function in the 'TexNetModels'
            class.
        batch_size: int, optional
            Number of samples per prediction batch. Default = None, which leaves the choice to Keras.
        """
        return compiled_model.predict(self.test_d, batch_size=batch_size, verbose=1)

    def display_loss(self, prepared_model):
        """ Displays the various loss values obtained from the predictions made on train, test, and validation data
        sets.

        Parameters
        ----------
        prepared_model: Keras model.
            This should be the model built by the 'prepare_model' function and trained by the 'train_model'
            function in the 'TexNetModels' class.
        """
        train_loss = prepared_model.evaluate(self.train_d, self.train_t, verbose=0)
        test_loss = prepared_model.evaluate(self.test_d, self.test_t, verbose=0)
        val_loss = prepared_model.evaluate(self.val_d, self.val_t, verbose=0)
        return ('Train Loss: {}'.format(train_loss),
                'Validation Loss: {}'.format(val_loss),
                'Test Loss: {}'.format(test_loss))

    def compute_accuracy_metrics(self, predicted):
        """ Computes the various error quantities by comparing actual values with predictions made by the model.

        Parameters
        ----------
        predicted: array-like
            The predicted values returned from the 'predictions' function.

        Returns
        -------
        MAPE: float64
            Mean Absolute Percentage Error
        MAE: float64
            Mean Absolute Error
        MSE: float64
            Mean Squared Error
        RMSE: float64
            Root Mean Squared Error
        R2: float64
            R2 - Coefficient of determination.
        """
        # Reshape targets to (n, 1) to match the predictions, then average |actual - predicted| / |actual|.
        actual = self.test_t.values[:, np.newaxis]
        mape = np.mean(np.abs((actual - predicted) / actual))
        mae = skm.mean_absolute_error(self.test_t.values, predicted)
        mse = skm.mean_squared_error(self.test_t.values, predicted)
        rmse = np.sqrt(mse)
        r2 = skm.r2_score(self.test_t.values, predicted)
        return {'MAPE': mape, 'MAE': mae, 'MSE': mse, 'RMSE': rmse, 'R2': r2}

    def plot_loss(self, tensor):
        """ Plots loss over epochs for both training and validation data.

        Parameters
        ----------
        tensor: Keras History object.
            This should be the training history returned as the last output of the 'train_model' function.

        Returns
        -------

        Plot of loss over epoch.
        """
        plt.plot(tensor.history['loss'])
        plt.plot(tensor.history['val_loss'])
        plt.title('Model Loss over Epochs')
        plt.ylabel('Loss')
        plt.xlabel('Epoch')
        plt.legend(['Train', 'Val'], loc='upper left')
        plt.show()

    def create_prediction_df(self, predicted):
        """ Generate a Pandas DataFrame that contains the associated texture dimension actual values and predicted values.

        Parameters
        ----------
        predicted: array_like
            The predicted values returned from the 'predictions' function.

        Returns
        -------
        prediction_df: Pandas DataFrame
        """
        prediction_df = pd.DataFrame(self.test_t)
        prediction_df['predicted_target'] = predicted
        prediction_df = prediction_df.sort_values(by=prediction_df.columns[0], ascending=True)

        return prediction_df

    def plot_predictions(self, data_frame, dimension):
        """ Generate a plot of predicted values against actual values.

        Parameters
        ----------

        data_frame: Pandas DataFrame
            Values obtained from 'create_prediction_df' function
        dimension: string
            Type of texture dimension plot is being computed for.
        """
        plt.style.use(u'seaborn-whitegrid')

        xdata = data_frame.index.format()
        ydata = data_frame.iloc[:, 0]
        y2data = data_frame.iloc[:, 1]

        fig, ax = plt.subplots(figsize=(12, 12))

        ax.plot(xdata, ydata, 'o--', alpha=0.7, label="Subjective {} Values".format(dimension), c='b')
        for x, y in zip(xdata, ydata):
            label = "{:.2f}".format(y)
            plt.annotate(label, (x, y), textcoords="offset points", xytext=(0, 2), ha='center')

        ax.plot(xdata, y2data, 'o-', alpha=0.7, label="Model Predicted {} Values".format(dimension), c='g')
        for x, y in zip(xdata, y2data):
            label = "{:.2f}".format(y)
            plt.annotate(label, (x, y), textcoords="offset points", xytext=(0, -2), ha='center')

        ax.grid(color='grey', linestyle='-', linewidth=0.25, alpha=0.5)
        ax.spines['top'].set_visible(False)
        ax.spines['right'].set_visible(False)

        ax.set_title('Subjective {}: Actual Values vs Model Predicted Values'.format(dimension))  # Set title.
        ax.set_xlabel('Image Texture', fontsize=18)
        ax.set_ylabel('{}'.format(dimension), fontsize=18)
        ax.margins(x=0.01)
        plt.ylim(0, 1)

        ax.legend(loc='upper left')
        plt.xticks(rotation=45)
        plt.show()
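
To make the quantities returned by `compute_accuracy_metrics` concrete, here is a minimal pure-Python sketch of the standard definitions of MAE, MSE, RMSE, MAPE, and R2. It is an illustration only: the helper `regression_metrics` is hypothetical, the toy values are invented, and MAPE is reported as a fraction rather than a percentage. MAPE divides by the actual values, so it is undefined when any target is zero.

```python
import math


def regression_metrics(actual, predicted):
    """Compute common regression error metrics for two equal-length sequences."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]

    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    # Mean absolute error relative to each actual value (requires non-zero targets).
    mape = sum(abs(e / a) for e, a in zip(errors, actual)) / n

    # R2 compares the residual sum of squares with the variance around the mean target.
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1 - ss_res / ss_tot

    return {'MAPE': mape, 'MAE': mae, 'MSE': mse, 'RMSE': rmse, 'R2': r2}


# Invented toy targets and predictions on the same 0-1 scale as the model output.
toy_actual = [0.2, 0.5, 0.8]
toy_predicted = [0.25, 0.4, 0.8]
metrics = regression_metrics(toy_actual, toy_predicted)
for name, value in metrics.items():
    print("{}: {:.4f}".format(name, value))
```

Because the ratio is taken per sample, a small absolute error on a low target contributes more to MAPE than the same error on a high target.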

## Putting it All Together
We have defined the objects needed to compile and train our model. Now we initialise everything so that we can make some predictions and visualise the model's performance.

In [None]:
# Initialise our FeaturesLists object
feature_creation = FeaturesLists(IMAGE_DIR, image_size = IMAGE_SIZE)

In [None]:
# Generate our list of images
image_list = feature_creation.create_image_list()

In [None]:
# Create our list of matrices, obtain the optimal matrix input values, generate our Haralick features for each image.
matrices, inputs, haralick = feature_creation.create_matrix_list(image_list, [1, 2, 4, 6, 8, 10, 15, 20], [0, 45, 90, 135])

In [None]:
# Initialise our Prepare Data object and pass our list of test textures that we want to retain for evaluating the performance
# of our model.
prepare_data = PrepareData(TEST_TEXTURE_LIST)

In [None]:
# We initialise our data into a new data frame that will include additional columns containing our computed feature sets.
df = prepare_data.append_feature_data(data_frame = data_frame, image_list = image_list, matrix_list = matrices, haralick_list = haralick)

In [None]:
# We calculate the Log of cluster shade and prominence to better display the range of computed values.
for col in [col for col in df.columns if 'har_cls' in col]:
    df["log_{}".format(col)] = np.log(np.abs(df[col]))

# Values in Haralick data are then rescaled between 0 and 1 if their max is greater than 1.
for col in [col for col in df.columns if 'har_' in col and df[col].max() > 1]:
    df[col] = prepare_data.scale_data(df[col], 0, 1)
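
The effect of the two loops above can be sketched on a handful of numbers. This example assumes that `prepare_data.scale_data(values, 0, 1)` behaves like a standard min-max rescale; the helper `min_max_scale` is a hypothetical stand-in, not the notebook's implementation, and the cluster shade values are invented.

```python
import math


def min_max_scale(values, low=0.0, high=1.0):
    """Linearly map values so that the minimum becomes `low` and the maximum becomes `high`."""
    v_min, v_max = min(values), max(values)
    span = v_max - v_min
    return [low + (v - v_min) * (high - low) / span for v in values]


# Invented cluster shade values: they may be negative and span several orders of magnitude.
cluster_shade = [-1200.0, 35.0, 8400.0]

# Step 1: log of the absolute value compresses the range (the sign is discarded).
log_shade = [math.log(abs(v)) for v in cluster_shade]

# Step 2: rescale to [0, 1] because the maximum of the log values is greater than 1.
scaled = min_max_scale(log_shade) if max(log_shade) > 1 else log_shade

for raw, logged, unit in zip(cluster_shade, log_shade, scaled):
    print("{:>8.1f} -> {:.3f} -> {:.3f}".format(raw, logged, unit))
```

Note that the logarithm is undefined for a value of exactly zero, so a feature column containing zeros would need separate handling.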

In [None]:
# Define which features we wish to use during training.
HARALICK_FEATURES = ['har_homo', 'har_corr', 'har_contrast', 'har_energy', 'har_mean', 'har_stdev', 'log_har_cls_prom', 'log_har_cls_shade']

In [None]:
# Create each of the separate train, test, and validation data sets.
train, test, val = prepare_data.split_for_training(df)

In [None]:
# Reshape the image and matrix data so that it can be passed to our model.
train_images = prepare_data.reshape_array(train.image_list, (len(train.image_list), train.image_list[0].shape[0], train.image_list[0].shape[1], 1))
test_images = prepare_data.reshape_array(test.image_list, (len(test.image_list), test.image_list[0].shape[0], test.image_list[0].shape[1], 1))
val_images = prepare_data.reshape_array(val.image_list, (len(val.image_list), val.image_list[0].shape[0], val.image_list[0].shape[1], 1))

train_matrix = prepare_data.reshape_array(train.matrix_list, (len(train.matrix_list), train.matrix_list[0].shape[0], train.matrix_list[0].shape[1], 1))
test_matrix = prepare_data.reshape_array(test.matrix_list, (len(test.matrix_list), test.matrix_list[0].shape[0], test.matrix_list[0].shape[1], 1))
val_matrix = prepare_data.reshape_array(val.matrix_list, (len(val.matrix_list), val.matrix_list[0].shape[0], val.matrix_list[0].shape[1], 1))

In [None]:
# Initialise our dictionary that defines which parameters we want to train on.
features = {
    'image_training':True,
    'matrix_training':True,
    'haralick_training':True,
    'train_image_data':train_images,
    'train_matrix_data':train_matrix,
    'train_haralick_data':train.loc[:, HARALICK_FEATURES],
    'test_image_data':test_images,
    'test_matrix_data':test_matrix,
    'test_haralick_data':test.loc[:, HARALICK_FEATURES],
    'val_image_data':val_images,
    'val_matrix_data':val_matrix,
    'val_haralick_data':val.loc[:, HARALICK_FEATURES],
    'train_target':train[PREDICTOR_VARIABLE] / 100, # Divide predictor variable values by 100
    'test_target':test[PREDICTOR_VARIABLE] / 100, 
    'val_target':val[PREDICTOR_VARIABLE] / 100
}
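
Because the `TexNetModels` constructor reads every entry with `kwargs.get`, any key left out of the dictionary simply comes back as `None`, so a subset of the input features can be selected by omitting the others. The sketch below illustrates that pattern with a hypothetical stand-in class, `FeatureSelection`, that copies only the three flag lookups; it does not build a Keras model.

```python
class FeatureSelection(object):
    """Hypothetical stand-in mirroring how TexNetModels reads its training flags."""

    def __init__(self, **kwargs):
        # dict.get returns None for a missing key instead of raising KeyError.
        self.train_i = kwargs.get('image_training')
        self.train_m = kwargs.get('matrix_training')
        self.train_h = kwargs.get('haralick_training')

    def active_branches(self):
        """List the input features whose flag is truthy."""
        flags = [('image', self.train_i), ('matrix', self.train_m), ('haralick', self.train_h)]
        return [name for name, flag in flags if flag]


# A Haralick-only configuration: the image and matrix keys are simply left out.
haralick_only = {'haralick_training': True}
selection = FeatureSelection(**haralick_only)

print(selection.train_i)            # missing key, so the flag is None
print(selection.active_branches())  # only the Haralick branch is switched on
```

With a single active branch, `prepare_model` skips the concatenation step and attaches the final sigmoid layer directly to that branch.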

In [None]:
# Initialise our TexNetModels object and pass to it our feature keyword argument dictionary.
texNet = TexNetModels(**features)

In [None]:
# Prepare the model for training
prepared_model = texNet.prepare_model()

In [None]:
# Train our model, and produce the split train, test, and validation data sets.
train_data, test_data, val_data, tensor_history = texNet.train_model(prepared_model, EPOCHS, BATCH_SIZE)

In [None]:
# Initialise keyword arguments for visualising model performance.
visualisation_data = {
    'train_data':train_data,
    'test_data':test_data,
    'val_data':val_data,
    'train_target':train[PREDICTOR_VARIABLE] / 100, # Divide predictor variable values by 100
    'test_target':test[PREDICTOR_VARIABLE] / 100, 
    'val_target':val[PREDICTOR_VARIABLE] / 100
}

In [None]:
# Initialise the VisualisePerformance object and pass to it the dictionary of key word params.
performance_visualiser = VisualisePerformance(**visualisation_data)

In [None]:
# Generate predictions from our trained model.
predictions = performance_visualiser.predictions(prepared_model)

In [None]:
# Display loss values for each of our data sets.
performance_visualiser.display_loss(prepared_model)

In [None]:
# Display accuracy metrics for predictions by our model.
performance_visualiser.compute_accuracy_metrics(predictions)

In [None]:
# Display loss function over no. of epochs.
performance_visualiser.plot_loss(tensor_history)

In [None]:
# Create a Pandas Data Frame that contains our predictions and real values in ascending order.
predictions_df = performance_visualiser.create_prediction_df(predictions)
predictions_df

In [None]:
# Plot the performance of our predictions as a line graph in comparison to our actual values.
performance_visualiser.plot_predictions(predictions_df, PREDICTOR_VARIABLE)