# Low Dimensional Convolutional Filters Using Harmonic Polynomials

We use the standard convolutional net for MNIST as outlined in [the convolutional tensorflow tutorial on MNIST](https://www.tensorflow.org/tutorials/estimators/cnn), except we restrict the dimensions of our filters using harmonic polynomials.

## Filter Dimension Restriction

The normal filter consists of a `5x5` patch for each input channel and output channel; i.e. a tensor of shape `(5, 5, n_input_channels, n_output_channels)`. Each `5x5` patch has dimension `25` when the coefficients of the patch are not restricted.

Here we restrict each `5x5` patch to a lower-dimensional subspace: the space of harmonic polynomials in `x` and `y` of degree at most `3`.
This is a `7`-dimensional space (the constant, plus the real and imaginary parts of `(x + iy)**p` for `p = 1, 2, 3`), so each patch needs `7` trainable coefficients instead of `25`.
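
As a rough sketch of the savings, the following counts the filter weights (biases excluded) for the two convolutions. It assumes the `1 -> 32 -> 64` channel layout that the model in this notebook uses; the variable names are only illustrative.

```python
# Trainable filter weights (biases excluded) for the two convolutions,
# assuming 1 -> 32 -> 64 channels as in the model built later in this notebook.
patch_dim_full = 5 * 5   # unrestricted 5x5 patch
patch_dim_harmonic = 7   # harmonic polynomials of degree <= 3

channel_pairs = [(1, 32), (32, 64)]  # (n_input_channels, n_output_channels)

full_counts = [patch_dim_full * n_in * n_out for n_in, n_out in channel_pairs]
harmonic_counts = [patch_dim_harmonic * n_in * n_out for n_in, n_out in channel_pairs]

print('full 5x5 filters:    ', full_counts)
print('harmonic restriction:', harmonic_counts)
```

Each restricted convolution keeps `7/25` of the filter weights of its unrestricted counterpart.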

### Why Harmonic?

We want patches that are good at picking out different types of edges, so they need a good mix of positive and negative values.
In particular, they should avoid having an interior maximum or minimum, as `p(x,y) = x**2 + y**2` does. Non-constant harmonic polynomials have no interior extrema (the maximum principle). We also make the basis orthonormal.
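
The following is a minimal sketch (not part of the training code) that checks this numerically on a centered `5x5` grid. The helper `discrete_laplacian` is a hypothetical name; it applies the five-point Laplacian to the interior points of a patch. The harmonic patches give zero, while the bowl `x**2 + y**2` does not.

```python
import numpy as np

# Centered 5x5 grid: x varies along columns, y along rows.
coords = np.arange(5) - 2
x, y = np.meshgrid(coords, coords)
z = x + 1j * y

def discrete_laplacian(patch):
    '''Five-point Laplacian on the interior 3x3 points of a 5x5 patch.'''
    return (patch[:-2, 1:-1] + patch[2:, 1:-1]
            + patch[1:-1, :-2] + patch[1:-1, 2:]
            - 4 * patch[1:-1, 1:-1])

# The constant plus the real and imaginary parts of z**p for p = 1, 2, 3.
harmonic_patches = [np.ones((5, 5))]
for p in (1, 2, 3):
    harmonic_patches += [np.real(z**p), np.imag(z**p)]

max_abs_laplacian = max(np.abs(discrete_laplacian(h)).max() for h in harmonic_patches)
bowl_laplacian = discrete_laplacian(x**2 + y**2)

# The seven patches are linearly independent as vectors of length 25.
rank = np.linalg.matrix_rank(np.array([h.ravel() for h in harmonic_patches]))

print(len(harmonic_patches), 'harmonic patches with rank', rank)
print('max |Laplacian| of the harmonic patches:', max_abs_laplacian)
print('Laplacian of x**2 + y**2 on the interior:', np.unique(bowl_laplacian))
```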

Here is what the harmonic filter basis looks like:

![Orthonormal Harmonic Filter Basis](files/graphs/orthonormal_harmonic_polys.png)

### How to Restrict Dimensions of Filters?

To restrict the filters we start with a basis of filters of shape `(5, 5, n_basis)`. We use a non-trainable Depthwise Convolution2D layer to get the coefficients of each image patch with respect to this filter sub-space. A Depthwise Convolution2D layer needs
filters for each input channel, so we make `n_input_channels` copies of the filter basis to get a tensor of shape
`(5, 5, n_input_channels, n_basis)`. This is the depthwise filter for the Depthwise Convolution2D layer. The output of this
layer is of shape `(height, width, n_basis * n_input_channels)`. The outputs for each input channel are grouped into contiguous
segments of size `n_basis` along the last axis.

To then train on the output of the (non-trainable) Depthwise Convolution2D layer, we next add a regular Convolution2D layer with a filter of shape `(1, 1, n_basis * n_input_channels, n_output_channels)`. This effectively creates a point-wise convolution over the coefficients from the restricted filter sub-space.
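
To see why the two layers together act like one ordinary convolution with a restricted kernel, here is a small NumPy sketch for a single input channel. It uses random stand-ins for the basis and the point-wise weights (the sizes and names are assumptions for illustration only), and it computes a cross-correlation without padding, so it illustrates the idea rather than reproducing the Keras layers.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(0)
n_basis, n_out = 7, 4                            # hypothetical sizes
basis = rng.normal(size=(5, 5, n_basis))         # stand-in for the fixed filter basis
weights = rng.normal(size=(n_basis, n_out))      # stand-in for the point-wise weights
image = rng.normal(size=(12, 12))                # single-channel input

# All 5x5 windows of the image, shape (8, 8, 5, 5).
windows = sliding_window_view(image, (5, 5))

# Step 1 (fixed): coefficients of each window against each basis filter.
coefficients = np.einsum('hwij,ijk->hwk', windows, basis)
# Step 2 (trainable): point-wise linear combination of the coefficients.
two_step = coefficients @ weights

# The same result from one convolution with the effective 5x5 kernels.
effective_kernel = np.einsum('ijk,ko->ijo', basis, weights)
one_step = np.einsum('hwij,ijo->hwo', windows, effective_kernel)

print(np.allclose(two_step, one_step))
```

Each effective kernel is a linear combination of the basis filters, so it stays inside the restricted sub-space regardless of how the point-wise weights are trained.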

# Results

## The Effective Kernels

### First Convolution

Here are pictures of the effective kernels for the first 2D Convolution.

![Effective Kernels for First Convolution](files/graphs/effective_filters_1.png)

What do these filters do to the image inputs? Consider the following example image:

![Example Digit Image](files/graphs/example_orig.svg)

Here are the graphs of the output of the first (effective) 2D Convolution with combined 2D Pooling:

![Output of First Effective 2D Convolution](files/graphs/conv_pool1.png)

### Second Convolution

Here are the effective kernels for the second 2D Convolution.

![Effective Kernels for Second Convolution](files/graphs/effective_filters_2.png)

# Description of Data Files

The dataset is from [http://yann.lecun.com/exdb/mnist/](http://yann.lecun.com/exdb/mnist/). The description of the file formats
is also given there.

## Image File Format
```
[offset] [type]          [description] 
0000     32 bit integer  magic number 
0004     32 bit integer  number of images 
0008     32 bit integer  number of rows 
0012     32 bit integer  number of columns 
0016     unsigned byte   pixel 
0017     unsigned byte   pixel 
........ 
xxxx     unsigned byte    pixel
```

## Label File Format
```
[offset] [type]          [description] 
0000     32 bit integer  magic number (MSB first) 
0004     32 bit integer  number of items 
0008     unsigned byte   label 
0009     unsigned byte   label 
........ 
xxxx     unsigned byte   label
The labels values are 0 to 9.
```
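
As a minimal sketch of how the header layout above can be read, the following parses a tiny synthetic image file built in memory rather than the real data files. The magic number `2051` is the value listed for image files on the MNIST page; the image sizes here are made up for illustration.

```python
import io
import struct
import numpy as np

# Build a tiny synthetic image file in memory: 2 images of 3x4 pixels.
pixels = np.arange(2 * 3 * 4, dtype='uint8')
raw = struct.pack('>iiii', 2051, 2, 3, 4) + pixels.tobytes()

with io.BytesIO(raw) as f:
    # Four big-endian (MSB first) 32 bit integers make up the 16 byte header.
    magic, n_images, n_rows, n_columns = struct.unpack('>iiii', f.read(16))
    # The remaining bytes are the pixels, stored image by image and row by row.
    images = np.frombuffer(f.read(), dtype='uint8')
    images = images.reshape(n_images, n_rows, n_columns)

print(magic, images.shape)
```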

In [None]:
import struct # To unpack string literals of bytes to integers.
import numpy as np
import matplotlib.pyplot as plt
plt.rcParams['svg.fonttype'] = 'none' # Saves space when saving svg plots to file.

import tensorflow as tf
import keras

In [None]:
X = {}
y = {}

# Get the Training Data

In [None]:
# Get the training images information.

with open('data/train-images.idx3-ubyte', 'rb') as f:
    _ = f.read(4) # Read the magic number.
    training_info = {name : f.read(4) for name in ['n_images', 'n_rows', 'n_columns']}
    print(training_info)
    # Make sure to enforce big-endian.
    training_info = {key : struct.unpack('>i', value)[0] for key, value in training_info.items()}
    print(training_info)
    
    images = np.fromfile(f, dtype = 'uint8')
    images = images.reshape(training_info['n_images'], training_info['n_rows'], training_info['n_columns'])
X['train'] = images.astype('float32') / 255

In [None]:
# Get the training labels.

with open('data/train-labels.idx1-ubyte', 'rb') as f:
    _ = f.read(4) # Read the magic number.
    training_info['n_labels'] = struct.unpack('>i', f.read(4))[0]
    print(training_info)
        
    labels = np.fromfile(f, dtype = 'uint8')
y['train'] = labels

In [None]:
sample_i = 3
plt.imshow(images[sample_i])
plt.title('Label = ' + str(y['train'][sample_i]))
plt.show()

In [None]:
np.unique(y['train'], return_counts = True)

# Get the Test Data

In [None]:
# Get the test images.

with open('data/t10k-images.idx3-ubyte', 'rb') as f:
    _ = f.read(4) # Read the magic number.
    testing_info = {name : f.read(4) for name in ['n_images', 'n_rows', 'n_columns']}
    print(testing_info)
    # Make sure to enforce big-endian.
    testing_info = {key : struct.unpack('>i', value)[0] for key, value in testing_info.items()}
    print(testing_info)
    
    images = np.fromfile(f, dtype = 'uint8')
    images = images.reshape(testing_info['n_images'], testing_info['n_rows'], testing_info['n_columns'])
X['test'] = images.astype('float32') / 255

In [None]:
# Get the test labels.

with open('data/t10k-labels.idx1-ubyte', 'rb') as f:
    _ = f.read(4) # Read the magic number.
    testing_info['n_labels'] = struct.unpack('>i', f.read(4))[0]
    print(testing_info)
        
    labels = np.fromfile(f, dtype = 'uint8')
y['test'] = labels

# Get the First Convolution Filter Space

In [None]:
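def make_orthonormal(filter_basis):
    '''Gram-Schmidt on filter_basis of shape (filter_height, filter_width, n_basis).'''
    # Move the basis index to the first axis so that we can iterate over the filters.
    filter_basis = filter_basis.T.astype('float32')
    print(filter_basis.shape)
    for i, channel in enumerate(filter_basis):
        # Remove the components along the filters that are already orthonormal.
        for normalized in filter_basis[:i]:
            dot = (channel * normalized).sum()
            channel = channel - dot * normalized
        filter_basis[i] = channel / np.linalg.norm(channel)
    return filter_basis.T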

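def make_filter_basis(shape_2d, fncs):
    '''
    Parameters
    ----------
    shape_2d : Tuple of Int
        The (height, width) of each filter.
    fncs : List of Functions
        The funcs f(X), where X = [x_coord, y_coord], to use to make the filter basis.

    Returns
    -------
    basis : np.ndarray
        Has shape (shape_2d[1], shape_2d[0], len(fncs)). Each filter has unit norm.
    '''
    # Grids holding the x (column) and y (row) index of every filter position.
    x_coord = np.full(shape_2d, np.arange(shape_2d[1]))
    y_coord = np.full((shape_2d[1], shape_2d[0]), np.arange(shape_2d[0])).T
    filter_base = np.array([x_coord, y_coord])
    
    basis = [f(filter_base) for f in fncs]
    basis = [x / np.linalg.norm(x) for x in basis]
    # Move the basis index to the last axis.
    basis = np.array(basis).T.astype('float32')
    
    return basis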

In [None]:
# Get harmonic polynomials.
# Use default-value for function currying inside list expressions.

max_degree = 3
fncs =  ([[lambda X : X[0]**0]] +
        [[lambda X, p = i : np.real((X[0] - 2 + 1j * (X[1] - 2))**p),
         lambda X, p = i : np.imag((X[0] - 2 + 1j * (X[1] - 2))**p)]
            for i in np.arange(1, max_degree + 1, 1)])

# Flatten the list of functions.
fncs = [a for inner in fncs
          for a in inner]

filter_basis_harmonic = make_filter_basis((5, 5), fncs)
print(filter_basis_harmonic.shape)
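
The default-value trick in the cell above (`p = i`) is needed because Python closures look up loop variables when they are called, not when they are defined. A minimal sketch of the difference, with made-up names:

```python
# Without a default argument, every lambda looks up `i` when it is called,
# so they all see the final value of the loop variable.
late_bound = [lambda x: x**i for i in range(1, 4)]

# Binding `p = i` as a default value captures the value of `i` at definition time.
early_bound = [lambda x, p = i: x**p for i in range(1, 4)]

late_values = [f(2) for f in late_bound]
early_values = [f(2) for f in early_bound]
print(late_values)   # every lambda uses i == 3
print(early_values)  # powers 1, 2 and 3
```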

In [None]:
for i in range(filter_basis_harmonic.shape[-1]):
    plt.subplot(2, 4, i + 1)
    plt.imshow(filter_basis_harmonic[:, :, i])
    plt.title('Filter ' + str(i))
    plt.xticks([0, 2, 4])
    plt.yticks([0, 2, 4])
    ax = plt.gca()
plt.tight_layout()
plt.savefig('graphs/harmonic_polys.png')
plt.show()

In [None]:
filter_basis_orthonormal = make_orthonormal(filter_basis_harmonic)
for i in range(filter_basis_orthonormal.shape[-1]):
    plt.subplot(2, 4, i + 1)
    plt.imshow(filter_basis_orthonormal[:, :, i])
    plt.title('Filter ' + str(i))
    plt.xticks([0, 2, 4])
    plt.yticks([0, 2, 4])
    
plt.tight_layout()
plt.savefig('graphs/orthonormal_harmonic_polys.png')
plt.show()
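
One way to sanity-check an orthonormalization like the one above is to look at the Gram matrix of the flattened filters, which should be the identity. The sketch below is self-contained and rebuilds the harmonic patches on a centered grid; `gram_schmidt_rows` is a hypothetical helper that works on flattened rows rather than on the `(5, 5, n_basis)` layout used in the notebook.

```python
import numpy as np

def gram_schmidt_rows(vectors):
    '''Gram-Schmidt on the rows of a 2D array (hypothetical helper).'''
    result = []
    for v in vectors.astype('float64'):
        # Remove the components along the rows that are already orthonormal.
        for q in result:
            v = v - (v @ q) * q
        result.append(v / np.linalg.norm(v))
    return np.array(result)

coords = np.arange(5) - 2
x, y = np.meshgrid(coords, coords)
z = x + 1j * y
patches = [np.ones((5, 5))]
for p in (1, 2, 3):
    patches += [np.real(z**p), np.imag(z**p)]
flat = np.array([patch.ravel() for patch in patches])  # shape (7, 25)

orthonormal = gram_schmidt_rows(flat)
gram = orthonormal @ orthonormal.T  # entry (i, j) is the dot product of filters i and j
print(np.allclose(gram, np.eye(7)))
```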

## The Simple Harmonic Polynomial Filter Basis

![Simple Harmonic Polynomial Filters](graphs/harmonic_polys.png)

## The Orthonormal Harmonic Polynomial Filter Basis

![Orthonormal Harmonic Polynomial Filters](graphs/orthonormal_harmonic_polys.png)

# Build the neural network

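The network has two low-dimensional convolution blocks, each followed by max pooling, and then a dense layer with dropout and a
10-way softmax output. Each low-dimensional convolution block is a non-trainable depthwise convolution with the filter basis followed by a trainable point-wise convolution.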

In [None]:
# Function to make layers for low-dimensional convolution.

def make_low_dim_conv2d_layers(filter_basis, n_input_channels, n_filters, name):
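    '''
    Parameters
    ----------
    filter_basis : np.ndarray
        Has shape (filter_height, filter_width, n_basis)
    n_input_channels : int
        Number of channels in the input to the depthwise convolution.
    n_filters : int
        Number of output channels of the trainable point-wise convolution.
    name : str
        Prefix for the names of the two layers.
    '''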
    # Add axis for channels. Depthwise Conv2D will need a copy for each channel.
    filter_basis = filter_basis[..., np.newaxis, :]
    print('filter_basis.shape = ', filter_basis.shape)
    
    depthwise_filter = np.full(filter_basis.shape[:-2] + (n_input_channels,) + filter_basis.shape[-1:], filter_basis)
    print('depthwise_filter.shape = ', depthwise_filter.shape)
    depthwise_filter_init = tf.keras.initializers.Constant(depthwise_filter)
    layers = [tf.keras.layers.DepthwiseConv2D(kernel_size = filter_basis.shape[:2],
                                              depth_multiplier = filter_basis.shape[-1],
                                              padding = 'same',
                                              use_bias = False,
                                              depthwise_initializer = depthwise_filter_init,
                                              trainable = False,
                                              name = name + '_filter_space'),
             tf.keras.layers.Conv2D(kernel_size = (1, 1),
                                    padding = 'valid',
                                    filters = n_filters,
                                    activation = tf.nn.relu,
                                    name = name + '_pointwise_conv2D')]
    return layers

In [None]:
# Function to construct layers of the complete model.

def make_layers(filter_basis, n_filters, input_shape):
    
    layers = [tf.keras.layers.Reshape(input_shape = input_shape,
                                      target_shape = input_shape + (1,),
                                      name = 'Initial_Make_Channel')]
          
    layers += make_low_dim_conv2d_layers(filter_basis[0], 1, n_filters[0], 'low_dim_conv2d_1')

    layers += [tf.keras.layers.MaxPool2D(pool_size = (2, 2),
                                         strides = (2, 2),
                                         name = 'MaxPool_1')]

    layers += make_low_dim_conv2d_layers(filter_basis[1], n_filters[0], n_filters[1], 'low_dim_conv2d_2')

    layers += [tf.keras.layers.MaxPool2D(pool_size = (2, 2),
                                         strides = (2, 2),
                                         name = 'MaxPool_2'),
               tf.keras.layers.Reshape(target_shape = (n_filters[1] * (input_shape[0]//4) * (input_shape[1]//4),),
                                       name = 'Reshape_to_1D'),
               tf.keras.layers.Dense(units = (n_filters[1] * (input_shape[0]//4) * (input_shape[1]//4)) // 3,
                                     activation = tf.nn.relu,
                                     name = 'Dense_1'),
               tf.keras.layers.Dropout(rate = 0.4),
               tf.keras.layers.Dense(units = 10,
                                     activation = tf.nn.softmax,
                                     name = 'Class_Logits')
         ]
    return layers
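
As a quick check of the layer sizes used above, this sketch traces the shapes through the model for `28x28` inputs and `n_filters = [32, 64]`. It only does the arithmetic and does not build any layers.

```python
input_shape = (28, 28)
n_filters = [32, 64]

# 'same' padding keeps height and width; each 2x2 max-pool with stride 2 halves them.
after_pool_1 = (input_shape[0] // 2, input_shape[1] // 2, n_filters[0])
after_pool_2 = (input_shape[0] // 4, input_shape[1] // 4, n_filters[1])

# Size of the flattened vector and of the first dense layer.
flat_units = after_pool_2[0] * after_pool_2[1] * after_pool_2[2]
dense_units = flat_units // 3

print(after_pool_1, after_pool_2, flat_units, dense_units)
```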

In [None]:
harmonic_layers = make_layers(filter_basis = [filter_basis_orthonormal for _ in range(2)],
                              n_filters = [32, 64],
                              input_shape = X['train'].shape[1:])

In [None]:
filename = 'harmonic_poly_model.h5'
try:
    harmonic_model = tf.keras.models.load_model('saved_models/' + filename)
    print('Model automatically LOADED from file ' + filename)
except Exception: # A bare except would also catch KeyboardInterrupt.
    print('File saved_models/' + filename + ' can\'t be opened. Rebuilding and retraining model.')
    harmonic_model = tf.keras.Sequential(harmonic_layers)
    harmonic_model.compile(optimizer = 'adam',
                           loss='sparse_categorical_crossentropy',
                           metrics=['accuracy'])
    harmonic_model.fit(X['train'], y['train'], epochs = 5)
    harmonic_model.save('saved_models/' + filename)
    print('Model saved to saved_models/' + filename)

In [None]:
test_loss, test_acc = harmonic_model.evaluate(X['test'], y['test'])

# Look at Final Kernels for Harmonic Polynomial Filters

In [None]:
# Combine the depthwise-kernel from the model with the pointwise-kernel to get the effective
# kernels for the first 2D Convolution.

depthwise_kernel = tf.keras.backend.eval(harmonic_model.layers[1].depthwise_kernel)
print('depthwise_kernel.shape = ', depthwise_kernel.shape)
pointwise_kernel = tf.keras.backend.eval(harmonic_model.layers[2].kernel)
print('pointwise_kernel.shape = ', pointwise_kernel.shape)
effective_kernel = np.dot(depthwise_kernel, pointwise_kernel[0, 0, ...])
print('effective_kernel.shape = ', effective_kernel.shape)

In [None]:
# Graph the effective kernels for the first 2D convolution.

fig = plt.figure(figsize = (15, 10))
for i in range(32):
    plt.subplot(4, 8, i + 1)
    plt.imshow(effective_kernel[:, :, 0, i])
    if i != 3 and i != 4:
        plt.title('Filter ' + str(i))
    if i != 0:
        plt.xticks([0, 2, 4])
        plt.yticks([0, 2, 4])

plt.suptitle('Effective Filters for First Convolution')
plt.tight_layout()
plt.savefig('graphs/effective_filters_1.png')
plt.show()

## Picture of the Effective Filters for the First Convolution

![The effective kernels of the first 2D Convolution](files/graphs/effective_filters_1.png)

In [None]:
# Get the effective kernels for the second convolution.

depthwise_2 = tf.keras.backend.eval(harmonic_model.layers[4].depthwise_kernel)
print('depthwise_2.shape = ', depthwise_2.shape)
pointwise_2 = tf.keras.backend.eval(harmonic_model.layers[5].kernel)
print('pointwise_2.shape = ', pointwise_2.shape)
ind_pointwise = np.arange(pointwise_2.shape[2])
depthwise_2 = depthwise_2[:, :, ind_pointwise // 7, ind_pointwise % 7]
combination_2 = np.dot(depthwise_2, pointwise_2[0, 0, ...])
print('combination_2.shape = ', combination_2.shape)
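
The indexing in the cell above flattens the `(input channel, basis index)` pairs of the depthwise kernel in channel-major order before combining them with the point-wise kernel. The sketch below, with small made-up sizes and random stand-ins for the two kernels, spells out the same computation with explicit axes. It suggests that the `np.dot` result is the per-input-channel effective kernel summed over the input channels.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_basis, n_out = 3, 7, 4   # small hypothetical sizes
depthwise = rng.normal(size=(5, 5, n_in, n_basis))
pointwise = rng.normal(size=(n_in * n_basis, n_out))

# Flatten (input channel, basis index) pairs channel-major, as in the cell above.
ind = np.arange(n_in * n_basis)
flattened = depthwise[:, :, ind // n_basis, ind % n_basis]
combined = np.dot(flattened, pointwise)

# Same computation with explicit axes: c = input channel, k = basis index.
per_channel = np.einsum('ijck,cko->ijco',
                        depthwise,
                        pointwise.reshape(n_in, n_basis, n_out))
summed = per_channel.sum(axis = 2)

print(combined.shape, per_channel.shape, np.allclose(combined, summed))
```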

In [None]:
# Graph the effective kernels for the second 2D convolution.

fig = plt.figure(figsize = (15, 7))
for i in range(64):
    plt.subplot(5, 13, i + 1)
    plt.imshow(combination_2[:, :, i])
    if i < 5 or i > 7:
        plt.title('Filter ' + str(i))
    ax = plt.gca()
    if i > 0:
        plt.xticks([])
        plt.yticks([])
    else:
        plt.xticks([0, 2, 4])
        plt.yticks([0, 2, 4])

plt.suptitle('Effective Kernels for Second 2D Convolution')
plt.tight_layout()
plt.savefig('graphs/effective_filters_2.png')
plt.show()

## Picture of Effective Kernels For Second Convolution

![Effective Kernels for Second 2D Convolution](files/graphs/effective_filters_2.png)

# Look at Output of Convolution Layers for Particular Example

In [None]:
# Graph the example.

example = X['test'][20]
plt.imshow(example)
ax = plt.gca()
plt.xticks(np.arange(0, 28, 10))
plt.yticks(np.arange(0, 28, 10))
plt.title('Original Example Input')
plt.savefig('graphs/example_orig.svg')
plt.show()

In [None]:
# Get the result of applying the first convolutional layers.

result = example.reshape(1, 28, 28, 1)
for i in range(4):
    result = harmonic_model.layers[i](result)
result = tf.keras.backend.eval(result)
print('result.shape = ', result.shape)

In [None]:
# Graph the results of the first 2D convolution.

fig = plt.figure(figsize = (15, 10))
for i in range(32):
    plt.subplot(4, 8, i + 1)
    plt.imshow(result[0, :, :, i])
    ax = plt.gca()
    
plt.suptitle('Result of First 2D Convolution For Example')
plt.tight_layout()
plt.savefig('graphs/conv_pool1.png')
plt.show()

## Example of Output for First Convolution and Pooling

The original input is:

![Original](files/graphs/example_orig.svg)

The output of the first convolution and pooling layers is:

![First Convolution and Pooling](files/graphs/conv_pool1.png)

# Use LDA of 5x5 sub-samples to pick out filter space

In [None]:
# Function to get 5x5 sub-samples from image data.

def make_sub_samples(images):
    images = images[:, 1:26, 1:26] # Crop 28x28 to 25x25 so that it tiles into a 5x5 grid of 5x5 patches.
    i_ind = 5 * np.arange(5)[..., np.newaxis, np.newaxis, np.newaxis] + np.arange(5)[..., np.newaxis]
    j_ind = 5 * np.arange(5)[..., np.newaxis, np.newaxis] + np.arange(5)
    sub_samples = images[:, i_ind, j_ind]
    sub_samples = sub_samples.reshape(-1, 5, 5)
    return sub_samples
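
The index arrays above pick out a `5x5` grid of non-overlapping `5x5` patches from each cropped image. A possible alternative, sketched here under the assumption that the images are `28x28`, gets the same tiling with `reshape` and `transpose`. The function name `make_sub_samples_reshape` is hypothetical.

```python
import numpy as np

def make_sub_samples_reshape(images):
    '''Reshape-based tiling into non-overlapping 5x5 patches (hypothetical variant).'''
    cropped = images[:, 1:26, 1:26]
    n = cropped.shape[0]
    tiles = cropped.reshape(n, 5, 5, 5, 5)   # (n, block_row, row, block_col, col)
    tiles = tiles.transpose(0, 1, 3, 2, 4)   # (n, block_row, block_col, row, col)
    return tiles.reshape(-1, 5, 5)

# Two synthetic "images" whose pixel values encode their position.
images = np.arange(2 * 28 * 28).reshape(2, 28, 28)
sub_samples = make_sub_samples_reshape(images)
print(sub_samples.shape)
```

Patch `6` of the first image is block row `1`, block column `1`, which matches the sub-sample index used for plotting in this notebook.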

In [None]:
# Get the sub-samples from the training data.

sub_samples = make_sub_samples(X['train'])
sub_samples.shape

In [None]:
# Let's take a look at a sub-sample for a particular image.

plt.imshow(X['train'][0])
plt.title('Original Image')
plt.show()

plt.imshow(sub_samples[6])
plt.title('Sub-sample 6')
plt.show()

In [None]:
# Get the labels for the sub-samples.

sub_sample_labels = [label for label in y['train']
                           for _ in range(5 * 5)] # Make sure to repeat for sub-samples
sub_sample_labels = np.array(sub_sample_labels)
sub_sample_labels.shape
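
The same repetition of labels can be written with `np.repeat`, which avoids building a Python list first. A minimal sketch with a short stand-in label array:

```python
import numpy as np

labels = np.array([5, 0, 4], dtype = 'uint8')  # stand-in for y['train']
patches_per_image = 5 * 5

# Each label is repeated once for every sub-sample of its image.
sub_sample_labels = np.repeat(labels, patches_per_image)
print(sub_sample_labels.shape)
```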

In [None]:
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

In [None]:
# Train LDA on sub-samples data.

lda = LinearDiscriminantAnalysis()
lda.fit(sub_samples.reshape(-1, 25), sub_sample_labels)

In [None]:
# Use LDA to find a sub-space of filters for the first convolution.

# For comparison purposes, use the same dimension as before.
n_lda = 7 

# Reshape the lda coefficients to get basis.
filter_basis_lda = np.rollaxis(lda.coef_[:n_lda].reshape(-1, 5, 5), 0, 3) 
print(filter_basis_lda.shape)
plt.imshow(filter_basis_lda[:, :, 0])
plt.title('LDA Filter Basis Element 0')
plt.show()

In [None]:
# Graph all of the filters picked out by the LDA.

for i in range(filter_basis_lda.shape[-1]):
    plt.subplot(2, 4, i + 1)
    plt.imshow(filter_basis_lda[:, :, i])
    plt.title('Filter ' + str(i))

plt.suptitle('LDA Filter Basis')
plt.tight_layout()
plt.show()

In [None]:
# Get the layers for the model that uses the LDA filter basis for the first 2d convolution.

lda_layers = make_layers(filter_basis = [filter_basis_lda, filter_basis_harmonic],
                         n_filters = [32, 64],
                         input_shape = X['train'].shape[1:])

In [None]:
filename = 'low_dim_filter_lda.h5'
try:
    lda_model = tf.keras.models.load_model('saved_models/' + filename)
    print('Model automatically LOADED from file ' + filename)
except Exception: # A bare except would also catch KeyboardInterrupt.
    print('File saved_models/' + filename + ' can\'t be opened. Rebuilding and retraining model.')
    lda_model = tf.keras.Sequential(lda_layers)
    lda_model.compile(optimizer = 'adam',
                      loss='sparse_categorical_crossentropy',
                      metrics=['accuracy'])
    lda_model.fit(X['train'], y['train'], epochs = 5)
    lda_model.save('saved_models/' + filename)
    print('Model saved to saved_models/' + filename)

In [None]:
test_loss, test_acc = lda_model.evaluate(X['test'], y['test'])