The following code illustrates some of the challenges that slowed progress at the start of this project. As more time was spent on the project, these challenges became easier to handle and, eventually, to avoid altogether. Within digits.py, the exact model is built using the following layer calls from tf.keras.layers.

In [2]:
%run "..\code\digits.py"

"""
inputs = tf.keras.Input(shape=(N, N))
reshape = tf.keras.layers.Reshape((N, N, 1))(inputs)
features = tf.keras.layers.Conv2D(10, [3, 3], [1, 1], 'same', activation='relu')(reshape)
counts = tf.keras.layers.Conv2D(17, [N, N], [1, 1], 'valid', activation='relu')(features)
outputs = tf.keras.layers.Dense(10, activation='relu')(counts)
exact_model = tf.keras.Model(inputs=inputs, outputs=outputs, name="exact_model")
"""

Here we create a random image as a NumPy array with shape (6,6).

In [8]:
rand_img = random_image()
print(rand_img)
print(rand_img.shape)

[[1. 1. 1. 1. 0. 0.]
 [0. 0. 0. 1. 0. 0.]
 [0. 1. 1. 1. 0. 0.]
 [0. 0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0. 0.]
 [0. 0. 1. 1. 0. 0.]]
(6, 6)


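The model M is the exact model, which should take our image as input. However, calling the model's predict() method on the image directly raises a ValueError. This is because the image has not been supplied with a _batch dimension_: Keras treats the first axis of its input as the batch axis, so the (6,6) array is read as a batch of rows of shape (6,), which the Reshape layer cannot turn into (6,6,1). To get the model to work, the image must be reshaped to shape (1,6,6).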

In [9]:
M = make_exact_model()
M.predict(rand_img, verbose=0)



ValueError: in user code:

    File "c:\Users\m0ode\AppData\Local\Programs\Python\Python310\lib\site-packages\keras\engine\training.py", line 1845, in predict_function  *
        return step_function(self, iterator)
    File "c:\Users\m0ode\AppData\Local\Programs\Python\Python310\lib\site-packages\keras\engine\training.py", line 1834, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "c:\Users\m0ode\AppData\Local\Programs\Python\Python310\lib\site-packages\keras\engine\training.py", line 1823, in run_step  **
        outputs = model.predict_step(data)
    File "c:\Users\m0ode\AppData\Local\Programs\Python\Python310\lib\site-packages\keras\engine\training.py", line 1791, in predict_step
        return self(x, training=False)
    File "c:\Users\m0ode\AppData\Local\Programs\Python\Python310\lib\site-packages\keras\utils\traceback_utils.py", line 67, in error_handler
        raise e.with_traceback(filtered_tb) from None
    File "c:\Users\m0ode\AppData\Local\Programs\Python\Python310\lib\site-packages\keras\layers\reshaping\reshape.py", line 111, in _fix_unknown_dimension
        raise ValueError(msg)

    ValueError: Exception encountered when calling layer "reshape_6" (type Reshape).
    
    total size of new array must be unchanged, input_shape = [6], output_shape = [6, 6, 1]
    
    Call arguments received by layer "reshape_6" (type Reshape):
      • inputs=tf.Tensor(shape=(None, 6), dtype=float32)


In [10]:
M.predict(rand_img.reshape(1, N, N), verbose=0)

array([[[[0., 0., 0., 1., 0., 0., 0., 0., 0., 0.]]]], dtype=float32)
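As a minimal sketch of the batch dimension, the cell below uses only NumPy and a stand-in array in place of `random_image()` (the zero-filled `img` is an assumption made for illustration, not the project's data). It shows a few equivalent ways to add the leading axis, and how several images would share it.

```python
import numpy as np

N = 6  # image side length, matching the (6, 6) image above

# Stand-in for random_image(): any (N, N) float array works for a shape check.
img = np.zeros((N, N), dtype=np.float32)

# Three equivalent ways to add a leading batch axis of length 1.
batched_reshape = img.reshape(1, N, N)
batched_newaxis = img[np.newaxis, ...]
batched_expand = np.expand_dims(img, axis=0)

print(batched_reshape.shape)
print(batched_newaxis.shape)
print(batched_expand.shape)

# Several images are stacked along the same leading axis.
batch_of_three = np.stack([img, img, img], axis=0)
print(batch_of_three.shape)
```

All three single-image forms have shape (1, 6, 6), and the stacked batch has shape (3, 6, 6).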

Furthermore, this input is immediately reshaped within the model to (1,6,6,1), where the last dimension is a _channel dimension_ for input to the convolution layer. This makes sense when convolution layers are used in sequence, as they are later in the model, because the output channels of one layer become the input channels of the next. It is not intuitive to a new user, however.

In [11]:
M1 = tf.keras.Model(inputs=M.input, outputs=M.layers[1].output)
M1.predict(rand_img.reshape(1, N, N), verbose=0).shape



(1, 6, 6, 1)
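To see how the channel dimension carries through the rest of the model, the shapes can be traced by hand. The sketch below is pure Python and does not call Keras. The helper `conv2d_output_shape` is a hypothetical name introduced here, and it applies the usual output-size rules for 'same' and 'valid' padding to the layer arguments listed in the model definition above.

```python
import math

def conv2d_output_shape(input_shape, filters, kernel_size, strides, padding):
    """Output shape of a channels-last 2D convolution, batch axis included."""
    batch, height, width, _in_channels = input_shape
    k_h, k_w = kernel_size
    s_h, s_w = strides
    if padding == "same":
        # Input is padded so that only the stride shrinks the output.
        out_h = math.ceil(height / s_h)
        out_w = math.ceil(width / s_w)
    elif padding == "valid":
        # No padding: the kernel must fit entirely inside the input.
        out_h = (height - k_h) // s_h + 1
        out_w = (width - k_w) // s_w + 1
    else:
        raise ValueError(f"unknown padding: {padding!r}")
    # The number of filters becomes the channel count of the output.
    return (batch, out_h, out_w, filters)

N = 6
input_shape = (1, N, N)                  # one image with a batch axis
reshape_shape = input_shape + (1,)       # Reshape((N, N, 1)) adds the channel axis
features_shape = conv2d_output_shape(reshape_shape, 10, (3, 3), (1, 1), "same")
counts_shape = conv2d_output_shape(features_shape, 17, (N, N), (1, 1), "valid")
outputs_shape = counts_shape[:-1] + (10,)  # Dense(10) acts on the last axis only

for name, shape in [("reshape", reshape_shape), ("features", features_shape),
                    ("counts", counts_shape), ("outputs", outputs_shape)]:
    print(name, shape)
```

The final shape, (1, 1, 1, 10), agrees with the four levels of nesting in the prediction printed earlier.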

The model M2 has layers up to the first convolution layer. The convolution layer contains 10 channels, each with a (3,3) matrix as a weights kernel and a single value as the bias. In the function definition we specify the weights in the intuitive manner, as a list of ten (3,3) kernels. However, the resulting weights shape of (10,3,3) in the code below is not what the layer expects, and inside the function it is converted to shape (3,3,1,10). Again, the extra dimension with value 1 is the input channel dimension.

In [15]:
M2 = tf.keras.Model(inputs=M.input, outputs=M.layers[2].output)
weights2 = [[[-1, -1,  0], [-1,  1,  1], [-1, -1,  0]],
            [[ 0, -1, -1], [ 1,  1, -1], [ 0, -1, -1]],
            [[-1, -1, -1], [-1,  1, -1], [ 0,  1,  0]],
            [[ 0,  1,  0], [-1,  1, -1], [-1, -1, -1]],
            [[-1, -1,  0], [-1,  1,  1], [ 0,  1, -1]],
            [[ 0, -1, -1], [ 1,  1, -1], [-1,  1,  0]],
            [[ 0,  1, -1], [-1,  1,  1], [-1, -1,  0]],
            [[-1,  1,  0], [ 1,  1, -1], [ 0, -1, -1]],
            [[-1,  1, -1], [-1,  1,  1], [-1,  1, -1]],
            [[-1,  1, -1], [ 1,  1, -1], [-1,  1, -1]]]
bias2 = [-1, -1, -1, -1, -2, -2, -2, -2, -3, -3]
weights2 = np.array(weights2)
bias2 = np.array(bias2)
print(weights2.shape)
print(bias2.shape)
M2.layers[2].set_weights([weights2, bias2])

(10, 3, 3)
(10,)


ValueError: Layer conv2d_12 weight shape (3, 3, 1, 10) is not compatible with provided weight shape (10, 3, 3).
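How digits.py performs this conversion is not shown here. One way to do it, sketched below with NumPy only, is to move the kernel axis to the end and then insert the input-channel axis. The `kernels` array is a stand-in filled with distinct values (an assumption for illustration) so that any scrambling is easy to spot. The sketch also shows why a bare `reshape` to the target shape is not enough: it produces the right shape but mixes values from different kernels.

```python
import numpy as np

# Kernels written the readable way: one (3, 3) matrix per output channel.
# Distinct values make it obvious if entries end up in the wrong place.
kernels = np.arange(10 * 3 * 3, dtype=np.float32).reshape(10, 3, 3)

# Move the kernel axis to the end -> (3, 3, 10), then insert the
# input-channel axis of length 1 -> (3, 3, 1, 10).
keras_layout = np.transpose(kernels, (1, 2, 0))[:, :, np.newaxis, :]
print(keras_layout.shape)

# Every kernel survives intact at index [:, :, 0, k].
intact = all(np.array_equal(keras_layout[:, :, 0, k], kernels[k])
             for k in range(10))
print(intact)

# A bare reshape gives the same shape but interleaves the kernels.
scrambled = kernels.reshape(3, 3, 1, 10)
print(np.array_equal(scrambled[:, :, 0, 0], kernels[0]))
```

An array in this layout, together with the (10,) bias, has the shapes that `set_weights` reported as expected in the error above.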

Rebuilding M2, we can see that because the weights are stored with shape (3,3,1,10), viewing the first (3,3) kernel requires taking a slice of the weights array. The slice [:,:,:,0] still has shape (3,3,1), so it must be reshaped again to display in a readable format that we can use to check for correctness.

In [22]:
M2 = tf.keras.Model(inputs=M.input, outputs=M.layers[2].output)
layer2_W = M2.layers[2].get_weights()[0][:,:,:,0]
print(layer2_W)
print(layer2_W.reshape((3,3)))

[[[-1.]
  [-1.]
  [ 0.]]

 [[-1.]
  [ 1.]
  [ 1.]]

 [[-1.]
  [-1.]
  [ 0.]]]
[[-1. -1.  0.]
 [-1.  1.  1.]
 [-1. -1.  0.]]
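The extra reshape can be avoided by indexing the input-channel axis with an integer instead of a slice. The sketch below uses NumPy only, with a stand-in array `W` in place of `M2.layers[2].get_weights()[0]` (an assumption for illustration). It also recovers all ten kernels in the readable (10,3,3) layout at once.

```python
import numpy as np

# Stand-in for M2.layers[2].get_weights()[0], which has shape (3, 3, 1, 10).
W = np.arange(3 * 3 * 1 * 10, dtype=np.float32).reshape(3, 3, 1, 10)

# Slicing the input-channel axis with ":" keeps it, giving shape (3, 3, 1).
sliced = W[:, :, :, 0]
print(sliced.shape)

# Indexing that axis with 0 drops it, giving the (3, 3) kernel directly.
first_kernel = W[:, :, 0, 0]
print(first_kernel.shape)
print(np.array_equal(first_kernel, sliced.reshape(3, 3)))

# Drop the input-channel axis and move the kernel axis to the front
# to get every kernel back in the readable (10, 3, 3) layout.
all_kernels = np.moveaxis(W[:, :, 0, :], -1, 0)
print(all_kernels.shape)
```

Here `all_kernels[k]` is the same matrix as `W[:, :, 0, k]`, so each kernel can be checked against the list used to define the weights.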
