<a href="https://colab.research.google.com/github/ezpogue/CS220Project/blob/main/CS220Tests.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
!git clone  https://github.com/ezpogue/CS220Project.git

Cloning into 'CS220Project'...
remote: Enumerating objects: 6, done.
remote: Counting objects: 100% (6/6), done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 6 (delta 0), reused 0 (delta 0), pack-reused 0
Receiving objects: 100% (6/6), done.


#Information

This notebook is where we have been testing various methods of implementing approximate arithmetic in our TensorFlow neural networks. Almost all of the code in this notebook comes either directly from the TensorFlow documentation or from ChatGPT/Google Bard.

#Set up

Set up TensorFlow

In [None]:
import tensorflow as tf
print("TensorFlow version:", tf.__version__)

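Load the dataset. Note that the code below loads the MNIST handwritten-digit dataset (tf.keras.datasets.mnist), not Fashion-MNIST, which would be tf.keras.datasets.fashion_mnist.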

In [None]:
mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
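The scaling step above can be illustrated on a tiny synthetic array. This is a minimal sketch, not part of the original notebook: fake_images is a hypothetical stand-in for the real image data, and the float32 cast is an assumption about what a hand-written matmul might need, since Keras weights default to float32 while the division produces float64.

```python
import numpy as np

# Hypothetical stand-in for a few grayscale pixel values (uint8, 0..255)
fake_images = np.array([[0, 128, 255]], dtype=np.uint8)

# Dividing by 255.0 maps the pixels into [0, 1]; NumPy promotes the result to float64
scaled = fake_images / 255.0

# Optional cast so the data matches the float32 default of Keras weights
scaled32 = scaled.astype(np.float32)
```

Inside a Keras layer the inputs are cast automatically, so the explicit cast mainly matters when calling a custom function on the raw arrays directly.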

#Custom functions

Small custom matmul function to verify custom functions work

In [None]:
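#Reference
def custom_matmul(a, b, transpose_a=False, transpose_b=False):
  # Forward the transpose flags so the wrapper behaves exactly like tf.matmul
  return tf.matmul(a, b, transpose_a=transpose_a, transpose_b=transpose_b)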

Attempt 1: This uses a custom matmul, which calls a custom reduce_sum, which in turn calls a custom sum. Compared with the later attempts, this is a roundabout way of achieving what we want, and it does not work properly. This is the point at which we switched from ChatGPT-generated functions to Google Bard-generated functions.

In [None]:
import numpy as np

def custom_matmul(a, b):
    # Ensure the inner dimensions match for matrix multiplication
    assert a.shape[-1] == b.shape[0], "Inner dimensions do not match for matrix multiplication"

    # Get shapes of input matrices
    a_shape = tf.shape(a)
    b_shape = tf.shape(b)

    # Reshape the matrices to 2D
    a_reshaped = tf.reshape(a, [-1, a_shape[-1]])
    b_reshaped = tf.reshape(b, [b_shape[0], -1])

    # Transpose matrix b
    b_transposed = tf.transpose(b_reshaped)

    # Perform element-wise multiplication
    elementwise_product = tf.expand_dims(a_reshaped, 2) * tf.expand_dims(b_transposed, 0)

    # Sum along the inner dimension to get the result
    result = custom_reduce_sum(elementwise_product, axis=1)

    # Reshape the result back to the original shape
    result = tf.reshape(result, tf.concat([a_shape[:-1], b_shape[1:]], axis=0))

    return result


def custom_reduce_sum(input_tensor, axis=None, keepdims=False):
    # Convert axis to a list for more flexible handling
    if axis is None:
        axis = list(range(len(input_tensor.shape)))
    elif not isinstance(axis, list):
        axis = [axis]

    # Determine the dimensions to reduce
    reduce_dims = []
    for a in axis:
        reduce_dims.append(input_tensor.shape[a])

    # Perform reduction by manually summing along the specified axis
    result = input_tensor.numpy()  # Convert to NumPy array for simplicity

    for dim in sorted(reduce_dims, reverse=True):
        result = custom_sum(result, axis=dim, keepdims=keepdims)

    return tf.constant(result)

def custom_sum(array, axis=None, keepdims=False):
    if axis is None:
        # Sum over all elements if axis is None
        result = 0
        for element in np.nditer(array):
            result += element
        return np.array(result) if keepdims else result

    # If axis is specified, manually sum along the specified axis
    axis = axis if isinstance(axis, tuple) else (axis,)
    reduce_dims = sorted(axis, reverse=True)

    result = array.copy()
    for dim in reduce_dims:
        # Manually sum along the specified axis
        for i in range(result.shape[dim]):
            result = np.add.reduceat(result, indices=i, axis=dim, keepdims=True)

    return result if keepdims else np.squeeze(result)
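One thing that appears to go wrong in Attempt 1 is the broadcasting step: after b is transposed, the two operands have shapes (m, n, 1) and (1, p, n), which only line up when n and p happen to match. The sketch below is an assumption about the intended computation, not the original code. It keeps b untransposed and sums over the shared axis; broadcast_matmul is a hypothetical name.

```python
import tensorflow as tf

def broadcast_matmul(a, b):
    """Matrix product via broadcasting: (m, n, 1) * (1, n, p), summed over n."""
    product = tf.expand_dims(a, 2) * tf.expand_dims(b, 0)  # shape (m, n, p)
    return tf.reduce_sum(product, axis=1)                   # shape (m, p)

a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])         # shape (2, 3)
b = tf.constant([[1.0, 0.0, 2.0, 1.0],
                 [0.0, 1.0, 1.0, 2.0],
                 [3.0, 1.0, 0.0, 1.0]])                      # shape (3, 4)

# With b transposed, the operand shapes are (2, 3, 1) and (1, 4, 3),
# which cannot be broadcast together.
try:
    tf.broadcast_static_shape(tf.TensorShape([2, 3, 1]), tf.TensorShape([1, 4, 3]))
    transposed_ok = True
except ValueError:
    transposed_ok = False

result = broadcast_matmul(a, b)
```

The custom reduce_sum in Attempt 1 also converts the tensor with .numpy(), which is only available in eager mode, so it would likely fail again once the function runs inside a compiled Keras model.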

Attempt 2: Manual matrix multiplication with the naive triple-loop method. This implementation breaks in the update step, where sum sometimes becomes an empty list. We are not sure why this happens.

In [None]:
#Almost works
#Sometimes sum = [] which breaks it
def custom_matmul(a, b, transpose_a=False, transpose_b=False):
  """
  Performs matrix multiplication with optional transpositions.

  Args:
    a: Tensor of shape (m, n).
    b: Tensor of shape (n, p).
    transpose_a: Boolean, whether to transpose a before multiplication.
    transpose_b: Boolean, whether to transpose b before multiplication.

  Returns:
    Tensor of shape (m, p).
  """

  a = tf.transpose(a) if transpose_a else a
  b = tf.transpose(b) if transpose_b else b

  #a = tf.cast(a, dtype=tf.float32)
  #b = tf.cast(b, dtype=tf.float32)

  m, n, p = tf.shape(a)[0], tf.shape(a)[1], tf.shape(b)[1]
  c = tf.zeros(shape=(m, p))

  for i in range(m):
    for j in range(p):
      sum = 0.0
      for k in range(n):
        sum += a[i, k] * b[k, j]
      updated_tensor = tf.tensor_scatter_nd_update(c, updates=sum, indices=[i,j], )
  return c
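A few details in Attempt 2 look suspicious: the result of tf.tensor_scatter_nd_update is stored in updated_tensor but never written back to c, the indices argument needs one row per update (shape (1, 2) here), and the name sum shadows the Python builtin. The following is a minimal sketch under those assumptions, with the hypothetical name naive_matmul. It assumes eager execution and statically known shapes, so it is meant for small test tensors rather than for use inside model.fit.

```python
import tensorflow as tf

def naive_matmul(a, b):
    """Naive triple-loop matrix product for small tensors with static shapes."""
    m, n = a.shape
    p = b.shape[1]
    c = tf.zeros((m, p), dtype=a.dtype)
    for i in range(m):
        for j in range(p):
            acc = tf.constant(0.0, dtype=a.dtype)
            for k in range(n):
                acc += a[i, k] * b[k, j]
            # indices has shape (1, 2), updates has shape (1,); keep the returned tensor
            c = tf.tensor_scatter_nd_update(
                c, indices=[[i, j]], updates=tf.reshape(acc, [1]))
    return c

a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.constant([[5.0, 6.0], [7.0, 8.0]])
c = naive_matmul(a, b)
```

Because tensors are immutable, each scatter update returns a new tensor, which is why the assignment back to c matters.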

Attempt 3: This one actually works, but no approximation has been introduced yet. It does what matmul() does, only more slowly, because it builds the full (m, n, p) intermediate tensor before reducing it.

In [3]:
#Works but doesn't really tell you anything

def custom_matmul(a, b, transpose_a=False, transpose_b=False):
  """
  Performs matrix multiplication with optional transpositions.

  Args:
    a: Tensor of shape (m, n).
    b: Tensor of shape (n, p).
    transpose_a: Boolean, whether to transpose a before multiplication.
    transpose_b: Boolean, whether to transpose b before multiplication.

  Returns:
    Tensor of shape (m, p).
  """

  a = tf.transpose(a) if transpose_a else a
  b = tf.transpose(b) if transpose_b else b

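  # Einsum equivalent of the product computed below
  #return tf.einsum("ij,jk->ik", a, b)

  # Alternative using element-wise multiplication and reduction:
  # (m, n, 1) * (1, n, p) broadcasts to (m, n, p); summing over the shared
  # axis n (axis=1) yields the (m, p) product
  c = a[..., None] * b[None, ...]
  return tf.reduce_sum(c, axis=1)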

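Attempt 4: Same as attempt 3, but with a custom reduce_sum function. This one breaks, and we did not track down the cause at the time. Two likely culprits are visible in the code: the call is written as tf.custom_reduce_sum, but the helper is defined in this notebook rather than in the tf namespace, and the helper ignores its axis argument, so it sums over the first two axes instead of reducing along one.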

In [None]:
#DOES NOT WORK
#Not sure the actual issue

def custom_matmul(a, b, transpose_a=False, transpose_b=False):
  """
  Performs matrix multiplication with optional transpositions.

  Args:
    a: Tensor of shape (m, n).
    b: Tensor of shape (n, p).
    transpose_a: Boolean, whether to transpose a before multiplication.
    transpose_b: Boolean, whether to transpose b before multiplication.

  Returns:
    Tensor of shape (m, p).
  """

  a = tf.transpose(a) if transpose_a else a
  b = tf.transpose(b) if transpose_b else b

  # Einsum summation for efficient matrix multiplication
  #return tf.einsum("mi,jk->mj", a, b)

  # Alternative using element-wise multiplication and reduction
  c = a[..., None] * b[None, ...]
  return tf.custom_reduce_sum(c, axis=-1)


#Does not work
def custom_reduce_sum(x, axis=None):
  sum_val = 0.0
  for i in range(x.shape[0]):
    for j in range(x.shape[1]):
      sum_val += x[i, j]
  # Adapt the loop based on your desired reduction axes
  return sum_val
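For comparison, here is a minimal sketch of a hand-written reduction that does respect the axis argument. It is not the original helper: loop_reduce_sum and loop_matmul are hypothetical names, and the sketch assumes the reduced axis has a statically known size, which tf.unstack requires.

```python
import tensorflow as tf

def loop_reduce_sum(x, axis):
    """Sum x along one axis by adding its slices one at a time."""
    slices = tf.unstack(x, axis=axis)   # needs a static size along `axis`
    total = tf.zeros_like(slices[0])
    for s in slices:
        total = total + s               # an approximate adder could be swapped in here
    return total

def loop_matmul(a, b):
    c = a[..., None] * b[None, ...]     # shape (m, n, p)
    return loop_reduce_sum(c, axis=1)   # shape (m, p)

a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
out = loop_matmul(a, b)
```

The helper is called directly by name, not through the tf module, and the explicit addition in the loop is the natural place to experiment with approximate addition.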

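Attempt 5: Same as attempt 3, but with an approximate multiplication function used instead of normal multiplication. This one does not work because the Tensor-to-string conversion breaks. One visible cause is that tf.strings.as_string does not accept base or padlen arguments; in addition, bitwise operations require integer tensors, while the layer inputs and weights are floats.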

In [None]:
#DOES NOT WORK
#Tried approximating tensor multiplication instead of exact


def custom_matmul(a, b, transpose_a=False, transpose_b=False):
  """
  Performs matrix multiplication with optional transpositions.

  Args:
    a: Tensor of shape (m, n).
    b: Tensor of shape (n, p).
    transpose_a: Boolean, whether to transpose a before multiplication.
    transpose_b: Boolean, whether to transpose b before multiplication.

  Returns:
    Tensor of shape (m, p).
  """

  a = tf.transpose(a) if transpose_a else a
  b = tf.transpose(b) if transpose_b else b

  # Einsum summation for efficient matrix multiplication
  #return tf.einsum("mi,jk->mj", a, b)

  # Alternative using element-wise multiplication and reduction
  c = approximate_tensor_multiplier(a[..., None], b[None, ...])
  return tf.reduce_sum(c, axis=-1)


def approximate_tensor_multiplier(a, b, n_bits=4):
  """
  Approximates multiplication of two tensors using a simplified circuit-like approach.

  Args:
    a: Tensor of shape (m, n).
    b: Tensor of shape (n, p).
    n_bits: Number of bits for binary representation (higher values improve accuracy).

  Returns:
    Tensor of shape (m, p) representing the approximated product.
  """

  # Convert integer elements to binary strings with padding
  a_bin = tf.strings.as_string(tf.math.bitwise.bitwise_and(a, 2**n_bits - 1), base=2, padlen=n_bits)
  b_bin = tf.strings.as_string(tf.math.bitwise.bitwise_and(b, 2**n_bits - 1), base=2, padlen=n_bits)

  # Initialize partial product tensor with zeros
  product = tf.zeros_like(tf.matmul(a, b))

  # Iterate over columns of the second tensor
  for i in range(b.shape[1]):
    # Extract current bit column from the second tensor
    current_bit = tf.strings.substr(b_bin, i, 1)
    current_bit = tf.cast(tf.strings.to_number(current_bit), tf.bool)

    # Add shifted first tensor elements based on the current bit
    shifted_a = tf.bitwise.left_shift(a[:, i], tf.range(n_bits, dtype=tf.int32)[::-1])
    product += tf.where(current_bit, shifted_a, tf.zeros_like(shifted_a))

  # Apply sign based on original signs
  sign_matrix = tf.ones_like(product)
  condition = tf.math.logical_and(tf.math.reduce_any(tf.math.less(a, 0), axis=1), tf.math.reduce_any(tf.math.less(b, 0), axis=0))
  sign_matrix = tf.where(condition, -1.0 * sign_matrix, sign_matrix)
  product *= sign_matrix

  return product
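A simpler way to get an approximate multiplier that stays inside float tensors is to round both operands to a coarse fixed-point grid before multiplying. This is a swapped-in technique, not the shift-and-add design attempted above, and all names here (quantize, approximate_multiply, approximate_matmul) are hypothetical. The sketch also assumes a straight-through estimator so that gradients can still flow during training.

```python
import tensorflow as tf

def quantize(x, n_bits=4):
    """Round x to a fixed-point grid with n_bits fractional bits."""
    scale = float(2 ** n_bits)
    q = tf.round(x * scale) / scale
    # Straight-through estimator: the forward pass uses q, while the backward
    # pass treats the rounding as the identity function.
    return x + tf.stop_gradient(q - x)

def approximate_multiply(a, b, n_bits=4):
    """Multiply after coarsening both operands (fewer bits means more error)."""
    return quantize(a, n_bits) * quantize(b, n_bits)

def approximate_matmul(a, b, n_bits=4):
    c = approximate_multiply(a[..., None], b[None, ...], n_bits)  # shape (m, n, p)
    return tf.reduce_sum(c, axis=1)                               # shape (m, p)

x = tf.Variable(0.3)
y = tf.constant(0.6)
with tf.GradientTape() as tape:
    prod = approximate_multiply(x, y, n_bits=2)
grad = tape.gradient(prod, x)
```

With n_bits=2 the operands 0.3 and 0.6 are rounded to 0.25 and 0.5, so the product is 0.125 instead of the exact 0.18; raising n_bits shrinks that error.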

#Models

The default neural network from the documentation

In [None]:
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10)
])

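Same as the default model, but with the first dense layer replaced by a custom layer. This one thresholds the matrix multiplication result using a logical OR with the bias. It missed the mark a bit, since it still calls matmul anyway. The thresholded output also has no gradient with respect to the weights, so this layer's weights are unlikely to be updated during training.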

In [None]:
# Custom layer with custom OR operation
class CustomLogicLayer(tf.keras.layers.Layer):
    def __init__(self, units=32):
        super(CustomLogicLayer, self).__init__()
        self.units = units

    def build(self, input_shape):
        # Define trainable weights for the layer
        self.w = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer="random_normal",
            trainable=True,
        )
        self.b = self.add_weight(
            shape=(self.units,),
            initializer="random_normal",
            trainable=True,
        )

    def call(self, inputs):
        # Custom OR operation: Check if the result of the matrix multiplication
        # is greater than 0 or if the bias term is greater than 0
        result = tf.math.logical_or(tf.matmul(inputs, self.w) > 0, self.b > 0)

        # Convert boolean values to float32 (0.0 or 1.0)
        result = tf.cast(result, tf.float32)
        return result

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  CustomLogicLayer(units=128),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10)
])
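The effect of the hard threshold on training can be checked in isolation. The sketch below is an illustration with made-up values, not part of the original notebook: it applies the same comparison-and-cast pattern to a single weight matrix and asks the gradient tape for the gradient.

```python
import tensorflow as tf

x = tf.constant([[1.0, -2.0]])
w = tf.Variable([[0.5], [0.1]])

with tf.GradientTape() as tape:
    # Same pattern as the OR layer: compare to zero, then cast the booleans to float
    hard = tf.cast(tf.matmul(x, w) > 0, tf.float32)
    loss = tf.reduce_sum(hard)

# The comparison is not differentiable, so there is no path from loss back to w
grad = tape.gradient(loss, w)
```

Here grad comes back as None, which suggests the optimizer would have nothing to apply to the weights of a layer built this way.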

Same as above, but with a plain matrix multiplication and no OR thresholding

In [None]:
# Custom layer with stock operation
class CustomLogicLayer(tf.keras.layers.Layer):
    def __init__(self, units=32):
        super(CustomLogicLayer, self).__init__()
        self.units = units

    def build(self, input_shape):
        # Define trainable weights for the layer
        self.w = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer="random_normal",
            trainable=True,
        )
        self.b = self.add_weight(
            shape=(self.units,),
            initializer="random_normal",
            trainable=True,
        )

    def call(self, inputs):
        # Plain matrix multiplication of the inputs with the weights.
        # Note: the bias self.b is created in build() but not applied here,
        # and no activation is used.
        result = tf.matmul(inputs, self.w)

        # Make sure the output is float32
        result = tf.cast(result, tf.float32)
        return result

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  CustomLogicLayer(units=128),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10)
])

Same as above, but with matmul replaced by custom_matmul. It uses whichever custom_matmul definition was executed most recently in the earlier section.

In [None]:
# Custom layer with custom matmul operation
class CustomLogicLayer(tf.keras.layers.Layer):
    def __init__(self, units=32):
        super(CustomLogicLayer, self).__init__()
        self.units = units

    def build(self, input_shape):
        # Define trainable weights for the layer
        self.w = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer="random_normal",
            trainable=True,
        )
        self.b = self.add_weight(
            shape=(self.units,),
            initializer="random_normal",
            trainable=True,
        )

    def call(self, inputs):
        # Matrix multiplication through the most recently defined custom_matmul.
        # Note: the bias self.b is created in build() but not applied here,
        # and no activation is used.
        result = custom_matmul(inputs, self.w)

        # Make sure the output is float32
        result = tf.cast(result, tf.float32)
        return result

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  CustomLogicLayer(units=128),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10)
])
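Since the layer picks up whichever custom_matmul happens to be defined, one alternative is to pass the multiplication function in explicitly. The sketch below is a suggestion, not the notebook's design: CustomMatmulDense and its matmul_fn parameter are hypothetical, and adding the bias and a ReLU is an assumption made so the layer is comparable to the Dense(128, activation='relu') layer it replaces.

```python
import tensorflow as tf

class CustomMatmulDense(tf.keras.layers.Layer):
    """Dense-like layer whose matrix product is supplied by the caller."""

    def __init__(self, units=32, matmul_fn=tf.matmul, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        self.matmul_fn = matmul_fn   # any function taking (inputs, weights)

    def build(self, input_shape):
        self.w = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer="random_normal",
            trainable=True,
        )
        self.b = self.add_weight(
            shape=(self.units,),
            initializer="zeros",
            trainable=True,
        )

    def call(self, inputs):
        # Apply the supplied product, then bias and ReLU like a standard Dense layer
        return tf.nn.relu(self.matmul_fn(inputs, self.w) + self.b)

layer = CustomMatmulDense(units=4)
x = tf.ones((2, 3))
y = layer(x)
```

Swapping in another implementation is then a matter of writing CustomMatmulDense(units=128, matmul_fn=some_function), which makes it explicit which attempt a given model is using.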

#Model Compilation

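Create logits and convert them to probabilities. The model returns one vector of logits (raw, unnormalized scores) per example, with one entry per class; softmax turns those scores into probabilities that sum to 1.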



In [None]:
predictions = model(x_train[:1]).numpy()
predictions

In [None]:
tf.nn.softmax(predictions).numpy()
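To make the conversion concrete, the sketch below computes softmax by hand on a made-up logits vector and compares it with tf.nn.softmax. The values are illustrative only and are not model output.

```python
import tensorflow as tf

logits = tf.constant([[2.0, 1.0, 0.1]])   # made-up scores for three classes

# Softmax by hand: subtract the row maximum for numerical stability,
# exponentiate, then normalize so each row sums to 1
shifted = logits - tf.reduce_max(logits, axis=-1, keepdims=True)
exp = tf.exp(shifted)
manual = exp / tf.reduce_sum(exp, axis=-1, keepdims=True)

builtin = tf.nn.softmax(logits)
```

Larger logits map to larger probabilities, and the ordering of the classes is preserved.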

Define loss function

In [None]:
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

In [None]:
loss_fn(y_train[:1], predictions).numpy()
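As a sanity check on the loss value, the sketch below feeds the loss an artificial "undecided" prediction with equal logits for all 10 classes. The inputs are made up for illustration; the point is that the loss equals the negative log of the probability assigned to the true class, so an untrained model should land near -log(1/10), roughly 2.3.

```python
import tensorflow as tf

logits = tf.zeros((1, 10))     # equal score for each of the 10 classes
label = tf.constant([3])       # arbitrary true class for the example

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
builtin = float(loss_fn(label, logits))

# By hand: negative log of the softmax probability of the true class
probs = tf.nn.softmax(logits)
manual = float(-tf.math.log(probs[0, 3]))
```

A first-batch loss far from this value would hint that a custom layer is producing badly scaled outputs.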

Compile the model

In [None]:
model.compile(optimizer='adam',
              loss=loss_fn,
              metrics=['accuracy'])

#Training and Evaluation

Fit the model to our training set. We train for 5 epochs; note that 5 is the value passed here, while the default for fit() is 1.

In [None]:
model.fit(x_train, y_train, epochs=5)

Evaluate the model on the test set. The unmodified default model comes out to about 97% accuracy.

In [None]:
model.evaluate(x_test,  y_test, verbose=2)