# Lesson 2: Neural Net Computations

In the previous assignment we used Keras to train a neural network. In this assignment you will build your own minimal neural net library. The basic structure is given to you; you will need to fill in details such as the weight updates for backpropagation. Then you will test the network by training it to learn the XOR function.

Read through the class definitions below first to understand the basic architecture.

Then you should add code as necessary where marked "TODO" in the code below and remove the NotImplementedError exceptions.

## Reference 

- [Towards Data Science](https://towardsdatascience.com/coding-neural-network-forward-propagation-and-backpropagtion-ccf8cf369f76)
- [Derivative Sigmoid](https://beckernick.github.io/sigmoid-derivative-neural-network/)
- [Back Propagation](https://ml-cheatsheet.readthedocs.io/en/latest/backpropagation.html)



In [15]:
import numpy as np

## Define a Neural Network Class

In [16]:
class NNet():
    """Implements a basic feedforward neural network."""
    
    def __init__(self):
        self._layers = []  # An ordered list of layers. The first layer is the input; the final is the output.
    
    def _add_layer(self, layer):
        if self._layers:
            # Update pointers. We keep a doubly-linked-list of layers for convenience.
            prev_layer = self._layers[-1]
            prev_layer.set_next_layer(layer)
            layer.set_prev_layer(prev_layer)
            
        self._layers.append(layer)
    
    def add_input_layer(self, size, **kwargs):
        assert type(size).__name__ == 'int', ('Input layer requires integer size. Type was %s instead.' 
                                              % type(size).__name__)
        layer = InputLayer(size=size, **kwargs)
        self._add_layer(layer)

    def add_dense_layer(self, size, **kwargs):
        assert type(size).__name__ == 'int', ('Dense layer requires integer size. Type was %s instead.' 
                                              % type(size).__name__)
        # Find the previous layer's size.
        prev_size = self._layers[-1].size()
        layer = DenseLayer(shape=(prev_size, size), **kwargs)
        self._add_layer(layer)

    def summary(self, verbose=False):
        """Prints a description of the model."""
        for i, layer in enumerate(self._layers):
            print('%d: %s' % (i, str(layer)))
            if verbose:
                print('weights:', layer.get_weights())
                if layer._use_bias:
                    print('bias:', layer._bias)
                print()

        
    def predict(self, x):
        """Given an input vector x, run it through the neural network and return the output vector."""
        assert isinstance(x, np.ndarray)
        
        # TODO
        raise NotImplementedError()

        
    def train_single_example(self, X_data, y_data, learning_rate=0.01):
        """Train on a single example. X_data and y_data must be numpy arrays."""
        
        assert isinstance(X_data, np.ndarray)
        assert isinstance(y_data, np.ndarray)

        # Forward propagation.
        
        # TODO
        
        
        # Backpropagation.
        
        # TODO
        raise NotImplementedError()

    

    def train(self, X_data, y_data, learning_rate, num_epochs, randomize=True, verbose=True, print_every_n=100):
        """Both X_data and y_data should be ndarrays. One example per row.
        
        This function takes the data and learning rate, and trains the network for num_epochs passes over the 
        complete data set. 
        
        If randomize==True, the X_data and y_data should be randomized at the start of each epoch. Of course,
        matching X,y pairs should have matching indices after randomization, to avoid scrambling the dataset.
        (E.g., a set of indices should be randomized once and then applied to both X and y data.)
        
        If verbose==True, will print a status report every print_every_n epochs with these
        results:
        
        * Results of running "predict" on each example in the training set
        * MSE (mean squared error) on the dataset
        * Accuracy on the dataset
        """
        assert isinstance(X_data, np.ndarray)
        assert isinstance(y_data, np.ndarray)
        assert X_data.shape[0] == y_data.shape[0]

        # TODO
        raise NotImplementedError()

    
    def compute_mean_squared_error(self, X_data, y_data):
        """Given input X_data and target y_data, compute and return the mean squared error."""
        assert isinstance(X_data, np.ndarray)
        assert isinstance(y_data, np.ndarray)
        assert X_data.shape[0] == y_data.shape[0]
        
        mse = 0
        
        # TODO
        raise NotImplementedError()

        
        return mse
    
    def compute_accuracy(self, X_data, y_data):
        """Given input X_data and target y_data, convert outputs to binary using a threshold of 0.5
        and return the accuracy: # examples correct / total # examples."""
        assert isinstance(X_data, np.ndarray)
        assert isinstance(y_data, np.ndarray)
        assert X_data.shape[0] == y_data.shape[0]
        
        correct = 0
        for i in range(len(X_data)):
            outputs = self.predict(X_data[i])
            outputs = outputs > 0.5
            if np.all(outputs == y_data[i]):  # np.all also handles networks with more than one output node.
                correct += 1
        acc = float(correct) / len(X_data)
        return acc
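
The `train` docstring asks for X and y to be shuffled together at the start of each epoch. A minimal sketch of that idea follows, assuming NumPy's `Generator.permutation`; the helper name `shuffled_pairs` and the seed are illustrative and not part of the assignment's API.

```python
import numpy as np

def shuffled_pairs(X_data, y_data, rng):
    """Reorder X and y with one shared random permutation of the row indices."""
    indices = rng.permutation(X_data.shape[0])
    return X_data[indices], y_data[indices]

X = np.array([[0, 0], [1, 0], [0, 1], [1, 1]])
y = np.array([[0, 1, 1, 0]]).T

rng = np.random.default_rng(0)  # Seed chosen only to make the run repeatable.
X_shuf, y_shuf = shuffled_pairs(X, y, rng)

# Each shuffled row should still be paired with its own XOR label.
for x_row, y_row in zip(X_shuf, y_shuf):
    print(x_row, y_row, y_row[0] == (x_row[0] ^ x_row[1]))
```

Because the same index array is applied to both arrays, the (x, y) pairs stay aligned no matter which order is drawn.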

## Define activation functions

In [17]:
class Activation():  # Do not edit; update derived classes.
    """Base class that represents an activation function and knows how to take its own derivative."""
    def __init__(self, name):
        self.name = name
    
    def activate(self, x):
        """x is a scalar or a numpy array. Returns the output y, the result of applying the function to input x."""
        raise NotImplementedError()
    
    def derivative_given_y(self, y):
        """y is a scalar or a numpy array. 
        
        Returns the derivative d(f)/dx given the *activation* value y."""
        raise NotImplementedError()

In [41]:
class IdentityActivation(Activation):
    """Activation function that passes input through unchanged."""
    
    def __init__(self):
        super().__init__(name='Identity')
    
    def activate(self, x):
        """x is a scalar or a numpy array. Returns the output y, the result of applying the function to input x."""
        return x
    
    def derivative_given_y(self, y):
        """y is a scalar or a numpy array. 
        
        Returns the derivative d(f)/dx given the *activation* value y."""
        return 1
    
    
class SigmoidActivation(Activation):
    """Sigmoid activation function."""

    def __init__(self):
        super().__init__(name='Sigmoid')
    
    def activate(self, x):
        """x is a scalar or a numpy array. Returns the output y, the result of applying the function to input x."""
        # np.exp is applied elementwise, so the same expression handles scalars and arrays.
        return 1 / (1 + np.exp(-x))

    
    def derivative_given_y(self, y):
        """y is a scalar or a numpy array. 
        
        Returns the derivative d(f)/dx given the *activation* value y."""
        # y is already sigmoid(x), so the derivative is y * (1 - y); do not re-apply the sigmoid.
        return y * (1 - y)



In [45]:
X_data = np.array([[0,0],[1,0],[0,1],[1,1]])

sigmoid = SigmoidActivation()
print(sigmoid.activate(X_data))
print(sigmoid.activate(1))
# derivative_given_y expects activation values, so pass the sigmoid outputs.
print(sigmoid.derivative_given_y(sigmoid.activate(X_data)))
print(sigmoid.derivative_given_y(sigmoid.activate(1)))

[[0.5        0.5       ]
 [0.73105858 0.5       ]
 [0.5        0.73105858]
 [0.73105858 0.73105858]]
0.7310585786300049
[[0.25       0.25      ]
 [0.19661193 0.25      ]
 [0.25       0.19661193]
 [0.19661193 0.19661193]]
0.19661193324148185
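
One way to sanity-check a derivative written in terms of the activation value is to compare it with a central finite difference. The sketch below is standalone and uses its own hypothetical helper functions rather than the classes above; the step size and tolerance are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative_given_y(y):
    """Derivative of the sigmoid expressed through its output y = sigmoid(x)."""
    return y * (1.0 - y)

x = np.array([-2.0, 0.0, 1.0, 3.0])
y = sigmoid(x)

eps = 1e-6  # Finite-difference step (an arbitrary small value).
numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)
analytic = sigmoid_derivative_given_y(y)

print(numeric)
print(analytic)
print(np.allclose(numeric, analytic, atol=1e-8))
```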


## Define a method to initialize neural net weights

In [25]:
from random import uniform 

def WeightInitializer():
    """Return a random initial weight, drawn uniformly from the range -1 to 1."""
    return uniform(-1,1)
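
As an alternative to drawing one weight at a time, a whole weight matrix can be filled in a single call. This is a sketch assuming NumPy's `Generator.uniform`; the function name `uniform_weights` and the seed are hypothetical, and the library above still expects a zero-argument initializer.

```python
import numpy as np

def uniform_weights(shape, rng, low=-1.0, high=1.0):
    """Return an array of the given shape with entries drawn uniformly from [low, high)."""
    return rng.uniform(low, high, size=shape)

rng = np.random.default_rng(42)  # Seed used only for repeatability.
W = uniform_weights((2, 3), rng)  # Weights from 2 inputs to 3 outputs.
b = uniform_weights(3, rng)       # One bias per output node.

print(W.shape, b.shape)
```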


## Define a neural net Layer base class

In [20]:
class Layer():
    """Base class for NNet layers. DO NOT MODIFY THIS CLASS. Update derived classes instead.
    
    Conceptually, in this library a Layer consists at a high level of:
      * a collection of weights (a 2D numpy array)
      * the output nodes that come after the weights above
      * the activation function that is applied to the summed signals in these output nodes
      
    So a Layer isn't just nodes -- it's weights as well as nodes.
      
    Specifically, to send signal forward through a 3-layer network, we start with an Input Layer that does
    very little.  The outputs from the Input layer are simply the fed-in input data.  
    
    Then, the next layer will be a Dense layer that holds the weights from the Input layer to the first hidden
    layer and stores the activation function to be used after doing a product of weights and Input-Layer
    outputs.
    
    Finally, another Dense layer will hold the weights from the hidden to the output layer nodes, and stores
    the activation function to be applied to the final output nodes.
    
    For a typical 1-hidden layer network, then, we would have 1 Input layer and 2 Dense layers.
    
    Each Layer also has functions to perform the forward-pass and backpropagation steps for the weights/nodes
    associated with the layer.
    
    Finally, each Layer stores pointers to the previous and next layers, for convenience when implementing
    backprop.
    """
   
    def __init__(self, shape, use_bias, activation_function=IdentityActivation, weight_initializer=None, name=''):
        # These are the weights from the *previous* layer to the current layer.
        self._weights = None
        
        # Tuple of (# inputs, # outputs) for Dense layers or just a scalar for an input layer.
        assert type(shape).__name__ == 'int' or type(shape).__name__ == 'tuple', (
            'shape must be scalar or a 2-element tuple')
        if type(shape).__name__ == 'tuple':
            assert len(shape)==2, 'shape must be 2-dimensional. Was %d instead' % len(shape)
        self._shape = shape 
    
        # True to use a bias node that inputs to each node in this layer; False otherwise.
        self._use_bias = use_bias
        
        if use_bias:
            bias_size = shape[-1] if isinstance(shape, tuple) else shape  # len() is undefined for an int shape.
            self._bias = np.zeros(bias_size)
            if weight_initializer:
                for i in range(bias_size):
                    self._bias[i] = weight_initializer()
        
        # Activation function to be applied to each dot product of weights with inputs.
        # Instantiate an object of this class.
        self._activation_function = activation_function() if activation_function else None
        
        # Method used to initialize the weights in this Layer at creation time.
        self._weight_initializer = weight_initializer
        
        # Layer name (optional)
        self._name = name
        
        # Calculated output vector from the most recent feed_forward(inputs) call.
        self._outputs = None
        
        # Doubly linked list pointers to neighbor layers.
        self._prev_layer = None  # Previous layer is closer to (or is) the input layer.
        self._next_layer = None  # Next layer is closer to (or is) the output layer.
    
    def set_prev_layer(self, layer):
        """Set pointer to the previous layer."""
        self._prev_layer = layer
    
    def set_next_layer(self, layer):
        """Set pointer to the next layer."""
        self._next_layer = layer
    
    def size(self):
        """Number of nodes in this layer."""
        if type(self._shape).__name__ == 'tuple':
            return self._shape[-1]
        else:
            return self._shape
        
    def get_weights(self):
        """Return a numpy array of the weights for inputs to this layer."""
        return self._weights
    
    def get_bias(self):
        """Return a numpy array of the bias for nodes in this layer."""
        return self._bias
    
    def feed_forward(self, inputs):
        """Feed the given inputs through the input weights and activation function, and set the outputs vector.
        
        Also returns the outputs vector for convenience."""
        raise NotImplementedError()
        
    def backpropagate(self, error, learning_rate):
        """Adjusts the weights coming into this layer based on the given output error vector.
        
        For the output layer, the "error" vector should be a list of output errors, y_k - t_k.
        For a hidden layer, the "error" vector should be a list of the delta values from the following layer, such as delta_z_k
        
        Returns a list of the delta values for each node in this layer. These deltas can be used as the error
        values when calling backpropagate on the previous layer."""
        raise NotImplementedError()
        
    def __str__(self):
        activation_fxn_name = self._activation_function.name if self._activation_function else None
        return '[%s] shape %s, use_bias=%s, activation=%s' % (self._name, self._shape, self._use_bias,
                                                              activation_fxn_name)
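
The forward pass described in the `Layer` docstring (Input layer, then two Dense layers) can be sketched in plain NumPy. The weights below are hand-picked for illustration, not trained and not taken from this notebook; they use the `(# inputs, # outputs)` shape convention, and the function names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hand-picked weights: hidden node 0 acts like OR, hidden node 1 like NAND,
# and the output node like AND of the two hidden nodes.
W_hidden = np.array([[20.0, -20.0],
                     [20.0, -20.0]])   # shape (2 inputs, 2 hidden nodes)
b_hidden = np.array([-10.0, 30.0])
W_output = np.array([[20.0],
                     [20.0]])          # shape (2 hidden nodes, 1 output node)
b_output = np.array([-30.0])

def forward(x):
    """Send one input vector through the hidden layer and then the output layer."""
    hidden = sigmoid(np.dot(x, W_hidden) + b_hidden)
    return sigmoid(np.dot(hidden, W_output) + b_output)

X_data = np.array([[0, 0], [1, 0], [0, 1], [1, 1]])
outputs = np.array([forward(x) for x in X_data])
print(np.round(outputs, 3).ravel())
```

With these weights the four outputs should come out close to 0, 1, 1, 0, which shows that a 2-2-1 sigmoid network can represent XOR even before any training code exists.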

### Define InputLayer and DenseLayer derived classes

The DenseLayer class is where most of the computation happens.

In [21]:
class InputLayer(Layer):
    """A neural network 1-dimensional input layer."""
    
    def __init__(self, size, name='Input'):
        assert type(size).__name__ == 'int', 'Input size must be integer. Was %s instead' % type(size).__name__
        super().__init__(shape=size, use_bias=False, name=name, activation_function=None)
    
    def feed_forward(self, inputs):
        assert len(inputs)==self._shape, 'Inputs must be of size %d; was %d instead' % (self._shape, len(inputs))
        self._outputs = inputs
        return self._outputs

    def backpropagate(self, error, learning_rate):
        return None  # Nothing to do.

In [48]:
class DenseLayer(Layer):
    """A neural network layer that is fully connected to the previous layer."""
    
    def __init__(self, shape, use_bias=True, name='Dense', **kwargs):
        super().__init__(shape=shape, use_bias=use_bias, name=name, **kwargs)
        
        self._weights = np.zeros(shape)
        if self._weight_initializer:
            for i in range(shape[0]):
                for j in range(shape[1]):
                    self._weights[i,j] = self._weight_initializer()
    
    def feed_forward(self, inputs):
        """Feed the given inputs through the input weights and activation function, and set the outputs vector.
        
        Also returns the outputs vector for convenience."""
        
        n_inputs = self._shape[0]
        assert len(inputs) == n_inputs, 'Inputs must be of size %d; was %d instead' % (n_inputs, len(inputs))

        # Weights have shape (# inputs, # outputs), so the inputs multiply from the left.
        output = np.dot(inputs, self.get_weights())
        if self._use_bias:
            output = output + self.get_bias()
        output = self._activation_function.activate(output)
        
        # Update output vector for later use, and return it.
        self._outputs = output 
        return self._outputs
        
    def backpropagate(self, error, learning_rate):
        """Adjusts the weights coming into this layer based on the given output error vector.
        
        For the output layer, the "error" vector should be a list of output errors, y_k - t_k.
        For a hidden layer, the "error" vector should be a list of the delta values from the following layer, such as delta_z_k
        
        Returns a list of the delta values for each node in this layer. These deltas can be used as the error
        values when calling backpropagate on the previous layer."""
        assert isinstance(error, np.ndarray)
        assert isinstance(self._prev_layer._outputs, np.ndarray)
        assert isinstance(self._outputs, np.ndarray)  
        
        # Compute deltas. 
        deltas = None
        # TODO
        raise NotImplementedError()
          
        
        # Compute gradient.
        # TODO
        
        
        # Adjust weights.
        # TODO
        # self._weights = self.get_weights() - learning_rate * <gradient>  # Complete once the gradient is computed.
        
        
        # Adjust bias weights.
        if self._use_bias:
            # TODO
            pass
            
        return deltas
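
The delta and gradient steps named in the TODOs can be checked numerically before they are wired into the class. The sketch below is one possible formulation, assuming a squared-error loss E = 0.5 * sum((y - t)^2), a sigmoid activation, and weights of shape `(# inputs, # outputs)`; the function names are hypothetical, and returning the propagated error `W @ delta` is a design choice of this sketch rather than the interface documented above.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W, b):
    return sigmoid(x @ W + b)

def loss(x, W, b, t):
    return 0.5 * np.sum((forward(x, W, b) - t) ** 2)

def dense_gradients(x, W, b, t):
    """Analytic gradients for one sigmoid dense layer under the squared-error loss."""
    y = forward(x, W, b)
    error = y - t                    # dE/dy for the output layer
    delta = error * y * (1.0 - y)    # dE/dz, using the sigmoid derivative y * (1 - y)
    grad_W = np.outer(x, delta)      # shape (# inputs, # outputs)
    grad_b = delta
    upstream_error = W @ delta       # dE/dx, the error signal for the previous layer
    return grad_W, grad_b, upstream_error

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=3)
W = rng.uniform(-1, 1, size=(3, 2))
b = rng.uniform(-1, 1, size=2)
t = np.array([0.0, 1.0])

grad_W, grad_b, upstream_error = dense_gradients(x, W, b, t)

# Central finite differences over every weight, for comparison.
eps = 1e-6
numeric_W = np.zeros_like(W)
for i in range(W.shape[0]):
    for j in range(W.shape[1]):
        W_plus, W_minus = W.copy(), W.copy()
        W_plus[i, j] += eps
        W_minus[i, j] -= eps
        numeric_W[i, j] = (loss(x, W_plus, b, t) - loss(x, W_minus, b, t)) / (2 * eps)

print(np.allclose(grad_W, numeric_W, atol=1e-7))
```

A gradient-descent update would then subtract `learning_rate * grad_W` from the weights and `learning_rate * grad_b` from the bias.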
        

# Train a neural net

## Create a dataset for the XOR problem

In [49]:
X_data = np.array([[0,0],[1,0],[0,1],[1,1]])
y_data = np.array([[0,1,1,0]]).T
print(X_data)
print(y_data)
print(type(X_data))

[[0 0]
 [1 0]
 [0 1]
 [1 1]]
[[0]
 [1]
 [1]
 [0]]
<class 'numpy.ndarray'>


## Create a neural network using the library.

In [52]:
nnet = NNet()
nnet.add_input_layer(2)
nnet.add_dense_layer(2, weight_initializer=WeightInitializer, activation_function=SigmoidActivation)
nnet.add_dense_layer(1, weight_initializer=WeightInitializer, activation_function=SigmoidActivation, name='Output')
nnet.summary()

0: [Input] shape 2, use_bias=False, activation=None
1: [Dense] shape (2, 2), use_bias=True, activation=Sigmoid
2: [Output] shape (2, 1), use_bias=True, activation=Sigmoid


In [51]:
nnet.summary(verbose=True)

0: [Input] shape 2, use_bias=False, activation=None
weights: None

1: [Dense] shape (2, 2), use_bias=True, activation=Sigmoid
weights: [[ 0.75597926 -0.22474251]
 [-0.26396281 -0.03155437]]
bias: [0.95909984 0.71383101]

2: [Output] shape (2, 1), use_bias=True, activation=Sigmoid
weights: [[ 0.76506915]
 [-0.26753383]]
bias: [0.59856937]



# Train the network

In [None]:
# TODO
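
For orientation, here is a standalone sketch of per-example training on XOR with a 2-2-1 sigmoid network. It does not use the `NNet` class; the function names, learning rate, epoch count, and seed are all assumptions, and a network this small may settle in a poor local minimum for some seeds, so the final accuracy is not guaranteed.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, params):
    W1, b1, W2, b2 = params
    h = sigmoid(x @ W1 + b1)
    y = sigmoid(h @ W2 + b2)
    return h, y

def mean_squared_error(X, T, params):
    Y = np.array([forward(x, params)[1] for x in X])
    return float(np.mean((Y - T) ** 2))

def accuracy(X, T, params):
    Y = np.array([forward(x, params)[1] for x in X])
    return float(np.mean((Y > 0.5) == (T > 0.5)))

X = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float)
T = np.array([[0, 1, 1, 0]], dtype=float).T

def train_xor(num_epochs=2000, learning_rate=0.5, hidden=2, seed=0, print_every_n=500):
    rng = np.random.default_rng(seed)
    params = [rng.uniform(-1, 1, size=(2, hidden)), rng.uniform(-1, 1, size=hidden),
              rng.uniform(-1, 1, size=(hidden, 1)), rng.uniform(-1, 1, size=1)]
    history = []
    for epoch in range(num_epochs):
        for i in rng.permutation(len(X)):      # One shared shuffle keeps x and t paired.
            x, t = X[i], T[i]
            h, y = forward(x, params)
            delta_out = (y - t) * y * (1 - y)
            # Use the output weights before they are updated.
            delta_hid = (params[2] @ delta_out) * h * (1 - h)
            params[2] -= learning_rate * np.outer(h, delta_out)
            params[3] -= learning_rate * delta_out
            params[0] -= learning_rate * np.outer(x, delta_hid)
            params[1] -= learning_rate * delta_hid
        history.append(mean_squared_error(X, T, params))
        if print_every_n and (epoch + 1) % print_every_n == 0:
            print('epoch %d: mse=%.4f accuracy=%.2f' % (epoch + 1, history[-1], accuracy(X, T, params)))
    return params, history

params, history = train_xor()
```

The same loop structure (shuffle, forward pass, backpropagate, periodic report) maps onto `train`, `train_single_example`, `compute_mean_squared_error`, and `compute_accuracy` in the library above.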

## Print the resulting neural net weights.

In [None]:
nnet.summary(verbose=True)