**`Simple Neural Network Implementations`**

**Model 1:** We are given a set of instances with numerical attributes, and a numerical label/target value (i.e. ground truth) for each instance. Our goal is to create a neural network that can predict the label for any given instance. In our simplest neural network model, the prediction is just a linear combination of the attributes. The network is trained by optimizing the weights (i.e. the constant coefficients) of this linear combination using gradient descent.
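
As a rough sketch of what one gradient-descent update looks like for a single instance, assume the squared error that the code further down uses. The symbols are notation introduced here for illustration, not part of the original text: $\mathbf{x}$ is the attribute vector, $\mathbf{w}$ the weight vector, $t$ the target, $p$ the prediction, $E$ the error and $\alpha$ the learning rate.

$$
\begin{aligned}
p &= \mathbf{w} \cdot \mathbf{x} = \sum_{k=1}^{n} w_k x_k \\
E &= (p - t)^2 \\
\frac{\partial E}{\partial w_k} &= 2\,(p - t)\,x_k \\
w_k &\leftarrow w_k - \alpha\,\frac{\partial E}{\partial w_k}
\end{aligned}
$$

The third line follows from the chain rule, since $\partial p / \partial w_k = x_k$. A weight whose attribute is 0 for a given instance is therefore left unchanged by that instance's update.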

To demonstrate this model, we will use the "traffic lights" example: there are three traffic lights, the state of each light is either `on` or `off` (i.e. 1 or 0), and the corresponding label is either walk or stop (1 or 0). The training dataset is contrived so that the second attribute/light is strongly correlated with the target; in the six training instances used here, the second light's state always equals the label. We would therefore expect the second weight to be much larger than the other two weights after the model has been trained sufficiently.
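
The expectation above can be checked independently of gradient descent. The following is a minimal sketch, not part of the original notebook: it measures how often each light agrees with the label and computes closed-form least-squares weights with `np.linalg.lstsq` as a reference point. The names `X`, `y`, `agreement` and `w_ref` are introduced here for illustration only.

```python
import numpy as np

# Same dataset as in the example: three light states followed by the label.
traffic_lights = np.array([[1, 0, 1, 0],
                           [0, 1, 1, 1],
                           [0, 0, 1, 0],
                           [1, 1, 1, 1],
                           [0, 1, 1, 1],
                           [1, 0, 1, 0]], dtype=float)

X = traffic_lights[:, :-1]   # attributes, one row per instance
y = traffic_lights[:, -1]    # labels

# Fraction of instances in which each light's state equals the label.
agreement = (X == y[:, None]).mean(axis=0)
print("Agreement of each light with the label:", agreement)

# Closed-form least-squares weights, used only as a reference solution
# (this is not how the network in this section is trained).
w_ref, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)
print("Least-squares reference weights:", np.round(w_ref, 6))
```

On this dataset the second light agrees with the label in every instance, and the least-squares reference weights are approximately `[0, 1, 0]`, which is the point the gradient-descent weights should drift toward as training proceeds.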

In [9]:
import numpy as np

# traffic lights dataset (each row is an instance; the first three columns are the attributes and the last column is the label)
traffic_lights = np.array([ [1, 0, 1, 0], 
                            [0, 1, 1, 1],
                            [0, 0, 1, 0],
                            [1, 1, 1, 1],
                            [0, 1, 1, 1],
                            [1, 0, 1, 0]] )

# number of gradient descent iterations
niters = 30

# learning rate (i.e. gradient descent step size)
alpha = 0.1

# initialize random weights
weights = np.random.randn(3) 
print(f"Initial weights: {weights}")

# train the network
for i in range(niters):

    total_error = 0.0
    for j in range(traffic_lights.shape[0]):
        
        # split the row into attributes and label
        # (named `attributes` to avoid shadowing the built-in `input`)
        attributes = traffic_lights[j, :-1]
        target = traffic_lights[j, -1]

        # compute prediction
        prediction = np.dot(weights, attributes)

        # compute squared error
        error = (prediction - target)**2
        total_error += error

        # compute gradient of error w.r.t. weights
        grad = 2 * (prediction - target) * attributes

        # optimize weights using gradient descent
        weights -= alpha * grad

    print(f"Iteration# {i+1}, Updated weights: {weights}, Total error: {total_error}")



Initial weights: [-1.02969712 -0.46599741  0.66143296]
Iteration# 1, Updated weights: [-0.71679545 -0.00835358  0.94155586], Total error: 4.0545463507026325
Iteration# 2, Updated weights: [-0.62875226  0.18567509  0.83731934], Total error: 2.077425957169878
Iteration# 3, Updated weights: [-0.55665943  0.32462719  0.72018092], Total error: 1.4927200859844327
Iteration# 4, Updated weights: [-0.48849056  0.43444972  0.61733862], Total error: 1.0843576328267484
Iteration# 5, Updated weights: [-0.42552766  0.52348898  0.52888333], Total error: 0.791076341265537
Iteration# 6, Updated weights: [-0.36876556  0.59672578  0.45299734], Total error: 0.57839637224605
Iteration# 7, Updated weights: [-0.31842426  0.65760436  0.38794189], Total error: 0.42337682412962546
Iteration# 8, Updated weights: [-0.27426182  0.70861317  0.33219437], Total error: 0.310086025674872
Iteration# 9, Updated weights: [-0.23580514  0.7516052   0.28443665], Total error: 0.22717868688080495
Iteration# 10, Updated weights