**Homework 12.**

For this assignment, you won't need more than numpy:

In [None]:
import numpy as np

In this assignment, you'll be creating three classes: `Linear`, `Softmax`, and `Model`. The `Linear` and `Softmax` classes define layers of a neural network. The `Model` class defines a particular network, given a list of layers.

For example, consider the following code:
```
layer1=Linear(2,3)
layer2=Linear(3,10)
layer3=Softmax()
network=Model([layer1,layer2,layer3])
```

Here, `layer1` takes a feature matrix with 2 columns (features) and produces a matrix with the same number of rows and 3 columns. `layer2` then accepts that matrix and produces one with 10 columns. Finally, the Softmax layer turns each row into positive values that sum to 1, representing the probabilities that the observation in that row belongs to each of the 10 possible classes.

All three classes (`Linear`, `Softmax`, and `Model`) should have a `predict` method. If `X` is a feature matrix of shape (n,2), then running `network.predict(X)` after the above code will call the `predict` method of each layer in turn, and produce a matrix of shape (n,10) of positive values, where each row sums to 1.
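
To make the shape bookkeeping concrete, here is a minimal sketch in plain numpy that follows one feature matrix through the same three steps. The names `n`, `X_demo`, `W1`, `b1`, `W2`, `b2`, `H`, `Z`, and `P` are hypothetical stand-ins chosen for illustration (all-ones weights rather than the random initialization used in the assignment), so the numbers will not match your network's output; only the shapes carry over.

```python
import numpy as np

n = 5                                # hypothetical number of observations
X_demo = np.ones((n, 2))             # feature matrix: n rows, 2 features

# Hypothetical stand-ins for the parameters of Linear(2,3) and Linear(3,10)
W1, b1 = np.ones((2, 3)), np.zeros(3)
W2, b2 = np.ones((3, 10)), np.zeros(10)

H = np.dot(X_demo, W1) + b1          # (n, 2) times (2, 3) gives (n, 3)
Z = np.dot(H, W2) + b2               # (n, 3) times (3, 10) gives (n, 10)

# Softmax across each row turns the 10 scores into 10 probabilities
P = np.exp(Z) / np.sum(np.exp(Z), axis=1, keepdims=True)

print(H.shape, Z.shape, P.shape)         # (5, 3) (5, 10) (5, 10)
print(np.allclose(P.sum(axis=1), 1.0))   # True
```

The number of rows never changes; each layer only changes the number of columns.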


In [None]:
class Linear():
  '''Fully connected linear layer class'''
  def __init__(self, input_size, output_size):
    np.random.seed(1) #Don't use in practice! This is just to make sure we all get the same answers
    self.weights = np.random.randn(input_size, output_size) * np.sqrt(2.0 / input_size) #Standard weight initialization
    self.biases = np.zeros(output_size) #Standard bias initialization

  def predict(self,input):
    return np.dot(input, self.weights) + self.biases

In [None]:
class Softmax():
  '''Implement Softmax as final layer for prediction only'''
  def __init__(self):
    pass #No init function necessary

  def predict(self,input):
    exp_input = np.exp(input)
    softmax_output = exp_input / np.sum(exp_input, axis=1, keepdims=True)
    return softmax_output
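
The `predict` method above is fine for the small inputs in this assignment, but `np.exp` overflows to `inf` for large float64 inputs (roughly above 709), which would turn the output into `nan`. A common workaround, shown here only as an optional sketch and not as part of the required solution, is to subtract each row's maximum before exponentiating. The function name `stable_softmax` and the test arrays are hypothetical.

```python
import numpy as np

def stable_softmax(scores):
    '''Row-wise softmax that subtracts each row's max before exponentiating (hypothetical helper).'''
    # Shifting a row by a constant does not change its softmax,
    # but it keeps the largest exponent at exp(0) = 1.
    shifted = scores - np.max(scores, axis=1, keepdims=True)
    exp_shifted = np.exp(shifted)
    return exp_shifted / np.sum(exp_shifted, axis=1, keepdims=True)

small = np.array([[1.0, 2.0, 3.0],
                  [0.0, 0.0, 0.0]])
large = np.array([[1000.0, 1001.0, 1002.0]])  # naive np.exp would overflow here

probs_small = stable_softmax(small)
probs_large = stable_softmax(large)

print(probs_small.sum(axis=1))                    # each row sums to 1
print(np.allclose(probs_large, probs_small[0]))   # True: same result as the unshifted row
```

Because the shift cancels in the ratio, this gives the same probabilities as the direct formula whenever the direct formula does not overflow.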

In [None]:
class Model():
  def __init__(self,layerlist):
    self.layerlist=layerlist

  def add(self,layer):
    self.layerlist+=[layer]

  def predict(self,input):
    output = input
    for layer in self.layerlist:
      output = layer.predict(output)
    return output

Now test your code. Run this code block. You should see a matrix of shape (15,10), where each row sums to 1.

In [None]:
np.random.seed(4)
X=np.random.random(30).reshape((15,2)) #generate a random feature matrix with 2 features, and 15 observations

layer1=Linear(2,3)
layer2=Linear(3,10)
layer3=Softmax()
network=Model([layer1,layer2,layer3])

network.predict(X)

array([[0.58895817, 0.0053466 , 0.00687284, 0.00794724, 0.01831197,
        0.01763088, 0.18482285, 0.08559619, 0.0710381 , 0.01347517],
       [0.65962938, 0.00266868, 0.00447758, 0.00629504, 0.01162858,
        0.02223053, 0.12355755, 0.09678726, 0.06217709, 0.0105483 ],
       [0.35700385, 0.02805517, 0.02521141, 0.02265791, 0.05290135,
        0.02441019, 0.26915131, 0.08679041, 0.10011159, 0.03370681],
       [0.28066637, 0.0326074 , 0.01742272, 0.01056797, 0.05199814,
        0.00511925, 0.47308729, 0.03736744, 0.07223923, 0.01892421],
       [0.28124061, 0.02057589, 0.03515588, 0.05157294, 0.04297376,
        0.14372767, 0.08414889, 0.17619607, 0.10674149, 0.05766681],
       [0.36470112, 0.02618914, 0.02166174, 0.01825689, 0.0499858 ,
        0.01755654, 0.30457046, 0.07472269, 0.09379073, 0.0285649 ],
       [0.69297767, 0.00099388, 0.00276141, 0.00564827, 0.00602524,
        0.04598877, 0.05315221, 0.13196138, 0.05184023, 0.00865094],
       [0.26766376, 0.01098497, 0.0258494

To see this as predicted classes, rather than probabilities of each class, recall that you can use the argmax function to pick out the class with the highest probability for each observation:

In [None]:
np.argmax(network.predict(X),axis=1)

array([0, 0, 0, 6, 0, 0, 0, 0, 5, 5, 0, 0, 0, 0, 0])
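
As a small illustration of what `axis=1` does here, the sketch below applies `np.argmax` to a hand-written probability matrix. The array `probs` is made up for this example and is not output from the network.

```python
import numpy as np

# Hypothetical probabilities: 3 observations (rows), 3 classes (columns)
probs = np.array([[0.1, 0.7, 0.2],
                  [0.6, 0.3, 0.1],
                  [0.2, 0.2, 0.6]])

# axis=1 searches along each row, returning one column index per observation
predicted = np.argmax(probs, axis=1)
print(predicted)  # [1 0 2]
```

Each entry of the result is the column index of the largest value in the corresponding row, which is the predicted class for that observation.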

Without a softmax layer, you can do regression tasks, as long as the last layer outputs a single real number for each observation. Here we'll use the `add` method of the `Model` class to build a neural network one layer at a time, instead of all at once. This avoids having to name each layer.

In [None]:
network2=Model([])
network2.add(Linear(2,10))
network2.add(Linear(10,1))

network2.predict(X)

array([[8.12371278],
       [8.75681564],
       [5.22462795],
       [6.25547264],
       [3.16413242],
       [5.68040587],
       [9.01326988],
       [3.17403503],
       [1.4344951 ],
       [3.68993064],
       [6.16532713],
       [8.10644278],
       [1.37261639],
       [4.43398192],
       [6.13749615]])

In the next assignment, you'll add methods to your classes that adjust the weights and biases of each layer, so that your network generates predictions that match some target.