Exercise 1 Simple neural network
Implement a simple neural network that predicts the class of an iris plant based on its parameters.
Use 2 hidden layers. The layers should be fully connected with the ReLU activation function, except for the output layer, which should use the softmax function.
No regularisation or optimisation is needed. No batching is needed.


In [15]:
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
iris = datasets.load_iris()


In [16]:
print(iris.DESCR)


.. _iris_dataset:

Iris plants dataset
--------------------

**Data Set Characteristics:**

:Number of Instances: 150 (50 in each of three classes)
:Number of Attributes: 4 numeric, predictive attributes and the class
:Attribute Information:
    - sepal length in cm
    - sepal width in cm
    - petal length in cm
    - petal width in cm
    - class:
            - Iris-Setosa
            - Iris-Versicolour
            - Iris-Virginica

:Summary Statistics:

                Min  Max   Mean    SD   Class Correlation
sepal length:   4.3  7.9   5.84   0.83    0.7826
sepal width:    2.0  4.4   3.05   0.43   -0.4194
petal length:   1.0  6.9   3.76   1.76    0.9490  (high!)
petal width:    0.1  2.5   1.20   0.76    0.9565  (high!)

:Missing Attribute Values: None
:Class Distribution: 33.3% for each of 3 classes.
:Creator: R.A. Fisher
:Donor: Michael Marshall (MARSHALL%PLU@io.arc.nasa.gov)
:Date: July, 1988

The famous Iris database, first used by Sir R.A. Fisher. The dataset is taken
from Fis

In [17]:
X = iris.data
y = iris.target

In [18]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42, shuffle=True)

In [19]:
class Neuron():
    """Fully connected layer (despite the name): in_features inputs, out_features outputs."""
    def __init__(self, in_features, out_features):
        # Uniform init in [-1/sqrt(in_features), 1/sqrt(in_features)], as in PyTorch
        bound = (1 / in_features) ** 0.5
        self.weights = np.random.uniform(low=-bound, high=bound, size=(in_features, out_features))
        self.bias = np.random.uniform(low=-bound, high=bound, size=(1, out_features))

    def forward(self, X):
        # X: (n_samples, in_features) -> (n_samples, out_features)
        return X @ self.weights + self.bias
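As a quick sanity check of the initialisation above, the following sketch re-declares the layer (so that it runs on its own) and verifies that the weights stay within ±1/sqrt(in_features) and that `forward` maps `(n_samples, in_features)` to `(n_samples, out_features)`. The seed and the random input batch are assumptions made purely for illustration.

```python
import numpy as np

class Neuron():
    def __init__(self, in_features, out_features):
        bound = (1 / in_features) ** 0.5
        self.weights = np.random.uniform(low=-bound, high=bound, size=(in_features, out_features))
        self.bias = np.random.uniform(low=-bound, high=bound, size=(1, out_features))

    def forward(self, X):
        return X @ self.weights + self.bias

np.random.seed(0)                  # assumed seed, for reproducibility only
layer = Neuron(4, 10)
batch = np.random.rand(5, 4)       # hypothetical batch: 5 samples, 4 features
out = layer.forward(batch)

bound = (1 / 4) ** 0.5             # 0.5 for in_features = 4
print(out.shape)                                  # (5, 10)
print(np.abs(layer.weights).max() <= bound)       # True
```

The bias has shape `(1, out_features)`, so NumPy broadcasting adds it to every row of the batch.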


    

In [None]:
def hidden_unit(X, in_features, out_features):
    # Stateless variant of Neuron.forward: the weights are re-drawn on every call,
    # so a layer built this way cannot be trained.
    bound = (1 / in_features) ** 0.5    # as in PyTorch
    weights = np.random.uniform(low=-bound, high=bound, size=(in_features, out_features))
    bias = np.random.uniform(low=-bound, high=bound, size=(1, out_features))

    return X @ weights + bias

def Relu_forward(X):
    return np.maximum(X,0)

def Relu_backward(X):
    # Zero out non-positive entries (this kills the gradient there);
    # returns a new array instead of modifying X in place
    return np.where(X > 0, X, 0)

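def Softmax_forward(X):
    # Subtract the row-wise max before exponentiating to avoid overflow; the result is unchanged
    X_exp = np.exp(X - np.max(X, axis=1, keepdims=True))
    row_sum = np.sum(X_exp, axis=1, keepdims=True)
    return X_exp / row_sum

def Logloss(X, labels):
    # X: matrix after softmax, labels: one-hot targets.
    # np.mean averages over all n_samples * n_classes entries, so the result is
    # the per-sample cross-entropy divided by the number of classes.
    return -np.mean(labels * np.log(X))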

def CrossEntropyBackward(Softmax_output, labels):
    # Combining softmax with the log loss gives a simple derivative with respect to the logits.
    # Softmax_output is the output of softmax and labels are the one-hot true labels.
    return Softmax_output - labels
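The closed form `Softmax_output - labels` can be checked numerically. The sketch below is self-contained and uses its own helper names (`softmax`, `summed_cross_entropy`), random logits and a fixed seed, all of which are assumptions for illustration. It compares the analytic gradient of the cross-entropy summed over samples against central finite differences.

```python
import numpy as np

def softmax(z):
    z = z - np.max(z, axis=1, keepdims=True)   # shift for numerical stability
    e = np.exp(z)
    return e / np.sum(e, axis=1, keepdims=True)

def summed_cross_entropy(z, labels):
    # cross-entropy summed over samples (no averaging)
    return -np.sum(labels * np.log(softmax(z)))

rng = np.random.default_rng(0)                 # assumed seed
logits = rng.normal(size=(6, 3))               # hypothetical logits: 6 samples, 3 classes
labels = np.eye(3)[rng.integers(0, 3, size=6)]

analytic = softmax(logits) - labels            # same expression as CrossEntropyBackward

# central finite differences, one logit at a time
eps = 1e-6
numeric = np.zeros_like(logits)
for i in range(logits.shape[0]):
    for j in range(logits.shape[1]):
        plus, minus = logits.copy(), logits.copy()
        plus[i, j] += eps
        minus[i, j] -= eps
        numeric[i, j] = (summed_cross_entropy(plus, labels)
                         - summed_cross_entropy(minus, labels)) / (2 * eps)

max_diff = np.max(np.abs(analytic - numeric))
print(max_diff < 1e-6)
```

Because `Logloss` divides by `n_samples * n_classes`, the gradient of `Logloss` itself is this expression divided by that count. Using the unscaled form only changes the effective learning rate.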

In [21]:
N_1 = Neuron(4,10)
N_2 = Neuron(10,10)
N_3 = Neuron(10,3)


In [22]:
X_1 = Relu_forward(N_1.forward(X))   # first hidden layer on the full dataset X (not X_train): (150, 4) -> (150, 10)

In [23]:
X_2 = Relu_forward(N_2.forward(X_1)) # second hidden layer: (150, 10) -> (150, 10)

In [24]:
X_softmax = Softmax_forward(N_3.forward(X_2)) # output layer: (150, 10) -> (150, 3) class probabilities

In [25]:
train_labels = np.eye(3)[y] # one-hot encoding of the 3 flower classes (built from the full y, matching X)
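The `np.eye(3)[y]` idiom works because indexing the identity matrix with an array of class indices selects one row per sample, and row `i` of the identity matrix is exactly the one-hot vector for class `i`. A tiny example with hypothetical targets:

```python
import numpy as np

targets = np.array([0, 2, 1])      # hypothetical class indices
one_hot = np.eye(3)[targets]       # picks rows 0, 2 and 1 of the 3x3 identity matrix
print(one_hot)
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]

# argmax along the class axis recovers the original indices
recovered = np.argmax(one_hot, axis=1)
```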


In [26]:
Logloss(X_softmax,train_labels)

0.5189670522580744

In [27]:
Z_softmax = CrossEntropyBackward(X_softmax,train_labels)

In [28]:
Z_softmax

array([[-0.3917301 ,  0.31434632,  0.07738378],
       [-0.4058364 ,  0.3212361 ,  0.0846003 ],
       [-0.40555043,  0.32000355,  0.08554688],
       [-0.40983041,  0.32302601,  0.0868044 ],
       [-0.39162533,  0.31403855,  0.07758679],
       [-0.37474396,  0.30651181,  0.06823215],
       [-0.40153619,  0.31789333,  0.08364286],
       [-0.39581422,  0.31680464,  0.07900958],
       [-0.41793999,  0.32591973,  0.09202027],
       [-0.40586507,  0.32215122,  0.08371386],
       [-0.38217728,  0.31034537,  0.07183191],
       [-0.39981747,  0.31894305,  0.08087443],
       [-0.4098012 ,  0.32333906,  0.08646213],
       [-0.41958547,  0.32501584,  0.09456963],
       [-0.36842887,  0.30201485,  0.06641402],
       [-0.35931577,  0.29724944,  0.06206633],
       [-0.37412608,  0.30424856,  0.06987752],
       [-0.38979897,  0.31298655,  0.07681242],
       [-0.37307885,  0.30652431,  0.06655454],
       [-0.38391498,  0.31028399,  0.07363099],
       [-0.38791595,  0.31432784,  0.073

In [29]:
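# Note: Z_softmax is the gradient at the output logits, shape (150, 3). Reaching the second ReLU
# would require Z_softmax @ N_3.weights.T masked by X_2 > 0; this cell only zeroes the non-positive entries.
Z_Relu2 = Relu_backward(Z_softmax)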
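A complete backward pass for a network of this shape (4 -> 10 -> 10 -> 3) can be sketched as below. It is a minimal illustration under stated assumptions, not the notebook's own solution: the data are synthetic Gaussian clusters standing in for iris, and the parameter names (`W1`, `b1`, ...), the feature standardisation, the learning rate, the step count and the seed are all hypothetical. Each ReLU mask is taken from that layer's forward pre-activation, and the gradient is propagated through the next layer's weights before it is masked.

```python
import numpy as np

rng = np.random.default_rng(0)                  # assumed seed

# Synthetic stand-in for iris (assumption): 3 Gaussian clusters, 4 features, 50 samples each
centers = rng.normal(scale=3.0, size=(3, 4))
y = np.repeat(np.arange(3), 50)
X = centers[y] + rng.normal(size=(150, 4))
X = (X - X.mean(axis=0)) / X.std(axis=0)        # standardise features (assumption)
labels = np.eye(3)[y]
n = X.shape[0]

def init(n_in, n_out):
    bound = (1 / n_in) ** 0.5
    W = rng.uniform(-bound, bound, size=(n_in, n_out))
    b = rng.uniform(-bound, bound, size=(1, n_out))
    return W, b

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

W1, b1 = init(4, 10)
W2, b2 = init(10, 10)
W3, b3 = init(10, 3)
lr = 0.1                                        # assumed learning rate
losses = []

for step in range(200):
    # forward pass, keeping the pre-activations z1, z2 for the ReLU masks
    z1 = X @ W1 + b1
    a1 = np.maximum(z1, 0)
    z2 = a1 @ W2 + b2
    a2 = np.maximum(z2, 0)
    probs = softmax(a2 @ W3 + b3)
    # mean cross-entropy per sample; the small constant guards against log(0)
    losses.append(-np.mean(np.log(probs[np.arange(n), y] + 1e-12)))

    # backward pass: all gradients are computed before any weight is updated
    dz3 = (probs - labels) / n                  # softmax + cross-entropy
    dz2 = (dz3 @ W3.T) * (z2 > 0)               # through layer 3, then ReLU mask of layer 2
    dz1 = (dz2 @ W2.T) * (z1 > 0)               # through layer 2, then ReLU mask of layer 1

    # plain gradient descent on the full dataset (no batching)
    W3 -= lr * (a2.T @ dz3)
    b3 -= lr * dz3.sum(axis=0, keepdims=True)
    W2 -= lr * (a1.T @ dz2)
    b2 -= lr * dz2.sum(axis=0, keepdims=True)
    W1 -= lr * (X.T @ dz1)
    b1 -= lr * dz1.sum(axis=0, keepdims=True)

accuracy = np.mean(np.argmax(probs, axis=1) == y)
print(losses[0], losses[-1], accuracy)
```

Note that the hidden-layer gradients have shape `(150, 10)`, matching the hidden activations, whereas the gradient at the logits has shape `(150, 3)`.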

In [30]:
Z_Relu2

array([[0.        , 0.31434632, 0.07738378],
       [0.        , 0.3212361 , 0.0846003 ],
       [0.        , 0.32000355, 0.08554688],
       [0.        , 0.32302601, 0.0868044 ],
       [0.        , 0.31403855, 0.07758679],
       [0.        , 0.30651181, 0.06823215],
       [0.        , 0.31789333, 0.08364286],
       [0.        , 0.31680464, 0.07900958],
       [0.        , 0.32591973, 0.09202027],
       [0.        , 0.32215122, 0.08371386],
       [0.        , 0.31034537, 0.07183191],
       [0.        , 0.31894305, 0.08087443],
       [0.        , 0.32333906, 0.08646213],
       [0.        , 0.32501584, 0.09456963],
       [0.        , 0.30201485, 0.06641402],
       [0.        , 0.29724944, 0.06206633],
       [0.        , 0.30424856, 0.06987752],
       [0.        , 0.31298655, 0.07681242],
       [0.        , 0.30652431, 0.06655454],
       [0.        , 0.31028399, 0.07363099],
       [0.        , 0.31432784, 0.07358811],
       [0.        , 0.31002073, 0.0739765 ],
       [0.