# Artificial Neural Networks Intro
  
    
### Logical Computations with Neurons
Warren McCulloch and Walter Pitts proposed a very simple model of a neuron with binary (on/off) inputs and one binary output. Networks of such neurons can reproduce the behavior of logic gates, so more complex logical computations can be built from them, just as hardware is built from gates.
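
The gate analogy can be sketched in a few lines of plain Python. This is a minimal illustration, not the original formulation: the helper `mp_neuron`, its `threshold` parameter, and the convention that any active inhibitory input silences the neuron are assumptions made for this example.

```python
def mp_neuron(excitatory, inhibitory=(), threshold=1):
    """Binary neuron: fires (1) when enough excitatory inputs are on.

    Assumption for this sketch: any active inhibitory input blocks firing.
    """
    if any(inhibitory):
        return 0
    return 1 if sum(excitatory) >= threshold else 0


def and_gate(a, b):
    # Both inputs must be on to reach the threshold of 2.
    return mp_neuron([a, b], threshold=2)


def or_gate(a, b):
    # A single active input is enough.
    return mp_neuron([a, b], threshold=1)


def not_gate(a):
    # An always-on excitatory input that `a` can inhibit.
    return mp_neuron([1], inhibitory=[a], threshold=1)


def xor_gate(a, b):
    # Gates compose like hardware: (a OR b) AND NOT (a AND b).
    return and_gate(or_gate(a, b), not_gate(and_gate(a, b)))


for a in (0, 1):
    for b in (0, 1):
        print(a, b, and_gate(a, b), or_gate(a, b), xor_gate(a, b))
```

XOR needs a second layer of neurons here, which is one reason stacking layers matters later on.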

### The Perceptron
Invented by Frank Rosenblatt, the perceptron is a single-layer network of linear threshold units (LTUs). Its inputs and outputs are numbers rather than the binary on/off values of the neuron above. Each LTU computes a weighted sum of its inputs, applies a step function to that sum, and outputs the result.



    h_w(x) = step(z) = step(w^T x)
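
As a rough sketch of that computation, the snippet below evaluates a single LTU in plain Python. The weight values and the separate `bias` term are hypothetical, chosen only for illustration; the formula above has no explicit bias, which is commonly folded in as a weight on a constant input of 1.

```python
def step(z):
    """Heaviside step function: 1 when z >= 0, otherwise 0."""
    return 1 if z >= 0 else 0


def ltu(weights, x, bias=0.0):
    """Linear threshold unit: step applied to the weighted sum w^T x + bias."""
    z = sum(w_i * x_i for w_i, x_i in zip(weights, x)) + bias
    return step(z)


weights = [0.5, -1.0]  # hypothetical weights
bias = -0.2            # hypothetical bias

print(ltu(weights, [2.0, 0.5], bias))  # weighted sum is positive, so the unit fires
print(ltu(weights, [0.5, 1.0], bias))  # weighted sum is negative, so it stays off
```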

##### Perceptron Learning Rules:
w_i,j(next_step) = w_i,j + n(y_j - yhat_j)x_i

* w_i,j is the connection weight between the i^th input neuron and the j^th output neuron.
* x_i is the i^th input value of the current training instance.
* yhat_j is the output of the j^th output neuron for the current training instance.
* y_j is the target output of the j^th output neuron for the current training instance.
* n is the learning rate.
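
The update rule above can be sketched for a single output neuron as follows. This is a minimal illustration under stated assumptions: the names `predict` and `train_perceptron`, the explicit bias `b` (updated as if it were a weight on a constant input of 1), the learning rate of 0.5, and the AND toy dataset are all choices made for this example.

```python
def predict(w, b, x):
    """Output of one LTU: step applied to the weighted sum plus bias."""
    z = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
    return 1 if z >= 0 else 0


def train_perceptron(X, y, eta=0.5, n_epochs=20):
    """Apply w_i <- w_i + eta * (y - yhat) * x_i one instance at a time."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(n_epochs):
        for x, target in zip(X, y):
            error = target - predict(w, b, x)  # (y_j - yhat_j)
            w = [w_i + eta * error * x_i for w_i, x_i in zip(w, x)]
            b += eta * error                   # bias sees a constant input of 1
    return w, b


# Toy, linearly separable data: logical AND
X_toy = [[0, 0], [0, 1], [1, 0], [1, 1]]
y_toy = [0, 0, 0, 1]

w, b = train_perceptron(X_toy, y_toy)
preds = [predict(w, b, x) for x in X_toy]
print(preds)  # [0, 0, 0, 1]
```

Weights only change when a prediction is wrong, so training stops moving once every instance is classified correctly.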

In [4]:
import numpy as np 
from sklearn.datasets import load_iris
from sklearn.linear_model import Perceptron

iris = load_iris()
X = iris.data[:, (2,3)] #petal length, petal width
y = (iris.target == 0).astype(int)  # 1 if Iris Setosa, else 0

per_clf = Perceptron(random_state=42)
per_clf.fit(X, y)

y_pred = per_clf.predict([[2, 0.5]])

##### Perceptrons output hard class predictions only, not class probabilities #####

An MLP (multi-layer perceptron) consists of one passthrough input layer, one or more layers of LTUs (the hidden layers), and one final layer of LTUs called the output layer.

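MLPs are trained with backpropagation, which is gradient descent using reverse-mode autodiff. For each training instance, the algorithm makes a prediction (forward pass) and measures the error between the network's output and the target. It then goes through the layers in reverse to compute how much each neuron in the hidden layers contributed to that error (reverse pass). This reverse pass efficiently measures the error gradient across all the connection weights in the network by propagating the error gradient backward. Finally, the weights are adjusted slightly to reduce the error.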
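
The forward and reverse passes can be sketched by hand for a tiny network. The example below is an illustration under assumptions, not a general implementation: it uses 2 inputs, 2 hidden neurons, and 1 output, the logistic activation 1 / (1 + exp(-z)), a squared-error loss, and made-up parameter values. It compares one backpropagated gradient against a finite-difference estimate.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(params, x):
    """Forward pass: returns the hidden activations and the network output."""
    W1, b1, W2, b2 = params
    h = [sigmoid(sum(w * x_i for w, x_i in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    out = sigmoid(sum(w * h_k for w, h_k in zip(W2, h)) + b2)
    return h, out


def loss(params, x, target):
    _, out = forward(params, x)
    return 0.5 * (out - target) ** 2


def backward(params, x, target):
    """Reverse pass: propagate the output error back to every weight."""
    W1, b1, W2, b2 = params
    h, out = forward(params, x)
    delta_out = (out - target) * out * (1.0 - out)  # error signal at the output
    grad_W2 = [delta_out * h_k for h_k in h]
    grad_b2 = delta_out
    # Each hidden neuron's share of the error flows back through its outgoing weight.
    delta_h = [delta_out * w * h_k * (1.0 - h_k) for w, h_k in zip(W2, h)]
    grad_W1 = [[d * x_i for x_i in x] for d in delta_h]
    grad_b1 = delta_h
    return grad_W1, grad_b1, grad_W2, grad_b2


def numeric_grad_W1(params, x, target, k, i, eps=1e-6):
    """Central finite-difference estimate of dLoss/dW1[k][i]."""
    W1, b1, W2, b2 = params

    def loss_with(delta):
        shifted = [row[:] for row in W1]
        shifted[k][i] += delta
        return loss((shifted, b1, W2, b2), x, target)

    return (loss_with(eps) - loss_with(-eps)) / (2 * eps)


# Hypothetical parameters and a single training instance
params = ([[0.1, -0.2], [0.4, 0.3]], [0.0, 0.1], [0.5, -0.6], 0.2)
x, target = [1.0, 2.0], 1.0

grad_W1, grad_b1, grad_W2, grad_b2 = backward(params, x, target)
numeric = numeric_grad_W1(params, x, target, k=0, i=1)
print(abs(numeric - grad_W1[0][1]) < 1e-7)  # backprop agrees with the numeric estimate
```

The finite-difference check needs two forward passes per weight, whereas the reverse pass produces every gradient in a single sweep, which is why it scales to large networks.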

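For this algorithm to work, the perceptron's step function must be replaced with a differentiable activation such as the logistic function, 1 / (1 + exp(-z)). The step function is flat everywhere, so it gives gradient descent no gradient to work with. The logistic function has a nonzero derivative everywhere and its output ranges from 0 to 1, so gradients can be computed and the output stays bounded.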
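
A quick numerical look at the logistic function and its derivative, as a minimal sketch. The function names are chosen for this example, and the derivative formula s * (1 - s) is the standard closed form rather than something stated in these notes.

```python
import math


def logistic(z):
    """Logistic (sigmoid) function: squashes any real z into (0, 1)."""
    # Naive form: math.exp overflows for very negative z (roughly below -709).
    return 1.0 / (1.0 + math.exp(-z))


def logistic_grad(z):
    """Derivative of the logistic function: s * (1 - s)."""
    s = logistic(z)
    return s * (1.0 - s)


for z in (-5.0, 0.0, 5.0):
    print(z, logistic(z), logistic_grad(z))
```

The derivative peaks at 0.25 for z = 0 and shrinks toward 0 for large |z|, but unlike the step function it is never exactly flat in exact arithmetic.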

### Training a DNN using TensorFlow

In [5]:
import tensorflow as tf

n_inputs = 28*28 # MNIST
n_hidden1 = 300
n_hidden2 = 100
n_outputs = 10

X = tf.placeholder(tf.float32, shape=(None, n_inputs), name="X")  # fed with mini-batches during training
y = tf.placeholder(tf.int64, shape=(None), name="y")


def neuron_layer(X, n_neurons, name, activation=None):
    with tf.name_scope(name):  # name scope using the name of the layer
        n_inputs = int(X.get_shape()[1])  # number of input features
        stddev = 2 / np.sqrt(n_inputs + n_neurons)  # standard deviation for the weight initialization
        init = tf.truncated_normal((n_inputs, n_neurons), stddev=stddev)  # random values from a truncated normal distribution
        W = tf.Variable(init, name="kernel")  # weights
        b = tf.Variable(tf.zeros([n_neurons]), name="bias")  # biases
        Z = tf.matmul(X, W) + b  # weighted sums plus biases
        if activation is not None:
            return activation(Z)
        else:
            return Z

In [9]:
#Creating layers
with tf.name_scope("dnn"):
    hidden1 = tf.layers.dense(X, n_hidden1, name="hidden1", activation=tf.nn.relu)
    hidden2 = tf.layers.dense(hidden1, n_hidden2, name="hidden2", activation=tf.nn.relu)
    logits = tf.layers.dense(hidden2, n_outputs, name="outputs")
    
    # cost function
    # xentropy is equivalent to applying the softmax activation function
    # and then computing cross entropy.
with tf.name_scope("loss"):
    xentropy = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits)
    loss = tf.reduce_mean(xentropy, name="loss")

    # Training using GradientDescent
learning_rate = 0.01
with tf.name_scope("train"):
    optimizer = tf.train.GradientDescentOptimizer(learning_rate)
    training_op = optimizer.minimize(loss) #minimizing loss function
    
    #using accuracy as a performance measure
with tf.name_scope("eval"):
    correct = tf.nn.in_top_k(logits, y, 1)
    accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))
    init = tf.global_variables_initializer()
    saver = tf.train.Saver()
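
To make the loss and the accuracy measure in the cell above more concrete, here is a plain-Python sketch of what they compute for a small batch. It is an illustration only, not TensorFlow's implementation: the helper names, the logits, and the labels are made up for this example, and ties between equal logits are not handled the way `tf.nn.in_top_k` handles them.

```python
import math


def sparse_softmax_cross_entropy(logits, label):
    """Cross entropy of softmax(logits) against an integer class label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum_exp - logits[label]  # equals -log(softmax(logits)[label])


def top1_accuracy(batch_logits, labels):
    """Fraction of instances whose highest logit matches the label."""
    hits = [max(range(len(l)), key=l.__getitem__) == y
            for l, y in zip(batch_logits, labels)]
    return sum(hits) / len(hits)


# Hypothetical batch: 3 instances, 3 classes
batch_logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0], [1.0, 1.5, 0.0]]
labels = [0, 2, 0]

losses = [sparse_softmax_cross_entropy(l, y) for l, y in zip(batch_logits, labels)]
mean_loss = sum(losses) / len(losses)      # counterpart of tf.reduce_mean(xentropy)
acc = top1_accuracy(batch_logits, labels)  # 2 of the 3 predictions are correct
print(mean_loss, acc)
```

Working directly on logits, rather than applying softmax first and taking the log afterwards, avoids taking the log of values that have underflowed to 0.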



#### Execution phase

In [14]:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/tmp/data/")

n_epochs = 40
batch_size = 50

with tf.Session() as sess:
    init.run()
    for epoch in range(n_epochs):
        for iteration in range(mnist.train.num_examples // batch_size):
            X_batch, y_batch = mnist.train.next_batch(batch_size)
            sess.run(training_op, feed_dict={X:X_batch, y:y_batch})
        acc_train = accuracy.eval(feed_dict={X:X_batch, y:y_batch})
        acc_val = accuracy.eval(feed_dict={X:mnist.validation.images, y: mnist.validation.labels})
        
        print(epoch, "Train accuracy:", acc_train, "Val accuracy;", acc_val)
        
    save_path = saver.save(sess, "models/tensorflow/my_model_final.ckpt")

Extracting /tmp/data/train-images-idx3-ubyte.gz
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz
0 Train accuracy: 0.86 Val accuracy; 0.906
1 Train accuracy: 0.98 Val accuracy; 0.9252
2 Train accuracy: 0.9 Val accuracy; 0.9338
3 Train accuracy: 0.96 Val accuracy; 0.94
4 Train accuracy: 0.9 Val accuracy; 0.9462
5 Train accuracy: 0.96 Val accuracy; 0.95
6 Train accuracy: 0.96 Val accuracy; 0.9524
7 Train accuracy: 0.98 Val accuracy; 0.9574
8 Train accuracy: 0.96 Val accuracy; 0.9602
9 Train accuracy: 0.98 Val accuracy; 0.9612
10 Train accuracy: 0.98 Val accuracy; 0.9634
11 Train accuracy: 0.96 Val accuracy; 0.9666
12 Train accuracy: 0.98 Val accuracy; 0.9678
13 Train accuracy: 0.98 Val accuracy; 0.9682
14 Train accuracy: 0.94 Val accuracy; 0.9702
15 Train accuracy: 0.94 Val accuracy; 0.9712
16 Train accuracy: 1.0 Val accuracy; 0.9712
17 Train accuracy: 1.0 Val accuracy; 0.9726
18 Train accuracy: 