# Table of Contents

[Neural Networks](#NeuralNetworks)

* [Introduction](#Introduction)
	* [Biological Neurons](#BiologicalNeurons)
	* [Artificial Neurons](#ArtificialNeurons)





# <a id="NeuralNetworks"></a>Neural Networks

## <a id="Introduction"></a>Introduction

### <a id="BiologicalNeurons"></a>Biological Neurons

Many tasks that involve intelligence, pattern recognition, object classification, or detection are difficult
to implement using classical software engineering principles, even though animals and young children perform them easily.

For example, the family cat can easily tell you apart from a stranger.
A small child can easily recognize who is dad and who is mom.

Human brains perform complex pattern recognition tasks without us even noticing.
How can brains do that?

The answer lies in our bodies. Each of us contains a real-life biological neural
network that is connected to our nervous system. This network is composed of a large number of
interconnected neurons (nerve cells).

One brain has approximately 86 billion neurons by current estimates (older texts often quote 10 billion), each connected to about 10,000
other neurons. The cell body of the neuron is called the ***soma***; the inputs (dendrites) and
outputs (axons) connect one soma to others.

Each neuron receives electrochemical inputs from other neurons at its ***dendrites***. If these
electrical inputs are powerful enough to activate the neuron, the activated neuron transmits
the signal along its ***axon***, passing it to the ***dendrites*** of other neurons. These attached neurons
may also fire, thus continuing the process of passing the message along.

Firing a neuron is a binary operation: the neuron either
fires or it doesn't. There are no different ***grades*** of firing. A neuron fires only
if the total signal received at the ***soma*** exceeds a given threshold.

![Image](course/assets/image/biological-neurons.png)

***Dendrite***: Receives signals from other neurons   
***Soma***: Processes the information   
***Axon***: Transmits the output of this neuron   
***Synapse***: Point of connection to other neurons   

Can we simulate a neural network from nature?

So, if we want to simulate the structure of the brain, we should build a computation system composed of connected nodes,
where each node executes a simple computation. Such a structure can be implemented as a directed graph.
From graph theory, we know that a directed graph consists of a set of nodes (i.e., vertices) and a set of
connections (i.e., edges) that link pairs of nodes together.

Each node performs a simple computation. Each connection then carries a signal (i.e., the
output of the computation) from one node to another, labeled by a weight indicating the extent to
which the signal is amplified or diminished. Some connections have large, positive weights that
amplify the signal, indicating that the signal is very important when making a classification. 
Others have negative weights, diminishing the strength of the signal, thus specifying that the output of
the node is less important in the final classification. 
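As a minimal sketch of this idea, the snippet below stores a tiny directed graph as a dictionary of weighted edges and computes the total signal arriving at one node. The node names, weights, signal values, and the helper `node_input` are all hypothetical, chosen only for illustration.

```python
# Hypothetical toy graph: each key is a (source, target) edge, each value its weight.
edges = {
    ("a", "c"): 0.8,   # large positive weight: amplifies the signal from a
    ("b", "c"): -0.3,  # negative weight: diminishes the signal from b
}

# Signals currently emitted by the source nodes (made-up values).
signals = {"a": 1.0, "b": 0.5}


def node_input(node, edges, signals):
    """Sum the weighted signals arriving at `node` over its incoming edges."""
    return sum(
        weight * signals[source]
        for (source, target), weight in edges.items()
        if target == node
    )


total = node_input("c", edges, signals)
print(total)
```

Here node `c` receives `0.8 * 1.0` from `a` and `-0.3 * 0.5` from `b`, so the positive connection dominates the sum.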

Initially, the connection weights are set to random values, which are then modified by a learning algorithm.

Such a system is an Artificial Neural Network.

The word ***neural*** is the adjective form of ***neuron***, and ***network*** denotes a graph-like
structure. An ***Artificial Neural Network*** is therefore a computation system that attempts to simulate the neural
connections in our nervous system.

Artificial neural networks are also referred to simply as ***neural networks***. It is common to abbreviate
the term as ***ANN*** or just ***NN***.

### <a id="ArtificialNeurons"></a>Artificial Neurons

In 1943, ***Warren S. McCulloch***, a neuroscientist, and ***Walter Pitts***, a logician, published the paper ***A logical calculus of the ideas immanent in nervous activity***. In it, McCulloch and Pitts tried to understand how the brain could produce highly complex patterns by using many basic cells that are connected together. These basic brain cells are called neurons, and the paper gave a highly simplified model of a neuron.

The McCulloch and Pitts model of a neuron, which we will call an ***MCP neuron*** for short, made an important contribution to the development of artificial neural networks, which model key features of biological neurons.

The model is divided into two parts. The first part, ***g***, takes the inputs and aggregates them; based on the aggregated value, the second part, ***f***, makes a decision.

![Image](course/assets/image/McCullochPittsNeuron.png)
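The two-part structure can be sketched in a few lines of Python. This is a simplified illustration rather than the formulation from the 1943 paper. It assumes boolean inputs, a plain sum for ***g***, and a "fire when the sum reaches the threshold" rule for ***f***. The function names and the `threshold` parameter are hypothetical.

```python
def g(inputs):
    """Aggregation: count how many boolean inputs are switched on."""
    return sum(inputs)


def f(aggregated, threshold):
    """Decision: fire (1) if the aggregate reaches the threshold, else 0."""
    return 1 if aggregated >= threshold else 0


def mcp_neuron(inputs, threshold):
    """Simplified MCP neuron: aggregate with g, then decide with f."""
    return f(g(inputs), threshold)


# With two inputs, threshold=2 behaves like AND and threshold=1 like OR.
for a in (0, 1):
    for b in (0, 1):
        and_out = mcp_neuron([a, b], threshold=2)
        or_out = mcp_neuron([a, b], threshold=1)
        print(a, b, and_out, or_out)
```

Note that nothing is learned here: the threshold is picked by hand for each function.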

The original ***MCP neuron*** had limitations, so the next major development in neural networks was the ***perceptron***, introduced by ***Frank Rosenblatt*** in 1958. It was further refined and carefully analyzed by ***Minsky*** and ***Papert*** (1969); their model is referred to as the ***perceptron*** model.

Essentially the ***perceptron*** is an ***MCP neuron*** where the inputs are first passed through some ***preprocessors*** which are called association units. These association units detect the presence of certain specific features in the inputs. In fact, as the name suggests, a perceptron was intended to be a pattern recognition device, and the association units correspond to feature or pattern detectors.


![Image](course/assets/image/perceptron-model.png)


The perceptron model, as analyzed by Minsky and Papert, is a more general computational model than the ***MCP neuron***. It overcomes some of the limitations of the ***MCP neuron*** by introducing numerical weights (a measure of importance) for the inputs, and a mechanism for learning those weights. Inputs are no longer limited to boolean values as in the ***MCP neuron***; real-valued inputs are supported as well, which makes the model more useful and more general.

It takes the inputs, aggregates them as a weighted sum, and returns 1 only if that sum is greater than some threshold; otherwise it returns 0.
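The threshold rule can be written compactly. The following is a sketch in notation that does not appear in the original text: $x_i$ are the inputs, $w_i$ the weights, $\theta$ the threshold, and $b = -\theta$ a bias term.

$$
y = \begin{cases} 1 & \text{if } \sum_{i=1}^{n} w_i x_i > \theta \\ 0 & \text{otherwise} \end{cases}
\qquad\Longleftrightarrow\qquad
y = \begin{cases} 1 & \text{if } \sum_{i=1}^{n} w_i x_i + b > 0 \\ 0 & \text{otherwise} \end{cases}
$$

Moving the threshold to the left-hand side is why the `Perceptron` class in the next cell appends a constant input of 1 to every data point and compares the weighted sum against 0: the weight attached to that constant input plays the role of $b$.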


In [82]:
import numpy as np

class Perceptron:
    
    def __init__(self, number_of_inputs, learning_rate=0.1):
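        # fix the seed of NumPy's global generator so that every new
        # Perceptron starts from the same random weights
        np.random.seed(7)
        # weight vector: one weight per input plus a trailing bias weight
        self.W = np.random.randn(number_of_inputs + 1) / np.sqrt(number_of_inputs)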
        self.learning_rate = learning_rate
        
    # activation function    
    def step(self, x):
        return 1 if x > 0 else 0
        
    def fit(self, X, y, epochs=10):

        # append a column of 1's so the bias is learned as the last weight
        X = np.c_[X, np.ones(X.shape[0])]

        # train for the desired number of epochs
        for epoch in range(epochs):
            # loop over each individual data point
            for (x, target) in zip(X, y):

                # take the dot product of the input features
                # and the weight vector, then pass the result
                # through the step function
                prediction = self.step(np.dot(x, self.W))

                # update the weights only if the prediction differs from the target
                if prediction != target:
                    # signed error: +1 or -1, since both values are 0 or 1
                    error = prediction - target

                    # move the weights against the error, scaled by the input
                    self.W += -self.learning_rate * error * x
                    
    def predict(self, X, addBias=True):
        # ensure our input is a matrix
        X = np.atleast_2d(X)

        # check to see if the bias column should be added
        if addBias:
            # insert a column of 1's as the last entry in the feature matrix (bias)
            X = np.c_[X, np.ones((X.shape[0]))]

        # take the dot product of the input features and the weight
        # vector, then pass the result through the step function
        # (step handles one value at a time, so pass a single data point)
        return self.step(np.dot(X, self.W))
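
To make the update in `fit` concrete, here is one step traced by hand. The weights and the input are made-up values chosen for easy arithmetic, not the ones the class actually draws. The variable names are illustrative only.

```python
import numpy as np

learning_rate = 0.1
w = np.array([0.5, -0.2, 0.1])  # made-up weights; the last entry is the bias weight
x = np.array([1.0, 1.0, 1.0])   # input [1, 1] with the constant bias input appended
target = 0                      # desired output for this data point

# weighted sum is 0.5 - 0.2 + 0.1 = 0.4, which is > 0, so the unit fires
prediction = 1 if np.dot(x, w) > 0 else 0

# the prediction is wrong, so the signed error is 1 - 0 = 1
error = prediction - target

# every weight whose input was 1 is lowered by learning_rate * error = 0.1
w_new = w - learning_rate * error * x

print(prediction, error, w_new)
```

After this single step the weighted sum for the same input is smaller but still positive, so the point would still be misclassified. This is why training loops over the data for several epochs.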
        

In [83]:
# define AND dataset
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([[0], [0], [0], [1]])

# obtain perceptron
p = Perceptron(X.shape[1], learning_rate=0.1)

# train
p.fit(X, y, epochs=20)

# now that our network is trained, loop over the data points
for (x, target) in zip(X, y):
    # make a prediction on the data point and display the result
    # to our console
    pred = p.predict(x)
    print("[INFO] data={}, ground-truth={}, pred={}".format(x, target[0], pred))


[INFO] data=[0 0], ground-truth=0, pred=0
[INFO] data=[0 1], ground-truth=0, pred=0
[INFO] data=[1 0], ground-truth=0, pred=0
[INFO] data=[1 1], ground-truth=1, pred=1


In [84]:
# define OR dataset
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([[0], [1], [1], [1]])

# obtain perceptron
p = Perceptron(X.shape[1], learning_rate=0.1)

# train
p.fit(X, y, epochs=20)

# now that our network is trained, loop over the data points
for (x, target) in zip(X, y):
    # make a prediction on the data point and display the result
    # to our console
    pred = p.predict(x)
    print("[INFO] data={}, ground-truth={}, pred={}".format(x, target[0], pred))

[INFO] data=[0 0], ground-truth=0, pred=0
[INFO] data=[0 1], ground-truth=1, pred=1
[INFO] data=[1 0], ground-truth=1, pred=1
[INFO] data=[1 1], ground-truth=1, pred=1


In [85]:
# define XOR dataset
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([[0], [1], [1], [0]])

# obtain perceptron
p = Perceptron(X.shape[1], learning_rate=0.1)

# train
p.fit(X, y, epochs=20)

# now that our network is trained, loop over the data points
for (x, target) in zip(X, y):
    # make a prediction on the data point and display the result
    # to our console
    pred = p.predict(x)
    print("[INFO] data={}, ground-truth={}, pred={}".format(x, target[0], pred))

[INFO] data=[0 0], ground-truth=0, pred=1
[INFO] data=[0 1], ground-truth=1, pred=0
[INFO] data=[1 0], ground-truth=1, pred=0
[INFO] data=[1 1], ground-truth=0, pred=0
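
The perceptron fails on XOR because XOR is not linearly separable: no single line can split the inputs that map to 1 from those that map to 0, so no choice of weights and bias reproduces it. The sketch below illustrates this with a brute-force search over a small, arbitrary grid of weights. It is an illustration, not a proof, and the grid range and the helper `separable_on_grid` are assumptions made for this example.

```python
import itertools

import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
targets = {
    "AND": np.array([0, 0, 0, 1]),
    "OR": np.array([0, 1, 1, 1]),
    "XOR": np.array([0, 1, 1, 0]),
}

# Candidate values for w1, w2 and the bias: -2.0, -1.5, ..., 2.0 (arbitrary grid).
grid = np.arange(-2.0, 2.5, 0.5)


def separable_on_grid(y):
    """Return True if some (w1, w2, b) on the grid reproduces y with a step unit."""
    for w1, w2, b in itertools.product(grid, repeat=3):
        pred = (X @ np.array([w1, w2]) + b > 0).astype(int)
        if np.array_equal(pred, y):
            return True
    return False


results = {name: separable_on_grid(y) for name, y in targets.items()}
print(results)
```

The search finds suitable weights for AND and OR but none for XOR, which matches the behaviour of the trained perceptron above.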
