Every day we witness the power of neural networks: they detect features, solve problems, and make you look like a mad scientist. So why are they so strong? The short answer: they gain their strength from the simplicity of their structure.

# Functions
Imagine a small box: you throw something into it and you get some result out of it. I'm sure you already know what I'm talking about: this box is a mathematical function.

You have some inputs in your hand, you define and run a function that generates output from those inputs, and you end up with outputs, or results.

What if we had the inputs and we also had the outputs? Then the missing piece of the puzzle would be the function itself, right? So let's look at the problem from a different angle and design the box that produces the outputs we have from the inputs we have. Perceptrons help solve this problem.
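To make the "missing box" idea concrete, here is a minimal sketch that recovers a function from input/output pairs. The hidden rule `y = 3x - 1` and the use of `np.polyfit` are assumptions chosen only for illustration; a perceptron finds its parameters differently, by iterative updates.

```python
import numpy as np

# Hypothetical hidden "box": y = 3x - 1 (chosen only for illustration)
inputs = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
outputs = 3.0 * inputs - 1.0

# We pretend not to know the rule and fit a straight line
# (a degree-1 polynomial) to the input/output pairs
slope, intercept = np.polyfit(inputs, outputs, deg=1)
print(slope, intercept)
```

The fitted slope and intercept play the same role as the weight and bias that a perceptron learns.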

# Perceptron
A perceptron is a mathematical model that multiplies each of its inputs by a weight, sums the results, and adds a bias value. During training, it compares the output it produces with the output you expect and updates the weight and bias parameters to get closer to the desired output.
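The description above can be sketched in a few lines of NumPy. This is a minimal version of the classic perceptron update rule, not the Keras model used later; the toy data, the step activation, and the `learning_rate` value are assumptions made for illustration.

```python
import numpy as np

# Toy, linearly separable data (assumed for illustration):
# two inputs per example and one binary target
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0.0, 0.0, 0.0, 1.0])

weights = np.zeros(2)
bias = 0.0
learning_rate = 1.0  # hypothetical choice; keeps the arithmetic exact

for epoch in range(50):
    mistakes = 0
    for inputs, expected in zip(X, y):
        # Weighted sum of the inputs plus the bias, passed through a step function
        output = 1.0 if inputs @ weights + bias > 0 else 0.0
        # Compare with the expected output and nudge the parameters by the error
        error = expected - output
        weights += learning_rate * error * inputs
        bias += learning_rate * error
        mistakes += int(error != 0)
    if mistakes == 0:
        break  # every example is classified correctly

predictions = (X @ weights + bias > 0).astype(float)
print(predictions)
```

Each update moves the weights and bias only when the output is wrong, which is the "compare and update" step in the definition.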

Let's stop thinking abstractly for a moment and bring this to life with a concrete scenario. Take the propositions you might remember from the topic of logic, in particular the "and" and "or" operations. For two inputs, "and" outputs 1 only when both inputs are 1, while "or" outputs 1 when at least one input is 1.

So we have inputs, and we also have outputs that vary according to those inputs. Hopefully that reminds you of the same thing: perceptrons. Now let's jump to the more fun side of things and solve this problem with a very simple neural network structure, which we call a perceptron.

Our perceptron will be a simple structure with 2 neurons in the first layer and only 1 neuron at the output. This will be enough to solve the "and gate" problem.

In [37]:
import keras
import numpy as np
# Define the inputs we have and the expected output for each input pair.
xs_and = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]], dtype = float)
ys_and = np.array([[0.0], [0.0], [0.0], [1.0]], dtype = float)

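model_and = keras.Sequential()
# First layer: 2 neurons, each receiving both inputs
model_and.add(keras.layers.Dense(units=2, input_shape=[2]))
# Output layer: a single neuron
model_and.add(keras.layers.Dense(units=1))

# Train with stochastic gradient descent on the mean squared error
model_and.compile(optimizer="sgd", loss="mean_squared_error", metrics=["accuracy"])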

model_and.fit(xs_and, ys_and, epochs=300, verbose=0)

<tensorflow.python.keras.callbacks.History at 0x7fa3830f7940>

Let's run the model and see whether our neural network really learned a structure that gives us the right result. Let's send it "0" and "0".

In [38]:
model_and.predict(np.array([[0.0, 0.0]]))

array([[-0.23038407]], dtype=float32)

Let's store every possible input from our table in a variable, then write a simple for loop that prints the model's output for each one.

In [39]:
pred_zero_zero = np.array([[0.0,0.0]]) # Expecting False
pred_zero_one = np.array([[0.0,1.0]])  # Expecting False
pred_one_zero = np.array([[1.0,0.0]])  # Expecting False
pred_one_one = np.array([[1.0,1.0]])   # Expecting True
predictions = [pred_zero_zero, pred_zero_one, pred_one_zero, pred_one_one]

for pred in predictions:
    print(model_and.predict(pred))

[[-0.23038407]]
[[0.23535606]]
[[0.26852098]]
[[0.73426116]]


What would we get if we set a threshold here, treating the output as "1" when it is greater than 0.5 and as "0" otherwise? In the code below, "Correct" stands for 1 (true) and "False" stands for 0.

In [40]:
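def prediction_and(pred):
    # Raw (unthresholded) network output, an array of shape (1, 1)
    x = model_and.predict(pred)
    print(x)
    # Threshold at 0.5: "Correct" means logical 1, "False" means logical 0
    if x > 0.5:
        print("Correct\n")
    else:
        print("False\n")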

for pred in predictions:
    prediction_and(pred)

[[-0.23038407]]
False

[[0.23535606]]
False

[[0.26852098]]
False

[[0.73426116]]
Correct
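As a quick cross-check, the same threshold can be applied to all four printed outputs at once and compared with the "and" truth table. This is only a sketch: the values are copied from the output above, and the names `raw_outputs` and `expected` are made up for the example.

```python
import numpy as np

# Raw model outputs copied from the printout above, in the order
# [0,0], [0,1], [1,0], [1,1]
raw_outputs = np.array([-0.23038407, 0.23535606, 0.26852098, 0.73426116])

# Expected "and" results for the same inputs
expected = np.array([0, 0, 0, 1])

# Apply the 0.5 threshold to every output in one step
thresholded = (raw_outputs > 0.5).astype(int)

print(thresholded)
print(np.array_equal(thresholded, expected))
```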



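We solved the "and gate" problem with a simple perceptron: after thresholding at 0.5, the outputs match the "and" truth table. Because neither layer uses an activation function, the two layers together still compute a single linear function of the inputs, so the model behaves like a single layer neural network.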

How could we reproduce this on paper with a simple calculator? To do that, we need the weight and bias values inside our neural network.

In [41]:
first_layer_weights_and = model_and.layers[0].get_weights()[0]
first_layer_biases_and  = model_and.layers[0].get_weights()[1]
print("FIRST LAYER WEIGHTS")
print(first_layer_weights_and)
print("\nFIRST LAYER BIASES")
print(first_layer_biases_and)

second_layer_weights_and = model_and.layers[1].get_weights()[0]
second_layer_biases_and  = model_and.layers[1].get_weights()[1]
print("\nOUTPUT LAYER WEIGHT")
print(second_layer_weights_and)
print("\n OUTPUT LAYER BIAS")
print(second_layer_biases_and)

FIRST LAYER WEIGHTS
[[-0.53747416  1.0522358 ]
 [-0.36312664 -0.56071895]]

FIRST LAYER BIASES
[ 0.1199667  -0.02607459]

OUTPUT LAYER WEIGHT
[[-1.126337  ]
 [-0.10118645]]

 OUTPUT LAYER BIAS
[-0.09789953]


We multiply the inputs to each neuron by their weights, add the products together, and then add the bias. Let's apply [1,1] to the inputs, using the weights and biases printed above (rounded to four decimals). In each weight matrix, one column holds the weights of one neuron.

The value of the first neuron in the Hidden Layer: (1.0 x -0.5375) + (1.0 x -0.3631) + 0.1200 = -0.7806

The value of the second neuron in the Hidden Layer: (1.0 x 1.0522) + (1.0 x -0.5607) + (-0.0261) = 0.4654

The value of the neuron in the Output Layer: (-0.7806 x -1.1263) + (0.4654 x -0.1012) + (-0.0979) = 0.7342

Up to rounding, this agrees with the 0.7343 that the model predicted for the input [1,1].
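The hand calculation can also be checked with NumPy. The sketch below copies the weights and biases from the printout above and repeats the multiply, sum, and add-bias steps for all four inputs; the variable names (`W1`, `b1`, `W2`, `b2`) are assumptions made for the example.

```python
import numpy as np

# Weights and biases copied from the printed output above
W1 = np.array([[-0.53747416, 1.0522358],
               [-0.36312664, -0.56071895]])   # first layer, one column per neuron
b1 = np.array([0.1199667, -0.02607459])
W2 = np.array([[-1.126337],
               [-0.10118645]])                # output layer
b2 = np.array([-0.09789953])

# All four input combinations, one per row
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])

# Each layer: multiply inputs by weights, sum, then add the bias
hidden = X @ W1 + b1
outputs = hidden @ W2 + b2

print(outputs)
```

The four values should line up with the predictions printed by the model earlier, up to floating-point rounding.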

# Convolutional Neural Networks
A convolutional neural network is a feed-forward neural network that is generally used to analyze visual images by processing data with grid-like topology. It’s also known as a ConvNet. A convolutional neural network is used to detect and classify objects in an image. 

Traditional neural networks, known as multilayer perceptrons (MLPs), are loosely modeled on the human brain: neurons are stimulated by connected nodes and are only activated when a certain threshold value is reached.
MLPs have several drawbacks, especially when it comes to image processing. An MLP takes one input for each value in the image (one per pixel, multiplied by 3 in the RGB case), and every neuron in the first hidden layer needs a weight for each of those inputs. The number of weights rapidly becomes unmanageable for large images. For a 224 x 224 pixel image with 3 color channels, each neuron in the first hidden layer alone has around 150,000 weights that must be trained!
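The arithmetic behind that figure is easy to reproduce. In this sketch, the hidden layer size of 1000 neurons is a hypothetical number chosen only to show how quickly the total grows.

```python
# Image size from the text: 224 x 224 pixels with 3 color channels
height, width, channels = 224, 224, 3

# Every value in the image is one input to a fully connected layer
inputs_per_image = height * width * channels

# Hypothetical first hidden layer with 1000 neurons (assumed for illustration)
hidden_units = 1000
first_layer_weights = inputs_per_image * hidden_units

print(inputs_per_image)     # weights needed by a single neuron
print(first_layer_weights)  # weights in the whole first layer
```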

Computers ‘see’ in a different way than we do. Their world consists of only numbers. Every image can be represented as 2-dimensional arrays of numbers, known as pixels. But the fact that they perceive images in a different way, doesn’t mean we can’t train them to recognize patterns, like we do. We just have to think of what an image is in a different way. To teach an algorithm how to recognise objects in images, we use a specific type of Artificial Neural Network: a Convolutional Neural Network (CNN). Their name stems from one of the most important operations in the network: convolution.







The concept of hierarchy plays a significant role in the brain. Information is stored in sequences of patterns, in sequential order. The neocortex, which is the outermost layer of the brain, stores information hierarchically. It is stored in cortical columns, or uniformly organised groupings of neurons in the neocortex.

Regular Neural Networks transform an input by passing it through a series of hidden layers. Every layer is made up of a set of neurons, and each neuron is fully connected to all neurons in the layer before. Finally, there is a last fully connected layer, the output layer, that represents the predictions.

Convolutional Neural Networks are a bit different. First of all, the layers are organised in 3 dimensions: width, height and depth. Further, the neurons in one layer do not connect to all the neurons in the next layer but only to a small region of it. Lastly, the final output will be reduced to a single vector of probability scores, organized along the depth dimension.


CNNs have two components:
* The hidden layers (feature extraction part): in this part, the network performs a series of convolution and pooling operations during which the features are detected. If you had a picture of a zebra, this is the part where the network would recognise its stripes, two ears, and four legs.
* The classification part: here, the fully connected layers serve as a classifier on top of the extracted features. They assign a probability that the object in the image is what the algorithm predicts it is.


Convolution is one of the main building blocks of a CNN. The term convolution refers to the mathematical combination of two functions to produce a third function. It merges two sets of information.

In a CNN, the convolution is performed on the input data using a filter (or kernel) to produce a feature map. We execute a convolution by sliding the filter over the input. At every location, the filter and the patch of input beneath it are multiplied element by element, and the sum of the products is written to the feature map. The area covered by the filter is also called the receptive field, a term borrowed from the neurons of the visual system.
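The sliding-window operation can be sketched in plain NumPy. This is a minimal, unoptimized version with a stride of 1 and no padding; like most deep learning libraries, it does not flip the kernel, so strictly speaking it computes a cross-correlation. The function name, the 4 x 4 input, and the 2 x 2 kernel are made up for the example.

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the feature map."""
    kernel_h, kernel_w = kernel.shape
    out_h = image.shape[0] - kernel_h + 1
    out_w = image.shape[1] - kernel_w + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Patch of the input currently covered by the filter (its receptive field)
            patch = image[i:i + kernel_h, j:j + kernel_w]
            # Element-wise multiplication followed by a sum
            feature_map[i, j] = np.sum(patch * kernel)
    return feature_map

# Hypothetical 4 x 4 "image" holding the values 0..15, and a 2 x 2 filter
image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[1.0, 0.0],
                   [0.0, -1.0]])

feature_map = convolve2d_valid(image, kernel)
print(feature_map.shape)
print(feature_map)
```

A 2 x 2 filter sliding over a 4 x 4 input fits in 3 x 3 positions, which is why the feature map is smaller than the input.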


We perform numerous convolutions on our input, where each operation uses a different filter. This results in different feature maps. In the end, we take all of these feature maps and stack them together as the final output of the convolution layer. Just like in any other Neural Network, we use an activation function to make our output non-linear: in a Convolutional Neural Network, the output of the convolution is passed through the activation function.
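Here is a small sketch of that idea: several filters are applied to the same input, the resulting feature maps are stacked along a depth axis, and ReLU is applied to the stack. The helper names, the input, and the two filters are assumptions for illustration, and a real layer would also add a bias per filter.

```python
import numpy as np

def conv_layer(image, filters):
    """Apply each filter to `image` (stride 1, no padding) and stack the feature maps."""
    n_filters, kernel_h, kernel_w = filters.shape
    out_h = image.shape[0] - kernel_h + 1
    out_w = image.shape[1] - kernel_w + 1
    maps = np.zeros((out_h, out_w, n_filters))
    for k in range(n_filters):
        for i in range(out_h):
            for j in range(out_w):
                patch = image[i:i + kernel_h, j:j + kernel_w]
                maps[i, j, k] = np.sum(patch * filters[k])
    return maps

def relu(x):
    """ReLU activation: negative values become 0, positive values pass through."""
    return np.maximum(x, 0.0)

# Hypothetical 4 x 4 input and two 2 x 2 filters with opposite signs
image = np.arange(16, dtype=float).reshape(4, 4)
filters = np.array([
    [[1.0, 0.0], [0.0, -1.0]],
    [[-1.0, 0.0], [0.0, 1.0]],
])

activated = relu(conv_layer(image, filters))
print(activated.shape)  # height, width, number of filters
```

The last axis of the result has one entry per filter, which is the "depth" dimension mentioned earlier.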

In [42]:
import numpy as np

In [43]:
from keras.layers import Conv2D, Activation, MaxPool2D, Flatten, Dense
from keras.models import Sequential

In [44]:
# Images fed into this model are 28 x 28 pixels with 1 channel (grayscale)
img_shape = (28,28,1)

In [45]:
# Set up the model
model = Sequential()

In [46]:
# Add a convolutional layer with 6 filters of size 2 x 2 and a stride of 1
# Padding defaults to "valid", so the 28 x 28 input shrinks to 27 x 27
model.add(Conv2D(6,2,input_shape=img_shape))

In [47]:
# Add relu activation to the layer 
model.add(Activation('relu'))

In [48]:
# Pooling: 2 x 2 max pooling roughly halves the width and height (27 -> 13)
model.add(MaxPool2D(2))

In [49]:
#Fully connected layers
# Use Flatten to convert 3D data to 1D
model.add(Flatten())
# Add dense layer with 10 neurons
model.add(Dense(10))

In [50]:
# we use the softmax activation function for our last layer
model.add(Activation('softmax'))

In [51]:
# give an overview of our model
model.summary()

Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_2 (Conv2D)            (None, 27, 27, 6)         30        
_________________________________________________________________
activation_5 (Activation)    (None, 27, 27, 6)         0         
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 13, 13, 6)         0         
_________________________________________________________________
flatten_3 (Flatten)          (None, 1014)              0         
_________________________________________________________________
dense_4 (Dense)              (None, 10)                10150     
_________________________________________________________________
activation_6 (Activation)    (None, 10)                0         
Total params: 10,180
Trainable params: 10,180
Non-trainable params: 0
__________________________________________________
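The shapes and parameter counts in the summary can be reproduced by hand. The sketch below assumes "valid" padding and a stride of 1 for the convolution, and a pooling window of 2, as configured above; the helper name is made up for the example.

```python
def conv_output_size(input_size, kernel_size, stride=1):
    """Output width/height of a convolution without padding."""
    return (input_size - kernel_size) // stride + 1

# Conv2D: 6 filters of size 2 x 2 on a 28 x 28 x 1 input
conv_size = conv_output_size(28, 2)           # width and height after the convolution
conv_params = 6 * (2 * 2 * 1 + 1)             # weights per filter plus one bias each

# MaxPool2D(2): each 2 x 2 window collapses to one value, remainder dropped
pool_size = conv_size // 2

# Flatten: all pooled values in one vector
flat_size = pool_size * pool_size * 6

# Dense(10): one weight per input per neuron, plus one bias per neuron
dense_params = flat_size * 10 + 10

total_params = conv_params + dense_params
print(conv_size, pool_size, flat_size)
print(conv_params, dense_params, total_params)
```

Almost all of the parameters sit in the dense layer, while the convolutional layer needs only a handful.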

In [52]:
"""Before the training process, we have to put together a learning process in a particular form. 
It consists of 3 elements: an optimiser, a loss function and a metric."""
model.compile(loss='sparse_categorical_crossentropy', optimizer = 'adam', metrics=['acc'])

In [53]:
# dataset with handwritten digits to train the model on
from keras.datasets import mnist

In [54]:
(x_train, y_train), (x_test, y_test) = mnist.load_data()

In [55]:
# MNIST images have shape (N, 28, 28); add a trailing channel axis to match img_shape
x_train = np.expand_dims(x_train,-1)
x_test = np.expand_dims(x_test,-1)
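To see what `np.expand_dims` does here without downloading the dataset, the sketch below uses a dummy batch of five blank 28 x 28 images; the array and its size are assumptions for illustration.

```python
import numpy as np

# Dummy batch of 5 grayscale images, shaped like raw MNIST data: (N, 28, 28)
batch = np.zeros((5, 28, 28))

# Add a channel axis at the end so each image becomes 28 x 28 x 1
batch_with_channel = np.expand_dims(batch, -1)

print(batch.shape)
print(batch_with_channel.shape)
```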

In [None]:
# Train the model, iterating on the data in batches of 32 samples for 10 epochs
model.fit(x_train, y_train, batch_size=32, epochs=10, validation_data=(x_test,y_test))

Epoch 1/10
Epoch 2/10