# Deep Learning vs Machine Learning vs AI
* **AI**: the broad field of building systems that mimic human behaviour.
* **Machine Learning**: a subset of AI that uses statistical methods to learn from data and make predictions.
* **Deep Learning**: a subset of machine learning that trains neural networks on **large volumes** of structured and unstructured data to fit **complex models**.
* Deep learning often outperforms classical machine learning when large volumes of high-dimensional data are available, because it can train more complex models to higher accuracy.

# Perceptron
* The **simplest** form of a neural network.
* It's a **single-layer** neural network with one or more inputs, a set of weights, and a bias; the weighted sum of the inputs plus the bias is passed through an activation function to produce the output.
* The basic perceptron has no hidden layers and no backpropagation; its weights are updated by the simple rule below.
* The learning rule adjusts the weights whenever a training example is misclassified, so that the error is reduced.
* Learning rule:
    1. Initialize the threshold, weights, and bias.
    2. Provide an input and calculate the output.
    3. Update the weights and bias if the output is wrong.
    4. Repeat steps 2 and 3 for each training example until the examples are classified correctly or a maximum number of passes is reached.
        

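The learning rule above can be sketched as follows. This is a minimal illustration, not a reference implementation: the step activation, the zero initialization, the learning rate of 1.0, and the logical AND dataset are all assumptions chosen to keep the example small.

```python
import numpy as np

def step(z):
    """Threshold activation: 1 if the weighted sum is non-negative, else 0."""
    return np.where(z >= 0, 1, 0)

def train_perceptron(X, y, learning_rate=1.0, epochs=20):
    # Step 1: initialize weights and bias (zeros here, an assumption)
    weights = np.zeros(X.shape[1])
    bias = 0.0
    for _ in range(epochs):                      # Step 4: repeat
        for x_i, target in zip(X, y):
            # Step 2: provide an input and calculate the output
            prediction = step(np.dot(x_i, weights) + bias)
            # Step 3: update only when the prediction is wrong (error != 0)
            error = target - prediction
            weights += learning_rate * error * x_i
            bias += learning_rate * error
    return weights, bias

# Toy dataset: logical AND, which is linearly separable
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1])

weights, bias = train_perceptron(X, y)
predictions = step(X @ weights + bias)
print(predictions)  # matches y once training has converged
```

Because the update is proportional to the error, correctly classified examples leave the weights unchanged.
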
# Neural Network
* It is a model inspired by the structure and functioning of the **human brain**.
* It consists of **interconnected nodes**, or **artificial neurons**, organized in layers.
* Information is processed and passed through these neurons. Each connection has a **weight** that determines the strength of the signal; for a single input, the weight is the **slope of the prediction line**.
* **Biases** are associated with each artificial neuron (node) in a layer. A bias term is independent of the input data and is used to **shift the output** of the neuron (it moves the prediction line along the axis).
* **Epoch**: one complete pass of the algorithm over the whole training dataset. The number of epochs is the number of such passes.
* **Sample**: a single row of a dataset.
* **Batch**: the group of samples processed before the model parameters are updated. The batch size is the number of samples in that group.
* **Learning rate**: a hyperparameter that scales how much the model weights are changed at each update.

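How epochs, batches, and the learning rate fit together can be sketched with a small training loop. The linear model, the mean squared error gradient, the toy data, and all variable names here are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 2))            # 8 samples (rows), 2 features each
y = X @ np.array([2.0, -1.0]) + 0.5    # toy targets from a known linear rule

weights = np.zeros(2)
bias = 0.0
learning_rate = 0.05   # scales the size of every update
batch_size = 4         # samples used per parameter update
epochs = 100           # full passes over the dataset
n_updates = 0

for epoch in range(epochs):                       # one epoch = one pass over X
    for start in range(0, len(X), batch_size):    # one batch per iteration
        X_batch = X[start:start + batch_size]
        y_batch = y[start:start + batch_size]
        error = X_batch @ weights + bias - y_batch
        # Gradients of the mean squared error on this batch
        grad_w = 2 * X_batch.T @ error / len(X_batch)
        grad_b = 2 * error.mean()
        weights -= learning_rate * grad_w
        bias -= learning_rate * grad_b
        n_updates += 1

print(n_updates)  # 100 epochs x 2 batches per epoch = 200 updates
```
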
# One Layer Neural Network Architecture
<img src="images/detailed[1].png" style="width:600px; height:300px;">

* **Activation Function** :
    * It is applied to the **weighted sum of inputs to the neuron (including the bias term)**, and the result becomes the output of the neuron, which is then passed to the next layer.
    * It introduces **non-linearity** into the model so that it can learn **complex patterns**; with enough neurons, this lets a network approximate a very wide range of functions.
    * It decides whether, and how strongly, a **neuron activates**; the classic perceptron does this with a hard **threshold value**.
* **Vectorizing the neural network** : 
<img src="images/vector[1].png" style="width:300px; height:200px;">

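The two ideas above, applying an activation to the weighted sum and vectorizing the layer, can be sketched together. The sigmoid activation, the example numbers, and the names `W`, `b`, `z`, and `a` are assumptions; the figures may use different symbols.

```python
import numpy as np

def sigmoid(z):
    """A common non-linear activation that squashes values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([1.0, 2.0, 3.0])            # one sample with 3 features
W = np.array([[0.2, -0.1, 0.0],
              [0.0, 0.3, -0.2]])          # 2 neurons, one row of weights each
b = np.array([0.0, 0.5])                  # one bias per neuron

# Neuron-by-neuron version: weighted sum of inputs plus bias
z_loop = np.array([sum(w_i * x_i for w_i, x_i in zip(row, x)) + bias
                   for row, bias in zip(W, b)])

# Vectorized version: one matrix-vector product for the whole layer
z = W @ x + b
a = sigmoid(z)    # activation applied element-wise

print(z)
print(a)
```

Both versions compute the same weighted sums; the vectorized form simply replaces the explicit loops with a single matrix product.
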
# Two Layer Neural Network Architecture
<img src="images/detailed[2].png" style="width:300px; height:400px;">

* **Formula expansion** :
<img src="images/formula[2].png" style="width:500px; height:250px;">

* **Vectorizing the neural network** :
<img src="images/vector[2][1].jpeg" style="width:500px; height:500px;">
<img src="images/vector[2][2].jpeg" style="width:500px; height:250px;">

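For reference, a two-layer forward pass is commonly written in vectorized form as below. The bracketed layer superscripts and the symbol $g$ for the activation function are assumed notation and may differ from the figures above.

```latex
\begin{aligned}
z^{[1]} &= W^{[1]} x + b^{[1]}, & a^{[1]} &= g^{[1]}\!\left(z^{[1]}\right) \\
z^{[2]} &= W^{[2]} a^{[1]} + b^{[2]}, & a^{[2]} &= g^{[2]}\!\left(z^{[2]}\right)
\end{aligned}
```

Here $a^{[1]}$, the output of the first layer, plays the role of the input to the second layer.
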
# M-Samples
<img src= "images/m samples.jpeg" style="width:500px; height:500px;">


# Basic implementation

## Single Layer, Single Neuron Architecture

<img src= "images/simple[1].jpeg" style="width:300px; height:200px;">

In [1]:
inputs = [1.0, 2.0, 3.0]
weights = [2.0, 4.0, 6.0]
bias = 3.0

output = inputs[0]*weights[0] + inputs[1]*weights[1] + inputs[2]*weights[2] + bias
print(output)

31.0


## Single Layer, Multiple Neuron Architecture
<img src= "images/simple[2].jpeg" style="width:500px; height:500px;">

In [3]:
inputs = [1.0, 2.0, 3.0, 4.0]

weights1 = [0.01, 0.02, 0.03, 0.04]
weights2 = [0.02, 0.03, 0.04, 0.05]
weights3 = [0.03, 0.04, 0.05, 0.06]

bias1 = 2.0
bias2 = 3.0
bias3 = 4.0

output = [inputs[0]*weights1[0] + inputs[1]*weights1[1] + inputs[2]*weights1[2] + inputs[3]*weights1[3] + bias1,
          inputs[0]*weights2[0] + inputs[1]*weights2[1] + inputs[2]*weights2[2] + inputs[3]*weights2[3] + bias2,
          inputs[0]*weights3[0] + inputs[1]*weights3[1] + inputs[2]*weights3[2] + inputs[3]*weights3[3] + bias3]
print(output)

[2.3, 3.4, 4.5]


## Dot Product

In [2]:
import numpy as np
inputs = [1.0, 2.0, 3.0]
weights = [2.0, 4.0, 6.0]
bias = 3.0

output = np.dot(inputs, weights) + bias
print(output)

31.0


<img src= "images/dotproduct.jpeg" style="width:500px; height:500px;">

In [None]:
inputs = [1.0, 2.0, 3.0, 4.0]

weights = [[0.01, 0.02, 0.03, 0.04] ,[0.02, 0.03, 0.04, 0.05] , [0.03, 0.04, 0.05, 0.06]]

bias = [2.0 , 3.0, 4.0]

output = np.dot(weights, inputs) + bias
print(output)

## Multiple Samples
<img src= "images/msample-vector.jpeg" style="width:500px; height:500px;">

In [23]:
# the input is a matrix of multiple samples

inputs = [[1.0, 2.0, 3.0, 4.0], [2.0,3.0,4.0,5.0],[3.0,4.0,5.0,6.0]]

weights = [[0.01, 0.02, 0.03, 0.04] ,[0.02, 0.03, 0.04, 0.05] , [0.03, 0.04, 0.05, 0.06]]

bias = [2.0 , 3.0 , 4.0]

output = np.dot(inputs,np.array(weights).T) + bias

print(output)

[[2.3  3.4  4.5 ]
 [2.4  3.54 4.68]
 [2.5  3.68 4.86]]

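The transpose in the cell above is what makes the shapes line up. The sketch below reuses the same numbers to show the shapes involved and to check that each output row equals the single-sample computation; it is an illustration, not part of the original notebook.

```python
import numpy as np

inputs = np.array([[1.0, 2.0, 3.0, 4.0],
                   [2.0, 3.0, 4.0, 5.0],
                   [3.0, 4.0, 5.0, 6.0]])        # shape (3 samples, 4 features)
weights = np.array([[0.01, 0.02, 0.03, 0.04],
                    [0.02, 0.03, 0.04, 0.05],
                    [0.03, 0.04, 0.05, 0.06]])   # shape (3 neurons, 4 features)
bias = np.array([2.0, 3.0, 4.0])                 # shape (3 neurons,)

# (3, 4) @ (4, 3) -> (3, 3): one row per sample, one column per neuron
output = inputs @ weights.T + bias               # bias is broadcast over rows
print(inputs.shape, weights.T.shape, output.shape)

# Each row matches the single-sample form used earlier
for i, sample in enumerate(inputs):
    assert np.allclose(output[i], weights @ sample + bias)
```
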

## Multiple layers
<img src= "images/2layer.jpeg" style="width:500px; height:500px;">

In [24]:
inputs = [[1.0, 2.0, 3.0, 4.0], [2.0,3.0,4.0,5.0],[3.0,4.0,5.0,6.0]]

weights_1 = [[0.01, 0.02, 0.03, 0.04] ,[0.02, 0.03, 0.04, 0.05] , [0.03, 0.04, 0.05, 0.06]]

bias_1 = [2.0 , 3.0, 4.0]

output_layer1 = np.dot(inputs,np.array(weights_1).T) + bias_1

print(output_layer1)

weights_2 = [[0.06, 0.07, 0.08] ,[0.04, 0.03, 0.02] , [0.01, 0.04, 0.04]]

bias_2 = [5.0 , 3.0, 8.0]

output_layer2 = np.dot(output_layer1,np.array(weights_2).T) + bias_2

print(output_layer2)

[[2.3  3.4  4.5 ]
 [2.4  3.54 4.68]
 [2.5  3.68 4.86]]
[[5.736  3.284  8.339 ]
 [5.7662 3.2958 8.3528]
 [5.7964 3.3076 8.3666]]


## Creating object layers

In [26]:
import numpy as np

np.random.seed(10)

inputs = [[1.0, 2.0, 3.0, 4.0], [2.0,3.0,4.0,5.0],[3.0,4.0,5.0,6.0]]


class Layer_Dense:

    def __init__(self, n_inputs, n_neurons):
        self.weights = 0.10 * np.random.randn(n_inputs, n_neurons)
        self.biases = np.zeros((1, n_neurons))

    def forward(self, inputs):
        self.output = np.dot(inputs, self.weights) + self.biases


layer1 = Layer_Dense(4,3)
layer2 = Layer_Dense(3,3)
print(layer1.biases)
layer1.forward(inputs)
# print(layer1.output)
layer2.forward(layer1.output)
print(layer2.output)

[[0. 0. 0.]]
[[ 0.03154683 -0.05097602 -0.02772224]
 [ 0.01055149 -0.04635657 -0.00102355]
 [-0.01044384 -0.04173711  0.02567514]]


## ReLU Activation

In [16]:
import numpy as np
np.random.seed(10)

X = [[1.0, 2.0, 3.0, 4.0], [2.0,3.0,4.0,5.0],[3.0,4.0,5.0,6.0]]


class Layer_Dense:
    def __init__(self, n_inputs, n_neurons):
        self.weights = 0.10 * np.random.randn(n_inputs, n_neurons)
        self.biases = np.zeros((1, n_neurons))
    def forward(self, inputs):
        self.output = np.dot(inputs, self.weights) + self.biases


class Activation_ReLU:
    def forward(self, inputs):
        self.output = np.maximum(0, inputs)


layer1 = Layer_Dense(4,2)
activation1 = Activation_ReLU()

layer1.forward(X)

#print(layer1.output)
activation1.forward(layer1.output)
print(activation1.output)

[[0.11668402 0.        ]
 [0.1839874  0.        ]
 [0.25129077 0.        ]]


## Softmax Activation

In [23]:
import numpy as np
np.random.seed(10)

X = [[1.0, 2.0, 3.0, 4.0], [2.0,3.0,4.0,5.0],[3.0,4.0,5.0,6.0]]


class Layer_Dense:
    def __init__(self, n_inputs, n_neurons):
        self.weights = 0.10 * np.random.randn(n_inputs, n_neurons)
        self.biases = np.zeros((1, n_neurons))
    def forward(self, inputs):
        self.output = np.dot(inputs, self.weights) + self.biases


class Activation_ReLU:
    def forward(self, inputs):
        self.output = np.maximum(0, inputs)
class Activation_SoftMax:
    def forward(self, inputs):
        # Subtract the row-wise maximum before exponentiating for numerical stability
        exp_values = np.exp(inputs - np.max(inputs, axis=-1, keepdims=True))
        print(exp_values)
        self.output = exp_values / np.sum(exp_values, axis=-1, keepdims=True)


layer1 = Layer_Dense(4,2)
activation1 = Activation_ReLU()

layer1.forward(X)
print(layer1.output)

activation1.forward(layer1.output)
print(activation1.output)

layer2 = Layer_Dense(2,3)
activation2 = Activation_SoftMax()


layer2.forward(activation1.output)
print(layer2.output)

activation2.forward(layer2.output)
print(activation2.output)

[[ 0.11668402 -0.10275513]
 [ 0.1839874  -0.09321932]
 [ 0.25129077 -0.08368351]]
[[0.11668402 0.        ]
 [0.1839874  0.        ]
 [0.25129077 0.        ]]
[[ 5.00741406e-05 -2.03730542e-03  5.05272359e-03]
 [ 7.89569201e-05 -3.21242380e-03  7.96713609e-03]
 [ 1.07839700e-04 -4.38754218e-03  1.08815486e-02]]
[[0.99500984 0.99293505 1.        ]
 [0.99214285 0.9888827  1.        ]
 [0.98928412 0.98484689 1.        ]]
[[0.3330081  0.33231371 0.33467819]
 [0.3328193  0.33172567 0.33545502]
 [0.33262964 0.3311377  0.33623267]]

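Subtracting the row-wise maximum before exponentiating does not change the softmax result, but it keeps `np.exp` from overflowing on large inputs. The sketch below compares a naive and a shifted version; the function names and test values are hypothetical and chosen only for illustration.

```python
import numpy as np

def softmax_naive(z):
    exp_values = np.exp(z)
    return exp_values / np.sum(exp_values, axis=-1, keepdims=True)

def softmax_stable(z):
    # Shift each row so its largest entry is 0 before exponentiating
    exp_values = np.exp(z - np.max(z, axis=-1, keepdims=True))
    return exp_values / np.sum(exp_values, axis=-1, keepdims=True)

small = np.array([[1.0, 2.0, 3.0]])
large = np.array([[1000.0, 1001.0, 1002.0]])

# On small inputs the two versions agree
same_on_small = np.allclose(softmax_naive(small), softmax_stable(small))
print(same_on_small)

# On large inputs exp() overflows to inf, and inf / inf gives nan
with np.errstate(over="ignore", invalid="ignore"):
    naive_large = softmax_naive(large)
stable_large = softmax_stable(large)

print(naive_large)
print(stable_large)   # finite probabilities that sum to 1
```
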

## Calculating Loss - Categorical Cross Entropy

In [11]:
import numpy as np
np.random.seed(10)

X = [[1.0, 2.0, 3.0, 4.0], [2.0,3.0,4.0,5.0],[3.0,4.0,5.0,6.0]]


class Layer_Dense:
    def __init__(self, n_inputs, n_neurons):
        self.weights = 0.10 * np.random.randn(n_inputs, n_neurons)
        self.biases = np.zeros((1, n_neurons))
    def forward(self, inputs):
        self.output = np.dot(inputs, self.weights) + self.biases


class Activation_ReLU:
    def forward(self, inputs):
        self.output = np.maximum(0, inputs)
class Activation_SoftMax:
    def forward(self, inputs):
        exp_values = np.exp(inputs - np.max(inputs, axis=-1, keepdims=True))
        print(exp_values)
        self.output = exp_values / np.sum(exp_values, axis=-1, keepdims=True)

class Loss_CategoricalCrossEntropy:
    def forward(self, predicted_prob, true_labels):
        # Clip probabilities away from 0 and 1 so that log(0) never occurs
        epsilon = 1e-15
        clipped = np.clip(predicted_prob, epsilon, 1 - epsilon)
        if len(true_labels.shape) == 1:
            # Sparse labels: one class index per sample
            loss = np.mean(-np.log(clipped[range(len(predicted_prob)), true_labels]))
        elif len(true_labels.shape) == 2:
            # One-hot labels: one row per sample
            loss = np.mean(-np.log(np.sum(clipped * true_labels, axis=1)))
        self.output = loss

layer1 = Layer_Dense(4,2)
activation1 = Activation_ReLU()

layer1.forward(X)
print(layer1.output)

activation1.forward(layer1.output)
print(activation1.output)

layer2 = Layer_Dense(2,3)
activation2 = Activation_SoftMax()


layer2.forward(activation1.output)
print(layer2.output)

activation2.forward(layer2.output)
print(activation2.output)

# true_labels = np.array([1,1,1])

true_labels = np.array([[0,1,0],[0,1,0],[0,1,0]])
loss = Loss_CategoricalCrossEntropy()
loss.forward(activation2.output, true_labels)
print(loss.output)

[[ 0.11668402 -0.10275513]
 [ 0.1839874  -0.09321932]
 [ 0.25129077 -0.08368351]]
[[0.11668402 0.        ]
 [0.1839874  0.        ]
 [0.25129077 0.        ]]
[[ 5.00741406e-05 -2.03730542e-03  5.05272359e-03]
 [ 7.89569201e-05 -3.21242380e-03  7.96713609e-03]
 [ 1.07839700e-04 -4.38754218e-03  1.08815486e-02]]
[[0.99500984 0.99293505 1.        ]
 [0.99214285 0.9888827  1.        ]
 [0.98928412 0.98484689 1.        ]]
[[0.3330081  0.33231371 0.33467819]
 [0.3328193  0.33172567 0.33545502]
 [0.33262964 0.3311377  0.33623267]]
1.1034479290784434

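The two label formats handled by the loss class give the same result, and the clipping step keeps the loss finite when a predicted probability is exactly 0. The sketch below shows both points with a standalone function; the function name is hypothetical, and the probabilities are the rounded softmax outputs printed above.

```python
import numpy as np

def categorical_cross_entropy(predicted_prob, true_labels, epsilon=1e-15):
    # Clip so that log(0) cannot occur
    clipped = np.clip(predicted_prob, epsilon, 1 - epsilon)
    if true_labels.ndim == 1:
        # Sparse labels: pick the probability of the true class in each row
        correct = clipped[np.arange(len(clipped)), true_labels]
    else:
        # One-hot labels: the mask keeps only the true-class probability
        correct = np.sum(clipped * true_labels, axis=1)
    return np.mean(-np.log(correct))

probs = np.array([[0.3330081, 0.33231371, 0.33467819],
                  [0.3328193, 0.33172567, 0.33545502],
                  [0.33262964, 0.3311377, 0.33623267]])

sparse_labels = np.array([1, 1, 1])
one_hot_labels = np.eye(3)[sparse_labels]   # [[0, 1, 0], ...]

loss_sparse = categorical_cross_entropy(probs, sparse_labels)
loss_one_hot = categorical_cross_entropy(probs, one_hot_labels)
print(loss_sparse, loss_one_hot)            # both about 1.1034

# Without clipping, a probability of exactly 0 would give an infinite loss
clipped_loss = categorical_cross_entropy(np.array([[1.0, 0.0]]), np.array([1]))
print(clipped_loss)                         # -log(1e-15), about 34.54
```
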