## Single Neuron

In [6]:
inputs = [1.0, 2.0, 3.0, 2.5]
weights = [0.2, 0.8, -0.5, 1.0]
bias = 2.0

output = [
    inputs[0]*weights[0] +
    inputs[1]*weights[1] +
    inputs[2]*weights[2] +
    inputs[3]*weights[3] + bias
]

print(output)

[4.8]
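
The same weighted sum can be written without hard-coded indices. The sketch below is one possible generalization, not part of the original notebook; the helper name `neuron_output` is hypothetical.

```python
def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus the bias (hypothetical helper)."""
    # Pair each input with its weight, multiply, and add up the products.
    return sum(x * w for x, w in zip(inputs, weights)) + bias


inputs = [1.0, 2.0, 3.0, 2.5]
weights = [0.2, 0.8, -0.5, 1.0]
bias = 2.0

output = neuron_output(inputs, weights, bias)
print(output)
```

Because the helper iterates over `zip(inputs, weights)`, it works for any number of inputs as long as both lists have the same length.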


## Layer of Neurons 
- Neural networks typically have layers that consist of more than one neuron.
- A layer is nothing more than a group of neurons that all receive the same inputs.
- In a fully connected (dense) layer, every neuron is connected to every neuron in the previous layer, so each neuron has its own weight for every input plus its own bias.

In [5]:
inputs = [1, 2, 3, 2.5]

weights1 = [0.2, 0.8, -0.5, 1]
weights2 = [0.5, -0.91, 0.26, -0.5]
weights3 = [-0.26, 0.27, 0.17, 0.87]

bias1 = 2.0
bias2 = 3.0
bias3 = 0.5

outputs = [
    
    #Neuron 1:
    inputs[0] * weights1[0] +
    inputs[1] * weights1[1] +
    inputs[2] * weights1[2] +
    inputs[3] * weights1[3] + bias1,
    
    #Neuron 2:
    inputs[0] * weights2[0] +
    inputs[1] * weights2[1] +
    inputs[2] * weights2[2] +
    inputs[3] * weights2[3] + bias2,
    
    #Neuron 3:
    inputs[0] * weights3[0] +
    inputs[1] * weights3[1] +
    inputs[2] * weights3[2] +
    inputs[3] * weights3[3] + bias3
]

print(outputs)

[4.8, 1.21, 3.465]
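
Writing out every product by hand does not scale beyond a few neurons. As a rough sketch, the same layer can be computed with a loop over the neurons; the variable names `layer_outputs`, `neuron_weights`, and `neuron_bias` are illustrative choices, not taken from the notebook.

```python
inputs = [1, 2, 3, 2.5]

# One row of weights and one bias per neuron
weights = [
    [0.2, 0.8, -0.5, 1],
    [0.5, -0.91, 0.26, -0.5],
    [-0.26, 0.27, 0.17, 0.87],
]
biases = [2.0, 3.0, 0.5]

layer_outputs = []
for neuron_weights, neuron_bias in zip(weights, biases):
    # Weighted sum of all inputs for this neuron, then add its bias
    weighted_sum = sum(x * w for x, w in zip(inputs, neuron_weights))
    layer_outputs.append(weighted_sum + neuron_bias)

print(layer_outputs)
```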


## A Single Neuron with NumPy
NumPy's `np.dot` computes the whole weighted sum in a single call, which makes the code simpler to read and write, and faster to run:

In [8]:
import numpy as np 

inputs = [1.0, 2.0, 3.0, 2.5]
weights = [0.2, 0.8, -0.5, 1.0]
bias = 2.0

outputs = np.dot(weights, inputs) + bias 
print(outputs)

4.8


## A Layer of Neurons with NumPy

In [19]:
inputs = np.array([1.0, 2.0, 3.0, 2.5])
weights = np.array([
    [0.2, 0.8, -0.5, 1],
    [0.5, -0.91, 0.26, -0.5],
    [-0.26, 0.27, 0.17, 0.87]
])

biases = np.array([2.0, 3.0, 0.5])

outputs = np.dot(weights, inputs) + biases

print(outputs)
print('{} x  {} = {}'.format(weights.shape, inputs.shape, outputs.shape))

[4.8   1.21  3.465]
(3, 4) x  (4,) = (3,)
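
The shape line above shows a (3, 4) weight matrix multiplied by a length-4 input vector, giving one value per neuron. The sketch below illustrates, under that reading, that the matrix product is the same as taking the dot product of each weight row with the inputs, and that the operand order matters. The names `matrix_form`, `row_by_row`, and `swap_failed` are illustrative.

```python
import numpy as np

inputs = np.array([1.0, 2.0, 3.0, 2.5])
weights = np.array([
    [0.2, 0.8, -0.5, 1],
    [0.5, -0.91, 0.26, -0.5],
    [-0.26, 0.27, 0.17, 0.87],
])
biases = np.array([2.0, 3.0, 0.5])

# Matrix-vector product: one dot product per row of weights
matrix_form = np.dot(weights, inputs) + biases
row_by_row = np.array([np.dot(row, inputs) for row in weights]) + biases
print(np.allclose(matrix_form, row_by_row))

# Swapping the operands gives (4,) x (3, 4): the inner dimensions
# (4 and 3) do not match, so NumPy raises a ValueError.
try:
    np.dot(inputs, weights)
    swap_failed = False
except ValueError:
    swap_failed = True
print(swap_failed)
```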


## A Layer of Neurons & a Batch of Data with NumPy

In [22]:
inputs = np.array([
    [1.0, 2.0, 3.0, 2.5],
    [2.0, 5.0, -1.0, 2.0],
    [-1.5, 2.7, 3.3, -0.8]
])

weights = np.array([
    [0.2, 0.8, -0.5, 1.0],
    [0.5, -0.91, 0.26, -0.5],
    [-0.26, -0.27, 0.17, 0.87]
])

biases = np.array([2.0, 3.0, 0.5])

print('shape of inputs: ',inputs.shape)
print('shape of weights: ',weights.shape)

outputs = np.dot(inputs, weights.T) + biases
print('outputs: ', outputs)

shape of inputs:  (3, 4)
shape of weights:  (3, 4)
outputs:  [[ 4.8    1.21   2.385]
 [ 8.9   -1.81   0.2  ]
 [ 1.41   1.051  0.026]]
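
Both `inputs` and `weights` have shape (3, 4) here, so they cannot be multiplied directly: the weights are transposed to (4, 3) so that the inner dimensions match, and the length-3 bias vector is then added to every row by broadcasting. The following sketch spells out those two steps and checks one entry by hand; the names `weights_t` and `manual` are illustrative.

```python
import numpy as np

inputs = np.array([
    [1.0, 2.0, 3.0, 2.5],
    [2.0, 5.0, -1.0, 2.0],
    [-1.5, 2.7, 3.3, -0.8],
])  # 3 samples, 4 features each
weights = np.array([
    [0.2, 0.8, -0.5, 1.0],
    [0.5, -0.91, 0.26, -0.5],
    [-0.26, -0.27, 0.17, 0.87],
])  # 3 neurons, 4 weights each
biases = np.array([2.0, 3.0, 0.5])

weights_t = weights.T  # shape (4, 3): one column per neuron
# (3, 4) x (4, 3) -> (3, 3); biases of shape (3,) broadcast across the rows
outputs = np.dot(inputs, weights_t) + biases

# Entry [i, j] is sample i passed through neuron j
manual = np.dot(inputs[1], weights[2]) + biases[2]
print(outputs.shape, np.isclose(outputs[1, 2], manual))
```

Each row of `outputs` belongs to one sample and each column to one neuron.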


## Adding a Layer

In [27]:
inputs = np.array([
    [1.0, 2.0, 3.0, 2.5],
    [2.0, 5.0, -1.0, 2.0],
    [-1.5, 2.7, 3.3, -0.8]
])

weights1 = np.array([
    [0.2, 0.8, -0.5, 1.0],
    [0.5, -0.91, 0.26, -0.5],
    [-0.26, -0.27, 0.17, 0.87]
])  # 3 neurons, 4 inputs each
biases1 = np.array([2.0, 3.0, 0.5])

weights2 = np.array(
    [[0.1, -0.14, 0.5],
    [-0.5, 0.12, -0.33],
    [-0.44, 0.73, -0.13]    
])  #3 Neurons
biases2 = np.array([-1, 2, -0.5])

print('shape of inputs: ',inputs.shape)
print('shape of weights in first hidden layer: ',weights1.shape)
print('shape of weights in second hidden layer: ',weights2.shape)

layer1_outputs = np.dot(inputs, weights1.T) + biases1
print('shape of layer1 outputs', layer1_outputs.shape)
# Transpose weights2 as well, so that each row of weights2 holds one neuron's weights
layer2_outputs = np.dot(layer1_outputs, weights2.T) + biases2
print('shape of layer2 outputs', layer2_outputs.shape)

print('Outputs: ', layer2_outputs)

shape of inputs:  (3, 4)
shape of weights in first hidden layer:  (3, 4)
shape of weights in second hidden layer:  (3, 3)
shape of layer1 outputs (3, 3)
shape of layer2 outputs (3, 3)
Outputs:  [[ 0.5031  -1.04185 -2.03875]
 [ 0.2434  -2.7332  -5.7633 ]
 [-0.99314  1.41254 -0.35655]]


# Activation Functions


## Sigmoid Function
The sigmoid activation function is a mathematical function commonly used in artificial neural networks to introduce non-linearity into the output of a neuron. It takes any input value and transforms it into a value between 0 and 1.

The formula for the sigmoid function is:

**f(x) = 1 / (1 + e^(-x))**

where x is the input to the function, e is the mathematical constant Euler's number (approximately 2.71828), and f(x) is the output of the function.

The graph of the sigmoid function is an S-shaped curve: the output is close to 0 when the input is very negative, rises to 0.5 at x = 0, and approaches 1 as the input becomes large and positive.

The sigmoid function is often used as the activation function in the output layer for binary classification, where the goal is to predict one of two classes. It can also be used in the hidden layers of a neural network to introduce non-linearity into the model. However, its derivative is at most 0.25 and approaches 0 for inputs of large magnitude, so gradients shrink as they are passed back through many layers. This vanishing gradient problem can make deep neural networks difficult to train.

In [28]:
def sigmoid(z):
    return 1 / (1 + np.exp(-z))
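
To make the shape of the curve and the vanishing gradient concrete, the sketch below evaluates the sigmoid and its derivative at a few sample points. The derivative formula `s * (1 - s)` is the standard one for the sigmoid; the sample values in `z` are arbitrary choices for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1 / (1 + np.exp(-z))


def sigmoid_derivative(z):
    # d/dz sigmoid(z) = sigmoid(z) * (1 - sigmoid(z))
    s = sigmoid(z)
    return s * (1 - s)


z = np.array([-10.0, -2.0, 0.0, 2.0, 10.0])
values = sigmoid(z)
slopes = sigmoid_derivative(z)

print(values.round(4))   # close to 0 on the left, 0.5 at z = 0, close to 1 on the right
print(slopes.round(4))   # largest at z = 0, nearly 0 at both ends
```

The slope peaks at 0.25 for z = 0 and is almost zero at z = ±10, which is why stacked sigmoid layers can pass back very small gradients.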

## Rectified Linear Unit (ReLU) Function
ReLU returns its input when the input is positive and 0 otherwise: **f(x) = max(0, x)**.
The sigmoid function, historically used in hidden layers, was eventually replaced in most networks by the Rectified Linear Unit (ReLU) activation function, which is cheaper to compute and does not saturate for positive inputs. One caveat is the "dead neuron" problem: a neuron whose input stays negative always outputs 0 and receives no gradient.

In [29]:
def ReLU(inputs):
    return np.maximum(0, inputs)
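
As a small illustration, the sketch below applies ReLU to the layer outputs computed in the batch example earlier in this notebook. Only the negative entry is changed; the name `activated` is illustrative.

```python
import numpy as np


def ReLU(inputs):
    return np.maximum(0, inputs)


# Layer outputs from the batch example above
layer_outputs = np.array([
    [4.8, 1.21, 2.385],
    [8.9, -1.81, 0.2],
    [1.41, 1.051, 0.026],
])

# Negative values are clipped to 0, positive values pass through unchanged
activated = ReLU(layer_outputs)
print(activated)
```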

## Softmax Function
The softmax activation function is a mathematical function used in machine learning and deep learning for classification tasks. It takes as input a vector of real-valued numbers and transforms them into a probability distribution, where each element in the output vector represents the probability of a particular class.

The softmax function is defined as follows:

**softmax(x)_i = e^(x_i) / sum_j e^(x_j)**

where x is a vector of real numbers, x_i is its i-th element, e is Euler's number (approximately 2.71828), and the sum in the denominator runs over every element x_j of the input vector.

Every element of the output is positive, and the elements sum to 1. Subtracting the largest element of x from every element before exponentiating leaves the result unchanged and avoids overflow for large inputs.

The softmax function is commonly used as the output activation function in neural networks for classification tasks. It is particularly useful when the number of classes is greater than two, as it allows the neural network to assign probabilities to each possible class, rather than simply choosing between two options.

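In [35]:
def Softmax(x):
    # Subtract the row-wise maximum before exponentiating to avoid overflow
    e_x = np.exp(x - np.max(x, axis=-1, keepdims=True))
    # Normalize over the last axis so that each row (one sample) sums to 1
    return e_x / e_x.sum(axis=-1, keepdims=True)

In [36]:
Softmax(inputs).round(4)

array([[0.0641, 0.1744, 0.474 , 0.2875],
       [0.0452, 0.9074, 0.0022, 0.0452],
       [0.0052, 0.3488, 0.6355, 0.0105]])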
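
Softmax implementations commonly subtract the maximum before exponentiating, because `np.exp` overflows to `inf` for inputs in the high hundreds. The sketch below is one way to write this for a batch where each row is one sample; the function name `stable_softmax` and the `scores` values are hypothetical and chosen only to show that shifting the inputs does not change the result.

```python
import numpy as np


def stable_softmax(x):
    """Row-wise softmax with the maximum subtracted first (hypothetical helper)."""
    shifted = x - np.max(x, axis=-1, keepdims=True)  # largest value per row becomes 0
    e_x = np.exp(shifted)                            # every exponent is <= 0, so no overflow
    return e_x / e_x.sum(axis=-1, keepdims=True)     # each row sums to 1


scores = np.array([
    [1.0, 2.0, 3.0],
    [1000.0, 1001.0, 1002.0],  # same differences as the first row, shifted by 999
])
probs = stable_softmax(scores)

print(probs.round(4))
print(probs.sum(axis=-1))
```

Both rows produce the same probabilities, since softmax depends only on the differences between the elements of a row.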

# Full Neural Network

In [1]:
import numpy as np

class NeuralNetwork:
    def __init__(self, input_size, hidden_size, output_size):
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.output_size = output_size
        
        # Initialize weights
        self.weights1 = np.random.randn(self.input_size, self.hidden_size) * 0.01
        self.weights2 = np.random.randn(self.hidden_size, self.output_size) * 0.01
    
    def forward(self, X):
        # Forward pass
        self.z1 = np.dot(X, self.weights1)
        self.a1 = self.sigmoid(self.z1)
        self.z2 = np.dot(self.a1, self.weights2)
        y_pred = self.sigmoid(self.z2)
        return y_pred
    
    def backward(self, X, y, y_pred, learning_rate):
        # Backward pass
        delta2 = (y_pred - y) * self.sigmoid_derivative(self.z2)
        d_weights2 = np.dot(self.a1.T, delta2)
        delta1 = np.dot(delta2, self.weights2.T) * self.sigmoid_derivative(self.z1)
        d_weights1 = np.dot(X.T, delta1)
        
        # Update weights
        self.weights1 -= learning_rate * d_weights1
        self.weights2 -= learning_rate * d_weights2
    
    def train(self, X, y, learning_rate, epochs):
        for epoch in range(epochs):
            # Forward pass
            y_pred = self.forward(X)
            
            # Backward pass
            self.backward(X, y, y_pred, learning_rate)
            
            # Compute loss
            loss = np.mean(np.square(y_pred - y))
            
            # Print loss every 100 epochs
            if epoch % 100 == 0:
                print(f"Epoch {epoch}: Loss = {loss:.4f}")
    
    def sigmoid(self, z):
        return 1 / (1 + np.exp(-z))
    
    def sigmoid_derivative(self, z):
        return self.sigmoid(z) * (1 - self.sigmoid(z))


In [2]:
# Example usage
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([[0], [1], [1], [0]])

nn = NeuralNetwork(input_size=2, hidden_size=4, output_size=1)
nn.train(X, y, learning_rate=0.1, epochs=1000)

Epoch 0: Loss = 0.2500
Epoch 100: Loss = 0.2500
Epoch 200: Loss = 0.2500
Epoch 300: Loss = 0.2500
Epoch 400: Loss = 0.2500
Epoch 500: Loss = 0.2500
Epoch 600: Loss = 0.2500
Epoch 700: Loss = 0.2500
Epoch 800: Loss = 0.2500
Epoch 900: Loss = 0.2500
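
A loss of 0.2500 is exactly what mean squared error gives when the network predicts 0.5 for every XOR input, so the run above has not learned the mapping. The sketch below is one possible variation, not the notebook's method. It assumes that bias terms, a larger initial weight scale, a higher learning rate, and more epochs help the network leave that plateau. The function name `train_xor` and every hyperparameter value are illustrative assumptions, and the final loss depends on the random seed.

```python
import numpy as np


def sigmoid(z):
    return 1 / (1 + np.exp(-z))


def train_xor(epochs=5000, learning_rate=0.5, hidden_size=4, seed=0):
    """Train a small sigmoid network with biases on XOR; returns the loss per epoch."""
    rng = np.random.default_rng(seed)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)

    # Assumption: unit-scale initial weights and zero-initialized biases
    w1 = rng.standard_normal((2, hidden_size))
    b1 = np.zeros((1, hidden_size))
    w2 = rng.standard_normal((hidden_size, 1))
    b2 = np.zeros((1, 1))

    losses = []
    for _ in range(epochs):
        # Forward pass
        a1 = sigmoid(X @ w1 + b1)
        y_pred = sigmoid(a1 @ w2 + b2)
        losses.append(float(np.mean(np.square(y_pred - y))))

        # Backward pass, using sigmoid'(z) = a * (1 - a)
        delta2 = (y_pred - y) * y_pred * (1 - y_pred)
        delta1 = (delta2 @ w2.T) * a1 * (1 - a1)

        # Gradient descent step for weights and biases
        w2 -= learning_rate * (a1.T @ delta2)
        b2 -= learning_rate * delta2.sum(axis=0, keepdims=True)
        w1 -= learning_rate * (X.T @ delta1)
        b1 -= learning_rate * delta1.sum(axis=0, keepdims=True)
    return losses


losses = train_xor()
print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

If the last loss is still close to 0.25, trying a different `seed` or more epochs is a reasonable next step; convergence on XOR is not guaranteed for every initialization.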
