<a href="https://colab.research.google.com/github/dabster108/Neural-Networks-Scratch/blob/main/neural_networks_.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
''' A neural network, more precisely an artificial neural network (ANN), is a computational system loosely inspired by the human brain. It learns patterns from data and uses them to make predictions.

Layers
	1.	Input layer → receives the data (such as numbers, images, or words).
	2.	Hidden layer(s) → transform the data and extract patterns.
	3.	Output layer → produces the result or prediction.

Key Terms
	•	Weight → controls how much influence an input has on the output.
	•	Bias → a learnable constant added to the weighted sum of inputs; it shifts the output and helps the network fit the data.
	•	Gradient → the derivative of the loss with respect to a weight or bias; it tells training in which direction, and by how much, to adjust that parameter. '''

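To make the Weight and Bias terms concrete, here is a minimal sketch of the linear step of a single neuron. The input values, weights, and bias are made up for illustration and do not come from the notebook.

```python
import numpy as np

# Hypothetical values, chosen only for illustration
x = np.array([1.0, 2.0])    # two inputs
w = np.array([0.5, 0.25])   # one weight per input
b = 0.1                     # bias

# Weighted sum of the inputs plus the bias:
# 0.5 * 1.0 + 0.25 * 2.0 + 0.1
z = np.dot(w, x) + b
print(z)
```

Increasing a weight makes its input count for more in `z`, while the bias shifts `z` by the same amount regardless of the inputs.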

In [None]:
''' Step 1: Import the necessary libraries (NumPy for computations).
Step 2: Define the network architecture, including input size, hidden layers, and output size. Initialize weights and biases randomly.
Step 3: Define the activation functions for the hidden and output layers along with their derivatives for backpropagation.
Step 4: Implement forward propagation to compute outputs from inputs through the network.
Step 5: Compute the loss using a suitable loss function, such as binary cross-entropy for binary classification.
Step 6: Implement backward propagation to calculate gradients and update the weights and biases using gradient descent.
Step 7: Set up a training loop to iterate the forward and backward passes over multiple epochs while monitoring the loss.
Step 8: Create a prediction function that applies a threshold on the output to classify inputs.
Step 9: Test the network on a dataset like XOR to ensure it learns correctly.'''


In [None]:
import numpy as np
# For our first example, we’ll build a small network that can learn the XOR problem:
# Input layer → 2 neurons (since XOR has 2 inputs)
# Hidden layer → 2 neurons
# Output layer → 1 neuron (binary classification: 0 or 1)

'''
Weights are learnable parameters that control how strongly each input influences the output. Biases are learnable offsets added to each neuron's weighted sum so the network can fit the data better.
'''


In [None]:

np.random.seed(42)

# Network architecture
input_size = 2
hidden_size = 2
output_size = 1

# Initialize weights and biases
W1 = np.random.randn(input_size, hidden_size)   # weights for input -> hidden
b1 = np.zeros((1, hidden_size))                 # biases for hidden layer
W2 = np.random.randn(hidden_size, output_size)  # weights for hidden -> output
b2 = np.zeros((1, output_size))                 # biases for output layer
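As a quick sanity check on the architecture above, the sketch below rebuilds the same four arrays and counts the learnable parameters. The `params` dictionary is a helper introduced here for illustration; the notebook itself keeps the arrays as separate variables.

```python
import numpy as np

np.random.seed(42)
input_size, hidden_size, output_size = 2, 2, 1

# Same shapes as in the cell above, grouped in a dict for easy inspection
params = {
    "W1": np.random.randn(input_size, hidden_size),
    "b1": np.zeros((1, hidden_size)),
    "W2": np.random.randn(hidden_size, output_size),
    "b2": np.zeros((1, output_size)),
}

for name, value in params.items():
    print(name, value.shape)

# 2*2 weights + 2 biases + 2*1 weights + 1 bias
total = sum(value.size for value in params.values())
print("learnable parameters:", total)
```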

In [None]:
'''Activation functions introduce non-linearity into the network. This is crucial: without them, any stack of layers collapses into a single linear transformation, and the network could not learn patterns that are not linearly separable, such as XOR.
We will use two activation functions:
ReLU (Rectified Linear Unit) for the hidden layer
Sigmoid for the output layer (since we need probabilities for binary classification)
'''
# Sigmoid activation (for output layer)
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Derivative of sigmoid (for backpropagation)
def sigmoid_derivative(x):
    return sigmoid(x) * (1 - sigmoid(x))

# ReLU activation (for hidden layer)
def relu(x):
    return np.maximum(0, x)

# Derivative of ReLU (for backpropagation)
def relu_derivative(x):
    return (x > 0).astype(float)
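The claim that a network without activation functions only does linear math can be checked numerically. The sketch below uses random data and weights (generated here for illustration, not the notebook's parameters) and shows that two linear layers with no activation in between give the same output as one combined linear layer.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 2))
W1, b1 = rng.normal(size=(2, 2)), rng.normal(size=(1, 2))
W2, b2 = rng.normal(size=(2, 1)), rng.normal(size=(1, 1))

# Two layers with no activation function in between
two_layer = (X @ W1 + b1) @ W2 + b2

# A single layer with combined weights and bias
W_combined = W1 @ W2
b_combined = b1 @ W2 + b2
one_layer = X @ W_combined + b_combined

print(np.allclose(two_layer, one_layer))
```

Placing ReLU between the two layers breaks this equivalence, which is what lets the hidden layer contribute something a single linear layer cannot.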

In [None]:
''' Forward propagation means passing the input through the network, layer by layer, to get the output:
Multiply the input by the weights and add the bias (linear part)
Apply the activation function (non-linear part)
Pass the result to the next layer
'''
# Note: forward() reads W1, b1, W2, b2 from the module-level (global) variables.
def forward(X):
    # Input -> Hidden
    Z1 = np.dot(X, W1) + b1     # linear step
    A1 = relu(Z1)               # apply ReLU

    # Hidden -> Output
    Z2 = np.dot(A1, W2) + b2    # linear step
    A2 = sigmoid(Z2)            # apply Sigmoid (probabilities)

    return Z1, A1, Z2, A2
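To see what each step of the forward pass produces, the sketch below traces the array shapes for the four-sample XOR batch. It re-creates parameters with the same shapes and seed as above so that it runs on its own; the loop that prints shapes is added here for illustration.

```python
import numpy as np

np.random.seed(42)
W1, b1 = np.random.randn(2, 2), np.zeros((1, 2))
W2, b2 = np.random.randn(2, 1), np.zeros((1, 1))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])  # 4 samples, 2 features

Z1 = X @ W1 + b1              # (4, 2) @ (2, 2) -> (4, 2); b1 broadcasts over rows
A1 = np.maximum(0, Z1)        # ReLU keeps the shape and clips negatives to 0
Z2 = A1 @ W2 + b2             # (4, 2) @ (2, 1) -> (4, 1)
A2 = 1 / (1 + np.exp(-Z2))    # sigmoid: one probability per sample

for name, value in [("Z1", Z1), ("A1", A1), ("Z2", Z2), ("A2", A2)]:
    print(name, value.shape)
```

Because the biases start at zero, the input `[0, 0]` yields `Z1 = 0` and `Z2 = 0`, so its initial output is exactly 0.5.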



In [None]:
''' The loss function measures how far the network's predictions are from the actual target values. We use binary cross-entropy; the small constant 1e-8 keeps log() from being evaluated at 0. '''
def compute_loss(y_true, y_pred):
    m = y_true.shape[0]  # number of samples
    loss = - (1/m) * np.sum(
        y_true * np.log(y_pred + 1e-8) + (1 - y_true) * np.log(1 - y_pred + 1e-8)
    )
    return loss
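A small worked example may help build intuition for binary cross-entropy. The sketch below repeats the same formula under the hypothetical name `bce` and evaluates it on made-up predictions for two samples with targets 1 and 0.

```python
import numpy as np

def bce(y_true, y_pred, eps=1e-8):
    # Same formula as compute_loss: mean binary cross-entropy with a small epsilon
    m = y_true.shape[0]
    return -(1 / m) * np.sum(
        y_true * np.log(y_pred + eps) + (1 - y_true) * np.log(1 - y_pred + eps)
    )

y = np.array([[1.0], [0.0]])

confident_right = bce(y, np.array([[0.9], [0.1]]))  # close to the targets
unsure = bce(y, np.array([[0.5], [0.5]]))           # no information
confident_wrong = bce(y, np.array([[0.1], [0.9]]))  # far from the targets

print(f"{confident_right:.4f} {unsure:.4f} {confident_wrong:.4f}")
```

The three losses come out at roughly 0.105, 0.693 (which is ln 2), and 2.303: the loss grows slowly near the correct answer and sharply for confident mistakes.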

In [None]:
''' Backward Propagation
Backward propagation (backprop) is where the network learns by adjusting its weights and biases.
We calculate the gradient of the loss with respect to each parameter (using the chain rule) and update them with gradient descent.

'''
def backward(X, y, Z1, A1, Z2, A2, W1, b1, W2, b2, learning_rate=0.1):
    m = y.shape[0]

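    # Output layer error: with a sigmoid output and binary cross-entropy loss,
    # the per-sample gradient dL/dZ2 simplifies to A2 - y (the 1/m factor is
    # applied below), so sigmoid_derivative is not needed here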
    dZ2 = A2 - y
    dW2 = (1/m) * np.dot(A1.T, dZ2)
    db2 = (1/m) * np.sum(dZ2, axis=0, keepdims=True)

    # Hidden layer error
    dA1 = np.dot(dZ2, W2.T)
    dZ1 = dA1 * relu_derivative(Z1)
    dW1 = (1/m) * np.dot(X.T, dZ1)
    db1 = (1/m) * np.sum(dZ1, axis=0, keepdims=True)

    # Update parameters (in place: -= modifies the arrays that were passed in)
    W1 -= learning_rate * dW1
    b1 -= learning_rate * db1
    W2 -= learning_rate * dW2
    b2 -= learning_rate * db2

    return W1, b1, W2, b2
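One way to gain confidence in the chain-rule formulas above is a numerical gradient check: compare them with finite-difference estimates of the loss. The sketch below is self-contained and makes several assumptions for illustration: random inputs and labels instead of XOR, three hidden units, non-zero biases, and the helper names `loss_fn`, `analytic_grads`, and `numeric_grads`, none of which appear in the notebook.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))                          # random inputs (assumption)
y = rng.integers(0, 2, size=(5, 1)).astype(float)    # random 0/1 labels (assumption)
params = {
    "W1": rng.normal(size=(2, 3)), "b1": rng.normal(size=(1, 3)),
    "W2": rng.normal(size=(3, 1)), "b2": rng.normal(size=(1, 1)),
}

def loss_fn(p):
    # Forward pass (ReLU hidden layer, sigmoid output) and mean binary cross-entropy
    A1 = np.maximum(0, X @ p["W1"] + p["b1"])
    A2 = 1 / (1 + np.exp(-(A1 @ p["W2"] + p["b2"])))
    return -np.mean(y * np.log(A2) + (1 - y) * np.log(1 - A2))

def analytic_grads(p):
    # Same formulas as backward(), without the parameter update
    m = X.shape[0]
    Z1 = X @ p["W1"] + p["b1"]
    A1 = np.maximum(0, Z1)
    A2 = 1 / (1 + np.exp(-(A1 @ p["W2"] + p["b2"])))
    dZ2 = A2 - y
    dZ1 = (dZ2 @ p["W2"].T) * (Z1 > 0)
    return {
        "W1": X.T @ dZ1 / m, "b1": dZ1.sum(axis=0, keepdims=True) / m,
        "W2": A1.T @ dZ2 / m, "b2": dZ2.sum(axis=0, keepdims=True) / m,
    }

def numeric_grads(p, eps=1e-6):
    # Central finite differences, one parameter entry at a time
    grads = {}
    for name, value in p.items():
        g = np.zeros_like(value)
        for idx in np.ndindex(value.shape):
            old = value[idx]
            value[idx] = old + eps
            loss_plus = loss_fn(p)
            value[idx] = old - eps
            loss_minus = loss_fn(p)
            value[idx] = old  # restore the original entry
            g[idx] = (loss_plus - loss_minus) / (2 * eps)
        grads[name] = g
    return grads

analytic = analytic_grads(params)
numeric = numeric_grads(params)
max_diff = max(np.max(np.abs(analytic[k] - numeric[k])) for k in params)
print(f"largest difference: {max_diff:.2e}")
```

A very small difference indicates that the analytic gradients agree with the finite-difference estimates. Random inputs and non-zero biases are used here because the XOR input `[0, 0]` with zero biases sits exactly on the ReLU kink, where finite differences are unreliable.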

In [None]:
''' Training the neural network
Now we’ll train the neural network by repeatedly doing:
	1.	Forward propagation → get predictions
	2.	Compute loss → see how wrong the predictions are
	3.	Backward propagation → update weights and biases
'''
# XOR dataset
X = np.array([[0,0],
              [0,1],
              [1,0],
              [1,1]])
y = np.array([[0],
              [1],
              [1],
              [0]])

epochs = 10000
learning_rate = 0.1

for i in range(epochs):
    # Forward pass
    Z1, A1, Z2, A2 = forward(X)

    # Compute loss
    loss = compute_loss(y, A2)

    # Backward pass (update weights and biases)
    W1, b1, W2, b2 = backward(X, y, Z1, A1, Z2, A2, W1, b1, W2, b2, learning_rate)

    # Print loss occasionally
    if i % 1000 == 0:
        print(f"Epoch {i}, Loss: {loss:.4f}")

Epoch 0, Loss: 0.7164
Epoch 1000, Loss: 0.3533
Epoch 2000, Loss: 0.3486
Epoch 3000, Loss: 0.3477
Epoch 4000, Loss: 0.3474
Epoch 5000, Loss: 0.3472
Epoch 6000, Loss: 0.3470
Epoch 7000, Loss: 0.3469
Epoch 8000, Loss: 0.3469
Epoch 9000, Loss: 0.3468
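The loss above levels off near 0.3468 instead of approaching zero. One reading, offered as a hypothesis rather than something the notebook confirms, is that the network classifies two of the four XOR samples almost perfectly while outputting roughly 0.5 for the other two. The arithmetic below shows the loss such a state would produce; the probabilities in `p` are illustrative values, not the trained network's actual outputs.

```python
import numpy as np

y = np.array([[0.0], [1.0], [1.0], [0.0]])

# Illustrative outputs: two samples predicted perfectly, two stuck at 0.5
p = np.array([[0.0], [1.0], [0.5], [0.5]])

eps = 1e-8  # same epsilon as compute_loss
plateau_loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

print(f"{plateau_loss:.4f}")
print(f"{np.log(2) / 2:.4f}")
```

Both lines print about 0.3466, that is ln(2)/2, which is close to the value where training stalls.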


In [None]:
''' Prediction Function '''
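def predict(X, W1, b1, W2, b2):
    # Run the forward pass with the parameters passed in, rather than the
    # global parameters that forward() reads
    A1 = relu(np.dot(X, W1) + b1)
    A2 = sigmoid(np.dot(A1, W2) + b2)
    return (A2 > 0.5).astype(int)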
print("Predictions:")
print(predict(X, W1, b1, W2, b2))
print("Actual:")
print(y)

Predictions:
[[0]
 [1]
 [0]
 [0]]
Actual:
[[0]
 [1]
 [1]
 [0]]
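The predictions above get the input `[1, 0]` wrong, so this run did not learn XOR. A 2-2-1 ReLU network can still represent XOR: the sketch below uses hand-picked weights (chosen here for illustration, not learned by the training loop) to show this. That suggests the failure is an optimization issue, such as an unlucky initialization, rather than a limit of the architecture.

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([[0], [1], [1], [0]])

# Hand-picked parameters (an assumption for illustration, not trained values)
W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])
b1 = np.array([[0.0, -1.0]])
W2 = np.array([[10.0],
               [-20.0]])
b2 = np.array([[-5.0]])

A1 = np.maximum(0, X @ W1 + b1)          # h1 = relu(x1 + x2), h2 = relu(x1 + x2 - 1)
A2 = 1 / (1 + np.exp(-(A1 @ W2 + b2)))   # sigmoid(10*h1 - 20*h2 - 5)
predictions = (A2 > 0.5).astype(int)

print(predictions.ravel())  # prints [0 1 1 0]
```

Trying a different random seed or a wider hidden layer are the usual first things to try; their effect on this notebook has not been checked here.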
