In [None]:
# Here we train a small neural network with the backpropagation algorithm to learn the AND operation and classify each output as 0 or 1.

In [2]:
import numpy as np  # Imports the NumPy library, which provides support for array operations and numerical functions

def sigmoid(x):  # Defines the sigmoid activation function
    return 1 / (1 + np.exp(-x))  # Squashes input values to be between 0 and 1

def sigmoid_derivative(x):  # Defines the derivative of the sigmoid, written in terms of the sigmoid's output
    return x * (1 - x)  # Expects x = sigmoid(z) (an activation), not the raw pre-activation z; used for gradients in backpropagation

# Input data for the AND gate (4 samples, each with 2 binary features)
inputs = np.array([
    [0, 0],
    [0, 1],
    [1, 0],
    [1, 1]
])

# Expected output for an AND gate (1 output per input sample)
labels = np.array([[0], [0], [0], [1]])

learning_rate = 0.1  # Sets the learning rate, which controls the update step size during training

input_layer_neurons = inputs.shape[1]  # Sets the number of neurons in the input layer equal to the number of input features
hidden_layer_neurons = 2               # Sets the number of neurons in the hidden layer (arbitrarily chosen)
output_neurons = 1                      # Sets the number of neurons in the output layer to 1 (since it's a binary classification)

# Initialize weights for connections between input and hidden layer
weights_input_hidden = np.random.uniform(size=(input_layer_neurons, hidden_layer_neurons))
# Initialize bias for the hidden layer
bias_hidden = np.random.uniform(size=(1, hidden_layer_neurons))

# Initialize weights for connections between hidden and output layer
weights_hidden_output = np.random.uniform(size=(hidden_layer_neurons, output_neurons))
# Initialize bias for the output layer
bias_output = np.random.uniform(size=(1, output_neurons))

epochs = 10000  # Number of iterations (or passes) for training the neural network
for epoch in range(epochs):  # Iterates over each epoch (training cycle)
    # Forward pass: calculate hidden layer's weighted input and activation
    hidden_layer_input = np.dot(inputs, weights_input_hidden) + bias_hidden  # Computes input for the hidden layer
    hidden_layer_activation = sigmoid(hidden_layer_input)  # Applies sigmoid activation function to hidden layer input

    # Calculate output layer's weighted input and activation
    output_layer_input = np.dot(hidden_layer_activation, weights_hidden_output) + bias_output  # Computes input for the output layer
    predicted_output = sigmoid(output_layer_input)  # Applies sigmoid activation function to output layer input

    # Calculate the error as the difference between the predicted and actual output
    error = labels - predicted_output  # Calculates error for output layer

    # Backpropagation step: calculate gradient for output layer using the error and sigmoid derivative
    d_predicted_output = error * sigmoid_derivative(predicted_output)  # Computes the gradient for output layer

    # Calculate error for the hidden layer
    error_hidden_layer = d_predicted_output.dot(weights_hidden_output.T)  # Backpropagates error to hidden layer
    d_hidden_layer = error_hidden_layer * sigmoid_derivative(hidden_layer_activation)  # Computes gradient for hidden layer

    # Update weights and biases for hidden-to-output layer using the calculated gradient and learning rate
    weights_hidden_output += hidden_layer_activation.T.dot(d_predicted_output) * learning_rate
    bias_output += np.sum(d_predicted_output, axis=0, keepdims=True) * learning_rate
    
    # Note on the transpose in d_predicted_output.dot(weights_hidden_output.T), used above for error_hidden_layer:
    # d_predicted_output is the gradient of the output layer, with shape (n_samples, output_neurons).
    # weights_hidden_output holds the weights between the hidden and output layers, with shape (hidden_layer_neurons, output_neurons).
    # A matrix product (dot) requires the number of columns in the first matrix to match the number of rows in the second,
    # so the weights are transposed to (output_neurons, hidden_layer_neurons), giving a result of shape (n_samples, hidden_layer_neurons).

    # Update weights and biases for input-to-hidden layer using the calculated gradient and learning rate
    weights_input_hidden += inputs.T.dot(d_hidden_layer) * learning_rate
    bias_hidden += np.sum(d_hidden_layer, axis=0, keepdims=True) * learning_rate

    # Every 1000 epochs, print the mean absolute error to monitor training progress
    if epoch % 1000 == 0:
        print(f"Epoch {epoch}, Error: {np.mean(np.abs(error))}")

# Print the final predicted output after training completes
print("\nFinal predicted output:")
print(predicted_output)


Epoch 0, Error: 0.5947693411886765
Epoch 1000, Error: 0.36582384477199703
Epoch 2000, Error: 0.17721425719878361
Epoch 3000, Error: 0.09306961474037057
Epoch 4000, Error: 0.0656786869368591
Epoch 5000, Error: 0.05235073444058997
Epoch 6000, Error: 0.0443546161073536
Epoch 7000, Error: 0.03895553895260831
Epoch 8000, Error: 0.035026769801154815
Epoch 9000, Error: 0.032017243604984785

Final predicted output:
[[0.00633995]
 [0.03581228]
 [0.03393122]
 [0.95757809]]
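
The stated goal is to classify each output as 0 or 1, but the network emits continuous values between 0 and 1. A minimal sketch of that last step follows, applied to the values printed above. The 0.5 cut-off and the names `threshold` and `predicted_classes` are assumptions and do not appear in the original code.

```python
import numpy as np

# Final outputs copied from the training run printed above
predicted_output = np.array([[0.00633995],
                             [0.03581228],
                             [0.03393122],
                             [0.95757809]])
labels = np.array([[0], [0], [0], [1]])

threshold = 0.5  # assumed cut-off; the original code does not specify one
predicted_classes = (predicted_output > threshold).astype(int)

print(predicted_classes.ravel())                   # [0 0 0 1]
print(np.array_equal(predicted_classes, labels))   # True
```

With this threshold, all four rows of the AND table are classified correctly for this particular run. A different random initialization could give different raw outputs.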


In [None]:
A simple implementation of a neural network trained with the backpropagation algorithm in Python.
This example uses NumPy for numerical computations and demonstrates how a basic neural network can learn to perform
binary classification.

Overview
Architecture:

Input Layer: 2 neurons
Hidden Layer: 2 neurons
Output Layer: 1 neuron
Activation Function: Sigmoid function for both hidden and output layers.

Dataset: Logical AND problem.
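
Because the sigmoid is used in both layers, backpropagation needs its derivative. The training code computes it as `x * (1 - x)`, which is only valid when `x` is already a sigmoid output. The sketch below checks that identity against a central finite difference. The names `z`, `h`, `analytic`, and `numeric` are illustrative and not part of the original code.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    # x is expected to be a sigmoid output, as in the training code
    return x * (1 - x)

z = np.linspace(-4.0, 4.0, 9)   # sample pre-activation values, including 0
h = 1e-5                        # step size for the finite difference

analytic = sigmoid_derivative(sigmoid(z))
numeric = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)

print(np.allclose(analytic, numeric, atol=1e-8))  # True
print(analytic.max())                             # 0.25, reached at z = 0
```

Passing the raw pre-activation `z` to `sigmoid_derivative` instead of `sigmoid(z)` would give a wrong gradient, which is why the training loop passes the activations.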

Dataset
Input 1	Input 2	Output
0	0	0
0	1	0
1	0	0
1	1	1

About bias:
    In a neural network, a bias is an additional learnable parameter attached to each neuron (or node) in a layer, giving the model more flexibility in fitting the data.
    It is an offset added to the weighted sum of inputs before the activation function is applied.
Why Bias is Important
Flexibility in Output Values: Without bias, a neuron's weighted sum is always 0 when all of its inputs are 0, so a sigmoid neuron outputs exactly 0.5 for that input no matter what its weights are. Its decision boundary is forced through the origin, which limits what the network can represent and makes it harder to learn complex patterns.

Shifting the Activation: The bias term shifts the activation function left or right along the input axis, letting the network adjust the threshold at which a neuron switches on. This is particularly important in classification tasks.

Learning Patterns: With bias, a neuron's output for all-zero inputs is no longer pinned to a fixed value. This lets it learn patterns such as AND, where the input (0, 0) must produce an output close to 0.
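
The points above can be illustrated with a single sigmoid neuron. This is a sketch under stated assumptions: the random seed, the weight range, and the bias value of -3.0 are all hypothetical choices made for the demonstration, not values from the training run.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

rng = np.random.default_rng(0)        # seed chosen arbitrarily for the demo
zero_input = np.array([[0.0, 0.0]])   # the (0, 0) row of the AND table

for _ in range(3):
    weights = rng.uniform(-5, 5, size=(2, 1))  # hypothetical weights
    no_bias = sigmoid(zero_input @ weights)
    print(no_bias.item())  # 0.5 every time, whatever the weights are

bias = np.array([[-3.0]])             # hypothetical negative bias
with_bias = sigmoid(zero_input @ weights + bias)
print(with_bias.item() < 0.05)        # True: the output can now approach 0
```

Without the bias, no choice of weights moves the neuron's response to (0, 0) away from 0.5. With it, the same neuron can output a value near 0 for that input.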
    
bias_hidden = np.random.uniform(size=(1, hidden_layer_neurons))
bias_output = np.random.uniform(size=(1, output_neurons))
bias_hidden: A bias vector added to each neuron in the hidden layer.
bias_output: A bias term added to the neuron(s) in the output layer.
    
During each forward pass, the bias is added to the weighted sum of inputs before the activation function is applied. For example, the line:

hidden_layer_input = np.dot(inputs, weights_input_hidden) + bias_hidden

adds bias_hidden to the weighted sum of inputs to each neuron in the hidden layer.

Similarly, the output layer’s input calculation:

output_layer_input = np.dot(hidden_layer_activation, weights_hidden_output) + bias_output
adds bias_output to the weighted sum of the hidden layer’s activations before the activation function.
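
Both additions rely on NumPy broadcasting: the bias has a single row, while the weighted sum has one row per sample. The sketch below shows the shapes involved for the hidden layer. It uses fixed illustrative weights and biases (assumptions, chosen so the numbers are easy to follow) instead of the random values from the training code.

```python
import numpy as np

inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
hidden_layer_neurons = 2

# Fixed illustrative values in place of the random initialization
weights_input_hidden = np.ones((inputs.shape[1], hidden_layer_neurons))
bias_hidden = np.array([[0.5, -0.5]])                  # shape (1, 2)

weighted_sum = np.dot(inputs, weights_input_hidden)    # shape (4, 2)
hidden_layer_input = weighted_sum + bias_hidden        # bias row is broadcast to all 4 samples

print(weighted_sum.shape, bias_hidden.shape, hidden_layer_input.shape)  # (4, 2) (1, 2) (4, 2)
print(hidden_layer_input[3])                           # [2.5 1.5]
```

Each hidden neuron receives the same bias for every sample, which is why the bias gradient in the training loop is summed over `axis=0`.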