import numpy as np

class NeuralNetwork:
    def __init__(self, input_size, hidden_size, output_size):
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.output_size = output_size
        
        # Initialize weights and biases
        self.weights_input_hidden = np.random.randn(self.input_size, self.hidden_size)
        self.bias_input_hidden = np.random.randn(1, self.hidden_size)
        self.weights_hidden_output = np.random.randn(self.hidden_size, self.output_size)
        self.bias_hidden_output = np.random.randn(1, self.output_size)
        
    def sigmoid(self, x):
        return 1 / (1 + np.exp(-x))
    
    def sigmoid_derivative(self, x):
        # x must already be a sigmoid output (an activation), not a pre-activation sum
        return x * (1 - x)
    
    def forward(self, X):
        # Input to hidden layer
        self.hidden_output = self.sigmoid(np.dot(X, self.weights_input_hidden) + self.bias_input_hidden)
        # Hidden to output layer
        self.predicted_output = self.sigmoid(np.dot(self.hidden_output, self.weights_hidden_output) + self.bias_hidden_output)
        return self.predicted_output
    
    def backward(self, X, y, learning_rate):
        # Calculate error
        error = y - self.predicted_output
        
        # Compute gradients
        delta_output = error * self.sigmoid_derivative(self.predicted_output)
        delta_hidden = delta_output.dot(self.weights_hidden_output.T) * self.sigmoid_derivative(self.hidden_output)
        
        # Update weights and biases
        self.weights_hidden_output += self.hidden_output.T.dot(delta_output) * learning_rate
        self.bias_hidden_output += np.sum(delta_output, axis=0, keepdims=True) * learning_rate
        self.weights_input_hidden += X.T.dot(delta_hidden) * learning_rate
        self.bias_input_hidden += np.sum(delta_hidden, axis=0, keepdims=True) * learning_rate
    
    def train(self, X, y, epochs, learning_rate):
        for epoch in range(epochs):
            output = self.forward(X)
            self.backward(X, y, learning_rate)
            if epoch % 100 == 0:
                loss = np.mean(np.square(y - output))
                print(f'Epoch {epoch}: Loss = {loss:.4f}')
                
    def predict(self, X):
        return self.forward(X)

# Example usage
X = np.array([[0,0], [0,1], [1,0], [1,1]])
y = np.array([[0], [1], [1], [0]])

input_size = 2
hidden_size = 3
output_size = 1

nn = NeuralNetwork(input_size, hidden_size, output_size)
nn.train(X, y, epochs=1000, learning_rate=0.1)

# Test the trained model
print("Predictions:")
print(nn.predict(X))

Epoch 0: Loss = 0.2885
Epoch 100: Loss = 0.2466
Epoch 200: Loss = 0.2442
Epoch 300: Loss = 0.2416
Epoch 400: Loss = 0.2388
Epoch 500: Loss = 0.2355
Epoch 600: Loss = 0.2315
Epoch 700: Loss = 0.2267
Epoch 800: Loss = 0.2208
Epoch 900: Loss = 0.2137
Predictions:
[[0.40164534]
 [0.4589818 ]
 [0.62812302]
 [0.47848862]]


The code defines a simple feedforward neural network with one hidden layer; in the example it has two inputs, three hidden units, and a single output. Both the hidden and output layers use the sigmoid activation function, and the network is trained with full-batch gradient descent using backpropagation. Let's break it down step by step.

### Class Definition
- `class NeuralNetwork:`: Defines the Neural Network class.

### Initialization (`__init__`)
- `def __init__(self, input_size, hidden_size, output_size):`: Constructor that initializes the neural network with specified input, hidden, and output layer sizes.
- `self.input_size = input_size`: Stores the input layer size.
- `self.hidden_size = hidden_size`: Stores the hidden layer size.
- `self.output_size = output_size`: Stores the output layer size.
- **Weights and Biases Initialization**:
  - `self.weights_input_hidden = np.random.randn(self.input_size, self.hidden_size)`: Initializes the weights between the input and hidden layers with random values from a standard normal distribution.
  - `self.bias_input_hidden = np.random.randn(1, self.hidden_size)`: Initializes the biases for the hidden layer.
  - `self.weights_hidden_output = np.random.randn(self.hidden_size, self.output_size)`: Initializes the weights between the hidden and output layers.
  - `self.bias_hidden_output = np.random.randn(1, self.output_size)`: Initializes the biases for the output layer.
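
To make the shapes concrete, here is a minimal sketch that builds the same four arrays outside the class for the sizes used in the example (2, 3, 1). The call to `np.random.seed(0)` is an assumption added only so repeated runs start from the same weights; the original code does not set a seed.

```python
import numpy as np

# Assumption: a fixed seed, used here only to make the run repeatable
np.random.seed(0)

input_size, hidden_size, output_size = 2, 3, 1

# Same initialization calls as in __init__, without the class wrapper
weights_input_hidden = np.random.randn(input_size, hidden_size)
bias_input_hidden = np.random.randn(1, hidden_size)
weights_hidden_output = np.random.randn(hidden_size, output_size)
bias_hidden_output = np.random.randn(1, output_size)

shapes = {
    "weights_input_hidden": weights_input_hidden.shape,
    "bias_input_hidden": bias_input_hidden.shape,
    "weights_hidden_output": weights_hidden_output.shape,
    "bias_hidden_output": bias_hidden_output.shape,
}
for name, shape in shapes.items():
    print(name, shape)

# Total number of trainable values: 2*3 + 3 + 3*1 + 1
n_params = sum(int(np.prod(shape)) for shape in shapes.values())
print("total parameters:", n_params)
```

The biases are stored as row vectors of shape `(1, n)` so that NumPy broadcasting can add them to every row of a batch.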

### Activation Functions
- `def sigmoid(self, x):`: Defines the sigmoid activation function, which maps any real number to a value between 0 and 1.
  - `return 1 / (1 + np.exp(-x))`: The sigmoid function formula.
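- `def sigmoid_derivative(self, x):`: Computes the derivative of the sigmoid expressed in terms of the sigmoid's output, for use in backpropagation.
  - `return x * (1 - x)`: Here `x` must already be a sigmoid output `s`, because the derivative of the sigmoid is `s * (1 - s)`. This is why `backward` passes in `self.predicted_output` and `self.hidden_output` rather than the pre-activation sums. The result scales how much each weight is adjusted during training.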
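
The point about what `sigmoid_derivative` expects as input can be checked numerically. The sketch below uses standalone copies of the two functions (the argument is renamed `a` for clarity, which is not in the original) and compares them against a central finite difference of the sigmoid.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(a):
    # a is assumed to already be a sigmoid output
    return a * (1 - a)

z = np.array([-2.0, 0.0, 1.5])   # pre-activation values
a = sigmoid(z)                   # activations

# Central finite difference of sigmoid with respect to z
eps = 1e-6
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)

correct = sigmoid_derivative(a)    # pass the activation
mistaken = sigmoid_derivative(z)   # passing z computes z * (1 - z) instead

print("matches when given activations:    ", np.allclose(numeric, correct))
print("matches when given pre-activations:", np.allclose(numeric, mistaken))
```

Only the first comparison agrees with the finite difference; passing the pre-activation computes `z * (1 - z)`, which is not the slope of the sigmoid.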

### Forward Propagation (`forward`)
- `def forward(self, X):`: Defines the forward propagation method, which computes the outputs of the network given an input `X`.
  - `self.hidden_output = self.sigmoid(np.dot(X, self.weights_input_hidden) + self.bias_input_hidden)`: Computes the hidden layer output by taking the dot product of `X` with the weights for the input-to-hidden layer, adding the bias, then applying the sigmoid function.
  - `self.predicted_output = self.sigmoid(np.dot(self.hidden_output, self.weights_hidden_output) + self.bias_hidden_output)`: Computes the output layer by doing a similar operation, using the hidden layer's output.
  - `return self.predicted_output`: Returns the predicted output from the network.
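
As a rough illustration of the forward pass, the following sketch traces the array shapes for the four XOR inputs. The short names `W1`, `b1`, `W2`, `b2` and the fixed seed are assumptions made for brevity; they stand in for the class attributes.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

np.random.seed(0)  # assumption: seed added only for repeatability

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])        # shape (4, 2)
W1, b1 = np.random.randn(2, 3), np.random.randn(1, 3)  # input -> hidden
W2, b2 = np.random.randn(3, 1), np.random.randn(1, 1)  # hidden -> output

# (4, 2) @ (2, 3) -> (4, 3); the (1, 3) bias is broadcast over the 4 rows
hidden_output = sigmoid(np.dot(X, W1) + b1)

# (4, 3) @ (3, 1) -> (4, 1); the (1, 1) bias is broadcast over the 4 rows
predicted_output = sigmoid(np.dot(hidden_output, W2) + b2)

print(hidden_output.shape, predicted_output.shape)
```

Each row of `X` is one sample, so the whole batch is processed in a single pair of matrix products.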

### Backward Propagation (`backward`)
- `def backward(self, X, y, learning_rate):`: Defines the backward propagation method to update the weights and biases based on the error between the predicted and actual output.
  - `error = y - self.predicted_output`: Computes the error between the actual output `y` and the predicted output `self.predicted_output`.
  - **Calculate Gradients**:
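    - `delta_output = error * self.sigmoid_derivative(self.predicted_output)`: Calculates the output-layer delta: the error scaled by the slope of the sigmoid at each output. Because `error` is `y - self.predicted_output`, this delta is, up to a positive constant factor, the negative of the squared-error gradient with respect to the output layer's pre-activation, so it already points in the descent direction.
    - `delta_hidden = delta_output.dot(self.weights_hidden_output.T) * self.sigmoid_derivative(self.hidden_output)`: Computes the hidden-layer delta by propagating `delta_output` backward through the hidden-to-output weights, then multiplying by the slope of the sigmoid at each hidden unit.
  - **Update Weights and Biases**:
    - `self.weights_hidden_output += self.hidden_output.T.dot(delta_output) * learning_rate`: Updates the weights between the hidden and output layers. The update uses `+=` because the deltas already carry the descent direction; the step is the delta-weighted hidden activations, summed over the batch and scaled by the learning rate.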
    - `self.bias_hidden_output += np.sum(delta_output, axis=0, keepdims=True) * learning_rate`: Updates the bias for the output layer.
    - `self.weights_input_hidden += X.T.dot(delta_hidden) * learning_rate`: Updates the weights between input and hidden layers.
    - `self.bias_input_hidden += np.sum(delta_hidden, axis=0, keepdims=True) * learning_rate`: Updates the bias for the hidden layer.
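
One way to sanity-check these update rules is a numerical gradient check. The sketch below assumes the loss being minimized is half the sum of squared errors (this differs from the logged mean squared error only by a constant factor), and the helpers `forward`, `half_sse`, and `numerical_gradient` are hypothetical names introduced for this illustration. It compares the weight steps used in `backward` (before scaling by the learning rate) with the negative of a finite-difference gradient.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def forward(X, W1, b1, W2, b2):
    h = sigmoid(X @ W1 + b1)
    return h, sigmoid(h @ W2 + b2)

def half_sse(X, y, W1, b1, W2, b2):
    # Assumed loss: 0.5 * sum of squared errors
    _, out = forward(X, W1, b1, W2, b2)
    return 0.5 * np.sum((y - out) ** 2)

def numerical_gradient(loss_fn, param, eps=1e-6):
    # Central differences, perturbing one entry of param at a time (in place)
    grad = np.zeros_like(param)
    for idx in np.ndindex(*param.shape):
        original = param[idx]
        param[idx] = original + eps
        loss_plus = loss_fn()
        param[idx] = original - eps
        loss_minus = loss_fn()
        param[idx] = original
        grad[idx] = (loss_plus - loss_minus) / (2 * eps)
    return grad

np.random.seed(0)  # assumption: seed added only for repeatability
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)
W1, b1 = np.random.randn(2, 3), np.random.randn(1, 3)
W2, b2 = np.random.randn(3, 1), np.random.randn(1, 1)

# Steps as computed in backward(), before multiplying by learning_rate
h, out = forward(X, W1, b1, W2, b2)
delta_output = (y - out) * out * (1 - out)
delta_hidden = delta_output.dot(W2.T) * h * (1 - h)
step_W2 = h.T.dot(delta_output)
step_W1 = X.T.dot(delta_hidden)

loss_fn = lambda: half_sse(X, y, W1, b1, W2, b2)
grad_W1 = numerical_gradient(loss_fn, W1)
grad_W2 = numerical_gradient(loss_fn, W2)

# The steps should equal the negative gradient, which is why += descends
print(np.allclose(step_W1, -grad_W1, atol=1e-6))
print(np.allclose(step_W2, -grad_W2, atol=1e-6))
```

If both comparisons hold, adding the steps to the weights moves them downhill on the assumed loss, which matches the `+=` updates in `backward`.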

### Training (`train`)
- `def train(self, X, y, epochs, learning_rate):`: Defines the training method.
  - `for epoch in range(epochs)`: Loops over the specified number of epochs.
  - `output = self.forward(X)`: Performs forward propagation to compute the network's output given the training data `X`.
  - `self.backward(X, y, learning_rate)`: Applies backward propagation to update weights and biases.
  - `if epoch % 100 == 0:`: Logs the loss once every 100 epochs, starting at epoch 0.
    - `loss = np.mean(np.square(y - output))`: Computes the mean squared error from `output`, the prediction made before this epoch's weight update.
    - `print(f'Epoch {epoch}: Loss = {loss:.4f}')`: Outputs the loss for monitoring training progress.
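
To illustrate the loss calculation on its own, this short sketch applies the same mean-squared-error expression to the four predictions printed in the output above. The values are copied from that output; nothing is retrained here.

```python
import numpy as np

# Targets and the final predictions copied from the printed output
y = np.array([[0], [1], [1], [0]])
output = np.array([[0.40164534], [0.4589818], [0.62812302], [0.47848862]])

squared_errors = np.square(y - output)   # one squared error per sample
loss = np.mean(squared_errors)           # average over the 4 samples

print(squared_errors.ravel())
print(f"Loss = {loss:.4f}")
```

The result is roughly 0.205, slightly below the last logged value of 0.2137 at epoch 900, which is consistent with the slow decline visible in the training log.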

### Prediction (`predict`)
- `def predict(self, X):`: Defines the predict method to generate predictions for given input `X`.
  - `return self.forward(X)`: Returns the predicted output after forward propagation.
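
Since `predict` returns sigmoid outputs between 0 and 1, a common follow-up step is to threshold them into class labels. The sketch below does this for the predictions printed above; the 0.5 threshold and the variable names are assumptions, as the original code does not include this step.

```python
import numpy as np

# Predictions copied from the printed output, with the XOR targets
predictions = np.array([[0.40164534], [0.4589818], [0.62812302], [0.47848862]])
y = np.array([[0], [1], [1], [0]])

threshold = 0.5  # assumption: conventional cut-off for a sigmoid output
labels = (predictions >= threshold).astype(int)

# Fraction of samples whose thresholded label matches the target
accuracy = np.mean(labels == y)

print("labels:  ", labels.ravel())
print("targets: ", y.ravel())
print("accuracy:", accuracy)
```

With these values the input `[0, 1]` is misclassified, so three of the four XOR cases come out right.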

### Example Usage
- **Create Input Data**:
  - `X = np.array([[0,0], [0,1], [1,0], [1,1]])`: Example data for training, representing the inputs for an XOR operation.
  - `y = np.array([[0], [1], [1], [0]])`: The corresponding expected outputs (target values).
- **Initialize Neural Network**:
  - `nn = NeuralNetwork(input_size, hidden_size, output_size)`: Creates an instance of the neural network with specified sizes.
- **Train the Neural Network**:
  - `nn.train(X, y, epochs=1000, learning_rate=0.1)`: Trains the network on the given data for 1000 epochs with a learning rate of 0.1.
- **Test the Trained Model**:
  - `print("Predictions:")`: Outputs the heading for the predictions.
  - `print(nn.predict(X))`: Makes predictions for the given inputs and prints the results.

### Summary
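This code creates a simple feedforward neural network with one hidden layer and uses backpropagation to update its weights and biases. The network is trained on the XOR dataset and then evaluated on the same four inputs. In the run shown, the loss falls only from 0.2885 to 0.2137 over 1000 epochs and the predictions stay between roughly 0.40 and 0.63, so the network has not yet learned XOR; more epochs, a larger learning rate, or a different random initialization may be needed. The code still demonstrates fundamental concepts in neural networks, including activation functions, backpropagation, and training over epochs with a learning rate.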