---
# Simple Neural Network: From Scratch

## **Introduction**
A **Simple Neural Network**, as used here, is a feedforward network with a single hidden layer. That one hidden layer is enough to solve small classification problems that are not linearly separable, such as the XOR problem. In this notebook, we implement such a network from scratch using NumPy.

---

## **Mathematical Foundations**

### **1. Forward Propagation**
Forward propagation involves computing the output of the network given an input. The steps are as follows:

1. **Linear Transformation (Hidden Layer):**
   \\[
   z_1 = X \cdot W_1 + b_1
   \\]
   - \\( X \\): Input data (shape: \\( n \times m \\), where \\( n \\) is the number of samples and \\( m \\) is the number of features).
   - \\( W_1 \\): Weights of the hidden layer (shape: \\( m \times h \\), where \\( h \\) is the number of hidden units).
   - \\( b_1 \\): Biases of the hidden layer (shape: \\( h \\)).

2. **Activation Function (Hidden Layer):**
   \\[
   a_1 = \sigma(z_1)
   \\]
   - \\( \sigma \\): Sigmoid activation function:
     \\[
     \sigma(x) = \frac{1}{1 + e^{-x}}
     \\]

3. **Linear Transformation (Output Layer):**
   \\[
   z_2 = a_1 \cdot W_2 + b_2
   \\]
   - \\( W_2 \\): Weights of the output layer (shape: \\( h \times o \\), where \\( o \\) is the number of output units).
   - \\( b_2 \\): Biases of the output layer (shape: \\( o \\)).

4. **Activation Function (Output Layer):**
   \\[
   a_2 = \sigma(z_2)
   \\]
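
The four steps above can be traced with a short NumPy sketch that prints the shape of each intermediate array. The sizes, the random inputs, the zero biases, and the fixed seed are assumptions chosen only for illustration; they are not part of the notebook's implementation.

```python
import numpy as np

def sigmoid(x):
    # Logistic function, applied element-wise
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)   # fixed seed, only for repeatability

n, m, h, o = 4, 3, 5, 1          # samples, features, hidden units, output units
X = rng.normal(size=(n, m))      # illustrative random input
W1 = rng.normal(size=(m, h))
b1 = np.zeros(h)
W2 = rng.normal(size=(h, o))
b2 = np.zeros(o)

z1 = X @ W1 + b1                 # (n, h); b1 is broadcast across the n rows
a1 = sigmoid(z1)                 # (n, h)
z2 = a1 @ W2 + b2                # (n, o); b2 is broadcast across the n rows
a2 = sigmoid(z2)                 # (n, o); every entry lies strictly between 0 and 1

print(z1.shape, a1.shape, z2.shape, a2.shape)  # (4, 5) (4, 5) (4, 1) (4, 1)
```

Because the biases have shapes \\( h \\) and \\( o \\), NumPy broadcasting adds them to every row, so no explicit loop over samples is needed.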

### **2. Backpropagation**
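Backpropagation computes the gradients of the loss function with respect to the weights and biases. The formulas below correspond to the squared-error loss \\( L = \frac{1}{2} \sum (a_2 - y)^2 \\), whose derivative with respect to \\( a_2 \\) is \\( a_2 - y \\). The steps are as follows: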

1. **Compute the Error:**
   \\[
   \text{error} = a_2 - y
   \\]
   - \\( y \\): True labels.

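2. **Compute the Gradient of the Loss with Respect to \\( W_2 \\) and \\( b_2 \\):**
   \\[
   \delta_2 = \text{error} \odot \sigma'(z_2)
   \\]
   \\[
   \frac{\partial L}{\partial W_2} = a_1^T \cdot \delta_2
   \\]
   \\[
   \frac{\partial L}{\partial b_2} = \sum_{i=1}^{n} \delta_2^{(i)}
   \\]
   - \\( \odot \\): Element-wise product.
   - \\( \sigma'(z) = \sigma(z)\,(1 - \sigma(z)) \\): Derivative of the sigmoid, so \\( \sigma'(z_2) = a_2 \odot (1 - a_2) \\).
   - \\( \delta_2^{(i)} \\): Row \\( i \\) of \\( \delta_2 \\); the sum runs over the \\( n \\) samples.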

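3. **Compute the Gradient of the Loss with Respect to \\( W_1 \\) and \\( b_1 \\):**
   \\[
   \delta_1 = \left( \delta_2 \cdot W_2^T \right) \odot \sigma'(z_1)
   \\]
   \\[
   \frac{\partial L}{\partial W_1} = X^T \cdot \delta_1
   \\]
   \\[
   \frac{\partial L}{\partial b_1} = \sum_{i=1}^{n} \delta_1^{(i)}
   \\]
   - The product with \\( W_2^T \\) is a matrix product; the product with \\( \sigma'(z_1) \\) is element-wise (\\( \odot \\)).
   - \\( \delta_1^{(i)} \\): Row \\( i \\) of \\( \delta_1 \\); the sum runs over the \\( n \\) samples.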

4. **Update the Weights and Biases:**
   \\[
   W_2 = W_2 - \eta \cdot \frac{\partial L}{\partial W_2}
   \\]
   \\[
   b_2 = b_2 - \eta \cdot \frac{\partial L}{\partial b_2}
   \\]
   \\[
   W_1 = W_1 - \eta \cdot \frac{\partial L}{\partial W_1}
   \\]
   \\[
   b_1 = b_1 - \eta \cdot \frac{\partial L}{\partial b_1}
   \\]
   - \\( \eta \\): Learning rate.
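
One way to gain confidence in the gradient formulas above is to compare them against central finite differences. The sketch below does this on a tiny network, assuming the loss is half the sum of squared errors (the loss whose derivative with respect to \\( a_2 \\) is \\( a_2 - y \\)). The helper names `forward`, `loss`, and `numerical_grad`, the layer sizes, and the seed are hypothetical choices for this illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(X, W1, b1, W2, b2):
    a1 = sigmoid(X @ W1 + b1)
    a2 = sigmoid(a1 @ W2 + b2)
    return a1, a2

def loss(X, y, W1, b1, W2, b2):
    # Assumed loss: half the sum of squared errors
    _, a2 = forward(X, W1, b1, W2, b2)
    return 0.5 * np.sum((a2 - y) ** 2)

def numerical_grad(f, param, eps=1e-6):
    # Central differences; param is perturbed in place and then restored
    grad = np.zeros_like(param)
    for idx in np.ndindex(param.shape):
        old = param[idx]
        param[idx] = old + eps
        f_plus = f()
        param[idx] = old - eps
        f_minus = f()
        param[idx] = old
        grad[idx] = (f_plus - f_minus) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([[0.0], [1.0], [1.0], [0.0]])
W1, b1 = rng.normal(size=(2, 3)), rng.normal(size=3)
W2, b2 = rng.normal(size=(3, 1)), rng.normal(size=1)

# Analytic gradients, following steps 1 to 3
a1, a2 = forward(X, W1, b1, W2, b2)
delta2 = (a2 - y) * a2 * (1 - a2)            # error times sigma'(z2), element-wise
delta1 = (delta2 @ W2.T) * a1 * (1 - a1)     # back through W2, times sigma'(z1)
analytic = {
    "W2": a1.T @ delta2,
    "b2": delta2.sum(axis=0),
    "W1": X.T @ delta1,
    "b1": delta1.sum(axis=0),
}

# Compare each analytic gradient with its finite-difference estimate
params = {"W1": W1, "b1": b1, "W2": W2, "b2": b2}
f = lambda: loss(X, y, W1, b1, W2, b2)
max_diffs = {
    name: float(np.max(np.abs(analytic[name] - numerical_grad(f, p))))
    for name, p in params.items()
}
print(max_diffs)
```

If the formulas are implemented correctly, every reported difference should be tiny compared with the gradients themselves.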

---

## **Implementation**
Below is a Python implementation of the network described above, written from scratch with NumPy.

In [1]:
import numpy as np

class SimpleNeuralNetwork:
    def __init__(self, input_size, hidden_size, output_size):
        # Initialize weights and biases randomly
        self.weights1 = np.random.randn(input_size, hidden_size)
        self.bias1 = np.random.randn(hidden_size)
        self.weights2 = np.random.randn(hidden_size, output_size)
        self.bias2 = np.random.randn(output_size)

    def sigmoid(self, x):
        # Sigmoid activation function
        return 1 / (1 + np.exp(-x))

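    def sigmoid_derivative(self, x):
        # Derivative of the sigmoid, written in terms of the sigmoid's output:
        # if x = sigmoid(z), then sigmoid'(z) = x * (1 - x).
        # Callers must therefore pass an activation (a1 or a2), not z1 or z2.
        return x * (1 - x)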

    def forward(self, X):
        # Forward propagation
        self.z1 = np.dot(X, self.weights1) + self.bias1
        self.a1 = self.sigmoid(self.z1)
        self.z2 = np.dot(self.a1, self.weights2) + self.bias2
        self.a2 = self.sigmoid(self.z2)
        return self.a2

    def backward(self, X, y, output, learning_rate=1.0):
        # Backpropagation for the squared-error loss
        self.error = output - y
        self.delta2 = self.error * self.sigmoid_derivative(output)
        self.delta1 = np.dot(self.delta2, self.weights2.T) * self.sigmoid_derivative(self.a1)

        # Gradient-descent update; learning_rate plays the role of eta.
        # A value of 1.0 applies the raw gradients.
        self.weights2 -= learning_rate * np.dot(self.a1.T, self.delta2)
        self.bias2 -= learning_rate * np.sum(self.delta2, axis=0)
        self.weights1 -= learning_rate * np.dot(X.T, self.delta1)
        self.bias1 -= learning_rate * np.sum(self.delta1, axis=0)

    def train(self, X, y, epochs=1000, learning_rate=1.0):
        # Full-batch training: one forward and one backward pass per epoch
        for _ in range(epochs):
            output = self.forward(X)
            self.backward(X, y, output, learning_rate)

# Example usage
if __name__ == "__main__":
    # Input data: 4 samples, 3 features each (the third column is a constant 1)
    X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]])
    y = np.array([[0], [1], [1], [0]])  # XOR of the first two columns

    # Create a neural network
    nn = SimpleNeuralNetwork(input_size=3, hidden_size=4, output_size=1)

    # Train the network
    nn.train(X, y, epochs=10000)

    # Test the network
    print("Predictions after training:")
    for x in X:
        print(f"Input: {x}, Output: {nn.forward(x)}")

Predictions after training:
Input: [0 0 1], Output: [0.0121126]
Input: [0 1 1], Output: [0.98528177]
Input: [1 0 1], Output: [0.99436309]
Input: [1 1 1], Output: [0.01194609]
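
The outputs above are sigmoid activations between 0 and 1, not class labels. A minimal sketch of converting them to labels follows, assuming a decision threshold of 0.5 (a common choice that the notebook itself does not specify). The `outputs` array is copied from the run shown above.

```python
import numpy as np

# Outputs copied from the printed predictions above
outputs = np.array([[0.0121126], [0.98528177], [0.99436309], [0.01194609]])
y = np.array([[0], [1], [1], [0]])         # XOR targets

predicted = (outputs >= 0.5).astype(int)   # the 0.5 threshold is an assumption
accuracy = np.mean(predicted == y)

print(predicted.ravel())  # [0 1 1 0]
print(accuracy)           # 1.0
```

Since the weights are initialized randomly without a fixed seed, the raw outputs will differ from run to run even when the thresholded labels agree.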

