---


# Deep Neural Network: From Scratch

## **Introduction**
A **Deep Neural Network (DNN)** is a neural network with more than one hidden layer. The additional layers let it learn more complex, non-linear mappings than a shallow network typically can. In this notebook, we implement a deep neural network from scratch using NumPy.

---

## **Mathematical Foundations**

### **1. Forward Propagation**
Forward propagation involves computing the output of the network given an input. The steps are as follows:

1. **Linear Transformation (Hidden Layer \\( i \\)):**
   \\[
   z_i = a_{i-1} \cdot W_i + b_i
   \\]
   - \\( a_{i-1} \\): Activation of the previous layer, with \\( a_0 = X \\) (the input).
   - \\( W_i \\): Weights of the current layer.
   - \\( b_i \\): Biases of the current layer.

2. **Activation Function (Hidden Layer \\( i \\)):**
   \\[
   a_i = \sigma(z_i)
   \\]
   - \\( \sigma \\): Sigmoid activation function, \\( \sigma(z) = 1 / (1 + e^{-z}) \\).

3. **Output Layer:**
   \\[
   z_{\text{out}} = a_{\text{last}} \cdot W_{\text{out}} + b_{\text{out}}
   \\]
   \\[
   a_{\text{out}} = \sigma(z_{\text{out}})
   \\]
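
The forward-propagation steps above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the notebook's implementation: the helper name `forward_pass`, the layer sizes, the zero biases, and the fixed seed are all assumptions chosen only to make the shapes easy to follow.

```python
import numpy as np

def sigmoid(z):
    """Element-wise logistic sigmoid."""
    return 1.0 / (1.0 + np.exp(-z))

def forward_pass(X, weights, biases):
    """Return the activations [a_0, a_1, ..., a_out] for a batch X (hypothetical helper)."""
    activations = [X]  # a_0 is the input itself
    for W, b in zip(weights, biases):
        z = activations[-1] @ W + b      # linear transformation: z_i = a_{i-1} W_i + b_i
        activations.append(sigmoid(z))   # activation: a_i = sigma(z_i)
    return activations

rng = np.random.default_rng(0)  # fixed seed, used only for repeatability
layer_sizes = [3, 4, 1]         # illustrative sizes (assumption)
weights = [rng.standard_normal((m, n))
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]

X = rng.standard_normal((5, 3))  # batch of 5 samples, 3 features each
activations = forward_pass(X, weights, biases)
print([a.shape for a in activations])  # [(5, 3), (5, 4), (5, 1)]
```

Each row of `X` is one sample, so every activation keeps the batch dimension first and takes the width of its layer as the second dimension.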

### **2. Backpropagation**
Backpropagation involves computing the gradients of the loss function with respect to the weights and biases. The steps are as follows:

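1. **Compute the Error:**
   \\[
   \text{error} = a_{\text{out}} - y
   \\]
   - \\( y \\): Target output. The error equals \\( \partial L / \partial a_{\text{out}} \\) when the loss is taken to be \\( L = \frac{1}{2} \sum (a_{\text{out}} - y)^2 \\), which is the loss the gradients in the following steps are consistent with.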

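2. **Compute the Gradient of the Loss with Respect to \\( W_{\text{out}} \\) and \\( b_{\text{out}} \\):**
   \\[
   \delta_{\text{out}} = \text{error} \odot \sigma'(z_{\text{out}})
   \\]
   \\[
   \frac{\partial L}{\partial W_{\text{out}}} = a_{\text{last}}^T \cdot \delta_{\text{out}}
   \\]
   \\[
   \frac{\partial L}{\partial b_{\text{out}}} = \sum \delta_{\text{out}}
   \\]
   - \\( \odot \\): Element-wise multiplication; \\( \cdot \\) remains the matrix product.
   - \\( \sigma'(z) = \sigma(z)(1 - \sigma(z)) \\): Derivative of the sigmoid.
   - The sum in the bias gradient runs over the samples in the batch.

3. **Compute the Gradient of the Loss with Respect to \\( W_i \\) and \\( b_i \\):**
   \\[
   \delta_i = \left( \delta_{i+1} \cdot W_{i+1}^T \right) \odot \sigma'(z_i)
   \\]
   \\[
   \frac{\partial L}{\partial W_i} = a_{i-1}^T \cdot \delta_i
   \\]
   \\[
   \frac{\partial L}{\partial b_i} = \sum \delta_i
   \\]
   - For the last hidden layer, \\( \delta_{i+1} = \delta_{\text{out}} \\) and \\( W_{i+1} = W_{\text{out}} \\).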

4. **Update the Weights and Biases:**
   \\[
   W_i = W_i - \eta \cdot \frac{\partial L}{\partial W_i}
   \\]
   \\[
   b_i = b_i - \eta \cdot \frac{\partial L}{\partial b_i}
   \\]
   - \\( \eta \\): Learning rate, which scales the size of each update step.
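
One way to gain confidence in the backpropagation formulas above is to compare them against finite-difference gradients. The sketch below assumes the loss is half the sum of squared errors (the text does not name a loss, but this one is consistent with the error term used above). The helper names `backprop` and `numerical_grad`, the layer sizes, the random data, and the learning rate of 0.1 are all illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, weights, biases):
    """Return activations [a_0, ..., a_out]; a_0 is the input X."""
    activations = [X]
    for W, b in zip(weights, biases):
        activations.append(sigmoid(activations[-1] @ W + b))
    return activations

def loss(X, y, weights, biases):
    """Assumed loss: half the sum of squared errors."""
    return 0.5 * np.sum((forward(X, weights, biases)[-1] - y) ** 2)

def backprop(X, y, weights, biases):
    """Analytical gradients for every weight matrix and bias vector."""
    acts = forward(X, weights, biases)
    # Output delta: error times sigma'(z_out), using sigma'(z) = a * (1 - a)
    delta = (acts[-1] - y) * acts[-1] * (1 - acts[-1])
    grads_W = [None] * len(weights)
    grads_b = [None] * len(weights)
    for i in reversed(range(len(weights))):
        grads_W[i] = acts[i].T @ delta   # a_{i-1}^T . delta_i
        grads_b[i] = delta.sum(axis=0)   # sum of delta_i over the batch
        if i > 0:
            # Propagate the delta one layer back
            delta = (delta @ weights[i].T) * acts[i] * (1 - acts[i])
    return grads_W, grads_b

def numerical_grad(X, y, weights, biases, layer, eps=1e-6):
    """Central-difference estimate of dL/dW for one layer."""
    grad = np.zeros_like(weights[layer])
    for idx in np.ndindex(*grad.shape):
        original = weights[layer][idx]
        weights[layer][idx] = original + eps
        loss_plus = loss(X, y, weights, biases)
        weights[layer][idx] = original - eps
        loss_minus = loss(X, y, weights, biases)
        weights[layer][idx] = original   # restore the weight
        grad[idx] = (loss_plus - loss_minus) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
sizes = [3, 4, 4, 1]
weights = [rng.standard_normal((m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [rng.standard_normal(n) for n in sizes[1:]]
X = rng.standard_normal((6, 3))
y = rng.integers(0, 2, size=(6, 1)).astype(float)

grads_W, grads_b = backprop(X, y, weights, biases)
max_diff = max(
    np.max(np.abs(grads_W[l] - numerical_grad(X, y, weights, biases, l)))
    for l in range(len(weights))
)
print("largest gap between analytical and numerical gradients:", max_diff)

# Step 4: one gradient-descent update with an assumed learning rate
eta = 0.1
for i in range(len(weights)):
    weights[i] -= eta * grads_W[i]
    biases[i] -= eta * grads_b[i]
```

If the analytical and numerical gradients agree to within a small tolerance, the delta recursion and the weight-gradient formula are very likely implemented correctly.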

---

## **Implementation**
Below is the Python code for implementing a deep neural network from scratch.

In [3]:
import numpy as np

class DeepNeuralNetwork:
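    def __init__(self, layer_sizes, learning_rate=1.0):
        # layer_sizes lists the width of every layer, input first and output last.
        # learning_rate is eta in the update rule; the default of 1.0 applies the gradients unscaled.
        self.learning_rate = learning_rate
        # One weight matrix and one bias vector per pair of consecutive layers
        self.weights = [np.random.randn(layer_sizes[i], layer_sizes[i+1]) for i in range(len(layer_sizes)-1)]
        self.biases = [np.random.randn(layer_sizes[i+1]) for i in range(len(layer_sizes)-1)]

    def sigmoid(self, x):
        # Sigmoid activation function
        return 1 / (1 + np.exp(-x))

    def sigmoid_derivative(self, a):
        # Derivative of the sigmoid, written in terms of its output a = sigmoid(z):
        # sigma'(z) = a * (1 - a). Pass an activation here, not a pre-activation z.
        return a * (1 - a)

    def forward(self, X):
        # Forward propagation; caches every layer's z and activation for backward()
        self.activations = [X]
        self.z_values = []
        for i in range(len(self.weights)):
            z = np.dot(self.activations[-1], self.weights[i]) + self.biases[i]
            self.z_values.append(z)
            self.activations.append(self.sigmoid(z))
        return self.activations[-1]

    def backward(self, X, y, output):
        # Backpropagation
        self.error = output - y
        # Output-layer delta: error times sigma'(z_out), element-wise
        self.deltas = [self.error * self.sigmoid_derivative(output)]
        # Propagate deltas from the last hidden layer back to the first
        for i in range(len(self.weights)-1, 0, -1):
            delta = np.dot(self.deltas[-1], self.weights[i].T) * self.sigmoid_derivative(self.activations[i])
            self.deltas.append(delta)
        self.deltas.reverse()

        # Update weights and biases with a gradient-descent step
        for i in range(len(self.weights)):
            self.weights[i] -= self.learning_rate * np.dot(self.activations[i].T, self.deltas[i])
            self.biases[i] -= self.learning_rate * np.sum(self.deltas[i], axis=0)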

    def train(self, X, y, epochs=1000):
        # Training the network
        for _ in range(epochs):
            output = self.forward(X)
            self.backward(X, y, output)

# Example usage
if __name__ == "__main__":
    # Input data (4 samples, 3 features each); the third feature is a constant 1
    X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]])
    y = np.array([[0], [1], [1], [0]])  # XOR of the first two features

    # Create a deep neural network
    nn = DeepNeuralNetwork(layer_sizes=[3, 4, 4, 1])

    # Train the network
    nn.train(X, y, epochs=10000)

    # Test the network
    print("Predictions after training:")
    for x in X:
        print(f"Input: {x}, Output: {nn.forward(x)}")

Predictions after training:
Input: [0 0 1], Output: [0.00394415]
Input: [0 1 1], Output: [0.99481365]
Input: [1 0 1], Output: [0.99459514]
Input: [1 1 1], Output: [0.00561824]
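
The sigmoid outputs above are values between 0 and 1 rather than hard class labels. A minimal sketch of turning them into labels and scoring them follows; the outputs and targets are copied from the run and the example above, while the helper `to_labels` and the 0.5 threshold are assumptions made for illustration.

```python
import numpy as np

# Outputs copied from the printed predictions, targets from the XOR example
outputs = np.array([[0.00394415], [0.99481365], [0.99459514], [0.00561824]])
targets = np.array([[0], [1], [1], [0]])

def to_labels(probabilities, threshold=0.5):
    """Map sigmoid outputs to 0/1 labels (threshold is an assumed convention)."""
    return (probabilities >= threshold).astype(int)

labels = to_labels(outputs)
accuracy = np.mean(labels == targets)      # fraction of correct labels
mse = np.mean((outputs - targets) ** 2)    # mean squared error of the raw outputs

print(labels.ravel())  # [0 1 1 0]
print(accuracy)        # 1.0
print(mse)
```

Because the weights are initialized randomly without a fixed seed, a fresh training run may produce different raw outputs than the ones shown.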

