# 🧠 Artificial Neural Network with a Single Hidden Layer

## 🔹 How It Works

This is a basic **feedforward neural network** with a single hidden layer containing 128 neurons. It processes input features through the following steps:

1. **Forward Pass**:
   - Computes a linear combination of inputs and hidden layer weights.
   - Applies **ReLU** activation in the hidden layer to introduce non-linearity.
   - Computes a second linear combination with output layer weights.
   - Applies **softmax** to obtain class probabilities.

2. **Loss Computation**:
   - Uses **cross-entropy loss**, which is suitable for multi-class classification problems.

3. **Backward Pass**:
   - Computes gradients of the loss w.r.t. weights using **backpropagation**.
   - Updates weights using **gradient descent**.

This model is suitable for classification tasks and is implemented using NumPy without any deep learning frameworks.

---
## 🔹 Key Components

### 1. **Neuron (Node)**
Each neuron computes:

```
z = w·x + b  
a = activation(z)
```

Where:  
- `x` = input  
- `w` = weights  
- `b` = bias  
- `z` = linear combination  
- `a` = output after activation  
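
As a minimal sketch of the two formulas above, the snippet below evaluates a single neuron with NumPy. The input, weight, and bias values are hypothetical and chosen only for illustration, and ReLU stands in for `activation`.

```python
import numpy as np

# Hypothetical values, chosen only for illustration.
x = np.array([1.0, 2.0, 3.0])    # input features
w = np.array([0.5, -1.0, 0.25])  # one weight per input feature
b = 0.1                          # bias

z = np.dot(w, x) + b             # linear combination: w·x + b
a = np.maximum(0.0, z)           # ReLU used as the activation here

print(f"z = {z:.2f}, a = {a:.2f}")
```

Because `z` is negative for these values, ReLU clips the output `a` to zero.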

### 2. **Activation Functions**
Add non-linearity:
- **ReLU**: `f(x) = max(0, x)`
- **Sigmoid**: `f(x) = 1 / (1 + e^-x)`
- **Tanh**: `f(x) = (e^x - e^-x)/(e^x + e^-x)`
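
The three activations listed above could be written in NumPy roughly as follows. This is an illustrative sketch; the function names and the sample input are assumptions, and only ReLU is used by the model in this document.

```python
import numpy as np

def relu(x):
    # max(0, x), applied element-wise
    return np.maximum(0.0, x)

def sigmoid(x):
    # 1 / (1 + e^-x), squashes values into (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # (e^x - e^-x) / (e^x + e^-x), squashes values into (-1, 1)
    return np.tanh(x)

x = np.array([-2.0, 0.0, 2.0])  # hypothetical sample input
relu_out = relu(x)
sigmoid_out = sigmoid(x)
tanh_out = tanh(x)
```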

### 3. **Loss Function**
Measures prediction error:
- **MSE** (regression): `mean((y_pred - y_true)^2)`
- **Cross-entropy** (classification): `-sum(y_true * log(y_pred))`, averaged over samples
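
A small sketch of both losses in NumPy, to make the definitions concrete. The helper names, the `eps` argument, and the sample arrays are assumptions for illustration; the cross-entropy version expects one-hot targets and adds a small `eps` to avoid `log(0)`.

```python
import numpy as np

def mse(y_pred, y_true):
    # Mean of the squared differences
    return np.mean((y_pred - y_true) ** 2)

def cross_entropy(probs, y_onehot, eps=1e-9):
    # Negative log-probability of the true class, averaged over samples
    return -np.sum(y_onehot * np.log(probs + eps), axis=1).mean()

# Hypothetical regression example
mse_value = mse(np.array([2.0, 4.0]), np.array([1.0, 2.0]))

# Hypothetical classification example: 2 samples, 3 classes
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])
y = np.array([[1, 0, 0],
              [0, 1, 0]])
ce_value = cross_entropy(probs, y)
```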

### 4. **Backpropagation**
Gradient-based method to minimize the loss:
- Uses chain rule to compute gradients of loss w.r.t. weights.  
- Updates weights via **gradient descent**.
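
For the network described in this document (ReLU hidden layer, softmax output, cross-entropy loss averaged over $m$ samples), the chain rule yields the gradients below. The notation is an assumption introduced here for illustration: $X$ is the input batch, $Z_1 = X W_1 + b_1$, $H = \mathrm{ReLU}(Z_1)$, $Z_2 = H W_2 + b_2$, $\hat{Y} = \mathrm{softmax}(Z_2)$, and $Y$ holds the one-hot targets.

$$
\begin{aligned}
\frac{\partial L}{\partial Z_2} &= \frac{1}{m}\,(\hat{Y} - Y) \\
\frac{\partial L}{\partial W_2} &= H^{\top}\,\frac{\partial L}{\partial Z_2},
\qquad
\frac{\partial L}{\partial b_2} = \sum_{i=1}^{m} \left(\frac{\partial L}{\partial Z_2}\right)_{i} \\
\frac{\partial L}{\partial Z_1} &= \left(\frac{\partial L}{\partial Z_2}\, W_2^{\top}\right) \odot \mathbb{1}[Z_1 > 0] \\
\frac{\partial L}{\partial W_1} &= X^{\top}\,\frac{\partial L}{\partial Z_1},
\qquad
\frac{\partial L}{\partial b_1} = \sum_{i=1}^{m} \left(\frac{\partial L}{\partial Z_1}\right)_{i}
\end{aligned}
$$

Here $\odot$ is the element-wise product, $\mathbb{1}[Z_1 > 0]$ is the ReLU derivative, and the sums run over the rows (samples) of the batch.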

---

## 🔹 Training Process (Forward + Backward Pass)

1. **Initialize** weights and biases  
2. **Forward pass**: Compute output with current weights  
3. **Compute loss**  
4. **Backward pass**: Compute gradients via backpropagation  
5. **Update weights**: `w = w - lr * dw`  
6. **Repeat** until convergence or max epochs
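
The six steps above can be sketched on a deliberately tiny problem. The example below is not the network from this document: it fits a one-weight linear model `y = w*x + b` with MSE, and the data, learning rate, and epoch count are hypothetical.

```python
import numpy as np

# Hypothetical toy data generated from y = 2x
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x

# 1. Initialize weights and biases
w, b = 0.0, 0.0
lr = 0.05
losses = []

# 6. Repeat for a fixed number of epochs
for epoch in range(1000):
    # 2. Forward pass
    y_pred = w * x + b
    # 3. Compute loss (MSE)
    loss = np.mean((y_pred - y) ** 2)
    losses.append(loss)
    # 4. Backward pass: gradients of the MSE w.r.t. w and b
    dw = np.mean(2.0 * (y_pred - y) * x)
    db = np.mean(2.0 * (y_pred - y))
    # 5. Update weights: w = w - lr * dw
    w -= lr * dw
    b -= lr * db

print(f"w = {w:.3f}, b = {b:.3f}, final loss = {losses[-1]:.6f}")
```

The same loop structure applies to the neural network; only the forward pass and the gradient formulas change.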

## A walkthrough of my code

## 🔹 Class: `SingleHiddenLayerNN`

### **Constructor**
```python
def __init__(self, learning_rate, feature_size, output_size): ...
```
- `learning_rate`: Learning rate for gradient descent.
- `feature_size`: Number of input features.
- `output_size`: Number of output classes.
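- Initializes weights (standard normal values scaled by 0.01) and biases (zeros) for:
  - Hidden layer of size 128 (fixed): `weights1` has shape `(feature_size, 128)`.
  - Output layer based on the number of classes: `weights2` has shape `(128, output_size)`.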

---

## 🔹 Methods

### `relu(x)`
Applies ReLU activation: `max(0, x)`

### `relu_derivative(x)`
Computes derivative of ReLU: `1 if x > 0 else 0`

### `softmax(x)`
Applies a numerically stable softmax to each row (one row per sample); subtracting the row maximum does not change the result:
```
softmax(x) = exp(x - max(x)) / sum(exp(x - max(x)))
```
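
To illustrate why the maximum is subtracted, the sketch below applies the same row-wise softmax to very large logits. In float64, `np.exp(1000.0)` overflows to `inf`, so an unshifted version would produce `nan`; the shifted version stays finite. The sample logits are hypothetical.

```python
import numpy as np

def softmax(x):
    # Shift each row by its maximum so the largest exponent is exp(0) = 1
    e_x = np.exp(x - np.max(x, axis=1, keepdims=True))
    return e_x / np.sum(e_x, axis=1, keepdims=True)

# Two rows that differ only by a constant offset of 999
logits = np.array([[1000.0, 1001.0, 1002.0],
                   [1.0, 2.0, 3.0]])
probs = softmax(logits)

print(probs)
```

Both rows yield the same probabilities, since softmax is unchanged by adding a constant to every entry of a row.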

### `forward(x)`
Performs a forward pass:
1. Linear combination → ReLU (hidden layer)
2. Linear combination → Softmax (output layer)
3. Returns class probabilities

### `backward(x, y)`
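Performs backpropagation. It must be called after `forward(x)`, because it reuses the activations cached there, and `y` must be one-hot encoded with shape `(n_samples, output_size)`:
- Uses the combined **softmax + cross-entropy** gradient, `final_output - y`, averaged over the batch
- Computes gradients w.r.t. the weights and biases of both layers
- Updates the parameters in place via gradient descent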
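
Since `backward` subtracts `y` from the softmax output, the targets need to be one-hot encoded. The helper below is a hypothetical sketch of one way to convert integer class labels; `one_hot` is not part of the class, and the sample labels are made up.

```python
import numpy as np

def one_hot(labels, num_classes):
    # Hypothetical helper: one row per sample, with a 1.0 in the true-class column
    encoded = np.zeros((labels.shape[0], num_classes))
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded

labels = np.array([0, 2, 1])  # hypothetical integer class labels
y = one_hot(labels, 3)

print(y)
```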

### `train(x, y, epochs)`
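Trains the model over multiple epochs on the full dataset (no mini-batching):
- Runs one forward and one backward pass per epoch
- Logs the training loss every 100 epochs and at the final epoch
- Stores the logged loss values in `self.training_loss`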

### `predict(x)`
Predicts class labels:
- Performs a forward pass
- Returns `argmax` of class probabilities
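
As a small illustration of the `argmax` step, the sketch below turns a hypothetical probability matrix into class labels and compares them with made-up true labels to get an accuracy.

```python
import numpy as np

# Hypothetical softmax output: 3 samples, 3 classes
probs = np.array([[0.1, 0.7, 0.2],
                  [0.6, 0.3, 0.1],
                  [0.2, 0.2, 0.6]])

# Index of the most probable class in each row
predicted = np.argmax(probs, axis=1)

# Hypothetical ground-truth labels, used only to show an accuracy computation
true_labels = np.array([1, 0, 1])
accuracy = np.mean(predicted == true_labels)

print(predicted, accuracy)
```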

---

## 🔹 Architecture Summary

```
Input → [Linear → ReLU] → Hidden (128 neurons) → [Linear → Softmax] → Output
```
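
To make the diagram concrete, the sketch below traces array shapes through the two layers. Only the hidden size of 128 comes from this document; the batch size, feature count, class count, and variable names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, except hidden_size = 128
batch_size, feature_size, hidden_size, output_size = 5, 4, 128, 3

x = rng.normal(size=(batch_size, feature_size))           # (5, 4)
W1 = rng.normal(size=(feature_size, hidden_size)) * 0.01  # (4, 128)
b1 = np.zeros((1, hidden_size))                           # (1, 128), broadcast over rows
W2 = rng.normal(size=(hidden_size, output_size)) * 0.01   # (128, 3)
b2 = np.zeros((1, output_size))                           # (1, 3)

hidden = np.maximum(0, x @ W1 + b1)                       # Linear -> ReLU: (5, 128)
logits = hidden @ W2 + b2                                 # Linear: (5, 3)
shifted = logits - logits.max(axis=1, keepdims=True)
probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)  # Softmax: (5, 3)

print(hidden.shape, logits.shape, probs.shape)
```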

---

## 🔹 Use Case

- Multi-class classification
- Simple datasets with numerical features
- Educational and prototyping purposes

---


```python
import numpy as np

class SingleHiddenLayerNN:
    def __init__(self, learning_rate, feature_size, output_size):
        self.learning_rate = learning_rate
        self.hidden_size = 128  # fixed hidden layer width

        # Small random weights and zero biases for both layers
        self.weights1 = np.random.randn(feature_size, self.hidden_size) * 0.01
        self.bias1 = np.zeros((1, self.hidden_size))
        self.weights2 = np.random.randn(self.hidden_size, output_size) * 0.01
        self.bias2 = np.zeros((1, output_size))

    def relu(self, x):
        return np.maximum(0, x)

    def relu_derivative(self, x):
        # 1.0 where the pre-activation is positive, 0.0 elsewhere
        return (x > 0).astype(float)

    def softmax(self, x):
        # Subtract the row-wise max for numerical stability
        e_x = np.exp(x - np.max(x, axis=1, keepdims=True))
        return e_x / np.sum(e_x, axis=1, keepdims=True)

    def forward(self, x):
        # Intermediate values are cached on self for use in backward()
        self.hidden_input = np.dot(x, self.weights1) + self.bias1
        self.hidden_output = self.relu(self.hidden_input)
        self.final_input = np.dot(self.hidden_output, self.weights2) + self.bias2
        self.final_output = self.softmax(self.final_input)
        return self.final_output

    def backward(self, x, y):
        # y must be one-hot encoded; forward(x) must have been called first
        m = x.shape[0]
        self.error = self.final_output - y  # for softmax + cross-entropy
        d_output = self.error / m

        # Output layer gradients
        dW2 = np.dot(self.hidden_output.T, d_output)
        db2 = np.sum(d_output, axis=0, keepdims=True)

        # Hidden layer gradients (chain rule through ReLU)
        d_hidden = np.dot(d_output, self.weights2.T) * self.relu_derivative(self.hidden_input)
        dW1 = np.dot(x.T, d_hidden)
        db1 = np.sum(d_hidden, axis=0, keepdims=True)

        # Gradient descent update
        self.weights2 -= self.learning_rate * dW2
        self.bias2 -= self.learning_rate * db2
        self.weights1 -= self.learning_rate * dW1
        self.bias1 -= self.learning_rate * db1

    def train(self, x, y, epochs):
        self.training_loss = []
        for i in range(epochs):
            self.forward(x)
            self.backward(x, y)
            if i % 100 == 0 or i == epochs - 1:
                # Cross-entropy of the forward pass made before this epoch's update
                loss = -np.sum(y * np.log(self.final_output + 1e-9), axis=1).mean()
                self.training_loss.append(loss)
                print(f"Epoch: {i}, Loss: {loss:.4f}")

    def predict(self, x):
        probabilities = self.forward(x)
        return np.argmax(probabilities, axis=1)
```