### Day 1: Understanding the Basics of Neural Networks

#### 1. **Neuron Formula: The Building Block**
A neuron in a neural network is inspired by the human brain. It takes inputs, processes them, and produces an output. Let’s break down the formula of a **perceptron** (the simplest form of a neuron):

- **Inputs**: $ x_1, x_2, ..., x_n $ (features)
- **Weights**: $ w_1, w_2, ..., w_n $ (how important each input is)
- **Bias**: $ b $ (shifts the activation function, so the model is more flexible)
- **Output**: The perceptron’s output is a weighted sum of the inputs, passed through an activation function.

$$ z = w_1x_1 + w_2x_2 + \dots + w_nx_n + b $$

This $ z $ is then passed through an **activation function** $ f $ to give the final output:

$$ y = f(z) $$

Here, $ f(z) $ could be a **sigmoid function**, a **ReLU** (Rectified Linear Unit), or another activation function.
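
To make the formula concrete, here is a minimal sketch of a single neuron in NumPy. The input values, weights, and bias are made-up numbers chosen so the arithmetic is easy to check by hand, and sigmoid is only one possible choice for $ f $.

```python
import numpy as np


def sigmoid(z):
    """Sigmoid activation: squashes z into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


# Illustrative values (assumptions, not taken from a real dataset)
x = np.array([1.0, 2.0])     # inputs x_1, x_2
w = np.array([0.5, -0.25])   # weights w_1, w_2
b = 0.0                      # bias

z = np.dot(w, x) + b         # weighted sum: 0.5*1 + (-0.25)*2 + 0 = 0.0
y = sigmoid(z)               # activation: sigmoid(0) = 0.5

print(z, y)
```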

#### 2. **Forward Propagation: Passing Inputs Through the Network**
In **forward propagation**, the data passes through all the layers of the network:

- The inputs are passed to each neuron.
- The neuron computes the weighted sum of inputs plus bias, and the result is transformed using the activation function.
- This continues through all the layers, from the input to the output layer, where the final prediction is made.

**Goal**: Make a prediction based on the input.
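
The steps above can be sketched as a loop over layers. This is a minimal illustration, assuming a tiny network with one hidden layer (ReLU) and one output neuron (sigmoid); the `forward` helper, the layer sizes, and all weight values are hypothetical and picked so the result can be verified by hand.

```python
import numpy as np


def relu(z):
    return np.maximum(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def forward(x, layers):
    """Pass x through each (W, b, activation) layer in order."""
    a = x
    for W, b, activation in layers:
        a = activation(W @ a + b)  # weighted sum plus bias, then activation
    return a


# Hypothetical weights: 2 inputs -> 2 hidden neurons -> 1 output neuron
W1 = np.array([[1.0, -1.0],
               [0.5,  0.5]])
b1 = np.array([0.0, -1.0])
W2 = np.array([[1.0, -2.0]])
b2 = np.array([0.0])

x = np.array([2.0, 1.0])

# Hidden layer: relu([1.0, 0.5]) = [1.0, 0.5]
# Output layer: sigmoid(1.0*1.0 - 2.0*0.5 + 0.0) = sigmoid(0.0) = 0.5
prediction = forward(x, [(W1, b1, relu), (W2, b2, sigmoid)])
print(prediction)
```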

#### 3. **Backpropagation: Learning From Mistakes**
Neural networks learn through **backpropagation**. After predicting an output, we compare it with the true value using a **loss function** (e.g., Mean Squared Error for regression or Cross Entropy for classification). 

To improve the model, we need to adjust the weights and biases to reduce the error:

1. **Calculate the error**: Compare the predicted output to the true output using a loss function.
2. **Compute gradients**: Using calculus, calculate how much each weight contributed to the error (this is done through the chain rule).
3. **Update weights**: Adjust the weights and biases slightly to reduce the error. This is done using **gradient descent**.

$$ w_i \gets w_i - \eta \frac{\partial L}{\partial w_i} $$

Where $ \eta $ is the learning rate and $ \frac{\partial L}{\partial w_i} $ is the gradient of the loss with respect to the weight $ w_i $.

**Goal**: Minimize the loss by adjusting weights and biases to improve the accuracy of the model.
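
As a rough illustration of one update, the sketch below applies a single gradient descent step to one neuron. To keep the derivative short it assumes a linear neuron (no activation) and a squared-error loss $ L = (\hat{y} - y)^2 $, so the chain rule gives $ \partial L / \partial w_i = 2(\hat{y} - y)x_i $. The data point and learning rate are made-up values.

```python
import numpy as np


def predict(w, b, x):
    """Linear neuron: weighted sum plus bias, no activation (an assumption)."""
    return np.dot(w, x) + b


def squared_error(w, b, x, y_true):
    return (predict(w, b, x) - y_true) ** 2


# One illustrative training example and starting parameters
x = np.array([1.0, 2.0])
y_true = 1.0
w = np.zeros(2)
b = 0.0
eta = 0.1  # learning rate

loss_before = squared_error(w, b, x, y_true)   # (0 - 1)^2 = 1.0

# Gradients from the chain rule
error = predict(w, b, x) - y_true              # -1.0
grad_w = 2 * error * x                         # [-2.0, -4.0]
grad_b = 2 * error                             # -2.0

# Gradient descent update: move against the gradient
w = w - eta * grad_w                           # roughly [0.2, 0.4]
b = b - eta * grad_b                           # roughly 0.2

loss_after = squared_error(w, b, x, y_true)    # smaller than loss_before
print(loss_before, loss_after)
```

A full backpropagation pass repeats this idea layer by layer, reusing the gradients of later layers to compute those of earlier ones.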

#### 4. **Activation Functions: The Decision Makers**
- **Sigmoid**: Squashes the output between 0 and 1, useful for probabilities.
  
  $$ f(z) = \frac{1}{1 + e^{-z}} $$
  
- **ReLU (Rectified Linear Unit)**: Output is 0 if $ z \leq 0 $, and $ z $ if $ z > 0 $. Popular in deeper networks because its gradient is exactly 1 for positive inputs, which helps mitigate the vanishing gradient problem.

  $$ f(z) = \max(0, z) $$

- **Softmax**: Used in the output layer for multi-class classification. It converts raw scores $ z_1, ..., z_K $ into probabilities that sum to 1.

  $$ f(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}} $$
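
The three activation functions above can be written in a few lines of NumPy. This is a minimal sketch; subtracting the maximum score inside `softmax` is a common numerical-stability trick added here as an assumption, and it does not change the result.

```python
import numpy as np


def sigmoid(z):
    """Squashes each value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def relu(z):
    """Zero for negative inputs, identity for positive inputs."""
    return np.maximum(0.0, z)


def softmax(z):
    """Turns a vector of raw scores into probabilities that sum to 1."""
    shifted = z - np.max(z)      # avoids overflow in exp for large scores
    exps = np.exp(shifted)
    return exps / np.sum(exps)


scores = np.array([-1.0, 0.0, 2.0])  # illustrative raw scores

print(sigmoid(scores))
print(relu(scores))

probs = softmax(scores)
print(probs, probs.sum())
```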

#### 5. **Programming from Scratch**

We’ll start by coding a simple perceptron in Python. Here’s a breakdown of the perceptron code, explaining each part; the complete, runnable version follows the breakdown.

### 1. **Imports**
```python
import numpy as np
```
- **`numpy`** is imported as `np` to handle arrays and mathematical operations efficiently, such as dot products.

---

### 2. **Activation Function**
```python
def activation_function(x):
    return 1 if x >= 0 else 0
```
- The **activation function** determines the perceptron’s output. In this case, it's a simple **step function**, which returns `1` if the input `x` is greater than or equal to `0` and `0` otherwise.
- The perceptron fires (returns `1`) if the input signal is strong enough (non-negative), and doesn't fire (returns `0`) if the signal is weak (negative).
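
The step function above works on one number at a time. If you want to apply it to a whole array of values, a vectorized variant could look like the sketch below; the name `step` is hypothetical and is not used elsewhere in this notebook.

```python
import numpy as np


def step(x):
    """Vectorized step function: 1 where x >= 0, otherwise 0."""
    return np.where(np.asarray(x) >= 0, 1, 0)


# Negative -> 0, exactly zero -> 1, positive -> 1
print(step([-0.5, 0.0, 2.0]))
```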

---

### 3. **Perceptron Class**
#### 3.1. **Initialization**
```python
class Perceptron:
    def __init__(self, input_size, learning_rate=0.1):
        self.weights = np.zeros(input_size + 1)  # +1 for the bias term
        self.learning_rate = learning_rate
```
- The **`Perceptron` class** defines the model.
- **`input_size`** specifies how many input features there are. The perceptron is a binary classifier: whatever the number of inputs, it outputs either `0` or `1`.
- The **weights** are initialized to zeros using `np.zeros`. There’s one weight for each input, plus one for the **bias term** (which controls the threshold for the decision boundary).
- **`learning_rate`** defines the step size for updating the weights during training.

#### 3.2. **Predict Function**
```python
def predict(self, inputs):
    summation = np.dot(inputs, self.weights[1:]) + self.weights[0]
    return activation_function(summation)
```
- **`predict`** computes the weighted sum of the inputs:
    - `np.dot(inputs, self.weights[1:])` calculates the dot product between the input values and the weights (excluding the bias).
    - **`self.weights[0]`** is the bias term added to the summation.
- The **activation function** then decides the output based on the summation (whether it's 0 or 1).
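
To see the arithmetic in `predict`, here is a small standalone sketch that takes the weights as an argument instead of reading them from `self`. The weight values are hand-picked for illustration (they are not learned values) and happen to implement an AND gate.

```python
import numpy as np


def activation_function(x):
    return 1 if x >= 0 else 0


def predict(inputs, weights):
    """Same computation as Perceptron.predict, with weights passed explicitly."""
    summation = np.dot(inputs, weights[1:]) + weights[0]
    return activation_function(summation)


# Hand-picked weights (an assumption): bias = -1.5, w_1 = 1.0, w_2 = 1.0
weights = np.array([-1.5, 1.0, 1.0])

print(predict(np.array([1, 1]), weights))  # 1*1.0 + 1*1.0 - 1.5 =  0.5 -> 1
print(predict(np.array([0, 1]), weights))  # 0*1.0 + 1*1.0 - 1.5 = -0.5 -> 0
```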
  
#### 3.3. **Training Function**
```python
def train(self, training_inputs, labels, epochs=10):
    for _ in range(epochs):  # Loop over the dataset multiple times
        for inputs, label in zip(training_inputs, labels):
            prediction = self.predict(inputs)
            self.weights[1:] += self.learning_rate * (label - prediction) * inputs
            self.weights[0] += self.learning_rate * (label - prediction)
```
- The **`train`** method performs the learning process:
  - **`training_inputs`**: The data that will be used to train the perceptron (e.g., `[[0,0], [0,1], [1,0], [1,1]]` for an AND gate).
  - **`labels`**: The true output for each input (e.g., `[0, 0, 0, 1]` for an AND gate).
  - **`epochs`**: Number of times the model goes through the entire dataset. Multiple epochs give the perceptron repeated chances to correct its mistakes. For linearly separable data such as the AND gate, the updates stop once every example is classified correctly, provided enough epochs are run.

- For each input-label pair:
  1. **Prediction**: The current input is passed to `self.predict(inputs)` to get the perceptron's output.
  2. **Weight update**: The weights are updated based on the error `(label - prediction)`. If the prediction is correct, the error is `0` and nothing changes. If the prediction is wrong, the error is `+1` or `-1`, and the weights are adjusted to reduce future errors:
     - **`self.weights[1:]`**: The weights associated with the inputs are updated.
     - **`self.weights[0]`**: The bias weight is updated.
     - The amount of update is determined by the learning rate and the error.
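
To watch the update rule at work, the sketch below runs the same training loop as a standalone function and counts how many examples are misclassified in each epoch. The function name `train_with_history` is hypothetical, and the learning rate of `1.0` is an assumption made only so that every weight stays a whole number and is easy to follow by hand.

```python
import numpy as np


def activation_function(x):
    return 1 if x >= 0 else 0


def train_with_history(training_inputs, labels, learning_rate=1.0, epochs=10):
    """Perceptron training loop that also records misclassifications per epoch."""
    weights = np.zeros(training_inputs.shape[1] + 1)  # index 0 holds the bias
    errors_per_epoch = []
    for _ in range(epochs):
        errors = 0
        for inputs, label in zip(training_inputs, labels):
            summation = np.dot(inputs, weights[1:]) + weights[0]
            prediction = activation_function(summation)
            error = label - prediction            # -1, 0, or +1
            weights[1:] += learning_rate * error * inputs
            weights[0] += learning_rate * error
            errors += int(error != 0)
        errors_per_epoch.append(errors)
    return weights, errors_per_epoch


training_inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
labels = np.array([0, 0, 0, 1])  # AND gate

weights, errors_per_epoch = train_with_history(training_inputs, labels)
print(weights)           # learned [bias, w_1, w_2]
print(errors_per_epoch)  # misclassified examples in each epoch
```

Once an epoch finishes with zero errors, every later epoch leaves the weights unchanged, because each update is multiplied by an error of `0`.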

---

### 4. **Training Data**
```python
training_inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
labels = np.array([0, 0, 0, 1])  # AND gate
```
- These are the training inputs and corresponding labels for the **AND gate** problem.
  - Inputs are all possible pairs of 0s and 1s: `[0, 0]`, `[0, 1]`, `[1, 0]`, `[1, 1]`.
  - Labels are the expected outputs: 
    - `0 AND 0` = 0
    - `0 AND 1` = 0
    - `1 AND 0` = 0
    - `1 AND 1` = 1
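
For two inputs the table is easy to type by hand, but it can also be generated. The sketch below is one possible way to do it, using `itertools.product` for the input pairs and `np.logical_and` for the labels.

```python
import itertools

import numpy as np

# All combinations of two binary inputs, in the order 00, 01, 10, 11
training_inputs = np.array(list(itertools.product([0, 1], repeat=2)))

# AND of the two columns, converted from booleans to 0/1
labels = np.logical_and(training_inputs[:, 0], training_inputs[:, 1]).astype(int)

print(training_inputs)
print(labels)
```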

---

### 5. **Creating and Training the Perceptron**
```python
perceptron = Perceptron(input_size=2)
perceptron.train(training_inputs, labels, epochs=10)
```
- A perceptron is created with **2 input features** (`input_size=2`).
- The perceptron is trained on the **AND gate** dataset for **10 epochs**.

---

### 6. **Testing the Perceptron**
```python
print(perceptron.predict(np.array([0, 0])))  # Expected output: 0
print(perceptron.predict(np.array([1, 1])))  # Expected output: 1
print(perceptron.predict(np.array([0, 1])))  # Expected output: 0
print(perceptron.predict(np.array([1, 0])))  # Expected output: 0
```
- After training, the perceptron is tested with inputs to verify that it has learned the AND gate logic:
  - `[0, 0]` should return `0`.
  - `[1, 1]` should return `1`.
  - `[0, 1]` should return `0`.
  - `[1, 0]` should return `0`.
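
Instead of checking each input by eye, the predictions can be compared against the labels in one go. The sketch below is a standalone illustration: `predict_batch` is a hypothetical helper, and the weights are a hand-picked stand-in for trained weights rather than values produced by `train`.

```python
import numpy as np


def predict_batch(inputs, weights):
    """Apply the step-activated perceptron to every row of `inputs` at once."""
    summation = inputs @ weights[1:] + weights[0]
    return (summation >= 0).astype(int)


training_inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
labels = np.array([0, 0, 0, 1])  # AND gate

# Stand-in weights (an assumption): bias = -1.5, w_1 = 1.0, w_2 = 1.0
weights = np.array([-1.5, 1.0, 1.0])

predictions = predict_batch(training_inputs, weights)
accuracy = np.mean(predictions == labels)  # fraction of correct predictions

print(predictions, accuracy)
```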

---

### Key Concepts:
- **Perceptron**: A binary classifier that can learn simple decision boundaries.
- **Weights and Bias**: Parameters that determine how input values are transformed into an output.
- **Learning**: Adjusting weights through training to minimize errors.
- **Activation Function**: Decides the output based on the weighted sum of inputs.
- **Epochs**: Number of complete passes over the training dataset.





Here is the complete perceptron code in Python, run first for 5 epochs and then for 10:

```python
import numpy as np


# Activation function (step function for simplicity)
def activation_function(x):
    return 1 if x >= 0 else 0


# Perceptron class
class Perceptron:
    def __init__(self, input_size, learning_rate=0.1):
        self.weights = np.zeros(input_size + 1)  # +1 for the bias term
        self.learning_rate = learning_rate

    def predict(self, inputs):
        # Weighted sum of the inputs plus the bias (stored at index 0)
        summation = np.dot(inputs, self.weights[1:]) + self.weights[0]
        return activation_function(summation)

    def train(self, training_inputs, labels, epochs=10):
        for _ in range(epochs):  # Loop over the dataset multiple times
            for inputs, label in zip(training_inputs, labels):
                prediction = self.predict(inputs)
                # The error (label - prediction) is 0 when the prediction is correct
                self.weights[1:] += self.learning_rate * (label - prediction) * inputs
                self.weights[0] += self.learning_rate * (label - prediction)


# Sample training data
training_inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
labels = np.array([0, 0, 0, 1])  # AND gate

# Create perceptron and train for 5 epochs
perceptron = Perceptron(input_size=2)
perceptron.train(training_inputs, labels, epochs=5)

# Test the perceptron
# print(perceptron.predict(np.array([0, 0])))  # Expected output: 0
# print(perceptron.predict(np.array([0, 1])))  # Expected output: 0
```

```python
print(perceptron.predict(np.array([0, 0])))  # Expected output: 0
print(perceptron.predict(np.array([1, 1])))  # Expected output: 1
```

```
0
1
```


```python
import numpy as np


# Activation function (step function for simplicity)
def activation_function(x):
    return 1 if x >= 0 else 0


# Perceptron class
class Perceptron:
    def __init__(self, input_size, learning_rate=0.1):
        self.weights = np.zeros(input_size + 1)  # +1 for the bias term
        self.learning_rate = learning_rate

    def predict(self, inputs):
        summation = np.dot(inputs, self.weights[1:]) + self.weights[0]
        return activation_function(summation)

    def train(self, training_inputs, labels, epochs=10):
        for _ in range(epochs):  # Loop over the dataset multiple times
            for inputs, label in zip(training_inputs, labels):
                prediction = self.predict(inputs)
                self.weights[1:] += self.learning_rate * (label - prediction) * inputs
                self.weights[0] += self.learning_rate * (label - prediction)


# Sample training data
training_inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
labels = np.array([0, 0, 0, 1])  # AND gate

# Create perceptron and train
perceptron = Perceptron(input_size=2)
perceptron.train(training_inputs, labels, epochs=10)

# Test the perceptron
print(perceptron.predict(np.array([0, 0])))  # Expected output: 0
print(perceptron.predict(np.array([1, 1])))  # Expected output: 1
print(perceptron.predict(np.array([0, 1])))  # Expected output: 0
print(perceptron.predict(np.array([1, 0])))  # Expected output: 0
```

```
0
1
0
0
```
