# Lecture 1: What is a Neural Network? Basics of Deep Learning

## Introduction

Neural networks are the building blocks of modern AI and power applications like self-driving cars, chatbots, and image recognition. Unlike traditional programs, which follow predefined rules, neural networks learn from data by recognizing patterns.

### In this lecture, we will cover:
1. What neural networks are and how they work
2. Different types of neural networks
3. How neural networks learn through training

By the end, you’ll have a clear understanding of how neural networks function and why they are essential in AI.

---

## 1. Understanding Neural Networks

### A. What is a Neural Network?

A neural network is a system of artificial neurons that work together to process information. It is loosely inspired by how neurons in the human brain connect and pass signals to one another.

Imagine a self-driving car trying to recognize a stop sign 🚦:

- The car’s camera captures an image.
- The neural network analyzes the image and detects patterns.
- It determines whether the image contains a stop sign.

### B. Structure of a Neural Network

A neural network is made up of three types of layers (there can be many hidden layers):

| Layer         | Function                                      |
|--------------|----------------------------------------------|
| Input Layer  | Receives raw data (e.g., images, text, numbers). |
| Hidden Layers | Process data and detect patterns.           |
| Output Layer | Makes the final prediction (e.g., stop sign or no stop sign). |

🔹 **Example: Recognizing handwritten digits (0-9) using a neural network.**

- **Input**: The pixels of the image (grayscale values).
- **Hidden Layers**: Extract important features (like curves or lines).
- **Output**: The number the image represents.
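
To make the three layer types concrete, here is a minimal NumPy sketch of the digit example. The sizes (a 28x28 image, 16 hidden neurons) and the random, untrained weights are assumptions chosen for illustration, so the predicted digit is arbitrary. The point is only how the data changes shape from layer to layer.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Assumed sizes: a 28x28 grayscale image, 16 hidden neurons, 10 digit classes
image = rng.random((28, 28))          # stand-in for real pixel values in [0, 1)
x = image.reshape(-1)                 # input layer: flatten to 784 numbers

W1 = rng.normal(0, 0.1, (16, 784))    # hidden layer weights (untrained)
b1 = np.zeros(16)
W2 = rng.normal(0, 0.1, (10, 16))     # output layer weights (untrained)
b2 = np.zeros(10)

hidden = np.maximum(0, W1 @ x + b1)   # hidden layer with ReLU
scores = W2 @ hidden + b2             # one score per digit 0-9
probs = np.exp(scores - scores.max())
probs /= probs.sum()                  # softmax: scores -> probabilities

predicted_digit = int(np.argmax(probs))
print(x.shape, hidden.shape, probs.shape)
print("Predicted digit (untrained, so arbitrary):", predicted_digit)
```

A trained network would have the same structure. Only the values in `W1`, `b1`, `W2`, and `b2` would differ.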

📌 **Think of a neural network like a chef preparing a meal:**

- **Ingredients (input layer)** → Raw data.
- **Cooking process (hidden layers)** → Transforming data into something useful.
- **Final dish (output layer)** → The result (e.g., prediction).

---

## 2. Types of Neural Networks

Neural networks come in different types, depending on what they are used for.

### A. Multi-Layer Perceptron (MLP) – The Basic Neural Network

- The most basic type of neural network: layers of neurons where each layer is fully connected to the next.
- Used for tasks like spam detection (classifying an email as spam or not spam).

### B. Convolutional Neural Networks (CNNs) – For Images

- Specially designed for image recognition.
- Used in self-driving cars, facial recognition, and medical imaging.
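
The core operation inside a CNN is sliding a small filter over the image. The sketch below is an illustration with a hand-picked filter and a made-up 5x5 image, not a full CNN. In a real network the filter values are learned during training. The helper name `conv2d_valid` is hypothetical, and the operation is technically cross-correlation, which is what deep learning libraries usually call convolution.

```python
import numpy as np

# A tiny 5x5 "image": dark on the left (0), bright on the right (1)
image = np.array([
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
], dtype=float)

# Hand-picked 3x3 filter that responds to dark-to-bright vertical edges
kernel = np.array([
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
], dtype=float)

def conv2d_valid(img, k):
    """Slide the filter over the image (no padding, stride 1)."""
    kh, kw = k.shape
    out_h = img.shape[0] - kh + 1
    out_w = img.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the patch by the filter element-wise and sum
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * k)
    return out

feature_map = conv2d_valid(image, kernel)
print(feature_map)
```

The output is large where the filter sits on the edge between dark and bright pixels and zero where the patch is uniform, which is how a filter "detects" a pattern.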

### C. Recurrent Neural Networks (RNNs) – For Sequences

- Process sequential data like speech, music, and stock prices, carrying information from earlier steps forward.
- Historically used in chatbots and machine translation (e.g., earlier versions of Google Translate).
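
What makes a network "recurrent" is a hidden state that is fed back in at every time step. The following is a minimal sketch of that idea, assuming made-up sizes (3 features per step, 4 hidden values), random untrained weights, and a `tanh` activation. The names `W_x`, `W_h`, and `h` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes: each time step has 3 features, the hidden state has 4 values
W_x = rng.normal(0, 0.5, (4, 3))   # input -> hidden weights
W_h = rng.normal(0, 0.5, (4, 4))   # hidden -> hidden weights (the "memory" path)
b = np.zeros(4)

sequence = rng.random((5, 3))      # 5 time steps of made-up data
h = np.zeros(4)                    # hidden state starts at zero

for t, x_t in enumerate(sequence):
    # The new state depends on the current input AND the previous state
    h = np.tanh(W_x @ x_t + W_h @ h + b)
    print(f"step {t}: h = {np.round(h, 3)}")
```

Because `h` is reused on each pass through the loop, the final state depends on the whole sequence, not just the last input.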

📌 **Think of different neural networks like different types of athletes:**

- **MLP** → A runner (simple tasks).
- **CNN** → A gymnast (identifies shapes and patterns).
- **RNN** → A musician (remembers past data and sequences).

---

## 3. How Neural Networks Learn (Training Process)

### A. The Learning Process

Neural networks learn by adjusting their internal connections (weights and biases).

🔹 **Example: Teaching a child to recognize apples 🍏**

1. Show the child pictures of apples and non-apples.
2. If they make a mistake, correct them.
3. Repeat the process until they can recognize apples correctly.

Neural networks learn in a similar way by:

1. Receiving input data (e.g., an image of an apple).
2. Making a prediction (e.g., "this is an apple").
3. Checking the error (Was the prediction correct?).
4. Adjusting itself to make better future predictions.
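
The four steps above can be sketched with the smallest possible "network": a single weight `w` that should learn the rule y = 2x. The data, the squared-error measure, and the learning rate are assumptions chosen for illustration.

```python
# Toy data following y = 2 * x (the rule the model should discover)
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w = 0.0              # the single adjustable weight, starting from a bad guess
learning_rate = 0.01

for step in range(200):
    for x, y in zip(xs, ys):
        prediction = w * x              # steps 1-2: receive input, make a prediction
        error = prediction - y          # step 3: check the error
        gradient = 2 * error * x        # slope of the squared error with respect to w
        w -= learning_rate * gradient   # step 4: adjust to reduce the error

print(f"Learned weight: {w:.4f}")
```

A real network repeats exactly this loop, only with many weights updated at once.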

### B. The Role of Activation Functions

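Activation functions determine how strongly a neuron "fires" (activates) based on its input. They also make the network non-linear, which is what allows stacked layers to learn complex patterns.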

| Activation Function | Use Case                                  |
|--------------------|-----------------------------------------|
| Sigmoid           | Output layer for binary classification (yes/no problems). |
| ReLU              | Most common choice for hidden layers in deep networks. |
| Softmax           | Output layer for multi-class classification. |

📌 **Think of activation functions like a decision filter:**

- **Sigmoid**: Works like a dimmer switch, turning any input into a value between 0 (off) and 1 (on).
- **ReLU**: Ignores negative signals (outputs 0) and passes positive ones through unchanged.
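
For reference, the three activation functions from the table can be written in a few lines of NumPy. This is a sketch using their standard textbook definitions, and the input values in `z` are made up.

```python
import numpy as np

def sigmoid(x):
    # Squashes any value into the range (0, 1)
    return 1 / (1 + np.exp(-x))

def relu(x):
    # Replaces negative values with 0, keeps positive values unchanged
    return np.maximum(0, x)

def softmax(x):
    # Turns a vector of scores into probabilities that sum to 1
    shifted = x - np.max(x)      # subtract the max for numerical stability
    exps = np.exp(shifted)
    return exps / np.sum(exps)

z = np.array([-2.0, 0.0, 3.0])
print("sigmoid:", sigmoid(z))
print("relu:   ", relu(z))
print("softmax:", softmax(z))
```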

### C. How Training Works: The Role of Errors

- After each prediction, the network checks how wrong or right it was.
- It uses **backpropagation** to compute how much each weight contributed to the error, then an optimizer (such as gradient descent) adjusts the weights accordingly.
- Over time, the network improves, just like a student who learns from mistakes.

📌 **Think of backpropagation like a coach helping an athlete:**

- The coach points out mistakes (**error feedback**).
- The athlete adjusts their technique (**learning**).
- Over time, performance improves.
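
Here is a minimal sketch of backpropagation for a single sigmoid neuron, assuming a squared-error loss and made-up values for the input, target, weight, and bias. It applies the chain rule by hand, checks the result against a numerical estimate, and then takes one gradient descent step.

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

# One neuron, one training example (values are made up for illustration)
x, target = 1.5, 1.0
w, b = 0.2, -0.1

def loss_for(w_value):
    prediction = sigmoid(w_value * x + b)
    return (prediction - target) ** 2

# Forward pass
z = w * x + b
prediction = sigmoid(z)
loss = (prediction - target) ** 2

# Backward pass: apply the chain rule from the loss back to the weight
dloss_dpred = 2 * (prediction - target)
dpred_dz = prediction * (1 - prediction)   # derivative of sigmoid
dz_dw = x
grad_w = dloss_dpred * dpred_dz * dz_dw

# Sanity check against a numerical estimate of the same slope
eps = 1e-6
numeric_grad_w = (loss_for(w + eps) - loss_for(w - eps)) / (2 * eps)

# Gradient descent step: move the weight against the gradient
learning_rate = 0.5
new_w = w - learning_rate * grad_w

print(f"analytic: {grad_w:.6f}, numeric: {numeric_grad_w:.6f}")
print(f"loss before: {loss:.4f}, after: {loss_for(new_w):.4f}")
```

In the coach analogy, `grad_w` is the feedback and the update to `new_w` is the adjustment in technique. Frameworks such as PyTorch compute these gradients automatically.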

---

## 4. Case Study: AI-Powered Handwriting Recognition

- 🔹 **Problem**: Traditional OCR (Optical Character Recognition) struggles with handwriting variations.
- 🔹 **Solution**: A Convolutional Neural Network (CNN) is trained to recognize handwritten text.

### Workflow:
- ✅ **Data Collection**: Handwritten samples are collected.
- ✅ **Model Training**: The CNN is trained on thousands of handwritten examples.
- ✅ **Deployment**: The trained model is used in mobile banking apps for check deposits.

💡 **Result**: AI-powered OCR improves accuracy in reading handwritten text, reducing manual data entry.


### Simple Artificial Neuron Simulation

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Define input, weights, and bias
inputs = np.array([1, 0])  # Binary input
weights = np.array([0.5, -0.5])  # Weight coefficients
bias = 0.1  # Bias term

# Compute output
output = sigmoid(np.dot(inputs, weights) + bias)
print("Output:", output)
```

```
Output: 0.6456563062257954
```


This code simulates a simple artificial neuron:

- **Imports** NumPy for numerical operations.
- **Defines** the sigmoid function, which squashes values between 0 and 1.
- **Initializes** inputs, weights, and bias:
  - **Inputs:** `[1, 0]` (binary values).
  - **Weights:** `[0.5, -0.5]` (determines feature importance).
  - **Bias:** `0.1` (shifts activation threshold).
- **Computes** the weighted sum using the dot product of inputs and weights, then adds the bias.
- **Applies** the sigmoid function to transform the result into a probability-like value.
- **Prints** the final output, which determines neuron activation.
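
The printed value can be checked by hand. Writing the inputs as $\mathbf{x}$, the weights as $\mathbf{w}$, the bias as $b$, and the sigmoid as $\sigma$ (notation introduced here for convenience), the computation is:

$$
z = \mathbf{x} \cdot \mathbf{w} + b = (1)(0.5) + (0)(-0.5) + 0.1 = 0.6
$$

$$
\sigma(z) = \frac{1}{1 + e^{-0.6}} \approx \frac{1}{1 + 0.5488} \approx 0.6457
$$

The second input is 0, so its weight of $-0.5$ has no effect on this particular output.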

This is a basic building block of neural networks. 🚀


### Simple Neural Network Using PyTorch

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Define an MLP model
class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(2, 4)  # Input layer to hidden layer
        self.relu = nn.ReLU()  # Activation function
        self.fc2 = nn.Linear(4, 1)  # Hidden layer to output layer
        self.sigmoid = nn.Sigmoid()  # Output activation

    def forward(self, x):
        x = self.relu(self.fc1(x))
        x = self.sigmoid(self.fc2(x))
        return x

# Instantiate model
model = MLP()
print(model)
```

```
MLP(
  (fc1): Linear(in_features=2, out_features=4, bias=True)
  (relu): ReLU()
  (fc2): Linear(in_features=4, out_features=1, bias=True)
  (sigmoid): Sigmoid()
)
```


#### Importing Necessary Libraries:
- **`torch`**: The main library for working with deep learning in Python.  
- **`torch.nn`**: Contains tools to build neural networks.  
- **`torch.optim`**: Provides optimizers for training the model (**not used in this cell; it is needed for the XOR training code later**).  

---

#### Defining the Neural Network (MLP Model):
- The `MLP` class is created to define the neural network.  
- It **inherits** from `nn.Module`, which is required for PyTorch models.  
- The `__init__` method sets up **two layers**:  
  - **`fc1`**: Connects **2 input numbers** to **4 neurons** in a hidden layer.  
  - **`fc2`**: Connects **4 hidden neurons** to **1 output neuron**.  
- The model uses:  
  - **ReLU activation function** to allow the model to learn better.  
  - **Sigmoid activation function** to make sure the output is between `0` and `1` (**useful for yes/no decisions**).  

---

#### How the Model Processes Data (`forward` Method):
1. The input is passed through **`fc1`**, then the **ReLU activation** is applied.  
2. It is then passed through **`fc2`**, followed by **sigmoid activation**.  
3. This defines **how the model transforms the input into an output**.  
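
To see what the `forward` method does numerically, the same two steps can be mirrored in plain NumPy. This is a sketch, not PyTorch's implementation: the weights are random stand-ins (PyTorch uses its own initialization scheme), and the names `W1`, `b1`, `W2`, `b2` are illustrative. Only the shapes match the model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random stand-ins for the model's parameters (shapes match fc1 and fc2)
W1, b1 = rng.normal(size=(4, 2)), np.zeros(4)   # fc1: 2 inputs -> 4 hidden
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)   # fc2: 4 hidden -> 1 output

def forward(x):
    hidden = np.maximum(0, W1 @ x + b1)      # step 1: fc1, then ReLU
    z = W2 @ hidden + b2                     # step 2: fc2 ...
    return 1 / (1 + np.exp(-z))              # ... then sigmoid

output = forward(np.array([1.0, 0.0]))
n_params = W1.size + b1.size + W2.size + b2.size

print("Output:", output)            # a single value between 0 and 1
print("Parameters:", n_params)      # 2*4 + 4 + 4*1 + 1 = 17
```

The parameter count shows how small this model is: 17 numbers are all that training has to adjust.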

---

#### Creating and Printing the Model:
```python
model = MLP()  # Creates the neural network
print(model)  # Displays the structure of the model
```


### XOR Neural Network

```python
# Reuses `torch`, `nn`, `optim`, and `model` from the previous cell

# Define loss function and optimizer
criterion = nn.BCELoss()  # Binary Cross-Entropy Loss
optimizer = optim.Adam(model.parameters(), lr=0.01)

# Sample training data
X_train = torch.tensor([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y_train = torch.tensor([[0.0], [1.0], [1.0], [0.0]])  # XOR dataset labels

# Training loop
epochs = 1000
for epoch in range(epochs):
    optimizer.zero_grad()               # Clear gradients from the previous step
    outputs = model(X_train)            # Forward pass
    loss = criterion(outputs, y_train)  # Measure the error
    loss.backward()                     # Backpropagation: compute gradients
    optimizer.step()                    # Update the weights

    if epoch % 200 == 0:
        print(f"Epoch {epoch}, Loss: {loss.item()}")

# Final output after training
print("Final model output:", model(X_train))
```

```
Epoch 0, Loss: 0.6961403489112854
Epoch 200, Loss: 0.22329354286193848
Epoch 400, Loss: 0.04864644259214401
Epoch 600, Loss: 0.019203947857022285
Epoch 800, Loss: 0.010208996012806892
Final model output: tensor([[0.0055],
        [0.9953],
        [0.9897],
        [0.0047]], grad_fn=<SigmoidBackward0>)
```



---
## **1. What the Code Does**
This code **trains a neural network** to learn the **XOR function**, a basic logic operation where:

- `0 XOR 0 = 0`
- `0 XOR 1 = 1`
- `1 XOR 0 = 1`
- `1 XOR 1 = 0`
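
XOR is a classic test case because no single straight line separates the 1s from the 0s, so a single neuron cannot represent it and a hidden layer is needed. The sketch below shows that a tiny ReLU network *can* represent XOR. The weights are hand-picked for illustration (not learned), and it uses 2 hidden neurons rather than the 4 in the model above.

```python
import numpy as np

# Hand-picked (not learned) weights for a 2-input, 2-hidden-neuron network
W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])
b1 = np.array([0.0, -1.0])
W2 = np.array([1.0, -2.0])

def xor_net(x):
    hidden = np.maximum(0, W1 @ x + b1)   # ReLU hidden layer
    return float(W2 @ hidden)             # linear output

inputs = [[0, 0], [0, 1], [1, 0], [1, 1]]
results = [xor_net(np.array(p, dtype=float)) for p in inputs]
for p, r in zip(inputs, results):
    print(p, "->", r)
```

Training has to discover weights with a similar effect on its own, starting from random values.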

### **Step-by-Step Breakdown**  

### **1.1 Define Loss Function & Optimizer**  
- The **Binary Cross-Entropy Loss (BCELoss)** measures how well the model's predictions match actual outputs.  
- **Adam optimizer** updates the model’s weights to minimize the loss and improve predictions.
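
To show what the loss value measures, here is a plain-Python sketch of the binary cross-entropy formula, averaged over the examples. The helper name `binary_cross_entropy` is hypothetical. The "confident" predictions are the rounded final outputs printed above, and the "unsure" ones are an assumed model that always answers 0.5.

```python
import math

def binary_cross_entropy(predictions, targets):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all examples."""
    total = 0.0
    for p, y in zip(predictions, targets):
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(targets)

targets = [0.0, 1.0, 1.0, 0.0]                 # XOR labels

unsure = [0.5, 0.5, 0.5, 0.5]                  # a model that always answers "50/50"
confident = [0.0055, 0.9953, 0.9897, 0.0047]   # rounded final outputs from the run above

loss_unsure = binary_cross_entropy(unsure, targets)
loss_confident = binary_cross_entropy(confident, targets)
print(f"Always 0.5: {loss_unsure:.4f}")        # equals ln(2), about 0.6931
print(f"Trained:    {loss_confident:.4f}")
```

A model that always answers 0.5 scores ln 2 ≈ 0.693, which is why an untrained network typically starts near that value.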

### **1.2 Define the Training Data (XOR Dataset)**  
- The input `X_train` consists of 4 possible **binary combinations** (0s and 1s).  
- The expected output `y_train` follows the **XOR truth table**.

### **1.3 Training the Model (1000 Epochs)**  
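- In each epoch, the model **predicts outputs**, calculates the **loss**, computes gradients with **backpropagation** (`loss.backward()`), and **adjusts its weights** with the Adam optimizer, a variant of **gradient descent** (`optimizer.step()`).  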
- Loss is printed every **200 epochs** to track improvement.  

### **1.4 Model Evaluation**  
- After training, the model makes predictions for `X_train`, and the results are displayed.

---

## **2. Understanding the Output**  
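The model prints loss values and final predictions. Exact numbers vary from run to run because the weights are initialized randomly and no seed is set. Let's analyze the output shown above: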

### **2.1 Loss at Different Epochs**

| Epoch | Loss Value  | Meaning |
|--------|-----------|---------|
| **0** | **0.6961** | High loss, close to ln 2 ≈ 0.693 (the loss of always predicting 0.5) → Model is essentially guessing. |
| **200** | **0.2233** | Loss decreasing → Model is learning the XOR pattern. |
| **400** | **0.0486** | Loss is low → Predictions are getting close to actual values. |
| **600** | **0.0192** | Model has almost mastered XOR. |
| **800** | **0.0102** | Very low loss → Model is making nearly perfect predictions. |

📌 **Key Takeaway:** The model starts with a high loss but gradually improves as it learns from the data.

---

### **2.2 Final Model Predictions**
```python
 tensor([[0.0055],  
        [0.9953],  
        [0.9897],  
        [0.0047]])
```
These values represent the model’s predicted **probabilities** (between 0 and 1) after applying the **Sigmoid activation function**.

| Input (X_train) | Expected Output (y_train) | Model Prediction |
|---------------|----------------|----------------|
| `[0, 0]`     | `0`            | **0.0055** (Very close to 0 ✅) |
| `[0, 1]`     | `1`            | **0.9953** (Very close to 1 ✅) |
| `[1, 0]`     | `1`            | **0.9897** (Very close to 1 ✅) |
| `[1, 1]`     | `0`            | **0.0047** (Very close to 0 ✅) |

✅ **The model has successfully learned the XOR function!**  
- Predictions are **very close** to the expected values.  
- It **correctly classifies XOR outputs** after training.  
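
To turn the predicted probabilities into yes/no answers, a threshold is applied. The sketch below assumes the common cut-off of 0.5 and uses the rounded outputs from the run shown above.

```python
# Final outputs and labels copied from the run shown above
probabilities = [0.0055, 0.9953, 0.9897, 0.0047]
expected = [0, 1, 1, 0]

threshold = 0.5  # assumed cut-off; a common default for binary classification
predicted = [1 if p >= threshold else 0 for p in probabilities]
accuracy = sum(p == e for p, e in zip(predicted, expected)) / len(expected)

print("Predicted classes:", predicted)
print("Accuracy:", accuracy)
```

Note that this accuracy is measured on the same four examples used for training, which is fine for XOR because those four inputs are the entire problem.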
