## Deep Learning

---

### 1. Theoretical Intuition
Deep Learning is a subset of Machine Learning built on artificial neural networks, which are loosely inspired by the structure and function of the human brain.

- **Machine Learning vs. Deep Learning**:  
  - ML: Uses algorithms to parse data, learn from it, and make decisions.  
  - DL: Uses multiple layers of neural networks to automatically learn representations and features from raw data.  

- **Key Idea**:  
  Instead of hand-crafting features, deep learning models **learn the features** from data (e.g., images, text, audio).  

- **Example**:  
  In image recognition:  
  * Shallow ML: Manually extract edges, textures → feed to classifier  
  * Deep Learning: Raw pixels → network learns edges → shapes → objects  

---

### 2. Use Cases of Deep Learning
- Computer Vision (face recognition, self-driving cars)  
- Natural Language Processing (chatbots, translation, sentiment analysis)  
- Speech Recognition (voice assistants)  
- Healthcare (disease detection from scans, drug discovery)  
- Finance (fraud detection, algorithmic trading)  
- Recommendation Systems (Netflix, YouTube, Amazon)  

---

### 3. Mathematical Intuition (High-Level)
Deep Learning ≈ Function Approximation  

- Given input **X**, output **Y**  
- We want to learn a function **f** such that:  

\[
f(X; \theta) \approx Y
\]

where **θ** are the parameters (weights & biases).  

A single layer of a neural network is basically:  

\[
\text{Input} \; \rightarrow \; (W \cdot X + b) \; \rightarrow \; \text{Activation} \; \rightarrow \; \text{Output}
\]

Stack multiple layers → "Deep" network.  
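
The pipeline above can be sketched in a few lines of NumPy. This is a minimal illustration, not a full framework: the `forward` helper, the ReLU activation, the 2 → 3 → 1 layer sizes, and all weight values are assumptions chosen to keep the arithmetic easy to follow.

```python
import numpy as np

def relu(z):
    # Element-wise ReLU activation: max(0, z)
    return np.maximum(0.0, z)

def forward(x, layers):
    """Apply each (W, b) pair in turn: linear step, then activation.

    The last layer is left linear, a common choice for regression outputs.
    """
    a = x
    for i, (W, b) in enumerate(layers):
        z = W @ a + b                                  # linear step: W · a + b
        a = z if i == len(layers) - 1 else relu(z)     # activation on hidden layers
    return a

# Hypothetical hand-picked parameters for a 2 -> 3 -> 1 network
W1 = np.array([[1.0, -1.0], [0.5, 0.5], [-1.0, 2.0]])
b1 = np.array([0.0, -2.0, 0.5])
W2 = np.array([[1.0, 2.0, -1.0]])
b2 = np.array([0.5])

x = np.array([2.0, 1.0])
y_hat = forward(x, [(W1, b1), (W2, b2)])
print(y_hat)
```

Here the hidden pre-activations are `[1.0, -0.5, 0.5]`; ReLU clips the negative entry to zero before the second layer is applied. Adding more `(W, b)` pairs to the list is what makes the network "deeper".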

---

### 4. Interview Q&A

| Level        | Question | Answer |
|--------------|----------|--------|
| Beginner     | What is Deep Learning? | A subset of ML that uses multi-layered neural networks to learn data representations. |
| Beginner     | How is Deep Learning different from Machine Learning? | ML often needs manual feature engineering; DL automatically learns features with neural nets. |
| Beginner     | What is a Neural Network? | A network of interconnected nodes (neurons) that transform inputs to outputs using weights, biases, and activations. |
| Intermediate | Why do we need activation functions? | They introduce non-linearity, allowing the network to learn complex mappings. |
| Intermediate | What are some challenges of Deep Learning? | High computation cost, need for large labeled datasets, interpretability, risk of overfitting. |
| Intermediate | Why does deep learning need large datasets? | Deep networks often have millions of parameters; with too little data they can memorize the training set instead of generalizing. |
| Intermediate | Explain overfitting in deep learning. | When the model memorizes training data but fails to generalize to new data. |
| Advanced     | How is backpropagation used in training? | It computes the gradients of the loss w.r.t. the weights using the chain rule; an optimizer (like SGD) then uses those gradients to update the weights. |
| Advanced     | What is the vanishing gradient problem? | Gradients shrink as they are backpropagated toward the earlier layers, so those layers learn very slowly and deep networks become hard to train. |
| Advanced     | Difference between shallow and deep networks? | Shallow: 1–2 hidden layers; Deep: many hidden layers that learn hierarchical features. |
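
The answer on activation functions can be checked numerically. The sketch below uses small hand-picked weights (assumed values, chosen so the arithmetic is exact) to show that two stacked linear layers with no activation collapse into a single linear layer, while inserting a ReLU between them produces a result that no such collapse reproduces.

```python
import numpy as np

# Hypothetical parameters for two stacked layers (2 -> 2 -> 1)
W1 = np.array([[1.0, -1.0], [-1.0, 1.0]])
b1 = np.zeros(2)
W2 = np.array([[1.0, 1.0]])
b2 = np.zeros(1)
x = np.array([2.0, 1.0])

# Two linear layers with no activation in between
stacked_linear = W2 @ (W1 @ x + b1) + b2

# The same mapping written as one linear layer
W_single = W2 @ W1
b_single = W2 @ b1 + b2
single = W_single @ x + b_single

# The same two layers with a ReLU between them
with_relu = W2 @ np.maximum(0.0, W1 @ x + b1) + b2

print("stacked linear:", stacked_linear)
print("single layer:  ", single)
print("with ReLU:     ", with_relu)
```

In this example the purely linear stack and its single-layer equivalent both give 0, while the ReLU version gives 1, so depth only adds expressive power when a non-linearity sits between the layers.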

---

### 5. Quick Demo: Simple Neural Network on Dummy Data


```python
import torch
import torch.nn as nn
import torch.optim as optim

# Dummy dataset (y = 2x + 1)
X = torch.linspace(-1, 1, 100).reshape(-1, 1)
y = 2*X + 1

# Define a simple neural network
class SimpleNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(1, 10)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(10, 1)

    def forward(self, x):
        out = self.fc1(x)
        out = self.relu(out)
        out = self.fc2(out)
        return out

# Initialize
model = SimpleNN()
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)

# Train
for epoch in range(200):
    optimizer.zero_grad()
    outputs = model(X)
    loss = criterion(outputs, y)
    loss.backward()
    optimizer.step()

print("Final Loss:", loss.item())
```

Output:

```
Final Loss: 0.0009748904267325997
```
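
To make the roles of `loss.backward()` and `optimizer.step()` more concrete, here is a hand-written NumPy version of the same loop for a deliberately simpler model: a single linear unit `y_hat = w*x + b` rather than the two-layer network above. The gradients are derived by hand from the MSE loss; the variable names and the zero initialization are assumptions made for this sketch.

```python
import numpy as np

# Same dummy data as the demo: y = 2x + 1
X = np.linspace(-1, 1, 100)
y = 2 * X + 1

w, b = 0.0, 0.0   # parameters of a single linear unit
lr = 0.1          # same learning rate as the demo

for epoch in range(200):
    y_hat = w * X + b                  # forward pass
    error = y_hat - y
    loss = np.mean(error ** 2)         # MSE loss

    # "Backward pass": gradients of the MSE loss via the chain rule
    grad_w = 2 * np.mean(error * X)    # dLoss/dw
    grad_b = 2 * np.mean(error)        # dLoss/db

    # "Optimizer step": plain gradient descent update
    w -= lr * grad_w
    b -= lr * grad_b

print(f"w = {w:.3f}, b = {b:.3f}, loss = {loss:.6f}")
```

The learned `w` and `b` should land very close to the true values 2 and 1. In PyTorch, autograd computes the equivalent of `grad_w` and `grad_b` for every parameter automatically.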


## Perceptron

---

### 1. Theoretical Intuition
- A **Perceptron** is the **simplest type of artificial neural network** (ANN).  
- It is a **binary classifier** that maps input features to an output using a **linear combination of weights** and an **activation function**.  
- Introduced by **Frank Rosenblatt (1958)**.  
- Think of it as a simplified mathematical model of a **single biological neuron**.

---

### 2. Key Pointers
- It works for **linearly separable data**.  
- Inputs (**x1, x2, ... xn**) are multiplied by weights (**w1, w2, ... wn**) and summed up with a **bias** term.  
- The weighted sum is passed through a **step activation function**, which outputs 0 or 1.  
- **Learning**: weights are updated based on **prediction error** using the **Perceptron learning rule**.  
- Limitation: A single perceptron cannot solve problems that are **not linearly separable**, such as XOR.  
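
The separability point can be illustrated by training the same perceptron on AND (linearly separable) and XOR (not linearly separable). This is a rough sketch: the `train_perceptron` helper is hypothetical, and the learning rate of 1.0 and the 50-epoch cap are assumptions picked so that all arithmetic stays exact.

```python
import numpy as np

def train_perceptron(X, y, lr=1.0, epochs=50):
    """Train with the perceptron rule; return (weights, bias, mistakes in last epoch)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    errors = 0
    for _ in range(epochs):
        errors = 0
        for xi, yi in zip(X, y):
            y_pred = 1 if xi @ w + b >= 0 else 0   # step activation
            delta = lr * (yi - y_pred)             # zero when the prediction is right
            w += delta * xi
            b += delta
            errors += int(delta != 0)
        if errors == 0:        # a full pass with no mistakes: converged
            break
    return w, b, errors

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y_and = np.array([0, 0, 0, 1])
y_xor = np.array([0, 1, 1, 0])

_, _, and_errors = train_perceptron(X, y_and)
_, _, xor_errors = train_perceptron(X, y_xor)
print("AND mistakes in final epoch:", and_errors)
print("XOR mistakes in final epoch:", xor_errors)
```

AND reaches an epoch with zero mistakes, while XOR never does, because no single straight line separates its two classes.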

---

### 3. Use Cases
- Early **binary classification tasks**  
- Simple **pattern recognition**  
- Foundation for **more complex neural networks**  

---

### 4. Mathematical Intuition
- Weighted sum:  

\[
z = w_1 x_1 + w_2 x_2 + \dots + w_n x_n + b
\]

- Activation function (step):  

\[
y =
\begin{cases} 
1 & \text{if } z \ge 0 \\
0 & \text{if } z < 0
\end{cases}
\]

- Weight update (Perceptron learning rule):  

\[
w_i = w_i + \Delta w_i
\]  

\[
\Delta w_i = \eta (y_{\text{true}} - y_{\text{pred}}) x_i
\]  

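where **η** = learning rate, **y_true** = actual label, **y_pred** = predicted label. The bias is updated the same way, treating it as a weight on a constant input of 1: b = b + η (y_true − y_pred).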
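
A single update step can be worked through numerically. The starting weights, input, and learning rate below are assumed values chosen for easy arithmetic; the bias is handled as a weight on a constant input of 1.

```python
import numpy as np

eta = 0.5                       # learning rate (assumed value)
w = np.array([1.0, -2.0])       # current weights (assumed values)
b = 0.0                         # current bias
x = np.array([1.0, 1.0])        # one training example
y_true = 1                      # its label

z = w @ x + b                   # weighted sum: 1.0 - 2.0 + 0.0 = -1.0
y_pred = 1 if z >= 0 else 0     # step activation gives 0, a mistake

delta_w = eta * (y_true - y_pred) * x   # [0.5, 0.5]
delta_b = eta * (y_true - y_pred)       # 0.5
w_new = w + delta_w             # [1.5, -1.5]
b_new = b + delta_b             # 0.5

z_new = w_new @ x + b_new       # 0.5, so this example is now classified as 1
print(w_new, b_new, z_new)
```

When the prediction is already correct, `y_true - y_pred` is zero and the weights stay unchanged, so the rule only moves the decision boundary on mistakes.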

---

### 5. Interview Q&A

| Question | Answer |
|----------|--------|
| What is a Perceptron? | The simplest type of neural network; a binary classifier that uses weighted sum and activation. |
| Who introduced the Perceptron? | Frank Rosenblatt in 1958. |
| How does a Perceptron make predictions? | Computes weighted sum of inputs + bias, passes through step function. |
| What is the main limitation of a Perceptron? | A single Perceptron cannot solve problems that are not linearly separable, such as XOR. |
| What is the Perceptron learning rule? | Updates weights based on prediction error: w_i = w_i + η*(y_true - y_pred)*x_i |
| What type of data can a Perceptron classify? | Linearly separable data. |
| Why do we need bias in a Perceptron? | Allows shifting the decision boundary away from the origin. |
| Can a single Perceptron be used for multi-class classification? | No, only binary; multi-class requires multiple Perceptrons or other architectures. |
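
The answer about the bias can be made concrete with a small check. Without a bias, the input `[0, 0]` always produces `z = 0`, so the step function outputs 1 regardless of the weights, and the AND gate (which needs 0 there) cannot be represented. The weight and bias values below are assumptions picked for illustration.

```python
import numpy as np

def step(z):
    return 1 if z >= 0 else 0

inputs = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])

# Without a bias: the origin gets z = 0 and is classified as 1 for any weights
origin = inputs[0]
no_bias_outputs = [
    step(w @ origin)
    for w in (np.array([1.0, 1.0]), np.array([-3.0, 2.0]), np.array([0.5, -0.5]))
]
print("Origin without bias:", no_bias_outputs)

# With a bias: the boundary x1 + x2 = 1.5 no longer passes through the origin
w, b = np.array([1.0, 1.0]), -1.5
and_outputs = [step(w @ xi + b) for xi in inputs]
print("AND with bias:", and_outputs)
```

The negative bias shifts the decision boundary away from the origin, which is exactly what AND requires.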

---

### 6. Code Demo: Simple Perceptron in Python

```python
import numpy as np

# Step activation function
def step(x):
    return 1 if x >= 0 else 0

# Perceptron class
class Perceptron:
    def __init__(self, input_size, lr=0.1, epochs=10):
        self.weights = np.zeros(input_size)
        self.bias = 0
        self.lr = lr
        self.epochs = epochs
        
    def predict(self, x):
        z = np.dot(x, self.weights) + self.bias
        return step(z)
    
    def fit(self, X, y):
        for _ in range(self.epochs):
            for xi, yi in zip(X, y):
                y_pred = self.predict(xi)
                update = self.lr * (yi - y_pred)
                self.weights += update * xi
                self.bias += update

# Training data: AND gate
X = np.array([[0,0],[0,1],[1,0],[1,1]])
y = np.array([0,0,0,1])

# Train Perceptron
p = Perceptron(input_size=2)
p.fit(X, y)

# Predictions
for xi in X:
    print(f"Input: {xi}, Predicted: {p.predict(xi)}")
