# 1. What is a Tensor?
- A tensor is just a multi-dimensional array — like numbers in a grid or table.

| Tensor Type | Shape/Dimensions   | Example                                               |
| ----------- | ------------------ | ----------------------------------------------------- |
| Scalar      | 0D (just a number) | `5`                                                   |
| Vector      | 1D                 | `[1, 2, 3]`                                           |
| Matrix      | 2D                 | `[[1, 2], [3, 4]]`                                    |
| 3D Tensor   | 3D                 | A stack of matrices (e.g., image with color channels) |
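
As a quick check of the table above, NumPy reports the number of dimensions of an array as `ndim` and the length of each dimension as `shape`. This is a minimal sketch; the dictionary and its keys are illustrative names, not part of the original notebook.

```python
import numpy as np

# One example per row of the table above (names are illustrative).
examples = {
    "scalar": np.array(5),
    "vector": np.array([1, 2, 3]),
    "matrix": np.array([[1, 2], [3, 4]]),
    "tensor3d": np.zeros((2, 3, 4)),  # 2 matrices, each 3x4
}

for name, arr in examples.items():
    # ndim = number of axes, shape = length of each axis
    print(f"{name:9s} ndim={arr.ndim} shape={arr.shape}")
```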


In [10]:
import numpy as np

In [20]:
scalar = np.array(5)              # 0D
vector = np.array([1,2,3])        # 1D
matrix = np.array([[1,2],[3,4]])  # 2D
tensor3D = np.random.rand(3,3,3)  # 3D Tensor

print("Scalar: ", scalar)

print("=" * 60)
print("Vector: ", vector)

print("=" * 60)
print("Matrix: ")
print(matrix)

print("=" * 60)
print("Tensor 3D:")
print(tensor3D)

Scalar:  5
============================================================
Vector:  [1 2 3]
============================================================
Matrix: 
[[1 2]
 [3 4]]
============================================================
Tensor 3D:
[[[0.59801424 0.09509024 0.47388334]
  [0.8261656  0.94235033 0.196919  ]
  [0.63734199 0.84406376 0.85908542]]

 [[0.23154865 0.55464748 0.84728301]
  [0.58369939 0.47190367 0.47148114]
  [0.88731673 0.16842578 0.58542442]]

 [[0.5912087  0.70970737 0.00921565]
  [0.99819911 0.47587071 0.37214581]
  [0.68575665 0.80227696 0.83835198]]]


The arguments passed to `np.random.rand` set the shape, and therefore the number of dimensions, of the tensor it returns. Here it is **step-by-step**.

---

## ✅ 1. `np.random.rand(3)` → 1D Tensor (Vector)

* This gives a **1D vector** with 3 random values.
* Shape: `(3,)`

📌 Example:

```python
array([0.17, 0.83, 0.29])
```

🧠 Think of it as:

> A single row with 3 numbers.

---

## ✅ 2. `np.random.rand(3, 3)` → 2D Tensor (Matrix)

* This gives a **2D matrix** with 3 rows and 3 columns.
* Shape: `(3, 3)`

📌 Example:

```python
array([[0.25, 0.67, 0.88],
       [0.19, 0.91, 0.43],
       [0.64, 0.33, 0.73]])
```

🧠 Think of it as:

> A **square grid** — like an image in grayscale.

---

## ✅ 3. `np.random.rand(3, 3, 3)` → 3D Tensor

* This gives a **3D tensor**:

  * 3 matrices
  * Each matrix is 3×3

* Shape: `(3, 3, 3)`

📌 Example:

```python
array([
 [[0.1, 0.2, 0.3],
  [0.4, 0.5, 0.6],
  [0.7, 0.8, 0.9]],

 [[0.9, 0.8, 0.7],
  [0.6, 0.5, 0.4],
  [0.3, 0.2, 0.1]],

 [[0.11, 0.22, 0.33],
  [0.44, 0.55, 0.66],
  [0.77, 0.88, 0.99]]
])
```

🧠 Think of it as:

> A **stack of 3 matrices** — like a colored image (RGB has 3 color channels).
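
To make the "stack of matrices" picture concrete, here is a small sketch of indexing into a `(3, 3, 3)` array. It uses `np.arange` rather than random values so the layout is predictable; the variable names are illustrative.

```python
import numpy as np

# Use arange instead of random values so the layout is easy to follow.
tensor = np.arange(27).reshape(3, 3, 3)

first_matrix = tensor[0]        # first 3x3 matrix in the stack
first_row = tensor[0, 0]        # first row of that matrix
single_value = tensor[0, 0, 0]  # one number

print(first_matrix.shape)  # (3, 3)
print(first_row.shape)     # (3,)
print(single_value.ndim)   # 0
```

Each index removes one dimension: one index yields a matrix, two yield a vector, three yield a scalar.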

---

## 🧠 Bonus Analogy:

| Shape       | Real-World Analogy                       |
| ----------- | ---------------------------------------- |
| `(3,)`      | A list of 3 test scores                  |
| `(3, 3)`    | A chess board (3×3 version)              |
| `(3, 3, 3)` | 3 pages of 3×3 grids (like a cube stack) |

---



# 2. Tensor Operations
Tensor operations are math actions on tensors: addition, multiplication, dot product, reshaping, etc.

In [12]:
a = np.array([1,2])
b = np.array([3,4])

In [13]:
# Element-wise addition
print(a + b) 

[4 6]


In [14]:
# Element-wise multiplication (1×3, 2×4)
print(a * b)
# Dot product (1×3 + 2×4 = 11)
print(a @ b)

[3 8]
11


In [15]:
# Reshape (change structure)
matrix = np.array([[1,2],[3,4]])
print(matrix.reshape((4,)))

[1 2 3 4]
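
For 2D arrays the same operators apply: `*` multiplies element by element, while `@` (equivalently `np.matmul`) performs matrix multiplication. A small sketch with illustrative values `A` and `B`:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

elementwise = A * B  # multiplies matching positions
matmul = A @ B       # rows of A dotted with columns of B

print(elementwise)
print(matmul)
```

The two results differ, so it is worth checking which one a formula calls for before choosing an operator.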


Understanding `reshape()` is **important** in deep learning and NumPy because it helps you manage and prepare your **data shape**, which is crucial for models.

---

## ✅ What is `reshape()`?

The `reshape()` function in NumPy **changes the shape of your array** (tensor) **without changing its data**.

📌 **Important:** The **number of elements must stay the same** — only the shape changes.
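
A short sketch of that rule, using an illustrative 6-element array: a matching shape works, `-1` asks NumPy to infer one dimension, and a mismatched shape raises `ValueError`.

```python
import numpy as np

arr = np.arange(6)  # 6 elements

print(arr.reshape(2, 3).shape)   # (2, 3): 2 * 3 = 6, allowed
print(arr.reshape(3, -1).shape)  # (3, 2): -1 lets NumPy infer the missing size

try:
    arr.reshape(4, 2)            # 4 * 2 = 8, which does not equal 6
except ValueError as err:
    print("reshape failed:", err)
```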

---

### 📘 Your Example:

```python
import numpy as np

matrix = np.array([[1, 2], [3, 4]])  # Shape: (2, 2)
print(matrix.reshape((4,)))          # Shape: (4,)
```

### 💡 What's happening?

You're taking a 2×2 matrix:

```
[[1, 2],
 [3, 4]]
```

...and turning it into a **1D vector** with 4 elements:

```
[1, 2, 3, 4]
```

- Before: shape = `(2, 2)`
- After: shape = `(4,)`

---

## 🧠 Why is `reshape()` useful in Deep Learning?

1. ### ✅ Preprocessing data
   - Neural networks often expect inputs in **specific shapes**.
   - Example: Flatten a 2D image (28×28) into a 1D vector (784,) before feeding into a dense layer.

2. ### ✅ Batching
   - Reshape a single image into a batch: `(28, 28, 1)` → `(1, 28, 28, 1)`

3. ### ✅ Feeding data into layers
   - Some layers (like LSTM) expect inputs in 3D: `(batch_size, timesteps, features)`
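
The three cases above could be sketched in NumPy as follows. The batch size of 32 and all variable names are assumptions made for illustration, and treating each image row as a timestep is just one possible way to build a sequence input.

```python
import numpy as np

# Hypothetical batch of 32 grayscale images, 28x28 pixels each.
batch = np.zeros((32, 28, 28))

# 1. Flatten each image for a dense layer: (32, 28, 28) -> (32, 784)
flat_batch = batch.reshape(batch.shape[0], -1)

# 2. Add a batch dimension to a single image: (28, 28, 1) -> (1, 28, 28, 1)
single = np.zeros((28, 28, 1))
single_batch = single[np.newaxis, ...]

# 3. Turn flat vectors back into (batch_size, timesteps, features),
#    here treating each of the 28 rows as one timestep of 28 features.
sequence_batch = flat_batch.reshape(32, 28, 28)

print(flat_batch.shape, single_batch.shape, sequence_batch.shape)
```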

---

## 📦 Examples:

### 1. Flatten a matrix:
```python
arr = np.array([[1, 2], [3, 4]])
flat = arr.reshape(4,)
print(flat)  # [1 2 3 4]
```

### 2. Reshape a flat array into 2D:
```python
flat = np.array([1, 2, 3, 4])
matrix = flat.reshape((2, 2))
print(matrix)
# [[1 2]
#  [3 4]]
```

### 3. Reshape a 2D image into a 4D tensor (for CNN input):
```python
image = np.random.rand(28, 28)
image_reshaped = image.reshape((1, 28, 28, 1))  # Batch of 1 grayscale image
```

---

## ✅ Summary:

| Item               | Details                                       |
|--------------------|-----------------------------------------------|
| `reshape()`        | Changes the shape of a tensor/array           |
| ✅ Keeps data      | No data lost or added, only rearranged        |
| ⚠️ Size must match | Old size = new size (same number of elements) |
| 🚀 Used for        | Preparing input/output shapes in DL           |

---


In [21]:
# 1. Flatten a matrix:
arr = np.array([[1,2],[3,4]])

flat = arr.reshape(4,)
print(flat)


[1 2 3 4]


In [22]:
# 2. Reshape a flat array into 2D:

flat = np.array([1,2,3,4])
matrix = flat.reshape((2,2))

print(matrix)

[[1 2]
 [3 4]]


In [25]:
# 3. Reshape a 2D image into a 4D tensor (for CNN input):

image = np.random.rand(28,28)

image_reshaped = image.reshape((1,28,28,1))

# print(image_reshaped)


This is one of the **most common reshape patterns** in deep learning, especially for image data.

Let’s break down:

```python
image.reshape((1, 28, 28, 1))
```

This shape, `(batch_size, height, width, channels)`, is the **default (channels-last) input format for convolutional neural networks (CNNs)** in **Keras** and **TensorFlow**.

---

## 🧠 Shape Meaning: `(1, 28, 28, 1)`

| Dimension | Meaning                           | Example Value   |
| --------- | --------------------------------- | --------------- |
| `1`       | **Batch size** (number of images) | `1` = one image |
| `28`      | **Image height**                  | 28 pixels       |
| `28`      | **Image width**                   | 28 pixels       |
| `1`       | **Channels** (color depth)        | 1 = grayscale   |

---

### 🔍 What it represents:

> A **single grayscale image** of **28×28 pixels**, reshaped so that it can be passed into a CNN.

🧠 This format is required because 2D convolution layers in **TensorFlow/Keras** expect **4D input** for image data:

```
(batch_size, height, width, channels)
```

---

### 🖼️ Real-life example: MNIST digit images

* Each digit (0–9) image in MNIST dataset:

  * Height = 28
  * Width = 28
  * Channels = 1 (grayscale)

But the model expects a **batch**, even if it's just 1 image. That’s why we reshape:

```python
image = image.reshape((1, 28, 28, 1))
```

---

## 🧠 What if it was a color image?

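Color images have 3 channels (Red, Green, Blue), so the array must hold 28 × 28 × 3 = 2352 values. The 784-value grayscale `image` cannot be reshaped to this shape; start from an array of shape `(28, 28, 3)` instead (`color_image` is an illustrative name):

```python
color_image = np.random.rand(28, 28, 3)
color_image = color_image.reshape((1, 28, 28, 3))
```

* Same height and width
* `3` = RGB channels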

---

## 💡 Why is this important?

Neural networks like CNNs **require specific input shapes**:

* If you give the wrong shape, you’ll get shape mismatch errors
* `reshape()` is used to convert raw image or array into the right shape for the model
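
One way to avoid such errors is to normalise the shape in a single place before calling the model. The helper below, `to_cnn_input`, is a hypothetical function written for this sketch (it is not part of NumPy, Keras, or TensorFlow) and assumes a channels-last layout.

```python
import numpy as np

def to_cnn_input(image):
    """Hypothetical helper: return `image` as (1, height, width, channels)."""
    image = np.asarray(image)
    if image.ndim == 2:            # (H, W) grayscale -> add a channel axis
        image = image[..., np.newaxis]
    if image.ndim != 3:
        raise ValueError(f"expected a 2D or 3D image, got shape {image.shape}")
    return image[np.newaxis, ...]  # add the batch axis

print(to_cnn_input(np.zeros((28, 28))).shape)     # (1, 28, 28, 1)
print(to_cnn_input(np.zeros((28, 28, 3))).shape)  # (1, 28, 28, 3)
```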

---

## ✅ Summary:

```python
image.reshape((1, 28, 28, 1))
```

| Value | What it means          |
| ----- | ---------------------- |
| `1`   | One image (batch size) |
| `28`  | Height of image        |
| `28`  | Width of image         |
| `1`   | Grayscale (1 channel)  |

---



# 3. Differentiation

Differentiation measures how much the output of a function changes when its input changes by a small amount. That rate of change is called the derivative.

**Imagine you're driving a car.**

- Your speed is the derivative of distance over time.

- If your speed is 50 km/h, it means your position changes 50 km every 1 hour.

**In deep learning:**

We use derivatives to know how much to change weights in a neural network during training.
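
The idea of "change in output per change in input" can be checked numerically without any deep learning library. The sketch below uses a central difference; `numerical_derivative`, `square`, and the step size `h` are illustrative choices, not something the original notebook defines.

```python
def numerical_derivative(f, x, h=1e-5):
    """Central-difference estimate of df/dx at x (h is an illustrative step size)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def square(x):
    return x ** 2

# The exact derivative of x**2 is 2*x, so at x = 2.0 we expect about 4.0.
estimate = numerical_derivative(square, 2.0)
print(estimate)
```

Automatic differentiation gives the same quantity exactly and without choosing a step size, which is why frameworks use it instead.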

In [16]:
# Example with autograd (PyTorch)

import torch

x = torch.tensor(2.0, requires_grad=True)
y = x**2

y.backward()
print(x.grad)

tensor(4.)


Understanding the first line of this example is **fundamental** to using **PyTorch for deep learning**, especially when working with **automatic differentiation**.

Let’s break this down:

---

## 🧠 Code:

```python
x = torch.tensor(2.0, requires_grad=True)
```

This line creates a **PyTorch tensor** named `x` with the value `2.0`, and tells PyTorch to **track operations on it** for automatic **gradient computation**.

---

## 🔍 Breaking it down:

### ✅ `2.0` — the value of the tensor

* You are creating a tensor that contains the number **2.0**
* The `.0` makes it a **float**, which is necessary for gradient calculations
* In deep learning, we typically use `float32` or `float64` data types

📌 Equivalent in NumPy:

```python
import numpy as np
x = np.array(2.0)  # a simple scalar value
```

---

### ✅ `requires_grad=True` — enables gradient tracking

This tells PyTorch:

> "I want to **track all operations** done on this tensor, so I can compute **gradients** later."

### 📘 Why is this important?

* In **training neural networks**, we use **gradient descent** to update model parameters (weights).
* To do that, we need to calculate the **derivative** of the loss with respect to each parameter.
* `requires_grad=True` lets PyTorch build a **computational graph** and do **automatic differentiation**.

---

### ✅ Example:

```python
import torch

x = torch.tensor(2.0, requires_grad=True)  # we want to track this value
y = x ** 2                                 # y = x² = 4.0
y.backward()                               # calculate dy/dx

print(x.grad)                              # Output: tensor(4.)
```

### 🧠 Why is the gradient 4.0?

Because:

* `y = x²`
* `dy/dx = 2x`
* When `x = 2`, `dy/dx = 2 × 2 = 4`

So, `x.grad` becomes `4.0`

---

## ✅ Summary:

| Part                 | Meaning                                                        |
| -------------------- | -------------------------------------------------------------- |
| `2.0`                | The actual scalar value stored in the tensor                   |
| `requires_grad=True` | Tells PyTorch to **track operations** for gradient computation |

---

## 🎯 In Real Neural Networks

When you create model weights:

```python
weight = torch.randn(1, requires_grad=True)
```

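Trainable weights need `requires_grad=True` so that gradients are computed for them and they can be **updated during training** using gradient descent. Parameters wrapped in `torch.nn.Parameter` get this setting by default.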
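
As a rough sketch of what "updating a weight" looks like, the snippet below performs a single manual gradient descent step. The toy loss `(weight - 3)^2`, the seed, and the learning rate are assumptions chosen for illustration; real training code would normally use an optimizer from `torch.optim` instead.

```python
import torch

torch.manual_seed(0)                        # reproducible starting value
weight = torch.randn(1, requires_grad=True)

loss = ((weight - 3.0) ** 2).sum()          # toy loss, minimised at weight = 3
loss.backward()                             # fills weight.grad with d(loss)/d(weight)

learning_rate = 0.1                         # illustrative value
with torch.no_grad():                       # the update itself should not be tracked
    before = weight.clone()
    weight -= learning_rate * weight.grad
weight.grad.zero_()                         # clear the gradient before the next step

print(before.item(), "->", weight.item())
```

The update runs inside `torch.no_grad()` so that it is not recorded in the computational graph, and the gradient is reset because PyTorch accumulates gradients across `backward()` calls.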

---



# 4. Gradient

The **gradient** is the vector of all the **partial derivatives** of a function, one for each input (for example, one per weight).

### 🧠 Analogy:

Imagine you're hiking down a mountain, and you want to find the **fastest way down**.

* The gradient tells you:
  “The ground rises fastest in this direction,” so stepping the opposite way takes you downhill fastest.
* In deep learning:
  The gradient tells us **how to adjust the weights to reduce the error**.

📌 Gradient is **a vector pointing in the direction of fastest increase**.
But we go in the **opposite direction** to **reduce loss** (that’s called gradient descent).
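
A small sketch of both statements, using a toy function of two inputs chosen for this example, `f(x, y) = x**2 + 3*y**2`. The gradient is written out by hand, and the step size is an illustrative value.

```python
import numpy as np

def f(p):
    """Toy function of two inputs: f(x, y) = x**2 + 3*y**2."""
    x, y = p
    return x ** 2 + 3 * y ** 2

def grad_f(p):
    """Analytic gradient: the vector of partial derivatives (2x, 6y)."""
    x, y = p
    return np.array([2 * x, 6 * y])

point = np.array([1.0, 2.0])
gradient = grad_f(point)            # one partial derivative per input

step = 0.01                         # illustrative step size
uphill = point + step * gradient    # move with the gradient
downhill = point - step * gradient  # move against the gradient

print(f(point), f(uphill), f(downhill))
```

Moving with the gradient increases `f`, and moving against it decreases `f`, which is the property gradient descent relies on.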

# 5. Gradient Descent

### 🔍 What is it?

A method to **find the best model** by gradually updating weights to reduce the loss (error).

### 🧠 Simple Idea:

1. Start with random weights
2. Calculate how bad the prediction is (loss)
3. Use gradients to adjust the weights to make the loss smaller
4. Repeat


In [33]:
# Simple idea (not full code)

weight = 5.0
learning_rate = 0.1

for step in range(100):
  loss = (weight - 3) ** 2
  # print("Loss: ", loss)
  grad = 2 * (weight - 3)
  # print("Grad: ", grad)
  weight -= learning_rate * grad

print(weight)
print(loss)

3.0000000004074074
2.593450360578931e-19


This is a simple, clean example of **Gradient Descent**, the **core idea behind how deep learning learns**. 💡
Here it is step-by-step, in a beginner-friendly way with a real-life analogy.

---

## ✅ Code Recap:

```python
weight = 5.0
learning_rate = 0.1

for step in range(100):
    loss = (weight - 3) ** 2
    grad = 2 * (weight - 3)
    weight -= learning_rate * grad

print(weight)
print(loss)
```

---

## 🎯 What is this code trying to do?

It’s trying to find the value of `weight` that makes the **loss** smallest.

The loss function is:

```python
loss = (weight - 3) ** 2
```

So the **goal is to make weight = 3**, because that’s when the loss becomes zero.

---

## 🧠 Real-Life Analogy:

Imagine you’re standing on a hill and trying to **reach the bottom**.
At every step:

* You check how steep the slope is (the gradient).
* You take a small step **downhill** in the direction that reduces height fastest.
* Eventually, you reach the **lowest point** — that's when the gradient is zero.

In our case:

* The **height of the hill = loss**
* The **position = weight**
* The **slope = gradient**
* The **step size = learning\_rate**

---

## 🔍 Line-by-Line Explanation

### 🔸 `weight = 5.0`

We **start** at position 5.0 — not the correct value.

### 🔸 `learning_rate = 0.1`

How big each step should be. Small steps = slow but safe learning.
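
To see how the step size affects the result, the sketch below reruns the same loop with three learning rates. The function name `run_gradient_descent` and the specific rates are illustrative choices for this comparison.

```python
def run_gradient_descent(learning_rate, steps=100, start=5.0):
    """Minimise (weight - 3)**2 and return the final weight."""
    weight = start
    for _ in range(steps):
        grad = 2 * (weight - 3)
        weight -= learning_rate * grad
    return weight

# Illustrative learning rates: too small, reasonable, too large.
for lr in (0.001, 0.1, 1.1):
    print(lr, run_gradient_descent(lr))
```

With these settings the smallest rate is still far from 3 after 100 steps, `0.1` converges, and `1.1` overshoots further on every step and diverges.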

### 🔄 Loop 100 times:

We slowly walk toward the right value (`3.0`), step by step.

#### Inside the loop:

1. **Loss function**:

   ```python
   loss = (weight - 3) ** 2
   ```

   This measures how far `weight` is from `3`.
   The closer `weight` is to 3, the smaller the loss.

2. **Gradient (derivative)**:

   ```python
   grad = 2 * (weight - 3)
   ```

   This tells how fast the loss changes with respect to weight.

3. **Update rule (gradient descent)**:

   ```python
   weight -= learning_rate * grad
   ```

   Move `weight` in the **opposite direction** of the gradient (downhill).

---

## 🔚 After the Loop:

You print the final value of `weight`. It should be **very close to 3.0** (depending on learning rate and steps).

---

## 🔢 Output Example:

```python
3.0000000004074074
2.593450360578931e-19
```

* The weight gets **closer to 3** every step.
* With more steps it would get even closer; with a much smaller learning rate such as `0.001`, 100 steps would only bring it to about `4.64`.

---

## ✅ Summary Table

| Concept            | Meaning                                    |
| ------------------ | ------------------------------------------ |
| `weight`           | Parameter we're trying to learn            |
| `loss`             | Shows how wrong our guess is               |
| `gradient`         | Direction and steepness of error change    |
| `learning_rate`    | How big a step we take during each update  |
| `gradient descent` | Algorithm to update weights to reduce loss |

---

## 🔥 In Deep Learning:

This is **exactly how neural networks learn weights** — but with:

* Multiple parameters (not just one like `weight`)
* Complex loss functions (e.g., cross-entropy)
* Backpropagation to calculate gradients
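
As a step in that direction, here is a sketch with two parameters instead of one: fitting a line to synthetic data with mean squared error. The data, learning rate, and step count are assumptions for illustration, and the gradients are written by hand where a real network would rely on backpropagation.

```python
import numpy as np

# Synthetic data from y = 2x + 1 (the values are illustrative).
x = np.linspace(-1.0, 1.0, 50)
y = 2.0 * x + 1.0

w, b = 0.0, 0.0  # two parameters instead of one
learning_rate = 0.1

for _ in range(500):
    error = (w * x + b) - y
    loss = np.mean(error ** 2)       # mean squared error
    grad_w = 2 * np.mean(error * x)  # d(loss)/dw
    grad_b = 2 * np.mean(error)      # d(loss)/db
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(round(w, 3), round(b, 3))
```

Both parameters are updated with the same rule as the single `weight` above, each using its own partial derivative.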

---



# 📘 Summary Table

| Concept           | Meaning                               | Simple Analogy                            |
| ----------------- | ------------------------------------- | ----------------------------------------- |
| Tensor            | Multi-dimensional array               | Like a list, table, or image grid         |
| Tensor Operations | Math actions on tensors               | Add, reshape, dot product, etc.           |
| Differentiation   | How output changes with input         | Like speed = change of distance over time |
| Gradient          | Vector of partial derivatives         | Shows which direction to move             |
| Gradient Descent  | Method to reduce loss using gradients | Like walking downhill to lowest point     |
