# 📘 Step 1: Two-Dimensional Tensors

In this section, we’ll explore **2D tensors** — the building block of most machine learning data.

---

## 🔹 1️⃣ What Are Two-Dimensional Tensors?

A **two-dimensional tensor** can be viewed as a **matrix**,  
which holds numerical values of the same type.

- Each **row** represents a *sample* or *observation*  
- Each **column** represents a *feature* or *attribute*

---

### 🏠 Example: Housing Data

Consider a dataset storing information about houses:

| Rooms | Age | Price |
|:------:|:----:|:------:|
| 3 | 10 | 250000 |
| 4 | 15 | 300000 |
| 2 | 20 | 180000 |

This can be represented as a **2D tensor**:

$$
X = 
\begin{bmatrix}
3 & 10 & 250000 \\
4 & 15 & 300000 \\
2 & 20 & 180000
\end{bmatrix}
$$

Each **row** = one house 🏠  
Each **column** = one feature 🔹
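
As a minimal sketch, the table above could be loaded into a PyTorch tensor as shown below. The names `houses`, `features`, and `prices` are illustrative, and the float values are an assumption made so the data is ready for later arithmetic.

```python
import torch

# Rows = houses, columns = (rooms, age, price), copied from the table above
houses = torch.tensor([[3., 10., 250000.],
                       [4., 15., 300000.],
                       [2., 20., 180000.]])

print(houses.shape)       # torch.Size([3, 3]) -> 3 houses, 3 columns

features = houses[:, :2]  # rooms and age for every house
prices = houses[:, 2]     # price column as a 1D tensor

print(features.shape)     # torch.Size([3, 2])
print(prices)             # tensor([250000., 300000., 180000.])
```

Splitting the columns this way mirrors the usual separation between input features and the value to predict.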

---

## 🔹 2️⃣ 2D Tensors in Images

- A **grayscale image** can be represented as a 2D tensor  
  Each element represents **pixel intensity** between **0 (black)** and **255 (white)**.  

$$
\text{Pixel intensity range: } 0 \leq I(x,y) \leq 255
$$

- A **color image (RGB)** is a 3D tensor — one 2D tensor for each channel:  
  - Red 🔴  
  - Green 🟢  
  - Blue 🔵  

Thus, in the channel-first layout that PyTorch conventionally uses,  
$$
\text{Color Image Tensor Shape} = (3, H, W)
$$
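
The shapes above can be checked with a short sketch. The image size `H, W = 4, 6` and the random pixel values are placeholders, not real image data; the channel-first order `(3, H, W)` follows the shape given above, while some other libraries store images as `(H, W, 3)`.

```python
import torch

H, W = 4, 6  # hypothetical image height and width

# Grayscale: one intensity per pixel, stored as unsigned 8-bit integers
gray = torch.randint(0, 256, (H, W), dtype=torch.uint8)

# RGB in channel-first layout: one (H, W) plane per color channel
rgb = torch.randint(0, 256, (3, H, W), dtype=torch.uint8)

red = rgb[0]  # indexing the first axis yields a 2D tensor

print(gray.ndimension(), tuple(gray.shape))  # 2 (4, 6)
print(rgb.ndimension(), tuple(rgb.shape))    # 3 (3, 4, 6)
print(tuple(red.shape))                      # (4, 6)
```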

---

## 🔹 3️⃣ Creating a 2D Tensor

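We can create a 2D tensor by passing a list of lists to `torch.tensor()`:

- Each nested list → a row
- Each element of a nested list → the value in one column

All nested lists must have the same length, otherwise the rows cannot form a matrix.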

---

## 🔹 4️⃣ Tensor Attributes

| Call | Returns | Description |
|-----------|----------|-------------|
| `t.ndimension()` | → Rank | Number of dimensions |
| `t.shape` or `t.size()` | → Shape | Number of rows & columns |
| `t.numel()` | → Count | Total number of elements |

If a tensor has shape $(3,3)$,  
then total elements = $3 \times 3 = 9$.

---

## 🔹 5️⃣ Indexing and Slicing in 2D

### 📌 Indexing Convention

| Syntax | Meaning |
|---------|----------|
| `t[row, column]` | Access a single element |
| `t[row]` | Access an entire row |
| `t[:, column]` | Access an entire column |
| `t[start:end, :]` | Slice rows |
| `t[:, start:end]` | Slice columns |

**Example Visualization:**

For a tensor:

$$
A =
\begin{bmatrix}
11 & 12 & 13 \\
21 & 22 & 23 \\
31 & 32 & 33
\end{bmatrix}
$$

- `A[1, 2]` → element at **2nd row**, **3rd column** = 23  
- `A[0, 1]` → 1st row, 2nd column = 12  
- `A[0, :2]` → 1st row, first 2 columns = `[11, 12]`  
- `A[1:, 2]` → last 2 rows, last column = `[23, 33]`
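
The lookups above can be reproduced directly. The sketch below also illustrates one point that is easy to miss: basic slices are views, so writing through a slice changes the original tensor. The variable names `first_row` and `safe_row` are illustrative.

```python
import torch

A = torch.tensor([[11, 12, 13],
                  [21, 22, 23],
                  [31, 32, 33]])

print(A[1, 2].item())    # 23
print(A[0, :2])          # tensor([11, 12])
print(A[1:, 2])          # tensor([23, 33])

# Basic slices are views: writing through the slice changes A itself
first_row = A[0]
first_row[0] = -1
print(A[0, 0].item())    # -1

# Use clone() when an independent copy is needed
safe_row = A[1].clone()
safe_row[0] = 0
print(A[1, 0].item())    # 21 (unchanged)
```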

---

## 🔹 6️⃣ Basic Tensor Operations

### ➕ Addition (Matrix Addition)

If $X$ and $Y$ are two tensors of the same shape:

$$
Z = X + Y \quad \Rightarrow \quad z_{ij} = x_{ij} + y_{ij}
$$

### ✖️ Scalar Multiplication

$$
Z = \alpha Y \quad \Rightarrow \quad z_{ij} = \alpha \times y_{ij}
$$

### ⨀ Element-wise Multiplication (Hadamard Product)

$$
Z = X \odot Y \quad \Rightarrow \quad z_{ij} = x_{ij} \times y_{ij}
$$

### 🔹 Matrix Multiplication

Matrix product rule:  
If $A$ is $(m \times n)$ and $B$ is $(n \times p)$,  
then the result $C = A \times B$ has shape $(m \times p)$.

Each element:
$$
c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}
$$
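
To make the summation concrete, here is a sketch that implements the formula with explicit loops and compares it with the built-in `@` operator. `matmul_loops` is a hypothetical helper written for illustration only; real code should use `torch.mm` or `@`.

```python
import torch

def matmul_loops(A, B):
    """Compute A @ B with explicit loops (illustrative helper)."""
    m, n = A.shape
    n2, p = B.shape
    assert n == n2, "inner dimensions must match"
    C = torch.zeros(m, p, dtype=A.dtype)
    for i in range(m):
        for j in range(p):
            for k in range(n):
                C[i, j] += A[i, k] * B[k, j]   # c_ij accumulates a_ik * b_kj
    return C

A = torch.tensor([[0, 1, 2],
                  [3, 4, 5]])      # (2, 3)
B = torch.tensor([[0, 1],
                  [2, 3],
                  [4, 5]])         # (3, 2)

C = matmul_loops(A, B)
print(C)                           # values: [[10, 13], [28, 40]]
print(torch.equal(C, A @ B))       # True
```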

---

## ✅ Summary

| Operation | PyTorch Syntax | Description |
|------------|----------------|--------------|
| Addition | `X + Y` | Element-wise sum |
| Scalar multiplication | `2 * X` | Scales each element |
| Element-wise product | `X * Y` | Hadamard product |
| Matrix multiplication | `torch.mm(A, B)` or `A @ B` | Linear algebra product |

---

📌 **Key Takeaway:**  
2D tensors are everywhere — from tabular data to images.  
Understanding how to index, slice, and operate on them forms the foundation for working with real datasets and neural networks.


In [3]:
# ------------------------------------------------------------------------
# Step 1: Two-Dimensional Tensors (complete examples + comments)
# ------------------------------------------------------------------------
import torch

# --------------------------
# 1) Create a 2D tensor (list of lists) - example: housing / tabular data
# --------------------------
a = [[11, 12, 13],
     [21, 22, 23],
     [31, 32, 33]]

X = torch.tensor(a)                 # dtype inferred (int64)
print("Tensor X:\n", X, "\n")       # print full matrix

# --------------------------
# 2) Tensor attributes (rank, shape, size, number of elements)
# --------------------------
print("🔹 Tensor attributes")
print("ndimension():", X.ndimension())   # number of dimensions (rank)
print("shape:", X.shape)                 # (rows, columns)
print("size():", X.size())               # same as shape
print("numel():", X.numel())             # total elements = rows * cols
print()

# --------------------------
# 3) Indexing & slicing in 2D
# --------------------------
print("🔹 Indexing and slicing")
print("X[1, 2] (2nd row, 3rd col):", X[1, 2].item())   # single element -> python scalar
print("X[0, 1] (1st row, 2nd col):", X[0, 1].item())
print("X[0, :2] (first row, first two cols):", X[0, :2])   # slice returns tensor
print("X[1:, 2] (last two rows, last column):", X[1:, 2])
print("X[:, 1] (entire 2nd column):", X[:, 1])
print("X[2] (entire 3rd row):", X[2])
print()

# --------------------------
# 4) Matrix (element-wise) addition
# --------------------------
print("🔹 Matrix addition (element-wise)")
Y = torch.tensor([[1, 1, 1],
                  [2, 2, 2],
                  [3, 3, 3]])
Z_add = X + Y
print("Y:\n", Y)
print("X + Y =\n", Z_add)
print()

# --------------------------
# 5) Scalar multiplication (tensor <-> scalar)
# --------------------------
print("🔹 Scalar multiplication")
Z_scalar = 2 * Y
print("2 * Y =\n", Z_scalar)
print()

# --------------------------
# 6) Element-wise (Hadamard) multiplication
# --------------------------
print("🔹 Element-wise (Hadamard) multiplication")
Z_hadamard = X * Y    # elementwise product
print("X * Y =\n", Z_hadamard)
print()

# --------------------------
# 7) Matrix multiplication (linear algebra)
#    A shape (m x n)  B shape (n x p)  => C shape (m x p)
# --------------------------
print("🔹 Matrix multiplication (A @ B)")
A = torch.tensor([[0, 1, 2],
                  [3, 4, 5]])   # shape (2,3)

B = torch.tensor([[0, 1],
                  [2, 3],
                  [4, 5]])       # shape (3,2)

print("A (2x3):\n", A)
print("B (3x2):\n", B)

C = torch.mm(A, B)   # or A @ B
print("C = A @ B =\n", C)
# Manual verification for element (0,0): dot(A[0,:], B[:,0]) = 0*0 + 1*2 + 2*4 = 10
print("Verify C[0,0] manually:", (A[0].float() * B[:,0].float()).sum().item(), "== C[0,0] ->", C[0,0].item())
print()

# --------------------------
# 8) Transpose, determinant, inverse (when square & invertible)
# --------------------------
print("🔹 Transpose, determinant, inverse")
S = torch.tensor([[1., 2.],
                  [3., 4.]])   # square 2x2 (float for det/inv)
print("S:\n", S)
print("S.T (transpose):\n", S.T)
detS = torch.det(S)
print("det(S):", detS.item())
invS = torch.inverse(S)
print("S inverse:\n", invS)
print("S @ S^{-1} =\n", torch.mm(S, invS))  # should approximate identity
print()

# --------------------------
# 9) Flatten and reshape 2D -> 1D and back
# --------------------------
print("🔹 Flatten and reshape")
X_flat = X.flatten()         # (3,3) -> (9,)
print("X.flatten() ->", X_flat, "shape:", X_flat.shape)

X_view = X_flat.view(3, 3)   # reshape back (requires contiguity here)
print("X_flat.view(3,3) ->\n", X_view)
print()

# --------------------------
# 10) Broadcasting examples (tensor <-> tensor with compatible shapes)
# --------------------------
print("🔹 Broadcasting")
row = torch.tensor([1, 2, 3])        # shape (3,)
col = torch.tensor([[10], [20], [30]])  # shape (3,1)

print("row shape:", row.shape, "col shape:", col.shape)
# row will be broadcast to (3,3) when added to col
broadcast_sum = col + row
print("col + row -> shape:", broadcast_sum.shape)
print(broadcast_sum)
print()

# --------------------------
# 11) Comparisons, boolean masks & masked assignment
# --------------------------
print("🔹 Comparisons and masking")
mask = X > 20
print("mask (X > 20):\n", mask)
print("X[mask] ->", X[mask])   # flatten of selected items

# masked assignment: set values >20 to -1 (operate on a clone to show original preserved)
X_clone = X.clone()
X_clone[X_clone > 20] = -1
print("X_clone after masked assignment (values >20 -> -1):\n", X_clone)
print()

# --------------------------
# 12) Small grayscale image example (2D image tensor)
# --------------------------
print("🔹 Grayscale image example (2D tensor)")
# create a toy 5x5 'image'; every intensity stays within the uint8 range 0..255
img = torch.tensor([
    [0,  30, 60, 90, 120],
    [15, 45, 75, 105,135],
    [30, 60, 90, 120,150],
    [45, 75, 105,135,165],
    [60, 90, 120,150,180]
], dtype=torch.uint8)   # intensities as unsigned 8-bit values
print("img (dtype uint8):\n", img)
print("img shape:", img.shape)
# convert to float normalized [0,1] for processing
img_float = img.float() / 255.0
print("img normalized (float):\n", img_float)
print()

# --------------------------
# 13) Extra: converting dtype and device (common utilities)
# --------------------------
print("🔹 Dtype conversion and device")
X_float = X.float()             # int -> float
print("X_float dtype:", X_float.dtype)
if torch.cuda.is_available():
    X_gpu = X_float.to('cuda')
    print("Moved X to GPU:", X_gpu.device)
    # move back to CPU when needed
    X_back = X_gpu.to('cpu')
    print("Back on CPU:", X_back.device)
else:
    print("No GPU available on this system (device stays on cpu).")
print()

# --------------------------
# End summary
# --------------------------
print("✅ Step 1 code demo complete: creation, indexing, arithmetic, matmul, reshape, broadcasting, masking, and image example.")


Tensor X:
 tensor([[11, 12, 13],
        [21, 22, 23],
        [31, 32, 33]]) 

🔹 Tensor attributes
ndimension(): 2
shape: torch.Size([3, 3])
size(): torch.Size([3, 3])
numel(): 9

🔹 Indexing and slicing
X[1, 2] (2nd row, 3rd col): 23
X[0, 1] (1st row, 2nd col): 12
X[0, :2] (first row, first two cols): tensor([11, 12])
X[1:, 2] (last two rows, last column): tensor([23, 33])
X[:, 1] (entire 2nd column): tensor([12, 22, 32])
X[2] (entire 3rd row): tensor([31, 32, 33])

🔹 Matrix addition (element-wise)
Y:
 tensor([[1, 1, 1],
        [2, 2, 2],
        [3, 3, 3]])
X + Y =
 tensor([[12, 13, 14],
        [23, 24, 25],
        [34, 35, 36]])

🔹 Scalar multiplication
2 * Y =
 tensor([[2, 2, 2],
        [4, 4, 4],
        [6, 6, 6]])

🔹 Element-wise (Hadamard) multiplication
X * Y =
 tensor([[11, 12, 13],
        [42, 44, 46],
        [93, 96, 99]])

🔹 Matrix multiplication (A @ B)
A (2x3):
 tensor([[0, 1, 2],
        [3, 4, 5]])
B (3x2):
 tensor([[0, 1],
        [2, 3],
        [4, 5]])
C = A 

---

# 📘 Step 2: Advanced Concepts — 2D Tensor Operations in Depth

We now extend our understanding of two-dimensional tensors with additional
concepts used constantly in deep-learning workflows.

---

## 🔹 7️⃣ Broadcasting (Automatic Shape Alignment)

When performing element-wise operations, PyTorch automatically **broadcasts**
smaller tensors to match larger shapes.  
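Broadcasting follows NumPy rules. Shapes are compared **from right to left**, and missing leading dimensions are treated as size 1. Each pair of dimensions must satisfy one of the following:

1. The dimensions are equal, or  
2. One of the dimensions is 1 (it is stretched virtually).

If neither condition holds for some pair, the operation fails.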

**Example**

If  
$$
A \in \mathbb{R}^{3 \times 1}, \quad B \in \mathbb{R}^{1 \times 4}
$$  
then  
$$
A + B \Rightarrow \text{shape } (3,4)
$$


Broadcasting is **logical**: the smaller operand is not physically copied, although the result is a new tensor with the full broadcast shape.  
Use `expand()` for views and `repeat()` when actual copies are needed.
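
The shape rule can be tried out with a small sketch. The tensors `A` and `B` are built with `torch.arange` purely for illustration; `torch.broadcast_shapes` reports the result shape without creating any data.

```python
import torch

A = torch.arange(3).reshape(3, 1)      # shape (3, 1): [[0], [1], [2]]
B = torch.arange(4).reshape(1, 4)      # shape (1, 4): [[0, 1, 2, 3]]

C = A + B
print(tuple(C.shape))                  # (3, 4)
print(C[2, 3].item())                  # 2 + 3 = 5

# The resulting shape can be checked without building any tensors
print(torch.broadcast_shapes((3, 1), (1, 4)))   # torch.Size([3, 4])

# Mismatched sizes that are not 1 cannot be broadcast
try:
    torch.ones(2, 3) + torch.ones(2)   # trailing dims: 3 vs 2
except RuntimeError as err:
    print("broadcast failed:", type(err).__name__)
```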

---

## 🔹 8️⃣ Transpose and Contiguity

Transposing swaps axes (rows ↔ columns):

$$A^T_{ij} = A_{ji}.$$

In PyTorch, `A.T` or `A.transpose(0,1)` creates a **non-contiguous** view.  
Some operations such as `.view()` require contiguous memory.

To fix this, make a contiguous copy:

`A = A.T.contiguous()`

Now you can safely reshape or flatten the tensor.
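
One way to see what "non-contiguous" means is to inspect strides, as in the sketch below. The small tensor `A` is an arbitrary example; `stride()`, `is_contiguous()`, and `data_ptr()` are standard tensor methods.

```python
import torch

A = torch.arange(6).reshape(2, 3)
At = A.T

print(A.stride(), A.is_contiguous())      # (3, 1) True
print(At.stride(), At.is_contiguous())    # (1, 3) False

# The transpose is a view: both tensors read the same memory
print(At.data_ptr() == A.data_ptr())      # True

# contiguous() copies the data into row-major order for the new shape
Atc = At.contiguous()
print(Atc.stride(), Atc.is_contiguous())  # (2, 1) True
print(Atc.view(-1))                       # tensor([0, 3, 1, 4, 2, 5])
```

The transpose only swaps the strides, which is why it costs nothing until a contiguous layout is actually required.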

---

## 🔹 9️⃣ Reshape vs View vs Flatten

| Function | Copy? | Description |
|-----------|-------|-------------|
| `reshape()` | May copy | Safest reshape |
| `view()` | No copy (contiguous only) | Fast view on same storage |
| `flatten(start_dim=0)` | May copy | Collapses dims from `start_dim` onward into one |

Use `-1` to let PyTorch **infer** a dimension:

$$
x.reshape(2,-1) \Rightarrow (2, \text{auto})
$$

Example: if tensor `x` has 24 elements,  
`x.reshape(2, -1)` → `(2, 12)` and `x.reshape(-1, 6)` → `(4, 6)`.
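
A short sketch of the inference rule follows, using the same 24-element tensor. It also shows, as an illustration, when `reshape()` can return a view and when it has to copy; the comparison via `data_ptr()` is just one convenient way to observe this.

```python
import torch

x = torch.arange(24)

print(tuple(x.reshape(2, -1).shape))   # (2, 12)
print(tuple(x.reshape(-1, 6).shape))   # (4, 6)

# On a contiguous tensor, reshape returns a view of the same memory
r = x.reshape(4, 6)
print(r.data_ptr() == x.data_ptr())    # True

# On a transposed (non-contiguous) tensor, flattening requires a copy
rt = r.T.reshape(-1)
print(rt.data_ptr() == x.data_ptr())   # False

# The inferred size must divide the element count evenly
try:
    x.reshape(5, -1)                   # 24 is not divisible by 5
except RuntimeError as err:
    print("reshape failed:", type(err).__name__)
```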

---

## 🔹 🔟 Unsqueeze and Squeeze

Used to **add** or **remove** singleton (size = 1) dimensions.

| Function | Example | Result |
|-----------|----------|--------|
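| `unsqueeze(dim)` | `(3,) → (1,3)` with `dim=0` | Adds a size-1 axis at `dim` |
| `squeeze(dim)` | `(1,3,1) → (3,1)` with `dim=0` | Removes the axis at `dim` if its size is 1 |
| `squeeze()` | `(1,3,1) → (3,)` | Removes all size-1 axes |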

**Common use-cases:**
- Add batch dim → `x.unsqueeze(0)`  
- Remove redundant dims → `x.squeeze()`

Mathematically, if  
$$
x = [1, 2, 3]
$$  
then  
$$
x.unsqueeze(0) \Rightarrow [[1, 2, 3]]
$$

---

## 🔹 11️⃣ Expand vs Repeat

Both replicate data but behave differently:

| Function | Copies Data? | Description |
|-----------|--------------|-------------|
| `expand()` | ❌ | Creates a **view**, no memory cost |
| `repeat()` | ✅ | Physically duplicates elements |

🧠 **Rule of Thumb:**  
Use `expand()` for lightweight broadcasting views;  
use `repeat()` when you truly need multiple copies.

**Example:**

If  
$$
t = [1, 2, 3]
$$  
then  

- `t.expand(3,3)` → broadcasted view  
- `t.repeat(2,1)` → actual duplicated data
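
The difference between a view and a copy can be observed directly. The sketch below is illustrative: it uses strides and `data_ptr()` to show that the expanded tensor reuses the memory of `t`, while the repeated tensor does not.

```python
import torch

t = torch.tensor([1, 2, 3])

e = t.expand(3, 3)      # view: the row is reused three times
r = t.repeat(2, 1)      # copy: two stacked copies of t, shape (2, 3)

print(tuple(e.shape), e.stride())        # (3, 3) (0, 1)
print(tuple(r.shape), r.stride())        # (2, 3) (3, 1)
print(e.data_ptr() == t.data_ptr())      # True  (shares memory)
print(r.data_ptr() == t.data_ptr())      # False (new buffer)

# Because e is a view, changes to t show up in every expanded row
t[0] = 100
print(e[:, 0])                           # tensor([100, 100, 100])
print(r[:, 0])                           # tensor([1, 1])
```

The stride of 0 along the expanded axis is what lets every row point at the same three values.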

---

## 🔹 12️⃣ Masking and Conditional Assignment

Boolean masks let you **filter** or **modify** tensor elements based on conditions.

**Example:**

$$
X =
\begin{bmatrix}
5 & 15 & 25 \\
10 & 20 & 30 \\
0 & 8 & 40
\end{bmatrix}
$$

Then  
`mask = X > 15` gives a boolean tensor:

$$
\begin{bmatrix}
False & False & True \\
False & True & True \\
False & False & True
\end{bmatrix}
$$

- `X[mask]` → returns elements greater than 15  
- `X[mask] = -1` → replaces those elements in place  

Masks are powerful for thresholding, clipping, and logical filtering.
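
As a sketch of the thresholding and clipping uses mentioned above, the example below applies the same mask and then shows two out-of-place alternatives, `torch.where` and `clamp`. The replacement value `-1` and the bound `15` are taken from this example and are otherwise arbitrary.

```python
import torch

X = torch.tensor([[5, 15, 25],
                  [10, 20, 30],
                  [0, 8, 40]])

mask = X > 15
print(X[mask])                       # tensor([25, 20, 30, 40])
print(mask.sum().item())             # 4 elements satisfy the condition

# Out-of-place replacement: X itself is left untouched
replaced = torch.where(mask, torch.tensor(-1), X)
print(replaced)                      # values: [[5, 15, -1], [10, -1, -1], [0, 8, -1]]

# Clipping to an upper bound of 15
clipped = X.clamp(max=15)
print(clipped)                       # values: [[5, 15, 15], [10, 15, 15], [0, 8, 15]]
```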

---

## 🔹 13️⃣ Matmul Family and Dot Product

- **Dot product (1D vectors):**

$$
\text{torch.dot}(a,b) = \sum_i a_i b_i
$$

- **Matrix multiplication (2D tensors):**

$$
C = A \times B \Rightarrow c_{ij} = \sum_k a_{ik} b_{kj}
$$

Use `torch.mm(A, B)` or `A @ B`.

- **Generalized matmul (N-D tensors):**

`torch.matmul()` handles batched or broadcasted matrix multiplies.  
For example, `(B, m, n) @ (B, n, p)` → `(B, m, p)`.

---

## 🔹 14️⃣ Dtype and Device Utilities

Convert tensors between data types and devices:

| Purpose | Example |
|----------|----------|
| Change dtype | `X.float()` / `X.long()` |
| Move to GPU | `X.to("cuda")` |
| Move back to CPU | `X.to("cpu")` |

Always convert to CPU before calling `.numpy()` —  
GPU tensors cannot be directly converted to NumPy arrays.
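
A device-agnostic way to follow this rule is sketched below. It assumes NumPy is installed; the pattern `.cpu().numpy()` works whether or not a GPU is present, because `.cpu()` simply returns the tensor itself when it is already on the CPU.

```python
import torch

X = torch.tensor([[1, 2], [3, 4]])

device = "cuda" if torch.cuda.is_available() else "cpu"
X_dev = X.float().to(device)

# .cpu() is a no-op for CPU tensors, so this pattern is safe on any device
arr = X_dev.cpu().numpy()
print(arr.dtype, arr.shape)          # float32 (2, 2)

# On CPU, the array and the tensor share memory
cpu_t = torch.ones(3)
shared = cpu_t.numpy()
cpu_t[0] = 5.0
print(shared[0])                     # 5.0
```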

---

## ✅ Summary — Advanced 2D Tensor Tools

| Category | Function / Example | Purpose |
|-----------|-------------------|----------|
| Broadcasting | `A + B` | Auto-align shapes |
| Transpose | `A.T` | Swap rows ↔ columns |
| Contiguity | `A.contiguous()` | Safe reshape |
| Reshape / View | `A.view(2, -1)` | Change layout |
| Flatten | `torch.flatten(A)` | Collapse dims |
| Unsqueeze / Squeeze | `x.unsqueeze(0)` / `x.squeeze()` | Add / remove dims |
| Expand / Repeat | `A.expand(3,3)` / `A.repeat(2,1)` | View vs copy |
| Masking | `A[A>0]` | Conditional selection |
| Matmul | `torch.mm(A,B)` | Matrix product |
| Device / Dtype | `A.to("cuda")` | Move or cast tensor |

---

📌 **Key Takeaway:**  
These advanced tensor operations make your workflow flexible and efficient.  
They’re essential for handling batches, channels, and shapes correctly when building deep learning pipelines.


In [4]:
# Advanced 2D tensor toolbox: Sections 7..14
# Includes: broadcasting, transpose & contiguity, reshape/view/flatten,
# unsqueeze/squeeze, expand/repeat, masking, matmul/dot, dtype/device utilities.
import torch

# ---------------------------------------------------------------------
# 7) Broadcasting example (col + row -> (3,3))
# ---------------------------------------------------------------------
print("=== Broadcasting ===")
col = torch.tensor([[10], [20], [30]])   # shape (3,1)
row = torch.tensor([1, 2, 3])            # shape (3,)
print("col shape:", tuple(col.shape), "row shape:", tuple(row.shape))

# col + row broadcasts row to shape (3,3)
B = col + row
print("col + row -> shape:", tuple(B.shape))
print(B)   # each row = original col value + row vector
print()

# ---------------------------------------------------------------------
# 7.1) Demonstrate broadcasting rules failure (incompatible shapes)
# ---------------------------------------------------------------------
try:
    # (2,) + (3,): trailing dims are 2 vs 3 and neither is 1 -> raises RuntimeError
    # (note: (2,1) + (3,) would NOT fail; it broadcasts to (2,3))
    bad = torch.tensor([1, 2]) + torch.tensor([1, 2, 3])
    print("unexpected success:", bad)
except Exception as e:
    print("Broadcasting failed as expected for incompatible shapes:", type(e).__name__, e)
print()

# ---------------------------------------------------------------------
# 8) Transpose & Contiguity: why view() can fail, how to fix with contiguous()
# ---------------------------------------------------------------------
print("=== Transpose & Contiguity ===")
M = torch.arange(12).reshape(3, 4)   # shape (3,4)
print("M shape:", tuple(M.shape))
Mt = M.T                             # (4,3) - often non-contiguous view
print("Mt shape:", tuple(Mt.shape))
print("Mt.is_contiguous():", Mt.is_contiguous())

# Trying to do Mt.view(-1) may raise if non-contiguous. Show safe pattern:
try:
    v_bad = Mt.view(-1)   # attempt view on possibly non-contiguous tensor
    print("Mt.view(-1) succeeded (unexpected):", v_bad.shape)
except Exception as e:
    print("Mt.view(-1) failed (expected on some systems):", type(e).__name__, "-", e)
    # Fix by making contiguous first
    v_fix = Mt.contiguous().view(-1)
    print("Fixed: Mt.contiguous().view(-1) -> shape:", tuple(v_fix.shape))
print()

# ---------------------------------------------------------------------
# 9) reshape() vs view() vs flatten()
# ---------------------------------------------------------------------
print("=== reshape vs view vs flatten ===")
x = torch.arange(24)
print("x shape:", tuple(x.shape))

r1 = x.reshape(2, -1)   # reshape: safe (may copy if necessary)
r2 = x.view(2, -1)      # view: fast, requires contiguous storage
f = x.flatten()         # flatten to 1D
print("x.reshape(2, -1) ->", tuple(r1.shape))
print("x.view(2, -1) ->", tuple(r2.shape))
print("x.flatten() ->", tuple(f.shape))
print()

# ---------------------------------------------------------------------
# 10) Unsqueeze and Squeeze (add/remove singleton dims)
# ---------------------------------------------------------------------
print("=== unsqueeze / squeeze ===")
v = torch.tensor([7, 8, 9])   # (3,)
print("v shape:", tuple(v.shape), "v:", v)

v_u0 = v.unsqueeze(0)         # (1,3) add batch dim at front
v_u1 = v.unsqueeze(1)         # (3,1) add trailing dim (column vector)
print("v.unsqueeze(0) shape:", tuple(v_u0.shape))
print(v_u0)
print("v.unsqueeze(1) shape:", tuple(v_u1.shape))
print(v_u1)

# create a tensor with singleton dims and remove them
s = v_u1.unsqueeze(0)         # (1,3,1)
print("s shape before squeeze:", tuple(s.shape))
s_squeezed_all = s.squeeze()  # removes all size-1 dims -> (3,)
s_squeezed_dim0 = s.squeeze(0)  # remove only first dim -> (3,1)
print("s.squeeze() ->", tuple(s_squeezed_all.shape))
print("s.squeeze(0) ->", tuple(s_squeezed_dim0.shape))
print()

# ---------------------------------------------------------------------
# 11) Expand vs Repeat (view vs copy)
# ---------------------------------------------------------------------
print("=== expand vs repeat ===")
t = torch.tensor([1.0, 2.0, 3.0])   # shape (3,)
print("t shape:", tuple(t.shape), "t:", t)

# unsqueeze to (3,1), then expand along columns so each row repeats one value
# (t.expand(3, 3) alone also works, but repeats t as rows instead)
t_expanded = t.unsqueeze(1).expand(3, 3)   # view: no new memory for element values
t_repeated = t.repeat(2, 1)                # copy: new memory allocated (2,3)

print("t.unsqueeze(1).expand(3,3) shape:", tuple(t_expanded.shape))
print(t_expanded)
print("t.repeat(2,1) shape:", tuple(t_repeated.shape))
print(t_repeated)

# Demonstrate that expand shares memory with t while repeat allocates a new buffer
# (data_ptr() is used because Tensor.storage() is deprecated in recent PyTorch)
print("t_expanded shares memory with t:", t_expanded.data_ptr() == t.data_ptr())
print("t_repeated shares memory with t:", t_repeated.data_ptr() == t.data_ptr())
print("expand returns a view of the original data; repeat copies it into new memory.")
print()

# ---------------------------------------------------------------------
# 12) Masking and conditional assignment
# ---------------------------------------------------------------------
print("=== masking & conditional assignment ===")
X = torch.tensor([[5, 15, 25],
                  [10,20,30],
                  [0, 8, 40]])
print("X:\n", X)

mask = X > 15
print("mask (X > 15):\n", mask)
print("Selected elements (X[mask]):", X[mask])

# In-place masked assignment (do on a clone to preserve original)
Xc = X.clone()
Xc[Xc > 15] = -1
print("X clone after Xc[Xc > 15] = -1:\n", Xc)
print()

# ---------------------------------------------------------------------
# 13) dot, mm, matmul (including batched matmul)
# ---------------------------------------------------------------------
print("=== dot, mm, matmul ===")
a = torch.tensor([1., 2., 3.])
b = torch.tensor([4., 5., 6.])
print("dot(a,b):", torch.dot(a, b))   # scalar: 1*4 + 2*5 + 3*6 = 32

# 2D matrix multiplication
A = torch.tensor([[1., 2.],
                  [3., 4.]])   # (2,2)
B = torch.tensor([[5., 6.],
                  [7., 8.]])   # (2,2)
print("A @ B =\n", A @ B)            # or torch.mm(A,B)

# Batched matmul example: shape (batch, m, n) @ (batch, n, p) -> (batch, m, p)
batchA = torch.stack([A, A + 1.0])   # shape (2,2,2)
batchB = torch.stack([B, B + 1.0])   # shape (2,2,2)
print("batchA shape:", tuple(batchA.shape), "batchB shape:", tuple(batchB.shape))
batchC = torch.matmul(batchA, batchB)   # batched matmul
print("torch.matmul(batchA, batchB) -> shape:", tuple(batchC.shape))
print(batchC)
print()

# ---------------------------------------------------------------------
# 14) Dtype and Device utilities (casting & moving tensors)
# ---------------------------------------------------------------------
print("=== dtype & device utilities ===")
X_int = torch.tensor([[1, 2], [3, 4]])
print("X_int dtype:", X_int.dtype)
X_float = X_int.float()    # cast to float
print("X_int.float() dtype:", X_float.dtype)

# Move to GPU if available, otherwise show CPU path
if torch.cuda.is_available():
    dev = torch.device("cuda")
    X_gpu = X_float.to(dev)
    print("Moved X to GPU:", X_gpu.device)
    # move back and convert to numpy (safe only on CPU)
    X_back = X_gpu.to("cpu")
    print("Back on CPU:", X_back.device, "-> as numpy:", X_back.numpy())
else:
    print("No CUDA device found — staying on CPU.")
    # Convert to numpy directly (CPU tensor)
    print("X_float.numpy() ->", X_float.numpy())

print()
print("✅ Advanced 2D tensor demo complete.")


=== Broadcasting ===
col shape: (3, 1) row shape: (3,)
col + row -> shape: (3, 3)
tensor([[11, 12, 13],
        [21, 22, 23],
        [31, 32, 33]])

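Broadcasting failed as expected for incompatible shapes: RuntimeError The size of tensor a (2) must match the size of tensor b (3) at non-singleton dimension 0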

=== Transpose & Contiguity ===
M shape: (3, 4)
Mt shape: (4, 3)
Mt.is_contiguous(): False
Mt.view(-1) failed (expected on some systems): RuntimeError - view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.
Fixed: Mt.contiguous().view(-1) -> shape: (12,)

=== reshape vs view vs flatten ===
x shape: (24,)
x.reshape(2, -1) -> (2, 12)
x.view(2, -1) -> (2, 12)
x.flatten() -> (24,)

=== unsqueeze / squeeze ===
v shape: (3,) v: tensor([7, 8, 9])
v.unsqueeze(0) shape: (1, 3)
tensor([[7, 8, 9]])
v.unsqueeze(1) shape: (3, 1)
tensor([[7],
        [8],
        [9]])
s shape before squeeze: (1, 3, 1)
s.squeeze() -> (3,)
s.squeeze(0) -> (3, 1)

=== expand vs repeat ===
t shape: (3,) t: tensor([

## 🔹 2️⃣ Matrix Multiplication (Linear Algebra Product)

For the **matrix product**,  
each element $(i, j)$ of the result is the **dot product** of the *i-th row* of $A$ and the *j-th column* of $B$.

---

### 🔸 Formula

$$
C = A \times B, \quad
c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}
$$

where $n$ = number of columns in $A$ (or rows in $B$).

---

### 🔸 Step-by-Step with Positions

Let:

$$
A =
\begin{bmatrix}
2 & 3 \\
4 & 5
\end{bmatrix},
\quad
B =
\begin{bmatrix}
6 & 7 \\
8 & 9
\end{bmatrix}
$$

---

### ➤ For position **(0, 0)**

We take the **1st row** of $A$ and the **1st column** of $B$:

$$
A =
\begin{bmatrix}
\color{red}{2} & \color{red}{3} \\
4 & 5
\end{bmatrix},
\quad
B =
\begin{bmatrix}
\color{red}{6} & 7 \\
\color{red}{8} & 9
\end{bmatrix}
$$

Compute:
$$
(0,0): \; (2 \times 6) + (3 \times 8) = 12 + 24 = 36
$$

---

### ➤ For position **(0, 1)**

Take the **1st row** of $A$ and the **2nd column** of $B$:

$$
A =
\begin{bmatrix}
\color{red}{2} & \color{red}{3} \\
4 & 5
\end{bmatrix},
\quad
B =
\begin{bmatrix}
6 & \color{red}{7} \\
8 & \color{red}{9}
\end{bmatrix}
$$

Compute:
$$
(0,1): \; (2 \times 7) + (3 \times 9) = 14 + 27 = 41
$$

---

### ➤ For position **(1, 0)**

Take the **2nd row** of $A$ and the **1st column** of $B$:

$$
A =
\begin{bmatrix}
2 & 3 \\
\color{red}{4} & \color{red}{5}
\end{bmatrix},
\quad
B =
\begin{bmatrix}
\color{red}{6} & 7 \\
\color{red}{8} & 9
\end{bmatrix}
$$

Compute:
$$
(1,0): \; (4 \times 6) + (5 \times 8) = 24 + 40 = 64
$$

---

### ➤ For position **(1, 1)**

Take the **2nd row** of $A$ and the **2nd column** of $B$:

$$
A =
\begin{bmatrix}
2 & 3 \\
\color{red}{4} & \color{red}{5}
\end{bmatrix},
\quad
B =
\begin{bmatrix}
6 & \color{red}{7} \\
8 & \color{red}{9}
\end{bmatrix}
$$

Compute:
$$
(1,1): \; (4 \times 7) + (5 \times 9) = 28 + 45 = 73
$$

---

### ✅ Final Matrix Multiplication Result

$$
A \times B =
\begin{bmatrix}
36 & 41 \\
64 & 73
\end{bmatrix}
$$

---

📌 **Summary:**  
Each cell $(i, j)$ of the resulting matrix is obtained by **multiplying the i-th row of A**  
with the **j-th column of B** and summing the products.
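
The row-times-column rule can be checked cell by cell. The sketch below rebuilds every entry with `torch.dot` and compares it with the result of `A @ B`, using the same $A$ and $B$ as in this walkthrough.

```python
import torch

A = torch.tensor([[2, 3],
                  [4, 5]])
B = torch.tensor([[6, 7],
                  [8, 9]])

C = A @ B

# Rebuild every cell as (row i of A) . (column j of B) and compare
for i in range(A.shape[0]):
    for j in range(B.shape[1]):
        cell = torch.dot(A[i, :], B[:, j])
        matches = cell.item() == C[i, j].item()
        print(f"({i},{j}): {cell.item()}  matches A @ B: {matches}")
```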


## 🔹 Matrix Multiplication Example: (2×3) × (3×2)

We have two tensors:

$$
A =
\begin{bmatrix}
0 & 1 & 1 \\
1 & 0 & 1
\end{bmatrix},
\quad
B =
\begin{bmatrix}
1 & 1 \\
1 & 1 \\
-1 & 1
\end{bmatrix}
$$

Shape of $A$: **(2 × 3)**  
Shape of $B$: **(3 × 2)**  
✅ Result shape = **(2 × 2)**  

---

### 🔸 Formula

$$
C = A \times B, \quad
c_{ij} = \sum_{k=1}^{3} a_{ik} b_{kj}
$$

---

### ➤ For position **(0, 0)**

Take the **1st row** of $A$ and the **1st column** of $B$:

$$
A =
\begin{bmatrix}
\color{red}{0} & \color{red}{1} & \color{red}{1} \\
1 & 0 & 1
\end{bmatrix},
\quad
B =
\begin{bmatrix}
\color{red}{1} & 1 \\
\color{red}{1} & 1 \\
\color{red}{-1} & 1
\end{bmatrix}
$$

Compute:
$$
(0,0): \; (0\times1) + (1\times1) + (1\times(-1)) = 0
$$

---

### ➤ For position **(0, 1)**

Take the **1st row** of $A$ and the **2nd column** of $B$:

$$
A =
\begin{bmatrix}
\color{red}{0} & \color{red}{1} & \color{red}{1} \\
1 & 0 & 1
\end{bmatrix},
\quad
B =
\begin{bmatrix}
1 & \color{red}{1} \\
1 & \color{red}{1} \\
-1 & \color{red}{1}
\end{bmatrix}
$$

Compute:
$$
(0,1): \; (0\times1) + (1\times1) + (1\times1) = 2
$$

---

### ➤ For position **(1, 0)**

Take the **2nd row** of $A$ and the **1st column** of $B$:

$$
A =
\begin{bmatrix}
0 & 1 & 1 \\
\color{red}{1} & \color{red}{0} & \color{red}{1}
\end{bmatrix},
\quad
B =
\begin{bmatrix}
\color{red}{1} & 1 \\
\color{red}{1} & 1 \\
\color{red}{-1} & 1
\end{bmatrix}
$$

Compute:
$$
(1,0): \; (1\times1) + (0\times1) + (1\times(-1)) = 0
$$

---

### ➤ For position **(1, 1)**

Take the **2nd row** of $A$ and the **2nd column** of $B$:

$$
A =
\begin{bmatrix}
0 & 1 & 1 \\
\color{red}{1} & \color{red}{0} & \color{red}{1}
\end{bmatrix},
\quad
B =
\begin{bmatrix}
1 & \color{red}{1} \\
1 & \color{red}{1} \\
-1 & \color{red}{1}
\end{bmatrix}
$$

Compute:
$$
(1,1): \; (1\times1) + (0\times1) + (1\times1) = 2
$$

---

### ✅ Final Matrix Multiplication Result

$$
A \times B =
\begin{bmatrix}
0 & 2 \\
0 & 2
\end{bmatrix}
$$

---

📌 **Summary:**  
Each element $c_{ij}$ is computed by multiplying the **i-th row of A**  
with the **j-th column of B** and summing the products.
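
As a final sketch, the shape rule for this example can be tested in code. It reuses the same $A$ and $B$ and also tries two variations that are not part of the walkthrough: reversing the order, and multiplying two matrices whose inner dimensions do not match.

```python
import torch

A = torch.tensor([[0, 1, 1],
                  [1, 0, 1]])        # (2, 3)
B = torch.tensor([[1, 1],
                  [1, 1],
                  [-1, 1]])          # (3, 2)

print(A @ B)                         # values: [[0, 2], [0, 2]], shape (2, 2)
print(tuple((B @ A).shape))          # (3, 3): order matters

# (2, 3) @ (2, 3) is undefined because the inner dimensions 3 and 2 differ
try:
    A @ A
except RuntimeError as err:
    print("matmul failed:", type(err).__name__)
```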
