# 🧠 PyTorch Tensor — Creation & Basic Properties

---

## 💬 Question
**Q:** How do you create tensors in PyTorch, and what are the key tensor attributes?


---

## 🧩 Explanation
Tensors are the basic data structure in PyTorch — similar to NumPy arrays,  
but they can also run on GPU and support automatic differentiation.

Key attributes:
- `.shape` or `.size()` → tensor dimensions  
- `.dtype` → data type (e.g., float32, int64)  
- `.device` → CPU or GPU  
- `.requires_grad` → whether it tracks gradients

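Common creation and placement functions:
- `torch.tensor()` → from Python lists (or other array-like data)  
- `torch.zeros()`, `torch.ones()` → initialize with 0 or 1  
- `torch.randn()` → samples from the standard normal distribution  
- `torch.arange()`, `torch.linspace()` → numeric sequences  
- `.to(device)` → move an existing tensor to another device such as a GPU (a placement method, not a creation function)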
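
The attributes and `.to(device)` can be sketched together as follows. The names `device`, `w`, and `w_dev` are illustrative only, and the GPU branch is taken only when CUDA happens to be available on the machine.

```python
import torch

# Pick a GPU when one is available, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# requires_grad=True asks autograd to track operations on this tensor.
w = torch.ones(2, 3, requires_grad=True)
print(w.shape, w.size())      # both report the dimensions (2, 3)
print(w.dtype)                # torch.float32 unless the default dtype was changed
print(w.requires_grad)        # True

# .to(device) returns a tensor on the target device.
w_dev = w.to(device)
print(w_dev.device.type)      # "cuda" or "cpu", depending on the machine
```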

---


In [1]:
# ✅ PyTorch Tensor Creation Examples

import torch

# From list
a = torch.tensor([[1, 2, 3], [4, 5, 6]], dtype=torch.float32)
print("a:\n", a)

# Zero / Ones tensors
zeros = torch.zeros((2, 3))
ones = torch.ones((2, 3))
print("zeros:\n", zeros)
print("ones:\n", ones)

# Random tensor (Normal distribution)
randn = torch.randn((2, 3))
print("randn:\n", randn)

# Range tensors
arange = torch.arange(0, 10, 2)   # 0, 2, 4, 6, 8
linspace = torch.linspace(0, 1, 5) # 5 points between 0 and 1
print("arange:", arange)
print("linspace:", linspace)

# Check attributes
print("Shape:", a.shape)
print("Dtype:", a.dtype)
print("Device:", a.device)
print("Requires_grad:", a.requires_grad)


a:
 tensor([[1., 2., 3.],
        [4., 5., 6.]])
zeros:
 tensor([[0., 0., 0.],
        [0., 0., 0.]])
ones:
 tensor([[1., 1., 1.],
        [1., 1., 1.]])
randn:
 tensor([[ 1.1669,  1.3793, -0.0726],
        [ 0.6423,  1.0847, -0.3984]])
arange: tensor([0, 2, 4, 6, 8])
linspace: tensor([0.0000, 0.2500, 0.5000, 0.7500, 1.0000])
Shape: torch.Size([2, 3])
Dtype: torch.float32
Device: cpu
Requires_grad: False


# 🧠 PyTorch Tensor Shape Operations — Reshape, View, Squeeze, Unsqueeze, Flatten

---

## 💬 Question  
**Q:** How do you change the shape of a tensor in PyTorch?  
Explain the difference between `reshape`, `view`, `squeeze`, `unsqueeze`, and `flatten`.


---

## 🧩 Explanation  

Changing tensor dimensions is common in neural network pipelines — for example, when:
- feeding images into fully connected layers
- processing batched data
- removing unnecessary singleton dimensions

---

### 🔹 `reshape()`  
Returns a new tensor with the desired shape.  
It **creates a new view** if possible; otherwise copies data.  
Flexible and safe in general.

$$A.reshape(\text{new\_shape})$$
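
A minimal sketch of the "view if possible, otherwise copy" behaviour, using `data_ptr()` to check whether two tensors start at the same memory address. The variable names are illustrative.

```python
import torch

x = torch.arange(6)

# Contiguous input: reshape can return a view that shares memory with x.
r = x.reshape(2, 3)
shares_memory = r.data_ptr() == x.data_ptr()

# After a transpose the layout no longer allows a flat view, so reshape copies.
t = r.t()                     # shape (3, 2), non-contiguous
flat = t.reshape(-1)          # values in row-major order of t
copied = flat.data_ptr() != t.data_ptr()

print(shares_memory, copied)  # True True
print(flat)                   # tensor([0, 3, 1, 4, 2, 5])
```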

---

### 🔹 `view()`  
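Similar to `reshape`, but it **never copies**: the result always shares memory with the original.  
This only works when the requested shape is compatible with the tensor's strides. Contiguous tensors always qualify, while a non-contiguous tensor (e.g., after `transpose`) may raise a `RuntimeError`.  
Use it when you want a guaranteed view, and prefer `reshape` when an occasional copy is acceptable.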

$$A.view(\text{new\_shape})$$
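
Because a view shares memory with its source, a write through one is visible in the other. A short sketch, with illustrative names:

```python
import torch

x = torch.zeros(2, 6)
v = x.view(3, -1)       # -1 lets PyTorch infer the remaining size (4 here)

# view never copies: writing through v is visible in x.
v[0, 0] = 7.0
print(v.shape)          # torch.Size([3, 4])
print(x[0, 0].item())   # 7.0
```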

---

### 🔹 `squeeze()`  
Removes all dimensions of size 1; `squeeze(dim)` removes only the given one, and only if its size is 1.  
Example: shape `[1, 3, 1, 4] → [3, 4]`.

$$\text{squeeze}(A): \text{remove size-1 axes}$$
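
A quick sketch of the difference between squeezing everything and squeezing one axis. Passing an explicit `dim` is often the safer habit, since a bare `squeeze()` would also drop a batch axis that happens to have size 1.

```python
import torch

x = torch.zeros(1, 3, 1, 4)
print(x.squeeze().shape)     # all size-1 axes removed: torch.Size([3, 4])
print(x.squeeze(0).shape)    # only axis 0 removed: torch.Size([3, 1, 4])
print(x.squeeze(1).shape)    # axis 1 has size 3, so nothing changes
```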

---

### 🔹 `unsqueeze()`  
Adds a new dimension of size 1 at a given position.  
Used to make tensors broadcastable or add batch/channel axes.

$$\text{unsqueeze}(A, \text{dim})$$
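
As an illustration of "making tensors broadcastable", the sketch below scales each row of a matrix by its own factor. The tensors `scores` and `row_scale` are made-up example data.

```python
import torch

scores = torch.tensor([[1.0, 2.0, 3.0],
                       [4.0, 5.0, 6.0]])   # shape (2, 3)
row_scale = torch.tensor([10.0, 100.0])   # shape (2,), one factor per row

# Shape (2,) does not broadcast against (2, 3); shape (2, 1) does.
scaled = scores * row_scale.unsqueeze(1)

print(row_scale.unsqueeze(0).shape)   # torch.Size([1, 2])
print(row_scale.unsqueeze(1).shape)   # torch.Size([2, 1])
print(scaled)
```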

---

### 🔹 `flatten()`  
Flattens a contiguous range of dimensions (`start_dim` through `end_dim`) into one; by default it flattens everything.  
With `start_dim=1` the batch dimension is kept: `[batch, channel, height, width] → [batch, -1]`.

$$\text{flatten}(A, \text{start\_dim}=1)$$
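
A sketch with a hypothetical image batch of shape `(8, 3, 32, 32)`, showing how `start_dim` and `end_dim` select which axes are merged:

```python
import torch

images = torch.zeros(8, 3, 32, 32)          # hypothetical batch: N, C, H, W
print(images.flatten().shape)               # torch.Size([24576])
print(images.flatten(start_dim=1).shape)    # torch.Size([8, 3072])
print(images.flatten(start_dim=2).shape)    # torch.Size([8, 3, 1024])
print(images.flatten(1, 2).shape)           # torch.Size([8, 96, 32])
```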

---

## 🧮 Formula  
If a tensor has total elements $N = d_1 \times d_2 \times ... \times d_n$,  
then any reshaping must satisfy:

$$N_{\text{before}} = N_{\text{after}}$$
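
The element-count rule can be checked directly. In this sketch `-1` asks PyTorch to infer one size from the total, and a shape that cannot hold exactly 12 elements is rejected.

```python
import torch

x = torch.arange(12)
print(x.reshape(3, 4).numel() == x.numel())   # True: 3 * 4 == 12
print(x.reshape(2, -1).shape)                  # -1 is inferred as 6

# 5 does not divide 12, so no valid shape exists and PyTorch raises.
try:
    x.reshape(5, -1)
    failed = False
except RuntimeError:
    failed = True
print(failed)                                  # True
```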

---

## 💡 Interview tip  
In practice:
- `view()` is common inside `forward()`  
- `squeeze()` and `unsqueeze()` often appear when aligning dimensions  
- `flatten()` is used before linear layers (e.g., CNN → FC)

---


In [3]:
# ✅ Tensor Shape Operations Examples

import torch

x = torch.arange(12)
print("Original x:", x)
print("Shape:", x.shape)

# 🔹 reshape
x_reshape = x.reshape(3, 4)
print("\nReshape (3,4):\n", x_reshape)

# 🔹 view
x_view = x.view(2, 6)
print("\nView (2,6):\n", x_view)

# 🔹 squeeze / unsqueeze
x_unsq = x_reshape.unsqueeze(0)      # Add dim at front -> shape (1,3,4)
x_sq = x_unsq.squeeze(0)             # Remove dim size=1 -> back to (3,4)
print("\nUnsqueeze -> shape:", x_unsq.shape)
print("Squeeze -> shape:", x_sq.shape)

# 🔹 flatten
x_flat = x_reshape.flatten()         # (12,)
x_flat_batch = x_reshape.flatten(start_dim=1)  # keep first dim (3,4)->(3,4)
print("\nFlatten (all):", x_flat.shape)
print("Flatten from dim=1:", x_flat_batch.shape)


Original x: tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
Shape: torch.Size([12])

Reshape (3,4):
 tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]])

View (2,6):
 tensor([[ 0,  1,  2,  3,  4,  5],
        [ 6,  7,  8,  9, 10, 11]])

Unsqueeze -> shape: torch.Size([1, 3, 4])
Squeeze -> shape: torch.Size([3, 4])

Flatten (all): torch.Size([12])
Flatten from dim=1: torch.Size([3, 4])


# 🧠 PyTorch Tensor Dimension Swapping — Transpose / Permute / T

---

## 💬 Question  
**Q:** How do you swap or reorder tensor dimensions in PyTorch?  
Explain the difference between `transpose()`, `permute()`, and `.T`.


---

## 🧩 Explanation  

When working with multi-dimensional tensors (e.g., images, sequences, batches),  
you often need to **reorder or swap dimensions**.  
Typical use cases include converting image data formats, such as  
from `[batch, height, width, channel]` → `[batch, channel, height, width]`.

---

### 🔹 `transpose(dim0, dim1)`

Swaps **two** dimensions only.

$$x' = \text{transpose}(x, i, j)$$

Example: shape `[2, 3, 4] → [3, 2, 4]`

$$x'_{a,\dots,i,\dots,j,\dots,b} = x_{a,\dots,j,\dots,i,\dots,b}$$

---

### 🔹 `.T`

For **2D tensors**, `.T` is equivalent to `.transpose(0, 1)`:

$$x^\top_{i,j} = x_{j,i}$$

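For tensors with more than two dimensions, `.T` **reverses all dimensions**, but this use is deprecated in recent PyTorch releases. Prefer `permute()` for an explicit order, or `.mT` to transpose only the last two dimensions.  
Example: `[2, 3, 4] → [4, 3, 2]`.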
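
For tensors with more than two dimensions it is usually clearer to state the intent explicitly. A small sketch, where `batch` is a hypothetical stack of five 2×3 matrices:

```python
import torch

batch = torch.zeros(5, 2, 3)               # hypothetical batch of 5 matrices

per_matrix = batch.transpose(-2, -1)       # transpose each 2x3 matrix
reversed_dims = batch.permute(2, 1, 0)     # explicit full reversal of the axes

print(per_matrix.shape)      # torch.Size([5, 3, 2])
print(reversed_dims.shape)   # torch.Size([3, 2, 5])
```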

---

### 🔹 `permute(dims)`

Reorders **multiple** dimensions arbitrarily.  
It is the most general and powerful method.

$$x' = \text{permute}(x, (\text{new\_order}))$$

Example:  
$$\text{permute}(x, (0, 3, 1, 2)): [N, H, W, C] \to [N, C, H, W]$$
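
The channels-last to channels-first conversion above can be sketched as follows. The tensor sizes are arbitrary example values.

```python
import torch

nhwc = torch.arange(2 * 4 * 5 * 3).reshape(2, 4, 5, 3)   # N, H, W, C
nchw = nhwc.permute(0, 3, 1, 2)                          # N, C, H, W

print(nchw.shape)   # torch.Size([2, 3, 4, 5])

# The same element is reachable under both index orders.
same_element = nhwc[1, 2, 3, 0].item() == nchw[1, 0, 2, 3].item()
print(same_element)  # True
```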

---

## 💡 Key Concept

All these methods **return a view**, not a data copy — they change the way data is interpreted in memory.  
That means the new tensor may become **non-contiguous**,  
so you might need to call `.contiguous()` before `.view()`, or use `.reshape()`, which copies only when necessary.

$$x_{\text{contiguous}} = x.\text{permute(...)}.\text{contiguous()}$$

---



In [6]:
import torch

# 3D tensor
x = torch.arange(24).reshape(2, 3, 4)
print("Original shape:", x.shape)  # (2, 3, 4)

# 🔹 transpose: swap two dims
x_t = x.transpose(0, 1)
print("\nAfter transpose(0,1):", x_t.shape)  # (3, 2, 4)

# 🔹 permute: reorder multiple dims
x_p = x.permute(1, 0, 2)
print("After permute(1,0,2):", x_p.shape)    # (3, 2, 4)

# 🔹 .T (for 2D)
m = torch.tensor([[1, 2, 3],
                  [4, 5, 6]])
print("\nMatrix:\n", m)
print("m.T:\n", m.T)  # same as m.transpose(0,1)

# 🔹 non-contiguous example
x_perm = x.permute(2, 0, 1)
print(x_perm)
print("\nIs contiguous?", x_perm.is_contiguous())  # False: permute reordered the strides
x_contig = x_perm.contiguous()
print(x_contig)
print("After .contiguous():", x_contig.is_contiguous())  # True


Original shape: torch.Size([2, 3, 4])

After transpose(0,1): torch.Size([3, 2, 4])
After permute(1,0,2): torch.Size([3, 2, 4])

Matrix:
 tensor([[1, 2, 3],
        [4, 5, 6]])
m.T:
 tensor([[1, 4],
        [2, 5],
        [3, 6]])
tensor([[[ 0,  4,  8],
         [12, 16, 20]],

        [[ 1,  5,  9],
         [13, 17, 21]],

        [[ 2,  6, 10],
         [14, 18, 22]],

        [[ 3,  7, 11],
         [15, 19, 23]]])

Is contiguous? False
tensor([[[ 0,  4,  8],
         [12, 16, 20]],

        [[ 1,  5,  9],
         [13, 17, 21]],

        [[ 2,  6, 10],
         [14, 18, 22]],

        [[ 3,  7, 11],
         [15, 19, 23]]])
After .contiguous(): True


# 🧠 Understanding `contiguous()` in PyTorch

---

## 💬 Question

**Q:** What does `contiguous()` mean in PyTorch, and why do we need it?


---

## 🧩 Explanation

In PyTorch, tensors are stored as **blocks of memory**.  
Each tensor has two key properties:

1️⃣ **data** – the actual values stored in memory  
2️⃣ **stride** – how many elements to step over in memory to reach the next index along each dimension

A tensor is said to be **contiguous** when its elements are laid out in memory in the same row-major order in which its indices are traversed, with no gaps.

---

### 🔹 Example: Contiguous Tensor

A normal tensor created by `torch.arange()` or `torch.randn()` is contiguous by default.

$$\text{Memory layout: } [0, 1, 2, 3, 4, 5]$$

This layout matches its shape order, so:

$$x.\text{is\_contiguous()} = \text{True}$$

---

### 🔹 Example: Non-Contiguous Tensor

When you use operations like `transpose()` or `permute()`,  
PyTorch **does not copy data** — it only changes how indices map to memory positions using strides.

Transpose changes the stride order, not the memory itself.

Hence, after transpose:

$$x.\text{is\_contiguous()} = \text{False}$$
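
The strides themselves can be inspected with `.stride()`. A minimal sketch for a 2×3 tensor and its transpose:

```python
import torch

x = torch.arange(6).reshape(2, 3)
print(x.stride())                      # (3, 1): next row is 3 elements away, next column 1

y = x.t()
print(y.stride())                      # (1, 3): same memory, strides swapped
print(y.data_ptr() == x.data_ptr())    # True: no data was copied

# .contiguous() makes a fresh row-major copy for the new shape (3, 2).
print(y.contiguous().stride())         # (2, 1)
```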

---

### 🔹 Why It Matters

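The `.view()` function in PyTorch **requires** a layout whose strides are compatible with the requested shape,  
because it directly reinterprets the existing memory without copying.

If a tensor is non-contiguous in a way that new strides cannot express (as after the transpose in the example that follows), `.view()` raises a `RuntimeError`:

> view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.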



In [7]:
import torch

# 1️⃣ Create a contiguous tensor
x = torch.arange(6).reshape(2, 3)
print("x:\n", x)
print("is_contiguous:", x.is_contiguous())  # ✅ True

# 2️⃣ Transpose makes it non-contiguous
y = x.t()
print("\ny (transposed):\n", y)
print("is_contiguous:", y.is_contiguous())  # ❌ False

# 3️⃣ Trying to use .view() now will fail
try:
    z = y.view(-1)
except RuntimeError as e:
    print("\nError when using view on non-contiguous tensor:")
    print(e)

# 4️⃣ Fix by making it contiguous
y_contig = y.contiguous()
print("\nAfter .contiguous(): is_contiguous =", y_contig.is_contiguous())

# Now .view() works fine
z = y_contig.view(-1)
print("Reshaped z:", z)


x:
 tensor([[0, 1, 2],
        [3, 4, 5]])
is_contiguous: True

y (transposed):
 tensor([[0, 3],
        [1, 4],
        [2, 5]])
is_contiguous: False

Error when using view on non-contiguous tensor:
view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.

After .contiguous(): is_contiguous = True
Reshaped z: tensor([0, 3, 1, 4, 2, 5])


# 🧠 PyTorch Tensor Concatenation & Splitting — `cat`, `stack`, `split`, `chunk`

---

## 💬 Question

**Q:** How do you combine and separate tensors in PyTorch?  
Explain the difference between `torch.cat`, `torch.stack`, `torch.split`, and `torch.chunk`.


---

## 🧩 Explanation

### 🔹 1. `torch.cat(tensors, dim)`

Concatenates a **sequence of tensors** along an existing dimension.

$$y = \text{cat}([x_1, x_2, ...], \text{dim}=d)$$

All tensors must have the **same shape** except for the concatenation dimension.

**Example:**  
Concatenating along dim=0 → adds more rows  
Concatenating along dim=1 → adds more columns
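
The shape requirement can be seen with two tensors that differ only in their first dimension. This is a sketch with made-up shapes:

```python
import torch

a = torch.zeros(2, 3)
b = torch.ones(4, 3)

# Shapes agree on every axis except dim 0, so joining along dim 0 works.
print(torch.cat([a, b], dim=0).shape)    # torch.Size([6, 3])

# Along dim 1 the row counts (2 vs 4) would have to match, so this fails.
try:
    torch.cat([a, b], dim=1)
    cat_ok = True
except RuntimeError:
    cat_ok = False
print(cat_ok)                            # False
```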

---

### 🔹 2. `torch.stack(tensors, dim)`

Stacks tensors along a **new dimension**.  
Unlike `cat`, it **creates a new axis**.

$$y = \text{stack}([x_1, x_2, ...], \text{dim}=d)$$

All tensors must have **exactly the same shape**.

Example: stacking 3 vectors of shape `(2,)` → result `(3, 2)`.
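
A sketch of the three-vector example, which also shows that `stack` gives the same result as `unsqueeze` on each input followed by `cat`:

```python
import torch

vs = [torch.tensor([1, 2]), torch.tensor([3, 4]), torch.tensor([5, 6])]

print(torch.stack(vs, dim=0).shape)   # torch.Size([3, 2])
print(torch.stack(vs, dim=1).shape)   # torch.Size([2, 3])

# stack behaves like unsqueeze on each input followed by cat.
same = torch.equal(torch.stack(vs, dim=0),
                   torch.cat([v.unsqueeze(0) for v in vs], dim=0))
print(same)                           # True
```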

---

### 🔹 3. `torch.split(tensor, split_size_or_sections, dim)`

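Splits a tensor into **chunks of a given size or a list of specified sizes** along a given dimension.

$$[x_1, x_2, ...] = \text{split}(x, n, \text{dim})$$

If `split_size_or_sections` is an integer, every chunk has that size, except the last one, which is smaller when the dimension is not evenly divisible.  
If it’s a list, the tensor is split into pieces with exactly those lengths, which must sum to the size of the dimension.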
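
Both calling styles on a 2-D tensor, splitting along `dim=1`. The shapes are illustrative.

```python
import torch

x = torch.arange(12).reshape(2, 6)

# Integer argument: pieces of width 4 along dim 1; the last piece holds the rest.
by_size = torch.split(x, 4, dim=1)
print([tuple(p.shape) for p in by_size])        # [(2, 4), (2, 2)]

# List argument: explicit widths, which must sum to the size of that dim (6).
by_sections = torch.split(x, [1, 2, 3], dim=1)
print([tuple(p.shape) for p in by_sections])    # [(2, 1), (2, 2), (2, 3)]
```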

---

### 🔹 4. `torch.chunk(tensor, chunks, dim)`

Splits a tensor into a **given number of chunks** along the specified dimension.  
The last chunk may be smaller if the size is not evenly divisible, and fewer chunks than requested may be returned.

$$[x_1, x_2, ...] = \text{chunk}(x, k, \text{dim})$$
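
A short sketch of how `chunk` sizes its pieces: each chunk gets the ceiling of `length / chunks` elements, so a request can come back with fewer chunks than asked for.

```python
import torch

x = torch.arange(10)
print([c.tolist() for c in torch.chunk(x, 3)])   # sizes 4, 4, 2

# Asking for 4 chunks of 5 elements gives chunk size ceil(5 / 4) = 2,
# which yields only 3 chunks.
y = torch.arange(5)
print(len(torch.chunk(y, 4)))                    # 3
```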

---

## 🧮 Visual Concept

If we have

$$A = \begin{bmatrix}
1 & 2 \\
3 & 4
\end{bmatrix}, \quad
B = \begin{bmatrix}
5 & 6 \\
7 & 8
\end{bmatrix}$$

Then:

$$\text{cat}([A,B], 0) =
\begin{bmatrix}
1 & 2 \\
3 & 4 \\
5 & 6 \\
7 & 8
\end{bmatrix}
,\quad
\text{cat}([A,B], 1) =
\begin{bmatrix}
1 & 2 & 5 & 6 \\
3 & 4 & 7 & 8
\end{bmatrix}$$

$$\text{stack}([A,B], 0) \Rightarrow \text{shape }(2,2,2)$$

---

## 🧠 Summary

| Function | Purpose | Creates New Dim? | Input Shape Requirement |
|-----------|----------|------------------|--------------------------|
| `torch.cat` | Join along existing dim | ❌ No | Same except concat dim |
| `torch.stack` | Join along new dim | ✅ Yes | Exactly same shape |
| `torch.split` | Split by size/sections | ❌ No | Any |
| `torch.chunk` | Split by number of chunks | ❌ No | Any |

---

## 💬 Interview Tip

> Use `cat` when combining tensors along an existing axis,  
> use `stack` when you need to add a new dimension (e.g. stacking images into batch),  
> use `split` or `chunk` to divide large tensors into smaller batches for processing.

---


In [8]:
import torch

# 1️⃣ Prepare tensors
a = torch.tensor([[1, 2],
                  [3, 4]])
b = torch.tensor([[5, 6],
                  [7, 8]])

# 🔹 cat: concatenate along existing dim
cat_dim0 = torch.cat((a, b), dim=0)
cat_dim1 = torch.cat((a, b), dim=1)
print("cat dim=0:\n", cat_dim0)
print("cat dim=1:\n", cat_dim1)

# 🔹 stack: adds a new dimension
stack_0 = torch.stack((a, b), dim=0)
stack_1 = torch.stack((a, b), dim=1)
print("\nstack dim=0:", stack_0.shape)  # (2,2,2)
print("stack dim=1:", stack_1.shape)    # (2,2,2)

# 🔹 split: by size
x = torch.arange(10)
splits = torch.split(x, 3)  # 3,3,3,1
print("\nsplit by 3:", [s.tolist() for s in splits])

# 🔹 split by custom sizes
splits_custom = torch.split(x, [2, 4, 4])
print("split custom [2,4,4]:", [s.tolist() for s in splits_custom])

# 🔹 chunk: by number of parts
chunks = torch.chunk(x, 4)  # split into 4 chunks
print("\nchunk(4):", [c.tolist() for c in chunks])


cat dim=0:
 tensor([[1, 2],
        [3, 4],
        [5, 6],
        [7, 8]])
cat dim=1:
 tensor([[1, 2, 5, 6],
        [3, 4, 7, 8]])

stack dim=0: torch.Size([2, 2, 2])
stack dim=1: torch.Size([2, 2, 2])

split by 3: [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
split custom [2,4,4]: [[0, 1], [2, 3, 4, 5], [6, 7, 8, 9]]

chunk(4): [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]


# 🧠 PyTorch Tensor Broadcasting & Expanding — `expand`, `repeat`

---

## 💬 Question

**Q:** What is broadcasting in PyTorch, and how do `expand()` and `repeat()` work?  
Explain their differences.


---

## 🧩 Explanation

Broadcasting allows tensors with **different shapes** to be combined in arithmetic operations  
(addition, multiplication, etc.) **without explicit copying**.

---

### 🔹 Broadcasting Rule

When performing operations like `a + b`, PyTorch compares tensor shapes **from right to left**; a tensor with fewer dimensions is treated as if it had leading dimensions of size 1:

1️⃣ If dimensions are equal → ✅ compatible  
2️⃣ If one of them is 1 (or missing) → ✅ expand it to match  
3️⃣ Otherwise → ❌ incompatible (raises error)

$$\text{Example: } (3, 1) + (1, 4) \Rightarrow (3, 4)$$
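
The rule can be tried out without allocating large tensors. This sketch assumes a PyTorch version that provides `torch.broadcast_shapes`; the shapes are arbitrary examples.

```python
import torch

# Shapes are aligned from the right; a missing leading axis counts as 1.
print(torch.broadcast_shapes((3, 1), (1, 4)))      # torch.Size([3, 4])
print(torch.broadcast_shapes((5, 3, 4), (4,)))     # torch.Size([5, 3, 4])

# Trailing sizes 3 and 4 differ and neither is 1, so the addition fails.
try:
    torch.zeros(2, 3) + torch.zeros(2, 4)
    compatible = True
except RuntimeError:
    compatible = False
print(compatible)                                  # False
```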

---

### 🔹 `expand(*sizes)`

Creates a **new view** of the tensor by “virtually” expanding dimensions.  
**No new memory is allocated** — it only changes the tensor’s stride interpretation.

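$$y = x.\text{expand}(3, 4)$$

⚠️ You can only expand a dimension if its size is **1** or already matches the target; passing `-1` keeps a dimension's current size.

**Efficient but not safe to write to**: several elements of the expanded tensor share one memory location, so in-place operations on it can raise a `RuntimeError`, and indexed writes change the original tensor. Call `.clone()` first if you need to modify it.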
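
A sketch of what "virtual" expansion looks like in terms of strides. The expanded axis gets stride 0, so every column re-reads the same element. Cloning before any write gives each element its own memory. Names are illustrative.

```python
import torch

x = torch.tensor([[1], [2], [3]])          # shape (3, 1)
e = x.expand(-1, 4)                        # -1 keeps the existing size 3

print(e.shape)                             # torch.Size([3, 4])
print(e.stride())                          # (1, 0): stride 0 re-reads one element
print(e.data_ptr() == x.data_ptr())        # True: no new memory

# Clone before writing so each element gets its own memory location.
w = e.clone()
w[0, 0] = 99
print(x[0, 0].item())                      # 1: the original is untouched
```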

---

### 🔹 `repeat(*sizes)`

**Physically copies data**, repeating the tensor the given number of times along each dimension.

$$y = x.\text{repeat}(3, 4)$$

The arguments are repetition counts, not a target shape. The call **allocates new memory**, so it’s slower but safe to modify.
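
Since the arguments to `repeat` count repetitions per axis, the resulting shape is the input shape multiplied by them. A brief sketch:

```python
import torch

x = torch.tensor([1, 2, 3])                 # shape (3,)

# Arguments are repetition counts per axis, not a target shape.
print(x.repeat(2, 1).shape)                 # torch.Size([2, 3])
print(x.repeat(2, 2).shape)                 # torch.Size([2, 6])
print(x.repeat(2, 2)[0].tolist())           # [1, 2, 3, 1, 2, 3]
```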

---

## 💡 Key Difference

| Function | Creates Copy? | Modifiable? | Use Case |
|-----------|----------------|--------------|-----------|
| `expand()` | ❌ No | ⚠️ Clone before writing | Memory-efficient broadcasting |
| `repeat()` | ✅ Yes | ✅ Writable | When actual replication is needed |

---

### 🔹 Example: shapes

$$x = [1, 2, 3] \text{ (shape } (3,)) \\
x.\text{expand}(2,3) \Rightarrow
\begin{bmatrix}
1 & 2 & 3 \\
1 & 2 & 3
\end{bmatrix} \text{ (no copy)}$$

$$x.\text{repeat}(2,1) \Rightarrow
\begin{bmatrix}
1 & 2 & 3 \\
1 & 2 & 3
\end{bmatrix} \text{ (copied data)}$$

---

## 🧠 Broadcasting in Practice

$$(3,1) + (1,4) \Rightarrow (3,4)$$

This allows efficient elementwise operations across tensors with different but compatible shapes.

**Example:**
- `x` shape `(3,1)`
- `y` shape `(1,4)`
- `x + y` automatically becomes `(3,4)` via broadcasting

---

## 💬 Interview Tip

> Broadcasting enables implicit expansion of dimensions for elementwise ops.  
> `expand()` provides a lightweight “view” of that expansion,  
> while `repeat()` makes an actual data copy.

---


In [9]:
import torch

# 1️⃣ Broadcasting automatically
a = torch.arange(3).reshape(3, 1)      # shape (3,1)
b = torch.arange(4).reshape(1, 4)      # shape (1,4)
print("a shape:", a.shape)
print("b shape:", b.shape)
print("a + b shape:", (a + b).shape)   # (3,4)

# 2️⃣ expand(): view, no copy
x = torch.tensor([1, 2, 3])
x_exp = x.expand(2, 3)  # virtual repeat: both rows read the same memory
print("\nexpand result:\n", x_exp)
print("Is same memory:", x_exp.data_ptr() == x.data_ptr())  # ✅ same memory
try:
    x_exp.add_(1)   # in-place op on overlapping memory is rejected
except RuntimeError:
    print("In-place op on expand -> RuntimeError")

# 3️⃣ repeat(): real copy
x_rep = x.repeat(2, 1)
print("\nrepeat result:\n", x_rep)
print("Is same memory:", x_rep.data_ptr() == x.data_ptr())  # ❌ different memory
x_rep[0, 0] = 999
print("After modify repeat:\n", x_rep)
print("Original x unchanged:\n", x)

# 4️⃣ Broadcasting example
c = torch.arange(3).reshape(3, 1)
d = torch.arange(4).reshape(1, 4)
res = c + d
print("\nBroadcasting result:\n", res)


a shape: torch.Size([3, 1])
b shape: torch.Size([1, 4])
a + b shape: torch.Size([3, 4])

expand result:
 tensor([[1, 2, 3],
        [1, 2, 3]])
Is same memory: True
In-place op on expand -> RuntimeError

repeat result:
 tensor([[1, 2, 3],
        [1, 2, 3]])
Is same memory: False
After modify repeat:
 tensor([[999,   2,   3],
        [  1,   2,   3]])
Original x unchanged:
 tensor([1, 2, 3])

Broadcasting result:
 tensor([[0, 1, 2, 3],
        [1, 2, 3, 4],
        [2, 3, 4, 5]])
