# CME 193 – Lecture 3: Array & Tensor Operations in Python

> **TIME: 3:35 PM - Start of class**

In this notebook we will learn **how to perform different operations on arrays/tensors with native Python, NumPy and PyTorch.**

# Installing PyTorch
Use the following command to install PyTorch into the virtual environment used by the notebook. You only need to run this once; comment it out after the installation is complete.

NumPy is pulled in automatically by this command (`torchvision` declares `numpy` as a dependency), so you do not need to install it separately.

In [6]:
# comment out the following line if you have already installed PyTorch.
# ! pip install torch torchvision

### Verify Installation

> **TIME: 3:35–3:38 PM**

Run this in a Python terminal or Jupyter cell:

In [None]:
# Import NumPy for numerical array operations
import numpy as np
# Import PyTorch for tensor operations and deep learning
import torch
# Import pprint for pretty-printing nested data structures
from pprint import pprint

# Display the installed version of NumPy
print(f"NumPy version: {np.__version__}")
# Display the installed version of PyTorch
print(f"PyTorch version: {torch.__version__}")
# Check if CUDA (GPU support) is available on this system
print(f"CUDA available: {torch.cuda.is_available()}")
# Check if MPS (Metal Performance Shaders for Apple Silicon M1/M2/M3/M4 Macs) is available
print(f"MPS available: {torch.backends.mps.is_available()}")

# automatically choose the best available device for your system
device = torch.device("cuda" if torch.cuda.is_available() else "mps" if torch.backends.mps.is_available() else "cpu")
print(f"Using device: {device}")

NumPy version: 2.3.3
PyTorch version: 2.8.0
CUDA available: False
MPS available: True
Using device: mps


### 📚 Tensor Terminology

> **TIME: 3:38–3:43 PM**

**Important naming conventions:**

<!-- ![tensor-terminology](https://github.com/tianyudu/CME193-Autumn-2025/blob/main/notebooks/images/tensors.png?raw=true) -->
![tensor-terminology](./images/tensors.png?raw=true)

- **Vector**: 1-dimensional tensor/array (just like a list in Python)

- **Matrix**: 2-dimensional tensor/array (just like a *nested* list in Python)

- **Tensor**: The general term for an n-dimensional array. It often refers specifically to 3+ dimensions, but vectors and matrices are special cases of tensors

- In NumPy, everything is called an "array" or "ndarray"

- In PyTorch, everything is called a "tensor" because vectors, matrices, and higher-dimensional arrays are all tensors

- Note: the Tensor data type in PyTorch is similar to the "tensor" in physics, but the two are not exactly the same.

This terminology is important when reading documentation and research papers!

## Example of Creating Vectors and Matrices with Native Python

In [None]:
vector_example = [1, 2, 3]
matrix_example = [[1, 2, 3], [4, 5, 6]]
tensor_example = [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]]

pprint(vector_example)
pprint(matrix_example)
pprint(tensor_example)

print(vector_example[0])
print(matrix_example[0][1])
print(tensor_example[0][1][1])

[1, 2, 3]
[[1, 2, 3], [4, 5, 6]]
[[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]]
1
2
5


In [None]:
print(matrix_example[0, 1])  # <-- this won't work with native Python nested lists!

TypeError: list indices must be integers or slices, not tuple
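
For contrast, here is a minimal sketch of the same lookup after wrapping the nested list in a NumPy array, where the comma-separated index is accepted. The variable names `matrix_list` and `matrix_np` are illustrative and do not appear in the cells above.

```python
import numpy as np

# A nested Python list needs chained indexing: one [] per level.
matrix_list = [[1, 2, 3], [4, 5, 6]]
print(matrix_list[0][1])   # 2

# Wrapping it in a NumPy array allows a single tuple index instead.
matrix_np = np.array(matrix_list)
print(matrix_np[0, 1])     # 2 (row 0, column 1)
print(matrix_np[:, 1])     # [2 5] (the whole second column)
```

Column selection such as `[:, 1]` has no direct equivalent on nested lists, which is one practical reason to move to arrays.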

### 🐌 Why Not Just Use Python Lists?

> **TIME: 3:43–3:48 PM**


Before we dive into NumPy and PyTorch, let's see why we need them at all!

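Let's square 1 million numbers in three different ways and compare how long each takes. NumPy and PyTorch tend to be faster than plain Python for two reasons:

- **Vectorization/Parallelism**: The loop runs inside compiled, optimized code (often using SIMD instructions and sometimes multiple CPU cores) instead of the Python interpreter.
- **GPU acceleration**: PyTorch can offload operations to a GPU, which has thousands to tens of thousands of parallel cores.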

In [None]:
# Simple task: element-wise square of 1 million numbers
print("=" * 70)
print("PERFORMANCE TEST: Squaring 1,000,000 numbers")
print("=" * 70)

n = 1_000_000

# Python lists (manual loop)
print("\n1️⃣  Python list comprehension:")
%timeit [x**2 for x in range(n)]

# NumPy (vectorized)
print("\n2️⃣  NumPy vectorized:")
%timeit np.arange(n)**2

# NumPy (manual loop)
# print("\n2️⃣b NumPy with manual loop:")
# def numpy_loop_square(n):
#     arr = np.arange(n)
#     result = np.empty(n)
#     for i in range(n):
#         result[i] = arr[i]**2
#     return result
# %timeit numpy_loop_square(n)

# PyTorch (vectorized)
print("\n3️⃣  PyTorch vectorized:")
%timeit torch.arange(n).to(device)**2

# PyTorch (manual loop)
# print("\n3️⃣b PyTorch with manual loop:")
# def pytorch_loop_square(n):
#     arr = torch.arange(n)
#     result = torch.empty(n)
#     for i in range(n):
#         result[i] = arr[i]**2
#     return result
# %timeit pytorch_loop_square(n)

print("\n" + "=" * 70)
print("🚀 NumPy and PyTorch are typically 10-50x faster!")
print("=" * 70)


PERFORMANCE TEST: Squaring 1,000,000 numbers

1️⃣  Python list comprehension:
44.6 ms ± 4.16 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

2️⃣  NumPy vectorized:
1.12 ms ± 57.6 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

3️⃣  PyTorch vectorized:
The slowest run took 64.56 times longer than the fastest. This could mean that an intermediate result is being cached.
11.4 ms ± 22.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

🚀 NumPy and PyTorch are typically 10-50x faster!


# Why Two Packages?

NumPy and PyTorch have different areas of specialization. Here is a comparison:

> **TIME: 3:48–3:50 PM**



| NumPy | PyTorch |
|---|---|
| **What is it?**<br>NumPy (Numerical Python) is a foundational library for scientific computing in Python. It provides efficient operations on large, multi-dimensional arrays and matrices, plus a broad suite of math functions. | **What is it?**<br>PyTorch is an open-source machine learning library widely used for deep learning. It emphasizes flexibility and speed when building and training neural networks. |
| **Key features**<br>• Efficient n-dimensional array (ndarray) operations<br>• Rich math: linear algebra, FFT, random sampling, etc.<br>• Integrates smoothly with SciPy, Pandas, Matplotlib<br>• High performance from C under the hood | **Key features**<br>• Dynamic computation graphs for flexible modeling and debugging<br>• Tensors (NumPy-like) with GPU acceleration<br>• Autograd for automatic differentiation<br>• Rich ecosystem for CV, NLP, and more |
| **Core purpose**<br>General numerical & scientific computing: array manipulation, math functions, and data analysis foundations. | **Core purpose**<br>Deep learning focus: building and training neural networks (while still supporting general tensor ops). |
| **Data structures**<br>Primary type: **ndarray** (N-dimensional array) supporting many dtypes and vectorized ops. | **Data structures**<br>Primary type: **Tensor**, similar to ndarray but designed to work with GPUs and autograd. |
| **Computation**<br>Primarily CPU-based; fast but no built-in GPU acceleration. | **Computation**<br>Built for GPU acceleration; tensors can be moved to GPUs to speed up large ML workloads. |
| **Automatic differentiation**<br>No built-in autodiff; gradients must be computed manually or via external tools. | **Automatic differentiation**<br>**autograd** automatically computes gradients to simplify backprop and optimization. |
| **Typical use cases**<br>• General scientific computing & data analysis<br>• Array/matrix math and linear algebra<br>• Data preprocessing before ML/DL | **Typical use cases**<br>• Building & training deep learning models<br>• Tasks benefiting from dynamic graphs (e.g., RNNs)<br>• Flexible research & rapid prototyping in AI |
| **Ecosystem & integration**<br>Works hand-in-glove with SciPy/Pandas/Matplotlib to form a full data science stack. | **Ecosystem & integration**<br>Part of a larger DL ecosystem (e.g., Torchvision; integrates with libraries like Transformers). Plays well with NumPy, with easy array↔tensor conversion. |

**Reference**: [https://www.linkedin.com/pulse/numpy-vs-pytorch-nadir-riyani-z6cdf/](https://www.linkedin.com/pulse/numpy-vs-pytorch-nadir-riyani-z6cdf/)

## 1. Creating arrays & tensors

> **TIME: 3:50–4:00 PM**

### 1.1 Creating Numpy Arrays and Torch Tensors from Python lists with `np.array(<PYTHON LIST>)` and `torch.tensor(<PYTHON LIST>)`

In [20]:
# From Python lists
print("=== Creating from Python lists ===")
python_list = [1, 2, 3]
print("Python list:", python_list)
np_arr = np.array(python_list)
print("NumPy array:", np_arr)
pt_arr = torch.tensor(python_list)
print("PyTorch tensor:", pt_arr)

=== Creating from Python lists ===
Python list: [1, 2, 3]
NumPy array: [1 2 3]
PyTorch tensor: tensor([1, 2, 3])


## Before we start, how do we know what we've created? Inspecting Array/Tensor Shapes/Sizes

### Shape and Number of Elements

In [None]:
# Get the shape (dimensions)
np_arr = np.arange(10)
pt_arr = torch.arange(10)
print("NumPy array shape:", np_arr.shape)      # Returns a tuple, e.g., (10,)
print("PyTorch tensor shape:", pt_arr.shape)   # Returns torch.Size([...])

# Get the total number of elements
print("NumPy total elements:", np_arr.size)      # For NumPy: .size
print("PyTorch total elements:", pt_arr.numel())  # For PyTorch: .numel()

NumPy array shape: (10,)
PyTorch tensor shape: torch.Size([10])
NumPy total elements: 10
PyTorch total elements: 10



| Feature | NumPy | PyTorch |
|---------|-------|---------|
| **Get shape** | `.shape` returns tuple, e.g., `(3,)` | `.shape` returns `torch.Size([3])` |
| **Total elements** | `.size` (attribute) | `.numel()` (method) |
| **`len()` behavior** | Returns size of first dimension (rows for 2D+) | Returns size of first dimension (rows for 2D+) |


### What does `len()` do?

⚠️ **Important distinction:**
- **For 1D arrays/tensors**: `len()` returns the number of elements
- **For multi-dimensional arrays/tensors**: `len()` returns the size of the **first dimension** (number of rows), NOT the total!

In [65]:
vector = np.array([1, 2, 3])
matrix = np.array([[1, 2], [3, 4], [5, 6]])

print(len(vector))   # Output: 3 (number of elements)
print(len(matrix))   # Output: 3 (rows, NOT 6!)
print(matrix.size)   # Output: 6 (total elements)

3
3
6


**Best practice:** Use `.size` (NumPy) or `.numel()` (PyTorch) for the total element count; do not use `len()`!
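
The same distinction is easy to check on a 3-D array. The sketch below is a small illustration (the array `cube` and its shape are arbitrary choices, not taken from the lecture cells) showing that `len()` only reports the first dimension, while `.size` equals the product of all entries of `.shape`.

```python
import math
import numpy as np

# An arbitrary 3-D array with shape (2, 3, 4).
cube = np.zeros((2, 3, 4))

print(cube.shape)             # (2, 3, 4)
print(cube.ndim)              # 3 (number of dimensions)
print(len(cube))              # 2 (size of the first dimension only)
print(cube.size)              # 24 (total number of elements)
print(math.prod(cube.shape))  # 24 (product of the shape matches .size)
```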

### 1.2 Creating Zero Matrices: `np.zeros(shape)` and `torch.zeros(shape)`

What's **shape**? A shape is a tuple or list of integers giving the size of the array/tensor along each dimension. For example, `(2, 3)` describes a 2x3 matrix, and `(2, 3, 4)` describes a 2x3x4 tensor with 2x3x4=24 elements.

**Syntax:**
- NumPy: `np.zeros((rows, cols))`
- PyTorch: `torch.zeros((rows, cols))`

**Mathematical notation:**

$$
\mathbf{0}_{2 \times 3} = \begin{bmatrix}
0 & 0 & 0 \\
0 & 0 & 0 \\
\end{bmatrix}
$$


In [69]:
# Zeros
print("\n=== Zeros (2x3) ===")
# Python list
py_zeros = [[0.0 for i in range(3)] for j in range(2)]  # 2x3 matrix
print("Python list:")
for row in py_zeros:
    print(" ", row)
# NumPy & PyTorch
np_zeros = np.zeros((2,3))
pt_zeros = torch.zeros((2,3))
print("NumPy:")
print(np_zeros)
print("PyTorch:")
print(pt_zeros)


=== Zeros (2x3) ===
Python list:
  [0.0, 0.0, 0.0]
  [0.0, 0.0, 0.0]
NumPy:
[[0. 0. 0.]
 [0. 0. 0.]]
PyTorch:
tensor([[0., 0., 0.],
        [0., 0., 0.]])


### Creating Ones Matrices: `np.ones(shape)` and `torch.ones(shape)`

**Syntax:**
- NumPy: `np.ones((rows, cols))`
- PyTorch: `torch.ones((rows, cols))`

**Mathematical notation:**

$$
\mathbf{1}_{3 \times 2} = \begin{bmatrix}
1 & 1 \\
1 & 1 \\
1 & 1 \\
\end{bmatrix}
$$


In [None]:
# Ones
print("\n=== Ones (3x2) ===")
py_ones = [[1.0 for _ in range(2)] for _ in range(3)]  # 3x2 matrix
print("Python list:", py_ones)
np_ones = np.ones((3,2))
pt_ones = torch.ones((3,2))
print("NumPy:")
print(np_ones)
print("PyTorch:")
print(pt_ones)

### Creating Sequences: `np.arange(n)` and `torch.arange(n)`

**Syntax:**
- Python: `list(range(n))`
- NumPy: `np.arange(n)`
- PyTorch: `torch.arange(n)`

**Mathematical notation:**

$$
\text{range}(10) = \begin{bmatrix}
0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\
\end{bmatrix}
$$


In [None]:
# Range/arange
print("\n=== Range 0..9 ===")
py_range = list(range(10))
np_range = np.arange(10)
pt_range = torch.arange(10)
print("Python range:", py_range)
print("NumPy arange:", np_range)
print("PyTorch arange:", pt_range)

### Creating Sequences with Step: `np.arange(start, stop, step)`

**Syntax:**
- Python: `list(range(start, stop, step))`
- NumPy: `np.arange(start, stop, step)`
- PyTorch: `torch.arange(start, stop, step)`

**Mathematical notation:**

$$
\text{range}(2, 10, \text{step}=2) = \begin{bmatrix}
2 & 4 & 6 & 8 \\
\end{bmatrix}
$$


In [None]:
# Range with step
print("\n=== Range 2..10 step 2 ===")
py_step = list(range(2, 10, 2))
np_step = np.arange(2, 10, 2)  # start, stop, step
pt_step = torch.arange(2, 10, 2)
print("Python range:", py_step)
print("NumPy arange:", np_step)
print("PyTorch arange:", pt_step)

### Creating Evenly Spaced Values: `np.linspace(start, stop, num)` and `torch.linspace(start, stop, num)`

**Syntax:**
- NumPy: `np.linspace(start, stop, num)` - Creates `num` evenly spaced values from `start` to `stop` (inclusive)
- PyTorch: `torch.linspace(start, stop, num)`

**Mathematical notation:**

$$
\text{linspace}(0, 1, 5) = \begin{bmatrix}
0.0 & 0.25 & 0.5 & 0.75 & 1.0 \\
\end{bmatrix}
$$

**Difference from `arange`:** `linspace` takes the number of points and includes the endpoint `stop`, while `arange` takes a step size and excludes `stop`.
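
A short side-by-side sketch of the endpoint behavior, using NumPy only. The variable names are illustrative; the `endpoint=False` call is a NumPy option and is shown here only to make the contrast explicit.

```python
import numpy as np

# arange: fixed step, the stop value is excluded.
by_step = np.arange(0, 1, 0.25)
print(by_step)       # [0.   0.25 0.5  0.75]

# linspace: fixed number of points, the stop value is included by default.
by_count = np.linspace(0, 1, 5)
print(by_count)      # [0.   0.25 0.5  0.75 1.  ]

# NumPy's linspace can also drop the endpoint, which changes the spacing.
no_endpoint = np.linspace(0, 1, 5, endpoint=False)
print(no_endpoint)   # five values spaced 0.2 apart, starting at 0
```

When the number of samples matters more than the exact step (for example, plotting a function on an interval), `linspace` is usually the more convenient choice.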


In [None]:
# Linspace (PyTorch has this too!)
print("\n=== Linspace 0 to 1 (5 points) ===")
# Python (manual calculation)
py_lin = [0.0 + i * (1.0 - 0.0) / (5 - 1) for i in range(5)]
np_lin = np.linspace(0, 1, 5)  # 5 evenly spaced values from 0 to 1
pt_lin = torch.linspace(0, 1, 5)
print("Python list:", [f"{x:.4f}" for x in py_lin])
print("NumPy linspace:", np_lin)
print("PyTorch linspace:", pt_lin)


### Creating Uninitialized Arrays/Tensors: `np.empty(shape)` and `torch.empty(shape)` 🗑️

**Syntax:**
- NumPy: `np.empty((rows, cols))`
- PyTorch: `torch.empty((rows, cols))`

**Mathematical notation:**

$$
\mathbf{?}_{2 \times 3} = \begin{bmatrix}
? & ? & ? \\
? & ? & ? \\
\end{bmatrix}
\quad \text{(random garbage from memory!)}
$$

⚠️ **Warning**: These create arrays/tensors without initializing values - you'll see whatever was in memory! It could be zeros, but you should not expect it to be zeros.

**Why do we need uninitialized arrays/tensors?**
- **Performance**: It's faster to create an uninitialized array as a placeholder and fill it later
- **Use case**: When you know you'll immediately overwrite all values
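
The "allocate, then overwrite everything" pattern might look like the following sketch. The array name and the fill rule (squares of the index) are arbitrary choices for illustration.

```python
import numpy as np

n = 5
# Allocate without initializing; the contents are arbitrary at this point.
squares = np.empty(n, dtype=np.float64)

# Overwrite every element before reading anything back.
for i in range(n):
    squares[i] = i ** 2

print(squares)  # [ 0.  1.  4.  9. 16.]
```

If there is any chance that some element is never written, `zeros` is the safer choice, since reading an unwritten slot of an `empty` array yields an arbitrary value.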


In [61]:
# Creating uninitialized arrays (fastest but unsafe!)
# NumPy: empty()
np_empty = np.empty((2, 3))  # Whatever was in memory
print("NumPy empty (uninitialized):")
print(np_empty)
print("Type:", np_empty.dtype)

# PyTorch: empty()
pt_empty = torch.empty((2, 3))
print("\nPyTorch empty (uninitialized):")
print(pt_empty)
print("Type:", pt_empty.dtype)

# Why use empty()? Performance when you'll overwrite all values immediately

# Timing comparison
size = (1000, 1000)

# NumPy zeros
print("\nNumPy zeros timing:")
%timeit np.zeros(size)

# NumPy empty
print("\nNumPy empty timing:")
%timeit np.empty(size)

# PyTorch zeros
print("\nPyTorch zeros timing:")
%timeit torch.zeros(size)

# PyTorch empty
print("\nPyTorch empty timing:")
%timeit torch.empty(size)

NumPy empty (uninitialized):
[[4.94e-323 1.04e-322 1.58e-322]
 [6.42e-323 1.19e-322 1.73e-322]]
Type: float64

PyTorch empty (uninitialized):
tensor([[0., 0., 0.],
        [0., 0., 0.]])
Type: torch.float32

NumPy zeros timing:
489 μs ± 59.2 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

NumPy empty timing:
22.3 μs ± 414 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

PyTorch zeros timing:
222 μs ± 37.3 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

PyTorch empty timing:
13.4 μs ± 321 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)


## Data Types (dtypes) (Please read after the lecture)

Both libraries support multiple data types. **PyTorch defaults to float32, NumPy to float64.**

### Why Care About Data Types?

- **Memory efficiency**: `float16` uses half the memory of `float32`, which uses half of `float64`
- **Performance**: Smaller data types can be faster (less data movement)
- **GPU constraints**: GPUs often perform better with `float32` or `float16`
- **Precision tradeoffs**: Higher precision (float64) for scientific computing, lower (float16/32) for deep learning
- **Compatibility**: Ensure data types match between operations to avoid automatic conversions

### Memory Requirements Formula

$$
\text{Memory (bytes)} = \text{num\_elements} \times \text{bytes\_per\_element}
$$

$$
\text{For a matrix } M_{m \times n} \text{ with dtype float32: } \text{Memory} = m \times n \times 4 \text{ bytes}
$$

**Example:** A 1000×1000 `float32` matrix uses $1000 \times 1000 \times 4 = 4,000,000$ bytes = 4 MB (about 3.81 MiB, where 1 MiB = $1024^2$ bytes)
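
The formula can be checked directly against what NumPy reports. This is a minimal sketch using the `.size`, `.itemsize`, and `.nbytes` attributes of an ndarray; the matrix `m` is just the 1000×1000 example from the text.

```python
import numpy as np

# A 1000 x 1000 matrix of 4-byte floats.
m = np.zeros((1000, 1000), dtype=np.float32)

print(m.size)                 # 1000000 elements
print(m.itemsize)             # 4 bytes per element
print(m.size * m.itemsize)    # 4000000 bytes, from the formula
print(m.nbytes)               # 4000000 bytes, as reported by NumPy
print(m.nbytes / 1024**2)     # about 3.81 when divided by 1024**2
```

For a PyTorch tensor, the analogous quantity should be obtainable as `tensor.element_size() * tensor.numel()`.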

### Data Type Summary Table

| Data Type | NumPy | PyTorch | Bytes | Memory (1M elements) | Range/Notes |
|-----------|-------|---------|-------|---------------------|-------------|
| **Integers** | | | | | |
| 8-bit int | `np.int8` | `torch.int8` | 1 | 1 MB | -128 to 127 |
| 32-bit int | `np.int32` | `torch.int32` | 4 | 4 MB | ±2.1 billion |
| 64-bit int | `np.int64` | `torch.int64` or `.long` | 8 | 8 MB | ±9.2 quintillion |
| **Floats** | | | | | |
| Half precision | `np.float16` | `torch.float16` or `.half` | 2 | 2 MB | ~3 decimal digits |
| Single precision | `np.float32` | `torch.float32` or `.float` | 4 | 4 MB | ~7 decimal digits |
| Double precision | `np.float64` | `torch.float64` or `.double` | 8 | 8 MB | ~15 decimal digits |
| **Boolean** | `np.bool_` | `torch.bool` | 1 | 1 MB | True/False |
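
The integer ranges in the table can be queried with `np.iinfo`. The sketch below also shows what happens when values fall outside the range of a narrower integer type; the arrays `wide` and `narrow` are illustrative names.

```python
import numpy as np

# Integer ranges come from np.iinfo.
for dtype in (np.int8, np.int32, np.int64):
    info = np.iinfo(dtype)
    print(dtype.__name__, info.min, info.max)

# Values outside the range wrap around when cast to a narrower integer type.
wide = np.array([127, 128, 129], dtype=np.int32)
narrow = wide.astype(np.int8)
print(narrow)  # [ 127 -128 -127]
```

The wrap-around is silent, so it is worth checking value ranges before shrinking an integer dtype to save memory.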

### General Syntax

**NumPy:**
```python
arr = np.array([1, 2, 3], dtype=np.float32)
arr = np.zeros((3, 4), dtype=np.int64)
```

**PyTorch:**
```python
tensor = torch.tensor([1, 2, 3], dtype=torch.float32)
tensor = torch.zeros((3, 4), dtype=torch.int64)
```


### Examples of creating arrays/tensors with different data types

In [28]:
# Default data types
np_default = np.array([1.0, 2.0, 3.0])
pt_default = torch.tensor([1.0, 2.0, 3.0])
print("Default dtypes:")
print(f"NumPy: {np_default.dtype}")  # float64 by default
print(f"PyTorch: {pt_default.dtype}")  # float32 by default

Default dtypes:
NumPy: float64
PyTorch: torch.float32


In [29]:
# Common data types
print("\n=== Integer Types ===")
# NumPy
np_int8 = np.array([1, 2, 3], dtype=np.int8)
np_int32 = np.array([1, 2, 3], dtype=np.int32)
np_int64 = np.array([1, 2, 3], dtype=np.int64)
print(f"NumPy int8: {np_int8.dtype}, bytes per element: {np_int8.itemsize}")
print(f"NumPy int32: {np_int32.dtype}, bytes per element: {np_int32.itemsize}")
print(f"NumPy int64: {np_int64.dtype}, bytes per element: {np_int64.itemsize}")


=== Integer Types ===
NumPy int8: int8, bytes per element: 1
NumPy int32: int32, bytes per element: 4
NumPy int64: int64, bytes per element: 8


In [30]:
# PyTorch
pt_int8 = torch.tensor([1, 2, 3], dtype=torch.int8)
pt_int32 = torch.tensor([1, 2, 3], dtype=torch.int32)
pt_int64 = torch.tensor([1, 2, 3], dtype=torch.int64)
print(f"\nPyTorch int8: {pt_int8.dtype}")
print(f"PyTorch int32: {pt_int32.dtype}")
print(f"PyTorch int64: {pt_int64.dtype}")


PyTorch int8: torch.int8
PyTorch int32: torch.int32
PyTorch int64: torch.int64


In [31]:
print("\n=== Float Types ===")
# NumPy
np_float16 = np.array([1.0, 2.0, 3.0], dtype=np.float16)  # half precision
np_float32 = np.array([1.0, 2.0, 3.0], dtype=np.float32)  # single precision
np_float64 = np.array([1.0, 2.0, 3.0], dtype=np.float64)  # double precision
print(f"NumPy float16: {np_float16.dtype}, bytes: {np_float16.itemsize}")
print(f"NumPy float32: {np_float32.dtype}, bytes: {np_float32.itemsize}")
print(f"NumPy float64: {np_float64.dtype}, bytes: {np_float64.itemsize}")


=== Float Types ===
NumPy float16: float16, bytes: 2
NumPy float32: float32, bytes: 4
NumPy float64: float64, bytes: 8


In [None]:
# PyTorch
pt_float16 = torch.tensor([1.0, 2.0, 3.0], dtype=torch.float16)  # or torch.half
pt_float32 = torch.tensor([1.0, 2.0, 3.0], dtype=torch.float32)  # or torch.float
pt_float64 = torch.tensor([1.0, 2.0, 3.0], dtype=torch.float64)  # or torch.double
print(f"\nPyTorch float16: {pt_float16.dtype}")
print(f"PyTorch float32: {pt_float32.dtype}")
print(f"PyTorch float64: {pt_float64.dtype}")

In [32]:
print("\n=== Boolean Types ===")
np_bool = np.array([True, False, True])
pt_bool = torch.tensor([True, False, True])
print(f"NumPy bool: {np_bool.dtype}")
print(f"PyTorch bool: {pt_bool.dtype}")


=== Boolean Types ===
NumPy bool: bool
PyTorch bool: torch.bool


## Type Conversion

### General Syntax

**NumPy:**
```python
arr_float = arr_int.astype(np.float32)  # General method
```

**PyTorch (multiple ways):**
```python
tensor_float = tensor_int.float()              # Convenience method -> float32
tensor_double = tensor_int.double()            # Convenience method -> float64
tensor_half = tensor_int.half()                # Convenience method -> float16
tensor_int32 = tensor_float.int()              # Convenience method -> int32
tensor_custom = tensor_int.to(torch.float16)   # General method
```

### Why Care About Type Conversion?

- **GPU requirements**: Many GPU operations require `float32` or `float16`
- **Memory optimization**: Convert to smaller types when high precision isn't needed
- **Compatibility**: Some operations require matching data types
- **Loss prevention**: Casting floats to an integer type truncates the fractional part, and floor division (`//`) on integers rounds down; work in a float type if you need decimals
- **Performance**: Operations on mixed types trigger automatic conversions (slower)
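
Two of these points can be seen in a few lines of NumPy. This is a sketch with illustrative variable names, showing floor division on integers and the automatic promotion that happens when dtypes are mixed.

```python
import numpy as np

ints = np.array([7, 8, 9], dtype=np.int32)

# Floor division keeps the integer dtype and drops the fractional part.
floored = ints // 2
print(floored, floored.dtype)   # [3 4 4] int32

# Converting to float first keeps the decimals.
halves = ints.astype(np.float32) / 2
print(halves, halves.dtype)     # [3.5 4.  4.5] float32

# Mixing dtypes promotes the result to the wider type automatically.
mixed = ints + np.array([0.5, 0.5, 0.5], dtype=np.float64)
print(mixed.dtype)              # float64
```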

### Examples of Type Conversion

In [70]:
# Type conversion
print("\n=== Type Conversion ===")
np_arr = np.array([1, 2, 3], dtype=np.int32)
np_float = np_arr.astype(np.float32)
print(f"NumPy: {np_arr.dtype} -> {np_float.dtype}")



=== Type Conversion ===
NumPy: int32 -> float32


In [77]:
pt_arr = torch.tensor([1, 2, 3], dtype=torch.int32)
pt_float = pt_arr.float()  # or .to(torch.float32)
pt_double = pt_arr.double()  # or .to(torch.float64)
pt_half = pt_arr.half()  # or .to(torch.float16)
print(f"PyTorch: {pt_arr.dtype} -> {pt_float.dtype}, {pt_double.dtype}, {pt_half.dtype}")

PyTorch: torch.int32 -> torch.float32, torch.float64, torch.float16


In [78]:
# PyTorch also has convenient conversion methods
pt_arr = torch.tensor([1.5, 2.7, 3.9])
print(f"\nOriginal: {pt_arr}")
print(f"To int: {pt_arr.int()}")  # truncates
print(f"To long: {pt_arr.long()}")  # int64


Original: tensor([1.5000, 2.7000, 3.9000])
To int: tensor([1, 2, 3], dtype=torch.int32)
To long: tensor([1, 2, 3])


### Converting between NumPy ⟷ PyTorch using `torch.from_numpy()` and `tensor.numpy()` (Please read after the lecture)

PyTorch shares memory with NumPy when possible (zero-copy conversion)!

### What Does "Share Memory" Mean?

**Shared memory** means both the NumPy array and PyTorch tensor point to the **same underlying data** in RAM:

- ✅ **Advantage**: No data copying → instant conversion, saves memory
- ⚠️ **Caution**: Modifying one affects the other!

```python
np_arr = np.array([1, 2, 3])
pt_tensor = torch.from_numpy(np_arr)  # Shares memory!

np_arr[0] = 999
print(pt_tensor)  # Output: tensor([999, 2, 3]) - changed too!
```

**When memory is shared:**
- `torch.from_numpy(np_array)` → Shares memory
- `tensor.numpy()` → Shares memory (if tensor is on CPU and contiguous)

In [85]:
np_arr = np.arange(6).reshape(2,3)
pt_arr = torch.from_numpy(np_arr)
print("NumPy array:")
print(np_arr)
print("\nPyTorch tensor (shares memory):")
print(pt_arr)

NumPy array:
[[0 1 2]
 [3 4 5]]

PyTorch tensor (shares memory):
tensor([[0, 1, 2],
        [3, 4, 5]])


In [86]:
# Changing one changes the other
np_arr[0,0] = 99
print("\nAfter modifying np_arr[0,0] = 99:")
print(pt_arr)

# Back to NumPy (shares memory when tensor is on CPU!)
back_to_np = pt_arr.numpy()
print("\nConverted back to NumPy:")
print(back_to_np)


After modifying np_arr[0,0] = 99:
tensor([[99,  1,  2],
        [ 3,  4,  5]])

Converted back to NumPy:
[[99  1  2]
 [ 3  4  5]]


In [87]:
print(back_to_np)
pt_arr += 1
print(back_to_np)

[[99  1  2]
 [ 3  4  5]]
[[100   2   3]
 [  4   5   6]]


**When memory is NOT shared:**
- `torch.tensor(data)` (from a Python list or a NumPy array) → Always creates a new copy
- `tensor.cpu().numpy()` when tensor was on GPU → Must copy from GPU to CPU
- `np_array.copy()` or `tensor.clone()` → Explicit copies
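
One way to check which case applies is `np.shares_memory`, which reports whether two arrays overlap in memory. The sketch below compares `torch.from_numpy` with `torch.tensor` on the same source array; the variable names are illustrative.

```python
import numpy as np
import torch

source = np.array([1, 2, 3])

shared = torch.from_numpy(source)   # zero-copy: same underlying buffer
copied = torch.tensor(source)       # independent copy of the data

# Compare the NumPy views of both tensors against the source array.
print(np.shares_memory(source, shared.numpy()))   # True
print(np.shares_memory(source, copied.numpy()))   # False

# A write to the source is visible only through the shared tensor.
source[0] = 999
print(shared[0].item())   # 999
print(copied[0].item())   # 1
```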

In [89]:
# Example: Creating a copy (memory NOT shared)
np_original = np.array([1, 2, 3])
pt_copy = torch.tensor(np_original)  # Creates a new copy.

np_original[0] = 999
print("Original NumPy array:", np_original)
print("PyTorch tensor (independent copy):", pt_copy)  # Still [1, 2, 3]

Original NumPy array: [999   2   3]
PyTorch tensor (independent copy): tensor([1, 2, 3])


In [None]:
# Explicit copies
np_explicit_copy = np_arr.copy()
pt_explicit_copy = pt_arr.clone()
print("\nExplicit copies are independent:")
print("NumPy copy:", np_explicit_copy)
print("PyTorch clone:", pt_explicit_copy)
np_arr += 100  # modifying the original array does not affect the copies
print("NumPy copy:", np_explicit_copy)
print("PyTorch clone:", pt_explicit_copy)


Explicit copies are independent:
NumPy copy: [[100   2   3]
 [  4   5   6]]
PyTorch clone: tensor([[100,   2,   3],
        [  4,   5,   6]])
NumPy copy: [[100   2   3]
 [  4   5   6]]
PyTorch clone: tensor([[100,   2,   3],
        [  4,   5,   6]])


## 2. Reshaping Arrays and Tensors

> **TIME: 4:00–4:05 PM**



In [40]:
np_arr = np.arange(12)
pt_arr = torch.arange(12)
print("Original shapes:", np_arr.shape, pt_arr.shape)

# NumPy: reshape
np_reshaped = np_arr.reshape(3,4)
print("\nNumPy reshape(3,4):")
print(np_reshaped)

# PyTorch also has reshape!
pt_reshaped = pt_arr.reshape(3,4)
print("\nPyTorch reshape(3,4):")
print(pt_reshaped)

Original shapes: (12,) torch.Size([12])

NumPy reshape(3,4):
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]

PyTorch reshape(3,4):
tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]])


Infer the missing dimension with `-1`.

Suppose we have an image dataset with shape `(num_images, height, width)`, and we want to convert it to a 2D array with shape `(num_images, height * width)` for modeling (e.g., to fit a linear regression model).

One way to do this is:

In [41]:
def flatten_image_data(input_data):
    num_images = input_data.shape[0]  # first get the number of images.
    height = input_data.shape[1]  # then get the height.
    width = input_data.shape[2]  # then get the width.
    return input_data.reshape(num_images, height * width)

In [45]:
example_image_data = np.random.randn(100, 256, 256)
pprint(flatten_image_data(example_image_data))
print(flatten_image_data(example_image_data).shape)

array([[-0.83175747,  1.70909045, -0.95976222, ...,  0.64370245,
         1.25609875,  0.0857383 ],
       [ 0.83158161, -1.2127492 ,  1.02904441, ...,  1.75632332,
         1.09064765,  0.63171535],
       [ 0.42167475,  1.84065853,  0.35019243, ...,  0.97769641,
         2.19496482,  0.88373801],
       ...,
       [-0.92189106, -0.01332148,  0.02257782, ..., -0.94432448,
        -0.76903861,  0.9000175 ],
       [ 0.44457042,  0.38450181, -0.51066133, ...,  0.34844134,
         0.39742803, -1.49165511],
       [-0.37761877, -0.87661186,  1.51565247, ..., -1.09130349,
        -0.8484453 ,  0.31653122]], shape=(100, 65536))
(100, 65536)


In [47]:
# Use -1 to infer a dimension. If we keep the first dimension (num_images) and
# flatten the 3-D array to 2-D, the second dimension of the result must equal
# (total number of entries) / (first dimension).
# PyTorch and NumPy will figure this out for you!
print("\nInferred dimension with -1:")
print(example_image_data.reshape(len(example_image_data), -1))  # -1 means "figure this out"
print(example_image_data.reshape(len(example_image_data), -1).shape)


Inferred dimension with -1:
[[-0.83175747  1.70909045 -0.95976222 ...  0.64370245  1.25609875
   0.0857383 ]
 [ 0.83158161 -1.2127492   1.02904441 ...  1.75632332  1.09064765
   0.63171535]
 [ 0.42167475  1.84065853  0.35019243 ...  0.97769641  2.19496482
   0.88373801]
 ...
 [-0.92189106 -0.01332148  0.02257782 ... -0.94432448 -0.76903861
   0.9000175 ]
 [ 0.44457042  0.38450181 -0.51066133 ...  0.34844134  0.39742803
  -1.49165511]
 [-0.37761877 -0.87661186  1.51565247 ... -1.09130349 -0.8484453
   0.31653122]]
(100, 65536)
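
The `-1` placeholder can stand in for any single dimension, not only the last one. Here is a small sketch on a 12-element array (an arbitrary example) that also shows the failure case where the remaining size does not divide evenly.

```python
import numpy as np

arr = np.arange(12)

print(arr.reshape(-1, 4).shape)      # (3, 4): 12 / 4 = 3 rows
print(arr.reshape(2, -1).shape)      # (2, 6): 12 / 2 = 6 columns
print(arr.reshape(2, -1, 3).shape)   # (2, 2, 3): 12 / (2 * 3) = 2

# The inferred size must be a whole number, otherwise reshape fails.
try:
    arr.reshape(5, -1)
except ValueError as err:
    print("ValueError:", err)
```

Only one dimension can be `-1` in a given call, since otherwise the sizes would be ambiguous.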


## 3. Indexing & Slicing

> **TIME: 4:05–4:12 PM**


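Basic indexing and slicing work the same way in both libraries, with one exception: PyTorch does not support negative steps such as `[::-1]`, so reversing a tensor uses `torch.flip` instead.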


### General Syntax

| Operation | NumPy | PyTorch | Documentation |
|-----------|-------|---------|---------------|
| **Basic Indexing** | `arr[i]` | `tensor[i]` | [NumPy](https://numpy.org/doc/stable/user/basics.indexing.html) \| [PyTorch](https://pytorch.org/docs/stable/tensor_view.html) |
| **Slicing** | `arr[start:stop:step]` | `tensor[start:stop:step]` | Same as above |
| **2D Indexing** | `arr[row, col]` | `tensor[row, col]` | Same as above |
| **2D Slicing** | `arr[r1:r2, c1:c2]` | `tensor[r1:r2, c1:c2]` | Same as above |
| **All rows, specific column** | `arr[:, col]` | `tensor[:, col]` | Same as above |
| **Negative indexing** | `arr[-1]` (last element) | `tensor[-1]` (last element) | Same as above |
| **Reverse array** | `arr[::-1]` | `torch.flip(tensor, [dim])` | [flip](https://pytorch.org/docs/stable/generated/torch.flip.html) |
| **Boolean/Fancy indexing** | `arr[mask]` or `arr[indices]` | `tensor[mask]` or `tensor[indices]` | [NumPy advanced](https://numpy.org/doc/stable/user/basics.indexing.html#advanced-indexing) |
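
The "Reverse array" row is the one place in the table where the two libraries differ. The sketch below shows the difference on a small tensor; the exact exception type raised by the negative step is an assumption and may vary by PyTorch version, so the code catches both likely candidates.

```python
import torch

t = torch.arange(10)

# Negative steps are not supported in PyTorch slicing.
try:
    t[::-1]
except (ValueError, RuntimeError) as err:
    print("Slicing with step -1 failed:", err)

# torch.flip reverses along the listed dimensions and returns a new tensor.
reversed_t = torch.flip(t, dims=[0])
print(reversed_t)   # tensor([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])
```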



In [None]:
np_arr = np.arange(10)
pt_arr = torch.arange(10)

print("arr[5]:", np_arr[5], pt_arr[5])
print("arr[3:7]:", np_arr[3:7], pt_arr[3:7])
print("arr[::2]:", np_arr[::2], pt_arr[::2])  # every other element
print("arr[::-1]:", np_arr[::-1], torch.flip(pt_arr, [0]))  # reverse (PyTorch uses flip)

# 2D indexing
np_2d = 10 * np.arange(6).reshape(-1, 1) + np.arange(6)
pt_2d = 10 * torch.arange(6).reshape(-1, 1) + torch.arange(6)
print("\n2D array:")
print(np_2d)
print("\nSecond row:", np_2d[1])
print("Element [2,3]:", np_2d[2, 3])
print("\nSlice [2:5, 1:4]:")
print(np_2d[2:5, 1:4])


arr[5]: 5 tensor(5)
arr[3:7]: [3 4 5 6] tensor([3, 4, 5, 6])
arr[::2]: [0 2 4 6 8] tensor([0, 2, 4, 6, 8])
arr[::-1]: [9 8 7 6 5 4 3 2 1 0] tensor([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])

2D array:
[[ 0  1  2  3  4  5]
 [10 11 12 13 14 15]
 [20 21 22 23 24 25]
 [30 31 32 33 34 35]
 [40 41 42 43 44 45]
 [50 51 52 53 54 55]]

Second row: [10 11 12 13 14 15]
Element [2,3]: 23

Slice [2:5, 1:4]:
[[21 22 23]
 [31 32 33]
 [41 42 43]]


### Fancy Indexing

Use arrays/tensors or boolean masks as indices
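
One practical difference from basic slicing is worth keeping in mind: in NumPy, a slice is a view of the original data, while fancy indexing returns a copy. The sketch below illustrates this with arbitrary values and illustrative variable names.

```python
import numpy as np

arr = np.arange(10)

# Basic slicing returns a view: writes go through to the original.
view = arr[2:5]
view[0] = -1
print(arr[2])                          # -1
print(np.shares_memory(arr, view))     # True

# Fancy (integer-array) indexing returns a copy: the original is untouched.
picked = arr[[6, 7, 8]]
picked[0] = -99
print(arr[6])                          # 6
print(np.shares_memory(arr, picked))   # False
```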

In [None]:
# Integer array indexing
np_arr = np.arange(10)
pt_arr = torch.arange(10)

idx = [2, 7, 9]  # Can use Python list as the index.
print("Select indices [2,7,9]:")
print(np_arr[idx])
print(pt_arr[idx])

In [51]:
# Boolean indexing
np_mask = np_arr > 5
pt_mask = pt_arr > 5
print("\nElements > 5:")
print(np_arr[np_mask])
print(pt_arr[pt_mask])


Elements > 5:
[6 7 8 9]
tensor([6, 7, 8, 9])


In [52]:
# Modify via boolean indexing
np_arr[np_arr < 3] = 100
pt_arr[pt_arr < 3] = 100
print("\nAfter setting elements < 3 to 100:")
print(np_arr)
print(pt_arr)


After setting elements < 3 to 100:
[100 100 100   3   4   5   6   7   8   9]
tensor([100, 100, 100,   3,   4,   5,   6,   7,   8,   9])


### Concatenation and Stacking (Combining Arrays/Tensors)

Use these to merge along an existing axis or create a new axis.

| Operation | NumPy | PyTorch | Result |
|-----------|-------|---------|--------|
| **Concatenate (existing axis)** | `np.concatenate([a, b], axis=dim)` | `torch.cat([a, b], dim=dim)` | Joins along axis `dim` |
| **Vertical stack (rows)** | `np.vstack([a, b])` | `torch.vstack([a, b])` | Stacks along first axis (1-D inputs become rows) |
| **Horizontal stack (cols)** | `np.hstack([a, b])` | `torch.hstack([a, b])` | Stacks along second axis (first axis for 1-D inputs) |
| **Stack (new axis)** | `np.stack([a, b], axis=dim)` | `torch.stack([a, b], dim=dim)` | Adds a new axis |

Notes:
- Shapes must match on all non-concatenated axes.
- Use `stack` when you want to add a new dimension; use `concatenate/cat` when joining along an existing dimension.
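
A quick way to see the shape rule in action is to try a join that violates it. This is a minimal sketch with made-up arrays; the `errors` list is only there to record what happened.

```python
import numpy as np
import torch

a = np.zeros((2, 3))
c = np.zeros((2, 4))   # same number of rows, different number of columns
errors = []

# Joining along axis 1 works: the non-concatenated axis (axis 0) matches.
joined = np.concatenate([a, c], axis=1)
print(joined.shape)    # (2, 7)

# Joining along axis 0 fails: axis 1 has sizes 3 and 4.
try:
    np.concatenate([a, c], axis=0)
except ValueError as err:
    errors.append(("numpy", str(err)))

# stack needs identical shapes, because it creates a brand-new axis.
try:
    torch.stack([torch.zeros(2, 3), torch.zeros(2, 4)], dim=0)
except RuntimeError as err:
    errors.append(("torch", str(err)))

for library, message in errors:
    print(library, "->", message)
```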


In [None]:
a = np.arange(6).reshape(2, 3)
b = np.arange(100, 106).reshape(2, 3)
A = torch.arange(6).reshape(2, 3)
B = torch.arange(100, 106).reshape(2, 3)

print("a:\n", a)
print("b:\n", b)
print("\nConcatenate rows (axis=0 / dim=0):")
print(np.concatenate([a, b], axis=0))
print(torch.cat([A, B], dim=0))

print("\nConcatenate cols (axis=1 / dim=1):")
print(np.concatenate([a, b], axis=1))
print(torch.cat([A, B], dim=1))

print("\nStack along new axis (axis=0 / dim=0):")
print(np.stack([a, b], axis=0).shape)  # (2, 2, 3)
print(torch.stack([A, B], dim=0).shape)  # (2, 2, 3)

# Convenience helpers (vstack/hstack)
print("\nnp.vstack / torch.vstack:")
print(np.vstack([a, b]).shape)
print(torch.vstack([A, B]).shape)
print("\nnp.hstack / torch.hstack:")
print(np.hstack([a, b]).shape)
print(torch.hstack([A, B]).shape)


## 4. Special Matrices

> **TIME: 4:12–4:14 PM**


### General Syntax


### 📐 Mathematical Definitions

#### **Identity Matrix**

The $n \times n$ identity matrix $\mathbf{I}_n$ is defined as:

$$
(\mathbf{I}_n)_{ij} = \begin{cases}
1 & \text{if } i = j \\\
0 & \text{if } i \neq j
\end{cases}
= \delta_{ij}
$$

where $\delta_{ij}$ is the Kronecker delta.

**Example:** $\mathbf{I}_3 = \begin{bmatrix} 1 & 0 & 0 \\\\ 0 & 1 & 0 \\\\ 0 & 0 & 1 \end{bmatrix}$

#### **Diagonal Matrix**

A diagonal matrix $\mathbf{D}$ with diagonal entries $d_1, d_2, \ldots, d_n$:

$$
\mathbf{D} = \text{diag}(d_1, d_2, \ldots, d_n) = \begin{bmatrix}
d_1 & 0 & \cdots & 0 \\\
0 & d_2 & \cdots & 0 \\\
\vdots & \vdots & \ddots & \vdots \\\
0 & 0 & \cdots & d_n
\end{bmatrix}
$$
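
As a small worked example (values chosen arbitrarily), multiplying by a diagonal matrix scales each component of a vector by the matching diagonal entry, so the same result can be obtained with an element-wise product:

```python
import numpy as np
import torch

d = np.array([2.0, 3.0, 4.0])
x = np.array([1.0, 10.0, 100.0])

D = np.diag(d)          # 1-D input: build the 3x3 diagonal matrix
print(D @ x)            # values 2, 30, 400
print(d * x)            # same values, without forming D
print(np.diag(D))       # 2-D input: extract the diagonal back (2, 3, 4)

d_pt = torch.tensor([2.0, 3.0, 4.0])
x_pt = torch.tensor([1.0, 10.0, 100.0])
print(torch.diag(d_pt) @ x_pt)
```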

#### **Random Matrices**

- **Uniform distribution:** Each entry $\mathbf{A}_{ij} \sim \mathcal{U}(0, 1)$ (uniformly distributed between 0 and 1)

- **Normal (Gaussian) distribution:** Each entry $\mathbf{A}_{ij} \sim \mathcal{N}(0, 1)$ (mean 0, standard deviation 1)

$$
\mathcal{N}(\mu, \sigma^2): \quad f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}
$$
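
The standard samplers draw from $\mathcal{U}(0, 1)$ and $\mathcal{N}(0, 1)$ only. One common way to get other parameters is to shift and scale the samples. The sketch below assumes illustrative values for `mu`, `sigma`, `low`, and `high`:

```python
import numpy as np
import torch

mu, sigma = 5.0, 2.0      # illustrative target mean and standard deviation
low, high = -1.0, 3.0     # illustrative uniform range
n = 100_000

np.random.seed(0)
torch.manual_seed(0)

# N(mu, sigma^2) from N(0, 1): scale by sigma, then shift by mu.
np_normal = mu + sigma * np.random.randn(n)
pt_normal = mu + sigma * torch.randn(n)

# U(low, high) from U(0, 1): scale by the width, then shift by low.
np_uniform = low + (high - low) * np.random.rand(n)

print(f"NumPy:   mean={np_normal.mean():.3f}, std={np_normal.std():.3f}")
print(f"PyTorch: mean={pt_normal.mean().item():.3f}, std={pt_normal.std().item():.3f}")
print(f"Uniform range: [{np_uniform.min():.3f}, {np_uniform.max():.3f}]")
```

With this many samples the empirical mean and standard deviation should land close to `mu` and `sigma`.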

---

| Matrix Type | NumPy | PyTorch | Documentation |
|-------------|-------|---------|---------------|
| **Identity** | `np.eye(n)` | `torch.eye(n)` | [NumPy](https://numpy.org/doc/stable/reference/generated/numpy.eye.html) \| [PyTorch](https://pytorch.org/docs/stable/generated/torch.eye.html) |
| **Diagonal (create)** | `np.diag(vector)` | `torch.diag(vector)` | [NumPy](https://numpy.org/doc/stable/reference/generated/numpy.diag.html) \| [PyTorch](https://pytorch.org/docs/stable/generated/torch.diag.html) |
| **Diagonal (extract)** | `np.diag(matrix)` | `torch.diag(matrix)` | Same as above |
| **Random uniform** | `np.random.rand(m, n)` | `torch.rand(m, n)` | [NumPy](https://numpy.org/doc/stable/reference/random/generated/numpy.random.rand.html) \| [PyTorch](https://pytorch.org/docs/stable/generated/torch.rand.html) |
| **Random normal** | `np.random.randn(m, n)` | `torch.randn(m, n)` | [NumPy](https://numpy.org/doc/stable/reference/random/generated/numpy.random.randn.html) \| [PyTorch](https://pytorch.org/docs/stable/generated/torch.randn.html) |
| **Random integers** | `np.random.randint(low, high, size)` | `torch.randint(low, high, size)` | [NumPy](https://numpy.org/doc/stable/reference/random/generated/numpy.random.randint.html) \| [PyTorch](https://pytorch.org/docs/stable/generated/torch.randint.html) |
| **Set seed** | `np.random.seed(n)` | `torch.manual_seed(n)` | [NumPy](https://numpy.org/doc/stable/reference/random/generated/numpy.random.seed.html) \| [PyTorch](https://pytorch.org/docs/stable/generated/torch.manual_seed.html) |


### 💡 Self-Study Note: Random Seeds
**Reproducibility tip**: Always set seeds when debugging or sharing code! NumPy and PyTorch have independent random states, so you must set both:
```python
np.random.seed(42)
torch.manual_seed(42)
```
**Note**: A fixed seed makes every run draw the same numbers, so in production leave the seed unset (or vary it per run) when you need fresh randomness.
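
An alternative to the global seeds above is to create local generator objects, which keeps the random state explicit. This is a sketch rather than a required pattern for this course; the variable names are made up.

```python
import numpy as np
import torch

# Two NumPy generators with the same seed produce the same stream.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)
np_a = rng_a.random((2, 2))
np_b = rng_b.random((2, 2))

# Same idea in PyTorch: pass a seeded Generator to the sampling function.
gen_a = torch.Generator().manual_seed(42)
gen_b = torch.Generator().manual_seed(42)
pt_a = torch.rand(2, 2, generator=gen_a)
pt_b = torch.rand(2, 2, generator=gen_b)

print(np.array_equal(np_a, np_b))   # True
print(torch.equal(pt_a, pt_b))      # True
```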


In [None]:
# Identity matrix
np_eye = np.eye(5)
pt_eye = torch.eye(5)
print("Identity matrix (5x5):")
print(np_eye)
print(pt_eye)

# Diagonal matrix
np_diag = np.diag(np.arange(5))
pt_diag = torch.diag(torch.arange(5))
print("\nDiagonal matrix:")
print(np_diag)
print(pt_diag)

# Extract diagonal from matrix
np_M = np.random.randn(4, 4)
pt_M = torch.randn(4, 4)
print("\nExtracted diagonal:")
print(np.diag(np_M))
print(torch.diag(pt_M))

# Set seeds for reproducibility
np.random.seed(0)
torch.manual_seed(0)

# Random uniform [0, 1)
print("Uniform random [0,1):")
print(np.random.rand(2, 2))
print(torch.rand(2, 2))

# Random normal (Gaussian, mean=0, std=1)
print("\nNormal distribution:")
print(np.random.randn(2, 2))
print(torch.randn(2, 2))

# Random integers
print("\nRandom integers [0, 10):")
print(np.random.randint(0, 10, size=(2, 2)))
print(torch.randint(0, 10, (2, 2)))

# Reset seed to get same sequence
np.random.seed(0)
torch.manual_seed(0)
print("\nAfter resetting seed (should match first output):")
print(np.random.rand(2, 2))
print(torch.rand(2, 2))


Identity matrix (5x5):
[[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]
tensor([[1., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0.],
        [0., 0., 1., 0., 0.],
        [0., 0., 0., 1., 0.],
        [0., 0., 0., 0., 1.]])

Diagonal matrix:
[[0 0 0 0 0]
 [0 1 0 0 0]
 [0 0 2 0 0]
 [0 0 0 3 0]
 [0 0 0 0 4]]
tensor([[0, 0, 0, 0, 0],
        [0, 1, 0, 0, 0],
        [0, 0, 2, 0, 0],
        [0, 0, 0, 3, 0],
        [0, 0, 0, 0, 4]])

Extracted diagonal:
[ 0.45519952 -0.49114465 -0.10803572 -0.71871658]
tensor([-0.0223, -0.0888, -1.3211,  1.1827])


## 5. Broadcasting (Please read after the lecture)

Automatically expand dimensions to match shapes

Please read after the lecture:
- Broadcasting Rules with NumPy: [https://numpy.org/doc/stable/user/basics.broadcasting.html](https://numpy.org/doc/stable/user/basics.broadcasting.html)

- Broadcasting Rules with PyTorch: [https://docs.pytorch.org/docs/stable/notes/broadcasting.html](https://docs.pytorch.org/docs/stable/notes/broadcasting.html)

### 💡 Self-Study Note: Broadcasting Rules
**The magic of broadcasting**: Arrays with different shapes can be combined if, after aligning their shapes from the right:
1. The shape with fewer dimensions is padded on the left with 1s
2. In each dimension, the sizes match OR one of them is 1

**Example**: (4,1) + (1,3) → (4,3) because dimension sizes are compatible

**Mathematical Example:**

Adding a column vector to a row vector:

$$
\begin{bmatrix} 0 \\\\ 10 \\\\ 20 \end{bmatrix}_{3 \times 1}
+
\begin{bmatrix} 0 & 1 & 2 \end{bmatrix}_{1 \times 3}
=
\begin{bmatrix}
0+0 & 0+1 & 0+2 \\\
10+0 & 10+1 & 10+2 \\\
20+0 & 20+1 & 20+2
\end{bmatrix}_{3 \times 3}
=
\begin{bmatrix}
0 & 1 & 2 \\\
10 & 11 & 12 \\\
20 & 21 & 22
\end{bmatrix}
$$


### General Syntax

Both NumPy and PyTorch follow identical broadcasting rules!

| Operation | NumPy Example | PyTorch Example | Documentation |
|-----------|---------------|-----------------|---------------|
| **Scalar + Array** | `arr + 5` | `tensor + 5` | [NumPy](https://numpy.org/doc/stable/user/basics.broadcasting.html) \| [PyTorch](https://pytorch.org/docs/stable/notes/broadcasting.html) |
| **Array + Vector** | `(m,n) + (n,)` → `(m,n)` | `(m,n) + (n,)` → `(m,n)` | Same as above |
| **Column + Row** | `(m,1) + (1,n)` → `(m,n)` | `(m,1) + (1,n)` → `(m,n)` | Same as above |
| **Automatic expansion** | Dimensions of size 1 are stretched | Dimensions of size 1 are stretched | Same as above |

**Broadcasting Rules:**
1. If arrays have different ranks, prepend 1s to the shape of the lower-rank array
2. Arrays are compatible when for each dimension, sizes match OR one size is 1
3. After broadcasting, each array behaves as if it had the larger shape
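
The three rules can be written out by hand. The helper `broadcast_shape` below is a hypothetical function written for this note (it is not part of either library); it is compared against the built-in `np.broadcast_shapes` and `torch.broadcast_shapes`.

```python
import numpy as np
import torch

def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: apply the broadcasting rules by hand."""
    ndim = max(len(shape_a), len(shape_b))
    # Rule 1: left-pad the shorter shape with 1s.
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    result = []
    for size_a, size_b in zip(a, b):
        # Rule 2: sizes must match, or one of them must be 1.
        if size_a != size_b and size_a != 1 and size_b != 1:
            raise ValueError(f"incompatible shapes {shape_a} and {shape_b}")
        # Rule 3: a size-1 dimension is stretched to the other size.
        result.append(size_b if size_a == 1 else size_a)
    return tuple(result)

for sa, sb in [((4, 1), (1, 3)), ((5, 3), (3,)), ((2, 1, 4), (3, 1))]:
    print(sa, sb, "->", broadcast_shape(sa, sb),
          np.broadcast_shapes(sa, sb), tuple(torch.broadcast_shapes(sa, sb)))

# (4, 3) and (4,) are NOT compatible: aligned from the right, 3 != 4.
try:
    broadcast_shape((4, 3), (4,))
except ValueError as err:
    print("Error:", err)
```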


In [53]:
# Broadcasting example: (4,1) + (1,3) → (4,3)
np_X = np.arange(4).reshape(-1, 1) * 10  # shape (4,1)
np_Y = np.arange(3).reshape(1, -1)        # shape (1,3)

pt_X = torch.arange(4).reshape(-1, 1) * 10
pt_Y = torch.arange(3).reshape(1, -1)

print(f"X shape: {np_X.shape}")
print(np_X)
print(f"\nY shape: {np_Y.shape}")
print(np_Y)

print("\nX + Y (broadcasting):")
print(np_X + np_Y)
print(pt_X + pt_Y)

# Creating matrix A[i,j] = i*j using broadcasting
np_i = np.arange(5).reshape(-1, 1)  # column vector
np_j = np.arange(10).reshape(1, -1)  # row vector
np_A = np_i * np_j

pt_i = torch.arange(5).reshape(-1, 1)
pt_j = torch.arange(10).reshape(1, -1)
pt_A = pt_i * pt_j

print("\nMatrix A[i,j] = i*j:")
print(np_A)
print(pt_A)


X shape: (4, 1)
[[ 0]
 [10]
 [20]
 [30]]

Y shape: (1, 3)
[[0 1 2]]

X + Y (broadcasting):
[[ 0  1  2]
 [10 11 12]
 [20 21 22]
 [30 31 32]]
tensor([[ 0,  1,  2],
        [10, 11, 12],
        [20, 21, 22],
        [30, 31, 32]])

Matrix A[i,j] = i*j:
[[ 0  0  0  0  0  0  0  0  0  0]
 [ 0  1  2  3  4  5  6  7  8  9]
 [ 0  2  4  6  8 10 12 14 16 18]
 [ 0  3  6  9 12 15 18 21 24 27]
 [ 0  4  8 12 16 20 24 28 32 36]]
tensor([[ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
        [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
        [ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18],
        [ 0,  3,  6,  9, 12, 15, 18, 21, 24, 27],
        [ 0,  4,  8, 12, 16, 20, 24, 28, 32, 36]])


## 6. Matrix Operations

> **TIME: 4:14–4:15 PM (Summary)**


### General Syntax


### 📐 Mathematical Definitions

#### **Transpose**

For $\mathbf{A} \in \mathbb{R}^{m \times n}$, the transpose $\mathbf{A}^T \in \mathbb{R}^{n \times m}$:

$$
(\mathbf{A}^T)_{ij} = \mathbf{A}_{ji}
$$

#### **Matrix Multiplication** (@ operator)

For $\mathbf{A} \in \mathbb{R}^{m \times n}$ and $\mathbf{B} \in \mathbb{R}^{n \times p}$:

$$
(\mathbf{A} \mathbf{B})_{ij} = \sum_{k=1}^{n} \mathbf{A}_{ik} \mathbf{B}_{kj}
$$
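
To connect the formula to code, here is a deliberately slow triple loop that follows the sum term by term and is checked against `@`. The function name `matmul_by_definition` is made up for this illustration; in practice you would always use `@`.

```python
import numpy as np

def matmul_by_definition(A, B):
    """Illustrative only: C[i, j] = sum over k of A[i, k] * B[k, j]."""
    m, n = A.shape
    n_b, p = B.shape
    if n != n_b:
        raise ValueError("inner dimensions must match")
    C = np.zeros((m, p))
    for i in range(m):
        for j in range(p):
            for k in range(n):
                C[i, j] += A[i, k] * B[k, j]
    return C

A = np.arange(6, dtype=float).reshape(2, 3)    # shape (2, 3)
B = np.arange(12, dtype=float).reshape(3, 4)   # shape (3, 4)
C = matmul_by_definition(A, B)

print(C.shape)                  # (2, 4)
print(np.allclose(C, A @ B))    # True
```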

#### **Element-wise Multiplication** (* operator)

For $\mathbf{A}, \mathbf{B} \in \mathbb{R}^{m \times n}$ (same dimensions):

$$
(\mathbf{A} \odot \mathbf{B})_{ij} = \mathbf{A}_{ij} \cdot \mathbf{B}_{ij}
$$

Also called **Hadamard product** or **element-wise product**.
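
Since `*` and `@` are easy to mix up, here is a small example with hand-checkable numbers (the matrices are arbitrary):

```python
import numpy as np
import torch

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[5.0, 6.0], [7.0, 8.0]])

hadamard = A * B   # element-wise: [[5, 12], [21, 32]]
matmul = A @ B     # matrix product: [[19, 22], [43, 50]]
print(hadamard)
print(matmul)

# PyTorch uses the same two operators with the same meaning.
A_pt, B_pt = torch.tensor(A), torch.tensor(B)
hadamard_pt = A_pt * B_pt
matmul_pt = A_pt @ B_pt
print(hadamard_pt)
print(matmul_pt)
```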

#### **Dot Product** (for vectors)

For vectors $\mathbf{a}, \mathbf{b} \in \mathbb{R}^{n}$:

$$
\mathbf{a} \cdot \mathbf{b} = \sum_{i=1}^{n} a_i b_i = \mathbf{a}^T \mathbf{b}
$$

#### **Matrix Inverse**

For a square matrix $\mathbf{A} \in \mathbb{R}^{n \times n}$, if it exists, $\mathbf{A}^{-1}$ satisfies:

$$
\mathbf{A} \mathbf{A}^{-1} = \mathbf{A}^{-1} \mathbf{A} = \mathbf{I}_n
$$

where $\mathbf{I}_n$ is the $n \times n$ identity matrix.
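
A short check of the definition, using an arbitrary invertible $2 \times 2$ matrix. The sketch also solves $\mathbf{A}\mathbf{x} = \mathbf{b}$ two ways; calling `solve` directly is generally preferred over forming the inverse explicitly.

```python
import numpy as np
import torch

A = np.array([[4.0, 1.0], [2.0, 3.0]])
b = np.array([1.0, 2.0])

A_inv = np.linalg.inv(A)
print(np.allclose(A @ A_inv, np.eye(2)))   # True
print(np.allclose(A_inv @ A, np.eye(2)))   # True

# Solving A x = b: via the inverse, and via a linear solver.
x_inv = A_inv @ b
x_solve = np.linalg.solve(A, b)
print(x_inv, x_solve)                      # both approximately [0.1, 0.6]

# The PyTorch calls mirror the NumPy ones.
A_pt, b_pt = torch.tensor(A), torch.tensor(b)
x_pt = torch.linalg.solve(A_pt, b_pt)
print(torch.allclose(torch.linalg.inv(A_pt) @ b_pt, x_pt))
```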

#### **Determinant**

For a square matrix $\mathbf{A} \in \mathbb{R}^{n \times n}$:

$$
\det(\mathbf{A}) = \sum_{\sigma \in S_n} \text{sgn}(\sigma) \prod_{i=1}^{n} \mathbf{A}_{i,\sigma(i)}
$$

For $2 \times 2$ matrices: $\det\begin{bmatrix} a & b \\\\ c & d \end{bmatrix} = ad - bc$
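
The $2 \times 2$ formula is easy to verify numerically. The matrices below are arbitrary; the second one has linearly dependent rows, so its determinant is zero and inversion fails.

```python
import numpy as np
import torch

A = np.array([[3.0, 8.0], [4.0, 6.0]])
by_hand = 3.0 * 6.0 - 8.0 * 4.0            # ad - bc = -14
det_np = np.linalg.det(A)
det_pt = torch.linalg.det(torch.tensor(A)).item()
print(by_hand, det_np, det_pt)             # all approximately -14

# Second row is 2x the first row: determinant 0, no inverse.
S = np.array([[1.0, 2.0], [2.0, 4.0]])
det_S = np.linalg.det(S)
try:
    np.linalg.inv(S)
    inverse_failed = False
except np.linalg.LinAlgError:
    inverse_failed = True
print(det_S, inverse_failed)
```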

#### **Eigenvalues and Eigenvectors**

For a square matrix $\mathbf{A} \in \mathbb{R}^{n \times n}$, $\lambda$ is an eigenvalue and $\mathbf{v}$ is an eigenvector if:

$$
\mathbf{A} \mathbf{v} = \lambda \mathbf{v}
$$

where $\mathbf{v} \neq \mathbf{0}$.
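
The defining equation can be checked directly. In this sketch the symmetric matrix is chosen so that the eigenvalues are 1 and 3; `eig` returns the eigenvectors as the *columns* of its second output. Note that `torch.linalg.eig` returns complex tensors even when the eigenvalues are real.

```python
import numpy as np
import torch

A = np.array([[2.0, 1.0], [1.0, 2.0]])

vals, vecs = np.linalg.eig(A)
print(np.sort(vals))                        # approximately [1, 3]

# Column k of vecs satisfies A v = lambda_k v.
# vecs * vals scales column k by vals[k] (broadcasting over rows).
print(np.allclose(A @ vecs, vecs * vals))   # True

vals_pt, vecs_pt = torch.linalg.eig(torch.tensor(A, dtype=torch.float32))
print(vals_pt.dtype)                        # a complex dtype
print(torch.sort(vals_pt.real).values)      # approximately [1, 3]
```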

---

| Operation | NumPy | PyTorch | Documentation |
|-----------|-------|---------|---------------|
| **Transpose** | `arr.T` or `np.transpose(arr)` | `tensor.T` or `torch.t(tensor)` | [NumPy](https://numpy.org/doc/stable/reference/generated/numpy.transpose.html) \| [PyTorch](https://pytorch.org/docs/stable/generated/torch.t.html) |
| **Matrix multiply** | `A @ B` or `np.matmul(A, B)` | `A @ B` or `torch.matmul(A, B)` | [NumPy](https://numpy.org/doc/stable/reference/generated/numpy.matmul.html) \| [PyTorch](https://pytorch.org/docs/stable/generated/torch.matmul.html) |
| **Element-wise multiply** | `A * B` or `np.multiply(A, B)` | `A * B` or `torch.mul(A, B)` | [NumPy](https://numpy.org/doc/stable/reference/generated/numpy.multiply.html) \| [PyTorch](https://pytorch.org/docs/stable/generated/torch.mul.html) |
| **Dot product** | `np.dot(a, b)` | `torch.dot(a, b)` | [NumPy](https://numpy.org/doc/stable/reference/generated/numpy.dot.html) \| [PyTorch](https://pytorch.org/docs/stable/generated/torch.dot.html) |
| **Matrix inverse** | `np.linalg.inv(A)` | `torch.linalg.inv(A)` | [NumPy](https://numpy.org/doc/stable/reference/generated/numpy.linalg.inv.html) \| [PyTorch](https://pytorch.org/docs/stable/generated/torch.linalg.inv.html) |
| **Determinant** | `np.linalg.det(A)` | `torch.linalg.det(A)` | [NumPy](https://numpy.org/doc/stable/reference/generated/numpy.linalg.det.html) \| [PyTorch](https://pytorch.org/docs/stable/generated/torch.linalg.det.html) |
| **Eigenvalues and eigenvectors** | `np.linalg.eig(A)` | `torch.linalg.eig(A)` | [NumPy](https://numpy.org/doc/stable/reference/generated/numpy.linalg.eig.html) \| [PyTorch](https://pytorch.org/docs/stable/generated/torch.linalg.eig.html) |



In [93]:
# Transpose
np_A = np.arange(15).reshape(5, 3)
pt_A = torch.arange(15).reshape(5, 3)
print("Original (5x3):")
print(np_A)
print("\nTranspose (3x5):")
print(np_A.T)
print(pt_A.T)

Original (5x3):
[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]
 [12 13 14]]

Transpose (3x5):
[[ 0  3  6  9 12]
 [ 1  4  7 10 13]
 [ 2  5  8 11 14]]
tensor([[ 0,  3,  6,  9, 12],
        [ 1,  4,  7, 10, 13],
        [ 2,  5,  8, 11, 14]])


In [None]:
# Matrix multiplication: @ operator
np_Mat = np.random.randn(10, 10)
pt_Mat = torch.randn(10, 10)
np_vec = np.arange(10).reshape(10, 1)
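# Use float32 so the dtype matches pt_Mat; PyTorch's @ does not mix float and integer tensors
pt_vec = torch.arange(10, dtype=torch.float32).reshape(10, 1)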

print("\nMatrix multiplication (10,10) @ (10,1):")
np_result = np_Mat @ np_vec
pt_result = pt_Mat @ pt_vec
print(f"Result shapes: {np_result.shape}, {pt_result.shape}")


## 7. Reduction Operations

> **TIME: 4:14–4:15 PM (Summary)**



### 📐 Mathematical Definitions

#### **Sum**

For a matrix $\mathbf{A} \in \mathbb{R}^{m \times n}$:

- **Total sum:** $\sum_{i=1}^{m} \sum_{j=1}^{n} \mathbf{A}_{ij}$

- **Sum along axis 0 (collapse rows):** $\sum_{i=1}^{m} \mathbf{A}_{ij}$ for each $j$ → result shape: $(n,)$

- **Sum along axis 1 (collapse columns):** $\sum_{j=1}^{n} \mathbf{A}_{ij}$ for each $i$ → result shape: $(m,)$
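
A small shape check helps make "collapse" concrete. The sketch also shows `keepdims=True` (NumPy) and `keepdim=True` (PyTorch), which keep the reduced axis with size 1 so the result broadcasts back against the original matrix. The matrix is arbitrary.

```python
import numpy as np
import torch

X = np.arange(1, 13, dtype=float).reshape(3, 4)   # m=3 rows, n=4 columns

col_sums = X.sum(axis=0)                    # shape (4,): one sum per column
row_sums = X.sum(axis=1)                    # shape (3,): one sum per row
row_sums_2d = X.sum(axis=1, keepdims=True)  # shape (3, 1)
print(col_sums.shape, row_sums.shape, row_sums_2d.shape)

# (3, 4) / (3, 1) broadcasts, so each row is divided by its own sum.
row_normalized = X / row_sums_2d
print(row_normalized.sum(axis=1))           # each row now sums to 1

X_pt = torch.tensor(X)
row_sums_pt = X_pt.sum(dim=1, keepdim=True)  # note: keepdim, not keepdims
print(row_sums_pt.shape)                     # torch.Size([3, 1])
```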

#### **Mean (Average)**

$$
\text{mean}(\mathbf{A}) = \frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} \mathbf{A}_{ij}
$$

#### **Minimum and Maximum**

$$
\min(\mathbf{A}) = \min_{i,j} \mathbf{A}_{ij}, \quad \max(\mathbf{A}) = \max_{i,j} \mathbf{A}_{ij}
$$
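
One API difference to be aware of when reducing along an axis: `np.max(..., axis=...)` returns only the values, whereas `torch.max(..., dim=...)` returns both the values and their positions. A minimal sketch with a made-up matrix:

```python
import numpy as np
import torch

np_X = np.array([[3, 9, 1], [7, 2, 8]])
pt_X = torch.tensor([[3, 9, 1], [7, 2, 8]])

# NumPy: separate calls for the values and their positions.
np_max = np_X.max(axis=1)        # values 9 and 8
np_argmax = np_X.argmax(axis=1)  # positions 1 and 2
print(np_max, np_argmax)

# PyTorch: a single call returns a (values, indices) pair.
result = pt_X.max(dim=1)
print(result.values, result.indices)

# Without dim, both libraries return the single overall maximum.
print(np_X.max(), pt_X.max().item())
```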

#### **Cumulative Sum**

For a vector $\mathbf{a} = [a_1, a_2, \ldots, a_n]$:

$$
\text{cumsum}(\mathbf{a}) = \left[a_1, \, a_1 + a_2, \, a_1 + a_2 + a_3, \, \ldots, \, \sum_{i=1}^{n} a_i\right]
$$

---

Aggregate over axes


In [None]:
np_X = np.arange(50).reshape(10, 5)
pt_X = torch.arange(50).reshape(10, 5)

print("Sum of all elements:")
print(np.sum(np_X), np_X.sum())
print(torch.sum(pt_X), pt_X.sum())

print("\nSum along axis 0 (collapse rows):")
print(np.sum(np_X, axis=0))
print(torch.sum(pt_X, dim=0))

print("\nSum along axis 1 (collapse columns):")
print(np.sum(np_X, axis=1))
print(torch.sum(pt_X, dim=1))

print("\nMin, Max, Mean:")
print(f"NumPy: min={np_X.min()}, max={np_X.max()}, mean={np_X.mean():.2f}")
print(f"PyTorch: min={pt_X.min()}, max={pt_X.max()}, mean={pt_X.float().mean():.2f}")

# Cumulative sum
np_Y = np.arange(10)
pt_Y = torch.arange(10)
print("\nCumulative sum:")
print(np.cumsum(np_Y))
print(torch.cumsum(pt_Y, dim=0))


## 8. More Element-wise Operations

### General Syntax


### 📐 Mathematical Definitions

All operations below are **element-wise** (applied to each element independently).

For matrices $\mathbf{A}, \mathbf{B} \in \mathbb{R}^{m \times n}$ and scalar $c$:

#### **Arithmetic Operations**

$$
\begin{align*}
\text{Addition:} \quad & (\mathbf{A} + c)_{ij} = \mathbf{A}_{ij} + c \\\
\text{Subtraction:} \quad & (\mathbf{A} - c)_{ij} = \mathbf{A}_{ij} - c \\\
\text{Multiplication:} \quad & (\mathbf{A} \cdot c)_{ij} = c \cdot \mathbf{A}_{ij} \\\
\text{Division:} \quad & (\mathbf{A} / c)_{ij} = \mathbf{A}_{ij} / c \\\
\text{Power:} \quad & (\mathbf{A}^c)_{ij} = (\mathbf{A}_{ij})^c
\end{align*}
$$

#### **Mathematical Functions**

$$
\begin{align*}
\text{Square root:} \quad & (\sqrt{\mathbf{A}})_{ij} = \sqrt{\mathbf{A}_{ij}} \\\
\text{Exponential:} \quad & (e^{\mathbf{A}})_{ij} = e^{\mathbf{A}_{ij}} \\\
\text{Natural log:} \quad & (\ln \mathbf{A})_{ij} = \ln(\mathbf{A}_{ij}) \\\
\text{Absolute value:} \quad & (|\mathbf{A}|)_{ij} = |\mathbf{A}_{ij}| \\\
\text{Trigonometric:} \quad & (\sin \mathbf{A})_{ij} = \sin(\mathbf{A}_{ij})
\end{align*}
$$

---

| Operation | NumPy | PyTorch | Documentation |
|-----------|-------|---------|---------------|
| **Addition** | `arr + 5` or `np.add(arr, 5)` | `tensor + 5` or `torch.add(tensor, 5)` | [NumPy](https://numpy.org/doc/stable/reference/generated/numpy.add.html) \| [PyTorch](https://pytorch.org/docs/stable/generated/torch.add.html) |
| **Subtraction** | `arr - 5` or `np.subtract(arr, 5)` | `tensor - 5` or `torch.sub(tensor, 5)` | [NumPy](https://numpy.org/doc/stable/reference/generated/numpy.subtract.html) \| [PyTorch](https://pytorch.org/docs/stable/generated/torch.sub.html) |
| **Multiplication** | `arr * 2` or `np.multiply(arr, 2)` | `tensor * 2` or `torch.mul(tensor, 2)` | [NumPy](https://numpy.org/doc/stable/reference/generated/numpy.multiply.html) \| [PyTorch](https://pytorch.org/docs/stable/generated/torch.mul.html) |
| **Division** | `arr / 2` or `np.divide(arr, 2)` | `tensor / 2` or `torch.div(tensor, 2)` | [NumPy](https://numpy.org/doc/stable/reference/generated/numpy.divide.html) \| [PyTorch](https://pytorch.org/docs/stable/generated/torch.div.html) |
| **Power** | `arr ** 2` or `np.power(arr, 2)` | `tensor ** 2` or `torch.pow(tensor, 2)` | [NumPy](https://numpy.org/doc/stable/reference/generated/numpy.power.html) \| [PyTorch](https://pytorch.org/docs/stable/generated/torch.pow.html) |
| **Square root** | `np.sqrt(arr)` | `torch.sqrt(tensor)` | [NumPy](https://numpy.org/doc/stable/reference/generated/numpy.sqrt.html) \| [PyTorch](https://pytorch.org/docs/stable/generated/torch.sqrt.html) |
| **Exponential** | `np.exp(arr)` | `torch.exp(tensor)` | [NumPy](https://numpy.org/doc/stable/reference/generated/numpy.exp.html) \| [PyTorch](https://pytorch.org/docs/stable/generated/torch.exp.html) |
| **Logarithm** | `np.log(arr)` | `torch.log(tensor)` | [NumPy](https://numpy.org/doc/stable/reference/generated/numpy.log.html) \| [PyTorch](https://pytorch.org/docs/stable/generated/torch.log.html) |
| **Absolute** | `np.abs(arr)` | `torch.abs(tensor)` | [NumPy](https://numpy.org/doc/stable/reference/generated/numpy.abs.html) \| [PyTorch](https://pytorch.org/docs/stable/generated/torch.abs.html) |
| **Trigonometric** | `np.sin(arr)`, `np.cos(arr)`, etc. | `torch.sin(tensor)`, `torch.cos(tensor)`, etc. | [NumPy trig](https://numpy.org/doc/stable/reference/routines.math.html) \| [PyTorch trig](https://pytorch.org/docs/stable/torch.html#pointwise-ops) |
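
One thing to watch with integer arrays: `/` is true division and produces floating-point results, while `//` is floor division and keeps integers. The sketch below just prints the resulting dtypes so you can confirm this on your own setup.

```python
import numpy as np
import torch

np_arr = np.arange(5)
pt_arr = torch.arange(5)

np_true, pt_true = np_arr / 2, pt_arr / 2        # true division
np_floor, pt_floor = np_arr // 2, pt_arr // 2    # floor division

print(np_true.dtype, pt_true.dtype)     # floating-point dtypes
print(np_floor.dtype, pt_floor.dtype)   # integer dtypes
print(np_true, np_floor)
print(pt_true, pt_floor)
```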



In [None]:
np_arr = np.arange(10)
pt_arr = torch.arange(10)

# Vectorized operations (no loops needed!)
print("Original:", np_arr)
print("Squared (NumPy):", np_arr**2)
print("Squared (PyTorch):", pt_arr**2)

# Element-wise functions
print("\nSquare root:")
print(np.sqrt(np_arr.astype(float)))
print(torch.sqrt(pt_arr.float()))

print("\nExponential:")
print(np.exp(np_arr[:5]))
print(torch.exp(pt_arr[:5].float()))


## 9. GPU Acceleration (PyTorch)

PyTorch makes it easy to move tensors to GPU

### General Syntax

**NumPy**: CPU-only (no built-in GPU support)

**PyTorch**: Full GPU support!

| Operation | PyTorch | Documentation |
|-----------|---------|---------------|
| **Check GPU availability** | `torch.cuda.is_available()` | [CUDA semantics](https://pytorch.org/docs/stable/notes/cuda.html) |
| **Create tensor on GPU** | `torch.tensor(data, device='cuda')` | [Tensor creation](https://pytorch.org/docs/stable/torch.html#creation-ops) |
| **Move tensor to GPU** | `tensor.to('cuda')` or `tensor.cuda()` | [to method](https://pytorch.org/docs/stable/generated/torch.Tensor.to.html) |
| **Move tensor to CPU** | `tensor.to('cpu')` or `tensor.cpu()` | Same as above |
| **Get current device** | `tensor.device` | [device](https://pytorch.org/docs/stable/tensor_attributes.html#torch.torch.device) |
| **Specify device** | `device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')` | [torch.device](https://pytorch.org/docs/stable/tensor_attributes.html#torch.torch.device) |
| **Apple Silicon (MPS)** | `torch.backends.mps.is_available()`, `tensor.to('mps')` | [MPS backend](https://pytorch.org/docs/stable/notes/mps.html) |

**Note**: All tensors involved in an operation must be on the same device! Mixing, say, a CPU tensor with a CUDA tensor typically raises a `RuntimeError`.
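
Putting the table together, here is a minimal device-agnostic sketch. It falls back to the CPU when no GPU is present, so it should run on any machine; the tensor values are arbitrary.

```python
import torch

# Pick the best available device, falling back to CPU.
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")
print(f"Using device: {device}")

# Option 1: create on CPU, then move. Option 2: create directly on the device.
x = torch.arange(6, dtype=torch.float32).reshape(2, 3).to(device)
w = torch.ones(3, 2, device=device)

y = x @ w                 # both operands live on the same device
print(y.device)

# Move the result back to the CPU before converting to NumPy.
y_np = y.cpu().numpy()
print(y_np)               # rows sum to 3 and 12
```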



## Additional Math Operations

Both libraries provide extensive mathematical functions beyond basic arithmetic.


### 📊 Why NumPy and PyTorch Are Essential

| Operation | Python Lists | NumPy/PyTorch | Typical speedup (rough) |
|-----------|-------------|---------------|---------|
| Element-wise ops | `[x**2 for x in data]` | `data**2` | **10-100x** |
| Reductions | `sum(data)` | `data.sum()` | **5-50x** |
| Matrix ops | Nested loops | `A @ B` | **100-1000x or more** |
| GPU compute | ❌ Not possible | ✅ `.to("cuda")` | **Workload-dependent, can reach 1000x+** |

These are order-of-magnitude figures: actual speedups vary with array size, dtype, and hardware.

**Key Advantages:**
1. **Vectorization**: Operations on entire arrays at once (no Python loops!)
2. **Optimized C/C++ code**: The heavy lifting in NumPy/PyTorch runs in compiled code, not in the Python interpreter
3. **Memory efficiency**: Contiguous memory, less overhead than Python lists
4. **Hardware acceleration**: Can leverage GPUs, SIMD instructions, etc.
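
The `%timeit` magic only works inside IPython/Jupyter. If you want to check the vectorization claim from a plain Python script, the standard-library `timeit` module is one option. This is a rough sketch; the repetition count of 20 is arbitrary and the absolute numbers will differ from machine to machine.

```python
import timeit
import numpy as np

n = 100_000
repeats = 20   # arbitrary; increase for more stable timings
data_py = list(range(n))
data_np = np.arange(n, dtype=np.int64)

# Total time for `repeats` runs of each version.
t_py = timeit.timeit(lambda: [x**2 for x in data_py], number=repeats)
t_np = timeit.timeit(lambda: data_np**2, number=repeats)

print(f"List comprehension: {t_py / repeats * 1e3:.3f} ms per loop")
print(f"NumPy vectorized:   {t_np / repeats * 1e3:.3f} ms per loop")

# Both approaches compute the same values.
squares_py = [x**2 for x in data_py]
squares_np = data_np**2
print(squares_py[:5], squares_np[:5])
```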

**When to use Python lists:**
- Small data (< 100 elements)
- Mixed data types (e.g., `[1, "hello", 3.14]`)
- Simple scripting where performance doesn't matter

**When to use NumPy/PyTorch:**
- Numerical computing (always!)
- Large datasets
- Machine learning / scientific computing
- Any performance-critical code


### ⏱️ Performance Comparison: Python vs NumPy vs PyTorch

Let's measure the actual performance difference!


In [94]:
# Test 1: Element-wise operations
print("=" * 70)
print("TEST 1: Element-wise Squaring (100,000 elements)")
print("=" * 70)
n = 100_000

print("\n1️⃣  Python list comprehension:")
data_py = list(range(n))
%timeit [x**2 for x in data_py]

print("\n2️⃣  NumPy vectorized:")
data_np = np.arange(n)
%timeit data_np**2

print("\n3️⃣  PyTorch vectorized:")
data_pt = torch.arange(n)
%timeit data_pt**2

# Test 2: Sum reduction
print("\n" + "=" * 70)
print("TEST 2: Sum of All Elements (100,000 elements)")
print("=" * 70)

print("\n1️⃣  Python sum():")
%timeit sum(data_py)

print("\n2️⃣  NumPy sum():")
%timeit data_np.sum()

print("\n3️⃣  PyTorch sum():")
%timeit data_pt.sum()

# Test 3: Matrix multiplication
print("\n" + "=" * 70)
print("TEST 3: Matrix Multiplication")
print("=" * 70)

# Create smaller matrices for Python (to avoid waiting too long)
print("\n⚠️  Note: Python nested loops are VERY slow, using 100×100")
size_small = 100
a_py = [[float(i*size_small + j) for j in range(size_small)] for i in range(size_small)]
b_py = [[float(i*size_small + j) for j in range(size_small)] for i in range(size_small)]

print(f"\n1️⃣  Python nested loops ({size_small}×{size_small}):")
%timeit result = [[sum(a_py[i][k] * b_py[k][j] for k in range(size_small)) for j in range(size_small)] for i in range(size_small)]

# NumPy and PyTorch can handle larger matrices
size = 500
a_np = np.arange(size*size, dtype=float).reshape(size, size)
b_np = np.arange(size*size, dtype=float).reshape(size, size)

print(f"\n2️⃣  NumPy @ operator ({size}×{size}):")
%timeit a_np @ b_np

a_pt = torch.arange(size*size, dtype=torch.float32).reshape(size, size)
b_pt = torch.arange(size*size, dtype=torch.float32).reshape(size, size)

print(f"\n3️⃣  PyTorch @ operator ({size}×{size}):")
%timeit a_pt @ b_pt

print("\n" + "=" * 70)
print("🚀 Key Takeaway: NumPy/PyTorch are 10-1000x faster!")
print("=" * 70)


TEST 1: Element-wise Squaring (100,000 elements)

1️⃣  Python list comprehension:
3.35 ms ± 48.7 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

2️⃣  NumPy vectorized:
40.6 μs ± 5.17 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

3️⃣  PyTorch vectorized:
87.3 μs ± 5.11 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

TEST 2: Sum of All Elements (100,000 elements)

1️⃣  Python sum():
278 μs ± 864 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

2️⃣  NumPy sum():
10.8 μs ± 446 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

3️⃣  PyTorch sum():
46.8 μs ± 323 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

TEST 3: Matrix Multiplication

⚠️  Note: Python nested loops are VERY slow, using 100×100

1️⃣  Python nested loops (100×100):
48.2 ms ± 4.01 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

2️⃣  NumPy @ operator (500×500):
915 μs ± 56.2 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

3️⃣  PyTorch @ operator (500×500):

## Summary

**NumPy** is the foundation for numerical computing in Python:
- Fast array operations
- Rich ecosystem of scientific libraries
- CPU-only

**PyTorch** extends the NumPy paradigm with:
- GPU acceleration for massive speedups
- Automatic differentiation for machine learning
- Nearly identical API for basic operations

**Best practice:** Use NumPy for general scientific computing and data analysis. Use PyTorch when you need gradients or GPU acceleration (especially for deep learning).

Both libraries are essential tools in modern scientific Python!
