# PyTorch Fundamentals

### How to import PyTorch:

In [311]:
import torch
torch.__version__

'2.7.1'

## Tensors

A **tensor** is a generalization of scalars, vectors, and matrices:

| Tensor Type | Description              | Example Shape |
|-------------|--------------------------|---------------|
| Scalar      | Single number            | `()`          |
| Vector      | 1D array of numbers      | `(n,)`        |
| Matrix      | 2D array of numbers      | `(m, n)`      |
| Tensor      | N-dimensional array      | `(d1, d2, ..., dn)` |

Tensors can be **1D, 2D, 3D, or higher-dimensional** arrays used to represent data in machine learning and deep learning.

<img src="imgs/scalar-vector-matrix-tensor.png" alt="Tensor visualization" width="600"/>

<img src="imgs/scalar-vector-matrix-tensor2.png" alt="Tensor visualization" width="600"/>

### Creating Tensors

In [312]:
# Scalar
scalar = torch.tensor(7)
scalar

tensor(7)

In [313]:
# Vector
vector = torch.tensor([7, 7])
vector

tensor([7, 7])

In [314]:
# Matrix
MATRIX = torch.tensor([[7, 8], 
                       [9, 10]])
MATRIX

tensor([[ 7,  8],
        [ 9, 10]])

In [315]:
# Tensor
TENSOR = torch.tensor([[[1, 2, 3],
                        [3, 6, 9],
                        [2, 4, 5]]])
TENSOR

tensor([[[1, 2, 3],
         [3, 6, 9],
         [2, 4, 5]]])
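
The rows of the table at the top of this section map onto two attributes: `ndim` (the number of dimensions) and `shape` (the size of each dimension). A minimal sketch that rebuilds the four tensors from the cells above and inspects them; the lowercase variable names are only illustrative:

```python
import torch

scalar = torch.tensor(7)
vector = torch.tensor([7, 7])
matrix = torch.tensor([[7, 8], [9, 10]])
tensor3d = torch.tensor([[[1, 2, 3], [3, 6, 9], [2, 4, 5]]])

for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("tensor3d", tensor3d)]:
    # ndim matches the nesting depth of the square brackets
    print(f"{name}: ndim={t.ndim}, shape={tuple(t.shape)}")

# A 0-dimensional tensor can be converted back to a plain Python number
print(scalar.item())
```

A quick way to read `ndim` off a printed tensor is to count the opening square brackets at the start.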

### Random tensors

In practice, you rarely type out a model's tensors by hand.
Instead, a model starts with tensors of random numbers and adjusts them as it learns from data.

> Start with random numbers → Look at data → Update numbers → Repeat

As a data scientist, you control:
- Initialization (how the model starts),
- Representation (how it processes data),
- Optimization (how it updates).

To create a random tensor in PyTorch, use:

In [316]:
# Create a random tensor of size (3, 4)
random_tensor = torch.rand(size=(3, 4))
random_tensor, random_tensor.dtype

(tensor([[0.1057, 0.1151, 0.8125, 0.5191],
         [0.5090, 0.4442, 0.2110, 0.9835],
         [0.2455, 0.6827, 0.3296, 0.1682]]),
 torch.float32)

In [317]:
# Create a random tensor of size (224, 224, 3)
random_image_size_tensor = torch.rand(size=(224, 224, 3))
random_image_size_tensor.shape, random_image_size_tensor.ndim

(torch.Size([224, 224, 3]), 3)
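
The "start random, look at data, update, repeat" loop described in this section can be sketched with nothing more than tensor arithmetic. The toy data (`y = 2 * x`), the learning rate of `0.01`, and the hand-written gradient are assumptions made for illustration; real models rely on PyTorch's automatic differentiation and optimizers instead.

```python
import torch

torch.manual_seed(0)

# Toy data: the relationship to discover is y = 2 * x
x = torch.tensor([1.0, 2.0, 3.0, 4.0])
y = 2 * x

# 1. Start with a random number
w = torch.rand(1)
initial_error = ((w * x - y) ** 2).mean().item()

for step in range(100):
    # 2. Look at the data: how far off are the current predictions?
    error = w * x - y
    # Gradient of the mean squared error with respect to w, written by hand
    grad = (2 * error * x).mean()
    # 3. Update the number, then 4. repeat
    w = w - 0.01 * grad

final_error = ((w * x - y) ** 2).mean().item()
print(w)  # ends up close to 2
print(initial_error, final_error)
```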

### Zeros and ones tensors

In [318]:
# Create a tensor of all zeros
zeros = torch.zeros(size=(3, 4))
zeros, zeros.dtype

(tensor([[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]]),
 torch.float32)

In [319]:
# Create a tensor of all ones
ones = torch.ones(size=(3, 4))
ones, ones.dtype

(tensor([[1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.]]),
 torch.float32)

### Creating a range and tensors like another tensor

In [320]:
# Use torch.arange(), torch.range() is deprecated 
# Don't --> zero_to_ten_deprecated = torch.range(0, 10) # Note: this may return an error in the future

# Create a range of values from 0 up to (but not including) 10
zero_to_ten = torch.arange(start=0, end=10, step=1)
zero_to_ten

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [321]:
# You can also create a tensor of zeros modeled on another tensor
ten_zeros = torch.zeros_like(input=zero_to_ten) # same shape and dtype as the input
ten_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
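
Two details of the cells above are easy to miss: `end` is exclusive in `torch.arange()`, and the `*_like` functions copy the dtype as well as the shape. A short sketch, with illustrative variable names:

```python
import torch

# `end` is exclusive, so this stops at 8
evens = torch.arange(start=0, end=10, step=2)

# To include 10 itself, push `end` past it
zero_to_ten_inclusive = torch.arange(0, 11)

# ones_like copies shape and dtype from its input, just like zeros_like
ones_like_evens = torch.ones_like(evens)

print(evens)
print(zero_to_ten_inclusive)
print(ones_like_evens, ones_like_evens.dtype)
```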

### Tensor datatypes and device types

PyTorch supports many tensor datatypes; the choice trades numerical precision against speed and memory.
- Common float types:
    - `torch.float32` or `torch.float` (default): 32-bit float
    - `torch.float16` or `torch.half`: 16-bit (less precise, usually faster, half the memory)
    - `torch.float64` or `torch.double`: 64-bit (more precise, slower)
- Integer types: 8, 16, 32, or 64-bit (`torch.int8`, `torch.int16`, `torch.int32`, `torch.int64`)

💡 Higher precision = more detail, slower

💡 Lower precision = less detail, faster

- Device types:

    | Device        | Description                                  |
    |---------------|----------------------------------------------|
    | `'cpu'`       | Runs on the system’s processor (default)     |
    | `'cuda'`      | Runs on an NVIDIA GPU using CUDA             |
    | `'cuda:X'`    | Specific GPU (e.g. `'cuda:0'`, `'cuda:1'`)   |
    | `'mps'`       | Apple Silicon (M1/M2) GPU support via Metal  |
    | `'xla'`       | For TPUs (used in Google Cloud, via PyTorch/XLA) |
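
Both the datatype and the device can be chosen when a tensor is created. The sketch below assumes PyTorch's default dtype has not been changed and stays on `'cpu'` so it runs anywhere; the variable names are illustrative.

```python
import torch

# dtype and device can be set explicitly at creation time
float16_tensor = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float16, device="cpu")

# Without an explicit dtype, PyTorch infers one from the Python values
inferred_float = torch.tensor([3.0, 6.0, 9.0])  # Python floats -> torch.float32
inferred_int = torch.tensor([3, 6, 9])          # Python ints   -> torch.int64

# element_size() reports how many bytes each element occupies
for t in (float16_tensor, inferred_float, inferred_int):
    print(t.dtype, t.device, t.element_size())
```

The byte counts make the precision trade-off concrete: a `float16` tensor needs half the memory of a `float32` tensor of the same shape.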




### Getting Info from Tensors

After creating a tensor, you’ll often need to check:

- **Shape** – What is the tensor’s size?
- **Dtype** – What kind of data does it hold?
- **Device** – Where is it stored? (CPU or GPU)

In [322]:
some_tensor = torch.rand(3, 4)

print(some_tensor)
print(f"Shape: {some_tensor.shape}")
print(f"Dtype: {some_tensor.dtype}")
print(f"Device: {some_tensor.device}")

tensor([[0.1063, 0.8494, 0.1562, 0.8949],
        [0.7252, 0.2058, 0.2487, 0.7838],
        [0.2709, 0.9896, 0.4664, 0.8911]])
Shape: torch.Size([3, 4])
Dtype: torch.float32
Device: cpu


### Tensor Operations in PyTorch

In deep learning, data is represented as **tensors**. Models learn by applying many operations to them, such as:

- Addition (`+`)
- Subtraction (`-`)
- Multiplication (`*`)
- Division (`/`)
- Matrix multiplication (`@` or `torch.matmul()`)

These basic operations are the building blocks of neural networks.

In [323]:
tensor = torch.tensor([1, 2, 3])

print(tensor + 10)       # tensor([11, 12, 13])
print(tensor * 10)       # tensor([10, 20, 30])

print(tensor)            # original tensor unchanged
tensor = tensor - 10
print(tensor)            # tensor([-9, -8, -7])

# Using built-in functions
torch.multiply(tensor, 10)

tensor([11, 12, 13])
tensor([10, 20, 30])
tensor([1, 2, 3])
tensor([-9, -8, -7])


tensor([-90, -80, -70])

### Element-wise vs Matrix Multiplication



In [324]:
tensor = torch.tensor([1, 2, 3])

# Element-wise
print(tensor * tensor)                  # tensor([1, 4, 9])

# Matrix multiplication (for two 1D tensors this is the dot product)
print(torch.matmul(tensor, tensor))     # tensor(14)
tensor @ tensor                         # tensor(14)

tensor([1, 4, 9])
tensor(14)


tensor(14)

💡 Key rule: For matrix multiplication, inner dimensions must match.

Example:
- (2, 3) @ (3, 2) → ✅ results in (2, 2)
- (3, 2) @ (3, 2) → ❌ invalid

💡 Avoid manual loops for math operations — instead use `torch.matmul()` for speed and clarity.
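
The inner-dimension rule can be checked directly. This sketch uses random tensors with the shapes from the example above and shows that transposing one operand is a common way to repair a mismatch; the names `a`, `b`, and `c` are illustrative.

```python
import torch

a = torch.rand(2, 3)
b = torch.rand(3, 2)
c = torch.rand(3, 2)

# Inner dimensions match (3 and 3): the result takes the outer dimensions
valid_shape = (a @ b).shape
print(valid_shape)

# Inner dimensions differ (2 and 3): PyTorch raises a RuntimeError
mismatch_raised = False
try:
    c @ c
except RuntimeError as err:
    mismatch_raised = True
    print(f"Shape mismatch: {err}")

# Transposing one operand makes the inner dimensions line up
print((c @ c.T).shape)  # (3, 2) @ (2, 3) -> (3, 3)
print((c.T @ c).shape)  # (2, 3) @ (3, 2) -> (2, 2)
```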

### Tensor Aggregation (Min, Max, Mean, Sum)

Aggregation reduces a tensor to fewer values.

In [325]:
x = torch.arange(0, 100, 10)
# x = tensor([0, 10, 20, ..., 90])

print(x.min())                          # tensor(0)
print(x.max())                          # tensor(90)
print(x.type(torch.float32).mean())     # tensor(45.) (mean() needs a float dtype)
x.sum()                                 # tensor(450)

tensor(0)
tensor(90)
tensor(45.)


tensor(450)

You can also call the equivalent `torch.*` functions:

In [326]:
torch.min(x), torch.max(x), torch.mean(x.float()), torch.sum(x)

(tensor(0), tensor(90), tensor(45.), tensor(450))
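
Aggregation does not have to collapse everything into a single value. Passing `dim` reduces along one dimension only, which is one way to read "fewer values". A small sketch on a made-up 2×3 matrix:

```python
import torch

m = torch.tensor([[1., 2., 3.],
                  [4., 5., 6.]])

total = m.sum()            # every element -> 0-dim tensor
col_sums = m.sum(dim=0)    # collapse the rows    -> shape (3,)
row_means = m.mean(dim=1)  # collapse the columns -> shape (2,)

print(total)      # tensor(21.)
print(col_sums)   # tensor([5., 7., 9.])
print(row_means)  # tensor([2., 5.])
```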

### Positional Min/Max

To find the **index** of the highest or lowest value in a tensor, use:

- `torch.argmax()` → index of max value  
- `torch.argmin()` → index of min value

💡 Useful when you care about position, not just the value (e.g., classification tasks).

In [327]:
tensor = torch.arange(10, 100, 10)
# tensor: [10, 20, ..., 90]

print(tensor.argmax())      # 8
tensor.argmin()             # 0

tensor(8)


tensor(0)
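
To illustrate the classification use case mentioned above, the sketch below treats each row of a tensor as one sample's scores and picks the highest-scoring class per row. The scores and the class names are invented for the example.

```python
import torch

# Hypothetical model outputs: one row per sample, one column per class
scores = torch.tensor([[0.1, 2.5, 0.3],
                       [1.7, 0.2, 0.9]])
class_names = ["cat", "dog", "bird"]  # made-up labels

# dim=1 looks for the largest value within each row
predicted = scores.argmax(dim=1)

print(predicted)                                     # tensor([1, 0])
print([class_names[i] for i in predicted.tolist()])  # ['dog', 'cat']
```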

### Changing Tensor Datatype

Some operations, such as matrix multiplication, require both operands to have the same **dtype** (e.g., `float32`, `float16`); others promote mixed dtypes automatically.

Use `.type(dtype)` to convert:

In [328]:
tensor = torch.arange(10., 100., 10.)
print(tensor.dtype)  # torch.float32 (default)

# Convert to float16
tensor_float16 = tensor.type(torch.float16)
print(tensor_float16.dtype)

# Convert to int8
tensor_int8 = tensor.type(torch.int8)
print(tensor_int8.dtype)

torch.float32
torch.float16
torch.int8
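
Three dtype behaviours are worth knowing when converting. The sketch below shows them on made-up values: float-to-integer conversion truncates instead of rounding, element-wise arithmetic promotes mixed dtypes, and `mean()` rejects integer tensors (which is why the aggregation cell earlier converts to `float32` first).

```python
import torch

values = torch.tensor([-1.7, 1.7, 2.9])

# Float -> integer conversion drops the fractional part (no rounding)
as_int = values.type(torch.int32)
print(as_int)  # tensor([-1,  1,  2], dtype=torch.int32)

# Element-wise arithmetic promotes int8 * float32 to float32
mixed = torch.tensor([1, 2, 3], dtype=torch.int8) * torch.tensor([0.5, 0.5, 0.5])
print(mixed, mixed.dtype)

# mean() needs a floating-point input and raises on integer tensors
mean_failed = False
try:
    torch.arange(10).mean()
except RuntimeError as err:
    mean_failed = True
    print(f"mean() failed: {err}")
```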


### Reshaping & Dimension Operations

Tensors often need reshaping for model compatibility. Common operations:

| Method                  | Purpose                                                    |
|-------------------------|------------------------------------------------------------|
| `reshape()`             | Change shape (returns a view when possible, else a copy)   |
| `view()`                | Change shape (always shares data with the original)        |
| `stack([t1, t2], dim)`  | Stack tensors along a new dimension                        |
| `squeeze()`             | Remove all dims of size 1                                  |
| `unsqueeze(dim)`        | Add a dim of size 1 at the given position                  |
| `permute(dims)`         | Reorder dimensions (returns a view)                        |

In [329]:
x = torch.arange(1., 8.)                    # Shape: [7]
print(x.shape)
x_reshaped = x.reshape(1, 7)                # Shape: [1, 7]
print(x_reshaped.shape)
x_view = x.view(1, 7)                       # Shares data with x
print(x_view.shape)

x_stack = torch.stack([x]*4)                # Shape: [4, 7]
print(x_stack.shape)

x_squeezed = x_reshaped.squeeze()           # Shape: [7]
print(x_squeezed.shape)
x_unsqueezed = x_squeezed.unsqueeze(0)      # Shape: [1, 7]
print(x_unsqueezed.shape)

x_img = torch.rand(224, 224, 3)
print(x_img.shape)
x_permuted = x_img.permute(2, 0, 1)         # Shape: [3, 224, 224]
print(x_permuted.shape)

torch.Size([7])
torch.Size([1, 7])
torch.Size([1, 7])
torch.Size([4, 7])
torch.Size([7])
torch.Size([1, 7])
torch.Size([224, 224, 3])
torch.Size([3, 224, 224])
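
"Shares data" has a practical consequence: writing through a view also changes the original tensor. The sketch below demonstrates this, and also shows `reshape()` making a copy when the memory layout rules out a view. The small `(4, 5, 3)` tensor is an arbitrary stand-in for an image.

```python
import torch

x = torch.arange(1., 8.)
x_view = x.view(1, 7)

# Writing through the view changes the original as well
x_view[:, 0] = 5.
print(x)  # the first element is now 5.

# Both tensors point at the same underlying memory
same_memory = x_view.data_ptr() == x.data_ptr()
print(same_memory)

# permute() returns a view whose memory layout is no longer contiguous
img = torch.rand(4, 5, 3)
permuted = img.permute(2, 0, 1)
print(permuted.is_contiguous())

# Flattening this layout cannot be expressed as a view, so reshape() copies
flat = permuted.reshape(-1)
copied = flat.data_ptr() != img.data_ptr()
print(copied)
```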


### Indexing Tensors

You can select specific parts of a tensor using indexing — similar to Python lists or NumPy arrays.


In [330]:
x = torch.arange(1, 10).reshape(1, 3, 3)
print(x)
print('----')

# Shape: [1, 3, 3]
# [[[1, 2, 3],
#   [4, 5, 6],
#   [7, 8, 9]]]

print(x[0])         # -> 2D slice
print(x[0][0])      # -> 1D row
print(x[0][0][0])   # -> scalar (1)
print('----')

# Using colons (:) for slicing:
print(x[:, 0])      # -> first row of each batch
print(x[:, :, 1])   # -> second column
print(x[:, 1, 1])   # -> center element
print(x[0, 0, :])   # -> first row

tensor([[[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]])
----
tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])
tensor([1, 2, 3])
tensor(1)
----
tensor([[1, 2, 3]])
tensor([[2, 5, 8]])
tensor([5])
tensor([1, 2, 3])
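
A few more indexing patterns on the same tensor, as a sketch: comma-separated indices select the same element as chained brackets, negative indices count from the end, and ranges select sub-blocks.

```python
import torch

x = torch.arange(1, 10).reshape(1, 3, 3)

# Chained and comma-separated indexing reach the same element
chained = x[0][2][2]
combined = x[0, 2, 2]
print(chained, combined)  # both tensor(9)

last_row = x[0, -1]         # negative index counts from the end
last_col = x[0, :, -1]      # last column of the 3x3 block
top_left = x[0, :2, :2]     # top-left 2x2 block

print(last_row)   # tensor([7, 8, 9])
print(last_col)   # tensor([3, 6, 9])
print(top_left)
```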


### PyTorch Tensors & NumPy

PyTorch and NumPy interoperate easily:

| Conversion                   | Description                     |
|------------------------------|---------------------------------|
| `torch.from_numpy(ndarray)` | NumPy → PyTorch tensor          |
| `tensor.numpy()`             | PyTorch tensor → NumPy array    |

In [331]:
import numpy as np
import torch

# NumPy → PyTorch
array = np.arange(1.0, 8.0)
print(array)
tensor = torch.from_numpy(array)
print(tensor)
print('----')

# Convert dtype to float32 (optional)
tensor = tensor.type(torch.float32)
print(tensor, tensor.dtype)
print('----')

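# torch.from_numpy() shares memory with the array, but the float32 conversion
# above made a copy, so this in-place change does not reach `tensor`
array += 1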
print(array)
print('----')

# PyTorch → NumPy
tensor = torch.ones(7)                  # float32 by default
print(tensor, tensor.dtype)
numpy_tensor = tensor.numpy()
print(numpy_tensor, numpy_tensor.dtype)
print('----')

# tensor.numpy() shares memory with the tensor, so this in-place change also updates numpy_tensor
tensor += 1
print(tensor, tensor.dtype)

[1. 2. 3. 4. 5. 6. 7.]
tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64)
----
tensor([1., 2., 3., 4., 5., 6., 7.]) torch.float32
----
[2. 3. 4. 5. 6. 7. 8.]
----
tensor([1., 1., 1., 1., 1., 1., 1.]) torch.float32
[1. 1. 1. 1. 1. 1. 1.] float32
----
tensor([2., 2., 2., 2., 2., 2., 2.]) torch.float32
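
Both conversions share memory with their source on the CPU, so in-place changes show up on both sides unless you copy explicitly. A minimal sketch, with illustrative variable names:

```python
import numpy as np
import torch

# NumPy -> PyTorch
array = np.arange(1.0, 4.0)
shared = torch.from_numpy(array)               # shares memory with array
independent = torch.from_numpy(array).clone()  # explicit copy

array += 1  # in-place change to the NumPy array
print(shared)       # follows the array: 2., 3., 4.
print(independent)  # unchanged: 1., 2., 3.

# PyTorch -> NumPy
tensor = torch.ones(3)
numpy_view = tensor.numpy()         # shares memory with tensor
numpy_copy = tensor.numpy().copy()  # explicit copy

tensor += 1  # in-place change to the tensor
print(numpy_view)  # follows the tensor: [2. 2. 2.]
print(numpy_copy)  # unchanged: [1. 1. 1.]
```

Note that an assignment such as `tensor = tensor + 1` creates a new tensor instead of modifying the shared memory, so the NumPy side would not change in that case.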


### Reproducibility in PyTorch

Neural networks use **random numbers** (e.g., for weight initialization).  
To make results **repeatable**, we use **random seeds**.

#### Why it matters:
- Ensures consistent results across runs
- Helps others reproduce your experiments

💡 Setting the seed restarts the random stream, so re-seed right before each random call that should reproduce the same values.

#### Without Seed:

In [332]:
a = torch.rand(3, 4)
b = torch.rand(3, 4)
a == b  # False — different values

tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])

#### With Seed:

In [333]:
torch.manual_seed(42)
c = torch.rand(3, 4)

torch.manual_seed(42)
d = torch.rand(3, 4)

c == d  # True — same values

tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
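
Seeding once does not make every later call return the same values; it only fixes the sequence. The sketch below shows this with `torch.equal()`, and also uses a separate `torch.Generator` as one way to keep seeding local instead of touching the global random state. The variable names are illustrative.

```python
import torch

torch.manual_seed(42)
first = torch.rand(2, 2)
second = torch.rand(2, 2)  # continues the same stream, so the values differ

torch.manual_seed(42)      # restart the stream
first_again = torch.rand(2, 2)

print(torch.equal(first, second))       # False
print(torch.equal(first, first_again))  # True

# Two generators with the same seed produce the same values,
# without changing the global seed
gen_a = torch.Generator().manual_seed(42)
gen_b = torch.Generator().manual_seed(42)
from_a = torch.rand(2, 2, generator=gen_a)
from_b = torch.rand(2, 2, generator=gen_b)
print(torch.equal(from_a, from_b))      # True
```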

### PyTorch Devices: CPU, CUDA, MPS

PyTorch can run on different hardware:

| Device | Description                              |
|--------|------------------------------------------|
| `cpu`  | Default processor (always available)     |
| `cuda` | NVIDIA GPU (uses CUDA toolkit)           |
| `mps`  | Apple Silicon GPU (M1/M2, via Metal)     |

Using a GPU can **greatly speed up** training and computation.

#### Auto-Select the Best Available Device:

In [335]:
if torch.cuda.is_available():
    device = "cuda"  # NVIDIA GPU
elif torch.backends.mps.is_available():
    device = "mps"   # Apple GPU
else:
    device = "cpu"   # Fallback

print(f"Using device: {device}")

# Move tensors/models to the selected device
tensor = torch.tensor([1.0, 2.0]).to(device)

Using device: mps
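
Once a device is selected, tensors that take part in the same operation need to live on that device, and results must come back to the CPU before converting to NumPy. A minimal sketch that repeats the selection logic so it runs on its own:

```python
import torch

if torch.cuda.is_available():
    device = "cuda"
elif torch.backends.mps.is_available():
    device = "mps"
else:
    device = "cpu"

# Create both operands directly on the chosen device
x = torch.tensor([1.0, 2.0, 3.0], device=device)
y = torch.ones(3, device=device)
result = x + y
print(result, result.device)

# NumPy only works with CPU memory, so move the result back first
result_np = result.cpu().numpy()
print(result_np)
```

Calling `.cpu()` on a tensor that is already on the CPU is harmless, so the same code works regardless of which device was picked.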
