# PyTorch Fundamentals

## Introduction to Tensors

In [1]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
print(torch.__version__)

2.3.1


### Creating Tensors

A **tensor** is a fundamental data structure in PyTorch, generalizing scalars, vectors, and matrices to potentially higher dimensions. 

#### **1. Intuitive Understanding**

- **Scalar**: A single number (0D tensor), e.g., temperature = 37°C.
- **Vector**: An array of numbers (1D tensor), e.g., [height, weight, age].
- **Matrix**: A 2D array (2D tensor), e.g., grayscale image pixels.
- **Tensor**: An n-dimensional array (nD tensor), e.g., color images (3D), video (4D), or batches of data.

**Analogy:**  
Think of a tensor as a general container for data, like a spreadsheet (matrix), but extended to more dimensions—imagine a stack of spreadsheets (3D), or a sequence of such stacks (4D).

#### **2. Mathematical Foundation**

A tensor is a multi-dimensional array of numerical values. Formally, an n-th order tensor is an element of the tensor product of n vector spaces.

- **Order (Rank):** Number of dimensions (axes).
- **Shape:** Size along each dimension.

**Example:**  
A tensor of shape (3, 4, 5) is a 3D tensor with 3 matrices, each of size 4x5.
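As a quick check of the terms above, the following minimal sketch builds a tensor of shape (3, 4, 5) and inspects its order and shape. The variable names are illustrative only.

```python
import torch

# Three 4x5 matrices stacked along the first dimension.
t = torch.zeros(3, 4, 5)

order = t.ndim                          # number of dimensions (axes)
shape = tuple(t.shape)                  # size along each dimension
first_matrix_shape = tuple(t[0].shape)  # indexing the first axis yields one matrix

print(order)               # 3
print(shape)               # (3, 4, 5)
print(first_matrix_shape)  # (4, 5)
print(t.numel())           # 60 elements in total (3 * 4 * 5)
```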


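**In PyTorch:**  
Tensors are backed by a block of memory (a storage) together with shape and stride metadata, which supports efficient computation and broadcasting. Freshly created tensors are contiguous in memory, while views such as transposes may not be.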
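The memory layout can be inspected directly. The sketch below is an illustration rather than part of the original notebook: it shows that a transpose reuses the same memory with different strides, and that `.contiguous()` produces a freshly laid-out copy.

```python
import torch

x = torch.arange(6).reshape(2, 3)
xt = x.T  # a view: same underlying memory, different strides

print(x.is_contiguous(), x.stride())    # True (3, 1)
print(xt.is_contiguous(), xt.stride())  # False (1, 3)
print(x.data_ptr() == xt.data_ptr())    # True, no data was copied

xc = xt.contiguous()  # copies the data into a contiguous layout
print(xc.is_contiguous(), xc.stride())  # True (2, 1)
```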


#### **4. Key Operations**

- **Reshaping:** `tensor.view()`, `tensor.reshape()`
- **Indexing/Slicing:** `tensor[0]`, `tensor[:, 1]`
- **Mathematical Ops:** `torch.add()`, `torch.matmul()`, etc.
- **Broadcasting:** Automatic expansion of dimensions for operations.
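The operations listed above can be sketched together on one small tensor. The names `m`, `col_offsets`, and `shifted` are made up for this illustration.

```python
import torch

t = torch.arange(12)   # tensor([0, 1, ..., 11])
m = t.reshape(3, 4)    # reshaping: 3 rows, 4 columns

first_row = m[0]       # indexing: tensor([0, 1, 2, 3])
second_col = m[:, 1]   # slicing:  tensor([1, 5, 9])

# Broadcasting: the (4,) tensor is applied to every row of the (3, 4) tensor.
col_offsets = torch.tensor([10, 20, 30, 40])
shifted = m + col_offsets

# Mathematical op: (3, 4) @ (4, 3) -> (3, 3)
product = torch.matmul(m, m.T)

print(shifted[0])      # tensor([10, 21, 32, 43])
print(product.shape)   # torch.Size([3, 3])
```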

#### **5. Real-World Analogies**

- **Images:** 2D (grayscale) or 3D (color) tensors.
- **Videos:** 4D tensors (frames, channel, height, width), or 5D with a leading batch dimension.
- **Text:** 2D or 3D tensors (batch, sequence length, embedding size).

#### **6. Applications in AI/ML**

- **Deep Learning:** All neural network computations (inputs, weights, activations) are tensor operations.
- **Computer Vision:** Images and videos as tensors.
- **Natural Language Processing:** Sentences as sequences of word embeddings (tensors).
- **Reinforcement Learning:** States, actions, and rewards as tensors.

#### **7. Why Tensors?**

- **Efficiency:** Optimized for GPU/TPU computation.
- **Flexibility:** Can represent any data structure needed for ML.
- **Expressiveness:** Enable complex operations (convolutions, matrix multiplications) essential for modern AI.

In [2]:
# scalar
# creating a PyTorch tensor using torch.tensor()
scalar = torch.tensor(7)
scalar

tensor(7)

In [3]:
scalar.ndim  # number of dimensions

0

In [4]:
scalar.item()  # get the value of the tensor

7

### Creating Vectors


#### Intuitive Understanding

A **vector** is a one-dimensional array of numbers. In PyTorch, a vector is represented as a 1D tensor. Vectors are fundamental in mathematics, physics, and machine learning, serving as the building blocks for more complex structures.

- **Example:**  
    `vector = torch.tensor([7, 7])`  
    This is a 1D tensor (vector) with two elements.

**Intuition:**  
Think of a vector as an arrow in space, defined by its direction and magnitude. In data science, vectors often represent features of a data point (e.g., height and weight of a person).

####  Why Vectors Matter in PyTorch

- **Efficiency:** PyTorch operations are vectorized for speed, leveraging GPUs.
- **Expressiveness:** Vectors enable concise representation of data and parameters.
- **Foundation:** All higher-dimensional tensors (matrices, etc.) are built from vectors.



In [5]:
# vector
vector = torch.tensor([7, 7])
vector

tensor([7, 7])

In [6]:
vector.ndim  # number of dimensions

1

In [7]:
vector.shape  # shape of the tensor

torch.Size([2])

In [8]:
# MATRIX
matrix = torch.tensor([[7, 8], [9, 10]])
matrix

tensor([[ 7,  8],
        [ 9, 10]])

In [9]:
matrix.ndim  # number of dimensions

2

In [10]:
matrix[1]

tensor([ 9, 10])

In [11]:
matrix.shape  # shape of the tensor

torch.Size([2, 2])

In [12]:
# TENSOR    
TENSOR = torch.tensor([[[1, 2, 3], 
                        [4,5,6,],
                        [7,8,9]]])
TENSOR

tensor([[[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]])

In [13]:
TENSOR.ndim  # number of dimensions

3

In [14]:
TENSOR.shape  # shape of the tensor

torch.Size([1, 3, 3])

In [15]:
TENSOR[0]

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

### Random Tensors

Random tensors are tensors whose values are sampled from a probability distribution, such as uniform or normal (Gaussian) distributions. In PyTorch, we can create random tensors using functions like `torch.rand()`, `torch.randn()`, and `torch.randint()`.

**Why use random tensors?**

- **Weight Initialization:** Neural networks require their weights to be initialized randomly to break symmetry and help the model learn effectively.
- **Simulating Data:** Random tensors are useful for testing models and functions when real data is unavailable.
- **Stochastic Processes:** Many machine learning algorithms rely on randomness, such as dropout or data augmentation.

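Random tensors are essential for experimentation in deep learning workflows. Because results depend on the random draws, set a seed (for example with `torch.manual_seed()`) when an experiment needs to be reproducible.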
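The sketch below illustrates the three creation functions named above and how seeding makes random draws repeatable. The seed value 42 and the variable names are arbitrary choices for this example.

```python
import torch

# Re-seeding before each call yields identical "random" tensors.
torch.manual_seed(42)
a = torch.rand(2, 3)   # uniform samples in [0, 1)

torch.manual_seed(42)
b = torch.rand(2, 3)

print(torch.equal(a, b))  # True

normal = torch.randn(2, 3)                         # standard normal samples
ints = torch.randint(low=0, high=10, size=(2, 3))  # integers in [0, 10)
print(normal.dtype, ints.dtype)  # torch.float32 torch.int64
```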

In [16]:
random_tensor = torch.rand(3, 4)
random_tensor

tensor([[0.2413, 0.8337, 0.0088, 0.2416],
        [0.7993, 0.3380, 0.2261, 0.6610],
        [0.3512, 0.3304, 0.1887, 0.3715]])

In [17]:
random_tensor.ndim  # number of dimensions

2

In [18]:
# Creating a random tensor with specific shape of an image tensor
random_image_size_tensor = torch.rand(size=(224, 224, 3))  # height 224, width 224, 3 color channels (RGB)
random_image_size_tensor.shape, random_image_size_tensor.ndim # shape and dimension of the tensor

(torch.Size([224, 224, 3]), 3)
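The tensor above uses a height, width, channels layout. Many PyTorch vision layers, such as `torch.nn.Conv2d`, expect channels first, so a common follow-up step is to reorder the axes. A minimal sketch, with illustrative variable names:

```python
import torch

image_hwc = torch.rand(size=(224, 224, 3))  # (height, width, channels)
image_chw = image_hwc.permute(2, 0, 1)      # (channels, height, width)

print(image_hwc.shape)  # torch.Size([224, 224, 3])
print(image_chw.shape)  # torch.Size([3, 224, 224])

# permute returns a view, so the same pixel is reachable under both layouts.
print(image_chw[0, 5, 7] == image_hwc[5, 7, 0])  # tensor(True)
```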

### Tensors of Zeros and Ones

In [19]:
# Creating a tensor of all zeros
zeros = torch.zeros(size=(3, 4))
zeros

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [20]:
# creating a tensor of all ones
ones = torch.ones(size=(3, 4))
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

### Creating a Range and Tensors-Like

In [21]:
# Using torch.arange() to create a tensor of a range of numbers
torch.arange(0, 10)

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [22]:
#  one to thousand with step of 50
one_to_thousand = torch.arange(1, 1001, 50)
one_to_thousand

tensor([  1,  51, 101, 151, 201, 251, 301, 351, 401, 451, 501, 551, 601, 651,
        701, 751, 801, 851, 901, 951])

In [23]:
# Creating a tensor of zeros with the same shape and dtype as another tensor
thousand_zeros = torch.zeros_like(input=one_to_thousand)
thousand_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

### Tensor Datatypes

In [24]:
# float 32 tensor
float_32_tensor = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float32)
float_32_tensor

tensor([3., 6., 9.])

A **tensor datatype** (or `dtype`) specifies the kind of elements contained in a tensor, such as integers, floating-point numbers, or booleans. The datatype determines how much memory each element uses and what operations are supported.

#### Common PyTorch Tensor Datatypes

| PyTorch dtype         | Description                | Example usage                |
|---------------------- |---------------------------|------------------------------|
| `torch.float32`       | 32-bit floating point      | Neural network weights       |
| `torch.float64`       | 64-bit floating point      | High-precision calculations  |
| `torch.int32`         | 32-bit integer             | Indexing, counting           |
| `torch.int64`         | 64-bit integer (long)      | Large indices, counters      |
| `torch.bool`          | Boolean (True/False)       | Masks, conditions            |

We can specify the dtype when creating a tensor:
```python
float_tensor = torch.tensor([1.0, 2.0, 3.0], dtype=torch.float32)
int_tensor = torch.tensor([1, 2, 3], dtype=torch.int64)
```

#### Why Tensor Datatypes Matter

- **Memory Usage:** Lower precision (e.g., `float16`) uses less memory, allowing larger models or batches.
- **Computation Speed:** Some hardware (like GPUs/TPUs) is optimized for specific datatypes.
- **Numerical Precision:** Higher precision (e.g., `float64`) reduces rounding errors but is slower and uses more memory.
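The memory trade-off can be measured with `Tensor.element_size()`, which reports the bytes used per element. The sketch below is an illustration under the assumption of a 1000-element tensor; the `sizes` dictionary is a helper introduced only for this example.

```python
import torch

values = torch.ones(1000)

sizes = {}
for dtype in (torch.float16, torch.float32, torch.float64):
    t = values.to(dtype)
    # bytes per element * number of elements = bytes of tensor data
    sizes[dtype] = t.element_size() * t.numel()
    print(dtype, t.element_size(), sizes[dtype])

# torch.float16 2 2000
# torch.float32 4 4000
# torch.float64 8 8000
```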

#### Problems That Arise with Tensor Datatypes

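1. **Type Mismatch Errors:**  
    Element-wise operations between tensors of different dtypes are resolved by type promotion, which can silently change the result dtype. Other operations, such as matrix multiplication, require matching dtypes and raise an error otherwise.
    ```python
    a = torch.tensor([1.0, 2.0], dtype=torch.float32)
    b = torch.tensor([1, 2], dtype=torch.int32)
    # a + b works: the result is promoted to float32
    # torch.matmul(a, b) raises a RuntimeError until the dtypes are matched
    ```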

2. **Loss of Precision:**  
    Converting from higher to lower precision (e.g., `float64` to `float32`) can lose information.
    ```python
    high_precision = torch.tensor([1.123456789], dtype=torch.float64)
    low_precision = high_precision.to(torch.float32)
    # low_precision may not store all decimal places
    ```

3. **Increased Memory Usage:**  
    Using unnecessarily high precision (e.g., `float64` for images) wastes memory and slows down computation.

4. **Incompatible Operations:**  
    Some operations require specific dtypes (e.g., index tensors must have an integer or boolean dtype, and `torch.mean()` needs a floating-point or complex dtype).
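The dtype pitfalls above can be checked in a few lines. This sketch shows type promotion for an element-wise operation and a dtype mismatch in matrix multiplication; the names `m_float`, `m_int`, and `mm_failed` are made up for the example.

```python
import torch

a = torch.tensor([1.0, 2.0], dtype=torch.float32)
b = torch.tensor([1, 2], dtype=torch.int32)

# Element-wise ops promote to a common dtype instead of failing.
summed = a + b
print(summed.dtype)             # torch.float32
print(torch.result_type(a, b))  # torch.float32

# Matrix multiplication expects both operands to share a dtype.
m_float = torch.ones(2, 2, dtype=torch.float32)
m_int = torch.ones(2, 2, dtype=torch.int64)
try:
    torch.mm(m_float, m_int)
    mm_failed = False
except RuntimeError as err:
    mm_failed = True
    print("mm failed:", err)

# Casting one operand resolves the mismatch.
fixed = torch.mm(m_float, m_int.to(m_float.dtype))
print(fixed.dtype)  # torch.float32
```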

#### Best Practices

- Use `float32` for most deep learning tasks (default in PyTorch).
- Use integer types for labels, indices, or counting.
- Be explicit about dtypes when precision or compatibility matters.
- Convert dtypes using `.to()`, `.float()`, `.long()`, etc.

**Example:**
```python
tensor = torch.arange(10)           # Default dtype: int64
tensor = tensor.float()             # Convert to float32
```

Understanding and managing tensor datatypes is crucial for efficient, correct, and reproducible deep learning workflows.

In [25]:
float_32_tensor.dtype  # data type of the tensor

torch.float32

In [26]:
int_32_tensor = float_32_tensor.type(torch.int32)  # changing the data type of the tensor
int_32_tensor

tensor([3, 6, 9], dtype=torch.int32)

In [27]:
float_32_tensor*int_32_tensor  # element-wise multiplication

tensor([ 9., 36., 81.])

### Getting Information from Tensor

After creating tensors we might want to get some information from them.<br>

The most common attributes you'll want to find out about tensors are:
- `shape` - what shape is the tensor? (some operations require specific shape rules)
- `dtype` - what datatype are the elements within the tensor stored in?
- `device` - what device is the tensor stored on? (usually GPU or CPU)

In [28]:
# creating a tensor
some_tensor = torch.rand(3,4)
some_tensor

tensor([[0.5020, 0.0454, 0.6678, 0.1819],
        [0.9683, 0.2076, 0.8100, 0.3943],
        [0.8694, 0.9924, 0.3539, 0.2197]])

In [29]:
# Finding out details about the tensor
print(some_tensor)

tensor([[0.5020, 0.0454, 0.6678, 0.1819],
        [0.9683, 0.2076, 0.8100, 0.3943],
        [0.8694, 0.9924, 0.3539, 0.2197]])


In [30]:
print(some_tensor)
print(f"Shape of tensor: {some_tensor.shape}")
print(f"Datatype of tensor: {some_tensor.dtype}")
print(f"Device tensor is stored on: {some_tensor.device}")

tensor([[0.5020, 0.0454, 0.6678, 0.1819],
        [0.9683, 0.2076, 0.8100, 0.3943],
        [0.8694, 0.9924, 0.3539, 0.2197]])
Shape of tensor: torch.Size([3, 4])
Datatype of tensor: torch.float32
Device tensor is stored on: cpu


### Manipulating Tensors (Tensor Operations)

Tensor operations are fundamental to working with data in PyTorch. They allow us to manipulate, transform, and compute with tensors efficiently. Here are some of the most common tensor operations:

#### 1. **Basic Arithmetic Operations**
- **Addition/Subtraction:** `+`, `-`, or `torch.add()`, `torch.sub()`
- **Multiplication/Division:** `*`, `/`, or `torch.mul()`, `torch.div()`
- **Element-wise:** Operations are performed element-by-element.

#### 2. **Matrix Operations**
- **Matrix Multiplication:** `torch.matmul(a, b)` or `a @ b`
- **Dot Product:** `torch.dot(a, b)` (for 1D tensors)
- **Transpose:** `tensor.T` or `tensor.transpose(dim0, dim1)`

#### 3. **Aggregation**
- **Sum:** `tensor.sum()`
- **Mean:** `tensor.mean()`
- **Max/Min:** `tensor.max()`, `tensor.min()`
- **Argmax/Argmin:** `tensor.argmax()`, `tensor.argmin()`

#### 4. **Reshaping and Manipulation**
- **Reshape:** `tensor.reshape(new_shape)` or `tensor.view(new_shape)`
- **Squeeze/Unsqueeze:** Remove or add dimensions of size 1.
    - `tensor.squeeze()` (removes)
    - `tensor.unsqueeze(dim)` (adds)
- **Stacking/Concatenating:** Combine tensors along a new or an existing dimension.
    - `torch.stack([a, b], dim=0)` (adds a new dimension)
    - `torch.cat([a, b], dim=1)` (joins along an existing dimension)

#### 5. **Indexing and Slicing**
- Access elements or sub-tensors using Python-style indexing: `tensor[0]`, `tensor[:, 1]`, etc.

#### 6. **Broadcasting**
- PyTorch automatically expands tensors of different shapes for compatible operations.

#### 7. **In-place Operations**
- Operations ending with `_` modify the tensor in place, e.g., `tensor.add_(1)`

**Example:**
```python
a = torch.tensor([[1, 2], [3, 4]])
b = torch.tensor([[5, 6], [7, 8]])
c = a + b           # Element-wise addition
d = a @ b           # Matrix multiplication
e = a.reshape(4)    # Reshape to 1D
```
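The reshaping and in-place operations from the list can be compared by looking at the shapes they produce. A minimal sketch, reusing `a` and `b` from the example and introducing illustrative names for the results:

```python
import torch

a = torch.tensor([[1, 2], [3, 4]])
b = torch.tensor([[5, 6], [7, 8]])

stacked = torch.stack([a, b], dim=0)  # new leading dimension: (2, 2, 2)
joined = torch.cat([a, b], dim=1)     # existing dimension grows: (2, 4)

row = a.reshape(1, 4)                # shape (1, 4)
squeezed = row.squeeze()             # size-1 dimension removed: (4,)
column = squeezed.unsqueeze(dim=1)   # size-1 dimension added: (4, 1)

c = a.clone()
c.add_(1)  # trailing underscore: modifies c in place, a is untouched

print(stacked.shape, joined.shape)               # torch.Size([2, 2, 2]) torch.Size([2, 4])
print(row.shape, squeezed.shape, column.shape)   # torch.Size([1, 4]) torch.Size([4]) torch.Size([4, 1])
print(c)                                         # tensor([[2, 3], [4, 5]])
```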


In [31]:
# Creating a tensor of values and adding a number to it
tensor = torch.tensor([1, 2, 3])
tensor + 10

tensor([11, 12, 13])

In [32]:
# Multiply it by 10
tensor * 10

tensor([10, 20, 30])

In [33]:
# Tensors don't change unless reassigned
tensor

tensor([1, 2, 3])

In [34]:
# Subtract and reassign
tensor = tensor - 10
tensor

tensor([-9, -8, -7])

In [35]:
# Add and reassign
tensor = tensor + 10
tensor

tensor([1, 2, 3])

In [36]:
# Can also use torch functions
torch.multiply(tensor, 10)

tensor([10, 20, 30])

In [37]:
# Element-wise multiplication (each element multiplies its equivalent, index 0->0, 1->1, 2->2)
print(tensor, "*", tensor)
print("Equals:", tensor * tensor)

tensor([1, 2, 3]) * tensor([1, 2, 3])
Equals: tensor([1, 4, 9])


In [38]:
# Matrix multiplication (for two 1D tensors this is the dot product)
torch.matmul(tensor, tensor)

tensor(14)

In [39]:
tensor

tensor([1, 2, 3])

In [40]:
#using @ for matrix multiplication
tensor @ tensor

tensor(14)

#### Using Built-In PyTorch Tensor Operations

In [41]:
%%time 
value = 0
for i in range(len(tensor)):
    value += tensor[i] * tensor[i]
print(value)

tensor(14)
CPU times: user 784 µs, sys: 786 µs, total: 1.57 ms
Wall time: 1.2 ms


In [44]:
%%time
torch.matmul(tensor, tensor)

CPU times: user 126 µs, sys: 34 µs, total: 160 µs
Wall time: 147 µs


tensor(14)

### Rules of Matrix Multiplications

- The inner dimensions must match.
- The resulting matrix has the shape of the outer dimensions.
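Both rules can be verified on random matrices. In this sketch, `A` and `B` are placeholder names, and the failing case is caught so the snippet runs end to end.

```python
import torch

A = torch.rand(3, 2)
B = torch.rand(2, 3)

# Inner dimensions match (2 and 2); the result takes the outer dimensions.
print((A @ B).shape)  # torch.Size([3, 3])
print((B @ A).shape)  # torch.Size([2, 2])

# (3, 2) @ (3, 2): inner dimensions 2 and 3 do not match.
try:
    A @ A
    mismatch_raised = False
except RuntimeError as err:
    mismatch_raised = True
    print("Shape error:", err)
```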

In [46]:
tensor = torch.tensor([1, 2, 3])
tensor.shape

torch.Size([3])

In [47]:
# Element-wise multiplication
tensor * tensor

tensor([1, 4, 9])

In [48]:
# Matrix multiplication
torch.matmul(tensor, tensor)

tensor(14)

In [50]:
# Can also use the "@" symbol for matrix multiplication
tensor @ tensor

tensor(14)

In [None]:
# Shapes must be compatible: the inner dimensions have to match
tensor_A = torch.tensor([[1, 2],
                         [3, 4],
                         [5, 6]], dtype=torch.float32)

tensor_B = torch.tensor([[7, 10],
                         [8, 11],
                         [9, 12]], dtype=torch.float32)

# torch.matmul(tensor_A, tensor_B)  # this will error: (3, 2) @ (3, 2) has mismatched inner dimensions

Transposing one of the tensors makes the inner dimensions match and avoids the error.

We can transpose a tensor in either of the following ways:
- `torch.transpose(input, dim0, dim1)`, where `input` is the tensor to transpose and `dim0` and `dim1` are the dimensions to swap.
- `tensor.T`, where `tensor` is the tensor to transpose.
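For a 2D tensor the two forms give the same result, which the following short sketch confirms. `tensor_B` is redefined here so the snippet is self-contained.

```python
import torch

tensor_B = torch.tensor([[7., 10.],
                         [8., 11.],
                         [9., 12.]])

via_function = torch.transpose(tensor_B, 0, 1)  # swap dimensions 0 and 1
via_attribute = tensor_B.T

print(via_function.shape)                        # torch.Size([2, 3])
print(torch.equal(via_function, via_attribute))  # True
```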

In [53]:
print(tensor_A)
print(tensor_B)

tensor([[1., 2.],
        [3., 4.],
        [5., 6.]])
tensor([[ 7., 10.],
        [ 8., 11.],
        [ 9., 12.]])


In [54]:
print(tensor_A)
print(tensor_B.T)

tensor([[1., 2.],
        [3., 4.],
        [5., 6.]])
tensor([[ 7.,  8.,  9.],
        [10., 11., 12.]])


In [55]:
# The operation works when tensor_B is transposed
print(f"Original shapes: tensor_A = {tensor_A.shape}, tensor_B = {tensor_B.shape}\n")
print(f"New shapes: tensor_A = {tensor_A.shape} (same as above), tensor_B.T = {tensor_B.T.shape}\n")
print(f"Multiplying: {tensor_A.shape} * {tensor_B.T.shape} <- inner dimensions match\n")
print("Output:\n")
output = torch.matmul(tensor_A, tensor_B.T)
print(output)
print(f"\nOutput shape: {output.shape}")

Original shapes: tensor_A = torch.Size([3, 2]), tensor_B = torch.Size([3, 2])

New shapes: tensor_A = torch.Size([3, 2]) (same as above), tensor_B.T = torch.Size([2, 3])

Multiplying: torch.Size([3, 2]) * torch.Size([2, 3]) <- inner dimensions match

Output:

tensor([[ 27.,  30.,  33.],
        [ 61.,  68.,  75.],
        [ 95., 106., 117.]])

Output shape: torch.Size([3, 3])


In [56]:
# torch.mm also performs matrix multiplication, but only for 2D tensors (no broadcasting)
torch.mm(tensor_A, tensor_B.T)

tensor([[ 27.,  30.,  33.],
        [ 61.,  68.,  75.],
        [ 95., 106., 117.]])
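`torch.mm` and `torch.matmul` agree on 2D inputs but differ on batched inputs. The sketch below assumes a hypothetical batch of ten 3x2 matrices multiplied by a single 2x4 matrix.

```python
import torch

batch = torch.rand(10, 3, 2)  # ten 3x2 matrices
weights = torch.rand(2, 4)    # one 2x4 matrix

# matmul broadcasts the 2D matrix across the batch dimension.
out = torch.matmul(batch, weights)
print(out.shape)  # torch.Size([10, 3, 4])

# mm accepts only 2D tensors, so the batched input is rejected.
try:
    torch.mm(batch, weights)
    mm_rejected = False
except RuntimeError as err:
    mm_rejected = True
    print("mm error:", err)
```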

### Finding the min, max, mean, sum etc (aggregation)

In [57]:
#creating a tensor
x = torch.arange(0, 100, 10)
x

tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

In [58]:
print(f"Minimum: {x.min()}")
print(f"Maximum: {x.max()}")
# torch.mean() needs a floating-point dtype, so cast the int64 tensor first
print(f"Mean: {x.type(torch.float32).mean()}")
print(f"Sum: {x.sum()}")

Minimum: 0
Maximum: 90
Mean: 45.0
Sum: 450


In [60]:
# USING TORCH METHODS 
torch.max(x), torch.min(x), torch.mean(x.type(torch.float32)), torch.sum(x)

(tensor(90), tensor(0), tensor(45.), tensor(450))
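Aggregations can also run along a single dimension through the `dim` argument. A small sketch on a 2x3 tensor, with names chosen for illustration:

```python
import torch

x = torch.tensor([[1., 2., 3.],
                  [4., 5., 6.]])

col_sums = x.sum(dim=0)    # collapse rows:    tensor([5., 7., 9.])
row_means = x.mean(dim=1)  # collapse columns: tensor([2., 5.])

# With dim given, max returns both the values and their indices.
row_max, row_argmax = x.max(dim=1)
print(col_sums, row_means)
print(row_max, row_argmax)  # tensor([3., 6.]) tensor([2, 2])
```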

### Positional Min/Max

We can also find the index of a tensor where the max or minimum occurs with `torch.argmax()` and `torch.argmin()` respectively.

In [61]:
#creating a tensor
tensor = torch.arange(10, 100, 10)
print(f"Tensor: {tensor}")

#returns index of max and min values
print(f"Index where max value occurs: {tensor.argmax()}")
print(f"Index where min value occurs: {tensor.argmin()}")

Tensor: tensor([10, 20, 30, 40, 50, 60, 70, 80, 90])
Index where max value occurs: 8
Index where min value occurs: 0
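The returned index can be used to look the value back up, and with `dim` it gives one index per row, which is how class predictions are often read from a score matrix. The `scores` tensor below is invented for this example.

```python
import torch

tensor = torch.arange(10, 100, 10)
max_index = tensor.argmax()
print(max_index, tensor[max_index])  # tensor(8) tensor(90)

# One row of scores per sample; argmax over dim=1 picks the best column.
scores = torch.tensor([[0.1, 0.7, 0.2],
                       [0.8, 0.1, 0.1]])
predicted = scores.argmax(dim=1)
print(predicted)  # tensor([1, 0])
```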


### Changing Tensor Datatype

Some operations raise errors when one tensor is `torch.float64` and another is `torch.float32`.<br>
We can change the datatypes of tensors using `torch.Tensor.type(dtype=None)` where the `dtype` parameter is the datatype we'd like to use.

In [62]:
#creating a tensor and checking its datatype
tensor = torch.arange(10., 100., 10.)
tensor.dtype

torch.float32

In [63]:
# creating a float16 tensor
tensor_float16 = tensor.type(torch.float16)
tensor_float16

tensor([10., 20., 30., 40., 50., 60., 70., 80., 90.], dtype=torch.float16)

In [64]:
#creating an int8 tensor
tensor_int8 = tensor.type(torch.int8)
tensor_int8

tensor([10, 20, 30, 40, 50, 60, 70, 80, 90], dtype=torch.int8)
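Besides `.type()`, the same conversion can be written with `.to()` or a shorthand method such as `.half()`. The sketch below checks that they agree and shows that casting floats to an integer dtype drops the fractional part; the sample values are arbitrary.

```python
import torch

tensor = torch.arange(10., 100., 10.)

half_a = tensor.type(torch.float16)
half_b = tensor.to(torch.float16)
half_c = tensor.half()
same = torch.equal(half_a, half_b) and torch.equal(half_b, half_c)
print(same)  # True

# Float to integer conversion truncates toward zero.
truncated = torch.tensor([1.9, -1.9]).to(torch.int8)
print(truncated)  # tensor([ 1, -1], dtype=torch.int8)
```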