<a href="https://colab.research.google.com/github/franciskyalo/Pytorch-Learning/blob/main/Pytorch_An_introduction.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>



# An Introduction to PyTorch


PyTorch is an open-source machine learning framework primarily used for deep learning applications. It has gained popularity for its flexibility, dynamic computation graph, and user-friendly interface. Here's a brief history of PyTorch:

### Origins
- **2016**: PyTorch was developed by **Facebook AI Research (FAIR)**, Facebook's artificial intelligence research lab. It was built on the foundation of **Torch**, a machine learning library written in Lua, which had been widely used in academic research but lacked adoption in the broader industry.

- **Key Contributors**:
  - **Adam Paszke** and other researchers from FAIR played significant roles in its development.
  - It incorporated many modern design principles to make it more accessible and flexible compared to Torch.

### Early Features
- PyTorch introduced a **dynamic computation graph**, allowing developers to modify the model architecture during runtime. This was a significant departure from static graph frameworks like TensorFlow 1.x, making it especially appealing to researchers.

- PyTorch was designed with a focus on Python, making it intuitive and easy to integrate with Python’s rich ecosystem of libraries.
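
To make the dynamic graph idea concrete, here is a minimal sketch. The function name `double_until_large` and the threshold value are illustrative choices, not part of PyTorch. The number of multiplications depends on the values in the input, yet autograd can still differentiate through whichever operations actually ran.

```python
import torch

def double_until_large(x, threshold=10.0):
    # Plain Python control flow: the loop count depends on the data in x
    steps = 0
    while x.norm() < threshold:
        x = x * 2
        steps += 1
    return x.sum(), steps

x = torch.tensor([1.0, 2.0], requires_grad=True)
total, steps = double_until_large(x)
total.backward()  # gradients flow through the operations that actually ran

print(steps)   # number of doublings performed for this input
print(x.grad)  # each element was scaled by 2 ** steps
```

In a static-graph framework, a data-dependent loop like this would usually need special graph-level control-flow operators instead of an ordinary `while` statement.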

### Adoption and Growth
- By **2017**, PyTorch began gaining traction among researchers, thanks to its ease of use, dynamic nature, and robust community support.

- It became a favorite in academic research, where experimentation and model iteration are frequent.

### Key Milestones
1. **2018**:
   - **PyTorch 1.0**: Facebook merged PyTorch with another of its frameworks, **Caffe2**, to combine research flexibility with production capabilities.
   - This version included improved support for deploying models in production environments.

2. **2019**:
   - PyTorch introduced **TorchScript**, enabling the export of models to a statically defined computational graph for optimized deployment.
   - It became widely used in research papers presented at major deep learning conferences and in machine learning competitions.

3. **2020**:
   - PyTorch expanded its support for **distributed training** and for **hardware accelerators** like TPUs.
   - It was increasingly adopted by industry leaders like Tesla, Microsoft, and Uber for real-world applications.

4. **2021-2023**:
   - PyTorch grew its ecosystem with libraries like **TorchVision** (for computer vision), **TorchText** (for natural language processing), and **TorchAudio** (for audio processing).
   - PyTorch 2.0 (2023) introduced **torch.compile**, which compiles models just-in-time for better performance.

5. **2022**:
   - Meta (the company formerly known as Facebook) transferred governance of the framework to the newly created **PyTorch Foundation**, part of the Linux Foundation. The move was intended to keep PyTorch community-driven and open-source.

6. **2023 and Beyond**:
   - PyTorch continues to innovate with improvements in scalability, support for large models (e.g., Transformers), and integration with tools for reinforcement learning, generative AI, and more.

### Impact
- PyTorch has played a crucial role in democratizing deep learning by providing an accessible yet powerful platform for both researchers and practitioners.
- It remains one of the most widely used frameworks in academia and industry, competing closely with TensorFlow.

PyTorch's emphasis on flexibility, coupled with its growing ecosystem and robust community support, has solidified its position as a cornerstone of modern AI and deep learning.

# Introduction to Tensors

A **tensor** is a multi-dimensional array or matrix used to represent data in deep learning and other numerical computations. Tensors generalize vectors and matrices to higher dimensions, making them highly versatile for handling complex datasets and computations in neural networks.

### Key Characteristics of Tensors
1. **Dimensionality**:
   - A **scalar** is a tensor with zero dimensions (e.g., \(5\)).
   - A **vector** is a 1-dimensional tensor (e.g., \([1, 2, 3]\)).
   - A **matrix** is a 2-dimensional tensor (e.g., \(\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}\)).
   - Higher-dimensional tensors are often referred to as n-dimensional tensors.

2. **Shape**:
   The shape of a tensor indicates the number of elements along each dimension. For example:
   - A 1D tensor with 3 elements: `(3,)`.
   - A 2D tensor with 2 rows and 3 columns: `(2, 3)`.
   - A 3D tensor, e.g., representing a batch of grayscale images: `(batch_size, height, width)`.

3. **Data Types**:
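   Tensors can hold various data types like integers, floating-point numbers, or even Boolean values. Every element of a tensor shares the same data type. You can specify it when defining the tensor; otherwise PyTorch infers it from the data (e.g., `torch.int64` for Python integers and `torch.float32` for Python floats).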

4. **Operations**:
   Tensors are the building blocks of deep learning models. They support various operations such as addition, multiplication, reshaping, slicing, and more. These operations are highly optimized for GPUs and other accelerators.
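
As a quick illustration of the characteristics listed above, the sketch below shows data type inference and a few common operations. The variable names are illustrative, and the inferred types assume PyTorch's default settings have not been changed.

```python
import torch

ints = torch.tensor([1, 2, 3])          # dtype inferred as torch.int64
floats = torch.tensor([1.0, 2.0, 3.0])  # dtype inferred as torch.float32
print(ints.dtype, floats.dtype)

grid = torch.arange(6).reshape(2, 3)    # values 0..5 arranged in 2 rows, 3 columns
print(grid.shape)
print(grid + 10)    # element-wise addition
print(grid * 2)     # element-wise multiplication
print(grid[:, 1])   # slicing: the second column
```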

### Importance in Deep Learning
- **Data Representation**:
  Tensors are used to represent all forms of input data (e.g., images, text, audio).
  - Images: Stored as 3D tensors (e.g., `[height, width, channels]`; PyTorch's vision tools usually expect the channels-first layout `[channels, height, width]`).
  - Text: Represented as sequences or embeddings in tensors.
  - Video: Stored as 4D tensors (`[frames, height, width, channels]`).

- **Model Parameters**:
  The weights and biases in neural networks are stored as tensors.

- **Parallel Computation**:
  Frameworks like PyTorch and TensorFlow use tensors to perform computations efficiently on GPUs and TPUs.
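
The following sketch shows what these representations look like in practice. The image size, the frame count, and the layer sizes are arbitrary values chosen for illustration.

```python
import torch
from torch import nn

image = torch.rand(224, 224, 3)       # [height, width, channels]
video = torch.rand(16, 224, 224, 3)   # [frames, height, width, channels]
print(image.ndim, video.ndim)

# Model parameters are tensors too: inspect a small linear layer
layer = nn.Linear(in_features=4, out_features=2)
for name, param in layer.named_parameters():
    print(name, tuple(param.shape))
```

The linear layer stores its weights as a `(2, 4)` tensor and its biases as a `(2,)` tensor.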

### Example in PyTorch
```python
import torch

# Scalar
scalar = torch.tensor(5)
print(scalar.shape)  # Output: torch.Size([])

# Vector
vector = torch.tensor([1, 2, 3])
print(vector.shape)  # Output: torch.Size([3])

# Matrix
matrix = torch.tensor([[1, 2], [3, 4]])
print(matrix.shape)  # Output: torch.Size([2, 2])

# 3D Tensor
tensor_3d = torch.randn(3, 3, 3)  # Random 3x3x3 tensor
print(tensor_3d.shape)  # Output: torch.Size([3, 3, 3])
```

Tensors are foundational in deep learning, enabling the storage and manipulation of data and parameters in neural networks.

# Writing PyTorch Code

### Introduction to PyTorch Tensors

In PyTorch, tensors are versatile data structures that support various data types and functionalities. While all PyTorch tensors are fundamentally the same type, their behavior can differ based on their **data type**, **device**, **dimensionality**, or **special purpose**. Here are the main classifications:

---

### **1. Based on Dimensionality**
- **Scalar Tensor**: A zero-dimensional tensor, representing a single number.
  ```python
  scalar = torch.tensor(5)
  print(scalar.dim())  # Output: 0
  ```

- **Vector Tensor**: A one-dimensional tensor, representing a list of numbers.
  ```python
  vector = torch.tensor([1, 2, 3])
  print(vector.dim())  # Output: 1
  ```

- **Matrix Tensor**: A two-dimensional tensor, representing a grid of numbers.
  ```python
  matrix = torch.tensor([[1, 2], [3, 4]])
  print(matrix.dim())  # Output: 2
  ```

- **Higher-Dimensional Tensor**: Tensors with three or more dimensions, used for more complex data.
  ```python
  tensor_3d = torch.rand(3, 4, 5)  # 3D tensor
  tensor_4d = torch.rand(2, 3, 4, 5)  # 4D tensor
  ```

---

### **2. Based on Data Type**
PyTorch tensors can hold different types of data. Common types include:
- **Float Tensors** (default): `torch.float32`
  ```python
  tensor = torch.tensor([1.0, 2.0], dtype=torch.float32)
  ```

- **Double Tensors**: `torch.float64`
  ```python
  tensor = torch.tensor([1.0, 2.0], dtype=torch.float64)
  ```

- **Integer Tensors**:
  - `torch.int32` or `torch.int64`
  ```python
  tensor = torch.tensor([1, 2], dtype=torch.int32)
  ```

- **Boolean Tensors**: `torch.bool`
  ```python
  tensor = torch.tensor([True, False], dtype=torch.bool)
  ```

- **Complex Tensors**: `torch.complex64` or `torch.complex128`
  ```python
  tensor = torch.tensor([1+2j, 3+4j], dtype=torch.complex64)
  ```

---

### **3. Based on Device**
Tensors can reside on different devices:
- **CPU Tensors**: The default device.
  ```python
  tensor = torch.tensor([1, 2, 3])
  ```

- **GPU Tensors**: Leveraging CUDA for faster computation.
  ```python
  tensor = torch.tensor([1, 2, 3], device='cuda')  # raises an error if CUDA is not available
  ```

- **TPU Tensors**: For advanced hardware accelerators (requires libraries like PyTorch XLA).
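
Because a GPU is not always present, a common pattern is to pick the device at runtime. The sketch below is one way to do this; the variable names are illustrative.

```python
import torch

# Fall back to the CPU when no CUDA-capable GPU is available
device = "cuda" if torch.cuda.is_available() else "cpu"

tensor = torch.tensor([1, 2, 3], device=device)
print(tensor.device)

# .to() returns a tensor on the requested device
tensor_cpu = tensor.to("cpu")
print(tensor_cpu.device)
```

Tensors generally need to be on the same device before they can be combined in an operation, so moving data with `.to()` is a frequent step.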

---

### **4. Based on Gradient Computation**
- **Regular Tensor**: No gradient tracking.
  ```python
  tensor = torch.tensor([1.0, 2.0])
  ```

- **Tensor with Gradients**: Used for automatic differentiation in models.
  ```python
  tensor = torch.tensor([1.0, 2.0], requires_grad=True)
  ```
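
A small sketch of what gradient tracking enables: for `y = x1^2 + x2^2`, calling `backward()` fills `x.grad` with the derivative `2 * x`. The variable names are illustrative.

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)
y = (x ** 2).sum()   # y = x1^2 + x2^2
y.backward()         # compute dy/dx and store it in x.grad
print(x.grad)        # the gradient is 2 * x

plain = torch.tensor([1.0, 2.0])
print(plain.requires_grad)  # regular tensors do not track gradients
```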

---

### **5. Special Tensors**
- **Sparse Tensors**: Efficient storage for tensors with mostly zero values.
  ```python
  indices = torch.tensor([[0, 1], [1, 2]])
  values = torch.tensor([3, 4])
  sparse_tensor = torch.sparse_coo_tensor(indices, values, [2, 3])
  ```

- **Quantized Tensors**: Reduced precision for memory-efficient inference.
  ```python
  tensor = torch.quantize_per_tensor(torch.tensor([1.0, 2.0]), scale=0.1, zero_point=0, dtype=torch.qint8)
  ```

- **Complex Tensors**: For working with complex numbers.
  ```python
  tensor = torch.tensor([1+2j, 3+4j])
  ```

- **Custom Tensors**: You can define custom tensor types by subclassing `torch.Tensor`.
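
To see what the sparse tensor from the example above actually stores, it can be converted back to a regular (dense) tensor. In the COO format, the first row of `indices` holds row positions and the second row holds column positions.

```python
import torch

indices = torch.tensor([[0, 1], [1, 2]])  # entries at (row 0, col 1) and (row 1, col 2)
values = torch.tensor([3, 4])
sparse_tensor = torch.sparse_coo_tensor(indices, values, [2, 3])

dense = sparse_tensor.to_dense()  # fill the unspecified positions with zeros
print(sparse_tensor.is_sparse)
print(dense)
```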

---

### Summary Table

| **Classification** | **Example**                                                                 |
|---------------------|-----------------------------------------------------------------------------|
| Dimensionality      | Scalar, Vector, Matrix, Higher-dimensional tensors                         |
| Data Type           | Float, Integer, Boolean, Complex                                           |
| Device              | CPU, GPU, TPU                                                             |
| Gradients           | Regular tensors, Tensors with `requires_grad=True`                        |
| Special Tensors     | Sparse tensors, Quantized tensors, Complex tensors                        |




In [1]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
print(torch.__version__)

2.5.1+cu121


In [2]:
scalar = torch.tensor(8)
scalar

tensor(8)

In [3]:
scalar.ndim

0

In [4]:
scalar.item()

8

In [5]:
vector=torch.tensor([1,2,7,8])

vector

tensor([1, 2, 7, 8])

In [6]:
vector.dim()

1

In [7]:
vector.shape

torch.Size([4])

In [8]:
matrix_tensor = torch.tensor([[1,2,3],[4,5,6]])
matrix_tensor

tensor([[1, 2, 3],
        [4, 5, 6]])

In [9]:
matrix_tensor.shape

torch.Size([2, 3])

In [10]:
matrix_tensor[0]

tensor([1, 2, 3])
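
Indexing works along every dimension, much like NumPy. The sketch below recreates the matrix so it can run on its own; the result names are illustrative.

```python
import torch

matrix_tensor = torch.tensor([[1, 2, 3], [4, 5, 6]])

first_row = matrix_tensor[0]          # all columns of row 0
element = matrix_tensor[1, 2]         # row 1, column 2 (a zero-dimensional tensor)
second_column = matrix_tensor[:, 1]   # every row, column 1

print(first_row, element, second_column)
```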

In [11]:
# random tensors

random_tensor = torch.rand(3,4)
random_tensor

tensor([[0.0286, 0.8445, 0.9133, 0.9177],
        [0.2888, 0.9284, 0.2399, 0.9023],
        [0.1752, 0.9511, 0.4754, 0.6895]])

In [12]:
normal_random_tensor = torch.randn(3,4)
normal_random_tensor

tensor([[-0.0835,  1.8718,  0.3465, -0.5932],
        [-0.6418, -1.8164, -0.6834, -0.3300],
        [-0.4358, -1.2571,  1.8330,  2.0710]])
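
Random tensors change on every run, so the values above will differ from yours. When reproducible values are needed, one option is to set the seed with `torch.manual_seed` before generating. The seed value 42 here is arbitrary.

```python
import torch

torch.manual_seed(42)
first = torch.rand(3, 4)

torch.manual_seed(42)   # reset the generator to the same state
second = torch.rand(3, 4)

print(torch.equal(first, second))  # both draws produce identical values
```

Note that `torch.rand` samples uniformly from the interval [0, 1), while `torch.randn` samples from a standard normal distribution.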

In [13]:
random_image_size_tensor = torch.rand(size=(224,224,3))
random_image_size_tensor.shape, random_image_size_tensor.ndim

(torch.Size([224, 224, 3]), 3)
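
The tensor above uses a height, width, channels layout. Many PyTorch vision components expect channels first, and `permute` can reorder the dimensions without copying the data. A minimal sketch, with illustrative variable names:

```python
import torch

image_hwc = torch.rand(size=(224, 224, 3))   # [height, width, channels]
image_chw = image_hwc.permute(2, 0, 1)       # [channels, height, width]

print(image_hwc.shape)
print(image_chw.shape)
```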

In [14]:
#zero tensors
zero_tensor=torch.zeros(size=(3,4))
zero_tensor

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [15]:
# ones tensors

ones_tensors=torch.ones(size=(3,4))
ones_tensors

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [16]:
# checking the data type

ones_tensors.dtype

torch.float32

In [17]:
# Creating a tensor from a range of values

range_torch = torch.arange(0,10)
range_torch

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [18]:
# including step

step_range_torch = torch.arange(start=0,end=1000,step=77)
step_range_torch

tensor([  0,  77, 154, 231, 308, 385, 462, 539, 616, 693, 770, 847, 924])

In [19]:
# creating a tensor of zeros with the same shape and dtype as another tensor

tensors_like= torch.zeros_like(input=step_range_torch)
tensors_like

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
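
The `*_like` functions copy the shape and data type of their input. Besides `torch.zeros_like`, PyTorch also provides `torch.ones_like` and `torch.full_like`. The sketch below recreates the range tensor so it can run on its own.

```python
import torch

step_range_torch = torch.arange(start=0, end=1000, step=77)

zeros = torch.zeros_like(step_range_torch)
ones = torch.ones_like(step_range_torch)
sevens = torch.full_like(step_range_torch, 7)   # every element set to 7

# Shape and dtype are inherited from the input tensor
print(zeros.shape, zeros.dtype)
print(sevens)
```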

### Tensor data types

In PyTorch, tensors can have various data types that determine the kind of data they store. These data types are essential for ensuring compatibility and optimizing performance in computations. Here's an overview of the main tensor data types in PyTorch:

---

### **Numeric Data Types**

#### **Floating Point**
- **`torch.float32` (`torch.float`)**:
  - 32-bit floating-point.
  - Commonly used for most neural network computations.
- **`torch.float64` (`torch.double`)**:
  - 64-bit floating-point.
  - Higher precision than `float32` but slower; rarely used in deep learning.
- **`torch.float16` (`torch.half`)**:
  - 16-bit floating-point.
  - Used for mixed precision training to reduce memory usage and accelerate computations on GPUs.

#### **Integer**
- **`torch.int8`**:
  - 8-bit signed integer.
  - Often used for quantized models to reduce memory usage.
- **`torch.uint8`**:
  - 8-bit unsigned integer.
  - Suitable for image data (e.g., pixel intensities between 0-255).
- **`torch.int16` (`torch.short`)**:
  - 16-bit signed integer.
- **`torch.int32` (`torch.int`)**:
  - 32-bit signed integer.
- **`torch.int64` (`torch.long`)**:
  - 64-bit signed integer.
  - Often used for indexing operations in PyTorch (e.g., in `torch.gather` or `torch.index_select`).

#### **Boolean**
- **`torch.bool`**:
  - Boolean type.
  - Used for masks and logical operations.
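
The sketch below shows the two roles mentioned above: a `torch.bool` tensor acting as a mask, and a `torch.int64` tensor acting as an index. The variable names and values are illustrative.

```python
import torch

values = torch.tensor([10, 20, 30, 40])

# Comparisons produce boolean tensors, which can be used as masks
mask = values > 20
print(mask.dtype)
print(values[mask])

# Index tensors built from Python integers are int64 by default
indices = torch.tensor([0, 3])
picked = torch.index_select(values, dim=0, index=indices)
print(picked)
```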

---

### **Complex Data Types**
- **`torch.complex64`**:
  - 64-bit complex number (32 bits each for the real and imaginary parts).
- **`torch.complex128`**:
  - 128-bit complex number (64 bits each for the real and imaginary parts).
  - Used for computations involving complex numbers.

---

### **Quantized Data Types**
- **`torch.qint8`**:
  - Quantized 8-bit signed integer.
- **`torch.quint8`**:
  - Quantized 8-bit unsigned integer.
- **`torch.qint32`**:
  - Quantized 32-bit signed integer.
  - Used in quantized models for efficient inference.

---

### **Specialized Data Types**
- **`torch.bfloat16`**:
  - Brain floating-point format (16-bit).
  - Often used to speed up training: it keeps the 8 exponent bits of `float32`, giving a much wider value range than `float16`, at the cost of lower precision (fewer mantissa bits).
  - Supported on certain hardware (e.g., TPUs, some GPUs).
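
`torch.finfo` reports the numeric limits of each floating-point type, which makes it easy to compare their range and precision directly. A minimal sketch:

```python
import torch

for dtype in (torch.float32, torch.float16, torch.bfloat16):
    info = torch.finfo(dtype)
    # max: largest representable value; eps: gap between 1.0 and the next value
    print(dtype, "bits:", info.bits, "max:", info.max, "eps:", info.eps)
```

A larger `max` means a wider range, and a larger `eps` means coarser precision.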

---

### Changing Data Types
You can specify or change a tensor's data type using the `dtype` argument or methods like `.to()` or `.type()`:
```python
import torch

# Specifying data type at creation
tensor = torch.tensor([1.0, 2.0], dtype=torch.float64)

# Changing data type
tensor = tensor.to(torch.float32)
tensor = tensor.type(torch.int64)
```
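
Two details are worth knowing when converting. Conversions return a new tensor and leave the original unchanged, and converting floats to integers discards the fractional part. The sketch below also uses the shorthand methods `.half()` and `.long()`; the variable names are illustrative.

```python
import torch

floats = torch.tensor([1.7, -1.7, 2.0])

as_int = floats.to(torch.int64)   # fractional part is discarded
as_half = floats.half()           # shorthand for .to(torch.float16)
as_long = floats.long()           # shorthand for .to(torch.int64)

print(as_int)
print(as_half.dtype, as_long.dtype)
print(floats.dtype)               # the original tensor keeps its dtype
```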

---

### Choosing the Right Data Type
- **Training Models:** Use `torch.float32` or `torch.float16` for efficient computations.
- **Inference/Storage:** Use quantized types (e.g., `torch.qint8`) to save memory and improve speed.
- **Indexing:** Use `torch.int64` (`torch.long`).
- **Logical Operations:** Use `torch.bool`.


In [20]:
#float_32 tensor- this is the default tensor data type
#default device = cpu
#requires_grad=False

float_32_tensor = torch.tensor([3.0,6.0,9.0],
                               dtype=None,
                               device=None,
                               requires_grad=False)
float_32_tensor

tensor([3., 6., 9.])

In [21]:
# converting a tensor to a different data type

float_16_tensor = float_32_tensor.type(torch.float16)
float_16_tensor

tensor([3., 6., 9.], dtype=torch.float16)

In [22]:
# multiplying tensors with different data types (the result is promoted to float32)

float_16_tensor * float_32_tensor

tensor([ 9., 36., 81.])
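
The result above has no `dtype=torch.float16` label because PyTorch promoted it to `float32`. The sketch below checks the resulting data types explicitly and recreates the tensors so it can run on its own; `int_tensor` is an extra illustrative variable.

```python
import torch

float_32_tensor = torch.tensor([3.0, 6.0, 9.0])
float_16_tensor = float_32_tensor.type(torch.float16)
int_tensor = torch.tensor([1, 2, 3])

# Mixed-type operations return the "wider" of the two dtypes
print((float_16_tensor * float_32_tensor).dtype)
print((int_tensor * float_32_tensor).dtype)

# promote_types reports the dtype that two dtypes are promoted to
print(torch.promote_types(torch.float16, torch.float32))
```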

In [24]:
# getting information from tensors
float_32_tensor.shape, float_32_tensor.dtype, float_32_tensor.device


(torch.Size([3]), torch.float32, device(type='cpu'))