<div>
<img src="https://education-team-2020.s3.eu-west-1.amazonaws.com/ai-eng/pytorch.png" alt="pytorch logo" width="1000"/>
</div>

# Introduction to PyTorch

### What is PyTorch?
PyTorch is an open-source machine learning library developed by Facebook's AI Research lab (FAIR). It provides a flexible framework for building and training neural networks, and it is widely used for research and production applications in areas such as computer vision, natural language processing, and reinforcement learning.

### Key Features and Advantages:
- Dynamic computation graph: PyTorch uses a dynamic computational graph approach, allowing for more flexible and intuitive model development compared to static graph frameworks.
- Easy debugging: PyTorch's dynamic nature enables easier debugging and error handling during model development.
- Pythonic interface: PyTorch provides a Pythonic interface that makes it easy to use and integrate with other Python libraries and frameworks.
- GPU acceleration: PyTorch seamlessly supports GPU acceleration, allowing for efficient training and inference on NVIDIA GPUs.
- Rich ecosystem: PyTorch has a rich ecosystem of libraries and tools for tasks such as data loading, model deployment, and visualization.

### Installation and Setup:
To install PyTorch, you can use pip, conda, or build from source. Here's how to install PyTorch using pip:

In [None]:
!pip install torch torchvision

This command installs both PyTorch and torchvision, which is a PyTorch package containing popular datasets, model architectures, and common image transformations.

For more detailed installation instructions and platform-specific considerations, refer to the official PyTorch documentation: **PyTorch Installation Guide**

Once PyTorch is installed, you can start using it in your Python environment by importing the torch module:

In [None]:
import torch

Now you're ready to explore the powerful capabilities of PyTorch for building and training neural networks!

## 1. Tensors and Operations with PyTorch

### Basics and Properties:
- **Tensors**: Tensors are the fundamental data structure in PyTorch, similar to arrays in NumPy. They can be scalars, vectors, matrices, or higher-dimensional arrays.
- **Properties**: 
  - Tensors have a data type (e.g., float32, int64) and a shape (e.g., [3, 4] for a 3x4 matrix).
  - Tensors can be stored on different devices, such as CPU or GPU.
  - PyTorch tensors support automatic differentiation using the autograd module.
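
The properties listed above can be inspected directly on any tensor. The snippet below is a minimal sketch; the variable names `t` and `t_int` are chosen only for illustration.

```python
import torch

# A 3x4 tensor of ones with an explicit data type
t = torch.ones(3, 4, dtype=torch.float32)

print(t.dtype)          # torch.float32
print(t.shape)          # torch.Size([3, 4])
print(t.device)         # cpu, unless the default device has been changed
print(t.requires_grad)  # False: autograd tracking is opt-in

# Converting the data type returns a new tensor and leaves the original unchanged
t_int = t.to(torch.int64)
print(t_int.dtype)      # torch.int64
```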

### Examples of Creating and Manipulating Tensors:

In [None]:
# 1. Create a tensor from a list
tensor1 = torch.tensor([1, 2, 3])

# 2. Create a tensor of zeros
tensor2 = torch.zeros(2, 3)

# 3. Create a tensor of random values
tensor3 = torch.randn(3, 3)

# 4. Accessing and modifying tensor elements
print(tensor3[0, 0])  # Access element at row 0, column 0
tensor3[1, 1] = 0      # Modify element at row 1, column 1

# 5. Reshaping tensors
tensor4 = torch.arange(9).reshape(3, 3)

In [None]:
print(tensor1)
print(tensor2)
print(tensor3)
print(tensor4)

### Tensor Operations:
- Arithmetic Operations: PyTorch tensors support arithmetic operations such as addition, subtraction, multiplication, and division, both element-wise and with scalar values.
- Indexing and Slicing: Tensors can be indexed and sliced to access specific elements or sub-tensors.
- Broadcasting: PyTorch automatically broadcasts tensors of different shapes during arithmetic operations, allowing for efficient element-wise operations.
  

 
#### Arithmetic operations

In [None]:
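# tensor2 has shape (2, 3) and cannot be added to the 3x3 tensor3, so use two 3x3 tensors
result = tensor3 + tensor4  # Element-wise addition
result = tensor1 * 2        # Multiply every element by a scalar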

In [None]:
# Indexing and slicing: all rows, first two columns
subset = tensor4[:, :2]

In [None]:
# Broadcasting
tensor5 = torch.tensor([1, 2, 3])
result = tensor4 + tensor5.reshape(3, 1)  # Broadcasting tensor5 to match the shape of tensor4
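
To make the broadcasting rule concrete: shapes are compared from the last dimension backwards, and two dimensions are compatible when they are equal, when one of them is 1, or when one of them is missing. The sketch below uses illustrative tensors (`a`, `col`, `row`) and also shows a pair of shapes that cannot be broadcast.

```python
import torch

a = torch.arange(9).reshape(3, 3)             # shape (3, 3)
col = torch.tensor([1, 2, 3]).reshape(3, 1)   # shape (3, 1)
row = torch.tensor([1, 2, 3])                 # shape (3,)

print((a + col).shape)  # torch.Size([3, 3]): col is repeated across the columns
print((a + row).shape)  # torch.Size([3, 3]): row is repeated across the rows

# Shapes (2, 3) and (3, 3) are incompatible: the first dimensions are 2 and 3
try:
    torch.zeros(2, 3) + torch.zeros(3, 3)
    broadcast_failed = False
except RuntimeError:
    broadcast_failed = True
print(broadcast_failed)  # True
```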

## 2. Autograd: Automatic Differentiation

### Introduction to Automatic Differentiation:
Automatic differentiation is a technique used in machine learning frameworks to automatically compute gradients of functions with respect to their input variables. It plays a crucial role in training neural networks by enabling efficient optimization algorithms such as gradient descent.

### How Autograd Works in PyTorch:
PyTorch's autograd package provides automatic differentiation capabilities, allowing users to compute gradients of tensors with respect to some scalar loss function. It works by dynamically building a computational graph during the forward pass of the network and then efficiently computing gradients using the chain rule during the backward pass.

### Computing Gradients with Autograd:
To compute gradients in PyTorch, first set `requires_grad=True` on the tensors whose gradients you need. Then run the forward pass, compute a scalar loss, and call the `backward()` method on the loss tensor. Autograd applies the chain rule and stores the result in the `.grad` attribute of every leaf tensor that has `requires_grad=True`. Gradients accumulate across calls to `backward()`, which is why training loops reset them before each step.

In [None]:
# Define tensors with requires_grad=True
x = torch.tensor(3.0, requires_grad=True)
y = torch.tensor(4.0, requires_grad=True)

# Perform forward pass operations
z = x * y

# Compute scalar loss function
loss = z**2

# Compute gradients
loss.backward()

# Access gradients
print(x.grad)  # Gradient of loss w.r.t. x
print(y.grad)  # Gradient of loss w.r.t. y
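
The printed gradients can be checked by hand. With loss = (x·y)², the chain rule gives ∂loss/∂x = 2·x·y² and ∂loss/∂y = 2·x²·y, which is 96 and 72 for x = 3, y = 4. The sketch below repeats the computation and compares autograd's result with these hand-derived values; the `expected_dx` and `expected_dy` names are illustrative.

```python
import torch

x = torch.tensor(3.0, requires_grad=True)
y = torch.tensor(4.0, requires_grad=True)

loss = (x * y) ** 2
loss.backward()

# Hand-derived gradients from the chain rule
expected_dx = 2 * x.item() * y.item() ** 2   # 2 * x * y^2
expected_dy = 2 * x.item() ** 2 * y.item()   # 2 * x^2 * y

print(x.grad.item(), expected_dx)  # 96.0 96.0
print(y.grad.item(), expected_dy)  # 72.0 72.0
```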

## 3. Building Neural Networks

### Components of Neural Networks in PyTorch:
Neural networks in PyTorch are composed of various components, including layers, activation functions, loss functions, and optimizers. These components work together to define the architecture, train the model, and make predictions.

### Defining Network Architecture using nn.Module:
In PyTorch, neural network architectures are defined using the `nn.Module` class. This class serves as a base class for all neural network modules and provides methods for defining and organizing the components of the network.

In [None]:
```python
import torch.nn as nn

class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        # Define layers and other components
        self.fc1 = nn.Linear(in_features=784, out_features=128)
        self.fc2 = nn.Linear(in_features=128, out_features=10)
        self.relu = nn.ReLU()

    def forward(self, x):
        # Define forward pass operations
        x = self.fc1(x)
        x = self.relu(x)
        # Return raw logits. nn.CrossEntropyLoss applies log-softmax internally,
        # so a softmax layer here would be applied twice during training.
        # For probabilities at inference time, use torch.softmax(logits, dim=1).
        x = self.fc2(x)
        return x
```

### Implement Forward Pass and Backward Pass:
The forward pass of a neural network is implemented in the `forward()` method of the `nn.Module` subclass. This method defines the sequence of operations that transform input data into output predictions. You do not write the backward pass yourself: during training, calling `backward()` on the loss lets autograd compute the gradients of the loss with respect to the network parameters.

##### Instantiate the neural network

In [None]:
model = NeuralNetwork()

# Forward pass
input_data = torch.randn(64, 784)
output_predictions = model(input_data)

# Dummy class labels (integers in [0, 10)) standing in for real targets
target_labels = torch.randint(0, 10, (64,))

# Backward pass (gradient computation)
loss_function = nn.CrossEntropyLoss()
loss = loss_function(output_predictions, target_labels)
loss.backward()

### Model Initialization and Parameter Access:
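Neural network parameters are initialized when the model is instantiated. They can be accessed with the `parameters()` method, which returns an iterator over all model parameters, or with `named_parameters()`, which also yields each parameter's name. These iterators are useful for optimization and inspection. To apply a custom initialization, pass a function to `model.apply()`, which calls it on every submodule.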

In [None]:
# Initialize model parameters
def initialize_weights(m):
    if isinstance(m, nn.Linear):
        nn.init.xavier_uniform_(m.weight)
        nn.init.zeros_(m.bias)

model.apply(initialize_weights)

# Access model parameters
for name, param in model.named_parameters():
    print(name, param.size())
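
A common use of the parameter iterator is counting parameters. The sketch below rebuilds the same layer sizes (784 → 128 → 10) with `nn.Sequential`; the name `mlp` is illustrative and is not part of the notebook's model.

```python
import torch.nn as nn

# Same layer sizes as the NeuralNetwork class above, written with nn.Sequential
mlp = nn.Sequential(
    nn.Linear(784, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
)

total = sum(p.numel() for p in mlp.parameters())
trainable = sum(p.numel() for p in mlp.parameters() if p.requires_grad)

print(total)      # 101770 = (784*128 + 128) + (128*10 + 10)
print(trainable)  # 101770, since no parameter has been frozen
```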

By leveraging the nn.Module class and its methods, you can easily define, train, and access neural network architectures in PyTorch.

## 4. Training Models in PyTorch

### Overview of the Training Process:
Training a model in PyTorch involves iteratively feeding input data through the network, computing the loss, and updating the network parameters to minimize the loss. This process typically consists of the following steps:

### Dataset and DataLoader:
- **Dataset**: Prepare your dataset by creating a custom dataset class that inherits from `torch.utils.data.Dataset`. This class should implement methods to load and preprocess the data.
- **DataLoader**: Use the `torch.utils.data.DataLoader` class to create batches of data from your dataset. This class provides utilities for efficient data loading and batching, including shuffling and parallelizing data loading.
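
The two classes fit together as sketched below. `RandomRegressionDataset` is a hypothetical dataset made up for this example; it generates random features and targets rather than loading real data.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class RandomRegressionDataset(Dataset):
    """Hypothetical dataset: random features with one scalar target per sample."""

    def __init__(self, num_samples=100, num_features=10):
        self.features = torch.randn(num_samples, num_features)
        self.targets = torch.randn(num_samples, 1)

    def __len__(self):
        # Number of samples; DataLoader uses this to plan the batches
        return len(self.features)

    def __getitem__(self, idx):
        # Return one (input, target) pair
        return self.features[idx], self.targets[idx]

dataset = RandomRegressionDataset()
loader = DataLoader(dataset, batch_size=32, shuffle=True)

# 100 samples with batch_size=32 give three batches of 32 and a final batch of 4
for batch_features, batch_targets in loader:
    print(batch_features.shape, batch_targets.shape)
```

Passing `drop_last=True` to `DataLoader` would discard the final, smaller batch.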

### Defining Loss Functions and Optimizers:
- **Loss Functions**: Choose an appropriate loss function based on your task (e.g., classification, regression). PyTorch provides a wide range of loss functions in the `torch.nn` module, such as `nn.CrossEntropyLoss` for classification tasks and `nn.MSELoss` for regression tasks.
- **Optimizers**: Select an optimization algorithm to update the network parameters. PyTorch provides various optimizers in the `torch.optim` module, including SGD, Adam, and RMSprop. Initialize the optimizer by passing the network parameters and specifying the learning rate.
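
One detail worth checking in code: `nn.CrossEntropyLoss` expects raw, unnormalized scores (logits) and integer class indices, because it applies log-softmax internally. The sketch below verifies that equivalence on made-up data and then constructs an optimizer; `layer` and `adam` are illustrative names.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

torch.manual_seed(0)
logits = torch.randn(4, 3)            # raw scores for 4 samples and 3 classes
labels = torch.tensor([0, 2, 1, 2])   # class indices (dtype int64)

ce = nn.CrossEntropyLoss()(logits, labels)
# Equivalent formulation: log-softmax followed by negative log-likelihood
nll = F.nll_loss(F.log_softmax(logits, dim=1), labels)
print(torch.allclose(ce, nll))  # True

# An optimizer takes the parameters to update and a learning rate
layer = nn.Linear(10, 3)
adam = optim.Adam(layer.parameters(), lr=1e-3)
print(adam.param_groups[0]['lr'])  # 0.001
```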

### Training Loop:
- **Forward Pass**: Iterate over the batches of data and pass them through the network to compute the output predictions.
- **Compute Loss**: Calculate the loss between the predicted output and the ground truth labels using the chosen loss function.
- **Backward Pass**: Use automatic differentiation (autograd) to compute the gradients of the loss function with respect to the network parameters.
- **Parameter Updates**: Update the network parameters using the gradients and the selected optimization algorithm.
- **Repeat**: Iterate over the dataset multiple times (epochs) until convergence or a specified number of epochs.

### Monitoring Training Progress and Evaluating Model Performance:
- **Logging**: Track and log relevant metrics during training, such as loss and accuracy, to monitor the training progress.
- **Validation Set**: Split your dataset into training and validation sets to evaluate the model's performance on unseen data. Use the validation set to tune hyperparameters and prevent overfitting.
- **Evaluation Metrics**: Choose appropriate evaluation metrics based on your task (e.g., accuracy, precision, recall) to assess the model's performance on the validation set.
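
A validation pass can be sketched as follows. The data is random and the names (`full_dataset`, `eval_model`, `val_loss`) are assumptions made for this example; the 80/20 split is likewise arbitrary.

```python
import torch
import torch.nn as nn
from torch.utils.data import TensorDataset, DataLoader, random_split

torch.manual_seed(0)
full_dataset = TensorDataset(torch.randn(100, 10), torch.randn(100, 1))
train_set, val_set = random_split(full_dataset, [80, 20])
val_loader = DataLoader(val_set, batch_size=32)

eval_model = nn.Linear(10, 1)
sum_squared_error = nn.MSELoss(reduction='sum')

eval_model.eval()          # switch layers such as dropout to inference behaviour
total_loss = 0.0
with torch.no_grad():      # gradients are not needed for evaluation
    for val_inputs, val_targets in val_loader:
        total_loss += sum_squared_error(eval_model(val_inputs), val_targets).item()

# Each sample has one target value, so dividing by the sample count gives the MSE
val_loss = total_loss / len(val_set)
print(f'Validation MSE: {val_loss:.4f}')
```

Summing per-batch losses and dividing once at the end keeps the average correct even when the last batch is smaller than the others.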

By following these steps, you can effectively train models in PyTorch for a variety of machine learning tasks.

### Example: Training Loop in PyTorch

In [None]:
import torch.optim as optim

# Define a simple neural network
class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        self.fc = nn.Linear(10, 1)

    def forward(self, x):
        return self.fc(x)

# Instantiate the model
model = SimpleNN()

# Define dummy input and target data
inputs = torch.randn(32, 10)
targets = torch.randn(32, 1)

# Define loss function and optimizer
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

# Training loop
num_epochs = 10
for epoch in range(num_epochs):
    # Forward pass
    outputs = model(inputs)
    
    # Compute loss
    loss = criterion(outputs, targets)
    
    # Backward pass
    optimizer.zero_grad()  # Zero gradients
    loss.backward()        # Compute gradients
    
    # Parameter updates
    optimizer.step()       # Update parameters
    
    # Print training progress
    print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')

The code snippet above demonstrates a simple training loop in PyTorch. It includes the forward pass, backward pass (gradient computation), and parameter updates using the Stochastic Gradient Descent (SGD) optimizer. The model is a simple neural network with one linear layer. Because the example uses a single fixed batch of dummy data, each epoch is one gradient step: the model processes the input data, computes the loss, resets the gradients left over from the previous step, backpropagates, and updates the parameters.

## 5. Advanced Concepts

### Custom Layers and Loss Functions

#### Custom Layers and Activation Functions:
Creating custom layers and activation functions in PyTorch allows you to extend the functionality of the library and tailor neural network architectures to your specific needs. Here's how you can create custom layers and activation functions:

In [None]:
import torch
import torch.nn as nn
import torch.nn.functional as F

# Custom layer example
class CustomLayer(nn.Module):
    def __init__(self, input_size, output_size):
        super(CustomLayer, self).__init__()
        self.linear = nn.Linear(input_size, output_size)
        self.activation = nn.ReLU()

    def forward(self, x):
        return self.activation(self.linear(x))

# Custom activation function example
class CustomActivation(nn.Module):
    def __init__(self):
        super(CustomActivation, self).__init__()

    def forward(self, x):
        return torch.sin(x)  # Custom activation function (sine)

#### Custom Loss Functions:
In addition to built-in loss functions provided by PyTorch, you can create custom loss functions tailored to specific tasks or objectives. Here's an example of how you can implement a custom loss function:

In [None]:
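# Custom loss function example
def custom_loss(output, target):
    return torch.mean((output - target) ** 2)  # Mean squared error loss

# Usage: call the function on predictions and targets to get a scalar loss
output = torch.randn(32, 1, requires_grad=True)  # Stand-in for model predictions
target = torch.randn(32, 1)
loss = custom_loss(output, target)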

By creating custom layers, activation functions, and loss functions, you can enhance the flexibility and expressiveness of your neural network models in PyTorch, enabling you to tackle a wide range of machine learning tasks with precision and efficiency.
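
Custom modules compose with built-in ones like any other layer. The sketch below defines a sine activation in the same spirit as `CustomActivation` (the name `SineActivation` and the layer sizes are assumptions for this example) and checks that gradients flow through it.

```python
import torch
import torch.nn as nn

class SineActivation(nn.Module):
    """Element-wise sine, mirroring the custom activation shown earlier."""

    def forward(self, x):
        return torch.sin(x)

net = nn.Sequential(
    nn.Linear(10, 16),
    SineActivation(),
    nn.Linear(16, 1),
)

out = net(torch.randn(4, 10))
print(out.shape)                 # torch.Size([4, 1])

out.sum().backward()             # autograd differentiates through torch.sin
print(net[0].weight.grad.shape)  # torch.Size([16, 10])
```

Because the activation is built from differentiable tensor operations, no hand-written backward pass is needed.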

#### Custom Optimizers

Implementing custom optimization algorithms beyond built-in optimizers in PyTorch allows for fine-tuning optimization strategies to specific use cases or research objectives. Here's how you can create a custom optimizer:

In [None]:
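import torch
import torch.optim as optim

# Custom optimizer example
class CustomOptimizer(optim.Optimizer):
    def __init__(self, params, lr=0.01):
        defaults = dict(lr=lr)
        super(CustomOptimizer, self).__init__(params, defaults)

    @torch.no_grad()  # Parameter updates must not be tracked by autograd
    def step(self, closure=None):
        loss = None
        if closure is not None:
            with torch.enable_grad():  # The closure re-evaluates the model and needs gradients
                loss = closure()

        for group in self.param_groups:
            for param in group['params']:
                if param.grad is None:
                    continue
                # Custom optimization step: param <- param - lr * grad
                # (the older add_(scalar, tensor) call signature is no longer supported)
                param.add_(param.grad, alpha=-group['lr'])
        return loss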

# Usage:
optimizer = CustomOptimizer(model.parameters(), lr=0.01)

In the example above, `CustomOptimizer` is a subclass of `torch.optim.Optimizer` whose `step()` method applies plain gradient descent: each parameter is decreased by the learning rate times its gradient. You can replace this update rule with your own, based on the gradients of the parameters.
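
Since that update rule is plain gradient descent, one way to sanity-check a custom optimizer is to compare it with the built-in `optim.SGD` (without momentum) on identical models. The sketch below does this with a compact optimizer called `PlainGD`, a hypothetical name; for brevity its `step()` ignores the optional closure.

```python
import torch
import torch.nn as nn
import torch.optim as optim

class PlainGD(optim.Optimizer):
    """Hypothetical optimizer: param <- param - lr * grad."""

    def __init__(self, params, lr=0.01):
        super().__init__(params, dict(lr=lr))

    @torch.no_grad()
    def step(self, closure=None):
        for group in self.param_groups:
            for p in group['params']:
                if p.grad is not None:
                    p.add_(p.grad, alpha=-group['lr'])

torch.manual_seed(0)
model_a = nn.Linear(10, 1)
model_b = nn.Linear(10, 1)
model_b.load_state_dict(model_a.state_dict())  # start from identical weights

xb, yb = torch.randn(8, 10), torch.randn(8, 1)
pairs = [(model_a, PlainGD(model_a.parameters(), lr=0.1)),
         (model_b, optim.SGD(model_b.parameters(), lr=0.1))]
for m, opt in pairs:
    opt.zero_grad()
    nn.functional.mse_loss(m(xb), yb).backward()
    opt.step()

print(torch.allclose(model_a.weight, model_b.weight))  # True
```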

By creating custom optimizers, you can experiment with novel optimization algorithms or fine-tune existing ones to improve the training process and achieve better performance for your machine learning models in PyTorch.

#### GPU Acceleration

Utilizing GPUs for accelerated computation in PyTorch can significantly speed up training and inference processes, especially for deep learning models with large datasets. Here's how you can move tensors and models to the GPU and back:

In [None]:
import torch

# Check if GPU is available
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Example: Moving tensors to the selected device
tensor_cpu = torch.randn(32, 64)  # Create a tensor on CPU
tensor_gpu = tensor_cpu.to(device)  # On the GPU if one is available, otherwise still on the CPU

In the example above, `torch.cuda.is_available()` checks whether a CUDA-capable GPU is available, and `tensor.to(device)` returns a copy of the tensor on the specified device (GPU or CPU). If the tensor is already on that device, it is returned unchanged. Similarly, you can move entire models to the GPU:

In [None]:
import torch
import torch.nn as nn

# Define a simple neural network
class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        self.fc = nn.Linear(10, 1)

    def forward(self, x):
        return self.fc(x)

# Instantiate the model
model = SimpleNN()
model.to(device)  # Move model to GPU

By moving both tensors and models to the GPU, you can leverage the parallel processing power of GPUs to accelerate computations and improve the efficiency of your machine learning workflows in PyTorch.
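
The model and its inputs must live on the same device, and results are usually brought back to the CPU before further processing outside PyTorch. The following sketch shows that round trip; `linear`, `batch`, and `outputs_cpu` are illustrative names, and the code falls back to the CPU when no GPU is present.

```python
import torch
import torch.nn as nn

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

linear = nn.Linear(10, 1).to(device)
batch = torch.randn(32, 10).to(device)   # inputs must be on the model's device

with torch.no_grad():
    outputs = linear(batch)

# Move the result back to the CPU, for example before logging or plotting
outputs_cpu = outputs.cpu()
print(outputs.device, outputs_cpu.device)
print(outputs_cpu.shape)  # torch.Size([32, 1])
```

Mixing devices, such as feeding a CPU tensor to a model on the GPU, raises a `RuntimeError`.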

#### Distributed Training

Distributed training with PyTorch enables efficient utilization of multiple GPUs or even multiple machines to accelerate model training and handle larger datasets. PyTorch provides two main approaches for multi-GPU training: `DataParallel`, which runs in a single process on one machine, and `DistributedDataParallel`, which runs one process per GPU and also works across machines.

##### Overview of Distributed Training:
Distributed training involves splitting the training workload across multiple devices (GPUs or machines) and coordinating the communication between these devices to update model parameters efficiently. PyTorch supports distributed training via its `torch.distributed` package, which provides utilities for parallelism and communication.

##### Multi-GPU Training with DataParallel:
`DataParallel` is a simple and easy-to-use approach for multi-GPU training on a single machine. It replicates the model across the available GPUs and splits each input batch along the batch dimension, so that each GPU processes a portion of the data independently. Here's how you can use `DataParallel`:

In [None]:
# Define a simple neural network
class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        self.fc = nn.Linear(10, 1)

    def forward(self, x):
        return self.fc(x)

# Instantiate the model
model = SimpleNN()

# Wrap the model with DataParallel
model = nn.DataParallel(model)
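
A wrapped model is called like the original one. The sketch below is a minimal usage example under the assumption that the wrapped module's parameters are placed on the first GPU when one is available; `net` and `batch` are illustrative names. On a machine without GPUs, the wrapper simply forwards to the underlying module.

```python
import torch
import torch.nn as nn

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# The wrapped module's parameters must be on the first GPU when GPUs are used
net = nn.DataParallel(nn.Linear(10, 1)).to(device)

batch = torch.randn(64, 10).to(device)
out = net(batch)          # the batch is split across the available GPUs
print(out.shape)          # torch.Size([64, 1])

# The original module is available as .module, e.g. for saving its state_dict
print(type(net.module).__name__)  # Linear
```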

##### Multi-GPU Training with DistributedDataParallel:
`DistributedDataParallel` (DDP) is a more advanced approach that provides better scalability and flexibility for distributed training. It runs one process per GPU and uses the `torch.distributed` package to synchronize gradients and model parameters across those processes. Here's how you can use `DistributedDataParallel`:

In [None]:
import torch
import torch.nn as nn
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

# Initialize distributed training
dist.init_process_group(backend='nccl', init_method='...')
rank = dist.get_rank()
world_size = dist.get_world_size()

# Define a simple neural network
class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        self.fc = nn.Linear(10, 1)

    def forward(self, x):
        return self.fc(x)

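# Instantiate the model and move it to this process's GPU
# (using rank as the device index assumes one process per GPU on a single machine)
model = SimpleNN().to(rank)

# Wrap the model with DistributedDataParallel
model = DDP(model, device_ids=[rank])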

By utilizing `DataParallel` or `DistributedDataParallel`, you can scale your PyTorch training across multiple GPUs, and with `DistributedDataParallel` across multiple machines, improving training speed and efficiency.

#### Model Deployment

Deploying trained models for inference in production systems requires exporting the model and integrating it with deployment platforms. PyTorch provides several methods for exporting trained models and offers integration options for various deployment scenarios.

##### Exporting Trained Models for Inference:
PyTorch provides facilities for exporting trained models to formats suitable for inference in production environments, such as ONNX (Open Neural Network Exchange) and TorchScript. Here's how you can export a trained model to ONNX format:

In [None]:
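import torch
import torchvision.models as models

# Load a pre-trained model (in torchvision 0.13 and later, the `weights`
# argument replaces the deprecated `pretrained=True`)
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.eval()  # Use inference behaviour for batch normalization before exporting

# Export the model to ONNX format
dummy_input = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy_input, "resnet18.onnx", verbose=True)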
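
TorchScript, the other export format mentioned above, can be sketched in a similar way. The example below traces a small stand-in network rather than ResNet-18 to keep it quick; `small_model` and the file name `small_model.pt` are assumptions made for this example.

```python
import os
import tempfile

import torch
import torch.nn as nn

small_model = nn.Sequential(nn.Linear(10, 4), nn.ReLU(), nn.Linear(4, 1))
small_model.eval()

# Tracing records the operations executed for the example input
example_input = torch.randn(1, 10)
traced = torch.jit.trace(small_model, example_input)

# Save to a temporary directory (hypothetical file name)
path = os.path.join(tempfile.mkdtemp(), "small_model.pt")
traced.save(path)

# The saved file can be loaded without the original Python class definition
loaded = torch.jit.load(path)
print(torch.allclose(loaded(example_input), small_model(example_input)))  # True
```

Tracing only captures the code path taken for the example input, so models with data-dependent control flow are usually converted with `torch.jit.script` instead.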

##### Integration with Production Systems and Deployment Platforms:
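Once the model is exported, it can be integrated into production systems using various deployment platforms and frameworks, such as TensorFlow Serving, Amazon SageMaker, or Microsoft Azure ML. Note that TensorFlow Serving loads models in TensorFlow's SavedModel format, so a PyTorch model must first be converted (for example, from the exported ONNX file) before it can be served this way. Here's an example of starting TensorFlow Serving for such a converted model: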

In [None]:
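# Install the TensorFlow Serving client API
# (the tensorflow_model_server binary is installed separately, for example via apt or Docker)
!pip install tensorflow-serving-api

# Start TensorFlow Serving (the model directory must contain a TensorFlow SavedModel)
!tensorflow_model_server --port=8500 --rest_api_port=8501 --model_name=resnet18 --model_base_path=/path/to/model/directory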

After starting TensorFlow Serving, you can send inference requests to the server using REST API calls or gRPC requests.

Alternatively, you can deploy PyTorch models using cloud-based deployment platforms, such as Amazon SageMaker or Microsoft Azure ML, which provide managed services for deploying and serving machine learning models in production environments.

By exporting trained models and integrating them with production systems and deployment platforms, you can deploy PyTorch models for inference in real-world applications and scale your machine learning solutions effectively.