# PyTorch Basics Notebook

## Introduction
Welcome to the PyTorch Basics Notebook! In this notebook, we will explore the fundamental concepts and functionalities of PyTorch, a popular deep learning framework. PyTorch provides a flexible and intuitive way to build and train neural networks.

## Table of Contents
1. Tensor Basics
2. Tensor Operations
3. CUDA Tensors
4. Autograd and Gradients
5. Dataset and DataLoader
6. Neural Networks
7. Loss Functions and Optimizers
8. Training Loop
9. Model Evaluation
10. Saving and Loading Models
11. Transfer Learning
12. Data Augmentation
13. Distributed Training

## 1. Tensor Basics
PyTorch is built around tensors, which are similar to NumPy arrays but can also be used on GPUs for accelerated computing. Let's start by creating a tensor and exploring its properties.

### ELI5: Tensors
Imagine tensors as containers that can hold numbers. Just like you can have boxes of different sizes and shapes to store your toys, tensors can have different dimensions and sizes to store numbers. Tensors are the building blocks of PyTorch, and we use them to represent and manipulate data in our programs.


In [None]:
import torch

tensor = torch.tensor([[1, 2, 3], [4, 5, 6]], dtype=torch.float32)
print(tensor)
print(tensor.shape)
print(tensor.dtype)
print(tensor.device)
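
Since tensors are described above as similar to NumPy arrays, here is a minimal sketch of converting between the two. It assumes NumPy is installed, and the variable names are only illustrative.

```python
import numpy as np
import torch

array = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32)

# torch.from_numpy shares memory with the source array (no copy is made)
shared = torch.from_numpy(array)
# torch.tensor always copies the data
copied = torch.tensor(array)

# Modify the NumPy array in place
array[0, 0] = 10.0

print(shared[0, 0].item())  # reflects the change made to the array
print(copied[0, 0].item())  # keeps the original value

# Tensor.numpy() converts a CPU tensor back to an array, again sharing memory
back = shared.numpy()
print(back.shape)
```

Sharing memory avoids a copy, but it also means that changing one object changes the other, so `torch.tensor` is the safer choice when we want an independent copy.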

## 2. Tensor Operations
PyTorch provides a wide range of tensor operations that allow us to perform mathematical computations on tensors. Let's explore some common tensor operations.
### ELI5: Tensor Operations
Just like you can add, subtract, or multiply numbers, you can do similar operations on tensors. Tensor operations help us manipulate and transform the data stored in tensors. We can perform element-wise operations, matrix multiplication, and more.

In [None]:
tensor = torch.ones(3, 4)
print(tensor + 5)
print(tensor * 2)
print(torch.matmul(tensor, tensor.T))
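
To make the difference between element-wise operations and matrix multiplication more concrete, the sketch below tracks the shapes involved. The tensors `a` and `row` are made up for illustration.

```python
import torch

a = torch.ones(3, 4)
row = torch.tensor([1.0, 2.0, 3.0, 4.0])  # shape (4,)

# Element-wise ops keep the shape; the scalar 5 is broadcast to every entry
shifted = a + 5
print(shifted.shape)

# A (4,) tensor is broadcast across each of the 3 rows of a (3, 4) tensor
scaled = a * row
print(scaled)

# Matrix multiplication contracts the inner dimension: (3, 4) @ (4, 3) -> (3, 3)
product = a @ a.T
print(product.shape)
```

The `@` operator is shorthand for `torch.matmul`. Each entry of `product` is the dot product of two rows of ones of length 4, so every entry equals 4.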

## 3. CUDA Tensors
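PyTorch allows us to leverage the power of NVIDIA GPUs to accelerate computations. By moving tensors to a CUDA device, large parallel operations such as matrix multiplications typically run much faster than on the CPU. All tensors involved in an operation must live on the same device.
### ELI5: CUDA Tensors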
Imagine you have a superhero friend named CUDA who can help you do your tasks faster. When you give your toys (tensors) to CUDA, it can play with them and finish tasks quickly. In PyTorch, we can move our tensors to CUDA devices (like GPUs) to make our programs run faster.

In [None]:
if torch.cuda.is_available():
    device = torch.device("cuda")
    tensor = tensor.to(device)
    print(tensor)
else:
    print("CUDA not available")
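
A common way to write code that runs with or without a GPU is to pick the device once and reuse it. The sketch below is one possible pattern, not the only one; the tensor names are illustrative.

```python
import torch

# Fall back to the CPU when no CUDA device is present
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create one tensor directly on the device and move another one there
a = torch.ones(2, 3, device=device)
b = torch.ones(3, 2).to(device)

# Both operands live on the same device, so the operation is valid
c = a @ b

# Bring the result back to the CPU, e.g. for printing or NumPy conversion
result = c.cpu()
print(result)
```

Because the same `device` object is used everywhere, this cell behaves identically on machines with and without CUDA.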

## 4. Autograd and Gradients
PyTorch's autograd package provides automatic differentiation capabilities, which are essential for training neural networks. Let's see how we can use autograd to compute gradients.

### ELI5: Autograd and Gradients
Imagine you are hiking on a mountain, and you want to know which way is steepest. Gradients are like arrows that point in the direction of the steepest climb. In PyTorch, autograd finds these arrows (gradients) automatically. When we train a model, we usually walk in the opposite direction, downhill, because we want to reach the bottom of the valley (the lowest loss).


In [None]:
x = torch.tensor([[1., 2., 3.], [4., 5., 6.]], requires_grad=True)
y = x ** 2
z = y.sum()
z.backward()
print(x.grad)
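
We can check the result by hand: since `z` is the sum of every `x_ij ** 2`, the derivative of `z` with respect to each `x_ij` is `2 * x_ij`. The sketch below verifies this and also shows that gradients accumulate across `backward()` calls, which is why training loops reset them.

```python
import torch

x = torch.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], requires_grad=True)

# z = sum of x_ij ** 2, so dz/dx_ij = 2 * x_ij
z = (x ** 2).sum()
z.backward()
expected = 2 * x.detach()
first_ok = torch.allclose(x.grad, expected)
print(first_ok)

# A second backward pass adds to x.grad instead of overwriting it
(x ** 2).sum().backward()
accumulated_ok = torch.allclose(x.grad, 2 * expected)
print(accumulated_ok)

# Reset the gradient in place before the next computation
x.grad.zero_()
```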

## 5. Dataset and DataLoader
PyTorch provides the Dataset and DataLoader classes to efficiently load and preprocess data for training models. Let's create a custom dataset and use a DataLoader to iterate over it.
### ELI5: Dataset and DataLoader
Imagine you have a toy collection, and you want to play with a few toys at a time. A Dataset is like your entire toy collection, and a DataLoader is like a helper who brings you a few toys to play with in each turn. This way, you can enjoy your toys in smaller groups rather than playing with all of them at once.

In [None]:
from torch.utils.data import DataLoader, Dataset

class CustomDataset(Dataset):
    def __init__(self, data, labels):
        self.data = data
        self.labels = labels
    
    def __len__(self):
        return len(self.data)
    
    def __getitem__(self, idx):
        return self.data[idx], self.labels[idx]

data = torch.randn(100, 10)
labels = torch.randint(0, 5, (100,))
dataset = CustomDataset(data, labels)
dataloader = DataLoader(dataset, batch_size=32, shuffle=True)
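
To see what the helper actually hands us, we can loop over a DataLoader and look at the batch shapes. This sketch uses the built-in `TensorDataset` as a stand-in for `CustomDataset` so that it runs on its own; the names `demo_dataset` and `demo_loader` are illustrative.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# TensorDataset pairs up tensors along their first dimension
demo_dataset = TensorDataset(torch.randn(100, 10), torch.randint(0, 5, (100,)))
demo_loader = DataLoader(demo_dataset, batch_size=32, shuffle=True)

batch_sizes = []
for batch_data, batch_labels in demo_loader:
    # Each batch is a (batch_size, 10) tensor plus a (batch_size,) label tensor
    print(batch_data.shape, batch_labels.shape)
    batch_sizes.append(batch_data.shape[0])

# 100 samples with batch_size=32 give three full batches and one batch of 4
print(len(demo_loader), batch_sizes)
```

If the smaller last batch is a problem, `DataLoader` accepts `drop_last=True` to skip it.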

## 6. Neural Networks
PyTorch provides a module called nn to define and build neural networks. Let's create a simple neural network using the nn.Module class.
### ELI5: Neural Networks
Imagine a neural network as a machine that can learn to recognize patterns. It has different parts (layers) that work together to understand the input and make predictions. We can teach the machine by showing it many examples and adjusting its parts until it becomes really good at recognizing patterns.

In [None]:
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(10, 20)
        self.fc2 = nn.Linear(20, 5)
    
    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x

net = Net()
print(net)
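
A quick sanity check for a new network is to push a random batch through it and count its parameters. The sketch below assumes the same layer sizes as `Net` but builds them with `nn.Sequential` so it is self-contained; `sketch_net` is an illustrative name.

```python
import torch
import torch.nn as nn

# Same layer sizes as Net: 10 -> 20 -> 5 with a ReLU in between
sketch_net = nn.Sequential(nn.Linear(10, 20), nn.ReLU(), nn.Linear(20, 5))

# A batch of 32 samples with 10 features each
batch = torch.randn(32, 10)
logits = sketch_net(batch)
print(logits.shape)

# Each Linear layer has in_features * out_features weights plus out_features biases
num_params = sum(p.numel() for p in sketch_net.parameters())
print(num_params)
```

The count works out to (10 * 20 + 20) + (20 * 5 + 5) = 325 parameters.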

## 7. Loss Functions and Optimizers
Loss functions measure how well our model is performing, and optimizers help adjust the model's parameters to minimize the loss. PyTorch provides various loss functions and optimization algorithms.
### ELI5: Loss Functions and Optimizers
Imagine you are playing a game where you need to guess a number. The loss function is like a score that tells you how far your guess is from the correct number. The optimizer is like a guide who helps you adjust your guesses based on the score, so you can get closer to the correct number with each attempt.

In [None]:
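import torch.optim as optim

# CrossEntropyLoss expects raw logits and integer class labels
criterion = nn.CrossEntropyLoss()
# Plain stochastic gradient descent over all of the network's parameters
optimizer = optim.SGD(net.parameters(), lr=0.01)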
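
To see what the optimizer does with the loss, the sketch below performs a single SGD step and compares it with the update rule `weight - lr * grad` computed by hand. The small `layer`, the random inputs, and the targets are made up for illustration, and the comparison assumes plain SGD without momentum or weight decay.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)
layer = nn.Linear(3, 2)
criterion = nn.CrossEntropyLoss()
lr = 0.1
optimizer = optim.SGD(layer.parameters(), lr=lr)

inputs = torch.randn(4, 3)
targets = torch.tensor([0, 1, 1, 0])  # integer class labels

# Forward pass, then backward pass to fill in the gradients
optimizer.zero_grad()
loss = criterion(layer(inputs), targets)
loss.backward()

# Remember the weight and its gradient before the optimizer changes it
weight_before = layer.weight.detach().clone()
grad = layer.weight.grad.clone()
optimizer.step()

# Plain SGD update: new_weight = old_weight - lr * gradient
manual = weight_before - lr * grad
matches = torch.allclose(layer.weight.detach(), manual)
print(matches)
```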

## 8. Training Loop
The training loop is where we iterate over the dataset, perform forward and backward passes, and update the model's parameters. Let's implement a basic training loop.
### ELI5: Training Loop
Imagine you are teaching a robot to perform a task. The training loop is like repeatedly showing the robot how to do the task and letting it practice. Each time the robot practices, it learns from its mistakes and gets better at the task. We keep repeating this process until the robot becomes really good at performing the task.

In [None]:
for epoch in range(5):
    running_loss = 0.0
    for inputs, labels in dataloader:
        optimizer.zero_grad()
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    print(f"Epoch {epoch+1} loss: {running_loss / len(dataloader):.3f}")

## 9. Model Evaluation
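After training our model, we need to evaluate its performance on unseen data. To keep the notebook short, the cell below reuses the training `dataloader`, so the number it prints is training accuracy. In practice, evaluate on a held-out test set that the model never saw during training.
### ELI5: Model Evaluation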
Imagine you have taught a friend how to solve puzzles. To see how well your friend has learned, you give them new puzzles they haven't seen before and check how many they can solve correctly. This is like evaluating a trained model on a test dataset to see how well it performs on new, unseen data.

In [None]:
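net.eval()  # good practice: puts layers such as dropout and batch norm in inference mode
correct = 0
total = 0
with torch.no_grad():  # gradients are not needed for evaluation
    for inputs, labels in dataloader:
        outputs = net(inputs)
        predicted = outputs.argmax(dim=1)  # index of the highest logit per sample
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
print(f"Accuracy: {100 * correct / total:.2f}%")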
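
One way to obtain genuinely unseen data is to split the dataset before training. The sketch below is a minimal example under a few assumptions: an 80/20 split, a fixed seed, an untrained `nn.Linear` as a stand-in model, and a hypothetical helper named `accuracy`.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset, random_split

full_set = TensorDataset(torch.randn(100, 10), torch.randint(0, 5, (100,)))

# Split once, with a seeded generator so the split is reproducible
generator = torch.Generator().manual_seed(0)
train_set, test_set = random_split(full_set, [80, 20], generator=generator)
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)
test_loader = DataLoader(test_set, batch_size=32, shuffle=False)

def accuracy(model, loader):
    """Return the percentage of samples in `loader` that `model` classifies correctly."""
    model.eval()
    correct = 0
    total = 0
    with torch.no_grad():
        for inputs, labels in loader:
            predicted = model(inputs).argmax(dim=1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    return 100 * correct / total

stand_in = nn.Linear(10, 5)  # untrained stand-in for a trained network
test_accuracy = accuracy(stand_in, test_loader)
print(f"Test accuracy: {test_accuracy:.2f}%")
```

Training would use `train_loader` only, and `test_loader` would be reserved for the final check. With random labels like these, accuracy should hover around chance level no matter how long we train.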

## 10. Saving and Loading Models
We often need to save trained models for later use or share them with others. PyTorch allows us to save and load model state dictionaries.
### ELI5: Saving and Loading Models
Imagine you have built a really cool LEGO model, and you want to show it to your friends later. You can take apart the model and put the pieces in a special box (save the model) and then rebuild it later (load the model) to show your friends. This way, you don't have to build the model from scratch every time you want to show it.

In [None]:
torch.save(net.state_dict(), "model.pth")
loaded_net = Net()
loaded_net.load_state_dict(torch.load("model.pth"))
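
To convince ourselves that nothing is lost on the way, we can compare the parameters before and after a save and load round trip. This sketch uses a small `nn.Linear` model and a temporary directory so it leaves no files behind; both are illustrative choices.

```python
import os
import tempfile

import torch
import torch.nn as nn

original = nn.Linear(10, 5)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.pth")
    # Save only the parameters (the state dict), not the whole Python object
    torch.save(original.state_dict(), path)

    # The architecture must be rebuilt first, then the parameters loaded into it
    restored = nn.Linear(10, 5)
    restored.load_state_dict(torch.load(path, map_location="cpu"))

# Switch to inference mode before using the restored model for predictions
restored.eval()

same = all(
    torch.equal(a, b)
    for a, b in zip(original.state_dict().values(), restored.state_dict().values())
)
print(same)
```

Passing `map_location="cpu"` lets a checkpoint saved on a GPU machine load on a machine without one.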

## 11. Transfer Learning
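Transfer learning allows us to use pre-trained models and fine-tune them for our specific tasks. The torchvision package provides many pre-trained models that we can use as a starting point. Below, we load a ResNet-18 and replace its final fully connected layer with a new one that has 10 outputs.
### ELI5: Transfer Learning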
Imagine you want to learn to play a new musical instrument, but you already know how to play the piano. Instead of starting from scratch, you can use your existing knowledge of music and apply it to learn the new instrument faster. This is like transfer learning, where we use a pre-trained model that has already learned general features and adapt it to our specific task, making the learning process faster and easier.

In [None]:
import torchvision

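# `weights=` replaces the deprecated `pretrained=True` argument (torchvision 0.13 and later)
model = torchvision.models.resnet18(weights=torchvision.models.ResNet18_Weights.DEFAULT)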
num_ftrs = model.fc.in_features
model.fc = nn.Linear(num_ftrs, 10)
print(model)
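
A common variation on fine-tuning is to freeze the pre-trained layers and train only the new final layer. The sketch below shows the idea on a tiny stand-in model so that it runs without downloading any weights; `backbone`, `head`, and `sketch_model` are hypothetical names, not part of torchvision.

```python
import torch.nn as nn
import torch.optim as optim

# Hypothetical stand-in for the pre-trained layers of a larger model
backbone = nn.Sequential(nn.Linear(16, 8), nn.ReLU())

# Freeze the backbone so its parameters receive no gradient updates
for param in backbone.parameters():
    param.requires_grad = False

# New task-specific layer; freshly created layers are trainable by default
head = nn.Linear(8, 10)
sketch_model = nn.Sequential(backbone, head)

# Only the parameters of the new head remain trainable
trainable = [name for name, p in sketch_model.named_parameters() if p.requires_grad]
print(trainable)

# Hand the optimizer just the trainable parameters
optimizer = optim.SGD([p for p in sketch_model.parameters() if p.requires_grad], lr=0.01)
```

With the ResNet-18 above, the same loop over `model.parameters()` would go before the line that replaces `model.fc`, so that the new layer stays trainable.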

## 12. Data Augmentation
Data augmentation is a technique used to artificially increase the size and diversity of the training dataset by applying random transformations to the input data. PyTorch provides various transforms in the torchvision package.
### ELI5: Data Augmentation
Imagine you have a few toy cars, but you want to have more variety in your collection. You can use your imagination to create new cars by painting them different colors, adding stickers, or changing their wheels. This is like data augmentation, where we take existing data and apply different transformations to create new, varied examples for our model to learn from.

In [None]:
import torchvision.transforms as transforms

transform = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])
# Use a new name so the `dataset` from section 5 is not overwritten
cifar_dataset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
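
To show what a random transformation does under the hood, here is a hand-written stand-in for a horizontal flip in plain PyTorch. It is only a sketch of the idea, not torchvision's implementation, and the function name `random_horizontal_flip` is hypothetical.

```python
import torch

def random_horizontal_flip(image, p=0.5):
    """Flip a (C, H, W) image left to right with probability p."""
    if torch.rand(1).item() < p:
        # The last dimension is the width, so reversing it mirrors the image
        return torch.flip(image, dims=[-1])
    return image

# A tiny 1-channel, 3x4 "image" whose values make the flip easy to see
image = torch.arange(12, dtype=torch.float32).reshape(1, 3, 4)

always_flipped = random_horizontal_flip(image, p=1.0)
never_flipped = random_horizontal_flip(image, p=0.0)

print(image[0, 0])
print(always_flipped[0, 0])
```

Because the flip is decided anew on every call, the model sees a slightly different version of the same image in each epoch.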

## 13. Distributed Training
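Distributed training allows us to train models on multiple GPUs or machines to accelerate the training process. The cell below uses `nn.DataParallel`, which splits each batch across the GPUs of a single machine within one process. For training across several processes or machines, PyTorch provides `torch.nn.parallel.DistributedDataParallel`, which the PyTorch documentation recommends over `DataParallel` even on a single machine.
### ELI5: Distributed Training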
Imagine you and your friends want to build a big LEGO model together. Instead of building it alone, you can divide the work among your friends, and each person can work on a different part of the model simultaneously. This way, you can build the model much faster than if you were working alone. Distributed training is similar, where we split the training process across multiple GPUs or machines to make it faster and more efficient.

In [None]:
if torch.cuda.device_count() > 1:
    print(f"Using {torch.cuda.device_count()} GPUs for training")
    net = nn.DataParallel(net)

## Conclusion
Congratulations! You have completed the PyTorch Basics Notebook. You have learned about tensors, tensor operations, CUDA tensors, autograd, datasets, neural networks, training, evaluation, saving and loading models, transfer learning, data augmentation, and distributed training. PyTorch provides a powerful and flexible framework for building and training deep learning models. Keep exploring and building amazing projects with PyTorch!