# Introduction to PyTorch

This notebook was adapted from [Stanford's CS224N PyTorch Tutorial](https://github.com/SunnyHaze/Stanford-CS224N-NLP/blob/main/CS224N%20PyTorch%20Tutorial.ipynb) by Dilara Soylu and from the official [PyTorch 60 Minute Blitz Tutorial](https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html).

We will give a basic introduction to `PyTorch` and tensors and show how to use them to create, train, and evaluate neural networks. At the end, we will build, train, and evaluate our first classifier by classifying the two moons dataset!

## Introduction
[PyTorch](https://pytorch.org/) is a machine learning framework that is used in both academia and industry for various applications. PyTorch started off as a more flexible alternative to [TensorFlow](https://www.tensorflow.org/), which is another popular machine learning framework. At the time of its release, `PyTorch` appealed to users because it was user-friendly: instead of defining a static graph before performing an operation, as in `TensorFlow`, `PyTorch` allowed users to define their operations as they go. `TensorFlow` adopted this approach as well in its later releases. Although `TensorFlow` is more widely preferred in industry, `PyTorch` is often the preferred machine learning framework for researchers.

Now that we have learned enough about the background of `PyTorch`, let's start by importing it into our notebook.

In [150]:
import torch
import torch.nn as nn # contains functionality for building neural networks
import numpy as np

As in the last notebook, we can use `__version__` to check the `PyTorch` version that Colab is running.

In [151]:
torch.__version__

'2.6.0+cu124'

PyTorch is open source and the documentation can be accessed [here](https://pytorch.org/docs/stable/index.html). With it imported, we can get started!

## Tensors

Tensors are the most basic building blocks in `PyTorch`. Tensors are similar to matrices, but they have extra properties and can have more than two dimensions. For example, a square RGB image with 256 pixels on each side can be represented by a 3x256x256 tensor, where the first dimension (of size 3) represents the color channels R, G, and B. In `PyTorch`, we often use tensors to encode the inputs and outputs of a neural network model, as well as the model's parameters, in a numeric format that the architecture can process. Tensors can also run on GPUs, which accelerates, for example, network training.
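As a small illustration of the image example, the sketch below builds a placeholder tensor with the shape of a 256x256 RGB image. The name `image` and the random pixel values are assumptions made purely for illustration.

```python
import torch

# Hypothetical stand-in for a 256x256 RGB image: channels x height x width
image = torch.rand(3, 256, 256)

print(image.shape)  # torch.Size([3, 256, 256])
print(image.ndim)   # 3 dimensions: channel, height, width

# Selecting the first channel leaves a 256x256 matrix
first_channel = image[0]
print(first_channel.shape)  # torch.Size([256, 256])
```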

### Tensor Initialization

There are several ways to instantiate tensors:

**Directly from data**

Tensors can be created directly from data. The data type is
automatically inferred.

In [152]:
data = [[1, 2], [3, 4]]
x_data = torch.tensor(data)
x_data

tensor([[1, 2],
        [3, 4]])

In [153]:
print(type(x_data)) # prints the type of the data structure, i.e. tensor
print(x_data.dtype) # print type of elements in the tensor

<class 'torch.Tensor'>
torch.int64


We can also specify the data type (`dtype`) directly:

In [154]:
# We are using the dtype to create a float tensor
x_float = torch.tensor(data, dtype=torch.float)
x_float.dtype

torch.float32
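To make the inference rule more concrete, the following sketch shows how a single float in the data changes the inferred `dtype`, and how an existing tensor can be converted afterwards with `.to()`. The variable names are chosen for illustration only.

```python
import torch

# A single float in the data makes the inferred dtype float32
x_mixed = torch.tensor([[1, 2.5], [3, 4]])
print(x_mixed.dtype)  # torch.float32

# Convert an existing integer tensor to another dtype with .to()
x_int = torch.tensor([[1, 2], [3, 4]])
x_as_float = x_int.to(torch.float64)
print(x_int.dtype, x_as_float.dtype)  # torch.int64 torch.float64
```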

**From a Python List**

We can initialize a tensor from a Python list, which may contain sublists. The dimensions and the data type are automatically inferred by PyTorch when we use `torch.tensor()`.

In [155]:
# Initialize a tensor from a Python List
data = [
        [0, 1],
        [2, 3],
        [4, 5]
       ]
x_python = torch.tensor(data)

# Print the tensor
x_python

tensor([[0, 1],
        [2, 3],
        [4, 5]])

**From a NumPy array**

Tensors can be created from NumPy arrays (and vice versa).

In [156]:
np_array = np.array(data)
x_np = torch.from_numpy(np_array)
x_np

tensor([[0, 1],
        [2, 3],
        [4, 5]])
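One detail worth knowing is that `torch.from_numpy` shares memory with the source array, whereas `torch.tensor` makes a copy. The sketch below illustrates the difference; the names `shared` and `copied` are illustrative.

```python
import numpy as np
import torch

np_array = np.array([[0, 1], [2, 3]])

shared = torch.from_numpy(np_array)  # shares memory with np_array
copied = torch.tensor(np_array)      # makes an independent copy

np_array[0, 0] = 100  # modify the NumPy array in place

print(shared[0, 0].item())  # 100, the change is visible in the tensor
print(copied[0, 0].item())  # 0, the copy is unaffected
```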

**From another tensor:**

The new tensor retains the properties (shape, datatype) of the argument
tensor, unless explicitly overridden.

In [157]:
x_ones = torch.ones_like(x_data) # retains the properties of x_data
print(f"Ones Tensor: \n {x_ones} \n")

x_rand = torch.rand_like(x_data, dtype=torch.float) # overrides the datatype of x_data
print(f"Random Tensor: \n {x_rand} \n")

Ones Tensor: 
 tensor([[1, 1],
        [1, 1]]) 

Random Tensor: 
 tensor([[0.6109, 0.5822],
        [0.4923, 0.5586]]) 



**With random or constant values:**

Similar to what we have seen with NumPy, we can pre-fill tensors with constant values such as 1, or with random numbers, for example to create a placeholder of the right shape. For this, we define `shape` as a tuple of tensor dimensions. In the functions below, it
determines the dimensionality of the output tensor.

In [158]:
shape = (2, 3,) # a 2x3 tensor; the trailing comma is optional and does not add a dimension
rand_tensor = torch.rand(shape)
ones_tensor = torch.ones(shape)
zeros_tensor = torch.zeros(shape)

print(f"Random Tensor: \n {rand_tensor} \n")
print(f"Ones Tensor: \n {ones_tensor} \n")
print(f"Zeros Tensor: \n {zeros_tensor}")

# to read the dimensions of a tensor, use shape or size()
print(zeros_tensor.shape)
print(zeros_tensor.size())

Random Tensor: 
 tensor([[0.6514, 0.0606, 0.1859],
        [0.8758, 0.7169, 0.2399]]) 

Ones Tensor: 
 tensor([[1., 1., 1.],
        [1., 1., 1.]]) 

Zeros Tensor: 
 tensor([[0., 0., 0.],
        [0., 0., 0.]])
torch.Size([2, 3])
torch.Size([2, 3])
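Two related helpers may be useful here: `torch.full` fills a tensor with an arbitrary constant, and `torch.manual_seed` makes random tensors repeatable. This is a minimal sketch; the fill value 7.0 and the seed 0 are arbitrary choices.

```python
import torch

shape = (2, 3)

# Fill a tensor with an arbitrary constant
sevens = torch.full(shape, 7.0)
print(sevens)

# Seeding the random number generator makes "random" tensors repeatable
torch.manual_seed(0)
first = torch.rand(shape)
torch.manual_seed(0)
second = torch.rand(shape)
print(torch.equal(first, second))  # True
```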


### Tensor Attributes

Tensors have several attributes which are important to know and adjust to your needs. Some of these properties are the aforementioned `shape` (aka dimensions), `dtype` (data type of the elements in the tensor) and the `device` they are stored on. The device could for instance be a CPU or a GPU. During training of neural networks, we might want to push our tensors onto the GPU device for accelerated training. Let's look at these tensor attributes below:

In [159]:
tensor = torch.rand(3, 4)

print(f"Shape of tensor: {tensor.shape}")
print(f"Datatype of tensor: {tensor.dtype}")
print(f"Device tensor is stored on: {tensor.device}")

Shape of tensor: torch.Size([3, 4])
Datatype of tensor: torch.float32
Device tensor is stored on: cpu
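A common pattern for moving a tensor to the GPU, assuming one may or may not be available, is sketched below. The names `device` and `tensor_on_device` are illustrative; on a machine without a GPU the tensor simply stays on the CPU.

```python
import torch

# Pick the GPU if one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

tensor = torch.rand(3, 4)
# .to() returns the same tensor if it already lives on the requested device
tensor_on_device = tensor.to(device)

print(f"Device tensor is stored on: {tensor_on_device.device}")
```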


We can also index separate dimensions of our tensor. Here, we have a 2D tensor with 3 rows X 4 columns.

In [160]:
print(tensor.shape[0]) # size of the first dimension (the number of rows for a 2D tensor)
print(tensor.size(0)) # another way to access the row dimension

3
3


### Tensor Operations

Over 100 tensor operations, including transposing, indexing, slicing,
mathematical operations, linear algebra, random sampling, and more, are
comprehensively described
[here](https://pytorch.org/docs/stable/torch.html). Each of them can be run on the CPU and on the GPU.

**Standard numpy-like indexing and slicing:**

We index the rows of a 2D tensor by writing the row index in square brackets. To index multiple dimensions, we separate the indices by commas; a colon `:` selects all elements along a dimension.

In [161]:
tensor = torch.ones(4, 4)
tensor[:,1] = 0 # set second col to 0 (remember 0-indexing in Python)
print(tensor)

tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])


In [162]:
tensor[0] # access 0th element of the tensor, which for a 2D tensor is the first row

tensor([1., 0., 1., 1.])

The indexing operations can become more sophisticated:

In [163]:
# for each of the three 2x2 matrices (the colon : runs over the first dimension), get the top-left element (row 0, column 0)
x = torch.Tensor([
                  [[1, 2], [3, 4]],
                  [[5, 6], [7, 8]],
                  [[9, 10], [11, 12]]
                 ])
x[:, 0, 0]

tensor([1., 5., 9.])
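A few more indexing patterns, sketched on a small hypothetical tensor `m`, show how single elements, ranges, negative indices, and boolean masks can be combined:

```python
import torch

m = torch.arange(12).reshape(3, 4)  # rows: [0..3], [4..7], [8..11]

element = m[1, 2]     # single element in row 1, column 2
block = m[:2, 1:3]    # first two rows, columns 1 and 2
last_col = m[:, -1]   # last column of every row
large = m[m > 8]      # boolean mask: all elements greater than 8

print(element)   # tensor(6)
print(block)     # tensor([[1, 2], [5, 6]])
print(last_col)  # tensor([ 3,  7, 11])
print(large)     # tensor([ 9, 10, 11])
```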

**Count and access tensor elements**

Use `numel()` to count the elements in a tensor.

In [164]:
i = torch.tensor([1, 2])
i.numel()

2

Use `item()` to convert a tensor that contains exactly one element into a plain Python number. Here, we first index a single element of the tensor:

In [165]:
i[0].item()

1
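Since `item()` only applies to one-element tensors, the sketch below contrasts it with `tolist()`, which works for any number of elements. The flag `item_failed` is a hypothetical helper used only to record that the call was rejected.

```python
import torch

i = torch.tensor([1, 2])

print(i.sum().item())  # 3, a one-element tensor converts to a Python int
print(i.tolist())      # [1, 2], works for any number of elements

item_failed = False
try:
    i.item()  # two elements, so this call is expected to fail
except (RuntimeError, ValueError) as err:
    item_failed = True
    print(f"item() failed: {err}")
```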

**Joining tensors**

You can use `torch.cat` to concatenate a sequence of
tensors along a given dimension. See also
[torch.stack](https://pytorch.org/docs/stable/generated/torch.stack.html),
another tensor joining op that is subtly different from `torch.cat`.

In [166]:
t1 = torch.cat([tensor, tensor, tensor], dim=1)
print(t1)

tensor([[1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.]])
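The difference between `torch.cat` and `torch.stack` is easiest to see in the resulting shapes. In this sketch, `a` and `b` are illustrative 2x3 tensors:

```python
import torch

a = torch.zeros(2, 3)
b = torch.ones(2, 3)

cat_result = torch.cat([a, b], dim=0)      # joins along an existing dimension
stack_result = torch.stack([a, b], dim=0)  # inserts a new dimension

print(cat_result.shape)    # torch.Size([4, 3])
print(stack_result.shape)  # torch.Size([2, 2, 3])
```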


**Reshaping tensors**

We can change the shape of a tensor with `view()` by simply specifying our desired shape:

In [167]:
# x_view shares the same memory as x, so changing one changes the other
x = torch.Tensor([[1, 2], [3, 4], [5, 6]])
print(x) # before reshaping, shape = (3,2)
x_view = x.view(2, 3)
print(x_view) # after reshaping, shape = (2,3)

tensor([[1., 2.],
        [3., 4.],
        [5., 6.]])
tensor([[1., 2., 3.],
        [4., 5., 6.]])


We can also specify only some of the dimensions and leave it up to PyTorch to infer the remaining one. Say we know that we want to have 3 rows and we don't care about the number of columns; then we pass `-1` for that dimension. Note that at most one dimension can be `-1`:

In [168]:
x_view = x.view(3, -1)
x_view

tensor([[1., 2.],
        [3., 4.],
        [5., 6.]])
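Because `view()` never copies data, it has two practical consequences, sketched below: writes through the view change the original tensor, and `view()` only works when the requested shape is compatible with the memory layout. Using `reshape()` as the fallback here is one possible choice; it copies the data when a view is not possible.

```python
import torch

x = torch.tensor([[1., 2.], [3., 4.], [5., 6.]])
x_view = x.view(2, 3)

x_view[0, 0] = 100.  # writing through the view...
print(x[0, 0])       # ...changes x as well: tensor(100.)

# A transposed tensor is not contiguous in memory, so view(-1) cannot flatten it
x_t = x.T
print(x_t.is_contiguous())  # False
flat = x_t.reshape(-1)      # reshape() copies when a view is not possible
print(flat)                 # tensor([100.,   3.,   5.,   2.,   4.,   6.])
```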

We can remove singleton dimensions (dimensions of size 1) with the `squeeze()` function.

In [169]:
x = torch.arange(10).reshape(5, 1, 2) # arange creates the integers 0-9, reshape arranges them into a tensor of shape (5, 1, 2)
x

tensor([[[0, 1]],

        [[2, 3]],

        [[4, 5]],

        [[6, 7]],

        [[8, 9]]])

In [170]:
x = x.squeeze() # removes all dimensions of size 1; unsqueeze(1) would add the dimension back
x

tensor([[0, 1],
        [2, 3],
        [4, 5],
        [6, 7],
        [8, 9]])
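Both functions also accept a specific dimension. The sketch below, using illustrative names, removes and then restores the middle dimension, and shows that squeezing a dimension whose size is not 1 has no effect:

```python
import torch

x = torch.arange(10).reshape(5, 1, 2)

squeezed = x.squeeze(1)           # remove only dimension 1 (size 1)
restored = squeezed.unsqueeze(1)  # insert a size-1 dimension at position 1

print(squeezed.shape)  # torch.Size([5, 2])
print(restored.shape)  # torch.Size([5, 1, 2])

# Squeezing a dimension whose size is not 1 leaves the shape unchanged
unchanged = x.squeeze(0)
print(unchanged.shape)  # torch.Size([5, 1, 2])
```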

**Multiplying tensors**

Similar to NumPy arrays, we have different ways to multiply tensors:

In [171]:
# This computes the element-wise product
print(f"tensor.mul(tensor) \n {tensor.mul(tensor)} \n")
# Alternative syntax for the element-wise product:
print(f"tensor * tensor \n {tensor * tensor}")
# Matrix multiplication (not element-wise) uses the @ operator:
print(f"tensor @ tensor.T \n {tensor @ tensor.T}")

tensor.mul(tensor) 
 tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]]) 

tensor * tensor 
 tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])
tensor @ tensor.T 
 tensor([[3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.]])
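With non-square tensors the difference between the two kinds of product becomes clearer. In this sketch, `a` and `b` are illustrative; a matrix product of shapes (2, 3) and (3, 2) yields a (2, 2) result, while the element-wise product keeps the input shape.

```python
import torch

a = torch.tensor([[1., 2., 3.],
                  [4., 5., 6.]])   # shape (2, 3)
b = torch.tensor([[1., 0.],
                  [0., 1.],
                  [1., 1.]])       # shape (3, 2)

product = a @ b                    # matrix product, shape (2, 2)
print(product)                     # tensor([[ 4.,  5.], [10., 11.]])
print(torch.equal(product, torch.matmul(a, b)))  # True, @ is shorthand for matmul

elementwise = a * a                # element-wise product keeps shape (2, 3)
print(elementwise.shape)           # torch.Size([2, 3])
```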


**In-place operations**

In-place operations modify the data structure directly, without having to re-assign it.
Operations that have a `_` suffix are in-place.
For example: `x.add_(y)` will directly add `y` to `x` without needing to call `x = x.add(y)`. However, their use is discouraged when computing derivatives (important later when training models) because they overwrite values and thereby lose the history needed for differentiation.

In [172]:
print(tensor, "\n")
tensor.add_(5)
print(tensor)

tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]]) 

tensor([[6., 5., 6., 6.],
        [6., 5., 6., 6.],
        [6., 5., 6., 6.],
        [6., 5., 6., 6.]])
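To illustrate why in-place operations and derivatives do not mix well, the sketch below applies `add_` to a tensor that autograd is tracking. The `requires_grad=True` flag has not been introduced yet in this notebook; it is used here only as an assumption to mark the tensor as tracked, and `in_place_failed` is a hypothetical helper flag.

```python
import torch

w = torch.ones(3, requires_grad=True)  # a leaf tensor tracked by autograd

in_place_failed = False
try:
    w.add_(1)  # in-place update of a tracked leaf tensor
except RuntimeError as err:
    in_place_failed = True
    print(f"In-place operation rejected: {err}")

# Out-of-place alternative: creates a new tensor and keeps the history
w_plus_one = w + 1
print(w_plus_one.requires_grad)  # True
```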


**From Tensors to NumPy**

Just as we can go from NumPy arrays to tensors, we can also convert tensors back to NumPy arrays.

In [173]:
t = torch.ones(5)
print(f"Tensor: {t}")
n = t.numpy()
print(f"Numpy: {n}")

Tensor: tensor([1., 1., 1., 1., 1.])
Numpy: [1. 1. 1. 1. 1.]


Because the tensor and the NumPy array share the same memory (for tensors on the CPU), a change in the tensor is reflected in the NumPy array:

In [174]:
t.add_(1)
print(f"Tensor: {t}")
print(f"Numpy: {n}")

Tensor: tensor([2., 2., 2., 2., 2.])
Numpy: [2. 2. 2. 2. 2.]
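If an independent NumPy array is needed instead, one option is to `clone()` the tensor before converting it. This is a minimal sketch with illustrative names:

```python
import torch

t = torch.ones(3)
n_shared = t.numpy()        # shares memory with t
n_copy = t.clone().numpy()  # clone first to get an independent array

t.add_(1)  # modify the tensor in place

print(n_shared)  # [2. 2. 2.]
print(n_copy)    # [1. 1. 1.]
```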
