# PyTorch Basics - Getting Started

This notebook covers the fundamental concepts of PyTorch:
- Tensors and operations
- Automatic differentiation
- Basic neural network construction
- Training loops


In [1]:
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
import matplotlib.pyplot as plt

print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"CUDA device: {torch.cuda.get_device_name(0)}")


PyTorch version: 2.5.1+cu121
CUDA available: True
CUDA device: NVIDIA GeForce GTX 1650


## 1. Working with Tensors
**What is a Tensor?**

A commonly cited definition:
- A tensor is a multi-dimensional array of numbers that generalizes scalars, vectors, and matrices to higher dimensions. In machine learning, tensors are the fundamental data structure used to represent inputs, outputs, and model parameters. They are similar to NumPy's ndarrays, but can also run on hardware accelerators such as GPUs. In mathematics and physics, a tensor is more strictly an object that transforms in a specific way under a change of coordinates, which gives it a geometric meaning.

Types of tensors:
- Rank 0 Tensor -> a Scalar, a single number
- Rank 1 Tensor -> a Vector, a one-dimensional array of numbers
- Rank 2 Tensor -> a Matrix, a two-dimensional grid of numbers
- Rank 3 Tensor -> a three-dimensional grid of numbers
- Rank 4 and above -> multi-dimensional arrays with more than three dimensions
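
The ranks listed above can be checked directly in PyTorch. The snippet below is a small sketch; the variable names (`vector`, `matrix`, `cube`) are chosen here for illustration and are not used elsewhere in the notebook.

```python
import torch

# Build one tensor of each rank and inspect its number of dimensions.
scalar = torch.tensor(7)                   # rank 0: a single number
vector = torch.tensor([1, 2, 3])           # rank 1: one-dimensional array
matrix = torch.tensor([[1, 2], [3, 4]])    # rank 2: two-dimensional grid
cube = torch.zeros(2, 3, 4)                # rank 3: three-dimensional grid

for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("cube", cube)]:
    # ndim is the rank; shape lists the size of each dimension
    print(f"{name}: ndim={t.ndim}, shape={tuple(t.shape)}")
```

`ndim` reports the rank, while `shape` reports how many entries each dimension holds.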


<img src="nn_tensor_example.png">




Before we start, below is an example of a PyTorch workflow:
<img src="pytorch_workflow.png">




In [2]:
#Introduction and Creating tensors
#Intro to Scalars
scalar = torch.tensor(7)
scalar

tensor(7)

`torch.tensor` is the function most commonly used to create PyTorch tensors directly from data.

Official documentation: 

https://docs.pytorch.org/tutorials/beginner/blitz/tensor_tutorial.html

In [4]:
#Tensors can be created directly from data
data = [[1, 2], [3, 4]]
#convert the nested list into a tensor (the dtype is inferred automatically)
x_data = torch.tensor(data)
x_data

tensor([[1, 2],
        [3, 4]])

In [6]:
#create tensors from a numpy array
np_array = np.array(data)
x_np = torch.from_numpy(np_array)
x_np

tensor([[1, 2],
        [3, 4]])

In [8]:
#Let's create a tensor from a tensor
x_ones = torch.ones_like(x_data) #this retains the properties of x_data
print(f"Ones Tensor: \n {x_ones} \n")
x_rand = torch.rand_like(x_data, dtype=torch.float) #this overrides the datatype of x_data
print(f"Random Tensor: \n {x_rand} \n")


Ones Tensor: 
 tensor([[1, 1],
        [1, 1]]) 

Random Tensor: 
 tensor([[0.3454, 0.3503],
        [0.1050, 0.7709]]) 
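
As a quick check of the two comments in the cell above, the sketch below compares the shape and dtype of the new tensors with those of `x_data`. It rebuilds `x_data` so that it runs on its own.

```python
import torch

x_data = torch.tensor([[1, 2], [3, 4]])

# ones_like keeps both the shape and the (integer) dtype of x_data
x_ones = torch.ones_like(x_data)
# rand_like keeps the shape, but dtype is explicitly overridden to float
x_rand = torch.rand_like(x_data, dtype=torch.float)

print(f"x_data: shape={tuple(x_data.shape)}, dtype={x_data.dtype}")
print(f"x_ones: shape={tuple(x_ones.shape)}, dtype={x_ones.dtype}")
print(f"x_rand: shape={tuple(x_rand.shape)}, dtype={x_rand.dtype}")
```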



In [9]:
#Let's create a tensor with random or constant values
#shape is a tuple of tensor dimensions
shape = (2, 3,)
rand_tensor = torch.rand(shape)
ones_tensor = torch.ones(shape)
zeros_tensor = torch.zeros(shape)

print(f"Random Tensor: \n {rand_tensor} \n")
print(f"Ones Tensor: \n {ones_tensor} \n")
print(f"Zeros Tensor: \n {zeros_tensor} \n")


Random Tensor: 
 tensor([[0.1981, 0.7919, 0.9392],
        [0.0187, 0.6356, 0.2956]]) 

Ones Tensor: 
 tensor([[1., 1., 1.],
        [1., 1., 1.]]) 

Zeros Tensor: 
 tensor([[0., 0., 0.],
        [0., 0., 0.]]) 



## 2. Tensor Attributes
Tensor attributes describe their shape, datatype, and the device on which they are stored.

In [10]:
tensor = torch.rand(3,4)

print(f"Shape of tensor: {tensor.shape}")
print(f"Datatype of tensor: {tensor.dtype}")
print(f"Device tensor is stored on: {tensor.device}")

Shape of tensor: torch.Size([3, 4])
Datatype of tensor: torch.float32
Device tensor is stored on: cpu
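
Tensors are created on the CPU by default, as the output above shows. A minimal sketch of moving one to another device is given below; the `device` variable and the fallback to CPU are assumptions made here so the cell also runs on machines without a GPU.

```python
import torch

# Pick the GPU when one is visible, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

tensor = torch.rand(3, 4)
# .to() returns a tensor on the requested device; the original is left as it was
tensor_on_device = tensor.to(device)

print(f"Original device: {tensor.device}")
print(f"After .to(): {tensor_on_device.device}")
```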


## 3. Tensor Operations
PyTorch provides over 100 tensor operations, including transposing, indexing, slicing, math operations, linear algebra, and random sampling. All of them are described here:

https://docs.pytorch.org/docs/stable/torch.html

Let's try out some of these operations. Many of them follow NumPy-style conventions.



In [11]:
#standard numpy-like indexing and slicing
tensor = torch.ones(4, 4)
tensor[:,1] = 0
print(tensor)

tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])


In [12]:
#For joining tensors, you can use torch.cat() to concatenate a sequence of tensors along a given dimension
t1 = torch.cat([tensor, tensor, tensor], dim=1)
print(t1)

tensor([[1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.]])
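
The `dim` argument decides which axis grows. The sketch below contrasts `dim=0` and `dim=1`, and also shows `torch.stack`, a related function not used in the cell above, which adds a new dimension instead of extending an existing one.

```python
import torch

tensor = torch.ones(4, 4)

rows = torch.cat([tensor, tensor], dim=0)       # extends dimension 0 -> shape (8, 4)
cols = torch.cat([tensor, tensor], dim=1)       # extends dimension 1 -> shape (4, 8)
stacked = torch.stack([tensor, tensor], dim=0)  # adds a new leading dimension -> shape (2, 4, 4)

print(tuple(rows.shape), tuple(cols.shape), tuple(stacked.shape))
```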


In [13]:
#Here is an example of tensor multiplication
#this computes element-wise product
print(f"tensor.mul(tensor): \n {tensor.mul(tensor)} \n")
#alternative syntax:
print(f"tensor * tensor: \n {tensor * tensor} \n")

tensor.mul(tensor): 
 tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]]) 

tensor * tensor: 
 tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]]) 



In [14]:
#Let's compute the matrix multiplication between two tensors
print(f"tensor.matmul(tensor.T): \n {tensor.matmul(tensor.T)} \n")
#alternative syntax: the @ operator gives the same result
print(f"tensor @ tensor.T \n {tensor @ tensor.T} \n")

tensor.matmul(tensor.T): 
 tensor([[3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.]]) 

tensor @ tensor.T 
 tensor([[3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.]]) 
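
With square tensors it is easy to miss how matrix multiplication changes shapes. The sketch below uses non-square tensors; `a` and `b` are illustrative names introduced here.

```python
import torch

a = torch.ones(2, 3)             # 2 rows, 3 columns
b = torch.full((3, 4), 2.0)      # 3 rows, 4 columns, every entry is 2

# The inner dimensions (3 and 3) must match; the result has shape (2, 4).
c = a @ b

print(tuple(c.shape))
# Each entry is a sum of three products 1 * 2, so every entry equals 6.
print(c)
```

By contrast, the element-wise product `*` needs both operands to have the same (or broadcastable) shapes and keeps that shape.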



In [None]:
#In place operations
#In place operations are denoted by a _ at the end of the method
#This is a common convention in PyTorch for in place operations
print(tensor, "\n")
tensor.add_(5)
print(tensor)

tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]]) 

tensor([[6., 5., 6., 6.],
        [6., 5., 6., 6.],
        [6., 5., 6., 6.],
        [6., 5., 6., 6.]])
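
To make the difference from the regular method concrete, the sketch below compares `add` with `add_`. It uses `data_ptr()`, which returns the memory address of a tensor's first element, to show whether a new tensor was allocated.

```python
import torch

t = torch.ones(3)

out = t.add(5)                                     # out-of-place: returns a new tensor
unchanged = torch.equal(t, torch.ones(3))          # t still holds ones at this point
print(f"t unchanged after add: {unchanged}")

result = t.add_(5)                                 # in-place: t itself is modified
print(f"t after add_: {t}")

# The in-place result points at the same memory as t; the out-of-place result does not.
print(result.data_ptr() == t.data_ptr(), out.data_ptr() == t.data_ptr())
```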


## 4. Bridge with NumPy
Tensors on the CPU and NumPy arrays can share the same underlying memory, so changing one changes the other.

In [16]:
t = torch.ones(5)
print(f"t: {t}")
n = t.numpy()
print(f"n: {n}")

t: tensor([1., 1., 1., 1., 1.])
n: [1. 1. 1. 1. 1.]


In [17]:
#a change in the tensor is reflected in the NumPy array
t.add_(1)
print(f"t: {t}")
print(f"n: {n}")

t: tensor([2., 2., 2., 2., 2.])
n: [2. 2. 2. 2. 2.]


The same holds in the other direction: when a tensor is created from a NumPy array, changes to the array are reflected in the tensor.

In [19]:
#changes in the NumPy array are reflected in the tensor
n = np.ones(5)
t = torch.from_numpy(n)
np.add(n, 1, out=n)
print(f"t: {t}")
print(f"n: {n}")

t: tensor([2., 2., 2., 2., 2.], dtype=torch.float64)
n: [2. 2. 2. 2. 2.]
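
Memory sharing depends on how the tensor is created. As a small sketch, the cell below contrasts `torch.from_numpy`, which shares memory with the array, with `torch.tensor`, which copies the data. The names `shared` and `copied` are illustrative.

```python
import numpy as np
import torch

n = np.ones(3)

shared = torch.from_numpy(n)   # shares memory with n
copied = torch.tensor(n)       # makes an independent copy of the data

np.add(n, 1, out=n)            # modify the NumPy array in place

print(f"shared: {shared}")     # follows the change made to n
print(f"copied: {copied}")     # keeps the original values
```

Use the copying form when the tensor should not be affected by later changes to the array.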
