# Tutorial 4 - PyTorch

## Outline

+ Installation & Introduction
+ PyTorch Tensors and autograd
+ Building a neural network
+ Optimizers


## Installation & Introduction

+ [Official Web site](https://pytorch.org/)

+ [Installation](https://pytorch.org/get-started/locally/)


PyTorch is an open-source machine learning library widely used for deep learning applications. Developed by Facebook's AI Research lab (FAIR), it provides a flexible and intuitive framework for building and training neural networks. PyTorch is known for its ease of use, computational efficiency, and dynamic computational graph, making it a favorite among researchers and developers for both academic and industrial applications.

### Key Features of PyTorch

+ **Dynamic Computational Graph**: PyTorch uses a dynamic computation graph (also known as a define-by-run paradigm), meaning the graph is built on the fly as operations are performed. This makes it more intuitive and flexible, allowing for easy changes and debugging.

+ **Eager Execution**: Operations in PyTorch are executed eagerly, meaning they are computed immediately without waiting for a compiled graph of operations. This allows for more interactive and dynamic development.

+ **Pythonic Nature**: PyTorch is deeply integrated with Python, making it easy to use and familiar to those with Python experience. It leverages Python's features and libraries, allowing for seamless integration with the Python data science stack (e.g., NumPy, SciPy, Pandas).

+ **Extensive Library Support**: PyTorch provides a wide range of libraries and tools for various tasks in deep learning, including computer vision (TorchVision), natural language processing (TorchText), and more. This ecosystem supports a vast array of algorithms, pre-trained models, and datasets to facilitate development and experimentation.

+ **GPU Acceleration**: PyTorch supports CUDA, which lets it run tensor computations on NVIDIA GPUs. For deep neural networks this is usually much faster than training on a CPU.

+ **Community and Support**: PyTorch has a large and active community, contributing to a growing ecosystem of tools, libraries, and resources. It also enjoys robust support from major tech companies, ensuring continuous development and improvement.
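
The define-by-run idea from the list above can be illustrated with a small sketch. The function name `branching_loss` and the two input tensors are made up for this example. Ordinary Python control flow decides which operations are recorded, so each call may build a different graph.

```python
import torch


def branching_loss(x: torch.Tensor) -> torch.Tensor:
    # Hypothetical helper: a plain Python `if` selects the operations,
    # so the recorded graph depends on the data passed in.
    if x.sum() > 0:
        return (x ** 2).sum()
    return (-3 * x).sum()


x_pos = torch.tensor([1.0, 2.0], requires_grad=True)
x_neg = torch.tensor([-1.0, -2.0], requires_grad=True)

branching_loss(x_pos).backward()  # records the x ** 2 branch
branching_loss(x_neg).backward()  # records the -3 * x branch

print(x_pos.grad)  # gradient of sum(x^2) is 2x
print(x_neg.grad)  # gradient of sum(-3x) is -3 for every element
```

Because operations run eagerly, intermediate values can be inspected with `print` or a debugger at any point inside `branching_loss`.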

## Tensors

Tensors are the core data structure PyTorch uses to store and manipulate data. A tensor is very similar to a `numpy.ndarray`, but adds support for automatic differentiation and hardware acceleration (NVIDIA GPUs, Apple silicon).

In [1]:
import torch

In [2]:
a = torch.tensor([[1, 2], [3, 4]], dtype=torch.float)
print(type(a))
a

<class 'torch.Tensor'>


tensor([[1., 2.],
        [3., 4.]])

Bridge with NumPy

In [64]:
import numpy as np

arr = np.array([[1., 2.], [3., 4.]])
arr_torch = torch.from_numpy(arr)
arr_torch

tensor([[1., 2.],
        [3., 4.]], dtype=torch.float64)

In [4]:
# .numpy() converts back to NumPy; detach() first drops any autograd history
# (only required when the tensor has requires_grad=True)
arr_np = arr_torch.detach().numpy()
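
One detail worth knowing about the NumPy bridge: `torch.from_numpy` shares memory with the source array, while `torch.tensor` makes a copy. The short sketch below (the variable names `shared` and `copied` are chosen for illustration) shows the difference.

```python
import numpy as np
import torch

arr = np.array([[1., 2.], [3., 4.]])
shared = torch.from_numpy(arr)  # shares memory with arr
copied = torch.tensor(arr)      # independent copy of the data

arr[0, 0] = 100.0               # modify the NumPy array in place

print(shared[0, 0].item())  # 100.0, the change is visible through the tensor
print(copied[0, 0].item())  # 1.0, the copy is unaffected
```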

Generate random numbers

In [5]:
# normal distribution
torch.randn(4, 4)

tensor([[-0.7879, -0.2330, -0.0982, -1.0348],
        [ 0.0639,  0.8598,  2.7095,  1.2524],
        [-0.7575,  0.9654, -0.3348, -0.7185],
        [-1.2670,  0.4653, -0.5597,  1.0670]])

In [6]:
# uniform distribution
torch.rand(4, 4)

tensor([[0.5604, 0.5394, 0.9632, 0.6035],
        [0.8970, 0.5477, 0.7817, 0.7909],
        [0.0512, 0.1679, 0.5124, 0.5346],
        [0.5963, 0.3559, 0.9216, 0.7628]])
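
The random outputs above change on every run. A minimal sketch of making them reproducible with `torch.manual_seed`, and of rescaling the standard distributions, follows. The seed, ranges, and sample sizes are arbitrary choices for this example.

```python
import torch

# Re-seeding the generator reproduces the same random numbers.
torch.manual_seed(0)
first = torch.rand(2, 3)
torch.manual_seed(0)
second = torch.rand(2, 3)
print(torch.equal(first, second))  # True

# rand samples from [0, 1); shift and scale to cover [low, high).
low, high = -5.0, 5.0
uniform = low + (high - low) * torch.rand(1000)

# randn samples from N(0, 1); scale and shift for another mean and std.
mean, std = 2.0, 0.5
normal = mean + std * torch.randn(1000)

print(uniform.min().item(), uniform.max().item())
print(normal.mean().item(), normal.std().item())
```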

Other ways to create tensors

In [7]:
torch.arange(5)

tensor([0, 1, 2, 3, 4])

In [8]:
torch.linspace(-4, 4, 10)

tensor([-4.0000, -3.1111, -2.2222, -1.3333, -0.4444,  0.4444,  1.3333,  2.2222,
         3.1111,  4.0000])

In [9]:
torch.ones(6)

tensor([1., 1., 1., 1., 1., 1.])

In [10]:
torch.zeros(3, 2)

tensor([[0., 0.],
        [0., 0.],
        [0., 0.]])

Attributes of tensors

In [11]:
tensor = torch.rand(3,4)

print(f"Shape of tensor: {tensor.shape}")
print(f"Datatype of tensor: {tensor.dtype}")
print(f"Device tensor is stored on: {tensor.device}")

Shape of tensor: torch.Size([3, 4])
Datatype of tensor: torch.float32
Device tensor is stored on: cpu
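
These attributes can also be changed by creating a new tensor from the old one. The sketch below uses illustrative variable names and shows a dtype conversion and two shape changes. The original tensor is left untouched.

```python
import torch

t = torch.rand(3, 4)

t64 = t.to(torch.float64)  # same values, stored as 64-bit floats
flat = t.reshape(-1)       # all 12 elements in a single dimension
transposed = t.T           # rows and columns swapped: shape (4, 3)

print(t64.dtype, flat.shape, transposed.shape)
print(t.shape, t.dtype)    # t itself still has shape (3, 4) and dtype float32
```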


A single-element tensor can be converted to a plain Python number (here a `float`) with the `.item()` method

In [95]:
a = torch.tensor([4.])
print(type(a.item()))

<class 'float'>
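
As a small complement, the sketch below shows that the Python type returned by `.item()` follows the tensor's dtype, and that tensors with more than one element need `.tolist()` instead. The variable names are illustrative.

```python
import torch

float_value = torch.tensor([4.]).item()
int_value = torch.tensor([4]).item()
print(type(float_value), type(int_value))  # float for float tensors, int for integer tensors

multi = torch.tensor([1., 2., 3.])
item_failed = False
try:
    multi.item()  # only valid for tensors with exactly one element
except (RuntimeError, ValueError) as err:
    item_failed = True
    print("item() failed:", err)

values = multi.tolist()  # converts any tensor to (nested) Python lists
print(values)
```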


**PyTorch** can run on different hardware

In [12]:
device = (
    "cuda"
    if torch.cuda.is_available()
    else "mps"
    if torch.backends.mps.is_available()
    else "cpu"
)

# send the tensor to device
tensor_device = tensor.to(device)

# send the tensor back to cpu
tensor_cpu = tensor_device.cpu()
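
Two practical points follow from this: tensors can be created directly on a device, and the operands of an operation must live on the same device. The sketch below assumes only CUDA or CPU for brevity and falls back to the CPU when no GPU is present.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

a = torch.ones(2, 3, device=device)  # allocated directly on the device
b = torch.rand(2, 3).to(device)      # created on the CPU, then moved

c = a + b                            # both operands are on the same device
print(c.device)

# NumPy only works with CPU memory, so move the result back first.
c_np = c.cpu().numpy()
print(c_np.shape)
```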

### Autograd

In [13]:
x = torch.tensor([[1, 2], [3, 4]], dtype=torch.float, requires_grad=True)
y = torch.sum(x ** 2)
y.backward()
x.grad

tensor([[2., 4.],
        [6., 8.]])
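
The result above matches the analytic gradient: for y = sum(x_ij^2), dy/dx_ij = 2 * x_ij. The sketch below checks this and illustrates two behaviours that matter later in training loops: gradients accumulate across `backward()` calls until they are reset, and `torch.no_grad()` switches tracking off. The variable names are chosen for this example.

```python
import torch

x = torch.tensor([[1., 2.], [3., 4.]], requires_grad=True)

y = torch.sum(x ** 2)
y.backward()
first_grad = x.grad.clone()
print(torch.equal(first_grad, 2 * x.detach()))  # True: gradient is 2x

# A second backward pass adds to the stored gradient instead of replacing it.
y = torch.sum(x ** 2)
y.backward()
accumulated_grad = x.grad.clone()
print(accumulated_grad)  # now 4x

# Reset the gradient in place before the next computation.
x.grad.zero_()
print(x.grad)

# Inside no_grad, results are not attached to the autograd graph.
with torch.no_grad():
    z = x * 2
print(z.requires_grad)  # False
```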

## Build Neural Network with PyTorch

In [14]:
import torch.nn as nn

### Activation Functions

In [15]:
# random values drawn uniformly from [-5, 5)
tensor = 5 * (torch.rand(3, 2) * 2 - 1)
print(tensor)

# ReLU
relu = nn.ReLU()
print("ReLU:", relu(tensor))

# Tanh
tanh = nn.Tanh()
print("Tanh:", tanh(tensor))

# Sigmoid
sigmoid = nn.Sigmoid()
print("Sigmoid:", sigmoid(tensor))

# Softmax
softmax = nn.Softmax(dim=1)
print("Softmax:", softmax(tensor))

tensor([[-3.8831, -3.5977],
        [-0.1765, -0.9821],
        [-2.4004, -2.5256]])
ReLU: tensor([[0., 0.],
        [0., 0.],
        [0., 0.]])
Tanh: tensor([[-0.9992, -0.9985],
        [-0.1747, -0.7540],
        [-0.9837, -0.9873]])
Sigmoid: tensor([[0.0202, 0.0267],
        [0.4560, 0.2725],
        [0.0831, 0.0741]])
Softmax: tensor([[0.4291, 0.5709],
        [0.6912, 0.3088],
        [0.5313, 0.4687]])
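
To make the definitions concrete, the sketch below recomputes each activation from its formula on a small fixed tensor and compares it with the `nn` module. The input values are arbitrary. With `dim=1`, softmax normalizes across each row, so every row sums to 1.

```python
import torch
import torch.nn as nn

t = torch.tensor([[-2.0, 0.5], [0.0, 3.0], [1.0, -1.0]])

relu_manual = torch.clamp(t, min=0)       # max(0, t)
sigmoid_manual = 1 / (1 + torch.exp(-t))  # 1 / (1 + e^-t)
tanh_manual = (torch.exp(t) - torch.exp(-t)) / (torch.exp(t) + torch.exp(-t))
softmax_manual = torch.exp(t) / torch.exp(t).sum(dim=1, keepdim=True)

print(torch.allclose(relu_manual, nn.ReLU()(t)))
print(torch.allclose(sigmoid_manual, nn.Sigmoid()(t), atol=1e-6))
print(torch.allclose(tanh_manual, nn.Tanh()(t), atol=1e-6))
print(torch.allclose(softmax_manual, nn.Softmax(dim=1)(t), atol=1e-6))
print(softmax_manual.sum(dim=1))  # each row sums to 1
```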


### Loss functions

In [16]:
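# mean squared error between two tensors of the same shape
mse = nn.MSELoss()
a, b = torch.rand(5, 2), torch.rand(5, 2)
print(mse(a, b))

# cross-entropy: a holds raw scores (logits) of shape (N, C),
# b holds the target class index for each of the N samples
cross_entropy = nn.CrossEntropyLoss()
a = torch.rand(10, 2)
b = torch.randint(2, (10,))
print(cross_entropy(a, b))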

tensor(0.2663)
tensor(0.5983)
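
Both losses can be reproduced by hand, which clarifies what they compute with the default `reduction="mean"`. In the sketch below (variable names and the seed are arbitrary), the cross-entropy is the mean negative log-softmax score of the correct class, so `nn.CrossEntropyLoss` expects raw logits, not probabilities.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Mean squared error: average of the squared element-wise differences.
pred, target = torch.rand(5, 2), torch.rand(5, 2)
mse_manual = ((pred - target) ** 2).mean()

# Cross-entropy: log-softmax over the classes, then pick the log-probability
# of the true class for every sample and average the negatives.
logits = torch.rand(10, 2)
labels = torch.randint(2, (10,))
log_probs = torch.log_softmax(logits, dim=1)
ce_manual = -log_probs[torch.arange(10), labels].mean()

print(torch.allclose(mse_manual, nn.MSELoss()(pred, target)))
print(torch.allclose(ce_manual, nn.CrossEntropyLoss()(logits, labels)))
```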


### Neural Network

In [81]:
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(13, 3),
            nn.ReLU(),
            nn.Linear(3, 3),
            nn.Softmax(dim=1)
        )
    
    def forward(self, X):
        return self.layers(X)
    

model = Net()
model

Net(
  (layers): Sequential(
    (0): Linear(in_features=13, out_features=3, bias=True)
    (1): ReLU()
    (2): Linear(in_features=3, out_features=3, bias=True)
    (3): Softmax(dim=1)
  )
)

In [82]:
for name, param in model.named_parameters():
    print(f"Layer: {name} | Size: {param.size()} | Values : {param.data} \n")

Layer: layers.0.weight | Size: torch.Size([3, 13]) | Values : tensor([[ 0.0743,  0.2055, -0.0763,  0.0349, -0.1844,  0.0049,  0.1216, -0.1394,
         -0.2260, -0.0892, -0.0419, -0.0718,  0.0782],
        [-0.0420, -0.1034, -0.2759,  0.0035,  0.1277, -0.0102, -0.0055,  0.0580,
         -0.2649,  0.0219, -0.2439, -0.1629,  0.0296],
        [ 0.1125, -0.1069, -0.2596, -0.2020,  0.1576, -0.0568,  0.2264,  0.1395,
          0.1872, -0.2123, -0.0911,  0.1796, -0.2613]]) 

Layer: layers.0.bias | Size: torch.Size([3]) | Values : tensor([ 0.2464, -0.2696, -0.0119]) 

Layer: layers.2.weight | Size: torch.Size([3, 3]) | Values : tensor([[-0.4451,  0.3152,  0.0387],
        [ 0.0320, -0.2368, -0.2466],
        [ 0.3873, -0.2003,  0.4198]]) 

Layer: layers.2.bias | Size: torch.Size([3]) | Values : tensor([ 0.0696,  0.5506, -0.3752]) 



In [83]:
X = torch.rand(3, 13)
y = model(X)
print(y)

tensor([[0.3123, 0.4621, 0.2256],
        [0.2744, 0.5052, 0.2204],
        [0.3069, 0.4964, 0.1967]], grad_fn=<SoftmaxBackward0>)
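
The network above returns probabilities because it ends with `nn.Softmax`. Since `nn.CrossEntropyLoss` applies log-softmax internally, a common alternative is to let the model return raw logits and apply softmax only when probabilities are needed. The sketch below shows that variant under the same layer sizes. The class name `LogitNet` is made up for this example.

```python
import torch
import torch.nn as nn


class LogitNet(nn.Module):
    """Hypothetical variant of Net whose forward pass returns raw logits."""

    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(13, 3),
            nn.ReLU(),
            nn.Linear(3, 3),  # no Softmax here
        )

    def forward(self, X):
        return self.layers(X)


logit_model = LogitNet()

# Weights and biases: 13*3 + 3 + 3*3 + 3 = 54 trainable parameters.
n_params = sum(p.numel() for p in logit_model.parameters())
print(n_params)

X = torch.rand(3, 13)
with torch.no_grad():
    logits = logit_model(X)
    probs = torch.softmax(logits, dim=1)  # probabilities, only when needed
    predicted = probs.argmax(dim=1)       # most likely class per sample

print(probs.sum(dim=1))
print(predicted)
```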


## Optimization

In [90]:
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

df = pd.read_csv("../../Datasets/wines.csv")
df.head()

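|  | Alcohol % | Malic Acid | Ash | Alkalinity | Mg | Phenols | Flavanoids | Phenols.1 | Proantho-cyanins | Color intensity | Hue | OD280 315 | Proline | Start assignment | ranking |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 14.23 | 1.71 | 2.43 | 15.6 | 127 | 2.8 | 3.06 | 0.28 | 2.29 | 5.64 | 1.04 | 3.92 | 1065 | 1 | 1 |
| 1 | 13.24 | 2.59 | 2.87 | 21.0 | 118 | 2.8 | 2.69 | 0.39 | 1.82 | 4.32 | 1.04 | 2.93 | 735 | 1 | 1 |
| 2 | 14.83 | 1.64 | 2.17 | 14.0 | 97 | 2.8 | 2.98 | 0.29 | 1.98 | 5.2 | 1.08 | 2.85 | 1045 | 1 | 1 |
| 3 | 14.12 | 1.48 | 2.32 | 16.8 | 95 | 2.2 | 2.43 | 0.26 | 1.57 | 5.0 | 1.17 | 2.82 | 1280 | 1 | 1 |
| 4 | 13.75 | 1.73 | 2.41 | 16.0 | 89 | 2.6 | 2.76 | 0.29 | 1.81 | 5.6 | 1.15 | 2.9 | 1320 | 1 | 1 |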


In [93]:
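# drop the two label columns and standardize the remaining 13 features
features = df.drop(['Start assignment', 'ranking'], axis=1).values
X = StandardScaler().fit_transform(features)
X = torch.tensor(X, dtype=torch.float32)
# shift the labels so that class indices start at 0, as nn.CrossEntropyLoss expects
y = torch.tensor(df['ranking'].values - 1)

# hold out 20% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)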

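# define loss
# note: nn.CrossEntropyLoss expects raw logits and applies log-softmax itself.
# Net already ends with nn.Softmax, so softmax is effectively applied twice,
# which tends to slow learning; removing the final Softmax from Net avoids this.
loss_func = nn.CrossEntropyLoss()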

# define optimizer
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

epochs = 10
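for _ in range(epochs):
    # forward pass and training loss
    y_pred = model(X_train)
    loss = loss_func(y_pred, y_train)
    # reset old gradients, backpropagate, then update the parameters
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # report the loss on the held-out test set, without tracking gradients
    with torch.no_grad():
        test_loss = loss_func(model(X_test), y_test)
        print(test_loss.item())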

0.6230356693267822
0.6229644417762756
0.6228887438774109
0.6228083968162537
0.6227228045463562
0.6226339936256409
0.6225464940071106
0.6224617958068848
0.622379720211029
0.6222999095916748
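
The same training pattern can be sketched end to end without the CSV file. Everything below is an assumption made for illustration: the data are synthetic stand-ins for the wine features, the model is a small `nn.Sequential` that returns logits, and the hidden width, learning rate, and epoch count are arbitrary. It also reports accuracy, which is often easier to interpret than the loss.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic stand-in for the wine data: 13 features, 3 classes.
X = torch.randn(200, 13)
true_w = torch.randn(13, 3)
y = (X @ true_w).argmax(dim=1)

X_train, y_train = X[:160], y[:160]
X_test, y_test = X[160:], y[160:]

# The model returns raw logits, which is what CrossEntropyLoss expects.
model = nn.Sequential(nn.Linear(13, 16), nn.ReLU(), nn.Linear(16, 3))
loss_func = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

train_losses = []
for epoch in range(200):
    model.train()
    optimizer.zero_grad()                      # clear gradients from the last step
    loss = loss_func(model(X_train), y_train)  # forward pass
    loss.backward()                            # compute gradients
    optimizer.step()                           # update the parameters
    train_losses.append(loss.item())

# Evaluate on the held-out split without tracking gradients.
model.eval()
with torch.no_grad():
    test_logits = model(X_test)
    test_loss = loss_func(test_logits, y_test).item()
    accuracy = (test_logits.argmax(dim=1) == y_test).float().mean().item()

print(f"train loss: {train_losses[0]:.4f} -> {train_losses[-1]:.4f}")
print(f"test loss: {test_loss:.4f}, test accuracy: {accuracy:.2f}")
```

Swapping `torch.optim.Adam` for another optimizer such as `torch.optim.SGD` only changes the line that constructs `optimizer`. The loop itself stays the same.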
