# Tutorial 4 - PyTorch

## Outline

+ Installation & Introduction
+ PyTorch tensors and autograd
+ Building a neural network
+ Optimizers


## Installation & Introduction

+ [Official Web site](https://pytorch.org/)

+ [Installation](https://pytorch.org/get-started/locally/)


PyTorch is an open-source machine learning library widely used for deep learning applications. Developed by Facebook's AI Research lab (FAIR), it provides a flexible and intuitive framework for building and training neural networks. PyTorch is known for its ease of use, computational efficiency, and dynamic computational graph, making it a favorite among researchers and developers for both academic and industrial applications.

### Key Features of PyTorch

+ **Dynamic Computational Graph**: PyTorch uses a dynamic computation graph (also known as a define-by-run paradigm), meaning the graph is built on the fly as operations are performed. This makes it more intuitive and flexible, allowing for easy changes and debugging.

+ **Eager Execution**: Operations in PyTorch are executed eagerly, meaning they are computed immediately without waiting for a compiled graph of operations. This allows for more interactive and dynamic development.

+ **Pythonic Nature**: PyTorch is deeply integrated with Python, making it easy to use and familiar to those with Python experience. It leverages Python’s features and libraries, allowing for seamless integration with the Python data science stack (e.g., NumPy, SciPy, Pandas).

+ **Extensive Library Support**: PyTorch provides a wide range of libraries and tools for various tasks in deep learning, including computer vision (TorchVision), natural language processing (TorchText), and more. This ecosystem supports a vast array of algorithms, pre-trained models, and datasets to facilitate development and experimentation.

+ **GPU Acceleration**: PyTorch supports CUDA, which lets it run tensor computations on NVIDIA GPUs. Training deep neural networks this way is typically much faster than training on a CPU.

+ **Community and Support**: PyTorch has a large and active community, contributing to a growing ecosystem of tools, libraries, and resources. It also enjoys robust support from major tech companies, ensuring continuous development and improvement.
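
As a small illustration of the define-by-run and eager-execution points above, the sketch below lets ordinary Python control flow decide which operations are recorded. The function name `piecewise` and the input values are made up for this example.

```python
import torch

def piecewise(x):
    # Ordinary Python control flow: the branch taken decides which
    # operations are recorded in the graph for this particular call.
    if x.sum() > 0:
        return (x ** 2).sum()
    return (x ** 3).sum()

x = torch.tensor([1.0, 2.0], requires_grad=True)
y = piecewise(x)   # executed eagerly, so y holds a value right away
y.backward()       # gradients follow the branch that actually ran
print(y.item())    # 5.0
print(x.grad)      # tensor([2., 4.]), i.e. 2 * x
```

Because the graph is rebuilt on every call, a different input could take the other branch without any recompilation step.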

## Tensors

Tensors are the core data structure PyTorch uses to store and manipulate data. A tensor is very similar to a `numpy.ndarray`, but adds support for automatic differentiation and hardware acceleration (NVIDIA GPUs, Apple silicon).

In [1]:
import torch

In [2]:
a = torch.tensor([[1, 2], [3, 4]], dtype=torch.int64)
print(type(a))
a.dtype

<class 'torch.Tensor'>


torch.int64

Bridge with NumPy

In [3]:
import numpy as np

arr = np.array([[1., 2.], [3., 4.]])
arr_torch = torch.from_numpy(arr)
arr_torch

tensor([[1., 2.],
        [3., 4.]], dtype=torch.float64)

In [4]:
# torch.tensor() copies the data, so later changes to arr do not affect the new tensor
arr_torch_2 = torch.tensor(arr)
arr += 1
arr_torch_2

tensor([[1., 2.],
        [3., 4.]], dtype=torch.float64)
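
The two conversion routes differ in whether memory is shared: `torch.from_numpy` wraps the existing NumPy buffer, while `torch.tensor` makes a copy. A minimal side-by-side sketch (the variable names `shared` and `copied` are chosen for this example only):

```python
import numpy as np
import torch

arr_demo = np.array([1.0, 2.0])
shared = torch.from_numpy(arr_demo)  # shares memory with arr_demo
copied = torch.tensor(arr_demo)      # independent copy

arr_demo += 1                        # in-place change to the NumPy array

print(shared)  # tensor([2., 3.], dtype=torch.float64)
print(copied)  # tensor([1., 2.], dtype=torch.float64)
```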

In [5]:
# detach() returns a tensor that shares the same data but is excluded from gradient tracking;
# it is required before .numpy() when the tensor has requires_grad=True
arr_np = arr_torch_2.detach().numpy()
arr_np

array([[1., 2.],
       [3., 4.]])

Generate random numbers

In [6]:
# normal distribution
torch.randn(4,3)

tensor([[ 0.2123, -1.5801,  0.1543],
        [ 0.5049,  1.3548,  0.8801],
        [-0.3158, -0.0684, -0.2944],
        [ 0.2007, -0.5693,  0.8686]])

In [7]:
# uniform distribution
torch.rand(4,3)

tensor([[0.0154, 0.8571, 0.0700],
        [0.0885, 0.3157, 0.6699],
        [0.9285, 0.2943, 0.8468],
        [0.7247, 0.1792, 0.7754]])
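
The values above change on every run. If reproducible numbers are needed, the global generator can be seeded with `torch.manual_seed`; the seed value `0` below is arbitrary.

```python
import torch

torch.manual_seed(0)
first = torch.rand(2, 3)

torch.manual_seed(0)      # reset the generator to the same state
second = torch.rand(2, 3)

print(torch.equal(first, second))  # True
```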

Others

In [8]:
# arange
torch.arange(10)

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [9]:
# linspace
torch.linspace(0,10,100)

tensor([ 0.0000,  0.1010,  0.2020,  0.3030,  0.4040,  0.5051,  0.6061,  0.7071,
         0.8081,  0.9091,  1.0101,  1.1111,  1.2121,  1.3131,  1.4141,  1.5152,
         1.6162,  1.7172,  1.8182,  1.9192,  2.0202,  2.1212,  2.2222,  2.3232,
         2.4242,  2.5253,  2.6263,  2.7273,  2.8283,  2.9293,  3.0303,  3.1313,
         3.2323,  3.3333,  3.4343,  3.5354,  3.6364,  3.7374,  3.8384,  3.9394,
         4.0404,  4.1414,  4.2424,  4.3434,  4.4444,  4.5455,  4.6465,  4.7475,
         4.8485,  4.9495,  5.0505,  5.1515,  5.2525,  5.3535,  5.4545,  5.5556,
         5.6566,  5.7576,  5.8586,  5.9596,  6.0606,  6.1616,  6.2626,  6.3636,
         6.4646,  6.5657,  6.6667,  6.7677,  6.8687,  6.9697,  7.0707,  7.1717,
         7.2727,  7.3737,  7.4747,  7.5758,  7.6768,  7.7778,  7.8788,  7.9798,
         8.0808,  8.1818,  8.2828,  8.3838,  8.4848,  8.5859,  8.6869,  8.7879,
         8.8889,  8.9899,  9.0909,  9.1919,  9.2929,  9.3939,  9.4950,  9.5960,
         9.6970,  9.7980,  9.8990, 10.0000])

In [10]:
# ones & zeros (a notebook cell only displays its last expression, here the zeros)
torch.ones(6)
torch.zeros(6)

tensor([0., 0., 0., 0., 0., 0.])

In [11]:
a = torch.rand(4,3)
torch.ones_like(a)

tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])

Attributes of tensors

In [12]:
tensor = torch.rand(3,4)
# shape, dtype, device (only the last expression, the device, is displayed)
tensor.shape[0]
tensor.dtype
tensor.device

device(type='cpu')

A single-element tensor can be converted to a plain Python number with the `.item()` method

In [13]:
a = torch.tensor([4.])
a.item()

4.0

**PyTorch** can run on different hardware backends

In [14]:
device = (
    "cuda"
    if torch.cuda.is_available()
    else "mps"
    if torch.backends.mps.is_available()
    else "cpu"
)

# override the auto-detected device: everything below runs on the CPU
device = "cpu"

# send the tensor to device
tensor_device = tensor.to(device)

# send the tensor back to cpu
tensor_cpu = tensor_device.cpu()
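
Besides moving an existing tensor with `.to(device)`, a tensor can be created on the target device directly through the `device=` argument. The sketch below reuses the same device-selection logic and falls back to the CPU when no accelerator is available; the names `t` and `u` are illustrative.

```python
import torch

# Same selection logic as above; falls back to "cpu" when no accelerator is found.
device = (
    "cuda" if torch.cuda.is_available()
    else "mps" if torch.backends.mps.is_available()
    else "cpu"
)

# Create the tensor on the target device instead of moving it afterwards.
t = torch.ones(2, 3, device=device)
print(t.device.type == device)   # True

# Operands must live on the same device; the result stays there as well.
u = t + torch.ones(2, 3, device=device)
print(u.cpu())                   # moved back to the CPU, e.g. before converting to NumPy
```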

### Autograd

In [15]:
x = torch.tensor([[1, 2], [3, 4]], dtype=torch.float, requires_grad=True)
y = torch.sum(x ** 2)
# backward
y.backward()
# get grad
x.grad

tensor([[2., 4.],
        [6., 8.]])
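
The result matches the analytic derivative: for y = sum(x^2), dy/dx = 2x. One detail worth knowing is that `backward()` adds to `.grad` rather than overwriting it. A minimal sketch with made-up values:

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

(x ** 2).sum().backward()
first = x.grad.clone()      # 2 * x

(x ** 2).sum().backward()   # a second call accumulates into x.grad
second = x.grad.clone()     # 4 * x

x.grad.zero_()              # reset the stored gradient in place

print(first)   # tensor([2., 4., 6.])
print(second)  # tensor([ 4.,  8., 12.])
print(x.grad)  # tensor([0., 0., 0.])
```

This accumulation is why gradients are usually cleared before each new backward pass.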

## Build Neural Network with PyTorch

In [16]:
import torch.nn as nn

### Activation Functions

In [17]:
tensor = 5 * (torch.rand(3, 2) * 2 - 1)
print(tensor)

# ReLU
relu = nn.ReLU()
print(relu(tensor))

# Tanh
tanh = nn.Tanh()
print(tanh(tensor))

# Sigmoid
sigmoid = nn.Sigmoid()
print(sigmoid(tensor))

# Softmax (dim=1 normalizes each row; omitting dim raises a deprecation warning)
softmax = nn.Softmax(dim=1)
print(softmax(tensor))

tensor([[-4.4810, -4.3289],
        [ 3.6732,  4.3279],
        [ 3.6227, -3.4451]])
tensor([[0.0000, 0.0000],
        [3.6732, 4.3279],
        [3.6227, 0.0000]])
tensor([[-0.9997, -0.9997],
        [ 0.9987,  0.9997],
        [ 0.9986, -0.9980]])
tensor([[0.0112, 0.0130],
        [0.9752, 0.9870],
        [0.9740, 0.0309]])
tensor([[4.6205e-01, 5.3795e-01],
        [3.4194e-01, 6.5806e-01],
        [9.9915e-01, 8.5144e-04]])
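
To make the definitions concrete, the sketch below recomputes each activation from its formula and compares it with the corresponding `nn` module. The input values and the `*_manual` names are made up for this example.

```python
import torch
import torch.nn as nn

t = torch.tensor([[-1.0, 0.0, 2.0],
                  [ 0.5, 0.5, 0.5]])

relu_manual = torch.clamp(t, min=0)                       # max(0, x)
tanh_manual = (torch.exp(t) - torch.exp(-t)) / (torch.exp(t) + torch.exp(-t))
sigmoid_manual = 1 / (1 + torch.exp(-t))                  # 1 / (1 + e^(-x))
exp_t = torch.exp(t)
softmax_manual = exp_t / exp_t.sum(dim=1, keepdim=True)   # normalize each row

print(torch.allclose(relu_manual, nn.ReLU()(t)))              # True
print(torch.allclose(tanh_manual, nn.Tanh()(t)))              # True
print(torch.allclose(sigmoid_manual, nn.Sigmoid()(t)))        # True
print(torch.allclose(softmax_manual, nn.Softmax(dim=1)(t)))   # True
print(softmax_manual.sum(dim=1))                              # each row sums to 1
```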




### Loss functions

In [18]:
# mse
mse = nn.MSELoss()
a, b = torch.rand(5, 2), torch.rand(5, 2)
print(mse(a, b))

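# cross-entropy: expects raw scores (logits) of shape (N, C) and integer class labels of shape (N,)
cross_entropy = nn.CrossEntropyLoss() # the usual choice for classification
a = torch.rand(10, 2)
print(a)
b = torch.randint(2, (10,))
print(b)
print(cross_entropy(a, b)) # lower value --> predicted distribution is closer to the labels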

tensor(0.1130)
tensor([[0.6123, 0.2574],
        [0.8333, 0.9689],
        [0.5896, 0.7312],
        [0.9963, 0.2132],
        [0.3175, 0.8414],
        [0.0637, 0.7856],
        [0.0115, 0.6907],
        [0.7475, 0.2844],
        [0.8100, 0.6398],
        [0.3529, 0.0498]])
tensor([0, 1, 0, 0, 0, 1, 1, 0, 1, 1])
tensor(0.6223)
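
Both losses can be reproduced by hand, which shows what they compute: MSE is the mean of the squared differences, and cross-entropy is the mean negative log-softmax score of the correct class. The tensors below are small made-up examples.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Mean squared error: average of the element-wise squared differences.
pred = torch.tensor([[1.0, 2.0], [3.0, 5.0]])
target = torch.tensor([[0.0, 2.0], [3.0, 3.0]])
mse_manual = ((pred - target) ** 2).mean()   # (1 + 0 + 0 + 4) / 4
print(mse_manual.item())                     # 1.25
print(torch.allclose(mse_manual, nn.MSELoss()(pred, target)))  # True

# Cross-entropy: log-softmax over the classes, then pick the true class.
logits = torch.tensor([[2.0, 0.5], [0.1, 1.2]])
labels = torch.tensor([0, 1])
log_probs = F.log_softmax(logits, dim=1)
ce_manual = -log_probs[torch.arange(len(labels)), labels].mean()
print(torch.allclose(ce_manual, nn.CrossEntropyLoss()(logits, labels)))  # True
```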


### Neural Network

In [19]:
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # create a net with one hidden layer
        # input_dim 13, hidden_dim 3, output_dim 3
        # use ReLU and softmax activation func
        self.layers = nn.Sequential(
            nn.Linear(13, 3),
            nn.ReLU(),
            nn.Linear(3, 3),
            nn.Softmax(dim=1)
        )
    
    def forward(self, X):
        return self.layers(X)
    

model = Net()
model

Net(
  (layers): Sequential(
    (0): Linear(in_features=13, out_features=3, bias=True)
    (1): ReLU()
    (2): Linear(in_features=3, out_features=3, bias=True)
    (3): Softmax(dim=1)
  )
)

In [20]:
for name, param in model.named_parameters():
    print(f"Layer: {name} | Size: {param.size()} | Values : {param.data} \n")

Layer: layers.0.weight | Size: torch.Size([3, 13]) | Values : tensor([[ 0.2420, -0.1840, -0.1041, -0.1980,  0.0645, -0.2736, -0.0659,  0.0006,
          0.2485,  0.1369, -0.1444,  0.0494, -0.1201],
        [-0.2297,  0.0578, -0.0134, -0.0968, -0.1640,  0.2552,  0.1632,  0.1480,
         -0.0308,  0.0905,  0.1821, -0.0033,  0.0651],
        [-0.1257, -0.0071,  0.1277,  0.2254, -0.0067, -0.2448, -0.1086,  0.0676,
         -0.1130, -0.0836,  0.1829,  0.0643, -0.2081]]) 

Layer: layers.0.bias | Size: torch.Size([3]) | Values : tensor([0.2745, 0.2123, 0.1118]) 

Layer: layers.2.weight | Size: torch.Size([3, 3]) | Values : tensor([[-0.0234,  0.1145, -0.0681],
        [ 0.5418,  0.5636, -0.3831],
        [-0.2218,  0.0473,  0.5681]]) 

Layer: layers.2.bias | Size: torch.Size([3]) | Values : tensor([-0.3683,  0.0620,  0.3707]) 
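
The sizes printed above follow from the layer definitions: an `nn.Linear(in, out)` layer stores a weight of shape `(out, in)` and a bias of shape `(out,)`. As a quick check, the sketch below rebuilds the same stack of layers (under the illustrative name `layers`) and counts its parameters.

```python
import torch.nn as nn

layers = nn.Sequential(
    nn.Linear(13, 3),
    nn.ReLU(),
    nn.Linear(3, 3),
    nn.Softmax(dim=1),
)

# ReLU and Softmax have no parameters; only the Linear layers contribute.
n_params = sum(p.numel() for p in layers.parameters())
print(n_params)  # 54 = (13*3 + 3) + (3*3 + 3)

shapes = [tuple(p.shape) for p in layers.parameters()]
print(shapes)    # [(3, 13), (3,), (3, 3), (3,)]
```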



In [21]:
X = torch.rand(3, 13)
y = model(X)
print(y)

tensor([[0.1992, 0.3479, 0.4529],
        [0.2032, 0.3842, 0.4125],
        [0.1930, 0.4417, 0.3653]], grad_fn=<SoftmaxBackward0>)
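
Each output row is a probability distribution because the last layer is `Softmax`. Note that `nn.CrossEntropyLoss` applies log-softmax to its input itself, so a network trained with that loss is commonly written to return raw logits, with softmax applied only when probabilities are needed. The sketch below shows such a variant; it is not the tutorial's model, and the class name `LogitNet`, the batch size, and the labels are made up for illustration.

```python
import torch
import torch.nn as nn

class LogitNet(nn.Module):
    """Same layer sizes as Net, but without the final Softmax (hypothetical variant)."""
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(13, 3),
            nn.ReLU(),
            nn.Linear(3, 3),   # outputs raw logits
        )

    def forward(self, X):
        return self.layers(X)

net = LogitNet()
X_demo = torch.rand(4, 13)
labels = torch.tensor([0, 1, 2, 1])            # made-up class labels

logits = net(X_demo)
loss = nn.CrossEntropyLoss()(logits, labels)   # softmax is handled inside the loss
probs = torch.softmax(logits, dim=1)           # probabilities only when needed
pred = probs.argmax(dim=1)                     # predicted class per row

print(logits.shape)       # torch.Size([4, 3])
print(probs.sum(dim=1))   # each row sums to 1
```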


## Optimization

In [22]:
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

df = pd.read_csv("/Users/mac_1/Desktop/CHEM C142/wines.csv")
df.head()

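| | Alcohol % | Malic Acid | Ash | Alkalinity | Mg | Phenols | Flavanoids | Phenols.1 | Proantho-cyanins | Color intensity | Hue | OD280 315 | Proline | Start assignment | ranking |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 14.23 | 1.71 | 2.43 | 15.6 | 127 | 2.8 | 3.06 | 0.28 | 2.29 | 5.64 | 1.04 | 3.92 | 1065 | 1 | 1 |
| 1 | 13.24 | 2.59 | 2.87 | 21.0 | 118 | 2.8 | 2.69 | 0.39 | 1.82 | 4.32 | 1.04 | 2.93 | 735 | 1 | 1 |
| 2 | 14.83 | 1.64 | 2.17 | 14.0 | 97 | 2.8 | 2.98 | 0.29 | 1.98 | 5.2 | 1.08 | 2.85 | 1045 | 1 | 1 |
| 3 | 14.12 | 1.48 | 2.32 | 16.8 | 95 | 2.2 | 2.43 | 0.26 | 1.57 | 5.0 | 1.17 | 2.82 | 1280 | 1 | 1 |
| 4 | 13.75 | 1.73 | 2.41 | 16.0 | 89 | 2.6 | 2.76 | 0.29 | 1.81 | 5.6 | 1.15 | 2.9 | 1320 | 1 | 1 |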


In [23]:
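# features: every column except 'Start assignment' and 'ranking'
features = df.drop(['Start assignment', 'ranking'], axis=1).values
X = StandardScaler().fit_transform(features) # zero mean, unit variance per feature
X = torch.tensor(X, dtype=torch.float32)
y = torch.tensor(df['ranking'].values - 1) # shift the class labels so they start at 0

# hold out 20% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)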

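# define loss
# note: nn.CrossEntropyLoss applies log-softmax internally and expects raw logits;
# the model above already ends in Softmax, so softmax is effectively applied twice here
loss_func = nn.CrossEntropyLoss()

# define optimizer
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)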

epochs = 100
for _ in range(epochs):
    y_pred = model(X_train)
    loss = loss_func(y_pred, y_train)
    
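    optimizer.zero_grad() # clear gradients left over from the previous step
    loss.backward() # ask torch to compute the gradient of the loss w.r.t. each parameter
    optimizer.step() # update the parameters with Adam (adaptive step sizes plus momentum)

    # evaluate on the held-out test set; no_grad() skips gradient tracking
    with torch.no_grad():
        test_loss = loss_func(model(X_test), y_test)
        print(test_loss.item())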

1.1166212558746338
1.114992618560791
1.1133668422698975
1.111743450164795
1.110122561454773
1.1085048913955688
1.1068904399871826
1.1052794456481934
1.1036324501037598
1.1019701957702637
1.1003565788269043
1.0987454652786255
1.0971381664276123
1.095518708229065
1.0938935279846191
1.0922763347625732
1.0906411409378052
1.0890021324157715
1.0873675346374512
1.085737705230713
1.0841106176376343
1.0824875831604004
1.0808680057525635
1.0792546272277832
1.0776454210281372
1.0760397911071777
1.0744372606277466
1.0728365182876587
1.07123863697052
1.069644570350647
1.0680569410324097
1.0664745569229126
1.0648969411849976
1.063323974609375
1.0617568492889404
1.0601928234100342
1.0586313009262085
1.057071566581726
1.0555148124694824
1.0539599657058716
1.0524067878723145
1.0508428812026978
1.0492675304412842
1.047693133354187
1.0461186170578003
1.0445444583892822
1.042970895767212
1.0413872003555298
1.0397734642028809
1.0381560325622559
1.0365365743637085
1.0349156856536865
1.0332932472229004
1.031