<a href="https://colab.research.google.com/github/lwa01/229352-STAT-LEARING-FOR-DATA-SCI-2/blob/main/Lab08_Pytorch.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### Statistical Learning for Data Science 2 (229352)
#### Instructor: Donlapark Ponnoprat

#### [Course website](https://donlapark.pages.dev/229352/)

## Lab #8

Python has several deep learning frameworks, including PyTorch, TensorFlow and JAX.

<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/c/c6/PyTorch_logo_black.svg/2560px-PyTorch_logo_black.svg.png" width="100"/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<img src="https://upload.wikimedia.org/wikipedia/commons/2/2d/Tensorflow_logo.svg" width="40"/><img src="https://assets-global.website-files.com/621e749a546b7592125f38ed/62277da165ed192adba475fc_JAX.jpg" width="100"/>

In this lab, we will use PyTorch.

In [1]:
import numpy as np

import torch

# Tensor basics

## Basic tensor creation

### Creating a scalar (0-D) tensor

In [2]:
a = torch.tensor(2)
print(a)

b = torch.tensor(3)
print(a + b)

tensor(2)
tensor(5)
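A scalar tensor has zero dimensions, which is different from a 1-D tensor that happens to hold one element. The short sketch below checks this with `ndim` and `shape`; the names `s` and `v` are only illustrative.

```python
import torch

s = torch.tensor(2)      # scalar: 0-D tensor, no axes
v = torch.tensor([2])    # vector: 1-D tensor with a single element

print(s.ndim, s.shape)   # 0 torch.Size([])
print(v.ndim, v.shape)   # 1 torch.Size([1])
```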


### Converting a tensor to a Python scalar

In [3]:
c = b.item()
print(c)

3


### Creating a 2D tensor

In [4]:
A = torch.tensor([[1, 2], [5, 6]])
A

tensor([[1, 2],
        [5, 6]])

## Tensors and NumPy

### Converting a tensor to a NumPy array

In [5]:
B = A.numpy()
B

array([[1, 2],
       [5, 6]])

### Converting a NumPy array to a tensor

In [6]:
C = torch.from_numpy(B)
C

tensor([[1, 2],
        [5, 6]])
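One detail worth knowing about these conversions: `torch.from_numpy` does not copy the data, so the tensor and the array share the same memory, whereas `torch.tensor` makes a copy. A minimal sketch of the difference (the names `arr`, `t` and `t_copy` are illustrative):

```python
import numpy as np
import torch

arr = np.array([[1, 2], [5, 6]])
t = torch.from_numpy(arr)     # shares memory with arr (no copy)

arr[0, 0] = 100               # modify the NumPy array in place
print(t[0, 0].item())         # 100: the tensor sees the change

t_copy = torch.tensor(arr)    # torch.tensor copies the data
arr[0, 0] = -1
print(t[0, 0].item())         # -1: still linked to arr
print(t_copy[0, 0].item())    # 100: the copy is unaffected
```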

## PyTorch and GPU

Check whether a GPU is available:

In [7]:
torch.cuda.is_available()

True

In [8]:
C.device

device(type='cpu')

In [9]:
D = C.cuda()
D

tensor([[1, 2],
        [5, 6]], device='cuda:0')

In [10]:
D.device

device(type='cuda', index=0)

In [11]:
C + D

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

In [12]:
# D.numpy() fails for a CUDA tensor; copy it to the CPU first
E = D.cpu().numpy()
E

array([[1, 2],
       [5, 6]])
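The cells above assume that a GPU is present. A common way to write code that also runs on a CPU-only machine is to pick the device once and move every tensor to it with `.to(device)`. The sketch below follows that pattern; the variable names are illustrative.

```python
import torch

# Use the GPU when one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.tensor([[1, 2], [5, 6]]).to(device)
y = torch.tensor([[1, 1], [1, 1]]).to(device)

z = x + y                    # works: both tensors are on the same device
result = z.cpu().numpy()     # move back to the CPU before converting to NumPy
print(device, result)
```

Because both operands are moved to the same `device`, the addition does not raise the "Expected all tensors to be on the same device" error.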

## Basic operations

In [13]:
D ** 2

tensor([[ 1,  4],
        [25, 36]], device='cuda:0')

In [14]:
D - 2

tensor([[-1,  0],
        [ 3,  4]], device='cuda:0')

In [15]:
D.cpu() * np.array([1,2])


tensor([[ 1,  4],
        [ 5, 12]])
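The last result comes from broadcasting: the length-2 vector is multiplied with every row of the matrix. The same thing can be done with tensors only, which avoids mixing NumPy arrays and tensors in one expression. A small sketch with illustrative names `M`, `row` and `col`:

```python
import torch

M = torch.tensor([[1, 2], [5, 6]])
row = torch.tensor([1, 2])        # shape (2,): applied to every row of M
col = torch.tensor([[1], [2]])    # shape (2, 1): applied to every column of M

print(M * row)   # tensor([[ 1,  4], [ 5, 12]])
print(M * col)   # tensor([[ 1,  2], [10, 12]])
```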

### Matrix multiplication

In [16]:
A = torch.tensor([[4, 5],
                  [8, 9]])
B = torch.tensor([[1, 4],
                  [7, 8]])
torch.matmul(A, B)

tensor([[ 39,  56],
        [ 71, 104]])

In [17]:
torch.mm(A, B)

tensor([[ 39,  56],
        [ 71, 104]])

In [18]:
A @ B

tensor([[ 39,  56],
        [ 71, 104]])
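Note that `*` is element-wise multiplication, not matrix multiplication. The sketch below contrasts the two on the same `A` and `B`, and checks that `torch.matmul` and `torch.mm` agree for 2-D inputs.

```python
import torch

A = torch.tensor([[4, 5],
                  [8, 9]])
B = torch.tensor([[1, 4],
                  [7, 8]])

elementwise = A * B        # multiplies matching entries
matrix_product = A @ B     # rows of A times columns of B

print(elementwise)         # tensor([[ 4, 20], [56, 72]])
print(matrix_product)      # tensor([[ 39,  56], [ 71, 104]])
print(torch.equal(torch.matmul(A, B), torch.mm(A, B)))  # True
```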

### Matrix transpose

In [19]:
A.t()

tensor([[4, 8],
        [5, 9]])

## Creating a specific type of tensor

In [20]:
# Tensor containing only zeros
a = torch.zeros(1, 2, 3)
a

# Tensor containing only ones
b = torch.ones(1, 2, 3)
b

tensor([[[1., 1., 1.],
         [1., 1., 1.]]])

In [21]:
# Identity matrix
I = torch.eye(3)
I

tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])

In [22]:
# Random
R1 = torch.rand(2, 3) # Uniform dist.
R1

tensor([[0.3022, 0.6177, 0.5184],
        [0.7355, 0.0435, 0.9656]])

In [23]:
R2 = torch.randn(2, 3) # Normal dist.
R2

tensor([[ 0.4766, -0.9697, -0.6313],
        [ 0.7008,  1.5991,  0.2512]])

In [24]:
# Vector of sequential values: start at 1, stop before 10, step 2
a = torch.arange(1, 10, 2)
a

tensor([1, 3, 5, 7, 9])
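The outputs above hint at a difference in data types: `torch.zeros`, `torch.ones` and `torch.eye` print with a decimal point, while `torch.arange` with integer arguments does not. The sketch below makes the types explicit, assuming PyTorch's default floating-point type has not been changed; the variable names are illustrative.

```python
import torch

z = torch.zeros(2, 3)
r = torch.arange(1, 10, 2)
print(z.dtype)    # torch.float32
print(r.dtype)    # torch.int64

# The dtype can be requested explicitly
zi = torch.zeros(2, 3, dtype=torch.int64)
print(zi.dtype)   # torch.int64

# *_like functions copy the shape and dtype of an existing tensor
like = torch.ones_like(r)
print(like)       # tensor([1, 1, 1, 1, 1])
```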

## Tensor's shape

In [25]:
A.shape

torch.Size([2, 2])

In [26]:
A.size()

torch.Size([2, 2])

### Checking the shape of a tensor

In [27]:
u = torch.arange(6)
print(u.shape)
u

torch.Size([6])


tensor([0, 1, 2, 3, 4, 5])

### Changing the shape of a tensor

In [28]:
v = u.reshape(2, 3)
v

tensor([[0, 1, 2],
        [3, 4, 5]])

In [29]:
w = u.view(3, 2)
w

tensor([[0, 1],
        [2, 3],
        [4, 5]])

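In general, use `reshape`: it returns a view of the same memory when the layout allows it and copies the data otherwise. Use `view` when you want a guarantee that no copy is made; it raises an error if the tensor's memory layout is incompatible with the requested shape.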
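The difference between the two shows up on non-contiguous tensors, such as the result of a transpose. In the sketch below (variable names are illustrative), `view` shares memory with the original tensor, fails on the transposed tensor, and `reshape` still succeeds by copying.

```python
import torch

u = torch.arange(6)
v = u.view(2, 3)            # a view: no data is copied
v[0, 0] = 100
print(u[0].item())          # 100: u and v share memory

t = v.t()                   # transpose gives a non-contiguous tensor
print(t.is_contiguous())    # False

try:
    t.view(6)               # view cannot reinterpret this memory layout
except RuntimeError as err:
    print("view failed:", type(err).__name__)

flat = t.reshape(6)         # reshape falls back to copying the data
print(flat)                 # tensor([100,   3,   1,   4,   2,   5])
```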

### Stacking and concatenating tensors

In [30]:
a = torch.arange(4)
b = torch.arange(4) + 1
a, b

(tensor([0, 1, 2, 3]), tensor([1, 2, 3, 4]))

In [31]:
c = torch.stack([a, b], axis = 0) # stack along rows: a and b become the rows
c

tensor([[0, 1, 2, 3],
        [1, 2, 3, 4]])

In [32]:
d = torch.stack([a, b], axis = 1) # stack along columns: a and b become the columns
d

tensor([[0, 1],
        [1, 2],
        [2, 3],
        [3, 4]])

In [33]:
# concatenate
e = torch.cat([a, b], axis = 0)
e

tensor([0, 1, 2, 3, 1, 2, 3, 4])

In [34]:
# concatenating along axis 1 fails: a and b are 1-D, so the only valid axes are -1 and 0
f = torch.cat([a, b], axis = 1)
f

IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
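The error occurs because `torch.cat` joins tensors along an axis that already exists, while `torch.stack` creates a new one. To concatenate `a` and `b` side by side, they first need a second axis. A minimal sketch, where `a2`, `b2` and `side_by_side` are illustrative names:

```python
import torch

a = torch.arange(4)
b = torch.arange(4) + 1

a2 = a.unsqueeze(1)    # shape (4,) -> (4, 1)
b2 = b.unsqueeze(1)

side_by_side = torch.cat([a2, b2], dim=1)   # now axis 1 exists
print(side_by_side.shape)                   # torch.Size([4, 2])

# Same result as stacking the 1-D tensors along a new axis 1
print(torch.equal(side_by_side, torch.stack([a, b], dim=1)))  # True
```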

### Squeezing a tensor (removing an extra dimension)

In [36]:
A = torch.zeros(1, 2, 3)

In [37]:
# Remove the 1st axis (it has size 1)
B = A.squeeze(0) # with no argument, squeeze removes every axis of size 1
B.shape

torch.Size([2, 3])

### Unsqueezing a tensor (adding an extra dimension)

In [38]:
C = B.unsqueeze(axis = 0)
C.shape

torch.Size([1, 2, 3])
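A few more cases help show how the axis argument behaves. The sketch below uses a tensor with two size-1 axes (an illustrative choice, not from the lab) and shows that `squeeze` leaves an axis alone when its size is not 1.

```python
import torch

A = torch.zeros(1, 2, 1, 3)

print(A.squeeze().shape)     # torch.Size([2, 3]): all size-1 axes removed
print(A.squeeze(0).shape)    # torch.Size([2, 1, 3]): only axis 0 removed
print(A.squeeze(1).shape)    # torch.Size([1, 2, 1, 3]): axis 1 has size 2, unchanged

v = torch.arange(3)
print(v.unsqueeze(0).shape)    # torch.Size([1, 3]): row vector
print(v.unsqueeze(-1).shape)   # torch.Size([3, 1]): column vector
```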

## Indexing

In [39]:
P = torch.arange(12).reshape(3,4)
print(P)
print(P[0])
print(P[:, 0])
print(P[-1])
print(P[:, -1])
print(P[-2:])
print(P[:, -2:])

tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]])
tensor([0, 1, 2, 3])
tensor([0, 4, 8])
tensor([ 8,  9, 10, 11])
tensor([ 3,  7, 11])
tensor([[ 4,  5,  6,  7],
        [ 8,  9, 10, 11]])
tensor([[ 2,  3],
        [ 6,  7],
        [10, 11]])


In [41]:
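P[0:2, 0:2]              # slices: the top-left 2x2 block (not displayed; only the last expression is shown)
P[[0, 1, 2], [0, 1, 2]]  # index lists: the entries P[0,0], P[1,1], P[2,2]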

tensor([ 0,  5, 10])
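Slices and index lists behave differently: slices select a rectangular block, while two index lists are paired up element by element. The sketch below shows both on the same `P`, together with boolean-mask indexing, which is an extra not used in the cells above.

```python
import torch

P = torch.arange(12).reshape(3, 4)

block = P[0:2, 0:2]                 # rows 0-1 and columns 0-1
print(block)                        # tensor([[0, 1], [4, 5]])

diag = P[[0, 1, 2], [0, 1, 2]]      # pairs (0,0), (1,1), (2,2)
print(diag)                         # tensor([ 0,  5, 10])

large = P[P > 8]                    # boolean mask: entries greater than 8
print(large)                        # tensor([ 9, 10, 11])
```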

In [42]:
# Inverse
torch.inverse(torch.tensor([[1.0, 2.0], [1.0, 4.0]]))

tensor([[ 2.0000, -1.0000],
        [-0.5000,  0.5000]])
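A quick sanity check on an inverse is to multiply it back with the original matrix and compare the product with the identity. The sketch below uses `torch.linalg.inv`, for which `torch.inverse` is an alias; the tolerance `atol=1e-6` is an assumed value chosen to absorb floating-point rounding.

```python
import torch

M = torch.tensor([[1.0, 2.0], [1.0, 4.0]])
M_inv = torch.linalg.inv(M)

# M @ M_inv should equal the 2x2 identity up to rounding error
is_identity = torch.allclose(M @ M_inv, torch.eye(2), atol=1e-6)
print(is_identity)
```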

# Exercise

In this exercise, we will simulate a data set with 200 rows and 7 variables and fit a linear regression to it.

1. Create three random $N(0,1)$ tensors: `X`, `b` and `e` with `X.shape = (200, 7)`, `b.shape = (8, 1)` and `e.shape = (200, 1)` respectively.
2. Create a tensor that contains only 1's with shape `(200, 1)`.
3. Modify tensor `X` by adding the tensor in 2. as the first column.
4. Compute `y` using the following formula:
$$ y = Xb + e $$
5. Fit a linear regression to the data `X` and `y` and obtain a tensor of estimated coefficients `b_hat`. The formula for `b_hat` is given by:
$$ \hat{b} = (X^TX)^{-1}X^Ty $$
Note: use `torch.inverse(...)` to calculate the inverse.
6. Compute the predictions `y_hat`, given by:
$$ \hat{y} = X\hat{b} $$
7. Convert both `y` and `y_hat` from tensors to NumPy arrays and calculate the MSE:
$$ MSE = \frac{1}{200}\sum_{i=1}^{200} (y_i - \hat{y}_i)^2 $$

In [43]:
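# Small illustration of step 3: a 3x3 matrix X ...
X = torch.tensor([[2, 3, 2], [4, 6, 7], [7, 2, 4]])
print(X)

# ... and the same matrix with a column of 1's added as the first column
X = torch.tensor([[1, 2, 3, 2], [1, 4, 6, 7], [1, 7, 2, 4]])
print(X)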

tensor([[2, 3, 2],
        [4, 6, 7],
        [7, 2, 4]])
tensor([[1, 2, 3, 2],
        [1, 4, 6, 7],
        [1, 7, 2, 4]])


In [47]:
# 1. Create three random N(0,1) tensors
torch.manual_seed(0)                 # fix the seed for reproducibility
X = torch.randn(200, 7)              # design matrix X
b = torch.randn(8, 1)                # true coefficients b (intercept + 7 slopes)
e = torch.randn(200, 1)              # noise e

In [48]:
# 2. Create a tensor that contains only 1's
ones = torch.ones(200, 1)

In [50]:
# 3. Modify tensor X by adding the tensor in 2. as the first column
X = torch.cat((ones, X), axis=1)

In [51]:
# 4. Compute y
y = X @ b + e

In [55]:
# 5. Fit a linear regression to the data X and y and obtain a tensor of estimated coefficients b_hat
XtX = X.T @ X          # compute X^T X
XtX_inv = torch.inverse(XtX)  # inverse of X^T X
Xty = X.T @ y          # compute X^T y
b_hat = XtX_inv @ Xty  # estimated regression coefficients

In [54]:
# 6. Compute the predictions y_hat
y_hat = X @ b_hat

In [56]:
# 7. Convert both y and y_hat from tensors to NumPy arrays and calculate the MSE
y_np = y.numpy()
y_hat_np = y_hat.numpy()
MSE = np.mean((y_np - y_hat_np) ** 2)
print("MSE =", MSE)

MSE = 0.94517547
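The exercise asks for `torch.inverse`, which follows the formula literally. As a side note, the same coefficients can be obtained without forming the inverse explicitly, which is generally preferred numerically. The sketch below is one possible variant, not part of the required solution: it re-simulates the data in the same way and compares the explicit-inverse estimate with `torch.linalg.solve` and `torch.linalg.lstsq`. The names `b_hat_inv`, `b_hat_solve` and `b_hat_lstsq` and the tolerance `atol=1e-4` are assumptions made for this illustration.

```python
import torch

torch.manual_seed(0)
X = torch.cat((torch.ones(200, 1), torch.randn(200, 7)), dim=1)  # add intercept column
b = torch.randn(8, 1)
e = torch.randn(200, 1)
y = X @ b + e

# Explicit inverse, as in the exercise
b_hat_inv = torch.inverse(X.T @ X) @ (X.T @ y)

# Solve the normal equations (X^T X) b = X^T y without inverting
b_hat_solve = torch.linalg.solve(X.T @ X, X.T @ y)

# Least-squares solver applied directly to X and y
b_hat_lstsq = torch.linalg.lstsq(X, y).solution

print(torch.allclose(b_hat_inv, b_hat_solve, atol=1e-4))
print(torch.allclose(b_hat_inv, b_hat_lstsq, atol=1e-4))

# The MSE can also be computed directly on tensors
mse = torch.mean((y - X @ b_hat_solve) ** 2).item()
print("MSE =", mse)
```

Since the noise `e` has variance 1, an MSE a little below 1 is what one would expect from a least-squares fit on this simulated data.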
