# PyTorch
PyTorch is a Python package widely used for neural network computations.
Its main features include Tensors, Gradient Descent and Modules.
Today we will look at what Tensors are and the main operations we can perform on them.
We will look at Gradient Descent and Modules in future lessons.

## Tensors
Tensors are a representation of data stored in memory.
You can see them as an evolution of NumPy arrays. They can have different data types and dimensions.
The main feature of Tensors, though, is the automatic computation of gradients, which is useful during backpropagation.
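
As a minimal sketch of what automatic gradient computation looks like in practice, the snippet below builds a small tensor with `requires_grad=True` and differentiates a simple sum of squares. The values and the function are illustrative assumptions, not part of the lesson material.

```python
import torch

# requires_grad=True asks PyTorch to record the operations applied to x
x = torch.tensor([2.0, 3.0], requires_grad=True)

# y = x_0^2 + x_1^2, so dy/dx = 2 * x
y = (x ** 2).sum()
y.backward()  # fills x.grad with dy/dx

print(x.grad)  # tensor([4., 6.])
```

The gradient ends up in `x.grad`, which is the mechanism backpropagation relies on.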


### Initializing a Tensor
There are many ways to initialize a tensor. Most of the time you will initialize it from a random distribution or a set of constants.
It's also possible to initialize a tensor from already existing data, like Python lists and NumPy arrays.

In [None]:
import torch
import numpy as np
import warnings
warnings.filterwarnings('ignore')

data = [[1, 2],[3, 4]]
data_np = np.array(data)
x_data = torch.tensor(data)

print(data_np, '\n')
print(x_data)

### Tensor parameters
Tensors have many attributes that define how they operate.
The most important ones are listed below:
- dtype: the type of data the tensor contains (integers, floats, booleans...)
- shape: the shape of the tensor. As with matrices, some operations between tensors require them to have compatible dimensions
- device: this is a technical attribute. It specifies whether a tensor is stored in RAM ('cpu') or on the GPU ('cuda'). Operations between tensors are only possible if the tensors are stored on the same device.
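
The attributes above can also be changed with `Tensor.to`. The following is a small sketch, with variable names chosen here for illustration, that converts the data type and moves a tensor to the GPU only when one is available.

```python
import torch

t = torch.tensor([[1, 2], [3, 4]])   # integer data gives dtype torch.int64
t_float = t.to(torch.float32)        # returns a converted tensor; t is unchanged

# Pick the GPU only if one is actually available on this machine
device = "cuda" if torch.cuda.is_available() else "cpu"
t_dev = t_float.to(device)

print(t.dtype, t_float.dtype, t_dev.device)
```

Writing the device choice this way keeps the same code runnable on machines with and without a GPU.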

In [None]:
print('x_data is of type:', x_data.dtype)
print('x_data has shape:', x_data.shape)
print('x_data has rows and columns:', x_data.shape[0], x_data.shape[1])
print('x_data is on:', x_data.device)

### From NumPy to Tensor (and vice versa)
As stated before, it's possible to create a tensor from already existing data, like NumPy arrays. This conversion keeps the data type and shape intact.
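
One detail worth knowing is that `torch.from_numpy` shares memory with the source array, while `torch.tensor` makes a copy. The sketch below, using a throwaway array introduced only for this illustration, shows the difference.

```python
import numpy as np
import torch

arr = np.array([1, 2, 3])
shared = torch.from_numpy(arr)   # shares memory with arr
copied = torch.tensor(arr)       # makes an independent copy

arr[0] = 100                     # modify the NumPy array in place

print(shared)  # first element is now 100
print(copied)  # first element is still 1
```

If you need the tensor to be independent of the array, prefer the copying form.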

In [None]:
x_np = torch.from_numpy(data_np)

print(x_np, '\n')
print(x_data)

torch_2_np = x_np.numpy()
print(data_np, '\n')
print(torch_2_np)

### Initializing Tensor from Constants/Random distributions
The final way to create a tensor is to use one of the built-in constructors of torch.
- ones: creates a tensor filled with ones
- rand: creates a tensor filled with numbers drawn from a uniform distribution on [0, 1)
- zeros: creates a tensor filled with zeros
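
To make the random constructors concrete, here is a short sketch comparing `torch.rand` with `torch.randn`, which is not in the list above but is a closely related constructor. The seed and sample size are arbitrary choices for the illustration.

```python
import torch

torch.manual_seed(0)          # make the random draws reproducible

uniform = torch.rand(1000)    # uniform on [0, 1)
normal = torch.randn(1000)    # standard normal (mean 0, std 1)

print(uniform.min(), uniform.max())
print(normal.mean(), normal.std())
```

The uniform samples never leave the interval [0, 1), while the normal samples are centered around zero and can be negative.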

In [None]:
x_ones = torch.ones_like(x_data) # retains the properties of x_data
print(f"Ones Tensor: \n {x_ones} \n")

x_rand = torch.rand_like(x_data, dtype=torch.float) # overrides the datatype of x_data
print(f"Random Tensor: \n {x_rand} \n")

shape = (2,3,)
rand_tensor = torch.rand(shape)
ones_tensor = torch.ones(shape)
zeros_tensor = torch.zeros(shape)

print(f"Random Tensor: \n {rand_tensor} \n")
print(f"Ones Tensor: \n {ones_tensor} \n")
print(f"Zeros Tensor: \n {zeros_tensor}")

### Operations on Tensors
Tensors operate much like a Python list or NumPy array.
- You can select a subset of the tensor
- You can select a single value
- You can concatenate tensors on a given dimension
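
The effect of the `dim` argument when concatenating is easiest to see from the resulting shapes. The sketch below uses two small tensors made up for this example, and also shows boolean-mask selection as one more way to pick a subset.

```python
import torch

a = torch.zeros(2, 3)
b = torch.ones(2, 3)

rows = torch.cat([a, b], dim=0)   # join along rows: shape (4, 3)
cols = torch.cat([a, b], dim=1)   # join along columns: shape (2, 6)

print(rows.shape, cols.shape)
print(rows[2, 0])                 # single value, taken from the first row of b
print(rows[rows > 0])             # boolean mask keeps only the ones
```

All dimensions other than `dim` must match, otherwise `torch.cat` raises an error.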

In [None]:
tensor = torch.ones(4, 3)
print(f"First row: {tensor[0]}")
print(f"First column: {tensor[:, 0]}")
print(f"Last column: {tensor[..., -1]}")
tensor[:,1] = 0 # set second column to 0
print(tensor)

t1 = torch.cat([tensor, tensor, tensor], dim=1) # concatenate on the column axis
print(t1)

### Arithmetic operations with Tensors
As expected, all the algebraic operations available on matrices are available on tensors. These are the most popular ones, but remember to look at the documentation if you are looking for a specific function!
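
Besides products, reductions and broadcasting come up constantly. The following is a minimal sketch with a hand-written 2x2 tensor (an assumption made for this example) so that the results are easy to check by hand.

```python
import torch

m = torch.tensor([[1.0, 2.0], [3.0, 4.0]])

total = m.sum()                            # sum of all elements
col_means = m.mean(dim=0)                  # mean over rows, one value per column
shifted = m + torch.tensor([10.0, 20.0])   # the vector is broadcast to every row

print(total.item())   # 10.0
print(col_means)      # tensor([2., 3.])
print(shifted)
```

Use `.item()` when you need a plain Python number out of a one-element tensor.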

In [None]:
# This computes the matrix multiplication between two tensors. y1, y2, y3 will have the same value
y1 = tensor @ tensor.T
y2 = tensor.matmul(tensor.T)

y3 = torch.rand_like(y1)
torch.matmul(tensor, tensor.T, out=y3)


# This computes the element-wise product. z1, z2, z3 will have the same value
z1 = tensor * tensor
z2 = tensor.mul(tensor)

z3 = torch.rand_like(tensor)
torch.mul(tensor, tensor, out=z3)

z4 = z1 + z3

print('z1\n', z1, '\nz3\n', z3, '\nz4\n', z4)

z5 = z4.to(torch.int32)

print(z5)

## Task - Linear Regression Using Torch
This exercise is meant to push your Python knowledge and have you use functions that will be crucial in the upcoming lessons.
The task itself is simple: re-implement the exercise of the first lab lesson (linear regressor with least squares and gradient descent), but this time using Python classes and torch tensors!

You will be provided with a rough structure of how I expect the code to work. The rest is up to you!

### Task 1 - LeastSquareRegressor Class
Instead of using simple functions to calculate the regression, you have to implement a class LeastSquareRegressor(), following the structure that sklearn uses for its own regressors.
A regressor class should have some attributes and methods that use torch to perform the operations seen in lesson 1.
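
Assuming the least-squares solution from lesson 1 is the usual normal-equation form, the computation can be written as below. The symbol $\tilde{X}$ is notation introduced here: it is $X$ with a column of ones appended, so that the bias $b$ is learned together with the weights $W$.

$$
\tilde{X} = \begin{bmatrix} X & \mathbf{1} \end{bmatrix}, \qquad
\begin{bmatrix} W \\ b \end{bmatrix} = \left(\tilde{X}^\top \tilde{X}\right)^{-1} \tilde{X}^\top Y
$$

Here $X$ has shape $(N, F)$ and $Y$ has shape $(N, 1)$, so the stacked solution has shape $(F+1, 1)$: the first $F$ rows are $W$ and the last row is $b$.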

In [None]:
import torch
import warnings
warnings.filterwarnings('ignore')

class LeastSquareRegressor():
    """
    This class should expose W as an attribute. This means that when I instantiate a LeastSquareRegressor object, I should be able to access its current W matrix
    :attribute W (torch.Tensor): the tensor containing the weights of this regressor
    :attribute b (torch.Tensor): the 1x1 tensor containing the bias
    """
    def __init__(self):
        self.W = None
        self.b = None

    def fit(self, X : torch.Tensor, Y : torch.Tensor) -> None:
        """
        Given X and Y, calculate W using Least Square solutions
        :param X (torch.Tensor): The independent data of shape (N,F)
        :param Y (torch.Tensor): The dependent data (labels) of shape (N,1)
        :return None
        """
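        # Append a column of ones (same dtype and device as X) so the last weight acts as the bias
        X = torch.cat([X, torch.ones((X.shape[0], 1), dtype=X.dtype, device=X.device)], dim=1)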

        W = torch.linalg.inv(X.transpose(0,1).matmul(X)).matmul(X.transpose(0,1).matmul(Y))

        self.W = W[:-1]
        self.b = W[-1]
    
    def regress(self, X) -> torch.Tensor:
        """
        Given X, the tensor of independent data, calculate Y using the trained weights
        :param X (torch.Tensor): The tensor of independent data of shape (N, F)
        :return (torch.Tensor): a tensor of shape (N,1) containing the regressed data
        """
        return X.matmul(self.W) + self.b

In [None]:
# Test your progress
LSReg = LeastSquareRegressor()
print(LSReg.W, LSReg.b)

X = torch.rand((5,2))
Y = torch.rand((5,1))
LSReg.fit(X, Y)
print(LSReg.W, LSReg.b)

### Task 2 - Run a regression on the Boston dataset
We have seen in class how to do it, now it's time to do it on your own using torch!

TIP: you will need to import the dataset as usual and find a way to convert it to a torch.Tensor
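
As a hint for the conversion step, here is a small sketch that uses random NumPy arrays as hypothetical stand-ins for the dataset (the names `features_np` and `targets_np` are made up for this example). It keeps the `float64` type and turns the targets into a column of shape (N, 1).

```python
import numpy as np
import torch

# Hypothetical stand-in data; the real exercise loads a dataset instead
features_np = np.random.rand(6, 3)   # float64 by default, shape (N, F)
targets_np = np.random.rand(6)       # shape (N,)

features = torch.from_numpy(features_np)                # keeps float64
targets = torch.from_numpy(targets_np).reshape(-1, 1)   # column of shape (N, 1)

print(features.dtype, features.shape)
print(targets.dtype, targets.shape)
```

Reshaping the targets to (N, 1) matches the shape that the regressor docstrings describe.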

In [None]:
# Note: load_boston was removed in scikit-learn 1.2, so this import needs an older version
from sklearn.datasets import load_boston
## Import the dataset and create the X and Y tensors, as well as X_np and Y_np, the numpy arrays containing the same data
# X and Y should be of type torch.float64

def get_data():
    X_np, Y_np = load_boston(return_X_y=True)
    X = torch.from_numpy(X_np)
    Y = torch.from_numpy(Y_np)

    return X, Y, X_np, Y_np

X, Y, X_np, Y_np = get_data()
print(X.shape, Y.shape)
print(X_np.shape, Y_np.shape)


In [None]:
from sklearn.metrics import r2_score
from torchmetrics import R2Score
## Fit our regressor to the data!
LSReg = LeastSquareRegressor()
LSReg.fit(X,Y)
print(LSReg.W, LSReg.b, '\n')
y_pred_torch = LSReg.regress(X)
R2torch = R2Score()
print(f'R2 score of our regressor is {R2torch(y_pred_torch,Y)}\n\n')

## Compare the results with the sklearn regressor
from sklearn.linear_model import LinearRegression
SKReg = LinearRegression()
SKReg.fit(X_np,Y_np)
print(SKReg.coef_, SKReg.intercept_, '\n')
y_pred_sk = X_np.dot(SKReg.coef_) + SKReg.intercept_
print(f'R2 score of SKlearn regressor is {r2_score(Y_np, y_pred_sk)}')

### Bonus Task - Gradient Descent Regressor with PyTorch
Following what you just did with the least-squares regressor, try to implement the gradient descent one.
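
Before writing the update loop, it can help to check the gradient formula against autograd. The sketch below assumes the loss is half the mean squared error, whose gradient with respect to the weights is X^T (X W - Y) / N. The data is random and, to keep the check short, no bias column is appended.

```python
import torch

torch.manual_seed(0)
N, F = 8, 3
X = torch.rand(N, F, dtype=torch.float64)
Y = torch.rand(N, 1, dtype=torch.float64)
W = torch.zeros(F, 1, dtype=torch.float64, requires_grad=True)

# Loss assumed here: half the mean squared error
loss = ((X @ W - Y) ** 2).sum() / (2 * N)
loss.backward()   # autograd stores d(loss)/dW in W.grad

# Closed-form gradient of the same loss
analytic = X.T @ (X @ W.detach() - Y) / N

print(torch.allclose(W.grad, analytic))  # True
```

If the two gradients agree, the closed-form expression is safe to use inside a hand-written gradient descent loop.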

In [92]:
class GDRegressor():
    """
    This class should expose W as an attribute. This means that when I instantiate a GDRegressor object, I should be able to access its current W matrix
    :attribute W (torch.Tensor): the tensor containing the weights of this regressor
    :attribute b (torch.Tensor): the 1x1 tensor containing the bias
    """
    def __init__(self):
        self.W = None
        self.b = None
        
    def fit(self, X : torch.Tensor, Y : torch.Tensor, n_iters : int=1000, alpha : float=0.005) -> None:
        """
        Given X and Y, calculate W using GradientDescent, updating W with alpha over n_iters iterations
        :param X (torch.Tensor): The independent data of shape (N,F)
        :param Y (torch.Tensor): The dependent data (labels) of shape (N,1)
        :param n_iters(int): The number of iterations for gradient descent
        :param alpha (float): learning rate of gradient descent
        :return None
        """
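        # Append a column of ones so the last weight acts as the bias; match the dtype and device of X
        X = torch.cat([X, torch.ones((X.shape[0], 1), dtype=X.dtype, device=X.device)], dim=1)
        W = torch.zeros((X.shape[1], 1), dtype=X.dtype, device=X.device)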

        for i in range(n_iters):
            #print(self.gradfn(W, X, Y))
            W = W - alpha*self.gradfn(W, X, Y)
        
        self.W = W[0:-1]
        self.b = W[-1]

    def gradfn(self, W, X, Y):
        """
        Function that calculates the gradient of the half mean squared error
        :param W (torch.Tensor): tensor of shape (F+1, 1) containing the current guess of the weights (and bias)
        :param X (torch.Tensor): tensor of shape (N, F+1) of input features, including the column of ones
        :param Y (torch.Tensor): tensor of shape (N, 1) of target y values
        :return gradient of each weight evaluated at the current value
        """
        return X.transpose(0,1).matmul(X.matmul(W)-Y)/X.shape[0]
    
    def regress(self, X) -> torch.Tensor:
        """
        Given X, the tensor of independent data, calculate Y using the trained weights
        :param X (torch.Tensor): The tensor of independent data of shape (N, F)
        :return (torch.Tensor): a tensor of shape (N,1) containing the regressed data
        """
        return X.matmul(self.W) + self.b


In [95]:
# Test your progress
GDReg = GDRegressor()
print(GDReg.W, GDReg.b)

X = torch.rand((5,2))
Y = torch.rand((5,1))
GDReg.fit(X, Y, n_iters=10000, alpha=1)
print(GDReg.W, GDReg.b)

# Regression metrics are somewhat arbitrary here because the data is generated randomly:
# some runs will produce data that is well suited to linear regression,
# while others may produce data that cannot be approximated well by a linear model.
Y_pred = GDReg.regress(X)
R2torch = R2Score()
print(f'R2 score of our GD regressor is {R2torch(Y_pred,Y)}\n\n')




None None
tensor([[0.7517],
        [0.7678]]) tensor([-0.3895])
R2 score of our GD regressor is 0.38012903928756714




In [100]:
from sklearn.datasets import make_regression
from sklearn.preprocessing import StandardScaler
import numpy as np

X, Y = make_regression(n_samples=10000, n_features=20)
# Standardizing data
scaler = StandardScaler()
scaler.fit(X)
X = scaler.transform(X).astype(np.float32)

X_np = X.copy().astype(np.float32)
Y_np = Y.copy().astype(np.float32)

X = torch.from_numpy(X.astype(np.float32))
Y = torch.from_numpy(Y.astype(np.float32))[:,None]

GDReg = GDRegressor()
GDReg.fit(X,Y, n_iters=1000, alpha=1)
# print(GDReg.W, GDReg.b, '\n')
y_pred_gd = GDReg.regress(X)
R2torch = R2Score()
print(f'R2 score of our GD regressor is {R2torch(y_pred_gd,Y)}\n\n')

## Least Square
LSReg = LeastSquareRegressor()
LSReg.fit(X,Y)
# print(LSReg.W, LSReg.b, '\n')
y_pred_torch = LSReg.regress(X)
R2torch = R2Score()
print(f'R2 score of our LS regressor is {R2torch(y_pred_torch,Y)}\n\n')

## Compare the results with the sklearn regressor
from sklearn.linear_model import LinearRegression
SKReg = LinearRegression()
SKReg.fit(X_np,Y_np)
# print(SKReg.coef_, SKReg.intercept_, '\n')
y_pred_sk = X_np.dot(SKReg.coef_) + SKReg.intercept_
print(f'R2 score of SKlearn regressor is {r2_score(Y_np, y_pred_sk)}')

R2 score of our GD regressor is 1.0


R2 score of our LS regressor is 1.0


R2 score of SKlearn regressor is 0.99999999999968
