# Intro to Deep Learning

## Tasks

### Task 1

Write a function `function01` with the following signature:

`def function01(tensor: torch.Tensor, count_over: str) -> torch.Tensor:`

- If the `count_over` parameter equals `'columns'`, return the mean of each column (reduce along `dim=0`).
- If it equals `'rows'`, return the mean of each row (reduce along `dim=1`).

In [1]:
import torch


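def function01(tensor: torch.Tensor, count_over: str) -> torch.Tensor:
    if count_over == 'columns':
        # dim=0 collapses the rows, leaving one mean per column
        return torch.mean(tensor, dim=0)
    elif count_over == 'rows':
        # dim=1 collapses the columns, leaving one mean per row
        return torch.mean(tensor, dim=1)
    # fail loudly instead of silently returning None
    raise ValueError(f"count_over must be 'columns' or 'rows', got {count_over!r}")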

In [2]:
tensor = torch.tensor([[1, 2, 3, 4],
                      [5, 6, 7, 8],
                      [9, 10, 11, 12]], dtype=torch.float32)


result_columns = function01(tensor, 'columns')
print(result_columns) 


result_rows = function01(tensor, 'rows')
print(result_rows) 

tensor([5., 6., 7., 8.])
tensor([ 2.5000,  6.5000, 10.5000])
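
To make the `dim` convention concrete, here is a minimal sketch on the same 3×4 values. The variable names are illustrative and not part of the task. Reducing along `dim=0` collapses the rows and leaves one value per column; passing `keepdim=True` keeps the reduced axis, so the result broadcasts against the original tensor.

```python
import torch

# Same values as the test tensor above: 3 rows, 4 columns.
t = torch.arange(1, 13, dtype=torch.float32).reshape(3, 4)

col_means = t.mean(dim=0)                     # one value per column, shape (4,)
row_means = t.mean(dim=1)                     # one value per row, shape (3,)
row_means_kept = t.mean(dim=1, keepdim=True)  # reduced axis kept, shape (3, 1)

# With keepdim=True the row means broadcast across columns,
# so every row of `centered` has zero mean.
centered = t - row_means_kept

print(col_means.shape, row_means.shape, row_means_kept.shape)
```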


### Task 2

Write a function `function02`.  
The function takes as input a tensor matrix of object features (one row per object, one column per feature).  
It should create a weight vector with one entry per feature, sampled from a uniform distribution on the interval from 0 to 1, and return it for later training of a linear regression without a bias term (intercept).  
The weights must have type `float32` and must track gradients during training (set `requires_grad=True`).  

In [3]:
import torch


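def function02(x: torch.Tensor) -> torch.Tensor:
    # one weight per feature (column of x)
    n_features = x.shape[1]
    # torch.rand samples uniformly from [0, 1)
    weights = torch.rand(n_features, dtype=torch.float32, requires_grad=True)
    return weights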

In [7]:
test_data = torch.tensor([
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
    [10.0, 11.0, 12.0],
    [13.0, 14.0, 15.0]
], dtype=torch.float32)


weights = function02(test_data)

print(f'Dataset shape: {test_data.shape}')
print(f'Generated weights: {weights}')
print(f'Weights shape: {weights.shape}')
print(f'Weights data type: {weights.dtype}')
print(f'Requires gradient: {weights.requires_grad}')

Dataset shape: torch.Size([5, 3])
Generated weights: tensor([0.2529, 0.2081, 0.9598], requires_grad=True)
Weights shape: torch.Size([3])
Weights data type: torch.float32
Requires gradient: True


In [8]:
y_pred = torch.matmul(test_data, weights)
loss = y_pred.mean()
loss.backward()

print(f'Weights gradient: {weights.grad}')

Weights gradient: tensor([7., 8., 9.])
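
The gradient does not depend on the random weights: for `loss = mean(X @ w)`, the derivative with respect to each weight is the mean of the corresponding feature column. A minimal sketch that checks this on the same data (the names `x`, `w` and `expected_grad` are illustrative):

```python
import torch

x = torch.tensor([
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
    [10.0, 11.0, 12.0],
    [13.0, 14.0, 15.0],
])
w = torch.rand(x.shape[1], requires_grad=True)

# loss = mean over objects of the dot product x_i . w
loss = (x @ w).mean()
loss.backward()

# d(loss)/d(w_j) is the mean of column j, whatever the value of w
expected_grad = x.mean(dim=0)
print(torch.allclose(w.grad, expected_grad))
```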


### Task 3

Write a function `function03`. It should accept a tensor matrix of objects and a tensor vector of correct answers; this is a regression problem. The signature is `def function03(x: torch.Tensor, y: torch.Tensor)`.

Inside the function, create weights for a linear regression (without a bias term). Use gradient descent to find the optimal weights for the input data (use a learning rate of about `1e-2`). Return the optimal weights from the function as a tensor vector. The trained weights should give an `MSE` below 1 on the training set.
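
As a rough sketch of what one gradient-descent step involves, the snippet below compares the gradient that autograd computes for the MSE loss with the closed-form expression `(2 / n) * X^T (X w - y)` and then applies a single update. The synthetic data, the seed, and all variable names are assumptions made for illustration only.

```python
import torch

torch.manual_seed(0)

# Hypothetical toy data: 300 objects, 2 features, no intercept.
x = torch.randn(300, 2)
y = x @ torch.tensor([1.5, -0.5]) + 0.1 * torch.randn(300)

w = torch.rand(2, requires_grad=True)

# MSE loss and its gradient via autograd
loss = ((x @ w - y) ** 2).mean()
loss.backward()

with torch.no_grad():
    # Closed-form gradient of the MSE with respect to w
    analytic_grad = 2.0 / x.shape[0] * x.T @ (x @ w - y)
    # One gradient-descent step with learning rate 1e-2
    w_next = w - 1e-2 * w.grad
    next_loss = ((x @ w_next - y) ** 2).mean()

print(torch.allclose(w.grad, analytic_grad, atol=1e-4))
```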

In [10]:
import torch


def function03(x: torch.Tensor, y: torch.Tensor):
    num_features = x.shape[1]
    weights = torch.rand(num_features, dtype=torch.float32, requires_grad=True)
    
    learning_rate = 1e-2
    num_iterations = 1000

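    for _ in range(num_iterations):
        # forward pass: linear model without a bias term
        y_pred = x @ weights

        # computing MSE loss
        loss = ((y_pred - y) ** 2).mean()
        # stop as soon as the required training MSE is reached
        if loss.item() < 1.0:
            break

        loss.backward()
        # update in place without tracking the step in the autograd graph
        with torch.no_grad():
            weights -= learning_rate * weights.grad
            # reset the accumulated gradient before the next iteration
            weights.grad.zero_()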
    
    return weights.detach().clone()

In [11]:
n_features = 2
n_objects = 300

w_true = torch.randn(n_features)
X = (torch.rand(n_objects, n_features) - 0.5) * 5
Y = X @ w_true + torch.randn(n_objects) / 2

function03(X, Y)

tensor([0.2558, 0.7918])
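
The printed weights alone do not show whether the MSE requirement is met. One way to get a reference value is the least-squares solution from `torch.linalg.lstsq`, which gives the lowest training MSE a bias-free linear model can reach on the data. The sketch below regenerates data the same way as the cell above; the fixed seed and the names `w_ls` and `mse_ls` are assumptions added for illustration.

```python
import torch

torch.manual_seed(0)  # fixed seed so the sketch is reproducible

n_features, n_objects = 2, 300
w_true = torch.randn(n_features)
X = (torch.rand(n_objects, n_features) - 0.5) * 5
Y = X @ w_true + torch.randn(n_objects) / 2

# lstsq expects a 2-D right-hand side, so add and then drop a trailing axis
w_ls = torch.linalg.lstsq(X, Y.unsqueeze(1)).solution.squeeze(1)

# Training MSE of the least-squares weights: a lower bound for gradient descent
mse_ls = ((X @ w_ls - Y) ** 2).mean().item()
print(w_ls, mse_ls)
```

Since the noise term `torch.randn(n_objects) / 2` has variance 0.25, the reference MSE should land near that value, which is comfortably below the threshold of 1.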

### Task 4

Write a function `function04`. It should accept a tensor matrix of objects and a tensor of correct answers; this is a regression problem. The signature is `def function04(x: torch.Tensor, y: torch.Tensor)`.

Inside the function, create a fully connected layer and train it on the input data using gradient descent (use a learning rate of about `1e-2`). Return the trained layer from the function. The layer should give an `MSE` below 0.3 on the training set.

In [13]:
import torch
import torch.nn as nn


def function04(x: torch.Tensor, y: torch.Tensor):
    input_features = x.shape[1]
    linear_layer = nn.Linear(input_features, 1, bias=False)
    
    learning_rate = 1e-2
    num_iterations = 1000
    
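    for _ in range(num_iterations):
        # the layer outputs shape (n_objects, 1); drop the last axis to match y
        y_pred = linear_layer(x).squeeze(-1)

        # computing MSE loss
        loss = ((y_pred - y) ** 2).mean()
        # stop as soon as the required training MSE is reached
        if loss.item() < 0.3:
            break

        loss.backward()
        # update in place without tracking the step in the autograd graph
        with torch.no_grad():
            linear_layer.weight -= learning_rate * linear_layer.weight.grad
            # reset the accumulated gradient before the next iteration
            linear_layer.weight.grad.zero_()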
    
    return linear_layer

In [14]:
n_features = 2
n_objects = 300

w_true = torch.randn(n_features)
X = (torch.rand(n_objects, n_features) - 0.5) * 5
Y = X @ w_true + torch.randn(n_objects) / 2

function04(X, Y)

Linear(in_features=2, out_features=1, bias=False)
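
The manual weight update can also be written with `torch.optim.SGD` and `nn.MSELoss`. Because the whole dataset is used at every step, SGD here is plain gradient descent. The following is a minimal sketch rather than the required solution: the helper name `train_linear_sgd`, the fixed seed, and the final MSE check are assumptions added for illustration.

```python
import torch
import torch.nn as nn


def train_linear_sgd(x, y, lr=1e-2, num_iterations=1000):
    """Hypothetical helper: fit a bias-free linear layer with full-batch SGD."""
    layer = nn.Linear(x.shape[1], 1, bias=False)
    optimizer = torch.optim.SGD(layer.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    # match the (n_objects, 1) output shape to avoid accidental broadcasting
    target = y.reshape(-1, 1)

    for _ in range(num_iterations):
        optimizer.zero_grad()             # clear gradients from the previous step
        loss = loss_fn(layer(x), target)  # full-batch MSE
        loss.backward()
        optimizer.step()                  # w <- w - lr * grad
    return layer


torch.manual_seed(0)  # fixed seed so the sketch is reproducible
X = (torch.rand(300, 2) - 0.5) * 5
Y = X @ torch.randn(2) + torch.randn(300) / 2

layer = train_linear_sgd(X, Y)
with torch.no_grad():
    final_mse = nn.functional.mse_loss(layer(X), Y.reshape(-1, 1)).item()
print(final_mse)
```

Printing the final MSE, rather than only the layer's `repr`, makes it possible to compare the result against the 0.3 threshold directly.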