# Neural Networks Optimization

## Tasks

### Task 1

Write a function `create_model` that returns a fully connected neural network with two linear layers.  
The network takes 100 input features and produces 1 output; the hidden layer has 10 neurons.  
Use `ReLU` as the non-linearity. Build the model with `nn.Sequential`, passing the layers in order.

In [1]:
import torch
import torch.nn as nn


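def create_model():
    model = nn.Sequential(
        nn.Linear(100, 10),  # input (100 features) -> hidden layer (10 neurons)
        nn.ReLU(),           # non-linearity
        nn.Linear(10, 1),    # hidden layer -> single output
    )
    return model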

In [2]:
model = create_model()
sample_input = torch.randn(1, 100)
output = model(sample_input)

print(f'Input shape: {sample_input.shape}')
print(f'Output shape: {output.shape}')
print(f'Output value: {output.item():.4f}')

Input shape: torch.Size([1, 100])
Output shape: torch.Size([1, 1])
Output value: 0.0819
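
As a quick sanity check on the architecture, the sketch below counts the trainable parameters and passes a batch through the network. It rebuilds the same layers as `create_model` so that it runs on its own; the batch size of 32 is an arbitrary value chosen for illustration.

```python
import torch
import torch.nn as nn

# Same architecture as create_model, rebuilt here so the snippet is self-contained.
model = nn.Sequential(
    nn.Linear(100, 10),
    nn.ReLU(),
    nn.Linear(10, 1),
)

# Trainable parameters: (100 * 10 weights + 10 biases) + (10 * 1 weights + 1 bias) = 1021
num_params = sum(p.numel() for p in model.parameters() if p.requires_grad)

# A batch of 32 samples (arbitrary size) maps to 32 scalar outputs.
batch_output = model(torch.randn(32, 100))

print(num_params)
print(batch_output.shape)
```

`ReLU` has no parameters, so only the two `nn.Linear` layers contribute to the count.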


### Task 2

Write a function `train` that takes a neural network, a dataloader, an optimizer, and a loss function.  
It should have the following signature: `def train(model: nn.Module, data_loader: DataLoader, optimizer: Optimizer, loss_fn):`

Inside the function, do the following steps:

1. Set the model to training mode.

2. Iterate through the dataloader.

3. On each iteration:
    - Zero the gradients using the optimizer.
    - Make a forward pass.
    - Calculate the error (the loss).
    - Make a backward pass.
    - Print the error on the current batch to 5 decimal places (the number only).
    - Make an optimization step.

The function should return the error averaged over all batches in one pass through the dataloader.

In [3]:
import torch
from torch import nn
from torch.utils.data import DataLoader
from torch.optim import Optimizer


def train(model: nn.Module, data_loader: DataLoader, optimizer: Optimizer, loss_fn):
    model.train()  # enable training-mode behavior (e.g. dropout, batch norm)

    total_loss = 0.0
    for inputs, targets in data_loader:
        optimizer.zero_grad()  # clear gradients left over from the previous batch
        outputs = model(inputs)  # forward pass
        loss = loss_fn(outputs, targets)
        loss.backward()  # backward pass
        batch_loss = loss.item()
        print(f'{batch_loss:.5f}')
        optimizer.step()  # optimization step
        total_loss += batch_loss  # accumulate the batch loss

    # mean of the per-batch losses
    avg_loss = total_loss / len(data_loader)

    return avg_loss

In [4]:
from torch.utils.data import DataLoader, TensorDataset
from torch.optim import SGD


num_samples = 500
input_dim = 100

# synthetic regression data: a linear target plus Gaussian noise
X = torch.randn(num_samples, input_dim)
weights = torch.randn(input_dim, 1) * 0.1
y = torch.matmul(X, weights) + torch.randn(num_samples, 1) * 0.1

# creating dataset and dataloader
dataset = TensorDataset(X, y)
data_loader = DataLoader(dataset, batch_size=64, shuffle=True)

# creating model, loss function, and optimizer
model = create_model()
loss_fn = nn.MSELoss()
optimizer = SGD(model.parameters(), lr=0.01)

# train model
print("Training model:")
avg_loss = train(model, data_loader, optimizer, loss_fn)
print(f"\nAverage loss: {avg_loss:.5f}")

Training model:
1.09693
1.35616
1.08373
1.25230
1.34806
1.14654
0.97326
1.54920

Average loss: 1.22577
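
A single call to `train` is one pass over the data. The sketch below shows one way the function might be called repeatedly to track the average loss per epoch. It is not part of the task: `train_one_epoch` is a hypothetical, silent version of `train` (same steps, no per-batch print), and `num_epochs = 5` and the fixed seed are assumptions made for illustration.

```python
import torch
from torch import nn
from torch.optim import SGD
from torch.utils.data import DataLoader, TensorDataset


def train_one_epoch(model, data_loader, optimizer, loss_fn):
    """Hypothetical silent variant of train: same steps, no per-batch print."""
    model.train()
    total_loss = 0.0
    for inputs, targets in data_loader:
        optimizer.zero_grad()
        loss = loss_fn(model(inputs), targets)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    return total_loss / len(data_loader)


torch.manual_seed(0)  # assumption: fixed seed only so that reruns are repeatable

# Same kind of synthetic data as in the cell above.
X = torch.randn(500, 100)
y = X @ (torch.randn(100, 1) * 0.1) + torch.randn(500, 1) * 0.1
data_loader = DataLoader(TensorDataset(X, y), batch_size=64, shuffle=True)

model = nn.Sequential(nn.Linear(100, 10), nn.ReLU(), nn.Linear(10, 1))
optimizer = SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

num_epochs = 5  # hypothetical value chosen for illustration
history = [
    train_one_epoch(model, data_loader, optimizer, loss_fn)
    for _ in range(num_epochs)
]

for epoch, avg_loss in enumerate(history, start=1):
    print(f"epoch {epoch}: {avg_loss:.5f}")
```

The same model and optimizer objects are reused across epochs, so each call continues from the weights left by the previous one.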


### Task 3

Write a function `evaluate` that takes a neural network, a dataloader, and a loss function.  
It should have the following signature: `def evaluate(model: nn.Module, data_loader: DataLoader, loss_fn):`

Inside the function, do the following steps:

1. Set the model to evaluation (inference) mode.

2. Iterate through the dataloader.

3. On each iteration:
    - Make a forward pass.
    - Calculate the error (the loss).

The function should return the error averaged over all batches in one pass through the dataloader.

In [5]:
import torch
from torch import nn
from torch.utils.data import DataLoader


def evaluate(model: nn.Module, data_loader: DataLoader, loss_fn):
    model.eval()
    
    total_loss = 0.0
    with torch.no_grad():  # gradients are not needed here; skipping them saves memory and time
        for inputs, targets in data_loader:
            outputs = model(inputs)  # forward pass
            loss = loss_fn(outputs, targets)
            total_loss += loss.item()
    
    avg_loss = total_loss / len(data_loader)
    
    return avg_loss

In [6]:
evaluate(model, data_loader, loss_fn)

1.1299102902412415
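
Dividing by `len(data_loader)` gives the mean of the per-batch losses. With 500 samples and a batch size of 64, the last batch holds only 52 samples, so this can differ slightly from the per-sample mean. The sketch below shows one possible alternative that weights each batch by its size. `evaluate_per_sample` is a hypothetical helper, not the function the task asks for, and it assumes `loss_fn` returns the mean over the batch (the `nn.MSELoss` default).

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def evaluate_per_sample(model, data_loader, loss_fn):
    """Hypothetical variant of evaluate that weights each batch by its size.

    Assumes loss_fn returns the mean over the batch.
    """
    model.eval()
    total_loss = 0.0
    num_samples = 0
    with torch.no_grad():
        for inputs, targets in data_loader:
            batch_size = inputs.size(0)
            # undo the per-batch mean so that every sample counts equally
            total_loss += loss_fn(model(inputs), targets).item() * batch_size
            num_samples += batch_size
    return total_loss / num_samples


torch.manual_seed(0)  # assumption: fixed seed only so that reruns are repeatable
X = torch.randn(500, 100)
y = torch.randn(500, 1)
# 7 full batches of 64 plus a final batch of 500 - 7 * 64 = 52 samples
data_loader = DataLoader(TensorDataset(X, y), batch_size=64)

model = nn.Sequential(nn.Linear(100, 10), nn.ReLU(), nn.Linear(10, 1))
loss_fn = nn.MSELoss()

per_sample = evaluate_per_sample(model, data_loader, loss_fn)

# Reference value: the loss computed on the whole dataset in one forward pass.
with torch.no_grad():
    full_dataset = loss_fn(model(X), y).item()

print(f"{per_sample:.5f} {full_dataset:.5f}")
```

Up to floating-point error, the size-weighted average matches the loss computed on the full dataset at once. When every batch has the same size, it coincides with the mean of the per-batch losses.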