# Deep Learning
<table align="left">
  <td>
    <a href="https://colab.research.google.com/github/marcinsawinski/UEP_KIE_DL_CODE2024/blob/main/dl03_tuning.ipynb" target="_parent">
      <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open in Colab"/>
    </a>
  </td>
  <td>
    <a target="_blank" href="https://kaggle.com/kernels/welcome?src=https://github.com/marcinsawinski/UEP_KIE_DL_CODE2024/blob/main/dl03_tuning.ipynb">
      <img src="https://kaggle.com/static/images/open-in-kaggle.svg" alt="Open in Kaggle"/>
    </a>
  </td>
  <td>
    <a target="_blank" href="https://studiolab.sagemaker.aws/import/github/marcinsawinski/UEP_KIE_DL_CODE2024/blob/main/dl03_tuning.ipynb">
      <img src="https://studiolab.sagemaker.aws/studiolab.svg" alt="Open in SageMaker Studio Lab"/>
    </a>
  </td>
</table>

# Tasks
1. Install and connect to [WandB](https://wandb.ai) and run a training simulation.
2. Build an NN model for classification on the Fashion MNIST dataset. Log training to WandB in your own project.
3. Create a hyperparameter search with a WandB sweep.
4. Experiment with optimizers (SGD, Adam) and training parameters: number of epochs, learning rate.
5. Experiment with network depth and the number of neurons in each layer.
6. Experiment with schedulers.
7. Experiment with Dropout layers.
8. Experiment with Batch Normalization layers.
9. Try early stopping.
10. Find the optimal setup. Retrain with it and log training to WandB in the project called DL25-FMNIST.


## Task 1 - Install and connect to [WandB](https://wandb.ai)
1. Run `pip install wandb -qU` in a terminal or `!pip install wandb -qU` in a Jupyter cell.
2. Log in using `wandb login` (terminal) or `!wandb login` (Jupyter cell). Alternatively, you can run:
```
    import wandb
    wandb.login()
```
3. Update the `wandb.init` call below with your own project name, e.g. `student_surname_firstname`. The entity can be `"uep-kie-dl25"` if you have already been assigned to it.
4. Run the simulation and review the results on the [WandB](https://wandb.ai) website.


A detailed WandB example is available here:
https://colab.research.google.com/github/wandb/examples/blob/master/colabs/intro/Intro_to_Weights_%26_Biases.ipynb


In [None]:
# your code for installation and login goes here

In [None]:
import random
import time
import wandb
user = "kowalski_jan"
cfg = {
    "learning_rate": 0.02,
    "architecture": "ANN",
    "dataset": "dummy",
    "epochs": 5,
}
name = f"{cfg['architecture']}_{cfg['dataset']}_lr{cfg['learning_rate']}_ep{cfg['epochs']}_{time.strftime('%m%d-%H%M')}"
project = (f"student_{user}_dummy")
# Start a new wandb run to track this script.
run = wandb.init(
    # Set the wandb entity where your project will be logged (generally your team name).
    entity="uep-kie-dl25",
    # Set the wandb project where this run will be logged.
    project=project,
    # Track hyperparameters and run metadata.
    name=name,
    config=cfg,
)


# Simulate training.
epochs = 10
offset = random.random() / 5
for epoch in range(2, epochs):
    acc = 1 - 2**-epoch - random.random() / epoch - offset
    loss = 2**-epoch + random.random() / epoch + offset

    # Log metrics to wandb.
    run.log({"acc": acc, "loss": loss})

# Finish the run and upload any remaining data.
run.finish()

# Task 2 - Build an NN model for classification on the Fashion MNIST dataset. Log training to WandB in your own project.

Create a basic NN model:
- Flatten data with `nn.Flatten()` layer
- 3 linear layers with 784, 128 and 64 inputs and 128, 64 and 10 outputs respectively, e.g. the last one is `nn.Linear(64, 10)`
- ReLU activation `nn.ReLU()`  (no activation for output layer)
- use cross entropy loss as criterion `nn.CrossEntropyLoss()` 
- use  SGD optimizer `optim.SGD(model.parameters(), lr=config.lr)` 
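As a rough guide, the list above could be read as the sketch below. It is one possible layout, not the only valid one. The name `baseline_network`, the dummy batch and the learning rate `1e-3` (taken from the `lr` entry of `cfg`) are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.optim as optim

# One possible reading of the architecture described above: 784 -> 128 -> 64 -> 10.
baseline_network = nn.Sequential(
    nn.Flatten(),         # (N, 1, 28, 28) -> (N, 784)
    nn.Linear(784, 128),
    nn.ReLU(),
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, 10),    # raw logits, no activation on the output layer
)

# CrossEntropyLoss expects raw logits and integer class labels.
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(baseline_network.parameters(), lr=1e-3)

# Shape check on a random batch shaped like Fashion MNIST images.
dummy_images = torch.randn(4, 1, 28, 28)
dummy_labels = torch.randint(0, 10, (4,))
logits = baseline_network(dummy_images)
loss = criterion(logits, dummy_labels)
print(logits.shape, loss.item())
```

Running the shape check before training is a cheap way to catch mismatched layer sizes.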


https://colab.research.google.com/github/wandb/examples/blob/master/colabs/pytorch/Organizing_Hyperparameter_Sweeps_in_PyTorch_with_W%26B.ipynb

In [1]:
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader
import math
import time
import wandb

In [2]:
user = "kowalski_jan" # your name here 
cfg = {
    "architecture": "ANN",
    "dataset": "FMNIST",
    "epochs": 5,
    "batch_size": 32,
    "lr": 1e-3,  # the learning rate actually used by the optimizer (config.lr)
}
name = f"{cfg['architecture']}_{cfg['dataset']}_lr{cfg['lr']}_ep{cfg['epochs']}_{time.strftime('%m%d-%H%M')}"
project = f"student_{user}_FMNIST"

In [None]:
run = wandb.init(
    # Set the wandb entity where your project will be logged (generally your team name).
    entity="uep-kie-dl25",
    # Set the wandb project where this run will be logged.
    project=project,
    # Track hyperparameters and run metadata.
    name=name,
    config=cfg,
)

config = wandb.config

# 1. Device configuration
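# Prefer CUDA, then Apple Silicon (MPS), then fall back to the CPU.
device = torch.device(
    "cuda" if torch.cuda.is_available()
    else "mps" if torch.backends.mps.is_available()
    else "cpu"
)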

# 2. Transformations
transform = transforms.Compose(
    [transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))]
)

# 3. Load the dataset
train_dataset = datasets.FashionMNIST(
    root="./data", train=True, download=True, transform=transform
)
test_dataset = datasets.FashionMNIST(
    root="./data", train=False, download=True, transform=transform
)

train_loader = DataLoader(train_dataset, batch_size=config.batch_size, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=config.batch_size * 4)
n_steps_per_epoch = math.ceil(len(train_loader.dataset) / wandb.config.batch_size)


# 4. Define the model
class FashionClassifier(nn.Module):
    def __init__(self):
        super(FashionClassifier, self).__init__()
        self.network = nn.Sequential(
            # TODO specify your model here ...
        )


    def forward(self, x):
        return self.network(x)


model = FashionClassifier().to(device)

# 5. Loss and optimizer
criterion = # TODO your code here 
optimizer = # TODO your code here 

# 6. Training loop
num_epochs = config.epochs
example_ct = 0
step_ct = 0

for epoch in range(num_epochs):
    model.train()
    for step, (images, labels) in enumerate(train_loader):
        images, labels = images.to(device), labels.to(device)

        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)

        # Backward and optimize
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        example_ct += len(images)
        train_step_ct = (step + 1 + (n_steps_per_epoch * epoch)) / n_steps_per_epoch
        log_step_ct = step + 1 + (n_steps_per_epoch * epoch)
        metrics = {
            "train/train_loss": loss.item(),  # log a plain float, not a tensor
            "train/epoch": train_step_ct,
            "train/example_ct": example_ct,
        }
        # print(log_step_ct)
        if step + 1 < n_steps_per_epoch:
            # Log train metrics to wandb
            wandb.log(metrics)

    print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}")

    # 7. Evaluation
    model.eval()
    correct = 0
    total = 0
    with torch.no_grad():
        for images, labels in test_loader:
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)
            _, predicted = torch.max(outputs.data, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()

    print(f"Test Accuracy: {100 * correct / total:.2f}%")
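    test_metrics = {
        "test/epoch": epoch + 1,
        "test/accuracy": 100 * correct / total,
    }
    # Log the last training step of the epoch together with the test metrics,
    # since the training loop above skips logging on the final step.
    wandb.log({**metrics, **test_metrics})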
wandb.finish()

# Task 3 - Create hyperparameter search with WandB sweep.
Follow the instructions in the WandB sweeps example:

https://colab.research.google.com/github/wandb/examples/blob/master/colabs/pytorch/Organizing_Hyperparameter_Sweeps_in_PyTorch_with_W%26B.ipynb
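A minimal sketch of what a sweep setup might look like. The search space (parameter names, ranges, choice of random search) and the helper `launch_sweep` are assumptions made for illustration. The metric name `test/accuracy` matches the key logged in Task 2. The training function passed in is expected to call `wandb.init()` without a fixed config and read its hyperparameters from `wandb.config`.

```python
# Search space for the sweep. All names and ranges here are illustrative assumptions.
sweep_config = {
    "method": "random",  # alternatives include "grid" and "bayes"
    "metric": {"name": "test/accuracy", "goal": "maximize"},
    "parameters": {
        "lr": {"distribution": "log_uniform_values", "min": 1e-4, "max": 1e-1},
        "batch_size": {"values": [32, 64, 128]},
        "optimizer": {"values": ["sgd", "adam"]},
        "epochs": {"value": 5},  # held constant across trials
    },
}


def launch_sweep(train_fn, project, count=10):
    """Hypothetical helper: register the sweep and run `count` trials of `train_fn`."""
    import wandb  # imported lazily so the config above can be inspected offline

    sweep_id = wandb.sweep(sweep_config, project=project)
    wandb.agent(sweep_id, function=train_fn, count=count)
    return sweep_id
```

Calling `launch_sweep(train, project)` with your own training function and project name would then start the trials. Keep `count` small at first so a mistake in the training function does not burn through many runs.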

In [15]:
# Experiment with sweeps

# Remaining tasks
4. Experiment with optimizers (SGD, Adam) and training parameters: number of epochs, learning rate.
5. Experiment with network depth and the number of neurons in each layer.
6. Experiment with schedulers.
7. Experiment with Dropout layers.
8. Experiment with Batch Normalization layers.
9. Try early stopping.
10. Find the optimal setup. Retrain with it and log training to WandB in the project called DL25-FMNIST. Use your name as the run name.

**Optimizers**

Available [Optimizers](https://pytorch.org/docs/stable/optim.html):
```
import torch.optim as optim

# Pick one of the two:
optimizer = optim.SGD(model.parameters(), lr=config.lr)
optimizer = optim.Adam(model.parameters(), lr=config.lr)
```
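When the optimizer itself becomes a hyperparameter (for example in a sweep), it may help to select it by name. The helper `build_optimizer` below is a hypothetical sketch, and the string keys `"sgd"` and `"adam"` are assumptions.

```python
import torch.nn as nn
import torch.optim as optim


def build_optimizer(model, name, lr):
    """Hypothetical helper: return an optimizer chosen by a config string."""
    if name == "sgd":
        return optim.SGD(model.parameters(), lr=lr)
    if name == "adam":
        return optim.Adam(model.parameters(), lr=lr)
    raise ValueError(f"Unknown optimizer name: {name!r}")


# Usage with a tiny stand-in model.
toy_model = nn.Linear(4, 2)
sgd = build_optimizer(toy_model, "sgd", lr=0.01)
adam = build_optimizer(toy_model, "adam", lr=0.001)
print(type(sgd).__name__, type(adam).__name__)
```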
**Dropout**

Dropout layers: `nn.Dropout(0.5)` - add after the ReLU activations.
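A sketch of where the Dropout layers could go in the Task 2 network. The name `dropout_network` and the rate `0.5` on both hidden layers are assumptions to experiment with. Dropout is only active in training mode, so `model.eval()` makes the forward pass deterministic.

```python
import torch
import torch.nn as nn

# Task 2 layout with Dropout added after each hidden ReLU.
dropout_network = nn.Sequential(
    nn.Flatten(),
    nn.Linear(784, 128), nn.ReLU(), nn.Dropout(0.5),
    nn.Linear(128, 64), nn.ReLU(), nn.Dropout(0.5),
    nn.Linear(64, 10),  # no dropout after the output layer
)

images = torch.randn(8, 1, 28, 28)

# In eval mode Dropout acts as the identity, so two passes give identical outputs.
dropout_network.eval()
with torch.no_grad():
    first_pass = dropout_network(images)
    second_pass = dropout_network(images)
eval_is_deterministic = torch.equal(first_pass, second_pass)
print(eval_is_deterministic)
```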

**Batch Normalization**

Batch Normalization layers: `nn.BatchNorm1d(64)` - add right after the Linear layers and before the activation functions.
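A sketch of the same placement applied to the Task 2 network. The name `bn_network` is an assumption. The argument of `nn.BatchNorm1d` must equal the number of outputs of the preceding Linear layer.

```python
import torch
import torch.nn as nn

# Task 2 layout with BatchNorm between each hidden Linear layer and its ReLU.
bn_network = nn.Sequential(
    nn.Flatten(),
    nn.Linear(784, 128), nn.BatchNorm1d(128), nn.ReLU(),
    nn.Linear(128, 64), nn.BatchNorm1d(64), nn.ReLU(),
    nn.Linear(64, 10),
)

# In training mode BatchNorm1d needs more than one sample per batch.
bn_network.train()
images = torch.randn(8, 1, 28, 28)
bn_logits = bn_network(images)
print(bn_logits.shape)
```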


**Scheduler**

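```
import torch.optim as optim
from torch.optim.lr_scheduler import ExponentialLR

# Assumes `model`, `criterion` and `train_loader` are defined as in Task 2.
optimizer = optim.SGD(model.parameters(), lr=0.01)
scheduler = ExponentialLR(optimizer, gamma=0.9)  # lr is multiplied by 0.9 each epoch

for epoch in range(20):
    for inputs, targets in train_loader:
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, targets)
        loss.backward()
        optimizer.step()
    # Step the scheduler once per epoch, after the optimizer updates.
    scheduler.step()
```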
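
**Early stopping**

PyTorch has no built-in early stopping, so it is usually written by hand. The class `EarlyStopper` below is a hypothetical sketch: it stops once the monitored metric has not improved for `patience` consecutive epochs. The accuracy values are made up for illustration. In a real run the metric would ideally come from a held-out validation split rather than the test set.

```python
class EarlyStopper:
    """Hypothetical helper: signal a stop after `patience` epochs without improvement."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("-inf")
        self.bad_epochs = 0

    def should_stop(self, metric):
        # Higher is better (e.g. accuracy). Reset the counter on improvement.
        if metric > self.best + self.min_delta:
            self.best = metric
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up per-epoch accuracies, only to show the control flow.
accuracies = [80.1, 83.4, 84.0, 83.9, 83.8, 84.0, 85.0]
stopper = EarlyStopper(patience=3)
stopped_at = None
for epoch, accuracy in enumerate(accuracies, start=1):
    if stopper.should_stop(accuracy):
        stopped_at = epoch
        break
print(stopped_at, stopper.best)
```

In the Task 2 training loop, the call to `should_stop` would go right after the evaluation step, followed by `break` when it returns `True`.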



In [5]:
# your code here