# Logistic regression in PyTorch with mandala
This is a simple example of logistic regression in PyTorch, using mandala for
data management. We will use the MNIST dataset to train a logistic regression
model and experiment with the hyperparameters of the training components.

## Import libraries

In [1]:
import torch
import torchvision as tv
from torchvision.datasets import MNIST
from torch.utils.data import DataLoader, Subset
from typing import Tuple  # used in the return annotations below
from mandala_lite.all import *

## Define supporting functions

In [2]:
INPUT_SIZE = 28**2
NUM_CLASSES = 10


class LogisticRegression(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(INPUT_SIZE, NUM_CLASSES)

    def forward(self, feature):
        output = self.linear(feature)
        return output


@op
def get_dataloaders(
    batch_size: int = 100, train_size: int = 1_000
) -> Tuple[DataLoader, DataLoader]:
    train_data = Subset(
        MNIST("data", train=True, download=True, transform=tv.transforms.ToTensor()),
        indices=range(train_size),
    )
    test_data = Subset(
        MNIST("data", train=False, transform=tv.transforms.ToTensor()),
        indices=range(10_000),
    )
    train_loader = DataLoader(train_data, batch_size=batch_size, shuffle=True)
    test_loader = DataLoader(test_data, batch_size=batch_size, shuffle=False)
    return train_loader, test_loader


@op
def train_lr(
    train_loader: DataLoader,
    test_loader: DataLoader,
    learning_rate: float = 0.001,
    num_epochs: int = 5,
) -> Tuple[LogisticRegression, float]:
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = LogisticRegression().to(device)
    loss = torch.nn.CrossEntropyLoss().to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
    for epoch in range(num_epochs):
        # train
        for batch_index, (images, labels) in enumerate(train_loader):
            images = images.view(-1, INPUT_SIZE).to(device)
            labels = labels.to(device)
            optimizer.zero_grad()
            output = model(images)
            loss_value = loss(output, labels)
            loss_value.backward()
            optimizer.step()
        # test: evaluate without tracking gradients
        accurate, total = 0, 0
        with torch.no_grad():
            for images, labels in test_loader:
                images = images.view(-1, INPUT_SIZE).to(device)
                labels = labels.to(device)
                output = model(images)
                predicted = output.argmax(dim=1)  # index of the highest logit
                total += labels.size(0)
                accurate += (predicted == labels).sum()
        acc = 100 * accurate / total
        print(
            f"Epoch: {epoch}, Training loss: {round(loss_value.item(), 2)}. Test accuracy: {round(acc.item(), 2)}"
        )
    return model, float(acc.item())

## Create a storage for the results
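The storage saves the results of calls to `@op`-decorated functions made inside
a `storage.run()` block, so that a repeated call with the same inputs can reuse
the saved result instead of recomputing it.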

In [3]:
storage = Storage()

## Run the pipeline
This is just a single run of the pipeline with default parameters to see a
simple use case.

In [4]:
with storage.run():
    train_loader, test_loader = get_dataloaders()
    model, acc = train_lr(train_loader, test_loader)
    print(f"Final accuracy: {round(unwrap(acc), 2)}")

Epoch: 0, Training loss: 2.27. Test accuracy: 7.09
Epoch: 1, Training loss: 2.27. Test accuracy: 8.23
Epoch: 2, Training loss: 2.28. Test accuracy: 9.56
Epoch: 3, Training loss: 2.27. Test accuracy: 11.35
Epoch: 4, Training loss: 2.22. Test accuracy: 13.62
Final accuracy: 13.62
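
The training loss hovering around 2.3 has a simple reading: for a 10-class problem, a model that scores every class about equally has a cross-entropy loss of ln(10) ≈ 2.303, and chance-level accuracy is 10%. The sketch below reimplements softmax cross-entropy in plain Python to show this. The helper name `cross_entropy` is ours (not a PyTorch or mandala function), and all-zero logits are an assumed stand-in for an untrained model.

```python
import math

NUM_CLASSES = 10


def cross_entropy(logits, label):
    """Softmax cross-entropy for one example: -log(softmax(logits)[label])."""
    # Subtract the max logit for numerical stability (the result is unchanged).
    top = max(logits)
    shifted = [z - top for z in logits]
    log_sum_exp = math.log(sum(math.exp(z) for z in shifted))
    return log_sum_exp - shifted[label]


# Assumption: an untrained model scores all classes about equally.
uniform_loss = cross_entropy([0.0] * NUM_CLASSES, label=3)
print(round(uniform_loss, 4))  # ln(10), printed as 2.3026
```

With `learning_rate=0.001`, five epochs of SGD on 1,000 examples barely move the loss away from this value, which is consistent with the slow accuracy gains in the log above.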


Running the pipeline again demonstrates the memoization. No per-epoch logs are
printed this time, because the results are reused rather than recomputed:

In [5]:
with storage.run():
    train_loader, test_loader = get_dataloaders()
    model, acc = train_lr(train_loader, test_loader)
    print(f"Final accuracy: {round(unwrap(acc), 2)}")

Final accuracy: 13.62
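
A rough mental model for this behavior: a call is looked up by function name and inputs, and the function body runs only on a miss. The sketch below is a toy in-memory cache written for illustration. `toy_op`, `CACHE`, `CALLS`, and `slow_square` are hypothetical names, and this is not how mandala is implemented.

```python
import functools

CACHE = {}  # maps (function name, args, kwargs) -> saved result
CALLS = []  # records every time a function body actually runs


def toy_op(func):
    """Toy memoizing decorator (illustration only, not mandala's `@op`)."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        key = (func.__name__, args, tuple(sorted(kwargs.items())))
        if key not in CACHE:
            CALLS.append(key)  # the body runs only on a cache miss
            CACHE[key] = func(*args, **kwargs)
        return CACHE[key]

    return wrapper


@toy_op
def slow_square(x):
    print(f"computing {x}**2")
    return x * x


first = slow_square(4)   # prints "computing 4**2"
second = slow_square(4)  # prints nothing: the saved result is reused
print(first, second, len(CALLS))  # 16 16 1
```

This toy version only works with hashable arguments, which is one reason a real system needs more machinery to handle objects like data loaders and models.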


## Explore parameters while reusing past results
- Here, we could have a few rounds of exploration, gradually expanding the
search space, to show that it is easy to adjust the code directly and reuse
results.
- It would also be nice to alternate later between this exploration and the
queries, introducing extra parameters (the batch and training set sizes) and
evolving the run/query blocks alongside each other.
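
To see why expanding the search space is cheap, it helps to count which configurations are actually new. The sketch below is plain Python with hypothetical names (`already_run`, `grid`, `wider_grid`). It assumes results are keyed only by `(learning_rate, num_epochs)` with the data loaders held fixed, and it treats the earlier default-parameter run as the only prior result.

```python
from itertools import product

# The earlier run used the defaults: learning_rate=0.001, num_epochs=5.
already_run = {(0.001, 5)}

learning_rates = [0.001, 0.01, 0.1]
epoch_counts = [1, 2, 3]
grid = set(product(learning_rates, epoch_counts))

new_configs = grid - already_run     # must be trained from scratch
reused_configs = grid & already_run  # can be loaded from saved results
already_run |= grid

print(len(new_configs), len(reused_configs), len(already_run))  # 9 0 10

# Widening the grid later only adds the combinations not seen before.
wider_grid = set(product(learning_rates, [1, 2, 3, 5]))
print(len(wider_grid - already_run))  # 2
```

Under these assumptions the storage ends up with ten distinct training runs after the sweep: the default run plus the nine grid points.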

In [6]:
with storage.run():
    train_loader, test_loader = get_dataloaders(batch_size=100, train_size=1_000)
    for learning_rate in [0.001, 0.01, 0.1]:
        for num_epochs in [1, 2, 3]:
            model, acc = train_lr(train_loader, test_loader, learning_rate, num_epochs)
            print(
                f"===End of run=== num_epochs: {num_epochs}, learning_rate: {learning_rate}, acc: {round(unwrap(acc), 2)}"
            )

Epoch: 0, Training loss: 2.29. Test accuracy: 11.06
===End of run=== num_epochs: 1, learning_rate: 0.001, acc: 11.06
Epoch: 0, Training loss: 2.33. Test accuracy: 9.15
Epoch: 1, Training loss: 2.32. Test accuracy: 9.97
===End of run=== num_epochs: 2, learning_rate: 0.001, acc: 9.97
Epoch: 0, Training loss: 2.24. Test accuracy: 21.21
Epoch: 1, Training loss: 2.22. Test accuracy: 23.3
Epoch: 2, Training loss: 2.23. Test accuracy: 25.66
===End of run=== num_epochs: 3, learning_rate: 0.001, acc: 25.66
Epoch: 0, Training loss: 2.27. Test accuracy: 19.31
===End of run=== num_epochs: 1, learning_rate: 0.01, acc: 19.31
Epoch: 0, Training loss: 2.25. Test accuracy: 27.19
Epoch: 1, Training loss: 2.12. Test accuracy: 42.96
===End of run=== num_epochs: 2, learning_rate: 0.01, acc: 42.96
Epoch: 0, Training loss: 2.24. Test accuracy: 18.57
Epoch: 1, Training loss: 2.12. Test accuracy: 33.58
Epoch: 2, Training loss: 2.04. Test accuracy: 52.17
===End of run=== num_epochs: 3, learning_rate: 0.01, acc:

## Query the results

Queries take some explaining, but the core metaphor is that function calls
become (conjunctive) constraints between values. The block below is a template
for a query that can be extended with more parameters (`batch_size`,
`train_size`) over time:

In [7]:
with storage.query() as q:
    train_loader, test_loader = get_dataloaders(batch_size=100, train_size=1_000)
    learning_rate = Q().named("learning_rate")
    num_epochs = Q().named("num_epochs")
    model, acc = train_lr(train_loader, test_loader, learning_rate, num_epochs)
    df = q.get_table(learning_rate, num_epochs, acc.named("accuracy"))
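
The "function calls become constraints" metaphor can be made concrete with plain Python over a hand-written call log. This sketch is not mandala's query engine: the record layout and the names `loader_calls`, `train_calls`, and `query` are hypothetical, the second loader call is invented to show filtering, and only three saved runs are included, with accuracies copied from this notebook's results.

```python
# Each record is one saved call: its inputs plus an id or value for its output.
loader_calls = [
    {"batch_size": 100, "train_size": 1_000, "train_loader": "loader_a"},
    # Hypothetical extra call, included only so the filter has something to reject.
    {"batch_size": 50, "train_size": 1_000, "train_loader": "loader_b"},
]
train_calls = [
    {"train_loader": "loader_a", "learning_rate": 0.1, "num_epochs": 3, "accuracy": 77.699997},
    {"train_loader": "loader_a", "learning_rate": 0.01, "num_epochs": 3, "accuracy": 52.169998},
    {"train_loader": "loader_a", "learning_rate": 0.001, "num_epochs": 5, "accuracy": 13.62},
]


def query(batch_size, train_size):
    """Return (learning_rate, num_epochs, accuracy) rows satisfying all constraints."""
    rows = []
    for lc in loader_calls:
        # Constraint 1: the loader call has the requested fixed inputs.
        if lc["batch_size"] != batch_size or lc["train_size"] != train_size:
            continue
        for tc in train_calls:
            # Constraint 2: the training call consumed that loader's output.
            if tc["train_loader"] != lc["train_loader"]:
                continue
            # learning_rate and num_epochs are left free, like the `Q()` placeholders.
            rows.append((tc["learning_rate"], tc["num_epochs"], tc["accuracy"]))
    return rows


rows = query(batch_size=100, train_size=1_000)
for row in sorted(rows, key=lambda r: r[2], reverse=True):
    print(row)
```

Fixed arguments narrow the result set, while free placeholders become columns of the resulting table. Both constraints must hold at once, which is the conjunctive part of the metaphor.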

In [8]:
df.sort_values(by=["accuracy"], ascending=False)

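| | learning_rate | num_epochs | accuracy |
|---|---|---|---|
| 2 | 0.1 | 3 | 77.699997 |
| 1 | 0.1 | 2 | 75.82 |
| 8 | 0.1 | 1 | 65.349998 |
| 4 | 0.01 | 3 | 52.169998 |
| 6 | 0.01 | 2 | 42.959999 |
| 7 | 0.001 | 3 | 25.66 |
| 9 | 0.01 | 1 | 19.309999 |
| 0 | 0.001 | 5 | 13.62 |
| 5 | 0.001 | 1 | 11.059999 |
| 3 | 0.001 | 2 | 9.969999 |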
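
Values such as 77.699997 are most likely a float32 artifact rather than extra precision. Assuming `acc` is computed as a default-precision (float32) tensor, which is the usual result of dividing an integer tensor in PyTorch, `acc.item()` returns the float32 value nearest to the true percentage. The sketch below reproduces the effect with only the standard library; `to_float32` is a hypothetical helper, and the count of 7,770 correct predictions is inferred from the 10,000-image test set.

```python
import struct


def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


# 7,770 correct out of 10,000 test images would be 77.7 percent.
acc = to_float32(100 * 7770 / 10_000)
print(f"{acc:.6f}")   # 77.699997
print(round(acc, 2))  # 77.7
```

Rounding the `accuracy` column to two decimals before display would recover the values printed during training.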
