<a href="https://colab.research.google.com/github/hkbu-kennycheng/comp3065/blob/main/lab3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Lab 3: Introduction to Decision Trees and Support Vector Machines (Basic Models)

In [1]:
import torch

# Decision Trees using PyTorch

![](https://url2img-web.herokuapp.com/aHR0cHM6Ly9naXRodWIuY29tL3h1eXh1L1NvZnQtRGVjaXNpb24tVHJlZSNpbnRyb2R1Y3Rpb24=)

[https://github.com/xuyxu/Soft-Decision-Tree](https://github.com/xuyxu/Soft-Decision-Tree)

## Download code from Github

In [2]:
!curl -L https://raw.githubusercontent.com/xuyxu/Soft-Decision-Tree/master/SDT.py > SDT.py



## Load the dataset

In [3]:
from torchvision import datasets
from torchvision import transforms

# Convert images to tensors and normalize them.
# Note: (0.1307, 0.3081) are the mean/std commonly used for MNIST;
# they are kept here so the recorded outputs below still apply.
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))])

train_set = datasets.FashionMNIST("./data", download=True, train=True, transform=transform)
test_set = datasets.FashionMNIST("./data", download=True, train=False, transform=transform)



## Wrapping the dataset with `DataLoader`

In [4]:
from torch.utils.data import DataLoader

BATCH_SIZE = 64

train_loader = DataLoader(train_set, batch_size=BATCH_SIZE, shuffle=True)
test_loader = DataLoader(test_set, batch_size=BATCH_SIZE, shuffle=False) # order does not matter for evaluation

len(train_loader)

938
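The value 938 is the number of batches per epoch, not the number of images. As a quick check of that arithmetic (assuming the standard FashionMNIST split of 60,000 training and 10,000 test images, and the `DataLoader` default of keeping the last partial batch), the count can be reproduced with the standard library alone. The helper name `num_batches` is made up for this illustration.

```python
import math

BATCH_SIZE = 64
TRAIN_SIZE = 60_000  # FashionMNIST training images (assumed standard split)
TEST_SIZE = 10_000   # FashionMNIST test images (assumed standard split)

def num_batches(num_samples, batch_size, drop_last=False):
    """Number of batches one pass over the data yields."""
    if drop_last:
        return num_samples // batch_size       # discard the partial batch
    return math.ceil(num_samples / batch_size)  # keep the partial batch

train_batches = num_batches(TRAIN_SIZE, BATCH_SIZE)
test_batches = num_batches(TEST_SIZE, BATCH_SIZE)
# Size of the final, smaller training batch
last_batch_size = TRAIN_SIZE - (train_batches - 1) * BATCH_SIZE

print(train_batches, test_batches, last_batch_size)  # 938 157 32
```

The final training batch therefore holds only 32 images, and the test loader yields 157 batches.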

## Model settings

In [5]:
# import SDT from SDT.py
from SDT import SDT

model = SDT(input_dim=28*28, output_dim=10, depth=8)
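To give some intuition for what a soft decision tree computes, here is a minimal pure-Python sketch. It is an illustration of the general idea only and makes no claim about the internals of `SDT.py`: every inner node is assumed to be a sigmoid gate that sends the input left with some probability, and the prediction is the leaf scores averaged by the probability of reaching each leaf. The function name `soft_tree_predict` and all variable names are hypothetical. Under the assumption of a complete binary tree, `depth=8` corresponds to 2^8 - 1 = 255 inner nodes and 2^8 = 256 leaves.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def soft_tree_predict(x, inner_weights, inner_biases, leaf_scores, depth):
    """Route x softly through a complete binary tree of the given depth.

    inner_weights / inner_biases hold one linear gate per inner node in
    breadth-first order (2**depth - 1 nodes); leaf_scores holds one score
    vector per leaf (2**depth leaves).
    """
    path_probs = [1.0]  # probability of reaching each node on the current level
    node = 0            # index of the next inner node in breadth-first order
    for _ in range(depth):
        next_probs = []
        for p in path_probs:
            w, b = inner_weights[node], inner_biases[node]
            go_left = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            next_probs.extend([p * go_left, p * (1.0 - go_left)])
            node += 1
        path_probs = next_probs
    # Prediction: leaf scores weighted by the probability of reaching each leaf
    num_classes = len(leaf_scores[0])
    output = [sum(p * leaf[c] for p, leaf in zip(path_probs, leaf_scores))
              for c in range(num_classes)]
    return path_probs, output

depth = 2
x = [1.0, -1.0]
inner_weights = [[0.0, 0.0]] * (2 ** depth - 1)  # zero gates: every split is 50/50
inner_biases = [0.0] * (2 ** depth - 1)
leaf_scores = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]

leaf_probs, output = soft_tree_predict(x, inner_weights, inner_biases, leaf_scores, depth)
print(leaf_probs)  # [0.25, 0.25, 0.25, 0.25]
print(output)      # [0.5, 0.5]
```

Because every gate is differentiable, a tree of this kind can be trained with gradient descent like any other PyTorch module, which is what the training loop below relies on.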

## Loss function

In [6]:
from torch import nn

loss_function = nn.CrossEntropyLoss()

## Optimizer

In [7]:
from torch import optim

optimizer = optim.Adam(model.parameters(), lr=1e-3)

## Training loop

In [8]:
from tqdm import tqdm # progress bar

model.train() # put model in training mode

NUM_EPOCHS = 3

for epoch in range(NUM_EPOCHS):

  loop = tqdm(train_loader, position=0, leave=True)

  for (inputs, labels) in loop:
    optimizer.zero_grad() # zero the parameter gradients

    # forward + backward + optimize
    output = model(inputs) # calling the module runs forward()
    loss = loss_function(output, labels)
    loss.backward()
    optimizer.step()

    loop.set_description(f"Epoch [{epoch}/{NUM_EPOCHS}]")

Epoch [0/3]: 100%|██████████| 938/938 [00:53<00:00, 17.67it/s]
Epoch [1/3]: 100%|██████████| 938/938 [00:52<00:00, 17.75it/s]
Epoch [2/3]: 100%|██████████| 938/938 [00:52<00:00, 17.72it/s]


## Evaluate the model

In [9]:
correct = 0
total = 0

loop = tqdm(test_loader, position=0, leave=True)
model.eval() # put model in evaluation mode
with torch.no_grad(): # gradients are not needed for evaluation
  for (inputs, labels) in loop:
    output = model(inputs)
    _, predicted = torch.max(output, 1) # index of the highest score is the predicted class
    total += labels.size(0)
    correct += (predicted == labels).sum().item()
    loop.set_postfix(acc=(100*correct/total))

100%|██████████| 157/157 [00:08<00:00, 18.88it/s, acc=75.8]


# Gradient Boosting

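Gradient boosting is an ensemble technique that improves the accuracy and robustness of a machine learning model by training several base estimators (typically decision trees) one after another, with each new estimator fitted to the errors left by the ones before it.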

![](https://url2img-web.herokuapp.com/aHR0cHM6Ly9lbnNlbWJsZS1weXRvcmNoLnJlYWR0aGVkb2NzLmlvL2VuL3N0YWJsZS9pbnRyb2R1Y3Rpb24uaHRtbCNncmFkaWVudC1ib29zdGluZy0x)

Boosting the soft decision tree trained above is expected to improve on its accuracy.
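The sequential idea behind gradient boosting can be sketched in a few lines of plain Python. This is a toy illustration under simplifying assumptions (1-D regression, squared loss, decision stumps as base estimators, a fixed shrinkage factor); it is not how `torchensemble` implements its classifier, and the names `fit_stump` and `gradient_boost` are made up for the example. For squared loss, the negative gradient is simply the residual, so each new stump is fitted to what the current ensemble still gets wrong.

```python
def fit_stump(xs, residuals):
    """Find the threshold split on 1-D inputs that minimizes squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skip the max so the right side is never empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean

def gradient_boost(xs, ys, n_estimators=5, shrinkage=0.5):
    """Fit stumps one after another, each on the current residuals."""
    estimators = []
    predictions = [0.0] * len(xs)
    losses = []
    for _ in range(n_estimators):
        # For squared loss, the negative gradient is the residual y - F(x)
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        estimators.append(stump)
        # Add a shrunken copy of the new stump to the ensemble prediction
        predictions = [p + shrinkage * stump(x) for x, p in zip(xs, predictions)]
        losses.append(sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(xs))
    return estimators, losses

xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.1, 0.3, 0.2, 2.1, 1.9, 2.2]
estimators, losses = gradient_boost(xs, ys)
print([round(loss, 4) for loss in losses])  # mean squared error after each estimator
```

On this toy data the training error shrinks as estimators are added, which is the effect the ensemble below aims for with soft decision trees as base estimators.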

In [10]:
!pip install torchensemble

Collecting torchensemble
  Downloading torchensemble-0.1.6-py3-none-any.whl (39 kB)
Collecting scikit-learn>=0.23.0
  Downloading scikit_learn-0.24.2-cp37-cp37m-manylinux2010_x86_64.whl (22.3 MB)
Collecting threadpoolctl>=2.0.0
  Downloading threadpoolctl-2.2.0-py3-none-any.whl (12 kB)
Installing collected packages: threadpoolctl, scikit-learn, torchensemble
  Attempting uninstall: scikit-learn
    Found existing installation: scikit-learn 0.22.2.post1
    Uninstalling scikit-learn-0.22.2.post1:
      Successfully uninstalled scikit-learn-0.22.2.post1
Successfully installed scikit-learn-0.24.2 threadpoolctl-2.2.0 torchensemble-0.1.6


## Set up the logger



In [11]:
from torchensemble.utils.logging import set_logger

logger = set_logger('classification_mnist_dt')

Log will be saved in '/content/logs'.
Create folder 'logs/'
Start logging into file /content/logs/classification_mnist_dt-2021_09_03_03_58.log...


## Choosing the ensemble

In [12]:
from torchensemble import GradientBoostingClassifier

model = GradientBoostingClassifier(
    estimator=model, # previous decision tree model
    n_estimators=5,
    cuda=False,
)

## Optimizer

In [13]:
model.set_optimizer('Adam',             # parameter optimizer
                    lr=1e-3,            # learning rate of the optimizer
                    weight_decay=5e-4)  # weight decay of the optimizer

## Train and evaluate

In [14]:
# Train and Evaluate
model.fit(train_loader=train_loader,  # training data
          epochs=1,                   # the number of training epochs
          test_loader=test_loader)

2021-09-03 04:12:23,424 - INFO: Estimator: 000 | Epoch: 000 | Batch: 000 | RegLoss: 1673.95703
2021-09-03 04:12:28,612 - INFO: Estimator: 000 | Epoch: 000 | Batch: 100 | RegLoss: 68.80402
2021-09-03 04:12:33,788 - INFO: Estimator: 000 | Epoch: 000 | Batch: 200 | RegLoss: 43.16668
2021-09-03 04:12:38,915 - INFO: Estimator: 000 | Epoch: 000 | Batch: 300 | RegLoss: 36.52592
2021-09-03 04:12:44,054 - INFO: Estimator: 000 | Epoch: 000 | Batch: 400 | RegLoss: 28.75037
2021-09-03 04:12:49,114 - INFO: Estimator: 000 | Epoch: 000 | Batch: 500 | RegLoss: 28.23838
2021-09-03 04:12:54,245 - INFO: Estimator: 000 | Epoch: 000 | Batch: 600 | RegLoss: 14.91311
2021-09-03 04:12:59,378 - INFO: Estimator: 000 | Epoch: 000 | Batch: 700 | RegLoss: 21.59414
2021-09-03 04:13:04,529 - INFO: Estimator: 000 | Epoch: 000 | Batch: 800 | RegLoss: 19.98888
2021-09-03 04:13:09,744 - INFO: Estimator: 000 | Epoch: 000 | Batch: 900 | RegLoss: 20.32565
2021-09-03 04:13:16,806 - INFO: Validation Acc: 79.450 % | Historica

# Support Vector Machine (SVM)

## Model definition

![](https://url2img-web.herokuapp.com/aHR0cHM6Ly9weXRvcmNoLm9yZy9kb2NzL3N0YWJsZS9nZW5lcmF0ZWQvdG9yY2gubm4uTGluZWFyLmh0bWwjdG9yY2gubm4uTGluZWFy)

In [None]:
from torch import nn

model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 10)
)

## Multi-label soft margin SVM

![](https://url2img-web.herokuapp.com/aHR0cHM6Ly9weXRvcmNoLm9yZy9kb2NzL3N0YWJsZS9nZW5lcmF0ZWQvdG9yY2gubm4uTXVsdGlMYWJlbFNvZnRNYXJnaW5Mb3NzLmh0bWwjdG9yY2gubm4uTXVsdGlMYWJlbFNvZnRNYXJnaW5Mb3Nz)

In [None]:
loss_function = nn.MultiLabelSoftMarginLoss()
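To make the loss concrete, here is a small pure-Python sketch of the per-sample formula given in the PyTorch documentation for `nn.MultiLabelSoftMarginLoss`: the binary logistic loss on each class score, averaged over the classes. It needs one-hot targets rather than class indices. Note that this is a logistic-style surrogate, not the hinge loss of a classical SVM. The helper names `one_hot` and `multilabel_soft_margin` and the example scores are invented for the illustration.

```python
import math

def one_hot(label, num_classes):
    """Turn a class index into a 0/1 target vector."""
    return [1.0 if c == label else 0.0 for c in range(num_classes)]

def multilabel_soft_margin(scores, targets):
    """Mean over classes of the binary logistic loss on each score."""
    total = 0.0
    for s, t in zip(scores, targets):
        prob = 1.0 / (1.0 + math.exp(-s))  # sigmoid(score)
        total += -(t * math.log(prob) + (1.0 - t) * math.log(1.0 - prob))
    return total / len(scores)

target = one_hot(0, num_classes=3)  # true class is 0
scores_good = [4.0, -3.0, -5.0]     # high score on class 0, low elsewhere
scores_bad = [-4.0, 3.0, -5.0]      # confidently wrong
loss_good = multilabel_soft_margin(scores_good, target)
loss_bad = multilabel_soft_margin(scores_bad, target)
loss_zero = multilabel_soft_margin([0.0, 0.0, 0.0], target)

print(loss_good < loss_bad)  # True
```

With all scores at zero every sigmoid equals 0.5, so the loss is log(2); pushing the true class score up and the others down drives it toward zero.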

## Optimizer

In [None]:
from torch import optim

optimizer = optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

## Training loop

In [None]:
from tqdm import tqdm # progress bar
from torch.nn import functional as F

model.train() # put model in training mode

NUM_EPOCHS = 10

for epoch in range(NUM_EPOCHS):

  loop = tqdm(train_loader, position=0, leave=True)

  for (inputs, labels) in loop:
    optimizer.zero_grad() # zero the parameter gradients

    # MultiLabelSoftMarginLoss expects one-hot float targets of shape (batch, 10)
    targets = F.one_hot(labels, num_classes=10).float()

    # forward + backward + optimize
    output = model(inputs) # calling the module runs forward()
    loss = loss_function(output, targets)
    loss.backward()
    optimizer.step()

    loop.set_description(f"Epoch [{epoch}/{NUM_EPOCHS}]")

## Evaluate the model

In [None]:
correct = 0
total = 0

loop = tqdm(test_loader, position=0, leave=True)
model.eval() # put model in evaluation mode
with torch.no_grad(): # gradients are not needed for evaluation
  for (inputs, labels) in loop:
    output = model(inputs)
    _, predicted = torch.max(output, 1) # index of the highest score is the predicted class
    total += labels.size(0)
    correct += (predicted == labels).sum().item()
    loop.set_postfix(acc=(100*correct/total))