# Model Soup ensemble example

### Preparation

In this notebook, we use CIFAR-10 to demonstrate the performance of the model soup ensemble operator in its greedy and uniform variants.

In [1]:
import torch
import torchvision

transform = torchvision.transforms.Compose([torchvision.transforms.ToTensor(),
                                            torchvision.transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010))])

tr_set = torchvision.datasets.CIFAR10(root = "./datasets/cifar10", train = True, download = True, transform = transform)
te_set = torchvision.datasets.CIFAR10(root = "./datasets/cifar10", train = False, download = True, transform = transform)

batch_size = 128

tr_data = torch.utils.data.DataLoader(tr_set, batch_size = batch_size, shuffle = True)
te_data = torch.utils.data.DataLoader(te_set, batch_size = batch_size, shuffle = False)

Files already downloaded and verified
Files already downloaded and verified


### Training

Here we train five ResNet-50 models with different seeds and save each one to its own checkpoint path.
We also evaluate their accuracy on the test set for later comparison.

In [2]:
from torch import nn
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

def evaluate(model, data):
    score = 0
    model.eval()
    with torch.no_grad():
        for x, y in data:
            x, y = x.to(device), y.to(device)
            logits = model(x)
            pred = logits.argmax(dim = 1)
            score += (pred == y).sum().to("cpu").item()
    score = score / len(data.dataset)
    return score

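result = {}  # checkpoint path -> test accuracy
for index in range(5):
    print("{0} - model learn".format(index + 1))

    # Seed each run differently so every model sees a different shuffle order.
    torch.manual_seed(index)

    # The pretrained 1000-class head is kept; CIFAR-10 labels use only its first 10 outputs.
    # Newer torchvision releases deprecate `pretrained` in favor of `weights`.
    model = torchvision.models.resnet50(pretrained = True)
    model.to(device)
    model.train()
    criterion = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr = 1e-4)
    for epoch in range(10):
        for x, y in tr_data:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            logits = model(x)
            loss = criterion(logits, y)
            loss.backward()
            optimizer.step()

    # Record the test accuracy and save the weights for the soup step.
    score = evaluate(model, te_data)
    print("score : {0:.4f}".format(score))
    path = "model{0}.pth".format(index + 1)
    result[path] = score
    torch.save(model.state_dict(), path)
print("finish")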

1 - model learn
score : 0.8301
2 - model learn
score : 0.8308
3 - model learn
score : 0.8351
4 - model learn


### Model Soup Ensemble & Reference

In this part, we use the model soup ensemble operator to generate new models from the saved checkpoints.

Both greedy_soup and uniform_soup improve classification accuracy over the individual models by at least 1%.
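
The two strategies can be sketched in plain Python. This is a minimal illustration of the idea, not towhee's implementation: `average_states`, `greedy_soup`, `score_fn`, and the toy checkpoints are hypothetical, and plain floats stand in for the tensors of a real `state_dict`.

```python
from typing import Callable, Dict, List, Sequence

# Stand-in for a PyTorch state_dict (parameter name -> value); an assumption for illustration.
State = Dict[str, float]


def average_states(states: Sequence[State]) -> State:
    """Uniform soup: element-wise mean of every parameter across checkpoints."""
    n = len(states)
    return {name: sum(s[name] for s in states) / n for name in states[0]}


def greedy_soup(states: List[State], score_fn: Callable[[State], float]) -> State:
    """Greedy soup: try checkpoints best-first, keep one only if the score does not drop."""
    ranked = sorted(states, key=score_fn, reverse=True)
    soup = [ranked[0]]
    best = score_fn(ranked[0])
    for candidate in ranked[1:]:
        trial = score_fn(average_states(soup + [candidate]))
        if trial >= best:  # ties are kept, analogous to passing np.greater_equal
            soup.append(candidate)
            best = trial
    return average_states(soup)


def toy_score(state: State) -> float:
    """Hypothetical score that peaks when the single weight equals 1.0."""
    return -abs(state["w"] - 1.0)


# Toy checkpoints with one scalar "weight" each.
checkpoints = [{"w": 0.8}, {"w": 1.3}, {"w": 3.0}]

uniform = average_states(checkpoints)          # averages all three checkpoints
greedy = greedy_soup(checkpoints, toy_score)   # skips the checkpoint that hurts the score
```

With real checkpoints, the same averaging would be applied to each tensor loaded by `torch.load`, and `score_fn` would load the averaged weights into a model and evaluate it, ideally on a held-out validation split rather than the test set.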

In [None]:
import os
import numpy as np
import towhee
from towhee import ops

def metric(y_true, y_pred):
    # Top-1 accuracy: y_pred holds per-class scores, y_true the integer labels.
    return ((y_true == y_pred.argmax(axis = -1)).sum() / len(y_true)).to("cpu").item()

print("[Original Performance]")
for k, v in result.items():
    print("[{0}] score:{1:.4f}".format(os.path.basename(k), v))

print("\n[Greedy Soup (uniform weight update) Performance]")
greedy_soup = ops.ensemble.model_soup(souptype = "greedy_soup")
greedy_model = greedy_soup(model, list(result.keys()), te_data, metric = metric, device = device, compare = np.greater_equal)

score = evaluate(greedy_model, te_data)
print("score : {0:.4f}".format(score))

print("\n[Uniform Soup Performance]")
uniform_soup = ops.ensemble.model_soup(souptype = "uniform_soup")
uniform_model = uniform_soup(model, list(result.keys()), device = device)
score = evaluate(uniform_model, te_data)
print("score : {0:.4f}".format(score))