***Challenge 1***

Here the goal is to train on 25 samples. In this preliminary testbed, evaluation is done on a 2000-sample validation set. The final evaluation will be done on the full CIFAR-10 test set and potentially on a separate dataset. The validation samples here must not be used for training in any way; the final evaluation will provide only random samples of 25 from a data source that is not the CIFAR-10 training data.

Feel free to modify this testbed to your liking, including the normalization transformations. Note, however, that the final evaluation testbed will have a rigid set of components where you will need to place your answer. The only constraint is the data. Refer to the full project instructions for more information.
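
The data constraint above (validation samples must never be used for training) can be checked mechanically. The following is a minimal sketch, not part of the official testbed: the helper `split_indices_per_class` and the toy label list are hypothetical, and the sketch uses only the standard library rather than the torchvision dataset.

```python
import random

def split_indices_per_class(labels, classes, n_train, n_val, seed):
    """Pick disjoint train/val indices for each class in `classes`."""
    rng = random.Random(seed)
    train_idx, val_idx = [], []
    for c in classes:
        # All positions that carry label c, shuffled with the seeded RNG.
        pool = [i for i, y in enumerate(labels) if y == c]
        rng.shuffle(pool)
        if len(pool) < n_train + n_val:
            raise ValueError(f"class {c} has only {len(pool)} samples")
        train_idx += pool[:n_train]
        val_idx += pool[n_train:n_train + n_val]
    return train_idx, val_idx

# Toy labels standing in for a real label list (hypothetical data).
toy_labels = [i % 10 for i in range(5000)]
train_idx, val_idx = split_indices_per_class(
    toy_labels, classes=[3, 7], n_train=25, n_val=200, seed=1)

# The two index sets must never overlap.
assert set(train_idx).isdisjoint(val_idx)
print(len(train_idx), len(val_idx))  # 50 400
```

Slicing one shuffled pool into consecutive, non-overlapping ranges is what guarantees the split is disjoint; the assertion only makes that guarantee explicit.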


Set up the training functions. Again, you are free to fully modify this testbed while prototyping, within the constraints on the data used. You can also use tools outside of PyTorch to train models if desired, although the torchvision data loaders will still be useful for interacting with the CIFAR-10 dataset.

In [None]:
def train(model, device, train_loader, optimizer, epoch, display=True):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = F.cross_entropy(output, target)
        loss.backward()
        optimizer.step()
    if display:
      print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
          epoch, batch_idx * len(data), len(train_loader.dataset),
          100. * batch_idx / len(train_loader), loss.item()))

def test(model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += F.cross_entropy(output, target, reduction='sum').item() # sum up batch loss
            pred = output.max(1, keepdim=True)[1] # get the index of the max logit
            correct += pred.eq(target.view_as(pred)).sum().item()

    test_loss /= len(test_loader.dataset)
    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.2f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))
    return 100. * correct / len(test_loader.dataset)
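
The `test` function above adds up the loss over every sample and only then divides by the dataset size. A small sketch of why that matters when the last batch is smaller than the others; the per-sample losses below are made up for illustration, and both helper functions are hypothetical.

```python
def mean_loss_from_sums(batches):
    """Sum every per-sample loss, then divide by the number of samples."""
    total = sum(sum(b) for b in batches)
    count = sum(len(b) for b in batches)
    return total / count

def mean_of_batch_means(batches):
    """Average the per-batch means, ignoring batch sizes."""
    return sum(sum(b) / len(b) for b in batches) / len(batches)

# Made-up per-sample losses: one full batch of 4 and a final batch of 1.
batches = [[1.0, 1.0, 1.0, 1.0], [6.0]]

print(mean_loss_from_sums(batches))   # 2.0
print(mean_of_batch_means(batches))   # 3.5
```

Averaging batch means over-weights the small final batch, so the summed form is the safer default whenever the dataset size is not a multiple of the batch size.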

In [None]:
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(torch.nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.layers = nn.ModuleList()

        self.layers+=[nn.Conv2d(3, 16,  kernel_size=3) ,
                      nn.ReLU(inplace=True)]
        self.layers+=[nn.Conv2d(16, 16,  kernel_size=3, stride=2),
                      nn.ReLU(inplace=True)]
        self.layers+=[nn.Conv2d(16, 32,  kernel_size=3),
                      nn.ReLU(inplace=True)]
        self.layers+=[nn.Conv2d(32, 32,  kernel_size=3, stride=2),
                      nn.ReLU(inplace=True)]
        self.fc = nn.Linear(32*5*5, 10)
    def forward(self, x):
        for i in range(len(self.layers)):
          x = self.layers[i](x)
        x = x.view(-1, 32*5*5)
        x = self.fc(x)
        return x
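
The `32*5*5` input size of the final linear layer follows from the convolution output-size formula. A minimal sketch of that arithmetic, assuming 32x32 inputs (the CIFAR-10 image size) and no padding or dilation, as in `Net`; the helper `conv_out` is hypothetical.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1

size = 32  # assumed input: 32x32 images
# (kernel_size, stride) of the four conv layers in Net, in order.
for kernel, stride in [(3, 1), (3, 2), (3, 1), (3, 2)]:
    size = conv_out(size, kernel, stride)
    print(size)  # prints 30, 14, 12, 5 in turn

flat_features = 32 * size * size  # 32 channels after the last conv layer
print(flat_features)  # 800
```

If the input resolution or any kernel size or stride changes, the flattened size changes too, so both `nn.Linear(32*5*5, 10)` and the `view` call would need updating.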

The cell below tries several random problem instances (one per seed). During development you may choose to prototype with a single problem instance, but keep in mind that for small-sample problems the variance is high, so continuously evaluating on several subsets will be important.

In [None]:
from numpy.random import RandomState
import numpy as np
import torch.optim as optim
from torch.utils.data import Subset
from torchvision import datasets, transforms


normalize = transforms.Normalize((0.4914, 0.4822, 0.4465), (0.247, 0.243, 0.261))

transform_val = transforms.Compose([transforms.ToTensor(), normalize]) #careful to keep this one same
transform_train = transforms.Compose([transforms.ToTensor(), normalize])

use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")

##### Cifar Data
cifar_data = datasets.CIFAR10(root='.',train=True, transform=transform_train, download=True)

#We need two copies of this due to weird dataset api
cifar_data_val = datasets.CIFAR10(root='.',train=True, transform=transform_val, download=True)


accs = []

for seed in range(1, 5):
  prng = RandomState(seed)
  random_permute = prng.permutation(np.arange(0, 1000))
  classes =  prng.permutation(np.arange(0,10))
  indx_train = np.concatenate([np.where(np.array(cifar_data.targets) == classe)[0][random_permute[0:25]] for classe in classes[0:2]])
  indx_val = np.concatenate([np.where(np.array(cifar_data.targets) == classe)[0][random_permute[25:225]] for classe in classes[0:2]])


  train_data = Subset(cifar_data, indx_train)
  val_data = Subset(cifar_data_val, indx_val)

  print('Num Samples For Training %d Num Samples For Val %d'%(train_data.indices.shape[0],val_data.indices.shape[0]))

  train_loader = torch.utils.data.DataLoader(train_data,
                                             batch_size=128,
                                             shuffle=True)

  val_loader = torch.utils.data.DataLoader(val_data,
                                           batch_size=128,
                                           shuffle=False)


  model = Net()
  model.to(device)
  optimizer = torch.optim.SGD(model.parameters(),lr=0.01, momentum=0.9,
                              weight_decay=0.0005)
  for epoch in range(100):
    train(model, device, train_loader, optimizer, epoch, display=epoch%5==0)

  accs.append(test(model, device, val_loader))

accs = np.array(accs)
print('Acc over %d instances: %.2f +- %.2f'%(len(accs), accs.mean(),accs.std()))


Files already downloaded and verified
Files already downloaded and verified
Num Samples For Training 50 Num Samples For Val 400


NameError: ignored
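
Since the variance across small-sample problem instances is high, it can help to report the spread and the standard error alongside the mean. A minimal sketch using only the standard library; the accuracies below are placeholders, not results from this notebook, and `summarize` is a hypothetical helper.

```python
import math
import statistics

def summarize(accs):
    """Return mean, sample standard deviation, and standard error of the mean."""
    mean = statistics.mean(accs)
    std = statistics.stdev(accs)       # sample std (n - 1 in the denominator)
    sem = std / math.sqrt(len(accs))   # uncertainty of the mean itself
    return mean, std, sem

# Placeholder accuracies in percent, NOT results from this notebook.
placeholder_accs = [60.0, 70.0, 80.0, 90.0]
mean, std, sem = summarize(placeholder_accs)
print(f"{mean:.2f} +- {std:.2f} (SEM {sem:.2f}, n={len(placeholder_accs)})")
```

Note that `numpy`'s `std` defaults to the population form (`ddof=0`), so it reports a slightly smaller spread than `statistics.stdev` for the same values.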

***Challenge 2***

You may use the same testbed, but without the constraints on external datasets or models trained on external datasets. You may not, however, use any of the CIFAR-10 training set.

In [None]:
from torch.utils.data import Dataset

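def Offline_aug(x,y):
  """Return each image plus a randomly rotated copy and a copy that is
  horizontally flipped with probability 0.5 (3x the data, labels repeated)."""
  y1 = y.squeeze()
  assert x.shape[0]==y1.shape[0]
  tr_x = []
  tr_y = []
  for i in range(x.shape[0]):
    x_ = transforms.ToPILImage()(x[i])
    x1 = transforms.RandomRotation(40)(x_)
    x2 = transforms.RandomHorizontalFlip()(x_)
    # x3, x4 and x5 are computed but not added to the augmented set below
    x3 = transforms.RandomVerticalFlip()(x_)
    x4 = transforms.ColorJitter()(x_)
    x5 = transforms.AutoAugment()(x_)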
    tr_x.append(transforms.ToTensor()(x_))
    tr_x.append(transforms.ToTensor()(x1))
    tr_x.append(transforms.ToTensor()(x2))
    tr_y.append(y1[i])
    tr_y.append(y1[i])
    tr_y.append(y1[i])
  tr_x = torch.stack(tr_x)
  tr_y = torch.stack(tr_y)
  return tr_x,tr_y

class CustomTensorDataset(Dataset):
    """TensorDataset with support of transforms.
    """
    def __init__(self, tensors, transform=None):
        assert all(tensors[0].size(0) == tensor.size(0) for tensor in tensors)
        self.tensors = tensors
        self.transform = transform

    def __getitem__(self, index):
        x = self.tensors[0][index]

        if self.transform:
            x = self.transform(x)

        y = self.tensors[1][index]

        return x, y

    def __len__(self):
        return self.tensors[0].size(0)
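
The idea behind offline augmentation is to enlarge the training set once, up front, by storing transformed copies next to the originals with the same labels. A library-free sketch of that bookkeeping, using tiny nested lists as stand-ins for image tensors; `hflip` and `offline_augment` are hypothetical helpers and not the notebook's `Offline_aug`.

```python
def hflip(image):
    """Mirror an image stored as a list of rows (each row a list of pixels)."""
    return [list(reversed(row)) for row in image]

def offline_augment(images, labels, transforms):
    """Return originals plus one transformed copy per transform, labels repeated."""
    out_x, out_y = [], []
    for img, lab in zip(images, labels):
        out_x.append(img)
        out_y.append(lab)
        for t in transforms:
            out_x.append(t(img))
            out_y.append(lab)  # augmented copies keep the source label
    return out_x, out_y

# Two tiny 2x2 "images" as stand-ins for real image tensors.
images = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
labels = [0, 1]
aug_x, aug_y = offline_augment(images, labels, [hflip])
print(len(aug_x), aug_y)  # 4 [0, 0, 1, 1]
```

The augmented set grows by a factor of `1 + len(transforms)`, which is why the label list has to be extended in lockstep with the image list.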

In [None]:
!pip install efficientnet-pytorch

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting efficientnet-pytorch
  Downloading efficientnet_pytorch-0.7.1.tar.gz (21 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: efficientnet-pytorch
  Building wheel for efficientnet-pytorch (setup.py) ... done
  Created wheel for efficientnet-pytorch: filename=efficientnet_pytorch-0.7.1-py3-none-any.whl size=16444 sha256=8c9ba3d2c567256b18baa0e41d6545da4c76b5d93da0d8a21586a7d44512b831
  Stored in directory: /root/.cache/pip/wheels/03/3f/e9/911b1bc46869644912bda90a56bcf7b960f20b5187feea3baf
Successfully built efficientnet-pytorch
Installing collected packages: efficientnet-pytorch
Successfully installed efficientnet-pytorch-0.7.1


In [None]:
import torchvision.models as models
import torch.nn as nn
from numpy.random import RandomState
import numpy as np
import torch
import torch.optim as optim
from torch.utils.data import Subset
import torch.nn.functional as F
from torchvision import datasets, transforms
import pickle
from efficientnet_pytorch import EfficientNet

#torch.cuda.memory_summary(device=None, abbreviated=False)

normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                  std=[0.229, 0.224, 0.225])

# We resize images so that ImageNet pre-trained models can be used; is there a better way?
resize = transforms.Resize(224)

transform_val = transforms.Compose([resize, transforms.ToTensor(), normalize]) #careful to keep this one same
transform_train = transforms.Compose([transforms.ToPILImage(), resize, transforms.ToTensor(), normalize])

use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")
print(device) # you will really need a GPU for this part

##### Cifar Data
cifar_data = datasets.CIFAR10(root='.',train=True, transform=transforms.ToTensor(), download=True)
cifar_data100 = datasets.CIFAR100(root='.',train=True, transform=transform_val, download=True)

#We need two copies of this due to weird dataset api
cifar_data_val = datasets.CIFAR10(root='.',train=True, transform=transform_val, download=True)

accs = []
l2_lambda = 0.001
step = 1
num_epoch = 4

Valid_accuracies = []
Valid_loss = []
Train_loss = []

for seed in range(1,25):
  print('seed number is %d'%(seed))
  prng = RandomState(seed)
  random_permute = prng.permutation(np.arange(0, 5000))
  classes =  prng.permutation(np.arange(0,10))
  classes1 =  prng.permutation(np.arange(0,100))
  indx_train = np.concatenate([np.where(np.array(cifar_data.targets) == classe)[0][random_permute[0:25]] for classe in classes[0:2]])
  indx_val = np.concatenate([np.where(np.array(cifar_data.targets) == classe)[0][random_permute[25:225]] for classe in classes[0:2]])
  indx_train100 = np.concatenate([np.where(np.array(cifar_data100.targets) == classe)[0] for classe in classes[0:2]])

  train_data = Subset(cifar_data, indx_train)
  val_data = Subset(cifar_data_val, indx_val)
  train_data100 = Subset(cifar_data100,indx_train100)

  print('Num Samples For Training %d Num Samples For Val %d'%(train_data.indices.shape[0],val_data.indices.shape[0]))

  train_loader = torch.utils.data.DataLoader(train_data,
                                             batch_size=50,
                                             shuffle=True)

  train_loader100 = torch.utils.data.DataLoader(train_data100,
                                             batch_size=200,#200
                                             shuffle=True)
  val_loader = torch.utils.data.DataLoader(val_data,
                                           batch_size=32,
                                           shuffle=True)

  trainset = []
  targetset = []
  for data, target in (train_loader):
    trainset.append(data)
    targetset.append(target)
  trainset = torch.stack(trainset).squeeze()
  targetset = torch.stack(targetset)

  X,Y = Offline_aug(trainset,targetset)

  train_dataset_normal = CustomTensorDataset(tensors=(X, Y), transform=transform_train)

  train_loader_new = torch.utils.data.DataLoader(train_dataset_normal,
                                              batch_size=32,
                                              shuffle=True)


  #model = models.alexnet(pretrained=True)
  model = EfficientNet.from_pretrained('efficientnet-b1')
  for param in model.parameters():
    param.requires_grad = False
  #model.classifier = nn.Linear(256 * 6 * 6, 10)
  model ._fc= torch.nn.Linear(in_features=model._fc.in_features, out_features=10, bias=True)
  optimizer = torch.optim.SGD(model._fc.parameters(),
                                lr=0.01, momentum=0.9,
                              weight_decay=0.0005)

 # model.to(device)
  train_loss = 0
  for epoch in range(num_epoch):
    model.train()  # re-enable training mode; the validation pass below switches to eval()
    for batch_idx, ((data, target),(data100,target100)) in enumerate(zip(train_loader_new,train_loader100)):

      #data = data.to(device)
      #target = target.to(device)
      #data100 = data100.to(device)
      #target100 = target100.to(device)

      optimizer.zero_grad()

      output = (model(data.float())).squeeze()
      output100 = (model(data100.float())).squeeze()

      loss1 = F.cross_entropy(output, target)
      loss2 = F.cross_entropy(output100, target100)

      l2_norm = sum(p.pow(2.0).sum()
                  for p in model.parameters())

      loss = loss1 + l2_lambda * l2_norm +loss2  #L2 Regularization
      train_loss += loss.item()  # accumulate a Python float so the autograd graph is not kept alive
      loss.backward()
      optimizer.step()
    train_loss /= len(train_loader_new.dataset)
    Train_loss.append(train_loss)
    #print(f"Loss at epoch {epoch} = {mean_loss}")
    if epoch%step==0:
      print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
        epoch, batch_idx * len(data), len(train_loader_new.dataset),
      100. * batch_idx / len(train_loader_new), loss.item()))
    model.eval()
    test_loss = 0
    correct = 0

    with torch.no_grad():

        for data, target in val_loader:
           # data, target = data.to(device), target.to(device)
            output = model(data.float())
            test_loss += F.cross_entropy(output, target, reduction='sum').item() # sum up batch loss
            pred = output.max(1, keepdim=True)[1] # get the index of the max logit
            correct += pred.eq(target.view_as(pred)).sum().item()



    #print(f"Loss at epoch {epoch} = {mean_loss}")
    test_loss /= len(val_loader.dataset)
    Valid_loss.append(test_loss)
    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.2f}%)\n'.format(
        test_loss, correct, len(val_loader.dataset),
        100. * correct / len(val_loader.dataset)))
    print(100. * correct / len(val_loader.dataset))
    Valid_accuracies.append(correct / len(val_loader.dataset))
  with open('/content/drive/MyDrive/Results2/Train_loss', 'wb') as fp:
      pickle.dump(Train_loss, fp)
  with open('/content/drive/MyDrive/Results2/Valid_loss', 'wb') as fp:
      pickle.dump(Valid_loss, fp)
  with open('/content/drive/MyDrive/Results2/Valid_accuracy', 'wb') as fp:
      pickle.dump(Valid_accuracies, fp)
print(f'Mean Val Acc over all seeds and epochs: '\
      f'{np.mean(Valid_accuracies):.2%} '\
      f'+- {np.std(Valid_accuracies):.2}')

print(f'Mean Val Loss over all seeds and epochs: '\
      f'{np.mean(Valid_loss):.4f} '\
      f'+- {np.std(Valid_loss):.2}')


cuda
Files already downloaded and verified
Files already downloaded and verified
Files already downloaded and verified
seed number is 5
Num Samples For Training 50 Num Samples For Val 400


Downloading: "https://github.com/lukemelas/EfficientNet-PyTorch/releases/download/1.0/efficientnet-b1-f1951068.pth" to /root/.cache/torch/hub/checkpoints/efficientnet-b1-f1951068.pth
100%|██████████| 30.1M/30.1M [00:00<00:00, 176MB/s]


Loaded pretrained weights for efficientnet-b1

Test set: Average loss: 0.0460, Accuracy: 307/400 (76.75%)

76.75

Test set: Average loss: 0.0106, Accuracy: 373/400 (93.25%)

93.25

Test set: Average loss: 0.0051, Accuracy: 381/400 (95.25%)

95.25

Test set: Average loss: 0.0051, Accuracy: 374/400 (93.50%)

93.5

Test set: Average loss: 0.0044, Accuracy: 380/400 (95.00%)

95.0

Test set: Average loss: 0.0038, Accuracy: 382/400 (95.50%)

95.5

Test set: Average loss: 0.0036, Accuracy: 382/400 (95.50%)

95.5

Test set: Average loss: 0.0039, Accuracy: 382/400 (95.50%)

95.5

Test set: Average loss: 0.0042, Accuracy: 381/400 (95.25%)

95.25

Test set: Average loss: 0.0041, Accuracy: 381/400 (95.25%)

95.25
seed number is 6
Num Samples For Training 50 Num Samples For Val 400
Loaded pretrained weights for efficientnet-b1

Test set: Average loss: 0.0442, Accuracy: 282/400 (70.50%)

70.5

Test set: Average loss: 0.0096, Accuracy: 377/400 (94.25%)

94.25

Test set: Average loss: 0.0046, Accuracy

In [None]:
import pickle
with open('/content/drive/MyDrive/Results2/Train_loss', 'wb') as fp:
      pickle.dump(Train_loss, fp)


NameError: ignored

In [None]:
import pickle
import matplotlib.pyplot as plt
import torch
import numpy as np

with open('/content/drive/MyDrive/Results2/Valid_accuracy','rb') as f:
    x = pickle.load(f)

with open('/content/drive/MyDrive/Results2/Valid_loss','rb') as f:
    x2 = pickle.load(f)
#x1 = torch.stack(x)
#plt.plot(x1.detach().cpu())
print(f'Mean Train Acc over 25 seeds: '\
      f'{np.mean(x):.2%} '\
      f'+- {np.std(x):.2}')

print(f'Mean Loss Acc over 25 seeds: '\
      f'{np.mean(x2):.2%} '\
      f'+- {np.std(x2):.2}')


Mean Train Acc over 25 seeds: 81.53% +- 0.15
Mean Loss Acc over 25 seeds: 1.83% +- 0.015


In [None]:
print(f'Mean Val Acc over all seeds and epochs: '\
      f'{np.mean(Valid_accuracies):.2%} '\
      f'+- {np.std(Valid_accuracies):.2}')

print(f'Mean Val Loss over all seeds and epochs: '\
      f'{np.mean(Valid_loss):.4f} '\
      f'+- {np.std(Valid_loss):.2}')

NameError: ignored
