# Task Overview

In this task, your goal is to measure how the level of noise in the training data affects neural network training.
You should use an MLP architecture trained on the MNIST dataset (as in previous lab exercises).


We will experiment with two setups:
1. Pick X. Take X% of the training examples and reassign their labels to random ones. Note that we don't change anything in the test set.
2. Pick X. During each training step, for each sample, change the values of X% of randomly selected pixels to random values. Note that we don't change anything in the test set.

For both setups, check the impact of various levels of noise (various values of X%) on model performance. Show plots comparing cross-entropy (log-loss) and accuracy for varying X%, and also comparing the two setups with each other.
Prepare a short report briefly explaining the results and observed trends. Consider questions like "why does accuracy/loss increase/decrease so quickly/slowly", "why is Z higher in setup 1/2", and anything potentially surprising that you see on the charts.
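
Both setups begin by picking X% of something (examples or pixels). A minimal sketch of the two ways to do that, exactly X% or X% in expectation, is shown below. The helper names `pick_exact` and `pick_in_expectation` are hypothetical and only illustrate the idea; they are not part of the required solution.

```python
import random

def pick_exact(n_items, x_percent, rng):
    """Return exactly int(n_items * x_percent / 100) distinct indices."""
    k = int(n_items * x_percent / 100)
    return set(rng.sample(range(n_items), k))

def pick_in_expectation(n_items, x_percent, rng):
    """Include each index independently with probability x_percent / 100."""
    p = x_percent / 100
    return {i for i in range(n_items) if rng.random() < p}

rng = random.Random(0)  # fixed seed, only to make the sketch reproducible
exact = pick_exact(60000, 10, rng)
approx = pick_in_expectation(60000, 10, rng)
print(len(exact))   # always 6000
print(len(approx))  # close to 6000, varies with the seed
```

The second variant is often easier to vectorize, because it only needs one uniform draw per item.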

### Potential questions, clarifications
* Q: Can I still use sigmoid/MSE loss?
  * A: You should train your network with softmax and cross-entropy loss (log-loss), especially since you have to report the cross-entropy loss.
* Q: When I pick X% of pixels/examples, does it have to be exactly X% or can it be X% in expectation?
  * A: It's fine either way.
* Q: When I randomize pixels, should I randomize them again each time a particular example is drawn (each training step/epoch) or only once before training?
  * A: Each training step/epoch.
* Q: When I randomize labels, should I randomize them again each time a particular example is drawn (each training step/epoch) or only once before training?
  * A: Only once before training.
* Q: What is the expected length of report/explanation?
  * A: There is no minimum/maximum, but between 5 (concise) and 20 sentences should be good. Don't forget about plots.
* Q: When I replace labels/pixels with random values, what random distribution should I use?
  * A: A distribution reasonably similar to the data. However, you don't need to match the dataset's distribution exactly; an approximation is totally fine, especially if it's faster or easier to obtain.
* Q: Can I use something different than Colab/Jupyter Notebook? E.g. just Python files.
  * A: Yes, although a notebook is encouraged; please include both the code and a PDF in your solution.
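
To make the "softmax and cross-entropy" requirement concrete: the log-loss of one example is the negative log of the softmax probability assigned to the true class. The sketch below computes it directly from raw scores (logits) in plain Python, using the log-sum-exp trick for numerical stability. The function name is hypothetical. PyTorch's `F.cross_entropy` also expects raw logits and applies log-softmax internally, so a model trained with it should not apply softmax in its forward pass.

```python
import math

def cross_entropy_from_logits(logits, target):
    """Log-loss of one example: -log softmax(logits)[target], computed stably."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

uniform = cross_entropy_from_logits([0.0] * 10, 3)
confident = cross_entropy_from_logits([0.0] * 9 + [20.0], 9)
print(uniform)    # log(10): the loss of a 10-class model that guesses uniformly
print(confident)  # close to 0
```

The value log(10), about 2.30, is a useful reference point when reading loss plots: a test loss near it means the predictions carry almost no information about the label.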

# Model definition and training.

In [None]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import Dataset, DataLoader
import plotly.express as px
import random
import math
import numpy as np
import pandas as pd

In [None]:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # After flattening an image of size 28x28 we have 784 inputs
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 128)
        self.fc3 = nn.Linear(128, 10)

    def forward(self, x):
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = F.relu(x)
        x = self.fc2(x)
        x = F.relu(x)
        x = self.fc3(x)
        # Return raw logits: F.cross_entropy applies log-softmax internally,
        # so no explicit softmax is needed here.
        return x


def train(model, device, train_loader, optimizer, epoch, log_interval):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = F.cross_entropy(output, target)
        loss.backward()
        optimizer.step()
        if batch_idx % log_interval == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item()))


def test(model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += F.cross_entropy(output, target, reduction='sum').item()  # sum up batch loss
            pred = output.argmax(dim=1, keepdim=True)  # index of the largest logit (softmax would not change the argmax)
            correct += pred.eq(target.view_as(pred)).sum().item()

    test_loss /= len(test_loader.dataset)

    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))
    
    return (test_loss, 100. * correct / len(test_loader.dataset))
    
batch_size = 256
test_batch_size = 1000
epochs = 10
lr = 1e-2
no_cuda = False  # set to True to force training on the CPU
seed = 1
log_interval = 10

use_cuda = not no_cuda and torch.cuda.is_available()

torch.manual_seed(seed)
device = torch.device("cuda" if use_cuda else "cpu")

# Shuffle the training data regardless of device; the test set needs no shuffling.
train_kwargs = {'batch_size': batch_size, 'shuffle': True}
test_kwargs = {'batch_size': test_batch_size}
if use_cuda:
    cuda_kwargs = {'num_workers': 1,
                   'pin_memory': True}
    train_kwargs.update(cuda_kwargs)
    test_kwargs.update(cuda_kwargs)

transform=transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
    ])
dataset1 = datasets.MNIST('../data', train=True, download=True,
                    transform=transform)
dataset2 = datasets.MNIST('../data', train=False,
                    transform=transform)
train_loader = torch.utils.data.DataLoader(dataset1,**train_kwargs)
test_loader = torch.utils.data.DataLoader(dataset2, **test_kwargs)

model = Net().to(device)
optimizer = optim.Adam(model.parameters(), lr=lr)


In [None]:
dataset1[0][1]

5

In [None]:
batch = 1

for i_batch, sample_batch in enumerate(train_loader):
  if i_batch > 0:
    break
  batch = sample_batch

In [None]:
batch[1]

tensor([3, 5, 7, 0, 6, 6, 6, 9, 3, 6, 1, 4, 5, 7, 7, 8, 0, 6, 7, 2, 3, 9, 6, 9,
        5, 6, 9, 2, 4, 1, 4, 5, 9, 0, 1, 7, 6, 8, 5, 6, 2, 0, 1, 1, 6, 6, 9, 0,
        7, 1, 5, 9, 3, 3, 4, 0, 9, 0, 6, 3, 7, 9, 2, 4, 6, 7, 0, 8, 9, 2, 3, 0,
        7, 0, 5, 7, 4, 2, 5, 3, 9, 3, 5, 2, 7, 9, 9, 8, 2, 6, 2, 6, 4, 7, 0, 4,
        5, 3, 0, 0, 5, 1, 9, 3, 5, 6, 0, 5, 3, 0, 3, 5, 9, 1, 4, 8, 4, 1, 8, 9,
        3, 1, 0, 5, 7, 6, 1, 1, 5, 7, 4, 0, 3, 1, 2, 9, 2, 5, 4, 1, 9, 1, 7, 9,
        3, 6, 6, 3, 2, 9, 1, 8, 4, 1, 3, 2, 9, 2, 4, 0, 6, 7, 0, 5, 3, 9, 8, 1,
        3, 6, 3, 6, 9, 2, 1, 4, 7, 4, 9, 7, 3, 8, 4, 2, 3, 9, 4, 8, 8, 6, 3, 9,
        1, 1, 6, 9, 0, 7, 7, 9, 9, 5, 8, 9, 0, 1, 7, 3, 7, 8, 0, 6, 1, 5, 2, 7,
        7, 0, 5, 1, 7, 3, 5, 1, 6, 7, 8, 2, 7, 0, 0, 4, 7, 2, 2, 8, 1, 1, 7, 6,
        0, 6, 5, 4, 3, 9, 4, 1, 6, 1, 5, 8, 8, 5, 5, 4])

In [None]:
model.parameters()

<generator object Module.parameters at 0x7f3e53f67b50>

In [None]:
dataset1

Dataset MNIST
    Number of datapoints: 60000
    Root location: ../data
    Split: Train
    StandardTransform
Transform: Compose(
               ToTensor()
               Normalize(mean=(0.1307,), std=(0.3081,))
           )

In [None]:
base_losses = []
base_accs = []

for epoch in range(1, epochs + 1):
    train(model, device, train_loader, optimizer, epoch, log_interval)
    loss, acc = test(model, device, test_loader)
    base_losses.append(loss)
    base_accs.append(acc)


Test set: Average loss: 0.1426, Accuracy: 9544/10000 (95%)


Test set: Average loss: 0.1261, Accuracy: 9647/10000 (96%)


Test set: Average loss: 0.1350, Accuracy: 9626/10000 (96%)


Test set: Average loss: 0.1263, Accuracy: 9646/10000 (96%)


Test set: Average loss: 0.1273, Accuracy: 9650/10000 (96%)


Test set: Average loss: 0.1253, Accuracy: 9666/10000 (97%)


Test set: Average loss: 0.1345, Accuracy: 9692/10000 (97%)


Test set: Average loss: 0.1861, Accuracy: 9600/10000 (96%)


Test set: Average loss: 0.1629, Accuracy: 9648/10000 (96%)


Test set: Average loss: 0.1670, Accuracy: 9643/10000 (96%)



# Training models in setup 1: with randomized labels.

Below I train the model for several values of X, the percentage of training labels that are randomized. The training loss is printed during training; plots of test accuracy and loss are in the last part of the notebook.
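
One detail worth keeping in mind when reading the results: a label that is redrawn at random coincides with the original label some of the time, so the fraction of labels that actually change is smaller than X%. The sketch below illustrates this in plain Python, assuming replacement labels are drawn uniformly from all ten digits; `corrupt_labels` and the stand-in label list are hypothetical and not the notebook's implementation.

```python
import random

NUM_CLASSES = 10

def corrupt_labels(labels, x_percent, rng):
    """Return a noisy copy; each label is redrawn uniformly with prob. x_percent/100."""
    p = x_percent / 100
    return [rng.randrange(NUM_CLASSES) if rng.random() < p else y for y in labels]

rng = random.Random(0)
clean = [rng.randrange(NUM_CLASSES) for _ in range(60000)]  # stand-in labels
noisy = corrupt_labels(clean, 30, rng)
changed = sum(a != b for a, b in zip(clean, noisy)) / len(clean)
print(f"fraction of labels that actually differ: {changed:.3f}")  # about 0.27
```

Under this assumption the effective noise rate is 0.9 * X%, for example about 27% for X = 30. Returning a copy, rather than editing the labels in place, also keeps the clean labels available for other experiments.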

In [None]:
class DatasetChangeLabels(Dataset):
  def __init__(self, x, dataset, transform=None):
    self.x = x
    self.dataset = dataset
    self.transform = transform

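    length = len(dataset)
    percent = x
    # np.random.randint excludes the upper bound, so 10 is needed to cover digits 0-9
    vals = np.random.randint(0, 10, length)
    rand = np.random.uniform(0, 100, length)
    # Labels are randomized once, here; note that this overwrites dataset.targets in place
    targets = self.dataset.targets
    targets[rand < percent] = torch.tensor(vals[rand < percent])
    self.dataset.targets = targets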

  def __len__(self):
    return len(self.dataset)
  
  def __getitem__(self, idx):
    if torch.is_tensor(idx):
      idx = idx.tolist()

    sample = self.dataset[idx][0]
    target = self.dataset.targets[idx]

    if self.transform:
      sample = self.transform(sample)

    return (sample, target)

In [None]:
# Training for x=10
x = 10  # x is the percentage of labels to randomize
dataset1 = datasets.MNIST('../data', train=True, download=True,
                    transform=transform)

dataset_labels = DatasetChangeLabels(x, dataset1)

train_loader = torch.utils.data.DataLoader(dataset_labels,**train_kwargs)
test_loader = torch.utils.data.DataLoader(dataset2, **test_kwargs)

model = Net().to(device)
optimizer = optim.Adam(model.parameters(), lr=lr)

In [None]:
labels_losses_10 = []
labels_accs_10 = []

for epoch in range(1, epochs + 1):
    train(model, device, train_loader, optimizer, epoch, log_interval)
    loss, acc = test(model, device, test_loader)
    labels_losses_10.append(loss)
    labels_accs_10.append(acc)


Test set: Average loss: 0.2637, Accuracy: 9491/10000 (95%)


Test set: Average loss: 0.2613, Accuracy: 9577/10000 (96%)


Test set: Average loss: 0.2161, Accuracy: 9580/10000 (96%)


Test set: Average loss: 0.2464, Accuracy: 9576/10000 (96%)


Test set: Average loss: 0.2410, Accuracy: 9587/10000 (96%)


Test set: Average loss: 0.2100, Accuracy: 9579/10000 (96%)


Test set: Average loss: 0.2474, Accuracy: 9579/10000 (96%)


Test set: Average loss: 0.2260, Accuracy: 9568/10000 (96%)


Test set: Average loss: 0.2197, Accuracy: 9582/10000 (96%)


Test set: Average loss: 0.2142, Accuracy: 9592/10000 (96%)



In [None]:
# Training for x=30
x = 30  # x is the percentage of labels to randomize
dataset1 = datasets.MNIST('../data', train=True, download=True,
                    transform=transform)

dataset_labels = DatasetChangeLabels(x, dataset1)

train_loader = torch.utils.data.DataLoader(dataset_labels,**train_kwargs)
test_loader = torch.utils.data.DataLoader(dataset2, **test_kwargs)

model = Net().to(device)
optimizer = optim.Adam(model.parameters(), lr=lr)

In [None]:
labels_losses_30 = []
labels_accs_30 = []

for epoch in range(1, epochs + 1):
    train(model, device, train_loader, optimizer, epoch, log_interval)
    loss, acc = test(model, device, test_loader)
    labels_losses_30.append(loss)
    labels_accs_30.append(acc)


Test set: Average loss: 0.5573, Accuracy: 9259/10000 (93%)


Test set: Average loss: 0.5699, Accuracy: 9377/10000 (94%)


Test set: Average loss: 0.5144, Accuracy: 9445/10000 (94%)


Test set: Average loss: 0.5438, Accuracy: 9480/10000 (95%)


Test set: Average loss: 0.4067, Accuracy: 9458/10000 (95%)


Test set: Average loss: 0.5224, Accuracy: 9512/10000 (95%)


Test set: Average loss: 0.4896, Accuracy: 9499/10000 (95%)


Test set: Average loss: 0.5023, Accuracy: 9469/10000 (95%)


Test set: Average loss: 0.4949, Accuracy: 9492/10000 (95%)


Test set: Average loss: 0.5329, Accuracy: 9452/10000 (95%)



In [None]:
# Training for x=70
x = 70  # x is the percentage of labels to randomize
dataset1 = datasets.MNIST('../data', train=True, download=True,
                    transform=transform)

dataset_labels = DatasetChangeLabels(x, dataset1)

train_loader = torch.utils.data.DataLoader(dataset_labels,**train_kwargs)
test_loader = torch.utils.data.DataLoader(dataset2, **test_kwargs)

model = Net().to(device)
optimizer = optim.Adam(model.parameters(), lr=lr)

In [None]:
labels_losses_70 = []
labels_accs_70 = []

for epoch in range(1, epochs + 1):
    train(model, device, train_loader, optimizer, epoch, log_interval)
    loss, acc = test(model, device, test_loader)
    labels_losses_70.append(loss)
    labels_accs_70.append(acc)


Test set: Average loss: 1.3088, Accuracy: 8979/10000 (90%)


Test set: Average loss: 1.3513, Accuracy: 8741/10000 (87%)


Test set: Average loss: 1.2449, Accuracy: 8991/10000 (90%)


Test set: Average loss: 1.2453, Accuracy: 9065/10000 (91%)


Test set: Average loss: 1.2151, Accuracy: 9028/10000 (90%)


Test set: Average loss: 1.1189, Accuracy: 9004/10000 (90%)


Test set: Average loss: 1.2950, Accuracy: 8950/10000 (90%)


Test set: Average loss: 1.2796, Accuracy: 9047/10000 (90%)


Test set: Average loss: 1.2977, Accuracy: 9013/10000 (90%)


Test set: Average loss: 1.2051, Accuracy: 8843/10000 (88%)



In [None]:
# Training for x=90
x = 90  # x is the percentage of labels to randomize
dataset1 = datasets.MNIST('../data', train=True, download=True,
                    transform=transform)

dataset_labels = DatasetChangeLabels(x, dataset1)

train_loader = torch.utils.data.DataLoader(dataset_labels,**train_kwargs)
test_loader = torch.utils.data.DataLoader(dataset2, **test_kwargs)

model = Net().to(device)
optimizer = optim.Adam(model.parameters(), lr=lr)

In [None]:
labels_losses_90 = []
labels_accs_90 = []

for epoch in range(1, epochs + 1):
    train(model, device, train_loader, optimizer, epoch, log_interval)
    loss, acc = test(model, device, test_loader)
    labels_losses_90.append(loss)
    labels_accs_90.append(acc)


Test set: Average loss: 2.1194, Accuracy: 5139/10000 (51%)


Test set: Average loss: 2.0752, Accuracy: 5155/10000 (52%)


Test set: Average loss: 2.0969, Accuracy: 5323/10000 (53%)


Test set: Average loss: 2.0086, Accuracy: 5744/10000 (57%)


Test set: Average loss: 2.0435, Accuracy: 5471/10000 (55%)


Test set: Average loss: 2.0386, Accuracy: 5314/10000 (53%)


Test set: Average loss: 2.0839, Accuracy: 5530/10000 (55%)


Test set: Average loss: 2.0478, Accuracy: 5270/10000 (53%)


Test set: Average loss: 2.0384, Accuracy: 5177/10000 (52%)


Test set: Average loss: 2.0668, Accuracy: 5035/10000 (50%)



In [None]:
# Save results of losses to csv file
label_losses_10 = pd.DataFrame(labels_losses_10, columns=['Loss'])
label_losses_10['Percent'] = 10
label_losses_30 = pd.DataFrame(labels_losses_30, columns=['Loss'])
label_losses_30['Percent'] = 30
label_losses_70 = pd.DataFrame(labels_losses_70, columns=['Loss'])
label_losses_70['Percent'] = 70
label_losses_90 = pd.DataFrame(labels_losses_90, columns=['Loss'])
label_losses_90['Percent'] = 90
label_losses = pd.concat([label_losses_10, label_losses_30, 
                            label_losses_70,
                            label_losses_90])
label_losses.to_csv('./label_losses.csv')

In [None]:
# Save results of accuracies to csv file
label_accs_10 = pd.DataFrame(labels_accs_10, columns=['Accuracy'])
label_accs_10['Percent'] = 10
label_accs_30 = pd.DataFrame(labels_accs_30, columns=['Accuracy'])
label_accs_30['Percent'] = 30
label_accs_70 = pd.DataFrame(labels_accs_70, columns=['Accuracy'])
label_accs_70['Percent'] = 70
label_accs_90 = pd.DataFrame(labels_accs_90, columns=['Accuracy'])
label_accs_90['Percent'] = 90
label_accs = pd.concat([label_accs_10, label_accs_30, 
                            label_accs_70,
                            label_accs_90])
label_accs.to_csv('./label_accs.csv')

# Training models in setup 2: with randomized pixels.

In [None]:
# Function to modify X% of pixels
# Approximate distribution (based on a histogram of the dataset):
# after normalization, a pixel is -0.43 with probability 0.85 and 2.78 with probability 0.15
def change_pixels(img, x):
  n_pixels = 28*28
  to_change = math.floor(x*n_pixels/100)
  # Pick distinct pixel positions to overwrite (the image is modified in place)
  coords_change = random.sample(range(n_pixels), to_change)
  rand = np.random.uniform(0, 1, to_change)
  for i in range(to_change):
    number = coords_change[i]
    # np.int was removed from recent NumPy versions; plain integer arithmetic is enough
    row = number // 28
    col = number % 28
    # Alternative considered: a uniform draw, random.uniform(-0.42, 2.8)
    img[0, row, col] = -0.43 if rand[i] < 0.85 else 2.78
  return img

In [None]:
class DatasetChangePexels(Dataset):
  def __init__(self, x, dataset, transform=None):
    self.x = x
    self.dataset = dataset
    self.transform = transform

  def __len__(self):
    return len(self.dataset)
  
  def __getitem__(self, idx):
    if torch.is_tensor(idx):
      idx = idx.tolist()

    sample = change_pixels(self.dataset[idx][0], self.x)
    target = self.dataset.targets[idx]

    if self.transform:
      sample = self.transform(sample)

    return (sample, target)

Below I train the model for several values of X, the percentage of pixels that are randomized. The training loss is printed during training; plots of test accuracy and loss are in the next part of the notebook.
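
The per-sample pixel noise can also be sketched without any tensor library, which makes its effect easy to inspect. The example below works on a flat list of 784 values and uses the same approximate two-value distribution as above (background -0.43 with probability 0.85, ink 2.78 otherwise). The function name `noisy_copy` and the all-background stand-in image are hypothetical.

```python
import random

BACKGROUND, INK = -0.43, 2.78  # approximate normalized MNIST pixel values
P_BACKGROUND = 0.85            # approximate share of background pixels

def noisy_copy(pixels, x_percent, rng):
    """Return a copy of a flat pixel list with x_percent of its entries redrawn."""
    out = list(pixels)
    k = int(len(out) * x_percent / 100)
    for i in rng.sample(range(len(out)), k):
        out[i] = BACKGROUND if rng.random() < P_BACKGROUND else INK
    return out

rng = random.Random(0)
image = [BACKGROUND] * 784           # stand-in for a flattened 28x28 image
first = noisy_copy(image, 20, rng)   # one draw
second = noisy_copy(image, 20, rng)  # a fresh draw, as on a later epoch
visible = sum(a != b for a, b in zip(image, first))
print(visible, "of", len(image), "pixels visibly changed")
```

Because a redrawn background pixel usually stays background, the number of visibly changed pixels is well below X% of the image. Drawing a fresh copy on every access is what makes the noise differ from epoch to epoch.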

In [None]:
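# dataset for changing 5 percent of pixels
# Reload the training set first: DatasetChangeLabels overwrote dataset1.targets in place
dataset1 = datasets.MNIST('../data', train=True, download=True,
                    transform=transform)
dataset1_change_pexels = DatasetChangePexels(5, dataset1)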
train_loader = torch.utils.data.DataLoader(dataset1_change_pexels,**train_kwargs)
test_loader = torch.utils.data.DataLoader(dataset2, **test_kwargs)

model = Net().to(device)
optimizer = optim.Adam(model.parameters(), lr=lr)

In [None]:
test_losses_5 = []
test_accuracies_5 = []

for epoch in range(1, epochs + 1):
    train(model, device, train_loader, optimizer, epoch, log_interval)
    loss, acc = test(model, device, test_loader)
    test_losses_5.append(loss)
    test_accuracies_5.append(acc)


Test set: Average loss: 0.1417, Accuracy: 9561/10000 (96%)


Test set: Average loss: 0.1302, Accuracy: 9639/10000 (96%)


Test set: Average loss: 0.1078, Accuracy: 9684/10000 (97%)


Test set: Average loss: 0.1404, Accuracy: 9600/10000 (96%)


Test set: Average loss: 0.1275, Accuracy: 9677/10000 (97%)


Test set: Average loss: 0.1168, Accuracy: 9707/10000 (97%)


Test set: Average loss: 0.1302, Accuracy: 9669/10000 (97%)


Test set: Average loss: 0.1211, Accuracy: 9705/10000 (97%)


Test set: Average loss: 0.1512, Accuracy: 9633/10000 (96%)


Test set: Average loss: 0.1344, Accuracy: 9705/10000 (97%)



In [None]:
# dataset for changing 10 percent of pixels
dataset1_change_pexels = DatasetChangePexels(10, dataset1)
train_loader = torch.utils.data.DataLoader(dataset1_change_pexels,**train_kwargs)
test_loader = torch.utils.data.DataLoader(dataset2, **test_kwargs)

model = Net().to(device)
optimizer = optim.Adam(model.parameters(), lr=lr)

In [None]:
test_losses_10 = []
test_accuracies_10 = []

for epoch in range(1, epochs + 1):
    train(model, device, train_loader, optimizer, epoch, log_interval)
    loss, acc = test(model, device, test_loader)
    test_losses_10.append(loss)
    test_accuracies_10.append(acc)


Test set: Average loss: 0.1536, Accuracy: 9545/10000 (95%)


Test set: Average loss: 0.1328, Accuracy: 9605/10000 (96%)


Test set: Average loss: 0.1182, Accuracy: 9694/10000 (97%)


Test set: Average loss: 0.1316, Accuracy: 9643/10000 (96%)


Test set: Average loss: 0.1190, Accuracy: 9702/10000 (97%)


Test set: Average loss: 0.1037, Accuracy: 9715/10000 (97%)


Test set: Average loss: 0.1095, Accuracy: 9711/10000 (97%)


Test set: Average loss: 0.1429, Accuracy: 9660/10000 (97%)


Test set: Average loss: 0.1265, Accuracy: 9697/10000 (97%)


Test set: Average loss: 0.1385, Accuracy: 9687/10000 (97%)



In [None]:
# dataset for changing 15 percent of pixels
dataset1_change_pexels = DatasetChangePexels(15, dataset1)
train_loader = torch.utils.data.DataLoader(dataset1_change_pexels,**train_kwargs)
test_loader = torch.utils.data.DataLoader(dataset2, **test_kwargs)

model = Net().to(device)
optimizer = optim.Adam(model.parameters(), lr=lr)

In [None]:
test_losses_15 = []
test_accuracies_15 = []

for epoch in range(1, epochs + 1):
    train(model, device, train_loader, optimizer, epoch, log_interval)
    loss, acc = test(model, device, test_loader)
    test_losses_15.append(loss)
    test_accuracies_15.append(acc)


Test set: Average loss: 0.1515, Accuracy: 9560/10000 (96%)


Test set: Average loss: 0.1134, Accuracy: 9660/10000 (97%)


Test set: Average loss: 0.1129, Accuracy: 9671/10000 (97%)


Test set: Average loss: 0.0969, Accuracy: 9732/10000 (97%)


Test set: Average loss: 0.1123, Accuracy: 9693/10000 (97%)


Test set: Average loss: 0.1258, Accuracy: 9703/10000 (97%)


Test set: Average loss: 0.1061, Accuracy: 9726/10000 (97%)


Test set: Average loss: 0.1140, Accuracy: 9745/10000 (97%)


Test set: Average loss: 0.1273, Accuracy: 9721/10000 (97%)


Test set: Average loss: 0.1213, Accuracy: 9724/10000 (97%)



In [None]:
# dataset for changing 20 percent of pixels
dataset1_change_pexels = DatasetChangePexels(20, dataset1)
train_loader = torch.utils.data.DataLoader(dataset1_change_pexels,**train_kwargs)
test_loader = torch.utils.data.DataLoader(dataset2, **test_kwargs)

model = Net().to(device)
optimizer = optim.Adam(model.parameters(), lr=lr)

In [None]:
test_losses_20 = []
test_accuracies_20 = []

for epoch in range(1, epochs + 1):
    train(model, device, train_loader, optimizer, epoch, log_interval)
    loss, acc = test(model, device, test_loader)
    test_losses_20.append(loss)
    test_accuracies_20.append(acc)


Test set: Average loss: 0.1743, Accuracy: 9464/10000 (95%)


Test set: Average loss: 0.1089, Accuracy: 9655/10000 (97%)


Test set: Average loss: 0.1311, Accuracy: 9647/10000 (96%)


Test set: Average loss: 0.1209, Accuracy: 9679/10000 (97%)


Test set: Average loss: 0.1138, Accuracy: 9694/10000 (97%)


Test set: Average loss: 0.1211, Accuracy: 9694/10000 (97%)


Test set: Average loss: 0.1187, Accuracy: 9692/10000 (97%)


Test set: Average loss: 0.1230, Accuracy: 9708/10000 (97%)


Test set: Average loss: 0.1371, Accuracy: 9735/10000 (97%)


Test set: Average loss: 0.1342, Accuracy: 9708/10000 (97%)



In [None]:
# dataset for changing 30 percent of pixels
dataset1_change_pexels = DatasetChangePexels(30, dataset1)
train_loader = torch.utils.data.DataLoader(dataset1_change_pexels,**train_kwargs)
test_loader = torch.utils.data.DataLoader(dataset2, **test_kwargs)

model = Net().to(device)
optimizer = optim.Adam(model.parameters(), lr=lr)

In [None]:
test_losses_30 = []
test_accuracies_30 = []

for epoch in range(1, epochs + 1):
    train(model, device, train_loader, optimizer, epoch, log_interval)
    loss, acc = test(model, device, test_loader)
    test_losses_30.append(loss)
    test_accuracies_30.append(acc)


Test set: Average loss: 0.1441, Accuracy: 9593/10000 (96%)


Test set: Average loss: 0.1314, Accuracy: 9629/10000 (96%)


Test set: Average loss: 0.1218, Accuracy: 9682/10000 (97%)


Test set: Average loss: 0.1155, Accuracy: 9692/10000 (97%)


Test set: Average loss: 0.1195, Accuracy: 9704/10000 (97%)


Test set: Average loss: 0.1202, Accuracy: 9722/10000 (97%)


Test set: Average loss: 0.1105, Accuracy: 9717/10000 (97%)


Test set: Average loss: 0.1339, Accuracy: 9699/10000 (97%)


Test set: Average loss: 0.1215, Accuracy: 9701/10000 (97%)


Test set: Average loss: 0.1253, Accuracy: 9694/10000 (97%)



In [None]:
# Save results of losses to csv file
pexel_losses_5 = pd.DataFrame(test_losses_5, columns=['Loss'])
pexel_losses_5['Percent'] = 5
pexel_losses_10 = pd.DataFrame(test_losses_10, columns=['Loss'])
pexel_losses_10['Percent'] = 10
pexel_losses_15 = pd.DataFrame(test_losses_15, columns=['Loss'])
pexel_losses_15['Percent'] = 15
pexel_losses_20 = pd.DataFrame(test_losses_20, columns=['Loss'])
pexel_losses_20['Percent'] = 20
pexel_losses_30 = pd.DataFrame(test_losses_30, columns=['Loss'])
pexel_losses_30['Percent'] = 30
pexel_losses = pd.concat([pexel_losses_5, pexel_losses_10, 
                            pexel_losses_15, pexel_losses_20,
                            pexel_losses_30])
pexel_losses.to_csv('./pexel_losses.csv')

In [None]:
# Save results of accuracies to csv file
pexel_accs_5 = pd.DataFrame(test_accuracies_5, columns=['Accuracy'])
pexel_accs_5['Percent'] = 5
pexel_accs_10 = pd.DataFrame(test_accuracies_10, columns=['Accuracy'])
pexel_accs_10['Percent'] = 10
pexel_accs_15 = pd.DataFrame(test_accuracies_15, columns=['Accuracy'])
pexel_accs_15['Percent'] = 15
pexel_accs_20 = pd.DataFrame(test_accuracies_20, columns=['Accuracy'])
pexel_accs_20['Percent'] = 20
pexel_accs_30 = pd.DataFrame(test_accuracies_30, columns=['Accuracy'])
pexel_accs_30['Percent'] = 30
pexel_accs = pd.concat([pexel_accs_5, pexel_accs_10, 
                            pexel_accs_15, pexel_accs_20,
                            pexel_accs_30])
pexel_accs.to_csv('./pexel_accs.csv')

# Plots and report.

In [None]:
base_losses_frame = pd.DataFrame(base_losses, columns=['Loss'])
base_losses_frame['Percent'] = 0
base_accs_frame = pd.DataFrame(base_accs, columns=['Accuracy'])
base_accs_frame['Percent'] = 0

In [None]:
# Test losses for changing labels
# index_col=0 restores the per-run epoch index, so all runs share the same x-axis
label_losses = pd.read_csv('./label_losses.csv', index_col=0)
label_losses = pd.concat([label_losses, base_losses_frame])
px.line(label_losses, y='Loss', color='Percent', title="Test losses for changing labels")

In [None]:
# Test accuracies for changing labels
# index_col=0 restores the per-run epoch index, so all runs share the same x-axis
label_accs = pd.read_csv('./label_accs.csv', index_col=0)
label_accs = pd.concat([label_accs, base_accs_frame])
px.line(label_accs, y='Accuracy', color='Percent', title="Test accuracies for changing labels")

In [None]:
# Test accuracies for changing labels (without 90%)
# index_col=0 restores the per-run epoch index, so all runs share the same x-axis
label_accs = pd.read_csv('./label_accs.csv', index_col=0)
label_accs = pd.concat([label_accs, base_accs_frame])
label_accs = pd.DataFrame(label_accs[label_accs['Percent'] != 90])
px.line(label_accs, y='Accuracy', color='Percent', title="Test accuracies for changing labels (without 90%)")

In [None]:
# Test losses for changing pixels
# index_col=0 restores the per-run epoch index, so all runs share the same x-axis
pexel_losses = pd.read_csv('./pexel_losses.csv', index_col=0)
pexel_losses = pd.concat([pexel_losses, base_losses_frame])
px.line(pexel_losses, y='Loss', color='Percent', title="Test losses for changing pixels")

In [None]:
# Test accuracies for changing pixels
# index_col=0 restores the per-run epoch index, so all runs share the same x-axis
pexel_accs = pd.read_csv('./pexel_accs.csv', index_col=0)
pexel_accs = pd.concat([pexel_accs, base_accs_frame])
px.line(pexel_accs, y='Accuracy', color='Percent', title="Test accuracies for changing pixels")

In [None]:
# Comparison of loss between both approaches
label_10 = pd.DataFrame(label_losses[label_losses['Percent'] == 10])
label_10['Experiment'] = '10% (labels)'
label_30 = pd.DataFrame(label_losses[label_losses['Percent'] == 30])
label_30['Experiment'] = '30% (labels)'
pexel_15 = pd.DataFrame(pexel_losses[pexel_losses['Percent'] == 15])
pexel_15['Experiment'] = '15% (pixels)'
pexel_20 = pd.DataFrame(pexel_losses[pexel_losses['Percent'] == 20])
pexel_20['Experiment'] = '20% (pixels)'
base_0 = base_losses_frame.copy()  # copy, so the baseline frame itself is not modified
base_0['Experiment'] = 'No change'
plot1_frame = pd.concat([label_10, label_30, pexel_15, pexel_20, base_0])

px.line(plot1_frame, y='Loss', color='Experiment', title='Comparison of loss for some values for two approaches')

In [None]:
# Comparison of accuracy between both approaches
label_10 = pd.DataFrame(label_accs[label_accs['Percent'] == 10])
label_10['Experiment'] = '10% (labels)'
label_30 = pd.DataFrame(label_accs[label_accs['Percent'] == 30])
label_30['Experiment'] = '30% (labels)'
pexel_15 = pd.DataFrame(pexel_accs[pexel_accs['Percent'] == 15])
pexel_15['Experiment'] = '15% (pixels)'
pexel_20 = pd.DataFrame(pexel_accs[pexel_accs['Percent'] == 20])
pexel_20['Experiment'] = '20% (pixels)'
base_0 = base_accs_frame.copy()  # copy, so the baseline frame itself is not modified
base_0['Experiment'] = 'No change'
plot1_frame = pd.concat([label_10, label_30, pexel_15, pexel_20, base_0])

px.line(plot1_frame, y='Accuracy', color='Experiment', title='Comparison of accuracy for some values for two approaches')

In this experiment I examined two ways of adding noise to the training data. In the first setup, I randomized 10%, 30%, 70% and 90% of the training labels. Randomizing 90% of the labels clearly hurts: test accuracy drops to roughly 50-57%. With 10% and 30% noise, accuracy stays close to the baseline (about 96% and 95%, versus about 96% without noise), and with 70% it is still around 90%. Among the noisy runs, 10% gives the best results. However, within the ten training epochs, no label-noise setting gave better results than training on clean labels.

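Interestingly, even when 70% of the labels are randomized, the network still reaches a test accuracy of about 90%. The test loss, in contrast, rises steadily with the noise level (from about 0.2 at 10% to about 2.0 at 90%). A likely reason is that cross-entropy also penalizes the reduced confidence that the network learns from noisy labels, even when its top prediction is still correct. Label noise also slows training: more epochs are needed to reach the same accuracy, because the network is regularly pushed in random directions, which takes time but does not help it learn.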
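
A back-of-the-envelope calculation helps to read the loss values in setup 1. Assume, as a simplification, that labels are redrawn uniformly over all ten digits for a fraction p of the examples, and that the trained model reproduces the resulting noisy label distribution exactly. It would then assign probability (1 - p) + p/10 to the true class, which gives the clean-label cross-entropy computed below. This is an idealized reference value under those assumptions, not a measurement, and the function name is hypothetical.

```python
import math

NUM_CLASSES = 10

def reference_clean_loss(x_percent):
    """Clean-label log-loss of a model that exactly matches the noisy label distribution."""
    p = x_percent / 100
    prob_true = (1 - p) + p / NUM_CLASSES
    return -math.log(prob_true)

for x in (10, 30, 70, 90):
    print(x, round(reference_clean_loss(x), 2))
```

The reference values are roughly 0.09, 0.31, 0.99 and 1.66, while the final-epoch test losses in the logs above are 0.21, 0.53, 1.21 and 2.07. The measured losses follow the same trend and sit somewhat above the idealized values, which fits the picture that most of the loss increase comes from reduced confidence rather than from wrong predictions.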

In the second setup, I replaced 5%, 10%, 15%, 20% and 30% of the pixels with random values, drawn from an approximation of the distribution of pixel values in MNIST images. Changing 5% of the pixels seems to give the worst results among the pixel-noise runs, and they are similar to training without any noise. This is reasonable: most pixels are background, and a replaced pixel is also most likely set to the background value according to the distribution, so randomizing 5% of the pixels usually changes very little. The results for 10%, 15%, 20% and 30% are comparable. In general, adding noise in the second setup seems to make training faster and more stable. From epoch 6 or 7 onward, training without noise shows signs of overfitting: the loss grows and the accuracy drops. With randomized pixels, this problem is not visible in the plots.

Overall, the accuracies and losses in the second setup are slightly better than without any noise (the gain in accuracy is under one percentage point). The probable reason is that input noise makes it less likely that the model overfits the training examples, so it generalizes better.

The last two plots compare the two setups. Label noise (setup 1) gives results that are worse than or similar to the baseline, whereas pixel noise (setup 2) helps to achieve better results.
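
As a rough numeric companion to the comparison plots, the sketch below averages the number of correctly classified test images (out of 10000) over the ten epochs, using the counts printed in the training logs above. The dictionary and variable names are hypothetical. Each setting was run only once, so small differences should be read with caution.

```python
# Correct test predictions per epoch, copied from the logs above (10000 test images).
correct_counts = {
    "baseline":   [9544, 9647, 9626, 9646, 9650, 9666, 9692, 9600, 9648, 9643],
    "10% labels": [9491, 9577, 9580, 9576, 9587, 9579, 9579, 9568, 9582, 9592],
    "30% labels": [9259, 9377, 9445, 9480, 9458, 9512, 9499, 9469, 9492, 9452],
    "15% pixels": [9560, 9660, 9671, 9732, 9693, 9703, 9726, 9745, 9721, 9724],
    "20% pixels": [9464, 9655, 9647, 9679, 9694, 9694, 9692, 9708, 9735, 9708],
}

# Mean number of correct predictions over the ten epochs, per setting.
mean_correct = {name: sum(v) / len(v) for name, v in correct_counts.items()}

for name, mean in sorted(mean_correct.items(), key=lambda kv: -kv[1]):
    # Dividing a count out of 10000 by 100 gives the accuracy in percent.
    print(f"{name:>10}: {mean / 100:.2f}% mean test accuracy")
```

Averaged over epochs, both pixel-noise runs come out above the baseline and both label-noise runs below it, which matches the ordering visible in the plots.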