https://github.com/L1aoXingyu/pytorch-beginner/blob/master/08-AutoEncoder/Variational_autoencoder.py

Read this once:

https://jhui.github.io/2018/02/09/PyTorch-Data-loading-preprocess_torchvision/

In [4]:
import os

import torch
import torchvision
from torch import nn
from torch.autograd import Variable
from torch.utils.data import DataLoader
from torchvision import transforms
from torchvision.datasets import MNIST
from torchvision.utils import save_image

# Create the output directory for reconstructed images if it does not exist yet.
os.makedirs('./mlp_img', exist_ok=True)


def to_img(x):
    x = 0.5 * (x + 1)
    x = x.clamp(0, 1)
    x = x.view(x.size(0), 1, 28, 28)
    return x


num_epochs = 100
batch_size = 128
learning_rate = 1e-3

# Transforms are common image transformations. They can be chained together using Compose.
# transforms (list of Transform objects) – list of transforms to compose.
# Three-channel variant, kept for reference; its mean/std do not match single-channel MNIST images.
# img_transform = transforms.Compose([
#     transforms.ToTensor(),  # converts a PIL Image or numpy.ndarray (H x W x C) in the range [0, 255] to a torch.FloatTensor of shape (C x H x W) in the range [0.0, 1.0]
#     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))  # normalizes a tensor image with per-channel mean and standard deviation
# ])


# img_transform1 = transforms.Compose([
#     transforms.Resize(784),
#     transforms.ToTensor(),
#     transforms.Lambda(lambda x: x.repeat(3,1,1)),
#     transforms.Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))
# ])

# transforms (list of Transform objects) – list of transforms to compose.
img_transform = transforms.Compose([
    # Converts a PIL Image or numpy.ndarray (H x W x C) in the range [0, 255]
    # to a torch.FloatTensor of shape (C x H x W) in the range [0.0, 1.0].
    transforms.ToTensor(),
    # Normalizes each channel as (x - mean) / std. With mean 0.5 and std 0.5,
    # the single MNIST channel is mapped from [0, 1] to [-1, 1].
    transforms.Normalize((0.5,), (0.5,)),
])

# The torchvision datasets share a very similar API. They all accept two common arguments,
# transform and target_transform, which transform the input and the target respectively.
dataset = MNIST('./data', transform=img_transform, download=True)
# Data loader. Combines a dataset and a sampler, and provides single-
# or multi-process iterators over the dataset.
dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
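The normalization above and the `to_img` helper are inverses of each other, which is easy to miss. The sketch below illustrates that round trip on random data standing in for `ToTensor()` output (an assumption made so that nothing has to be downloaded); it is an illustration, not part of the original notebook.

```python
import torch


def to_img(x):
    # Same steps as the notebook's helper: undo the [-1, 1] scaling, clamp, reshape.
    x = 0.5 * (x + 1)
    x = x.clamp(0, 1)
    return x.view(x.size(0), 1, 28, 28)


# Stand-in for a batch of ToTensor() outputs: values in [0, 1].
# Random data is an assumption used here instead of real MNIST images.
pixels = torch.rand(4, 1, 28, 28)

# What Normalize((0.5,), (0.5,)) computes per channel: (x - mean) / std.
normalized = (pixels - 0.5) / 0.5

# Flatten as the training loop does, then map back to image space.
restored = to_img(normalized.view(4, -1))

print(normalized.min().item() >= -1.0, normalized.max().item() <= 1.0)
print(torch.allclose(restored, pixels, atol=1e-6))
```

Because the decoder ends in `Tanh`, its outputs live in the same [-1, 1] range, so `to_img` can be applied to reconstructions as well.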


In [5]:
dataset.classes

['0 - zero',
 '1 - one',
 '2 - two',
 '3 - three',
 '4 - four',
 '5 - five',
 '6 - six',
 '7 - seven',
 '8 - eight',
 '9 - nine']

In [6]:
# test_labels is a deprecated alias; targets holds the labels of the loaded (train) split.
dataset.targets



tensor([5, 0, 4,  ..., 5, 6, 8])

In [7]:
dataset

Dataset MNIST
    Number of datapoints: 60000
    Split: train
    Root Location: ./data
    Transforms (if any): Compose(
                             ToTensor()
                             Normalize(mean=(0.5,), std=(0.5,))
                         )
    Target Transforms (if any): None

In [8]:
dataset.urls

['http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz',
 'http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz',
 'http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz',
 'http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz']

In [9]:
# nn.Module is the base class for all neural network modules.
# Your own models should subclass it, which is why autoencoder derives from nn.Module.

class autoencoder(nn.Module):
    def __init__(self):
        super(autoencoder, self).__init__()
        self.encoder = nn.Sequential(  # layers applied in sequence
            nn.Linear(28 * 28, 128),   # flattened 28*28 image (784 values) -> 128
            nn.ReLU(True),             # element-wise max(0, x) non-linearity, applied in place
            nn.Linear(128, 64),        # 128 -> 64
            nn.ReLU(True),
            nn.Linear(64, 12),         # 64 -> 12
            nn.ReLU(True),
            nn.Linear(12, 3))          # 12 -> 3: the bottleneck code
        self.decoder = nn.Sequential(  # mirrors the encoder to reconstruct the input
            nn.Linear(3, 12),
            nn.ReLU(True),
            nn.Linear(12, 64),
            nn.ReLU(True),
            nn.Linear(64, 128),
            nn.ReLU(True),
            nn.Linear(128, 28 * 28),
            nn.Tanh())                 # outputs in [-1, 1], the range of the normalized inputs
        
    #Defines the computation performed at every call.
    def forward(self, x):
        x = self.encoder(x)
        x = self.decoder(x)
        return x
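A quick way to sanity-check the layer widths is to push a dummy batch through the network and look at the shapes. The sketch below rebuilds the same widths with plain `nn.Sequential` stacks so that it runs on its own; the batch size of 16 and the random input are assumptions chosen only for illustration.

```python
import torch
from torch import nn

# Compact stand-in with the same layer widths as the model above.
encoder = nn.Sequential(
    nn.Linear(28 * 28, 128), nn.ReLU(),
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, 12), nn.ReLU(),
    nn.Linear(12, 3),
)
decoder = nn.Sequential(
    nn.Linear(3, 12), nn.ReLU(),
    nn.Linear(12, 64), nn.ReLU(),
    nn.Linear(64, 128), nn.ReLU(),
    nn.Linear(128, 28 * 28), nn.Tanh(),
)

# Hypothetical batch of 16 flattened images (random values, not MNIST).
batch = torch.randn(16, 28 * 28)

with torch.no_grad():          # no gradients needed for a shape check
    code = encoder(batch)      # compressed representation
    recon = decoder(code)      # reconstruction with the input's shape

print(code.shape, recon.shape)
```

Each 784-value image is squeezed into 3 numbers and expanded back to 784, which is what forces the network to learn a compact representation.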

In [10]:
if torch.cuda.is_available():
    model = autoencoder().cuda()
else:
    model = autoencoder()
# nn.MSELoss() measures the mean squared error (squared L2 norm) between
# each element of the input x and the target y.
criterion = nn.MSELoss()
# Implements the Adam algorithm, proposed in
# "Adam: A Method for Stochastic Optimization".
optimizer = torch.optim.Adam(
    model.parameters(), lr=learning_rate, weight_decay=1e-5)
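To make the loss concrete, the sketch below compares `nn.MSELoss()` (with its default mean reduction) against the same quantity computed by hand. The two small tensors are made-up values for illustration only.

```python
import torch
from torch import nn

# Made-up "reconstruction" and "target" values, chosen so the arithmetic is easy to follow.
output = torch.tensor([[0.0, 0.5], [1.0, -1.0]])
target = torch.tensor([[0.0, 1.0], [0.0, -1.0]])

loss = nn.MSELoss()(output, target)

# Same quantity by hand: average of the squared element-wise differences.
# Squared differences are 0, 0.25, 1, 0, so the mean is 1.25 / 4 = 0.3125.
manual = ((output - target) ** 2).mean()

print(loss.item(), manual.item())
```

In the autoencoder the target is the input image itself, so the loss measures how far the reconstruction is from the original, averaged over every pixel in the batch.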

In [11]:
model

autoencoder(
  (encoder): Sequential(
    (0): Linear(in_features=784, out_features=128, bias=True)
    (1): ReLU(inplace)
    (2): Linear(in_features=128, out_features=64, bias=True)
    (3): ReLU(inplace)
    (4): Linear(in_features=64, out_features=12, bias=True)
    (5): ReLU(inplace)
    (6): Linear(in_features=12, out_features=3, bias=True)
  )
  (decoder): Sequential(
    (0): Linear(in_features=3, out_features=12, bias=True)
    (1): ReLU(inplace)
    (2): Linear(in_features=12, out_features=64, bias=True)
    (3): ReLU(inplace)
    (4): Linear(in_features=64, out_features=128, bias=True)
    (5): ReLU(inplace)
    (6): Linear(in_features=128, out_features=784, bias=True)
    (7): Tanh()
  )
)

* Terms such as epoch, batch size, and iteration matter when the dataset is too large to pass through the network at once, which is the usual case in machine learning. The data is split into smaller pieces that are fed to the network one at a time, and the weights are updated after each piece so that the network gradually fits the data.

* Epoch: One epoch is one complete pass of the ENTIRE dataset forward and backward through the neural network.

* Batch: Because a whole epoch is too large to feed to the computer at once, the dataset is divided into several smaller batches.

* Batch size: The number of training examples in a single batch.

* Iterations: The number of batches needed to complete one epoch.

Example: A dataset of 2000 examples divided into batches of 500 takes 4 iterations to complete 1 epoch.

Source: https://towardsdatascience.com/epoch-vs-iterations-vs-batch-size-4dfb9c7ce9c9
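The same arithmetic can be applied to this notebook's settings. The helper below is a hypothetical function written for illustration (it is not part of PyTorch); it assumes the last, smaller batch is kept unless `drop_last` is set, which matches the `DataLoader` default.

```python
import math


def iterations_per_epoch(num_examples, batch_size, drop_last=False):
    """Number of batches needed to see every example once."""
    if drop_last:
        # An incomplete final batch is discarded.
        return num_examples // batch_size
    # An incomplete final batch still counts as one iteration.
    return math.ceil(num_examples / batch_size)


print(iterations_per_epoch(2000, 500))    # 4, the example above
print(iterations_per_epoch(60000, 128))   # 469 for MNIST train with batch_size=128
print(60000 - 468 * 128)                  # 96 examples in the final, smaller batch
```

With 100 epochs, the training loop therefore performs 469 weight updates per epoch.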

In [12]:
dataloader.batch_size

128

In [13]:
num_epochs

100

In [14]:
len(dataset)

60000

In [17]:
for epoch in range(num_epochs):
    for data in dataloader:
        img, _ = data                    # labels are not needed for reconstruction
        img = img.view(img.size(0), -1)  # flatten each image to a 784-element vector
        if torch.cuda.is_available():
            img = img.cuda()             # Variable wrappers are unnecessary since PyTorch 0.4
        # ===================forward=====================
        output = model(img)              # encode, then decode to reconstruct the input
        loss = criterion(output, img)    # MSE between the reconstruction and the input
        # ===================backward====================
        optimizer.zero_grad()            # clear gradients from the previous step
        loss.backward()                  # compute gradients
        optimizer.step()                 # update the parameters
    # ===================log========================
    # loss.data[0] is outdated as of PyTorch 0.4; use loss.item() instead.
    # Note: this prints the loss of the last batch of the epoch, not the epoch average.
    print('epoch [{}/{}], loss:{:.4f}'.format(epoch + 1, num_epochs, loss.item()))
    if epoch % 10 == 0:
        pic = to_img(output.cpu().data)
        save_image(pic, './mlp_img/image_{}.png'.format(epoch))

epoch [1/100], loss:0.1609
epoch [2/100], loss:0.1679
epoch [3/100], loss:0.1465
epoch [4/100], loss:0.1447
epoch [5/100], loss:0.1572
epoch [6/100], loss:0.1553
epoch [7/100], loss:0.1524
epoch [8/100], loss:0.1389
epoch [9/100], loss:0.1474
epoch [10/100], loss:0.1454
epoch [11/100], loss:0.1328
epoch [12/100], loss:0.1360
epoch [13/100], loss:0.1325
epoch [14/100], loss:0.1482
epoch [15/100], loss:0.1283
epoch [16/100], loss:0.1264
epoch [17/100], loss:0.1354
epoch [18/100], loss:0.1373
epoch [19/100], loss:0.1342
epoch [20/100], loss:0.1349
epoch [21/100], loss:0.1249
epoch [22/100], loss:0.1290
epoch [23/100], loss:0.1299
epoch [24/100], loss:0.1365
epoch [25/100], loss:0.1194
epoch [26/100], loss:0.1240
epoch [27/100], loss:0.1273
epoch [28/100], loss:0.1266
epoch [29/100], loss:0.1163
epoch [30/100], loss:0.1268
epoch [31/100], loss:0.1300
epoch [32/100], loss:0.1341
epoch [33/100], loss:0.1306
epoch [34/100], loss:0.1335
epoch [35/100], loss:0.1258
epoch [36/100], loss:0.1248
e
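The log above reports the loss of the last mini-batch in each epoch, which is one likely reason the numbers jump around from epoch to epoch. A possible alternative is to log the average loss over the whole epoch. The sketch below shows that bookkeeping on a tiny stand-in model and random data (both are assumptions so the example runs quickly without MNIST); it is not the notebook's training run.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Random stand-in for normalized, flattened images in [-1, 1].
data = torch.rand(256, 28 * 28) * 2 - 1
loader = DataLoader(TensorDataset(data), batch_size=64, shuffle=True)

# Deliberately tiny model; only the logging pattern matters here.
model = nn.Sequential(nn.Linear(28 * 28, 3), nn.Linear(3, 28 * 28), nn.Tanh())
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

epoch_losses = []
for epoch in range(3):
    running, seen = 0.0, 0
    for (img,) in loader:
        loss = criterion(model(img), img)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        # Weight each batch loss by its size so a smaller final batch is handled correctly.
        running += loss.item() * img.size(0)
        seen += img.size(0)
    epoch_losses.append(running / seen)
    print('epoch [{}/3], mean loss:{:.4f}'.format(epoch + 1, epoch_losses[-1]))
```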

The error was caused by a shape mismatch between the input tensor and self.mean in F.normalize: the tensor had shape [1, 28, 28], while self.mean was [0.5, 0.5, 0.5]. A three-element mean implies an input of shape [3, *, *] rather than [1, *, *], so the input tensor has the wrong number of channels for that normalization. MNIST images are single-channel, which is why the working transform uses Normalize((0.5,), (0.5,)).
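The mismatch can be reproduced with plain tensors. The sketch below assumes the normalization boils down to an in-place per-channel subtraction and division; torchvision's internals may differ between versions, so treat this as an illustration of the shape problem rather than a copy of the library code.

```python
import torch

# One MNIST-sized, single-channel image (zeros are a placeholder, not real data).
img = torch.zeros(1, 28, 28)

# Per-channel statistics reshaped to broadcast over height and width.
mean3 = torch.tensor([0.5, 0.5, 0.5]).view(-1, 1, 1)   # three-channel statistics
mean1 = torch.tensor([0.5]).view(-1, 1, 1)             # single-channel statistics
std1 = torch.tensor([0.5]).view(-1, 1, 1)

try:
    # Broadcasting would need a [3, 28, 28] result, which cannot be written
    # in place into a [1, 28, 28] tensor.
    img.clone().sub_(mean3)
    mismatch_raised = False
except RuntimeError as err:
    mismatch_raised = True
    print('three-channel mean failed:', err)

# With one mean and one std, the shapes agree.
ok = img.clone().sub_(mean1).div_(std1)
print(ok.shape)
```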

In [18]:
torch.save(model.state_dict(), './sim_autoencoder.pth')
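A saved `state_dict` contains only the parameters, so loading it requires first constructing a model with the same architecture. The sketch below shows the round trip on a hypothetical tiny model and a temporary file (both assumptions, so that it does not touch `./sim_autoencoder.pth`).

```python
import os
import tempfile

import torch
from torch import nn


def make_model():
    # Hypothetical tiny stand-in; the real notebook would call autoencoder() here.
    return nn.Sequential(nn.Linear(3, 12), nn.ReLU(), nn.Linear(12, 3))


model = make_model()

# Save only the parameters, as the notebook does.
path = os.path.join(tempfile.mkdtemp(), 'demo_autoencoder.pth')
torch.save(model.state_dict(), path)

# Rebuild the architecture, then load the saved parameters into it.
restored = make_model()
restored.load_state_dict(torch.load(path, map_location='cpu'))
restored.eval()  # switch to inference mode before using the model

same = all(torch.equal(a, b)
           for a, b in zip(model.state_dict().values(),
                           restored.state_dict().values()))
print(same)
```

`map_location='cpu'` lets a checkpoint saved on a GPU be loaded on a machine without one.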