# This is the `Demo.ipynb` file

I will demonstrate how to use my classes and functions.

In [1]:
import random
import torch
import numpy as np
seed = 1
random.seed(seed)
np.random.seed(seed)
torch.manual_seed(seed)
torch.cuda.manual_seed(seed)
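
The four seeding calls above can be wrapped in one helper so that later cells re-seed with a single call. This is only a sketch: `set_seed` is a hypothetical name, not something defined in this repository. The check at the end shows that re-seeding reproduces the same random draws.

```python
import random

import numpy as np
import torch


def set_seed(seed: int) -> None:
    """Seed Python, NumPy, and PyTorch (CPU and all GPUs) in one call."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    # Safe without a GPU: the call is silently ignored when CUDA is unavailable.
    torch.cuda.manual_seed_all(seed)


set_seed(1)
first = (random.random(), np.random.rand(), torch.rand(1).item())
set_seed(1)
second = (random.random(), np.random.rand(), torch.rand(1).item())
print(first == second)  # True
```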

# The `UCIDatasets` loader in `/utils/dataframe.py`

Reference: https://gist.github.com/martinferianc/db7615c85d5a3a71242b4916ea6a14a2

Note that the outputs `train` and `test` are PyTorch `Dataset` objects (`torch.utils.data.Dataset`).

In [2]:
from utils.dataframe import UCIDatasets, datalist

print(datalist)
data = UCIDatasets("housing")
train = data.get_split(load="train")  # PyTorch Dataset
test = data.get_split(load="test")    # PyTorch Dataset

# Re-seed before building the loader (presumably so that any shuffling is reproducible).
torch.manual_seed(seed)
torch.cuda.manual_seed(seed)
train_loader = data.get_dataloader(load="train", batch_size=16)  # PyTorch DataLoader

['housing', 'concrete', 'energy', 'power', 'redwine', 'whitewine', 'yacht']
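
For readers unfamiliar with the `Dataset`/`DataLoader` pair, here is a minimal sketch of how a PyTorch `Dataset` is batched. It uses a `TensorDataset` of random numbers as a stand-in for a UCI split. The feature count of 13 and the `(features, target)` tuple layout are assumptions made for illustration, not something read from `UCIDatasets`.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in for a UCI split: 100 rows, 13 features, 1 regression target (assumed sizes).
features = torch.randn(100, 13)
targets = torch.randn(100, 1)
dataset = TensorDataset(features, targets)

# The DataLoader groups rows into mini-batches and optionally shuffles them.
loader = DataLoader(dataset, batch_size=16, shuffle=True)
x_batch, y_batch = next(iter(loader))
print(len(dataset), len(loader), x_batch.shape, y_batch.shape)
```

With 100 rows and a batch size of 16, the loader yields seven batches, the last one holding the remaining four rows.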


# The `imputer` in `model/imputer.py`.

I have assembled several common imputation methods; for details, see my code and the documentation of `sklearn.impute`.

I have set default values for some parameters. Change them via the `par_setting(par_dict)` function.

The input dictionary should have the same structure as the one returned by `getParvalue()`.
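
As a rough illustration of what a partial update to the parameter dictionary could look like, the sketch below merges overrides into a set of defaults. `merge_parameters` is a hypothetical helper, not part of `model/imputer.py`, and the default values are copied from the `getParvalue()` output shown in this notebook. How `par_setting` treats unknown keys is not documented here, so raising a `KeyError` is only one possible design.

```python
# Defaults copied from the structure printed by getParvalue() in this notebook.
defaults = {
    "missing_values": float("nan"),
    "max_iter": 10,
    "random_state": 0,
    "n_estimators": 100,
    "n_neighbors": 3,
    "metric": "nan_euclidean",
}


def merge_parameters(current: dict, overrides: dict) -> dict:
    """Return a copy of `current` with `overrides` applied; reject unknown keys."""
    unknown = set(overrides) - set(current)
    if unknown:
        raise KeyError(f"Unknown parameter(s): {sorted(unknown)}")
    return {**current, **overrides}


updated = merge_parameters(defaults, {"max_iter": 20, "n_neighbors": 5})
print(updated["max_iter"], updated["n_neighbors"])  # 20 5
```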

In [3]:
from model.imputer import imputer, method_list
import numpy as np
print(method_list)
X = np.random.rand(100*20).reshape(100,20)
X[51:, 11:] = np.nan

impObject = imputer(X, method = 'mice')
impObject.train()
imp = impObject.imp
imp.transform(X).shape 
print('\n\nSet up parameters via impObject.par_setting( par_dict ) and retrain the model, \
    the par_dict should be like the following structure:\n {} \n'.format(impObject.getParvalue()))
print('for instance: impObject.par_setting( { \'max_iter\': 10 } ) ')
print('\nEach par corresponding to method is as the following:')
impObject.getParlist()
print('where SimpleImputer includes [\'mean\', \'median\', \'most_frequent\']')


['mean', 'median', 'most_frequent', 'mice', 'missForest', 'knn']


Set up parameters via impObject.par_setting( par_dict ) and retrain the model,     the par_dict should be like the following structure:
 {'missing_values': nan, 'max_iter': 10, 'random_state': 0, 'n_estimators': 100, 'n_neighbors': 3, 'metric': 'nan_euclidean'} 

for instance: impObject.par_setting( { 'max_iter': 10 } ) 

Each par corresponding to method is as the following:
where SimpleImputer includes ['mean', 'median', 'most_frequent']
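
The parameter names printed above (`max_iter`, `random_state`, `n_neighbors`, `metric`) match those of scikit-learn's `IterativeImputer` and `KNNImputer`. Assuming the `'mice'` and `'knn'` methods wrap those two classes (the wrapper's internals are not shown in this notebook), calling scikit-learn directly would look roughly like this:

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables IterativeImputer)
from sklearn.impute import IterativeImputer, KNNImputer

rng = np.random.default_rng(1)
X = rng.random((100, 20))
X[51:, 11:] = np.nan  # same missingness pattern as in the demo cell

# Parameter values taken from the defaults printed above.
mice_like = IterativeImputer(missing_values=np.nan, max_iter=10, random_state=0)
knn = KNNImputer(missing_values=np.nan, n_neighbors=3, metric="nan_euclidean")

X_mice = mice_like.fit_transform(X)
X_knn = knn.fit_transform(X)
print(X_mice.shape, np.isnan(X_mice).any(), np.isnan(X_knn).any())
```

Both imputers leave the observed entries untouched and fill only the `NaN` positions.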


# This part is for `MIWAE` in `/model/MIWAE.py` and `trainer` in `/utils/trainer.py`.

Reference: http://proceedings.mlr.press/v97/mattei19a/mattei19a.pdf (ICML, 2019).

`MIWAE` is a PyTorch model.

It can be trained with the `VAEtrainer` class.

Loss functions have the signature `loss(self, outdic, indic)`, where `outdic` is the model's output dictionary and `indic` is its input dictionary.

For `MIWAE`:

`indic = {'x': x, 'm': m}`, where `x` is the data and `m` is the missingness indicator.
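
A minimal sketch of building such an input dictionary from a batch that marks missing values with `NaN`. The convention that `m` is 1 for observed entries and 0 for missing ones, and the zero-filling of missing entries, are assumptions; check `/model/MIWAE.py` for the convention actually expected.

```python
import torch

# Toy batch with two missing entries marked as NaN.
raw = torch.tensor([[1.0, float("nan"), 3.0],
                    [4.0, 5.0, float("nan")]])

# Assumed convention: m = 1 where the value is observed, 0 where it is missing.
m = (~torch.isnan(raw)).float()
# Replace NaNs with zeros so the network never receives NaN inputs.
x = torch.where(torch.isnan(raw), torch.zeros_like(raw), raw)

indic = {"x": x, "m": m}
print(indic["m"])
```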

`outdic = {'lpxz': lpxz, 'lqzx': lqzx, 'lpz': lpz}`, where the names abbreviate the `l`og-probabilities of `p`(`x` | `z`), `q`(`z` | `x`), and `p`(`z`).

In the MIWAE loss `self.MIWAE_ELBO(outdic, indic = None)`, `indic` is not required.
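
To show how the three log terms combine, here is a sketch of the importance-weighted bound from the cited paper: the log importance weights are `lpxz + lpz - lqzx`, and the bound averages them over the `K` samples inside the logarithm. The function name `miwae_bound` and the tensor shape `(batch, n_samples)` are assumptions; this is not necessarily how `MIWAE_ELBO` is implemented in this repository.

```python
import math

import torch


def miwae_bound(outdic: dict) -> torch.Tensor:
    """Importance-weighted bound, averaged over the batch.

    Assumes every entry of `outdic` has shape (batch, n_samples) and that
    `lpxz` has already been summed over the observed features only.
    """
    log_w = outdic["lpxz"] + outdic["lpz"] - outdic["lqzx"]  # log importance weights
    n_samples = log_w.shape[1]
    # log( (1/K) * sum_k w_k ), computed stably with logsumexp.
    return (torch.logsumexp(log_w, dim=1) - math.log(n_samples)).mean()


torch.manual_seed(0)
fake = {key: torch.randn(16, 5) for key in ("lpxz", "lqzx", "lpz")}
loss = -miwae_bound(fake)  # training minimises the negative bound
print(loss.item())
```

With a single sample per data point the expression reduces to the ordinary VAE ELBO estimate.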


In [5]:
from model.MIWAE import MIWAE
from utils.trainer import VAEtrainer
from utils.dataframe import UCIDatasets, datalist
data = UCIDatasets("whitewine")

torch.manual_seed(seed)
torch.cuda.manual_seed(seed)
train_loader = data.get_dataloader(load = "train", batch_size = 16) #pytorch.dataloader
test_loader = data.get_dataloader(load = "test", batch_size = 16) #pytorch.dataloader
model = MIWAE(data_dim = 20, n_samples=5, permutation_invariance=True)
trainer = VAEtrainer(model = model, train_loader = train_loader, test_loader = test_loader)
trainer.model_summary()

Es torch.Size([16, 20, 20])
Esx torch.Size([16, 20, 21])
Esxr torch.Size([320, 21])
h torch.Size([320, 20])
hr torch.Size([16, 20, 20])
hz torch.Size([16, 20, 20])
g torch.Size([16, 20])
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Linear-1                   [-1, 20]             440
            Linear-2                  [-1, 100]           2,100
              Tanh-3                  [-1, 100]               0
            Linear-4                  [-1, 100]          10,100
              Tanh-5                  [-1, 100]               0
            Linear-6                   [-1, 50]           5,050
            Linear-7                   [-1, 50]           5,050
            Linear-8               [-1, 5, 100]           5,100
            Linear-9               [-1, 5, 100]          10,100
           Linear-10                [-1, 5, 20]           2,020
           Linear-11                [-1, 5, 
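
The parameter counts in the summary follow the usual rule for a fully connected layer: `in_features * out_features + out_features` (weights plus biases). The short check below reproduces the first few rows. The input sizes are not printed in the summary, so they are inferred here: 21 from the `Esx` shape and the others from the preceding layer's output size.

```python
def linear_params(in_features: int, out_features: int) -> int:
    """Number of weights plus biases in a fully connected layer."""
    return in_features * out_features + out_features


# (in_features, out_features); the in_features values are inferred, not printed in the summary.
layers = {
    "Linear-1": (21, 20),
    "Linear-2": (20, 100),
    "Linear-4": (100, 100),
    "Linear-6": (100, 50),
}
for name, (n_in, n_out) in layers.items():
    print(name, linear_params(n_in, n_out))
# Linear-1 440
# Linear-2 2100
# Linear-4 10100
# Linear-6 5050
```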

# A simple example

Here is a simple example that uses the `VAEtrainer` class from `/utils/trainer.py` to train a `VAE` on `MNIST`.

First, we set some hyperparameters.

Then we read the training and validation data (as `DataLoader` objects such as `train_loader`) and pass them to the trainer.

The trainer automatically uses `VAE_loss` for the `VAE` model.

I will use this trainer for all `VAE`-like models in this project.
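
For orientation, the standard VAE objective on MNIST is a Bernoulli reconstruction term plus a closed-form KL divergence to a standard normal prior. The sketch below assumes `VAE_loss` follows that standard formulation; the function name `vae_loss` and its argument list are illustrative and may differ from the repository's code.

```python
import torch
import torch.nn.functional as F


def vae_loss(recon_x, x, mu, logvar):
    """Negative ELBO summed over the batch: reconstruction term plus KL divergence."""
    # Bernoulli reconstruction loss on pixel intensities in [0, 1].
    bce = F.binary_cross_entropy(recon_x, x.view(-1, 784), reduction="sum")
    # Closed-form KL( N(mu, sigma^2) || N(0, I) ), with logvar = log(sigma^2).
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return bce + kld


torch.manual_seed(1)
x = torch.rand(8, 1, 28, 28)                   # fake batch of MNIST-sized images
recon_x = torch.sigmoid(torch.randn(8, 784))   # fake decoder output in (0, 1)
mu, logvar = torch.randn(8, 20), torch.randn(8, 20)
print(vae_loss(recon_x, x, mu, logvar).item() / len(x))  # per-sample loss
```

Dividing the summed loss by the number of samples gives a per-sample figure, which is the scale on which epoch averages are usually reported.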

In [7]:
import torch
import torch.utils.data
from torchvision import datasets, transforms
from model.VAE import VAE
from utils.trainer import VAEtrainer

batch_size=128
max_epochs=10
no_cuda = False
seed=1
log_interval=10
cuda = not no_cuda and torch.cuda.is_available()
torch.manual_seed(seed)

device = torch.device("cuda" if cuda else "cpu")

train_loader = torch.utils.data.DataLoader(
    datasets.MNIST('./data', train=True, download=True,
                   transform=transforms.ToTensor()),
    batch_size=batch_size, shuffle=True, num_workers = 1, pin_memory = True )

test_loader = torch.utils.data.DataLoader(
    datasets.MNIST('./data', train=False, transform=transforms.ToTensor()),
    batch_size=batch_size, shuffle=True, num_workers = 1, pin_memory = True)

model = VAE()
trainer = VAEtrainer(model = model, train_loader = train_loader, test_loader= test_loader, batch_size = batch_size)
trainer.train(max_epochs = max_epochs)

====> Epoch: 1 Average loss: 164.8244
====> Test set loss: 128.5144
====> Epoch: 2 Average loss: 122.3092
====> Test set loss: 116.4054
====> Epoch: 3 Average loss: 114.9204
====> Test set loss: 111.9082
====> Epoch: 4 Average loss: 111.6956
====> Test set loss: 110.5553
====> Epoch: 5 Average loss: 109.8703
====> Test set loss: 108.3903
====> Epoch: 6 Average loss: 108.6580
====> Test set loss: 107.6233
====> Epoch: 7 Average loss: 107.8095
====> Test set loss: 107.2324
====> Epoch: 8 Average loss: 107.1775
====> Test set loss: 106.5009
====> Epoch: 9 Average loss: 106.6282
====> Test set loss: 105.8009
====> Epoch: 10 Average loss: 106.2131
====> Test set loss: 105.5583
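
The log above alternates a training pass and an evaluation pass per epoch. A generic loop with that shape is sketched below on a toy regression problem. `run_epoch` is a hypothetical helper written for this illustration and is not the `VAEtrainer` implementation; the model, data, and learning rate are arbitrary.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def run_epoch(model, loader, loss_fn, optimizer=None):
    """Average per-sample loss over one pass; updates weights only if an optimizer is given."""
    training = optimizer is not None
    model.train(training)
    total, count = 0.0, 0
    with torch.set_grad_enabled(training):
        for x, y in loader:
            loss = loss_fn(model(x), y)
            if training:
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
            total += loss.item() * len(x)  # undo the batch mean to weight batches fairly
            count += len(x)
    return total / count


torch.manual_seed(1)
data = TensorDataset(torch.randn(256, 4), torch.randn(256, 1))
train_loader = DataLoader(data, batch_size=32, shuffle=True)
test_loader = DataLoader(data, batch_size=32)
model = nn.Linear(4, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

for epoch in range(1, 4):
    train_loss = run_epoch(model, train_loader, nn.MSELoss(), optimizer)
    test_loss = run_epoch(model, test_loader, nn.MSELoss())
    print(f"====> Epoch: {epoch} Average loss: {train_loss:.4f}")
    print(f"====> Test set loss: {test_loss:.4f}")
```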
