##  pytorch의 구현순서는 다음과 같다.
1. Dataset, DataLoader 정의
2. Model 정의
3. Loss function 정의
4. Back prop 정의
5. Training 
6. Testing

## naive tutorial

### 1. pre-requisite

#### 1.1 install pytorch

for windows users, you can use **[wsl(windows subsystem for Linux)](https://msdn.microsoft.com/en-us/commandline/wsl/install_guide)** <br>

[GPU support is now available too for wsl](https://datascience.stackexchange.com/questions/17776/how-to-install-pytorch-in-windows) <br>

requirements : <br>
* [conda & pytorch](http://pytorch.org/)

#### 1.2 install tensorboard
```bash
pip install tensorboard-pytorch 
pip install tensorflow-tensorboard
```

In [6]:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.autograd.variable import Variable
import torchvision.utils as vutils
import numpy as np
import torchvision.models as models
from torchvision import datasets
from tensorboard import SummaryWriter
import torchvision.transforms as transforms

## 2. Dataset, DataLoader 정의

먼저 customized Dataset으로 진행할 때의 방식은 다음과 같다 : <br>
```python

des_dir = "./somewhere"

dataset = dset.ImageFolder(root=des_dir,
                           transform=transforms.Compose([
                               transforms.Scale(imageSize),
                               transforms.ToTensor(),
                               transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
                           ]))

dataloader = torch.utils.data.DataLoader(dataset,
                                         batch_size= batchSize,
                                         shuffle=True)

```

**DATASET**은 파일을 파이썬에 불러오는데 그 역할이 있다. <br>
따라서 
```python
__getitem__
```
을 정의할 필요가 있는데, sample code는 본인의 [deep speech](https://github.com/YBIGTA/Deep_learning/blob/master/RNN/deep_speech/implementation/deep%20speech1%20%EA%B5%AC%ED%98%84.ipynb)를 참고한다. <br>

**DATALOADER**는 dataset을 받아 slicing을 해주는데에 그 의미가 있다. <br>
이는 pytorch의 
```python
self.collate_fn 
``` 
을 구현하여, 각 크기가 다른, 혹은 형태가 다른 아이템을을 어떻게 같이 slicing을 할 것인가를 정의해야 한다.<br>
```python
self.collate_fn 
``` 
에는 slicing된 각 dataset의 데이터가 list형태로 들어가고, 이 output을 정의해야한다. <br>

예를들어 음성은 각 사이즈가 다 다른데, 어떻게 collation을 할 지를 정의해줘야 하는 것이다.<br>


In [8]:
# dataset & dataLoader for Mnists.

train_loader = torch.utils.data.DataLoader(
    datasets.MNIST('../data', train=True, download=True,
                   transform=transforms.ToTensor()),
    batch_size=100, shuffle=True,)
test_loader = torch.utils.data.DataLoader(
    datasets.MNIST('../data', train=False, transform=transforms.ToTensor()),
    batch_size=100, shuffle=True)


## 3. modeling
modeling은 **nn.module**을 상속받아 Class로 바인딩한다. <br>

back propagation은 정의할 필요가 없고,<br>

1. __init__을 통하여 Model에 어떤 레이어들이 있는지 정의한 후,
2. forward()를 통하여 데이터가 각 레이어에 어떻게 들어가게되는지 정의한다.

In [9]:
# modeling
class VAE(nn.Module):
    def __init__(self):
        super(VAE, self).__init__()

        self.fc1 = nn.Linear(784, 400)
        self.fc21 = nn.Linear(400, 20)
        self.fc22 = nn.Linear(400, 20)
        self.fc3 = nn.Linear(20, 400)
        self.fc4 = nn.Linear(400, 784)

        self.relu = nn.ReLU()
        self.sigmoid = nn.Sigmoid()

    def encode(self, x):
        h1 = self.relu(self.fc1(x))
        return self.fc21(h1), self.fc22(h1)

    def reparametrize(self, mu, logvar):
        std = logvar.mul(0.5).exp_()
        if 0:
            eps = torch.cuda.FloatTensor(std.size()).normal_()
        else:
            eps = torch.FloatTensor(std.size()).normal_()
        eps = Variable(eps)
        return eps.mul(std).add_(mu)

    def decode(self, z):
        h3 = self.relu(self.fc3(z))
        return self.sigmoid(self.fc4(h3))

    def forward(self, x):
        mu, logvar = self.encode(x.view(-1, 784))
        z = self.reparametrize(mu, logvar)
        return self.decode(z), mu, logvar

In [47]:
# 모델 초기화
model = VAE()

## 4. Loss function 구현

loss function을 정의하여, model에서 나온 output을 넣어 역전파를 구할 장소를 만든다. <br>

함수로서 정의하면 되고, 일반적인 경우에선 그냥 NLL등을 사용하면 된다.<br>



In [11]:
reconstruction_function = nn.BCELoss()
reconstruction_function.size_average = False


def loss_function(recon_x, x, mu, logvar):
    BCE = reconstruction_function(recon_x, x)

    # see Appendix B from VAE paper:
    # Kingma and Welling. Auto-Encoding Variational Bayes. ICLR, 2014
    # https://arxiv.org/abs/1312.6114
    # 0.5 * sum(1 + log(sigma^2) - mu^2 - sigma^2)
    KLD_element = mu.pow(2).add_(logvar.exp()).mul_(-1).add_(1).add_(logvar)
    KLD = torch.sum(KLD_element).mul_(-0.5)

    return BCE + KLD


## 5. Back prop method 구현.

굳이 따로 때넨 이유는, 코드가 때에 따라 길어질 여지가 있고 하나의 명백한 학문이기 때문이다.

모델의 인자(트레이닝을 할 변수들)를 당연하게도 받는다.

In [48]:
optimizer = optim.Adam(model.parameters(), lr=1e-3)

## 6. Training, implementing tensorboard

In [61]:
# tensorboard에 필요한 것. 
writer = SummaryWriter()
#epoch 수
epoch = 1
# 몇 번째마다 로깅 할 것인가.
embedding_log = 5

#train
for i in range(epoch):
    model.train()
    train_loss = 0
    for batch_idx, (data, label) in enumerate(train_loader):
        n_iter = (i*len(train_loader))+batch_idx
        print(len(label))
        data = Variable(data)
        if 0:
            data = data.cuda()
        optimizer.zero_grad()
        recon_batch, mu, logvar = model(data)
        loss = loss_function(recon_batch, data, mu, logvar)
        loss.backward()
        train_loss += loss.data[0]
        optimizer.step()
        if batch_idx % 100 == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader),
                loss.data[0] / len(data)))
            
        writer.add_scalar('loss',loss.data[0] / len(data),n_iter)
        for tag, value in model.named_parameters():
            writer.add_histogram(tag, value.data.numpy(), global_step = n_iter)
        if batch_idx % embedding_log == 0:
            #we need  dimension for tensor to visualize it
            writer.add_embedding(recon_batch.data,metadata = label, label_img = data.data,global_step = n_iter)


100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100


In [60]:
value.data.numpy()

array([[-0.00205815, -0.00553272,  0.01402932, ...,  0.00380697,
        -0.03047704,  0.00743232],
       [-0.00973061,  0.02102149, -0.00396892, ...,  0.0184958 ,
         0.01318543,  0.00028399],
       [-0.02935295, -0.02123811, -0.01930187, ..., -0.02751336,
        -0.01735669, -0.00705339],
       ..., 
       [-0.02532171,  0.00659803,  0.0349379 , ..., -0.01281032,
         0.01002253,  0.02722737],
       [-0.01069853, -0.01685899,  0.0151909 , ...,  0.01378453,
        -0.02534758,  0.0127874 ],
       [-0.02630001, -0.02807823,  0.02950477, ...,  0.00298915,
         0.02662011, -0.00847801]], dtype=float32)

In [None]:
writer.add_histogram

In [55]:
tag

'fc4.bias'

In [49]:
loss.data[0] / len(data)

206.529765625

In [45]:
loss.data.numpy()[0]

21168.311

In [40]:
type(recon_batch.data)


torch.FloatTensor

In [54]:
for tag, value in model.named_parameters():
    print(tag)
    print(value)

fc1.weight
Parameter containing:
-2.0581e-03 -5.5327e-03  1.4029e-02  ...   3.8070e-03 -3.0477e-02  7.4323e-03
-9.7306e-03  2.1021e-02 -3.9689e-03  ...   1.8496e-02  1.3185e-02  2.8399e-04
-2.9353e-02 -2.1238e-02 -1.9302e-02  ...  -2.7513e-02 -1.7357e-02 -7.0534e-03
                ...                   ⋱                   ...                
-2.5322e-02  6.5980e-03  3.4938e-02  ...  -1.2810e-02  1.0023e-02  2.7227e-02
-1.0699e-02 -1.6859e-02  1.5191e-02  ...   1.3785e-02 -2.5348e-02  1.2787e-02
-2.6300e-02 -2.8078e-02  2.9505e-02  ...   2.9891e-03  2.6620e-02 -8.4780e-03
[torch.FloatTensor of size 400x784]

fc1.bias
Parameter containing:
 0.0064
 0.0191
-0.0027
 0.0070
-0.0627
 0.0787
 0.0148
 0.0199
 0.0165
 0.0662
-0.0344
 0.1589
 0.0203
 0.0701
 0.0223
-0.1591
 0.1117
-0.0385
 0.1145
 0.1398
 0.0073
-0.0171
 0.1323
-0.0441
-0.0026
-0.0834
-0.0199
-0.0142
 0.0840
 0.0750
 0.1920
 0.1390
 0.0182
-0.0372
 0.0213
-0.0453
-0.0227
-0.0008
 0.1093
-0.0221
 0.0077
 0.0153
 0.1274
-0.0076
 

In [None]:
model.n

In [35]:
loss.data[0]

20717.869140625

In [31]:
l

AttributeError: 'NLLLoss' object has no attribute 'data'

In [21]:
torch.cat((recon_batch,Variable(torch.ones(len(recon_batch)))),1)

RuntimeError: dimension 1 out of range of 1D tensor at /py/conda-bld/pytorch_1493681908901/work/torch/lib/TH/generic/THTensor.c:24

In [36]:
tmp = recon_batch.view(100,784,1)


In [38]:
tmp.data

TypeError: 'torch.FloatTensor' object is not callable

In [None]:
torch.cat()

In [29]:
recon_batch.size()[0]

100

loss_value:0.8387617468833923


RuntimeError: expected a Variable argument, but got torch.FloatTensor

100