<a href="https://colab.research.google.com/github/suyeon-9706/MNIST/blob/master/MNIST.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#PyTorch 
- Scientific Computing Package based on Python (Deep Learning Library)
- Dynamic Computing Graph (DCG) Support
- Easy access to python arrangement features
- Excellent compatibility with numpy / scipy.
- Most are implemented in python except C++ code for tensor operation.

# MNIST
- Provided on Yann LeCun's website
- A simple computer vision data set
- It consists of handwritten images.
- 28*28 image, 1 channel gray image, 0~9 digits
- Each data is labeled 'What is the number of that data' with the data.






In [0]:
#code: https://github.com/AvivSham/Pytorch-MNIST-colab/blob/master/Pytorch_MNIST.ipynb
#@title Install pytorch
!pip install torch
!pip install torchvision

In [0]:
import torch
import torch.nn as nn # Packages for creating a Neural Network
# torch.autograd: A package that is central to the Neural network, providing automatic differentiation for all operations of the Tensor

from torch.autograd import Variable # Variable class: core class of autograd package

# torchvision: use for image classification training, easy to vision training
import torchvision.datasets as dsets # Data loader for datasets such as CIFAR10, MNIST, etc.
import torchvision.transforms as transf # A package that transforms images of PIL type into torch tensor type

In [0]:
import matplotlib.pyplot as plt # visualization package(Used to draw a graph)
import numpy as np # For matrix operations
import random #For random operations

In [3]:
#@title Loading MNIST data
mnist_train = dsets.MNIST(root='data/',
                          train=True, # train set
                          transform=transf.ToTensor(), # image to Tensor 
                          download=True) # If MNIST image does not exist in root, download data
mnist_test = dsets.MNIST(root='data/',
                         train=False, # val(test set)
                         transform=transf.ToTensor(), # image to Tensor
                         download=True) # If MNIST image does not exist in root, download data

0it [00:00, ?it/s]

Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to data/MNIST/raw/train-images-idx3-ubyte.gz


9920512it [00:01, 8627450.82it/s]                            


Extracting data/MNIST/raw/train-images-idx3-ubyte.gz


  0%|          | 0/28881 [00:00<?, ?it/s]

Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to data/MNIST/raw/train-labels-idx1-ubyte.gz


32768it [00:00, 129187.37it/s]           
  0%|          | 0/1648877 [00:00<?, ?it/s]

Extracting data/MNIST/raw/train-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to data/MNIST/raw/t10k-images-idx3-ubyte.gz


1654784it [00:00, 2117622.68it/s]                            
0it [00:00, ?it/s]

Extracting data/MNIST/raw/t10k-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to data/MNIST/raw/t10k-labels-idx1-ubyte.gz


8192it [00:00, 49003.64it/s]            


Extracting data/MNIST/raw/t10k-labels-idx1-ubyte.gz
Processing...
Done!


In [0]:
#@title Create batch operator to enter data in batch units
# To update the parameters of a model: gradient descent(Update parameters after reporting batch size of data)
batch_size = 100
train_data = torch.utils.data.DataLoader(dataset=mnist_train,
                                         batch_size=batch_size, 
                                         shuffle=True) # shuffle data
test_data = torch.utils.data.DataLoader(dataset=mnist_test, 
                                        batch_size=batch_size, 
                                        shuffle=False) # don't shuffle data

*batch_size*: the size of input data took for one iteration

In [5]:
#@title Define model
print("Define model...")
class Net(nn.Module):
  # Initialize all modules here(instantiate)
  def __init__(self, input_size, hidden_size, num_classes):    
    super(Net, self).__init__() # Always 'torch.nn.Module' inheritance, then start
    self.fc1 = nn.Linear(input_size, hidden_size) # used for linear transformation
    self.relu = nn.ReLU()
    self.fc2 = nn.Linear(hidden_size, num_classes) # used for linear transformation
    
  # Define the network structure
  # A function in which the model receives training data and proceeds to 'forward propagation'
  def forward(self, x):
    out = self.fc1(x)
    out = self.relu(out)
    out = self.fc2(out)
    return out

Define model...


*input_size*: img_size(MNIST data image of shape 28*28 = 784)

*hidden_size*: number of nodes at hidden layer

*num_classes*: number of output classes(MNIST label: discrete range [0,9])

*Linear() function*: Creates an 'input_size' image into a one-dimensional vector => Outputs 'num_classes' classes via the 'hidden_size' nodes at hidden layer

*RELU() function*: a function treated as zero only for negative numbers, such as max(0, x)


---

The reason they are grouped together into a module is because of the 'gpu allocation'.

( graph를 선언할 때 gpu option을 주면 
그 안에서 선언한 함수는 모두 gpu에 할당되는 Tensorflow와는 다르게 
PyTorch에서는 직접 .cuda() 를 통해 할당시켜주어야 한다. 
이 번거로움을 최소화시키고자, 모듈로 구성.)

In [0]:
#@title Build the model
input_size = 784 # img_size = (28,28) -> 28*28=784 in total
hidden_size = 500
num_classes = 10 # discrete range [0,9]

net = Net(input_size, hidden_size, num_classes)
if torch.cuda.is_available():
  net.cuda()

*torch.cuda.is_available() function*: Returns a bool indicating if CUDA is currently available. (Verify that GPUs are available in given environment)

*cuda()*: Used to replace the existing Tensor with a data type that allows GPU operation


In [0]:
#@title Define loss & optimizer
# loss(Softmax)
loss_function = nn.CrossEntropyLoss()
# backpropagation method(SGD optimizer)
learning_rate = 1e-3
optimizer = torch.optim.SGD(net.parameters(), lr=learning_rate)

*torch.optim*: a package implementing various optimization algorithms. 

*torch.optim.SGD(params, lr=~)*: SGD optimizer
- *params(iterable)*: iterable of parameters to optimize or dicts defining parameter groups

- *lr(float, otional)*: learning rate(default: le-3) (1e-3 --> 1∗10^−3=0.001)

In [11]:
#@title Training model
num_epochs = 20 
for epoch in range(num_epochs):
  for i ,(images,labels) in enumerate(train_data):
    images = Variable(images.view(-1,28*28)).cuda()
    labels = Variable(labels).cuda()
    
    # grad init
    optimizer.zero_grad()
    # forward propagation
    outputs = net(images)
    # calculate loss
    loss = loss_function(outputs, labels)
    # back propagation
    loss.backward()
    # weight update
    optimizer.step()
    
    if (i+1) % 100 == 0:
      print('Epoch [%d/%d], Step [%d/%d], Loss: %.4f'
                 %(epoch+1, num_epochs, i+1, len(mnist_train)//batch_size, loss.data))

Epoch [1/20], Step [100/600], Loss: 2.0119
Epoch [1/20], Step [200/600], Loss: 1.9970
Epoch [1/20], Step [300/600], Loss: 1.9673
Epoch [1/20], Step [400/600], Loss: 1.9009
Epoch [1/20], Step [500/600], Loss: 1.8735
Epoch [1/20], Step [600/600], Loss: 1.8277
Epoch [2/20], Step [100/600], Loss: 1.8178
Epoch [2/20], Step [200/600], Loss: 1.7330
Epoch [2/20], Step [300/600], Loss: 1.7045
Epoch [2/20], Step [400/600], Loss: 1.6386
Epoch [2/20], Step [500/600], Loss: 1.7111
Epoch [2/20], Step [600/600], Loss: 1.5794
Epoch [3/20], Step [100/600], Loss: 1.5957
Epoch [3/20], Step [200/600], Loss: 1.5472
Epoch [3/20], Step [300/600], Loss: 1.5167
Epoch [3/20], Step [400/600], Loss: 1.4979
Epoch [3/20], Step [500/600], Loss: 1.3821
Epoch [3/20], Step [600/600], Loss: 1.4622
Epoch [4/20], Step [100/600], Loss: 1.4177
Epoch [4/20], Step [200/600], Loss: 1.3602
Epoch [4/20], Step [300/600], Loss: 1.3924
Epoch [4/20], Step [400/600], Loss: 1.3130
Epoch [4/20], Step [500/600], Loss: 1.3198
Epoch [4/20

 *num_epochs*: number of times which the entire dataset is passed throughout the model

*epoch*: one forward pass and one backward pass of all  the training examples

*step*: Mnist에서 6만장을 한 번에 학습에 사용하면 좋겠지만, 메모리와 속도를 고려하여 데이터를 나누어 학습하는데  현재 batch Size가 100이므로 총 600개의 Batch를 얻을 수 있다.

In [12]:
#@title Evaluating accuracy of the model
correct = 0
total = 0
for images,labels in test_data:
  images = Variable(images.view(-1,28*28)).cuda()
  labels = labels.cuda()
  
  output = net(images)
  _, predicted = torch.max(output,1)
  correct += (predicted == labels).sum()
  total += labels.size(0)

print('Accuracy of the model: %.3f %%' %((100*correct)/(total+1)))

Accuracy of the model: 88.000 %
