<a href="https://colab.research.google.com/github/suyeon-9706/MNIST/blob/master/Update_MNIST.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# MNIST
- Provided on Yann LeCun's website
- A simple computer vision data set
- It consists of handwritten images.
- 28 * 28 * 1 image, 1 channel gray image, 0~9 digits
- Each data is labeled 'What is the number of that data' with the data.






In [1]:
#@title Install pytorch
!pip install torch
!pip install torchvision



In [0]:
import torch
import torch.nn as nn # Packages for creating a Neural Network
import torch.nn.functional as F

# torch.autograd: A package that is central to the Neural network, providing automatic differentiation for all operations of the Tensor
from torch.autograd import Variable # Variable class: core class of autograd package

# torchvision: use for image classification training, easy to vision training
import torchvision.datasets as dsets # Data loader for datasets such as CIFAR10, MNIST, etc.
import torchvision.transforms as transf # A package that transforms images of PIL type into torch tensor type

In [0]:
#@title Loading MNIST data
mnist_train = dsets.MNIST(root='data/',
                          train=True, # train set
                          transform=transf.ToTensor(), # image to Tensor 
                          download=True) # If MNIST image does not exist in root, download data
mnist_test = dsets.MNIST(root='data/',
                         train=False, # val(test set)
                         transform=transf.ToTensor(), # image to Tensor
                         download=True) # If MNIST image does not exist in root, download data

In [0]:
#@title Create batch operator to enter data in batch units
# To update the parameters of a model: gradient descent(Update parameters after reporting batch size of data)
batch_size = 100
train_data = torch.utils.data.DataLoader(dataset=mnist_train,
                                         batch_size=batch_size, 
                                         shuffle=True) # shuffle data
test_data = torch.utils.data.DataLoader(dataset=mnist_test, 
                                        batch_size=batch_size, 
                                        shuffle=False) # don't shuffle data

*batch_size*: the size of input data took for one iteration

In [5]:
#@title Define model(★Update★)
print("Define model...")

class Net(nn.Module):
  # Initialize all modules here(instantiate)
  def __init__(self, num_classes):    
    super(Net, self).__init__() # Always 'torch.nn.Module' inheritance, then start
    
    # input=28*28*1
    # padding=2 for same padding
    self.conv1 = nn.Conv2d(1, 32, 5, padding=2) # 1 input image channel, 32 output image channels 
    # feature map size is 14*14 by pooling
    # padding=2 for same padding
    self.conv2 = nn.Conv2d(32, 64, 5, padding=2) # 32 input image channels, 64 output image channels
    # feature map size is 7*7 by pooling
    self.fc1 = nn.Linear(7*7*64, 1024)
    self.fc2 = nn.Linear(1024, 10)
    
  # A function in which the model receives training data and proceeds to 'forward propagation'
  def forward(self, x):
    # Conv -> ReLU -> MaxPool
    out = F.max_pool2d(F.relu(self.conv1(x)), 2)
    out = F.max_pool2d(F.relu(self.conv2(out)), 2)
    
    out = out.view(-1, 7*7*64) # Conv --> F.C
    
    #Fully Connected
    out = self.fc1(out)
    out = F.relu(out)
    out = self.fc2(out)
    return out

Define model...


*num_classes*: number of output classes(MNIST label: discrete range [0,9])

*RELU() function*: a function treated as zero only for negative numbers, such as max(0, x)


---

Convolution Layer 만드는 방법

Conv -> ReLU -> MaxPool 이 3가지를 순서대로 진행!

In [6]:
#@title Build the model
num_classes = 10 # discrete range [0,9]

net = Net(num_classes)
print(net)

if torch.cuda.is_available():
  net.cuda()

Net(
  (conv1): Conv2d(1, 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
  (conv2): Conv2d(32, 64, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
  (fc1): Linear(in_features=3136, out_features=1024, bias=True)
  (fc2): Linear(in_features=1024, out_features=10, bias=True)
)


*torch.cuda.is_available() function*: Returns a bool indicating if CUDA is currently available. (Verify that GPUs are available in given environment)

*cuda()*: Used to replace the existing Tensor with a data type that allows GPU operation


In [0]:
#@title Define loss & optimizer
# loss(Softmax)
loss_function = nn.CrossEntropyLoss()
# backpropagation method(Adam optimizer)
learning_rate = 1e-3
optimizer = torch.optim.Adam(net.parameters(), lr=learning_rate)

*torch.optim*: a package implementing various optimization algorithms. 

*torch.optim.Adam(params, lr=~)*:  An algorithm for first-order "gradient-based optimization" of stochastic objective functions, based on adaptive estimates of lower-order moments.

- *params(iterable)*: iterable of parameters to optimize or dicts defining parameter groups

- *lr(float, otional)*: learning rate(default: le-3) (1e-3 --> 1∗10^−3=0.001)

- straightforward to implement(1), computationally efficient(2), little memory requirements(3), well suited for problems that are large in terms of data and/or parameters(4)

In [8]:
#@title Training model
num_epochs = 15 
for epoch in range(num_epochs):
  for i ,(images,labels) in enumerate(train_data):
    images = Variable(images).cuda()
    labels = Variable(labels).cuda()
    
    # grad init
    optimizer.zero_grad()
    # forward propagation
    outputs = net(images)
    # calculate loss
    loss = loss_function(outputs, labels)
    # backpropagation(calc gradient)
    loss.backward()
    # weight update(update gradient)
    optimizer.step()
    
    if (i+1) % 100 == 0:
      print('Epoch [%d/%d], Step [%d/%d], Loss: %.4f'
                 %(epoch+1, num_epochs, i+1, len(mnist_train)//batch_size, loss.data))

Epoch [1/15], Step [100/600], Loss: 0.1912
Epoch [1/15], Step [200/600], Loss: 0.0417
Epoch [1/15], Step [300/600], Loss: 0.0624
Epoch [1/15], Step [400/600], Loss: 0.0722
Epoch [1/15], Step [500/600], Loss: 0.1322
Epoch [1/15], Step [600/600], Loss: 0.0364
Epoch [2/15], Step [100/600], Loss: 0.0304
Epoch [2/15], Step [200/600], Loss: 0.0303
Epoch [2/15], Step [300/600], Loss: 0.0830
Epoch [2/15], Step [400/600], Loss: 0.0246
Epoch [2/15], Step [500/600], Loss: 0.0504
Epoch [2/15], Step [600/600], Loss: 0.0192
Epoch [3/15], Step [100/600], Loss: 0.0322
Epoch [3/15], Step [200/600], Loss: 0.0182
Epoch [3/15], Step [300/600], Loss: 0.0031
Epoch [3/15], Step [400/600], Loss: 0.0307
Epoch [3/15], Step [500/600], Loss: 0.0012
Epoch [3/15], Step [600/600], Loss: 0.0822
Epoch [4/15], Step [100/600], Loss: 0.0097
Epoch [4/15], Step [200/600], Loss: 0.0070
Epoch [4/15], Step [300/600], Loss: 0.0529
Epoch [4/15], Step [400/600], Loss: 0.0141
Epoch [4/15], Step [500/600], Loss: 0.0025
Epoch [4/15

 *num_epochs*: number of times which the entire dataset is passed throughout the model

*epoch*: one forward pass and one backward pass of all  the training examples

*step*: Mnist에서 6만장을 한 번에 학습에 사용하면 좋겠지만, 메모리와 속도를 고려하여 데이터를 나누어 학습하는데  현재 batch Size가 100이므로 총 600개의 Batch를 얻을 수 있다.

In [10]:
#@title Evaluating accuracy of the model
net.eval() # test 과정이라고 내부적으로 알려줌
correct = 0
total = 0
for images,labels in test_data:
  images = Variable(images).cuda()
  labels = labels.cuda()
  
  output = net(images)
  _, predicted = torch.max(output,1)
  correct += (predicted == labels).sum()
  total += labels.size(0)

print('Accuracy of the model: %.3f %%' %((100*correct)/(total+1)))

Accuracy of the model: 99.000 %
