# 20.07.18

It is beneficial to zero out gradients when training a neural network. This is because, by default, **gradients are accumulated in buffers** (not overwritten) whenever **.backward()** is called.
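
A minimal sketch of that accumulation, using a single scalar tensor named `w` (a made-up example, not part of the network trained later). Calling `.backward()` twice without clearing the buffer adds the second gradient to the first:

```python
import torch

# A single trainable scalar; requires_grad=True asks autograd to track it.
w = torch.tensor(2.0, requires_grad=True)

# First backward pass: d(w^2)/dw = 2w = 4.
(w ** 2).backward()
grad_after_first = w.grad.item()

# Second backward pass without zeroing: the new gradient (4) is added to the buffer.
(w ** 2).backward()
grad_after_second = w.grad.item()

# Zero the buffer in place, then run backward again: a single gradient once more.
w.grad.zero_()
(w ** 2).backward()
grad_after_reset = w.grad.item()

print(grad_after_first, grad_after_second, grad_after_reset)  # 4.0 8.0 4.0
```

In a real training loop, `optimizer.zero_grad()` does this clearing for every parameter the optimizer manages.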

When training a neural network, the model improves its accuracy through **gradient descent**. In short, gradient descent is the process of **minimizing the loss (or error)** by adjusting the weights and biases of the model.  
**torch.Tensor** is the central class of PyTorch. When you create a tensor and set its attribute **.requires_grad to True**, the package tracks all operations on it. The gradients are then computed on subsequent backward passes.  
The gradient for this tensor is accumulated into its .grad attribute. The accumulation (or sum) of all the gradients is calculated when .backward() is called on the loss tensor.  
**There are cases where it may be necessary to zero out the gradients of a tensor. For example: at the start of each iteration of the training loop, you should zero out the gradients so that gradients left over from the previous mini-batch are not added to the current update.**
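
To see why this matters for gradient descent, here is a small sketch that minimizes the toy loss `(w - 3)^2` with `torch.optim.SGD`. The helper `fit`, the target value 3, the learning rate and the step count are all assumptions made for illustration. With zeroing, `w` settles near 3; without it, stale gradients keep piling up in `.grad` and the updates tend to overshoot.

```python
import torch

def fit(zero_each_step, steps=50, lr=0.1):
    """Minimize (w - 3)^2 with plain SGD, optionally skipping zero_grad()."""
    w = torch.tensor(0.0, requires_grad=True)
    opt = torch.optim.SGD([w], lr=lr)
    for _ in range(steps):
        if zero_each_step:
            opt.zero_grad()      # clear gradients from the previous step
        loss = (w - 3.0) ** 2
        loss.backward()          # adds d(loss)/dw into w.grad
        opt.step()               # w <- w - lr * w.grad
    return w.item()

w_zeroed = fit(zero_each_step=True)
w_accumulated = fit(zero_each_step=False)

print('with zero_grad:   ', w_zeroed)
print('without zero_grad:', w_accumulated)
```

The only difference between the two runs is the `opt.zero_grad()` call, so any gap between the results comes from the accumulated gradients.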

## Import necessary libraries for loading our data

In [3]:
import torch

import torch.nn as nn
import torch.nn.functional as F

import torch.optim as optim

import torchvision
import torchvision.transforms as transforms

## Load and normalize the dataset

PyTorch provides various built-in datasets through torchvision. This example uses CIFAR-10.

In [4]:
transform=transforms.Compose([transforms.ToTensor(),
                             transforms.Normalize((0.5,0.5,0.5),(0.5,0.5,0.5))])

trainset=torchvision.datasets.CIFAR10(root='./data',train=True,
                                     download=True,transform=transform)

trainloader=torch.utils.data.DataLoader(trainset,batch_size=4,
                                       shuffle=True,num_workers=2)

testset=torchvision.datasets.CIFAR10(root='./data',train=False,
                                    download=True,transform=transform)

testloader=torch.utils.data.DataLoader(testset,batch_size=4,
                                      shuffle=False, num_workers=2)

classes=('plane','car','bird','cat','deer','dog','frog','horse','ship','truck')

Files already downloaded and verified
Files already downloaded and verified
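
A quick sketch of what the `Normalize((0.5,0.5,0.5),(0.5,0.5,0.5))` step does, written out by hand on a random tensor instead of a real CIFAR-10 image (the random image is a stand-in, so nothing has to be downloaded). `Normalize` computes `(input - mean) / std` per channel, so values in [0, 1] end up in [-1, 1]:

```python
import torch

# Stand-in for one image after ToTensor(): 3 channels, 32x32, values in [0, 1).
img = torch.rand(3, 32, 32)

# Per-channel mean and std, shaped so they broadcast over height and width.
mean = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)
std = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)

# Same formula Normalize applies: (input - mean) / std.
normalized = (img - mean) / std
print(normalized.min().item(), normalized.max().item())

# The CIFAR-10 training split has 50,000 images, so batch_size=4 gives:
batches_per_epoch = 50000 // 4
print(batches_per_epoch)  # 12500
```

The 12,500 mini-batches per epoch are why a log printed every 2000 mini-batches stops at 12000 within each epoch.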


# 20.07.19

In [5]:
class Net(nn.Module):
    def __init__(self):
        super(Net,self).__init__()
        self.conv1=nn.Conv2d(3,6,5)
        self.pool=nn.MaxPool2d(2,2)
        self.conv2=nn.Conv2d(6,16,5)
        self.fc1=nn.Linear(16*5*5,120)
        self.fc2=nn.Linear(120,84)
        self.fc3=nn.Linear(84,10)
        
    def forward(self,x):
        x=self.pool(F.relu(self.conv1(x)))
        x=self.pool(F.relu(self.conv2(x)))
        x=x.view(-1,16*5*5)
        x=F.relu(self.fc1(x))
        x=F.relu(self.fc2(x))
        x=self.fc3(x)
        return x
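
The `16*5*5` in `fc1` comes from how the spatial size shrinks through the convolution and pooling layers. A short sketch that traces the shapes with a random batch (the batch of 4 random 32x32 images is an assumption matching the CIFAR-10 setup above):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Same layer settings as in Net, created standalone so shapes can be inspected.
conv1 = nn.Conv2d(3, 6, 5)
pool = nn.MaxPool2d(2, 2)
conv2 = nn.Conv2d(6, 16, 5)

x = torch.randn(4, 3, 32, 32)          # batch of 4 fake RGB images
shapes = [tuple(x.shape)]

x = F.relu(conv1(x))                   # 5x5 kernel, no padding: 32 -> 28
shapes.append(tuple(x.shape))
x = pool(x)                            # 2x2 pooling: 28 -> 14
shapes.append(tuple(x.shape))
x = F.relu(conv2(x))                   # 5x5 kernel, no padding: 14 -> 10
shapes.append(tuple(x.shape))
x = pool(x)                            # 2x2 pooling: 10 -> 5
shapes.append(tuple(x.shape))
x = x.view(-1, 16 * 5 * 5)             # flatten to 400 features per image
shapes.append(tuple(x.shape))

for s in shapes:
    print(s)
```

The last shape, `(4, 400)`, is what `fc1=nn.Linear(16*5*5,120)` expects as input.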

## Define a loss function and optimizer

In [6]:
net=Net()
criterion=nn.CrossEntropyLoss()
optimizer=optim.SGD(net.parameters(),lr=0.001,momentum=0.9)
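
A small sketch of what `nn.CrossEntropyLoss` computes, using made-up logits and labels. It takes raw scores (no softmax in the model's `forward`) and integer class indices, and matches `log_softmax` followed by `nll_loss`:

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

# Made-up raw scores for 2 samples and 3 classes, with their true class indices.
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])
labels = torch.tensor([0, 2])

criterion = nn.CrossEntropyLoss()
loss_ce = criterion(logits, labels)

# Same value computed in two explicit steps.
loss_manual = F.nll_loss(F.log_softmax(logits, dim=1), labels)
print(loss_ce.item(), loss_manual.item())

# With 10 classes and equal scores for every class, the loss is ln(10).
uniform_loss = criterion(torch.zeros(1, 10), torch.tensor([3]))
print(uniform_loss.item(), math.log(10))
```

The ln(10) value (about 2.303) is a useful reference point: a 10-class model that has learned nothing yet should start near it.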

# 20.07.20

In [8]:
for epoch in range(2): #loop over the dataset multiple times
    
    running_loss=0.0
    for i, data in enumerate(trainloader, 0):
        #get the inputs; data is a list of [inputs, labels]
        inputs, labels=data
        
        #zero the parameter gradients
        optimizer.zero_grad()
        
        #forward + backward +optimize
        outputs=net(inputs)
        loss=criterion(outputs,labels)
        loss.backward()
        optimizer.step()
        
        #print statistics
        running_loss +=loss.item()
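        if i % 2000==1999: #print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' %(epoch +1, i+1, running_loss/2000))
            running_loss=0.0 #reset so the next average covers only the next 2000 mini-batches

print('Finished Training')

Note: the log below was recorded while the reset line misspelled the accumulator as `running_losss`, so `running_loss` kept growing within each epoch and the printed values are cumulative rather than per-2000-batch averages.
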
[1,  2000] loss: 2.191
[1,  4000] loss: 4.010
[1,  6000] loss: 5.647
[1,  8000] loss: 7.205
[1, 10000] loss: 8.699
[1, 12000] loss: 10.171
[2,  2000] loss: 1.374
[2,  4000] loss: 2.700
[2,  6000] loss: 4.004
[2,  8000] loss: 5.270
[2, 10000] loss: 6.521
[2, 12000] loss: 7.752
Finished Training
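
The logged values above grow within each epoch, which is what happens when the running loss is never reset between prints. A sketch in plain Python of the intended windowed average, plus a way to recover approximate per-window averages from the epoch 1 numbers. The helper name `windowed_averages` and the toy loss list are made up for illustration:

```python
def windowed_averages(losses, window):
    """Average the losses over consecutive windows, resetting after each one."""
    averages = []
    running = 0.0
    for i, loss in enumerate(losses):
        running += loss
        if i % window == window - 1:
            averages.append(running / window)
            running = 0.0        # the reset that makes each average independent
    return averages

# Toy per-batch losses, averaged in windows of 2.
print(windowed_averages([4.0, 2.0, 3.0, 1.0, 2.0, 0.0], 2))  # [3.0, 2.0, 1.0]

# Epoch 1 values from the log above. Each one is (cumulative sum) / 2000, so the
# difference between neighbours is the average loss of that 2000-batch window.
logged = [2.191, 4.010, 5.647, 7.205, 8.699, 10.171]
per_window = [logged[0]] + [round(b - a, 3) for a, b in zip(logged, logged[1:])]
print(per_window)
```

The recovered values are approximate because each logged number was already rounded to three decimals.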
