<a href="https://colab.research.google.com/github/paulgureghian/Google_Colab_Notebooks/blob/master/Training_a_PyTorch_network.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Created by Paul A. Gureghian on 11/24/18.**

**This notebook trains a PyTorch feed-forward network on the MNIST digits dataset.**

**Training shows the network real data and adjusts its parameters (weights and biases) to reduce the loss on that data.**

In [25]:
### Install PyTorch and Torchvision 
!pip3 install -U -q torchvision 
!pip3 install -U -q http://download.pytorch.org/whl/cu92/torch-0.4.1-cp36-cp36m-linux_x86_64.whl


In [0]:
### Import packages
import torch
from torch import nn
from torch import optim
import torch.nn.functional as F
from torchvision import datasets, transforms

In [0]:
### Define a transform to normalize the data
### MNIST images have a single channel, so mean and std each take one value
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5,), (0.5,))])

### Download and load the training data
trainset = datasets.MNIST('~/.pytorch/MNIST_data/', download=True, train=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)
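
As a quick check of what the normalization step does: `transforms.Normalize(mean, std)` computes `(x - mean) / std` per channel, so with a mean and std of 0.5 it maps pixel values from the [0, 1] range that `ToTensor()` produces to [-1, 1]. The sketch below reproduces that arithmetic by hand with plain `torch`; `fake_image` is a made-up stand-in, not real MNIST data.

```python
import torch

# Hypothetical stand-in for one MNIST image after ToTensor():
# 1 channel, 28x28 pixels, values in [0, 1)
fake_image = torch.rand(1, 28, 28)

mean, std = 0.5, 0.5

# The same arithmetic Normalize applies to each channel
normalized = (fake_image - mean) / std

print(normalized.min().item(), normalized.max().item())
```

Both printed values should fall between -1 and 1.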

In [0]:
### Build a feed-forward network
model = nn.Sequential(nn.Linear(784, 128),
                      nn.ReLU(),
                      nn.Linear(128, 64),
                      nn.ReLU(),
                      nn.Linear(64, 10),
                      nn.LogSoftmax(dim=1))                       
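
To see what this network returns, here is a minimal sketch that pushes a random batch through the same architecture. `dummy_batch` is an assumption standing in for 64 flattened 28x28 images. Because the last layer is `LogSoftmax`, the outputs are log-probabilities, and exponentiating them should give rows that sum to 1.

```python
import torch
from torch import nn

model = nn.Sequential(nn.Linear(784, 128),
                      nn.ReLU(),
                      nn.Linear(128, 64),
                      nn.ReLU(),
                      nn.Linear(64, 10),
                      nn.LogSoftmax(dim=1))

# Random stand-in for a batch of 64 flattened images (not real data)
dummy_batch = torch.randn(64, 784)

# One row of 10 log-probabilities per image
log_ps = model(dummy_batch)
print(log_ps.shape)  # torch.Size([64, 10])

# Exponentiate to recover class probabilities; each row sums to 1
ps = torch.exp(log_ps)
print(ps.sum(dim=1)[:3])
```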

In [29]:
### Print the 'weights' before training

print("Initial weights - ", model[0].weight)

Initial weights -  Parameter containing:
tensor([[ 0.0213,  0.0127,  0.0309,  ...,  0.0017,  0.0326, -0.0152],
        [ 0.0212,  0.0312,  0.0068,  ..., -0.0157, -0.0008, -0.0136],
        [ 0.0085,  0.0251,  0.0048,  ..., -0.0320, -0.0143,  0.0060],
        ...,
        [ 0.0179, -0.0135,  0.0323,  ...,  0.0085, -0.0206, -0.0239],
        [-0.0248,  0.0152, -0.0054,  ...,  0.0306,  0.0276, -0.0026],
        [-0.0215, -0.0163, -0.0134,  ..., -0.0115, -0.0143, -0.0100]],
       requires_grad=True)


In [0]:
### Define an optimizer and a learning rate
optimizer = optim.SGD(model.parameters(), lr=0.003)

### Define the loss
criterion = nn.NLLLoss() 
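
`NLLLoss` expects log-probabilities, which is why the model ends in `LogSoftmax`. As a small sanity check on made-up data, the sketch below shows that `NLLLoss` applied to log-softmax outputs gives the same value as `CrossEntropyLoss` applied to the raw scores. The `logits` and `labels` tensors are random placeholders.

```python
import torch
from torch import nn
import torch.nn.functional as F

torch.manual_seed(0)

# Made-up raw scores for 8 samples and 10 classes, plus made-up labels
logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))

# Route 1: log-softmax followed by negative log-likelihood
loss_nll = nn.NLLLoss()(F.log_softmax(logits, dim=1), labels)

# Route 2: cross-entropy directly on the raw scores
loss_ce = nn.CrossEntropyLoss()(logits, labels)

print(loss_nll.item(), loss_ce.item())
```

The two printed losses should agree up to floating-point error.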

In [31]:
### Iterate over all the dataset images
epochs = 5
for e in range(epochs):
    running_loss = 0
    for images, labels in trainloader:
      
        ### flatten images into a 784 long vector
        images = images.view(images.shape[0], -1)

        ### clear the gradients
        optimizer.zero_grad()

        ### forward pass through the network
        ### get the network output
        output = model(images)

        ### calculate the loss with the logps and labels
        loss = criterion(output, labels)
 
        ### backward pass through the network 
        loss.backward()
      
        ### update the weights
        optimizer.step()

        running_loss += loss.item()
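
### report the mean batch loss of the final epoch
### (running_loss is reset at the start of every epoch)
print(f"Training loss:  {running_loss/len(trainloader)}")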

Training loss:  0.37537118576482925
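
To make the "update the weights" step concrete, here is a minimal sketch, assuming plain SGD without momentum or weight decay, which is how the optimizer is configured in this notebook. It compares `optimizer.step()` against the hand-written rule `w = w - lr * grad` on a tiny hypothetical layer with random data. The names `layer`, `layer_manual`, `inputs`, and `targets` are illustrative and not part of the notebook, and a mean-squared-error loss is swapped in to keep the example short.

```python
import copy
import torch
from torch import nn, optim

torch.manual_seed(0)

layer = nn.Linear(4, 2)               # tiny hypothetical model
layer_manual = copy.deepcopy(layer)   # identical copy, updated by hand

inputs = torch.randn(3, 4)            # made-up batch
targets = torch.randn(3, 2)           # made-up targets
lr = 0.003

# Update 1: let the optimizer apply the step
optimizer = optim.SGD(layer.parameters(), lr=lr)
optimizer.zero_grad()
loss = ((layer(inputs) - targets) ** 2).mean()
loss.backward()
optimizer.step()

# Update 2: apply w = w - lr * grad by hand
loss_manual = ((layer_manual(inputs) - targets) ** 2).mean()
loss_manual.backward()
with torch.no_grad():
    for p in layer_manual.parameters():
        p -= lr * p.grad

same_weights = torch.allclose(layer.weight, layer_manual.weight)
same_bias = torch.allclose(layer.bias, layer_manual.bias)
print(same_weights, same_bias)
```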


**Autograd.** 

In [32]:
### some autograd examples

x = torch.randn(2, 2, requires_grad=True)
print(x)

tensor([[-1.3703, -0.7818],
        [-0.1143,  1.0356]], requires_grad=True)


In [33]:
y = x**2
print(y)  

tensor([[1.8778, 0.6112],
        [0.0131, 1.0724]], grad_fn=<PowBackward0>)


In [34]:
### grad_fn shows the function that generated the 'y' variable

print(y.grad_fn)

<PowBackward0 object at 0x7fb46f54f860>


In [35]:
### reduce the 'y' tensor to a scalar by taking its mean

z = y.mean()
print(z)

tensor(0.8936, grad_fn=<MeanBackward1>)


In [36]:
### check the gradient for 'x': it is None until backward() has been called

print(x.grad)

None


In [37]:
### calculate the gradient of 'z' with respect to 'x'
### z = mean(x**2) over 4 elements, so dz/dx = 2*x/4 = x/2

z.backward()
print(x.grad)
print(x/2) 

tensor([[-0.6852, -0.3909],
        [-0.0571,  0.5178]])
tensor([[-0.6852, -0.3909],
        [-0.0571,  0.5178]], grad_fn=<DivBackward0>)
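
The two printed tensors match because z is the mean of x squared over four elements, so the gradient of z with respect to each element is x/2. The sketch below checks that numerically on a fresh random tensor. It also shows why the training loop calls `optimizer.zero_grad()` before each backward pass: gradients accumulate in `.grad` across calls to `backward()`. The flag names `first_ok` and `second_ok` are illustrative.

```python
import torch

x = torch.randn(2, 2, requires_grad=True)

# First backward pass: x.grad should equal x / 2
z = (x ** 2).mean()
z.backward()
first_ok = torch.allclose(x.grad, x.detach() / 2)

# Second backward pass without zeroing: the new gradient is added,
# so x.grad becomes x / 2 + x / 2 = x
z2 = (x ** 2).mean()
z2.backward()
second_ok = torch.allclose(x.grad, x.detach())

print(first_ok, second_ok)

# Reset the accumulated gradient, as optimizer.zero_grad() does for model parameters
x.grad.zero_()
```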
