# PyTorch Introduction

* PyTorch tensors are similar to NumPy arrays, but they can also run on a **GPU**.
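
As a minimal sketch of what running on a GPU looks like in practice, the snippet below picks a device and moves a tensor to it. It assumes nothing about the machine: if no CUDA device is visible, everything stays on the CPU. The names `device`, `a`, `b`, and `c` are illustrative only.

```python
import torch

# Use the GPU when PyTorch can see one, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

a = torch.ones(2, 3)   # tensors are created on the CPU by default
b = a.to(device)       # tensor on the chosen device (a itself if it is already there)
c = (b * 2).cpu()      # compute on the device, then bring the result back to the CPU

print(b.device)
print(c)
```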

### **A first look at tensors in PyTorch:**

In [2]:
import torch

T = torch.Tensor([[1,2],[3,4]])
print(T)

tensor([[1., 2.],
        [3., 4.]])


In [3]:
print(T**2)

tensor([[ 1.,  4.],
        [ 9., 16.]])


### You can do in-place operations

In [22]:
T = torch.empty(2, 4)
print(T)

tensor([[-2.0848e-31,  3.0906e-41,  0.0000e+00,  0.0000e+00],
        [        nan,  0.0000e+00,  1.2271e+01,  1.1319e+21]])


In [23]:
T.fill_(0.05)

tensor([[0.0500, 0.0500, 0.0500, 0.0500],
        [0.0500, 0.0500, 0.0500, 0.0500]])

In [24]:
T += torch.randn(2,4)
print(T)

tensor([[-1.2986,  0.0204,  0.9383, -1.2173],
        [-0.3505, -1.4784,  0.3386, -0.0081]])


In [25]:
print(T.mean())
print(T.mean().item())

tensor(-0.3819)
-0.38193219900131226
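
Methods whose names end in an underscore, such as `fill_` above, modify the tensor in place, while their counterparts without the underscore return a new tensor. A small sketch of the difference, using illustrative variable names:

```python
import torch

t = torch.zeros(3)
same = t.add_(1)   # in-place: modifies t and returns it
new = t.add(1)     # out-of-place: t is untouched, a new tensor is returned

print(same is t)                        # True
print(new.data_ptr() == t.data_ptr())   # False, new has its own storage
print(t, new)
```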


### You can convert a NumPy array to a tensor, or vice versa

In [26]:
import numpy as np

v = np.ones(6)
print(v)

T = torch.from_numpy(v)
print(T)

[1. 1. 1. 1. 1. 1.]
tensor([1., 1., 1., 1., 1., 1.], dtype=torch.float64)


### The tensor and the NumPy array share the same underlying memory

In [27]:
T.add_(1)
print(T)
print(v)

tensor([2., 2., 2., 2., 2., 2.], dtype=torch.float64)
[2. 2. 2. 2. 2. 2.]
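
Memory sharing is specific to `torch.from_numpy`. If an independent copy is wanted, `torch.tensor` copies the data instead. A short sketch contrasting the two (variable names are illustrative):

```python
import numpy as np
import torch

v = np.ones(3)
shared = torch.from_numpy(v)   # shares memory with v
copied = torch.tensor(v)       # copies the data out of v

v[0] = 5.0                     # modify the NumPy array afterwards
print(shared)                  # first element follows the change
print(copied)                  # still all ones
```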


### From tensor to NumPy:

In [28]:
T = torch.randn(3,3)
print(T)
v = T.numpy()
print(v)

tensor([[-1.1288, -0.5369, -0.0870],
        [ 0.6636,  1.4599,  1.7416],
        [ 0.3486,  0.5827,  0.6461]])
[[-1.1287965  -0.5369296  -0.08702268]
 [ 0.6635978   1.459885    1.7415698 ]
 [ 0.34862456  0.5826633   0.6461347 ]]
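
Two details are worth keeping in mind here, sketched below with illustrative names: for a CPU tensor, the array returned by `.numpy()` shares memory with the tensor, and a tensor that requires gradients has to be detached first, because calling `.numpy()` on it directly raises an error.

```python
import torch

t = torch.zeros(3)
v = t.numpy()              # CPU tensor: the array shares memory with t
t.add_(1)
print(v)                   # reflects the in-place update to t

w = torch.ones(3, requires_grad=True)
arr = w.detach().numpy()   # detach from the autograd graph before converting
print(arr)
```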


## Autograd: automatic differentiation

* Any differentiable tensor operation done by PyTorch can be automatically differentiated by the **autograd** package.
* We only need to write the forward pass; autograd tracks the associated computational graph and computes the gradients.
* To have operations on a tensor tracked by autograd, set its **requires_grad** attribute to **True**.
* Every tensor also has a **grad** field, itself a tensor of the same size and type, which is used to accumulate gradients.

In [33]:
# A simple example:
x = torch.tensor(1., requires_grad=True)
w = torch.tensor(2., requires_grad=True)
b = torch.tensor(3., requires_grad=True)

# Build a computational graph.
y = w * x + b    # y = 2 * x + 3
print(y.grad_fn)

y.backward()
print(x.grad)
print(w.grad)
print(b.grad)

<ThAddBackward object at 0x7fccb0d7f748>
tensor(2.)
tensor(1.)
tensor(1.)
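
Because the `grad` field accumulates, a second call to `backward()` adds to the stored gradient rather than replacing it. This is why training loops reset gradients before each backward pass. A minimal sketch with an illustrative one-variable function:

```python
import torch

x = torch.tensor(1.0, requires_grad=True)

(3 * x).backward()
first = x.grad.item()
print(first)           # 3.0

(3 * x).backward()     # the new gradient is added to the stored one
second = x.grad.item()
print(second)          # 6.0

x.grad.zero_()         # reset before the next backward pass
print(x.grad.item())   # 0.0
```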


## PyTorch Modules - Neural Networks

* Neural networks can be constructed using the **torch.nn** package.
* Custom models are written as subclasses of **torch.nn.Module**.
* We also use functions from **torch.nn.functional**, which are autograd-compatible.

In [36]:
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(2,3)
print(x)
x = F.relu(x)
print(x)

tensor([[ 1.7550, -0.9437, -0.9727],
        [ 0.9440,  0.4132,  0.1134]])
tensor([[1.7550, 0.0000, 0.0000],
        [0.9440, 0.4132, 0.1134]])
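
The functional form `F.relu` and the module form `nn.ReLU` compute the same thing. The module form is convenient when the activation is stored as a layer inside an `nn.Module`. A quick check on a hand-picked input (the values are illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.tensor([[-1.0, 0.5], [2.0, -3.0]])

relu_layer = nn.ReLU()          # module form, can be stored as a layer
out_functional = F.relu(x)      # functional form, called directly
out_module = relu_layer(x)

print(out_functional)
print(torch.equal(out_functional, out_module))   # True
```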


In [37]:
f = nn.Linear(in_features=10, out_features=4)
for n, p in f.named_parameters():
    print(n, p.size())

weight torch.Size([4, 10])
bias torch.Size([4])


In [38]:
x = torch.empty(350, 10).normal_()
y = f(x)
print(y.size())

torch.Size([350, 4])
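
The parameter shapes printed above follow from what `nn.Linear` computes: `y = x @ weight.T + bias`, applied to each row of the input. A small sketch that checks this by hand, using a fixed seed and an illustrative batch of 5 rows:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
f = nn.Linear(in_features=10, out_features=4)
x = torch.randn(5, 10)

# weight has shape [4, 10], so it is transposed before the matrix product.
manual = x @ f.weight.t() + f.bias

print(manual.shape)
print(torch.allclose(f(x), manual, atol=1e-6))
```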


### Let's define a feedforward neural network as a module

In [39]:
# Fully connected neural network with one hidden layer
class NeuralNet(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes):
        super(NeuralNet, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size) 
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hidden_size, num_classes)  
    
    def forward(self, x):
        out = self.fc1(x)
        out = self.relu(out)
        out = self.fc2(out)
        return out
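
To get a feel for the size of such a network, the sketch below builds an equivalent stack of layers with `nn.Sequential` (a stand-in so the snippet runs on its own, not the `NeuralNet` class itself) and counts its parameters. The sizes 784, 500, and 10 are the ones used for MNIST later in this notebook; the batch of 8 random inputs is illustrative.

```python
import torch
import torch.nn as nn

# Same layers as NeuralNet: Linear -> ReLU -> Linear.
model = nn.Sequential(nn.Linear(784, 500), nn.ReLU(), nn.Linear(500, 10))

batch = torch.randn(8, 784)        # 8 fake flattened 28x28 images
logits = model(batch)
n_params = sum(p.numel() for p in model.parameters())

print(logits.shape)   # torch.Size([8, 10])
print(n_params)       # 397510 = (784*500 + 500) + (500*10 + 10)
```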

In [42]:
import torchvision
import torchvision.transforms as transforms

input_size = 784
hidden_size = 500
num_classes = 10
num_epochs = 5
batch_size = 100
learning_rate = 0.001

# MNIST dataset 
train_dataset = torchvision.datasets.MNIST(root='../../data', 
                                           train=True, 
                                           transform=transforms.ToTensor(),  
                                           download=True)

test_dataset = torchvision.datasets.MNIST(root='../../data', 
                                          train=False, 
                                          transform=transforms.ToTensor())

# Data loader
train_loader = torch.utils.data.DataLoader(dataset=train_dataset, 
                                           batch_size=batch_size, 
                                           shuffle=True)

test_loader = torch.utils.data.DataLoader(dataset=test_dataset, 
                                          batch_size=batch_size, 
                                          shuffle=False)

In [44]:
model = NeuralNet(input_size, hidden_size, num_classes)

# Loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)  
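
Note that `nn.CrossEntropyLoss` expects raw, unnormalized scores (logits) and integer class labels, which is why the network above has no softmax layer. As a sketch, the loss can be reproduced by applying `log_softmax` followed by the negative log-likelihood loss. The logits and labels here are random, illustrative values.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)             # 4 samples, 10 classes, no softmax applied
labels = torch.tensor([3, 0, 9, 1])     # integer class indices

loss = nn.CrossEntropyLoss()(logits, labels)
manual = F.nll_loss(F.log_softmax(logits, dim=1), labels)

print(loss.item(), manual.item())
print(torch.allclose(loss, manual))
```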

In [47]:
# Train the model
total_step = len(train_loader)
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):  
        # Flatten each 28x28 image into a 784-dimensional vector
        images = images.reshape(-1, 28*28)
        
        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)
        
        # Backward and optimize
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        
        if (i+1) % 100 == 0:
            print('Epoch [{}/{}], Step [{}/{}], Loss: {:.4f}'
                  .format(epoch+1, num_epochs, i+1, total_step, loss.item()))


Epoch [1/5], Step [100/600], Loss: 0.3366
Epoch [1/5], Step [200/600], Loss: 0.2874
Epoch [1/5], Step [300/600], Loss: 0.3106
Epoch [1/5], Step [400/600], Loss: 0.1440
Epoch [1/5], Step [500/600], Loss: 0.1801
Epoch [1/5], Step [600/600], Loss: 0.0872
Epoch [2/5], Step [100/600], Loss: 0.0701
Epoch [2/5], Step [200/600], Loss: 0.1648
Epoch [2/5], Step [300/600], Loss: 0.1814
Epoch [2/5], Step [400/600], Loss: 0.1003
Epoch [2/5], Step [500/600], Loss: 0.1224
Epoch [2/5], Step [600/600], Loss: 0.1301
Epoch [3/5], Step [100/600], Loss: 0.1409
Epoch [3/5], Step [200/600], Loss: 0.0185
Epoch [3/5], Step [300/600], Loss: 0.0263
Epoch [3/5], Step [400/600], Loss: 0.0580
Epoch [3/5], Step [500/600], Loss: 0.1400
Epoch [3/5], Step [600/600], Loss: 0.0614
Epoch [4/5], Step [100/600], Loss: 0.0196
Epoch [4/5], Step [200/600], Loss: 0.0103
Epoch [4/5], Step [300/600], Loss: 0.0380
Epoch [4/5], Step [400/600], Loss: 0.1080
Epoch [4/5], Step [500/600], Loss: 0.0327
Epoch [4/5], Step [600/600], Loss:
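
The loop above runs entirely on the CPU. To train on a GPU, both the model and each batch have to be moved to the same device. The following is a hedged sketch of a single training step under that setup. It uses a single linear layer and a random batch as stand-ins for the real model and MNIST data, so the loss value itself is meaningless.

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(784, 10).to(device)    # stand-in model, moved to the device
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

# Random stand-in for one MNIST batch: 100 images of shape 1x28x28.
images = torch.randn(100, 1, 28, 28)
labels = torch.randint(0, 10, (100,))

# Flatten the images, then move both tensors to the model's device.
images = images.reshape(-1, 28 * 28).to(device)
labels = labels.to(device)

loss = criterion(model(images), labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(loss.item())
```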

In [49]:
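# Test the model; gradients are not needed during evaluation
with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in test_loader:
        # Flatten each image, as in training
        images = images.reshape(-1, 28*28)
        outputs = model(images)
        # The predicted class is the index of the largest output
        _, predicted = torch.max(outputs, 1)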
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

    print('Accuracy of the network on the 10000 test images: {} %'.format(100 * correct / total))

# Save the model checkpoint
torch.save(model.state_dict(), 'model.ckpt')

Accuracy of the network on the 10000 test images: 97.72 %
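
A checkpoint saved this way contains only the parameters, not the model class, so loading it requires building a model with the same architecture first. A minimal round-trip sketch, using a small linear layer and a temporary file path as illustrative stand-ins for the trained network and `model.ckpt`:

```python
import os
import tempfile

import torch
import torch.nn as nn

model = nn.Linear(4, 2)                                  # stand-in for the trained model
path = os.path.join(tempfile.mkdtemp(), "model.ckpt")    # illustrative location
torch.save(model.state_dict(), path)

restored = nn.Linear(4, 2)                   # must match the saved architecture
restored.load_state_dict(torch.load(path))   # copy the saved parameters in
restored.eval()                              # switch to evaluation mode

x = torch.randn(3, 4)
print(torch.equal(model(x), restored(x)))    # True
```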
