[View in Colaboratory](https://colab.research.google.com/github/Sharwon/fastai-intro-kit/blob/master/pytorch_intro_kit.ipynb)

# Introduction to PyTorch


PyTorch is a Python framework for deep learning. It was designed to be fast and Pythonic. Its biggest advantage is its ability to automatically calculate gradients for specified variables: the autograd package provides automatic differentiation for all operations on variables. This matters in deep learning because it makes computing gradients during back-propagation hassle-free.


In [0]:
# # Uncomment if google colab

# # http://pytorch.org/
# # pre installation
# from os import path
# from wheel.pep425tags import get_abbr_impl, get_impl_ver, get_abi_tag
# platform = '{}{}-{}'.format(get_abbr_impl(), get_impl_ver(), get_abi_tag())

# accelerator = 'cu80' if path.exists('/opt/bin/nvidia-smi') else 'cpu'

# !pip install -q http://download.pytorch.org/whl/{accelerator}/torch-0.3.0.post4-{platform}-linux_x86_64.whl torchvision

In [0]:
import torch
import numpy as np

In [0]:
torch.__version__

'0.3.0.post4'

## Tensor 

A tensor is an n-dimensional array, similar to a NumPy array, that can also be moved to the GPU (which is where it lives in most deep learning workloads).

Types supported:

In [0]:
torch.*Tensor?

#### Comparing numpy and torch tensors.

In [0]:
anumpy = np.random.randn(3,3)
anumpy

array([[ 0.78822371,  1.37979321, -0.15011747],
       [ 0.65926084,  0.89105191, -0.47031767],
       [-0.75957142, -0.09054076, -0.56651158]])

In [0]:
atorch = torch.randn((3,3))
atorch


-0.2511  0.7048 -0.7531
 0.4350  1.1851 -0.0939
 0.9303  0.3595 -0.2589
[torch.FloatTensor of size 3x3]

In [0]:
#converting torch tensor to numpy
atorch2numpy = atorch.numpy()
type(atorch2numpy)

numpy.ndarray

In [0]:
#converting numpy to torch tensor
anumpy2torch = torch.from_numpy(anumpy)
type(anumpy2torch)

torch.DoubleTensor
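
One detail worth knowing about the two conversions above: for CPU tensors, `torch.from_numpy` and `.numpy()` do not copy the data, so the array and the tensor share the same memory. A minimal sketch of that behaviour (the variable names are illustrative and not part of the notebook):

```python
import numpy as np
import torch

shared_array = np.zeros(3)                       # float64 NumPy array
shared_tensor = torch.from_numpy(shared_array)   # no copy: same underlying buffer

shared_array[0] = 5.0                            # modify the array in place ...
first_value = float(shared_tensor[0])            # ... and the tensor sees the change

back_to_numpy = shared_tensor.numpy()            # also a view, not a copy
back_to_numpy[1] = 7.0                           # modify through the new array ...
second_value = float(shared_array[1])            # ... and the original array changes too

print(first_value, second_value)
```

If an independent copy is needed, copying the array first (for example with `shared_array.copy()`) avoids the aliasing.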

#### Torch Tensor operations

In [0]:
# Creating a 3x3 tensor with values drawn from a standard normal distribution.
x = torch.randn(3, 3)
print(x)

# Creating an uninitialized 3x3 tensor; its values are whatever happens to be in memory.
y = torch.Tensor(3, 3)
print(y)


 0.3360 -0.3223  2.7800
 0.2366 -1.1127 -1.3100
 1.0351  0.0542  0.1007
[torch.FloatTensor of size 3x3]


1.00000e-25 *
  0.0000  0.0000  0.0000
  0.0000  0.0000  0.0000
 -1.2481  0.0000  0.0000
[torch.FloatTensor of size 3x3]



Important torch operations:  
**torch.numel**: returns the total number of elements in a Tensor.  
**torch.eye**: returns a 2D tensor representing an identity matrix.  
**torch.linspace**: returns a 1D tensor of equally spaced steps between the start and end of a range.  
**torch.cat**: concatenates tensors.  
**torch.chunk**: splits a tensor into chunks.
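
The operations listed above can be tried out in a few lines. This is a small sketch with made-up shapes and values, only meant to show what each call returns:

```python
import torch

t = torch.ones(2, 3)
n_elements = torch.numel(t)              # total number of elements: 2 * 3
identity = torch.eye(3)                  # 3x3 identity matrix
steps = torch.linspace(0, 1, steps=5)    # 5 equally spaced values from 0 to 1
stacked = torch.cat([t, t], 0)           # concatenate along dimension 0 -> 4x3
halves = torch.chunk(stacked, 2, 0)      # split along dimension 0 into two 2x3 tensors

print(n_elements, tuple(identity.size()), tuple(stacked.size()), len(halves))
```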

In [0]:
z = torch.ones(3,3)
print(z)


 1  1  1
 1  1  1
 1  1  1
[torch.FloatTensor of size 3x3]



In [0]:
# Some operations are also available as methods.
# Addition of two tensors can be done as
x = x + y
# or
x.add(y)


-1.0036 -1.1978  0.2704
-0.0486 -1.2952 -1.0026
 1.8614  1.1929  1.0874
[torch.FloatTensor of size 3x3]

In [0]:
# also, writing the result into a preallocated tensor
result = torch.Tensor(3, 3)
torch.add(x, y, out=result)


 0.3360 -0.3223  2.7800
 0.2366 -1.1127 -1.3100
 1.0351  0.0542  0.1007
[torch.FloatTensor of size 3x3]

In [0]:
# element-wise multiplication
print(torch.mul(y, x))


1.00000e-25 *
  0.0000 -0.0000  0.0000
  0.0000 -0.0000 -0.0000
 -1.2919  0.0000  0.0000
[torch.FloatTensor of size 3x3]



### Cuda

In [0]:
# Moving the result of the operation to the GPU, if one is available.
if torch.cuda.is_available():
    #y = y.cuda()
    u = (x + x).cuda()
    print(u) # notice (GPU 0) at the end.


 0.6719 -0.6446  5.5599
 0.4733 -2.2254 -2.6199
 2.0703  0.1085  0.2014
[torch.cuda.FloatTensor of size 3x3 (GPU 0)]



## Variable

__autograd.Variable__ is the central class of the package. It wraps a Tensor and supports nearly all of the operations defined on it. Once you finish your computation, you can call __.backward()__ and have all the gradients computed automatically.

![alt text](http://pytorch.org/tutorials/_images/Variable.png "Variable Structure")

You can access the raw tensor through the __.data__ attribute, while the gradient w.r.t. this variable is accumulated into __.grad__.


**Interesting point**: The other important class in the autograd package is **Function** (`torch.autograd.Function`). Variable and Function are interconnected to build an acyclic graph that encodes the complete history of the computation. Every Variable has a **.grad_fn** attribute that references the Function that created it. Variables created by the user have **grad_fn** set to None.
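
To make the idea of a recorded computation graph concrete, here is a toy pure-Python sketch of reverse-mode differentiation. It is not how PyTorch implements autograd. The `Scalar` class and its attributes are hypothetical and only mimic `.grad`, `.grad_fn` and `.backward()` for scalar values.

```python
class Scalar:
    """A toy scalar that records how it was produced (not PyTorch's implementation)."""

    def __init__(self, value, parents=(), grad_fn=None):
        self.value = value
        self.grad = 0.0           # gradients are accumulated here
        self.parents = parents    # (parent, local_derivative) pairs
        self.grad_fn = grad_fn    # None for user-created leaves

    def __mul__(self, other):
        return Scalar(self.value * other.value,
                      parents=((self, other.value), (other, self.value)),
                      grad_fn="mul")

    def __add__(self, other):
        return Scalar(self.value + other.value,
                      parents=((self, 1.0), (other, 1.0)),
                      grad_fn="add")

    def backward(self, upstream=1.0):
        # Chain rule: accumulate the upstream gradient, then pass it to the parents.
        self.grad += upstream
        for parent, local_derivative in self.parents:
            parent.backward(upstream * local_derivative)


x = Scalar(10.0)
y = Scalar(5.0)
z = x * (y * y)          # z = x * y**2

z.backward()
first_grad = y.grad      # dz/dy = 2 * x * y

z.backward()             # a second pass adds to the stored gradient
second_grad = y.grad

print(z.value, first_grad, second_grad, x.grad_fn, z.grad_fn)
```

Because `backward` adds to `grad` instead of overwriting it, calling it twice doubles the stored gradient, which mirrors the accumulation behaviour of `.grad` described above.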

In [0]:
# import Variable from pytorch.
from torch.autograd import Variable

In [0]:
#Computing the dot product of two vectors.

x = Variable(torch.cuda.FloatTensor([10, 10]))
y = Variable(torch.cuda.FloatTensor([5, 0]), requires_grad=True)

z = x.dot(y*y) # z = sum(x * y**2)
#print x,y,z
print(x)
print(y)
print(z)

Variable containing:
 10
 10
[torch.cuda.FloatTensor of size 2 (GPU 0)]

Variable containing:
 5
 0
[torch.cuda.FloatTensor of size 2 (GPU 0)]

Variable containing:
 250
[torch.cuda.FloatTensor of size 1 (GPU 0)]



In [0]:
z.backward(retain_graph=True) # for computing gradients automatically.
print(f'value of z : {z.data}')

value of z : 
 250
[torch.cuda.FloatTensor of size 1 (GPU 0)]



In [0]:
y.grad.data # derivative ---> 2xy


 100
   0
[torch.cuda.FloatTensor of size 2 (GPU 0)]

In [0]:
#running the backward pass a second time.
z.backward()

In [0]:
y.grad.data #derivative ---> 2*2xy
#Here the resulting gradient is wrong for our actual input.
#This is because the gradients computed during the second pass are added to the gradients from the first pass.
#Resetting the gradients to zero after each pass solves the issue.


 200
   0
[torch.cuda.FloatTensor of size 2 (GPU 0)]

In [0]:
# Recap of the above code
x = Variable(torch.cuda.FloatTensor([10, 10]))
y = Variable(torch.cuda.FloatTensor([5, 0]), requires_grad=True)
z = x.dot(y*y)

z.backward(retain_graph=True)
print(f'Gradients from the first run : {y.grad.data}')

#Uncomment the line below to reset the gradients before the second pass.
#With it commented out, the second run accumulates to 200; the output below was produced with it uncommented.
#y.grad.data.zero_() # gradients --> 0

z.backward()
print(f'Gradients from the second run : {y.grad.data}')

Gradients from the first run : 
 100
   0
[torch.cuda.FloatTensor of size 2 (GPU 0)]

Gradients from the second run : 
 100
   0
[torch.cuda.FloatTensor of size 2 (GPU 0)]



## Gradient Descent

In [0]:
x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

w = Variable(torch.Tensor([1.0]),  requires_grad=True)  # Any random value

# our model forward pass
def forward(x):
    return x * w

# Loss function
def loss(x, y):
    y_pred = forward(x)
    return (y_pred - y) * (y_pred - y)

# Before training
print("predict (before training)",  4, forward(4).data[0])

# Training loop
for epoch in range(10):
    for x_val, y_val in zip(x_data, y_data):
        l = loss(x_val, y_val)
        l.backward()
        print("\tgrad: ", x_val, y_val, w.grad.data[0])
        w.data = w.data - 0.01 * w.grad.data

        # Manually zero the gradients after updating weights
        w.grad.data.zero_()

    print("progress:", epoch, l.data[0])

# After training
print("predict (after training)", 4, forward(4).data[0])

predict (before training) 4 4.0
	grad:  1.0 2.0 -2.0
	grad:  2.0 4.0 -7.840000152587891
	grad:  3.0 6.0 -16.228801727294922
progress: 0 7.315943717956543
	grad:  1.0 2.0 -1.478623867034912
	grad:  2.0 4.0 -5.796205520629883
	grad:  3.0 6.0 -11.998146057128906
progress: 1 3.9987640380859375
	grad:  1.0 2.0 -1.0931644439697266
	grad:  2.0 4.0 -4.285204887390137
	grad:  3.0 6.0 -8.870372772216797
progress: 2 2.1856532096862793
	grad:  1.0 2.0 -0.8081896305084229
	grad:  2.0 4.0 -3.1681032180786133
	grad:  3.0 6.0 -6.557973861694336
progress: 3 1.1946394443511963
	grad:  1.0 2.0 -0.5975041389465332
	grad:  2.0 4.0 -2.3422164916992188
	grad:  3.0 6.0 -4.848389625549316
progress: 4 0.6529689431190491
	grad:  1.0 2.0 -0.4417421817779541
	grad:  2.0 4.0 -1.7316293716430664
	grad:  3.0 6.0 -3.58447265625
progress: 5 0.35690122842788696
	grad:  1.0 2.0 -0.3265852928161621
	grad:  2.0 4.0 -1.2802143096923828
	grad:  3.0 6.0 -2.650045394897461
progress: 6 0.195076122879982
	grad:  1.0 2.0 -0.24144
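
The gradients printed above can be checked by hand. For the loss `(x * w - y) ** 2`, the derivative with respect to `w` is `2 * x * (x * w - y)`. The following pure-Python sketch repeats the same training loop with that formula instead of autograd. It is only a cross-check, and `learning_rate` and `first_epoch_grads` are names introduced here for illustration.

```python
x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

w = 1.0                      # same starting weight as above
learning_rate = 0.01
first_epoch_grads = []

for epoch in range(10):
    for x_val, y_val in zip(x_data, y_data):
        grad = 2 * x_val * (x_val * w - y_val)   # d/dw of (x*w - y)**2
        if epoch == 0:
            first_epoch_grads.append(grad)
        w = w - learning_rate * grad             # same update rule as the loop above

print(first_epoch_grads)
print("prediction for x = 4:", 4 * w)
```

Up to floating-point precision, the first three values agree with the `grad:` lines of epoch 0 in the output above (-2.0, -7.84, -16.2288).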

## Data Loaders

A data loader is an iterable that provides the data one mini-batch (a subset of the dataset) at a time. To start at the beginning of the dataset, we call Python's built-in **`iter()`** on the loader, which returns an iterator object. That iterator has a **`__next__`** method, invoked through **`next()`**, that pulls the next mini-batch. PyTorch provides a data loader in **`torch.utils.data`**.
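
The iterator protocol described above can be sketched without PyTorch. `SimpleBatchLoader` is a hypothetical toy class, not part of `torch.utils.data`. It only shows how `iter()` and `next()` cooperate to hand out mini-batches.

```python
class SimpleBatchLoader:
    """Toy stand-in for a data loader: yields consecutive mini-batches of a list."""

    def __init__(self, samples, batch_size):
        self.samples = samples
        self.batch_size = batch_size

    def __iter__(self):
        # Calling iter() restarts from the beginning of the dataset.
        self.position = 0
        return self

    def __next__(self):
        if self.position >= len(self.samples):
            raise StopIteration
        batch = self.samples[self.position:self.position + self.batch_size]
        self.position += self.batch_size
        return batch


loader = SimpleBatchLoader(list(range(10)), batch_size=4)
batch_iterator = iter(loader)
first_batch = next(batch_iterator)      # first mini-batch of four samples
all_batches = list(loader)              # iterating again starts over

print(first_batch)
print(all_batches)
```

The real `DataLoader` adds shuffling, collation into tensors and worker processes on top of this basic pattern.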

In [0]:
# data loader from the pytorch library
from torch.utils.data import Dataset, DataLoader
#import numpy as np
#datasets that are already provided by the library.
import torchvision.datasets as datasets
#transforms that can be applied to the dataset, e.g. 'ToTensor()', which converts the data into a processable format.
import torchvision.transforms as transforms

In [0]:
# If the data is already available as a preprocessed dataset in the library, we can load it with the built-in dataset classes, e.g. the MNIST dataset.
train_dataset = datasets.MNIST(root='./data/data',
                               train=True,
                               transform=transforms.ToTensor(),
                               download=True)

test_dataset = datasets.MNIST(root='./data/data',
                              train=False,
                              transform=transforms.ToTensor())

# Data Loader (Input Pipeline)
train_loader = DataLoader(dataset=train_dataset, batch_size=32, shuffle=True)

test_loader = DataLoader(dataset=test_dataset, batch_size=32, shuffle=False)


Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Processing...
Done!


In [0]:
# get the first batch
for i in train_loader:
    print(i) #loop runs only once !
    break
    
train_loader.batch_size
#notice what happens because of shuffle=True

[
(0 ,0 ,.,.) = 
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
           ...             ⋱             ...          
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
     ⋮ 

(1 ,0 ,.,.) = 
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
           ...             ⋱             ...          
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
     ⋮ 

(2 ,0 ,.,.) = 
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000


32

In [0]:
# pythonic way to get the first batch
train_iterable = iter(train_loader)
X, y = next(train_iterable)

In [0]:
print(y)


 2
 1
 6
 7
 0
 9
 6
 6
 8
 6
 9
 5
 2
 3
 3
 2
 1
 1
 5
 1
 4
 9
 8
 2
 2
 5
 4
 6
 2
 3
 4
 7
[torch.LongTensor of size 32]



### Create a custom DataLoader

```
# You should build a custom dataset as below.
from torch.utils import data

class CustomDataset(data.Dataset):
    def __init__(self):
        # TODO
        # 1. Initialize file path or list of file names. 
        pass
    def __getitem__(self, index):
        # TODO
        # 1. Read one data sample from file (e.g. using numpy.fromfile, PIL.Image.open).
        # 2. Preprocess the data (e.g. torchvision.transforms).
        # 3. Return a data pair (e.g. image and label).
        pass
    def __len__(self):
        # You should change 0 to the total size of your dataset.
        return 0 

# Then, you can just use torch's prebuilt data loader. 
custom_dataset = CustomDataset()
```

In [0]:
# create a custom dataset class
#don't run the code below if you haven't set up shakespeare.txt

class MyDataLoader(Dataset):
    #Initialize your dataset (loading it)
    
    def __init__(self, filename = 'data/shakespeare.txt'):
        self.len = 0
        with open(filename, mode='rt') as f:
            self.ylines = [x.strip() for x in f if x.strip()]
            self.slines = [x.lower() for x in self.ylines]
            self.len = len(self.slines)    
        
    def __getitem__(self, index):
        return self.slines[index], self.ylines[index]
            
    def __len__(self):
        return self.len

In [0]:
# invoking our custom dataset loader
dataset = MyDataLoader()
train_loader = DataLoader(dataset=dataset,
                          batch_size=4,
                          shuffle=True,
                          num_workers=2)

for i, (src, target) in enumerate(train_loader):
    print(i, "data", src)
    break

## PyTorch as a Deep Learning Framework: torch.nn

Let's see what PyTorch has to offer for neural networks (nn). Notice how autograd is ingrained into torch.nn.

Important note: torch.nn only supports mini-batches, not single samples. For example, nn.Conv2d takes a 4D Tensor of samples x channels x height x width. For a single sample, input.unsqueeze(0) adds a fake batch dimension.
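
As a small illustration of the note above, this sketch adds the batch dimension to a single sample. The 1 x 28 x 28 shape is an assumption chosen to resemble one MNIST image.

```python
import torch

single_image = torch.randn(1, 28, 28)       # channels x height x width (one sample)
batch_of_one = single_image.unsqueeze(0)    # samples x channels x height x width

print(tuple(single_image.size()))
print(tuple(batch_of_one.size()))
```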

In [0]:
# Annotate your functions / classes!
torch.nn.Module?

Init signature: torch.nn.Module()
Docstring:     
Base class for all neural network modules.

Your models should also subclass this class.

Modules can also contain other Modules, allowing to nest them in
a tree structure. You can assign the submodules as regular attributes::

    import torch.nn as nn
    import torch.nn.functional as F

    class Model(nn.Module):
        def __init__(self):
            super(Model, self).__init__()
            self.conv1 = nn.Conv2d(1, 20, 5)
            self.conv2 = nn.Conv2d(20, 20, 5)

        def forward(self, x):
           x = F.relu(self.conv1(x))
           return F.relu(self.conv2(x))

Submodules assigned in this way will be registered, and will have their
parameters converted too when you call .cuda(), etc.
File:           ~/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/nn/modules/module.py
Type:         

In [0]:
torch.nn.Module??

Init signature: torch.nn.Module()
Source:        
class Module(object):
    r"""Base class for all neural network modules.

    Your models should also subclass this class.

    Modules can also contain other Modules, allowing to nest them in
    a tree structure. You can assign the submodules as regular attributes::

        import torch.nn as nn
        import torch.nn.functional as F

        class Model(nn.Module):
            def __init__(self):
                super(Model, self).__init__()
                self.conv1 = nn.Conv2d(1, 20, 5)
                self.conv2 = nn.Conv2d(20, 20, 5)

            def forward(self,

## Optimizers --> torch.optim

In [0]:
import torch.optim as optim

In [0]:
??optim

In [0]:
#tab to see the available attributes
optim.

SyntaxError: ignored
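
As a sketch of how an optimizer from `torch.optim` fits in, the manual weight update from the Gradient Descent section can be handed over to `optim.SGD`. This assumes the same toy data (y = 2x), starting weight and learning rate as in that section; `learned_w` is a name introduced here for illustration.

```python
import torch
import torch.optim as optim
from torch.autograd import Variable

x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

w = Variable(torch.Tensor([1.0]), requires_grad=True)
optimizer = optim.SGD([w], lr=0.01)   # plain SGD over the single parameter w

for epoch in range(10):
    for x_val, y_val in zip(x_data, y_data):
        optimizer.zero_grad()              # clear gradients from the previous step
        loss = (x_val * w - y_val) ** 2    # squared error for one sample
        loss.backward()                    # fills w.grad
        optimizer.step()                   # w <- w - lr * w.grad

learned_w = float(w.data[0])
print("learned w:", learned_w)
```

Here `optimizer.zero_grad()` and `optimizer.step()` take over the roles of the manual `w.grad.data.zero_()` call and the explicit `w.data` update.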