![](img/the_real_reason.png)

# Foreword

Material for this tutorial is here: https://github.com/sotte/pytorch_tutorial

**Prerequisites:**
- you have implemented machine learning models yourself
- you know what deep learning is
- you have used numpy
- maybe you have used tensorflow or similar libs

- if you use PyTorch on a daily basis, this tutorial is probably not for you

**Goals:**
- understand PyTorch concepts
- be able to use transfer learning in PyTorch
- be aware of some handy tools/libs

Note:
You don't need a GPU to work on this tutorial, but everything is much faster if you have one.
However, you can use Google's Colab with a GPU and work on this tutorial:
[PyTorch + GPU in Google's Colab](0X_pytorch_in_googles_colab.ipynb)
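
A minimal sketch of how code can run with or without a GPU. The variable names `device`, `x`, and `y` are illustrative and not taken from the tutorial material; the idea is to pick the device once and move tensors to it.

```python
import torch

# Use the GPU when PyTorch can see one, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Tensors (and models) are moved to the chosen device with .to(device).
x = torch.randn(2, 3).to(device)
y = x + x  # the result lives on the same device as its inputs

print(device, y.shape)
```

The same script then works unchanged on a laptop and in a Colab GPU runtime.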

# Agenda

See README.md

# PyTorch Overview


> "PyTorch - Tensors and Dynamic neural networks in Python
> with strong GPU acceleration.
> PyTorch is a deep learning framework for fast, flexible experimentation."
>
> -- https://pytorch.org/*

This was the tagline prior to PyTorch 1.0.
Now it's:

> "PyTorch - From Research To Production
> 
> An open source deep learning platform that provides a seamless path from research prototyping to production deployment."

## "Build by run" - what is that and why do I care?

![](img/dynamic_graph.gif)

This is a much better explanation of PyTorch (I think)

In [1]:
import torch
from IPython.core.debugger import set_trace

def f(x):
    res = x + x
    # set_trace()  # <-- OMG! =D
    return res

x = torch.randn(1, 10)
f(x)

tensor([[-2.5449, -1.8257, -1.2288, -0.8771, -0.7182,  3.1249,  0.8576,  0.7684,
          2.4977, -2.2945]])
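
One way to see what building the graph while running means in practice is a sketch like the following. The function `g` and its threshold of 10 are made up for illustration. The loop count depends on the data, and autograd records only the operations that actually ran.

```python
import torch


def g(x):
    # Ordinary Python control flow: how often the loop runs depends on
    # the values in x, so the graph is built anew on every call.
    res = x
    while res.norm() < 10:
        res = res * 2
    return res.sum()


x = torch.ones(1, 4, requires_grad=True)
out = g(x)

# Backpropagate through exactly the operations that were executed.
out.backward()
print(out.item(), x.grad)
```

Here the input has norm 2, so it is doubled three times before the loop stops, and each element of `x` receives a gradient of 8.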

I like PyTorch because:
- "it's just stupid python"
- it is easy to debug
- it has a nice and extensible interface
- it has a research-y feel
- research is often published as a PyTorch project

## A word about TF
TF 2 is about to be released.
- eager by default
- API cleanup
- No more `session.run()`, `tf.control_dependencies()`, `tf.while_loop()`, `tf.cond()`, `tf.global_variables_initializer()`, etc.

## TF and PyTorch
- static vs dynamic
- production vs prototyping 

## *"The tyranny of choice"*
- TensorFlow
- MXNet
- Keras
- CNTK
- Chainer
- caffe
- caffe2
- many many more

All of them are good!


# References
- Twitter: https://twitter.com/PyTorch
- Forum: https://discuss.pytorch.org/
- Tutorials: https://pytorch.org/tutorials/
- Examples: https://github.com/pytorch/examples
- API Reference: https://pytorch.org/docs/stable/index.html
- Torchvision: https://pytorch.org/docs/stable/torchvision/index.html
- PyTorch Text: https://github.com/pytorch/text
- PyTorch Audio: https://github.com/pytorch/audio
- AllenNLP: https://allennlp.org/
- Object detection/segmentation: https://github.com/facebookresearch/maskrcnn-benchmark
- Facebook AI Research Sequence-to-Sequence Toolkit written in PyTorch: https://github.com/pytorch/fairseq
- FastAI: http://www.fast.ai/
- Stanford CS230 Deep Learning notes: https://cs230-stanford.github.io

# Example Network
Just to get an idea of what PyTorch feels like, here are some examples of networks.

In [2]:
from collections import OrderedDict

import torch                     # basic tensor functions
import torch.nn as nn            # everything neural network
import torch.nn.functional as F  # functional/stateless version of nn
import torch.optim as optim      # optimizers :)
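
As a rough sketch of how these four modules fit together, here is a single optimization step. The tiny linear model, the random inputs, and the labels are made up for illustration and are not part of the tutorial.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

torch.manual_seed(0)

model = nn.Linear(4, 2)                            # hypothetical tiny model
optimizer = optim.SGD(model.parameters(), lr=0.1)  # plain SGD

x = torch.randn(8, 4)               # made-up inputs
target = torch.randint(0, 2, (8,))  # made-up class labels

w_before = model.weight.detach().clone()
loss_before = F.cross_entropy(model(x), target)

optimizer.zero_grad()   # clear old gradients
loss_before.backward()  # compute gradients of the loss
optimizer.step()        # update the parameters in place

loss_after = F.cross_entropy(model(x), target)
print(loss_before.item(), loss_after.item())
```

`nn` provides the layer, `F` the stateless loss function, and `optim` the update rule; `torch` itself supplies the tensors.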

In [3]:
# Simple sequential model
model = nn.Sequential(
    nn.Conv2d(in_channels=1, out_channels=20, kernel_size=5),
    nn.ReLU(),
    nn.Conv2d(20, 64, 5),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
)

In [4]:
model

Sequential(
  (0): Conv2d(1, 20, kernel_size=(5, 5), stride=(1, 1))
  (1): ReLU()
  (2): Conv2d(20, 64, kernel_size=(5, 5), stride=(1, 1))
  (3): ReLU()
  (4): AdaptiveAvgPool2d(output_size=1)
)

In [5]:
# forward pass
model(torch.rand(16, 1, 32, 32)).shape

torch.Size([16, 64, 1, 1])
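
To see where that output shape comes from, the layers can be applied one at a time. This is a sketch that rebuilds the same model so it runs on its own; the list `shapes` is only there for illustration.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(in_channels=1, out_channels=20, kernel_size=5),
    nn.ReLU(),
    nn.Conv2d(20, 64, 5),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
)

x = torch.rand(16, 1, 32, 32)
shapes = []
# nn.Sequential is iterable, so each layer can be called by hand.
for layer in model:
    x = layer(x)
    shapes.append(tuple(x.shape))
    print(f"{layer.__class__.__name__:<18} -> {tuple(x.shape)}")
```

Each unpadded 5x5 convolution shrinks the spatial size by 4 (32 to 28 to 24), and the adaptive average pooling reduces whatever is left to 1x1.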

In [6]:
# Simple sequential model with named layers
layers = OrderedDict([
    ("conv1", nn.Conv2d(in_channels=1, out_channels=20, kernel_size=5)),
    ("relu1", nn.ReLU()),
    ("conv2", nn.Conv2d(20, 64, 5)),
    ("relu2", nn.ReLU()),
    ("aavgp", nn.AdaptiveAvgPool2d(1)),
])
model = nn.Sequential(layers)
model

Sequential(
  (conv1): Conv2d(1, 20, kernel_size=(5, 5), stride=(1, 1))
  (relu1): ReLU()
  (conv2): Conv2d(20, 64, kernel_size=(5, 5), stride=(1, 1))
  (relu2): ReLU()
  (aavgp): AdaptiveAvgPool2d(output_size=1)
)
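
Naming the layers pays off when you want to inspect or reuse parts of the model. A small sketch, rebuilding the same named model so it runs on its own (`names` and `param_names` are illustrative variables):

```python
from collections import OrderedDict

import torch.nn as nn

model = nn.Sequential(OrderedDict([
    ("conv1", nn.Conv2d(in_channels=1, out_channels=20, kernel_size=5)),
    ("relu1", nn.ReLU()),
    ("conv2", nn.Conv2d(20, 64, 5)),
    ("relu2", nn.ReLU()),
    ("aavgp", nn.AdaptiveAvgPool2d(1)),
]))

# Layers are available as attributes under the names given above ...
print(model.conv1)

# ... and integer indexing still works.
first_layer = model[0]

# The names also show up when listing submodules and parameters.
names = [name for name, _ in model.named_children()]
param_names = [name for name, _ in model.named_parameters()]
print(names)
print(param_names)
```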

In [7]:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=16, kernel_size=5)
        self.fc1 = nn.Linear(in_features=16 * 5 * 5, out_features=120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
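        # conv -> ReLU -> max pool, twice
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        # flatten the feature maps to (batch, 16 * 5 * 5) for the linear layers
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        # fc3 already yields a (batch, 10) tensor; no 2d pooling applies here
        x = self.fc3(x)
        return x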


model = Net()
model

Net(
  (conv1): Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1))
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (fc1): Linear(in_features=400, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)
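
The `16 * 5 * 5` input size of `fc1` follows from the spatial sizes of the feature maps. The sketch below traces them with standalone layers, assuming 32x32 RGB inputs; that input size is an assumption chosen because it makes the numbers work out, it is not stated above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Same layer settings as in Net, created standalone for the walkthrough.
conv1 = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=5)
pool = nn.MaxPool2d(2, 2)
conv2 = nn.Conv2d(in_channels=6, out_channels=16, kernel_size=5)

x = torch.rand(4, 3, 32, 32)   # assumed input: batch of 4 RGB 32x32 images

h1 = pool(F.relu(conv1(x)))    # 32 -> 28 (conv) -> 14 (pool)
h2 = pool(F.relu(conv2(h1)))   # 14 -> 10 (conv) -> 5 (pool)
flat = h2.view(-1, 16 * 5 * 5) # 16 channels * 5 * 5 = 400 features

print(h1.shape, h2.shape, flat.shape)
```

With a different input resolution the flattened size changes, and `fc1` would need a matching `in_features`.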

# Versions

In [8]:
import torch
torch.__version__

'1.3.0'

In [9]:
import torchvision
torchvision.__version__

'0.4.1a0+d94043a'

In [10]:
import numpy as np
np.__version__

'1.17.3'