In [1]:
from __future__ import print_function
import torch

<img src="slide1.jpg">

<img src="firmen.png">

<img src="torch.png">

PyTorch is usually used either as:

- a replacement for numpy that uses the power of GPUs
- a deep learning research platform that provides maximum flexibility and speed


PyTorch is a Python package that provides two high-level features:

- Tensor computation (like numpy) with strong GPU acceleration
- Deep Neural Networks built on a tape-based autodiff system

You can reuse your favorite Python packages such as numpy, scipy and Cython to extend PyTorch when needed.
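As a minimal sketch of how numpy and PyTorch can be combined, the example below wraps a numpy array as a tensor and converts it back. The variable names are illustrative only; `torch.from_numpy` and `Tensor.numpy()` share memory with the array for CPU tensors, so no data is copied.

```python
import numpy as np
import torch

# Build a numpy array and wrap it as a tensor; from_numpy shares the memory.
arr = np.arange(6, dtype=np.float32).reshape(2, 3)
t = torch.from_numpy(arr)

# An in-place change on the tensor is visible through the array as well.
t.mul_(2)

# Convert back to numpy; for a CPU tensor this is again a view, not a copy.
back = t.numpy()
print(arr)
print(np.shares_memory(back, arr))
```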

With PyTorch, we use a technique called reverse-mode auto-differentiation, which allows you to change the way your network behaves arbitrarily with zero lag or overhead. Our inspiration comes from several research papers on this topic, as well as current and past work such as torch-autograd, autograd, Chainer, etc.

<img src="graph.gif">

In [None]:
Building the graph dynamically makes debugging easier, but it is not as performant as static approaches.


<img src="package.png">


If you use numpy, then you have used Tensors (a.k.a. ndarray).

<img src="tensor_illustration.png">

PyTorch provides Tensors that can live either on the CPU or the GPU; running on the GPU can accelerate computation considerably.
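A minimal sketch of moving a tensor between devices, assuming a machine that may or may not have a CUDA GPU. The fallback to the CPU is a choice made for this example so that it runs everywhere.

```python
import torch

# Pick the GPU if one is available, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.rand(5, 3)          # created on the CPU
x_dev = x.to(device)          # copied to the chosen device (no-op on CPU)
y = (x_dev * 2).to("cpu")     # compute on the device, bring the result back

print(x_dev.device, y.device)
```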

For the broadcasting rules, see: http://pytorch.org/docs/master/notes/broadcasting.html

Two tensors are “broadcastable” if the following rules hold:

- Each tensor has at least one dimension.
- When iterating over the dimension sizes, starting at the trailing dimension, the dimension sizes must either be equal, one of them is 1, or one of them does not exist.
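The two rules can be checked on a few shapes. This is only a sketch with arbitrarily chosen sizes: `(5, 3)` with `(5, 1)` and `(5, 3)` with `(3,)` are broadcastable, while `(5, 3)` with `(5, 2)` is not, because the trailing sizes 3 and 2 differ and neither is 1.

```python
import torch

x = torch.ones(5, 3)
y = torch.ones(5, 1)   # trailing dims: 3 vs 1 (one of them is 1), then 5 vs 5
z = torch.ones(3)      # trailing dims: 3 vs 3, leading dim does not exist
w = torch.ones(5, 2)   # trailing dims: 3 vs 2, neither equal nor 1

shape_xy = tuple((x + y).shape)
shape_xz = tuple((x + z).shape)

# Adding non-broadcastable tensors raises a RuntimeError.
try:
    x + w
    failed = False
except RuntimeError:
    failed = True

print(shape_xy, shape_xz, failed)
```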

# Tensors

PyTorch Tensors behave very similarly to numpy.ndarrays.

torch.Tensor is an alias for the default tensor type (torch.FloatTensor).

In [21]:
# allocates a 5x5 tensor without initializing it; the values are whatever was in memory
x = torch.Tensor(5, 5)
print(x)

tensor(1.00000e-37 *
       [[ 3.4812,  0.0000,  0.0000,  0.0000,  0.0000],
        [ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000],
        [ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000],
        [ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000],
        [ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000]])


In [24]:
x = torch.rand(5, 3)
print(x)


tensor([[ 0.6274,  0.7488,  0.5962],
        [ 0.6923,  0.0584,  0.5398],
        [ 0.9051,  0.9494,  0.0254],
        [ 0.5413,  0.6173,  0.9948],
        [ 0.9108,  0.8124,  0.4479]])


In [25]:
y = torch.rand(5, 1)
print(y)

tensor([[ 0.0308],
        [ 0.3140],
        [ 0.8772],
        [ 0.0796],
        [ 0.4638]])


In [26]:
# y has shape (5, 1) and is broadcast across the three columns of x (5, 3)
z = y * x
print(z)

tensor([[ 0.0194,  0.0231,  0.0184],
        [ 0.2174,  0.0183,  0.1695],
        [ 0.7939,  0.8327,  0.0223],
        [ 0.0431,  0.0491,  0.0791],
        [ 0.4224,  0.3767,  0.2077]])


In [2]:
a = torch.tensor([[1, 2, 3], [4, 5, 6]])
b = a.pow(2)
print(a)
print(b)

tensor([[ 1,  2,  3],
        [ 4,  5,  6]])
tensor([[  1,   4,   9],
        [ 16,  25,  36]])


<img src="tensor.png">

In [34]:
tensor = torch.ones((2,), dtype=torch.int8)
data = [[0, 1], [2, 3]]
print(tensor.new_tensor(data))


tensor([[ 0,  1],
        [ 2,  3]], dtype=torch.int8)


# Autograd

- Automatic differentiation
- No session: the code defines the graph (define-by-run)

## Variable

![](http://pytorch.org/tutorials/_images/Variable.png)

- Wrapper for tensors
- central interface to PyTorch
- holds the methods for working with gradients


In [16]:
# x = torch.autograd.Variable(torch.ones(2, 2), requires_grad=True)  # deprecated
x = torch.ones((2, 3, 4), requires_grad=True)
print(x)

tensor([[[ 1.,  1.,  1.,  1.],
         [ 1.,  1.,  1.,  1.],
         [ 1.,  1.,  1.,  1.]],

        [[ 1.,  1.,  1.,  1.],
         [ 1.,  1.,  1.,  1.],
         [ 1.,  1.,  1.,  1.]]])


In [17]:
f = x + 2
print(f)

tensor([[[ 3.,  3.,  3.,  3.],
         [ 3.,  3.,  3.,  3.],
         [ 3.,  3.,  3.,  3.]],

        [[ 3.,  3.,  3.,  3.],
         [ 3.,  3.,  3.,  3.],
         [ 3.,  3.,  3.,  3.]]])


In [7]:
x = torch.randn(5, 5)  # requires_grad=False by default
y = torch.randn(5, 5)  # requires_grad=False by default
z = torch.randn((5, 5), requires_grad=True)
a = x + y
print(a.requires_grad)



False


In [8]:
b = a + z
print(b.requires_grad)


True
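Once a result depends on a tensor with `requires_grad=True`, gradients can be computed by calling `backward()` on a scalar. The sketch below uses a small 2x2 tensor chosen for this example; for `out = mean(3 * (x + 2)^2)` the gradient with respect to each element is `6 * (x + 2) / 4`, which is 4.5 at `x = 1`.

```python
import torch

x = torch.ones((2, 2), requires_grad=True)
f = x + 2                      # f inherits requires_grad from x
out = (f * f * 3).mean()       # reduce to a scalar

out.backward()                 # fills x.grad with d(out)/dx
print(x.grad)
```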


# Neural Networks

- [```torch.nn```](http://pytorch.org/docs/master/nn.html)
- ```nn.Module```: 
    - Base class of all neural networks
- ```nn.Parameter```: 
    - Similar to ```autograd.Variable```; registers itself automatically when assigned inside a class that inherits from ```nn.Module```
    - All defined parameters can be retrieved via the method ```net.parameters()```
- ```autograd.Function```: 
    - Implements the forward paths in the neural network.
    - Each operation on an ```autograd.Variable``` is represented by at least one such node in the graph of the network
    - The node stores the path and the values needed to perform backpropagation.
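A minimal sketch of how `nn.Parameter` registration works. The `Scale` module and its attribute names are hypothetical and exist only for this example: the attribute wrapped in `nn.Parameter` shows up in `named_parameters()`, the plain tensor does not.

```python
import torch
import torch.nn as nn


class Scale(nn.Module):  # hypothetical module for illustration
    def __init__(self, n):
        super().__init__()
        # Wrapped in nn.Parameter: registered automatically as learnable.
        self.weight = nn.Parameter(torch.ones(n))
        # Plain tensor attribute: not registered as a parameter.
        self.offset = torch.zeros(n)

    def forward(self, x):
        return x * self.weight + self.offset


net = Scale(3)
names = [name for name, _ in net.named_parameters()]
print(names)
```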

torch.nn only supports mini-batches: the entire package expects inputs that are a mini-batch of samples, not a single sample.
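If only a single sample is available, a batch dimension of size 1 can be added with `unsqueeze(0)`. The sketch below assumes a single-channel 32x32 image and a `Conv2d` layer picked for illustration; depending on the PyTorch version, some layers may also accept unbatched input directly.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(1, 6, 5)

sample = torch.randn(1, 32, 32)   # one sample: channels x height x width
batch = sample.unsqueeze(0)       # add a batch dimension: 1 x 1 x 32 x 32

out = conv(batch)
print(batch.shape, out.shape)
```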

A typical training procedure for a neural network is as follows:

- Define the neural network that has some learnable parameters (or weights)
- Iterate over a dataset of inputs
- Process input through the network
- Compute the loss
- Propagate gradients back into the network’s parameters
- Update the weights of the network, typically using a simple update rule: weight = weight - learning_rate * gradient

PyTorch can take care of:

- computing the loss
- backpropagation
- updating the weights
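The training procedure above can be sketched on a toy problem. Everything here is an assumption made for the example: a single `nn.Linear` layer stands in for the network, the data is random, and the learning rate and number of steps are arbitrary. The update is written out by hand to match the rule `weight = weight - learning_rate * gradient`.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.Linear(4, 1)            # stand-in for a real network
criterion = nn.MSELoss()
learning_rate = 0.05

inputs = torch.randn(8, 4)         # a mini-batch of 8 random samples
targets = torch.randn(8, 1)

losses = []
for step in range(20):
    output = model(inputs)                 # process input through the network
    loss = criterion(output, targets)      # compute the loss
    model.zero_grad()                      # clear gradients from the last step
    loss.backward()                        # propagate gradients back
    with torch.no_grad():                  # update without tracking history
        for p in model.parameters():
            p -= learning_rate * p.grad
    losses.append(loss.item())

print(losses[0], losses[-1])
```

In practice the manual update loop is usually replaced by an optimizer from `torch.optim`, for example `torch.optim.SGD`.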


In [18]:
import torch
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        # 1 input image channel, 6 output channels, 5x5 square convolution
        # kernel
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        # an affine operation: y = Wx + b
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        # Max pooling over a (2, 2) window
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        # If the size is a square you can specify just a single number
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = x.view(-1, self.num_flat_features(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

    def num_flat_features(self, x):
        size = x.size()[1:]  # all dimensions except the batch dimension
        num_features = 1
        for s in size:
            num_features *= s
        return num_features


net = Net()
print(net)


Net(
  (conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (fc1): Linear(in_features=400, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)


In [19]:
input = torch.randn(1, 1, 32, 32)
out = net(input)
print(out)

tensor([[-0.0743,  0.1011,  0.0975,  0.0236,  0.0713,  0.0882, -0.0971,
         -0.0216,  0.0827,  0.0792]])
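To see where the `16 * 5 * 5` input size of `fc1` comes from, the shapes can be traced through the convolution and pooling steps. This sketch rebuilds only the two convolution layers with the same arguments as in `Net`; the `shapes` list is a helper introduced for this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

conv1 = nn.Conv2d(1, 6, 5)
conv2 = nn.Conv2d(6, 16, 5)

x = torch.randn(1, 1, 32, 32)
shapes = [tuple(x.shape)]

# 32x32 -> 28x28 after the 5x5 convolution -> 14x14 after 2x2 pooling
x = F.max_pool2d(F.relu(conv1(x)), 2)
shapes.append(tuple(x.shape))

# 14x14 -> 10x10 after the 5x5 convolution -> 5x5 after 2x2 pooling
x = F.max_pool2d(F.relu(conv2(x)), 2)
shapes.append(tuple(x.shape))

# Flatten everything except the batch dimension: 16 * 5 * 5 = 400 features
flat = x.view(x.size(0), -1)
shapes.append(tuple(flat.shape))

for s in shapes:
    print(s)
```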
