# Learn PyTorch

In [1]:
import torch

## 1. Init

- `torch.zeros`: Creates a tensor filled with zeros
- `torch.ones`: Creates a tensor filled with ones
- `torch.rand`: Creates a tensor with random values uniformly sampled between 0 and 1
- `torch.randn`: Creates a tensor with random values sampled from a normal distribution with mean 0 and variance 1
- `torch.arange`: Creates a tensor containing evenly spaced values from a start value (default 0) up to, but not including, an end value
- `torch.Tensor(input list)`: Creates a tensor from the list elements you provide
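
As a rough illustration of the constructors listed above, the sketch below calls each one once. The shapes and the `arange` arguments are arbitrary choices made for this example. It also uses the lowercase `torch.tensor` factory, which is not mentioned above: unlike `torch.Tensor`, it infers the dtype from the data.

```python
import torch

zeros = torch.zeros(2, 3)        # 2x3 tensor of zeros (float32 by default)
ones = torch.ones(2, 3)          # 2x3 tensor of ones
uniform = torch.rand(2, 3)       # samples from the uniform distribution on [0, 1)
normal = torch.randn(2, 3)       # samples from the standard normal distribution
steps = torch.arange(2, 10, 2)   # start=2, end=10 (exclusive), step=2

# torch.Tensor always yields the default float dtype,
# while torch.tensor keeps the integer dtype of the input list.
as_float = torch.Tensor([[1, 2], [3, 4]])
as_int = torch.tensor([[1, 2], [3, 4]])

print(steps)            # tensor([2, 4, 6, 8])
print(as_float.dtype)   # torch.float32
print(as_int.dtype)     # torch.int64
```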

In [2]:
torch.Tensor([[1, 2], [3, 4]])

tensor([[1., 2.],
        [3., 4.]])

In [3]:
x = torch.rand(2, 3, 4)
x

tensor([[[0.3423, 0.6369, 0.0349, 0.1796],
         [0.3723, 0.9329, 0.1407, 0.4714],
         [0.1003, 0.4271, 0.1500, 0.8458]],

        [[0.4611, 0.1609, 0.0177, 0.1063],
         [0.0812, 0.1075, 0.8238, 0.0881],
         [0.1949, 0.0136, 0.0343, 0.2525]]])

- Shape of tensor:

In [4]:
print(x.shape)
print(x.size())

torch.Size([2, 3, 4])
torch.Size([2, 3, 4])


- Change shape:

In [5]:
x = torch.arange(6)
x

tensor([0, 1, 2, 3, 4, 5])

In [6]:
x.view(2, 3)

tensor([[0, 1, 2],
        [3, 4, 5]])
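
A minimal sketch of two details of `view` that the cell above does not show: passing `-1` lets PyTorch infer one dimension, and a view shares memory with the original tensor. The use of `reshape` for the non-contiguous case is an addition for illustration, not something the notebook covers.

```python
import torch

x = torch.arange(6)      # tensor([0, 1, 2, 3, 4, 5])
v = x.view(-1, 3)        # -1 is inferred from the remaining size: shape (2, 3)

v[0, 0] = 100            # writing through the view...
print(x[0])              # ...also changes the original: tensor(100)

t = v.t()                # the transpose is non-contiguous, so t.view(6) would raise
flat = t.reshape(6)      # reshape falls back to a copy when a view is impossible
print(flat)              # tensor([100,   3,   1,   4,   2,   5])
```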

## 2. Operations

In [7]:
x1 = torch.rand(2, 3)
x2 = torch.rand(2, 3)
x1 + x2

tensor([[0.6686, 1.6619, 0.7967],
        [1.6087, 0.6917, 1.7466]])

- `torch.matmul`: Performs the matrix product over two tensors, where the specific behavior depends on the dimensions. If both inputs are matrices (2-dimensional tensors), it performs the standard matrix product. For higher-dimensional inputs, the function supports broadcasting. Can also be written as `a @ b`, similar to NumPy.
- `torch.mm`: Performs the matrix product over two matrices, but doesn’t support broadcasting.
- `torch.bmm`: Performs the matrix product with support for a batch dimension. If the first tensor T is of shape (b x n x m) and the second tensor R is of shape (b x m x p), the output O is of shape (b x n x p) and is calculated by performing b matrix multiplications of the submatrices of T and R.
- `torch.einsum`: Performs matrix multiplications and more (i.e. sums of products) using the Einstein summation convention.
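
The shape rules described above can be checked with a short sketch. The sizes b=5, n=2, m=3, p=4 are arbitrary values picked for this example; `T`, `R` and `O` follow the names used in the `torch.bmm` description.

```python
import torch

T = torch.rand(5, 2, 3)   # batch of b=5 matrices, each n=2 by m=3
R = torch.rand(5, 3, 4)   # batch of b=5 matrices, each m=3 by p=4

O = torch.bmm(T, R)       # one matrix product per batch entry
print(O.shape)            # torch.Size([5, 2, 4])

# torch.matmul broadcasts a single 2-D matrix across the batch dimension;
# torch.mm would reject the 3-D input T.
W = torch.rand(3, 4)
print(torch.matmul(T, W).shape)   # torch.Size([5, 2, 4])

# The same batched product written in Einstein notation.
O_einsum = torch.einsum('bnm,bmp->bnp', T, R)
```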

In [8]:
x = torch.arange(6).view(2, 3)
x

tensor([[0, 1, 2],
        [3, 4, 5]])

In [9]:
w = torch.arange(9).view(3, 3)
w

tensor([[0, 1, 2],
        [3, 4, 5],
        [6, 7, 8]])

In [10]:
torch.matmul(x, w), x @ w, torch.einsum('ij,jk->ik', x, w)

(tensor([[15, 18, 21],
         [42, 54, 66]]),
 tensor([[15, 18, 21],
         [42, 54, 66]]),
 tensor([[15, 18, 21],
         [42, 54, 66]]))

## 3. Dynamic Computation Graph and Backpropagation

Given an input, we define our function by manipulating that input, usually by matrix multiplications with weight matrices and additions with so-called bias vectors. As we manipulate our input, we automatically create a computational graph that records how to arrive at the output from the input. PyTorch is a define-by-run framework: we simply perform our manipulations, and PyTorch keeps track of the graph for us. Thus, we create a dynamic computation graph along the way.
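
One way to see what "define-by-run" means in practice is that ordinary Python control flow decides which operations end up in the graph. The function `f` below is a made-up example, not part of the original notebook: the branch taken depends on the input values, and the gradient follows whichever branch actually ran.

```python
import torch

def f(x):
    # Hypothetical function: the recorded graph depends on the data.
    if x.sum() > 0:
        return (x ** 2).sum()   # gradient is 2 * x
    return (3 * x).sum()        # gradient is 3 for every element

x_pos = torch.tensor([1.0, 2.0], requires_grad=True)
f(x_pos).backward()
print(x_pos.grad)   # tensor([2., 4.])

x_neg = torch.tensor([-1.0, -2.0], requires_grad=True)
f(x_neg).backward()
print(x_neg.grad)   # tensor([3., 3.])
```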

In [11]:
x = torch.arange(3, dtype=torch.float32, requires_grad=True)
x

tensor([0., 1., 2.], requires_grad=True)

In [12]:
a = x + 2
b = a ** 2
c = b + 3
y = c.mean()
y

tensor(12.6667, grad_fn=<MeanBackward0>)

![figure 1](media/pytorch_computation_graph.svg)

In [13]:
y.backward()

In [14]:
print(x.grad)

tensor([1.3333, 2.0000, 2.6667])
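
The gradient above can be checked by hand. Since y is the mean over three elements of (x_i + 2)^2 + 3, the chain rule gives dy/dx_i = 2 (x_i + 2) / 3. The sketch below recomputes the same quantities and compares the hand-derived formula with what autograd stored in `x.grad`; the variable `expected` is introduced only for this comparison.

```python
import torch

x = torch.arange(3, dtype=torch.float32, requires_grad=True)
y = ((x + 2) ** 2 + 3).mean()
y.backward()

# Hand-derived gradient: dy/dx_i = 2 * (x_i + 2) / 3,
# where 3 is the number of elements averaged by mean().
expected = 2 * (x.detach() + 2) / x.numel()

print(expected)                           # tensor([1.3333, 2.0000, 2.6667])
print(torch.allclose(x.grad, expected))   # True
```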


## 4. Model

The package `torch.nn` defines a series of useful classes such as linear network layers, activation functions, and loss functions.
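
As a small, hedged illustration of those three kinds of classes, the sketch below combines one layer, one activation function, and one loss function. The choice of `nn.ReLU` and `nn.MSELoss`, the sizes, and the all-zero targets are assumptions made for this example only.

```python
import torch
import torch.nn as nn

layer = nn.Linear(in_features=3, out_features=2)   # computes x @ W.T + b
act_fn = nn.ReLU()                                 # element-wise max(0, x)
loss_fn = nn.MSELoss()                             # mean squared error

x = torch.rand(5, 3)         # hypothetical batch: 5 samples, 3 features
target = torch.zeros(5, 2)   # hypothetical targets

out = act_fn(layer(x))
loss = loss_fn(out, target)

print(out.shape)    # torch.Size([5, 2])
print(loss.shape)   # torch.Size([]), i.e. a scalar
```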

In [15]:
import torch.nn as nn

To illustrate, we use the following small neural network:

![figure 2](media/small_neural_network.svg)

### 4.1 Modules
A neural network is built up out of modules. Modules can contain other modules, and a neural network is considered to be a module itself as well. The basic template of a module is as follows:

In [16]:
class SimpleClassifier(nn.Module):
    def __init__(self, num_inputs, num_hidden, num_outputs):
        super().__init__()
        self.linear1 = nn.Linear(num_inputs, num_hidden)
        self.act_fn = nn.Tanh()
        self.linear2 = nn.Linear(num_hidden, num_outputs)

    def forward(self, x):
        x = self.linear1(x)
        x = self.act_fn(x)
        x = self.linear2(x)
        return x

In [17]:
model = SimpleClassifier(num_inputs=2, num_hidden=4, num_outputs=1)
print(model)

SimpleClassifier(
  (linear1): Linear(in_features=2, out_features=4, bias=True)
  (act_fn): Tanh()
  (linear2): Linear(in_features=4, out_features=1, bias=True)
)


In [18]:
for name, param in model.named_parameters():
    print(f"Parameter {name}, shape {param.shape}")

Parameter linear1.weight, shape torch.Size([4, 2])
Parameter linear1.bias, shape torch.Size([4])
Parameter linear2.weight, shape torch.Size([1, 4])
Parameter linear2.bias, shape torch.Size([1])
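
A minimal sketch of how such a module might be used: calling the model on a batch runs `forward`, and the parameter shapes printed above add up to 17 trainable values. The class is repeated so the block runs on its own; the batch size of 8 and the names `batch` and `num_params` are arbitrary choices for this example.

```python
import torch
import torch.nn as nn

class SimpleClassifier(nn.Module):
    """Same two-layer network as defined above."""

    def __init__(self, num_inputs, num_hidden, num_outputs):
        super().__init__()
        self.linear1 = nn.Linear(num_inputs, num_hidden)
        self.act_fn = nn.Tanh()
        self.linear2 = nn.Linear(num_hidden, num_outputs)

    def forward(self, x):
        return self.linear2(self.act_fn(self.linear1(x)))

model = SimpleClassifier(num_inputs=2, num_hidden=4, num_outputs=1)

batch = torch.rand(8, 2)    # hypothetical batch: 8 samples, 2 features
with torch.no_grad():       # no graph is needed for a plain forward pass
    out = model(batch)      # calling the module invokes forward()
print(out.shape)            # torch.Size([8, 1])

# 4*2 weights + 4 biases + 1*4 weights + 1 bias
num_params = sum(p.numel() for p in model.parameters())
print(num_params)           # 17
```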


- Alternative: build the same network with `nn.Sequential`:

In [19]:
model = nn.Sequential(
    nn.Linear(2, 4),
    nn.Tanh(),
    nn.Linear(4, 1),
)
print(model)

Sequential(
  (0): Linear(in_features=2, out_features=4, bias=True)
  (1): Tanh()
  (2): Linear(in_features=4, out_features=1, bias=True)
)
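
The numbers in parentheses in the printout are the indices of the submodules. A short sketch, assuming an arbitrary batch of 8 samples, shows that an `nn.Sequential` model can be indexed like a list and that calling it is the same as applying its layers one after another.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(2, 4),
    nn.Tanh(),
    nn.Linear(4, 1),
)

x = torch.rand(8, 2)   # hypothetical batch: 8 samples, 2 features
out = model(x)         # runs the layers in order

# Applying the indexed submodules by hand gives the same result.
manual = model[2](model[1](model[0](x)))

print(out.shape)                     # torch.Size([8, 1])
print(torch.allclose(out, manual))   # True
print(model[0])                      # Linear(in_features=2, out_features=4, bias=True)
```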
