![pytorch](https://upload.wikimedia.org/wikipedia/commons/9/96/Pytorch_logo.png)

### Useful Links:

* pytorch official documentation
http://pytorch.org/docs/master/index.html

* pytorch discussion
https://discuss.pytorch.org/

* pytorch official tutorials
https://pytorch.org/tutorials/



## Preliminaries

In [1]:
import numpy as np
import torch
import torchvision

In [2]:
# print versions
from platform import python_version
print(f"python version: {python_version()}")
print(f"torch version: {torch.__version__}")
print(f"torchvision version: {torchvision.__version__}")

python version: 3.7.12
torch version: 1.10.0+cu111
torchvision version: 0.11.1+cu111


In [3]:
# set random seeds
torch.manual_seed(42)
np.random.seed(42)
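
To illustrate what seeding provides, here is a minimal sketch (the names `first` and `second` are only illustrative). It resets the seed and checks that the same random tensor is drawn again. It assumes CPU tensors; identical numbers across devices or PyTorch versions are not guaranteed.

```python
import torch

# Re-seeding the global generator restarts the random sequence.
torch.manual_seed(42)
first = torch.rand(3)

torch.manual_seed(42)
second = torch.rand(3)

# Both draws start from the same seed, so they match element by element.
print(torch.equal(first, second))
```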

##  Tensors

One of the main data types in PyTorch is the 'tensor'.
We will start with the concept of a tensor and how it is used in PyTorch.

![](https://github.com/lyubonko/ucu2021cv/blob/master/assignments/fig/tensors.jpg?raw=true)

### Tensor Initialization

In [4]:
tensor_inits = [

  # initialization of 1d tensor of size 64 of type float32 (default)
  # (torch.empty does not initialize the memory, so the values are arbitrary)
  torch.empty(64),

  # initialize with array [0,1,...,63]
  torch.arange(0,64),

  # tensor with all zeros
  torch.zeros(8, 8, dtype=torch.long),

  # tensor with all ones
  torch.ones(8, 8, dtype=torch.float32),

  # random tensor with values in range [0,1)
  torch.rand((3,3,3)),

  # tensor with random int values
  torch.randint(10, (2,2))
]

for v in tensor_inits:
  print(f" * the first 2 elements are: \n \t{v[:2]} ")
  print(f"   the type of this tensor is {v.dtype}")
  print(f"   the size is {v.size()} \n")



 * the first 2 elements are: 
 	tensor([1.6936e-27, 3.0802e-41]) 
   the type of this tensor is torch.float32
   the size is torch.Size([64]) 

 * the first 2 elements are: 
 	tensor([0, 1]) 
   the type of this tensor is torch.int64
   the size is torch.Size([64]) 

 * the first 2 elements are: 
 	tensor([[0, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0]]) 
   the type of this tensor is torch.int64
   the size is torch.Size([8, 8]) 

 * the first 2 elements are: 
 	tensor([[1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1.]]) 
   the type of this tensor is torch.float32
   the size is torch.Size([8, 8]) 

 * the first 2 elements are: 
 	tensor([[[0.8823, 0.9150, 0.3829],
         [0.9593, 0.3904, 0.6009],
         [0.2566, 0.7936, 0.9408]],

        [[0.1332, 0.9346, 0.5936],
         [0.8694, 0.5677, 0.7411],
         [0.4294, 0.8854, 0.5739]]]) 
   the type of this tensor is torch.float32
   the size is torch.Size([3, 3, 3]) 

 * the first 2 elements 

In [5]:
# initialize with array all ones
x = torch.ones(8, 8, dtype=torch.float)

print(f" * the size of the 'x' is: \n {x.size()} \n")
print(f" * the size of the 'x' can also be obtained with the 'shape' attribute, familiar from numpy: \n {x.shape}")

 * the size of the 'x' is: 
 torch.Size([8, 8]) 

 * the size of the 'x' can also be obtained with the 'shape' attribute, familiar from numpy: 
 torch.Size([8, 8])
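
A few related ways to query a tensor's shape, as a small sketch (the names `rows` and `cols` are illustrative). It relies on `torch.Size` being a subclass of `tuple`, so it can be unpacked like one.

```python
import torch

x = torch.ones(8, 8, dtype=torch.float)

# torch.Size behaves like a tuple, so it can be unpacked.
rows, cols = x.shape
print(rows, cols)   # size along each dimension
print(x.size(0))    # size of a single dimension
print(x.dim())      # number of dimensions
print(x.numel())    # total number of elements
```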


#### <font color="red">**[PROBLEM I]:** </font>

-----

 <font color="red"> Initialize X </font>     
 <font color="red"> 3d Tensor of size (4,4,4) </font>   
 <font color="red"> of type int32 with all elements equal to 10 </font>   

-----

In [7]:
# YOUR CODE HERE
X = None 
X = 10 * torch.ones(4, 4, 4, dtype=torch.int32) # !DEL

In [8]:
X.shape

torch.Size([4, 4, 4])

### Reshaping, broadcasting

Tensor reshaping is done with the 'view' method:

In [31]:
a = torch.tensor([[1,2], [3,4]])
a_reshaped = a.view(4) # reshape into one-dimensional tensor of size 4

print(a)
print(a_reshaped)

tensor([[1, 2],
        [3, 4]])
tensor([1, 2, 3, 4])
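
Two properties of 'view' are worth seeing in action: a dimension given as `-1` is inferred from the number of elements, and the result shares storage with the original tensor. The sketch below also shows a simple case of broadcasting, where a tensor of shape (3,) is added to one of shape (2, 3). The tensor names are illustrative. Note that 'view' only works when the requested shape is compatible with the tensor's memory layout; `torch.reshape` copies the data when it is not.

```python
import torch

a = torch.arange(6)   # tensor([0, 1, 2, 3, 4, 5])

# -1 asks PyTorch to infer this dimension from the number of elements.
b = a.view(2, -1)     # shape (2, 3)

# A view shares storage with the original tensor:
# writing through 'b' is visible in 'a'.
b[0, 0] = 100
print(a)

# Broadcasting: the (3,) tensor is virtually repeated along the
# missing leading dimension to match the (2, 3) tensor.
row = torch.tensor([1, 2, 3])
c = b + row
print(c.shape)
```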


#### <font color="red">**[PROBLEM II]:** </font>

-----

 <font color="red"> Use command 'view' to reshape **v** and **X** (obtained in PROBLEM I) into 2d <ins>*square*</ins> tensors  **v_** and **X_**. </font>  

  <font color="red">Convert all tensors to type **int16** </font>

 <font color="red"> Perform addition of these reshaped tensors, namely calculate **sum_tensors** = **v_** + **X_** + **x** </font>  

 <font color="red"> Finally display the result. </font>

-----

In [11]:
v = torch.ones(64)

# YOUR CODE HERE (replace 'None')
v_ = None  
X_ = None 
sum_tensors = None 
 
v_ = v.view((8,8)) # !DEL
X_ = X.view((8,8)) # !DEL
sum_tensors = v_ + X_ # !DEL
print(sum_tensors)

tensor([[11., 11., 11., 11., 11., 11., 11., 11.],
        [11., 11., 11., 11., 11., 11., 11., 11.],
        [11., 11., 11., 11., 11., 11., 11., 11.],
        [11., 11., 11., 11., 11., 11., 11., 11.],
        [11., 11., 11., 11., 11., 11., 11., 11.],
        [11., 11., 11., 11., 11., 11., 11., 11.],
        [11., 11., 11., 11., 11., 11., 11., 11.],
        [11., 11., 11., 11., 11., 11., 11., 11.]])


### Operations on Tensors

relevant tutorial
https://pytorch.org/tutorials/beginner/blitz/tensor_tutorial.html#operations

Most operations can be written in several equivalent ways. To start, let us look at the 'addition' operation.

In [12]:
x = torch.randint(10, (4, 4))
y = torch.ones(4,4)

In [13]:
print(torch.add(x, y))
print(x + y)
result = torch.empty_like(y)
torch.add(x, y, out=result)
print(result)

tensor([[ 1.,  6., 10.,  4.],
        [ 5., 10.,  7.,  3.],
        [ 1.,  7.,  3.,  8.],
        [10.,  8.,  4.,  4.]])
tensor([[ 1.,  6., 10.,  4.],
        [ 5., 10.,  7.,  3.],
        [ 1.,  7.,  3.,  8.],
        [10.,  8.,  4.,  4.]])
tensor([[ 1.,  6., 10.,  4.],
        [ 5., 10.,  7.,  3.],
        [ 1.,  7.,  3.,  8.],
        [10.,  8.,  4.,  4.]])


In-place addition (methods with a trailing underscore modify the tensor they are called on):

In [14]:
print(y)
y.add_(x)
print(y)

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])
tensor([[ 1.,  6., 10.,  4.],
        [ 5., 10.,  7.,  3.],
        [ 1.,  7.,  3.,  8.],
        [10.,  8.,  4.,  4.]])
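
The difference between the out-of-place and in-place forms can be checked directly. The following is a minimal sketch (the name `ptr_before` is illustrative): `x + y` allocates a new tensor and promotes the mixed integer and float inputs to float, while `add_` writes into the storage that `y` already owns.

```python
import torch

x = torch.randint(10, (4, 4))   # int64
y = torch.ones(4, 4)            # float32

# Out-of-place addition allocates a new tensor; mixed dtypes are promoted.
z = x + y
print(z.dtype)

# In-place addition (trailing underscore) reuses y's existing storage.
ptr_before = y.data_ptr()
y.add_(x)
print(y.data_ptr() == ptr_before)
print(torch.equal(y, z))
```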


Learn about other operations: https://pytorch.org/tutorials/beginner/blitz/tensor_tutorial.html#operations

### Numpy bridge

In [15]:
# create numpy array
a = np.array([[1,2], [3,4]])
# transform numpy array into torch.Tensor
b = torch.from_numpy(a)
# make operation on this Tensor (in this case transpose)
b = b.transpose(1,0)
# transform back to numpy
c = b.numpy()                

print(f"{type(a)} \n {a} \n")
print(f"{type(b)} \n {b} \n")
print(f"{type(c)} \n {c} \n")

<class 'numpy.ndarray'> 
 [[1 2]
 [3 4]] 

<class 'torch.Tensor'> 
 tensor([[1, 3],
        [2, 4]]) 

<class 'numpy.ndarray'> 
 [[1 3]
 [2 4]] 
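
One detail of the bridge that is easy to miss: `torch.from_numpy` does not copy the data, so the tensor and the array share memory. A small sketch contrasting it with `torch.tensor`, which makes a copy (variable names are illustrative):

```python
import numpy as np
import torch

a = np.array([[1, 2], [3, 4]])
b = torch.from_numpy(a)   # shares memory with 'a' (no copy)
c = torch.tensor(a)       # independent copy of the data

# Modify the NumPy array in place.
a[0, 0] = 100

print(b[0, 0].item())     # reflects the change made through 'a'
print(c[0, 0].item())     # unaffected by the change
```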



#### <font color="red">**[PROBLEM III]:** </font>

-----
<span style="color:red"> Using these two random matrices: </span>

In [16]:
x = np.random.randn(3, 10)
y = np.random.randn(4, 10)

<span style="color:red"> Do the following: </span>
* <span style="color:red">transform $\mathbf{x}$ and $\mathbf{y}$ to torch.Tensors</span>
* <span style="color:red">perform matrix multiplication $\mathbf{r1} = \mathbf{x} \cdot \mathbf{y^T} $</span>  
<span style="color:blue"> see the pytorch function http://pytorch.org/docs/master/torch.html#torch.mm </span>  or
<span style="color:blue">  http://pytorch.org/docs/master/torch.html#torch.matmul </span>  
* <span style="color:red">perform element-wise matrix multiplication $\mathbf{r2} = \mathbf{r1} \odot \mathbf{r1} $</span>  
<span style="color:blue"> see the pytorch function http://pytorch.org/docs/master/torch.html#torch.mul </span> 
* <span style="color:red">perform scalar addition and scalar multiplication $\mathbf{r3} = 2 * \mathbf{r2} + 3 $</span>  
* <span style="color:red">transform the result back to numpy </span>

-----

In [17]:
# YOUR CODE HERE (replace 'None')
r1 = None
r2 = None
r3 = None

r1 = torch.from_numpy(x).mm(torch.from_numpy(y).transpose(1,0))
r2 = r1 * r1
r3 = 2 * r2 + 3
r3.numpy()

array([[15.75251238,  4.00731333,  4.7705989 , 22.49291184],
       [ 4.29568683, 76.39517546, 44.09112074,  5.5827972 ],
       [ 3.75285596, 26.46251526,  3.39959147,  5.45615519]])

### CUDA stuff

Let us run on CUDA, if CUDA is available.

In [58]:
torch.cuda.is_available()

True

In [62]:
x = torch.randn(3, 10)
if torch.cuda.is_available():
    device = torch.device("cuda")          # a CUDA device object
    y = torch.ones_like(x, device=device)  # directly create a tensor on GPU
    x = x.to(device)                   
    z = x + y
    print(z) # notice "device='cuda:0'" when we print this part
    print()
    print(z.to("cpu"))

tensor([[-6.6185e-01, -1.5984e-02,  9.9031e-01, -7.4351e-01, -3.7672e-02,
          1.6742e+00,  2.0888e+00,  4.2088e-02, -1.0418e+00,  2.8409e+00],
        [-2.2589e-01, -1.2388e+00,  1.6942e+00,  2.4659e-01, -5.0881e-02,
          2.4103e+00,  2.7051e+00,  2.1444e+00,  7.8683e-01,  2.0191e-01],
        [ 2.6263e+00,  4.5684e-01,  1.1178e+00,  2.5435e+00,  1.2222e+00,
         -1.0217e+00,  1.1737e+00,  2.0074e-03,  6.7857e-01,  1.5274e+00]],
       device='cuda:0')

tensor([[-6.6185e-01, -1.5984e-02,  9.9031e-01, -7.4351e-01, -3.7672e-02,
          1.6742e+00,  2.0888e+00,  4.2088e-02, -1.0418e+00,  2.8409e+00],
        [-2.2589e-01, -1.2388e+00,  1.6942e+00,  2.4659e-01, -5.0881e-02,
          2.4103e+00,  2.7051e+00,  2.1444e+00,  7.8683e-01,  2.0191e-01],
        [ 2.6263e+00,  4.5684e-01,  1.1178e+00,  2.5435e+00,  1.2222e+00,
         -1.0217e+00,  1.1737e+00,  2.0074e-03,  6.7857e-01,  1.5274e+00]])
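
A common way to write code that runs with or without a GPU is to choose the device once and create tensors on it. The following is a sketch of that pattern, not the only way to do it; the name `z_np` is illustrative.

```python
import torch

# Pick the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.randn(3, 10, device=device)   # allocated directly on 'device'
y = torch.ones_like(x)                  # inherits dtype and device from x
z = x + y

print(z.device)

# Move the result to the CPU before converting it to NumPy.
z_np = z.cpu().numpy()
print(z_np.shape)
```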


##  Autograd: automatic differentiation

relevant tutorial
https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html#sphx-glr-beginner-blitz-autograd-tutorial-py


*torch.Tensor* is the central class of the package. If you set its attribute *.requires_grad* to True, it starts to track all operations on it. When you finish your computation, you can call *.backward()* and have all the gradients computed automatically. The gradient for this tensor will be accumulated into its *.grad* attribute.
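
The word "accumulated" matters: each call to *.backward()* adds to *.grad* rather than overwriting it. A minimal sketch of this behaviour, using an illustrative function $x^2$ that does not appear elsewhere in this notebook:

```python
import torch

x = torch.tensor([2.0], requires_grad=True)

# First backward pass: d(x^2)/dx = 2x = 4.
(x ** 2).sum().backward()
grad_first = x.grad.clone()

# Second backward pass without a reset: the new gradient is added.
(x ** 2).sum().backward()
grad_accumulated = x.grad.clone()

# Reset the accumulated gradient in place before the next computation.
x.grad.zero_()

print(grad_first, grad_accumulated, x.grad)
```

This is why training loops usually zero the gradients before each backward pass.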

**use of autograd**

Let's start with a simple example.
Consider the following function:
$$f = (x + y) \cdot z$$

For concreteness, let's take $x=2$, $y=-7$, $z=3$. The 'forward' calculation is shown in <span style="color:green"> green </span> in the image below.

Automatic differentiation provides an elegant tool to calculate the derivatives of $f$ with respect to all variables via the 'backward' pass.

$$f = (x + y) \cdot z = u \cdot z, \quad u = x + y = -5 $$

$$ \frac{\partial f}{\partial u} = z = 3 $$

$$ \frac{\partial f}{\partial z} = u = -5 $$

$$ \frac{\partial f}{\partial x} = \frac{\partial f}{\partial u} \cdot \frac{\partial u}{\partial x} = z = 3$$

$$ \frac{\partial f}{\partial y} = \frac{\partial f}{\partial u} \cdot \frac{\partial u}{\partial y} = z = 3$$

![comp_graph_1](https://github.com/lyubonko/ucu2020cv/blob/master/assignments/fig/comp_graph_1.png?raw=true)

In [63]:
# Create tensors.
# ('requires_grad' is False by default)
x = torch.tensor([2.], requires_grad=True)
y = torch.tensor([-7.], requires_grad=True)
z = torch.tensor([3.], requires_grad=True)

# Build a computational graph.
f = (x + y) * z   

# Compute gradients.
f.backward()

# Print out the gradients.
print(x.grad)    
print(y.grad)    
print(z.grad) 

tensor([3.])
tensor([3.])
tensor([-5.])
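
The gradients printed above can be cross-checked numerically without autograd. The sketch below uses central finite differences in plain Python; the helper `central_difference` and the step size `eps` are illustrative choices, not part of PyTorch.

```python
def f(x, y, z):
    """The function from the example: f = (x + y) * z."""
    return (x + y) * z


def central_difference(func, args, index, eps=1e-6):
    """Approximate the partial derivative of func w.r.t. args[index]."""
    plus = list(args)
    minus = list(args)
    plus[index] += eps
    minus[index] -= eps
    return (func(*plus) - func(*minus)) / (2 * eps)


point = (2.0, -7.0, 3.0)   # x, y, z
numeric_grads = [central_difference(f, point, i) for i in range(3)]
print(numeric_grads)       # approximately df/dx, df/dy, df/dz
```

The three values should agree with `x.grad`, `y.grad` and `z.grad` up to floating-point error.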


#### <font color="red">**[PROBLEM IV]** </font>


 Next we will consider the computational graph of the following function 

$$f = \frac{1}{1 + e^{-(w_0 \cdot x_0 + w_1 \cdot x_1 + b )}} = \frac{1}{1 + e^{-(\mathbf{w} \cdot \mathbf{x} + b )}}$$


![comp_graph_2](https://github.com/lyubonko/ucu2020cv/blob/master/assignments/fig/comp_graph_2.png?raw=true)

 We are interested in computing partial derivatives: 

$$ \frac{\partial f}{\partial \mathbf{w}}  $$ 

$$ \frac{\partial f}{\partial b}  $$ 

$$ \frac{\partial f}{\partial \mathbf{x}}  $$ 

* define $\{x_0, x_1\}$ and $\{w_0, w_1\}$ as vector variables $\mathbf{x}$ and $\mathbf{w}$
* see the pytorch exponent function http://pytorch.org/docs/master/torch.html#torch.exp
* use matrix operations

You should get the same numbers as in the figure.

In [19]:
w = torch.tensor([3., 5.], requires_grad=True)
x = torch.tensor([-2., 1.], requires_grad=True)
b = torch.tensor([2.], requires_grad=True)

#YOUR CODE HERE (replace 'None')
f = None 
f = 1 / (torch.exp(-w.view(-1,2).mm(x.view(2,-1)) - b) + 1)

# Compute gradients.
f.backward()

# Print out the gradients.
print(w.grad)
print(x.grad)      
print(b.grad) 

tensor([-0.3932,  0.1966])
tensor([0.5898, 0.9831])
tensor([0.1966])
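
As a cross-check of these numbers, the same function can be written with `torch.sigmoid` and compared against the analytic derivative of the sigmoid, $\frac{\partial f}{\partial s} = f \cdot (1 - f)$ with $s = \mathbf{w} \cdot \mathbf{x} + b$. This is a sketch of one possible check; the names `s`, `df_ds`, `expected_w` and `expected_x` are illustrative.

```python
import torch

w = torch.tensor([3., 5.], requires_grad=True)
x = torch.tensor([-2., 1.], requires_grad=True)
b = torch.tensor([2.], requires_grad=True)

# Same function, written with the built-in sigmoid.
s = torch.dot(w, x) + b   # s = w . x + b
f = torch.sigmoid(s)
f.backward()

# Analytic gradient: df/ds = f * (1 - f), then the chain rule.
with torch.no_grad():
    df_ds = f * (1 - f)
    expected_w = df_ds * x   # df/dw = df/ds * x
    expected_x = df_ds * w   # df/dx = df/ds * w

print(torch.allclose(w.grad, expected_w))
print(torch.allclose(x.grad, expected_x))
print(torch.allclose(b.grad, df_ds))
```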
