*Notebook conventions:*

* <font color="red">assignment problem</font>. Red marks a task that you should complete.
* <font color="green">debugging</font>. Green tells you the expected outcome. Its primary goal is to help you get the correct answer.
* <font color="blue">comments, hints</font>.

Assignment 1 (pytorch basics)
======================



![pytorch](https://upload.wikimedia.org/wikipedia/commons/9/96/Pytorch_logo.png)

#### Useful Links:

* pytorch official documentation
http://pytorch.org/docs/master/index.html

* pytorch discussion
https://discuss.pytorch.org/

* pytorch official tutorials
https://pytorch.org/tutorials/

* pytorch tutorials (a bit more advanced)
https://github.com/yunjey/pytorch-tutorial


### Preliminaries

In [None]:
import numpy as np
import torch
import torchvision

In [None]:
# check versions
from platform import python_version
print("python version:".ljust(25) + python_version())
print("torch version:".ljust(25) + torch.__version__)
print("torchvision version:".ljust(25) + torchvision.__version__)

In [None]:
# Colab-only helper for uploading/downloading files; skip this cell outside Google Colab
from google.colab import files

In [None]:
# random seed settings
torch.manual_seed(42)
np.random.seed(42)

###  Tensors

One of the main data types in pytorch is the tensor.
We will start with the concept of a tensor and how it is used in pytorch.

![](https://github.com/lyubonko/ucu2020cv/blob/master/assignments/fig/tensors.jpg?raw=true)

### Tensor Initialization

In [None]:
# 1d tensor of size 64 of type float (default)
# (torch.empty does not initialize the memory, so the values are arbitrary)
v = torch.empty(64)

print(" * the first 4 elements of 'v' are:")
print(v[:4]) # print the first four elements of the tensor

# re-initialize with the integers [0,1,...,63] (the resulting dtype is int64)
v = torch.arange(0,64)

print(" * the first 4 elements of 'v' are:")
print(v[:4]) # print the first four elements of the tensor

print(" * the size of the 'v' is ")
print(v.size())

In [None]:
# 2d tensor of shape (8, 8), i.e. 64 elements, of type long (int64)
x = torch.zeros(8, 8, dtype=torch.long)

print(" * the top-left 4x4 block of 'x' is:")
print(x[:4,:4]) # print the first four rows and columns of the tensor

# re-initialize with all ones, this time of type float
x = torch.ones(8, 8, dtype=torch.float)

print(" * the top-left 4x4 block of 'x' is:")
print(x[:4, :4]) # print the first four rows and columns of the tensor

print(" * the size of the 'x' is ")
print(x.size())

print(" * the size of 'x' can also be obtained with the 'shape' attribute, familiar from numpy")
print(x.shape)
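
A few properties of the constructors used above can be checked directly. The following is a minimal sketch; the variable names (`e`, `r`, `filled`, `r_float`) are chosen for illustration only and are not part of the assignment.

```python
import torch

# torch.empty only allocates memory; its contents are arbitrary
e = torch.empty(3)
print(e.shape, e.dtype)          # shape and dtype are defined, the values are not

# torch.arange with integer arguments produces an integer tensor
r = torch.arange(0, 6)
print(r.dtype)

# torch.full fills a tensor of a given shape with a single value
filled = torch.full((2, 3), 7.5)
print(filled)

# .to() returns a new tensor converted to another dtype
r_float = r.to(torch.float32)
print(r_float.dtype)
```

The `dtype` argument accepted by `torch.zeros` and `torch.ones` above is also accepted by `torch.full`.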

#### <font color="red">**[PROBLEM I]:** </font>

-----

 <font color="red"> Initialize X </font>     
 <font color="red"> 3d Tensor of size (4,4,4) </font>   
 <font color="red"> of type int32 with all elements equal to 10 </font>   

-----

In [None]:
# YOUR CODE HERE

In [None]:
X.shape

### Reshaping, broadcasting

Tensor reshaping is done with the `view` method:

In [None]:
a = torch.tensor([[1,2], [3,4]])
a_reshaped = a.view(4) # reshape into one-dimensional tensor of size 4

print(a)
print(a_reshaped)
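
Two further points are worth seeing in code: a view shares memory with the tensor it was created from, and tensors of different but compatible shapes are broadcast when combined. A small sketch, with illustrative names (`seq`, `m`, `col`, `row`, `s`) that do not appear elsewhere in the notebook:

```python
import torch

seq = torch.arange(6)            # six integers: 0, 1, ..., 5

# -1 lets pytorch infer one dimension from the number of elements
m = seq.view(2, -1)
print(m.shape)

# a view shares storage with the original tensor
m[0, 0] = 100
print(seq[0])                    # the original tensor has changed too

# broadcasting: shapes (3, 1) and (1, 4) combine into (3, 4)
col = torch.arange(3).view(3, 1)
row = torch.arange(4).view(1, 4)
s = col + row
print(s.shape)
print(s)
```

Broadcasting compares shapes from the last dimension backwards; two dimensions are compatible when they are equal or when one of them is 1.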

#### <font color="red">**[PROBLEM II]:** </font>

-----

 <font color="red"> Use the command 'view' to reshape v and X into 2d tensors --> v' and X'. </font>  
 <font color="red"> Also convert all tensors to type double. </font>  
 <font color="red"> Perform addition of these reshaped tensors, namely calculate v' + X' + x. </font>  
 <font color="red"> Finally, display the result. </font>

-----

In [None]:
# YOUR CODE HERE

### Operations on Tensors

relevant tutorial
https://pytorch.org/tutorials/beginner/blitz/tensor_tutorial.html#operations

There are multiple syntaxes for most operations. Let us look at addition as an example.

In [None]:
x = x[:4,:4]
y = v.type(torch.float).view(8,8)[:4,:4]

In [None]:
print(torch.add(x, y))
print(x + y)
result = torch.empty_like(x)
torch.add(x, y, out=result)
print(result)

In-place addition (methods with a trailing underscore modify the tensor they are called on):

In [None]:
print(x)
x.add_(y)
print(x)

Learn about other operations: https://pytorch.org/tutorials/beginner/blitz/tensor_tutorial.html#operations
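
As a quick illustration of a few other common operations, here is a minimal sketch on two small matrices. The names `p` and `q` and the values are made up for this example.

```python
import torch

p = torch.tensor([[1., 2.], [3., 4.]])
q = torch.tensor([[0., 1.], [1., 0.]])

elementwise = p * q          # element-wise product, same as torch.mul(p, q)
matrix_prod = p @ q          # matrix product, same as torch.matmul(p, q)
total = p.sum()              # sum of all elements
col_mean = p.mean(dim=0)     # mean over rows, one value per column

print(elementwise)
print(matrix_prod)
print(total, col_mean)

# a trailing underscore means the operation modifies 'p' in place
p.mul_(2)
print(p)
```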

### Numpy bridge

In [None]:
# create numpy array
a = np.array([[1,2], [3,4]])
# transform numpy array into torch.Tensor
b = torch.from_numpy(a)
# make operation on this Tensor (in this case transpose)
b = b.transpose(1,0)
# transform back to numpy
c = b.numpy()                

print(a, type(a))
print(b, type(b))
print(c, type(c))
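
One detail of the bridge is easy to miss: `torch.from_numpy` does not copy the data, so the tensor and the array share memory. A small sketch with illustrative names (`arr`, `t`, `t_copy`):

```python
import numpy as np
import torch

arr = np.zeros(3)
t = torch.from_numpy(arr)      # shares memory with 'arr'

arr[0] = 5.0                   # change the numpy array ...
print(t)                       # ... and the tensor changes as well

t_copy = torch.tensor(arr)     # torch.tensor() makes an independent copy
arr[1] = 7.0
print(t)                       # follows the second change too
print(t_copy)                  # unaffected by the second change
```

If an independent tensor is needed, copy explicitly, for example with `torch.tensor(arr)` as above or with `.clone()` on the tensor.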

#### <font color="red">**[PROBLEM III]:** </font>

-----
<span style="color:red"> Using these two random matrices: </span>

In [None]:
x = np.random.randn(3, 10)
y = np.random.randn(4, 10)

<span style="color:red"> Do the following: </span>
* <span style="color:red">transform $\mathbf{x}$ and $\mathbf{y}$ to torch.Tensors</span>
* <span style="color:red">perform matrix multiplication $\mathbf{r1} = \mathbf{x} \cdot \mathbf{y^T} $</span>  
<span style="color:blue"> see the pytorch function http://pytorch.org/docs/master/torch.html#torch.mm </span>  
* <span style="color:red">perform element-wise matrix multiplication $\mathbf{r2} = \mathbf{r1} \odot \mathbf{r1} $</span>  
<span style="color:blue"> see the pytorch function http://pytorch.org/docs/master/torch.html#torch.mul </span> 
* <span style="color:red">perform scalar addition and scalar multiplication $\mathbf{r3} = 2 * \mathbf{r2} + 3 $</span>  
* <span style="color:red">transform the result back to numpy </span>

-----

In [None]:
# YOUR CODE HERE

### CUDA stuff

Let us run on CUDA, if CUDA is available.

In [None]:
torch.cuda.is_available()

In [None]:
x = torch.randn(3, 10)
if torch.cuda.is_available():
    device = torch.device("cuda")           # a CUDA device object
    y = torch.ones_like(x, device=device)   # directly create a tensor on the GPU
    x = x.to(device)                        # move an existing tensor to the GPU
    z = x + y
    print(z)
    print(z.to("cpu", torch.double))
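
The cell above prints nothing on a machine without a GPU. A common way to keep the same code runnable everywhere is to choose the device once and fall back to the CPU. This is a sketch under that assumption; the names `x_dev`, `y_dev`, `z_dev` are illustrative.

```python
import torch

# fall back to the CPU when no GPU is present
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x_dev = torch.randn(3, 10, device=device)   # created directly on 'device'
y_dev = torch.ones_like(x_dev)              # inherits the device of 'x_dev'
z_dev = x_dev + y_dev

print(z_dev.device)
# a tensor must be on the CPU before it can be converted to numpy
print(z_dev.cpu().numpy().shape)
```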

###  Autograd: automatic differentiation

relevant tutorial
https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html#sphx-glr-beginner-blitz-autograd-tutorial-py


*torch.Tensor* is the central class of the package. If you set its attribute *.requires_grad* to True, autograd starts tracking all operations on it. When you finish your computation, you can call *.backward()* and have all the gradients computed automatically. The gradient for this tensor is accumulated into its *.grad* attribute.
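
The word "accumulated" matters: repeated backward passes add to *.grad* rather than overwrite it. A minimal sketch on a made-up scalar `t` (the names `g1` and `g2` are illustrative):

```python
import torch

t = torch.tensor(3.0, requires_grad=True)

(t ** 2).backward()        # d(t^2)/dt = 2t = 6 at t = 3
g1 = t.grad.clone()
print(g1)

(t ** 2).backward()        # a second backward pass adds another 6
g2 = t.grad.clone()
print(g2)

t.grad.zero_()             # reset the stored gradient
print(t.grad)
```

This is why the stored gradient is usually reset between independent computations.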

**use of autograd**

Let's start with a simple example.
Consider the following function:
$$f = (x + y) \cdot z$$

For concreteness, let's take $x=2$, $y=-7$, $z=3$. The 'forward' calculation is shown in <span style="color:green"> green </span> on the image below.

Automatic differentiation provides an elegant tool to calculate the derivatives of $f$ with respect to all variables via the 'backward' pass.

$$f = (x + y) \cdot z = u \cdot z $$

$$ \frac{\partial f}{\partial u} = z $$

$$ \frac{\partial f}{\partial z} = u = -5 $$

$$ \frac{\partial f}{\partial x} = \frac{\partial f}{\partial u} \cdot \frac{\partial u}{\partial x} = z = 3$$

$$ \frac{\partial f}{\partial y} = \frac{\partial f}{\partial u} \cdot \frac{\partial u}{\partial y} = z = 3$$

![comp_graph_1](https://github.com/lyubonko/ucu2020cv/blob/master/assignments/fig/comp_graph_1.png?raw=true)

In [None]:
# Create tensors.
# ('requires_grad' is False by default)
x = torch.tensor([2.], requires_grad=True)
y = torch.tensor([-7.], requires_grad=True)
z = torch.tensor([3.], requires_grad=True)

# Build a computational graph.
f = (x + y) * z   

# Compute gradients.
f.backward()

# Print out the gradients.
print(x.grad)    
print(y.grad)    
print(z.grad) 
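
One way to convince yourself that the printed gradients are correct is to compare them with central finite differences. The sketch below is not part of the original assignment; the helper `f_example`, the step `eps`, and the names `xa`, `ya`, `za` are assumptions made for illustration.

```python
import torch

def f_example(x, y, z):
    # same function as above: f = (x + y) * z
    return (x + y) * z

# gradients from autograd
xa = torch.tensor(2.0, requires_grad=True)
ya = torch.tensor(-7.0, requires_grad=True)
za = torch.tensor(3.0, requires_grad=True)
f_example(xa, ya, za).backward()

# central finite differences on plain Python floats
eps = 1e-3
x0, y0, z0 = 2.0, -7.0, 3.0
df_dx = (f_example(x0 + eps, y0, z0) - f_example(x0 - eps, y0, z0)) / (2 * eps)
df_dy = (f_example(x0, y0 + eps, z0) - f_example(x0, y0 - eps, z0)) / (2 * eps)
df_dz = (f_example(x0, y0, z0 + eps) - f_example(x0, y0, z0 - eps)) / (2 * eps)

print(xa.grad.item(), df_dx)
print(ya.grad.item(), df_dy)
print(za.grad.item(), df_dz)
```

Each pair of printed numbers should agree up to floating-point error, matching the hand-derived values $3$, $3$ and $-5$.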

#### <font color="red">**[PROBLEM IV]** </font>


 Next we will consider the computational graph of the following function 

$$f = \frac{1}{1 + e^{-(w_0 \cdot x_0 + w_1 \cdot x_1 + b )}} = \frac{1}{1 + e^{-(\mathbf{w} \cdot \mathbf{x} + b )}}$$


![comp_graph_2](https://github.com/lyubonko/ucu2020cv/blob/master/assignments/fig/comp_graph_2.png?raw=true)

 We are interested in computing partial derivatives: 

$$ \frac{\partial f}{\partial \mathbf{w}}  $$ 

$$ \frac{\partial f}{\partial b}  $$ 

$$ \frac{\partial f}{\partial \mathbf{x}}  $$ 

* define $\{x_0, x_1\}$ and $\{w_0, w_1\}$ as vector variables $\mathbf{x}$ and $\mathbf{w}$
* see the pytorch exponent function http://pytorch.org/docs/master/torch.html#torch.exp
* use matrix operations

You should get the same numbers as in the figure.
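
For checking the autograd output by hand, it may help to write the chain rule explicitly. The intermediate symbol $s$ below is introduced only for this derivation; it is not part of the original figure.

$$ s = \mathbf{w} \cdot \mathbf{x} + b, \qquad f = \frac{1}{1 + e^{-s}}, \qquad \frac{\partial f}{\partial s} = f \, (1 - f) $$

$$ \frac{\partial f}{\partial \mathbf{w}} = f \, (1 - f) \, \mathbf{x}, \qquad \frac{\partial f}{\partial \mathbf{x}} = f \, (1 - f) \, \mathbf{w}, \qquad \frac{\partial f}{\partial b} = f \, (1 - f) $$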

In [None]:
w = torch.tensor([3., 5.], requires_grad=True)
x = torch.tensor([-2., 1.], requires_grad=True)
b = torch.tensor([2.], requires_grad=True)

f = None #YOUR CODE HERE


# Print out the gradients.
print(w.grad)
print(x.grad)      
print(b.grad) 