# Tensors in PyTorch
First of all, import the [PyTorch](https://pytorch.org/) library. It works much like **NumPy**, which is very good at processing arrays.
The main feature of **PyTorch** is the **tensor** type, which is very similar to a **NumPy** array but can also be processed very quickly on GPUs.
Moreover, PyTorch provides a module for computing gradients (very useful for backpropagation in neural networks!).

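To make the gradient remark concrete, here is a minimal sketch of automatic differentiation on a scalar. The function `y = x**2 + 2*x` and the value `x = 3` are illustrative choices, not part of the original notebook.

```python
import torch

# A scalar tensor that records the operations applied to it
x = torch.tensor(3.0, requires_grad=True)

# Illustrative function: y = x^2 + 2x, so dy/dx = 2x + 2
y = x ** 2 + 2 * x

# Backpropagate: fills x.grad with dy/dx evaluated at x = 3
y.backward()

print(x.grad)  # dy/dx at x = 3 is 2*3 + 2 = 8
```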

In [2]:
import torch

In [None]:
def activation(x):
    """ Sigmoid activation function

        Arguments
        -----------
        x: torch.Tensor
    """
    return 1/(1+torch.exp(-x))

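As a quick sanity check, the hand-written sigmoid can be compared with the built-in `torch.sigmoid`. The input values below are arbitrary and chosen only for illustration.

```python
import torch

def activation(x):
    """Sigmoid activation function, as defined above."""
    return 1 / (1 + torch.exp(-x))

# Arbitrary test inputs (an assumption for this sketch)
x = torch.tensor([-2.0, 0.0, 2.0])

ours = activation(x)
builtin = torch.sigmoid(x)

print(ours)
# Both implementations should agree up to floating-point tolerance
print(torch.allclose(ours, builtin))
```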
We construct a tensor with 1 row and 5 columns, i.e. a single sample with 5 features. In general, features are stored in the columns.
Its values are drawn from a *standard normal distribution* (a Gaussian with mean zero and standard deviation one).
We do the same for the weights (we need one weight per feature). The bias term is sampled from the same distribution.

In [4]:
torch.manual_seed(7)
features = torch.randn((1,5))
print(features)
print(features.type())

weights = torch.randn_like(features)
bias = torch.randn((1,1))

tensor([[-0.1468,  0.7861,  0.9468, -1.1143,  1.6908]])
torch.FloatTensor


We compute the output of the network for the single sample.
In **PyTorch**, matrix multiplication (and hence the inner/dot product) is available through `torch.mm()` and `torch.matmul()`.
The second function is more general and supports broadcasting.
In our case, the tensors `features` and `weights` have the same shape, so to multiply them we have to transpose (or reshape) the second.
In the end, we want the *output to be a **tensor** whose **number of rows equals the number of input samples***.

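The difference between the two functions can be sketched as follows. The batched tensor `batch` and its shape are hypothetical, introduced only to show the broadcasting behaviour of `torch.matmul()`.

```python
import torch

torch.manual_seed(7)
features = torch.randn((1, 5))
weights = torch.randn_like(features)

# Both tensors are (1, 5); transposing weights gives (5, 1),
# so the product has shape (1, 1): one row per input sample
out_mm = torch.mm(features, weights.t())
out_matmul = torch.matmul(features, weights.t())
print(out_mm.shape)

# torch.matmul broadcasts over leading (batch) dimensions:
# (4, 1, 5) @ (5, 1) -> (4, 1, 1). torch.mm accepts only 2-D tensors.
batch = torch.randn((4, 1, 5))  # hypothetical batch of 4 inputs
out_batch = torch.matmul(batch, weights.t())
print(out_batch.shape)
```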
## Reshape tensors
1. [`weights.reshape(a,b)`](https://pytorch.org/docs/stable/tensors.html#torch.Tensor.reshape) returns a tensor of the given size. When possible, the result is a view that shares data with `weights`; otherwise it is a copy (so it uses a new part of memory).
2. [`weights.resize_(a,b)`](https://pytorch.org/docs/stable/tensors.html#torch.Tensor.resize_) returns the same tensor with a different shape. However, if the new shape results in fewer elements than the original tensor, some elements will be removed from the tensor (but not from memory). If the new shape results in more elements than the original tensor, new elements will be uninitialized in memory. The underscore at the end of the method denotes that this method is performed in-place.
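3. [`weights.view(a,b)`](https://pytorch.org/docs/stable/tensors.html#torch.Tensor.view) returns a new tensor that shares its data with `weights` and has size `(a,b)`. It raises an error if the requested size is not compatible with the number of elements or the memory layout of `weights`.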

As a rule of thumb: use `tensor.view(a,b)` to change the shape of a tensor.

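The memory behaviour of `view` and `reshape` described in the list can be checked with a small experiment. The tensor `w` and the values written into it are hypothetical and used only for illustration.

```python
import torch

w = torch.arange(6.0)  # hypothetical tensor: 0., 1., ..., 5.

# view() shares storage: writing through the view changes w
v = w.view(2, 3)
v[0, 0] = 100.0
print(w[0])

# A transposed tensor is not contiguous, so view() cannot flatten it
t = w.view(2, 3).t()
view_failed = False
try:
    t.view(6)
except RuntimeError:
    view_failed = True
print("view failed on non-contiguous tensor:", view_failed)

# reshape() falls back to a copy here, so w is left untouched
r = t.reshape(6)
r[0] = -1.0
print(w[0])
```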
In [10]:
output = torch.sum(features*weights)+bias
output = activation(output)
print('The output of the single layer network is: ', output)

weights = weights.view(5,1)
output1 = torch.mm(features, weights)
output1 = output1+bias
output1 = activation(output1)
print('The output of the single layer network is: ', output1)

The output of the single layer network is:  tensor([[0.1595]])
The output of the single layer network is:  tensor([[0.1595]])


Now, we want to consider a two-layer network.

In [14]:
### Generate some data
torch.manual_seed(7) # Set the random seed so things are predictable

# Features are 3 random normal variables
features = torch.randn((1, 3))

# Define the size of each layer in our network
n_input = features.shape[1]     # Number of input units, must match number of input features
n_hidden = 2                    # Number of hidden units 
n_output = 1                    # Number of output units

# Weights for inputs to hidden layer
W1 = torch.randn(n_input, n_hidden)
# Weights for hidden layer to output layer
W2 = torch.randn(n_hidden, n_output)

# and bias terms for hidden and output layers
B1 = torch.randn((1, n_hidden))
B2 = torch.randn((1, n_output))

In [17]:
h1 = activation(torch.mm(features,W1)+B1)
output2 = activation(torch.mm(h1,W2)+B2)

print('The output of the double layer network is: ', output2)

The output of the double layer network is:  tensor([[0.3171]])

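To illustrate the earlier point that the output has one row per input sample, the same two-layer network can be applied to several samples at once. This is a sketch: the batch size of 4 is an arbitrary assumption, and the printed values differ from the ones above because the random draws differ.

```python
import torch

def activation(x):
    """Sigmoid activation function, as defined above."""
    return 1 / (1 + torch.exp(-x))

torch.manual_seed(7)

n_samples = 4  # assumed batch size, for illustration only
n_input, n_hidden, n_output = 3, 2, 1

features = torch.randn((n_samples, n_input))
W1 = torch.randn(n_input, n_hidden)
W2 = torch.randn(n_hidden, n_output)
B1 = torch.randn((1, n_hidden))
B2 = torch.randn((1, n_output))

# The (1, n_hidden) bias is broadcast across all rows (samples)
h1 = activation(torch.mm(features, W1) + B1)
output = activation(torch.mm(h1, W2) + B2)

print(output.shape)  # one row per sample, one column per output unit
```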

All the settings other than the weights and the biases that the programmer has to choose (such as the number of hidden units and hidden layers) are called **hyperparameters**.
Generally speaking, the more hidden units and hidden layers a network has, the better it can fit the data, so its accuracy **on the training set** tends to increase.

## NumPy to torch and back
PyTorch provides convenient tools for converting between NumPy arrays and PyTorch tensors:
- `torch.from_numpy()` converts a NumPy array to a torch tensor
- `Tensor.numpy()` returns the NumPy version of a torch tensor

Note: both objects refer to a single memory area. The *memory is shared between the NumPy array and the torch tensor*, so *modifying one of them in place also modifies the other*.


In [26]:
import numpy as np
a = np.random.rand(1,3)
b = torch.from_numpy(a)
print(b)
print("Type of b: ", b.type())
c = b.numpy()
print("Type of c: ", type(c))
b.mul_(2)
print(a)
print(c)

tensor([[0.6166, 0.7356, 0.8789]], dtype=torch.float64)
Type of b:  torch.DoubleTensor
Type of c:  <class 'numpy.ndarray'>
[[1.23327644 1.47124599 1.75777034]]
[[1.23327644 1.47124599 1.75777034]]

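If the shared memory is not wanted, one option is to copy the data when creating the tensor. The sketch below contrasts `torch.from_numpy()` with `torch.tensor()`, which copies its input; the array of ones is a hypothetical example.

```python
import numpy as np
import torch

a = np.ones((1, 3))  # hypothetical input array

shared = torch.from_numpy(a)  # shares memory with a
copied = torch.tensor(a)      # makes an independent copy of the data

# The in-place multiplication is visible through a, but not through copied
shared.mul_(2)

print(a)
print(copied)
```

After the in-place operation, `a` holds the doubled values while `copied` still holds the original ones.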