<a href="https://colab.research.google.com/github/Joycechidi/Deep-Learning-/blob/master/PyTorch-Practice/Tensors_in_PyTorch.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [0]:
#First import PyTorch
import torch

In [0]:
def activation(x):
    """
       Sigmoid activation function
    
       Arguments
       ----------
       x: torch.Tensor
    """
    return 1/(1+torch.exp(-x))
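
As a quick sanity check, the hand-written sigmoid can be compared against PyTorch's built-in `torch.sigmoid`. This is only a sketch: the test values below are arbitrary and not part of the original notebook.

```python
import torch

def activation(x):
    """Sigmoid activation function, as defined above."""
    return 1 / (1 + torch.exp(-x))

# Arbitrary sample inputs chosen for illustration (not from the notebook)
x = torch.tensor([-2.0, 0.0, 2.0])

manual = activation(x)
builtin = torch.sigmoid(x)

# Both implementations should agree up to floating-point rounding
print(torch.allclose(manual, builtin))
```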

In [0]:
#Generate some random data
torch.manual_seed(7) #Set the random seed so things are predictable

#Features are 5 random normal variables
features = torch.randn((1, 5))

#True weights for our data, random normal variables again
weights = torch.randn_like(features)

# and a true bias term
bias = torch.randn((1, 1))

I will use real data in later notebooks.

I'm going to do the following:

*   Calculate the output of the network with input features `features`, weights `weights`, and bias `bias`.
*   Use the function `activation` defined above as the activation function.

Similar to NumPy, PyTorch has a `torch.sum()` function, as well as a `.sum()` method on tensors, for taking sums. I prefer to use matrix multiplication of the features and weights, since it is more efficient: it is backed by optimized linear algebra libraries and can be accelerated on GPUs.
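
To make the comparison concrete, here is a small sketch that computes the same weighted sum three ways. It regenerates the tensors with the same seed so that it runs on its own. The names `z_sum`, `z_method` and `z_matmul` are mine, and the weights are reshaped into a column so that the matrix shapes line up.

```python
import torch

torch.manual_seed(7)
features = torch.randn((1, 5))
weights = torch.randn_like(features)
bias = torch.randn((1, 1))

# Element-wise product followed by a sum, using the function form
z_sum = torch.sum(features * weights) + bias
# Same thing with the .sum() method on the tensor
z_method = (features * weights).sum() + bias
# Matrix multiplication: (1, 5) @ (5, 1) -> (1, 1)
z_matmul = torch.mm(features, weights.view(5, 1)) + bias

# All three should match up to floating-point rounding
print(torch.allclose(z_sum, z_matmul), torch.allclose(z_method, z_matmul))
```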

In [24]:
output = activation(torch.matmul(features, weights))


RuntimeError: ignored

Running the above code as is raises a size mismatch error. The shapes of the features and weights matter in these calculations: both tensors are 1 x 5, but matrix multiplication needs the number of columns in the first tensor to equal the number of rows in the second. In the next cell, I will transpose the weights to change the shape of the tensor from 1 x 5 to 5 x 1 so the calculation can go through, and add the bias term.

In [27]:
output = activation(torch.matmul(features, weights.T) + bias)
output

tensor([[0.1595]])
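
A minimal sketch of how the mismatch can be spotted before multiplying: print the shapes and check that the inner dimensions agree. The tensors are regenerated so the snippet runs standalone, and the `failed` flag is just a name I picked for illustration.

```python
import torch

torch.manual_seed(7)
features = torch.randn((1, 5))
weights = torch.randn_like(features)

print(features.shape, weights.shape)  # both are torch.Size([1, 5])

# (1, 5) @ (1, 5) is invalid: the inner dimensions 5 and 1 differ
try:
    torch.matmul(features, weights)
    failed = False
except RuntimeError as err:
    failed = True
    print("matmul failed:", err)

# Transposing the weights gives (1, 5) @ (5, 1) -> (1, 1)
product = torch.matmul(features, weights.T)
print(product.shape)
```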

I can also use `view()` to manually change the shape of the weights and get the same result, as seen in the cell below.

In [28]:
output = activation(torch.mm(features, weights.view(5, 1)) + bias)
output

tensor([[0.1595]])
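
One detail worth knowing: `view()` returns a tensor that shares storage with the original rather than a copy. The sketch below uses a small hand-made tensor (not the notebook's weights) to show this.

```python
import torch

# Hand-made values for illustration only
w = torch.tensor([[1.0, 2.0, 3.0, 4.0, 5.0]])  # shape (1, 5)
w_col = w.view(5, 1)                            # shape (5, 1), same storage

# Writing through the view changes the original tensor too
w_col[0, 0] = 10.0
print(w[0, 0].item())

# Both tensors point to the same underlying memory
print(w.data_ptr() == w_col.data_ptr())
```

If an independent copy is needed instead, calling `.clone()` on the tensor before reshaping is one option.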

### Stack them up!

That's how you can calculate the output for a single neuron. The real power of this algorithm happens when you start stacking these individual units into layers and stacks of layers, into a network of neurons. The output of one layer of neurons becomes the input for the next layer. With multiple input units and output units, we now need to express the weights as a matrix.

<img src='assets/multilayer_diagram_weights.png' width=450px>

The first layer, shown on the bottom here, holds the inputs and is understandably called the **input layer**. The middle layer is called the **hidden layer**, and the final layer (on the right) is the **output layer**. We can express this network mathematically with matrices again and use matrix multiplication to get linear combinations for each unit in one operation. For example, the hidden layer ($h_1$ and $h_2$ here) can be calculated as

$$
\vec{h} = [h_1 \, h_2] = 
\begin{bmatrix}
x_1 \, x_2 \cdots \, x_n
\end{bmatrix}
\cdot 
\begin{bmatrix}
           w_{11} & w_{12} \\
           w_{21} &w_{22} \\
           \vdots &\vdots \\
           w_{n1} &w_{n2}
\end{bmatrix}
$$

The output for this small network is found by treating the hidden layer as inputs for the output unit. The network output is expressed simply as

$$
y =  f_2 \! \left(\, f_1 \! \left(\vec{x} \, \mathbf{W_1}\right) \mathbf{W_2} \right)
$$
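
The formula above leaves out the bias terms that the code in this notebook uses. Including them, and writing $\vec{b}_1$ and $\vec{b}_2$ for the hidden and output biases (my notation, meant to correspond to `B1` and `B2` in the code), the output would read:

$$
y =  f_2 \! \left(\, f_1 \! \left(\vec{x} \, \mathbf{W_1} + \vec{b}_1\right) \mathbf{W_2} + \vec{b}_2 \right)
$$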

In [0]:
#Generate some data
torch.manual_seed(7) # set the random seed to a number so things are predictable

#Features are 3 random normal variables
features = torch.randn((1, 3))

#Define the size of each layer in our network
n_input = features.shape[1]
#number of inputs must match number of input features
n_hidden = 2  #number of hidden units
n_output = 1  #number of output units


#Weights for inputs to hidden layer
W1 = torch.randn(n_input, n_hidden)
#Weights for hidden layer to output layer
W2 = torch.randn(n_hidden, n_output)

# and bias terms for hidden and output layers
B1 = torch.randn((1, n_hidden))
B2 = torch.randn((1, n_output))

Let's calculate the output for this multi-layer network using the weights `W1` and `W2` and the biases `B1` and `B2`.

In [30]:
# Solution

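# Hidden layer activations: (1, 3) @ (3, 2) -> (1, 2)
hidden_layer = activation(torch.mm(features, W1) + B1)
# Network output: (1, 2) @ (2, 1) -> (1, 1)
output = activation(torch.mm(hidden_layer, W2) + B2)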

print(output)

tensor([[0.3171]])
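
The same two steps can be wrapped in a function and applied to several inputs at once. This is a sketch under my own assumptions: the helper name `forward` and the batch of 4 random rows are hypothetical and not part of the original notebook.

```python
import torch

def activation(x):
    """Sigmoid activation, as defined earlier in the notebook."""
    return 1 / (1 + torch.exp(-x))

def forward(x, W1, B1, W2, B2):
    """Forward pass of a network with one hidden layer (hypothetical helper)."""
    hidden = activation(torch.mm(x, W1) + B1)     # (batch, n_hidden)
    return activation(torch.mm(hidden, W2) + B2)  # (batch, n_output)

torch.manual_seed(7)
n_input, n_hidden, n_output = 3, 2, 1
W1 = torch.randn(n_input, n_hidden)
W2 = torch.randn(n_hidden, n_output)
B1 = torch.randn((1, n_hidden))
B2 = torch.randn((1, n_output))

# A made-up batch of 4 examples; the biases broadcast across the rows
batch = torch.randn((4, n_input))
out = forward(batch, W1, B1, W2, B2)
print(out.shape)
```

Because the biases have shape `(1, n_hidden)` and `(1, n_output)`, broadcasting adds them to every row, so the function should work for any batch size.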


The number of hidden units is a parameter of the network, often called a **hyperparameter** to differentiate it from the weight and bias parameters.
When training a neural network, the more hidden units and the more layers it has, the better it is generally able to learn from data and make accurate predictions.
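
To see how the number of hidden units affects the size of the network, the sketch below counts weights and biases for a few choices of `n_hidden`. The helper `count_parameters` is a hypothetical name of mine, and the layer sizes other than `n_hidden` mirror the small network above.

```python
import torch

def count_parameters(n_input, n_hidden, n_output):
    """Count weights and biases for a network with one hidden layer."""
    W1 = torch.randn(n_input, n_hidden)
    W2 = torch.randn(n_hidden, n_output)
    B1 = torch.randn((1, n_hidden))
    B2 = torch.randn((1, n_output))
    return sum(t.numel() for t in (W1, W2, B1, B2))

# n_hidden is chosen by hand (a hyperparameter); the weights and biases
# are the parameters whose count follows from that choice
for n_hidden in (2, 4, 8):
    print(n_hidden, count_parameters(3, n_hidden, 1))
```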

### NumPy to Torch and back

In [32]:
import numpy as np
a = np.random.rand(4, 3)
a

#torch.from_numpy()

array([[0.84627374, 0.80484474, 0.94360632],
       [0.54594401, 0.23088077, 0.73241634],
       [0.9551725 , 0.97671948, 0.17419294],
       [0.28749573, 0.50676332, 0.43288085]])

In [34]:
b = torch.from_numpy(a)
b

tensor([[0.8463, 0.8048, 0.9436],
        [0.5459, 0.2309, 0.7324],
        [0.9552, 0.9767, 0.1742],
        [0.2875, 0.5068, 0.4329]], dtype=torch.float64)

In [35]:
b.numpy()

array([[0.84627374, 0.80484474, 0.94360632],
       [0.54594401, 0.23088077, 0.73241634],
       [0.9551725 , 0.97671948, 0.17419294],
       [0.28749573, 0.50676332, 0.43288085]])

The memory is shared between the NumPy array and the Torch tensor, so if you change the values of one object in place, the other will change as well.

In [36]:
# Multiply a PyTorch tensor by 2, in place
b.mul_(2)

tensor([[1.6925, 1.6097, 1.8872],
        [1.0919, 0.4618, 1.4648],
        [1.9103, 1.9534, 0.3484],
        [0.5750, 1.0135, 0.8658]], dtype=torch.float64)

In [37]:
# Numpy array matches new values from Tensor
a

array([[1.69254747, 1.60968948, 1.88721264],
       [1.09188802, 0.46176154, 1.46483267],
       [1.91034499, 1.95343897, 0.34838588],
       [0.57499145, 1.01352663, 0.8657617 ]])
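
If the sharing is not wanted, making a copy breaks the link. Here is a small sketch with hand-picked values for illustration: `torch.from_numpy` shares memory with the array, while `torch.tensor` copies the data.

```python
import numpy as np
import torch

a = np.array([[1.0, 2.0], [3.0, 4.0]])  # illustrative values

shared = torch.from_numpy(a)  # shares memory with `a`
copied = torch.tensor(a)      # makes an independent copy

shared.mul_(2)   # in-place: `a` changes as well
copied.mul_(10)  # in-place on the copy: `a` is unaffected

print(a)
print(np.shares_memory(a, shared.numpy()))
```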