# Artificial Neural Networks

In this second chapter, we delve deeper into Artificial Neural Networks, learning how to train them with real datasets.

# (1) Activation functions

## Motivation

<img src="image/Screenshot 2021-01-26 171633.png">

```
import torch

# Input with two features
input_layer = torch.tensor([2., 1.])

# First layer: input -> hidden
weight_1 = torch.tensor([[0.45, 0.32], [-0.12, 0.29]])
hidden_layer = torch.matmul(input_layer, weight_1)

# Second layer: hidden -> output
weight_2 = torch.tensor([[0.48, -0.12], [0.64, 0.91]])
output_layer = torch.matmul(hidden_layer, weight_2)
print(output_layer)
```

## Matrix multiplication is a linear transformation

```
import torch

input_layer = torch.tensor([2., 1.])
weight_1 = torch.tensor([[0.45, 0.32], [-0.12, 0.29]])
weight_2 = torch.tensor([[0.48, -0.12], [0.64, 0.91]])

# Collapse both layers into a single weight matrix
weight = torch.matmul(weight_1, weight_2)

# One multiplication now gives the same output as the two-layer network
output_layer = torch.matmul(input_layer, weight)
print(output_layer)
print(weight)
```
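Matrix multiplication is associative, which is why two consecutive linear layers can be folded into one. As a worked check, write $x$ for `input_layer`, $W_1$ for `weight_1` and $W_2$ for `weight_2` (the symbols are shorthand introduced here), and carry out the products by hand:

$$
\begin{aligned}
(x W_1) W_2 &= x (W_1 W_2) \\
x W_1 &= \begin{bmatrix} 0.78 & 0.93 \end{bmatrix} \\
(x W_1) W_2 &= \begin{bmatrix} 0.9696 & 0.7527 \end{bmatrix} \\
W_1 W_2 &= \begin{bmatrix} 0.4208 & 0.2372 \\ 0.1280 & 0.2783 \end{bmatrix} \\
x (W_1 W_2) &= \begin{bmatrix} 0.9696 & 0.7527 \end{bmatrix}
\end{aligned}
$$

Both routes give the same output, so without an activation function the hidden layer adds no expressive power.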

## Non-linearly separable datasets

<img src="image/Screenshot 2021-01-26 172104.png">

## Activation functions

<img src="image/Screenshot 2021-01-26 172208.png">

## ReLU activation function

```
ReLU(x) = max(0, x)
```

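```
import torch
import torch.nn as nn

# nn.ReLU is a module; instantiate it once and call it like a function
relu = nn.ReLU()

# All entries are positive, so the tensor is returned unchanged
tensor_1 = torch.tensor([2., 4.])
print(relu(tensor_1))

# Negative entries are replaced by zero
tensor_2 = torch.tensor([[2., -4.], [1.2, 0.]])
print(relu(tensor_2))
```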
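As a small sanity check on the definition, the sketch below compares `nn.ReLU` with two other ways of writing max(0, x) in PyTorch: `torch.relu` and `torch.clamp` with `min=0`. The tensor `x` is an illustrative value chosen here, not part of the original material.

```python
import torch
import torch.nn as nn

# Illustrative input (an assumption, chosen to include a negative entry)
x = torch.tensor([[2., -4.], [1.2, 0.]])

relu = nn.ReLU()
out_module = relu(x)               # module form, as used in this chapter
out_function = torch.relu(x)       # functional form
out_clamp = torch.clamp(x, min=0)  # max(0, x) written as a clamp

print(out_module)
print(torch.equal(out_module, out_function))
print(torch.equal(out_module, out_clamp))
```

All three apply the operation elementwise, so the shape of the input is preserved.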

# Exercise I: Neural networks

Let us see the differences between neural networks which apply `ReLU` and those which do not apply `ReLU`. We have already initialized the input called `input_layer`, and three sets of weights, called `weight_1`, `weight_2` and `weight_3`.

We are going to convince ourselves that networks with multiple layers which do not contain non-linearity can be expressed as neural networks with one layer.

The network and the shapes of its layers and weights are shown below.

<img src="image/net-ex.jpg">

### Instructions

- Calculate the first and second hidden layer by multiplying the appropriate inputs with the corresponding weights.
- Calculate and print the results of the output.
- Set `weight_composed_1` to the product of `weight_1` with `weight_2`, then set `weight` to the product of `weight_composed_1` with `weight_3`.
- Calculate and print the output.


```
# Calculate the first and second hidden layer
hidden_1 = torch.matmul(input_layer, weight_1)
hidden_2 = torch.matmul(hidden_1, weight_2)

# Calculate the output
print(torch.matmul(hidden_2, weight_3))

# Calculate weight_composed_1 and weight
weight_composed_1 = torch.matmul(weight_1, weight_2)
weight = torch.matmul(weight_composed_1, weight_3)

# Multiply input_layer with weight
print(torch.matmul(input_layer, weight))
```
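The exercise relies on tensors that were prepared beforehand. To run it standalone, here is a minimal sketch under assumed shapes: a single sample with 4 features and three 4x4 weight matrices. The random values, the seed, and the names `output_layered` and `output_composed` are placeholders introduced for illustration.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only for reproducibility

# Assumed shapes: one sample with 4 features, three 4x4 weight matrices
input_layer = torch.rand(1, 4)
weight_1 = torch.rand(4, 4)
weight_2 = torch.rand(4, 4)
weight_3 = torch.rand(4, 4)

# Layer-by-layer forward pass without any activation function
hidden_1 = torch.matmul(input_layer, weight_1)
hidden_2 = torch.matmul(hidden_1, weight_2)
output_layered = torch.matmul(hidden_2, weight_3)

# The same network collapsed into a single weight matrix
weight_composed_1 = torch.matmul(weight_1, weight_2)
weight = torch.matmul(weight_composed_1, weight_3)
output_composed = torch.matmul(input_layer, weight)

# The two outputs agree up to floating-point rounding
print(torch.allclose(output_layered, output_composed, atol=1e-4))
```

Comparing with `torch.allclose` rather than exact equality is deliberate: the two routes perform the floating-point operations in a different order, so the last digits can differ.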

# Exercise II: ReLU activation

In this exercise, we have the same settings as in the previous exercise. In addition, we have instantiated the `ReLU` activation function as `relu`.

Now we are going to build a neural network with non-linearity. By doing so, we are going to convince ourselves that a network with multiple layers and non-linear activation functions cannot be expressed as a neural network with one layer.

<img src="image/net-ex.jpg">

### Instructions

- Apply non-linearity on `hidden_1` and `hidden_2`.
- Apply non-linearity to the product of the first two weights.
- Multiply the result of the previous step with `weight_3`.
- Multiply `input_layer` with `weight` and print the results.


```
# Apply non-linearity on hidden_1 and hidden_2
hidden_1_activated = relu(torch.matmul(input_layer, weight_1))
hidden_2_activated = relu(torch.matmul(hidden_1_activated, weight_2))
print(torch.matmul(hidden_2_activated, weight_3))

# Apply non-linearity to the product of the first two weights
weight_composed_1_activated = relu(torch.matmul(weight_1, weight_2))

# Multiply weight_composed_1_activated with weight_3
weight = torch.matmul(weight_composed_1_activated, weight_3)

# Multiply input_layer with weight
print(torch.matmul(input_layer, weight))
```
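To see why the two results differ, it can help to trace a case small enough to follow by hand. The sketch below uses a deliberately simplified toy setup, which is an assumption and not the exercise's data: 2 input features, identity matrices for the first two weights, and a single output unit.

```python
import torch
import torch.nn as nn

relu = nn.ReLU()

# Hand-picked toy values (assumptions chosen so the effect is easy to trace)
input_layer = torch.tensor([[1., -2.]])
weight_1 = torch.eye(2)
weight_2 = torch.eye(2)
weight_3 = torch.tensor([[1.], [1.]])

# ReLU applied to the activations: the negative feature is zeroed out
hidden_1_activated = relu(torch.matmul(input_layer, weight_1))
hidden_2_activated = relu(torch.matmul(hidden_1_activated, weight_2))
output_layered = torch.matmul(hidden_2_activated, weight_3)

# ReLU applied to the composed weights: they are already non-negative,
# so nothing changes and the negative feature passes straight through
weight_composed_1_activated = relu(torch.matmul(weight_1, weight_2))
weight = torch.matmul(weight_composed_1_activated, weight_3)
output_composed = torch.matmul(input_layer, weight)

print(output_layered)   # tensor([[1.]])
print(output_composed)  # tensor([[-1.]])
```

ReLU acts on the activations, which depend on the input, so no fixed weight matrix can reproduce its effect for every input.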

# Exercise III: ReLU activation again

Neural networks don't need to have the same number of units in each layer. Here, we are going to experiment with the `ReLU` activation function again, but this time with a different number of units in each layer of the network. The input layer still has `4` features, the first hidden layer has `6` units, and the output layer has `2` units.

<img src="image/net-ex2.jpg">

### Instructions

- Instantiate the `ReLU()` activation function as `relu` (it is part of the `nn` module).
- Initialize `weight_1` and `weight_2` with random numbers.
- Multiply the `input_layer` with `weight_1`, storing the result in `hidden_1`.
- Apply the `relu` activation function to `hidden_1`, then multiply its output with `weight_2`.


```
# Instantiate ReLU activation function as relu
relu = nn.ReLU()

# Initialize weight_1 and weight_2 with random numbers
weight_1 = torch.rand(4, 6)
weight_2 = torch.rand(6, 2)

# Multiply input_layer with weight_1
hidden_1 = torch.matmul(input_layer, weight_1)

# Apply ReLU activation function over hidden_1 and multiply with weight_2
hidden_1_activated = relu(hidden_1)
print(torch.matmul(hidden_1_activated, weight_2))
```
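The shapes are the part of this exercise most worth checking. Here is a self-contained sketch that assumes a single random input sample with 4 features; the seed, the input values, and the name `output` are placeholders introduced for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only for reproducibility

# Assumed input: one sample with 4 features
input_layer = torch.rand(1, 4)

relu = nn.ReLU()
weight_1 = torch.rand(4, 6)  # 4 input features -> 6 hidden units
weight_2 = torch.rand(6, 2)  # 6 hidden units -> 2 output units

hidden_1 = torch.matmul(input_layer, weight_1)
output = torch.matmul(relu(hidden_1), weight_2)

print(hidden_1.shape)  # torch.Size([1, 6])
print(output.shape)    # torch.Size([1, 2])
```

The inner dimensions of consecutive matrices must match (4 with 4, 6 with 6), while the outer dimensions set the output shape. Note that `torch.rand` samples from [0, 1), so with a non-negative input every pre-activation here is non-negative and ReLU leaves it unchanged; weights drawn with `torch.randn` would produce negative values for ReLU to zero out.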