In [1]:
import torch

In [2]:
def activation(x):
    """Sigmoid activation: maps any real input into the range (0, 1)."""
    return 1 / (1 + torch.exp(-x))

In [3]:
torch.manual_seed(7)

<torch._C.Generator at 0x1e14db70c70>

In [4]:
features = torch.randn((1,5))

In [5]:
features

tensor([[-0.1468,  0.7861,  0.9468, -1.1143,  1.6908]])

In [8]:
weights = torch.rand_like(features)

In [9]:
weights

tensor([[0.2868, 0.2063, 0.4451, 0.3593, 0.7204]])

In [10]:
bias = torch.randn((1,1))

In [18]:
y = activation(torch.sum(features*weights)+bias)

In [19]:
y

tensor([[0.6140]])

## Using matrix multiplication
* <mark>torch.mm</mark> - Stricter: accepts only 2-D tensors and does not broadcast
* <mark>torch.matmul</mark> - Supports broadcasting
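
A minimal sketch of the difference between the two functions, using all-ones tensors chosen only for illustration (the names `a`, `b`, and `batch` are not part of the notebook). It shows `torch.mm` rejecting mismatched shapes, both functions agreeing on a valid 2-D product, and `torch.matmul` broadcasting over a leading batch dimension.

```python
import torch

a = torch.ones(1, 5)
b = torch.ones(1, 5)

# torch.mm needs 2-D inputs whose inner dimensions match exactly.
try:
    torch.mm(a, b)  # (1x5) times (1x5): inner dimensions 5 and 1 differ
    mm_failed = False
except RuntimeError:
    mm_failed = True

# With a matching inner dimension, both functions give the same result.
mm_out = torch.mm(a, b.T)          # shape (1, 1)
matmul_out = torch.matmul(a, b.T)  # shape (1, 1)

# torch.matmul also broadcasts leading batch dimensions.
batch = torch.ones(4, 1, 5)             # 4 stacked row vectors
batched_out = torch.matmul(batch, b.T)  # shape (4, 1, 1)

print(mm_failed, mm_out.shape, batched_out.shape)
```

Because `torch.mm` fails loudly on a shape mismatch, it can be the safer choice while learning; `torch.matmul` is more convenient once batches are involved.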

In [20]:
torch.mm(features,weights)

RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x5 and 1x5)

In [23]:
torch.mm(features.T, weights)

tensor([[-0.0421, -0.0303, -0.0653, -0.0527, -0.1057],
        [ 0.2255,  0.1622,  0.3499,  0.2824,  0.5663],
        [ 0.2716,  0.1953,  0.4214,  0.3402,  0.6821],
        [-0.3196, -0.2299, -0.4960, -0.4004, -0.8028],
        [ 0.4850,  0.3488,  0.7526,  0.6075,  1.2180]])

In [24]:
torch.mm(features, weights.T)

tensor([[1.3592]])

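## Methods to reshape
* <mark> .reshape() </mark>\
Returns a tensor with the same data and the requested size. The result is a view when possible and a copy otherwise.
* <mark> .resize_() </mark>\
Changes the shape of the same tensor in place. If the new shape has fewer elements, the extra elements are dropped from the tensor (but not from memory).
* <mark> .view() </mark>\
Returns a new tensor that shares the same data, with no copying. It raises an error if the requested shape is not compatible with the tensor's size and stride.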
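
The following sketch illustrates how these methods treat the underlying data. The tensors `t`, `tt`, and `w` are hypothetical examples, not values from this notebook, and the in-place method is spelled `.resize_()` with a trailing underscore.

```python
import torch

t = torch.arange(6.0).reshape(2, 3)

# .view() never copies: the result shares storage with t.
v = t.view(3, 2)
v[0, 0] = 100.0
shares_memory = t[0, 0].item() == 100.0

# A transposed tensor is not contiguous, so .view() cannot reinterpret it.
tt = t.T
try:
    tt.view(6)
    view_failed = False
except RuntimeError:
    view_failed = True

# .reshape() falls back to copying the data in that case.
r = tt.reshape(6)
r[0] = -1.0
copy_is_independent = t[0, 0].item() == 100.0

# .resize_() changes the shape of the existing tensor in place.
w = torch.zeros(2, 3)
w.resize_(3, 2)

print(shares_memory, view_failed, copy_is_independent, w.shape)
```

In practice `.view()` is a reasonable default when a copy would be a surprise, since it fails instead of silently duplicating data.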

In [25]:
torch.mm(features.reshape(5,1), weights)

tensor([[-0.0421, -0.0303, -0.0653, -0.0527, -0.1057],
        [ 0.2255,  0.1622,  0.3499,  0.2824,  0.5663],
        [ 0.2716,  0.1953,  0.4214,  0.3402,  0.6821],
        [-0.3196, -0.2299, -0.4960, -0.4004, -0.8028],
        [ 0.4850,  0.3488,  0.7526,  0.6075,  1.2180]])

In [29]:
torch.mm(features.view(5,1), weights)

tensor([[-0.0421, -0.0303, -0.0653, -0.0527, -0.1057],
        [ 0.2255,  0.1622,  0.3499,  0.2824,  0.5663],
        [ 0.2716,  0.1953,  0.4214,  0.3402,  0.6821],
        [-0.3196, -0.2299, -0.4960, -0.4004, -0.8028],
        [ 0.4850,  0.3488,  0.7526,  0.6075,  1.2180]])

Applied to our example:

In [32]:
activation(torch.mm(features, weights.view(5,1)).sum()+bias)

tensor([[0.6140]])

### Stack them up!

That's how you can calculate the output for a single neuron. The real power of this algorithm appears when you start stacking these individual units into layers, and layers into a network of neurons. The output of one layer of neurons becomes the input for the next layer. With multiple input units and output units, we now need to express the weights as a matrix.

![image.png](attachment:image.png)
The first layer, shown on the bottom here, holds the inputs and is understandably called the **input layer**. The middle layer is called the **hidden layer**, and the final layer (on the right) is the **output layer**. We can express this network mathematically with matrices again and use matrix multiplication to get the linear combinations for every unit in one operation. For example, the hidden layer ($h_1$ and $h_2$ here) can be calculated as

$$
\vec{h} = [h_1 \, h_2] = 
\begin{bmatrix}
x_1 \, x_2 \cdots \, x_n
\end{bmatrix}
\cdot 
\begin{bmatrix}
           w_{11} & w_{12} \\
           w_{21} &w_{22} \\
           \vdots &\vdots \\
           w_{n1} &w_{n2}
\end{bmatrix}
$$
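
As a small worked instance of the product above, the sketch below uses hand-picked numbers (hypothetical values with $n = 3$, not the data from this notebook) and checks that one matrix multiplication gives the same linear combinations as computing each hidden unit separately.

```python
import torch

# Hypothetical values: 3 inputs feeding 2 hidden units.
x = torch.tensor([[1.0, 2.0, 3.0]])   # shape (1, 3), a row vector
W = torch.tensor([[0.5, -1.0],
                  [0.0,  2.0],
                  [1.0,  0.5]])       # shape (3, 2), column j feeds hidden unit j

# All linear combinations in one operation.
h = torch.mm(x, W)                    # shape (1, 2)

# The same values computed one hidden unit at a time.
h1 = torch.sum(x * W[:, 0])           # 1*0.5 + 2*0.0 + 3*1.0 = 3.5
h2 = torch.sum(x * W[:, 1])           # 1*(-1.0) + 2*2.0 + 3*0.5 = 4.5

print(h, h1, h2)
```

Each column of the weight matrix holds the weights of one hidden unit, which is why the weight matrix has shape (number of inputs, number of hidden units).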

The output for this small network is found by treating the hidden layer as inputs for the output unit. The network output is expressed simply as

$$
y =  f_2 \! \left(\, f_1 \! \left(\vec{x} \, \mathbf{W_1}\right) \mathbf{W_2} \right)
$$

> **Exercise:** Calculate the output for this multi-layer network using the weights `W1` & `W2`, and the biases, `B1` & `B2`. 

In [33]:
### Generate some data
torch.manual_seed(7) # Set the random seed so things are predictable

# Features are 3 random normal variables
features = torch.randn((1, 3))  # Input as a row vector

# Define the size of each layer in our network
n_input = features.shape[1]     # Number of input units, must match number of input features
n_hidden = 2                    # Number of hidden units 
n_output = 1                    # Number of output units

# Weights for inputs to hidden layer
W1 = torch.randn(n_input, n_hidden)
# Weights for hidden layer to output layer
W2 = torch.randn(n_hidden, n_output)

# and bias terms for hidden and output layers
B1 = torch.randn((1, n_hidden))
B2 = torch.randn((1, n_output))

In [39]:
print(W1)
print(W1.shape)

tensor([[-1.1143,  1.6908],
        [-0.8948, -0.3556],
        [ 1.2324,  0.1382]])
torch.Size([3, 2])


In [40]:
print(W2)
print(W2.shape)

tensor([[-1.6822],
        [ 0.3177]])
torch.Size([2, 1])


In [57]:
### Solution

layer1_output = activation(torch.mm(features, W1) + B1)
output = activation(torch.mm(layer1_output, W2) + B2)
print(output)

tensor([[0.3171]])
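
The same calculation can be sketched as a loop over layers. This is only one possible way to organise it: `forward` is a hypothetical helper that is not defined anywhere in this notebook, and it uses the built-in `torch.sigmoid` in place of the hand-written `activation`, on the assumption that both compute the same sigmoid function.

```python
import torch


def forward(x, layers):
    """Pass x through each (weights, bias) pair, applying a sigmoid after each."""
    for W, B in layers:
        x = torch.sigmoid(torch.mm(x, W) + B)
    return x


# Same setup as the exercise above.
torch.manual_seed(7)
features = torch.randn((1, 3))
W1 = torch.randn(3, 2)
W2 = torch.randn(2, 1)
B1 = torch.randn((1, 2))
B2 = torch.randn((1, 1))

output = forward(features, [(W1, B1), (W2, B2)])
print(output)
```

Written this way, adding another layer only means appending one more `(weights, bias)` pair to the list, provided each weight matrix has as many rows as the previous layer has units.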
