In [1]:
# First import PyTorch
import torch

In [2]:
def activation(x):
    """
    Sigmoid activation function
    
    Arguments:
    ----------
    x: torch.Tensor
    """
    return 1/(1 + torch.exp(-x))
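
As a quick sanity check, the hand-written sigmoid can be compared against PyTorch's built-in `torch.sigmoid`. This is only a minimal sketch: the input tensor `x` is an arbitrary example chosen for illustration and is not data from this notebook.

```python
import torch

def activation(x):
    """Sigmoid activation, as defined above."""
    return 1 / (1 + torch.exp(-x))

# Arbitrary example inputs (an assumption, not notebook data)
x = torch.tensor([-2.0, 0.0, 2.0])

manual = activation(x)
builtin = torch.sigmoid(x)

# The two versions should agree up to floating-point error
print(torch.allclose(manual, builtin))
```

In practice the built-in is usually the safer choice, but writing the function by hand makes the formula explicit.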

# Tensors, for a Layer

In [3]:
### Generate some data

# Set the random seed, so things are predictable
torch.manual_seed(7)

# Features are 5 random normal variables
features = torch.randn((1, 5))

# True weights for our data,
# random normal variables again
weigths = torch.randn_like(features)
# and a true bias term
bias = torch.rand((1, 1))
print(weigths)

tensor([[-0.8948, -0.3556,  1.2324,  0.1382, -1.6822]])


`features = torch.randn((1, 5))` creates a tensor with shape `(1, 5)`, one row and five columns, that contains values randomly distributed according to the normal distribution with a mean of zero and standard deviation of one. 

`weights = torch.randn_like(features)` creates another tensor with the same shape as `features`, again containing values from a normal distribution.

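Finally, `bias = torch.rand((1, 1))` creates a single value drawn uniformly from the interval `[0, 1)`. Note that this is `torch.rand`, not `torch.randn`, so the bias is not normally distributed.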
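
The shapes and distributions described above can be inspected directly. The sketch below uses an arbitrary seed and sample size, both assumptions made only for this illustration. It follows the code cell in drawing the bias with `torch.rand`, which samples uniformly from `[0, 1)`, whereas `torch.randn` samples from a standard normal distribution.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, chosen only for this illustration

features = torch.randn((1, 5))
weights = torch.randn_like(features)  # same shape (and dtype) as `features`
bias = torch.rand((1, 1))

print(features.shape, weights.shape, bias.shape)

# Draw many samples to compare the two distributions
samples_uniform = torch.rand(10_000)   # uniform on [0, 1)
samples_normal = torch.randn(10_000)   # mean 0, standard deviation 1

print(samples_uniform.min(), samples_uniform.max())
print(samples_normal.mean(), samples_normal.std())
```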

In [4]:
## Calculate the output of this network using
## the weights and bias tensors
output = activation(
        torch.sum(features * weigths) + bias
    )
print(output)

tensor([[0.2504]])


You can do the multiplication and sum in the same operation using a matrix multiplication. In general, you'll want to use matrix multiplications since they are more efficient and accelerated using modern libraries and high-performance computing on GPUs.

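However, we can run into shape problems. Here `features` and `weights` both have shape `(1, 5)`, so `torch.mm(features, weights)` fails: matrix multiplication requires the number of columns in the first tensor to equal the number of rows in the second. To fix this, we need to change the shape of `weights`.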
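
The shape problem can be reproduced in a few lines. This is a minimal sketch with freshly generated tensors (not the seeded values from the cells above). It shows that multiplying two `(1, 5)` tensors with `torch.mm` raises a `RuntimeError`, while the elementwise product still works.

```python
import torch

features = torch.randn((1, 5))
weights = torch.randn_like(features)

try:
    # (1, 5) x (1, 5): 5 columns on the left, but only 1 row on the right
    torch.mm(features, weights)
    mm_failed = False
except RuntimeError as err:
    mm_failed = True
    print("torch.mm failed:", err)

# Elementwise multiplication only needs matching (or broadcastable) shapes
elementwise = features * weights
print(elementwise.shape)
```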

There are a few options here: [`weights.reshape()`](https://pytorch.org/docs/stable/tensors.html#torch.Tensor.reshape), [`weights.resize_()`](https://pytorch.org/docs/stable/tensors.html#torch.Tensor.resize_), and [`weights.view()`](https://pytorch.org/docs/stable/tensors.html#torch.Tensor.view).

* `weights.reshape(a, b)` returns a tensor of size `(a, b)` with the same data as `weights`. When possible it returns a view that shares memory with `weights`; otherwise it returns a clone, meaning it copies the data to another part of memory.
* `weights.resize_(a, b)` returns the same tensor with a different shape. However, if the new shape results in fewer elements than the original tensor, some elements will be removed from the tensor (but not from memory). If the new shape results in more elements than the original tensor, new elements will be uninitialized in memory. Here I should note that the underscore at the end of the method denotes that this method is performed **in-place**. Here is a great forum thread to [read more about in-place operations](https://discuss.pytorch.org/t/what-is-in-place-operation/16244) in PyTorch.
* `weights.view(a, b)` will return a new tensor with the same data as `weights` with size `(a, b)`.
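
The differences between the three methods can be seen in a small experiment. The sketch below uses made-up example tensors (an assumption for illustration) and compares `data_ptr()` values to check whether two tensors share the same underlying memory.

```python
import torch

weights = torch.arange(5.0).reshape(1, 5)  # example values 0..4

viewed = weights.view(5, 1)
reshaped = weights.reshape(5, 1)
print(viewed.shape, reshaped.shape)

# A view shares storage: editing it also changes the original tensor
viewed[0, 0] = 100.0
print(weights[0, 0])

# For a contiguous tensor, reshape also returns a view (same memory)
shares = reshaped.data_ptr() == weights.data_ptr()

# For a non-contiguous tensor, view fails while reshape falls back to a copy
transposed = torch.arange(6.0).reshape(2, 3).t()  # shape (3, 2), non-contiguous
try:
    transposed.view(6)
    view_failed = False
except RuntimeError:
    view_failed = True
copied = transposed.reshape(6)
print(shares, view_failed, copied.data_ptr() == transposed.data_ptr())

# resize_ works in place; shrinking keeps only the leading elements
w = torch.arange(5.0)
w.resize_(2, 2)
print(w)
```

Because `.view()` raises an error instead of silently copying, it is a reasonable default when you expect the result to share memory with the original.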

I usually use `.view()`, but any of the three methods will work for this. So, now we can reshape `weights` to have five rows and one column with something like `weights.view(5, 1)`.

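`torch.mm` is stricter about tensor shapes: it only accepts two 2-D matrices and raises an error if their shapes do not line up. `torch.matmul` supports broadcasting and other input shapes, so it can sometimes return a result with a shape you did not intend instead of raising an error. For plain matrix multiplication, prefer `torch.mm` so that shape mistakes surface immediately.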
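
One case where the two functions behave differently is a 1-D operand. In the sketch below, `weights_1d` is a hypothetical 1-D tensor introduced only for illustration: `torch.matmul` accepts it and returns a 1-D result, while `torch.mm` rejects it until it is reshaped into a column matrix.

```python
import torch

features = torch.randn(1, 5)
weights_1d = torch.randn(5)  # 1-D tensor, hypothetical example

# torch.matmul treats this as a matrix-vector product
out_matmul = torch.matmul(features, weights_1d)
print(out_matmul.shape)

# torch.mm insists on two 2-D matrices
try:
    torch.mm(features, weights_1d)
    mm_rejected = False
except RuntimeError as err:
    mm_rejected = True
    print("torch.mm failed:", err)

# After an explicit reshape, torch.mm returns a 2-D result
out_mm = torch.mm(features, weights_1d.view(5, 1))
print(out_mm.shape)
```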

In [5]:
## Calculate the output of this network using
## the matrix multiplication
output = activation(
        torch.mm(features, weigths.view(5, 1)) + bias
    )
print(output)

tensor([[0.2504]])


# Stack Them Up - Multiple Layers

In [6]:
### Generate some data

# Set seed
torch.manual_seed(7)

features = torch.randn((1, 3))

# Define the size of each layer in our network
# Number of input units, must match number of input features
n_input = features.shape[1]
n_hidden = 2
n_output = 1

# Weights for inputs to hidden layer
W1 = torch.randn(n_input, n_hidden)
# Weights for hidden layer to output layer
W2 = torch.randn(n_hidden, n_output)

# and bias terms for hidden and output layers
B1 = torch.randn((1, n_hidden))
B2 = torch.randn((1, n_output))

Calculate the output for this multi-layer network.

In [7]:
hidden_output = activation(torch.mm(features, W1) + B1)

output = activation(torch.mm(hidden_output, W2) + B2)

print(output)

tensor([[0.3171]])
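
The same pattern extends to any number of layers: multiply by the weights, add the bias, apply the activation, and pass the result on. The sketch below wraps this in a hypothetical `forward` helper (not part of the original notebook) that loops over a list of `(weights, bias)` pairs.

```python
import torch

def activation(x):
    """Sigmoid activation, as defined earlier."""
    return 1 / (1 + torch.exp(-x))

def forward(x, layers):
    """Apply each (weights, bias) pair in turn, with a sigmoid after every layer."""
    for W, B in layers:
        x = activation(torch.mm(x, W) + B)
    return x

torch.manual_seed(7)
features = torch.randn((1, 3))

# Same layer sizes as above: 3 inputs, 2 hidden units, 1 output
W1 = torch.randn(3, 2)
W2 = torch.randn(2, 1)
B1 = torch.randn((1, 2))
B2 = torch.randn((1, 1))

output = forward(features, [(W1, B1), (W2, B2)])
print(output.shape)
```

Each weight matrix must have as many rows as the previous layer has outputs, otherwise `torch.mm` raises a shape error.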


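# NumPy to Torch and Back

PyTorch has a great feature for converting between NumPy arrays and Torch tensors. To create a tensor from a NumPy array, use `torch.from_numpy()`. To convert a tensor back to a NumPy array, use the `.numpy()` method.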

In [8]:
import numpy as np
a = np.random.rand(4,3)
a

array([[ 0.21654698,  0.29338615,  0.92256296],
       [ 0.6410212 ,  0.10907066,  0.91496208],
       [ 0.19697447,  0.35923486,  0.391553  ],
       [ 0.55148733,  0.76812176,  0.51890214]])

In [9]:
b = torch.from_numpy(a)
b

tensor([[0.2165, 0.2934, 0.9226],
        [0.6410, 0.1091, 0.9150],
        [0.1970, 0.3592, 0.3916],
        [0.5515, 0.7681, 0.5189]], dtype=torch.float64)

In [10]:
b.numpy()

array([[ 0.21654698,  0.29338615,  0.92256296],
       [ 0.6410212 ,  0.10907066,  0.91496208],
       [ 0.19697447,  0.35923486,  0.391553  ],
       [ 0.55148733,  0.76812176,  0.51890214]])

The memory is shared between the NumPy array and the Torch tensor, so if we change the values of one object **in-place**, the other will change as well.

In [11]:
# Multiply PyTorch Tensor by 2, in place
b.mul_(2)

tensor([[0.4331, 0.5868, 1.8451],
        [1.2820, 0.2181, 1.8299],
        [0.3939, 0.7185, 0.7831],
        [1.1030, 1.5362, 1.0378]], dtype=torch.float64)

In [12]:
# Numpy array matches new values from Tensor
a

array([[ 0.43309395,  0.58677229,  1.84512592],
       [ 1.2820424 ,  0.21814132,  1.82992416],
       [ 0.39394894,  0.71846972,  0.783106  ],
       [ 1.10297465,  1.53624351,  1.03780427]])
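
If you do not want the array and the tensor to share memory, one option is to build the tensor with `torch.tensor()`, which copies the data. The sketch below uses a small array of ones (an arbitrary example) to contrast the two behaviours.

```python
import numpy as np
import torch

a = np.ones((2, 3))

shared = torch.from_numpy(a)  # shares memory with `a`
copied = torch.tensor(a)      # independent copy of the data

# In-place change on the shared tensor is visible through the array
shared.mul_(2)
print(a[0, 0])

# In-place change on the copy leaves the array untouched
copied.mul_(5)
print(a[0, 0])

# An out-of-place operation creates a new tensor and leaves `a` as it was
doubled = shared * 2
print(a[0, 0], doubled[0, 0])
```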