# 5 Statistical Functions for Random Sampling in PyTorch

PyTorch provides several functions for drawing random samples from common probability distributions. This notebook covers five of them:

- torch.bernoulli()
- torch.normal()
- torch.poisson()
- torch.randn()
- torch.randperm()

In [2]:
# Import torch and other required modules
import torch

## Function 1 - torch.bernoulli(input, *, generator=None, out=None)

It draws binary random numbers (`0` or `1`) from a Bernoulli distribution. Each element of `input` is the probability that the corresponding output element is `1`, and the output has the same shape as `input`.

***Parameters***:

**input (Tensor)** – the input tensor of probability values for the Bernoulli distribution

***Keyword Arguments***:

**generator (torch.Generator, optional)** – a pseudorandom number generator for sampling

**out (Tensor, optional)** – the output tensor

In [3]:
# Example 1 - working
rand_m = torch.rand(4, 4) # generate a random matrix of shape 4x4
print(rand_m)

torch.bernoulli(rand_m) # draws a binary random number (0 or 1)

tensor([[0.0643, 0.0690, 0.5069, 0.0323],
        [0.3004, 0.6761, 0.1933, 0.7579],
        [0.2958, 0.8812, 0.2917, 0.3713],
        [0.8113, 0.0845, 0.9176, 0.9883]])


tensor([[0., 0., 1., 0.],
        [1., 1., 0., 1.],
        [0., 1., 0., 0.],
        [1., 1., 1., 1.]])

Each element of the random matrix is used as the probability of drawing a `1` at that position, so larger entries are more likely to produce a `1`.

In [4]:
# Example 2 - working
uni_m = torch.empty(4, 4).uniform_(0, 1) # generate a uniform random matrix with range [0, 1]
print(uni_m)

torch.bernoulli(uni_m) # draws a binary random number (0 or 1)

tensor([[0.1723, 0.1165, 0.3703, 0.8497],
        [0.5641, 0.0323, 0.1722, 0.3173],
        [0.1663, 0.7910, 0.7019, 0.3336],
        [0.8514, 0.7596, 0.9491, 0.4763]])


tensor([[0., 0., 1., 1.],
        [1., 0., 0., 0.],
        [0., 0., 1., 0.],
        [1., 0., 1., 0.]])

The same holds for a matrix filled with `uniform_(0, 1)`: every entry is a valid probability, and the output is a `0`/`1` tensor of the same shape.

In [5]:
# Example 3 - breaking (to illustrate when it breaks)
b = torch.ones(4, 4, dtype = torch.int32) # generate a matrix of shape 4x4 with 1 on each element
print(b)

torch.bernoulli(b)

tensor([[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]], dtype=torch.int32)


RuntimeError: "bernoulli_tensor_cpu_p_" not implemented for 'Int'

The call fails because `input` holds probabilities and must therefore have a floating-point `dtype`. The output can have an integral `dtype`, but the input cannot.
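
One way around the error, as a minimal sketch: cast the integer tensor to a floating-point `dtype` before sampling. The names `probs` and `samples` are illustrative only. Because every probability here is `1.0`, every draw is `1`.

```python
import torch

# Integer tensor from the breaking example above.
b = torch.ones(4, 4, dtype=torch.int32)

# Cast to a floating-point dtype so the values can be read as probabilities.
probs = b.to(torch.float32)

# Every probability is 1.0, so every draw is 1.
samples = torch.bernoulli(probs)
print(samples)

# Cast afterwards if an integer result is needed.
print(samples.to(torch.int32).dtype)  # torch.int32
```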

A Bernoulli distribution models a random experiment that has exactly two outcomes (usually called `Success` and `Failure`), where success occurs with probability `p`.

It is best used when an event has exactly two possible outcomes, such as a coin flip.
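
To see the two-outcome behaviour in numbers, the sketch below draws many samples with the same success probability and checks that the fraction of ones is close to it. The probability `0.3`, the sample count, and the seed are assumptions chosen for illustration.

```python
import torch

# A dedicated generator keeps the example reproducible
# without touching the global random state.
gen = torch.Generator().manual_seed(0)

p = 0.3  # assumed success probability, for illustration only
probs = torch.full((100_000,), p)

samples = torch.bernoulli(probs, generator=gen)

# The fraction of ones should be close to p.
print(samples.mean().item())
```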

## Function 2 - torch.normal(mean, std, *, generator=None, out=None)

Returns a tensor of random numbers drawn from separate normal distributions whose mean and standard deviation are given.

The `mean` is a tensor with the mean of each output element’s normal distribution.

The `std` is a tensor with the standard deviation of each output element’s normal distribution.

The shapes of `mean` and `std` don’t need to match, but the total number of elements in each tensor needs to be the same.

***Parameters***:

**mean (Tensor)** – the tensor of per-element means

**std (Tensor)** – the tensor of per-element standard deviations

***Keyword Arguments***:

**generator (torch.Generator, optional)** – a pseudorandom number generator for sampling

**out (Tensor, optional)** – the output tensor

In [6]:
# Example 1 - working
torch.normal(mean = torch.arange(1., 11.), std = torch.arange(1, 0, -0.1))

tensor([ 1.2426,  3.9152,  1.9166,  3.5866,  4.1520,  6.2800,  6.4655,  7.8572,
         8.9195, 10.0185])

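Each output element is drawn from its own normal distribution: the means run from 1 to 10 while the standard deviations shrink from 1.0 to 0.1, so later elements tend to land closer to their means.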

In [7]:
# Example 2 - working
torch.normal(mean = torch.arange(1., 6.))

tensor([0.0205, 0.4621, 3.4993, 4.7508, 5.3769])

When `std` is omitted, each element is drawn from a normal distribution with the given mean and a standard deviation of 1.

In [8]:
# Example 3 - breaking (to illustrate when it breaks)
torch.normal(mean = 0, std = 1)

TypeError: normal() received an invalid combination of arguments - got (mean=int, std=int, ), but expected one of:
 * (Tensor mean, Tensor std, *, torch.Generator generator, Tensor out)
 * (Tensor mean, float std, *, torch.Generator generator, Tensor out)
 * (float mean, Tensor std, *, torch.Generator generator, Tensor out)
 * (float mean, float std, tuple of ints size, *, torch.Generator generator, Tensor out, torch.dtype dtype, torch.layout layout, torch.device device, bool pin_memory, bool requires_grad)


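The call fails because, when both `mean` and `std` are scalars, `torch.normal` also needs a `size` argument to know the shape of the output, as the last overload in the error message shows.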
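
The overloads listed in the error message suggest two ways to make the call work. The sketch below tries both; the shapes are arbitrary and chosen only for illustration.

```python
import torch

# Scalar mean and std: the output shape must be given explicitly via `size`.
a = torch.normal(0.0, 1.0, size=(2, 3))
print(a.shape)  # torch.Size([2, 3])

# Tensor mean with a scalar std: the output shape comes from `mean`.
b = torch.normal(mean=torch.zeros(4), std=1.0)
print(b.shape)  # torch.Size([4])
```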

The normal distribution is a continuous probability distribution that is symmetric about its mean, with a spread set by its standard deviation.
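
As a rough check of that description, the sketch below draws many samples and compares their empirical mean and standard deviation with the requested ones. The values `5.0` and `2.0`, the sample count, and the seed are assumptions for illustration.

```python
import torch

gen = torch.Generator().manual_seed(0)

mean = torch.full((100_000,), 5.0)  # assumed mean, for illustration only
std = 2.0                           # assumed standard deviation

samples = torch.normal(mean, std, generator=gen)

# Both statistics should be close to the requested values.
print(samples.mean().item())
print(samples.std().item())
```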

## Function 3 - torch.poisson(input, *, generator=None)

It returns a tensor of the same size as `input` with each element sampled from a Poisson distribution with rate parameter given by the corresponding element in `input`.

***Parameters***:

**input (Tensor)** – the input tensor containing the rates of the Poisson distribution

***Keyword Arguments***:

**generator (torch.Generator, optional)** – a pseudorandom number generator for sampling

In [9]:
# Example 1 - working
rates = torch.rand(4, 4) * 4  # rate parameter between 0 and 4
print(rates)

torch.poisson(rates)

tensor([[3.0679, 3.7367, 1.9038, 1.6546],
        [1.8523, 1.0336, 3.4176, 3.6221],
        [3.7246, 2.7234, 0.6564, 0.5517],
        [2.0784, 3.7418, 1.3999, 2.3717]])


tensor([[2., 6., 1., 2.],
        [2., 0., 3., 8.],
        [4., 3., 0., 0.],
        [1., 3., 3., 2.]])

Returns a tensor of Poisson samples, one per element of the input, for rate parameters between 0 and 4.

In [10]:
# Example 2 - working
rates = torch.rand(5, 4) * 7  # rate parameter between 0 and 7
print(rates)

torch.poisson(rates)

tensor([[1.3724, 5.5118, 5.4870, 5.1002],
        [0.9739, 2.7864, 2.1179, 3.6325],
        [6.7643, 5.6531, 1.9739, 2.6923],
        [2.5331, 6.6457, 4.4047, 2.9525],
        [5.1334, 5.9861, 3.9731, 4.5331]])


tensor([[ 2.,  5.,  6.,  4.],
        [ 2.,  2.,  6.,  5.],
        [11.,  3.,  3.,  4.],
        [ 4.,  9.,  5.,  4.],
        [ 6.,  6.,  5.,  4.]])

Returns a tensor of Poisson samples, one per element of the input, for rate parameters between 0 and 7.

In [11]:
# Example 3 - breaking (to illustrate when it breaks)
rates = torch.ones(3,3, dtype = torch.int32)
print(rates)

torch.poisson(rates)

tensor([[1, 1, 1],
        [1, 1, 1],
        [1, 1, 1]], dtype=torch.int32)


RuntimeError: "poisson_cpu" not implemented for 'Int'

`torch.poisson` does not work with an integral `dtype`; the rates must be floating-point.
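
A minimal sketch of one fix: cast the rates to a floating-point `dtype` before sampling. The name `counts` is illustrative. The samples are whole numbers, but they are stored as floats.

```python
import torch

# Integer rates from the breaking example above.
rates = torch.ones(3, 3, dtype=torch.int32)

# Cast the rates to floating point before sampling.
counts = torch.poisson(rates.float())
print(counts)
print(counts.dtype)  # torch.float32
```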

The Poisson distribution models how many times an event occurs within a fixed interval of time, given its average rate of occurrence.
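
A known property of the Poisson distribution is that its mean and variance both equal the rate. The sketch below checks this empirically; the rate `4.0`, the sample count, and the seed are assumptions for illustration.

```python
import torch

gen = torch.Generator().manual_seed(0)

rate = 4.0  # assumed average number of events per interval
rates = torch.full((100_000,), rate)

counts = torch.poisson(rates, generator=gen)

# For a Poisson distribution, mean and variance are both equal to the rate.
print(counts.mean().item())
print(counts.var().item())
```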

## Function 4 - torch.randn(*size, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False)

It returns a tensor filled with random numbers from a normal distribution with mean `0` and variance `1` (also called the `standard normal distribution`).

***Parameters***:

**size (int...)** – a sequence of integers defining the shape of the output tensor. Can be a variable number of arguments or a collection like a list or tuple.

***Keyword Arguments***:

**out (Tensor, optional)** – the output tensor.

**dtype (torch.dtype, optional)** – the desired data type of returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()).

**layout (torch.layout, optional)** – the desired layout of returned Tensor. Default: torch.strided.

**device (torch.device, optional)** – the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.

**requires_grad (bool, optional)** – If autograd should record operations on the returned tensor. Default: False.

In [12]:
# Example 1 - working
torch.randn(4)

tensor([-0.7137, -0.5516,  0.4733,  0.4774])

Given the single size `4`, it returns a 1-D tensor of four random numbers from the standard normal distribution.

In [13]:
# Example 2 - working
torch.randn(4,4)

tensor([[ 1.4962, -0.3302, -0.1390,  0.1099],
        [ 0.7223,  0.8696, -2.1832,  0.6171],
        [-1.0721, -1.7165, -0.3011, -0.5699],
        [ 1.1563,  1.7144, -0.4076, -0.8162]])

Given the sizes `4, 4`, it returns a 4x4 tensor of random numbers from the standard normal distribution.
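
The outputs above change on every run. If repeatable results are needed, one option is to seed the global random number generator with `torch.manual_seed` before sampling, as in this sketch (the seed `42` is arbitrary):

```python
import torch

# Seeding before each call makes both draws identical.
torch.manual_seed(42)
first = torch.randn(2, 3)

torch.manual_seed(42)
second = torch.randn(2, 3)

print(torch.equal(first, second))  # True
```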

In [14]:
# Example 3 - breaking (to illustrate when it breaks)
torch.randn(4, 5, dtype = torch.int32)

RuntimeError: "normal_kernel_cpu" not implemented for 'Int'

`randn` samples from the standard normal distribution, which is continuous, so it does not work with an integral `dtype`.
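
As a sketch of what does work: `randn` accepts a floating-point `dtype`, and if random integers are the actual goal, `torch.randint` can be used instead. Note that `torch.randint` samples uniformly, so it is a different distribution rather than an integer version of `randn`. The bounds `0` and `10` are arbitrary.

```python
import torch

# randn with a floating-point dtype works.
x = torch.randn(4, 5, dtype=torch.float64)
print(x.dtype)  # torch.float64

# Random integers drawn uniformly from [low, high).
y = torch.randint(low=0, high=10, size=(4, 5))
print(y.dtype)  # torch.int64
```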

The `standard normal distribution` is the basis of the `standard score` (or `z-score`), which measures how many standard deviations a value lies from its mean. A z-score gives the probability of a score occurring within a normal distribution and lets us compare two scores that come from different normal distributions.
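
The z-score is computed as `z = (x - mean) / std`. The sketch below compares two scores from different normal distributions; all scores, means, and standard deviations are hypothetical values made up for the example. It uses `torch.distributions.Normal` to turn each z-score into the probability of observing a lower score.

```python
import torch

# Hypothetical scores from two different normal distributions.
score_a, mean_a, std_a = 85.0, 70.0, 10.0
score_b, mean_b, std_b = 620.0, 500.0, 100.0

# Standardize each score: z = (x - mean) / std.
z_a = (score_a - mean_a) / std_a
z_b = (score_b - mean_b) / std_b
print(z_a, z_b)  # 1.5 1.2

# Probability of a lower score under the standard normal distribution.
standard_normal = torch.distributions.Normal(0.0, 1.0)
p_a = standard_normal.cdf(torch.tensor(z_a))
p_b = standard_normal.cdf(torch.tensor(z_b))
print(p_a.item(), p_b.item())
```

Here the first score has the higher z-score, so it is the more unusual of the two relative to its own distribution.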

## Function 5 - torch.randperm(n, *, out=None, dtype=torch.int64, layout=torch.strided, device=None, requires_grad=False)

It returns a random permutation of integers from `0` to `n - 1`.

***Parameters***:

**n (int)** – the upper bound (exclusive)

***Keyword Arguments***:

**out (Tensor, optional)** – the output tensor.

**dtype (torch.dtype, optional)** – the desired data type of returned tensor. Default: `torch.int64`.

**layout (torch.layout, optional)** – the desired layout of returned Tensor. Default: `torch.strided`.

**device (torch.device, optional)** – the desired device of returned tensor. Default: if `None`, uses the current device for the default tensor type (see `torch.set_default_tensor_type()`). device will be the `CPU` for `CPU` tensor types and the current `CUDA` device for `CUDA` tensor types.

**requires_grad (bool, optional)** – If `autograd` should record operations on the returned tensor. Default: `False`.

In [15]:
# Example 1 - working
torch.randperm(5)

tensor([2, 0, 3, 4, 1])

It returns the integers `0` to `4` in random order.

In [16]:
# Example 2 - working
torch.randperm(9)

tensor([6, 2, 7, 5, 1, 0, 3, 4, 8])

Similarly, it returns a random permutation of the integers `0` to `8`.

In [17]:
# Example 3 - breaking (to illustrate when it breaks)
torch.randperm(torch.arange(1, 10, 1), dtype = torch.float64)

TypeError: randperm() received an invalid combination of arguments - got (Tensor, dtype=torch.dtype), but expected one of:
 * (int n, *, torch.Generator generator, Tensor out, torch.dtype dtype, torch.layout layout, torch.device device, bool pin_memory, bool requires_grad)
 * (int n, *, Tensor out, torch.dtype dtype, torch.layout layout, torch.device device, bool pin_memory, bool requires_grad)


`n` must be a Python integer; passing a tensor raises a `TypeError`.
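
If the goal was to shuffle the values of an existing tensor, one approach is to pass its length to `randperm` and use the result as an index. This is a minimal sketch; the names `values`, `perm`, and `shuffled` are illustrative.

```python
import torch

values = torch.arange(1, 10)

# randperm takes the number of elements, not the tensor itself.
perm = torch.randperm(values.numel())

# Indexing with the permutation reorders the values.
shuffled = values[perm]
print(shuffled)
```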

A random permutation is useful when the order of elements needs to be shuffled, for example to reorder the samples of a dataset.
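
A common use, sketched below with a hypothetical toy dataset, is shuffling features and labels together. Indexing both tensors with the same permutation keeps each row paired with its label.

```python
import torch

# Hypothetical toy dataset: row i of `features` belongs to labels[i].
features = torch.arange(12.0).reshape(6, 2)
labels = torch.arange(6)

# One permutation, applied to both tensors, keeps the pairs aligned.
perm = torch.randperm(features.shape[0])
shuffled_features = features[perm]
shuffled_labels = labels[perm]

print(shuffled_features)
print(shuffled_labels)
```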

## Conclusion

These are five PyTorch functions for random sampling that I find interesting. There are many other useful ones; refer to the official documentation for the complete list of available sampling functions.

## Reference Links

Official documentation for torch and tensor operations: 
- https://pytorch.org/docs/stable/torch.html
- https://pytorch.org/docs/stable/tensors.html
