In [2]:
#r "nuget:TorchSharp-cpu"

using TorchSharp;
using static TorchSharp.TensorExtensionMethods;
using static TorchSharp.torch.distributions;
using Microsoft.DotNet.Interactive.Formatting;

Formatter.SetPreferredMimeTypesFor(typeof(torch.Tensor), "text/plain");
Formatter.Register<torch.Tensor>((torch.Tensor x) => x.ToString(TorchSharp.TensorStringStyle.Default));

# Random Numbers and Distributions

TorchSharp has a rich set of random number generation APIs. We've already seen the ones that are easiest to use: `rand()` (uniform over [0, 1)), `randn()` (standard normal), and `randint()` (uniformly distributed integers). Normal and uniform distributions are the foundation for many other random number features.

Note that randint() will generate integers, and the default type is a 64-bit integer. Same goes for randperm().

In [17]:
torch.rand(10).print();
torch.randn(10).print();
torch.randint(100,10).print();
torch.randperm(25).print();

[10], type = Float32, device = cpu 0.60028 0.56578 0.094895 0.096953 0.37144 0.026844 0.19478 0.78418 0.58637 0.17138
[10], type = Float32, device = cpu 0.32318 -0.47797 1.5618 -0.12975 -0.1335 1.274 -0.12877 -2.6481 0.61141 0.47524
[10], type = Int64, device = cpu 46 11 42 37 24 5 84 79 67 54
[25], type = Int64, device = cpu 24 0 3 10 21 8 23 12 22 6 2 19 5 17 9 14 18 4 13 7 15 20 11 16 1


The `TorchSharp.torch.distributions` static class offers a much richer collection of distributions. Unlike the basic generators above, these are organized as classes: you create an instance and then call its `sample()` method to get a tensor of random numbers.

## Setting the seed

Like most random number libraries, TorchSharp allows you to set the seed used for random number generation. In the cell below, the first and last series should be identical, because they use the same seed, while the one in the middle is different.

One peculiarity about TorchSharp is that using the same initial seed will not lead to the same sequence of numbers when using a CPU vs. a GPU. You cannot reproduce results you had on a CPU by running things on a GPU.

In [18]:
torch.random.manual_seed(4711);
torch.rand(10).print();
torch.random.manual_seed(17);
torch.rand(10).print();
torch.random.manual_seed(4711);
torch.rand(10).print();

[10], type = Float32, device = cpu 0.69071 0.94377 0.033924 0.28365 0.10061 0.89436 0.21124 0.16128 0.59802 0.43391
[10], type = Float32, device = cpu 0.43424 0.53511 0.83021 0.12386 0.029321 0.5494 0.38249 0.54626 0.46828 0.017153
[10], type = Float32, device = cpu 0.69071 0.94377 0.033924 0.28365 0.10061 0.89436 0.21124 0.16128 0.59802 0.43391


## Coin Toss

As a first example of the distribution classes, consider the Bernoulli distribution, which is a binary false/true, 0/1, yes/no, heads/tails generator. To get a single-value sample, you do the following, passing in the probability of the result being '1':

In [30]:
var bern = Bernoulli(torch.tensor(0.5f));
bern.sample().item<float>()

The element type of the sample will be determined by the element type of the probability tensor, so using precise number literal syntax is important.
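As a small sketch of that point, the cell below builds two Bernoulli distributions whose probability tensors differ only in the literal used (`0.5f` versus `0.5`) and prints the element type of a sample from each. It assumes that `Tensor.dtype` reports the element type; the variable names are only illustrative.

```csharp
using System;
using TorchSharp;
using static TorchSharp.torch.distributions;

// Probability given as a float literal: a Float32 probability tensor.
var bernSingle = Bernoulli(torch.tensor(0.5f));
// Probability given as a double literal: a Float64 probability tensor.
var bernDouble = Bernoulli(torch.tensor(0.5));

// Print the element type of a sample from each distribution.
Console.WriteLine(bernSingle.sample().dtype);
Console.WriteLine(bernDouble.sample().dtype);
```

If the sample type follows the probability tensor as described, the two printed types should differ.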

Usually, you want more than one value: a tensor full of them. `sample()` takes the sizes of the tensor's dimensions as its arguments.

In [9]:
bern.sample(3,4)

[3x4], type = Float32, device = cpu
 0 1 0 0
 1 1 1 0
 0 0 0 1


To help with sampling, there's a class called `Binomial`, which runs a number of coin tosses and counts the number of times the result is '1'. You pass it the number of tosses and the probability of '1'.

In [51]:
var bin = Binomial(torch.tensor(100), torch.tensor(0.25f));
bin.sample().item<float>()

In [52]:
bin.sample(3,4)

[3x4], type = Float32, device = cpu
 16 26 33 17
 29 28 30 31
 27 21 25 23


## Categories

In the coin toss scenario, there were two categories -- yes/no, true/false, 0/1, etc. A more general class of distributions supports N different categories. The foundational class for that is called `Categorical`, and it works just like `Bernoulli`. You pass in a tensor with the probabilities for the categories (they don't have to be equal), and then you get your sample. The length of the probabilities tensor tells the `Categorical` class how many categories there are. The categories are represented as integers in the range [0, N).

In [53]:
var cat = Categorical(torch.tensor(new float[]{0.1f, 0.7f, 0.1f, 0.1f}));
cat.sample(6)

[6], type = Int64, device = cpu
 1 0 3 3 1 1


There's a class corresponding to `Binomial` for categorical distributions, called `Multinomial`. Here, the category is denoted by the index into the tensor. For sample sizes of at least one dimension, the innermost dimension (the last index) represents the category. In other words, each row is a sample, and each column is a category. The value in each cell is how many times (out of the total count specified) that category was selected.

In [60]:
var mult = Multinomial(100, new float[]{0.1f, 0.7f, 0.1f, 0.1f});
Console.WriteLine(mult.sample().str());
Console.WriteLine(mult.sample(5).str());
mult.sample(2,3)

[4], type = Float32, device = cpu
 9 73 11 7

[5x4], type = Float32, device = cpu
  7 72 13  8
  8 74  5 13
 10 65 14 11
 10 70  5 15
 12 61 14 13



[2x3x4], type = Float32, device = cpu
[0,..,..] =
 10 72  9  9
  7 78  5 10
 10 64 15 11

[1,..,..] =
  8 69 12 11
  9 73 10  8
 12 63 16  9


## Real-valued Distributions

The majority of random distributions are concerned with real numbers, parameterized by either a min/max range, or the mean and standard deviation, or parameters specific to a distribution.

The normal, a.k.a. Gaussian, distribution is the familiar bell curve, where the likelihood of a value being selected is much higher closer to the mean.

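`torch.randn()` always draws from a normal distribution with a mean of zero and a standard deviation of one, and `torch.rand()` always draws uniformly from [0, 1). To alter that, you get the sample, multiply by the desired standard deviation (or, for `rand()`, the width of the range), then add the desired mean (or lower bound), in that order.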
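A minimal sketch of that scale-then-shift step is shown below. The target values are illustrative, and the snippet assumes that arithmetic between a tensor and a `float` is available through TorchSharp's operator overloads.

```csharp
using TorchSharp;

// Illustrative target parameters for the normal case.
var mean = 0.5f;
var stddev = 0.125f;

// randn() samples have mean 0 and standard deviation 1:
// multiply by the standard deviation first, then add the mean.
var normal = torch.randn(5, 5) * stddev + mean;
normal.print();

// rand() samples lie in [0, 1): scale by the width of the range,
// then add the lower bound.
var low = 10.0f;
var high = 17.0f;
var uniform = torch.rand(5, 5) * (high - low) + low;
uniform.print();
```

Doing the multiplication before the addition matters: adding the mean first and scaling afterwards would scale the mean as well.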

You can still do that when you are using the distribution classes, but they also allow you to pass in the parameters when creating the distribution instance. This is convenient when you pass the distribution to a function, which then doesn't need to know what kind of distribution it is given, or its parameters.

In [92]:
torch.Tensor GetTensorFromDistribution(Distribution dist) => dist.sample(5,5);

var norm1 = Normal(torch.tensor(0.5f), torch.tensor(0.125f));
var norm2 = Normal(torch.tensor(0.15f), torch.tensor(0.025f));

Console.WriteLine(GetTensorFromDistribution(norm1).str());
Console.WriteLine(GetTensorFromDistribution(norm2).str());

[5x5], type = Float32, device = cpu
 0.75503 0.56854 0.51172  0.3749 0.65813
 0.69464  0.4754 0.46847  0.4506 0.37406
 0.49459 0.29273 0.57955  0.5678 0.44996
 0.38559 0.41159 0.44973 0.46208 0.65809
 0.47752 0.48089 0.27247 0.57379 0.70928

[5x5], type = Float32, device = cpu
 0.16162 0.13561 0.16598 0.11149 0.15588
 0.15314 0.16739  0.1255 0.15724 0.13274
 0.18135 0.13985 0.16685 0.16375 0.17819
 0.13424 0.14583  0.1146 0.16302 0.18071
 0.11286 0.18956 0.15141 0.17078 0.14096



The same goes for uniformly distributed numbers -- there's a class, `Uniform`, parameterized by the boundaries of the half-open range [low, high).

In [93]:
GetTensorFromDistribution(Uniform(torch.tensor(10.0f), torch.tensor(17.0f)))

[5x5], type = Float32, device = cpu
 10.107 14.147 10.603 15.577 16.566
 16.541 16.203 14.338 13.267  10.56
 11.129 13.218 10.185 12.109 12.166
   13.1 16.964 16.228 11.287 11.379
 11.114 15.412 11.667 14.539 14.927


Those distributions are just the most basic ones. There are a number of more esoteric distributions, but describing when to use them is beyond the scope of this tutorial. The usage pattern is the same, though: create a distribution instance, and then call `sample()` to get a tensor filled with random numbers from the distribution.

# Generator

So far, all the random numbers have been using the default RNG. This is a process-wide generator, which is returned by the call to manual_seed() that we saw before. In the preceding examples, the return value was ignored. Once the generator has been captured, it can be used to parameterize random number generation. Most random number APIs take an optional generator argument. 

Usually, the generator is the last argument, defaulted to 'null'. Often, there are other parameters with default values that come before the generator, so it's a good idea to get into the habit of passing the generator instance by name.

In [96]:
torch.Generator gen1 = torch.random.manual_seed(17);
torch.rand(2,3, generator: gen1).print();
torch.randn(2,3, generator: gen1).print();

[2x3], type = Float32, device = cpu 0.43424  0.53511 0.83021 0.12386 0.029321  0.5494
[2x3], type = Float32, device = cpu 1.5401  0.60896   0.68293  1.262 0.020885 -0.035457


Using the default RNG goes a long way, but in complex scenarios that require specific control over random number sequences, having a generator object will be required. For example, in a multi-threaded application, there is no reproducibility, even with a manually set seed, if more than one thread is generating numbers. In that situation, it can make sense to have each thread have its own generator object.

Generator objects may be created directly, either with a seed you supply or with '0' as the seed. You can also control whether it's a CPU or GPU generator. Each generator instance maintains its own state, so two generators given the same seed will always generate the same sequence of numbers, with the caveat that parallelism introduces non-determinism.

In [100]:
var gen2 = new torch.Generator(189);
var gen3 = new torch.Generator(189);

torch.rand(2,3, generator: gen2).print();
torch.rand(2,3, generator: gen3).print();
torch.rand(2,3, generator: gen2).print();
torch.rand(2,3, generator: gen3).print();

[2x3], type = Float32, device = cpu 0.21357 0.042377 0.74202 0.39866  0.87494 0.73042
[2x3], type = Float32, device = cpu 0.21357 0.042377 0.74202 0.39866  0.87494 0.73042
[2x3], type = Float32, device = cpu 0.59938  0.12294 0.19049 0.31288 0.014337  0.6759
[2x3], type = Float32, device = cpu 0.59938  0.12294 0.19049 0.31288 0.014337  0.6759
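The per-thread idea mentioned earlier could be sketched along the following lines. This is only an illustration under assumptions: the seeds and the names `genA`, `genB`, `taskA`, and `taskB` are hypothetical, and whether one generator per task is enough for reproducibility depends on what else the application does.

```csharp
using System.Threading.Tasks;
using TorchSharp;

// One generator per worker, each with its own (illustrative) seed.
var genA = new torch.Generator(189);
var genB = new torch.Generator(190);

// Each task draws only from its own generator, so its sequence does not
// depend on how the two tasks happen to interleave.
var taskA = Task.Run(() => torch.rand(2, 3, generator: genA));
var taskB = Task.Run(() => torch.rand(2, 3, generator: genB));

// Wait for both tasks and print their tensors.
taskA.Result.print();
taskB.Result.print();
```

The design choice here is that no generator is shared between tasks; sharing one generator would make each task's numbers depend on scheduling order.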


In the case of distribution instances, a generator object is passed in when creating the instance, not when sampling.

In [101]:
norm1 = Normal(torch.tensor(0.5f), torch.tensor(0.125f), generator: gen2);
norm1.sample(10)

[10], type = Float32, device = cpu
 0.63033 0.55221 0.70623 0.49354 0.60153 0.48434 0.67161 0.49847 0.28315 0.50293
