In [None]:
#r "nuget:TorchSharp-cpu"
using TorchSharp;
using static TorchSharp.TensorExtensionMethods;

# Tensors

In TorchSharp, as in all deep learning, the fundamental data type is a 'tensor,' which is simply a generalized matrix. In Linear Algebra, a one-dimensional array is called a 'vector,' and a two-dimensional array is a 'matrix.' Generalizing on that, a tensor is simply an N-dimensional array.

Please note that there is an overloaded use of the word 'dimensions' here -- in physics, a vector (one dimension) with three elements is used to represent a point in space, one element for each spatial dimension. When we speak of 'dimension' in these tutorials, it is the number of tensor dimensions that is of interest, not the number of elements along any one of them.
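To make the distinction concrete, here is a minimal sketch that compares the number of tensor dimensions with the number of elements. It assumes the TorchSharp package reference from the first cell and uses the `ones` factory, which is introduced in the next section. The variable names are only illustrative.

```csharp
// Assumes the #r "nuget:TorchSharp-cpu" reference from the first cell.
using System;
using TorchSharp;

// A vector with three elements: ONE tensor dimension, even though it
// could describe a point in three-dimensional space.
var point = torch.ones(3);
Console.WriteLine(point.dim());    // 1 (number of tensor dimensions)
Console.WriteLine(point.numel());  // 3 (total number of elements)

// A 3x4 matrix: TWO tensor dimensions, twelve elements.
var matrix = torch.ones(3, 4);
Console.WriteLine(matrix.dim());                    // 2
Console.WriteLine(string.Join("x", matrix.shape));  // 3x4
```

The `shape` property holds the size of each dimension, so its length is the same number that `dim()` reports.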

So, let's get started with tensors by creating some.

## Constant-Filled Tensors

The simplest tensor creation primitives just initialize a tensor with either 0 or 1 in all its elements. The arguments passed in are the sizes of the dimensions, one argument per dimension. Think of the first dimension as the rows of a table, the second as the columns, and then you just have to generalize things in your head after that. In the examples below, we'll mostly be creating 3x4 matrices, for simplicity.

One thing to note is that .NET Interactive will show the object and its fields, etc. when you say just 't' at the end of the notebook cell. What we want for tensors is to show the contents, and there's a special version of ToString() taking a Boolean that shows not just the size and type of the tensor, but also its contents.

In [None]:
var t = torch.ones(3,4);
t.ToString(true)

If you have more than two dimensions, ToString(true) will try to format it in a way that makes sense to a human:

In [None]:
torch.ones(2,4,4).ToString(true)

If you intend to fill the tensor with values from somewhere else, in other words you are only pre-allocating it, there's an 'empty' factory that is faster than the others because it skips initialization. The values are just whatever was found in memory when the tensor was created. Don't mistake that for random values, though.

In [None]:
torch.empty(4,4).ToString(true)
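
As a minimal sketch of the pre-allocate-then-fill pattern, the following allocates with `empty` and overwrites every element before anything is read back. The in-place `fill_` call is just one way to supply the values and is not something the text above prescribes. The sketch assumes the package reference from the first cell.

```csharp
// Assumes the #r "nuget:TorchSharp-cpu" reference from the first cell.
using System;
using TorchSharp;

// Allocate without initializing; the contents are unspecified at this point.
var buffer = torch.empty(4, 4);

// Overwrite every element in place before reading anything back.
buffer.fill_(1.5f);

Console.WriteLine(buffer.ToString(true));
Console.WriteLine(buffer.sum().item<float>());  // 16 elements * 1.5 = 24
```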

You can also create a tensor filled with any value you want:

In [None]:
torch.full(4,4,3.14f).ToString(true)

In [None]:
Console.Write(torch.zeros(4,4).ToString(true));
Console.Write(torch.ones(4,4).ToString(true));

You may have noticed that each tensor has a 'type = Float32' attribute. This is a peculiarity of the TorchSharp tensor type -- the element type does not show up in the .NET type; that is, the class is Tensor, not Tensor\<T\>. This is so because the underlying C++ / CUDA runtime represents tensors this way, and it makes it easier to port code from Python, too.

You may also have noticed that tensors are created using factories, not constructors. Also, the naming convention doesn't look anything like .NET. We chose to step away from .NET conventions in order to make it easier to port code from Python. We know this will upset some, and please some, but it's the decision we came to after a long time of deliberating.

Anyway, 'Float32' is the default, but you can create tensors of other types, too, including complex tensors:

In [None]:
torch.zeros(4,4, dtype: torch.int32).ToString()

In [None]:
t = torch.zeros(4,4, dtype: torch.complex64);
t.ToString()
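
Because the element type is a runtime property rather than part of the .NET type, it can be inspected through the tensor's `dtype` property. The following is a small sketch under the same setup as the cells above; the variable names are only illustrative.

```csharp
// Assumes the #r "nuget:TorchSharp-cpu" reference from the first cell.
using System;
using TorchSharp;

var ints = torch.zeros(4, 4, dtype: torch.int32);
var doubles = torch.ones(4, 4, dtype: torch.float64);

// The element type is reported at runtime...
Console.WriteLine(ints.dtype);     // Int32
Console.WriteLine(doubles.dtype);  // Float64

// ...while both variables have the same static .NET type.
Console.WriteLine(ints.GetType() == doubles.GetType());  // True
```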

To access the contents of a tensor, you treat it as a multi-dimensional array (note that the number of dimensions also isn't part of the type itself). When you do, you'll be surprised to see that the result of the indexing operator is another tensor, one that has no shape -- this is how TorchSharp represents a scalar value. Later in this tutorial, we will see why. For now, just know that you have to extract the value using a function, based on the type you expect to get out.

In PyTorch, there's a method '.item()' used for this purpose. In TorchSharp, it's a generic method that takes the expected element type as its type argument:

In [None]:
t = torch.zeros(4,4, dtype: torch.int32);
Console.Write(t[0,0]);
t[0,0].item<int>()

To write to a single element, you have to create a tensor from the value you want to write.

In [None]:
t[0,0] = torch.tensor(35);
t.ToString(true)
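
Putting the read and write steps together, here is a minimal round-trip sketch: write one element, read it back as a scalar tensor, and extract the .NET value. The indices and the value 7 are arbitrary choices for illustration, and the package reference from the first cell is assumed.

```csharp
// Assumes the #r "nuget:TorchSharp-cpu" reference from the first cell.
using System;
using TorchSharp;

var grid = torch.zeros(3, 4, dtype: torch.int32);

// Writing: wrap the value in a tensor first.
grid[1, 2] = torch.tensor(7);

// Reading: the indexer returns a scalar tensor, which has no dimensions.
var element = grid[1, 2];
Console.WriteLine(element.shape.Length);  // 0

// Extract the .NET value with the type you expect to get out.
int value = element.item<int>();
Console.WriteLine(value);                 // 7
```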

## Randomized Tensors

In machine learning, random number generation is very important, and you often end up using the RNG APIs to create tensors. There are a large number of RNGs, most of them for floating point values, but there are some for integers, too.

The usual suspects are present -- normal and uniform distributions, binomial (true/false or 0/1) and uniformly distributed integers.

In [None]:
// Normal distribution
torch.randn(3,4).ToString(true)

In [None]:
// Uniform distribution between [0,1[
torch.rand(3,4).ToString(true)

To change the range, just multiply and/or add:

In [None]:
// Uniform distribution between [100,110[
(torch.rand(3,4) * 10 + 100).ToString(true)
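
The same multiply-and-add pattern generalizes to any range: scale by the width of the interval, then shift by its lower bound. The sketch below uses hypothetical bounds `lo` and `hi`, chosen only for illustration, and assumes the package reference from the first cell.

```csharp
// Assumes the #r "nuget:TorchSharp-cpu" reference from the first cell.
using System;
using TorchSharp;

// Hypothetical bounds for the target range [lo, hi[.
int lo = -5, hi = 5;

// rand() samples from [0,1[, so scale by the width and shift by lo.
var u = torch.rand(3, 4) * (hi - lo) + lo;

Console.WriteLine(u.ToString(true));
Console.WriteLine(u.min().item<float>() >= lo);  // True
Console.WriteLine(u.max().item<float>() <= hi);  // True
```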

The main factory for integer values is not quite as convenient to use -- in the function signature, there is an integer to pass in for the max value, so the dimension values have to be passed in an array. We'll try to address this inconvenience in a future API change.

In [None]:
torch.randint(10, new long[]{3,4}).ToString(true)

If that is too annoying, you can use 'rand()' and convert the tensor to an integer type using '.to()'. That comes with some extra syntactic overhead, though, and the conversion has a runtime cost.

In [None]:
(torch.rand(3,4) * 10).to(torch.int32).ToString(true)

Make a mental note of the '.to()' call -- it may not be the right choice here, but that is how you convert tensors from one element type to another. You can also use it to move data between CPU and GPU, and convert at the same time. More on that later.
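
As a small sketch of element-type conversion with `.to()`, the following converts a float tensor to integers and then to doubles, checking the `dtype` at each step. It assumes the default element type has not been changed from Float32 and that the package reference from the first cell is in place. The variable names are only illustrative.

```csharp
// Assumes the #r "nuget:TorchSharp-cpu" reference from the first cell.
using System;
using TorchSharp;

var floats = torch.rand(3, 4) * 10;       // Float32 values in [0,10[
var ints = floats.to(torch.int32);        // fractional parts are dropped
var doubles = ints.to(torch.float64);     // convert again, to a wider type

Console.WriteLine(floats.dtype);   // Float32
Console.WriteLine(ints.dtype);     // Int32
Console.WriteLine(doubles.dtype);  // Float64

// The converted integers stay within 0..9.
Console.WriteLine(ints.min().item<int>() >= 0);  // True
Console.WriteLine(ints.max().item<int>() <= 9);  // True
```

Each call returns a new tensor of the requested type; the `floats` tensor still reports Float32 afterwards.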