In [1]:
#r "nuget:TorchSharp-cpu"

using TorchSharp;
using static TorchSharp.TensorExtensionMethods;
using Microsoft.DotNet.Interactive.Formatting;

Formatter.SetPreferredMimeTypesFor(typeof(torch.Tensor), "text/plain");
Formatter.Register<torch.Tensor>((torch.Tensor x) => x.ToString(TorchSharp.TensorStringStyle.Default));

# Basic Numerics

Tensor arithmetic is the core of TorchSharp, and its capabilities are rich. The focus is on operations over whole tensors rather than individual scalars, since that's where GPU acceleration makes sense.

In [10]:
var a = torch.ones(3,4);
var b = torch.zeros(3,4);
var c = torch.tensor(5);
a * c + b

[3x4], type = Float32, device = cpu
 5 5 5 5
 5 5 5 5
 5 5 5 5


In [11]:
a

[3x4], type = Float32, device = cpu
 1 1 1 1
 1 1 1 1
 1 1 1 1


It's often the case that you can reuse the storage of one of the operands for the result, so TorchSharp defines a number of 'in place' operators. These only work if the operand being overwritten has the same shape and layout as the result. In-place operators are not available through the usual math syntax; you have to call them as functions. TorchSharp follows the PyTorch convention of appending a '_' to the name of in-place operators. They are very similar to the '*=', '+=', etc. operators in C#, except that the calls can be chained together.

In the expression below, the storage for 'a' first holds the result of multiplying by 'c', and then the result of adding 'b'.

In [12]:
a.mul_(c).add_(b)

[3x4], type = Float32, device = cpu
 5 5 5 5
 5 5 5 5
 5 5 5 5


After this, 'a' no longer holds ones, since it has been overwritten. Used consistently, in-place operators can improve performance significantly, but it's important to know what you're overwriting and not to over-use them. Think of them as a performance optimization.

In [13]:
a

[3x4], type = Float32, device = cpu
 5 5 5 5
 5 5 5 5
 5 5 5 5
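
When the original values are still needed, one option is to copy the tensor before applying in-place operators. The following is a minimal sketch of that idea, assuming TorchSharp is referenced as in the first cell; the variable names `p`, `q` and `keep` are made up for illustration.

```csharp
using System;
using TorchSharp;

var p = torch.ones(3, 4);
var q = torch.tensor(5);

// clone() makes an independent copy with its own storage.
var keep = p.clone();

// Overwrites the storage of 'p'; 'keep' is not affected.
p.mul_(q);

Console.WriteLine(p[0, 0].item<float>());    // 5
Console.WriteLine(keep[0, 0].item<float>()); // 1
```

The copy costs an allocation, which is exactly what the in-place operators avoid, so it only makes sense when both the old and the new values are needed.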


## Broadcasting

In the simple example above, you saw that 'c' was defined from a single value. If we look at it, we can see that it's a singleton tensor. That is, it has zero dimensions, so its shape is empty.

In [14]:
c.shape

In [15]:
c

[], type = Int32, device = cpu, value = 5
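
An empty shape can be hard to read in notebook output, so here is a small sketch that inspects the same properties explicitly. It assumes TorchSharp is referenced as in the first cell; `s` is a hypothetical name for a tensor built the same way as 'c'.

```csharp
using System;
using TorchSharp;

var s = torch.tensor(5);

// The shape array is empty: the tensor has zero dimensions.
Console.WriteLine(s.shape.Length); // 0

// It still holds exactly one element.
Console.WriteLine(s.numel());      // 1

// item<T>() extracts that element as a .NET value.
Console.WriteLine(s.item<int>());  // 5
```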

Even though its shape differed from that of 'a,' we were able to use it in the computation. How come?

In situations like this, TorchSharp adjusts the shape of a tensor to be compatible with another tensor, without allocating new memory. This is called 'broadcasting' and is found in almost every numerics and deep learning library around. It's not just singletons that can be broadcast -- it works for any tensor with a compatible shape.

In [16]:
a = torch.ones(3,4);
(a + torch.ones(4)).print();
a + torch.ones(1,4)

[3x4], type = Float32, device = cpu
 2 2 2 2
 2 2 2 2
 2 2 2 2


[3x4], type = Float32, device = cpu
 2 2 2 2
 2 2 2 2
 2 2 2 2
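
Broadcasting also works when both operands need to be stretched. The sketch below assumes TorchSharp follows the PyTorch rule that dimensions are compatible when they are equal or when one of them is 1, and that `expand` returns a view rather than a copy, as it does in PyTorch. The names `col`, `row` and `view` are made up for illustration.

```csharp
using System;
using TorchSharp;

var col = torch.ones(3, 1);
var row = torch.ones(1, 4);

// Each size-1 dimension is stretched to match the other operand.
var sum = col + row;
Console.WriteLine(string.Join("x", sum.shape));  // 3x4

// expand() performs the same stretching explicitly.
var view = row.expand(3, 4);
Console.WriteLine(string.Join("x", view.shape)); // 3x4
```

Shapes such as [3,4] and [3] are not compatible under this rule, because the trailing dimensions (4 and 3) differ and neither is 1.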


# Numerics Library

The collection of numerical operators that are available is too large to go through here, but suffice it to say that all the usual suspects are available. Most of them operate on an element-wise basis, i.e. the operator is applied to each element of the operands, with broadcasting applied where the shapes differ.

One notable and __very__ significant exception is matrix multiplication, which is vector dot product generalized to matrices. The '*' operator denotes element-wise multiplication, while matrix multiplication is performed by the 'mm' method:

In [17]:
a = torch.full(4,4, 17);
b = torch.full(4,4, 12);

(a * b).print();
(a.mm(b)).str()

[4x4], type = Int64, device = cpu
 204 204 204 204
 204 204 204 204
 204 204 204 204
 204 204 204 204


[4x4], type = Int64, device = cpu
 816 816 816 816
 816 816 816 816
 816 816 816 816
 816 816 816 816
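
In the output above, each element of the matrix product is a sum of four products, 4 * 17 * 12 = 816, while each element-wise product is simply 17 * 12 = 204. With constant matrices the difference is easy to miss, so here is a small sketch using distinct values. It assumes TorchSharp is referenced as in the first cell; `m` is a hypothetical name.

```csharp
using TorchSharp;

// A 2x2 matrix holding the values 1 2 / 3 4.
var m = torch.arange(1, 5).reshape(2, 2);

// Element-wise: every element is multiplied by itself,
// giving 1 4 / 9 16.
(m * m).print();

// Matrix product: rows of the left operand dotted with columns
// of the right operand, giving 7 10 / 15 22.
m.mm(m).print();
```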


There are also some very specialized operators that do more than one thing at a time and thereby avoid creating temporaries. Some of them exist because the absence of temporaries can lead to better numerical stability (such as avoiding rounding error propagation), others because you don't have to go back and forth between the CPU and GPU as often. It is almost always the right choice to use these special composite operators when they are a match for your computation.

An example is xlogy(), which performs x * log(y) all in one operation.

In [18]:
var x = torch.rand(5);
var y = torch.rand(5);
(x * torch.log(y)).print();
x.xlogy(y)

[5], type = Float32, device = cpu
 -0.029798 -0.22131 -0.6939 -0.3144 -0.1417


[5], type = Float32, device = cpu
 -0.029798 -0.22131 -0.6939 -0.3144 -0.1417
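
With random inputs the two forms agree, but they can differ at the edges. The sketch below assumes that TorchSharp's xlogy() follows the PyTorch definition, which returns 0 wherever x is 0, even if log(y) is not finite. The names `x0` and `y0` are made up for illustration.

```csharp
using TorchSharp;

var x0 = torch.tensor(new float[] { 0f, 1f, 2f });
var y0 = torch.tensor(new float[] { 0f, 1f, 3f });

// Two-step form: log(0) is -infinity and 0 * -infinity is NaN,
// so the first element is NaN.
(x0 * torch.log(y0)).print();

// Composite form: the first element is 0 under the assumption above.
// The other two elements match the two-step form (0 and about 2.197).
x0.xlogy(y0).print();
```

This matters in computations such as entropy or cross-entropy terms, where a zero weight multiplied by log(0) should contribute nothing rather than poison the result with NaN.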
