## Tensors and Operations

A **tensor** is the basic computational unit in PyTorch. It is very similar to a **NumPy array** and supports similar operations. However, two important features of Torch tensors make them especially useful for training large-scale neural networks:

* Tensor operations can be performed on a GPU using CUDA
* Tensor operations support automatic differentiation using [AutoGrad](autograd.ipynb)

Conversion between Torch tensors and NumPy arrays can be done easily:

In [1]:
import torch
import numpy as np

np_array = np.arange(10)
tensor = torch.from_numpy(np_array)

print(f"Tensor={tensor}, Array={tensor.numpy()}")

Tensor=tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), Array=[0 1 2 3 4 5 6 7 8 9]


**Note:** On the CPU, a tensor created with `torch.from_numpy` shares its memory with the source array. Thus, changing the underlying array also changes the tensor, and vice versa.
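The memory sharing described in the note can be checked with a small sketch. The variable names `shared` and `copied` are illustrative and not part of the original notebook. `torch.tensor` is used for contrast, since it copies the data instead of sharing it:

```python
import numpy as np
import torch

np_array = np.arange(5)
shared = torch.from_numpy(np_array)  # shares memory with np_array
copied = torch.tensor(np_array)      # makes an independent copy of the data

np_array[0] = 100                    # modify the NumPy array in place

print(shared)  # the first element reflects the change
print(copied)  # the copy still holds the original values
```

If you need a tensor that is independent of the source array, copying with `torch.tensor` (or calling `.clone()` on the shared tensor) avoids surprises.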

### Creating Tensors

The fastest way to create a tensor is to define an *uninitialized* tensor. Its values are not set and depend on whatever data happened to be in memory:

In [None]:
x = torch.empty(3,6)
print(x)

In practice, we often want to create tensors initialized to some values, such as zeros, ones, or random values. Note that you can also specify the element type using the `dtype` parameter and choosing one of the `torch` types:

In [None]:
x = torch.randn(3,5)
print(x)
y = torch.zeros(3,5,dtype=torch.int)
print(y)
z = torch.ones(3,5,dtype=torch.double)
print(z)

You can also create random tensors with values sampled from different distributions, as described [in the documentation](https://pytorch.org/docs/stable/torch.html#random-sampling).
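As a brief illustration, here is a sketch that uses three of the sampling functions from that page. The shapes, the seed, and the distribution parameters are arbitrary choices made for this example:

```python
import torch

torch.manual_seed(0)  # fix the seed so the draws are reproducible

u = torch.rand(2, 3)                              # uniform on [0, 1)
n = torch.normal(mean=0.0, std=2.0, size=(2, 3))  # normal with mean 0 and std 2
k = torch.randint(0, 10, (2, 3))                  # integers from 0 (inclusive) to 10 (exclusive)

print(u)
print(n)
print(k)
```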

Similarly to NumPy, you can use `eye` to create an identity matrix:

In [None]:
print(torch.eye(10))

You can also create new tensors with the same properties or size as existing tensors:

In [None]:
print(z.new_ones(2,2)) # new_ method allows specifying new size
print(torch.zeros_like(x,dtype=torch.long)) # _like method supports overriding dtype

The size of a tensor can be obtained using the `.size()` method, which returns a tuple-like `torch.Size` object:

In [None]:
print(z.size())
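Because the returned object behaves like a tuple, it can be unpacked and compared directly. The following sketch recreates a small tensor `z` so that it runs on its own, and it also shows the equivalent `.shape` attribute and the `numel` method:

```python
import torch

z = torch.ones(3, 5)

rows, cols = z.size()        # torch.Size unpacks like a tuple
print(rows, cols)            # 3 5
print(z.shape == z.size())   # True: .shape is the attribute form of .size()
print(z.size(1))             # 5: size along a single dimension
print(z.numel())             # 15: total number of elements
```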

### Tensor Operations

Tensors support all basic arithmetic operations, which can be specified in different ways:
* Using operators, such as `+`, `-`, etc.
* Using functions such as `add`, `mul`, etc. Functions can either return the result or store it in a specified output tensor (using the `out=` parameter)
* In-place operations, which modify one of the arguments. Those operations have `_` appended to their name, e.g. `add_`.

A complete reference to all tensor operations can be found [in the documentation](https://pytorch.org/docs/stable/torch.html).

Let us see examples of those operations on two tensors, `x` and `y`.


In [None]:
x = torch.randn(3,5)
y = torch.randn(3,5)

#### Using operator notation

We can use overloaded arithmetic operators, such as `+` and `*`:

In [None]:
print(x*y)

Note that `*` means the elementwise product, not the matrix product. To compute the matrix product, we need to use the `matmul` function, as shown below.
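The difference is easy to see on small matrices with known values. The sketch below uses two hand-picked 2×2 tensors (`a` and `b` are illustrative names) and also shows the `@` operator, which is Python's operator form of matrix multiplication:

```python
import torch

a = torch.tensor([[1., 2.], [3., 4.]])
b = torch.tensor([[10., 20.], [30., 40.]])

elementwise = a * b          # multiplies matching entries
matrix = torch.matmul(a, b)  # rows of a times columns of b
same = a @ b                 # operator form of matmul

print(elementwise)  # tensor([[ 10.,  40.], [ 90., 160.]])
print(matrix)       # tensor([[ 70., 100.], [150., 220.]])
```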

#### Using functions

While only some operations are available as Python operators, [many more functions](https://pytorch.org/docs/stable/torch.html#math-operations) can be called by name. In the example below, `t` transposes the matrix, and `matmul` performs matrix multiplication:

In [None]:
torch.matmul(x,y.t())

Simple operations (addition, multiplication, etc.) also have corresponding functions, which can be called either as methods or as functions:

In [None]:
print(x.add(y))
print(torch.add(x,y))

Sometimes it may be more convenient to store the result in an existing tensor instead of returning a new one from the function. In this case you can use the `out=` parameter:

In [None]:
torch.add(x,y,out=z)
print(z)

#### In-place operations

When training neural networks, you often need to **modify** the weights, i.e. perform some operation and then store the result into the original variable. Those operations are called **in-place operations**, and they are marked by the `_` symbol at the end of their name: 

In [None]:
x.add_(y)
print(x)
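To make the distinction concrete, the following sketch contrasts `add` with `add_` on small tensors with known values. The names `result` and `returned` are illustrative:

```python
import torch

x = torch.zeros(3)
y = torch.ones(3)

result = x.add(y)      # returns a new tensor; x is left unchanged
print(x)               # tensor([0., 0., 0.])

returned = x.add_(y)   # modifies x itself and returns it
print(x)               # tensor([1., 1., 1.])

# The returned tensor uses the same storage as x, so no new memory was allocated
print(returned.data_ptr() == x.data_ptr())  # True
```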

### Resizing and Indexing

Very often you need to change the shape of a tensor without modifying its values, e.g. to add an extra dimension. To do that, you can use the `view` method, which provides a **view** of the same in-memory values with different dimensions:


In [None]:
print(x)
print(x.view(5,3,1))
print(x.view(5,-1))

Note that the number of elements in a view must be the same as in the original tensor, and that you can use `-1` for one of the dimensions to have it inferred automatically.

**Note:** `view` is similar to the `reshape` operation in NumPy. PyTorch also has a `reshape` method, which is more flexible than `view`: it can reshape non-contiguous tensors by copying the data into the new shape when needed. In the vast majority of cases, however, you can use `view`, which guarantees that no data is copied, so the operation is always cheap.
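The difference shows up on a transposed tensor, which is a common source of non-contiguous data. This is a minimal sketch with illustrative names (`xt`, `flat`, `view_failed`):

```python
import torch

x = torch.arange(6).view(2, 3)
xt = x.t()                   # transposing makes the tensor non-contiguous
print(xt.is_contiguous())    # False

view_failed = False
try:
    xt.view(6)               # view cannot reinterpret this memory layout
except RuntimeError:
    view_failed = True
print(view_failed)           # True

flat = xt.reshape(6)         # reshape copies the data when it has to
print(flat)                  # tensor([0, 3, 1, 4, 2, 5])

v = x.view(3, 2)             # a view of contiguous data shares memory with x
v[0, 0] = 100
print(x[0, 0].item())        # 100: the change is visible through x
print(flat[0].item())        # 0: the reshaped copy is unaffected
```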

Tensors support NumPy-style slicing:

In [None]:
print(x[0], x[:,0], x[...,1])
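With a tensor whose values are known in advance, the result of each slice is easier to follow. The sketch below builds a fresh 3×5 tensor for that purpose and adds two further indexing forms, a strided slice and a boolean mask:

```python
import torch

x = torch.arange(15).view(3, 5)  # rows: [0..4], [5..9], [10..14]

print(x[0])         # first row:    tensor([0, 1, 2, 3, 4])
print(x[:, 0])      # first column: tensor([ 0,  5, 10])
print(x[..., 1])    # index 1 along the last dimension: tensor([ 1,  6, 11])
print(x[1:, ::2])   # rows 1 and 2, every second column
print(x[x > 10])    # boolean mask: tensor([11, 12, 13, 14])
```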

If you have a one-element tensor, for example, after aggregating all values of the tensor into one value, you can convert it to a Python numerical value using `item()`:

In [None]:
print(x.sum().item())

### GPU Computations

One of the major benefits of using PyTorch is the ability to perform tensor operations on a GPU. To do that, we need to explicitly **move** tensors to the GPU using the `.to` method.

In most cases, we check whether a GPU is available on our machine and define the `device` object accordingly. Then we move all tensors to that device before performing the computations:

In [None]:
if torch.cuda.is_available():
    device = torch.device("cuda")
else:
    device = torch.device("cpu")

print("Doing computations on {}".format(device))

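x = torch.randn(3,5,device=device) # created directly on the chosen device
y = torch.ones_like(x) # created on the same device as x
y = y.to(device) # no-op here; shows how to move an existing tensor
z = x+y # this is performed on GPU if it is available
print(z)
print(z.to("cpu",torch.double)) # move to CPU and convert dtype in one call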

In the last operation, when we move the tensor back to the CPU, we also change its `dtype`. Combining the two in a single `.to` call is convenient, and the conversion adds little overhead, because the data has to be copied when it is moved off the GPU anyway.
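A tensor's current location and element type can be inspected through its `.device` and `.dtype` attributes. The following sketch runs on both CPU-only and GPU machines. It also shows a common reason for moving results back: `.numpy()` works only on CPU tensors. The name `arr` is illustrative:

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.randn(3, 5, device=device)
print(x.device, x.dtype)     # the chosen device, torch.float32

z = (x + 1).to("cpu", torch.double)  # move and convert in one call
print(z.device, z.dtype)     # cpu torch.float64

arr = z.numpy()              # conversion to NumPy requires a CPU tensor
print(arr.dtype)             # float64
```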