# Tensors


*Adapted from `pytorch.org/tutorials/beginner/basics/tensorqs_tutorial.html`*.

Tensors are a specialized data structure, very similar to arrays and matrices.
In PyTorch, we use tensors to encode the inputs and outputs of a model, as well as the model’s parameters.

Tensors are similar to NumPy’s ndarrays, except that tensors can run on GPUs or other hardware accelerators. Tensors
are also optimized for automatic differentiation (we'll see more about that later in the Autograd section). If you’re familiar with ndarrays, you’ll be right at home with the Tensor API!

In [None]:
import torch
import numpy as np

## Initializing a Tensor

Tensors can be initialized in various ways. Take a look at the following examples:

**Directly from data**

Tensors can be created directly from data. The data type is automatically inferred.



In [None]:
data = [[1, 2],[3, 4.]]
x_data = torch.tensor(data)
x_data # Notice that all elements of a tensor must have the same data type

tensor([[1., 2.],
        [3., 4.]])

**From a NumPy array**

Tensors can be created from NumPy arrays with `torch.from_numpy`, and converted back to arrays with the `.numpy()` method (for example, `x_data.numpy()`).



In [None]:
np_array = np.array(data)
x_np = torch.from_numpy(np_array)
x_np

tensor([[1., 2.],
        [3., 4.]], dtype=torch.float64)
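One detail worth seeing in code: a tensor built with `torch.from_numpy` shares its memory with the source array, whereas `torch.tensor` makes a copy. The sketch below assumes everything lives on the CPU; the names `shared_array`, `shared_tensor`, and `copied_tensor` are illustrative and do not appear elsewhere in this notebook.

```python
import numpy as np
import torch

# torch.from_numpy shares memory with the source array (CPU only),
# so an in-place change to the array is visible through the tensor.
shared_array = np.ones(3)                       # illustrative example array
shared_tensor = torch.from_numpy(shared_array)

np.add(shared_array, 1, out=shared_array)       # in-place update of the NumPy array
print(shared_tensor)                            # the tensor reflects the update

# torch.tensor copies the data, so later changes to the array do not reach the copy.
copied_tensor = torch.tensor(shared_array)
shared_array[0] = 100.0
print(shared_tensor[0].item(), copied_tensor[0].item())
```

If you need an independent tensor, prefer the copying constructor or call `.clone()` on the shared tensor.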

**With random or constant values:**


In [None]:
rand_tensor = torch.rand(2,3)
rand_tensor

tensor([[0.5454, 0.6531, 0.4457],
        [0.7516, 0.4656, 0.7648]])
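The cell above covers the random case. For constant values, a minimal sketch could look like the following; the shape `(2, 3)` and the fill value `7.0` are arbitrary choices made for illustration.

```python
import torch

shape = (2, 3)                          # arbitrary shape, chosen for illustration

ones_tensor = torch.ones(shape)         # every entry is 1.0
zeros_tensor = torch.zeros(shape)       # every entry is 0.0
full_tensor = torch.full(shape, 7.0)    # every entry equals the given fill value

print(ones_tensor)
print(zeros_tensor)
print(full_tensor)
```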

## Attributes of a Tensor

A tensor's attributes describe its shape and the device on which it is stored.



In [None]:
tensor = torch.rand(3,4)
tensor.shape # Shape

torch.Size([3, 4])

In [None]:
tensor.device # Device on which the tensor is stored, e.g., cpu or cuda

device(type='cpu')
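A tensor also carries a data type, available as `.dtype`. The sketch below assumes PyTorch's default settings (the default floating-point type has not been changed); the variable names are illustrative.

```python
import torch

float_tensor = torch.rand(3, 4)
print(float_tensor.dtype)       # torch.float32 under default settings

int_tensor = torch.tensor([1, 2, 3])
print(int_tensor.dtype)         # torch.int64, inferred from the Python ints

# .to(dtype) returns a converted tensor; int_tensor itself is unchanged.
as_float64 = int_tensor.to(torch.float64)
print(as_float64.dtype, int_tensor.dtype)
```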

## Operations on Tensors

Over 100 tensor operations, including arithmetic, linear algebra, matrix manipulation (transposing,
indexing, slicing), sampling, and more, are
described in the [`torch` API documentation](https://pytorch.org/docs/stable/torch.html).

Each of these operations can be run on the GPU, typically faster than on a
CPU. If you’re using Colab, allocate a GPU by going to `Runtime > Change runtime type > GPU`.

By default, tensors are created on the CPU. We need to explicitly move tensors to the GPU using
the `.to` method (after checking for GPU availability). Keep in mind that copying large tensors
across devices can be expensive in terms of time and memory!



In [None]:
# We move our tensor to the GPU if available
if torch.cuda.is_available():
    tensor = tensor.to("cuda")
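A common variation, sketched here as one possible pattern rather than the only way to do it, is to pick the device once and create tensors on it directly, which avoids a separate copy. The names `device`, `example`, and `moved_back` are illustrative.

```python
import torch

# Choose the device once, then reuse the choice everywhere.
device = "cuda" if torch.cuda.is_available() else "cpu"

example = torch.rand(3, 4, device=device)   # created directly on the chosen device
moved_back = example.to("cpu")              # returns the same tensor if already on the CPU

print(example.device, moved_back.device)
```

This keeps the same code runnable on machines with and without a GPU.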

Try out some of the operations from the list.
If you're familiar with the NumPy API, you'll find the Tensor API a breeze to use.




**Standard NumPy-like indexing and slicing:**



In [None]:
tensor = torch.ones(3, 4)
tensor, tensor[0]

(tensor([[1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.]]),
 tensor([1., 1., 1., 1.]))
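Column selection and in-place assignment follow the same NumPy conventions. A short sketch, using a fresh tensor named `grid` (an illustrative name) so the `tensor` variable above is left untouched:

```python
import torch

grid = torch.ones(3, 4)

print(grid[:, 0])       # first column
print(grid[..., -1])    # last column

grid[:, 1] = 0          # in-place assignment: zero out the second column
print(grid)
```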

**Arithmetic operations**



In [None]:
# This computes the matrix multiplication between two tensors.
y = tensor @ tensor.T
y

tensor([[4., 4., 4.],
        [4., 4., 4.],
        [4., 4., 4.]])
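It is easy to confuse the matrix product with the element-wise product, so here is a small sketch contrasting the two. The variable names are illustrative; `torch.matmul` and `torch.mul` are the function forms of `@` and `*`.

```python
import torch

ones = torch.ones(3, 4)

# Matrix product: (3, 4) @ (4, 3) -> (3, 3); equivalent to torch.matmul(ones, ones.T)
matrix_product = ones @ ones.T

# Element-wise product: shapes must match (or broadcast); equivalent to torch.mul(ones, ones)
elementwise_product = ones * ones

print(matrix_product.shape, elementwise_product.shape)
```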

**Single-element tensors**

If you have a one-element tensor, for example by aggregating all
values of a tensor into one value, you can convert it to a Python
numerical value using `item()`:



In [None]:
agg = tensor.sum()
agg_item = agg.item()
agg, agg_item

(tensor(12.), 12.0)

# Autograd

When training neural networks, the most frequently used algorithm is backpropagation. In this algorithm, each parameter (model weight) is adjusted according to the gradient of the loss function with respect to that parameter.

To compute those gradients, PyTorch has a built-in differentiation engine called `torch.autograd`. It supports automatic computation of gradients for any computational graph.

Let's create two tensors `a` and `b` with `requires_grad=True` to signal to PyTorch that every operation on them should be tracked.

In [None]:
a = torch.tensor([2.], requires_grad=True)
b = torch.tensor([6.], requires_grad=True)

Let's define a new tensor `Q` from `a` and `b`,
$$Q = 3a^3-b^2$$

In [None]:
Q = 3 * a**3 - b**2

We can use PyTorch to compute $\frac{\partial Q}{\partial a}$ and $\frac{\partial Q}{\partial b}$.

When we call `Q.backward()`, PyTorch computes these derivatives and stores them in the `.grad` attributes of `a` and `b`, respectively.

In [None]:
Q.backward()

In [None]:
a.grad == 9 * a**2 # Compare with the analytical derivative dQ/da = 9a^2

tensor([True])

In [None]:
b.grad == -2 * b # Compare with the analytical derivative dQ/db = -2b

tensor([True])
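For reference, the analytical derivatives used in these comparisons, evaluated by hand at $a = 2$ and $b = 6$ (a worked calculation, not PyTorch output), are
$$\frac{\partial Q}{\partial a} = 9a^2 = 9 \cdot 2^2 = 36, \qquad \frac{\partial Q}{\partial b} = -2b = -2 \cdot 6 = -12.$$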

Notice that if you compute derivatives with respect to `a` again, the new values are added to those already stored in `a.grad` (they are accumulated, not overwritten!).

In [None]:
a.grad

tensor([36.])

In [None]:
Q = 3 * a**3 - b**2
Q.backward()
a.grad

tensor([72.])

So, before computing a new derivative, you should zero the `grad` attribute with `a.grad.zero_()`.

In [None]:
a.grad.zero_()
Q = 3 * a**3 - b**2
Q.backward()
a.grad

tensor([36.])

*If you want to know more about `torch.autograd` you can check `https://pytorch.org/tutorials/beginner/basics/autogradqs_tutorial.html`.*

## Exercise

Given the tensors `u = [1., 2., 3.]`, `v = [4., 5.]`, and the function
$$f(u, v) =
\sum_i u_i^2 + \sum_i \log v_i,$$
compute the gradients of $f$ with respect to $u$ and $v$ and compare with your analytical predictions.