From: [PyTorch doc](https://pytorch.org/tutorials/beginner/basics/intro.html)

# Tensors

* A specialized data structure that is very similar to arrays and matrices.
* Used to encode the inputs, outputs, and parameters of a model.
* Similar to NumPy's `ndarray`, except that tensors can run on hardware accelerators such as GPUs.
* Tensors and NumPy arrays can often share the same underlying memory, eliminating the need to copy data (see [Bridge with NumPy](https://pytorch.org/tutorials/beginner/blitz/tensor_tutorial.html#bridge-to-np-label)).
* Tensors are optimized for automatic differentiation (see [Autograd](https://pytorch.org/tutorials/beginner/basics/autogradqs_tutorial.html))
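
The memory-sharing point above can be checked with a minimal sketch. The variable names (`shared`, `copied`) are illustrative and not part of the original notes; the sketch assumes a CPU array, since sharing only applies to tensors that live in host memory.

```python
import numpy as np
import torch

np_arr = np.ones(3)                # float64 array in host (CPU) memory
shared = torch.from_numpy(np_arr)  # tensor backed by the same buffer as np_arr
copied = torch.tensor(np_arr)      # torch.tensor copies the data

np_arr += 1                        # in-place update of the NumPy array

print(shared)  # sees the update, because the memory is shared
print(copied)  # still holds the original values
```

`torch.from_numpy` avoids a copy, so changes on either side are visible on the other; `torch.tensor` is the safer choice when the two objects should evolve independently.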

In [1]:
import torch
import numpy as np

## Initializing a Tensor

### Directly from data

In [2]:
data = [[1, 2], [3, 4]]
x_data = torch.tensor(data)

### From a NumPy array

In [3]:
np_arr = np.array(data)
x_np = torch.from_numpy(np_arr)

### From another tensor

The new tensor retains the properties (shape, datatype) of the argument tensor, unless explicitly overridden.

In [4]:
x_ones = torch.ones_like(x_data)
print(x_ones)

tensor([[1, 1],
        [1, 1]])


In [5]:
x_rand = torch.rand_like(x_data, dtype=torch.float)
print(x_rand)

tensor([[0.6344, 0.2014],
        [0.8269, 0.0569]])
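
As a small sketch of what "retains the properties" means in practice, the following checks the shape and dtype of both results. It assumes `x_data` is built from the same integer list as above, so its dtype is `torch.int64`.

```python
import torch

x_data = torch.tensor([[1, 2], [3, 4]])              # integer data -> torch.int64
x_ones = torch.ones_like(x_data)                     # keeps shape and dtype
x_rand = torch.rand_like(x_data, dtype=torch.float)  # keeps shape, overrides dtype

print(x_ones.shape, x_ones.dtype)
print(x_rand.shape, x_rand.dtype)
```

The dtype override in `rand_like` is likely needed here rather than optional: `torch.rand` samples from a uniform distribution on [0, 1), which is defined for floating-point dtypes, so calling it on an integer tensor without `dtype` is expected to fail.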


### With random or constant values

In [6]:
shape = (2, 3,)

In [7]:
torch.rand(shape)

tensor([[0.8932, 0.5327, 0.0120],
        [0.9442, 0.5825, 0.8924]])

In [8]:
torch.ones(shape)

tensor([[1., 1., 1.],
        [1., 1., 1.]])

In [9]:
torch.zeros(shape)

tensor([[0., 0., 0.],
        [0., 0., 0.]])

## Attributes of a Tensor

In [10]:
tensor = torch.rand(3, 4)

In [11]:
tensor.shape

torch.Size([3, 4])

In [12]:
tensor.dtype

torch.float32

In [13]:
tensor.device

device(type='cpu')

## Operations on Tensors

* A comprehensive list of operations, including arithmetic, linear algebra, matrix multiplication, sampling, and more, can be found [here](https://pytorch.org/docs/stable/torch.html).
* Each of these operations can be run on the GPU.
* By default, tensors are created on the CPU. They must be moved to the GPU explicitly with the `.to` method (after checking for GPU availability).
  * Note that copying large tensors across devices can be expensive in terms of time and memory.

In [14]:
if torch.cuda.is_available():
    print('Cuda is available. Using cuda...')
    tensor = tensor.to('cuda')
else:
    print('Cuda is unavailable. Using cpu...')

Cuda is available. Using cuda...
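
The availability check above is often written in a device-agnostic form. The sketch below is one possible pattern, not the only one; the names `device`, `on_device`, and `moved` are illustrative. Creating a tensor directly on the target device with the `device=` argument avoids the cross-device copy mentioned in the note above.

```python
import torch

# Choose the device once and reuse it everywhere.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Allocate directly on the target device: no host-to-device copy afterwards.
on_device = torch.rand(3, 4, device=device)

# Move an existing CPU tensor. `.to` does not modify `cpu_tensor` in place,
# so the result has to be bound to a name.
cpu_tensor = torch.rand(3, 4)
moved = cpu_tensor.to(device)

print(on_device.device, moved.device, cpu_tensor.device)
```

On a machine without a GPU all three tensors report `cpu`, so the same code runs unchanged in both environments.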


### Indexing and Slicing (NumPy-like)

In [15]:
tensor = torch.from_numpy(np.arange(12).reshape(3, 4) + 1)

In [16]:
tensor

tensor([[ 1,  2,  3,  4],
        [ 5,  6,  7,  8],
        [ 9, 10, 11, 12]])

In [17]:
tensor[0]

tensor([1, 2, 3, 4])

In [18]:
tensor[:, 0]

tensor([1, 5, 9])

In [19]:
tensor[..., -1]

tensor([ 4,  8, 12])

In [20]:
tensor[:, 1] = 0

In [21]:
tensor

tensor([[ 1,  0,  3,  4],
        [ 5,  0,  7,  8],
        [ 9,  0, 11, 12]])

In [22]:
torch.cat([tensor, tensor], dim=1)

tensor([[ 1,  0,  3,  4,  1,  0,  3,  4],
        [ 5,  0,  7,  8,  5,  0,  7,  8],
        [ 9,  0, 11, 12,  9,  0, 11, 12]])

In [23]:
torch.cat([tensor, tensor], dim=0)

tensor([[ 1,  0,  3,  4],
        [ 5,  0,  7,  8],
        [ 9,  0, 11, 12],
        [ 1,  0,  3,  4],
        [ 5,  0,  7,  8],
        [ 9,  0, 11, 12]])

### Arithmetic operations

#### Matrix multiplication

In [24]:
tensor @ tensor.T

tensor([[ 26,  58,  90],
        [ 58, 138, 218],
        [ 90, 218, 346]])

In [25]:
tensor.matmul(tensor.T)

tensor([[ 26,  58,  90],
        [ 58, 138, 218],
        [ 90, 218, 346]])

#### Element-wise product

In [26]:
tensor * tensor

tensor([[  1,   0,   9,  16],
        [ 25,   0,  49,  64],
        [ 81,   0, 121, 144]])

In [27]:
tensor.mul(tensor)

tensor([[  1,   0,   9,  16],
        [ 25,   0,  49,  64],
        [ 81,   0, 121, 144]])
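
To make the difference between the two products explicit, here is a minimal sketch that compares result shapes. It uses a fresh 3×4 tensor `a` (an illustrative name) rather than the `tensor` variable from the cells above.

```python
import torch

a = torch.arange(1, 13).reshape(3, 4)  # 3x4 integer tensor holding 1..12

mat = a @ a.T   # matrix product: (3, 4) @ (4, 3) -> (3, 3)
elem = a * a    # element-wise product: shape stays (3, 4)

print(mat.shape)
print(elem.shape)

# The operator and function forms compute the same values.
print(torch.equal(mat, torch.matmul(a, a.T)))
print(torch.equal(elem, torch.mul(a, a)))
```

Matrix multiplication requires the inner dimensions to match, which is why the transpose is needed; the element-wise product requires the shapes to be equal or broadcastable.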

In [28]:
torch.rand_like(tensor, dtype=torch.float)

tensor([[0.7362, 0.8756, 0.7110, 0.0362],
        [0.9215, 0.7803, 0.4575, 0.9803],
        [0.5080, 0.4605, 0.5902, 0.6911]])

#### Single-element tensors (e.g. after aggregation)

In [29]:
tensor.sum()

tensor(60)

In [30]:
tensor.sum().item()

60

#### In-place operations

Operations that store the result into the operand are called in-place. They are denoted by a `_` suffix.

In [31]:
t1 = tensor.detach().clone()

In [32]:
t1.add_(2)

tensor([[ 3,  2,  5,  6],
        [ 7,  2,  9, 10],
        [11,  2, 13, 14]])

In [33]:
tensor

tensor([[ 1,  0,  3,  4],
        [ 5,  0,  7,  8],
        [ 9,  0, 11, 12]])
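
For contrast, a short sketch of the out-of-place and in-place variants side by side. The tensor `t` and the names `out` and `same` are illustrative only.

```python
import torch

t = torch.tensor([[1, 2], [3, 4]])

out = t.add(2)    # out-of-place: returns a new tensor, t is unchanged
print(t)

same = t.add_(2)  # in-place: overwrites t's values and returns the modified tensor
print(t)

# The in-place result refers to the same storage as t.
print(same.data_ptr() == t.data_ptr())
```

In-place operations save memory, but they overwrite values that autograd may need for computing gradients, which is why the cell above works on a detached clone rather than on `tensor` itself.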