# Tensors
Tensors are a specialized data structure, very similar to arrays and matrices. In PyTorch, we use tensors to encode the inputs and outputs of a model, as well as the model’s parameters.

Tensors are similar to NumPy’s ndarrays, except that tensors can run on GPUs or other hardware accelerators. In fact, tensors and NumPy arrays can often share the same underlying memory, eliminating the need to copy data (see Bridge with NumPy). Tensors are also optimized for automatic differentiation (we’ll see more about that later in the Autograd section). If you’re familiar with ndarrays, you’ll be right at home with the Tensor API. If not, follow along!

In [127]:
import torch
import numpy as np

## Initializing a Tensor
Tensors can be created directly from data. The data type is automatically inferred.

### Directly from data

In [128]:
data = [[1, 2],[3, 4]]
x_data = torch.tensor(data)
print(x_data)

tensor([[1, 2],
        [3, 4]])
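
To make the inference rule concrete, here is a small sketch. The variable names are illustrative and not part of the original notebook, and it assumes PyTorch's default settings, under which Python integers become `torch.int64` and Python floats become the default floating-point dtype (`torch.float32` unless changed).

```python
import torch

# Python ints are inferred as 64-bit integers
int_tensor = torch.tensor([[1, 2], [3, 4]])
print(int_tensor.dtype)

# Python floats are inferred as the default floating-point dtype
float_tensor = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
print(float_tensor.dtype)

# Passing dtype explicitly overrides the inference
half_tensor = torch.tensor([[1, 2], [3, 4]], dtype=torch.float16)
print(half_tensor.dtype)
```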


### From a NumPy array

In [129]:
np_array = np.array(data)
x_np = torch.from_numpy(np_array)
print(x_np)

tensor([[1, 2],
        [3, 4]])


### From another tensor:

The new tensor retains the properties (shape, datatype) of the argument tensor, unless explicitly overridden.

In [130]:
x_ones = torch.ones_like(x_data) # retains the properties of x_data
print(f"Ones Tensor: \n {x_ones} \n")

x_rand = torch.rand_like(x_data, dtype=torch.float) # overrides the datatype of x_data
print(f"Random Tensor: \n {x_rand} \n")

Ones Tensor: 
 tensor([[1, 1],
        [1, 1]]) 

Random Tensor: 
 tensor([[0.3990, 0.2059],
        [0.3533, 0.3383]]) 



### With random or constant values:

'shape' is a tuple of tensor dimensions. In the functions below, it determines the size of each dimension of the output tensor.

In [131]:
shape = (4,5)
rand_tensor = torch.rand(shape)
ones_tensor = torch.ones(shape)
zeros_tensor = torch.zeros(shape)

print(f"Random Tensor: \n {rand_tensor} \n")
print(f"Ones Tensor: \n {ones_tensor} \n")
print(f"Zeros Tensor: \n {zeros_tensor}")

Random Tensor: 
 tensor([[0.1808, 0.5819, 0.7232, 0.5336, 0.3443],
        [0.9215, 0.0228, 0.7205, 0.5110, 0.5846],
        [0.9408, 0.4217, 0.1138, 0.8900, 0.4331],
        [0.3029, 0.0224, 0.2117, 0.8643, 0.6174]]) 

Ones Tensor: 
 tensor([[1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.]]) 

Zeros Tensor: 
 tensor([[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]])
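
As a brief sketch of how the shape argument can be supplied (the names `a`, `b`, and `c` are illustrative only): the factory functions accept the dimensions either as one tuple or as separate integers, and `torch.full` covers constants other than 0 and 1.

```python
import torch

shape = (2, 3)
a = torch.zeros(shape)      # shape passed as a tuple
b = torch.zeros(2, 3)       # same dimensions passed as separate arguments
c = torch.full(shape, 7.0)  # every element set to the given constant

print(a.shape == b.shape)
print(c)
```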


## Attributes of a Tensor
Tensor attributes describe their shape, datatype, and the device on which they are stored.

In [132]:
tensor = torch.rand(3,4)

print(f"Shape of tensor: {tensor.shape}")
print(f"Datatype of tensor: {tensor.dtype}")
print(f"Device tensor is stored on: {tensor.device}")

Shape of tensor: torch.Size([3, 4])
Datatype of tensor: torch.float32
Device tensor is stored on: cpu
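
A few related attributes are often useful alongside these three. The sketch below is an illustration, not part of the original notebook; it uses `ndim`, `numel()`, and a dtype conversion with `.to`.

```python
import torch

t = torch.rand(3, 4)
print(t.ndim)     # number of dimensions
print(t.numel())  # total number of elements

# .to(dtype) returns a converted tensor; t itself keeps its dtype
t64 = t.to(torch.float64)
print(t.dtype, t64.dtype)
```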


## Operations on Tensors
PyTorch provides over 100 tensor operations, including arithmetic, linear algebra, matrix manipulation (transposing, indexing, slicing), and sampling; they are described in the torch API reference.

Each of these operations can be run on the GPU (at typically higher speeds than on a CPU). If you’re using Colab, allocate a GPU by going to Runtime > Change runtime type > GPU.

By default, tensors are created on the CPU. We need to explicitly move tensors to the GPU using the .to method (after checking for GPU availability). Keep in mind that copying large tensors across devices can be expensive in terms of time and memory!

In [133]:
# We move our tensor to the GPU if available
if torch.cuda.is_available():
    tensor = tensor.to("cuda")
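
A common way to keep code runnable on machines with or without a GPU is to choose the device once and reuse it. The following is a minimal sketch of that pattern; the names `device`, `on_device`, and `back_on_cpu` are illustrative.

```python
import torch

# Pick the accelerator when present, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Creating the tensor directly on the target device avoids a later copy
on_device = torch.rand(3, 4, device=device)

# .to returns a tensor on the requested device
# (the same object is returned if no move is needed)
back_on_cpu = on_device.to("cpu")

print(on_device.device, back_on_cpu.device)
```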

### Get value

In [134]:
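# z is not defined until the arithmetic cells below, so create a sample tensor here
z = torch.rand(3, 3)
print("z[1, 2] = ", z[1, 2].item())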
d = torch.rand(size=(2, 3, 3), dtype=torch.float64, requires_grad=True)
e = d[0][2][0]
print(d)
print(e)

z[1, 2] =  0.8137245178222656
tensor([[[0.9267, 0.9474, 0.8953],
         [0.1128, 0.0133, 0.6570],
         [0.0733, 0.8075, 0.4105]],

        [[0.1203, 0.5967, 0.0043],
         [0.6777, 0.7786, 0.6424],
         [0.8480, 0.2474, 0.6842]]], dtype=torch.float64, requires_grad=True)
tensor(0.0733, dtype=torch.float64, grad_fn=<SelectBackward0>)


### Standard NumPy-like indexing and slicing:

In [135]:
tensor = torch.ones(4, 4)
print(f"First row: {tensor[0]}")
print(f"First column: {tensor[:, 0]}")
print(f"Last column: {tensor[..., -1]}")
tensor[:,1] = 0
print(tensor)

First row: tensor([1., 1., 1., 1.])
First column: tensor([1., 1., 1., 1.])
Last column: tensor([1., 1., 1., 1.])
tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])
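
One detail worth illustrating is that basic indexing and slicing return views that share memory with the original tensor, which is why the assignment above changes `tensor` itself. The sketch below uses illustrative names (`base`, `row`, `block`, `col_copy`) and `clone()` to get an independent copy.

```python
import torch

base = torch.arange(12).reshape(3, 4)

row = base[1]          # a view of the second row
row[0] = 100           # writing through the view changes base as well

block = base[:2, 1:3]  # 2x2 sub-block, also a view

col_copy = base[:, 0].clone()  # clone() makes an independent copy
col_copy[0] = -1               # base is not affected

print(base)
print(block.shape)
```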


### Joining tensors
You can use torch.cat to concatenate a sequence of tensors along a given dimension. See also torch.stack, another tensor joining operator that is subtly different from torch.cat.

In [136]:
t1 = torch.cat([tensor, tensor, tensor], dim=1)
print(t1)

# t2 = torch.stack((tensor, tensor, tensor), dim=1)
# print(t2)

tensor([[1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.]])
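
The difference between the two operators is easiest to see in the resulting shapes. In this sketch (with an illustrative tensor `t`), `torch.cat` extends an existing dimension while `torch.stack` inserts a new one.

```python
import torch

t = torch.ones(4, 4)

cat_result = torch.cat([t, t, t], dim=1)      # joins along an existing dimension
stack_result = torch.stack([t, t, t], dim=1)  # inserts a new dimension at position 1

print(cat_result.shape)    # torch.Size([4, 12])
print(stack_result.shape)  # torch.Size([4, 3, 4])
```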


### Arithmetic operations

**Addition**

In [137]:
x = torch.ones(3, 3)
y = torch.rand(3, 3)

# element-wise addition
z = x + y
# torch.add(x, y)

# in-place addition: every method with a trailing underscore is an in-place operation,
# i.e. it modifies the tensor it is called on
# y.add_(x)

print(f"x = {x}")
print(f"y = {y}")
print(f"z = {z}")


x = tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])
y = tensor([[0.9198, 0.0865, 0.7149],
        [0.7074, 0.6015, 0.0016],
        [0.8720, 0.5109, 0.8494]])
z = tensor([[1.9198, 1.0865, 1.7149],
        [1.7074, 1.6015, 1.0016],
        [1.8720, 1.5109, 1.8494]])


**Subtraction**

In [138]:
z = x - y
# z = torch.sub(x, y)
print(f"z = {z}")

z = tensor([[0.0802, 0.9135, 0.2851],
        [0.2926, 0.3985, 0.9984],
        [0.1280, 0.4891, 0.1506]])


**Multiplication**

In [139]:
z = x * y
# z = torch.mul(x, y)
print(f"z = {z}")

z = tensor([[0.9198, 0.0865, 0.7149],
        [0.7074, 0.6015, 0.0016],
        [0.8720, 0.5109, 0.8494]])


**Matrix multiplication vs. element-wise multiplication**

In [140]:
# This computes the matrix multiplication between two tensors. y1, y2, y3 will have the same value
# ``tensor.T`` returns the transpose of a tensor
x = torch.ones(3, 4)
y1 = x @ x.T
y2 = x.matmul(x.T)

y3 = torch.rand_like(y1)
torch.matmul(x, x.T, out=y3)
print(f"y1 = {y1},\ny2 = {y2},\ny3 = {y3}")


# This computes the element-wise product. z1, z2, z3 will have the same value
z1 = tensor * tensor
z2 = tensor.mul(tensor)

z3 = torch.rand_like(tensor)
torch.mul(tensor, tensor, out=z3)
print('Tensor = ', tensor)
print(f"z1 = {z1},\nz2 = {z2},\nz3 = {z3}")

y1 = tensor([[4., 4., 4.],
        [4., 4., 4.],
        [4., 4., 4.]]),
y2 = tensor([[4., 4., 4.],
        [4., 4., 4.],
        [4., 4., 4.]]),
y3 = tensor([[4., 4., 4.],
        [4., 4., 4.],
        [4., 4., 4.]])
Tensor =  tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])
z1 = tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]]),
z2 = tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]]),
z3 = tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])
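
Because the tensors above contain only ones and zeros, the two kinds of product are hard to tell apart from the output. Here is a small sketch with distinct values (the matrices `a` and `b` are made up for illustration) where the results clearly differ.

```python
import torch

a = torch.tensor([[1., 2.], [3., 4.]])
b = torch.tensor([[5., 6.], [7., 8.]])

matrix_product = a @ b      # rows of a times columns of b
elementwise_product = a * b # each element multiplied by its counterpart

print(matrix_product)       # [[19., 22.], [43., 50.]]
print(elementwise_product)  # [[ 5., 12.], [21., 32.]]
```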


**Division**

In [142]:
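# x was reassigned to a 3x4 tensor in the previous cell, so restore a 3x3 tensor of ones
x = torch.ones(3, 3)
z = y / x
# z = torch.div(y, x)
print(f"z = {z}")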

z = tensor([[0.9198, 0.0865, 0.7149],
        [0.7074, 0.6015, 0.0016],
        [0.8720, 0.5109, 0.8494]])


### Single-element tensors
If you have a one-element tensor, for example by aggregating all values of a tensor into one value, you can convert it to a Python numerical value using item():

In [None]:
agg = tensor.sum()
agg_item = agg.item()
print(agg_item, type(agg_item))

12.0 <class 'float'>
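
Note that `item()` only works for tensors with exactly one element. The sketch below (illustrative names; the exception type has differed between PyTorch versions, so both are caught) shows the failure case and `tolist()` as an alternative for larger tensors.

```python
import torch

single = torch.tensor([3.5])
print(single.item())  # a plain Python float

multi = torch.tensor([1.0, 2.0, 3.0])
try:
    multi.item()      # not allowed: more than one element
    raised = False
except (RuntimeError, ValueError) as err:
    raised = True
    print(f"item() failed: {err}")

# tolist() converts a tensor of any size to (nested) Python lists
print(multi.tolist())
```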


### In-place operations
Operations that store the result into the operand are called in-place. They are denoted by a _ suffix. For example, x.copy_(y) and x.t_() both modify x.

In [None]:
tensor = torch.ones(3, 4)
tensor[:, 1] = 0
print(f"The original matrix:\n {tensor} \n")
tensor.add_(5)
# the matrix 'tensor' has been changed (add 5 to each element)
print(f"The matrix after in-place adding:\n {tensor}")

The original matrix:
 tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]]) 

The matrix after in-place adding:
 tensor([[6., 5., 6., 6.],
        [6., 5., 6., 6.],
        [6., 5., 6., 6.]])
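
To contrast in-place and out-of-place behavior, the following sketch keeps a second reference (`alias`, an illustrative name) to the same tensor and compares storage addresses with `data_ptr()`.

```python
import torch

a = torch.ones(3)
alias = a                   # second name for the same tensor object
ptr_before = a.data_ptr()   # address of the underlying storage

a.add_(1)                   # in-place: storage is reused, alias sees the change
same_storage = a.data_ptr() == ptr_before
print(alias, same_storage)

a = a + 1                   # out-of-place: a new tensor is created and a is rebound
print(alias, a)
```

After the last step, `alias` still holds the values produced by `add_`, while `a` refers to a new tensor.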


### Bridge with NumPy
Tensors on the CPU and NumPy arrays can share their underlying memory locations, and changing one will change the other.

**Tensor to NumPy array**

In [None]:
t = torch.ones(5)
print(f"t: {t}")
n = t.numpy()
print(f"n: {n}")

t: tensor([1., 1., 1., 1., 1.])
n: [1. 1. 1. 1. 1.]


A change in the tensor is reflected in the NumPy array.

In [None]:
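# torch.add(t, 1, out=t) would also update t in place
# t = t + 1 would NOT update n: it creates a new tensor and rebinds t, leaving the shared memory untouched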
t.add_(1)
print(f"t: {t}")
print(f"n: {n}")

t: tensor([2., 2., 2., 2., 2.])
n: [2. 2. 2. 2. 2.]


**NumPy array to Tensor**

In [None]:
n = np.ones(5)
t = torch.from_numpy(n)

Changes in the NumPy array are reflected in the tensor.

In [None]:
np.add(n, 1, out=n)
print(f"t: {t}")
print(f"n: {n}")

t: tensor([2., 2., 2., 2., 2.], dtype=torch.float64)
n: [2. 2. 2. 2. 2.]
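
Memory sharing depends on how the tensor is created. As a sketch with illustrative names, `torch.from_numpy` shares the array's memory, whereas `torch.tensor` copies the data, so only the former sees later changes to the array.

```python
import numpy as np
import torch

n = np.ones(3)
shared = torch.from_numpy(n)  # shares memory with n
copied = torch.tensor(n)      # copies the data into new memory

n += 1                        # in-place update of the NumPy array

print(f"shared: {shared}")    # follows the change
print(f"copied: {copied}")    # keeps the original values
```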


## Autograd
The autograd package provides automatic differentiation for all operations on Tensors. Generally speaking, *torch.autograd* is an engine for computing the vector-Jacobian product. It computes partial derivatives by applying the chain rule.

**IMPORTANT:** Set *requires_grad=True* on every tensor for which gradients should be computed.

In [163]:
import torch
x = torch.randn(2, 3, requires_grad=True)
y = x + 2
z = torch.sum(y * y * 3)
print(f"x = \n{x}\n")
print(f"y = \n{y}\n")
print(f"z = {z}")

x = 
tensor([[ 1.0582,  0.8941, -1.4242],
        [-2.5891, -1.2196, -0.7171]], requires_grad=True)

y = 
tensor([[ 3.0582,  2.8941,  0.5758],
        [-0.5891,  0.7804,  1.2829]], grad_fn=<AddBackward0>)

z = 61.9838981628418
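
The effect of the flag can be checked directly. In this sketch (illustrative names `a`, `b`, `c`), results computed from a tensor without `requires_grad` are not tracked, and `torch.no_grad()` disables tracking temporarily even for tensors that have the flag set.

```python
import torch

a = torch.randn(2, 3)                      # requires_grad defaults to False
b = torch.randn(2, 3, requires_grad=True)

untracked = a * 2
tracked = b * 2
print(untracked.requires_grad, untracked.grad_fn)  # False None
print(tracked.requires_grad)                       # True

# Operations inside torch.no_grad() are not recorded, even for b
with torch.no_grad():
    c = b * 2
print(c.requires_grad)                             # False
```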


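**IMPORTANT:** The tensor on which we call backward() (z in this case) must be a scalar, i.e. a single-element tensor, not a matrix or a list. For a non-scalar tensor, a gradient argument of matching shape must be passed to backward().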

In [164]:
print(x.grad)
z.backward()
print(x.grad)

None
tensor([[18.3489, 17.3644,  3.4551],
        [-3.5344,  4.6822,  7.6972]])
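
The printed gradient can be checked by hand: since z = Σ 3·(x + 2)², the derivative is ∂z/∂x = 6·(x + 2). The sketch below verifies this numerically and also shows one way to call backward() on a non-scalar tensor by passing a `gradient` argument. The names `x2` and `y2` are illustrative and not part of the original notebook.

```python
import torch

x = torch.randn(2, 3, requires_grad=True)
y = x + 2
z = torch.sum(y * y * 3)
z.backward()

# Analytical gradient: dz/dx = 6 * (x + 2)
expected = 6 * (x.detach() + 2)
print(torch.allclose(x.grad, expected))

# Non-scalar output: supply the vector for the vector-Jacobian product.
# Using ones is equivalent to calling y2.sum().backward().
x2 = torch.randn(2, 3, requires_grad=True)
y2 = x2 * x2
y2.backward(gradient=torch.ones_like(y2))
print(torch.allclose(x2.grad, 2 * x2.detach()))
```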
