# Tensors

Tensors are a specialized data structure, very similar to arrays and matrices.

In PyTorch, we use tensors to encode the inputs and outputs of a model, as well as the model's parameters.

Tensors are similar to NumPy's ndarrays, except that tensors can run on GPUs or other hardware accelerators. In fact, tensors and NumPy arrays can often share the same underlying memory, eliminating the need to copy data.

Tensors are also optimized for automatic differentiation.
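
As a rough illustration of what automatic differentiation means here, the following minimal sketch tracks a single scalar through one operation and asks PyTorch for the derivative. The variable names `x` and `y` are chosen only for this example.

```python
import torch

# A scalar tensor that autograd is asked to track.
x = torch.tensor(3.0, requires_grad=True)

# y = x^2, so analytically dy/dx = 2x.
y = x ** 2

# Populate x.grad with dy/dx evaluated at x = 3.
y.backward()

print(x.grad)
```

With `x = 3`, the stored gradient should equal `2 * 3 = 6`.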

## Initializing a Tensor
Tensors can be initialized in various ways. Here are some examples:

### Directly from Data
Tensors can be created directly from data. The data type is automatically inferred.

In [1]:
import torch
import numpy as np

data = [[1,2], [3,4]]
x_data = torch.tensor(data)

print(f'Tensor x_data:\n{x_data}\n')

Tensor x_data:
tensor([[1, 2],
        [3, 4]])
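
To make the inference rule concrete, the sketch below builds tensors from a few kinds of Python data and inspects the resulting `dtype`. It assumes the default floating-point type has not been changed with `torch.set_default_dtype`; the variable names are illustrative only.

```python
import torch

int_tensor = torch.tensor([[1, 2], [3, 4]])            # Python ints
float_tensor = torch.tensor([[1.0, 2.0], [3.0, 4.0]])  # Python floats
mixed_tensor = torch.tensor([1, 2.5])                  # ints mixed with floats
bool_tensor = torch.tensor([True, False])              # Python bools

print(int_tensor.dtype)    # integer data is inferred as torch.int64
print(float_tensor.dtype)  # float data uses the default float dtype
print(mixed_tensor.dtype)  # mixed data is promoted to the float dtype
print(bool_tensor.dtype)   # boolean data is inferred as torch.bool

# Passing dtype explicitly overrides the inference.
explicit = torch.tensor([[1, 2], [3, 4]], dtype=torch.float64)
print(explicit.dtype)
```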



### From a NumPy array
Tensors can be created from NumPy arrays (and vice versa).

In [2]:
np_array = np.array(data)
x_np = torch.from_numpy(np_array)

print(f"NumPy Array:\n{np_array}\n")
print(f'Tensor from NumPy Array:\n{x_np}\n')

NumPy Array:
[[1 2]
 [3 4]]

Tensor from NumPy Array:
tensor([[1, 2],
        [3, 4]])
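
The memory sharing mentioned in the introduction can be observed directly. The sketch below, using names chosen for this example, shows that a CPU tensor created with `torch.from_numpy` and the array returned by `Tensor.numpy()` both alias the same buffer, whereas `torch.tensor` makes a copy.

```python
import numpy as np
import torch

np_array = np.ones(3)
shared = torch.from_numpy(np_array)  # shares memory with np_array

np_array += 1        # in-place change on the NumPy side
print(shared)        # the tensor sees the new values

back = shared.numpy()  # CPU tensor -> NumPy array, again without copying
shared.mul_(10)        # in-place change on the tensor side
print(np_array, back)  # both arrays see the new values

copied = torch.tensor(np_array)  # torch.tensor copies the data
np_array += 1
print(copied)  # unchanged by the last update
print(shared)  # still follows np_array
```

If an independent tensor is wanted, copying explicitly (as `torch.tensor` does here) avoids surprising updates through the shared buffer.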



### From another tensor:
The new tensor retains the properties (shape, datatype) of the argument tensor, unless explicitly overridden.

In [3]:
x_ones = torch.ones_like(x_data) # retains the properties of x_data
print(f'Ones Tensor:\n {x_ones} \n')

x_rand = torch.rand_like(x_data, dtype=torch.float) # overrides the datatype of x_data
print(f'Random Tensor:\n {x_rand} \n')

Ones Tensor:
 tensor([[1, 1],
        [1, 1]]) 

Random Tensor:
 tensor([[0.2657, 0.1195],
        [0.8898, 0.8610]]) 



### With random or constant values:
`shape` is a tuple of tensor dimensions. In the functions below, it determines the dimensionality of the output tensor.

A Python `tuple` is an immutable, ordered sequence of items; here, each item is the size of one dimension.

In [4]:
shape = (2,3,) # A tuple of tensor dimensions

rand_tensor = torch.rand(shape)
ones_tensor = torch.ones(shape)
zeros_tensor = torch.zeros(shape)

print(f"Random Tensor: \n {rand_tensor} \n")
print(f"Ones Tensor: \n {ones_tensor} \n")
print(f"Zeros Tensor: \n {zeros_tensor}")

Random Tensor: 
 tensor([[0.2529, 0.5051, 0.4882],
        [0.7300, 0.3428, 0.1516]]) 

Ones Tensor: 
 tensor([[1., 1., 1.],
        [1., 1., 1.]]) 

Zeros Tensor: 
 tensor([[0., 0., 0.],
        [0., 0., 0.]])
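
A short sketch of how the length of the `shape` tuple controls the number of dimensions. It also seeds the random number generator with `torch.manual_seed` so the random values repeat between runs; the seed value `0` is arbitrary and only for illustration.

```python
import torch

# The number of entries in the tuple is the number of dimensions.
for shape in [(3,), (2, 3), (2, 3, 4)]:
    t = torch.zeros(shape)
    print(shape, "->", t.ndim, "dimensions,", t.numel(), "elements")

# The factory functions also accept the sizes as separate arguments.
same = torch.zeros(2, 3)
print(same.shape)

# Seeding makes torch.rand reproducible.
torch.manual_seed(0)
first = torch.rand((2, 3))
torch.manual_seed(0)
second = torch.rand((2, 3))
print(torch.equal(first, second))
```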


## Attributes of a Tensor
Tensor attributes describe their shape, datatype, and the device on which they are stored.

In [5]:
tensor = torch.rand(3,4) # 3 x 4 (3 rows x 4 columns)
print(f"Tensor:\n{tensor}\n")
print(f"Shape of tensor: {tensor.shape}")
print(f"Datatype of tensor: {tensor.dtype}")
print(f"Device tensor is stored on: {tensor.device}")

Tensor:
tensor([[0.5975, 0.1230, 0.7588, 0.5231],
        [0.5801, 0.8028, 0.1855, 0.1036],
        [0.5644, 0.2178, 0.7568, 0.1504]])

Shape of tensor: torch.Size([3, 4])
Datatype of tensor: torch.float32
Device tensor is stored on: cpu
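
The attributes can also be used programmatically rather than just printed. In this sketch (variable names are illustrative), `tensor.shape` is unpacked like a tuple, and `.to` is used to obtain a copy with a different datatype.

```python
import torch

tensor = torch.rand(3, 4)

# torch.Size behaves like a tuple: it can be unpacked and compared.
rows, cols = tensor.shape
print(rows, cols)
print(tensor.shape == (3, 4))

# Number of dimensions and total number of elements.
print(tensor.ndim, tensor.numel())

# .to returns a new tensor here; the original keeps its dtype.
as_double = tensor.to(torch.float64)
print(tensor.dtype, as_double.dtype)
```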


## Operations on Tensors

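PyTorch provides over 1200 tensor operations, including arithmetic, linear algebra, matrix manipulation (transposing, indexing, slicing), and sampling.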

Each of these operations can run on the CPU or on an accelerator such as CUDA, MPS, MTIA, or XPU.

By default, tensors are created on the CPU. We need to explicitly move tensors to the accelerator using the `.to` method (after checking for accelerator availability). Keep in mind that copying large tensors across devices can be expensive in terms of time and memory!

In [6]:
if torch.accelerator.is_available():
    print('Accelerator is available')
    tensor = tensor.to(torch.accelerator.current_accelerator())
    print(f'Tensor moved to Accelerator \n {tensor} \n')

Accelerator is available
Tensor moved to Accelerator 
 tensor([[0.5975, 0.1230, 0.7588, 0.5231],
        [0.5801, 0.8028, 0.1855, 0.1036],
        [0.5644, 0.2178, 0.7568, 0.1504]], device='mps:0') 
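
One common way to keep code runnable on machines with and without an accelerator is to pick the device once and fall back to the CPU. The helper `pick_device` below is a hypothetical name introduced for this sketch, not a PyTorch function. It also guards against older PyTorch builds that may lack the `torch.accelerator` module.

```python
import torch

def pick_device() -> torch.device:
    """Return the current accelerator if one is available, else the CPU.

    Hypothetical helper for illustration only.
    """
    accel = getattr(torch, "accelerator", None)  # absent on older builds
    if accel is not None and accel.is_available():
        return accel.current_accelerator()
    return torch.device("cpu")

device = pick_device()

tensor = torch.rand(3, 4)   # created on the CPU by default
moved = tensor.to(device)   # a no-op if device is already the CPU

print(f"Original device: {tensor.device}")
print(f"Selected device: {moved.device}")
```

Creating the tensor directly on the target device, for example `torch.rand(3, 4, device=device)`, avoids the extra copy described above.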



### Standard NumPy-like indexing and slicing:

In [7]:
tensor = torch.ones(4,4)
print(f"First row: {tensor[0]}")
print(f"First column: {tensor[:, 0]}")
print(f"Last column: {tensor[...,-1]}")
tensor[:,1] = 0 # Set every element in the column at index 1 to 0
print(f"Updated tensor: \n {tensor} \n")

First row: tensor([1., 1., 1., 1.])
First column: tensor([1., 1., 1., 1.])
Last column: tensor([1., 1., 1., 1.])
Updated tensor: 
 tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]]) 
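
Because every element in the tensor above is `1`, it is hard to see which elements each index expression selects. The sketch below repeats the same expressions on a tensor with distinct values; the tensor `t` and the other names are chosen only for this example.

```python
import torch

# 3 x 4 tensor holding the values 0..11, row by row.
t = torch.arange(12).reshape(3, 4)

first_col = t[:, 0]    # every row, column 0
last_col = t[..., -1]  # "..." stands for all leading dimensions

print(t[0])        # first row: tensor([0, 1, 2, 3])
print(first_col)   # tensor([0, 4, 8])
print(last_col)    # tensor([ 3,  7, 11])
print(t[1:, :2])   # rows 1 onward, first two columns

# Assigning through an index expression modifies t in place.
t[:, 1] = 0
print(t)
```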



### Joining tensors
You can use `torch.cat` to concatenate a sequence of tensors along a given dimension. 

In [8]:
t1 = torch.cat([tensor,tensor,tensor], dim=1)
print(f'Concatenated Tensors: \n {t1} \n')

Concatenated Tensors: 
 tensor([[1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.]]) 
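
The effect of the `dim` argument is easiest to see from the resulting shapes. This sketch concatenates three 4 x 4 tensors along each dimension and, for contrast, also uses `torch.stack`, a related function that joins tensors along a new dimension instead of an existing one.

```python
import torch

tensor = torch.ones(4, 4)
parts = [tensor, tensor, tensor]

wide = torch.cat(parts, dim=1)       # columns are joined: 4 x 12
tall = torch.cat(parts, dim=0)       # rows are joined: 12 x 4
stacked = torch.stack(parts, dim=0)  # new leading dimension: 3 x 4 x 4

print(wide.shape)
print(tall.shape)
print(stacked.shape)
```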



### Arithmetic operations

In [9]:
# This computes the matrix multiplication between two tensors. y1, y2, y3 will have the same value
# ``tensor.T`` returns the transpose of a tensor
y1 = tensor @ tensor.T
y2 = tensor.matmul(tensor.T)

y3 = torch.rand_like(y1)
torch.matmul(tensor, tensor.T, out=y3)


# This computes the element-wise product. z1, z2, z3 will have the same value
z1 = tensor * tensor
z2 = tensor.mul(tensor)

z3 = torch.rand_like(tensor)
torch.mul(tensor, tensor, out=z3)

tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])
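
To check the claim that the three spellings agree, and to see how matrix multiplication differs from the element-wise product, the following sketch uses a small tensor with distinct values. The names are illustrative only.

```python
import torch

a = torch.tensor([[1., 2.],
                  [3., 4.]])

# Matrix product of a with its transpose.
m1 = a @ a.T
m2 = a.matmul(a.T)
m3 = torch.empty_like(m1)
torch.matmul(a, a.T, out=m3)  # writes the result into m3

# Element-wise product: each entry is multiplied by itself.
e1 = a * a
e2 = a.mul(a)

print(m1)  # [[1*1 + 2*2, 1*3 + 2*4], [3*1 + 4*2, 3*3 + 4*4]]
print(e1)  # [[1*1, 2*2], [3*3, 4*4]]
print(torch.equal(m1, m2), torch.equal(m1, m3), torch.equal(e1, e2))
```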

### Single-element tensors
If you have a one-element tensor, for example by aggregating all values of a tensor into one value, you can convert it to a Python numerical value using `item()`:

In [10]:
agg = tensor.sum()
agg_item = agg.item()
print(agg_item, type(agg_item))

12.0 <class 'float'>
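
A short sketch of how `item()` behaves for different datatypes and what happens when the tensor holds more than one element. The exact exception type raised for multi-element tensors is assumed here to be `RuntimeError` or `ValueError`, so both are caught. `tolist()` is shown as one alternative for multi-element tensors.

```python
import torch

t = torch.tensor([[1, 2], [3, 4]])

total = t.sum().item()          # integer tensor -> Python int
mean = t.float().mean().item()  # float tensor -> Python float
print(total, type(total))
print(mean, type(mean))

# item() only works on tensors with exactly one element.
raised = False
try:
    t.item()
except (RuntimeError, ValueError) as err:  # exact type may vary by version
    raised = True
    print(f"item() failed: {err}")

# tolist() converts a tensor of any shape to nested Python lists.
values = t.tolist()
print(values)
```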


### In-place operations
Operations that store the result in the operand are called in-place.
They are denoted by a `_` suffix. For example, `x.copy_(y)` and `x.t_()` both change `x`.

In [11]:
print(f"{tensor} \n")
tensor.add_(5)
print(tensor)

tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]]) 

tensor([[6., 5., 6., 6.],
        [6., 5., 6., 6.],
        [6., 5., 6., 6.],
        [6., 5., 6., 6.]])


#### Note:
In-place operations save some memory, but can be problematic when computing derivatives because of an immediate loss of history. Hence, their use is discouraged.
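
The sketch below illustrates two ways in which in-place operations can conflict with automatic differentiation, followed by the out-of-place alternative. The variable names are chosen for this example, and the error messages are printed rather than reproduced here because their wording may differ between versions.

```python
import torch

# Case 1: in-place update of a leaf tensor that requires gradients.
w = torch.ones(3, requires_grad=True)
leaf_error = None
try:
    w.add_(1)
except RuntimeError as err:
    leaf_error = str(err)
print(f"In-place on leaf: {leaf_error}")

# Case 2: overwriting a value that the backward pass still needs.
a = torch.ones(3, requires_grad=True)
b = a.exp()  # exp keeps its output to compute the gradient later
b.add_(1)    # the in-place add overwrites that saved output
backward_error = None
try:
    b.sum().backward()
except RuntimeError as err:
    backward_error = str(err)
print(f"Backward after in-place: {backward_error}")

# Out-of-place version: a new tensor is created, so history is intact.
a2 = torch.ones(3, requires_grad=True)
c = a2.exp() + 1
c.sum().backward()
print(a2.grad)  # d/da (exp(a) + 1) = exp(a)
```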