# PyTorch beginner course: Tensors

## Summary

- [What is a tensor?](#what-is-a-tensor)
- [Initialize a tensor in PyTorch](#initialize-a-tensor-in-pytorch)
- [How tensors are stored in memory](#how-tensors-are-stored-in-memory)
  - [General purpose methods of PyTorch](#general-purpose-methods-of-pytorch)
- [Main properties of a tensor](#main-properties-of-a-tensor)
- [Math operations between tensors](#math-operations-between-tensors)
- [Glossary of the used tools](#glossary-of-the-used-tools)
  - [Methods](#methods)
  - [Properties](#properties)
- [References](#references)
- [Author](#author)

## What is a tensor?

In this first lecture we look at one of the most widely used data structures in machine learning: the **tensor**.
A tensor extends the concept of a matrix. A matrix has two dimensions, while a tensor can have any number of dimensions, so tensors generalize both matrices *(and arrays)*.

⚠: A tensor can represent a vector or a matrix, but a matrix cannot represent an arbitrary tensor.

⚠: The **tensor (outer) product** of two tensors returns a tensor with more dimensions; the **Kronecker product** arranges the same products into a matrix.

The image below shows some of the forms a tensor can take:

![Tensors](images/Tensor.png)
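
The idea that scalars, vectors and matrices are all special cases of a tensor can be checked directly in code. The snippet below is a minimal sketch; the variable names are illustrative only, and `ndim` reports the number of dimensions of each tensor.

```python
import torch

scalar = torch.tensor(7)                   # 0 dimensions: a single value
vector = torch.tensor([1, 2, 3])           # 1 dimension
matrix = torch.tensor([[1, 2], [3, 4]])    # 2 dimensions
cube = torch.zeros(2, 3, 4)                # 3 dimensions

# Print the number of dimensions and the shape of each tensor
for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("cube", cube)]:
    print(f"{name}: ndim={t.ndim}, shape={tuple(t.shape)}")
```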

## Initialize a tensor in PyTorch

The main methods to initialize a tensor in PyTorch are shown below.

In [2]:
# import the pytorch module
import torch

# Declare a 2x2 tensor with uninitialized values (whatever is already in memory, not necessarily zeros)
empty_tensor = torch.empty(2,2)

# Print the values of the tensor
print("EMPTY TENSOR\n", empty_tensor)

EMPTY TENSOR
 tensor([[0., 0.],
        [0., 0.]])


PyTorch offers many methods to initialize a tensor, and some of them are shown below. Remember that every init method accepts any number of dimensions; for example, the `empty()` call from the previous cell could also have been written as:

In [3]:
empty_tensor = torch.empty(2,2,3,2)

print("EMPTY TENSOR WITH MANY DIMENSIONS\n", empty_tensor)

EMPTY TENSOR WITH MANY DIMENSIONS
 tensor([[[[0., 0.],
          [0., 0.],
          [0., 0.]],

         [[0., 0.],
          [0., 0.],
          [0., 0.]]],


        [[[0., 0.],
          [0., 0.],
          [0., 0.]],

         [[0., 0.],
          [0., 0.],
          [0., 0.]]]])


In [4]:
# Init some tensors
zeros_tensor = torch.zeros(2,2,3)
ones_tensor = torch.ones(3,3)
rand_tensor = torch.rand(2,2)
randint_tensor = torch.randint(low=2,high=10, size=(3,2))
customized_tensor = torch.tensor([[2,13,1,4], [12,4,3,4]])

# Output of all tensors
print("ZEROS TENSOR\n", zeros_tensor)
print("\nONES TENSOR\n", ones_tensor)
print("\nRANDOM TENSOR\n", rand_tensor)
print("\nRANDOM INTEGER TENSOR\n", randint_tensor)
print("\nCUSTOMIZED TENSOR\n", customized_tensor)

ZEROS TENSOR
 tensor([[[0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.]]])

ONES TENSOR
 tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])

RANDOM TENSOR
 tensor([[0.0496, 0.9556],
        [0.5043, 0.2169]])

RANDOM INTEGER TENSOR
 tensor([[7, 5],
        [7, 8],
        [8, 9]])

CUSTOMIZED TENSOR
 tensor([[ 2, 13,  1,  4],
        [12,  4,  3,  4]])


Every init method lets us specify the data type of the tensor's elements through the `dtype` argument.

In [5]:
integer_tensor = torch.ones(2,2, dtype=torch.int32)  # init a tensor of 32-bit integers

print("Type of my tensor: ", integer_tensor.dtype)   # print the data type of the tensor
print("32bit INTEGER TENSOR\n", integer_tensor)      # print the tensor itself

Type of my tensor:  torch.int32
32bit INTEGER TENSOR
 tensor([[1, 1],
        [1, 1]], dtype=torch.int32)


The table below recaps some of the data types available in the torch module:

| `dtype` | Type |
|:-------:|:----:|
| `float32` or `float` |32bit floating point|
| `float64` or `double` |64bit floating point|
| `complex64` or `cfloat` |64bit complex number|
| `int8` |8bit integer|
| `int16` |16bit integer|
| `int32` |32bit integer|
| `int64` or `long`|64bit integer|
| `uint8`|unsigned 8bit integer|
| `bool` |boolean|
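
The aliases in the table can be verified in code, and an existing tensor can be converted to another data type with `Tensor.to()`. This is a small sketch, with illustrative variable names, not an exhaustive tour of the dtype system.

```python
import torch

# Each alias is the very same object as the longer name
print(torch.float is torch.float32)
print(torch.double is torch.float64)
print(torch.long is torch.int64)
print(torch.cfloat is torch.complex64)

# Convert an existing tensor to another dtype (returns a new tensor)
ints = torch.ones(2, 2, dtype=torch.int32)
doubles = ints.to(torch.float64)
print(ints.dtype, "->", doubles.dtype)
```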

Most init methods have a `_like` variant that initializes a tensor with the same shape as the tensor passed as argument.

In [6]:
x = torch.ones(3, dtype=torch.float32)      # init a simple ones tensor with Size=3
y = torch.ones_like(x)                      # tensor y will be initialized with the same shape of x

print("Shape of x:", x.shape)
print("Shape of y:", y.shape)

Shape of x: torch.Size([3])
Shape of y: torch.Size([3])
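
Besides `ones_like()`, a few other `_like` variants are sketched below. The reference tensor `x` and the other names are illustrative; note that a `_like` call also copies the data type unless `dtype` is passed explicitly.

```python
import torch

x = torch.ones(2, 3, dtype=torch.float32)   # reference tensor

zeros = torch.zeros_like(x)                        # same shape and dtype, filled with 0
randoms = torch.rand_like(x)                       # same shape, uniform values in [0, 1)
sevens = torch.full_like(x, 7)                     # same shape, filled with 7
as_int = torch.zeros_like(x, dtype=torch.int64)    # same shape, different dtype

print(zeros.shape, randoms.shape, sevens.shape, as_int.dtype)
```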


## How tensors are stored in memory

Like matrices *(in a classical programming language)*, tensors are stored in a contiguous block of memory. For this reason, we can reach any element of a tensor with a simple equation that computes the position of that element in memory.

This equation uses values called **strides**, which we can get with the method `tensor.stride()`. A tensor has one stride per dimension, and the formula is:

$$
\mathrm{target} = \mathrm{index}_1 \cdot \mathrm{stride}_1 + \dots + \mathrm{index}_n \cdot \mathrm{stride}_n
$$

$$
n = \text{number of dimensions}
$$
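
As a worked instance of the formula, the sketch below applies it to a tensor with three dimensions. `flat_offset` is a hypothetical helper written for this example, not a PyTorch function, and the computation assumes a freshly created contiguous tensor (views produced by slicing can also carry a non-zero `storage_offset()`, which is not handled here).

```python
import torch

def flat_offset(index, stride):
    """Hypothetical helper: apply the stride formula to one index tuple."""
    return sum(i * s for i, s in zip(index, stride))

# A 2x3x4 tensor holding the values 0..23 in memory order
x = torch.arange(24).reshape(2, 3, 4)

index = (1, 2, 3)
offset = flat_offset(index, x.stride())   # 1*12 + 2*4 + 3*1

print("Stride:", x.stride())
print("Offset:", offset)
# flatten() exposes the elements in memory order for a contiguous tensor
print("Same element:", x.flatten()[offset].item() == x[index].item())
```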

In [7]:
x = torch.tensor([[1,2],[3,4]])

# In this block of code we retrieve the value 3 in two different ways

# logical access
target = x[1,0].item()
print("Logical access: ", target)

# address access
row_index, col_index = 1, 0
stride = x.stride()
address = row_index*stride[0] + col_index*stride[1]
target = x.storage()[address]
print("Address access: ", target)

Logical access:  3
Address access:  3


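`storage()` is a method that returns the raw data stored in memory, so we can read all the elements of the tensor contiguously. In recent PyTorch releases this call emits a deprecation warning that points to `untyped_storage()` as the replacement.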

In [8]:
print("Contiguous raw data\n", x.storage())

Contiguous raw data
  1
 2
 3
 4
[torch.storage.TypedStorage(dtype=torch.int64, device=cpu) of size 4]
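
Strides also explain why some operations do not need to move any data. The sketch below, with illustrative variable names, transposes a matrix and shows that only the strides change while the underlying memory stays the same; `contiguous()` is what actually copies the data into the new order.

```python
import torch

x = torch.tensor([[1, 2, 3], [4, 5, 6]])
xt = x.t()                                   # transposed view, no data copied

print("x stride: ", x.stride())
print("xt stride:", xt.stride())
print("Same memory:", x.data_ptr() == xt.data_ptr())
print("xt contiguous:", xt.is_contiguous())

# contiguous() copies the elements into row-major order for the new shape
xc = xt.contiguous()
print("xc stride:", xc.stride())
```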


### General purpose methods of PyTorch

In [9]:
x = torch.tensor([1,2,3,4])

print("Is tensor: ", torch.is_tensor(x))        # True if the object is a tensor
print("Size of x:", x.size())                   # shape of the tensor
print("Element at index 2: ", x[2].item())      # value at the given position

# Convert another data structure into a tensor
other_data_structure = [1,2,3,4]
converted = torch.as_tensor(other_data_structure)
print("TENSOR FROM OTHER DATA STRUCTURE:",converted)

Is tensor:  True
Size of x: torch.Size([4])
Element at index 2:  3
TENSOR FROM OTHER DATA STRUCTURE: tensor([1, 2, 3, 4])


PyTorch is compatible with NumPy: `torch.from_numpy()` builds a tensor from a NumPy array.

In [10]:
import numpy as np

numpy_structure = np.ones(2, dtype=int)
tensor = torch.from_numpy(numpy_structure)

print("Tensor from numpy:", tensor)

Tensor from numpy: tensor([1, 1])
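
One detail worth knowing is that `torch.from_numpy()` shares memory with the source array, while `torch.tensor()` makes a copy. The following sketch illustrates the difference; the variable names are illustrative.

```python
import numpy as np
import torch

array = np.ones(3)                   # float64 NumPy array

shared = torch.from_numpy(array)     # shares memory with the array
copied = torch.tensor(array)         # independent copy of the data

array[0] = 5.0                       # modify the NumPy array afterwards

print("Shared:", shared)             # reflects the change
print("Copied:", copied)             # keeps the original values

# Going back to NumPy also shares memory (CPU tensors only)
back = shared.numpy()
print("Back to numpy:", back)
```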


PyTorch uses standard NumPy-like indexing and slicing to access the data of a tensor.

In [23]:
my_tensor = torch.tensor([[3,2,4], [4,1,5]], dtype=torch.int64)

print(f"My tensor:\n {my_tensor}")
print(f'First element: {my_tensor[0,0]}')
print(f'First column: {my_tensor[:,0]}')
print(f'Last column: {my_tensor[...,-1]}')

My tensor:
 tensor([[3, 2, 4],
        [4, 1, 5]])
First element: 3
First column: tensor([3, 4])
Last column: tensor([4, 5])
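
A few more indexing patterns are sketched below on the same values. The example also shows that a basic slice is a view on the original data, so writing through it changes the source tensor; a `clone()` is used here so the original stays intact. The variable names are illustrative.

```python
import torch

my_tensor = torch.tensor([[3, 2, 4], [4, 1, 5]])

second_row = my_tensor[1, :]                 # whole second row
every_other_col = my_tensor[:, ::2]          # columns 0 and 2
greater_than_3 = my_tensor[my_tensor > 3]    # boolean mask, returns a 1-D copy

print("Second row:", second_row)
print("Every other column:\n", every_other_col)
print("Values > 3:", greater_than_3)

# Basic slices are views: writing through them modifies the source tensor
work = my_tensor.clone()
first_col = work[:, 0]
first_col[0] = 100
print("Modified through a view:", work[0, 0].item())
```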


## Main properties of a tensor

Tensors have many properties; some of them are shown below.

In [None]:
x = torch.randint(low=0, high=10, size=(2,2,2))

print("Shape:", x.shape)          # shape of tensor
print("Device:", x.device)        # device on which the tensor is stored
print("Data type:", x.dtype)      # data type of tensor
print("Layout: ", x.layout)       # how the tensor is saved in memory
print("Is leaf: ", x.is_leaf)     # if true, the tensor was created by the user, otherwise, the tensor was created by a previous computation

Shape: torch.Size([2, 2, 2])
Device: cpu
Data type: torch.int64
Layout:  torch.strided
Is leaf:  True
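
The `is_leaf` and `device` properties are easier to understand with a second example. In this sketch (illustrative names only), a tensor produced by an operation on a tensor that requires gradients is not a leaf, and a tensor is moved to a GPU only if one is available.

```python
import torch

# A tensor created by the user is a leaf
a = torch.ones(2, requires_grad=True)
# A tensor computed from a tensor that requires gradients is not
b = a * 2

print("a is leaf:", a.is_leaf)
print("b is leaf:", b.is_leaf)

# Pick the GPU when available, otherwise stay on the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
c = torch.zeros(2, 2).to(device)
print("c device:", c.device)
```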


## Math operations between tensors

We can perform many mathematical operations on tensors, either with the typical Python operators, like `+`, `-`, `*`, `/`, etc., or with the specific methods defined in the torch module *(this last practice is advised)*.

In [None]:
# Initialize two different tensors
x = torch.tensor([[1,2,3,4], [5,6,7,8]])
y = torch.ones(2,4, dtype=int)

# Output of our original tensors
print("x:", x)
print("\ny:", y)

# Some operations (the name `total` avoids shadowing the built-in `sum`)
total = torch.add(x,y)   #equivalent to: total = x+y
diff = torch.sub(x,y)    #equivalent to: diff = x-y
prod = torch.mul(x,y)    #equivalent to: prod = x*y
div = torch.div(x,y)     #equivalent to: div = x/y

# Output
print("\nx+y:", total)
print("\nx-y:", diff)
print("\nx*y:", prod)
print("\nx/y:", div)

x: tensor([[1, 2, 3, 4],
        [5, 6, 7, 8]])

y: tensor([[1, 1, 1, 1],
        [1, 1, 1, 1]])

x+y: tensor([[2, 3, 4, 5],
        [6, 7, 8, 9]])

x-y: tensor([[0, 1, 2, 3],
        [4, 5, 6, 7]])

x*y: tensor([[1, 2, 3, 4],
        [5, 6, 7, 8]])

x/y: tensor([[1., 2., 3., 4.],
        [5., 6., 7., 8.]])
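
In the output above, the division returns floating point values even though both inputs are integer tensors. The sketch below, with illustrative values, shows this behavior and the `rounding_mode` argument of `torch.div()`, which keeps the result as integers.

```python
import torch

x = torch.tensor([7, 8, 9])
y = torch.tensor([2, 2, 2])

# True division: integer inputs are promoted to floating point
true_div = torch.div(x, y)
# Floor division: the result keeps the integer dtype
floor_div = torch.div(x, y, rounding_mode="floor")

print("x/y:", true_div, true_div.dtype)
print("x//y:", floor_div, floor_div.dtype)
```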


In this first block of code we saw the four main operations. Each of them has a tensor method that computes the operation in place:

* `torch.Tensor.add_()`
* `torch.Tensor.sub_()`
* `torch.Tensor.mul_()`
* `torch.Tensor.div_()`

⚠: In `pytorch`, many methods have an "in place" variant whose name ends with `_`

Here is an example:

In [None]:
# Init two ones tensors
x = torch.ones(1,2, dtype=int)  # tensor([[1,1]])
y = torch.ones(1,2, dtype=int)  # tensor([[1,1]])

# Compute the sum in place
x.add_(y)

# Output
print("Sum in place: ", x)

Sum in place:  tensor([[2, 2]])
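
To make the meaning of "in place" more concrete, the sketch below checks that `add_()` reuses the memory of `x`, while `x + y` allocates a new tensor. It also shows one limit of in-place operations: the result must fit the data type of the target, so an in-place true division on an integer tensor raises an error. The variable names are illustrative.

```python
import torch

x = torch.ones(1, 2, dtype=torch.int64)
y = torch.ones(1, 2, dtype=torch.int64)

ptr_before = x.data_ptr()
x.add_(y)                                    # writes the result into x
same_buffer = x.data_ptr() == ptr_before
print("Same memory after add_:", same_buffer)

z = x + y                                    # allocates a new tensor
new_buffer = z.data_ptr() != x.data_ptr()
print("New memory for x + y:", new_buffer)

# True division produces floats, which cannot be stored in an integer tensor
try:
    x.div_(y)
    div_failed = False
except RuntimeError as err:
    div_failed = True
    print("In-place division failed:", err)
```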


Of course, the torch module offers many more mathematical tools; some of them are shown below.

In [None]:
x = torch.tensor([-5.15,-1.45])  # tensor with negative values

absolute_x = torch.abs(x)  # in-place version: x.abs_()
floor_x = torch.floor(x)   # in-place version: x.floor_()

print("|x|:", absolute_x)
print("floor(x):", floor_x)

|x|: tensor([5.1500, 1.4500])
floor(x): tensor([-6., -2.])
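
A few related rounding and limiting functions are sketched below on the same values. The selection is only illustrative; the documentation lists many more.

```python
import torch

x = torch.tensor([-5.15, -1.45])

ceil_x = torch.ceil(x)                           # round up towards +infinity
trunc_x = torch.trunc(x)                         # drop the fractional part
clamped_x = torch.clamp(x, min=-2.0, max=0.0)    # limit values to [-2, 0]

print("ceil(x):", ceil_x)
print("trunc(x):", trunc_x)
print("clamp(x):", clamped_x)
```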


You can see all mathematical operations at: [Math operations pytorch documentation](https://pytorch.org/docs/stable/torch.html#math-operations)

## Glossary of the used tools

### Methods

- `torch.empty()`
- `torch.zeros()`
- `torch.ones()`
- `torch.rand()`
- `torch.randint()`
- `torch.tensor()`
- `torch.Tensor.size()`
- `torch.add()`
- `torch.sub()`
- `torch.mul()`
- `torch.div()`
- `torch.Tensor.add_()`
- `torch.Tensor.sub_()`
- `torch.Tensor.mul_()`
- `torch.Tensor.div_()`
- `torch.abs()`
- `torch.floor()`
- `torch.Tensor.abs_()`
- `torch.Tensor.floor_()`
- `torch.is_tensor()`
- `torch.as_tensor()`
- `torch.Tensor.item()`
- `torch.from_numpy()`
- `torch.Tensor.stride()`
- `torch.Tensor.storage()`

### Properties

- `torch.Tensor.dtype`
- `torch.Tensor.shape`
- `torch.Tensor.device`
- `torch.Tensor.layout`
- `torch.Tensor.is_leaf`

## References

[Pytorch documentation](https://pytorch.org/docs/stable/index.html)

## Author

Emilio Garzia, 2024