# Tensors
Tensors are mathematical objects; don't forget that. In PyTorch, you will mostly see them as multi-dimensional arrays of real numbers, and in general, that's what they are most used as. Mathematically, however, they could contain any object that belongs to a vector space with its corresponding operations and field.

In PyTorch, tensors are objects (just like everything else in Python), and they are usually created from a (possibly nested) sequence of numbers, or from a single number in the case of scalar tensors. These notes use four names for them: scalars, vectors, matrices and tensors (3D and above).

## Scalars (a)

In [1]:
import torch
print("SCALAR")
scalar: torch.Tensor = torch.tensor(7)
print(scalar)
print(scalar.ndim)
print(scalar.shape)

SCALAR
tensor(7)
0
torch.Size([])


- The ``__str__`` magic method prints the definition of the tensor without the `torch` module prefix, the dtype if it is not the default for that kind of data (`float32` for floats, `int64` for integers), the device if not CPU, and the last operation on the tensor (`grad_fn`) if `requires_grad` is set to `True`.
- The ``ndim`` attribute returns the number of dimensions of the tensor, i.e., its rank.
  > You could see it as the number of sub-indices you need to point to one component. For example, the components of a vector need only one index, pointing to an axis of the vector's coordinate system, whereas the components of a matrix need two: one pointing to the column (vector) and one pointing to an axis of the coordinate system. Consequently, scalars need no sub-indices at all, because there is only one component to point to, and so they have dimension 0.
  * Number of dimensions = number of opening square brackets (`[`) at the start of the printed tensor.
- The ``shape`` attribute returns the size of the tensor in each dimension (it is equivalent to calling the `size()` method).
  > It returns the number of components in each dimension. You could see it as the number of values each sub-index could take.
  - You can infer the size of a dimension by counting the commas directly inside the corresponding pair of square brackets (excluding the ones nested deeper in their content) and adding one.
  - Note that, from left to right, the sizes go from the outermost to the innermost dimension. For example, if a tensor has size [2, 3, 4], when ignoring the first pair of brackets, there will be two elements; ignoring the second pair, there will be three elements; and ignoring the third pair, there will be four elements.
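
The relationship between `ndim`, `shape` and the brackets can be checked on a small tensor. This is a minimal sketch; the tensor `x` and its contents are made up for illustration and are not part of the cells in these notes.

```python
import torch

# A rank-3 tensor of shape [2, 3, 4], built from the numbers 0..23.
x = torch.arange(24).reshape(2, 3, 4)

print(x.ndim)    # 3 -> three indices are needed to reach a single number
print(x.shape)   # torch.Size([2, 3, 4])
print(x.size())  # same information through the size() method

# Each index "locks" one dimension, so the rank drops by one every time.
print(x[0].shape)        # torch.Size([3, 4])
print(x[0][1].shape)     # torch.Size([4])
print(x[0][1][2].shape)  # torch.Size([]) -> a scalar tensor
```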

## Vectors (y)

In [3]:
print("VECTORS")
vector: torch.Tensor = torch.tensor([1, 0, 1])
print(vector)
print(vector.ndim)
print(vector.shape)
print(vector[1])

VECTORS
tensor([1, 0, 1])
1
torch.Size([3])
tensor(0)


- You can access the components of a tensor by using the square brackets notation. For this purpose, just think of the tensor as an array.
  > Think of the indices inside the brackets as the one to access a component of a tensor. You may access another tensor or a scalar. Be mindful, since you lock or select a component of a dimension per bracket, each bracket will decrease the rank or dimension of the tensor by one.

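> WARNING: Tensors are printed as nested arrays, not as the row or column vectors of linear algebra. A 1D tensor such as `tensor([1, 0, 1])` is printed horizontally, but it has no row or column orientation: its shape is `[3]`, not `[1, 3]` or `[3, 1]`. This is because tensors in PyTorch are multi-dimensional arrays, and a matrix is just the 2D case.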

## Matrices (Q)

In [4]:
print("MATRICES")
matrix: torch.Tensor = torch.tensor([[1, 2, -3, 4], [5, -6, -7, 8], [-10, -20, 30, 40], [0, 0, 0, 0]])
print(matrix)
print(matrix.dim()) #* Equivalent to .ndim
print(matrix.size()) #* Equivalent to .shape
print(matrix[3]) #* Vector in the fourth component
print(matrix[0][0]) #* First scalar of the vector in the first component.
matrix2: torch.Tensor = torch.tensor(data=[[2, 2], [2, 3]], dtype=torch.float)
# matrix2 = matrix2.float()
print(torch.det(matrix2)) #* Determinant of the matrix

MATRICES
tensor([[  1,   2,  -3,   4],
        [  5,  -6,  -7,   8],
        [-10, -20,  30,  40],
        [  0,   0,   0,   0]])
2
torch.Size([4, 4])
tensor([0, 0, 0, 0])
tensor(1)
tensor(2.)


- `torch.det()` calculates the determinant of a matrix. It should be noted that the tensor must have floating-point data type, which can be achieved by setting the parameter `dtype` to any floating-point `dtype` argument (e.g., `torch.float32`), by using the `.float()` or `.double()` methods of `Tensor` (equivalent to `self.to(torch.float32)` and `self.to(torch.float64)`, respectively), or by adding a decimal point to at least one entry.

- Now, despite the function `torch.tensor()` returning an instance of the `torch.Tensor` class, it is different from calling the constructor of the `Tensor` class:
  - `tensor()` accepts the `device` argument which allows you to specify where the tensor will be stored (CPU or GPU), whereas the constructor does not.
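  - By default, any tensor created with `tensor()` has the `requires_grad` attribute set to `False`, which means the autograd engine does not track operations on it to compute gradients. Such a tensor is also a **leaf tensor** (`is_leaf` is `True`), i.e., it was not produced by an operation recorded by autograd. Tensors created with the constructor have `requires_grad=False` as well; the difference is that `tensor()` accepts `requires_grad` as an argument, whereas the constructor does not.
  - The `dtype` argument is also exclusively accepted by `tensor()`, which allows you to specify the data type of the tensor. If it is not specified, `tensor()` infers the data type from the input data. The constructor, by contrast, always uses the default dtype (`torch.float32` unless you changed it).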
- The key takeaway from the docs is that the `Tensor` class is a base class and initializing them with the constructor is "discouraged". Multiple ways of creating a tensor are provided [here](https://pytorch.org/docs/stable/tensors.html#tensor-class-reference).
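
The differences between the function and the constructor can be observed directly. The following is a small sketch with made-up variable names; it assumes the default dtype has not been changed from `float32`.

```python
import torch

from_function = torch.tensor([1, 2, 3])     # dtype inferred from the data
from_constructor = torch.Tensor([1, 2, 3])  # always the default dtype

print(from_function.dtype)     # torch.int64
print(from_constructor.dtype)  # torch.float32 (assuming the default dtype)
print(from_function.requires_grad, from_constructor.requires_grad)  # False False

# requires_grad can only be requested through torch.tensor().
tracked = torch.tensor([1.0, 2.0], requires_grad=True)
derived = tracked * 2  # produced by an operation that autograd recorded
print(tracked.is_leaf, derived.is_leaf)  # True False
```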

## Tensors (X)

In [5]:
print("TENSORS (3D+)")
tensor_r3: torch.Tensor = torch.rand([2, 3, 3])
print(tensor_r3)
print(tensor_r3.ndim)
tensor_r4: torch.Tensor = torch.rand(2, 2, 2, 3) #* There is no need to put the sizes in a list; other arguments of rand() (e.g. dtype) can still be passed as keywords after the sizes
print(tensor_r4) #* Rank 4 tensor containing two rank 3 tensors that contain two matrices with two shape 3 (3 axis) vectors each
print(tensor_r4.ndim)

TENSORS (3D+)
tensor([[[0.0359, 0.9667, 0.0933],
         [0.9923, 0.0734, 0.3498],
         [0.4092, 0.1741, 0.9880]],

        [[0.6884, 0.9304, 0.0443],
         [0.8843, 0.9370, 0.6180],
         [0.1633, 0.4733, 0.7147]]])
3
tensor([[[[0.7087, 0.3501, 0.6360],
          [0.3732, 0.8799, 0.5636]],

         [[0.7067, 0.4211, 0.9650],
          [0.4375, 0.7758, 0.7086]]],


        [[[0.0436, 0.5770, 0.3921],
          [0.1392, 0.2634, 0.8835]],

         [[0.8987, 0.7395, 0.0448],
          [0.2507, 0.6783, 0.6747]]]])
4


As you can see, creating higher-dimensional (higher-rank) tensors is just a matter of adding more square brackets. You can interpret a rank-n tensor as a collection of rank-(n-1) tensors, each of those as a collection of rank-(n-2) tensors, and so on, until you reach scalars.

## Generating tensors

In [6]:
test: torch.Tensor = torch.rand(3, 4)
print("Random tensor (test)")
print(test)
torch.zero_(test)
print("Using the zero_() function on the random tensor")
print(test)
print("Converting the previous tensor of zeros into one of ones", torch.ones_like(test))

zeroes: torch.Tensor = torch.zeros(2, 2)
print("New tensor of zeroes using zeros()")
print(zeroes)

ones: torch.Tensor = torch.ones(2, 2)
print("New tensor of ones using ones()")
print(ones)

in_range: torch.Tensor = torch.arange(start=1, end=9, step=2) #* Odd numbers from 1 to 8
print(in_range)
print(in_range.shape, in_range.ndim)

empty_tensor: torch.Tensor = torch.empty_like(test)
print("Supposedly an uninitialized tensor with the same shape as test")
print(empty_tensor)

full_of_eights: torch.Tensor = torch.full_like(input=test, fill_value=8, dtype=torch.int16)
print("A tensor with the same shape as test and with all of its entries = 8")
print(full_of_eights)


Random tensor (test)
tensor([[0.8960, 0.2513, 0.4274, 0.6678],
        [0.8142, 0.0179, 0.3338, 0.1075],
        [0.8935, 0.4005, 0.3862, 0.3361]])
Using the zero_() function on the random tensor
tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])
Converting the previous tensor of zeros into one of ones tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])
New tensor of zeroes using zeros()
tensor([[0., 0.],
        [0., 0.]])
New tensor of ones using ones()
tensor([[1., 1.],
        [1., 1.]])
tensor([1, 3, 5, 7])
torch.Size([4]) 1
Supposedly an uninitialized tensor with the same shape as test
tensor([[4.9487e-14, 0.0000e+00, 0.0000e+00, 0.0000e+00],
        [       nan, 0.0000e+00, 1.1578e+27, 7.1463e+22],
        [4.6241e+30, 1.0552e+24, 5.5757e-02, 1.8728e+31]])
A tensor with the same shape as test and with all of its entries = 8
tensor([[8, 8, 8, 8],
        [8, 8, 8, 8],
        [8, 8, 8, 8]], dtype=torch.int16)


- `torch.rand(Sequence[int])` is a way to create a tensor with random values from a uniform distribution in the range [0, 1) and shape `Sequence[int]`.
- `torch.zeros(Sequence[int])` is a way to create a tensor filled with zeros and shape `Sequence[int]`. This will be useful in the future when creating masks, which filter out certain values in a tensor. A similar function is `torch.ones(Sequence[int])`, which creates a tensor filled with ones (not that used but may fulfill a similar purpose to the `zeros()` function).
  - `torch.zero_(Tensor)` modifies the input tensor in-place to fill it with zeros (it also returns it).
  - `torch.ones_like(Tensor)` creates a tensor filled with ones with the same shape as the input tensor.
- `torch.arange(start, end, step)` creates a vector with values from `start` to `end` (exclusive) with a step of `step`. The arguments are integers in the cell above, but floats are accepted as well.
- `torch.empty_like(Tensor)` creates a tensor with the same shape as the input tensor but with uninitialized values (may look weird if printed).
- `torch.full_like(Tensor, float)` creates a tensor with the same shape as the input tensor but filled with the value `float`.
  - `torch.full(Sequence[int], float)` creates a tensor with the specified shape and filled with the value `float` without the need for another tensor.
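
Two of the functions listed above do not appear in the cell, so here is a short sketch of them. The variable names are invented for the example.

```python
import torch

# torch.full needs no reference tensor, only a shape and a fill value.
eights = torch.full((2, 3), 8.0)
print(eights.shape, eights.dtype)  # torch.Size([2, 3]) torch.float32

# arange also works with float arguments; `end` is still excluded.
quarters = torch.arange(0, 1, 0.25)
print(quarters)  # tensor([0.0000, 0.2500, 0.5000, 0.7500])

# Sizes can be given as separate integers and still be followed by keywords.
counts = torch.zeros(2, 3, dtype=torch.int64)
print(counts.dtype)  # torch.int64
```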

## Tensor data types (dtypes)

In [7]:
test: torch.Tensor = torch.tensor([[2, 2], [1, 1]],
                                dtype=torch.half,
                                device="cuda",
                                requires_grad=True)
print(test)
test_2: torch.Tensor = torch.tensor([[1, 1], [1, 1]],
                                    dtype=torch.float64,
                                    device="cuda",
                                    requires_grad=False)
result: torch.Tensor = test * test_2
print(result)
print(result.device, result.shape, result.dtype, result.ndim)
torch.device

tensor([[2., 2.],
        [1., 1.]], device='cuda:0', dtype=torch.float16, requires_grad=True)
tensor([[2., 2.],
        [1., 1.]], device='cuda:0', dtype=torch.float64,
       grad_fn=<MulBackward0>)
cuda:0 torch.Size([2, 2]) torch.float64 2


torch.device

- You can get the device a tensor is stored on through the `device` attribute. More importantly, you can set the default device for all new tensors with `torch.set_default_device(device)`. To use a device only temporarily, you can put your code inside a `with torch.device(device):` block. Neither of them affects the calls in which the device is specified explicitly.
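
The context-manager form can be tried without a GPU. The sketch below is an illustration under two assumptions: PyTorch 2.0 or later (where `torch.device` works as a context manager), and the use of the `meta` device as a stand-in for `cuda`, since it is available on every machine (its tensors carry shape and dtype but no data). It also assumes the default device is still the CPU.

```python
import torch

with torch.device("meta"):
    inside = torch.zeros(2, 2)                  # picks up the device of the block
    explicit = torch.zeros(2, 2, device="cpu")  # an explicit argument still wins

after = torch.zeros(2, 2)  # outside the block: back to the default device

print(inside.device, explicit.device, after.device)  # meta cpu cpu
```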

## Potential errors when operating tensors

1. `RuntimeError` caused by incompatible sizes of the tensors in at least one dimension (**shape**). For element-wise operations, the sizes in each dimension must either be equal or one of them must be 1 (in which case it is broadcast).
2. `RuntimeError` expecting two or more tensors to be on the same device (*cuda* or *cpu*) but found at least a pair in different ones. This is because the tensors must be on the same device to be operated on together.
3. `RuntimeError` regarding different data types of the tensors being operated. It only happens sometimes: element-wise operations usually promote the operands to a common dtype instead of failing, whereas some operations (matrix multiplication, for example) require both operands to have the same dtype.
   
Note: When operating two tensors with different data types element-wise, the resulting dtype follows PyTorch's type promotion rules, which roughly means the data type of the tensor with the highest precision. In the same way, if a tensor requires a gradient, the resulting tensor will also require it even if the other one(s) don't.
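
The shape and dtype cases can be reproduced on the CPU; the device case needs a GPU, so it is left out. This is only a sketch with made-up tensors. The matrix multiplication check reflects the behaviour I'm aware of in current PyTorch releases and is an assumption about your version.

```python
import torch

a = torch.ones(2, 3)
b = torch.ones(3, 2)
try:
    a + b  # sizes 3 vs 2 and 2 vs 3: not broadcastable
    shapes_ok = True
except RuntimeError as err:
    shapes_ok = False
    print("Shape error:", err)

row = torch.ones(1, 3)
print((a + row).shape)  # torch.Size([2, 3]): a size of 1 is broadcast

ints = torch.ones(2, dtype=torch.int64)
floats = torch.ones(2, dtype=torch.float32)
doubles = torch.ones(2, dtype=torch.float64)
print((ints + floats).dtype)     # torch.float32
print((floats * doubles).dtype)  # torch.float64

try:
    torch.ones(2, 2) @ torch.ones(2, 2, dtype=torch.float64)
    matmul_mixed_ok = True
except RuntimeError as err:
    matmul_mixed_ok = False
    print("Dtype error:", err)

weights = torch.ones(2, requires_grad=True)
print((weights * doubles).requires_grad)  # True
```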

## Scalar operations with tensors

Think of these as the usual operations between a tensor and an element of a field, but modified so that they are well-defined. For example, you can add a scalar to a tensor (or a tensor to a scalar), and it works as if a tensor full of that scalar, with the same shape as the other tensor, had been created first (this could be done explicitly using `torch.full_like(input_tensor, scalar)`). The same goes for the other operations:

In [8]:
tensor: torch.Tensor = torch.randn(size=[2, 2])
print(tensor)
addition_tensor: torch.Tensor = torch.add(tensor, 4) #* Same as tensor + 4
print(addition_tensor)
subtraction_tensor: torch.Tensor = tensor - 3
print(subtraction_tensor)
multip_tensor: torch.Tensor = (1/2) * tensor
print(multip_tensor)

tensor([[ 0.6692, -0.1866],
        [ 0.4688,  0.2123]])
tensor([[4.6692, 3.8134],
        [4.4688, 4.2123]])
tensor([[-2.3308, -3.1866],
        [-2.5312, -2.7877]])
tensor([[ 0.3346, -0.0933],
        [ 0.2344,  0.1061]])


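Tensors of the same shape, together with element-wise addition and multiplication by a scalar, form a vector space, so the usual properties of those operations hold. They are not a field under element-wise multiplication, though: any tensor with a zero entry has no multiplicative inverse.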

> Note: `randn()` generates the numbers from a normal distribution with mean 0 and variance 1, whilst `rand()` generates them from a uniform distribution in the range [0, 1).
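
The idea that a scalar behaves like a tensor full of that scalar can be verified with `torch.full_like`. A minimal sketch, with names chosen for the example:

```python
import torch

t = torch.randn(2, 2)

# Adding a scalar gives the same result as adding a tensor full of it.
explicit = t + torch.full_like(t, 4)
same_as_full = torch.equal(t + 4, explicit)

# The order of the operands does not matter for addition.
commutes = torch.equal(4 + t, t + 4)

# Multiplying by 1/2 matches dividing by 2.
halves_match = torch.allclose(0.5 * t, t / 2)

print(same_as_full, commutes, halves_match)  # True True True
```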

## Matrix multiplication

So, as you may already know, matrix multiplication is the basis of neural networks, and doing it efficiently is what makes the technology viable. The PyTorch devs are aware of this, so they implemented the `torch.matmul()` function:

In [9]:
# %%time
tensor_1: torch.Tensor = torch.tensor([[1, 1], [2, 2], [3, 3]])
tensor_2: torch.Tensor = torch.tensor([[2, 2, 2], [3, 3, 3]])
result_with_matmul: torch.Tensor = torch.matmul(tensor_1, tensor_2)
result_with_infix: torch.Tensor = tensor_1 @ tensor_2
print(result_with_infix)
print(result_with_matmul)

tensor([[ 5,  5,  5],
        [10, 10, 10],
        [15, 15, 15]])
tensor([[ 5,  5,  5],
        [10, 10, 10],
        [15, 15, 15]])


`torch.matmul(Tensor, Tensor)` is used to multiply two tensors under some constraints; most notably, in 2D the matrices must be of size `n x m` and `m x p`, respectively. It also supports broadcasting, which allows you to multiply tensors with different numbers of dimensions as long as they are compatible (`torch.mm()` does not, since it only accepts 2D tensors).
- It does not change the big-O complexity of the multiplication, but it is much faster in practice than the usual Python for-loop implementation, because the work is done by optimized, vectorized routines.
- `tensor_1.__matmul__(tensor_2)` is what `tensor_1 @ tensor_2` (the matrix multiplication operator) calls, and it gives the same result as `torch.matmul(tensor_1, tensor_2)`.
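
The difference between `torch.matmul()` and `torch.mm()` shows up as soon as one operand has more than two dimensions. A small sketch with invented tensors (`batch`, `weights`):

```python
import torch

batch = torch.ones(2, 3, 4)  # two 3x4 matrices
weights = torch.ones(4, 5)   # one 4x5 matrix

# matmul broadcasts `weights` over the leading dimension of `batch`.
out = torch.matmul(batch, weights)
print(out.shape)  # torch.Size([2, 3, 5])

# mm only accepts 2D tensors, so the same call fails.
try:
    torch.mm(batch, weights)
    mm_accepts_3d = True
except RuntimeError:
    mm_accepts_3d = False
print(mm_accepts_3d)  # False

# The @ operator and __matmul__ give the same result as torch.matmul.
same = torch.equal(batch @ weights, batch.__matmul__(weights))
print(same)  # True
```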

### Transpose

One way to fix a multiplication between matrices with incompatible shapes is to use the transpose of one of them. Now, this is generally a concept that makes sense in 2D, where it just flips a matrix over its diagonal. It can be done with the `torch.t()` function, the `Tensor.t()` method, or the `Tensor.T` attribute.

In addition, something similar can be done on a higher-dimensional tensor. To do so, you can use the `Tensor.permute(*dims)` method to indicate how you want to reorder the dimensions. Currently (2024), `Tensor.T` on such a tensor reverses the order of its dimensions, but that use is deprecated.

In [10]:
tensor: torch.Tensor = torch.rand(3, 2, 1)
print(tensor)
print(tensor.permute(1, 0, 2)) #* Swaps the first two dimensions: shape [3, 2, 1] becomes [2, 3, 1]
tensor_A: torch.Tensor = torch.tensor([[3, 3],
                                       [2, 2],
                                       [1, 1]])
tensor_B: torch.Tensor = torch.tensor([[1, 2],
                                       [3, 4],
                                       [5, 6]])
mul_1 = tensor_A @ tensor_B.T #* O: 3x3 matrix
mul_2 = tensor_A.T @ tensor_B #* O: 2x2 matrix
print(mul_1, mul_1.shape)
print(mul_2, mul_2.shape)

tensor([[[0.0058],
         [0.5430]],

        [[0.6794],
         [0.1083]],

        [[0.7741],
         [0.0140]]])
tensor([[[0.0058],
         [0.6794],
         [0.7741]],

        [[0.5430],
         [0.1083],
         [0.0140]]])
tensor([[ 9, 21, 33],
        [ 6, 14, 22],
        [ 3,  7, 11]]) torch.Size([3, 3])
tensor([[14, 20],
        [14, 20]]) torch.Size([2, 2])


In [11]:
def matrix_power(matrix: torch.Tensor, n: int) -> torch.Tensor:
    """Raise a square matrix to the n-th power (n >= 1), rounded to 3 decimals."""
    result = matrix
    for _ in range(n - 1):
        result = torch.matmul(matrix, result)
    #* Round to three decimal places to hide floating-point noise
    result = torch.round(result * 1000) / 1000
    return result

matrix = torch.tensor([[0.2, 0.8],
                       [0.5, 0.5]], dtype=torch.double)
print(matrix_power(matrix, 3))

tensor([[0.3680, 0.6320],
        [0.3950, 0.6050]], dtype=torch.float64)
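
The hand-written loop above can be cross-checked against PyTorch's built-in `torch.linalg.matrix_power`. This is just a sanity-check sketch; the name `P` is chosen for the example.

```python
import torch

P = torch.tensor([[0.2, 0.8],
                  [0.5, 0.5]], dtype=torch.double)

by_hand = P @ P @ P
built_in = torch.linalg.matrix_power(P, 3)

print(built_in)
print(torch.allclose(by_hand, built_in))  # True

# Each row of P sums to 1, and so does each row of its powers.
print(built_in.sum(dim=1))
```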


## Tensor aggregation

Aggregation is whenever you convert a tensor to a single scalar that summarizes the values the tensor stores. Some of the most common aggregations are the maximum/minimum value and their positions, the mean (or expected value), and the sum of all the values in the tensor.

>Note: The tensor needs to be a float or complex dtype to compute its mean. Otherwise, an error message will pop up.

>Note: The positional minimum and maximum (`argmin`/`argmax`) return a single scalar: the position in the flattened tensor, i.e., counting from left to right and from top to bottom in the string representation of the tensor.
> If you want the position along a particular dimension instead, use the `dim` parameter to choose the dimension to reduce, and optionally set `keepdim` to `True` so that the result keeps the same number of dimensions.

In [12]:
tensor: torch.Tensor = torch.randint(low=1, high=99, size=[3, 5])
print(tensor, tensor.type())
#* Max and min
minimum: torch.Tensor = tensor.min() #* Aggregations return 0-dim tensors; use .item() to get a Python number
maximum: torch.Tensor = torch.max(tensor) #* Aggregations are available both as methods and as functions of the torch module
#* Positional max and min
print(f"Minimum at position {tensor.argmin()} is {minimum}")
print(f"Maximum at {torch.argmax(tensor)} is {maximum}")
print(tensor.argmin(dim=1, keepdim=True))
#* Mean and sum
mean: torch.Tensor = tensor.type(torch.float32).mean() #!Must be complex or float, so it needs to be cast in this case
total: torch.Tensor = torch.sum(tensor) #* Named total to avoid shadowing the built-in sum()
print(f"Mean: {mean}, sum: {total}")

tensor([[75, 34, 74, 80, 63],
        [ 5, 75, 68, 18, 25],
        [13, 93, 50, 15, 25]]) torch.LongTensor
Minimum at position 5 is 5
Maximum at 11 is 93
tensor([[1],
        [0],
        [0]])
Mean: 47.53333282470703, sum: 713
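
The flattened position reported by `argmin()` can be turned back into a row and a column with integer division. The sketch below hard-codes the tensor from the output above so that it is reproducible; `divmod` is plain Python, not a PyTorch function.

```python
import torch

t = torch.tensor([[75, 34, 74, 80, 63],
                  [ 5, 75, 68, 18, 25],
                  [13, 93, 50, 15, 25]])

flat_min = t.argmin().item()             # position in the flattened tensor
row, col = divmod(flat_min, t.shape[1])  # divide by the number of columns
print(flat_min, row, col)                # 5 1 0
print(t[row, col].item())                # 5

# With dim=1, min() returns the minimum of each row and its column index.
values, cols = t.min(dim=1)
print(values.tolist())  # [34, 5, 13]
print(cols.tolist())    # [1, 0, 0]
```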


## Reshaping, viewing, (un)squeezing, and permuting tensors

- Reshape: Returns a tensor with the same elements in a new shape that's compatible with the previous one, i.e., one in which you can fit exactly the elements you had in the initial tensor. It does so by traversing the tensor from left to right and top to bottom to fill a new shape `[n1, n2, ..., nj]`, where `n1 * n2 * ... * nj = size1 * size2 * ... * sizen` (the new size is compatible with the current size).
  
  It returns a view of the original tensor when the new shape is compatible with its memory layout, and a copy otherwise.
- View: Creates a new tensor that shares its data with the base one, so whenever you change the data of either, the other one will change too. Unlike reshape, it never copies: it raises an error if the requested shape is not compatible with the memory layout.
- Stack: Stacks a sequence of tensors on a new dimension. The default dimension, `dim`, is 0, and the results you get when you change it are not as intuitive as you may think, so it really depends on particular use cases.
  - Hstack: Stacks "horizontally" a sequence of tensors. It is equivalent to `torch.cat(dim=0)` for 1D tensors and to `torch.cat(dim=1)` for all other tensors. For matrices, this means placing them side by side in the order of the sequence and joining each row of all of them into one longer row.
  - Vstack: Stacks "vertically" a sequence of tensors. It is equivalent to `torch.cat(dim=0)`, after reshaping any 1D tensor of size `n` to shape `[1, n]`. For matrices, this means placing them one below the other instead of side by side.
- Squeeze: Removes all the dimensions of size 1 (or only the specified ones). Visually, it removes any pair of square brackets that wraps a single element, since those brackets carry no information.
- Unsqueeze: Transforms the tensor so that a new dimension of size 1 is added at position `dim`. Here `dim` must lie in the range `[-(ndim + 1), ndim]`, so it cannot exceed the number of dimensions of the input tensor.
- Permute: Returns a view of the tensor with the dimensions permuted in the specified order but holding the same information. This means that if you take the element `x[0, 1, 2]` and permute the dimensions as `(1, 0, 2)` (so dimension 1 goes to position 0, 0 goes to 1, and 2 remains the same), the same element is accessed in the permuted tensor as `y[1, 0, 2]`.

> `Tensor.contiguous()` returns the same tensor if it's already contiguous or a deep copy of it that's contiguous.
> Being contiguous means it is stored in contiguous memory positions, just like an array. This may be an advantage when doing computations between tensors. You can check if the tensor is contiguous using `Tensor.is_contiguous()`, which is relevant since some operations on tensors return non-contiguous ones, like `.T` (transpose).
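
The shapes produced by these operations are easier to trust after seeing them. The following sketch uses small made-up tensors and only prints shapes, plus one check that a view shares data with its base tensor.

```python
import torch

x = torch.arange(6).reshape(2, 3)

# A view shares data with its base tensor: writing to one changes the other.
v = x.view(6)
v[0] = 100
print(x[0, 0].item())  # 100

a = torch.zeros(3, 2, 3)
b = torch.ones(3, 2, 3)
print(torch.stack([a, b]).shape)   # torch.Size([2, 3, 2, 3]) -> new dimension
print(torch.hstack([a, b]).shape)  # torch.Size([3, 4, 3])    -> joined on dim 1
print(torch.vstack([a, b]).shape)  # torch.Size([6, 2, 3])    -> joined on dim 0

z = torch.zeros(3, 1, 3)
squeezed = z.squeeze()
print(squeezed.shape)               # torch.Size([3, 3])
print(squeezed.unsqueeze(2).shape)  # torch.Size([3, 3, 1])

# Transposing returns a non-contiguous view; contiguous() makes a packed copy.
print(x.T.is_contiguous(), x.T.contiguous().is_contiguous())  # False True
```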

In [13]:
# print(tensor)
# print("Reshape", tensor.reshape([5, 3]))
tensor_1 = torch.randint(low=1, high=30, size=[3, 2, 3])
tensor_1[0, 1, 2] = 8923842
tensor_2 = torch.randint(low=1, high=30, size=[3, 2, 3])
print(tensor_1)
# print(tensor_2)
# print("Stack:", torch.stack([tensor_1, tensor_2], dim=1))
# print("Horizontal stack:", torch.hstack([tensor_1, tensor_2]))
# print("Vertical stack:", torch.vstack([tensor_1, tensor_2]))
zeroes = torch.zeros([3, 1, 3])
# print(zeroes)
squeezed: torch.Tensor = zeroes.squeeze()
# print(squeezed.unsqueeze(dim=2))
permuted = tensor_1.permute(1, 2, 0)
print(permuted)

tensor([[[     19,       6,       5],
         [     23,      19, 8923842]],

        [[     12,      23,      19],
         [     10,       6,      20]],

        [[     23,      13,       2],
         [      4,       3,      22]]])
tensor([[[     19,      12,      23],
         [      6,      23,      13],
         [      5,      19,       2]],

        [[     23,      10,       4],
         [     19,       6,       3],
         [8923842,      20,      22]]])


## Indexing

There are two ways to do it (the names are unofficial but do match the context):

### Chained

You access the element at position `i` of the last dimension by placing `i` inside its own pair of square brackets, after doing the same for each of the preceding dimensions: `tensor[a][b]...[i]`

This syntax is widely used when working with arrays of arrays (nested Python lists), and it also works with NumPy arrays.

For example, in a 3D tensor, `tensor[0][1][1]` selects element 0 of the first dimension (a matrix), then element 1 of that matrix (a vector), and finally element 1 of that vector.

### Tuple

You access the same element by placing the indices of all the dimensions inside a single pair of square brackets, separated by commas: `tensor[a, b, ..., i]`

For instance, the same element from the chained indexing example can be accessed with the syntax `tensor[0, 1, 1]`.

This one is, on average, **marginally faster** because it doesn't need to break the tensor into sub-tensors as chained indexing does.


### The colon (`:`)

It's used when you want to sort of 'carry' an entire dimension and access all the elements of it and not just one (see example below). **It only has this effect with tuple indexing**: in chained indexing, `[:]` returns the whole tensor unchanged, so the next pair of brackets still indexes the same dimension.

> When used at the last index, it is equivalent to omitting it (trailing dimensions are taken whole implicitly).

In [14]:
tensor = torch.arange(1, 13).reshape(1, 3, 4)
print(tensor)

#*Accessing the top-left element
print(tensor[0][0][0], tensor[0, 0, 0])

#*The colon syntax only has this effect with tuple indexing
print(tensor[0][:][0], tensor[0, :, 0]) #!Different

#*The colon is redundant when used at the end
print(tensor[0, 0], tensor[0, 0, :])

#*Accessing the last "column" of elements
print(tensor[:, :, 3])

#*When the first dimension has only one element (a matrix in this case), using the colon just adds squared brackets to the element you're trying to access
print(tensor[:, 2, 3])

tensor([[[ 1,  2,  3,  4],
         [ 5,  6,  7,  8],
         [ 9, 10, 11, 12]]])
tensor(1) tensor(1)
tensor([1, 2, 3, 4]) tensor([1, 5, 9])
tensor([1, 2, 3, 4]) tensor([1, 2, 3, 4])
tensor([[ 4,  8, 12]])
tensor([12])


## From and to NumPy

NumPy is one of the most widely used libraries for numerical computing in Python. Therefore, it makes a lot of sense that PyTorch and NumPy join forces.

This is mainly done by switching the data from one library to another using two PyTorch methods:

#### `torch.from_numpy(ndarray)`

This function returns a Tensor object containing the same data and shape as the ndarray input. The two share the same memory, so the Tensor is *not* a deep copy of the ndarray: modifying one modifies the other.

> Something to bear in mind is that the Tensor keeps the dtype of the ndarray (unless converted). For floating-point data that is usually `float64`, the default data type in NumPy, rather than PyTorch's default `float32`.

#### `torch.Tensor.numpy()`

This method returns an ndarray containing the same data and shape as the tensor it is called on. In the same way as the other function, the two share memory (when the tensor is on the CPU and no conversion is needed).

It has some requirements regarding the data type (it must be compatible with NumPy), gradient tracking (the tensor must not require grad), its conjugate and negative bits, and the location of the tensor (CPU). If some of those requirements are not met, you may pass `force=True`, which is equivalent to calling `t.detach().cpu().resolve_conj().resolve_neg().numpy()`; in that case the result may be a copy.

> Just like from NumPy to Torch, the ndarray keeps the dtype of the tensor, which is `float32` for a tensor created with PyTorch's default dtype.
>
> IMPORTANT: Unless `force=True` is used, the tensor must be stored on the CPU. Otherwise, the conversion to ndarray cannot be made.

In [15]:
import numpy as np

array_of_ones: np.ndarray = np.ones((2, 3))
print(array_of_ones, array_of_ones.dtype)
ones_tensor: torch.Tensor = torch.from_numpy(array_of_ones).type(torch.float32) #*from_numpy keeps float64 here; .type(torch.some_dtype) converts the output tensor to the dtype you need (the conversion makes a copy)
print(ones_tensor, ones_tensor.dtype)

print("---")
tensor = torch.arange(1., 10.).reshape(1, 3, 3)
array_from_tensor: np.ndarray = tensor.numpy()
print(tensor, tensor.dtype)
print(array_from_tensor, array_from_tensor.dtype)


[[1. 1. 1.]
 [1. 1. 1.]] float64
tensor([[1., 1., 1.],
        [1., 1., 1.]]) torch.float32
---
tensor([[[1., 2., 3.],
         [4., 5., 6.],
         [7., 8., 9.]]]) torch.float32
[[[1. 2. 3.]
  [4. 5. 6.]
  [7. 8. 9.]]] float32
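
One way to check how the two conversions treat memory is to modify one side and look at the other. A minimal sketch with made-up arrays; `force=True` assumes a reasonably recent PyTorch (it was added in 1.13, as far as I know).

```python
import numpy as np
import torch

# ndarray -> Tensor: same memory, same dtype (float64 here).
arr = np.zeros(3)
t = torch.from_numpy(arr)
arr[0] = 7.0
print(t.dtype, t[0].item())  # torch.float64 7.0

# Tensor -> ndarray: same memory as well.
t2 = torch.ones(3)
arr2 = t2.numpy()
t2.add_(1)  # in-place change on the tensor
print(arr2)  # [2. 2. 2.]

# A tensor that requires grad cannot be converted directly.
w = torch.ones(2, requires_grad=True)
try:
    w.numpy()
    direct_ok = True
except RuntimeError:
    direct_ok = False
forced = w.numpy(force=True)  # detaches (and moves to the CPU) first
print(direct_ok, forced)
```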


## Reproducibility

It refers to a random tensor's ability to be reproduced at any time and machine, i.e., to be duplicated by anyone using any computer while still being pseudo-random.

This is done by setting a random seed, which is a vector or number used to initialize a pseudo-random number generator (PRNG). A generator initialized with a particular seed always outputs the same sequence of numbers. When you don't set a seed, the generator is initialized with a non-deterministic seed when the program starts, which is why you get different numbers almost every time you run code that uses `torch.rand` or any other PRNG.

In PyTorch (and probably in general), the seed must be an integer. If a floating-point number is passed, it is converted to an integer (as far as I can tell, it is truncated with `int()` rather than rounded), so it is safer to pass an integer directly.

> In PyTorch, you use `torch.manual_seed(seed)` to make the random tensors that follow reproducible. Every random call advances the generator, so the seed fixes the whole sequence and not each individual tensor. In a script, it is enough to call it once at the start of the file; in Jupyter notebooks, where cells can be re-run in any order, call it again right before a random method whenever you need that particular result to repeat.
>
> If you want to set a seed only for the current GPU, you can use `torch.cuda.manual_seed(seed)`.

In [16]:
RANDOM_SEED: int = 42
torch.manual_seed(RANDOM_SEED)
random_tensor1: torch.Tensor = torch.rand(2, 3)
torch.manual_seed(RANDOM_SEED)
random_tensor2: torch.Tensor = torch.rand(2, 3)
print(random_tensor1)
print(random_tensor2)

tensor([[0.8823, 0.9150, 0.3829],
        [0.9593, 0.3904, 0.6009]])
tensor([[0.8823, 0.9150, 0.3829],
        [0.9593, 0.3904, 0.6009]])
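
The seed fixes a whole sequence of numbers, which the following sketch tries to make visible. It also shows a local `torch.Generator`, an alternative that leaves the global generator untouched; the variable names are made up for the example.

```python
import torch

torch.manual_seed(42)
first = torch.rand(2)
second = torch.rand(2)  # the generator has advanced, so this one differs

torch.manual_seed(42)
replay = torch.rand(2)  # same seed, same first draw

print(torch.equal(first, replay))  # True
print(torch.equal(first, second))  # False

# A local generator keeps its own state, independent of the global one.
gen_a = torch.Generator().manual_seed(7)
gen_b = torch.Generator().manual_seed(7)
local_a = torch.rand(2, generator=gen_a)
local_b = torch.rand(2, generator=gen_b)
print(torch.equal(local_a, local_b))  # True
```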


## Device-agnostic code

As you may guess by its name, device-agnostic code allows you to run code no matter the device you are working with (CPU or GPU). It is a way to condition the way your code runs without affecting the code itself, but only where tensors are stored.

There are two types of devices used in these notes: `cpu` and `cuda`, the latter standing for NVIDIA GPUs compatible with CUDA (PyTorch supports other backends as well, such as `mps` on Apple hardware).

You can initialize a tensor with the `device` parameter and it will be stored on the device passed as the argument. Also, if you want all the computations of a file to happen on one of the devices, you can use `torch.set_default_device(device)` and pass the desired device as the argument. It is always better to state the device explicitly, whether it is the default or not, since you may want to change it in the future.

The shortest and most readable convention for device-agnostic code (there is another one in the docs, but it involves more steps and is visually more confusing) is to set a `device` variable at the start of the code with a conditional that depends on whether a GPU is available or not. Consequently, the code will run on a GPU every time one is available, and on the CPU otherwise.

> If a CUDA-compatible device is available, printing a tensor's device shows `cuda:0`, where `0` is the index of the device (it may be another integer if multiple GPUs are being used).

#### Switching between the two

If you want a **deep copy** of a tensor stored on the other device (the one the tensor is not currently on), you can use:
- `Tensor.cpu()` to go from CUDA to CPU.
- `Tensor.cuda()` to go from CPU to CUDA.

> Bear in mind both methods will return `self` if the tensor is already in the device you are converting to.

In [17]:
import numpy as np

device = "cuda" if torch.cuda.is_available() else "cpu"
print("Available device:", device)

#*Setting a device for a particular tensor
tensor = torch.rand(size=[2, 2], device=device)
print(tensor)

#*Setting a device for all tensors from here to the end of the file
torch.set_default_device(device)
tensor_1 = torch.rand(size=[2, 2])
print(tensor_1)

#!To convert to numpy arrays, the tensor must be stored in the CPU
ndarray: np.ndarray = tensor.cpu().numpy()
print(ndarray)

Available device: cuda
tensor([[0.6130, 0.0101],
        [0.3984, 0.0403]], device='cuda:0')
tensor([[0.9877, 0.1289],
        [0.5621, 0.5221]], device='cuda:0')
[[0.61295986 0.01005884]
 [0.3984137  0.0403084 ]]
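
Besides `.cpu()` and `.cuda()`, the general `Tensor.to(device)` method fits the device-agnostic pattern, because the same line works whichever device was selected. A minimal sketch that runs on machines with or without a GPU:

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

t = torch.rand(2, 2, device="cpu")  # explicitly created on the CPU
moved = t.to(device)                # copy on the GPU if there is one
print(moved.device.type == device.type)  # True

back = moved.cpu()  # always safe before calling .numpy()
print(back.device)  # cpu

# .cpu() returns the very same object when the tensor is already there.
print(t.cpu() is t)  # True
```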
