# **Tensors**
**Tensors** are used to represent data in neural networks. Real-world information is encoded into tensors so that the computer and the neural network can work on it.

The main advantage of tensors is that they can use the hardware acceleration provided by GPUs and TPUs, which perform large sets of calculations efficiently through parallel processing.

In [None]:
import torch
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Include these lines at the beginning of a notebook so that each cell shows the output of every expression and not just the last one
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

# Setting up device-agnostic code: use the GPU when one is available, otherwise the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"  # is_available must be called; the bare function object is always truthy
torch.set_default_device(device)


After importing the necessary libraries, let's check the version of PyTorch installed on the system.

In [None]:
torch.__version__

## Creating Tensors

All data stored and used in PyTorch is held in tensors. PyTorch tensors are created using **torch.tensor()**

Tensors can be classified in several ways; one classification is by the rank (number of dimensions) of the tensor:

*   Rank 0 Tensors (No basis vectors utilized -- **Scalars**)
*   Rank 1 Tensors (One basis vector for each direction -- **Vectors**)
*   Rank 2 Tensors (Two basis vectors for each direction -- **Matrices**)
*   Rank 3 Tensors (Three basis vectors for each direction)

### Scalars

In [None]:
# Creating scalars.
# Scalars are tensors of rank 0
SCALAR = torch.tensor(7)  # Returns a PyTorch tensor with no autograd history (see the autograd mechanics notes)
torch.is_tensor(SCALAR)  # Returns True if the passed object is a PyTorch tensor
SCALAR.ndim  # Returns the number of dimensions of the tensor (0 for a scalar)
SCALAR.shape  # Returns the size of each dimension; torch.Size([]) for a scalar
SCALAR.item()  # Returns the value of a single-element tensor as a regular Python number

### Vectors

In [None]:
# Vectors are created similar to scalars
VECTOR = torch.tensor([7,7])
VECTOR.ndim
VECTOR.shape

### Matrices

In [None]:
# Matrices are created as
MATRIX = torch.tensor([[1,2,3], [3,4,5]])
MATRIX.ndim
MATRIX.shape
MATRIX[0]

### Tensors

In [None]:
TENSOR = torch.tensor([[[1,2,3],[4,5,6],[7,8,9]],[[10,11,12],[13,14,15],[16,17,18]],[[19,20,21],[22,23,24],[25,26,27]],[[28,29,30],[31,32,33],[34,35,36]]])
TENSOR.ndim
TENSOR.shape
TENSOR[1][2][1]

### Random Tensors

Random tensors are useful because neural networks usually start with randomly initialized values and then tune them to better fit the problem's solution.
Manually initializing tensors that may contain thousands of values is impractical.

In [None]:
# Creating a random tensor of shape (5,3,4)
random = torch.rand(5,3,4)
random[3][1][3].item()
random.ndim
random.shape
# Creating a tensor of shape similar to an image tensor
random_image = torch.rand(size = (244,244,3)) # Height, width and color channels
print(random_image.shape, random_image.ndim)

### Zeros and Ones

`torch.zeros()` creates a tensor of the required shape made up entirely of zeros, and `torch.ones()` does the same with ones. Tensors containing only zeros and ones are often used as masks to isolate a region of interest in an image.
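
As a minimal sketch of the masking idea, the example below uses a small hypothetical 4x4 "image" (the values and the chosen region are illustrative assumptions, not data from this notebook) and keeps only a 2x2 region of interest.

```python
import torch

# Hypothetical 4x4 single-channel "image" holding the values 1..16
image = torch.arange(1.0, 17.0).reshape(4, 4)

# Start from an all-zero mask and switch on a 2x2 region of interest
mask = torch.zeros(size=(4, 4))
mask[1:3, 1:3] = 1.0

# Element-wise multiplication keeps the region of interest and zeros out the rest
masked = image * mask
print(masked)
```

Only the entries where the mask is one survive; everything else becomes zero.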

In [None]:
ZEROS = torch.zeros(size = (5,10,10))
ONES = torch.ones(size = (5,3,4))
# print(ZEROS, ONES)

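# Note that T1 * T2, where T1 and T2 are tensors, performs element-wise multiplication, so the shapes must match or be broadcastable.
# print(random * ZEROS)  # would raise a RuntimeError: shapes (5,3,4) and (5,10,10) are incompatible

try:
  ONES*random  # succeeds here because both tensors have shape (5,3,4)
except RuntimeError:
  print("Mismatched Dimension")  # printed only when the shapes are incompatible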

In [None]:
ZEROS.dtype

### Range of tensors & tensor-like

.arange() returns a rank 1 tensor with elements ranging from start (inclusive) to end (exclusive), spaced by step (1 by default)

In [None]:
range_tensor = torch.arange(start=0, end=78) # .arange() is left inclusive and right exclusive; the name avoids shadowing Python's built-in range
range_tensor
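
The `step` argument can be sketched as follows. The variable names are illustrative only.

```python
import torch

# Every second integer from 0 (inclusive) up to 10 (exclusive)
evens = torch.arange(start=0, end=10, step=2)
print(evens)       # tensor([0, 2, 4, 6, 8])
print(evens.ndim)  # 1

# A non-integer step produces a floating-point tensor
quarters = torch.arange(0, 1, 0.25)
print(quarters)
```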

.zeros_like() returns a tensor with the same shape as the input but with every element zero. .ones_like() works the same way with ones, and .rand_like() with random values.

In [None]:
tens_rand_like = torch.rand_like(input=random)  # same shape as `random`, filled with new random values
tens_rand_like
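
For comparison, here is a small sketch of all three `*_like` functions applied to the same template tensor. The `template` name and its shape are assumptions made for the example.

```python
import torch

template = torch.rand(2, 3)

zeros = torch.zeros_like(template)  # same shape and dtype, all zeros
ones = torch.ones_like(template)    # same shape and dtype, all ones
noise = torch.rand_like(template)   # same shape and dtype, values in [0, 1)

print(zeros.shape, ones.shape, noise.shape)
```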

## Tensor datatypes

Floating-point tensors in PyTorch use the torch.float32 dtype by default unless another dtype is requested, i.e. .rand(), .zeros(), .ones(), and .rand_like() (given a float32 input) return a tensor of dtype torch.float32. The most common sources of error when coding with PyTorch are:
*  wrong selection of datatypes 
*  wrong tensor dimensions 
*  tensors not on the right device

While initializing tensors we can pass parameters such as:
*  dtype: the datatype of the tensor (torch.float32 / torch.float64)
*  device: the device the tensor is stored on (GPU or CPU)
*  requires_grad: whether PyTorch should track gradients for the operations performed on the tensor

In [None]:
float_32 = torch.tensor([3.0,6.0,9.0], dtype=None, device=None, requires_grad=False)
new = torch.rand_like(input=float_32, dtype=torch.float64)
new
float_32
float_32.dtype
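
The defaults can be checked directly. This sketch assumes the global default dtype has not been changed with `torch.set_default_dtype`.

```python
import torch

ints = torch.tensor([1, 2, 3])            # Python ints are stored as torch.int64
floats = torch.tensor([1.0, 2.0, 3.0])    # Python floats are stored as torch.float32
flags = torch.tensor([True, False])       # Python bools are stored as torch.bool
doubles = torch.tensor([1.0, 2.0], dtype=torch.float64)  # explicit dtype wins

print(ints.dtype, floats.dtype, flags.dtype, doubles.dtype)
print(torch.zeros(2).dtype)  # factory functions also default to torch.float32
```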

### .dtype

.dtype is the tensor property that gives the type of the data stored in the tensor.

### Typecasting

Explicit typecasting is done with the .type() method

In [None]:
float_16 = float_32.type(torch.float16)
float_16.dtype

When a float16 tensor and a float32 tensor are combined, the result is implicitly promoted to the higher-precision dtype, here float32.

> Note that the .type() method returns the typecast tensor rather than modifying the original, so the returned tensor has to be assigned to a new variable or back to the original one.

In [None]:
result = float_16 * float_32
result.dtype
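
Both points, that `.type()` leaves the original untouched and that mixed dtypes are promoted, can be checked with a short sketch. The variable names are illustrative.

```python
import torch

original = torch.tensor([3.0, 6.0, 9.0])  # float32 by default
half = original.type(torch.float16)       # returns a new tensor

print(original.dtype)  # torch.float32 (unchanged)
print(half.dtype)      # torch.float16

# Mixing float16 and float32 promotes the result to the wider dtype
mixed = half * original
print(mixed.dtype)     # torch.float32

# torch.promote_types reports the promoted dtype without computing anything
print(torch.promote_types(torch.float16, torch.float32))
```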

Tensor attributes can be fetched as:
*  datatype: tensor.dtype
*  device: tensor.device
*  shape: tensor.shape | can also use tensor.size(). While shape is a property, size is a method

In [None]:
result.dtype
result.shape
result.size()
result.device

The .to() method returns a tensor with the requested device and/or dtype; like .type(), it does not modify the original tensor

In [None]:
# result.to(device="cuda", dtype=torch.float64)
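
A version of the call above that also runs on machines without a GPU might look like this. The `target_device` fallback is an assumption added for the example.

```python
import torch

# Fall back to the CPU when CUDA is not available
target_device = "cuda" if torch.cuda.is_available() else "cpu"

source = torch.tensor([1.0, 2.0, 3.0])
converted = source.to(device=target_device, dtype=torch.float64)

print(source.dtype, source.device)        # the original is unchanged
print(converted.dtype, converted.device)  # new dtype, and new device if a GPU exists
```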

## Tensor Manipulation

Tensors can be operated on as:
* Addition
* Subtraction
* Multiplication (element-wise)
* Division
* Matrix multiplication

In [None]:
test = torch.tensor([[[1,2,3],[4,5,6]]], dtype=torch.long)

# Addition
test + 7

# Multiplication
new_test = test * 4

# Subtraction
test - 12

# Division
test / 5

Tensors are multiplied in two ways:

* Element-wise multiplication (*), which also covers multiplication with a scalar
* Matrix Multiplication (@)


In [None]:
# Element-wise multiplication
print(f"{test} * {new_test} = {test * new_test}")

# Matrix multiplication
# torch.matmul(test, new_test)

d1 = torch.tensor([[1,2,3],[4,5,6]])
d2 = torch.tensor([[1,2],[3,4]])
try:
  torch.matmul(d1,d2)
except Exception as exp:
  print(exp)

One of the most frequent errors when working with neural networks and writing deep learning code is a shape mismatch between the tensors being multiplied.

The two important rules followed are:

1. The inner dimensions must match:
  * `(3,2) @ (2,3)` will work
  * `(3,2) @ (3,2)` won't work
  * `(2,3) @ (3,2)` will work
  
2. The resulting matrix has the shape of the outer dimensions, e.g. `(3,2) @ (2,3)` gives `(3,3)`

>Note: for `(5,7) @ (4,6)` the inner dimensions are 7 and 4 (they do not match, so the product fails) and the outer dimensions are 5 and 6
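
The two rules can be sketched with random matrices. The names `a` and `b` are illustrative.

```python
import torch

a = torch.rand(3, 2)
b = torch.rand(2, 3)

# Inner dimensions match (2 and 2); the result takes the outer dimensions
print((a @ b).shape)  # torch.Size([3, 3])
print((b @ a).shape)  # torch.Size([2, 2])

# (3,2) @ (3,2): inner dimensions 2 and 3 do not match
mismatch_raised = False
try:
    a @ a
except RuntimeError as err:
    mismatch_raised = True
    print("Shape mismatch:", err)

# Transposing the second operand makes the inner dimensions agree
print((a @ a.T).shape)  # torch.Size([3, 3])
```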

In [None]:
# Transposing tensors

tensor = torch.rand([3,3,2])
tensor
# tensor.T
# tensor.T is deprecated for tensors whose number of dimensions is not 2; a better way to reverse the dimensions is the permute function, as follows
tensor.permute(*torch.arange(tensor.ndim - 1, -1, -1).tolist())
# .tolist() passes plain Python ints to permute, which here reverses the order of the dimensions

## Tensor Aggregation

We can find the max, min, sum, and mean of a tensor by using tensor aggregation methods.

In [None]:
random_tensor = torch.rand([3,3,3]) * 100
random_tensor = random_tensor.type(torch.int16) # Refer to the note in explicit typecasting section
random_tensor
random_tensor.dtype

In [None]:
# Finding min
random_tensor.min()

# Finding max
random_tensor.max()

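# Finding mean
try:
  random_tensor.mean()
except RuntimeError:
  # mean() is not defined for integer dtypes; cast first, e.g. random_tensor.type(torch.float32).mean()
  print("Tensor must be of dtype float or complex and not integer")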

# Finding sum
random_tensor.sum()

# Finding the positional min
random_tensor.argmin()

# Finding the positional max
random_tensor.argmax()

> Note that the .argmin() and .argmax() methods return the position of the min and max value as if the tensor were flattened to one dimension, i.e. in a 3x3x3 tensor, if the element at index `[1][2][1]` is the minimum, the value returned by the method is 16. Positions are counted in row-major order: starting from 0 at `[0][0][0]` we reach 8 at `[0][2][2]` and continue with 9 at `[1][0][0]`, so `[1][2][1]` sits at 1*9 + 2*3 + 1 = 16.
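
The flat-index arithmetic can be verified with a small sketch. The tensor below is a constructed example with its minimum placed at `[1][2][1]`.

```python
import torch

# 3x3x3 tensor of 50s with a single minimum at [1][2][1]
values = torch.full((3, 3, 3), 50)
values[1, 2, 1] = 0

flat_index = values.argmin().item()
print(flat_index)  # 16

# Convert the flat position back into a 3D index by hand:
# each step along dim 0 skips 9 elements, each step along dim 1 skips 3
i, rest = divmod(flat_index, 9)
j, k = divmod(rest, 3)
print(i, j, k)  # 1 2 1
```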

## Reshaping, Stacking, Squeezing, and Unsqueezing Tensors

* Reshape: reshape an input tensor to a desired shape
* View: return a view of an input tensor with a given shape that shares memory with the original tensor
* Stacking: stack multiple tensors on top of one another (**vstack**) or side-by-side (**hstack**)
* Squeeze: removes all dimensions of size `1` from a tensor
* Unsqueeze: adds a dimension of size `1` to a target tensor
* Permute: return a view of the input tensor with its dimensions permuted (**swapped**) in a certain way
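
Since `vstack` and `hstack` are mentioned in the list above, here is a brief sketch of how they differ on two hypothetical 1D tensors.

```python
import torch

a = torch.tensor([1, 2, 3])
b = torch.tensor([4, 5, 6])

# vstack places the tensors on top of one another (as rows)
vertical = torch.vstack([a, b])
print(vertical.shape)  # torch.Size([2, 3])

# hstack places them side by side; for 1D inputs this concatenates them
horizontal = torch.hstack([a, b])
print(horizontal.shape)  # torch.Size([6])
```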

In [None]:
tensor_new = torch.arange(1.0,49.0)  # 48 elements
tensor_new

# Reshaping a tensor: the target shape must hold exactly 48 elements
try:
  reshaped_tensor = tensor_new.reshape([4,13])
except RuntimeError as err:
  print(f"{err} | needed: 52 elements, available: 48")
try:
  reshaped_tensor = tensor_new.reshape([5,5,2])
except RuntimeError as err:
  print(f"{err} | needed: 50 elements, available: 48")
reshaped_tensor = tensor_new.reshape([3,4,4])  # 3*4*4 = 48, so this succeeds
reshaped_tensor


> A key point is that the .reshape() method can only reshape a tensor into a shape with the same total number of elements. We cannot exclude some elements from the new tensor, nor can we create a tensor with empty positions.

In [None]:
# Changing the view of a tensor 
view_tensor = tensor_new.view(3,2,4,1,2)
view_tensor

One thing to note about the .view() method is that the tensor it returns shares memory with the original tensor; only the shape changes. As a result, changes made to the returned tensor are reflected in the original tensor, and vice versa.

In [None]:
# Manipulating the view_tensor
view_tensor *= 10.0
tensor_new
tensor_new /= 10.0
view_tensor

We see that the changes were applied to both view_tensor and tensor_new.

In [None]:
# Stacking tensors
stacked_tensor = torch.stack([view_tensor, view_tensor, view_tensor], dim=4)
view_tensor
stacked_tensor
view_tensor.shape
stacked_tensor.shape

The .stack() method stacks tensors of the same shape along a new dimension. It takes two arguments, a list of tensors and `dim`. While the list is self-explanatory, `dim` can be harder to grasp: it specifies the index at which the new dimension is inserted in the stacked tensor. For example, if three tensors of shape `[2,3]` are stacked with `dim = 0`, they are placed one after another and the new tensor has shape `[3,2,3]`. With `dim = 2`, the same tensors give a result of shape `[2,3,3]`.
> Play around in the code block above to better understand it.
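
The example with three `[2,3]` tensors can be sketched as follows. The offsets of 10 and 20 are an assumption added so that the source of each element stays visible.

```python
import torch

base = torch.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]
parts = [base, base + 10, base + 20]

stacked_0 = torch.stack(parts, dim=0)
stacked_1 = torch.stack(parts, dim=1)
stacked_2 = torch.stack(parts, dim=2)

print(stacked_0.shape)  # torch.Size([3, 2, 3])
print(stacked_1.shape)  # torch.Size([2, 3, 3])
print(stacked_2.shape)  # torch.Size([2, 3, 3])

# dim=1 and dim=2 give the same shape but a different layout
print(stacked_1[0, 1])  # first row of the second input: tensor([10, 11, 12])
print(stacked_2[0, 1])  # element [0][1] of every input: tensor([ 1, 11, 21])
```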


In [None]:
# Squeezing tensors
squeezed_tensor = view_tensor.squeeze()
squeezed_tensor
squeezed_tensor.shape

In [None]:
# Unsqueezing tensors
unsqueezed_tensor = squeezed_tensor.unsqueeze(1)
squeezed_tensor
unsqueezed_tensor
squeezed_tensor.shape
unsqueezed_tensor.shape

In [None]:
# Permuting tensors
permuted_tensor = torch.permute(squeezed_tensor, (1,3,2,0))
permuted_tensor
permuted_tensor.shape

# Demonstrating that permute returns a view that shares memory with its input, just like .view()
squeezed_tensor *= 10
permuted_tensor

The permute method rearranges the dimensions of the input tensor in a specified order. The order is given as a tuple of the same length as the tensor's size tuple, whose entries are indices into the original size tuple. Example: if a tensor of size `(3,4,5,8)` is passed to `torch.permute(tensor, (2,3,0,1))`, the resulting tensor has a size of `(5,8,3,4)`.
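
The `(3,4,5,8)` example can be checked directly. The sketch also shows how indices map between the two tensors; the names are illustrative.

```python
import torch

source = torch.rand(3, 4, 5, 8)
permuted = torch.permute(source, (2, 3, 0, 1))
print(permuted.shape)  # torch.Size([5, 8, 3, 4])

# New dimension n is old dimension order[n], so permuted[a, b, c, d] == source[c, d, a, b]
same_element = permuted[4, 7, 2, 3] == source[2, 3, 4, 7]
print(same_element)

# The result is a view: writing to the source is visible through the permuted tensor
source[0, 0, 0, 0] = -1.0
print(permuted[0, 0, 0, 0])
```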

## Indexing

Indexing with PyTorch is similar to indexing with NumPy

In [None]:

tensor_test = torch.arange(1,10).reshape([1,3,3])
tensor_test

In [None]:
# Accessing the element at index 0 of the tensor
tensor_test[0]
# We get the whole 3x3 matrix because the shape of tensor_test is [1,3,3]
# Accessing the 2nd element of the 1st row
tensor_test[0][0][1]

In [None]:
# Using : for indexing
# : selects all the elements along a dimension (or, in general, of a list)
tensor_test[0,:,1]

- `tensor_test[0,:,1]` - This selects the first matrix (index 0 of the first dimension), all of its rows (`:` on the second dimension), and index 1 of the third dimension. In other words, it selects the second column of the first matrix in the tensor.

- `tensor_test[0][:][1]` - This is a chained indexing operation. The first indexing operation tensor_test[0] selects the first matrix in the tensor. The second indexing operation [:] selects all elements of this matrix. The third indexing operation [1] then selects the second element of the result. In this case, it will select the second row of the first matrix in the tensor.

> So, the difference between tensor_test[0,:,1] and tensor_test[0][:][1] is that the former selects a column from a matrix in the tensor, while the latter selects a row from a matrix in the tensor.
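
A quick sketch confirms the difference. It rebuilds the same `[1,3,3]` tensor under an illustrative name.

```python
import torch

grid = torch.arange(1, 10).reshape([1, 3, 3])
# grid[0] is [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

column = grid[0, :, 1]  # index 1 along the last dimension of every row
row = grid[0][:][1]     # [:] changes nothing, so this is just grid[0][1]

print(column)  # tensor([2, 5, 8])
print(row)     # tensor([4, 5, 6])
```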

To fetch `3,6,9` (the third column) from the tensor:

In [None]:
tensor_test[0,:,2]

## PyTorch and Numpy

NumPy, a widely used numerical computation library, is integrated with PyTorch to allow for better array and tensor manipulation. We can convert between a NumPy ndarray and a PyTorch tensor as follows:

- Pytorch -> Numpy: use `torch.Tensor.numpy()`
- Numpy -> pytorch: use `torch.from_numpy(ndarray)`

In [None]:
array = np.arange(1.0,9.0)
to_tensor = torch.from_numpy(array)
array, to_tensor, id(array), id(to_tensor)

> Note that the `.from_numpy()` method keeps NumPy's default dtype of float64, while PyTorch's own default dtype is float32. Also, although `id()` differs because the array and the tensor are distinct Python objects, the tensor shares the same underlying memory as the array; no copy of the data is made.

In [None]:
id(array), id(to_tensor)
array = array * 2
id(array), id(to_tensor)
to_tensor


> Note that `a += 1` and `a = a + 1` behave differently. The former changes the values in place, while the latter computes the result in new memory and rebinds the name to it. This is why modifying the array with `+=` also changes the tensor that shares its memory, whereas `a = a + n` leaves the tensor unchanged: the name now refers to a new array.
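
The difference between the two forms can be sketched with a small hypothetical array.

```python
import numpy as np
import torch

array = np.arange(1.0, 5.0)       # [1., 2., 3., 4.]
shared = torch.from_numpy(array)  # shares memory with array

# In-place update: the tensor sees the change
array += 1
after_inplace = shared.clone()    # snapshot of the tensor at this point
print(after_inplace)

# Rebinding: a new array is created, the tensor keeps pointing at the old memory
array = array + 1
print(array)
print(shared)
```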

In [None]:
# Changing from tensor to numpy
tensor = torch.ones([7,3])
numpy_tensor = tensor.cpu().numpy()  # .numpy() needs a CPU tensor; .cpu() returns the tensor itself if it is already there
tensor, numpy_tensor

## Reproducibility

`Reproducibility` is the ability to recreate the same results across multiple runs of a program. It stands in direct contrast to working with random initial conditions.

Sometimes you want the program to behave in the same way even though it is meant to work with randomly generated values later on. Instead of changing the program between debugging and application, we can `seed` the `RNG` so that the random number generator always produces the same sequence of values on every execution.

In [None]:
torch.manual_seed(32)
random_tensor_one = torch.rand([3,4,5])
torch.manual_seed(32)
random_tensor_two = torch.rand([3,4,5])
random_tensor_one
random_tensor_two
random_tensor_one == random_tensor_two

# Seeding the CUDA RNG for the current GPU; on multi-GPU systems, torch.cuda.manual_seed_all() seeds all GPUs at once
torch.cuda.manual_seed(0)
rt = torch.rand([2,2,2], device = device)
rt
torch.cuda.manual_seed(0)
nrt = torch.rand([2,2,2], device = device)
nrt
rt == nrt
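
As an alternative to seeding the global RNG, a dedicated `torch.Generator` can be passed to the sampling function. This is a sketch of that variant, pinned to the CPU so it does not depend on the default device.

```python
import torch

# Two independent generators seeded with the same value
gen_a = torch.Generator(device="cpu").manual_seed(32)
gen_b = torch.Generator(device="cpu").manual_seed(32)

first = torch.rand(3, 4, generator=gen_a, device="cpu")
second = torch.rand(3, 4, generator=gen_b, device="cpu")

print(torch.equal(first, second))  # True
```

Because each generator carries its own state, other code that draws random numbers in between does not disturb the sequence.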


## Accessing a GPU and Device Agnostic Code

Utilizing a GPU for ML operations significantly reduces runtime once the models and training data scale up. 

> Device-agnostic code is a program that makes use of the available hardware and software resources without requiring any tweaking to fit the system where it is run.
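
A minimal sketch of the pattern: pick the device once, then create every tensor on it. The tensor names and shapes are assumptions for illustration.

```python
import torch

# Pick the device once; everything below works unchanged with or without a GPU
device = "cuda" if torch.cuda.is_available() else "cpu"

inputs = torch.rand(4, 3, device=device)
weights = torch.rand(3, 2, device=device)

# Both operands live on the same device, so the product does too
outputs = inputs @ weights
print(outputs.shape, outputs.device)
```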

In [None]:
# Check if a GPU is available
torch.cuda.is_available()

### Putting tensors from CPU to GPU and vice versa

PyTorch can utilize a GPU to compute tensor operations faster, while NumPy can only access the CPU. So, sometimes we have to transfer our CPU tensor to the GPU and vice versa.

In [None]:
# A new tensor is created on the default device (the CPU unless it was changed with torch.set_default_device)
tensor = torch.tensor([1,2,3])
tensor, tensor.device

# Creating tensor on GPU if available
gpu_tensor = torch.tensor([1,2,3] , device = device)
gpu_tensor.device

# Putting a CPU tensor on the GPU
transferred = tensor.to(device)
transferred.device

# Putting tensors to CPU
transferred_again = gpu_tensor.to("cpu")
transferred_again.device
gpu_tensor

Some operations are not available for a tensor because of the device it lives on; for example, NumPy conversion requires a CPU tensor.

In [None]:
try:
  gpu_tensor.numpy()
except Exception as error:
  print("Error:",error)
  # gpu_tensor.to("cpu").numpy()
  # A more concise way to do that is to call the .cpu() method on the tensor
  gpu_tensor.cpu().numpy()
  gpu_tensor
  # .cpu() returns a new tensor; remember to assign it to a new variable or back to the original
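
One way to avoid the error altogether is a small conversion helper. `to_numpy` below is a hypothetical name, not a PyTorch function; it relies only on `Tensor.detach()`, `Tensor.cpu()`, and `Tensor.numpy()`.

```python
import numpy as np
import torch


def to_numpy(tensor: torch.Tensor) -> np.ndarray:
    """Hypothetical helper: convert a tensor on any device to a NumPy array."""
    # detach() drops autograd tracking, cpu() moves the data to the CPU if needed
    return tensor.detach().cpu().numpy()


device = "cuda" if torch.cuda.is_available() else "cpu"
source = torch.tensor([1, 2, 3], device=device)
converted = to_numpy(source)
print(type(converted), converted)
```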