<a href="https://colab.research.google.com/github/Monika171/Deep-Learning-with-Pytorch/blob/main/01_tensor_operations.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
# Jovian Commit Essentials
# Please retain and execute this cell without modifying the contents for `jovian.commit` to work
!pip install jovian --upgrade -q
import jovian
jovian.utils.colab.set_colab_file_id('1m5CRvySZxhn5TLxItOM-8Dp3ivE-anS3')

#  <font color=red><u>__5 IMPORTANT *MUST KNOW* PyTorch TENSOR FUNCTIONS!__</u></font>

### <font color=red><b> _PyTorch is a deep learning framework introduced by Facebook and based on the Torch library. It can also serve as a replacement for *NumPy* that uses the power of GPUs.<br><br>Tensors are similar to *NumPy’s ndarrays*, with the addition that tensors can also be used on a GPU to accelerate computing.<br><br>Here we look at five common yet important tensor functions:_ </b></font>

- ### <font color=red><b>_TORCH.ISNAN_</b></font>
- ### <font color=red><b>_TORCH.SORT_</b></font>
- ### <font color=red><b>_TORCH.UNIQUE_</b></font>
- ### <font color=red><b>_TORCH.HISTC_</b></font>
- ### <font color=red><b>_TORCH.SAVE_</b></font>


*Before we begin, let's install and import PyTorch.*

In [2]:
# Uncomment and run the appropriate command for your operating system, if required

# Linux / Binder
# !pip install numpy torch==1.7.0+cpu torchvision==0.8.1+cpu torchaudio==0.7.0 -f https://download.pytorch.org/whl/torch_stable.html

# Windows
# !pip install numpy torch==1.7.0+cpu torchvision==0.8.1+cpu torchaudio==0.7.0 -f https://download.pytorch.org/whl/torch_stable.html

# MacOS
# !pip install numpy torch torchvision torchaudio

In [3]:
# Import torch and other required modules
import torch
import numpy as np
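
The relationship between NumPy arrays and tensors described in the introduction can be sketched as follows. This is a minimal illustration, not part of the original notebook; the names `arr`, `t`, `device` and `t_dev` are made up for the example, and the code falls back to the CPU when no GPU is available.

```python
import numpy as np
import torch

# A small NumPy array and a tensor created from it (the two share memory).
arr = np.array([[1.0, 2.0], [3.0, 4.0]])
t = torch.from_numpy(arr)

# Use the GPU when one is available, otherwise stay on the CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
t_dev = t.to(device)

print(t_dev.device.type)    # 'cuda' or 'cpu', depending on the machine
print(t_dev.cpu().numpy())  # back to a NumPy array
```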


***
## <a id="function1"><font color=green><u>Function 1</u> - TORCH.ISNAN</font></a>

> Returns a tensor representing if each element of `input` is `NaN` or not. The returned tensor is a new tensor with boolean elements.

> **Note:** Complex values are considered NaN when their real part, their imaginary part, or both are NaN.

##### <u>Format</u>:
```
torch.isnan(input)
 ```



In [4]:
# Example 1 - working
torch.isnan(torch.tensor([23, 44., 5, float('nan'), 0]))

tensor([False, False, False,  True, False])

> Only the fourth element is `NaN` here; the other elements, including the zero, are not considered NaN.

In [5]:
# Example 2 - working
torch.isnan(torch.tensor([np.nan, float('nan')+1j, 52+0j, float('inf')]))

tensor([ True,  True, False, False])

> As mentioned earlier, a complex value is considered NaN when its real part, its imaginary part, or both are NaN. Here the real part of the second element is `NaN`. Note that `inf` is not NaN.

In [7]:
# Example 3 - breaking (to illustrate when it breaks)
x = np.array([[1,2],[3,4.]])
# x = torch.from_numpy(x)

torch.isnan(x)

TypeError: ignored

> The `input` argument must always be a tensor; here a NumPy array was passed.<br>
> <u>Hence, Correction:</u><br>
> `x = np.array([[1,2],[3,4.]])`<br>
> `x = torch.from_numpy(x)`<br>
> `torch.isnan(x)`

### <u>Summary</u>:
* `torch.isnan` is a built-in PyTorch function for detecting NaNs in tensors; it works on the GPU as well as on the CPU.
* `torch.isnan(input)` is very useful for tracking down NaN values when issues arise during the backward pass.
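
As a rough illustration of how `torch.isnan` helps when debugging, the sketch below counts NaNs and replaces them. The tensor `values` is a made-up stand-in for a loss or gradient that has gone bad; it is not taken from a real training run.

```python
import torch

# Hypothetical values standing in for a loss or gradient containing NaNs.
values = torch.tensor([0.5, float('nan'), 2.0, float('nan')])

nan_mask = torch.isnan(values)   # boolean mask, True where the entry is NaN
has_nan = bool(nan_mask.any())   # is there at least one NaN?
nan_count = int(nan_mask.sum())  # how many NaNs are there?

# Replace NaNs with zero and leave every other entry untouched.
cleaned = torch.where(nan_mask, torch.zeros_like(values), values)

print(has_nan, nan_count)
print(cleaned)
```

A check like `has_nan` can be placed after the loss computation to stop training early instead of propagating NaNs through the backward pass.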

Let's save our work using Jovian before continuing.

In [8]:
!pip install jovian --upgrade --quiet

In [9]:
import jovian

In [10]:
jovian.commit(project='01-tensor-operations-monika171')

[jovian] Detected Colab notebook...
[jovian] Please enter your API key ( from https://jovian.ai/ ):
API KEY: ··········
[jovian] Uploading colab notebook to Jovian...
[jovian] Capturing environment..
[jovian] Committed successfully! https://jovian.ai/monika171/01-tensor-operations-monika171


'https://jovian.ai/monika171/01-tensor-operations-monika171'

***
## <a id="function2"><font color=green><u>Function 2</u> - TORCH.SORT</font></a>

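> Sorts the elements of the `input` tensor in ascending order (by value) along a given dimension. It returns the sorted values together with the indices of those values in the original tensor.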

> If `dim` is not given, the last dimension of the input is chosen.

> For descending order (by value) set `descending` to `True`.

##### <u>Format</u>:
```
torch.sort(input, dim=-1, descending=False, *, out=None)
 ```

In [11]:
# Example 1 - working
x = torch.tensor([[78,30,77],[91,16,25],[55,180,101]])
print(x)
# Named `sorted_x` to avoid shadowing Python's built-in `sorted`
sorted_x, indices = torch.sort(x)
print(sorted_x)
print(indices)

tensor([[ 78,  30,  77],
        [ 91,  16,  25],
        [ 55, 180, 101]])
tensor([[ 30,  77,  78],
        [ 16,  25,  91],
        [ 55, 101, 180]])
tensor([[1, 2, 0],
        [1, 2, 0],
        [0, 2, 1]])


> Since `dim` was not specified, the last dimension is used, so each row is sorted independently. `indices` holds the original position of every sorted value within its row.

In [12]:
# Example 2 - working
x1 = torch.tensor([[78,30,77],[91,16,25],[55,180,101]])
print(x1)
# Named `sorted_x1` to avoid shadowing Python's built-in `sorted`
sorted_x1, indices = torch.sort(x1, dim=0)
print(sorted_x1)
print(indices)

tensor([[ 78,  30,  77],
        [ 91,  16,  25],
        [ 55, 180, 101]])
tensor([[ 55,  16,  25],
        [ 78,  30,  77],
        [ 91, 180, 101]])
tensor([[2, 1, 1],
        [0, 0, 0],
        [1, 2, 2]])


> When `dim=0`, sorting is done column-wise, i.e. along the first axis: each column is sorted independently.

In [13]:
# Example 3 - breaking (to illustrate when it breaks)
x3 = torch.tensor([[78,30,77],[91,16,25],[55,180,101]])
print(x3)
# Named `sorted_x3` to avoid shadowing Python's built-in `sorted`
sorted_x3, indices = torch.sort(x3, dim=2)
print(sorted_x3)
print(indices)

tensor([[ 78,  30,  77],
        [ 91,  16,  25],
        [ 55, 180, 101]])


IndexError: ignored

> The error says "Dimension out of range": for a 2-D tensor, `dim` must lie between -2 and 1, and in our example `dim=2` exceeds the maximum value of 1.
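
The valid range for `dim` can be checked directly. The following is a small sketch using the same tensor; the names `by_last` and `by_one` are only illustrative.

```python
import torch

x3 = torch.tensor([[78, 30, 77], [91, 16, 25], [55, 180, 101]])
print(x3.dim())  # number of dimensions of the tensor

# Negative values count from the last dimension, so dim=-1 means dim=1 here.
by_last, _ = torch.sort(x3, dim=-1)
by_one, _ = torch.sort(x3, dim=1)
print(torch.equal(by_last, by_one))

# An out-of-range dimension raises IndexError, which can be caught.
try:
    torch.sort(x3, dim=2)
except IndexError as err:
    print("IndexError:", err)
```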

### <u>Summary</u>:
* This function sorts the elements of a tensor along a given dimension, in ascending or descending order, and returns both the sorted values and their original indices.
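
The examples above only sort in ascending order. A minimal sketch of the `descending=True` case follows, reusing the same data; it also shows that the returned indices can rebuild the sorted values with `torch.gather`. The variable names are illustrative.

```python
import torch

x = torch.tensor([[78, 30, 77], [91, 16, 25], [55, 180, 101]])

# Sort each row from largest to smallest.
values, indices = torch.sort(x, dim=-1, descending=True)
print(values)
print(indices)

# The indices tell us where each sorted value came from in the input.
rebuilt = torch.gather(x, dim=-1, index=indices)
print(torch.equal(rebuilt, values))
```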


***
## <a id="function3"><font color=green><u>Function 3</u> - TORCH.UNIQUE</font></a>

> This function returns the unique elements of the input tensor.

##### <u>Format</u>:
```
torch.unique(input, sorted=True, return_inverse=False, return_counts=False, dim=None)
 ```

##### <u>Parameters</u>:
* input (Tensor) – the input tensor

* sorted (bool) – Whether to sort the unique elements in ascending order before returning as output.

* return_inverse (bool) – Whether to also return the indices for where elements in the original input ended up in the returned unique list.

* return_counts (bool) – Whether to also return the counts for each unique element.

* dim (int) – the dimension to apply unique. If `None`, the unique of the flattened input is returned. default: `None`


In [15]:
# Example 1 - working
output, inverse_indices = torch.unique(torch.tensor([[16, -3], [-3, 11]], dtype=torch.long), sorted=True, return_inverse=True)
print(output)
print(inverse_indices)

tensor([-3, 11, 16])
tensor([[2, 0],
        [0, 1]])


> Here `sorted=True` sorts the unique elements in ascending order in the output.
> Also, `return_inverse=True` returns, for each element of the input, its index in the output of unique values. Since `dim=None` by default, the unique values of the flattened input are returned.

In [16]:
# Example 2 - working
x = torch.tensor([[0, 8, 8],
                  [0, 1, 1],
                  [0, 8, 8],
                  [0 ,8, 6]])
print(torch.unique(x, sorted=True, dim=1))

tensor([[0, 8, 8],
        [0, 1, 1],
        [0, 8, 8],
        [0, 6, 8]])


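> With `dim=1`, whole columns are compared rather than individual values. The three columns differ from each other (the last two differ in the final row), so none of them is removed, which is why `8`, `1`, `8` still appear twice in the first three rows. The only visible change is that `sorted=True` reorders the columns, swapping the last two.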

In [17]:
# Example 3 - breaking (to illustrate when it breaks)
x1 = torch.tensor([[0, 8, 8],
                   [0, 1, 1],
                   [0, 8, 8],
                   [0, 8, 6]])
result, inverse_indices, counts = torch.unique(x1, True, True, True, True, dim=0)
print(result)
print(inverse_indices)
print(counts)

TypeError: ignored

> Too many arguments are passed: after `input`, only `sorted`, `return_inverse` and `return_counts` come before `dim`, so the extra positional `True` clashes with the `dim=0` keyword.

> The correct syntax would be:
* `result, inverse_indices, counts = torch.unique(x1, True, True, True, dim=0)`
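
For reference, here is a sketch of the corrected call written with keyword arguments, which makes it harder to pass one flag too many. The data matches the example above.

```python
import torch

x1 = torch.tensor([[0, 8, 8],
                   [0, 1, 1],
                   [0, 8, 8],
                   [0, 8, 6]])

# With dim=0, whole rows are compared and duplicate rows are removed.
result, inverse_indices, counts = torch.unique(
    x1, sorted=True, return_inverse=True, return_counts=True, dim=0
)
print(result)           # the unique rows
print(inverse_indices)  # for each input row, its position in `result`
print(counts)           # how often each unique row occurs in the input
```

The row `[0, 8, 8]` occurs twice in the input, so it appears once in `result` with a count of 2.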

### <u>Summary</u>:
* Currently, in the CUDA implementation and in the CPU implementation when *dim* is specified, *torch.unique* always sorts the tensor at the beginning, regardless of the `sorted` argument.

* When `dim` is given, `unique` compares whole slices along that dimension rather than individual values, so repeated values can remain in the result, as we noticed in Example 2.

* Combined with `torch.cat` and `return_counts=True`, `torch.unique()` can be used to get the items common to two tensors.
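
One way to get the common items between two tensors with `torch.unique` is sketched below. It assumes that neither tensor contains repeated values of its own; the tensors `a` and `b` are made up for the illustration.

```python
import torch

# Assumption: each tensor has no duplicates of its own.
a = torch.tensor([1, 2, 3, 4])
b = torch.tensor([3, 4, 5, 6])

# After concatenation, a value that occurs more than once must be in both.
combined = torch.cat([a, b])
values, counts = torch.unique(combined, return_counts=True)
common = values[counts > 1]

print(common)
```

If the inputs may contain duplicates, applying `torch.unique` to each tensor first restores the assumption.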


***
## <a id="function4"><font color=green><u>Function 4</u> - TORCH.HISTC</font></a>

> Computes the histogram of a tensor.

> The elements are sorted into equal width bins between `min` and `max`. If `min` and `max` are both zero, the minimum and maximum values of the data are used.

> Elements lower than `min` or higher than `max` are ignored.

##### <u>Format</u>:
```
torch.histc(input, bins=100, min=0, max=0, *, out=None)
 ```


In [19]:
# Example 1 - working
torch.histc(torch.tensor([7, 5., 7]), bins=4, min=0, max=8)

tensor([0., 0., 1., 2.])

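> With `bins=4` over the range 0 to 8, each bin is 2 wide. The value `5` falls into the third bin (4 to 6) and both `7`s fall into the fourth bin (6 to 8), giving the counts `[0, 0, 1, 2]`.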
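
To see why the counts in Example 1 land where they do, the bin edges can be reconstructed by hand. This is a minimal sketch; `edges`, `lo` and `hi` are illustrative names and not part of `torch.histc`.

```python
import torch

data = torch.tensor([7, 5., 7])
bins, lo, hi = 4, 0, 8

counts = torch.histc(data, bins=bins, min=lo, max=hi)

# Equal-width bins need bins + 1 edges between lo and hi.
edges = torch.linspace(lo, hi, steps=bins + 1)

# Print each bin's range next to its count.
for i in range(bins):
    left, right = edges[i].item(), edges[i + 1].item()
    print(f"bin {i}: {left:.0f} to {right:.0f} -> {int(counts[i])}")
```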

In [20]:
# Example 2 - working
torch.histc(torch.tensor([15., 65, 15, 101]))

tensor([2., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 1.])

> Since `bins` value is not provided, by default the elements are sorted into 100 equally spaced bins between the minimum and maximum values of `input`.

In [21]:
# Example 3 - breaking (to illustrate when it breaks)
torch.histc(torch.tensor([53., 9, 53., 17, 17, 17, 23]), bins=5.0)

TypeError: ignored

> In this function, the `bins` argument must be an int and NOT a float.

> Hence, the corrected form will be:

> `torch.histc(torch.tensor([53., 9, 53., 17, 17, 17, 23]), bins=5)`
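
A short sketch of the working call with an integer number of bins is shown below. When the bin count comes from a computation and may be a float, casting it with `int()` avoids the error; `n_bins` is an illustrative name.

```python
import torch

data = torch.tensor([53., 9, 53., 17, 17, 17, 23])

# A bin count that arrives as a float must be cast to int first.
n_bins = int(5.0)
hist = torch.histc(data, bins=n_bins)

# min and max default to 0, so the data range 9 to 53 is used.
print(hist)
```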

### <u>Summary</u>:
* `torch.histc(x)` returns a histogram of the elements in `x`, which is useful, for example, for inspecting how the values sampled in a training batch are distributed.




***
## <a id="function5"><font color=green><u>Function 5</u> - TORCH.SAVE</font></a>

> Saves an object to a disk file.

> This saves a serialized object to disk. It uses Python's `pickle` utility for serialization. Models, tensors and dictionaries can be saved using this function.


##### <u>Format</u>:
```
torch.save(obj, f, pickle_module=pickle, pickle_protocol=2, _use_new_zipfile_serialization=True)
 ```

##### <u>Parameters</u>:
* obj – saved object

* f – a file-like object (has to implement write and flush) or a string or os.PathLike object containing a file name

* pickle_module – module used for pickling metadata and objects

* pickle_protocol – can be specified to override the default protocol

In [23]:
# Example 1 - working
# Save to file
x = torch.tensor([10, 17, 72, 23, 6])
torch.save(x, 'tensor.pt')

> A common PyTorch convention is to save tensors using `.pt` file extension.

> This creates a `tensor.pt` file in the working directory containing the serialized tensor. Had we saved a model instead, the file would hold the pickled model object together with its weights.

> We can load the saved file with the function below; loading is as simple as saving. When an entire model is saved this way, it is restored without rebuilding it by hand, although pickle still needs the model's class definition to be importable when loading.

> * `torch.load()`: This function uses pickle's unpickling facilities to deserialize pickled object files into memory; its `map_location` argument also lets you choose the device the data is loaded onto.

> * `y = torch.load('tensor.pt')`
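
A full save and load round trip can be sketched as follows. To keep the working directory clean, this version writes into a temporary directory, which is a choice made for the illustration and not something the original example does.

```python
import os
import tempfile

import torch

x = torch.tensor([10, 17, 72, 23, 6])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'tensor.pt')
    torch.save(x, path)
    # map_location='cpu' places the loaded tensor on the CPU.
    y = torch.load(path, map_location='cpu')

# The loaded tensor holds the same values as the original.
print(torch.equal(x, y))
```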

In [24]:
# Example 2 - working
# Save to io.BytesIO 
import io
buffer = io.BytesIO()
torch.save(x, buffer)

> Data does not have to go to disk: `io.BytesIO` provides an in-memory bytes buffer that `torch.save` can write to just like a file.

> **Note:** You need to seek to the beginning of the buffer before reading:

> * `buffer.seek(0)`

> * `print(buffer.read())`
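
Rather than printing the raw bytes, the buffer can also be passed back to `torch.load`. A minimal sketch, with `restored` as an illustrative name:

```python
import io

import torch

x = torch.tensor([10, 17, 72, 23, 6])

buffer = io.BytesIO()
torch.save(x, buffer)

# Rewind to the start of the buffer before reading from it.
buffer.seek(0)
restored = torch.load(buffer)

print(restored)
```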

In [25]:
# Example 3 - breaking (to illustrate when it breaks)
z = torch.tensor([9, 3, 19, 24, 5])
torch.save(z)

TypeError: ignored

> The required second argument `f` is missing. To make it work, pass a file name (a string or `os.PathLike` object) or a file-like object, e.g. `torch.save(z, 'z.pt')`.

### <u>Summary</u>:

* We can also save an entire model, i.e. its architecture as well as its weights, so that we can later resume from a checkpoint, for example the point where we had frozen all but the last layer.

* If you are working in a hosted environment, it is better to save the model in cloud storage, so that it is easier to load later without having to upload it again, which can take time because models are usually large.

* If you plan to deploy your model in a web app, saving to the cloud is also better because it lets you make tweaks and changes, test your model, and iterate faster.
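
Saving a model's weights can be sketched as below. This uses `state_dict()`, a common alternative to pickling the whole model object; the tiny `nn.Linear(3, 1)` model is hypothetical and the in-memory buffer stands in for a file or cloud location.

```python
import io

import torch
import torch.nn as nn

# Hypothetical tiny model, used only for illustration.
model = nn.Linear(3, 1)

# Save only the weights (the state_dict) rather than the whole object.
buffer = io.BytesIO()
torch.save(model.state_dict(), buffer)

# Later: rebuild the same architecture and load the saved weights into it.
buffer.seek(0)
restored = nn.Linear(3, 1)
restored.load_state_dict(torch.load(buffer))

print(torch.equal(model.weight, restored.weight))
```

With this approach the architecture is defined in code and only the parameters travel through the saved file.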


## <font color=blue><u>__Conclusion__</u></font>
<font color=red>_This notebook discussed the functions `TORCH.ISNAN`, `TORCH.SORT`, `TORCH.UNIQUE`, `TORCH.HISTC` and `TORCH.SAVE`, with working and failing examples and typical uses of each. It can serve as a starting point for understanding related functions such as *nansum*, *nanquantile*, *unique_consecutive*, *argsort* and *searchsorted*, as well as saving and loading tensors._</font>

## <font color=blue><u>__Reference Links__</u></font>

* [Official documentation for tensor operations](https://pytorch.org/docs/stable/torch.html)
* [Pytorch Operation to detect NaNs](https://www.mmbyte.com/article/74581.html)
* [Saving/Loading your model in PyTorch](https://medium.com/udacity-pytorch-challengers/saving-loading-your-model-in-pytorch-741b80daf3c)
* [Everything You Need To Know About Saving Weights In PyTorch](https://towardsdatascience.com/everything-you-need-to-know-about-saving-weights-in-pytorch-572651f3f8de)
* [How to save and reload a deep learning model in Pytorch?](https://www.dezyre.com/recipes/save-reload-deep-learning-model-pytorch)
