<a href="https://colab.research.google.com/github/Wycology/deep_learning_course/blob/main/2_Tensors.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# <font color='green'><b> SATELLITE DATA FOR AGRICULTURAL ECONOMISTS</b></font>


<font color='blue'><b>THEORY AND PRACTICE</b></font>

**_MACHINE & DEEP LEARNING_**


*David Wuepper, Hadi Hadi, Wyclife Agumba Oluoch*

[Land Economics Group](https://www.ilr1.uni-bonn.de/en/research/research-groups/land-economics), University of Bonn, Bonn, Germany

---

# **Background**


---

The `tensor` is the fundamental data structure that `PyTorch` uses to store and manipulate data. Understanding `tensor`s is key to understanding how `PyTorch` implements its deep learning functions: both when preprocessing inputs and when postprocessing outputs, you will be working with `tensor`s most of the time. It is therefore important to understand what `tensor`s are and which operations they support. If you already know `NumPy` arrays, most of what follows will feel familiar. Let us see how to create some `tensor`s.

To work with `tensor`s, we first need to import the `PyTorch` library.

In [1]:
import torch # PyTorch must be installed and imported before anything else

We start with a small, simple `tensor`.

# **Creating `tensor`**
---
Before we can use a `tensor`, we need to create it. One simple way is the `torch.empty()` function:

In [None]:
x = torch.empty(2, 3) # Creates a tensor of shape 2 by 3, that is two rows and three columns.
print(x)

tensor([[5.4222e-35, 0.0000e+00, 1.6971e-33],
        [0.0000e+00, 1.1210e-43, 0.0000e+00]])


The `empty()` function is built into `PyTorch`. The created tensor is 2-dimensional, normally called a 2-d `tensor`, because it has two axes: **two** rows and **three** columns. You might wonder why we call it empty when it clearly contains values. The values you see are simply whatever was in the computer's memory when the `tensor` was allocated. `PyTorch` reserves the memory but does not initialize it, so the values are meaningless until you overwrite them. Note that a 1-dimensional `tensor` is often called a _vector_ and a 2-dimensional `tensor` a _matrix_, while the word `tensor` is typically used when there are more than 2 dimensions. In `PyTorch`, however, all of them are `tensor`s.
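
As a small illustration of the terminology above, the sketch below builds a vector, a matrix, and a 3-d `tensor` and checks their number of dimensions with the `ndim` attribute. It also overwrites an `empty` `tensor` with a known value. The variable names and the fill value 7.0 are arbitrary choices made for this example.

```python
import torch

vector = torch.zeros(3)        # 1-d tensor (vector)
matrix = torch.zeros(2, 3)     # 2-d tensor (matrix)
cube = torch.zeros(4, 2, 3)    # 3-d tensor

print(vector.ndim, matrix.ndim, cube.ndim)

# An "empty" tensor holds whatever was in memory, so overwrite it before use.
x = torch.empty(2, 3)
x.fill_(7.0)  # in-place fill; a trailing underscore marks in-place methods
print(x)
```

After `fill_`, every element of `x` has a defined value, so the `tensor` is safe to use in further computation.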

Rather than an `empty` `tensor`, one would normally prefer a `tensor` with known values. A `tensor` can be created with all its elements set to **zeros**, **ones**, or **random values**, which can be integers or floats. Conveniently, all of these can be done with built-in `PyTorch` functions as follows:



In [None]:
zeros_tensor = torch.zeros(2, 3) # This is a tensor with all values set to zero.
print(zeros_tensor)

tensor([[0., 0., 0.],
        [0., 0., 0.]])


In [None]:
ones_tensor = torch.ones(2, 3) # This is a tensor with all values set to one.
print(ones_tensor)

tensor([[1., 1., 1.],
        [1., 1., 1.]])


><font color='red' size = 5><b>NOTE</b></font>: For a random `tensor`, it is good practice to set a **random seed** so that your output is reproducible the next time you run the same code. Use the same integer seed each time you want to reproduce the values.

In [None]:
torch.manual_seed(248) # This ensures that the code is reproducible.
random_tensor = torch.rand(2, 3) # This creates a tensor with random values between 0 and 1 in it.
print(random_tensor)

tensor([[0.1486, 0.6503, 0.1690],
        [0.0158, 0.4595, 0.9061]])
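
The effect of the seed can be checked directly. The sketch below, with variable names chosen only for illustration, draws the same random `tensor` twice after resetting the seed and compares the results. It also draws random integers with `torch.randint`, since random values can be integers as well as floats.

```python
import torch

torch.manual_seed(248)
first = torch.rand(2, 3)

torch.manual_seed(248)   # reset the generator to the same starting state
second = torch.rand(2, 3)

# Same seed and same call sequence give identical values.
print(torch.equal(first, second))

# Random integers: `low` is inclusive, `high` is exclusive.
random_ints = torch.randint(low=0, high=10, size=(2, 3))
print(random_ints)
```

Without the second call to `torch.manual_seed`, the generator would simply continue from where it stopped and `second` would differ from `first`.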


# **Shape of `tensor`**
---

When we want to perform operations on a `tensor`, it is important to know its shape (printed as **torch.Size([...])**). The shape tells us, for example, whether the `tensor` is 1-d, 2-d, 3-d, 4-d, and so on. Some operations can only be performed if specific rules about `tensor` shapes are obeyed, as we will see. To find the shape of a tensor, we use the `shape` attribute as follows:

In [None]:
torch.manual_seed(248)
x = torch.rand(2, 3)
print(x.shape) # This is a two by three matrix.

torch.Size([2, 3])


You can think of the above matrix as a single band/image with two rows and three columns, that is, 6 elements in total. We can print it and confirm the number of elements as follows:

In [None]:
print(x)

tensor([[0.1486, 0.6503, 0.1690],
        [0.0158, 0.4595, 0.9061]])
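
Counting elements by eye only works for small `tensor`s. As a minimal sketch, the same information can be read from the `tensor` itself with `numel()` and `ndim`. The `tensor` here is recreated so that the example runs on its own.

```python
import torch

x = torch.rand(2, 3)

print(x.numel())               # total number of elements (rows * columns)
print(x.ndim)                  # number of dimensions
print(x.shape[0], x.shape[1])  # number of rows, number of columns
```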


Let us assume that you have an **RGB** image, that is, an image with three bands/color channels. In this case we will have a 3-d `tensor`, because we have rows and columns as well as a number of channels. Say you have an RGB image of a cup with 256 rows and 256 columns of pixels in each of the three color channels. Its shape will be (3, 256, 256). Let us create one:

In [None]:
torch.manual_seed(248)
rgb_tensor = torch.rand(3, 256, 256) # This is a 3-d tensor (channels dimension, rows dimension, and column dimension)
print(rgb_tensor.shape)

torch.Size([3, 256, 256])


The output is a `torch.Size` object holding the values 3, 256, and 256, which correspond to the number of channels, rows, and columns. The 3 means there are 3 elements along the first dimension, the first 256 means there are 256 elements along the second dimension, and the last 256 means there are 256 elements along the third dimension.
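
Because the shape behaves like a tuple, its entries can be unpacked into named variables. The sketch below assumes the (channels, rows, columns) layout described above. The names `channels`, `rows`, `columns`, and `first_band` are illustrative only.

```python
import torch

rgb_tensor = torch.rand(3, 256, 256)

# torch.Size unpacks like an ordinary tuple.
channels, rows, columns = rgb_tensor.shape
print(channels, rows, columns)

# The element count is the product of the extents of all dimensions.
print(rgb_tensor.numel() == channels * rows * columns)

# Indexing the first dimension selects one band as a 2-d tensor.
first_band = rgb_tensor[0]
print(first_band.shape)
```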

In [None]:
print(rgb_tensor)

tensor([[[0.1486, 0.6503, 0.1690,  ..., 0.8042, 0.1762, 0.3878],
         [0.6061, 0.0266, 0.6565,  ..., 0.7283, 0.7022, 0.6209],
         [0.4686, 0.5914, 0.0205,  ..., 0.2622, 0.1006, 0.5342],
         ...,
         [0.4051, 0.9665, 0.9868,  ..., 0.6381, 0.9864, 0.7085],
         [0.6336, 0.8528, 0.4961,  ..., 0.6412, 0.6450, 0.5386],
         [0.3577, 0.7328, 0.4285,  ..., 0.2057, 0.8538, 0.1834]],

        [[0.1505, 0.2070, 0.0899,  ..., 0.9604, 0.4554, 0.2742],
         [0.0976, 0.9119, 0.0958,  ..., 0.3554, 0.9178, 0.1921],
         [0.6962, 0.9772, 0.1616,  ..., 0.8772, 0.3858, 0.0328],
         ...,
         [0.5004, 0.3803, 0.6465,  ..., 0.5358, 0.6438, 0.4367],
         [0.1902, 0.6078, 0.8052,  ..., 0.1226, 0.7050, 0.0753],
         [0.1047, 0.3733, 0.9489,  ..., 0.6650, 0.0291, 0.7273]],

        [[0.2848, 0.4889, 0.5891,  ..., 0.6803, 0.0650, 0.1653],
         [0.8378, 0.1194, 0.9079,  ..., 0.8441, 0.4194, 0.5504],
         [0.7955, 0.4902, 0.4107,  ..., 0.0283, 0.8470, 0.

## **Creating `tensor` _like**

There can be instances when you already have a tensor and want to initialize another tensor **like** the one you have. "Like" here does not mean that it will hold the same elements. It means that it will have the same attributes, including shape and data type. We will look at `tensor` data types later. For now, we can create a `tensor` like the one we have but filled with zeros, ones, random values, or left empty. The only thing to remember from what we discussed above is to add `_like` to `zeros`, `ones`, or `rand` as follows:




In [None]:
zeros_like_rgb = torch.zeros_like(rgb_tensor)
print(zeros_like_rgb.shape) # The shape is the same as that of rgb_tensor
print(zeros_like_rgb) # The values in it are, however, zeros

torch.Size([3, 256, 256])
tensor([[[0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.],
         ...,
         [0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.]],

        [[0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.],
         ...,
         [0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.]],

        [[0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.],
         ...,
         [0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.]]])


In [None]:
ones_like_rgb = torch.ones_like(rgb_tensor)
print(ones_like_rgb.shape) # The shape is the same as the shape of rgb_tensor ([3, 256, 256])
print(ones_like_rgb) # The elements in the output are all 1.'s

torch.Size([3, 256, 256])
tensor([[[1., 1., 1.,  ..., 1., 1., 1.],
         [1., 1., 1.,  ..., 1., 1., 1.],
         [1., 1., 1.,  ..., 1., 1., 1.],
         ...,
         [1., 1., 1.,  ..., 1., 1., 1.],
         [1., 1., 1.,  ..., 1., 1., 1.],
         [1., 1., 1.,  ..., 1., 1., 1.]],

        [[1., 1., 1.,  ..., 1., 1., 1.],
         [1., 1., 1.,  ..., 1., 1., 1.],
         [1., 1., 1.,  ..., 1., 1., 1.],
         ...,
         [1., 1., 1.,  ..., 1., 1., 1.],
         [1., 1., 1.,  ..., 1., 1., 1.],
         [1., 1., 1.,  ..., 1., 1., 1.]],

        [[1., 1., 1.,  ..., 1., 1., 1.],
         [1., 1., 1.,  ..., 1., 1., 1.],
         [1., 1., 1.,  ..., 1., 1., 1.],
         ...,
         [1., 1., 1.,  ..., 1., 1., 1.],
         [1., 1., 1.,  ..., 1., 1., 1.],
         [1., 1., 1.,  ..., 1., 1., 1.]]])


In [None]:
torch.manual_seed(248)
rand_like_rgb = torch.rand_like(rgb_tensor)
print(rand_like_rgb.shape) # Same shape as rgb_tensor
print(rand_like_rgb) # Interesting, the values here are exactly the same as those in rgb_tensor. Why? Because we used same manual seed.

torch.Size([3, 256, 256])
tensor([[[0.1486, 0.6503, 0.1690,  ..., 0.8042, 0.1762, 0.3878],
         [0.6061, 0.0266, 0.6565,  ..., 0.7283, 0.7022, 0.6209],
         [0.4686, 0.5914, 0.0205,  ..., 0.2622, 0.1006, 0.5342],
         ...,
         [0.4051, 0.9665, 0.9868,  ..., 0.6381, 0.9864, 0.7085],
         [0.6336, 0.8528, 0.4961,  ..., 0.6412, 0.6450, 0.5386],
         [0.3577, 0.7328, 0.4285,  ..., 0.2057, 0.8538, 0.1834]],

        [[0.1505, 0.2070, 0.0899,  ..., 0.9604, 0.4554, 0.2742],
         [0.0976, 0.9119, 0.0958,  ..., 0.3554, 0.9178, 0.1921],
         [0.6962, 0.9772, 0.1616,  ..., 0.8772, 0.3858, 0.0328],
         ...,
         [0.5004, 0.3803, 0.6465,  ..., 0.5358, 0.6438, 0.4367],
         [0.1902, 0.6078, 0.8052,  ..., 0.1226, 0.7050, 0.0753],
         [0.1047, 0.3733, 0.9489,  ..., 0.6650, 0.0291, 0.7273]],

        [[0.2848, 0.4889, 0.5891,  ..., 0.6803, 0.0650, 0.1653],
         [0.8378, 0.1194, 0.9079,  ..., 0.8441, 0.4194, 0.5504],
         [0.7955, 0.4902, 0.4107
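
To check that the `_like` functions copy attributes rather than values, the following sketch uses a small float64 template and inspects the shape and data type of the results. The names `template`, `ints`, and `rand_from_ints` are hypothetical. The last step relies on the `dtype` argument of `torch.rand_like`, which is useful when the template holds integers and therefore cannot be filled with random floats directly.

```python
import torch

template = torch.ones((2, 3), dtype=torch.float64)

zeros_copy = torch.zeros_like(template)
empty_copy = torch.empty_like(template)  # uninitialized values, same shape and dtype

print(zeros_copy.shape, zeros_copy.dtype)
print(empty_copy.shape, empty_copy.dtype)

# For an integer template, request a floating-point dtype explicitly.
ints = torch.ones((2, 3), dtype=torch.int16)
rand_from_ints = torch.rand_like(ints, dtype=torch.float32)
print(rand_from_ints.shape, rand_from_ints.dtype)
```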

## **Creating `tensor` directly from data**

It is also possible to supply data directly to `torch.tensor()` to create a `tensor` of your liking. The data can be a list, a tuple, a `NumPy` array, and so on.

In [None]:
some_values = torch.tensor([11, 2.0, 376, 0.004]) # Manually supplied values
print(some_values.shape) # This is a one dimension tensor.
print(some_values)

torch.Size([4])
tensor([1.1000e+01, 2.0000e+00, 3.7600e+02, 4.0000e-03])


In [None]:
some_values = torch.tensor([[1.2, 23, 13.1, 4.2], [23, 4, 5, 67]])
print(some_values.shape) # This is a two dimension tensor.
print(some_values)

torch.Size([2, 4])
tensor([[ 1.2000, 23.0000, 13.1000,  4.2000],
        [23.0000,  4.0000,  5.0000, 67.0000]])


In [None]:
tuple_tensor = torch.tensor((2.22, 1.675, 8)) # Creating from tuple
print(tuple_tensor.shape) # This is 1-d tensor
print(tuple_tensor)


torch.Size([3])
tensor([2.2200, 1.6750, 8.0000])


In [None]:
tuple_list_tensor = torch.tensor(((2.22, 1.675, 8), [1.2, 23, 13.1])) # From tuple and list
print(tuple_list_tensor.shape) # 2-d tensor
print(tuple_list_tensor)

torch.Size([2, 3])
tensor([[ 2.2200,  1.6750,  8.0000],
        [ 1.2000, 23.0000, 13.1000]])


><font color="red" size = 5>**NOTE:**</font> Using `torch.tensor` creates a copy of the data. Sometimes this may overwhelm your computer's memory. For example, if you do it on 100 TB of data, you end up holding about 200 TB in total. <font color="orange">_Remember, the copy needs roughly as much memory as the original array when the data type is unchanged_.</font>
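
The copying behaviour can be observed with a small `NumPy` array. In the sketch below, `torch.tensor` makes an independent copy, while `torch.from_numpy` returns a `tensor` that shares memory with the array and therefore avoids the extra memory. This assumes `NumPy` is installed alongside `PyTorch`. The variable names are illustrative.

```python
import numpy as np
import torch

array = np.array([1.0, 2.0, 3.0])

copied = torch.tensor(array)      # independent copy of the data
shared = torch.from_numpy(array)  # shares memory with the NumPy array

array[0] = 99.0                   # change the original array afterwards

print(copied)  # unaffected by the change
print(shared)  # reflects the change, because no copy was made
```

Sharing memory saves space, but it also means that later changes to the array silently change the `tensor`, so choose deliberately between the two.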

# **`Tensor` Data Types**
---

Data types such as **integer**, **float**, and **boolean**, among others, can easily be set for `tensor`s during creation. This is often crucial because operations on `tensor`s can depend strictly on the data type. That is, some operations require a floating-point type such as float32 and will not accept integers.

In [None]:
int_tensor = torch.ones((2, 3), dtype=torch.int16) # Specifying the data type to torch.int16
print(int_tensor)

tensor([[1, 1, 1],
        [1, 1, 1]], dtype=torch.int16)


Notice that the printout also shows the dtype. This did not happen with our earlier tensors because they used the default data type, which `PyTorch` omits from the printout. Let us see the same `tensor` as float64.

In [None]:
float64_tensor = torch.ones((2, 3), dtype = torch.float64)
print(float64_tensor.shape) # Of course 2-d tensor with two rows and three columns.
print(float64_tensor) # The printout also shows the dtype.


torch.Size([2, 3])
tensor([[1., 1., 1.],
        [1., 1., 1.]], dtype=torch.float64)


Besides defining the data type of a tensor during creation, we can also convert an existing `tensor` to another data type. Say we created the `tensor` called **float64_tensor** above and now want a **float32_tensor**. We can achieve that with the `.to()` method as follows:

In [None]:
float32_tensor = float64_tensor.to(torch.float32)
print(float32_tensor) # Now this is float32

tensor([[1., 1., 1.],
        [1., 1., 1.]])


Notice that float32 is not printed the way float64 was. This is because float32 is the **default** floating-point data type. You can confirm the data type with the `.dtype` attribute of the `tensor`.

In [None]:
float32_tensor.dtype

torch.float32

The following is a list of **common** data types that you will encounter most of the time:
* torch.bool
* torch.int8
* torch.uint8
* torch.int16
* torch.int32
* torch.int64
* torch.half (alias of torch.float16)
* torch.float (alias of torch.float32)
* torch.double (alias of torch.float64)
* torch.bfloat16
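
Some of the names in the list are aliases, and some operations depend on the data type. The sketch below first compares `torch.half`, `torch.float`, and `torch.double` with their explicit counterparts. It then tries `torch.mean` on an integer `tensor`, which in the `PyTorch` versions we are aware of raises a `RuntimeError`, and repeats the call after converting to float32. The `try`/`except` is there so that the example runs either way.

```python
import torch

# Short names and explicit names refer to the same data types.
print(torch.half == torch.float16)
print(torch.float == torch.float32)
print(torch.double == torch.float64)

int_tensor = torch.ones((2, 3), dtype=torch.int16)

# The mean is only defined for floating-point (or complex) tensors.
try:
    print(torch.mean(int_tensor))
except RuntimeError as error:
    print(f"mean failed on an integer tensor: {error}")

# Converting to a floating-point type first makes the operation valid.
float_mean = torch.mean(int_tensor.to(torch.float32))
print(float_mean)
```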


# **Some Math operations with `Tensors`**
---
We now know how to create `tensor`s and which data types they can take. Next: what can we do with them? We begin with basic arithmetic between a `tensor` and a `scalar`, that is, a single number. For example, suppose we have a `tensor` of ones and want to add 5 to each of its elements.

## **Operating on a `tensor` with a `scalar`**

In [None]:
ones_tensor = torch.ones(2, 3)
print(ones_tensor)

# Add 5 to each of the elements

five_added = ones_tensor + 5
print(five_added)

tensor([[1., 1., 1.],
        [1., 1., 1.]])
tensor([[6., 6., 6.],
        [6., 6., 6.]])


It is evident that the `scalar` 5 is added to each of the elements in our original `ones_tensor`. Likewise, we can multiply a `tensor` by a `scalar` as follows:

In [None]:
torch.manual_seed(248)
random_tensor = torch.rand(2, 3)
print(random_tensor)
multiplied_tensor = random_tensor * 5
print(multiplied_tensor)

tensor([[0.1486, 0.6503, 0.1690],
        [0.0158, 0.4595, 0.9061]])
tensor([[0.7429, 3.2515, 0.8448],
        [0.0788, 2.2977, 4.5307]])


We can also chain several arithmetic operations with `scalar`s, including division, multiplication, addition, and subtraction, as follows:

In [None]:
torch.manual_seed(248)
random_tensor = torch.rand(2, 3)
print(random_tensor)
chained_operations = (((random_tensor / 3) * 7) + 9) - 89
print(chained_operations)

tensor([[0.1486, 0.6503, 0.1690],
        [0.0158, 0.4595, 0.9061]])
tensor([[-79.6533, -78.4826, -79.6058],
        [-79.9632, -78.9277, -77.8857]])


Let us confirm one element by hand, say the first element, which is 0.1486:

In [None]:
0.1486 / 3 # First we divide it by 3

0.04953333333333334

In [None]:
0.04953333333333334 * 7 # Then we multiply by 7

0.3467333333333334

In [None]:
0.3467333333333334 + 9 # Then we add 9

9.346733333333333

In [None]:
9.346733333333333 - 89 # Finally we subtract 89

-79.65326666666667

We can also perform power operations, for example squaring or cubing a `tensor`, that is, raising it to the power 2 or 3.

In [None]:
torch.manual_seed(248)
random_tensor = torch.rand(2, 3)
print(random_tensor)
power_three = random_tensor ** 3
print(power_three)

tensor([[0.1486, 0.6503, 0.1690],
        [0.0158, 0.4595, 0.9061]])
tensor([[3.2797e-03, 2.7500e-01, 4.8236e-03],
        [3.9115e-06, 9.7049e-02, 7.4405e-01]])


We can confirm this manually with the first element of `random_tensor`.

In [None]:
0.1486 * 0.1486 * 0.1486 # Close but not identical, because 0.1486 is the rounded printout of the stored value.

0.0032813792560000008

><font color = "red" size = 5>**NOTE:**</font> When doing mathematical operations on `tensor`s involving a `scalar` the operation is distributed over each element. For example, when adding 5 to a `tensor` of size ([2, 3]), the value five will be added to all the 6 elements in the `tensor`.




## **Operations with several `tensor`s**

Besides operations between a `tensor` and a `scalar`, we will more often be operating on several `tensor`s together. This generally works when the `tensor`s involved have the same shape, in which case the operation is applied element by element. For example:

In [None]:
torch.manual_seed(248)
a = torch.rand(2, 3)
b = torch.ones(2, 3)

print(a * b) # The two tensors have the same shape, so they can be multiplied element by element (a + b works the same way).

tensor([[0.1486, 0.6503, 0.1690],
        [0.0158, 0.4595, 0.9061]])


We can also raise `tensor` `a` to the power of `b`.

In [None]:
print(a ** b)

tensor([[0.1486, 0.6503, 0.1690],
        [0.0158, 0.4595, 0.9061]])


Division and subtraction work in the same way. The multiplication is repeated here for comparison.

In [None]:
print(a * b)
print(a / b)
print(a - b)

tensor([[0.1486, 0.6503, 0.1690],
        [0.0158, 0.4595, 0.9061]])
tensor([[0.1486, 0.6503, 0.1690],
        [0.0158, 0.4595, 0.9061]])
tensor([[-0.8514, -0.3497, -0.8310],
        [-0.9842, -0.5405, -0.0939]])


><font color = "red" size = 5>**NOTE:**</font> There is an exception to this rule of same shape. This is called <font color = "steelblue">**BROADCASTING**</font>.

## **`tensor` broadcasting**

This is a mechanism that lets us perform operations between `tensor`s whose shapes are **compatible** without being identical. For example, one tensor can have shape ([2, 3]) and the other ([1, 3]).

In [None]:
torch.manual_seed(248)
random_tensor = torch.rand(2, 3)
torch.manual_seed(248)
other_tensor = torch.rand(1, 3)

print(random_tensor)
print(other_tensor)

tensor([[0.1486, 0.6503, 0.1690],
        [0.0158, 0.4595, 0.9061]])
tensor([[0.1486, 0.6503, 0.1690]])


Let us print out the shapes to confirm that they are different.

In [None]:
print(random_tensor.shape)
print(other_tensor.shape)

torch.Size([2, 3])
torch.Size([1, 3])


We will try to add, subtract, multiply, divide and raise to power.

In [None]:
print(random_tensor + other_tensor)
print(random_tensor - other_tensor)
print(random_tensor * other_tensor)
print(random_tensor / other_tensor)
print(random_tensor ** other_tensor)


tensor([[0.2971, 1.3006, 0.3379],
        [0.1643, 1.1098, 1.0751]])
tensor([[ 0.0000,  0.0000,  0.0000],
        [-0.1328, -0.1908,  0.7372]])
tensor([[0.0221, 0.4229, 0.0285],
        [0.0023, 0.2988, 0.1531]])
tensor([[1.0000, 1.0000, 1.0000],
        [0.1060, 0.7067, 5.3630]])
tensor([[0.7533, 0.7559, 0.7405],
        [0.5397, 0.6031, 0.9835]])


><font color = "red" size = 5>**NOTE:**</font> Even though the shapes are not identical, they are compatible, so the operations work. The following rules must be fulfilled for **broadcasting** to work:

<ul type = "circle">
  <li>Each tensor must have at least one dimension.</li>
  <li>Compare dimensions of tensors from last to first.</li>
    <ol>
    <li>Each dimension must be equal, <font color = "red">OR</font></li>
    <li>One of the dimensions must be of size 1, <font color = "red">OR</font></li>
    <li>The dimension does not exist in one of the tensors.</li>
    </ol>

</ul>  
Given these conditions, let us look at the previous `tensor`s.

In [None]:
print(random_tensor.shape)
print(other_tensor.shape)

torch.Size([2, 3])
torch.Size([1, 3])


| The `tensor` | Rows | Columns |
|--------------|:------|:---------|
| `random_tensor`| 2| 3|
| `other_tensor` | 1 | 3|

Looking at the two `tensor`s against the rules:

<ul type = "circle">
  <li>Each tensor must have at least one dimension.<font color = "green" size = 5><b> PASS </b></font></li>
  <li>Compare dimensions of tensors from last to first.</li>
    <ol>
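    <li>Each dimension must be equal, <font color = "red">OR</font> <font color = "green" size = 5><b> PASS </b></font> (last dimension: 3 and 3)</li>
    <li>One of the dimensions must be of size 1, <font color = "red">OR</font> <font color = "green" size = 5><b> PASS </b></font> (first dimension: 2 and 1)</li>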
    <li>The dimension does not exist in one of the tensors.</li>
    </ol>

</ul>

So, the two `tensor`s satisfy the requirements for broadcasting, and we can operate on them just as we did with a scalar. This is an important concept to grasp because it underlies many operations in deep learning. To drive the concept home, consider the following **broadcastable** `tensor`s:


In [None]:
a = torch.ones(4, 3, 2)
b = torch.rand(   3, 2)
print(a * b)

tensor([[[0.0158, 0.4595],
         [0.9061, 0.9113],
         [0.8757, 0.6538]],

        [[0.0158, 0.4595],
         [0.9061, 0.9113],
         [0.8757, 0.6538]],

        [[0.0158, 0.4595],
         [0.9061, 0.9113],
         [0.8757, 0.6538]],

        [[0.0158, 0.4595],
         [0.9061, 0.9113],
         [0.8757, 0.6538]]])


The above operation is possible because:

1. Both `tensor`s have at least one dimension.
2. Comparing from the last dimension towards the first, the dimensions are equal (2 and 2, then 3 and 3).
3. The remaining first dimension of `a` (extent 4) does not exist in `tensor` `b`.

In [None]:
a = torch.rand(4, 3, 2)
c = torch.rand(   3, 1)
print(a ** c)

tensor([[[0.3544, 0.9951],
         [0.7225, 0.7968],
         [0.1666, 0.4672]],

        [[0.1710, 0.7572],
         [0.6262, 0.0293],
         [0.6357, 0.8451]],

        [[0.6077, 0.4984],
         [0.8315, 0.8361],
         [0.5781, 0.9051]],

        [[0.7544, 0.7733],
         [0.6465, 0.7096],
         [0.1289, 0.4218]]])


The operation is possible because:
<ol type = "a">
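<li>Both tensors have at least one dimension.</li>
<li>In the last dimension, one of the extents is 1 (2 in a, 1 in c).</li>
<li>The second last dimension is equal in both a and c (3 and 3).</li>
<li>The first dimension of a (extent 4) does not exist in c.</li>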
</ol>  

In [None]:
a = torch.rand(4, 3, 2)
d = torch.rand(   1, 2)
print(a / d)

tensor([[[1.6133, 1.3889],
         [0.1296, 0.0552],
         [1.4020, 0.0900]],

        [[0.4603, 1.2146],
         [0.0323, 1.3905],
         [0.7545, 1.0433]],

        [[0.2705, 1.0885],
         [1.6729, 1.0340],
         [1.4824, 0.2456]],

        [[0.3745, 1.4759],
         [0.7574, 1.3992],
         [0.7349, 0.9948]]])


The above operation is possible because:
<ol type = "i">
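  <li>Both tensors have at least one dimension.</li>
  <li>The last dimension is equal in both a and d (2 and 2).</li>
  <li>In the second last dimension, one of the extents is 1 (3 in a, 1 in d).</li>
  <li>The first dimension of a (extent 4) does not exist in d.</li>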
</ol>
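
The broadcasting rules listed earlier can also be written as a few lines of plain Python. The helper `can_broadcast` below is a hypothetical function written for this tutorial, not part of `PyTorch`. It compares two shapes from the last dimension to the first, and it assumes each shape has at least one dimension. Treat it as a sketch for building intuition rather than a replacement for PyTorch's own checks.

```python
import torch

def can_broadcast(shape_a, shape_b):
    """Return True if two shapes are compatible under the broadcasting rules."""
    # zip stops at the shorter shape, so dimensions that are missing
    # in one tensor are never compared and are always acceptable.
    for dim_a, dim_b in zip(reversed(shape_a), reversed(shape_b)):
        if dim_a != dim_b and dim_a != 1 and dim_b != 1:
            return False
    return True

print(can_broadcast((4, 3, 2), (3, 2)))  # equal trailing dimensions
print(can_broadcast((4, 3, 2), (3, 1)))  # size 1 in the last dimension
print(can_broadcast((4, 3, 2), (1, 2)))  # size 1 in the second last dimension
print(can_broadcast((4, 3, 2), (3,)))    # last dimensions are 2 and 3: not compatible

# A compatible pair really does broadcast to the larger shape.
result = torch.ones(4, 3, 2) * torch.rand(3, 1)
print(result.shape)
```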

## **Additional useful math with `tensors`**

In [None]:
# Common functions

torch.manual_seed(248)
random_tensor = torch.rand(2, 3)
print(torch.abs(random_tensor)) # Absolute values; unchanged here because all values lie between 0 and 1.
print(torch.ceil(random_tensor)) # Rounds each element up to the nearest integer.
print(torch.floor(random_tensor)) # Rounds each element down to the nearest integer.
print(torch.clamp(random_tensor, -0.5, 0.5)) # Limits each element to the range [-0.5, 0.5].
print(f"Max: {torch.max(random_tensor)}")
print(f"Min: {torch.min(random_tensor)}")
print(f"Mean: {torch.mean(random_tensor)}")
print(f"Median: {torch.median(random_tensor)}")
print(f"Std: {torch.std(random_tensor)}")
print(f"Var: {torch.var(random_tensor)}")
print(f"Product: {torch.prod(random_tensor)}")
print(f"Unique: {torch.unique(random_tensor)}")

tensor([[0.1486, 0.6503, 0.1690],
        [0.0158, 0.4595, 0.9061]])
tensor([[1., 1., 1.],
        [1., 1., 1.]])
tensor([[0., 0., 0.],
        [0., 0., 0.]])
tensor([[0.1486, 0.5000, 0.1690],
        [0.0158, 0.4595, 0.5000]])
Max: 0.9061495661735535
Min: 0.01575601100921631
Mean: 0.3915477693080902
Median: 0.16896241903305054
Std: 0.3424002230167389
Var: 0.11723791062831879
Product: 0.0001071081351255998
Unique: tensor([0.0158, 0.1486, 0.1690, 0.4595, 0.6503, 0.9061])
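
The summary functions above reduce the whole `tensor` to a single value. For images it is often more useful to reduce along selected dimensions only, which the `dim` argument allows. The sketch below uses a tiny made-up image of shape (3, 4, 4). The shape and variable names are assumptions for illustration.

```python
import torch

torch.manual_seed(248)
image = torch.rand(3, 4, 4)  # hypothetical image: 3 channels, 4 rows, 4 columns

# Average over rows and columns, leaving one mean per channel.
channel_means = image.mean(dim=(1, 2))
print(channel_means.shape)

# With `dim`, torch.max returns both the maxima and their positions.
column_max, column_argmax = torch.max(image[0], dim=0)
print(column_max.shape, column_argmax.shape)
```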


In [None]:
# Comparisons
torch.manual_seed(248)
random_tensor = torch.rand(2, 3)
torch.manual_seed(248)
other_tensor = torch.rand(1, 3)
print(torch.eq(random_tensor, other_tensor)) # Returns boolean with True where elements are equal and False otherwise.

tensor([[ True,  True,  True],
        [False, False, False]])


# **Copying `tensor`s**

As with any other object in Python, assigning a `tensor` to a new variable name makes the new variable just a _label_ for the original `tensor`. This means that a new copy is <font color = "red" size = 5>**NOT**</font> created. If you alter values through the new variable, the values seen through the original variable change too. For example:

In [None]:
torch.manual_seed(248)
random_tensor = torch.rand(2, 3)
print(f"Random tensor: \n {random_tensor}")

new_random_tensor = random_tensor # Creating new label to the original random_tensor.

print(f"New random tensor: \n{new_random_tensor}")

Random tensor: 
 tensor([[0.1486, 0.6503, 0.1690],
        [0.0158, 0.4595, 0.9061]])
New random tensor: 
tensor([[0.1486, 0.6503, 0.1690],
        [0.0158, 0.4595, 0.9061]])


In [None]:
# Let us manipulate the new random tensor

new_random_tensor[0][0] = 2
print(new_random_tensor)

tensor([[2.0000, 0.6503, 0.1690],
        [0.0158, 0.4595, 0.9061]])


In [None]:
# Let's check what happened to the original random tensor
print(random_tensor)

tensor([[2.0000, 0.6503, 0.1690],
        [0.0158, 0.4595, 0.9061]])


><font color = "red" size = 5>**NOTE:**</font> The value has changed here too. This is because the labels are referring to the same object in memory.

To have a separate copy of the original `tensor` to work with, you can use the **clone** method as follows:

In [None]:
torch.manual_seed(248)
random_tensor = torch.rand(2, 3)
print(random_tensor)

new_random_tensor = random_tensor.clone()

# Manipulate the new random tensor.
new_random_tensor[0][0] = 2
print(f"Random tensor: \n {random_tensor}")
print(f"New random tensor: \n {new_random_tensor}")
# The random tensor remains unchanged.

tensor([[0.1486, 0.6503, 0.1690],
        [0.0158, 0.4595, 0.9061]])
Random tensor: 
 tensor([[0.1486, 0.6503, 0.1690],
        [0.0158, 0.4595, 0.9061]])
New random tensor: 
 tensor([[2.0000, 0.6503, 0.1690],
        [0.0158, 0.4595, 0.9061]])


In [None]:
assert random_tensor is not new_random_tensor
torch.eq(random_tensor, new_random_tensor)

tensor([[False,  True,  True],
        [ True,  True,  True]])
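
The difference between a label and a clone can also be seen in where the data live. As a minimal sketch, `data_ptr()` returns the memory address of a tensor's first element, so equal addresses mean shared data. The names `original`, `alias`, and `cloned` are illustrative.

```python
import torch

original = torch.rand(2, 3)
alias = original            # a second name for the same tensor
cloned = original.clone()   # a new tensor with its own memory

print(alias is original)                        # same Python object
print(alias.data_ptr() == original.data_ptr())  # same memory address
print(cloned.data_ptr() == original.data_ptr()) # different memory address
print(torch.equal(cloned, original))            # same values at the time of cloning
```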

# **Working on Graphics Processing Units**

So far we have been working on the Central Processing Unit (CPU). In practice, however, we usually want to use Graphics Processing Units (GPUs) because they accelerate computation. `PyTorch` excels at utilizing NVIDIA GPUs that are compatible with the Compute Unified Device Architecture (CUDA). There are two things to consider here:
1. Check whether your machine has CUDA.
2. If working on Colab, you can enable a GPU by going to **Runtime tab ==> Change runtime type ==> Hardware Accelerator T4 GPU**. You can also access other GPUs or a TPU if you have the paid version.

><font color = "red" size = 5>**NOTE:**</font> If running on Colab, consider changing the runtime only after you have tested everything on CPU and are ready to run your code. This is because when using the paid version, you can get timed out. In fact, Colab will warn you if you enable a GPU but do not use it.

First we will check whether we have CUDA.

In [None]:
torch.cuda.is_available() # Returns True if CUDA is available. In this session it returned False.

False

Another fancy way to put it:

In [None]:
if torch.cuda.is_available():
  print("Yeeeey!! we have CUDA, we are going to enjoy faster computation!!")
else:
  print("Sorry, we only have to bear with the tortoise speed of CPU for now :) ")

Sorry, we only have to bear with the tortoise speed of CPU for now :) 


When working on the CPU, the data being processed sit in the computer's Random Access Memory (RAM). Unfortunately, the GPU does not "see" the RAM. Therefore, we have to move the data to memory that the GPU can "see" before it can do the computation. This is what "moving data to the GPU" means. It is an important consideration when buying a GPU: you need to know how much data it can hold and process in one go. Generally, a GPU with at least 16 GB of memory should be sufficient for most of the tasks here. However, more memory is better when a lot of data is loaded for computation at the same time.

All computers have a CPU, but some do not have CUDA-enabled GPUs. So, when deciding whether to use CUDA, we should write a conditional statement that checks whether CUDA is available so that it can be used. If CUDA is not available, we let the computation happen on the CPU. We do this as follows:

In [None]:
if torch.cuda.is_available():
  torch.manual_seed(248)
  gpu_rand = torch.rand(2, 3, device = "cuda")
  print(gpu_rand)
else:
  print("Sorry, we only have CPU for now!")


Sorry, we only have CPU for now!


In a typical deep learning workflow, we would not repeat this check every time. It is more efficient to create a variable named **device** that holds cuda if it is available and defaults to cpu otherwise. Then, whenever we want to put data or a model on the GPU or CPU, we only pass **device**. We can define it as follows:

In [None]:
if torch.cuda.is_available():
  device = torch.device('cuda') # The device string for NVIDIA GPUs is 'cuda'; 'gpu' is not a valid device type.
else:
  device = torch.device('cpu')

print(device)

cpu


Now you can create a `tensor` and specify the device it should live on, depending on the hardware you have. Remember that the default is CPU.

In [None]:
torch.manual_seed(248)
random_tensor = torch.rand(2, 3, device = device)
print(random_tensor)

tensor([[0.1486, 0.6503, 0.1690],
        [0.0158, 0.4595, 0.9061]])


If you already had a tensor and want to move it to device, you can use the `.to` method as follows:

In [None]:
new_random_tensor = random_tensor.to(device)
print(new_random_tensor)

tensor([[0.1486, 0.6503, 0.1690],
        [0.0158, 0.4595, 0.9061]])
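
The device selection is often written on a single line, and a `tensor` usually has to come back to the CPU before it can be handed to `NumPy`. The sketch below shows both steps. It runs whether or not CUDA is present, and it assumes `NumPy` is installed. The names `data` and `as_array` are illustrative.

```python
import torch

# One-line version of the if/else device selection.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

data = torch.rand(2, 3).to(device)
print(data.device)  # shows where the tensor currently lives

# NumPy works with CPU memory only, so move the tensor back first.
as_array = data.cpu().numpy()
print(as_array.shape)
```

Calling `.cpu()` on a `tensor` that is already on the CPU is harmless, so the same code works on both kinds of machine.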


# **Handling `tensor` shapes**

Shapes of `tensor`s can be changed to suit certain operations. For example, you may want to add or remove a dimension. Remember that images supplied for analysis in `PyTorch` are normally processed in **batches** or **mini-batches**, meaning that several images (for example, 16) are used at a time to train the model. This adds another dimension to the image `tensor`s. For example, if each image has 64 rows and 64 columns and you pick 16 of them, the tensor has shape ([16, 64, 64]). If each image also has 3 color channels/bands, you end up with ([16, 3, 64, 64]). This gives rise to a convention called N, C, H, W.
1. N ==> Number of images in the batch or mini-batch.
2. C ==> Number of channels/bands per image.
3. H ==> Number of rows per image.
4. W ==> Number of columns per image.
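
As a minimal sketch of the N, C, H, W convention, the code below stacks 16 made-up images of shape (3, 64, 64) into one batch with `torch.stack`, which creates the new leading dimension N. The random images stand in for real data, and the names are illustrative.

```python
import torch

# 16 hypothetical images, each with 3 channels, 64 rows, and 64 columns.
images = [torch.rand(3, 64, 64) for _ in range(16)]

# Stacking along a new first dimension produces the batch dimension N.
batch = torch.stack(images, dim=0)
print(batch.shape)

n, c, h, w = batch.shape
print(n, c, h, w)
```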

## **unsqueeze**

This is the function we use to add a dimension to a `tensor`. For example, if we have a single image with three channels, say ([3, 64, 64]), we may need to unsqueeze it before passing it to the model. The model expects the N dimension too, even for a batch of 1 image. We can achieve this as follows:

In [3]:
torch.manual_seed(248)
random_tensor = torch.rand(3, 64, 64)
print(random_tensor.shape)


torch.Size([3, 64, 64])


In [6]:
# Unsqueeze it, that is add N (batch) dimension.
unsqueeze_random_tensor = random_tensor.unsqueeze(dim=0) # Specify 0 as index of first dimension
print(unsqueeze_random_tensor.shape)


torch.Size([1, 3, 64, 64])


Now it is clear that a dimension with extent 1 has been added. Now we have **([1, 3, 64, 64])**, meaning **N = 1, C = 3, H = 64,** and **W = 64**.
This can therefore be passed on to model training. Remember, by adding the dimension, we have not changed the data in the object. In fact, a dimension of extent 1 does not change the number of elements in a `tensor`. For example, let us create a `tensor` with a single element, unsqueeze it twice, and print the final value.

In [14]:
torch.manual_seed(248)
random_tensor = torch.rand(1)
print(random_tensor)
print(random_tensor.unsqueeze(0).unsqueeze(0))

tensor([0.1486])
tensor([[[0.1486]]])


Note that the value remains 0.1486. Nothing changes except the number of dimensions, as confirmed below.

In [15]:
torch.manual_seed(248)
random_tensor = torch.rand(1)
print(random_tensor.shape)
print(random_tensor.unsqueeze(0).unsqueeze(0).shape)

torch.Size([1])
torch.Size([1, 1, 1])
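
The `dim` argument decides where the new dimension goes. The sketch below is only an illustration with a made-up `image` tensor: it places the extra dimension at the front, in the middle and at the end, and checks that the number of elements stays the same.

```python
import torch

image = torch.rand(3, 64, 64)

# The new dimension of extent 1 is inserted at the given index
front = image.unsqueeze(0)
middle = image.unsqueeze(1)
back = image.unsqueeze(-1)

print(front.shape)   # torch.Size([1, 3, 64, 64])
print(middle.shape)  # torch.Size([3, 1, 64, 64])
print(back.shape)    # torch.Size([3, 64, 64, 1])

# Indexing with None is another way to add a leading dimension
print(image[None].shape)  # torch.Size([1, 3, 64, 64])

# The number of elements is unchanged
print(front.numel() == image.numel())  # True
```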


## **squeeze**

This does the opposite of **unsqueeze**. It removes dimensions with extent 1 from a `tensor`. For example, sometimes you might have a `tensor` which contains 20 elements but is of shape ([1, 20]). This means that the first dimension could just be a batch. So, to extract the actual 20 values from this `tensor`, we can squeeze it as follows:

In [22]:
torch.manual_seed(248)
random_tensor = torch.rand(1, 20)
print(random_tensor)
print(random_tensor.shape) # The shape is ([1, 20])

tensor([[0.1486, 0.6503, 0.1690, 0.0158, 0.4595, 0.9061, 0.9113, 0.8757, 0.6538,
         0.3453, 0.9950, 0.6857, 0.7683, 0.0785, 0.3395, 0.1636, 0.7520, 0.5809,
         0.0166, 0.5257]])
torch.Size([1, 20])


In [23]:
torch.manual_seed(248)
random_tensor = torch.rand(1, 20)
print(random_tensor.squeeze())
print(random_tensor.squeeze().shape) # The shape is now ([20])

tensor([0.1486, 0.6503, 0.1690, 0.0158, 0.4595, 0.9061, 0.9113, 0.8757, 0.6538,
        0.3453, 0.9950, 0.6857, 0.7683, 0.0785, 0.3395, 0.1636, 0.7520, 0.5809,
        0.0166, 0.5257])
torch.Size([20])
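
Called without arguments, `squeeze()` removes every dimension of extent 1, which may also drop a batch dimension you wanted to keep. The sketch below, using a made-up tensor `t`, shows how passing a dimension index limits the effect to that dimension.

```python
import torch

t = torch.zeros(1, 3, 1, 5)

# No argument: all dimensions of extent 1 are removed
all_squeezed = t.squeeze()
print(all_squeezed.shape)  # torch.Size([3, 5])

# With an index: only that dimension is removed, and only if its extent is 1
first_squeezed = t.squeeze(0)
print(first_squeezed.shape)  # torch.Size([3, 1, 5])

# Dimension 1 has extent 3, so nothing happens
unchanged = t.squeeze(1)
print(unchanged.shape)  # torch.Size([1, 3, 1, 5])
```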


><font color = "red" size = 5>**NOTE:**</font> Unsqueeze can help us to handle broadcasting. Remember that in broadcasting there is a rule about a dimension having extent 1. This is exactly what unsqueeze does: it adds a dimension of extent 1. This means that when we lack such a dimension, we can add it to enable our operation to run.

In [25]:
torch.manual_seed(248)
random_tensor = torch.rand(4, 3, 2)
torch.manual_seed(248)
other_tensor = torch.rand(3)

print(random_tensor.shape) # Note second last dimension has value 3.
print(other_tensor.shape) # The extent of 0th dimension is also 3.

torch.Size([4, 3, 2])
torch.Size([3])


In the above case, broadcasting compares the shapes starting from the last dimension: `random_tensor` ends in 2 while `other_tensor` ends in 3, so the two do not match. We can fix this by adding a dimension of extent 1 at the end of `other_tensor`. Its second last dimension is then 3 (same as `random_tensor`) and the missing leading dimension is treated as having extent 1. This makes broadcasting work. Without this step, the operation fails as follows:

In [27]:
# print(random_tensor + other_tensor) # This raises a RuntimeError when run.

In [30]:
torch.manual_seed(248)
random_tensor = torch.rand(4, 3, 2)
torch.manual_seed(248)
other_tensor = torch.rand(3).unsqueeze(1)

print(random_tensor.shape)
print(other_tensor.shape) # Now they can be operated because last dimension is 1 and second last is 3 which is similar in both cases.

torch.Size([4, 3, 2])
torch.Size([3, 1])


In [33]:
print(random_tensor ** other_tensor) # This can now work.

tensor([[[0.7533, 0.9381],
         [0.3147, 0.0673],
         [0.8769, 0.9835]],

        [[0.9863, 0.9805],
         [0.7585, 0.5008],
         [0.9991, 0.9382]],

        [[0.9616, 0.6852],
         [0.4954, 0.3081],
         [0.9530, 0.9123]],

        [[0.5441, 0.9089],
         [0.8561, 0.7175],
         [0.8864, 0.9645]]])
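
If you only want to know whether two shapes can be broadcast, without building any `tensor`, `torch.broadcast_shapes` can do the check. This is a sketch that assumes a reasonably recent `PyTorch` (the function was added in version 1.8). The names `result_shape` and `compatible` are made up for illustration.

```python
import torch

# Shapes from the cells above: [4, 3, 2] and [3, 1] are compatible
result_shape = torch.broadcast_shapes((4, 3, 2), (3, 1))
print(result_shape)  # torch.Size([4, 3, 2])

# [4, 3, 2] and [3] are not: the last dimensions are 2 and 3
try:
    torch.broadcast_shapes((4, 3, 2), (3,))
    compatible = True
except RuntimeError as err:
    compatible = False
    print("Not broadcastable:", err)
```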


## **reshape**

The last bit of this section on handling `tensor` shapes is reshape. This is important when we want to convert, for instance, an image shape to a linear shape. The linear shape will normally be 1-dimensional. Instead of having a tensor of shape [64, 64], you end up with a tensor of shape [4096], that is 64 * 64 elements. All values in the original image are now held, poured out, as a single 'vector'. See the following:

In [36]:
torch.manual_seed(248)
random_tensor = torch.rand(2, 3)
print(random_tensor.shape)
reshaped_tensor = random_tensor.reshape(2 * 3)
print(reshaped_tensor.shape) # This is now a single dimension tensor with 6 elements in it.

torch.Size([2, 3])
torch.Size([6])


In [38]:
# We can print both of them to see how the elements are arranged:

print(random_tensor)
print(reshaped_tensor) # Elements spread out

tensor([[0.1486, 0.6503, 0.1690],
        [0.0158, 0.4595, 0.9061]])
tensor([0.1486, 0.6503, 0.1690, 0.0158, 0.4595, 0.9061])


Of course you can reshape the reshaped tensor to take it back to the original.

In [42]:
back_to_original = reshaped_tensor.reshape(2, 3)
back_to_original

tensor([[0.1486, 0.6503, 0.1690],
        [0.0158, 0.4595, 0.9061]])
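
When working with batches, it is common to flatten each image but keep the batch dimension. The following is a minimal sketch, assuming a made-up batch of 16 three-band images. Passing `-1` asks `reshape` to work out that extent from the remaining ones, and `torch.flatten` with `start_dim=1` gives the same result.

```python
import torch

torch.manual_seed(248)
batch = torch.rand(16, 3, 64, 64)  # hypothetical N, C, H, W batch

# Keep N = 16 and let reshape infer the rest: 3 * 64 * 64 = 12288
flat = batch.reshape(16, -1)
print(flat.shape)  # torch.Size([16, 12288])

# torch.flatten from dimension 1 onwards does the same thing
same = torch.flatten(batch, start_dim=1)
print(torch.equal(flat, same))  # True
```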

## **`Numpy` `ndarrays` and PyTorch `tensors` exchange**

One may already have data in a numpy ndarray and want to convert it directly to a `tensor`, or vice versa. This is normally very easy to do, as indicated below:

In [46]:
import numpy as np
np.random.seed(248)
np_arrays = np.random.rand(2, 3)
print(np_arrays)

[[0.69198662 0.48503501 0.02913885]
 [0.56996588 0.84630373 0.99216786]]


In [50]:
# Convert to tensor

tensor_from_numpy = torch.from_numpy(np_arrays)
print(tensor_from_numpy)

tensor([[0.6920, 0.4850, 0.0291],
        [0.5700, 0.8463, 0.9922]], dtype=torch.float64)


Note that not only is the conversion successful, but the default numpy data type of float64 is also inherited. In the next cell we show how to convert from `tensor` back to numpy.

In [51]:
back_to_numpy = tensor_from_numpy.numpy()
print(back_to_numpy)

[[0.69198662 0.48503501 0.02913885]
 [0.56996588 0.84630373 0.99216786]]


><font color = "magenta" size = 5>**NOTE:**</font> When converting between `ndarray` and `tensor` with `torch.from_numpy()` and `.numpy()`, the new object shares the same memory as the original. Therefore, any change in the data will affect both. For example:

In [52]:
back_to_numpy[0][0] = 248

print(back_to_numpy)
print(tensor_from_numpy)

[[2.48000000e+02 4.85035006e-01 2.91388539e-02]
 [5.69965875e-01 8.46303728e-01 9.92167859e-01]]
tensor([[2.4800e+02, 4.8504e-01, 2.9139e-02],
        [5.6997e-01, 8.4630e-01, 9.9217e-01]], dtype=torch.float64)
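
If you do not want this sharing, one option is to make a copy. The sketch below uses made-up names (`arr`, `shared`, `copied`) and relies on `torch.tensor()`, which copies the data it is given.

```python
import numpy as np
import torch

arr = np.array([[1.0, 2.0], [3.0, 4.0]])

shared = torch.from_numpy(arr)  # shares memory with arr
copied = torch.tensor(arr)      # independent copy of the data

# Change the numpy array in place
arr[0, 0] = 248.0

print(shared[0, 0].item())  # 248.0, follows the change
print(copied[0, 0].item())  # 1.0, unaffected
```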


## **Exercises**

1. Create a tensor of shape 5 by 7 called `ten` with random integers between 0 and 10. Ensure the dtype is `int16`. Set the manual seed, preferably to **248**, so that you get the same answer as my `tensor`.
2. In the `ten` created, index the element in row 3, column 3 (counting from 1).
  * Print it as a `tensor`
  * Print it as an item
3. Slice the `ten` from rows 3 to 4 and columns 4 to 6.
4. Why will each of the following fail?

In [None]:
a = torch.ones(4, 3, 2)
b = torch.rand(4, 3)

print(a + b)

In [None]:
c = torch.rand(2, 3)
print(a ** c)

In [None]:
d = torch.rand(0)
print(a - d)

RuntimeError: The size of tensor a (2) must match the size of tensor b (0) at non-singleton dimension 2

# **Miscellaneous**
---
### **Tensor attributes**

After creating a `tensor`, we can see some information about it using its attributes. Some of the attributes include size, data type, ndim etc:

In [None]:
m = torch.rand(5, 6)
m.dtype # This tells us that it is float32

torch.float32

In [None]:
m.device # This returns device on which the tensor is located. In this case cpu.

device(type='cpu')

In [None]:
m.shape # Tells us the extent of each dimension of the tensor. Here it is 5 x 6.

torch.Size([5, 6])

In [None]:
m.ndim # Tells us it has 2 dimensions.

2

There are additional attributes like:


*   `requires_grad`
*   `grad`
*   `grad_fn`
*   `is_cuda`
*   `is_sparse`
*   `is_quantized`
*   `is_leaf`
*   `is_mkldnn`

There is no need to dig into them now.
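
For the curious, here is a brief sketch of what a few of these attributes return. It assumes the tensors are created on the CPU (the default), and the names `w` and `out` are made up for illustration.

```python
import torch

m = torch.rand(5, 6)
print(m.requires_grad)  # False, gradients are not tracked by default
print(m.is_cuda)        # False for a tensor on the cpu

# A tensor created directly by the user with requires_grad=True is a leaf
w = torch.rand(5, 6, requires_grad=True)
print(w.is_leaf, w.grad_fn)  # True None

# A tensor produced by an operation records the function that made it
out = (w * 2).sum()
print(out.is_leaf, out.grad_fn is not None)  # False True

# After backward(), the gradient is stored in .grad with the shape of w
out.backward()
print(w.grad.shape)  # torch.Size([5, 6])
```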



## **`Tensor` Operations**
Our task is not only to create `tensor`s, but to work with them. So, in this part, we will cover some of the `tensor` operations. In fact, we did some of them earlier when we were using transforms from `torchvision`. Some of the operations we will do here include slicing portions of the data, combining `tensor`s, splitting `tensor`s and both simple and advanced operations on them.

### **Indexing `tensors`**

We can index a `tensor` using [ ], just like with numpy arrays. In this example, we create a `tensor` then extract the element in the second row and second column.

In [None]:
x = torch.tensor([[1, 2], [3, 4], [5, 6], [7, 8]])
print(x)

tensor([[1, 2],
        [3, 4],
        [5, 6],
        [7, 8]])


Now to index number 4:

In [None]:
x.shape # Confirming the shape of the tensor

torch.Size([4, 2])

In [None]:
print(x[1, 1].item())

4


In [None]:
print(x[1][1].item()) # This can be indexed as so too.

4


To index number 7:

In [None]:
print(x[3, 0].item())

7


In [None]:
print(x[3][0].item()) # .item() returns the actual Python number instead of a tensor.

7
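
The difference between an indexed element and its `.item()` can be seen in the sketch below, which reuses the same values as `x`. It also shows negative indices, which count from the end. The name `element` is made up for illustration.

```python
import torch

x = torch.tensor([[1, 2], [3, 4], [5, 6], [7, 8]])

# Indexing a single position returns a 0-dimensional tensor
element = x[1, 1]
print(type(element), element.ndim)  # <class 'torch.Tensor'> 0

# .item() converts it to a plain Python number
print(type(element.item()))  # <class 'int'>

# Negative indices count from the end: last row, last column
print(x[-1, -1].item())  # 8

# A single index returns the whole row
print(x[1])  # tensor([3, 4])
```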


### **Slicing `tensors`**

We can slice a `tensor` using [ ], just like with numpy arrays. In this example, we reuse the `tensor` created above and slice portions of it.

In [None]:
x.shape

torch.Size([4, 2])

In [None]:
x

tensor([[1, 2],
        [3, 4],
        [5, 6],
        [7, 8]])

In [None]:
x[:2, :] # Slices the first two rows and all columns.

tensor([[1, 2],
        [3, 4]])

In [None]:
x[1:3, :] # Slices from row index 1 up to, but not including, 3 and all columns.

tensor([[3, 4],
        [5, 6]])

In [None]:
x[:, 1] # All rows in column index 1

tensor([2, 4, 6, 8])

In [None]:
x[:, 0] # All rows in column index 0

tensor([1, 3, 5, 7])

In [None]:
x[:, -1] # All rows in the last column (index -1)

tensor([2, 4, 6, 8])

In [None]:
torch.manual_seed(248)
ten = torch.randint(low = 0, high = 10, size = (5, 7), dtype = torch.int16)
ten

tensor([[2, 7, 3, 0, 6, 7, 9],
        [5, 0, 0, 7, 4, 9, 7],
        [5, 1, 6, 7, 0, 0, 1],
        [9, 2, 0, 5, 9, 5, 5],
        [3, 5, 4, 1, 9, 0, 3]], dtype=torch.int16)

In [None]:
ten[2, 2]

tensor(6, dtype=torch.int16)

In [None]:
ten[2, 2].item()

6

In [None]:
ten[2:4, 3:6]

tensor([[7, 0, 0],
        [5, 9, 5]], dtype=torch.int16)
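
Two further points about slicing are sketched below with the same values as `x`: a slice can take a step, and a basic slice is a view that shares memory with the original `tensor`. The names `top` and `independent` are made up for illustration, and `.clone()` is one way to get an independent copy.

```python
import torch

x = torch.tensor([[1, 2], [3, 4], [5, 6], [7, 8]])

# A step of 2 picks every second row: row indices 0 and 2
print(x[::2, :])  # tensor([[1, 2], [5, 6]])

# A basic slice is a view, so writing to it changes x as well
top = x[:2, :]
top[0, 0] = 100
print(x[0, 0].item())  # 100

# clone() makes an independent copy that leaves x untouched
independent = x[:2, :].clone()
independent[0, 0] = -1
print(x[0, 0].item())  # 100
```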

## **References**

https://pytorch.org/docs/stable/tensors.html

https://pytorch.org/tutorials/beginner/introyt/tensors_deeper_tutorial.html