# <font color = 'pickle'>**Lecture 1.1. Introduction to PyTorch Tensors**


# <font color = 'pickle'>**Tensors**

- Tensors are the basic building blocks of any deep learning network.

- They can represent many kinds of data, including images, sound files, and text.

- A tensor of order N is an **N-dimensional array**, a generalization of vectors and matrices.


If N=1, the tensor is a **vector**.
If N=2, the tensor is a **2-d matrix**.
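
As a small illustration of tensor order, the sketch below builds tensors of order 0 to 3 and prints the number of dimensions reported by `dim()`. The variable names are illustrative only.

```python
import torch

scalar = torch.tensor(7)                 # order 0: a single number
vector = torch.tensor([1, 2, 3])         # order 1: a vector
matrix = torch.tensor([[1, 2], [3, 4]])  # order 2: a matrix
cube = torch.zeros(2, 3, 4)              # order 3: e.g. a small stack of matrices

for name, tensor in [("scalar", scalar), ("vector", vector),
                     ("matrix", matrix), ("cube", cube)]:
    print(f"{name}: dim = {tensor.dim()}, shape = {tuple(tensor.shape)}")
```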

Why Tensors and not NumPy arrays?

- NumPy arrays support only CPU computation, whereas tensors can also be placed on a GPU.
- The tensor class supports automatic differentiation.
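
To make the automatic differentiation point concrete, here is a minimal sketch. The function y = x² + 2x is chosen only for illustration; derivatives are covered properly in later lectures.

```python
import torch

# requires_grad=True asks PyTorch to track operations performed on x
x = torch.tensor(3.0, requires_grad=True)
y = x ** 2 + 2 * x      # y = x^2 + 2x

y.backward()            # computes dy/dx and stores it in x.grad
print(x.grad)           # dy/dx = 2x + 2, evaluated at x = 3
```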

**Let us start by importing the PyTorch library and exploring some basic functions on tensors.**

## <font color = 'pickle'>**Importing PyTorch Library** 

In [None]:
import torch
import numpy as np

## <font color = 'pickle'>**Declaring a Tensor**

*torch.Tensor* is a multi-dimensional matrix containing elements of the same data type. 

We can create a Tensor of n dimensions using PyTorch in the following way:
    
    name_of_tensor = torch.Tensor(s1, s2, s3....sn)

In [None]:
t = torch.Tensor(1, 2, 4)

Here, we are creating a tensor "t" with **3 dimensions** and size 1 x 2 x 4.

## <font color = 'pickle'>**Basic functions on Tensors**

In [None]:
# We can get the size of each dimension of a tensor using the size() method or the shape attribute
print(f'size of tensor using size() method: {t.size()}')
print(f'size of tensor using shape attribute: {t.shape}')

In [None]:
# we can pass an argument to size() to get the size of a particular dimension
# for example, t.size(2) gives the size of the third dimension
# indexing in PyTorch starts at zero, so 0 is the first dimension, 1 is the second dimension,
# and so on
t.size(2)

In [None]:
# Printing dimensions of tensor using dim() method
t.dim()

In [None]:
# Printing the values of tensor
# We have not initialized our tensor yet, so it contains arbitrary values (whatever was already in memory).
print(t)

In [None]:
# Printing number of elements of tensor using numel() method
torch.numel(t)

## <font color = 'pickle'>**Initializing a Tensor**

In [None]:
# Initializing a 1-d vector
# initialize tensor using tensor() method
vec1 = torch.tensor([1, 2, 5, 7, 8, 9])

# Printing size and dimensions of 1-d vector vec1
print(f'Size: {vec1.size()}')
print(f'Dimension: {vec1.dim()}')

In [None]:
# initialize tensor using the Tensor() constructor (creates a float32 tensor by default)
vec2 = torch.Tensor([1, 2, 5, 7, 8, 9])

# Printing size and dimensions of 1-d vector vec2
print(f'Size: {vec2.size()}')
print(f'Dimension: {vec2.dim()}')

In [None]:
# Initializing a 2-d matrix
vec2d = torch.tensor([[1, 2, 5], [7, 8, 9]])

# Printing size and dimensions of 2-d matrix vec2d
print(f'Size: {vec2d.size()}')
print(f'Dimension: {vec2d.dim()}')

## <font color = 'pickle'>**Accessing elements of a Tensor**

We can access individual elements of a Tensor using **index values**. Indexing always **starts from 0**.

For example if the tensor is: `[10, 12, 31, 34]` 

Index of 10 is 0, index of 12 is 1 and so on.
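
The short sketch below applies this to the example tensor above. Negative indices, which count from the end, are a standard indexing feature added here for illustration.

```python
import torch

t = torch.tensor([10, 12, 31, 34])
first = t[0]     # element at index 0
second = t[1]    # element at index 1
last = t[-1]     # negative indices count from the end

# item() converts a single-element tensor to a Python number
print(first.item(), second.item(), last.item())
```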

In [None]:
t1 = torch.tensor([[1, 2, 5], [7, 8, 9]])

# Printing all elements
print(t1)

# Get the first row
print(t1[0])

In [None]:
# Get the first element of the second row
print(t1[1][0])

## <font color = 'pickle'>**Accessing a Sub-Tensor**

In [None]:
t1 = torch.tensor([1, 5, 9, 13, 21, 45, 67, 34])

# Specify [from : to : step]
# by default, step is 1
# here, from is inclusive but to is not

# Get a subarray [9, 13, 21, 45]
print(t1[2:6]) # index 2 (i.e 9) is inclusive but index 6 (i.e. 67) is not and step size is 1

# Get a subarray [9, 21]
print(t1[2:6:2]) # step size is 2

In [None]:
# slicing and indexing do not create a copy; the subarray is a view of the original tensor,
# so modifying the subarray modifies the original tensor as well
# (call .clone() on the slice if an independent copy is needed)

t1 = torch.tensor([1, 5, 9, 13, 21, 45, 67, 34])
t2 = t1[2:6]
print('array and subarray before modifying subarray')
print(t1)
print(t2)

# modify subarray

t2[0] = 100
print('\narray and subarray after modifying subarray')
print(t1)
print(t2)

As we can see above, modifying the subarray modifies the original array as well.

In [None]:
t1 = torch.tensor([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]])
t1

In [None]:
# get the sub array [[6,7], [10,11]]
t1[1:3,1:3]

## <font color = 'pickle'>**Some commonly used Tensors**

#### <font color = 'pickle'>**1) Tensor containing all zeros/ all ones/ or any value**

In [None]:
# Tensor containing all zeros, size = 5
z1 = torch.zeros(5)      

# Tensor containing all zeros, size = 2 X 2 X 3
z2 = torch.zeros(2 ,2, 3)

print(z1)
print(z2)

In [None]:
# Tensor containing all ones, size = 7
z1 = torch.ones(7)      

# Tensor containing all ones, size = 1 X 2 X 3
z2 = torch.ones(1, 2, 3)

print(z1)
print(z2)

In [None]:
# We can also use torch.full(size, fill_value) to create a tensor filled with any value
# Tensor containing all fives, size = 1 X 2 X 3

z3 = torch.full(size = (1, 2, 3), fill_value = 5)
print(z3)

#### <font color = 'pickle'>**2) Tensor with elements in a particular range**
Suppose we need a tensor with values `1, 2, 3, 4.....n. ` 

We can simply specify the range and tensor will automatically get filled with these values.

In [None]:
# Creating a tensor with integers from 1 to 5 with step 1: [1, 2, 3, 4, 5]
# syntax: arange(start, end, step) creates a tensor with values in the interval [start, end)
# start is inclusive, end is not, i.e. start <= values < end
tr1 = torch.arange(1, 6) 
print(tr1)

In [None]:
# Creating a tensor with integers from 0 to 10 with step 2 using the "step" parameter: [0, 2, 4, 6, 8, 10]
tr2 = torch.arange(0, 11, step=2)
print(tr2)

We can also use `torch.linspace()` to generate evenly spaced values between two numbers

In [None]:
# Generate 10 evenly spaced values between 0 and 1 (both inclusive)
t1 = torch.linspace(0, 1, 10)
print(t1)

#### <font color = 'pickle'>**3) Tensor with elements from probability distribution**

We can use the randn function to sample elements from the standard normal probability distribution, i.e. the normal distribution with mean = 0 and variance = 1. If we want to sample elements from a normal distribution with a different mean and variance, we should use torch.normal.

In [None]:
torch.manual_seed(0) # for reproducibility, so that we get the same results every time we run this cell
# Sample 500,000 values from standard normal distribution (mean = 0 , variance = 1)
t1 = torch.randn(500000) 

# Sample 500,000 values from normal distribution (mean = 5 , std = 2)
t2 = torch.normal(mean = 5, std = 2, size = (500000,))

In [None]:
print('Mean and std of tensor using torch.randn')
print(torch.mean(t1))
print(torch.std(t1))

print('\nMean and std of tensor using torch.normal')
print(torch.mean(t2))
print(torch.std(t2))

In [None]:
torch.manual_seed(0) # for reproducibility, so that we get the same results every time we run this cell
t1 = torch.randn(5, 2) # we sample 10 values from the standard normal distribution; (5, 2) is the shape
t1

We can also sample from other distributions using functions such as torch.rand (uniform on [0, 1)) and torch.randint (uniformly distributed integers).
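
A brief sketch of these two functions follows; the shapes and the integer range are arbitrary choices for illustration.

```python
import torch

torch.manual_seed(0)

# torch.rand samples from the uniform distribution on [0, 1)
u = torch.rand(3, 2)

# torch.randint samples integers uniformly from [low, high)
r = torch.randint(low=0, high=10, size=(3, 2))

print(u)
print(r)
```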

### <font color = 'pickle'>**4) Empty Tensor**
We can create uninitialized tensors using torch.empty.
<br> Note: as seen earlier, we can use torch.Tensor(shape) without passing any elements to create an uninitialized tensor. For code clarity, it is better to create uninitialized tensors using torch.empty.

In [None]:
# create empty tensor of shape (2, 4)
empty_tensor = torch.empty(2, 4)
empty_tensor

### <font color = 'pickle'>**5) Commonly used tensors based on shape of other tensors**

We can also use `torch.zeros_like(input)`, `torch.ones_like(input)`, `torch.full_like(input, fill_value)` and `torch.empty_like(input)` to create tensors with the same shape (and, by default, the same dtype and device) as another tensor.




In [None]:
input_tensor = torch.arange(6).view(2, 3)
input_tensor.shape

In [None]:
print(torch.ones_like(input_tensor))
print(torch.zeros_like(input_tensor))
print(torch.full_like(input_tensor, 5))
print(torch.empty_like(input_tensor))

### <font color = 'pickle'>**6) Identity Matrix**

An identity matrix is a square matrix with 1's along the diagonal and zeros everywhere else.

In [None]:
# Identity matrix of size 3
id_matrix = torch.eye(3)
print(id_matrix)

In [None]:
# Identity matrix of size 5
id_matrix = torch.eye(5)
print(id_matrix)

## <font color = 'pickle'>**Changing the shape of tensors**

### <font color = 'pickle'>**1) Reshape**

If we want to change the shape of our tensor, without affecting the elements present, we can use the ***reshape*** function.

In [None]:
# Initializing a tensor with 10 elements from 0 to 9
t = torch.arange(10)
print(t)

# Changing the shape of tensor t from (10,) to (2, 5)
tr = t.reshape(2, 5)
print(tr)

If we want to specify only one dimension and let reshape calculate the other, we can pass `-1` for the dimension to be inferred.

For 2 rows, we will write `reshape(2,-1)`

For 5 columns, we will write `reshape(-1,5)`. 

In [None]:
# Changing the shape of tensor t from 1 row to 2 rows
tr1 = t.reshape(2, -1)
print(tr1)

In [None]:
# Changing the shape of tensor t from 10 columns to 5 columns
tr2 = t.reshape(-1, 5)
print(tr2)

### <font color = 'pickle'>**2) View**

We can also create a view of an existing tensor with `view`. It performs the same operation as reshape. The difference is that `view` never creates a copy: the returned tensor always shares memory with the original, which keeps computations fast and memory-efficient. `reshape`, in contrast, returns a view when possible and a copy otherwise. There's a good discussion of the differences [here](https://stackoverflow.com/questions/49643225/whats-the-difference-between-reshape-and-view-in-pytorch).

Quoting the discussion linked above: " *Another difference is that reshape() can operate on both contiguous and non-contiguous tensor while view() can only operate on contiguous tensor* "

[Definition of contiguous](https://stackoverflow.com/questions/26998223/what-is-the-difference-between-contiguous-and-non-contiguous-arrays/26999092#26999092)
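
The following sketch illustrates the contiguity point under simple assumptions: a transposed 2-d tensor serves as the non-contiguous example, and `data_ptr()` (the address of a tensor's first element) is used to check whether memory is shared.

```python
import torch

base = torch.arange(6).reshape(2, 3)   # contiguous tensor
transposed = base.t()                  # same data, but non-contiguous layout

# On a contiguous tensor, reshape returns a view: the data pointer is unchanged
shares_memory = base.reshape(3, 2).data_ptr() == base.data_ptr()

# On this non-contiguous tensor, view(-1) fails ...
try:
    transposed.view(-1)
    view_failed = False
except RuntimeError:
    view_failed = True

# ... while reshape(-1) succeeds by copying the data into new memory
flat = transposed.reshape(-1)
copied = flat.data_ptr() != base.data_ptr()

print(shares_memory, view_failed, copied)
```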


In [None]:
# Initializing a tensor with 10 elements from 0 to 9
t = torch.arange(10)
print(t)
# Changing the shape of tensor t from (10,) to (2, 5)
t.view(2, 5)

Views can reflect changes from the base tensor.

In [None]:
# Create a view of tensor t
tr = t.view(2, 5)

# Before change in base tensor
print(f'before changing the base tensor\n{tr}')

# Modifying element of base tensor
t[0] = 67

# After change in base tensor
print(f'\nafter changing the base tensor\n {tr}')


In [None]:
# we can use -1 with view as well. 
t = torch.rand((4, 5))
t1 = t.view(2, -1)
print(t1.shape)

In [None]:
# we can also flatten the tensor (convert the tensor to one dimensional tensor) by using view(-1)
# this gives the same result as method flatten()
t2 = t.view(-1)
t3 = t.flatten()
print(t2.shape)
print(t3.shape)

### <font color = 'pickle'>**3) Adding and removing dimensions of size 1**
- Insert a dimension of size 1 at a specific location (specified by dim) using `tensor.unsqueeze(dim)`
- Remove a dimension of size 1 at a specific location (specified by dim) using `tensor.squeeze(dim)`
- Remove all dimensions of size 1 using `tensor.squeeze()`
- Insert a dimension of size 1 by indexing with `None`
- Remove a dimension of size 1 by indexing with `0`

In [None]:
# Initialize a tensor
t1 = torch.ones(2, 2)
print(t1)
t1.shape

In [None]:
# add dimension of size 1 in the beginning using unsqueeze method and argument dim = 0
t1 = t1.unsqueeze(dim=0)
print(t1)
t1.shape

In [None]:
# add a dimension of size 1 at the end using the unsqueeze method and dim = 3
t1 = t1.unsqueeze(dim=3)
print(t1)
print(t1.shape)

In [None]:
# We can add a new dimension at any position
t1 = torch.arange(20).view(2, 2, 5)
print(t1.shape)
t1 = t1.unsqueeze(dim=1)
print(t1.shape)

In [None]:
# we can also index with None to add dimensions of size 1 at multiple locations
t1 = t1[:, :, :, None, :, None]
print(t1.shape)

In [None]:
# Remove a dimension of size 1 at a specific location using torch.squeeze(dim)
t1 = t1.squeeze(dim=1)
print(t1.shape)

In [None]:
# Remove a dimension of size 1 at a specific location by indexing with 0
t1 = t1[:, :, 0]
print(t1.shape)

In [None]:
# Removing all dimensions of size 1 using torch.squeeze()
t1 = t1.squeeze()
print(t1.shape)

### <font color = 'pickle'>**4) Adopting shape of other tensors**
We can use `view_as(other)` to give a tensor the same shape as another tensor.

In [None]:
a = torch.arange(10).view(2, 5)
# create a tensor b filled with ones (10 elements) that has the same shape as a
b = torch.ones(10).view_as(a)
print(a.shape)
print(b.shape)

### <font color = 'pickle'>**5) Permute**

The permute function reorders the axes (dimensions) of a tensor according to the desired ordering and returns a view of the original tensor with its axes rearranged.

The number of elements in the returned tensor is the same as in the original; only the order of the dimension sizes changes.

Let us consider an example:

If the size of a tensor is (2, 3, 4),

- First size is 2
- Second size is 3
- Third size is 4

Now, in case of permute we will just change the ordering of the sizes. Thus if we write permute(0, 2, 1) the new tensor will have:

- First size is 2 (1st size of previous)
- Second size is 4 (3rd size of previous)
- Third size is 3 (2nd size of previous)

PyTorch's permute() function only permutes, or in other words shuffles, the order of the axes of a tensor, whereas view() reshapes the tensor by reducing or expanding the size of each dimension.


In [None]:
# Initialize a tensor and print its size and elements
torch.manual_seed(0)
t1 = torch.randint(0, 10, size =(2, 4))
print(t1.size())
print(t1)

In [None]:
t1.storage()

In [None]:
t1.is_contiguous()

In [None]:
t1.stride()

In [None]:
# Permute the tensor and print its size and elements
t1_p = t1.permute(1, 0)
print(t1_p.size())
print()
print(t1_p)

In [None]:
t1_p.storage()

In [None]:
t1_p.stride()

In [None]:
t1_p.is_contiguous()

In [None]:
# view() raises a RuntimeError here because t1_p is not contiguous
t1_p.view(2, 4)

In [None]:
# reshape() works on non-contiguous tensors; it copies the data when a view is not possible
t1_p.reshape(2, 4)

In [None]:
# Initialize a tensor and print its size and elements
torch.manual_seed(0)
t2 = torch.rand(2, 3, 4)
print(t2.size())
print(f'\n{t2}')

In [None]:
# Permute the tensor and print its size and elements - use permute(0, 2, 1)
t2_p = t2.permute(0, 2, 1)
print(t2_p.size())
print(f'\n{t2_p}')

In [None]:
# difference between permute and view
x = torch.arange(3*2).view(2, 3)
print(x)
# create a view (3, 2)
print(x.view(3, 2))
# permute axis(1, 0)
print(x.permute(1, 0))

Question and answer taken from following reference: <br>
https://discuss.pytorch.org/t/different-between-permute-transpose-view-which-should-i-use/32916

- (1) If I have a feature size of BxCxHxW, I want to reshape it to BxCxHW. Which one is a good option?
- (2) If I have a feature size of BxCxHxW, I want to change it to BxCxWxH . Which one is a good option?
- (3) If I have a feature size of BxCxH, I want to change it to BxCxHx1 . Which one is a good option?

Solution:
- permute changes the order of dimensions aka axes, so 2 would be a use case. Transpose is a special case of permute, use it with 2d tensors.
- view can combine and split axes, so 1 and 3 can use view,
- note that view can fail for noncontiguous layouts (e.g. crop a picture using indexing), in these cases reshape will do the right thing,
- for adding dimensions of size 1 (case 3), there also are unsqueeze and indexing with None.
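
The three cases above can be sketched as follows. The sizes B, C, H and W are arbitrary values picked for illustration.

```python
import torch

B, C, H, W = 2, 3, 4, 5          # illustrative sizes
features = torch.rand(B, C, H, W)

# (1) BxCxHxW -> BxCxHW: merge the last two axes with view
merged = features.view(B, C, H * W)

# (2) BxCxHxW -> BxCxWxH: swap the last two axes with permute
swapped = features.permute(0, 1, 3, 2)

# (3) BxCxH -> BxCxHx1: append an axis of size 1 with unsqueeze
x = torch.rand(B, C, H)
expanded = x.unsqueeze(-1)

print(merged.shape, swapped.shape, expanded.shape)
```

Indexing with `None`, as in `x[..., None]`, gives the same result as `x.unsqueeze(-1)` in case (3).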


## <font color = 'pickle'>**Operations on tensors of same size**
We can call element-wise operations on any two tensors of the same shape.

In [None]:
x = torch.tensor([1.0, 2, 4, 8])
y = torch.tensor([2, 2, 2, 2])
x + y, x - y, x * y, x / y, x ** y  # The ** operator is exponentiation

## <font color = 'pickle'>**Broadcasting - Operations on tensors of different  size**

Broadcasting describes how tensors of different shapes are treated during arithmetic operations. If the tensors have different sizes, the smaller tensor is broadcast across the larger one so that they have compatible shapes.





### <font color = 'pickle'>**1) Understanding how broadcasting works**

* The following image shows how a 1-dimensional tensor is added to a 2-dimensional tensor
<img src="https://drive.google.com/uc?export=view&id=1QG2GO1owGpyXbcugJFVFGb4o_buV4s3j" width="500"/>

In [None]:
# create tensor
t2 = torch.tensor([[12, 16, 14], [13, 17, 13], [14, 18, 12]])
t1 = torch.tensor([1, 2, 3])
print(t2.shape)
print(t1.shape)

t1 has one dimension and t2 has two dimensions. PyTorch first prepends a dimension of size 1 to the shape of t1 so that both tensors have the same number of dimensions.

In [None]:
# we can index with None to add a dimension of size 1
t1_mod = t1[None,] 
print(t1_mod)
print(t1_mod.shape)

PyTorch then stretches the smaller tensor along its dimensions of size 1 so that it has the same shape as the larger tensor.

In [None]:
# here we are stretching the tensor along first dimension
# we will repeat the same row thrice to generate tensor of size (3, 3)
t1_mod_rep = t1_mod.repeat(3, 1) # The number of times to repeat this tensor along each dimension
t1_mod_rep

In [None]:
print(t1_mod_rep + t2)

In [None]:
# we can check that it gives us the same result if we simply add t1 and t2
# so broadcasting is an efficient way of performing operations on tensors of unequal sizes
print(t1 + t2)

### <font color = 'pickle'>**2) Rules for Broadcasting**</font>
Broadcasting can only happen if the two tensors are broadcastable. Conditions for broadcasting:

- Each tensor has at least one dimension.

- When iterating over the dimension sizes, starting at the trailing dimension, the dimension sizes must either be equal, one of them is 1, or one of them does not exist.

Examples:

1. `t1(5, 8, 10) t2(5, 8, 10)`
Same size -> Broadcasting possible.
2. `t1((0,)) t2(5, 8, 10)`
t1 is an empty tensor of shape (0,): at the trailing position the sizes 0 and 10 are not equal and neither is 1 -> Broadcasting not possible.
3. `t1(5, 8, 10, 1) t2(8, 1, 1)` Broadcasting possible. Reasons:
  - 1st trailing position: both have size 1
  - 2nd trailing position: t2 has size 1
  - 3rd trailing position: both have size 8
  - 4th trailing position: t2 has no dimension here, but t2 has at least 1 dimension.

In [None]:
# Broadcasting
t1 = torch.empty(5, 8, 10, 1)
t2 = torch.empty(8, 1, 1)
(t1 + t2).size()

The dimensions after broadcasting will be:

- If the number of dimensions is not equal, prepend 1 to the dimensions of the tensor with fewer dimensions to make them equal length.

- Then, for each dimension size, the resulting dimension size is the max of the sizes along that dimension.
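
The two steps above can be sketched as a small pure-Python helper. `broadcast_shape` is a hypothetical function written for this illustration, not part of PyTorch; its result is compared against the shape PyTorch itself produces.

```python
import torch

def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError if incompatible."""
    ndim = max(len(shape_a), len(shape_b))
    # Step 1: prepend 1s so that both shapes have the same length
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    result = []
    for size_a, size_b in zip(a, b):
        # Sizes are compatible if they are equal or one of them is 1
        if size_a != size_b and size_a != 1 and size_b != 1:
            raise ValueError(f"sizes {size_a} and {size_b} are not broadcastable")
        # Step 2: take the size that is not 1 (the max, for non-empty tensors)
        result.append(size_b if size_a == 1 else size_a)
    return tuple(result)

print(broadcast_shape((5, 8, 10, 1), (8, 1, 1)))
print(broadcast_shape((1,), (3, 1, 7)))

# Compare with the shape PyTorch produces for the second pair
print(tuple((torch.empty(1) + torch.empty(3, 1, 7)).shape))
```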

In [None]:
# Another example for broadcasting
t1 = torch.empty(1)
t2 = torch.empty(3, 1, 7)
(t1 + t2).size()

In [None]:
# Example where broadcasting is not possible
t1 = torch.empty(5, 8, 10, 1)
t2 = torch.empty(   3, 1, 1)
(t1 + t2).size()

Here, at the third trailing position the sizes (8 and 3) are not equal and neither of them is 1, thus broadcasting is not possible and the addition raises an error.

## <font color = 'pickle'>**Conversion to other Python Objects**

In [None]:
# Initializing a tensor
t = torch.arange(10)

# Converting tensor t to numpy array using numpy() method
arr = t.numpy()

# Converting numpy array to tensor T using tensor() method
T = torch.tensor(arr)

# Printing data type of arr and T
type(arr), type(T)

We can also use torch.from_numpy() and torch.as_tensor() to convert a NumPy array to a PyTorch tensor. With these functions, the PyTorch tensor and the source NumPy array share the same memory (for torch.as_tensor(), whenever no dtype or device conversion is needed), so changes to one affect the other. In contrast, the torch.tensor() function always makes a copy.

In [None]:
my_ndarray = np.arange(10)
t_from_numpy = torch.from_numpy(my_ndarray)
t_as_tensor = torch.as_tensor(my_ndarray)
t_Tensor = torch.tensor(my_ndarray)

print(f"tensor created using torch.from_numpy before changing np array: {t_from_numpy}")
print(f"tensor created using torch.as_tensor before changing np array : {t_as_tensor}")
print(f"tensor created using torch.tensor before changing np array    : {t_Tensor}")

# change numpy array
my_ndarray[2] = 1000

print()
print(f"tensor created using torch.from_numpy after changing np array: {t_from_numpy}")
print(f"tensor created using torch.as_tensor after changing np array : {t_as_tensor}")
print(f"tensor created using torch.tensor after changing np array    : {t_Tensor}")

In [None]:
# Initializing a size-1 tensor
t = torch.tensor([10.5])

# Printing tensor
print(t)

# Accessing element of tensor using item function
# item returns the value of the tensor as python number
# works only for tensors with single element

print(t.item())

## <font color = 'pickle'>**Changing datatype of Tensors**
When creating a tensor, we can pass the dtype as an argument. We can also change the datatype of a tensor using the to() and type() methods. For a list of dtypes, visit https://pytorch.org/docs/stable/tensor_attributes.html#torch.torch.dtype

In [None]:
x = torch.tensor([8, 9, -3], dtype=torch.int)

# we can use type() method or to() method to change the datatype
print(f'Old: {x.dtype}')

# change the datatype to int64 using type() method
x = x.type(dtype=torch.int64)
print(f'New: {x.dtype}')

# change the datatype to int32 using to() method
x = x.to(dtype=torch.int32)
print(f'Newer: {x.dtype}')

## <font color = 'pickle'>**Concatenating Tensors**

We can use `torch.cat((tensors_to_concatenate), dim)` to concatenate tensors.

The tensors must have the same shape (except in the concatenating dimension).

In [None]:
# create three tensors to concatenate
x1 = torch.randint(low=0, high=10, size = (2, 5))
x2 = torch.ones(4, 5)
x3 =  torch.zeros(2, 3)

# The tensors must have the same shape (except in the concatenating dimension)
# x1 and x2 have the same shape except for dim = 0, hence we can concatenate these along dim = 0
# x1 and x3 have the same shape except for dim = 1, hence we can concatenate these along dim = 1
# we cannot concatenate x2 and x3 along any dimension

x1_x2 = torch.cat((x1, x2), dim = 0) 
x1_x3 = torch.cat((x1, x3), dim = 1)
print(f'shape of x1_x2 is {x1_x2.shape}')
print(f'shape of x1_x3 is {x1_x3.shape}')
print(f'\nx1_x2:\n{x1_x2}')
print(f'\nx1_x3:\n{x1_x3}')

## <font color = 'pickle'>**Saving Memory - inplace operations**

In-place operations are operations that change the content of a given tensor without making a copy.

Operations that have a `_` suffix are in-place. For example: `.add_()`. Operations like += or *= are also inplace operations.

We can also perform an in-place operation using the notation `Z[:] = <expression>`.

As in-place operations do not make a copy, they can save memory. However, we need to use them carefully. They can be problematic when computing derivatives because of an immediate loss of history. We will learn about derivatives and computation graphs in coming lectures.
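
The sketch below runs on the CPU and compares the three in-place forms mentioned above with an out-of-place update. It uses `data_ptr()`, which returns the address of the tensor's first element, as the check for whether the data stayed in the same memory. The tensor `Z` and its values are illustrative.

```python
import torch

Z = torch.ones(3)
address = Z.data_ptr()      # address of the first element of Z's data

Z.add_(1)                   # in-place method (trailing underscore)
Z += 1                      # in-place operator
Z[:] = Z * 2                # result is written into Z's existing memory
                            # (the temporary Z * 2 is still allocated first)
same_after_inplace = Z.data_ptr() == address

Z = Z + 1                   # out-of-place: allocates a new tensor and rebinds Z
same_after_out_of_place = Z.data_ptr() == address

print(Z, same_after_inplace, same_after_out_of_place)
```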

### <font color = 'pickle'>**1) Checking gpu**

In [None]:
# check if gpu is available
device = torch.device('cuda:0' if torch.cuda.is_available() else "cpu")
print(device)

In [None]:
# create a tensor
X = torch.tensor([1, 2, 3, 4])

In [None]:
# check the device attribute of the tensor
X.device

In [None]:
# move the tensor to gpu
# to() returns a new tensor rather than moving X in place, so we assign the result back
X = X.to(device = device)
X

In [None]:
# it is more efficient to create the tensor on gpu directly
Y = torch.tensor([1, 2, 3], device = device)

In [None]:
# check the device attribute of the tensor
Y.device

### <font color = 'pickle'>**2) Memory allocation of in-place operations**

In [None]:
# create tensor
t1 = torch.randn(10000, 10000, device = 'cpu')

# move tensor to gpu
t1 = t1.to(device)
print(t1.device)

# we can use the id() function to get the identity (the memory address, in CPython) of the Python tensor object
print(f'initial memory location of tensor t1 is : {id(t1)}')

x = t1
print(f'initial memory location of x is : {id(x)}')

# Waits for everything to finish running
torch.cuda.synchronize()

# initial memory allocated
start_memory = torch.cuda.memory_allocated()

# inplace operation
# t1 = t1 + 0.1
# t1.add(0.1)
t1 +=  0.1
t1.add_(0.1)
# since the operation was inplace when we update t1 it will update x as well
print(x == t1)

print(f'final memory location of tensor t1 is: {id(t1)}')
print(f'final location of x is : {id(x)}')

# total memory allocated after function call
end_memory = torch.cuda.memory_allocated()

# memory allocated because of function call
memory_allocated = end_memory - start_memory
print(memory_allocated/1024**2)

From the above example we can see that both x and t1 have the same memory location. When we use an in-place operation on t1, it also updates x.

### <font color = 'pickle'>**3) Memory allocation of out-of-place operations**

In [None]:
# create tensor
t2 = torch.randn(10000, 10000, device = 'cpu')

# move tensor to gpu
t2 = t2.to(device)
print(t2.device)

# we can use id() function to get memory location of tensor
print(f'initial memory location of tensor t2 {id(t2)}')

y = t2
print(f'initial memory location of y is : {id(y)}')

# Waits for everything to finish running
torch.cuda.synchronize()

# initial memory allocated
start_memory = torch.cuda.memory_allocated()

# out-of-place operation
t2 = t2 + 0.1

# since the operation was not inplace when we update t2 it will not update y
print(y == t2)

# we can use id() function to get memory location of tensor
print(f'final memory location of tensor t2 {id(t2)}')
print(f'final memory location of y is : {id(y)}')

# total memory allocated after function call
end_memory = torch.cuda.memory_allocated()

# memory allocated because of function call
memory_allocated = end_memory - start_memory
print(memory_allocated/1024**2)


From the above example we can see that initially both y and t2 have the same memory location. After running t2 = t2 + 0.1, id(t2) points to a different location. That is because Python first evaluates t2 + 0.1, allocating new memory for the result, and then makes t2 point to this new location in memory. Since the operation was not in-place, updating t2 does not affect y; y still points to the original memory location.

## <font color = 'pickle'>**Masks using binary tensors** 

In [None]:
# create a tensor which has probabilities of events
prob = torch.tensor([0.7, 0.4, 0.6, 0.2, 0.8, 0.1])

# Binary tensors
print(prob > 0.5)
print(prob <= 0.5)

In [None]:
# create output tensor where output = 1 if prob >0.5 and 0 otherwise
# create an empty output tensor of same shape as prob
output = torch.empty_like(prob)

# update output tensor using the binary mask
output[prob > 0.5] = 1
output[prob <= 0.5] = 0
print(output)
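
As a possible alternative to filling an empty tensor, the same 0/1 output can be produced directly from the mask. This sketch reuses the `prob` values from above; the other names are illustrative.

```python
import torch

prob = torch.tensor([0.7, 0.4, 0.6, 0.2, 0.8, 0.1])
mask = prob > 0.5

# Option 1: cast the boolean mask to the dtype of prob (True -> 1, False -> 0)
output_cast = mask.to(prob.dtype)

# Option 2: choose element-wise between two tensors based on the mask
output_where = torch.where(mask, torch.ones_like(prob), torch.zeros_like(prob))

# A mask can also select the elements where it is True
selected = prob[mask]

print(output_cast)
print(output_where)
print(selected)
```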