# PyTorch Tutorial - Part 1
This set of tutorials is only a supplement to the official tutorials available at [https://pytorch.org/tutorials/](https://pytorch.org/tutorials/). I strongly recommend that you go through the official tutorials as well.

Let's begin. First, make sure the runtime type chosen in the Colab environment is GPU. To do that, open the Runtime menu on the toolbar above and set the runtime type to GPU.

To work with PyTorch, we have to import the module named 'torch'.

Let's first check the version of PyTorch loaded.

In [1]:
import torch
import numpy as np
torch.__version__

'1.1.0'

So, the latest stable version at the time of writing (1.1.0) is loaded.

As stated in the official docs, PyTorch is a Python-based scientific computing package targeted at two sets of audiences:

 - A replacement for NumPy to use the power of GPUs
 - A deep learning research platform that provides maximum flexibility and speed
 
Similar to ndarrays in NumPy, the basic entity in PyTorch is the Tensor. Tensors are similar to ndarrays, but they can also be moved to GPUs and manipulated there.

So, let's create tensors.

In [2]:
# Use torch.tensor(data, dtype=None, device=None, requires_grad=False, pin_memory=False) to create a tensor.

t0 = torch.tensor(data = 2, dtype = torch.uint8, device = 'cpu') # a zero dimensional tensor
print(f'Tensor {t0} of type {type(t0)} with size {t0.size()} and data type {t0.dtype} residing on {t0.device}. It has {t0.dim()} dimensions.')

t1 = torch.tensor(data = [1, 2], dtype = torch.float, device = 'cpu') # a 1 dimensional tensor
print(f'\nTensor {t1} of type {type(t1)} with size {t1.size()} and data type {t1.dtype} residing on {t1.device}. It has {t1.dim()} dimension.') 

t2 = torch.tensor(data = ((1, 2), (3, 4)), dtype = torch.int32) # a 2 dimensional tensor
print(f'\nTensor {t2} of type {type(t2)} with size {t2.size()} and data type {t2.dtype} residing on {t2.device}. It has {t2.dim()} dimensions.')

import numpy as np
t3 = torch.tensor(data = np.array([[[1, -2, 3.0], [4, -5.2, 6.1]], [[1, -1, .2], [0, .9, -2.3]]]), dtype = torch.double) # a 3 dimensional tensor
print(f'\nTensor {t3} of type {type(t3)} with size {t3.size()} and data type {t3.dtype} residing on {t3.device}. It has {t3.dim()} dimensions.')

t4 = torch.tensor(data = []) # an empty tensor. Its size is 0.
print(f'\nTensor {t4} of type {type(t4)} with size {t4.size()} and data type {t4.dtype} residing on {t4.device}. It has {t4.dim()} dimension.')

Tensor 2 of type <class 'torch.Tensor'> with size torch.Size([]) and data type torch.uint8 residing on cpu. It has 0 dimensions.

Tensor tensor([1., 2.]) of type <class 'torch.Tensor'> with size torch.Size([2]) and data type torch.float32 residing on cpu. It has 1 dimension.

Tensor tensor([[1, 2],
        [3, 4]], dtype=torch.int32) of type <class 'torch.Tensor'> with size torch.Size([2, 2]) and data type torch.int32 residing on cpu. It has 2 dimensions.

Tensor tensor([[[ 1.0000, -2.0000,  3.0000],
         [ 4.0000, -5.2000,  6.1000]],

        [[ 1.0000, -1.0000,  0.2000],
         [ 0.0000,  0.9000, -2.3000]]], dtype=torch.float64) of type <class 'torch.Tensor'> with size torch.Size([2, 2, 3]) and data type torch.float64 residing on cpu. It has 3 dimensions.

Tensor tensor([]) of type <class 'torch.Tensor'> with size torch.Size([0]) and data type torch.float32 residing on cpu. It has 1 dimension.


From the above examples, you should be clear about how data is supplied to torch.tensor (as a scalar, list, tuple, ndarray, etc.), the type of the tensor created (an object of class torch.Tensor), the data type of the tensor (as specified by the dtype attribute), the size of the tensor (as returned by the size() method) and the device on which the tensor resides (currently the CPU). So far, you should not find any difference between a tensor and a NumPy array. However, a tensor has a special attribute called requires_grad, and tensors can also reside on a GPU. We will see more about the requires_grad attribute and about moving tensors to the GPU very soon.
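When the dtype argument is omitted, torch.tensor infers the data type from the data it is given. The short sketch below illustrates this; the variable names are chosen only for illustration, and the float32 result assumes the default floating-point type has not been changed.

```python
import numpy as np
import torch

# Python ints are inferred as 64-bit integers.
t_int = torch.tensor([1, 2, 3])
# Python floats are inferred as the default floating type (float32 unless changed).
t_float = torch.tensor([1.0, 2.0, 3.0])
# A NumPy float64 array keeps its dtype when no dtype argument is given.
t_np = torch.tensor(np.array([1.0, 2.0, 3.0]))

print(t_int.dtype)    # torch.int64
print(t_float.dtype)  # torch.float32
print(t_np.dtype)     # torch.float64

# requires_grad is False unless it is requested explicitly.
print(t_float.requires_grad)  # False
```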



## Exercise
Create a 3-dimensional tensor of some size, filled with an appropriate number of long integers, residing on the CPU.

In [3]:
# do the exercise here
t = torch.tensor(data=[[[i for i in range(2)]]], dtype=torch.long, device='cpu')
print(f' Tensor {t} of type {type(t)} with size {t.size()} and data type {t.dtype} residing on {t.device}. It has {t.dim()} dimensions.')

 Tensor tensor([[[0, 1]]]) of type <class 'torch.Tensor'> with size torch.Size([1, 1, 2]) and data type torch.int64 residing on cpu. It has 3 dimensions.


In [4]:
# other ways of creating tensor
t0 = torch.zeros((2, 3), dtype = torch.uint8, device = 'cpu')
t1 = torch.ones((2, 3), dtype = torch.uint8)
t2 = torch.arange(start = 2.0, end = 5.0, step = 0.5)
t3 = torch.linspace(start = 2, end = 5, steps = 100)
t4 = torch.eye(3, dtype = torch.uint8)
t5 = torch.empty((2, 3)) # a 2x3 tensor whose memory is uninitialized, so it holds arbitrary values
t6 = torch.empty_like(t0, dtype = torch.float32) # has same size as t0
t7 = torch.full(size = (2, 3), fill_value = 3.5, dtype = torch.int) # observe the result printed carefully
t8 = t7.new_tensor([[7, 8, 9, 13], [10, 11, -12, -14]]) #has same dtype as t7 but size is different; it is (2, 4)

for i in range(8):
    s = 't'+str(i)
    print(f't{i}: {vars()[s]}\n')


#convert from numpy to tensor
a = np.array([1, 2, 3])
t8 = torch.from_numpy(a) # t8 and a share same cpu memory 
t8[0] = -1 
print(f't8: {t8} residing in {t8.device}')
print(f'a: {a}') # a is also modified


t0: tensor([[0, 0, 0],
        [0, 0, 0]], dtype=torch.uint8)

t1: tensor([[1, 1, 1],
        [1, 1, 1]], dtype=torch.uint8)

t2: tensor([2.0000, 2.5000, 3.0000, 3.5000, 4.0000, 4.5000])

t3: tensor([2.0000, 2.0303, 2.0606, 2.0909, 2.1212, 2.1515, 2.1818, 2.2121, 2.2424,
        2.2727, 2.3030, 2.3333, 2.3636, 2.3939, 2.4242, 2.4545, 2.4848, 2.5152,
        2.5455, 2.5758, 2.6061, 2.6364, 2.6667, 2.6970, 2.7273, 2.7576, 2.7879,
        2.8182, 2.8485, 2.8788, 2.9091, 2.9394, 2.9697, 3.0000, 3.0303, 3.0606,
        3.0909, 3.1212, 3.1515, 3.1818, 3.2121, 3.2424, 3.2727, 3.3030, 3.3333,
        3.3636, 3.3939, 3.4242, 3.4545, 3.4848, 3.5152, 3.5455, 3.5758, 3.6061,
        3.6364, 3.6667, 3.6970, 3.7273, 3.7576, 3.7879, 3.8182, 3.8485, 3.8788,
        3.9091, 3.9394, 3.9697, 4.0000, 4.0303, 4.0606, 4.0909, 4.1212, 4.1515,
        4.1818, 4.2121, 4.2424, 4.2727, 4.3030, 4.3333, 4.3636, 4.3939, 4.4242,
        4.4545, 4.4848, 4.5152, 4.5455, 4.5758, 4.6061, 4.6364, 4.6667, 4.6970,
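The memory sharing shown in the cell above for torch.from_numpy does not apply to torch.tensor, which copies its input. A minimal sketch of the difference follows; the names arr, shared and copied are illustrative only.

```python
import numpy as np
import torch

arr = np.array([1, 2, 3])
shared = torch.from_numpy(arr)  # shares CPU memory with arr
copied = torch.tensor(arr)      # copies the data into new memory

arr[0] = -1  # modify the NumPy array after both tensors exist

print(shared)  # first element follows the change made to arr
print(copied)  # still holds the original values
```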
        

## Exercise
Create a new tensor of the same data type as t7 defined above but with size 10 by 10, filled with the value -2. It should reside on the CPU. Hint: use the method [new_full](https://pytorch.org/docs/stable/tensors.html#torch.Tensor) in the Tensor class.

In [5]:
#complete exercise here
# new_full(size, fill_value, dtype=None, device=None, requires_grad=False) → Tensor
t9 = t7.new_full(size=(10, 10), fill_value=-2, device='cpu') # dtype is inherited from t7
print(f' Tensor {t9} of type {type(t9)} with size {t9.size()} and data type {t9.dtype} residing on {t9.device}. It has {t9.dim()} dimensions.')

 Tensor tensor([[-2, -2, -2, -2, -2, -2, -2, -2, -2, -2],
        [-2, -2, -2, -2, -2, -2, -2, -2, -2, -2],
        [-2, -2, -2, -2, -2, -2, -2, -2, -2, -2],
        [-2, -2, -2, -2, -2, -2, -2, -2, -2, -2],
        [-2, -2, -2, -2, -2, -2, -2, -2, -2, -2],
        [-2, -2, -2, -2, -2, -2, -2, -2, -2, -2],
        [-2, -2, -2, -2, -2, -2, -2, -2, -2, -2],
        [-2, -2, -2, -2, -2, -2, -2, -2, -2, -2],
        [-2, -2, -2, -2, -2, -2, -2, -2, -2, -2],
        [-2, -2, -2, -2, -2, -2, -2, -2, -2, -2]], dtype=torch.int32) of type <class 'torch.Tensor'> with size torch.Size([10, 10]) and data type torch.int32 residing on cpu. It has 2 dimensions.
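The new_* family of methods creates tensors that inherit the dtype and device of the tensor they are called on, unless these are overridden. The sketch below illustrates this with a stand-in tensor named base (an illustrative name, not one used elsewhere in this tutorial).

```python
import torch

base = torch.full((2, 3), 3, dtype=torch.int32)

filled = base.new_full((4, 4), -2)  # inherits int32 and the device of base
zeros = base.new_zeros((2, 2))      # same dtype and device, filled with 0
ones = base.new_ones((5,))          # same dtype and device, filled with 1

for name, t in [("filled", filled), ("zeros", zeros), ("ones", ones)]:
    print(name, tuple(t.shape), t.dtype, t.device)
```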


We will now do some operations on tensors. 

In [6]:
# operations on Tensors

t0 = torch.tensor(data = [[1., 2.], [3., 4.]])
t1 = torch.full(size = (2, 2), fill_value = 5.)

t2_1 = t0+t1 # adding Tensors t0 and t1
t2_2 = t0.add(t1) # this also adds Tensor t0 to t1; note that add is a method defined for Tensor object.
print(f't2_1: {t2_1}\nt2_2: {t2_2}')

t2_1.add_(t1) # inplace addition; t1 is added to t2_1 and the result is stored in t2_1 itself
            # In PyTorch, a method name ending with an underscore denotes an inplace operation.
print(f't2_1: {t2_1}')

t3 = t0*t1 # elementwise multiplication of two tensors, i.e. Hadamard product
print(f't3: {t3}')

data1 = [[[1., 2., 3.], [4., 5., 6.]], [[-1., 2., 0], [1., 0, -3.]]]
data2 = [[[0., 1.], [-1., 4.], [9., -2.]], [[1., 4.], [0, 1.], [-2., 0]]]
t4 = torch.tensor(data = data1)
t5 = torch.tensor(data = data2)
t6 = t4.matmul(t5) # multiplying two 3d tensors; t4 is 2 x 2 x 3; t5 is 2 x 3 x 2; result t6 is 2 x 2 x 2
                   # the first dimension is treated as a batch; matrix multiplication happens over the last two dimensions.
                   # look at torch.matmul for more details
print(f't6: {t6}')


t2_1: tensor([[6., 7.],
        [8., 9.]])
t2_2: tensor([[6., 7.],
        [8., 9.]])
t2_1: tensor([[11., 12.],
        [13., 14.]])
t3: tensor([[ 5., 10.],
        [15., 20.]])
t6: tensor([[[25.,  3.],
         [49., 12.]],

        [[-1., -2.],
         [ 7.,  4.]]])
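To make the batched behaviour of matmul concrete, the sketch below reuses the data of t4 and t5 from the cell above and checks that the result equals multiplying each pair of matrices separately. The names batched and per_matrix are illustrative.

```python
import torch

# Same data as the cell above: t4 is 2 x 2 x 3 and t5 is 2 x 3 x 2.
t4 = torch.tensor([[[1., 2., 3.], [4., 5., 6.]], [[-1., 2., 0.], [1., 0., -3.]]])
t5 = torch.tensor([[[0., 1.], [-1., 4.], [9., -2.]], [[1., 4.], [0., 1.], [-2., 0.]]])

batched = t4.matmul(t5)

# Multiply each pair of matrices separately with mm and stack the results.
per_matrix = torch.stack([t4[i].mm(t5[i]) for i in range(t4.size(0))])

print(batched.size())                       # torch.Size([2, 2, 2])
print(torch.allclose(batched, per_matrix))  # True
```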


For more arithmetic, logic, comparison, trigonometric, etc. operations, you should look at the Tensor class documentation (https://pytorch.org/docs/stable/tensors.html#torch.Tensor). The following exercise will introduce some of the Tensor operations to you.

## Exercise
1. Create two 3d Tensors named t0 and t1 of dimensions 2x2x4 and 2x4x3 respectively. Use 'matmul' method of Tensor class to multiply these two tensors.

2. Print the positions where maximum occurs along dimension 1 in t0 defined in part 1. Hint: Look at 'argmax' method of Tensor class. Make sure that your output has dimensions 2x1x4.

3. Clamp the values in t0 defined in part 1 inplace between -10 and 20. You can use 'clamp_' method defined in Tensor class.

4. Create an empty Tensor t2 that has same size as t1 defined in part 1. Copy the contents of t1 to t2 using 'copy_' method defined in Tensor class.

5. Split t1 defined in part 1 into a list of 3 Tensors along dimension 1 using 'chunk' method defined in the Tensor class. Provide another answer to this question using 'split' method defined in the Tensor class.

6. Create an empty Tensor t2 that has same size as t1 defined in part 1. Fill it with the value 2.92. You can use 'fill_' method defined in the Tensor class.

7. Flatten t1 defined in part 1 from dimension 1 to the ending dimension. You can use 'flatten' method defined in the Tensor class.

8. Create a 4d Tensor named t4  of size 64x3x128x128 using torch.rand. Check if elements of  t4 are contiguously stored in the  memory. You can use 'is_contiguous' method defined in the Tensor class. If not stored contiguously, use 'contiguous' method defined in the Tensor class to contiguously store the elements of t4. If stored contiguously, still go to the docs and read about 'contiguous' method.

9. Create a Tensor t5 that is a scalar with value -2.7. Use 'item' method defined in the Tensor class to convert Tensor t5 to the corresponding python number.

10. Return the sub-tensor where t1 defined in part 1 is > 7 using 'masked_select' method defined in the Tensor class.

11. Convert t1 defined in part1 to an ndarray using 'numpy' method defined in the Tensor class.

12. In t1 defined in part 1 you have two 4x3 matrices. Extract 1st row of 1st matrix into a Tensor named t6. Tensor indexing is similar to NumPy indexing. You can use indexing to do this. Similarly, extract 1st row of 2nd matrix from t1 into a Tensor named t7 (both rows must have the same length for the inner product). Compute the inner product of these two Tensors using the 'dot' method defined in the Tensor class.

13. Create two 3d Tensors named t8 and t9 of dimensions 2x2x4 and 2x3x4 respectively. Use 'matmul' method of Tensor class to multiply these two tensors. Do you get an error? You should! Now, use 'permute' method defined in the Tensor class to permute the dimensions of Tensor t9 appropriately so that 'matmul' is possible and the result is of dimensions 2x2x3.

14. Create an empty Tensor of dimension 2x3. Fill it with random values generated from the discrete uniform distribution on the integers in the interval [11, 12] using 'random_' method defined in the Tensor class.

15. Create an empty Tensor of dimension 2x3. Fill it with random values generated from the continuous uniform distribution on the interval [0, 1] using 'uniform_' method defined in the Tensor class.

16. Create an empty Tensor of dimension 2x3. Fill it with zeros using 'zero_' method defined in the Tensor class.

17. Using 'transpose' method defined in the Tensor class, transpose dimensions 1 and 2 in t1 defined in part 1.

18. Sum along dimension 1 the Tensor t1 defined in part 1. The output should have the same number of dimensions as t1. Now, find the minimum along dimension 2 of the output. Now squeeze out the dimensions with size 1. 'sum', 'min' and 'squeeze' methods defined in the Tensor class will be useful. Further, read about 'unsqueeze' method defined in the Tensor class.



In [7]:
# complete the exercise here

# 1. Create two 3d Tensors named t0 and t1 of dimensions 2x2x4 and 2x4x3 respectively. Use 'matmul' method of Tensor class to multiply these two tensors.
t0 = torch.full(size=(2,2,4), fill_value=30.)
t1 = torch.full(size=(2,4,3), fill_value=3.)
res1 = t0.matmul(t1) # t0 is already a Tensor, so no wrapping in torch.Tensor(...) is needed
print("res1: ", res1)
print("res1.size(): ", res1.size())


# 2. Print the positions where maximum occurs along dimension 1 in t0 defined in part1. Hint: Look at 'argmax' method of Tensor class.
#    Make sure that your output has dimensions 2x1x4.
res2 = t0.argmax(dim=1, keepdim=True) 
print("\n\nres2: ", res2)
print("res2.size(): ", res2.size())

# 3. Clamp the values in t0 defined in part 1 inplace between -10 and 20. You can use 'clamp_' method defined in Tensor class.
res3 = t0.clamp_(min=-10, max=20)
print("\n\nres3: ", res3)
print("t0: ", t0)


# 4. Create an empty Tensor t2 that has same size as t1 defined in part 1. Copy the contents of t1 to t2 using 'copy_' method defined in Tensor class.
t2 = torch.empty(size=t1.size())
res4 = t2.copy_(t1)
print("\n\nres4: ", res4)
print("t2: ", t2)

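# 5. Split t1 defined in part 1 into a list of 3 Tensors along dimension 1 using 'chunk' method defined in the Tensor class.
#    Provide another answer to this question using 'split' method defined in the Tensor class.
# Note: dimension 1 of t1 has size 4, so chunk(3) uses a chunk size of ceil(4/3) = 2 and returns only 2 Tensors.
res5_1 = t1.chunk(chunks=3, dim=1)
print("\n\nres5_1: ", res5_1)
print("len(res5_1): ", len(res5_1))
# Passing a list of section sizes to split gives exactly 3 Tensors (of sizes 2, 1 and 1 along dimension 1).
res5_2 = t1.split(split_size=[2, 1, 1], dim=1)
print("res5_2: ", res5_2)
print("len(res5_2): ", len(res5_2))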

# 6. Create an empty Tensor t2 that has same size as t1 defined in part 1. Fill it with the value 2.92. You can use 'fill_' method defined in the Tensor class.
res6 = torch.empty(size=t1.size()).fill_(2.92)
print("\n\nres6: ", res6)


# 7. Flatten t1 defined in part 1 from dimension 1 to the ending dimension. You can use 'flatten' method defined in the Tensor class.
t1 = t1.flatten(start_dim=1, end_dim=-1)
print("\n\n7. t1.size(): ", t1.size())


# 8. Create a 4d Tensor named t4 of size 64x3x128x128 using torch.rand. Check if elements of t4 are contiguously stored in the memory.
#    You can use 'is_contiguous' method defined in the Tensor class. If not stored contiguously, use 'contiguous' method defined in the Tensor class
#    to contiguously store the elements of t4. If stored contiguously, still go to the docs and read about 'contiguous' method.
res8_1 = torch.rand(64, 3, 128, 128)
print("\n\nres8_1.is_contiguous():  ", res8_1.is_contiguous())
if not res8_1.is_contiguous():
    res8_1 = res8_1.contiguous() # contiguous() returns a new Tensor, so the result must be assigned
    print("\n\nres8_1.is_contiguous():  ", res8_1.is_contiguous())

# 9. Create a Tensor t5 that is a scalar with value -2.7. Use 'item' method defined in the Tensor class to convert Tensor t5 to the corresponding python number.
res9_1 = torch.tensor(data=-2.7)
print("\n\nres9_1: ", res9_1)
res9_2 = res9_1.item()
print("type(res9_2): ", type(res9_2))


# 10. Return the sub-tensor where t1 defined in part 1 is > 7 using 'masked_select' method defined in the Tensor class.
t1 = torch.full(size=(2,4,3), fill_value=3.)
mask = t1.gt(7)
res10 = torch.masked_select(t1, mask)
print("\n\nres10: ", res10)


# 11. Convert t1 defined in part1 to an ndarray using 'numpy' method defined in the Tensor class.
res11 = t1.numpy()
print("\n\nres11: ", res11)


# 12. In t1 defined in part 1 you have two 4x3 matrices. Extract 1st row of 1st matrix into a Tensor named t6.
# Tensor indexing is similar to NumPy indexing. You can use indexing to do this. Similarly, extract 1st row of 2nd matrix
# from t1 into a Tensor named t7 (a row of t0 has 4 elements and could not be dotted with the 3-element t6).
# Compute the inner product of these two Tensors using the 'dot' method defined in the Tensor class.
t6 = t1[0, 0, :]
t7 = t1[1, 0, :]
print("\n\n12. t6: ", t6)
print("t7: ", t7)
res12 = t6.dot(t7)
print("res12: ", res12)

# 13. Create two 3d Tensors named t8 and t9 of dimensions 2x2x4 and 2x3x4 respectively .
# Use 'matmul' method of Tensor class to multiply these two tensors. Do you get error? You should! Now, use 'permute method' defined in the Tensor class
# to permute the dimensions of Tensor t9 appropriately so that 'matmul' is possible now and the result is of dimensions 2x2x3.
t8 = torch.rand(2,2,4)
t9 = torch.rand(2,3,4)
t9_permuted = t9.permute(0, 2, 1)
res13 = torch.matmul(t8, t9_permuted)
print("\n\nres13.size(): ", res13.size())


# 14. Create an empty Tensor of dimension 2x3. Fill it with random values generated from the uniform distribution on the interval [11, 12] 
#     using 'random_' method defined in the Tensor class.
# random_ generates numbers from the discrete uniform distribution over [from, to-1]
res14 = torch.empty(size=(2,3))
tmp = res14.random_(11, 12+1)
print("\n\nres14: ", res14)

# 15. Create an empty Tensor of dimension 2x3. Fill it with random values generated from the continuous 
# uniform distribution on the interval [0, 1] using 'uniform_' method defined in the Tensor class.
res15 = torch.empty((2,3)).uniform_(0,1)
print("\n\nres15: ", res15)


# 16. Create an empty Tensor of dimension 2x3. Fill it with zeros using 'zero_' method defined in the Tensor class.
res16 = torch.empty((2,3)).zero_()
print("\n\nres16: ", res16)


# 17. Using 'transpose' method defined in the Tensor class, transpose dimensions 1 and 2 in t1 defined in part 1.
t1 = t1.transpose(1, 2) # 'transpose' returns a new view; 'transpose_' would be the inplace variant
print("\n\n17. t1.size(): ", t1.size())


# 18. Sum along dimension 1 the Tensor t1 defined in part 1. The output should have the same number of dimensions as t1.
# Now, find the minimum along dimension 2 of the output. Now squeeze out the dimensions with size 1.
# 'sum', 'min' and 'squeeze' methods defined in the Tensor class will be useful. Further, read about 'unsqueeze' method defined in the Tensor class.
t1 = torch.full(size=(2,4,3), fill_value=3.)
res18_1 = t1.sum(dim=1, keepdim=True) # size 2x1x3
res18_2, res18_2_idx = res18_1.min(dim=2) # minimum values and their indices, each of size 2x1
res18_3 = res18_2.squeeze() # squeeze the minimum values (not res18_1) down to size 2
print("\n\nres18_1.size(): ", res18_1.size())
print("res18_2: ", res18_2)
print("res18_3.size(): ", res18_3.size())

res1:  tensor([[[360., 360., 360.],
         [360., 360., 360.]],

        [[360., 360., 360.],
         [360., 360., 360.]]])
res1.size():  torch.Size([2, 2, 3])


res2:  tensor([[[1, 1, 1, 1]],

        [[1, 1, 1, 1]]])
res2.size():  torch.Size([2, 1, 4])


res3:  tensor([[[20., 20., 20., 20.],
         [20., 20., 20., 20.]],

        [[20., 20., 20., 20.],
         [20., 20., 20., 20.]]])
t0:  tensor([[[20., 20., 20., 20.],
         [20., 20., 20., 20.]],

        [[20., 20., 20., 20.],
         [20., 20., 20., 20.]]])


res4:  tensor([[[3., 3., 3.],
         [3., 3., 3.],
         [3., 3., 3.],
         [3., 3., 3.]],

        [[3., 3., 3.],
         [3., 3., 3.],
         [3., 3., 3.],
         [3., 3., 3.]]])
t2:  tensor([[[3., 3., 3.],
         [3., 3., 3.],
         [3., 3., 3.],
         [3., 3., 3.]],

        [[3., 3., 3.],
         [3., 3., 3.],
         [3., 3., 3.],
         [3., 3., 3.]]])


res5_1:  (tensor([[[3., 3., 3.],
         [3., 3., 3.]],

        [[3., 3., 3.],

To reshape a Tensor, we can use the 'view' method defined in the Tensor class. Remember that PyTorch stores the elements of a contiguous multi-dimensional Tensor in row-major order.

In [8]:
t = torch.rand((2, 3, 4, 5)) # totally 120 elements
t_1 = t.view((2, 3, 20)) 
print(f'size of t_1: {t_1.size()}')
t_2 = t.view((2, -1)) # one argument can be -1; In this case the correct value is automatically calculated by the method
                      # Here it is calculated as 60
print(f'size of t_2: {t_2.size()}')

size of t_1: torch.Size([2, 3, 20])
size of t_2: torch.Size([2, 60])


Let us look at another example.

In [9]:
t = torch.tensor([[1, 2, 3], [4, 5, 6]])
t_1 = t.view((3, 2))
print(f't: {t}')
print(f't_1: {t_1}')

t: tensor([[1, 2, 3],
        [4, 5, 6]])
t_1: tensor([[1, 2],
        [3, 4],
        [5, 6]])


In the above example, if you had intended to transpose the Tensor, you have failed. t_1 is not the transpose of t, even though it has dimensions 3x2, which are the dimensions the transpose of t would have. What has happened? To understand this, note that t is stored in memory as 1, 2, 3, 4, 5, 6 in row-major order, 6 elements in total. When you ask for a new view with shape 3x2, the 'view' method checks that the new view also has 6 elements, which is compatible with the number of elements in the original tensor t. It then provides the new view by simply doing a linear scan of the storage of t, taking the first 2 elements as the 1st row, the next 2 elements as the 2nd row and the last 2 elements as the 3rd row.
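The difference between a 3x2 view and the actual transpose can be checked directly. The sketch below reuses the same data as the cell above; the names viewed and transposed are illustrative.

```python
import torch

t = torch.tensor([[1, 2, 3], [4, 5, 6]])

viewed = t.view(3, 2)           # re-reads 1..6 in storage order, two per row
transposed = t.transpose(0, 1)  # swaps rows and columns

print(viewed)      # rows are [1, 2], [3, 4], [5, 6]
print(transposed)  # rows are [1, 4], [2, 5], [3, 6]
print(torch.equal(viewed, transposed))  # False
```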

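An important point to note about the 'view' method is that it does not change the layout of the Tensor's data in memory; it only provides a new view of the same storage. Transposing two dimensions does not move any data either: the transposed Tensor reads the same storage with its strides swapped, so its elements are traversed in a different order and it is generally no longer contiguous. The 'take' method used below indexes a Tensor as if it were flattened in row-major order, so the sequence it prints for t_transposed is this traversal order rather than a new physical layout. Both the new view of the Tensor and the transposed Tensor share memory with the original Tensor, so any change to one of them affects the others. An example is shown below.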

In [10]:
t = torch.tensor([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
print(f't is stored in memory as: {t.take(index = torch.arange(0, t.numel()))}\n') # take reads t as if flattened in row-major order; t is contiguous, so this matches its storage
t_transposed = t.transpose(dim0 = 0, dim1 = 1)
print(f't_transposed is stored in memory as: {t_transposed.take(index = torch.arange(0, t_transposed.numel()))}\n') # row-major traversal of the transposed view; the underlying storage is unchanged

t_view = t.view((4, 3))
print(f't_view is stored in memory as: {t_view.take(index = torch.arange(0, t_view.numel()))}\n') # same order as t, since view only reinterprets the shape

t_transposed[0, 1] = -2
print(f't_transposed after modifying 0th row 1st column element of t_transposed: {t_transposed}\n')
print(f't after modifying 0th row 1st column element of t_transposed: {t}\n') # t also is modified, not the 0th row 1st col element but 1st row 0th col element

t_view[0, 1] = -2
print(f't_view after modifying 0th row 1st column element of t_view: {t_view}\n')
print(f't after modifying 0th row 1st column element of t_view: {t}\n') # t also is modified at exactly the same position



t is stored in memory as: tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])

t_transposed is stored in memory as: tensor([ 1,  5,  9,  2,  6, 10,  3,  7, 11,  4,  8, 12])

t_view is stored in memory as: tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])

t_transposed after modifying 0th row 1st column element of t_transposed: tensor([[ 1, -2,  9],
        [ 2,  6, 10],
        [ 3,  7, 11],
        [ 4,  8, 12]])

t after modifying 0th row 1st column element of t_transposed: tensor([[ 1,  2,  3,  4],
        [-2,  6,  7,  8],
        [ 9, 10, 11, 12]])

t_view after modifying 0th row 1st column element of t_view: tensor([[ 1, -2,  3],
        [ 4, -2,  6],
        [ 7,  8,  9],
        [10, 11, 12]])

t after modifying 0th row 1st column element of t_view: tensor([[ 1, -2,  3,  4],
        [-2,  6,  7,  8],
        [ 9, 10, 11, 12]])
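One way to see that transposing shares storage with the original Tensor is to inspect strides, contiguity and the data pointer. The sketch below is illustrative only: it uses torch.arange data instead of the values above, and the name flat is an assumption made for the example.

```python
import torch

t = torch.arange(12).view(3, 4)
t_transposed = t.transpose(0, 1)

print(t.stride(), t.is_contiguous())                        # (4, 1) True
print(t_transposed.stride(), t_transposed.is_contiguous())  # (1, 4) False
print(t.data_ptr() == t_transposed.data_ptr())              # True: same storage

# view needs compatible strides, so it fails on this transposed tensor...
try:
    t_transposed.view(12)
except RuntimeError as err:
    print("view failed:", err)

# ...whereas contiguous() first copies the data into row-major order.
flat = t_transposed.contiguous().view(12)
print(flat)
```

This is why 'contiguous' is often called before 'view' on a transposed or permuted Tensor.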



Another important method defined in the Tensor class is the 'to' method. It can be used to create a copy of an existing Tensor with a different data type. It can also be used to copy a Tensor from CPU memory to GPU memory and vice versa. The example below shows how the 'to' method creates a copy of the Tensor with a different data type. Later examples show how it is used to copy a Tensor between GPU memory and CPU memory.

In [11]:
t = torch.eye(3)
print(f't: {t}')
print(f'Datatype of t: {t.dtype}\n')

t_1 = t.to(dtype = torch.int16)
print(f't_1: {t_1}')
print(f'Datatype of t_1: {t_1.dtype}\n')


t: tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])
Datatype of t: torch.float32

t_1: tensor([[1, 0, 0],
        [0, 1, 0],
        [0, 0, 1]], dtype=torch.int16)
Datatype of t_1: torch.int16
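Two details of 'to' are worth checking: when the Tensor already has the requested dtype and device, the Tensor itself is returned rather than a copy, and converting floating-point values to an integer type drops the fractional part. A small sketch with illustrative values and names:

```python
import torch

t = torch.tensor([1.7, -1.7, 0.2])

same = t.to(dtype=torch.float32)  # already float32, so t itself is returned
as_int = t.to(dtype=torch.int16)  # fractional parts are dropped (toward zero)

print(same is t)        # True
print(as_int.tolist())  # [1, -1, 0]
print(t.tolist()[0] != 1.0)  # True: the original tensor is left unchanged
```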



The 'is_cuda' attribute of the Tensor class indicates whether the Tensor resides on the GPU. For example, see the code below.

In [12]:
t = torch.ones(size = (2, 3))
t.is_cuda # currently t is residing on cpu

False

Currently Tensor t is residing in cpu. To copy it to gpu memory, we can do the following:

In [13]:
t_gpu = t.to(device = 'cuda')
print(f'Is t_gpu residing in gpu?\t{t_gpu.is_cuda}')
print(f't_gpu is residing in the device: {t_gpu.device}')

Is t_gpu residing in gpu?	True
t_gpu is residing in the device: cuda:0


To copy the Tensor back to cpu, we can do the following.

In [14]:
t_cpu = t_gpu.to(device = 'cpu') # copy the GPU Tensor back to cpu memory
print(f'Is t_cpu residing in gpu?\t{t_cpu.is_cuda}')
print(f't_cpu is residing in the device: {t_cpu.device}')

Is t_cpu residing in gpu?	False
t_cpu is residing in the device: cpu


We can also use 'cuda' method defined in the Tensor class to copy the Tensor object to gpu memory and 'cpu' method to copy the Tensor object to cpu memory. See examples below.

In [15]:
t_gpu = t_cpu.cuda()
print(f'Is t_gpu residing in gpu?\t{t_gpu.is_cuda}')
print(f't_gpu is residing in the device: {t_gpu.device}')

Is t_gpu residing in gpu?	True
t_gpu is residing in the device: cuda:0


In [16]:
t_cpu = t_gpu.cpu()
print(f'Is t_cpu residing in gpu?\t{t_cpu.is_cuda}')
print(f't_cpu is residing in the device: {t_cpu.device}')

Is t_cpu residing in gpu?	False
t_cpu is residing in the device: cpu
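The cells above assume that a GPU is available. A common pattern, sketched below with illustrative names, is to choose the device once with torch.cuda.is_available() so that the same code also runs on a CPU-only machine.

```python
import torch

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

t = torch.ones(2, 3)
t_dev = t.to(device)                       # copy an existing tensor to the device
direct = torch.zeros(2, 3, device=device)  # or create a tensor on the device directly

# Operands must live on the same device; bring the result back to the CPU at the end.
result = (t_dev + direct).cpu()
print(device, result.device)
```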
