# Deep Learning with Python + PyTorch from Zero to GANs

## Table of Contents

A short introduction to PyTorch and the chosen functions with short explanations.
1. Tensors: Data Types, Cuda Feature, Numpy Arrays, Shape, Flattening, View, Ones, Zeros, Random
2. Backward Propagation and Gradient
3. Arithmetic and Logical Operations
4. max() Function
5. Bounded and Stepped Tensors

## Environment
In this series of articles I will use Kaggle as the environment, but you are free to choose another platform such as Google Colab, Binder, Amazon or your local machine. Kaggle is a free platform for AI and data science applications, tutorials and competitions. It is developed and maintained by Google and offers a Jupyter Notebook frontend running on Google's servers. You can get GPU/TPU support for more powerful computing, also free of charge. Kaggle comes with a preinstalled configuration, probably built around the tools preferred by Google. Even so, some libraries and frameworks are not part of the default configuration, so if something you need is missing on Kaggle you have to install it yourself.

## Setup

Anaconda is not preinstalled in the default configuration. We will install Anaconda (or Miniconda) first and then use it to install PyTorch. The commented-out commands in this section install these prerequisites; uncomment the ones you need.

In [1]:
# Anaconda installation
#!wget https://repo.anaconda.com/archive/Anaconda3-5.2.0-Linux-x86_64.sh && bash Anaconda3-5.2.0-Linux-x86_64.sh -bfp /usr/local

In [2]:
# Miniconda installation
#!wget https://repo.continuum.io/miniconda/Miniconda3-4.5.4-Linux-x86_64.sh && bash Miniconda3-4.5.4-Linux-x86_64.sh -bfp /usr/local

In [3]:
# Test installation
#!conda info --all
#!conda list

In [4]:
# To install PyTorch you can use the following command
#!conda install pytorch cpuonly -c pytorch -y

In [5]:
# Import PyTorch
import torch

## 1 - Tensors: Data Types, Cuda Feature, Shape, Reshape, View, Ones, Zeros, Randn

PyTorch performs all of its operations on tensors. Tensor is the general term for the representation of scalars, vectors, matrices and higher dimensional arrays, so multi dimensional matrices can be represented with tensors. Every matrix is a tensor, but not every tensor is a matrix.

Tensors are separated into two categories: CPU tensors and GPU tensors. We will come back to this later.

There are many tensor constructors named after the data types, and we can define tensors with these type-specific constructors. Some of them are shown below. The available data types are not limited to the ones below. For further information, please check out the PyTorch documentation.

In [6]:
t_Float = torch.FloatTensor([6.])
t_Double = torch.DoubleTensor([19.23])
t_Int = torch.IntTensor([0.])
t_Bool = torch.BoolTensor([True])

t_Float, t_Double, t_Int, t_Bool

(tensor([6.]),
 tensor([19.2300], dtype=torch.float64),
 tensor([0], dtype=torch.int32),
 tensor([True]))

Another way to define a tensor is the `tensor` function. We can set the data type of the tensor by adding the `dtype=X` argument. When `dtype` is omitted, the type is inferred from the data: float32 for floating point values, int64 for integers and bool for booleans. Examples of tensors can be seen below.

NOTE: You cannot use string data types in tensors!

In [7]:
t_Float = torch.tensor([6.], dtype=torch.float32)
t_Double = torch.tensor([19.23], dtype=torch.float64)
t_Int = torch.tensor([0.], dtype=torch.int32)
t_Bool = torch.tensor([True], dtype=torch.bool)

t_Float, t_Double, t_Int, t_Bool

(tensor([6.]),
 tensor([19.2300], dtype=torch.float64),
 tensor([0], dtype=torch.int32),
 tensor([True]))
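As a small illustration of how the `tensor` function picks a data type when `dtype` is left out, the sketch below checks the inferred types for float, integer and boolean inputs and then converts one tensor with `to()`. The variable names are made up for this example, and the float32 result assumes the default floating point type has not been changed.

```python
import torch

# dtype is inferred from the Python values when it is not given
t_from_floats = torch.tensor([6., 19.23])
t_from_ints = torch.tensor([1, 2, 3])
t_from_bools = torch.tensor([True, False])

print(t_from_floats.dtype)  # torch.float32 (the default floating point type)
print(t_from_ints.dtype)    # torch.int64
print(t_from_bools.dtype)   # torch.bool

# An existing tensor can be converted to another data type with to()
t_converted = t_from_ints.to(torch.float64)
print(t_converted.dtype)    # torch.float64
```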

### Tensor Examples with Different Dimensions

In [8]:
# Example 1 - Constant Tensor
t0 = torch.tensor(6.)  # dtype=float32 by default
t0

tensor(6.)

In [9]:
# Example 2 - Vector (1 dimensional tensor) with 3 elements
t1 = torch.tensor([1., 2., 3.], dtype=torch.float64)
t1

tensor([1., 2., 3.], dtype=torch.float64)

In [10]:
# Example 3 - 2 by 4 matrix (2 dimensional tensor) with 2 rows and 4 elements in each row
t2 = torch.tensor([[-1., -2., -3., -4.],
                   [-5., -6., -7., -8.]], dtype=torch.float64)
t2

tensor([[-1., -2., -3., -4.],
        [-5., -6., -7., -8.]], dtype=torch.float64)

In [11]:
# Example 4 - 3 dimensional tensor
t3 = torch.tensor([[[0., 2., 4.],
                    [6., 8., 10.]],
                   [[-10., -8., -6.],
                    [-4., -2., 0.]],
                   [[0., 2., 4.],
                    [6., 8., 10.]]], dtype = torch.float64)
t3

tensor([[[  0.,   2.,   4.],
         [  6.,   8.,  10.]],

        [[-10.,  -8.,  -6.],
         [ -4.,  -2.,   0.]],

        [[  0.,   2.,   4.],
         [  6.,   8.,  10.]]], dtype=torch.float64)

In [12]:
# Example 5 - 4 dimensional tensor
t4 = torch.tensor([[[[1., 3.], [1., 3.]], [[1., 2.], [1., 3.]], [[1., 2.], [1., 3.]]],
                   [[[1., 3.], [1., 3.]], [[1., 2.], [1., 3.]], [[1., 2.], [1., 3.]]],
                   [[[1., 3.], [1., 3.]], [[1., 2.], [1., 3.]], [[1., 2.], [1., 3.]]],
                   [[[1., 3.], [1., 3.]], [[1., 2.], [1., 3.]], [[1., 2.], [1., 3.]]]], dtype = torch.float64)
t4

tensor([[[[1., 3.],
          [1., 3.]],

         [[1., 2.],
          [1., 3.]],

         [[1., 2.],
          [1., 3.]]],


        [[[1., 3.],
          [1., 3.]],

         [[1., 2.],
          [1., 3.]],

         [[1., 2.],
          [1., 3.]]],


        [[[1., 3.],
          [1., 3.]],

         [[1., 2.],
          [1., 3.]],

         [[1., 2.],
          [1., 3.]]],


        [[[1., 3.],
          [1., 3.]],

         [[1., 2.],
          [1., 3.]],

         [[1., 2.],
          [1., 3.]]]], dtype=torch.float64)

We mentioned the GPU feature of Kaggle and GPU tensors before. Tensors can be stored on the GPU, for example by creating them with the tensor types under `torch.cuda`.

NOTE: To be able to execute operations on the GPU, you should set your accelerator to GPU.

In [13]:
# Example 6 - 4 dimensional GPU tensor created with torch.cuda.FloatTensor
t4_GPU = torch.cuda.FloatTensor([[[[1., 2.], [1., 3.]], [[1., 2.], [1., 3.]], [[1., 2.], [1., 3.]]],
                                 [[[1., 2.], [1., 3.]], [[1., 2.], [1., 3.]], [[1., 2.], [1., 3.]]],
                                 [[[1., 2.], [1., 3.]], [[1., 2.], [1., 3.]], [[1., 2.], [1., 3.]]],
                                 [[[1., 2.], [1., 3.]], [[1., 2.], [1., 3.]], [[1., 2.], [1., 3.]]]])
t4_GPU

tensor([[[[1., 2.],
          [1., 3.]],

         [[1., 2.],
          [1., 3.]],

         [[1., 2.],
          [1., 3.]]],


        [[[1., 2.],
          [1., 3.]],

         [[1., 2.],
          [1., 3.]],

         [[1., 2.],
          [1., 3.]]],


        [[[1., 2.],
          [1., 3.]],

         [[1., 2.],
          [1., 3.]],

         [[1., 2.],
          [1., 3.]]],


        [[[1., 2.],
          [1., 3.]],

         [[1., 2.],
          [1., 3.]],

         [[1., 2.],
          [1., 3.]]]], device='cuda:0')

Alternatively, we can use the `device` argument. To do this, run the command below, which simply assigns the GPU device to a variable. After that, we pass our variable through the `device=X` argument in tensor definitions.

In [14]:
cuda0 = torch.device('cuda:0')
cuda0

device(type='cuda', index=0)

In [15]:
# Example 7 - 4 dimensional GPU tensor created with the device argument
t4_GPU = torch.tensor([[[[1., 3.], [1., 3.]], [[1., 2.], [1., 3.]], [[1., 2.], [1., 3.]]],
                       [[[1., 3.], [1., 3.]], [[1., 2.], [1., 3.]], [[1., 2.], [1., 3.]]],
                       [[[1., 3.], [1., 3.]], [[1., 2.], [1., 3.]], [[1., 2.], [1., 3.]]],
                       [[[1., 3.], [1., 3.]], [[1., 2.], [1., 3.]], [[1., 2.], [1., 3.]]]], dtype = torch.float64, device=cuda0)

t4_GPU

tensor([[[[1., 3.],
          [1., 3.]],

         [[1., 2.],
          [1., 3.]],

         [[1., 2.],
          [1., 3.]]],


        [[[1., 3.],
          [1., 3.]],

         [[1., 2.],
          [1., 3.]],

         [[1., 2.],
          [1., 3.]]],


        [[[1., 3.],
          [1., 3.]],

         [[1., 2.],
          [1., 3.]],

         [[1., 2.],
          [1., 3.]]],


        [[[1., 3.],
          [1., 3.]],

         [[1., 2.],
          [1., 3.]],

         [[1., 2.],
          [1., 3.]]]], device='cuda:0', dtype=torch.float64)

In [16]:
t4.device, t4_GPU.device

(device(type='cpu'), device(type='cuda', index=0))
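The cells above assume that a GPU is present. A common way to keep the same notebook runnable on a machine without one is to choose the device at run time, as in the minimal sketch below. The names `device`, `t_cpu` and `t_dev` are only for illustration.

```python
import torch

# Pick the GPU when one is available, otherwise fall back to the CPU
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

t_cpu = torch.tensor([[1., 2.], [3., 4.]])  # created on the CPU
t_dev = t_cpu.to(device)                    # moved to the selected device

print(t_cpu.device, t_dev.device)
```

Operations between two tensors generally expect both to live on the same device, so moving everything with `to(device)` early tends to avoid device mismatch errors.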

Even though PyTorch performs all operations on tensors, datasets sometimes consist of numpy arrays. In such situations we have to transform the numpy arrays into PyTorch tensors, which can be done with the `from_numpy()` function.

In [17]:
import numpy as np

mx = np.array([[23, 67, 43],  # Define a multi dimensional numpy array
               [91, 88, 64], 
               [38, 35, 76], 
               [13, 43, 37], 
               [69, 56, 70]], dtype='float32')

t = torch.from_numpy(mx)
type(mx), type(t)

(numpy.ndarray, torch.Tensor)

We can also transform tensors back to numpy arrays with the `numpy()` function.

In [18]:
mx = t.numpy()
type(mx)

numpy.ndarray
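One detail worth knowing about these conversions is that `from_numpy()` does not copy the data: the tensor and the numpy array share the same memory. The sketch below, with made-up variable names, compares it with `torch.tensor()`, which makes a copy.

```python
import numpy as np
import torch

arr = np.array([1., 2., 3.], dtype='float32')
shared = torch.from_numpy(arr)  # shares memory with arr
copied = torch.tensor(arr)      # independent copy of the data

arr[0] = 100.                   # modify the numpy array in place

print(shared)  # the first element is now 100.
print(copied)  # the first element is still 1.
```

For a tensor stored on the GPU, `numpy()` is expected to fail; moving the tensor back with `cpu()` first should work.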

To see the shape or size of a tensor we can use either `shape` or `size()`. Both return the same `torch.Size` object; the only difference is that `shape` is an attribute while `size()` is a method that has to be called.

In [19]:
print("Shape of tensor t0: {}\n\
Shape of tensor t1: {}\n\
Shape of tensor t2: {}\n\
Shape of tensor t3: {}\n\
Shape of tensor t4: {}\n".format(t0.shape, t1.shape, t2.shape, t3.size(), t4.size()))

Shape of tensor t0: torch.Size([])
Shape of tensor t1: torch.Size([3])
Shape of tensor t2: torch.Size([2, 4])
Shape of tensor t3: torch.Size([3, 2, 3])
Shape of tensor t4: torch.Size([4, 3, 2, 2])



To see the number of dimensions of a tensor we can use either `ndim` or `dim()`. Both return the same value; the only difference is that `ndim` is an attribute while `dim()` is a method that has to be called.

In [20]:
print("Dimension of tensor t0: {}\n\
Dimension of tensor t1: {}\n\
Dimension of tensor t2: {}\n\
Dimension of tensor t3: {}\n\
Dimension of tensor t4: {}\n".format(t0.ndim, t1.ndim, t2.ndim, t3.dim(), t4.dim()))

Dimension of tensor t0: 0
Dimension of tensor t1: 1
Dimension of tensor t2: 2
Dimension of tensor t3: 3
Dimension of tensor t4: 4
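To tie shape and dimension together, here is a short sketch on a tensor with the same shape as `t4`. It also uses `numel()`, which is not covered above and returns the total number of elements.

```python
import torch

t = torch.zeros(4, 3, 2, 2)

print(t.shape == t.size())     # True, both describe the shape (4, 3, 2, 2)
print(t.ndim == len(t.shape))  # True, the number of dimensions is the length of the shape
print(t.shape[1], t.size(1))   # 3 3, the size of a single dimension
print(t.numel())               # 48, total number of elements (4 * 3 * 2 * 2)
```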



To change and manipulate the shape of a tensor we can use several functions: `reshape()`, `t()` and `permute(arg1, arg2, arg3)`. These functions do not modify the original tensor; they return a new tensor, so if you want to keep the changed shape you should assign the result to a new variable or back to the original one.

Changing the shape of the tensor using `reshape()`:

In [21]:
t2_changed = t2.reshape(1, 8)
t2, t2.shape, t2_changed, t2_changed.shape

(tensor([[-1., -2., -3., -4.],
         [-5., -6., -7., -8.]], dtype=torch.float64),
 torch.Size([2, 4]),
 tensor([[-1., -2., -3., -4., -5., -6., -7., -8.]], dtype=torch.float64),
 torch.Size([1, 8]))

Taking transpose of the matrix tensor using `t()`:

In [22]:
t2_changed = t2.t()
t2, t2.shape, t2_changed, t2_changed.shape

(tensor([[-1., -2., -3., -4.],
         [-5., -6., -7., -8.]], dtype=torch.float64),
 torch.Size([2, 4]),
 tensor([[-1., -5.],
         [-2., -6.],
         [-3., -7.],
         [-4., -8.]], dtype=torch.float64),
 torch.Size([4, 2]))

Permuting a tensor is also a kind of reshaping operation, with a small difference: instead of giving the new shape we give the new order of the dimensions. Please see the example usage:

In [23]:
# Normal dimension order is 0th dimension, 1st dimension and 2nd dimension (0, 1, 2)
t3_changed = t3.permute(0, 2, 1)  # Set dimension order as 0th dimension, 2nd dimension and 1st dimension (0, 2, 1)
t3, t3.shape, t3_changed, t3_changed.shape

(tensor([[[  0.,   2.,   4.],
          [  6.,   8.,  10.]],
 
         [[-10.,  -8.,  -6.],
          [ -4.,  -2.,   0.]],
 
         [[  0.,   2.,   4.],
          [  6.,   8.,  10.]]], dtype=torch.float64),
 torch.Size([3, 2, 3]),
 tensor([[[  0.,   6.],
          [  2.,   8.],
          [  4.,  10.]],
 
         [[-10.,  -4.],
          [ -8.,  -2.],
          [ -6.,   0.]],
 
         [[  0.,   6.],
          [  2.,   8.],
          [  4.,  10.]]], dtype=torch.float64),
 torch.Size([3, 3, 2]))
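Since `permute()` and `reshape()` can produce the same shape, it may help to see that they arrange the elements differently. The following sketch uses a small made-up 2 by 3 matrix to show the difference.

```python
import torch

m = torch.tensor([[1, 2, 3],
                  [4, 5, 6]])   # shape (2, 3)

permuted = m.permute(1, 0)      # swaps the two dimensions, same result as m.t()
reshaped = m.reshape(3, 2)      # keeps the elements in their row by row order

print(permuted)  # rows: [1, 4], [2, 5], [3, 6]
print(reshaped)  # rows: [1, 2], [3, 4], [5, 6]
```

Both results have shape (3, 2), but only the permuted one is the transpose of `m`.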

To flatten a tensor we can use `flatten()` or `view()`. While `flatten()` turns the tensor into a 1 dimensional vector by default, `view()` can produce either a row vector or a column vector depending on the arguments passed. `reshape()` can be used for the same purpose. Passing "-1" to `view()` or `reshape()` is an easy way to flatten a tensor into a 1 dimensional vector. None of these change the original tensor, so if you want to keep the flattened version you should assign it to a new variable or back to the original one.

In [24]:
t4.flatten(), t4.flatten().shape, t4.view(-1), t4.view(-1).shape, t4.reshape(-1), t4.reshape(-1).shape

(tensor([1., 3., 1., 3., 1., 2., 1., 3., 1., 2., 1., 3., 1., 3., 1., 3., 1., 2.,
         1., 3., 1., 2., 1., 3., 1., 3., 1., 3., 1., 2., 1., 3., 1., 2., 1., 3.,
         1., 3., 1., 3., 1., 2., 1., 3., 1., 2., 1., 3.], dtype=torch.float64),
 torch.Size([48]),
 tensor([1., 3., 1., 3., 1., 2., 1., 3., 1., 2., 1., 3., 1., 3., 1., 3., 1., 2.,
         1., 3., 1., 2., 1., 3., 1., 3., 1., 3., 1., 2., 1., 3., 1., 2., 1., 3.,
         1., 3., 1., 3., 1., 2., 1., 3., 1., 2., 1., 3.], dtype=torch.float64),
 torch.Size([48]),
 tensor([1., 3., 1., 3., 1., 2., 1., 3., 1., 2., 1., 3., 1., 3., 1., 3., 1., 2.,
         1., 3., 1., 2., 1., 3., 1., 3., 1., 3., 1., 2., 1., 3., 1., 2., 1., 3.,
         1., 3., 1., 3., 1., 2., 1., 3., 1., 2., 1., 3.], dtype=torch.float64),
 torch.Size([48]))
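`view()` and `reshape()` give the same result in the cell above, but they can behave differently on tensors whose memory layout is not contiguous, for example after a transpose. The sketch below illustrates this under that assumption, and also shows `view(-1, 1)` producing a column vector. The variable names are made up.

```python
import torch

m = torch.arange(6).reshape(2, 3)
m_t = m.t()                    # transposed tensor, no longer contiguous in memory

print(m_t.is_contiguous())     # False

try:
    m_t.view(-1)               # view() needs a compatible memory layout
except RuntimeError as err:
    print("view() failed:", err)

flat = m_t.reshape(-1)         # reshape() copies the data when it has to
print(flat)                    # values: 0, 3, 1, 4, 2, 5

col = m.view(-1, 1)            # column vector with shape (6, 1)
print(col.shape)               # torch.Size([6, 1])
```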

Tensors can be initialized with various values such as zeros, ones or random numbers. To initialize a tensor with zeros we can use `zeros()`, for ones `ones()`, and for random numbers drawn from the standard normal distribution `randn()`. All of these functions take the shape of the tensor as their argument.

In [25]:
zeros = torch.zeros([2, 3], dtype=torch.int32)
ones = torch.ones([2, 3], dtype=torch.int32)
random = torch.randn([2, 3], dtype=torch.float32)

zeros, ones, random

(tensor([[0, 0, 0],
         [0, 0, 0]], dtype=torch.int32),
 tensor([[1, 1, 1],
         [1, 1, 1]], dtype=torch.int32),
 tensor([[ 1.2249, -1.3834, -1.0089],
         [-0.5472,  0.1367,  0.1600]]))
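Because `randn()` returns different numbers on every run, the outputs in this notebook will not match yours exactly. If reproducible values are needed, the random seed can be fixed, as in the sketch below. The seed value 0 is an arbitrary choice, and `zeros_like()` and `rand()` are shown only as related helpers that are not used elsewhere in this notebook.

```python
import torch

torch.manual_seed(0)          # fix the seed of the random number generator
first = torch.randn(2, 3)

torch.manual_seed(0)          # same seed again
second = torch.randn(2, 3)

print(torch.equal(first, second))  # True, both draws are identical

# *_like functions copy the shape and dtype of an existing tensor
zeros_like_first = torch.zeros_like(first)
uniform = torch.rand(2, 3)    # uniform samples from [0, 1), unlike randn()
print(zeros_like_first.shape, uniform.min() >= 0)
```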

## 2 - Backward and Grad

Grad, or gradient, is the term used in deep learning for the derivative that represents the slope of a cost function. We often use gradient calculations in deep learning algorithms. Taking derivatives from the end back to the starting point of a deep neural network is called back propagation. To automate the backward pass and the gradient calculation, PyTorch offers `backward` and `grad`. The `backward` function applies back propagation, and the `grad` attribute of each tensor stores the resulting gradients. For automatic differentiation, the `requires_grad` flag should be set to "True" in the tensor definition.

In [26]:
random = torch.randn([3, 5], dtype=torch.float32, requires_grad=True)
result = random.pow(3).sum()  # Take "random" tensor to the element wise power 3 and sum all of the elements
print("Randomly initialized tensor: {}\n\nResultant tensor after pow by 3 and sum of all elements: {}".format(random, result))

Randomly initialized tensor: tensor([[ 1.1075, -1.7448, -0.2860, -0.7204, -0.1989],
        [-0.0222, -0.7873,  0.5428, -0.7075, -0.3362],
        [-0.6566,  1.0798,  0.4397,  1.4004,  0.3369]], requires_grad=True)

Resultant tensor after pow by 3 and sum of all elements: -1.2334606647491455


In [27]:
result.backward()  # Perform back propagation
random.grad  # Print gradients of "result" with respect to "random"

tensor([[3.6799e+00, 9.1332e+00, 2.4545e-01, 1.5570e+00, 1.1867e-01],
        [1.4759e-03, 1.8594e+00, 8.8389e-01, 1.5018e+00, 3.3911e-01],
        [1.2934e+00, 3.4977e+00, 5.8001e-01, 5.8833e+00, 3.4041e-01]])
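The gradients above can be checked by hand: the derivative of the sum of x³ with respect to each element x is 3x². The sketch below repeats the computation on a fresh random tensor (named `x` here for illustration) and compares the result of `backward()` with this formula.

```python
import torch

x = torch.randn(3, 5, requires_grad=True)
result = x.pow(3).sum()
result.backward()

# d/dx of sum(x^3) is 3 * x^2, computed here without tracking gradients
expected = 3 * x.detach() ** 2

print(torch.allclose(x.grad, expected))  # True
```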

Using `zero_()` to clear the gradient values stored in `grad`:

In [28]:
random.grad.zero_()  # Reset the stored gradients to zero in place

tensor([[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]])
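The reason for resetting is that PyTorch adds new gradients to the values already stored in `grad` instead of overwriting them. A minimal sketch of this behaviour, using a made-up tensor `w`:

```python
import torch

w = torch.tensor([1., 2., 3.], requires_grad=True)

(w * 2).sum().backward()
after_first = w.grad.clone()   # gradient of sum(2 * w) is 2 for every element

(w * 2).sum().backward()       # second backward pass without resetting
after_second = w.grad.clone()  # the new gradients were added to the old ones

w.grad.zero_()                 # reset in place
(w * 2).sum().backward()
after_reset = w.grad.clone()   # back to 2 for every element

print(after_first, after_second, after_reset)
```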

## 3 - Arithmetic and Logical Operators

Various arithmetic and logical operations are available for tensors. They are not limited to the operations shown below; functions such as sum, square, pow, sqrt, log, mean and many more are also available. Check the documentation for those.

In [29]:
x = torch.randn(5)
y = torch.randn(5)
x, y

(tensor([ 0.7290,  0.5909, -0.6995,  0.6383,  0.4963]),
 tensor([ 1.4270, -2.0013, -0.5771, -0.2310,  0.7540]))

Some of the arithmetic operations.



In [30]:
tAdd = torch.add(x, y)  # x + y
tSub = torch.sub(x, y)  # x - y
tMul = torch.mul(x, 5)  # x * 5 (element wise multiplication)
tDiv = torch.div(x, 5)  # x / 5
tAdd, tSub, tMul, tDiv

(tensor([ 2.1560, -1.4103, -1.2767,  0.4073,  1.2503]),
 tensor([-0.6980,  2.5922, -0.1224,  0.8693, -0.2577]),
 tensor([ 3.6452,  2.9545, -3.4977,  3.1914,  2.4814]),
 tensor([ 0.1458,  0.1182, -0.1399,  0.1277,  0.0993]))

Please note that the `mul()` function performs element wise multiplication. To perform matrix multiplication, we can use the `@` operator.

A quick linear algebra reminder: for matrix multiplication to succeed, the number of columns of the 1st multiplier should be equal to the number of rows of the 2nd multiplier.

In [31]:
mx_1 = torch.randn(3, 5)  # 1st multiplier
mx_2 = torch.randn(5, 2)  # 2nd multiplier
mul = mx_1@mx_2
mul

tensor([[ 1.2588,  0.9926],
        [ 0.5242,  0.4368],
        [-0.1098, -1.1673]])
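The shape rule can be seen directly in code. In the sketch below (variable names are illustrative), a (3, 5) matrix times a (5, 2) matrix gives a (3, 2) result, while swapping the operands breaks the rule and raises an error.

```python
import torch

a = torch.randn(3, 5)
b = torch.randn(5, 2)

product = a @ b                   # same as torch.matmul(a, b)
print(product.shape)              # torch.Size([3, 2])
print(torch.allclose(product, torch.matmul(a, b)))  # True

try:
    b @ a                         # (5, 2) @ (3, 5): inner sizes 2 and 3 differ
except RuntimeError as err:
    print("Shapes do not match:", err)
```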

Some of the trigonometric operations.

In [32]:
torch.cos(x), torch.sin(x)

(tensor([0.7458, 0.8304, 0.7651, 0.8031, 0.8794]),
 tensor([ 0.6662,  0.5571, -0.6439,  0.5958,  0.4762]))

Let's create two tensors for logical operations. They are built from boolean values but stored as int16, so `True` becomes 1 and `False` becomes 0.

In [33]:
inp_1 = torch.tensor([True, True, False], dtype=torch.int16)
inp_2 = torch.tensor([False, True, False], dtype=torch.int16)
inp_1, inp_2

(tensor([1, 1, 0], dtype=torch.int16), tensor([0, 1, 0], dtype=torch.int16))

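Some of the logical operations, here in their bitwise form. Because the inputs are int16, `bitwise_not` flips every bit of the integer, so 1 becomes -2 and 0 becomes -1 instead of 0 and 1.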

In [34]:
tNot = torch.bitwise_not(inp_1, out=None)
tAnd = torch.bitwise_and(inp_1, inp_2, out=None)
tOr = torch.bitwise_or(inp_1, inp_2, out=None)
tXor = torch.bitwise_xor(inp_1, inp_2, out=None)
tNot, tAnd, tOr, tXor

(tensor([-2, -2, -1], dtype=torch.int16),
 tensor([0, 1, 0], dtype=torch.int16),
 tensor([1, 1, 0], dtype=torch.int16),
 tensor([1, 0, 0], dtype=torch.int16))
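If true/false results are wanted instead of integer bit patterns, one option is to keep the inputs as `torch.bool` and use the `logical_*` functions. The sketch below uses the same values as the cell above with made-up names `b_1` and `b_2`.

```python
import torch

b_1 = torch.tensor([True, True, False])
b_2 = torch.tensor([False, True, False])

print(torch.logical_not(b_1))       # False, False, True
print(torch.logical_and(b_1, b_2))  # False, True, False
print(torch.logical_or(b_1, b_2))   # True, True, False
print(torch.logical_xor(b_1, b_2))  # True, False, False

# On bool tensors the operators ~, &, | and ^ give the same results
print(torch.equal(~b_1, torch.logical_not(b_1)))  # True
```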

## 4 - Max Function

`max()` is widely used in deep neural network models, which makes it one of the most important functions in this notebook. It takes a tensor as an argument and, when a dimension is given with `dim`, returns the highest value along that dimension together with the index of that value. Follow the example for better understanding.

In [35]:
tens = torch.randn(1, 10)  # Create a random tensor
tens

tensor([[ 0.8799, -0.3623,  0.7082,  1.6816,  1.1184,  0.2401, -0.7956, -0.8949,
         -0.5632,  0.2715]])

Please note that our random tensor holds randomly initialized numbers and one of them has the highest value. Here the highest value is "1.6816" and its index is 3. The `max()` function returns these two values. See below.

In [36]:
value, inx = torch.max(tens, dim=1)  # returns (values, indices), in that order
print("Index of the maximum value: {}, maximum value: {}".format(inx, value))

Index of the maximum value: tensor([3]), maximum value: tensor([1.6816])


The `max()` function is widely used together with `cross_entropy()` and the softmax activation function. Since softmax returns probabilistic values, `max()` is very useful for grabbing the highest probability from softmax's output.
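A minimal sketch of that pattern is shown below. The `logits` values are made up for illustration and stand in for the raw outputs of a classifier for 2 samples and 3 classes.

```python
import torch

# Made-up raw scores (logits) for 2 samples and 3 classes
logits = torch.tensor([[1.0, 3.0, 0.5],
                       [2.0, 0.1, 0.3]])

probs = torch.softmax(logits, dim=1)            # each row now sums to 1
max_prob, pred_class = torch.max(probs, dim=1)  # (values, indices) per row

print(pred_class)  # values: 1, 0 (the predicted class of each sample)
print(max_prob)    # highest probability in each row
print(torch.equal(pred_class, torch.argmax(probs, dim=1)))  # True
```

When only the index is needed, `argmax()` returns it directly, as the last line suggests.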

## 5 - Bounded and Stepped Tensors



To bound all elements of an input tensor to the range [min, max] we can use the `clamp` function.

Usage: *torch.clamp(input, min, max, out=None)*



In [37]:
# Bounded between 0 to 10
torch.clamp(tens, 0, 10, out=None)  # tens is randomly initialized tensor from previous cells

tensor([[0.8799, 0.0000, 0.7082, 1.6816, 1.1184, 0.2401, 0.0000, 0.0000, 0.0000,
         0.2715]])
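`clamp` can also be used with only one of the two bounds. The sketch below, on a made-up tensor `v`, shows both one-sided forms and notes that clamping from below at 0 gives the same result as the ReLU activation.

```python
import torch

v = torch.tensor([-2.0, -0.5, 0.0, 0.5, 12.0])

print(torch.clamp(v, min=0))          # values: 0, 0, 0, 0.5, 12
print(torch.clamp(v, max=1))          # values: -2, -0.5, 0, 0.5, 1
print(torch.clamp(v, min=0, max=10))  # values: 0, 0, 0, 0.5, 10

# Clamping from below at 0 matches the ReLU activation
print(torch.equal(torch.clamp(v, min=0), torch.relu(v)))  # True
```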


To create a vector tensor of evenly spaced values in the range [start, end] we can use the `linspace` function. The `steps` argument is the number of points, not the distance between them.

Usage: *torch.linspace(start, end, steps, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False)*


In [38]:
# Vector tensor from 0 to 100 with 5 evenly spaced points (step size of 25)
torch.linspace(0,100,steps=5,dtype=torch.int32)

tensor([  0,  25,  50,  75, 100], dtype=torch.int32)
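If a fixed step size is wanted rather than a fixed number of points, `torch.arange` is the usual alternative. The sketch below puts the two side by side; `arange` is not covered elsewhere in this notebook, and the variable names are made up.

```python
import torch

by_count = torch.linspace(0, 100, steps=5)  # 5 evenly spaced points, spacing of 25
by_step = torch.arange(0, 101, 5)           # step size of 5, the end value 101 is excluded

print(by_count)      # values: 0, 25, 50, 75, 100
print(by_step[:4])   # values: 0, 5, 10, 15
print(len(by_step))  # 21
```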

## Reference Links
* Official documentation for `torch.Tensor`: https://pytorch.org/docs/stable/tensors.html
* PyTorch Zero to GANs Course: http://zerotogans.com/

Special thanks to Jovian.ml team and freeCodeCamp.org!
