## Where are we now
1. Python - general problem solving
2. Data Science - Numpy, Pandas, Sklearn, Matplotlib
3. ML from Scratch - Intuition (so for those who want to further advance ........)
4. Signal Processing - Energy, Telecommunications, Biosignals, Time Series
5. Deep Learning - PyTorch
    1. One of the most popular DL frameworks (the main alternative is TensorFlow)

## Deep Learning vs. Machine Learning

Good News
- Deep Learning can do feature engineering / feature selection automatically
- Deep Learning can benefit from huge amounts of data, while classical Machine Learning tends to plateau
    - Going from 100 samples to 1000 samples, ML will often get about the same accuracy
    - But DL will usually see increased accuracy
- Deep Learning is basically stacking a lot of linear regressions together (with non-linear activations in between)
    - DL can learn very complex patterns
    - DL is perfect for (1) images, (2) text, (3) time series / signal (very random)

Bad News
- Deep Learning performs poorly with small data (compared with Machine Learning); as a rule of thumb it needs 5000+ samples
- For Tabular Data, Deep Learning will ALMOST ALWAYS LOSE TO gradient boosting (or its variants)
    - Gradient Boosting is basically decision trees stacked one after another, each one correcting the errors of the previous ones
    - In most competitions, XGBoost and LightGBM are usually the winners for tabular data
    - If you work in a company, they mostly use tabular data, so you should look at gradient boosting methods first
- Deep Learning has NO built-in feature importance, so it's mostly a black box (this is the problem Explainable AI tries to address)

# PyTorch

In [1]:
#pip install torch or pip3 install torch or conda install pytorch
import numpy as np
import torch

In [2]:
np.__version__

'1.22.3'

In [3]:
torch.__version__

'1.12.1+cpu'

## Torch Tensors

PyTorch doesn't work on NumPy arrays directly. Instead, it has its own data structure, called a Tensor, which supports automatic differentiation.

### Create torch Tensor from Numpy 

In [4]:
#create a numpy array of 1 to 5
arr = np.arange(1,6)
arr

#print the data type
arr.dtype #int64

#print the type()
type(arr) #type() is a built-in of Python itself

numpy.ndarray

In [5]:
#convert numpy to tensor

#1. from_numpy (NOT a copy; it shares memory with the numpy array)
torch_arr_from = torch.from_numpy(arr)
torch_arr_from.dtype  #torch.int64
type(torch_arr_from)  #torch.Tensor
torch_arr_from.type() #torch.LongTensor (int64); torch.IntTensor (int32)
                      #torch.FloatTensor (float32); torch.DoubleTensor (float64)
#from_numpy shares memory!!!  This is intended, for easy use between numpy and tensor...
# arr[2] = 999
# torch_arr_from  #would show 999 at index 2

#2. tensor (always a copy)
torch_arr_tensor = torch.tensor(arr)  #everything is the same, except it IS a copy
arr[2] = 9999999
torch_arr_tensor  #unchanged, because it does not share memory with arr

#In our class, mostly we use torch.tensor; it won't fail us :-)

  torch_arr_from = torch.from_numpy(arr)


RuntimeError: Numpy is not available
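
The difference between the two conversions can be checked with a minimal sketch like the one below. It assumes NumPy and PyTorch are installed so that they can talk to each other (unlike in the failed cell above); the names `shared` and `copied` are only illustrative.

```python
import numpy as np
import torch

arr = np.arange(1, 6)            # [1, 2, 3, 4, 5]
shared = torch.from_numpy(arr)   # shares memory with arr
copied = torch.tensor(arr)       # independent copy of arr

arr[2] = 999                     # modify the numpy array in place

print(shared)  # index 2 now shows 999
print(copied)  # index 2 is still 3
```

If a tensor should never change behind your back, `torch.tensor` is the safer choice; `torch.from_numpy` avoids the copy when the array is large.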

## Some API to create tensor

'torch.empty(size)'

'torch.ones(size)'

'torch.zeros(size)'

'torch.arange(start,stop(ex),step)'

'torch.linspace(start,stop(in),steps)' - steps = number of evenly spaced points, stop is included (for powers of 10, use 'torch.logspace')

'torch.rand(size)' - [0,1)

'torch.randn(size)' - mean = 0, std = 1 with normal distribution

'torch.randint(low, high ,size)' - [low, high)

'torch.ones_like(input)' = 'torch.ones(input.shape)', but keeping the dtype and device of input

'torch.zeros_like(input)'

'torch.rand_like(input)'
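
A short sketch of the creation functions listed above. The sizes and ranges are arbitrary choices for illustration.

```python
import torch

zeros = torch.zeros(2, 3)             # 2x3 tensor filled with 0.
ones = torch.ones(2, 3)               # 2x3 tensor filled with 1.
steps = torch.arange(0, 10, 2)        # 0, 2, 4, 6, 8 (stop is excluded)
points = torch.linspace(0, 1, 5)      # 5 evenly spaced points, stop is included
uniform = torch.rand(1000)            # uniform samples in [0, 1)
normal = torch.randn(1000)            # normal samples with mean 0 and std 1
ints = torch.randint(0, 10, (2, 3))   # integers in [0, 10)
like = torch.ones_like(ints)          # same shape and dtype as ints, filled with 1

print(steps)
print(points)
print(uniform.min(), uniform.max())
print(normal.mean(), normal.std())
print(like.dtype)
```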

In [6]:
#import some deep learning layer
#you have to help me create the right shape to insert to this layer

import torch.nn as nn  #nn contains a lot of useful deep learning layers

linear_layer = nn.Linear(5, 1)  #basically you insert 5 features, output 1 number
linear_layer.weight  #they treat this as theta; the layer computes X @ theta^T + bias
linear_layer.bias
#[0.1315, 0.3990, 0.0960, 0.0807, 0.2908]
#weight - [1, 5], so weight^T - [5, 1]
#X @ weight^T
#(anything, 5) @ (5, 1)

#can you guys help me generate any pytorch tensor of size (?, ?)
data   = torch.rand(1000, 5)
output = linear_layer(data)
print(output.shape)  #output shape?? - 1000, 1

torch.Size([1000, 1])
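
To see what the layer does with its weight and bias, the sketch below recomputes the output by hand. It assumes nothing beyond `nn.Linear` itself; `manual` is an illustrative name.

```python
import torch
import torch.nn as nn

linear_layer = nn.Linear(5, 1)
data = torch.rand(1000, 5)

print(linear_layer.weight.shape)  # (out_features, in_features) = (1, 5)
print(linear_layer.bias.shape)    # (out_features,) = (1,)

# (1000, 5) @ (5, 1) + (1,) -> (1000, 1)
manual = data @ linear_layer.weight.T + linear_layer.bias
output = linear_layer(data)

print(torch.allclose(manual, output, atol=1e-6))
```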


In [7]:
torch.manual_seed(9999)  #this makes sure your weights are always initialized to the same values
#this seed is VERY IMPORTANT for research
#you CANNOT FORGET THIS - running with 5 different seeds shows how much your results vary, a bit like the folds in cross validation
#please create two linear layers of size (100, 5), (5, 1)
layer1 = nn.Linear(100, 5)
layer2 = nn.Linear(5,   1)

#try some input that pass through these two layers
sample_size = 1000
_input = torch.rand(sample_size, 100)
# _input = layer1(_input)
# _input = layer2(_input)
# _input.shape

#try nn.Sequential
model = nn.Sequential(
    layer1,
    layer2
)
_input = model(_input)
_input.shape

torch.Size([1000, 1])
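
A small sketch of what the seed buys you: layers created right after the same seed start from identical weights, while a layer created without reseeding does not. The names `first`, `second` and `third` are illustrative.

```python
import torch
import torch.nn as nn

torch.manual_seed(9999)
first = nn.Linear(100, 5)

torch.manual_seed(9999)       # reset the random generator to the same state
second = nn.Linear(100, 5)

third = nn.Linear(100, 5)     # no reseed: the generator has moved on

print(torch.equal(first.weight, second.weight))  # same initial weights
print(torch.equal(first.weight, third.weight))   # different initial weights
```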

In [8]:
#changing the type
#format: .type()

x = torch.arange(1,6)
x.dtype

x.type(torch.float64) #this is NOT in-place; it returns a new tensor
x.dtype

torch.int64
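
Because the conversion is not in-place, the result has to be assigned to a name to be kept. A minimal sketch, also showing `.to()` and `.float()` as alternative spellings of the same conversion:

```python
import torch

x = torch.arange(1, 6)              # int64 by default
x_double = x.type(torch.float64)    # new tensor; x itself is untouched
x_float = x.to(torch.float32)       # .to() is another way to convert
x_short = x.float()                 # shorthand for float32

print(x.dtype, x_double.dtype, x_float.dtype, x_short.dtype)

x = x.type(torch.float64)           # reassign to keep the conversion
print(x.dtype)
```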

## Reshape and view
- they are very similar
- view never copies: the result always shares memory with the original tensor
- reshape returns a view when it can, and falls back to a copy when it cannot
- view needs a compatible memory layout (e.g. a contiguous tensor) and raises an error otherwise, while reshape always works
- contiguous array - shares consecutive memory x001 x002
- non-contiguous array - memory in different places x001 x140 x004
- some algorithms/models/cuda require your array to be contiguous
    - in those cases, use .contiguous() (or reshape) to fix it


In [9]:
x = torch.arange(10)
x

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [10]:
y = x.view(2,5)
y[0,0] = 9999
y.is_contiguous()

True

In [11]:
#please help me check x, does it change?
x #x and y share memory

tensor([9999,    1,    2,    3,    4,    5,    6,    7,    8,    9])

In [12]:
z = x.reshape(2,5) #may or may not be a copy; see the documentation
z.is_contiguous()

z[0,1] = 888833
x

tensor([  9999, 888833,      2,      3,      4,      5,      6,      7,      8,
             9])

In [13]:
z_transpose = z.transpose(1,0) #5,2
z_transpose.shape
z_transpose.is_contiguous()

False
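
The transposed tensor is a good place to see where view and reshape part ways. The sketch below rebuilds the same tensor and tries both; `view_failed`, `flat_copy` and `fixed` are illustrative names.

```python
import torch

x = torch.arange(10)
z_transpose = x.view(2, 5).transpose(1, 0)   # shape (5, 2), non-contiguous

try:
    z_transpose.view(10)                     # view cannot flatten this layout
    view_failed = False
except RuntimeError:
    view_failed = True

flat_copy = z_transpose.reshape(10)          # reshape falls back to a copy here
flat_copy[0] = -1                            # so this does not touch x

fixed = z_transpose.contiguous()             # copy laid out in consecutive memory
flat_view = fixed.view(10)                   # now view works

print(view_failed, x[0].item(), fixed.is_contiguous())
print(flat_view)
```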

## One step back-propagation

The process of using derivatives to learn the weights is called **gradient descent**. In deep learning, where many layers are stacked together, the derivatives are computed layer by layer with the chain rule, and this gets a new name - **backpropagation**


$$y = 2x^4 + x^3 +3x^2 +5x +1 $$
$$ \frac{dy}{dx} = y' = 8x^3 + 3x^2 +6x +5 $$

In [14]:
#why pytorch is so amazing!!!
#because pytorch automatically calculates this gradient, always available
#that's why pytorch is very nice for deep learning
#requires_grad=True makes sure pytorch will always track and calculate the gradient/derivative
#when requires_grad=True, the number MUST BE FLOAT
#try changing the 2. --> 2 --> there will be an error
x = torch.tensor(2., requires_grad=True)
print(x.grad) #no derivatives calculated yet (until we call .backward())

None


In [15]:
y = 2*x ** 4 + x ** 3 + 3*x ** 2 + 5*x + 1
print(y)

tensor(63., grad_fn=<AddBackward0>)


In [16]:
#call backward, which calculates all the gradients relevant to y
y.backward() #works in place: it returns nothing and stores the gradient in x.grad

In [17]:
#now try printing x.grad; this will basically be dy/dx at the point x = 2
x.grad

#can you check how I get 93????

tensor(93.)

$$ y' = 8x^3 + 3x^2 +6x +5 $$

- try out x = 2 here: 8(2)^3 + 3(2)^2 + 6(2) + 5 = 64 + 12 + 12 + 5 = 93
- this is the derivative/gradient/slope/rate of change at the point (2, 63)

In [18]:
dy = 8*x ** 3 + 3*x ** 2 + 6*x + 5
dy

tensor(93., grad_fn=<AddBackward0>)
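
To connect this gradient back to gradient descent, here is a minimal sketch of a single update step on the same polynomial. The learning rate `lr` and the helper `f` are assumptions made for illustration; they are not part of the notebook above.

```python
import torch

def f(x):
    # the same polynomial as above
    return 2*x**4 + x**3 + 3*x**2 + 5*x + 1

x = torch.tensor(2., requires_grad=True)
y = f(x)
y.backward()                      # x.grad is now dy/dx at x = 2

lr = 0.01                         # hypothetical learning rate
with torch.no_grad():             # the update itself should not be tracked
    x_new = x - lr * x.grad       # move against the gradient

y_new = f(x_new)
print(y.item(), x_new.item(), y_new.item())
```

Moving against the gradient lowers `y`, which is exactly what training does to the loss, one step at a time.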

## Multiple step back-propagation

$$ y = 3x +2$$
$$ z = 2y^2 $$
$$ o = z/6 $$ 
- let's assume we have 6 elements, so o is the mean of z (each element contributes z/6)

$$ \frac{\partial o}{\partial x} = \frac{\partial o}{\partial z} * \frac{\partial z}{\partial y} * \frac{\partial y}{\partial x}$$

$$\frac{\partial o}{\partial z}= \frac{1}{6} $$

$$\frac{\partial z}{\partial y}= 4y = 4(3x +2) $$

$$\frac{\partial y}{\partial x} = 3 $$

$$ \frac{\partial o}{\partial x} = \frac{1}{6} * 4(3x+2) * 3 = 2(3x+2) $$

In [19]:
x = torch.tensor([[1.,2,3],[3,2,1]],requires_grad=True)

In [20]:
y = 3*x +2
z = 2*y **2
o = z.mean()

In [21]:
o.backward()

In [22]:
x.grad #try to find out how we got 10,16,22!?

tensor([[10., 16., 22.],
        [22., 16., 10.]])
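
One way to check the hand derivation is to evaluate $2(3x+2)$ directly and compare it with what autograd stored. A minimal sketch, with `analytic` as an illustrative name:

```python
import torch

x = torch.tensor([[1., 2, 3], [3, 2, 1]], requires_grad=True)
y = 3*x + 2
z = 2*y**2
o = z.mean()
o.backward()

# hand-derived gradient 2(3x + 2); detach() so this is plain arithmetic
analytic = 2 * (3*x.detach() + 2)

print(analytic)
print(torch.allclose(x.grad, analytic))
```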

## Exercise

Use the same x

$$
\begin{align}
y &= 10x-9999 \\
z &= 5 - y \\
o &= 3z^2 \\  
oo & = \frac{o}{6} \\
\end{align}
$$

Task1: Calculate all the gradients

$$ \frac{\partial oo}{\partial x} = \frac{\partial oo}{\partial o} *\frac{\partial o}{\partial z} * \frac{\partial z}{\partial y} * \frac{\partial y}{\partial x}$$

$$ \frac{\partial oo}{\partial o} = \frac{1}{6} \quad \frac{\partial o}{\partial z}= 6z \quad \frac{\partial z}{\partial y}= -1 \quad \frac{\partial y}{\partial x} = 10 \quad \frac{\partial oo}{\partial x} = \frac{1}{6} * 6z * (-1) * 10 = -10z $$

Task 2: code it and check whether it matches yours

Put on the chat if you are done; you can put on the chat for task 1 first

In [23]:
x = torch.tensor([[1.,2,3],[3,2,1]],requires_grad=True)
y = 10*x-9999
z = 5 - y
o = 3*z ** 2
oo = o.mean() #we have to reduce it to one number .... for backpropagation

In [24]:
oo.backward()

In [25]:
x.grad

tensor([[-99940., -99840., -99740.],
        [-99740., -99840., -99940.]])

In [26]:
-10 * (5- 10*1 + 9999)  #-10z at x = 1, where z = 5 - (10*1 - 9999)

-99940
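
The intermediate gradients of the chain can be inspected as well. By default PyTorch only keeps `.grad` for leaf tensors such as `x`, so the sketch below calls `retain_grad()` on the intermediate tensors before the backward pass.

```python
import torch

x = torch.tensor([[1., 2, 3], [3, 2, 1]], requires_grad=True)
y = 10*x - 9999
z = 5 - y
o = 3*z**2
oo = o.mean()

y.retain_grad()   # keep the gradient of these intermediate (non-leaf) tensors
z.retain_grad()
oo.backward()

print(z.grad)   # d(oo)/dz = 6z / 6 = z
print(y.grad)   # d(oo)/dy = -z
print(x.grad)   # d(oo)/dx = -10z
```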

## Exercise


In [27]:
x = torch.tensor([[1.,2,3],[3,2,1]],requires_grad=True)
x.shape

torch.Size([2, 3])

In [28]:
w = torch.arange(3.).view(3,1)
w.shape #[3,1]
w

tensor([[0.],
        [1.],
        [2.]])

In [29]:
o = x @ w
o

tensor([[8.],
        [4.]], grad_fn=<MmBackward0>)

In [30]:
oo = o.mean()
oo

tensor(6., grad_fn=<MeanBackward0>)

In [31]:
oo.backward()

In [32]:
oo

tensor(6., grad_fn=<MeanBackward0>)

In [33]:
x.grad

tensor([[0.0000, 0.5000, 1.0000],
        [0.0000, 0.5000, 1.0000]])

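### Dot product

$$ X = \begin{bmatrix} X_{11} & X_{12} & X_{13} \\ X_{21} & X_{22} & X_{23} \end{bmatrix} \quad W = \begin{bmatrix} W_1 \\ W_2 \\ W_3 \end{bmatrix} $$

$$ O = X @ W = \begin{bmatrix} X_{11}W_1 + X_{12}W_2 + X_{13}W_3 \\ X_{21}W_1 + X_{22}W_2 + X_{23}W_3 \end{bmatrix} = \begin{bmatrix} O_1 \\ O_2 \end{bmatrix} $$

$$ OO = \frac{1}{2}(O_1 + O_2) $$

$$ \frac{\partial OO}{\partial X} = \frac{\partial OO}{\partial O} * \frac{\partial O}{\partial X} $$

$$ \frac{\partial OO}{\partial O_i} = \frac{1}{2} \quad \frac{\partial O_i}{\partial X_{ij}} = W_j \quad \frac{\partial OO}{\partial X_{ij}} = \frac{W_j}{2} $$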

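### Dot product with numbers

$$ X = \begin{bmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{bmatrix} \quad W = \begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix} $$

$$ O = X @ W = \begin{bmatrix} 1*0 + 2*1 + 3*2 \\ 3*0 + 2*1 + 1*2 \end{bmatrix} = \begin{bmatrix} 8 \\ 4 \end{bmatrix} $$

$$ OO = \frac{1}{2}(8 + 4) = 6 $$

$$ \frac{\partial OO}{\partial X} = \frac{\partial OO}{\partial O} * \frac{\partial O}{\partial X} = \frac{1}{2} \begin{bmatrix} 0 & 1 & 2 \\ 0 & 1 & 2 \end{bmatrix} = \begin{bmatrix} 0 & 0.5 & 1 \\ 0 & 0.5 & 1 \end{bmatrix} $$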
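
The chain rule above says every row of the gradient is the weight vector divided by the number of outputs (2 here). A minimal sketch comparing that with autograd, where `expected` is an illustrative name:

```python
import torch

x = torch.tensor([[1., 2, 3], [3, 2, 1]], requires_grad=True)
w = torch.arange(3.).view(3, 1)

oo = (x @ w).mean()
oo.backward()

# each row of dOO/dX is W^T / 2
expected = (w.T / 2).expand(2, 3)

print(x.grad)
print(torch.allclose(x.grad, expected))
```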

### Matrix Multiplication Dot Product

In [34]:
x = torch.tensor([[1.,2,3],[3,2,1]],requires_grad=True)
x.shape

torch.Size([2, 3])

In [66]:
w = torch.arange(3.).view(3,1)
w.shape

torch.Size([3, 1])

In [67]:
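m,p = x.shape  #(2, 3)
_,n = w.shape  #(3, 1)
o = np.zeros((m,n))

# multiply matrix: o[i][j] = sum over k of x[i][k] * w[k][j]
for i in range(m):
   for j in range(n):
      for k in range(p):
         o[i][j] += (x[i][k] * w[k][j]).item()  #.item() turns the 0-d tensor into a python float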

print(o)
print(o.shape)

[[8.]
 [4.]]
(2, 1)
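
The same triple loop can be kept entirely in PyTorch and compared against the built-in `@` operator. This is only a sketch for checking the logic; `naive_matmul` is a hypothetical helper, and in practice `x @ w` is what you would use.

```python
import torch

def naive_matmul(a, b):
    """Triple-loop matrix product of a (m, p) and b (p, n)."""
    m, p = a.shape
    p2, n = b.shape
    assert p == p2, "inner dimensions must match"
    out = torch.zeros(m, n)
    for i in range(m):
        for j in range(n):
            for k in range(p):
                out[i, j] += a[i, k] * b[k, j]
    return out

x = torch.tensor([[1., 2, 3], [3, 2, 1]])
w = torch.arange(3.).view(3, 1)

print(naive_matmul(x, w))
print(torch.allclose(naive_matmul(x, w), x @ w))
```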


In [72]:
x = torch.tensor([[1.,2,3],[3,2,1]],requires_grad=True)
x.shape

w = torch.arange(3.).view(1,3)
w.shape

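m,_ = x.shape  #(2, 3)
_,n = w.shape  #(1, 3)
o = np.zeros((m,n))

# (2, 3) @ (1, 3) is not a valid matrix product
# w has only one row here, so k is always 0 and o[i][j] = x[i][0] * w[0][j]
for i in range(len(x)):
   for j in range(len(w[0])):
      for k in range(len(w)):
         o[i][j] = (x[i][k] * w[k][j]).item()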
         
print(o,'\n')
print(o/2)

[[0. 1. 2.]
 [0. 3. 6.]] 

[[0.  0.5 1. ]
 [0.  1.5 3. ]]
