## Where are we now
1. Python - general problem solving
2. Data Science - NumPy, Pandas, Sklearn, Matplotlib 
3. ML from Scratch - Intuition (a foundation for those who want to go further)
4. Signal Processing - Energy, Telecommunications, Biosignals, Time Series
5. Deep Learning - PyTorch
   1. One of the most popular DL frameworks (alongside TensorFlow)

## Deep Learning vs. Machine Learning

Good News
- Deep Learning can learn features automatically, reducing the need for manual feature engineering / feature selection
- Deep Learning keeps benefiting from huge amounts of data, while classical Machine Learning tends to plateau much earlier
  - Going from 100 to 1000 samples, a classical ML model often gains little accuracy
  - But DL will usually see increased accuracy
- Deep Learning is basically stacking a lot of linear regressions together, with a non-linear activation between the layers
  - DL can learn very complex patterns
  - DL is perfect for (1) images, (2) text, (3) signals (very random)

Bad News
- Deep Learning sucks with small data (vs. Machine Learning) - as a rough guide, it wants 5000+ samples
- For Tabular Data, Deep Learning will ALMOST ALWAYS LOSE TO gradient boosting (or its variants)
  - Gradient Boosting is basically decision trees stacked one after another, each one correcting the errors of the previous ones
  - In most competitions, XGBoost and LightGBM are usually the winners for tabular data
  - If you work in a company, the data is mostly tabular, so you should look at gradient boosting methods first
- Deep Learning has NO built-in feature importance, so it's mostly a black box (this is what Explainable AI tries to address)

# PyTorch

In [2]:
import torch #pip install torch or pip3 install torch or conda install torch
import numpy as np



In [4]:
np.__version__

'1.23.1'

In [5]:
torch.__version__

'1.12.0'

## Torch Tensors

PyTorch doesn't use NumPy arrays.  Instead, it has its own data structure, called `Tensor`, which supports automatic differentiation.

### Create torch tensors from NumPy

In [15]:
#create a numpy array of 1 to 5
arr = np.arange(1, 6)
# arr

#print the data type
arr.dtype  #int64

#print the type()
type(arr)  #type() is a built-in of Python itself

numpy.ndarray

In [16]:
#convert numpy to tensor

#1. from_numpy (NOT a copy; it shares memory with the numpy array)
torch_arr_from = torch.from_numpy(arr)
torch_arr_from.dtype  #torch.int64
type(torch_arr_from)  #torch.Tensor
torch_arr_from.type() #torch.LongTensor (int64); torch.IntTensor (int32)
                      #torch.FloatTensor (float32); torch.DoubleTensor (float64)
#from_numpy shares memory!!!  This is intended, for cheap conversion between numpy and tensor...
# arr[2] = 999
# torch_arr_from  #the change would show up here too

#2. tensor (a copy)
torch_arr_tensor = torch.tensor(arr)  #everything is the same, except it IS a copy
arr[2] = 9999999
torch_arr_tensor  #unchanged, because it does not share memory with arr

#In our class, mostly we use torch.tensor; it won't fail us :-)


tensor([1, 2, 3, 4, 5])
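
To see the memory behaviour of the two conversions side by side, here is a minimal sketch. The variable names `shared` and `copied` are only illustrative; the point is that `torch.from_numpy` shares memory with the NumPy array while `torch.tensor` copies the data.

```python
import numpy as np
import torch

arr = np.arange(1, 6)

shared = torch.from_numpy(arr)  # shares memory with arr
copied = torch.tensor(arr)      # independent copy of the data

arr[2] = 999  # modify the NumPy array in place

print(shared)  # the change is visible through the shared tensor
print(copied)  # the copy still holds the original values
```

This is why the output above still shows `tensor([1, 2, 3, 4, 5])` after `arr[2]` was overwritten.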

## Some API to create tensor

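- `torch.empty(size)` - uninitialized memory, so the values are arbitrary
- `torch.ones(size)`
- `torch.zeros(size)`
- `torch.arange(start, stop, step)` - `stop` is exclusive
- `torch.linspace(start, stop, steps)` - `steps` evenly spaced values, `stop` included
- `torch.logspace(start, stop, steps)` - `steps` values from 10^start to 10^stop (base 10 by default)

- `torch.rand(size)` - uniform distribution on [0, 1)
- `torch.randn(size)` - standard normal distribution (mean = 0, std = 1)
- `torch.randint(low, high, size)` - integers in [low, high)

- `torch.ones_like(input)` = `torch.ones(input.shape)` (also keeps the dtype and device of `input`)
- `torch.zeros_like(input)`
- `torch.rand_like(input)`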
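
A quick sketch of a few of these creation functions in use. The shapes, ranges and the seed value are arbitrary choices for illustration.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only so the random tensors repeat

a = torch.zeros(2, 3)             # 2x3 tensor of zeros
b = torch.arange(0, 10, 2)        # 0, 2, 4, 6, 8 (stop is exclusive)
c = torch.linspace(0, 1, 5)       # 5 evenly spaced values from 0 to 1
d = torch.logspace(0, 2, 3)       # 10^0, 10^1, 10^2
e = torch.rand(1000)              # uniform on [0, 1)
f = torch.randn(1000)             # standard normal: mean 0, std 1
g = torch.randint(0, 10, (4,))    # 4 integers in [0, 10)
h = torch.ones_like(a)            # same shape and dtype as a, filled with ones

print(b, c, d, sep="\n")
print(e.min(), e.max())           # stays inside [0, 1)
print(f.mean(), f.std())          # close to 0 and 1, not exact
```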

In [24]:
#import some deep learning layer
#you have to help me create the right shape to insert to this layer

import torch.nn as nn  #nn contains a lot of useful deep learning layers

linear_layer = nn.Linear(5, 1)  #basically you insert 5 features, output 1 number
linear_layer.weight  #they treat this as theta; the layer computes X @ theta^T + bias
linear_layer.bias
#[0.1315, 0.3990, 0.0960, 0.0807, 0.2908]
#weight - [1, 5], i.e. (out_features, in_features)
#X @ weight^T
#(anything, 5) @ (5, 1)

#can you guys help me generate any pytorch tensor of size (?, ?)
data   = torch.rand(1000, 5)
output = linear_layer(data)
print(output.shape)  #output shape?? - 1000, 1

torch.Size([1000, 1])
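
To make the shapes concrete, the sketch below reproduces the layer's output by hand. It assumes nothing beyond `nn.Linear` itself; `manual` is just an illustrative name, and the seed is arbitrary.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed for repeatable weights

linear_layer = nn.Linear(5, 1)
print(linear_layer.weight.shape)  # (out_features, in_features)
print(linear_layer.bias.shape)    # (out_features,)

data = torch.rand(1000, 5)
output = linear_layer(data)

# Same computation written out: (1000, 5) @ (5, 1) + (1,)
manual = data @ linear_layer.weight.T + linear_layer.bias

print(output.shape)
print(torch.allclose(output, manual, atol=1e-6))
```

The only constraint on the input is that its last dimension matches `in_features` (5 here); the number of rows is free.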


In [31]:
torch.manual_seed(9999)  #this makes sure your weights are always initialized to the same values
#this seed is VERY IMPORTANT for research (reproducibility)
#you CANNOT FORGET THIS - repeating a run with 5 different seeds shows how much results vary with initialization (similar in spirit to cross validation)
#please create two linear layers of size (4, 3), (3, 1)
layer1 = nn.Linear(4,  3)
layer2 = nn.Linear(3,  1)

#try some input that pass through these two layers
sample_size = 3  #this is VERY UNCLEAN.....
_input = torch.rand(sample_size, 4)
# _input = layer1(_input)
# _input = layer2(_input)
# _input.shape

#try nn.Sequential
model = nn.Sequential(
    layer1,
    layer2
)

_input = model(_input)
_input.shape

torch.Size([3, 1])
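
A small sketch of what the seed buys you: building the same model twice with the same seed gives identical initial weights, while a different seed gives different ones. The helper `make_model` and the second seed `1234` are hypothetical, introduced only for this illustration.

```python
import torch
import torch.nn as nn

def make_model(seed):
    """Build the two-layer model after fixing the random seed."""
    torch.manual_seed(seed)
    return nn.Sequential(nn.Linear(4, 3), nn.Linear(3, 1))

model_a = make_model(9999)
model_b = make_model(9999)  # same seed as model_a
model_c = make_model(1234)  # different seed

same = torch.equal(model_a[0].weight, model_b[0].weight)
different = not torch.equal(model_a[0].weight, model_c[0].weight)

print(same)       # identical initial weights with the same seed
print(different)  # different initial weights with another seed
```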

In [37]:
#changing the type
#format: .type()

x = torch.arange(1, 6)
x.dtype

x = x.type(torch.float64)  #is NOT in-place
x.dtype


torch.float64
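
As a side note, `.type()` is not the only way to convert. The sketch below shows two other common spellings, `.to(dtype)` and the `.float()` shorthand; none of them modifies the original tensor.

```python
import torch

x = torch.arange(1, 6)               # int64 by default

x_double = x.type(torch.float64)     # the form used above
x_to     = x.to(torch.float32)       # .to() also accepts a dtype
x_float  = x.float()                 # shorthand for float32

print(x.dtype)                       # the original is untouched
print(x_double.dtype, x_to.dtype, x_float.dtype)
```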

### Reshape and view
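- they are very similar
- view never copies: the result always shares memory with the original tensor, while reshape may or may not copy
- view only works when the new shape is compatible with the tensor's memory layout (always the case for a contiguous tensor), while reshape works for both and copies only when it has to
- contiguous tensor - elements sit in consecutive memory x001 x002
- non-contiguous tensor - elements are not stored in order x001 x140 x004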

In [3]:
x = torch.arange(10)
x

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [4]:
y = x.view(2, 5)
y[0, 0] = 9999
y.is_contiguous()

True

In [5]:
y

tensor([[9999,    1,    2,    3,    4],
        [   5,    6,    7,    8,    9]])

In [41]:
#please help me check x, does it change?
x  #x and y shares memory

tensor([9999,    1,    2,    3,    4,    5,    6,    7,    8,    9])

In [44]:
z = x.reshape(2, 5)  #may or may not be a copy; see the documentation (here it is a view)
z.is_contiguous()

z[0, 1] = 888833
x

tensor([  9999, 888833,      2,      3,      4,      5,      6,      7,      8,
             9])

In [46]:
z_transpose = z.transpose(1, 0)  #(5, 2)
z_transpose.shape
z_transpose.is_contiguous()

False
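
The transposed tensor above is a handy test case for the difference between `view` and `reshape`. In this sketch (a fresh `x` is used so the values are easy to follow), `view` refuses to flatten the non-contiguous tensor, while `reshape` succeeds by making a copy.

```python
import torch

x = torch.arange(10)
z_transpose = x.view(2, 5).transpose(1, 0)  # shape (5, 2), non-contiguous

# view cannot flatten a tensor whose memory layout does not match
try:
    z_transpose.view(10)
    view_failed = False
except RuntimeError:
    view_failed = True

# reshape falls back to copying the data
flat = z_transpose.reshape(10)
flat[0] = -1  # only the copy changes

print(view_failed)
print(flat)
print(x)      # x still starts with 0, so flat does not share memory with it
```

If you need `view` on such a tensor, calling `.contiguous()` first gives you a contiguous copy to work with.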


## One step back-propagation

The process of using derivatives to learn the weights is **gradient descent**.  In deep learning we can have many layers stacked together, so the derivatives are computed layer by layer with the chain rule, and this procedure has its own name - **backpropagation**

$$y = 2x^4 + x^3 + 3x^2 + 5x + 1$$

$$\frac{dy}{dx} = y' = 8x^3 + 3x^2 + 6x + 5$$

In [49]:
#why pytorch is so amazing!!!
#because pytorch automatically calculates this gradient, always available
#that's why pytorch is very nice for deep learning

#requires_grad=True, make sure we gonna let pytorch always calculate the gradient/derivatives
#when requires_grad=True, number MUST BE FLOAT
#try remove the 2. --> 2 --> there will be error......
x = torch.tensor(2., requires_grad=True)
print(x.grad)  #no derivative calculated yet (until we call .backward())

None


In [50]:
y = 2*x**4 + x**3 + 3*x**2 + 5*x + 1
print(y)

tensor(63., grad_fn=<AddBackward0>)


In [51]:
#call backward, which is going to calculate all the gradients relevant to y
y.backward()  #works in place: it fills in x.grad and returns nothing

In [52]:
#try now print x.grad  #this will be basically dy/dx at the point of x = 2
x.grad

#can you check how I get 93????

tensor(93.)

$$y' = 8x^3 + 3x^2 + 6x + 5$$

- try put x = 2 here
- this is the derivative/gradient/slope/rate of change at the point (2, 63)
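
A short check of the answer: the sketch below plugs $x = 2$ into the hand-derived formula and compares it with what autograd reports. The name `manual` is only illustrative.

```python
import torch

x = torch.tensor(2., requires_grad=True)
y = 2*x**4 + x**3 + 3*x**2 + 5*x + 1
y.backward()

# y' = 8x^3 + 3x^2 + 6x + 5 at x = 2: 64 + 12 + 12 + 5
manual = 8 * 2**3 + 3 * 2**2 + 6 * 2 + 5

print(manual)
print(x.grad)
```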

## Multiple steps back-propagation

$$y = 3x + 2$$
$$z = 2y^2  $$
$$o = z / 6 $$ 

- let's assume we have 6 elements, so $o$ is the mean of $z$ over those 6 elements

$$\frac{\partial o}{\partial x} = \frac{\partial o}{\partial z} * \frac{\partial z}{\partial y} * \frac{\partial y}{\partial x}$$

$$\frac{\partial o}{\partial z} = \frac{1}{6}$$

$$\frac{\partial z}{\partial y} = 4(y) = 4(3x + 2)$$

$$\frac{\partial y}{\partial x} = 3$$

$$\frac{\partial o}{\partial x} = 2(3x + 2)$$

In [53]:
x = torch.tensor([[1., 2, 3], [3, 2, 1]], requires_grad=True)

In [54]:
y = 3*x + 2
z = 2*y**2
o = z.mean()

In [55]:
o.backward()

In [57]:
x.grad  #try to find out how did we get 10, 16, 22!?

tensor([[10., 16., 22.],
        [22., 16., 10.]])
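
To confirm where 10, 16 and 22 come from, this sketch evaluates the chain-rule result $2(3x + 2)$ for every element and compares it with `x.grad`. The name `manual` is illustrative.

```python
import torch

x = torch.tensor([[1., 2, 3], [3, 2, 1]], requires_grad=True)
y = 3*x + 2
z = 2*y**2
o = z.mean()
o.backward()

# Chain rule by hand: (1/6) * 4(3x + 2) * 3 = 2(3x + 2)
manual = 2 * (3 * x.detach() + 2)

print(manual)
print(torch.allclose(x.grad, manual))
```

For example, at $x = 1$ the formula gives $2(3 + 2) = 10$, and at $x = 3$ it gives $2(9 + 2) = 22$.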

## Exercise

#use the same x

$$
\begin{align}
y  &= 10x - 9999 \\
z  &= 5 - y \\
o  &= 3z^2 \\
oo & = \frac{o}{6} \\
\end{align}
$$

Task1: calculate all the gradients

$$\frac{\partial oo}{\partial x} =  \frac{\partial oo}{\partial o} * \frac{\partial o}{\partial z} * \frac{\partial z}{\partial y} * \frac{\partial y}{\partial x}$$

$$\frac{\partial oo}{\partial o} = \frac{1}{6}, \quad \frac{\partial o}{\partial z} = 6z, \quad \frac{\partial z}{\partial y} = -1, \quad \frac{\partial y}{\partial x} = 10, \quad \frac{\partial oo}{\partial x} = -10z$$

Task2: code and try whether it matches yours

Put on the chat if you are done; you can put on the chat for task 1 first

In [62]:
x = torch.tensor([[1., 2, 3], [3, 2, 1]], requires_grad=True)
y = 10*x - 9999
z = 5 - y
o = 3*z ** 2
oo = o.mean() #we have to make it into one number....for backpropagation

In [63]:
oo.backward()

In [64]:
x.grad

tensor([[-99940., -99840., -99740.],
        [-99740., -99840., -99940.]])

In [67]:
-10 * (5 - 10*1 + 9999)

-99940
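
The single-value check above can be extended to every element at once. This sketch evaluates $-10z$ with $z = 5 - (10x - 9999)$ and compares it to the gradient from autograd; `manual` is an illustrative name.

```python
import torch

x = torch.tensor([[1., 2, 3], [3, 2, 1]], requires_grad=True)
y = 10*x - 9999
z = 5 - y
o = 3*z**2
oo = o.mean()
oo.backward()

# Chain rule by hand: (1/6) * 6z * (-1) * 10 = -10z
manual = -10 * (5 - (10 * x.detach() - 9999))

print(manual)
print(torch.allclose(x.grad, manual))
```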

## Exercise

In [71]:
x = torch.tensor([[1., 2, 3], [3, 2, 1]], requires_grad=True)
x.shape

torch.Size([2, 3])

In [81]:
w = torch.arange(3.).view(3, 1)
w.shape #[3, 1]
w

tensor([[0.],
        [1.],
        [2.]])

In [73]:
o = x @ w

In [76]:
oo = o.mean()

In [77]:
oo.backward()

In [78]:
oo

tensor(6., grad_fn=<MeanBackward0>)

In [80]:
x.grad  #how did i get these numbers??

tensor([[0.0000, 0.5000, 1.0000],
        [0.0000, 0.5000, 1.0000]])

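$$X = \begin{bmatrix} X_{11} & X_{12} & X_{13} \\ X_{21} & X_{22} & X_{23} \end{bmatrix} \quad W = \begin{bmatrix} W_1 \\ W_2 \\ W_3 \end{bmatrix}$$

$$O = XW = \begin{bmatrix} X_{11}W_1 + X_{12}W_2 + X_{13}W_3 \\ X_{21}W_1 + X_{22}W_2 + X_{23}W_3 \end{bmatrix}$$

$$OO = \frac{1}{2}(O_1 + O_2)$$

$$\frac{\partial OO}{\partial X} = \frac{\partial OO}{\partial O} * \frac{\partial O}{\partial X}$$

$$\frac{\partial OO}{\partial O_i} = \frac{1}{2} \quad \frac{\partial O_i}{\partial X_{ij}} = W_j \quad \frac{\partial OO}{\partial X_{ij}} = \frac{W_j}{2}$$

- with $W = (0, 1, 2)$, every row of the gradient is $(0, 0.5, 1)$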
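
The derivation can be checked numerically. In this sketch each entry of the gradient should equal the matching weight divided by 2 (because of the mean over 2 outputs), repeated for both rows; `manual` is an illustrative name.

```python
import torch

x = torch.tensor([[1., 2, 3], [3, 2, 1]], requires_grad=True)
w = torch.arange(3.).view(3, 1)   # column vector (0, 1, 2)

o = x @ w          # shape (2, 1)
oo = o.mean()      # average of the 2 outputs
oo.backward()

# dOO/dX_ij = W_j / 2, the same for every row i
manual = (w.T / 2).expand(2, 3)

print(manual)
print(torch.allclose(x.grad, manual))
```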