## Where are we now
1. Python - general problem solving
2. Data Science - NumPy, Pandas, Sklearn, Matplotlib 
3. ML from Scratch - intuition (for those who want to go deeper)
4. Signal Processing - Energy, Telecommunications, Biosignals, Time Series
5. Deep Learning - PyTorch
   1. One of the most popular DL frameworks (alongside TensorFlow)

## Deep Learning vs. Machine Learning

Good News
- Deep Learning can learn features automatically, so less manual feature engineering / feature selection is needed
- Deep Learning usually keeps improving with more data, while classical Machine Learning tends to plateau earlier
  - e.g., going from 100 to 1000 samples, a classical ML model may gain little accuracy
  - But a DL model will often keep improving as the data grows
- Deep Learning is basically stacking a lot of linear regressions together, with non-linear activations in between
  - DL can learn very complex patterns
  - DL works very well for (1) images, (2) text, (3) signals (very noisy)
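
To make the "stacking linear regressions" point concrete, here is a minimal sketch (the layer sizes and the seed are arbitrary assumptions, not from the lecture). It suggests why a non-linearity between the layers matters: two linear layers with nothing in between compute the same thing as one linear layer.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)            # arbitrary seed, only for repeatability
layer1 = nn.Linear(4, 3)        # hypothetical sizes, chosen for illustration
layer2 = nn.Linear(3, 1)
x = torch.rand(8, 4)            # 8 samples, 4 features

with torch.no_grad():
    stacked = layer2(layer1(x))                       # two "linear regressions" in a row

    # The same mapping written as ONE linear layer: x @ W^T + b
    W = layer2.weight @ layer1.weight                 # shape (1, 4)
    b = layer2.weight @ layer1.bias + layer2.bias     # shape (1,)
    collapsed = x @ W.T + b

    # With a non-linearity in between, the stack no longer collapses like this
    with_relu = layer2(torch.relu(layer1(x)))

same_as_single_layer = torch.allclose(stacked, collapsed, atol=1e-5)
print(same_as_single_layer)
```

So the extra power of a deep network comes from the activation functions between the layers, not from the linear layers alone.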

Bad News
- Deep Learning performs poorly with small data (vs. Machine Learning); as a rough rule of thumb it needs 5000+ samples
- For tabular data, Deep Learning will ALMOST ALWAYS LOSE TO gradient boosting (or its variants)
  - Gradient boosting is basically decision trees stacked one after another, each one correcting the errors of the previous ones
  - In most competitions, XGBoost and LightGBM are usually the winners for tabular data
  - If you work in a company, the data is mostly tabular, so look at gradient boosting methods first
- Deep Learning has no built-in feature importance, so it's mostly a black box (this is what Explainable AI tries to address)

# PyTorch

In [1]:
import torch #pip install torch or pip3 install torch or conda install pytorch
import numpy as np


In [2]:
np.__version__

'1.23.3'

In [3]:
torch.__version__

'1.12.1+cu102'

## Torch Tensors

PyTorch doesn't compute on NumPy arrays directly.  Instead, it has its own data structure, called `Tensor`, which supports automatic differentiation.

### Create torch tensors from NumPy

In [4]:
#create a numpy array of 1 to 5
arr = np.arange(1, 6)
# arr

#print the data type
arr.dtype  #int64

#print the type()
type(arr)  #type() is a Python built-in

numpy.ndarray

In [5]:
#convert numpy to tensor

#1. from_numpy (NOT a copy; it shares memory with the NumPy array)
torch_arr_from = torch.from_numpy(arr)
torch_arr_from.dtype  #torch.int64
type(torch_arr_from)  #torch.Tensor
torch_arr_from.type() #torch.LongTensor (int64); torch.IntTensor (int32);
                      #torch.FloatTensor (float32); torch.DoubleTensor (float64)
#from_numpy shares memory with arr!!!  This is intended, for cheap conversion between numpy and tensor...
# arr[2] = 999
# torch_arr_from      #the change in arr shows up here too

#2. tensor (always a copy)
torch_arr_tensor = torch.tensor(arr)  #everything is the same, except it IS a copy
arr[2] = 9999999
torch_arr_tensor      #unchanged, because it does not share memory with arr

#In our class, mostly we use torch.tensor; it won't fail us :-)


tensor([1, 2, 3, 4, 5])
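
A quick way to check which conversion shares memory is to modify the NumPy array afterwards and look at both tensors. This is only a small sketch; the names `shared` and `copied` are made up for illustration.

```python
import numpy as np
import torch

arr = np.arange(1, 6)

shared = torch.from_numpy(arr)   # shares memory with arr
copied = torch.tensor(arr)       # independent copy of the data

arr[2] = 999                     # modify the NumPy array afterwards

print(shared)   # the change is visible through the shared tensor
print(copied)   # the copy still holds the original values
```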

## Some API to create tensor

`torch.empty(size)` - uninitialized memory, so the values are arbitrary

`torch.ones(size)`

`torch.zeros(size)`

`torch.arange(start, stop, step)` - stop is exclusive

`torch.linspace(start, stop, how many)` - stop is inclusive

`torch.logspace(start, stop, how many)`  - powers of 10, from 10^start to 10^stop

`torch.rand(size)` - uniform on [0, 1)

`torch.randn(size)` - standard normal distribution (mean = 0, std = 1)

`torch.randint(low, high, size)` - [low, high)

`torch.ones_like(input)` - like `torch.ones(input.shape)`, with the same dtype and device as `input`
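
A small sketch that exercises a few of these constructors. The arguments and the sample size of 100,000 are arbitrary choices for illustration, and the statistics of the random tensors are only approximate.

```python
import torch

torch.manual_seed(0)   # arbitrary seed

a_range = torch.arange(0, 10, 2)     # stop is exclusive: 0, 2, 4, 6, 8
lin = torch.linspace(0, 1, 5)        # stop is inclusive: 0, 0.25, 0.5, 0.75, 1
log = torch.logspace(0, 2, 3)        # 10**0, 10**1, 10**2
print(a_range, lin, log)

u = torch.rand(100_000)              # uniform on [0, 1)
n = torch.randn(100_000)             # standard normal
print(u.min().item(), u.max().item())
print(n.mean().item(), n.std().item())   # roughly 0 and 1
```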



In [2]:
arr1 = torch.linspace(1, 5, 10)
arr1

tensor([1.0000, 1.4444, 1.8889, 2.3333, 2.7778, 3.2222, 3.6667, 4.1111, 4.5556,
        5.0000])

In [6]:
arr2 = torch.rand((2,4))
arr2.shape

torch.Size([2, 4])

In [7]:
arr2

tensor([[0.9797, 0.6278, 0.9673, 0.7281],
        [0.6260, 0.1581, 0.6334, 0.0018]])

In [8]:
#import some deep learning layer
#you have to help me create the right shape to insert to this layer

import torch.nn as nn  #nn contains a lot of useful deep learning layers

linear_layer = nn.Linear(5, 1)  #basically you insert 5 features, output 1 number
print(linear_layer.weight)  #they treat this as theta; the layer computes X @ theta^T + bias
print(linear_layer.bias)
#e.g. [[0.1315, 0.3990, 0.0960, 0.0807, 0.2908]]
#weight - [1, 5], i.e. (out_features, in_features)
#X @ weight^T
#(anything, 5) @ (5, 1)

#can you guys help me generate any pytorch tensor of size (?, ?)
data   = torch.rand(1000, 5)
output = linear_layer(data)
print(output.shape)  #output shape?? - 1000, 1

Parameter containing:
tensor([[ 0.0629,  0.3609, -0.3114, -0.1044,  0.4206]], requires_grad=True)
Parameter containing:
tensor([0.1236], requires_grad=True)
torch.Size([1000, 1])
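
To see what `nn.Linear` does with these shapes, the output can be reproduced by hand. This is a minimal sketch under an arbitrary seed; `manual` is just an illustrative name.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)                 # arbitrary seed
linear_layer = nn.Linear(5, 1)
data = torch.rand(1000, 5)

print(linear_layer.weight.shape)     # (out_features, in_features)
print(linear_layer.bias.shape)

with torch.no_grad():
    manual = data @ linear_layer.weight.T + linear_layer.bias   # (1000, 5) @ (5, 1)
    output = linear_layer(data)

matches = torch.allclose(manual, output, atol=1e-5)
print(matches)
```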


In [9]:
torch.manual_seed(9999)  #this makes sure your weights are always initialized to the same values
#this seed is VERY IMPORTANT for research (reproducibility)
#you CANNOT FORGET THIS - repeating a run with 5 different seeds shows how much the result depends on the random init
#(similar in spirit to cross validation, but it does not replace it)
#please create two linear layers of size (100, 5), (5, 1)
layer1 = nn.Linear(100, 5)
layer2 = nn.Linear(5,   1)

#try some input that pass through these two layers
sample_size = 1000
_input = torch.rand(sample_size, 100)
# _input = layer1(_input)
# _input = layer2(_input)
# _input.shape

#try nn.Sequential
model = nn.Sequential(
     layer1,
     layer2
 )

_input = model(_input)
_input.shape


torch.Size([1000, 1])
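
The effect of the seed can be checked directly. In this sketch, `make_model` is a hypothetical helper (not part of PyTorch) that rebuilds the same two-layer model under a given seed; the second seed, 1234, is arbitrary.

```python
import torch
import torch.nn as nn

def make_model(seed):
    """Hypothetical helper: build the same two-layer model under a given seed."""
    torch.manual_seed(seed)
    return nn.Sequential(nn.Linear(100, 5), nn.Linear(5, 1))

model_a = make_model(9999)
model_b = make_model(9999)   # same seed -> same initial weights
model_c = make_model(1234)   # different seed -> different initial weights

same_seed_equal = torch.equal(model_a[0].weight, model_b[0].weight)
diff_seed_equal = torch.equal(model_a[0].weight, model_c[0].weight)
print(same_seed_equal, diff_seed_equal)
```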

In [12]:
#Changing the type
#format: .type()

x = torch.arange(1,6)
x.dtype  #torch.int64

x = x.type(torch.float64) #.type() is NOT in place; it returns a new tensor, so reassign
x.dtype

torch.float64
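
Besides `.type()`, the same conversion can be written with `.to(dtype)` or the shorthand methods such as `.float()`. None of them work in place, which this short sketch illustrates (the variable names are made up).

```python
import torch

x = torch.arange(1, 6)            # torch.int64

x.type(torch.float64)             # result is discarded, x itself is untouched
print(x.dtype)

x_double = x.to(torch.float64)    # same conversion via .to()
x_float = x.float()               # shorthand for torch.float32
print(x_double.dtype, x_float.dtype)
```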

## Reshape and view
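- they are very similar
- `view` never copies: it always shares memory with the original tensor, and raises an error if the memory layout does not allow the new shape
- `reshape` returns a view when it can, and silently makes a copy when it cannot
- contiguous array - elements sit in consecutive memory, in row-major order: x001 x002 x003
- non-contiguous array - elements are visited out of memory order: x001 x140 x004 (a transpose is a typical case)
- some algorithms/models/cuda require your array to be contiguous
  - in those cases, call `.contiguous()` to fix it (it copies only if needed)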

In [14]:
x = torch.arange(10)
x

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [17]:
y = x.view(2,5)
y[0, 0] = 9999
y.is_contiguous()

True

In [16]:
#Check x, does it change?
x #yes: x and y share memory

tensor([9999,    1,    2,    3,    4,    5,    6,    7,    8,    9])

In [19]:
z = x.reshape(2,5) #may or may not be a copy (here it is a view); see the documentation
z.is_contiguous() #True here, because x is contiguous and z is just a view of it

True

In [20]:
z[0,1] = 8888
x

tensor([9999, 8888,    2,    3,    4,    5,    6,    7,    8,    9])

In [21]:
z_transpose = z.transpose(1,0) #(5,2)
z_transpose.shape
z_transpose.is_contiguous()

False
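
The difference between `view` and `reshape` shows up on a non-contiguous tensor such as the transpose. A minimal sketch, with illustrative variable names:

```python
import torch

x = torch.arange(10)
z = x.reshape(2, 5)                # x is contiguous, so this is a view
shares_memory = z.data_ptr() == x.data_ptr()

zt = z.transpose(1, 0)             # shape (5, 2), non-contiguous view

try:
    zt.view(10)                    # view cannot express this layout
    view_failed = False
except RuntimeError:
    view_failed = True

flat = zt.reshape(10)              # reshape falls back to a copy here
flat[0] = -1                       # writing to the copy ...
x_unchanged = x[0].item() == 0     # ... leaves x untouched

zc = zt.contiguous()               # explicit contiguous copy
flat_view = zc.view(10)            # now view works

print(shares_memory, view_failed, x_unchanged, zc.is_contiguous())
```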

## Gradients

### One step back-propagation

$$y = 2x^4 + x^3 + 3x^2 + 5x + 1$$

$$ dy /dx = y' = 8 x^3 + 3x^2 + 6x + 5$$


In [11]:
#Why PyTorch is so amazing:
#it calculates this gradient automatically (autograd) for any tensor with requires_grad=True

x = torch.tensor(2.,requires_grad=True)
print(x.grad) # No derivative calculated yet

None


In [12]:
x.shape

torch.Size([])

In [26]:
y = 2*x**4 + x**3 + 3*x**2 + 5*x + 1
print(y)

tensor(63., grad_fn=<AddBackward0>)


In [27]:
y.backward() #runs back-propagation and stores dy/dx in x.grad (returns nothing)

In [28]:
#now print x.grad; this is dy/dx at the point x = 2
x.grad

tensor(93.)

$$ dy /dx = y' = 8 x^3 + 3x^2 + 6x + 5$$

In [29]:
#Check how we get 93
y_der = 8*2**3 + 3*2**2 + 6*2 + 5
y_der 


93
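
Another way to sanity-check the autograd result, without deriving the formula by hand, is a central finite difference. This is a sketch only: the step size `h` is an arbitrary choice and `float64` is used to keep rounding error small.

```python
import torch

def f(x):
    return 2*x**4 + x**3 + 3*x**2 + 5*x + 1

# Gradient from autograd
x = torch.tensor(2., dtype=torch.float64, requires_grad=True)
f(x).backward()
autograd_value = x.grad.item()

# Numerical approximation: (f(x + h) - f(x - h)) / (2h)
h = 1e-5
x0 = torch.tensor(2., dtype=torch.float64)
numeric_value = ((f(x0 + h) - f(x0 - h)) / (2 * h)).item()

print(autograd_value, numeric_value)   # both should be close to 93
```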

### Multiple step back-propagation

$$y = 3x + 2$$
$$z = 2y^2  $$
$$o = z / 6 $$ 

- let's assume we have 6 elements, so `o` is the mean of `z` and each element contributes `z / 6`

$$\frac{\partial o}{\partial x} = \frac{\partial o}{\partial z} \cdot \frac{\partial z}{\partial y} \cdot \frac{\partial y}{\partial x}$$

$$\frac{\partial o}{\partial z} = 1/6$$

$$\frac{\partial z}{\partial y} = 4y = 4(3x + 2) $$

$$\frac{\partial y}{\partial x} = 3$$

$$\frac{\partial o}{\partial x} = \frac{1}{6} \cdot 4(3x + 2) \cdot 3 = 2(3x + 2)$$



In [33]:
x = torch.tensor([[1.,2.,3.],[3.,2.,1.]], requires_grad=True)


In [34]:
y = 3*x + 2
z = 2*y**2
o = z.mean()

In [35]:
o.backward()

In [36]:
x.grad
# How did we get 10, 16, 22?

tensor([[10., 16., 22.],
        [22., 16., 10.]])

In [38]:
x_grad = 2 * (3*1 + 2)
x_grad

10
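
The same check can be done for all six elements at once by evaluating the hand-derived formula 2(3x + 2) on the whole tensor. A small sketch; `by_hand` is an illustrative name.

```python
import torch

x = torch.tensor([[1., 2., 3.], [3., 2., 1.]], requires_grad=True)
y = 3*x + 2
z = 2*y**2
o = z.mean()
o.backward()

# The chain-rule result, applied element-wise (detach: no gradient tracking needed here)
by_hand = 2 * (3 * x.detach() + 2)

print(by_hand)
print(torch.allclose(x.grad, by_hand))
```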

## Exercise

### Use the same x

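$$y = 10x - 9999$$
$$z = 5 - y$$
$$o = 3z^2$$
$$oo = o / 6$$

$$\frac{\partial y}{\partial x} = 10$$

$$\frac{\partial z}{\partial y} = -1$$

$$\frac{\partial o}{\partial z} = 6z$$

$$\frac{\partial oo}{\partial o} = 1/6$$

$$\frac{\partial oo}{\partial x} = \frac{1}{6} \cdot 6z \cdot (-1) \cdot 10 = -10z = -10(5 - y) = -10(5 - 10x + 9999) = 100x - 100040$$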


### Task 1: calculate all the gradients
### Task 2: Code it and check whether it matches your answer
### Post in the chat when you are done

In [49]:
x = torch.tensor([[1., 2., 3.],[3.,2.,1.]],requires_grad=True)
y = 10*x-9999
z = 5 - y
o = 3*z**2

o = o.mean() # backward() needs a scalar, so reduce o to one number (the mean over 6 elements = o / 6)
o.backward()
print(x.grad)


tensor([[-99940., -99840., -99740.],
        [-99740., -99840., -99940.]])


In [50]:
100*1 - 100040  #100x - 100040 at x = 1

-99940
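
To check each factor of the chain rule and not just the final answer, the intermediate gradients can be kept with `retain_grad()`. By default PyTorch only stores `.grad` on leaf tensors such as `x`. A minimal sketch:

```python
import torch

x = torch.tensor([[1., 2., 3.], [3., 2., 1.]], requires_grad=True)
y = 10*x - 9999
z = 5 - y
o = 3*z**2
oo = o.mean()

# Intermediate (non-leaf) tensors drop their .grad unless asked to keep it
y.retain_grad()
z.retain_grad()

oo.backward()

print(z.grad)   # d(oo)/dz = (1/6) * 6z = z
print(y.grad)   # d(oo)/dy = -z
print(x.grad)   # d(oo)/dx = -10z = 100x - 100040
```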

## Exercise

In [51]:

x = torch.tensor([[1., 2., 3.],[3.,2.,1.]],requires_grad=True)
x.shape

torch.Size([2, 3])

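$$X = \begin{bmatrix} x_{11} & x_{12} & x_{13} \\ x_{21} & x_{22} & x_{23} \end{bmatrix}$$

$$W = \begin{bmatrix} W_1 \\ W_2 \\ W_3 \end{bmatrix}$$

$$XW = \begin{bmatrix} x_{11}W_1 + x_{12}W_2 + x_{13}W_3 \\ x_{21}W_1 + x_{22}W_2 + x_{23}W_3 \end{bmatrix}$$

$$oo = \frac{1}{2}\left[(XW)_1 + (XW)_2\right]$$

$$\frac{\partial oo}{\partial X} = \frac{1}{2}\begin{bmatrix} W_1 & W_2 & W_3 \\ W_1 & W_2 & W_3 \end{bmatrix}$$

Note: the `x.grad` printed below is exactly twice this. That is what you would see if `backward()` ran twice on the same `x`, because gradients accumulate in `x.grad`; a fresh run gives the halved values.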

In [59]:
w = torch.arange(3.).view(3,1)
w.shape # (3,1)
w

tensor([[0.],
        [1.],
        [2.]])

In [60]:
o = x @ w

In [61]:
oo = o.mean()

In [62]:
oo.backward()

In [64]:
oo

tensor(6., grad_fn=<MeanBackward0>)

In [63]:
print(x.grad)

tensor([[0., 1., 2.],
        [0., 1., 2.]])
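
One thing to watch for when re-running cells like these: `backward()` adds to `x.grad` and does not overwrite it. With the mean over the two rows, a single pass gives `w / 2` per row, so the values above look like two accumulated passes (an assumption about how the cells were run, not something the notebook states). A sketch of the behaviour:

```python
import torch

x = torch.tensor([[1., 2., 3.], [3., 2., 1.]], requires_grad=True)
w = torch.arange(3.).view(3, 1)

(x @ w).mean().backward()
first = x.grad.clone()        # gradient after one backward pass

(x @ w).mean().backward()     # second pass ADDS to x.grad
accumulated = x.grad.clone()

x.grad.zero_()                # reset before the next pass
(x @ w).mean().backward()
after_reset = x.grad.clone()

print(first)
print(accumulated)
print(after_reset)
```

This is why training loops normally zero the gradients before each backward pass.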


In [9]:
torch.cuda.is_available()

False

In [10]:
#torch.cuda.current_device()  #commented out: without an NVIDIA driver it raises the RuntimeError below

RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx
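
A common way to avoid this error is to pick the device based on `torch.cuda.is_available()` and fall back to the CPU. A minimal sketch; the tensor shape is arbitrary.

```python
import torch

# Use the GPU when one is usable, otherwise stay on the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

data = torch.rand(4, 5).to(device)   # move the tensor to the chosen device
print(device, data.device)
```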