# Introduction to Deep Learning
CMU Deep Learning course. My Notes.

## Table of Contents

1. [Lecture 1 Introduction to Deep Learning](#lec1)  
- 1.1  [Recitation 0a](#rec1a) [Recitation 0b](#rec1b)  

2. [Lecture 2 The Neural Net as a Universal Approximator](#lec2)  
- 2.1  [Recitation 2 Your First Deep Learning Code](#rec2)  
- 2.2  [Recitation 3 Pytorch Tutorial](#rec3a) [Recitation ](#rec13b)  
        

# Lecture 1 <a id=lec1> </a>
## Introduction to Deep Learning


Notes from lecture 1

__MLP__ -- multilayer perceptron.

A unit “fires” if its weighted input exceeds a threshold.  
Any real-valued “activation” function may operate on the weighted-sum input.

- Perceptrons are correlation filters  
-- They detect patterns in the input
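
The threshold behaviour described above can be sketched in a few lines. This is only an illustration: the helper `perceptron_fires`, the weight pattern, and the threshold value are assumptions, not taken from the lecture. The weighted sum acts as a correlation between the input and the weights, so the unit fires only when the input resembles the stored pattern.

```python
def perceptron_fires(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs exceeds the threshold, else 0."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum > threshold else 0

# Hypothetical pattern the unit should detect (used as its weights).
pattern = [1, -1, 1, -1]

# An input equal to the pattern correlates strongly with the weights (sum = 4).
matching = perceptron_fires([1, -1, 1, -1], pattern, threshold=3)
# The opposite input correlates negatively (sum = -4), so the unit stays silent.
opposite = perceptron_fires([-1, 1, -1, 1], pattern, threshold=3)

print(matching, opposite)
```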

Other things MLPs can do:  
- Loopy networks can “remember” patterns.  
- Represent probability distributions  
-- Over integer, real and complex-valued domains.  
-- MLPs can model both a posteriori and a priori distributions of data (a posteriori: conditioned on other variables).
     


More on neural networks as universal approximators  
-- And the issue of depth in networks

# Lecture 2 <a id=lec2> </a>
## The Neural Net as a Universal Approximator

__Activation__: The function that acts on the weighted combination of inputs (and threshold).

### Deep Structures  
- In any directed network of computational elements with input source nodes and output sink nodes, “depth” is the length of the longest path from a source to a sink.
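
As a rough illustration of this definition, the sketch below computes the depth of a small made-up network. The graph in `edges`, the node names, and the choice to count path length in edges are all assumptions for the example.

```python
from functools import lru_cache

# Hypothetical network: each node maps to the nodes it feeds into.
edges = {
    "x1": ["h1", "h2"],
    "x2": ["h1", "h2"],
    "h1": ["h3", "y"],
    "h2": ["h3"],
    "h3": ["y"],
    "y": [],
}

@lru_cache(maxsize=None)
def longest_path_from(node):
    """Number of edges on the longest path from `node` to any sink."""
    successors = edges[node]
    if not successors:
        return 0
    return 1 + max(longest_path_from(nxt) for nxt in successors)

# Sources are the nodes that no other node feeds into.
targets = {t for outs in edges.values() for t in outs}
sources = [n for n in edges if n not in targets]

depth = max(longest_path_from(s) for s in sources)
print(depth)
```

Here the longest source-to-sink path is `x1 -> h1 -> h3 -> y`, even though `h1` also connects to `y` directly.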

## Recitation 0a
<a id=rec1a> </a>

### Data Containers

Python offers a variety of containers, each dedicated to a different purpose and optimised for it:
* lists - generic container, numeric indexing
* tuples - immutable lists 
* dictionaries - key-value organisation 
* sets - collection of unique elements
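
A quick sketch of the four containers side by side; the variable names and values are made up for illustration.

```python
# list: ordered, mutable, indexed by position
scores = [0.91, 0.85, 0.78]
scores.append(0.66)

# tuple: ordered but immutable, so it can serve as a dictionary key
shape = (2, 3)

# dict: values looked up by key
sizes = {shape: "small", (224, 224): "image"}

# set: duplicates are discarded
labels = set(["cat", "dog", "cat"])

print(scores[0], sizes[shape], len(labels))
```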

### Lists
Pay attention to these techniques: they are what you will use for data pre-processing and manipulation in the batch-loading phase.

[Python Standard Library](https://docs.python.org/3/library/)

In [25]:
tuple_list = [
                (1,'Erebor',800.45),
                (2,'Rivendell',500.67),
                (3,'Shire',900.12),
                (4,'Mordor',1112.30)
            ]
tuple_list

[(1, 'Erebor', 800.45),
 (2, 'Rivendell', 500.67),
 (3, 'Shire', 900.12),
 (4, 'Mordor', 1112.3)]

#### Conditional operations - filtering:  
There are two ways to filter lists:  

- Index based - Slicing and Dicing  

- Condition based - List comprehension  

#### Slicing and Dicing
`sliced_list = my_list[start_idx : end_idx+1 : step]`  
The stop index is exclusive, hence `end_idx+1`.

In [47]:
my_list = [1,2,3,4,5,6,7]
my_list[1:6:2]
my_list[-1:0:-2]

[7, 5, 3]
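
The two slices above, plus a full reversal, annotated with the indices each one visits. This is a small sketch reusing `my_list` from the cell above; the other variable names are illustrative.

```python
my_list = [1, 2, 3, 4, 5, 6, 7]

forward = my_list[1:6:2]      # visits indices 1, 3, 5
backward = my_list[-1:0:-2]   # visits indices 6, 4, 2 and stops before index 0
reversed_copy = my_list[::-1] # omitted start/stop with step -1 reverses the list

print(forward, backward, reversed_copy)
```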

#### List comprehension

`*result*  = [*transform*    *iteration*         *filter*     ]` 
~~~~
res = [ manipulation(instance[2]) for instance in sorted_dataset ]
~~~~

In [65]:
6 % 5

1

In [67]:
[x for x in my_list if x%2 == 0]

[2, 4, 6]
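
The three parts of the template (transform, iteration, filter) can be combined in one comprehension. The sketch below reuses `tuple_list` from earlier; the cut-off of 800 is an arbitrary value chosen for illustration.

```python
tuple_list = [
    (1, 'Erebor', 800.45),
    (2, 'Rivendell', 500.67),
    (3, 'Shire', 900.12),
    (4, 'Mordor', 1112.30),
]

# transform: upper-case the name
# iteration: unpack each tuple
# filter:    keep rows whose third field exceeds 800
names = [name.upper() for _, name, value in tuple_list if value > 800]

print(names)
```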

In [72]:
%%timeit
res = [i for i in range(10000)]

250 µs ± 1.45 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


In [73]:
%%timeit
res = []
for i in range(10000):
    res.append(i)

663 µs ± 37.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


### Classes

Classes are especially useful for datasets that need to be iterable.

### Iterable and Iterators

In [75]:
class IterableADT:
    
    def __init__(self,train_data_src,train_data_src2, train_label_src):
        self.x = train_data_src
        self.x2 = train_data_src2
        self.y = train_label_src
        assert len(self.x) == len(self.x2)
        assert len(self.x2) == len(self.y)
    
    def __len__(self):
        return len(self.x)

    def __getitem__(self,key):
        return (self.x[key],self.x2[key],self.y[key])
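
A class that defines `__len__` and `__getitem__` can already be used in a `for` loop: when there is no `__iter__`, Python calls `__getitem__` with 0, 1, 2, ... until an `IndexError` is raised. The sketch below shows this with a hypothetical, cut-down `PairedDataset` modelled on `IterableADT`; the sample values are invented.

```python
class PairedDataset:
    """Hypothetical dataset pairing one feature source with one label source."""

    def __init__(self, features, labels):
        assert len(features) == len(labels)
        self.x = features
        self.y = labels

    def __len__(self):
        return len(self.x)

    def __getitem__(self, key):
        # Indexing past the end raises IndexError, which ends iteration.
        return (self.x[key], self.y[key])

dataset = PairedDataset([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]], [0, 1, 0])
samples = [sample for sample in dataset]

print(len(dataset), dataset[1], samples[-1])
```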
    

### Generators
Instead of writing a class for an iterator, you can use a generator.  
Generators relieve the developer of recording the state of the iteration.  
Put simply, a generator is a function that uses a `yield` statement instead of `return`.


In [77]:
def pairwise_generator(input_data):
    for i in range(0,len(input_data),2):
        yield (input_data[i],input_data[i+1])

data = [1,'one',2,'two',3,'three',4,'four',5,'five']        
generator = pairwise_generator(data)
for elt in generator:
    print(elt)

(1, 'one')
(2, 'two')
(3, 'three')
(4, 'four')
(5, 'five')
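
The same idea applies to batching, which is where generators tend to show up in data loading. This is a minimal sketch; `batch_generator` and its arguments are hypothetical names. The function keeps its own position between calls, so no explicit iterator state is needed.

```python
def batch_generator(samples, batch_size):
    """Yield consecutive slices of `samples`, each with at most `batch_size` items."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]

batches = batch_generator(list(range(7)), batch_size=3)

first = next(batches)      # runs the body up to the first yield
remaining = list(batches)  # resumes after that yield and runs to the end

print(first, remaining)
```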


### Debugging - Pdb

In [78]:
import pdb
def pairwise_generator(input_data):
    pdb.set_trace()
    for i in range(0,len(input_data),2):        
        yield (input_data[i],input_data[i+1])
        
data = [1,'one',2,'two',3,'three',4,'four']        
generator = pairwise_generator(data)
for elt in generator:
    print(elt)

> <ipython-input-78-327e211f3ec6>(4)pairwise_generator()
-> for i in range(0,len(input_data),2):
(Pdb) 
(Pdb) d
*** Newest frame
(Pdb) d
*** Newest frame
(Pdb) d
*** Newest frame
(Pdb) d
*** Newest frame
(Pdb) d
*** Newest frame
(Pdb) 
*** Newest frame
(Pdb) 
*** Newest frame
(Pdb) 
*** Newest frame
(Pdb) dd
*** NameError: name 'dd' is not defined
(Pdb) 1
1
(Pdb) 
1
(Pdb) 2
2
(Pdb) 
2
(Pdb) 3
3
(Pdb) 4
4
(Pdb) 5
5
(Pdb) 
5
(Pdb) 
5
(Pdb) q


BdbQuit: 

## Recitation 0b Numpy
<a id=rec1b> </a>
### Naive Python Implementation

In [210]:
W = [[1, 1, 1], 
    [2, 2, 2],
    [3, 3, 3]]

x = [7, 8, 5]
b = [1, 1, 1]
y = [0, 0, 0]

# Naive Computation of Wx + b
for i in range(len(W)):
    for j in range(len(x)):
        y[i] += W[i][j] * x[j]
#         print(y)

for i in range(len(y)):
     y[i] += b[i]
        

### NumPy-based Vectorized Implementation

In [201]:
import numpy as np

In [215]:
W = np.array([[1, 1, 1],
            [2, 2, 2],
            [3, 3, 3]])
x = np.array([7, 8, 5])
b = np.ones(3)

In [223]:
W.dot(x) + b

array([21., 41., 61.])
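
To confirm that the two implementations agree, the sketch below runs the loop and the vectorized expression on the same `W`, `x` and `b` and compares the results. The names `y_loop` and `y_vec` are illustrative.

```python
import numpy as np

W = np.array([[1, 1, 1],
              [2, 2, 2],
              [3, 3, 3]])
x = np.array([7, 8, 5])
b = np.ones(3)

# Loop version: one multiply-add per (row, column) pair, then the bias.
y_loop = [0.0, 0.0, 0.0]
for i in range(W.shape[0]):
    for j in range(W.shape[1]):
        y_loop[i] += W[i, j] * x[j]
    y_loop[i] += b[i]

# Vectorized version: `@` is the matrix-multiplication operator.
y_vec = W @ x + b

print(np.allclose(y_loop, y_vec))
```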

### Multi-Dimensional Arrays

In [228]:
T = np.random.random((2,3,4))
T

array([[[0.07456727, 0.5555167 , 0.38839831, 0.75335974],
        [0.19294572, 0.96287134, 0.19177305, 0.94792417],
        [0.90212215, 0.44429612, 0.69247528, 0.5499308 ]],

       [[0.49770814, 0.85863523, 0.82507032, 0.66291275],
        [0.8378004 , 0.94296372, 0.37113235, 0.22608651],
        [0.0098207 , 0.0503043 , 0.26927566, 0.1652351 ]]])

In [230]:
W = T[1]
W.shape, W.dtype

((3, 4), dtype('float64'))

In [252]:
X = np.full((4,3), 11785, dtype=np.int32)
X.shape, X.dtype

((4, 3), dtype('int32'))

In [253]:
W.dot(X)

array([[33520.38714219, 33520.38714219, 33520.38714219],
       [28024.52939453, 28024.52939453, 28024.52939453],
       [ 5829.28244319,  5829.28244319,  5829.28244319]])

In [261]:
X2 = np.zeros(X.shape)
# help(np.array_equal)
np.array_equal(W.dot(X2), np.zeros((3,3)))

True

### Basic Element-wise Operations

In [264]:
y = np.arange(10)
y, np.array(range(10))

(array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]))

In [265]:
y == np.array(range(10))

array([ True,  True,  True,  True,  True,  True,  True,  True,  True,
        True])

In [311]:
y+1
y * 10
np.sqrt(y ** 2), y.min(), y.mean(), y.sum()

(array([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.]), 0, 4.5, 45)

In [310]:
def get_var(list_):
    mean = list_.mean()
    x = 0
    for i in list_:
        var=(i-mean)**2
        x += var
    return(x/len(list_))

def get_std(list_):
    return(np.sqrt(get_var(list_)))
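
The loop above computes the population variance (dividing by N). As a sanity check, the sketch below compares it with a vectorized expression and with `np.var` and `np.std`, which also divide by N by default (`ddof=0`). The function is restated here so the block runs on its own.

```python
import numpy as np

def get_var(values):
    """Population variance: mean squared deviation from the mean."""
    mean = values.mean()
    return sum((v - mean) ** 2 for v in values) / len(values)

y = np.arange(10)

loop_var = get_var(y)
vector_var = ((y - y.mean()) ** 2).mean()  # same formula without a Python loop

print(np.isclose(loop_var, np.var(y)), np.isclose(np.sqrt(loop_var), np.std(y)))
```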

### Indexing NumPy Arrays

In [313]:
#save numpy
np.save('tensor.npy', np.arange(2 * 4 * 4).reshape(2, 4, 4))

In [316]:
T = np.load('tensor.npy')
T,  T.shape

(array([[[ 0,  1,  2,  3],
         [ 4,  5,  6,  7],
         [ 8,  9, 10, 11],
         [12, 13, 14, 15]],
 
        [[16, 17, 18, 19],
         [20, 21, 22, 23],
         [24, 25, 26, 27],
         [28, 29, 30, 31]]]), (2, 4, 4))

In [317]:
T[0]

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

In [318]:
T[1, 0, 0]

16

In [320]:
T[0,1:3,0:2]

array([[4, 5],
       [8, 9]])

In [322]:
(T % 10) == 0

array([[[ True, False, False, False],
        [False, False, False, False],
        [False, False,  True, False],
        [False, False, False, False]],

       [[False, False, False, False],
        [ True, False, False, False],
        [False, False, False, False],
        [False, False,  True, False]]])

In [325]:
T[(T % 10) == 0]

array([ 0, 10, 20, 30])

### Example: Masking and Clipping

In [331]:
W = np.random.randint(0, 10, (4, 4))

In [334]:
mask = np.ones_like(W)
mask[1] = 0 
mask

In [341]:
print(W)
print(mask)

[[5 2 7 8]
 [7 9 6 3]
 [8 1 7 6]
 [0 7 1 8]]
[[1 1 1 1]
 [0 0 0 0]
 [1 1 1 1]
 [1 1 1 1]]


In [342]:
W * mask

array([[5, 2, 7, 8],
       [0, 0, 0, 0],
       [8, 1, 7, 6],
       [0, 7, 1, 8]])

In [344]:
W > 5
W2 = W.copy()  # copy, so that clipping W2 does not also modify W
W2[W2 > 5] = 5
W2

array([[5, 2, 5, 5],
       [5, 5, 5, 3],
       [5, 1, 5, 5],
       [0, 5, 1, 5]])

In [346]:
np.minimum(W, 5)

array([[5, 2, 5, 5],
       [5, 5, 5, 3],
       [5, 1, 5, 5],
       [0, 5, 1, 5]])
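
One thing to watch when clipping in place: plain assignment does not copy a NumPy array, so writing through the new name changes the original too. The sketch below contrasts an alias with a copy, and shows two ways to clip without touching the input. The small `W` here is a made-up example.

```python
import numpy as np

W = np.array([[5, 2, 7, 8],
              [7, 9, 6, 3]])

alias = W               # another name for the same data
independent = W.copy()  # separate data

independent[independent > 5] = 5     # in-place clip on the copy; W is unchanged
clipped = np.minimum(W, 5)           # returns a new array
also_clipped = np.clip(W, None, 5)   # upper bound only

print(np.shares_memory(alias, W), np.shares_memory(independent, W))
print(W.max(), clipped.max())
```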

### Computation Along Axes

In [351]:
A = np.random.random((3,3))
A

array([[0.54896062, 0.76335126, 0.54982982],
       [0.5478278 , 0.16361021, 0.67706879],
       [0.90405144, 0.77673016, 0.00485693]])

In [353]:
A.sum()

4.936287015979624

In [354]:
A.sum(axis=0)

array([2.00083986, 1.70369162, 1.23175554])

In [355]:
A.sum(axis=1)

array([1.86214169, 1.3885068 , 1.68563853])

In [357]:
A.max(axis=0), A.argmax(axis=0)

(array([0.90405144, 0.77673016, 0.67706879]), array([2, 2, 1]))

### Combining Arrays: Stacking & Concatenation

In [366]:
a = np.full(5,1)
b = np.full(5,2)
c = np.full(5,3)
a, b, c

(array([1, 1, 1, 1, 1]), array([2, 2, 2, 2, 2]), array([3, 3, 3, 3, 3]))

Stacking creates a new axis. It joins the supplied arrays, which must have the same shape, along that axis.
Indexing a single element from that axis returns the appropriate input array.

In [382]:
B = np.stack([a, b, c], axis=1)  # axis=0 by default
B, B.shape

(array([[1, 2, 3],
        [1, 2, 3],
        [1, 2, 3],
        [1, 2, 3],
        [1, 2, 3]]), (5, 3))

In [390]:
B[:,1]

array([2, 2, 2, 2, 2])
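
A short sketch of how the `axis` argument decides where the new axis goes, reusing `a`, `b` and `c`; the names `rows` and `cols` are illustrative.

```python
import numpy as np

a = np.full(5, 1)
b = np.full(5, 2)
c = np.full(5, 3)

rows = np.stack([a, b, c])          # new axis 0: each input becomes a row
cols = np.stack([a, b, c], axis=1)  # new axis 1: each input becomes a column

print(rows.shape, cols.shape)
print(np.array_equal(rows[1], b), np.array_equal(cols[:, 1], b))
```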

Concatenating arrays joins them along an *existing* axis.

In [391]:
C = np.concatenate([B+10, B+200], axis=1)
C

array([[ 11,  12,  13, 201, 202, 203],
       [ 11,  12,  13, 201, 202, 203],
       [ 11,  12,  13, 201, 202, 203],
       [ 11,  12,  13, 201, 202, 203],
       [ 11,  12,  13, 201, 202, 203]])
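
Unlike stacking, concatenation keeps the number of dimensions and only lengthens the chosen axis. A small sketch, with `B` rebuilt so the block is self-contained; `wide` and `tall` are illustrative names.

```python
import numpy as np

B = np.stack([np.full(5, 1), np.full(5, 2), np.full(5, 3)], axis=1)  # shape (5, 3)

wide = np.concatenate([B + 10, B + 200], axis=1)  # columns placed side by side
tall = np.concatenate([B + 10, B + 200], axis=0)  # rows appended underneath

print(wide.shape, tall.shape, wide.ndim == B.ndim)
```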

### Transposing and Reshaping

In [394]:
B, B.shape

(array([[1, 2, 3],
        [1, 2, 3],
        [1, 2, 3],
        [1, 2, 3],
        [1, 2, 3]]), (5, 3))

In [395]:
BT = B.T
BT, BT.shape

(array([[1, 1, 1, 1, 1],
        [2, 2, 2, 2, 2],
        [3, 3, 3, 3, 3]]), (3, 5))

In [398]:
BR = B.reshape(3,5)
BR, BR.shape

(array([[1, 2, 3, 1, 2],
        [3, 1, 2, 3, 1],
        [2, 3, 1, 2, 3]]), (3, 5))

### Broadcasting

In [402]:
A = np.random.random((8, 4))
B = np.random.random((4,3))
A.dot(B)

array([[1.00944387, 0.99813086, 0.63782844],
       [1.46569025, 1.25946442, 0.87424471],
       [0.59464682, 0.80202032, 0.20740968],
       [0.65561833, 0.58362628, 0.16929863],
       [1.0172952 , 1.06297125, 0.44452967],
       [1.0054962 , 1.32263305, 0.81070651],
       [0.8345174 , 0.95285564, 0.5072036 ],
       [0.677714  , 0.76140857, 0.26641675]])

In [403]:
b = np.array([10, 20, 30])

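Arrays can be broadcast together when their trailing dimensions match, or when one of each mismatched pair is 1. Here `A.dot(B)` has shape (8, 3) and `b` has shape (3,), so `b` is added to every row.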

In [410]:
A.dot(B) + b

array([[11.00944387, 20.99813086, 30.63782844],
       [11.46569025, 21.25946442, 30.87424471],
       [10.59464682, 20.80202032, 30.20740968],
       [10.65561833, 20.58362628, 30.16929863],
       [11.0172952 , 21.06297125, 30.44452967],
       [11.0054962 , 21.32263305, 30.81070651],
       [10.8345174 , 20.95285564, 30.5072036 ],
       [10.677714  , 20.76140857, 30.26641675]])

Dimensions that differ and are not 1 cause a `ValueError`.

In [412]:
try:
    x = np.random.random((3, 3, 4)) + np.random.random((3,2,4))
except ValueError as e:
    print(e)

operands could not be broadcast together with shapes (3,3,4) (3,2,4) 
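
The rule can be written out explicitly: compare the two shapes from the last dimension backwards, and each pair must be equal or contain a 1. The helper `can_broadcast` below is a hypothetical illustration of that rule, not a NumPy function.

```python
import numpy as np

def can_broadcast(shape_a, shape_b):
    """Hypothetical helper: apply the broadcasting rule to two shapes, right to left."""
    for dim_a, dim_b in zip(reversed(shape_a), reversed(shape_b)):
        if dim_a != dim_b and dim_a != 1 and dim_b != 1:
            return False
    # Missing leading dimensions are treated as size 1, so they always match.
    return True

print(can_broadcast((8, 3), (3,)))          # a bias added to every row
print(can_broadcast((3, 3, 4), (3, 1, 4)))  # the size-1 axis is stretched
print(can_broadcast((3, 3, 4), (3, 2, 4)))  # 3 against 2 cannot be matched

result = np.zeros((3, 3, 4)) + np.ones((3, 1, 4))
print(result.shape)
```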


### Saving and Loading

#### Saving and loading a single NumPy array

In [None]:
# Save single array
x = np.random.random((5,))
print(x)

np.save('tmp.npy', x)

In [None]:
# Load the array
y = np.load('tmp.npy')

print(y)

#### Saving and loading a dictionary of NumPy arrays

In [None]:
# Save dictionary of arrays
x1 = np.random.random((2,))
y1 = np.random.random((3,))
print(x1, y1)

np.savez('tmp.npz', x=x1, y=y1)

In [None]:
# Load the dictionary of arrays
data = np.load('tmp.npz')

print(data['x'])
print(data['y'])
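
A round-trip sketch of the `.npz` workflow that cleans up after itself. The temporary directory and the array contents are assumptions for the example; the keyword names passed to `np.savez` become the keys of the loaded archive.

```python
import os
import tempfile

import numpy as np

x1 = np.arange(3)
y1 = np.linspace(0.0, 1.0, 4)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'tmp.npz')
    np.savez(path, x=x1, y=y1)

    # np.load on an .npz file returns a dict-like NpzFile; arrays are read on access.
    with np.load(path) as data:
        keys = sorted(data.files)
        x_loaded = data['x']
        y_loaded = data['y']

print(keys, np.array_equal(x_loaded, x1), np.array_equal(y_loaded, y1))
```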

### References

You can find a more complete introduction at https://docs.scipy.org/doc/numpy/user/quickstart.html

The following links were helpful in preparing this notebook:

- https://docs.scipy.org/doc/numpy/reference/index.html
- https://github.com/cmudeeplearning11785/deep-learning-tutorials/blob/master/recitation-2/Tutorial-numpy.ipynb
- http://cs231n.github.io/python-numpy-tutorial/#numpy

## Recitation 2 Your First Deep Learning Code
<a class='anchor' id='rec2'> </a>

In [4]:
import numpy as np
import torch
import torch.nn as nn

In [16]:
# create an uninitialized 2x3 float tensor
x = torch.FloatTensor(2,3)
# from numpy 
np_array = np.random.random((2,3)).astype(float)
x1 = torch.FloatTensor(np_array)
x2 = torch.randn(2,3)

In [20]:
# basic operation
x = torch.arange(4, dtype=torch.float).view(2,2)
s = torch.sum(x)
e = torch.exp(x)

In [30]:
# elementwise and matrix multiplication
z = s*e + torch.matmul(x1, x2.t())

In [None]:
# Move tensors to the GPU (requires a CUDA-capable device)
x = torch.rand(3,2)
# copy to GPU 
y = x.cuda()
# copy back to CPU
t = y.cpu()

__Backpropagation__  
You have seen gradient descent, and you know that to train a network you need to compute gradients, i.e. derivatives, of some loss (divergence) with respect to every parameter (weights, biases).
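
To make "a gradient for every parameter" concrete, here is a by-hand sketch in NumPy for a single linear layer with 4 inputs and 2 outputs. The squared-error loss, the target values and the random initialisation are assumptions chosen for illustration. In PyTorch the same quantities would come from autograd (calling `backward()` on the loss fills each parameter's `.grad`), not from hand-written formulas like these.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(2, 4))      # weights of a 4-input, 2-output linear layer
b = rng.normal(size=2)           # biases
x = np.arange(4, dtype=float)    # input
target = np.array([1.0, -1.0])   # hypothetical target

def loss_fn(W, b):
    """Squared-error loss of the linear layer y = Wx + b."""
    y = W @ x + b
    return 0.5 * np.sum((y - target) ** 2)

# Analytic gradients from the chain rule.
error = W @ x + b - target   # dL/dy
grad_W = np.outer(error, x)  # dL/dW[i, j] = error[i] * x[j]
grad_b = error               # dL/db[i] = error[i]

# Finite-difference check on a single weight.
eps = 1e-6
W_shift = W.copy()
W_shift[0, 1] += eps
numeric = (loss_fn(W_shift, b) - loss_fn(W, b)) / eps

print(np.isclose(numeric, grad_W[0, 1], atol=1e-4))
```

The gradients have the same shapes as the parameters they belong to, which is what allows an update such as `W -= lr * grad_W`.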

In [32]:
net = torch.nn.Linear(4, 2)

In [46]:
x = torch.arange(0,4).float()
y = net.forward(x)
y = net(x) 
print(y)

tensor([ 1.2247, -0.9030], grad_fn=<ThAddBackward>)


In [48]:
for param in net.parameters():
    print(param)

Parameter containing:
tensor([[-0.0242,  0.2514,  0.2608,  0.2381],
        [-0.4306, -0.1150, -0.2471, -0.0907]], requires_grad=True)
Parameter containing:
tensor([-0.2627, -0.0216], requires_grad=True)


## Recitation 3 Pytorch Tutorial
<a id=rec3a> </a>

PyTorch is a Python framework for machine learning. It provides:

- GPU-accelerated computations
- automatic differentiation
- modules for neural networks

This tutorial will teach you the fundamentals of operating on PyTorch tensors and networks. You have already seen some of this in recitation 0, which we will quickly review, but most of the tutorial covers new or more advanced material.

For a worked example of how to build and train a PyTorch network, see `pytorch-example.ipynb`.

For additional tutorials, see http://pytorch.org/tutorials/

In [3]:
import torch
import numpy as np
import torch.nn as nn