In [53]:
from traitlets.config.manager import BaseJSONConfigManager
# To make this work, replace path with your own:
# On the command line, type jupyter --paths to see where your nbconfig is stored
# Should be in the environment in which you install reveal.js
path = "/Users/Blake/.virtualenvs/cme193/bin/../etc/jupyter"
cm = BaseJSONConfigManager(config_dir=path)
cm.update('livereveal', {
              'theme': 'simple',
              'transition': 'zoom',
              'start_slideshow_at': 'selected',
    })

{'start_slideshow_at': 'selected', 'theme': 'simple', 'transition': 'zoom'}

In [54]:
%%HTML 
<link rel="stylesheet" type="text/css" href="custom.css">

# CME 193 
## Introduction to Scientific Python
## Spring 2018

<br>

## Lecture 8
-------------
## Recursion, Exceptions, Unit Tests, Neural Networks

---------

# Lecture 8 Contents

* Admin
* Recursion
* Exceptions
* Unit Testing
* Deep Learning
* Quick intro to neural nets
* Deep Learning Packages
* Tensorflow Basics
* Keras Basics
* More packages




## Administration

* Thank you for your project proposals! Really cool and interesting ideas.
* Complete either HW2 or Project by **5/15**.
* Exercises also due **5/15**.

### Project tips and general feedback

- If your project involves a dataset, make sure you tackle this step early
- HW2 is the benchmark for required deliverables.
- If you need to pivot along the way, that is fine; if the change is substantial, let us know
- Have fun and research best practices along the way. You are *not* being graded on how well your model works.


## Office Hours

I will continue to hold office hours over the next two weeks:

- 2:00-3:15 Mon/Wed in Huang or class time, you decide!

---------

# Recursion

Recursive functions solve problems by reducing them to smaller problems of the same form.

Because the smaller problem has the same form, a recursive function can solve it by calling itself.
 - New paradigm 
 - Powerful tool 
 - Divide-and-conquer 
 - Beautiful solutions

## First example

Let’s consider a trivial problem:

Suppose we want to add two positive numbers ```a``` and ```b```, but we can only add/subtract 1 to any number.

How would you write a function to do this without recursion? 

What control statement(s) would you use?

In [1]:
# Non-recursive solution
def add(a, b): 
    while b > 0:
        a += 1
        b -= 1 
    return a

add(7, 8)




15

## Recursive solution

- Simple case: 
 - If ```add(a,b)``` is called with ```b = 0``` just return ```a```
- Otherwise, we can return ```1 + add(a, b-1)```

In [2]:
# Recursive solution
# Adding b to a, (if only able to use +1)
def add(a, b):
    if b == 0:
        # base case
        return a
    # recursive step
    return add(a, b-1) + 1

### Base case and recursive steps

Recursive functions consist of two parts:

**Base case**: The base case is the trivial case that can be dealt with easily.

**Recursive step**: The recursive step breaks the problem into a smaller subproblem, which brings us closer to the base case, and calls the function itself again.
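
As a further illustration of the two parts, here is a minimal sketch of a recursive factorial. The name `factorial` is an illustrative choice and is not defined elsewhere in this lecture.

```python
def factorial(n):
    """Return n! for a non-negative integer n."""
    if n == 0:
        # base case: 0! is defined as 1
        return 1
    # recursive step: n! = n * (n-1)!, and n-1 is closer to the base case
    return n * factorial(n - 1)

print(factorial(5))  # 120
```

Note that a negative `n` never reaches the base case, which is exactly the first pitfall to watch for with recursion.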

## Reversing a list

How can we recursively reverse a list? ```([1, 2, 3] → [3, 2, 1])```

 - If list is empty or has one element, the reverse is itself 
 - Otherwise, reverse elements 2 to n, and append the first

In [3]:
def reverse_list(xs):
    if len(xs) <= 1:
        return xs
    else:
        # shift first element to last
        return reverse_list(xs[1:]) + [xs[0]]
    
reverse_list([1,2,3])

[3, 2, 1]

## Palindromes

- A palindrome is a word that reads the same in both directions, such as radar or level.

- Let’s write a function that checks whether a given word is a palindrome.

## The recursive idea

Given a word, such as level, we check:
 - whether the first and last characters are the same
 - whether the string with the first and last characters removed is itself a palindrome

## Base case

What’s the base case in this case? 

- The empty string is a palindrome 
- Any 1 letter string is a palindrome

In [4]:
def is_palin(s):
    '''returns True iff s is a palindrome'''
    if len(s) <= 1:
        return True
    return s[0] == s[-1] and is_palin(s[1:-1])

print(is_palin('cme193'))
print(is_palin('racecar'))

False
True


## Another example

Write a recursive function that computes $a^b$ for given a and b, where b is an integer. (Do not use ```**```.)

### Another example

Base case: $b=0$ , $a^b =1$

Recursive step: (be careful) there are actually two options, one for if b < 0 and one for if b > 0.

In [5]:
def power(a,b):
    if b == 0:
        return 1
    elif b > 0:
        return a*power(a,b-1)
    else:
        return (1./a)*power(a,b+1)
    
power(2,10)
power(2, -10) == 1.0/1024

True

### Example: Fibonacci

```python
fib(0) = 0

fib(1) = 1

fib(n) = fib(n-1) + fib(n-2)  for n >= 2
```

In [6]:
def fib(n):
    if n <= 1:
        return n
    f = fib(n-1) + fib(n-2)
    return f
fib(34)

5702887

## Pitfalls

Recursion can be very powerful, but there are some pitfalls: 
- Have to ensure you always reach the base case.

- Each successive call of the algorithm must be solving a simpler problem

- The number of function calls shouldn’t explode. (see exercises)

- An iterative algorithm is usually faster in Python because every function call has overhead, and very deep recursion can hit the interpreter's recursion limit. (However, the iterative solution might be much more complex.)
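
To see how quickly the number of calls can explode, the sketch below counts the calls made by the plain recursive Fibonacci and compares it with a memoized version. The names `fib_naive`, `fib_memo` and the global counter `calls` are illustrative assumptions; `functools.lru_cache` is one standard-library way to cache results, not the only remedy.

```python
from functools import lru_cache

calls = 0  # counts how often fib_naive is entered

def fib_naive(n):
    """Plain recursion: recomputes the same subproblems many times."""
    global calls
    calls += 1
    if n <= 1:
        return n
    return fib_naive(n - 1) + fib_naive(n - 2)

@lru_cache(maxsize=None)
def fib_memo(n):
    """Same recursion, but each fib_memo(k) is computed only once."""
    if n <= 1:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)

print(fib_naive(20), "computed with", calls, "calls")
print(fib_memo(34))
```

Computing `fib_naive(20)` already takes 21891 calls, while the memoized version evaluates each argument only once.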

---------

# Exceptions

## Exceptions
### Example

Consider a function that takes a filename and a number ```k```, and returns the ```k``` most common words in the file. (This is similar to one of the exercises you could have done.)

Suppose we have written a function:

```python
topkwords(filename, k)
```


Instead of entering ```filename``` and value of ```k``` in the script, we may also want to run it from the terminal.

### Parse input from command line

The sys module allows us to read the terminal command that started the script:

```python
import sys

print(sys.argv)
```


## ```sys.argv```

```sys.argv``` holds a list with command line arguments passed to a Python script.

Note that ```sys.argv[0]``` will be the name of the python script itself.

```python
import sys

def topkwords(filename, k):
    # Returns k most common words in filename
    pass

if __name__ == "__main__":
    filename = sys.argv[1]
    k = int(sys.argv[2])
    print(topkwords(filename, k))
```

## Issues

- What if the file does not exist?
- What if the second argument is not an integer? 
- What if no command line arguments are supplied?
- All result in errors: 
 - ```IOError```
 - ```ValueError```
 - ```IndexError```

## Exception handling

What do we want to happen when these errors occur? Should the program simply crash?

No, we want it to gracefully handle these
- ```IOError```: Tell the user the file does not exist.
- ```ValueError```, ```IndexError```: Tell the user what the format of the command line arguments should be.

## Try ... Except

- The try clause is executed
- If no exception occurs, the except clause is skipped
- If an exception occurs, the rest of the try clause is skipped. Then if the exception type is matched, the except clause is executed. Then the code continues after the try statement
- If an exception occurs with no match in the except clause, execution is stopped and we get the standard error

```python
import sys
if __name__ == "__main__": 
    try:
        filename = sys.argv[1]
        k = int(sys.argv[2])
        print(topkwords(filename, k))
    except IOError:
        print("File does not exist")
    except (ValueError, IndexError):
        print("Error in command line input")
        print("Run as: python wc.py <filename> <k>")
        print("where <k> is an integer")
```
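
For a version that can actually be run, here is a minimal sketch. The body of `topkwords` is a hypothetical implementation (the lecture leaves it as `pass`): it assumes words are separated by whitespace and compared case-insensitively. The helper `main(argv)` is also an assumption, added so the error handling can be exercised without a terminal.

```python
import sys
from collections import Counter

def topkwords(filename, k):
    """Return the k most common words in filename as (word, count) pairs.

    Assumption: words are whitespace-separated and compared case-insensitively.
    """
    with open(filename) as f:
        words = f.read().lower().split()
    return Counter(words).most_common(k)

def main(argv):
    """Run topkwords on argv-style input and return something printable."""
    try:
        filename = argv[1]
        k = int(argv[2])
        return topkwords(filename, k)
    except IOError:
        return "File does not exist"
    except (ValueError, IndexError):
        return "Run as: python wc.py <filename> <k>, where <k> is an integer"

if __name__ == "__main__":
    print(main(sys.argv))
```

Calling `main(["wc.py"])` triggers the `IndexError` branch, and `main(["wc.py", "words.txt", "ten"])` triggers the `ValueError` branch before the file is ever opened.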

## A naked except

### A naked except
We can have a naked except that catches any error:
```python
try:
    t = 3.0 / 0.0
except:
    # handles any error
    print('There was some error')
```

Use this with extreme caution though: it can hide genuine bugs, and it also catches ```KeyboardInterrupt``` and ```SystemExit```, so even Ctrl-C is swallowed!
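
A somewhat safer pattern, sketched below, is to catch ```Exception``` and bind it to a name so the error is at least reported. The function name `safe_divide` is a hypothetical example.

```python
def safe_divide(x, y):
    """Divide x by y, reporting (rather than hiding) any ordinary error."""
    try:
        return x / y
    except Exception as e:
        # Exception does not include KeyboardInterrupt or SystemExit,
        # and naming the error keeps its type and message visible.
        print("There was an error: {}: {}".format(type(e).__name__, e))
        return None

safe_divide(3.0, 0.0)   # reports a ZeroDivisionError
safe_divide(3.0, "a")   # reports a TypeError
```

Catching the specific exception types you expect is still preferable whenever you know them.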

## Try - Except - Else

- The else clause is executed only if the ``` try ``` block raises no exception.

Why? 
 - It avoids accidentally catching an exception raised by code the ``` try ``` was not meant to protect, e.g. ```f.readlines()``` raising an ```IOError```
 - It improves code readability

```python
# from Python docs
import sys

for arg in sys.argv[1:]:
    try:
        f = open(arg, 'r')
    except IOError:
        print('cannot open', arg)
    else:
        print(arg, 'has', len(f.readlines()), 'lines')
        f.close()
```

## Raise

We can use ```raise``` to raise an exception ourselves.

```
>>> raise NameError('Oops')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: Oops
```

## ```finally```

The ```finally``` clause is always executed before leaving the ```try``` statement, whether or not an exception has occurred.

Useful when we have to clean up, e.g. close files or network connections.

In [7]:
def div(x, y):
    res = None
    try:
        res =  x/y
    except ZeroDivisionError:
        print('Division by zero!') 
    else:
        print("we are error free")
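    finally:
        print("Finally clause")
    # return outside finally: a return inside it would silently swallow
    # any exception that the except clause did not handle
    return res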
    
print(div(3,2))
print('-'*50)
print(div(3,0))

we are error free
Finally clause
1.5
--------------------------------------------------
Division by zero!
Finally clause
None


## Raising our own exceptions

Recall the Rational class we considered a few lectures ago.

What if the denominator passed in to the constructor is zero?

## Raising our own exceptions

```python
from math import gcd

class Rational:
    def __init__(self, p, q=1):
        g = gcd(p, q)
        # integer division keeps p and q as ints in Python 3
        self.p = p // g
        self.q = q // g
```

What if ```q == 0```?

## Making the necessary change

```python
from math import gcd

class Rational:
    def __init__(self, p, q=1):
        if q == 0:
            raise ZeroDivisionError('denominator is zero')
        g = gcd(p, q)
        # integer division keeps p and q as ints in Python 3
        self.p = p // g
        self.q = q // g
```
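
The caller can then handle the exception like any built-in one. The sketch below repeats the class so that it runs on its own; using `math.gcd` and integer division `//` are assumptions about how the earlier Rational class was set up, and `make_rational` is a hypothetical helper.

```python
from math import gcd

class Rational:
    def __init__(self, p, q=1):
        if q == 0:
            raise ZeroDivisionError('denominator is zero')
        g = gcd(p, q)
        self.p = p // g   # integer division keeps p and q as ints
        self.q = q // g

def make_rational(p, q):
    """Return a Rational, or None if the denominator is invalid."""
    try:
        return Rational(p, q)
    except ZeroDivisionError as e:
        print("Could not create Rational:", e)
        return None

r = make_rational(6, 4)
print(r.p, r.q)             # 3 2
print(make_rational(1, 0))  # prints the error message, then None
```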

---------

# Unit tests


![unit_test](../img/unit_test.JPG)

## Unit tests: 

Test individual pieces of code.

For example, for a factorial function, test
```0! = 1``` or ```3! = 6``` etc.

![title](../img/test_comp.png)

## Test driven development

Some write tests before code. Reasons:
- Focus on the requirements
- Don’t write too much
- Safely restructure/optimize code
- When collaborating: don’t break others’ code 
- Faster

## Test cases

How to construct test cases?

A test case should answer a single question about the code.

A test case should:
- Run by itself, no human input required
- Determine on its own whether the test has passed or failed 
- Be separate from other tests

## What to test?
- Known values
- Sanity check (for conversion functions for example)
- Bad input
 - Input is too large?
 - Negative input?
 - String input when an integer is expected?
- etc: very dependent on problem

## ```unittest```

A test case is created by subclassing ```unittest.TestCase```.

Individual tests are defined with methods whose names start with the letters ```test```. (This allows the test runner to identify the tests.)

Each test usually calls an assert method to run the test; there are many assert options.

There are a few different ways to run tests (see documentation). The easiest way is to run ```unittest.main()```, for example if the test script is the main program.

# ```assert```

We can use a number of methods to check for failures:
- assertEqual
- assertNotEqual
- assertTrue, assertFalse
- assertIn
- assertRaises 
- assertAlmostEqual 
- assertGreater, assertLessEqual
- etc. (see Docs)

```python
import unittest
from my_script import is_palindrome

class KnownInput(unittest.TestCase):
    knownValues = (('lego', False),
                   ('radar', True))

    def testKnownValues(self):
        for word, palin in self.knownValues:
            result = is_palindrome(word)
            self.assertEqual(result, palin)
```

### ```unittest```

Note, to use the ```unittest``` package inside a Jupyter notebook instead of ```unittest.main()```, use:


``` unittest.main(argv=['ignored', '-v'], exit=False)```
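
As a self-contained sketch, the cell below tests the `is_palin` function from the recursion section (repeated here so the cell runs on its own). It builds and runs the suite explicitly with `TestLoader` and `TextTestRunner`, which is an alternative to `unittest.main(...)` that works both in a script and in a notebook. The class and method names are illustrative choices.

```python
import unittest

def is_palin(s):
    """Return True iff s is a palindrome."""
    if len(s) <= 1:
        return True
    return s[0] == s[-1] and is_palin(s[1:-1])

class TestIsPalin(unittest.TestCase):
    known_values = (('lego', False), ('radar', True), ('', True))

    def test_known_values(self):
        for word, expected in self.known_values:
            self.assertEqual(is_palin(word), expected)

    def test_bad_input(self):
        # an integer has no len(), so is_palin should raise TypeError
        with self.assertRaises(TypeError):
            is_palin(12321)

# Build and run the suite explicitly instead of calling unittest.main()
suite = unittest.TestLoader().loadTestsFromTestCase(TestIsPalin)
result = unittest.TextTestRunner(verbosity=2).run(suite)
print(result.wasSuccessful())
```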

# Alternatives

- ```nose2```
- ```pytest```

http://nose2.readthedocs.io/en/latest/differences.html

## Pytest

```pip install pytest```

- Easy testing
- Automatically discovers tests
- No need to remember all the assert methods; the plain ```assert``` keyword works for everything
- Informative failure results

## Pytest

Test discovery: (basics)

- Collects files named ```test_*.py``` or ```*_test.py```
- Runs functions whose names start with ```test_```

## Example : primes

Create two files in a directory:

- ```primes.py```: Implementation
- ```test_primes.py```: Tests


```python
# primes.py
# (simplest solution that passes tests)

def is_prime(x):
    for i in range(2, x):
        if x % i == 0: 
            return False
    return True
```

```python

# test_primes.py
from primes import is_prime 
def test_is_three_prime():
    assert is_prime(3)
def test_is_four_prime(): 
    assert not is_prime(4)
```

### Using ```pytest``` to execute test suite

#### By default, it will run all files prefixed with ```test_```.

Here we pass in the name of our test script:

```pytest test_primes.py```

```python
from primes import is_prime 

def test_is_zero_prime():
    assert not is_prime(0) 
def test_is_one_prime():
    assert not is_prime(1) 
def test_is_two_prime():
    assert is_prime(2)
def test_is_three_prime(): 
    assert is_prime(3)
def test_is_four_prime(): 
    assert not is_prime(4)
```

## Some more tests

- Negative numbers 
- Non integers
- Large prime
- List of known primes 
- List of non-primes
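
Here is one possible sketch of an implementation that passes the tests above together with a few of the extra cases just listed. Treating negative numbers and non-integers as "not prime" is an assumption (raising an exception would be an equally valid design), and the extra test names are illustrative.

```python
def is_prime(x):
    """Return True iff x is a prime number.

    Assumption: non-integers and numbers below 2 are treated as not prime.
    """
    if not isinstance(x, int) or x < 2:
        return False
    i = 2
    while i * i <= x:       # only need divisors up to sqrt(x)
        if x % i == 0:
            return False
        i += 1
    return True

def test_small_values():
    assert not is_prime(0)
    assert not is_prime(1)
    assert is_prime(2)
    assert is_prime(3)
    assert not is_prime(4)

def test_negative_and_non_integer():
    assert not is_prime(-7)
    assert not is_prime(2.5)

def test_known_primes():
    known = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
    assert [n for n in range(30) if is_prime(n)] == known

def test_large_prime():
    assert is_prime(104729)   # the 10,000th prime
```

In a real project the tests would live in ```test_primes.py``` and import ```is_prime``` from ```primes.py```; they are kept in one block here only so the sketch is self-contained.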

### When all tests pass...

- First make sure all tests pass
- Then optimize code, making sure nothing breaks

Now you can be confident that whatever algorithm you use, it still works as desired!

## Writing good tests
- Utilize automation and code reuse
- Know the type and scope - your module or somebody else’s?
- A single test should focus on a single thing
- Functional tests must be deterministic
- Leave no trace - safe setup and clean up

## Let's take a look at some examples...



# _DEEP LEARNING_
 - Who can tell me a difference between classical machine learning and deep learning?

### What is Machine Learning?
- Deep learning is a subfield of machine learning
- Most machine learning methods work well because of human-designed representations and input features
- Classical machine learning is largely optimization, i.e. finding the set of weights that gives the best predictions under a given loss function
- So what's deep learning? 
- Well, what does Wikipedia say?
    > "Deep learning (also known as deep structured learning or hierarchical learning) is part of a broader family of machine learning methods based on learning data representations, as opposed to task-specific algorithms" 
 

## So what does that mean?
- Let's give an example 
- Say I wanted to determine whether an image is of a cat vs. a dog. 
- In classical machine learning, we would have to define features for the input data:
    - the weight of the animal
    - does it have whiskers 
    - does it have ears and are they pointed
    - does it have ears and are they not pointed  etc.
- In short, we have to define a set of facial features and let the optimization identify which features are more important.


## What do we do in deep learning?
![image](../Data/11-fig/catdog.png)


* Neural networks automatically learn which features are important for classification by applying a series of nonlinear processing units
* When people say "Deep Learning" they mean using a deep neural network

* Main differences:
    1. **Feature engineering**: Classical machine learning needs a custom, hand-designed feature space. Deep learning does not.
    2. **Data size**: When the dataset is small, deep learning algorithms don’t perform that well; neural networks typically need a large amount of data to learn patterns. With traditional machine learning, features are handcrafted, so these methods tend to perform better when data is scarce.
    3. **Approach**: Neural network training is end-to-end. In classical machine learning, you typically break the problem down into different parts, solve them individually, and combine them to get the result. In deep learning you train the model end to end (black box).
    4. **Training time**: Deep neural networks take a lot longer to train.
    5. **Hardware dependencies**: Deep learning usually takes advantage of GPUs to speed up training. 
    6. **Interpretability**: With millions of parameters, it is hard to interpret what is going on inside a deep neural network, and difficult to explain why a particular prediction was made.

## Why use deep learning

- Manually designed features are often over-specified
- Learned features are adaptable 
- Deep learning is very flexible 
- Deep learning can handle supervised and unsupervised tasks
- Deep learning has achieved **superior** performance on many tasks since 2010 (computer vision, NLP, classification tasks, game-playing)


## Why has it gotten better ?
- Explosion in the amount of data
- Much faster machines, with more powerful CPUs and GPUs
- New models, algorithms, ideas

# Neural Network Basics 
- Perceptron 
- Fully Connected Feedforward Neural Networks
- Intuition:
    - Neural networks are loosely modeled on the brain. Each node (neuron) applies an operation to its inputs and passes its output to the next layer. These neurons can be connected into networks to fit more complicated patterns. 

# Perceptron 
- The most basic artificial neuron, developed in the 1950s and 60s by Frank Rosenblatt
- So what is a perceptron? It is simply a dot product followed by a threshold function 

![image](../Data/11-fig/perception.png)

- The inputs $x_1,x_2, x_3$ are multiplied by weights $w_1,w_2, w_3$ to determine their relative importance:

> > $output = \begin{cases} 0 \mbox{ if } \sum_{j} w_j x_j \leq \mbox{thresh} \\ 1 \mbox{ if } \sum_{j} w_j x_j > \mbox{thresh} \end{cases} $

- These weights are learned in training 
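
The formula above can be sketched in a few lines of plain Python. The weights and threshold below are hand-picked, hypothetical values (chosen so the perceptron behaves like a logical AND); in practice they would be learned.

```python
def perceptron(x, w, thresh):
    """Return 1 if the weighted sum of the inputs exceeds thresh, else 0."""
    weighted_sum = sum(w_j * x_j for w_j, x_j in zip(w, x))
    return 1 if weighted_sum > thresh else 0

# Hypothetical weights that make the perceptron act as a logical AND
w = [1.0, 1.0]
thresh = 1.5
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, perceptron(x, w, thresh))
```

Only the input `(1, 1)` gives a weighted sum above the threshold, so it is the only one mapped to 1.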

# Activation Neuron

- More generally, a neural network takes a vector $x$ and makes predictions by a composition of linear transformations and non-linear activation layers.
- Each layer computes:
>  $output = f(Wx + b)$
- and passes the output to the next layer of the network
- $f$ is some non-linear activation function, W is a matrix of weights, and b is a vector of biases. 


# Sigmoid Neuron
- A sigmoid neuron simply uses the sigmoid function as its activation function:
> $output = \sigma(Wx + b)$ 
- Where $ \sigma(z) = \frac{1}{1 + e^{-z}}$
![image](../Data/11-fig/sigmoid.png)

# ReLU activation functions

- ReLU: $f(x) = \max(x, 0)$
    > ![image](../Data/11-fig/relu.png)
    

# Tanh activation functions
- Tanh: $f(x) = \tanh(x)$
 > ![image](../Data/11-fig/tanh.gif)
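
For reference, the three activation functions from the preceding slides can be written for scalar inputs as follows. This is a plain-Python sketch; the function names `sigmoid` and `relu` are illustrative, and deep learning libraries provide their own vectorized versions.

```python
import math

def sigmoid(z):
    """Squash z into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    """Pass positive values through, clip negative values to 0."""
    return max(z, 0.0)

# tanh is available directly as math.tanh and maps z into (-1, 1)
for z in [-2.0, 0.0, 2.0]:
    print(z, sigmoid(z), relu(z), math.tanh(z))
```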

# Fully connected neural networks
- Stack layers of neurons together, using the outputs of the previous layer as inputs to the following layer. When there are many layers, the network is ** _deep_**

![image](../Data/11-fig/fullnetwork.png)

## Forward Propagation:

- given $h^0 = x$, we have 
> $h^{i +1}= f(W^{i} h^{i}  + b^{i})$
- where $h^i$ is the output of the $i$-th hidden layer, and the final layer gives the prediction
> $\hat{y} = f(W^{n-1} h^{n-1} + b^{n-1})$
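
The recursion above can be sketched in plain Python, with each weight matrix stored as a list of rows. The helper names `layer` and `forward`, the 2-3-1 network shape, and all weight values are hypothetical choices for illustration; real code would use a library such as numpy for the matrix products.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(W, b, h, f):
    """Compute f(W h + b) for one layer; W is a list of rows."""
    return [f(sum(w_jk * h_k for w_jk, h_k in zip(row, h)) + b_j)
            for row, b_j in zip(W, b)]

def forward(x, weights, biases, f=sigmoid):
    """Apply h^{i+1} = f(W^i h^i + b^i) layer by layer, starting from h^0 = x."""
    h = x
    for W, b in zip(weights, biases):
        h = layer(W, b, h, f)
    return h

# Hypothetical 2-3-1 network with hand-picked weights
weights = [
    [[0.5, -0.5], [1.0, 1.0], [-1.0, 0.5]],   # W^0: 3 x 2
    [[1.0, -1.0, 0.5]],                       # W^1: 1 x 3
]
biases = [[0.0, 0.0, 0.0], [0.0]]
y_hat = forward([1.0, 2.0], weights, biases)
print(y_hat)
```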

# Backpropagation Basics

- We are given a training set $\{(x^{(1)}, y^{(1)}), ... , (x^{(m)}, y^{(m)})\}$ of $m$ training examples. 
- We need to define a loss, e.g. the squared error $L = J(W, b; x, y) = \sum_i (\hat{y}_i - y_i)^2$ (dividing by $m$ gives the mean squared loss)
- Backpropagation computes the gradient of the loss with respect to the weights; gradient descent then uses that gradient to update them
> $W_{t+1} = W_t - \eta \frac{\partial L}{\partial W}\big|_{W_t}$

## Backpropagation Continued
- Intuition: flow the gradients of the loss backward through the network to update the weights
- More references on backprop:
    - https://ayearofai.com/rohan-lenny-1-neural-networks-the-backpropagation-algorithm-explained-abf4609d4f9d
    - http://cs231n.github.io/optimization-2/
    - http://web.stanford.edu/class/cs224n/lecture_notes/cs224n-2017-gradient-notes.pdf
    
- It's beautifully local: every node in the network can right away compute two things:
    - its output value
    - the local gradient of its output with respect to its inputs
- Backpropagation is just repeated application of the chain rule through the network
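
To make the update rule concrete, here is a minimal sketch of gradient descent on the one-layer model $\hat{y} = Wx + b$ with the squared-error loss, where the gradients are worked out by hand with the chain rule. The data, the initial values and the learning rate are chosen to match the Tensorflow training example later in this lecture; everything else is an illustrative assumption.

```python
x_train = [1, 2, 3, 4]
y_train = [0, -1, -2, -3]
W, b = 0.3, -0.3     # deliberately wrong starting values
eta = 0.01           # learning rate

for step in range(1000):
    # forward pass: residuals (y_hat - y) for every training example
    residuals = [W * x + b - y for x, y in zip(x_train, y_train)]
    # backward pass (chain rule): dL/dW = sum 2*r*x, dL/db = sum 2*r
    grad_W = sum(2 * r * x for r, x in zip(residuals, x_train))
    grad_b = sum(2 * r for r in residuals)
    # gradient descent update: W <- W - eta * dL/dW
    W -= eta * grad_W
    b -= eta * grad_b

loss = sum((W * x + b - y) ** 2 for x, y in zip(x_train, y_train))
print(W, b, loss)
```

The data lie exactly on the line $y = -x + 1$, so `W` and `b` should approach -1 and 1 while the loss approaches 0.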

# Deep Learning Libraries

# Tensorflow
- ``` pip install tensorflow ```
- Open-source backend library developed by Google Brain and released under the Apache 2.0 license
- Advantages:
    - Good community
    - Very flexible. You just define your computation as a data flow graph, and you can define it however you like. 
    - Portable: can run on CPUs and GPUs
    - Creates a static computational graph for fast backpropagation 
- Negatives:
    - It's big and complicated
    - Lots going on, easy for beginners to feel overwhelmed
    - It creates a static computational graph, so it is at times inflexible

## Tensorflow Basics
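- Two steps:
    - Building the computational graph.
    - Running the computational graph.
- A computational graph is just a series of Tensorflow operations. 
- The cells below use the Tensorflow 1.x graph API (```tf.Session```, ```tf.placeholder```); in Tensorflow 2.x these names are only available under ```tf.compat.v1```.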

In [9]:
import tensorflow as tf

node1 = tf.constant(3.0, dtype=tf.float32)
node2 = tf.constant(4.0)
print(node1, node2) # Doesn't actually run the graph just creates it 

Tensor("Const:0", shape=(), dtype=float32) Tensor("Const_1:0", shape=(), dtype=float32)


In [10]:
sess = tf.Session()
node3 = tf.add(node1, node2)
print(sess.run([node1, node2, node3]))

[3.0, 4.0, 7.0]


# Placeholders

- Well, that wasn't very interesting: it only produces a constant result.
- Don't we need some way to specify inputs?

- Yes! We can parameterize the graph to accept inputs using placeholders (external inputs)
- To declare a placeholder, use
```python
tf.placeholder(dtype, shape, name)
```

In [11]:
x = tf.placeholder(tf.float32)
x2 = tf.placeholder(tf.float32)
sub_nodes = x - x2
print(sess.run(sub_nodes, {x: 5, x2: 2}))
print(sess.run(sub_nodes, {x: [5, 2,3], x2: [2,1,1]}))

3.0
[3. 1. 2.]


# Variables  
- The whole point of deep learning was to learn the weights, so how do we make those?
- We need to tell Tensorflow that these variables correspond to the weights we need to learn

- To do this, use ``` tf.Variable(initial_val , dtype) ```
- Constants, as we have seen above, are initialized on creation
- Variables have to be initialized at run time by running ``` tf.global_variables_initializer() ```

In [12]:
W = tf.Variable([.5], dtype = tf.float32) # define our variables
b = tf.Variable([-.5], dtype = tf.float32)
x = tf.placeholder(tf.float32) # define our inputs
linear_predictor = W*x + b # define our model
init = tf.global_variables_initializer() # create the op that initializes all variables
sess.run(init) # run it, so W and b get their initial values
feed_dict = {x: [1,2,3,4,5,6]} # specify inputs
print(sess.run(linear_predictor, feed_dict))

[0.  0.5 1.  1.5 2.  2.5]


# Define a loss and train 
- So we've got a simple model $y = Wx + b$, and now we want to learn the correct weights
- To do this we need to define a loss function and an optimizer 
- Tensorflow provides a large number of  [loss functions](https://www.tensorflow.org/versions/r0.12/api_docs/python/nn/losses)
- They also provide a large number of [optimizers](https://www.tensorflow.org/api_guides/python/train)
- To train a model, you specify the computational graph, define the loss function, set an optimizer, initialize your variables, and run


In [13]:
import tensorflow as tf
## CREATE YOUR MODEL 
W = tf.Variable([.5], dtype = tf.float32) # define our variables
b = tf.Variable([-.5], dtype = tf.float32)
# define our inputs
x = tf.placeholder(tf.float32) 
# define our model
linear_predictor = W*x + b 
 # define placeholder for y variables 
y = tf.placeholder(tf.float32)
# CREATE YOUR LOSS
loss = tf.reduce_sum(tf.square(linear_predictor - y))
# SPECIFY YOUR OPTIMIZER
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)

In [11]:
import tensorflow as tf
## CREATE YOUR MODEL 
W = tf.Variable([.3], dtype=tf.float32)
b = tf.Variable([-.3], dtype=tf.float32)
# define our inputs
x = tf.placeholder(tf.float32)
# define our model
linear_model = W * x + b
 # define placeholder for y variables
y = tf.placeholder(tf.float32)

# CREATE YOUR LOSS
loss = tf.reduce_sum(tf.square(linear_model - y)) # sum of the squares
# SPECIFY YOUR OPTIMIZER
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)

In [15]:
# training data
import time
x_train = [1, 2, 3, 4]
y_train = [0, -1, -2, -3]
# training loop
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init) # reset variables to their (incorrect) initial values
for i in range(1000):
  time.sleep(.001)
  _, loss_val = sess.run([train, loss], {x: x_train, y: y_train})
  print("Loss is: {}".format(loss_val), end = '\r')
# evaluate training accuracy
print()
curr_W, curr_b, curr_loss = sess.run([W, b, loss], {x: x_train, y: y_train})
print("W: %s b: %s loss: %s"%(curr_W, curr_b, curr_loss))

Loss is: 7.687006586820644e-11
W: [-0.9999964] b: [0.9999894] loss: 7.598544e-11


# Tensorflow is very powerful

- Please read the [documentation](https://www.tensorflow.org/get_started/) for more examples

# Keras
- ``` pip install keras ```
- Advantages:
    - Way easier to use
    - It is more of a front-end library, unlike Tensorflow which is a back-end library. 
    - Capable of running on top of other Machine and Deep Learning libraries like Tensorflow, CNTK or Theano.
- Disadvantages:
    - Relatively opaque implementation
    - Harder to create your own new networks
    - Less control
- Let's make a simple binary classifier using two hidden layers

In [16]:
from keras.models import Sequential
from keras.layers import Dense, Activation 
import numpy as np

# define our training/testing set 
x_train = np.random.random((1000, 10))
y_train = np.random.randint(2, size= (1000,1))
x_test = np.random.random((200, 10))
y_test = np.random.randint(2, size=(200,1))


Using Theano backend.


In [17]:
model = Sequential()
# Add a dense, fully connected feed-forward layer with 32 hidden units
model.add(Dense(32, input_dim = 10, activation = 'relu'))
# hidden layer of size 32
model.add(Dense(32, activation = 'relu'))
# output unit
model.add(Dense(1, activation = 'sigmoid'))

# compile the model specifying the loss
model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

# fit the model 
model.fit(x_train, y_train, epochs=20,batch_size=128)
score = model.evaluate(x_test, y_test, batch_size=128)

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20

# Summary
#### Keras :
- is a fast, flexible prototyping tool
- Can be used on top of Tensorflow
- Not apt for large scale research
- Good to test out potential ideas on a dataset

#### Tensorflow:
- is the standard in deep learning research
- very flexible, GPU support, automatic differentiation
- statically defined: must declare a computational graph and run it
- Nice tensorboard visualization module
- Most Deep Learning classes at Stanford use Tensorflow
  

## PyTorch

- Developed in part by Facebook, Stanford, Nvidia...
- Similarly flexible to Tensorflow 
- Dynamic computational graph (each graph is computed on the fly)
- This leads to a more pythonic API
- I love it


## Tensors in Pytorch
- Conceptually identical to a numpy array
- A generic tool for scientific computing: tensors by themselves have no knowledge of deep learning or computational graphs. 
- They can utilize GPUs to speed up their computation. 

## Variables in Pytorch

- The ```autograd``` package allows for automatic differentiation of variables. 
- The forward pass through the network defines the computational graph (nodes are tensors, edges are functions)
- PyTorch autograd looks a lot like TensorFlow:
    - in both frameworks we define a computational graph, and use automatic differentiation to compute gradients. 
    - The difference between the two is that Tensorflow's computational graphs are static, while PyTorch uses dynamic computational graphs.
    
- In pytorch each forward pass defines a new computational graph. 

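- To create a Variable, wrap Tensors in ```Variable``` objects. This variable then represents a node in the computational graph. (Since PyTorch 0.4, ```Variable``` has been merged into ```Tensor```, so a tensor created with ```requires_grad=True``` plays the same role.)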
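
To make the idea of a graph that is built during the forward pass concrete, here is a toy sketch in plain Python. It is not how PyTorch is implemented: the class `Value` and its attributes are hypothetical, and only scalar addition and multiplication are supported.

```python
class Value:
    """A scalar that records how it was computed (a node in the graph)."""

    def __init__(self, data, parents=(), grad_fns=()):
        self.data = data
        self.grad = 0.0
        self.parents = parents      # the Values this one was computed from
        self.grad_fns = grad_fns    # maps upstream gradient to each parent's share

    def __add__(self, other):
        return Value(self.data + other.data, (self, other),
                     (lambda g: g, lambda g: g))

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other),
                     (lambda g: g * other.data, lambda g: g * self.data))

    def backward(self):
        """Accumulate d(self)/d(node) into node.grad for every node used."""
        # topological order, so a node's grad is complete before it is passed on
        order, seen = [], set()

        def visit(v):
            if id(v) not in seen:
                seen.add(id(v))
                for p in v.parents:
                    visit(p)
                order.append(v)

        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            for parent, grad_fn in zip(v.parents, v.grad_fns):
                parent.grad += grad_fn(v.grad)   # chain rule

w = Value(3.0)
x = Value(2.0)
b = Value(1.0)
y = w * x + b      # running this line is what builds the graph
y.backward()
print(y.data, w.grad, x.grad, b.grad)   # 7.0 2.0 3.0 1.0
```

Because the graph is recorded while ordinary Python code runs, loops and `if` statements in the forward pass simply produce a different graph on each call, which is what "dynamic" means here.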

In [2]:
import torch
from torch.autograd import Variable

dtype = torch.FloatTensor
N, D_in, H, D_out = 64, 500, 100, 10

# Setting requires_grad=False indicates that we do not need to compute gradients w.r.t var
# during the backward pass.
x = Variable(torch.randn(N, D_in).type(dtype), requires_grad = False)
y = Variable(torch.randn(N, D_out).type(dtype), requires_grad = False)

# Setting requires_grad=True indicates that we want to compute gradients with
# respect to these Variables during the backward pass.
w1 = Variable(torch.randn(D_in, H).type(dtype), requires_grad=True)
w2 = Variable(torch.randn(H, D_out).type(dtype), requires_grad=True)


In [3]:
learning_rate = 1e-6
for t in range(10000):
    # Forward pass: compute predicted y using operations on Variables.
    y_pred = x.mm(w1).clamp(min=0).mm(w2)

    # Compute and print loss using operations on Variables.
    # Now loss is a Variable of shape (1,) and loss.data is a Tensor of shape
    # (1,); loss.data[0] is a scalar value holding the loss.
    loss = (y_pred - y).pow(2).sum()

    # Use autograd to compute the backward pass. This call will compute the
    # gradient of loss with respect to all Variables with requires_grad=True.
    loss.backward()

    # Update weights using gradient descent; w1.data and w2.data are Tensors,
    # w1.grad and w2.grad are Variables and w1.grad.data and w2.grad.data are
    # Tensors.
    w1.data -= learning_rate * w1.grad.data
    w2.data -= learning_rate * w2.grad.data

    # Manually zero the gradients after running the backward pass
    w1.grad.data.zero_()
    w2.grad.data.zero_()
    print("Loss is: {}".format(loss.data.numpy()), end = '\r')

print()
print("Final loss is {}".format(loss.data[0]))

Loss is: [4.962239e-07]
Final loss is 4.962238904226979e-07


## That's still fairly cumbersome

- When building neural networks, we arrange the computation into layers, some of which have learnable parameters that are optimized during training.
- Use the ``` torch.nn ``` package to define your layers
- Create custom networks by subclassing ```nn.Module```:
    - specify the layers in ```__init__``` 
    - define the forward pass in a ```forward(self, x)``` method
- Really clean code!

In [35]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms
from torch.autograd import Variable

class TwoLayerNet(nn.Module):
    
    def __init__(self, D_in, H, D_out):
        super(TwoLayerNet, self).__init__()
        self.layer1 = nn.Linear(D_in, H)
        self.layer2 = nn.Linear(H, D_out)
        
    def forward(self, x):
        out = F.relu(self.layer1(x))
        out = self.layer2(out)
        return out
    

In [36]:
# N is batch size; D_in is input dimension; H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random Tensors to hold inputs and outputs, and wrap them in Variables
x = Variable(torch.randn(N, D_in))
y = Variable(torch.randn(N, D_out), requires_grad=False)

# Construct our model by instantiating the class defined above
model = TwoLayerNet(D_in, H, D_out)

# Construct our loss function and an Optimizer. 
criterion = torch.nn.MSELoss(size_average=False)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)
for t in range(1000):
    # Forward pass: Compute predicted y by passing x to the model
    y_pred = model(x)

    # Compute and print loss
    loss = criterion(y_pred, y)

    # Zero gradients, perform a backward pass, and update the weights.
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    
print("Final Loss is {}".format(loss.data[0]))

Final Loss is 5.559909199703839e-10


# For more examples... 
- check out [Pytorch Docs](http://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html)

## Examples
- Check notebook titled 01_Simple_Linear_Model.ipynb, ported from Hvass-Labs [Tensorflow Tutorials](https://github.com/Hvass-Labs/TensorFlow-Tutorials)
- These provide a good introduction to the Tensorflow framework and delve into more complicated examples 

## Attribution
This lecture was developed drawing on material from [Pytest](https://semaphoreci.com/community/tutorials/testing-python-applications-with-pytest), [Pytorch Examples](https://github.com/jcjohnson/pytorch-examples), [UFLDL](http://ufldl.stanford.edu/tutorial/supervised/MultiLayerNeuralNetworks/), [CS224](http://web.stanford.edu/class/cs224n/lectures/cs224n-2017-lecture1.pdf), and the Tensorflow [documentation](https://www.tensorflow.org/get_started/mnist/beginners).

# Wrap Up

# Zen of Python

Very easy to write code.

A ton of packages already exist to help with almost any task you like.

Once you know the basics, it is very easy to pick up everything else, and there are a ton of resources as well!

# Feedback

Thanks a lot!

Hope you enjoyed the class, learned a lot and will continue using Python!

Please fill out feedback forms at the end of the quarter - or feel free to let me know any feedback you have.

Questions?