In [1]:
from traitlets.config.manager import BaseJSONConfigManager
# To make this work, replace path with your own:
# On the command line, type jupyter --paths to see where your nbconfig is stored
# Should be in the environment in which you install reveal.js
path = "/Users/Blake/.virtualenvs/cme193/bin/../etc/jupyter"
cm = BaseJSONConfigManager(config_dir=path)
cm.update('livereveal', {
              'theme': 'simple',
              'transition': 'zoom',
              'start_slideshow_at': 'selected',
    })

{u'start_slideshow_at': 'selected', u'theme': 'simple', u'transition': 'zoom'}

In [2]:
%%HTML 
<link rel="stylesheet" type="text/css" href="custom.css">

# CME 193 
## Introduction to Scientific Python
## Spring 2017

<br>

## Lecture 8
-------------
## Recursion, Exceptions, Unit testing, and more packages

---------

# Lecture 8 Contents

* Admin
* Recursion
* Exceptions
* Unit Testing
* More packages

## Administration

* Thank you for your project proposals! Really cool and interesting ideas.
* Complete either HW2 or Project by **May 16**.
* Exercises also due **May 16**.

### Project tips and general feedback

- If your project involves a dataset, make sure you tackle this step early
- HW2 is the benchmark for required deliverables.
- If you need to pivot along the way, that is fine; if the change is substantial, let us know.
- Have fun and research best practices along the way. You are *not* being graded on how well your model works.


## Office Hours

We will continue to hold office hours over the next two weeks:

- 11am - 1pm Tuesday/Thursday
 - ```May 4, May 9, May 11, May 16```

---------

# Recursion

Recursive functions solve problems by reducing them to smaller problems of the same form.

This allows recursive functions to call themselves.
 - New paradigm 
 - Powerful tool 
 - Divide-and-conquer 
 - Beautiful solutions

## First example

Let’s consider a trivial problem:

Suppose we want to add two positive numbers ```a``` and ```b```, but we can only add/subtract 1 to any number.

How would you write a function to do this without recursion? 

What control statement(s) would you use?

In [24]:
# Non-recursive solution
def add(a, b): 
    while b > 0:
        a += 1
        b -= 1 
    return a

add(7, 8)

15

## Recursive solution

- Simple case: 
 - If ```add(a,b)``` is called with ```b = 0``` just return ```a```
- Otherwise, we can return ```1 + add(a, b-1)```

In [25]:
# Recursive solution
# Adding b to a, (if only able to use +1)
def add(a, b):
    if b == 0:
        # base case
        return a
    # recursive step
    return add(a, b-1) + 1

### Base case and recursive steps

Recursive functions consist of two parts:

**Base case**: The base case is the trivial case that can be dealt with easily.

**Recursive step**: The recursive step breaks the problem into smaller subproblems, which brings us closer to the base case, and calls the function itself on them.
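
As a small illustration of the two parts, here is a sketch of a factorial function. The name `factorial` is chosen only for this example, and the function assumes a non-negative integer input.

```python
def factorial(n):
    """Return n! for a non-negative integer n (illustrative example)."""
    if n == 0:
        # base case: 0! is defined as 1
        return 1
    # recursive step: n! = n * (n-1)!, and n-1 is closer to the base case
    return n * factorial(n - 1)

print(factorial(5))  # 120
```

Each call handles one multiplication and hands the smaller problem to the next call, until the base case stops the chain.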

## Reversing a list

How can we recursively reverse a list? ```([1, 2, 3] → [3, 2, 1])```

 - If the list is empty or has one element, its reverse is the list itself
 - Otherwise, reverse elements 2 to n, then append the first element

In [26]:
def reverse_list(xs):
    if len(xs) <= 1:
        return xs
    else:
        # shift first element to last
        return reverse_list(xs[1:]) + [xs[0]]
    
reverse_list([1,2,3])

[3, 2, 1]

## Palindromes

- A palindrome is a word that reads the same in both directions, such as radar or level.

- Let’s write a function that checks whether a given word is a palindrome.

## The recursive idea

Given a word, such as level, we check:
 - whether the first and last characters are the same
 - whether the string with the first and last characters removed is itself a palindrome

## Base case

What’s the base case in this case? 

- The empty string is a palindrome 
- Any 1 letter string is a palindrome

In [27]:
def is_palin(s):
    '''returns True iff s is a palindrome'''
    if len(s) <= 1:
        return True
    return s[0] == s[-1] and is_palin(s[1:-1])

print((is_palin('cme193')))
print((is_palin('racecar')))

False
True


## Another example

Write a recursive function that computes $a^b$ for given $a$ and $b$, where $b$ is an integer. (Do not use ```**```.)

### Another example

Base case: $b=0$ , $a^b =1$

Recursive step: (be careful) there are actually two options, one for if b < 0 and one for if b > 0.

In [30]:
def power(a,b):
    if b == 0:
        return 1
    elif b > 0:
        return a*power(a,b-1)
    else:
        return (1./a)*power(a,b+1)
    
power(2,10)
power(2, -10) == 1.0/1024

True

### Example: Fibonacci

```python
fib(0) = 0

fib(1) = 1

fib(n) = fib(n-1) + fib(n-2)  for n >= 2
```

In [31]:
def fib(n):
    if n <= 1:
        return n
    f = fib(n-1) + fib(n-2)
    return f
fib(34)

5702887

## Memoization
#### A paradigm to trade memory for time

In [34]:
M = {0: 0, 1:1}

def fib_memo(n):
    if n in M:
        return M[n]
    # recurse through fib_memo (not fib) so that sub-results are cached too
    f = fib_memo(n-1) + fib_memo(n-2)
    M[n] = f
    return f
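
To see what memoization buys, the sketch below counts function calls for a plain and a cached Fibonacci. The names `fib_plain`, `fib_cached`, `calls` and `cache` are made up for this comparison and are not part of the lecture code.

```python
calls = {"plain": 0, "memo": 0}   # call counters, for illustration only

def fib_plain(n):
    calls["plain"] += 1
    if n <= 1:
        return n
    return fib_plain(n - 1) + fib_plain(n - 2)

cache = {0: 0, 1: 1}              # memory we trade for time

def fib_cached(n):
    calls["memo"] += 1
    if n not in cache:
        # each value is computed once, then looked up
        cache[n] = fib_cached(n - 1) + fib_cached(n - 2)
    return cache[n]

print(fib_plain(20), "computed with", calls["plain"], "calls")
print(fib_cached(20), "computed with", calls["memo"], "calls")
```

The cached version needs a number of calls that grows linearly with n, while the plain version grows exponentially. The standard library's `functools.lru_cache` decorator offers the same idea without a hand-written dictionary.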

## Trapezoid Rule

In [13]:
def trapezoid(f, a, b, N):
    '''integrates f over [a,b] using N steps'''
    if a > b:
        a, b = b, a
    # step size
    h = float(b-a)/N
    # running sum
    s = h/2 * (f(a) + f(b))
    # interior points a+h, ..., a+(N-1)h
    for k in range(1, N):
        s += h * f(a + h*k)
    return s

## Adaptive integration

In [14]:
def ada_int(f, a, b, tol=1.0e-6, n=5, N=10):
    area = trapezoid(f, a, b, N)
    check = trapezoid(f, a, b, n)
    if abs(area - check) > tol:
        # bad accuracy, add more points to interval
        m = (b + a) / 2.0
        # pass the caller's settings on to both halves
        area = ada_int(f, a, m, tol, n, N) + ada_int(f, m, b, tol, n, N)
    return area
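
As a sanity check of the idea, here is a self-contained sketch that integrates sin over [0, π], where the exact answer is 2. The names `trapezoid_rule` and `adaptive` are chosen for this example, and halving the tolerance on each split is one possible design choice, not something prescribed by the lecture.

```python
import math

def trapezoid_rule(f, a, b, N):
    """Composite trapezoid rule on [a, b] with N equal subintervals."""
    h = (b - a) / float(N)
    s = 0.5 * (f(a) + f(b))
    for k in range(1, N):          # interior points only
        s += f(a + k * h)
    return h * s

def adaptive(f, a, b, tol=1.0e-6):
    """Split [a, b] recursively until coarse and fine estimates agree."""
    coarse = trapezoid_rule(f, a, b, 5)
    fine = trapezoid_rule(f, a, b, 10)
    if abs(fine - coarse) > tol:
        m = (a + b) / 2.0
        # assumption: give each half of the interval half of the tolerance
        return adaptive(f, a, m, tol / 2) + adaptive(f, m, b, tol / 2)
    return fine

approx = adaptive(math.sin, 0.0, math.pi)
print(approx, abs(approx - 2.0))
```

The base case is an interval on which the two estimates agree; the recursive step halves the interval, so the function only spends extra points where the integrand is hard to approximate.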

## Pitfalls

Recursion can be very powerful, but there are some pitfalls: 
- Have to ensure you always reach the base case.

- Each successive call of the algorithm must be solving a simpler problem

- The number of function calls shouldn’t explode. (see exercises)

- An iterative algorithm is usually faster in Python because every function call has overhead. (However, the iterative solution might be much more complex.)
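
One more practical pitfall is that Python limits how deeply calls can nest. The sketch below, assuming Python 3 (where the error is called `RecursionError`), pushes the recursive `add` from earlier past that limit; the helper `too_deep` is made up for this demonstration.

```python
import sys

def add(a, b):
    if b == 0:
        return a
    return add(a, b - 1) + 1

def too_deep(depth):
    """Return True if add(0, depth) exceeds the recursion limit."""
    try:
        add(0, depth)
    except RecursionError:
        return True
    return False

limit = sys.getrecursionlimit()   # often 1000, but it can differ
print(limit)
print(too_deep(limit + 10))       # deeper than the limit allows
```

The loop-based version of `add` has no such restriction, which is one reason to prefer iteration when the recursion depth grows with the input size.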

---------

# Exceptions

## Exceptions
### Example

Consider a function that takes a filename and returns the k most common words (for example, the 20 most common). This is similar to one of the exercises you could have done.

Suppose we have written a function:

```python
topkwords(filename, k)
```


Instead of entering ```filename``` and value of ```k``` in the script, we may also want to run it from the terminal.

### Parse input from command line

The sys module allows us to read the terminal command that started the script:

```python
import sys

print(sys.argv)
```


## ```sys.argv```

```sys.argv``` holds a list with command line arguments passed to a Python script.

Note that ```sys.argv[0]``` will be the name of the python script itself.

```python
import sys

def topkwords(filename, k):
    """Returns the k most common words in filename."""
    pass

if __name__ == "__main__":
    filename = sys.argv[1]
    k = int(sys.argv[2])
    print(topkwords(filename, k))
```

## Issues

- What if the file does not exist?
- What if the second argument is not an integer? 
- What if no command line arguments are supplied?
- All result in errors: 
 - ```IOError```
 - ```ValueError```
 - ```IndexError```

## Exception handling

What do we want to happen when these errors occur? Should the program simply crash?

No, we want it to handle these gracefully:
- ```IOError```: Tell the user the file does not exist.
- ```ValueError```, ```IndexError```: Tell the user what the format of the command line arguments should be.

## Try ... Except

- The try clause is executed
- If no exception occurs, the except clause is skipped
- If an exception occurs, the rest of the try clause is skipped. Then if the exception type is matched, the except clause is executed. Then the code continues after the try statement
- If an exception occurs with no match in the except clause, execution is stopped and we get the standard error

```python
import sys
if __name__ == "__main__": 
    try:
        filename = sys.argv[1]
        k = int(sys.argv[2])
        print(topkwords(filename, k))
    except IOError:
        print("File does not exist")
    except (ValueError, IndexError):
        print("Error in command line input")
        print("Run as: python wc.py <filename> <k>")
        print("where <k> is an integer")
```

## A naked except

### A naked except
We can have a naked except that catches any error:
```python
try:
    t = 3.0 / 0.0
except:
    # handles any error
    print('There was some error')
```

Use this with extreme caution though: it also swallows errors you did not anticipate, so genuine bugs can become very hard to find and correct!

## Try - Except - Else

- The else clause, placed after all except clauses, runs only when the try block finishes without raising an exception

Why? 
 - It keeps code that we do not mean to protect out of the try block, so we do not accidentally catch an exception raised there. E.g. consider ```f.readlines``` raising an ```IOError```

```python
# from Python docs
import sys

for arg in sys.argv[1:]:
    try:
        f = open(arg, 'r')
    except IOError:
        print('cannot open', arg)
    else:
        print(arg, 'has', len(f.readlines()), 'lines')
        f.close()
```

## Raise

We can use ```raise``` to raise an exception ourselves.

```
>>> raise NameError('Oops')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
NameError: Oops
```
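
In practice we usually raise an exception when an input fails a check, and let the caller decide what to do. Here is a minimal sketch; the helper `parse_k` is hypothetical and only stands in for the command line handling discussed earlier.

```python
def parse_k(text):
    """Convert a command line argument to a positive integer k (hypothetical helper)."""
    k = int(text)   # int() itself raises ValueError for input like 'abc'
    if k <= 0:
        # our own check: signal bad input with a descriptive message
        raise ValueError("k must be positive, got {}".format(k))
    return k

try:
    parse_k("-3")
except ValueError as err:
    print("Bad input:", err)
```

Reusing a built-in exception type such as `ValueError` means existing `except ValueError` handlers catch both the conversion failure and our own check.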

## ```finally```

The finally clause is always executed before leaving the try statement, whether or not an exception has occurred.

Useful when we have to clean up, for example closing files or network connections.

In [35]:
def div(x, y):
    try:
        return x/y
    except ZeroDivisionError:
        print('Division by zero!') 
    finally:
        print("Finally clause")
print((div(3,2))) 
print((div(3,0)))

Finally clause
1
Division by zero!
Finally clause
None
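
A typical use of `finally` is releasing a resource. The sketch below closes a file whether or not reading it fails; `count_lines` is a made-up helper, and the temporary file exists only so the example can run on its own.

```python
import os
import tempfile

def count_lines(path):
    """Return the number of lines in the file at path."""
    f = open(path, 'r')
    try:
        return len(f.readlines())
    finally:
        # runs on normal return and also if readlines() raises
        f.close()

# create a small throwaway file to try it on
fd, path = tempfile.mkstemp(text=True)
with os.fdopen(fd, 'w') as tmp:
    tmp.write("a\nb\nc\n")

print(count_lines(path))  # 3
os.remove(path)
```

The `with` statement used to write the file gives the same guarantee for file objects with less code.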


## Raising our own exceptions

Recall the Rational class we considered a few lectures ago.

What if the denominator passed in to the constructor is zero?

## Raising our own exceptions

```python
from math import gcd

class Rational:
    def __init__(self, p, q=1):
        g = gcd(p, q)
        self.p = p // g  # integer division keeps p and q as ints
        self.q = q // g
```

What if ```q == 0```?

## Making the necessary change

```python
from math import gcd

class Rational:
    def __init__(self, p, q=1):
        if q == 0:
            raise ZeroDivisionError('denominator is zero')
        g = gcd(p, q)
        self.p = p // g  # integer division keeps p and q as ints
        self.q = q // g
```
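
From the caller's side, the raised exception can then be handled like any built-in one. The sketch below repeats the class so that it runs on its own; `make_rational` is a hypothetical wrapper, and returning `None` on failure is just one possible choice.

```python
from math import gcd

class Rational:
    def __init__(self, p, q=1):
        if q == 0:
            raise ZeroDivisionError('denominator is zero')
        g = gcd(p, q)
        self.p = p // g
        self.q = q // g

def make_rational(p, q):
    """Build a Rational, or report the problem and return None."""
    try:
        return Rational(p, q)
    except ZeroDivisionError as err:
        print("Cannot build {}/{}: {}".format(p, q, err))
        return None

r = make_rational(2, 4)
print(r.p, r.q)          # 1 2
make_rational(1, 0)      # prints the error message instead of crashing
```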

---------

# Unit tests

## Unit tests: 

Test individual pieces of code.

For example, for a factorial function, test that
```0! = 1```, ```3! = 6```, etc.

![title](img/test_comp.png)

## Test driven development

Some write tests before code. Reasons:
- Focus on the requirements
- Don’t write too much
- Safely restructure/optimize code
- When collaborating: don’t break others’ code 
- Faster

## Test cases

How to construct test cases?

A test case should answer a single question about the code.

A test case should:
- Run by itself, no human input required
- Determine on its own whether the test has passed or failed 
- Be separate from other tests

## What to test?
- Known values
- Sanity check (for conversion functions for example)
- Bad input
 - Input is too large?
 - Negative input?
 - String input when an integer was expected?
- etc: very dependent on problem

## ```unittest```

A testcase is created by subclassing ```unittest.TestCase```

Individual tests are defined as methods whose names start with the letters ```test```. (This allows the test runner to identify the tests.)

Each test usually calls an assert method to check a result; there are many assert methods to choose from.

There are a few different ways to run tests (see documentation). The easiest is to call ```unittest.main()```, for example when the test script is the main program.

# ```assert```

We can use a number of methods to check for failures:
- assertEqual
- assertNotEqual
- assertTrue, assertFalse
- assertIn
- assertRaises 
- assertAlmostEqual 
- assertGreater, assertLessEqual
- etc. (see Docs)

```python
import unittest
from my_script import is_palindrome

class KnownInput(unittest.TestCase):
    knownValues = (('lego', False),
                   ('radar', True))

    def testKnownValues(self):
        for word, palin in self.knownValues:
            result = is_palindrome(word)
            self.assertEqual(result, palin)
```

### ```unittest```

Note, to use the ```unittest``` package inside a Jupyter notebook instead of ```unittest.main()```, use:


``` unittest.main(argv=['ignored', '-v'], exit=False)```
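
Another option that should also work in a notebook cell is to build the test suite explicitly and hand it to a runner. The sketch below does this for the `is_palin` function from the recursion part; the class name `TestIsPalin` and the chosen test words are only illustrative.

```python
import unittest

def is_palin(s):
    '''returns True iff s is a palindrome'''
    if len(s) <= 1:
        return True
    return s[0] == s[-1] and is_palin(s[1:-1])

class TestIsPalin(unittest.TestCase):
    def test_known_palindromes(self):
        for word in ('', 'a', 'radar', 'level'):
            self.assertTrue(is_palin(word))

    def test_known_non_palindromes(self):
        for word in ('lego', 'cme193'):
            self.assertFalse(is_palin(word))

    def test_bad_input(self):
        # an int has no len(), so we expect a TypeError
        with self.assertRaises(TypeError):
            is_palin(12321)

# load the tests from the class and run them with verbose output
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestIsPalin)
result = unittest.TextTestRunner(verbosity=2).run(suite)
```

The returned `result` object can be inspected afterwards, for example with `result.wasSuccessful()`.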

# Alternatives

- ```nose2```
- ```Pytest```

http://nose2.readthedocs.io/en/latest/differences.html

## Pytest

```pip install pytest```

- Easy testing
- Automatically discovers tests
- No need to remember all assert functions, keyword assert works for everything
- Informative failure results

## Pytest

Test discovery: (basics)

- Scans files starting with test_ 
- Run functions starting with test_

## Example : primes

Create two files in a directory:
- ```primes.py``` – Implementation
- ```test_primes.py``` – Tests


```python
# primes.py
# (simplest solution that passes tests)

def is_prime(x):
    for i in range(2, x):
        if x % i == 0: 
            return False
    return True
```

```python

# test_primes.py
from primes import is_prime 
def test_is_three_prime():
    assert is_prime(3)
def test_is_four_prime(): 
    assert not is_prime(4)
```

### Using ```py.test``` to execute test suite

#### By default, it will run all files prefixed with ```test_```.

Here we pass in the name of our test script:

```py.test test_primes.py```

```python
from primes import is_prime 

def test_is_zero_prime():
    assert not is_prime(0) 
def test_is_one_prime():
    assert not is_prime(1) 
def test_is_two_prime():
    assert is_prime(2)
def test_is_three_prime(): 
    assert is_prime(3)
def test_is_four_prime(): 
    assert not is_prime(4)
```

## Some more tests

- Negative numbers 
- Non integers
- Large prime
- List of known primes 
- List of non-primes
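
A possible set of such tests is sketched below, together with a version of `is_prime` that passes them. Treating non-integers and numbers below 2 as "not prime" is an assumption made for this example; raising an exception for such input would be an equally reasonable design.

```python
def is_prime(x):
    # assumption: non-integers and numbers below 2 count as not prime
    if not isinstance(x, int) or x < 2:
        return False
    for i in range(2, x):
        if x % i == 0:
            return False
    return True

def test_negative_numbers():
    assert not is_prime(-7)

def test_non_integers():
    assert not is_prime(2.5)

def test_large_prime():
    assert is_prime(104729)   # the 10,000th prime

def test_known_primes():
    for p in [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]:
        assert is_prime(p)

def test_known_non_primes():
    for n in [0, 1, 4, 6, 8, 9, 15, 21, 25, 27]:
        assert not is_prime(n)
```

Saved in a file whose name starts with `test_`, these functions would be picked up by the test discovery described earlier.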

### When all tests pass...

- First make sure all tests pass
- Then optimize code, making sure nothing breaks

Now you can be confident that whatever algorithm you use, it still works as desired!

## Example - tests for Rational class

Recall the rational numbers class we made earlier. What are some unit tests that you would use to test that class?

### Test for Rational class
```python
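import unittest
# assumes the Rational class lives in exception_rational_fix.py
from exception_rational_fix import Rational

class TestMethods(unittest.TestCase):
    def test_denomZero(self):
        # constructing a Rational with denominator 0 should raise
        self.assertRaises(ZeroDivisionError, Rational, 1, 0)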

if __name__ == '__main__':
    unittest.main()
```

## Writing good tests
- Utilize automation and code reuse
- Know the type and scope - your module or somebody else’s?
- A single test should focus on a single thing
- Functional tests must be deterministic
- Leave no trace - safe setup and clean up

---------

# More packages

## More packages

We've seen the huge value in using packages for scientific programming. Let's check out a few more packages that could be useful for your projects and in the future.

Here is an amazing repository of curated Python resources: https://github.com/vinta/awesome-python

## Pickle

```python
import pickle

with open('grades.txt', 'r') as fin:
    with open('grades.bin', 'wb') as fout:
        lines = fin.readlines()
        n = len(lines)
        pickle.dump(n, fout)
        for line in lines:
            student = line.split()
            name = student[0]
            grades = [int(g) for g in student[1:]]
            pickle.dump(name, fout)
            pickle.dump(grades, fout)
```

## Pickle
There are definitely other options for binary file i/o in Python.

Just use the file modes ```'rb'``` and ```'wb'``` - but Pickle makes it easier to deal with conversion of objects to byte streams.

Easiest way to deal with unknown size of loads is to just save all data in one big data structure and load everything at once (though may be infeasible if you are working with a lot of data).

Warning: Pickle is not secure against erroneous or maliciously constructed data! Never unpickle data from a source you do not trust.
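
The "one big data structure" approach can be sketched as a round trip: dump a single dictionary, then load it back in one call. The grades below are made-up data, and the in-memory `io.BytesIO` buffer only stands in for a file opened with `'wb'` and later `'rb'`.

```python
import io
import pickle

grades = {'alice': [90, 85], 'bob': [70, 95]}   # made-up example data

buffer = io.BytesIO()          # stands in for open('grades.bin', 'wb')
pickle.dump(grades, buffer)    # one dump for the whole structure

buffer.seek(0)                 # rewind, as if reopening with 'rb'
loaded = pickle.load(buffer)   # one load gets everything back

print(loaded == grades)        # True
```

Because everything sits in one object, there is no need to store how many records follow, unlike the line-by-line version above.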

## Speeding up Python

Compared to C or Fortran, Python can be slow.

Ways to improve execution time:

- Pypy: no need to change any code, simply run your code using ```pypy script.py```. However, packages built on C extensions, such as Numpy, may not work or may run slower.
- Numba: A little more work, but works with numpy
- Cython: Most work, fastest

## ```requests```

HTTP library for Python

Alternative: ```urllib``` (and ```urllib2``` in Python 2)

```python
import requests

r = requests.get('http://google.com')

print(r.text)
```

## Beautiful Soup

Useful for scraping HTML pages.

Such as: finding all links, or specific urls.

Get data from poorly designed websites.

Alternative: Scrapy

## APIs

There are several modules that you can use to access the APIs of websites:
- Twitter: python-twitter, Tweepy
- Reddit: PRAW
- ...

Always a good idea to see if the maintainers of a particular data source provide an API.

With an API you can get data or, for the ambitious, create apps.

# Flask

## Flask

Flask is a 'microframework' for web development using Python.

Example script next, which you can run and then browse to http://127.0.0.1:5000/.

In-depth tutorial: http://blog.miguelgrinberg.com/post/the-flask-mega-tutorial-part-i-hello-world

```python
from flask import Flask

app = Flask(__name__)
@app.route("/")
def hello(name="World"):
    return "Hello {}!".format(name)
    
if __name__ == "__main__":
    app.run(debug=True) # only for debugging!
```

## Django

Another web development framework using Python
https://www.djangoproject.com/

## ```Scikit```

We saw ```scikit-learn``` last week.

There are other Scikit packages available with a lot of functionality. Sponsored by INRIA (and Google sometimes)

## ```scikit-*```

Additional scikit packages that extend Scipy: 

- ```scikit-aero```
- ```scikit-image```
- ```cuda```
- ```odes```

## ```PyMC```

A framework for Monte Carlo simulations

Tutorial: https://camdavidsonpilon.github.io/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/

## ```Selenium```

- Selenium Python bindings provide a simple API to write functional/acceptance tests using Selenium WebDriver.

Tutorial: http://selenium-python.readthedocs.io/

# Wrap Up

# Zen of Python

Very easy to write code.

A ton of packages already exist to help with almost any task you like.

Once you know basics, very easy to pick up everything else - and a ton of sources as well!

# Feedback

Thanks a lot!

Hope you enjoyed the class, learned a lot and will continue using Python!

Please fill out feedback forms at the end of the quarter - or feel free to let me know any feedback you have.

Questions?