## Q1.

Use GitHub to integrate our math library from the lab with Travis CI and Coveralls.
In the cell below, put a link to your GitHub `cs207test` repo so we can track your badges.

>*your answer here*

My repo is here.
https://github.com/lshen2009/CS207test.git

## Q2.

Take the implementation of binary search from a previous lecture/lab. Write unit tests for the algorithm (remember we have doctests in there), stripping the doctests down to those that illustrate the interface for a user. How do these doctests deal with the concerns we had?

Make sure you have good test coverage. You will be integrating with Travis CI and Coveralls.

In [141]:
%%file binsearch.py
def binary_search(da_array: list, needle, left:int=0, right:int=-1) -> int:
    """
    An algorithm that operates in O(lg(n)) to search for an element
    in an array sorted in ascending order.
    
    Parameters
    ----------
    da_array : list
        a list of "comparable" items sorted in non-descending order
    needle: an item to find in the array; it may or may not
        be in the array
    left: int, optional
        the left index in the array to search from. Default 0
    right: int, optional
        the right index in the array to search to. Default is -1
        in which case we will use the end of the array `len(da_array) - 1`
        
    Returns
    -------
    index: int
        an integer representing the index of `needle` if found, and -1
        otherwise
        
    Notes
    -----
    PRE: `da_array` is sorted in non-decreasing order (thus items in
        `da_array` must be comparable: implement < and ==)
    POST: 
        - `da_array` is not changed by this function (immutable)
        - returns `index`=-1 if `needle` is not in `da_array`
        - returns an int `index` in [0:len(da_array)] if
          `needle` is in `da_array`
    INVARIANTS:
        - If `needle` in `da_array`, needle in `da_array[rangemin:rangemax]`
          is a loop invariant in the while loop below.
          
    Examples
    --------
    >>> input = list(range(10))
    >>> binary_search(input, 5)
    5
    >>> binary_search(input, 4.5)
    -1
    >>> binary_search(input, 10)
    -1
    >>> binary_search([5], 5)
    0
    >>> binary_search([5], 4)
    -1
    >>> import numpy as np
    >>> binary_search([1,2,np.inf], 2)
    1
    >>> binary_search([1,2,np.inf], np.inf)
    2
    >>> binary_search(input, 5, 1,3)
    -1
    >>> binary_search(input, 2, 1,3)
    2
    >>> binary_search(input, 2, 3, 1)
    -1
    >>> binary_search(input, 2, 2, 2)
    2
    >>> binary_search(input, 5, 2, 2)
    -1
    """
    if left==0:
        rangemin = 0
    else:
        rangemin = left
    if right==-1:
        rangemax=len(da_array) - 1
    else:
        rangemax=right
    while True:
        "needle in da_array => needle in da_array[rangemin:rangemax]"   
        if rangemin > rangemax:
            index = -1
            return index
        #If rangemin and rangemax are both very high we do not want overflow,
        #so get the midpoint like this:
        midpoint = rangemin + (rangemax - rangemin)//2
        if da_array[midpoint] > needle:#lower part
            rangemax = midpoint - 1
        elif da_array[midpoint] < needle:
            rangemin = midpoint + 1
        else:
            index = midpoint
            return index


Overwriting binsearch.py


In addition, we should be **systematic** about testing our code. You should at least have some tests like this:

1. We should test with weird data, i.e. a weird array: does it have NaNs, is it numeric? Does it have 0 elements? 1 element? 2? In other words, test the boundaries.

2. Then think of how the needle relates to the above. Try needles less than or greater than the range of the sorted array, as well as needles in between (in both cases with the needle not being in the array). Try needles at the extremes of the array.

3. Test the integration of 1 and 2 to make sure something has not gone wrong there.

Note: you can always compare your answers with a linear search implemented in Python.
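As a rough illustration of that note, the sketch below checks a condensed copy of the unvalidated binary search against a linear scan on random sorted inputs. The names `linear_search` and `agrees_with_oracle` are hypothetical helpers introduced here, and the data is kept duplicate-free so both searches must agree on the index.

```python
import random

def binary_search(da_array, needle):
    """Condensed copy of the unvalidated binary search (default bounds only)."""
    rangemin, rangemax = 0, len(da_array) - 1
    while rangemin <= rangemax:
        midpoint = rangemin + (rangemax - rangemin) // 2
        if da_array[midpoint] > needle:
            rangemax = midpoint - 1
        elif da_array[midpoint] < needle:
            rangemin = midpoint + 1
        else:
            return midpoint
    return -1

def linear_search(da_array, needle):
    """Hypothetical reference oracle: O(n) scan, returns -1 if absent."""
    for i, item in enumerate(da_array):
        if item == needle:
            return i
    return -1

def agrees_with_oracle(trials=200, seed=0):
    """Compare both searches on random sorted, duplicate-free lists."""
    rng = random.Random(seed)
    for _ in range(trials):
        data = sorted(rng.sample(range(50), rng.randint(0, 10)))
        needle = rng.randint(-5, 55)  # sometimes below or above the data range
        if binary_search(data, needle) != linear_search(data, needle):
            return False
    return True

print(agrees_with_oracle())
```

Fixing the seed keeps a failing case reproducible, which matters once this runs on a CI service.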

For reference, here are some of our concerns from that lab:

#### What's happened to our issues from before?

- What if the value is not there in the array? What if it is there multiple times? What are we returning (why the -1)? Are we consistent in what we return?

We return -1 if the element is not in the array. If it is there multiple times, we return the index of one occurrence; which one is not defined. We are consistent in always returning an int, and for the not-found case we choose a value, -1, that can never be the position of a found element.
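A small sketch of the duplicate case, using a condensed copy of the unvalidated search (default bounds only). It only illustrates that the returned index points at some matching element, not necessarily the first one.

```python
def binary_search(da_array, needle):
    """Condensed copy of the unvalidated binary search (default bounds only)."""
    rangemin, rangemax = 0, len(da_array) - 1
    while rangemin <= rangemax:
        midpoint = rangemin + (rangemax - rangemin) // 2
        if da_array[midpoint] > needle:
            rangemax = midpoint - 1
        elif da_array[midpoint] < needle:
            rangemin = midpoint + 1
        else:
            return midpoint
    return -1

data = [2, 2, 2, 2]
hit = binary_search(data, 2)
print(hit, data[hit] == 2)       # an index of some occurrence of 2
print(binary_search(data, 3))    # not found, so the sentinel -1
```

A test for this case should therefore assert `data[hit] == needle` rather than a specific index, unless the implementation documents which occurrence it returns.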

- What if rangemax is so high as to create overflow?

We fixed it by computing the midpoint from the difference `rangemax - rangemin`, and have documented this in the algorithm.
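Python integers do not overflow, so the concern only bites in fixed-width languages. The sketch below simulates signed 32-bit arithmetic with a hypothetical helper `wrap_int32` to show why the difference-based midpoint is the safer form; the bounds are made-up values chosen so that their sum exceeds the int32 range.

```python
def wrap_int32(x):
    """Simulate signed 32-bit wraparound (an assumption for illustration only)."""
    return (x + 2**31) % 2**32 - 2**31

rangemin, rangemax = 2000000000, 2100000000  # each fits in a signed 32-bit int

# Naive midpoint: the sum wraps around to a negative number before halving.
naive = wrap_int32(rangemin + rangemax) // 2

# Difference-based midpoint: the difference is small, so nothing wraps.
safe = rangemin + wrap_int32(rangemax - rangemin) // 2

print(naive, safe)
```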


- What types are we supporting?

It seems that as long as we have a notion of equality (`==`) and a notion of `<` to support sorting, we are set. We should document this.
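To make that concrete, here is a sketch with a hypothetical `Version` class that defines only `==` and `<`. The search uses `>` as well, but Python falls back to the reflected `<` of the other operand, so this works as long as both operands are of the same comparable type.

```python
def binary_search(da_array, needle):
    """Condensed copy of the unvalidated binary search (default bounds only)."""
    rangemin, rangemax = 0, len(da_array) - 1
    while rangemin <= rangemax:
        midpoint = rangemin + (rangemax - rangemin) // 2
        if da_array[midpoint] > needle:
            rangemax = midpoint - 1
        elif da_array[midpoint] < needle:
            rangemin = midpoint + 1
        else:
            return midpoint
    return -1

class Version:
    """Hypothetical comparable type that implements only == and <."""
    def __init__(self, major, minor):
        self.key = (major, minor)
    def __eq__(self, other):
        return self.key == other.key
    def __lt__(self, other):
        return self.key < other.key

releases = [Version(1, 0), Version(1, 2), Version(2, 0)]  # already sorted
print(binary_search(releases, Version(1, 2)))  # present
print(binary_search(releases, Version(1, 1)))  # absent
```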

- What happens if we have a NaN in our array? Infinity?

If our preconditions are violated by the user, we can do anything. Handling it nicely might be costly, so we won't.
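The sketch below shows what "anything" can look like in practice with a condensed copy of the unvalidated search. Every ordering comparison against NaN is false, so a NaN at the midpoint falls through to the "found" branch. Infinity, by contrast, orders normally.

```python
def binary_search(da_array, needle):
    """Condensed copy of the unvalidated binary search (default bounds only)."""
    rangemin, rangemax = 0, len(da_array) - 1
    while rangemin <= rangemax:
        midpoint = rangemin + (rangemax - rangemin) // 2
        if da_array[midpoint] > needle:
            rangemax = midpoint - 1
        elif da_array[midpoint] < needle:
            rangemin = midpoint + 1
        else:
            return midpoint
    return -1

nan, inf = float("nan"), float("inf")

with_nan = [1, 2, nan, 4, 5]
# The first midpoint is index 2 (the NaN); nan > 1 and nan < 1 are both False,
# so the function reports a match at the NaN's position.
wrong = binary_search(with_nan, 1)
print(wrong, with_nan[wrong] == 1)

# Infinity compares like any other float, so the search behaves as specified.
print(binary_search([1, 2, inf], inf))
```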


- What if da_array was not an array?

The user violated the preconditions. Anything could happen. We could check for a list,
but then that would hurt a special class which implemented the Python sequence protocol.
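As an illustration of that trade-off, the sketch below runs a condensed copy of the unvalidated search on a `range` object, which supports `len` and indexing but is neither a list nor a tuple. The `collections.abc.Sequence` check is shown only as one possible looser alternative, not as what the lab prescribes.

```python
from collections.abc import Sequence

def binary_search(da_array, needle):
    """Condensed copy of the unvalidated binary search (default bounds only)."""
    rangemin, rangemax = 0, len(da_array) - 1
    while rangemin <= rangemax:
        midpoint = rangemin + (rangemax - rangemin) // 2
        if da_array[midpoint] > needle:
            rangemax = midpoint - 1
        elif da_array[midpoint] < needle:
            rangemin = midpoint + 1
        else:
            return midpoint
    return -1

evens = range(0, 100, 2)                 # sorted, indexable, but not a list
idx = binary_search(evens, 42)
print(idx, evens[idx])

print(type(evens) in (tuple, list))      # a strict list/tuple check rejects it
print(isinstance(evens, Sequence))       # a protocol-based check accepts it
```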

- What happens if the array is not sorted?

The user violated the preconditions. We could raise an error or violate the postconditions. If we sort the array, we violate the O(lg(n)) notion (fixing it seems dubious). Can we check whether it is sorted? That is naively O(n) and breaks our performance specification. We can add a guard that terminates the search after more than n iterations for the infinite case, or let the user have enough rope to hang themselves.
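For a sense of the cost, here is a minimal sketch of the naive sortedness check. The helper name `is_sorted` and the size `n` are assumptions for illustration; the point is that the check inspects every adjacent pair, while the search itself needs only about lg(n) probes.

```python
import math

def is_sorted(da_array):
    """O(n) precondition check: each adjacent pair must be non-decreasing."""
    return all(da_array[i] <= da_array[i + 1] for i in range(len(da_array) - 1))

print(is_sorted([1, 2, 2, 3]), is_sorted([4, 2, 3]))

n = 1000000
pairs_checked = n - 1                          # work done by is_sorted on sorted input
max_probes = math.floor(math.log2(n)) + 1      # worst-case midpoints in binary search
print(pairs_checked, max_probes)
```

If the check is wanted anyway, one option is to run it only in tests or behind a debug flag so the production path keeps its logarithmic bound.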



**Submit** this to us by creating a repo `cs207binsearch` under your userid with a file `binarysearch.py` and accompanying test file(s). Integrate with Travis CI and Coveralls. Set up badges on the README of your repo. Write the link to your repo below.

>*your answer here*

https://github.com/lshen2009/cs207binsearch

In [1]:
%%file binsearch.py
def binary_search(da_array: list, needle, left:int=0, right:int=-1) -> int:       
    """
    An algorithm that operates in O(lg(n)) to search for an element
    in an array sorted in ascending order.
    
    Newly added documentation by Lu Shen, Sep 29, 2016
    ----------
    Warning: the user should avoid passing a `da_array` for which
    (1) the elements are not sorted in non-decreasing order, or
    (2) there is missing data (NaN) or there are non-numeric elements.
    Checking for these situations takes O(n) time, which violates the
    O(lg(n)) time complexity.

    We raise an error if the needle is outside the range of da_array
    (checked only when da_array has at least two elements).
    -----------
    
    Parameters
    ----------
    da_array : list
        a list of "comparable" items sorted in non-descending order
    needle: an item to find in the array; it may or may not
        be in the array
    left: int, optional
        the left index in the array to search from. Default 0
    right: int, optional
        the right index in the array to search to. Default is -1
        in which case we will use the end of the array `len(da_array) - 1`
        
    Returns
    -------
    index: int
        an integer representing the index of `needle` if found, and -1
        otherwise
        
    Notes
    -----
    PRE: `da_array` is sorted in non-decreasing order (thus items in
        `da_array` must be comparable: implement < and ==)
    POST: 
        - `da_array` is not changed by this function (immutable)
        - returns `index`=-1 if `needle` is not in `da_array`
        - returns an int `index` in [0:len(da_array)] if
          `needle` is in `da_array`
    INVARIANTS:
        - If `needle` in `da_array`, needle in `da_array[rangemin:rangemax]`
          is a loop invariant in the while loop below.
          
    Examples
    --------
    >>> input = list(range(10))
    >>> binary_search(input, 5)
    5
    >>> binary_search(input, 4.5)
    -1
    >>> binary_search(input, 10)  # needle above the range raises an error (Lu Shen, Sep 29)
    Traceback (most recent call last):
        ...
    ValueError: The needle is outside the range of the array
    >>> binary_search([5], 5)
    0
    >>> binary_search([5], 4)
    -1
    >>> import numpy as np
    >>> binary_search([1,2,np.inf], 2)
    1
    >>> binary_search([1,2,np.inf], np.inf)
    2
    >>> binary_search(input, 5, 1,3)
    -1
    >>> binary_search(input, 2, 1,3)
    2
    >>> binary_search(input, 2, 3, 1)
    -1
    >>> binary_search(input, 2, 2, 2)
    2
    >>> binary_search(input, 5, 2, 2)
    -1
    """    
    import numpy as np
    #----- Input validation checks, Lu Shen, Sep 29 -----
    # NOTE: these checks are O(n), so they break the O(lg(n)) guarantee.
    if type(da_array) not in (tuple,list):
        raise TypeError("The array is not a list or tuple")
    n=len(da_array)
    if n==0:
        raise ValueError("can't search in an empty list")
    else:
        # every element must be numeric (np.inf counts as a float)
        if not all(isinstance(x,(int,float)) for x in da_array):
            raise TypeError("Not all elements are numeric (int or float)")
        if np.isnan(da_array).any():
            raise ValueError("This array has missing data")
        if n>=2:
            # adjacent pairs must be in non-decreasing order
            if any(da_array[i] > da_array[i+1] for i in range(len(da_array)-1)):
                raise TypeError("Not sorted")
            if needle<min(da_array) or needle>max(da_array):
                raise ValueError("The needle is outside the range of the array")
    #------ end of the validation checks ----------------------------
    
    if left==0:
        rangemin = 0
    else:
        rangemin = left
    if right==-1:
        rangemax=len(da_array) - 1
    else:
        rangemax=right
    while True:
        "needle in da_array => needle in da_array[rangemin:rangemax]"   
        if rangemin > rangemax:
            index = -1
            return index
        #If rangemin and rangemax are both very high we do not want overflow,
        #so get the midpoint like this:
        midpoint = rangemin + (rangemax - rangemin)//2
        if da_array[midpoint] > needle:#lower part
            rangemax = midpoint - 1
        elif da_array[midpoint] < needle:
            rangemin = midpoint + 1
        else:
            index = midpoint
            return index

Overwriting binsearch.py


In [2]:
%%file test_binsearch.py
from pytest import raises
from binsearch import binary_search
import numpy as np

def test_mymath():
    input = list(range(10))
    assert binary_search(input,5) == 5
    assert binary_search(input,4.5) == -1
    assert binary_search([5],5) == 0
    assert binary_search(input, 5, 1,3) == -1
    assert binary_search(input, 2, 1,3) == 2
    assert binary_search(input, 2, 3, 1) == -1
    assert binary_search(input, 2, 2, 2) == 2
    assert binary_search(input, 5, 2, 2) == -1    
def test_infinite_data():    
    assert binary_search([1,2,np.inf], 2) == 1
    assert binary_search([1,2,np.inf], np.inf) == 2
def test_array_type():
    with raises(TypeError):
        binary_search(5,5)
def test_missing_data():#The time complexity is O(n), so I add one warning in the doc    
    with raises(ValueError):
        binary_search([1,2,3,4,np.nan],2)  
def test_zero_elements():
    with raises(ValueError):
        binary_search([],2)  
def test_non_int_elements():
    with raises(TypeError):
        binary_search(['a',2,3],2)
def test_sorted():#The time complexity is O(n), so I add one warning in the doc
    with raises(TypeError):
        binary_search([4,2,3],2)         
def test_needle_toolarge():# We expect an error to be raised if the needle is larger than every element
    with raises(ValueError):
        binary_search([1,2,3,4],5) 
def test_needle_toosmall():# We expect an error to be raised if the needle is smaller than every element
    with raises(ValueError):
        binary_search([1,2,3,4],-2)  
def test_missing_and_str():#An integration of two errors together    
    with raises(TypeError):
        binary_search([1,2,3,4,"a",np.nan],2)          
def test_missing_and_str_large_needle():#An integration of three errors together    
    with raises(TypeError):
        binary_search([1,2,3,4,"a",np.nan],5)                  

Overwriting test_binsearch.py


In [3]:
!py.test  --cov --cov-report term-missing binsearch.py test_binsearch.py

platform darwin -- Python 3.5.2, pytest-2.9.2, py-1.4.31, pluggy-0.3.1
rootdir: /Users/lulushen/Desktop/CS207/cs207work, inifile: 
plugins: cov-2.3.1
collected 11 items 
test_binsearch.py ...........

---------- coverage: platform darwin, python 3.5.2-final-0 -----------
Name                Stmts   Miss  Cover   Missing
-------------------------------------------------
binsearch.py           33      0   100%
test_binsearch.py      43      0   100%
-------------------------------------------------
TOTAL                  76      0   100%




In [4]:
!rm -rf /tmp/cs207binsearch
!git clone https://github.com/lshen2009/cs207binsearch.git /tmp/cs207binsearch

Cloning into '/tmp/cs207binsearch'...
remote: Counting objects: 20, done.
remote: Compressing objects: 100% (14/14), done.
remote: Total 20 (delta 6), reused 15 (delta 4), pack-reused 0
Unpacking objects: 100% (20/20), done.
Checking connectivity... done.


In [5]:
!cp binsearch.py test_binsearch.py /tmp/cs207binsearch/

In [6]:
%%file /tmp/cs207binsearch/setup.cfg
[pytest]
addopts =  --cov-report term-missing --cov test_binsearch

Overwriting /tmp/cs207binsearch/setup.cfg


In [7]:
%%file /tmp/cs207binsearch/.travis.yml
language: python
python:
    - "3.5"
before_install:
    - pip install pytest pytest-cov
script:
    - py.test

Overwriting /tmp/cs207binsearch/.travis.yml


In [8]:
%%bash
pushd /tmp/cs207binsearch
git add .
git commit -m "travis integration" -a
git push
popd

/tmp/cs207binsearch ~/Desktop/CS207/cs207work
[master 4044468] travis integration
 2 files changed, 11 insertions(+), 8 deletions(-)
~/Desktop/CS207/cs207work


To https://github.com/lshen2009/cs207binsearch.git
   6984c3f..4044468  master -> master


In [9]:
%%file /tmp/cs207binsearch/.travis.yml
language: python
python:
    - "3.5"
before_install:
    - pip install pytest pytest-cov
    - pip install coveralls
script:
    - py.test
after_success:
    - coveralls

Overwriting /tmp/cs207binsearch/.travis.yml


In [10]:
%%bash
pushd /tmp/cs207binsearch
git add .
git commit -m "added coveralls" -a
git push
popd

/tmp/cs207binsearch ~/Desktop/CS207/cs207work
[master 207b2ac] added coveralls
 1 file changed, 4 insertions(+), 1 deletion(-)
~/Desktop/CS207/cs207work


To https://github.com/lshen2009/cs207binsearch.git
   4044468..207b2ac  master -> master


In [11]:
%%file /tmp/cs207binsearch/README.md

# cs207binsearch

[![Build Status](https://travis-ci.org/lshen2009/cs207binsearch.svg?branch=master)](https://travis-ci.org/lshen2009/cs207binsearch)

[![Coverage Status](https://coveralls.io/repos/github/lshen2009/cs207binsearch/badge.svg?branch=master)](https://coveralls.io/github/lshen2009/cs207binsearch?branch=master)

Overwriting /tmp/cs207binsearch/README.md


In [12]:
%%bash
pushd /tmp/cs207binsearch
git add .
git commit -m "added badges" -a
git push
popd

/tmp/cs207binsearch ~/Desktop/CS207/cs207work
On branch master
Your branch is up-to-date with 'origin/master'.
nothing to commit, working tree clean
~/Desktop/CS207/cs207work


Everything up-to-date
