## Answer to part 5

#### Do this: look through the output from `nosetests` and make a (mental) list of things that are wrong with the new function

The new function doesn't return common n-grams in the order in which they're encountered.

The new function doesn't correctly deal with n less than 1.



The first difference in behaviour is an interesting one: do we actually care about the order? Probably not, since the n-grams are scattered throughout the text.

Let's fix the tests rather than the function. If we were using `assert`, then we could wrap everything in `set()` like this:

In [2]:
def find_common_ngrams(text, cutoff, n):
    if n < 1:
        return []
    words = text.lower().split(' ')
    result = []
    for start in range(len(words) +1 - n):
        ngram = ' '.join(words[start:start+n])
        if text.count(ngram) >= cutoff and ngram not in result:
            result.append(ngram)
            
    return result

text = "it was the best of times it was the worst of times"

assert set(find_common_ngrams(text, 2, 1)) == set(['it', 'was', 'the', 'of', 'times'])

But with `nose` it's more elegant: we can replace `assert_equal` with `assert_items_equal`:

In [3]:
from nose.tools import assert_equal, assert_items_equal

def test_single_words():
    assert_items_equal(find_common_ngrams(text, 2, 1), ['it', 'was', 'the', 'of', 'times'])

def test_all_words():
    assert_items_equal(find_common_ngrams(text, 1, 1), ['it', 'was', 'the', 'best', 'of', 'times', 'worst'])
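
An order-insensitive comparison is not quite the same thing as wrapping both sides in `set()`: a set also throws away duplicates. Here is a minimal sketch of the difference using only the standard library. The helper name `same_items` is hypothetical, and `collections.Counter` is used here as a stand-in for the kind of check `assert_items_equal` performs (same elements, same counts, any order), not as its actual implementation. On Python 3 the underlying `unittest` method is called `assertCountEqual`, so the `nose.tools` name may differ there.

```python
from collections import Counter

def same_items(first, second):
    """Hypothetical helper: True if both lists hold the same elements
    the same number of times, regardless of order."""
    return Counter(first) == Counter(second)

expected = ['it', 'was', 'the', 'of', 'times']
shuffled = ['times', 'of', 'the', 'was', 'it']
with_duplicate = ['it', 'it', 'was', 'the', 'of', 'times']

# Order is ignored by the count-based check.
print(same_items(shuffled, expected))          # True
# set() hides the duplicated 'it', so this comparison still passes.
print(set(with_duplicate) == set(expected))    # True
# The count-based check notices the extra 'it'.
print(same_items(with_duplicate, expected))    # False
```

For our function the two approaches agree, because it never returns the same n-gram twice, but the count-based check would catch it if a future rewrite started doing so.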

The second problem is more interesting: note that we get exactly the same output for n=0 as we originally got with the old function:

```
AssertionError: Lists differ: [''] != []

First list contains 1 additional elements.
First extra element 0:


- ['']
+ []

```

By editing the function, we have re-introduced an old bug. This is called a *regression* and is the reason for the *test-the-bug-before-we-fix-the-bug* approach. 
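
As a sketch of why that approach pays off: a regression test pins down the fixed behaviour, so a later rewrite cannot silently undo it. The test name `test_n_less_than_one` is hypothetical, the function body is copied from the cell earlier in this part, and plain `assert` is used so that the example doesn't depend on `nose`.

```python
def find_common_ngrams(text, cutoff, n):
    # Guard against nonsensical n: this is the behaviour the test protects.
    if n < 1:
        return []
    words = text.lower().split(' ')
    result = []
    for start in range(len(words) + 1 - n):
        ngram = ' '.join(words[start:start + n])
        if text.count(ngram) >= cutoff and ngram not in result:
            result.append(ngram)
    return result

text = "it was the best of times it was the worst of times"

def test_n_less_than_one():
    # Without the guard above, n=0 yields [''] instead of [].
    assert find_common_ngrams(text, 2, 0) == []
    assert find_common_ngrams(text, 2, -1) == []

# nosetests would discover this by name; here we just call it directly.
test_n_less_than_one()
```

If someone later rewrites the function and drops the `n < 1` guard, this test fails straight away instead of the bug quietly coming back.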

#### Do this: edit the *test_ngrams.py* file to solve the two problems we encountered. 

Change the tests to use `assert_items_equal` where necessary, and change the function to return an empty list for n<1.

Verify that when you re-run `nosetests` you don't get any errors.

[click here for part 7](Testing for scientists part 7.html)
