![alt text](./images/cover_b.png "Title")

<br>
<br>

# Table of Contents
- [Course Overview](#Python_Testing_Tutotial)
- [Quotes](#quotes)
- [Warm-up](#warm_up)
- [Unit Testing](#unit_testing)
- [Test Fixtures](#fixture_testing)
- [Parameterized Testing](#parameterized_testing)
- [Testing Command-Line Programs](#command_testing)
- [Test Suites](#test_suites)
- [Test Coverage](#test_coverage)
- [Testing New Features](#new_features)
- [Recap Puzzle](#recap)

<br>
<br>
<a id="Python_Testing_Tutotial"></a>

# Python Testing Tutorial

## Overview

**This tutorial helps you learn about automated testing in Python 3 using the `pytest` framework.**

![alt text](images/mobydick.png "Moby")

### Sources for this tutorial
[github.com/krother/python_testing_tutorial](https://github.com/krother/python_testing_tutorial)

### License
Released under the conditions of a [Creative Commons Attribution License 4.0](https://creativecommons.org/licenses/by/4.0/)

### Contributors
**Authors:** Kristian Rother, Magdalena Rother, Daniel Szoska

**This notebook:** Peter Causey-Freeman, Lecturer healthcare sciences (clinical bioinformatics), The University of Manchester 

**(Other content as referenced)**

<br>

# Counting Words in Moby Dick

## Moby Dick: Plot synopsis
*Captain Ahab was vicious because Moby Dick, the white whale, had bitten off his leg. So the captain set sail for a hunt. For months he was searching the sea for the white whale. The captain finally attacked the whale with a harpoon. Unimpressed, the whale devoured captain, crew and ship. The whale won.*

![alt text](images/counting.png "MobyTick")

## Video overview

In [12]:
from IPython.display import YouTubeVideo
# Youtube
YouTubeVideo('EFPhnR5CZtc', width=560, height=315)

## Course Objective
Herman Melville's book *“Moby Dick”* describes the epic fight between the captain of a whaling ship and a whale. In the book, the whale wins by eating most of the other characters.

**But does he also win by being mentioned more often?**

In this course, you have a program that analyzes the text of Melville's book.

**You will test whether the program works correctly.**


## Why was this example selected?

Three main reasons:

* The implementation is simple enough for beginners.
* Counting words easily yields different results (because of upper/lower case, special characters etc). 
* Therefore the program needs to be thoroughly tested.

<br>
<br>
<a id="quotes"></a>

# Quotes
**"Call me Ishmael"** <br> Herman Melville, Moby Dick 1851

----

**"UNTESTED == BROKEN"** <br> Schlomo Shapiro, EuroPython 2014

----

**"Code without tests is broken by design"** <br> Jacob Kaplan-Moss

----

**"Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it?"** <br> Brian Kernighan, "The Elements of Programming Style", 2nd edition, chapter 2

----

**"Pay attention to zeros. If there is a zero, someone will divide by it."** <br> Cem Kaner

----

**"If you don't care about quality, you can't meet any other requirement"** <br> Gerald M. Weinberg

----

**"Testing shows the presence, not the absence of bugs."** <br> Edsger W. Dijkstra

----

**"... we have as many testers as we have developers. And testers spend all their time testing, and developers spend half their time testing. We're more of a testing, a quality software organization than we're a software organization."** <br> Bill Gates (Information Week, May 2002)

<br>
<br>
<a id="warm_up"></a>

# Warm-up
## Group discussion?
How do you know that your code works? (give answers in the next cell)

*Give your answers in this cell*
.
.
.
.
.

<div class="alert alert-block alert-info">
    
## How many words are in the following sentence?

    The program works perfectly?

You will probably agree that the sentence contains **four words**, but your task is to count the words in the sentence using Python.

<br>

***note: all blue cells are exercises, so once you have had a go, click on the +/- to show/hide a possible solution***
</div>

In [5]:
# Create text string
text_1 = "The program works perfectly?"
# Split the string at white space characters
list_1 = text_1.split()
# Use Python to tell us how many words we have by calling the len function
len(list_1)

4

In [None]:
# How many words in the following sentence?


<div class="alert alert-block alert-info">

## How many words are in the next sentence?

    That #§&%$* program still doesn't work!\nI already de-bugged it 3 times, and still numpy.array keeps raising AttributeErrors.\tWhat should I do?

You may find the answer to this question less obvious. It depends on how precisely the special characters are interpreted.

<br>

Your task is to count the words in this sentence using python
</div>

In [7]:
# Create text string
text_2 = "That #§&%$* program still doesn't work!\nI already de-bugged it 3 times, and still numpy.array keeps raising AttributeErrors.\tWhat should I do?"
# Split the string at white space characters
list_2 = text_2.split()
# Use Python to tell us how many words we have by calling the len function
print("output 1: " + str(len(list_2)))

# Now use a regular expression to handle special characters
import re
list_3 = re.sub(r'\W+', ' ', text_2).split()
print("output 2: " + str(len(list_3)))

output 1: 22
output 2: 24
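
The two outputs differ because each approach applies a different rule for what counts as a word. The sketch below puts three such rules side by side. The helper names are illustrative, and the third rule (keeping apostrophes and hyphens inside words) is an assumption added for comparison, not part of the course code.

```python
import re

text = ("That #§&%$* program still doesn't work!\n"
        "I already de-bugged it 3 times, and still numpy.array keeps "
        "raising AttributeErrors.\tWhat should I do?")


def count_whitespace(s):
    """Count chunks separated by whitespace (spaces, tabs, newlines)."""
    return len(s.split())


def count_alphanumeric(s):
    """Count runs of letters, digits and underscores; everything else separates."""
    return len(re.findall(r"\w+", s))


def count_keep_joiners(s):
    """Like count_alphanumeric, but apostrophes and hyphens stay inside a word."""
    return len(re.findall(r"[\w'-]+", s))


print(count_whitespace(text))    # 22
print(count_alphanumeric(text))  # 24
print(count_keep_joiners(text))  # 22
```

The first and third rules both give 22 here, but they count different tokens: only the first treats `#§&%$*` as a word, and only the first keeps `numpy.array` as a single word.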


In [None]:
# How many words in the second sentence?


## Why use automated testing?
The examples above show that variations in your code can produce different answers, so it is essential to ensure that the functions you create return the result(s) you are expecting. You will also want to ensure that each function continues to return the correct result during ongoing development.

This can be achieved using automated testing.
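
As a minimal sketch of what an automated test looks like in pytest: a test is an ordinary function whose name starts with `test_` and which states the expected result with `assert`. The `count_words` helper is a hypothetical stand-in for the code under test, not part of the course material.

```python
def count_words(text):
    """Hypothetical function under test: count whitespace-separated words."""
    return len(text.split())


def test_count_words_simple():
    # pytest collects functions named test_* and reports a failure
    # if any assert statement inside them is false
    assert count_words("The program works perfectly?") == 4


def test_count_words_empty():
    # An empty string contains no words
    assert count_words("") == 0
```

Saved in a file whose name starts with `test_`, both functions would be picked up and run by `pytest` without any further registration.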

Writing automated tests for your software helps you to:

* get clear on what you want the program to do
* identify gaps in the requirements
* prove the presence of bugs (**not their absence!**)
* help you during code refactoring

<br>
<br>
<a id="unit_testing"></a>

# Unit Testing
## What is Unit Testing?
[Software Testing Fundamentals - Unit Testing](http://softwaretestingfundamentals.com/unit-testing/)
## What is pytest?
[pytest documentation](https://docs.pytest.org/en/latest/)

## Testing code with PyCharm
If you have been following the **"Your First Python Application"** PyCharm tutorials, in addition to following this notebook you may wish to perform the [testing](https://www.jetbrains.com/help/pycharm/testing-your-first-python-application.html) in this tutorial with PyCharm.

## Unit Testing exercises

<div class="alert alert-block alert-info">

### Exercise 1: Test a Python function

The function **main()** in the module **word_counter.py** calculates the number of words in a text body. For instance, the following sentence contains **three** words:

    Call me Ishmael

<br>

Your next task is to prove that the **TextCorpus** class correctly counts **three** words in this sentence.

Run the example test in **test_unit_test.py** with `pytest test_unit_test.py`
    
<br>

*Note: although pytest is a Python module, it is invoked from a standard command line/command prompt terminal.*

<br>

To invoke such commands from a notebook cell, precede the command with the ! character, *i.e.*

`! pytest test_unit_test.py`
</div>

In [None]:
# ! pytest test_unit_test.py


<div class="alert alert-block alert-info">

**Oh no, that went wrong?**

Look at the error produced by pytest and correct your command
</div>

In [None]:
# We are in the directory above test/. Since we are running a specific test file, we need to explicitly tell pytest where it is, i.e. test/<script> or ./test/<script>
! pytest test/test_unit_test.py

In [None]:
# Correct your command


<div class="alert alert-block alert-info">

**Still not working - let's do some debugging!**

<br>

The error is telling us **No module named 'mobydick'**. 

<br>

Use PyCharm to look at the project structure in Git and discuss what the problem might be (comment in the cell below marked Add comments here)

</div>

The **mobydick** module sits alongside the **test** directory rather than inside it. Python cannot import it from the test files without explicit instructions as to where it can be found.

There are several solutions, including adding directories to your Python path.

This is not a practical solution for portability though (*i.e. what happens if you want to run this tutorial from a different machine?*).

A better solution is to add additional code to your test scripts so that they can import the **mobydick** module.

In [None]:
# Hint, run this command
! echo $PATH

**Add comments here:** 


<div class="alert alert-block alert-info">

**Your next task is to add additional code (using PyCharm) to test_unit_test.py and run pytest again**

<br>

Remember, Google and Stack Overflow are your friends! *If at first you don't succeed, try, try and pytest again!*

<br>

***Remember to comment your code so you know what you have done and why!***

</div>

**Possible solution**

Add code that captures the **absolute file path** for the directory above **test_unit_test.py** and tell python to import the **mobydick** module from this directory

For example, add the following before trying to import **mobydick**

```python
"""
Additional code for Unit Testing Exercise 1
    - The code allows python to import modules from the mobydick module

Other possible solutions may include relative imports such as

>>> from ..mobydick import TextCorpus

but I have always had limited success with this approach
"""

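import os
import sys

# Absolute path of the directory above the one containing this test file
parentdir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
# Put it first on the module search path so that "import mobydick" works
sys.path.insert(0, parentdir)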
```

In [None]:
# Keep running this cell until your test is successful
! pytest test/test_unit_test.py

<div class="alert alert-block alert-info">

### Exercise 2: Test proves if code is broken

The test in the module **test_broken_code.py** fails, because there is a bug in the function **word_counter.average_word_length()**.

In the sentence

    Call me Ishmael

The words are **four, two,** and **seven** characters long. This gives an average of:

    >>> (4 + 2 + 7) / 3.0
    4.333333333333333

<br>

Fix the code in **test_broken_code.py**, and test it until the test passes.

<br>

*Note you will need to add code enabling **test_broken_code.py** to import the **mobydick** module*

</div>

**Solution**

```python
"""
The provided function in test_broken_code.py is correct
"""
class TestMobyDickBrokenCode:

    def test_average_word_length(self):
        """Calculate average word length in a short sentence"""
        text = TextCorpus("Call me Ishmael")
        assert round(text.average_word_length, 3) == 4.333
        
"""
But there is a bug in mobydick/word_counter.py
   - Refer to https://www.w3schools.com/python/ref_func_map.asp
   - In Python 3, map() returns an iterator with no len(), so we need to convert the map object into a list
"""
class TextCorpus:
    # ... other methods unchanged ...

    @property
    def average_word_length(self):
        """Returns the average word length as a float."""
        lengths = map(len, self.text.split())
        lengths = list(lengths)
        return sum(lengths) / len(lengths)
```
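
The reason for the list conversion can be reproduced in isolation. In Python 3, `map()` returns a lazy iterator that works with `sum()` but not with `len()`. A small sketch, independent of the course code:

```python
words = "Call me Ishmael".split()

lengths = map(len, words)         # a lazy iterator, not a list
try:
    len(lengths)
except TypeError as error:
    message = str(error)          # "object of type 'map' has no len()"

lengths = list(map(len, words))   # [4, 2, 7]
average = sum(lengths) / len(lengths)
print(round(average, 3))          # 4.333
```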

In [None]:
# Keep running this cell until your test is successful
! pytest test/test_broken_code.py

<div class="alert alert-block alert-info">

### Exercise 3: Code proves if tests are broken

The test in the module **test_broken_test.py** fails, because there is a bug in the test file.

Your task is to fix the test, so that the test passes. 

</div>

**Solution**

```python
"""
The words provided in the original test in test_broken_test.py do not match the test string
"""

class TestMobyDickBrokenTest:

    def test_words(self):
        """The word attribute is a list"""
        # words = ['my', 'name', 'is', 'ishmael']
        words = ['Call', 'me', 'Ishmael']
        text = TextCorpus('Call me Ishmael')
        assert text.words == words
```

In [None]:
# Keep running this cell until your test is successful
! pytest test/test_broken_test.py

<div class="alert alert-block alert-info">

### Exercise 4: Test border cases

High quality tests cover many different situations. The most common situations for the program **word_counter.py** include:

| test case | description | example input | expected output
|-----------|-------------|---------------|-----------------
| empty | input is valid, but empty | "" | 0
| minimal | smallest reasonable input | "whale" | 1
| typical | representative input | "whale eats captain" | 3
| invalid | input is supposed to fail | 777 | *Exception raised*
| maximum | largest reasonable input | *Melville's entire book* | *more than 200000*
| sanity | program recycles its own output | *TextBody A created from another TextBody B* | *A equals B*
| nasty | difficult example | "That #~&%* program still doesn't work!" | 6

<br>

Your task is to make all tests in **test_border_cases.py** pass.

</div>

**Solution**

```python
# Code changes in test_border_cases.py are shown below with the failing code commented out
class TestBorderCases:

    def test_empty(self):
        """Empty input works"""
        text = TextCorpus('')
        #assert text.n_words == _____
        assert text.n_words == 0


    def test_smallest(self):
        """Minimal string works."""
        text = TextCorpus("whale")
        #  _____ text.words == ['whale']
        assert text.words == ['whale']

    def test_typical(self):
        """Representative small input works."""
        text = TextCorpus("whale eats captain")
        # assert text.words == [_____, 'eats', 'captain']
        assert text.words == ['whale', 'eats', 'captain']

    def test_wrong_input(self):
        """Non-string doesn't work"""
        # with pytest.raises(_____) as e_info:
        with pytest.raises(TypeError) as e_info:
            TextCorpus(777)

    def test_biggest(self):
        """An entire book works."""
        #text = TextCorpus(open('mobydick_full.txt').read())
        text = TextCorpus(open(parentdir+'/test/mobydick_full.txt').read())
        #assert text._____ > 200000
        assert text.n_words > 200000


    def test_sanity(self):
        """Feed output of a class into itself"""
        # text = TextCorpus(open('mobydick_full.txt').read())
        text = TextCorpus(open(parentdir+'/test/mobydick_full.txt').read())
        words_before = list(text.words)
        copy = ' '.join(text.words)
        text_after = TextCorpus(copy)
        #assert words_before == _____
        assert words_before == list(text_after.words)


    def test_nasty(self):
        """Ugly data example works."""
        text = TextCorpus("""That #~&%* program still doesn't work!
    I already de-bugged it 3 times, and still numpy.array keeps throwing AttributeErrors.
    What should I do?""")
        # assert text.n_words == _____
        assert text.n_words == 22
```
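
The `test_wrong_input` case only passes if the constructor rejects non-string input. How `TextCorpus` does this is not shown in this notebook, so the sketch below uses a hypothetical stand-in class, `MiniCorpus`, to illustrate one way such a check and its test could look.

```python
import pytest


class MiniCorpus:
    """Hypothetical stand-in for TextCorpus, for illustration only."""

    def __init__(self, text):
        # Reject anything that is not a string before trying to split it
        if not isinstance(text, str):
            raise TypeError("text must be a string, got " + type(text).__name__)
        self.words = text.split()

    @property
    def n_words(self):
        return len(self.words)


def test_wrong_input():
    # The test passes only if the block raises TypeError
    with pytest.raises(TypeError):
        MiniCorpus(777)


def test_empty():
    # Valid but empty input gives zero words instead of an error
    assert MiniCorpus("").n_words == 0
```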


In [None]:
# Keep running this cell until your test is successful
! pytest test/test_border_cases.py

<br>
<br>
<a id="fixture_testing"></a>

# Test Fixtures
## What are Test Fixtures?
[Wikipedia - Test Fixtures](https://en.wikipedia.org/wiki/Test_fixture#Software)

[pytest fixtures](https://docs.pytest.org/en/latest/fixture.html)
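
In pytest, a fixture is a function decorated with `@pytest.fixture`, and a test requests it by naming it as a parameter. The sketch below is illustrative only: the file name and the `write_sample` helper are assumptions, while `tmp_path` is pytest's built-in fixture that provides a temporary directory. Code before `yield` prepares the test, code after it cleans up.

```python
import pytest

SAMPLE = "Call me Ishmael"


def write_sample(path):
    """Hypothetical helper: write the sample sentence to a file."""
    path.write_text(SAMPLE, encoding="utf-8")
    return path


@pytest.fixture
def sample_file(tmp_path):
    path = write_sample(tmp_path / "sample.txt")  # set-up
    yield path                                    # value handed to the test
    path.unlink()                                 # clean-up after the test


def test_sample_file_has_three_words(sample_file):
    words = sample_file.read_text(encoding="utf-8").split()
    assert len(words) == 3
```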

<div class="alert alert-block alert-info">


### Exercise 1: A module for test data

Create a new module **conftest.py** with a string variable that contains a sentence with lots of special characters:

    sample = """That #§&%$* program still doesn't work!
    I already de-bugged it 3 times, and still numpy.array keeps raising AttributeErrors. What should I do?"""

Create a function that returns a **mobydick.TextCorpus** object with the sample text above. Use the following as a header:

    @pytest.fixture
    def sample_corpus():
        ...

***Remember, you will need to import modules which will be used by your new module***

</div>

**Solution**

```python
"""
Test Fixtures Exercise 1
Example of conftest.py
"""

"""
Required at the top of every test script: the change to sys.path only lasts for the running Python process
"""
import os
import sys
parentdir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
sys.path.insert(0, parentdir)

from mobydick import TextCorpus, count_word

"""
Pytest will also need to be imported
"""
import pytest

sample = """That #§&%$* program still doesn't work! I already de-bugged it 3 times, and still numpy.array keeps raising
 AttributeErrors. What should I do?"""

@pytest.fixture
def sample_corpus():
    text = TextCorpus(sample)
    return text
```

<div class="alert alert-block alert-info">


### Exercise 2: Using the fixture

Now create a module **test_sample.py** with a function that uses the fixture:

    def test_sample_text(sample_corpus):
        assert sample_corpus.n_words == 22

Execute the module with `pytest`. Note that you **do not** need to import **conftest**. Pytest does that automatically.

</div>

**Solution**

```python

"""
Test Fixtures Exercise 2
Example of test_sample.py
"""

def test_sample_text(sample_corpus):
    assert sample_corpus.n_words == 22

```

In [None]:
# Keep running this cell until your test is successful
! pytest test/test_sample.py

<div class="alert alert-block alert-info">


### Exercise 3: Create more fixtures

Create fixtures in your **conftest.py** module for the two text corpora in the files **mobydick_full.txt** and **mobydick_summary.txt** as well.

<br>

*Note: consider the file paths for these text files*

</div>

**Solution**

```python

"""
To conftest.py add the following 2 functions that replace the sample corpus with the text from mobydick summary and full-text

- Note the absolute filepath must be provided within the file open statement
"""

@pytest.fixture
def full_text_corpus():
    text = TextCorpus(open(parentdir + '/test/mobydick_full.txt').read())
    return text

@pytest.fixture
def summary_text_corpus():
    text = TextCorpus(open(parentdir + '/test/mobydick_summary.txt').read())
    return text

```

<div class="alert alert-block alert-info">

### Exercise 4: Fixtures from fixtures

In the following section we will make use of the **count_word** function. Create a fixture in **conftest.py** that uses another fixture in **conftest.py**:

```python
from mobydick import count_word

@pytest.fixture
def counter(summary_text_corpus):
    return count_word(summary_text_corpus, 'whale')
```

Write a simple test in your **test_sample.py** module that makes sure the ***return*** from the fixture `is not None`.

<br>

Execute the module with `pytest`.
</div>

**Solution**

```python
"""
To conftest.py add the code from the notebook
"""
@pytest.fixture
def counter(summary_text_corpus):
    return count_word(summary_text_corpus, 'whale')

"""
To test_sample.py add a test to test the fixture created in conftest.py
"""
def test_counter(counter):
    assert counter is not None
    
```

In [None]:
# Keep running this cell until your test is successful
! pytest test/test_sample.py

<br>
<br>
<a id="parameterized_testing"></a>

# Parameterized Tests
## What are Parameterized Tests?
[Wikipedia - Unit Testing](https://en.wikipedia.org/wiki/Unit_testing)

[pytest parameterizing tests](https://docs.pytest.org/en/latest/example/parametrize.html)
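
A parameterized test runs the same test body once per set of example data. As a minimal sketch, assuming a hypothetical helper `count_in` in place of the course's `count_word`:

```python
import pytest

SENTENCE = "the white whale met the white captain"


def count_in(text, word):
    """Hypothetical helper: how often does word occur in text?"""
    return text.split().count(word)


EXAMPLES = [
    ("white", 2),
    ("whale", 1),
    ("goldfish", 0),
]


@pytest.mark.parametrize("word, expected", EXAMPLES)
def test_count_in(word, expected):
    # pytest generates one test per tuple in EXAMPLES
    assert count_in(SENTENCE, word) == expected
```

Running pytest with `-v` lists each generated case separately, so a failure points at the specific example that broke.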

<div class="alert alert-block alert-info">

### Exercise 1: Sets of example data

You have a list of pairs (word, count) that apply to the text file **mobydick_summary.txt**:

    PAIRS = [
             ('months', 1),
             ('whale', 3),
             ('captain', 2),
             ('white', 2),
             ('harpoon', 0),
             ('goldfish', 0)
    ]

We will create six tests from these samples.

Instead of creating six tests manually, we will use the **test parametrization in pytest**. Edit the file **test_parameterized.py** and add the following decorator to the test function:

```python
@pytest.mark.parametrize('word, number', PAIRS)
```

<br>

Add two arguments `word` and `number` to the function header and remove the assignments of `word` and `number` below.

<br>

Run the test and make sure all six tests pass. 

</div>

**Solution**

```python

"""
Amended code for test_parameterized.py
   - Changes to the original code are hashed out
"""
import pytest

# MOBYDICK_SUMMARY = open('mobydick_summary.txt').read()
MOBYDICK_SUMMARY = open(parentdir+'/test/mobydick_summary.txt').read()

PAIRS = [
    ('months', 1),
    ('whale', 3),
    ('captain', 2),
    ('white', 2),
    ('harpoon', 0),
    ('goldfish', 0)
    ]

@pytest.mark.parametrize('word, number', PAIRS)
def test_check_word(word, number):
    # word, number = PAIRS[0]
    text = TextCorpus(MOBYDICK_SUMMARY)
    assert count_word(text, word) == number

```

In [None]:
# Keep running this cell until your test is successful
! pytest test/test_parameterized.py

<div class="alert alert-block alert-info">

### Exercise 2: Write another parameterized test

The function **get_top_words()** calculates the most frequent words in a text corpus. It should produce the following top five results for the book **mobydick_full.txt**:

| position | word |
|----------|------|
| 1. | the   |
| 2. | of  |
| 3. | and   |
| 4. | a  |
| 5. | to |


<br>

Write one parameterized test that checks these five positions.

</div>

**Solution**

```python

"""
To test_parameterized.py add
"""

RANKS = [
    ('the', 1),
    ('of', 2),
    ('and', 3),
    ('a', 4),
    ('to', 5)
    ]

MOBYDICK_FULL = open(parentdir+'/test/mobydick_full.txt').read()

@pytest.mark.parametrize('new_word, new_number', RANKS)
def test_top_words(new_word, new_number):
    # Named differently from test_check_word, which would otherwise be overwritten in the same module
    text = TextCorpus(MOBYDICK_FULL)
    top_words = get_top_words(text, 5)
    assert top_words[new_number-1][1] == new_word

```

In [None]:
# Keep running this cell until your test is successful
! pytest test/test_parameterized.py

<br>
<br>
<a id="command_testing"></a>

# Testing Command-Line Programs

<div class="alert alert-block alert-info">

### Exercise 1: Test a command-line application
The program **word_counter.py** can be used from the command line to calculate the most frequent words, *e.g.* the top five words:

```bash
! python mobydick/word_counter.py test/mobydick_summary.txt 5
```

**User Acceptance**

The ultimate test for any software is whether your users are able to do what they need to get done.

<br>

Your task is to *manually* use the program **word_counter.py** to find out whether Melville used *'whale'* or *'captain'* more frequently in the full text of the book *"Moby Dick"*.

<br>

*Hint: use the grep command-line tool*

<br>

*The User Acceptance test cannot be replaced by a machine.*

</div>

In [8]:
# Solution
# Check how many times whale is used
! python mobydick/word_counter.py test/mobydick_full.txt 100 | grep whale

whale 392


In [1]:
# Solution
# Check how many times captain is used
! python mobydick/word_counter.py test/mobydick_full.txt 100 | grep captain

In [None]:
# Check how many times whale is used


In [None]:
# Check how many times captain is used


<br>
<br>
<a id="test_suites"></a>

# Test Suites

## What are Test Suites?
[Wikipedia Test Suite](https://en.wikipedia.org/wiki/Test_suite)

<div class="alert alert-block alert-info">

### Exercise 1: Test collection

Run all tests written so far by simply typing

```bash 
pytest 
```

*Note: We have not yet looked in every one of our test scripts, so expect test failures!*

</div>

In [None]:
# Run pytest
! pytest

<div class="alert alert-block alert-info">

### Exercise 2: Options

Try some options of pytest:

```bash
pytest -v  # verbose output

pytest --lf # re-run only the tests that failed in the last run

pytest -x  # stop on first failing test
```

</div>

In [None]:
# pytest -v


In [None]:
# pytest --lf


In [None]:
# pytest -x


<div class="alert alert-block alert-info">

### Exercise 3: Fixing tests

Fix the tests in **test_suite.py**

<br>

***To break this task down, try skipping to exercise 4 so you can isolate and fix tests individually***

</div>

**Solution**

```python

"""
Amendments to test_suite.py
   - Code edits are hashed
"""

import pytest
from mobydick import TextCorpus


# MOBYDICK_SUMMARY = open('mobydick_summary.txt').read()
MOBYDICK_SUMMARY = open(parentdir+'/test/mobydick_summary.txt').read()

#class AverageWordLength:
class TestAverageWordLength:

    """Tests for word_counter module."""

    def test_word_number(self):
        """Count words in a short sentence"""
        text = TextCorpus("Call me Ishmael")
        assert text.n_words == 3

    def test_average_words(self):
        """Simple average length."""
        text = TextCorpus("white whale")
        # assert text.get_average_word_length() == 5
        assert text.average_word_length == 5

    # Note: the name starts with 'tesl_', so pytest does not collect this function
    def tesl_average_words_complex(self):
        """Complex average length."""
        text = TextCorpus(MOBYDICK_SUMMARY)
        self.assertAlmostEqual(text.get_average_word_length(), 4.0, 3)

    def test_average_empty(self):
        """Tests behaviour when input is an empty string."""
        text = TextCorpus("")
        # assert text.get_average_word_length() == 0
        with pytest.raises(ZeroDivisionError):
            assert text.average_word_length == 0

```

In [None]:
# Keep running this cell until your test is successful
! pytest test/test_suite.py

<div class="alert alert-block alert-info">

### Exercise 4: Test selection

Run only one test class

    pytest test_suite.py::TestAverageWordLength
    
</div>

In [None]:
! pytest test/test_suite.py::TestAverageWordLength

<div class="alert alert-block alert-info">

or a single test function:

    pytest test_suite.py::TestAverageWordLength::test_average_words

</div>

In [None]:
! pytest test/test_suite.py::TestAverageWordLength::test_average_words

<div class="alert alert-block alert-info">
    
### Exercise 5:

Your next task is to run only the function **test_another.test_count_word_simple** from the test suite in **test/** and to fix the test!

</div>

In [None]:
# Solution
! pytest test/test_another.py::test_count_word_simple

**Solution**

```python

"""
Amendments to test_another.py
 - Altered code is hashed out
 - As always, you will need to begin by telling python where to find the mobydick module
 - You will also need to provide the absolute paths for text files
"""

# from mobydick.word_counter import TextBody, WordCounter
from mobydick.word_counter import TextCorpus as TextBody, count_word as WordCounter 
from nose.tools import assert_equal

# MOBYDICK_SUMMARY = open('mobydick_summary.txt').read()
MOBYDICK_SUMMARY = open(parentdir+'/test/mobydick_summary.txt').read()

def test_count_word_simple():
    """Count word in a short text"""
    text = TextBody("the white white whale")
    # counter = WordCounter(text)
    counter = WordCounter(text, "white")
    # assert_equal(counter.count_word("white"), 2)
    assert_equal(counter, 2)
```

In [None]:
# Your answer here


<br>
<br>
<a id="test_coverage"></a>

# Test Coverage

For the next exercises, we need to install a small plugin:

***Run the cell below***

In [None]:
! pip install pytest-cov

<div class="alert alert-block alert-info">

### Exercise 1: Calculate Test Coverage

Calculate the percentage of code covered by automatic tests:

</div>

In [None]:
! pytest --cov

<div class="alert alert-block alert-info">

### Exercise 2: Identify uncovered lines
Find out which lines are not covered by tests. Execute

</div>

In [2]:
! coverage html

<div class="alert alert-block alert-info">

And open the resulting **htmlcov/index.html** in a web browser.

<br>

*Note, the* ***index.html*** *file will be located in the directory* ***htmlcov*** *, and you will need to open your web browser and select `File > Open File` (or similar)*

### Exercise 3: Increase test coverage

Bring test coverage of **mobydick/word_counter.py** to 100%

*Hint: is there a file in htmlcov that may show you the coverage of word_counter.py?*

<br>

**Method:** 
1. Write a new test in **test_another.py** that determines whether the second word of the return is "python"
2. Test the function using `pytest test/test_another.py::test_sys_count_words`
3. Once the test passes, re-run the cells above to re-create the coverage html files

*You will need to look at the Python subprocess module (recommended usage: `subprocess.check_output`) and also to determine how to convert a bytes object into a string*

</div>

**Solution**

```python

"""
Additional function added to test_another.py
 - The code allows python to make a system call to the command line function in word_counter.py
"""

def test_sys_count_words():
    """Attempt to test the `if __name__ == '__main__'` statement in word_counter.py"""
    import subprocess
    ret = subprocess.check_output(["python", "mobydick/word_counter.py", "test/mobydick_full.txt", "10"], cwd = parentdir)
    ret2 = ret.decode()
    assert ret2.split()[1]=="python"

 ```
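
`subprocess.check_output` returns `bytes`, which is why the solution calls `.decode()`. As an alternative sketch, `subprocess.run` with `text=True` returns a string directly, and `sys.executable` points at the interpreter running the tests rather than whichever `python` comes first on the PATH. The inline `-c` script below is a stand-in for `word_counter.py` so that the example is self-contained.

```python
import subprocess
import sys

# Stand-in for the real command-line program: prints one "word count" line
script = "print('whale 3')"

result = subprocess.run(
    [sys.executable, "-c", script],
    capture_output=True,   # collect stdout and stderr
    text=True,             # decode bytes to str
    check=True,            # raise CalledProcessError on a non-zero exit code
)

word, count = result.stdout.split()
print(word, int(count))
```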

In [None]:
# Keep running this cell until your test is successful then re-run the cells above to achieve 100% code coverage
! pytest test/test_another.py::test_sys_count_words

<br>
<br>
<a id="new_features"></a>

# Testing New Features

<div class="alert alert-block alert-info">

### Exercise 1: Add new feature: special characters
Add a new feature to the **word_counter.py** program. The program should remove special characters from the text before counting words.

The test string is

`"That #§&%$* program still doesn't work!\nI already de-bugged it 3 times, and still numpy.array keeps raising AttributeErrors.\tWhat should I do?"`

<br>

Your task is to prove that the new feature is working by adding a new test to `test_another.py` and running the following until the test is successful

</div>

**Solution**

```python

"""
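Additional function added to word_counter.py
"""
def special_character_handling(corpus):
    import re
    # Replace each run of non-word characters with a single space (the raw string avoids an invalid escape warning)
    corpus.text = re.sub(r'\W+', ' ', corpus.text)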
    number = corpus.n_words
    return number


"""
Additional test added to test_another.py
 - Remember, a new function has been created in word_counter.py so don't forget to import it!
"""
def test_special_character_handling():
    """Test a method for handling special characters"""
    text = TextBody("That #§&%$* program still doesn't work!\nI already de-bugged it 3 times, and still numpy.array keeps" \
                  " raising AttributeErrors.\tWhat should I do?")
    counter = special_character_handling(text)
    assert counter == 24

```

In [None]:
# Keep running this cell until your test is successful
! pytest test/test_another.py

<div class="alert alert-block alert-info">

### Exercise 2: Add new feature: ignore case
Add a new feature to the **word_counter.py** program. The program should ignore the case of words, e.g. *'captain'* and *'Captain'* should be counted as the same word.

Your task is to prove that the new feature is working by adding an additional test to **test_another.py** and running the following until the test is successful

</div>

**Solution**

```python

"""
Additional function added to word_counter.py
"""
def case_insensitive_comparison(corpus1, corpus2):
    # Compare the two texts after converting both to lower case
    return corpus1.text.lower() == corpus2.text.lower()


"""
Additional test added to test_another.py
 - Remember, a new function has been created in word_counter.py so don't forget to import it!
"""
def test_case_insensitive_comparison():
    """Test a method for case-insensitive comparison"""
    text1 = TextBody("Captain")
    text2 = TextBody("captain")
    assert case_insensitive_comparison(text1, text2) is True
    
```
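
The solution above compares two whole texts. If the goal is to count *'captain'* and *'Captain'* as the same word, one possible sketch is to lowercase every word before counting. The function name and the use of `collections.Counter` are assumptions for illustration, not the course's implementation.

```python
from collections import Counter


def count_words_ignore_case(text):
    """Count words after converting them to lower case."""
    return Counter(word.lower() for word in text.split())


counts = count_words_ignore_case("Captain Ahab was the captain")
print(counts["captain"])  # 2
print(counts["ahab"])     # 1
```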

In [None]:
# Keep running this cell until your test is successful
! pytest test/test_another.py

<div class="alert alert-block alert-info">

### Exercise 3: Add new feature: word separators
The program **word_counter.py** does separate words at spaces, but does it separate words at tabulators?

The following sentence should also contain **four** words:

    The\tprogram\tworks\tperfectly.

<br>

Your task is to determine whether the text is being split at tabulators by adding an additional function to **word_counter.py** and a test to **test_another.py**, and running the following until the test is successful

</div>

**Solution**

```python

"""
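Additional function added to word_counter.py
"""
def word_separators(corpus):
    # Word count from the default splitting
    nw1 = corpus.n_words
    # Word count after special_character_handling has replaced non-word characters (tabs included) with spaces
    nw2 = special_character_handling(corpus)
    return [nw1 == nw2, nw1]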


"""
Additional test added to test_another.py
 - Remember, a new function has been created in word_counter.py so don't forget to import it!
"""
def test_word_separators():
    """Test whether text is split at tab characters"""
    text = TextBody("The\tprogram\tworks\tperfectly.")
    elements = word_separators(text)
    assert elements[0] is True and elements[1] == 4
    
```
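
Whether tabs separate words depends on how the text is split. `str.split()` without arguments splits on any run of whitespace, including tabs and newlines, whereas `str.split(' ')` splits on single spaces only. A quick check, independent of the course code:

```python
sentence = "The\tprogram\tworks\tperfectly."

any_whitespace = sentence.split()      # splits at tabs, spaces and newlines
spaces_only = sentence.split(" ")      # the tabs stay inside one long chunk

print(len(any_whitespace))  # 4
print(len(spaces_only))     # 1
```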

In [None]:
# Keep running this cell until your test is successful
! pytest test/test_another.py

<br>
<br>
<a id="recap"></a>

# Recap Puzzle

<div class="alert alert-block alert-info">

The rows in the table got messed up! 
Match the test strategies with the correct descriptions.

| test strategy | description |
|---------------|-------------|
| 1. Unit Test | a. files and examples that help with testing |
| 2. Acceptance Test  | b. collection of tests for a software package |
| 3. Mock | c. relative amount of code tested |
| 4. Fixture | d. tests a single module, class or function |
| 5. Test suite | e. prepare tests and clean up afterwards |
| 6. Test data | f. replaces a complex object to make testing simpler |
| 7. Test coverage | g. tests functionality from the user's point of view |

</div>


| test strategy | description |
|---------------|-------------|
| 1. Unit Test | d. tests a single module, class or function |
| 2. Acceptance Test  | g. tests functionality from the user's point of view |
| 3. Mock | f. replaces a complex object to make testing simpler |
| 4. Fixture | e. prepare tests and clean up afterwards |
| 5. Test suite | b. collection of tests for a software package | 
| 6. Test data | a. files and examples that help with testing | 
| 7. Test coverage | c. relative amount of code tested |

**Your answers here**


