---
title: Production code: unit testing, refactoring, and PEP-8
duration: "1:25"
creator:
    name: Winston Featherly-Bean
    city: NY
---

![](https://ga-dash.s3.amazonaws.com/production/assets/logo-9f88ae6c9c3871690e33280fcf557f33.png)
# Production code: unit testing, refactoring, and PEP-8
Week 9 | Lesson 3.1


### LEARNING OBJECTIVES
*After this lesson, you will be able to:*
- Write unit tests and run them as a suite
- Identify examples of code that could be refactored

### Software engineer: You didn't check your code and your tests into master without a code review, did you?


### Data Scientist: Checked my what into what without a what?

#### - Software development skills for data scientists (http://treycausey.com/software_dev_skills.html)

### LESSON GUIDE
| TIMING  | TYPE  | TOPIC  |
|:-:|---|---|
| 5 min | [Opening](#opening) | Opening |
| 50 min | [Unit testing](#unit-testing) | Unit testing |
| 25 min | [Refactoring](#refactoring) | Refactoring |
| 5 min | [Conclusion](#conclusion) | Conclusion |

## Unit testing

Testing for bugs is ubiquitous within software development. There are structured frameworks for doing this.

_Unit testing_ is testing the most granular components of your code, e.g. specific functions, to look for syntax, logic and execution errors.

If your job is data analysis, rather than building data products, you'll probably get away without formal testing. But it's still a good idea. It will sharpen your code, ease collaboration, and make _refactoring_ less fretful.   

There are several frameworks for unit testing in Python. We'll use `pytest` today:

```bash
pip install -U pytest
```

`pytest` is popular because it simplifies the code required to write and run tests. But you should also get familiar with the base [unittest/PyUnit library](https://docs.python.org/2/library/unittest.html).

Your initial tests can be written based on your program specifications: what are its functions supposed to _do_? Let's say we have these trivial functions:

```python

def rectangle_area(w,h):
    return w*h

def strip_stopwords(phrase, stopwords):
    phrase = phrase.split()
    phrase = [w for w in phrase if w not in stopwords]
    phrase = ' '.join(phrase)
    return phrase
```

If these were in a file (module) named `examples.py`, then a pytest file named `test_script.py` in the same directory could look like:

```python
import pytest
import examples

def test_area_calculation():
    assert examples.rectangle_area(10,2) == 20
    
def test_stopwords():
    sentence = "the quick brown fox jumped over the lazy dog"
    stopwords = ['the', 'an', 'a', 'of', 'to']
    assert examples.strip_stopwords(sentence, stopwords) == 'quick brown fox jumped over lazy dog'
```

We could `assert` any Boolean condition, e.g.:
```python
assert examples.strip_stopwords(sentence, stopwords) != 'the quick brown fox jumped over the lazy dog'
```

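It is also important to consider what your code should _not_ do, i.e., when should it fail and what exceptions should it raise? Our area calculation function should only work with numeric types. Note that `5 * 'testing'` is legal Python (it repeats the string), so the check below passes two strings, which `w*h` cannot multiply:

```python
def test_area_type_handling():
    # str * str is not defined, so Python raises TypeError
    with pytest.raises(TypeError):
        examples.rectangle_area('5', 'testing')
```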
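
If we want `rectangle_area` to reject a number paired with a string as well (Python happily evaluates `5 * 'testing'` as string repetition), the function has to check its arguments itself. A minimal sketch, assuming any real number is acceptable and everything else is not; the `numbers.Real` check and the error message are illustrative choices, not part of the lesson's `examples.py`:

```python
from numbers import Real


def rectangle_area(w, h):
    """Return the area of a w-by-h rectangle; reject non-numeric input."""
    for value in (w, h):
        # Assumption: any real number is fine, anything else is an error.
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, Real):
            raise TypeError("width and height must be numbers")
    return w * h


print(rectangle_area(10, 2))

try:
    rectangle_area(5, 'testing')
except TypeError as err:
    print("rejected:", err)
```

With this version, a `pytest.raises(TypeError)` test passes for any mix of numeric and non-numeric arguments.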

`assert` and `with pytest.raises(___Error): ....` are two workhorse commands.
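
For comparison, here is roughly how the same checks might look in the standard library's `unittest`. This is a sketch rather than a recommended layout: the two example functions are repeated inline so the snippet runs on its own, and the class name `ExampleTests` is our own choice.

```python
import unittest


def rectangle_area(w, h):
    return w * h


def strip_stopwords(phrase, stopwords):
    words = [w for w in phrase.split() if w not in stopwords]
    return ' '.join(words)


class ExampleTests(unittest.TestCase):
    def test_area_calculation(self):
        self.assertEqual(rectangle_area(10, 2), 20)

    def test_stopwords(self):
        sentence = "the quick brown fox jumped over the lazy dog"
        stopwords = ['the', 'an', 'a', 'of', 'to']
        self.assertEqual(strip_stopwords(sentence, stopwords),
                         'quick brown fox jumped over lazy dog')

    def test_area_type_handling(self):
        # Two strings cannot be multiplied, so TypeError is expected
        with self.assertRaises(TypeError):
            rectangle_area('5', 'testing')


# Build and run the suite explicitly so this works in a script or a notebook
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTests)
result = unittest.TextTestRunner(verbosity=2).run(suite)
```

The extra class and the `self.assert...` methods are the boilerplate that `pytest` lets you skip.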

### Guided practice: running a test suite

A basic unit testing battery requires a couple of things:

- A script with your test functions, each of which has a name starting with "test"
- The module you want to test (for simplicity, in the same directory)

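`pytest` will automatically detect and run your tests for you. Let's try it! The bash command is below (`-v` is optional and prints one line per test; older installations name the command `py.test`):

```bash
pytest [-v] test_script.py
```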

Often you will need to test methods of classes - and for this you may need to instantiate the class with specific values. Take yesterday's Game of War code, for example:

```python
class War():
    def __init__(self):

        self.newDeck = Deck()
        self.newDeck.shuffle()
        
        self.handOne = Hand()
        self.handTwo = Hand()
        self.tableau = Hand()
        return

        
    def deal(self):
        while len(self.newDeck.cards) > 0:
            self.handOne.add_card(self.newDeck.draw_card()) # add one card to handOne
            self.handTwo.add_card(self.newDeck.draw_card())
        return
```

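Let's say we want to make sure that `deal` leaves handOne and handTwo with an equal number of cards. Our initialization (right now) ensures that they do. So how could we make sure our test will work when it needs to?

"Fixtures": functions that prepare the objects a test needs. pytest passes a fixture's return value to any test that lists the fixture's name as a parameter. They're a little involved, but the basic syntax is:

```python
import pytest
import cardgames  # the module that holds War, Card, etc.

@pytest.fixture()
def setup():
    # Start the game with one extra card already in handOne
    w = cardgames.War()
    w.handOne.add_card(cardgames.Card())
    return w

def test_equal_hands(setup):
    # pytest calls setup() and passes in the War instance it returns
    setup.deal()
    assert len(setup.handOne.cards) == len(setup.handTwo.cards)
```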
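
To see what the fixture buys us, it helps to trace the scenario by hand. A minimal sketch with simplified stand-ins for the card classes (these `Hand`, `Deck` and `War` stubs and the `make_uneven_game` helper are assumptions for illustration, not the lesson's `cardgames` module):

```python
class Hand:
    """Simplified stand-in: just a list of cards."""
    def __init__(self):
        self.cards = []

    def add_card(self, card):
        self.cards.append(card)

    def draw_card(self):
        return self.cards.pop()


class Deck(Hand):
    """Simplified stand-in: 52 integers instead of Card objects."""
    def __init__(self):
        self.cards = list(range(52))


class War:
    def __init__(self):
        self.newDeck = Deck()
        self.handOne = Hand()
        self.handTwo = Hand()

    def deal(self):
        # Same loop as the lesson's deal(): alternate until the deck is empty
        while len(self.newDeck.cards) > 0:
            self.handOne.add_card(self.newDeck.draw_card())
            self.handTwo.add_card(self.newDeck.draw_card())


def make_uneven_game():
    # Plays the role of the fixture: one extra card in handOne before dealing
    w = War()
    w.handOne.add_card("extra card")
    return w


game = make_uneven_game()
game.deal()
hand_sizes = (len(game.handOne.cards), len(game.handTwo.cards))
print(hand_sizes)  # (27, 26)
```

`deal` hands out cards evenly but never checks what the hands already hold, so the hands end up unequal and an equal-hands test fails. That failure is the point: it shows the test can catch a problem that the default initialization would hide.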


### Independent practice: writing test functions

Let's take a step toward production! Add at least one more unit test for the trivial example functions, and two more unit tests for the Game of War solution code (or your version). 

```python
import random
import logging

logging.basicConfig(filename='cards.log', filemode='w',
                    format='%(asctime)s %(levelname)s:%(message)s',
                    level=logging.DEBUG)

log = logging.getLogger(__name__)


class War():
    """ INCOMPLETE example -- how can you build on this?"""
    def __init__(self):
        # refactoring: could reuse newDeck as tableau?
        self.newDeck = Deck()
        self.newDeck.shuffle()

        self.handOne = Hand()
        self.handTwo = Hand()
        self.tableau = Hand()
        log.debug('Created newDeck, handOne, handTwo and tableau instances')
        return

    def deal(self):
        while len(self.newDeck.cards) > 0:
            self.handOne.add_card(self.newDeck.draw_card())  # add one card to handOne
            self.handTwo.add_card(self.newDeck.draw_card())
            log.debug('Added to hands, {} left in deck'.format(len(self.newDeck.cards)))
        return

    def turn(self):
        # each hand draws its top card
        self.cardOne = self.handOne.cards.pop()
        self.cardTwo = self.handTwo.cards.pop()

        self.tableau.add_card(self.cardOne)
        self.tableau.add_card(self.cardTwo)
        log.debug('Drew two cards: {0} and {1}'.format(str(self.cardOne), str(self.cardTwo)))

        # check if the cards are equal
        # if it's true, then go to war
        # if it's false then see which is greater

        if self.cardOne.is_equal(self.cardTwo):
            # the code for war
            # we have to draw three top cards from each hand
            # then draw the fourth top card from each and compare
            # then either repeat or add the tableau randomly to the winner's hand
            # highly imperative style rather than functional
            self.message = "war!"
            for i in range(3):
                self.tableau.add_card(self.handOne.cards.pop())
                self.tableau.add_card(self.handTwo.cards.pop())

            # Call turn again, this time with more cards in tableau
            log.info("War! There are {} cards in tableau".format(len(self.tableau.cards)))
            self.turn()

        else:
            if self.cardOne.greater_than(self.cardTwo):
                # giving the cards to hand one
                self.message = "player 1 wins round"
                log.info('player 1 wins round and adds {} cards to their hand of {}'.format(
                    len(self.tableau.cards), len(self.handOne.cards)))

                # add self.cardOne and self.cardTwo into self.handOne.cards
                # they must go onto the bottom of self.handOne.cards
                # and they must go onto the bottom in random order
                while len(self.tableau.cards) > 0:
                    self.handOne.cards.insert(0, self.tableau.draw_card())

                log.debug('player 1 has {} cards'.format(len(self.handOne.cards)))

            else:
                # give the cards to hand two
                self.message = "player 2 wins round"
                log.info('player 2 wins round and adds {} cards to their hand of {}'.format(
                    len(self.tableau.cards), len(self.handTwo.cards)))

                while len(self.tableau.cards) > 0:
                    self.handTwo.cards.insert(0, self.tableau.draw_card())

                log.debug('player 2 has {} cards'.format(len(self.handTwo.cards)))

        return self.message

    # def play_game(self):


class Card():
    '''A standard playing card'''

    def __init__(self, suit=0, rank=2):
        self.suit = suit
        self.rank = rank

    suit_names = ['Clubs', 'Diamonds', 'Hearts', 'Spades']
    rank_names = [None, 'Ace', '2', '3', '4', '5', '6', '7', '8',
                  '9', '10', 'Jack', 'Queen', 'King']

    def __str__(self):
        return "%s of %s" % (Card.rank_names[self.rank],
                             Card.suit_names[self.suit])

    def greater_than(self, other):
        # YOUR CODE HERE
        if other.rank == 1:
            return False
        else:
            return self.rank > other.rank

    def is_equal(self, other):
        if self.rank == other.rank:
            return True
        else:
            return False


class Deck():
    '''52 unique cards. No jokers.'''

    def __init__(self):
        self.cards = []
        for suit in range(4):
            for rank in range(1, 14):
                card = Card(suit, rank)
                self.cards.append(card)

    def __str__(self):
        results = []
        for card in self.cards:
            results.append(str(card))
        return '\n'.join(results)

    def draw_card(self):
        '''Draws a random card'''
        c = random.choice(self.cards)
        self.cards.remove(c)
        return c

    def add_card(self, card):
        '''Puts a card object back in the deck'''
        self.cards.append(card)

    def shuffle(self):
        '''Shuffles the deck'''
        random.shuffle(self.cards)

    def sort(self):
        '''Sorts the deck'''
        self.cards.sort()


class Hand(Deck):
    '''Empty for now'''
    def __init__(self):
        self.cards = []


if __name__ == "__main__":
    w = War()
    w.deal()
    counter = 0

    # sometimes this fails -- examine the log to see why!
    while (w.handOne.cards and w.handTwo.cards):
        counter += 1
        w.turn()
        if len(w.handOne.cards) + len(w.handTwo.cards) > 52:
            log.warning("Created more than 52 cards.")
        if counter > 10000:
            log.warning('possible infinite loop, breaking execution')
            break
    if w.handOne.cards:
        print('player 1 won!')
    else:
        print('player 2 won!')
    print(len(w.handOne.cards), len(w.handTwo.cards))
```

## Refactoring

_Refactoring_ means restructuring existing code to improve it without changing what it does. The general axes of improvement are:

- Efficiency
- Readability
- Extensibility

Some easy wins:

- Don't Repeat Yourself (DRY)
- Use helpful names
- Comment your code!
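
As a small illustration of the last two points, here is a hypothetical before-and-after (both functions are invented for this example, not taken from the lesson code):

```python
# Before: the names say nothing about what is being computed
def f(a, b):
    return a * b / 2


# After: same arithmetic, but the names and docstring carry the meaning
def triangle_area(base, height):
    """Return the area of a triangle from its base and height."""
    return base * height / 2


print(f(6, 4), triangle_area(6, 4))
```

Both calls return the same number; only the second tells a reader why.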

You can also improve your code's readability, and your own credibility, by following a community standard stylistic convention. The most popular is [PEP-8](https://www.python.org/dev/peps/pep-0008/).

Please take a few minutes to skim the documentation.

"Linters" are tools for checking your code for errors. There are style linters available, as standalone programs or integrations to IDEs / text editors.

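We'll use an easy one, `pep8` (the tool has since been renamed `pycodestyle`; if the `pep8` package is unavailable, install and run `pycodestyle` instead):

```bash
$ pip install -U pep8
$ pep8 tictactoe.py
```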
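
To give a feel for what a style checker looks for, here are the two trivial functions from the unit testing section laid out along PEP-8 lines: a space after each comma, two blank lines between top-level functions, and a docstring for each. This is a sketch; the docstring wording and the `words` variable are our own additions.

```python
def rectangle_area(w, h):
    """Return the area of a rectangle with width w and height h."""
    return w * h


def strip_stopwords(phrase, stopwords):
    """Return phrase with every word found in stopwords removed."""
    words = [word for word in phrase.split() if word not in stopwords]
    return ' '.join(words)


print(strip_stopwords("the quick brown fox", ['the', 'a']))
```

The behavior is unchanged, so the unit tests written earlier should still pass against this version.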

Let's look at a few refactoring examples together:

```python
if isSpecialDeal():
    total = price * 0.95
    send()
else:
    total = price * 0.98
    send()
```

> Check: what notion is this violating? How can we improve it?


(Examples from https://github.com/shvetsgroup/refactoring.guru-examples/tree/master/simple/python)

Don't repeat yourself:

```python
if isSpecialDeal():
    total = price * 0.95
else:
    total = price * 0.98
send()
```

How about this one?

```python
def output(self, name):
    if name == "banner":
        # Print the banner.
        # ...
        pass
    if name == "info":
        # Print the info.
        # ...
        pass
```



Make it easier to adjust what happens in each case:

```python
def outputBanner(self):
    # Print the banner.
    # ...
    pass

def outputInfo(self):
    # Print the info.
    # ...
    pass
```

And here?

```python
def foundPerson(people):
    for i in range(len(people)):
        if people[i] == "Alice":
            return "Alice"
        if people[i] == "John":
            return "John"
        if people[i] == "Kent":
            return "Kent"
    return ""
```

That code wasn't very Pythonic, and writing every name twice doubles the risk of typos.

```python
def foundPerson(people):
    candidates = ["Alice", "John", "Kent"]
    # Iterate over the items directly instead of indexing with range(len(...))
    for person in people:
        if person in candidates:
            return person
    return ""
```
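
Unit tests are what make a refactoring like this less fretful: if the old and new versions agree on the same inputs, the behavior has been preserved. A minimal sketch, assuming we keep both versions side by side under the hypothetical names `found_person_old` and `found_person_new` (the sample inputs are made up for illustration):

```python
def found_person_old(people):
    for i in range(len(people)):
        if people[i] == "Alice":
            return "Alice"
        if people[i] == "John":
            return "John"
        if people[i] == "Kent":
            return "Kent"
    return ""


def found_person_new(people):
    candidates = ["Alice", "John", "Kent"]
    for person in people:
        if person in candidates:
            return person
    return ""


def test_refactoring_preserves_behavior():
    cases = [
        ["Bob", "Kent", "Alice"],  # first match is not first in candidates
        ["Zed", "Yan"],            # no match at all
        [],                        # empty input
        ["John"],
    ]
    for people in cases:
        assert found_person_old(people) == found_person_new(people)


# pytest would discover and run the test itself; it is called directly
# here so the snippet also runs as a plain script
test_refactoring_preserves_behavior()
```

Once the comparison passes, the old version can be deleted and the test rewritten to assert the expected names directly.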

## Additional resources

Software development

- http://treycausey.com/software_dev_skills.html
- http://12factor.net/

Logging

- http://victorlin.me/posts/2012/08/26/good-logging-practice-in-python
- http://www.blog.pythonlibrary.org/2012/08/02/python-101-an-intro-to-logging/

(Unit) testing

- http://docs.python-guide.org/en/latest/writing/tests/
- http://stackoverflow.com/questions/4904096/whats-the-difference-between-unit-functional-acceptance-and-integration-test
