# 11) Modules and Packages <a class="tocSkip">

In this notebook we will write our own modules and learn how to use others from Python's standard library and other sources. To let Python applications scale, you can organize modules into file and module hierarchies called packages. A package is a directory that contains .py files.

Suppose we have a local module with the same name as a standard one; how do we choose the correct one? Python supports both absolute and relative imports. If you type import name, then for each directory in the search path (stored in sys.path), Python looks for a file named name.py (a module) or a directory named name (a package). Relative imports, which only work from code that lives inside a package, state the location explicitly:

- If name.py is in the same directory as your calling program, you can import it relative to your location with from . import name

- If it is in the directory above you use from .. import name

- If it is under a sibling directory called name_sub, use from ..name_sub import name

The . and .. notation is borrowed from Unix's shorthand for the current directory and the parent directory.
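The three relative forms above can be tried out with a small sketch. It writes a throwaway package to a temporary directory; the names demo_pkg, same_dir, child, parent_dir, sibling_dir and the WHERE attribute are all hypothetical and chosen only for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package layout, written to a temporary directory.
files = {
    "demo_pkg/__init__.py": "",
    "demo_pkg/name.py": "WHERE = 'demo_pkg/name.py'\n",
    "demo_pkg/same_dir.py": "from . import name\n",
    "demo_pkg/name_sub/__init__.py": "",
    "demo_pkg/name_sub/name.py": "WHERE = 'demo_pkg/name_sub/name.py'\n",
    "demo_pkg/child/__init__.py": "",
    "demo_pkg/child/parent_dir.py": "from .. import name\n",
    "demo_pkg/child/sibling_dir.py": "from ..name_sub import name\n",
}
root = Path(tempfile.mkdtemp())
for relative_path, source in files.items():
    target = root / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(source)

# Make the temporary directory importable.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Import each caller and record which name.py its relative import found.
found = {}
for module_name in ("same_dir", "child.parent_dir", "child.sibling_dir"):
    module = importlib.import_module("demo_pkg." + module_name)
    found[module_name] = module.name.WHERE
    print(module_name, "->", module.name.WHERE)
```

The first two callers should both report demo_pkg/name.py, while the third reports the copy under name_sub.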

You can also split a package across directories with namespace packages. Say we want to create a package called $\textit{teams}$ that will contain a Python module for each cricket team. This might get large over time, and we may want to subdivide the modules by geographic location. One option is to add location subpackages under $\textit{teams}$ and move the existing .py module files under them, but this would break things for other modules that import them. Instead, we can go up a directory and do the following:

- Make new location directories above $\textit{teams}$.

- Make cousin $\textit{teams}$ under these new parents.

- Move existing modules to their respective directories.

Say we started with the following file layout:

+-- teams <br>
|&emsp;+-- surrey.py <br>
|&emsp;+-- glamorgan.py <br>
|&emsp;+-- queensland.py <br>

Normal imports of these modules would look like:

    from teams import surrey, glamorgan, queensland

Now if we used geographical locations, the files and directories would look like:

+-- england <br>
|&emsp;+-- teams <br>
|&emsp;|&emsp;+-- surrey.py <br>
|&emsp;|&emsp;+-- glamorgan.py <br>

+-- australia <br>
|&emsp;+-- teams <br>
|&emsp;|&emsp;+-- queensland.py <br>
    
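You can import the modules as though they were still cohabiting a single directory, provided that both england and australia are on the module search path and that neither teams directory contains an __init__.py file: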

    from teams import surrey, glamorgan, queensland
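As a rough sketch of how this behaves, the code below recreates the split layout in a temporary directory and imports all three modules through the single teams namespace. The NAME attribute inside each module is a hypothetical placeholder, and adding the two parent directories to sys.path by hand is an assumption about how the search path is set up.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Recreate the split layout in a throwaway directory.
# NAME is a hypothetical attribute used only to tell the modules apart.
layout = {
    "england/teams/surrey.py": "NAME = 'Surrey'\n",
    "england/teams/glamorgan.py": "NAME = 'Glamorgan'\n",
    "australia/teams/queensland.py": "NAME = 'Queensland'\n",
}
root = Path(tempfile.mkdtemp())
for relative_path, source in layout.items():
    target = root / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(source)

# Both parents go on the search path; neither teams directory has an
# __init__.py, so Python treats teams as a namespace package.
sys.path[:0] = [str(root / "england"), str(root / "australia")]
importlib.invalidate_caches()

from teams import surrey, glamorgan, queensland

team_names = [surrey.NAME, glamorgan.NAME, queensland.NAME]
print(team_names)  # ['Surrey', 'Glamorgan', 'Queensland']
```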

### Python standard library

In this section we shall discuss some standard modules that have generic uses.

#### Handle missing keys

Trying to access a dictionary with a nonexistent key raises a KeyError exception. Using the dictionary get() method to return a default value avoids the exception. The setdefault() method is like get(), but also assigns the item to the dictionary if the key is missing:

In [6]:
from collections import defaultdict

In [1]:
# Create an example dictionary

periodic_table = {'Hydrogen': 1, 'Helium': 2}
periodic_table

{'Hydrogen': 1, 'Helium': 2}

If the key was not already in the dictionary, the new value is used:

In [2]:
# Assigning a new key to the dictionary

carbon = periodic_table.setdefault('Carbon', 12)
periodic_table

{'Hydrogen': 1, 'Helium': 2, 'Carbon': 12}

If we try to assign a different default value to an existing key, the original value is returned and nothing is changed:

In [3]:
# Trying to change helium

helium = periodic_table.setdefault('Helium', 100)
periodic_table

{'Hydrogen': 1, 'Helium': 2, 'Carbon': 12}
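For comparison, here is a minimal sketch of get(), which supplies a default without ever modifying the dictionary. The dictionary name elements and the default value 0 are chosen only for illustration.

```python
elements = {'Hydrogen': 1, 'Helium': 2}

# get() returns the stored value when the key exists ...
helium = elements.get('Helium', 0)

# ... and the default when it does not, without adding the key.
carbon = elements.get('Carbon', 0)

# With no default given, get() returns None for a missing key.
lead = elements.get('Lead')

print(helium, carbon, lead)  # 2 0 None
print(elements)              # {'Hydrogen': 1, 'Helium': 2}
```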

defaultdict is similar, but specifies the default value for any new key up front, when the dictionary is created. Its argument is a function; in this example we pass the function int. Now any missing value will be an integer with the value 0.

In [4]:
# Looking up a key, Lead, that is not in the dictionary

periodic_table = defaultdict(int)
periodic_table['Lead']

0

In [5]:
# Showing that we have created Lead in our dictionary with a default value

periodic_table

defaultdict(int, {'Lead': 0})

You can use the functions int(), list() or dict() to return default empty values for those types: int() returns 0, list() returns an empty list ([]) and dict() returns an empty dictionary ({}). If you omit the argument, no default is supplied, and looking up a missing key raises a KeyError, just as with an ordinary dictionary.
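A common use of a list default is grouping items under a key. The sketch below also shows a lambda as the factory for a custom default; the word list and the 'unknown' default are illustrative assumptions.

```python
from collections import defaultdict

# Group words by first letter; list() supplies an empty list for each new key.
words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)

print(dict(by_letter))
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}

# Any function that takes no arguments can act as the factory.
status = defaultdict(lambda: 'unknown')
print(status['Lead'])  # unknown
```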

#### Count items with Counter()

The standard library has a Counter class that can be used in a variety of ways:

In [7]:
from collections import Counter

In [14]:
# Define an example list

breakfast = ['porridge', 'jam', 'cereal', 'porridge', 'porridge', 'muffin', 'jam']

In [16]:
# Creating a counter

breakfast_counter = Counter(breakfast)
breakfast_counter

Counter({'porridge': 3, 'jam': 2, 'cereal': 1, 'muffin': 1})

The most_common() method returns all elements in descending order of count, or just the top count elements if given a count:

In [17]:
# Finding the most common element

breakfast_counter.most_common(1)

[('porridge', 3)]
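As a short sketch of the two calling styles, the same breakfast list gives the following results with no argument and with a count of 2. Elements with equal counts keep the order in which they were first seen.

```python
from collections import Counter

breakfast = ['porridge', 'jam', 'cereal', 'porridge', 'porridge', 'muffin', 'jam']
breakfast_counter = Counter(breakfast)

# No argument: every element, highest count first.
everything = breakfast_counter.most_common()
print(everything)
# [('porridge', 3), ('jam', 2), ('cereal', 1), ('muffin', 1)]

# With a count: only that many of the top elements.
top_two = breakfast_counter.most_common(2)
print(top_two)
# [('porridge', 3), ('jam', 2)]
```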

We can combine counters. Let us create a lunch counter. We can then add the two counters using + and subtract one from the other using -.

In [19]:
# Define our lunch counter

lunch = ['cereal', 'pasta', 'pasta', 'soup', 'muffin', 'pasta', 'soup']
lunch_counter = Counter(lunch)
lunch_counter

Counter({'cereal': 1, 'pasta': 3, 'soup': 2, 'muffin': 1})

In [20]:
# Addition of counters

breakfast_counter + lunch_counter

Counter({'porridge': 3,
         'jam': 2,
         'cereal': 2,
         'muffin': 2,
         'pasta': 3,
         'soup': 2})

In [21]:
# Subtraction of counters

breakfast_counter - lunch_counter

Counter({'porridge': 3, 'jam': 2})
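Note that the - operator discards any element whose resulting count is zero or negative. If those counts matter, the subtract() method keeps them; it works in place, so this sketch applies it to a copy (the name signed is hypothetical).

```python
from collections import Counter

breakfast_counter = Counter(['porridge', 'jam', 'cereal', 'porridge',
                             'porridge', 'muffin', 'jam'])
lunch_counter = Counter(['cereal', 'pasta', 'pasta', 'soup',
                         'muffin', 'pasta', 'soup'])

# The - operator keeps only positive counts.
difference = breakfast_counter - lunch_counter
print(difference)  # Counter({'porridge': 3, 'jam': 2})

# subtract() modifies the counter in place and keeps zero and negative counts.
signed = Counter(breakfast_counter)  # work on a copy
signed.subtract(lunch_counter)
print(signed['cereal'])  # 0
print(signed['pasta'])   # -3
```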

We can find the items in common between the two counters using the intersection operator &:

In [22]:
# Intersection of counters

breakfast_counter & lunch_counter

Counter({'cereal': 1, 'muffin': 1})

The intersection chooses the common element with the lower count. Finally, we can obtain all items by using the union operator |:

In [23]:
# Union of counters

breakfast_counter | lunch_counter

Counter({'porridge': 3,
         'jam': 2,
         'cereal': 1,
         'muffin': 1,
         'pasta': 3,
         'soup': 2})

Some items are common to both; unlike addition, union does not add their counts but selects the larger count.

#### Stack + Queue == deque

A deque is a double-ended queue, which has features of both a stack and a queue. It is useful when you want to add and delete items from either end of a sequence. In this example we work from both ends of a word to the middle to see whether it is a palindrome. This check would obviously be easier by comparing the string with its reverse, but a deque also works:

In [27]:
# Defining a check for palindromes using deque

from collections import deque

def palindrome(word):
    dq = deque(word)
    while len(dq) > 1:
        if dq.popleft() != dq.pop():
            return False
    return True

In [28]:
palindrome('racecar')

True

In [29]:
palindrome('palindrome')

False
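For reference, the simpler approach mentioned above, comparing the string with its reverse, might look like this. The function name palindrome_by_reverse is hypothetical.

```python
def palindrome_by_reverse(word):
    # word[::-1] is the string read backwards.
    return word == word[::-1]

print(palindrome_by_reverse('racecar'))     # True
print(palindrome_by_reverse('palindrome'))  # False
```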

#### Iterate over code structures with itertools

itertools contains special-purpose iterator functions. Each returns one item at a time when used in a for ... in loop and remembers its state between calls. The chain() function runs through its arguments as though they were a single iterable:

In [31]:
import itertools

for item in itertools.chain([1, 2, 3], ['a', 'b', 'c'], ['alpha', 'beta', 'gamma']):
    print(item)

1
2
3
a
b
c
alpha
beta
gamma


The accumulate() function calculates the accumulated values. By default it calculates the sum:

In [32]:
# Using the accumulate() function of itertools

for item in itertools.accumulate([1, 2, 3, 4]):
    print(item)

1
3
6
10


You can provide a function as the second argument to accumulate() and it will be used instead of addition. The function should take two arguments and return a single result. This example calculates the accumulated product:

In [33]:
# Using a defined function with accumulate()

def multiply(a, b):
    return a*b

for item in itertools.accumulate([1, 2, 3, 4], multiply):
    print(item)

1
2
6
24
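Any function of two arguments will do, including ones that already exist. As a sketch, operator.mul from the standard library gives the same running product, and the built-in max gives a running maximum; the second input list is made up for illustration.

```python
import itertools
import operator

# operator.mul(a, b) is equivalent to a * b.
running_product = list(itertools.accumulate([1, 2, 3, 4], operator.mul))
print(running_product)  # [1, 2, 6, 24]

# max keeps the largest value seen so far.
running_max = list(itertools.accumulate([3, 1, 4, 1, 5], max))
print(running_max)  # [3, 3, 4, 4, 5]
```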


The itertools module has many more functions, notably some for combinations and permutations.
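A brief sketch of the two: combinations() ignores order, while permutations() treats each ordering as distinct. The letters list is an arbitrary example.

```python
import itertools

letters = ['a', 'b', 'c']

# All ways to choose 2 letters where order does not matter.
pairs = list(itertools.combinations(letters, 2))
print(pairs)
# [('a', 'b'), ('a', 'c'), ('b', 'c')]

# All ways to arrange 2 letters where order does matter.
ordered_pairs = list(itertools.permutations(letters, 2))
print(ordered_pairs)
# [('a', 'b'), ('a', 'c'), ('b', 'a'), ('b', 'c'), ('c', 'a'), ('c', 'b')]
```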

#### Print nicely with pprint()

Generally we use print() or just the variable name in the interactive interpreter to print objects, which can be hard to read. We can use pprint() to prettify printing calls.

In [49]:
from pprint import pprint

variables = ('alpha', 'Definition for alpha', 1), ('beta', 'Definition for beta', 2), ('gamma','Definition for gamma', 3)

In [50]:
# Standard print() call

print(variables)

(('alpha', 'Definition for alpha', 1), ('beta', 'Definition for beta', 2), ('gamma', 'Definition for gamma', 3))


In [51]:
# Using pprint()

pprint(variables)

(('alpha', 'Definition for alpha', 1),
 ('beta', 'Definition for beta', 2),
 ('gamma', 'Definition for gamma', 3))
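The line width that pprint aims for can be adjusted with its width parameter (80 by default). This sketch uses pformat(), which returns the formatted text as a string instead of printing it, to compare the default with a narrower width of 30; that value is arbitrary.

```python
from pprint import pformat

variables = (('alpha', 'Definition for alpha', 1),
             ('beta', 'Definition for beta', 2),
             ('gamma', 'Definition for gamma', 3))

# Same formatting as pprint(), but returned as a string.
default_text = pformat(variables)

# A narrower width forces the inner tuples to wrap as well.
narrow_text = pformat(variables, width=30)

print(default_text)
print(narrow_text)
```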


#### Get random

The random.choice() function returns one value from the sequence (list, tuple, string or range) that it is given. If we wish to get more than one value at a time, we can use sample(), which picks values without replacement:

In [58]:
from random import choice, sample, randint, randrange, random

numbers = [1, 4, 7, 9, 12, 16, 18, 19]

sample(numbers, 2)

[9, 7]

To get a random integer from any range, you can use choice() or sample() with range(), or use randint() or randrange(). randint() includes both of its endpoints. randrange(), like range(), has arguments for the start (inclusive) and end (exclusive) integers, and an optional integer step:

In [54]:
# Choosing a random integer

randint(0, 100)

86

In [57]:
# Choosing from a range with integer step

randrange(0, 100, 10)

50

Finally, we can get a random real number (a float) from 0.0 up to, but not including, 1.0 using random.random():

In [59]:
# Random float between 0 and 1

random()

0.44801514842790036
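The values shown above will differ on every run. When repeatable results are needed, for example in tests, one option is to seed the generator. This sketch uses two independent random.Random instances with the same seed; the seed 42 and the variable names are arbitrary.

```python
import random

numbers = [1, 4, 7, 9, 12, 16, 18, 19]

# Two generators with the same seed produce the same sequence.
rng_a = random.Random(42)
rng_b = random.Random(42)
picks_a = [rng_a.choice(numbers) for _ in range(3)]
picks_b = [rng_b.choice(numbers) for _ in range(3)]
print(picks_a == picks_b)  # True

# randint() includes both endpoints, so a die roll is randint(1, 6).
rolls = [rng_a.randint(1, 6) for _ in range(100)]
print(min(rolls) >= 1 and max(rolls) <= 6)  # True

# random() stays within the half-open interval [0.0, 1.0).
value = rng_a.random()
print(0.0 <= value < 1.0)  # True
```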