# Get pythonic with the collections module

## First day: your new data structure friend

> This module implements specialized container datatypes providing alternatives to Python’s general purpose built-in containers, dict, list, set, and tuple.

### 1. Counter objects

A counter tool is provided to support convenient and rapid tallies.

In [3]:
# Tally occurrences of words in a list
from collections import Counter
cnt = Counter()
for word in ['red', 'blue', 'red', 'green', 'blue', 'blue']:
    cnt[word] += 1
cnt

Counter({'red': 2, 'blue': 3, 'green': 1})

Elements are counted from an iterable or initialized from another mapping (or counter):

In [4]:
c = Counter()                          # a new, empty counter
c = Counter('gallahad')                # a new counter from an iterable
c = Counter({'red': 4, 'blue': 2})     # a new counter from a mapping
c = Counter(cats=4, dogs=8)            # a new counter from keyword args
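To see what each of these constructors produces, here is a minimal sketch that inspects the resulting counters. The variable names are only illustrative and do not appear elsewhere in this notebook.

```python
from collections import Counter

empty = Counter()                              # no elements yet
from_iterable = Counter('gallahad')            # counts each character
from_mapping = Counter({'red': 4, 'blue': 2})  # copies the counts as given
from_kwargs = Counter(cats=4, dogs=8)          # keyword names become keys

# Converting to a plain dict makes the stored counts easy to inspect.
print(len(empty))
print(dict(from_iterable))
print(dict(from_mapping))
print(dict(from_kwargs))
```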

Counter objects have a dictionary interface except that they return a zero count for missing items instead of raising a KeyError.

In [5]:
c = Counter(['eggs', 'ham'])
c['bacon']                             # count of a missing element is zero

0
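One detail worth checking for yourself: reading a missing key returns zero but does not store the key. The short sketch below illustrates this; the names `missing` and `stored` are just for demonstration.

```python
from collections import Counter

c = Counter(['eggs', 'ham'])
missing = c['bacon']          # reading a missing key returns 0 ...
stored = 'bacon' in c         # ... and does not add the key to the counter

c['sausage'] = 0              # an explicitly assigned zero count is stored
del c['sausage']              # use del to remove an entry entirely

print(missing, stored, sorted(c))
```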

Counter objects support additional methods beyond those available for all dictionaries. Three of them are covered here:

#### elements()

Return an iterator over elements, repeating each as many times as its count. Since Python 3.7, elements are returned in the order they were first encountered (earlier versions returned them in arbitrary order). If an element's count is less than one, elements() ignores it.

In [6]:
c = Counter(a=4, b=2, c=0, d=-2)
list(c.elements())

['a', 'a', 'a', 'a', 'b', 'b']
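A quick way to convince yourself that zero and negative counts are skipped is to count the expanded elements again. This is only a sketch; `expanded` and `rebuilt` are illustrative names.

```python
from collections import Counter

c = Counter(a=4, b=2, c=0, d=-2)
expanded = list(c.elements())     # 'c' and 'd' are skipped (count < 1)

# Round trip: counting the expanded elements keeps only the positive counts.
rebuilt = Counter(expanded)
print(len(expanded), dict(rebuilt))
```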

#### most_common([n])

Return a list of the n most common elements and their counts, from the most common to the least. If n is omitted or None, most_common() returns all elements in the counter. Elements with equal counts are ordered by first occurrence.

In [7]:
Counter('abracadabra').most_common(3)

[('a', 5), ('b', 2), ('r', 2)]
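The sketch below shows the call without n, plus a slicing idiom for getting the least common elements. The slice is an ordinary list slice on the returned list, not a separate Counter method.

```python
from collections import Counter

letters = Counter('abracadabra')

everything = letters.most_common()          # n omitted: all elements, most common first
least_two = letters.most_common()[:-3:-1]   # slice from the end: the 2 least common

print(everything)
print(least_two)
```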

#### subtract([iterable-or-mapping])

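Subtract counts from an iterable or from another mapping (or counter). This works like dict.update(), except that counts are subtracted instead of replaced. Both inputs and results may be zero or negative.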

In [8]:
c = Counter(a=4, b=2, c=0, d=-2)
d = Counter(a=1, b=2, c=3, d=4)
c.subtract(d)
c

Counter({'a': 3, 'b': 0, 'c': -3, 'd': -6})
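It may help to contrast subtract() with the `-` operator, which Counter also supports. As a rough sketch: the operator builds a new counter and drops results that are zero or negative, while subtract() changes the counter in place and keeps them.

```python
from collections import Counter

c = Counter(a=4, b=2, c=0, d=-2)
d = Counter(a=1, b=2, c=3, d=4)

difference = c - d        # new Counter; zero and negative results are dropped
c.subtract(d)             # in place; zero and negative results are kept

print(dict(difference))
print(dict(c))
```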

### 2. deque objects

Deques are a generalization of stacks and queues (the name is short for “double-ended queue”): items can be added and removed at either end.
The examples below walk through the most common deque methods:

In [9]:
from collections import deque
d = deque('ghi')                # make a new deque with three items
for elem in d:                  # iterate over the deque's elements
    print(elem.upper())

G
H
I


In [10]:
d.append('j')                   # add a new entry to the right side
d.appendleft('f')               # add a new entry to the left side 
d                               # show the representation of the deque

deque(['f', 'g', 'h', 'i', 'j'])

In [11]:
d.pop()                         # return and remove the rightmost item

'j'

In [12]:
d.popleft()                  # return and remove the leftmost item

'f'

In [13]:
list(d)                      # list the contents of the deque

['g', 'h', 'i']

In [14]:
d[0]                         # peek at leftmost item

'g'

In [15]:
d[-1]                        # peek at rightmost item

'i'

In [16]:
list(reversed(d))            # list the contents of a deque in reverse

['i', 'h', 'g']

In [17]:
'h' in d                     # search the deque

True

In [19]:
d.extend('jkl')              # add multiple elements at once

In [20]:
d

deque(['g', 'h', 'i', 'j', 'k', 'l'])

In [21]:
d.rotate(1)                  # right rotation
d

deque(['l', 'g', 'h', 'i', 'j', 'k'])

In [22]:
d.rotate(-1)                 # left rotation
d

deque(['g', 'h', 'i', 'j', 'k', 'l'])

In [23]:
deque(reversed(d))           # make a new deque in reverse order

deque(['l', 'k', 'j', 'i', 'h', 'g'])

In [24]:
d.clear()                    # empty the deque

In [25]:
d.pop()                      # cannot pop from an empty deque

IndexError: pop from an empty deque

In [26]:
d.extendleft('abc')           # extendleft() reverses the input order
d

deque(['c', 'b', 'a'])
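To make the "generalization of stacks and queues" idea concrete, here is a minimal sketch that uses the same deque type both ways. The names `stack` and `queue` are illustrative; the only difference is which end items are removed from.

```python
from collections import deque

# Stack (last in, first out): add and remove on the same end.
stack = deque()
for item in ('a', 'b', 'c'):
    stack.append(item)
last_in = stack.pop()          # removes the most recently added item

# Queue (first in, first out): add on the right, remove from the left.
queue = deque()
for item in ('a', 'b', 'c'):
    queue.append(item)
first_in = queue.popleft()     # removes the oldest item

print(last_in, first_in, list(stack), list(queue))
```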

### 3. namedtuple()

Named tuple instances do not have per-instance dictionaries, so they are lightweight and require no more memory than regular tuples. Named tuples are especially useful for assigning field names to result tuples returned by the csv or sqlite3 modules. Let's first look at a classic tuple:

In [4]:
user = ('bob', 'coder')

With a plain tuple, nothing in the code says what each position means, so reading the data back by index makes for ugly code:

In [5]:
f'{user[0]} is a {user[1]}'

'bob is a coder'

Let's contrast that with a namedtuple:

In [6]:
from collections import namedtuple
User = namedtuple('User', 'name role')

You can directly see that the object has a name and role:

In [7]:
user = User(name='bob', role='coder')

In [8]:
user.name

'bob'

In [9]:
user.role

'coder'

This makes the last string much more informative and elegant (f-strings help too, of course):

In [10]:
f'{user.name} is a {user.role}'

'bob is a coder'

CONCLUSION: use a namedtuple wherever you can! They are easy to implement and make your code more readable.
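A namedtuple is still a tuple, so nothing is lost by switching. The sketch below shows a few things that keep working, plus two helper methods, `_asdict()` and `_replace()`, which are part of the standard namedtuple API but not covered above.

```python
from collections import namedtuple

User = namedtuple('User', 'name role')
user = User(name='bob', role='coder')

name, role = user                        # unpacks like a regular tuple
first = user[0]                          # index access still works
as_dict = user._asdict()                 # field names mapped to values
promoted = user._replace(role='lead')    # returns a new instance; user is unchanged

print(name, role, first)
print(dict(as_dict))
print(promoted.role, user.role)
```

Because tuples are immutable, `_replace()` never modifies the original object, which is why `user.role` is still `'coder'` afterwards.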

### 4. defaultdict

I guess you are all too familiar with KeyError when using a dict, no?

In [11]:
users = {'bob': 'coder'}

In [12]:
users['bob']
users['julian']  # oops, this will give an error

KeyError: 'julian'

Use get() to look up a key without risking an error. It returns None when the key is missing:

In [14]:
users.get('bob')

'coder'

In [15]:
users.get('julian') is None 

True
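get() also accepts a second argument to return instead of None. A small sketch, where the fallback string `'unknown'` is just an example value:

```python
users = {'bob': 'coder'}

role = users.get('julian', 'unknown')   # second argument is the fallback value
still_missing = 'julian' not in users   # get() never adds the key to the dict

print(role, still_missing)
```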

What if you need to build up a collection, though? Let's make a dict that maps each name to a list of challenges, starting from the following list of tuples:

In [16]:
challenges_done = [('mike', 10), ('julian', 7), ('bob', 5),
                   ('mike', 11), ('julian', 8), ('bob', 6)]
challenges_done

[('mike', 10),
 ('julian', 7),
 ('bob', 5),
 ('mike', 11),
 ('julian', 8),
 ('bob', 6)]

A plain dict fails on the very first append, because the key has no list to append to yet:

In [18]:
challenges = {}
for name, challenge in challenges_done:
    challenges[name].append(challenge)

KeyError: 'mike'

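A defaultdict(list) creates an empty list the first time a missing key is accessed, so the same loop just works:

In [20]: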
from collections import defaultdict
challenges = defaultdict(list)
for name, challenge in challenges_done:
    challenges[name].append(challenge)

challenges

defaultdict(list, {'mike': [10, 11], 'julian': [7, 8], 'bob': [5, 6]})
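For comparison, here is a sketch of the same grouping with a plain dict and setdefault(), followed by a defaultdict(int) that tallies how many challenges each person did. The names `with_setdefault` and `totals` are illustrative only.

```python
from collections import defaultdict

challenges_done = [('mike', 10), ('julian', 7), ('bob', 5),
                   ('mike', 11), ('julian', 8), ('bob', 6)]

# Plain dict: setdefault() inserts an empty list the first time a name is seen.
with_setdefault = {}
for name, challenge in challenges_done:
    with_setdefault.setdefault(name, []).append(challenge)

# defaultdict(int): missing keys start at int(), which is 0, handy for tallies.
totals = defaultdict(int)
for name, challenge in challenges_done:
    totals[name] += 1

print(with_setdefault)
print(dict(totals))
```

The defaultdict version reads a little cleaner because the default is declared once, up front, rather than at every access.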

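### 5. OrderedDict

An OrderedDict is a dict subclass that remembers the order in which its keys were inserted, so building one from sorted items gives you a dictionary in sorted order.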

In [21]:
# regular unsorted dictionary
d = {'banana': 3, 'apple': 4, 'pear': 1, 'orange': 2}

The examples below sort the dictionary's items with a lambda as the key function. A lambda is a small anonymous function: it can take any number of arguments but can only have one expression.

In [24]:
# dictionary sorted by key
from collections import OrderedDict
OrderedDict(sorted(d.items(), key=lambda t: t[0]))

OrderedDict([('apple', 4), ('banana', 3), ('orange', 2), ('pear', 1)])

In [25]:
# dictionary sorted by value
OrderedDict(sorted(d.items(), key=lambda t: t[1]))

OrderedDict([('pear', 1), ('orange', 2), ('banana', 3), ('apple', 4)])

In [26]:
# dictionary sorted by length of the key string
OrderedDict(sorted(d.items(), key=lambda t: len(t[0])))

OrderedDict([('pear', 1), ('apple', 4), ('banana', 3), ('orange', 2)])
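Beyond sorting, an OrderedDict has a few order-aware features. The sketch below uses `move_to_end()` and `popitem(last=False)`, which are standard OrderedDict methods not covered above, and shows that equality between two OrderedDicts takes order into account.

```python
from collections import OrderedDict

d = {'banana': 3, 'apple': 4, 'pear': 1, 'orange': 2}
by_key = OrderedDict(sorted(d.items(), key=lambda t: t[0]))

by_key.move_to_end('apple')               # move an existing key to the end
first_item = by_key.popitem(last=False)   # remove and return the first item

# Two OrderedDicts with the same items in a different order are not equal.
same_order = (OrderedDict([('a', 1), ('b', 2)])
              == OrderedDict([('b', 2), ('a', 1)]))

print(list(by_key), first_item, same_order)
```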