<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Counter" data-toc-modified-id="Counter-1"><span class="toc-item-num">1&nbsp;&nbsp;</span><code>Counter</code></a></span><ul class="toc-item"><li><span><a href="#Counter-Methods" data-toc-modified-id="Counter-Methods-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>Counter Methods</a></span></li></ul></li><li><span><a href="#defaultdict" data-toc-modified-id="defaultdict-2"><span class="toc-item-num">2&nbsp;&nbsp;</span><code>defaultdict</code></a></span></li><li><span><a href="#OrderedDict" data-toc-modified-id="OrderedDict-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>OrderedDict</a></span><ul class="toc-item"><li><span><a href="#Equality-With-OrderedDict" data-toc-modified-id="Equality-With-OrderedDict-3.1"><span class="toc-item-num">3.1&nbsp;&nbsp;</span>Equality With OrderedDict</a></span></li></ul></li><li><span><a href="#namedTuple" data-toc-modified-id="namedTuple-4"><span class="toc-item-num">4&nbsp;&nbsp;</span><code>namedTuple</code></a></span></li></ul></div>

# `collections` module

- Built-in module that implements specialized container data types
- Alternatives to Python’s general purpose built-in containers

## `Counter`

- A `dict` subclass for counting hashable objects
- Elements are stored as dictionary keys and their counts are stored as the values

In [1]:
from collections import Counter

# From list to Counter
ls = [1, 2, 2, 2, 2, 3, 3, 3, 1, 2, 1, 12, 3, 2, 32, 1, 21, 1, 223, 1]
print(Counter(ls))

Counter({1: 6, 2: 6, 3: 4, 12: 1, 32: 1, 21: 1, 223: 1})


In [2]:
# From string to Counter: a string is iterated character by character,
# so this counts characters, not words
sentence = 'Hello world! This is a simple test for Counter with strings!'
char_counts = Counter(sentence)
print(char_counts)

Counter({' ': 10, 's': 6, 'i': 5, 't': 5, 'e': 4, 'l': 4, 'o': 4, 'r': 4, 'w': 2, '!': 2, 'h': 2, 'n': 2, 'H': 1, 'd': 1, 'T': 1, 'a': 1, 'm': 1, 'p': 1, 'f': 1, 'C': 1, 'u': 1, 'g': 1})


In [3]:
# Iterating over a Counter directly yields only its keys
# most_common() yields (element, count) pairs, highest count first
for k, v in char_counts.most_common():
    print(str(k) + ': ' + str(v), end=", ")

 : 10, s: 6, i: 5, t: 5, e: 4, l: 4, o: 4, r: 4, w: 2, !: 2, h: 2, n: 2, H: 1, d: 1, T: 1, a: 1, m: 1, p: 1, f: 1, C: 1, u: 1, g: 1, 

In [4]:
# `Counter` with words in a sentence
string = 'How many times does each word show up in this sentence word times each each word'
words = string.split() # Produces a list
print(Counter(words))

Counter({'each': 3, 'word': 3, 'times': 2, 'How': 1, 'many': 1, 'does': 1, 'show': 1, 'up': 1, 'in': 1, 'this': 1, 'sentence': 1})
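
A few other `Counter` behaviours are useful when counting words. The following is a minimal sketch that reuses the sentence above; the names `text`, `word_counts`, `missing` and `top_two` are chosen here for illustration only.

```python
from collections import Counter

text = 'How many times does each word show up in this sentence word times each each word'
word_counts = Counter(text.split())

# A missing key reports a count of 0 instead of raising KeyError,
# and the lookup does not insert the key
missing = word_counts['absent']

# most_common(n) returns the n highest counts as (element, count) pairs
top_two = word_counts.most_common(2)

# update() adds to the existing counts rather than replacing them
word_counts.update(['word', 'new'])

print(missing)
print(top_two)
print(word_counts['word'], word_counts['new'])
```

Because a missing key simply counts as zero, there is no need to check for membership before reading or incrementing a count.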


### Counter Methods

In [5]:
nums = [1, 2, 2, 2, 2, 3, 3, 3, 1, 2, 1, 12, 3, 2, 32, 1, 21, 1, 223, 1]
c = Counter(nums)

print(sum(c.values()))                  # total of all counts
print(c.clear())                        # clear() empties the counter in place and returns None

c = Counter(nums)
print(list(c))                          # list of the unique elements (the keys)
print(set(c))                           # set of the unique elements: same members as list(c), but unordered
print(dict(c))                          # convert to a regular dictionary: {elem: cnt}
print(c.items())                        # view of (elem, cnt) pairs; use list(c.items()) for a list
# print(Counter(dict(list_of_pairs)))     # convert from a list of (elem, cnt) pairs
# print(c.most_common()[:-n-1:-1])        # n least common elements
# c += Counter()                          # remove zero and negative counts (a statement, so it cannot go inside print())

20
None
[1, 2, 3, 12, 32, 21, 223]
{32, 1, 2, 3, 12, 21, 223}
{1: 6, 2: 6, 3: 4, 12: 1, 32: 1, 21: 1, 223: 1}
dict_items([(1, 6), (2, 6), (3, 4), (12, 1), (32, 1), (21, 1), (223, 1)])
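
The three commented-out idioms in the cell above can be tried out as follows. This is only a sketch: `pairs` is made-up sample data and `n = 2` is an arbitrary choice.

```python
from collections import Counter

pairs = [('a', 3), ('b', 1), ('c', 2), ('d', 5)]  # hypothetical sample data

# Build a Counter from a list of (element, count) pairs
c = Counter(dict(pairs))

# The n least common elements: reverse-slice the sorted most_common() list
n = 2
least_common = c.most_common()[:-n-1:-1]

# subtract() can leave zero or negative counts behind ...
c.subtract({'a': 3, 'b': 4})
# ... and adding an empty Counter in place drops those entries
c += Counter()

print(least_common)
print(c)
```

Only elements with a positive count survive the final step, which is why `'a'` (count 0) and `'b'` (count -3) disappear.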


---

## `defaultdict`

- A `dict` subclass that provides every method of a regular dictionary
- Takes a callable (`default_factory`) as its first argument; it is called with no arguments to produce the value for a missing key
- Using `defaultdict` is simpler, and typically faster, than doing the same with the `dict.setdefault` method
- Looking up a missing key with `dd[key]` does not raise a `KeyError`: the key is inserted with the value returned by the default factory (if `default_factory` is `None`, a `KeyError` is still raised)


In [6]:
from collections import defaultdict

d = {}
# d['one'] # => KeyError: there is no key 'one' in d

dd = defaultdict(lambda: "Default value") # The factory is called with no arguments for each missing key
print(dd['one']) # No error: the default value is returned and stored under 'one'
dd['two'] = "Hello"

for item in dd:
    print(str(item) + " - " + str(dd[item]))

# The factory can return any default, e.g. 0 (equivalent to defaultdict(int)):
dd = defaultdict(lambda: 0)
print(dd['one'])

Default value
one - Default value
two - Hello
0
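
A common use of `defaultdict` is grouping items under a key. The sketch below compares it with the plain `dict.setdefault` approach; the word list is made-up sample data and the variable names are illustrative.

```python
from collections import defaultdict

words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']  # hypothetical sample data

# Group words by first letter; list() supplies an empty list for each new key
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)

# The same grouping with a plain dict and setdefault()
plain = {}
for word in words:
    plain.setdefault(word[0], []).append(word)

# get() and `in` do not call the factory, so they never insert a key
looked_up = by_letter.get('z')
has_z = 'z' in by_letter

print(dict(by_letter))
print(looked_up, has_z)
```

Note that only indexing (`by_letter[key]`) triggers the factory; `get()` still returns `None` for a missing key.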


---

## OrderedDict

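- A dictionary subclass that remembers the order in which its contents are added
- Since Python 3.7 the built-in `dict` also preserves insertion order; `OrderedDict` still differs in its order-sensitive equality and in reordering methods such as `move_to_end()`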

In [7]:
# Normal Dictionary
print('Normal dictionary:')
d = {}
d['a'] = 'A'
d['c'] = 'c'
d['b'] = 'B'
d['e'] = 'E'
d['d'] = 'D'

for k, v in d.items():
    print(k, v)

Normal dictionary:
a A
c c
b B
e E
d D


In [8]:
# An Ordered Dictionary
from collections import OrderedDict

print('OrderedDict:')
d = OrderedDict()
d['a'] = 'A'
d['b'] = 'B'
d['c'] = 'C'
d['d'] = 'D'
d['e'] = 'E'

for k, v in d.items():
    print(k, v)

OrderedDict:
a A
b B
c C
d D
e E
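
Beyond remembering insertion order, `OrderedDict` offers methods for reordering entries. A short sketch, with illustrative variable names:

```python
from collections import OrderedDict

od = OrderedDict([('a', 'A'), ('b', 'B'), ('c', 'C')])

od.move_to_end('a')              # move 'a' to the last position
od.move_to_end('c', last=False)  # move 'c' to the first position
order_after_moves = list(od)

# popitem(last=False) removes and returns the first (key, value) pair
first = od.popitem(last=False)

print(order_after_moves)
print(first)
print(list(od))
```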


### Equality With OrderedDict

- A regular dict looks only at its contents when testing for equality
- Two `OrderedDict` objects are equal only if their items were also added in the same order

In [9]:
# A normal Dictionary
print('Dictionaries are equal? ')

d1 = {}
d1['a'] = 'A'
d1['b'] = 'B'

d2 = {}
d2['b'] = 'B'
d2['a'] = 'A'

print(d1 == d2)

Dictionaries are equal? 
True


In [10]:
# An Ordered Dictionary:
print('Dictionaries are equal? ')

d1 = OrderedDict()
d1['a'] = 'A'
d1['b'] = 'B'

d2 = OrderedDict()
d2['b'] = 'B'
d2['a'] = 'A'

print(d1 == d2)

Dictionaries are equal? 
False
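
The order-sensitive comparison applies only when both operands are `OrderedDict` objects. The sketch below, with illustrative variable names, shows that comparing an `OrderedDict` with a regular dict ignores order.

```python
from collections import OrderedDict

od1 = OrderedDict([('a', 'A'), ('b', 'B')])
od2 = OrderedDict([('b', 'B'), ('a', 'A')])
plain = {'b': 'B', 'a': 'A'}

both_ordered = (od1 == od2)    # order-sensitive: both sides are OrderedDicts
mixed = (od1 == plain)         # compared like regular dicts: order is ignored

print(both_ordered)
print(mixed)
```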


---

## `namedtuple`

- The standard tuple uses numerical indexes to access its members
- For simple use cases, this is usually enough
- But trying to remember which index holds each value can lead to errors
- A `namedtuple` assigns names, as well as the numerical index, to each member
- Each kind of `namedtuple` is represented by its own class, created by using the `namedtuple()` factory function
  - The arguments are the name of the new class and the field names, given as a space- or comma-separated string (or as a sequence of strings)
- Think of a `namedtuple` as a quick way to create a lightweight, immutable class with named attribute fields

In [11]:
from collections import namedtuple

# Construction: namedtuple('ClassName', 'attr1 attr2 attr3...')
Dog = namedtuple('Dog', 'age breed name')

sam = Dog(age=2, breed='Lab', name='Sammy')
frank = Dog(age=3, breed='Shepherd', name="Frankie")

In [12]:
print(sam)
print(sam.age)
print(sam.breed)
print(sam[0])

Dog(age=2, breed='Lab', name='Sammy')
2
Lab
2
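
A `namedtuple` instance still behaves like a tuple, and the generated class adds a few helpers. The sketch below rebuilds the `Dog` class from above, passing the field names as a list; the other variable names are illustrative.

```python
from collections import namedtuple

Dog = namedtuple('Dog', ['age', 'breed', 'name'])
sam = Dog(age=2, breed='Lab', name='Sammy')

# Unpacks like a regular tuple
age, breed, name = sam

# _replace() returns a new instance; the original is left unchanged
older_sam = sam._replace(age=3)

# _asdict() maps field names to values; _fields lists the field names
as_dict = sam._asdict()
fields = Dog._fields

# Instances are immutable: assigning to a field raises AttributeError
try:
    sam.age = 5
except AttributeError:
    immutable = True
else:
    immutable = False

print(older_sam)
print(fields)
print(immutable)
```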
