## `collections` Module


Specialized container types that provide alternatives to Python's built-in **dict**, **list**, **set**, and **tuple** objects.

---


In [1]:
# Import the most common 'collections' types
from collections import defaultdict, namedtuple, Counter, deque
import csv
import random
from urllib.request import urlretrieve

### `namedtuple`

- A convenient way to define a simple data-holding class without writing any methods.
- Stores records whose fields are accessible by attribute name, much like the keys of a `dict`.
- Conventional `tuple` objects are accessed by index, and the indices carry no meaning about the data they hold.

In [2]:
# Conventional tuple to define a user and a role
user = ('Tim', 'admin')

print(user)

('Tim', 'admin')


In [3]:
# tuple indices have no significant meaning to the data in each index
print(f'User {user[0]} has the role of {user[1]}.')

User Tim has the role of admin.


In [4]:
# Create a namedtuple container object for comparison
# First argument is the 'typename'
# Second argument is the names of the fields, space separated
User = namedtuple('User', 'name role')

In [5]:
# Create an instance of the namedtuple, passing the fields as keyword arguments
user = User(name='Tim', role='admin')

In [6]:
# Access data from the namedtuple with meaningful references
print(f'User {user.name} has the role of {user.role}.')

User Tim has the role of admin.
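
Because a `namedtuple` is still a `tuple`, the attribute access shown above comes in addition to normal tuple behaviour. The following is a minimal sketch, reusing the `User` type from this section, of indexing, unpacking, immutability, and the `_replace()` and `_asdict()` helpers listed in the documentation. The variable names `guest`, `as_dict`, and `mutated` are only for illustration.

```python
from collections import namedtuple

User = namedtuple('User', 'name role')
user = User(name='Tim', role='admin')

# Still a tuple: indexing and unpacking keep working
first = user[0]
name, role = user

# Fields are read-only; assigning to one raises AttributeError
try:
    user.role = 'guest'
    mutated = True
except AttributeError:
    mutated = False

# _replace() returns a new instance and leaves the original untouched
guest = user._replace(role='guest')

# _asdict() converts the fields to a mapping of field name -> value
as_dict = user._asdict()

print(isinstance(user, tuple))  # True
print(guest)                    # User(name='Tim', role='guest')
print(user.role)                # admin
```

If a record needs to change often, a plain `dict` or a regular class may be a better fit than repeated calls to `_replace()`.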


#### `namedtuple` documentation


In [7]:
help(namedtuple)

Help on function namedtuple in module collections:

namedtuple(typename, field_names, *, rename=False, defaults=None, module=None)
    Returns a new subclass of tuple with named fields.
    
    >>> Point = namedtuple('Point', ['x', 'y'])
    >>> Point.__doc__                   # docstring for the new class
    'Point(x, y)'
    >>> p = Point(11, y=22)             # instantiate with positional args or keywords
    >>> p[0] + p[1]                     # indexable like a plain tuple
    33
    >>> x, y = p                        # unpack like a regular tuple
    >>> x, y
    (11, 22)
    >>> p.x + p.y                       # fields also accessible by name
    33
    >>> d = p._asdict()                 # convert to a dictionary
    >>> d['x']
    11
    >>> Point(**d)                      # convert from a dictionary
    Point(x=11, y=22)
    >>> p._replace(x=100)               # _replace() is like str.replace() but targets named fields
    Point(x=100, y=22)
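
The signature above also lists the keyword-only parameters `defaults` and `rename`, which the docstring examples do not exercise. The sketch below shows both, assuming Python 3.7 or later (when `defaults` was added). The type names `Account` and `Row` are hypothetical and used only for this illustration.

```python
from collections import namedtuple

# defaults apply to the rightmost fields, so only 'role' gets a default here
Account = namedtuple('Account', ['name', 'role'], defaults=['guest'])

print(Account('Sara'))          # Account(name='Sara', role='guest')
print(Account('Tim', 'admin'))  # Account(name='Tim', role='admin')
print(Account._field_defaults)  # {'role': 'guest'}

# rename=True replaces invalid or duplicate field names with positional names
Row = namedtuple('Row', ['id', 'class', 'id'], rename=True)

print(Row._fields)              # ('id', '_1', '_2')
```

`rename=True` can be handy when field names come from an external source such as a CSV header, where keywords or duplicates may appear.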



---
### `defaultdict`

- Useful to avoid `KeyError` exceptions when building a nested data set.
- In this example, players have multiple game scores in the data set.
- The goal is to have a single dictionary **key** for each player with the **value** for each **key** being a list of scores.


In [8]:
# List of tuples with player names and game scores
game_scores = [
    ('Tim', 100),
    ('Sara', 150),
    ('Lily', 130),
    ('Ella', 180),
    ('Tim', 50),
    ('Sara', 60),
    ('Lily', 100),
    ('Ella', 70)
]

print(game_scores)

[('Tim', 100), ('Sara', 150), ('Lily', 130), ('Ella', 180), ('Tim', 50), ('Sara', 60), ('Lily', 100), ('Ella', 70)]


In [9]:
# Create an empty dictionary to hold each player's scores
scores = {}

print(scores)

{}


In [10]:
# Loop over the list and unpack each tuple into two variables
# Raises a KeyError because the keys for the player names do not yet exist
for name, score in game_scores:
    scores[name].append(score)

KeyError: 'Tim'
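
For comparison, the `KeyError` can also be avoided with a plain `dict`, at the cost of extra code on every insert. The sketch below shows two common approaches, an explicit membership check and `dict.setdefault()`, using a shortened copy of the data set. The names `scores_checked` and `scores_setdefault` are only for illustration.

```python
# Shortened copy of the game_scores data set
game_scores = [('Tim', 100), ('Sara', 150), ('Tim', 50), ('Sara', 60)]

# Option 1: create the list explicitly the first time a name is seen
scores_checked = {}
for name, score in game_scores:
    if name not in scores_checked:
        scores_checked[name] = []
    scores_checked[name].append(score)

# Option 2: setdefault() returns the existing value, or inserts and returns the default
scores_setdefault = {}
for name, score in game_scores:
    scores_setdefault.setdefault(name, []).append(score)

print(scores_checked)     # {'Tim': [100, 50], 'Sara': [150, 60]}
print(scores_setdefault)  # {'Tim': [100, 50], 'Sara': [150, 60]}
```

Both work, but the missing-key handling has to be repeated wherever the dictionary is filled.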

In [11]:
# Create a defaultdict and set the data type to produce when a key is not present (a list in this case)
scores = defaultdict(list)

print(scores)

defaultdict(<class 'list'>, {})


In [12]:
# Loop over the data set (game_scores)
# Create a key for each name, if it doesn't already exist
# Append a score to the value for any matching name key
for name, score in game_scores:
    scores[name].append(score)

print(scores)

defaultdict(<class 'list'>, {'Tim': [100, 50], 'Sara': [150, 60], 'Lily': [130, 100], 'Ella': [180, 70]})
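
One behaviour worth knowing: the default factory runs whenever a missing key is looked up with square brackets, not only when appending. The sketch below, using a small `scores` dictionary and a hypothetical player `'Bob'`, shows that `scores['Bob']` silently inserts an empty list, while `.get()` and `in` leave the dictionary unchanged.

```python
from collections import defaultdict

scores = defaultdict(list)
scores['Tim'].append(100)

# .get() and 'in' do not call the default factory
missing = scores.get('Bob')
has_bob_before = 'Bob' in scores

# A square-bracket lookup calls list() and stores the result under 'Bob'
empty = scores['Bob']
has_bob_after = 'Bob' in scores

# Convert to a plain dict once the data is built to stop keys being added by lookups
plain = dict(scores)

print(missing)         # None
print(has_bob_before)  # False
print(has_bob_after)   # True
print(plain)           # {'Tim': [100], 'Bob': []}
```

Converting to a regular `dict` after the build step means later lookups of unknown names raise `KeyError` again, which is often what read-only code wants.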


#### `defaultdict` documentation

In [13]:
help(defaultdict)

Help on class defaultdict in module collections:

class defaultdict(builtins.dict)
 |  defaultdict(default_factory[, ...]) --> dict with default factory
 |  
 |  The default factory is called without arguments to produce
 |  a new value when a key is not present, in __getitem__ only.
 |  A defaultdict compares equal to a dict with the same items.
 |  All remaining arguments are treated the same as if they were
 |  passed to the dict constructor, including keyword arguments.
 |  
 |  Method resolution order:
 |      defaultdict
 |      builtins.dict
 |      builtins.object
 |  
 |  Methods defined here:
 |  
 |  __copy__(...)
 |      D.copy() -> a shallow copy of D.
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __init__(self, /, *args, **kwargs)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  __missing__(...)
 |      __missing__(key) # Called by __getitem__ for missing key; pseudo-code:
 |      if self.default_facto

---
### `Counter`

- Note 1