Pierre Navaro - [Institut de Recherche Mathématique de Rennes](https://irmar.univ-rennes1.fr) - [CNRS](http://www.cnrs.fr/)

# Container datatypes

The `collections` module implements specialized container datatypes that provide alternatives to Python’s general-purpose built-in containers `dict`, `list`, `set`, and `tuple`.

- `namedtuple()`: factory function for creating tuple subclasses with named fields
- `deque`: list-like container with fast appends and pops on either end
- `ChainMap`: dict-like class for creating a single view of multiple mappings
- `Counter`: dict subclass for counting hashable objects
- `defaultdict`: dict subclass that calls a factory function to supply missing values


## ChainMap 


The ChainMap class manages a sequence of dictionaries, and searches through them in the order they are given to find values associated with keys. 
A ChainMap makes a good “context” container, since it can be treated as a stack for which changes happen as the stack grows, with these changes being discarded again as the stack shrinks.
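
The stack-like behaviour described above can be sketched with `new_child()` and `parents`. This is a minimal illustration, not part of the original notebook; the `defaults` dictionary and its keys are made up for the example.

```python
from collections import ChainMap

# Hypothetical base settings (illustrative names only).
defaults = {'color': 'blue', 'size': 10}
scope = ChainMap(defaults)

# Entering a nested context pushes a new, empty mapping on the front.
inner = scope.new_child()
inner['color'] = 'red'          # stored in the new front mapping only

inner_color = inner['color']    # found in the front mapping
inner_size = inner['size']      # falls through to defaults

# Leaving the context: .parents is a ChainMap without the front mapping.
outer = inner.parents
outer_color = outer['color']    # the override is discarded

print(inner_color, inner_size, outer_color)   # red 10 blue
```

The override lives only in the mapping pushed by `new_child()`, so dropping that mapping restores the previous view without touching `defaults`.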

In [189]:
import collections

rgb = {'r': 'red', 'g': 'green', 'b': 'blue', }
cmjk = {'c': 'cyan','m': 'magenta', 'y': 'yellow', 'b': 'black'}

colormode = collections.ChainMap(rgb, cmjk)

print('r = {}'.format(colormode['r']))
print('g = {}'.format(colormode['g']))
print('b = {}'.format(colormode['b']))

r = red
g = green
b = blue


In [190]:
print('Keys = {}'.format(list(colormode.keys())))
print('Values = {}'.format(list(colormode.values())))

Keys = ['c', 'm', 'y', 'b', 'r', 'g']
Values = ['cyan', 'magenta', 'yellow', 'blue', 'red', 'green']


In [191]:
print('Items:')
for k, v in colormode.items():
    print('{} = {}'.format(k, v))

Items:
c = cyan
m = magenta
y = yellow
b = blue
r = red
g = green


### Reordering

The ChainMap keeps the mappings it searches in a list, exposed as its `maps` attribute. This list is mutable, so it is possible to add new mappings directly or to change the order of the elements to control lookup and update behavior.



In [192]:
cm = colormode  # shorter alias for the same ChainMap

print(cm.maps)
print('b = {}\n'.format(cm['b']))

# reverse the list
cm.maps = list(reversed(cm.maps))

print(cm.maps)
print('b = {}'.format(cm['b']))

[{'r': 'red', 'g': 'green', 'b': 'blue'}, {'c': 'cyan', 'm': 'magenta', 'y': 'yellow', 'b': 'black'}]
b = blue

[{'c': 'cyan', 'm': 'magenta', 'y': 'yellow', 'b': 'black'}, {'r': 'red', 'g': 'green', 'b': 'blue'}]
b = black


### Updating Values

A ChainMap does not cache the values in the child mappings. If their contents are modified, the changes are visible the next time the ChainMap is accessed. So we can fix the wrong key in `cmjk` (black was stored under `'b'` instead of `'k'`).

In [193]:
cmjk['k'] = cmjk.pop('b') 
cm['k']

'black'
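
Changes made through the ChainMap itself behave differently from changes made to a child mapping: writes always go to the first mapping in `maps`, even when the key already exists further down the chain. A small sketch, using two throwaway dictionaries that are not part of the notebook:

```python
from collections import ChainMap

first = {'a': 1}
second = {'b': 2}
chain = ChainMap(first, second)

chain['b'] = 20    # 'b' exists in second, but the write lands in first
second['c'] = 3    # a direct change to a child is visible immediately

print(first)                   # {'a': 1, 'b': 20}
print(second)                  # {'b': 2, 'c': 3}
print(chain['b'], chain['c'])  # 20 3
```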

ChainMap provides a convenience method for creating a new instance with one extra mapping at the front of the maps list to make it easy to avoid modifying the existing underlying data structures.




In [194]:
tsl = dict(t='tint', s='saturation', l='lightness')
cm = cm.new_child(tsl)

cm

ChainMap({'t': 'tint', 's': 'saturation', 'l': 'lightness'}, {'c': 'cyan', 'm': 'magenta', 'y': 'yellow', 'k': 'black'}, {'r': 'red', 'g': 'green', 'b': 'blue'})

In [195]:
print('Items:')
for k, v in cm.items():
    print('{} = {}'.format(k, v))

Items:
r = red
g = green
b = blue
c = cyan
m = magenta
y = yellow
k = black
t = tint
s = saturation
l = lightness


## Counter

A Counter is a dict subclass for counting hashable objects. It is an unordered collection where elements are stored as dictionary keys and their counts are stored as dictionary values. Counts are allowed to be any integer value including zero or negative counts. The Counter class is similar to bags or multisets in other languages.

Elements are counted from an iterable or initialized from another mapping (or counter):

In [196]:
from collections import Counter

violet = dict(r=238,g=130,b=238)
cnt = Counter(violet)  # or Counter(r=238, g=130, b=238)
print(cnt['c'])
print(cnt['r'])

0
238


In [197]:
print(*cnt.elements())

r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b 

In [199]:
cnt.most_common(2)

[('r', 238), ('b', 238)]

In [200]:
cnt.values()

dict_values([238, 130, 238])
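
The multiset behaviour, including zero and negative counts, can be seen with a short sketch. The input string and the keyword counts below are arbitrary examples, not data from the notebook.

```python
from collections import Counter

a = Counter('abracadabra')   # counts from an iterable: a=5, b=2, r=2, c=1, d=1
b = Counter(a=2, b=5)        # counts from keyword arguments

total = a + b                # adds counts key by key
diff = a - b                 # subtracts, keeping only positive counts

a.subtract(b)                # in place; zero and negative counts are kept

print(total['a'], total['b'])   # 7 7
print(diff)                     # Counter({'a': 3, 'r': 2, 'c': 1, 'd': 1})
print(a['b'])                   # -3
print(sorted(a.elements()))     # elements() skips counts below one
```

The `-` operator drops non-positive results, whereas `subtract()` keeps them, which is why `'b'` disappears from `diff` but shows up as `-3` in `a`.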

In [67]:
from lorem import paragraph
from collections import ChainMap, Counter

import string

def words(p):
    """Map step: return a sorted list of (word, 1) pairs for paragraph p."""
    # strip punctuation, then lowercase every word
    translator = str.maketrans('', '', string.punctuation)
    output = []
    p = p.translate(translator)
    for word in p.split():
        word = word.lower()
        output.append((word, 1))
    output.sort()
    return output

def reduce(words):
    """Read the sorted list of (word, 1) pairs produced by `words` and
    return a dict mapping every word to its number of occurrences."""
    d = {}
    for w in words:
        try:
            d[w[0]] += 1
        except KeyError:
            d[w[0]] = 1
    return d

d1, d2, d3 = [reduce(words(paragraph())) for i in range(3)]
cm = ChainMap(d1, d2, d3)
cm.maps

[{'adipisci': 2,
  'aliquam': 2,
  'consectetur': 4,
  'dolore': 3,
  'dolorem': 1,
  'eius': 5,
  'est': 1,
  'etincidunt': 2,
  'ipsum': 1,
  'labore': 5,
  'modi': 1,
  'neque': 3,
  'non': 2,
  'numquam': 1,
  'porro': 2,
  'quaerat': 4,
  'quiquia': 2,
  'quisquam': 3,
  'sed': 5,
  'sit': 2,
  'tempora': 2,
  'ut': 1,
  'velit': 2,
  'voluptatem': 2},
 {'adipisci': 2,
  'aliquam': 1,
  'amet': 3,
  'consectetur': 1,
  'dolor': 1,
  'dolore': 1,
  'dolorem': 1,
  'est': 2,
  'etincidunt': 1,
  'ipsum': 1,
  'labore': 2,
  'modi': 2,
  'neque': 2,
  'non': 4,
  'numquam': 1,
  'porro': 2,
  'quaerat': 2,
  'quiquia': 1,
  'quisquam': 2,
  'sed': 3,
  'sit': 1,
  'tempora': 7,
  'velit': 2,
  'voluptatem': 2},
 {'adipisci': 1,
  'aliquam': 5,
  'amet': 2,
  'consectetur': 2,
  'dolor': 3,
  'dolore': 2,
  'dolorem': 3,
  'eius': 1,
  'est': 1,
  'etincidunt': 1,
  'ipsum': 5,
  'magnam': 5,
  'modi': 2,
  'neque': 5,
  'non': 3,
  'porro': 1,
  'quaerat': 3,
  'quiquia': 1,
  'quisq


## defaultdict

Using `list` as the `default_factory`, it is easy to group a sequence of key-value pairs into a dictionary of lists:





In [None]:
from collections import defaultdict

s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
d = defaultdict(list)
for k, v in s:
    d[k].append(v)

sorted(d.items())
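
The same pattern works with other factories. As a sketch, `int` as the `default_factory` yields 0 for a missing key, which removes the need for the `try`/`except KeyError` used when counting with a plain `dict`. The input string is an arbitrary example.

```python
from collections import defaultdict

counts = defaultdict(int)    # int() returns 0 for a missing key
for letter in 'mississippi':
    counts[letter] += 1

result = sorted(counts.items())
print(result)
# [('i', 4), ('m', 1), ('p', 2), ('s', 4)]
```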

## Partition

Before the **reduce** operation, the data must be grouped by key in a container. Create a function named `partition` that stores the key/value pairs from `words` into a [defaultdict](https://docs.python.org/3.6/library/collections.html#collections.defaultdict) from the `collections` module. The output will be:
```python
[('word1', [1, 1]), ('word2', [1]), ('word3', [1, 1, 1])]
```


When each key is encountered for the first time, it is not already in the mapping; so an entry is automatically created using the default_factory function which returns an empty list. The list.append() operation then attaches the value to the new list. When keys are encountered again, the look-up proceeds normally (returning the list for that key) and the list.append() operation adds another value to the list. This technique is simpler and faster than an equivalent technique using dict.setdefault():

```python
>>> d = {}
>>> for k, v in s:
...     d.setdefault(k, []).append(v)
...
>>> sorted(d.items())
[('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]
```

In [19]:
from collections import defaultdict

defaultdict??

In [39]:
import collections
def partition_mp(mapped_values):
    """
        Organize the mapped values by their key.
        Returns an unsorted sequence of tuples with a key and a sequence of values.
    """
    partitioned_data = collections.defaultdict(list)
    for key, value in mapped_values:
        partitioned_data[key].append(value)
    return partitioned_data.items()
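
A possible way to check the partition step on a small input is sketched below. The `mapped` list is a hypothetical stand-in for the output of `words`, and the function is a condensed variant of the one above so that the block runs on its own.

```python
from collections import defaultdict

def partition(mapped_values):
    """Group (key, value) pairs into a dict of lists, one list per key."""
    partitioned = defaultdict(list)
    for key, value in mapped_values:
        partitioned[key].append(value)
    return partitioned

# Hypothetical mapper output: one (word, 1) pair per occurrence.
mapped = [('sed', 1), ('ut', 1), ('sed', 1), ('est', 1), ('sed', 1)]

grouped = sorted(partition(mapped).items())
print(grouped)
# [('est', [1]), ('sed', [1, 1, 1]), ('ut', [1])]
```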

- [itertools.chain(*mapped_values)](https://docs.python.org/3.6/library/itertools.html#itertools.chain) treats consecutive sequences as a single sequence.
- [operator](https://docs.python.org/3/library/operator.html).itemgetter(1) returns a callable object that fetches item 1 from its operand using the operand’s `__getitem__()` method.
```python
>>> from operator import itemgetter
>>> inventory = [('apple', 3), ('banana', 2), ('pear', 5), ('orange', 1)]
>>> getcount = itemgetter(1)
>>> list(map(getcount, inventory))
[3, 2, 5, 1]
>>> sorted(inventory, key=getcount)
[('orange', 1), ('banana', 2), ('apple', 3), ('pear', 5)]
```
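
Put together, the two helpers might be used as follows: `chain` merges the output of several mappers into one stream, and `itemgetter(1)` ranks the reduced counts. This is only a sketch; the contents of `mapped_values` are invented for illustration.

```python
from collections import defaultdict
from itertools import chain
from operator import itemgetter

# Hypothetical output of two mappers, one list of (word, 1) pairs each.
mapped_values = [
    [('sed', 1), ('ut', 1), ('sed', 1)],
    [('ut', 1), ('est', 1), ('sed', 1)],
]

# Treat the per-mapper lists as a single stream of pairs and group by word.
groups = defaultdict(list)
for word, one in chain(*mapped_values):
    groups[word].append(one)

# Reduce each group to a count, then sort by count (item 1), highest first.
counts = [(word, sum(ones)) for word, ones in groups.items()]
ranked = sorted(counts, key=itemgetter(1), reverse=True)
print(ranked)
# [('sed', 3), ('ut', 2), ('est', 1)]
```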

In [4]:
from lorem import text
from string import punctuation
from collections import Counter

translator = str.maketrans('', '', punctuation)
p = text().lower().translate(translator).replace('\n',' ').split(' ')
c = Counter(p)
del c['']
c.most_common()

[('sit', 21),
 ('consectetur', 16),
 ('ipsum', 15),
 ('velit', 15),
 ('neque', 14),
 ('modi', 14),
 ('est', 14),
 ('quiquia', 13),
 ('quisquam', 13),
 ('dolor', 13),
 ('dolore', 12),
 ('voluptatem', 12),
 ('adipisci', 12),
 ('etincidunt', 12),
 ('aliquam', 11),
 ('amet', 11),
 ('tempora', 10),
 ('dolorem', 10),
 ('labore', 9),
 ('magnam', 9),
 ('porro', 8),
 ('non', 8),
 ('numquam', 7),
 ('quaerat', 7),
 ('eius', 7),
 ('ut', 6),
 ('sed', 2)]