Pierre Navaro - [Institut de Recherche Mathématique de Rennes](https://irmar.univ-rennes1.fr) - [CNRS](http://www.cnrs.fr/)

# Container datatypes

The `collections` module implements specialized container datatypes providing alternatives to Python’s general purpose built-in containers, `dict`, `list`, `set`, and `tuple`.

- `namedtuple()`: factory function for creating tuple subclasses with named fields
- `deque`: list-like container with fast appends and pops on either end
- `ChainMap`: dict-like class for creating a single view of multiple mappings
- `Counter`: dict subclass for counting hashable objects
- `defaultdict`: dict subclass that calls a factory function to supply missing values
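
The three containers that are not developed further in this notebook can be illustrated with a minimal sketch. The names `Point`, `queue`, `defaults`, `overrides` and `settings` are hypothetical and chosen only for this example.

```python
from collections import ChainMap, deque, namedtuple

# namedtuple: a tuple subclass whose fields can be read by name or by index
Point = namedtuple("Point", ["x", "y"])  # hypothetical example type
p = Point(x=1.0, y=2.0)
same_field = p.x == p[0]

# deque: appends and pops are cheap at both ends
queue = deque([1, 2, 3])
queue.appendleft(0)
queue.append(4)
first = queue.popleft()  # removes the 0 that was just added on the left

# ChainMap: lookups search the mappings in order, the first match wins
defaults = {"color": "violet", "size": 10}
overrides = {"size": 12}
settings = ChainMap(overrides, defaults)
```

Here `settings["size"]` comes from `overrides` while `settings["color"]` falls back to `defaults`; neither dictionary is copied.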


## Counter

A Counter is a dict subclass for counting hashable objects. It is an unordered collection where elements are stored as dictionary keys and their counts are stored as dictionary values. Counts are allowed to be any integer value including zero or negative counts. The Counter class is similar to bags or multisets in other languages.

Elements are counted from an iterable or initialized from another mapping (or counter):

In [196]:
from collections import Counter

violet = dict(r=238,g=130,b=238)
cnt = Counter(violet)  # or Counter(r=238, g=130, b=238)
print(cnt['c'])
print(cnt['r'])

0
238


In [197]:
print(*cnt.elements())

r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g g b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b b 

In [199]:
cnt.most_common(2)

[('r', 238), ('b', 238)]

In [201]:
cnt.values()

dict_values([238, 130, 238])
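
The cells above build a `Counter` from a mapping. As a complement, here is a small sketch of counting directly from an iterable and of the zero and negative counts mentioned earlier; the string `"abracadabra"` is an arbitrary example input.

```python
from collections import Counter

letters = Counter("abracadabra")  # count the characters of a string
top = letters.most_common(1)      # the single most frequent letter
missing = letters["z"]            # a missing key counts as 0, no KeyError

letters.subtract("aaaaaaa")       # remove seven 'a'; the count goes negative
negative = letters["a"]
positive_only = +letters          # unary plus drops zero and negative counts
```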

In [224]:
import string
from lorem import text

translator = str.maketrans('', '', string.punctuation)
p = text().lower().translate(translator).replace('\n',' ').split(' ')
c = Counter(p)
del c['']
c.most_common()

[('quaerat', 11),
 ('ut', 9),
 ('neque', 8),
 ('magnam', 8),
 ('labore', 8),
 ('voluptatem', 8),
 ('sit', 8),
 ('tempora', 7),
 ('non', 7),
 ('dolore', 7),
 ('ipsum', 7),
 ('eius', 6),
 ('etincidunt', 6),
 ('velit', 6),
 ('adipisci', 6),
 ('dolor', 6),
 ('quiquia', 5),
 ('porro', 5),
 ('sed', 5),
 ('quisquam', 5),
 ('amet', 5),
 ('numquam', 4),
 ('consectetur', 4),
 ('dolorem', 4),
 ('modi', 4),
 ('aliquam', 4),
 ('est', 3)]

In [221]:
from lorem import text
import string

def words(p):
    """ Read a multiline string, remove the punctuation, lowercase every
    word and return a sorted list of (word, 1) pairs """
    translator = str.maketrans('', '', string.punctuation)
    output = []
    p = p.translate(translator)
    for word in p.split():
        word = word.lower()
        output.append((word, 1))
    output.sort()
    return output

def reduce(words):
    """ Read the sorted list of (word, 1) pairs from words and return
    a dict mapping every word to its number of occurrences"""
    d = {}
    for w in words:
        try:
            d[w[0]] +=1
        except KeyError:
            d[w[0]] = 1
    return d

reduce(words(text()))


{'adipisci': 6,
 'aliquam': 9,
 'amet': 4,
 'consectetur': 5,
 'dolor': 7,
 'dolore': 5,
 'dolorem': 6,
 'eius': 8,
 'est': 6,
 'etincidunt': 7,
 'ipsum': 4,
 'labore': 5,
 'magnam': 12,
 'modi': 13,
 'neque': 8,
 'non': 4,
 'numquam': 5,
 'porro': 8,
 'quaerat': 11,
 'quiquia': 10,
 'quisquam': 15,
 'sed': 4,
 'sit': 5,
 'tempora': 9,
 'ut': 7,
 'velit': 5,
 'voluptatem': 5}
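
The hand-written `words` and `reduce` pair can be compared with a `Counter`-based version. This is a sketch, not a replacement for the exercise: `count_words` is a hypothetical helper, and a fixed `sample` string is used instead of `lorem.text()` so that the result is reproducible.

```python
import string
from collections import Counter

def count_words(text):
    """Strip punctuation, lowercase the text and count every word."""
    translator = str.maketrans("", "", string.punctuation)
    return Counter(text.translate(translator).lower().split())

# Fixed sample text (assumption: any string works in place of lorem.text())
sample = "Neque porro quisquam est, qui dolorem ipsum. Neque porro!"
counts = count_words(sample)
```

Because `split()` without an argument ignores repeated whitespace, no empty-string key has to be deleted afterwards.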


## defaultdict

Using `list` as the `default_factory`, it is easy to group a sequence of key-value pairs into a dictionary of lists:

In [None]:
from collections import defaultdict

s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
d = defaultdict(list)
for k, v in s:
    d[k].append(v)

sorted(d.items())

## Partition

Before the **reduce** operation, the data must be grouped by key in a container. Create a function named `partition` that stores the key/value pairs from `words` into a [defaultdict](https://docs.python.org/3.6/library/collections.html#collections.defaultdict) from the `collections` module. The output will be:
```python
[('word1', [1, 1]), ('word2', [1]), ('word3', [1, 1, 1])]
```

In [None]:


For the list `s` grouped with `defaultdict(list)` above, `sorted(d.items())` returns:

[('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]

When a key is encountered for the first time, it is not yet in the mapping, so an entry is created automatically by the `default_factory` function, which returns an empty list. The `list.append()` operation then attaches the value to the new list. When the key is encountered again, the look-up proceeds normally (returning the list for that key) and `list.append()` adds another value to the list. This technique is simpler and faster than the equivalent technique using `dict.setdefault()`:

>>> d = {}
>>> for k, v in s:
...     d.setdefault(k, []).append(v)
...
>>> sorted(d.items())
[('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]
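
The same mechanism works with other factories. The sketch below assumes `int` as the `default_factory` (it returns `0`) and an arbitrary example word; it also shows that only indexing creates an entry, while a membership test does not.

```python
from collections import defaultdict

counts = defaultdict(int)        # int() returns 0, so missing keys start at 0
for letter in "mississippi":
    counts[letter] += 1

groups = defaultdict(list)
groups["seen"].append(1)         # the key is created on first access
has_unseen = "unseen" in groups  # a membership test does not create an entry
```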

In [19]:
from collections import defaultdict

defaultdict??

In [39]:
import collections
def partition_mp(mapped_values):
    """
        Organize the mapped values by their key.
        Returns an unsorted sequence of tuples with a key and a sequence of values.
    """
    partitioned_data = collections.defaultdict(list)
    for key, value in mapped_values:
        partitioned_data[key].append(value)
    return partitioned_data.items()
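
The grouping step can be checked on a small hand-written input before plugging it into the word count. This is a self-contained sketch: `partition` repeats the logic of the function above, and `mapped` and `totals` are hypothetical names for the example data and for a simple reduce step.

```python
from collections import defaultdict

def partition(mapped_values):
    """Group (key, value) pairs by key."""
    partitioned = defaultdict(list)
    for key, value in mapped_values:
        partitioned[key].append(value)
    return partitioned.items()

# Hand-written (word, 1) pairs standing in for the output of the map step
mapped = [("blue", 1), ("red", 1), ("blue", 1), ("yellow", 1), ("blue", 1)]

grouped = sorted(partition(mapped))  # sort the (key, values) pairs by key
totals = {key: sum(values) for key, values in grouped}  # reduce step
```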

- [itertools.chain(*mapped_values)](https://docs.python.org/3.6/library/itertools.html#itertools.chain) treats consecutive sequences as a single sequence.
- [operator](https://docs.python.org/3/library/operator.html)`.itemgetter(1)` returns a callable object that fetches item 1 from its operand using the operand’s `__getitem__()` method.
```python
>>> from operator import itemgetter
>>> inventory = [('apple', 3), ('banana', 2), ('pear', 5), ('orange', 1)]
>>> getcount = itemgetter(1)
>>> list(map(getcount, inventory))
[3, 2, 5, 1]
>>> sorted(inventory, key=getcount)
[('orange', 1), ('banana', 2), ('apple', 3), ('pear', 5)]
```
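
Both helpers can be combined, for instance to merge several lists of `(key, value)` pairs and sort them by key. In this sketch `mapped_values` is a hypothetical list of per-chunk results, not data from the notebook.

```python
from itertools import chain
from operator import itemgetter

# Hypothetical output of several map steps, one list per chunk of text
mapped_values = [
    [("a", 1), ("b", 1)],
    [("a", 1)],
    [("c", 1), ("b", 1)],
]

flat = list(chain(*mapped_values))        # one sequence of (key, value) pairs
by_key = sorted(flat, key=itemgetter(0))  # sort on the key, item 0 of each pair
```

`chain.from_iterable(mapped_values)` gives the same result without unpacking the outer list.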