# Collections Module

The collections module is a built-in module that implements specialized container data types, providing alternatives to Python’s general-purpose built-in containers. We've already gone over the basics: dict, list, set, and tuple.

Now we'll learn about the alternatives that the collections module provides.

## Counter

*Counter* is a *dict* subclass that helps count hashable objects. Inside it, elements are stored as dictionary keys and their counts are stored as the values.

Let's see how it can be used:

In [1]:
from collections import Counter

In [3]:
help('collections')

Help on package collections:

NAME
    collections

MODULE REFERENCE
    https://docs.python.org/3.10/library/collections.html
    
    The following documentation is automatically generated from the Python
    source files.  It may be incomplete, incorrect or include features that
    are considered implementation detail and may vary between Python
    implementations.  When in doubt, consult the module reference at the
    location listed above.

DESCRIPTION
    This module implements specialized container datatypes providing
    alternatives to Python's general purpose built-in containers, dict,
    list, set, and tuple.
    
    * namedtuple   factory function for creating tuple subclasses with named fields
    * deque        list-like container with fast appends and pops on either end
    * ChainMap     dict-like class for creating a single view of multiple mappings
    * Counter      dict subclass for counting hashable objects
    * OrderedDict  dict subclass that remembers the o

**Counter() with lists**

In [4]:

lst = [1,2,2,2,2,3,3,3,1,2,1,12,3,2,32,1,21,1,223,1]

Counter(lst)

Counter({1: 6, 2: 6, 3: 4, 12: 1, 32: 1, 21: 1, 223: 1})

**Counter with strings**

In [5]:
Counter('aabsbsbsbhshhbbsbs')

Counter({'a': 2, 'b': 7, 's': 6, 'h': 3})

**Counter with words in a sentence**

In [6]:
s = 'How many times does each word show up in this sentence word times each each word'

words = s.split()
print(words)

Counter(words)

['How', 'many', 'times', 'does', 'each', 'word', 'show', 'up', 'in', 'this', 'sentence', 'word', 'times', 'each', 'each', 'word']


Counter({'How': 1,
         'many': 1,
         'times': 2,
         'does': 1,
         'each': 3,
         'word': 3,
         'show': 1,
         'up': 1,
         'in': 1,
         'this': 1,
         'sentence': 1})

In [7]:
# Methods with Counter()
c = Counter(words)

c.most_common(3)

[('each', 3), ('word', 3), ('times', 2)]

In [9]:
c = Counter(lst)
c.most_common(5)

[(1, 6), (2, 6), (3, 4), (12, 1), (32, 1)]
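
Because the elements are stored as keys and the counts as values, a Counter can also be indexed, updated, and combined like a dictionary of numbers. The following is a minimal sketch of that behavior; the `inventory` data is hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical sample data, chosen only for illustration.
inventory = Counter(['apple', 'pear', 'apple', 'fig'])

# Counts are ordinary dictionary values keyed by element.
apple_count = inventory['apple']

# A missing element reports 0 instead of raising KeyError,
# and the lookup does not add the key.
kiwi_count = inventory['kiwi']
kiwi_stored = 'kiwi' in inventory

# update() adds to existing counts rather than replacing them.
inventory.update(['fig', 'fig'])
fig_count = inventory['fig']

# Counters support arithmetic; subtraction drops non-positive results.
remaining = inventory - Counter(apple=1, fig=3)

print(apple_count, kiwi_count, kiwi_stored, fig_count)
print(dict(remaining))
```

Subtraction with `-` keeps only positive counts, which is why `fig` disappears from `remaining` here.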

## defaultdict

defaultdict is a dict subclass that provides all the methods of a regular dictionary but takes a first argument (default_factory), a callable that supplies the value for any missing key. Using defaultdict is generally simpler and faster than doing the same with the dict.setdefault method.

**Looking up a missing key with d[key] on a defaultdict does not raise a KeyError as long as default_factory is set. The key is added with the value returned by the default factory.**

In [None]:
from collections import defaultdict

In [None]:
d = {}

In [None]:
d['one']  # raises KeyError: a plain dict has no default for missing keys

In [None]:
d = defaultdict(object)  # missing keys get a new object() instance

In [None]:
d['one'] 

In [None]:
d['python'] 

In [None]:
for item in d:
    print(item)
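
The loop above prints keys that were never assigned, because indexing a missing key stores the default. The sketch below illustrates when that happens and when it does not; using `int` as the factory is an assumption made here for readability, not something the cells above use.

```python
from collections import defaultdict

d = defaultdict(int)   # int() returns 0, so missing keys default to 0

# Indexing a missing key calls the factory and stores the result.
value = d['one']
keys_after_index = list(d)

# get() and membership tests do not call the factory or add keys.
got = d.get('two')
has_two = 'two' in d
keys_after_get = list(d)

# With no factory, a defaultdict behaves like a plain dict.
plain = defaultdict()
try:
    plain['missing']
    raised = False
except KeyError:
    raised = True

print(value, keys_after_index, got, has_two, keys_after_get, raised)
```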

You can also supply your own default value by passing a function, such as a lambda, as the factory:

In [None]:
d = defaultdict(lambda: None)

In [None]:
d['one']

In [None]:
d['exam'] = 10
d['exam'] 

In [None]:
d['name']
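
A common use of defaultdict is grouping items under a key. The sketch below compares it with the dict.setdefault approach mentioned earlier; the `words` list is hypothetical sample data.

```python
from collections import defaultdict

# Hypothetical sample data for illustration.
words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

# defaultdict(list): each new key starts with an empty list.
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)

# The same grouping with a plain dict and setdefault().
by_letter_plain = {}
for word in words:
    by_letter_plain.setdefault(word[0], []).append(word)

print(dict(by_letter))
print(by_letter == by_letter_plain)
```

Both loops build the same mapping; the defaultdict version avoids creating a throwaway empty list on every iteration.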

## OrderedDict
An OrderedDict is a dictionary subclass that remembers the order in which its contents are added.

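For example, a normal dictionary (since Python 3.7, a regular dict also preserves insertion order, so this loop prints the keys in the order they were inserted):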

In [None]:
print('Normal dictionary:')

d = {}

d['a'] = 'A'
d['b'] = 'B'
d['e'] = 'E'
d['c'] = 'C'
d['d'] = 'D'

for k, v in d.items():
    print(k, v)

An Ordered Dictionary:

In [None]:
import collections
print('OrderedDict:')

d = collections.OrderedDict()

d['a'] = 'A'
d['b'] = 'B'
d['c'] = 'C'
d['e'] = 'E'
d['d'] = 'D'

for k, v in d.items():
    print(k, v)
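
Because an OrderedDict tracks order explicitly, it also offers methods for rearranging it. The following is a small sketch of `move_to_end()` and `popitem(last=False)`, which regular dictionaries do not provide in this form.

```python
from collections import OrderedDict

d = OrderedDict()
d['a'] = 'A'
d['b'] = 'B'
d['c'] = 'C'

# Move an existing key to the end of the ordering.
d.move_to_end('a')
order_after_move = list(d)

# Move a key to the front instead.
d.move_to_end('c', last=False)
order_after_front = list(d)

# Pop from the front (FIFO); the default last=True pops from the end.
first_item = d.popitem(last=False)

print(order_after_move, order_after_front, first_item, list(d))
```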

## Equality with an Ordered Dictionary
A regular dict looks only at its contents when testing for equality. When two OrderedDicts are compared, the order in which the items were added is also considered; comparing an OrderedDict with a regular dict ignores order.

A normal Dictionary:

In [None]:
print('Dictionaries are equal? ')

d1 = {}
d1['a'] = 'A'
d1['b'] = 'B'

d2 = {}
d2['b'] = 'B'
d2['a'] = 'A'

print(d1 == d2)

An Ordered Dictionary:

In [None]:
from collections import OrderedDict
print('Dictionaries are equal? ')

d1 = OrderedDict()
d1['a'] = 'A'
d1['b'] = 'B'

d2 = OrderedDict()
d2['b'] = 'B'
d2['a'] = 'A'

print(d1 == d2)
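
The three comparison cases can be checked side by side. This is a compact sketch; the variable names `ab` and `ba` are chosen here for illustration.

```python
from collections import OrderedDict

ab = OrderedDict([('a', 'A'), ('b', 'B')])
ba = OrderedDict([('b', 'B'), ('a', 'A')])

# Two OrderedDicts: order matters.
ordered_equal = (ab == ba)

# OrderedDict vs. regular dict: order is ignored.
mixed_equal = (ab == {'b': 'B', 'a': 'A'})

# Two regular dicts: order is ignored.
plain_equal = (dict(ab) == dict(ba))

print(ordered_equal, mixed_equal, plain_equal)
```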

## namedtuple
The standard tuple uses numerical indexes to access its members, for example:

In [None]:
t = (12, 13, 14)

In [None]:
t[0]

For simple use cases, this is usually enough. On the other hand, remembering which index should be used for each value can lead to errors, especially if the tuple has a lot of fields and is constructed far from where it is used. A namedtuple assigns names, as well as the numerical index, to each member. 

Each kind of namedtuple is represented by its own class, created by using the namedtuple() factory function. The arguments are the name of the new class and a string containing the names of the elements.

You can think of a namedtuple as a quick way of creating a new lightweight class with a fixed set of attribute fields.
For example:

In [None]:
from collections import namedtuple

In [None]:
# Bind the generated class to the same name as its type name
Dog = namedtuple('Dog', 'age breed name')

sam = Dog(age=2, breed='Lab', name='Sammy')

frank = Dog(age=2, breed='Shepherd', name='Frankie')

We construct the namedtuple class by first passing the type name (Dog) and then a string of field names separated by spaces. We can then access the various attributes by name or by index:

In [None]:
sam

In [None]:
sam.age

In [None]:
sam.breed

In [None]:
sam[0]
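
Beyond attribute and index access, a namedtuple behaves like an ordinary immutable tuple and adds a few helper methods. The sketch below shows unpacking, `_replace()`, `_asdict()`, and `_fields`; passing the field names as a list rather than a space-separated string is an equivalent alternative used here for illustration.

```python
from collections import namedtuple

# Field names may also be given as a list of strings.
Dog = namedtuple('Dog', ['age', 'breed', 'name'])
sam = Dog(age=2, breed='Lab', name='Sammy')

# Unpacks like a regular tuple.
age, breed, name = sam

# Immutable: assigning to a field raises AttributeError.
try:
    sam.age = 3
    mutated = True
except AttributeError:
    mutated = False

# _replace() returns a new instance with the given fields changed.
older_sam = sam._replace(age=3)

# _asdict() converts to a mapping keyed by field name; _fields lists the names.
as_dict = dict(sam._asdict())
fields = Dog._fields

print(sam)
print(older_sam.age, mutated, as_dict, fields)
```

Since instances cannot be modified in place, `_replace()` is the usual way to derive an updated record.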