## Collections Module

The `collections` module is a built-in module that implements specialized container datatypes, providing alternatives to Python’s general-purpose built-in containers (dict, list, set, and tuple).

## Counter

Counter is a dict subclass for counting hashable objects.
Elements are stored as dictionary keys, and the count of each element is stored as the corresponding value.

In [1]:
from collections import Counter

In [5]:
lx = [1,2,2,2,2,3,3,3,1,2,1,12,3,2,32,1,21,1,223,1] # with a list 
xl = 'aabsbsbsbhshhbbsbs' # with a string
sx = 'How many times does each word show up in this sentence word times each each word' # with a sentence

print Counter(lx) # with a List
print Counter(xl) # with a String
sxx = sx.split() # with a sentence
print Counter(sxx)


Counter({1: 6, 2: 6, 3: 4, 32: 1, 12: 1, 21: 1, 223: 1})
Counter({'b': 7, 's': 6, 'h': 3, 'a': 2})
Counter({'word': 3, 'each': 3, 'times': 2, 'show': 1, 'this': 1, 'many': 1, 'in': 1, 'up': 1, 'How': 1, 'does': 1, 'sentence': 1})
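
Because a Counter is a dict subclass, the counts can be read back like ordinary dictionary values. The following is a minimal sketch, written for Python 3 (so `print` is a function); the names `words` and `word_counts` are illustrative and not part of the original cells.

```python
from collections import Counter

# Same sentence as above, split into words.
words = 'How many times does each word show up in this sentence word times each each word'.split()
word_counts = Counter(words)

# Counts are looked up by key, exactly as in a dict.
print(word_counts['word'])      # 3
print(word_counts['times'])     # 2

# Unlike a plain dict, a missing key reports a count of 0 instead of
# raising KeyError, and the lookup does not add the key.
print(word_counts['python'])    # 0
print('python' in word_counts)  # False

# Counter really is a dict subclass.
print(isinstance(word_counts, dict))  # True
```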


### Counter special functions and patterns

In [10]:
# Methods with Counter()
c = Counter(sxx)                # count the words of the sentence split above

c.most_common(2)                # the 2 most common elements with their counts
sum(c.values())                 # total of all counts
list(c)                         # list unique elements
set(c)                          # convert to a set
dict(c)                         # convert to a regular dictionary
c.items()                       # convert to a list of (elem, cnt) pairs
# Counter(dict(list_of_pairs))  # convert from a list of (elem, cnt) pairs
# c.most_common()[:-n-1:-1]     # n least common elements
c += Counter()                  # remove zero and negative counts
c.clear()                       # reset all counts (this empties the Counter)
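
The patterns listed above can be tried out on the word counts from the earlier sentence. This is a hedged sketch for Python 3; the variable names (`total`, `top_two`, `least_common`, `rebuilt`) and the choice of `n = 3` are assumptions made for illustration.

```python
from collections import Counter

words = 'How many times does each word show up in this sentence word times each each word'.split()
c = Counter(words)

total = sum(c.values())                   # total number of words counted
top_two = c.most_common(2)                # two most frequent words with counts
n = 3
least_common = c.most_common()[:-n-1:-1]  # the n least common (elem, cnt) pairs

# Round trip: Counter -> list of pairs -> Counter.
pairs = list(c.items())
rebuilt = Counter(dict(pairs))
print(rebuilt == c)                       # True

# Push some counts to zero or below, then strip them out.
c['word'] -= 5                            # count is now negative
c['each'] = 0                             # count is now zero
c += Counter()                            # keeps only strictly positive counts
print('word' in c, 'each' in c)           # False False

c.clear()                                 # empties the Counter entirely
print(len(c))                             # 0
```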

## defaultdict

defaultdict is a dictionary-like object that provides all the methods of a regular dictionary but takes a first argument (default_factory): a callable that produces the default value for any missing key.

Using defaultdict is simpler and typically faster than doing the same thing with the dict.setdefault method.
As long as a default factory was supplied, looking up a missing key with dd[key] does not raise a KeyError. Instead, the key is inserted with the value returned by the default factory.

In [11]:
from collections import defaultdict

In [12]:
dd = {}
dd['one']

KeyError: 'one'

In [13]:
dd = defaultdict(object)
dd['one']

<object at 0x4452530>

In [15]:
for item in dd:
    print item

one


In [16]:
dd = defaultdict(lambda: 0)

In [18]:
dd['one']

0
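
To make the comparison with dict.setdefault concrete, here is a small sketch for Python 3 that groups words by their first letter both ways. The word list and the names `by_letter_plain`, `by_letter`, and `counts` are made up for illustration. It also shows that only the `dd[key]` lookup calls the default factory; `get()` and `in` leave the dictionary untouched.

```python
from collections import defaultdict

words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

# Grouping with a plain dict and setdefault().
by_letter_plain = {}
for word in words:
    by_letter_plain.setdefault(word[0], []).append(word)

# The same grouping with defaultdict(list): missing keys start as [].
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)

print(dict(by_letter) == by_letter_plain)  # True

# Only indexing triggers the default factory.
counts = defaultdict(int)
print(counts.get('missing'))   # None, and no key is created
print('missing' in counts)     # False
print(counts['missing'])       # 0, and the key now exists
print('missing' in counts)     # True
```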

## OrderedDict 

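An OrderedDict is a dictionary subclass that remembers the order in which its contents are added. The output below comes from an older Python version in which regular dictionaries had arbitrary order; since Python 3.7, regular dicts also preserve insertion order.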

In [3]:
import collections

print 'Normal dictionary:'

db = {}

db['a'] = 'A'
db['b'] = 'B'
db['c'] = 'C'
db['d'] = 'D'
db['e'] = 'E'

for k, v in db.items():
    print k, v
    
print 'OrderedDict:'

db = collections.OrderedDict()

db['a'] = 'A'
db['b'] = 'B'
db['c'] = 'C'
db['d'] = 'D'
db['e'] = 'E'

for k, v in db.items():
    print k, v

Normal dictionary:
a A
c C
b B
e E
d D
OrderedDict:
a A
b B
c C
d D
e E
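
For comparison, here is a sketch of the same experiment on Python 3.7 or later, where a regular dict also keeps insertion order. It additionally uses two reordering methods that OrderedDict offers, `move_to_end()` and `popitem(last=False)`; the names `plain`, `ordered`, and `first` are illustrative.

```python
from collections import OrderedDict

plain = {}
ordered = OrderedDict()
for key in 'abcde':
    plain[key] = key.upper()
    ordered[key] = key.upper()

# On Python 3.7+ both iterate in insertion order.
print(list(plain))     # ['a', 'b', 'c', 'd', 'e']
print(list(ordered))   # ['a', 'b', 'c', 'd', 'e']

# OrderedDict can also rearrange its entries.
ordered.move_to_end('a')              # 'a' becomes the last key
print(list(ordered))                  # ['b', 'c', 'd', 'e', 'a']

first = ordered.popitem(last=False)   # remove and return the first item
print(first)                          # ('b', 'B')
print(list(ordered))                  # ['c', 'd', 'e', 'a']
```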


## Equality with an Ordered Dictionary

A regular dict looks only at its contents when testing for equality.

When two OrderedDicts are compared, the order in which the items were added is considered as well.

In [4]:
# Normal dictionary

print 'Dictionaries are equal? '

d1 = {}
d1['a'] = 'A'
d1['b'] = 'B'

d2 = {}
d2['b'] = 'B'
d2['a'] = 'A'

print d1 == d2

# Ordered dictionary

print 'Dictionaries are equal? '

d1 = collections.OrderedDict()
d1['a'] = 'A'
d1['b'] = 'B'


d2 = collections.OrderedDict()

d2['b'] = 'B'
d2['a'] = 'A'

print d1 == d2

Dictionaries are equal? 
True
Dictionaries are equal? 
False
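
One detail worth checking alongside the result above: order only matters when both sides are OrderedDicts. Comparing an OrderedDict with a regular dict falls back to the order-insensitive dict comparison. A minimal sketch for Python 3, with illustrative names `ab` and `ba`:

```python
from collections import OrderedDict

ab = OrderedDict([('a', 'A'), ('b', 'B')])
ba = OrderedDict([('b', 'B'), ('a', 'A')])

# Two OrderedDicts: same contents, different order.
print(ab == ba)         # False

# OrderedDict against a regular dict: order is ignored.
print(ab == dict(ba))   # True

# Sorting the items is one way to compare contents only.
print(sorted(ab.items()) == sorted(ba.items()))  # True
```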


## namedtuple

With a standard tuple, values are accessed by numerical index. Remembering which index should be used for each value can lead to errors, especially if the tuple has a lot of fields and is constructed far from where it is used.

A namedtuple assigns names, as well as the numerical index, to each member. 

Each kind of namedtuple is represented by its own class, created by using the namedtuple() factory function. 

The arguments are the name of the new class and a string containing the names of the elements.

You can basically think of namedtuples as a very quick way of creating a new object/class type with some attribute fields.

In [10]:
from collections import namedtuple

In [13]:
Dog = namedtuple('Dog','age breed name')

sam = Dog(age=2,breed='Lab',name='Sammy') # assigning values by field name (keyword arguments)

frank = Dog(age=2,breed='Shepherd',name="Frankie")

bobby = Dog(3, 'Salchicha', 'bobby') # assigning values by position, without field names

Rules:

1. Pass the name of the new type ('Dog') as the first argument.
2. Pass the field names **as a single string with spaces between the field names**.
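
Besides a space-separated string, namedtuple() also accepts the field names as a comma-separated string or as a sequence of strings. The sketch below (Python 3) shows that all three spellings produce the same fields; the class names `DogSpaces`, `DogCommas`, and `DogList` are hypothetical and exist only for this comparison.

```python
from collections import namedtuple

# Three equivalent ways to spell the same field names.
DogSpaces = namedtuple('Dog', 'age breed name')
DogCommas = namedtuple('Dog', 'age, breed, name')
DogList = namedtuple('Dog', ['age', 'breed', 'name'])

print(DogSpaces._fields == DogCommas._fields == DogList._fields)  # True
print(DogList(2, 'Lab', 'Sammy'))  # Dog(age=2, breed='Lab', name='Sammy')
```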

In [7]:
sam.age

2

In [8]:
sam[0]

2

In [12]:
bobby.age

3
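
A namedtuple is still a tuple, so its fields cannot be reassigned. The following sketch (Python 3) shows what happens on assignment and how `_replace()` and `_asdict()` can be used instead; the name `older_sam` is an assumption for illustration.

```python
from collections import namedtuple

Dog = namedtuple('Dog', 'age breed name')
sam = Dog(age=2, breed='Lab', name='Sammy')

# Fields are read-only: assignment raises AttributeError.
try:
    sam.age = 3
except AttributeError:
    print('namedtuple fields cannot be reassigned')

# _replace() returns a new instance with the given fields changed.
older_sam = sam._replace(age=3)
print(older_sam.age)   # 3
print(sam.age)         # 2, the original is unchanged

# _asdict() maps field names to values.
print(dict(sam._asdict()))  # {'age': 2, 'breed': 'Lab', 'name': 'Sammy'}
```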

In [13]:
city, year, pop = ('Tokyo', 2003, 3000000)

In [14]:
city

'Tokyo'

In [2]:
mapa = ('America','Africa','Europe','Asia','Oceania')

In [5]:
cont1, cont2, cont3, cont4, cont5 = mapa # Tuple unpacking

In [4]:
cont1

'America'
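
Tuple unpacking works the same way on a namedtuple instance. As a sketch under stated assumptions, the `City` class below is hypothetical (it does not appear in the cells above) and reuses the Tokyo values; the starred form requires Python 3.

```python
from collections import namedtuple

# Hypothetical namedtuple holding the same values as the tuple above.
City = namedtuple('City', 'name year population')
tokyo = City('Tokyo', 2003, 3000000)

# Unpack by position, exactly like a plain tuple.
name, year, population = tokyo
print(name)        # Tokyo
print(population)  # 3000000

# Python 3 only: collect the remaining fields into a list.
first, *rest = tokyo
print(rest)        # [2003, 3000000]
```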