## Exploring the Python Standard Library

### collections — Container datatypes

This module implements **specialized container datatypes providing alternatives to Python’s general-purpose built-in containers**: `dict`, `list`, `set`, and `tuple`.

### Counter: dict subclass for counting hashable objects
- It's a specialized dictionary that stores how often each element occurs
- Elements are stored as dictionary keys and their counts as dictionary values
- Unlike a plain dictionary, a Counter returns `0` instead of raising a `KeyError` when you look up an item that doesn't exist

### Big Idea:
Pass an iterable to `Counter()` to get the count of each element in the iterable.
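
As a small sketch of this idea, any iterable of hashable items can be counted, including a string (iterated character by character) or a list. The sample inputs and the names `letters` and `colors` are illustrative and not part of the problem that follows.

```python
from collections import Counter

# A string is an iterable of characters, so each character is counted
letters = Counter("mississippi")
print(letters["s"])  # 4
print(letters["p"])  # 2

# A list of hashable items works the same way
colors = Counter(["red", "blue", "red"])
print(colors["red"])   # 2
print(colors["blue"])  # 1
```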

_Let's see an example._

**Problem statement**: Count how often each word occurs in the string below:<br>
`The brown fox jumps over the brown lazy dog into the brown backyard where it had brown cubs`

In [14]:
### Approach 1: updating count via dictionary

def approach_1():
    cnt_dict = {}
    tstr = "The brown fox jumps over the brown lazy dog into the brown backyard where it had brown cubs"
    for element in tstr.lower().split():
        # Start unseen words at 0, then count this occurrence
        cnt_dict.setdefault(element, 0)
        cnt_dict[element] += 1

    return cnt_dict


In [16]:
for k,v in sorted(approach_1().items()):
    print(f"{k}:{v}")

backyard:1
brown:4
cubs:1
dog:1
fox:1
had:1
into:1
it:1
jumps:1
lazy:1
over:1
the:3
where:1


In [2]:
### Approach 2: via Counter
from collections import Counter

In [3]:
def approach_2():
    tstr = "The brown fox jumps over the brown lazy dog into the brown backyard where it had brown cubs"
    cnt_dict = Counter(tstr.lower().split())
    return cnt_dict

In [4]:
for k,v in sorted(approach_2().items()):
    print(f"{k}:{v}")

backyard:1
brown:4
cubs:1
dog:1
fox:1
had:1
into:1
it:1
jumps:1
lazy:1
over:1
the:3
where:1
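
A quick check, sketched here so it runs on its own, that the two approaches produce the same counts. The names `sentence`, `manual`, and `counted` are chosen for illustration; the manual loop uses `dict.get` as a compact stand-in for the `setdefault` version above.

```python
from collections import Counter

sentence = "The brown fox jumps over the brown lazy dog into the brown backyard where it had brown cubs"
words = sentence.lower().split()

# Manual counting with a plain dictionary
manual = {}
for word in words:
    manual[word] = manual.get(word, 0) + 1

# The same counting done by Counter in one call
counted = Counter(words)

print(dict(counted) == manual)    # True
print(isinstance(counted, dict))  # True: Counter is a dict subclass
```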


### Two important methods supplied with Counter

In [5]:
tstr = "The brown fox jumps over the brown lazy dog into the brown backyard where it had brown cubs"
c_dct = Counter(tstr.lower().split())

**Counter.elements()**
- returns an iterator rather than a list, so the cell below passes it to `sorted()` to get a list that can be displayed
- yields each element as many times as its count in the Counter

**Counter.most_common(n)**
- returns a list of the `n` most frequently occurring elements along with their counts
- the list is ordered from the highest count to the lowest
- elements with equal counts appear in the order in which they were first encountered in the input to `Counter`

In [12]:
sorted(c_dct.elements())

['backyard',
 'brown',
 'brown',
 'brown',
 'brown',
 'cubs',
 'dog',
 'fox',
 'had',
 'into',
 'it',
 'jumps',
 'lazy',
 'over',
 'the',
 'the',
 'the',
 'where']
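
Without `sorted()`, `elements()` yields items grouped in the order they were first encountered, and it skips any element whose count is less than one. A minimal sketch with made-up data (the `stock` Counter is hypothetical):

```python
from collections import Counter

# Counts can be set directly via keyword arguments; zero and negative counts are allowed
stock = Counter(apples=2, pears=0, plums=-1, figs=1)

# elements() ignores entries whose count is below one
print(list(stock.elements()))  # ['apples', 'apples', 'figs']
```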

In [13]:
c_dct.most_common(3)

[('brown', 4), ('the', 3), ('fox', 1)]

In [14]:
c_dct.most_common(4)

[('brown', 4), ('the', 3), ('fox', 1), ('jumps', 1)]
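
Two related usages, sketched on an illustrative string (the name `c` and the input are assumptions for the example): calling `most_common()` with no argument returns every element, and slicing that list from the end gives the least common ones.

```python
from collections import Counter

c = Counter("abracadabra")

# No argument: all elements, highest count first; ties keep first-seen order
print(c.most_common())
# [('a', 5), ('b', 2), ('r', 2), ('c', 1), ('d', 1)]

# The two least common elements, via a reversed slice of the full list
print(c.most_common()[:-3:-1])
# [('d', 1), ('c', 1)]
```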

In [16]:
c_dct["unknown_element"]  # note that Counter returns 0 instead of raising a KeyError

0
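
One detail worth seeing in code: looking up a missing key returns `0` but does not add that key to the Counter, whereas the same lookup on a plain dictionary raises `KeyError`. The names `counts` and `plain` below are illustrative.

```python
from collections import Counter

counts = Counter(["fox", "dog", "fox"])

print(counts["cat"])    # 0
print("cat" in counts)  # False: the lookup did not insert the key
print(len(counts))      # 2

# A plain dict with the same contents raises KeyError instead
plain = dict(counts)
try:
    plain["cat"]
    raised = False
except KeyError:
    raised = True
print(raised)  # True
```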